How-To Tutorials

article-image-building-cloud-spy-camera-and-creating-gps-tracker

30 Oct 2015

4 min read

Building a Cloud Spy Camera and Creating a GPS Tracker

30 Oct 2015

In this article by Marco Schwartz author of the book Arduino for Secret Agents, we will build a GPS location tracker and use a spy camera for live streaming. Building a GPS location tracker It's now time to build a real GPS location tracker. For this project, we'll get the location just as before, using the GPS if available, and the GPRS location otherwise. However here, we are going to use the GPRS capabilities of the shield to send the latitude and longitude data to Dweet.io, which is a service we already used before. Then, we'll plot this data in Google Maps, allowing you to follow your device in real-time from anywhere in the world. (For more resources related to this topic, see here.) You need to define a name for the Thing that will contain the GPS location data: String dweetThing = "mysecretgpstracker"; Then, after getting the current location, we prepare the data to be sent to Dweet.io: uint16_t statuscode; int16_t length; char url[80]; String request = "www.dweet.io/dweet/for/" + dweetThing + "?latitude=" + latitude + "andlongitude=" + longitude; request.toCharArray(url, request.length()); After that, we actually send the data to Dweet.io: if (!fona.HTTP_GET_start(url, andstatuscode, (uint16_t *)andlength)) { Serial.println("Failed!"); } while (length > 0) { while (fona.available()) { char c = fona.read(); // Serial.write is too slow, we'll write directly to Serial register! #if defined(__AVR_ATmega328P__) || defined(__AVR_ATmega168__) loop_until_bit_is_set(UCSR0A, UDRE0); /* Wait until data register empty. */ UDR0 = c; #else Serial.write(c); #endif length--; } } fona.HTTP_GET_end(); Now, before testing the project, we are going to prepare our dashboard that will host the Google Maps widget. We are going to use Freeboard.io for this purpose. If you don't have an account yet, go to http://freeboard.io/. Create a new dashboard, and also a new datasource. Insert the name of your Thing inside the THING NAME field: Then, create a new Pane with a Google Maps widget. Link this widget to the latitude and longitude of your Location datasource: It's now time to test the project. Make sure to grab all the code, for example from the GitHub repository of the book. Also don't forget to modify the Thing name, as well as your GPRS settings. Then, upload the sketch to the board, and open the Serial monitor. This is what you should see: The most important line is the last one, which confirms that data has been sent to Dweet.io and has been stored there. Now, simply go back to the dashboard you just created: you can now see that the location on the map has been updated: Note that this map is also updated in real-time, as new measurements arrive from the board. You can also modify the delay between two updates of the position of the tracker, by changing the delay() function in the sketch. Congratulations, you just built your own GPS tracking device! Live streaming from the spy camera We are now going to use the camera to stream live video in a web browser. This stream will be accessible from any device connected to the same WiFi network as the Yun. To start with this project, log into your Yun using the following command (by changing the name of the board with the name of your Yun): ssh root@arduinoyun.local Then, type: mjpg_streamer -i "input_uvc.so -d /dev/video0 -r 640x480 -f 25" -o "output_http.so -p 8080 -w /www/webcam" & This will start the streaming from your Yun. You can now simply go the URL of your Yun, and add ':8080' at the end. For example, http://arduinoyun.local:8080. You should arrive on the streaming interface: You can now stream this video live to your mobile phone or any other device within the same network. It's the perfect project to spy in a room while you are sitting outside for example. Summary In this article, we built a device that allows us to track the GPS location of any object it is attached to and we built a spy camera project that can send pictures in the cloud whenever motion is detected Resources for Article: Further resources on this subject: Getting Started with Arduino [article] Arduino Development [article] Prototyping Arduino Projects using Python [article]

0
0
14544

Packt

30 Oct 2015

14 min read

Introducing OpenSIPS

Packt

30 Oct 2015

14 min read

This article by Flavio E Goncalves and Bogdan-Andrei Iancu, the authors of the book Building Telephony Systems with OpenSIPS - Second Edition, will help you to understand who can use OpenSIPS. This is followed by a description of the OpenSIPS design. It is essential to fully understand OpenSIPS—from the outside and inside perspectives—before using it. In this article, you will see the following: The different usage scenarios for OpenSIPS or when to use it The OpenSIPS internal design and structure (For more resources related to this topic, see here.) Who's using OpenSIPS? A wide range of Session Initiation Protocol (SIP) oriented service providers choose OpenSIPS to power their platform, mainly because of its throughput and capabilities and also because of its reliability as a stable and mature open source software project. Voice over Internet Protocol (VoIP) service providers take advantage of the rich feature set of OpenSIPS to build attractive and competitive services for residential or enterprise customers. The flexibility of OpenSIPS (in terms of service creation) and the fast releasing cycle give them an advantage in creating cutting-edge services in the competitive VoIP markets. The ability to pipe large amounts of traffic through OpenSIPS in a reliable and precise fashion allows providers for trunking services, Direct Inward Dialing (DID), or termination services to scale and become more efficient. The dynamic nature of these services and the interoperability concerns are key aspects that make OpenSIPS the perfect candidate for the job. Currently, OpenSIPS has exceeded its original Class 4 status and is now able to offer Class 5 signaling features. So, hosted/virtual Private Branch Exchange (PBX) providers use OpenSIPS as a core component to emulate PBX-like services. OpenSIPS is flexible enough to integrate all the complex Class 5 features and even more in order to interface with external SIP engines (such as media servers). The ability of OpenSIPS to frontend existing platforms for the load balancing and clustering (even geo distribution) core components opens the markets populated by existing players who offer PBX services or termination services. Such players can now take advantage of OpenSIPS too. Besides calls, OpenSIPS provides you with present and instant messaging capabilities, which are essential to buildservices such as Rich Communication Services (RCS). In the area of advanced SIP services, Local Number Portability (LNP) providers, Canonical Name (CNAME) providers, fraud detection providers are relying on the OpenSIPS skills to interface with various DB engines and handle huge amounts of data (typical for their services). OpenSIPS can offer both, an efficient SIP frontend and a powerful backend to connect to DBs and other external tools. With the OpenSIPS evolution in the Class 5 area, new capabilities such as Back-2-Back User Agent and Call Queuing were exploited by call center providers (inbound and outbound termination). When compared with the existing solutions, OpenSIPS-based call centers were able to handle large amounts of calls and offer, in a geo-distributed fashion, an all-in-one platform: call queuing, DID management, Public Switched Telephone Network (PSTN) routing, SIP peering, and others. Nevertheless, being an open software, OpenSIPS offers a perfect study and research platform. Universities and research centers are using OpenSIPS to familiarize students with the VoIP/SIP world to build projects or research cases. It is important to mention here that this relation with the universities and research institutes is in the benefit of both parties as the OpenSIPS project gets a lot of traction and expansions from these study/research projects. The project manages a list of OpenSIPS users, a list that is continuously growing: http://www.opensips.org/About/WhoIsUsing The OpenSIPS design Architecturally speaking, OpenSIPS is formed of two logical components: the core and modules. The core is the application itself and it provides the low-level functionality of OpenSIPS, the definition of various interfaces and some generic resources. The modules are shared libraries loaded on demand at the startup time. Each module implements a well-defined functionality for a specific routing algorithm or authentication method. There are mainly two types of modules in OpenSIPS: The modules providing functionalities and functions directly for the routing script The modules implementing a core defined interface (such as a module implementing the SQL interface will become a backend to a certain SQL server) The architecture of OpenSIPS is shown in the following figure: The OpenSIPS core The OpenSIPS core is a minimal application. By itself, it is only able to proxy the SIP requests and reply in a stateless mode with very basic scripting capabilities. In most of the cases, the core is used in conjunction with several modules. The OpenSIPS core provides the following: The SIP transport layer The SIP factory—the message parser and builder The routing script parser and interpreter The memory and locking manager The core script functions and variables A SQL interface definition (implementation provided by modules and not by the core itself) A NoSQL interface definition An AAA interface definition A management interface An events interface A statistics interface The SIP transport layer implements various SIP transport protocols. Currently, OpenSIPS supports UDP, TCP, Transport Layer Security (TLS), and WebSocket. The transport protocols that are to be used depend on the SIP listeners defined in the routing script. Multiple transport protocols may be used at the same time. The SIP factory layer provides you with functions to parse and build the SIP messages. OpenSIPS implements a lazy but efficient SIP parser—this means that the parsing is done on demand (OpenSIPS parses as far as requested) and selectively. (The header bodies are parsed only if requested; otherwise, only the header name is parsed.) The parsing process is transparent at the script level; each function (core or modules) does its own parsing internally. When it comes to changing the message, it is very important to know that the changes you make in the script are not applied in real time over the message but stored and applied only when all the message processing is done. Due to this offline handling of changes, you are not able to see your own changes over the message. For example, if you add a new SIP header and test its presence, you will not find it, or if you remove a SIP header and look again for it, you will find it. Therouting script parser and interpreter is loading and parsing (to the memory) the routing script at the startup time; after this, the file is no longer needed. The routing script cannot be reloaded at runtime—OpenSIPS must be restarted in order to read the routing script again. Aside from various global settings, the routing script contains the routing logic—this logic is defined via a C/Shell-like language, custom to OpenSIPS. Configuring the routing logic in OpenSIPS is more like doing simple programming. This approach (writing your own program to route the traffic) gives OpenSIPS a tremendous flexibility when it comes to routing. The memory and locking manager is a global resource in OpenSIPS. For performance reasons, OpenSIPS implements its own internal manager for the memory allocation and locking operations. This part is not visible at the route scripting level, but it may be configured at compiling time. (OpenSIPS has several implementations for the memory and locking managers.) The OpenSIPS core provides its own script functions and script variables to be used from the routing script. When compared with the functions and variables exported by the modules, the core set is quite limited in number and functionality. The online manual lists and documents all the functions and variables provided by the OpenSIPS core. The SQL interface is defined by the core but not implemented. The definition is done here in the OpenSIPS core for standardization reasons. Modules can use SQL services via the SQL interface without being aware of the underlying SQL driver. Additionally, the other modules may implement the SQL interface, offering drivers to various SQL DBs as MySQL, Postgres, Oracle, Berkeley, unixODBC, and many others. (See all the modules starting with the db_ prefix.) Even if the SQL interface is for internal usage (between modules), it is partially exposed at the script level by the avpops module: avp_db_query("select first_name, last_name from subscribers where username='$rU' and domain='$rd'", "$avp(fname);$avp(lname)"); Similar, the NoSQL interface is defined to standardize the operations with the NoSQL databases. The OpenSIPS modules currently implement drivers to Redis, CouchBase, Cassandra, MongoDB, Memcached, and other databases. (See all the modules starting with the cachedb_ prefix.) The NoSQL interface is directly exposed by the OpenSIPS core via a simple set of functions: cache_store("redis:cluster1", "key1", "$var(my_val)", 1200); cache_fetch("redis:cluster1", "key1", "$var(my_val)"); cache_remove("redis:cluster1", "key1"); The AAA Interface defines the interfacing to the AAA servers in a similar way. Currently, OpenSIPS supports the RADIUS driver for the AAA interface, the Diameter driver being under heavy rework at this time. The AAA interface is not exposed at all at the routing script level, being exclusively used internally between modules. The Management Interface (MI) is an OpenSIPS interface that allows the external applications to trigger predefined commands in OpenSIPS. Such commands typically allow an external application/script to do the following: Push data in OpenSIPS (such as setting a debug level, registering a contact, and so on) Fetch data from OpenSIPS (see registered users, ongoing calls, get statistics, and so on) Trigger an internal action in OpenSIPS (reloading the data, sending a message, and so on) The MI commands are provided by the OpenSIPS core (See the online documentation at http://www.opensips.org/Documentation/Interface-CoreMI-2-1.) and also by the modules. (Check the online module documentation to see the commands provided by each module at http://www.opensips.org/Documentation/Modules-2-1.) For the MI, OpenSIPS supports the following drivers: XMLRPC, FIFO file, Datagrams, Json RPC, and HTTP. (See all the modules starting with the mi_ prefix.) A simple example of interacting with OpenSIPS via MI interfaces is using the opensipsctl utility—it uses the FIFO or XMLRPC protocols to push MI commands into OpenSIPS. The opensipsctl utility allows you to run an MI command explicitly via the fifo file: opensipsctl fifo ps opensipsctl fifo debug 4 The following is a simple program in Python to trigger in order to run an MI command in OpenSIPS via the XMLRPC protocol: #!/usr/bin/python import xmlrpclib opensips = xmlrpclib.ServerProxy('http://127.0.0.1:8080/RPC2') print opensips.ps(); The event interface is an OpenSIPS interface that provides you with different ways to notify the external applications about certain events triggered in OpenSIPS. In order to notify an external application about OpenSIPS' internal events, the event interface provides the following functions: Managing the exported events Managing the subscriptions from different applications Exporting generic functions to raise an event (regardless of the transport protocol used) Communicating with different transport protocols to send the events Events can be triggered by the core activities (See the online documentation at http://www.opensips.org/Documentation/Interface-CoreEvents-2-1.), module activities (See the Exported Events section in the module documentation at http://www.opensips.org/Documentation/Modules-2-1.), or explicitly from the routing script via the raise_event() function: raise_event("E_SCRIPT_EVENT", $avp(attributes), $avp(values)); To deliver the events, the OpenSIPS modules implement the following drivers: datagram, RabbitMQ, and XMLRPC. (See all the modules starting with the event_ prefix.) The statistics interface provides you with access to various internal statistics of OpenSIPS. It provides valuable information on what is going on in OpenSIPS. This can be used by the external applications for the monitoring purposes, load evaluation, and real-time integration with other services. The values of statistics variables are exclusively numerical. OpenSIPS provides two types of statistics variables: Counter-like: Variables that keep counting things that happened in OpenSIPS, such as received requests, processed dialogs, failed DB queries, and so on Computed values: Variables that are calculated in real-time, such as how much memory is used, the current load, active dialogs, active transactions, and so on Refer to http://www.opensips.org/Documentation/Interface-Statistics-1-10 for more information. In OpenSIPS, the statistics variables are grouped in different sets depending on their purposes. For example, the OpenSIPS core provides shmem, load, net, and other groups (http://www.opensips.org/Documentation/Interface-CoreStatistics-2-1), while each OpenSIPS module provides its own group. (Typically, the group has the same name as the module.) The statistics can be easily queried via the MI with the get_statistics command: # get various statistic variables, by list of names > opensipsctl fifo get_statistics rcv_requests inuse_transactions > core:rcv_requests = 453 > tm:inuse_transactions = 10 The OpenSIPS modules Each OpenSIPS module is a dynamic library that may be loaded on demand at OpenSIPS startup, if instructed so in the routing script. The modules are identified by their names and they typically export various resources to be used in/from the routing script: A set of module parameters (optional)—they allow the module to be configured at startup A set of script functions (optional)—functions to be used from the routing script A set of asynchronous script functions (optional)—functions to be used from the routing script, but in an asynchronous way via a resume route A set of variables (optional)—variables that can be used from the routing script A set of statistics variables (optional)—statistics specific to the module that can be read via the MI A set of MI commands (optional)—commands specific to the module to be triggered via the MI outside OpenSIPS A set of events (optional)—events that can be triggered by the module (and delivered by the event modules to the external applications) Based on what they implement, there are three types of modules in OpenSIPS: Modules implementing the functionalities to be used from the routing script, that is, authentication methods, routing algorithms, registration handling, and so on. Modules implementing one of the interfaces defined by the core in order to provide a driver for a certain communication protocol, that is, MySQL driver for the SQL interface, RADIUS driver for the AAA interface, and others. Modules implementing their own particular API; such an API is to be used directly (bypassing the core) by the other modules. This allows modules to use other modules, independent of the core. Such modules are the b2b_entities, the dialog module, TM module, and so on. We can conclude that even if most of the modules provide functionalities to the routing script, there are modules providing functionalities to other modules via the core interfaces or their own APIs. These are some examples: The db_mysql module implements the SQL interface defined by the core; the auth_db module using the SQL interface to perform the SQL operations can transparently use any module implementing the SQL interface The b2b_entities module defines and implements its own API (to manage SIP UASs and UACs); the b2b_logic module uses the API from the b2b_entities module directly in order to build even more complex functionalities (that are used later from the routing script) This leads to the concept of module dependencies and an OpenSIPS module may depend on the following: An external library at linking time, that is, b2b_logicdepends on the xml2 library The core interface at startup time—if a module uses the SQL interface, you need to load one or more modules implementing an SQL driver Another module—one module depends directly and explicitly on another module (that is, b2b_logic depends on b2b_entities) The documentation of each module (the README file) contains the information on the module's dependencies, what functions or parameters are exposed, and what MI commands or events are available. See the online module documentations at http://www.opensips.org/Documentation/Modules-2-1. Summary In this article, you have seen who is using OpenSIPS. From the internal perspective, you learned the design/structure of OpenSIPS, such as the OpenSIPS core and OpenSIPS modules and what they can provide. Resources for Article: Further resources on this subject: Architecture of FreeSWITCH [article] Innovation of Communication and Information Technologies [article] Cisco Unified Communications Manager 8: Call Routing, Dial Plan, and E.164 [article]

0
0
18535

Packt

29 Oct 2015

5 min read

Team Project Setup

Packt

29 Oct 2015

5 min read

0
0
10756

Packt

29 Oct 2015

14 min read

Introducing TabLayout

Packt

29 Oct 2015

14 min read

0
0
8449

Packt

29 Oct 2015

32 min read

Protecting Your Bitcoins

Packt

29 Oct 2015

32 min read

0
0
6188

article-image-detecting-shapes-employing-hough-transform

Packt

29 Oct 2015

13 min read

Detecting Shapes Employing Hough Transform

Packt

29 Oct 2015

13 min read

In this article by Amgad Muhammad, the author of OpenCV Android Programming By Example, we will see a famous shape analysis technique called the Hough Transform. You will learn about the basic idea behind this technique, which made it very popular and widely used. Detecting shapes We have seen how to detect edges; however, this process is a pixel-by-pixel process that answers the question whether this pixel is an edge or not. Moving forward in shape analysis, we would need more concrete information than just the edge test; we will need better representation. For example, if we have a picture of a box and we did the edge detection, we will end up with thousands and thousands of edge pixels; however, if we tried to fit a line to these edge pixels, we will get a rectangle that is more symbolic and is a useful representation. Understanding the Hough line transform There are many ways to fit a line through a number of points and Hough transform is considered the underconstrained method where we use only one point to find all the possible lines that can go through this point. We use another point to also find all the lines that can go through it and we keep doing this for all the points that we have. We end up with a voting system where each point is voting for a line and the more points laying on the same line, the higher the votes given to this line. In a nutshell, the Hough transform can be described as mapping a point in space to the parameter space of the shape of interest. With the equation of a line in the x and y space, , we transform it to the slope (a) and intercept space (b) and with this transformation, a point in the x and yspace is actually a line in the slope and intercept space with the equation : Insert image_4584_03_07.png Here we have five points in the x and y space (left). When converted to the slope and intercept space, we get five lines (right): Insert image_4584_03_08.png Now, every point in the x and y space will vote for a slope and intercept in the slope and intercept space, so all we have to do is find the maxima in the parameter space and this will be the line to fit our points: Insert image_4584_03_09.png In the right image, you can find the maxima value based on the votes of the points in the left image, and in the left image you can see that maxima is the slope and intercept of the line fitting the points. In the case of vertical lines, the slope is infinity; that's why it is more practical to use the polar equation of a line instead of the slope and intercept form. In this case, the equation that we will work with is and again, we have two parameters, and , and we will follow the same idea except that now the space is and instead of the slope and intercept: Insert image_4584_03_10.png We will follow the voting system to find the maxima, which represents the and of the line fitting our points. However, this time, a point in the and space will be sinusoid, and if two or more sinusoids intersect at the same and , this means that they belong to the same line: Insert image_4584_03_11.png You can see the Hough transform in action using the applets at http://www.rob.cs.tu-bs.de/teaching/interactive/. Detecting lines using Hough transform In OpenCV, we have two implementations of the Hough line transform: The standard Hough transform: The process is pretty much following the preceding process; however, it is considered the slower option as the algorithm has to examine all the edge points in a given image. The probabilistic Hough line transform: This option is the one that we will use in our example. In the probabilistic version, the algorithm attempts to minimize the amount of computation needed to detect lines by exploiting the difference in the fraction of votes needed to detect the lines. Intuitively, for strong/long lines, we only need a small fraction of its supporting points to vote before deciding if the accumulator bin reaches a count that is non-accidental; however, for shorter lines, a higher portion is needed to decide this. In conclusion, the algorithm tries to minimize the number of edge points that are needed to decide on the fitting line. UI definitions We will add a new item menu to start the Hough transform algorithm. Go to the res/menu/soft_scanner.xml file and open it to include the following item menu: <item android_id="@+id/action_HTL" android_enabled="true" android_visible="true" android_title="@string/action_HL"> </item> Detecting and drawing lines The process to use the Hough line transform is divided in four steps: Load the image of interest. Detect image edges using Canny; the output will be a binary image. Call either the standard or probabilistic Hough line transform on the binary image. Draw the detected lines. In SoftScanner Activity, we need to edit the onOptionesItemSelected method and add the following case: else if(id==R.id.action_HTL) { if(sampledImage==null) { Context context = getApplicationContext(); CharSequence text = "You need to load an image first!"; int duration = Toast.LENGTH_SHORT; Toast toast = Toast.makeText(context, text, duration); toast.show(); return true; } Mat binaryImage=new Mat(); Imgproc.cvtColor(sampledImage, binaryImage,Imgproc.COLOR_RGB2GRAY); Imgproc.Canny(binaryImage, binaryImage, 80, 100); Mat lines = new Mat(); int threshold = 50; Imgproc.HoughLinesP(binaryImage, lines, 1, Math.PI/180, threshold); Imgproc.cvtColor(binaryImage, binaryImage, Imgproc.COLOR_GRAY2RGB); for (int i = 0; i < lines.cols(); i++) { double[] line = lines.get(0, i); double xStart = line[0], yStart = line[1], xEnd = line[2], yEnd = line[3]; org.opencv.core.Point lineStart = new org.opencv.core.Point(xStart, yStart); org.opencv.core.Point lineEnd = new org.opencv.core.Point(xEnd, yEnd); Core.line(binaryImage, lineStart, lineEnd, new Scalar(0,0,255), 3); } displayImage(binaryImage); return true; } The code is actually straightforward, as follows: if(sampledImage==null) { Context context = getApplicationContext(); CharSequence text = "You need to load an image first!"; int duration = Toast.LENGTH_SHORT; Toast toast = Toast.makeText(context, text, duration); toast.show(); return true; } We first handle the case if the user clicks the item menu and didn't load an image: Mat binaryImage=new Mat(); Imgproc.cvtColor(sampledImage, binaryImage, Imgproc.COLOR_RGB2GRAY); Imgproc.Canny(binaryImage, binaryImage, 80, 100); Then, we initialize a new Mat object and convert the loaded image from the full color space to the grayscale space. Finally, we call Imgproc.Canny to convert the grayscale image to a binary image with only the edges displayed: Mat lines = new Mat(); int threshold = 50; Imgproc.HoughLinesP(binaryImage, lines, 1, Math.PI/180, threshold); The next step is to call Imgproc.HoughLinesP, which is the probabilistic version of the original Hough transform method, passing in the following parameters: A Mat object representing the binary image version of the loaded image A Mat object to hold the detected lines as the parameters A double for the resolution, in pixels, of the parameter, ; in our case, we set it to one pixel A double for the resolution, in radians, of the parameter, ; in our case, we set to it to one degree, An integer for the accumulator threshold to return only the lines with enough votes Usually, when using the probabilistic version of the Hough transform, you would use a smaller threshold because the algorithm is used to minimize the number of points used to vote. However, in the standard Hough transform, you should use a larger threshold. Imgproc.cvtColor(binaryImage, binaryImage, Imgproc.COLOR_GRAY2RGB); for (int i = 0; i < lines.cols(); i++) { double[] line = lines.get(0, i); double xStart = line[0], yStart = line[1], xEnd = line[2], yEnd = line[3]; org.opencv.core.Point lineStart = new org.opencv.core.Point(xStart, yStart); org.opencv.core.Point lineEnd = new org.opencv.core.Point(xEnd, yEnd); Core.line(binaryImage, lineStart, lineEnd, new Scalar(0,0,255), 3); } displayImage(binaryImage); Finally, we convert the binary image to a full color space to display the detected lines and then we loop on the detected lines' Mat object columns and draw them one by one using the parameters, : Insert image_4584_03_15.png Hough lines (in blue) detected from the edge image Detecting circles using Hough transform OpenCV provides another implementation of the Hough transform, but this time, instead of detecting lines, we will detect circles following the same idea of transforming the space to the parameters space. With an equation of a circle, ,we have three parameters, ; where a and b are the centers of the circle in the x and y direction, respectively, and r is the radius. Now, the parameter space is three-dimensional and every edge point belonging to a circle will vote in this three-dimensional space; then we search for the maxima in the parameter space to detect the circle center and radius. This procedure is very memory- and computation-intensive and the three-dimensional space will be very sparse. The good news is that OpenCV implements the circle Hough transform using a method called Hough gradient method. The Hough gradient method works as follows: in step one, we will apply an edge detector, for example, the Canny edge detector. In step two, we will increment the accumulator cells (two-dimensional space) in the direction of the gradient for every edge pixel. Intuitively, if we are encountering a circle, the accumulator cell with higher votes is actually that circle's center. Now, we have built a list of potential centers and we need to find the circle's radius so we will—for every center—consider the edge pixels by sorting them according to their distance from the center and keep a single radius that is supported (voted for) by the highest number of edge pixels: Insert image_4584_03_14.jpg UI definitions To trigger the circle Hough transform, we will add one item to our existing menu. Go to the res/menu/soft_scanner.xml file and open it to include the following item menu: <item android_id="@+id/action_CHT" android_enabled="true" android_visible="true" android_title="@string/action_CHT"> </item> Detecting and drawing circles The process of detecting circles is similar to the process of detecting lines: Load the image of interest. Convert it from a full color space to a grayscale space. Call the Hough circle transform method on the grayscale image. Draw the detected circles. We will again edit onOptionsItemSelected to handle the circle Hough transform case: else if(id==R.id.action_CHT) { if(sampledImage==null) { Context context = getApplicationContext(); CharSequence text = "You need to load an image first!"; int duration = Toast.LENGTH_SHORT; Toast toast = Toast.makeText(context, text, duration); toast.show(); return true; } Mat grayImage=new Mat(); Imgproc.cvtColor(sampledImage, grayImage, Imgproc.COLOR_RGB2GRAY); double minDist=20; int thickness=5; double cannyHighThreshold=150; double accumlatorThreshold=50; Mat circles = new Mat(); Imgproc.HoughCircles(grayImage, circles, Imgproc.CV_HOUGH_GRADIENT, 1, minDist,cannyHighThreshold,accumlatorThreshold,0,0); Imgproc.cvtColor(grayImage, grayImage, Imgproc.COLOR_GRAY2RGB); for (int i = 0; i < circles.cols(); i++) { double[] circle = circles.get(0, i); double centerX = circle[0], centerY = circle[1], radius = circle[2]; org.opencv.core.Point center = new org.opencv.core.Point(centerX, centerY); Core.circle(grayImage, center, (int) radius, new Scalar(0,0,255),thickness); } displayImage(grayImage); return true; } The code for the circle Hough transform is just as the one to detect lines except for the following part: double minDist=20; int thickness=5; double cannyHighThreshold=150; double accumlatorThreshold=50; Mat circles = new Mat(); Imgproc.HoughCircles(grayImage, circles, Imgproc.CV_HOUGH_GRADIENT, 1, minDist,cannyHighThreshold,accumlatorThreshold,0,0); Imgproc.cvtColor(grayImage, grayImage, Imgproc.COLOR_GRAY2RGB); for (int i = 0; i < circles.cols(); i++) { double[] circle = circles.get(0, i); double centerX = circle[0], centerY = circle[1], radius = circle[2]; org.opencv.core.Point center = new org.opencv.core.Point(centerX, centerY); Core.circle(grayImage, center, (int) radius, new Scalar(0,0,255),thickness); } We detect circles by calling Imgproc.HoughCircles and passing to it the following parameters: A Mat object representing the 8-bit, single-channel grayscale input image. A Mat object that will hold the detected circles. Every column of the matrix will hold a circle represented by these parameters, . An integer for the detection method and currently, OpenCV only implements the Hough gradient algorithm. A double used as an inverse ratio of the accumulator resolution to the image resolution. For example, if we passed one, the accumulator has the same resolution as the input image. If we passed two, the accumulator is half the resolution of the input image. A double for the minimum distance between the centers of the detected circles. If the parameter is too small, multiple neighbor circles may be falsely detected in addition to a true one. If it is too large, some circles may be missed. A double used for the upper threshold for the internal Canny edge detector; as it will be half the upper one for the lower threshold. A double for the accumulator threshold for the circle centers at the detection stage. The smaller it is, the more false circles may be detected. An integer for the minimum radius to be detected; if unknown, pass zero. An integer for the maximum radius to be detected; if unknown, pass zero. Finally, we loop on the detected circles and draw them one by one using Core.circle. Summary In this article, we covered a well-known shape analysis technique called the Hough Transform to fit lines and circles in the edge pixels.

0
0
5455

How-To Tutorials

article-image-rotation-forest-classifier-ensemble-based-feature-extraction

Packt

28 Oct 2015

16 min read

Rotation Forest - A Classifier Ensemble Based on Feature Extraction

Packt

28 Oct 2015

16 min read

In this article by Gopi Subramanian author of the book Python Data Science Cookbook you will learn bagging methods based on decision tree-based algorithms are very popular among the data science community. Rotation Forest The claim to fame of most of these methods is that they need zero data preparation as compared to the other methods, can obtain very good results, and can be provided as a black box of tools in the hands of software engineers. By design, bagging lends itself nicely to parallelization. Hence, these methods can be easily applied on a very large dataset in a cluster environment. The decision tree algorithms split the input data into various regions at each level of the tree. Thus, they perform implicit feature selection. Feature selection is one of the most important tasks in building a good model. By providing implicit feature selection, trees are at an advantageous position as compared to other techniques. Hence, bagging with decision trees comes with this advantage. Almost no data preparation is needed for decision trees. For example, consider the scaling of attributes. Attribute scaling has no impact on the structure of the trees. The missing values also do not affect decision trees. The effect of outliers is very minimal on a decision tree. We don’t have to do explicit feature transformations to accommodate feature interactions. One of the major complaints against tree-based methods is the difficulty with pruning the trees to avoid overfitting. Big trees tend to also fit the noise present in the underlying data and hence lead to a low bias and high variance. However, when we grow a lot of trees and the final prediction is an average of the output of all the trees in the ensemble, we avoid these problems. In this article, we will see a powerful tree-based ensemble method called rotation forest. A typical random forest requires a large number of trees to be a part of its ensemble in order to achieve good performance. Rotation forest can achieve similar or better performance with less number of trees. Additionally, the authors of this algorithm claim that the underlying estimator can be anything other than a tree. This way, it is projected as a new framework to build an ensemble similar to gradient boosting. (For more resources related to this topic, see here.) The algorithm Random forest and bagging gives impressive results with very large ensembles; having a large number of estimators results in the improvement of the accuracy of these methods. On the contrary, rotation forest is designed to work with a smaller number of ensembles. Let's write down the steps involved in building a rotation forest. The number of trees required in the forest is typically specified by the user. Let T be the number of trees required to be built. We start with iterating from one through T, that is, we build T trees. For each tree T, perform the following steps: Split the attributes in the training set into K nonoverlapping subsets of equal size. We have K datasets, each with K attributes. For each of the K datasets, we proceed to do the following: Bootstrap 75% of the data from each K dataset and use the bootstrapped sample for further steps. Run a principal component analysis on the ith subset in K. Retain all the principal components. For every feature j in the Kth subset, we have a principle component, a. Let's denote it as aij, where it’s the principal component for the jth attribute in the ith subset. Store the principal components for the subset. Create a rotation matrix of size, n X n, where n is the total number of attributes. Arrange the principal component in the matrix such that the components match the position of the feature in the original training dataset. Project the training dataset on the rotation matrix using the matrix multiplication. Build a decision tree with the projected dataset. Store the tree and rotation matrix. A quick note about PCA: PCA is an unsupervised method. In multivariate problems, PCA is used to reduce the dimension of the data with minimal information loss or, in other words, retain maximum variation in the data. In PCA, we find the directions in the data with the most variation, that is, the eigenvectors corresponding to the largest eigenvalues of the covariance matrix and project the data onto these directions. With a dataset (n x m) with n instances and m dimensions, PCA projects it onto a smaller subspace (n x d), where d << m. A point to note is that PCA is computationally very expensive. Programming rotation forest in Python Now let's write a Python code to implement rotation forest. We will proceed to test it on a classification problem. We will generate some classification dataset to demonstrate rotation forest. To our knowledge, there is no Python implementation available for rotation forest. We will leverage scikit-learns’s implementation of the decision tree classifier and use the train_test_split method for the bootstrapping. Refer to the following link to learn more about scikit-learn: http://scikit-learn.org/stable/ First write the necessary code to implement rotation forest and apply it on a classification problem. We will start with loading all the necessary libraries. Let's leverage the make_classification method from the sklearn.dataset module to generate the training data. We will follow it with a method to select a random subset of attributes called gen_random_subset: from sklearn.datasets import make_classification from sklearn.metrics import classification_report from sklearn.cross_validation import train_test_split from sklearn.decomposition import PCA from sklearn.tree import DecisionTreeClassifier import numpy as np def get_data(): """ Make a sample classification dataset Returns : Independent variable y, dependent variable x """ no_features = 50 redundant_features = int(0.1*no_features) informative_features = int(0.6*no_features) repeated_features = int(0.1*no_features) x,y = make_classification(n_samples=500,n_features=no_features,flip_y=0.03, n_informative = informative_features, n_redundant = redundant_features ,n_repeated = repeated_features,random_state=7) return x,y def get_random_subset(iterable,k): subsets = [] iteration = 0 np.random.shuffle(iterable) subset = 0 limit = len(iterable)/k while iteration < limit: if k <= len(iterable): subset = k else: subset = len(iterable) subsets.append(iterable[-subset:]) del iterable[-subset:] iteration+=1 return subsets We will write the build_rotationtree_model function, where we will build fully-grown trees and proceed to evaluate the forest’s performance using the model_worth function: def build_rotationtree_model(x_train,y_train,d,k): models = [] r_matrices = [] feature_subsets = [] for i in range(d): x,_,_,_ = train_test_split(x_train,y_train,test_size=0.3,random_state=7) # Features ids feature_index = range(x.shape[1]) # Get subsets of features random_k_subset = get_random_subset(feature_index,k) feature_subsets.append(random_k_subset) # Rotation matrix R_matrix = np.zeros((x.shape[1],x.shape[1]),dtype=float) for each_subset in random_k_subset: pca = PCA() x_subset = x[:,each_subset] pca.fit(x_subset) for ii in range(0,len(pca.components_)): for jj in range(0,len(pca.components_)): R_matrix[each_subset[ii],each_subset[jj]] = pca.components_[ii,jj] x_transformed = x_train.dot(R_matrix) model = DecisionTreeClassifier() model.fit(x_transformed,y_train) models.append(model) r_matrices.append(R_matrix) return models,r_matrices,feature_subsets def model_worth(models,r_matrices,x,y): predicted_ys = [] for i,model in enumerate(models): x_mod = x.dot(r_matrices[i]) predicted_y = model.predict(x_mod) predicted_ys.append(predicted_y) predicted_matrix = np.asmatrix(predicted_ys) final_prediction = [] for i in range(len(y)): pred_from_all_models = np.ravel(predicted_matrix[:,i]) non_zero_pred = np.nonzero(pred_from_all_models)[0] is_one = len(non_zero_pred) > len(models)/2 final_prediction.append(is_one) print classification_report(y, final_prediction) Finally, we will write a main function used to invoke the functions that we have defined: if __name__ == "__main__": x,y = get_data() # plot_data(x,y) # Divide the data into Train, dev and test x_train,x_test_all,y_train,y_test_all = train_test_split(x,y,test_size = 0.3,random_state=9) x_dev,x_test,y_dev,y_test = train_test_split(x_test_all,y_test_all,test_size=0.3,random_state=9) # Build a bag of models models,r_matrices,features = build_rotationtree_model(x_train,y_train,25,5) model_worth(models,r_matrices,x_train,y_train) model_worth(models,r_matrices,x_dev,y_dev) Walking through our code Let's start with our main function. We will invoke get_data to get our predictor attributes in the response attributes. In get_data, we will leverage the make_classification dataset to generate our training data for our recipe: def get_data(): """ Make a sample classification dataset Returns : Independent variable y, dependent variable x """ no_features = 30 redundant_features = int(0.1*no_features) informative_features = int(0.6*no_features) repeated_features = int(0.1*no_features) x,y = make_classification(n_samples=500,n_features=no_features,flip_y=0.03, n_informative = informative_features, n_redundant = redundant_features ,n_repeated = repeated_features,random_state=7) return x,y Let's look at the parameters passed to the make_classification method. The first parameter is the number of instances required; in this case, we say we need 500 instances. The second parameter is about how many attributes per instance are required. We say that we need 30. The third parameter, flip_y, randomly interchanges 3% of the instances. This is done to introduce some noise in our data. The next parameter is how many of these 30 features should be informative enough to be used in our classification. We have specified that 60% of our features, that is, 18 out of 30 should be informative. The next parameter is about redundant features. These are generated as a linear combination of the informative features in order to introduce correlation among the features. Finally, the repeated features are duplicate features, which are drawn randomly from both the informative and redundant features. Let's split the data into training and testing sets using train_test_split. We will reserve 30% of our data to test: Divide the data into Train, dev and test x_train,x_test_all,y_train,y_test_all = train_test_split(x,y,test_size = 0.3,random_state=9) Once again, we will leverage train_test_split to split our test data into dev and test sets: x_dev,x_test,y_dev,y_test = train_test_split(x_test_all,y_test_all,test_size=0.3,random_state=9) With the data divided to build, evaluate, and test the model, we will proceed to build our models: models,r_matrices,features = build_rotationtree_model(x_train,y_train,25,5) We will invoke the build_rotationtree_model function to build our rotation forest. We will pass our training data, predictor, x_train and response variable, y_train, the total number of trees to be built—25 in this case—and finally, the subset of features to be used—5 in this case. Let's jump into this function: models = [] r_matrices = [] feature_subsets = [] We will begin with declaring three lists to store each of the decision tree, rotation matrix for this tree, and subset of features used in this iteration. We will proceed to build each tree in our ensemble. As a first order of business, we will bootstrap to retain only 75% of the data: x,_,_,_ = train_test_split(x_train,y_train,test_size=0.3,random_state=7) We will leverage the train_test_split function from scikit-learn for the bootstrapping. We will then decide the feature subsets: # Features ids feature_index = range(x.shape[1]) # Get subsets of features random_k_subset = get_random_subset(feature_index,k) feature_subsets.append(random_k_subset) The get_random_subset function takes the feature index and number of subsets required as parameters and returns K subsets. In this function, we will shuffle the feature index. The feature index is an array of numbers starting from zero and ending with the number of features in our training set: np.random.shuffle(iterable) Let's say that we have ten features and our k value is five, indicating that we need subsets with five nonoverlapping feature indices and we need to do two iterations. We will store the number of iterations needed in the limit variable: limit = len(iterable)/k while iteration < limit: if k <= len(iterable): subset = k else: subset = len(iterable) iteration+=1 If our required subset is less than the total number of attributes, we can proceed to use the first k entries in our iterable. As we shuffled our iterables, we will be returning different volumes at different times: subsets.append(iterable[-subset:]) On selecting a subset, we will remove it from the iterable as we need nonoverlapping sets: del iterable[-subset:] With all the subsets ready, we will declare our rotation matrix: del iterable[-subset:] With all the subsets ready, we will declare our rotation matrix: # Rotation matrix R_matrix = np.zeros((x.shape[1],x.shape[1]),dtype=float) As you can see, our rotation matrix is of size, n x n, where is the number of attributes in our dataset. You can see that we have used the shape attribute to declare this matrix filled with zeros: for each_subset in random_k_subset: pca = PCA() x_subset = x[:,each_subset] pca.fit(x_subset) For each of the K subsets of data having only K features, we will proceed to do a principal component analysis. We will fill our rotation matrix with the component values: for ii in range(0,len(pca.components_)): for jj in range(0,len(pca.components_)): R_matrix[each_subset[ii],each_subset[jj]] = pca.components_[ii,jj] For example, let's say that we have three attributes in our subset in a total of six attributes. For illustration, let's say that our subsets are as follows: 2,4,6 and 1,3,5 Our rotation matrix, R, is of size, 6 x 6. Assume that we want to fill the rotation matrix for the first subset of features. We will have three principal components, one each for 2,4, and 6 of size, 1 x 3. The output of PCA from scikit-learn is a matrix of size components X features. We will go through each component value in the for loop. At the first run, our feature of interest is two, and the cell (0,0) in the component matrix output from PCA gives the value of contribution of feature two to component one. We have to find the right place in the rotation matrix for this value. We will use the index from the component matrices, ii and jj, with the subset list to get the right place in the rotation matrix: R_matrix[each_subset[ii],each_subset[jj]] = pca.components_[ii,jj] The each_subset[0] and each_subset[0] will put us in cell (2,2) in the rotation matrix. As we go through the loop, the next component value in cell (0,1) in the component matrix will be placed in cell (2,4) in the rotation matrix and the last one in cell (2,6) in the rotation matrix. This is done for all the attributes in the first subset. Let's go to the second subset; here the first attribute is one. The cell (0,0) of the component matrix corresponds to the cell (1,1) in the rotation matrix. Proceeding this way, you will notice that the attribute component values are arranged in the same order as the attributes themselves. With our rotation matrix ready, let's project our input onto the rotation matrix: x_transformed = x_train.dot(R_matrix) It’s time now to fit our decision tree: model = DecisionTreeClassifier() model.fit(x_transformed,y_train) Finally, we will store our models and the corresponding rotation matrices: models.append(model) r_matrices.append(R_matrix) With our model built, let's proceed to see how good our model is in both the train and dev datasets using the model_worth function: model_worth(models,r_matrices,x_train,y_train) model_worth(models,r_matrices,x_dev,y_dev) Let's see our model_worth function: for i,model in enumerate(models): x_mod = x.dot(r_matrices[i]) predicted_y = model.predict(x_mod) predicted_ys.append(predicted_y) In this function, perform prediction using each of the trees that we built. However, before doing the prediction, we will project our input using the rotation matrix. We will store all our prediction output in a list called predicted_ys. Let's say that we have 100 instances to predict and ten models in our tree; for each instance, we have ten predictions. We will store these as a matrix for convenience: predicted_matrix = np.asmatrix(predicted_ys) Now, we will proceed to give a final classification for each of our input records: final_prediction = [] for i in range(len(y)): pred_from_all_models = np.ravel(predicted_matrix[:,i]) non_zero_pred = np.nonzero(pred_from_all_models)[0] is_one = len(non_zero_pred) > len(models)/2 final_prediction.append(is_one) We will store our final prediction in a list called final_prediction. We will go through each of the predictions for our instance. Let's say that we are in the first instance (i=0 in our for loop); pred_from_all_models stores the output from all the trees in our model. It’s an array of zeros and ones indicating which class has the model classified in this instance. We will make another array out of it, non_zero_pred, which has only those entries from the parent arrays that are non-zero. Finally, if the length of this non-zero array is greater than half the number of models that we have, we say that our final prediction is one for the instance of interest. What we have accomplished here is the classic voting scheme. Let's look at how good our models are now by calling a classification report: print classification_report(y, final_prediction) Here is how good our model has performed on the training set: Let's see how good our model performance is on the dev dataset: References More information about rotation forest can be obtained from the following paper:. Rotation Forest: A New Classifier Ensemble Method Juan J. Rodrı´guez, Member, IEEE Computer Society, Ludmila I. Kuncheva, Member, IEEE, and Carlos J. Alonso The paper also claims that when rotation forest was compared to bagging, AdBoost, and random forest on 33 datasets, rotation forest outperformed all the other three algorithms. Similar to gradient boosting, authors of the paper claim that rotation forest is an overall framework and the underlying ensemble is not necessary to be a decision tree. Work is in progress in testing other algorithms such as Naïve Bayes, Neural Networks, and similar others. Summary In this article will learnt the bagging methods based on decision tree-based algorithms are very popular among the data science community. Resources for Article: Further resources on this subject: Mobile Phone Forensics – A First Step into Android Forensics [article] Develop a Digital Clock [article] Monitoring Physical Network Bandwidth Using OpenStack Ceilometer [article]

0
0
10885

Packt

28 Oct 2015

8 min read

Working On Your Bot

Packt

28 Oct 2015

8 min read

In this article by Kassandra Perch, the author of Learning JavaScript Robotics, we will learn how to wire up servos and motors, how to create a project with a motor and using the REPL, and how to create a project with a servo and a sensor (For more resources related to this topic, see here.) Wiring up servos and motors Wiring up servos will look similar to wiring up sensors, except the signal maps to an output. Wiring up a motor is similar to wiring an LED. Wiring up servos To wire up a servo, you'll have to use a setup similar to the following figure: A servo wiring diagram The wire colors may vary for your servo. If your wires are red, brown, and orange, red is 5V, brown is GND, and orange is signal. When in doubt, check the data sheet that came with your servo. After wiring up the servo, plug the board in and listen to your servo. If you hear a clicking noise, quickly unplug the board—this means your servo is trying to place itself in a position it cannot reach. Usually, there is a small screw at the bottom of most servos that you can use to calibrate them. Use a small screwdriver to rotate this until it stops clicking when the power is turned on. This procedure is the same for continuous servos—the diagram does not change much either. Just replace the regular servo with a continuous one and you're good to go. Wiring up motors Wiring up motors looks like the following diagram: A motor wiring diagram Again, you'll want the signal pin to go to a PWM pin. As there are only two pins, it can be confusing where the power pin goes—it goes to a PWM pin because, similar to our LED getting its power from the PWM pin, the same pin will provide the power to run the motor. Now that we know how to wire these up, let's work on a project involving a motor and Johnny-Five's REPL. Creating a project with a motor and using the REPL Grab your motor and board, and follow the diagram in the previous section to wire a motor. Let's use pin 6 for the signal pin, as shown in the preceding diagram. What we're going to do in our code is create a Motor object and inject it into the REPL, so we can play around with it in the command line. Create a motor.js file and put in the following code: var five = require('johnny-five'); var board = new five.Board(); board.on('ready', function(){ var motor = new five.Motor({ pin: 6 }); this.repl.inject({ motor: motor }); }); Then, plug in your board and use the motor.js node to start the program. Exploring the motor API If we take a look at the documentation on the Johnny-Five website, there are few things we can try here. First, let's turn our motor on at about half speed: > motor.start(125); The .start() method takes a value between 0 and 255. Sounds familiar? That's because these are the values we can assign to a PWM pin! Okay, let's tell our motor to coast to a stop: > motor.stop(); Note that while this function will cause the motor to coast to a stop, there is a dedicated .brake() method. However, this requires a dedicated break pin, which can be available using shields and certain motors. If you happen to have a directional motor, you can tell the motor to run in reverse using .reverse() with a value between 0 and 255: > motor.reverse(125); This will cause a directional motor to run in reverse at half speed. Note that this requires a shield. And that's about it. Operating motors isn't difficult and Johnny-Five makes it even easier. Now that we've learned how this operates, let's try a servo. Creating a project with a servo and a sensor Let's start with just a servo and the REPL, then we can add in a sensor. Use the diagram from the previous section as a reference to wire up a servo, and use pin 6 for signal. Before we write our program, let's take a look at some of the options the Servo object constructor gives us. You can set an arbitrary range by passing [min, max] to the range property. This is great for low quality servos that have trouble at very low and very high values. The type property is also important. We'll be using a standard servo, but you'll need to set this to continuous if you're using a continuous servo. Since standard is the default, we can leave this out for now. The offset property is important for calibration. If your servo is set too far in one direction, you can change the offset to make sure it can programmatically reach every angle it was meant to. If you hear clicking at very high or low values, try adjusting the offset. You can invert the direction of the servo with the invert property or initialize the servo at the center with center. Centering the servo helps you to know whether you need to calibrate it. If you center it and the arm isn't centered, try adjusting the offset property. Now that we've got a good grasp on the constructor, let's write some code. Create a file called servo-repl.js and enter the following: var five = require('johnny-five'); var board = new five.Board(); board.on('ready', function(){ var servo = new five.Servo({ pin: 6 }); this.repl.inject({ servo: servo }); }); This code simply constructs a standard servo object for pin 6 and injects it into the REPL. Then, run it using the following command line: > node servo-repl.js Your servo should jump to its initialization point. Now, let's figure out how to write the code that makes the servo move. Exploring the servo API with the REPL The most basic thing we can do with a servo is set it to a specific angle. We do this by calling the .to() function with a degree, as follows: > servo.to(90); This should center the servo. You can also set a time on the .to() function, which can take a certain amount of time: > servo.to(20, 500); This will move the servo from 90 degrees to 20 degrees in over 500 ms. You can even determine how many steps the servo takes to get to the new angle, as follows: > servo.to(120, 500, 10); This will move the servo to 120 degrees in over 500 ms in 10 discreet steps. The .to() function is very powerful and will be used in a majority of your Servo objects. However, there are many useful functions. For instance, checking whether a servo is calibrated correctly is easier when you can see all angles quickly. For this, we can use the .sweep() function, as follows: > servo.sweep(); This will sweep the servo back and forth between its minimum and maximum values, which are 0 and 180, unless set in the constructor via the range property. You can also specify a range to sweep, as follows: > servo.sweep({ range: [20, 120] }); This will sweep the servo from 20 to 120 repeatedly. You can also set the interval property, which will change how long the sweep takes, and a step property, which sets the number of discreet steps taken, as follows: > servo.sweep({ range: [20, 120], interval: 1000, step: 10 }); This will cause the servo to sweep from 20 to 120 every second in 10 discreet steps. You can stop a servo's movement with the .stop() method, as follows: > servo.stop(); For continuous servos, you can use .cw() and .ccw() with a speed between 0 and 255 to move the continuous servo back and forth. Now that we've seen the Servo object API at work, let's hook our servo up to a sensor. In this case, we'll use a photocell. This code is a good example for a few reasons: it shows off Johnny-Five's event API, allows us to use a servo with an event, and gets us used to wiring inputs to outputs using events. First, let's add a photocell to our project using the following diagram: A servo and photoresistor wiring diagram Then, create a photoresistor-servo.js file, and add the following. var five = require('johnny-five'); var board = new five.Board(); board.on('ready', function(){ var servo = new five.Servo({ pin: 6 }); var photoresistor = new five.Sensor({ pin: "A0", freq: 250 }); photoresistor.scale(0, 180).on('change', function(){ servo.to(this.value); }); }); How this works is as follows. During the data event, we tell our servo to move to the correct position based on the scaled data from our photoresistor. Now, run the following command line: > node photoresistor-servo.js Then, try turning the light on and covering up your photoresistor and watch the servo move! Summary We now know how to use servos and motors that helps to move small robots. Wheeled robots are good to go! But what about more complex projects, such as the hexapod? Walking takes timing. As we mentioned in the .to() function, we can time the servo movement, thanks to the Animation library. Resources for Article: Further resources on this subject: Internet-Connected Smart Water Meter [article] Raspberry Pi LED Blueprints [article] Welcome to JavaScript in the full stack [article]

0
0
23435

article-image-making-simple-web-based-ssh-client-using-nodejs-and-socketio

Jakub Mandula

28 Oct 2015

7 min read

Making a simple Web based SSH client using Node.js and Socket.io

Jakub Mandula

28 Oct 2015

7 min read

If you are reading this post, you probably know what SSH stands for. But just for the sake of formality, here we go: SSH stands for Secure Shell. It is a network protocol for secure access to the shell on a remote computer. You can do much more over SSH besides commanding your computer. Here you can find further information: http://en.wikipedia.org/wiki/Secure_Shell. In this post, we are going to create a very simple web terminal. And when I say simple, I mean it! However much you like colors, it will not support them because the parsing is just beyond the scope of this post. If you want a good client-side terminal library use term.js. It is made by the same guy who wrote pty.js, which we will be using. It is able to handle pretty much all key events and COLORS!!!! Installation I am going to assume you already have your node and npm installed. First we will install all of the npm packages we will be using: npm install express pty.js socket.io Express is a super cool web framework for Node. We are going to use it to serve our static files. I know it is a bit overkill, but I like Express. pty.js is where the magic will be happening. It forks processes into virtual pseudo terminals and provides bindings for communication. Socket.io is what we will use to transmit the data from the web browser to the server and back. It uses modern WebSockets, but provides fallbacks for backward compatibility. Anytime you want to create a real-time application, Socket.io is the way to go. Planning First things first, we need to think what we want the program to do. We want the program to create an instance of a shell on the server (remote machine) and send all of the text to the browser. Back in the browser, we want to capture any user events and send them back to the server shell. The WebSSH server This is the code that will power the terminal forwarding. Open a new file named server.js and start by importing all of the libraries: var express = require('express'); var https = require('https'); var http = require('http'); var fs = require('fs'); var pty = require('pty.js'); Set up express: // Setup the express app var app = express(); // Static file serving app.use("/",express.static("./")); Next we are going to create the server. // Creating an HTTP server var server = http.createServer(app).listen(8080) If you want to use HTTPS, which you probably will, you need to generate a key and certificate and import them as shown. var options = { key: fs.readFileSync('keys/key.pem'), cert: fs.readFileSync('keys/cert.pem') }; Then use the options object to create the actual server. Notice that this time we are using the https package. // Create an HTTPS server var server = https.createServer(options, app).listen(8080) CAUTION: Even if you use HTTPS, do not use this example program on the Internet. You are not authenticating the client in any way and thus providing a free open gate to your computer. Please make sure you only use this on your Private network protected by a firewall!!! Now bind the socket.io instance to the server: var io = require('socket.io')(server); After this, we can set up the place where the magic happens. // When a new socket connects io.on('connection', function(socket){ // Create terminal var term = pty.spawn('sh', [], { name: 'xterm-color', cols: 80, rows: 30, cwd: process.env.HOME, env: process.env }); // Listen on the terminal for output and send it to the client term.on('data', function(data){ socket.emit('output', data); }); // Listen on the client and send any input to the terminal socket.on('input', function(data){ term.write(data); }); // When socket disconnects, destroy the terminal socket.on("disconnect", function(){ term.destroy(); console.log("bye"); }); }); In this block, all we do is wait for new connections. When we get one, we spawn a new virtual terminal and start to pump the data from the terminal to the socket and vice versa. After the socket disconnects, we make sure to destroy the terminal. If you have noticed, I am using the simple sh shell. I did this mainly because I don't have a fancy prompt on it. Because we are not adding any parsing logic, my bash prompt would show up like this: ]0;piman@mothership: ~ _[01;32mâœ“ [33mpiman_[0m â†£ _[1;34m[~]_[37m$[0m - Eww! But you may use any shell you like. This is all that we need on the server side. Save the file and close it. Client side The client side is going to be just a very simple HTML file. Start with a very simple HTML markup: <!doctype html> <html> <head> <title>SSH Client</title> <script type="text/javascript" src="//cdnjs.cloudflare.com/ajax/libs/socket.io/1.3.5/socket.io.min.js"></script> <script type="text/javascript" src="//cdnjs.cloudflare.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script> <style> body { margin: 0; padding: 0; } .terminal { font-family: monospace; color: white; background: black; } </style> </head> <body> <h1>SSH</h1> <div class="terminal"> </div> <script> </script> </body> </html> I am downloading the client side libraries jquery and socket.io from cdnjs. All of the client code will be written in the script tag below the terminal div. Surprisingly the code is very simple: // Connect to the socket.io server var socket = io.connect('http://localhost:8080'); // Wait for data from the server socket.on('output', function (data) { // Insert some line breaks where they belong data = data.replace("n", "<br>"); data = data.replace("r", "<br>"); // Append the data to our terminal $('.terminal').append(data); }); // Listen for user input and pass it to the server $(document).on("keypress",function(e){ var char = String.fromCharCode(e.which); socket.emit("input", char); }); Notice that we do not have to explicitly append the text the client types to the terminal mainly because the server echos it back anyways. Now we are done! Run the server and open up the URL in your browser. node server.js You should see a small prompt and be able to start typing commands. You can now explore you machine from the browser! Remember that our Web Terminal does not support Tab, Ctrl, Backspace or Esc characters. Implementing this is your homework. Conclusion I hope you found this tutorial useful. You can apply the knowledge in any real-time application where communication with the server is critical. All the code is available here. Please note, that if you'd like to use a browser terminal I strongly recommend term.js. It supports colors and styles and all the basic keys including Tabs, Backspace etc. I use it in my PiDashboard project. It is much cleaner and less tedious than the example I have here. I can't wait what amazing apps you will invent based on this. About the Author Jakub Mandula is a student interested in anything to do with technology, computers, mathematics or science.

0
6
48193

article-image-c-language-support-asynchrony

Packt

28 Oct 2015

25 min read

C# Language Support for Asynchrony

Packt

28 Oct 2015

25 min read

In this article by Eugene Agafonov and Andrew Koryavchenko, the authors of the book, Mastering C# Concurrency, talks about Task Parallel Library in detail. Also, the C# language infrastructure that supports asynchronous calls have been explained. The Task Parallel Library makes it possible to combine asynchronous tasks and set dependencies between them. To get a clear understanding, in this article, we will use this approach to solve a real problem—downloading images from Bing (the search engine). Also, we will do the following: Implement standard synchronous approach Use Task Parallel Library to create an asynchronous version of the program Use C# 5.0 built-in asynchrony support to make the code easier to read and maintain Simulate C# asynchronous infrastructure with the help of iterators Learn about other useful features of Task Parallel Library Make any C# type compatible with built-in asynchronous keywords (For more resources related to this topic, see here.) Implementing the downloading of images from Bing Everyday Bing.com publishes its background image that can be used as desktop wallpaper. There is an XML API to get information about these pictures that can be found at http://www.bing.com/hpimagearchive.aspx. Creating a simple synchronous solution Let's try to write a program to download the last eight images from this site. We will start by defining objects to store image information. This is where a thumbnail image and its description will be stored: using System.Drawing; public class WallpaperInfo{ private readonly Image _thumbnail; private readonly string _description; public WallpaperInfo(Image thumbnail, string description) { _thumbnail = thumbnail; _description = description; } public Image Thumbnail { get { return _thumbnail; } } public string Description { get { return _description; } } } The next container type is for all the downloaded pictures and the time required to download and make the thumbnail images from the original pictures: public class WallpapersInfo { private readonly long _milliseconds; private readonly WallpaperInfo[] _wallpapers; public WallpapersInfo(long milliseconds, WallpaperInfo[] wallpapers) { _milliseconds = milliseconds; _wallpapers = wallpapers; } public long Milliseconds { get { return _milliseconds; } } public WallpaperInfo[] Wallpapers { get { return _wallpapers; } } } Now we need to create a loader class to download images from Bing. We need to define a Loader static class and follow with an implementation. Let's create a method that will make a thumbnail image from the source image stream: private static Image GetThumbnail(Stream imageStream) { using (imageStream) { var fullBitmap = Image.FromStream(imageStream); return new Bitmap(fullBitmap, 192, 108); } } To communicate via the HTTP protocol, it is recommended to use the System.Net.HttpClient type from the System.Net.dll assembly. Let's create the following extension methods that will allow us to use the POST HTTP method to download an image and get an opened stream: private static Stream DownloadData(this HttpClient client, string uri) { var response = client.PostAsync( uri, new StringContent(string.Empty)).Result; return response.Content.ReadAsStreamAsync().Result; } private static Task<Stream> DownloadDataAsync(this HttpClient client, string uri) { Task<HttpResponseMessage> responseTask = client.PostAsync( uri, new StringContent(string.Empty)); return responseTask.ContinueWith(task => task.Result.Content.ReadAsStreamAsync()).Unwrap(); } To create the easiest implementation possible, we will implement downloading without any asynchrony. Here, we will define HTTP endpoints for the Bing API: private const string _catalogUri = "http://www.bing.com/hpimagearchive.aspx? format=xml&idx=0&n=8&mbl=1&mkt=en-ww"; private const string _imageUri = "http://bing.com{0}_1920x1080.jpg"; Then, we will start measuring the time required to finish downloading and download an XML catalog that has information about the images that we need: var sw = Stopwatch.StartNew(); var client = new HttpClient(); var catalogXmlString = client.DownloadString(_catalogUri); Next, the XML string will be parsed to an XML document: var xDoc = XDocument.Parse(catalogXmlString); Now using LINQ to XML, we will query the information needed from the document and run the download process for each image: var wallpapers = xDoc .Root .Elements("image") .Select(e => new { Desc = e.Element("copyright").Value, Url = e.Element("urlBase").Value }) .Select(item => new { item.Desc, FullImageData = client.DownloadData( string.Format(_imageUri, item.Url)) }) .Select( item => new WallpaperInfo( GetThumbnail(item.FullImageData), item.Desc)) .ToArray(); sw.Stop(); The first Select method call extracts image URL and description from each image XML element that is a direct child of root element. This information is contained inside the urlBase and copyright XML elements inside the image element. The second one downloads an image from the Bing site. The last Select method creates a thumbnail image and stores all the information needed inside the WallPaperInfo class instance. To display the results, we need to create a user interface. Windows Forms is a simple and fast to implement technology, so we use it to show the results to the user. There is a button that runs the download, a panel to show the downloaded pictures, and a label that will show the time required to finish downloading. Here is the implementation code. This includes a calculation of the top co-ordinate for each element, a code to display the images and start the download process: private int GetItemTop(int height, int index) { return index * (height + 8) + 8; } private void RefreshContent(WallpapersInfo info) { _resultPanel.Controls.Clear(); _resultPanel.Controls.AddRange( info.Wallpapers.SelectMany((wallpaper, i) => new Control[] { new PictureBox { Left = 4, Image = wallpaper.Thumbnail, AutoSize = true, Top = GetItemTop(wallpaper.Thumbnail.Height, i) }, new Label { Left = wallpaper.Thumbnail.Width + 8, Top = GetItemTop(wallpaper.Thumbnail.Height, i), Text = wallpaper.Description, AutoSize = true } }).ToArray()); _timeLabel.Text = string.Format( "Time: {0}ms", info.Milliseconds); } private void _loadSyncBtn_Click(object sender, System.EventArgs e) { var info = Loader.SyncLoad(); RefreshContent(info); } The result looks as follows: So the time to download all these images should be about several seconds if the internet connection is broadband. Can we do this faster? We certainly can! Now we will download and process the images one by one, but we totally can process each image in parallel. Creating a parallel solution with Task Parallel Library The Task Parallel Library and the code that shows the relationships between tasks naturally splits into several stages as follows: Load images catalog XML from Bing Parsing the XML document and get the information needed about the images Load each image's data from Bing Create a thumbnail image for each image downloaded The process can be visualized with the dependency chart: HttpClient has naturally asynchronous API, so we only need to combine everything together with the help of a Task.ContinueWith method: public static Task<WallpapersInfo> TaskLoad() { var sw = Stopwatch.StartNew(); var downloadBingXmlTask = new HttpClient().GetStringAsync(_catalogUri); var parseXmlTask = downloadBingXmlTask.ContinueWith(task => { var xmlDocument = XDocument.Parse(task.Result); return xmlDocument.Root .Elements("image") .Select(e => new { Description = e.Element("copyright").Value, Url = e.Element("urlBase").Value }); }); var downloadImagesTask = parseXmlTask.ContinueWith( task => Task.WhenAll( task.Result.Select(item => new HttpClient() .DownloadDataAsync(string.Format(_imageUri, item.Url)) .ContinueWith(downloadTask => new WallpaperInfo( GetThumbnail(downloadTask.Result), item.Description))))) .Unwrap(); return downloadImagesTask.ContinueWith(task => { sw.Stop(); return new WallpapersInfo(sw.ElapsedMilliseconds, task.Result); }); } The code has some interesting moments. The first task is created by the HttpClient instance, and it completes when the download process succeeds. Now we will attach a subsequent task, which will use the XML string downloaded by the previous task, and then we will create an XML document from this string and extract the information needed. Now this is becoming more complicated. We want to create a task to download each image and continue until all these tasks complete successfully. So we will use the LINQ Select method to run downloads for each image that was defined in the XML catalog, and after the download process completes, we will create a thumbnail image and store the information in the WallpaperInfo instance. This creates IEnumerable<Task<WallpaperInfo>> as a result, and to wait for all these tasks to complete, we will use the Task.WhenAll method. However, this is a task that is inside a continuation task, and the result is going to be of the Task<Task<WallpaperInfo[]>> type. To get the inner task, we will use the Unwrap method, which has the following syntax: public static Task Unwrap(this Task<Task> task) This can be used on any Task<Task> instance and will create a proxy task that represents an entire asynchronous operation properly. The last task is to stop the timer and return the downloaded images and is quite straightforward. We have to add another button to the UI to run this implementation. Notice the implementation of the button click handler: private void _loadTaskBtn_Click(object sender, System.EventArgs e) { var info = Loader.TaskLoad(); info.ContinueWith(task => RefreshContent(task.Result), CancellationToken.None, TaskContinuationOptions.None, TaskScheduler.FromCurrentSynchronizationContext()); } Since the TaskLoad method is asynchronous, it returns immediately. To display the results, we have to define a continuation task. The default task scheduler will run a task code on a thread pool worker thread. To work with UI controls, we have to run the code on the UI thread, and we use a task scheduler that captures the current synchronization context and runs the continuation task on this. Let's name the button as Load using TPL and test the results. If your internet connection is fast, this implementation will download the images in parallel much faster compared to the previous sequential download process. If we look back at the code, we will see that it is quite hard to understand what it actually does. We can see how one task depends on other, but the original goal is unclear despite the code being very compact and easy. Imagine what will happen if we would try to add exception handling here. We will have to append an additional continuation task with exception handling to each task. This will be much harder to read and understand. In a real-world program, it will be a challenging task to keep in mind these tasks composition and support a code written in such a paradigm. Enhancing the code with C# 5.0 built-in support for asynchrony Fortunately, C# 5.0 introduced the async and await keywords that are intended to make asynchronous code look like synchronous, and thus, makes reading of code and understanding the program flow easier. However, this is another abstraction and it hides many things that happen under the hood from the programmer, which in several situations is not a good thing. Now let's rewrite the previous code using new C# 5.0 features: public static async Task<WallpapersInfo> AsyncLoad() { var sw = Stopwatch.StartNew(); var client = new HttpClient(); var catalogXmlString = await client.GetStringAsync(_catalogUri); var xDoc = XDocument.Parse(catalogXmlString); var wallpapersTask = xDoc .Root .Elements("image") .Select(e => new { Description = e.Element("copyright").Value, Url = e.Element("urlBase").Value }) .Select(async item => new { item.Description, FullImageData = await client.DownloadDataAsync( string.Format(_imageUri, item.Url)) }); var wallpapersItems = await Task.WhenAll(wallpapersTask); var wallpapers = wallpapersItems.Select( item => new WallpaperInfo( GetThumbnail(item.FullImageData), item.Description)); sw.Stop(); return new WallpapersInfo(sw.ElapsedMilliseconds, wallpapers.ToArray()); } Now the code looks almost like the first synchronous implementation. The AsyncLoad method has a async modifier and a Task<T> return value, and such methods must always return Task or be declared as void—this is enforced by the compiler. However, in the method's code, the type that is returned is just T. This is strange at first, but the method's return value will be eventually turned into Task<T> by the C# 5.0 compiler. The async modifier is necessary to use await inside the method. In the further code, there is await inside a lambda expression, and we need to mark this lambda as async as well. So what is going on when we use await inside our code? It does not always mean that the call is actually asynchronous. It can happen that by the time we call the method, the result is already available, so we just get the result and proceed further. However, the most common case is when we make an asynchronous call. In this case, we start. for example by downloading a XML string from Bing via HTTP and immediately return a task that is a continuation task and contains the rest of the code after the line with await. To run this, we need to add another button named Load using async. We are going to use await in the button click event handler as well, so we need to mark it with the async modifier: private async void _loadAsyncBtn_Click(object sender, System.EventArgs e) { var info = await Loader.AsyncLoad(); RefreshContent(info); } Now if the code after await is being run in a continuation task, why is there no multithreaded access exception? The RefreshContent method runs in another task, but the C# compiler is aware of the synchronization context and generates a code that executes the continuation task on the UI thread. The result should be as fast as a TPL implementation but the code is much cleaner and easy to follow. The last but not least, is possibility to put asynchronous method calls inside a try block. The C# compiler generates a code that will propagate the exception into the current context and unwrap the AggregateException instance to get the original exception from it. In C# 5.0, it was impossible to use await inside catch and finally blocks, but C# 6.0 introduced a new async/await infrastructure and this limitation was removed. Simulating C# asynchronous infrastructure with iterators To dig into the implementation details, it makes sense to look at the decompiled code of the AsyncLoad method: public static Task<WallpapersInfo> AsyncLoad() { Loader.<AsyncLoad>d__21 stateMachine; stateMachine.<>t__builder = AsyncTaskMethodBuilder<WallpapersInfo>.Create(); stateMachine.<>1__state = -1; stateMachine .<>t__builder .Start<Loader.<AsyncLoad>d__21>(ref stateMachine); return stateMachine.<>t__builder.Task; } The method body was replaced by a compiler-generated code that creates a special kind of state machine. We will not review the further implementation details here, because it is quite complicated and is subject to change from version to version. However, what's going on is that the code gets divided into separate pieces at each line where await is present, and each piece becomes a separate state in the generated state machine. Then, a special System.Runtime.CompilerServices.AsyncTaskMethodBuilder structure creates Task that represents the generated state machine workflow. This state machine is quite similar to the one that is generated for the iterator methods that leverage the yield keyword. In C# 6.0, the same universal code gets generated for the code containing yield and await. To illustrate the general principles behind the generated code, we can use iterator methods to implement another version of asynchronous images download from Bing. Therefore, we can turn an asynchronous method into an iterator method that returns the IEnumerable<Task> instance. We replace each await with yield return making each iteration to be returned as Task. To run such a method, we need to execute each task and return the final result. This code can be considered as an analogue of AsyncTaskMethodBuilder: private static Task<TResult> ExecuteIterator<TResult>( Func<Action<TResult>,IEnumerable<Task>> iteratorGetter) { return Task.Run(() => { var result = default(TResult); foreach (var task in iteratorGetter(res => result = res)) task.Wait(); return result; }); } We iterate through each task and await its completion. Since we cannot use the out and ref parameters in iterator methods, we use a lambda expression to return the result from each task. To make the code easier to understand, we have created a new container task and used the foreach loop; however, to be closer to the original implementation, we should get the first task and use the ContinueWith method providing the next task to it and continue until the last task. In this case, we will end up having one final task representing an entire sequence of asynchronous operations, but the code will become more complicated as well. Since it is not possible to use the yield keyword inside a lambda expressions in the current C# versions, we will implement image download and thumbnail generation as a separate method: private static IEnumerable<Task> GetImageIterator( string url, string desc, Action<WallpaperInfo> resultSetter) { var loadTask = new HttpClient().DownloadDataAsync( string.Format(_imageUri, url)); yield return loadTask; var thumbTask = Task.FromResult(GetThumbnail(loadTask.Result)); yield return thumbTask; resultSetter(new WallpaperInfo(thumbTask.Result, desc)); } It looks like a common C# async code with yield return used instead of the await keyword and resultSetter used instead of return. Notice the Task.FromResult method that we used to get Task from the synchronous GetThumbnail method. We can use Task.Run and put this operation on a separate worker thread, but it will be an ineffective solution; Task.FromResult allows us to get Task that is already completed and has a result. If you use await with such task, it will be translated into a synchronous call. The main code can be rewritten in the same way: private static IEnumerable<Task> GetWallpapersIterator( Action<WallpaperInfo[]> resultSetter) { var catalogTask = new HttpClient().GetStringAsync(_catalogUri); yield return catalogTask; var xDoc = XDocument.Parse(catalogTask.Result); var imagesTask = Task.WhenAll(xDoc .Root .Elements("image") .Select(e => new { Description = e.Element("copyright").Value, Url = e.Element("urlBase").Value }) .Select(item => ExecuteIterator<WallpaperInfo>( resSetter => GetImageIterator( item.Url, item.Description, resSetter)))); yield return imagesTask; resultSetter(imagesTask.Result); } This combines everything together: public static WallpapersInfo IteratorLoad() { var sw = Stopwatch.StartNew(); var wallpapers = ExecuteIterator<WallpaperInfo[]>( GetWallpapersIterator) .Result; sw.Stop(); return new WallpapersInfo(sw.ElapsedMilliseconds, wallpapers); } To run this, we will create one more button called Load using iterator. The button click handler just runs the IteratorLoad method and then refreshes the UI. This also works with about the same speed as other asynchronous implementations. This example can help us to understand the logic behind the C# code generation for asynchronous methods used with await. Of course, the real code is much more complicated, but the principles behind it remain the same. Is the async keyword really needed? It is a common question about why do we need to mark methods as async? We have already mentioned iterator methods in C# and the yield keyword. This is very similar to async/await, and yet we do not need to mark iterator methods with any modifier. The C# compiler is able to determine that it is an iterator method when it meets the yield return or yield break operators inside such a method. So the question is, why is it not the same with await and the asynchronous methods? The reason is that asynchrony support was introduced in the latest C# version, and it is very important not to break any legacy code while changing the language. Imagine if any code used await as a name for a field or variable. If C# developers make await a keyword without any conditions, this old code will break and stop compiling. The current approach guarantees that if we do not mark a method with async, the old code will continue to work. Fire-and-forget tasks Besides Task and Task<T>, we can declare an asynchronous method as void. It is useful in the case of top-level event handlers, for example, the button click or text changed handlers in the UI. An event handler that returns a value is possible, but is very inconvenient to use and does not make much sense. So allowing async void methods makes it possible to use await inside such event handlers: private async void button1_Click(object sender, EventArgs e) { await SomeAsyncStuff(); } It seems that nothing bad is happening, and the C# compiler generates almost the same code as for the Task returning method, but there is an important catch related to exceptions handling. When an asynchronous method returns Task, exceptions are connected to this task and can be handled both by TPL and the try/catch block in case await is used. However, if we have a async void method, we have no Task to attach the exceptions to and those exceptions just get posted to the current synchronization context. These exceptions can be observed using AppDomain.UnhandledException or similar events in a GUI application, but this is very easy to miss and not a good practice. The other problem is that we cannot use a void returning asynchronous method with await, since there is no return value that can be used to await on it. We cannot compose such a method with other asynchronous tasks and participate in the program workflow. It is basically a fire-and-forget operation that we start, and then we have no way to control how it will proceed (if we did not write the code for this explicitly). Another problem is void returning async lambda expression. It is very hard to notice that lambda returns void, and all problems related to usual methods are related to lambda expression as well. Imagine that we want to run some operation in parallel. To achieve this, we can use the Parallel.ForEach method. To download some news in parallel, we can write a code like this: Parallel.ForEach(Enumerable.Range(1,10), async i => { var news = await newsClient.GetTopNews(i); newsCollection.Add(news); }); However, this will not work, because the second parameter of the ForEach method is Action<T>, which is a void returning delegate. Thus, we will spawn 10 download processes, but since we cannot wait for completion, we abandon all asynchronous operations that we just started and ignore the results. A general rule of thumb is to avoid using async void methods. If this is inevitable and there is an event handler, then always wrap the inner await method calls in try/catch blocks and provide exception handling. Other useful TPL features Task Parallel Library has a large codebase and some useful features such as Task.Unwrap or Task.FromResult that are not very well known to developers. We have still not mentioned two more extremely useful methods yet. They are covered in the following sections. Task.Delay Often, it is required to wait for a certain amount of time in the code. One of the traditional ways to wait is using the Thread.Sleep method. The problem is that Thread.Sleep blocks the current thread, and it is not asynchronous. Another disadvantage is that we cannot cancel waiting if something has happened. To implement a solution for this, we will have to use system synchronization primitives such as an event, but this is not very easy to code. To keep the code simple, we can use the Task.Delay method: // Do something await Task.Delay(1000); // Do something This method can be canceled with a help of the CancellationToken infrastructure and uses system timer under the hood, so this kind of waiting is truly asynchronous. Task.Yield Sometimes we need a part of the code to be guaranteed to run asynchronously. For example, we need to keep the UI responsive, or maybe we would like to implement a fine-grained scenario. Anyway, as we already know that using await does not mean that the call will be asynchronous. If we want to return control immediately and run the rest of the code as a continuation task, we can use the Task.Yield method: // Do something await Task.Yield(); // Do something Task.Yield just causes a continuation to be posted on the current synchronization context, or if the synchronization context is not available, a continuation will be posted on a thread pool worker thread. Implementing a custom awaitable type Until now we have only used Task with the await operator. However, it is not the only type that is compatible with await. Actually, the await operator can be used with every type that contains the GetAwaiter method with no parameters and the return type that does the following: Implements the INotifyCompletion interface Contains the IsCompleted boolean property Has the GetResult method with no parameters This method can even be an extension method, so it is possible to extend the existing types and add the await compatibility to them. In this example, we will create such a method for the Uri type. This method will download content as a string via HTTP from the address provided in the Uri instance: private static TaskAwaiter<string> GetAwaiter(this Uri url) { return new HttpClient().GetStringAsync(url).GetAwaiter(); } var content = await new Uri("http://google.com"); Console.WriteLine(content.Substring(0, 10)); If we run this, we will see the first 10 characters of the Google website content. As you may notice, here we used the Task type indirectly, returning the already provided awaiter method for the Task type. We can implement an awaiter method manually from scratch, but it really does not make any sense. To understand how this works it will be enough to create a custom wrapper around an already existing TaskAwaiter: struct DownloadAwaiter : INotifyCompletion { private readonly TaskAwaiter<string> _awaiter; public DownloadAwaiter(Uri uri) { Console.WriteLine("Start downloading from {0}", uri); var task = new HttpClient().GetStringAsync(uri); _awaiter = task.GetAwaiter(); Task.GetAwaiter().OnCompleted(() => Console.WriteLine( "download completed")); } public bool IsCompleted { get { return _awaiter.IsCompleted; } } public void OnCompleted(Action continuation) { _awaiter.OnCompleted(continuation); } public string GetResult() { return _awaiter.GetResult(); } } With this code, we have customized asynchronous execution that provides diagnostic information to the console. To get rid of TaskAwaiter, it will be enough to change the OnCompleted method with custom code that will execute some operation and then a continuation provided in this method. To use this custom awaiter, we need to change GetAwaiter accordingly: private static DownloadAwaiter GetAwaiter(this Uri uri) { return new DownloadAwaiter(uri); } If we run this, we will see additional information on the console. This can be useful for diagnostics and debugging. Summary In this article, we have looked at the C# language infrastructure that supports asynchronous calls. We have covered the new C# keywords, async and await, and how we can use Task Parallel Library with the new C# syntax. We have learned how C# generates code and creates a state machine that represents an asynchronous operation, and we implemented an analogue solution with the help of iterator methods and the yield keyword. Besides this, we have studied additional Task Parallel Library features and looked at how we can use await with any custom type. Resources for Article: Further resources on this subject: R ─ CLASSIFICATION AND REGRESSION TREES [article] INTRODUCING THE BOOST C++ LIBRARIES [article] CLIENT AND SERVER APPLICATIONS [article]

0
0
8297

article-image-art-android-development-using-android-studio

Packt

28 Oct 2015

5 min read

The Art of Android Development Using Android Studio

Packt

28 Oct 2015

5 min read

In this article by Mike van Drongelen, the author of the book Android Studio Cookbook, you will see why Android Studio is the number one IDE to develop Android apps. It is available for free for anyone who wants to develop professional Android apps. Android Studio is not just a stable and fast IDE (based on Jetbrains IntelliJ IDEA), it also comes with cool stuff such as Gradle, better refactoring methods, and a much better layout editor to name just a few of them. If you have been using Eclipse before, then you're going to love this IDE. Android Studio tip Want to refactor your code? Use the shortcut CTRL + T (for Windows: Ctrl + Alt + Shift + T) to see what options you have. You can, for example, rename a class or method or extract code from a method. Any type of Android app can be developed using Android Studio. Think of apps for phones, phablets, tablets, TVs, cars, glasses, and other wearables such as watches. Or consider an app that uses a cloud-base backend such as Parse or App Engine, a watch face app, or even a complete media center solution for TV. So, what is in the book? The sky is the limit, and the book will help you make the right choices while developing your apps. For example, on smaller screens, provide smart navigation and use fragments to make apps look great on a tablet too. Or, see how content providers can help you to manage and persist data and how to share data among applications. The observer pattern that comes with content providers will save you a lot of time. Android Studio tip Do you often need to return to a particular place in your code? Create a bookmark with Cmd + F3 (for Windows: F11). To display a list of bookmarks to choose from, use the shortcut: Cmd + F3 (for Windows: Shift + F11). Material design The book will also elaborate on material design. Create cool apps using CardView and RecycleView widgets. Find out how to create special effects and how to perform great transitions. A chapter is dedicated to the investigation of the Camera2 API and how to capture and preview photos. In addition, you will learn how to apply filters and how to share the results on Facebook. The following image is an example of one of the results: Android Studio tip Are you looking for something? Press Shift two times and start typing what you're searching for. Or to display all recent files, use the Cmd + E shortcut (for Windows: Ctrl + E). Quality and performance You will learn about patterns and how support annotations can help you improve the quality of your code. Testing your app is just as important as developing one, and it will take your app to the next level. Aim for a five-star rating in the Google Play Store later. The book shows you how to do unit testing based on jUnit or Robolectric and how to use code analysis tools such as Android Lint. You will learn about memory optimization using the Android Device Monitor, detect issues and learn how to fix them as shown in the following screenshot: Android Studio tip You can easily extract code from a method that has become too large. Just mark the code that you want to move and use the shortcut Cmd + Alt + M (for Windows: Ctrl + Alt + M). Having a physical Android device to test your apps is strongly recommended, but with thousands of Android devices being available, testing on real devices could be pretty expensive. Genymotion is a real, fast, and easy-to-use emulator and comes with many real-world device configurations. Did all your unit tests succeed? There are no more OutOfMemoryExceptions any more? No memory leaks found? Then it is about time to distribute your app to your beta testers. The final chapters explain how to configure your app for a beta release by creating the build types and build flavours that you need. Finally, distribute your app to your beta testers using Google Play to learn from their feedback. Did you know? Android Marshmallow (Android 6.0) introduces runtime permissions, which will change the way users give permission for an app. The book The art of Android development using Android Studio contains around 30 real-world recipes, clarifying all topics being discussed. It is a great start for programmers that have been using Eclipse for Android development before but is also suitable for new Android developers that know about the Java Syntax already. Summary The book nicely explains all the things you need to know to find your way in Android Studio and how to create high-quality and great looking apps. Resources for Article: Further resources on this subject: Introducing an Android platform [article] Testing with the Android SDK [article] Android Virtual Device Manager [article]

0
0
16552

Packt

28 Oct 2015

28 min read

An Introduction to Kibana

Packt

28 Oct 2015

28 min read

0
0
5214

article-image-putting-your-database-heart-azure-solutions

Packt

28 Oct 2015

19 min read

Putting Your Database at the Heart of Azure Solutions

Packt

28 Oct 2015

19 min read

In this article by Riccardo Becker, author of the book Learning Azure DocumentDB, we will see how to build a real scenario around an Internet of Things scenario. This scenario will build a basic Internet of Things platform that can help to accelerate building your own. In this article, we will cover the following: Have a look at a fictitious scenario Learn how to combine Azure components with DocumentDB Demonstrate how to migrate data to DocumentDB (For more resources related to this topic, see here.) Introducing an Internet of Things scenario Before we start exploring different capabilities to support a real-life scenario, we will briefly explain the scenario we will use throughout this article. IoT, Inc. IoT, Inc. is a fictitious start-up company that is planning to build solutions in the Internet of Things domain. The first solution they will build is a registration hub, where IoT devices can be registered. These devices can be diverse, ranging from home automation devices up to devices that control traffic lights and street lights. The main use case for this solution is offering the capability for devices to register themselves against a hub. The hub will be built with DocumentDB as its core component and some Web API to expose this functionality. Before devices can register themselves, they need to be whitelisted in order to prevent malicious devices to start registering. In the following screenshot, we see the high-level design of the registration requirement: The first version of the solution contains the following components: A Web API containing methods to whitelist, register, unregister, and suspend devices DocumentDB, containing all the device information including information regarding other Microsoft Azure resources Event Hub, a Microsoft Azure asset that enables scalable publish-subscribe mechanism to ingress and egress millions of events per second Power BI, Microsoft’s online offering to expose reporting capabilities and the ability to share reports Obviously, we will focus on the core of the solution which is DocumentDB but it is nice to touch some of the Azure components, as well to see how well they co-operate and how easy it is to set up a demonstration for IoT scenarios. The devices on the left-hand side are chosen randomly and will be mimicked by an emulator written in C#. The Web API will expose the functionality required to let devices register themselves at the solution and start sending data afterwards (which will be ingested to the Event Hub and reported using Power BI). Technical requirements To be able to service potentially millions of devices, it is necessary that registration request from a device is being stored in a separate collection based on the country where the device is located or manufactured. Every device is being modeled in the same way, whereas additional metadata can be provided upon registration or afterwards when updating. To achieve country-based partitioning, we will create a custom PartitionResolver to achieve this goal. To extend the basic security model, we reduce the amount of sensitive information in our configuration files. Enhance searching capabilities because we want to service multiple types of devices each with their own metadata and device-specific information. Querying on all the information is desired to support full-text search and enable users to quickly search and find their devices. Designing the model Every device is being modeled similar to be able to service multiple types of devices. The device model contains at least the deviceid and a location. Furthermore, the device model contains a dictionary where additional device properties can be stored. The next code snippet shows the device model: [JsonProperty("id")] public string DeviceId { get; set; } [JsonProperty("location")] public Point Location { get; set; } //practically store any metadata information for this device [JsonProperty("metadata")] public IDictionary<string, object> MetaData { get; set; } The Location property is of type Microsoft.Azure.Documents.Spatial.Point because we want to run spatial queries later on in this section, for example, getting all the devices within 10 kilometers of a building. Building a custom partition resolver To meet the first technical requirement (partition data based on the country), we need to build a custom partition resolver. To be able to build one, we need to implement the IPartitionResolver interface and add some logic. The resolver will take the Location property of the device model and retrieves the country that corresponds with the latitude and longitude provided upon registration. In the following code snippet, you see the full implementation of the GeographyPartitionResolver class: public class GeographyPartitionResolver : IPartitionResolver { private readonly DocumentClient _client; private readonly BingMapsHelper _helper; private readonly Database _database; public GeographyPartitionResolver(DocumentClient client, Database database) { _client = client; _database = database; _helper = new BingMapsHelper(); } public object GetPartitionKey(object document) { //get the country for this document //document should be of type DeviceModel if (document.GetType() == typeof(DeviceModel)) { //get the Location and translate to country var country = _helper.GetCountryByLatitudeLongitude( (document as DeviceModel).Location.Position.Latitude, (document as DeviceModel).Location.Position.Longitude); return country; } return String.Empty; } public string ResolveForCreate(object partitionKey) { //get the country for this partitionkey //check if there is a collection for the country found var countryCollection = _client.CreateDocumentCollectionQuery(database.SelfLink). ToList().Where(cl => cl.Id.Equals(partitionKey.ToString())).FirstOrDefault(); if (null == countryCollection) { countryCollection = new DocumentCollection { Id = partitionKey.ToString() }; countryCollection = _client.CreateDocumentCollectionAsync(_database.SelfLink, countryCollection).Result; } return countryCollection.SelfLink; } /// <summary> /// Returns a list of collectionlinks for the designated partitionkey (one per country) /// </summary> /// <param name="partitionKey"></param> /// <returns></returns> public IEnumerable<string> ResolveForRead(object partitionKey) { var countryCollection = _client.CreateDocumentCollectionQuery(_database.SelfLink). ToList().Where(cl => cl.Id.Equals(partitionKey.ToString())).FirstOrDefault(); return new List<string> { countryCollection.SelfLink }; } } In order to have the DocumentDB client use this custom PartitionResolver, we need to assign it. The code is as follows: GeographyPartitionResolver resolver = new GeographyPartitionResolver(docDbClient, _database); docDbClient.PartitionResolvers[_database.SelfLink] = resolver; //Adding a typical device and have the resolver sort out what //country is involved and whether or not the collection already //exists (and create a collection for the country if needed), use //the next code snippet. var deviceInAmsterdam = new DeviceModel { DeviceId = Guid.NewGuid().ToString(), Location = new Point(4.8951679, 52.3702157) }; Document modelAmsDocument = docDbClient.CreateDocumentAsync(_database.SelfLink, deviceInAmsterdam).Result; //get all the devices in Amsterdam var doc = docDbClient.CreateDocumentQuery<DeviceModel>( _database.SelfLink, null, resolver.GetPartitionKey(deviceInAmsterdam)); Now that we have created a country-based PartitionResolver, we can start working on the Web API that exposes the registration method. Building the Web API A Web API is an online service that can be used by any clients running any framework that supports the HTTP programming stack. Currently, REST is a way of interacting with APIs so that we will build a REST API. Building a good API should aim for platform independence. A well-designed API should also be able to extend and evolve without affecting existing clients. First, we need to whitelist the devices that should be able to register themselves against our device registry. The whitelist should at least contain a device ID, a unique identifier for a device that is used to match during the whitelisting process. A good candidate for a device ID is the mac address of the device or some random GUID. Registering a device The registration Web API contains a POST method that does the actual registration. First, it creates access to an Event Hub (not explained here) and stores the credentials needed inside the DocumentDB document. The document is then created inside the designated collection (based on the location). To learn more about Event Hubs, please visit https://azure.microsoft.com/en-us/services/event-hubs/. [Route("api/registration")] [HttpPost] public async Task<IHttpActionResult> Post([FromBody]DeviceModel value) { //add the device to the designated documentDB collection (based on country) try { var serviceUri = ServiceBusEnvironment.CreateServiceUri("sb", serviceBusNamespace, String.Format("{0}/publishers/{1}", "telemetry", value.DeviceId)) .ToString() .Trim('/'); var sasToken = SharedAccessSignatureTokenProvider.GetSharedAccessSignature(EventHubKeyName, EventHubKey, serviceUri, TimeSpan.FromDays(365 * 100)); // hundred years will do //this token can be used by the device to send telemetry //this token and the eventhub name will be saved with the metadata of the document to be saved to DocumentDB value.MetaData.Add("Namespace", serviceBusNamespace); value.MetaData.Add("EventHubName", "telemetry"); value.MetaData.Add("EventHubToken", sasToken); var document = await docDbClient.CreateDocumentAsync(_database.SelfLink, value); return Created(document.ContentLocation, value); } catch (Exception ex) { return InternalServerError(ex); } } After this registration call, the right credentials on the Event Hub have been created for this specific device. The device is now able to ingress data to the Event Hub and have consumers like Power BI consume the data and present it. Event Hubs is a highly scalable publish-subscribe event ingestor. It can collect millions of events per second so that you can process and analyze the massive amounts of data produced by your connected devices and applications. Once collected into Event Hubs, you can transform and store the data by using any real-time analytics provider or with batching/storage adapters. At the time of writing, Microsoft announced the release of Azure IoT Suite and IoT Hubs. These solutions offer internet of things capabilities as a service and are well-suited to build our scenario as well. Increasing searching We have seen how to query our documents and retrieve the information we need. For this approach, we need to understand the DocumentDB SQL language. Microsoft has an online offering that enables full-text search called Azure Search service. This feature enables us to perform full-text searches and it also includes search behaviours similar to search engines. We could also benefit from so called type-ahead query suggestions based on the input of a user. Imagine a search box on our IoT Inc. portal that offers free text searching while the user types and search for devices that include any of the search terms on the fly. Azure Search runs on Azure; therefore, it is scalable and can easily be upgraded to offer more search and storage capacity. Azure Search stores all your data inside an index, offering full-text search capabilities on your data. Setting up Azure Search Setting up Azure Search is pretty straightforward and can be done by using the REST API it offers or on the Azure portal. We will set up the Azure Search service through the portal and later on, we will utilize the REST API to start configuring our search service. We set up the Azure Search service through the Azure portal (http://portal.azure.com). Find the Search service and fill out some information. In the following screenshot, we can see how we have created the free tier for Azure Search: You can see that we use the Free tier for this scenario and that there are no datasources configured yet. We will do that know by using the REST API. We will use the REST API, since it offers more insight on how the whole concept works. We use Fiddler to create a new datasource inside our search environment. The following screenshot shows how to use Fiddler to create a datasource and add a DocumentDB collection: In the Composer window of Fiddler, you can see we need to POST a payload to the Search service we created earlier. The Api-Key is mandatory and also set the content type to be JSON. Inside the body of the request, the connection information to our DocumentDB environment is need and the collection we want to add (in this case, Netherlands). Now that we have added the collection, it is time to create an Azure Search index. Again, we use Fiddler for this purpose. Since we use the free tier of Azure Search, we can only add five indexes at most. For this scenario, we add an index on ID (device ID), location, and metadata. At the time of writing, Azure Search does not support complex types. Note that the metadata node is represented as a collection of strings. We could check in the portal to see if the creation of the index was successful. Go to the Search blade and select the Search service we have just created. You can check the indexes part to see whether the index was actually created. The next step is creating an indexer. An indexer connects the index with the provided data source. Creating this indexer takes some time. You can check in the portal if the indexing process was successful. We actually find that documents are part of the index now. If your indexer needs to process thousands of documents, it might take some time for the indexing process to finish. You can check the progress of the indexer using the REST API again. https://iotinc.search.windows.net/indexers/deviceindexer/status?api-version=2015-02-28 Using this REST call returns the result of the indexing process and indicates if it is still running and also shows if there are any errors. Errors could be caused by documents that do not have the id property available. The final step involves testing to check whether the indexing works. We will search for a device ID, as shown in the next screenshot: In the Inspector tab, we can check for the results. It actually returns the correct document also containing the location field. The metadata is missing because complex JSON is not supported (yet) at the time of writing. Indexing complex JSON types is not supported yet. It is possible to add SQL queries to the data source. We could explicitly add a SELECT statement to surface the properties of the complex JSON we have like metadata or the Point property. Try adding additional queries to your data source to enable querying complex JSON types. Now that we have created an Azure Search service that indexes our DocumentDB collection(s), we can build a nice query-as-you-type field on our portal. Try this yourself. Enhancing security Microsoft Azure offers a capability to move your secrets away from your application towards Azure Key Vault. Azure Key Vault helps to protect cryptographic keys, secrets, and other information you want to store in a safe place outside your application boundaries (connectionstring are also good candidates). Key Vault can help us to protect the DocumentDB URI and its key. DocumentDB has no (in-place) encryption feature at the time of writing, although a lot of people already asked for it to be on the roadmap. Creating and configuring Key Vault Before we can use Key Vault, we need to create and configure it first. The easiest way to achieve this is by using PowerShell cmdlets. Please visit https://msdn.microsoft.com/en-us/mt173057.aspx to read more about PowerShell. The following PowerShell cmdlets demonstrate how to set up and configure a Key Vault: Command Description Get-AzureSubscription This command will prompt you to log in using your Microsoft Account. It returns a list of all Azure subscriptions that are available to you. Select-AzureSubscription -SubscriptionName "Windows Azure MSDN Premium" This tells PowerShell to use this subscription as being subject to our next steps. Switch-AzureMode AzureResourceManager New-AzureResourceGroup –Name 'IoTIncResourceGroup' –Location 'West Europe' This creates a new Azure Resource Group with a name and a location. New-AzureKeyVault -VaultName 'IoTIncKeyVault' -ResourceGroupName 'IoTIncResourceGroup' -Location 'West Europe' This creates a new Key Vault inside the resource group and provide a name and location. $secretvalue = ConvertTo-SecureString '<DOCUMENTDB KEY>' -AsPlainText –Force This creates a security string for my DocumentDB key. $secret = Set-AzureKeyVaultSecret -VaultName 'IoTIncKeyVault' -Name 'DocumentDBKey' -SecretValue $secretvalue This creates a key named DocumentDBKey into the vault and assigns it the secret value we have just received. Set-AzureKeyVaultAccessPolicy -VaultName 'IoTIncKeyVault' -ServicePrincipalName <SPN> -PermissionsToKeys decrypt,sign This configures the application with the Service Principal Name <SPN> to get the appropriate rights to decrypt and sign Set-AzureKeyVaultAccessPolicy -VaultName 'IoTIncKeyVault' -ServicePrincipalName <SPN> -PermissionsToSecrets Get This configures the application with SPN to also be able to get a key. Key Vault must be used together with Azure Active Directory to work. The SPN we need in the steps for powershell is actually is a client ID of an application I have set up in my Azure Active Directory. Please visit https://azure.microsoft.com/nl-nl/documentation/articles/active-directory-integrating-applications/ to see how you can create an application. Make sure to copy the client ID (which is retrievable afterwards) and the key (which is not retrievable afterwards). We use these two pieces of information to take the next step. Using Key Vault from ASP.NET In order to use the Key Vault we have created in the previous section, we need to install some NuGet packages into our solution and/or projects: Install-Package Microsoft.IdentityModel.Clients.ActiveDirectory -Version 2.16.204221202 Install-Package Microsoft.Azure.KeyVault These two packages enable us to use AD and Key Vault from our ASP.NET application. The next step is to add some configuration information to our web.config file: <add key="ClientId" value="<CLIENTID OF THE APP CREATED IN AD" /> <add key="ClientSecret" value="<THE SECRET FROM AZURE AD PORTAL>" />  <add key="SecretUri" value="https://iotinckeyvault.vault.azure.net:443/secrets/DocumentDBKey" /> If you deploy the ASP.NET application to Azure, you could even configure these settings from the Azure portal itself, completely removing this from the web.config file. This technique adds an additional ring of security around your application. The following code snippet shows how to use AD and Key Vault inside the registration functionality of our scenario: //no more keys in code or .config files. Just a appid, secret and the unique URL to our key (SecretUri). When deploying to Azure we could //even skip this by setting appid and clientsecret in the Azure Portal. var kv = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(Utils.GetToken)); var sec = kv.GetSecretAsync(WebConfigurationManager.AppSettings["SecretUri"]).Result.Value; The Utils.GetToken method is shown next. This method retrieves an access token from AD by supplying the ClientId and the secret. Since we configured Key Vault to allow this application to get the keys, the call to GetSecretAsync() will succeed. The code is as follows: public async static Task<string> GetToken(string authority, string resource, string scope) { var authContext = new AuthenticationContext(authority); ClientCredential clientCred = new ClientCredential(WebConfigurationManager.AppSettings["ClientId"], WebConfigurationManager.AppSettings["ClientSecret"]); AuthenticationResult result = await authContext.AcquireTokenAsync(resource, clientCred); if (result == null) throw new InvalidOperationException("Failed to obtain the JWT token"); return result.AccessToken; } Instead of storing the key to DocumentDB somewhere in code or in the web.config file, it is now moved away to Key Vault. We could do the same with the URI to our DocumentDB and with other sensitive information as well (for example, storage account keys or connection strings). Encrypting sensitive data The documents we created in the previous section contains sensitive data like namespaces, Event Hub names, and tokens. We could also use Key Vault to encrypt those specific values to enhance our security. In case someone gets hold of a document containing the device information, he is still unable to mimic this device since the keys are encrypted. Try to use Key Vault to encrypt the sensitive information that is stored in DocumentDB before it is saved in there. Migrating data This section discusses how to use a tool to migrate data from an existing data source to DocumentDB. For this scenario, we assume that we already have a large datastore containing existing devices and their registration information (Event Hub connection information). In this section, we will see how to migrate an existing data store to our new DocumentDB environment. We use the DocumentDB Data Migration Tool for this. You can download this tool from the Microsoft Download Center (http://www.microsoft.com/en-us/download/details.aspx?id=46436) or from GitHub if you want to check the code. The tool is intuitive and enables us to migrate from several datasources: JSON files MongoDB SQL Server CSV files Azure Table storage Amazon DynamoDB HBase DocumentDB collections To demonstrate the use, we migrate our existing Netherlands collection to our United Kingdom collection. Start the tool and enter the right connection string to our DocumentDB database. We do this for both our source and target information in the tool. The connection strings you need to provide should look like this: AccountEndpoint=https://<YOURDOCDBURL>;AccountKey=<ACCOUNTKEY>;Database=<NAMEOFDATABASE>. You can click on the Verify button to make sure these are correct. In the Source Information field, we provide the Netherlands as being the source to pull data from. In the Target Information field, we specify the United Kingdom as the target. In the following screenshot, you can see how these settings are provided in the migration tool for the source information: The following screenshot shows the settings for the target information: It is also possible to migrate data to a collection that is not created yet. The migration tool can do this if you enter a collection name that is not available inside your database. You also need to select the pricing tier. Optionally, setting the partition key could help to distribute your documents based on this key across all collections you add in this screen. This information is sufficient to run our example. Go to the Summary tab and verify the information you entered. Press Import to start the migration process. We can verify a successful import on the Import results pane. This example is a simple migration scenario but the tool is also capable of using complex queries to only migrate those documents that need to moved or migrated. Try migrating data from an Azure Table storage table to DocumentDB by using this tool. Summary In this article, we saw how to integrate DocumentDB with other Microsoft Azure features. We discussed how to setup the Azure Search service and how create an index to our collection. We also covered how to use the Azure Search feature to enable full-text search on our documents which could enable users to query while typing. Next, we saw how to add additional security to our scenario by using Key Vault. We also discussed how to create and configure Key Vault by using PowerShell cmdlets, and we saw how to enable our ASP.NET scenario application to make use of the Key Vault .NET SDK. Then, we discussed how to retrieve the sensitive information from Key Vault instead of configuration files. Finally, we saw how to migrate an existing data source to our collection by using the DocumentDB Data Migration Tool. Resources for Article: Further resources on this subject: Microsoft Azure – Developing Web API For Mobile Apps [article] Introduction To Microsoft Azure Cloud Services [article] Security In Microsoft Azure [article]

0
0
27672

Packt

28 Oct 2015

40 min read

Understanding Ranges

Packt

28 Oct 2015

40 min read

In this article by Michael Parker, author of the book Learning D, explains since they were first introduced, ranges have become a pervasive part of D. It's possible to write D code and never need to create any custom ranges or algorithms yourself, but it helps tremendously to understand what they are, where they are used in Phobos, and how to get from a range to an array or another data structure. If you intend to use Phobos, you're going to run into them eventually. Unfortunately, some new D users have a difficult time understanding and using ranges. The aim of this article is to present ranges and functional styles in D from the ground up, so you can see they aren't some arcane secret understood only by a chosen few. Then, you can start writing idiomatic D early on in your journey. In this article, we lay the foundation with the basics of constructing and using ranges in two sections: Ranges defined Ranges in use (For more resources related to this topic, see here.) Ranges defined In this section, we're going to explore what ranges are and see the concrete definitions of the different types of ranges recognized by Phobos. First, we'll dig into an example of the sort of problem ranges are intended to solve and in the process, develop our own solution. This will help form an understanding of ranges from the ground up. The problem As part of an ongoing project, you've been asked to create a utility function, filterArray, which takes an array of any type and produces a new array containing all of the elements from the source array that pass a Boolean condition. The algorithm should be nondestructive, meaning it should not modify the source array at all. For example, given an array of integers as the input, filterArray could be used to produce a new array containing all of the even numbers from the source array. It should be immediately obvious that a function template can handle the requirement to support any type. With a bit of thought and experimentation, a solution can soon be found to enable support for different Boolean expressions, perhaps a string mixin, a delegate, or both. After browsing the Phobos documentation for a bit, you come across a template that looks like it will help with the std.functional.unaryFun implementation. Its declaration is as follows: template unaryFun(alias fun, string parmName = "a"); The alias fun parameter can be a string representing an expression, or any callable type that accepts one argument. If it is the former, the name of the variable inside the expression should be parmName, which is "a" by default. The following snippet demonstrates this: int num = 10; assert(unaryFun!("(a & 1) == 0")(num)); assert(unaryFun!("(x > 0)", "x")(num)); If fun is a callable type, then unaryFun is documented to alias itself to fun and the parmName parameter is ignored. The following snippet calls unaryFun first with struct that implements opCall, then calls it again with a delegate literal: struct IsEven { bool opCall(int x) { return (x & 1) == 0; } } IsEven isEven; assert(unaryFun!isEven(num)); assert(unaryFun!(x => x > 0)(num)); With this, you have everything you need to implement the utility function to spec: import std.functional; T[] filterArray(alias predicate, T)(T[] source) if(is(typeof(unaryFun!predicate(source[0]))) { T[] sink; foreach(t; source) { if(unaryFun!predicate(t)) sink ~= t; } return sink; } unittest { auto ints = [1, 2, 3, 4, 5, 6, 7]; auto even = ints.filterArray!(x => (x & 1) == 0)(); assert(even == [2, 4, 6]); } The unittest verifies that the function works as expected. As a standalone implementation, it gets the job done and is quite likely good enough. But, what if, later on down the road, someone decides to create more functions that perform specific operations on arrays in the same manner? The natural outcome of that is to use the output of one operation as the input for another, creating a chain of function calls to transform the original data. The most obvious problem is that any such function that cannot perform its operation in place must allocate at least once every time it's called. This means, chain operations on a single array will end up allocating memory multiple times. This is not the sort of habit you want to get into in any language, especially in performance-critical code, but in D, you have to take the GC into account. Any given allocation could trigger a garbage collection cycle, so it's a good idea to program to the GC; don't be afraid of allocating, but do so only when necessary and keep it out of your inner loops. In filterArray, the naïve appending can be improved upon, but the allocation can't be eliminated unless a second parameter is added to act as the sink. This allows the allocation strategy to be decided at the call site rather than by the function, but it leads to another problem. If all of the operations in a chain require a sink and the sink for one operation becomes the source for the next, then multiple arrays must be declared to act as sinks. This can quickly become unwieldy. Another potential issue is that filterArray is eager, meaning that every time the function is called, the filtering takes place immediately. If all of the functions in a chain are eager, it becomes quite difficult to get around the need for allocations or multiple sinks. The alternative, lazy functions, do not perform their work at the time they are called, but rather at some future point. Not only does this make it possible to put off the work until the result is actually needed (if at all), it also opens the door to reducing the amount of copying or allocating needed by operations in the chain. Everything could happen in one step at the end. Finally, why should each operation be limited to arrays? Often, we want to execute an algorithm on the elements of a list, a set, or some other container, so why not support any collection of elements? By making each operation generic enough to work with any type of container, it's possible to build a library of reusable algorithms without the need to implement each algorithm for each type of container. The solution Now we're going to implement a more generic version of filterArray, called filter, which can work with any container type. It needs to avoid allocation and should also be lazy. To facilitate this, the function should work with a well-defined interface that abstracts the container away from the algorithm. By doing so, it's possible to implement multiple algorithms that understand the same interface. It also takes the decision on whether or not to allocate completely out of the algorithms. The interface of the abstraction need not be an actual interface type. Template constraints can be used to verify that a given type meets the requirements. You might have heard of duck typing. It originates from the old saying, If it looks like a duck, swims like a duck, and quacks like a duck, then it's probably a duck. The concept is that if a given object instance has the interface of a given type, then it's probably an instance of that type. D's template constraints and compile-time capabilities easily allow for duck typing. The interface In looking for inspiration to define the new interface, it's tempting to turn to other languages like Java and C++. On one hand, we want to iterate the container elements, which brings to mind the iterator implementations in other languages. However, we also want to do a bit more than that, as demonstrated by the following chain of function calls: container.getType.algorithm1.algorithm2.algorithm3.toContainer(); Conceptually, the instance returned by getType will be consumed by algorithm1, meaning that inside the function, it will be iterated to the point where it can produce no more elements. But then, algorithm1 should return an instance of the same type, which can iterate over the same container, and which will in turn be consumed by algorithm2. The process repeats for algorithm3. This implies that instances of the new type should be able to be instantiated independent of the container they represent. Moreover, given that D supports slicing, the role of getType previously could easily be played by opSlice. Iteration need not always begin with the first element of a container and end with the last; any range of elements should be supported. In fact, there's really no reason for an actual container instance to exist at all in some cases. Imagine a random number generator; we should be able to plug one into the preceding function chain; just eliminate the container and replace getType with the generator instance. As long as it conforms to the interface we define, it doesn't matter that there is no concrete container instance backing it. The short version of it is, we don't want to think solely in terms of iteration, as it's only a part of the problem we're trying to solve. We want a type that not only supports iteration, of either an actual container or a conceptual one, but one that also can be instantiated independently of any container, knows both its beginning and ending boundaries, and, in order to allow for lazy algorithms, can be used to generate new instances that know how to iterate over the same elements. Considering these requirements, Iterator isn't a good fit as a name for the new type. Rather than naming it for what it does or how it's used, it seems more appropriate to name it for what it represents. There's more than one possible name that fits, but we'll go with Range (as in, a range of elements). That's it for the requirements and the type name. Now, moving on to the API. For any algorithm that needs to sequentially iterate a range of elements from beginning to end, three basic primitives are required: There must be a way to determine whether or not any elements are available There must be a means to access the next element in the sequence There must be a way to advance the sequence so that another element can be made ready Based on these requirements, there are several ways to approach naming the three primitives, but we'll just take a shortcut and use the same names used in D. The first primitive will be called empty and can be implemented either as a member function that returns bool or as a bool member variable. The second primitive will be called front, which again could be a member function or variable, and which returns T, the element type of the range. The third primitive can only be a member function and will be called popFront, as conceptually it is removing the current front from the sequence to ready the next element. A range for arrays Wrapping an array in the Range interface is quite easy. It looks like this: auto range(T)(T[] array) { struct ArrayRange(T) { private T[] _array; bool empty() @property { return _array.length == 0; } ref T front() { return _array[0]; } void popFront() { _array = _array[1 .. $]; } } return ArrayRange!T(array); } By implementing the iterator as struct, there's no need to allocate GC memory for a new instance. The only member is a slice of the source array, which again avoids allocation. Look at the implementation of popFront. Rather than requiring a separate variable to track the current array index, it slices the first element out of _array so that the next element is always at index 0, consequently shortening.length of the slice by 1 so that after every item has been consumed, _array.length will be 0. This makes the implementation of both empty and front dead simple. ArrayRange can be a Voldemort type because there is no need to declare its type in any algorithm it's passed to. As long as the algorithms are implemented as templates, the compiler can infer everything that needs to be known for them to work. Moreover, thanks to UFCS, it's possible to call this function as if it were an array property. Given an array called myArray, the following is valid: auto range = myArray.range; Next, we need a template to go in the other direction. This needs to allocate a new array, walk the iterator, and store the result of each call to element in the new array. Its implementation is as follows: T[] array(T, R)(R range) @property { T[] ret; while(!range.empty) { ret ~= range.front; range.popFront(); } return ret; } This can be called after any operation that produces any Range in order to get an array. If the range comes at the end of one or more lazy operations, this will cause all of them to execute simply by the call to popFront (we'll see how shortly). In that case, no allocations happen except as needed in this function when elements are appended to ret. Again, the appending strategy here is naïve, so there's room for improvement in order to reduce the potential number of allocations. Now it's time to implement an algorithm to make use of our new range interface. The implementation of filter The filter function isn't going to do any filtering at all. If that sounds counterintuitive, recall that we want the function to be lazy; all of the work should be delayed until it is actually needed. The way to accomplish that is to wrap the input range in a custom range that has an internal implementation of the filtering algorithm. We'll call this wrapper FilteredRange. It will be a Voldemort type, which is local to the filter function. Before seeing the entire implementation, it will help to examine it in pieces as there's a bit more to see here than with ArrayRange. FilteredRange has only one member: private R _source; R is the type of the range that is passed to filter. The empty and front functions simply delegate to the source range, so we'll look at popFront next: void popFront() { _source.popFront(); skipNext(); } This will always pop the front from the source range before running the filtering logic, which is implemented in the private helper function skipNext: private void skipNext() { while(!_source.empty && !unaryFun!predicate(_source.front)) _source.popFront(); } This function tests the result of _source.front against the predicate. If it doesn't match, the loop moves on to the next element, repeating the process until either a match is found or the source range is empty. So, imagine you have an array arr of the values [1,2,3,4]. Given what we've implemented so far, what would be the result of the following chain? Let's have a look at the following code: arr.range.filter!(x => (x & 1) == 0).front; As mentioned previously, front delegates to _source.front. In this case, the source range is an instance of ArrayRange; its front returns _source[0]. Since popFront was never called at any point, the first value in the array was never tested against the predicate. Therefore, the return value is 1, a value which doesn't match the predicate. The first value returned by front should be 2, since it's the first even number in the array. In order to make this behave as expected, FilteredRange needs to ensure the wrapped range is in a state such that either the first call to front will properly return a filtered value, or empty will return true, meaning there are no values in the source range that match the predicate. This is best done in the constructor: this(R source) { _source = source; skipNext(); } Calling skipNext in the constructor ensures that the first element of the source range is tested against the predicate; however, it does mean that our filter implementation isn't completely lazy. In an extreme case, that _source contains no values that match the predicate; it's actually going to be completely eager. The source elements will be consumed as soon as the range is instantiated. Not all algorithms will lend themselves to 100 percent laziness. No matter what we have here is lazy enough. Wrapped up inside the filter function, the whole thing looks like this: import std.functional; auto filter(alias predicate, R)(R source) if(is(typeof(unaryFun!predicate))) { struct FilteredRange { private R _source; this(R source) { _source = source; skipNext(); } bool empty() { return _source.empty; } auto ref front() { return _source.front; } void popFront() { _source.popFront(); skipNext(); } private void skipNext() { while(!_source.empty && !unaryFun!predicate(_source.front)) _source.popFront(); } } return FilteredRange(source); } It might be tempting to take the filtering logic out of the skipNext method and add it to front, which is another way to guarantee that it's performed on every element. Then no work would need to be done in the constructor and popFront would simply become a wrapper for _source.popFront. The problem with that approach is that front can potentially be called multiple times without calling popFront in between. Aside from the fact that it should return the same value each time, which can easily be accommodated, this still means the current element will be tested against the predicate on each call. That's unnecessary work. As a general rule, any work that needs to be done inside a range should happen as a result of calling popFront, leaving front to simply focus on returning the current element. The test With the implementation complete, it's time to put it through its paces. Here are a few test cases in a unittest block: unittest { auto arr = [10, 13, 300, 42, 121, 20, 33, 45, 50, 109, 18]; auto result = arr.range .filter!(x => x < 100 ) .filter!(x => (x & 1) == 0) .array!int(); assert(result == [10,42,20,50,18]); arr = [1,2,3,4,5,6]; result = arr.range.filter!(x => (x & 1) == 0).array!int; assert(result == [2, 4, 6]); arr = [1, 3, 5, 7]; auto r = arr.range.filter!(x => (x & 1) == 0); assert(r.empty); arr = [2,4,6,8]; result = arr.range.filter!(x => (x & 1) == 0).array!int; assert(result == arr); } Assuming all of this has been saved in a file called filter.d, the following will compile it for unit testing: dmd -unittest -main filter That should result in an executable called filter which, when executed, should print nothing to the screen, indicating a successful test run. Notice the test that calls empty directly on the returned range. Sometimes, we might not need to convert a range to a container at the end of the chain. For example, to print the results, it's quite reasonable to iterate a range directly. Why allocate when it isn't necessary? The real ranges The purpose of the preceding exercise was to get a feel of the motivation behind D ranges. We didn't develop a concrete type called Range, just an interface. D does the same, with a small set of interfaces defining ranges for different purposes. The interface we developed exactly corresponds to the basic kind of D range, called an input range, one of one of two foundational range interfaces in D (the upshot of that is that both ArrayRange and FilteredRange are valid input ranges, though, as we'll eventually see, there's no reason to use either outside of this article). There are also certain optional properties that ranges might have, which, when present, some algorithms might take advantage of. We'll take a brief look at the range interfaces now, then see more details regarding their usage in the next section. Input ranges This foundational range is defined to be anything from which data can be sequentially read via the three primitives empty, front, and popFront. The first two should be treated as properties, meaning they can be variables or functions. This is important to keep in mind when implementing any generic range-based algorithm yourself; calls to these two primitives should be made without parentheses. The three higher-order range interfaces, we'll see shortly, build upon the input range interface. To reinforce a point made earlier, one general rule to live by when crafting input ranges is that consecutive calls to front should return the same value until popFront is called; popFront prepares an element to be returned and front returns it. Breaking this rule can lead to unexpected consequences when working with range-based algorithms, or even foreach. Input ranges are somewhat special in that they are recognized by the compiler. The opApply enables iteration of a custom type with a foreach loop. An alternative is to provide an implementation of the input range primitives. When the compiler encounters a foreach loop, it first checks to see if the iterated instance is of a type that implements opApply. If not, it then checks for the input range interface and, if found, rewrites the loop. In a given range someRange, take for example the following loop: foreach(e; range) { ... } This is rewritten to something like this: for(auto __r = range; !__r.empty; __r.popFront()) { auto e = __r.front; ... } This has implications. To demonstrate, let's use the ArrayRange from earlier: auto ar = [1, 2, 3, 4, 5].range; foreach(n; ar) { writeln(n); } if(!ar.empty) writeln(ar.front); The last line prints 1. If you're surprised, look back up at the for loop that the compiler generates. ArrayRange is a struct, so when it's assigned to __r, a copy is generated. The slices inside, ar and __r, point to the same memory, but their .ptr and .length properties are distinct. As the length of the __r slice decreases, the length of the ar slice remains the same. When implementing generic algorithms that loop over a source range, it's not a good idea to assume the original range will not be consumed by the loop. If it's a class instead of struct, it will be consumed by the loop, as classes are references types. Furthermore, there are no guarantees about the internal implementation of a range. There could be struct-based ranges that are actually consumed in a foreach loop. Generic functions should always assume this is the case. Test if a given range type R is an input range: import std.range : isInputRange; static assert(isInputRange!R); There are no special requirements on the return value of the front property. Elements can be returned by value or by reference, they can be qualified or unqualified, they can be inferred via auto, and so on. Any qualifiers, storage classes, or attributes that can be applied to functions and their return values can be used with any range function, though it might not always make sense to do so. Forward ranges The most basic of the higher-order ranges is the forward range. This is defined as an input range that allows its current point of iteration to be saved via a primitive appropriately named save. Effectively, the implementation should return a copy of the current state of the range. For ranges that are struct types, it could be as simple as: auto save() { return this; } For ranges that are class types, it requires allocating a new instance: auto save() { return new MyForwardRange(this); } Forward ranges are useful for implementing algorithms that require a look ahead. For example, consider the case of searching a range for a pair of adjacent elements that pass an equality test: auto saved = r.save; if(!saved.empty) { for(saved.popFront(); !saved.empty; r.popFront(), saved.popFront()) { if(r.front == saved.front) return r; } } return saved; Because this uses a for loop and not a foreach loop, the ranges are iterated directly and are going to be consumed. Before the loop begins, a copy of the current state of the range is made by calling r.save. Then, iteration begins over both the copy and the original. The original range is positioned at the first element, and the call to saved.popFront in the beginning of the loop statement positions the saved range at the second element. As the ranges are iterated in unison, the comparison is always made on adjacent elements. If a match is found, r is returned, meaning that the returned range is positioned at the first element of a matching pair. If no match is found, saved is returned—since it's one element ahead of r, it will have been consumed completely and its empty property will be true. The preceding example is derived from a more generic implementation in Phobos, std.range.findAdjacent. It can use any binary (two argument) Boolean condition to test adjacent elements and is constrained to only accept forward ranges. It's important to understand that calling save usually does not mean a deep copy, but it sometimes can. If we were to add a save function to the ArrayRange from earlier, we could simply return this; the array elements would not be copied. A class-based range, on the other hand, will usually perform a deep copy because it's a reference type. When implementing generic functions, you should never make the assumption that the range does not require a deep copy. For example, given a range r: auto saved = r; // INCORRECT!! auto saved = r.save; // Correct. If r is a class, the first line is almost certainly going to result in incorrect behavior. It would in the preceding example loop. To test if a given range R is a forward range: import std.range : isForwardRange; static assert(isForwardRange!R); Bidirectional ranges A bidirectional range is a forward range that includes the primitives back and popBack, allowing a range to be sequentially iterated in reverse. The former should be a property, the latter a function. Given a bidirectional range r, the following forms of iteration are possible: foreach_reverse(e; r) writeln(e); for(; !r.empty; r.popBack) writeln(r.back); } Like its cousin foreach, the foreach_reverse loop will be rewritten into a for loop that does not consume the original range; the for loop shown here does consume it. Test whether a given range type R is a bidirectional range: import std.range : isBidirectionalRange; static assert(isBidirectionalRange!R); Random-access ranges A random-access range is a bidirectional range that supports indexing and is required to provide a length primitive unless it's infinite (two topics we'll discuss shortly). For custom range types, this is achieved via the opIndex operator overload. It is assumed that r[n] returns a reference to the (n+1)th element of the range, just as when indexing an array. Test whether a given range R is a random-access range: import std.range : isRandomAccessRange; static assert(isRandomAccessRange!R); Dynamic arrays can be treated as random-access ranges by importing std.array. This pulls functions into scope that accept dynamic arrays as parameters and allows them to pass all the isRandomAccessRange checks. This makes our ArrayRange from earlier obsolete. Often, when you need a random-access range, it's sufficient just to use an array instead of creating a new range type. However, char and wchar arrays (string and wstring) are not considered random-access ranges, so they will not work with any algorithm that requires one. Getting a random-access range from char[] and wchar[] Recall that a single Unicode character can be composed of multiple elements in a char or wchar array, which is an aspect of strings that would seriously complicate any algorithm implementation that needs to directly index the array. To get around this, the thing to do in a general case is to convert char[] and wchar[] into dchar[]. This can be done with std.utf.toUTF32, which encodes UTF-8 and UTF-16 strings into UTF-32 strings. Alternatively, if you know you're only working with ASCII characters, you can use std.string.representation to get ubyte[] or ushort[] (on dstring, it returns uint[]). Output ranges The output range is the second foundational range type. It's defined to be anything that can be sequentially written to via the primitive put. Generally, it should be implemented to accept a single parameter, but the parameter could be a single element, an array of elements, a pointer to elements, or another data structure containing elements. When working with output ranges, never call the range's implementation of put directly; instead, use the Phobos' utility function std.range.put. It will call the range's implementation internally, but it allows for a wider range of argument types. Given a range r and element e, it would look like this: import std.range : put; put(r, e); The benefit here is if e is anything other than a single element, such as an array or another range, the global put does what is necessary to pull elements from it and put them into r one at a time. With this, you can define and implement a simple output range that might look something like this: MyOutputRange(T) { private T[] _elements; void put(T elem) { _elements ~= elem; } } Now, you need not worry about calling put in a loop, or overloading it to accept collections of T. For example, let's have a look at the following code: MyOutputRange!int range; auto nums = [11, 22, 33, 44, 55]; import std.range : put; put(range, nums); Note that using UFCS here will cause compilation to fail, as the compiler will attempt to call MyOutputRange.put directly, rather than the utility function. However, it's fine to use UFCS when the first parameter is a dynamic array. This allows arrays to pass the isOutputRange predicate Test whether a given range R is an output range: import std.range : isOutputRange; static assert(isOutputRange!(R, E)); Here, E is the type of element accepted by R.put. Optional range primitives In addition to the five primary range types, some algorithms in Phobos are designed to look for optional primitives that can be used as an optimization or, in some cases, a requirement. There are predicate templates in std.range that allow the same options to be used outside of Phobos. hasLength Ranges that expose a length property can reduce the amount of work needed to determine the number of elements they contain. A great example is the std.range.walkLength function, which will determine and return the length of any range, whether it has a length primitive or not. Given a range that satisfies the std.range.hasLength predicate, the operation becomes a call to the length property; otherwise, the range must be iterated until it is consumed, incrementing a variable every time popFront is called. Generally, length is expected to be a O(1) operation. If any given implementation cannot meet that expectation, it should be clearly documented as such. For non-infinite random-access ranges, length is a requirement. For all others, it's optional. isInfinite An input range with an empty property, which is implemented as a compile-time value set to false is considered an infinite range. For example, let's have a look at the following code: struct IR { private uint _number; enum empty = false; auto front() { return _number; } void popFront() { ++_number; } } Here, empty is a manifest constant, but it could alternatively be implemented as follows: static immutable empty = false; The predicate template std.range.isInfinite can be used to identify infinite ranges. Any range that is always going to return false from empty should be implemented to pass isInfinite. Wrapper ranges (such as the FilterRange we implemented earlier) in some functions might check isInfinite and customize an algorithm's behavior when it's true. Simply returning false from an empty function will break this, potentially leading to infinite loops or other undesired behavior. Other options There are a handful of other optional primitives and behaviors, as follows: hasSlicing: This returns true for any forward range that supports slicing. There are a set of requirements specified by the documentation for finite versus infinite ranges and whether opDollar is implemented. hasMobileElements: This is true for any input range whose elements can be moved around in the memory (as opposed to copied) via the primitives moveFront, moveBack, and moveAt. hasSwappableElements: This returns true if a range supports swapping elements through its interface. The requirements are different depending on the range type. hasAssignableElements: This returns true if elements are assignable through range primitives such as front, back, or opIndex. At http://dlang.org/phobos/std_range_primitives.html, you can find specific documentation for all of these tests, including any special requirements that must be implemented by a range type to satisfy them. Ranges in use The key concept to understand ranges in the general case is that, unless they are infinite, they are consumable. In idiomatic usage, they aren't intended to be kept around, adding and removing elements to and from them as if they were some sort of container. A range is generally created only when needed, passed to an algorithm as input, then ultimately consumed, often at the end of a chain of algorithms. Even forward ranges and output ranges with their save and put primitives usually aren't intended to live long beyond an algorithm. That's not to say it's forbidden to keep a range around; some might even be designed for long life. For example, the random number generators in std.random are all ranges that are intended to be reused. However, idiomatic usage in D generally means lazy, fire-and-forget ranges that allow algorithms to operate on data from any source. For most programs, the need to deal with ranges directly should be rare; most code will be passing ranges to algorithms, then either converting the result to a container or iterating it with a foreach loop. Only when implementing custom containers and range-based algorithms is it necessary to implement a range or call a range interface directly. Still, understanding what's going on under the hood helps in understanding the algorithms in Phobos, even if you never need to implement a range or algorithm yourself. That's the focus of the remainder of this article. Custom ranges When implementing custom ranges, some thought should be given to the primitives that need to be supported and how to implement them. Since arrays support a number of primitives out of the box, it might be tempting to return a slice from a custom type, rather than struct wrapping an array or something else. While that might be desirable in some cases, keep in mind that a slice is also an output range and has assignable elements (unless it's qualified as const or immutable, but those can be cast away). In many cases, what's really wanted is an input range that can never be modified; one that allows iteration and prevents unwanted allocations. A custom range should be as lightweight as possible. If a container uses an array or pointer internally, the range should operate on a slice of the array, or a copy of the pointer, rather than a copy of the data. This is especially true for the save primitive of a forward iterator; it could be called more than once in a chain of algorithms, so an implementation that requires deep copying would be extremely suboptimal (not to mention problematic for a range that requires ref return values from front). Now we're going to implement two actual ranges that demonstrate two different scenarios. One is intended to be a one-off range used to iterate a container, and one is suited to sticking around for as long as needed. Both can be used with any of the algorithms and range operations in Phobos. Getting a range from a stack Here's a barebones, simple stack implementation with the common operations push, pop, top, and isEmpty (named to avoid confusion with the input range interface). It uses an array to store its elements, appending them in the push operation and decreasing the array length in the pop operation. The top of the stack is always _array[$-1]: struct Stack(T) { private T[] _array; void push(T element) { _array ~= element; } void pop() { assert(!isEmpty); _array.length -= 1; } ref T top() { assert(!isEmpty); return _array[$-1]; } bool isEmpty() { return _array.length == 0; } } Rather than adding an opApply to iterate a stack directly, we want to create a range to do the job for us so that we can use it with all of those algorithms. Additionally, we don't want the stack to be modified through the range interface, so we should declare a new range type internally. That might look like this: private struct Range { T[] _elements; bool empty() { return _elements.length == 0; } T front() { return _elements[$-1]; } void popFront() { _elements.length -= 1; } } Add this anywhere you'd like inside the Stack declaration. Note the iteration of popFront. Effectively, this range will iterate the elements backwards. Since the end of the array is the top of the stack, that means it's iterating the stack from the top to the bottom. We could also add back and popBack primitives that iterate from the bottom to the top, but we'd also have to add a save primitive since bidirectional ranges must also be forward ranges. Now, all we need is a function to return a Range instance: auto elements() { return Range(_array); } Again, add this anywhere inside the Stack declaration. A real implementation might also add the ability to get a range instance from slicing a stack. Now, test it out: Stack!int stack; foreach(i; 0..10) stack.push(i); writeln("Iterating..."); foreach(i; stack.elements) writeln(i); stack.pop(); stack.pop(); writeln("Iterating..."); foreach(i; stack.elements) writeln(i); One of the great side effects of this sort of range implementation is that you can modify the container behind the range's back and the range doesn't care: foreach(i; stack.elements) { stack.pop(); writeln(i); } writeln(stack.top); This will still print exactly what was in the stack at the time the range was created, but the writeln outside the loop will cause an assertion failure because the stack will be empty by then. Of course, it's still possible to implement a container that can cause its ranges not just to become stale, but to become unstable and lead to an array bounds error or an access violation or some such. However, D's slices used in conjunction with structs give a good deal of flexibility. A name generator range Imagine that we're working on a game and need to generate fictional names. For this example, let's say it's a music group simulator and the names are those of group members. We'll need a data structure to hold the list of possible names. To keep the example simple, we'll implement one that holds both first and last names: struct NameList { private: string[] _firstNames; string[] _lastNames; struct Generator { private string[] _first; private string[] _last; private string _next; enum empty = false; this(string[] first, string[] last) { _first = first; _last = last; popFront(); } string front() { return _next; } void popFront() { import std.random : uniform; auto firstIdx = uniform(0, _first.length); auto lastIdx = uniform(0, _last.length); _next = _first[firstIdx] ~ " " ~ _last[lastIdx]; } } public: auto generator() { return Generator(_firstNames, _lastNames); } } The custom range is in the highlighted block. It's a struct called Generator that stores two slices, _first and _last, which are both initialized in its only constructor. It also has a field called _next, which we'll come back to in a minute. The goal of the range is to provide an endless stream of randomly generated names, which means it doesn't make sense for its empty property to ever return true. As such, it is marked as an infinite range by the manifest constant implementation of empty that is set to false. This range has a constructor because it needs to do a little work to prepare itself before front is called for the first time. All of the work is done in popFront, which the constructor calls after the member variables are set up. Inside popFront, you can see that we're using the std.random.uniform function. By default, this function uses a global random number generator and returns a value in the range specified by the parameters, in this case 0 and the length of each array. The first parameter is inclusive and the second is exclusive. Two random numbers are generated, one for each array, and then used to combine a first name and a last name to store in the _next member, which is the value returned when front is called. Remember, consecutive calls to front without any calls to popFront should always return the same value. std.random.uniform can be configured to use any instance of one of the random number generator implementations in Phobos. It can also be configured to treat the bounds differently. For example, both could be inclusive, exclusive, or the reverse of the default. See the documentation at http://dlang.org/phobos/std_random.html for details. The generator property of NameList returns an instance of Generator. Presumably, the names in a NameList would be loaded from a file on disk, or a database, or perhaps even imported at compile-time. It's perfectly fine to keep a single Generator instance handy for the life of the program as implemented. However, if the NameList instance backing the range supported reloading or appending, not all changes would be reflected in the range. In that scenario, it's better to go through generator every time new names need to be generated. Now, let's see how our custom range might be used: auto nameList = NameList( ["George", "John", "Paul", "Ringo", "Bob", "Jimi", "Waylon", "Willie", "Johnny", "Kris", "Frank", "Dean", "Anne", "Nancy", "Joan", "Lita", "Janice", "Pat", "Dionne", "Whitney", "Donna", "Diana"], ["Harrison", "Jones", "Lennon", "Denver", "McCartney", "Simon", "Starr", "Marley", "Dylan", "Hendrix", "Jennings", "Nelson", "Cash", "Mathis", "Kristofferson", "Sinatra", "Martin", "Wilson", "Jett", "Baez", "Ford", "Joplin", "Benatar", "Boone", "Warwick", "Houston", "Sommers", "Ross"] ); import std.range : take; auto names = nameList.generator.take(4); writeln("These artists want to form a new band:"); foreach(artist; names) writeln(artist); First up, we initialize a NameList instance with two array literals, one of first names and one of last names. Next, the highlighted line is where the range is used. We call nameList.generator and then, using UFCS, pass the returned Generator instance to std.range.take. This function creates a new lazy range containing a number of elements, four in this case, from the source range. In other words, the result is the equivalent of calling front and popFront four times on the range returned from nameList.generator, but since it's lazy, the popping doesn't occur until the foreach loop. That loop produces four randomly generated names that are each written to standard output. One iteration yielded the following names for me: These artists want to form a new band: Dionne WilsonJohnny StarrRingo SinatraDean Kristofferson Other considerations The Generator range is infinite, so it doesn't need length. There should never be a need to index it, iterate it in reverse, or assign any values to it. It has exactly the interface it needs. But it's not always so obvious where to draw the line when implementing a custom range. Consider the interface for a range from a queue data structure. A basic queue implementation allows two operations to add and remove items—enqueue and dequeue (or push and pop if you prefer). It provides the self-describing properties empty and length. What sort of interface should a range from a queue implement? An input range with a length property is perhaps the most obvious, reflecting the interface of the queue itself. Would it make sense to add a save property? Should it also be a bidirectional range? What about indexing? Should the range be random-access? There are queue implementations out there in different languages that allow indexing, either through an operator overload or a function such as getElementAt. Does that make sense? Maybe. More importantly, if a queue doesn't allow indexing, does it make sense for a range produced from that queue to allow it? What about slicing? Or assignable elements? For our queue type at least, there are no clear-cut answers to these questions. A variety of factors come into play when choosing which range primitives to implement, including the internal data structure used to implement the queue, the complexity requirements of the primitives involved (indexing should be an O(1) operation), whether the queue was implemented to meet a specific need or is a more general-purpose data structure, and so on. A good rule of thumb is that if a range can be made a forward range, then it should be. Custom algorithms When implementing custom, range-based algorithms, it's not enough to just drop an input range interface onto the returned range type and be done with it. Some consideration needs to be given to the type of range used as input to the function and how its interface should affect the interface of the returned range. Consider the FilteredRange we implemented earlier, which provides the minimal input range interface. Given that it's a wrapper range, what happens when the source range is an infinite range? Let's look at it step by step. First, an infinite range is passed in to filter. Next, it's wrapped up in a FilteredRange instance that's returned from the function. The returned range is going to be iterated at some point, either directly by the caller or somewhere in a chain of algorithms. There's one problem, though: with a source range that's infinite, the FilteredRange instance can never be consumed. Because its empty property simply wraps that of the source range, it's always going to return false if the source range is infinite. However, since FilteredRange does not implement empty as a compile-time constant, it will never match the isInfiniteRange predicate. This will cause any algorithm that makes that check to assume it's dealing with a finite range and, if iterating it, enter into an infinite loop. Imagine trying to track down that bug. One option is to prohibit infinite ranges with a template constraint, but that's too restrictive. The better way around this potential problem is to check the source range against the isInfinite predicate inside the FilteredRange implementation. Then, the appropriate form of the empty primitive of FilteredRange can be configured with conditional compilation: import std.range : isInfinite; static if(isInfinite!T) enum empty = false; else bool empty(){ return _source.empty; } With this, FilteredRange will satisfy the isInfinite predicate when it wraps an infinite range, avoiding the infinite loop bug. Another good rule of thumb is that a wrapper range should implement as many of the primitives provided by the source range as it reasonably can. If the range returned by a function has fewer primitives than the one that went in, it is usable with fewer algorithms. But not all ranges can accommodate every primitive. Take FilteredRange as an example again. It could be configured to support the bidirectional interface, but that would have a bit of a performance impact as the constructor would have to find the last element in the source range that satisfies the predicate in addition to finding the first, so that both front and back are primed to return the correct values. Rather than using conditional compilation, std.algorithm provides two functions, filter and filterBidirectional, so that users must explicitly choose to use the latter version. A bidirectional range passed to filter will produce a forward range, but the latter maintains the interface. The random-access interface, on the other hand, makes no sense on FilteredRange. Any value taken from the range must satisfy the predicate, but if users can randomly index the range, they could quite easily get values that don't satisfy the predicate. It could work if the range was made eager rather than lazy. In that case, it would allocate new storage and copy all the elements from the source that satisfies the predicate, but that defeats the purpose of using lazy ranges in the first place. Summary In this article, we've taken an introductory look at ranges in D and how to implement them into containers and algorithms. For more information on ranges and their primitives and traits, see the documentation at http://dlang.org/phobos/std_range.html. Resources for Article: Further resources on this subject: Transactions and Operators [article] Using Protocols and Protocol Extensions [article] Hand Gesture Recognition Using a Kinect Depth Sensor [article]

0
0
32189

Packt

26 Oct 2015

7 min read

Introduction to MapBox

Packt

26 Oct 2015

7 min read

In this article by Bill Kastanakis, author of the book MapBox Cookbook, he has given an introduction to MapBox. Most of the websites we visit everyday us maps in order to display information about locations or point of interests to the user. It's amazing how this technology has evolved over the past decades. In the early days with the introduction of the Internet, maps used to be static images. Users were unable to interact with maps, and they were limited to just displaying static information. Interactive maps were available only to mapping professionals and accessed via very expensive GIS software. Cartographers have used this type of software to create or improve maps, usually for an agency or an organization. Again, if the location information was to be made available to the public, there were only two options: static images or a printed version. (For more resources related to this topic, see here.) Improvements made on Internet technologies opened up several possibilities for interactive content. It was a natural transition for maps to become live, respond to search queries, and allow user interactions (such as panning and changing the zoom level). Mobile devices were just starting to evolve, and a new age of smartphones was just about to begin. It was natural for maps to become even more important to consumers. Interactive maps are now in their pockets. More importantly, they can tell the users location. These maps also have the ability to display a great variety of data. In the age where smartphones and tables have become aware of the location, information has become even more important to companies. They use it to improve user experience. From general purpose websites (such as Google Maps) to more focused apps (such as Four Square and Facebook), maps are now a crucial component in the digital world. The popularity of mapping technologies is increasing over the years. From free open source solutions to commercial services for web and mobile developers and even services specialized for cartographers and visualization professionals, a number of services have become available to developers. Currently, there is an option for developers to choose from a variety of services that will work better on their specific task, and best of all, if you don't have increased traffic requirements, most of them will offer free plans for their consumers. What is MapBox? The issue with most of the solutions available is that they look extremely similar. Observing the most commonly used websites and services that implement a map, you can easily verify that they completely lack personality. Maps have the same colors and are present with the same features, such as roads, buildings, and labels. Currently, displaying road addresses in a specific website doesn't make sense. Customizing maps is a tedious task and is the main reason why it's avoided. What if the map that is provided by a service is not working well with the color theme used in your website or app? MapBox is a service provider that allows users to select a variety of customization options. This is one of the most popular features that has set it apart from competition. The power to fully customize your map in every detail, including the color theme, features you want to present to the user, information displayed, and so on, is indispensable. MapBox provides you with tools to fully write CartoCSS, the language behind the MapBox customization, SDKs, and frameworks to integrate their maps into your website with minimal effort and a lot more tools to assist you in your task to provide a unique experience to your users. Data Let's see what MapBox has to offer, and we will begin with three available datasets: MapBox Streets is the core technology behind MapBox street data. It's powered by open street maps and has an extremely vibrant community of 1.5 million cartographers and users, which constantly refine and improve map data in real time, as shown in the following screenshot: MapBox Terrain is composed of datasets fetched from 24 datasets owned by 13 organizations. You will be able to access elevation data, hill shades, and topography lines, as shown in the following screenshot: MapBox Satellite offers high-resolution cloudless datasets with satellite imagery, as shown in the following image: MapBox Editor MapBox Editor is an online editor where you can easily create and customize maps. It's purpose is to easily customize the map color theme by choosing from presets or creating your own styles. Additionally, you can add features, such as Markers, Lines, or define areas using polygons. Maps are also multilingual; currently, there are four different language options to choose from when you work with MapBox Editor. Although adding data manually in MapBox Editor is handy, it also offers the ability to batch import data, and it supports the most commonly used formats. The user interface is strictly visual; no coding skills is needed in order to create, customize, and present a map. It is very ideal if you want to quickly create and share maps. The user interface also supports sharing to all the major platforms, such as WordPress, and embedding in forums or on a website using iFrames. CartoCSS CartoCSS is a powerful open source style sheet language developed by MapBox and is widely supported by several other mapping and visualization platforms. It's extremely similar to CSS, and if you ever used CSS, it will be very easy to adapt. Take a look at the following code: #layer { line-color: #C00; line-width: 1; } #layer::glow { line-color: #0AF; line-opacity: 0.5; line-width: 4; } TileMill TileMill is a free open source desktop editor that you can use to write CartoCSS and fully customize your maps. The customization is done by adding layers of data from various sources and then customizing the layer properties using CartoCSS, a CSS-like style sheet language. When you complete the editing of the map, you can then export the tiles and upload them to your MapBox account in order to use the map on your website. TileMill was used as a standard solution for this type of work, but it uses raster data. This changed recently with the introduction of MapBox Studio, which uses vector data. MapBox Studio MapBox Studio is the new open source toolbox that was created by the MapBox team to customize maps, and the plan is to slowly replace TileMill. The advantage is that it uses vector tiles instead of raster. Vector tiles are superior because they hold infinite detail; they are not dependent on the resolution found in a fixed size image. You can still use CartoCSS to customize the map, and as with TileMill, at any point, you can export and share the map on your website. The API and SDK Accessing MapBox data using various APIs is also very easy. You can use JavaScript, WebGL, or simply access the data using REST service calls. If you are into mobile development, they offer separate SDKs to develop native apps for iOS and Android that take advantage of the amazing MapBox technologies and customization while maintaining a native look and feel. MapBox allows you to use your own sources. You can import a custom dataset and overlay the data to Mapbox streets, terrains, or satellite. Another noteworthy feature is that you are not limited to fetching data from various sources, but you can also query the tile metadata. Summary In this article, we learned what Mapbox, Mapbox Editor, CartoCSS, TileMill and MapBox Studio is all about. Resources for Article: Further resources on this subject: Constructing and Evaluating Your Design Solution [article] Designing Site Layouts in Inkscape [article] Displaying SQL Server Data using a Linq Data Source [article]

0
0
2538

Building a Cloud Spy Camera and Creating a GPS Tracker

Introducing OpenSIPS

Team Project Setup

Introducing TabLayout

Protecting Your Bitcoins

Detecting Shapes Employing Hough Transform

Rotation Forest - A Classifier Ensemble Based on Feature Extraction

Working On Your Bot

Making a simple Web based SSH client using Node.js and Socket.io

C# Language Support for Asynchrony

Trending Topics

The Art of Android Development Using Android Studio

An Introduction to Kibana

Putting Your Database at the Heart of Azure Solutions

Understanding Ranges

Introduction to MapBox

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access