Animations Sprites

Packt
03 Aug 2016
10 min read
In this article by Abdelrahman Saher and Francesco Sapio, authors of the book Unity 5.x 2D Game Development Blueprints, we will learn how to create and play animations for the player character, and see how Unity controls the player and other elements in the game. The following is what we will go through:

(For more resources related to this topic, see here.)

- Animating sprites
- Integrating animations into animators
- Continuing our platform game

Animating sprites

Creating and using animations for sprites is a bit easier than other parts of the development stage. By using animations and tools to animate our game, we have the ability to breathe some life into it. Let's start by creating a running animation for our player. There are two ways of creating animations in Unity: automatic clip creation and manual clip creation.

Automatic clip creation

This is the recommended method for creating 2D animations. Here, Unity is able to create the entire animation for you with a single click. If you navigate in the Project Panel to Platformer Pack | Player | p1_walk, you can find an animation sheet as a single file, p1_walk.png, and a folder of PNG images, one for each frame of the animation. We will use the latter. The reason for this is that the single sprite sheet will not work perfectly, as it is not optimized for Unity.

In the Project Panel, create a new folder and rename it Animations. Then, select all the PNG images in Platformer Pack | Player | p1_walk | PNG and drop them in the Hierarchy Panel. A new window will appear that gives us the possibility to save them as a new animation in a folder that we choose. Let's save the animation in our new folder titled Animations as WalkAnim.

After saving the animation, look in the Project Panel next to the animation file. Now there is another asset with the name of one of the dropped sprites. This is an Animator Controller and, as the name suggests, it is used to control the animation. Let's rename it PlayerAnimator so that we can distinguish it later on. In the Hierarchy panel, a game object has been automatically created with the original name of our controller. If we select it, the Inspector should look like the following:

You can always add an Animator component to a game object by clicking on Add Component | Miscellaneous | Animator.

As you can see, below the Sprite Renderer component there is an Animator component. This component will control the animation for the player and is usually accessed through a custom script to change the animations. For now, drag and drop the new controller PlayerAnimator on to our Player object.

Manual clip creation

Now, we also need a jump animation for our character. However, since we only have one sprite for the player jumping, we will manually create the animation clip for it. To achieve this, select the Player object in the Hierarchy panel and open the Animation window from Window | Animation. The Animation window will appear, as shown in the screenshot below.

As you can see, our animation WalkAnim is already selected. To create a new animation clip, click on where the text WalkAnim is. As a result, a dropdown menu appears, and here you can select Create New Clip. Save the new animation in the Animations folder as JumpAnim. On the right, you can find the animation timeline. Select from the Project Panel the folder Platformer Pack/Player. Drag and drop the sprite p1_jump on the timeline. You can see that the timeline for the animation has changed. In fact, it now contains the jumping animation, even if it is made out of only one sprite. Finally, save what we have done so far.
The Animation window's features are best used to fine-tune an animation, or even to merge two or more animations into one. Now the Animations folder should look like this in the Project panel:

By selecting the WalkAnim file, you will be able to see the Preview panel, which is located at the bottom of the Inspector when an object that may contain animation is selected. To test the animation, drag the Player object, drop it in the Preview panel, and hit play. In the Preview panel, you can check out your animations without having to test them directly from code. In addition, you can easily preview any animation by dragging a game object with the corresponding Animator Controller and dropping it in the Preview panel.

The Animator

In order to display an animation on a game object, you will be using both Animator Components and Animator Controllers. These two work hand in hand to control the animation of any animated object that you might have, and are described below:

- Animator Controller uses a state machine to manage the animation states and the transitions between one another, almost like a flow chart of animations.
- Animator Component uses an Animator Controller to define which animation clips to use, and applies them on the game object when needed. It also controls the blending and the transitions between them.

Let's start modifying our controller to make it right for our character animations. Click on the Player and then open the Animator window from Window | Animator. We should see something like this:

This is a state machine, although it is automatically generated. To move around the grid, hold the middle mouse button and drag. First, let's understand how all the different kinds of nodes work:

- Entry node (marked green): It is used when transitioning into a state machine, provided the required conditions were met.
- Exit node (marked red): It is used to exit a state machine when the conditions have been changed or completed. By default, it is not present, as there isn't one in the previous image.
- Default node (marked orange): It is the default state of the Animator and is automatically transitioned to from the entry node.
- Sub-state nodes (marked grey): They are also called custom nodes. They are typically used to represent a state for an object where an event will occur (in our case, an animation will be played).
- Transitions (arrows): They allow state machines to switch between one another by setting the conditions that will be used by the Animator to decide which state will be activated.

To keep things organized, let's reorder the nodes in the grid. Drag the three sub-states just under the Entry node. Order them from left to right: WalkAnim, New Animation, and JumpAnim. Then, right-click on New Animation and choose Set as Layer Default State. Now our Animator window should look like the following:

To edit a node, we need to select it and modify it as needed in the Inspector. So, select New Animation and the Inspector should look like the screenshot below. Here we have access to all the properties of the state or node New Animation. Let's change its name to Idle. Below the name is the speed of the state, which controls how fast the animation will be played. Next, we have Motion, which refers to the animation clip that will be used for this state.
After we have changed the name, save the scene; this is what everything should look like now:

We can test what we have done so far by hitting play. As we can see in the Game view, the character is not animated. This is because the character is always in the Idle state, and there are no transitions to let him change state. While the game is in runtime, we can see in the Animator window that the Idle state is running. Stop the game, and right-click on the WalkAnim node in the Animator window. Select from the menu Set as Layer Default State. As a result, the walking animation will be played automatically at the beginning of the game. If we press the play button again, we can notice that the walk animation is played, as shown in the screenshot below:

You can experiment with the other states of the Animator. For example, you can try to set JumpAnim as the default animation, or even tweak the speed of each state to see how it is affected. Now that we know the basics of how the Animator works, let's stop the playback and revert the default state to the Idle state.

To be able to connect our states together, we need to create transitions. To achieve this, right-click on the Idle state and select Make Transition, which turns the mouse cursor into an arrow. By clicking on other states, we can connect them with a transition. In our case, click on the WalkAnim state to make a transition from the Idle state to the WalkAnim state. The Animator window should look like the following:

If we click on the arrow, we can access its properties in the Inspector, as shown in the following screenshot. The main properties that we might want to change are:

- Name (optional): We can assign a name to the transition. This is useful to keep everything organized and easy to access. In this case, let's name this transition Start Walking.
- Has Exit Time: Whether or not the animation should be played to the end before exiting its state when the conditions are no longer met.
- Conditions: The conditions that should be met for the transition to take place.

Let's try adding a condition and see what happens. When we try to create a condition for our transition, the following message appears next to it: Parameter does not exist in Controller. This means that we need to add parameters that will be used for our condition. To create a parameter, switch to Parameters in the top left of the Animator window, add a new float using the + button, and name it PlayerSpeed, as shown in the following screenshot:

Any parameters that are created in the Animator are usually changed from code, and those changes affect the state of the animation. In the following screenshot, we can see the PlayerSpeed parameter on the left side:

Now that we have created a parameter, let's head back to the transition. Click the dropdown button next to the condition we created earlier and choose the parameter PlayerSpeed. After choosing the parameter, another option appears next to it. You can choose either Greater or Less, which means that the transition will happen when this parameter is, respectively, greater than or less than X. Don't worry, as that X will be changed by our code later on. For now, choose Greater and set the value to 1, which means that when the player speed is more than one, the walk animation starts playing. You can test what we have done so far and change the PlayerSpeed parameter at runtime.
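How would code drive this parameter? The following is a minimal sketch, not from the book: the script and field names (PlayerMovement, body) are our own, but Animator.SetFloat is the standard Unity call for updating a float parameter such as PlayerSpeed.

using UnityEngine;

// Hypothetical movement script: each frame it copies the player's
// horizontal speed into the Animator's PlayerSpeed parameter, so the
// Idle -> WalkAnim transition fires once the speed exceeds 1.
public class PlayerMovement : MonoBehaviour
{
    private Animator animator;
    private Rigidbody2D body;

    void Awake()
    {
        animator = GetComponent<Animator>();
        body = GetComponent<Rigidbody2D>();
    }

    void Update()
    {
        // Mathf.Abs so walking left also counts as movement.
        animator.SetFloat("PlayerSpeed", Mathf.Abs(body.velocity.x));
    }
}

Attached to the Player object, a script along these lines pushes the horizontal speed into the Animator every frame, triggering the Start Walking transition once it exceeds 1.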
Summary

This wraps up everything that we will cover in this article. So far, we have added animations to our character, to be played according to the player's controls.

Resources for Article:

Further resources on this subject:

- Animations in Cocos2d-x [Article]
- Adding Animations [Article]
- Bringing Your Game to Life with AI and Animations [Article]

What is Responsive Web Design

Packt
03 Aug 2016
32 min read
In this article by Alex Libby, Gaurav Gupta, and Asoj Talesra, the authors of the book Responsive Web Design with HTML5 and CSS3 Essentials, we will cover the basic elements of responsive web design (RWD).

Getting started with Responsive Web Design

If one had to describe responsive web design in a sentence, it describes how content is displayed across various screens and devices, such as mobiles, tablets, phablets, or desktops. To understand what this means, let's use water as an example: water takes the shape of whatever container it is poured into. Responsive design follows the same principle—it is an approach in which a website or webpage dynamically adjusts its layout according to the size or resolution of the screen. This ensures that users get the best experience while using the website.

We develop a single website that uses a single code base. This will contain fluid, proportion-based grids, flexible images and videos, and CSS3 media queries that work across multiple devices and device resolutions—the key to making them work is the use of percentage values in place of fixed units, such as pixel- or em-based sizes. The best part of this is that we can use this technique without the knowledge or need of server-based/backend solutions—to see it in action, we can use Packt's website as an example. Go ahead and browse to https://www.packtpub.com/web-development/mastering-html5-forms; this is what we will see as a desktop view:

The mobile view for the same website shows this if viewed on a smaller device:

We can clearly see that the same core content is being displayed (that is, an image of the book, the buy button, pricing details, and information about the book), but elements such as the menu have been transformed into a single drop-down located in the top-left corner. This is what responsive web design is all about—producing a flexible design that adapts according to which device we choose to use, in a format that suits the device being used.

Understanding the elements of RWD

Now that we've been introduced to RWD, it's important to understand some of the elements that make up the philosophy of what we know as flexible design. A key part of this is understanding the viewport, or visible screen estate available to us. In addition to viewports, there are several key elements that make up RWD: flexible media, responsive text and grids, and media queries. We will cover each in more detail later in the book, but for now, let's have a quick overview of the elements that make up RWD.

Controlling the viewport

A key part of RWD is working with the viewport, or visible content area on a device. If we're working with desktops, then it is usually the resolution; this is not the case for mobile devices. There is a temptation to reach for JavaScript (or a library, such as jQuery) to set values such as viewport width or height: there is no need, as we can do this with a simple HTML directive:

<meta name="viewport" content="width=device-width">

Or by using this directive:

<meta name="viewport" content="width=device-width, initial-scale=1">

This means that the browser should render the width of the page at the same width as the browser window—if, for example, the latter is 480px, then the width of the page will be 480px.
To see what a difference not setting a viewport can have, take a look at this example screenshot:

This example was created by displaying some text in Chrome, in iPhone 6 Plus emulation mode, but without a viewport. Now, let's take a look at the same text, but this time with a viewport directive set:

Even though this is a simple example, do you notice any difference? Yes, the title color has changed, but more importantly the width of our display has increased. This is all part of setting a viewport—browsers frequently assume we want to view content as if we're on a desktop PC. If we don't tell the browser that the viewport area has shrunk in size, it will try to shoehorn all of the content into a smaller size, which doesn't work very well! It's critical, therefore, that we set the right viewport for our design and that we allow it to scale up or down in size, irrespective of the device—we will explore this in more detail.

Creating flexible grids

When designing responsive websites, we can either create our own layout or use a grid system that has already been created, such as Bootstrap. The key here, though, is ensuring that the mechanics of our layout sizes and spacing are set according to the content we want to display for our users, and that when the browser is resized in width, the layout realigns itself correctly. For many developers, the standard unit of measure has been pixel values; a key part of responsive design is to make the switch to using percentage and em (or preferably rem) units. The latter scale better than standard pixels, although a certain leap of faith is needed to get accustomed to working with the replacements!

Making media responsive

A key part of our layout is, of course, images and text—the former, though, can give designers a bit of a headache, as it is not enough to simply use large images and set overflow: hidden to hide the parts that are not visible! Images in a responsive website must be as flexible as the grid used to host them—for some, this may be a big issue if the website is very content-heavy; now is a good time to consider whether some of that content is no longer needed and can be removed from the website. We can, of course, simply apply display: none to any image which shouldn't be displayed, according to the viewport set. This isn't a good idea though, as the content still has to be downloaded before styles can be applied; it means we're downloading more than is necessary! Instead, we should assess the level of content, make sure it is fully optimized, and apply percentage values so it can be resized automatically to a suitable size when the browser viewport changes.

Constructing suitable breakpoints

With content and media in place, we must turn our attention to media queries—there is a temptation to create queries that suit specific devices, but this can become a maintenance headache. We can avoid the headache by designing queries based on where the content breaks, rather than for specific devices—the trick to this is to start small and gradually enhance the experience with the use of media queries:

<link rel="stylesheet" media="(max-device-width: 320px)" href="mobile.css" />
<link rel="stylesheet" media="(min-width: 1600px)" href="widescreen.css" />

We should aim for around 75 characters per line, to maintain an optimal length for our content.
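As a hedged illustration of the percentage-based approach described above (the class names and breakpoint value are our own, not from the book's demos), a simple fluid layout could look like this:

/* A minimal fluid layout: column widths are percentages of the
   container, so they flex with the browser window. */
.container { width: 90%; max-width: 960px; margin: 0 auto; }
.main      { float: left; width: 65%; }
.sidebar   { float: right; width: 30%; }

/* Below 480px, stack the columns instead of floating them. */
@media (max-width: 480px) {
  .main, .sidebar { float: none; width: 100%; }
}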
Introducing flexible grid layouts

For many years, designers have built layouts of different types—they may be as simple as a calling-card website, right through to a theme for a content management system, such as WordPress or Joomla. The meteoric rise of accessing the Internet through different devices means that we can no longer create layouts that are tied to specific devices or sizes—we must be flexible! To achieve this flexibility requires us to embrace a number of changes in our design process—the first being the type of layout we should create. A key part of this is the use of percentage values to define our layouts; rather than create something from the ground up, we can make use of a predefined grid system that has been tried and tested, as a basis for future designs. The irony is that there are lots of grid systems vying for our attention, so without further ado, let's make a start by exploring the different types of layouts, and how they compare to responsive designs.

Understanding the different layout types

A problem that web designers have faced for some years is the type of layout their website should use—should it be fluid, fixed-width, have the benefits of being elastic, or be a hybrid version that draws on the benefits of a mix of these layouts? The type of layout we choose to use will of course depend on client requirements—making it a fluid layout means we are effectively one step closer to making it responsive: the difference is that the latter uses media queries to allow resizing of content for different devices, not just normal desktops! To understand the differences, and how responsive layouts compare, let's take a quick look at each in turn:

- Fixed-width layouts: These are constrained to a fixed width; a good size is around 960px, as this can be split equally into columns with no remainder. The downside is that the fixed width makes assumptions about the available viewport area, and if the screen is too small or too large, it results in scrolling or lots of white space, which affects the user experience.
- Fluid layouts: Instead of using static values, we use percentage-based units; it means that no matter what the size of the browser window, our website will adjust accordingly. This removes the problems that surround fixed layouts at a stroke.
- Elastic layouts: These are similar to fluid layouts, but the constraints are measured by type or font size, using em or rem units; these are based on the defined font size, so 16px is 1 rem, 32px is 2 rem, and so on. These layouts allow for decent readability, with lines of 45-70 characters; font sizes are resized automatically. We may still see scrollbars appear in some instances, or experience some odd effects if we zoom our page content.
- Hybrid layouts: These combine a mix of two or more of these different layout types; this allows us to choose static widths for some elements whilst others remain elastic or fluid.

In comparison, responsive layouts take fluid layouts a step further, by using media queries not only to make our designs resize automatically, but to present different views of our content on multiple devices.
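To make the distinctions concrete, here is a minimal sketch, with illustrative values of our own, of how the same container might be sized under each approach:

/* Fixed-width: assumes a viewport of at least 960px. */
.container-fixed   { width: 960px; }

/* Fluid: always a proportion of the browser window. */
.container-fluid   { width: 80%; }

/* Elastic: sized in rem, so it scales with the root font size
   (with a 16px base, 60rem is 960px). */
.container-elastic { width: 60rem; }

/* Hybrid: a fluid width with a static upper bound. */
.container-hybrid  { width: 80%; max-width: 960px; }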
Exploring the benefits of flexible grid layouts

Now that we've been introduced to grid layouts as a tenet of responsive design, it's a good opportunity to explore why we should use them. Creating a layout from scratch can be time-consuming and needs lots of testing—there are some real benefits to using a grid layout:

- Grids make for a simpler design: Instead of trying to reinvent the proverbial wheel, we can focus on providing the content instead; the infrastructure will have already been tested by the developer and other users.
- They provide for a visually appealing design: Many people prefer content to be displayed in columns, so grid layouts make good use of this concept to help organize content on the page.
- Grids can of course adapt to different size viewports: The system they use makes it easier to display a single codebase on multiple devices, which reduces the effort required for developers to maintain, and webmasters to manage, a website.
- Grids help with the display of adverts: Google has been known to favor websites which display genuine content and not those where it believes the sole purpose of the website is ad generation; we can use the grid to define specific areas for adverts, without getting in the way of natural content.

All in all, it makes sense to familiarize ourselves with grid layouts—the temptation, of course, is to use an existing library. There is nothing wrong with this, but to really get the benefit out of using them, it's good to understand some of the basics around the mechanics of grid layouts, and how this can help with the construction of our website.

Making media responsive

Our journey through the basics of adding responsive capabilities to a website has so far touched on how we make our layouts respond automatically to changes—it's time for us to do the same to media! If your first thought is that we need lots of additional functionality to make media responsive, then I am sorry to disappoint—it's much easier, and requires zero additional software to do it! Yes, all we need is just a text editor and a browser. I'll use my favorite editor, Sublime Text, but you can use whatever works for you. Over the course of this chapter, we will look in turn at images, video, audio, and text, and we'll see how, with some simple changes, we can make each of them responsive. Let's kick off our journey with a look at making image content responsive.

Creating fluid images

It is often said that images speak a thousand words. We can express a lot more with media than we can using words. This is particularly true for websites selling products—a clear, crisp image clearly paints a better picture than a poor-quality one! When constructing responsive websites, we need our images to adjust in size automatically—to see why this is important, go ahead and extract coffee.html from a copy of the code download that accompanies this book, and run it in a browser. Try resizing the window—we should see something akin to this:

It doesn't look great, does it? Leaving aside my predilection for nature's finest bean drink (that is, coffee!), we can't have images that don't resize properly, so let's take a look at what is involved to make this happen:

- Go ahead and extract a copy of coffee.html and save it to our project area.
- We also need our image. This is in the img folder; save a copy to the img folder in our project area.
- In a new text file, add the following code, saving it as coffee.css:

img { max-width: 100%; height: auto; }

- Revert back to coffee.html. You will see that line 6 is currently commented out; remove the comment tags.
- Save the file, then preview it in a browser. If all is well, we will still see the same image as before, but this time try resizing it.
This time around, our image grows or shrinks automatically, depending on the size of our browser window:

Although our image does indeed fit better, there are a couple of points we should be aware of when using this method:

- Sometimes you might see !important set as a property against the height attribute when working with responsive images; this isn't necessary, unless you're setting sizes in a website where image sizes may be overridden at a later date.
- We've set max-width to 100% as a minimum. You may also need to set a width value too, to be sure that your images do not become too big and break your layout.

This is an easy technique to use, although there is a downside that can trip us up—spot what it is? If we use a high-quality image, its file size will be hefty. We can't expect users of mobile devices to download it, can we? Don't worry though—there is a great alternative that has quickly gained popularity amongst browsers; we can use the <picture> element to control what is displayed, depending on the size of the available window.

Implementing the <picture> element

In a nutshell, responsive images are images that are displayed in their optimal form on a page, depending on the device your website is being viewed from. This can mean several things:

- You want to show a separate image asset based on the user's physical screen size—this might be a 13.5-inch laptop, or a 5-inch mobile phone screen.
- You want to show a separate image based on the resolution of the device, or using the device-pixel ratio (which is the ratio of device pixels to CSS pixels).
- You want to show an image in a specified image format (WebP, for example) if the browser supports it.

Traditionally, we might have used simple scripting to achieve this, but this runs the risk of potentially downloading multiple images, or none at all if the script loads after the images have loaded, or if we don't specify any image in our HTML and want the script to take care of loading images. We clearly need a better way to manage responsive images! A relatively new tag for HTML5 is perfect for this job: <picture>. We can use this in one of three different ways, depending on whether we want to resize an existing image, display a larger one, or show a high-resolution version of the image.
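The markup is covered in detail later; as a rough sketch (the file names are illustrative), a typical <picture> element combining a media query with a WebP fallback might look like this:

<!-- A sketch of the <picture> element: the browser picks the first
     <source> whose media query and format it supports, and falls
     back to the plain <img> otherwise. -->
<picture>
  <source media="(min-width: 800px)" srcset="coffee-large.webp" type="image/webp">
  <source media="(min-width: 800px)" srcset="coffee-large.jpg">
  <img src="coffee-small.jpg" alt="A cup of coffee">
</picture>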
Making video responsive

Flexible videos are somewhat more complex than images. The HTML5 <video> element maintains its aspect ratio just like images, and therefore we can apply the same CSS principle to make it responsive:

video { max-width: 100%; height: auto !important; }

Until relatively recently, there have been issues with HTML5 video—this is due in the main to split support for the codecs required to run HTML video. The CSS required to make an HTML5 video responsive is very straightforward, but using it directly presents a few challenges:

- Hosting video is bandwidth-intensive and expensive
- Streaming requires complex hardware support in addition to the video itself
- It is not easy to maintain a consistent look and feel across different formats and platforms

For many, a better alternative is to host the video through a third-party service such as YouTube—we can let them worry about bandwidth issues and providing a consistent look and feel; we just have to make it fit on the page! This requires a little more CSS styling to make it work, so let's dig in and find out what is involved.

Making text fit on screen

When building websites, it goes without saying that our designs must start somewhere—this is usually with adding text. It's therefore essential that we allow for this in our responsive designs at the same time. Now is a perfect opportunity to explore how to do this—although text is not media in the same way as images or video, it is still content that has to be added at some point to our pages! With this in mind, let's dive in and explore how we can make our text responsive.

Sizing with em units

When working on non-responsive websites, it's likely that sizes will be quoted in pixel values—it's a perfectly acceptable way of working. However, if we begin to make our websites responsive, then content won't resize well using pixel values—we have to use something else. There are two alternatives: em or rem units. The former is based on setting a base font size that in most browsers defaults to 16px; in this example, the equivalent pixel sizes are given in the comments that follow each rule:

h1 { font-size: 2.4em; } /* 38px */
p { line-height: 1.4em; } /* 22px */

Unfortunately, there is an inherent problem with using em units—if we nest elements, then font sizes will be compounded, as em units are calculated relative to the parent. For example, if the font size of a list element is set at 1.4em (22px), then the font size of a list within a list becomes 30.8px (1.4 x 22px). To work around these issues, we can use rem values as a replacement—these are calculated from the root element, in place of the parent element. If you look carefully throughout many of the demos created for this book, you will see rem units being used to define the sizes of elements in that demo.
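To see the compounding problem in numbers, consider this small sketch of our own (the class names are illustrative, and assume a 16px root font size):

/* With em, each nesting level multiplies the parent's size:
   level 1 = 22.4px, level 2 = 31.4px, level 3 = 43.9px... */
.menu-em li  { font-size: 1.4em; }

/* With rem, every level resolves against the 16px root, so
   nested lists all stay at 22.4px. */
.menu-rem li { font-size: 1.4rem; }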
Using rem units as a replacement

The rem (or root em) unit is set to be relative to the root, instead of the parent—it means that we eliminate any issue with compounding at a stroke, as our reference point remains constant and is not affected by other elements on the page. The downside of this is support—rem units are not supported in IE7 or 8, so if we still have to support these browsers, then we must fall back to using pixel or em values instead. This of course raises the question—should we still support these browsers, or is their usage of our website so small as not to be worth the effort required to update our code?

If the answer is that we must support IE8 or below, then we can take a hybrid approach—we can set both pixel/em and rem values at the same time in our code, thus:

.article-body {
  font-size: 18px;
  font-size: 1.125rem; /* 18 / 16 */
}
.caps, figure, footer {
  font-size: 14px;
  font-size: 0.875rem; /* 14 / 16 */
}

Notice how we set the pixel values first? Browsers which support rem units will override the pixel value with the rem value that follows it; any that don't will simply ignore the rem declaration and fall back to using the pixel value instead.

Exploring some examples

Open a browser—let's go and visit some websites. Now, you may think I've lost my marbles, but stay with me: I want to show you a few examples. Let's take a look at a couple of example websites at different screen widths—how about this example, from my favorite coffee company, Starbucks:

Try resizing the browser window—if you get small enough, you will see something akin to this:

Now, what was the point of all that, I hear you ask? Well, it's simple—all of them use media queries in some form or other; CSS Tricks uses the queries built into WordPress, Packt's website is hosted using Drupal, and Starbucks' website is based around the Handlebars template system. The key here is that all use media queries to determine what should be displayed—throughout the course of this chapter, we'll explore using them in more detail, and see how we can use them to better manage content in responsive websites. Let's make a start with exploring their makeup in more detail.

Understanding media queries

Bruce Lee sums it up perfectly, when likening the effect of media queries to how water behaves in different containers:

"Empty your mind, be formless, shapeless - like water. Now you put water in a cup, it becomes the cup; you put water into a bottle it becomes the bottle; you put it in a teapot it becomes the teapot. Now water can flow or it can crash. Be water, my friend."

We can use media queries to apply different CSS styles based on the available screen estate or specific device characteristics. These might include, but are not limited to, the type of display, screen resolution, or display density. Media queries work on the basis of testing to see if certain conditions are true, using this format:

@media [not|only] [mediatype] and ([media feature]) {
  // CSS code;
}

We can use a similar principle to determine if entire style sheets should be loaded, instead of individual queries:

<link rel="stylesheet" media="mediatype and|only|not (media feature)" href="myStyle.css">

Seems pretty simple, right? The great thing about media queries is that we don't need to download or install any additional software to use or create them—we can build most of them in the browser directly.
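As a concrete example of that format, here is a short mobile-first sketch; the breakpoint values are our own, chosen purely for illustration:

/* Base (mobile-first) styles apply everywhere. */
body { font-size: 1rem; }

/* Tablets and up: widen the type a touch. */
@media only screen and (min-width: 768px) {
  body { font-size: 1.125rem; }
}

/* Desktops and up: cap the content width. */
@media only screen and (min-width: 1024px) {
  main { max-width: 60rem; margin: 0 auto; }
}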
Removing the need for breakpoints

Up until now, we've covered how we can use breakpoints to control what is displayed, and when, according to which device is being used. Let's assume you're working on a project for a client, and have created a series of queries that use values such as 320px, 480px, 768px, and 1024px to cover support for a good range of devices. No matter what our design looks like, we will always be faced with two issues if we focus on using specific screen viewports as the basis for controlling our designs:

- Keeping up with the sheer number of devices that are available
- The inflexibility of limiting our screen widths

So hold on: we're creating breakpoints, yet this can end up causing us more problems? If we find ourselves creating lots of media queries that address specific problems (in addition to standard ones), then we will start to lose the benefits of a responsive website—instead, we should re-examine our website to understand why the design isn't working, and see if we can't tweak it so as to remove the need for the custom query. Ultimately, our website and target devices will dictate what is required—a good rule of thumb is that if we are creating more custom queries than a standard set of four to six breakpoints, then perhaps it is time to recheck our design!

As an alternative to working with specific screen sizes, there is a different approach we can take, which is to follow the principle of adaptive design rather than responsive design. Instead of simply specifying a number of fixed screen sizes (such as for the iPhone 6 Plus or a Samsung Galaxy unit), we build our designs around the points at which the design begins to fail. Why? The answer is simple—the idea here is to come up with different bands, where designs will work between a lower and an upper value, instead of simply specifying a query that checks for fixed screen sizes that are lower or above certain values.

Understanding the importance of speed

The advent of using different devices to access the Internet means speed is critical—the time it takes to download content from hosting servers, and how quickly the user can interact with the website, are key to the success of any website. Why is it important to focus on the performance of our website on mobile devices, or on devices with lower screen resolutions? There are several reasons for this, including:

- 80 percent of Internet users own a smartphone
- Around 90 percent of users go online through a mobile device, with 48 percent of users using search engines to research new products
- Approximately 72 percent of users abandon a website if the loading time is more than 5-6 seconds
- Mobile digital media time is now significantly higher than desktop use

If we do not consider statistics such as these, then we may go ahead and construct our website, but end up losing both income and market share if we have not fully considered the extent of where our website should work. Coupled with this is the question of performance—if our website is slow, then this will put customers off and contribute to lost sales. A study performed by San Francisco-based Kissmetrics shows that mobile users wait between 6 and 10 seconds before they close the website and lose faith in it. At the same time, tests performed by Guy Podjarny for the Mediaqueri.es website (http://mediaqueri.es) indicate that we frequently download the same content for both large and small screens—this is entirely unnecessary when, with some simple changes, we can vary content to better suit desktop PCs or mobile devices! So what can we do? Well, before we start exploring where to make changes, let's take a look at some of the reasons why websites run slowly.
Understanding why pages load slowly

Although we may build a great website that works well across multiple devices, it's still no good if it is slow! Every website will of course operate differently, but there are a number of factors to allow for, which can affect page (and website) speed:

- Downloading data unnecessarily: On a responsive website, we may hide elements that are not displayed on smaller devices; the use of display: none in code means that we still download the content, even though we're not showing it on screen, resulting in slower websites and higher bandwidth usage.
- Downloading images before shrinking them: If we have not optimized our website with properly sized images, then we may end up downloading images that are larger than necessary on a mobile device. We can of course make them fluid by using percentage-based size values, but this places extra demand on the server and browser to resize them.
- A complicated DOM in use on the website: When creating a responsive website, we have to add in a layer of extra code to manage different devices; this makes the DOM more complicated and slows our website down. It is therefore imperative that we do not add any unnecessary elements that require additional parsing time by the browser.
- Downloading media or feeds from external sources: It goes without saying that these are not under our control; if our website is dependent on them, then the speed of our website will be affected if these external sources fail.
- Use of Flash: Websites that rely on heavy use of Flash will clearly be slower to access than those that don't use the technology. It is worth considering whether our website really needs to use it; recent changes by Adobe mean that Flash as a technology is being retired in favor of animation using other means, such as HTML5 Canvas or WebGL.

There is one point to consider that we've not covered in this list—the average size of a page has significantly increased since the dawn of the Internet in the mid-nineties. Although these figures may not be 100% accurate, they still give a stark impression of how things have changed:

- 1995: At that time, the average page was around 14.1 KB in size. The reason for this is that it contained only around 2 or 3 objects—that means just 2 or 3 calls to the server on which the website was hosted.
- 2008: The average page size had increased to around 498 KB, with an average use of around 70 objects, including CSS, images, and JavaScript.

Although this is tempered by the increased use of broadband, not everyone can afford fast access, so we will lose customers if our website is slow to load. All is not lost though—there are some tricks we can use to help optimize the performance of our websites.
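As one example of such a trick, the display: none problem from the first point above can be avoided by moving a decorative image into CSS behind a media query, so small screens never request the file at all; the file and class names here are illustrative:

/* Small screens get no hero image: the file is never requested. */
.hero { background-image: none; }

/* Only viewports of 768px and up download the large image. */
@media (min-width: 768px) {
  .hero { background-image: url("img/hero-large.jpg"); }
}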
Testing website compatibility

At this stage, our website will have been optimized and tested for performance—but what about compatibility? Although the range of available browsers has remained relatively static (at least for the ones in mainstream use), the functionality they offer is constantly changing—this makes it difficult for developers and designers to handle all of the nuances required to support each browser. In addition, the wide range makes it costly to support—in an ideal world, we would support every device available, but this is impossible; instead, we must use analytical software to determine which devices are being used, and are therefore worthy of support.

Working out a solution

If we test our website on a device such as an iPhone 6, then there is a good chance it will work just as well on other Apple devices, such as iPads. The same can be said for testing on a mobile device such as a Samsung Galaxy S4—we can use this principle to help prioritize support for particular mobile devices, if they require more tweaks than other devices. Ultimately though, we must use analytical software to determine who visits our website; information such as the browser, source, OS, and device used will help determine what our target audience should be. This does not mean we completely neglect other devices; we can ensure they work with our website, but they will not be a priority during development.

A key point to note is that we should not attempt to support every device—this is too costly to manage, and we would never keep up with all of the devices available for sale! Instead, we can use our analytics software to determine which devices are being used by our visitors; we can then test a number of different properties:

- Screen size: This should encompass a variety of different resolutions for desktop and mobile devices.
- Connection speed: Testing across different connection speeds will help us understand how the website behaves, and identify opportunities or weaknesses where we may need to effect changes.
- Pixel density: Some devices will support a higher pixel density, which allows them to display higher-resolution images or content; this will make it easier to view and fix any issues with displaying content.
- Interaction style: The ability to view the Internet across different devices means that we should consider how our visitors interact with the website: is it purely on a desktop, or do they use tablets, smartphones, or gaming-based devices? It's highly likely that the former two will be used to an extent, but the latter is not likely to feature as highly.

Once we've determined which devices we should be supporting, there is a range of tools available for us to use to test browser compatibility. These include physical devices (ideal, but expensive to maintain), emulators, or online services (these can be commercial or free). Let's take a look at a selection of what is available to help us test compatibility.

Exploring tools available for testing

When we test a mobile or responsive website, there are factors which we need to consider before we start testing, to help deliver a website which looks consistent across all devices and browsers. These factors include:

- Does the website look good?
- Are there any bugs or defects?
- Is our website really responsive?

To help test our websites, we can use any one of several tools (either paid or free)—a key point to note, though, is that we can already get a good idea of how well our websites work by simply using the developer toolbar that is available in most browsers!

Viewing with Chrome

We can easily emulate a mobile device within Chrome by pressing Ctrl + Shift + M; Chrome displays a toolbar at the top of the window, which allows us to select different devices:

If we click on the menu entry (currently showing iPhone 6 Plus) and change it to Edit, we can add new devices; this allows us to set specific dimensions, user agent strings, and whether the device supports high-resolution images:

Although browsers can go some way to providing an indication of how well our website works, they can only provide a limited view—sometimes we need to take things a step further and use commercial solutions to test our websites across multiple browsers at the same time. Let's take a look at some of the options available commercially.
Exploring our options

If you've spent any time developing code, then there is a good chance you may already be aware of BrowserStack (from https://www.browserstack.com)—other options include the following:

- GhostLab: https://www.vanamco.com/ghostlab/
- Muir: http://labs.iqfoundry.com/
- CrossBrowserTesting: http://www.crossbrowsertesting.com/

If, however, all we need to do is check our website for its level of responsiveness, then we don't need to use paid options—there are a number of websites that allow us to check without needing to install plugins or additional tools:

- Am I Responsive: http://ami.responsive.is
- ScreenQueries: http://screenqueri.es
- Cybercrab's screen check facility: http://cybercrab.com/screencheck
- Remy Sharp's check website: http://responsivepx.com

We can also use bookmarklets to check how well our websites work on different devices—a couple of examples to try are at http://codebomber.com/jquery/resizer and http://responsive.victorcoulon.fr/; it is worth noting that current browsers already include this functionality, making the bookmarklets less attractive as an option.

We have now reached the end of our journey through the essentials of creating responsive websites with nothing more than plain HTML and CSS code. We hope you have enjoyed it as much as we have enjoyed writing it, and that it helps you make a start into the world of responsive design using little more than plain HTML and CSS.

Summary

This article covers the elements of RWD and introduces us to the different flexible grid layouts.

Resources for Article:

- Responsive Web Design with WordPress
- Web Design Principles in Inkscape
- Top Features You Need to Know About – Responsive Web Design

Saying Hello to Java EE

Packt
03 Aug 2016
27 min read
To develop a scalable, distributed, well-presented, complex, and multi-layered enterprise application is complicated. The development becomes even harder if the developer is not well aware of software development fundamentals. Instead of looking at the bigger scenario all at once, if we cut it down into parts and later combine them, it becomes easier to understand as well as to develop. Each technology has some basics which we cannot overlook; if we do overlook them, it may be our biggest mistake. The same is applicable to Java EE. In this article by Tejaswini Mandar Jog, author of the book Learning Modular Java Programming, we are going to explore the following:

- Java EE technologies
- Why servlet and JSP?
- Introduction to Spring MVC
- Creating a sample application through Spring MVC

(For more resources related to this topic, see here.)

The enterprise as an application

To withstand the high, competitive, and daily increasing requirements, it's becoming more and more difficult nowadays to develop an enterprise application. The difficulty is due to the need to support more than one kind of service, and the requirement that the application be robust and support concurrency, security, and much more. Along with these things, enterprise applications should provide an easy user interface, with a good look and feel, for different users. In the last article, we discussed enterprise applications; the discussion was more about understanding the terminology. Let's now discuss it in terms of development, and what developers look forward to:

- The very first thing, even before starting development, is: what are we developing and why? Yes, as developers we need to understand the requirements, or the expectations of the application. Developers have to develop an application which will meet those requirements. The application should be efficient and of high quality so as to sustain itself in the market.
- The application code should be reliable and bug-free to avoid runtime problems.
- No application is perfect; updating it for new demands is a continuous process. Develop an application in such a way that it is easy to update.
- To meet high expectations, developers write code which becomes complicated to understand as well as to change. Each one of us wants to have a new and different product, different from what is on the market. To achieve this, designers sometimes produce an over-clumsy design which is not easy to change in the future. Try to avoid over-complexity, both in design and in business logic.
- When development starts, developers look forward to providing a solution, but they have to give thought to what they are developing and how the code will be organized in terms of easy maintenance and future extension. Yes, we are thinking about modules which do a defined task and which are less dependent. Try to write modules which are loosely coupled and highly cohesive.
- Today we use enterprise applications through different browsers, such as Internet Explorer or Mozilla Firefox. We even use mobile browsers for the same tasks. This demands an application developed to withstand a number of platforms and browsers.

Going through all this discussion, many technologies come to mind. We will go through one such platform which covers the maximum of the above requirements: the Java Enterprise Edition (Java EE) platform. Let's dive in and explore it!

The Java EE platform

Sun Microsystems released the Java EE platform in 2000, which was formerly known as the J2EE specification.
It defines the standards for developing component-based enterprise applications easily. The concrete implementation is provided by application servers such as WebLogic and GlassFish, and by servlet containers such as Tomcat. Today we have Java EE 8 on the market.

Features of the Java EE platform

The following are the various features of the Java EE platform:

- Platform independence: The different types of information which users need in day-to-day life are spread all over the network, on a wide range of platforms. Java EE is well adapted to supporting and using this widely spread, multi-format information on different platforms easily.
- Modularity: The development of enterprise applications is complex and needs to be well organized. The complexity of the application can be reduced by dividing it into different small modules which perform individual tasks, allowing for easy maintenance and testing. They can be organized in separate layers or tiers. These modules interact with each other to perform the business logic.
- Reusability: Enterprise applications need frequent updates to match up with client requirements. Inheritance, the fundamental aspect of the object-oriented approach, offers reusability of components with the help of functions. Java EE offers modularity, where modules can be used individually whenever required.
- Scalability: To meet the demands of a growing market, the enterprise application should keep providing new functionalities to the users. In order to provide these new functionalities, developers have to change the application; they may add new modules or make changes in existing ones. Java EE offers well-managed modules which make scalability easy.

The technologies used in Java EE are as follows: Java Servlet, Java Server Pages, Enterprise Java Beans, the Java Messaging API, XML, the Java Transaction API, JavaMail, and Web Services.

The world of dotcoms

In the 1990s, many people started using computers, for a number of reasons. For personal use, it was really good; when it came to enterprise use, it helped speed up the work. But one main drawback remained: how to share files, data, or information? The computers were in a network, but if someone wanted to access data from another computer, they had to access it personally. Sometimes they had to learn the programs on that computer, which is not only very time-consuming but also never-ending. What if we could use the existing network to share the data remotely? This was a thought put forward by a British computer scientist, Sir Tim Berners-Lee. He thought of a way to share data over the network by exploring an emerging technology called hypertext. In October 1990, Tim wrote three technologies to make this sharing possible: Hyper Text Markup Language (HTML), Uniform Resource Identifier (URI), and Hyper Text Transfer Protocol (HTTP):

- HTML is a computer language which is used in website creation. Hypertext facilitates clicking on a link to navigate the Internet. Markups are HTML tags defining what to do with the text they contain.
- A URI defines a resource by the location or name of the resource, or both. URIs generally refer to a text document or image.
- HTTP is the set of rules for transferring files on the Web. HTTP runs on top of TCP/IP.

He also wrote the first web browser (WorldWideWeb) and the first web server (httpd). The web server is where the application is hosted. This opened the doors to the new, amazing world of the "dotcom". This was just the beginning, and many more technologies have been added to make the Web more realistic.
Using HTTP and HTML, people were able to browse files and get content from remote servers. A little user interaction, or dynamicity, was only possible through JavaScript. People were using the Web but were not satisfied; they needed something more: something which was able to generate output in a totally dynamic way, perhaps displaying data obtained from a data store; something which could manipulate user input and display the results on the browser accordingly. One early answer was the Common Gateway Interface (CGI). A CGI script was a small external program capable of manipulating data at the server side and producing a result. When a user made a request, the server forwarded it to the CGI program. We got output, but with two drawbacks:

- Each time the CGI script was called, a new process was created. Considering the huge number of hits a server receives, CGI became a performance hazard.
- Being an external program, CGI was not capable of taking advantage of the server's abilities.

To add dynamic content which could overcome the above drawbacks and replace CGI, the servlet was developed by Sun in June 1997.

Servlet – the dynamicity

Servlets are Java programs, hosted on a server, that generate dynamic output which will be displayed in the browser. These servers are normally called servlet containers or web servers. These containers are responsible for managing the lifecycle of the servlets, and they can take advantage of the capabilities of the server. A single instance of a servlet handles multiple requests through multithreading, which enhances the performance of the application. Let's discuss servlets in depth to understand them better. A servlet is capable of handling the request (input) from the user and dynamically generating the response (output) in HTML. To create a servlet, we have to write a class which extends GenericServlet or HttpServlet. These classes have a service() method to handle request and response. The server manages the lifecycle of a servlet as follows:

- The servlet will be loaded by the server on arrival of a request.
- The instance will be created.
- init() will be invoked to do the initialization.
- The preceding steps are performed only once in the lifecycle of the servlet, unless the servlet has been destroyed.
- After initialization, a separate thread will be created for each request by the server, and the request and response objects will be passed to the servlet thread.
- The server will call the service() method.
- The service() method will generate a dynamic page and bind it to the HttpResponse object.
- Once the response is sent back to the user, the thread will be deallocated.
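The following is a minimal sketch of such a servlet, our own illustration rather than code from the book; the javax.servlet classes and the init()/doGet() overrides are the standard API:

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A minimal servlet sketch: init() runs once when the container loads
// the servlet; doGet() (invoked via service()) runs on a new thread
// for every GET request.
public class LifecycleServlet extends HttpServlet {

    private String greeting;

    @Override
    public void init() throws ServletException {
        greeting = "Hello from the servlet lifecycle"; // one-time setup
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html");
        PrintWriter writer = response.getWriter();
        writer.println("<h2>" + greeting + "</h2>"); // dynamic output per request
    }
}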
The basic thought of taking presentation out of the servlet leads to JavaServer Pages (JSP). JSP solves the issue of using highly designed HTML pages. JSP provides the facility of using all HTML tags as well as writing logical code using Java. Designers can create well-designed pages using HTML, while programmers add code using scriptlets, expressions, declarations, or directives. Even standard actions like useBean can be used to take advantage of JavaBeans. These JSPs are then translated and compiled into servlets by the server. Now we have three components:

Controller, which handles request and response
Model, which holds the data acquired from handling the business logic
View, which does the presentation

Combining these three, we come across a design pattern: Model-View-Controller (MVC). Using the MVC design pattern, we write modules with a clear separation of work. These modules can be upgraded for future enhancements, and they can be easily tested, as they are less dependent on other modules. The discussion of MVC is incomplete without knowing its two architectural flavors:

MVC I architecture
MVC II architecture

MVC I architecture

In this model, web application development is page-centric, built around JSP pages. In MVC I, JSP alone performs the functionality of handling the request and response and manipulating the input, as well as producing the output. In such web applications, we find a number of JSP pages, each of them performing different functionality. MVC I architecture is good for small web applications with less complexity, where maintaining the flow is easy. The JSP performs the dual task of business logic and presentation together, which makes it unsuitable for enterprise applications.

MVC I architecture

MVC II architecture

In MVC II, a more powerful model has been put forward to give a solution for enterprise applications, with a clear separation of work. It comprises two components: one is the controller and the other is the view, as compared to MVC I, where the JSP is both view and controller. Servlets are responsible for maintaining the flow (the controller) and JSPs for presenting the data (the view). In MVC II, it's easy for developers to develop the business logic in modules which are reusable. MVC II is more flexible due to this separation of responsibilities.

MVC II architecture

The practical aspect

We have traveled a long way. So, instead of moving ahead, let's first develop a web application that accepts data from the user and displays it, using the MVC II architecture. We need to perform the following steps:

Create a dynamic web application with the name Ch02_HelloJavaEE.
Find the servlet-api.jar file in your tomcat/lib folder.
Add servlet-api.jar to the lib folder.
Create index.jsp, containing the form which will accept data from the user.
Create a servlet with the name HelloWorldServlet in the com.packt.ch02.servlets package.
Declare the method doGet(HttpServletRequest req, HttpServletResponse res) to perform the following tasks:
    Read the request data using the HttpServletRequest object.
    Set the MIME type.
    Get an object of PrintWriter.
    Perform the business logic.
    Bind the result to the session, application, or request scope.
Create the view with the name hello.jsp under the folder jsps.
Configure the servlet in the deployment descriptor (DD) for the URL pattern.
Use Expression Language or the JSP Standard Tag Library to display the model in the JSP page.

Let's develop the code.
The filesystem for the project is shown in the following screenshot: We have created the web application and added the JARs. Let's now add index.jsp to accept the data from the user:

<form action="HelloWorldServlet">
  <table>
    <tr>
      <td>NAME:</td>
      <td><input type="text" name="name"></td>
    </tr>
    <tr>
      <td></td>
      <td><input type="submit" value="ENTER"></td>
    </tr>
  </table>
</form>

When the user submits the form, the request will be sent to the URL HelloWorldServlet. Let's create the HelloWorldServlet, which will be invoked for the above URL and which will have doGet(). We create a model with the name message, which we will display in the view, and then forward the request with the help of the RequestDispatcher object. This is done as follows:

protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // read the request parameter
    String name = request.getParameter("name");
    // get the writer
    PrintWriter writer = response.getWriter();
    // set the MIME type
    response.setContentType("text/html");
    // create a model and set it to the scope of request
    request.setAttribute("message", "Hello " + name + " From JAVA Enterprise");
    RequestDispatcher dispatcher = request.getRequestDispatcher("jsps/hello.jsp");
    dispatcher.forward(request, response);
}

Now create the page hello.jsp under the folder jsps to display the model message as follows:

<h2>${message}</h2>

The final step is to configure the servlet we have just created in the DD. The configuration is made for the URL HelloWorldServlet as follows:

<servlet>
  <servlet-name>HelloWorldServlet</servlet-name>
  <servlet-class>com.packt.ch02.servlets.HelloWorldServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>HelloWorldServlet</servlet-name>
  <url-pattern>/HelloWorldServlet</url-pattern>
</servlet-mapping>

Let's deploy the application to check the output:

Displaying the home page for a J2EE application

The following screenshot shows the output when a name is entered by the user:

Showing the output when a name is entered by the user

After developing the above application, we now have sound knowledge of how web development happens, how to manage the flow, and how navigation happens. We can observe one more thing: whether it's searching for data, adding data, or any other kind of operation, there are certain steps which are common, as follows:

Reading the request data
Binding this data to a domain object in terms of model data
Sending the response

We need to perform one or more of the above steps as per the business requirement. Obviously, by performing only these steps we will not achieve the end result, but there is no alternative to them. Let's discuss an example. We want to manage our contact list, with facilities for adding a new contact, updating a contact, searching for one or more contacts, and deleting a contact. The required data will be taken from the user by asking them to fill in a form, and the data will then be persisted in the database. Say, for example, we just want to insert the record in the database: we still have to start the coding from reading the request data and binding it to an object, and only then comes our business operation. The programmers have to repeat these steps unnecessarily. Can't they get rid of them? Is it possible to automate this process? This is the perfect time to discuss frameworks.

What is a framework?

A framework is software which gives generalized solutions to common tasks which occur in application development.
It provides a platform which can be used by developers to build up their applications elegantly.

Advantages of frameworks

The advantages of using frameworks are as follows:

Faster development
Easy binding of request data to a domain object
Predefined solutions
A validation framework

In December 1996, Sun Microsystems published the JavaBeans specification. This specification laid out the rules with which developers can develop reusable, less complex Java components. These POJO classes then became the basis for developing a lightweight, less complex, flexible framework: the Spring framework, which grew out of the ideas of Rod Johnson in February 2003. The Spring framework consists of seven modules:

Spring modules

Though Spring consists of several modules, the developer doesn't always have to depend on the whole framework; they can use any module as per the requirement. It's not even compulsory for the developed code to depend on the Spring API; Spring is called a non-intrusive framework. Spring works on the basis of dependency injection (DI), which makes it easy to integrate. Each class which the developer develops has some dependencies. Take the example of JDBC: to obtain a connection, the developer needs to provide URL, username, and password values. Obtaining the connection depends on these values, so we can call them dependencies, and the injection of these dependencies into objects is called DI. This made the emerging Spring framework a top choice for the middle tier, or business tier, in enterprise applications.

Spring MVC

The Spring MVC module is the choice when we look forward to developing web applications. Spring MVC helps simplify development and build robust applications. With this module, common concerns such as reading request data, binding data to a domain object, server-side validation, and page rendering can be left to the framework, so that we can concentrate on the business logic. That's exactly what we, as developers, were looking for. Spring MVC can be integrated with technologies such as Velocity, Freemarker, Excel, and PDF. It can even take advantage of other services provided by the framework, such as aspect-oriented programming for cross-cutting concerns, transaction management, and security.

The components

Let's first try to understand the flow of a normal web application in view of the Spring framework, so that it will be easy to discuss the components and all other details:

On hitting the URL, the web page is displayed in the browser.
The user fills in the form and submits it.
The front controller intercepts the request.
The front controller tries to find the appropriate Spring MVC controller and passes the request to it.
The business logic is executed and the generated result is bound to a ModelAndView.
The ModelAndView is sent back to the front controller.
The front controller, with the help of a ViewResolver, discovers the view, binds the data, and sends it to the browser.

Spring MVC

The front controller

As we have already seen with servlets and JSP, the developer writes a servlet to maintain each flow of the application, and the data model is forwarded from the servlet to the JSP using attributes. There is no single servlet maintaining the application flow completely. This drawback has been overcome in Spring MVC, as it is based on the front controller design pattern. In the front controller design pattern, there is a single entry point to the application.
Whatever URL is hit by the client, it is handled by a single piece of code, which then delegates the request to the other objects in the application. In Spring MVC, the DispatcherServlet acts as the front controller. The DispatcherServlet takes the decision about which Spring MVC controller the request will be delegated to. In the case of a single Spring MVC controller in the application, the decision is quite easy. But we know that in enterprise applications there are going to be multiple Spring MVC controllers, and here the front controller needs help to find the correct one. The helping hand is the configuration file, where the information needed to discover the Spring MVC controller is configured using handler mapping. Once the Spring MVC controller is found, the front controller delegates the request to it.

Spring MVC controller

All processing, such as the actual business logic, decision making, or manipulation of data, happens in the Spring MVC controller. Once this module completes its operation, it sends the view and the model, encapsulated in an object, normally in the form of a ModelAndView, to the front controller. The front controller then resolves the location of the view. The module which helps the front controller to obtain the view information is the ViewResolver.

ModelAndView

The object which holds the information about the model and the view is called a ModelAndView. The model represents the piece of information used by the view for display in the browser, in different formats.

ViewResolver

The Spring MVC controller returns a ModelAndView to the front controller. The ViewResolver interface helps to map the logical view name to the actual view. In web applications, data can be displayed in a number of formats, from ones as simple as JSP to complicated formats like JasperReports. Spring provides InternalResourceViewResolver, JspViewResolver, JasperReportsViewResolver, VelocityLayoutViewResolver, and so on, to support the different view formats.

The configuration file

The DispatcherServlet needs to discover information about the Spring MVC controllers, the ViewResolver, and much more. All this information is centrally configured in a file named XXX-servlet.xml, where XXX is the name of the front controller. Sometimes the beans are distributed across multiple configuration files; in that case, extra configuration has to be made, which we will see later in this article. The basic configuration file will be:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
           http://www.springframework.org/schema/context
           http://www.springframework.org/schema/context/spring-context-3.0.xsd">
  <!-- mapping of the controller -->
  <!-- the bean for the view resolver is configured here -->
</beans>

The controller configuration file is named name_of_servlet-servlet.xml. In our project, we will name it HelloSpring-servlet.xml. Let's do the basics of a web application using Spring MVC to accept data and display it. We need to perform the following steps:

Create a web application named Ch02_HelloSpring.
Add the required JAR files for Spring (as shown in the following screenshot) and servlets in the lib folder.
Create an index page from which the data can be collected from the user and a request sent to the controller.
Configure the front controller in the DD.
Create a Spring MVC controller named HelloWorldController.
Add a method in the controller for accepting the request, which performs the business logic and sends the view name and the model name, along with its value, to the front controller.
Create an XML file in WEB-INF named front_controller_name-servlet.xml and configure the Spring MVC controller and the ViewResolver.
Create a JSP which acts as the view, displaying the data with the help of Expression Language (EL) and the JSP Standard Tag Library (JSTL).

Let's create the application. The filesystem for the project is as follows: We have already created the dynamic web project Ch02_HelloSpring and added the required JAR files to the lib folder. Let's start by creating the index.jsp page:

<form action="hello.htm">
  <table>
    <tr>
      <td>NAME:</td>
      <td><input type="text" name="name"></td>
    </tr>
    <tr>
      <td></td>
      <td><input type="submit" value="ENTER"></td>
    </tr>
  </table>
</form>

When we submit the form, the request is sent to the resource mapped for the URL hello.htm. Spring MVC follows the front controller design pattern, so all the requests hitting the application will first be attended by the front controller, which then sends them to the respective Spring controllers. The front controller is mapped in the DD as:

<servlet>
  <servlet-name>HelloSpring</servlet-name>
  <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>HelloSpring</servlet-name>
  <url-pattern>*.htm</url-pattern>
</servlet-mapping>

Now the front controller needs help to find the Spring MVC controller. This is taken care of by the configuration file, named XXX-servlet.xml where XXX is replaced by the name of the front controller from the DD; in this case, HelloSpring-servlet.xml holds the configuration. We need to keep this file in the WEB-INF folder. In the configuration file section, we saw the structure of the file. In this file, the package in which the controllers are kept is configured, so that they can be discovered. This is done as follows:

<context:component-scan base-package="com.packt.ch02.controllers" />

Now the front controller will find the controller in the package specified as the value of the base-package attribute. The front controller will then visit HelloWorldController. This class has to be annotated with @Controller:

@Controller
public class HelloWorldController {
    // code here
}

Once the front controller knows what the controller class is, the task of finding the appropriate method starts. This is done by matching the values of the @RequestMapping annotation applied either on the class or on the methods present in the class. In our case, the URL mapping is hello.htm, so the method is developed as:

@RequestMapping(value="/hello.htm")
public ModelAndView sayHello(HttpServletRequest request) {
    String name = request.getParameter("name");
    ModelAndView mv = new ModelAndView();
    mv.setViewName("hello");
    String message = "Hello " + name + " From Spring";
    mv.addObject("message", message);
    return mv;
}

This method returns a ModelAndView object which contains a view name, a model name, and the value for the model. In our code, the view name is hello and the model is represented by message. The front controller now again uses HelloSpring-servlet.xml to find the ViewResolver and get the actual name and location of the view. The ViewResolver provides the directory name (location) where the view is placed with the property prefix, and the format of the view is given by the property suffix. Using the view name, prefix, and suffix, the front controller gets the page.
The ViewResolver binds the model to be used in the view:

<bean id="viewResolver"
      class="org.springframework.web.servlet.view.InternalResourceViewResolver">
  <property name="prefix" value="/WEB-INF/jsps/" />
  <property name="suffix" value=".jsp" />
</bean>

In our case, the prefix is /WEB-INF/jsps/, the name of the page is hello, and the suffix value is .jsp. Combining them, we get /WEB-INF/jsps/hello.jsp, which acts as our view. The actual view is written as prefix + view_name from the ModelAndView + suffix, for instance:

/WEB-INF/jsps/ + hello + .jsp

The data is bound by the front controller and the view is able to use it:

<h2>${message}</h2>

This page is now ready to be rendered by the browser, which gives the following output in the browser:

Displaying the home page for a Spring application

Entering a name in the text field (for example, Bob) and submitting the form gives the following output:

Showing an output when a name is entered by the user

Now that we understand the working of Spring MVC, let's discuss a few more things required in order to develop a Spring MVC controller. Each class which we want to be discovered as a controller should be annotated with the @Controller annotation. In this class, there may be a number of methods which can be invoked on request. A method which we want to map to a URL has to be annotated with the @RequestMapping annotation. There can be more than one method mapped to the same URL, each invoked for a different HTTP method. Note that the two handler methods need distinct names, since Java does not allow two methods with the same signature in one class. This can be done as follows:

@RequestMapping(value="/hello.htm", method=RequestMethod.GET)
public ModelAndView sayHello(HttpServletRequest request) {

}

@RequestMapping(value="/hello.htm", method=RequestMethod.POST)
public ModelAndView processHello(HttpServletRequest request) {

}

These methods normally accept the request as a parameter and return a ModelAndView, but the following parameter and return types are also supported. The following are some of the supported method argument types:

HttpServletRequest
HttpSession
java.util.Map / org.springframework.ui.Model / org.springframework.ui.ModelMap
@PathVariable
@RequestParam
org.springframework.validation.Errors / org.springframework.validation.BindingResult

The following are some of the supported method return types:

ModelAndView
Model
Map
View
String
void

Sometimes the bean configuration is scattered across more than one file. For example, we can have the controller configuration in one file and the database- and security-related configuration in a separate file. In that case, we have to add extra configuration in the DD to load multiple configuration files, as follows:

<servlet>
  <servlet-name>HelloSpring</servlet-name>
  <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
  <init-param>
    <param-name>contextConfigLocation</param-name>
    <param-value>/WEB-INF/beans.xml</param-value>
  </init-param>
</servlet>
<servlet-mapping>
  <servlet-name>HelloSpring</servlet-name>
  <url-pattern>*.htm</url-pattern>
</servlet-mapping>

Summary

In this article, we learned how the application is single-handedly controlled by the front controller, the DispatcherServlet. The actual work of performing the business logic and giving back the name of the view and the data model is done by the Spring MVC controller. The view is resolved by the front controller with the help of the respective ViewResolver, and the view displays the data received from the Spring MVC controller.
To explore Spring MVC further, we need to study the web layer, the business logic layer, and the data layer in depth, building on the basics of Spring MVC discussed in this article.

Resources for Article:

Further resources on this subject:

Working with Spring Tag Libraries [article]
Working with JIRA [article]
A capability model for microservices [article]


Scripting for Animation in Maya

Packt
02 Aug 2016
28 min read
This article, written by Adrian Herbez, author of Maya Programming with Python Cookbook, will cover various recipes related to animating objects with scripting:

Querying animation data
Working with animation layers
Copying animation from one object to another
Setting keyframes
Creating expressions via script

(For more resources related to this topic, see here.)

In this article, we'll be looking at how to use scripting to create animation and set keyframes. We'll also see how to work with animation layers and create expressions from code.

Querying animation data

In this example, we'll be looking at how to retrieve information about animated objects, including which attributes are animated and both the location and value of keyframes. Although this script is unlikely to be useful by itself, knowing the number, time, and values of keyframes is sometimes a prerequisite for more complex animation tasks.

Getting ready

To get the most out of this script, you'll need to have an object with some animation curves defined. Either load up a scene with animation or skip ahead to the recipe on setting keyframes.

How to do it...

Create a new file and add the following code:

import maya.cmds as cmds

def getAnimationData():
    objs = cmds.ls(selection=True)
    obj = objs[0]
    animAttributes = cmds.listAnimatable(obj)
    for attribute in animAttributes:
        numKeyframes = cmds.keyframe(attribute, query=True, keyframeCount=True)
        if (numKeyframes > 0):
            print("---------------------------")
            print("Found ", numKeyframes, " keyframes on ", attribute)
            times = cmds.keyframe(attribute, query=True, index=(0,numKeyframes), timeChange=True)
            values = cmds.keyframe(attribute, query=True, index=(0,numKeyframes), valueChange=True)
            print('frame#, time, value')
            for i in range(0, numKeyframes):
                print(i, times[i], values[i])
            print("---------------------------")

getAnimationData()

If you select an object with animation curves and run the script, you should see a readout of the time and value for each keyframe on each animated attribute. For example, if we had a simple bouncing ball animation with the following curves: We would see something like the following output in the script editor:

---------------------------
('Found ', 2, ' keyframes on ', u'|bouncingBall.translateX')
frame#, time, value
(0, 0.0, 0.0)
(1, 190.0, 38.0)
---------------------------
---------------------------
('Found ', 20, ' keyframes on ', u'|bouncingBall.translateY')
frame#, time, value
(0, 0.0, 10.0)
(1, 10.0, 0.0)
(2, 20.0, 8.0)
(3, 30.0, 0.0)
(4, 40.0, 6.4000000000000004)
(5, 50.0, 0.0)
(6, 60.0, 5.120000000000001)
(7, 70.0, 0.0)
(8, 80.0, 4.096000000000001)
(9, 90.0, 0.0)
(10, 100.0, 3.276800000000001)
(11, 110.0, 0.0)
(12, 120.0, 2.6214400000000011)
(13, 130.0, 0.0)
(14, 140.0, 2.0971520000000008)
(15, 150.0, 0.0)
(16, 160.0, 1.6777216000000008)
(17, 170.0, 0.0)
(18, 180.0, 1.3421772800000007)
(19, 190.0, 0.0)
---------------------------

How it works...

We start out by grabbing the selected object, as usual. Once we've done that, we'll iterate over all the keyframeable attributes, determine if they have any keyframes and, if they do, run through the times and values. To get the list of keyframeable attributes, we use the listAnimatable command:

objs = cmds.ls(selection=True)
obj = objs[0]
animAttributes = cmds.listAnimatable(obj)

This will give us a list of all the attributes on the selected object that can be animated, including any custom attributes that have been added to it.
If you were to print out the contents of the animAttributes array, you would likely see something like the following:

|bouncingBall.rotateX
|bouncingBall.rotateY
|bouncingBall.rotateZ

Although the bouncingBall.rotateX part likely makes sense, you may be wondering about the | symbol. This symbol is used by Maya to indicate hierarchical relationships between nodes, in order to provide fully qualified node and attribute names. If the bouncingBall object were a child of a group named ballGroup, we would see this instead:

|ballGroup|bouncingBall.rotateX

Every such fully qualified name will contain at least one pipe (|) symbol, as we see in the first, non-grouped example, but there can be many more, one for each additional layer of hierarchy. While this can lead to long strings for attribute names, it allows Maya to make use of objects that may have the same name but live under different parts of a larger hierarchy (to have control objects named handControl for each hand of a character, for example). Now that we have a list of all the possibly animated attributes for the object, we'll next want to determine if there are any keyframes set on them. To do this, we can use the keyframe command in query mode:

for attribute in animAttributes:
    numKeyframes = cmds.keyframe(attribute, query=True, keyframeCount=True)

At this point, we have a variable (numKeyframes) that will be greater than zero for any attribute with at least one keyframe. Getting the total number of keyframes on an attribute is only one of the things that the keyframe command can do; we'll also use it to grab the time and value for each of the keyframes. To do this, we'll call it two more times, both in query mode: once to get the times and once to get the values:

times = cmds.keyframe(attribute, query=True, index=(0,numKeyframes), timeChange=True)
values = cmds.keyframe(attribute, query=True, index=(0,numKeyframes), valueChange=True)

These two lines are identical in everything except what type of information we're asking for. The important thing to note here is the index flag, which is used to tell Maya which keyframes we're interested in. The command requires a two-element argument representing the first (inclusive) and last (exclusive) index of keyframes to examine. So, if we had 20 keyframes in total, we would pass in (0,20), which would examine the keys with indices from 0 to 19. The flags we're using to get the values likely look a bit odd: both valueChange and timeChange might lead you to believe that we would be getting relative values, rather than absolute ones. However, when used in the previously mentioned manner, the command will give us what we want: the actual time and value for each keyframe, as they appear in the graph editor. If you want to query information on a single keyframe, you still have to pass in a pair of values; just use the index that you're interested in twice. To get the fourth frame, for example, use (3,3). At this point, we have two arrays: the times array, which contains the time value for each keyframe, and the values array, which contains the actual attribute values. All that's left is to print out the information that we've found:

print('frame#, time, value')
for i in range(0, numKeyframes):
    print(i, times[i], values[i])

There's more...

Using indices to get data on keyframes is an easy way to run through all of the data for a curve, but it's not the only way to specify a range. The keyframe command can also accept time values.
If we wanted to know how many keyframes existed on a given attribute between frame 1 and frame 100, for example, we could do the following:

numKeyframes = cmds.keyframe(attributeName, query=True, time=(1,100), keyframeCount=True)

Also, if you find yourself with highly nested objects and need to extract just the object and attribute names, you may find Python's built-in split function helpful. You can call split on a string to have Python break it up into a list of parts. By default, Python will break up the input string by spaces, but you can specify a particular string or character to split on. Assume that you have a string like the following:

|group4|group3|group2|group1|ball.rotateZ

You could then use split to break it apart based on the | symbol. It would give you a list, and using -1 as an index would give you just ball.rotateZ. Putting that into a function that can be used to extract the object/attribute names from a full string is easy, and it would look something like the following:

def getObjectAttributeFromFull(fullString):
    parts = fullString.split("|")
    return parts[-1]

Using it would look something like this:

inputString = "|group4|group3|group2|group1|ball.rotateZ"
result = getObjectAttributeFromFull(inputString)
print(result) # outputs "ball.rotateZ"

Working with animation layers

Maya offers the ability to create multiple layers of animation in a scene, which can be a good way to build up complex animation. The layers can then be independently enabled or disabled, or blended together, granting the user a great deal of control over the end result. In this example, we'll be looking at how to examine the layers that exist in a scene, and we'll build a script that ensures we have a layer of a given name. For example, we might want to create a script that adds additional randomized motion to the rotations of selected objects without overriding their existing motion. To do this, we would want to make sure that we had an animation layer named randomMotion, to which we could then add keyframes.

How to do it...

Create a new script and add the following code:

import maya.cmds as cmds

def makeAnimLayer(layerName):
    baseAnimationLayer = cmds.animLayer(query=True, root=True)
    foundLayer = False
    if (baseAnimationLayer != None):
        childLayers = cmds.animLayer(baseAnimationLayer, query=True, children=True)
        if (childLayers != None) and (len(childLayers) > 0):
            if layerName in childLayers:
                foundLayer = True
    if not foundLayer:
        cmds.animLayer(layerName)
    else:
        print('Layer ' + layerName + ' already exists')

makeAnimLayer("myLayer")

Run the script, and you should see an animation layer named myLayer appear in the Anim tab of the channel box.

How it works...

The first thing that we want to do is to find out if there is already an animation layer with the given name present in the scene. To do this, we start by grabbing the name of the root animation layer:

baseAnimationLayer = cmds.animLayer(query=True, root=True)

In almost all cases, this should return one of two possible values: either BaseAnimation or (if there aren't any animation layers yet) Python's built-in None value.
We'll want to create a new layer in either of the following two possible cases:

There are no animation layers yet
There are animation layers, but none with the target name

In order to make testing for the above a bit easier, we first create a variable to hold whether or not we've found an animation layer, and set it to False:

foundLayer = False

Now we need to check whether it's true both that animation layers exist and that one of them has the given name. First off, we check that there is, in fact, a base animation layer:

if (baseAnimationLayer != None):

If this is the case, we want to grab all the children of the base animation layer and check to see whether any of them have the name we're looking for. To grab the children animation layers, we'll use the animLayer command again, again in query mode:

childLayers = cmds.animLayer(baseAnimationLayer, query=True, children=True)

Once we've done that, we'll want to see whether any of the child layers match the one we're looking for. We'll also need to account for the possibility that there are no child layers (which could happen if animation layers were created and then later deleted, leaving only the base layer):

if (childLayers != None) and (len(childLayers) > 0):
    if layerName in childLayers:
        foundLayer = True

If there were child layers and the name we're looking for was found, we set our foundLayer variable to True. If the layer wasn't found, we create it. That's easily done by using the animLayer command one more time, with the name of the layer we're trying to create:

if not foundLayer:
    cmds.animLayer(layerName)

Finally, we finish off by printing a message if the layer was found, to let the user know.

There's more...

Having animation layers is great, in that we can make use of them when creating or modifying keyframes. However, we can't actually add animation to layers without first adding the objects in question to the animation layer. Let's say that we had an object named bouncingBall, and we wanted to set some keyframes on its translateY attribute in the bounceLayer animation layer. The actual command to set the keyframe(s) would look something like this:

cmds.setKeyframe("bouncingBall.translateY", value=yVal, time=frame, animLayer="bounceLayer")

However, this would only work as expected if we had first added the bouncingBall object to the bounceLayer animation layer. To do that, we could use the animLayer command in edit mode, with the addSelectedObjects flag. Note that because the flag operates on the currently selected objects, we would need to first select the object we want to add:

cmds.select("bouncingBall", replace=True)
cmds.animLayer("bounceLayer", edit=True, addSelectedObjects=True)

Adding the object will, by default, add all of its animatable attributes. You can also add specific attributes, rather than entire objects. For example, if we only wanted to add the translateY attribute to our animation layer, we could do the following:

cmds.animLayer("bounceLayer", edit=True, attribute="bouncingBall.translateY")

Copying animation from one object to another

In this example, we'll create a script that will copy all of the animation data on one object to one or more additional objects, which could be useful for duplicating motion across a range of objects.

Getting ready

For the script to work, you'll need an object with some keyframes set. Either create some simple animation or skip ahead to the example on creating keyframes with script, later in this article.

How to do it...
Create a new script and add the following code:

import maya.cmds as cmds

def getAttName(fullname):
    parts = fullname.split('.')
    return parts[-1]

def copyKeyframes():
    objs = cmds.ls(selection=True)
    if (len(objs) < 2):
        cmds.error("Please select at least two objects")
    sourceObj = objs[0]
    animAttributes = cmds.listAnimatable(sourceObj)
    for attribute in animAttributes:
        numKeyframes = cmds.keyframe(attribute, query=True, keyframeCount=True)
        if (numKeyframes > 0):
            cmds.copyKey(attribute)
            for obj in objs[1:]:
                cmds.pasteKey(obj, attribute=getAttName(attribute), option="replace")

copyKeyframes()

Select the animated object, shift-select at least one other object, and run the script. You'll see that all of the objects have the same motion.

How it works...

The very first part of our script is a helper function that we'll use to strip the attribute name off a full object name/attribute name string. More on it later. Now on to the bulk of the script. First off, we run a check to make sure that the user has selected at least two objects. If not, we'll display a friendly error message to let the user know what they need to do:

objs = cmds.ls(selection=True)
if (len(objs) < 2):
    cmds.error("Please select at least two objects")

The error command will also stop the script from running, so if we're still going, we know that we have at least two objects selected. We'll set the first one selected to be our source object. We could just as easily use the second-selected object, but that would mean using the first selected object as the destination, limiting us to a single target:

sourceObj = objs[0]

Now we're ready to start copying animation, but first, we'll need to determine which attributes are currently animated, through a combination of finding all the attributes that can be animated and checking each one to see whether there are any keyframes on it:

animAttributes = cmds.listAnimatable(sourceObj)
for attribute in animAttributes:
    numKeyframes = cmds.keyframe(attribute, query=True, keyframeCount=True)

If we have at least one keyframe for the given attribute, we move forward with the copying:

if (numKeyframes > 0):
    cmds.copyKey(attribute)

The copyKey command causes the keyframes for a given object to be temporarily held in memory. If used without any additional flags, it will grab all of the keyframes for the specified attribute, which is exactly what we want in this case. If we wanted only a subset of the keyframes, we could use the time flag to specify a range. We're passing in each of the values that were returned by the listAnimatable function. These will be full names (both object name and attribute). That's fine for the copyKey command, but it will require a bit of additional work for the paste operation. Since we're copying the keys onto a different object than the one that we copied them from, we'll need to separate out the object and attribute names. For example, our attribute value might be something like this:

|group1|bouncingBall.rotateX

From this, we'll want to trim off just the attribute name (rotateX), since we're getting the object name from the selection list. To do this, we created a simple helper function that takes a full-length object/attribute name and returns just the attribute name. That's easy enough to do by breaking the name/attribute string apart on the "."
and returning the last element, which in this case is the attribute:

def getAttName(fullname):
    parts = fullname.split('.')
    return parts[-1]

Python's split function breaks the string apart into an array of strings, and using a negative index counts back from the end, with -1 giving us the last element. Now we can actually paste our keys. We'll run through all the remaining selected objects, starting with the second, and paste our copied keyframes:

for obj in objs[1:]:
    cmds.pasteKey(obj, attribute=getAttName(attribute), option="replace")

Note that we're using the nature of Python's for loops to make the code a bit more readable. Rather than using an index, as would be the case in most other languages, we can just use the for x in y construction. In this case, obj is a temporary variable, scoped to the for loop, that takes on the value of each item in the list. Also note that instead of passing in the entire list, we use objs[1:] to indicate the entire list starting at index 1 (the second element). The colon allows us to specify a subrange of the objs list, and leaving the right-hand side blank causes Python to include all items to the end of the list. We pass in the name of the object (from our original selection) and the attribute (stripped from the full name/attribute string via our helper function), and we use option="replace" to ensure that the keyframes we're pasting replace anything that's already there.

Original animation (top). Here, we see the result of pasting keys with the default settings (left) and with the replace option (right). Note that the default results still contain the original curves, just pushed to later frames.

If we didn't include the option flag, Maya would default to inserting the pasted keyframes while moving any keyframes already present forward in the timeline.

There's more...

There are a lot of other options for the option flag, each of which handles possible conflicts between the keys you're pasting and the ones that may already exist in a slightly different way. Be sure to have a look at the built-in documentation for the pasteKey command for more information. Another, and perhaps better, option to control how pasted keys interact with existing ones is to paste the new keys into a separate animation layer. For example, if we wanted to make sure that our pasted keys end up in an animation layer named extraAnimation, we could modify the call to pasteKey as follows:

cmds.pasteKey(obj, attribute=getAttName(attribute), option="replace", animLayer="extraAnimation")

Note that if there is no animation layer named extraAnimation present, Maya will fail to copy the keys. See the section on working with animation layers for more information on how to query existing layers and create new ones.

Setting keyframes

While there are certainly a variety of ways to get things to move in Maya, the vast majority of motion is driven by keyframes. In this example, we'll be looking at how to create keyframes with code by making that old animation standby: a bouncing ball.

Getting ready

The script we'll be creating will animate the currently selected object, so make sure that you have an object selected, either the traditional sphere or something else you'd like to make bounce.

How to do it...
Create a new file and add the following code:

import maya.cmds as cmds

def setKeyframes():
    objs = cmds.ls(selection=True)
    obj = objs[0]
    yVal = 0
    xVal = 0
    frame = 0
    maxVal = 10
    for i in range(0, 20):
        frame = i * 10
        xVal = i * 2
        if i % 2 == 1:
            yVal = 0
        else:
            yVal = maxVal
            maxVal *= 0.8
        cmds.setKeyframe(obj + '.translateY', value=yVal, time=frame)
        cmds.setKeyframe(obj + '.translateX', value=xVal, time=frame)

setKeyframes()

Run the preceding script with an object selected and trigger playback. You should see the object move up and down.

How it works...

In order to get our object to bounce, we'll need to set keyframes such that the object alternates between a Y-value of zero and an ever-decreasing maximum, so that the animation mimics the way a falling object loses velocity with each bounce. We'll also make it move forward along the x axis as it bounces. We start by grabbing the currently selected object and setting a few variables to make things easier to read as we run through our loop. Our yVal and xVal variables will hold the current values that we want to set the position of the object to. We also have a frame variable to hold the current frame, and a maxVal variable to hold the Y-value of the object's current bounce height. This example is sufficiently simple that we don't really need separate variables for the frame and the attribute values, but setting things up this way makes it easier to swap in more complex math or logic to control where keyframes get set and to what values. This gives us the following:

yVal = 0
xVal = 0
frame = 0
maxVal = 10

The bulk of the script is a single loop, in which we set keyframes on both the X and Y positions. For the xVal variable, we'll just multiply a constant value (in this case, 2 units). We'll do the same for the frame. For the yVal variable, we'll alternate between an ever-decreasing value (for the successive peaks) and zero (for when the ball hits the ground). To alternate between zero and non-zero, we check whether our loop variable is divisible by two. One easy way to do this is to take the value modulo (%) 2. This gives us the remainder when the value is divided by two, which will be zero for even numbers and one for odd numbers. For odd values, we set yVal to zero, and for even ones, we set it to maxVal. To make sure that the ball bounces a little less each time, we set maxVal to 80% of its current value each time we make use of it. Putting all of that together gives us the following loop:

for i in range(0, 20):
    frame = i * 10
    xVal = i * 2
    if (i % 2) == 1:
        yVal = 0
    else:
        yVal = maxVal
        maxVal *= 0.8

Now we're finally ready to actually set keyframes on our object. This is easily done with the setKeyframe command. We need to specify the following three things:

The attribute to keyframe (object name and attribute)
The time at which to set the keyframe
The actual value to set the attribute to

In this case, this ends up looking like the following:

cmds.setKeyframe(obj + '.translateY', value=yVal, time=frame)
cmds.setKeyframe(obj + '.translateX', value=xVal, time=frame)

And that's it! A proper bouncing ball (or other object) animated with pure code.

There's more...

By default, the setKeyframe command will create keyframes with both the in tangent and out tangent set to spline. That's fine for a lot of things, but it will result in overly smooth animation for something that's supposed to be striking a hard surface.
We can improve our bounce animation by keeping smooth tangents for the keyframes where the object reaches its maximum height, but setting the tangents at its minimum to be linear. This will give us a nice sharp change every time the ball strikes the ground. To do this, all we need to do is to set both the inTangentType and outTangentType flags to linear, as follows:

cmds.setKeyframe(obj + ".translateY", value=animVal, time=frame, inTangentType="linear", outTangentType="linear")

To make sure that we only have linear tangents when the ball hits the ground, we can set up a variable to hold the tangent type and set it to one of two values, in much the same way that we set the yVal variable. This ends up looking like this:

tangentType = "auto"
for i in range(0, 20):
    frame = i * 10
    if i % 2 == 1:
        yVal = 0
        tangentType = "linear"
    else:
        yVal = maxVal
        tangentType = "spline"
        maxVal *= 0.8
    cmds.setKeyframe(obj + '.translateY', value=yVal, time=frame, inTangentType=tangentType, outTangentType=tangentType)

Creating expressions via script

While most animation in Maya is created manually, it can often be useful to drive attributes directly via script, especially for mechanical objects or background items. One way to approach this is through Maya's expression editor. In addition to creating expressions via the expression editor, it is also possible to create expressions with scripting, in a beautiful example of code-driven code. In this example, we'll be creating a script that can be used to create a sine wave-based expression to smoothly alter a given attribute between two values. Note that expressions cannot actually use Python code directly; they require the code to be written in the MEL syntax. But this doesn't mean that we can't use Python to create expressions, which is what we'll do in this example.

Getting ready

Before we dive into the script, we'll first need to have a good handle on the kind of expression we'll be creating. There are a lot of different ways to approach expressions, but in this instance, we'll keep things relatively simple and tie the attribute to a sine wave based on the current time. Why a sine wave? Sine waves are great because they alter smoothly between two values, with a nice easing into and out of both the minimum and maximum. While the minimum and maximum output values range from -1 to 1, it's easy enough to alter the output to move between any two numbers we want. We'll also make things a bit more flexible by setting up the expression to rely on a custom speed attribute that can be used to control the rate at which the attribute animates. The end result will be a value that varies smoothly between any two numbers at a user-specified (and keyframeable) rate.

How to do it...

Create a new script and add the following code:

import maya.cmds as cmds

def createExpression(att, minVal, maxVal, speed):
    objs = cmds.ls(selection=True)
    obj = objs[0]
    cmds.addAttr(obj, longName="speed", shortName="speed", min=0, keyable=True)
    amplitude = (maxVal - minVal)/2.0
    offset = minVal + amplitude
    baseString = "{0}.{1} = ".format(obj, att)
    sineClause = '(sin(time * ' + obj + '.speed)'
    valueClause = ' * ' + str(amplitude) + ' + ' + str(offset) + ')'
    expressionString = baseString + sineClause + valueClause
    cmds.expression(string=expressionString)

createExpression('translateY', 5, 10, 1)

How it works...

The first thing that we do is to add a speed attribute to our object.
We make sure it's keyable for later animation:

cmds.addAttr(obj, longName="speed", shortName="speed", min=0, keyable=True)

It's generally a good idea to include at least one keyframeable attribute when creating expressions. While math-driven animation is certainly a powerful technique, you'll likely still want to be able to alter the specifics. Giving yourself one or more keyframeable attributes is an easy way to do just that. Now we're ready to build up our expression. But first, we'll need to understand exactly what we want; in this case, a value that smoothly varies between two extremes, with the ability to control its speed. We can easily build an expression to do that using the sine function, with the current time as the input. Here's what it looks like in a general form:

animatedValue = (sin(time * S) * M) + O;

Where:

S is a value that will either speed up (if greater than 1) or slow down (if less) the rate at which the input to the sine function changes
M is a multiplier to alter the overall range through which the value changes
O is an offset to ensure that the minimum and maximum values are correct

You can also think about it visually: S will cause our wave to stretch or shrink along the horizontal (time) axis, M will expand or contract it vertically, and O will move the entire shape of the curve either up or down. S is already taken care of; it's our newly created speed attribute. M and O need to be calculated, based on the fact that sine functions always produce values ranging from -1 to 1. The overall range of values should run from our minVal to our maxVal, so you might think that M should be equal to (maxVal - minVal). However, since it gets applied to both -1 and 1, that would leave us with double the desired change. So, the value we want is instead (maxVal - minVal)/2. We store that in our amplitude variable as follows:

amplitude = (maxVal - minVal)/2.0

Next up is the offset value O. We want to move our graph such that the minimum and maximum values are where they should be. It might seem like that would mean just adding our minVal, but if we left it at that, our output would dip below the minimum 50% of the time (any time the sine function produces negative output). To fix it, we set O to (minVal + M), or, in the case of our script:

offset = minVal + amplitude

This way, we move the zero position of the wave to be midway between our minVal and maxVal, which is exactly what we want. To make things clearer, let's look at the different parts we're tacking onto sin(), and the way they affect the minimum and maximum values the expression will output. We'll assume that the end result we're looking for is a range from 0 to 4.

Expression                     Additional component        Minimum        Maximum
sin(time)                      None (raw sine function)    -1             1
sin(time * speed)              Multiply input by speed     -1 (faster)    1 (faster)
sin(time * speed) * 2          Multiply output by 2        -2             2
(sin(time * speed) * 2) + 2    Add 2 to output             0              4

Note that 2 = (4 - 0)/2 and 2 = 0 + 2. Here's what the preceding progression looks like when graphed:

Four steps in building up an expression to vary an attribute from 0 to 4 with a sine function.

Okay, now that we have the math locked down, we're ready to translate it into Maya's expression syntax.
If we wanted an object named myBall to animate along Y between 7 and 17 (amplitude 5, offset 12), we would want to end up with:

myBall.translateY = (sin(time * myBall.speed) * 5) + 12;

This would work as expected if entered into Maya's expression editor, but we want a more general-purpose solution that can be used with any object and any values. That's straightforward enough; it just requires building up the preceding string from various literals and variables, which is what we do in the next few lines:

baseString = "{0}.{1} = ".format(obj, att)
sineClause = '(sin(time * ' + obj + '.speed)'
valueClause = ' * ' + str(amplitude) + ' + ' + str(offset) + ')'
expressionString = baseString + sineClause + valueClause

I've broken the string creation into a few different lines to make things clearer, but it's by no means necessary. The key idea here is that we're switching back and forth between literals (sin(time *, .speed, and so on) and variables (obj, att, amplitude, and offset) to build the overall string. Note that we have to wrap numbers in the str() function to keep Python from complaining when we combine them with strings. At this point, we have our expression string ready to go. All that's left is to actually add it to the scene as an expression, which is easily done with the expression command:

cmds.expression(string=expressionString)

And that's it! We now have an attribute that varies smoothly between any two values.

There's more...

There are tons of other ways to use expressions to drive animation, and all sorts of simple mathematical tricks that can be employed. For example, you can easily get a value to move smoothly to a target value, with a nice ease-in to the target, by running this every frame:

animatedAttribute = animatedAttribute + (targetValue - animatedAttribute) * 0.2;

This adds 20% of the current difference between the target and the current value to the attribute, which moves it towards the target. Since the amount that is added is always a percentage of the current difference, the per-frame effect reduces as the value approaches the target, providing an ease-in effect. If we were to combine this with some code to randomly choose a new target value, we would have an easy way to, say, animate the heads of background characters to randomly look in different directions (perhaps to provide a stadium crowd). Assuming we had added custom attributes for targetX, targetY, and targetZ to an object named myCone, that would end up looking something like the following:

if (frame % 20 == 0) {
    myCone.targetX = rand(time) * 360;
    myCone.targetY = rand(time) * 360;
    myCone.targetZ = rand(time) * 360;
}
myCone.rotateX += (myCone.targetX - myCone.rotateX) * 0.2;
myCone.rotateY += (myCone.targetY - myCone.rotateY) * 0.2;
myCone.rotateZ += (myCone.targetZ - myCone.rotateZ) * 0.2;

Note that we're using the modulo (%) operator to do something (setting the target) only when the frame is an even multiple of 20. We're also using the current time as the seed value for the rand() function to ensure that we get different results as the animation progresses. The previously mentioned example is how the code would look if we entered it directly into Maya's expression editor; note the MEL-style (rather than Python) syntax. Generating this code via Python would be a bit more involved than our sine wave example, but would use all the same principles: building up a string from literals and variables, then passing that string to the expression command.
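As a rough sketch of how that generation might look, the following Python function builds the same MEL expression from strings. The function name is hypothetical, and it assumes the targetX/targetY/targetZ custom attributes have already been added to the object, as described above:

import maya.cmds as cmds

def createLookAroundExpression(obj, interval=20, damping=0.2):
    # build the MEL expression line by line, then join with newlines
    lines = []
    lines.append("if (frame % " + str(interval) + " == 0) {")
    for axis in ["X", "Y", "Z"]:
        # pick a new random target every 'interval' frames
        lines.append("    {0}.target{1} = rand(time) * 360;".format(obj, axis))
    lines.append("}")
    for axis in ["X", "Y", "Z"]:
        # ease each rotation towards its target by 'damping' per frame
        lines.append("{0}.rotate{1} += ({0}.target{1} - {0}.rotate{1}) * {2};".format(obj, axis, damping))
    cmds.expression(string="\n".join(lines))

createLookAroundExpression("myCone")

As with the sine wave example, the Python code never runs the logic itself; it only assembles the MEL string and hands it to the expression command, which Maya then evaluates every frame.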
Summary

In this article, we primarily discussed scripting and animation using Maya.

Resources for Article:

Further resources on this subject:

Introspecting Maya, Python, and PyMEL [article]
Discovering Python's parallel programming tools [article]
Mining Twitter with Python – Influence and Engagement [article]


Rapid Application Development with Django, the Openduty story

Bálint Csergő
01 Aug 2016
5 min read
Openduty is an open source incident escalation tool, something like PagerDuty, but free and much simpler. It was born during a hackathon at Ustream back in 2014. The project received a lot of attention in the devops community, and was also featured in Devops Weekly and Pycoders Weekly. It is listed at Full Stack Python as an example Django project. This article covers some design decisions we made during the hackathon, and details some of the main components of the Openduty system.

Design

When we started the project, we already knew what we wanted to end up with:

We had to work quickly; it was a hackathon, after all
An API similar to PagerDuty's
The ability to send notifications asynchronously
A nice calendar to organize on-call schedules; that can't hurt anyone, right?
Tokens for authorizing notifiers

So we chose the corresponding components to reach our goal.

Get the job done quickly

If you have to develop apps rapidly in Python, Django is the framework you choose. It's a bit heavyweight, but hey, it gives you everything you need, and sometimes even more. Don't get me wrong; I'm a big fan of Flask too, but it can be a bit fiddly to assemble everything by hand at the start. Flask may pay off later, and you may win on a lower number of dependencies, but we only had 24 hours, so we went with Django.

An API

When it comes to Django and REST APIs, one of the go-to solutions is the Django REST Framework. It has all the nuts and bolts you'll need when you're assembling an API, like serializers, authentication, and permissions. It can even make all your API calls self-describing. Let me show you how serializers work in the REST Framework:

class OnCallSerializer(serializers.Serializer):
    person = serializers.CharField()
    email = serializers.EmailField()
    start = serializers.DateTimeField()
    end = serializers.DateTimeField()

The code above represents a person who is on call, as exposed by the API. As you can see, it is pretty simple; you just have to define the fields. It even does the validation for you, since you have to give a type to every field. But believe me, it's capable of more good things, like generating a serializer from your Django model:

class SchedulePolicySerializer(serializers.HyperlinkedModelSerializer):
    rules = serializers.RelatedField(many=True, read_only=True)

    class Meta:
        model = SchedulePolicy
        fields = ('name', 'repeat_times', 'rules')

This example shows how you can customize a ModelSerializer, make fields read-only, and accept only the given fields from an API call.

Async Task Execution

When you have long-running tasks, such as generating huge reports, resizing images, or even transcoding media, it is common practice to move the actual execution out of your webapp into a separate layer. This decreases the load on the web servers, helps avoid long or even timed-out requests, and just makes your app more resilient and scalable. In the Python world, the go-to solution for asynchronous task execution is called Celery. In Openduty, we use Celery heavily to send notifications asynchronously, and also to delay the execution of any given notification task by the delay defined in the service settings.
Calendar

We use django-scheduler for the awesome-looking calendar in Openduty. It is a pretty good project generally, supports recurring events, and provides you with a UI for your calendar, so you won't even have to fiddle with that.

Tokens and Auth

Service token implementation is a simple thing. You want them to be unique, and what else would you choose if not a UUID? There is a nice plugin for Django models used to handle UUID fields, called django-uuidfield. It just does what it says: adding UUIDField support to your models. User authentication is a bit more interesting; we currently support plain Django users, and you can use LDAP as your user provider.

Summary

This was just a short summary of the design decisions made when we coded Openduty, with some relevant snippets to demonstrate the power of the components. If you are on a short deadline, consider using Django and its extensions. There is a good chance that somebody has already done what you need to do, or something similar, which can always be adapted to your needs thanks to the awesome power of the open source community.

About the author

Bálint Csergő is a software engineer from Budapest, currently working as an infrastructure engineer at Hortonworks. He loves Unix systems, PHP, Python, Ruby, the Oracle database, Arduino, Java, C#, music, and beer.
Memory

Packt
01 Aug 2016
26 min read
In this article by Enrique López Mañas and Diego Grancini, authors of the book Android High Performance Programming, we explain why memory is the matter to focus on. A badly memory-managed application can affect the behavior of the whole system, or it can affect the other applications installed on our device, in the same way that other applications could affect ours. As we all know, Android has a wide range of devices on the market, with many different configurations and amounts of memory. It's up to developers to understand the strategy to take while dealing with this amount of fragmentation, the patterns to follow while developing, and the tools to use to profile the code. This is the aim of this article.

In the following sections, we will focus on heap memory. We will take a look at how our device handles memory, deepening our understanding of what garbage collection is and how it works, in order to understand how to avoid common development mistakes and to clarify what we will discuss when defining best practices. We will also go through pattern definitions in order to drastically reduce the risk of what we will identify as memory leaks and memory churn. This article ends with an overview of the official tools and APIs that Android provides to profile our code and to find possible causes of memory leaks.

(For more resources related to this topic, see here.)

Walkthrough

Before starting the discussion about how to improve and profile our code, it's really important to understand how Android devices handle memory. In the following pages, we will analyze the differences between the runtimes Android uses, learn more about garbage collection, understand what memory leaks and memory churn are, and see how Java handles object references.

How memory works

Have you ever thought about how a restaurant works during its service? Let's think about it for a while. When a new group of customers gets into the restaurant, there's a waiter ready to search for a place to seat them. But the restaurant is a limited space. So, there is the need to free tables when possible. That's why, when a group has finished eating, another waiter cleans and prepares the just-freed table for other groups to come. The first waiter has to find the table with the right number of seats for every new group. Then, the second waiter's task should be fast and shouldn't hinder or block the others' tasks. Another important aspect is how many seats are occupied by the group; the restaurant owner wants to have as many free seats as possible to place new clients. So, it's important to ensure that every group fills the right number of seats, without occupying tables that could be freed and used to host new groups.

This is absolutely similar to what happens in an Android system. Every time we create a new object in our code, it needs to be saved in memory. So, it's allocated as part of our application's private memory to be accessed whenever needed, and the system keeps allocating memory for us during the whole application lifetime. Nevertheless, the system has limited memory to use, and it cannot allocate memory indefinitely. So, how is it possible for the system to have enough memory for our application all the time? And why is there no need for an Android developer to free up memory? Let's find out.
Garbage collection

Garbage collection is an old concept that is based on two main aspects:

Finding objects that are no longer referenced
Freeing the memory referenced by those objects

When an object is no longer referenced, its "table" can be cleaned and freed up. This is what is done to provide memory for future new object allocations. These operations of allocating new objects and deallocating unreferenced objects are executed by the particular runtime in use on the device, and there is no need for the developer to do anything, because they are all managed automatically. In contrast to what happens in other languages, such as C or C++, there is no need for the developer to allocate and deallocate memory. In particular, while allocation is made when needed, the garbage collection task is executed when a memory upper limit is reached. These automatic background operations don't exempt developers from being aware of their app's memory management; if memory management is not done well, the application can suffer lags and malfunctions, and even crash when an OutOfMemoryError exception is thrown.

Shared memory

In Android, every app has its own process, which is completely managed by the runtime with the aim of reclaiming memory to free resources for other foreground processes, if needed. The available amount of memory for our application lies completely in RAM, as Android doesn't use swap memory. The main consequence of this is that the only way for our app to get more memory is to unreference objects that are no longer used. But Android uses paging and memory mapping: the first technique defines same-sized blocks of memory, called pages, in secondary storage, while the second one maps files in secondary storage into memory, to be used as if they were primary memory. They are used when the system needs to allocate memory for other processes, so the system creates paged memory-mapped files to save Dalvik code files, app resources, or native code files. In this way, those files can be shared between multiple processes. As a matter of fact, the Android system uses shared memory in order to better handle resources across a lot of different processes. Furthermore, every new process to be created is forked from an already existing one, called Zygote. This particular process contains common framework classes and resources to speed up the first boot of the application. This means that the Zygote process is shared between processes and applications. This large use of shared memory makes it difficult to profile the memory usage of our application, because there are many facets to consider before reaching a correct analysis.

Runtime

Some functions and operations of memory management depend on the runtime used. That's why we are going through some specific features of the two main runtimes used by Android devices. They are as follows:

Dalvik
Android runtime (ART)

ART was added later to replace Dalvik and improve performance from different points of view. It was introduced in Android KitKat (API Level 19) as an option for developers to enable, and it has become the main and only runtime from Android Lollipop (API Level 21) on. Besides the differences between Dalvik and ART in compiling code, file formats, and internal instructions, what we are focusing on at the moment is memory management and garbage collection.
So, let's understand how the Google team improved runtime garbage collection over time, and what to pay attention to while developing our application. Let's step back and return to the restaurant for a bit longer. What would happen if everybody, all the employees such as the other waiters and the cooks, and all of the services such as the dishwashers, stopped their tasks, waiting for just one waiter to free a table? That single employee's performance would decide the success or failure of them all. So, it's really important to have a very fast waiter in this case. But what if you cannot afford one? The owner wants him to do his job as fast as possible, maximizing his productivity and seating all the customers in the best way, and this is exactly what we have to do as developers. We have to optimize memory allocations in order to have a fast garbage collection, even though it stops all the other operations.

What is described here is just how runtime garbage collection works. When the upper limit of memory is reached, garbage collection starts its task, pausing any other method, task, thread, or process execution, and they won't resume until the garbage collection task is completed. So, it's really important that the collection is fast enough not to impede the 16 ms per frame rule, which would result in lag and jank in the UI. The more time garbage collection takes, the less time the system has to prepare frames to be rendered on the screen.

Keep in mind that automatic garbage collection is not free; bad memory management can lead to bad UI performance and, thus, bad UX. No runtime feature can replace good memory management. That's why we need to be careful about new allocations of objects and, above all, references. Obviously, ART introduced a lot of improvements to this process after the Dalvik era (it reduces the collection steps, it adds a particular memory area for Bitmap objects, it uses new fast algorithms, and it does other cool stuff that will keep getting better in the future), but there is no escaping the fact that we need to profile our code and memory usage if we want our application to have the best performance.

Android N JIT compiler

The ART runtime uses ahead-of-time compilation which, as the name suggests, performs compilation when the application is first installed. This approach brought advantages to the overall system in different ways, because the system can:

Reduce battery consumption due to pre-compilation, and then improve autonomy
Execute applications faster than Dalvik
Improve memory management and garbage collection

However, those advantages have a cost related to installation time: the system needs to compile the application at that moment, which is slower than other types of compilation. For this reason, Google added a just-in-time (JIT) compiler alongside the ahead-of-time compiler of ART in the new Android N. The JIT compiler acts when needed, during the execution of the application, and thus uses a different approach compared to the ahead-of-time one. This compiler uses code-profiling techniques, and it's not a replacement for the ahead-of-time compiler, but an addition to it. It's a good enhancement to the system for the performance advantages it introduces. The profile-guided compilation adds the possibility of precompiling, and then caching and reusing, methods of the application, depending on usage and/or device conditions.
This feature can save compilation time and improve performance in every kind of system, so all devices benefit from this new memory management. The key advantages are:

Less memory used
Fewer RAM accesses
Lower impact on battery

All of these advantages introduced in Android N, however, shouldn't be an excuse to avoid good memory management in our applications. For this, we need to know what pitfalls are lurking behind our code and, more than that, how to behave in particular situations to improve the system's memory management while our application is active.

Memory leak

The main mistake a developer can make from the memory performance perspective while developing an Android application is called a memory leak, and it refers to an object that is no longer used but is still referenced by another object that is, instead, still active. In this situation, the garbage collector skips it, because the reference is enough to leave that object in memory. Actually, we are preventing the garbage collector from freeing memory for future allocations. Our usable heap gets smaller because of this, and that leads to garbage collection being invoked more often, blocking the rest of the application's execution. This can escalate to a situation where there is no more memory to allocate a new object, and an OutOfMemoryError exception is thrown by the system. Consider the case where a used object references objects that are no longer used, which in turn reference other no-longer-used objects, and so on; none of them can be collected, just because the root object is still in use.
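To make the pitfall concrete, here is a deliberately broken sketch of one of the most classic Android leaks; the class name is hypothetical, and this is an illustration rather than an example from the book:

import android.app.Activity;
import android.os.Bundle;

public class LeakyActivity extends Activity {

    // A static field lives as long as the process does; once it points to
    // an Activity, that Activity and its whole view hierarchy can never be
    // collected, even after the user navigates away from the screen.
    private static LeakyActivity sLastInstance;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        sLastInstance = this; // the leak: a GC root now references this Activity
    }
}

Every configuration change (for example, a screen rotation) recreates the Activity, and the previous instance stays pinned in memory until the field is overwritten.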
Memory churn

Another anomaly in memory management is called memory churn, and it refers to an amount of allocations that the runtime cannot sustain, because too many new objects are instantiated in a small period of time. In this case, many garbage collection events are triggered over and over, affecting the overall memory and UI performance of the application. The need to avoid allocations in the View.onDraw() method is closely related to memory churn; we know that this method is called every time the view needs to be drawn again, and the screen needs to be refreshed every 16.6667 ms. If we instantiate objects inside that method, we could cause memory churn, because those objects are instantiated in the View.onDraw() method and no longer used afterwards, so they are collected very soon. In some cases, this leads to one or more garbage collection events being executed every time a frame is drawn on the screen, reducing the available time to draw it below the 16.6667 ms, depending on the collection event's duration.

References

Let's have a quick overview of the different kinds of references that Java provides, so that we have an idea of when we can use them and how. Java defines four levels of strength:

Normal: This is the main type of reference. It corresponds to the simple creation of an object, and this object will be collected when it is no longer used and referenced. It's just the classic object instantiation:

SampleObject sampleObject = new SampleObject();

Soft: This is a reference that is not strong enough to keep an object in memory when a garbage collection event is triggered. So, it can become null at any time during execution. Using this reference, the garbage collector decides when to free the object's memory based on the memory demand of the system. To use it, just create a SoftReference object, passing the real object as a parameter in the constructor, and call the SoftReference.get() method to get the object:

SoftReference<SampleObject> sampleObjectSoftRef =
    new SoftReference<SampleObject>(new SampleObject());
SampleObject sampleObject = sampleObjectSoftRef.get();

Weak: This works exactly like a SoftReference, but it is weaker than the soft one:

WeakReference<SampleObject> sampleObjectWeakRef =
    new WeakReference<SampleObject>(new SampleObject());

Phantom: This is the weakest reference; the object is eligible for finalization. This kind of reference is rarely used, and the PhantomReference.get() method always returns null. It is meant for reference queues, which don't interest us at the moment, but it's worth knowing that this kind of reference is also provided.

These classes may be useful while developing if we know which objects have a lower priority and can be collected without causing problems to the normal execution of our application. We will see how they can help us manage memory in the following pages.
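One practical consequence of the above is that code reading a soft (or weak) reference must always be ready for get() to return null. A minimal sketch of that pattern, where both the class and the loadBitmap() helper are hypothetical stand-ins for any expensive construction:

import java.lang.ref.SoftReference;
import android.graphics.Bitmap;

public class BitmapHolder {

    private SoftReference<Bitmap> cacheRef;

    public Bitmap getBitmap() {
        Bitmap bitmap = (cacheRef != null) ? cacheRef.get() : null;
        if (bitmap == null) {
            // The collector may have reclaimed the softly referenced object
            // under memory pressure, so rebuild it and wrap it again.
            bitmap = loadBitmap();
            cacheRef = new SoftReference<Bitmap>(bitmap);
        }
        return bitmap;
    }

    private Bitmap loadBitmap() {
        // Placeholder for an expensive decode or download.
        return Bitmap.createBitmap(1, 1, Bitmap.Config.ARGB_8888);
    }
}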
Memory-side projects

During the development of the Android platform, Google has always tried to improve the platform's memory management system, to maintain wide compatibility with both increasingly powerful devices and low-resource ones. This is the main purpose of the side projects Google develops in parallel with the platform; every new Android version released brings new improvements and changes to those projects and their impact on system performance. Each of these side projects focuses on a different matter:

Project Butter: Introduced in Android Jelly Bean 4.1 (API Level 16) and then improved in Android Jelly Bean 4.2 (API Level 17), it added features related to the graphical aspect of the platform (VSync and buffering are the main additions) in order to improve the responsiveness of the device while in use.
Project Svelte: Introduced in Android KitKat 4.4 (API Level 19), it deals with memory management improvements in order to support low-RAM devices.
Project Volta: Introduced in Android Lollipop (API Level 21), it focuses on the battery life of the device. It adds important APIs for batching expensive, battery-draining operations, such as the JobScheduler, and new tools such as the Battery Historian.

Project Svelte and Android N

When it was first introduced, Project Svelte reduced the memory footprint and improved memory management in order to support entry-level devices with low memory availability, broadening the range of supported devices, with clear advantages for the platform. With the new release of Android N, Google wants to provide an optimized way to run applications in the background. We know that our application's process lives on in the background even if it is not visible on the screen, or even if there are no started activities, because a service could be executing some operations. This is a key feature for memory management; the overall system performance could be affected by bad memory management of the background processes. But what has changed in application behavior and the APIs with the new Android N?

The strategy chosen to improve memory management by reducing the impact of background processes is to stop sending applications the broadcasts for the following actions:

ConnectivityManager.CONNECTIVITY_ACTION: Starting from Android N, connectivity actions will be received only by those applications that are in the foreground and have registered a BroadcastReceiver for this action. No application with an implicit intent declared inside the manifest file will receive it any longer. Hence, applications need to change their logic to do the same as before.
Camera.ACTION_NEW_PICTURE: This one is used to notify that a picture has just been taken and added to the media store. This action won't be available anymore, neither for receiving nor for sending, and this holds for every application, not just those targeting the new Android N.
Camera.ACTION_NEW_VIDEO: This is used to notify that a video has just been taken and added to the media store. As with the previous one, this action cannot be used anymore, and this also holds for every application.

Keep these changes in mind when targeting the new Android N with your application, to avoid unwanted or unexpected behaviors. All of the preceding actions have been changed by Google to force developers not to use them in applications. As a more general rule, we should not use implicit receivers, for the same reason. Hence, we should always check the behavior of our application while it's in the background, because it could lead to unexpected memory usage and battery drain. Implicit receivers can start our application components, while explicit ones are set up for a limited time while the activity is in the foreground, so they cannot affect the background processes. It's good practice to avoid the use of implicit broadcasts while developing applications, to reduce their impact on background operations that could otherwise lead to an unwanted waste of memory and, thus, battery drain.

Furthermore, Android N introduces a new ADB command to test application behavior with background processes ignored. Use the following command to ignore background services and processes:

adb shell cmd appops set <package_name> RUN_IN_BACKGROUND ignore

Use the following one to restore the initial state:

adb shell cmd appops set <package_name> RUN_IN_BACKGROUND allow
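Tying the implicit-broadcast advice above together: instead of declaring a receiver in the manifest, a foreground-only, dynamically registered receiver looks roughly like the following sketch (the class name is hypothetical):

import android.app.Activity;
import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.content.IntentFilter;
import android.net.ConnectivityManager;

public class NetworkAwareActivity extends Activity {

    private final BroadcastReceiver connectivityReceiver = new BroadcastReceiver() {
        @Override
        public void onReceive(Context context, Intent intent) {
            // React to connectivity changes only while we are visible.
        }
    };

    @Override
    protected void onResume() {
        super.onResume();
        // Registered only while in the foreground, so no background wakeups.
        registerReceiver(connectivityReceiver,
                new IntentFilter(ConnectivityManager.CONNECTIVITY_ACTION));
    }

    @Override
    protected void onPause() {
        unregisterReceiver(connectivityReceiver);
        super.onPause();
    }
}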
Best practices

Now that we know what can happen in memory while our application is active, let's take a deep look at what we can do to avoid memory leaks and memory churn, and optimize our memory management in order to reach our performance target, not just in memory usage but also in garbage collection frequency, because, as we know, it stops every other operation while it runs. In the following pages, we will go through a lot of hints and tips using a bottom-up strategy, starting from low-level shrewdness in Java code and moving up to the highest-level Android practices.

Data types

We weren't joking; we are really talking about Java primitive types, as they are the foundation of all applications, and it's really important to know how to deal with them even though it may seem obvious. It's not, and we will see why. Java provides primitive types that need to be saved in memory when used: the system allocates an amount of memory related to the amount requested for that particular type. The following are the Java primitive types, with the number of bits needed to allocate each of them:

byte: 8 bits
short: 16 bits
int: 32 bits
long: 64 bits
float: 32 bits
double: 64 bits
boolean: 8 bits (although it depends on the virtual machine)
char: 16 bits

At first glance, what is clear is that you should be careful to choose the right primitive type every time you use one. Don't use a bigger primitive type if you don't really need it; never use long, float, or double if you can represent the number with an int. Otherwise, it would be a useless waste of memory and of calculations every time the CPU deals with it, and remember that to evaluate an expression, the system performs an implicit widening primitive conversion to the largest primitive type involved in the calculation.

Autoboxing

Autoboxing is the term used to indicate the automatic conversion between a primitive type and its corresponding wrapper class object. The primitive type wrapper classes are the following:

java.lang.Byte
java.lang.Short
java.lang.Integer
java.lang.Long
java.lang.Float
java.lang.Double
java.lang.Boolean
java.lang.Character

They can be instantiated using the assignment operator, just as for primitive types, and they can be used like their primitive counterparts:

Integer i = 0;

This is exactly the same as:

Integer i = new Integer(0);

But using autoboxing is not the right way to improve the performance of our applications; it has many costs. First of all, the wrapper object is much bigger than the corresponding primitive type. For instance, an Integer object needs 16 bytes in memory instead of the 32 bits of the primitive int. Hence, more memory is used to handle it. Then, when we declare a variable using the primitive wrapper object, any operation on it implies at least one more object allocation. Take a look at the following snippet:

Integer integer = 0;
integer++;

Every Java developer knows what it does, but this simple code needs a step-by-step explanation of what happens:

First of all, the integer value is taken from the Integer object integer, and 1 is added to it:

int temp = integer.intValue() + 1;

Then the result is assigned back to integer, which means that a new autoboxing operation is executed:

integer = temp;

Undoubtedly, those operations are slower than if we used the primitive type instead of the wrapper class: no autoboxing, hence no extra allocations. Things get worse in loops, where the mentioned operations are repeated on every cycle; take, for example, the following code:

Integer sum = 0;
for (int i = 0; i < 500; i++) {
    sum += i;
}

In this case, there are a lot of inappropriate allocations caused by autoboxing, and if we compare this with the primitive-type for loop, we notice that there are no allocations:

int sum = 0;
for (int i = 0; i < 500; i++) {
    sum += i;
}

Autoboxing should be avoided as much as possible. The more we use primitive wrapper classes instead of the primitive types themselves, the more memory is wasted while executing our application, and this waste is amplified when autoboxing happens in loop cycles, affecting not just memory, but CPU timings too.
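A quick, admittedly unscientific way to feel this cost on a desktop JVM is a micro-benchmark along the following lines (this sketch is ours, not the book's; timings vary by VM, and a harness such as JMH would be needed for serious measurement):

public class AutoboxingCost {

    public static void main(String[] args) {
        final int iterations = 10_000_000;

        long start = System.nanoTime();
        Long boxedSum = 0L; // wrapper type: every += unboxes and reboxes
        for (int i = 0; i < iterations; i++) {
            boxedSum += i;
        }
        long boxedNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long primitiveSum = 0L; // primitive type: no allocations at all
        for (int i = 0; i < iterations; i++) {
            primitiveSum += i;
        }
        long primitiveNanos = System.nanoTime() - start;

        System.out.println("boxed:     " + boxedNanos + " ns (sum " + boxedSum + ")");
        System.out.println("primitive: " + primitiveNanos + " ns (sum " + primitiveSum + ")");
    }
}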
Sparse array family

So, in all of the cases described in the previous paragraph, we can just use the primitive type instead of the object counterpart. Nevertheless, it's not always so simple. What happens if we are dealing with generics? For example, let's think about collections; we cannot use a primitive type as the generic type of objects that implement one of the following interfaces. We have to use the wrapper class this way:

List<Integer> list;
Map<Integer, Object> map;
Set<Integer> set;

Every time we use one of the Integer objects of a collection, autoboxing occurs at least once, producing the waste outlined above, and we know well how many times we deal with this kind of object in everyday development. But isn't there a solution to avoid autoboxing in these situations? Android provides a useful family of objects created on purpose to replace Map objects and avoid autoboxing, protecting memory from pointlessly large allocations: the sparse arrays. The list of sparse arrays, with the related type of Map each one can replace, is the following:

SparseBooleanArray: HashMap<Integer, Boolean>
SparseLongArray: HashMap<Integer, Long>
SparseIntArray: HashMap<Integer, Integer>
SparseArray<E>: HashMap<Integer, E>
LongSparseArray<E>: HashMap<Long, E>

In the following, we will talk about the SparseArray object specifically, but everything we say is true for all the other objects above as well. The SparseArray uses two different arrays to store hashes and objects. The first one collects the sorted hashes, while the second one stores the key/value pairs ordered according to the sorting of the key hashes array, as in Figure 1:

Figure 1: SparseArray's hashes structure

When you need to add a value, you have to specify the integer key and the value to be added in the SparseArray.put() method, just like in the HashMap case. This can create collisions if multiple key hashes are added at the same position. When a value is needed, simply call SparseArray.get(), specifying the related key; internally, the key object is used to binary-search the index of the hash, and then the value of the related key, as in Figure 2:

Figure 2: SparseArray's workflow

When the key found at the index resulting from the binary search does not match the original one, a collision has happened, so the search continues in both directions to find the same key, and to provide the value if it's still inside the array. Thus, the time needed to find a value increases significantly with a large number of objects contained in the array. By contrast, a HashMap contains just a single array to store hashes, keys, and values, and it uses larger arrays as a technique to avoid collisions. This is not good for memory, because it allocates more memory than is really needed. So HashMap is fast, because it implements a better way to avoid collisions, but it's not memory efficient. Conversely, SparseArray is memory efficient, because it uses the right number of object allocations, with an acceptable increase in execution time. The memory used for these arrays is contiguous, so every time you remove a key/value pair from a SparseArray, the arrays can be compacted or resized:

Compaction: The object to remove is shifted to the end, and all the other objects are shifted left. The last block, containing the item to be removed, can be reused for future additions to save allocations.
Resize: All the elements of the arrays are copied to other arrays, and the old ones are deleted. On the other hand, the addition of new elements produces the same effect of copying all the elements into new arrays. This is the slowest method, but it's completely memory safe, because there are no useless memory allocations.
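As a concrete illustration of the API described above, here is a small usage sketch (keys and values are arbitrary; it compiles against the Android SDK):

import android.util.SparseArray;

public class SparseArrayDemo {

    public static void usage() {
        // Keys are plain ints, so no Integer autoboxing occurs on put/get.
        SparseArray<String> onCall = new SparseArray<String>();
        onCall.put(7, "alice");
        onCall.put(42, "bob");

        String who = onCall.get(42);             // binary search over the key array
        String fallback = onCall.get(99, "n/a"); // get() variant with a default value

        // Index-based iteration, with no Iterator object involved:
        for (int i = 0; i < onCall.size(); i++) {
            int key = onCall.keyAt(i);
            String value = onCall.valueAt(i);
        }
    }
}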
In general, HashMap is faster at these add and remove operations, because it contains more blocks than it really needs; hence, the memory waste. The use of the SparseArray family objects depends on the strategy applied for memory management and CPU performance patterns, because of the computational cost compared to the memory saving. So, their use is right in some situations. Consider using them when:

The number of objects you are dealing with is below a thousand, and you are not going to do a lot of additions and deletions.
You are using collections of maps with a few items, but lots of iterations.

Another useful feature of those objects is that they let you iterate over indexes, instead of using the iterator pattern, which is slower and memory inefficient. The following snippet shows how the iteration doesn't involve objects:

// SparseArray
for (int i = 0; i < map.size(); i++) {
    Object value = map.get(map.keyAt(i));
}

Conversely, an Iterator object is needed to iterate through a HashMap:

// HashMap
for (Iterator iter = map.keySet().iterator(); iter.hasNext(); ) {
    Object key = iter.next();
    Object value = map.get(key);
}

Some developers think the HashMap object is the better choice, because it can be exported from an Android application to other Java applications, while the SparseArray family's objects can't. But what we analyzed here as memory management gains is applicable to any other case. And, as developers, we should strive to reach performance goals on every platform, instead of reusing the same code on different platforms, because different platforms could be affected differently from a memory perspective. That's why our main suggestion is to always profile the code on every platform we are working on, and then make our own considerations about better or worse approaches depending on the results.

ArrayMap

An ArrayMap object is an Android implementation of the Map interface that is more memory efficient than the HashMap one. This class is provided by the Android platform starting from Android KitKat (API Level 19), but there is another implementation of it inside the Support package v4, because of its main usage on older and lower-end devices. Its implementation and usage are totally similar to the SparseArray objects, with all the implications about memory usage and computational costs, but its main purpose is to let you use objects as keys of the map, just like the HashMap does. Hence, it provides the best of both worlds.

Summary

We defined a lot of best practices to help keep up good memory management, introducing helpful design patterns and analyzing the best choices while developing things we take for granted that can actually affect memory and performance. Then, we faced the main causes of the worst leaks in the Android platform, those related to main components such as Activities and Services. To conclude the practices, we introduced APIs both to use and not to use, and then other APIs able to define a strategy for events related to the system and, thus, external to the application.

Resources for Article:

Further resources on this subject:
Hacking Android Apps Using the Xposed Framework [article]
Speeding up Gradle builds for Android [article]
Get your Apps Ready for Android N [article]
Data Extracting, Transforming, and Loading

Packt
01 Aug 2016
15 min read
In this article by Yu-Wei Chiu, author of the book R for Data Science Cookbook, we cover the following topics:

Scraping web data
Accessing Facebook data

(For more resources related to this topic, see here.)

Before using data to answer critical business questions, the most important thing is to prepare it. Data is normally archived in files, and using Excel or text editors allows it to be easily obtained. However, data can be located in a range of different sources, such as databases, websites, and various file formats. Being able to import data from these sources is crucial.

There are four main types of data. Data recorded in a text format is the simplest. As some users require storing data in a structured format, files with a .tab or .csv extension can be used to arrange data in a fixed number of columns. For many years, Excel has held a leading role in the field of data processing, and this software uses the .xls and .xlsx formats. Knowing how to read and manipulate data from databases is another crucial skill. Moreover, as most data is not stored in databases, we must know how to use the web scraping technique to obtain data from the internet. As part of this chapter, we will introduce how to scrape data from the internet using the rvest package.

Many experienced developers have already created packages to allow beginners to obtain data more easily, and we focus on leveraging these packages to perform data extraction, transformation, and loading. In this chapter, we will first learn how to utilize R packages to read data from a text format and scan files line by line. We then move to the topic of reading structured data from databases and Excel. Finally, we will learn how to scrape internet and social network data using the R web scraper.

Scraping web data

In most cases, the majority of data will not exist in your database, but will instead be published in different forms on the internet. To dig more valuable information from these data sources, we need to know how to access and scrape data from the web. Here, we will illustrate how to use the rvest package to harvest finance data from http://www.bloomberg.com/.

Getting ready

For this recipe, prepare your environment with R installed on a computer with internet access.

How to do it...

Perform the following steps to scrape data from http://www.bloomberg.com/:

First, access the following link to browse the S&P 500 index on the Bloomberg Business website: http://www.bloomberg.com/quote/SPX:IND

Once the page appears, we can begin installing and loading the rvest package:

> install.packages("rvest")
> library(rvest)

Next, you can use the html function from the rvest package to scrape and parse the HTML page of the link to the S&P 500 index:

> spx_quote <- html("http://www.bloomberg.com/quote/SPX:IND")

Use the browser's built-in web inspector to inspect the location of the detail quote below the index chart. You can then move the mouse over the detail quote and click on the target element that you wish to scrape down.
The <div class="cell"> section holds all the information that we need. Extract the elements with the class cell using the html_nodes function:

> cell <- spx_quote %>% html_nodes(".cell")

Furthermore, we can parse the labels of the detail quote from the elements with the class cell__label, extract the text from the scraped HTML, and finally clean spaces and newline characters from the extracted text:

> label <- cell %>%
+     html_nodes(".cell__label") %>%
+     html_text() %>%
+     lapply(function(e) gsub("\n|\\s+", "", e))

We can also extract the values of the detail quote from the elements with the class cell__value, extract the text from the scraped HTML, and clean spaces and newline characters in the same way:

> value <- cell %>%
+     html_nodes(".cell__value") %>%
+     html_text() %>%
+     lapply(function(e) gsub("\n|\\s+", "", e))

Finally, we can set the extracted labels as the names of the values:

> names(value) <- label

Next, we can access the energy and oil market index page at http://www.bloomberg.com/energy. We can then use the web inspector to inspect the location of the table element. Finally, we can use html_table to extract the table element with the class data-table:

> energy <- html("http://www.bloomberg.com/energy")
> energy.table <- energy %>% html_node(".data-table") %>% html_table()

How it works...

The most difficult part of scraping data from a website is that web data is published and structured in different formats. We have to fully understand how the data is structured within the HTML tags before continuing. As HTML (Hypertext Markup Language) is a language that has similar syntax to XML, we could use the XML package to read and parse HTML pages. However, the XML package only provides the XPath method, which has two main shortcomings, as follows:

Inconsistent behavior in different browsers
It is hard to read and maintain

For these reasons, we recommend using CSS selectors over XPath when parsing HTML. Python users may be familiar with how to scrape data quickly using the requests and BeautifulSoup packages. The rvest package is the counterpart package in R, which provides the same capability to simply and efficiently harvest data from HTML pages.

In this recipe, our target is to scrape the finance data of the S&P 500 detail quote from http://www.bloomberg.com/. Our first step is to make sure that we can access our target webpage through the internet, which is followed by installing and loading the rvest package. After installation and loading are complete, we can use the html function to read the source code of the page into spx_quote. Once we have confirmed that we can read the HTML page, we can start parsing the detail quote from the scraped HTML. However, we first need to inspect the CSS path of the detail quote. There are many ways to inspect the CSS path of a specific element. The most popular method is to use the development tool built into each browser (press F12 or FN + F12). Using Google Chrome as an example, you can open the development tool by pressing F12. A DevTools window will show up somewhere in the visual area (you may refer to https://developer.chrome.com/devtools/docs/dom-and-styles#inspecting-elements). Then, you can move the mouse cursor to the upper left of the DevTools window and select the Inspect Element icon (a magnifier icon). Next, click on the target element, and the DevTools window will highlight the source code of the selected area. You can then move the mouse cursor to the highlighted area and right-click on it. From the pop-up menu, click on Copy CSS Path to extract the CSS path. Or, you can examine the source code and find that the selected element is structured in HTML code with the class cell.

One highlight of rvest is that it is designed to work with magrittr, so that we can use the %>% pipe operator to chain the output parsed at each stage. Thus, we can first obtain the output source by calling spx_quote and then pipe the output to html_nodes. As the html_nodes function uses CSS selectors to parse elements, the function takes basic selectors with type (for example, div), ID (for example, #header), and class (for example, .cell). As the elements to be extracted have the class cell, you should place a period (.) in front of cell. Finally, we should extract both label and value from the previously parsed nodes. Here, we first extract the elements of class cell__label, and then use html_text to extract the text. We can then use the gsub function to clean spaces and newline characters from the parsed text. Likewise, we apply the same pipeline to extract the elements of the class cell__value. As we have extracted both the labels and values of the detail quote, we can apply the labels as the names of the extracted values. We have now organized the data from the web into structured data.
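Putting the pieces together, the whole recipe condenses to a few lines. This is a sketch that assumes the page still serves the same .cell/.cell__label/.cell__value markup (class names on live sites change often, so verify them with the inspector first):

> library(rvest)
> spx_quote <- html("http://www.bloomberg.com/quote/SPX:IND")
> label <- spx_quote %>% html_nodes(".cell .cell__label") %>%
+     html_text() %>% lapply(function(e) gsub("\n|\\s+", "", e))
> value <- spx_quote %>% html_nodes(".cell .cell__value") %>%
+     html_text() %>% lapply(function(e) gsub("\n|\\s+", "", e))
> names(value) <- unlist(label)
> str(value)   # a named list with one entry per detail-quote field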
Alternatively, we can also use rvest to harvest tabular data. Similarly to the process used to harvest the S&P 500 index, we can first access the energy and oil market index page. We can then use the web element inspector to find the location of the table data. As we have found the element located in the class data-table, we can use the html_table function to read the table content into an R data frame.

There's more...

Instead of using the web inspector built into each browser, we can consider using SelectorGadget (http://selectorgadget.com/) to search for the CSS path. SelectorGadget is a very powerful and simple-to-use extension for Google Chrome, which enables the user to extract the CSS path of the target element with only a few clicks:

To begin using SelectorGadget, access this link: https://chrome.google.com/webstore/detail/selectorgadget/mhjhnkcfbdhnjickkkdbjoemdmbfginb. Then, click on the green button to install the plugin in Chrome.
Next, click on the upper-right icon to open SelectorGadget, and then select the area that needs to be scraped down. The selected area will be colored green, and the gadget will display the CSS path of the area and the number of elements matched by the path.
Finally, you can paste the extracted CSS path into html_nodes as an input argument to parse the data.

Besides rvest, we can connect R to Selenium via RSelenium to scrape the web page. Selenium was originally designed as a web application automation framework that enables the user to command a web browser to automate processes through simple scripts. However, we can also use Selenium to scrape data from the internet.
The following instructions present a sample demo of how to scrape Bloomberg.com using RSelenium:

First, access this link to download the Selenium standalone server: http://www.seleniumhq.org/download/.
Next, start the Selenium standalone server using the following command:

$ java -jar selenium-server-standalone-2.46.0.jar

If you successfully launch the standalone server, you should see a startup message, which means that you can connect to the server, which binds to port 4444.
At this point, you can begin installing and loading RSelenium with the following commands:

> install.packages("RSelenium")
> library(RSelenium)

After RSelenium is installed, register the driver and connect to the Selenium server:

> remDr <- remoteDriver(remoteServerAddr = "localhost"
+     , port = 4444
+     , browserName = "firefox"
+ )

Examine the status of the registered driver:

> remDr$getStatus()

Next, we navigate to Bloomberg.com:

> remDr$open()
> remDr$navigate("http://www.bloomberg.com/quote/SPX:IND")

Finally, we can scrape the data using the CSS selector:

> webElem <- remDr$findElements('css selector', ".cell")
> webData <- sapply(webElem, function(x){
+     label <- x$findChildElement('css selector', '.cell__label')
+     value <- x$findChildElement('css selector', '.cell__value')
+     cbind(c("label" = label$getElementText(), "value" = value$getElementText()))
+ })

Accessing Facebook data

Social network data is another great source for a user who is interested in exploring and analyzing social interactions. The main difference between social network data and web data is that social network platforms often provide a semi-structured data format (mostly JSON). Thus, we can easily access the data without needing to inspect how it is structured. In this recipe, we will illustrate how to use rvest and rjson to read and parse data from Facebook.

Getting ready

For this recipe, prepare your environment with R installed on a computer with internet access.

How to do it…

Perform the following steps to access data from Facebook:

First, we need to log in to Facebook and access the developer page (https://developers.facebook.com/).
Click on Tools & Support and select Graph API Explorer.
Next, click on Get Token and choose Get Access Token.
On the User Data Permissions pane, select user_tagged_places and then click on Get Access Token.
Copy the generated access token to the clipboard.
Try to access the Facebook API using rvest:

> access_token <- '<access_token>'
> fb_data <- html(sprintf("https://graph.facebook.com/me/tagged_places?access_token=%s", access_token))

Install and load the rjson package:

> install.packages("rjson")
> library(rjson)

Extract the text from fb_data, and then use fromJSON to read the JSON data:

> fb_json <- fromJSON(fb_data %>% html_text())

Use sapply to extract the name and ID of each place from fb_json:

> fb_place <- sapply(fb_json$data, function(e){e$place$name})
> fb_id <- sapply(fb_json$data, function(e){e$place$id})

Last, use data.frame to wrap the data:

> data.frame(place = fb_place, id = fb_id)

How it works…

In this recipe, we covered how to retrieve social network data through Facebook's Graph API. Unlike scraping web pages, you need to obtain a Facebook access token before making any request for insight information. There are two ways to retrieve the access token: the first is to use Facebook's Graph API Explorer, and the other is to create a Facebook application.
In this recipe, we illustrated how to use the Graph API Explorer to obtain the access token. Facebook's Graph API Explorer is where you can craft your request URLs to access Facebook data on your behalf. To access the explorer page, we first visit Facebook's developer page (https://developers.facebook.com/). The Graph API Explorer page is under the drop-down menu of Tools & Support. After entering the explorer page, we select Get Access Token from the drop-down menu of Get Token. Subsequently, a tabbed window will appear; we can check access permissions at various levels of the application. For example, we can check tagged_places to access the locations that we previously tagged. After we have selected the permissions that we require, we can click on Get Access Token to allow the Graph API Explorer to access our insight data. After completing these steps, you will see an access token, which is a temporary and short-lived token that you can use to access the Facebook API.

With the access token, we can then access the Facebook API with R. First, we need an HTTP request package. Similarly to the web scraping recipe, we can use the rvest package to make the request. We craft a request URL with the access_token (copied from the Graph API Explorer) appended to the Facebook API. From the response, we should receive JSON-formatted data. To read the attributes of the JSON data, we install and load the rjson package. We can then use the fromJSON function to read the JSON-format string extracted from the response. Finally, we read the place and ID information through the use of the sapply function, and we can then use data.frame to transform the extracted information into a data frame. At the end of this recipe, we should see the data formatted as a data frame.

There's more...

To learn more about the Graph API, you can read the official documentation from Facebook (https://developers.facebook.com/docs/reference/api/field_expansion/):

First, we need to install and load the Rfacebook package:

> install.packages("Rfacebook")
> library(Rfacebook)

We can then use its built-in functions to retrieve data from a user, or access similar information, with the provision of an access token:

> getUsers("me", "<access_token>")

If you want to scrape public fan pages without logging into Facebook every time, you can create a Facebook app to access insight information on behalf of the app:

To create an authorized app token, log in to the Facebook developer page and click on Add a New App.
You can create a new Facebook app with any name, providing that it has not already been registered.
Finally, you can copy both the app ID and app secret and craft the access token as <APP ID>|<APP SECRET>. You can now use this token to scrape public fan page information with the Graph API.

Similarly to the Rfacebook example above, we can then replace the access_token with <APP ID>|<APP SECRET>:

> getUsers("me", "<access_token>")

Summary

In this article, we learned how to utilize R packages to read data from a text format and scan files line by line. We also learned how to scrape internet and social network data using the R web scraper.

Resources for Article:

Further resources on this subject:
Learning Data Analytics with R and Hadoop [article]
Big Data Analysis (R and Hadoop) [article]
Using R for Statistics, Research, and Graphics [article]
WebRTC in FreeSWITCH

Packt
25 Jul 2016
16 min read
In this article by Anthony Minessale and Giovanni Maruzzelli, authors of Mastering FreeSWITCH, we will cover the following topics:

What WebRTC is and how it works
Encryption and NAT traversal (STUN, TURN, and so on)
Signaling and media
Interconnection with PSTN and SIP networks
FreeSWITCH as a WebRTC server, gateway, and application server
SIP signaling clients with JavaScript (SIP.js)
Verto signaling clients with JavaScript (mod_verto, verto.js)

(For more resources related to this topic, see here.)

WebRTC

Finally, something new! How refreshing it is to be learning and experimenting again, especially if you're an old hand! After at least ten years of linear evolution, here we are with a quantum leap, the black swan that truly disrupts the communication sector.

Browsers are already out there, waiting

With an installed base of hundreds of millions, and soon to be in the billions ballpark, browsers (both on PCs and on smartphones) are now complete communication terminals, audio/video endpoints that do not need any additional software, plugins, or hardware. Browsers now incorporate, by default and in a standard way, all the software needed to interact with loudspeakers, microphones, headsets, cameras, screens, and so on. Browsers are the new endpoints, the CPEs, the phones. They have an API, they're updated automatically, and they're compatible with your system. You don't have to procure, configure, support, or upgrade them. They're ready for your new service; they just work, and they are waiting for your business.

Web Real-Time Communication is coming

There are two completely separate flows in communication: signaling and media. Signaling is a flow of information that defines who is calling whom, taking what paths, and which technology is used to transmit which content. Media is the actual digitized content of the communication, for example, audio, video, or screen sharing. Media and signaling often take completely unrelated paths to go from caller to callee; for example, their IP packets traverse different gateways and routers. Also, the two flows are managed by separate software (or by different parts of the same application) using different protocols.

WebRTC defines how a browser accesses its own media capture, how it sends and receives media from a peer through the network, and how it renders the media stream that it receives. It represents all this using the same Session Description Protocol (SDP) as SIP does. So, WebRTC is all about media, and doesn't prescribe a signaling system. This is a design decision, embedded in the standard definition. Popular signaling systems include SIP, XMPP, and proprietary or custom protocols. Also, WebRTC is all about encryption: all WebRTC media streams are mandatorily encrypted. Chrome, Firefox, and Opera (together they account for more than 70 percent of the browsers in use) already implement the standard; Edge is announcing its first steps in supporting basic WebRTC features, while only Safari is still holding its cards (Skype and FaceTime on WebRTC with proprietary signaling? Wink wink).
Under the hood

More or less, WebRTC works like this:

The browser connects to a web server and loads a webpage with some JavaScript in it.
The JavaScript in the webpage takes control of the browser's media interfaces (microphone, camera, speakers, and so on), resulting in an API media object.
The WebRTC API media object will contain the capabilities of all the devices and codecs available (for example, definition, sample rate, and so on), and it will permit the user to choose their own capability preferences (for example, use QVGA video to minimize CPU and bandwidth).
The webpage will interface with the browser's user, getting some input for signing in to the web server's communication service (if any).
The JavaScript will use whatever signaling method (SIP, XMPP, proprietary, custom) over encrypted secure websockets (wss://) for signing in to the communication service, finding peers, and originating and receiving calls.
Once signed in to the service, a call can be made and received. Signaling will give the protocol address of the peer (for example, sip:gmaruzz@opentelecom.it).

Now is the moment to find out the actual IP addresses:

The JavaScript will generate a WebRTC API object for finding its own IP addresses, transports, and ports (ICE candidates) to be offered to the peer for exchanging media (the JavaScript WebRTC API will use ICE, STUN, and TURN, and will send the peer its own local LAN address, its own public IP address, and maybe the IP address of a TURN server it can use).
Then, the WebRTC Net API will exchange ICE candidates with the peer, until they both find the most "rational" triplet of IP address, port, and transport (udp, dtls, and so on) for each stream (for example, audio, video, screen share, and so on).
Once they have the best addresses, the signaling will establish the call.

Once signaling communication with the peer is established, media capabilities are exchanged in SDP format (exactly as in SIP), and the two peers agree on media formats (sample rates, codecs, and so on):

When the media formats are agreed, the JavaScript WebRTC Transport API will use secure (encrypted) websockets (wss://) as transport for media and data.
The JavaScript WebRTC Media API will be used to render the media streams received (for example, render video, play sound, capture the microphone, and so on).
Additionally, or as an alternative to media, the peers can establish one or more data channels, through which they bidirectionally exchange raw or structured data (file transfers, augmented reality, stock tickers, and so on).
At hangup, the signaling will tear down the call, and the JavaScript WebRTC Media API will be used to shut down the streams and renderings.

This is a high-level, but complete, view of how a WebRTC system works.
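In code, the media side of this flow maps onto a handful of browser APIs. The following is a minimal sketch using the promise-based APIs; the STUN server URL and the send() signaling function are placeholders, and real code needs error handling plus the answer/ICE exchange in the other direction:

// Take control of the media interfaces, then create an offer (SDP).
const pc = new RTCPeerConnection({
  iceServers: [{ urls: 'stun:stun.example.org' }]  // placeholder server
});

pc.onicecandidate = (event) => {
  if (event.candidate) {
    // Trickle each ICE candidate to the peer over our signaling channel.
    send({ type: 'candidate', candidate: event.candidate });
  }
};

navigator.mediaDevices.getUserMedia({ audio: true, video: true })
  .then((stream) => {
    stream.getTracks().forEach((track) => pc.addTrack(track, stream));
    return pc.createOffer();
  })
  .then((offer) => pc.setLocalDescription(offer))
  .then(() => {
    // localDescription.sdp now lists our codecs and capabilities;
    // hand it to the peer over whatever signaling we chose.
    send({ type: 'offer', sdp: pc.localDescription.sdp });
  });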
Encryption – security

Please note that in normal operation everything is encrypted: it uses real PKI certificates from real Certification Authorities, actual DNS names, SSL, TLS, HTTPS, WSS, and DTLS-SRTP. This is how it is supposed to work. In WebRTC, security is not an afterthought: it is mandatory. Making signaling work without encryption (for example, for debugging signaling protocols) is not so easy, but it is possible. Browsers will often raise security exceptions, and will ask for permission each time they access a camera or microphone. Some hiccups will happen, but it is doable. Signaling is not part of the WebRTC standard, as you know. On the contrary, it is not possible to have the media or data streams leave the browser in the clear, without encryption. The use of plain RTP to transmit media is explicitly forbidden by the standard. Media is transmitted by SRTP (Secure RTP), where encryption keys are pre-exchanged via DTLS (Datagram Transport Layer Security, a version of TLS for datagrams), basically a secure version of UDP.

Beyond peer to peer – WebRTC to communication networks and services

WebRTC is a technique for browsers to send media to each other via the internet, peer to peer, perhaps with the help of a relay (TURN) server if they can't reach each other directly. That's it. No directories, no means to find another person, and also no way to "call" that person even if we know "where" to call her. No way to transfer calls, to react to a busy user or to a user that does not pick up, and so on. Let's say WebRTC is a half-built phone: it has the handset, complete with working microphone and speaker, out of which the wiring comes, left loose. You can cross-join that wiring with the wiring of another half-built phone, and they can talk to each other. Then, if you want to talk to another device, you must find it and join the wires anew. No dial pad, no telecom central office, no interconnection between local carriers and with international carriers. No PBX. No way to call your grandma, and no possibility of navigating the IVR at Federal Express' customer care.

We need to integrate the media capabilities and the ubiquity of WebRTC with the world of telecommunication services that constitutes the planet's nervous system. Enter the "WebRTC Gateway" and the "WebRTC Application Server"; in our case, both are embodied by FreeSWITCH.

WebRTC gateways and application servers

The problem to be solved is this: we can implement some kind of signaling plane, even a complete SIP signaling stack, in JavaScript (there are some very good open source ones, as we'll see later), but then, both at the network and at the media plane, WebRTC is only "kind of" compatible with the existing telecommunication world; it uses techniques and concepts that are "similar", and protocols that are mostly an "evolution" of those implemented in usual Voice over IP.

At the network plane, WebRTC uses the ICE protocol to traverse NAT via STUN and TURN servers. ICE has been developed as an internet standard to be the ultimate tool to solve all NAT problems, but it has not yet been implemented in telco infrastructure, nor in most VoIP clients. Also, ICE candidates (the various different addresses at which the browser thinks it would be reachable) need to be passed in SDP and negotiated between peers, in the same way codecs are negotiated. Being able to pass through corporate firewalls (UDP blocked, TCP open only on ports 80 and 443, and perhaps through protocol-aware proxies) is an absolute necessity for serious WebRTC deployment.

At the media plane, the WebRTC-specific codecs (VP8 for video and Opus for audio) are incompatible with the telco world, with G711 audio as the only common denominator. Worse yet, all media is encrypted as SRTP with DTLS key exchange, and that's unheard of in today's telco infrastructure.
So, we need to create the signaling plane, and then convert the network transport, convert the codecs, manage the ICE candidate selection in SDP, and allow access to the wealth of ready-made services (PSTN calls, IVRs, PBXs, conference rooms, and so on), and then complement the legacy services with special features and new interconnected services enabled by the unique capabilities of WebRTC endpoints. Yeah, that's a job for FreeSWITCH.

Which architecture? Legacy on the Web, or Web on the Telco?

Real-time communication via the Web can be implemented in many ways from the building blocks we just saw. We have one degree of freedom: signaling. I mean, media will in any case be agreed upon via SDP, transmitted as SRTP packets, and encrypted with keys exchanged via DTLS. We still have the task of choosing how we will find the peer to exchange media with. So, this is an exercise in directory, location, registration, routing, presence, status, and so on. You get the idea. So, at the end of the day, you need to come out with a JavaScript library to implement your signaling on the browsers, commanding their underlying mechanisms (Comet, Websockets, WebRTC Data Channel) to find your beloved communication peer. Actually, it boils down to a few different possibilities:

SIP
XMPP (for example, Jabber)
In-house signaling implementation
VERTO (open source)

SIP and XMPP make today's world spin around. SIP is mostly known for carrying the majority of telephone and VoIP signaling traffic. The biggest implementations of instant messaging and chatting are based on XMPP. And there is more: those two signaling protocols are often used together, although each one of them has extensions that provide the other one's functionality. Both SIP and XMPP have been designed to be expandable and modular, and SIP particularly is an abstract protocol for the management of "sessions" (where a "session" can be whatever has a beginning and an end in time, such as a voice or video call, a screen share, a whiteboard, a collaboration platform, a payment, a message, and so on). Both have robust JavaScript implementations available (for SIP check SIP.js, JsSIP, and SIPML, while for XMPP check Strophe, stanza.io, and jingle.js). If your company has considerable investments and/or expertise in those protocols, then it makes sense to expand their usage to the web too.

If you're running Skype, or similar services, you may find it an attractive option to maintain your proprietary, closed signaling protocol and implement it in JavaScript, so you can expand your service's reach to browsers and exploit those common transport and media technologies.

VERTO is our open source signaling proposal, designed from the ground up to be familiar to web application developers, and allowing for a high degree of integration between FreeSWITCH-provided services and browsers. It is implemented on the FreeSWITCH side by a module (mod_verto) that talks JSON with the JavaScript library (verto.js) on the browser side.

FreeSWITCH accommodates them ALL

FreeSWITCH implements all of WebRTC's low-level protocols, codecs, and requirements. It's got encryption, SRTP, DTLS, RTP, and websocket and secure websocket transports (ws:// and wss://). Having got it all, it is able to serve SIP endpoints over WebRTC via mod_sofia (they'll just be other SIP phones, exactly like the rest of the soft and hard SIP phones), and it interacts with XMPP via mod_jingle. Crucially, FreeSWITCH has been designed since its inception to manage high-definition media, both audio and video.
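As a taste of the SIP option, here is a minimal browser-side sketch using the open source JsSIP library mentioned above. Server address, credentials, and callee are placeholders, and the calls shown reflect the JsSIP API at the time of writing, so check the current JsSIP documentation before relying on it:

// Register a browser as a SIP endpoint over secure websocket
var socket = new JsSIP.WebSocketInterface('wss://sip.example.com:7443');
var ua = new JsSIP.UA({
    sockets: [socket],
    uri: 'sip:alice@example.com',
    password: 'supersecret'
});
ua.start();  // connect and register

// Place an audio/video call; JsSIP drives getUserMedia and the SDP/ICE machinery
ua.call('sip:bob@example.com', {
    mediaConstraints: { audio: true, video: true }
});

Pointed at a FreeSWITCH with mod_sofia listening on a websocket transport, this browser becomes just another SIP extension.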
On the media side, support for the Opus audio codec (8 kHz up to 48 kHz, enough for actual audio-CD quality) started years ago as a pioneering feature, and has evolved over the years to be so robust and self-healing as to sustain a loss of more than 40% (yep, as in FORTY PERCENT) of packets and maintain understandability. WebRTC's VP8 video codec is routinely carrying our mixed video conferences in Full HD (as in 1920x1080 pixels), and we're looking forward to investing in fiber and in some facial cream to look good in 4K. That's why FreeSWITCH can be the pivot of your next big WebRTC project: its architecture was designed from the start to be a multimedia powerhouse.

There is a lot of experience out there in using FreeSWITCH to expand the reach of existing SIP services by having the browsers act as SIP phones via JavaScript libraries, without modifying the service logic and implementation in any way. You just add SIP extensions that happen to be browsers. For the remainder of this article, we'll write about VERTO, a FreeSWITCH proposal especially dedicated to web development.

What is Verto (module and jslib)?

Verto is a FreeSWITCH module (mod_verto) that allows for JSON interaction with FreeSWITCH, via secure websockets (wss). All the power and complexity of FreeSWITCH can be harnessed via Verto: session management, call control, text messaging, and user data exchange and synchronization. Take a note for yourself: "user data exchange and synchronization". We'll be back to this later.

Verto is like Event Socket Layer (ESL) on steroids: anything you can do in ESL (subscribe to, send, and receive messages in the FS core message pumps/queues) you can do in Verto, but Verto is actually much more and can do much more. Verto is also made for high-level control of WebRTC!

Verto has an accompanying JavaScript library, verto.js. Using verto.js, a web developer can videoconference-enable a website and/or add a collaboration platform to a CRM system in a few lines of code. And in a few lines of code that they understand, in a logic that's familiar to web developers, without forcing references to foreign knowledge domains like SIP. Also, Verto allows for the simplest way to extend your existing SIP services to WebRTC browsers.

The added benefit of "user data exchange and synchronization" (see, I'm back to it) is not to be taken lightly: you can create data structures (for example, in JSON) and have them synchronized on the server and all clients, with each modification made by a client or the server automatically, immediately, and transparently reflected on all other clients. Imagine a dynamic list of conference participants, or a chat, or a stock ticker, or a multiuser ping pong game, and so on.

Configure mod_verto

mod_verto is installed by default by the standard FreeSWITCH implementation. Let's have a look at its configuration file, verto.conf.xml. The most important parameter here, and the only one I had to modify from the stock configuration file, is ext-rtp-ip. If your server is behind a NAT (that is, it sits on a private network and exchanges packets with the public Internet via some sort of port forwarding by a router or firewall), you must set this parameter to the public IP address the clients are reaching for.

Other very important parameters are the codec strings. Those two parameters determine the absolute string that will be used in SDP media negotiation. The list in the string will represent all the media formats to be proposed and accepted.
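A trimmed-down verto.conf.xml illustrating these parameters might look like the following sketch; the parameter names are recalled from the stock configuration file, and the values (addresses, ports) are examples to be adapted to your installation:

<configuration name="verto.conf" description="HTML5 Verto Endpoint">
  <profiles>
    <profile name="default-v4">
      <!-- websocket bindings; the secure one serves wss:// -->
      <param name="bind-local" value="$${local_ip_v4}:8081"/>
      <param name="bind-local" value="$${local_ip_v4}:8082" secure="true"/>
      <!-- public address to advertise when the server sits behind NAT (example value) -->
      <param name="ext-rtp-ip" value="203.0.113.7"/>
      <!-- the two codec strings used in SDP media negotiation -->
      <param name="outbound-codec-string" value="opus,vp8"/>
      <param name="inbound-codec-string" value="opus,vp8"/>
    </profile>
  </profiles>
</configuration>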
WebRTC has mandatory (so, assured) support for the VP8 video codec, while the mandatory audio codecs are Opus and PCMU/PCMA (that is, G.711). PCMU and PCMA are much less CPU-hungry than Opus. So, if you are willing to settle for less quality (G.711 is "old PSTN" audio quality), you can use "pcmu,pcma,vp8" as your strings, and have both clients and server use far less CPU power for audio processing. This can make a real difference, and very much sense, in certain setups, for example, if you must cope with low-power devices. Also, if you route/bridge calls to/from the PSTN, they will have no use for Opus high-definition audio; it is much better to directly offer the original G.711 stream than to decode/recode it in Opus.

Test with Communicator

Once configured, you want to test your mod_verto install. What better moment than now to get to know the awesomeness of Verto Communicator, an advanced JavaScript videoconference and collaboration client, developed by Italo Rossi, Jonatas Oliveira, and Stefan Yohansson from Brazil, Joao Mesquita from Argentina, and our core devs Ken Rice and Brian West from Tennessee and Oklahoma?

If it's not already done, copy the Verto Communicator distribution directory (/usr/src/freeswitch.git/html5/verto/verto_communicator/dist/) into a directory served by your web server in SSL (be sure you got all the SSL certificates right). To see it in all its splendor, be sure to call from two different clients, one as a simple participant, the other as a moderator, and you'll be presented with controls to manage the conference layout, give the floor, share the screen, create banners with the name and title of each participant, chat in real time, and much more. It is simply astonishing what can be done with JavaScript and mod_verto.

Summary

In this article we delved into WebRTC design, what infrastructure it requires, and in what ways it is similar to and different from known VoIP. We understood that WebRTC is only about media, and leaves the signaling to the implementor. Also, we got the specifics of WebRTC: its way of traversing NAT, its omnipresent encryption, its peer-to-peer nature. We saw that going beyond peer to peer, connecting with the telecommunication world of services, needs gateways that do transport, protocol, and media translations. FreeSWITCH is the perfect fit as a WebRTC server, a WebRTC gateway, and also as an application server. And then we saw how to implement Verto, a signaling protocol born on WebRTC, a JSON web protocol designed to exploit the additional features of WebRTC and of FreeSWITCH, like real-time data structure synchronization, session rehydration, event systems, and so on.

Resources for Article:

Further resources on this subject:

Configuring FreeSWITCH for WebRTC [article]
Architecture of FreeSWITCH [article]
FreeSWITCH 1.0.6: SIP and the User Directory [article]


Visualizing Time Spent Typing in Slack

Bradley Cicenas
25 Jul 2016
4 min read
Slack's massive popularity as a team messaging platform has brought up some age-old questions about productivity in the workplace. Does ease of communication really enable us to get more done day-to-day? Or is it just another distraction in the sea of our notification panel? Using the Slack RTM (Real-Time Messaging) API, we can follow just how much of our day we spend collaborating, making business-critical decisions, and sharing cat GIFs.

A word on the Real-Time Messaging API

Much of Slack's success can be attributed to the plethora of bots, integrations, and apps available for the platform. While many are built on the robust Web API, the Real-Time Messaging API provides a stream of over 65 different event types as they happen, making it an ideal choice for analyzing your own messaging habits. Event types include file uploads, emoji usage, user status, joining and leaving a channel, and many more.

Since it's difficult to gauge how long we spend reading or thinking about conversations in Slack, we'll use a metric we do know with a bit of certainty—time spent typing. Fortunately, this is also a specific event type broadcast from the RTM API: user_typing. Unlike most web APIs, connections to the RTM API are made over a persistent websocket. We'll use the SlackSocket Python library to listen in on events as they come in.

Recording events

To start, we'll need to gather and record event data across a period of time. Creating a SlackSocket object filtered by event type is fairly straightforward:

from slacksocket import SlackSocket

slack = SlackSocket('<slack-token>', event_filters=['user_typing'])

Since we're only concerned with following a single type of event, an event_filter is added so that we won't have to read and filter every incoming message in our code. According to the documentation, a user_typing event is sent:

on every key press in the chat input unless one has been sent in the last three seconds

For the sake of our analysis, we'll assume that each of these events accounts for three seconds of a user's time.

from datetime import datetime

for event in slack.events():
    now = datetime.now().timestamp()  # get the current epoch timestamp
    with open('typing.csv', 'a') as of:
        of.write('%s,%s\n' % (now, event.event['user']))

Our typing will be logged in CSV format with a timestamp and the corresponding user that triggered the event.

Plotting with matplotlib

After we've collected a sufficient amount of data (a day in this case) on our typing events, we can plot it out in a separate script using matplotlib. We'll read in all of the data, filtering for our user:

from datetime import datetime
import matplotlib.pyplot as plt

with open('typing.csv') as of:
    data = [l.strip('\n').split(',') for l in of.readlines()]

x = []
y = []
for ts, user in data:
    if user == 'bradley':
        x.append(datetime.fromtimestamp(float(ts)))  # convert epoch timestamp to datetime object
        y.append(3)  # seconds of typing

Epoch timestamps are converted back into datetime objects to ensure that matplotlib can display them correctly along the x-axis. Create the plot and export as a PNG:

plt.plot(x, y)
plt.gcf().autofmt_xdate()  # make the x-labels nicer for timestamps
plt.savefig('typing.png')

Results: Not a particularly eventful morning (at least until I'd had my coffee), but enough to infer that I'm rarely spending more than five minutes an hour here in active discussion. Another data point missing from our observation is the number of messages in comparison to the time spent typing.
If a message was rewritten or partially written and retracted, this could account for quite a bit of typing time without producing much in terms of message content. A playground for analytics There's quite a bit of fun and insight to be had watching just this single user_typing event. Likewise, tracking any number of the 65+ other events broadcast by Slack’s RTM API works well to create an interesting and multi-layered dataset ripe for analysis. The code for SlackSocket is available on GitHub and, as always, we welcome any contributions or feature requests! About the author Bradley Cicenas is a New York City-based infrastructure engineer with an affinity for microservices, systems design, data science, and stoops.


Detecting and Protecting against Your Enemies

Packt
22 Jul 2016
9 min read
In this article by Matthew Poole, the author of the book Raspberry Pi for Secret Agents - Third Edition, we will discuss how the Raspberry Pi has lots of ways of connecting things to it, such as plugging things into the USB ports, connecting devices to the onboard camera and display ports, and using the various interfaces that make up the GPIO (General Purpose Input/Output) connector. As part of our detection and protection regime we'll be focusing mainly on connecting things to the GPIO connector.

(For more resources related to this topic, see here.)

Build a laser trip wire

You may have seen Wallace and Gromit's short film, The Wrong Trousers, where the penguin uses a contraption to control Wallace in his sleep, making him break into a museum to steal the big shiny diamond. The diamond is surrounded by laser beams, but when one of the beams is broken the alarms go off and the diamond is protected with a cage! In this project, I'm going to show you how to set up a laser beam and have our Raspberry Pi alert us when the beam is broken—aka a laser trip wire.

For this we're going to need to use a Waveshare Laser Sensor module (www.waveshare.com), which is readily available to buy on Amazon for around £10 / $15. The module comes complete with jumper wires that allow us to easily connect it to the GPIO connector on the Pi:

The Waveshare laser sensor module contains both the transmitter and receiver

How it works

The module contains both a laser transmitter and receiver. The laser beam is transmitted from the gold tube on the module at a particular modulating frequency. The beam will then be reflected off a surface such as a wall or skirting board and picked up by the light sensor lens at the top of the module. The receiver will only detect light that is modulated at the same frequency as the laser beam, and so is not affected by ordinary visible light. This particular module works best when the reflective surface is between 80 and 120 cm away from the laser transmitter.

When the beam is interrupted and prevented from reflecting back to the receiver, this is detected and the data pin will be triggered. A script monitoring the data pin on the Pi will then do something when it detects this trigger.

Important: Don't ever look directly into the laser beam as it will hurt your eyes and may irreversibly damage them. Make sure the unit is facing away from you when you wire it up.

Wiring it up

This particular device runs from a power supply of between 2.5 V and 5.0 V. Since our GPIO inputs require 3.3 V maximum when a high level is input, we will use the 3.3 V supply from our Raspberry Pi to power the device:

Wiring diagram for the laser sensor module

Connect the included 3-hole connector to the three pins at the bottom of the laser module with the red wire on the left (the pin marked VCC).
Referring to the earlier GPIO pin-out diagram, connect the yellow wire to pin 11 of the GPIO connector (labeled D0/GPIO 17).
Connect the black wire to pin 6 of the GPIO connector (labeled GND/0V).
Connect the red wire to pin 1 of the GPIO connector (3.3 V).

The module should now come alive. The red LED on the left of the module will come on if the beam is interrupted. This is what it should look like in real life:

The laser module connected to the Raspberry Pi

Writing the detection script

Now that we have connected the laser sensor module to our Raspberry Pi, we need to write a little script that will detect when the beam has been broken.
In this project we've connected our sensor output to D0, which is GPIO17 (refer to the earlier GPIO pin-out diagram). We need to create file access for the pin by entering the command:

pi@raspberrypi ~ $ sudo echo 17 > /sys/class/gpio/export

And now set its direction to "in":

pi@raspberrypi ~ $ sudo echo in > /sys/class/gpio/gpio17/direction

We're now ready to read its value, and we can do this with the following command:

pi@raspberrypi ~ $ sudo cat /sys/class/gpio/gpio17/value

You'll notice that it will have returned "1" (digital high state) if the beam reflection is detected, or a "0" (digital low state) if the beam is interrupted. We can create a script to poll for the beam state:

#!/bin/bash
sudo echo 17 > /sys/class/gpio/export
sudo echo in > /sys/class/gpio/gpio17/direction
# loop forever
while true
do
    # read the beam state
    BEAM=$(sudo cat /sys/class/gpio/gpio17/value)
    if [ $BEAM == 1 ]; then
        # beam not blocked
        echo "OK"
    else
        # beam was broken
        echo "ALERT"
    fi
done

Code listing for beam-sensor.sh

When you run the script you should see OK scroll up the screen. Now interrupt the beam using your hand and you should see ALERT scroll up the console screen until you remove your hand. Don't forget that once we've finished with the GPIO port it's tidy to remove its file access:

pi@raspberrypi ~ $ sudo echo 17 > /sys/class/gpio/unexport

We've now seen how to easily read a GPIO input. The same wiring principle and script can be used to read other sensors, such as motion detectors or anything else that has an on and off state, and act upon their status.

Protecting an entire area

Our laser trip wire is great for being able to detect when someone walks through a doorway or down a corridor, but what if we wanted to know if people are in a particular area or a whole room? Well, we can with a basic motion sensor, otherwise known as a passive infrared (PIR) detector. These detectors come in a variety of types, and you may have seen them lurking in the corners of rooms, but fundamentally they all work the same way, by detecting the presence of body heat in relation to the background temperature within a certain area, and so are commonly used to trigger alarm systems when somebody (or something, such as the pet cat) has entered a room.

For the covert surveillance of our private zone we're going to use a small Parallax PIR Sensor, available from many online Pi-friendly stores such as ModMyPi, Robot Shop, or Adafruit for less than £10 / $15. This little device will detect the presence of enemies within a 10 meter range of it. If you can't obtain one of these types then there are other types that will work just as well, but the wiring might be different from that explained in this project.

Parallax passive infrared motion sensor

Wiring it up

As with our laser sensor module, this device also just needs three wires to connect it to the Raspberry Pi. However, they are connected differently on the sensor, as shown below:

Wiring diagram for the Parallax PIR motion sensor module

Referring to the earlier GPIO pin-out diagram, connect the yellow wire to pin 11 of the GPIO connector (labelled D0/GPIO 17), with the other end connecting to the OUT pin on the PIR module.
Connect the black wire to pin 6 of the GPIO connector (labelled GND/0V), with the other end connecting to the GND pin on the PIR module.
Connect the red wire to pin 1 of the GPIO connector (3.3 V), with the other end connecting to the VCC pin on the module.
The module should now come alive, and you'll notice the light switching on and off as it detects your movement around it. This is what it should look like for real:

PIR motion sensor connected to Raspberry Pi

Implementing the detection script

The detection script for the PIR motion sensor is similar to the one we created for the laser sensor module in the previous section. Once again, we've connected our sensor output to D0, which is GPIO17. We create file access for the pin by entering the command:

pi@raspberrypi ~ $ sudo echo 17 > /sys/class/gpio/export

And now set its direction to in:

pi@raspberrypi ~ $ sudo echo in > /sys/class/gpio/gpio17/direction

We're now ready to read its value, and we can do this with the following command:

pi@raspberrypi ~ $ sudo cat /sys/class/gpio/gpio17/value

You'll notice that this time the PIR module will have returned 1 (digital high state) if motion is detected, or a 0 (digital low state) if there is no motion detected. We can modify our previous script to poll for the motion-detected state:

#!/bin/bash
sudo echo 17 > /sys/class/gpio/export
sudo echo in > /sys/class/gpio/gpio17/direction
# loop forever
while true
do
    # read the sensor state
    STATE=$(sudo cat /sys/class/gpio/gpio17/value)
    if [ $STATE == 0 ]; then
        # no motion detected
        echo "OK"
    else
        # motion was detected
        echo "INTRUDER!"
    fi
done

Code listing for motion-sensor.sh

When you run the script you should see OK scroll up the screen if everything is nice and still. Now move in front of the PIR's detection area and you should see INTRUDER! scroll up the console screen until you are still again. Again, don't forget that once we've finished with the GPIO port we should remove its file access:

pi@raspberrypi ~ $ sudo echo 17 > /sys/class/gpio/unexport

Summary

In this article we had a guided tour of the Raspberry Pi's GPIO connector and how to safely connect peripherals to it, connecting a laser sensor module to our Pi to create a rather cool laser trip wire that can alert you when the laser beam is broken.

Resources for Article:

Further resources on this subject:

Building Our First Poky Image for the Raspberry Pi [article]
Raspberry Pi LED Blueprints [article]
Raspberry Pi Gaming Operating Systems [article]

Debugging Your .NET Application

Packt
21 Jul 2016
13 min read
In this article by Jeff Martin, author of the book Visual Studio 2015 Cookbook - Second Edition, we will discuss how modern software development still requires developers to identify and correct bugs in their code. The edit-compile-test cycle is as familiar as a text editor, and now the rise of portable devices has added the need to measure battery consumption and optimize for multiple architectures. Fortunately, our development tools continue to evolve to combat this rise in complexity, and Visual Studio continues to improve its arsenal.

(For more resources related to this topic, see here.)

Multi-threaded code and asynchronous code are probably the two most difficult areas for most developers to work with, and also the hardest to debug when you have a problem like a race condition. A race condition occurs when multiple threads perform an operation at the same time, and the order in which they execute makes a difference to how the software runs or the output is generated. Race conditions often result in deadlocks, incorrect data being used in other calculations, and random, unrepeatable crashes. The other painful area to debug involves code running on other machines, whether it is running locally on your development machine or running in production. Hooking up a remote debugger in previous versions of Visual Studio has been less than simple, and the experience of debugging code in production was similarly frustrating.

In this article, we will cover the following sections:

Putting Diagnostic Tools to work
Maximizing everyday debugging

Putting Diagnostic Tools to work

In Visual Studio 2013, Microsoft debuted a new set of tools called the Performance and Diagnostics hub. With VS2015, these tools have been revised further and, in the case of Diagnostic Tools, promoted to a central presence in the main IDE window, displayed by default during debugging sessions. This is great for us as developers, because now it is easier than ever to troubleshoot and improve our code. In this section, we will explore how Diagnostic Tools can be used to explore our code, identify bottlenecks, and analyze memory usage.

Getting ready

The changes didn't stop when VS2015 was released, and succeeding updates to VS2015 have further refined the capabilities of these tools. So for this section, ensure that Update 2 has been installed on your copy of VS2015. We will be using Visual Studio Community 2015, but of course, you may use one of the premium editions too.

How to do it…

For this section, we will put together a short program that will generate some activity for us to analyze:

Create a new C# Console Application, and give it a name of your choice.

In your project's new Program.cs file, add the following method that will generate a large quantity of strings:

static List<string> makeStrings()
{
    List<string> stringList = new List<string>();
    Random random = new Random();
    for (int i = 0; i < 1000000; i++)
    {
        string x = "String details: " + (random.Next(1000, 100000));
        stringList.Add(x);
    }
    return stringList;
}

Next we will add a second static method that produces an SHA256-calculated hash of each string that we generated. This method reads in each string that was previously generated, creates an SHA256 hash for it, and returns the list of computed hashes in hex format.
static List<string> hashStrings(List<string> srcStrings)
{
    List<string> hashedStrings = new List<string>();
    SHA256 mySHA256 = SHA256Managed.Create();
    StringBuilder hash = new StringBuilder();

    foreach (string str in srcStrings)
    {
        byte[] srcBytes = mySHA256.ComputeHash(Encoding.UTF8.GetBytes(str), 0,
            Encoding.UTF8.GetByteCount(str));
        foreach (byte theByte in srcBytes)
        {
            hash.Append(theByte.ToString("x2"));
        }
        hashedStrings.Add(hash.ToString());
        hash.Clear();
    }
    mySHA256.Clear();
    return hashedStrings;
}

After adding these methods, you may be prompted to add using statements for System.Text and System.Security.Cryptography. These are definitely needed, so go ahead and take Visual Studio's recommendation to have them added.

Now we need to update our Main method to bring this all together. Update your Main method to the following:

static void Main(string[] args)
{
    Console.WriteLine("Ready to create strings");
    Console.ReadKey(true);
    List<string> results = makeStrings();
    Console.WriteLine("Ready to Hash " + results.Count() + " strings ");
    //Console.ReadKey(true);
    List<string> strings = hashStrings(results);
    Console.ReadKey(true);
}

Before proceeding, build your solution to ensure everything is in working order.

Now run the application in Debug mode (F5), and watch how our program operates. By default, the Diagnostic Tools window will only appear while debugging. Feel free to reposition your IDE windows to make its presence more visible, or use Ctrl + Alt + F2 to recall it as needed.

When you first launch the program, you will see the Diagnostic Tools window appear. Its initial display resembles the following screenshot. Thanks to the first ReadKey method, the program will wait for us to proceed, so we can easily see the initial state. Note that CPU usage is minimal, and memory usage holds constant.

Before going any further, click on the Memory Usage tab, and then the Take Snapshot command as indicated in the preceding screenshot. This will record the current state of memory usage by our program, and will be a useful comparison point later on. Once a snapshot is taken, your Memory Usage tab should resemble the following screenshot:

Having a forced pause through our ReadKey() method is nice, but when working with real-world programs, we will not always have this luxury. Breakpoints are typically used for situations where it is not always possible to wait for user input, so let's take advantage of the program's current state, and set two of them. We will add one to the second WriteLine method, and one to the last ReadKey method, as shown in the following screenshot:

Now return to the open application window, and press a key so that execution continues. The program will stop at the first breakpoint, which is right after it has generated a bunch of strings and added them to our List object. Let's take another snapshot of the memory usage in the same manner given in Step 9. You may also notice that the memory usage displayed in the Process Memory gauge has increased significantly, as shown in this screenshot:

Now that we have completed our second snapshot, click on Continue in Visual Studio, and proceed to the next breakpoint. The program will then calculate hashes for all of the generated strings, and when this has finished, it will stop at our last breakpoint. Take another snapshot of the memory usage. Also take notice of how the CPU usage spiked as the hashes were being calculated:

Now that we have these three memory snapshots, we will examine how they can help us.
You may notice how memory usage increases during execution, especially from the initial snapshot to the second. Click on the second snapshot's object delta, as shown in the following screenshot:

On clicking, this will open the snapshot details in a new editor window. Click on the Size (Bytes) column to sort by size, and as you may suspect, our List<String> object is indeed the largest object in our program. Of course, given the nature of our sample program, this is fairly obvious, but when dealing with more complex code bases, being able to utilize this type of investigation is very helpful. The following screenshot shows the results of our filter:

If you would like to know more about the object itself (perhaps there are multiple objects of the same type), you can use the Referenced Types option as indicated in the preceding screenshot. If you would like to try this out on the sample program, be sure to set a smaller number in the makeStrings() loop, otherwise you will run the risk of overloading your system.

Returning to the main Diagnostic Tools window, we will now examine CPU utilization. While the program is executing the hashes (feel free to restart the debugging session if necessary), you can observe where the program spends most of its time:

Again, it is probably no surprise that most of the hard work was done in the hashStrings() method. But when dealing with real-world code, it will not always be so obvious where the slowdowns are, and having this type of insight into your program's execution will make it easier to find areas requiring further improvement.

When using the CPU profiler in our example, you may find it easier to remove the first breakpoint and simply trigger profiling by clicking on Break All, as shown in this screenshot:

How it works...

Microsoft wanted more developers to be able to take advantage of their improved technology, so they have increased its availability beyond the Professional and Enterprise editions to also include Community. Running your program within VS2015 with the Diagnostic Tools window open lets you examine your program's performance in great detail. By using memory snapshots and breakpoints, VS2015 provides you with the tools needed to analyze your program's operation, and determine where you should spend your time making optimizations.

There's more…

Our sample program does not perform a wide variety of tasks, but of course, more complex programs usually do. To further assist with analyzing those programs, there is a third option available to you beyond CPU Usage and Memory Usage: the Events tab. As shown in the following screenshot, the Events tab also provides the ability to search events for interesting (or long-running) activities. Different event types include file activity, gestures (for touch-based apps), and program modules being loaded or unloaded.

Maximizing everyday debugging

Given the frequency of debugging, any refinement to these tools can pay immediate dividends. VS 2015 brings the popular Edit and Continue feature into the 21st century by supporting 64-bit code. Added to that is the new ability to see the return value of functions in your debugger. These features combine to make debugging code easier, allowing you to solve problems faster.

Getting ready

For this section, you can use VS 2015 Community or one of the premium editions. Be sure to run your choice on a machine using a 64-bit edition of Windows, as that is what we will be demonstrating in this section.
Don't worry, you can still use Edit and Continue with 32-bit C# and Visual Basic code.

How to do it…

Both features are now supported by C#/VB, but we will be using C# for our examples. The features being demonstrated are compiler features, so feel free to use code from one of your own projects if you prefer. To see how Edit and Continue can benefit 64-bit development, perform the following steps:

Create a new C# Console Application using the default name.

To ensure the demonstration is running with 64-bit code, we need to change the default solution platform. Click on the drop-down arrow next to Any CPU, and select Configuration Manager...

When the Configuration Manager dialog opens, we can create a new project platform targeting 64-bit code. To do this, click on the drop-down menu for Platform, and select <New...>:

When <New...> is selected, it will present the New Project Platform dialog box. Select x64 as the new platform type:

Once x64 has been selected, you will return to Configuration Manager. Verify that x64 remains active under Platform, and then click on Close to close this dialog. The main IDE window will now indicate that x64 is active:

With the project settings out of the way, let's add some code to demonstrate the new behavior. Replace the existing code in your blank class file so that it looks like the following listing:

class Program
{
    static void Main(string[] args)
    {
        int w = 16;
        int h = 8;
        int area = calcArea(w, h);
        Console.WriteLine("Area: " + area);
    }

    private static int calcArea(int width, int height)
    {
        return width / height;
    }
}

Let's set some breakpoints so that we are able to inspect values during execution. First, add a breakpoint to the Main method's Console line. Add a second breakpoint to the calcArea method's return line. You can do this by either clicking on the left side of the editor window's border, or by right-clicking on the line and selecting Breakpoint | Insert Breakpoint:

If you are not sure where to click, use the right-click method, and then practice toggling the breakpoint by left-clicking on the breakpoint marker. Feel free to use whatever method you find most convenient.

Once the two breakpoints are added, Visual Studio will mark their location as shown in the following screenshot (the arrow indicates where you may click to toggle the breakpoint):

With the breakpoint markers now set, let's debug the program. Begin debugging by either pressing F5, or by clicking on the Start button on the toolbar:

Once debugging starts, the program will quickly execute until stopped by the first breakpoint. Let's first take a look at Edit and Continue. Visual Studio will stop at the calcArea method's return line. Astute readers will notice an error (marked by 1 in the following screenshot) present in the calculation, as the area value returned should be width * height. Make the correction.

Before continuing, note the variables listed in the Autos window (marked by 2 in the following screenshot). (If you don't see Autos, it can be made visible by pressing Ctrl + D, A, or through Debug | Windows | Autos while debugging.)

After correcting the area calculation, advance the debugging step by pressing F10 twice. (Alternatively, make the advancement by selecting the menu item Debug | Step Over twice.) Visual Studio will advance to the declaration for the area. Note that you were able to edit your code and continue debugging without restarting.
The Autos window will update to display the function's return value, which is 128 (the value for area has not been assigned yet in the following screenshot—Step Over once more if you would like to see that assigned):

There's more…

Programmers who write C++ have already had the ability to see the return values of functions—this just brings .NET developers into the fold. The result is that your development experience won't have to suffer based on the language you have chosen to use for your project.

The Edit and Continue functionality is also available for ASP.NET projects. New projects created on VS2015 will have Edit and Continue enabled by default. Existing projects imported to VS2015 will usually need this to be enabled if it hasn't been done already. To do so, open the Options dialog via Tools | Options, and look for the Debugging | General section. The following screenshot shows where this option is located on the properties page:

Whether you are working with an ASP.NET project or a regular C#/VB .NET application, you can verify Edit and Continue is set via this location.

Summary

In this article, we examined the improvements to the debugging experience in Visual Studio 2015, and how they can help you diagnose the root cause of a problem faster so that you can fix it properly, and not just patch over the symptoms.

Resources for Article:

Further resources on this subject:

Creating efficient reports with Visual Studio [article]
Connecting to Microsoft SQL Server Compact 3.5 with Visual Studio [article]


Managing EAP in Domain Mode

Packt
19 Jul 2016
7 min read
This article by Francesco Marchioni, author of the book Mastering JBoss Enterprise Application Platform 7, dives deep into application server management using the domain mode, its main components, and discusses how to shift to advanced configurations that resemble real-world projects. The main topics covered are:

Domain mode breakdown
Handy domain properties
Electing the domain controller

(For more resources related to this topic, see here.)

Domain mode breakdown

Managing the application server in the domain mode means, in a nutshell, controlling multiple servers from a centralized single point of control. The servers that are part of the domain can span across multiple machines (or even across the cloud), and they can be grouped with similar servers of the domain to share a common configuration. To bring some order, we will break down the domain components into two main categories:

Physical components: These are the domain elements that can be identified with a Java process running on the operating system
Logical components: These are the domain elements that can span across several physical components

Domain physical components

When you start the application server through the domain.sh script, you will be able to identify the following processes:

Host controller: Each domain installation contains a host controller. This is a Java process that is in charge of starting and stopping the servers that are defined within the host.xml file. The host controller is only aware of the items that are specific to the local physical installation, such as the domain controller host and port, the JVM settings of the servers, or their system properties.

Domain controller: One host controller of the domain (and only one) is configured to act as the domain controller. This means basically two things: keeping the domain configuration (in the domain.xml file) and assisting the host controllers in managing the servers of the domain.

Servers: Each host controller can contain any number of servers, which are the actual server instances. These server instances cannot be started autonomously. The host controller is in charge of starting/stopping single servers when the domain controller commands it.

If you start the default domain configuration on a Linux machine, you will see the following processes show up in your operating system:

As you can see, the process controller is identified by the [Process Controller] label, while the domain controller corresponds to the [Host Controller] label. Each server shows up in the process table with the name defined in the host.xml file. You can use common operating system commands such as grep to further restrict the search to a specific process.

Domain logical components

A domain configuration with only physical elements in it would not add much to a set of standalone servers. The following components can abstract the domain definition, making it dynamic and flexible:

Server Group: A server group is a collection of servers. They are defined in the domain.xml file, hence they don't have any reference to an actual host controller installation. You can use a server group to share configuration and deployments across a group of servers.

Profile: A profile is an EAP configuration. A domain can hold as many profiles as you need. Out of the box, the following configurations are provided:

default: This configuration matches the standalone.xml configuration (in standalone mode), hence it does not include JMS, IIOP, or HA.
full: This configuration matches the standalone-full.xml configuration (in standalone mode), hence it adds JMS and OpenJDK IIOP to the default server.
ha: This configuration matches the standalone-ha.xml configuration (in standalone mode), so it enhances the default configuration with clustering (HA).
full-ha: This configuration matches the standalone-full-ha.xml configuration (in standalone mode), hence it includes JMS, IIOP, and HA.

Handy domain properties

So far we have learnt the default configuration files used by JBoss EAP and the location where they are placed. These settings can, however, be varied by means of system properties. The following options customize the domain configuration file names:

--domain-config: The domain configuration file (default domain.xml)
--host-config: The host configuration file (default host.xml)

On the other hand, the following properties adjust the domain directory structure:

jboss.domain.base.dir: The base directory for domain content
jboss.domain.config.dir: The base configuration directory
jboss.domain.data.dir: The directory used for persistent data file storage
jboss.domain.log.dir: The directory containing the host-controller.log and process-controller.log files
jboss.domain.temp.dir: The directory used for temporary file storage
jboss.domain.deployment.dir: The directory used to store deployed content
jboss.domain.servers.dir: The directory containing the managed server instances

For example, you can start EAP 7 in domain mode using the domain configuration file mydomain.xml and the host file named myhost.xml, based on the base directory /home/jboss/eap7domain, using the following command:

$ ./domain.sh --domain-config=mydomain.xml --host-config=myhost.xml -Djboss.domain.base.dir=/home/jboss/eap7domain

Electing the domain controller

Before creating your first domain, we will learn in more detail about the process that connects one or more host controllers to one domain controller, and about how a host controller is elected to be the domain controller. The physical topology of the domain is stored in the host.xml file. Within this file, you will find as the first line the host controller name, which makes each host controller unique:

<host name="master">

One of the host controllers will be configured to act as the domain controller.
This is done in the domain-controller section with the following block, which states that the domain controller is the host controller itself (hence, local):

<domain-controller>
  <local/>
</domain-controller>

All other host controllers will connect to the domain controller, using the following example configuration, which uses the jboss.domain.master.address and jboss.domain.master.port properties to specify the domain controller address and port:

<domain-controller>
  <remote protocol="remote" host="${jboss.domain.master.address}" port="${jboss.domain.master.port:9999}" security-realm="ManagementRealm"/>
</domain-controller>

The host controller-domain controller communication happens behind the scenes through a native management port that is also defined in the host.xml file:

<management-interfaces>
  <native-interface security-realm="ManagementRealm">
    <socket interface="management" port="${jboss.management.native.port:9999}"/>
  </native-interface>
  <http-interface security-realm="ManagementRealm" http-upgrade-enabled="true">
    <socket interface="management" port="${jboss.management.http.port:9990}"/>
  </http-interface>
</management-interfaces>

The other highlighted attribute is the management http port, which can be used by the administrator to reach the domain controller. This port is especially relevant if the host controller is the domain controller. Both sockets use the management interface, which is defined in the interfaces section of the host.xml file, and which exposes the domain controller on a network-available address:

<interfaces>
  <interface name="management">
    <inet-address value="${jboss.bind.address.management:127.0.0.1}"/>
  </interface>
  <interface name="public">
    <inet-address value="${jboss.bind.address:127.0.0.1}"/>
  </interface>
</interfaces>

If you want to run multiple host controllers on the same machine, you need to provide a unique jboss.management.native.port for each host controller, or a different jboss.bind.address.management.

Summary

In this article we covered some essentials of the domain mode breakdown, handy domain properties, and electing the domain controller.

Resources for Article:

Further resources on this subject:

Red5: A video-on-demand Flash Server [article]
Animating Elements [article]
Data Science with R [article]


Overview of Certificate Management

Packt
18 Jul 2016
24 min read
In this article by David Steadman and Jeff Ingalls, the authors of Microsoft Identity Manager 2016 Handbook, we will look at certificate management in brief. Microsoft Identity Manager (MIM) certificate management (CM) is deemed the outcast in many discussions. We are here to tell you that this is not the case. We see many scenarios where CM makes the management of user-based certificates possible and improved. If you are currently using FIM certificate management or considering a new certificate management deployment with MIM, we think you will find that CM is a component to consider. CM is not a requirement for using smart cards, but it adds a lot of functionality and security to the process of managing the complete life cycle of your smart cards and software-based certificates in a single-forest or multiforest scenario. In this article, we will look at the following topics:

What is CM?
Certificate management components
Certificate management agents
The certificate management permission model

(For more resources related to this topic, see here.)

What is certificate management?

Certificate management extends MIM functionality by adding a management policy-driven workflow that enables the complete life cycle of initial enrollment, duplication, and revocation of user-based certificates. Some smart card features include offline unblocking, duplicating cards, and recovering a certificate from a lost card. This policy concept is driven by a profile template within the CM application. Profile templates are stored in Active Directory, which means the application already has built-in redundancy. CM is based on the idea that the product will proxy, or be the middle man, making requests to and receiving certificates from the CA. CM performs its functions with user agents that encrypt and decrypt its communications.

When discussing PKI (Public Key Infrastructure) and smart cards, you usually need to have some discussion about the level of assurance you would like for the identities secured by your PKI. For basic insight on PKI and assurance, take a look at http://bit.ly/CorePKI. In typical scenarios, many PKI designers argue that you should use a Hardware Security Module (HSM) to secure your PKI in order to get the assurance level needed to use smart cards. Our personal opinion is that HSMs are great if you need high assurance on your PKI, but smart cards increase your security even if your PKI has medium or low assurance. Using MIM CM with an HSM will not be covered in this article, but if you take a look at http://bit.ly/CMandLunSA, you will find some guidelines on how to use MIM CM and the HSM Luna SA.

The Financial Company has a low-assurance PKI with only one enterprise root CA issuing the certificates. The Financial Company does not use an HSM with their PKI or their MIM CM. If you are running a medium- or high-assurance PKI within your company, policies on how to issue smart cards may differ from this example. More details on PKI design can be found at http://bit.ly/PKIDesign.

Certificate management components

Before we talk about certificate management, we need to understand the underlying components and architecture:

As depicted before, we have several components at play. We will start from the left and move to the right. From a high level, we have the Enterprise CA. The Enterprise CA can be multiple CAs in the environment. Communication from the CM application server to the CA is over the DCOM/RPC channel.
End user communication can be with the CM web page, or with a new REST API via a modern client, to enable the requesting and management of smart cards. From the CM perspective, the two mandatory components are the CM server and the CA modules. Looking at the logical architecture, we have the CA, and underneath this, we have the modules. The policy and exit modules, once installed, control the communication and behavior of the CA based on your CM's needs.

Moving down the stack, we have Active Directory integration. AD integration is the nuts and bolts of the operation. Integration into AD can be very complex in some environments, so understanding this area and how CM interacts with it is very important. We will cover the permission model later in this article, but it is worth mentioning that most of the configuration is done and stored in AD, along with the database. CM uses its own SQL database, and the default name is FIMCertificateManagement. The CM application uses its own dedicated IIS application pool account to gain access to the CM database in order to record transactions on behalf of users. By default, the application pool account is granted the clmApp role during the installation of the database, as shown in the following screenshot:

In CM, we have a concept called the profile template. The profile template is stored in the configuration partition of AD, and the security permissions on this container and its contents determine what a user is authorized to see. As depicted in the following screenshot, CM stores the data in the Public Key Services (1) and the Profile Templates container. CM then reads all the stored templates and the permissions to determine what a user has the right to do (2):

Profile templates are at the core of the CM logic. The three components comprising profile templates are certificate templates, profile details, and management policies. The first area of the profile template is certificate templates. Certificate templates define the extensions and data points that can be included in the certificate being requested. The next item is profile details, which determines the type of request (either a smart card or a software user-based certificate), where we will generate the certificates (either on the server or on the client side of the operations), and which certificate templates will be included in the request. The final area of a profile template is known as management policies. Management policies are the workflow engine of the process and contain the manager, the subscriber functions, and any data collection items. The e-mail function is initiated here, and is commonly referred to as the One Time Password (OTP) activity. Note the word "One". A trigger will only happen once here; therefore, multiple alerts using e-mail would have to be engineered through alternate means, such as using the MIM service and expiration activities.

The permission model is a bit complex, but you'll soon see the flexibility it provides. Keep in mind that the Service Connection Point (SCP) also has permissions applied to it to determine who can log in to the portal and what rights the user has within the portal. The SCP is created upon installation during the wizard configuration. You will want to be aware of the SCP location in case you run into configuration issues with administrators not being able to perform particular functions.
The SCP location is in the System container, within Microsoft, and within Certificate Lifecycle Manager, as shown here:

Typical location: CN=Certificate Lifecycle Manager,CN=Microsoft,CN=System,DC=THEFINANCIALCOMPANY,DC=NET

Certificate management agents

We covered several key components of the profile templates and where some of the permission model is stored. We now need to understand how the separation of duties is defined within the agent roles. The permission model provides granular control, which promotes the separation of duties. CM uses six agent accounts, and they can be named to fit your organization's requirements. We will walk through the initial setup again later in this article so that you can use our setup or alter it based on your needs. The Financial Company only requires the typical setup. We precreated the following accounts for TFC, but the wizard will create them for you if you do not use them. During the installation and configuration of CM, we will use the following accounts:

Besides the separation of duties, CM offers enrollment by proxy. Proxy enrollment of a request refers to providing a middle man to give the end user a fluid workflow during enrollment. Most of this proxying is accomplished via the agent accounts in one way or another.

The first account is the MIM CM Agent (MIMCMAgent), which is used by the CM server to encrypt data, from the smart card admin PINs to the data collection stored in the database. So, the agent account has an important role in protecting data and communication to and from the certificate authorities. The last role the CM agent has is the capability to revoke certificates. The agent certificate thumbprint is very important, and you need to make sure the correct value is updated in three areas: CM, web.config, and the certificate policy module under the Signing Certificates tab on the CA. We have identified these areas in the following.

For web.config:

<add key="Clm.SigningCertificate.Hash" value
<add key="Clm.Encryption.Certificate.Hash" value
<add key="Clm.SmartCard.ExchangeCertificate.Hash" value

The Signing Certificates tab is as shown in the following screenshot:

Now, when you run through the configuration wizard, these items are already updated, but it is good to know which locations need to be updated if you need to troubleshoot agent issues or even update/renew this certificate.

The second account we want to look at is the Key Recovery Agent (MIMCMKRAgent); this agent account is needed for CM to recover any archived private keys for certificates.

Now, let's look at the Enrollment Agent (MIMCMEnrollAgent); the main purpose of this agent account is to provide the enrollment of smart cards. The Enrollment Agent, as we call it, is responsible for signing all smart card requests before they are submitted to the CA. Typical permission for this account on the CA is read and request.

The Authorization Agent (MIMCMAuthAgent)—or, as some folks call this, the authentication agent—is responsible for determining access rights for all objects from a DACL perspective. When you log in to the CM site, it is the authorization account's job to determine what you have the right to do, based on the ACLs applied to all the core components. We will go over all the agent accounts and the rights needed later in this article during our setup.

The CA Manager Agent (MIMCMManagerAgent) is used to perform core CA functions. More importantly, its job is to issue Certificate Revocation Lists (CRLs). This happens when a smart card or certificate is retired or revoked.
It is up to this account to make sure the CRL is updated with this critical information. We saved the best for last: Web Pool Agent (MIMCMWebAgent). This agent is used to run the CM web application. The agent is the account that contacts the SQL server to record all user and admin transactions. The following is a good depiction of all the accounts together and their high-level functions:   The certificate management permission model In CM, we think this part is the most complex, because the implementation can be as granular as possible. For this reason, this area is the most difficult to understand. We will uncover the permission model so that we can begin to understand how it works within CM. When looking at CM, you need to formulate the type of management model you will be deploying. What we mean by this is: will you have a centralized or a delegated model? This plays a key part in deployment planning for CM and the permissions you will need to apply. In the centralized model, a specific set of managers is assigned all the rights for the management policy. This includes permissions on the users. Most environments use this method, as it is less complex. Within this model, we have manager-initiated permission, where CM permissions are assigned to groups containing the subscribers. Subscribers are the actual users doing the enrollment or participating in the workflow. This is the model that The Financial Company will use in its configuration. The delegated model is created by updating two flags in web.config, called clm.RequestSecurity.Flags and clm.RequestSecurity.Groups. These two flags work hand in hand: if you set UseGroups, CM will evaluate all the groups within the forest, including universal and global security groups. If you set UseGroups and also define clm.RequestSecurity.Groups, then CM will only look for these specific groups and evaluate them via the Authorization Agent. Setting the flag to evaluate the user instead tells the Authorization Agent to only read the permissions on the user and ignore any group membership permissions (we will sketch these web.config settings shortly):   When we continue to look at the permissions, there are five locations where they can be applied. The preceding figure outlines these locations, but we will go into more depth in the subsections in a bit. The point of the figure is to understand each location and which permissions can be applied there. The following are the areas and the permissions that can be set:

Service Connection Point: Extended Permissions
Users or Groups: Extended Permissions
Profile Template Objects: Container: Read or Write; Template Object: Read/Write or Enroll
Certificate Template: Read or Enroll
CM Management Policy within the web application: multiple options based on the need, such as Initiate Request

Now, let's begin to discuss the core areas so that The Financial Company can design the enrollment options they want. In the examples, we will use the main scenarios we encounter: the helpdesk, manager, and user (subscriber)-based scenarios. For example, certain functions are delegated to the helpdesk to allow them to assist the user base without giving them full control over the environment (the delegated model). Remember this as we look at the five core permission areas. Creating service accounts So far, in our MIM deployment, we have created quite a few service accounts. MIM CM, however, requires that we create a few more.
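Before we move on to the accounts themselves, here is a minimal sketch of what the delegated-model settings described above might look like in web.config. The key names come from the text, but the flag value and the group-list format shown here are assumptions for illustration; verify them against your own deployment before use:

<appSettings>
  <!-- Assumed values: evaluate only the listed groups via the Authorization Agent -->
  <add key="Clm.RequestSecurity.Flags" value="UseGroups" />
  <add key="Clm.RequestSecurity.Groups" value="MIMCM-Managers;MIMCM-Subscribers" />
</appSettings>

With that configuration noted, let's get back to the service accounts.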
During the configuration wizard, we will get the option of having the wizard create them for us, but we always recommend creating them manually in FIM/MIM CM deployments. One reason is that a few of these accounts need to be assigned certificates. If we use an HSM, we have to create them manually in order to make sure the certificates are indeed using the HSM. The wizard will ask for six different service accounts (agents), but we actually need seven. In The Financial Company, we created the following seven accounts to be used by FIM/MIM CM:

MIMCMAgent
MIMCMAuthAgent
MIMCMCAManagerAgent
MIMCMEnrollAgent
MIMCMKRAgent
MIMCMWebAgent
MIMCMService

The last one, MIMCMService, will not be used during the configuration wizard, but it will be used to run the MIM CM Update service. We also created the following security groups to help us out in the scenarios we will go over:

MIMCM-Helpdesk: This is the next step in OTP for subscribers
MIMCM-Managers: These are the managers of the CM environment
MIMCM-Subscribers: This is the group of users that will enroll

Service Connection Point Service Connection Point (SCP) is located under the System container within Active Directory. This location, as discussed in the earlier parts of the article, determines who can log in to the web application and what they can do there. As an example, if we just wanted every user to be able to log in, we would give them read rights. Authenticated users have this by default, but if you only want a subset of users to have access, you should remove authenticated users and add your own group. The SCP is decided when you run the configuration wizard, but the default is the one shown in the following screenshot:   If a user is assigned any of the MIM CM permissions available on the SCP, the administrative view of the MIM CM portal will be shown. The MIM CM permissions are defined in a Microsoft TechNet article at http://bit.ly/MIMCMPermission. For your convenience, we have copied parts of the information here:

MIM CM Audit: This generates and displays MIM CM policy templates, defines management policies within a profile template, and generates MIM CM reports.
MIM CM Enrollment Agent: This performs certificate requests for the user or group on behalf of another user. The issued certificate's subject contains the target user's name and not the requester's name.
MIM CM Request Enroll: This initiates, executes, or completes an enrollment request.
MIM CM Request Recover: This initiates encryption key recovery from the CA database.
MIM CM Request Renew: This initiates, executes, or completes an enrollment request. The renewal request replaces a user's certificate that is near its expiration date with a new certificate that has a new validity period.
MIM CM Request Revoke: This revokes a certificate before the expiration of the certificate's validity period. This may be necessary, for example, if a user's computer or smart card is stolen.
MIM CM Request Unblock Smart Card: This resets a smart card's user Personal Identification Number (PIN) so that the user can access the key material on the smart card.

The Active Directory extended permissions So, even if you have the SCP defined, we still need to set up the permissions on the user or group of users that we want to manage. Take our helpdesk example: if we want the helpdesk to perform certain functions, the most common one being offline unblock, this would require the MIMCM-HelpDesk group. We will create this group later in this article.
It would contain all helpdesk users; then, on the SCP, we would give this group CM Request Unblock Smart Card and CM Enrollment Agent. Then, you need to assign the same extended permissions on MIMCM-Subscribers, which contains all the users we plan to manage with the helpdesk and offline unblock:   So, as you can see, we are getting into redundant permissions, but the location where a permission is applied determines what the user can actually do. Planning of the model is therefore very important. Also, it is important to document what you have, as with some slight tweak, things can and will break. The certificate templates permission In order for any of this to be possible, we still need to give the manager of the user permission to enroll for or read the certificate template, as this will be added to the profile template. Anyone who is to manage this certificate will need read and enroll permissions. This is pretty basic, but that is it, as shown in the following screenshot:   The profile template permission The profile template determines what a user can read within the template. To manage profile templates, we need to use Active Directory Sites and Services. We need to activate the Services node, as this is not shown by default; to do this, we will click on View | Show Services Node:   As an example, if you want a user to enroll for the cert, they would need CM Enroll on the profile template, as shown in the following screenshot:   Now, this is for users, but let's say you want to delegate the creation of profile templates. For this, all you need to do is delegate to MIMCM-Managers the right to create all child items on the profile template container, as follows:   The management policy permission For the management policy, we will break it down into two sections: a software-based policy and a smart card management policy. As we have different capabilities within CM based on the type, CM comes by default with two sample policies (take a look at the following screenshot), which we duplicate to create a new one. When configuring, it is good to know that you cannot combine software and smart card-based certificates in a policy:   The software management policy The software-based certificate policy has the following policies available through the CM life cycle:

The Duplicate Policy panel creates a duplicate of all the certificates in the current profile. If the first profile is created for the user, all the profiles created afterwards will be considered duplicates, and the first generated policy will be the primary one.
The Enroll Policy panel defines the initial enrollment steps for certificates, such as initiating the enroll request and data collection during enroll initiation.
The Online Update Policy panel is part of the automatic policy function that applies when key items in the policy change. This includes certificates about to expire, or a certificate being added to or removed from the existing profile template.
The Recover Policy panel allows for the recovery of the profile in the event that the user was deleted. This includes cases where certs are deleted by accident. One thing to point out is that if the certificate was a signing cert, the recovery policy will issue a new replacement cert; however, if the cert was used for encryption, you can recover the original using this policy.
The Recover On Behalf Policy panel allows managers or the helpdesk to recover certificates on behalf of the user in the event that they need any of the certificates.
The Renew Policy panel is the workflow that defines the renewal settings, such as revocation and who can initiate a request.
The Suspend and Reinstate Policy panel enables a temporary revocation of the profile, placing it in a "certificate hold" status. More information about the CRL status can be found at http://bit.ly/MIMCMCertificateStatus.
The Revoke Policy panel maintains the revocation policy and settings, such as being able to set the revocation reason and delay. It also allows the system to push a delta CRL. You can also define the initiators for this policy workflow.

The smart card management policy The smart card policy has some similarities to the software-based policy, but it also has a few new workflows to manage the full life cycle of the smart card:   The Profile Details panel is by far the most commonly used part in this section of the policy, as it defines all the smart card certificates that will be loaded in the policy along with the type of provider. One key item is creating and destroying virtual smart cards. One final key part is diversifying the admin key; this is a best practice, as it secures the admin PIN using diversification. So, before we continue, we want to go over this setting, as we think it is an important topic. Diversifying the admin key is important because each card or batch of cards comes with a default admin key. Smart cards may have several PINs: an admin PIN, a PIN unlock key (PUK), and a user PIN. This admin key, as CM refers to it, is also known as the administrator PIN. This PIN differs from the user's PIN. When personalizing the smart card, you configure the admin key, the PUK, and the user's PIN. The admin key and the PUK can both be used to reset the virtual smart card's PIN; however, they cannot both be used. If you assign a PUK during the virtual smart card's creation, you must use the PUK to unlock the PIN. It is important to note that you must use the PUK to reset the PIN if you provide both a PUK and an admin key. During the configuration of the profile template, you will be asked to enter this key as follows:   The admin key is typically used by smart card management solutions that enable a challenge-response approach to PIN unlocking. The card provides a set of random data that the user reads (after the verification of identity) to the deployment admin. The admin then encrypts the data with the admin key (obtained as mentioned before) and gives the encrypted data back to the user. If the encrypted data matches that produced by the card during verification, the card will allow PIN resetting. As the admin key is never in the hands of anyone other than the deployment administrator, it cannot be intercepted or recorded by any other party (including the employee) and thus has significant security benefits beyond those of using a PUK, an important consideration during the personalization process. When enabled, the admin key is set to a card-unique value when the card is assigned to the user. The option to diversify admin keys with the default initialization provider allows MIM CM to use an algorithm to uniquely generate a new key on the card. The key is encrypted and securely transmitted to the client. It is not stored in the database or anywhere else; MIM CM recalculates the key as needed to manage the card:   The CM profile template contains a thumbprint for the certificate to be used in admin key diversification. CM looks in the personal store of the CM agent service account for the private key of the certificate in the profile template.
Once located, the private key is used to calculate the admin key for the smart card. The admin key allows CM to manage the smart card (issuing, revoking, retiring, renewing, and so on). Loss of the private key prevents the management of cards diversified using this certificate. More detail on this control can be found at http://bit.ly/MIMCMDiversifyAdminKey. Continuing on:

The Disable Policy panel defines the termination of the smart card before expiration; you can define the reason if you choose. Once disabled, the card cannot be reused in the environment.
The Duplicate Policy panel, similarly to the software-based one, produces a duplicate of all the certificates that will be on the smart card.
The Enroll Policy panel, similarly to the software policy, defines who can initiate the workflow and the printing options.
The Online Update Policy panel, similarly to the software-based cert, allows for the updating of certificates if the profile template is updated. The update is triggered when a renewal happens or, as with the software policy, when a cert is added or removed.
The Offline Unblock Policy panel is the configuration of a process that allows offline unblocking. This is used when a user is not connected to the network. This process only supports Microsoft-based smart cards, with challenge questions and answers exchanged via, in most cases, the user calling the helpdesk.
The Recover On Behalf Policy panel allows management or the business to recover certificates when a cert is needed to decrypt information from a user whose contract was terminated or who left the company.
The Replace Policy panel is used to replace a user's certificates in the event that they lose their card. If the lost card had a signing cert, a new signing cert is issued on the new card. As with software certs, if the certificate type is encryption, it will need to be restored through the replace policy.
The Renew Policy panel is used when the profile/certificate is in the renewal period; it defines revocation details and options as well as initiate permissions.
The Suspend and Reinstate Policy panel is the same as in the software-based policy, putting the certificate on hold.
The Retire Policy panel is similar to the disable policy, but a key difference is that this policy allows the card to be reused within the environment.
The Unblock Policy panel defines the users that can perform an actual unblocking of a smart card.

More in-depth detail on these policies can be found at http://bit.ly/MIMCMProfiletempates. Summary In this article, we uncovered the basics of certificate management and the management components that are required to successfully deploy a CM solution. Then, we discussed and outlined the agent accounts and the roles they play. Finally, we looked into the management permission model, from the policy template to the permissions and the workflow. Resources for Article: Further resources on this subject: Managing Network Devices [article] Logging and Monitoring [article] Creating Horizon Desktop Pools [article]
article-image-reactive-programming-c
Packt
18 Jul 2016
30 min read
Save for later

Reactive Programming with C#

In this article by Antonio Esposito, from the book Reactive Programming for .NET Developers, we will see a practical example of what reactive programming is, with pure C# coding. The following topics will be discussed here:

IObserver interface
IObservable interface
Subscription life cycle
Sourcing events
Filtering events
Correlating events
Sourcing from CLR streams
Sourcing from CLR enumerables

(For more resources related to this topic, see here.) IObserver interface This core-level interface is available within the Base Class Library (BCL) of .NET 4.0 and is available for the older 3.5 as an add-on. The usage is pretty simple, and the goal is to provide a standard way of handling the most basic features of any reactive message consumer. Reactive messages flow from a producer to a consumer, which subscribes for some messages. The IObserver C# interface is available to construct message receivers that comply with the reactive programming layout by implementing the three main message-oriented events: a message received, an error received, and a task completed message. The IObserver interface has the following signature and description:

// Summary:
//     Provides a mechanism for receiving push-based notifications.
//
// Type parameters:
//   T:
//     The object that provides notification information. This type parameter is
//     contravariant. That is, you can use either the type you specified or any
//     type that is less derived. For more information about covariance and contravariance,
//     see Covariance and Contravariance in Generics.
public interface IObserver<in T>
{
    // Summary:
    //     Notifies the observer that the provider has finished sending push-based notifications.
    void OnCompleted();
    //
    // Summary:
    //     Notifies the observer that the provider has experienced an error condition.
    //
    // Parameters:
    //   error:
    //     An object that provides additional information about the error.
    void OnError(Exception error);
    //
    // Summary:
    //     Provides the observer with new data.
    //
    // Parameters:
    //   value:
    //     The current notification information.
    void OnNext(T value);
}

Any new message flowing to a receiver implementing such an interface will reach the OnNext method. Any error will reach the OnError method, while the task completed acknowledgement message will reach the OnCompleted method. The usage of an interface means that we cannot use generic premade objects from the BCL. We need to implement any receiver from scratch by using such an interface as a service contract. Let's see an example, because talking about code is always simpler than talking about something theoretical. The following example shows how to read console commands from a user in a reactive way:

class Program
{
    static void Main(string[] args)
    {
        //creates a new console input consumer
        var consumer = new ConsoleTextConsumer();
        while (true)
        {
            Console.WriteLine("Write some text and press ENTER to send a message\r\nPress ENTER to exit");
            //read console input
            var input = Console.ReadLine();
            //check for an empty message to exit
            if (string.IsNullOrEmpty(input))
            {
                //job completed
                consumer.OnCompleted();
                Console.WriteLine("Task completed. Any further message will generate an error");
            }
            else
            {
                //route the message to the consumer
                consumer.OnNext(input);
            }
        }
    }
}

public class ConsoleTextConsumer : IObserver<string>
{
    private bool finished = false;

    public void OnCompleted()
    {
        if (finished)
        {
            OnError(new Exception("This consumer already finished its lifecycle"));
            return;
        }
        finished = true;
        Console.WriteLine("<- END");
    }

    public void OnError(Exception error)
    {
        Console.WriteLine("<- ERROR");
        Console.WriteLine("<- {0}", error.Message);
    }

    public void OnNext(string value)
    {
        if (finished)
        {
            OnError(new Exception("This consumer finished its lifecycle"));
            return;
        }
        //shows the received message
        Console.WriteLine("-> {0}", value);
        //do something
        //ack the caller
        Console.WriteLine("<- OK");
    }
}

The preceding example shows the IObserver interface usage within the ConsoleTextConsumer class, which simply asks a command console (DOS-like) for user input text to do something with. In this implementation, the class simply writes out the input text, because we only want to look at the reactive implementation. The first important concept here is that a message consumer knows nothing about how messages are produced. The consumer simply reacts to one of the three events (not CLR events). Besides this, some kind of logic and cross-event ability is also available within the consumer itself. In the preceding example, we can see that the consumer simply shows any received message on the console again. However, a complete message puts the consumer in a finished state (by signaling the finished flag), so any other message that comes to the OnNext method will be automatically routed to the error handler. Likewise, any other complete message that reaches the consumer will produce another error once the consumer is already in the finished state. IObservable interface The IObservable interface, the opposite of the IObserver interface, has the task of handling message production and the observer subscription. It routes valid messages to the OnNext message handler and errors to the OnError message handler. At the end of its life cycle, it acknowledges all the observers on the OnCompleted message handler. To create a valid reactive observable interface, we must write something that does not lock against user input or any other external system input data. The observable object acts as an infinite message generator, something like an infinite enumerable of messages, although in such cases, there is no enumeration. Once a new message is somehow available, the observable routes it to all the subscribers. In the following example, we will create a console application that asks the user for an integer number and then routes that number to all the subscribers. If the given input is not a number, an error will be routed to all the subscribers instead. This observer is similar to the one already seen in the previous example.
Take a look at the following code:

/// <summary>
/// Consumes numeric values that divide without remainder by a given number
/// </summary>
public class IntegerConsumer : IObserver<int>
{
    readonly int validDivider;

    //the constructor asks for a divider
    public IntegerConsumer(int validDivider)
    {
        this.validDivider = validDivider;
    }

    private bool finished = false;

    public void OnCompleted()
    {
        if (finished)
            OnError(new Exception("This consumer already finished its lifecycle"));
        else
        {
            finished = true;
            Console.WriteLine("{0}: END", GetHashCode());
        }
    }

    public void OnError(Exception error)
    {
        Console.WriteLine("{0}: {1}", GetHashCode(), error.Message);
    }

    public void OnNext(int value)
    {
        if (finished)
            OnError(new Exception("This consumer finished its lifecycle"));
        //the simple business logic is made by checking the division result
        else if (value % validDivider == 0)
            Console.WriteLine("{0}: {1} divisible by {2}", GetHashCode(), value, validDivider);
    }
}

This observer consumes integer messages, but it requires that the number is divisible by another one without producing any remainder. Because of the encapsulation principle, this logic lives within the observer object. The observable, instead, only has the logic of sending valid or error messages. The filtering logic is implemented within the receiver itself. Although that is not wrong as such, in more complex applications, specific filtering features are available in the publish-subscribe communication pipeline. In other words, another object will sit between the observable (publisher) and the observer (subscriber) and act as a message filter. Back to our numeric example: here we have the observable implementation, made using an inner Task method that does the main job of parsing input text and sending messages.
In addition, a cancellation token is available to handle the user cancellation request and an eventual observable dispose:

//Observable able to parse strings from the Console
//and route numeric messages to all subscribers
public class ConsoleIntegerProducer : IObservable<int>, IDisposable
{
    //the subscriber list
    private readonly List<IObserver<int>> subscriberList = new List<IObserver<int>>();
    //the cancellation token source for starting/stopping
    //the inner observable working thread
    private readonly CancellationTokenSource cancellationSource;
    //the cancellation flag
    private readonly CancellationToken cancellationToken;
    //the running task that runs the inner running thread
    private readonly Task workerTask;

    public ConsoleIntegerProducer()
    {
        cancellationSource = new CancellationTokenSource();
        cancellationToken = cancellationSource.Token;
        workerTask = Task.Factory.StartNew(OnInnerWorker, cancellationToken);
    }

    //add another observer to the subscriber list
    public IDisposable Subscribe(IObserver<int> observer)
    {
        if (subscriberList.Contains(observer))
            throw new ArgumentException("The observer is already subscribed to this observable");
        Console.WriteLine("Subscribing for {0}", observer.GetHashCode());
        subscriberList.Add(observer);
        return null;
    }

    //this code executes the observable infinite loop
    //and routes messages to all observers on the valid
    //message handler
    private void OnInnerWorker()
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            var input = Console.ReadLine();
            int value;
            foreach (var observer in subscriberList)
                if (string.IsNullOrEmpty(input))
                    break;
                else if (input.Equals("EXIT"))
                {
                    cancellationSource.Cancel();
                    break;
                }
                else if (!int.TryParse(input, out value))
                    observer.OnError(new FormatException("Unable to parse given value"));
                else
                    observer.OnNext(value);
        }
        cancellationToken.ThrowIfCancellationRequested();
    }

    //cancel the main task and ack all observers
    //by sending the OnCompleted message
    public void Dispose()
    {
        if (!cancellationSource.IsCancellationRequested)
        {
            cancellationSource.Cancel();
            while (!workerTask.IsCanceled)
                Thread.Sleep(100);
        }
        cancellationSource.Dispose();
        workerTask.Dispose();
        foreach (var observer in subscriberList)
            observer.OnCompleted();
    }

    //wait until the main task completes or is cancelled
    public void Wait()
    {
        while (!(workerTask.IsCompleted || workerTask.IsCanceled))
            Thread.Sleep(100);
    }
}

To complete the example, here is the program Main:

static void Main(string[] args)
{
    //this is the message observable responsible for producing messages
    using (var observable = new ConsoleIntegerProducer())
    //these are the message observers that consume messages
    using (var consumer1 = observable.Subscribe(new IntegerConsumer(2)))
    using (var consumer2 = observable.Subscribe(new IntegerConsumer(3)))
    using (var consumer3 = observable.Subscribe(new IntegerConsumer(5)))
        observable.Wait();
    Console.WriteLine("END");
    Console.ReadLine();
}

The cancellationToken.ThrowIfCancellationRequested call may raise an exception in Visual Studio when debugging. Simply go on by pressing F5, or test this code example without the attached debugger by starting it with Ctrl + F5 instead of F5 alone. The application simply creates an observable that is able to parse user data. It then registers three observers, specifying for each one the wanted valid divider value. The observable will then start reading user data from the console, and valid or error messages will flow to all the observers.
Each observer will apply its internal logic of showing the message when it is divisible by the related divider. Here is the result of executing the application: Observables and observers in action Subscription life cycle What will happen if we want to stop a single observer from receiving messages from the observable event source? If we change the program Main from the preceding example to the following one, we can experience a flawed observer life cycle design. Here's the code:

//this is the message observable responsible for producing messages
using (var observable = new ConsoleIntegerProducer())
//these are the message observers that consume messages
using (var consumer1 = observable.Subscribe(new IntegerConsumer(2)))
using (var consumer2 = observable.Subscribe(new IntegerConsumer(3)))
{
    using (var consumer3 = observable.Subscribe(new IntegerConsumer(5)))
    {
        //internal lifecycle
    }
    observable.Wait();
}
Console.WriteLine("END");
Console.ReadLine();

Here is the result in the output console: The third observer unable to catch value messages By using the using construct, we should stop the life cycle of the consumer object. However, we do not, because in the previous example, the Subscribe method of the observable simply returns a NULL object. To create a valid observer, we must handle and design its life cycle management. This means that we must eventually handle the external disposing of the Subscribe method's result by signaling the right observer that its life cycle has reached the end. We have to create a Subscription class that handles an eventual object disposal in the right reactive way, by sending the message to the OnCompleted event handler. Here is a simple Subscription class implementation:

/// <summary>
/// Handles the observer subscription lifecycle
/// </summary>
public sealed class Subscription<T> : IDisposable
{
    private readonly IObserver<T> observer;

    public Subscription(IObserver<T> observer)
    {
        this.observer = observer;
    }

    //the event signalling that the observer has
    //completed its lifecycle
    public event EventHandler<IObserver<T>> OnCompleted;

    public void Dispose()
    {
        if (OnCompleted != null)
            OnCompleted(this, observer);
        observer.OnCompleted();
    }
}

The usage is within the observable Subscribe method. Here's an example:

//add another observer to the subscriber list
public IDisposable Subscribe(IObserver<int> observer)
{
    if (observerList.Contains(observer))
        throw new ArgumentException("The observer is already subscribed to this observable");
    Console.WriteLine("Subscribing for {0}", observer.GetHashCode());
    observerList.Add(observer);
    //creates a new subscription for the given observer
    var subscription = new Subscription<int>(observer);
    //handle the subscription lifecycle end event
    subscription.OnCompleted += OnObserverLifecycleEnd;
    return subscription;
}

void OnObserverLifecycleEnd(object sender, IObserver<int> e)
{
    var subscription = sender as Subscription<int>;
    //remove the observer from the internal list within the observable
    observerList.Remove(e);
    //remove the handler from the subscription event
    //once already handled
    subscription.OnCompleted -= OnObserverLifecycleEnd;
}

As visible, the preceding example creates a new Subscription<T> object to handle the observer's life cycle through the IDisposable.Dispose method. Here is the result of these code edits applied to the full example from the previous paragraph: The observers end their life as we dispose of their life cycle tokens This time, an observer ends its life cycle prematurely because we dispose of the subscription object.
This is visible by the first END message. Afterwards, only two observers remain available until the application ends; when the user asks for EXIT, those two observers end their life cycle by themselves rather than through the Subscription disposal. In real-world applications, observers often subscribe to observables and later unsubscribe by disposing of the Subscription token. This happens because we do not always want a reactive module to handle all the messages. In this case, this means that we have to handle the observer life cycle by ourselves, as we already did in the previous examples, or we need to apply filters to choose which messages flow to which subscriber, as shown in the later section Filtering events. Kindly consider that although filters make things easier, we will always have to handle the observer life cycle. Sourcing events Sourcing events is the ability to obtain, from a particular source, the few useful events usable in reactive programming. Reactive programming is all about event message handling. Any event is a specific occurrence of some kind of handleable behavior of users or external systems. We can program event reactions in the most pleasant and productive way for reaching our software goals. In the following example, we will see how to react to CLR events. In this specific case, we will handle filesystem events by using events from the System.IO.FileSystemWatcher class, which gives us the ability to react to the filesystem's file changes without making useless and resource-consuming polling queries against the filesystem status. Here's the observer and observable implementation:

public sealed class NewFileSavedMessagePublisher : IObservable<string>, IDisposable
{
    private readonly FileSystemWatcher watcher;

    public NewFileSavedMessagePublisher(string path)
    {
        //creates a new file system event router
        this.watcher = new FileSystemWatcher(path);
        //register for handling the File Created event
        this.watcher.Created += OnFileCreated;
        //enable event routing
        this.watcher.EnableRaisingEvents = true;
    }

    //signal all observers that a new file arrived
    private void OnFileCreated(object sender, FileSystemEventArgs e)
    {
        foreach (var observer in subscriberList)
            observer.OnNext(e.FullPath);
    }

    //the subscriber list
    private readonly List<IObserver<string>> subscriberList = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        //register the new observer
        subscriberList.Add(observer);
        return null;
    }

    public void Dispose()
    {
        //disable file system event routing
        this.watcher.EnableRaisingEvents = false;
        //deregister from the watcher event handler
        this.watcher.Created -= OnFileCreated;
        //dispose the watcher
        this.watcher.Dispose();
        //signal all observers that the job is done
        foreach (var observer in subscriberList)
            observer.OnCompleted();
    }
}

/// <summary>
/// A tremendously basic implementation
/// </summary>
public sealed class NewFileSavedMessageSubscriber : IObserver<string>
{
    public void OnCompleted()
    {
        Console.WriteLine("-> END");
    }

    public void OnError(Exception error)
    {
        Console.WriteLine("-> {0}", error.Message);
    }

    public void OnNext(string value)
    {
        Console.WriteLine("-> {0}", value);
    }
}

The observer simply writes text to the console; there is nothing more to say about it. The observable, on the other hand, does most of the job in this implementation. It creates the watcher object and registers the right event handler to catch the wanted reactive events.
It handles its own life cycle and that of the internal watcher object. Then, it correctly sends the OnCompleted message to all the observers. Here's the program's initialization:

static void Main(string[] args)
{
    Console.WriteLine("Watching for new files");
    using (var publisher = new NewFileSavedMessagePublisher(@"[WRITE A PATH HERE]"))
    using (var subscriber = publisher.Subscribe(new NewFileSavedMessageSubscriber()))
    {
        Console.WriteLine("Press RETURN to exit");
        //wait for user RETURN
        Console.ReadLine();
    }
}

Any new file that appears in the folder will route its full file name to the observer. This is the result of copying and pasting the same file three times:

-> [YOUR PATH]out - Copy.png
-> [YOUR PATH]out - Copy (2).png
-> [YOUR PATH]out - Copy (3).png

With a single observable and a single observer, the power of reactive programming is not so evident. Let's begin writing some intermediate objects to change the message flow within the pipeline of our reactive message pump: filters, message correlators, and dividers. Filtering events As said in the previous section, it is time to alter the message flow. The observable has the task of producing messages, while the observer, at the opposite end, consumes them. To create a message filter, we need an object that is a publisher and a subscriber at the same time. The implementation must take the filtering need into consideration and route messages to the underlying observers, which subscribe to the filter observable object instead of the main one. Here's an implementation of the filter:

/// <summary>
/// The filtering observable/observer
/// </summary>
public sealed class StringMessageFilter : IObservable<string>, IObserver<string>, IDisposable
{
    private readonly string filter;

    public StringMessageFilter(string filter)
    {
        this.filter = filter;
    }

    //the observer collection
    private readonly List<IObserver<string>> observerList = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        this.observerList.Add(observer);
        return null;
    }

    //a simple implementation
    //that disables message routing once
    //OnCompleted has been invoked
    private bool hasCompleted = false;

    public void OnCompleted()
    {
        hasCompleted = true;
        foreach (var observer in observerList)
            observer.OnCompleted();
    }

    //routes error messages until completed
    public void OnError(Exception error)
    {
        if (!hasCompleted)
            foreach (var observer in observerList)
                observer.OnError(error);
    }

    //routes valid messages until completed
    public void OnNext(string value)
    {
        Console.WriteLine("Filtering {0}", value);
        if (!hasCompleted && value.ToLowerInvariant().Contains(filter.ToLowerInvariant()))
            foreach (var observer in observerList)
                observer.OnNext(value);
    }

    public void Dispose()
    {
        OnCompleted();
    }
}

This filter can be used together with the example from the previous section, which routes the FileSystemWatcher events of created files.
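Because StringMessageFilter is both an observable and an observer, filters of this kind can also be chained one after another. The following is a minimal sketch of such a chain, reusing only the classes defined above; the filter values are made up for illustration:

//publisher -> first filter -> second filter -> final observer
var txtFilter = new StringMessageFilter(".txt");
var reportFilter = new StringMessageFilter("report");
publisher.Subscribe(txtFilter);
txtFilter.Subscribe(reportFilter);
reportFilter.Subscribe(new NewFileSavedMessageSubscriber());

Each link routes only the messages that survive its own check, so the final observer sees only .txt files whose names contain "report".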
This is the new program initialization:

static void Main(string[] args)
{
    Console.WriteLine("Watching for new files");
    using (var publisher = new NewFileSavedMessagePublisher(@"[WRITE A PATH HERE]"))
    using (var filter = new StringMessageFilter(".txt"))
    {
        //subscribe the filter to publisher messages
        publisher.Subscribe(filter);
        //subscribe the console subscriber to the filter
        //instead of directly to the publisher
        filter.Subscribe(new NewFileSavedMessageSubscriber());
        Console.WriteLine("Press RETURN to exit");
        Console.ReadLine();
    }
}

As visible, this new implementation creates a filter object that takes a parameter to verify which filenames may flow to the underlying observers. The filter subscribes to the main observable object, while the observer subscribes to the filter itself. It is like a chain, where each link refers to the next one. This is the output console of the running application: The filtering observer in action Although I made a copy of two files (a .png and a .txt file), we can see that only the text file reached the internal observer object; the image file reached only the OnNext of the filter and, being invalid against the filter argument, never reached the internal observer. Correlating events Sometimes, especially when dealing with integration scenarios, there is the need to correlate multiple events that do not always come together. This is the case of a header file that comes together with multiple body files. In reactive programming, correlating events means correlating multiple observable messages into a single message that is the result of two or more original messages. Such messages must somehow be correlated by a value (an ID, serial, or metadata) that defines that the initial messages belong to the same correlation set. Useful features in real-world correlators are the ability to specify a timeout (which may be infinite too) in the correlation waiting logic and the ability to specify a correlation message count (infinite too). Here's a correlator implementation made for the previous example based on the FileSystemWatcher class:

public sealed class FileNameMessageCorrelator : IObservable<string>, IObserver<string>, IDisposable
{
    private readonly Func<string, string> correlationKeyExtractor;

    public FileNameMessageCorrelator(Func<string, string> correlationKeyExtractor)
    {
        this.correlationKeyExtractor = correlationKeyExtractor;
    }

    //the observer collection
    private readonly List<IObserver<string>> observerList = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        this.observerList.Add(observer);
        return null;
    }

    private bool hasCompleted = false;

    public void OnCompleted()
    {
        hasCompleted = true;
        foreach (var observer in observerList)
            observer.OnCompleted();
    }

    //routes error messages until completed
    public void OnError(Exception error)
    {
        if (!hasCompleted)
            foreach (var observer in observerList)
                observer.OnError(error);
    }

Just a pause. Up to this row, we have simply created the reactive structure of the FileNameMessageCorrelator class by implementing the two main interfaces.
Here is the core implementation that correlates messages:

//the container of correlations, able to contain
//multiple strings per key
private readonly NameValueCollection correlations = new NameValueCollection();

//routes valid messages until completed
public void OnNext(string value)
{
    //check if the subscriber has completed
    if (hasCompleted) return;
    Console.WriteLine("Parsing message: {0}", value);
    //try extracting the correlation ID
    var correlationID = correlationKeyExtractor(value);
    //check if the correlation is available
    if (correlationID == null) return;
    //append the new file name to the correlation state
    correlations.Add(correlationID, value);
    //in this example, we will always consider
    //correlations of two items
    if (correlations.GetValues(correlationID).Count() == 2)
    {
        //once the correlation is complete,
        //read the two files and push the
        //two contents altogether to the
        //observers
        var fileData = correlations.GetValues(correlationID)
            //route messages to the ReadAllText method
            .Select(File.ReadAllText)
            //materialize the query
            .ToArray();
        var newValue = string.Join("|", fileData);
        foreach (var observer in observerList)
            observer.OnNext(newValue);
        correlations.Remove(correlationID);
    }
}

This correlator class accepts a correlation function as a constructor parameter. This function is later used to evaluate the correlationID when a new filename flows into the OnNext method. Once the function returns a valid correlationID, the ID is used as a key for the NameValueCollection, a specialized string collection that stores multiple values per key. When there are two values for the same key, the correlation is ready to flow out to the underlying observers, reading the file data and joining it into a single string message. Here's the application's initialization:

static void Main(string[] args)
{
    using (var publisher = new NewFileSavedMessagePublisher(@"[WRITE A PATH HERE]"))
    //creates a new correlator by specifying the correlation key
    //extraction function, made with a regular expression that
    //extracts a file ID similar to FILEID0001
    using (var correlator = new FileNameMessageCorrelator(ExtractCorrelationKey))
    {
        //subscribe the correlator to publisher messages
        publisher.Subscribe(correlator);
        //subscribe the console subscriber to the correlator
        //instead of directly to the publisher
        correlator.Subscribe(new NewFileSavedMessageSubscriber());
        //wait for user RETURN
        Console.ReadLine();
    }
}

private static string ExtractCorrelationKey(string arg)
{
    var match = Regex.Match(arg, @"(FILEID\d{4})");
    if (match.Success)
        return match.Captures[0].Value;
    else
        return null;
}

The initialization is almost the same as in the filtering example from the previous section. The biggest difference is that the correlator object, instead of a string filter value, accepts a function that analyses the incoming filename and produces the eventually available correlationID. I prepared two files with the same ID in the filename. Here's the console output of the running example: Two files correlated by their name As visible, the correlator did its job by joining the two files' data into a single message, regardless of the order in which the two files were stored in the filesystem. These examples regarding the filtering and correlation of messages should give you the idea that we can do anything with received messages. We can put a message on standby until a correlated message comes, we can join multiple messages into one, we can produce the same message multiple times, and so on.
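As a tiny illustration of that last point, here is a minimal sketch of a pipeline stage that produces every incoming message twice. This class is not part of the original example; it is my own sketch following the same observable/observer pattern used above, with the subscription life cycle omitted as in the previous listings:

//a pass-through stage that emits every message twice
public sealed class StringMessageDuplicator : IObservable<string>, IObserver<string>
{
    private readonly List<IObserver<string>> observerList = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        observerList.Add(observer);
        //subscription lifecycle omitted for brevity
        return null;
    }

    public void OnCompleted()
    {
        foreach (var observer in observerList)
            observer.OnCompleted();
    }

    public void OnError(Exception error)
    {
        foreach (var observer in observerList)
            observer.OnError(error);
    }

    public void OnNext(string value)
    {
        //route the same message twice to every subscriber
        foreach (var observer in observerList)
        {
            observer.OnNext(value);
            observer.OnNext(value);
        }
    }
}

It plugs between a publisher and a subscriber exactly like the filter and the correlator do.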
This programming style opens the programmer's mind to lots of new application designs and possibilities. Sourcing from CLR streams Any class that extends System.IO.Stream is some kind of cursor-based flow of data. The same happens when we watch a video stream, a sort of locally non-persisted data that flows only on the network, with the ability to go forward and backward, stop, pause, resume, play, and so on. The same behavior is available while streaming any kind of data; thus, the Stream class is the base class that exposes such behavior for any need. There are specialized classes that extend Stream, helping work with streams of text data (StreamWriter and StreamReader), binary serialized data (BinaryReader and BinaryWriter), memory-based temporary byte containers (MemoryStream), network-based streams (NetworkStream), and lots of others. Regarding reactive programming, we are dealing with the ability to source events from any stream regardless of its kind (network, file, memory, and so on). Real-world applications that use reactive programming based on streams are chats, remote binary listeners (socket programming), and any other unpredictable event-oriented applications. On the other hand, it is useless to read a huge file in a reactive way, because there is simply nothing reactive in such cases. It is time to look at an example. Here's a complete example of a reactive application made for listening on a TCP port and routing string messages (CR + LF divides multiple messages) to all the available observers. The program Main and the usual ConsoleObserver methods are omitted for better readability:

public sealed class TcpListenerStringObservable : IObservable<string>, IDisposable
{
    private readonly TcpListener listener;

    public TcpListenerStringObservable(int port, int backlogSize = 64)
    {
        //creates a new tcp listener on the given port
        //with the given backlog size
        listener = new TcpListener(IPAddress.Any, port);
        listener.Start(backlogSize);
        //start listening asynchronously
        listener.AcceptTcpClientAsync().ContinueWith(OnTcpClientConnected);
    }

    private void OnTcpClientConnected(Task<TcpClient> clientTask)
    {
        //if the task has not encountered errors
        if (clientTask.IsCompleted)
            //we will handle a single client connection at a time;
            //to handle multiple connections, simply put the following
            //code into a Task
            using (var tcpClient = clientTask.Result)
            using (var stream = tcpClient.GetStream())
            using (var reader = new StreamReader(stream))
                while (tcpClient.Connected)
                {
                    //read the message
                    var line = reader.ReadLine();
                    //stop listening if nothing is available
                    if (string.IsNullOrEmpty(line))
                        break;
                    else
                    {
                        //construct the observer message adding the client's remote endpoint address and port
                        var msg = string.Format("{0}: {1}", tcpClient.Client.RemoteEndPoint, line);
                        //route messages
                        foreach (var observer in observerList)
                            observer.OnNext(msg);
                    }
                }
        //start another client listener
        listener.AcceptTcpClientAsync().ContinueWith(OnTcpClientConnected);
    }

    private readonly List<IObserver<string>> observerList = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        observerList.Add(observer);
        //subscription lifecycle missing
        //for readability purposes
        return null;
    }

    public void Dispose()
    {
        //stop the listener
        listener.Stop();
    }
}

The preceding example shows how to create a reactive TCP listener that acts as an observable of string messages. The observable uses an internal TcpListener class that provides mid-level network services across an underlying Socket object.
The example asks the listener to start listening and waits for a client on another thread through the usage of a Task object. When a remote client becomes available, its communication with the internals of the observable is handled by the OnTcpClientConnected method, which verifies the normal execution of the Task. It then takes the TcpClient from the Task, reads the network stream, and attaches a StreamReader to that network stream to start reading. Once the message reading is complete, another Task starts, repeating the procedure. Although this design handles a backlog of pending connections, it serves only a single client at a time. To change the design to handle multiple connections together, simply encapsulate the OnTcpClientConnected logic in a new Task. Here's an example:

private void OnTcpClientConnected(Task<TcpClient> clientTask)
{
    //if the task has not encountered errors
    if (clientTask.IsCompleted)
        Task.Factory.StartNew(() =>
        {
            using (var tcpClient = clientTask.Result)
            using (var stream = tcpClient.GetStream())
            using (var reader = new StreamReader(stream))
                while (tcpClient.Connected)
                {
                    //read the message
                    var line = reader.ReadLine();
                    //stop listening if nothing is available
                    if (string.IsNullOrEmpty(line))
                        break;
                    else
                    {
                        //construct the observer message adding the client's remote endpoint address and port
                        var msg = string.Format("{0}: {1}", tcpClient.Client.RemoteEndPoint, line);
                        //route messages
                        foreach (var observer in observerList)
                            observer.OnNext(msg);
                    }
                }
        }, TaskCreationOptions.PreferFairness);
    //start another client listener
    listener.AcceptTcpClientAsync().ContinueWith(OnTcpClientConnected);
}

This is the output of the reactive application when it receives two different connections using telnet as a client (C:\>telnet localhost 8081). The program Main and the usual ConsoleObserver methods are omitted for better readability: The observable routing events from the telnet client As you can see, each client connects to the listener from a different remote port. This gives us the ability to differentiate multiple remote connections even when they connect at the same time. Sourcing from CLR enumerables Sourcing from a finite collection is rather useless with regard to reactive programming. By contrast, certain enumerable collections are perfect for reactive usage. These are the changeable collections, which support collection change notifications by implementing the INotifyCollectionChanged (System.Collections.Specialized) interface, like the ObservableCollection (System.Collections.ObjectModel) class, and any infinite collection that supports the enumerator pattern with the usage of the yield keyword. Changeable collections The ObservableCollection<T> class gives us the ability to understand, in an event-based way, any change that occurs to the collection's content. Kindly consider that changes regarding collection child properties are outside of the collection's scope. This means that we are notified only of collection changes, like the ones produced by the Add or Remove methods. Changes within a single item do not produce an alteration of the collection size; thus, they are not notified at all.
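To make this caveat concrete, here is a small sketch of the difference; the Person class here is a made-up type used only for this illustration:

//a tiny mutable type used only for this demonstration
public class Person { public string Name { get; set; } }

var people = new ObservableCollection<Person>();
people.CollectionChanged += (s, e) => Console.WriteLine("-> {0}", e.Action);
people.Add(new Person { Name = "Ada" }); //prints "-> Add"
people[0].Name = "Grace";                //prints nothing: item-level changes are not observed

With that caveat in mind, let's look at the collection-level notifications we do get.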
Here's a generic (nonreactive) example:

static void Main(string[] args)
{
    //the observable collection
    var collection = new ObservableCollection<string>();
    //register a handler to catch collection changes
    collection.CollectionChanged += OnCollectionChanged;
    collection.Add("ciao");
    collection.Add("hahahah");
    collection.Insert(0, "new first line");
    collection.RemoveAt(0);
    Console.WriteLine("Press RETURN to EXIT");
    Console.ReadLine();
}

private static void OnCollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
{
    var collection = sender as ObservableCollection<string>;
    if (e.NewStartingIndex >= 0)
        //adding new items
        Console.WriteLine("-> {0} {1}", e.Action, collection[e.NewStartingIndex]);
    else
        //removing items
        Console.WriteLine("-> {0} at {1}", e.Action, e.OldStartingIndex);
}

As visible, the collection notifies all the adding operations, giving us the ability to catch the new message. The Insert method signals an Add operation, although with the Insert method we can specify the index, and the value will be available within the collection. Obviously, the parameter containing the index value (e.NewStartingIndex) contains the new index according to the operation performed. Differently, the Remove operation, although notifying the removed element's index, cannot give us the ability to read the original message before the removal, because the event triggers after the remove operation has already occurred. In a real-world reactive application, the most interesting operation against an ObservableCollection is the Add operation. Here's an example (console observer omitted for better readability):

class Program
{
    static void Main(string[] args)
    {
        //the observable collection
        var collection = new ObservableCollection<string>();
        using (var observable = new NotifiableCollectionObservable(collection))
        using (var observer = observable.Subscribe(new ConsoleStringObserver()))
        {
            collection.Add("ciao");
            collection.Add("hahahah");
            collection.Insert(0, "new first line");
            collection.RemoveAt(0);
            Console.WriteLine("Press RETURN to EXIT");
            Console.ReadLine();
        }
    }

    public sealed class NotifiableCollectionObservable : IObservable<string>, IDisposable
    {
        private readonly ObservableCollection<string> collection;

        public NotifiableCollectionObservable(ObservableCollection<string> collection)
        {
            this.collection = collection;
            this.collection.CollectionChanged += collection_CollectionChanged;
        }

        //routes only added items to the observers
        private void collection_CollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
        {
            if (e.Action == NotifyCollectionChangedAction.Add)
                foreach (var observer in observerList)
                    observer.OnNext(e.NewItems[0] as string);
        }

        private readonly List<IObserver<string>> observerList = new List<IObserver<string>>();

        public IDisposable Subscribe(IObserver<string> observer)
        {
            observerList.Add(observer);
            //subscription lifecycle missing
            //for readability purposes
            return null;
        }

        public void Dispose()
        {
            this.collection.CollectionChanged -= collection_CollectionChanged;
            foreach (var observer in observerList)
                observer.OnCompleted();
        }
    }
}

The result is the same as in the previous example about ObservableCollection without the reactive objects. The only difference is that the observable routes messages only when the Action value is Add. The ObservableCollection signaling its content changes Infinite collections Our last example regards sourcing events from an infinite collection method. In C#, it is possible to implement the enumerator pattern, signaling each object to enumerate one at a time, thanks to the yield keyword.
Here's an example:

static void Main(string[] args)
{
    foreach (var value in EnumerateValuesFromSomewhere())
        Console.WriteLine(value);
}

static IEnumerable<string> EnumerateValuesFromSomewhere()
{
    var random = new Random(DateTime.Now.GetHashCode());
    while (true) //forever
    {
        //returns a random integer number as a string
        yield return random.Next().ToString();
        //some throttling time
        Thread.Sleep(100);
    }
}

This implementation is powerful because it never materializes all the values into memory. It simply signals that a new object is available to the enumerator, which the foreach structure internally uses by itself. The result is that numbers are written to the output console forever. This behavior is useful for reactive usage, because it never creates useless state, like a temporary array, list, or generic collection. It simply signals new items available to the enumerable. Here's an example:

public sealed class EnumerableObservable : IObservable<string>, IDisposable
{
    private readonly IEnumerable<string> enumerable;

    public EnumerableObservable(IEnumerable<string> enumerable)
    {
        this.enumerable = enumerable;
        this.cancellationSource = new CancellationTokenSource();
        this.cancellationToken = cancellationSource.Token;
        this.workerTask = Task.Factory.StartNew(() =>
        {
            foreach (var value in this.enumerable)
            {
                //if task cancellation triggers, raise the proper exception
                //to stop task execution
                cancellationToken.ThrowIfCancellationRequested();
                foreach (var observer in observerList)
                    observer.OnNext(value);
            }
        }, this.cancellationToken);
    }

    //the cancellation token source for starting/stopping
    //the inner observable working thread
    private readonly CancellationTokenSource cancellationSource;
    //the cancellation flag
    private readonly CancellationToken cancellationToken;
    //the running task that runs the inner running thread
    private readonly Task workerTask;
    //the observer list
    private readonly List<IObserver<string>> observerList = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        observerList.Add(observer);
        //subscription lifecycle missing
        //for readability purposes
        return null;
    }

    public void Dispose()
    {
        //trigger task cancellation
        //and wait for acknowledgement
        if (!cancellationSource.IsCancellationRequested)
        {
            cancellationSource.Cancel();
            while (!workerTask.IsCanceled)
                Thread.Sleep(100);
        }
        cancellationSource.Dispose();
        workerTask.Dispose();
        foreach (var observer in observerList)
            observer.OnCompleted();
    }
}

This is the code of the program startup with the infinite enumerable generation:

class Program
{
    static void Main(string[] args)
    {
        //we create a variable containing the enumerable;
        //this does not trigger item retrieval,
        //so the enumerator does not begin flowing data
        var enumerable = EnumerateValuesFromSomewhere();
        using (var observable = new EnumerableObservable(enumerable))
        using (var observer = observable.Subscribe(new ConsoleStringObserver()))
        {
            //wait for 2 seconds, then exit
            Thread.Sleep(2000);
        }
        Console.WriteLine("Press RETURN to EXIT");
        Console.ReadLine();
    }

    static IEnumerable<string> EnumerateValuesFromSomewhere()
    {
        var random = new Random(DateTime.Now.GetHashCode());
        while (true) //forever
        {
            //returns a random integer number as a string
            yield return random.Next().ToString();
            //some throttling time
            Thread.Sleep(100);
        }
    }
}

Unlike the previous examples, here we have the usage of the Task class.
This example shows a tremendously powerful feature: the ability to yield values without having to source them from a concrete (finite) array or collection, simply by implementing the enumerator pattern. Although rarely used, the yield operator gives us the ability to create complex applications simply by pushing messages between methods. The more methods we create that send messages to each other, the more complex the business logic our application can handle. Consider the ability to catch all such messages with observables, and you have some idea of how powerful reactive programming can be for a developer. The short sketch below illustrates this message-passing style.
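As a taste of that style, here is a small sketch, with a method name of our own invention, in which a second iterator filters the values produced by the EnumerateValuesFromSomewhere method shown earlier; each value flows through the chain one at a time, without any intermediate collection:

    static void Main(string[] args)
    {
        //each random value flows through the filter as soon as it is yielded
        foreach (var value in OnlyShortValues(EnumerateValuesFromSomewhere()))
            Console.WriteLine(value);
    }

    //a second iterator that passes along only the shorter messages
    static IEnumerable<string> OnlyShortValues(IEnumerable<string> source)
    {
        foreach (var value in source)
            if (value.Length <= 5)
                yield return value;
    }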
Summary

In this article, we had the opportunity to test the main features that any reactive application must implement: message sending, error sending, and completion acknowledgement. We focused on plain C# programming to give a first overview of how classic reactive designs can be applied to the main application needs, such as sourcing events from streams, from user input, and from changeable and infinite collections.

Getting Started with Packages in R

Joel Carlson
18 Jul 2016
6 min read
R is a powerful programming language for loading, manipulating, transforming, and visualizing data. The language is made more powerful by its extensibility, in conjunction with the efforts of a highly active open source community. This community constantly contributes to the language in the form of packages, which are, at their core, sets of thematically linked functions. By leveraging the work that has gone into the creation of useful open source packages, an R user can substantially improve both the readability and the efficiency of their code. In this post, you will learn how to install new packages to extend the functionality of R and how to load those packages into your session. We will also explore some of the most useful packages that have been contributed by the R community!

Installing Packages

There are a number of places where R packages can be stored, but the three most popular locations are CRAN, Bioconductor, and GitHub.

CRAN

The Comprehensive R Archive Network is the home of R. At the time of this writing, there are over 8,000 packages hosted on CRAN, all of which are free to download and use. If you are looking to get started with using R in your field but don't know exactly where to start, the CRAN task view for your field or area of interest is likely a good place to begin. There you will find listings of relevant packages, along with short descriptions and links to source code. Let's say you've entered the "Reproducible Research" task view and have decided that the package named knitr sounds useful. To install knitr from CRAN, you type this in your R console:

    install.packages("knitr")

Bioconductor

Bioconductor is home to over 1,000 packages for R, with a focus on packages that can be used for bioinformatics research. One of the main differences between Bioconductor and CRAN is that Bioconductor has stricter guidelines for accepting packages. After finding a package on Bioconductor, such as EBImage, install it by running these commands:

    source("https://bioconductor.org/biocLite.R")
    biocLite("EBImage")

It is possible to install from Bioconductor using install.packages, but this is not recommended.

GitHub

GitHub is a space where you can post the source code of your work to keep it under version control, and also to encourage and facilitate collaboration. Often, GitHub is where the truly bleeding-edge packages can be found, and where package updates appear first. Many of the packages that can be found on CRAN have a development version on GitHub, occasionally with features absent from the CRAN version. As you browse GitHub, you will likely find some packages that will never be put on CRAN or Bioconductor. For this reason, caution should be exercised when using packages sourced from GitHub. Should you find a package on GitHub and wish to install it, you must first download the devtools package from CRAN. You then have access to the install_github() function, whose argument is the name of the developer, followed by a slash, and then the name of the package:

    install.packages("devtools")
    # Install swirl! See: https://github.com/swirldev/swirl
    devtools::install_github("swirldev/swirl")

The syntax devtools::xxxx() simply means "use the xxxx function from the devtools package". You could just as easily have called library(devtools) after installing, and then simply typed install_github(). The devtools package also includes a number of functions for installing packages that are stored locally, on Bitbucket, or in an SVN repository.
Try typing ??devtools::install_ to see a full list.

Some Popular Packages

Now that you know the basic commands for installing packages, let's take a very short look at some of the more popular and useful packages.

Visualizing data with ggplot2

ggplot2 is a package used to visualize data. It provides a method of chart-building that is intuitive (based on The Grammar of Graphics) and results in aesthetically pleasing graphics. Here is an example of a graphic produced using ggplot2:

    install.packages("ggplot2")  # Install from CRAN
    library(ggplot2)             # Load ggplot2
    data(diamonds)               # Load the diamonds data set

    # Create a plot with carat on the x axis, price on the y axis,
    # and color based on the quality of the cut
    ggplot(data=diamonds, aes(x=carat, y=price, col=cut)) +
      geom_point(alpha=0.5)  # Use points (dots) to represent the data

Manipulating data with dplyr

dplyr provides a number of verbs for manipulating data (select, filter, mutate, arrange, summarize, and so on), each of which corresponds to a common task when working with data. To see how dplyr can simplify your workflow, let's compare the base R code and the dplyr code used to subset the diamonds data to only those gems with an Ideal cut and greater than 2 carats:

    install.packages("dplyr")  # Install dplyr from CRAN
    library(dplyr)             # Load dplyr

    BaseR <- diamonds[which(diamonds$cut == "Ideal" & diamonds$carat > 2), ]
    # vs:
    Dplyr <- filter(diamonds, cut == "Ideal" & carat > 2)

Clearly, the dplyr version is more succinct, more readable, and, most importantly, easier to write.

Machine learning with caret

The caret package is a collection of functions that unify the syntax used by many of the most popular machine learning packages implemented in R. caret allows you to quickly prepare your data, create predictive models, tune the model parameters, and interpret the results. Here is a simple working example of training and tuning a k-nearest neighbors model with caret to predict the price of a diamond based on its cut, color, and clarity:

    install.packages("caret")
    library(caret)

    # Split the data into training and testing sets
    inTrain <- createDataPartition(diamonds$price, p=0.01, list=FALSE)
    training <- diamonds[inTrain, ]
    testing <- diamonds[-inTrain, ]

    knn_model <- train(price ~ cut + color + clarity, data=training, method="knn")
    plot(knn_model)

You can see that increasing the number of neighbors in the model increases the accuracy (that is, decreases the RMSE, a measure of the average distance between the predictions and the data).

Summary

In this post, you learned how to install and load packages from three major sources: CRAN, Bioconductor, and GitHub. You also took a brief look at three popular packages: ggplot2 for visualization, dplyr for manipulation, and caret for machine learning.

About the author

Joel Carlson is a recent MSc graduate from Seoul National University, and a current Data Science Fellow at Galvanize in San Francisco. He has contributed two R packages to CRAN (radiomics and RImagePalette). You can learn more or contact him at his personal website.