Packt
23 Sep 2016
20 min read
Buildbox 2 Game Development: peek-a-boo

In this article, Ty Audronis, author of the book Buildbox 2 Game Development, teaches the reader the Buildbox 2 game development environment by example. The following excerpts from the book should help you get a feel for its teaching style. The largest example in the book is a game called Ramblin' Rover, a motocross-style game that uses Buildbox features ranging from the most basic to the most advanced. Let's take a quick look. (For more resources related to this topic, see here.)

Making the Rover Jump

As we've mentioned before, we're making a hybrid game: a combination of a motocross game, a platformer, and a side-scrolling shooter. Our initial rover will not be able to shoot at anything (we'll save this feature for the next upgraded rover, which anyone can buy with in-game currency), but it will need to jump to make the game more fun. As we know, NASA has never made a rover for Mars that jumps. But if they did, how would they do it? The surface of Mars is a combination of dust and rocks, so surface conditions vary greatly in both traction and softness. One viable way is to make the rover move the way a spacecraft manoeuvres: using little gas jets. And since the gravity on Mars is lower than that on Earth, this seems legit enough to include in our game.

While in our Mars Training Ground world, open the character properties for Training Rover. Drag the animated PNG sequence located in our Projects/RamblinRover/Characters/Rover001-Jump folder (a small four-frame animation) into the Jump Animation field. Now we have an animation of a jump-jet firing when we jump; we just need to make our rover actually jump. Your Properties window should look like the following screenshot:

The preceding screenshot shows the relevant sections of the character's Properties window. We're now going to revisit the Character Gameplay Settings section.
Scroll the Properties window all the way down to this section. Here's where we actually configure a few settings to make the rover jump. The previous screenshot shows the section as we're going to set it up; you can configure your settings similarly.

The first setting to consider is Jump Force. You may notice that the vertical force is set to 55. Since our gravity is -20 in this world, we need enough force not only to counteract the gravity, but also to give us a decent height (about half the screen). A good rule of thumb is to make our Jump Force at least 2x our Gravity.

Next is Jump Counter. We've set it to 1. By default, it's set to 0, which actually means infinity: when Jump Counter is set to 0, there is no limit to how many times a player can use the jump boost. They could effectively ride the top of the screen using the jump boost, much like a flappy-bird control. So, we set it to 1 to limit the jumps to one at a time.

There is also a strange oddity in Buildbox that we can exploit here. The jump counter resets only after the rover hits the ground. But here's the funny thing: the rover body itself never actually touches the ground (unless it crashes); only the wheels do. There is one other way to reset the jump counter: by doing a flip. What this means is that once players use their jump up, the only way to reset it is to do a flip-trick off a ramp. Add a level of difficulty and excitement to the game using a quirk of the development software! We could trick the software into believing that the character is close enough to the ground to reset the counter by increasing Ground Threshold to the distance the body sits from the ground when the wheels have landed. But why do that? It's kind of cool that a player has to do a trick to reset the jump jets.

Finally, let's untick the Jump From Ground checkbox. Since we're using jets for our boost, it makes sense that the driver could activate them while in the air.
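To build some intuition for the Jump Force and Gravity numbers above, here is a rough back-of-the-envelope sketch in Python. It treats Jump Force as an initial upward velocity and Gravity as a constant downward acceleration; this is a simplification for illustration only, and does not reflect Buildbox's actual physics solver or units.

```python
# Simplified ballistic model: peak height = v^2 / (2 * g).
# Treating jump_force as an initial velocity and gravity as an
# acceleration is an assumption for illustration only -- Buildbox's
# real units and solver differ.
def peak_height(jump_force, gravity):
    g = abs(gravity)
    return jump_force ** 2 / (2 * g)

# With Gravity = -20, a Jump Force equal to |gravity| gives a shallow hop,
# while the 55 used in the article clears far more vertical space:
print(peak_height(20, -20))   # 10.0
print(peak_height(55, -20))   # 75.625
```

The point is simply that the jump value must comfortably exceed the gravity magnitude to reach roughly half-screen height; the exact number still needs tuning in the previewer.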
Plus, as we've already said, the body never meets the ground. Again, we could raise the ground threshold, but let's not (for the reasons stated previously). Awesome! Go ahead and give it a try by previewing the level. Try jumping on the small ramp we created to get on top of our cave. Instead of barely clearing it, the rover will now easily clear it, and the player can then reset the counter by doing a flip off the big ramp on top.

Making a Game Over screen

This exercise will show you how to make some connections and new nodes using Game Mind Map. The first thing we'll want is an event listener to sense when a character dies. It sounds complex, and if we were coding a game, this would take several lines of code to accomplish. In Buildbox, it's a simple drag-and-drop operation.

If you double-click on the Game Field UI node, you'll be presented with the overlay for the UI and controls during gameplay. Since this is a basic template, you are actually presented with a blank screen. This template is meant for you to play around with on a computer, so no on-screen controls are included; it is assumed that you would use keyboard controls to play the demo game. This is why the screen looks blank.

There are some significant differences between the UI editor and the World editor. You'll notice that the Character tab is missing from the asset library, and there is a timeline editor at the bottom. We'll get into how to use this timeline later. For now, let's keep things simple and add our Game Over sensor. If you expand the Logic tab in the asset library, you'll find the Event Observer object. You can drag this object anywhere onto the stage; it doesn't even have to be in the visible window (the dark area in the center of the stage). As long as it's somewhere on the stage, the game can use this logic asset. If you do put it in the visible area of the stage, don't worry: it's an invisible asset and won't show in your game.
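As the text notes, hand-coding this would take several lines of code. For comparison, here is a minimal observer-pattern sketch in Python showing roughly what a hand-rolled game-over event listener might look like; every name here (EventObserver, on, emit) is invented for illustration and has nothing to do with Buildbox's internals.

```python
# A tiny hand-rolled event system: the kind of plumbing Buildbox's
# drag-and-drop Event Observer replaces. All names are hypothetical.
class EventObserver:
    def __init__(self):
        self._handlers = {}

    def on(self, event, handler):
        """Register a handler for an event name."""
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event, *args):
        """Fire an event, calling every registered handler in order."""
        for handler in self._handlers.get(event, []):
            handler(*args)

events = EventObserver()
events.on("game_over", lambda score: print(f"Game Over! Score: {score}"))

# Somewhere in the game loop, when the character dies:
events.emit("game_over", 1200)   # prints "Game Over! Score: 1200"
```

In Buildbox, dropping an Event Observer on the stage and wiring its output stands in for all of this boilerplate.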
While the Event Observer is selected on the stage, you'll notice that its properties pop up in the Properties window (on the right side of the screen). By default, the Game Over event type is selected, but if you open this drop-down menu, you'll see a ton of different event types that this logic asset can handle. Let's leave all of the properties at their default values (except the name; change this to Game Over) and go back to Game Mind Map (the top-left button).

Do you notice anything different? The Game Field UI node now has a Game Over output. Now we just need a place to send this output. Right-click on the blank space of the grid area; you can either create a new world or a new UI. Select Add New UI and you'll see a new green node titled New UI1. This new UI will be your Game Over screen when a character dies. Before we can use this new node, it needs to be connected to the Game Over output of Game Field UI. This process is exceedingly simple: just hold down your left mouse button on the Game Over output's dark dot, and drag it to New UI1's Load dark dot (on the left side of the New UI1 node). Congratulations, you've just created your first connected node.

We're not done yet, though. We need to make this Game Over screen link back to restart the game. First, select the New UI1 node and change its name in the parameters window (on the right of the screen) to Game Over UI. Make sure you hit your Enter key; this commits the changed name. Now double-click on the Game Over UI node so we can add some elements to the screen. You can't have a Game Over screen without the words Game Over, so let's add some text.

So, we've pretty much completed the game field (except for some minor items that we'll address quite soon). But believe it or not, we're only halfway there! In this article, we're going to finally create our other two rovers, and we'll test and tweak our scenes with them.
We'll set up all of our menus, information screens, and even a coin shop where players can use in-game currency to buy the other two rovers, or even spend some real-world currency as a shortcut to more in-game currency. And speaking of monetization, we'll set up two different types of advertising from multiple providers to help us make some extra cash. Or, in the coin shop, players can pay a modest fee to remove all advertising! Ready? Well, here we go! We've got a fever, and the only cure is more rovers!

Now that we've created other worlds, we definitely need to set up some rovers capable of traversing them. Let's begin with the optimal rover for Gliese. This one is called the K.R.A.B.B. (no, it doesn't actually stand for anything, but the rover looks like a crab, and acronyms look more military-like). Go ahead and drag all of the images in the Rover002-Body folder in as characters. Don't worry about the error message; it just tells you that only one character can be on the stage at a time. The software still loads this new character into the library, and that's all we really want at this point anyway. Of course, drag the images in the Rover002-Jump folder to the Jump Animation field, and the LaserShot.png file to the Bullet Animation field. Set up your K.R.A.B.B. with the following settings:

For Collision Shape, match this:

In the Asset Library, drag the K.R.A.B.B. above the Mars Training Rover. This will make it the default rover. Now you can test your Gliese level (by soloing each scene) with this rover to make sure it's challenging, yet attainable. You'll notice some problems with the gun destroying ground objects, but we'll solve that soon enough. Now, let's do the same with Rover 003. This one uses a single image for the Default Animation, but an image sequence for the jump.
We'll get to the bullet for this one in a moment, but set it up as follows:

Collision Shape should look as follows:

You'll notice that a lot of the settings are different on this character, and you may wonder what the advantage is (since it doesn't lean as much as the K.R.A.B.B.). Well, it's a tank, so the damage it can take will be higher (which we'll set up shortly), and it can do multiple jumps before recharging (five, to be exact). This way, this rover can fly short distances using flappy-bird-style controls. It's going to take a lot more skill to pilot this rover, but once mastered, it'll be unstoppable.

Let's move on to the bullet for this rover. Once you've dragged the missile.png file into the Bullet Animation field, click on the Edit button (the little pencil icon) inside it, and let's add a flame trail. Set up a particle emitter on the missile, and position it as shown in the following screenshots:

The image on the left shows the placement of the missile and the particle emitter. On the right, you can see the flame setup. You may wonder why it is pointed in the opposite direction: this actually makes the flames look more realistic, as if they're drifting behind the missile.

Preparing graphic assets for use in Buildbox

Okay, so as I said before, the only graphic assets Buildbox can use are PNG files. If this were just a simple tutorial on how to make Ramblin' Rover, we could leave it there. But it's not. Ramblin' Rover is just an example of how a game is made; we want to give you all of the tools and base knowledge you need to create your own games from scratch. Even if you don't create your own graphic assets, you need to be able to tell whoever creates them for you how you want them, and more importantly, you need to know why. Graphics are absolutely the most important thing in developing a game. After all, you saw how just some eyes and sneakers made a cute character that people would want to see.
Graphics create your world. They create characters that people want to succeed. Most importantly, graphics create the feel of your game and differentiate it from other games on the market.

What exactly is a PNG file?

Anybody remember GIF files? No, not the animated GIFs you see in chat rooms and on Facebook (although they are related). Back in the 1990s, a still-frame GIF file was the best way to have a graphics file with a transparent background. GIFs can be used for animation and serve a number of purposes. However, GIFs were clunky. How so? Well, they used a type of compression known as lossy, meaning that when an image was compressed, information was lost, and artifacts and noise could appear. Furthermore, GIFs used indexed colors: anywhere from 2 to 256 colors could be used, and that's why you see something known as banding in GIF imagery. Banding shows up where something in real life transitions from dark to light because of lighting and shadows. In real life, it's a smooth transition known as a gradient. With indexed colors, banding occurs when these various shades fall outside the index; the colors of those pixels are quantized (snapped) to the nearest color in the index. The images here show a noisy and banded GIF (left) versus the original picture (right):

So along came PNGs (Portable Network Graphics). Originally, the PNG format was what a program called Macromedia Fireworks used to save projects. (The same software is now called Adobe Fireworks and is part of the Creative Cloud.) Fireworks would cut a graphics file up into a table or image map and make areas of the image clickable via hyperlinks in HTML web files. PNGs were still not widely supported by web browsers, so it would export the final web files as GIFs or JPEGs. But somewhere along the line, someone realized that the PNG image itself was extremely bandwidth-efficient.
So, in the 2000s, PNGs started to see some support in browsers. Up until around 2008, though, Microsoft's Internet Explorer still did not support PNGs with transparency, so some strange CSS hacks were needed to use them. Today, the PNG file is the most widely used network-based image file. It's lossless, has great transparency support, and is extremely efficient. PNGs are very widely used, and this is probably why Buildbox restricts compatibility to this format; remember, Buildbox can export for multiple mobile and desktop platforms. Alright, so PNGs are great and very compatible. But there are multiple flavours of PNG files. So what differentiates them?

What do bit-ratings mean?

When dealing with bit-ratings, you have to understand that the terms 8-bit image and 24-bit image may refer to two different types of rating, or even to exactly the same type of image. Confused? Good, because when dealing with a graphics professional who will create your assets, you're going to have to be a lot more specific, so let's give you a brief education.

Your typical image is 8 bits per channel (8 bpc), or 24 bits total (because there are three channels: red, green, and blue). This is also what is meant by a 16.7-million-color image. The math is pretty simple. A bit is either 0 or 1. 8 bits may look something like 01100110. This means there are 256 possible combinations on that channel. Why? Because to calculate the number of possibilities, you take the number of possible values per slot and raise it to the power of the number of slots. 0 or 1 is 2 possibilities, and 8 bits is 8 slots: 2x2x2x2x2x2x2x2 (2 to the 8th power) is 256. To count the colors a pixel can take, you multiply the possibilities across channels: 256x256x256, which is about 16.7 million. This is how we know there are 16.7 million possible colors in an 8 bpc (24-bit) image. So saying 8-bit may mean per channel or overall.
This is why it's extremely important to add the word "channel" if that's what you mean. Finally, there is a fourth channel called alpha. The alpha channel is the transparency channel. So when you're talking about a 24-bit PNG with transparency, you're really talking about a 32-bit image. Why is this important to know? Because some graphics programs (such as Photoshop) offer 24-bit PNG as an option with a checkbox for transparency, while others (such as the 3D software we used, Lightwave) offer separate options for a 24-bit PNG and a 32-bit PNG. These are essentially the same choices under different names. By understanding what these bits per channel are and what they do, you can navigate your image-creation software's options better.

So, what's an 8-bit PNG, and why is it so important to differentiate it from an 8-bit-per-channel PNG (that is, a 24-bit PNG)? Because an 8-bit PNG is highly compressed. Much like a GIF, it uses indexed colors. It also uses a good algorithm to dither (blend) the colors to avoid banding. 8-bit PNG files are extremely efficient on resources (that is, they are much smaller files), and they still look good, unless they have transparency. Because they are so highly compressed, the alpha channel is included in the 8 bits. So if you use 8-bit PNG files for objects that require transparency, they will end up with a white ghosting effect around them and look terrible on screen, much like a weather report where the green screen behind the reporter is badly keyed.

So, the rule is pretty simple. For objects that require transparency, always use 24-bit PNG files with transparency (also called 8 bpc, or 32-bit images). For objects with no transparency (such as block-shaped obstacles and objects), use 8-bit PNG files. By following this rule, you'll keep your game looking great while avoiding bloating your project files.
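The bit arithmetic above is easy to verify for yourself; here is a quick Python check (plain arithmetic, not tied to any image library):

```python
# 8 bits per channel -> 256 possible values per channel.
values_per_channel = 2 ** 8
print(values_per_channel)                 # 256

# Three channels (R, G, B) at 8 bpc -> a "24-bit" image.
rgb_colors = values_per_channel ** 3
print(rgb_colors)                         # 16777216, i.e. ~16.7 million colors

# Adding an alpha (transparency) channel at 8 bpc -> a 32-bit image.
total_bits_rgba = 8 * 4
print(total_bits_rgba)                    # 32
```

This is exactly why "24-bit PNG with transparency" and "32-bit PNG" describe the same thing: three color channels plus one alpha channel, each 8 bits deep.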
In the end, Buildbox repacks all of the images in your project into atlases (which we'll cover later) that are 32-bit. However, it's always good practice to stay lean. If you were a Buildbox 1.x user, you may remember that Buildbox had some issues with DPI (dots per inch) between the standard 72 and 144 on Retina displays. This issue is a thing of the past with Buildbox 2.

Image sequences

Think of a film strip. It's just a sequence of still images known as frames. Standard United States film runs at 24 frames per second (well, really 23.976, but let's round up for our purposes). Also, in the US, television runs at 30 frames per second (again, 29.97, but let's round up). Remember that each image in our sequence is a full image with all of the resources associated with it. We can quite literally cut our necessary resources in half by dropping from 30 to 15 frames per second (fps). If you open the content you downloaded and navigate to Projects/RamblinRover/Characters/Rover001-Body, you'll see that the images are named Rover001-body_001.png, Rover001-body_002.png, and so on. The final number indicates the position in the sequence (first 001, then 002, and so on). The animation is really just the satellite dish rotating, along with the scanner light in the window. But what you'll really notice is that this animation is loopable: it can play over and over without you noticing a bump in the footage, because the final frame leads seamlessly back to the first. If you're not creating these animations yourself, make sure to specify to your graphics professional that the animations must be loopable at 15 fps. They should understand exactly what you mean, and if they don't, you may want to consider finding a new animator.

Recommended software for graphics assets

For the purposes of context (now that you understand more about graphics and Buildbox), a bit of reinforcement couldn't hurt.
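The zero-padded frame naming described above (Rover001-body_001.png, Rover001-body_002.png, …) matters because most tools load frames in lexical sort order. A short sketch generating such a sequence for a one-second loop at 15 fps (the base name comes from the article; the generation itself is just an illustration):

```python
# Generate zero-padded frame names for a 1-second loop at 15 fps.
fps = 15
frames = [f"Rover001-body_{i:03d}.png" for i in range(1, fps + 1)]
print(frames[0], frames[-1])   # Rover001-body_001.png Rover001-body_015.png

# Zero-padding keeps lexical order identical to playback order
# (without it, "frame_10" would sort before "frame_2"):
assert frames == sorted(frames)
```

Halving a 30 fps animation to 15 fps halves the number of such files, which is exactly the resource saving the text describes.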
A key piece of graphics software is the Adobe Creative Cloud subscription (http://www.adobe.com/CreativeCloud). Given its bang for the buck, it just can't be beaten. With it, you get Photoshop (usable for all graphics assets, from your game's icon to obstacles and other objects), Illustrator (great for navigational buttons), After Effects (very useful for animated image sequences), Premiere Pro (a video-editing application for marketing videos from screen-captured gameplay), and Audition (for editing all your sound). You may also want some 3D software, such as Lightwave, 3D Studio Max, or Maya; this can greatly improve your ability to make characters and enemies, and to create still renders for menus and backgrounds. Most of the assets in Ramblin' Rover were created with the 3D software Lightwave.

There are free options for all of these tools. However, there are not nearly as many tutorials and resources available on the web to help you learn and create with them. One key thing to remember when using free software: if it's free, you're the product. In other words, some benefits come with paid software, such as better support and being part of the industry standard; free software can seem to be in a perpetual state of beta testing. If you do use free software, read your End User License Agreement (EULA) very carefully. Some software may require you to credit the makers in some way for the privilege of using their software for profit; some may even lay claim to part of your profits. Okay, let's get to actually using our graphics in Ramblin' Rover…

Summary

See? It's not that tough to follow. Using plain-English explanations combined with demonstrations of some significant and intricate processes, you'll be taken on a journey meant to stimulate your imagination and teach you how to use the software. The book also comes with the complete project files and assets to help you follow along the entire way through the build process.
You'll be making your own games in no time!
Packt
22 Sep 2016
15 min read
JIRA 101

In this article by Ravi Sagar, author of the book Mastering JIRA 7 - Second Edition, you will learn the basics of JIRA. We will look into the components available in the product, their applications, and how to use them. We will try our hand at a few JQL queries, then create project reports in JIRA for issue tracking and derive information from them using the various built-in reports that JIRA comes with. We will also take a look at the gadgets JIRA provides, which are helpful for reporting purposes. Finally, we will look at the migration options JIRA provides to fully restore a JIRA instance for a specific project. (For more resources related to this topic, see here.)

Product introduction

Atlassian JIRA is a proprietary issue-tracking system used for tracking bugs and issues and for project management. There are many such tools available, but the best thing about JIRA is that it can be configured very easily and offers a wide range of customizations. Out of the box, JIRA offers defect/bug-tracking functionality, but it can be customized to act as a helpdesk system, a simple test-management suite, or a project-management system with end-to-end traceability. The much-awaited JIRA 7 was released in October 2015, and it is now offered in three application variants:

- JIRA Core
- JIRA Software
- JIRA Service Desk

Let's discuss each of them separately.

JIRA Core

This is the base application of JIRA that you may already be familiar with, of course with some new features. JIRA Core is a simplified version of the JIRA we have used up to the 6.x versions.

JIRA Software

This comprises all the features of JIRA Core plus JIRA Agile. From JIRA 7 onwards, JIRA Agile is no longer offered as an add-on; you will not be able to install JIRA Agile from the marketplace.

JIRA Service Desk

This comprises all the features of JIRA Core plus JIRA Service Desk.
Just like JIRA Software, JIRA Service Desk is no longer offered as an add-on, and you cannot install it from the marketplace.

Applications, uses, and examples

The ability to customize JIRA is what makes it popular among the various companies that use it. The following are common applications of JIRA:

- Defect/bug tracking
- Change requests
- Helpdesk/support tickets
- Project management
- Test-case management
- Requirements management
- Process management

Let's take a look at an example implementation of test-case management.

The issue types:
- Test campaign: a standard issue type
- Test case: a subtask

The workflow for a test campaign:
- New states: Published, Under Execution
- Conditions: a test campaign will only pass when all its test cases have passed; only the reporter can move the test campaign to Closed
- Post function: when the test campaign is closed, send an email to everyone in a particular group

The workflow for a test case:
- New states: Blocked, Passed, Failed, In Review
- Condition: only the assigned user can move the test case to the Passed state
- Post function: when the test case is moved to the Failed state, change the issue priority to Major

Custom fields (name, type, values, field configuration):
- Category: select list
- Customer name: select list
- Steps to reproduce: text area (mandatory)
- Expected input: text area (mandatory)
- Expected output: text area (mandatory)
- Precondition: text area
- Postcondition: text area
- Campaign type: select list (Unit, Functional, Endurance, Benchmark, Robustness, Security, Backward compatibility, Certification with baseline)
- Automation status: select list (Automatic, Manual, Partially automatic)

JIRA core concepts

Let's take a look at the architecture of JIRA; it will help you understand the core concepts.

Project Categories: When there are too many projects in JIRA, it becomes important to segregate them into categories. JIRA lets you create several categories, which can represent the business units, clients, or teams in your company.
Projects: A JIRA project is a collection of issues. Your team can use a JIRA project to coordinate the development of a product, track a project, manage a help desk, and so on, depending on your requirements.

Components: Components are subsections of a project. They are used to group issues within a project into smaller parts.

Versions: Versions are points in time for a project. They help you schedule and organize your releases.

Issue Types: JIRA lets you create multiple issue types that differ in the kind of information they store. JIRA comes with default issue types, such as bug, task, and subtask, but you can create more issue types, each of which can follow its own workflow and have its own set of fields.

Subtasks: Issue types come in two flavors, standard and subtask; subtasks are children of a standard task. For instance, you can have test campaigns as a standard issue type and test cases as subtasks.

Introduction to JQL

JIRA Query Language (JQL) is one of the best features in JIRA. It lets you search issues efficiently and offers lots of handy features. The best part about JQL is that it is very easy to learn, thanks to the autocomplete functionality in the Advanced search, which suggests completions based on the keywords typed. A JQL search consists of one or more clauses that can be combined to form complex queries.

Basic JQL syntax

A basic JQL clause has a field followed by an operator and a value.
For instance, to retrieve all the issues of the CSTA project, you can use a simple query like this:

project = CSTA

Now, within this project, if you want to find the issues assigned to a specific user, use the following query:

project = CSTA and assignee = ravisagar

There may be several hundred issues assigned to a user; if we want to focus only on issues whose priority is either Blocker or Critical, we can use the following query:

project = CSTA and assignee = ravisagar and priority in (Blocker, Critical)

What if, instead of issues assigned to a specific user, we want to find the issues assigned to all other users except one? That can be achieved as follows:

project = CSTA and assignee != ravisagar and priority in (Blocker, Critical)

So, you can see that a JQL search consists of one or more clauses.

Project reports

Once you start using JIRA for issue tracking of any type, it becomes imperative to derive useful information from it. JIRA comes with built-in reports that show real-time statistics for projects, users, and other fields. Let's take a look at each of these reports. Open any project in JIRA that contains a lot of issues and has around 5 to 10 users as assignees or reporters. When you open any project page, the default view is the Summary view, which contains a 30-day summary report and an Activity Stream showing whatever is happening in the project: creation of new issues, status updates, comments, and basically any change in the project. On the left-hand side of the project summary page, there are links for Issues and Reports.

Average Age Report: This report displays the average number of days for which issues have been in an unresolved state on a given date.

Created vs. Resolved Issues Report: This report displays the number of issues created over a period of time versus the number of issues resolved in that period.

Pie Chart Report: This chart shows the breakup of data.
For instance, if you are interested in finding the issue count for each issue type in your project, this report can fetch that information.

Recently Created Issues Report: This report displays statistics on the number of issues created over the selected period and interval, along with the status of those issues.

Resolution Time Report: There are cases where you want to understand your team's speed from month to month. How quickly can your team resolve issues? This report displays the average resolution time of issues in a given month.

Single Level Group By Report: A simple report that lists the issues grouped by a particular field, such as Assignee, Issue Type, Resolution, Status, or Priority.

Time Since Issues Report: This report is useful for finding out, for example, how many issues were created in a specific quarter over the past year; various date-based fields are supported by this report.

Time Tracking Report: This comprehensive report displays the estimated and remaining effort for all issues. Not only that, it also gives you an indication of the overall progress of the project.

User Workload Report: This report tells us about the occupancy of resources across all projects. It really helps in distributing tasks among users.

Version Workload Report: If your project has various versions that correspond to actual releases or fixes, it becomes important to understand the status of the issues in each of them.

Gadgets for reporting purposes

JIRA comes with a lot of useful gadgets that you can add to the dashboard and use for reporting purposes. Additional gadgets can be added to JIRA by installing add-ons. Let's take a look at some of these gadgets.

Activity Stream: This gadget displays all the latest updates in your JIRA instance. It's also possible to limit this stream to a particular filter.
This gadget is quite useful because it displays up-to-date information on the dashboard.

Created vs. Resolved Chart: The project summary page has a chart displaying all the issues created and resolved in the past 30 days. There is a similar gadget to display this information; you can change the duration from 30 days to whatever you like, and the gadget can be created for a specific project.

Pie Chart: Just like the pie chart in the project reports, there is a similar gadget you can add to the dashboard. For instance, for a particular project, you can generate a pie chart based on Priority.

Issue Statistics: This gadget is quite useful for generating simple statistics for various fields. Here, we are interested in finding the breakup of the project in terms of issue statistics.

Two Dimensional Filter Statistics: The Issue Statistics gadget can display the breakup of project issues for every Status. What if you want to segregate this information further? For instance, how many issues are open, and which Issue Type do they belong to? In such scenarios, Two Dimensional Filter Statistics can be used. You just need to select the two fields used to generate the report, one for the x axis and another for the y axis.

These are some common gadgets that can be used on the dashboard; there are many more. Click on the Add Gadget option in the top-right corner to see all the gadgets in your JIRA instance. Some gadgets come out of the box with JIRA; others are part of add-ons that you can install. After you add these gadgets to your dashboard, this is how it looks:

This is the new dashboard we have just created and configured for a specific project, but it's also possible to create more than one dashboard. To add another dashboard, just click on the Create Dashboard option under Tools in the top-right corner.
If you have more than one dashboard, you can switch between them using the links in the top-left corner of the screen, as shown in the following screenshot:

The simple CSV import
Let's understand how to perform a simple import of CSV data. The first thing to do is prepare the CSV file that can be imported into JIRA. For this exercise, we will import issues into a particular project; these issues will have data such as issue Summary, Status, Dates, and a few other fields.

Preparing the CSV file
We'll use MS Excel to prepare the CSV file with the following data. If your existing tool has the option to export directly to a CSV file, you can skip this step, but we recommend reviewing your data before importing it into JIRA. Usually, the CSV import will not work if the format of the CSV file or the data is not correct. It's very easy to generate a CSV file from an Excel file. Perform the following steps: Go to File | Save As | File name: and select Save as type: as CSV (comma delimited).
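When saved, the resulting file should look something like the following. These rows are hypothetical samples for the DOPT project used in this exercise; your summaries, users, and dates will differ, but the column layout should match:

```csv
Project,Summary,Issue Type,Status,Priority,Resolution,Assignee,Reporter,Created,Resolved
DOPT,Fix login error,Bug,Resolved,High,Fixed,jsmith,akhan,01/02/2016 10:15,05/02/2016 16:30
DOPT,Add export option,New Feature,Open,Medium,,jsmith,akhan,03/02/2016 09:00,
```

Note that empty values (such as Resolution and Resolved for an unresolved issue) are left blank between the commas.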
If you don't have Microsoft Excel installed, you can use LibreOffice Calc, which is an open source alternative to Microsoft Office Excel. You can open the CSV file to verify its format too. Our CSV file has the following fields:

CSV Field    Purpose
Project      The JIRA project key needs to be specified in this field
Summary      This field is mandatory and needs to be specified in the CSV file
Issue Type   This specifies the issue type
Status       The status of the issue; these workflow states need to exist in JIRA, and the project workflow should contain the states being imported from the CSV file
Priority     The priorities mentioned here should exist in JIRA before import
Resolution   The resolutions mentioned here should exist in JIRA before import
Assignee     This specifies the assignee of the issue
Reporter     This specifies the reporter of the issue
Created      The issue creation date
Resolved     The issue resolution date

Performing the CSV import
Once your CSV file is prepared, you are ready to perform the import in JIRA:

1. Navigate to JIRA Administration | System | External System Import | CSV (under IMPORT & EXPORT).
2. On the File import screen, in the CSV Source File field, click on the Browse… button to select the CSV file that you just prepared on your machine. Once you select the CSV file, the Next button will be enabled.
3. On the Setup screen, select Import to Project as DOPT, which is the name of our project. Verify Date format; it should match the format of the date values in the CSV file. Click on the Next button to continue.
4. On the Map fields screen, we need to map the fields in the CSV file to JIRA fields. This step is crucial because the field names in your old system can differ from the JIRA fields; so, in this step, map these fields to the respective JIRA fields. Click on the Next button to continue.
5. On the Map values screen, map the values of Status; in fact, this mapping of field values can be done for any field.
In our case, the values in the Status field are the same as in JIRA, so click on the Begin Import button. You will finally get a confirmation that the issues were imported successfully.

If you encounter any errors during the CSV import, it's usually due to a problem with the CSV format. Read the error messages carefully and correct these issues. As mentioned earlier, the CSV import should be performed on the test environment first.

Migrate JIRA configurations using the Configuration Manager add-on
JIRA can fully restore an instance from a backup file, restore a specific project, and import data using the CSV import functionality. These utilities are quite important, as they make the life of JIRA administrators a lot easier; administrators can perform these activities right from the JIRA user interface. The project-import utility and CSV import are used to migrate one or more projects from one JIRA instance to another, but the target instance must already have the required configuration in place, otherwise these utilities will not work. For instance, if a project in the source instance has custom workflow states along with a few custom fields, then the same workflow and custom field configurations must already exist in the target instance. Recreating these configurations and schemes can be a time-consuming and error-prone process. Additionally, many organizations have a test environment or staging server for JIRA where all new configurations are tested before they are rolled out to the production instance. Out of the box, there is no way to selectively migrate configurations from one instance to another; it has to be done manually on the target instance. Configuration Manager is an add-on that does this job: using it, project-specific configuration can be migrated from one instance to another.

Summary
In this article we looked at the different products offered by JIRA.
Then we covered the core concepts of JIRA and experimented with JQL through a few examples. We saw the different types of reports provided by JIRA and the various gadgets available for reporting purposes. Finally, we saw how to migrate JIRA configurations using the Configuration Manager add-on. Resources for Article: Further resources on this subject: Working with JIRA [article] JIRA Workflows [article] JIRA – an Overview [article]
Packt
22 Sep 2016
13 min read
OpenStack Networking in a Nutshell

Information technology (IT) applications are rapidly moving from dedicated infrastructure to cloud-based infrastructure. This move to the cloud started with server virtualization, where a hardware server ran as a virtual machine on a hypervisor. The adoption of cloud-based applications has accelerated due to factors such as globalization and outsourcing, where diverse teams need to collaborate in real time. Server hardware connects to network switches using Ethernet and IP to establish network connectivity. However, as servers move from physical to virtual, the network boundary also moves from the physical network to the virtual network. Traditionally, applications, servers, and networking were tightly integrated, but modern enterprises and IT infrastructure demand flexibility in order to support complex applications. The flexibility of cloud infrastructure requires networking to be dynamic and scalable. Software Defined Networking (SDN) and Network Functions Virtualization (NFV) play a critical role in data centers in order to deliver the flexibility and agility demanded by cloud-based applications. By providing practical management tools and abstractions that hide the underlying physical network's complexity, SDN allows operators to build complex networking capabilities on demand. OpenStack is an open source cloud platform that helps build public and private clouds at scale. Within OpenStack, the name of the OpenStack Networking project is Neutron; the functionality of Neutron can be classified as core and service. This article, by Sriram Subramanian and Sreenivas Voruganti, authors of the book Software Defined Networking (SDN) with OpenStack, aims to provide a short introduction to OpenStack Networking.
We will cover the following topics in this article:

- Traffic flows between virtual and physical networks
- Neutron entities that support Layer 2 (L2) networking
- Layer 3 (L3) routing between OpenStack Networks
- Securing OpenStack network traffic
- Advanced networking services in OpenStack
- OpenStack and SDN

The terms Neutron and OpenStack Networking are used interchangeably throughout this article. (For more resources related to this topic, see here.)

Virtual and physical networking
Server virtualization led to the adoption of virtualized applications and workloads running inside physical servers. While physical servers are connected to the physical network equipment, modern networking has pushed the boundary of networking into the virtual domain as well. Virtual switches, firewalls, and routers play a critical role in the flexibility provided by cloud infrastructure.

Figure 1: Networking components for server virtualization

The preceding figure describes a typical virtualized server and the various networking components. The virtual machines are connected to a virtual switch inside the compute node (or server). The traffic is secured using virtual routers and firewalls. The compute node is connected to a physical switch, which is the entry point into the physical network. Let us now walk through different traffic flow scenarios using the figure above as the background. In Figure 2, traffic from one VM to another on the same compute node is forwarded by the virtual switch itself; it does not reach the physical network. You can even apply firewall rules to traffic between the two virtual machines.

Figure 2: Traffic flow between two virtual machines on the same server

Next, let us look at how traffic flows between virtual machines across two compute nodes. In Figure 3, the traffic comes out of the first compute node and reaches the physical switch.
The physical switch forwards the traffic to the second compute node, and the virtual switch within the second compute node steers the traffic to the appropriate VM.

Figure 3: Traffic flow between two virtual machines on different servers

Finally, here is the depiction of traffic flow when a virtual machine sends traffic to or receives traffic from the Internet. The physical switch forwards the traffic to the physical router and firewall, which is presumed to be connected to the Internet.

Figure 4: Traffic flow from a virtual machine to an external network

As seen from the above diagrams, the physical and virtual network components work together to provide connectivity to virtual machines and applications.

Tenant isolation
As a cloud platform, OpenStack supports multiple users grouped into tenants. One of the key requirements of a multi-tenant cloud is to isolate the data traffic belonging to one tenant from the rest of the tenants that use the same infrastructure. OpenStack supports different ways of achieving isolation, and it is the responsibility of the virtual switch to implement it.

Layer 2 (L2) capabilities in OpenStack
Connectivity to a physical or virtual switch is also known as Layer 2 (L2) connectivity in networking terminology. Layer 2 connectivity is the most fundamental form of network connectivity needed for virtual machines. As mentioned earlier, OpenStack supports core and service functionality. L2 connectivity for virtual machines falls under the core capability of OpenStack Networking, whereas Router, Firewall, and so on fall under the service category. L2 connectivity in OpenStack is realized using two constructs called Network and Subnet. Operators can use the OpenStack CLI or the web interface to create Networks and Subnets, and when virtual machines are instantiated, operators can associate them with the appropriate Networks.
Creating a network using the OpenStack CLI
A Network defines the Layer 2 (L2) boundary for all the instances that are associated with it. All the virtual machines within a Network are part of the same L2 broadcast domain. The Liberty release introduced a new OpenStack CLI (Command Line Interface) for the different services. We will use the new CLI to see how to create a Network.

Creating a Subnet using the OpenStack CLI
A Subnet is a range of IP addresses that are assigned to virtual machines on the associated network. OpenStack Neutron configures a DHCP server with this IP address range, and it starts one DHCP server instance per Network by default. We will now show you how to create a Subnet using the OpenStack CLI. Note: unlike Network, for Subnet we need to use the regular neutron CLI command in the Liberty release.

Associating a Network and Subnet with a virtual machine
To give a complete perspective, we will create a virtual machine using the OpenStack web interface and show you how to associate a Network and Subnet with it. In your OpenStack web interface, navigate to Project | Compute | Instances. Click on the Launch Instance action on the right-hand side. In the resulting window, enter the name for your instance and choose how you want to boot it. To associate a network and a subnet with the instance, click on the Networking tab. If you have more than one tenant network, you will be able to choose the network you want to associate with the instance; if you have exactly one network, the web interface will automatically select it. As mentioned earlier, providing isolation for tenant network traffic is a key requirement for any cloud. OpenStack Neutron uses Network and Subnet to define the boundaries and isolate data traffic between different tenants. Depending on the Neutron configuration, the actual isolation of traffic is accomplished by the virtual switches. VLAN and VXLAN are common networking technologies used to isolate traffic.
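The exact commands appear as screenshots in the original book; the following is a minimal sketch, assuming the Liberty-era clients and the hypothetical names net1 and subnet1:

```shell
# Create a Network using the new OpenStack CLI
openstack network create net1

# Create a Subnet on that Network using the regular neutron CLI
# (Neutron starts one DHCP server instance per Network by default)
neutron subnet-create net1 10.10.0.0/24 --name subnet1
```

The CIDR 10.10.0.0/24 is an arbitrary example range; choose one appropriate for your deployment.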
Layer 3 (L3) capabilities in OpenStack
Once L2 connectivity is established, the virtual machines within one Network can exchange traffic between themselves. However, two virtual machines belonging to two different Networks will not be able to communicate with each other automatically. This is done to provide privacy and isolation for tenant networks. In order to allow traffic from one Network to reach another Network, OpenStack Networking supports an entity called a Router. The default implementation of OpenStack uses namespaces to support L3 routing capabilities.

Creating a Router using the OpenStack CLI
Operators can create Routers using the OpenStack CLI or web interface. They can then add more than one Subnet as an interface to the Router. This allows the Networks associated with the Router to exchange traffic with one another. The command to create a Router creates a Router with the specified name.

Associating a Subnet with a Router
Once a Router is created, the next step is to associate one or more Subnets with it. After running the association command, the Subnet represented by subnet1 is associated with the Router router1.

Securing network traffic in OpenStack
The security of network traffic is critical, and OpenStack supports two mechanisms to secure it. Security Groups allow traffic within a tenant's network to be secured; Linux iptables on the compute nodes is used to implement OpenStack security groups. Traffic that goes outside a tenant's network, to another Network or the Internet, is secured using the OpenStack Firewall Service functionality. Like routing, Firewall is a service within Neutron. The Firewall Service also uses iptables, but the scope of iptables is limited to the OpenStack Router used as part of the Firewall Service.

Using security groups to secure traffic within a network
In order to secure traffic going from one VM to another within a given Network, we must create a security group.
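Before moving on to security, here is a hedged sketch of the router commands referenced above, which appear as screenshots in the original book (router1 and subnet1 are the example names):

```shell
# Create a Router with the specified name
neutron router-create router1

# Associate a Subnet to the Router as an interface
neutron router-interface-add router1 subnet1
```

Repeating router-interface-add with further subnets lets all the associated Networks exchange traffic through the Router.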
After creating the security group, the next step is to create one or more rules within it. As an example, let us create a rule that allows only incoming UDP traffic on port 8080 from any source IP address. The final step is to associate this security group and its rules with a virtual machine instance; we use the nova boot command for this. Once the virtual machine instance has a security group associated with it, incoming traffic will be monitored, and depending on the rules inside the security group, data traffic may be blocked or permitted to reach the virtual machine. Note: it is possible to block both ingress and egress traffic using security groups.

Using the firewall service to secure traffic
We have seen that security groups provide fine-grained control over what traffic is allowed to and from a virtual machine instance. Another layer of security supported by OpenStack is Firewall as a Service (FWaaS). FWaaS enforces security at the Router level, whereas security groups enforce security at the virtual machine interface level. The main use case of FWaaS is to protect all virtual machine instances within a Network from threats and attacks from outside the Network. These could be virtual machines that are part of another Network in the same OpenStack cloud, or some entity on the Internet attempting unauthorized access. Let us now see how FWaaS is used in OpenStack. In FWaaS, a set of firewall rules is grouped into a firewall policy, and then a firewall is created that implements one policy at a time. This firewall is then associated with a Router. A firewall rule can be created using the neutron firewall-rule-create command; for example, a rule that blocks the ICMP protocol means applications like ping will be blocked by the firewall. The next step is to create a firewall policy. In real-world scenarios, security administrators will define several rules and consolidate them under a single policy.
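The security-group and firewall-rule commands discussed in this section are shown as screenshots in the original book; here is a hedged sketch, where sg1, vm1, rule1, and the flavor/image names are all hypothetical:

```shell
# Create a security group
neutron security-group-create sg1

# Allow only incoming UDP traffic on port 8080 from any source IP
neutron security-group-rule-create --direction ingress --protocol udp \
  --port-range-min 8080 --port-range-max 8080 sg1

# Associate the security group with an instance at boot time
nova boot --flavor m1.small --image cirros \
  --nic net-id=<net-uuid> --security-groups sg1 vm1

# FWaaS: create a firewall rule that blocks ICMP (so ping is blocked)
neutron firewall-rule-create --protocol icmp --action deny --name rule1
```

Replace <net-uuid> with the UUID of the tenant Network the instance should attach to.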
For example, all the rules that block various types of traffic can be combined into a single policy. The final step is to create a firewall and associate it with a router. If no Router is specified when creating the firewall, the OpenStack behavior is to associate the firewall (and in turn the policy and rules) with all the Routers available for that tenant; the neutron firewall-create command supports an option to pick a specific Router as well.

Advanced networking services
Besides routing and firewall, there are a few other commonly used networking technologies supported by OpenStack. Let's take a quick look at these without delving deep into the respective commands.

Load Balancing as a Service (LBaaS)
Virtual machine instances created in OpenStack are used to run applications, and most applications are required to support redundancy and concurrent access. For example, a web server may be accessed by a large number of users at the same time. One common strategy to handle scale and redundancy is to implement load balancing for incoming requests. In this approach, a Load Balancer distributes each incoming service request onto a pool of servers, which process the requests, thus providing higher throughput. If one of the servers in the pool fails, the Load Balancer removes it from the pool, and subsequent service requests are distributed among the remaining servers. Users of the application use the IP address of the Load Balancer to access it and are unaware of the pool of servers. OpenStack implements the Load Balancer using the HAProxy software and Linux namespaces.

Virtual Private Network as a Service (VPNaaS)
As mentioned earlier, tenant isolation requires data traffic to be segregated and secured within an OpenStack cloud. However, there are times when external entities need to be part of the same Network without removing the firewall-based security.
This can be accomplished using a Virtual Private Network (VPN). A VPN connects two endpoints on different networks over a public Internet connection, such that the endpoints appear to be directly connected to each other. VPNs also provide confidentiality and integrity for the transmitted data. Neutron provides a service plugin that enables OpenStack users to connect two networks using a VPN. The reference implementation of the VPN plugin in Neutron uses Openswan to create an IPSec-based VPN. IPSec is a suite of protocols that provides a secure connection between two endpoints by encrypting each IP packet transferred between them.

OpenStack and SDN context
So far in this article we have seen the different networking capabilities provided by OpenStack. Let us now look at two capabilities in OpenStack that enable SDN to be leveraged effectively.

Choice of technology
OpenStack, being an open source platform, bundles open source networking solutions as the default implementations of these networking capabilities. For example, routing is supported using namespaces, security using iptables, and load balancing using HAProxy. Historically, these networking capabilities were implemented using customized hardware and software, most of them proprietary solutions. These custom solutions are capable of much higher performance and are well supported by their vendors, and hence they have a place in the OpenStack and SDN ecosystem. From its initial releases, OpenStack has been designed for extensibility: vendors can write their own extensions and then easily configure OpenStack to use their extension instead of the default solution. This allows operators to deploy the networking technology of their choice.

OpenStack API for networking
One of the most powerful capabilities of OpenStack is its extensive support for APIs. All OpenStack services interact with one another using well-defined RESTful APIs.
This allows custom implementations and pluggable components to provide powerful enhancements for practical cloud deployments. For example, when a Network is created using the OpenStack web interface, a RESTful request is sent to the Horizon service. This in turn invokes a RESTful API to validate the user using the Keystone service. Once the user is validated, Horizon sends another RESTful API request to Neutron to actually create the Network.

Summary
As seen in this article, OpenStack supports a wide variety of networking functionality right out of the box. The importance of isolating tenant traffic and the need to allow customized solutions require OpenStack to support flexible configuration. We also highlighted some key aspects of OpenStack that will play a key role in deploying Software Defined Networking in data centers, thereby supporting powerful cloud architectures and solutions.

Resources for Article: Further resources on this subject: Setting Up a Network Backup Server with Bacula [article] Jenkins 2.0: The impetus for DevOps Movement [article] Integrating Accumulo into Various Cloud Platforms [article]
Jorge Izquierdo
21 Sep 2016
7 min read
String management in Swift

One of the most common tasks when building a production app is translating the user interface into multiple languages. I won't go into much detail explaining this or how to set it up, because there are lots of good articles and tutorials on the topic. As a summary, the default system is pretty straightforward. You have a file named Localizable.strings with a set of keys and then different values depending on the file's language. To use these strings from within your app, there is a simple macro in Foundation, NSLocalizedString(key, comment: comment), that will take care of looking up that key in your localizable strings and return the value for the user's device language.

Magic numbers, magic strings
The problem with this handy macro is that because you can add new strings inline, you will presumably end up with dozens of NSLocalizedStrings in the middle of your app's code, resulting in something like this: mainLabel.text = NSLocalizedString("Hello world", comment: "") Or maybe you will write a simple String extension so you don't have to write it every time. That extension would be something like: extension String { var localized: String { return NSLocalizedString(self, comment: "") } } mainLabel.text = "Hello world".localized This is an improvement, but you still have the problem that the strings are all over the place in the app, and it is difficult to maintain a scalable format for the strings, as there is no central repository of strings that follows the same structure. The other problem with this approach is that you have plain strings inside your code, where you could change a character and not notice it until you see a weird string in the user interface. To prevent that, you can take advantage of Swift's strongly typed nature and make the compiler catch these errors, so that nothing unexpected happens at runtime.

Writing a Swift strings file
So that is what we are going to do.
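As a reminder of the target file format, a Localizable.strings file is just a list of key/value pairs. The keys below are hypothetical, but they follow the dotted, lowercase scheme this article derives later:

```
"strings.title" = "My App";
"strings.menu.settings" = "Settings";
```

The goal of the rest of the article is to have the compiler generate and check these keys, instead of scattering raw string literals through the code.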
The goal is to be able to have a data structure that will hold all the strings in your app. The idea is to have something like this: enum Strings { case Title enum Menu { case Feed case Profile case Settings } } And then whenever you want to display a string from the app, you just do: Strings.Title.Feed // "Feed" Strings.Title.Feed.localized // "Feed" or the value for "Feed" in Localizable.strings This system is not likely to scale when you have dozens of strings in your app, so you need to add some sort of organization for the keys. The basic approach would be to just set the value of the enum to the key: enum Strings: String { case Title = "app.title" enum Menu: String { case Feed = "app.menu.feed" case Profile = "app.menu.profile" case Settings = "app.menu.settings" } } But you can see that this is very repetitive and verbose. Also, whenever you add a new string, you need to write its key in the file and then add it to the Localizable.strings file. We can do better than this. Autogenerating the keys Let’s look into how you can automate this process so that you will have something similar to the first example, where you didn't write the key, but you want an outcome like the second example, where you get a reasonable key organization that will be scalable as you add more and more strings during development. We will take advantage of protocol extensions to do this. For starters, you will define a Localizable protocol to make the string enums conform to: protocol Localizable { var rawValue: String { get } } enum Strings: String, Localizable { case Title case Description } And now with the help of a protocol extension, you can get a better key organization: extension Localizable { var localizableKey: String { return self.dynamicType.entityName + "." 
rawValue } static var entityName: String { return String(self) } } With that key, you can fetch the localized string in a similar way as we did with the String extension: extension Localizable { var localized: String { return NSLocalizedString(localizableKey, comment: "") } } What you have done so far allows you to do Strings.Title.localized, which will look in the localizable strings file for the key Strings.Title and return the value for that language.

Polishing the solution
This works great when you only have one level of strings, but if you want to group a bit more, say Strings.Menu.Home.Title, you need to make some changes. The first one is that each child needs to know who its parent is in order to generate a full key. That is impossible to do in Swift in an elegant way today, so what I propose is to explicitly have a variable that holds the type of the parent. This way you can recurse back the strings tree until the parent is nil, where we assume it is the root node. For this to happen, you need to change your Localizable protocol a bit: public protocol Localizable { static var parent: LocalizeParent { get } var rawValue: String { get } } public typealias LocalizeParent = Localizable.Type? Now that you have the parent idea in place, the key generation needs to recurse up the tree in order to find the full path for the key. private let stringSeparator: String = "."
private extension Localizable { static func concatComponent(parent parent: String?, child: String) -> String { guard let p = parent else { return child.snakeCaseString } return p + stringSeparator + child.snakeCaseString } static var entityName: String { return String(self) } static var entityPath: String { return concatComponent(parent: parent?.entityName, child: entityName) } var localizableKey: String { return self.dynamicType.concatComponent(parent: self.dynamicType.entityPath, child: rawValue) } } And to finish, you have to make enums conform to the updated protocol: enum Strings: String, Localizable { case Title enum Menu: String, Localizable { case Feed case Profile case Settings static let parent: LocalizeParent = Strings.self } static let parent: LocalizeParent = nil } With all this in place you can do the following in your app: label.text = Strings.Menu.Settings.localized And the label will have the value for the "strings.menu.settings" key in Localizable.strings. Source code The final code for this article is available on Github. You can find there the instructions for using it within your project. But also you can just add the Localize.swift and modify it according to your project's needs. You can also check out a simple example project to see the whole solution together.  Next time The next steps we would need to take in order to have a full solution is a way for the Localizable.strings file to autogenerate. The solution for this at the current state of Swift wouldn't be very elegant, because it would require either inspecting the objects using the ObjC runtime (which would be difficult to do since we are dealing with pure Swift types here) or defining all the children of a given object explicitly, in the same way as open source XCTest does. Each test case defines all of its tests in a static property. About the author Jorge Izquierdo has been developing iOS apps for 5 years. 
The day Swift was released, he started hacking around with the language and built the first Swift HTTP server, Taylor. He has worked on several projects and right now works as an iOS development contractor.
John Oerter
20 Sep 2016
7 min read
How to Build and Deploy a Node App with Docker

How many times have you deployed your app that was working perfectly in your local environment to production, only to see it break? Whether it was directly related to the bug or feature you were working on, or another random issue entirely, this happens all too often for most developers. Errors like this not only slow you down, but they're also embarrassing. Why does this happen? Usually, it's because your development environment on your local machine is different from the production environment you're deploying to. The tenth factor of the Twelve-Factor App is Dev/prod parity. This means that your development, staging, and production environments should be as similar as possible. The authors of the Twelve-Factor App spell out three "gaps" that can be present. They are: The time gap: A developer may work on code that takes days, weeks, or even months to go into production. The personnel gap: Developers write code, ops engineers deploy it. The tools gap: Developers may be using a stack like Nginx, SQLite, and OS X, while the production deployment uses Apache, MySQL, and Linux. (Source) In this post, we will mostly focus on the tools gap, and how to bridge that gap in a Node application with Docker. The Tools Gap In the Node ecosystem, the tools gap usually manifests itself either in differences in Node and npm versions, or differences in package dependency versions. If a package author publishes a breaking change in one of your dependencies or your dependencies' dependencies, it is entirely possible that your app will break on the next deployment (assuming you reinstall dependencies with npm install on every deployment), while it runs perfectly on your local machine. Although you can work around this issue using tools like npm shrinkwrap, adding Docker to the mix will streamline your deployment life cycle and minimize broken deployments to production. Why Docker? Docker is unique because it can be used the same way in development and production. 
When you enable the architecture of your app to run inside containers, you can easily scale out and create small containers that can be composed together to make one awesome system. Then, you can mimic this architecture in development so you never have to guess how your app will behave in production. In regards to the time gap and the personnel gap, Docker makes it easier for developers to automate deployments, thereby decreasing time to production and making it easier for full-stack teams to own deployments.

Tools and Concepts
When developing inside Docker containers, the two most important concepts are docker-compose and volumes. docker-compose helps define multi-container environments and gives you the ability to run them with one command. Here are some of the more often used docker-compose commands:

- docker-compose build: Builds images for services defined in docker-compose.yml
- docker-compose up: Creates and starts services. This is the same as running docker-compose create && docker-compose start
- docker-compose run: Runs a one-off command inside a container

Volumes allow you to mount files from the host machine into the container. When the files on your host machine change, they change inside the container as well. This is important so that we don't have to constantly rebuild containers during development every time we make a change. You can also use a tool like nodemon to automatically restart the node app on changes. Let's walk through some tips and tricks for developing Node apps inside Docker containers.

Set up Dockerfile and docker-compose.yml
When you start a new project with Docker, you'll first want to define a barebones Dockerfile and docker-compose.yml to get you started. Here's an example Dockerfile:

FROM node:6.2.1
RUN useradd --user-group --create-home --shell /bin/false app-user
ENV HOME=/home/app-user
USER app-user
WORKDIR $HOME/app

This Dockerfile displays two best practices: Favor exact version tags over floating tags such as latest.
New Node versions are released often these days, and you don't want to implicitly upgrade when building your container on another machine. By specifying a version such as 6.2.1, you ensure that anyone who builds the image will always be working from the same Node version.

Create a new user to run the app inside the container. Without this step, everything would run under root in the container. You certainly wouldn't do that on a physical machine, so don't do it in Docker containers either.

Here's an example starter docker-compose.yml:

```yaml
web:
  build: .
  volumes:
    - .:/home/app-user/app
```

Pretty simple, right? Here we are telling Docker to build the web service based on our Dockerfile and create a volume from our current host directory to /home/app-user/app inside the container. This simple setup lets you build the container with docker-compose build and then run bash inside it with docker-compose run --rm web /bin/bash. Now, it's essentially the same as if you were SSH'd into a remote server or working off a VM, except that any file you create inside the container will be on your host machine and vice versa.

With that in mind, you can bootstrap your Node app from inside your container using npm init -y and npm shrinkwrap. Then, you can install any modules you need, such as Express.

Install node modules on build

With that done, we need to update our Dockerfile to install dependencies from npm when the image is built. Here is the updated Dockerfile:

```dockerfile
FROM node:6.2.1

RUN useradd --user-group --create-home --shell /bin/false app-user

ENV HOME=/home/app-user

COPY package.json npm-shrinkwrap.json $HOME/app/
RUN chown -R app-user:app-user $HOME/*

USER app-user
WORKDIR $HOME/app
RUN npm install
```

Notice that we had to change the ownership of the copied files to app-user. This is because files copied into a container are automatically owned by root.
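Stepping back for a moment to the tools gap this setup is meant to close: even without Docker, a crude version-pinning check can catch dependency drift before a deployment. The following is a small, self-contained shell sketch; the package names and versions are made up for illustration, and this is not part of the original workflow:

```shell
# Compare a pinned dependency list against what a machine actually has
# installed. Both files hold sorted "name version" lines; any difference
# between them indicates dev/prod drift.
pinned=$(mktemp); installed=$(mktemp)
printf 'express 4.13.4\nlodash 4.13.1\n' > "$pinned"      # e.g. distilled from npm-shrinkwrap.json
printf 'express 4.13.4\nlodash 4.14.0\n' > "$installed"   # what the target machine reports

check_drift() {
  # comm -3 prints lines unique to either file; empty output means parity
  if [ -n "$(comm -3 "$1" "$2")" ]; then echo "drift detected"; else echo "environments match"; fi
}

check_drift "$pinned" "$installed"
```

Run as-is, the sketch prints "drift detected" because the installed lodash version no longer matches the pinned one; pinning exact versions (or baking dependencies into an image) is what removes this class of surprise.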
Add a volume for the node_modules directory

We also need to update our docker-compose.yml to make sure that our modules are installed inside the container properly:

```yaml
web:
  build: .
  volumes:
    - .:/home/app-user/app
    - /home/app-user/app/node_modules
```

Without adding a data volume for /home/app-user/app/node_modules, the node_modules directory wouldn't exist at runtime in the container: our host directory, which doesn't contain a node_modules directory, would be mounted over it and hide the node_modules directory that was created when the container was built. For more information, see this Stack Overflow post.

Running your app

Once you've got an entry point to your app ready to go, simply add it as a CMD in your Dockerfile:

```dockerfile
CMD ["node", "index.js"]
```

This will automatically start your app on docker-compose up. Running tests inside your container is easy as well:

```shell
docker-compose run --rm web npm test
```

You could easily hook this into CI.

Production

Going to production with your Docker-powered Node app is a breeze! Just use docker-compose again. You will probably want to define another docker-compose.yml that is written especially for production use. This means removing volumes, binding to different ports, setting NODE_ENV=production, and so on. Once you have a production config file, you can tell docker-compose to use it, like so:

```shell
docker-compose -f docker-compose.yml -f docker-compose.production.yml up
```

The -f flag lets you specify a list of files that are merged in the order specified. Here is a complete Dockerfile and docker-compose.yml for reference:

```dockerfile
# Dockerfile
FROM node:6.2.1

RUN useradd --user-group --create-home --shell /bin/false app-user

ENV HOME=/home/app-user

COPY package.json npm-shrinkwrap.json $HOME/app/
RUN chown -R app-user:app-user $HOME/*

USER app-user
WORKDIR $HOME/app
RUN npm install

CMD ["node", "index.js"]
```

```yaml
# docker-compose.yml
web:
  build: .
  ports:
    - '3000:3000'
  volumes:
    - .:/home/app-user/app
    - /home/app-user/app/node_modules
```

About the author

John Oerter is a software engineer from Omaha, Nebraska, USA. He has a passion for continuous improvement and learning in all areas of software development, including Docker, JavaScript, and C#. He blogs here.
Jenkins 2.0: The impetus for DevOps Movement

Packt
19 Sep 2016
15 min read
In this article, Mitesh Soni, the author of the book DevOps for Web Development, provides some insight into the DevOps movement, the benefits of a DevOps culture, and the DevOps lifecycle, and shows how Jenkins 2.0 bridges the gap between continuous integration and continuous delivery with its new features and UI improvements, covering the installation and configuration of Jenkins 2.0 along the way. (For more resources related to this topic, see here.)

Understanding the DevOps movement

Let's try to understand what DevOps is. Is it a real, technical word? No, because DevOps is not just about technical stuff. It is neither simply a technology nor an innovation. In simple terms, DevOps is a blend of complex terminologies. It can be considered a concept, a culture, a development and operational philosophy, or a movement.

To understand DevOps, let's revisit the old days of any IT organization. Consider that there are multiple environments where an application is deployed. The following sequence of events takes place when any new feature is implemented or a bug is fixed:

The development team writes code to implement a new feature or fix a bug. This new code is deployed to the development environment and generally tested by the development team.
The new code is deployed to the QA environment, where it is verified by the testing team.
The code is then provided to the operations team for deployment to the production environment.
The operations team is responsible for managing and maintaining the code.

Let's list the possible issues in this approach:

The transition of the current application build from the development environment to the production environment takes weeks or months.
The priorities of the development team, QA team, and IT operations team differ within an organization, and effective, efficient coordination becomes a necessity for smooth operations.
The development team is focused on the latest development release, while the operations team cares about the stability of the production environment.
The development and operations teams are not aware of each other's work and work culture.
Both teams work in different types of environments; there is a possibility that the development team has resource constraints and therefore uses a different kind of configuration. It may work on localhost or in the dev environment. The operations team works on production resources, so there will be a huge gap between the configuration and deployment environments. The code may not work where it needs to run: the production environment.
Assumptions are key in such a scenario, and it is improbable that both teams will work under the same set of assumptions.
There is manual work involved in setting up the runtime environment and in configuration and deployment activities. The biggest issue with a manual application-deployment process is that it is unrepeatable and error prone.
The development team has the executable files, configuration files, database scripts, and deployment documentation, and provides them to the operations team. All these artifacts are verified in the development environment, not in production or staging.
Each team may take a different approach to setting up the runtime environment and the configuration and deployment activities, considering resource constraints and resource availability.
In addition, the deployment process needs to be documented for future use. Maintaining this documentation is a time-consuming task that requires collaboration between different stakeholders.
Both teams work separately, and hence there can be situations where they use different automation techniques.
Both teams are unaware of the challenges faced by each other and hence may not be able to visualize or understand an ideal scenario in which the application works.
While the operations team is busy with deployment activities, the development team may get another request for a feature implementation or bug fix; in such a case, if the operations team faces any issues in deployment, they may try to consult the development team, who are already occupied with the new implementation request. This results in communication gaps, and the required collaboration may not happen.

There is hardly any collaboration between the development team and the operations team. Poor collaboration causes many issues in the application's deployment to different environments, resulting in back-and-forth communication through e-mail, chat, calls, meetings, and so on, and it often ends in quick fixes.

Challenges for the development team:

The competitive market creates pressure for on-time delivery.
They have to take care of production-ready code management and new feature implementation.
The release cycle is often long, and hence the development team has to make assumptions before the application deployment finally takes place. In such a scenario, it takes more time to fix issues that occur during deployment in the staging or production environment.

Challenges for the operations team:

Resource contention: It's difficult to handle increasing resource demands.
Redesigning or tweaking: This is needed to run the application in the production environment.
Diagnosing and rectifying: They are expected to diagnose and rectify issues after application deployment in isolation.

The benefits of DevOps

This diagram covers all the benefits of DevOps: collaboration among different stakeholders brings many business and technical benefits that help organizations achieve their business goals.

The DevOps lifecycle – it's all about "continuous"

Continuous Integration (CI), Continuous Testing (CT), and Continuous Delivery (CD) are significant parts of the DevOps culture.
CI includes automating builds, unit tests, and packaging processes, while CD is concerned with the application delivery pipeline across different environments. CI and CD accelerate the application development process through automation across different phases, such as build, test, and code analysis, and enable users to achieve end-to-end automation in the application delivery lifecycle.

Continuous integration and continuous delivery or deployment are well supported by cloud provisioning and configuration management. Continuous monitoring helps identify issues or bottlenecks in the end-to-end pipeline and helps make the pipeline effective. Continuous feedback is an integral part of this pipeline; it tells the stakeholders whether they are close to the required outcome or heading in a different direction.

"Continuous effort – not strength or intelligence – is the key to unlocking our potential"
                                                                                           -Winston Churchill

Continuous integration

What is continuous integration? In simple words, CI is a software engineering practice where each check-in made by a developer is verified by either of the following:

Pull mechanism: Executing an automated build at a scheduled time
Push mechanism: Executing an automated build when changes are saved in the repository

This step is followed by executing unit tests against the latest changes available in the source code repository. The main benefit of continuous integration is quick feedback based on the result of build execution. If the build is successful, all is well; otherwise, assign responsibility to the developer whose commit broke the build, notify all stakeholders, and fix the issue. Read more about CI at http://martinfowler.com/articles/continuousIntegration.html.

So why is CI needed? Because it makes things simple and helps us identify bugs or errors in the code at a very early stage of development, when it is relatively easy to fix them.
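The pull mechanism described above (poll the repository on a schedule and build only when something has changed) can be sketched in a few lines of portable shell. This is purely illustrative; a real CI server such as Jenkins tracks commits through SCM plugins rather than content hashes:

```shell
# Naive poll-based build trigger: hash the whole source tree and compare
# it against the hash recorded at the previous poll.
check_and_trigger() {
  repo=$1; state=$2
  cur=$(find "$repo" -type f -exec sha256sum {} + | sort | sha256sum | cut -d' ' -f1)
  last=$(cat "$state" 2>/dev/null)
  if [ "$cur" != "$last" ]; then
    echo "$cur" > "$state"
    echo "build triggered"
  else
    echo "build skipped"
  fi
}

# Demo against a throwaway "repository" (state lives outside the tree
# so that recording a hash doesn't itself change the tree)
repo=$(mktemp -d); state=$(mktemp)
echo 'console.log("hi")' > "$repo/app.js"
check_and_trigger "$repo" "$state"   # first poll: no recorded hash, so build
check_and_trigger "$repo" "$state"   # nothing changed, so skip
echo 'console.log("bye")' > "$repo/app.js"
check_and_trigger "$repo" "$state"   # a check-in changed the tree, so build
```

The push mechanism inverts this: instead of the server polling, the repository notifies the CI server (for example, via a post-commit hook) the moment a change is saved.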
Just imagine if the same scenario takes place after a long duration, when there are too many dependencies and complexities to manage. In the early stages, it is far easier to cure and fix issues; consider health issues as an analogy, and things will be clearer in this context.

Continuous integration is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early. CI is a significant part of, and in fact the base for, the release-management strategy of any organization that wants to develop a DevOps culture. The following are immediate benefits of CI:

Automated integration with a pull or push mechanism
A repeatable process without any manual intervention
Automated test case execution
Coding standard verification
Execution of scripts based on requirements
Quick feedback: build status notification to stakeholders via e-mail
Teams focused on their work and not on managing processes

Jenkins, Apache Continuum, Buildbot, GitLab CI, and so on are some examples of open source CI tools. AnthillPro, Atlassian Bamboo, TeamCity, Team Foundation Server, and so on are some examples of commercial CI tools.

Continuous integration tools – Jenkins

Jenkins was originally open source continuous integration software written in Java under the MIT License. However, Jenkins 2 is an open source automation server that focuses on any kind of automation, including continuous integration and continuous delivery. Jenkins can be used across different platforms, such as Windows, Ubuntu/Debian, Red Hat/Fedora, Mac OS X, openSUSE, and FreeBSD. Jenkins enables users to utilize continuous integration services for software development in an agile environment. It can be used to build freestyle software projects based on Apache Ant and Maven 2/Maven 3. It can also execute Windows batch commands and shell scripts. It can be easily customized with the use of plugins.
There are different kinds of plugins available for customizing Jenkins based on specific needs when setting up continuous integration. Categories of plugins include source code management (the Git, CVS, and Bazaar plugins), build triggers (the Accelerated Build Now and Build Flow plugins), build reports (the Code Scanner and Disk Usage plugins), authentication and user management (the Active Directory and GitHub OAuth plugins), and cluster management and distributed builds (the Amazon EC2 and Azure Slave plugins).

To know more about all the plugins, visit https://wiki.jenkins-ci.org/display/JENKINS/Plugins. To explore how to create a new plugin, visit https://wiki.jenkins-ci.org/display/JENKINS/Plugin+tutorial. To download different versions of plugins, visit https://updates.jenkins-ci.org/download/plugins/. Visit the Jenkins website at http://jenkins.io/.

Jenkins accelerates the software development process through automation.

Key features and benefits

Here are some striking benefits of Jenkins:

Easy install, upgrade, and configuration.
Supported platforms: Windows, Ubuntu/Debian, Red Hat/Fedora/CentOS, Mac OS X, openSUSE, FreeBSD, OpenBSD, Solaris, and Gentoo.
Manages and controls development lifecycle processes.
Non-Java projects are supported by Jenkins, such as .NET, Ruby, PHP, Drupal, Perl, C++, Node.js, Python, Android, and Scala.
A development methodology of daily integrations verified by automated builds; every commit can trigger a build.
Jenkins is a fully featured technology platform that enables users to implement CI and CD. The use of Jenkins is not limited to CI and CD: it is possible to model and orchestrate the entire pipeline with Jenkins, as it supports shell and Windows batch command execution. Jenkins 2.0 supports a delivery pipeline that uses a Domain-Specific Language (DSL) for modeling entire deployments or delivery pipelines.
Pipeline as code provides a common language (the DSL) to help the development and operations teams collaborate in an effective manner. Jenkins 2 brings a new GUI with a stage view to observe progress across the delivery pipeline. Jenkins 2.0 is fully backward compatible with the Jenkins 1.x series. Jenkins 2 now requires Servlet 3.1 to run; you can use the embedded Winstone-Jetty or a container that supports Servlet 3.1 (such as Tomcat 8).

GitHub, CollabNet, SVN, TFS code repositories, and so on are supported by Jenkins for collaborative development.

Continuous integration: Jenkins automates build and test phases: automated testing (continuous testing), packaging, and static code analysis. It supports common test frameworks such as the HP ALM tools, JUnit, Selenium, and MSTest. For continuous testing, Jenkins slaves can execute test suites on different platforms. Jenkins supports static code analysis tools such as CheckStyle and FindBugs for code verification, and it also integrates with Sonar.

Continuous delivery and continuous deployment: Jenkins automates the application deployment pipeline, integrates with popular configuration management tools, and automates environment provisioning. To achieve continuous delivery and deployment, Jenkins supports automatic deployment; it provides a plugin for direct integration with IBM uDeploy.

Highly configurable: A plugin-based architecture provides support for many technologies, repositories, build tools, and test tools; the open source CI server provides over 400 plugins for extensibility.

Supports distributed builds: Jenkins supports a "master/slave" mode, where the workload of building projects is delegated to multiple slave nodes.

It has a machine-consumable remote access API to retrieve information from Jenkins for programmatic consumption, to trigger a new build, and so on. It delivers a better application faster by automating the application development lifecycle, allowing faster delivery.
The Jenkins build pipeline (quality gate system) provides a build pipeline view of upstream and downstream connected jobs, as a chain of jobs, each one subjecting the build to quality-assurance steps. It has the ability to define manual triggers for jobs that require intervention prior to execution, such as an approval process outside of Jenkins. The following diagram illustrates quality gates and the orchestration of the build pipeline.

Jenkins can be used with tools in the following categories, as shown here:

Language: Java; .NET
Code repositories: Subversion, Git, CVS, StarTeam
Build tools: Ant, Maven; NAnt, MSBuild
Code analysis tools: Sonar, CheckStyle, FindBugs, NCover, Visual Studio Code Metrics, PowerTool
Continuous integration: Jenkins
Continuous testing: Jenkins plugins (HP Quality Center 10.00 with the QuickTest Professional add-in, HP Unified Functional Testing 11.5x and 12.0x, HP Service Test 11.20 and 11.50, HP LoadRunner 11.52 and 12.0x, HP Performance Center 12.xx, HP QuickTest Professional 11.00, HP Application Lifecycle Management 11.00, 11.52, and 12.xx, HP ALM Lab Management 11.50, 11.52, and 12.xx, JUnit, MSTest, and VsTest)
Infrastructure provisioning: Configuration management tools such as Chef
Virtualization/cloud service providers: VMware, AWS, Microsoft Azure (IaaS), traditional environments
Continuous delivery/deployment: Chef, deployment plugins, shell scripting, PowerShell scripts, Windows batch commands

Installing Jenkins

Jenkins provides us with multiple ways to install it for all types of users. We can install it on at least the following operating systems:

Ubuntu/Debian
Windows
Mac OS X
OpenBSD
FreeBSD
openSUSE
Gentoo
CentOS/Fedora/Red Hat

One of the easiest options I recommend is to use a WAR file. A WAR file can be used with or without a container or web application server. Having Java installed is a must before we try to use the WAR file for Jenkins, which can be done as follows:

Download the jenkins.war file from https://jenkins.io/.
Open a command prompt in Windows or a terminal in Linux, go to the directory where the jenkins.war file is stored, and execute the following command:

```shell
java -jar jenkins.war
```

Once Jenkins is fully up and running, as shown in the following screenshot, explore it in the web browser by visiting http://localhost:8080. By default, Jenkins works on port 8080. To use a different port, execute the following command from the command line:

```shell
java -jar jenkins.war --httpPort=9999
```

For HTTPS, use the following command:

```shell
java -jar jenkins.war --httpsPort=8888
```

Once Jenkins is running, visit the Jenkins home directory. In our case, we have installed Jenkins 2 on a CentOS 6.7 virtual machine. Go to /home/<username>/.jenkins, as shown in the following screenshot. If you can't see the .jenkins directory, make sure hidden files are visible. In CentOS, press Ctrl+H to make hidden files visible.

Setting up Jenkins

Now that we have installed Jenkins, let's verify whether Jenkins is running. Open a browser and navigate to http://localhost:8080 or http://<IP_ADDRESS>:8080. If you've used Jenkins earlier and recently downloaded the Jenkins 2 WAR file, it will ask for a security setup. To unlock Jenkins, follow these steps:

Go to the .jenkins directory and open the initialAdminPassword file from the secrets subdirectory.
Copy the password in that file, paste it in the Administrator password box, and click on Continue, as shown here.
Clicking on Continue will redirect you to the Customize Jenkins page. Click on Install suggested plugins. The installation of the plugins will start; make sure that you have a working Internet connection.
Once all the required plugins have been installed, you will see the Create First Admin User page. Provide the required details, and click on Save and Finish.
Jenkins is ready! Our Jenkins setup is complete. Click on Start using Jenkins.

Get Jenkins plugins from https://wiki.jenkins-ci.org/display/JENKINS/Plugins.
Summary

We have covered some brief details on the DevOps culture and on Jenkins 2.0 and its new features. DevOps for Web Development provides more details on extending continuous integration to continuous delivery and continuous deployment using configuration management tools such as Chef and cloud computing platforms such as Microsoft Azure (App Services) and AWS (Amazon EC2 and AWS Elastic Beanstalk); refer to https://www.packtpub.com/networking-and-servers/devops-web-development. For more details on Jenkins, refer to Jenkins Essentials, https://www.packtpub.com/application-development/jenkins-essentials.

Resources for Article:

Further resources on this subject:

Setting Up and Cleaning Up [article]
Maven and Jenkins Plugin [article]
Exploring Jenkins [article]
Setting Up a Network Backup Server with Bacula

Packt
19 Sep 2016
12 min read
In this article by Timothy Boronczyk, the author of the book CentOS 7 Server Management Cookbook, we'll discuss how to set up a network backup server with Bacula. The fact of the matter is that we are living in a world that is becoming increasingly dependent on data. Also, from accidental deletion to a catastrophic hard drive failure, there are many threats to the safety of your data. The more important your data is, and the more difficult it is to recreate if lost, the more important it is to have backups. So, this article shows you how you can set up a backup server using Bacula and how to configure other systems on your network to back up their data to it. (For more resources related to this topic, see here.)

Getting ready

This article requires at least two CentOS systems with working network connections. The first system is the local system, which we'll assume has the hostname benito and the IP address 192.168.56.41. The second system is the backup server. You'll need administrative access on both systems, either by logging in with the root account or through the use of sudo.

How to do it…

Perform the following steps on your local system to install and configure the Bacula file daemon:

1. Install the bacula-client package:

```shell
yum install bacula-client
```

2. Open the file daemon's configuration file with your text editor:

```shell
vi /etc/bacula/bacula-fd.conf
```

3. In the FileDaemon resource, update the value of the Name directive to reflect the system's hostname with the suffix -fd:

```
FileDaemon {
  Name = benito-fd
  ...
}
```

4. Save the changes and close the file.

5. Start the file daemon and enable it to start when the system reboots:

```shell
systemctl start bacula-fd.service
systemctl enable bacula-fd.service
```

6. Open the firewall to allow TCP traffic through to port 9102:

```shell
firewall-cmd --zone=public --permanent --add-port=9102/tcp
firewall-cmd --reload
```

7. Repeat steps 1-6 on each system that will be backed up.

Then perform the following steps on the backup server:

8. Install the bacula-console, bacula-director, bacula-storage, and bacula-client packages.
```shell
yum install bacula-console bacula-director bacula-storage bacula-client
```

9. Re-link the catalog library to use SQLite database storage:

```shell
alternatives --config libbaccats.so
```

Type 2 when asked to provide the selection number.

10. Create the SQLite database file and import the table schema:

```shell
/usr/libexec/bacula/create_sqlite3_database
/usr/libexec/bacula/make_sqlite3_tables
```

11. Open the director's configuration file with your text editor:

```shell
vi /etc/bacula/bacula-dir.conf
```

12. In the Job resource where Name has the value BackupClient1, change the value of the Name directive to reflect one of the local systems. Then add a Client directive with a value that matches that system's FileDaemon Name:

```
Job {
  Name = "BackupBenito"
  Client = benito-fd
  JobDefs = "DefaultJob"
}
```

13. Duplicate the Job resource and update its directive values as necessary so that there is a Job resource defined for each system to be backed up.

14. For each system that will be backed up, duplicate the Client resource where the Name directive is set to bacula-fd. In the copied resource, update the Name and Address directives to identify that system:

```
Client {
  Name = bacula-fd
  Address = localhost
  ...
}

Client {
  Name = benito-fd
  Address = 192.168.56.41
  ...
}

Client {
  Name = javier-fd
  Address = 192.168.56.42
  ...
}
```

15. Save your changes and close the file.

16. Open the storage daemon's configuration file:

```shell
vi /etc/bacula/bacula-sd.conf
```

17. In the Device resource where Name has the value FileStorage, change the value of the Archive Device directive to /bacula:

```
Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /bacula
  ...
}
```

18. Save the update and close the file.

19. Create the /bacula directory and assign it the proper ownership:

```shell
mkdir /bacula
chown bacula:bacula /bacula
```

20. If you have SELinux enabled, reset the security context on the new directory:

```shell
restorecon -Rv /bacula
```

21. Start the director and storage daemons and enable them to start when the system reboots.
```shell
systemctl start bacula-dir.service bacula-sd.service bacula-fd.service
systemctl enable bacula-dir.service bacula-sd.service bacula-fd.service
```

22. Open the firewall to allow TCP traffic through to ports 9101-9103:

```shell
firewall-cmd --zone=public --permanent --add-port=9101-9103/tcp
firewall-cmd --reload
```

23. Launch Bacula's console interface:

```shell
bconsole
```

24. Enter label to create a destination for the backup. When prompted for the volume name, use Volume0001 or a similar value. When prompted for the pool, select the File pool.

25. Enter quit to leave the console interface.

How it works…

The suite's distributed architecture and the amount of flexibility it offers can make configuring Bacula a daunting task. However, once you have everything up and running, you'll be able to rest easy knowing that your data is safe from disasters and accidents.

Bacula is broken up into several components. In this article, our efforts centered on the following three daemons: the director, the file daemon, and the storage daemon. The file daemon is installed on each local system to be backed up and listens for connections from the director. The director connects to each file daemon as scheduled and tells it which files to back up and where to copy them to (the storage daemon). This allows us to perform all scheduling at a central location. The storage daemon then receives the data and writes it to the backup medium, for example, a disk or tape drive.

On the local system, we installed the file daemon with the bacula-client package and edited the file daemon's configuration file at /etc/bacula/bacula-fd.conf to specify the name of the process. The convention is to add the suffix -fd to the system's hostname:

```
FileDaemon {
  Name = benito-fd
  FDPort = 9102
  WorkingDirectory = /var/spool/bacula
  Pid Directory = /var/run
  Maximum Concurrent Jobs = 20
}
```

On the backup server, we installed the bacula-console, bacula-director, bacula-storage, and bacula-client packages.
This gives us the director and storage daemon and another file daemon. This file daemon's purpose is to back up Bacula's catalog. Bacula maintains a database of metadata about previous backup jobs, called the catalog, which can be managed by MySQL, PostgreSQL, or SQLite.

To support multiple databases, Bacula is written so that all of its database access routines are contained in shared libraries, with a different library for each database. When Bacula wants to interact with a database, it does so through libbaccats.so, a fake library that is nothing more than a symbolic link pointing to one of the specific database libraries. This lets Bacula support different databases without requiring us to recompile its source code. To create the symbolic link, we used alternatives and selected the real library that we want to use. I chose SQLite since it's an embedded database library and doesn't require additional services.

Next, we needed to initialize the database schema using scripts that come with Bacula. If you want to use MySQL, you'll need to create a dedicated MySQL user for Bacula to use and then initialize the schema with the following scripts instead. You'll also need to review Bacula's configuration files to provide Bacula with the required MySQL credentials:

```shell
/usr/libexec/bacula/grant_mysql_privileges
/usr/libexec/bacula/create_mysql_database
/usr/libexec/bacula/make_mysql_tables
```

Different resources are defined in the director's configuration file at /etc/bacula/bacula-dir.conf, many of which consist not only of their own values but also reference other resources. For example, the FileSet resource specifies which files are included or excluded in backups and restores, while a Schedule resource specifies when backups should be made. A JobDef resource can contain various configuration directives that are common to multiple backup jobs and also reference particular FileSet and Schedule resources.
Client resources identify the names and addresses of systems running file daemons, and a Job resource pulls together a JobDef and a Client resource to define the backup or restore task for a particular system. Some resources define things at a more granular level and are used as building blocks to define other resources. This allows us to create complex definitions in a flexible manner. The default resource definitions outline basic backup and restore jobs that are sufficient for this article (you'll want to study the configuration and see how the different resources fit together so that you can tweak them to better suit your needs).

We customized the existing backup Job resource by changing its name and client. Then, we customized the Client resource by changing its name and address to point to a specific system running a file daemon. A pair of Job and Client resources can be duplicated for each additional system you want to back up. However, notice that I left the default Client resource that defines bacula-fd for the localhost. This is for the file daemon that's local to the backup server and will be the target for things such as restore jobs and catalog backups.

```
Job {
  Name = "BackupBenito"
  Client = benito-fd
  JobDefs = "DefaultJob"
}

Job {
  Name = "BackupJavier"
  Client = javier-fd
  JobDefs = "DefaultJob"
}

Client {
  Name = bacula-fd
  Address = localhost
  ...
}

Client {
  Name = benito-fd
  Address = 192.168.56.41
  ...
}

Client {
  Name = javier-fd
  Address = 192.168.56.42
  ...
}
```

To complete the setup, we labeled a backup volume. This task, as with most others, is performed through bconsole, a console interface to the Bacula director. We used the label command to specify a label for the backup volume, and when prompted for the pool, we assigned the labeled volume to the File pool. In a way very similar to how LVM works, an individual device or storage unit is allocated as a volume, and volumes are grouped into storage pools.
If a pool contains two volumes backed by tape drives, for example, and one of the drives is full, the storage daemon will write the data to the tape that has space available. Even though in our configuration we're storing the backup to disk, we still need to create a volume as the destination for data to be written to.

There's more...

At this point, you should consider which backup strategy works best for you. A full backup is a complete copy of your data, a differential backup captures only the files that have changed since the last full backup, and an incremental backup copies the files that have changed since the last backup (regardless of that backup's type). Commonly, administrators employ a combination of these, perhaps making a full backup at the start of the week and then differential or incremental backups each day thereafter. This saves storage space because the differential and incremental backups are smaller; it is also convenient when the need to restore a file arises, because only a limited number of backups need to be searched for the file.

Another consideration is the expected size of each backup and how long it will take for the backup to run to completion. Full backups obviously take longer to run, and in an office with 9-5 working hours Monday through Friday, it may not be possible for a full backup to finish in a single evening. Performing a full backup on Fridays gives the backup the entire weekend to run. Smaller, incremental backups can be performed on the other days, when less time is available.

Yet another important point in your backup strategy is how long the backups will be kept and where they will be kept. A year's worth of backups is of no use if your office burns down and they were sitting in the office's IT closet. At one employer, we kept the last full backup and the last day's incremental on site; they were then duplicated to tape and stored off site.
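A rotation like the one just described maps directly onto a Schedule resource. The following sketch is modeled on the WeeklyCycle schedule found in Bacula's default configuration (the exact names and times may differ in your install):

```
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Differential 2nd-5th sun at 23:05
  Run = Incremental mon-sat at 23:05
}
```

A Job or JobDefs resource then points at this schedule with Schedule = "WeeklyCycle", and the backup level (Full, Differential, or Incremental) is chosen by whichever Run directive fires.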
Regardless of the strategy you choose to implement, your backups are only as good as your ability to restore data from them. You should periodically test your backups to make sure that you can restore your files. To run a backup job on demand, enter run in bconsole. You'll be prompted with a menu to select one of the currently configured jobs. You'll then be presented with the job's options, such as what level of backup will be performed (full, incremental, or differential), its priority, and when it will run. You can type yes or no to accept or cancel the job, or mod to modify a parameter. Once accepted, the job will be queued and assigned a job ID.

To restore files from a backup, use the restore command. You'll be presented with a list of options allowing you to specify which backup the desired files will be retrieved from. Depending on your selection, the prompts will be different. Bacula's prompts are rather clear, so read them carefully and they will guide you through the process.

Apart from the run and restore commands, another useful command is status. It allows you to see the current status of the Bacula components, whether there are any jobs currently running, and which jobs have completed. A full list of commands can be retrieved by typing help in bconsole.

See also

For more information on working with Bacula, refer to the following resources:

Bacula documentation (http://blog.bacula.org/documentation/)
How to use Bacula on CentOS 7 (http://www.digitalocean.com/community/tutorial_series/how-to-use-bacula-on-centos-7)
Bacula Web (a web-based reporting and monitoring tool for Bacula) (http://www.bacula-web.org/)

Summary

In this article, we discussed how to set up a backup server using Bacula and how to configure other systems on our network to back up their data to it.

Resources for Article:

Further resources on this subject:

Jenkins 2.0: The impetus for DevOps Movement [article]
Gearing Up for Bootstrap 4 [article]
Introducing Penetration Testing [article]


How to construct your own network models on Chainer

Masayuki Takagi
19 Sep 2016
6 min read
In this post, I will introduce you to how to construct your own network model on Chainer. First, I will introduce Chainer's basic concepts as its building blocks. Then, you will see how to construct network models on Chainer. Finally, I will define a multi-class classifier bound to the network model.

Concepts

Let's start with the basic concepts Chainer has as its building blocks:

Procedural abstractions
  Chains
  Links
  Functions
Data abstraction
  Variables

Chainer has three procedural abstractions, ordered from higher to lower level: chains, links, and functions. Chains are at the highest level and represent entire network models. They consist of links and/or other chains. Links are like layers in a network model, which have learnable parameters to be optimized through training. Functions are the most fundamental of Chainer's procedural abstractions. They take inputs and return outputs. Links use functions, applying them to their inputs and parameters to return their outputs. At the data abstraction level, Chainer has variables. They represent the inputs and outputs of chains, links, and functions. As their actual data representation, they wrap numpy and cupy's n-dimensional arrays. In the following sections, we will construct our own network models with these concepts.

Construct network models on Chainer

In this section, I describe how to construct a multi-layer perceptron (MLP) with three layers on top of the basic concepts shown in the previous section. It is very simple.

from chainer import Chain
import chainer.functions as F
import chainer.links as L

class MLP(Chain):
    def __init__(self):
        super(MLP, self).__init__(
            l1=L.Linear(784, 100),
            l2=L.Linear(100, 100),
            l3=L.Linear(100, 10),
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

We define the MLP class derived from the Chain class. Then we implement two methods, __init__ and __call__. The __init__ method is for initializing the links that the chain has.
It has three fully connected layers: chainer.links.Linear, or L.Linear above. The first layer, named l1, takes 784-dimension input vectors, corresponding to a 28x28-pixel grayscale handwritten digit image. The last layer, named l3, returns 10-dimension output vectors, which correspond to the ten digits from 0 to 9. Between them is a middle layer, named l2, with 100-dimension inputs and outputs. The __call__ method is called on forward propagation. It takes an input x as a Chainer variable, applies it to the three Linear layers with two ReLU activation functions after the l1 and l2 layers, and then returns a Chainer variable y that is the output of the l3 layer. Because the two ReLU activation functions are Chainer functions, they do not have learnable parameters internally, so you do not have to initialize them in the __init__ method. This code does forward propagation as well as network construction behind the stage. It is Chainer's magic, and backward propagation is automatically computed based on the network constructed here when we optimize it. Of course, you may write the __call__ method as follows; the local variables h1 and h2 are just for clearer code.

def __call__(self, x):
    return self.l3(F.relu(self.l2(F.relu(self.l1(x)))))

Define a Multi-class classifier

Another chain is Classifier, which will be bound to the MLP chain defined in the previous section. Classifier is for general multi-class classification, and it computes the loss value and accuracy of a network model for a given input vector and its ground truth.

from chainer import Chain
import chainer.functions as F

class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)

    def __call__(self, x, t):
        y = self.predictor(x)
        self.loss = F.softmax_cross_entropy(y, t)
        self.accuracy = F.accuracy(y, t)
        return self.loss

We define the Classifier class derived from the Chain class, as we did for the MLP class, because it takes a chain as a predictor. We similarly implement the __init__ and __call__ methods.
In the __init__ method, it takes a predictor parameter as a chain (a link is also acceptable) to initialize this class. In the __call__ method, it takes two inputs, x and t. These are the input vectors and their ground truth, respectively. Chains that are passed to Chainer optimizers should be compliant with this protocol. First, we give the input vector x to the predictor, which would be the MLP model in this post, to get the result y of forward propagation. Then, the loss value and accuracy are computed with the softmax_cross_entropy and accuracy functions for a given ground truth t. Finally, it returns the loss value. The accuracy can be accessed as an attribute at any time. We initialize this classifier bound to the MLP model as follows; model is what gets passed to a Chainer optimizer.

model = Classifier(MLP())

Conclusion

Now we have our own network model. In this post, I introduced the basic concepts of Chainer. Then we implemented the MLP class that models a multi-layer perceptron and the Classifier class that computes the loss value and accuracy to be optimized for a given ground truth.

from chainer import Chain
import chainer.functions as F
import chainer.links as L

class MLP(Chain):
    def __init__(self):
        super(MLP, self).__init__(
            l1=L.Linear(784, 100),
            l2=L.Linear(100, 100),
            l3=L.Linear(100, 10),
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)

    def __call__(self, x, t):
        y = self.predictor(x)
        self.loss = F.softmax_cross_entropy(y, t)
        self.accuracy = F.accuracy(y, t)
        return self.loss

model = Classifier(MLP())

As some next steps, you may want to learn:

How to optimize the model.
How to train and test the model on the MNIST dataset.
How to accelerate training of the model using a GPU.

Chainer provides an MNIST example program in the chainer/examples/mnist directory, which will help you.
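As a closing aside, the arithmetic these two chains perform can be mimicked without Chainer at all. The following NumPy sketch is my own illustration, not Chainer code: the names relu, softmax_cross_entropy, and accuracy mirror the concepts above, the biases are omitted, and the weights are random. It runs the 784-100-100-10 forward pass and computes the loss and accuracy for a toy batch:

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0)

def softmax_cross_entropy(y, t):
    # Shift by the row max for numerical stability, then take the mean
    # negative log-likelihood of the true classes.
    z = y - y.max(axis=1, keepdims=True)
    log_softmax = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_softmax[np.arange(len(t)), t].mean()

def accuracy(y, t):
    # Fraction of rows whose highest score matches the label
    return (y.argmax(axis=1) == t).mean()

rng = np.random.default_rng(0)
# Small random weights shaped like l1, l2, l3 above (biases omitted)
W1, W2, W3 = (0.01 * rng.normal(size=s)
              for s in [(784, 100), (100, 100), (100, 10)])

x = rng.normal(size=(2, 784))   # a batch of two flattened 28x28 images
t = np.array([3, 7])            # their (made-up) ground-truth digits

y = relu(relu(x @ W1) @ W2) @ W3    # forward pass: 10 scores per image
print(y.shape)                      # (2, 10)
print(softmax_cross_entropy(y, t), accuracy(y, t))
```

With untrained weights the scores are near zero, so the loss lands close to ln(10) and the accuracy is whatever chance provides; training would push the loss down by adjusting the weights, which is exactly what a Chainer optimizer does with model.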
About the author Masayuki Takagi is an entrepreneur and software engineer from Japan. His professional experience domains are advertising and deep learning, serving big Japanese corporations. His personal interests are fluid simulation, GPU computing, FPGA, compiler, and programming language design. Common Lisp is his most beloved programming language and he is the author of the cl-cuda library. Masayuki is a graduate of the University of Tokyo and lives in Tokyo with his buddy Plum, a long coat Chihuahua, and aco.


Making History with Event Sourcing

Packt
15 Sep 2016
18 min read
In this article by Christian Baxter, author of the book Mastering Akka, we will see that when it comes to the persistence needs of an application, the most common, tried, and true approach is to model the data in a relational database. Following this approach has been the de facto way to store data until recently, when NoSQL (and to a lesser extent NewSQL) started to chip away at the footholds of relational database dominance. There's nothing wrong with storing your application's data this way—it's how we initially chose to do so for the bookstore application, using PostgreSQL as the storage engine. This article deals with event sourcing and how to implement that approach using Akka Persistence. These are the main things you can expect to learn from this article:

(For more resources related to this topic, see here.)

Akka persistence for event sourcing

Akka persistence is a relatively new module within the Akka toolkit. It became available as experimental in the 2.3.x series. Throughout that series, it went through quite a few changes as the team worked on getting the API and functionality right. When Akka 2.4.2 was released, the experimental label was removed, signifying that persistence was stable and ready to be leveraged in production code. Akka persistence allows stateful actors to persist their internal state. It does this not by persisting the state itself, but instead by persisting changes to that state. It uses an append-only model to persist these state changes, allowing you to later reconstitute the state by replaying the changes to that state. It also allows you to take periodic snapshots and use those to reestablish an actor's state, as a performance optimization for long-lived entities with lots of state changes. Akka persistence's approach should certainly sound familiar, as it's almost a direct overlay of the features of event sourcing. In fact, it was inspired by the eventsourced Scala library, so that overlay is no coincidence.
Because of this alignment with event sourcing, Akka persistence will be the perfect tool for us to switch over to an event sourced model. Before getting into the details of the refactor, I want to describe some of the high-level concepts in the framework.

The PersistentActor trait

The PersistentActor trait is the core building block for creating event sourced entities. This actor is able to persist its events to a pluggable journal. When a persistent actor is restarted (reloaded), it will replay its journaled events to reestablish its current internal state. These two behaviors perfectly fit what we need to do for our event sourced entities, so this will be our core building block. The PersistentActor trait has a lot of features, more than I will cover in the next few sections. I'll cover the things that we will use in the bookstore refactoring, which I consider to be the most useful features of PersistentActor. If you want to learn more, then I suggest you take a look at the Akka documentation, as it pretty much covers everything else that you can do with PersistentActor.

Persistent actor state handling

A PersistentActor implementation has two basic states that it can be in—Recovering and Receiving Commands. When Recovering, it's in the process of reloading its event stream from the journal to rebuild its internal state. Any external messages that come in during this time will be stashed until the recovery process is complete. Once the recovery process completes, the persistent actor transitions into the Receiving Commands state, where it can start to handle commands. These commands can then generate new events that can further modify the state of this entity. This two-state flow can be visualized in the following diagram:

These two states are both represented by custom actor receive handling partial functions.
You must provide implementations for both of the following vals in order to properly implement these two states for your persistent actor:

val receiveRecover: Receive = { . . . }
val receiveCommand: Receive = { . . . }

While in the recovering state, there are two possible messages that you need to be able to handle. The first is one of the event types that you previously persisted for this entity type. When you get that type of message, you have to reapply the change implied by that event to the internal state of the actor. For example, if we had a SalesOrderFO fields object as the internal state, and we received a replayed event indicating that the order was approved, the handling might look something like this:

var state: SalesOrderFO = ...
val receiveRecover: Receive = {
  case OrderApproved(id) =>
    state = state.copy(status = SalesOrderStatus.Approved)
}

We'd, of course, need to handle a lot more than that one event. This code sample was just to show you how you can modify the internal state of a persistent actor when it's being recovered. Once the actor has completed the recovery process, it can transition into the state where it starts to handle incoming command requests. Event sourcing is all about Action (command) and Reaction (events). When the persistent actor receives a command, it has the option to generate zero to many events as a result of that command. These events represent a happening on that entity that will affect its current state. Events you receive while in the Recovering state will have been previously generated while in the Receiving Commands state. So, the preceding example, where we receive OrderApproved, must have previously come from some command that we handled earlier. The handling of that command could have looked something like this:

val receiveCommand: Receive = {
  case ApproveOrder(id) =>
    persist(OrderApproved(id)){ event =>
      state = state.copy(status = SalesOrderStatus.Approved)
      sender() ! FullResult(state)
    }
}

After receiving the command request to change the order status to approved, the code makes a call to persist, which will asynchronously write an event into the journal. The full signature of persist is:

persist[A](event: A)(handler: (A) ⇒ Unit): Unit

The first argument represents the event that you want to write to the journal. The second argument is a callback function that will be executed after the event has been successfully persisted (and won't be called at all if the persistence fails). For our example, we use that callback function to mutate the internal state, updating the status field to match the requested action. One thing to note is that the write to the journal is asynchronous. One may then think that it's possible to be closing over that internal state in an unsafe way when the callback function is executed. If you persisted two events in rapid succession, couldn't it be possible for both of them to access that internal state at the same time in separate threads, kind of like when using Futures in an actor? Thankfully, this is not the case. The completion of a persistence action is sent back as a new message to the actor. The hidden receive handling for this message will then invoke the callback associated with that persistence action. By using the mailbox again, we know these post-persistence actions will be executed one at a time, in a safe manner. As an added bonus, the sender associated with those post-persistence messages will be the original sender of the command, so you can safely use sender() in a persistence callback to reply to the original requestor, as shown in my example. Another guarantee that the persistence framework makes when persisting events is that no other commands will be processed in between the persistence action and the associated callback. Any commands that come in during that time will be stashed until all of the post-persistence actions have been completed.
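The mechanics just described, write the event to the journal and then apply it to internal state via the callback, replaying the same events on recovery, can be sketched outside Akka in a few lines of Python. This is a conceptual illustration only, not the actor machinery; the Book naming and TagAdded event are made up for the example:

```python
# Append-only journal keyed by an entity identifier,
# standing in for Akka's write journal and persistenceId.
journal = {}

class Book:
    def __init__(self, book_id):
        self.persistence_id = "book-%d" % book_id
        self.tag = ""
        # Recovery phase: replay previously journaled events to rebuild state
        for event in journal.get(self.persistence_id, []):
            self.apply(event)

    def apply(self, event):
        # The same state mutation runs for live persists and for replays
        kind, value = event
        if kind == "TagAdded":
            self.tag = value

    def persist(self, event):
        # Journal first; only afterwards mutate state via the "callback"
        journal.setdefault(self.persistence_id, []).append(event)
        self.apply(event)

b = Book(1)
b.persist(("TagAdded", "scala"))

other = Book(2)        # a different identifier gets its own, empty stream
recovered = Book(1)    # a "restart": state is rebuilt purely from replay
print(recovered.tag)   # scala
```

A restarted instance rebuilds its state purely by replaying its own journaled events, and each identifier keeps a separate stream. In the real actor, persistence completions arrive as ordinary messages through the mailbox and are handled one at a time.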
This makes the persist/callback sequence atomic and isolated, in that nothing else can interfere with it while it's happening. Allowing additional commands to be executed during this process could lead to an inconsistent state and response to the caller who sent the commands. If, for some reason, persisting to the journal fails, there is an onPersistFailure callback that will be invoked. If you want to implement custom handling for this, you can override this method. No matter what, when persistence fails, the actor will be stopped after making this callback. At this point, it's possible that the actor is in an inconsistent state, so it's safer to stop it than to allow it to continue on in this state. Persistence failures probably mean something is failing with the journal anyway, so restarting as opposed to stopping will more than likely lead to even more failures.

There's one more callback that you can implement in your persistent actors, and that's onPersistRejected. This will happen if the serialization framework rejects the serialization of the event to store. When this happens, the persist callback does not get invoked, so no internal state update will happen. In this case, the actor does not stop or restart, because it's not in an inconsistent state and the journal itself is not failing.

The PersistenceId

Another important concept that you need to understand with PersistentActor is the persistenceId method. This abstract method must be defined for every type of PersistentActor you define, returning a String that is unique across different entity types and also between actor instances within the same type. Let's say I create the Book entity as a PersistentActor and define the persistenceId method as follows:

override def persistenceId = "book"

If I do that, then I will have a problem with this entity, in that every instance will share the entire event stream of every other Book instance.
If I want each instance of the Book entity to have its own separate event stream (and trust me, you will), then I will do something like this when defining the Book PersistentActor:

class Book(id: Int) extends PersistentActor {
  override def persistenceId = s"book-$id"
}

If I follow an approach like this, then I can be assured that each of my entity instances will have its own separate event stream, as the persistenceId will be unique for every Int-keyed Book we have. In the current model, when creating a new instance of an entity, we pass in the special ID of 0 to indicate that this entity does not yet exist and needs to be persisted. We defer ID creation to the database, and once we have an ID (after persistence), we stop that actor instance as it is not properly associated with that newly generated ID. With the persistenceId model of associating the event stream with an entity, we will need the ID as soon as we create the actor instance. This means we will need a way to have a unique identifier even before persisting the initial entity state. This is something to think about before we get to the upcoming refactor.

Taking snapshots for faster recovery

I've mentioned the concept of taking a snapshot of the current state of an entity to speed up the process of recovering its state. If you have a long-lived entity that has generated a large number of events, it will take progressively more and more time to recover its state. Akka's PersistentActor supports the snapshot concept, putting it in your hands as to when to take the snapshot. Once you have taken snapshots, the latest one will be offered to the entity during the recovery phase instead of all of the events that led up to it. This will reduce the total number of events to process to recover state, thus speeding up that process. This is a two-part process, with the first part being taking snapshots periodically and the second being handling them during the recovery phase.
Let's take a look at the snapshot taking process first. Let's say that you coded a particular entity to save a new snapshot for every one hundred events received. To make this happen, your command handling block may look something like this:

var eventTotal = ...
val receiveCommand: Receive = {
  case UpdateStatus(status) =>
    persist(StatusUpdated(status)){ event =>
      state = state.copy(status = event.status)
      eventTotal += 1
      if (eventTotal % 100 == 0)
        saveSnapshot(state)
    }
  case SaveSnapshotSuccess(metadata) => . . .
  case SaveSnapshotFailure(metadata, reason) => . . .
}

You can see in the post-persist logic that we are making a specific call to saveSnapshot, passing the latest version of the actor's internal state. You're not limited to doing this just in the post-persist logic in reaction to a new event; you can also set up the actor to publish a snapshot on regular intervals. You can leverage Akka's scheduler to send a special message to the entity to instruct it to save the snapshot periodically. If you start saving snapshots, then you will have to start handling the two new messages that will be sent to the entity indicating the status of the saved snapshot. These two new message types are SaveSnapshotSuccess and SaveSnapshotFailure. The metadata that appears on both messages will tell you things such as the persistence ID where the failure occurred, the sequence number of the snapshot that failed, and the timestamp of the failure. You can see these two new messages in the command handling block shown in the preceding code. Once you have saved a snapshot, you will need to start handling it in the recovery phase. The logic to handle a snapshot during recovery will look like the following code block:

val receiveRecover: Receive = {
  case SnapshotOffer(metadata, offeredSnapshot) =>
    state = offeredSnapshot
  case event => . . .
}

Here, you can see that if we get a snapshot during recovery, instead of just making an incremental change, as we do with real replayed events, we set the entire state to whatever the offered snapshot is. There may be hundreds of events that led up to that snapshot, but all we need to handle here is one message in order to wind the state forward to when we took that snapshot. This process will certainly pay dividends if we have lots of events for this entity and we continue to take periodic snapshots. One thing to note about snapshots is that you will only ever be offered the latest snapshot (per persistence id) during the recovery process. Even though I'm taking a new snapshot every 100 events, I will only ever be offered one, the latest, during the recovery phase. Another thing to note is that there is no real harm in losing a snapshot. If your snapshot storage was wiped out for some reason, the only negative side effect is that you'll be stuck processing all of the events for an entity when recovering it. When you take snapshots, you don't lose any of the event history. Snapshots are completely supplemental and only benefit the performance of the recovery phase. You don't need to take them, and you can live without them if something happens to the ones you had taken.

Serialization of events and snapshots

Within both the persistence and snapshot examples, you can see I was passing objects into the persist and saveSnapshot calls. So, how are these objects marshaled to and from a format that can actually be written to those stores? The answer is via Akka serialization. Akka persistence depends on Akka serialization to convert event and snapshot objects to and from a binary format that can be saved into a data store. If you don't make any changes to the default serialization configuration, then your objects will be converted into binary via Java serialization. Java serialization is both slow and inefficient in terms of the size of the serialized object.
It's also not flexible when the object definition has changed after the binary was produced and you then try to read it back in. It's not a good choice for our needs in our event sourced app. Luckily, Akka serialization allows you to provide your own custom serializers. If you, perhaps, wanted to use JSON as your serialized object representation, then you can pretty easily build a custom serializer to do that. There is also a built-in Google Protobuf serializer that can convert your Protobuf binding classes into their binary format. We'll explore both custom serializers and the Protobuf serializer when we get into the sections dealing with the refactors.

The AsyncWriteJournal

Another important component in Akka persistence, which I've mentioned a few times already, is the AsyncWriteJournal. This component is an append-only data store that stores the sequence of events (per persistence id) a PersistentActor generates via calls to persist. The journal also stores the highestSequenceNr per persistence id that tracks the total number of persisted events for that persistence id. The journal is a pluggable component. You have the ability to configure the default journal and also override it on a per-entity basis. The default configuration for Akka does not provide a value for the journal to use, so you must either configure this setting or add a per-entity override (more on that in a moment) in order to start using persistence. If you want to set the default journal, then it can be set in your config with the following property:

akka.persistence.journal.plugin = "akka.persistence.journal.leveldb"

The value in the preceding code must be the fully qualified path to another configuration section of the same name where the journal plugin's config lives. For this example, I set it to the already provided leveldb config section (from Akka's reference.conf).
If you want to override the journal plugin for a particular entity instance only, then you can do so by overriding the journalPluginId method on that entity actor, as follows:

class MyEntity extends PersistentActor {
  override def journalPluginId = "my-other-journal"
  . . .
}

The same rules apply here, in that my-other-journal must be the fully qualified name of a config section where the config for that plugin lives. My example config showed the use of the leveldb plugin, which writes to the local file system. If you actually want to play around with this simple plugin, then you will also need to add the following dependencies to your sbt file:

"org.iq80.leveldb" % "leveldb" % "0.7"
"org.fusesource.leveldbjni" % "leveldbjni-all" % "1.8"

If you want to use something different, then you can check the community plugins page on the Akka site to find one that suits your needs. For our app, we will use the Cassandra journal plugin. I'll show you how to set up the config for that in the section dealing with the installation of Cassandra.

The SnapshotStore

The last thing I want to cover before we start the refactoring process is the SnapshotStore. Like the AsyncWriteJournal, the SnapshotStore is a pluggable and configurable storage system, but this one stores just snapshots as opposed to the entire event stream for a persistence id. As I mentioned earlier, you don't need snapshots, and you can survive if the storage system you used for them gets wiped out for some reason. Because of this, you may consider using a separate storage plugin for them. When selecting the storage system for your events, you need something that is robust, distributed, highly available, fault tolerant, and backup capable. If you lose these events, you lose the entire data set for your application. But the same is not true for snapshots. So, take that information into consideration when selecting the storage. You may decide to use the same system for both, but you certainly don't have to.
Also, not every journal plugin can act as a snapshot plugin; so, if you decide to use the same for both, make sure that the journal plugin you select can handle snapshots. If you want to configure the snapshot store, then the config setting to do that is as follows:

akka.persistence.snapshot-store.plugin = "my-snapshot-plugin"

The setting here follows the same rules as the write journal; the value must be the fully qualified name of a config section where the plugin's config lives. If you want to override the default setting on a per-entity basis, then you can do so by overriding the snapshotPluginId method on your actor like this:

class MyEntity extends PersistentActor {
  override def snapshotPluginId = "my-other-snap-plugin"
  . . .
}

The same rules apply here as well, in that the value must be a fully qualified path to a config section where the plugin's config lives. Also, there are no out-of-the-box default settings for the snapshot store, so if you want to use snapshots, you must either set the appropriate setting in your config or provide the earlier mentioned override on a per-entity basis. For our needs, we will use the same storage mechanism—Cassandra—for both the write journal and the snapshot storage. We have a multi-node system currently, so using something that writes to the local file system, or a simple in-memory plugin, won't work for us.

Summary

In this article, you learned about Akka persistence for event sourcing and about taking snapshots of the current state of an entity to speed up the process of recovering its state.

Resources for Article:

Further resources on this subject:

Introduction to Akka [article]
Using NoSQL Databases [article]
PostgreSQL – New Features [article]


Hello TDD!

Packt
14 Sep 2016
6 min read
In this article by Gaurav Sood, the author of the book Scala Test-Driven Development, we cover the basics of Test-Driven Development. We will explore:

What is Test-Driven Development?
What is the need for Test-Driven Development?
A brief introduction to Scala and SBT

(For more resources related to this topic, see here.)

What is Test-Driven Development?

Test-Driven Development, or TDD (as it is commonly referred to), is the practice of writing your tests before writing any application code. It consists of iterative steps commonly referred to as Red-Green-Refactor-Repeat: write a failing test, write just enough code to make it pass, then refactor and repeat. TDD became more prevalent with the use of the agile software development process, though it can be used just as easily with any of agile's predecessors, such as Waterfall. Though TDD is not specifically mentioned in the agile manifesto (http://agilemanifesto.org), it has become a standard methodology used with agile. That said, you can still use agile without using TDD.

Why TDD?

The need for TDD arises from the fact that there can be constant changes to the application code. This becomes more of a problem when we are using an agile development process, as it is inherently an iterative development process. Here are some of the advantages which underpin the need for TDD:

Code quality: Tests in themselves make the programmer more confident of their code. Programmers can be sure of the syntactic and semantic correctness of their code.

Evolving architecture: Purely test-driven application code gives way to an evolving architecture. This means that we do not have to predefine our architectural boundaries and design patterns. As the application grows, so does the architecture. This results in an application that is flexible towards future changes.

Avoids over-engineering: Tests that are written before the application code define and document the boundaries. These tests also document the requirements and application code. Agile purists normally regard comments inside the code as a smell.
According to them, your tests should document your code. Since all the boundaries are predefined in the tests, it is hard to write application code that breaches these boundaries. This, however, assumes that TDD is followed religiously.
- Paradigm shift: When I started with TDD, I noticed that the first question I asked myself after looking at a problem was "How can I solve it?" This, however, is counterproductive. TDD forces the programmer to think about the testability of the solution before its implementation. Understanding how to test a problem means a better understanding of the problem and its edge cases. This in turn can result in refinement of the requirements, or the discovery of some new requirements. It has now become impossible for me not to think about the testability of a problem before its solution. Now the first question I ask myself is "How can I test it?"
- Maintainable code: I have always found it easier to work on an application that has historically been test-driven than on one that has not. Why? Simply because when I make a change to the existing code, the existing tests make sure that I do not break any existing functionality. This results in highly maintainable code on which many programmers can collaborate simultaneously.

Brief introduction to Scala and SBT

Let us look at Scala and SBT briefly. It is assumed that the reader is already familiar with Scala, so we will not go into much depth.

What is Scala

Scala is a general-purpose programming language. Its name is an acronym for Scalable Language, which reflects the vision of its creators: a language that grows with the programmer's experience of it. The fact that Scala and Java objects can be freely mixed makes the transition from Java to Scala quite easy. Scala is also a full-blown functional language. Unlike Haskell, which is a pure functional language, Scala allows interoperability with Java and support for object-oriented programming.
Scala also allows the use of both pure and impure functions. Impure functions have side effects such as mutation, I/O, and exceptions; the purist approach to Scala programming encourages the use of pure functions only. Scala is a type-safe JVM language that incorporates both object-oriented and functional programming into an extremely concise, logical, and extraordinarily powerful language.

Why Scala?

Here are some advantages of using Scala:

- A functional solution to a problem is always better: This is my personal view and open to contention. Eliminating mutation from application code allows the application to be run in parallel across hosts and cores without any deadlocks.
- Better concurrency model: Scala has an actor model that is better than Java's model of locks on threads.
- Concise code: Scala code is more concise than that of its more verbose cousin, Java.
- Type safety/static typing: Scala does type checking at compile time.
- Pattern matching: Case statements in Scala are super powerful.
- Inheritance: Mixin traits are great, and they definitely reduce code repetition.

There are other features of Scala, such as closures and monads, which will need a deeper understanding of functional language concepts to learn.

Scala Build Tool

Scala Build Tool (SBT) is a build tool that allows compiling, running, testing, packaging, and deployment of your code. SBT is mostly used with Scala projects, but it can just as easily be used for projects in other languages. Here, we will be using SBT as a build tool for managing our project and running our tests. SBT is written in Scala and can use many of the features of the Scala language. Build definitions for SBT are also written in Scala, and these definitions are both flexible and powerful. SBT also allows the use of plugins and dependency management. If you have used a build tool like Maven or Gradle in any of your previous incarnations, you will find SBT a breeze.

Why SBT?
Here are some reasons to choose SBT:

- Better dependency management
- Ivy-based dependency management
- Only-update-on-request model
- Can launch a REPL in the project context
- Continuous command execution
- Scala language support for creating tasks

Resources for learning Scala

Here are a few resources for learning Scala:

- http://www.scala-lang.org/
- https://www.coursera.org/course/progfun
- https://www.manning.com/books/functional-programming-in-scala
- http://www.tutorialspoint.com/scala/index.htm

Resources for SBT

Here are a few resources for learning SBT:

- http://www.scala-sbt.org/
- https://twitter.github.io/scala_school/sbt.html

Summary

In this article we learned what TDD is and why to use it. We also learned about Scala and SBT.

Resources for Article:

Further resources on this subject:
- Overview of TDD [article]
- Understanding TDD [article]
- Android Application Testing: TDD and the Temperature Converter [article]
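Pulling the pieces together, a minimal SBT build definition for running tests might look like the following. This is a sketch rather than the book's actual project: the project name and version numbers are illustrative assumptions, and ScalaTest is just one common choice of test framework.

```scala
// build.sbt -- hypothetical minimal build definition for a TDD project
name := "tdd-examples"

scalaVersion := "2.11.8"

// Test framework dependency; the version is illustrative
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
```

With something like this in place, `sbt test` compiles and runs every spec under `src/test/scala`, and `sbt ~test` re-runs the tests on every source change, which is the continuous command execution mentioned above.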
It's All About Data

Packt
14 Sep 2016
12 min read
In this article by Samuli Thomasson, the author of the book Haskell High Performance Programming, we will learn how to choose and design optimal data structures in applications. You will be able to drop the level of abstraction in slow parts of code, all the way down to mutable data structures if necessary.

(For more resources related to this topic, see here.)

Annotating strictness and unpacking datatype fields

We used the BangPatterns extension to make function arguments strict:

```haskell
{-# LANGUAGE BangPatterns #-}

f !s (x:xs) = f (s + 1) xs
f !s _      = s
```

Using bangs for annotating strictness in fact predates the BangPatterns extension (and the older compiler flag -fbang-patterns in GHC 6.x). With just plain Haskell 98, we are allowed to use bangs to make datatype fields strict:

```haskell
> data T = T !Int
```

A bang in front of a field ensures that whenever the outer constructor (T) is in WHNF, the inner field is in WHNF as well. We can check this:

```haskell
> T undefined `seq` ()
*** Exception: Prelude.undefined
```

There are no restrictions on which fields can be strict, be they recursive or polymorphic fields, although it rarely makes sense to make recursive fields strict. Consider the fully strict linked list:

```haskell
data List a = List !a !(List a)
            | ListEnd
```

With this much strictness, you cannot represent parts of infinite lists without always requiring infinite space. Moreover, before accessing the head of a finite strict list you must evaluate the list all the way to the last element. Strict lists don't have the streaming property of lazy lists.

By default, all data constructor fields are pointers to other data constructors or primitives, regardless of their strictness. This applies to basic data types such as Int, Double, Char, and so on, which are not primitive in Haskell.
They are data constructors over their primitive counterparts Int#, Double#, and Char#:

```haskell
> :info Int
data Int = GHC.Types.I# GHC.Prim.Int#
```

There is a performance overhead, the size of a pointer dereference, between types such as Int and Int#, but an Int can represent lazy values (called thunks), whereas primitives cannot. Without thunks, we couldn't have lazy evaluation. Luckily, GHC is intelligent enough to unroll wrapper types as primitives in many situations, completely eliminating indirect references.

The hash suffix is specific to GHC and always denotes a primitive type. The GHC modules do expose the primitive interface. Programming with primitives, you can further micro-optimize code and get C-like performance. However, several limitations and drawbacks apply.

Using anonymous tuples

Tuples may seem harmless at first; they just lump a bunch of values together. But note that the fields in a tuple aren't strict, so a two-tuple corresponds to the slowest PairP data type from our previous benchmark. If you need a strict Tuple type, you need to define one yourself. This is one more reason to prefer custom types over nameless tuples in many situations. These two structurally similar tuple types have widely different performance semantics:

```haskell
data Tuple = Tuple {-# UNPACK #-} !Int {-# UNPACK #-} !Int

data Tuple2 = Tuple2 {-# UNPACK #-} !(Int, Int)
```

If you really want unboxed anonymous tuples, you can enable the UnboxedTuples extension and write things with types like (# Int#, Char# #). But note that a number of restrictions apply to unboxed tuples, as they do to all primitives. The most important restriction is that unboxed types may not occur where polymorphic types or values are expected, because polymorphic values are always considered as pointers.

Representing bit arrays

One way to define a bit array in Haskell that still retains the convenience of Bool is:

```haskell
import Data.Array.Unboxed

type BitArray = UArray Int Bool
```

This representation packs 8 bits per byte, so it's space efficient.
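As a small sketch of how this alias might be used (my own illustration, not from the book), we can build a BitArray from a list of Bools and index into it with the usual array operators:

```haskell
import Data.Array.Unboxed

type BitArray = UArray Int Bool

-- An 8-bit array with bits 0 and 3 set, everything else clear.
bits :: BitArray
bits = listArray (0, 7) [i == 0 || i == 3 | i <- [0 .. 7]]

main :: IO ()
main = print (bits ! 3)  -- prints True
```

Because UArray is unboxed, the Bools here are stored packed rather than as pointers to heap objects.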
See the following section on arrays in general to learn about time efficiency. For now, we only note that BitArray is an immutable data structure, like BitStruct, and that copying small BitStructs is cheaper than copying BitArrays due to overheads in UArray.

Consider a program that processes a list of integers and tells whether it contains even or odd counts of numbers divisible by 2, 3, and 5. We can implement this with simple recursion and a three-bit accumulator. Here are three alternative representations for the accumulator:

```haskell
type BitTuple = (Bool, Bool, Bool)

data BitStruct = BitStruct !Bool !Bool !Bool deriving Show

type BitArray = UArray Int Bool
```

And the program itself is defined along these lines:

```haskell
go :: acc -> [Int] -> acc
go acc [] = acc
go (two three five) (x:xs) =
    go ((test 2 x `xor` two) (test 3 x `xor` three) (test 5 x `xor` five)) xs

test n x = x `mod` n == 0
```

I've omitted the details here. They can be found in the bitstore.hs file.

The fastest variant is BitStruct, then comes BitTuple (30% slower), and BitArray is the slowest (130% slower than BitStruct). Although BitArray is the slowest (due to making a copy of the array on every iteration), it would be easy to scale the array in size or make it dynamic. Note also that this benchmark is really on the extreme side; normally programs do a bunch of other stuff besides updating an array in a tight loop.

If you need fast array updates, you can resort to the mutable arrays discussed later on. It might also be tempting to use Data.Vector.Unboxed.Vector Bool from the vector package, due to its nice interface. But beware that that representation uses one byte for every bit, wasting 7 bits of every byte.

Mutable references are slow

Data.IORef and Data.STRef are the smallest bits of mutable state: references to mutable variables, one for IO and the other for ST. There is also a Data.STRef.Lazy module, which provides a wrapper over the strict STRef for lazy ST.
However, because IORef and STRef are references, they imply a level of indirection. GHC intentionally does not optimize it away, as that would cause problems in concurrent settings. For this reason, IORef and STRef shouldn't be used like variables in C, for example; performance will surely be very bad.

Let's verify the performance hit by considering the following ST-based sum-of-range implementation:

```haskell
-- file: sum_mutable.hs
import Control.Monad.ST
import Data.STRef

count_st :: Int -> Int
count_st n = runST $ do
    ref <- newSTRef 0
    let go 0 = readSTRef ref
        go i = modifySTRef' ref (+ i) >> go (i - 1)
    go n
```

And compare it to this pure recursive implementation:

```haskell
count_pure :: Int -> Int
count_pure n = go n 0
  where
    go 0 s = s
    go i s = go (i - 1) $! (s + i)
```

The ST implementation is many times slower when at least -O is enabled. Without optimizations, the two functions are more or less equivalent in performance; there is a similar amount of indirection from not unboxing arguments in the latter version. This is one example of the wonders that can be done to optimize referentially transparent code.

Bubble sort with vectors

Bubble sort is not an efficient sort algorithm, but because it's an in-place algorithm and simple, we will implement it as a demonstration of mutable vectors:

```haskell
-- file: bubblesort.hs
import Control.Monad.ST
import Data.Vector as V
import Data.Vector.Mutable as MV
import System.Random (randomIO) -- for testing
```

The (naive) bubble sort compares the values at all adjacent indices in order, and swaps the values if necessary. After reaching the last element, it starts from the beginning or, if no swaps were made, the list is sorted and the algorithm is done:

```haskell
bubblesortM :: (Ord a, PrimMonad m)
            => MVector (PrimState m) a -> m ()
bubblesortM v = loop
  where
    indices = V.fromList [1 .. MV.length v - 1]
    loop = do
        swapped <- V.foldM' f False indices     -- (1)
        if swapped then loop else return ()     -- (2)
    f swapped i = do                            -- (3)
        a <- MV.read v (i-1)
        b <- MV.read v i
        if a > b
            then MV.swap v (i-1) i >> return True
            else return swapped
```

At (1), we fold monadically over all but the last index, keeping state about whether or not we have performed a swap in this iteration. If we had, at (2) we rerun the fold or, if not, we can return. At (3) we compare an index and possibly swap values. We can write a pure function that wraps the stateful algorithm:

```haskell
bubblesort :: Ord a => Vector a -> Vector a
bubblesort v = runST $ do
    mv <- V.thaw v
    bubblesortM mv
    V.freeze mv
```

V.thaw and V.freeze (both O(n)) can be used to go back and forth between mutable and immutable vectors.

Now, there are multiple code optimization opportunities in our implementation of bubble sort. But before tackling those, let's see how well our straightforward implementation fares using the following main:

```haskell
main = do
    v <- V.generateM 10000 $ \_ -> randomIO :: IO Double
    let v_sorted = bubblesort v
        median   = v_sorted ! 5000
    print median
```

We should remember to compile with -O2. On my machine, this program takes about 1.55 s, and the Runtime System reports 99.9% productivity, 18.7 megabytes of allocated heap, and 570 kilobytes copied during GC.

So now, with a baseline, let's see if we can squeeze out more performance from vectors. This is a non-exhaustive list:

- Use unboxed vectors instead. This restricts the types of elements we can store, but it saves us a level of indirection. Down to 960 ms and approximately halved GC traffic.
- Large lists are inefficient, and they don't compose with vector stream fusion. We should change indices so that it uses V.enumFromTo instead (alternatively, turn on the OverloadedLists extension and drop V.fromList). Down to 360 ms and 94% less GC traffic.
- The conversion functions V.thaw and V.freeze are O(n), that is, they modify copies. Using the in-place V.unsafeThaw and V.unsafeFreeze instead is sometimes useful.
V.unsafeFreeze in the bubblesort wrapper is completely safe, but V.unsafeThaw is not. In our example, however, with -O2 the program is optimized into a single loop and all those conversions get eliminated.

Vector operations (V.read, V.swap) in bubblesortM are guaranteed to never be out of bounds, so it's perfectly safe to replace these with unsafe variants (V.unsafeRead, V.unsafeSwap) that don't check bounds. This gives a speed-up of about 25 milliseconds, or 5%.

To summarize: by applying good practices and safe usage of unsafe functions, our bubble sort just got 80% faster. These optimizations are applied in the bubblesort-optimized.hs file (omitted here). We noticed that almost all GC traffic came from a linked list, which was constructed and immediately consumed. Lists are bad for performance in that they don't fuse like vectors. To ensure good vector performance, ensure that the fusion framework can work effectively: anything that can be done with a vector should be done.

As a final note, when working with vectors (and other libraries) it's a good idea to keep the Haddock documentation handy. There are several big and small performance choices to be made. Often the difference is that between O(n) and O(1).

Speedup via continuation-passing style

Implementing monads in continuation-passing style (CPS) can have very good results. Unfortunately, no widely-used or supported library I'm aware of provides drop-in replacements for the ubiquitous Maybe, List, Reader, Writer, and State monads. It's not that hard to implement the standard monads in CPS from scratch.
For example, the State monad can be implemented using the Cont monad from mtl as follows:

```haskell
-- file: cont-state-writer.hs
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.State.Strict
import Control.Monad.Cont

newtype StateCPS s r a = StateCPS (Cont (s -> r) a)
    deriving (Functor, Applicative, Monad, MonadCont)

instance MonadState s (StateCPS s r) where
    get = StateCPS $ cont $
        \next curState -> next curState curState
    put newState = StateCPS $ cont $
        \next curState -> next () newState

runStateCPS :: StateCPS s s () -> s -> s
runStateCPS (StateCPS m) = runCont m (\_ -> id)
```

In case you're not familiar with the continuation-passing style and the Cont monad, the details might not make much sense: instead of just returning results from a function, a function in CPS applies its results to a continuation. So, in short, to "get" the state in continuation-passing style, we pass the current state to the "next" continuation (first argument) and don't change the state (second argument). To "put", we call the continuation with unit (no return value) and change the state to the new state (second argument to next).

StateCPS is used just like the State monad:

```haskell
action :: MonadState Int m => m ()
action = replicateM_ 1000000 $ do
    i <- get
    put $! i + 1

main = do
    print $ (runStateCPS action 0 :: Int)
    print $ (snd $ runState action 0 :: Int)
```

That action operation is, in the CPS version of the state monad, about 5% faster and performs 30% less heap allocation than the state monad from mtl. This program is limited pretty much only by the speed of monadic composition, so these numbers are at least very close to the maximum speedup we can get from CPSing the state monad. Speedups of the writer monad are probably near these results. Other standard monads can be implemented similarly to StateCPS.
The definitions can also be generalized to monad transformers over an arbitrary monad (a la ContT). For extra speed, you might wish to combine many monads in a single CPS monad, similarly to what RWST does.

Summary

We witnessed the performance of the bytestring, text, and vector libraries, all of which get their speed from fusion optimizations, in contrast to linked lists, which have a huge overhead despite also being subject to fusion to some degree. However, linked lists give rise to simple difference lists and zippers. The builder patterns for lists, bytestring, and text were introduced. We discovered that the array package is low-level and clumsy compared to the superior vector package, unless you must support Haskell 98. We also saw how to implement bubble sort using vectors and how to speed it up via continuation-passing style.

Resources for Article:

Further resources on this subject:
- Data Tables and DataTables Plugin in jQuery 1.3 with PHP [article]
- Data Extracting, Transforming, and Loading [article]
- Linking Data to Shapes [article]
Decoding Why "Good PHP Developer"Isn't an Oxymoron

Packt
14 Sep 2016
20 min read
In this article by Junade Ali, author of the book Mastering PHP Design Patterns, we will be revisiting object-oriented programming.

Back in 2010, MailChimp published a post on their blog entitled Ewww, You Use PHP? In this blog post they described the horror they encountered when they explained their choice of PHP to developers who consider the phrase "good PHP programmer" an oxymoron. In their rebuttal they argued that their PHP wasn't your grandfather's PHP and that they use a sophisticated framework. I tend to judge the quality of PHP on the basis of not only how it functions, but how secure it is and how it is architected. This book focuses on ideas of how you should architect your code. Good software design allows developers to extend the code beyond its original purpose, in a bug-free and elegant fashion.

(For more resources related to this topic, see here.)

As Martin Fowler put it:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

This isn't just limited to code style, but to how developers architect and structure their code. I've encountered many developers with their noses constantly stuck in documentation, copying and pasting bits of code until it works; hacking snippets together until it works. Moreover, I far too often see the software development process rapidly deteriorate as developers ever more tightly couple their classes with functions of ever-increasing length.

Software engineers mustn't just code software; they must know how to design it. Indeed, often a good software engineer, when interviewing other software engineers, will ask questions surrounding the design of the code itself. It is trivial to get a piece of code that will execute, and it is also benign to question a developer as to whether strtolower or str2lower is the correct name of a function (for the record, it's strtolower).
Knowing the difference between a class and an object doesn't make you a competent developer; a better interview question would, for example, be how one could apply subtype polymorphism to a real software development challenge. Failure to assess software design skills dumbs down an interview and results in there being no way to differentiate between those who are good at it and those who aren't. These advanced topics will be discussed throughout this book; by learning these tactics you will better understand what the right questions to ask are when discussing software architecture.

Moxie Marlinspike once tweeted:

As a software developer, I envy writers, musicians, and filmmakers. Unlike software, when they create something it is really done, forever.

When developing software we mustn't forget that we are authors, not just of instructions for a machine, but also of something that we later expect others to extend upon. Therefore, our code mustn't just be targeted at machines, but at humans also. Code isn't just poetry for a machine; it should be poetry for humans as well.

This is, of course, easier said than done. In PHP this may prove especially difficult given the freedom PHP offers developers in how they may architect and structure their code. By the very nature of freedom, it may be both used and abused, and so it is with the freedom offered in PHP. It is therefore increasingly important that developers understand proper software design practices to ensure their code maintains long-term maintainability. Indeed, another key skill lies in refactoring code: improving the design of existing code to make it easier to extend in the longer term.

Technical debt, the eventual consequence of poor system design, is something that I've found comes with the career of a PHP developer.
This has been true for me whether dealing with systems that provide advanced functionality or with simple websites. It usually arises because a developer elects to implement a bad design for a variety of reasons, whether when adding functionality to an existing codebase or when taking poor design decisions during the initial construction of the software. Refactoring can help us address these issues.

SensioLabs (the creators of the Symfony framework) have a tool called Insight that allows developers to calculate the technical debt in their own code. In 2011 they did an evaluation of technical debt in various projects using this tool; rather unsurprisingly, they found that WordPress 4.1 topped the chart of all the platforms they evaluated, with them claiming it would take 20.1 years to resolve the technical debt the project contains.

Those familiar with the WordPress core may not be surprised by this, but the issue is of course not limited to WordPress. In my career of working with PHP, from security-critical cryptography systems to mission-critical embedded systems, dealing with technical debt comes with the job. Dealing with technical debt is not something to be ashamed of for a PHP developer; indeed, some may consider it courageous. Dealing with technical debt is no easy task, especially in the face of an ever more demanding user base, client, or project manager constantly demanding more functionality without being familiar with the technical debt the project carries.

I recently emailed the PHP Internals group asking whether they should consider deprecating the error suppression operator @. When any PHP function is prepended by an @ symbol, the function will suppress any error returned by it. This can be brutal, especially where that function renders a fatal error that stops the execution of the script, making debugging a tough task.
If the error is suppressed, the script may fail to execute without providing developers a reason why. No one objected to the fact that there are better ways of handling errors (try/catch, proper validation) than abusing the error suppression operator, or that deprecation should be an eventual aim of PHP. However, some functions return needless warnings even though they already have a success/failure return value. This means that, due to technical debt in the PHP core itself, the operator cannot be deprecated until a lot of other prerequisite work is done. In the meantime, it is down to developers to decide the best methodologies for handling errors. Until the inherent problem of unnecessary error reporting is addressed, this operator cannot be deprecated; therefore, developers must be educated as to the proper methodologies for error handling, and not constantly resort to using an @ symbol.

Fundamentally, technical debt slows down the development of a project and often leads to broken code being deployed as developers try to work on a fragile project. When starting a new project, never be afraid to discuss architecture: architecture meetings are vital to developer collaboration. As one scrum master I've worked with said, in the face of criticism that "meetings are a great alternative to work": "meetings are work… how much work would you be doing without meetings?"

Coding style - the PSR standards

When it comes to coding style, I would like to introduce you to the PSR standards created by the PHP Framework Interop Group. Namely, the two standards that apply to coding style are PSR-1 (Basic Coding Style) and PSR-2 (Coding Style Guide). In addition to this, there are PSR standards that cover other areas; for example, as of today, the PSR-4 standard is the most up-to-date autoloading standard published by the group.
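To make the point about the @ operator concrete, here is a brief sketch (my own illustration, not from the book; the file name is hypothetical) contrasting suppression with explicit error handling:

```php
<?php
// Suppression: if the file is missing, we silently get false,
// and any later code using $config breaks with no clue why.
$config = @file_get_contents('config.json');

// Explicit handling: validate first, then fail loudly with context.
if (!is_readable('config.json')) {
    throw new RuntimeException('Cannot read config.json');
}
$config = file_get_contents('config.json');
```

The second form surfaces the failure at its source, which is exactly what the suppression operator hides.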
You can find out more about the standards at http://www.php-fig.org/.

I strongly believe in using coding style to enforce consistency throughout a codebase; it makes a real difference to code readability throughout a project. It is especially important when you are starting a project (chances are you may be reading this book to find out how to do that right), as the coding style you choose determines the style the developers who follow you on the project will adopt. Using a global standard such as PSR-1 or PSR-2 means that developers can easily switch between projects without having to reconfigure their code style in their IDE. Good code style can also make formatting errors easier to spot. Needless to say, coding styles will develop as time progresses; to date, I elect to work with the PSR standards.

I am a strong believer in the phrase:

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

It isn't known who wrote this phrase originally, but it's widely thought that it could have been John Woods or potentially Martin Golding. I would strongly recommend familiarizing yourself with these standards before proceeding in this book.

Revising object-oriented programming

Object-oriented programming is more than just classes and objects; it's a whole programming paradigm based around objects (data structures) that contain data fields and methods. It is essential to understand this: using classes to organize a bunch of unrelated methods together is not object orientation. Assuming you're aware of classes (and how to instantiate them), allow me to remind you of a few different bits and pieces.

Polymorphism

Polymorphism is a fairly long word for a fairly simple concept. Essentially, polymorphism means the same interface is used with different underlying code.
So multiple classes could have a draw function, each accepting the same arguments, but at an underlying level the code is implemented differently.

In this article, I would also like to talk about subtype polymorphism in particular (also known as subtyping or inclusion polymorphism). Let's say we have animals as our supertype; our subtypes may well be cats, dogs, and sheep.

In PHP, interfaces allow you to define a set of functionality that a class implementing the interface must contain. As of PHP 7, you can also use scalar type hints to define the return types we expect. So, for example, suppose we defined the following interface:

```php
interface Animal
{
    public function eat(string $food) : bool;

    public function talk(bool $shout) : string;
}
```

We could then implement this interface in our own class, as follows:

```php
class Cat implements Animal
{
}
```

If we were to run this code without defining the methods, we would get an error message as follows:

Class Cat contains 2 abstract methods and must therefore be declared abstract or implement the remaining methods (Animal::eat, Animal::talk)

Essentially, we are required to implement the methods we defined in our interface, so now let's go ahead and create a class that implements these methods:

```php
class Cat implements Animal
{
    public function eat(string $food): bool
    {
        if ($food === "tuna") {
            return true;
        } else {
            return false;
        }
    }

    public function talk(bool $shout): string
    {
        if ($shout === true) {
            return "MEOW!";
        } else {
            return "Meow.";
        }
    }
}
```

Now that we've implemented these methods, we can just instantiate the class we are after and use the functions contained in it:

```php
$felix = new Cat();
echo $felix->talk(false);
```

So where does polymorphism come into this?
Suppose we had another class for a dog:

```php
class Dog implements Animal
{
    public function eat(string $food): bool
    {
        if (($food === "dog food") || ($food === "meat")) {
            return true;
        } else {
            return false;
        }
    }

    public function talk(bool $shout): string
    {
        if ($shout === true) {
            return "WOOF!";
        } else {
            return "Woof woof.";
        }
    }
}
```

Now let's suppose we have multiple different types of animals in a pets array:

```php
$pets = array(
    'felix'     => new Cat(),
    'oscar'     => new Dog(),
    'snowflake' => new Cat()
);
```

We can now go ahead and loop through all these pets individually in order to run the talk function. We don't care about the type of pet, because every class we get has the talk method implemented, by virtue of implementing the Animal interface.

So let's suppose we wanted to have all our animals run the talk method; we could just use the following code:

```php
foreach ($pets as $pet) {
    echo $pet->talk(false);
}
```

No need for unnecessary switch/case blocks to wrap around our classes; we just use software design to make things easier for us in the long term.

Abstract classes work in a similar way, except that abstract classes can contain functionality where interfaces cannot. It is important to note that any class that defines one or more abstract methods must also be declared abstract. You cannot have a normal class defining abstract methods, but you can have normal methods in abstract classes. Let's start off by refactoring our interface to be an abstract class:

```php
abstract class Animal
{
    abstract public function eat(string $food) : bool;

    abstract public function talk(bool $shout) : string;

    public function walk(int $speed): bool
    {
        if ($speed > 0) {
            return true;
        } else {
            return false;
        }
    }
}
```

You might have noticed that I have also added a walk method as an ordinary, non-abstract method; this is a standard method that can be used or extended by any class that inherits from the parent abstract class. It already has an implementation.
Note that it is impossible to instantiate an abstract class (much as it's not possible to instantiate an interface). Instead, we must extend it. So, in our Cat class, let's substitute:

```php
class Cat implements Animal
```

with the following code:

```php
class Cat extends Animal
```

That's all we need to refactor in order to get the class to extend the Animal abstract class. We must implement the abstract functions in the classes as we outlined for the interfaces, plus we can use the ordinary functions without needing to implement them:

```php
$whiskers = new Cat();
$whiskers->walk(1);
```

As of PHP 5.4, it has also become possible to instantiate a class and access a property of it in a single statement. PHP.net advertised it as:

Class member access on instantiation has been added, e.g. (new Foo)->bar().

You can also do it with individual properties, for example, (new Cat)->legs. In our example, we can use it as follows:

```php
(new IcyAprilChapterOneCat())->walk(1);
```

Just to recap a few other points about how PHP implements OOP: the final keyword before a class or function declaration means that you cannot override such classes or functions after they've been defined. So, if we were to try extending a class we have declared as final:

```php
final class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
}
```

This results in the following output:

```
Fatal error: Class Cat may not inherit from final class (Animal)
```

Similarly, if we were to do the same at the function level:

```php
class Animal
{
    final public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    public function walk()
    {
        return "walking with tail wagging...";
    }
}
```

This results in the following output:

```
Fatal error: Cannot override final method Animal::walk()
```

Traits (multiple inheritance)

Traits were introduced into PHP as a mechanism for introducing Horizontal Reuse.
PHP conventionally acts as a single-inheritance language: a class can only extend one parent class. Traditional multiple inheritance is a controversial feature that is often looked down upon by software engineers. Let me give you a first-hand example of using Traits; let's define an Animal class which we want to extend into another class:

```php
class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    public function walk()
    {
        return "walking with tail wagging...";
    }
}
```

So now let's suppose we have functions to name our pets, but we don't want them to apply to all the classes that extend the Animal class; we want them to apply to certain classes, irrespective of whether they inherit the properties of the Animal class or not. So we've defined our functions like so:

```php
function setFirstName(string $name): bool
{
    $this->firstName = $name;
    return true;
}

function setLastName(string $name): bool
{
    $this->lastName = $name;
    return true;
}
```

The problem now is that there is no place we can put them without using Horizontal Reuse, apart from copying and pasting different bits of code or resorting to conditional inheritance. This is where Traits come to the rescue; let's start off by wrapping these methods in a Trait called Name:

```php
trait Name
{
    function setFirstName(string $name): bool
    {
        $this->firstName = $name;
        return true;
    }

    function setLastName(string $name): bool
    {
        $this->lastName = $name;
        return true;
    }
}
```

So now that we've defined our Trait, we can just tell PHP to use it in our Cat class:

```php
class Cat extends Animal
{
    use Name;

    public function walk()
    {
        return "walking with tail wagging...";
    }
}
```

Notice the use Name; statement? That's where the magic happens.
Now you can call the functions in that Trait without any problems:

```php
$whiskers = new Cat();
$whiskers->setFirstName('Paul');
echo $whiskers->firstName;
```

All put together, the new code block looks as follows:

```php
trait Name
{
    function setFirstName(string $name): bool
    {
        $this->firstName = $name;
        return true;
    }

    function setLastName(string $name): bool
    {
        $this->lastName = $name;
        return true;
    }
}

class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    use Name;

    public function walk()
    {
        return "walking with tail wagging...";
    }
}

$whiskers = new Cat();
$whiskers->setFirstName('Paul');
echo $whiskers->firstName;
```

Scalar type hints

Let me take this opportunity to introduce you to a PHP 7 concept known as scalar type hinting; it allows you to define the types of variables a function accepts and returns (yes, I know this isn't strictly under the scope of OOP; deal with it). Let's define a function as follows:

```php
function addNumbers(int $a, int $b): int
{
    return $a + $b;
}
```

Let's take a look at this function. Firstly, you will notice that before each of the arguments we define the type of variable we want to receive; in this case, int or integer. Next up, you'll notice a bit of code after the argument list, : int, which defines our return type, so our function can only return an integer. If you don't provide the right type of variable as a function argument, or don't return the right type of variable from the function, you will get a TypeError exception. PHP will also throw a TypeError exception in the event that strict mode is enabled and you provide the incorrect number of arguments. It is also possible in PHP to declare strict_types; let me explain why you might want to do this. Without strict_types, PHP will attempt to automatically convert a variable to the defined type in very limited circumstances.
For example, if you pass a string containing solely numbers, it will be converted to an integer; a string that's non-numeric, however, will result in a TypeError exception. Once you enable strict_types, this all changes: you can no longer have this automatic casting behavior. Taking our previous example, without strict_types, you could do the following:

```php
echo addNumbers(5, "5.0");
```

Trying it again after enabling strict_types, you will find that PHP throws a TypeError exception. This configuration only applies on an individual file basis; putting it before you include other files will not result in this configuration being inherited by those files. There are multiple benefits to PHP choosing to go down this route; they are listed very clearly in version 0.5.3 of the RFC that implemented scalar type hints, called PHP RFC: Scalar Type Declarations. You can read about it by going to http://www.wiki.php.net (the wiki, not the main PHP website) and searching for scalar_type_hints_v5. In order to enable it, make sure you put this as the very first statement in your PHP script:

```php
declare(strict_types=1);
```

This will not work unless you define strict_types as the very first statement in a PHP script; no other usages of this definition are permitted. Indeed, if you try to define it later on, PHP will throw a fatal error. Of course, in the interests of the rage-induced PHP core fanatic reading this book in its coffee-stained form, I should mention that there are other valid types that can be used in type hinting. For example, PHP 5.1.0 introduced this with arrays, and PHP 5.0.0 introduced the ability for a developer to do this with their own classes.
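To make the strict-mode behavior concrete, here is a small self-contained sketch (my own example, not from the original text) that demonstrates the TypeError being thrown, and caught, once strict_types is enabled:

```php
<?php
declare(strict_types=1); // must be the very first statement in the file

function addNumbers(int $a, int $b): int
{
    return $a + $b;
}

// Both arguments are genuine integers, so this call is fine.
echo addNumbers(5, 5), PHP_EOL;

try {
    // Under strict_types, the numeric string "5.0" is no longer
    // coerced to an integer, so a TypeError is thrown instead.
    echo addNumbers(5, "5.0"), PHP_EOL;
} catch (TypeError $e) {
    echo "Caught: ", $e->getMessage(), PHP_EOL;
}
```

Running the same two calls without the declare line would simply print 10 twice, because coercive mode quietly converts "5.0" to the integer 5.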
Let me give you a quick example of how this would work in practice. Suppose we had an Address class:

```php
class Address
{
    public $firstLine;
    public $postcode;
    public $country;

    public function __construct(string $firstLine, string $postcode, string $country)
    {
        $this->firstLine = $firstLine;
        $this->postcode = $postcode;
        $this->country = $country;
    }
}
```

We can then type hint the Address class where we inject it into a Customer class:

```php
class Customer
{
    public $name;
    public $address;

    public function __construct($name, Address $address)
    {
        $this->name = $name;
        $this->address = $address;
    }
}
```

And just to show how it all comes together:

```php
$address = new Address('10 Downing Street', 'SW1A2AA', 'UK');
$customer = new Customer('Davey Cameron', $address);
var_dump($customer);
```

Limiting debug access to private/protected properties

If you define a class which contains private or protected variables, you will notice an odd behavior if you were to var_dump an object of that class. You will notice that when you wrap the object in a var_dump, it reveals all variables, be they protected, private, or public. PHP treats var_dump as an internal debugging function, meaning all data becomes visible. Fortunately, there is a workaround for this. PHP 5.6 introduced the __debugInfo magic method. Functions in classes preceded by a double underscore represent magic methods and have special functionality associated with them. Every time you try to var_dump an object that has the __debugInfo magic method set, the var_dump will be overridden with the result of that function call instead.
Let me show you how this works in practice; let's start by defining a class:

```php
class Bear
{
    private $hasPaws = true;
}
```

Let's instantiate this class:

```php
$richard = new Bear();
```

Now, if we were to try and access the private variable that is hasPaws, we would get a fatal error. So this call:

```php
echo $richard->hasPaws;
```

would result in the following fatal error being thrown:

```
Fatal error: Cannot access private property Bear::$hasPaws
```

That is the expected output; we don't want a private property visible outside its object. That being said, if we wrap the object with a var_dump as follows:

```php
var_dump($richard);
```

we would then get the following output:

```
object(Bear)#1 (1) {
  ["hasPaws":"Bear":private]=>
  bool(true)
}
```

As you can see, our private property is marked as private, but nevertheless it is visible. So how would we go about preventing this? Let's redefine our class as follows:

```php
class Bear
{
    private $hasPaws = true;

    public function __debugInfo()
    {
        return call_user_func('get_object_vars', $this);
    }
}
```

Now, after we instantiate our class and var_dump the resulting object, we get the following output:

```
object(Bear)#1 (0) {
}
```

The script all put together looks like this now; you will notice I've added an extra public property called growls, which I have set to true:

```php
<?php

class Bear
{
    private $hasPaws = true;
    public $growls = true;

    public function __debugInfo()
    {
        return call_user_func('get_object_vars', $this);
    }
}

$richard = new Bear();
var_dump($richard);
```

If we were to run this script (with both a public and a private property to play with), we would get the following output:

```
object(Bear)#1 (1) {
  ["growls"]=>
  bool(true)
}
```

As you can see, only the public property is visible. So what is the moral of the story from this little experiment? Firstly, that var_dump exposes private and protected properties inside objects, and secondly, that this behavior can be overridden.

Summary

In this article, we revised some PHP principles, including OOP principles.
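As a variation on the same magic method (the class and property names here are my own, illustrative ones, not from the original text), __debugInfo can also return a hand-filtered array, hiding only selected properties rather than everything non-public:

```php
<?php

class ApiClient
{
    public $endpoint = "https://api.example.com";
    private $apiKey = "top-secret"; // must never show up in debug output

    public function __debugInfo()
    {
        // Inside the class, get_object_vars() sees private properties
        // too, so we explicitly drop the sensitive one before returning.
        $vars = get_object_vars($this);
        unset($vars['apiKey']);
        return $vars;
    }
}

var_dump(new ApiClient()); // shows only "endpoint"; apiKey stays hidden
```

This keeps harmless state visible for debugging while redacting secrets, which is often more useful than suppressing the entire dump.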
We also revised some PHP syntax basics. Resources for Article: Further resources on this subject: Running Simpletest and PHPUnit [article] Data Tables and DataTables Plugin in jQuery 1.3 with PHP [article] Understanding PHP basics [article]
Packt
13 Sep 2016
23 min read

Getting Started with a Cloud-Only Scenario

In this article by Jochen Nickel, from the book Mastering Identity and Access Management with Microsoft Azure, we will first start with a business view to identify the important business needs and challenges of a cloud-only environment and scenario. Throughout this article, we will also discuss the main features of and licensing information for such an approach. Finally, we will round up with the challenges surrounding security and legal requirements. The topics we will cover in this article are as follows:

- Identifying the business needs and challenges
- An overview of feature and licensing decisions
- Defining the benefits and costs
- Principles of security and legal requirements

(For more resources related to this topic, see here.)

Identifying business needs and challenges

Oh! Don't worry, we don't have the intention of boring you with a lesson on typical IAM stories – we're sure you've been in touch with a lot of information in this area. However, you do need to have an independent view of the actual business needs and challenges in the cloud area, so that you can get the most out of your own situation.

Common Identity and Access Management needs

Identity and Access Management (IAM) is the discipline that plays an important role in the current cloud era of your organization. It's also of value to small and medium-sized companies, so that they can enable the right individuals to access the right resources from any location and device, at the right time and for the right reasons, to empower and enable the desired business outcomes. IAM addresses the mission-critical need of ensuring appropriate and secure access to resources inside and across company borders, such as cloud or partner applications. The old security strategy of only securing your environment with an intelligent firewall concept and access control lists will take on a more and more subordinate role.
There is a recommended requirement of reviewing and reworking this strategy in order to meet higher compliance, operational, and business requirements. To adopt a mature security and risk management practice, it's very important that your IAM strategy is business-aligned and that the required business skills and stakeholders are committed to this topic. Without clearly defined business processes, you can't implement a successful IAM functionality in the planned timeframe. Companies that follow this strategy can become more agile in supporting new business initiatives and reduce their costs in IAM. The following three groups show the typical indicators of missing IAM capabilities on premises and in cloud services:

Your employees/partners:

- Same password usage across multiple applications without periodic changes (also in social media accounts)
- Multiple identities and logins
- Passwords written down in sticky notes, Excel, and so on
- Application and data access allowed after termination
- Forgotten usernames and passwords
- Poor usability of application access inside and outside the company (multiple logins, VPN connection required, incompatible devices, and so on)
Your IT department:

- High workload on password reset support
- Missing automated identity lifecycles with integrity (data duplication and data quality problems)
- No insights into application usage and security
- Missing reporting tools for compliance management
- Complex integration of central access to Software as a Service (SaaS), partner, and on-premises applications (missing central access/authentication/authorization platform)
- No policy enforcement in cloud services usage
- Collection of access rights (missing processes)

Your developers:

- Limited knowledge of all the different security standards, protocols, and APIs
- Constantly changing requirements and rapid developments
- Complex changes of the identity provider

Implications of Shadow IT

On top of that, the IT department will often hear the following question: When can we expect the new application for our business unit? Sorry, but the answer will always take too long. Why should I wait? All I need is a valid credit card that allows me to buy my required business application. But suddenly another popular phenomenon pops up: shadow IT! Most of the time, this introduces another problem: uncontrolled information leakage. The following figure shows the flow of typical information; what you don't know can hurt!

The previous figure should not give you the impression that cloud services are inherently dangerous, but rather that before using them you should first be aware of whether, and in which manner, they are being used. Simply migrating to or ordering a new service in the cloud won't solve common IAM needs. This figure should help you to imagine that, if not planned, the introduction of a new or migrated service brings with it a new identity and credential set for the users, and therefore multiple credentials and logins to remember! You should also be sure which information can be stored and processed in a regulatory area other than your own organization's.
The following table shows the responsibilities involved when using the different cloud service models. In particular, you should note that you are responsible for data classification, IAM, and endpoint security in every model:

| Responsibility | IaaS Customer | IaaS Provider | PaaS Customer | PaaS Provider | SaaS Customer | SaaS Provider |
| --- | --- | --- | --- | --- | --- | --- |
| Data classification | X | | X | | X | |
| Endpoint security | X | | X | | X | |
| Identity and Access Management | X | | X | X | X | X |
| Application security | X | | X | X | | X |
| Network controls | X | | X | X | | X |
| Host security | | X | | X | | X |
| Physical security | | X | | X | | X |

The mobile workforce and cloud-first strategy

Many organizations are facing the challenge of meeting the expectations of a mobile workforce, all with their own device preferences, a mix of private and professional commitments, and the request to use social media as an additional means of business communication. Let's dive into a short, practical, but extreme example. The AzureID company employs approximately 80 employees. They work with a SaaS landscape of eight services to drive all their business processes. On premises, they use Network-Attached Storage (NAS) to store some corporate data and provide network printers to all of the employees. Some of the printers are directly attached to the C-level of the company. The main issues today are that the employees need to remember the usernames and passwords of all the business applications, and that if they want to share some information with partners, they cannot give them partial access to the necessary information in a secure and flexible way. Another point is that if they want to access corporate data from their mobile devices, it's always a burden to provide every single login for the applications necessary to fulfil their job. The small IT department, with one Full-time Equivalent (FTE), is overloaded with having to create and manage every identity in each different service.
In addition, users forget their passwords periodically, and most of the time outside normal business hours. The following figure shows the actual infrastructure:

Let's analyze this extreme example to reveal some typical problems, so that you can match some ideas to your IT infrastructure:

- Provisioning, managing, and de-provisioning identities can be a time-consuming task
- There are no single identity and credentials for the employees
- There is no collaboration support for partner and consumer communication
- There is no Self-Service Password Reset functionality
- Sensitive information leaves the corporation over email
- There are no usage or security reports about the accessed applications/services
- There is no central way to enable multi-factor authentication for sensitive applications
- There is no secure strategy for accessing social media
- There is no usable, secure, and central remote access portal

Remember, shifting applications and services to the cloud just introduces more implications and challenges, not solutions. First of all, you need your IAM functionality accurately in place. You also still need to handle on-premises resources, such as minimal printer management.

An overview of feature and licensing decisions

With the cloud-first strategy of Microsoft, the Azure platform and its number of services grow constantly, and we have seen a lot of customers lost in a paradise of wonderful services and functionality. This brings us to the point of how to figure out the services relevant to IAM for you, and of giving them the space for explanation. Obviously, there are more services available that touch this field with a small subset of functionality, but due to the limited page count of this book, and the rapid pace of development, we will focus on the most important ones and will reference any other interesting content. The primary service for IAM is the Azure Active Directory service, which has also been the core directory service for Office 365 since 2011.
Every other SaaS offering of Microsoft is also based on this core service, including Microsoft Intune, Dynamics, and Visual Studio Online. So, if you are already an Office 365 customer, you will have your own instance of Azure Active Directory in place. For sustained access management and the protection of your information assets, the Azure Rights Management services are in place. There is also an option for Office 365 customers to use the included Azure Rights Management services. You can find further information about this by visiting the following link: http://bit.ly/1KrXUxz. Let's get started with the feature sets that can provide a solution, as shown in the following screenshot:

Including Azure Active Directory and Rights Management helps you to provide a secure solution with a central access portal for all of your applications, with just one identity and login for the employees, partners, and customers that you want to share your information with. With a few clicks, you can also add multi-factor authentication to your sensitive applications and administrative accounts. Furthermore, you can directly add a Self-Service Password Reset functionality that your users can use to reset their passwords themselves. As the administrator, you will receive predefined security and usage reports to control your complete application ecosystem. To protect your sensitive content, you get digital rights management capabilities with the Azure Rights Management services, giving you granular access rights on every device your information is used on. Doesn't it sound great? Let's take a deeper look into the functionality and usage of the different Microsoft Azure IAM services.

Azure Active Directory

Azure Active Directory is a fully managed multi-tenant service that provides IAM capabilities as a service. This service is not just an instance of the Windows Server Domain Controller you already know from your actual Active Directory infrastructure.
Azure AD is not a replacement for Windows Server Active Directory either. If you already use a local Active Directory infrastructure, you can extend it to the cloud by integrating Azure AD, so that your users authenticate in the same way to on-premises and cloud services. Staying in the business view, we want to discuss some of the main features of Azure Active Directory. Firstly, we want to start with the Access Panel, which gives the user a central place to access all of their applications from any device and any location with single sign-on. The combination of the Azure Active Directory Access Panel and the Windows Server 2012 R2/2016 Web Application Proxy/ADFS capabilities provides an efficient way to securely publish web applications and services to your employees, partners, and customers. It can be a good replacement for your retired Forefront TMG/UAG infrastructure. Over this portal, your users can do the following:

- User and group management
- Access their business-relevant applications (on-premises, partner, and SaaS) with single sign-on or single logon
- Delegation of access control to the data, process, or project owner
- Self-service profile editing for correcting or adding information
- Self-service password change and reset
- Management of registered devices

With the Self-Service Password Reset functionality, a user gets a straightforward way to reset their password and prove their identity, for example through a phone call, an email, or by answering security questions. The different portals can be customized with your own corporate identity branding. To try the different portals, just use the following links: https://myapps.microsoft.com and https://passwordreset.microsoftonline.com. To complete our short introduction to the main features of Azure Active Directory, we will take a look at the reporting capabilities. With this feature, you get predefined reports with the following information provided.
By viewing and acting on these reports, you are able to control the whole application ecosystem published over the Azure AD Access Panel:

- Anomaly reports
- Integrated application reports
- Error reports
- User-specific reports
- Activity logs

From our discussions with customers, we recognize that a lot of the time the differences between the various Azure Active Directory editions are unclear. For that reason, we will include and explain the feature tables provided by Microsoft. We will start with the common features and then go through the premium features of Azure Active Directory.

Common features

First of all, we want to discuss the Access Panel portal so we can clear up some open questions. With the Azure AD Free and Basic editions, you can provide a maximum of 10 applications to every user. However, this doesn't mean that you are limited to 10 applications in total. Next, the portal link: right now, it cannot be changed to your own company-owned domain, such as https://myapps.inovit.ch. The only way you can do so is by providing an alias in your DNS configuration; the accessed link remains https://myapps.microsoft.com. Company branding leads us on to the next discussion point, where we are often asked how much corporate identity branding is possible. The following link provides you with all the necessary information for branding your solution: http://bit.ly/1Jjf2nw. Rounding up this short Q&A on the different feature sets is Application Proxy usage, one of the important differentiators between the Azure AD Free and Basic editions. The short answer is that with Azure AD Free, you cannot publish on-premises applications and services over the Azure AD Access Panel portal.
| Features | AAD Free | AAD Basic | AAD Premium |
| --- | --- | --- | --- |
| Directory as a Service (objects) | 500k | Unlimited | Unlimited |
| User/group management (UI or PowerShell) | X | X | X |
| Access Panel portal for SSO (per user) | 10 apps | 10 apps | Unlimited |
| User-based application access management/provisioning | X | X | X |
| Self-service password change (cloud users) | X | X | X |
| Directory synchronization tool | X | X | X |
| Standard security reports | X | X | X |
| High availability SLA (99.9%) | | X | X |
| Group-based application access management and provisioning | | X | X |
| Company branding | | X | X |
| Self-service password reset for cloud users | | X | X |
| Application Proxy | | X | X |
| Self-service group management for cloud users | | X | X |

Premium features

The Azure Active Directory Premium edition provides you with the entire IAM capabilities, including the usage licenses of the on-premises Microsoft Identity Manager. From a technical perspective, you need to use the Azure AD Connect utility to connect your on-premises Active Directory with the cloud, and the Microsoft Identity Manager to manage your on-premises identities and prepare them for your cloud integration. To acquire Azure AD Premium, you can also use the Enterprise Mobility Suite (EMS) bundle, which contains Azure AD Premium, Azure Rights Management, Microsoft Intune, and Advanced Threat Analytics (ATA) licensing. You can find more information about EMS by visiting http://bit.ly/1cJLPcM and http://bit.ly/29rupF4.

| Features | Azure AD Premium |
| --- | --- |
| Self-service password reset with on-premises write-back | X |
| Microsoft Identity Manager server licenses | X |
| Advanced anomaly security reports | X |
| Advanced usage reporting | X |
| Multi-Factor Authentication (cloud users) | X |
| Multi-Factor Authentication (on-premises users) | X |

Azure AD Premium reference: http://bit.ly/1gyDRoN

Multi-Factor Authentication for cloud users is also included in Office 365. The main difference is that you cannot use it for on-premises users and services, such as VPN or web servers.
Azure Active Directory Business to Business

One of the newest features based on Azure Active Directory is the new Business to Business (B2B) capability. It solves the problem of collaboration between business partners. It allows users to share business applications between partners without going through inter-company federation relationships and internally managed partner identities. With Azure AD B2B, you can create cross-company relationships by inviting and authorizing users from partner companies to access your resources. With this process, each company federates once with Azure AD, and each user is then represented by a single Azure AD account. This option also provides a higher security level, because if a user leaves the partner organization, access is automatically disallowed. Inside Azure AD, the user is handled as a guest and cannot traverse other users in the directory. The real permissions are provided over the correct associated group membership.

Azure Active Directory Business to Consumer

Azure Active Directory Business to Consumer (B2C) is another brand-new feature based on Azure Active Directory. This functionality supports signing in to your application using social networks such as Facebook, Google, or LinkedIn, and creating accounts with usernames and passwords specifically for your company-owned application. Self-service password management and profile management are also provided in this scenario. Additionally, Multi-Factor Authentication introduces a higher grade of security to the solution. Principally, this feature allows small and medium companies to hold their customers in a separate Azure Active Directory with all the capabilities, and more, in a similar way to the corporate-managed Azure Active Directory. With different verification options, you are also able to provide the necessary identity assurance required for more sensitive transactions.
Azure Active Directory Privileged Identity Management

Azure AD Privileged Identity Management provides you with the functionality to manage, control, and monitor your privileged identities. With this option, you can build up an RBAC solution over your Azure AD and other Microsoft online services, such as Office 365 or Microsoft Intune. The following activities can be performed with this functionality:

- You can discover the actually configured Azure AD administrators
- You can provide just-in-time administrative access
- You can get reports about administrator access history and assignment changes
- You can receive alerts about access to a privileged role

The following built-in roles can be managed with the current version:

- Global Administrator
- Billing Administrator
- Service Administrator
- User Administrator
- Password Administrator

Azure Multi-Factor Authentication

Protecting sensitive information or application access with additional authentication is an important task, not just in the on-premises world. In particular, it needs to be extended to every sensitive cloud service used. There are many variations for providing this level of security and additional authentication, such as certificates, smart cards, or biometric options. Smart cards, for example, depend on special hardware used to read them, and cannot be used in every scenario without limiting access to a specific device or piece of hardware. The following table gives you an overview of different attacks and how they can be mitigated with a well-designed and implemented security solution.
| Attacker | Security solution |
| --- | --- |
| Password brute force | Strong password policies |
| Shoulder surfing; key or screen logging | One-time password solution |
| Phishing or pharming | Server authentication (HTTPS) |
| Man-in-the-Middle; whaling (social engineering) | Two-factor authentication; certificate or one-time password solution |
| Certificate authority corruption; cross-channel attacks (CSRF) | Transaction signature and verification; non-repudiation |
| Man-in-the-Browser; key loggers | Secure PIN entry; secure messaging; browser (read-only); push button (token); three-factor authentication |

The Azure Multi-Factor Authentication functionality has been included in the Azure Active Directory Premium capabilities to address exactly the attacks described in the previous table. With a one-time password solution, you can build a very capable security solution for accessing information or applications from devices that cannot use smart cards as the additional authentication method. Otherwise, for small or medium-sized business organizations, a smart card deployment, including the appropriate management solution, will be too cost-intensive, and the Azure MFA solution can be a good alternative for reaching the expected higher security level. In discussions with our customers, we recognized that many don't realize that Azure MFA is already included in different Office 365 plans. They would be able to protect their Office 365 with multi-factor authentication out of the box, but they don't know it! This brings us to the following table from Microsoft, which compares the MFA functionality of Office 365 and Azure.
| Feature | O365 | Azure |
| --- | --- | --- |
| Administrators can enable/enforce MFA for end users | X | X |
| Use mobile app (online and OTP) as second authentication factor | X | X |
| Use phone call as second authentication factor | X | X |
| Use SMS as second authentication factor | X | X |
| App passwords for non-browser clients (for example, Outlook, Lync) | X | X |
| Default Microsoft greetings during authentication phone calls | X | X |
| Remember Me | X | X |
| IP whitelist | | X |
| Custom greetings during authentication phone calls | | X |
| Fraud alert | | X |
| Event confirmation | | X |
| Security reports | | X |
| Block/unblock users | | X |
| One-time bypass | | X |
| Customizable caller ID for authentication phone calls | | X |
| MFA Server – MFA for on-premises applications | | X |
| MFA SDK – MFA for custom apps | | X |

With the Office 365 capabilities of MFA, administrators are able to use basic functionality to protect their sensitive information. In particular, if on-premises users and services are integrated, the Azure MFA solution is needed. Note that Azure MFA and the on-premises installation of the MFA server cannot be used to protect your Windows Server DirectAccess implementation. Furthermore, you will find the customizable caller ID limited to specific regions.

Azure Rights Management

More and more organizations are in a position where they need to provide a continuous and integrated information protection solution to protect sensitive assets and information. On one side stands the department, which carries out its business activities, generates data, and then processes it. Furthermore, it uses the data inside and outside the functional areas, passes it on, and runs a lively exchange of information. On the other hand, revision is required by legal regulations that prescribe measures to ensure that information is handled properly and that dangers such as industrial espionage and data loss are avoided. So safeguarding sensitive information is a big concern.
While staff appreciate the many ways of communication and data exchange, this development stresses IT security officers and worries managers. The fear is that critical corporate data remains uncontrolled and leaves the company or moves to competitors. The routes are varied, but data is often lost through inadvertent delivery via email. In addition, sensitive data can leave the company on a USB stick or smartphone, and IT media can be lost or stolen. New risks are also added, such as employees posting information on social media platforms. IT must ensure the protection of data in all phases, and traditional IT security solutions are not always sufficient. The following figure illustrates this situation and leads us to the Azure Rights Management services. Like the other additional features, the base functionality is included in different Office 365 plans. The main difference is that only the Azure RMS edition can be integrated into an on-premises file server environment in order to use the File Classification Infrastructure (FCI) feature of the Windows Server file server role. The Azure RMS capability allows you to protect your sensitive information based on classification information with a granular access control system. The following table, provided by Microsoft, shows the differences between the Office 365 and Azure RMS functionality. Azure RMS is included with the E3, E4, A3, and A4 plans.
| Feature | RMS O365 | RMS Azure |
| --- | --- | --- |
| Users can create and consume protected content by using Windows clients and Office applications | X | X |
| Users can create and consume protected content by using mobile devices | X | X |
| Integrates with Exchange Online, SharePoint Online, and OneDrive for Business | X | X |
| Integrates with Exchange Server 2013/2010 and SharePoint Server 2013/2010 on-premises via the RMS connector | X | X |
| Administrators can create departmental templates | X | X |
| Organizations can create and manage their own RMS tenant key in a hardware security module (the Bring Your Own Key solution) | X | X |
| Supports non-Office file formats: text and image files are natively protected; other files are generically protected | X | X |
| RMS SDK for all platforms: Windows, Windows Phone, iOS, Mac OS X, and Android | X | X |
| Integrates with Windows file servers for automatic protection with FCI via the RMS connector | | X |
| Users can track usage of their documents | | X |
| Users can revoke access to their documents | | X |

In particular, the tracking feature helps users find out where their documents have been distributed and allows them to revoke access to a single protected document.

Microsoft Azure security services in combination

Now that we have discussed the relevant Microsoft Azure IAM capabilities, you can see that Microsoft provides more than just single features or subsets of functionality. It brings a whole solution to the market that provides functionality for every facet of IAM. Microsoft Azure also combines clear service management with IAM, making it a rich solution for your organization. You can work with this toolset in a native cloud-first scenario, a hybrid scenario, or a complex hybrid scenario, and can extend your solution to every possible use case or environment.
The following figure illustrates all the different topics that are covered by Microsoft Azure security solutions:

Defining the benefits and costs

The Microsoft Azure IAM capabilities help you to empower your users with a flexible and rich solution that enables better business outcomes in a more productive way. You help your organization improve regulatory compliance overall and reduce information security risk. Additionally, you may be able to reduce IT operating and development costs by providing higher operating efficiency and transparency. Last but not least, it will lead to improved user satisfaction and better support from the business for further investments. The following toolset gives you very good instruments for calculating the costs of your specific environment:

- Azure Active Directory Pricing Calculator: http://bit.ly/1fspdhz
- Enterprise Mobility Suite Pricing: http://bit.ly/1V42RSk
- Microsoft Azure Pricing Calculator: http://bit.ly/1JojUfA

Principles of security and legal requirements

The classification of data, such as business information or personal data, is not only necessary for an on-premises infrastructure. It is a basis for the assurance of business-related information and is required for compliance with official regulations. These requirements are of greater significance when using cloud services or solutions outside your own company and regulation borders. They are clearly needed for a controlled shift of data into an area in which responsibilities must be regulated by contract. Safety limits do not stop at the private cloud, where you remain responsible for the technical and organizational implementation and control of security settings.
The subsequent objectives are as follows:

- Construction, extension, or adaptation of data classification for cloud integration
- Data classification as a basis for encryption or isolated security silos
- Data classification as a basis for authentication and authorization

Microsoft itself has strict controls that restrict Microsoft employees' access to Azure. Microsoft also enables customers to control access to their Azure environments, data, and applications, as well as allowing them to penetration-test and audit services with special auditors and regulations on request. A statement from Microsoft:

"Customers will only use cloud providers in which they have great trust. They must trust that the privacy of their information will be protected, and that their data will be used in a way that is consistent with their expectations. We build privacy protections into Azure through Privacy by Design."

You can get all the necessary information about security, compliance, and privacy by visiting http://bit.ly/1uJTLAT.

Summary

Now that you are fully clued up with information about typical needs and challenges, and about feature and licensing options, you should be able to apply the right technology and licensing model to your cloud-only scenario. You should also be aware of the benefits and of the cost calculators that will help you calculate a basic price model for your required services. Furthermore, you can also decide which security and legal requirements are relevant for your cloud-only environments.

Resources for Article:

Further resources on this subject:

- Creating Multitenant Applications in Azure [article]
- Building A Recommendation System with Azure [article]
- Installing and Configuring Windows Azure Pack [article]
Creating a Slack Progress Bar

Bradley Cicenas
13 Sep 2016
4 min read
There are many ways you can customize Slack, but with a little bit of programming you can create things that provide all sorts of productivity-enhancing options. In this post, you are going to learn how to use Python to create a custom progress bar that will run in your very own Slack channel. You can use this progress bar in many different ways. We won't get into specifics beyond the progress bar itself, but one idea would be to use it to broadcast the progress of a to-do list in Trello; that is something that is easy to set up in Slack. Let's get started.

Real-time notifications are a great way to follow along with the events of the day. For ever-changing and long-running events, though, a stream of update messages quickly becomes spammy and overwhelming. If you make a typo in a sent message, or want to append additional details, you know that you can go back and edit it without sending a new chat (and subsequently a new notification for those in the channel); so why can't your bot friends do the same? As you may have guessed by now, they can! This program will make use of Slack's chat.update method to create a dynamic progress bar in just one single-line message. Save the script below to a file named progress.py, updating the SLACK_TOKEN and SLACK_CHANNEL variables at the top with your configured token and desired channel name.
```python
from time import sleep
from random import randint

from slacker import Slacker

SLACK_TOKEN = '<token>'
SLACK_CHANNEL = '#general'


class SlackSimpleProgress(object):
    def __init__(self, slack_token, slack_channel):
        self.slack = Slacker(slack_token)
        self.channel = slack_channel
        res = self.slack.chat.post_message(self.channel, self.make_bar(0))
        self.msg_ts = res.body['ts']
        self.channel_id = res.body['channel']

    def update(self, percent_done):
        self.slack.chat.update(self.channel_id, self.msg_ts,
                               self.make_bar(percent_done))

    @staticmethod
    def make_bar(percent_done):
        # chr(9608) is the full-block character used to draw the bar
        return '%s%s%%' % (int(round(percent_done / 5)) * chr(9608), percent_done)


if __name__ == '__main__':
    progress_bar = SlackSimpleProgress(SLACK_TOKEN, SLACK_CHANNEL)  # initialize the progress bar
    percent_complete = 1
    while percent_complete < 100:
        progress_bar.update(percent_complete)
        percent_complete += randint(1, 10)
        sleep(1)
    progress_bar.update(100)
```

Run the script with a simple python progress.py and, like magic, your progress bar will appear in the channel, updating at regular intervals until completion.

How It Works

In the last six lines of the script:

1. A progress bar is initially created with SlackSimpleProgress.
2. The update method is used to refresh the current position of the bar once a second.
3. percent_complete is slowly incremented by a random amount using Python's random.randint.
4. We set the progress bar to 100% when completed.

Extending

With the basic usage down, the same script can be updated to serve a real purpose.
Say we're copying a large number of files between directories for a backup. We'll replace the __main__ part of the script with:

```python
import os
import shutil
import sys

src_path = sys.argv[1]
dest_path = sys.argv[2]

all_files = next(os.walk(src_path))[2]

progress_bar = SlackSimpleProgress(SLACK_TOKEN, SLACK_CHANNEL)  # initialize the progress bar

files_copied = 0
for file_name in all_files:
    src_file = '%s/%s' % (src_path, file_name)
    dest_file = '%s/%s' % (dest_path, file_name)
    print('copying %s to %s' % (src_file, dest_file))  # print the file being copied
    shutil.copy2(src_file, dest_file)
    files_copied += 1  # increment files copied
    percent_done = files_copied / len(all_files) * 100  # calculate percent done
    progress_bar.update(percent_done)
```

The script can now be run with two arguments: python progress.py /path/to/files /path/to/destination. You'll see the name of each file copied in the output of the script, and your teammates will be able to follow along with the status of the copy as it progresses in your Slack channel!

I hope you find many uses for this progress bar in Slack!

About the author

Bradley Cicenas is a New York City-based infrastructure engineer with an affinity for microservices, systems design, data science, and stoops.
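A closing idea for the copy script above: each chat.update call counts against Slack's API rate limits, so for jobs with thousands of files it can be worth throttling updates. The wrapper below is a hypothetical addition, not part of the original script; it forwards to any object with an update method, such as SlackSimpleProgress, only when the bar visibly moves:

```python
class ThrottledProgress(object):
    """Forward updates to a progress bar only when the percentage has
    moved by at least `step` points, to stay well under rate limits."""

    def __init__(self, bar, step=5):
        self.bar = bar
        self.step = step
        self.last_sent = None  # nothing sent yet

    def update(self, percent_done):
        if (self.last_sent is None
                or percent_done - self.last_sent >= self.step
                or percent_done >= 100):
            self.bar.update(percent_done)
            self.last_sent = percent_done
```

Swapping `progress_bar` for `ThrottledProgress(progress_bar)` in the copy loop keeps the channel message fresh without hammering the API on every single file.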
Computer Vision with Keras, Part 2

Sasank Chilamkurthy
13 Sep 2016
6 min read
If you were following along in Part 1, you will have seen how we used Keras to create our model for tackling the German Traffic Sign Recognition Benchmark (GTSRB). Now, in Part 2, you will see how we achieve performance close to human-level performance. You will also see how to improve the accuracy of the model using augmentation of the training data.

Training

Now, our model is ready to train. During training, our model will iterate over batches of the training set, each of size batch_size. For each batch, gradients will be computed and updates will be made to the weights of the network automatically. One iteration over all of the training set is referred to as an epoch. Training is usually run until the loss converges to a constant. We will add a couple of features to our training:

- Learning rate scheduler: decaying the learning rate over the epochs usually helps the model learn better.
- Model checkpoint: we will save the model with the best validation accuracy. This is useful because our network might start overfitting after a certain number of epochs, but we want the best model.

These are not strictly necessary, but they improve the model's accuracy. They are implemented via the callback feature of Keras. Callbacks are a set of functions that will be applied at given stages of the training procedure, such as the end of an epoch. Keras provides built-in functions for both learning rate scheduling and model checkpointing.
```python
from keras.callbacks import LearningRateScheduler, ModelCheckpoint

def lr_schedule(epoch):
    # lr is the base learning rate defined in Part 1
    return lr * (0.1 ** int(epoch / 10))

batch_size = 32
nb_epoch = 30

model.fit(X, Y,
          batch_size=batch_size,
          nb_epoch=nb_epoch,
          validation_split=0.2,
          callbacks=[LearningRateScheduler(lr_schedule),
                     ModelCheckpoint('model.h5', save_best_only=True)]
          )
```

You'll see that the model starts training and logs the losses and accuracies:

```
Train on 31367 samples, validate on 7842 samples
Epoch 1/30
31367/31367 [==============================] - 30s - loss: 1.1502 - acc: 0.6723 - val_loss: 0.1262 - val_acc: 0.9616
Epoch 2/30
31367/31367 [==============================] - 32s - loss: 0.2143 - acc: 0.9359 - val_loss: 0.0653 - val_acc: 0.9809
Epoch 3/30
31367/31367 [==============================] - 31s - loss: 0.1342 - acc: 0.9604 - val_loss: 0.0590 - val_acc: 0.9825
...
```

Now this might take a bit of time, especially if you are running on a CPU. If you have an Nvidia GPU, you should install CUDA; it speeds up the training dramatically. For example, on my MacBook Air it takes 10 minutes per epoch, while on a machine with an Nvidia Titan X GPU it takes 30 seconds. Even modest GPUs offer an impressive speedup because of the inherent parallelizability of neural networks. This makes GPUs necessary for deep learning if anything big has to be done. Grab a coffee while you wait for training to complete ;). Congratulations! You have just trained your first deep learning model.
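The lr_schedule passed to LearningRateScheduler above implements step decay. To sanity-check it, you can tabulate the decayed rate per epoch; here lr is assumed to be 0.01 (substitute whatever base rate you set in Part 1):

```python
def lr_schedule(epoch, lr=0.01):
    # step decay: divide the base learning rate by 10 every 10 epochs
    return lr * (0.1 ** int(epoch / 10))

# epochs 0-9 train at the base rate, 10-19 at a tenth of it, and so on
for epoch in (0, 9, 10, 19, 20, 29):
    print('epoch %2d -> lr %g' % (epoch, lr_schedule(epoch)))
```

Starting relatively high lets the network make large early progress, while the smaller late-stage rates let it settle into a good minimum.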
Evaluation

Let's quickly load the test data and evaluate our model on it:

```python
import pandas as pd

test = pd.read_csv('GT-final_test.csv', sep=';')  # Load test dataset

X_test = []
y_test = []
for file_name, class_id in zip(list(test['Filename']), list(test['ClassId'])):
    img_path = os.path.join('GTSRB/Final_Test/Images/', file_name)
    X_test.append(preprocess_img(io.imread(img_path)))
    y_test.append(class_id)

X_test = np.array(X_test)
y_test = np.array(y_test)

# predict and evaluate
y_pred = model.predict_classes(X_test)
acc = np.sum(y_pred == y_test) / np.size(y_pred)
print("Test accuracy = {}".format(acc))
```

This outputs on my system (results may change a bit because the weights of the neural network are randomly initialized):

```
12630/12630 [==============================] - 2s
Test accuracy = 0.9792557403008709
```

97.92%! That's great! It's not far from average human performance (98.84%) [1]. A lot of things can be done to squeeze out extra performance from the neural net. I'll implement one such improvement in the next section.

Data Augmentation

You might think 40000 images is a lot of images. Think about it again. Our model has 1358155 parameters (try model.count_params() or model.summary()). That's 34X the number of training images. If we can generate new images for training from the existing images, that will be a great way to increase the size of the dataset. This can be done by slightly:

- translating the image
- rotating the image
- shearing the image
- zooming in/out of the image

Rather than generating and saving such images to hard disk, we will generate them on the fly during training. This can be done directly using the built-in functionality of Keras.
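To build some intuition for what one of these transforms does, here is a minimal random-translation sketch in plain NumPy (purely illustrative and not part of the original pipeline; the Keras generator below handles translation, rotation, shear, and zoom for us):

```python
import numpy as np


def random_translate(img, max_shift=2, rng=np.random):
    """Shift an (H, W, C) image by up to max_shift pixels per axis,
    filling the exposed border with edge pixels."""
    dy, dx = rng.randint(-max_shift, max_shift + 1, size=2)
    m = max_shift
    # pad the image on every side, then crop a window offset by (dy, dx)
    padded = np.pad(img, ((m, m), (m, m), (0, 0)), mode='edge')
    h, w = img.shape[:2]
    return padded[m + dy: m + dy + h, m + dx: m + dx + w]
```

Each call produces a slightly different, same-shaped image, so the network never sees exactly the same pixels twice across epochs.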
```python
from keras.preprocessing.image import ImageDataGenerator
from sklearn.cross_validation import train_test_split

X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=42)

datagen = ImageDataGenerator(featurewise_center=False,
                             featurewise_std_normalization=False,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.2,
                             shear_range=0.1,
                             rotation_range=10.)
datagen.fit(X_train)

# Reinitialize model and compile
model = cnn_model()
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

# Train again
nb_epoch = 30
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=nb_epoch,
                    validation_data=(X_val, Y_val),
                    callbacks=[LearningRateScheduler(lr_schedule),
                               ModelCheckpoint('model.h5', save_best_only=True)]
                    )
```

With this model, I get 98.29% accuracy on the test set. Frankly, I haven't done much parameter tuning. Here is a small list of things that can be tried to improve the model:

- Try different network architectures: deeper and shallower networks
- Try adding BatchNormalization layers to the network
- Experiment with different weight initializations
- Try different learning rates and schedules
- Make an ensemble of models
- Try normalization of input images
- More aggressive data augmentation

This is but a model for beginners. For state-of-the-art solutions to the problem, you can have a look at this, where the authors achieve 99.61% accuracy with a specialized layer called the Spatial Transformer layer.

Conclusion

In this two-part post, you have learned how to use convolutional networks to solve a computer vision problem. We used the Keras deep learning framework to implement CNNs in Python. We achieved performance close to human-level performance, and we also saw a way to improve the accuracy of the model using augmentation of the training data.

References:

[1] Stallkamp, Johannes, et al. "Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition." Neural Networks 32 (2012): 323-332.

About the author

Sasank Chilamkurthy works at Qure.ai. His work involves deep learning on medical images obtained from radiology and pathology. He completed his undergraduate studies at the Indian Institute of Technology, Bombay. He can be found on GitHub here.
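Picking one item from the improvement list above, an ensemble of models: a simple and effective scheme is to average the predicted class probabilities of several independently trained networks and take the argmax. This is a hedged sketch, where models is assumed to be a list of trained Keras-style objects exposing a predict method:

```python
import numpy as np


def ensemble_predict(models, X):
    """Average the softmax outputs of every model, then pick the top class
    for each sample."""
    probs = np.mean([m.predict(X) for m in models], axis=0)
    return np.argmax(probs, axis=1)
```

Because each network makes slightly different mistakes, averaging their probabilities usually recovers a few of the examples any single model gets wrong.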