How-To Tutorials

7018 Articles
Packt
08 May 2015
2 min read

AngularJS Web Application Development Cookbook

Architect performant applications and implement best practices in AngularJS. Packed with easy-to-follow recipes, this practical guide will show you how to unleash the full might of the AngularJS framework. Skip straight to practical solutions and quick, functional answers to your problems without hand-holding or slogging through the basics.

Some highlights include:

  • Architecting recursive directives
  • Extensively customizing your search filter
  • Custom routing attributes
  • Animating ngRepeat
  • Animating ngInclude, ngView, and ngIf
  • Animating ngSwitch
  • Animating ngClass and class attributes
  • Animating ngShow and ngHide

The goal of this text is to have you walk away from reading about an AngularJS concept armed with a solid understanding of how it works, insight into the best ways to wield it in real-world applications, and annotated code examples to get you started.

Why you should buy this book

The book is a collection of recipes demonstrating optimal organization, scalable architecture, and best practices for use in small and large-scale production applications. Each recipe contains complete, functioning examples and detailed explanations of how and why they are organized and built that way, as well as alternative design choices for different situations.

The author of this book, Matt, is a full-stack developer at DoorDash (YC S13), which he joined as the first engineer. He led the company's adoption of AngularJS, and he also focuses on its infrastructural, predictive, and data projects. Matt has a degree in Computer Engineering from the University of Illinois at Urbana-Champaign. He is the author of the video series Learning AngularJS, available through O'Reilly Media. Previously, he worked as an engineer at several educational technology start-ups.

Almost every example in this book has been added to JSFiddle, with the links provided in the book. This allows you to test and modify the code simply by visiting a URL, with no setup of any kind, on any major browser and on any major operating system.

Packt
08 May 2015
4 min read

Mastering Lumion 3D

Welcome to this treasure house of Lumion 3D! This article will guide you through the intricacies of using Lumion, the next-generation graphical tool. It will also present crisp notes from the book, Mastering Lumion 3D.

Why Lumion 3D?

"To suppose that the eye with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I confess, absurd in the highest degree..." - Charles Darwin

The eye is indeed one of the most complex structures man is gifted with. The eye beholds the beauty of nature in the most magnificent way. To replicate what the eye sees in nature onto the computer screen requires the finest tools man has ever developed. Lumion 3D is one such tool. It has a strong emphasis on architectural visualization. Hobbyists use it a lot to create architectural CAD art. Professionals use it to create mock-ups of concepts. The result is ultra-realistic visualizations of concepts for buildings, outside areas, furnishings, landmarks, and skyscrapers. The core appeal of the product is that architects and designers do not need computer graphics skills; they only need to learn the tools and workflow in Lumion.

Highlights of Lumion 3D

Indeed, you can breathe life into your architectural designs with Lumion. These are some of the features that stand out:

  • Ease of use: Adding trees, clouds, people, artistic effects, and materials, and converting your 3DS Max design into an amazing 3D image or 3D flythrough movie is easy. It enables anyone to create movies and images without any prior training. Through Lumion, you can make beautiful SketchUp renderings yourself. You don't need to outsource this work any longer.
  • Fine graphics: The graphics in Lumion are excellent. It has an edge over other contemporary software in this regard.
  • Speed: Generating animation is fast with Lumion. In comparison to traditional 3D rendering programs that take days to process animations, Lumion takes anything from a few minutes to a few hours.

Why Mastering Lumion 3D?

Lumion can be an intuitive tool, but that doesn't mean we can automatically produce a better architectural visualization. Ciro Cardoso, the author, wrote this book because, like you, the first time he picked up Lumion he felt that something was missing from his projects. Mastering Lumion 3D covers the process of picking a 3D model, preparing it, and then building layers on top of layers of detail, using textures and optimized 3D models. However, it doesn't stop there, because several chapters are dedicated exclusively to explaining how to use Lumion's effects and other special features to take your project to an expert level.

This book is written in a way that will hopefully cover all the questions you may have when taking your first steps with Lumion. On the other hand, if you are an intermediate or advanced user, you can find some unique techniques that will make you look at Lumion from another perspective. The book draws not only on the author's experience, but also on what he learned while working with other great professionals.

What you need for this book

Lumion Version 4 is used for all the examples in this book, but you can follow the explanations using the free version or a previous Lumion version. Although Adobe Photoshop is used in some examples, you can use GIMP as an alternative. Also, ensure that your system has a good graphics card. This will make working with Lumion fun.

Who this book is for

This book is designed for all levels of Lumion users, from beginners to advanced users. You will find useful insights and professional techniques to improve and develop your skills in order to fully control and master Lumion. However, this book doesn't cover the process of transforming 2D information (CAD plan) into a 3D model. If you are really interested in drawing great architectural designs, this book is for you.

Packt
08 May 2015
3 min read

Learning Selenium Testing Tools with Python

Selenium is a portable software testing framework for web applications. It is open-source software, released under the Apache 2.0 license, and can be downloaded and used without charge.

Selenium WebDriver is the successor to Selenium RC. It accepts commands and sends them to a browser through a browser-specific driver, which passes the commands on to the browser and retrieves the results. Selenium WebDriver is a set of open source tools and libraries to automate browsers. It has gained wide acceptance and has become a tool of choice for automated testing of web applications. WebDriver is now specified by the W3C.

The beauty of Selenium WebDriver is that the user can write automated tests in many languages, thanks to its platform-agnostic approach. It provides client libraries for Java, C#, Python, Ruby, JavaScript, and more. Over the years, Selenium has become a very powerful testing platform, and many organizations are adopting it over commercial tools.

In the book Learning Selenium Testing Tools with Python by Unmesh Gundecha, you will learn the following topics:

  • Creating Selenium WebDriver tests using Python's unittest module
  • Using Selenium WebDriver for cross-browser testing
  • Building reliable and robust tests using implicit and explicit waits
  • Setting up and using Selenium Grid for a distributed run
  • Testing web applications on mobile platforms such as iOS and Android using Appium
  • Using various methods provided by Selenium WebDriver to locate web elements and interact with them
  • Capturing screenshots and video of the test execution

This book is a practical guide to automated web testing with Selenium testing tools using Python. It is written for users with previous Python experience; no previous knowledge of Selenium WebDriver is needed. The author provides step-by-step tutorials with practical examples that will help you build automated tests for your web applications using the Selenium WebDriver Python client library. With the help of this book, you can use Selenium for automated testing in the real world and explore the Selenium WebDriver API for small to complex operations on browsers and web applications.

Summary

The main aim of this book is to cover the fundamentals of Selenium testing with Python. You will learn how the Selenium WebDriver Python API can be integrated with CI and build tools to allow tests to be run while building applications. This book will guide you through using the Selenium WebDriver Python client library as well as other tools from the Selenium project. Towards the end of this book, you'll get to grips with Selenium Grid, which is used for running tests in parallel using nodes for cross-browser testing. It will also give you a basic overview of the concepts, while helping you improve your practical testing skills with Python and Selenium.

Benjamin Reed
06 May 2015
8 min read

NodeJS: Building a Maintainable Codebase

NodeJS has become the most anticipated web development technology since Ruby on Rails. This is not an introduction to Node, so a few points should be clear before going further.

First, you must realize that NodeJS is not a direct competitor to Rails or Django. Instead, Node is a runtime, together with a set of core libraries, that lets JavaScript run on the V8 engine outside the browser. Node powers many tools, and some of them have nothing to do with scaling a web application. For instance, GitHub’s Atom editor is built on top of Node. It is Node’s web application frameworks, like Express, that are the competitors. This article applies to all environments that use Node.

Second, Node is designed around an asynchronous model, but not all of the operations in Node are asynchronous. Many libraries offer both synchronous and asynchronous options, and a Node developer must decide which operation best fits his or her needs.

Third, you should have a solid understanding of the concept of a callback in Node.

Over the course of two weeks, our team attempted to refactor a Rails app into an Express application. We loved the concepts behind Node, and we truly believed that all we needed was a barebones framework. We transferred our controller logic over to Express routes in a weekend. We were a team of beginners, and I will analyze some of the pitfalls that we came across. Hopefully, this will help you identify strategies to tackle Node with your team.

First, attempt to structure callbacks and avoid anonymous functions. As we added more and more logic, we added more and more callbacks. Everything was beautifully asynchronous, and our code would successfully run. However, we soon found ourselves debugging an anonymous function nested inside other anonymous functions. In other words, the codebase was incredibly difficult to follow.
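As a quick refresher on the two points above (synchronous versus asynchronous options, and callbacks), here is a minimal sketch. The names parseConfigSync, parseConfig, and onConfigParsed are hypothetical and exist only for illustration. The sketch assumes the common Node convention of an error-first callback, and uses a named function instead of an inline anonymous one.

```javascript
// Synchronous variant: returns the parsed value, or throws on bad input.
function parseConfigSync(text) {
  return JSON.parse(text);
}

// Asynchronous variant: reports through an error-first callback on a later
// turn of the event loop, which is the shape most Node APIs use.
function parseConfig(text, callback) {
  setImmediate(function deliverResult() {
    var config;
    try {
      config = JSON.parse(text);
    } catch (err) {
      return callback(err);
    }
    // Called outside the try block so that errors thrown by the callback
    // itself are not mistaken for parse errors.
    callback(null, config);
  });
}

// A named callback shows up by name in stack traces, unlike an inline one.
function onConfigParsed(err, config) {
  if (err) {
    console.error('could not parse config:', err.message);
    return;
  }
  console.log('listening port would be', config.port);
}

parseConfig('{"port": 3000}', onConfigParsed);
parseConfig('not json', onConfigParsed);
```

Because onConfigParsed is defined once and passed by name, the same handler can be reused and located easily, which becomes more valuable as the nesting grows.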
Anyone starting out with Node will probably recognize this novice “spaghetti code.” Here’s a simple example of nested callbacks:

    router.put('/:id', function(req, res) {
      console.log("attempt to update bathroom");
      models.User.find({ where: {id: req.param('id')} }).success(function (user) {
        var raw_cell = req.param('cell') ? req.param('cell') : user.cell;
        var raw_email = req.param('email') ? req.param('email') : user.email;
        var raw_username = req.param('username') ? req.param('username') : user.username;
        var raw_digest = req.param('digest') ? req.param('digest') : user.digest;
        user.cell = raw_cell;
        user.email = raw_email;
        user.username = raw_username;
        user.digest = raw_digest;
        user.updated_on = new Date();
        user.save().success(function () {
          res.json(user);
        }).error(function () {
          res.json({"status": "error"});
        });
      })
      .error(function() {
        res.json({"status": "error"});
      })
    });

Notice that there are many success and error callbacks. Locating a specific callback is not difficult if the whitespace is perfect or the developer can count closing brackets back up to the destination. However, this is pretty nasty for any newcomer, and the illegibility will only increase as the application becomes more complex. A developer may get this response:

    {"status": "error"}

Where did this response come from? Did the ORM fail to update the object? Did it fail to find the object in the first place? A developer could add descriptions to the JSON in the chained error callbacks, but there has to be a better way. Let’s extract some of the callbacks into separate methods:

    router.put('/:id', function(req, res) {
      var id = req.param('id');
      var query = { where: {id: id} };

      // search for user
      models.User.find(query).success(function (user) {
        // parse req parameters
        var raw_cell = req.param('cell') ? req.param('cell') : user.cell;
        var raw_email = req.param('email') ? req.param('email') : user.email;
        var raw_username = req.param('username') ? req.param('username') : user.username;

        // set user attributes
        user.cell = raw_cell;
        user.email = raw_email;
        user.username = raw_username;
        user.updated_on = new Date();

        // attempt to save user
        user.save()
          .success(SuccessHandler.userSaved(res, user))
          .error(ErrorHandler.userNotSaved(res, id));
      })
      .error(ErrorHandler.userNotFound(res, id))
    });

    // Each handler returns a named callback, so no response is sent
    // until the ORM actually invokes it.
    var ErrorHandler = {
      userNotFound: function(res, user_id) {
        return function onUserNotFound() {
          res.json({"status": "error", "description": "The user with the specified id could not be found.", "user_id": user_id});
        };
      },
      userNotSaved: function(res, user_id) {
        return function onUserNotSaved() {
          res.json({"status": "error", "description": "The update to the user with the specified id could not be completed.", "user_id": user_id});
        };
      }
    };

    var SuccessHandler = {
      userSaved: function(res, user) {
        return function onUserSaved() {
          res.json(user);
        };
      }
    };

This seemed to help clean up our minimal sample. Only one anonymous callback is now nested inside the route handler. The code seems to be a lot more readable and independent. However, our code is still cluttered by chaining success and error callbacks. One could make these global mutable variables, or perhaps we can consider another approach. Futures, also known as promises, are becoming more prominent. Twitter has adopted them in Scala. It is definitely something to consider.

Next, do what makes your team comfortable and productive. At the same time, do not compromise the integrity of the project. There are numerous posts that encourage certain styles over others. There are also extensive posts on the subject of CoffeeScript. If you aren’t aware, CoffeeScript is a language with some added syntactic flavor that compiles to JavaScript. Our team was made up primarily of Ruby developers, and it definitely appealed to us. When we migrated some of the project over to CoffeeScript, we found that our code was a lot shorter and appeared more legible. GitHub uses CoffeeScript for the Atom text editor to this day, and the Rails community has openly embraced it.
The majority of Node module documentation will use JavaScript, so CoffeeScript developers will have to become acquainted with translation. There are some problems with CoffeeScript being ES6 ready, and there are some modules that are clearly not meant to be used from CoffeeScript. CoffeeScript is an open source project, but it appears to have a good backbone and a stable community. If your developers are more comfortable with it, use it.

When it comes to open source projects, everyone tends to trust them. In the purest form, open source projects are absolutely beautiful. They make the lives of all developers better. Nobody has to re-implement the wheel unless they choose to. Obviously, both Node and CoffeeScript are open source. However, the community is very new, and it is dangerous to assume that any package you find on NPM is stable.

For us, the problem occurred when we searched for an ORM. We truly missed ActiveRecord, and we assumed that other projects would work similarly. We tried several solutions, and none of them behaved the way we wanted. Besides having to express our entire schema in a JavaScript format, we found relations to be a bit of a hack. Settling on one, we ran our server. And our database cleared out. That’s fine in development, but we struggled to find a way to get it into production. We needed more documentation. Also, the module was not designed with CoffeeScript in mind. We practically needed to revert to JavaScript. In contrast, the Node community has openly embraced some NoSQL databases, such as MongoDB. They are definitely worth considering.

Either way, make sure that your team’s dependencies are very well documented. There should be written documentation for each exposed object, function, and so on.

To sum everything up, this article comes down to two fundamental things learned in any computer science class: write modular code and document everything.
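As one way to apply the “write modular code” advice to the promise idea raised earlier, the following is a minimal sketch, not the team’s actual code. It assumes a runtime with native Promise support. The functions findUser, applyChanges, saveUser, and updateUser, and the in-memory users object, are hypothetical stand-ins for an ORM and do not mirror any real library’s API.

```javascript
// Hypothetical in-memory stand-in for the database; not a real ORM.
var users = { 1: { id: 1, email: 'old@example.com', cell: '555-0100' } };

// Resolves with the user, or rejects with an error that names the failing step.
function findUser(id) {
  return new Promise(function (resolve, reject) {
    var user = users[id];
    if (user) {
      resolve(user);
    } else {
      reject(new Error('user ' + id + ' could not be found'));
    }
  });
}

// Pure helper: keep the existing value for any field missing from the request.
function applyChanges(user, changes) {
  return {
    id: user.id,
    email: changes.email || user.email,
    cell: changes.cell || user.cell,
    updated_on: new Date()
  };
}

// Stores the updated record and resolves with it.
function saveUser(user) {
  return new Promise(function (resolve) {
    users[user.id] = user;
    resolve(user);
  });
}

// One flat chain instead of nested success/error pairs.
function updateUser(id, changes) {
  return findUser(id)
    .then(function (user) { return applyChanges(user, changes); })
    .then(saveUser);
}

updateUser(1, { email: 'new@example.com' })
  .then(function (user) { console.log('saved', user.email); })
  .catch(function (err) { console.log('error:', err.message); });
```

A single catch at the end receives whichever step failed first, and because each step builds its own error message, the question “where did this response come from?” has a direct answer.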
Do your research on Node and find a style that is legible for your team and any newcomers. A NodeJS project can only be maintained if the developers using the framework recognize the importance of the project’s future. If your code is messy now, it will only become messier. If you cannot find necessary information in a module’s documentation, you will probably miss other information when there is a problem in production. Don’t take shortcuts. A Node application can only be as good as its developers and dependencies.

About the Author

Benjamin Reed began Computer Science classes at a nearby university in Nashville during his sophomore year in high school. Since then, he has become an advocate for open source. He is now pursuing degrees in Computer Science and Mathematics full-time. The Ruby community has intrigued him, and he openly expresses support for the Rails framework. When asked, he says that studying Rails has led him to some of the best practices and, ultimately, has made him a better programmer. iOS development is one of his hobbies, and he enjoys scouting out new projects on GitHub. On GitHub, he’s appropriately named @codeblooded. On Twitter, he’s @benreedDev.

Packt
06 May 2015
17 min read

A Command-line Companion Called Artisan

In this article by Martin Bean, author of the book Laravel 5 Essentials, we will see how Laravel's command-line utility, Artisan, has a wide range of capabilities and can be used to run and automate all sorts of tasks. In the next pages, you will learn how Artisan can help you:

  • Inspect and interact with your application
  • Enhance the overall performance of your application
  • Write your own commands

By the end of this tour of Artisan's capabilities, you will understand how it can become an indispensable companion in your projects.

Keeping up with the latest changes

New features are constantly being added to Laravel. If a few days have passed since you first installed it, try running a composer update command from your terminal. You should see the latest versions of Laravel and its dependencies being downloaded. Since you are already in the terminal, finding out about the latest features is just one command away:

    $ php artisan changes

This saves you from going online to find a change log or reading through a long history of commits on GitHub. It can also help you learn about features that you were not aware of. You can also find out which version of Laravel you are running by entering the following command:

    $ php artisan --version
    Laravel Framework version 5.0.16

All Artisan commands have to be run from your project's root directory. With the help of a short script such as Artisan Anywhere, available at https://github.com/antonioribeiro/artisan-anywhere, it is also possible to run Artisan from any subfolder in your project.

Inspecting and interacting with your application

With the route:list command, you can see at a glance which URLs your application will respond to, what their names are, and whether any middleware has been registered to handle requests. This is probably the quickest way to get acquainted with a Laravel application that someone else has built.
To display a table with all the routes, all you have to do is enter the following command:

    $ php artisan route:list

In some applications, you might see /{v1}/{v2}/{v3}/{v4}/{v5} appended to particular routes. This is because the developer has registered a controller with implicit routing, and Laravel will try to match and pass up to five parameters to the controller.

Fiddling with the internals

When developing your application, you will sometimes need to run short, one-off commands to inspect the contents of your database, insert some data into it, or check the syntax and results of an Eloquent query. One way you could do this is by creating a temporary route with a closure that is going to trigger these actions. However, this is less than practical since it requires you to switch back and forth between your code editor and your web browser. To make these small changes easier, Artisan provides a command called tinker, which boots up the application and lets you interact with it. Just enter the following command:

    $ php artisan tinker

This will start a Read-Eval-Print Loop (REPL) similar to what you get when running the php -a command, which starts an interactive shell. In this REPL, you can enter PHP commands in the context of the application and immediately see their output:

    > $cat = 'Garfield';
    > App\Cat::create(['name' => $cat, 'date_of_birth' => new DateTime]);
    > echo App\Cat::whereName($cat)->get();
    [{"id":"4","name":"Garfield","date_of_birth":…}]
    > dd(Config::get('database.default'));

Version 5 of Laravel leverages PsySH, a PHP-specific REPL that provides a more robust shell with support for keyboard shortcuts and history.

Turning the engine off

Whether it is because you are upgrading a database or waiting to push a fix for a critical bug to production, you may want to manually put your application on hold to avoid serving a broken page to your visitors.
You can do this by entering the following command:

    $ php artisan down

This will put your application into maintenance mode. You can determine what to display to users when they visit your application in this mode by editing the template file at resources/views/errors/503.blade.php (since maintenance mode sends an HTTP status code of 503 Service Unavailable to the client). To exit maintenance mode, simply run the following command:

    $ php artisan up

Fine-tuning your application

For every incoming request, Laravel has to load many different classes, and this can slow down your application, particularly if you are not using a PHP accelerator such as APC, eAccelerator, or XCache. In order to reduce disk I/O and shave off precious milliseconds from each request, you can run the following command:

    $ php artisan optimize

This will trim and merge many common classes into one file located at storage/framework/compiled.php. The optimize command is something you could, for example, include in a deployment script. By default, Laravel will not compile your classes if app.debug is set to true. You can override this by adding the --force flag to the command, but bear in mind that this will make your error messages less readable.

Caching routes

Apart from caching class maps to improve the response time of your application, you can also cache the routes of your application. This is something else you can include in your deployment process. The command? Simply enter the following:

    $ php artisan route:cache

The advantage of caching routes is that your application will get a little faster, as its routes will have been pre-compiled instead of being evaluated against the URL on each request. However, as the routing process now refers to a cache file, any new routes added will not be parsed. You will need to re-cache them by running the route:cache command again. Therefore, this is not suitable during development, where routes might be changing frequently.
Generators

Laravel 5 ships with various commands to generate new files of different types. If you run php artisan list, you will find the following entries under the make namespace:

    make:command
    make:console
    make:controller
    make:event
    make:middleware
    make:migration
    make:model
    make:provider
    make:request

These commands create a stub file in the appropriate location in your Laravel application, containing boilerplate code ready for you to get started with. This saves you the keystrokes of creating these files from scratch. All of these commands require a name to be specified, as shown in the following command:

    $ php artisan make:model Cat

This will create an Eloquent model class called Cat at app/Cat.php, as well as a corresponding migration to create a cats table. If you do not need to create a migration when making a model (for example, if the table already exists), then you can pass the --no-migration option as follows:

    $ php artisan make:model Cat --no-migration

A new model class will look like this:

    <?php namespace App;

    use Illuminate\Database\Eloquent\Model;

    class Cat extends Model {
        //
    }

From here, you can define your own properties and methods. The other commands may have options. The best way to check is to append --help after the command name, as shown in the following command:

    $ php artisan make:command --help

You will see that this command has --handler and --queued options to modify the class stub that is created.

Rolling out your own Artisan commands

At this stage, you might be thinking about writing your own bespoke commands. As you will see, this is surprisingly easy to do with Artisan. If you have used Symfony's Console component, you will be pleased to know that an Artisan command is simply an extension of it with a slightly more expressive syntax. This means that the various helpers that prompt for input, show a progress bar, or format a table are all available from within Artisan. The command that we are going to write depends on the application we built.
It will allow you to export all cat records present in the database as a CSV with or without a header line. If no output file is specified, the command will simply dump all records onto the screen in a formatted table.

Creating the command

There are only two required steps to create a command. Firstly, you need to create the command itself, and then you need to register it manually. To create the console command, we can use the make:console command we saw previously:

    $ php artisan make:console ExportCatsCommand

This will generate a class inside app/Console/Commands. We will then need to register this command with the console kernel, located at app/Console/Kernel.php:

    protected $commands = [
        'App\Console\Commands\ExportCatsCommand',
    ];

If you now run php artisan, you should see a new command called command:name. This command does not do anything yet. However, before we start writing the functionality, let's briefly look at how it works internally.

The anatomy of a command

Inside the newly created command class, you will find some code that has been generated for you. We will walk through the different properties and methods and see what their purpose is. The first two properties are the name and description of the command. Nothing exciting here; this is only the information that will be shown in the command line when you run Artisan. The colon is used to namespace the commands, as shown here:

    protected $name = 'export:cats';

    protected $description = 'Export all cats';

Then you will find the fire method. This is the method that gets called when you run a particular command. From there, you can retrieve the arguments and options passed to the command, or run other methods.
    public function fire()

Lastly, there are two methods that are responsible for defining the list of arguments or options that are passed to the command:

    protected function getArguments() { /* Array of arguments */ }

    protected function getOptions() { /* Array of options */ }

Each argument or option can have a name, a description, and a default value, and can be mandatory or optional. Additionally, options can have a shortcut. To understand the difference between arguments and options, consider the following command, where options are prefixed with two dashes:

    $ command --option_one=value --option_two -v=1 argument_one argument_two

In this example, option_two does not have a value; it is only used as a flag. The -v flag only has one dash since it is a shortcut. In your console commands, you'll need to verify any option and argument values the user provides (for example, if you're expecting a number, ensure that the value passed is actually numeric). Arguments can be retrieved with $this->argument($arg), and options, as you might guess, with $this->option($opt). If these methods do not receive any parameters, they simply return the full list of parameters. You refer to arguments and options via their names, that is, $this->argument('argument_name');.

Writing the command

We are going to start by writing a method that retrieves all cats from the database and returns them as an array:

    protected function getCatsData()
    {
        // The leading backslash refers to the model from the global namespace.
        $cats = \App\Cat::with('breed')->get();

        // Start from an empty array so an empty table still returns an array.
        $output = [];

        foreach ($cats as $cat) {
            $output[] = [
                $cat->name,
                $cat->date_of_birth,
                $cat->breed->name,
            ];
        }

        return $output;
    }

There should not be anything new here. We could have used the toArray() method, which turns an Eloquent collection into an array, but we would have had to flatten the array and exclude certain fields.
Then we need to define what arguments and options our command expects:

    protected function getArguments()
    {
        return [
            ['file', InputArgument::OPTIONAL, 'The output file', null],
        ];
    }

To specify additional arguments, just add an additional element to the array with the same parameters:

    return [
        ['arg_one', InputArgument::OPTIONAL, 'Argument 1', null],
        ['arg_two', InputArgument::OPTIONAL, 'Argument 2', null],
    ];

The options are defined in a similar way:

    protected function getOptions()
    {
        return [
            ['headers', 'h', InputOption::VALUE_NONE, 'Display headers?', null],
        ];
    }

The last parameter is the default value that the argument or option should have if it is not specified. In both cases, we want it to be null. Lastly, we write the logic for the fire method:

    public function fire()
    {
        $output_path = $this->argument('file');

        $headers = ['Name', 'Date of Birth', 'Breed'];
        $rows = $this->getCatsData();

        if ($output_path) {
            // A file was given: write the rows as CSV.
            $handle = fopen($output_path, 'w');

            if ($this->option('headers')) {
                fputcsv($handle, $headers);
            }

            foreach ($rows as $row) {
                fputcsv($handle, $row);
            }

            fclose($handle);
        } else {
            // No file given: render a formatted table on screen.
            $table = $this->getHelperSet()->get('table');
            $table->setHeaders($headers)->setRows($rows);
            $table->render($this->getOutput());
        }
    }

While the bulk of this method is relatively straightforward, there are a few novelties. The first one is the $this->info() method, which writes an informative message to the output; it is not called in the listing above, but it is a natural way to confirm that a file has been written. If you need to show an error message in a different color, you can use the $this->error() method. Further down in the code, you will see some functions that are used to generate a table. As we mentioned previously, an Artisan command extends the Symfony console component and, therefore, inherits all of its helpers. These can be accessed with $this->getHelperSet(). Then it is only a matter of passing arrays for the header and rows of the table, and calling the render method.
To see the output of our command, we will run the following commands:

    $ php artisan export:cats
    $ php artisan export:cats --headers file.csv

Scheduling commands

Traditionally, if you wanted a command to run periodically (hourly, daily, weekly, and so on), then you would have to set up a Cron job in Linux-based environments, or a scheduled task in Windows environments. However, this comes with drawbacks. It requires the user to have server access and familiarity with creating such schedules. Also, in cloud-based environments, the application may not be hosted on a single machine, or the user might not have the privileges to create Cron jobs. The creators of Laravel saw this as something that could be improved, and have come up with an expressive way of scheduling Artisan tasks.

Your schedule is defined in app/Console/Kernel.php, which has the added advantage of keeping the schedule in source control. If you open the Kernel class file, you will see a method named schedule. Laravel ships with one by default that serves as an example:

    $schedule->command('inspire')->hourly();

If you've set up a Cron job in the past, you will see that this is instantly more readable than the crontab equivalent:

    0 * * * * /path/to/artisan inspire

Specifying the task in code also means we can easily change the console command to be run without having to update the crontab entry.

By default, scheduled commands will not run. To make them run, you need a single Cron job that runs the scheduler every minute:

    * * * * * php /path/to/artisan schedule:run 1>> /dev/null 2>&1

When the scheduler is run, it checks for any jobs whose schedules match and then runs them. If no schedules match, then no commands are run in that pass.
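The per-minute pass described above can be pictured with a small, language-neutral sketch. The Python below is not Laravel's implementation; it is a minimal illustration, under stated assumptions, of how a scheduler that is woken every minute might decide which commands are due. The function names (field_matches, is_due, due_commands) and the example schedule are hypothetical, and only a small subset of cron syntax is handled.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Return True if a single cron field accepts the given value.

    Only '*', '*/n' and comma-separated integers are supported here;
    real cron syntax (ranges, names) is richer.
    """
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value in {int(part) for part in field.split(",")}


def is_due(expression: str, now: datetime) -> bool:
    """Check a five-field cron expression against the current minute."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0, while datetime.weekday() counts Monday as 0
    cron_weekday = (now.weekday() + 1) % 7
    # Simplification: all five fields must match (real cron treats the
    # day-of-month / day-of-week pair specially when both are restricted)
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(day, now.day)
        and field_matches(month, now.month)
        and field_matches(weekday, cron_weekday)
    )


def due_commands(schedule: dict, now: datetime) -> list:
    """Return the names of all commands whose expression matches 'now'."""
    return [name for name, expr in schedule.items() if is_due(expr, now)]


# Hypothetical schedule: command name -> cron expression
EXAMPLE_SCHEDULE = {
    "inspire": "0 * * * *",   # top of every hour
    "foo": "*/5 * * * *",     # every five minutes
    "bar": "0 21 * * 1",      # Mondays at 21:00
}
```

Calling due_commands(EXAMPLE_SCHEDULE, datetime.now()) once a minute would mirror the idea of a single Cron entry driving many schedules: most passes return an empty list, and nothing runs.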
You are free to schedule as many commands as you wish, and there are various methods to schedule them that are expressive and descriptive:

    $schedule->command('foo')->everyFiveMinutes();
    $schedule->command('bar')->everyTenMinutes();
    $schedule->command('baz')->everyThirtyMinutes();
    $schedule->command('qux')->daily();

You can also specify a time for a scheduled command to run:

    $schedule->command('foo')->dailyAt('21:00');

Alternatively, you can create less frequent scheduled commands:

    $schedule->command('foo')->weekly();
    $schedule->command('bar')->weeklyOn(1, '21:00');

The first parameter in the second example is the day, with 0 representing Sunday, and 1 through 6 representing Monday through Saturday, and the second parameter is the time, again specified in 24-hour format. You can also explicitly specify the day on which to run a scheduled command:

    $schedule->command('foo')->mondays();
    $schedule->command('foo')->tuesdays();
    $schedule->command('foo')->wednesdays();
    // And so on
    $schedule->command('foo')->weekdays();

If you have a potentially long-running command, then you can prevent it from overlapping:

    $schedule->command('foo')->everyFiveMinutes()
             ->withoutOverlapping();

Along with the schedule, you can also specify the environment under which a scheduled command should run, as shown in the following command:

    $schedule->command('foo')->weekly()->environments('production');

You could use this to run commands in a production environment, for example, archiving data or running a report periodically.

By default, scheduled commands won't execute if the maintenance mode is enabled. This behavior can be easily overridden:

    $schedule->command('foo')->weekly()->evenInMaintenanceMode();

Viewing the output of scheduled commands

For some scheduled commands, you probably want to view the output somehow, whether that is via e-mail, logged to a file on disk, or sending a callback to a pre-defined URL. All of these scenarios are possible in Laravel.
To send the output of a job via e-mail, use the following command:

    $schedule->command('foo')->weekly()
             ->emailOutputTo('someone@example.com');

If you wish to write the output of a job to a file on disk, that is easy enough too:

    $schedule->command('foo')->weekly()->sendOutputTo($filepath);

You can also ping a URL after a job is run:

    $schedule->command('foo')->weekly()->thenPing($url);

This will execute a GET request to the specified URL, at which point you could send a message to your favorite chat client to notify you that the command has run.

Finally, you can chain the preceding methods to send multiple notifications:

    $schedule->command('foo')->weekly()
             ->sendOutputTo($filepath)
             ->emailOutputTo('someone@example.com');

However, note that you have to send the output to a file before it can be e-mailed if you wish to do both.

Summary

In this article, you have learned the different ways in which Artisan can assist you in the development, debugging, and deployment process. We have also seen how easy it is to build a custom Artisan command and adapt it to your own needs. If you are relatively new to the command line, you will have had a glimpse into the power of command-line utilities. If, on the other hand, you are a seasoned user of the command line and you have written scripts with other programming languages, you can surely appreciate the simplicity and expressiveness of Artisan.

Resources for Article: Further resources on this subject: Your First Application [article] Creating and Using Composer Packages [article] Eloquent relationships [article]

Packt
06 May 2015
6 min read

Getting Started with WebSockets

In this article by Varun Chopra, author of the book WebSocket Essentials – Building Apps with HTML5 WebSockets, we will look at why we need WebSockets and why they matter, followed by when to use them and how they actually work.

Client-server communication is one of the most important parts of any web application. Data communication between the server and client has to be smooth and fast so that the user can have an excellent experience. If we look into the traditional methods of server communication, we will find that those methods were limited and were not really the best solutions. These methods have been used for a long time, and their limitations made HTML a second choice for data communication. (For more resources related to this topic, see here.)

Why WebSockets

The answer to why we need WebSockets lies in the question: what are the problems with the other methods of communication? Some of the methods used for server communication are request/response, polling, and long-polling, which are explained as follows:

Request/Response: This is a commonly used mechanism in which the client sends a request to the server and gets a response. This process is driven by some interaction, like the click of a button on the webpage to refresh the whole page. When AJAX came into the picture, it made webpages dynamic and helped in loading some part of the webpage without loading the whole page.

Polling: There are scenarios where we need the data to be reflected without user interaction, such as the score of a football match. In polling, the client fetches the data at a fixed interval and keeps hitting the server, regardless of whether the data has changed or not. This causes unnecessary calls to the server, opening a connection and then closing it every time.

Long-polling: This is basically a connection kept open for a particular time period.
This is one of the ways of achieving real-time communication, but it works only when you know the time interval.

The problems with these methods lead to the solution, which is WebSockets. It addresses the problems faced with the old methods.

Importance of WebSockets

WebSockets comes into the picture to save us from the old, heavy methods of server communication. WebSockets solved one of the biggest problems of server communication by providing a full-duplex, two-way communication bridge. It gives both the server and the client the ability to send data at any point of time, which was not provided by any of the old methods. This has not only improved performance but also reduced the latency of data. It creates a lightweight connection that we can keep open for a long time without sacrificing performance. It also gives us full control to open and close the connection at any point of time.

WebSockets comes as a part of the HTML5 standard, so we do not need to worry about adding an extra plugin to make it work. The WebSocket API is available to JavaScript in the browser. Almost all modern browsers now support WebSockets; this can be checked using the website http://caniuse.com/#feat=websockets which gives the following screenshot:

WebSockets need to be implemented on both the client and server side. On the client side, the API is a part of HTML5. But on the server side, we need to use a library that implements WebSockets. Libraries that implement WebSockets are now available for almost all server platforms. Node.js, which is a modern JavaScript-based platform, also supports WebSocket server implementations through different packages, which makes it really easy for developers to write both server and client-side code without learning another language.

When to use

WebSockets, being a very powerful way of communicating between the client and server, is really useful for applications that need a lot of server interaction.
As WebSockets gives us the benefit of real-time communication, applications that require real-time data transfer, like chat applications, can leverage WebSockets. It is not only used for real-time communication but also for scenarios where we need only the server to push the data to the client. The decision to use WebSockets can be made when we know the exact purpose of its usage. We should not use WebSockets when we just have to create a website with static pages and hardly any interaction. We should use WebSockets where a lot of data passes between the client and server. There are many applications, like stock applications, where the data keeps updating in real time. Collaborative applications need real-time data sharing, such as a game of chess or a Ping-Pong game. WebSockets is widely used in real-time gaming web applications.

How it works

WebSockets communicates over TCP. The connection is established over HTTP and is basically a handshake mechanism between the client and server. After the handshake, the same TCP connection is switched from HTTP to the WebSocket protocol. Let's see how it works through this flow diagram:

The first step is the HTTP call that is initiated from the client side; the header of the HTTP call looks like this:

    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13
    Origin: http://example.com

Here, Host is the name of the server that we are hitting.

Upgrade shows that it is an upgrade call for, in this case, WebSockets. Connection defines that it is an upgrade call.

Sec-WebSocket-Key is a randomly generated key that is later used to verify the server's response. It is the key that ties the response to this handshake.

Origin is another important parameter, which shows where the call originated from; on the server side, it is used to check the requester's authenticity.
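To make the role of Sec-WebSocket-Key more concrete, here is a minimal sketch in Python of how a client could generate the key and how the matching Sec-WebSocket-Accept value in the server's reply is derived. The derivation (SHA-1 over the key plus a fixed GUID, then Base64) comes from RFC 6455 rather than from this article; the function names make_client_key, accept_for, and looks_like_upgrade are hypothetical, and the header check is deliberately simplified.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for the opening handshake
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_client_key() -> str:
    """Generate a Sec-WebSocket-Key: 16 random bytes, Base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def accept_for(client_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value expected for a given key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def looks_like_upgrade(headers: dict) -> bool:
    """Simplified server-side check of the request headers described above.

    'headers' is assumed to be a dict of header names to values; a real
    server would also handle case-insensitive names and check Origin.
    """
    connection_tokens = [t.strip().lower() for t in headers.get("Connection", "").split(",")]
    return (
        headers.get("Upgrade", "").lower() == "websocket"
        and "upgrade" in connection_tokens
        and headers.get("Sec-WebSocket-Version") == "13"
        and "Sec-WebSocket-Key" in headers
    )
```

A client would send make_client_key() in its request, then compare the Sec-WebSocket-Accept header of the reply with accept_for(key); a mismatch means the reply does not belong to this handshake and the connection should be dropped.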
Once the server checks the authenticity, a response is sent back, which looks like this:

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
    Sec-WebSocket-Protocol: chat

Here, Sec-WebSocket-Accept carries a value derived from the key that the client sent (per RFC 6455, a Base64-encoded SHA-1 hash of the key concatenated with a fixed GUID). The client recomputes this value and compares it, which confirms that the response answers its own request.

So, once the connection is open, the client and server can send data to each other. The data is sent in the form of small frames over the TCP connection. These calls are not HTTP, so they do not appear as ordinary HTTP requests under the Network tab of a browser's Developer Tools.

Summary

We learned why we need WebSockets and what their importance is. Along with that, we also learned when to use WebSockets and how they actually work.

Resources for Article: Further resources on this subject: Let's Chat [article] WebSocket – a Handshake! [article] Understanding WebSockets and Server-sent Events in Detail [article]
Packt
06 May 2015
18 min read

Controlling the Movement of a Robot with Legs

In this article by Richard Grimmett, author of the book Raspberry Pi Robotics Projects - Second Edition, we will add the ability to move the entire project using legs. In this article, you will be introduced to some of the basics of servo motors and to using Raspberry Pi to control the speed and direction of your legged platform. (For more resources related to this topic, see here.) Even though you've learned to make your robot mobile by adding wheels or tracks, these platforms will only work well on smooth, flat surfaces. Often, you'll want your robot to work in environments where the path is not smooth or flat; perhaps, you'll even want your robot to go upstairs or over other barriers. In this article, you'll learn how to attach your board, both mechanically and electrically, to a platform with legs so that your projects can be mobile in many more environments. Robots that can walk! What could be more amazing than this? In this article, we will cover the following topics: Connecting Raspberry Pi to a two-legged mobile platform using a servo motor controller Creating a program in Linux so that you can control the movement of the two-legged mobile platform Making your robot truly mobile by adding voice control Gathering the hardware In this article, you'll need to add a legged platform to make your project mobile. For a legged robot, there are a lot of choices for hardware. Some robots are completely assembled and others require some assembly; you may even choose to buy the components and construct your own custom mobile platform. Also, I'm going to assume that you don't want to do any soldering or mechanical machining yourself, so let's look at several choices of hardware that are available completely assembled or can be assembled using simple tools (a screwdriver and/or pliers). One of the simplest legged mobile platforms is one that has two legs and four servo motors. 
The following is an image of this type of platform: You'll use this legged mobile platform in this article because it is the simplest to program and the least expensive, requiring only four servos. To construct this platform, you must purchase the parts and then assemble them yourself. Find the instructions and parts list at http://www.lynxmotion.com/images/html/build112.htm. Another easy way to get all the mechanical parts (except servos) is by purchasing a biped robot kit with six degrees of freedom (DOFs). This will contain the parts needed to construct a six-servo biped, but you can use a subset of the parts for your four-servo biped. These six DOF bipeds can be purchased on eBay or at http://www.robotshop.com/2-wheeled-development-platforms-1.html. You'll also need to purchase the servo motors. Servo motors are similar to the DC motors, except that servo motors are designed to move at specific angles based on the control signals that you send. For this type of robot, you can use standard-sized servos. I like Hitec HS-311 for this robot. They are inexpensive but powerful enough for the operations you'll use for this robot. You can get them on Amazon or eBay. The following is an image of an HS-311 servo: I personally like the 5-V cell phone rechargeable batteries that are available at almost any place that supplies cell phones. Choose one that comes with two USB connectors; you can use the second port to power your servo controller. The mobile power supply shown in the following image mounts well on the biped hardware platform: You'll also need a USB cable to connect your battery to Raspberry Pi. You should already have one of these. Now that you have the mechanical parts for your legged mobile platform, you'll need some hardware that will turn the control signals from your Raspberry Pi into voltage levels that can control the servo motors. Servo motors are controlled using a signal called PWM. 
For a good overview of this type of control, see http://pcbheaven.com/wikipages/How_RC_Servos_Works/ or https://www.ghielectronics.com/docs/18/pwm. Although the Raspberry Pi's GPIO pins do support some limited square-wave pulse width modulation (SW PWM) signals, unfortunately these signals are not stable enough to accurately control servos. In order to control servos reliably, you should purchase a servo controller that can talk over USB and control the servo motors. These controllers protect your board and make controlling many servos easy.

My personal favorite for this application is a simple USB servo motor controller from Pololu that can control six servo motors: the Micro Maestro 6-Channel USB Servo Controller (assembled). This is available at www.pololu.com. The following is an image of the unit:

Make sure you order the assembled version. This piece of hardware will turn USB commands into voltage levels that control your servo motors. Pololu makes a number of different versions of this controller, each able to control a certain number of servos. Once you've chosen your legged platform, simply count the number of servos you need to control and choose a controller that can control that many servos. In this article, you will use a two-legged, four-servo robot, so the six-servo version is sufficient. Since you are going to connect this controller to Raspberry Pi through USB, you'll also need a USB A to mini-B cable.

You'll also need a power cable running from the battery to your servo controller. You'll want to purchase a USB to FTDI cable adapter that has female connectors, for example, the PL2303HX USB to TTL to UART RS232 COM cable available at www.amazon.com. The TTL to UART RS232 conversion isn't particularly important here; what matters is that the cable provides individual connectors for each of the four wires in a USB cable.
The following is an image of the cable: Now that you have all the hardware, let's walk through a quick tutorial of how a two-legged system with servos works and then some step-by-step instructions to make your project walk. Connecting Raspberry Pi to the mobile platform using a servo controller Now that you have a legged platform and a servo motor controller, you are ready to make your project walk! Before you begin, you'll need some background on servo motors. Servo motors are somewhat similar to DC motors. However, there is an important difference; while DC motors are generally designed to move in a continuous way, rotating 360 degrees at a given speed, servo motors are generally designed to move at angles within a limited set. In other words, in the DC motor world, you generally want your motors to spin at a continuous rotation speed that you control. In the servo world, you want to limit the movement of your motor to a specific position. For more information on how servos work, visit http://www.seattlerobotics.org/guide/servos.html or http://www.societyofrobots.com/actuators_servos.shtml. Connecting the hardware To make your project walk, you first need to connect the servo motor controller to the servos. There are two connections you need to make, the first is to the servo motors, and the second is to the battery. In this section, before connecting your controller to your Raspberry Pi, you'll first connect your servo controller to your PC or Linux machine to check whether or not everything is working. The steps for doing so are as follows: Connect the servos to the controller. 
The following is an image of your two-legged robot and the four different servo connections:

In order to be consistent, let's connect your four servos to the connections marked from 0 to 3 on the controller by using the following configuration:

     0: Left foot
     1: Left hip
     2: Right foot
     3: Right hip

The following is an image of the back of the controller; it will show you where to connect your servos:

Connect these servos to the servo motor controller as follows:

     The left foot to 0 (the top connector) and the black cable to the outside (-)
     The left hip to connector 1 and the black cable out
     The right foot to connector 2 and the black cable out
     The right hip to connector 3 and the black cable out

See the following image indicating how to connect servos to the controller:

Now, you need to connect the servo motor controller to your battery. You'll use the USB to FTDI UART cable; plug the red and black cables into the power connector on the servo controller, as shown in the following image:

Now, plug the other end of the USB cable into one of the battery outputs.

Configuring the software

Now, you can connect the motor controller to your PC or Linux machine to see whether or not you can talk to it. Once the hardware is connected, you will use some of the software provided by Pololu to control the servos. The steps to do so are as follows:

Download the Pololu software from http://www.pololu.com/docs/0J40/3.a and install it using the instructions on the website. Once it is installed, run the software; you should see the window shown in the following screenshot:

You will first need to change the Serial mode configuration in Serial Settings, so select the Serial Settings tab; you should see the window shown in the following screenshot:

Make sure that USB Chained is selected; this will allow you to connect to and control the motor controller over USB.
Now, go back to the main screen by selecting the Status tab; you can now turn on the four servos. The screen should look as shown in the following screenshot:

Now, you can use the sliders to control the servos. Enable the four servos and make sure that servo 0 moves the left foot; 1, the left hip; 2, the right foot; and 3, the right hip.

You've checked the motor controller and the servos, and you'll now connect the motor controller to Raspberry Pi to control the servos from there. Remove the USB cable from the PC and connect it to Raspberry Pi. The entire system will look as shown in the following image:

Let's now talk to the motor controller from your Raspberry Pi by downloading the Linux code from Pololu at http://www.pololu.com/docs/0J40/3.b. Perhaps the best way to do this is by logging on to Raspberry Pi using vncserver and opening a VNC Viewer window on your PC. To do this, log in to your Raspberry Pi by using PuTTY, and then type vncserver at the prompt to make sure vncserver is running. Then, perform the following steps:

On your PC, open the VNC Viewer application, enter your IP address, and then click on Connect. Then, enter the password that you created for the vncserver; you should see the Raspberry Pi viewer screen, which should look as shown in the following screenshot:

Open a browser window and go to http://www.pololu.com/docs/0J40/3.b. Click on the Maestro Servo Controller Linux Software link. You will need to download the maestro_linux_100507.tar.gz file to the Download folder. You can also use wget to get this software by typing wget http://www.pololu.com/file/download/maestro-linux-100507.tar.gz?file_id=0J315 in a terminal window.

Go to your Download folder, move the file to your home folder by typing mv maestro_linux_100507.tar.gz .., and then go back to your home folder.

Unpack the file by typing tar -xzvf maestro_linux_100507.tar.gz (use the exact name of the file you downloaded). This will create a folder called maestro_linux. Go to this folder by typing cd maestro_linux and then type ls.
You should see the output as shown in the following screenshot:

The document README.txt will give you explicit instructions on how to install the software. Unfortunately, you can't run Maestro Control Center on your Raspberry Pi. The standard version of Maestro Control Center doesn't support the Raspberry Pi graphical system, but you can control your servos by using the UscCmd command-line application. First, type ./UscCmd --list; you should see the following screenshot:

The software now recognizes that you have a servo controller. If you just type ./UscCmd, you can see all the commands you could send to your controller. When you run this command, you can see the result as shown in the following screenshot:

Notice that you can send a servo a specific target angle, although if the target angle is not within range, it makes it a bit difficult to know where you are sending your servo. Try typing ./UscCmd --servo 0, 10. The servo will most likely move to its full angle position. Type ./UscCmd --servo 0, 0 and it will stop the servo from trying to move. In the next section, you'll write some software that will translate your angles to the electronic signals that will move the servos.

If you haven't run Maestro Control Center and set the Serial Settings mode to USB Chained, your motor controller may not respond.

Creating a program in Linux to control the mobile platform

Now that you can control your servos by using a basic command-line program, let's control them by programming some movement in Python. In this section, you'll create a Python program that will let you talk to your servos a bit more intuitively. You'll issue commands that tell a servo to go to a specific angle and it will go to that angle. You can then add a set of such commands to allow your legged mobile robot to lean left or right and even take a step forward.
Let's start with a simple program that will make your legged mobile robot's servos turn to 90 degrees; this should be somewhere close to the middle of the 180-degree range you can work within. However, the center, maximum, and minimum values can vary from one servo to another, so you may need to calibrate them. To keep things simple, we will not cover that here. The following screenshot shows the code required for turning the servos:

The following is an explanation of the code:

The #!/usr/bin/python line allows you to make this Python file available for execution from the command line. It will allow you to call this program from your voice command program. We'll talk about this in the next section.

The import serial and import time lines include the serial and time libraries. You need the serial library to talk to your unit via USB. If you have not installed this library, type sudo apt-get install python-serial. You will use the time library later to wait between servo commands.

The PololuMicroMaestro class holds the methods that will allow you to communicate with your motor controller.

The __init__ method opens the USB port associated with your servo motor controller.

The setAngle method converts your desired settings for the servo and angle to the serial command that the servo motor controller needs. The values, such as minTarget and maxTarget, and the structure of the communications (channelByte, commandByte, lowTargetByte, and highTargetByte) come from the manufacturer.

The close method closes the serial port.

Now that you have the class, the __main__ statement of the program instantiates an instance of your servo motor controller class so that you can call it. Now, you can set each servo to the desired position. The default would be to set each servo to 90 degrees. However, the servos weren't exactly centered, so I found that I needed to set the angle of each servo so that my robot has both feet on the ground and both hips centered.
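Since the original listing is only available as a screenshot, the following is a minimal sketch of what such a class might look like, not the author's code. It assumes the Maestro compact protocol's Set Target command (byte 0x84, then the channel, then the low and high 7 bits of a target expressed in quarter-microseconds), a linear mapping of 0 to 180 degrees onto 1000 to 2000 microsecond pulses, and a controller that appears as /dev/ttyACM0; all three are assumptions that you should check against Pololu's documentation and your own servos. The helper names set_target_command and open_controller are hypothetical.

```python
SET_TARGET = 0x84  # compact-protocol "Set Target" command byte (assumed from Pololu docs)

# Assumed calibration: 0 degrees -> 1000 us pulse, 180 degrees -> 2000 us pulse.
# Real servos differ, so adjust these after calibrating yours.
MIN_PULSE_US = 1000.0
MAX_PULSE_US = 2000.0


def set_target_command(channel: int, angle: float) -> bytes:
    """Build the 4-byte command that moves one channel to an angle."""
    if not 0.0 <= angle <= 180.0:
        raise ValueError("angle must be between 0 and 180 degrees")
    pulse_us = MIN_PULSE_US + (angle / 180.0) * (MAX_PULSE_US - MIN_PULSE_US)
    target = int(round(pulse_us * 4))       # controller units: quarter-microseconds
    low_target_byte = target & 0x7F          # lower 7 bits
    high_target_byte = (target >> 7) & 0x7F  # next 7 bits
    return bytes([SET_TARGET, channel, low_target_byte, high_target_byte])


class PololuMicroMaestro:
    """Thin wrapper around an already opened serial-like port."""

    def __init__(self, port):
        # 'port' is any object with write(), flush() and close(),
        # for example a serial.Serial instance
        self.port = port

    def setAngle(self, channel, angle):
        self.port.write(set_target_command(channel, angle))
        self.port.flush()

    def close(self):
        self.port.close()


def open_controller(device="/dev/ttyACM0"):
    """Open the controller over USB; the device path is an assumption."""
    import serial  # pyserial, imported here so the rest runs without hardware
    return PololuMicroMaestro(serial.Serial(device))
```

With hardware attached, robot = open_controller() followed by robot.setAngle(0, 90) for each of channels 0 to 3 would send every servo toward the assumed center; passing the port in separately keeps the byte-building logic testable without a controller plugged in.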
Once you have the basic home position set, you can ask your robot to do different things; the following screenshot shows some examples in simple Python code: In this case, you are using your setAngle command to set your servos to manipulate your robot. This set of commands first sets your robot to the home position. Then, you can use the feet to lean to the right and then to the left and then you can use a combination of commands to make your robot step forward with the left and then the right foot. Once you have the program working, you'll want to package all your hardware onto the mobile robot. By following these principles, you can make your robot do many amazing things, such as walk forward and backward, dance, and turn around—any number of movements are possible. The best way to learn these movements is to try positioning the servos in new and different ways. Making your mobile platform truly mobile by issuing voice commands Now that your robot can move, wouldn't it be neat to have it obey your commands? You should now have a mobile platform that you can program to move in any number of ways. Unfortunately, you still have your LAN cable connected, so the platform isn't completely mobile. Once you have started executing the program, you can't alter its behavior. In this section, you will use the principles to issue voice commands to initiate movement. You'll need to modify your voice recognition program so that it will run your Python program when it gets a voice command. You are going to make a simple modification to the continuous.c program in /home/pi/pocketsphinx-0.8/src/. To do this, type cd /home/pi/pocketsphinx-0.8/src/programs and then type emacs continuous.c. The changes will appear in the same section as your other voice commands and will look as shown in the following screenshot: The additions are pretty straightforward. 
Let's walk through them:

else if (strcmp(hyp, "FORWARD") == 0): This checks the input word as recognized by your voice command program. If it corresponds with the word FORWARD, you will execute everything within the else if block. You use { and } to tell the system which commands go with this else if clause.

system("espeak \"moving robot\""): This executes Espeak, which should tell you that you are about to run your robot program. The inner quotes are escaped so that they stay inside the C string.

system("/home/pi/maestro_linux/robot.py"): This indicates the name of the program you will execute. In this case, your mobile platform will do whatever the robot.py program tells it to.

After doing this, you will need to recompile the program, so type make and the pocketsphinx_continuous executable will be created. Run the program by typing ./pocketsphinx_continuous. Disconnect the LAN cable and the mobile platform will now take the forward voice command and execute your program. You should now have a complete mobile platform! When you execute your program, the mobile platform can now move around based on what you have programmed it to do.

You can use command-line arguments to make your robot do many different actions. Perhaps one voice command can move your robot forward, a different one can move it backwards, and another can turn it right or left.

Congratulations! Your robot should now be able to move around in any way you program it to move. You can even have the robot dance. You have now built a two-legged robot and you can easily expand on this knowledge to create robots with even more legs. The following is an image of the mechanical structure of a four-legged robot that has eight DOFs and is fairly easy to create by using many of the parts that you have used to create your two-legged robot; this is my personal favorite because it doesn't fall over and break the electronics:

You'll need eight servos and lots of batteries.
If you search eBay, you can often find kits for sale for four-legged robots with 12 DOFs, but remember that the battery will need to be much bigger. For this application, you can use an RC (which stands for remote control) battery. RC batteries are nice as they are rechargeable and can provide lots of power, but make sure you either purchase one that is 5 V to 6 V or include a way to regulate the voltage. The following is an image of such a battery, available at most hobby stores: If you use this type of battery, don't forget its charger. The hobby store can help with choosing an appropriate match. Summary Now, you have the ability to build not only wheeled robots but also robots with legs. It is also easy to expand this ability to robots with arms; controlling the servos for an arm is the same as controlling them for legs. Resources for Article: Further resources on this subject: Penetration Testing [article] Testing Your Speed [article] Making the Unit Very Mobile – Controlling the Movement of a Robot with Legs [article]

Packt
06 May 2015
23 min read

Introducing PostgreSQL 9

In this article by Simon Riggs, Gianni Ciolli, Hannu Krosing, Gabriele Bartolini, the authors of PostgreSQL 9 Administration Cookbook - Second Edition, we will introduce PostgreSQL 9. PostgreSQL is a feature-rich, general-purpose database management system. It's a complex piece of software, but every journey begins with the first step. (For more resources related to this topic, see here.) We'll start with your first connection. Many people fall at the first hurdle, so we'll try not to skip that too swiftly. We'll quickly move on to enabling remote users, and from there, we will move to access through GUI administration tools. We will also introduce the psql query tool. PostgreSQL is an advanced SQL database server, available on a wide range of platforms. One of the clearest benefits of PostgreSQL is that it is open source, meaning that you have a very permissive license to install, use, and distribute PostgreSQL without paying anyone fees or royalties. On top of that, PostgreSQL is well-known as a database that stays up for long periods and requires little or no maintenance in most cases. Overall, PostgreSQL provides a very low total cost of ownership. PostgreSQL is also noted for its huge range of advanced features, developed over the course of more than 20 years of continuous development and enhancement. Originally developed by the Database Research Group at the University of California, Berkeley, PostgreSQL is now developed and maintained by a huge army of developers and contributors. Many of those contributors have full-time jobs related to PostgreSQL, working as designers, developers, database administrators, and trainers. Some, but not many, of those contributors work for companies that specialize in support for PostgreSQL, like we (the authors) do. No single company owns PostgreSQL, nor are you required (or even encouraged) to register your usage. 
PostgreSQL has the following main features: Excellent SQL standards compliance up to SQL:2011 Client-server architecture Highly concurrent design where readers and writers don't block each other Highly configurable and extensible for many types of applications Excellent scalability and performance with extensive tuning features Support for many kinds of data models: relational, document (JSON and XML), and key/value What makes PostgreSQL different? The PostgreSQL project focuses on the following objectives: Robust, high-quality software with maintainable, well-commented code Low maintenance administration for both embedded and enterprise use Standards-compliant SQL, interoperability, and compatibility Performance, security, and high availability What surprises many people is that PostgreSQL's feature set is more comparable with Oracle or SQL Server than it is with MySQL. The only connection between MySQL and PostgreSQL is that these two projects are open source; apart from that, the features and philosophies are almost totally different. One of the key features of Oracle, since Oracle 7, has been snapshot isolation, where readers don't block writers and writers don't block readers. You may be surprised to learn that PostgreSQL was the first database to be designed with this feature, and it offers a complete implementation. In PostgreSQL, this feature is called Multiversion Concurrency Control (MVCC). PostgreSQL is a general-purpose database management system. You define the database that you would like to manage with it. PostgreSQL offers you many ways to work. You can use a normalized database model, augmented with features such as arrays and record subtypes, or use a fully dynamic schema with the help of JSONB and an extension named hstore. PostgreSQL also allows you to create your own server-side functions in any of a dozen different languages. PostgreSQL is highly extensible, so you can add your own data types, operators, index types, and functional languages. 
You can even override different parts of the system using plugins to alter the execution of commands or add a new optimizer. All of these features offer a huge range of implementation options to software architects. There are many ways out of trouble when building applications and maintaining them over long periods of time. In the early days, when PostgreSQL was still a research database, the focus was solely on the cool new features. Over the last 15 years, enormous amounts of code have been rewritten and improved, giving us one of the most stable and largest software servers available for operational use. You may have read that PostgreSQL was, or is, slower than My Favorite DBMS, whichever that is. It's been a personal mission of mine over the last ten years to improve server performance, and the team has been successful in making the server highly performant and very scalable. That gives PostgreSQL enormous headroom for growth. Who is using PostgreSQL? Prominent users include Apple, BASF, Genentech, Heroku, IMDB.com, Skype, McAfee, NTT, The UK Met Office, and The U. S. National Weather Service. 5 years ago, PostgreSQL received well in excess of 1 million downloads per year, according to data submitted to the European Commission, which concluded, "PostgreSQL is considered by many database users to be a credible alternative." We need to mention one last thing. When PostgreSQL was first developed, it was named Postgres, and therefore many aspects of the project still refer to the word "postgres"; for example, the default database is named postgres, and the software is frequently installed using the postgres user ID. As a result, people shorten the name PostgreSQL to simply Postgres, and in many cases use the two names interchangeably. PostgreSQL is pronounced as "post-grez-q-l". Postgres is pronounced as "post-grez." Some people get confused, and refer to "Postgre", which is hard to say, and likely to confuse people. 
Two names are enough, so please don't use a third name! The following sections explain the key areas in more detail. Robustness PostgreSQL is robust, high-quality software, supported by automated testing for both features and concurrency. By default, the database provides strong disk-write guarantees, and the developers take the risk of data loss very seriously in everything they do. Options to trade robustness for performance exist, though they are not enabled by default. All actions on the database are performed within transactions, protected by a transaction log that will perform automatic crash recovery in case of software failure. Databases may be optionally created with data block checksums to help diagnose hardware faults. Multiple backup mechanisms exist, with full and detailed Point-In-Time Recovery, in case of the need for detailed recovery. A variety of diagnostic tools are available. Database replication is supported natively. Synchronous Replication can provide greater than "5 Nines" (99.999 percent) availability and data protection, if properly configured and managed. Security Access to PostgreSQL is controllable via host-based access rules. Authentication is flexible and pluggable, allowing easy integration with any external security architecture. Full SSL-encrypted access is supported natively. A full-featured cryptographic function library is available for database users. PostgreSQL provides role-based access privileges to access data, by command type. Functions may execute with the permissions of the definer, while views may be defined with security barriers to ensure that security is enforced ahead of other processing. All aspects of PostgreSQL are assessed by an active security team, while known exploits are categorized and reported at http://www.postgresql.org/support/security/. Ease of use Clear, full, and accurate documentation exists as a result of a development process where doc changes are required. 
Hundreds of small changes occur with each release that smooth off any rough edges of usage, supplied directly by knowledgeable users. PostgreSQL works in the same way on small or large systems and across operating systems. Client access and drivers exist for every language and environment, so there is no restriction on what type of development environment is chosen now, or in the future. SQL Standard is followed very closely; there is no weird behavior, such as silent truncation of data. Text data is supported via a single data type that allows storage of anything from 1 byte to 1 gigabyte. This storage is optimized in multiple ways, so 1 byte is stored efficiently, and much larger values are automatically managed and compressed. PostgreSQL has a clear policy to minimize the number of configuration parameters, and with each release, we work out ways to auto-tune settings. Extensibility PostgreSQL is designed to be highly extensible. Database extensions can be loaded simply and easily using CREATE EXTENSION, which automates version checks, dependencies, and other aspects of configuration. PostgreSQL supports user-defined data types, operators, indexes, functions and languages. Many extensions are available for PostgreSQL, including the PostGIS extension that provides world-class Geographical Information System (GIS) features. Performance and concurrency PostgreSQL 9.4 can achieve more than 300,000 reads per second on a 32-CPU server, and it benchmarks at more than 20,000 write transactions per second with full durability. PostgreSQL has an advanced optimizer that considers a variety of join types, utilizing user data statistics to guide its choices. PostgreSQL provides MVCC, which enables readers and writers to avoid blocking each other. Taken together, the performance features of PostgreSQL allow a mixed workload of transactional systems and complex search and analytical tasks. 
This is important because it means we don't always need to unload our data from production systems and reload it into analytical data stores just to execute a few ad hoc queries. PostgreSQL's capabilities make it the database of choice for new systems, as well as the right long-term choice in almost every case. Scalability PostgreSQL 9.4 scales well on a single node up to 32 CPUs. PostgreSQL scales well up to hundreds of active sessions, and up to thousands of connected sessions when using a session pool. Further scalability is achieved in each annual release. PostgreSQL provides multinode read scalability using the Hot Standby feature. Multinode write scalability is under active development. The starting point for this is Bi-Directional Replication. SQL and NoSQL PostgreSQL follows the SQL Standard very closely. SQL itself does not force any particular type of model to be used, so PostgreSQL can easily be used for many types of models at the same time, in the same database. PostgreSQL supports the usual SQL language statements. With PostgreSQL acting as a relational database, we can utilize any level of denormalization, from the full Third Normal Form to the more denormalized Star Schema models. PostgreSQL extends the relational model to provide arrays, row types, and range types. A document-centric database is also possible using PostgreSQL's text, XML, and binary JSON (JSONB) data types, supported by indexes optimized for documents and by full text search capabilities. Key/value stores are supported using the hstore extension. Popularity When MySQL was taken over some years back, it was agreed in the EU monopoly investigation that followed that PostgreSQL was a viable competitor. That has certainly been true, with the PostgreSQL user base expanding consistently for more than a decade. Various polls have indicated that PostgreSQL is the favorite database for building new, enterprise-class applications. 
The PostgreSQL feature set attracts serious users who have serious applications. Financial services companies may be PostgreSQL's largest user group, though governments, telecommunication companies, and many other segments are strong users as well. This popularity extends across the world; Japan, Ecuador, Argentina, and Russia have very large user groups, and so do USA, Europe, and Australasia. Amazon Web Services' chief technology officer Dr. Werner Vogels described PostgreSQL as "an amazing database", going on to say that "PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications". Commercial support Many people have commented that strong commercial support is what enterprises need before they can invest in open source technology. Strong support is available worldwide from a number of companies. 2ndQuadrant provides commercial support for open source PostgreSQL, offering 24 x 7 support in English and Spanish with bug-fix resolution times. EnterpriseDB provides commercial support for PostgreSQL as well as their main product, which is a variant of Postgres that offers enhanced Oracle compatibility. Many other companies provide strong and knowledgeable support to specific geographic regions, vertical markets, and specialized technology stacks. PostgreSQL is also available as hosted or cloud solutions from a variety of companies, since it runs very well in cloud environments. A full list of companies is kept up to date at http://www.postgresql.org/support/professional_support/. Research and development funding PostgreSQL was originally developed as a research project at the University of California, Berkeley in the late 1980s and early 1990s. Further work was carried out by volunteers until the late 1990s. Then, the first professional developer became involved. 
Over time, more and more companies and research groups became involved, supporting many professional contributors. Further funding for research and development was provided by the NSF. The project also received funding from the EU FP7 Programme in the form of the 4CaaST project for cloud computing and the AXLE project for scalable data analytics. AXLE deserves a special mention because it is a 3-year project aimed at enhancing PostgreSQL's business intelligence capabilities, specifically for very large databases. The project covers security, privacy, integration with data mining, and visualization tools and interfaces for new hardware. Further details of it are available at http://www.axleproject.eu. Other funding for PostgreSQL development comes from users who directly sponsor features and companies selling products and services based around PostgreSQL. Monitoring Databases are not isolated entities. They live on computer hardware using CPUs, RAM, and disk subsystems. Users access databases using networks. Depending on the setup, databases themselves may need network resources to function in any of the following ways: performing some authentication checks when users log in, using disks that are mounted over the network (not generally recommended), or making remote function calls to other databases. This means that monitoring only the database is not enough. As a minimum, one should also monitor everything directly involved in using the database. This means knowing the following: Is the database host available? Does it accept connections? How much of the network bandwidth is in use? Have there been network interruptions and dropped connections? Is there enough RAM available for the most common tasks? How much of it is left? Is there enough disk space available? When will it run out of disk space? Is the disk subsystem keeping up? How much more load can it take? Can the CPU keep up with the load? How many spare idle cycles do the CPUs have? 
Are other network services the database access depends on (if any) available? For example, if you use Kerberos for authentication, you need to monitor it as well. How many context switches are happening when the database is running? For most of these things, you are interested in history; that is, how have things evolved? Was everything mostly the same yesterday or last week? When did the disk usage start changing rapidly? For any larger installation, you probably have something already in place to monitor the health of your hosts and network. The two aspects of monitoring are collecting historical data to see how things have evolved and getting alerts when things go seriously wrong. Tools based on Round Robin Database Tool (RRDtool) such as Cacti and Munin are quite popular for collecting the historical information on all aspects of the servers and presenting this information in an easy-to-follow graphical form. Seeing several statistics on the same timescale can really help when trying to figure out why the system is behaving the way it is. Another popular open source solution is Ganglia, a distributed monitoring solution particularly suitable for environments with several servers and in multiple locations. Another aspect of monitoring is getting alerts when something goes really wrong and needs (immediate) attention. For alerting, one of the most widely used tools is Nagios, with its fork (Icinga) being an emerging solution. The aforementioned trending tools can integrate with Nagios. However, if you need a solution for both the alerting and trending aspects of a monitoring tool, you might want to look into Zabbix. Then, of course, there is Simple Network Management Protocol (SNMP), which is supported by a wide array of commercial monitoring solutions. Basic support for monitoring PostgreSQL through SNMP is found in pgsnmpd. This project does not seem very active though. 
However, you can find more information about pgsnmpd and download it from http://pgsnmpd.projects.postgresql.org/. Providing PostgreSQL information to monitoring tools Historical monitoring information is best to use when all of it is available from the same place and at the same timescale. Most monitoring systems are designed for generic purposes, while allowing application and system developers to integrate their specific checks with the monitoring infrastructure. This is possible through a plugin architecture. Adding new kinds of data inputs to them means installing a plugin. Sometimes, you may need to write or develop this plugin, but writing a plugin for something such as Cacti is easy. You just have to write a script that outputs monitored values in simple text format. In most common scenarios, the monitoring system is centralized and data is collected directly (and remotely) by the system itself or through some distributed components that are responsible for sending the observed metrics back to the main node. As far as PostgreSQL is concerned, some useful things to include in graphs are the number of connections, disk usage, number of queries, number of WAL files, most numbers from pg_stat_user_tables and pg_stat_user_indexes, and so on, as shown here: An example of a dashboard in Cacti The preceding Cacti screenshot includes data for CPU, disk, and network usage; pgbouncer connection pooler; and the number of PostgreSQL client connections. As you can see, they are nicely correlated. One Swiss Army knife script, which can be used from both Cacti and Nagios/Icinga, is check_postgres. It is available at http://bucardo.org/wiki/Check_postgres. It has ready-made reporting actions for a large array of things worth monitoring in PostgreSQL. For Munin, there are some PostgreSQL plugins available at the Munin plugin repository at https://github.com/munin-monitoring/contrib/tree/master/plugins/postgresql. 
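To make the "script that outputs monitored values in simple text format" idea more concrete, here is a minimal Python sketch. It assumes the common script-input convention of space-separated name:value pairs on a single line; check your monitoring tool's documentation for the exact format it expects. The function name format_metrics and the sample metric names and values are hypothetical, and collecting the real numbers (for example, by querying the PostgreSQL statistics views) is deliberately left out.

```python
def format_metrics(metrics):
    """Render numeric metrics as space-separated name:value pairs on one line.

    The name:value layout is an assumption about what the monitoring
    tool's script input expects; adjust it to match your tool.
    """
    parts = []
    for name in sorted(metrics):  # sorted so the output order is stable
        value = metrics[name]
        # Reject non-numeric values (and booleans) so graphs never get junk.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"metric {name!r} is not numeric: {value!r}")
        parts.append(f"{name}:{value}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical sample values; a real plugin would query the server here.
    sample = {"connections": 42, "wal_files": 7, "db_size_mb": 1536.5}
    print(format_metrics(sample))  # connections:42 db_size_mb:1536.5 wal_files:7
```

Keeping the formatting separate from the data collection makes it easier to reuse the same collector for more than one monitoring tool.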
The following screenshot shows a Munin graph about PostgreSQL buffer cache hits for a specific database, where cache hits (blue line) dominate reads from the disk (green line): Finding more information about generic monitoring tools Setting up the tools themselves is a larger topic. In fact, each of these tools has more than one book written about them. The basic setup information and the tools themselves can be found at the following URLs: RRDtool: http://www.mrtg.org/rrdtool/ Cacti: http://www.cacti.net/ Ganglia: http://ganglia.sourceforge.net/ Icinga: http://www.icinga.org Munin: http://munin-monitoring.org/ Nagios: http://www.nagios.org/ Zabbix: http://www.zabbix.org/ Real-time viewing using pgAdmin You can also use pgAdmin to get a quick view of what is going on in the database. For better control, you need to install the adminpack extension in the destination database, by issuing this command: CREATE EXTENSION adminpack; This extension is a part of the additionally supplied modules of PostgreSQL (aka contrib). It provides several administration functions that PgAdmin (and other tools) can use in order to manage, control, and monitor a Postgres server from a remote location. Once you have installed adminpack, connect to the database and then go to Tools | Server Status. This will open a window similar to what is shown in the following screenshot, reporting locks and running transactions: Loading data from flat files Loading data into your database is one of the most important tasks. You need to do this accurately and quickly. Here's how. Getting ready You'll need a copy of pgloader, which is available at http://github.com/dimitri/pgloader. At the time of writing this article, the current stable version is 3.1.0. The 3.x series is a major rewrite, with many additional features, and the 2.x series is now considered obsolete. How to do it… PostgreSQL includes a command named COPY that provides the basic data load/unload mechanism. 
The COPY command doesn't do enough when loading data, so let's skip the basic command and go straight to pgloader. To load data, we need to understand our requirements, so let's break this down into a step-by-step process, as follows: Identify the data files and where they are located. Make sure that pgloader is installed at the location of the files. Identify the table into which you are loading, ensure that you have the permissions to load, and check the available space. Work out the file type (fixed, text, or CSV) and check the encoding. Specify the mapping between columns in the file and columns on the table being loaded. Make sure you know which columns in the file are not needed—pgloader allows you to include only the columns you want. Identify any columns in the table for which you don't have data. Do you need them to have a default value on the table, or does pgloader need to generate values for those columns through functions or constants? Specify any transformations that need to take place. The most common issue is date formats, though possibly there may be other issues. Write the pgloader script. pgloader will create a log file to record whether the load has succeeded or failed, and another file to store rejected rows. You need a directory with sufficient disk space if you expect them to be large. Their size is roughly proportional to the number of failing rows. Finally, consider what settings you need for performance options. This is definitely last, as fiddling with things earlier can lead to confusion when you're still making the load work correctly. You must use a script to execute pgloader. This is not a restriction; actually it is more like best practice, because it makes it much easier to iterate towards something that works. Loads never work the first time, except in the movies! 
Let's look at a typical example from pgloader's documentation, the example.load file:

    LOAD CSV
        FROM 'GeoLiteCity-Blocks.csv' WITH ENCODING iso-646-us
            HAVING FIELDS
            (
                startIpNum, endIpNum, locId
            )
        INTO postgresql://user@localhost:54393/dbname?geolite.blocks
            TARGET COLUMNS
            (
                iprange ip4r using (ip-range startIpNum endIpNum),
                locId
            )
        WITH truncate,
            skip header = 2,
            fields optionally enclosed by '"',
            fields escaped by backslash-quote,
            fields terminated by '\t'
        SET work_mem to '32 MB', maintenance_work_mem to '64 MB';

We can use the load script like this:

    pgloader --summary summary.log example.load

How it works…
pgloader copes gracefully with errors. The COPY command loads all rows in a single transaction, so a single error is enough to abort the load. pgloader breaks down an input file into reasonably sized chunks, and loads them piece by piece. If some rows in a chunk cause errors, then pgloader will split it iteratively until it loads all the good rows and skips all the bad rows, which are then saved in a separate "rejects" file for later inspection. This behavior is very convenient if you have large data files with a small percentage of bad rows; for instance, you can edit the rejects, fix them, and finally, load them with another pgloader run. Versions 2.x of pgloader were written in Python and connected to PostgreSQL through the standard Python client interface. Version 3.x is written in Common Lisp. Yes, pgloader is less efficient than loading data files using a COPY command, but running a COPY command has many more restrictions: the file has to be in the right place on the server, has to be in the right format, and must be unlikely to throw errors on loading. pgloader has additional overhead, but it also has the ability to load data using multiple parallel threads, so it can be faster to use as well. 
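The chunk-splitting idea described above can be illustrated with a short Python sketch. This is not pgloader's actual implementation; it only shows the general technique of loading in batches and bisecting a failing batch until the bad rows are isolated. The names load_with_rejects and strict_int_batch, the chunk size, and the use of ValueError to signal a failed batch are all assumptions made for the illustration.

```python
def load_with_rejects(rows, load_batch, chunk_size=4):
    """Load rows chunk by chunk, bisecting failing chunks to isolate bad rows.

    load_batch(chunk) is assumed to be all-or-nothing (like one transaction)
    and to raise ValueError if any row in the chunk is bad.
    Returns (loaded_rows, rejected_rows).
    """
    loaded, rejects = [], []

    def load_chunk(chunk):
        if not chunk:
            return
        try:
            load_batch(chunk)
            loaded.extend(chunk)
        except ValueError:
            if len(chunk) == 1:
                # A single failing row cannot be split further: reject it.
                rejects.append(chunk[0])
            else:
                # Split the failing chunk in half and retry each half.
                mid = len(chunk) // 2
                load_chunk(chunk[:mid])
                load_chunk(chunk[mid:])

    for start in range(0, len(rows), chunk_size):
        load_chunk(rows[start:start + chunk_size])
    return loaded, rejects


def strict_int_batch(chunk):
    """Hypothetical loader: the whole batch fails if any row is not an integer."""
    for row in chunk:
        int(row)  # raises ValueError for non-numeric strings


if __name__ == "__main__":
    good, bad = load_with_rejects(["1", "2", "x", "4", "5", "oops", "7"],
                                  strict_int_batch)
    print(good)  # ['1', '2', '4', '5', '7']
    print(bad)   # ['x', 'oops']
```

With a small percentage of bad rows, most chunks load on the first attempt, and only the few failing chunks pay the cost of repeated splitting.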
pgloader's ability to call out to reformat functions is essential in many cases; straight COPY is just too simple. pgloader also allows loading from fixed-width files, which COPY does not. There's more… If you need to reload the table completely from scratch, then specify the WITH TRUNCATE clause in the pgloader script. There are also options to specify SQL to be executed before and after loading the data. For instance, you may have a script that creates the empty tables before, or you can add constraints after, or both. After loading, if we have load errors, then there will be some junk loaded into the PostgreSQL tables. It is not junk that you can see, or that gives any semantic errors, but think of it more like fragmentation. You should think about whether you need to add a VACUUM command after the data load, though this may make the load take much longer. We need to be careful to avoid loading data twice. The only easy way of doing that is to make sure that there is at least one unique index defined on every table that you load. The load should then fail very quickly. String handling can often be difficult, because of the presence of formatting or nonprintable characters. When the standard_conforming_strings parameter is set to off (the default before PostgreSQL 9.1; from 9.1 onward the default is on), backslashes are assumed to be escape characters. Put another way, with the parameter off, the \n string means line feed, which can cause data to appear truncated. In that case, you'll need to turn standard_conforming_strings on, or specify an escape character in the load-parameter file. If you are reloading data that has been unloaded from PostgreSQL, then you may want to use the pg_restore utility instead. The pg_restore utility has an option to reload data in parallel, -j number_of_threads, though this is only possible if the dump was produced using the custom pg_dump format. 
This can be useful for reloading dumps, though it lacks almost all of the other pgloader features discussed here. If you need to use rows from a read-only text file that does not have errors, and you are using version 9.1 or later of PostgreSQL, then you may consider using the file_fdw contrib module. The short story is that it lets you create a "virtual" table that will parse the text file every time it is scanned. This is different from filling a table once and for all, either with COPY or pgloader; therefore, it covers a different use case. For example, think about an external data source that is maintained by a third party and needs to be shared across different databases. You may wish to send an e-mail to Dimitri Fontaine, the current author and maintainer of most of pgloader. He always loves to receive e-mails from users. Summary PostgreSQL provides a lot of features, which make it the most advanced open source database. Resources for Article: Further resources on this subject: Getting Started with PostgreSQL [article] Installing PostgreSQL [article] PostgreSQL – New Features [article]

Packt
06 May 2015
11 min read

Introduction to Hadoop

In this article by Shiva Achari, author of the book Hadoop Essentials, you'll get an introduction to Hadoop, its uses, and its advantages. (For more resources related to this topic, see here.) Hadoop In big data, the most widely used system is Hadoop. Hadoop is an open source big data framework that is widely accepted in the industry; its benchmarks are impressive and, in some cases, unmatched by other systems. Hadoop is used in the industry for large-scale, massively parallel, and distributed data processing. Hadoop is highly fault tolerant, and the level of fault tolerance is configurable; this directly determines how many times the data is replicated across the cluster. As we have already touched upon big data systems, the architecture revolves around two major components: distributed computing and parallel processing. In Hadoop, the distributed computing is handled by HDFS, and parallel processing is handled by MapReduce. In short, we can say that Hadoop is a combination of HDFS and MapReduce, as shown in the following image: Hadoop history Hadoop began from a project called Nutch, an open source crawler-based search engine that runs on a distributed system. In 2003–2004, Google released the Google MapReduce and GFS papers. MapReduce was adapted for Nutch. Doug Cutting and Mike Cafarella are the creators of Hadoop. When Doug Cutting joined Yahoo, a new project was created along lines similar to Nutch, which we call Hadoop, and Nutch remained a separate sub-project. Then, there were different releases, and other separate sub-projects started integrating with Hadoop, forming what we call the Hadoop ecosystem. 
The following figure and description depict the history, with timelines and milestones achieved in Hadoop:

Description
2002.8: The Nutch Project was started
2003.2: The first MapReduce library was written at Google
2003.10: The Google File System paper was published
2004.12: The Google MapReduce paper was published
2005.7: Doug Cutting reported that Nutch now uses the new MapReduce implementation
2006.2: Hadoop code moved out of Nutch into a new Lucene sub-project
2006.11: The Google Bigtable paper was published
2007.2: The first HBase code was dropped from Mike Cafarella
2007.4: Yahoo! was running Hadoop on a 1000-node cluster
2008.1: Hadoop was made an Apache Top Level Project
2008.7: Hadoop broke the Terabyte data sort Benchmark
2008.11: Hadoop 0.19 was released
2011.12: Hadoop 1.0 was released
2012.10: Hadoop 2.0 was alpha released
2013.10: Hadoop 2.2.0 was released
2014.10: Hadoop 2.6.0 was released

Advantages of Hadoop
Hadoop has a lot of advantages, and some of them are as follows:
Low cost (runs on commodity hardware): Hadoop can run on average-performing commodity hardware and doesn't require a high-performance system, which helps to control cost while still achieving scalability and performance. Adding or removing nodes from the cluster is simple, as and when we require. The cost per terabyte is lower for storage and processing in Hadoop.
Storage flexibility: Hadoop can store data in raw format in a distributed environment. Hadoop can process unstructured and semi-structured data better than most of the available technologies. Hadoop gives full flexibility to process the data, and we will not have any loss of data.
Open source community: Hadoop is open source and supported by many contributors with a growing network of developers worldwide. Many organizations such as Yahoo, Facebook, Hortonworks, and others have contributed immensely toward the progress of Hadoop and other related sub-projects.
Fault tolerant: Hadoop is massively scalable and fault tolerant. 
Hadoop is reliable in terms of data availability, and even if some nodes go down, Hadoop can recover the data. The Hadoop architecture assumes that nodes can go down and that the system should still be able to process the data.
* Complex data analytics: With the emergence of big data, data science has also grown by leaps and bounds, and we have complex, computation-intensive algorithms for data analysis. Hadoop can run such algorithms at scale on very large data sets and can process them faster.

Uses of Hadoop

Some examples of use cases where Hadoop is used are as follows:
* Searching/text mining
* Log processing
* Recommendation systems
* Business intelligence/data warehousing
* Video and image analysis
* Archiving
* Graph creation and analysis
* Pattern recognition
* Risk assessment
* Sentiment analysis

Hadoop ecosystem

A Hadoop cluster can have thousands of nodes, and it is complex and difficult to manage manually; hence, there are some components that assist with the configuration, maintenance, and management of the whole Hadoop system. In this article, we will touch upon the following components:

Layer | Utility/Tool name
Distributed filesystem | Apache HDFS
Distributed programming | Apache MapReduce, Apache Hive, Apache Pig, Apache Spark
NoSQL databases | Apache HBase
Data ingestion | Apache Flume, Apache Sqoop, Apache Storm
Service programming | Apache Zookeeper
Scheduling | Apache Oozie
Machine learning | Apache Mahout
System deployment | Apache Ambari

All the components above are helpful in managing Hadoop tasks and jobs.

Apache Hadoop

The open source Hadoop is maintained by the Apache Software Foundation. The official website for Apache Hadoop is http://hadoop.apache.org/, where the packages and other details are described in depth.
The current Apache Hadoop project (version 2.6) includes the following modules:
* Hadoop Common: The common utilities that support other Hadoop modules
* Hadoop Distributed File System (HDFS): A distributed filesystem that provides high-throughput access to application data
* Hadoop YARN: A framework for job scheduling and cluster resource management
* Hadoop MapReduce: A YARN-based system for parallel processing of large datasets

Apache Hadoop can be deployed in the following three modes:
* Standalone: It is used for simple analysis or debugging.
* Pseudo distributed: It helps you to simulate a multi-node installation on a single node. In pseudo-distributed mode, each of the component processes runs in a separate JVM. Instead of installing Hadoop on different servers, you can simulate it on a single server.
* Distributed: A cluster with multiple worker nodes, ranging from tens to hundreds or thousands of nodes.

In a Hadoop ecosystem, along with Hadoop, there are many utility components that are separate Apache projects such as Hive, Pig, HBase, Sqoop, Flume, Zookeeper, Mahout, and so on, which have to be configured separately. We have to be careful with the compatibility of subprojects with Hadoop versions, as not all versions are inter-compatible. Apache Hadoop is an open source project, which has a lot of benefits: the source code can be updated, and contributors keep adding improvements. One downside of being an open source project is that companies usually offer support for their own products, not for an open source project. Customers prefer support and therefore adopt Hadoop distributions supported by vendors. Let's look at some of the Hadoop distributions available.

Hadoop distributions

Hadoop distributions are supported by the companies managing the distribution, and some distributions also have license costs.
Companies such as Cloudera, Hortonworks, Amazon, MapR, and Pivotal have their respective Hadoop distributions in the market, which offer Hadoop with the required sub-packages and projects in compatible versions and provide commercial support. This greatly reduces effort, not just for operations, but also for deployment and monitoring, and provides tools and utilities for easier and faster development of the product or project. For managing the Hadoop cluster, Hadoop distributions provide graphical web UI tooling for the deployment, administration, and monitoring of Hadoop clusters, which can be used to set up, manage, and monitor complex clusters and saves a lot of effort and time. Some of the Hadoop distributions available are as follows:
* Cloudera: According to The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014, this is the most widely used Hadoop distribution with the biggest customer base, as it provides good support and has some good utility components such as Cloudera Manager, which can create, manage, and maintain a cluster and manage job processing. Impala, which has real-time processing capability, is developed and contributed by Cloudera.
* Hortonworks: Hortonworks' strategy is to drive all innovation through the open source community and create an ecosystem of partners that accelerates Hadoop adoption among enterprises. It uses the open source Hadoop project and is a major contributor to enhancements in Apache Hadoop. Ambari was developed and contributed to Apache by Hortonworks. Hortonworks offers a very good, easy-to-use sandbox for getting started. Hortonworks contributed changes that made Apache Hadoop run natively on Microsoft Windows platforms, including Windows Server and Microsoft Azure.
* MapR: The MapR distribution of Hadoop uses different concepts than plain open source Hadoop and its competitors, especially support for a network file system (NFS) instead of HDFS for better performance and ease of use.
In NFS, native Unix commands can be used instead of Hadoop commands. MapR has high availability features such as snapshots, mirroring, or stateful failover.
* Amazon Elastic MapReduce (EMR): AWS's Elastic MapReduce (EMR) leverages its comprehensive cloud services, such as Amazon EC2 for compute, Amazon S3 for storage, and other services, to offer a very strong Hadoop solution for customers who wish to implement Hadoop in the cloud. EMR is well suited to infrequent big data processing, where it might save you a lot of money.

Pillars of Hadoop

Hadoop is designed to be highly scalable, distributed, massively parallel, fault tolerant, and flexible, and the key aspects of the design are HDFS, MapReduce, and YARN. HDFS and MapReduce can perform very large scale batch processing at a much faster rate. Due to contributions from various organizations and institutions, the Hadoop architecture has undergone a lot of improvements, and one of them is YARN. YARN has overcome some limitations of Hadoop and allows Hadoop to integrate with different applications and environments easily, especially in streaming and real-time analysis. Two such examples that we are going to discuss are Storm and Spark; they are well known in streaming and real-time analysis, and both can integrate with Hadoop via YARN.

Data access components

MapReduce is a very powerful framework, but mastering and optimizing a MapReduce job involves a steep learning curve. For analyzing data in the MapReduce paradigm, a lot of our time will be spent coding. In big data, the users come from different backgrounds such as programming, scripting, EDW, DBA, analytics, and so on; for such users, there are abstraction layers on top of MapReduce. Hive and Pig are two such layers: Hive has a SQL-like query interface, and Pig has the procedural Pig Latin language interface. Analyzing data with such layers becomes much easier.

Data storage component

HBase is a column store-based NoSQL database solution.
HBase's data model is very similar to Google's Bigtable framework. HBase can efficiently handle random and real-time access to a large volume of data, usually millions or billions of rows. HBase's important advantage is that it supports updates on larger tables and faster lookup. The HBase data store supports linear and modular scaling. HBase stores data as a multidimensional map and is distributed. HBase operations are all MapReduce tasks that run in a parallel manner.

Data ingestion in Hadoop

In Hadoop, storage is never an issue, but managing the data is the driving force around which different solutions can be designed with different systems; hence, managing data becomes extremely critical. A more manageable system can help a lot in terms of scalability, reusability, and even performance. In the Hadoop ecosystem, we have two widely used tools, Sqoop and Flume; both can help manage the data and can import and export data efficiently, with good performance. Sqoop is usually used for data integration with RDBMS systems, and Flume usually performs better with streaming log data.

Streaming and real-time analysis

Storm and Spark are two fascinating newer components that can run on YARN and have some amazing capabilities in terms of processing streaming and real-time analysis. Both of these are used in scenarios where heavy, continuous streaming data has to be processed in real time or near real time. The traffic analyzer example discussed earlier is a good use case for Storm and Spark.

Summary

In this article, we explored a bit of Hadoop's history and then moved on to the advantages and uses of Hadoop. Hadoop systems are complex to monitor and manage, and we have separate sub-projects, frameworks, tools, and utilities that integrate with Hadoop and help in better management of tasks; together, these are called the Hadoop ecosystem.
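To make the MapReduce model mentioned throughout this article a little more concrete, here is a minimal sketch in plain Python. It is not Hadoop code: it only imitates the map, shuffle, and reduce steps in a single process for a word-count job. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions made for illustration and are not part of any Hadoop API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence (illustrative, single process)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; in Hadoop the lines would come from HDFS blocks.
    sample = ["Hadoop stores data in HDFS", "MapReduce processes data in parallel"]
    print(word_count(sample))
```

In a real cluster, each map call would run on the node holding the relevant data block and the shuffle would move data across the network; the sketch only shows how the three steps fit together.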
Resources for Article: Further resources on this subject: Hive in Hadoop [article] Hadoop and MapReduce [article] Evolution of Hadoop [article]

Xavier Bruhiere
06 May 2015
10 min read

Provisioning Docker Containers

Docker containers are spreading fast. They're sneaking into our development environments and production servers, and the proliferation of links in this post emphasizes how hot the topic currently is. Containers encapsulate applications into a portable machine you can easily build, control and ship. They bring most modern services just one command away, along with clean development environments and agile production infrastructure, to name a few of the benefits. While getting started is insanely easy, real life applications can be tricky when you try to push a bit on the boundaries. In this post, we're going to study how to provision docker containers, prototyping along the way our very own image builder. Hopefully, by the end of this post, you should get a good idea of the challenges and opportunities involved. As of today, docker hub features 15,000 images and the most popular one was downloaded 3,588,280 times. We'd better be good at crafting them!

Configuration

First things first, we need a convenient way to describe how to build the application. This is exactly what files like travis.yml aim to do, so here is a good place to start.

    # The official base image to use
    language: python

    # Container build steps
    install:
      # For the sake of genericity, we introduce support for templating
      {% for pkg in dependencies %}
      - pip install {{ pkg }}
      {% endfor %}

    # Validating the build
    script:
      - pylint {{ project }} tests/
      - nosetests --with-coverage --cover-package {{ project }}

YAML formatting is also a decent choice, easily processed both by humans and machines (and I think this is something Ansible and Salt get right in configuration management). I'm also biased toward python for exploration, so here is the code to load the information into our program.
    # run (sudo) pip install pyyaml==3.11 jinja2==2.7.3
    import jinja2
    import yaml


    def load_manifest(filepath, **properties):
        # Render the manifest as a jinja2 template, then parse the result as YAML
        with open(filepath) as fd:
            tpl = jinja2.Template(fd.read())
        return yaml.safe_load(tpl.render(**properties))

This setup gives us the simplest configuration interface ever (files), version control for our build, a centralized view of container definitions, trivial management, and easy integration for future tools like, say, container provisioning. You can already enjoy those benefits with projects built by hashicorp or with the application container specification. While I plan to borrow a lot of the concepts behind the latter, we don't need this level of precision nor to constrain our code to their layout conventions. Regarding tools like packer, they're oversized here, although we already took some inspiration from them: configuration as template files.

Model

So far so good. We have a nice dictionary, describing a simple application. However I propose to transcribe this structure into a directed graph. It will bring hierarchical order to the steps, and whenever we parallelize them, like independent tasks or tests, we will simply branch out.

    class Node(object):

        def __init__(self, tag, **properties):
            # Node will be processed later. The tag indicates how to process it
            self.tag = tag
            self.props = properties
            # Children nodes connected to this one
            self.outgoings = []


    class Graph(object):

        def __init__(self, startnode):
            self.nodes = [startnode]

        def connect(self, node, *child_nodes):
            for child in child_nodes:
                node.outgoings.append(child)
                self.nodes.append(child)

        def walk(self, node_ptr, callback):
            callback(node_ptr)
            for node in node_ptr.outgoings:
                # Recursively follow nodes
                self.walk(node, callback)

Starting from the data we previously loaded, we finally model our application into a suitable structure.
    def build_graph(data, artifact):
        # Initialization
        node_ptr = Node("start", image=data["language"])
        graph = Graph(node_ptr)

        # Provision
        for task in data["install"]:
            task_node = Node("task", command=task)
            graph.connect(node_ptr, task_node)
            node_ptr = task_node

        # Validation, on a different branch
        test_node_ptr = node_ptr
        for test in data["script"]:
            test_node = Node("test", command=test)
            graph.connect(test_node_ptr, test_node)
            test_node_ptr = test_node

        # Finalization
        graph.connect(node_ptr, Node("commit", repo=artifact))
        return graph

Build Flow

While our implementation is really naive, we now have a convenient structure to work on. Keeping up with our fictional model, the following graph represents the build workflow as a simple Finite State Machine.

(Figure: container fsm)

Some remarks:
* travis.yml steps, i.e. graph nodes, became events.
* We handle caching like docker build does. A new container is only started when a new task is received.

The pieces begin to fall into place. The walk() method of the Graph is a perfect fit to emit events and the state machine is a robust solution to safely manage a container life-cycle with a conditional start. As a bonus point, it decouples the data model and the build process (loosely coupled components are cool).

Execution

In order to focus on provisioning issues instead of programmatic implementations, however, we're going to prefer the _good enough_ Factory below.

    # pip install docker-py==1.1.0
    import os

    import docker


    class Factory(object):
        """ Manage the build workflow. """

        def __init__(self, endpoint=None):
            endpoint = endpoint or os.environ.get('DOCKER_HOST', 'unix://var/run/docker.sock')
            self.conn = docker.Client(endpoint)
            self.container = None

        def start(self, image):
            self.container = self.conn.create_container(image=image, command='sleep 360')
            self.conn.start(self.container['Id'])

        def provision(self, command):
            self.conn.execute(self.container['Id'], command)

        def teardown(self, artifact):
            # Save the provisioned container as an image, then clean up
            self.conn.commit(self.container['Id'], repository=artifact, tag='awesome')
            self.conn.stop(self.container['Id'])
            self.conn.remove_container(self.container['Id'])

        def callback(self, node):
            #print("[factory] new step {}: {}".format(node.tag, node.props))
            if node.tag == "start":
                self.start(node.props["image"])
            elif node.tag == "task":
                self.provision(node.props["command"])
            elif node.tag == "commit":
                self.teardown(node.props["repo"])

We leverage the docker exec feature to run commands inside the container. This approach gives us an important asset: zero requirements on the target to make it work with our project. We're compatible with every container and we have nothing to pre-install, i.e. no overhead and no extra bytes for our final image. At this point, you should be able to synthesize a cute, completely useless, little python container.

    data = load_manifest('travis.yml', project='factory', dependencies=['requests', 'ipython'])
    graph = build_graph(data, "test/factory")
    graph.walk(graph.nodes[0], Factory().callback)

Getting smarter

As mentioned, docker cli optimizes subsequent builds by skipping previous successful steps, speeding up the development workflow. But it also has its flaws. What if we could run commands with strong security guarantees, knowing they are pinned to the exact same version across different runs? Basically, we want reliable, reproducible builds, and tools like Snappy and Nix come in handy for the task.
Both solutions ensure the security and the stability of what we're provisioning, avoiding side effects on/from other unrelated OS components.

Going further

Our modest tool takes shape, but we're still lacking an important feature: copying files from the host inside the container (code, configuration files). The former is straightforward as docker supports mapping volumes. The latter can be solved by what I think is an elegant solution, powered by consul-template and explained below.

* First we build a container full of useful binaries our future other containers may need (at least consul-template).

    FROM scratch
    MAINTAINER Xavier Bruhiere <xavier.bruhiere@gmail.com>

    # This directory contains the programs
    ADD ./tools /tools
    # And we expose it to the world
    VOLUME /tools
    WORKDIR /tools
    ENTRYPOINT ["/bin/sh"]

    docker build -t factory/toolbox .
    # It just needs to exist to be available, not even run
    docker create --name toolbox factory/toolbox

* We make those tools available by mapping the toolbox to the target container. This is in fact a common practice known as data containers.

    self.conn.start(self.container['Id'], volumes_from='toolbox')

* Files, optionally being go templates, are grouped inside a directory on the host, along with a configuration specifying where to move them. The project's readme explains it all.
* Finally we insert the following task before the others to perform the copy, rendering templates in the process with values from the consul key/value store.

    cmd = '/tools/consul-template -config /root/app/templates/template.hcl -consul 192.168.0.17:8500 -once'
    task_node = Node("task", command=cmd)
    graph.connect(node_ptr, task_node)

We now know how to provide useful binary tools and any parametric file inside the build.

### Base image

Keeping our tools outside the container lets us factor out common utilities and avoid fat images. But we can go further and take a special look at the base we're using.
Small images improve download and build speed, and are therefore much easier to deal with, both for development and production. Projects like docker-alpine try to define the minimal common ground for applications, while unikernels want to compile and link the necessary OS components along with the app to produce an ultra-specialized artifact (and we can go even further and strip down the final image). Those philosophies also limit maintenance overhead (fewer moving parts reduce side effects and unexpected behaviors) and attack surface, and are especially efficient when keeping a single responsibility per container (not necessarily a single process, though). Having a common base image is also a good opportunity to solve once and for all some issues with docker defaults, like phusion suggests. On the other hand, using a common layer for all future builds prevents us from exploiting community creations. Official language images allow one to quickly containerize an application on top of solid ground. As always, it really depends on the use case.

Brainstorm of improvements

What's more, here is a totally non-exhaustive list of ideas to push our investigation further:
* Container engine agnostic: who knows who will be the big player tomorrow. Instead of a docker client we could implement drivers for rkt or lxd. We could also split the Factory into engine and provisioner components.
* Since we fully control the build flow, we could change the graph walker callback into an interactive prompt to manually build, debug and inspect the container.
* Given multiple apps and remote docker endpoints, builds could be parallel and distributed.
* We could modify our load_manifest function to recursively load other required manifests. With reusable modules we could share the best ones (much like Ansible-galaxy).
* Built-in integration tests with the help of docker-compose and third party containers.
* Currently, the container is launched with a sleep command.
We could instead place terminus within our toolbox and use it at runtime to gather host information and eventually reuse it in our templates (again, very similar to Salt pillars for example).

Wrapping up

We merely scratched the surface of container provisioning, and yet there are plenty of exciting opportunities for supporting developers' efficiency. While the fast progress in container technologies might seem overwhelming, I hope the directions provided here gave you a modest overview of what is happening. There are a lot of open questions and interesting experiments, so I encourage you to be part of it!

About the Author

Xavier Bruhiere is the CEO of Hive Tech. He contributes to many community projects, including Oculus Rift, Myo, Docker and Leap Motion. In his spare time he enjoys playing tennis, the violin and the guitar. You can reach him at @XavierBruhiere.
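As a follow-up to the build flow section of this post, which describes a finite state machine but then settles for a simpler Factory, here is a minimal sketch of what such a machine could look like. It is only one possible reading of the idea: the state names (idle, ready, running, done) and the classes BuildMachine and RecordingEngine are hypothetical, and the engine merely records calls so the sketch runs without a docker daemon.

```python
class RecordingEngine(object):
    """Hypothetical stand-in for the Factory: records calls instead of using docker."""

    def __init__(self):
        self.calls = []

    def start(self, image):
        self.calls.append(("start", image))

    def provision(self, command):
        self.calls.append(("provision", command))

    def teardown(self, artifact):
        self.calls.append(("teardown", artifact))


class BuildMachine(object):
    """Tiny state machine (assumed states): idle -> ready -> running -> done."""

    def __init__(self, engine):
        self.engine = engine
        self.state = "idle"
        self.image = None

    def on_event(self, tag, props):
        if tag == "start" and self.state == "idle":
            # Remember the image, but do not start anything yet
            self.image = props["image"]
            self.state = "ready"
        elif tag in ("task", "test") and self.state in ("ready", "running"):
            if self.state == "ready":
                # Conditional start: the first command triggers the container
                self.engine.start(self.image)
                self.state = "running"
            self.engine.provision(props["command"])
        elif tag == "commit" and self.state == "running":
            self.engine.teardown(props["repo"])
            self.state = "done"
        else:
            raise ValueError("event %r not allowed in state %r" % (tag, self.state))

    def callback(self, node):
        # Adapter with the same shape as the Graph.walk() callback
        self.on_event(node.tag, node.props)


if __name__ == "__main__":
    engine = RecordingEngine()
    machine = BuildMachine(engine)
    machine.on_event("start", {"image": "python"})
    machine.on_event("task", {"command": "pip install requests"})
    machine.on_event("commit", {"repo": "test/factory"})
    print(machine.state, engine.calls)
```

Because callback() takes a node with tag and props attributes, an instance could in principle be passed to graph.walk() in place of Factory().callback. Invalid sequences, such as a commit before any task, raise an error instead of touching the container.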
Packt
06 May 2015
7 min read

Why Big Data in the Financial Sector?

In this article, Rajiv Tiwari, author of the book Hadoop for Finance Essentials, explains that big data is not just changing the data landscape of the healthcare, human science, telecom, and online retail industries, but is also transforming the way financial organizations treat their massive data. (For more resources related to this topic, see here.)

As shown in the following figure, a study by McKinsey on big data use cases claims that the financial services industry is poised to gain the most from this remarkable technology:

The data in financial organizations is relatively easier to capture compared to other industries, as it is easily accessible from internal systems (transactions and customer details) as well as external systems (FX rates, legal entity data, and so on). Quite simply, the gain per byte of data is highest for financial services.

Where do we get the big data in finance?

The data is collected at every stage, be it the onboarding of new customers, call center records, or financial transactions. The financial industry is rapidly moving online, and so it has never been easier to capture the data. There are other reasons as well, such as:
* Customers no longer need to visit branches to withdraw and deposit money or make investments. They can discuss their requirements with the bank online or over the phone instead of in physical meetings. According to SNL Financial, institutions shut 2,599 branches in 2014 against 1,137 openings, a net loss of 1,462 branches that is just off 2013's record full-year total of 1,487. The move brings total US branches down to 94,752, a decline of 1.5 percent. The trend is global and not just in the US.
* Electronic channels such as debit/credit cards and mobile devices, through which customers can interact with financial organizations, have increased in the UK, as shown in the following figure. The trend is global and not just in the UK.
* Mobile equipment such as computers, smartphones, telephones, or tablets makes it easier and cheaper for customers to transact, which means customers will transact more and generate more data.
* Since customer profiles and transaction patterns are rapidly changing, risk models based on smaller data sets are not very accurate. We need to analyze data over longer durations and be able to write complex data algorithms without worrying about computing and data storage capabilities.
* When financial organizations combine structured data with unstructured data from social media, the data analysis becomes very powerful. For example, they can get feedback on their new products or TV advertisements by analyzing Twitter, Facebook, and other social media comments.

Big data use cases in the financial sector

The financial sector is also sometimes called the BFSI sector; that is, banking, financial services, and insurance.
* Banking includes retail, corporate, business, investment (including capital markets), cards, and other core banking services
* Financial services include brokering, payment channels, mutual funds, asset management, and other services
* Insurance covers life and general insurance

Financial organizations have been actively using big data platforms for the last few years, and their key objectives are:
* Complying with regulatory requirements
* Better risk analytics
* Understanding customer behavior and improving services
* Understanding transaction patterns and monetizing using cross-selling of products

Now I will define a few use cases within the financial services industry with real, tangible business benefits.

Data archival on HDFS

Archiving data on HDFS is one of the basic use cases for Hadoop in financial organizations and is a quick win. It is likely to provide a very high return on investment.
The data is archived on Hadoop and is still available to query (although not in real time), which is far more efficient than archiving on tape and far less expensive than keeping it in databases. Some of the use cases are:
* Migrate expensive and inefficient legacy mainframe data and load jobs to the Hadoop platform
* Migrate expensive older transaction data from high-end expensive databases to Hadoop HDFS
* Migrate unstructured legal, compliance, and onboarding documents to Hadoop HDFS

Regulatory

Financial organizations must comply with regulatory requirements. In order to meet these requirements, the use of traditional data processing platforms is becoming increasingly expensive and unsustainable. A couple of such use cases are:
* Checking customer names against a sanctions blacklist is very complicated due to the same or similar names. It is even more complicated when financial organizations have different names or aliases across different systems. With Hadoop, we can apply complex fuzzy matching on name and contact information across massive data sets at a much lower cost.
* The BCBS239 regulation states that financial organizations must be able to aggregate risk exposures across the whole group quickly and accurately. With Hadoop, financial organizations can consolidate and aggregate data on a single platform in the most efficient and cost-effective way.

Fraud detection

Fraud is estimated to cost the financial industry billions of US dollars per year. Financial organizations have invested in Hadoop platforms to identify fraudulent transactions by picking up unusual behavior patterns. Complex algorithms that need to be run on large volumes of transaction data to identify outliers are now possible on the Hadoop platform at a much lower expense.

Tick data

Stock market tick data is real-time data and generated on a massive scale.
Live data streams can be processed using real-time streaming technology on the Hadoop infrastructure for quick trading decisions, and older tick data can be used for trending and forecasting using batch Hadoop tools.

Risk management

Financial organizations must be able to measure risk exposures for each customer and effectively aggregate them across entire business divisions. They should be able to score the credit risk for each customer using internal rules. They need to build risk models with intensive calculation on the underlying massive data. All these risk management requirements have two things in common: massive data and intensive calculation. Hadoop can handle both, given its inexpensive commodity hardware and parallel execution of jobs.

Customer behavior prediction

Once the customer data has been consolidated from a variety of sources on a Hadoop platform, it is possible to analyze data and:
* Predict mortgage defaults
* Predict spending for retail customers
* Analyze patterns that lead to customers leaving and customer dissatisfaction

Sentiment analysis – unstructured

Sentiment analysis is one of the best use cases to test the power of unstructured data analysis using Hadoop. Here are a few use cases:
* Analyze all e-mail text and call recordings from customers, which indicate whether they feel positive or negative about the products offered to them
* Analyze Facebook and Twitter comments to make buy or sell recommendations; that is, analyze the market sentiment on which sectors or organizations will be a better buy for stock investments
* Analyze Facebook and Twitter comments to assess the feedback on new products

Other use cases

Big data has the potential to create new non-traditional income streams for financial organizations. As financial organizations store all the payment details of their retailers, they know exactly where, when, and how their customers are spending money.
By analyzing this information, financial organizations can develop deep insight into customer intelligence and spending patterns, which they will be able to monetize. A few such possibilities include:
* Partner with a retailer to understand where the retailer's customers live, where and when they buy, what they buy, and how much they spend. This information will be used to recommend a sales strategy.
* Partner with a retailer to recommend discount offers to loyalty cardholders who use their loyalty cards in the vicinity of the retailer's stores.

Summary

In this article, we learned the use cases of Hadoop across different industry sectors and then detailed a few use cases within the financial sector.

Resources for Article: Further resources on this subject: Hive in Hadoop [article] Hadoop and MapReduce [article] Hadoop Monitoring and its aspects [article]

Packt
06 May 2015
21 min read

Preparing our Solution

This article by Simon Buxton and Mat Fergusson, the authors of Microsoft Dynamics AX 2012 R3 Programming – Getting Started, covers the preparation work required before we start cutting code. Some parts of this may be skipped or reduced, depending on the scale of the development. This article does not cover the installation and configuration of the required environments; it is assumed that this is already done. We also assume that our development environment has the AX client, management tools, and Visual Studio 2010 Professional installed. If you are using cumulative update 8 (CU8), you need to use Visual Studio 2013 Professional. If we are to use Team Foundation Server (TFS), each developer must have their own development environment. Typically, we will have a virtual server as a single box AX installation. We will cover the following topics in this article:
* Creating the models
* Designing the technical solution

(For more resources related to this topic, see here.)

Creating the models

The models required depend on your organization's requirements. In this section, we will create the models based on our Fleet Management System, from a customer or end user perspective. Models should also include your prefix. In this case, we will use the Con prefix for Contoso. We will create our models in the USR layer, as explained later in this section.

ISV models

ISVs will normally have a base model that contains shared code between all models, and a model per add-on or vertical solution. Some care is required in ensuring that there isn't a circular dependency chain between models; that is, two models that reference each other, requiring the installation to have special instructions. Following the naming convention of prefixing elements, an ISV with the Axp prefix and an add-on named Documental can name the model AxpDocumental.
VAR models If we are a VAR building a solution to customer-specific requirements, we will have three models: one for the actual modifications, another for changes to security, and the third for the labels. For example, if our prefix is Bcl and the customer is Contoso, we will have BclContoso, BclContosoLabels, and BclContosoSecurity. Creating security in a separate model is not mandatory, but helps when implementing projects with the Sure Step methodology because it allows security to be worked on in separate streams. Customer or end user models If we are a global organization, with separate Dynamics AX installations, we may decide to develop a central application, which is then installed on each site. In this case, we will have three models placed in the CUS layer. For the Contoso example, we have ConGlobalApplication, ConGlobalLabels, and ConGlobalSecurity. The most common scenario is to host Dynamics AX centrally, therefore having one application. The same three models are required, but this time in the USR layer: ConApplication, ConLabels, and ConSecurity. It may make sense to place distinct sets of functionality in separate models, and this would certainly help in managing separate development streams. Over time, the models tend to develop cyclic dependencies, which require them to be merged in order to ensure that a complete set of code is deployed. In our example, we will create a model for a specific functionality—our Fleet Management System—and it will make sense to hold it in a separate stream. However, it would be a particularly bad idea to hold each module's modifications in its own model. Creating the models Before we create the models, we must be in the correct layer; the USR layer in this case. The model creation is done by following these steps: Open the Dynamics AX Client, and open the development environment (Ctrl + D). From the main menu, go to Tools | Model management | Create model. 
Complete the Create model form as shown here:
Field | Description
Model name | This is the name of the model. It can contain spaces, but it is normally the same as the display name.
Model Publisher | Your organization or department.
Layer | The layer the model should be created in. This should be the current layer.
Version | The version of the model. This becomes part of the strong name of the model during signing.
Model description | Long description of the model. It is a good idea to link this to a functional or technical design document.
Leave Set as current model checked and press OK. An example of a completed Create model form is shown in the following screenshot: If we have version control enabled for TFS, we will also be asked for the Model repository folder. It will suggest, in our example C:ProjectsVCSAX6015GettingStarted<Model folder>. Replace <Model folder> with the model name, as shown in the following screenshot: Using the preceding instructions as a guide, we will need to create the following models. Usefully, AX will remember the previous information, so we only need to populate Model name and Model display name: ConFleetManagement, ConLabels, and ConFleetManagementSecurity.
Designing the technical solution
In most implementations, we have several roles involved in the solution design, build, and implementation. Our role is to design and develop a technical solution to a business requirement, and as discussed earlier, we will follow the design and build of a Fleet Management System. The first steps in this are to analyze the business requirement and design a solution within Dynamics AX. This work will typically be led by a consultant, who will (in short) perform the following: Match the business requirements to the AX functionality. The requirement may require new functionality or an extension of existing functionality. Discuss the technical solution with a technical consultant/developer in order to design a solution that is feasible in AX in a suitable time frame. 
Work through the solution with the solution architect to ensure that it fits in the overall solution design. Create functional design documents. These will be signed by the customer stakeholders and process owners. The consultant may propose table structures as parts of the functional design, but these are only to reinforce the requirement. The technical designer may find a more appropriate solution to this. The process is intended to leverage the skills of all parties in the solution delivery, allowing all parties to use their skills by abstracting the solution. Here is its summary: The customer or process owner understands the process The consultant is an expert in AX and focuses on the solution, creating a FDD The solution architect validates the FDD against the solution design The technical architect (or lead/analyst developer) creates the technical design while validating that it fits with the overall technical solution These roles often merge, but there should always be a separation of business requirement definition and technical design definition. This freedom over the technical design does not mean we have total freedom over the technical solution. At all levels, it has to both match the original requirement and fit in with the overall solution. Just because it is technically cool does not mean it is appropriate. Our purpose is to create a technical solution to a business requirement. We will evolve the design throughout this article, and the reason for each decision will be explained. It is important to understand and follow these design goals: Upgrades and system maintainability: Minimize the footprint on standard AX. Design for code reuse: This could span from creating a general framework to a useful static function on a global class. Design for a service-oriented architecture: Always consider that your code might be used as a service or as part of a service. This paradigm also promotes code reuse. 
Validate the design: Always validate the technical design against the original requirement; failing to do so is a very common cause of time and cost overruns. A prototype can be useful for this. Use design patterns: Do not reinvent the wheel. Patterns save design time, reduce mistakes, and promote a solution that better conforms to best practice. The technical design will include decisions on what technologies, frameworks, and patterns we will use. We may revert some decisions later on, but we need to be sure we get the majority right the first time. The data structures are one such design element. Once we start using them in code and the UI, changing them becomes more and more difficult. Some elements can't easily be reverted, such as whether to use table inheritance or not. Table inheritance is a little like class inheritance. For example, we may have a core vehicle table, and specialized tables that inherit its properties (fields) and methods. As a more specific example, an articulated truck will have different attributes compared to a company car.
Data structure design considerations
The data structure architecture within Dynamics AX is breathtaking. When designing the technical structure, the tables and views should be considered along with classes as part of your static structure. We are not making the classes persist in the database. The table definition may be designed on object-oriented (OO) principles, but we are using this to define physical tables that are transacted on reliably. The key concepts within this are described in the following sections.
Extended data types
In traditional database design, a field tended to be one of a limited set of primitive types, such as string, number, and so on. The extended data type (EDT) system in AX allows us to define types with extended properties. With this, we can control the following categories of properties: Appearance For example, the label, help text, size (string), alignment, and other type-specific properties. 
Behavior Direction, for example, RTL and presence information. Business intelligence Information used by the system to generate the OLAP database. Data A reference table, internal ID information, or a reference form (the form used to open the record identified by the database relation). Relationships For the example of the ItemId EDT, this would specify that this EDT references InventTable.ItemId. In this way, the system knows when the EDT is associated with a field on a child table and which table and record it references. Additionally, the table will often have a reference to the form that is used to edit the data, allowing the user to quickly navigate to the details form. The specific properties aren't important for now, but understanding the concept is. Using EDTs ensures database consistency (primarily type and size) and user interface consistency (label, help text, and so on). This is done with very little effort, as we only have to change the EDT properties. Even more powerfully, changing the size of an EDT will change the size of all fields that reference it. Therefore, we will always use an EDT when creating a field, and almost always use EDTs as variable declarations and method parameters. We can override most of the other properties on the table field, but we rarely do this. A key benefit is that we can control these properties with little effort, ensuring consistency throughout the user interface. In some cases, we need to have a minor difference; for instance, we may wish to change a label when used in a specific context. Rather than changing the field label, it is better to create a new EDT that extends the primary EDT. It is possible to change standard EDTs, but great care must be taken, as we need to know the full effects of the change. Base enums – enumerated types Base enums are what is more commonly known as enumerated types. 
They provide a list of options that are stored as a number in the database; the user interface will always display the corresponding text. They are equivalent to integers, and can be cast between the integer and the symbol (text). Enums are great for status fields, where we need to have code written against a specific value. Writing code against number or string literals is bad practice. Should the option not exist or be removed from the enum's definition, we will get a compilation error when the code is compiled. An example Enum is SalesStatus, which contains the following elements:
Symbol | Label (based on en-us) | Enum Value
None | None | 0
Backorder | Open order | 1
Delivered | Delivered | 2
Invoiced | Invoiced | 3
Canceled | Canceled | 4
You should always reference the symbol in the code. AX will understand that and translate it into the Enum value.
Tables
The table definitions stored within AX synchronize with the business database in SQL Server; often, this is automatic. No changes should be made to the SQL Server table definition, as this will be overwritten whenever the database is synchronized. The table definition information controls both the user interface and how the physical table in SQL Server is created. This includes table properties, field definitions, and indexes. Relationships and referential integrity constraints are not created within SQL Server; these are managed within AX. They control what happens to a child table when a record is deleted or the primary key is renamed. The field definitions also control both the user interface and the physical field in SQL Server. These are usually set by the EDT it is associated with. A key differentiator for tables is that we can create methods and override table event methods such as validateField, validateWrite, modifiedField, insert, delete, and so on. This allows us to place table-level validation and events on the table centrally and not on the interface. 
In AX 2012, we can now have inheritance within tables and valid date/time states. Inheritance will allow us to have a base table, such as a vehicle table, and specialized tables that extend it, such as a bulk/loose product truck with silos and a vehicle designed to take pallets. The interface natively understands this relationship and can display records for all inherited tables in one grid control. The valid time state provides a way to version records. For example, as data about an employee changes, we can at any time ask the system what the data was at a specific point in time. The key consideration here in the design is to determine the structure and events we need to handle, which in turn drives the required EDTs and base enums. Views Views are SQL view definitions, created by constructing a query of one or more tables. They can contain aggregates and also calculate view fields, which are essentially a piece of Transact-SQL that equates to a column in view. These are useful when flattening data from normalized tables. The only real drawback is that they are read-only views, and when they are placed on the user interface in a grid control, the grid control becomes read-only. This means that if we add a column from a related table that is editable, it will become read-only in the grid control. Maps Maps provide a method of sharing code for similar tables. A good example of this is the pricing logic for sales and purchase order lines that is handled by the SalesPurchLine map. Maps contain a list of reference fields, details of how these reference fields map to the actual table fields, and methods. This is best explained with an example, such as calculating the stocking unit quantity from the quantity ordered (for example, stored in cases and sold in each). Rather than write this code on the sales order line and the purchase order line, we can do this once using a map. On the SalesPurchLine map, there are fields for PurchSalesUnit, ItemId, and SalesPurchQty. 
They are mapped as follows:
Map field | SalesLine field | PurchLine field
ItemId | ItemId | ItemId
PurchSalesUnit | SalesUnit | PurchUnit
SalesPurchQty | SalesQty | PurchQty
We can create a simplified method on the map that contains the following code snippet:
    InventQty calcQtyOrdered(Qty _qtySalesPurch = realMin())
    {
        InventQty       qty;
        InventTable     inventTable;
        Qty             qtySalesPurch = _qtySalesPurch;
        ;

        if (qtySalesPurch == realMin())
            qtySalesPurch = this.SalesPurchQty;

        if (!qtySalesPurch)
            return 0;

        // this is actually calling a method that should exist
        // on the actual table, e.g. SalesLine
        inventTable = this.inventTable();

        qty = UnitConvert::qty(qtySalesPurch,
                               this.PurchSalesUnit,
                               inventTable.inventUnitId(),
                               this.ItemId);

        return decround(qty, InventTable::inventDecimals(this.ItemId));
    }
On the sales and purchase order line tables, we can call methods on the map in the same way. An example from the PurchLine table, which calls the map's calcLineAmount method, is as follows:
    AmountCur calcLineAmount(Qty qty = this.PurchQty)
    {
        AmountCur ret;

        if (this.LineDeliveryType != LineDeliveryType::OrderLineWithMultipleDeliveries)
        {
            ret = this.SalesPurchLine::calcLineAmount(qty);
        }

        return ret;
    }
The map will then automatically construct itself using the field mapping. Hence, we call methods on the map as if they are static methods with the :: sign. So, when called from the sales order line, this.PurchSalesUnit becomes SalesLine.SalesUnit and this.SalesPurchQty becomes SalesLine.SalesQty. This can be a useful feature for reusing code across tables that provide similar functionalities.
Classes
Class definitions in AX provide functionality similar to C++, C#, or Java classes, in that they support inheritance and encapsulation. Interfaces can also be used and implemented in a way similar to C#. 
Classes are created for the following purposes: user interface interaction; table event handling; services; and general processing (for example data updates, batch routines, and so on). Although it is common to use a class to handle table events, the table itself will handle the interaction with the database. Form interaction classes are not mandatory for list pages, such as Accounts receivable | Common | Customers | All customers, but should be used on data entry forms that require logic. This ensures consistency and allows easier maintenance of logic. When designing the static (mainly table and class) structure, we should break the design down so that we can easily expose each task as a service. An example can be a class that takes a vehicle out of service. This may perform many tasks: checking whether it was planned on loads, replacing the vehicle on the basis of a rule set, changing the status of the vehicle, and so on. We may have other requirements to simply change the status of a vehicle or associate suitable vehicles with unallocated loads. We have already written the code to change the status and the code to find a suitable vehicle. If we had classes for finding a suitable vehicle, changing status, and so on, we could've reused them, be it on a form or from code or linked to a service that could be used by a mobile application. The point here is that we should break discrete pieces of functionality down into separate classes, as it then becomes much easier to reuse them later on.
Forms
Most forms will be built from templates, which help us provide a consistent user interface that will look and feel much like the rest of AX. This helps reduce training time, reduce user error, and improve end user acceptance. 
The following templates are available: a list page; a details form (master); a details form (transaction); a simple list; a simple list with a details section; a table of contents; a dialog; and a drop dialog. The list page is our main entry point to both master data (items, customers, and so on) and transactional forms, such as sales orders. The list page provides the user with a searchable list of records, with a button ribbon allowing the user to interact with the record, for example, Post sales invoice. They also provide key business intelligence about the current record, which means we don't have to navigate to the details form to make a decision on the record in question. There are two types of details forms: master and transaction. Master forms are like customer and item forms, while sales order details and purchase order details are transaction forms. Details forms are designed to be opened from a list page. Simple lists are useful for setup groups, where the form contains only a grid of a few fields. The simple list with details version contains a grid on the left and a details section that can be arranged into tabs to present many fields. Both of these form types are designed to be opened from the content area or menu. Tables of contents are designed to be used for parameter forms. Although they may act on more than one data source (table), the tables will typically have a single record. Dialog and drop dialog are similar in that both are designed to ask for limited information and then trigger an action. Both are usually called from another form. The difference is in how they are presented. The drop dialog will appear to drop down and be a part of the calling form, whereas a dialog appears as a pop-up window. The difference is cosmetic, but drop dialogs are often preferred as they can't fall behind the current window. 
Designing test plans
There are two main types of testing: unit testing and integration (process) testing. We are more concerned with unit testing. Unit testing primarily ensures that the code performs the functions specified in the functional design. We may also have performance requirements, where we need to test the code under a simulated load. AX provides a method to do this through a test project, where we can extend the test framework to write specific test cases. These work well when simulating the load against the live hardware environment. We can use a range of performance tools to ascertain where performance bottlenecks may lie and correct them. It is better to know before "go live" that we have a bottleneck. Even with this framework in place, manual testing is often the best method, especially since we are typically writing code based on database and UI interaction. Let's take an example of a vehicle status change requirement. In this case, we will list the conditions that allow the status change to occur, and what should happen.
Status changed to | Condition | Result
Available | Status: created; Vehicle: not acquired | Error "Vehicle not yet acquired"; status remains unchanged
Available | Status: created; Vehicle: acquired | Success; status changed to Available
Not available | Status: created; Vehicle: not acquired | Error "Vehicle not yet acquired"; status remains unchanged
We then test our code to ensure that these rules are being followed. Because we have one class that handles the status change, the form button, service call, and code call should also work. "Should" does not mean "will" of course, so each should be tested individually. One of the biggest fears and causes of end user complaints is regression. The users involved in testing are usually key users or process owners, who are already busy with additional work brought on by an implementation. It is often their job to train their users, and "sell" the system's benefits; user buy-in is critical for successful user adoption. 
There are two causes of regression: code that breaks other code or a change to a process that is incompatible with another process. The latter is mitigated by having a solution architect or lead consultant who is responsible for the solution as a whole. Code regression can be caused by the simplest change, and these changes are often the main cause of regression, as testing is often skipped in these cases. This is mitigated by thinking of testing as a component of the technical design, and having good technical documentation. The risks are further reduced if the developer notes, as the code is being written, the points where regression might occur. Since the code that might be affected is commented with the TDD or FDD reference, it should be easy to locate the test plan to check for regression.
Summary
In this article, we covered the steps that we take to start up a new project. We covered both the theory and practical steps that are to be followed when starting work on a new solution. This includes creating a model and designing the technical solution. Resources for Article: Further resources on this subject: Where Is My Data and How Do I Get to It? [article] Consuming Web Services using Microsoft Dynamics AX [article] Training, Tools, and Next Steps [article]
Packt
05 May 2015
7 min read

Solving Some Not-so-common vCenter Issues

In this article by Chuck Mills, author of the book vCenter Troubleshooting, we will review some of the not-so-common vCenter issues that administrators could face while they work with the vSphere environment. The article will cover the following issues and provide the solutions: the vCenter inventory shows no objects after you log in; you get the VPXD must be stopped to perform this operation message; and removing the vCenter plugins when they are no longer needed. (For more resources related to this topic, see here.)
Solving the problem of no objects in vCenter
After successfully completing the vSphere 5.5 installation (not an upgrade) with no error messages whatsoever, you log in to vCenter with the account you used for the installation. In this case, it is the local administrator account. Surprisingly, you are presented with an inventory of 0. The first thing is to make sure you have given vCenter enough time to start. Considering the previously mentioned account was the account used to install vCenter, you would assume the account is granted appropriate rights that allow you to manage your vCenter Server. Yet you can log in and still receive no objects from vCenter. Then, you might try logging in with your domain administrator account, with the same result. This makes you wonder, What is going on here? After installing vCenter 5.5 using the Windows option, remember that the administrator@vsphere.local user will have administrator privileges for both the vCenter Single Sign-On Server and vCenter Server. You log in using the administrator@vsphere.local account with the password you defined during the installation of the SSO server. vSphere attaches the permissions along with assigning the role of administrator to the default account administrator@vsphere.local. These privileges are given for both the vCenter Single Sign-On server and the vCenter Server system. You must log in with this account after the installation is complete. 
After logging in with this account, you can configure your domain as an identity source. You can also give your domain administrator access to vCenter Server. Remember, the installation does not assign any administrator rights to the user account that was used to install vCenter. For additional information, review the Prerequisites for Installing vCenter Single Sign-On, Inventory Service, and vCenter Server document found at https://pubs.vmware.com/vsphere-51/index.jsp?topic=%2Fcom.vmware.vsphere.install.doc%2FGUID-C6AF2766-1AD0-41FD-B591-75D37DDB281F.html. Now that you understand what is going on with the vCenter account, use the following steps to enable the use of your Active Directory account for managing vCenter. Add or verify your AD domain as an identity source using the following procedure: Log in with administrator@vsphere.local. Select Administration from the menu. Choose Configuration under the Single Sign-On option. You will see the Single Sign-On | Configuration option only when you log in using the administrator@vsphere.local account. Select the Identity Sources tab and verify that the AD domain is listed. If not, choose Active Directory (Integrated Windows Authentication) found at the top of the window. Enter your Domain name and click on OK at the bottom of the window. Verify that your domain was added to Identity Sources, as shown in the following screenshot: Add the permissions for the AD account using the following steps: Click on Home at the top left of the window. Select vCenter from the menu options. Select vCenter Servers and then choose the vCenter Server object: Select the Manage tab and then the Permissions tab found in the vCenter Object window. Review the image that follows the steps to verify the process. Click on the green + icon to add permission. Choose the Add button located at the bottom of the window. Select the AD domain found in the drop-down option at the top of the window. 
Choose a user or group you want to assign permission to (the account named Chuck was selected for this example). Verify that the user or group is selected in the window. Use the drop-down options to choose the level of permissions (verify that Propagate to children is checked). Now, you should be able to log into vCenter with your AD account. See the results of the successful login in the following screenshot: Now, by adding the permissions to the account, you are able to log into vCenter using your AD credentials. The preceding screenshot shows the results of the changes, which is much different than the earlier attempt.
Fixing the VPXD must be stopped to perform this operation message
It has been mentioned several times in this article that the vCenter Server Appliance (VCSA) is the direction VMware is moving in when it comes to managing vCenter. As the number of administrators using it keeps increasing, the number of problems will also increase. One of the components an administrator might have problems with is the vCenter Server service (vpxd). This service should not be running during any changes to the database or the account settings. However, as with most vSphere components, there are times when something happens and you need to stop or start a service in order to fix the problem. There are times when an administrator who works within the VCSA appliance encounters the following error: This service can be stopped using the web console, by performing the following steps: Log into the console using https://ip-of-vcsa:5480. Enter your username and password: Choose vCenter Server after logging in. Make sure the Summary tab is selected. Click on the Stop button to stop the server: This should work most of the time, but if you find that using the web console is not working, then you need to log into the VCSA appliance directly and use the following procedure to stop the server: Connect to the appliance by using an SSH client such as Putty or mRemote. 
Type the command chkconfig. This will list all the services and their current status: Verify that vmware-vpxd is on: You can stop the server by using the service vmware-vpxd stop command: After completing your work, you can start the server using one of the following methods: restart the VCSA appliance; use the web console by clicking on the Start button on the vCenter Summary page; or type service vmware-vpxd start on the SSH command line. This should fix the issues that occur when you see the VPXD must be stopped to perform this operation message.
Removing unwanted plugins in vSphere
Administrators add and remove tools from their environment based on their needs and on the life of the tool. This is no different for the vSphere environment. As the needs of the administrator change, so does the usage of the plugins used in vSphere. The following section can be used to remove any unwanted plugins from your current vCenter. So, if you have lots of plugins and they are no longer needed, use the following procedure to remove them: Log into your vCenter using http://vCenter_name or IP_address/mob and enter your username and password: Click on the content link under Properties: Click on ExtensionManager, which is found in the VALUE column: Highlight, right-click, and Copy the extension to be removed. Check out the Knowledge Base 1025360 found at http://Kb.vmware.com/kb/1025360 to get an overview of the plugins and their names. Select UnregisterExtension near the bottom of the page: Right-click on the plugin name and Paste it into the Value field: Click on Invoke Method to remove the plugin: This will give you the Method Invocation Result: void message. This message informs you that the selected plugin has been removed. You can repeat this process for each plugin that you want to remove.
Summary
In this article, we covered some of the not-so-common challenges an administrator could encounter in the vSphere environment. 
It provided the troubleshooting along with the solutions to the following issues: Seeing NO objects after logging into vCenter with the account you used to install it How to get past the VPXD must be stopped error when you are performing certain tasks within vCenter Removing the unwanted plugins from vCenter Server Resources for Article: Further resources on this subject: Availability Management [article] The Design Documentation [article] Design, Install, and Configure [article]
Packt
05 May 2015
31 min read

Symmetric Messages and Asynchronous Messages (Part 1)

In this article, Kingston Smiler. S, author of the book OpenFlow Cookbook, describes the steps involved in sending and processing symmetric messages and asynchronous messages in the switch. It contains the following recipes: Sending and processing a hello message; Sending and processing an echo request and a reply message; Sending and processing an error message; Sending and processing an experimenter message; Handling a Get Asynchronous Configuration message from the controller, which is used to fetch a list of asynchronous events that will be sent from the switch; Sending a Packet-In message to the controller; Sending a Flow-removed message to the controller; Sending a port-status message to the controller; Sending a controller-role status message to the controller; Sending a table-status message to the controller; Sending a request-forward message to the controller; Handling a packet-out message from the controller; Handling a barrier-message from the controller. (For more resources related to this topic, see here.) Symmetric messages can be sent from both the controller and the switch without any solicitation between them. The OpenFlow switch should be able to send and process the following symmetric messages to or from the controller, but error messages will not be processed by the switch: Hello message; Echo request and echo reply message; Error message; Experimenter message. Asynchronous messages are sent by the switch to the controller when there is any state change in the system. Like symmetric messages, asynchronous messages are also sent without any solicitation from the controller. 
The switch should be able to send the following asynchronous messages to the controller:

  • Packet-in message
  • Flow-removed message
  • Port-status message
  • Table-status message
  • Controller-role status message
  • Request-forward message

Similarly, the switch should be able to receive, or process, the following controller-to-switch messages:

  • Packet-out message
  • Barrier message

The controller can instruct the switch to send only the subset of asynchronous messages it is interested in by using an asynchronous configuration message. Based on this configuration, the switch should send only that subset of asynchronous messages over the communication channel. The switch should replicate and send asynchronous messages to all the controllers based on the information present in the asynchronous configuration message sent from each controller. The switch should maintain asynchronous configuration information on a per communication channel basis.

Sending and processing a hello message

The OFPT_HELLO message is used by both the switch and the controller to identify and negotiate the OpenFlow version supported by both the devices. Hello messages are considered part of the communication channel establishment procedure: the switch should send a hello message to the controller immediately after establishing the TCP/TLS connection with the controller.

How to do it...

As hello messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process the hello message. The following section explains these procedures in detail.

Sending the OFPT_HELLO message

The message format to be used to send the hello message from the switch is as follows. This message includes the OpenFlow header along with zero or more elements that have variable size:

    /* OFPT_HELLO. This message includes zero or more
     * hello elements having variable size. */
    struct ofp_hello {
        struct ofp_header header;
        /* Hello element list */
        struct ofp_hello_elem_header elements[0]; /* List of elements */
    };

The version field in the ofp_header should be set to the highest OpenFlow protocol version supported by the switch. The elements field is optional and might contain the element definition, which takes the following TLV format:

    /* Version bitmap Hello Element */
    struct ofp_hello_elem_versionbitmap {
        uint16_t type;    /* OFPHET_VERSIONBITMAP. */
        uint16_t length;  /* Length in bytes of this element. */
        /* Followed by:
         * - Exactly (length - 4) bytes containing the bitmaps, then
         * - Exactly (length + 7)/8*8 - (length) (between 0 and 7)
         *   bytes of all-zero bytes */
        uint32_t bitmaps[0]; /* List of bitmaps - supported versions */
    };

The type field should be set to OFPHET_VERSIONBITMAP. The length field should be set to the length of this element. The bitmaps field should be set to the list of the OpenFlow versions the switch supports. The number of bitmaps included in the field depends on the highest version number supported by the switch. The ofp_versions 0 to 31 should be encoded in the first bitmap, ofp_versions 32 to 63 should be encoded in the second bitmap, and so on. For example, if the switch supports only version 1.0 (ofp_versions = 0x01) and version 1.3 (ofp_versions = 0x04), then the first bitmap should be set to 0x00000012.

Refer to the send_hello_message() function in the of/openflow.c file for the procedure to build and send the OFPT_HELLO message.

Receiving the OFPT_HELLO message

The switch should be able to receive and process the OFPT_HELLO messages that are sent from the controller. The controller also uses the same message format, structures, and enumerations as defined in the previous section of this recipe.
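As a rough illustration of the version bitmap encoding described for the send path, the following C sketch sets one bit per supported version. The helper name build_version_bitmaps and the OFP_VERSION_* constant names are assumptions made for this example and are not taken from of/openflow.c. The bitmaps are produced in host byte order, so a real implementation would still need to convert each word to network byte order before transmission.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wire version numbers used in the example (names are illustrative). */
enum { OFP_VERSION_1_0 = 0x01, OFP_VERSION_1_3 = 0x04 };

/*
 * Hypothetical helper: sets one bit per supported wire version.
 * Version v is stored in word v / 32 at bit position v % 32.
 * Returns the number of 32-bit words that are in use.
 * The words are left in host byte order.
 */
static size_t build_version_bitmaps(const uint8_t *versions, size_t n_versions,
                                    uint32_t *bitmaps, size_t max_words)
{
    size_t words_used = 0;

    for (size_t i = 0; i < max_words; i++)
        bitmaps[i] = 0;

    for (size_t i = 0; i < n_versions; i++) {
        size_t word = versions[i] / 32;

        assert(word < max_words); /* caller must provide enough words */
        bitmaps[word] |= UINT32_C(1) << (versions[i] % 32);
        if (word + 1 > words_used)
            words_used = word + 1;
    }
    return words_used;
}
```

Called with the versions 0x01 and 0x04, this helper fills a single word with 0x00000012, which matches the example given for the send path.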
Once the switch receives the hello message, it should calculate the protocol version to be used for messages exchanged with the controller. The procedure required to calculate the protocol version to be used is as follows:

  • If the hello message received from the controller contains an optional OFPHET_VERSIONBITMAP element and the bitmap field contains a valid value, then the negotiated version should be the highest version common to the protocol versions supported by the switch and the versions set in the bitmap field of the OFPHET_VERSIONBITMAP element.
  • If the hello message doesn't contain any OFPHET_VERSIONBITMAP element, then the negotiated version should be the smaller of the version the switch sent in its own hello message (its highest supported version) and the version field set in the OpenFlow header of the received hello message.

If the negotiated version is supported by the switch, then the OpenFlow connection between the controller and the switch continues. Otherwise, the switch should send an OFPT_ERROR message with the type field set to OFPET_HELLO_FAILED, the code field set to OFPHFC_INCOMPATIBLE, and an optional ASCII string explaining the situation in the data field, and then terminate the connection.

There's more…

Once the switch and the controller negotiate the OpenFlow protocol version to be used, the connection setup procedure is complete. From then on, both the controller and the switch can send OpenFlow protocol messages to each other.

Sending and processing an echo request and a reply message

Echo request and reply messages are used by both the controller and the switch to maintain and verify the liveness of the controller-switch connection. Echo messages are also used to calculate the latency and bandwidth of the controller-switch connection. On reception of an echo request message, the switch should respond with an echo reply message.

How to do it...

As echo messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process them.
The following section explains these procedures in detail.

Sending the OFPT_ECHO_REQUEST message

The OpenFlow specification doesn't specify how frequently this echo message has to be sent from the switch. However, the switch might choose to send an echo request message to the controller periodically, at a configured interval. Similarly, the OpenFlow specification doesn't mention what the timeout (the longest period of time the switch should wait) for receiving the echo reply message from the controller should be. After sending an echo request message to the controller, the switch should wait for the echo reply message for the configured timeout period. If the switch doesn't receive the echo reply message within this period, then it should initiate the connection interruption procedure.

The OFPT_ECHO_REQUEST message contains an OpenFlow header followed by an undefined data field of arbitrary length. The data field might be filled with the timestamp at which the echo request message was sent, various lengths or values to measure the bandwidth, or be zero-size for just checking the liveness of the connection. In most open source implementations of OpenFlow, the echo request message only contains the header field and doesn't contain any body.

Refer to the send_echo_request() function in the of/openflow.c file for the procedure to build and send the echo_request message.

Receiving OFPT_ECHO_REQUEST

The switch should be able to receive and process OFPT_ECHO_REQUEST messages that are sent from the controller. The controller also uses the same message format, structures, and enumerations as defined in the previous section of this recipe. Once the switch receives the echo request message, it should build the OFPT_ECHO_REPLY message. This message consists of ofp_header and an arbitrary-length data field. While forming the echo reply message, the switch should copy the content present in the arbitrary-length field of the request message to the reply message.
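The copy step above can be sketched in a few lines of C. This is a minimal illustration that works directly on the raw message bytes. The function name build_echo_reply and the buffer-based interface are assumptions for this example and do not reflect the actual process_echo_request() code. It relies on the standard OpenFlow header layout (version, type, 16-bit length in network byte order, then the 32-bit xid) and on the message type values 2 and 3 for echo request and echo reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    OFP_HEADER_LEN    = 8, /* version(1) + type(1) + length(2) + xid(4) */
    OFPT_ECHO_REQUEST = 2,
    OFPT_ECHO_REPLY   = 3
};

/*
 * Hypothetical helper: builds an echo reply for the request in `req`.
 * Returns the reply length in bytes, or 0 if the request is malformed
 * or does not fit into `out`.
 */
static size_t build_echo_reply(const uint8_t *req, size_t req_len,
                               uint8_t *out, size_t out_cap)
{
    assert(req != NULL && out != NULL);

    if (req_len < OFP_HEADER_LEN || req[1] != OFPT_ECHO_REQUEST)
        return 0;

    /* The length field is big-endian and includes the header itself. */
    size_t msg_len = ((size_t)req[2] << 8) | req[3];
    if (msg_len < OFP_HEADER_LEN || msg_len > req_len || msg_len > out_cap)
        return 0;

    /* Copy header and body unchanged, then change only the type.
     * Version, length, xid, and the arbitrary-length data carry over. */
    memcpy(out, req, msg_len);
    out[1] = OFPT_ECHO_REPLY;
    return msg_len;
}
```

Copying the whole message and rewriting one byte keeps the xid and the data field identical to the request, which is what allows the sender to match the reply and compute latency from an embedded timestamp.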
Refer to the process_echo_request() function in the of/openflow.c file for the procedure to handle and process the echo request message and send the echo reply message.

Processing OFPT_ECHO_REPLY message

The switch should be able to receive the echo reply message from the controller. If the switch sent the echo request message to calculate the latency or bandwidth, then on receiving the echo reply message it should parse the arbitrary-length data field and calculate the bandwidth, latency, and so on.

There's more…

If the OpenFlow switch implementation is divided into multiple layers, then the processing of the echo request and reply should be handled in the deepest possible layer. For example, if the OpenFlow switch implementation is divided into user-space processing and kernel-space processing, then the echo request and reply message handling should be in the kernel space.

Sending and processing an error message

Error messages are used by both the controller and the switch to notify the other end of the connection about any problem. Error messages are typically used by the switch to inform the controller that a request sent from the controller failed to execute.

How to do it...

Whenever the switch wants to send the error message to the controller, it should build the OFPT_ERROR message, which takes the following message format:

    /* OFPT_ERROR: Error message (datapath -> controller). */
    struct ofp_error_msg {
        struct ofp_header header;
        uint16_t type;
        uint16_t code;
        uint8_t data[0]; /* Variable-length data. Interpreted based
                          * on the type and code. No padding. */
    };

The type field indicates a high-level type of error. The code value is interpreted based on the type. The data value is a piece of variable-length data that is interpreted based on both the type and the code. The data field should contain an ASCII text string that adds details about why the error occurred.
Unless specified otherwise, the data field should contain at least 64 bytes of the failed message that caused this error. If the failed message is shorter than 64 bytes, then the data field should contain the full message without any padding. If the switch needs to send an error message in response to a specific message from the controller (say, OFPET_BAD_REQUEST, OFPET_BAD_ACTION, OFPET_BAD_INSTRUCTION, OFPET_BAD_MATCH, or OFPET_FLOW_MOD_FAILED), then the xid field of the OpenFlow header in the error message should be set to the xid of the offending request message.

Refer to the send_error_message() function in the of/openflow.c file for the procedure to build and send an error message.

If the switch needs to send an error message for a request message sent from the controller (because of an error condition), then the switch need not send the reply message to that request.

Sending and processing an experimenter message

Experimenter messages provide a way for the switch to offer additional vendor-defined functionalities.

How to do it...

The controller sends the experimenter message with the format. Once the switch receives this message, it should invoke the appropriate vendor-specific functions.

Handling a "Get Asynchronous Configuration message" from the controller

The OpenFlow specification provides a mechanism for the controller to fetch the list of asynchronous events that can be sent from the switch to the controller channel. This is achieved by sending the "Get Asynchronous Configuration message" (OFPT_GET_ASYNC_REQUEST) to the switch.

How to do it...

The OFPT_GET_ASYNC_REQUEST message doesn't have any body other than ofp_header. On receiving this OFPT_GET_ASYNC_REQUEST message, the switch should respond with the OFPT_GET_ASYNC_REPLY message.
The switch should fill the property list with the list of asynchronous configuration events / property types that the relevant controller channel is preconfigured to receive. The switch should get this information from its internal data structures.

Refer to the process_async_config_request() function in the of/openflow.c file for the procedure to process the get asynchronous configuration request message from the controller.

Sending a packet-in message to the controller

Packet-in messages (OFPT_PACKET_IN) are sent from the switch to the controller to transfer a packet received from one of the switch ports to the controller for further processing. By default, a packet-in message should be sent to all the controllers that are in equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles. This message should not be sent to controllers that are in the slave role.

There are three ways by which the switch can send a packet-in event to the controller:

  • Table-miss entry: When there is no matching flow entry for the incoming packet, the switch can send the packet to the controller.
  • TTL checking: When the TTL value in a packet reaches zero, the switch can send the packet to the controller.
  • The "send to the controller" action in the matching entry (either the flow table entry or the group table entry) of the packet.

How to do it...

When the switch wants to send a packet received in its data path to the controller, the following message format should be used:

    /* Packet received on port (datapath -> controller). */
    struct ofp_packet_in {
        struct ofp_header header;
        uint32_t buffer_id; /* ID assigned by datapath. */
        uint16_t total_len; /* Full length of frame. */
        uint8_t reason;     /* Reason packet is being sent
                             * (one of OFPR_*) */
        uint8_t table_id;   /* ID of the table that was looked up */
        uint64_t cookie;    /* Cookie of the flow entry that was
                             * looked up. */
        struct ofp_match match; /* Packet metadata. Variable size. */
        /* The variable size and padded match is always followed by:
         * - Exactly 2 all-zero padding bytes, then
         * - An Ethernet frame whose length is inferred from header.length.
         * The padding bytes preceding the Ethernet frame ensure that IP
         * header (if any) following the Ethernet header is 32-bit aligned. */
        uint8_t pad[2];  /* Align to 64 bit + 16 bit */
        uint8_t data[0]; /* Ethernet frame */
    };

The buffer_id field should be set to the opaque value generated by the switch. When the packet is buffered, the data portion of the packet-in message should contain some bytes of data from the incoming packet. If the packet is sent to the controller because of the "send to the controller" action of a table entry, then the max_len field of ofp_action_output should be used as the size of the packet to be included in the packet-in message. If the packet is sent to the controller for any other reason, then the miss_send_len field of the OFPT_SET_CONFIG message should be used to determine the size of the packet. If the packet is not buffered, either because of unavailability of buffers or an explicit configuration via OFPCML_NO_BUFFER, then the entire packet should be included in the data portion of the packet-in message with the buffer_id value set to OFP_NO_BUFFER.

The data field should be set to the complete packet or a fraction of the packet. The total_len field should be set to the full length of the received frame, even when only a fraction of it is included in the data field. The reason field should be set to one of the following values defined in the enumeration, based on the context that triggers the packet-in event:

    /* Why is this packet being sent to the controller? */
    enum ofp_packet_in_reason {
        OFPR_TABLE_MISS = 0,   /* No matching flow (table-miss
                                * flow entry). */
        OFPR_APPLY_ACTION = 1, /* Output to the controller in
                                * apply-actions. */
        OFPR_INVALID_TTL = 2,  /* Packet has invalid TTL */
        OFPR_ACTION_SET = 3,   /* Output to the controller in action set. */
        OFPR_GROUP = 4,        /* Output to the controller in group bucket. */
        OFPR_PACKET_OUT = 5,   /* Output to the controller in packet-out. */
    };

If the packet-in message was triggered by the flow-entry "send to the controller" action, then the cookie field should be set to the cookie of the flow entry that caused the packet to be sent to the controller. This field should be set to -1 if the cookie cannot be associated with a particular flow.

When the packet-in message is triggered by the "send to the controller" action of a table entry, there is a possibility that some changes have already been applied to the packet in previous stages of the pipeline. This information needs to be carried along with the packet-in message, and it can be carried in the match field of the packet-in message with a set of OXM (short for OpenFlow Extensible Match) TLVs. If the switch includes an OXM TLV in the packet-in message, then the match field should contain a set of OXM TLVs that include context fields. The standard context fields that can be added into the OXM TLVs are OFPXMT_OFB_IN_PORT, OFPXMT_OFB_IN_PHY_PORT, OFPXMT_OFB_METADATA, and OFPXMT_OFB_TUNNEL_ID.

When the switch receives the packet on a physical port, and this packet information needs to be carried in the packet-in message, then OFPXMT_OFB_IN_PORT and OFPXMT_OFB_IN_PHY_PORT should have the same value, which is the OpenFlow port number of that physical port. When the switch receives the packet on a logical port and this packet information needs to be carried in the packet-in message, then the switch should set the logical port's port number in OFPXMT_OFB_IN_PORT and the physical port's port number in OFPXMT_OFB_IN_PHY_PORT. For example, consider a packet received on a tunnel interface defined over a Link Aggregation Group (LAG) with two member ports.
Then the packet-in message should carry the tunnel interface's port_no in the OFPXMT_OFB_IN_PORT field and the physical interface's port_no in the OFPXMT_OFB_IN_PHY_PORT field.

Refer to the send_packet_in_message() function in the of/openflow.c file for the procedure to send a packet-in message event to the controller.

How it works...

The switch can send either the entire packet it receives from the switch port to the controller, or only a fraction of the packet. When the switch is configured to send only a fraction of the packet, it should buffer the packet in its memory and send a portion of the packet data. This is controlled by the switch configuration. If the switch is configured to buffer the packet, and it has sufficient memory to buffer the packet, then the packet-in message should contain the following:

  • A fraction of the packet. This is the size of the packet to be included in the packet-in message, configured via the switch configuration message. By default, it is 128 bytes. When the packet-in message results from a table-entry action, the output action itself can specify the size of the packet to be sent to the controller. For all other packet-in messages, it is defined in the switch configuration.
  • The buffer ID to be used by the controller when the controller wants to forward the message at a later point in time.

There's more…

A switch that implements buffering is expected to expose some details through documentation, such as the number of available buffers and the period of time for which the buffered data will be available. The switch should implement the procedure to release the buffered packet when there is no response from the controller to the packet-in event.

Sending a flow-removed message to the controller

A flow-removed message (OFPT_FLOW_REMOVED) is sent from the switch to the controller when a flow entry is removed from the flow table.
This message should be sent to the controller only when the OFPFF_SEND_FLOW_REM flag in the flow entry is set. The switch should send this message only on the controller channels on which the controller has requested this event. The controller can express its interest in receiving this event by sending an asynchronous configuration message to the switch. By default, OFPT_FLOW_REMOVED should be sent to all the controllers that are in equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles. This message should not be sent to a controller that is in the slave role.

How to do it...

When the switch removes an entry from the flow table, it should build an OFPT_FLOW_REMOVED message with the following format and send this message to the controllers that have already shown interest in this event:

    /* Flow removed (datapath -> controller). */
    struct ofp_flow_removed {
        struct ofp_header header;
        uint64_t cookie;        /* Opaque controller-issued identifier. */
        uint16_t priority;      /* Priority level of flow entry. */
        uint8_t reason;         /* One of OFPRR_*. */
        uint8_t table_id;       /* ID of the table */
        uint32_t duration_sec;  /* Time flow was alive in seconds. */
        uint32_t duration_nsec; /* Time flow was alive in nanoseconds
                                 * beyond duration_sec. */
        uint16_t idle_timeout;  /* Idle timeout from original flow mod. */
        uint16_t hard_timeout;  /* Hard timeout from original flow mod. */
        uint64_t packet_count;
        uint64_t byte_count;
        struct ofp_match match; /* Description of fields. Variable size. */
    };

The cookie field should be set to the cookie of the flow entry, the priority field should be set to the priority of the flow entry, and the reason field should be set to one of the following values defined in the enumeration:

    /* Why was this flow removed? */
    enum ofp_flow_removed_reason {
        OFPRR_IDLE_TIMEOUT = 0, /* Flow idle time exceeded idle_timeout. */
        OFPRR_HARD_TIMEOUT = 1, /* Time exceeded hard_timeout. */
        OFPRR_DELETE = 2,       /* Evicted by a DELETE flow mod. */
        OFPRR_GROUP_DELETE = 3, /* Group was removed. */
        OFPRR_METER_DELETE = 4, /* Meter was removed. */
        OFPRR_EVICTION = 5,     /* Switch eviction to free resources. */
    };

The duration_sec and duration_nsec fields should be set to the elapsed time of the flow entry in the switch. The total duration in nanoseconds can be computed as duration_sec * 10^9 + duration_nsec. All the other fields, such as idle_timeout, hard_timeout, and so on, should be set to the appropriate value from the flow entry; that is, these values can be directly copied from the flow mod that created this entry. The packet_count and byte_count fields should be set to the packet count and the byte count associated with the flow entry, respectively. If the values are not available, then these fields should be set to the maximum possible value.

Refer to the send_flow_removed_message() function in the of/openflow.c file for the procedure to send a flow removed event message to the controller.

Sending a port-status message to the controller

Port-status messages (OFPT_PORT_STATUS) are sent from the switch to the controller when there is any change in the port status or when a new port is added, removed, or modified in the switch's data path. The switch should send this message only on the controller channels on which the controller has requested it. The controller can express its interest in receiving this event by sending an asynchronous configuration message to the switch. By default, the port-status message should be sent to all configured controllers in the switch, including the controller in the slave role (OFPCR_ROLE_SLAVE).

How to do it...
The switch should construct an OFPT_PORT_STATUS message with the following format and send this message to the controllers that have already shown interest in this event:

    /* A physical port has changed in the datapath */
    struct ofp_port_status {
        struct ofp_header header;
        uint8_t reason; /* One of OFPPR_*. */
        uint8_t pad[7]; /* Align to 64-bits. */
        struct ofp_port desc;
    };

The reason field should be set to one of the following values as defined in the enumeration:

    /* What changed about the physical port */
    enum ofp_port_reason {
        OFPPR_ADD = 0,    /* The port was added. */
        OFPPR_DELETE = 1, /* The port was removed. */
        OFPPR_MODIFY = 2, /* Some attribute of the port has changed. */
    };

The desc field should be set to the port description. The switch need not fill all the properties in the port description. It should fill the properties that have changed, whereas the unchanged properties can be included optionally.

Refer to the send_port_status_message() function in the of/openflow.c file for the procedure to send port_status_message to the controller.

Sending a controller role-status message to the controller

Controller role-status messages (OFPT_ROLE_STATUS) are sent from the switch to the set of controllers when the role of a controller is changed as a result of an OFPT_ROLE_REQUEST message. For example, if there are three controllers connected to a switch (say controller1, controller2, and controller3) and controller1 sends an OFPT_ROLE_REQUEST message to the switch, then the switch should send an OFPT_ROLE_STATUS message to controller2 and controller3.

How to do it...

The switch should build the OFPT_ROLE_STATUS message with the following format and send it to all the other controllers:

    /* Role status event message. */
    struct ofp_role_status {
        struct ofp_header header; /* Type OFPT_ROLE_REQUEST /
                                   * OFPT_ROLE_REPLY. */
        uint32_t role;            /* One of OFPCR_ROLE_*. */
        uint8_t reason;           /* One of OFPCRR_*. */
        uint8_t pad[3];           /* Align to 64 bits. */
        uint64_t generation_id;   /* Master Election Generation Id */
        /* Role Property list */
        struct ofp_role_prop_header properties[0];
    };

The reason field should be set to one of the following values as defined in the enumeration:

    /* What changed about the controller role */
    enum ofp_controller_role_reason {
        OFPCRR_MASTER_REQUEST = 0, /* Another controller asked
                                    * to be master. */
        OFPCRR_CONFIG = 1,         /* Configuration changed on the
                                    * switch. */
        OFPCRR_EXPERIMENTER = 2,   /* Experimenter data changed. */
    };

The role should be set to the new role of the controller. The generation_id should be set to the generation ID of the OFPT_ROLE_REQUEST message that triggered the OFPT_ROLE_STATUS message. If the reason code is OFPCRR_EXPERIMENTER, then the role property list should be set in the following format:

    /* Role property types. */
    enum ofp_role_prop_type {
        OFPRPT_EXPERIMENTER = 0xFFFF, /* Experimenter property. */
    };

    /* Experimenter role property */
    struct ofp_role_prop_experimenter {
        uint16_t type;         /* One of OFPRPT_EXPERIMENTER. */
        uint16_t length;       /* Length in bytes of this property. */
        uint32_t experimenter; /* Experimenter ID which takes the same
                                * form as struct
                                * ofp_experimenter_header. */
        uint32_t exp_type;     /* Experimenter defined. */
        /* Followed by:
         * - Exactly (length - 12) bytes containing the experimenter data,
         * - Exactly (length + 7)/8*8 - (length) (between 0 and 7)
         *   bytes of all-zero bytes */
        uint32_t experimenter_data[0];
    };

The experimenter field should take the same form as the experimenter ID in struct ofp_experimenter_header.

Refer to the send_role_status_message() function in the of/openflow.c file for the procedure to send a role status message to the controller.
Sending a table-status message to the controller

Table-status messages (OFPT_TABLE_STATUS) are sent from the switch to the controller when there is any change in the table status; for example, when the number of entries in the table crosses a threshold value, called the vacancy threshold. The switch should send this message only on the controller channels on which the controller has requested it. The controller can express its interest in receiving this event by sending the asynchronous configuration message to the switch.

How to do it...

The switch should build an OFPT_TABLE_STATUS message with the following format and send this message to the controllers that have already shown interest in this event:

    /* A table config has changed in the datapath */
    struct ofp_table_status {
        struct ofp_header header;
        uint8_t reason;              /* One of OFPTR_*. */
        uint8_t pad[7];              /* Pad to 64 bits */
        struct ofp_table_desc table; /* New table config. */
    };

The reason field should be set to one of the following values defined in the enumeration:

    /* What changed about the table */
    enum ofp_table_reason {
        OFPTR_VACANCY_DOWN = 3, /* Vacancy down threshold event. */
        OFPTR_VACANCY_UP = 4,   /* Vacancy up threshold event. */
    };

When the number of free entries in the table crosses the vacancy_down threshold, the switch should set the reason code to OFPTR_VACANCY_DOWN. Once a vacancy down event is generated by the switch, the switch should not generate any further vacancy down event until a vacancy up event is generated. When the number of free entries in the table crosses the vacancy_up threshold value, the switch should set the reason code to OFPTR_VACANCY_UP. Again, once a vacancy up event is generated by the switch, the switch should not generate any further vacancy up event until a vacancy down event is generated. The table field should be set to the table description.
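The alternation between vacancy down and vacancy up events described above is a simple hysteresis, and it can be sketched as follows. This is only an illustration under stated assumptions: the vacancy_state structure, the vacancy_check() function, the use of raw free-entry counts as thresholds, and the strict comparisons are all hypothetical choices for this example and are not taken from of/openflow.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum ofp_table_reason {
    OFPTR_VACANCY_DOWN = 3, /* Vacancy down threshold event. */
    OFPTR_VACANCY_UP   = 4  /* Vacancy up threshold event. */
};

/* Hypothetical per-table bookkeeping for vacancy events. */
struct vacancy_state {
    uint32_t vacancy_down; /* DOWN fires when free entries drop below this */
    uint32_t vacancy_up;   /* UP fires when free entries rise above this */
    bool     down_armed;   /* true while the next event must be a DOWN */
};

/*
 * Call after every change to the number of free entries.
 * Returns the reason code to report, or -1 when no event is due.
 */
static int vacancy_check(struct vacancy_state *st, uint32_t free_entries)
{
    assert(st->vacancy_down <= st->vacancy_up);

    if (st->down_armed && free_entries < st->vacancy_down) {
        st->down_armed = false; /* no further DOWN until an UP occurs */
        return OFPTR_VACANCY_DOWN;
    }
    if (!st->down_armed && free_entries > st->vacancy_up) {
        st->down_armed = true;  /* no further UP until a DOWN occurs */
        return OFPTR_VACANCY_UP;
    }
    return -1;
}
```

Keeping a single armed flag per table is enough to guarantee that the two event types strictly alternate, so a table whose occupancy hovers around one threshold does not flood the controller with repeated events.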
Refer to the send_table_status_message() function in the of/openflow.c file for the procedure to send a table status message to the controller.

Sending a request-forward message to the controller

When the switch receives a modify request message from the controller to modify the state of group or meter entries, then after successful modification of the state, the switch should forward this request message to all other controllers as a request-forward message (OFPT_REQUESTFORWARD). The switch should send this message only on the controller channels on which the controller has requested this event. The controller can express its interest in receiving this event by sending an asynchronous configuration message to the switch.

How to do it...

The switch should build the OFPT_REQUESTFORWARD message with the following format, and send this message to the controllers that have already shown interest in this event:

    /* Group/Meter request forwarding. */
    struct ofp_requestforward_header {
        struct ofp_header header;  /* Type OFPT_REQUESTFORWARD. */
        struct ofp_header request; /* Request being forwarded. */
    };

The request field should be set to the request that was received from the controller.

Refer to the send_request_forward_message() function in the of/openflow.c file for the procedure to send request_forward_message to the controller.

Handling a packet-out message from the controller

Packet-out (OFPT_PACKET_OUT) messages are sent from the controller to the switch when the controller wishes to send a packet out through the switch's data path via a switch port.

How to do it...

There are two ways in which the controller can send a packet-out message to the switch:

  • Construct the full packet: In this case, the controller generates the complete packet and adds the action list field to the packet-out message. The action field contains a list of actions defining how the packet should be processed by the switch. If the switch receives a packet_out message with buffer_id set to OFP_NO_BUFFER, then the switch should look into the action list, and based on the action to be performed, it can do one of the following:
      • Modify the packet and send it via the switch port mentioned in the action list
      • Hand over the packet to OpenFlow's pipeline processing, based on the OFPP_TABLE specified in the action list
  • Use a packet buffer in the switch: In this mechanism, the switch should use the buffer that was created at the time of sending the packet-in message to the controller. While sending the packet_in message to the controller, the switch adds the buffer_id to the packet_in message. When the controller wants to send a packet_out message that uses this buffer, the controller includes this buffer_id in the packet_out message. On receiving the packet_out message with a valid buffer_id, the switch should fetch the packet from the buffer and send it via the switch port. Once the packet is sent out, the switch should free the memory allocated to the buffer in which the packet was cached.

Handling a barrier message from the controller

The switch implementation could arbitrarily reorder the messages sent from the controller to maximize its performance. So, if the controller wants to enforce the processing of the messages in order, barrier messages are used. Barrier messages (OFPT_BARRIER_REQUEST) are sent from the controller to the switch to ensure message ordering. The switch should not reorder any messages across the barrier message. For example, if the controller is sending a group add message, followed by a flow add message referencing the group, then the order of these two messages should be preserved by using a barrier message.

How to do it...

When the controller wants to send messages that are related to each other, it sends a barrier message between these messages.
The switch should process these messages as follows:

Messages before a barrier request should be processed fully before the barrier, including sending any resulting replies or errors.

The barrier request message should then be processed and a barrier reply should be sent. While sending the barrier reply message, the switch should copy the xid value from the barrier request message.

The switch should then process the remaining messages.

Neither the barrier request nor the barrier reply message has a body. Each consists only of the ofp_header.

Summary

This article covers the list of symmetric and asynchronous messages sent and received by the OpenFlow switch, along with the procedure for handling these messages.
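The barrier-handling steps described in this article can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the code from of/openflow.c: the function names build_barrier_reply() and try_complete_barrier(), the to_be16() helper, and the pending-message counter are hypothetical. The ofp_header layout and the numeric message types follow the OpenFlow 1.3 and 1.4 specifications.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Message type values as defined in OpenFlow 1.3 and 1.4. */
enum { OFPT_BARRIER_REQUEST = 20, OFPT_BARRIER_REPLY = 21 };

/* Common OpenFlow header. Barrier messages consist of this header only. */
struct ofp_header {
    uint8_t  version; /* Wire protocol version, for example 0x05 for 1.4. */
    uint8_t  type;    /* One of the OFPT_ constants. */
    uint16_t length;  /* Total message length, network byte order. */
    uint32_t xid;     /* Transaction id, echoed back in the reply. */
};

/* Hypothetical helper: convert a 16-bit value to network byte order. */
static uint16_t to_be16(uint16_t v)
{
    uint8_t bytes[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) };
    uint16_t out;
    memcpy(&out, bytes, sizeof(out));
    return out;
}

/* Hypothetical helper: fill *reply for a given barrier request. */
void build_barrier_reply(const struct ofp_header *req,
                         struct ofp_header *reply)
{
    assert(req->type == OFPT_BARRIER_REQUEST);
    memset(reply, 0, sizeof(*reply));
    reply->version = req->version;
    reply->type    = OFPT_BARRIER_REPLY;
    reply->length  = to_be16((uint16_t)sizeof(*reply));
    reply->xid     = req->xid; /* Copy the xid from the request. */
}

/*
 * Hypothetical helper: pending_before_barrier is the number of earlier
 * messages that have not finished processing yet. Returns 1 and fills
 * *reply once nothing is pending; returns 0 if the barrier must wait.
 */
int try_complete_barrier(unsigned pending_before_barrier,
                         const struct ofp_header *req,
                         struct ofp_header *reply)
{
    if (pending_before_barrier > 0) {
        return 0; /* Earlier messages must complete first. */
    }
    build_barrier_reply(req, reply);
    return 1;
}
```

A caller would invoke try_complete_barrier() each time an earlier message completes, send the reply once it returns 1, and only then start on the messages that arrived after the barrier.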

Packt
05 May 2015
8 min read

Installation and Upgrade

In this article by Robert Hedblom, author of the book Microsoft System Center Data Protection Manager Cookbook, we will cover the installation of SQL Server for a DPM server and the prerequisites for starting your upgrade process. You will learn how to:

Install a SQL Server locally on the DPM server

Prepare a remote SQL Server for DPMDB

The final result of an installation will never be better than the design and implementation of the applications it depends on. A common mistake is misconfiguring the SQL Server instances that the System Center applications depend on. If you give System Center a poorly configured SQL Server or insufficient resources, you will end up with a poor installation of an application that could be part of the services you would like to provision within your modern data center. In the end, a System Center application can never work faster than the underlying architecture or technology allows. With proper planning and a decent design, you can also make your installation scalable, so that your System Center application remains suitable for future scenarios.

One important note regarding the upgrade scenario for the System Center Data Protection Manager software is that there is no built-in rollback feature. If your upgrade fails, there is no easy way to restore your DPM server to its former running state. Always remember to use supported scenarios for your solution. Never take any shortcuts, because there aren't any.

Installing a SQL Server locally on the DPM server

This recipe covers the installation of a local SQL Server that is collocated with the DPM server on the same operating system.

Getting ready

SQL Server is a core component for System Center Data Protection Manager.
It is very important that the installation and design of SQL Server are well planned and implemented. An undersized SQL Server installation will degrade your experience of operating System Center Data Protection Manager.

How to do it…

Make sure that your operating system is fully patched and rebooted before you start the installation of SQL Server 2012, and that the DPM Admins group is a member of the local administrators group. Now take the following steps:

Insert the SQL Server media and start the SQL Server setup. In the SQL Server Installation Center, click on New SQL Server stand-alone installation…

The Setup Support Rules will start and will identify any problems that might occur during the SQL Server installation. When the operation is complete, click on OK to continue.

In the Product Key step, enter the product key and click on Next > to continue.

Next is the License Terms step, where you check the I accept the license terms checkbox if you agree with the license terms. Click on Next > to continue.

The SQL Server installation will check whether any product updates are available from the Microsoft update service. Check the Include SQL Server product updates checkbox and click on Next > to continue.

Next is the Install Setup Files step, which initializes the actual installation. When the tasks have finished, click on Install to continue.

Verify that all the rules have passed in the Setup Support Rules step of the SQL Server installation process. Resolve any warnings or errors and click the Re-run button to run the verification again. If all the rules have passed, click on Next > to continue.

In the Setup Role step, select SQL Server Feature Installation and click on Next >.

In the Feature Selection step, choose the SQL Server features that you would like to install.
System Center Data Protection Manager requires:

Database Engine Services

Full-Text and Semantic Extractions for Search

Reporting Services - Native

As an option, you can also install SQL Server Management Studio on the same operating system as the DPM server. Those components are found under Management Tools; check both Basic and Complete. Click on Next > to continue.

Verify the Installation Rules step, resolve any errors, and click on Next > to continue.

In the Instance Configuration step, select Named instance and type in a suitable name for your SQL Server instance. Click on the button next to the Instance root directory and select the volume that should host the DPMDB. Click on Next > to continue.

Verify that there are no problems in the Disk Space Requirements step, resolve any issues, and click on Next > to continue.

In the Server Configuration step, type in the credentials for the dedicated service account you would like to use for this SQL Server. Switch the Startup Type to Automatic for the SQL Server Agent. When all the credentials are filled in, click on the Collation tab.

In the Collation tab, you must enter the collation for the database engine. System Center Data Protection Manager must have the SQL_Latin1_General_CP1_CI_AS collation. Click on the Customize… button to choose the correct collation and then Next > to continue.

The next step is the Database Engine Configuration step, where you enter the authentication security mode, administrators, and directories. In the Authentication Mode section, choose Windows Authentication mode. In the Specify SQL Server administrators section, add the DPM Admins group and click on the Data Directories tab to verify that all your SQL Server configurations point to the dedicated disk. Click on Next > to continue.

In the Reporting Services Configuration step, configure SQL Server Reporting Services (SSRS). For the Reporting Services Native Mode, choose Install and configure and click on Next > to continue.
The next step is Error Reporting. Choose the defaults and click on Next > to continue.

In the Installation and Configuration Rules step, verify that all operations pass the rules. Resolve any warnings or errors and click the Re-run button for another verification. When all operations have passed, click on Next > to continue.

Verify the configuration in the Ready to Install step and click on Install to start the installation.

The Installation Progress step will show the current status of the installation process. When the installation is done, the SQL Server 2012 Setup will show you a summary in the Complete step. That is the final page of the SQL Server 2012 installation wizard.

Click on the Close button to end the SQL Server 2012 Setup.

How it works…

SQL Server is a very important component for the System Center family. If the SQL Server is undersized or misconfigured in any way, the performance of System Center will suffer. It is crucial to plan, design, and measure the performance of the SQL Server so that you know it will fit the scale you are planning for and the workloads that it should host.

Preparing a remote SQL Server for DPMDB

This recipe covers the procedure to prepare a remote SQL Server for hosting the DPMDB.

Getting ready

If you build a large hosted DPM service solution delivering BaaS (Backup as a Service), RaaS (Restore as a Service), or DRaaS (Disaster Recovery as a Service) within your modern data center, you may want to use a dedicated backend SQL Server that is either standalone or clustered for high availability. It is not advisable to use SQL Server Always-On to host the DPMDB. Regardless of whether you put the DPMDB on a cluster or a backend standalone SQL Server, you still need to perform some initial configuration prior to the actual DPM server installation.
How to do it…

After installing your backend SQL Server solution, you must prepare it for hosting the DPMDB.

Insert the DPM 2012 R2 media and run the setup. In the setup screen, click on the DPM Remote SQL Prep link.

The installation wizard will start and install the DPM 2012 R2 Support Files; this is a very quick installation.

When the installation has finished, a message box reports that the System Center 2012 R2 DPM Support Files have been successfully installed.

How it works…

The support files for SQL Server are installed on the backend SQL Server machine and are used when the DPM server connects and creates its database.

There's more…

For the DPM server installation to succeed when you place the DPMDB on a backend SQL Server solution, you need to install the SQL Server 2012 Service Pack 1 Tools that are located in the catalogueSCDPMSQLSRV2012SP1 directory on the DPM media.

Summary

In this article, we learned how to install a SQL Server locally on the DPM server and how to prepare a remote SQL Server for hosting the DPMDB. We also covered the prerequisites for starting the upgrade process for System Center Data Protection Manager.