Getting Started with Force.com

Packt
15 Apr 2016
17 min read
In this article, Siddhesh Kabe, the author of the book Salesforce Platform App Builder Certification Handbook, introduces you to the Force.com platform. We will look at the life cycle of an application built using Force.com, define the multi-tenant architecture and understand how it impacts an organization's data stored in the cloud, and finally build our first application on Force.com. We will cover the following topics in this article:

- The multi-tenant architecture of Force.com
- Understanding the Force.com platform
- Application development on the Force.com platform
- Discussing the maintenance and release schedule of Salesforce.com
- Types of Force.com applications
- Discussing the scenarios when to use point-and-click customization and when to use code
- Talking about Salesforce.com identity
- Developer resources

So, let's get started and step into the cloud.

The cloud computing model of Force.com

Force.com is a cloud computing platform used to build enterprise applications. The end user does not have to worry about networks, hardware, software licenses, or anything else, and the data saved in the cloud is completely secure. The following features of Force.com make it a 100 percent cloud-based system:

- The multi-tenant architecture: The multi-tenant architecture is a way of serving multiple clients on a single software instance. Each client gets their own full version of the software configuration and data and cannot use another instance's resources; the software is virtually partitioned into different instances. The basic structure of the multi-tenant architecture is shown in the following image. Just as tenants in a single building share the electricity and water supply, tenants in a multi-tenant system share common resources and databases. In a multi-tenant system such as Salesforce.com, different organizations use the same shared database system, separated by a secure virtual partition. Special programs keep the data separated and make sure that no single organization monopolizes the resources.
- Automatic upgrades: In a cloud computing system, all new updates are automatically released to subscribers. Any developments or customizations made on the previous version are automatically carried over to the latest version without any manual modification to the code. As a result, all instances of Salesforce stay up to date and on the same version.
- Subscription model: Force.com is distributed under a subscription model. The user can purchase a few licenses and build the system; after the system is up and successful, further user licenses can be purchased from Salesforce. This model ensures that there are no large start-up fees and that we pay as we go, which keeps future costs fixed and predictable. The subscription model can be visualized like the electricity distribution system: we pay for whatever electricity we use, not for the complete generator and infrastructure.
- Scalability: The multi-tenant kernel is already tested and running for many users simultaneously. If the organization is growing, there is always room for scaling the application to new users without worrying about load balancing and data limitations. Force.com provides data storage per user, which means that the data storage increases with the number of users added to the organization.
- Upgrades and maintenance: Force.com releases three updated versions every year. The new releases consist of feature updates to Salesforce.com and the Force.com platform, with selected top ideas from IdeaExchange. IdeaExchange is the community of Salesforce users where users submit ideas and the community votes for them; the most popular ideas are considered by Salesforce for the next release. All the instances hosted on the servers are upgraded at no additional cost. The Salesforce maintenance outage during a major release is only 5 minutes, and sandboxes are upgraded early so compatibility with the new release can be tested. The new releases are backward compatible with previous releases, so old code keeps working with new versions. The upgrades are taken care of by Force.com, and the end user always runs the latest version of the application.

Understanding the new model of the Salesforce1 platform

In the earlier edition of the book, we discussed the Force.com platform in detail. In the last couple of years, Salesforce has introduced the new Salesforce1 platform. It encompasses all the existing features of the Force.com platform but also includes powerful new tools for mobile development. The Salesforce1 platform is built mobile first, and all the existing features of cloud development are automatically available on mobile. From Winter '16, Salesforce has also introduced the Lightning Experience. The Lightning Experience is another extension to the existing platform; it provides a brand new set of design and development libraries that let developers build applications that work on mobile as well as the web. Let's take a detailed look at the services that form the platform offered by Force.com. The following section provides an overview of the Force.com platform.

Force.com platform

Force.com is the world's first cloud application development platform, where end users can build, share, and run an application directly on the cloud. While most cloud computing systems provide the ability to deploy code from a local machine, Force.com lets us write the code directly in the cloud. The Force.com platform runs in a hosted multi-tenant environment, which gives end users the freedom to build their custom application without hardware purchases, database maintenance, or software licenses. Salesforce.com provides the following main products:

- Sales Force Automation (Sales Cloud)
- Service and Support Center (Service Cloud)
- The ExactTarget Marketing Cloud
- Collaboration Cloud (Chatter)

The following screenshot shows the Force.com platform. An application built on Force.com is automatically hosted on the cloud platform. It can be used separately (without the standard Sales, Service, and Marketing clouds) or in parallel with the existing Salesforce application. Users can access the application using a browser from any mobile, computer, or tablet, and on any operating system, such as Windows, UNIX, or Mac, giving them complete freedom of location. For a complete list of supported browsers, visit https://help.salesforce.com/apex/HTViewHelpDoc?id=getstart_browser_overview.htm.

Model-View-Controller architecture

The most efficient way to build an enterprise application is to clearly separate the model (the data), the controller (the code), and the view (the UI). By separating the three, we can make sure that each area is handled by an expert, and the business logic is separated from the backend database and the frontend user interface.
It is also easy to upgrade one part of the system without disturbing the entire structure. The following diagram illustrates the Model-View-Controller architecture of Force.com. We will look at each layer of the MVC architecture in detail in a subsequent article.

Key technology behind the Force.com platform

Force.com is a hosted multi-tenant service used to build custom cloud computing applications. It is a 100 percent cloud platform where we pay no extra cost for hardware and networking. Any application built on Force.com is directly hosted on the cloud and can be accessed using a simple browser from a computer or a mobile device. The Force.com platform runs on some basic key technologies.

The multi-tenant kernel

The base of the platform is a multi-tenant kernel where all users share a common code base and physical infrastructure. The multiple tenants hosted on a shared server share resources under governor limits that prevent a single instance from monopolizing the resources. Custom code and data are separated by software virtualization, and users cannot access each other's code. The multi-tenant kernel ensures that all instances are updated to the latest version of the software simultaneously; the updates are applied automatically, without any patches or software downloads. The multi-tenant architecture is already live for over a million users, which helps developers easily scale applications from one to a million users with little or no modification at all. The following image illustrates the multi-tenant architecture. Traditional software systems are hosted on a single-tenant system, usually a client-server enterprise application. With the multi-tenant architecture, the end user does not have to worry about the hardware layer or about software upgrades and patches. A software system deployed over the Internet can be accessed using a browser from any location, even from a wide range of mobile devices. The multi-tenant architecture also allows applications to be low cost, quick to deploy, and open to innovation. Other examples of software using the multi-tenant architecture are webmail systems, such as Gmail and Yahoo Mail, online storage systems, such as Dropbox, and note-taking applications, such as Evernote and Springpad.

Force.com metadata

Force.com is entirely metadata-driven. The metadata is defined in XML and can be extracted and imported. We will look into metadata in detail later in this article.

Force.com Webservice API

The data and the metadata stored on the Force.com servers can be accessed programmatically through the Webservice API. This enables developers to extend the functionality to virtually any language, operating system, and platform. The web services are based on open web standards, such as SOAP XML and JSON REST, and are directly compatible with other technologies, such as .NET, Java, SAP, and Oracle. We can easily integrate a Force.com application with existing business applications without rewriting the entire code base.

Apex and Visualforce

Apex is the world's first on-demand language, introduced by Salesforce. It is an object-oriented language very similar to C# or Java, specially designed to process bulk data for business applications. Apex is used to write the controller in the MVC architecture. Salesforce Object Query Language (SOQL) gives developers an easy, declarative, human-readable query language that can fetch and process large amounts of data. For those who have used other relational database systems, such as Oracle or SQL Server, it is similar to SQL but does not support advanced capabilities such as joins. Apex and SOQL together give developers powerful tools to manage the data and processes of their application, leaving the rest of the overhead to the Force.com platform.
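As a quick illustration of how SOQL and the Webservice API described above fit together, the following sketch runs a SOQL query through the Force.com REST API from Python. It is not taken from the book; the instance URL, API version, and access token are placeholders you would obtain from your own org's OAuth setup, and the requests library is assumed to be installed.

```python
import requests

# Placeholders: substitute your own org's instance URL and a valid OAuth access token.
INSTANCE_URL = 'https://yourInstance.salesforce.com'
ACCESS_TOKEN = 'REPLACE_WITH_ACCESS_TOKEN'

# Run a simple SOQL query through the REST query endpoint (API version assumed).
response = requests.get(
    INSTANCE_URL + '/services/data/v36.0/query',
    params={'q': 'SELECT Id, Name FROM Account LIMIT 10'},
    headers={'Authorization': 'Bearer ' + ACCESS_TOKEN},
)
response.raise_for_status()

# The result is JSON containing a 'records' list; print each account name.
for record in response.json()['records']:
    print(record['Name'])
```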
The following screenshot shows the page editor for Visualforce. It is easy to use and splits the page into two parts: the bottom half is for development and the top half shows the output. Visualforce is an easy-to-use yet powerful framework used to create rich user interfaces, extending the standard tabs and forms to any kind of interface imaginable. Visualforce ultimately renders into HTML, so we can use any HTML code alongside the Visualforce markup to create a powerful and rich UI for business applications. Apart from the UI, Visualforce provides very easy and direct access to server-side data and metadata from Apex. The powerful combination of a rich UI with access to the Salesforce metadata makes Visualforce the ultimate solution for building powerful business applications on Salesforce. As the Salesforce.com Certified Force.com Developer Certification does not include Apex and Visualforce, we won't go into detail about them in this book.

The developer console

The developer console is an Integrated Development Environment (IDE) with tools that help you write code, run tests, and debug the system. The developer console provides an editor for writing code as well as a UI to monitor and debug unit test classes, as shown in the following screenshot.

AppExchange

AppExchange is the directory of applications built on the Force.com platform. Developers can choose to submit their applications to AppExchange. These applications extend the functionality of Force.com beyond CRM, with many ready-made business applications available to download and use. AppExchange is available at http://appexchange.salesforce.com.

Force.com sites

Using Force.com sites or Site.com, we can build public-facing websites that use the existing Salesforce data and browser technologies, such as HTML, JavaScript, CSS, AngularJS, Bootstrap, and so on. The sites can have an external login for sensitive data or a no-login public portal that can be linked to the corporate website as well. Site.com helps in creating websites using drag-and-drop controls; a user with little or no HTML knowledge can build websites using the Site.com editor.

Force.com development

Like any other traditional software development process, the Force.com platform offers tools used to define the data, business processes, logic, and rich UI of a business application. Many of these tools are built-in, point-and-click tools simplified for users without any development skills, so a user with no programming knowledge can build applications suitable to their business on Force.com. The point-and-click tools are easy to use, but they have limits on flexibility and control. To extend the platform beyond these limitations, we use Apex and Visualforce.
Let's now compare the tools used for traditional software development and Force.com:

| | Java | .NET | Force.com |
| --- | --- | --- | --- |
| Building the database | Oracle, MS Access, SQL, or any third-party database setup | Oracle, MS Access, SQL, or any third-party database setup | Salesforce metadata (now Database.com) |
| Connecting to the database | JDBC | ADO.NET | Salesforce metadata API |
| Development IDE | NetBeans, Eclipse, and so on | Visual Studio | Online Page Editor and App Setup, Force.com IDE, MavensMate, and Aside.io |
| Controlled environment for development and testing | Local servers, remote test servers | Local servers, remote test servers | Force.com real-time sandboxes |

Force.com metadata

Everything on Force.com, such as data models, objects, forms, tabs, and workflows, is defined by metadata. The definitions, or metadata, are expressed in XML and can be extracted and imported. Metadata-driven development also helps users with no prior development experience build business applications without any need to code: we can define objects, tabs, and forms in the UI using point-and-click, and all the changes made to the metadata in App Setup are tracked. Alternatively, developers can customize every part of Salesforce using the XML files that control the organization's metadata. The files are downloaded using the Eclipse IDE or the Force.com IDE. To customize metadata in the Salesforce UI, go to Setup | Build. As the Force.com Developer Certification is about using point-and-click, we will go into the setup details in the coming article.

Metadata API

The metadata API provides easy access to the organization's data, business logic, and user interface. We can modify the metadata in a controlled test organization called a sandbox, and finally deploy the tested changes to the live production environment. The production environment is the live environment that is used by the users and contains live data. The production instance does not allow developers to code in it directly; this ensures that only debugged and tested code reaches the live organization.

Online page editor and the Eclipse Force.com IDE

Force.com provides a built-in online editor that helps edit Visualforce pages in real time. The online editor can be enabled by checking the Development Mode checkbox on the user profile, as shown in the following screenshot. The online page editor splits the screen into two parts, with live code in the bottom half and the final page output in the top half. Force.com also provides an inline editor for editing Apex code in the browser itself. The Force.com IDE is an IDE built on top of Eclipse. It provides an easy environment to write code, along with offline saving. It also comes with a schema browser and a query generator, which helps in generating simple queries (SELECT statements) by selecting fields and objects. The code is automatically synced with the organization.

Sandboxes

Force.com provides a real-time environment to develop, test, and train people in the organization. It is a safe and isolated environment where any changes made will not affect the production data or application. Sandboxes are used to experiment with new features without disturbing the live production organization, and the separation of test and development instances ensures that only tested and verified code reaches the production organization. There are four types of sandboxes:

- Developer sandbox: This environment is used for coding and testing by a single developer. Just like the configuration-only sandbox, it copies the entire customization of the production organization, excluding the data. The added feature of a developer sandbox is that it also allows Apex and Visualforce coding.
- Developer pro sandbox: Developer pro sandboxes are similar to developer sandboxes but with larger storage. This sandbox is mostly used to handle more development and quality assurance tasks; with a larger sandbox, we can store more data and run more demanding tasks.
- Partial copy sandbox: This is used as a testing environment. It copies the full metadata of the production environment and a subset of production data that can be selected using a template.
- Full copy sandbox: This copies the entire production organization and all its data, records, documents, and attachments. It is usually used to develop and test a new application until it is ready to be shared with the users. A full copy sandbox has the same record IDs as production only when it has been freshly created.

Force.com application types

There are some common types of applications that are required to automate an enterprise process. They are as follows:

- Content-centric applications: These applications enable organizations to share and version content across different levels. They consist of file-sharing systems, versioning systems, and content management systems.
- Transaction-centric applications: These applications focus on transactions; they include banking systems, online payment systems, and so on.
- Process-centric applications: These applications focus on automating the business processes of an organization, such as a bug tracking system, a procurement process, or an approval process. Force.com is well suited to building these kinds of applications.
- Data-centric applications: These applications are built around a powerful database; many organizations use spreadsheets for them today. Examples include CRM, HRM, and so on. Force.com is well suited to building these kinds of applications.

Developing on the Force.com platform

There are two ways of developing on Force.com: one is to use point-and-click without a single line of code, called declarative development; the other is to develop an application using code, called programmatic development. Let's take a look at the two types of development in detail.

Declarative development

Declarative development is done by point-and-click using a browser. We use ready-to-use components and modify their configuration to build applications. We can add new objects, define their standard views, and create input forms with simple point-and-click and no coding knowledge. The declarative framework allows rapid development and deployment of applications, and it also follows the MVC architecture. The MVC components in declarative development on Force.com are listed in the following table:

| Model | View | Controller |
| --- | --- | --- |
| Objects, fields, relationships | Applications, tabs, page layouts, record types | Workflow rules, validation rules, assignment rules |

Summary

In this article, we became familiar with the Force.com platform. We have seen the life cycle of an application built using Force.com, and we saw the multi-tenant architecture and how it differs from a web hosting server. We have a fresh new developer account, and in a further article, we will use it to build an application on Force.com.
Resources for Article:

Further resources on this subject:
- Custom Coding with Apex [article]
- Auto updating child records in Process Builder [article]
- Configuration in Salesforce CRM [article]


Web Server Development

Packt
15 Apr 2016
24 min read
In this article, Holger Brunn, Alexandre Fayolle, and Daniel Eufémio Gago Reis, the authors of the book Odoo Development Cookbook, discuss how to work with the web server in Odoo. We'll cover the following topics:

- Make a path accessible from the network
- Restrict access to web accessible paths
- Consume parameters passed to your handlers
- Modify an existing handler
- Using the RPC API

Introduction

We'll introduce the basics of the web server part of Odoo in this article. Note that this article covers the fundamental pieces. All of Odoo's web request handling is driven by the Python library werkzeug (http://werkzeug.pocoo.org). While the complexity of werkzeug is mostly hidden by Odoo's convenient wrappers, it is an interesting read to see how things work under the hood.

Make a path accessible from the network

In this recipe, we'll see how to make a URL of the form http://yourserver/path1/path2 accessible to users. This can either be a web page or a path returning arbitrary data to be consumed by other programs. In the latter case, you would usually use the JSON format to consume parameters and to offer your data.

Getting ready

We'll make use of a ready-made library.book model. We want to allow any user to query the full list of books. Furthermore, we want to provide the same information to programs via a JSON request.

How to do it…

We'll need to add controllers, which by convention go into a folder called controllers.

1. Add a controllers/main.py file with the HTML version of our page:

    from openerp import http
    from openerp.http import request

    class Main(http.Controller):
        @http.route('/my_module/books', type='http', auth='none')
        def books(self):
            records = request.env['library.book'].sudo().search([])
            result = '<html><body><table><tr><td>'
            result += '</td></tr><tr><td>'.join(records.mapped('name'))
            result += '</td></tr></table></body></html>'
            return result

2. Add a function to serve the same information in the JSON format:

        @http.route('/my_module/books/json', type='json', auth='none')
        def books_json(self):
            records = request.env['library.book'].sudo().search([])
            return records.read(['name'])

3. Add the file controllers/__init__.py:

    from . import main

4. Add controllers to your addon's __init__.py:

    from . import controllers

After restarting your server, you can visit /my_module/books in your browser and be presented with a flat list of book names. To test the JSON-RPC part, you'll have to craft a JSON request. A simple way to do that is to use the following command line and receive the output on the command line:

    curl -i -X POST -H "Content-Type: application/json" -d "{}" localhost:8069/my_module/books/json

If you get 404 errors at this point, you probably have more than one database available on your instance. In this case, it's impossible for Odoo to determine which database is meant to serve the request. Use the --db-filter='^yourdatabasename$' parameter to force using exactly the database you installed the module in. Now the path should be accessible.

How it works…

The two crucial parts here are that our controller is derived from openerp.http.Controller and that the methods we use to serve content are decorated with openerp.http.route. Inheriting from openerp.http.Controller registers the controller with Odoo's routing system in a similar way as models are registered by inheriting from openerp.models.Model; Controller, too, has a metaclass that takes care of this.
In general, paths handled by your addon should start with your addon's name to avoid name clashes. Of course, if you extend some addon's functionality, you'll use that addon's name.

openerp.http.route

The route decorator allows us to tell Odoo that a method is to be web accessible in the first place, and the first parameter determines on which path it is accessible. Instead of a string, you can also pass a list of strings in case you use the same function to serve multiple paths. The type argument defaults to http and determines what type of request is to be served. While strictly speaking JSON is HTTP, declaring the second function as type='json' makes life a lot easier, because Odoo then handles type conversions itself. Don't worry about the auth parameter for now; it will be addressed in the recipe Restrict access to web accessible paths.

Return values

Odoo's treatment of the functions' return values is determined by the type argument of the route decorator. For type='http', we usually want to deliver some HTML, so the first function simply returns a string containing it. An alternative is to use request.make_response(), which gives you control over the headers to send in the response. So, to indicate when our page was updated the last time, we might change the last line in books() to the following:

    return request.make_response(
        result, [
            ('Last-modified', email.utils.formatdate(
                (
                    fields.Datetime.from_string(
                        request.env['library.book'].sudo()
                        .search([], order='write_date desc', limit=1)
                        .write_date) -
                    datetime.datetime(1970, 1, 1)
                ).total_seconds(),
                usegmt=True)),
        ])

This code sends a Last-modified header along with the HTML we generated, telling the browser when the list was modified for the last time. We extract this information from the write_date field of the library.book model. In order for the preceding snippet to work, you'll have to add some imports at the top of the file:

    import email
    import datetime
    from openerp import fields

You can also create a Response object of werkzeug manually and return that, but there's little gain for the effort. Generating HTML manually is nice for demonstration purposes, but you should never do this in production code. Always use templates as appropriate and return them by calling request.render(). This will give you localization for free and makes your code better by separating business logic from the presentation layer. Also, templates provide you with functions to escape data before outputting HTML. The preceding code is vulnerable to cross-site scripting attacks if a user manages to slip a script tag into a book name, for example. For a JSON request, simply return the data structure you want to hand over to the client; Odoo takes care of serialization. For this to work, you should restrict yourself to data types that are JSON serializable, which are roughly dictionaries, lists, strings, floats, and integers.

openerp.http.request

The request object is a static object referring to the currently handled request, which contains everything you need to take useful action. Most important is the property request.env, which contains an Environment object that is just the same as self.env in models. This environment is bound to the current user, which is none in the preceding example because we used auth='none'. The lack of a user is also why we have to sudo() all our calls to model methods in the example code. If you're used to web development, you'll expect session handling, which is perfectly correct.
Use request.session for an OpenERPSession object (which is quite a thin wrapper around the Session object of werkzeug) and request.session.sid to access the session ID. To store session values, just treat request.session as a dictionary:

    request.session['hello'] = 'world'
    request.session.get('hello')

Note that storing data in the session is no different from using global variables. Use it only if you must; that is usually the case for multi-request actions, such as a checkout in the website_sale module. And also in this case, handle all functionality concerning sessions in your controllers, never in your modules.

There's more…

The route decorator can take some extra parameters to customize its behavior further. By default, all HTTP methods are allowed, and Odoo intermingles the parameters passed. Using the parameter methods, you can pass a list of methods to accept, which usually would be either ['GET'] or ['POST']. To allow cross-origin requests (browsers block AJAX and some other types of requests to domains other than where the script was loaded from, for security and privacy reasons), set the cors parameter to * to allow requests from all origins, or to some URI to restrict requests to ones originating from that URI. If this parameter is unset, which is the default, the Access-Control-Allow-Origin header is not set, leaving you with the browser's standard behavior. In our example, we might want to set it on /my_module/books/json in order to allow scripts pulled from other websites to access the list of books. By default, Odoo protects certain types of requests from an attack known as cross-site request forgery by passing a token along on every request. If you want to turn that off, set the parameter csrf to False, but note that this is a bad idea in general.

See also

If you host multiple Odoo databases on the same instance, and each database has different web accessible paths on possibly multiple domain names per database, the standard regular expressions in the --db-filter parameter might not be enough to force the right database for every domain. In that case, use the community module dbfilter_from_header from https://github.com/OCA/server-tools in order to configure the database filters at proxy level. To see how using templates makes modularity possible, see the recipe Modify an existing handler later in the article.
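Before moving on, here is a small sketch (not from the book) pulling the optional route parameters just discussed together: a route restricted to GET requests, opened up for cross-origin callers, with csrf shown explicitly for completeness. The path and class name are made up for illustration.

```python
from openerp import http
from openerp.http import request


class MainCors(http.Controller):
    # Hypothetical route: the book list as plain text, GET only, readable from
    # scripts hosted on other origins thanks to cors='*'.
    @http.route('/my_module/books/text', type='http', auth='none',
                methods=['GET'], cors='*', csrf=False)
    def books_text(self):
        records = request.env['library.book'].sudo().search([])
        return request.make_response(
            u'\n'.join(records.mapped('name')),
            [('Content-Type', 'text/plain')])
```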
Restrict access to web accessible paths

We'll explore the three authentication mechanisms Odoo provides for routes in this recipe. We'll define routes with different authentication mechanisms in order to show their differences.

Getting ready

As we extend code from the previous recipe, we'll also depend on the library.book model, so you should get its code in place in order to proceed.

How to do it…

Define the following handlers in controllers/main.py:

1. Add a path that shows all books:

    @http.route('/my_module/all-books', type='http', auth='none')
    def all_books(self):
        records = request.env['library.book'].sudo().search([])
        result = '<html><body><table><tr><td>'
        result += '</td></tr><tr><td>'.join(records.mapped('name'))
        result += '</td></tr></table></body></html>'
        return result

2. Add a path that shows all books and indicates which were written by the current user, if any:

    @http.route('/my_module/all-books/mark-mine', type='http', auth='public')
    def all_books_mark_mine(self):
        records = request.env['library.book'].sudo().search([])
        result = '<html><body><table>'
        for record in records:
            result += '<tr>'
            if record.author_ids & request.env.user.partner_id:
                result += '<th>'
            else:
                result += '<td>'
            result += record.name
            if record.author_ids & request.env.user.partner_id:
                result += '</th>'
            else:
                result += '</td>'
            result += '</tr>'
        result += '</table></body></html>'
        return result

3. Add a path that shows the current user's books:

    @http.route('/my_module/all-books/mine', type='http', auth='user')
    def all_books_mine(self):
        records = request.env['library.book'].search([
            ('author_ids', 'in', request.env.user.partner_id.ids),
        ])
        result = '<html><body><table><tr><td>'
        result += '</td></tr><tr><td>'.join(records.mapped('name'))
        result += '</td></tr></table></body></html>'
        return result

With this code, the paths /my_module/all-books and /my_module/all-books/mark-mine look the same for unauthenticated users, while a logged-in user sees her books in a bold font on the latter path. The path /my_module/all-books/mine is not accessible at all for unauthenticated users; if you try to access it without being authenticated, you'll be redirected to the login screen.

How it works…

The difference between the authentication methods is basically what you can expect from the content of request.env.user. For auth='none', the user record is always empty, even if an authenticated user is accessing the path. Use this if you want to serve content that has no dependency on users, or if you want to provide database-agnostic functionality in a server-wide module. The value auth='public' sets the user record to a special user with the XML ID base.public_user for unauthenticated users, and to the user's own record for authenticated ones. This is the right choice if you want to offer functionality to both unauthenticated and authenticated users, while the authenticated ones get some extras, as demonstrated in the preceding code. Use auth='user' to be sure that only authenticated users have access to what you've got to offer. With this method, you can be sure request.env.user points to an existing user.

There's more…

The magic for the authentication methods happens in the ir.http model from the base addon. For whatever value you pass to the auth parameter in your route, Odoo searches for a function called _auth_method_<yourvalue> on this model, so you can easily customize this by inheriting this model and declaring a method that takes care of your authentication method of choice.
As an example, we provide an authentication method base_group_user, which enforces a currently logged-in user who is a member of the group with the XML ID base.group_user:

    from openerp import exceptions, http, models
    from openerp.http import request

    class IrHttp(models.Model):
        _inherit = 'ir.http'

        def _auth_method_base_group_user(self):
            self._auth_method_user()
            if not request.env.user.has_group('base.group_user'):
                raise exceptions.AccessDenied()

Now you can say auth='base_group_user' in your decorator and be sure that users running this route's handler are members of this group. With a little trickery, you can extend this to auth='groups(xmlid1,…)'; the implementation of this is left as an exercise to the reader, but it is included in the example code.

Consume parameters passed to your handlers

It's nice to be able to show content, but it's better to show content as a result of some user input. This recipe will demonstrate the different ways to receive this input and react to it. As in the recipes before, we'll make use of the library.book model.

How to do it…

First, we'll add a route that expects a traditional parameter with a book's ID to show some details about it. Then, we'll do the same, but we'll incorporate our parameter into the path itself:

1. Add a path that expects a book's ID as a parameter:

    @http.route('/my_module/book_details', type='http', auth='none')
    def book_details(self, book_id):
        record = request.env['library.book'].sudo().browse(int(book_id))
        return u'<html><body><h1>%s</h1>Authors: %s' % (
            record.name,
            u', '.join(record.author_ids.mapped('name')) or 'none',
        )

2. Add a path where we can pass the book's ID in the path itself:

    @http.route("/my_module/book_details/<model('library.book'):book>",
                type='http', auth='none')
    def book_details_in_path(self, book):
        return self.book_details(book.id)

If you point your browser to /my_module/book_details?book_id=1, you should see a detail page for the book with ID 1. If it doesn't exist, you'll receive an error page. The second handler allows you to go to /my_module/book_details/1 and view the same page.

How it works…

By default, Odoo (actually werkzeug) intermingles GET and POST parameters and passes them as keyword arguments to your handler. So, by simply declaring your function as expecting a parameter called book_id, you introduce this parameter as either a GET parameter (in the URL) or a POST parameter (usually passed by forms that have your handler as their action). Given that we didn't add a default value for this parameter, the runtime will raise an error if you try to access this path without setting the parameter. The second example makes use of the fact that, in a werkzeug environment, most paths are virtual anyway. So we can simply define our path as containing some input. In this case, we say we expect the ID of a library.book as the last component of the path. The name after the colon is the name of a keyword argument, and our function will be called with this parameter passed as a keyword argument. Here, Odoo takes care of looking up this ID and delivering a browse record, which of course only works if the user accessing this path has appropriate permissions. Given that book is a browse record, we can simply recycle the first example's function by passing book.id as the parameter book_id to give out the same content.

There's more…

Defining parameters within the path is a functionality delivered by werkzeug, which is called converters.
The model converter is added by Odoo, which also defines the converter models, which accepts a comma-separated list of IDs and passes a record set containing those IDs to your handler. The beauty of converters is that the runtime coerces the parameters to the expected type, while you're on your own with normal keyword parameters: these are delivered as strings, and you have to take care of the necessary type conversions yourself, as seen in the first example. Built-in werkzeug converters include int, float, and string, but also more intricate ones, such as path, any, or uuid. You can look up their semantics at http://werkzeug.pocoo.org/docs/0.11/routing/#builtin-converters.

See also

Odoo's custom converters are defined in ir_http.py in the base module and registered in the _get_converters method of ir.http. As an exercise, you can create your own converter that allows you to visit the /my_module/book_details/Odoo+cookbook page to receive the details of this book (if you added it to your library before).
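One possible sketch of that exercise follows; it is not the book's solution. It registers a naive converter that only normalizes the last path component into a book name and leaves the lookup to the handler, skipping the lazy-evaluation tricks Odoo's own ModelConverter uses. The converter name, class names, and the registration signature (as described in the See also above) are assumptions.

```python
from werkzeug.routing import BaseConverter

from openerp import http, models
from openerp.http import request


class BookNameConverter(BaseConverter):
    # Accept anything without a slash and turn '+' back into spaces, so that
    # /my_module/book_details/Odoo+cookbook hands 'Odoo cookbook' to the handler.
    regex = '[^/]+'

    def to_python(self, value):
        return value.replace('+', ' ')

    def to_url(self, value):
        return value.replace(' ', '+')


class IrHttp(models.Model):
    _inherit = 'ir.http'

    def _get_converters(self):
        # Add our converter under the name 'book_name' (name is ours).
        converters = super(IrHttp, self)._get_converters()
        converters['book_name'] = BookNameConverter
        return converters


class BookByName(http.Controller):
    @http.route("/my_module/book_details/<book_name:name>",
                type='http', auth='none')
    def book_details_by_name(self, name):
        record = request.env['library.book'].sudo().search(
            [('name', '=', name)], limit=1)
        return u'<html><body><h1>%s</h1></body></html>' % (
            record.name or u'No such book')
```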
This object contains a reference to the template to be used and the values accessible to the template, but is only evaluated at the very end of the request. In general, there are three ways to change an existing handler: If it uses a QWeb template, the simplest way of changing it is to override the template. This is the right choice for layout changes and small logic changes. QWeb templates get a context passed, which is available in the response as the field qcontext. This usually is a dictionary where you can add or remove values to suit your needs. In the preceding example, we filter the list of apps to only contain apps which have a website set. If the handler receives parameters, you could also preprocess those in order to have the overridden handler behave the way you want. There's more… As seen in the preceding section, inheritance with controllers works slightly differently than model inheritance: You actually need a reference to the base class and use Python inheritance on it. Don't forget to decorate your new handler with the @http.route decorator; Odoo uses it as a marker for which methods are exposed to the network layer. If you omit the decorator, you actually make the handler's path inaccessible. The @http.route decorator itself behaves similarly to field declarations: every value you don't set will be derived from the decorator of the function you're overriding, so we don't have to repeat values we don't want to change. After receiving a response object from the function you override, you can do a lot more than just changing the QWeb context: You can add or remove HTTP headers by manipulating response.headers. If you want to render an entirely different template, you can set response.template. To detect if a response is based on QWeb in the first place, query response.is_qweb. The resulting HTML code is available by calling response.render(). Using the RPC API One of Odoo's strengths is its interoperability, which is helped by the fact that basically any functionality is available via JSON-RPC 2.0 and XMLRPC. In this recipe, we'll explore how to use both of them from client code. This interface also enables you to integrate Odoo with any other application. Making functionality available via any of the two protocols on the server side is explained in the There's more section of this recipe. We'll query a list of installed modules from the Odoo instance, so that we could show a list as the one displayed in the previous recipe in our own application or website. 
How to do it… The following code is not meant to run within Odoo, but as simple scripts: First, we query the list of installed modules via XMLRPC: #!/usr/bin/env python2 import xmlrpclib db = 'odoo9' user = 'admin' password = 'admin' uid = xmlrpclib.ServerProxy( 'http://localhost:8069/xmlrpc/2/common') .authenticate(db, user, password, {}) odoo = xmlrpclib.ServerProxy( 'http://localhost:8069/xmlrpc/2/object') installed_modules = odoo.execute_kw( db, uid, password, 'ir.module.module', 'search_read', [[('state', '=', 'installed')], ['name']], {'context': {'lang': 'fr_FR'}}) for module in installed_modules: print module['name'] Then we do the same with JSONRPC: import json import urllib2 db = 'odoo9' user = 'admin' password = 'admin' request = urllib2.Request( 'http://localhost:8069/web/session/authenticate', json.dumps({ 'jsonrpc': '2.0', 'params': { 'db': db, 'login': user, 'password': password, }, }), {'Content-type': 'application/json'}) result = urllib2.urlopen(request).read() result = json.loads(result) session_id = result['result']['session_id'] request = urllib2.Request( 'http://localhost:8069/web/dataset/call_kw', json.dumps({ 'jsonrpc': '2.0', 'params': { 'model': 'ir.module.module', 'method': 'search_read', 'args': [ [('state', '=', 'installed')], ['name'], ], 'kwargs': {'context': {'lang': 'fr_FR'}}, }, }), { 'X-Openerp-Session-Id': session_id, 'Content-type': 'application/json', }) result = urllib2.urlopen(request).read() result = json.loads(result) for module in result['result']: print module['name'] Both code snippets will print a list of installed modules, and because they pass a context that sets the language to French, the list will be in French if there are no translations available. How it works… Both snippets call the function search_read, which is very convenient because you can specify a search domain on the model you call, pass a list of fields you want to be returned, and receive the result in one request. In older versions of Odoo, you had to call search first to receive a list of IDs and then call read to actually read the data. search_read returns a list of dictionaries, with the keys being the names of the fields requested and the values the record's data. The ID field will always be transmitted, no matter if you requested it or not. Now, we need to look at the specifics of the two protocols. XMLRPC The XMLRPC API expects a user ID and a password for every call, which is why we need to fetch this ID via the method authenticate on the path /xmlrpc/2/common. If you already know the user's ID, you can skip this step. As soon as you know the user's ID, you can call any model's method by calling execute_kw on the path /xmlrpc/2/object. This method expects the database you want to execute the function on, the user's ID and password for authentication, then the model you want to call your function on, and then the function's name. The next two mandatory parameters are a list of positional arguments to your function, and a dictionary of keyword arguments. JSONRPC Don't be distracted by the size of the code example, that's because Python doesn't have built in support for JSONRPC. As soon as you've wrapped the urllib calls in some helper functions, the example will be as concise as the XMLRPC one. As JSONRPC is stateful, the first thing we have to do is to request a session at /web/session/authenticate. This function takes the database, the user's name, and their password. 
The crucial part here is that we record the session ID Odoo created, which we pass in the X-Openerp-Session-Id header to /web/dataset/call_kw. The function then behaves the same as execute_kw from the XML-RPC API: we need to pass a model name and a function to call on it, then positional and keyword arguments.

There's more…

Both protocols allow you to call basically any function of your models. In case you don't want a function to be available via either interface, prepend its name with an underscore; Odoo won't expose such functions as RPC calls. Furthermore, you need to take care that your parameters, as well as the return values, are serializable for the protocol. To be sure, restrict yourself to scalar values, dictionaries, and lists. As you can do roughly the same with both protocols, it's up to you which one to use. This decision should be driven mainly by what your platform supports best. In a web context, you're generally better off with JSON, because Odoo allows JSON handlers to pass a CORS header conveniently (see the Make a path accessible from the network recipe for details). This is rather difficult with XMLRPC.
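To illustrate the helper-function remark from the JSONRPC section, here is a hedged refactoring sketch of the earlier script; the json_rpc name is ours and error handling is left out.

```python
import json
import urllib2


def json_rpc(url, params, session_id=None):
    # Wrap the JSON-RPC 2.0 boilerplate from the example above in one function.
    headers = {'Content-type': 'application/json'}
    if session_id:
        headers['X-Openerp-Session-Id'] = session_id
    payload = json.dumps({'jsonrpc': '2.0', 'params': params})
    return json.loads(
        urllib2.urlopen(urllib2.Request(url, payload, headers)).read())['result']


# Authenticate once, then reuse the returned session ID for further calls.
auth = json_rpc('http://localhost:8069/web/session/authenticate',
                {'db': 'odoo9', 'login': 'admin', 'password': 'admin'})
modules = json_rpc('http://localhost:8069/web/dataset/call_kw',
                   {'model': 'ir.module.module', 'method': 'search_read',
                    'args': [[('state', '=', 'installed')], ['name']],
                    'kwargs': {'context': {'lang': 'fr_FR'}}},
                   session_id=auth['session_id'])
for module in modules:
    print module['name']
```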
Summary

In this article, we looked at the web server architecture and then covered routes and controllers and their authentication, how handlers consume parameters, and how to use the RPC API, namely JSON-RPC and XML-RPC.

Resources for Article:

Further resources on this subject:
- Advanced React [article]
- Remote Authentication [article]
- ASP.Net Site Performance: Improving JavaScript Loading [article]


Building Our First Poky Image for the Raspberry Pi

Packt
14 Apr 2016
12 min read
In this article, Pierre-Jean Texier, the author of the book Yocto for Raspberry Pi, covers the basic concepts of the Poky workflow. Using the Linux command line, we will work through the different steps: downloading, configuring, and preparing the Poky Raspberry Pi environment, and generating an image that can be used by the target.

Installing the required packages for the host system

The steps necessary for the configuration of the host system depend on the Linux distribution used. It is advisable to use one of the Linux distributions maintained and supported by Poky, to avoid wasting time and energy setting up the host system. Currently, the Yocto Project is supported on the following distributions:

- Ubuntu 12.04 (LTS)
- Ubuntu 13.10
- Ubuntu 14.04 (LTS)
- Fedora release 19 (Schrödinger's Cat)
- Fedora release 21
- CentOS release 6.4
- CentOS release 7.0
- Debian GNU/Linux 7.0 (Wheezy)
- Debian GNU/Linux 7.1 (Wheezy)
- Debian GNU/Linux 7.2 (Wheezy)
- Debian GNU/Linux 7.3 (Wheezy)
- Debian GNU/Linux 7.4 (Wheezy)
- Debian GNU/Linux 7.5 (Wheezy)
- Debian GNU/Linux 7.6 (Wheezy)
- openSUSE 12.2
- openSUSE 12.3
- openSUSE 13.1

Even if your distribution is not listed here, it does not mean that Poky will not work, but the outcome cannot be guaranteed. If you want more information about Linux distributions, you can visit this link: http://www.yoctoproject.org/docs/current/ref-manual/ref-manual.html

Poky on Ubuntu

The following list shows the required packages by function, given a supported Ubuntu or Debian Linux distribution. The dependencies for a compatible environment include:

- Download tools: wget and git-core
- Decompression tools: unzip and tar
- Compilation tools: gcc-multilib, build-essential, and chrpath
- String-manipulation tools: sed and gawk
- Document-management tools: texinfo, xsltproc, docbook-utils, fop, dblatex, and xmlto
- Patch-management tools: patch and diffstat

Here is the command to type on a headless system:

    $ sudo apt-get install gawk wget git-core diffstat unzip texinfo gcc-multilib build-essential chrpath

Poky on Fedora

If you want to use Fedora, you just have to type this command:

    $ sudo yum install gawk make wget tar bzip2 gzip python unzip perl patch diffutils diffstat git cpp gcc gcc-c++ glibc-devel texinfo chrpath ccache perl-Data-Dumper perl-Text-ParseWords perl-Thread-Queue socat

Downloading the Poky metadata

After installing all the necessary packages, it is time to download the Poky sources. This is done through the git tool, as follows (on the master branch):

    $ git clone git://git.yoctoproject.org/poky

Another method is to download the tar.bz2 file directly from this repository: https://www.yoctoproject.org/downloads

To avoid hazardous and problematic manipulations, it is strongly recommended to create and switch to a specific local branch. Use these commands:

    $ cd poky
    $ git checkout daisy -b work_branch

Downloading the Raspberry Pi BSP metadata

At this stage, we only have the base of the reference system (Poky), and we have no support for the Broadcom BCM SoC.
Basically, the BSP proposed by Poky only offers the following targets:

    $ ls meta/conf/machine/*.conf
    beaglebone.conf  edgerouter.conf  genericx86-64.conf  genericx86.conf  mpc8315e-rdb.conf

This is in addition to those provided by OE-Core:

    $ ls meta/conf/machine/*.conf
    qemuarm64.conf  qemuarm.conf  qemumips64.conf  qemumips.conf  qemuppc.conf  qemux86-64.conf  qemux86.conf

In order to generate a compatible system for our target, download the specific layer (the BSP layer) for the Raspberry Pi:

    $ git clone git://git.yoctoproject.org/meta-raspberrypi

If you want to learn more about Git, you can visit the official website: http://git-scm.com/

Now we can verify whether we have the configuration metadata for our platform (the raspberrypi.conf file):

    $ ls meta-raspberrypi/conf/machine/*.conf
    raspberrypi.conf

This screenshot shows the meta-raspberrypi folder. The examples and code presented in this article use Yocto Project version 1.7 and Poky version 12.0; for reference, the codename is Dizzy. Now that we have our environment freshly downloaded, we can proceed with its initialization and the configuration of our image through various configuration files.

The oe-init-build-env script

As can be seen in the screenshot, the Poky directory contains a script named oe-init-build-env. This is a script for the configuration and initialization of the build environment. It is not intended to be executed but must be "sourced". Its work, among other things, is to initialize a certain number of environment variables and place you in the build directory passed as its argument. The script must be run as shown here:

    $ source oe-init-build-env [build-directory]

Here, build-directory is an optional parameter for the name of the directory where the environment is set up (for example, we can use several build directories in a single Poky source tree); if it is not given, it defaults to build. The build-directory folder is the place where we perform the builds. In order to standardize the steps, we will use the following command throughout to initialize our environment:

    $ source oe-init-build-env rpi-build
    ### Shell environment set up for builds. ###
    You can now run 'bitbake <target>'

    Common targets are:
        core-image-minimal
        core-image-sato
        meta-toolchain
        adt-installer
        meta-ide-support

    You can also run generated qemu images with a command like 'runqemu qemux86'

When we initialize a build environment, the script creates a directory called conf inside rpi-build. This folder contains two important files:

- local.conf: This contains parameters to configure BitBake behavior.
- bblayers.conf: This lists the different layers that BitBake takes into account in its implementation. The list is assigned to the BBLAYERS variable.

Editing the local.conf file

The local.conf file under rpi-build/conf/ can configure every aspect of the build process. It is through this file that we can choose the target machine (the MACHINE variable), the distribution (the DISTRO variable), the type of package (the PACKAGE_CLASSES variable), and the host configuration (PARALLEL_MAKE, for example). The minimal set of variables we have to change from the default is the following:

    BB_NUMBER_THREADS ?= "${@oe.utils.cpu_count()}"
    PARALLEL_MAKE ?= "-j ${@oe.utils.cpu_count()}"
    MACHINE ?= "raspberrypi"

The BB_NUMBER_THREADS variable determines the number of tasks that BitBake will perform in parallel (tasks under Yocto; we're not necessarily talking only about compilation).
By default, in build/conf/local.conf, this variable is initialized with ${@oe.utils.cpu_count()}, corresponding to the number of cores detected on the host system (/proc/cpuinfo). The PARALLEL_MAKE variable corresponds to the -j option of make, which specifies the number of processes that GNU Make can run in parallel on a compilation task. Again, it is the number of cores present that defines the default value used. The MACHINE variable is where we set the target machine we wish to build for, the Raspberry Pi (defined in a .conf file; in our case, raspberrypi.conf).

Editing the bblayers.conf file

Now, we still have to add the layer specific to our target. This will have the effect of making the recipes from this layer available to our build. Therefore, we should edit the build/conf/bblayers.conf file:

    # LAYER_CONF_VERSION is increased each time build/conf/bblayers.conf
    # changes incompatibly
    LCONF_VERSION = "6"

    BBPATH = "${TOPDIR}"
    BBFILES ?= ""

    BBLAYERS ?= " \
      /home/packt/RASPBERRYPI/poky/meta \
      /home/packt/RASPBERRYPI/poky/meta-yocto \
      /home/packt/RASPBERRYPI/poky/meta-yocto-bsp \
      "
    BBLAYERS_NON_REMOVABLE ?= " \
      /home/packt/RASPBERRYPI/poky/meta \
      /home/packt/RASPBERRYPI/poky/meta-yocto \
      "

Add the meta-raspberrypi line so that BBLAYERS becomes:

    BBLAYERS ?= " \
      /home/packt/RASPBERRYPI/poky/meta \
      /home/packt/RASPBERRYPI/poky/meta-yocto \
      /home/packt/RASPBERRYPI/poky/meta-yocto-bsp \
      /home/packt/RASPBERRYPI/poky/meta-raspberrypi \
      "

Naturally, you have to adapt the absolute path (/home/packt/RASPBERRYPI here) depending on your own installation.

Building the Poky image

At this stage, we have to look at the available images and check whether they are compatible with our platform (.bb files).

Choosing the image

Poky provides several predesigned image recipes that we can use to build our own binary image. We can check the list of available images by running the following command from the poky directory:

    $ ls meta*/recipes*/images/*.bb

All the recipes provide images which are, in essence, sets of unpacked and configured packages, generating a filesystem that we can use on actual hardware (for further information about the different images, you can visit http://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#ref-images). Here is a small representation of the available images. To these we can add the images provided by meta-raspberrypi:

    $ ls meta-raspberrypi/recipes-core/images/*.bb
    rpi-basic-image.bb  rpi-hwup-image.bb  rpi-test-image.bb

Here is an explanation of these images:

- rpi-hwup-image.bb: This is an image based on core-image-minimal.
- rpi-basic-image.bb: This is an image based on rpi-hwup-image.bb, with some added features (a splash screen).
- rpi-test-image.bb: This is an image based on rpi-basic-image.bb, which includes some packages present in meta-raspberrypi.

We will use one of these three recipes for the rest of this article. Note that these files (.bb) describe recipes, like all the others. They are organized logically, and here we have the ones for creating an image for the Raspberry Pi.
Running BitBake At this point, what remains for us is to start the build engine Bitbake, which will parse all the recipes that contain the image you pass as a parameter (as an initial example, we can take rpi-basic-image): $ bitbake rpi-basic-image Loading cache: 100% |########################################################################################################################################################################| ETA:  00:00:00 Loaded 1352 entries from dependency cache. NOTE: Resolving any missing task queue dependencies Build Configuration: BB_VERSION        = "1.25.0" BUILD_SYS         = "x86_64-linux" NATIVELSBSTRING   = "Ubuntu-14.04" TARGET_SYS        = "arm-poky-linux-gnueabi" MACHINE           = "raspberrypi" DISTRO            = "poky" DISTRO_VERSION    = "1.7" TUNE_FEATURES     = "arm armv6 vfp" TARGET_FPU        = "vfp" meta              meta-yocto        meta-yocto-bsp    = "master:08d3f44d784e06f461b7d83ae9262566f1cf09e4" meta-raspberrypi  = "master:6c6f44136f7e1c97bc45be118a48bd9b1fef1072" NOTE: Preparing RunQueue NOTE: Executing SetScene Tasks NOTE: Executing RunQueue Tasks Once launched, BitBake begins by browsing all the (.bb and .bbclass)files that the environment provides access to and stores the information in a cache. Because the parser of BitBake is parallelized, the first execution will always be longer because it has to build the cache (only about a few seconds longer). However, subsequent executions will be almost instantaneous, because BitBake will load the cache. As we can see from the previous command, before executing the task list, BitBake displays a trace that details the versions used (target, version, OS, and so on). Finally, BitBake starts the execution of tasks and shows us the progress. Depending on your setup, you can go drink some coffee or even eat some pizza. Usually after this, , if all goes well, you will be pleased to see that the tmp/subdirectory's directory construction (rpi-build) is generally populated. The build directory (rpi-build) contains about20 GB after the creation of the image. After a few hours of baking, we can rejoice with the result and the creation of the system image for our target: $ ls rpi-build/tmp/deploy/images/raspberrypi/*sdimg rpi-basic-image-raspberrypi.rpi-sdimg This is this file that we will use to create our bootable SD card. Creating a bootable SD card Now that our environment is complete, you can create a bootable SD card with the following command (remember to change /dev/sdX to the proper device name and be careful not to kill your hard disk by selecting the wrong device name): $ sudo dd if=rpi-basic-image-raspberrypi.rpi-sdimg of=/dev/sdX bs=1M Once the copying is complete, you can check whether the operation was successful using the following command (look at mmcblk0): $ lsblk NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT mmcblk0     179:0    0   3,7G  0 disk ├─mmcblk0p1 179:1    0    20M  0 part /media/packt/raspberrypi └─mmcblk0p2 179:2    0   108M  0 part /media/packt/f075d6df-d8b8-4e85-a2e4-36f3d4035c3c You can also look at the left-hand side of your interface: Booting the image on the Raspberry Pi This is surely the most anticipated moment of this article—the moment where we boot our Raspberry Pi with a fresh Poky image. 
You just have to insert your SD card in a slot, connect the HDMI cable to your monitor, and connect the power supply (it is also recommended to use a mouse and a keyboard to shut down the device, unless you plan on just pulling the plug and possibly corrupting the boot partition). After connecting the power supply, you should see the Raspberry Pi splash screen: The login for the Yocto/Poky distribution is root. Summary In this article, we learned the steps needed to set up Poky and get our first image built. We ran that image on the Raspberry Pi, which gave us a good overview of the available capabilities. Resources for Article: Further resources on this subject: Programming on Raspbian [article] Working with a Webcam and Pi Camera [article] Creating a Supercomputer [article]

Probabilistic Graphical Models in R

Packt
14 Apr 2016
18 min read
In this article by David Bellot, author of the book, Learning Probabilistic Graphical Models in R, explains that among all the predictions that were made about the 21st century, we may not have expected that we would collect such a formidable amount of data about everything, everyday, and everywhere in the world. The past years have seen an incredible explosion of data collection about our world and lives, and technology is the main driver of what we can certainly call a revolution. We live in the age of information. However, collecting data is nothing if we don't exploit it and if we don't try to extract knowledge out of it. At the beginning of the 20th century, with the birth of statistics, the world was all about collecting data and making statistics. Back then, the only reliable tools were pencils and papers and, of course, the eyes and ears of the observers. Scientific observation was still in its infancy despite the prodigious development of the 19th century. (For more resources related to this topic, see here.) More than a hundred years later, we have computers, electronic sensors, massive data storage, and we are able to store huge amounts of data continuously, not only about our physical world but also about our lives, mainly through the use of social networks, Internet, and mobile phones. Moreover, the density of our storage technology increased so much that we can, nowadays, store months if not years of data into a very small volume that can fit in the palm of our hand. Among all the tools and theories that have been developed to analyze, understand, and manipulate probability and statistics became one of the most used. In this field, we are interested in a special, versatile, and powerful class of models called the probabilistic graphical models (PGM, for short). Probabilistic graphical model is a tool to represent beliefs and uncertain knowledge about facts and events using probabilities. It is also one of the most advanced machine learning techniques nowadays and has many industrial success stories. They can deal with our imperfect knowledge about the world because our knowledge is always limited. We can't observe everything, and we can't represent the entire universe in a computer. We are intrinsically limited as human beings and so are our computers. With Probabilistic Graphical Models, we can build simple learning algorithms or complex expert systems. With new data, we can improve these models and refine them as much as we can, and we can also infer new information or make predictions about unseen situations and events. Probabilistic Graphical Models, seen from the point of view of mathematics, are a way to represent a probability distribution over several variables, which is called a joint probability distribution. In a PGM, such knowledge between variables can be represented with a graph, that is, nodes connected by edges with a specific meaning associated to it. Let's consider an example from the medical world: how to diagnose a cold. This is an example and by no means a medical advice. It is oversimplified for the sake of simplicity. We define several random variables such as the following: Se: This means season of the year N: This means that the nose is blocked H: This means the patient has a headache S: This means that the patient regularly sneezes C: This means that the patient coughs Cold: This means the patient has a cold. Because each of the symptoms can exist at different degrees, it is natural to represent the variable as random variables. 
For example, if the patient's nose is a bit blocked, we will assign a probability of, say, 60% to this variable, that is P(N=blocked)=0.6 and P(N=not blocked)=0.4. In this example, the probability distribution P(Se,N,H,S,C,Cold) will require 4 * 25 = 128 values in total (4 values for season and 2 values for each of the other random variables). It's quite a lot, and honestly, it's quite difficult to determine things such as the probability that the nose is not blocked, the patient has a headache, the patient sneeze, and so on. However, we can say that a headache is not directly related to cough of a blocked nose, expect when the patient has a cold. Indeed, the patient can have a headache for many other reasons. Moreover, we can say that the Season has quite a direct effect on Sneezing, blocked nose, or Cough but less or no direct effect on Headache. In a Probabilistic Graphical Model, we will represent these dependency relationships with a graph, as follow, where each random variable is a node in the graph, and each relationship is an arrow between 2 nodes: In the graph that follows, there is a direct relation between each node and each variable of the Probabilistic Graphical Model and also a direct relation between arrows and the way we can simplify the joint probability distribution in order to make it tractable. Using a graph as a model to simplify a complex (and sometimes complicated) distribution presents numerous benefits: As we observed in the previous example, and in general when we model a problem, the random variables interacts directly with only a small subsets of other random variables. Therefore, this promotes more compact and tractable models The knowledge and dependencies represented in a graph are easy to understand and communicate The graph induces a compact representation of the joint probability distribution and it is easy to make computations with Algorithms to draw inferences and learn can use the graph theory and the associated algorithms to improve and facilitate all the inference and learning algorithms. Compared to the raw joint probability distribution, using a PGM will speed up computations by several order of magnitude. The junction tree algorithm The Junction Tree Algorithm is one of the main algorithms to do inference on PGM. Its name arises from the fact that before doing the numerical computations, we will transform the graph of the PGM into a tree with a set of properties that allow for efficient computations of posterior probabilities. One of the main aspects is that this algorithm will not only compute the posterior distribution of the variables in the query, but also the posterior distribution of all other variables that are not observed. Therefore, for the same computational price, one can have any posterior distribution. Implementing a junction tree algorithm is a complex task, but fortunately, several R packages contain a full implementation, for example, gRain. Let's say we have several variables A, B, C, D, E,and F. We will consider for the sake of simplicity that each variable is binary so that we won't have too many values to deal with. 
We will assume the following factorization: This is represented by the following graph: We first start by loading the gRain package into R: library(gRain) Then, we create our set of random variables from A to F: val=c(“true”,”false”) F = cptable(~F, values=c(10,90),levels=val) C = cptable(~C|F, values=c(10,90,20,80),levels=val) E = cptable(~E|F, values=c(50,50,30,70),levels=val) A = cptable(~A|C, values=c(50,50,70,30),levels=val) D = cptable(~D|E, values=c(60,40,70,30),levels=val) B = cptable(~B|A:D, values=c(60,40,70,30,20,80,10,90),levels=val) The cptable function creates a conditional probability table, which is a factor for discrete variables. The probabilities associated to each variable are purely subjective and only serve the purpose of the example. The next step is to compute the junction tree. In most packages, computing the junction tree is done by calling one function because the algorithm just does everything at once: plist = compileCPT(list(F,E,C,A,D,B)) plist Also, we check whether the list of variable is correctly compiled into a probabilistic graphical model and we obtain from the previous code: CPTspec with probabilities:  P( F )  P( E | F )  P( C | F )  P( A | C )  P( D | E )  P( B | A D ) This is indeed the factorization of our distribution, as stated earlier. If we want to check further, we can look at the conditional probability table of a few variables: print(plist$F) print(plist$B) F  true false 0.1   0.9 , , D = true        A B       true false   true   0.6   0.7   false  0.4   0.3 , , D = false          A B       true false   true   0.2   0.1   false  0.8   0.9 The second output is a bit more complex, but if you look carefully, you will see that you have two distributions, P(B|A,D=true) and P(B|A,D=false) which is more readable presentation of P(B|A,D). We finally create the graph and run the junction tree algorithm by calling this: jtree = grain(plist) Again, when we check the result, we obtain: jtree Independence network: Compiled: FALSE Propagated: FALSE   Nodes: chr [1:6] "F" "E" "C" "A" "D" "B" We only need to compute the junction tree once. Then, all queries can be computed with the same junction tree. Of course, if you change the graph, then you need to recompute the junction tree. Let's perform a few queries: querygrain(jtree, nodes=c("F"), type="marginal") $F F  true false 0.1   0.9 Of course, if you ask for the marginal distribution of F, you will obtain the initial conditional probability table because F has no parents.  querygrain(jtree, nodes=c("C"), type="marginal") $C C  true false 0.19  0.81 This is more interesting because it computes the marginal of C while we only stated the conditional distribution of C given F. We didn't need to have such a complex algorithm as the junction tree algorithm to compute such a small marginal. We saw the variable elimination algorithm earlier and that would be enough too. But if you ask for the marginal of B, then the variable elimination will not work because of the loop in the graph. 
However, the junction tree will give the following: querygrain(jtree, nodes=c("B"), type="marginal") $B B     true    false 0.478564 0.521436   And, can ask more complex distribution, such as the joint distribution of B and A: querygrain(jtree, nodes=c("A","B"), type="joint")        B A           true    false   true  0.309272 0.352728   false 0.169292 0.168708 In fact, any combination can be given like A,B,C: querygrain(jtree, nodes=c("A","B","C"), type="joint") , , B = true          A C           true    false   true  0.044420 0.047630   false 0.264852 0.121662   , , B = false          A C           true    false   true  0.050580 0.047370   false 0.302148 0.121338 Now, we want to observe a variable and compute the posterior distribution. Let's say F=true and we want to propagate this information down to the rest of the network: jtree2 = setEvidence(jtree, evidence=list(F="true")) We can ask for any joint or marginal now: querygrain(jtree, nodes=c("A"), type="marginal") $A A  true false 0.662 0.338 querygrain(jtree2, nodes=c("A"), type="marginal") $A A  true false  0.68  0.32 Here, we see that knowing that F=true changed the marginal distribution on A from its previous marginal (the second query is again with jtree2, the tree with an evidence). And, we can query any other variable: querygrain(jtree, nodes=c("B"), type="marginal") $B B     true    false 0.478564 0.521436   querygrain(jtree2, nodes=c("B"), type="marginal") $B B   true  false 0.4696 0.5304 Learning Building a Probabilistic Graphical Model, generally, requires three steps: defining the random variables, which are the nodes of the graph as well; defining the structure of the graph; and finally defining the numerical parameters of each local distribution. So far, the last step has been done manually and we gave numerical values to each local probability distribution by hand. In many cases, we have access to a wealth of data and we can find the numerical values of those parameters with a method called parameters learning. In other fields, it is also called parameters fitting or model calibration. Learning parameters can be done with several approaches and there is no ultimate solution to the problem because it depends on the goal where the model's user wants to reach. Nevertheless, it is common to use the notion of Maximum Likelihood of a model and also Maximum A Posteriori. As you are now used to the notion of prior and posterior of a distribution, you can already guess what a maximum a posteriori can do. Many algorithms are used, among which we can cite the Expectation Maximization algorithm (EM), which computes the maximum likelihood of a model even when data is missing or variables are not observed at all. It is a very important algorithm, especially for mixture models. A graphical model of a linear model PGM can be used to represent standard statistical models and then extend them. One famous example is the linear regression mode. We can visualize the structure of a linear mode and better understand the relationships between the variable. The linear model captures the relationships between observable variables xand a target variable y. This relation is modeled by a set of parameters, θ. But remember the distribution of y for each data point indexed by i: Here, Xiis a row vector for which the first element is always one to capture the intercept of the linear model. 
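The two formulas referred to in this passage were rendered as images in the original. Under the standard Gaussian linear model, which is what the text describes, they can be reconstructed as follows (the notation is an assumption, not the book's own):

y_i \mid X_i, \theta \;\sim\; \mathcal{N}\left(X_i \beta,\ \sigma^2\right), \qquad \theta = (\beta, \sigma^2)

\mathcal{L}(\theta) \;=\; p(y \mid X, \theta) \;=\; \prod_{i=1}^{N} \mathcal{N}\left(y_i \mid X_i \beta,\ \sigma^2\right)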
The parameter θ in the following graph is itself composed of the intercept, the coefficient β for each component of X, and the variance σ2 of in the distribution of yi. The PGM for an observation of a linear model can be represented as follows: So, this decomposition leads us to a second version of the graphical model in which we explicitly separate the components of θ: In a PGM, when a rectangle is drawn around a set of nodes with a number or variables in a corner (N for example), it means that the same graph is repeated many times. The likelihood function of a linear model is    , and it can be represented as a PGM. And, the vector β can also be decomposed it into its univariate components too: In this last iterations of the graphical model, we see that the parameters β could have a prior probability on it instead of being fixed. In fact, the parameter  can also be considered as a random variable. For the time being, we will keep it fixed. Latent Dirichlet Allocation The last model we want to show in this article is called the Latent Dirichlet Allocation. It is a generative model that can be represented as a graphical model. It's based on the same idea as the mixture model with one notable exception. In this model, we assume that the data points might be generated by a combination of clusters and not just one cluster at a time, as it was the case before. The LDA model is primarily used in text analysis and classification. Let's consider that a text document is composed of words making sentences and paragraphs. To simplify the problem we can say that each sentence or paragraph is about one specific topic, such as science, animals, sports, and s on. Topics can also be more specific, such as cat topic or European soccer topic. Therefore, there are words that are more likely to come from specific topics. For example, the work cat is likely to come from the topic cat topic. The word stadium is likely to come from the topic European soccer. However, the word ball should come with a higher probability from the topic European soccer, but it is not unlikely to come from the topic cat, because cats like to play with balls too. So, it seems the word ball might belong to two topics at the same time with a different degree of certainty. Other words such as table will certainly belong equally to both topics and presumably to others. They are very generic; expect, of course, if we introduce another topics such as furniture. A document is a collection of words, so a document can have complex relationships with a set of topics. But in the end, it is more likely to see words coming from the same topic or the same topics within a paragraph and to some extent to the document. In general, we model a document with a bag of words model, that is, we consider a document to be a randomly generated set of words, using a specific distribution over the words. If this distribution is uniform over all the words, then the document will be purely random without a specific meaning. However, if this distribution has a specific form, with more probability mass to related words, then the collection of words generated by this model will have a meaning. Of course, generating documents is not really the application we have in mind for such a model. What we are interested in is the analysis of documents, their classification, and automatic understanding. Let's say is  a categorical variable (in other words, a histogram), representing the probability of appearance of all words from a dictionary. 
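The LDA equations that the rest of this section walks through also appeared as figures in the original. For reference, in the standard LDA formulation they read as follows (notation assumed: alpha and beta are the topic and word-distribution parameters, theta the topic mixture, z_n the topic of word w_n, N the number of words in a document, and M the number of documents in the collection D):

p(\theta, z, w \mid \alpha, \beta) \;=\; p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)

p(w \mid \alpha, \beta) \;=\; \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta

p(D \mid \alpha, \beta) \;=\; \prod_{d=1}^{M} p(w_d \mid \alpha, \beta)

p(\theta, z \mid w, \alpha, \beta) \;=\; \frac{p(\theta, z, w \mid \alpha, \beta)}{p(w \mid \alpha, \beta)}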
Usually, in this kind of model, we restrict ourselves to long words only and remove the small words, like and, to, but, the, a, and so onThese words are usually called stop words. Let w_jbe the jth words in a document. The following three graphs show the progression from representing a document (left-most graph) to representing a collection of documents (the third graph): Let  be a distribution over topics, then in the second graph from the left, we extend this model by choosing the kind of topic that will be selected at any time and then generate a word out of it. Therefore, the variable zi now becomes the variable zij, that is, the topic iis selected for the word j. We can go even further and decide that we want to model a collection of documents, which seems natural if we consider that we have a big data set. Assuming that documents are i.i.d, the next step (the third graph) is a PGM that represents the generative model for M documents. And, because the distribution on  is categorical, we want to be Bayesian about it, mainly because it will help to model not to overfit and because we consider the selection of topics for a document to be a random process. Moreover, we want to apply the same treatment to the word variable by having a Dirichlet prior. This prior is used to avoid non-observed words that have zero probability. It smooths the distribution of words per topic. A uniform Dirichlet prior will induce a uniform prior distribution on all the words. And therefore, the final graph on the right represents the complete model. This is quite a complex graphical model but techniques have been developed to fit the parameters and use this model. If we follow this graphical model carefully, we have a process that generates documents based on a certain set of topics: α chooses the set of topics for a documents From θ, we generate a topic zij From this topic, we generate a word wj In this model, only the words are observable. All the other variables will have to be determined without observation, exactly like in the other mixture models. So, documents are represented as random mixtures over latent topics, in which each topic is represented as a distribution over words. The distribution of a topic mixture based on this graphical mode can be written as follows: You can see in this formula that for each word, we select a topic, hence the product from 1 to N. Integrating over θ and summing over z, the marginal distribution of a document is as follows: The final distribution can be obtained by taking the product of marginal distributions of single documents, so as to get the distribution over a collection of documents (assuming that documents are independently and identically distributed). Here, D is the collection of documents: The main problem to be solved now is how to compute the posterior distribution over θ and z, given a document. By applying the Bayes formula, we know the following: Unfortunately, this is intractable because of the normalization factor at the denominator. The original paper on LDA, therefore, refers to a technique called Variational inference, which aims at transforming a complex Bayesian inference problem into a simpler approximation which can be solved as an (convex) optimization problem. This technique is the third approach to Bayesian inference and has been used on many other problems. Summary The probabilistic graphical model framework offers a powerful and versatile framework to develop and extend many probabilistic models using an elegant graph-based formalism. 
It has many applications in biology, genomics, medicine, finance, robotics, computer vision, automation, engineering, law, and games. Many packages in R exist to deal with all sorts of models and data, among which gRain and RStan are very popular. Resources for Article: Further resources on this subject: Extending ElasticSearch with Scripting [article] Exception Handling in MySQL for Python [article] Breaking the Bank [article]

Step Detector and Step Counters Sensors

Packt
14 Apr 2016
13 min read
In this article by Varun Nagpal, author of the book, Android Sensor Programming By Example, we will focus on learning about the use of step detector and step counter sensors. These sensors are very similar to each other and are used to count the steps. Both the sensors are based on a common hardware sensor, which internally uses accelerometer, but Android still treats them as logically separate sensors. Both of these sensors are highly battery optimized and consume very low power. Now, lets look at each individual sensor in detail. (For more resources related to this topic, see here.) In this article by Varun Nagpal, author of the book, Android Sensor Programming By Example, we will focus on learning about the use of step detector and step counter sensors. These sensors are very similar to each other and are used to count the steps. Both the sensors are based on a common hardware sensor, which internally uses accelerometer, but Android still treats them as logically separate sensors. Both of these sensors are highly battery optimized and consume very low power. Now, lets look at each individual sensor in detail. The step counter sensor The step counter sensor is used to get the total number of steps taken by the user since the last reboot (power on) of the phone. When the phone is restarted, the value of the step counter sensor is reset to zero. In the onSensorChanged() method, the number of steps is give by event.value[0]; although it's a float value, the fractional part is always zero. The event timestamp represents the time at which the last step was taken. This sensor is especially useful for those applications that don't want to run in the background and maintain the history of steps themselves. This sensor works in batches and in continuous mode. If we specify 0 or no latency in the SensorManager.registerListener() method, then it works in a continuous mode; otherwise, if we specify any latency, then it groups the events in batches and reports them at the specified latency. For prolonged usage of this sensor, it's recommended to use the batch mode, as it saves power. Step counter uses the on-change reporting mode, which means it reports the event as soon as there is change in the value. The step detector sensor The step detector sensor triggers an event each time a step is taken by the user. The value reported in the onSensorChanged() method is always one, the fractional part being always zero, and the event timestamp is the time when the user's foot hit the ground. The step detector sensor has very low latency in reporting the steps, which is generally within 1 to 2 seconds. The Step detector sensor has lower accuracy and produces more false positive, as compared to the step counter sensor. The step counter sensor is more accurate, but has more latency in reporting the steps, as it uses this extra time after each step to remove any false positive values. The step detector sensor is recommended for those applications that want to track the steps in real time and want to maintain their own history of each and every step with their timestamp. Time for action – using the step counter sensor in activity Now, you will learn how to use the step counter sensor with a simple example. The good thing about the step counter is that, unlike other sensors, your app doesn't need to tell the sensor when to start counting the steps and when to stop counting them. It automatically starts counting as soon as the phone is powered on. 
For using it, we just have to register the listener with the sensor manager and then unregister it after using it. In the following example, we will show the total number of steps taken by the user since the last reboot (power on) of the phone in the Android activity. We created a PedometerActivity and implemented it with the SensorEventListener interface, so that it can receive the sensor events. We initiated the SensorManager and Sensor object of the step counter and also checked the sensor availability in the OnCreate() method of the activity. We registered the listener in the onResume() method and unregistered it in the onPause() method as a standard practice. We used a TextView to display the total number of steps taken and update its latest value in the onSensorChanged() method. public class PedometerActivity extends Activity implements SensorEventListener{ private SensorManager mSensorManager; private Sensor mSensor; private boolean isSensorPresent = false; private TextView mStepsSinceReboot; @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_pedometer); mStepsSinceReboot = (TextView)findViewById(R.id.stepssincereboot); mSensorManager = (SensorManager) this.getSystemService(Context.SENSOR_SERVICE); if(mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_COUNTER) != null) { mSensor = mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_COUNTER); isSensorPresent = true; } else { isSensorPresent = false; } } @Override protected void onResume() { super.onResume(); if(isSensorPresent) { mSensorManager.registerListener(this, mSensor, SensorManager.SENSOR_DELAY_NORMAL); } } @Override protected void onPause() { super.onPause(); if(isSensorPresent) { mSensorManager.unregisterListener(this); } } @Override public void onSensorChanged(SensorEvent event) { mStepsSinceReboot.setText(String.valueOf(event.values[0])); } Time for action – maintaining step history with step detector sensor The Step counter sensor works well when we have to deal with the total number of steps taken by the user since the last reboot (power on) of the phone. It doesn't solve the purpose when we have to maintain history of each and every step taken by the user. The Step counter sensor may combine some steps and process them together, and it will only update with an aggregated count instead of reporting individual step detail. For such cases, the step detector sensor is the right choice. In our next example, we will use the step detector sensor to store the details of each step taken by the user, and we will show the total number of steps for each day, since the application was installed. Our next example will consist of three major components of Android, namely service, SQLite database, and activity. Android service will be used to listen to all the individual step details using the step counter sensor when the app is in the background. All the individual step details will be stored in the SQLite database and finally the activity will be used to display the list of total number of steps along with dates. Let's look at the each component in detail. The first component of our example is PedometerListActivity. We created a ListView in the activity to display the step count along with dates. Inside the onCreate() method of PedometerListActivity, we initiated the ListView and ListAdaptor required to populate the list. 
Another important task that we do in the onCreate() method is starting the service (StepsService.class), which will listen to all the individual steps' events. We also make a call to the getDataForList() method, which is responsible for fetching the data for ListView. public class PedometerListActivity extends Activity{ private ListView mSensorListView; private ListAdapter mListAdapter; @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_main); mSensorListView = (ListView)findViewById(R.id.steps_list); getDataForList(); mListAdapter = new ListAdapter(); mSensorListView.setAdapter(mListAdapter); Intent mStepsIntent = new Intent(getApplicationContext(), StepsService.class); startService(mStepsIntent); } In our example, the DateStepsModel class is used as a POJO (Plain Old Java Object) class, which is a handy way of grouping logical data together, to store the total number of steps and date. We also use the StepsDBHelper class to read and write the steps data in the database (discussed further in the next section). Inside the getDataForList() method, we initiated the object of the StepsDBHelper class and call the readStepsEntries() method of the StepsDBHelper class, which returns ArrayList of the DateStepsModel objects containing the total number of steps along with dates after reading from database. The ListAdapter class is used for populating the values for ListView, which internally uses ArrayList of DateStepsModel as the data source. The individual list item is the string, which is the concatenation of date and the total number of steps. class DateStepsModel { public String mDate; public int mStepCount; } private StepsDBHelper mStepsDBHelper; private ArrayList<DateStepsModel> mStepCountList; public void getDataForList() { mStepsDBHelper = new StepsDBHelper(this); mStepCountList = mStepsDBHelper.readStepsEntries(); } private class ListAdapter extends BaseAdapter{ private TextView mDateStepCountText; @Override public int getCount() { return mStepCountList.size(); } @Override public Object getItem(int position) { return mStepCountList.get(position); } @Override public long getItemId(int position) { return position; } @Override public View getView(int position, View convertView, ViewGroup parent) { if(convertView==null){ convertView = getLayoutInflater().inflate(R.layout.list_rows, parent, false); } mDateStepCountText = (TextView)convertView.findViewById(R.id.sensor_name); mDateStepCountText.setText(mStepCountList.get(position).mDate + " - Total Steps: " + String.valueOf(mStepCountList.get(position).mStepCount)); return convertView; } } The second component of our example is StepsService, which runs in the background and listens to the step detector sensor until the app is uninstalled. We implemented this service with the SensorEventListener interface so that it can receive the sensor events. We also initiated theobjects of StepsDBHelper, SensorManager, and the step detector sensor inside the OnCreate() method of the service. We only register the listener when the step detector sensor is available on the device. A point to note here is that we never unregistered the listener because we expect our app to log the step information indefinitely until the app is uninstalled. Both step detector and step counter sensors are very low on battery consumptions and are highly optimized at the hardware level, so if the app really requires, it can use them for longer durations without affecting the battery consumption much. 
We get a step detector sensor callback in the onSensorChanged() method whenever the operating system detects a step, and from CC: specify, we call the createStepsEntry() method of the StepsDBHelperclass to store the step information in the database. public class StepsService extends Service implements SensorEventListener{ private SensorManager mSensorManager; private Sensor mStepDetectorSensor; private StepsDBHelper mStepsDBHelper; @Override public void onCreate() { super.onCreate(); mSensorManager = (SensorManager) this.getSystemService(Context.SENSOR_SERVICE); if(mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_DETECTOR) != null) { mStepDetectorSensor = mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_DETECTOR); mSensorManager.registerListener(this, mStepDetectorSensor, SensorManager.SENSOR_DELAY_NORMAL); mStepsDBHelper = new StepsDBHelper(this); } } @Override public int onStartCommand(Intent intent, int flags, int startId) { return Service.START_STICKY; } @Override public void onSensorChanged(SensorEvent event) { mStepsDBHelper.createStepsEntry(); } The last component of our example is the SQLite database. We created a StepsDBHelper class and extended it from the SQLiteOpenHelper abstract utility class provided by the Android framework to easily manage database operations. In the class, we created a database called StepsDatabase, which is automatically created on the first object creation of the StepsDBHelper class by the OnCreate() method. This database has one table StepsSummary, which consists of only three columns (id, stepscount, and creationdate). The first column, id, is the unique integer identifier for each row of the table and is incremented automatically on creation of every new row. The second column, stepscount, is used to store the total number of steps taken for each date. The third column, creationdate, is used to store the date in the mm/dd/yyyy string format. Inside the createStepsEntry() method, we first check whether there is an existing step count with the current date, and we if find one, then we read the existing step count of the current date and update the step count by incrementing it by 1. If there is no step count with the current date found, then we assume that it is the first step of the current date and we create a new entry in the table with the current date and step count value as 1. The createStepsEntry() method is called from onSensorChanged() of the StepsService class whenever a new step is detected by the step detector sensor. 
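The full Java implementation of StepsDBHelper follows below. If it helps to see the create-or-increment logic in isolation first, here is a minimal sketch of the same idea using Python's built-in sqlite3 module; it is only an illustration (the table and column names mirror the article), not code used by the app.

import sqlite3
from datetime import date

def create_steps_entry(db_path="StepsDatabase.db"):
    # Insert or increment today's step count, mirroring createStepsEntry()
    today = date.today().strftime("%m/%d/%Y")
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS StepsSummary ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                "creationdate TEXT, stepscount INTEGER)")
    row = cur.execute("SELECT stepscount FROM StepsSummary WHERE creationdate = ?",
                      (today,)).fetchone()
    if row is None:
        # First step of the day: create a new row with a count of 1
        cur.execute("INSERT INTO StepsSummary (creationdate, stepscount) VALUES (?, 1)",
                    (today,))
    else:
        # A row already exists for today: increment its count
        cur.execute("UPDATE StepsSummary SET stepscount = ? WHERE creationdate = ?",
                    (row[0] + 1, today))
    conn.commit()
    conn.close()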
public class StepsDBHelper extends SQLiteOpenHelper { private static final int DATABASE_VERSION = 1; private static final String DATABASE_NAME = "StepsDatabase"; private static final String TABLE_STEPS_SUMMARY = "StepsSummary"; private static final String ID = "id"; private static final String STEPS_COUNT = "stepscount"; private static final String CREATION_DATE = "creationdate";//Date format is mm/dd/yyyy private static final String CREATE_TABLE_STEPS_SUMMARY = "CREATE TABLE " + TABLE_STEPS_SUMMARY + "(" + ID + " INTEGER PRIMARY KEY AUTOINCREMENT," + CREATION_DATE + " TEXT,"+ STEPS_COUNT + " INTEGER"+")"; StepsDBHelper(Context context) { super(context, DATABASE_NAME, null, DATABASE_VERSION); } @Override public void onCreate(SQLiteDatabase db) { db.execSQL(CREATE_TABLE_STEPS_SUMMARY); } public boolean createStepsEntry() { boolean isDateAlreadyPresent = false; boolean createSuccessful = false; int currentDateStepCounts = 0; Calendar mCalendar = Calendar.getInstance(); String todayDate = String.valueOf(mCalendar.get(Calendar.MONTH))+"/" + String.valueOf(mCalendar.get(Calendar.DAY_OF_MONTH))+"/"+String.valueOf(mCalendar.get(Calendar.YEAR)); String selectQuery = "SELECT " + STEPS_COUNT + " FROM " + TABLE_STEPS_SUMMARY + " WHERE " + CREATION_DATE +" = '"+ todayDate+"'"; try { SQLiteDatabase db = this.getReadableDatabase(); Cursor c = db.rawQuery(selectQuery, null); if (c.moveToFirst()) { do { isDateAlreadyPresent = true; currentDateStepCounts = c.getInt((c.getColumnIndex(STEPS_COUNT))); } while (c.moveToNext()); } db.close(); } catch (Exception e) { e.printStackTrace(); } try { SQLiteDatabase db = this.getWritableDatabase(); ContentValues values = new ContentValues(); values.put(CREATION_DATE, todayDate); if(isDateAlreadyPresent) { values.put(STEPS_COUNT, ++currentDateStepCounts); int row = db.update(TABLE_STEPS_SUMMARY, values, CREATION_DATE +" = '"+ todayDate+"'", null); if(row == 1) { createSuccessful = true; } db.close(); } else { values.put(STEPS_COUNT, 1); long row = db.insert(TABLE_STEPS_SUMMARY, null, values); if(row!=-1) { createSuccessful = true; } db.close(); } } catch (Exception e) { e.printStackTrace(); } return createSuccessful; } The readStepsEntries() method is called from PedometerListActivity to display the total number of steps along with the date in the ListView. The readStepsEntries() method reads all the step counts along with their dates from the table and fills the ArrayList of DateStepsModelwhich is used as a data source for populating the ListView in PedometerListActivity. public ArrayList<DateStepsModel> readStepsEntries() { ArrayList<DateStepsModel> mStepCountList = new ArrayList<DateStepsModel>(); String selectQuery = "SELECT * FROM " + TABLE_STEPS_SUMMARY; try { SQLiteDatabase db = this.getReadableDatabase(); Cursor c = db.rawQuery(selectQuery, null); if (c.moveToFirst()) { do { DateStepsModel mDateStepsModel = new DateStepsModel(); mDateStepsModel.mDate = c.getString((c.getColumnIndex(CREATION_DATE))); mDateStepsModel.mStepCount = c.getInt((c.getColumnIndex(STEPS_COUNT))); mStepCountList.add(mDateStepsModel); } while (c.moveToNext()); } db.close(); } catch (Exception e) { e.printStackTrace(); } return mStepCountList; } What just happened? We created a small pedometer utility app that maintains the step history along with dates using the steps detector sensor. We used PedometerListActivityto display the list of the total number of steps along with their dates. StepsServiceis used to listen to all the steps detected by the step detector sensor in the background. 
And finally, the StepsDBHelper class is used to create and update the total step count for each date and to read the total step counts along with dates from the database. Resources for Article: Further resources on this subject: Introducing the Android UI [article] Building your first Android Wear Application [article] Mobile Phone Forensics – A First Step into Android Forensics [article]

Remote Authentication

Packt
14 Apr 2016
9 min read
When setting up a Linux system, security is supposed to be an important part of all the stages. A good knowledge of the fundamentals of Linux is essential to implement a good security policy on the machine. In this article by Tajinder Pal Singh Kalsi, author of the book, Practical Linux Security Cookbook, we will discuss the following topics: Remote server / Host access using SSH SSH root login disable or enable Key based Login into SSH for restricting remote access (For more resources related to this topic, see here.) Remote server / host access using SSH SSH or Secure Shell is a protocol which is used to log onto remote systems securely and is the most used method for accessing remote Linux systems. Getting ready To see how to use SSH, we need two Ubuntu systems. One will be used as server and the other as client. How to do it… To use SSH we can use freely available software called—OpenSSH. Once the software is installed it can be used by the command ssh, on the Linux system. We will see how to use this tool in detail. If the software to use SSH is not already installed we have to install it on both the server and the client system. The command to install the tool on the server system is: sudo apt-get install openssh-server The output obtained will be as follows: Next we need to install the client version of the software: sudo apt-get install openssh-client The output obtained will be as follows: For latest versions ssh service starts running as soon as the software is installed. If it is not running by default, we can start the service by using the command: sudo service ssh start The output obtained will be as follows: Now if we want to login from the client system to the server system, the command will be as follows: ssh remote_ip_address Here remote_ip_address refers to the IP address of the server system. Also this command assumes that the username on the client machine is the same as that on the server machine: ssh remote_ip_address If we want to login for different user, the command will be as follows: ssh username@remote_ip_address The output obtained will be as follows: Next we need to configure SSH to use it as per our requirements. The main configuration file for sshd in Ubuntu is located at /etc/ssh/sshd_config. Before making any changes to the original version of this file, create a backup using the command: sudo cp /etc/ssh/sshd_config{,.bak} The configuration file defines the default settings for SSH on the server system. When we open the file in any editor, we can see that the default port declaration on which the sshd server listens for the incoming connections is 22. We can change this to any non-standard port to secure the server from random port scans, hence making it more secure. Suppose we change the port to 888, then next time the client wants to connect to the SSH server, the command will be as follows: ssh -p port_numberremote_ip_address The output obtained will be as follows: As we can see when we run the command without specifying the port number, the connection is refused. Next when we mention the correct port number, the connection is established. How it works… SSH is used to connect a client program to a SSH server. On one system we install the openssh-server package to make it the SSH server and on the other system we install the openssh-client package to use it as client. Now keeping the SSH service running on the server system, we try to connect to it through the client. 
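If you ever need to script this client-side connection, for instance to confirm that the non-standard port works, the third-party paramiko library offers one way to do it from Python. This is a sketch, not part of the recipe itself; the host address, port, and credentials are placeholders taken from the examples above.

# Requires: pip install paramiko
import paramiko

client = paramiko.SSHClient()
# Accept the server's host key automatically (fine for a lab, not for production)
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# Connect to the SSH server on the non-standard port configured in sshd_config
client.connect("192.168.1.101", port=888, username="tajinder", password="your_password")
stdin, stdout, stderr = client.exec_command("hostname")
print(stdout.read().decode().strip())
client.close()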
We use the configuration file of SSH to change the settings like default port for connecting. SSH root login disable or enable The Linux systems have root account by default which is enabled by default. If unauthorized users get ssh root access on the system, it is not a good idea because this will give an attacker access to the complete system. We can disable or enable the root login for ssh as per requirement to prevent the chances of an attacker getting access to the system. Getting Ready We need 2 Linux systems to be used as server and client. On the server system, install the package openssh-server, as shown in the preceding recipe. How to do it… First we will see how to disable SSH Root login and then we will also see how to enable it again Firstly open the main configuration file of ssh—/etc/ssh/sshd_config, in any editor. sudo nano /etc/ssh/sshd_config Now look for the line that reads as follows: PermitRootLogin yes Change the value yes to no. Then save and close the file: PermitRootLogin no The output obtained will be as follows: Once done, restart the SSH daemon service using the command as shown here: Now let's try to login as root. We should get an error – Permission Denied as the root login has been disabled: Now whenever we want to login as root, first we will have to login as normal user. And after that we can use the su command and switch to root user. So, the user accounts which are not listed in /etc/sudoers file will not be able to switch to root user and the system will be more secure: Now if we want to again enable SSH Root login, we just need to edit /etc/ssh/sshd_config file again and change the option no to yes again: PermitRootLogin yes The output obtained will be as follows: Then restart the service again by using the command: Now if we try to login as root again, it will work: How it works… When we try to connect to a remote system using SSH, the remote system checks its configuration file at /etc/ssh/sshd_config and according to the details mentioned in this file it decides whether the connection should be allowed or refused. When we change the value of PermitRootLogin according the working also changes. There's more… Suppose we have many user accounts on the systems, then we need to edit the /etc/ssh/sshd_config file in such a way that remote access is allowed only for few mentioned users. sudo nano /etc/ssh/sshd_config Add the line: AllowUsers tajinder user1 Now restart the ssh service: sudo service ssh restart Now when we try to login with user1, the login is successful. However, when we try to login with user2 which is not added in /etc/ssh/sshd_config file, the login fails and we get the error Permission denied, as shown here: Key based login into SSH for restricting remote access Even though SSH login is protected by using passwords for the user account, we can make it more secure by using Key based authentication into SSH. Getting ready To see how key based authentication works, we would need two Linux system (in our example both our Ubuntu systems). One should have the OpenSSH server package installed on it. How to do it... To use key-based authentication, we need to create a pair of keys—a private key and a public key. On the client or local system, we will execute the following command  to generate the SSH keys pair: ssh-keygen-trsa The output obtained will be as follows: While creating the key, we can accept the defaults values or change them as per our wish. It will also ask for a passphrase, which you can choose anything or else leave it blank. 
The key pair will be created in the ~/.ssh/ directory. Change to this directory and then use the ls -l command to see the details of the key files: We can see that the id_rsa file can be read and written only by the owner. This permission ensures that the file is kept secure. Now we need to copy the public key file to the remote SSH server. To do so, we run the command: ssh-copy-id 192.168.1.101 The output obtained will be as follows: An SSH session will be started and will prompt for the password of the user account. Once the correct password has been entered, the key will be copied to the remote server. Once the public key has been successfully copied to the remote server, try to log in to the server again using the ssh 192.168.1.101 command: We can see that we are no longer prompted for the user account's password. Since we configured a passphrase for the SSH key, we are asked for that instead; otherwise, we would have been logged into the system without being asked for anything. How it works... When we create the SSH key pair and move the public key to the remote system, it works as an authentication method for connecting to the remote system. If the public key present on the remote system matches the public key generated by the local system, and the local system has the private key to complete the key pair, the login succeeds. Otherwise, if any key file is missing, login is not allowed. Summary Linux security is a massive subject and everything cannot be covered in just one article. Still, Practical Linux Security Cookbook will give you a lot of recipes for securing your machine. It can be referred to as a practical guide for administrators and will help them configure a more secure machine. Resources for Article: Further resources on this subject: Wireless Attacks in Kali Linux [article] Creating a VM using VirtualBox - Ubuntu Linux [article] Building tiny Web-applications in Ruby using Sinatra [article]

Detecting fraud on e-commerce orders with Benford's law

Packt
14 Apr 2016
7 min read
In this article by Andrea Cirillo, author of the book RStudio for R Statistical Computing Cookbook, has explained how to detect fraud on e-commerce orders. Benford's law is a popular empirical law that states that the first digits of a population of data will follow a specific logarithmic distribution. This law was observed by Frank Benford around 1938 and since then has gained increasing popularity as a way to detect anomalous alteration of population of data. Basically, testing a population against Benford's law means verifying that the given population respects this law. If deviations are discovered, the law performs further analysis for items related to those deviations. In this recipe, we will test a population of e-commerce orders against the law, focusing on items deviating from the expected distribution. (For more resources related to this topic, see here.) Getting ready This recipe will use functions from the well-documented benford.analysis package by Carlos Cinelli. We therefore need to install and load this package: install.packages("benford.analysis") library(benford.analysis) In our example, we will use a data frame that stores e-commerce orders, provided within the book as an .Rdata file. In order to make it available within your environment, we need to load this file by running the following command (assuming the file is within your current working directory): load("ecommerce_orders_list.Rdata") How to do it... Perform Benford test on the order amounts: benford_test <- benford(ecommerce_orders_list$order_amount,1) Plot test analysis: plot(benford_test) This will result in the following plot: Highlights supectes digits: suspectsTable(benford_test) This will produce a table showing for each digit absolute differences between expected and observed frequencies. The first digits will therefore be more anomalous ones: > suspectsTable(benford_test)    digits absolute.diff 1:      5     4860.8974 2:      9     3764.0664 3:      1     2876.4653 4:      2     2870.4985 5:      3     2856.0362 6:      4     2706.3959 7:      7     1567.3235 8:      6     1300.7127 9:      8      200.4623 Define a function to extrapolate the first digit from each amount: left = function (string,char){   substr(string,1,char)} Extrapolate the first digit from each amount: ecommerce_orders_list$first_digit <- left(ecommerce_orders_list$order_amount,1) Filter amounts starting with the suspected digit: suspects_orders <- subset(ecommerce_orders_list,first_digit == 5) How it works Step 1 performs the Benford test on the order amounts. In this step, we applied the benford() function to the amounts. Applying this function means evaluating the distribution of the first digits of amounts against the expected Benford distribution. 
The function will result in the production of the following objects: Object Description Info This object covers the following general information: data.name: This shows the name of the data used n: This shows the number of observations used n.second.order: This shows the number of observations used for second-order analysis number.of.digits: This shows the number of first digits analyzed Data This is a data frame with the following subobjects: lines.used: This shows  the original lines of the dataset data.used: This shows the data used data.mantissa: This shows the log data's mantissa data.digits: This shows the first digits of the data s.o.data This is a data frame with the following subobjects: data.second.order: This shows the differences of the ordered data  data.second.order.digits: This shows the first digits of the second-order analysis Bfd This is a data frame with the following subobjects: digits: This highlights the groups of digits analyzed data.dist: This highlights the distribution of the first digits of the data data.second.order.dist: This highlights the distribution of the first digits of the second-order analysis benford.dist: This shows the theoretical Benford distribution data.second.order.dist.freq: This shows the frequency distribution of the first digits of the second-order analysis data.dist.freq: This shows the frequency distribution of the first digits of the data benford.dist.freq: This shows the theoretical Benford frequency distribution benford.so.dist.freq: This shows the theoretical Benford frequency distribution of the second order analysis. data.summation: This shows the summation of the data values grouped by first digits abs.excess.summation: This shows the absolute excess summation of the data values grouped by first digits difference: This highlights the difference between the data and Benford frequencies squared.diff: This shows the chi-squared difference between the data and Benford frequencies absolute.diff: This highlights the absolute difference between the data and Benford frequencies Mantissa This is a data frame with the following subobjects: mean.mantissa: This shows the mean of the mantissa var.mantissa: This shows the variance of the mantissa ek.mantissa: This shows the excess kurtosis of the mantissa sk.mantissa: This highlights the skewness of the mantissa MAD This object depicts the mean absolute deviation. distortion.factor This object talks about the distortion factor. Stats This object lists of htest class statistics as follows: chisq: This lists the Pearson's Chi-squared test. mantissa.arc.test: This lists the Mantissa Arc test Step 2 plots test results. Running plot on the object resulting from the benford() function will result in a plot showing the following (from upper-left corner to bottom-right corner): First digit distribution Results of second-order test Summation distribution for each digit Results of chi-squared test Summation differences If you look carefully at these plots, you will understand which digits show up a distribution significantly different from the one expected from the Benford law. Nevertheless, in order to have a sounder base for our consideration, we need to look at the suspects table, showing absolute differences between expected and observed frequencies. This is what we will do in the next step. Step 3 highlights suspects digits. Using suspectsTable() we can easily discover which digits presents the greater deviation from the expected distribution. 
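As a quick aside before moving on to the suspects table: the theoretical frequencies that the package compares the data against follow Benford's logarithmic law, where the probability of a leading digit d is log10(1 + 1/d). A short Python sketch, purely illustrative, reproduces them:

# Expected Benford frequency of each leading digit d = 1..9
import math

for d in range(1, 10):
    print("{}: {:.3f}".format(d, math.log10(1 + 1.0 / d)))
# Prints 1: 0.301, 2: 0.176, ..., 9: 0.046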
Looking at the so-called suspects table, we can see that the digit 5 shows the greatest deviation and therefore tops the table. In the next step, we will focus our attention on the orders whose amounts have this digit as the first digit.
Step 4 defines a function to extract the first digit from each amount. This function leverages the base R substr() function and extracts the first digit from the number passed to it as an argument.
Step 5 adds a new column to the investigated dataset, in which the extracted first digit is stored.
Step 6 filters amounts starting with the suspected digit. After applying the left() function to our sequence of amounts, we can now filter the dataset, retaining only the rows whose amounts have 5 as the first digit. We will then be able to perform analytical testing procedures on those items.
Summary
In this article, you learned how to apply the R language to an e-commerce fraud detection system.
Resources for Article:
Further resources on this subject:
Recommending Movies at Scale (Python) [article]
Visualization of Big Data [article]
Big Data Analysis (R and Hadoop) [article]

Using the Registry and xlsxwriter modules

Packt
14 Apr 2016
12 min read
In this article by Chapin Bryce and Preston Miller, the authors of Learning Python for Forensics, we will learn about the features offered by the Registry and xlsxwriter modules. (For more resources related to this topic, see here.)
Working with the Registry module
The Registry module, developed by Willi Ballenthin, can be used to obtain keys and values from registry hives. Python provides a built-in registry module called _winreg; however, this module only works on Windows machines. The _winreg module interacts with the registry on the system running the module. It does not support opening external registry hives. The Registry module allows us to interact with the supplied registry hives and can be run on non-Windows machines.
The Registry module can be downloaded from https://github.com/williballenthin/python-registry. Click on the releases section to see a list of all the stable versions and download the latest version. For this article, we use version 1.1.0. Once the archived file is downloaded and extracted, we can run the included setup.py file to install the module. In a command prompt, execute the following code in the module's top-level directory as shown:
python setup.py install
This should install the Registry module successfully on your machine. We can confirm this by opening the Python interactive prompt and typing import Registry. We will receive an error if the module is not installed successfully. With the Registry module installed, let's begin to learn how we can leverage this module for our needs.
First, we need to import the Registry class from the Registry module. Then, we use the Registry function to open the registry object that we want to query. Next, we use the open() method to navigate to our key of interest. In this case, we are interested in the RecentDocs registry key. This key contains recent active files separated by extension, as shown:
>>> from Registry import Registry
>>> reg = Registry.Registry('NTUSER.DAT')
>>> recent_docs = reg.open('SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs')
If we print the recent_docs variable, we can see that it contains 11 values with five subkeys, which may contain additional values and subkeys. Additionally, we can use the timestamp() method to see the last written time of the registry key.
>>> print recent_docs
Registry Key CMI-CreateHive{B01E557D-7818-4BA7-9885-E6592398B44E}\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs with 11 values and 5 subkeys
>>> print recent_docs.timestamp() # Last Written Time
2012-04-23 09:34:12.099998
We can iterate over the values in the recent_docs key using the values() function in a for loop. For each value, we can access the name(), value(), raw_data(), value_type(), and value_type_str() methods. The value() and raw_data() methods represent the data in different ways. We will use the raw_data() function when we want to work with the underlying binary data and use the value() function to gather an interpreted result. The value_type() and value_type_str() functions display a number or string that identifies the type of data, such as REG_BINARY, REG_DWORD, REG_SZ, and so on.
>>> for i, value in enumerate(recent_docs.values()):
...     print '{}) {}: {}'.format(i, value.name(), value.value())
...
0) MRUListEx: ????
1) 0: myDocument.docx
2) 4: oldArchive.zip
3) 2: Salaries.xlsx
...
Another useful feature of the Registry module is the means provided for querying for a certain subkey or value. This is provided by the subkey(), value(), or find_key() functions.
A RegistryKeyNotFoundException is generated when a subkey is not present while using the subkey() function:
>>> if recent_docs.subkey('.docx'):
...     print 'Found docx subkey.'
...
Found docx subkey.
>>> if recent_docs.subkey('.1234abcd'):
...     print 'Found 1234abcd subkey.'
...
Registry.Registry.RegistryKeyNotFoundException: ...
The find_key() function takes a path and can find a subkey through multiple levels. The subkey() and value() functions only search child elements. We can use these functions to confirm that a key or value exists before trying to navigate to it. If a particular key or value cannot be found, a custom exception from the Registry module is raised. Be sure to add error handling to catch this error and also to alert the user that the key was not discovered.
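As a rough illustration (this is our sketch, not code from the book), such a check might look like the following, reusing the key path from the earlier example and assuming the exception is exposed as Registry.RegistryKeyNotFoundException, as the traceback above suggests:
from Registry import Registry

reg = Registry.Registry('NTUSER.DAT')
key_path = 'SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs'
try:
    recent_docs = reg.open(key_path)
except Registry.RegistryKeyNotFoundException:
    # Alert the user that the key was not discovered
    print 'Key not found: {}'.format(key_path)
else:
    print 'Found key with {} values'.format(len(recent_docs.values()))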
With the Registry module, finding keys and their values becomes straightforward. However, when the values are not strings and are instead binary data, we have to rely on another module to make sense of the mess. For all binary needs, the struct module is an excellent candidate.
Creating Spreadsheets with the xlsxwriter Module
xlsxwriter is a useful third-party module that writes Excel output. There is a plethora of Excel-supporting modules for Python, but we chose this module because it is highly robust and well-documented. As the name suggests, this module can only be used to write Excel spreadsheets. The xlsxwriter module supports cell and conditional formatting, charts, tables, filters, and macros, among others.
Adding data to a spreadsheet
Let's quickly create a script called simplexlsx.v1.py for this example. On lines 1 and 2, we import the xlsxwriter and datetime modules. The data we are going to be plotting, including the header row, is stored as nested lists in the school_data variable. Each list is a row of information that we want to store in the output Excel sheet, with the first element containing the column names.
001 import xlsxwriter
002 from datetime import datetime
003
004 school_data = [['Department', 'Students', 'Cumulative GPA', 'Final Date'],
005     ['Computer Science', 235, 3.44, datetime(2015, 07, 23, 18, 00, 00)],
006     ['Chemistry', 201, 3.26, datetime(2015, 07, 25, 9, 30, 00)],
007     ['Forensics', 99, 3.8, datetime(2015, 07, 23, 9, 30, 00)],
008     ['Astronomy', 115, 3.21, datetime(2015, 07, 19, 15, 30, 00)]]
The writeXLSX() function, defined on line 11, is responsible for writing our data into a spreadsheet. First, we must create our Excel spreadsheet using the Workbook() function, supplying the desired name of the file. On line 13, we create a worksheet using the add_worksheet() function. This function can take the desired title of the worksheet or use the default name 'Sheet N', where N is the specific sheet number.
011 def writeXLSX(data):
012     workbook = xlsxwriter.Workbook('MyWorkbook.xlsx')
013     main_sheet = workbook.add_worksheet('MySheet')
The date_format variable stores a custom number format that we will use to display our datetime objects in the desired format. On line 17, we begin to enumerate through our data to write it. The conditional on line 18 is used to handle the header row, which is the first list encountered. We use the write() function and supply a numerical row and column. Alternatively, we can also use the Excel notation, i.e. A1.
015     date_format = workbook.add_format({'num_format': 'mm/dd/yy hh:mm:ss AM/PM'})
016
017     for i, entry in enumerate(data):
018         if i == 0:
019             main_sheet.write(i, 0, entry[0])
020             main_sheet.write(i, 1, entry[1])
021             main_sheet.write(i, 2, entry[2])
022             main_sheet.write(i, 3, entry[3])
The write() method will try to write the appropriate type for an object when it can detect the type. However, we can use different write methods to specify the correct format. These specialized writers preserve the data type in Excel so that we can use the appropriate data-type-specific Excel functions for the object. Since we know the data types within the entry list, we can manually specify when to use the general write() function or the specific write_number() function.
023         else:
024             main_sheet.write(i, 0, entry[0])
025             main_sheet.write_number(i, 1, entry[1])
026             main_sheet.write_number(i, 2, entry[2])
For the fourth entry in the list, the datetime object, we supply the write_datetime() function with our date_format defined on line 15. After our data is written to the workbook, we use the close() function to close and save our data. On line 32, we call the writeXLSX() function, passing it the school_data list we built earlier.
027             main_sheet.write_datetime(i, 3, entry[3], date_format)
028
029     workbook.close()
030
031
032 writeXLSX(school_data)
A table of write functions and the objects they preserve is presented below:
- write_string: str
- write_number: int, float, long
- write_datetime: datetime objects
- write_boolean: bool
- write_url: str
When the script is invoked at the command line, a spreadsheet called MyWorkbook.xlsx is created. When we convert this to a table, we can sort it according to any of our values. Had we failed to preserve the data types, values such as our dates might be identified as non-number types and prevent us from sorting them appropriately.
Building a table
Being able to write data to an Excel file and preserve the object type is a step up over CSV, but we can do better. Often, the first thing an examiner will do with an Excel spreadsheet is convert the data into a table and begin the frenzy of sorting and filtering. We can convert our data range to a table. In fact, writing a table with xlsxwriter is arguably easier than writing each row individually. The following code will be saved into the file simplexlsx.v2.py. For this iteration, we have removed the initial list in the school_data variable that contained the header information. Our new writeXLSX() function writes the header separately.
004 school_data = [['Computer Science', 235, 3.44, datetime(2015, 07, 23, 18, 00, 00)],
005     ['Chemistry', 201, 3.26, datetime(2015, 07, 25, 9, 30, 00)],
006     ['Forensics', 99, 3.8, datetime(2015, 07, 23, 9, 30, 00)],
007     ['Astronomy', 115, 3.21, datetime(2015, 07, 19, 15, 30, 00)]]
Lines 10 through 14 are identical to the previous iteration of the function. Representing our table on the spreadsheet is accomplished on line 16.
010 def writeXLSX(data):
011     workbook = xlsxwriter.Workbook('MyWorkbook.xlsx')
012     main_sheet = workbook.add_worksheet('MySheet')
013
014     date_format = workbook.add_format({'num_format': 'mm/dd/yy hh:mm:ss AM/PM'})
The add_table() function takes multiple arguments. First, we pass a string representing the top-left and bottom-right cells of the table in Excel notation. We use the length variable, defined on line 15, to calculate the necessary length of our table.
The second argument is a little more confusing; this is a dictionary with two keys, named data and columns. The data key has a value of our data variable, which is perhaps poorly named in this case. The columns key defines each column header and, optionally, its format, as seen on line 19:
015     length = str(len(data) + 1)
016     main_sheet.add_table(('A1:D' + length), {'data': data,
017         'columns': [{'header': 'Department'}, {'header': 'Students'},
018         {'header': 'Cumulative GPA'},
019         {'header': 'Final Date', 'format': date_format}]})
020     workbook.close()
In fewer lines than the previous example, we've managed to create a more useful output built as a table. Now our spreadsheet has our specified data already converted into a table and ready to be sorted. There are more possible keys and values that can be supplied during the construction of a table. Please consult the documentation at http://xlsxwriter.readthedocs.org for more details on advanced usage. This process is simple when we are working with nested lists representing each row of a worksheet. Data structures not in the specified format require a combination of both methods demonstrated in our previous iterations to achieve the same effect. For example, we can define a table to span across a certain number of rows and columns and then use the write() function for those cells. However, to prevent unnecessary headaches, we recommend keeping data in nested lists.
Creating charts with Python
Lastly, let's create a chart with xlsxwriter. The module supports a variety of different chart types, including line, scatter, bar, column, pie, and area. We use charts to summarize the data in meaningful ways. This is particularly useful when working with large data sets, allowing examiners to gain a high-level understanding of the data before getting into the weeds. Let's modify the previous iteration yet again to display a chart. We will save this modified file as simplexlsx.v3.py. On line 21, we are going to create a variable called department_grades. This variable will be our chart object, created by the add_chart() method. For this method, we pass in a dictionary specifying keys and values. In this case, we specify the type of the chart to be a column chart.
021     department_grades = workbook.add_chart({'type':'column'})
On line 22, we use the set_title() function and again pass it a dictionary of parameters. We set the name key equal to our desired title. At this point, we need to tell the chart what data to plot. We do this with the add_series() function. The categories key maps to the Excel notation specifying the horizontal axis data. The vertical axis is represented by the values key. With the data to plot specified, we use the insert_chart() function to plot the data in the spreadsheet. We give this function a string of the cell at which to place the top-left corner of the chart, and then the chart object itself.
022     department_grades.set_title({'name':'Department and Grade distribution'})
023     department_grades.add_series({'categories':'=MySheet!$A$2:$A$5', 'values':'=MySheet!$C$2:$C$5'})
024     main_sheet.insert_chart('A8', department_grades)
025     workbook.close()
Running this version of the script will convert our data into a table and generate a column chart comparing departments by their grades. We can clearly see that, unsurprisingly, the Forensic Science department has the highest GPA earners in the school's program. This information is easy enough to eyeball for such a small data set.
However, when working with data sets that are orders of magnitude larger, creating summarizing graphics can be particularly useful for understanding the big picture. Be aware that there is a great deal of additional functionality in the xlsxwriter module that we will not use in our script. This is an extremely powerful module and we recommend it for any operation that requires writing Excel spreadsheets.
Summary
In this article, we began by introducing the Registry module and how it is used to obtain keys and values from registry hives. Next, we dealt with various aspects of spreadsheets, such as cells, tables, and charts, using the xlsxwriter module.
Resources for Article:
Further resources on this subject:
Test all the things with Python [article]
An Introduction to Python Lists and Dictionaries [article]
Python Data Science Up and Running [article]

Understanding Proxmox VE and Advanced Installation

Packt
13 Apr 2016
12 min read
In this article by Wasim Ahmed, the author of the book Mastering Proxmox - Second Edition, we will see that virtualization, as we know it today, is a decades-old technology that was first implemented in the mainframes of the 1960s. Virtualization was a way to logically divide a mainframe's resources for different application processing. With the rise in energy costs, running under-utilized server hardware is no longer a luxury. Virtualization enables us to do more with less, thus saving energy and money while creating a virtual green data center without geographical boundaries. (For more resources related to this topic, see here.)
A hypervisor is a piece of software, hardware, or firmware that creates and manages virtual machines. It is the underlying platform or foundation that allows a virtual world to be built upon. In a way, it is the very building block of all virtualization. A bare metal hypervisor acts as a bridge between the physical hardware and the virtual machines by creating an abstraction layer. Because of this unique feature, an entire virtual machine can be moved over a vast distance over the Internet and still function exactly the same. A virtual machine does not see the hardware directly; instead, it sees the layer of the hypervisor, which is the same no matter what hardware the hypervisor has been installed on.
The Proxmox Virtual Environment (VE) is a cluster-based hypervisor and one of the best kept secrets in the virtualization world. The reason is simple. It allows you to build an enterprise business-class virtual infrastructure at a small business-class price tag without sacrificing stability, performance, and ease of use. Whether it is a massive data center serving millions of people, a small educational institution, or a home serving important family members, Proxmox can be configured to suit any situation. If you have picked up this article, no doubt you are familiar with virtualization and perhaps well versed in other hypervisors, such as VMware, Xen, Hyper-V, and so on. In this article and upcoming articles, we will see the mighty power of Proxmox from the inside out. We will examine scenarios and create a complex virtual environment. We will tackle some heavy day-to-day issues and show resolutions, which might just save the day in a production environment. So, strap yourself in and let's dive into the virtual world with the mighty hypervisor, Proxmox VE.
Understanding Proxmox features
Before we dive in, it is necessary to understand why one should choose Proxmox over the other mainstream hypervisors. Proxmox is not perfect but stands out among other contenders with some hard-to-beat features. The following are some of the features that make Proxmox a real game changer.
It is free!
Yes, Proxmox is free! To be more accurate, Proxmox has several subscription levels, among which the community edition is completely free. One can simply download the Proxmox ISO at no cost and raise a fully functional cluster without missing a single feature and without paying anything. The main difference between the paid and community subscription levels is that the paid subscriptions receive updates that go through additional testing and refinement. If you are running a production cluster with a real workload, it is highly recommended that you purchase support and licensing from Proxmox or a Proxmox reseller.
Built-in firewall
Proxmox VE comes with a robust firewall ready to be configured out of the box.
This firewall can be configured to protect the entire Proxmox cluster down to a single virtual machine. The per-VM firewall option gives you the ability to configure each VM individually by creating individualized firewall rules, a prominent feature in a multi-tenant virtual environment.
Open vSwitch
Licensed under the Apache 2.0 license, Open vSwitch is a virtual switch designed to work in a multi-server virtual environment. All hypervisors need a bridge between VMs and the outside network. Open vSwitch enhances the features of the standard Linux bridge in an ever-changing virtual environment. Proxmox fully supports Open vSwitch, which allows you to create an intricate virtual environment while reducing virtual network management overhead. For details on Open vSwitch, refer to http://openvswitch.org/.
The graphical user interface
Proxmox comes with a fully functional graphical user interface, or GUI, out of the box. The GUI allows an administrator to manage and configure almost all the aspects of a Proxmox cluster. The GUI has been designed keeping simplicity in mind, with functions and features separated into menus for easier navigation. The following screenshot shows an example of the Proxmox GUI dashboard:
KVM virtual machines
KVM, or Kernel-based Virtual Machine, is a kernel module that is added to Linux for full virtualization to create isolated, fully independent virtual machines. KVM VMs are not dependent on the host operating system in any way, but they do require the virtualization feature in the BIOS to be enabled. KVM allows a wide variety of operating systems for virtual machines, such as Linux and Windows. Proxmox provides a very stable environment for KVM-based VMs.
Linux containers or LXC
Introduced recently in Proxmox VE 4.0, Linux containers allow multiple Linux instances on the same Linux host. All the containers are dependent on the host Linux operating system, and only Linux flavors can be virtualized as containers. There are no containers for the Windows operating system. LXC replaces the OpenVZ containers that were the primary container virtualization method in previous Proxmox versions. If you are not familiar with LXC and for details on LXC, refer to https://linuxcontainers.org/.
Storage plugins
Out of the box, Proxmox VE supports a variety of storage systems to store virtual disk images, ISO templates, backups, and so on. All plug-ins are quite stable and work great with Proxmox. Being able to choose different storage systems gives an administrator the flexibility to leverage the existing storage in the network. As of Proxmox VE 4.0, the following storage plug-ins are supported:
- Local directory mount points
- iSCSI
- LVM group
- NFS share
- GlusterFS
- Ceph RBD
- ZFS
Vibrant culture
Proxmox has a growing community of users who are always helping others to learn Proxmox and troubleshoot various issues. With so many active users around the world and through the active participation of the Proxmox developers, the community has now become a culture of its own. Feature requests are continuously being worked on, and the existing features are being strengthened on a regular basis. With so many users supporting Proxmox, it is surely here to stay.
The basic installation of Proxmox
The installation of a Proxmox node is very straightforward. Simply accept the default options, select localization, and enter the network information to install Proxmox VE.
We can summarize the installation process in the following steps:
1. Download the ISO from the official Proxmox site (http://proxmox.com/en/downloads) and prepare a disc with the image.
2. Boot the node with the disc and hit Enter to start the installation from the installation GUI. We can also install Proxmox from a USB drive.
3. Progress through the prompts to select options or type in information.
4. After the installation is complete, access the Proxmox GUI dashboard using the IP address, as follows: https://<proxmox_node_ip>:8006
In some cases, it may be necessary to open the firewall port to allow access to the GUI over port 8006.
The advanced installation option
Although the basic installation works in all scenarios, there may be times when the advanced installation option is necessary. Only the advanced installation option provides you the ability to customize the main OS drive. A common practice for the operating system drive is to use a mirrored RAID array using a controller interface. This provides drive redundancy if one of the drives fails. The same level of redundancy can also be achieved using a software-based RAID array, such as ZFS. Proxmox now offers options to select ZFS-based arrays for the operating system drive right at the beginning of the installation. If you are not familiar with ZFS, refer to https://en.wikipedia.org/wiki/ZFS for details.
It is a common question to ask why one should choose ZFS software RAID over tried and tested hardware-based RAID. The simple answer is flexibility. A hardware RAID is locked to, or fully dependent on, the hardware RAID controller interface that created the array, whereas a ZFS software-based array is not dependent on any hardware, and the array can easily be ported to different hardware nodes. Should a RAID controller failure occur, the entire array created from that controller is lost unless an identical controller interface is available for replacement. A ZFS array is only lost when all the drives, or more than the maximum tolerable number of drives, are lost in the array. Besides ZFS, we can also select other filesystem types, such as ext3, ext4, or xfs, from the same advanced option. We can also set custom disk or partition sizes through the advanced option. The following screenshot shows the installation interface with the Target Hard disk selection page:
Click on Options, as shown in the preceding screenshot, to open the advanced options for the hard disk. The following screenshot shows the option window after clicking on the Options button:
In the preceding screenshot, we selected ZFS RAID1 for mirroring and the two drives, Harddisk 0 and Harddisk 1, respectively, to install Proxmox. If we pick one of the filesystems such as ext3, ext4, or xfs instead of ZFS, the Hard disk Option dialog box will look like the following screenshot, with a different set of options:
Selecting a filesystem gives us the following advanced options:
- hdsize: This is the total drive size to be used by the Proxmox installation.
- swapsize: This defines the swap partition size.
- maxroot: This defines the maximum size to be used by the root partition.
- minfree: This defines the minimum free space that should remain after the Proxmox installation.
- maxvz: This defines the maximum size for the data partition. This is usually /var/lib/vz.
Debugging the Proxmox installation
Debugging features are part of any good operating system. Proxmox has debugging features that will help you during a failed installation.
Some common reasons for a failed installation are unsupported hardware, conflicts between devices, ISO image errors, and so on. Debugging mode logs and displays installation activities in real time. When the standard installation fails, we can start the Proxmox installation in debug mode from the main installation interface, as shown in the following screenshot:
The debug installation mode will drop us at the following prompt. To start the installation, we need to press Ctrl + D. When there is an error during the installation, we can simply press Ctrl + C to get back to this console to continue with our investigation:
From the console, we can check the installation log using the following command:
# cat /tmp/install.log
From the main installation menu, we can also press e to enter edit mode to change the loader information, as shown in the following screenshot:
At times, it may be necessary to edit the loader information when normal booting does not function. This is a common case when Proxmox is unable to show the video output due to UEFI or an unsupported resolution. In such cases, the booting process may hang. One way to continue with booting is to add the nomodeset argument by editing the loader. The loader will look as follows after editing:
linux/boot/linux26 ro ramdisk_size=16777216 rw quiet nomodeset
Customizing the Proxmox splash screen
When building a custom Proxmox solution, it may be necessary to change the default blue splash screen to something more appealing in order to identify the company or department the server belongs to. In this section, we will see how easily we can integrate any image as the splash screen background. The splash screen image must be in the .tga format and must have one of the fixed standard sizes, such as 640 x 480, 800 x 600, or 1024 x 768. If you do not have any image software that supports the .tga format, you can easily convert a jpg, gif, or png image to the .tga format using a free online image converter (http://image.online-convert.com/convert-to-tga). Once the desired image is ready in the .tga format, the following steps will integrate the image as the Proxmox splash screen (a consolidated shell sketch of these steps appears at the end of this section):
1. Copy the .tga image to the /boot/grub directory on the Proxmox node.
2. Edit the grub file in /etc/default/grub to add the following line, and save it:
GRUB_BACKGROUND=/boot/grub/<image_name>.tga
3. Run the following command to update the grub configuration:
# update-grub
4. Reboot.
The following screenshot shows an example of how the splash screen may look after we add a custom image to it:
Picture courtesy of www.techcitynews.com
We can also change the font color to make it properly visible, depending on the custom image used. To change the font color, edit the debian theme file in /etc/grub.d/05_debian_theme, and find the following line of code:
set_background_image "${GRUB_BACKGROUND}" || set_default_theme
Edit the line to add the font colors, as shown in the following format. In our example, we have changed the font color to black and the highlighted font color to light blue:
set_background_image "${GRUB_BACKGROUND}" "black/black" "light-blue/black" || set_default_theme
After making the necessary changes, update grub and reboot to see the changes.
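The following is a minimal shell sketch of the splash screen steps above, assuming the custom image is available on the node as /root/myimage.tga (the image name and path are examples, not values from the article):
#!/bin/bash
# Sketch: set a custom GRUB splash screen on a Proxmox node (run as root).
set -e

IMAGE=/root/myimage.tga                      # example path; use your own .tga file
cp "$IMAGE" /boot/grub/myimage.tga           # step 1: copy the image into /boot/grub

# step 2: point GRUB at the background image
echo 'GRUB_BACKGROUND=/boot/grub/myimage.tga' >> /etc/default/grub

update-grub                                  # step 3: regenerate the GRUB configuration
reboot                                       # step 4: reboot to see the new splash screen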
Summary
In this article, we looked at why Proxmox is a better option as a hypervisor, what advanced installation options are available during an installation, and why we might choose software RAID for the operating system drive. We also looked at the cost of Proxmox, the storage options, and the network flexibility offered by Open vSwitch. We learned about the debugging features and the customization options for the Proxmox splash screen. In the next article, we will take a closer look at the Proxmox GUI and see how easy it is to centrally manage a Proxmox cluster from a web browser.
Resources for Article:
Further resources on this subject:
Proxmox VE Fundamentals [article]
Basic Concepts of Proxmox Virtual Environment [article]

Nginx "expires" directive – Emitting Caching Headers

Packt
13 Apr 2016
7 min read
In this article, Alex Kapranoff, the author of the book Nginx Troubleshooting, explains how all browsers (and even many non-browser HTTP clients) support client-side caching. It is a part of the HTTP standard, albeit one of its most complex parts to understand. Web servers obviously do not control client-side caching to the full extent, but they may issue recommendations about what to cache and how, in the form of special HTTP response headers. This is a topic thoroughly discussed in many great articles and guides, so we will mention it only briefly, with a lean towards problems you may face and how to troubleshoot them. (For more resources related to this topic, see here.)
In spite of the fact that browsers have supported caching on their side for at least 20 years, configuring cache headers has always been a little confusing, mostly due to the fact that there are two sets of headers designed for the same purpose but having different scopes and totally different formats. There is the Expires: header, which was designed as a quick and dirty solution, and also the (relatively) new, almost omnipotent Cache-Control: header, which tries to support all the different ways an HTTP cache could work.
This is an example of a modern HTTP request-response pair containing the caching headers. First are the request headers sent from the browser (here Firefox 41, but it does not matter):
User-Agent:"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:41.0) Gecko/20100101 Firefox/41.0"
Accept:"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
Accept-Encoding:"gzip, deflate"
Connection:"keep-alive"
Cache-Control:"max-age=0"
Then come the response headers:
Cache-Control:"max-age=1800"
Content-Encoding:"gzip"
Content-Type:"text/html; charset=UTF-8"
Date:"Sun, 10 Oct 2015 13:42:34 GMT"
Expires:"Sun, 10 Oct 2015 14:12:34 GMT"
X-Cache:"EXPIRED"
We highlighted the parts that are relevant. Note that some directives may be sent by both sides of the conversation. First, the browser sent the Cache-Control: max-age=0 header because the user pressed the F5 key. This is an indication that the user wants to receive a response that is fresh. Normally, the request will not contain this header and will allow any intermediate cache to respond with a stale but still nonexpired response. In this case, the server we talked to responded with a gzipped HTML page encoded in UTF-8 and indicated that the response is okay to use for half an hour. It used both mechanisms available, the modern Cache-Control:max-age=1800 header and the very old Expires:Sun, 10 Oct 2015 14:12:34 GMT header. The X-Cache: "EXPIRED" header is not a standard HTTP header but was also probably (there is no way to know for sure from the outside) emitted by Nginx. It may be an indication that there are, indeed, intermediate caching proxies between the client and the server, and one of them added this header for debugging purposes. The header may also show that the backend software uses some internal caching. Another possible source of this header is a debugging technique used to find problems in the Nginx cache configuration. The idea is to use the cache hit or miss status, which is available in one of the handy internal Nginx variables, as the value for an extra header and then to be able to monitor the status from the client side. This is the code that will add such a header:
add_header X-Cache $upstream_cache_status;
Nginx has a special directive that transparently sets up both of the standard cache control headers, and it is named expires.
This is a piece of the nginx.conf file using the expires directive:
location ~* \.(?:css|js)$ {
    expires 1y;
    add_header Cache-Control "public";
}
First, the pattern uses the so-called noncapturing parentheses, a feature that first appeared in Perl regular expressions. The effect of this regexp is the same as that of a simpler \.(css|js)$ pattern, but the regular expression engine is specifically instructed not to create a variable containing the actual string from inside the parentheses. This is a simple optimization. Then, the expires directive declares that the content of the css and js files will expire after a year of storage. The actual headers as received by the client will look like this:
Server: nginx/1.9.8 (Ubuntu)
Date: Fri, 11 Mar 2016 22:01:04 GMT
Content-Type: text/css
Last-Modified: Thu, 10 Mar 2016 05:45:39 GMT
Expires: Sat, 11 Mar 2017 22:01:04 GMT
Cache-Control: max-age=31536000
The last two lines contain the same information in wildly different forms. The Expires: header is exactly one year after the date in the Date: header, whereas Cache-Control: specifies the age in seconds so that the client does the date arithmetic itself.
The last directive in the provided configuration extract explicitly adds another Cache-Control: header with a value of public. What this means is that the content of the HTTP resource is not access-controlled and therefore may be cached not only for one particular user but also anywhere else. A simple and effective strategy that was used in offices to minimize consumed bandwidth is to have an office-wide caching proxy server. When one user requested a resource from a website on the Internet and that resource had a Cache-Control: public designation, the company cache server would store it to serve to other users on the office network. This may not be as popular today due to cheap bandwidth, but because history has a tendency to repeat itself, you need to know how and why Cache-Control: public works.
The Nginx expires directive is surprisingly expressive. It may take a number of different values (a configuration sketch illustrating several of them follows this table):
- off: This value turns off the Nginx cache headers logic. Nothing will be added, and more importantly, existing headers received from upstreams will not be modified.
- epoch: This is an artificial value used to purge a stored resource from all caches by setting the Expires header to "1 January, 1970 00:00:01 GMT".
- max: This is the opposite of the "epoch" value. The Expires header will be equal to "31 December 2037 23:59:59 GMT", and the Cache-Control max-age is set to 10 years. This basically means that the HTTP responses are guaranteed to never change, so clients are free to never request the same thing twice and may use their own stored values.
- A specific time: An actual specific time value means an expiry deadline from the time of the respective request, for example, expires 10w;. A negative value for this directive will emit a special header, Cache-Control: no-cache.
- "modified" plus a specific time: If you add the keyword modified before the time value, then the expiration moment will be computed relative to the modification time of the file that is served.
- "@" plus a specific time: A time with an @ prefix specifies an absolute time-of-day expiry. This should be less than 24 hours, for example, expires @17h;.
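As a quick illustration, and not an excerpt from the book, the different forms could appear in a configuration like the following sketch (the location paths and patterns are arbitrary examples):
# Sketch: several forms of the expires directive side by side
location /never-cache/ {
    expires epoch;               # Expires set to 1 January 1970, purges cached copies
}
location ~* \.(?:png|jpg)$ {
    expires max;                 # effectively "never expires"
}
location /reports/ {
    expires modified +1h;        # one hour after the file's modification time
}
location /daily/ {
    expires @17h;                # today at 17:00, an absolute time-of-day expiry
}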
Many web applications choose to emit the caching headers themselves, and this is a good thing. They have more information about which resources change often and which never change. Tampering with the headers that you receive from the upstream may or may not be a thing you want to do. Sometimes, adding headers to a response while proxying it may produce a conflicting set of headers and therefore create unpredictable behavior. The static files that you serve with Nginx yourself should have the expires directive in place. However, the general advice about upstreams is to always examine the caching headers you get and refrain from overoptimizing by setting up a more aggressive caching policy.
Resources for Article:
Further resources on this subject:
Nginx service [article]
Fine-tune the NGINX Configuration [article]
Nginx Web Services: Configuration and Implementation [article]

Cluster Computing Using Scala

Packt
13 Apr 2016
18 min read
In this article by Vytautas Jančauskas, the author of the book Scientific Computing with Scala, we learn how to write software to be run on distributed computing clusters, using the MPJ Express library. (For more resources related to this topic, see here.)
Very often, when dealing with intense data processing tasks and simulations of physical phenomena, there comes a time when no matter how many CPU cores and how much memory your workstation has, it is not enough. At times like these, you will want to turn to supercomputing clusters for help. These distributed computing environments consist of many nodes (each node being a separate computer) connected into a computer network using specialized high-bandwidth and low-latency connections (or, if you are on a budget, standard Ethernet hardware is often enough). These computers usually utilize a network filesystem allowing each node to see the same files. They communicate using messaging libraries, such as MPI. Your program will run on separate computers and utilize the message passing framework to exchange data via the computer network.
Using MPJ Express for distributed computing
MPJ Express is a message passing library for distributed computing. It works in programming languages that run on the Java Virtual Machine (JVM), so we can use it from Scala. It is similar in functionality and programming interface to MPI. If you know MPI, you will be able to use MPJ Express pretty much the same way. The differences specific to Scala are explained in this section. We will start with how to install it. For further reference, visit the MPJ Express website given here: http://mpj-express.org/
Setting up and running MPJ Express
The steps to set up and run MPJ Express are as follows:
First, download MPJ Express from the following link. The version at the time of this writing is 0.44: http://mpj-express.org/download.php
Unpack the archive and refer to the included README file for installation instructions. Currently, you have to set MPJ_HOME to the folder you unpacked the archive to and add the bin folder in that archive to your path. For example, if you are a Linux user using bash as your shell, you can add the following two lines to your .bashrc file (the file is in your home directory at /home/yourusername/.bashrc):
export MPJ_HOME=/home/yourusername/mpj
export PATH=$MPJ_HOME/bin:$PATH
Here, mpj is the folder you extracted the archive you downloaded from the MPJ Express website to. If you are using a different system, you will have to do the equivalent of the above for your system to use MPJ Express.
We will want to use MPJ Express with the Scala Build Tool (SBT), which we used previously to build and run all of our programs. Create the following directory structure:
scalacluster/
  lib/
  project/
    plugins.sbt
  build.sbt
I have chosen to name the project folder scalacluster here, but you can call it whatever you want. The .jar files in the lib folder will now be accessible to your program. Copy the contents of the lib folder from the mpj directory to this folder. Finally, create empty build.sbt and plugins.sbt files. Let's now write and run a simple "Hello, World!" program to test our setup:
import mpi._

object MPJTest {
  def main(args: Array[String]) {
    MPI.Init(args)
    val me: Int = MPI.COMM_WORLD.Rank()
    val size: Int = MPI.COMM_WORLD.Size()
    println("Hello, World, I'm <" + me + ">")
    MPI.Finalize()
  }
}
This should be familiar to everyone who has ever used MPI. First, we import everything from the mpj package.
Then, we initialize MPJ Express by calling MPI.Init; the arguments to MPJ Express are passed from the command-line arguments you enter when running the program. The MPI.COMM_WORLD.Rank() function returns the MPJ process's rank. A rank is a unique identifier used to distinguish processes from one another. Ranks are used when you want different processes to do different things. A common pattern is to use the process with rank 0 as the master process and the processes with other ranks as workers. Then, you can use the process's rank to decide what action to take in the program. We also determine how many MPJ processes were launched by checking MPI.COMM_WORLD.Size. Our program will simply print each process's rank for now.
We will want to run it. If you don't have a distributed computing cluster readily available, don't worry. You can test your programs locally on your desktop or laptop. The same program will work without changes on clusters as well. To run programs written using MPJ Express, you have to use the mpjrun.sh script. This script will be available to you if you have added the bin folder of the MPJ Express archive to your PATH as described in the section on installing MPJ Express. The mpjrun.sh script will set up the environment for your MPJ Express processes and start said processes.
The mpjrun.sh script takes a .jar file, so we need to create one. Unfortunately for us, this cannot easily be done using the sbt package command in the directory containing our program. This worked previously, because we used the Scala runtime to execute our programs. MPJ Express uses Java. The problem is that the .jar package created with sbt package does not include Scala's standard library. We need what is called a fat .jar, that is, one that contains all the dependencies within itself. One way of generating it is to use a plugin for SBT called sbt-assembly. The website for this plugin is given here: https://github.com/sbt/sbt-assembly
There is a simple way of adding the plugin for use in our project. Remember that project/plugins.sbt file we created? All you need to do is add the following line to it (the line may be different for different versions of the plugin; consult the website):
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")
Now, add the following to the build.sbt file you created:
lazy val root = (project in file(".")).
  settings(
    name := "mpjtest",
    version := "1.0",
    scalaVersion := "2.11.7"
  )
Then, execute the sbt assembly command from the shell to build the .jar file. The file will be put under the following directory if you are using the preceding build.sbt file. That is, if the folder you put the program and build.sbt in is /home/you/cluster:
/home/you/cluster/target/scala-2.11/mpjtest-assembly-1.0.jar
Now, you can run the mpjtest-assembly-1.0.jar file as follows:
$ mpjrun.sh -np 4 -jar target/scala-2.11/mpjtest-assembly-1.0.jar
MPJ Express (0.44) is started in the multicore configuration
Hello, World, I'm <0>
Hello, World, I'm <2>
Hello, World, I'm <3>
Hello, World, I'm <1>
The argument -np specifies how many processes to run. Since we specified -np 4, four processes will be started by the script. The order of the "Hello, World" messages can differ on your system, since the precise order of execution of different processes is undetermined. If you got output similar to the one shown here, then congratulations, you have done the majority of the work needed to write and deploy applications using MPJ Express.
Using Send and Recv
MPJ Express processes can communicate using Send and Recv. These methods constitute arguably the simplest and easiest to understand mode of operation, which is also probably the most error prone. We will look at these two first. The following are the signatures for the Send and Recv methods:
public void Send(java.lang.Object buf, int offset, int count, Datatype datatype, int dest, int tag) throws MPIException
public Status Recv(java.lang.Object buf, int offset, int count, Datatype datatype, int source, int tag) throws MPIException
Both of these calls are blocking. This means that after calling Send, your process will block (will not execute the instructions following it) until a corresponding Recv is called by another process. Likewise, Recv will block the process until a corresponding Send happens. By corresponding, we mean that the dest and source arguments of the calls have the values corresponding to the receiver's and sender's ranks, respectively. These two calls are enough to implement many complicated communication patterns. However, they are prone to various problems, such as deadlocks. Also, they are quite difficult to debug, since you have to make sure that each Send has the correct corresponding Recv and vice versa. The parameters for Send and Recv are basically the same. The meanings of those parameters are summarized in the following table:
- buf (java.lang.Object): It has to be a one-dimensional Java array. When using it from Scala, use a Scala array, which is a one-to-one mapping to a Java array.
- offset (int): The start of the data you want to pass, measured from the start of the array.
- count (int): The number of items of the array you want to pass.
- datatype (Datatype): The type of data in the array. Can be one of the following: MPI.BYTE, MPI.CHAR, MPI.SHORT, MPI.BOOLEAN, MPI.INT, MPI.LONG, MPI.FLOAT, MPI.DOUBLE, MPI.OBJECT, MPI.LB, MPI.UB, and MPI.PACKED.
- dest/source (int): Either the destination to send the message to or the source to get the message from. You use the rank of the process to identify sources and destinations.
- tag (int): Used to tag the message. Can be used to introduce different message types. Can be ignored for most common applications.
Let's look at a simple program using these calls for communication. We will implement a simple master/worker communication pattern:
import mpi._
import scala.util.Random

object MPJTest {
  def main(args: Array[String]) {
    MPI.Init(args)
    val me: Int = MPI.COMM_WORLD.Rank()
    val size: Int = MPI.COMM_WORLD.Size()
    if (me == 0) {
Here, we use an if statement to identify who we are based on our rank. Since each process gets a unique rank, this allows us to determine what action should be taken. In our case, we assigned the role of the master to the process with rank 0 and the role of a worker to processes with other ranks:
      for (i <- 1 until size) {
        val buf = Array(Random.nextInt(100))
        MPI.COMM_WORLD.Send(buf, 0, 1, MPI.INT, i, 0)
        println("MASTER: Dear <" + i + "> please do work on " + buf(0))
      }
We iterate over the workers, whose ranks run from 1 up to (but not including) the number of processes you passed to the mpjrun.sh script. Let's say that number is four. This gives us one master process and three worker processes. So, each process with a rank from 1 to 3 will get a randomly generated number. We have to put that number in an array even though it is a single number. This is because both the Send and Recv methods expect an array as their first argument. We then use the Send method to send the data.
We specified the array as argument buf, an offset of 0, a size of 1, type MPI.INT, the destination as the for loop index, and the tag as 0. This means that each of our three worker processes will receive a (most probably) different number:
      for (i <- 1 until size) {
        val buf = Array(0)
        MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, i, 0)
        println("MASTER: Dear <" + i + "> thanks for the reply, which was " + buf(0))
      }
Finally, we collect the results from the workers. For this, we iterate over the worker ranks and use the Recv method on each one of them. We print the result we got from the worker, and this concludes the master's part. We now move on to the workers:
    } else {
      val buf = Array(0)
      MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, 0)
      println("<" + me + ">: " + "Understood, doing work on " + buf(0))
      buf(0) = buf(0) * buf(0)
      MPI.COMM_WORLD.Send(buf, 0, 1, MPI.INT, 0, 0)
      println("<" + me + ">: " + "Reporting back")
    }
The workers' code is identical for all of them. They receive a message from the master, calculate the square of it, and send it back:
    MPI.Finalize()
  }
}
After you run the program, the results should be akin to the following, which I got when running this program on my system:
MASTER: Dear <1> please do work on 71
MASTER: Dear <2> please do work on 12
MASTER: Dear <3> please do work on 55
<1>: Understood, doing work on 71
<1>: Reported back
MASTER: Dear <1> thanks for the reply, which was 5041
<3>: Understood, doing work on 55
<2>: Understood, doing work on 12
<2>: Reported back
MASTER: Dear <2> thanks for the reply, which was 144
MASTER: Dear <3> thanks for the reply, which was 3025
<3>: Reported back
Sending Scala objects in MPJ Express messages
Sometimes, the types provided by MPJ Express for use in the Send and Recv methods are not enough. You may want to send your MPJ Express processes a Scala object. A very realistic example of this would be to send an instance of a Scala case class. These can be used to construct more complicated data types consisting of several different basic types. A simple example is a two-dimensional vector consisting of x and y coordinates. This can be sent as a simple array, but more complicated classes can't. For example, you may want to use a case class such as the one shown here. It has two attributes of type String and one attribute of type Int. So what do we do with a data type like this? The simplest answer to that problem is to serialize it. Serializing converts an object to a stream of characters or a string that can be sent over the network (or stored to a file, among other things) and later deserialized to get the original object back:
scala> case class Person(name: String, surname: String, age: Int)
defined class Person
scala> val a = Person("Name", "Surname", 25)
a: Person = Person(Name,Surname,25)
A simple way of serializing is to use a format such as XML or JSON. This can be done automatically using a pickling library. Pickling is a term that comes from the Python programming language. It is the automatic conversion of an arbitrary object into a string representation that can later be de-converted to get the original object back. The reconstructed object will behave the same way as it did before conversion. This allows one to store arbitrary objects to files, for example. There is a pickling library available for Scala as well. You can of course do serialization in several different ways (for example, using the powerful support for XML available in Scala).
We will use the pickling library that is available from the following website for this example: https://github.com/scala/pickling
You can install it by adding the following line to your build.sbt file:
libraryDependencies += "org.scala-lang.modules" %% "scala-pickling" % "0.10.1"
After doing that, use the following import statements to enable easy pickling in your projects:
scala> import scala.pickling.Defaults._
import scala.pickling.Defaults._
scala> import scala.pickling.json._
import scala.pickling.json._
Here, you can see how you can then easily use this library to pickle and unpickle arbitrary objects without the use of annoying boilerplate code:
scala> val pklA = a.pickle
pklA: pickling.json.pickleFormat.PickleType = JSONPickle({
  "$type": "Person",
  "name": "Name",
  "surname": "Surname",
  "age": 25
})
scala> val unpklA = pklA.unpickle[Person]
unpklA: Person = Person(Name,Surname,25)
Let's see how this would work in an application using MPJ Express for message passing. A program using pickling to send a case class instance in a message is given here:
import mpi._
import scala.pickling.Defaults._
import scala.pickling.json._

case class ArbitraryObject(a: Array[Double], b: Array[Int], c: String)
Here, we have chosen to define a fairly complex case class, consisting of two arrays of different types and a string:
object MPJTest {
  def main(args: Array[String]) {
    MPI.Init(args)
    val me: Int = MPI.COMM_WORLD.Rank()
    val size: Int = MPI.COMM_WORLD.Size()
    if (me == 0) {
      val obj = ArbitraryObject(Array(1.0, 2.0, 3.0), Array(1, 2, 3), "Hello")
      val pkl = obj.pickle.value.toCharArray
      MPI.COMM_WORLD.Send(pkl, 0, pkl.size, MPI.CHAR, 1, 0)
In the preceding bit of code, we create an instance of our case class. We then pickle it to JSON and get the string representation of said JSON with the value method. However, to send it in an MPJ message, we need to convert it to a one-dimensional array of one of the supported types. Since it is a string, we convert it to a char array. This is done using the toCharArray method:
    } else if (me == 1) {
      val buf = new Array[Char](1000)
      MPI.COMM_WORLD.Recv(buf, 0, 1000, MPI.CHAR, 0, 0)
      val msg = buf.mkString
      val obj = msg.unpickle[ArbitraryObject]
On the receiving end, we get the raw char array, convert it back to a string using the mkString method, and then unpickle it using unpickle[T]. This will return an instance of the case class that we can use as any other instance of a case class. It is, in its functionality, the same object that was sent to us:
      println(msg)
      println(obj.c)
    }
    MPI.Finalize()
  }
}
The following is the result of running the preceding program. It prints out the JSON representation of our object, and also shows that we can access the attributes of said object by printing the c attribute:
MPJ Express (0.44) is started in the multicore configuration
{
  "$type": "ArbitraryObject",
  "a": [ 1.0, 2.0, 3.0 ],
  "b": [ 1, 2, 3 ],
  "c": "Hello"
}
Hello
You can use this method to send arbitrary objects in an MPJ Express message. However, this is just one of many ways of doing this. As mentioned previously, an example of another way is to use the XML representation. XML support is strong in Scala, and you can use it to serialize objects as well. This will usually require you to add some boilerplate code to your program to serialize to XML. The method discussed earlier has the advantage of requiring no boilerplate code (a small sketch of what such hand-rolled XML code might look like follows).
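To give a flavor of that boilerplate, here is a small sketch of hand-rolled XML serialization for the earlier Person case class; it is only an illustration of ours, not code from the book, and it assumes the scala-xml module is available to your project:
import scala.xml._

case class Person(name: String, surname: String, age: Int)

// Hand-written converters: this is the boilerplate that the pickling library saves us from
def toXml(p: Person): Elem =
  <person><name>{p.name}</name><surname>{p.surname}</surname><age>{p.age}</age></person>

def fromXml(node: Node): Person =
  Person((node \ "name").text, (node \ "surname").text, (node \ "age").text.toInt)

// The XML string could be sent as a char array, just like the JSON pickle above
val msg = toXml(Person("Name", "Surname", 25)).toString
val back = fromXml(XML.loadString(msg))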
Non-blocking communication
So far, we have examined only blocking (or synchronous) communication between two processes. This means that the process is blocked (its execution is halted) until the Send or Recv method has completed successfully. This is simple to understand and enough for most cases. The problem with synchronous communication is that you have to be very careful, otherwise deadlocks may occur. Deadlocks are situations in which processes wait for each other to release a resource first; a Mexican standoff and the dining philosophers problem are famous examples of deadlock. The point is that if you are unlucky, you may end up with a program that is seemingly stuck and you don't know why. Using non-blocking communication allows you to avoid these problems most of the time. If you think you may be at risk of deadlocks, you will probably want to use it. The signatures for the primary methods used in asynchronous communication are given here:
Request Isend(java.lang.Object buf, int offset, int count, Datatype datatype, int dest, int tag)
Isend works similarly to its Send counterpart. The main differences are that it does not block (the program continues execution after the call rather than waiting for a corresponding receive), and that it returns a Request object. This object is used to check the status of your send request, block until it is complete if required, and so on:
Request Irecv(java.lang.Object buf, int offset, int count, Datatype datatype, int src, int tag)
Irecv is again the same as Recv, only non-blocking, and returns a Request object used to handle your receive request. The operation of these methods can be seen in action in the following example:
import mpi._

object MPJTest {
  def main(args: Array[String]) {
    MPI.Init(args)
    val me: Int = MPI.COMM_WORLD.Rank()
    val size: Int = MPI.COMM_WORLD.Size()
    if (me == 0) {
      val requests = for (i <- 0 until 10) yield {
        val buf = Array(i * i)
        MPI.COMM_WORLD.Isend(buf, 0, 1, MPI.INT, 1, 0)
      }
    } else if (me == 1) {
      for (i <- 0 until 10) {
        Thread.sleep(1000)
        val buf = Array[Int](0)
        val request = MPI.COMM_WORLD.Irecv(buf, 0, 1, MPI.INT, 0, 0)
        request.Wait()
        println("RECEIVED: " + buf(0))
      }
    }
    MPI.Finalize()
  }
}
This is a very simplistic example used simply to demonstrate the basics of using the asynchronous message passing methods. First, the process with rank 0 will send 10 messages to the process with rank 1 using Isend. Since Isend does not block, the loop will finish quickly, and the messages it sent will be buffered until they are retrieved using Irecv. The second process (the one with rank 1) will wait for one second before retrieving each message. This is to demonstrate the asynchronous nature of these methods. The messages sit in the buffer waiting to be retrieved. Therefore, Irecv can be used at your leisure, when convenient. The Wait() method of the Request object it returns has to be used to retrieve the results. The Wait() method blocks until the message is successfully received from the buffer.
Summary
Extremely computationally intensive programs are usually parallelized and run on supercomputing clusters. These clusters consist of multiple networked computers. Communication between these computers is usually done using messaging libraries such as MPI. These allow you to pass data between processes running on different machines in an efficient manner. In this article, you have learned how to use MPJ Express, an MPI-like library for the JVM. We saw how to carry out process-to-process communication as well as collective communication. The most important MPJ Express primitives were covered, and example programs using them were given.
Resources for Article: Further resources on this subject: Differences in style between Java and Scala code[article] Getting Started with JavaFX[article] Integrating Scala, Groovy, and Flex Development with Apache Maven[article]

Creating Graphs and Charts

Packt
12 Apr 2016
17 min read
In this article by Bhushan Purushottam Joshi author of the book Canvas Cookbook, highlights data representation in the form of graphs and charts with the following topics: Drawing the axes Drawing a simple equation Drawing a sinusoidal wave Drawing a line graph Drawing a bar graph Drawing a pie chart (For more resources related to this topic, see here.) Drawing the axes In school days, we all might have used a graph paper and drawn a vertical line called y axis and a horizontal line called as x axis. Here, in the first recipe of ours, we do only the drawing of axes. Also, we mark the points at equal intervals. The output looks like this: How to do it… The HTML code is as follows: <html> <head> <title>Axes</title> <script src="graphaxes.js"></script> </head> <body onload=init()> <canvas width="600" height="600" id="MyCanvasArea" style="border:2px solid blue;" tabindex="0"> Canvas tag is not supported by your browser </canvas> <br> <form id="myform"> Select your starting value <select name="startvalue" onclick="init()"> <option value=-10>-10</option> <option value=-9>-9</option> <option value=-8>-8</option> <option value=-7>-7</option> <option value=-6>-6</option> <option value=-5>-5</option> <option value=-4>-4</option> <option value=-3>-3</option> <option value=-2>-2</option> </select> </form> </body> </html> The JavaScript code is as follows: varxMin=-10;varyMin=-10;varxMax=10;varyMax=10; //draw the x-axis varcan;varctx;varxaxisx;varxaxisy;varyaxisx;varyaxisy; varinterval;var length; functioninit(){ can=document.getElementById('MyCanvasArea'); ctx=can.getContext('2d'); ctx.clearRect(0,0,can.width,can.height); varsel=document.forms['myform'].elements['startvalue']; xMin=sel.value; yMin=xMin; xMax=-xMin; yMax=-xMin; drawXAxis(); drawYAxis(); } functiondrawXAxis(){ //x axis drawing and marking on the same xaxisx=10; xaxisy=can.height/2; ctx.beginPath(); ctx.lineWidth=2; ctx.strokeStyle="black"; ctx.moveTo(xaxisx,xaxisy); xaxisx=can.width-10; ctx.lineTo(xaxisx,xaxisy); ctx.stroke(); ctx.closePath(); length=xaxisx-10; noofxfragments=xMax-xMin; interval=length/noofxfragments; //mark the x-axis xaxisx=10; ctx.beginPath(); ctx.font="bold 10pt Arial"; for(vari=xMin;i<=xMax;i++) { ctx.lineWidth=0.15; ctx.strokeStyle="grey"; ctx.fillText(i,xaxisx-5,xaxisy-10); ctx.moveTo(xaxisx,xaxisy-(can.width/2)); ctx.lineTo(xaxisx,(xaxisy+(can.width/2))); ctx.stroke(); xaxisx=Math.round(xaxisx+interval); } ctx.closePath(); } functiondrawYAxis(){ yaxisx=can.width/2; yaxisy=can.height-10; ctx.beginPath(); ctx.lineWidth=2; ctx.strokeStyle="black"; ctx.moveTo(yaxisx,yaxisy); yaxisy=10 ctx.lineTo(yaxisx,yaxisy); ctx.stroke(); ctx.closePath(); yaxisy=can.height-10; length=yaxisy-10; noofxfragments=yMax-yMin; interval=length/noofxfragments; //mark the y-axis ctx.beginPath(); ctx.font="bold 10pt Arial"; for(vari=yMin;i<=yMax;i++) { ctx.lineWidth=0.15; ctx.strokeStyle="grey"; ctx.fillText(i,yaxisx-20,yaxisy+5); ctx.moveTo(yaxisx-(can.height/2),yaxisy); ctx.lineTo((yaxisx+(can.height/2)),yaxisy); ctx.stroke(); yaxisy=Math.round(yaxisy-interval); } ctx.closePath(); } How it works... There are two functions in the JavaScript code viz. drawXAxis and drawYAxis. A canvas is not calibrated the way a graph paper is. A simple calculation is used to do the same. In both the functions, there are two parts. One part draws the axis and the second marks the axis on regular intervals. These are delimited by ctx.beginPath() and ctx.closePath(). In the first part, the canvas width and height are used to draw the axis. 
In the second part, we do some calculation. The length of the axis is divided by the number of markers to get the interval. If the starting point is -3, then we have -3, -2, -1, 0, 1, 2, and 3 on the axis, which makes 7 marks and 6 parts. The interval is used to generate x and y coordinate value for the starting point and plot the markers. There is more... Try to replace the following: ctx.moveTo(xaxisx,xaxisy-(can.width/2)); (in drawXAxis()) ctx.lineTo(xaxisx,(xaxisy+(can.width/2)));(in drawXAxis()) ctx.moveTo(yaxisx-(can.height/2),yaxisy);(in drawYAxis()) ctx.lineTo((yaxisx+(can.height/2)),yaxisy);(in drawYAxis()) WITH ctx.moveTo(xaxisx,xaxisy-5); ctx.lineTo(xaxisx,(xaxisy+5)); ctx.moveTo(yaxisx-5,yaxisy); ctx.lineTo((yaxisx+5),yaxisy); Also, instead of grey color for markers, you can use red. Drawing a simple equation This recipe is a simple line drawing on a graph using an equation. The output looks like this: How to do it… The HTML code is as follows: <html> <head> <title>Equation</title> <script src="graphaxes.js"></script> <script src="plotequation.js"></script> </head> <body onload=init()> <canvas width="600" height="600" id="MyCanvasArea" style="border:2px solid blue;" tabindex="0"> Canvas tag is not supported by your browser </canvas> <br> <form id="myform"> Select your starting value <select name="startvalue" onclick="init()"> <option value=-10>-10</option> <option value=-9>-9</option> <option value=-8>-8</option> <option value=-7>-7</option> <option value=-6>-6</option> <option value=-5>-5</option> <option value=-4>-4</option> <option value=-3>-3</option> <option value=-2>-2</option> </select> <br> Enter the coeficient(c) for the equation y=cx <input type="text" size=5 name="coef"> <input type="button" value="Click to plot" onclick="plotEquation()"> <input type="button" value="Reset" onclick="init()"> </form> </body> </html> The JavaScript code is as follows: functionplotEquation(){ varcoef=document.forms['myform'].elements['coef']; var s=document.forms['myform'].elements['startvalue']; var c=coef.value; var x=parseInt(s.value); varxPos; varyPos; while(x<=xMax) { y=c*x; xZero=can.width/2; yZero=can.height/2; if(x!=0) xPos=xZero+x*interval; else xPos=xZero-x*interval; if(y!=0) yPos=yZero-y*interval; else yPos=yZero+y*interval; ctx.beginPath(); ctx.fillStyle="blue"; ctx.arc(xPos,yPos,5,Math.PI/180,360*Math.PI/180,false); ctx.fill(); ctx.closePath(); if(x<xMax) { ctx.beginPath(); ctx.lineWidth=3; ctx.strokeStyle="green"; ctx.moveTo(xPos,yPos); nextX=x+1; nextY=c*nextX; if(nextX!=0) nextXPos=xZero+nextX*interval; else nextXPos=xZero-nextX*interval; if(nextY!=0) nextYPos=yZero-nextY*interval; else nextYPos=yZero+nextY*interval; ctx.lineTo(nextXPos,nextYPos); ctx.stroke(); ctx.closePath(); } x=x+1; } } How it works... We use one more script in this recipe. There are two scripts referred by the HTML file. One is the previous recipe named graphaxes.js, and the other one is the current one named plotequation.js. JavaScript allows you to use the variables created in one file into the other, and this is done in this new recipe. You already know how the axes are drawn. This recipe is to plot an equation y=cx, where c is the coefficient entered by the user. We take the minimum of the x value from the drop-down list and calculate the values for y in a loop. We plot the current and next coordinate and draw a line between the two. This happens till we reach the maximum value of x. Remember that the maximum and minimum value of x and y is same. There is more... 
Try the following: Input positive as well as negative value for coefficient. Drawing a sinusoidal wave This recipe also uses the previous recipe of axes drawing. The output looks like this: How to do it… The HTML code is as follows: <html> <head> <title>Equation</title> <script src="graphaxes.js"></script> <script src="plotSineEquation.js"></script> </head> <body onload=init()> <canvas width="600" height="600" id="MyCanvasArea" style="border:2px solid blue;" tabindex="0"> Canvas tag is not supported by your browser </canvas> <br> <form id="myform"> Select your starting value <select name="startvalue" onclick="init()"> <option value=-10>-10</option> <option value=-9>-9</option> <option value=-8>-8</option> <option value=-7>-7</option> <option value=-6>-6</option> <option value=-5>-5</option> <option value=-4>-4</option> <option value=-3>-3</option> <option value=-2>-2</option> </select> <br> <input type="button" value="Click to plot a sine wave" onclick="plotEquation()"> <input type="button" value="Reset" onclick="init()"> </form> </body> </html> The JavaScript code is as follows: functionplotEquation() { var s=document.forms['myform'].elements['startvalue']; var x=parseInt(s.value); //ctx.fillText(x,100,100); varxPos; varyPos; varnoofintervals=Math.round((2*Math.abs(x)+1)/2); xPos=10; yPos=can.height/2; xEnd=xPos+(2*interval); yEnd=yPos; xCtrl1=xPos+Math.ceil(interval/2); yCtrl1=yPos-200; xCtrl2=xEnd-Math.ceil(interval/2); yCtrl2=yPos+200; drawBezierCurve(ctx,xPos,yPos,xCtrl1,yCtrl1,xCtrl2,yCtrl2,xEnd,yEnd,"red",2); for(vari=1;i<noofintervals;i++) { xPos=xEnd; xEnd=xPos+(2*interval); xCtrl1=xPos+Math.floor(interval/2)+15; xCtrl2=xEnd-Math.floor(interval/2)-15; drawBezierCurve(ctx,xPos,yPos,xCtrl1,yCtrl1,xCtrl2,yCtrl2,xEnd,yEnd,"red",2); } } function drawBezierCurve(ctx,xstart,ystart,xctrl1,yctrl1,xctrl2,yctrl2,xend,yend,color,width) { ctx.strokeStyle=color; ctx.lineWidth=width; ctx.beginPath(); ctx.moveTo(xstart,ystart); ctx.bezierCurveTo(xctrl1,yctrl1,xctrl2,yctrl2,xend,yend); ctx.stroke(); } How it works... We use the Bezier curve to draw the sine wave along the x axis. A bit of calculation using the interval between two points, which encompasses a phase, is done to achieve this. The number of intervals is calculated in the following statement: varnoofintervals=Math.round((2*Math.abs(x)+1)/2); where x is the value in the drop-down list. One phase is initially drawn before the for loop begins. The subsequent phases are drawn in the for loop. The start and end x coordinate changes in every iteration. The ending coordinate for the first sine wave is the first coordinate for the subsequent sine wave. Drawing a line graph Graphs are always informative. 
The basic graphical representation can be a line graph, which is demonstrated here: How to do it… The HTML code is as follows: <html> <head> <title>A simple Line chart</title> <script src="linechart.js"></script> </head> <body onload=init()> <h1>Your WhatsApp Usage</h1> <canvas width="600" height="500" id="MyCanvasArea" style="border:2px solid blue;" tabindex="0"> Canvas tag is not supported by your browser </canvas> </body> </html> The JavaScript code is as follows: functioninit() { vargCanvas = document.getElementById('MyCanvasArea'); // Ensure that the element is available within the DOM varctx = gCanvas.getContext('2d'); // Bar chart data var data = new Array(7); data[0] = "1,130"; data[1] = "2,140"; data[2] = "3,150"; data[3] = "4,140"; data[4] = "5,180"; data[5] = "6,240"; data[6] = "7,340"; // Draw the bar chart drawLineGraph(ctx, data, 70, 100, (gCanvas.height - 40), 50); } functiondrawLineGraph(ctx, data, startX, barWidth, chartHeight, markDataIncrementsIn) { // Draw the x axis ctx.lineWidth = "3.0"; var max=0; varstartY = chartHeight; drawLine(ctx, startX, startY, startX, 1); drawLine(ctx, startX, startY, 490, startY); for(vari=0,m=0;i<data.length;i++,m+=60) { ctx.lineWidth=0.3; drawLine(ctx,startX,startY-m,490,startY-m) ctx.font="bold 12pt Arial"; ctx.fillText(m,startX-30,startY-m); } for(vari=0,m=0;i<data.length;i++,m+=61) { ctx.lineWidth=0.3; drawLine(ctx, startX+m, startY, startX+m, 1); var values=data[i].split(","); var day; switch(values[0]) { case "1": day="MO"; break; case "2": day="TU"; break; case "3": day="WE"; break; case "4": day="TH"; break; case "5": day="FR"; break; case "6": day="SA"; break; case "7": day="SU"; break; } ctx.fillText(day,startX+m-10, startY+20); } //plot the points and draw lines between them varstartAngle = 0 * (Math.PI/180); varendAngle = 360 * (Math.PI/180); varnewValues; for(vari=0,m=0;i<data.length;i++,m+=60) { ctx.beginPath(); var values=data[i].split(","); varxPos=startX+parseInt(values[0])+m; varyPos=chartHeight-parseInt(values[1]); ctx.arc(xPos, yPos, 5, startAngle,endAngle, false); ctx.fillStyle="red"; ctx.fill(); ctx.fillStyle="blue"; ctx.fillText(values[1],xPos, yPos); ctx.stroke(); ctx.closePath(); if(i>0){ ctx.strokeStyle="green"; ctx.lineWidth=1.5; ctx.moveTo(oldxPos,oldyPos); ctx.lineTo(xPos,yPos); ctx.stroke(); } oldxPos=xPos; oldyPos=yPos; } } functiondrawLine(ctx, startx, starty, endx, endy) { ctx.beginPath(); ctx.moveTo(startx, starty); ctx.lineTo(endx, endy); ctx.closePath(); ctx.stroke(); } How it works... All the graphs in the subsequent recipes also work on an array named data. The array element has two parts: one indicates the day and the second indicates the usage in minutes. A split function down the code splits the element into two independent elements. The coordinates are calculated using a parameter named m, which is used in calculating the value of the x coordinate. The value in minutes and the chart height is used to calculate the position of y coordinate. Inside the loop, there are two coordinates, which are used to draw a line. One in the moveTo() method and the other in the lineTo() method. However, the coordinates oldxPos and oldyPos are not calculated in the first iteration, for the simple reason that we cannot draw a line with a single coordinate. Next iteration onwards, we have two coordinates and then the line is drawn between the prior and current coordinates. There is more... Use your own data Drawing a bar graph Another typical representation, which is widely used, is the bar graph. 
Here is an output of this recipe: How to do it… The HTML code is as follows: <html> <head> <title>A simple Bar chart</title> <script src="bargraph.js"></script> </head> <body onload=init()> <h1>Your WhatsApp Usage</h1> <canvas width="600" height="500" id="MyCanvasArea" style="border:2px solid blue;" tabindex="0"> Canvas tag is not supported by your browser </canvas> </body> </html> The JavaScript code is as follows: functioninit(){ vargCanvas = document.getElementById('MyCanvasArea'); // Ensure that the element is available within the DOM varctx = gCanvas.getContext('2d'); // Bar chart data var data = new Array(7); data[0] = "MON,130"; data[1] = "TUE,140"; data[2] = "WED,150"; data[3] = "THU,140"; data[4] = "FRI,170"; data[5] = "SAT,250"; data[6] = "SUN,340"; // Draw the bar chart drawBarChart(ctx, data, 70, 100, (gCanvas.height - 40), 50); } functiondrawBarChart(ctx, data, startX, barWidth, chartHeight, markDataIncrementsIn) { // Draw the x and y axes ctx.lineWidth = "3.0"; varstartY = chartHeight; //drawLine(ctx, startX, startY, startX, 30); drawBarGraph(ctx, startX, startY, startX, 30,data,chartHeight); drawLine(ctx, startX, startY, 570, startY); } functiondrawLine(ctx, startx, starty, endx, endy) { ctx.beginPath(); ctx.moveTo(startx, starty); ctx.lineTo(endx, endy); ctx.closePath(); ctx.stroke(); } functiondrawBarGraph(ctx, startx, starty, endx, endy,data,chartHeight) { ctx.beginPath(); ctx.moveTo(startx, starty); ctx.lineTo(endx, endy); ctx.closePath(); ctx.stroke(); var max=0; //code to label x-axis for(i=0;i<data.length;i++) { varxValues=data[i].split(","); varxName=xValues[0]; ctx.textAlign="left"; ctx.fillStyle="#b90000"; ctx.font="bold 15px Arial"; ctx.fillText(xName,startx+i*50+i*20,chartHeight+15,200); var height=parseInt(xValues[1]); if(parseInt(height)>parseInt(max)) max=height; varcolor='#'+Math.floor(Math.random()*16777215).toString(16); drawBar(ctx,startx+i*50+i*20,(chartHeight-height),height,50,color); ctx.fillText(Math.round(height/60)+" hrs",startx+i*50+i*20,(chartHeight-height-20),200); } //title the x-axis ctx.beginPath(); ctx.fillStyle="black"; ctx.font="bolder 20pt Arial"; ctx.fillText("<------------Weekdays------------>",startx+150,chartHeight+35,200); ctx.closePath(); //y-axis labelling varylabels=Math.ceil(max/60); varyvalue=0; ctx.font="bold 15pt Arial"; for(i=0;i<=ylabels;i++) { ctx.textAlign="right"; ctx.fillText(yvalue,startx-5,(chartHeight-yvalue),50); yvalue+=60; } //title the y-axis ctx.beginPath(); ctx.font = 'bolder 20pt Arial'; ctx.save(); ctx.translate(20,70); ctx.rotate(-0.5*Math.PI); varrText = 'Rotated Text'; ctx.fillText("<--------Time in minutes--------->" , 0, 0); ctx.closePath(); ctx.restore(); } functiondrawBar(ctx,xPos,yPos,height,width,color){ ctx.beginPath(); ctx.fillStyle=color; ctx.rect(xPos,yPos,width,height); ctx.closePath(); ctx.stroke(); ctx.fill(); } How it works... The processing is similar to that of a line graph, except that here there are rectangles drawn, which represent bars. Also, the number 1, 2, 3… are represented as day of the week (for example, 1 means Monday). This line in the code: varcolor='#'+Math.floor(Math.random()*16777215).toString(16); is used to generate random colors for the bars. The number 16777215 is a decimal value for #FFFFF. Note that the value of the control variable i is not directly used for drawing the bar. Rather i is manipulated to get the correct coordinates on the canvas and then the bar is drawn using the drawBar() function. 
drawBar(ctx,startx+i*50+i*20,(chartHeight-height),height,50,color); There is more... Use your own data and change the colors. Drawing a pie chart A share can be easily represented in form of a pie chart. This recipe demonstrates a pie chart: How to do it… The HTML code is as follows: <html> <head> <title>A simple Pie chart</title> <script src="piechart.js"></script> </head> <body onload=init()> <h1>Your WhatsApp Usage</h1> <canvas width="600" height="500" id="MyCanvasArea" style="border:2px solid blue;" tabindex="0"> Canvas tag is not supported by your browser </canvas> </body> </html> The JavaScript code is as follows: functioninit() { var can = document.getElementById('MyCanvasArea'); varctx = can.getContext('2d'); var data = [130,140,150,140,170,250,340]; varcolors = ["crimson", "blue", "yellow", "navy", "aqua", "purple","red"]; var names=["MON","TUE","WED","THU","FRI","SAT","SUN"]; varcenterX=can.width/2; varcenterY=can.height/2; //varcenter = [can.width/2,can.height / 2]; var radius = (Math.min(can.width,can.height) / 2)-50; varstartAngle=0, total=0; for(vari in data) { total += data[i]; } varincrFactor=-(centerX-centerX/2); var angle=0; for (vari = 0; i<data.length; i++){ ctx.fillStyle = colors[i]; ctx.beginPath(); ctx.moveTo(centerX,centerY); ctx.arc(centerX,centerY,radius,startAngle,startAngle+(Math.PI*2*(data[i]/total)),false); ctx.lineTo(centerX,centerY); ctx.rect(centerX+incrFactor,20,20,10); ctx.fill(); ctx.fillStyle="black"; ctx.font="bold 10pt Arial"; ctx.fillText(names[i],centerX+incrFactor,15); ctx.save(); ctx.translate(centerX,centerY); ctx.rotate(startAngle); var dx=Math.floor(can.width*0.5)-100; vardy=Math.floor(can.height*0.20); ctx.fillText(names[i],dx,dy); ctx.restore(); startAngle += Math.PI*2*(data[i]/total); incrFactor+=50; } } How it works... Again the data here is the same, but instead of bars, we use arcs here. The trick is done by changing the end angle as per the data available. Translation and rotation helps in naming the weekdays for the pie chart. There is more... Use your own data and change the colors to get acquainted. Summary Managers make decisions based on the data representations. The data is usually represented in a report form and in the form of graph or charts. The latter representation plays a major role in providing a quick review of the data. In this article, we represent dummy data in the form of graphs and chart. Resources for Article: Further resources on this subject: HTML5 Canvas[article] HTML5: Developing Rich Media Applications using Canvas[article] Building the Untangle Game with Canvas and the Drawing API[article]

Advanced React

Packt
12 Apr 2016
7 min read
In this article by Sven A. Robbestad, author of ReactJS Blueprints, we will cover the following topics: Understanding Webpack Adding Redux to your ReactJS app Understanding Redux reducers, actions, and the store (For more resources related to this topic, see here.) Introduction Understanding the tools you use and the libraries you include in your web app is important to make an efficient web application. In this article, we'll look at some of the difficult parts of modern web development with ReactJS, including Webpack and Redux. Webpack is an important tool for modern web developers. It is a module bundler and works by bundling all modules and files within the context of your base folder. Any file within this context is considered a module and attemptes will be made to bundled it. The only exceptions are files placed in designated vendor folders by default, that are node_modules and web_modules files. Files in these folders are explicitly required in your code to be bundled. Redux is an implementation of the Flux pattern. Flux describes how data should flow through your app. Since the birth of the pattern, there's been an explosion in the number of libraries that attempt to execute on the idea. It's safe to say that while many have enjoyed moderate success, none has been as successful as Redux. Configuring Webpack You can configure Webpack to do almost anything you want, including replacing the current code loaded in your browser with the updated code, while preserving the state of the app. Webpack is configured by writing a special configuration file, usually called webpack.config.js. In this file, you specify the entry and output parameters, plugins, module loaders, and various other configuration parameters. A very basic config file looks like this: var webpack = require('webpack'); module.exports = { entry: [ './entry' ], output: { path: './', filename: 'bundle.js' } }; It's executed by issuing this command from the command line: webpack --config webpack.config.js You can even drop the config parameter, as Webpack will automatically look for the presence of webpack.config.js if not specified. In order to convert the source files before bundling, you use module loaders. Adding this section to the Webpack config file will ensure that the babel-loader module converts JavaScript 2015 code to ECMAScript 5: module: { loaders: [{ test: /.js?$/', loader: 'babel-loader', exclude: /node_modules/, query: { presets: ['es2015','react'] } }] } The first option (required), test, is a regex match that tells Webpack which files these loader operates on. The regex tells Webpack to look for files with a period followed by the letters js and then any optional letters (?) before the end ($). This makes sure that the loader reads both plain JavaScript files and JSX files. The second option (required), loader, is the name of the package that we'll use to convert the code. The third option (optional), exclude, is another regex variable used to explicitly ignore a set of folders or files. The final option (optional), query, contains special configuration options for Babel. The recommended way to do it is actually by setting them in a special file called .babelrc. This file will be picked up automatically by Babel when transpiling files. Adding Redux to your ReactJS app When ReactJS was first introduced to the public in late 2013/early 2014, you would often hear it mentioned together with functional programming. 
However, there's no inherent requirement to write functional code when writing ReactJS code, and JavaScript itself, being a multi-paradigm language, is neither strictly functional nor strictly imperative. Redux chose the functional approach, and it's quickly gaining traction as the superior Flux implementation. There are a number of benefits to choosing a functional approach, which are as follows:

No side effects allowed, that is, the operation is stateless
Always returns the same output for a given input
Ideal for creating recursive operations
Ideal for parallel execution
Easy to establish the single source of truth
Easy to debug
Easy to persist the store state for a faster development cycle
Easy to create functionality such as undo and redo
Easy to inject the store state for server rendering

The concept of stateless operations is possibly the number one benefit, as it makes it very easy to reason about the state of your application. This is, however, not the idiomatic Reflux approach, because Reflux is actually designed to create many stores and have the children listen to changes separately. Application state is the single most difficult part of any application, and every implementation of Flux has attempted to solve this problem. Redux solves it by not actually doing Flux at all; instead, it is an amalgamation of the ideas of Flux and the functional programming language Elm. There are three parts to Redux: actions, reducers, and the global store.

The store

In Redux, there is only one global store. It is an object that holds the state of your entire application. You create a store by passing your root reducing function (or reducer, for short) to a method called createStore. Rather than creating more stores, you use a concept called reducer composition to split data handling logic. You will then need to use a function called combineReducers to create a single root reducer. The createStore function is derived from Redux and is usually called once in the root of your app (or your store file). It is then passed on to your app and then propagated to the app's children. The only way to change the state of the store is to dispatch an action on it. This is not the same as a Flux dispatcher because Redux doesn't have one. You can also subscribe to changes from the store in order to update your components when the store changes state.

Actions

An action is an object that represents an intention to change the state. It must have a type field that indicates what kind of action is being performed. Actions can be defined as constants and imported from other modules. Apart from this requirement, the structure of the object is entirely up to you. A basic action object can look like this:

{
  type: 'UPDATE',
  payload: {
    value: "some value"
  }
}

The payload property is optional and can be an object, as we saw earlier, or any other valid JavaScript type, such as a function or a primitive.

Reducers

A reducer is a function that accepts an accumulation and a value and returns a new accumulation. In other words, it returns the next state based on the previous state and an action. It must be a pure function, free of side effects, and it does not mutate the existing state. For smaller apps, it's okay to start with a single reducer, and as your app grows, you split off smaller reducers that manage specific parts of your state tree. This is what's called reducer composition and is the fundamental pattern of building apps with Redux.
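To make these pieces concrete, here is a minimal sketch that wires a single reducer into a store with combineReducers and createStore, subscribes to changes, and dispatches the UPDATE action shown earlier. The reducer name, initial state, and logged output are illustrative choices of mine, not part of the chapter's example app:

import { createStore, combineReducers } from 'redux';

// A reducer: (previousState, action) -> nextState, with no mutation.
function value(state = '', action) {
  switch (action.type) {
    case 'UPDATE':
      return action.payload.value; // return the next state, never mutate
    default:
      return state;                // unknown actions leave state unchanged
  }
}

// Reducer composition: combine smaller reducers into a single root reducer.
const rootReducer = combineReducers({ value });

// One global store, created once in the root of the app.
const store = createStore(rootReducer);

// Components can subscribe to be notified whenever the state changes.
store.subscribe(() => console.log('state is now', store.getState()));

// The only way to change state is to dispatch an action on the store.
store.dispatch({ type: 'UPDATE', payload: { value: 'some value' } });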
Because reducers are just functions, you can control the order in which they are called, pass additional data, or even make reusable reducers for common tasks such as pagination. It's okay to have multiple reducers. In fact, it's encouraged.

Summary

In this article, you learned about Webpack and how to configure it. You also learned about adding Redux to your ReactJS app. Apart from this, you learned about Redux's reducers, actions, and the store.

Resources for Article: Further resources on this subject: Getting Started with React [article] Reactive Programming and the Flux Architecture [article] Create Your First React Element [article]

Market Basket Analysis

Packt
12 Apr 2016
17 min read
In this article by Boštjan Kaluža, author of the book Machine Learning in Java, we will discuss affinity analysis which is the heart of Market Basket Analysis (MBA). It can discover co-occurrence relationships among activities performed by specific users or groups. In retail, affinity analysis can help you understand the purchasing behavior of customers. These insights can drive revenue through smart cross-selling and upselling strategies and can assist you in developing loyalty programs, sales promotions, and discount plans. In this article, we will look into the following topics: Market basket analysis Association rule learning Other applications in various domains First, we will revise the core association rule learning concepts and algorithms, such as support, lift, Apriori algorithm, and FP-growth algorithm. Next, we will use Weka to perform our first affinity analysis on supermarket dataset and study how to interpret the resulting rules. We will conclude the article by analyzing how association rule learning can be applied in other domains, such as IT Operations Analytics, medicine, and others. (For more resources related to this topic, see here.) Market basket analysis Since the introduction of electronic point of sale, retailers have been collecting an incredible amount of data. To leverage this data in order to produce business value, they first developed a way to consolidate and aggregate the data to understand the basics of the business. What are they selling? How many units are moving? What is the sales amount? Recently, the focus shifted to the lowest level of granularity—the market basket transaction. At this level of detail, the retailers have direct visibility into the market basket of each customer who shopped at their store, understanding not only the quantity of the purchased items in that particular basket, but also how these items were bought in conjunction with each other. This can be used to drive decisions about how to differentiate store assortment and merchandise, as well as effectively combine offers of multiple products, within and across categories, to drive higher sales and profits. These decisions can be implemented across an entire retail chain, by channel, at the local store level, and even for the specific customer with the so-called personalized marketing, where a unique product offering is made for each customer. 
MBA covers a wide variety of analysis:

Item affinity: This defines the likelihood of two (or more) items being purchased together
Identification of driver items: This enables the identification of the items that drive people to the store and always need to be in stock
Trip classification: This analyzes the content of the basket and classifies the shopping trip into a category: weekly grocery trip, special occasion, and so on
Store-to-store comparison: Understanding the number of baskets allows any metric to be divided by the total number of baskets, effectively creating a convenient and easy way to compare stores with different characteristics (units sold per customer, revenue per transaction, number of items per basket, and so on)
Revenue optimization: This helps in determining the magic price points for a store, increasing the size and value of the market basket
Marketing: This helps in identifying more profitable advertising and promotions, targeting offers more precisely in order to improve ROI, generating better loyalty card promotions with longitudinal analysis, and attracting more traffic to the store
Operations optimization: This helps in matching the inventory to the requirement by customizing the store and assortment to trade area demographics, and optimizing store layout

Predictive models help retailers direct the right offer to the right customer segments/profiles, understand what is valid for which customer, predict the probability score of customers responding to an offer, and understand the customer value gained from the offer acceptance.

Affinity analysis

Affinity analysis is used to determine the likelihood that a set of items will be bought together. In retail, there are natural product affinities; for example, it is very typical for people who buy hamburger patties to buy hamburger rolls, along with ketchup, mustard, tomatoes, and other items that make up the burger experience. While some product affinities might seem trivial, others are not very obvious. A classic example is toothpaste and tuna: it seems that people who eat tuna are more prone to brush their teeth right after finishing their meal. So, why is it important for retailers to get a good grasp of product affinities? This information is critical to appropriately plan promotions, as reducing the price of some items may cause a spike in related high-affinity items without the need to further promote those related items. In the following section, we'll look into the algorithms for association rule learning: Apriori and FP-growth.

Association rule learning

Association rule learning has been a popular approach for discovering interesting relations between items in large databases. It is most commonly applied in retail for discovering regularities between products. Association rule learning approaches find patterns as interesting strong rules in the database using different measures of interestingness. For example, the following rule would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat:

{onions, potatoes} -> {burger}

Another classic story, probably told in every machine learning class, is the beer and diapers story. An analysis of supermarket shoppers' behavior showed that customers, presumably young men, who buy diapers tend also to buy beer.
It immediately became a popular example of how an unexpected association rule might be found from everyday data; however, there are varying opinions as to how much of the story is true. Daniel Powers says (DSS News, 2002):

In 1992, Thomas Blischok, manager of a retail consulting group at Teradata, and his staff prepared an analysis of 1.2 million market baskets from about 25 Osco Drug stores. Database queries were developed to identify affinities. The analysis "did discover that between 5:00 and 7:00 p.m. that consumers bought beer and diapers". Osco managers did NOT exploit the beer and diapers relationship by moving the products closer together on the shelves.

In addition to the preceding example from MBA, association rules are today employed in many application areas, including web usage mining, intrusion detection, continuous production, and bioinformatics. We'll take a closer look at these areas later in this article.

Basic concepts

Before we dive into the algorithms, let's first review the basic concepts.

Database of transactions

First, there is no class value, as this is not required for learning association rules. Next, the dataset is presented as a transactional table, where each supermarket item corresponds to a binary attribute. Hence, the feature vector could be extremely large. Consider the following example. Suppose we have five receipts as shown in the following image. Each receipt corresponds to a purchasing transaction:

To write these receipts in the form of a transactional database, we first identify all the possible items that appear in the receipts. These items are onions, potatoes, burger, beer, and diapers. Each purchase, that is, transaction, is presented in a row, and there is 1 if an item was purchased within the transaction and 0 otherwise, as shown in the following table:

Transaction ID | Onions | Potatoes | Burger | Beer | Diapers
1              | 0      | 1        | 1      | 0    | 0
2              | 1      | 1        | 1      | 1    | 0
3              | 0      | 0        | 0      | 1    | 1
4              | 1      | 0        | 1      | 1    | 0

This example is really small. In practical applications, the dataset often contains thousands or millions of transactions, which allows learning algorithms to discover statistically significant patterns.

Itemset and rule

An itemset is simply a set of items, for example, {onions, potatoes, burger}. A rule consists of two itemsets, X and Y, in the format X -> Y. This indicates a pattern that when the X itemset is observed, Y is also observed. To select interesting rules, various measures of significance can be used.

Support

Support, for an itemset, is defined as the proportion of transactions that contain the itemset. The {potatoes, burger} itemset in the previous table has support supp({potatoes, burger}) = 2/4 = 0.5, as it occurs in 50% of transactions (2 out of 4 transactions). Intuitively, it indicates the share of transactions that support the pattern.

Confidence

Confidence of a rule indicates its accuracy. It is defined as Conf(X -> Y) = supp(X U Y) / supp(X). For example, the {onions, burger} -> {beer} rule has confidence 0.5/0.5 = 1.0 in the previous table, which means that 100% of the times when onions and burger are bought together, beer is bought as well.

Apriori algorithm

The Apriori algorithm is a classic algorithm used for frequent pattern mining and association rule learning over transactional databases. By identifying the frequent individual items in a database and extending them to larger itemsets, Apriori can determine the association rules, which highlight general trends about a database.
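Before looking at how Apriori searches for frequent itemsets, here is a minimal, self-contained Java sketch that recomputes the support and confidence figures derived from the transaction table above. The class and method names are my own illustrative choices, not part of the chapter's Weka-based code:

import java.util.*;

public class SupportConfidenceDemo {

    // The four transactions from the table, written as item sets.
    static List<Set<String>> transactions = Arrays.asList(
        new HashSet<>(Arrays.asList("potatoes", "burger")),
        new HashSet<>(Arrays.asList("onions", "potatoes", "burger", "beer")),
        new HashSet<>(Arrays.asList("beer", "diapers")),
        new HashSet<>(Arrays.asList("onions", "burger", "beer"))
    );

    // supp(itemset) = fraction of transactions that contain every item of the itemset
    static double support(Set<String> itemset) {
        long hits = transactions.stream().filter(t -> t.containsAll(itemset)).count();
        return (double) hits / transactions.size();
    }

    // conf(X -> Y) = supp(X U Y) / supp(X)
    static double confidence(Set<String> x, Set<String> y) {
        Set<String> union = new HashSet<>(x);
        union.addAll(y);
        return support(union) / support(x);
    }

    public static void main(String[] args) {
        Set<String> potatoesBurger = new HashSet<>(Arrays.asList("potatoes", "burger"));
        Set<String> onionsBurger   = new HashSet<>(Arrays.asList("onions", "burger"));
        Set<String> beer           = new HashSet<>(Collections.singletonList("beer"));

        System.out.println(support(potatoesBurger));        // prints 0.5
        System.out.println(confidence(onionsBurger, beer)); // prints 1.0
    }
}

Running it reproduces supp({potatoes, burger}) = 0.5 and Conf({onions, burger} -> {beer}) = 1.0, matching the hand calculation.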
Apriori algorithm constructs a set of itemsets, for example, itemset1= {Item A, Item B}, and calculates support, which counts the number of occurrences in the database. Apriori then uses a bottom up approach, where frequent itemsets are extended, one item at a time, and it works by eliminating the largest sets as candidates by first looking at the smaller sets and recognizing that a large set cannot be frequent unless all its subsets are. The algorithm terminates when no further successful extensions are found. Although, Apriori algorithm is an important milestone in machine learning, it suffers from a number of inefficiencies and tradeoffs. In the following section, we'll look into a more recent FP-growth technique. FP-growth algorithm FP-growth, where frequent pattern (FP), represents the transaction database as a prefix tree. First, the algorithm counts the occurrence of items in the dataset. In the second pass, it builds a prefix tree, an ordered tree data structure commonly used to store a string. An example of prefix tree based on the previous example is shown in the following diagram: If many transactions share most frequent items, prefix tree provides high compression close to the tree root. Large itemsets are grown directly, instead of generating candidate items and testing them against the entire database. Growth starts at the bottom of the tree, by finding all the itemsets matching minimal support and confidence. Once the recursive process has completed, all large itemsets with minimum coverage have been found and association rule creation begins. FP-growth algorithms have several advantages. First, it constructs an FP-tree, which encodes the original dataset in a substantially compact presentation. Second, it efficiently builds frequent itemsets, leveraging the FP-tree structure and divide-and-conquer strategy. The supermarket dataset The supermarket dataset, located in datasets/chap5/supermarket.arff, describes the shopping habits of supermarket customers. Most of the attributes stand for a particular item group, for example, diary foods, beef, potatoes; or department, for example, department 79, department 81, and so on. The value is t if the customer had bought an item and missing otherwise. There is one instance per customer. The dataset contains no class attribute, as this is not required to learn association rules. A sample of data is shown in the following table: Discover patterns To discover shopping patterns, we will use the two algorithms that we have looked into before, Apriori and FP-growth. Apriori We will use the Apriori algorithm as implemented in Weka. 
It iteratively reduces the minimum support until it finds the required number of rules with the given minimum confidence: import java.io.BufferedReader; import java.io.FileReader; import weka.core.Instances; import weka.associations.Apriori; First, we will load the supermarket dataset: Instances data = new Instances( new BufferedReader( new FileReader("datasets/chap5/supermarket.arff"))); Next, we will initialize an Apriori instance and call the buildAssociations(Instances) function to start frequent pattern mining, as follows: Apriori model = new Apriori(); model.buildAssociations(data); Finally, we can output the discovered itemsets and rules, as shown in the following code: System.out.println(model); The output is as follows: Apriori ======= Minimum support: 0.15 (694 instances) Minimum metric <confidence>: 0.9 Number of cycles performed: 17 Generated sets of large itemsets: Size of set of large itemsets L(1): 44 Size of set of large itemsets L(2): 380 Size of set of large itemsets L(3): 910 Size of set of large itemsets L(4): 633 Size of set of large itemsets L(5): 105 Size of set of large itemsets L(6): 1 Best rules found: 1. biscuits=t frozen foods=t fruit=t total=high 788 ==> bread and cake=t 723 <conf:(0.92)> lift:(1.27) lev:(0.03) [155] conv:(3.35) 2. baking needs=t biscuits=t fruit=t total=high 760 ==> bread and cake=t 696 <conf:(0.92)> lift:(1.27) lev:(0.03) [149] conv:(3.28) 3. baking needs=t frozen foods=t fruit=t total=high 770 ==> bread and cake=t 705 <conf:(0.92)> lift:(1.27) lev:(0.03) [150] conv:(3.27) ... The algorithm outputs ten best rules according to confidence. Let's look the first rule and interpret the output, as follows: biscuits=t frozen foods=t fruit=t total=high 788 ==> bread and cake=t 723 <conf:(0.92)> lift:(1.27) lev:(0.03) [155] conv:(3.35) It says that when biscuits, frozen foods, and fruits are bought together and the total purchase price is high, it is also very likely that bread and cake are purchased as well. The {biscuits, frozen foods, fruit, total high} itemset appears in 778 transactions, while the {bread, cake} itemset appears in 723 transactions. The confidence of this rule is 0.92, meaning that the rule holds true in 92% of transactions where the {biscuits, frozen foods, fruit, total high} itemset is present. The output also reports additional measures such as lift, leverage, and conviction, which estimate the accuracy against our initial assumptions, for example, the 3.35 conviction value indicates that the rule would be incorrect 3.35 times as often if the association was purely a random chance. Lift measures the number of times X and Y occur together than expected if they where statistically independent (lift=1). The 2.16 lift in the X -> Y rule means that the probability of X is 2.16 times greater than the probability of Y. FP-growth Now, let's try to get the same results with more efficient FP-growth algorithm. FP-growth is also implemented in the weka.associations package: import weka.associations.FPGrowth; The FP-growth is initialized similarly as we did earlier: FPGrowth fpgModel = new FPGrowth(); fpgModel.buildAssociations(data); System.out.println(fpgModel); The output reveals that FP-growth discovered 16 rules: FPGrowth found 16 rules (displaying top 10) 1. [fruit=t, frozen foods=t, biscuits=t, total=high]: 788 ==> [bread and cake=t]: 723 <conf:(0.92)> lift:(1.27) lev:(0.03) conv:(3.35) 2. [fruit=t, baking needs=t, biscuits=t, total=high]: 760 ==> [bread and cake=t]: 696 <conf:(0.92)> lift:(1.27) lev:(0.03) conv:(3.28) ... 
We can observe that FP-growth found the same set of rules as Apriori; however, the time required to process larger datasets can be significantly shorter. Other applications in various areas We looked into affinity analysis to demystify shopping behavior patterns in supermarkets. Although, the roots of association rule learning are in analyzing point-of-sale transactions, they can be applied outside the retail industry to find relationships among other types of baskets. The notion of a basket can easily be extended to services and products, for example, to analyze items purchased using a credit card, such as rental cars and hotel rooms, and to analyze information on value-added services purchased by telecom customers (call waiting, call forwarding, DSL, speed call, and so on), which can help the operators determine the ways to improve their bundling of service packages. Additionally, we will look into the following examples of potential cross-industry applications: Medical diagnosis Protein sequences Census data Customer relationship management IT Operations Analytics Medical diagnosis Applying association rules in medical diagnosis can be used to assist physicians while curing patients. The general problem of the induction of reliable diagnostic rules is hard as, theoretically, no induction process can guarantee the correctness of induced hypotheses by itself. Practically, diagnosis is not an easy process as it involves unreliable diagnosis tests and the presence of noise in training examples. Nevertheless, association rules can be used to identify likely symptoms appearing together. A transaction, in this case, corresponds to a medical case, while symptoms correspond to items. When a patient is treated, a list of symptoms is recorded as one transaction. Protein sequences A lot of research has gone into understanding the composition and nature of proteins; yet many things remain to be understood satisfactorily. It is now generally believed that amino-acid sequences of proteins are not random. With association rules, it is possible to identify associations between different amino acids that are present in a protein. A protein is a sequences made up of 20 types of amino acids. Each protein has a unique three-dimensional structure, which depends on amino-acid sequence; slight change in the sequence may change the functioning of protein. To apply association rules, a protein corresponds to a transaction, while amino acids, their two grams and structure correspond to the items. Such association rules are desirable for enhancing our understanding of protein composition and hold the potential to give clues regarding the global interactions amongst some particular sets of amino acids occurring in the proteins. Knowledge of these association rules or constraints is highly desirable for synthesis of artificial proteins. Census data Censuses make a huge variety of general statistical information about the society available to both researchers and general public. The information related to population and economic census can be forecasted in planning public services (education, health, transport, and funds) as well as in public business(for setting up new factories, shopping malls, or banks and even marketing particular products). To discover frequent patterns, each statistical area (for example, municipality, city, and neighborhood) corresponds to a transaction, and the collected indicators correspond to the items. 
Customer relationship management Association rules can reinforce the knowledge management process and allow the marketing personnel to know their customers well in order to provide better quality services. For example, association rules can be applied to detect a change of customer behavior at different time snapshots from customer profiles and sales data. The basic idea is to discover changes from two datasets and generate rules from each dataset to carry out rule matching. IT Operations Analytics Based on records of a large number of transactions, association rule learning is well-suited to be applied to the data that is routinely collected in day-to-day IT operations, enabling IT Operations Analytics tools to detect frequent patterns and identify critical changes. IT specialists need to see the big picture and understand, for example, how a problem on a database could impact an application server. For a specific day, IT operations may take in a variety of alerts, presenting them in a transactional database. Using an association rule learning algorithm, IT Operations Analytics tools can correlate and detect the frequent patterns of alerts appearing together. This can lead to a better understanding about how a component impacts another. With identified alert patterns, it is possible to apply predictive analytics. For example, a particular database server hosts a web application and suddenly an alert about a database is triggered. By looking into frequent patterns identified by an association rule learning algorithm, this means that the IT staff needs to take action before the web application is impacted. Association rule learning can also discover alert events originating from the same IT event. For example, every time a new user is added, six changes in the Windows operating systems are detected. Next, in the Application Portfolio Management (APM), IT may face multiple alerts, showing that the transactional time in a database as high. If all these issues originate from the same source (such as getting hundreds of alerts about changes that are all due to a Windows update), this frequent pattern mining can help to quickly cut through a number of alerts, allowing the IT operators to focus on truly critical changes. Summary In this article, you learned how to leverage association rules learning on transactional datasets to gain insight about frequent patterns We performed an affinity analysis in Weka and learned that the hard work lies in the analysis of results—careful attention is required when interpreting rules, as association (that is, correlation) is not the same as causation. Resources for Article: Further resources on this subject: Debugging Java Programs using JDB [article] Functional Testing with JMeter [article] Implementing AJAX Grid using jQuery data grid plugin jqGrid [article]

Setting Up and Cleaning Up

Packt
11 Apr 2016
34 min read
This article, by Mani Tadayon, author of the book, RSpec Essentials, discusses support code to set tests up and clean up after them. Initialization, configuration, cleanup, and other support code related to RSpec specs are important in real-world RSpec usage. We will learn how to cleanly organize support code in real-world applications by learning about the following topics: Configuring RSpec with spec_helper.rb Initialization and configuration of resources Preventing tests from accessing the Internet with WebMock Maintaining clean test state Custom helper code Loading support code on demand with tags (For more resources related to this topic, see here.) Configuring RSpec with spec_helper.rb The RSpec specs that we've seen so far have functioned as standalone units. Specs in the real world, however, almost never work without supporting code to prepare the test environment before tests are run and ensure it is cleaned up afterwards. In fact, the first line of nearly every real-world RSpec spec file loads a file that takes care of initialization, configuration, and cleanup: require 'spec_helper' By convention, the entry point for all support code for specs is in a file called spec_helper.rb. Another convention is that specs are located in a folder called spec in the root folder of the project. The spec_helper.rb file is located in the root of this spec folder. Now that we know where it goes, what do we actually put in spec_helper.rb? Let's start with an example: # spec/spec_helper.rb require 'rspec'   RSpec.configure do |config|   config.order            = 'random'   config.profile_examples = 3    end To see what these two options do, let's create a couple of dummy spec files that include our spec_helper.rb. Here's the first spec file: # spec/first_spec.rb require 'spec_helper'   describe 'first spec' do   it 'sleeps for 1 second' do     sleep 1   end     it 'sleeps for 2 seconds' do     sleep 2   end      it 'sleeps for 3 seconds' do     sleep 3   end  end And here's our second spec file: # spec/second_spec.rb require 'spec_helper'   describe 'second spec' do   it 'sleeps for 4 second' do     sleep 4   end     it 'sleeps for 5 seconds' do     sleep 5   end      it 'sleeps for 6 seconds' do     sleep 6   end  end Now let's run our two spec files and see what happens: We note that we used --format documentation when running RSpec so that we see the order in which the tests were run (the default format just outputs a green dot for each passing test). From the output, we can see that the tests were run in a random order. We can also see the three slowest specs. Although this was a toy example, I would recommend using both of these configuration options for RSpec. Running examples in a random order is very important, as it is the only reliable way of detecting bad tests which sometimes pass and sometimes fail based on the order the in which overall test suite is run. Also, keeping tests running fast is very important for maintaining a productive development flow, and seeing which tests are slow on every test run is the most effective way of encouraging developers to make the slow tests fast, or remove them from the test run. We'll return to both test order and test speed later. For now, let us just note that RSpec configuration is very important to keeping our specs reliable and fast. Initialization and configuration of resources Real-world applications rely on resources, such as databases, and external services, such as HTTP APIs. 
These must be initialized and configured for the application to work properly. When writing tests, dealing with these resources and services can be a challenge because of two opposing fundamental interests. First, we would like the test environment to match as closely as possible the production environment so that tests that interact with resources and services are realistic. For example, we may use a powerful database system in production that runs on many servers to provide the best performance. Should we spend money and effort to create and maintain a second production-grade database environment just for testing purposes? Second, we would like the test environment to be simple and relatively easy to understand, so that we understand what we are actually testing. We would also like to keep our code modular so that components can be tested in isolation, or in simpler environments that are easier to create, maintain, and understand. If we think of the example of the system that relies on a database cluster in production, we may ask ourselves whether we are better off using a single-server setup for our test database. We could even go so far as to use an entirely different database for our tests, such as the file-based SQLite. As always, there are no easy answers to such trade-offs. The important thing is to understand the costs and benefits, and adjust where we are on the continuum between production faithfulness and test simplicity as our system evolves, along with the goals it serves. For example, for a small hobbyist application or a project with a limited budget, we may choose to completely favor test simplicity. As the same code grows to become a successful fan site or a big-budget project, we may have a much lower tolerance for failure, and have both the motivation and resources to shift towards production faithfulness for our test environment. Some rules of thumb to keep in mind: Unit tests are better places for test simplicity Integration tests are better places for production faithfulness Try to cleverly increase production faithfulness in unit tests Try to cleverly increase test simplicity in integration tests In between unit and integration tests, be clear what is and isn't faithful to the production environment A case study of test simplicity with an external service Let's put these ideas into practice. I haven't changed the application code, except to rename the module OldWeatherQuery. The test code is also slightly changed to require a spec_helper file and to use a subject block to define an alias for the module name, which makes it easier to rename the code without having to change many lines of test code. So let's look at our three files now. First, here's the application code: # old_weather_query.rb   require 'net/http' require 'json' require 'timeout'   module OldWeatherQuery   extend self     class NetworkError < StandardError   end     def forecast(place, use_cache=true)     add_to_history(place)       if use_cache       cache[place] ||= begin         @api_request_count += 1         JSON.parse( http(place) )       end     else       JSON.parse( http(place) )     end   rescue JSON::ParserError     raise NetworkError.new("Bad response")   end     def api_request_count     @api_request_count ||= 0   end     def history     (@history || []).dup   end     def clear!     
@history           = []     @cache             = {}     @api_request_count = 0   end     private     def add_to_history(s)     @history ||= []     @history << s   end     def cache     @cache ||= {}   end     BASE_URI = 'http://api.openweathermap.org/data/2.5/weather?q='   def http(place)     uri = URI(BASE_URI + place)       Net::HTTP.get(uri)   rescue Timeout::Error     raise NetworkError.new("Request timed out")   rescue URI::InvalidURIError     raise NetworkError.new("Bad place name: #{place}")   rescue SocketError     raise NetworkError.new("Could not reach #{uri.to_s}")   end end Next is the spec file: # spec/old_weather_query_spec.rb   require_relative 'spec_helper' require_relative '../old_weather_query'   describe OldWeatherQuery do   subject(:weather_query) { described_class }     describe 'caching' do     let(:json_response) do       '{"weather" : { "description" : "Sky is Clear"}}'     end       around(:example) do |example|       actual = weather_query.send(:cache)       expect(actual).to eq({})         example.run         weather_query.clear!     end       it "stores results in local cache" do       weather_query.forecast('Malibu,US')         actual = weather_query.send(:cache)       expect(actual.keys).to eq(['Malibu,US'])       expect(actual['Malibu,US']).to be_a(Hash)     end       it "uses cached result in subsequent queries" do       weather_query.forecast('Malibu,US')       weather_query.forecast('Malibu,US')       weather_query.forecast('Malibu,US')     end   end     describe 'query history' do     before do       expect(weather_query.history).to eq([])       allow(weather_query).to receive(:http).and_return("{}")     end     after do       weather_query.clear!     end       it "stores every place requested" do       places = %w(         Malibu,US         Beijing,CN         Delhi,IN         Malibu,US         Malibu,US         Beijing,CN       )         places.each {|s| weather_query.forecast(s) }         expect(weather_query.history).to eq(places)     end       it "does not allow history to be modified" do       expect {         weather_query.history = ['Malibu,CN']       }.to raise_error         weather_query.history << 'Malibu,CN'       expect(weather_query.history).to eq([])     end   end     describe 'number of API requests' do     before do       expect(weather_query.api_request_count).to eq(0)       allow(weather_query).to receive(:http).and_return("{}")     end       after do       weather_query.clear!     end       it "stores every place requested" do       places = %w(         Malibu,US         Beijing,CN         Delhi,IN         Malibu,US         Malibu,US         Beijing,CN       )         places.each {|s| weather_query.forecast(s) }         expect(weather_query.api_request_count).to eq(3)     end       it "does not allow count to be modified" do       expect {         weather_query.api_request_count = 100       }.to raise_error         expect {         weather_query.api_request_count += 10       }.to raise_error         expect(weather_query.api_request_count).to eq(0)     end   end end And last but not least, our spec_helper file, which has also changed only slightly: we only configure RSpec to show one slow spec (to keep test results uncluttered) and use color in the output to distinguish passes and failures more easily: # spec/spec_helper.rb   require 'rspec'   RSpec.configure do |config|   config.order            = 'random'   config.profile_examples = 1   config.color            = true end When we run these specs, something unexpected happens. 
Most of the time the specs pass, but sometimes they fail. If we keep running the specs with the same command, we'll see the tests pass and fail apparently at random. These are flaky tests, and we have exposed them because of the random order configuration we chose: when the tests run in a certain order, they fail. The problem could simply be in our tests. For example, we could have forgotten to clear state before or after a test. However, there could also be a problem with our code. In any case, we need to get to the bottom of the situation.

We first notice that at the end of the failing test run, RSpec tells us "Randomized with seed 318". We can use this information to run the tests in the order that caused the failure and start to debug and diagnose the problem. We do this by passing the --seed parameter with the value 318, as follows:

$ rspec spec/old_weather_query_spec.rb --seed 318

The problem has to do with the way that we increment @api_request_count without ensuring it has been initialized. Looking at our code, we notice that the only places we initialize @api_request_count are OldWeatherQuery.api_request_count and OldWeatherQuery.clear!. If we don't call either of these methods first, then OldWeatherQuery.forecast, the main method in this module, will always fail. Our tests sometimes pass because our setup code calls one of these methods first when tests are run in a certain order, but that is not at all how our code would likely be used in production. So basically, our code is completely broken, but our specs pass (sometimes). Based on this, we can create a simple spec that will always fail:

describe 'api_request is not initialized' do
  it "does not raise an error" do
    weather_query.forecast('Malibu,US')
  end
end

At least now our tests fail deterministically. But this is not the end of our troubles with these specs. If we run our tests many times with the seed value of 318, we will start seeing a second failing test case that is even more random than the first. This is an OldWeatherQuery::NetworkError, and it indicates that our tests are actually making HTTP requests to the Internet!

Let's do an experiment to confirm this. We'll turn off our Wi-Fi access, unplug our Ethernet cables, and run our specs. When we run our tests without any Internet access, we will see three errors in total. One of them is the error with the uninitialized @api_request_count instance variable, and two of them are instances of OldWeatherQuery::NetworkError, which confirms that we are indeed making real HTTP requests in our code.

What's so bad about making requests to the Internet? After all, the test failures are indeed very random, and we had to purposely shut off our Internet access to replicate the errors. Flaky tests are actually the least of our problems. First, we could be performing destructive actions that affect real systems, accounts, and people! Imagine if we were testing an e-commerce application that charged customer credit cards by using a third-party payment API via HTTP. If our tests actually hit our payment provider's API endpoint, we would get a lot of declined transactions (assuming we are not storing and using real credit card numbers), which could lead to our account being suspended due to suspicions of fraud, putting our e-commerce application out of service.
Also, if we were running a continuous integration (CI) server such as Jenkins, which did not have access to the public Internet, we would get failures in our CI builds due to failing tests that attempted to access the Internet.

There are a few approaches to solving this problem. In our tests, we attempted to mock our HTTP requests, but obviously failed to do so effectively. A second approach is to allow actual HTTP requests but to configure a special server for testing purposes. Let's focus on figuring out why our HTTP mocks were not successful.

In a small set of tests like in this example, it is not hard to hunt down the places where we are sending actual HTTP requests. In larger code bases with a lot of test support code, it may be harder. Also, it would be nice to prevent access to the Internet altogether so we notice these issues as soon as we run the offending tests. Fortunately, Ruby has many excellent tools for testing, and there is one that addresses our needs exactly: WebMock (https://github.com/bblimke/webmock). We simply install the gem and add a couple of lines to our spec_helper file to disable all network connections in our tests:

# spec/spec_helper.rb

require 'rspec'

# require the webmock gem
require 'webmock/rspec'

RSpec.configure do |config|
  # this is done by default, but let's make it clear
  WebMock.disable_net_connect!

  config.order            = 'random'
  config.profile_examples = 1
  config.color            = true
end

When we run our tests again, we'll see one or more instances of WebMock::NetConnectNotAllowedError, along with a backtrace leading us to the point in our tests where the HTTP request was made.

If we examine our test code, we'll notice that we mock the OldWeatherQuery.http method in a few places. However, we forgot to set up the mock in the first describe block for caching, where we defined a json_response object but never mocked the OldWeatherQuery.http method to return json_response. We can solve the problem by mocking OldWeatherQuery.http throughout the entire test file. We'll also take this opportunity to clean up the initialization of @api_request_count in our code. Here's what we have now:

# new_weather_query.rb

require 'net/http'
require 'json'
require 'timeout'

module NewWeatherQuery
  extend self

  class NetworkError < StandardError
  end

  def forecast(place, use_cache=true)
    add_to_history(place)

    if use_cache
      cache[place] ||= begin
        increment_api_request_count
        JSON.parse( http(place) )
      end
    else
      JSON.parse( http(place) )
    end
  rescue JSON::ParserError => e
    raise NetworkError.new("Bad response: #{e.inspect}")
  end

  def increment_api_request_count
    @api_request_count ||= 0
    @api_request_count += 1
  end

  def api_request_count
    @api_request_count ||= 0
  end

  def history
    (@history || []).dup
  end

  def clear!
    @history           = []
    @cache             = {}
    @api_request_count = 0
  end

  private

  def add_to_history(s)
    @history ||= []
    @history << s
  end

  def cache
    @cache ||= {}
  end

  BASE_URI = 'http://api.openweathermap.org/data/2.5/weather?q='

  def http(place)
    uri = URI(BASE_URI + place)

    Net::HTTP.get(uri)
  rescue Timeout::Error
    raise NetworkError.new("Request timed out")
  rescue URI::InvalidURIError
    raise NetworkError.new("Bad place name: #{place}")
  rescue SocketError
    raise NetworkError.new("Could not reach #{uri.to_s}")
  end
end

And here is the spec file to go with it:

# spec/new_weather_query_spec.rb

require_relative 'spec_helper'
require_relative '../new_weather_query'

describe NewWeatherQuery do
  subject(:weather_query) { described_class }

  after { weather_query.clear! }

  let(:json_response) { '{}' }

  before do
    allow(weather_query).to receive(:http).and_return(json_response)
  end

  describe 'api_request is initialized' do
    it "does not raise an error" do
      weather_query.forecast('Malibu,US')
    end
  end

  describe 'caching' do
    let(:json_response) do
      '{"weather" : { "description" : "Sky is Clear"}}'
    end

    around(:example) do |example|
      actual = weather_query.send(:cache)
      expect(actual).to eq({})

      example.run
    end

    it "stores results in local cache" do
      weather_query.forecast('Malibu,US')

      actual = weather_query.send(:cache)
      expect(actual.keys).to eq(['Malibu,US'])
      expect(actual['Malibu,US']).to be_a(Hash)
    end

    it "uses cached result in subsequent queries" do
      weather_query.forecast('Malibu,US')
      weather_query.forecast('Malibu,US')
      weather_query.forecast('Malibu,US')
    end
  end

  describe 'query history' do
    before do
      expect(weather_query.history).to eq([])
    end

    it "stores every place requested" do
      places = %w(
        Malibu,US
        Beijing,CN
        Delhi,IN
        Malibu,US
        Malibu,US
        Beijing,CN
      )

      places.each {|s| weather_query.forecast(s) }

      expect(weather_query.history).to eq(places)
    end

    it "does not allow history to be modified" do
      expect {
        weather_query.history = ['Malibu,CN']
      }.to raise_error

      weather_query.history << 'Malibu,CN'
      expect(weather_query.history).to eq([])
    end
  end

  describe 'number of API requests' do
    before do
      expect(weather_query.api_request_count).to eq(0)
    end

    it "stores every place requested" do
      places = %w(
        Malibu,US
        Beijing,CN
        Delhi,IN
        Malibu,US
        Malibu,US
        Beijing,CN
      )

      places.each {|s| weather_query.forecast(s) }

      expect(weather_query.api_request_count).to eq(3)
    end

    it "does not allow count to be modified" do
      expect {
        weather_query.api_request_count = 100
      }.to raise_error

      expect {
        weather_query.api_request_count += 10
      }.to raise_error

      expect(weather_query.api_request_count).to eq(0)
    end
  end
end

Now we've fixed a major bug in our code that slipped through our specs and used to pass randomly. We've made it so that our tests always pass, regardless of the order in which they are run, and without needing to access the Internet. Our test code and application code have also become clearer as we've reduced duplication in a few places.
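As a side note, mocking our own private http method is only one option. Since spec_helper already loads webmock/rspec, we could instead stub at the HTTP boundary with WebMock's stub_request. The following spec is not part of the original example; the file name and the canned response body are made up purely for illustration:

# spec/new_weather_query_http_spec.rb (illustrative only)

require_relative 'spec_helper'
require_relative '../new_weather_query'

describe NewWeatherQuery, "stubbed at the HTTP boundary" do
  # clean up module state so other spec files are not affected
  after { NewWeatherQuery.clear! }

  it "parses the canned response without touching the network" do
    # intercept any GET to the OpenWeatherMap API and return fixed JSON
    stub_request(:get, /api\.openweathermap\.org/).
      to_return(status: 200, body: '{"weather" : { "description" : "Sky is Clear"}}')

    forecast = NewWeatherQuery.forecast('Malibu,US')

    expect(forecast['weather']['description']).to eq('Sky is Clear')
  end
end

Stubbing at this level exercises the real http method, including its URI building and error handling, at the cost of coupling the spec to the request format; mocking the http method directly, as we do above, keeps the specs simpler.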
A case study of production faithfulness with a test resource instance

We're not done with our WeatherQuery example just yet. Let's take a look at how we would add a simple database to store our cached values.

There are some serious limitations to the way we are caching with instance variables, which persist only within the scope of a single Ruby process. As soon as we stop or restart our app, the entire cache is lost. In a production app, we would likely have many processes running the same code in order to serve traffic effectively. With our current approach, each process would have a separate cache, which would be very inefficient. We could easily save many HTTP requests if we were able to share the cache between processes and across restarts.

Economizing on these requests is not simply a matter of improved response time. We also need to consider that we cannot make unlimited requests to external services. For commercial services, we would pay for the number of requests we make. For free services, we are likely to get throttled if we exceed some threshold. Therefore, an effective caching scheme that reduces the number of HTTP requests we make to external services is of vital importance to a real-world app.

Finally, our cache is very simplistic and has no expiration mechanism short of clearing all entries. For a cache to be effective, we need to be able to store entries for individual locations for some period of time within which we don't expect the weather forecast to change much. This will keep the cache small and up to date.

We'll use Redis (http://redis.io) as our database since it is very fast, simple, and easy to set up. You can find instructions on the Redis website on how to install it, which is an easy process on any platform. Once you have Redis installed, you simply need to start the server locally, which you can do with the redis-server command. We'll also need to install the Redis Ruby client as a gem (https://github.com/redis/redis-rb). Let's start with a separate configuration file to set up our Redis client for our tests:

# spec/config/redis.rb

require 'rspec'
require 'redis'

ENV['WQ_REDIS_URL'] ||= 'redis://localhost:6379/15'

RSpec.configure do |config|
  if ! ENV['WQ_REDIS_URL'].is_a?(String)
    raise "WQ_REDIS_URL environment variable not set"
  end

  ::REDIS_CLIENT = Redis.new( :url => ENV['WQ_REDIS_URL'] )

  # flush the test database, but only after examples tagged with :redis
  config.after(:example, :redis) do
    ::REDIS_CLIENT.flushdb
  end
end

Note that we place this file in a new config folder under our main spec folder. The idea is to configure each resource separately in its own file to keep everything isolated and easy to understand. This will make maintenance easy and prevent problems with configuration management down the road.

We don't do much in this file, but we do establish some important conventions. There is a single environment variable, which takes care of the Redis connection URL. By using an environment variable, we make it easy to change the configuration and also allow flexibility in how it is stored. Our code doesn't care whether the Redis connection URL is stored in a simple .env file with key-value pairs or loaded from a configuration database.
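For instance, one simple way to provide such key-value pairs, not used in this article's code but sketched here purely as an illustration, is the dotenv gem, which copies entries from a .env file into ENV without overriding values that are already set:

# .env (hypothetical example, at the project root)
# WQ_REDIS_URL=redis://localhost:6379/15

# spec/spec_helper.rb (illustrative addition)
require 'dotenv'
Dotenv.load   # reads .env and fills in ENV entries that aren't already set

Because Dotenv.load never overrides variables that already exist, a value exported on the command line still wins, which fits nicely with the manual override technique shown next.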
We can also easily override this value manually, simply by setting it when we run RSpec, like so:

$ WQ_REDIS_URL=redis://1.2.3.4:4321/0 rspec spec

Note that we also set a sensible default value, which is to run on the default Redis port of 6379 on our local machine, on database number 15, which is less likely to be used for local development. This prevents our tests from relying on our development database, or from polluting or destroying it.

It is also worth mentioning that we prefix our environment variable with WQ (short for weather query). Small details like this are very important for keeping our code easy to understand and for preventing dangerous clashes. We could imagine the kinds of confusion and clashes that could be caused if we relied on REDIS_URL and we had multiple apps running on the same server, all relying on Redis. It would be very easy to break many applications if we changed the value of REDIS_URL for a single app to point to a different instance of Redis.

We set a global constant, ::REDIS_CLIENT, to point to a Redis client. We will use this in our code to connect to Redis. Note that in real-world code, we would likely have a global namespace for the entire app, and we would define globals such as REDIS_CLIENT under that namespace rather than in the global Ruby namespace.

Finally, we configure RSpec to call the flushdb command after every example tagged with :redis to empty the database and keep state clean across tests. In our code, all tests interact with Redis, so this tag seems pointless. However, it is very likely that we would add code that had nothing to do with Redis, and using tags helps us constrain the scope of our configuration hooks to only where they are needed. This also prevents confusion about multiple hooks running for the same example. In general, we want to avoid global hooks and instead trigger configuration hooks explicitly where they are needed.

So what does our spec look like now? Actually, it is almost exactly the same. Only a few lines have changed to work with the new Redis cache. See if you can spot them!

# spec/redis_weather_query_spec.rb

require_relative 'spec_helper'
require_relative '../redis_weather_query'

describe RedisWeatherQuery, redis: true do
  subject(:weather_query) { described_class }

  after { weather_query.clear! }

  let(:json_response) { '{}' }

  before do
    allow(weather_query).to receive(:http).and_return(json_response)
  end

  describe 'api_request is initialized' do
    it "does not raise an error" do
      weather_query.forecast('Malibu,US')
    end
  end

  describe 'caching' do
    let(:json_response) do
      '{"weather" : { "description" : "Sky is Clear"}}'
    end

    around(:example) do |example|
      actual = weather_query.send(:cache).all
      expect(actual).to eq({})

      example.run
    end

    it "stores results in local cache" do
      weather_query.forecast('Malibu,US')

      actual = weather_query.send(:cache).all
      expect(actual.keys).to eq(['Malibu,US'])
      expect(actual['Malibu,US']).to be_a(Hash)
    end

    it "uses cached result in subsequent queries" do
      weather_query.forecast('Malibu,US')
      weather_query.forecast('Malibu,US')
      weather_query.forecast('Malibu,US')
    end
  end

  describe 'query history' do
    before do
      expect(weather_query.history).to eq([])
    end

    it "stores every place requested" do
      places = %w(
        Malibu,US
        Beijing,CN
        Delhi,IN
        Malibu,US
        Malibu,US
        Beijing,CN
      )

      places.each {|s| weather_query.forecast(s) }

      expect(weather_query.history).to eq(places)
    end

    it "does not allow history to be modified" do
      expect {
        weather_query.history = ['Malibu,CN']
      }.to raise_error

      weather_query.history << 'Malibu,CN'
      expect(weather_query.history).to eq([])
    end
  end

  describe 'number of API requests' do
    before do
      expect(weather_query.api_request_count).to eq(0)
    end

    it "stores every place requested" do
      places = %w(
        Malibu,US
        Beijing,CN
        Delhi,IN
        Malibu,US
        Malibu,US
        Beijing,CN
      )

      places.each {|s| weather_query.forecast(s) }

      expect(weather_query.api_request_count).to eq(3)
    end

    it "does not allow count to be modified" do
      expect {
        weather_query.api_request_count = 100
      }.to raise_error

      expect {
        weather_query.api_request_count += 10
      }.to raise_error

      expect(weather_query.api_request_count).to eq(0)
    end
  end
end

So what about the actual WeatherQuery code? It changes very little as well:

# redis_weather_query.rb

require 'net/http'
require 'json'
require 'timeout'

# require the new cache module
require_relative 'redis_weather_cache'

module RedisWeatherQuery
  extend self

  class NetworkError < StandardError
  end

  # ... same as before ...

  def clear!
    @history           = []
    @api_request_count = 0

    # no more clearing of cache here
  end

  private

  # ... same as before ...

  # the new cache module has a Hash-like interface
  def cache
    RedisWeatherCache
  end

  # ... same as before ...

end

We can see that we've preserved pretty much the same code and specs as before. Almost all of the new functionality is accomplished in a new module that caches with Redis. Here is what it looks like:

# redis_weather_cache.rb

require 'redis'
require 'json'

module RedisWeatherCache
  extend self

  CACHE_KEY             = 'weather_query:cache'
  EXPIRY_ZSET_KEY       = 'weather_query:expiry_tracker'
  EXPIRE_FORECAST_AFTER = 300 # 5 minutes

  def redis_client
    if ! defined?(::REDIS_CLIENT)
      raise("No REDIS_CLIENT defined!")
    end

    ::REDIS_CLIENT
  end

  def []=(location, forecast)
    redis_client.hset(CACHE_KEY, location, JSON.generate(forecast))
    redis_client.zadd(EXPIRY_ZSET_KEY, Time.now.to_i, location)
  end

  def [](location)
    remove_expired_entries

    raw_value = redis_client.hget(CACHE_KEY, location)

    if raw_value
      JSON.parse(raw_value)
    else
      nil
    end
  end

  def all
    redis_client.hgetall(CACHE_KEY).inject({}) do |memo, (location, forecast_json)|
      memo[location] = JSON.parse(forecast_json)
      memo
    end
  end

  def clear!
    redis_client.del(CACHE_KEY)
  end

  def remove_expired_entries
    # expired locations have a score, i.e. creation timestamp, less than a certain threshold
    expired_locations = redis_client.zrangebyscore(EXPIRY_ZSET_KEY, 0, Time.now.to_i - EXPIRE_FORECAST_AFTER)

    if ! expired_locations.empty?
      # remove the cache entries
      redis_client.hdel(CACHE_KEY, expired_locations)

      # also clear the expiry entries
      redis_client.zrem(EXPIRY_ZSET_KEY, expired_locations)
    end
  end
end

We'll avoid a detailed explanation of this code. We simply note that we accomplish all of the design goals we discussed at the beginning of the section: a persistent cache with expiration of individual values. We've accomplished this using some simple Redis functionality along with a ZSET, or sorted set, which is a bit more complex, and which we needed because Redis does not provide automatic expiry for individual entries within a Hash, only for whole keys. We can see that by using method names such as RedisWeatherCache.[] and RedisWeatherCache.[]=, we've maintained a Hash-like interface, which made it easy to use this cache instead of the simple in-memory Ruby Hash we had in our previous iteration. Our tests all pass and are still pretty simple, thanks to the modularity of this new cache code, the modular configuration file, and the previous fixes we made to our specs to remove Internet and run-order dependencies.

Summary

In this article, we delved into setting up and cleaning up state for real-world specs that interact with external services and local resources by extending our WeatherQuery example to address a big bug, isolate our specs from the Internet, and cleanly configure a Redis database to serve as a better cache.