Developing Extensible Data Security

Packt
17 Jun 2015
7 min read
This article is written by Ahmed Mohamed Rafik Moustafa, the author of Microsoft Dynamics AX 2012 R3 Security. In any corporation, some users are restricted from working with specific sensitive data because of its confidentiality or company policies, and this type of data access authorization can be managed using extensible data security (XDS). XDS is the evolution of the record-level security (RLS) that was available in previous versions of Microsoft Dynamics AX. Microsoft has also kept RLS in AX 2012, so you can refer to it at any time.

The topics that will be covered in this article are as follows:

- The main concepts of XDS policies
- Designing and developing the XDS policy
- Creating the XDS policy
- Adding constrained tables and views
- Setting the XDS policy context
- Debugging the XDS policy

(For more resources related to this topic, see here.)

The main concepts of XDS policies

When developing an XDS policy, you need to be familiar with the following concepts:

- Constrained tables: A constrained table is the table or tables in a given security policy from which data is filtered or secured, based on the associated policy query.
- Primary tables: A primary table is used to secure the content of the related constrained table.
- Policy queries: A policy query is used to secure the constrained tables specified in a given extensible data security policy.
- Policy context: A policy context is a piece of information that controls the circumstances under which a given policy is considered to be applicable. If this context is not set, then the policy, even if enabled, is not enforced.

After understanding these concepts, we move on to the four steps to develop an XDS policy, which are as follows:

1. Design the query on the primary tables.
2. Develop the policy.
3. Add the constrained tables and views.
4. Set up the policy context.

Designing and developing the XDS policy

XDS is a powerful mechanism that allows us to express and implement data security needs. The following steps show detailed instructions on designing and developing XDS:

1. Determine the primary table; for example, VendTable.
2. Create a query under the AOT Queries node:
   - Use VendTable as the first data source.
   - Add other data sources as required by the vendor data model.

Creating the policy

Now we have to create the policy itself. Follow these steps:

1. Right-click on AOT and go to Security | Policies.
2. Select New Security Policy.
3. Set the PrimaryTable property on the policy to VendTable.
4. Set the Query property on the policy to VendProfileAccountPolicy.
5. Set the PolicyGroup property to Vendor Self Service.
6. Set the ConstrainedTable property to Yes to secure the primary table using this policy.
7. Set the Enabled property to Yes or No, depending on whether or not you want the policy to be enforced.
8. Set the ContextType property to one of the following:
   - ContextString: Set the property to this value if a global context is to be used with the policy. After using ContextString, it needs to be set by the application using the XDS::SetContext API.
   - RoleName: Set the property to this value if the policy should be applied only when a user in a specific role accesses the constrained tables.
   - RoleProperty: Set the property to this value if the policy is to be applied only when the user is a member of any one of the roles that have the ContextString property set to the same value.
The following screenshot displays the properties.

Adding constrained tables and views

After designing the query and developing the required policy, the next step is to add the constrained tables and views that contain the data secured by the created policy. By following the next steps, you will be able to add constrained tables or views:

1. Right-click on the Constrained Tables node.
2. Go to New | Add table to add a constrained table; for example, the AssetBook table, as shown in the following screenshot. When adding the constrained table AssetBook, you must determine the relationship that should be used to join the primary table with the new constrained table.
3. Go to New | Add View to add a constrained view to the selected policy.
4. Repeat these steps for every constrained table or view that needs to be secured through this policy.

After finishing these steps, the policy will be applied to all users who attempt to access the tables or views listed under the Constrained Tables node, provided that the Enabled property is set to Yes. Security policies are not applied to system administrators who are in the SysAdmin role.

Setting the XDS policy context

According to the requirements, the security policy needs to be adjusted to apply only to the users who are assigned to the vendor role. The following steps should be performed to make the appropriate adjustment:

1. Set the ContextType property on the policy node to RoleProperty.
2. Set the ContextString property on the policy node to ForAllVendorRoles.
3. To assign this policy to all the vendor roles, the ForAllVendorRoles context should be applied to the appropriate roles:
   - Locate each role that needs to be assigned to this policy on the AOT node; for example, the VendVendor role.
   - Set the ContextString property on the VendVendor role to ForAllVendorRoles.

For more information, go to MSDN and refer to Whitepapers – Developing Extensible Data Security Policies at https://msdn.microsoft.com/en-us/library/bb629286.aspx.

Debugging XDS policies

One of the most common issues reported when a new XDS policy is deployed is that an unexpected number of rows is returned from a given constrained table. For example, more sales orders are returned than expected if the sales order table is constrained by a given customer group. XDS provides a method to debug these errors, which we will go over now.

Start by reviewing the SQL queries that have been generated. The X++ select statement has been extended with a command that instructs the underlying data access framework to generate the SQL query without actually executing it. The following job runs a select query on SalesTable with the generateonly command. It then calls the getSQLStatement() method on SalesTable and dumps the output using the info API:

static void VerifySalesQuery(Args _args)
{
    SalesTable salesTable;
    XDSServices xdsServices = new XDSServices();
    xdsServices.setXDSContext(1, '');

    // Only generate the SQL statement for the salesTable query
    select generateonly forceLiterals CustAccount, DeliveryDate from salesTable;

    // Print SQL statement to infolog
    info(salesTable.getSQLStatement());

    xdsServices.setXDSContext(2, '');
}

The XDS policy development framework further eases this process of doing some advanced debugging by storing the query in a human-readable form.
This query, and others on a given constrained table in a policy, can be retrieved by using the following Transact-SQL query on the database in the development environment (AXDBDEV in this example):

SELECT [PRIMARYTABLEAOTNAME], [QUERYOBJECTAOTNAME],
       [CONSTRAINEDTABLE], [MODELEDQUERYDEBUGINFO],
       [CONTEXTTYPE], [CONTEXTSTRING],
       [ISENABLED], [ISMODELED]
FROM [AXDBDEV].[dbo].[ModelSecPolRuntimeEx]

This SQL query generates the following output. As you can see, the query that will be joined into the WHERE clause of any query against the AssetBook table is ready for debugging. Other metadata, such as LayerId, can be inspected if needed. When multiple policies apply to a table, the results of the policies are combined with AND operators.

Summary

By the end of this article, you should be able to secure your sensitive data using the XDS features. We learned how to design and develop XDS policies, how constrained tables and views, primary tables, and policy queries fit together, how to set the security context, how to retrieve the generated SQL queries, and how to debug XDS policies.

Resources for Article:

Further resources on this subject:

- Understanding and Creating Simple SSRS Reports [article]
- Working with Data in Forms [article]
- Learning MS Dynamics AX 2012 Programming [article]

Global Illumination

Packt
17 Jun 2015
16 min read
In this article by Volodymyr Gerasimov, the author of the book Building Levels in Unity, you will see the two types of lighting that you need to take into account if you want to create well-lit levels: direct and indirect. Direct light is light that comes straight from the source. Indirect light is created by light bouncing off the affected area at a certain angle with variable intensity. In the real world, the number of bounces is infinite, and that is the reason why we can see dark areas that don't have light shining directly at them. In computer software, we don't have infinite computing power at our disposal, so we have to use different tricks to simulate realistic lighting at runtime. The process that simulates indirect lighting, light bouncing, reflections, and color bleeding is known as Global Illumination (GI).

Unity 5 is powered by one of the industry's leading technologies for handling indirect lighting (radiosity) in the gaming industry, called Enlighten by Geomerics. Games such as Battlefield 3 and 4, Medal of Honor: Warfighter, Need for Speed: The Run, and Dragon Age: Inquisition are excellent examples of what this technology is capable of, and now all of that power is at your fingertips completely for free! Now, it's only appropriate to learn how to tame this new beast.

(For more resources related to this topic, see here.)

Preparing the environment

Realtime realistic lighting is just not feasible at our level of computing power, which forces us to invent tricks to simulate it as closely as possible. But just like with any trick, there are certain conditions that need to be met in order for it to work properly and keep the viewer's eyes from exposing our clever deception. To demonstrate how to work with these limitations, we are going to construct a simple light setup for a small interior scene and talk about solutions to the problems as we go.

For this example, we will use the LightmappingInterior scene that can be found in the Chapter 7 folder in the Project window. It's a very simple interior and should take us no time to set up. The first step is to place the lights. We will need to create two lights: a Directional light to imitate the moonlight coming from the crack in the dome, and a Point light for the fire burning in the goblet on the ceiling.

Tune the light's Intensity, Range (in the Point light's case), and Color to your liking. So far so good! We can see the direct lighting coming from the moonlight, but there is no trace of indirect lighting. Why is this happening? Does GI need to be enabled somehow for it to work? As a matter of fact, it does, and here comes the first limitation of Global Illumination: it only works on GameObjects that are marked as Static.

Static versus dynamic objects

Unity objects can be of one of two categories: static or dynamic. The differentiation is very simple: static objects don't move, they stay still where they are at all times, and they neither play any animations nor engage in any kind of interactions. The rest of the objects are dynamic. By default, all objects in Unity are dynamic and can only be converted into static by checking the Static checkbox in the Inspector window.

See it for yourself. Try to mark an object as static in Unity and attempt to move it around in the Play mode. Does it work?
Global Illumination will only work with static objects; this means that, before we go into the Play mode, we need to be 100 percent sure that the objects that will cast and receive indirect light will not move from their designated positions.

But why is that, you may ask? Isn't the whole purpose of Realtime GI to calculate indirect lighting at runtime? The answer is yes, but only to an extent. The technology behind this is called Precomputed Realtime GI. According to Unity's developers, it precomputes all possible bounces that the light can make and encodes them to be used at runtime; so it essentially takes a static object and a light and answers the question: "If this light is going to travel around, how is it going to bounce from the affected surface of the static object from every possible angle?"

During runtime, lights use this encoded data as instructions on how the light should bounce, instead of calculating it every frame. Having static objects can be beneficial in many other ways, such as pathfinding, but that's a story for another time.

To test this theory, let's mark the objects in the scene as Static, meaning they will not move (and can't be forced to move) by physics, code, or even the transformation tools (the latter is only true during the Play mode). To do that, simply select the Pillar, Dome, WaterProNighttime, and Goblet GameObjects in the Hierarchy window and check the Static checkbox at the top-right corner of the Inspector window. Doing that will cause Unity to recalculate the lighting and encode the bouncing information. Once the process has finished (it should take no time at all), you can hit the Play button and move the light around. Notice that the bounce lighting changes as well, without any performance overhead.

Fixing the light coming from the crack

The moonlight inside the dome should be coming from the crack on its surface; however, if you rotate the directional light around, you'll notice that it simply ignores the concrete walls and freely shines through. Naturally, that is incorrect behavior and we can't let it stay. We can clearly see through the dome ourselves from the outside as a result of one-sided normals. Earlier, the solution was to duplicate the faces and invert the normals; however, in this case, we actually don't mind seeing through the walls and only want to fix the lighting issue. To fix this, we need to go to the Mesh Renderer component of the Dome GameObject and select the Two Sided option from the drop-down menu of the Cast Shadows parameter.

This will ignore backface culling and allow us to cast shadows from both sides of the mesh, thus fixing the problem. In order to cast shadows, make sure that your directional light has the Shadow Type parameter set to either Hard Shadows or Soft Shadows.

Emission materials

Another way to light up the level is to utilize materials with Emission maps. Pillar_EmissionMaterial, applied to the Pillar GameObject, already has an Emission map assigned to it; all that is left is to crank up the parameter next to it to a number that will give it a noticeable effect (let's say 3). Unfortunately, emissive materials are not lights, and precomputed GI will not be able to update the indirect light bounce created by the emissive material. As a result, changing the material in the Play mode will not cause an update. Changes done to materials in the Play mode will be preserved in the Editor.

Shadows

An important byproduct of lighting is the shadows cast by affected objects.
No surprises here! Unity allows shadows to be cast by both dynamic and static objects, with different results based on the render settings. By default, all lights in Unity have shadows disabled. In order to enable shadows for a particular light, we need to modify the Shadow Type parameter to be either Hard Shadows or Soft Shadows in the Inspector window.

Enabling shadows will grant you access to three parameters:

- Strength: This is the darkness of shadows, from 0 to 1.
- Resolution: This controls the resolution of the shadows. This parameter can utilize the value set in Use Quality Settings or be selected individually from the drop-down menu.
- Bias and Normal Bias: This is the shadow offset. These parameters are used to prevent an artifact known as shadow acne (pixelated shadows in lit areas); however, setting them too high can cause another artifact known as peter panning (a disconnected shadow). The default values usually help us to avoid both issues.

Unity uses a technique known as shadow mapping, which determines the objects that will be lit by assuming the light's perspective: every object that the light sees directly is lit; every object that it doesn't see should be in shadow. After rendering the light's perspective, Unity stores the depth of each surface in a shadow map. In cases where the shadow map resolution is low, this can cause some pixels to appear shaded when they shouldn't be (shadow acne), or not have a shadow where there is supposed to be one (peter panning), if the offset is too high.

Unity allows you to control which objects should receive or cast shadows by changing the Cast Shadows and Receive Shadows parameters in the Mesh Renderer component of a GameObject.

Lightmapping

Every year, more and more games are released with real-time rendering solutions that allow for more realistic-looking environments at the price of the ever-growing computing power of modern PCs and consoles. However, due to the limited hardware capabilities of mobile platforms, it will still be a long time before we are ready to part ways with cheap and affordable techniques such as lightmapping.

Lightmapping is a technology for precomputing the brightness of surfaces, also known as baking, and storing it in a separate texture, a lightmap. In order to see lighting in an area, we need to be able to calculate it at least 30 times per second (or more, based on fps requirements). This is not very cheap; however, with lightmapping we can calculate the lighting once and then apply it as a texture. This technology is suitable for static objects that artists know will never be moved; in a nutshell, the process involves creating a scene, setting up the lighting rig, and clicking Bake to get great lighting with minimal performance cost during runtime. To demonstrate the lightmapping process, we will take the scene and try to bake it using lightmapping.

Static versus dynamic lights

We've just talked about a way to guarantee that GameObjects will not move. But what about lights? Checking the Static checkbox for lights will not achieve much (unless you simply want to avoid the possibility of accidentally moving them). The problem at hand is that a light, being a component of an object, has a separate set of controls that allow it to be manipulated even if its holder is set to static. For that purpose, each light has a parameter that allows us to specify the role of the individual light and its contribution to the baking process; this parameter is called Baking.
There are three options available for it:

- Realtime: This option will exclude the light from the baking process. It is totally fine to use real-time lighting; precomputed GI will make sure that modern computers and consoles are able to handle it quite smoothly. However, it might cause an issue if you are developing for mobile platforms, which require every bit of optimization to be able to run at a stable frame rate. There are ways to fake real-time lighting with much cheaper options. The only thing you should consider is that the number of realtime lights should be kept to a minimum if you are going for maximum optimization. Realtime lights affect both static and dynamic objects.
- Baked: This option will include the light in the baking process. However, there is a catch: only static objects will receive light from it. This is self-explanatory: if we want dynamic objects to receive lighting, we need to calculate it every time the position of an object changes, which is what Realtime lighting does. Baked lights are cheap: the lighting is calculated once, all the lighting information is stored on the hard drive and used from there, and no further recalculation is required during runtime. Baked is mostly used for small situational lights that won't have a significant effect on dynamic objects.
- Mixed: This one is a combination of the previous two options. It bakes the light into the static objects and affects the dynamic objects as they pass by. Think of street lights: you want passing cars to be affected; however, you have no need to calculate the lighting for the static environment in realtime. Naturally, we can't have dynamic objects move around the level unlit, no matter how much we'd like to save on computing power. Mixed allows us to have the benefit of baked lighting on the static objects as well as affect the dynamic objects at runtime.

The first step that we are going to take is changing the Baking parameter of our lights from Realtime to Baked and enabling Soft Shadows.

You shouldn't notice any significant difference, except for the extra shadows appearing. The final result isn't too different from the real-time lighting; its performance is much better, but it lacks support for dynamic objects.

Dynamic shadows versus static shadows

One of the things that gets people confused when starting to work with shadows in Unity is how they are cast by static and dynamic objects with different Baking settings on the light source. This is one of those things that you simply need to memorize and keep in mind when planning the lighting in the scene. We are going to explore how different Baking options affect shadow casting between different combinations of static and dynamic objects.

As you can see, real-time lighting handles everything pretty well; all the objects are casting shadows onto each other and everything works as intended. There is even color bleeding happening between the two static objects on the right.

With Baked lighting, the result isn't that inspiring. Let's break it down. Dynamic objects are not lit: if an object is subject to change at runtime, we can't preemptively bake it into the lightmap; therefore, lights that are set to Baked will simply ignore it. Shadows are only cast by static objects onto static objects. This follows from the previous statement: if we aren't sure that an object isn't going to change, we can't safely bake its shadows into the shadow map.
With Mixed, we get a similar result as with real-time lighting, except for one instance: dynamic objects are not casting shadows onto static objects. The reverse does work: static objects cast shadows onto dynamic objects just fine, so what's the catch? Each object gets individual treatment from the Mixed light: objects that are static are treated as if they are lit by a Baked light, while dynamic objects are lit in realtime. In other words, when we are casting a shadow onto a dynamic object, it is calculated in realtime, while when we are casting a shadow onto a static object, it is baked, and we can't bake a shadow that is cast by an object that is subject to change. This was never the case with real-time lighting, since we were calculating the shadows in realtime, regardless of what they were cast by or cast onto. Again, this is just one scenario that you need to memorize.

Lighting options

The Lighting window has three tabs: Object, Scene, and Lightmap. For now, we will focus on the first one. The main content of the Object tab is information on the objects that are currently selected. This gives us quick access to a list of controls, so we can better tweak the selected objects for lightmapping and GI. You can switch between object types with the help of Scene Filter at the top; this is a shortcut to filtering objects in the Hierarchy window (it will not filter the selected GameObjects, but everything in the Hierarchy window).

All GameObjects need to be set to Static in order to be affected by the lightmapping process; this is why the Lightmap Static checkbox is the first in the list for Mesh Renderers. If you haven't set the object to static in the Inspector window, checking the Lightmap Static box will do just that.

The Scale in Lightmap parameter controls the lightmap resolution. The greater the value, the bigger the resolution given to the object's lightmap, resulting in better lighting effects and shadows. Setting the parameter to 0 will result in the object not being affected by lightmapping. Unless you are trying to fix lighting artifacts on the object, or going for maximum optimization, you shouldn't touch this parameter; there is a better way to adjust the lightmap resolution for all objects in the scene, since Scale in Lightmap scales in relation to a global value. The rest of the parameters are very situational and quite advanced; they deal with UVs, extend the effect of GI on the GameObject, and give detailed information on the lightmap.

For lights, we have the Baking parameter with three options: Realtime, Baked, or Mixed. Naturally, if you want a light to be used for lightmapping, Realtime is not an option, so you should pick Baked or Mixed. Color and Intensity are referenced from the Inspector window and can be adjusted in either place. Baked Shadows allows us to choose the shadow type that will be baked (Hard, Soft, Off).

Summary

Lighting is a difficult process that is deceptively easy to learn, but hard to master. In Unity, lighting isn't without its issues. Attempting to apply real-world logic to 3D rendering will result in a direct confrontation with the limitations posed by imperfect simulation. In order to solve the issues that may arise, one must first understand what might be causing them, in order to isolate the problem and attempt to find a solution. Alas, there are still a lot of topics left uncovered that are outside the realm of an introduction.
If you wish to learn more about lighting, I would point you again to the official documentation and developer blogs, where you'll find a lot of useful information, tons of theory, practical recommendations, as well as an in-depth look into all the light elements discussed.

Resources for Article:

Further resources on this subject:

- Learning NGUI for Unity [article]
- Saying Hello to Unity and Android [article]
- Components in Unity [article]

Code Style in Django

Packt
17 Jun 2015
16 min read
In this article, written by Sanjeev Jaiswal and Ratan Kumar, authors of the book Learning Django Web Development, we will cover the basic topics you will need to follow along, such as coding practices for better Django web development, which IDE to use, version control, and so on. We will learn the following topics in this article:

- Django coding style
- Using an IDE for Django web development
- Django project structure

This article is based on the important fact that code is read much more often than it is written. Thus, before you actually start building your projects, we suggest that you familiarize yourself with all the standard practices adopted by the Django community for web development.

Django coding style

Most of Django's important practices are based on Python. Though chances are you already know them, we will still take a break and write down all the documented practices so that you know these concepts even before you begin. To mainstream standard practices, Python enhancement proposals are made, and one such widely adopted standard practice for development is PEP8, the style guide for Python code, authored by Guido van Rossum. The documentation says, "PEP8 deals with semantics and conventions associated with Python docstrings." For further reading, please visit http://legacy.python.org/dev/peps/pep-0008/.

Understanding indentation in Python

When you are writing Python code, indentation plays a very important role. It acts as a block delimiter, like the braces in other languages such as C or Perl. But it's always a matter of discussion amongst programmers whether we should use tabs or spaces, and, if spaces, how many: two, four, or eight. Using four spaces for indentation is better than eight; if there are a few more nested blocks, using eight spaces for each indentation level may take up more characters than can be shown on a single line. But, again, this is the programmer's choice.

The following is what incorrect indentation practices lead to:

>>> def a():
...   print "foo"
...     print "bar"
IndentationError: unexpected indent

So, which one should we use: tabs or spaces? Choose either of them, but never mix tabs and spaces in the same project, or else it will be a nightmare for maintenance. The most popular way of indenting in Python is with spaces; tabs come in second. If any code you have encountered has a mixture of tabs and spaces, you should convert it to using spaces exclusively.

Doing indentation right – do we need four spaces per indentation level?

There has been a lot of confusion about this, as, of course, Python's syntax is all about indentation. Let's be honest: in most cases, it is. So, it is highly recommended to use four spaces per indentation level, and if you have been following the two-space method, stop using it. There is nothing wrong with it, but when you deal with multiple third-party libraries, you might end up with a spaghetti of different indentation styles, which will ultimately become hard to debug.

Now for continuation lines. When your code is on a continuation line, you should either wrap it vertically aligned, or use a hanging indent. When you are using a hanging indent, the first line should not contain any arguments, and further indentation should be used to clearly distinguish it as a continuation line. A hanging indent (also known as a negative indent) is a style of indentation in which all lines are indented except for the first line of the paragraph. The preceding paragraph is an example of a hanging indent.
The following example illustrates how you should use a proper indentation method while writing code:

bar = some_function_name(var_first, var_second,
                         var_third, var_fourth)
# Here the indentation of the arguments makes them grouped, and stand clear from others.

def some_function_name(
        var_first, var_second, var_third,
        var_fourth):
    print(var_first)
# This example shows the hanging indent.

We do not encourage the following coding style:

# When vertical alignment is not used, arguments on the first line are forbidden
foo = some_function_name(var_first, var_second,
    var_third, var_fourth)

# Further indentation is required, as the indentation is not distinguishable between arguments and source code.
def some_function_name(
    var_first, var_second, var_third,
    var_fourth):
    print(var_first)

Although extra indentation is not required, if you want to use extra indentation to ensure that the code will work, you can use the following coding style:

# Extra indentation is not necessary.
if (this
    and that):
    do_something()

Ideally, you should limit each line to a maximum of 79 characters. This leaves room for a + or – character used when viewing differences under version control, and it keeps lines uniform across editors. You can use the rest of the space for other purposes.

The importance of blank lines

The importance of two blank lines and single blank lines is as follows:

- Two blank lines: A double blank line can be used to separate top-level functions and class definitions, which enhances code readability.
- Single blank lines: A single blank line can be used in several cases; for example, each function inside a class can be separated by a single line, and related functions can be grouped together with a single line. You can also separate logical sections of source code with a single line.

Importing a package

Importing a package is a direct implication of code reusability. Therefore, always place imports at the top of your source file, just after any module comments and docstrings, and before the module's globals and constants. Each import should usually be on a separate line. The best way to import packages is as follows:

import os
import sys

It is not advisable to import more than one package on the same line, for example:

import sys, os

You may import packages in the following fashion, although it is optional:

from django.http import Http404, HttpResponse

If your import gets longer, you can use the following method to declare it:

from django.http import (
    Http404, HttpResponse, HttpResponsePermanentRedirect
)

Grouping imported packages

Package imports can be grouped in the following ways:

- Standard library imports: Such as sys, os, subprocess, and so on.

  import re
  import simplejson

- Related third party imports: These are usually downloaded from the Python cheese shop, that is, PyPI (using pip install). Here is an example:

  from decimal import *

- Local application / library-specific imports: This includes the local modules of your project, such as models, views, and so on.

  from models import ModelFoo
  from models import ModelBar

Naming conventions in Python/Django

Every programming language and framework has its own naming convention. The naming convention in Python/Django is more or less the same, but it is worth mentioning it here.
You will need to follow these conventions when creating a variable name or global variable name, and when naming a class, package, module, and so on. This is the common naming convention that we should follow:

- Name the variables properly: Never use single characters, for example, 'x' or 'X', as variable names. It might be okay for your normal Python scripts, but when you are building a web application, you must name the variables properly, as this determines the readability of the whole project.
- Naming of packages and modules: Lowercase and short names are recommended for modules. Underscores can be used if their use would improve readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged. Since module names are mapped to file names (models.py, urls.py, and so on), it is important that module names be chosen to be fairly short, as some file systems are case insensitive and truncate long names.
- Naming a class: Class names should follow the CamelCase naming convention, and classes for internal use can have a leading underscore in their name.
- Global variable names: First of all, you should avoid using global variables, but if you need to use them, you can prevent global variables from being exported via __all__, or by defining them with a prefixed underscore (the old, conventional way).
- Function names and method arguments: Names of functions should be in lowercase, with words separated by underscores. Use self as the first argument of instance methods and cls as the first argument of class methods.
- Method names and instance variables: Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability. Use one leading underscore only for non-public methods and instance variables.

Using an IDE for faster development

There are many options on the market when it comes to source code editors. Some people prefer full-fledged IDEs, whereas others like simple text editors. The choice is totally yours; pick whatever feels more comfortable. If you already use a certain program to work with Python source files, I suggest that you stick to it, as it will work just fine with Django. Otherwise, I can make a couple of recommendations, such as these:

- SublimeText: This editor is lightweight and very powerful. It is available for all major platforms, supports syntax highlighting and code completion, and works well with Python. You can find it at http://www.sublimetext.com/
- PyCharm: This, I would say, is the most intelligent code editor of all and has advanced features, such as code refactoring and code analysis, which makes development cleaner. Features for Django include template debugging (which is a winner) and also quick documentation, so this editor is a must for beginners. The community edition is free, and you can sample a 30-day trial version before buying the professional edition.

Setting up your project with the Sublime text editor

Most of the examples that we will show you in this book will be written using the Sublime text editor. In this section, we will show how to install it and set up a Django project.

- Download and installation: You can download Sublime from the download tab of the site www.sublimetext.com. Click on the downloaded file option to install.
- Setting up for Django: Sublime has a very extensive plug-in ecosystem, which means that once you have downloaded the editor, you can install plug-ins for adding more features to it.
After successful installation, it will look like this:

Most important of all is Package Control, which is the manager for installing additional plugins directly from within Sublime. This will be your only manual package installation; it will take care of the rest of the package installations from then on. Some of the recommendations for Python development using Sublime are as follows:

- Sublime Linter: This gives instant feedback about your Python code as you write it. It also has PEP8 support; this plugin will highlight, in real time, the things we discussed about better coding in the previous section so that you can fix them.
- Sublime CodeIntel: This is maintained by the developer of SublimeLinter. Sublime CodeIntel has some advanced functionality, such as jump-to-definition, intelligent code completion, and import suggestions.

You can also explore other plugins for Sublime to increase your productivity.

Setting up the PyCharm IDE

You can use any of your favorite IDEs for Django project development. We will use the PyCharm IDE for this book. This IDE is recommended as it will help you at the time of debugging, using breakpoints that will save you a lot of time figuring out what actually went wrong. Here is how to install and set up the PyCharm IDE for Django:

- Download and installation: You can check the features and download the PyCharm IDE from the following link: http://www.jetbrains.com/pycharm/
- Setting up for Django: Setting up PyCharm for Django is very easy. You just have to import the project folder and give the manage.py path, as shown in the following figure.

The Django project structure

The Django project structure was changed in the 1.6 release. Django (django-admin.py) also has a startapp command to create an application, so it is high time to tell you the difference between an application and a project in Django. A project is a complete website or application, whereas an application is a small, self-contained Django application. An application is based on the principle that it should do one thing and do it right. To ease the pain of building a Django project right from scratch, Django gives you an advantage by auto-generating the basic project structure files, from which any project can be taken forward for its development and feature addition. Thus, to conclude, we can say that a project is a collection of applications, and an application can be written as a separate entity and can be easily exported to other applications for reusability.

To create your first Django project, open a terminal (or Command Prompt for Windows users), type the following command, and hit Enter:

$ django-admin.py startproject django_mytweets

This command will make a folder named django_mytweets in the current directory and create the initial directory structure inside it. Let's see what kind of files are created. The new structure is as follows:

django_mytweets/
    django_mytweets/
    manage.py

This is the content of django_mytweets/:

django_mytweets/
    __init__.py
    settings.py
    urls.py
    wsgi.py

Here is a quick explanation of what these files are:

django_mytweets (the outer folder): This folder is the project folder. Contrary to the earlier project structure, in which the whole project was kept in a single folder, the new Django project structure somehow hints that every project is an application inside Django. This means that you can import other third-party applications on the same level as the Django project.
This folder also contains the manage.py file, which includes all the project management settings.

manage.py: This utility script is used to manage our project. You can think of it as your project's version of django-admin.py. Actually, both django-admin.py and manage.py share the same backend code. Further clarification about the settings will be provided when we tweak them later. Let's have a look at the manage.py file:

#!/usr/bin/env python
import os
import sys

if __name__ == "__main__":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "django_mytweets.settings")
    from django.core.management import execute_from_command_line
    execute_from_command_line(sys.argv)

The source code of the manage.py file will be self-explanatory once you read the following code explanation.

#!/usr/bin/env python

The first line is just the declaration that the following file is a Python file, followed by the import section, in which the os and sys modules are imported. These modules mainly contain system-related operations.

import os
import sys

The next piece of code checks whether the file is being executed as the main program, which is the first code to be executed, and then loads the Django settings module onto the current path. As you are already running a virtual environment, this will set the path for all the modules to the path of the currently running virtual environment.

if __name__ == "__main__":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE",
                          "django_mytweets.settings")

django_mytweets/ (the inner folder)

__init__.py: Django projects are Python packages, and this file is required to tell Python that this folder is to be treated as a package. A package, in Python's terminology, is a collection of modules, and packages are used to group similar files together and prevent naming conflicts.

settings.py: This is the main configuration file for your Django project. In it, you can specify a variety of options, including database settings, site language(s), which Django features need to be enabled, and so on. By default, the database is configured to use SQLite, which is advisable for testing purposes. Here, we will only see how to enter the database settings in the settings file; it also contains the basic settings configuration, and with a slight modification in the manage.py file, it can be moved to another folder, such as config or conf. To make every other third-party application a part of the project, we need to register it in the settings.py file. INSTALLED_APPS is a variable that contains all the entries about the installed applications. As the project grows, it becomes difficult to manage; therefore, there are three logical partitions for the INSTALLED_APPS variable, as follows:

- DEFAULT_APPS: This parameter contains the default Django installed applications (such as the admin)
- THIRD_PARTY_APPS: This parameter contains other applications, like SocialAuth, used for social authentication
- LOCAL_APPS: This parameter contains the applications that are created by you

urls.py: This is another configuration file. You can think of it as a mapping between URLs and the Django view functions that handle them. This file is one of Django's more powerful features.

When we start writing code for our application, we will create new files inside the project's folder, so the folder also serves as a container for our code. Now that you have a general idea of the structure of a Django project, let's configure our database system.
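Before moving on, here is a minimal sketch of how the partitioned INSTALLED_APPS and the default SQLite database configuration described above might look inside settings.py. The split into DEFAULT_APPS, THIRD_PARTY_APPS, and LOCAL_APPS follows the partitioning just described; the commented example app names are hypothetical and not part of the original project.

# settings.py (sketch) -- partitioning INSTALLED_APPS as described above.
# The commented app names and the SQLite path are illustrative assumptions.
import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

DEFAULT_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
)

THIRD_PARTY_APPS = (
    # e.g. 'social_auth',  # hypothetical third-party app for social authentication
)

LOCAL_APPS = (
    # e.g. 'tweets',       # hypothetical application created by you
)

INSTALLED_APPS = DEFAULT_APPS + THIRD_PARTY_APPS + LOCAL_APPS

# Default database configuration generated by startproject: SQLite,
# which is fine for testing purposes.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}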
Summary

We prepared our development environment in this article, created our first project, set up the database, and learned how to launch the Django development server. We learned the best way to write code for our Django project and saw the default Django project structure.

Resources for Article:

Further resources on this subject:

- Tinkering Around in Django JavaScript Integration [article]
- Adding a developer with Django forms [article]
- So, what is Django? [article]

Client and Server Applications

Packt
16 Jun 2015
27 min read
In this article by Sam Washington and Dr. M. O. Faruque Sarker, authors of the book Learning Python Network Programming, we're going to use sockets to build network applications. Sockets follow one of the main models of computer networking, that is, the client/server model. We'll look at this with a focus on structuring server applications. We'll cover the following topics:

- Designing a simple protocol
- Building an echo server and client

(For more resources related to this topic, see here.)

The examples in this article are best run on Linux or a Unix operating system. The Windows sockets implementation has some idiosyncrasies, and these can create some error conditions which we will not be covering here. Note that Windows does not support the poll interface that we'll use in one example. If you do use Windows, then you'll probably need to use Ctrl + Break to kill these processes in the console, rather than using Ctrl + C, because Python in a Windows command prompt doesn't respond to Ctrl + C when it's blocking on a socket send or receive, which will be quite often in this article! (And if, like me, you're unfortunate enough to try testing these on a Windows laptop without a Break key, then be prepared to get very familiar with the Windows Task Manager's End task button.)

Client and server

The basic setup in the client/server model is one device, the server, that runs a service and patiently waits for clients to connect and make requests to the service. A 24-hour grocery shop may be a real-world analogy. The shop waits for customers to come in, and when they do, they request certain products, purchase them, and leave. The shop might advertise itself so people know where to find it, but the actual transactions happen while the customers are visiting the shop.

A typical computing example is a web server. The server listens on a TCP port for clients that need its web pages. When a client, for example a web browser, requires a web page that the server hosts, it connects to the server and then makes a request for that page. The server replies with the content of the page and then the client disconnects. The server advertises itself by having a hostname, which the clients can use to discover the IP address so that they can connect to it.

In both of these situations, it is the client that initiates any interaction; the server is purely responsive to that interaction. So, the needs of the programs that run on the client and the server are quite different. Client programs are typically oriented towards the interface between the user and the service. They retrieve and display the service, and allow the user to interact with it. Server programs are written to stay running for indefinite periods of time, to be stable, to efficiently deliver the service to the clients that are requesting it, and to potentially handle a large number of simultaneous connections with a minimal impact on the experience of any one client. In this article, we will look at this model by writing a simple echo server and client, which can handle a session with multiple clients. The socket module in Python perfectly suits this task.

An echo protocol

Before we write our first client and server programs, we need to decide how they are going to interact with each other, that is, we need to design a protocol for their communication. Our echo server should listen until a client connects and sends a bytes string, then we want it to echo that string back to the client. We only need a few basic rules for doing this.
These rules are as follows:

1. Communication will take place over TCP.
2. The client will initiate an echo session by creating a socket connection to the server.
3. The server will accept the connection and listen for the client to send a bytes string.
4. The client will send a bytes string to the server.
5. Once it sends the bytes string, the client will listen for a reply from the server.
6. When it receives the bytes string from the client, the server will send the bytes string back to the client.
7. When the client has received the bytes string from the server, it will close its socket to end the session.

These steps are straightforward enough. The missing element here is how the server and the client will know when a complete message has been sent. Remember that an application sees a TCP connection as an endless stream of bytes, so we need to decide what in that byte stream will signal the end of a message.

Framing

This problem is called framing, and there are several approaches that we can take to handle it. The main ones are described here:

1. Make it a protocol rule that only one message will be sent per connection, and once a message has been sent, the sender will immediately close the socket.
2. Use fixed length messages. The receiver will read the number of bytes and know that they have the whole message.
3. Prefix the message with the length of the message. The receiver will read the length of the message from the stream first, then it will read the indicated number of bytes to get the rest of the message.
4. Use special character delimiters for indicating the end of a message. The receiver will scan the incoming stream for a delimiter, and the message comprises everything up to the delimiter.

Option 1 is a good choice for very simple protocols. It's easy to implement and it doesn't require any special handling of the received stream. However, it requires the setting up and tearing down of a socket for every message, and this can impact performance when a server is handling many messages at once.

Option 2 is again simple to implement, but it only makes efficient use of the network when our data comes in neat, fixed-length blocks. For example, in a chat server the message lengths are variable, so we will have to use a special character, such as the null byte, to pad messages to the block size. This only works where we know for sure that the padding character will never appear in the actual message data. There is also the additional issue of how to handle messages longer than the block length.

Option 3 is usually considered as one of the best approaches. Although it can be more complex to code than the other options, the implementations are still reasonably straightforward, and it makes efficient use of bandwidth. The overhead imposed by including the length of each message is usually minimal as compared to the message length. It also avoids the need for any additional processing of the received data, which may be needed by certain implementations of option 4.

Option 4 is the most bandwidth-efficient option, and is a good choice when we know that only a limited set of characters, such as the ASCII alphanumeric characters, will be used in messages. If this is the case, then we can choose a delimiter character, such as the null byte, which will never appear in the message data, and then the received data can be easily broken into messages as this character is encountered. Implementations are usually simpler than option 3.
Although it is possible to employ this method for arbitrary data, that is, where the delimiter could also appear as a valid character in a message, this requires the use of character escaping, which needs an additional round of processing of the data. Hence, in these situations, it's usually simpler to use length-prefixing.

For our echo and chat applications, we'll be using the UTF-8 character set to send messages. The null byte isn't used in any character in UTF-8 except for the null byte itself, so it makes a good delimiter. Thus, we'll be using method 4 with the null byte as the delimiter to frame our messages. So, our last rule, which is number 8, will become:

8. Messages will be encoded in the UTF-8 character set for transmission, and they will be terminated by the null byte.

Now, let's write our echo programs.

A simple echo server

As we work through this article, we'll find ourselves reusing several pieces of code, so to save ourselves from repetition, we'll set up a module with useful functions that we can reuse as we go along. Create a file called tincanchat.py and save the following code in it:

import socket

HOST = ''
PORT = 4040

def create_listen_socket(host, port):
    """ Setup the sockets our server will receive connection requests on """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(100)
    return sock

def recv_msg(sock):
    """ Wait for data to arrive on the socket, then parse into messages using b'\0' as message delimiter """
    data = bytearray()
    msg = ''
    # Repeatedly read 4096 bytes off the socket, storing the bytes
    # in data until we see a delimiter
    while not msg:
        recvd = sock.recv(4096)
        if not recvd:
            # Socket has been closed prematurely
            raise ConnectionError()
        data = data + recvd
        if b'\0' in recvd:
            # we know from our protocol rules that we only send
            # one message per connection, so b'\0' will always be
            # the last character
            msg = data.rstrip(b'\0')
    msg = msg.decode('utf-8')
    return msg

def prep_msg(msg):
    """ Prepare a string to be sent as a message """
    msg += '\0'
    return msg.encode('utf-8')

def send_msg(sock, msg):
    """ Send a string over a socket, preparing it first """
    data = prep_msg(msg)
    sock.sendall(data)

First, we define a default interface and a port number to listen on. The empty '' interface, specified in the HOST variable, tells socket.bind() to listen on all available interfaces. If you want to restrict access to just your machine, then change the value of the HOST variable at the beginning of the code to 127.0.0.1.

We'll be using create_listen_socket() to set up our server's listening connections. This code is the same for several of our server programs, so it makes sense to reuse it.

The recv_msg() function will be used by our echo server and client for receiving messages from a socket. In our echo protocol, there isn't anything that our programs may need to do while they're waiting to receive a message, so this function just calls socket.recv() in a loop until it has received the whole message. As per our framing rule, it will check the accumulated data on each iteration to see if it has received a null byte, and if so, then it will return the received data, stripping off the null byte and decoding it from UTF-8.

The send_msg() and prep_msg() functions work together for framing and sending a message. We've separated the null byte termination and the UTF-8 encoding into prep_msg() because we will use them in isolation later on.
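For comparison with the null-byte delimiter used here, the following is a minimal sketch of what option 3 (length-prefixing) could look like. It is not part of the tincanchat module; the 4-byte big-endian length header and the helper names send_msg_lp(), recv_msg_lp(), and _recv_exactly() are assumptions chosen for illustration only.

import struct

def send_msg_lp(sock, msg):
    """ Send a string prefixed with its length as a 4-byte big-endian integer """
    data = msg.encode('utf-8')
    sock.sendall(struct.pack('!I', len(data)) + data)

def recv_msg_lp(sock):
    """ Read the 4-byte length header, then read exactly that many bytes """
    header = _recv_exactly(sock, 4)
    (length,) = struct.unpack('!I', header)
    return _recv_exactly(sock, length).decode('utf-8')

def _recv_exactly(sock, count):
    """ Keep calling recv() until count bytes have been accumulated """
    data = bytearray()
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            # Socket closed before we received the full message
            raise ConnectionError()
        data.extend(chunk)
    return bytes(data)

Because the length is known up front, the receiver never has to scan the stream for a delimiter, and the message body can contain arbitrary bytes.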
Handling received data

Note that we're drawing ourselves a careful line with these send and receive functions as regards string encoding. Python 3 strings are Unicode, while the data that we receive over the network is bytes. The last thing that we want to be doing is handling a mixture of these in the rest of our program code, so we're going to carefully encode and decode the data at the boundary of our program, where the data enters and leaves the network. This will ensure that any functions in the rest of our code can assume that they'll be working with Python strings, which will later on make things much easier for us.

Of course, not all the data that we may want to send or receive over a network will be text. For example, images, compressed files, and music can't be decoded to a Unicode string, so a different kind of handling is needed. Usually this will involve loading the data into a class, such as a Python Image Library (PIL) image for example, if we are going to manipulate the object in some way.

There are basic checks that could be done here on the received data, before performing full processing on it, to quickly flag any problems with the data. Some examples of such checks are as follows:

- Checking the length of the received data
- Checking the first few bytes of a file for a magic number to confirm a file type
- Checking values of higher level protocol headers, such as the Host header in an HTTP request

This kind of checking will allow our application to fail fast if there is an obvious problem.

The server itself

Now, let's write our echo server. Open a new file called 1.1-echo-server-uni.py and save the following code in it:

import tincanchat

HOST = tincanchat.HOST
PORT = tincanchat.PORT

def handle_client(sock, addr):
    """ Receive data from the client via sock and echo it back """
    try:
        msg = tincanchat.recv_msg(sock)  # Blocks until received
                                         # complete message
        print('{}: {}'.format(addr, msg))
        tincanchat.send_msg(sock, msg)  # Blocks until sent
    except (ConnectionError, BrokenPipeError):
        print('Socket error')
    finally:
        print('Closed connection to {}'.format(addr))
        sock.close()

if __name__ == '__main__':
    listen_sock = tincanchat.create_listen_socket(HOST, PORT)
    addr = listen_sock.getsockname()
    print('Listening on {}'.format(addr))

    while True:
        client_sock, addr = listen_sock.accept()
        print('Connection from {}'.format(addr))
        handle_client(client_sock, addr)

This is about as simple as a server can get! First, we set up our listening socket with the create_listen_socket() call. Second, we enter our main loop, where we listen forever for incoming connections from clients, blocking on listen_sock.accept(). When a client connection comes in, we invoke the handle_client() function, which handles the client as per our protocol. We've created a separate function for this code, partly to keep the main loop tidy, and partly because we'll want to reuse this set of operations in later programs. That's our server; now we just need to make a client to talk to it.
A simple echo client

Create a file called 1.2-echo_client-uni.py and save the following code in it:

import sys, socket
import tincanchat

HOST = sys.argv[-1] if len(sys.argv) > 1 else '127.0.0.1'
PORT = tincanchat.PORT

if __name__ == '__main__':
    while True:
        try:
            sock = socket.socket(socket.AF_INET,
                                 socket.SOCK_STREAM)
            sock.connect((HOST, PORT))
            print('\nConnected to {}:{}'.format(HOST, PORT))
            print("Type message, enter to send, 'q' to quit")
            msg = input()
            if msg == 'q': break
            tincanchat.send_msg(sock, msg)  # Blocks until sent
            print('Sent message: {}'.format(msg))
            msg = tincanchat.recv_msg(sock)  # Blocks until received
                                             # complete message
            print('Received echo: ' + msg)
        except ConnectionError:
            print('Socket error')
            break
        finally:
            sock.close()
            print('Closed connection to server\n')

If we're running our server on a different machine from the one on which we are running the client, then we can supply the IP address or the hostname of the server as a command line argument to the client program. If we don't, then it will default to trying to connect to the localhost. The third and fourth lines of the code check the command line arguments for a server address.

Once we've determined which server to connect to, we enter our main loop, which loops forever until we kill the client by entering q as a message. Within the main loop, we first create a connection to the server. Second, we prompt the user to enter the message to send and then we send the message using the tincanchat.send_msg() function. We then wait for the server's reply. Once we get the reply, we print it and then we close the connection as per our protocol.

Give our client and server a try. Run the server in a terminal by using the following command:

$ python 1.1-echo_server-uni.py
Listening on ('0.0.0.0', 4040)

In another terminal, run the client and note that you will need to specify the server if you need to connect to another computer, as shown here:

$ python 1.2-echo_client-uni.py 192.168.0.7
Type message, enter to send, 'q' to quit

Running the terminals side by side is a good idea, because you can simultaneously see how the programs behave. Type a few messages into the client and see how the server picks them up and sends them back. Disconnecting with the client should also prompt a notification on the server.

Concurrent I/O

If you're adventurous, then you may have tried connecting to our server using more than one client at once. If you tried sending messages from both of them, then you'd have seen that it does not work as we might have hoped. If you haven't tried this, then give it a go. A working echo session on the client should look like this:

Type message, enter to send, 'q' to quit
hello world
Sent message: hello world
Received echo: hello world
Closed connection to server

However, when trying to send a message by using a second connected client, we'll see something like this:

Type message, enter to send, 'q' to quit
hello world
Sent message: hello world

The client will hang when the message is sent, and it won't get an echo reply. You may also notice that if we send a message by using the first connected client, then the second client will get its response. So, what's going on here?
The problem is that the server can only listen for the messages from one client at a time. As soon as the first client connects, the server blocks at the socket.recv() call in tincanchat.recv_msg(), waiting for the first client to send a message. The server isn't able to receive messages from other clients while this is happening and so, when another client sends a message, that client blocks too, waiting for the server to send a reply. This is a slightly contrived example. The problem in this case could easily be fixed in the client end by asking the user for an input before establishing a connection to the server. However in our full chat service, the client will need to be able to listen for messages from the server while simultaneously waiting for user input. This is not possible in our present procedural setup. There are two solutions to this problem. We can either use more than one thread or process, or use non-blocking sockets along with an event-driven architecture. We're going to look at both of these approaches, starting with multithreading. Multithreading and multiprocessing Python has APIs that allow us to write both multithreading and multiprocessing applications. The principle behind multithreading and multiprocessing is simply to take copies of our code and run them in additional threads or processes. The operating system automatically schedules the threads and processes across available CPU cores to provide fair processing time allocation to all the threads and processes. This effectively allows a program to simultaneously run multiple operations. In addition, when a thread or process blocks, for example, when waiting for IO, the thread or process can be de-prioritized by the OS, and the CPU cores can be allocated to other threads or processes that have actual computation to do. Here is an overview of how threads and processes relate to each other: Threads exist within processes. A process can contain multiple threads but it always contains at least one thread, sometimes called the main thread. Threads within the same process share memory, so data transfer between threads is just a case of referencing the shared objects. Processes do not share memory, so other interfaces, such as files, sockets, or specially allocated areas of shared memory, must be used for transferring data between processes. When threads have operations to execute, they ask the operating system thread scheduler to allocate them some time on a CPU, and the scheduler allocates the waiting threads to CPU cores based on various parameters, which vary from OS to OS. Threads in the same process may run on separate CPU cores at the same time. Although two processes have been displayed in the preceding diagram, multiprocessing is not going on here, since the processes belong to different applications. The second process is displayed to illustrates a key difference between Python threading and threading in most other programs. This difference is the presence of the GIL. Threading and the GIL The CPython interpreter (the standard version of Python available for download from www.python.org) contains something called the Global Interpreter Lock (GIL). The GIL exists to ensure that only a single thread in a Python process can run at a time, even if multiple CPU cores are present. The reason for having the GIL is that it makes the underlying C code of the Python interpreter much easier to write and maintain. 
The drawback of this is that Python programs using multithreading cannot take advantage of multiple cores for parallel computation. This is a cause of much contention; however, for us this is not so much of a problem. Even with the GIL present, threads that are blocking on I/O are still de-prioritized by the OS and put into the background, so threads that do have computational work to do can run instead. The following figure is a simplified illustration of this: The Waiting for GIL state is where a thread has sent or received some data and so is ready to come out of the blocking state, but another thread has the GIL, so the ready thread is forced to wait. In many network applications, including our echo and chat servers, the time spent waiting on I/O is much higher than the time spent processing data. As long as we don't have a very large number of connections (a situation we'll discuss later on when we come to event driven architectures), thread contention caused by the GIL is relatively low, and hence threading is still a suitable architecture for these network server applications. With this in mind, we're going to use multithreading rather than multiprocessing in our echo server. The shared data model will simplify the code that we'll need for allowing our chat clients to exchange messages with each other, and because we're I/O bound, we don't need processes for parallel computation. Another reason for not using processes in this case is that processes are more "heavyweight" in terms of the OS resources, so creating a new process takes longer than creating a new thread. Processes also use more memory. One thing to note is that if you need to perform an intensive computation in your network server application (maybe you need to compress a large file before sending it over the network), then you should investigate methods for running this in a separate process. Because of quirks in the implementation of the GIL, having even a single computationally intensive thread in a mainly I/O bound process when multiple CPU cores are available can severely impact the performance of all the I/O bound threads. For more details, go through the David Beazley presentations linked to in the following information box: Processes and threads are different beasts, and if you're not clear on the distinctions, it's worthwhile to read up. A good starting point is the Wikipedia article on threads, which can be found at http://en.wikipedia.org/wiki/Thread_(computing). A good overview of the topic is given in Chapter 4 of Benjamin Erb's thesis, which is available at http://berb.github.io/diploma-thesis/community/. Additional information on the GIL, including the reasoning behind keeping it in Python can be found in the official Python documentation at https://wiki.python.org/moin/GlobalInterpreterLock. You can also read more on this topic in Nick Coghlan's Python 3 Q&A, which can be found at http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html#but-but-surely-fixing-the-gil-is-more-important-than-fixing-unicode. Finally, David Beazley has done some fascinating research on the performance of the GIL on multi-core systems. Two presentations of importance are available online. They give a good technical background, which is relevant to this article. These can be found at http://pyvideo.org/video/353/pycon-2010--understanding-the-python-gil---82 and at https://www.youtube.com/watch?v=5jbG7UKT1l4. 
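If you do find yourself needing to run a heavy computation from a mostly I/O-bound threaded server, such as the file compression example mentioned above, one option is to hand that work to a worker process so that it runs under its own GIL. The following is only a sketch of that idea using the standard library; the compress_payload() function, its use of zlib, and the pool size are illustrative assumptions:

import zlib
from concurrent.futures import ProcessPoolExecutor

def compress_payload(data):
    """ CPU-bound work that would otherwise starve our I/O threads """
    return zlib.compress(data, 9)

# Create the pool once; on platforms that spawn processes this should
# happen under an if __name__ == '__main__' guard
pool = ProcessPoolExecutor(max_workers=2)

def compress_in_background(data):
    """ Returns a Future; the calling thread can wait on it or poll it
        while the other threads carry on serving connections """
    return pool.submit(compress_payload, data)

A caller would then do future = compress_in_background(big_blob) and call future.result() only when the compressed bytes are actually needed.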
A multithreaded echo server A benefit of the multithreading approach is that the OS handles the thread switches for us, which means we can continue to write our program in a procedural style. Hence we only need to make small adjustments to our server program to make it multithreaded, and thus, capable of handling multiple clients simultaneously. Create a new file called 1.3-echo_server-multi.py and add the following code to it: import threading import tincanchat   HOST = tincanchat.HOST PORT = tincanchat.PORT   def handle_client(sock, addr):    """ Receive one message and echo it back to client, then close        socket """    try:        msg = tincanchat.recv_msg(sock) # blocks until received                                          # complete message        msg = '{}: {}'.format(addr, msg)        print(msg)        tincanchat.send_msg(sock, msg) # blocks until sent    except (ConnectionError, BrokenPipeError):        print('Socket error')    finally:        print('Closed connection to {}'.format(addr))        sock.close()   if __name__ == '__main__':    listen_sock = tincanchat.create_listen_socket(HOST, PORT)    addr = listen_sock.getsockname()    print('Listening on {}'.format(addr))      while True:        client_sock,addr = listen_sock.accept()        # Thread will run function handle_client() autonomously        # and concurrently to this while loop        thread = threading.Thread(target=handle_client,                                  args=[client_sock, addr],                                  daemon=True)        thread.start()        print('Connection from {}'.format(addr)) You can see that we've just imported an extra module and modified our main loop to run our handle_client() function in separate threads, rather than running it in the main thread. For each client that connects, we create a new thread that just runs the handle_client() function. When the thread blocks on a receive or send, the OS checks the other threads to see if they have come out of a blocking state, and if any have, then it switches to one of them. Notice that we have set the daemon argument in the thread constructor call to True. This will allow the program to exit if we hit ctrl - c without us having to explicitly close all of our threads first. If you try this echo server with multiple clients, then you'll see that a second client that connects and sends a message will immediately get a response. Summary We looked at how to develop network protocols while considering aspects such as the connection sequence, framing of the data on the wire, and the impact these choices will have on the architecture of the client and server programs. We worked through different architectures for network servers and clients, demonstrating the multithreaded models by writing a simple echo server. Resources for Article: Further resources on this subject: Importing Dynamic Data [article] Driving Visual Analyses with Automobile Data (Python) [article] Preparing to Build Your Own GIS Application [article]
Color and motion finding

In this article by Richard Grimmet, the author of the book, Raspberry Pi Robotics Essentials, we'll look at how to detect the Color and motion of an object. (For more resources related to this topic, see here.) OpenCV and your webcam can also track colored objects. This will be useful if you want your biped to follow a colored object. OpenCV makes this amazingly simple by providing some high-level libraries that can help us with this task. To accomplish this, you'll edit a file to look something like what is shown in the following screenshot: Let's look specifically at the code that makes it possible to isolate the colored ball: hue_img = cv.CvtColor(frame, cv.CV_BGR2HSV): This line creates a new image that stores the image as per the values of hue (color), saturation, and value (HSV), instead of the red, green, and blue (RGB) pixel values of the original image. Converting to HSV focuses our processing more on the color, as opposed to the amount of light hitting it. threshold_img = cv.InRangeS(hue_img, low_range, high_range): The low_range, high_range parameters determine the color range. In this case, it is an orange ball, so you want to detect the color orange. For a good tutorial on using hue to specify color, refer to http://www.tomjewett.com/colors/hsb.html. Also, http://www.shervinemami.info/colorConversion.html includes a program that you can use to determine your values by selecting a specific color. Run the program. If you see a single black image, move this window, and you will expose the original image window as well. Now, take your target (in this case, an orange ping-pong ball) and move it into the frame. You should see something like what is shown in the following screenshot: Notice the white pixels in our threshold image showing where the ball is located. You can add more OpenCV code that gives the actual location of the ball. In our original image file of the ball's location, you can actually draw a rectangle around the ball as an indicator. Edit the file to look as follows: The added lines look like the following: hue_image = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV): This line creates a hue image out of the RGB image that was captured. Hue is easier to deal with when trying to capture real world images; for details, refer to http://www.bogotobogo.com/python/OpenCV_Python/python_opencv3_Changing_ColorSpaces_RGB_HSV_HLS.php. threshold_img = cv2.inRange(hue_image, low_range, high_range): This creates a new image that contains only those pixels that occur between the low_range and high_range n-tuples. contour, hierarchy = cv2.findContours(threshold_img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE): This finds the contours, or groups of like pixels, in the threshold_img image. center = contour[0]: This identifies the first contour. moment = cv2.moments(center): This finds the moment of this group of pixels. (x,y),radius = cv2.minEnclosingCircle(center): This gives the x and y locations and the radius of the minimum circle that will enclose this group of pixels. center = (int(x),int(y)): Find the center of the x and y locations. radius = int(radius): The integer radius of the circle. img = cv2.circle(frame,center,radius,(0,255,0),2): Draw a circle on the image. Now that the code is ready, you can run it. You should see something that looks like the following screenshot: You can now track your object. You can modify the color by changing the low_range and high_range n-tuples. You also have the location of your object, so you can use the location to do path planning for your robot. 
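Since the full listings in this article are shown only as screenshots, here is a rough sketch of what the second version of the tracking script might look like, pieced together from the lines described above. The camera index, the HSV range for orange, and the window names are assumptions for illustration, and note that depending on your OpenCV version, cv2.findContours() returns either two or three values:

import cv2
import numpy as np

low_range = np.array([5, 100, 100])     # assumed lower HSV bound for orange
high_range = np.array([15, 255, 255])   # assumed upper HSV bound

cap = cv2.VideoCapture(0)               # first attached webcam

while True:
    ret, frame = cap.read()
    if not ret:
        break
    hue_image = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    threshold_img = cv2.inRange(hue_image, low_range, high_range)
    contours, hierarchy = cv2.findContours(threshold_img, cv2.RETR_TREE,
                                           cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        center = contours[0]
        moment = cv2.moments(center)    # computed in the article's listing;
                                        # not needed just to draw the circle
        (x, y), radius = cv2.minEnclosingCircle(center)
        cv2.circle(frame, (int(x), int(y)), int(radius), (0, 255, 0), 2)
    cv2.imshow('original', frame)
    cv2.imshow('threshold', threshold_img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()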
Summary

Your biped robot can walk, use sensors to avoid barriers, plan its path, and even see barriers or targets.

Resources for Article: Further resources on this subject: Develop a Digital Clock [article] Creating Random Insults [article] Raspberry Pi and 1-Wire [article]
Set Up MariaDB

In this article, by Daniel Bartholomew, author of Getting Started with MariaDB - Second Edition, you will learn to set up MariaDB with a generic configuration suitable for general use. This is perfect for giving MariaDB a try but might not be suitable for a production database application under heavy load. There are thousands of ways to tweak the settings to get MariaDB to perform just the way we need it to. Many books have been written on this subject. In this article, we'll cover enough of the basics so that we can comfortably edit the MariaDB configuration files and know our way around. The MariaDB filesystem layout A MariaDB installation is not a single file or even a single directory, so the first stop on our tour is a high-level overview of the filesystem layout. We'll start with Windows and then move on to Linux. The MariaDB filesystem layout on Windows On Windows, MariaDB is installed under a directory named with the following pattern: C:Program FilesMariaDB <major>.<minor> In the preceding command, <major> and <minor> refer to the first and second number in the MariaDB version string. So for MariaDB 10.1, the location would be: C:Program FilesMariaDB 10.1 The only alteration to this location, unless we change it during the installation, is when the 32-bit version of MariaDB is installed on a 64-bit version of Windows. In that case, the default MariaDB directory is at the following location: C:Program Files x86MariaDB <major>.<minor> Under the MariaDB directory on Windows, there are four primary directories: bin, data, lib, and include. There are also several configuration examples and other files under the MariaDB directory and a couple of additional directories (docs and Share), but we won't go into their details here. The bin directory is where the executable files of MariaDB are located. The data directory is where databases are stored; it is also where the primary MariaDB configuration file, my.ini, is stored. The lib directory contains various library and plugin files. Lastly, the include directory contains files that are useful for application developers. We don't generally need to worry about the bin, lib, and include directories; it's enough for us to be aware that they exist and know what they contain. The data directory is where we'll spend most of our time in this article and when using MariaDB. On Linux distributions, MariaDB follows the default filesystem layout. For example, the MariaDB binaries are placed under /usr/bin/, libraries are placed under /usr/lib/, manual pages are placed under /usr/share/man/, and so on. However, there are some key MariaDB-specific directories and file locations that we should know about. Two of them are locations that are the same across most Linux distributions. These locations are the /usr/share/mysql/ and /var/lib/mysql/ directories. The /usr/share/mysql/ directory contains helper scripts that are used during the initial installation of MariaDB, translations (so we can have error and system messages in different languages), and character set information. We don't need to worry about these files and scripts; it's enough to know that this directory exists and contains important files. The /var/lib/mysql/ directory is the default location for our actual database data and the related files such as logs. There is not much need to worry about this directory as MariaDB will handle its contents automatically; for now it's enough to know that it exists. The next directory we should know about is where the MariaDB plugins are stored. 
Unlike the previous two, the location of this directory varies. On Debian and Ubuntu systems, the directory is at the following location: /usr/lib/mysql/plugin/ In distributions such as Fedora, Red Hat, and CentOS, the location of the plugin directory varies depending on whether our system is 32 bit or 64 bit. If unsure, we can just look in both. The possible locations are: /lib64/mysql/plugin//lib/mysql/plugin/ The basic rule of thumb is that if we don't have a /lib64/ directory, we have the 32-bit version of Fedora, Red Hat, or CentOS installed. As with /usr/share/mysql/, we don't need to worry about the contents of the MariaDB plugin directory. It's enough to know that it exists and contains important files. Also, if in the future we install a new MariaDB plugin, this directory is where it will go. The last directory that we should know about is only found on Debian and the distributions based on Debian such as Ubuntu. Its location is as follows: /etc/mysql/ The /etc/mysql/ directory is where the configuration information for MariaDB is stored; specifically, in the following two locations: /etc/mysql/my.cnf/etc/mysql/conf.d/ Fedora, Red Hat, CentOS, and related systems don't have an /etc/mysql/ directory by default, but they do have a my.cnf file and a directory that serves the same purpose that the /etc/mysql/conf.d/ directory does on Debian and Ubuntu. They are at the following two locations: /etc/my.cnf/etc/my.cnf.d/ The my.cnf files, regardless of location, function the same on all Linux versions and on Windows, where it is often named my.ini. The /etc/my.cnf.d/ and /etc/mysql/conf.d/ directories, as mentioned, serve the same purpose. We'll spend the next section going over these two directories. Modular configuration on Linux The /etc/my.cnf.d/ and /etc/mysql/conf.d/ directories are special locations for the MariaDB configuration files. They are found on the MariaDB releases for Linux such as Debian, Ubuntu, Fedora, Red Hat, and CentOS. We will only have one or the other of them, never both, and regardless of which one we have, their function is the same. The basic idea behind these directories is to allow the package manager (APT or YUM) to be able to install packages for MariaDB, which include additions to MariaDB's configuration without needing to edit or change the main my.cnf configuration file. It's easy to imagine the harm that would be caused if we installed a new plugin package and it overwrote a carefully crafted and tuned configuration file. With these special directories, the package manager can simply add a file to the appropriate directory and be done. When the MariaDB server and the clients and utilities included with MariaDB start up, they first read the main my.cnf file and then any files that they find under the /etc/my.cnf.d/ or /etc/mysql/conf.d/ directories that have the extension .cnf because of a line at the end of the default configuration files. For example, MariaDB includes a plugin called feedback whose sole purpose is to send back anonymous statistical information to the MariaDB developers. They use this information to help guide future development efforts. It is disabled by default but can easily be enabled by adding feedback=on to a [mysqld] group of the MariaDB configuration file (we'll talk about configuration groups in the following section). 
We could add the required lines to our main my.cnf file or, better yet, we can create a file called feedback.cnf (MariaDB doesn't care what the actual filename is, apart from the .cnf extension) with the following content: [mysqld]feedback=on All we have to do is put our feedback.cnf file in the /etc/my.cnf.d/ or /etc/mysql/conf.d/ directory and when we start or restart the server, the feedback.cnf file will be read and the plugin will be turned on. Doing this for a single plugin on a solitary MariaDB server may seem like too much work, but suppose we have 100 servers, and further assume that since the servers are doing different things, each of them has a slightly different my.cnf configuration file. Without using our small feedback.cnf file to turn on the feedback plugin on all of them, we would have to connect to each server in turn and manually add feedback=on to the [mysqld] group of the file. This would get tiresome and there is also a chance that we might make a mistake with one, or several of the files that we edit, even if we try to automate the editing in some way. Copying a single file to each server that only does one thing (turning on the feedback plugin in our example) is much faster, and much safer. And, if we have an automated deployment system in place, copying the file to every server can be almost instant. Caution! Because the configuration settings in the /etc/my.cnf.d/ or /etc/mysql/conf.d/ directory are read after the settings in the my.cnf file, they can override or change the settings in our main my.cnf file. This can be a good thing if that is what we want and expect. Conversely, it can be a bad thing if we are not expecting that behavior. Summary That's it for our configuration highlights tour! In this article, we've learned where the various bits and pieces of MariaDB are installed and about the different parts that make up a typical MariaDB configuration file. Resources for Article: Building a Web Application with PHP and MariaDB – Introduction to caching Installing MariaDB on Windows and Mac OS X Questions & Answers with MariaDB's Michael "Monty" Widenius- Founder of MySQL AB
Digging Deep into Requests

In this article by Rakesh Vidya Chandra and Bala Subrahmanyam Varanasi, authors of the book Python Requests Essentials, we are going to deal with advanced topics in the Requests module. There are many more features in the Requests module that makes the interaction with the web a cakewalk. Let us get to know more about different ways to use Requests module which helps us to understand the ease of using it. (For more resources related to this topic, see here.) In a nutshell, we will cover the following topics: Persisting parameters across requests using Session objects Revealing the structure of request and response Using prepared requests Verifying SSL certificate with Requests Body Content Workflow Using generator for sending chunk encoded requests Getting the request method arguments with event hooks Iterating over streaming API Self-describing the APIs with link headers Transport Adapter Persisting parameters across Requests using Session objects The Requests module contains a session object, which has the capability to persist settings across the requests. Using this session object, we can persist cookies, we can create prepared requests, we can use the keep-alive feature and do many more things. The Session object contains all the methods of Requests API such as GET, POST, PUT, DELETE and so on. Before using all the capabilities of the Session object, let us get to know how to use sessions and persist cookies across requests. Let us use the session method to get the resource. >>> import requests >>> session = requests.Session() >>> response = requests.get("https://google.co.in", cookies={"new-cookie-identifier": "1234abcd"}) In the preceding example, we created a session object with requests and its get method is used to access a web resource. The cookie value which we had set in the previous example will be accessible using response.request.headers. >>> response.request.headers CaseInsensitiveDict({'Cookie': 'new-cookie-identifier=1234abcd', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/2.2.1 CPython/2.7.5+ Linux/3.13.0-43-generic'}) >>> response.request.headers['Cookie'] 'new-cookie-identifier=1234abcd' With session object, we can specify some default values of the properties, which needs to be sent to the server using GET, POST, PUT and so on. We can achieve this by specifying the values to the properties like headers, auth and so on, on a Session object. >>> session.params = {"key1": "value", "key2": "value2"} >>> session.auth = ('username', 'password') >>> session.headers.update({'foo': 'bar'}) In the preceding example, we have set some default values to the properties—params, auth, and headers using the session object. We can override them in the subsequent request, as shown in the following example, if we want to: >>> session.get('http://mysite.com/new/url', headers={'foo': 'new-bar'}) Revealing the structure of request and response A Requests object is the one which is created by the user when he/she tries to interact with a web resource. It will be sent as a prepared request to the server and does contain some parameters which are optional. Let us have an eagle eye view on the parameters: Method: This is the HTTP method to be used to interact with the web service. For example: GET, POST, PUT. URL: The web address to which the request needs to be sent. headers: A dictionary of headers to be sent in the request. files: This can be used while dealing with the multipart upload. 
It's the dictionary of files, with key as file name and value as file object. data: This is the body to be attached to the request.json. There are two cases that come in to the picture here: If json is provided, content-type in the header is changed to application/json and at this point, json acts as a body to the request. In the second case, if both json and data are provided together, data is silently ignored. params: A dictionary of URL parameters to append to the URL. auth: This is used when we need to specify the authentication to the request. It's a tuple containing username and password. cookies: A dictionary or a cookie jar of cookies which can be added to the request. hooks: A dictionary of callback hooks. A Response object contains the response of the server to a HTTP request. It is generated once Requests gets a response back from the server. It contains all of the information returned by the server and also stores the Request object we created originally. Whenever we make a call to a server using the requests, two major transactions are taking place in this context which are listed as follows: We are constructing a Request object which will be sent out to the server to request a resource A Response object is generated by the requests module Now, let us look at an example of getting a resource from Python's official site. >>> response = requests.get('https://python.org') In the preceding line of code, a requests object gets constructed and will be sent to 'https://python.org'. Thus obtained Requests object will be stored in the response.request variable. We can access the headers of the Request object which was sent off to the server in the following way: >>> response.request.headers CaseInsensitiveDict({'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/2.2.1 CPython/2.7.5+ Linux/3.13.0-43-generic'}) The headers returned by the server can be accessed with its 'headers' attribute as shown in the following example: >>> response.headers CaseInsensitiveDict({'content-length': '45950', 'via': '1.1 varnish', 'x-cache': 'HIT', 'accept-ranges': 'bytes', 'strict-transport-security': 'max-age=63072000; includeSubDomains', 'vary': 'Cookie', 'server': 'nginx', 'age': '557','content-type': 'text/html; charset=utf-8', 'public-key-pins': 'max-age=600; includeSubDomains; ..) The response object contains different attributes like _content, status_code, headers, url, history, encoding, reason, cookies, elapsed, request. >>> response.status_code 200 >>> response.url u'https://www.python.org/' >>> response.elapsed datetime.timedelta(0, 1, 904954) >>> response.reason 'OK' Using prepared Requests Every request we send to the server turns to be a PreparedRequest by default. The request attribute of the Response object which is received from an API call or a session call is actually the PreparedRequest that was used. There might be cases in which we ought to send a request which would incur an extra step of adding a different parameter. Parameters can be cookies, files, auth, timeout and so on. We can handle this extra step efficiently by using the combination of sessions and prepared requests. Let us look at an example: >>> from requests import Request, Session >>> header = {} >>> request = Request('get', 'some_url', headers=header) We are trying to send a get request with a header in the previous example. Now, take an instance where we are planning to send the request with the same method, URL, and headers, but we want to add some more parameters to it. 
In this condition, we can use the session method to receive complete session level state to access the parameters of the initial sent request. This can be done by using the session object. >>> from requests import Request, Session >>> session = Session() >>> request1 = Request('GET', 'some_url', headers=header) Now, let us prepare a request using the session object to get the values of the session level state: >>> prepare = session.prepare_request(request1) We can send the request object request with more parameters now, as follows: >>> response = session.send(prepare, stream=True, verify=True) 200 Voila! Huge time saving! The prepare method prepares the complete request with the supplied parameters. In the previous example, the prepare_request method was used. There are also some other methods like prepare_auth, prepare_body, prepare_cookies, prepare_headers, prepare_hooks, prepare_method, prepare_url which are used to create individual properties. Verifying an SSL certificate with Requests Requests provides the facility to verify an SSL certificate for HTTPS requests. We can use the verify argument to check whether the host's SSL certificate is verified or not. Let us consider a website which has got no SSL certificate. We shall send a GET request with the argument verify to it. The syntax to send the request is as follows: requests.get('no ssl certificate site', verify=True) As the website doesn't have an SSL certificate, it will result an error similar to the following: requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused')) Let us verify the SSL certificate for a website which is certified. Consider the following example: >>> requests.get('https://python.org', verify=True) <Response [200]> In the preceding example, the result was 200, as the mentioned website is SSL certified one. If we do not want to verify the SSL certificate with a request, then we can put the argument verify=False. By default, the value of verify will turn to True. Body content workflow Take an instance where a continuous stream of data is being downloaded when we make a request. In this situation, the client has to listen to the server continuously until it receives the complete data. Consider the case of accessing the content from the response first and the worry about the body next. In the above two situations, we can use the parameter stream. Let us look at an example: >>> requests.get("https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz", stream=True) If we make a request with the parameter stream=True, the connection remains open and only the headers of the response will be downloaded. This gives us the capability to fetch the content whenever we need by specifying the conditions like the number of bytes of data. The syntax is as follows: if int(request.headers['content_length']) < TOO_LONG: content = r.content By setting the parameter stream=True and by accessing the response as a file-like object that is response.raw, if we use the method iter_content, we can iterate over response.data. This will avoid reading of larger responses at once. The syntax is as follows: iter_content(chunk_size=size in bytes, decode_unicode=False) In the same way, we can iterate through the content using iter_lines method which will iterate over the response data one line at a time. 
The syntax is as follows: iter_lines(chunk_size = size in bytes, decode_unicode=None, delimitter=None) The important thing that should be noted while using the stream parameter is it doesn't release the connection when it is set as True, unless all the data is consumed or response.close is executed. Keep-alive facility As the urllib3 supports the reuse of the same socket connection for multiple requests, we can send many requests with one socket and receive the responses using the keep-alive feature in the Requests library. Within a session, it turns to be automatic. Every request made within a session automatically uses the appropriate connection by default. The connection that is being used will be released after all the data from the body is read. Streaming uploads A file-like object which is of massive size can be streamed and uploaded using the Requests library. All we need to do is to supply the contents of the stream as a value to the data attribute in the request call as shown in the following lines. The syntax is as follows: with open('massive-body', 'rb') as file:    requests.post('http://example.com/some/stream/url',                  data=file) Using generator for sending chunk encoded Requests Chunked transfer encoding is a mechanism for transferring data in an HTTP request. With this mechanism, the data is sent in a series of chunks. Requests supports chunked transfer encoding, for both outgoing and incoming requests. In order to send a chunk encoded request, we need to supply a generator for your body. The usage is shown in the following example: >>> def generator(): ...     yield "Hello " ...     yield "World!" ... >>> requests.post('http://example.com/some/chunked/url/path',                  data=generator()) Getting the request method arguments with event hooks We can alter the portions of the request process signal event handling using hooks. For example, there is hook named response which contains the response generated from a request. It is a dictionary which can be passed as a parameter to the request. The syntax is as follows: hooks = {hook_name: callback_function, … } The callback_function parameter may or may not return a value. When it returns a value, it is assumed that it is to replace the data that was passed in. If the callback function doesn't return any value, there won't be any effect on the data. Here is an example of a callback function: >>> def print_attributes(request, *args, **kwargs): ...     print(request.url) ...     print(request .status_code) ...     print(request .headers) If there is an error in the execution of callback_function, you'll receive a warning message in the standard output. Now let us print some of the attributes of the request, using the preceding callback_function: >>> requests.get('https://www.python.org/',                  hooks=dict(response=print_attributes)) https://www.python.org/ 200 CaseInsensitiveDict({'content-type': 'text/html; ...}) <Response [200]> Iterating over streaming API Streaming API tends to keep the request open allowing us to collect the stream data in real time. While dealing with a continuous stream of data, to ensure that none of the messages being missed from it we can take the help of iter_lines() in Requests. The iter_lines() iterates over the response data line by line. This can be achieved by setting the parameter stream as True while sending the request. It's better to keep in mind that it's not always safe to call the iter_lines() function as it may result in loss of received data. 
Consider the following example taken from http://docs.python-requests.org/en/latest/user/advanced/#streaming-requests: >>> import json >>> import requests >>> r = requests.get('http://httpbin.org/stream/4', stream=True) >>> for line in r.iter_lines(): ...     if line: ...         print(json.loads(line) ) In the preceding example, the response contains a stream of data. With the help of iter_lines(), we tried to print the data by iterating through every line. Encodings As specified in the HTTP protocol (RFC 7230), applications can request the server to return the HTTP responses in an encoded format. The process of encoding turns the response content into an understandable format which makes it easy to access it. When the HTTP header fails to return the type of encoding, Requests will try to assume the encoding with the help of chardet. If we access the response headers of a request, it does contain the keys of content-type. Let us look at a response header's content-type: >>> re = requests.get('http://google.com') >>> re.headers['content-type'] 'text/html; charset=ISO-8859-1' In the preceding example the content type contains 'text/html; charset=ISO-8859-1'. This happens when the Requests finds the charset value to be None and the 'content-type' value to be 'Text'. It follows the protocol RFC 7230 to change the value of charset to ISO-8859-1 in this type of a situation. In case we are dealing with different types of encodings like 'utf-8', we can explicitly specify the encoding by setting the property to Response.encoding. HTTP verbs Requests support the usage of the full range of HTTP verbs which are defined in the following table. To most of the supported verbs, 'url' is the only argument that must be passed while using them. Method Description GET GET method requests a representation of the specified resource. Apart from retrieving the data, there will be no other effect of using this method. Definition is given as requests.get(url, **kwargs) POST The POST verb is used for the creation of new resources. The submitted data will be handled by the server to a specified resource. Definition is given as requests.post(url, data=None, json=None, **kwargs) PUT This method uploads a representation of the specified URI. If the URI is not pointing to any resource, the server can create a new object with the given data or it will modify the existing resource. Definition is given as requests.put(url, data=None, **kwargs) DELETE This is pretty easy to understand. It is used to delete the specified resource. Definition is given as requests.delete(url, **kwargs) HEAD This verb is useful for retrieving meta-information written in response headers without having to fetch the response body. Definition is given as requests.head(url, **kwargs) OPTIONS OPTIONS is a HTTP method which returns the HTTP methods that the server supports for a specified URL. Definition is given as requests.options(url, **kwargs) PATCH This method is used to apply partial modifications to a resource. Definition is given as requests.patch(url, data=None, **kwargs) Self-describing the APIs with link headers Take a case of accessing a resource in which the information is accommodated in different pages. If we need to approach the next page of the resource, we can make use of the link headers. The link headers contain the meta data of the requested resource, that is the next page information in our case. 
>>> url = "https://api.github.com/search/code?q=addClass+user:mozilla&page=1&per_page=4" >>> response = requests.head(url=url) >>> response.headers['link'] '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2&per_page=4>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=250&per_page=4>; rel="last" In the preceding example, we have specified in the URL that we want to access page number one and it should contain four records. The Requests automatically parses the link headers and updates the information about the next page. When we try to access the link header, it showed the output with the values of the page and the number of records per page. Transport Adapter It is used to provide an interface for Requests sessions to connect with HTTP and HTTPS. This will help us to mimic the web service to fit our needs. With the help of Transport Adapters, we can configure the request according to the HTTP service we opt to use. Requests contains a Transport Adapter called HTTPAdapter included in it. Consider the following example: >>> session = requests.Session() >>> adapter = requests.adapters.HTTPAdapter(max_retries=6) >>> session.mount("http://google.co.in", adapter) In this example, we created a request session in which every request we make retries only six times, when the connection fails. Summary In this article, we learnt about creating sessions and using the session with different criteria. We also looked deeply into HTTP verbs and using proxies. We learnt about streaming requests, dealing with SSL certificate verifications and streaming responses. We also got to know how to use prepared requests, link headers and chunk encoded requests. Resources for Article: Further resources on this subject: Machine Learning [article] Solving problems – closest good restaurant [article] Installing NumPy, SciPy, matplotlib, and IPython [article]
Defining Dependencies

In this article by Hubert Klein Ikkink, author of the book Gradle Dependency Management, you are going to learn how to define dependencies in your Gradle project. We will see how we can define the configurations of dependencies. You will learn about the different dependency types in Gradle and how to use them when you configure your build. When we develop software, we need to write code. Our code consists of packages with classes, and those can be dependent on the other classes and packages in our project. This is fine for one project, but we sometimes depend on classes in other projects we didn't develop ourselves, for example, we might want to use classes from an Apache Commons library or we might be working on a project that is part of a bigger, multi-project application and we are dependent on classes in these other projects. Most of the time, when we write software, we want to use classes outside of our project. Actually, we have a dependency on those classes. Those dependent classes are mostly stored in archive files, such as Java Archive (JAR) files. Such archive files are identified by a unique version number, so we can have a dependency on the library with a specific version. (For more resources related to this topic, see here.) Declaring dependency configurations In Gradle, we define dependency configurations to group dependencies together. A dependency configuration has a name and several properties, such as a description and is actually a special type of FileCollection. Configurations can extend from each other, so we can build a hierarchy of configurations in our build files. Gradle plugins can also add new configurations to our project, for example, the Java plugin adds several new configurations, such as compile and testRuntime, to our project. The compile configuration is then used to define the dependencies that are needed to compile our source tree. The dependency configurations are defined with a configurations configuration block. Inside the block, we can define new configurations for our build. All configurations are added to the project's ConfigurationContainer object. In the following example build file, we define two new configurations, where the traffic configuration extends from the vehicles configuration. This means that any dependency added to the vehicles configuration is also available in the traffic configuration. We can also assign a description property to our configuration to provide some more information about the configuration for documentation purposes. The following code shows this: // Define new configurations for build.configurations {// Define configuration vehicles.vehicles {description = 'Contains vehicle dependencies'}traffic {extendsFrom vehiclesdescription = 'Contains traffic dependencies'}} To see which configurations are available in a project, we can execute the dependencies task. This task is available for each Gradle project. The task outputs all the configurations and dependencies of a project. Let's run this task for our current project and check the output: $ gradle -q dependencies------------------------------------------------------------Root project------------------------------------------------------------traffic - Contains traffic dependenciesNo dependenciesvehicles - Contains vehicle dependenciesNo dependencies Note that we can see our two configurations, traffic and vehicles, in the output. We have not defined any dependencies to these configurations, as shown in the output. 
The Java plugin adds a couple of configurations to a project, which are used by the tasks from the Java plugin. Let's add the Java plugin to our Gradle build file: apply plugin: 'java' To see which configurations are added, we invoke the dependencies task and look at the output: $ gradle -q dependencies------------------------------------------------------------Root project------------------------------------------------------------archives - Configuration for archive artifacts.No dependenciescompile - Compile classpath for source set 'main'.No dependenciesdefault - Configuration for default artifacts.No dependenciesruntime - Runtime classpath for source set 'main'.No dependenciestestCompile - Compile classpath for source set 'test'.No dependenciestestRuntime - Runtime classpath for source set 'test'.No dependencies We see six configurations in our project just by adding the Java plugin. The archives configuration is used to group the artifacts our project creates. The other configurations are used to group the dependencies for our project. In the following table, the dependency configurations are summarized: Name Extends Description compile none These are dependencies to compile. runtime compile These are runtime dependencies. testCompile compile These are extra dependencies to compile tests. testRuntime runtime, testCompile These are extra dependencies to run tests. default runtime These are dependencies used by this project and artifacts created by this project. Declaring dependencies We defined configurations or applied a plugin that added new configurations to our project. However, a configuration is empty unless we add dependencies to the configuration. To declare dependencies in our Gradle build file, we must add the dependencies configuration block. The configuration block will contain the definition of our dependencies. In the following example Gradle build file, we define the dependencies block: // Dependencies configuration block.dependencies {// Here we define our dependencies.} Inside the configuration block, we use the name of a dependency configuration followed by the description of our dependencies. The name of the dependency configuration can be defined explicitly in the build file or can be added by a plugin we use. In Gradle, we can define several types of dependencies. In the following table, we will see the different types we can use: Dependency type Description External module dependency This is a dependency on an external module or library that is probably stored in a repository. Client module dependency This is a dependency on an external module where the artifacts are stored in a repository, but the meta information about the module is in the build file. We can override meta information using this type of dependency. Project dependency This is a dependency on another Gradle project in the same build. File dependency This is a dependency on a collection of files on the local computer. Gradle API dependency This is a dependency on the Gradle API of the current Gradle version. We use this dependency when we develop Gradle plugins and tasks. Local Groovy dependency This is a dependency on the Groovy libraries used by the current Gradle version. We use this dependency when we develop Gradle plugins and tasks. External module dependencies External module dependencies are the most common dependencies in projects. These dependencies refer to a module in an external repository. 
Later in the article, we will find out more about repositories, but basically, a repository stores modules in a central location. A module contains one or more artifacts and meta information, such as references to the other modules it depends on. We can use two notations to define an external module dependency in Gradle. We can use a string notation or a map notation. With the map notation, we can use all the properties available for a dependency. The string notation allows us to set a subset of the properties but with a very concise syntax. In the following example Gradle build file, we define several dependencies using the string notation: // Define dependencies.dependencies {// Defining two dependencies.vehicles 'com.vehicles:car:1.0', 'com.vehicles:truck:2.0'// Single dependency.traffic 'com.traffic:pedestrian:1.0'} The string notation has the following format: moduleGroup:moduleName:version. Before the first colon, the module group name is used, followed by the module name, and the version is mentioned last. If we use the map notation, we use the names of the attributes explicitly and set the value for each attribute. Let's rewrite our previous example build file and use the map notation: // Compact definition of configurations.configurations {vehiclestraffic.extendsFrom vehicles}// Define dependencies.dependencies {// Defining two dependencies.vehicles([group: 'com.vehicles', name: 'car', version: '1.0'],[group: 'com.vehicles', name: 'truck', version: '2.0'],)// Single dependency.traffic group: 'com.traffic', name: 'pedestrian', version:'1.0'} We can specify extra configuration attributes with the map notation, or we can add an extra configuration closure. One of the attributes of an external module dependency is the transitiveattribute. In the next example build file, we will set this attribute using the map notation and a configuration closure: dependencies {// Use transitive attribute in map notation.vehicles group: 'com.vehicles', name: 'car',version: '1.0', transitive: false// Combine map notation with configuration closure.vehicles(group: 'com.vehicles', name: 'car', version: '1.0') {transitive = true}// Combine string notation with configuration closure.traffic('com.traffic:pedestrian:1.0') {transitive = false}} Once of the advantages of Gradle is that we can write Groovy code in our build file. This means that we can define methods and variables and use them in other parts of our Gradle file. This way, we can even apply refactoring to our build file and make maintainable build scripts. Note that in our examples, we included multiple dependencies with the com.vehicles group name. The value is defined twice, but we can also create a new variable with the group name and reference of the variable in the dependencies configuration. We define a variable in our build file inside an ext configuration block. We use the ext block in Gradle to add extra properties to an object, such as our project. The following sample code defines an extra variable to hold the group name: // Define project property with// dependency group name 'com.vehicles'ext {groupNameVehicles = 'com.vehicles'}dependencies {// Using Groovy string support with// variable substition.vehicles "$groupNameVehicles:car:1.0"// Using map notation and reference// property groupNameVehicles.vehicles group: groupNameVehicles, name: 'truck', version:'2.0'} If we define an external module dependency, then Gradle tries to find a module descriptor in a repository. 
If the module descriptor is available, it is parsed to see which artifacts need to be downloaded. Also, if the module descriptor contains information about the dependencies needed by the module, those dependencies are downloaded as well. Sometimes, a dependency has no descriptor in the repository; in that case, Gradle only downloads the artifact for that dependency.

A dependency based on a Maven module only contains one artifact, so it is easy for Gradle to know which artifact to download. But for a Gradle or Ivy module, it is not so obvious, because a module can contain multiple artifacts. The module will have multiple configurations, each with different artifacts. Gradle will use the configuration with the name default for such modules. So, any artifacts and dependencies associated with the default configuration are downloaded. However, it is possible that the default configuration doesn't contain the artifacts we need. We can therefore set the configuration attribute in the dependency definition to select the specific configuration that we need. The following example defines a configuration attribute for the dependency configuration:

dependencies {
    // Use the 'jar' configuration defined in the
    // module descriptor for this dependency.
    traffic group: 'com.traffic',
        name: 'pedestrian',
        version: '1.0',
        configuration: 'jar'
}

When there is no module descriptor for a dependency, only the artifact is downloaded by Gradle. We can use an artifact-only notation if we only want to download the artifact for a module with a descriptor and not any of its dependencies, or if we want to download another archive file, such as a TAR file with documentation, from a repository. To use the artifact-only notation, we must add the file extension to the dependency definition. If we use the string notation, we must add the extension prefixed with an @ sign after the version. With the map notation, we can use the ext attribute to set the extension. If we define our dependency as artifact-only, Gradle will not check whether there is a module descriptor available for the dependency. In the next build file, we will see examples of the different artifact-only notations:

dependencies {
    // Using the @ext notation to specify
    // we only want the artifact for this
    // dependency.
    vehicles 'com.vehicles:car:2.0@jar'

    // Use map notation with ext attribute
    // to specify artifact only dependency.
    traffic group: 'com.traffic', name: 'pedestrian',
        version: '1.0', ext: 'jar'

    // Alternatively we can use the configuration closure.
    // We need to specify an artifact configuration closure
    // as well to define the ext attribute.
    vehicles('com.vehicles:car:2.0') {
        artifact {
            name = 'car-docs'
            type = 'tar'
            extension = 'tar'
        }
    }
}

A Maven module descriptor can use classifiers for the artifact. This is mostly used when a library with the same code is compiled for different Java versions, for example, a library compiled for Java 5 and Java 6 with the jdk15 and jdk16 classifiers. We can use the classifier attribute when we define an external module dependency to specify which classifier we want to use. Also, we can use it in a string or map notation. With the string notation, we add an extra colon after the version attribute and specify the classifier. For the map notation, we can add the classifier attribute and specify the value we want.
The following build file contains an example of the different definitions of a dependency with a classifier:

dependencies {
    // Using string notation we can
    // append the classifier after
    // the version attribute, prefixed
    // with a colon.
    vehicles 'com.vehicles:car:2.0:jdk15'

    // With the map notation we simply use the
    // classifier attribute name and the value.
    traffic group: 'com.traffic', name: 'pedestrian',
        version: '1.0', classifier: 'jdk16'

    // Alternatively we can use the configuration closure.
    // We need to specify an artifact configuration closure
    // as well to define the classifier attribute.
    vehicles('com.vehicles:truck:2.0') {
        artifact {
            name = 'truck'
            type = 'jar'
            classifier = 'jdk15'
        }
    }
}

Defining client module dependencies

When we define external module dependencies, we expect that there is a module descriptor file with information about the artifacts and dependencies for those artifacts. Gradle will parse this file and determine what needs to be downloaded. Remember that if such a file is not available, only the artifact is downloaded. However, what if we want to override the module descriptor or provide one if it is not available? In the module descriptor that we provide, we can define the dependencies of the module ourselves. We can do this in Gradle with client module dependencies. Instead of relying on a module descriptor in a repository, we define our own module descriptor locally in the build file. We now have full control over what we think the module should look like and which dependencies the module itself has. We use the module method to define a client module dependency for a dependency configuration. In the following example build file, we will write a client module dependency for the dependency car, and we will add a transitive dependency on the driver module:

dependencies {
    // We use the module method to instruct
    // Gradle to not look for the module descriptor
    // in a repository, but use the one we have
    // defined in the build file.
    vehicles module('com.vehicles:car:2.0') {
        // Car depends on driver.
        dependency('com.traffic:driver:1.0')
    }
}

Using project dependencies

Projects can be part of a bigger, multi-project build, and the projects can be dependent on each other. For example, one project can be made dependent on the generated artifact of another project, including the transitive dependencies of the other project. To define such a dependency, we use the project method in our dependencies configuration block. We specify the name of the project as an argument. We can also define the name of a dependency configuration of the other project we depend on. By default, Gradle will look for the default dependency configuration, but with the configuration attribute, we can specify a specific dependency configuration to be used. The next example build file will define project dependencies on the car and truck projects:

dependencies {
    // Use project method to define project
    // dependency on car project.
    vehicles project(':car')

    // Define project dependency on truck
    // and use dependency configuration api
    // from that project.
    vehicles project(':truck') {
        configuration = 'api'
    }

    // We can use alternative syntax
    // to specify a configuration.
    traffic project(path: ':pedestrian',
        configuration: 'lib')
}

Summary

In this article, you learned how to create and use dependency configurations to group together dependencies. We saw how to define several types of dependencies, such as external module dependencies and internal dependencies.
Also, we saw how we can add dependencies to the classpath of the Gradle build script itself with the buildscript block and its classpath configuration. Finally, we looked at some maintainable ways of defining dependencies using code refactoring and the external dependency management plugin.

Resources for Article:

Further resources on this subject:
Dependency Management in SBT [Article]
Apache Maven and m2eclipse [Article]
AngularJS Web Application Development Cookbook [Article]

Clustering

Packt
16 Jun 2015
8 min read
In this article by Jayani Withanawasam, author of the book Apache Mahout Essentials, we will see the clustering technique in machine learning and its implementation using Apache Mahout. The K-Means clustering algorithm is explained in detail with both Java and command-line examples (sequential and parallel executions), and other important clustering algorithms, such as Fuzzy K-Means, canopy clustering, and spectral K-Means, are also explored.

In this article, we will cover the following topics:

Unsupervised learning and clustering
Applications of clustering
Types of clustering
K-Means clustering
K-Means clustering with MapReduce

(For more resources related to this topic, see here.)

Unsupervised learning and clustering

Information is a key driver for any type of organization. However, with the rapid growth in the volume of data, valuable information may be hidden and go unnoticed due to the lack of effective data processing and analyzing mechanisms. Clustering is an unsupervised learning mechanism that can find the hidden patterns and structures in data by finding data points that are similar to each other. No prelabeling is required. So, you can organize data using clustering with little or no human intervention. For example, let's say you are given a collection of balls of different sizes without any category labels, such as big and small, attached to them; you should be able to categorize them using clustering by considering their attributes, such as radius and weight, for similarity. We will learn how to use Apache Mahout to perform clustering using different algorithms.

Applications of clustering

Clustering has many applications in different domains, such as biology, business, and information retrieval.

Computer vision and image processing

Clustering techniques are widely used in the computer vision and image processing domain. Clustering is used for image segmentation in medical image processing for computer-aided diagnosis (CAD). One specific area is breast cancer detection. In breast cancer detection, a mammogram is clustered into several parts for further analysis, as shown in the following image. The regions of interest for signs of breast cancer in the mammogram can be identified using the K-Means algorithm. Image features such as pixels, colors, intensity, and texture are used during clustering:

Types of clustering

Clustering can be divided into different categories based on different criteria.

Hard clustering versus soft clustering

Clustering techniques can be divided into hard clustering and soft clustering based on the cluster's membership. In hard clustering, a given data point in n-dimensional space only belongs to one cluster. This is also known as exclusive clustering. The K-Means clustering mechanism is an example of hard clustering. A given data point can belong to more than one cluster in soft clustering. This is also known as overlapping clustering. The Fuzzy K-Means algorithm is a good example of soft clustering. A visual representation of the difference between hard clustering and soft clustering is given in the following figure:

Flat clustering versus hierarchical clustering

In hierarchical clustering, a hierarchy of clusters is built using the top-down (divisive) or bottom-up (agglomerative) approach. This is more informative and accurate than flat clustering, which is a simple technique where no hierarchy is present. However, this comes at the cost of performance, as flat clustering is faster and more efficient than hierarchical clustering.
For example, let's assume that you need to figure out T-shirt sizes for people of different sizes. Using hierarchical clustering, you can come up with sizes for small (s), medium (m), and large (l) first by analyzing a sample of the people in the population. Then, we can further categorize this as extra small (xs), small (s), medium, large (l), and extra large (xl) sizes.

Model-based clustering

In model-based clustering, data is modeled using a standard statistical model to work with different distributions. The idea is to find a model that best fits the data. The best-fit model is achieved by tuning up parameters to minimize loss on errors. Once the parameter values are set, probability membership can be calculated for new data points using the model. Model-based clustering gives a probability distribution over clusters.

K-Means clustering

K-Means clustering is a simple and fast clustering algorithm that has been widely adopted in many problem domains. We will give a detailed explanation of the K-Means algorithm, as it will provide the base for other algorithms. K-Means clustering assigns data points to k number of clusters (cluster centroids) by minimizing the distance from the data points to the cluster centroids.

Let's consider a simple scenario where we need to cluster people based on their size (height and weight are the selected attributes) and different colors (clusters):

We can plot this problem in two-dimensional space, as shown in the following figure, and solve it using the K-Means algorithm:

Getting your hands dirty!

Let's move on to a real implementation of the K-Means algorithm using Apache Mahout. The following are the different ways in which you can run algorithms in Apache Mahout:

Sequential
MapReduce

You can execute the algorithms using a command line (by calling the correct bin/mahout subcommand) or using Java programming (calling the correct driver's run method).

Running K-Means using Java programming

This example continues with the people-clustering scenario mentioned earlier. The size (weight and height) distribution for this example has been plotted in two-dimensional space, as shown in the following image:

Data preparation

First, we need to represent the problem domain as numerical vectors. The following table shows the size distribution of people mentioned in the previous scenario:

Weight (kg)   Height (cm)
22            80
25            75
28            85
55            150
50            145
53            153

Save the following content in a file named KmeansTest.data:

22 80
25 75
28 85
55 150
50 145
53 153

Understanding important parameters

Let's take a look at the significance of some important parameters:

org.apache.hadoop.fs.Path: This denotes the path to a file or directory in the filesystem.
org.apache.hadoop.conf.Configuration: This provides access to Hadoop-related configuration parameters.
org.apache.mahout.common.distance.DistanceMeasure: This determines the distance between two points.
K: This denotes the number of clusters.
convergenceDelta: This is a double value that is used to determine whether the algorithm has converged.
maxIterations: This denotes the maximum number of iterations to run.
runClustering: If this is true, the clustering step is to be executed after the clusters have been determined.
runSequential: If this is true, the K-Means sequential implementation is to be used in order to process the input data.
The following code snippet shows the source code:

private static final String DIRECTORY_CONTAINING_CONVERTED_INPUT = "Kmeansdata";

public static void main(String[] args) throws Exception {
    // Path to output folder
    Path output = new Path("Kmeansoutput");
    // Hadoop configuration details
    Configuration conf = new Configuration();
    HadoopUtil.delete(conf, output);

    run(conf, new Path("KmeansTest"), output, new EuclideanDistanceMeasure(), 2, 0.5, 10);
}

public static void run(Configuration conf, Path input, Path output, DistanceMeasure measure, int k,
        double convergenceDelta, int maxIterations) throws Exception {
    // Input should be given as sequence file format
    Path directoryContainingConvertedInput = new Path(output, DIRECTORY_CONTAINING_CONVERTED_INPUT);
    InputDriver.runJob(input, directoryContainingConvertedInput,
        "org.apache.mahout.math.RandomAccessSparseVector");

    // Get initial clusters randomly
    Path clusters = new Path(output, "random-seeds");
    clusters = RandomSeedGenerator.buildRandom(conf, directoryContainingConvertedInput, clusters, k, measure);

    // Run K-Means with a given K
    KMeansDriver.run(conf, directoryContainingConvertedInput, clusters, output, convergenceDelta,
        maxIterations, true, 0.0, false);

    // Run ClusterDumper to display the result
    Path outGlob = new Path(output, "clusters-*-final");
    Path clusteredPoints = new Path(output, "clusteredPoints");
    ClusterDumper clusterDumper = new ClusterDumper(outGlob, clusteredPoints);
    clusterDumper.printClusters(null);
}

Use the following code example in order to get a better (readable) outcome to analyze the data points and the centroids they are assigned to:

Reader reader = new SequenceFile.Reader(fs,
    new Path(output, Cluster.CLUSTERED_POINTS_DIR + "/part-m-00000"), conf);
IntWritable key = new IntWritable();
WeightedPropertyVectorWritable value = new WeightedPropertyVectorWritable();
while (reader.next(key, value)) {
    System.out.println("key: " + key.toString() + " value: " + value.toString());
}
reader.close();

After you run the algorithm, you will see the clustering output generated for each iteration and the final result in the filesystem (in the output directory you have specified; in this case, Kmeansoutput).

Summary

Clustering is an unsupervised learning mechanism that requires minimal human effort. Clustering has many applications in different areas, such as medical image processing, market segmentation, and information retrieval. Clustering mechanisms can be divided into different types, such as hard, soft, flat, hierarchical, and model-based clustering based on different criteria. Apache Mahout implements different clustering algorithms, which can be accessed sequentially or in parallel (using MapReduce). The K-Means algorithm is a simple and fast algorithm that is widely applied. However, there are situations that the K-Means algorithm will not be able to cater to. For such scenarios, Apache Mahout has implemented other algorithms, such as canopy, Fuzzy K-Means, streaming, and spectral clustering.

Resources for Article:

Further resources on this subject:
Apache Solr and Big Data – integration with MongoDB [Article]
Introduction to Apache ZooKeeper [Article]
Creating an Apache JMeter™ test workbench [Article]
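As a language-agnostic aside, and quite separate from the Mahout classes used above, the assign-and-update loop at the heart of K-Means can be sketched in a few lines of plain Python. The data points are the weight and height pairs from the table earlier in this article; the seed centroids, iteration count, and variable names are illustrative assumptions only, not Mahout code.

# Illustrative only: a minimal K-Means loop in plain Python (not Mahout code).
# Points are the weight/height pairs from the example above.
points = [(22, 80), (25, 75), (28, 85), (55, 150), (50, 145), (53, 153)]
centroids = [(22, 80), (55, 150)]  # k = 2, seeded from two of the points

def distance_sq(a, b):
    # Squared Euclidean distance between two 2D points.
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

for _ in range(10):  # a few iterations are plenty for this tiny data set
    # Assignment step: each point goes to its nearest centroid.
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: distance_sq(p, centroids[i]))
        clusters[nearest].append(p)
    # Update step: each centroid moves to the mean of its cluster
    # (empty clusters are simply dropped in this toy sketch).
    centroids = [
        (sum(p[0] for p in c) / float(len(c)), sum(p[1] for p in c) / float(len(c)))
        for c in clusters if c
    ]

print(centroids)  # one centroid for the small sizes, one for the large sizes

Running the sketch converges after a single pass on this data, which is exactly the behavior the convergenceDelta parameter checks for in the Mahout driver.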

Flappy Swift

Packt
16 Jun 2015
15 min read
Let's start using the first framework by implementing a nice clone of Flappy Bird with the help of this article by Giordano Scalzo, the author of Swift by Example.

(For more resources related to this topic, see here.)

The app is…

Only someone who has been living under a rock for the past two years may not have heard of Flappy Bird, but to be sure that everybody understands the game, let's go through a brief introduction. Flappy Bird is a simple, but addictive, game where the player controls a bird that must fly between a series of pipes. Gravity pulls the bird down, but by touching the screen, the player can make the bird flap and move towards the sky, driving the bird through a gap in a couple of pipes. The goal is to pass through as many pipes as possible. Our implementation will be a high-fidelity tribute to the original game, with the same simplicity and difficulty level. The app will consist of only two screens—a clean menu screen and the game itself—as shown in the following screenshot:

Building the skeleton of the app

Let's start implementing the skeleton of our game using the SpriteKit game template.

Creating the project

For implementing a SpriteKit game, Xcode provides a convenient template, which prepares a project with all the useful settings:

Go to New | Project and select the Game template, as shown in this screenshot:
In the following screen, after filling in all the fields, pay attention and select SpriteKit under Game Technology, like this:

By running the app and touching the screen, you will be delighted by the cute, rotating airplanes!

Implementing the menu

First of all, let's add CocoaPods; write the following code in the Podfile:

use_frameworks!

target 'FlappySwift' do
    pod 'Cartography', '~> 0.5'
    pod 'HTPressableButton', '~> 1.3'
end

Then install CocoaPods by running the pod install command. As usual, we are going to implement the UI without using Interface Builder and the storyboards. Go to AppDelegate and add these lines to create the main ViewController:

func application(application: UIApplication, didFinishLaunchingWithOptions launchOptions: [NSObject: AnyObject]?) -> Bool {
    let viewController = MenuViewController()

    let mainWindow = UIWindow(frame: UIScreen.mainScreen().bounds)
    mainWindow.backgroundColor = UIColor.whiteColor()
    mainWindow.rootViewController = viewController
    mainWindow.makeKeyAndVisible()
    window = mainWindow

    return true
}

The MenuViewController, as the name suggests, implements a nice menu to choose between the game and the Game Center:

import UIKit
import HTPressableButton
import Cartography

class MenuViewController: UIViewController {
    private let playButton = HTPressableButton(frame: CGRectMake(0, 0, 260, 50), buttonStyle: .Rect)
    private let gameCenterButton = HTPressableButton(frame: CGRectMake(0, 0, 260, 50), buttonStyle: .Rect)

    override func viewDidLoad() {
        super.viewDidLoad()
        setup()
        layoutView()
        style()
        render()
    }
}

As you can see, we are using the usual structure. Just for the sake of making the UI prettier, we are using HTPressableButtons instead of the default buttons.
Despite the fact that we are using AutoLayout, the implementation of this custom button requires that we instantiate the button by passing a frame to it:

// MARK: Setup
private extension MenuViewController {
    func setup() {
        playButton.addTarget(self, action: "onPlayPressed:", forControlEvents: .TouchUpInside)
        view.addSubview(playButton)
        gameCenterButton.addTarget(self, action: "onGameCenterPressed:", forControlEvents: .TouchUpInside)
        view.addSubview(gameCenterButton)
    }

    @objc func onPlayPressed(sender: UIButton) {
        let vc = GameViewController()
        vc.modalTransitionStyle = .CrossDissolve
        presentViewController(vc, animated: true, completion: nil)
    }

    @objc func onGameCenterPressed(sender: UIButton) {
        println("onGameCenterPressed")
    }
}

The only thing to note is that, because we are setting the function to be called when the button is pressed using the addTarget() function, we must prefix the designated methods with @objc. Otherwise, it will be impossible for the Objective-C runtime to find the correct method when the button is pressed. This is because they are implemented in a private extension; of course, you can set the extension as internal or public and you won't need to prepend @objc to the functions:

// MARK: Layout
extension MenuViewController {
    func layoutView() {
        layout(playButton) { view in
            view.bottom == view.superview!.centerY - 60
            view.centerX == view.superview!.centerX
            view.height == 80
            view.width == view.superview!.width - 40
        }
        layout(gameCenterButton) { view in
            view.bottom == view.superview!.centerY + 60
            view.centerX == view.superview!.centerX
            view.height == 80
            view.width == view.superview!.width - 40
        }
    }
}

The layout functions simply put the two buttons in the correct places on the screen:

// MARK: Style
private extension MenuViewController {
    func style() {
        playButton.buttonColor = UIColor.ht_grapeFruitColor()
        playButton.shadowColor = UIColor.ht_grapeFruitDarkColor()
        gameCenterButton.buttonColor = UIColor.ht_aquaColor()
        gameCenterButton.shadowColor = UIColor.ht_aquaDarkColor()
    }
}

// MARK: Render
private extension MenuViewController {
    func render() {
        playButton.setTitle("Play", forState: .Normal)
        gameCenterButton.setTitle("Game Center", forState: .Normal)
    }
}

Finally, we set the colors and text for the titles of the buttons. The following screenshot shows the complete menu:

You will notice that on pressing Play, the app crashes. This is because the template is using the view defined in the storyboard, and we are directly using the controllers. Let's change the code in GameViewController:

class GameViewController: UIViewController {
    private let skView = SKView()

    override func viewDidLoad() {
        super.viewDidLoad()
        skView.frame = view.bounds
        view.addSubview(skView)
        if let scene = GameScene.unarchiveFromFile("GameScene") as? GameScene {
            scene.size = skView.frame.size
            skView.showsFPS = true
            skView.showsNodeCount = true
            skView.ignoresSiblingOrder = true
            scene.scaleMode = .AspectFill
            skView.presentScene(scene)
        }
    }
}

We are basically creating the SKView programmatically, and setting its size just as we did for the main view's size. If the app is run now, everything will work fine.
You can find the code for this version at https://github.com/gscalzo/FlappySwift/tree/the_menu_is_ready.

A stage for a bird

Let's kick-start the game by implementing the background, which is not as straightforward as it might sound.

SpriteKit in a nutshell

SpriteKit is a powerful, but easy-to-use, game framework introduced in iOS 7. It basically provides the infrastructure to move images onto the screen and interact with them. It also provides a physics engine (based on Box2D), a particles engine, and basic sound playback support, making it particularly suitable for casual games. The content of the game is drawn inside an SKView, which is a particular kind of UIView, so it can be placed inside a normal hierarchy of UIViews.

The content of the game is organized into scenes, represented by subclasses of SKScene. Different parts of the game, such as the menu, levels, and so on, must be implemented in different SKScenes. You can consider an SKScene in SpriteKit as the equivalent of a UIViewController. Inside an SKScene, the elements of the game are grouped in a tree of SKNodes, which tells the SKScene how to render the components. An SKNode can be either a drawable node, such as SKSpriteNode or SKShapeNode, or something to be applied to the subtree of its descendants, such as SKEffectNode or SKCropNode. Note that SKScene is an SKNode itself.

Nodes are animated using SKAction. An SKAction is a change that must be applied to a node, such as a move to a particular position, a change of scaling, or a change in the way the node appears. The actions can be grouped together to create actions that run in parallel, or wait for the end of a previous action. Finally, we can define physics-based relations between objects, defining mass, gravity, and how the nodes interact with each other.

That said, the best way to understand and learn SpriteKit is by starting to play with it. So, without further ado, let's move on to the implementation of our tiny game. In this way, you'll get a complete understanding of the most important features of SpriteKit.

Explaining the code

In the previous section, we implemented the menu view, leaving the code similar to what was created by the template. With basic knowledge of SpriteKit, you can now start understanding the code:

class GameViewController: UIViewController {
    private let skView = SKView()

    override func viewDidLoad() {
        super.viewDidLoad()
        skView.frame = view.bounds
        view.addSubview(skView)
        if let scene = GameScene.unarchiveFromFile("GameScene") as? GameScene {
            scene.size = skView.frame.size
            skView.showsFPS = true
            skView.showsNodeCount = true
            skView.ignoresSiblingOrder = true
            scene.scaleMode = .AspectFill
            skView.presentScene(scene)
        }
    }
}

This is the UIViewController that starts the game; it creates an SKView to present the full screen. Then it instantiates the scene from GameScene.sks, which can be considered the equivalent of a Storyboard. Next, it enables some debug information before presenting the scene. It's now clear that we must implement the game inside the GameScene class.

Simulating a three-dimensional world using parallax

To simulate the depth of the in-game world, we are going to use the technique of parallax scrolling, a really popular method wherein the farther images on the game screen move slower than the closer images.
In our case, we have three different levels, and we'll use three different speeds:

Before implementing the scrolling background, we must import the images into our project, remembering to set each image as 2x in the assets. The assets can be downloaded from https://github.com/gscalzo/FlappySwift/blob/master/assets.zip?raw=true.

The GameScene class basically sets up the background levels:

import SpriteKit

class GameScene: SKScene {
    private var screenNode: SKSpriteNode!
    private var actors: [Startable]!

    override func didMoveToView(view: SKView) {
        screenNode = SKSpriteNode(color: UIColor.clearColor(), size: self.size)
        addChild(screenNode)
        let sky = Background(textureNamed: "sky", duration: 60.0).addTo(screenNode)
        let city = Background(textureNamed: "city", duration: 20.0).addTo(screenNode)
        let ground = Background(textureNamed: "ground", duration: 5.0).addTo(screenNode)
        actors = [sky, city, ground]

        for actor in actors {
            actor.start()
        }
    }
}

The only implemented function is didMoveToView(), which can be considered the equivalent of viewDidAppear for a UIViewController. We define an array of Startable objects, where Startable is a protocol for making the life cycle of the scene uniform:

import SpriteKit

protocol Startable {
    func start()
    func stop()
}

This will be handy for giving us an easy way to stop the game later, when either we reach the final goal or our character dies. The Background class holds the behavior for a scrollable level:

import SpriteKit

class Background {
    private let parallaxNode: ParallaxNode
    private let duration: Double

    init(textureNamed textureName: String, duration: Double) {
        parallaxNode = ParallaxNode(textureNamed: textureName)
        self.duration = duration
    }

    func addTo(parentNode: SKSpriteNode) -> Self {
        parallaxNode.addTo(parentNode)
        return self
    }
}

As you can see, the class saves the requested duration of a cycle, and then it forwards the calls to a class called ParallaxNode:

// Startable
extension Background : Startable {
    func start() {
        parallaxNode.start(duration: duration)
    }

    func stop() {
        parallaxNode.stop()
    }
}

The Startable protocol is implemented by forwarding the methods to ParallaxNode.

How to implement the scrolling

The idea of implementing scrolling is really straightforward: we implement a node where we put two copies of the same image in a tiled format. We then place the node such that we have the left half fully visible. Then we move the entire node to the left until we fully present the left node. Finally, we reset the position to the original one and restart the cycle. The following figure explains this algorithm:

import SpriteKit

class ParallaxNode {
    private let node: SKSpriteNode!
    init(textureNamed: String) {
        let leftHalf = createHalfNodeTexture(textureNamed, offsetX: 0)
        let rightHalf = createHalfNodeTexture(textureNamed, offsetX: leftHalf.size.width)

        let size = CGSize(width: leftHalf.size.width + rightHalf.size.width,
            height: leftHalf.size.height)

        node = SKSpriteNode(color: UIColor.whiteColor(), size: size)
        node.anchorPoint = CGPointZero
        node.position = CGPointZero
        node.addChild(leftHalf)
        node.addChild(rightHalf)
    }

    func zPosition(zPosition: CGFloat) -> ParallaxNode {
        node.zPosition = zPosition
        return self
    }

    func addTo(parentNode: SKSpriteNode) -> ParallaxNode {
        parentNode.addChild(node)
        return self
    }

}

The init() method simply creates the two halves, puts them side by side, and sets the position of the node:

// Mark: Private
private func createHalfNodeTexture(textureNamed: String, offsetX: CGFloat) -> SKSpriteNode {
    let node = SKSpriteNode(imageNamed: textureNamed,
                            normalMapped: true)
    node.anchorPoint = CGPointZero
    node.position = CGPoint(x: offsetX, y: 0)
    return node
}

The half node is just a node with the correct offset for the x-coordinate:

// Mark: Startable
extension ParallaxNode {
    func start(#duration: NSTimeInterval) {
        node.runAction(SKAction.repeatActionForever(SKAction.sequence(
            [
                SKAction.moveToX(-node.size.width/2.0, duration: duration),
                SKAction.moveToX(0, duration: 0)
            ]
        )))
    }

    func stop() {
        node.removeAllActions()
    }
}

Finally, the Startable protocol is implemented using two actions in a sequence. First, we move half the size—which means an image width—to the left, and then we move the node to the original position to start the cycle again. This is what the final result looks like:

You can find the code for this version at https://github.com/gscalzo/FlappySwift/tree/stage_with_parallax_levels.

Summary

This article has given you an idea of the direction you need to take to build a clone of the Flappy Bird app. For the complete exercise and a lot more, please refer to Swift by Example by Giordano Scalzo.

Resources for Article:

Further resources on this subject:
Playing with Swift [article]
Configuring Your Operating System [article]
Updating data in the background [article]

Neural Network in Azure ML

Packt
15 Jun 2015
3 min read
In this article written by Sumit Mund, author of the book Microsoft Azure Machine Learning, we will learn about neural networks, a kind of machine learning algorithm inspired by the computational models of the human brain. A neural network builds a network of computation units, neurons, or nodes. In a typical network, there are three layers of nodes: first the input layer, followed by the middle or hidden layer, and finally the output layer. Neural network algorithms can be used for both classification and regression problems.

(For more resources related to this topic, see here.)

The number of nodes in a layer depends on the problem and how you construct the network to get the best result. Usually, the number of nodes in an input layer is equal to the number of features in the dataset. For a regression problem, the number of nodes in the output layer is one, while for a classification problem, it is equal to the number of classes or labels. Each node in a layer gets connected to all the nodes in the next layer. Each edge that connects two nodes is assigned a weight. So, a neural network can well be imagined as a weighted directed acyclic graph.

In a typical neural network, as shown in the preceding figure, the middle or hidden layer contains a number of nodes chosen to get the best results. While there is no formula or agreed convention for this, it is often optimized after trying out different options.

Azure Machine Learning supports neural networks for regression, two-class classification, and multiclass classification. It provides a separate module for each kind of problem and lets the users tune it with different parameters, such as the number of hidden nodes, the number of iterations to train the model, and so on.

A special kind of neural network, with more than one hidden layer, is known as a deep network, and the related algorithms are known as deep learning algorithms. Azure Machine Learning allows us to choose the number of hidden nodes as a property value of the neural network module. These kinds of neural networks are getting increasingly popular these days because of their remarkable results and because they allow us to model complex and nonlinear scenarios. There are many kinds of deep networks, but recently a special kind of deep network known as the convolutional neural network got very popular because of its significant performance in image recognition and classification problems. Azure Machine Learning supports the convolutional neural network.

For simple networks with three layers, this can be done through a UI just by choosing parameters. However, to build a deep network like a convolutional deep network, it's not easy to do so through a UI. So, Azure Machine Learning supports a new kind of language called Net#, which allows you to script different kinds of neural networks inside ML Studio by defining the different nodes, the connections (edges), and the kinds of connections. While deep networks are complex to build and train, Net# makes things relatively easy and simple.

Though complex, neural networks are very powerful, and Azure Machine Learning makes it fun to work with them, be it three-layered shallow networks or multilayer deep networks.

Resources for Article:

Further resources on this subject:
Security in Microsoft Azure [article]
High Availability, Protection, and Recovery using Microsoft Azure [article]
Managing Microsoft Cloud [article]
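To make the picture of layers, nodes, and weighted edges a little more concrete, here is a small, illustrative sketch of a forward pass through a three-layer network in plain Python. It is not Azure ML or Net# code, and the layer sizes, weights, biases, and feature values are made up purely for illustration.

import math

def sigmoid(x):
    # A common activation function applied at each node.
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(inputs, weights, biases):
    # Each node computes a weighted sum of all inputs plus a bias,
    # then passes it through the activation (fully connected layer).
    return [sigmoid(sum(w * i for w, i in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]

# Made-up network: 3 input features, 4 hidden nodes, 1 output node.
hidden_weights = [[0.2, -0.5, 0.1], [0.4, 0.3, -0.2], [-0.3, 0.8, 0.5], [0.1, -0.1, 0.7]]
hidden_biases = [0.1, -0.2, 0.0, 0.3]
output_weights = [[0.6, -0.4, 0.2, 0.5]]
output_biases = [-0.1]

features = [0.5, 0.9, -1.2]          # one example with 3 input features
hidden = layer_output(features, hidden_weights, hidden_biases)
prediction = layer_output(hidden, output_weights, output_biases)
print(prediction)                     # a single value between 0 and 1

Training a network amounts to adjusting the weights and biases above so that the predictions match the labeled data, which is what the Azure ML neural network modules do for you behind the UI.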

Preparing to Build Your Own GIS Application

Packt
15 Jun 2015
12 min read
In this article by Karim Bahgat, author of the book Python Geospatial Development Essentials, we will see how to design your own Geographic Information Systems (GIS) application. You want to create a desktop application, in other words, a user interface, that helps you or others create, process, analyze, and visualize geographic data. This article will be your step-by-step guide toward that goal.

(For more resources related to this topic, see here.)

We assume that you are someone who enjoys programming and being creative but are not necessarily a computer science guru, Python expert, or seasoned GIS analyst. To successfully proceed with this article, it is recommended that you have a basic introductory knowledge of Python programming that includes classes, methods, and the Tkinter toolkit, as well as some core GIS concepts. If you are a newcomer to some of these, we will still cover some of the basics, but you will need to have the interest and ability to follow along at a fast pace.

In this introductory article, you will cover the following:

Learn some of the benefits of creating a GIS application from scratch
Set up your computer, so you can follow the article instructions
Become familiar with the roadmap toward creating our application

Why reinvent the wheel?

The first step in preparing ourselves for this article is in convincing ourselves why we want to make our own GIS application, as well as to be clear about our motives. Spatial analysis and GIS have been popular for decades and there is plenty of GIS software out there, so why go through the trouble of reinventing the wheel? Firstly, we aren't really reinventing the wheel, since Python can be extended with plenty of third-party libraries that take care of most of our geospatial needs. For me, the main motivation stems from the problem that most of today's GIS applications are aimed at highly capable and technical users who are well-versed in GIS or computer science, packed with a dizzying array of buttons and options that will scare off many an analyst. We believe that there is a virtue in trying to create simpler and more user-friendly software for beginner GIS users or even the broader public, without having to start completely from scratch. This way, we also add more alternatives for users to choose from, as supplements to the current GIS market dominated by a few major giants, notably ArcGIS and QGIS, but also others such as GRASS, uDig, gvSIG, and more.

Another particularly exciting reason to create your own GIS from scratch is to make your own domain-specific special purpose software for any task you can imagine, whether it is a water flow model GIS, an ecological migrations GIS, or even a GIS for kids. Specialized tasks that would usually require many arduous steps in an ordinary GIS could be greatly simplified into a single button and accompanied by suitable functionality, design layout, icons, and colors. One such example is the Crime Analytics for Space-Time (CAST) software produced by the GeoDa Center at Arizona State University, seen in the following picture:

Also, by creating your GIS from scratch, it is possible to have greater control of the size and portability of your application. This can enable you to go small, letting your application have a faster startup time and travel easily over the Internet or on a USB stick.
Although storage space itself is not as much of an issue these days, from a user's perspective, installing a 200 MB application is still a greater psychological investment, with a greater toll in terms of willingness to try it, than a mere 30 MB application (all else being equal). This is particularly true in the realm of smartphones and tablets, a very exciting market for special-purpose geospatial apps. While the specific application we make in this article will not be able to run on iOS or Android devices, it will run on Windows 8-based hybrid tablets, and can be rebuilt around a different GUI toolkit in order to support iOS.

Finally, the utility and philosophy of free and open source software may be an important motivation for some of you. Many people today learn to appreciate open source GIS after losing access to subscription-based applications like ArcGIS when they complete their university education or change their workplace. By developing your own open source GIS application and sharing with others, you can contribute back to and become part of the community that once helped you.

Setting up your computer

In this article, we follow the steps for making an application developed in a Windows environment. This does not mean that the application cannot be developed on Mac OS X or Linux, but those platforms may have slightly different installation instructions and may require compiling binary code, which is outside the scope of this article. Therefore, we leave that choice up to the reader. In this article, which focuses on Windows, we avoid the problem of compiling altogether, using precompiled versions where possible (more on this later).

The development process itself will be done using Python 2.7, specifically the 32-bit version, though 64-bit can theoretically be used as well (note that this is the bit version of your Python installation and has nothing to do with the bit version of your operating system). Although many newer versions exist, version 2.7 is the most widely supported in terms of being able to use third-party packages. It has also been reported that version 2.7 will continue to be actively developed and promoted until the year 2020. It will still be possible to use after support has ended. If you do not already have version 2.7, install it now, by following these steps:

1. Go to https://www.python.org/.
2. Under Downloads, click to download the latest 32-bit version of Python 2.7 for Windows, which at the time of this writing is Python 2.7.9.
3. Download and run the installation program.

For the actual code writing and editing, we will be using the built-in Python Interactive Development Environment (IDLE), but you may of course use any code editor you want. The IDLE lets you write long scripts that can be saved to files and offers an interactive shell window to execute one line at a time. There should be a desktop or start-menu link to Python IDLE after installing Python.
For a less overwhelming overview of the more popular GIS-related Python libraries, check out the catalogue at the Python-GIS-Resources website created by the author: http://pythongisresources.wordpress.com/

We will have to define which packages to use and install, and this depends on the type of application we are making. What we want to make in this article is a lightweight, highly portable, extendable, and general-purpose GIS application. For these reasons, we avoid heavy packages like GDAL, NumPy, Matplotlib, SciPy, and Mapnik (weighing in at about 30 MB each or about 150-200 MB if we combine them all together). Instead, we focus on lighter third-party packages specialized for each specific functionality. Dropping these heavy packages is a bold decision, as they contain a lot of functionality, and are reliable, efficient, and a dependency for many other packages. If you decide that you want to use them in an application where size is not an issue, you may want to begin now by installing the multipurpose NumPy and possibly SciPy, both of which have easy-to-use installers from their official websites.

The typical way to install Python packages is using pip (included with Python 2.7), which downloads and installs packages directly from the Python Package Index website. Pip is used in the following way:

1. Open your operating system's command line (not the Python IDLE). On Windows, this is done by searching your system for cmd.exe and running it.
2. In the black screen window that pops up, one simply types pip install packagename. This will only work if pip is on your system's environment path. If this is not the case, a quick fix is to simply type the full path to the pip script, C:\Python27\Scripts\pip, instead of just pip.

For C or C++ based packages, it is becoming increasingly popular to make them available as precompiled wheel files ending in .whl, which has caused some confusion on how to install them. Luckily, we can use pip to install these wheel files as well, by simply downloading the wheel and pointing pip to its file path.

Since some of our dependencies serve multiple purposes, we will install them now. One of them is the Python Imaging Library (PIL), which we will use for the raster data model and for visualization. Let's go ahead and install PIL for Windows now:

1. Go to https://pypi.python.org/pypi/Pillow/2.6.1.
2. Click on the latest .exe file link for our 32-bit Python 2.7 environment to download the PIL installer, which is currently Pillow-2.6.1.win32-py2.7.exe.
3. Run the installation file.
4. Open the IDLE interactive shell and type import PIL to make sure it was installed correctly.

Another central package we will be using is Shapely, used for location testing and geometric manipulation. To install it on Windows, perform the following steps:

1. Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely.
2. Download the Shapely wheel file that fits our system, looking something like Shapely-1.5.7-cp27-none-win32.whl.
3. As described earlier, open a command line window and type C:\Python27\Scripts\pip install path\to\Shapely-1.5.7-cp27-none-win32.whl to unpack the precompiled binaries.
4. To make sure it was installed correctly, open the IDLE interactive shell and type import shapely.
In Python terms, we will be creating a multilevel package with various subpackages and submodules to take care of different parts of our functionality, independently of any user interface. Only on top of this underlying functionality do we create the visual user interface as a way to access and run that underlying code. This way, we build a solid system, and allow power-users to access all the same functionality via Python scripting for greater automation and efficiency, as exists for ArcGIS and QGIS.

To set up the main Python package behind our application, create a new folder called pythongis anywhere on your computer. For Python to be able to interpret the folder pythongis as an importable package, it needs to find a file named __init__.py in that folder. Perform the following steps:

1. Open Python IDLE from the Windows start menu. The first window to pop up is the interactive shell.
2. To open the script editing window, click on File and New.
3. Click on File and then Save As.
4. In the dialog window that pops up, browse into the pythongis folder, type __init__.py as the filename, and click on Save.

There are two main types of GIS data: vector (coordinate-based geometries such as points, lines, and polygons) and raster (a regularly spaced out grid of data points or cells, similar to an image and its pixels). For a more detailed introduction to the differences between vector and raster data, and other basic GIS concepts, we refer the reader to the book Learning Geospatial Analysis with Python, by Joel Lawhead. You can find this book at:

https://www.packtpub.com/application-development/learning-geospatial-analysis-python

Since vector and raster data are so fundamentally different in all regards, we split our package in two, one for vector and one for raster. Using the same method as earlier, we create two new subpackage folders within the pythongis package; one called vector and one called raster (each with the same aforementioned empty __init__.py file). Thus, the structure of our package will look as follows (note that : package is not part of the folder name):

pythongis : package
    vector : package
    raster : package

To make our new vector and raster subpackages importable by our top level pythongis package, we need to add the following relative import statements in pythongis/__init__.py:

from . import vector
from . import raster

Throughout the course of this article, we will build the functionality of these two data types as a set of Python modules in their respective folders. Eventually, we want to end up with a GIS application that has only the most basic of geospatial tools so that we will be able to load, save, manage, visualize, and overlay data. As far as our final product goes, since we focus on clarity and simplicity, we do not put too much effort into making it fast or memory efficient. This comes from an often repeated saying among programmers, an example of which is found in Structured Programming with go to Statements, ACM, Computing Surveys 6 (4):

premature optimization is the root of all evil – Donald E. Knuth

This leaves us with software that works best with small files, which in most cases is good enough. Once you have a working application and you feel that you need support for larger or faster files, then it's up to you if you want to put in the extra effort of optimization. The GIS application you end up with at the end of the article is simple but functional, and is meant to serve as a framework that you can easily build on.
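As a quick, optional sanity check of the structure described above, the following illustrative snippet can be run from the folder that contains the pythongis package. The script name and the PIL/Shapely smoke tests are assumptions added here for illustration; only the two relative imports come from the article itself.

# check_structure.py -- illustrative only; run it from the folder
# that contains the pythongis package.
import pythongis

# The top-level package should expose the two subpackages
# wired up through the relative imports in __init__.py.
print(pythongis.vector)
print(pythongis.raster)

# Quick smoke test of the third-party packages installed earlier.
from PIL import Image
from shapely.geometry import Point

print(Image.new("RGB", (10, 10)).size)       # (10, 10)
print(round(Point(0, 0).buffer(1).area, 2))  # roughly pi, about 3.14

If both prints at the top show the vector and raster module objects, the package skeleton is wired correctly and ready for the modules we will add to it.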
To leave you with some ideas to pick up on, we placed various information boxes throughout the article with ways that you can optimize or extend your application.

Summary

In this article, you learned why you might want to create a GIS application using Python, set up your programming environment, installed some recurring packages, and created your application structure and framework.

Resources for Article:

Further resources on this subject:
Python functions – Avoid repeating code [article]
Python 3: Designing a Tasklist Application [article]
Geolocating photos on the map [article]

Learning with Minecraft Mods

Aaron Mills
12 Jun 2015
6 min read
Minecraft has shaped a generation of gamers. It's popular with all ages, but elementary age kids live and breath it. Inevitably, someone starts to wonder whether this is a good thing, whether that be parents, teachers, or the media. But something that is often overlooked is the influence that Minecraft Mods can have on that equation. Minecraft Mods come in many different flavors and varieties. The Minecraft Modding community is perhaps the largest such community to exist, and Minecraft lends itself well to presenting real world problems as part of the game world. Due to the shear number of mods, you can find many examples that incorporate some form of beneficial learning that has real world applications. For example, there are mods that incorporate aspects of engineering, systems design, genetics, logic puzzles, computer programming, and more. Vanilla Minecraft aside, let us take a look at the kinds of challenges and tasks that foster learning in various Minecraft Mods. Many mods include some kind of interactive machine system. Whether it be pipes or wires, they all have specific rules for construction and present the player with the challenge of combining a bunch of simple pieces together to create something more. Usually the end result is a factory for manufacturing more items and blocks to build yet more machinery or a power plant for powering all that machinery. Construction typically requires logical problem solving and spatial comprehension. Running a pipe from one end of your factory to the other can be just as complex a piece of spaghetti as in a real factory. There are many mods that focus on these challenges, including Buildcraft, EnderIO, IndustrialCraft2, PneumaticCraft, and more. These mods are also generally capable of interacting with each other seamlessly for even more creative solutions for your factory floor. But factories aren’t the only logic problems that mods present. There are also many logic puzzles built into mods. My own mod, Railcraft, has a fully functional Train routing and signaling system. It's strongly based on real life examples of railroads and provides many of the same solutions you’ll find real railway engineers using to solve the challenges of a railway. Problems that a budding railway engineer faces include scheduling, routing, best usage of track, avoiding collisions using signal logic, and more. But there are many other mods out there with similar types of puzzles. Some item management and piping systems are very logic driven. Logistics Pipes and Applied Energetics 2 are a couple such mods that take things just a step beyond normal pipe mods, in terms of both the amount of logical thinking required and the system’s overall capabilities. Both mods allow you to intelligently manage supply and demand of items across an entire base using logic modules that you install in the machines and pipes. This is all well and good of course, but there are some mods that take this even further. When it comes to logic, some mods allow you to actually write computer code. ComputerCraft and OpenComputers are two mods that allow you to use LUA to control in-game displays, robots, and more. There are even add-ons to these mods that allow you to control a Railcraft railway network from an in-game computer screen. Robot programming is generally very similar to the old “move the turtle around the screen” introductory programming lessons; ComputerCraft even calls its robots Turtles. 
You can instruct them to move and interact with the world, creating complex structures or just mining out an entire area for ore. The more complex the task, the more complex the code required. However, while mechanical and logic based problems are great, they are far from all that Minecraft Mods have to offer. Another area that has received a lot of attention from mods is Genetics. The first major mod to pioneer Genetics was IndustrialCraft2 with its Crop Breeding mechanics. As an added bonus, IC2 also provides an interesting power system. However, when most people think of Genetics in Mods, the first mod they think of is the Forestry Mod by SirSengir. In Forestry, you can breed Bees, Trees, and Butterflies. But there are other mods, too, such as Mariculture, which allows you to breed Fish. Genetics systems generally provide a wide range of traits and abilities that can be bred into the organisms: for example, increasing the yields of crops or improving the speed at which your bees work, and even breeding in more interesting traits, such as bees that heal you when you get close to the hive. The systems are generally fairly representative of Mendelian Inheritance: each individual has two sets of genes, each one a random set of genes from each parent. There are dominant and recessive genes and the two sets combined give your individual its specific traits. Punnett Squares are encouraged, just like those taught in school. Speaking of school, no discussion of learning and Minecraft would be complete without at least mentioning MinecraftEdu. TeacherGaming, the company behind MinecraftEdu, partnered with Mojang to provide a version of Minecraft specifically tailored for school programs. Of note is the fact that MinecraftEdu ships ready for use with mods and even recommends some of the mods mentioned in this post, including a special version of ComputerCraft created just for the MinecraftEdu project. Real schools use this stuff in real classrooms, teaching kids lessons using Minecraft and Minecraft Mods. So yes, there are many things that can be learned by playing Minecraft, especially if you play with the right mods. So should we be worried about the current generational obsession with Minecraft? Probably not. There are much less edifying things these kids could be playing. So next time your kid starts lecturing you about Mendelian Inheritance, remember it's quite possible he learned about it while playing Minecraft. About the Author Aaron Mills was born in 1983 and lives in the Pacific Northwest, which is a land rich in lore, trees, and rain. He has a Bachelor's Degree in Computer Science and studied at Washington State University Vancouver. He is best known for his work on the Minecraft Mod, Railcraft, but has also contributed significantly to the Minecraft Mods of Forestry and Buildcraft as well some contributions to the Minecraft Forge project.
Linear Regression

Packt
12 Jun 2015
19 min read
In this article by Rui Miguel Forte, the author of the book, Mastering Predictive Analytics with R, we'll learn about linear regression. Regression problems involve predicting a numerical output. The simplest but most common type of regression is linear regression. In this article, we'll explore why linear regression is so commonly used, its limitations, and extensions.

(For more resources related to this topic, see here.)

Introduction to linear regression

In linear regression, the output variable is predicted by a linearly weighted combination of input features. Here is an example of a simple linear model:

    ŷ = β0 + β1x

The preceding model essentially says that we are estimating one output, denoted by ŷ, and this is a linear function of a single predictor variable (that is, a feature) denoted by the letter x. The terms involving the Greek letter β are the parameters of the model and are known as regression coefficients. Once we train the model and settle on values for these parameters, we can make a prediction on the output variable for any value of x by a simple substitution in our equation.

Another example of a linear model, this time with three features and with values assigned to the regression coefficients, is given by the following equation:

In this equation, just as with the previous one, we can observe that we have one more coefficient than the number of features. This additional coefficient, β0, is known as the intercept and is the expected value of the model when the value of all input features is zero. The other β coefficients can be interpreted as the expected change in the value of the output per unit increase of a feature. For example, in the preceding equation, if the value of the feature x1 rises by one unit, the expected value of the output will rise by 1.91 units. Similarly, a unit increase in the feature x3 results in a decrease of the output by 7.56 units.

In a simple one-dimensional regression problem, we can plot the output on the y axis of a graph and the input feature on the x axis. In this case, the model predicts a straight-line relationship between these two, where β0 represents the point at which the straight line crosses or intercepts the y axis and β1 represents the slope of the line. We often refer to the case of a single feature (hence, two regression coefficients) as simple linear regression and the case of two or more features as multiple linear regression.

Assumptions of linear regression

Before we delve into the details of how to train a linear regression model and how it performs, we'll look at the model assumptions. The model assumptions essentially describe what the model believes about the output variable y that we are trying to predict. Specifically, linear regression models assume that the output variable is a weighted linear function of a set of feature variables. Additionally, the model assumes that for fixed values of the feature variables, the output is normally distributed with a constant variance. This is the same as saying that the model assumes that the true output variable y can be represented by an equation such as the following one, shown for two input features:

    y = β0 + β1x1 + β2x2 + ε

Here, ε represents an error term, which is normally distributed with zero mean and constant variance σ2:

    ε ~ N(0, σ2)

We might hear the term homoscedasticity as a more formal way of describing the notion of constant variance. By homoscedasticity or constant variance, we are referring to the fact that the variance in the error component does not vary with the values or levels of the input features.
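To make the constant variance idea concrete, here is a small illustrative R snippet, not taken from the book, in which the variable names and numbers are arbitrary choices. It simulates data around the same line twice, once with errors of constant spread and once with errors whose spread grows with the input, and plots the two side by side:

    > set.seed(123)
    > x <- runif(200, 5, 25)
    > e_constant <- rnorm(200, mean = 0, sd = 2)         # constant spread (homoscedastic)
    > e_growing <- rnorm(200, mean = 0, sd = 0.4 * x)    # spread grows with x (heteroskedastic)
    > par(mfrow = c(1, 2))
    > plot(x, 1.5 * x + 2 + e_constant, ylab = "y", main = "Constant variance")
    > abline(2, 1.5)
    > plot(x, 1.5 * x + 2 + e_growing, ylab = "y", main = "Increasing variance")
    > abline(2, 1.5)

The second panel shows the kind of fanning-out pattern described next.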
In the following plot, we are visualizing a hypothetical example of a linear relationship with heteroskedastic errors, which are errors that do not have a constant variance. The data points lie close to the line at low values of the input feature, because the variance is low in this region of the plot, but lie farther away from the line at higher values of the input feature because of the higher variance.

The ε term is an irreducible error component of the true function y and can be used to represent random errors, such as measurement errors in the feature values. When training a linear regression model, we always expect to observe some amount of error in our estimate of the output, even if we have all the right features, enough data, and the system being modeled really is linear. Put differently, even with a true function that is linear, we still expect that once we find a line of best fit through our training examples, our line will not go through all, or even any of our data points because of this inherent variance exhibited by the error component. The critical thing to remember, though, is that in this ideal scenario, because our error component has zero mean and constant variance, our training criterion will allow us to come close to the true values of the regression coefficients given a sufficiently large sample, as the errors will cancel out.

Another important assumption relates to the independence of the error terms. This means that we do not expect the residual or error term associated with one particular observation to be somehow correlated with that of another observation. This assumption can be violated if observations are functions of each other, which is typically the result of an error in the measurement. If we were to take a portion of our training data, double all the values of the features and outputs, and add these new data points to our training data, we could create the illusion of having a larger data set; however, there will be pairs of observations whose error terms will depend on each other as a result, and hence our model assumption would be violated. Incidentally, artificially growing our data set in such a manner is never acceptable for any model. Similarly, correlated error terms may occur if observations are related in some way by an unmeasured variable. For example, if we are measuring the malfunction rate of parts from an assembly line, then parts from the same factory might have a correlation in the error, for example, due to different standards and protocols used in the assembly process. Therefore, if we don't use the factory as a feature, we may see correlated errors in our sample among observations that correspond to parts from the same factory. The study of experimental design is concerned with identifying and reducing correlations in error terms, but this is beyond the scope of this book.

Finally, another important assumption concerns the notion that the features themselves are statistically independent of each other. It is worth clarifying here that in linear models, although the input features must be linearly weighted, they themselves may be the output of another function. To illustrate this, one may be surprised to see that the following is a linear model of three features, sin(z1), ln(z2), and exp(z3):

We can see that this is a linear model by making a few transformations on the input features and then making the replacements in our model. Now, we have an equation that is more recognizable as a linear regression model.
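To sketch this symbolically (the β coefficients here are generic placeholders rather than the specific values used in the original example), such a model can be written as:

$$y = \beta_0 + \beta_1 \sin(z_1) + \beta_2 \ln(z_2) + \beta_3 e^{z_3} + \varepsilon$$

Setting x1 = sin(z1), x2 = ln(z2), and x3 = exp(z3) turns it into the familiar form:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$$

The model is linear in the β coefficients even though it is clearly not linear in the original z variables.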
If the previous example made us believe that nearly everything could be transformed into a linear model, then the following two examples will emphatically convince us that this is not in fact the case:

Both models are not linear models because of the first regression coefficient (β1). The first model is not a linear model because β1 is acting as the exponent of the first input feature. In the second model, β1 is inside a sine function. The important lesson to take away from these examples is that there are cases where we can apply transformations on our input features in order to fit our data to a linear model; however, we need to be careful that our regression coefficients are always the linear weights of the resulting new features.

Simple linear regression

Before looking at some real-world data sets, it is very helpful to try to train a model on artificially generated data. In an artificial scenario such as this, we know what the true output function is beforehand, something that as a rule is not the case when it comes to real-world data. The advantage of performing this exercise is that it gives us a good idea of how our model works under the ideal scenario when all of our assumptions are fully satisfied, and it helps visualize what happens when we have a good linear fit. We'll begin by simulating a simple linear regression model. The following R snippet is used to create a data frame with 100 simulated observations of the following linear model with a single input feature:

    y = 1.67x1 - 2.93 + ε

Here is the code for the simple linear regression model:

    > set.seed(5427395)
    > nObs = 100
    > x1minrange = 5
    > x1maxrange = 25
    > x1 = runif(nObs, x1minrange, x1maxrange)
    > e = rnorm(nObs, mean = 0, sd = 2.0)
    > y = 1.67 * x1 - 2.93 + e
    > df = data.frame(y, x1)

For our input feature, we randomly sample points from a uniform distribution. We used a uniform distribution to get a good spread of data points. Note that our final df data frame is meant to simulate a data frame that we would obtain in practice, and as a result, we do not include the error terms, as these would be unavailable to us in a real-world setting.

When we train a linear model using some data such as those in our data frame, we are essentially hoping to produce a linear model with the same coefficients as the ones from the underlying model of the data. Put differently, the original coefficients define a population regression line. In this case, the population regression line represents the true underlying model of the data. In general, we will find ourselves attempting to model a function that is not necessarily linear. In this case, we can still define the population regression line as the best possible linear regression line, but a linear regression model will obviously not perform equally well.

Estimating the regression coefficients

For our simple linear regression model, the process of training the model amounts to an estimation of our two regression coefficients from our data set. As we can see from our previously constructed data frame, our data is effectively a series of observations, each of which is a pair of values (xi, yi) where the first element of the pair is the input feature value and the second element of the pair is its output label. It turns out that for the case of simple linear regression, it is possible to write down two equations that can be used to compute our two regression coefficients.
Instead of merely presenting these equations, we'll first take a brief moment to review some very basic statistical quantities that the reader has most likely encountered previously, as they will be featured very shortly.

The mean of a set of values is just the average of these values and is often described as a measure of location, giving a sense of where the values are centered on the scale in which they are measured. In statistical literature, the average value of a random variable is often known as the expectation, so we often find that the mean of a random variable X is denoted as E(X). Another notation that is commonly used is bar notation, where we can represent the notion of taking the average of a variable by placing a bar over that variable. To illustrate this, the following two equations show the mean of the output variable y and input feature x:

A second very common quantity, which should also be familiar, is the variance of a variable. The variance measures the average square distance that individual values have from the mean. In this way, it is a measure of dispersion, so that a low variance implies that most of the values are bunched up close to the mean, whereas a higher variance results in values that are spread out. Note that the definition of variance involves the definition of the mean, and for this reason, we'll see the use of the x variable with a bar over it in the following equation that shows the variance of our input feature x:

Finally, we'll define the covariance between two random variables x and y using the following equation:

From the previous equation, it should be clear that the variance, which we just defined previously, is actually a special case of the covariance where the two variables are the same. The covariance measures how strongly two variables are correlated with each other and can be positive or negative. A positive covariance implies a positive correlation; that is, when one variable increases, the other will increase as well. A negative covariance suggests the opposite; when one variable increases, the other will tend to decrease. When two variables are statistically independent of each other and hence uncorrelated, their covariance will be zero (although it should be noted that a zero covariance does not necessarily imply statistical independence).

Armed with these basic concepts, we can now present equations for the estimates of the two regression coefficients for the case of simple linear regression:

The first regression coefficient can be computed as the ratio of the covariance between the output and the input feature, and the variance of the input feature. Note that if the output variable were to be independent of the input feature, the covariance would be zero and therefore, our linear model would consist of a horizontal line with no slope. In practice, it should be noted that even when two variables are statistically independent, we will still typically see a small degree of covariance due to the random nature of the errors, so if we were to train a linear regression model to describe their relationship, our first regression coefficient would be nonzero in general. Later, we'll see how significance tests can be used to detect features we should not include in our models. To implement linear regression in R, it is not necessary to perform these calculations as R provides us with the lm() function, which builds a linear regression model for us.
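Collecting the quantities just described, and writing the sample statistics with a 1/n normalization (using 1/(n - 1) instead leaves the ratio below unchanged), the coefficient estimates for simple linear regression can be written as follows; the intercept formula follows from requiring the fitted line to pass through the point (x̄, ȳ):

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

$$\operatorname{Var}(x) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2, \qquad \operatorname{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)$$

$$\hat{\beta}_1 = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\,\bar{x}$$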
The following code sample uses the df data frame we created previously and calculates the regression coefficients:

    > myfit <- lm(y~x1, df)
    > myfit

    Call:
    lm(formula = y ~ x1, data = df)

    Coefficients:
    (Intercept)           x1
         -2.380        1.641

In the first line, we see that the usage of the lm() function involves first specifying a formula and then following up with the data parameter, which in our case is our data frame. For the case of simple linear regression, the syntax of the formula that we specify for the lm() function is the name of the output variable, followed by a tilde (~) and then by the name of the single input feature. Finally, the output shows us the values for the two regression coefficients. Note that the β0 coefficient is labeled as the intercept, and the β1 coefficient is labeled by the name of the corresponding feature (in this case, x1) in the equation of the linear model.

The following graph shows the population line and the estimated line on the same plot:

As we can see, the two lines are so close to each other that they are barely distinguishable, showing that the model has estimated the true population line very closely.

Multiple linear regression

Whenever we have more than one input feature and want to build a linear regression model, we are in the realm of multiple linear regression. The general equation for a multiple linear regression model with k input features is:

    y = β0 + β1x1 + β2x2 + ... + βkxk + ε

Our assumptions about the model and about the error component ε remain the same as with simple linear regression, remembering that as we now have more than one input feature, we assume that these are independent of each other. Instead of using simulated data to demonstrate multiple linear regression, we will analyze two real-world data sets.

Predicting CPU performance

Our first real-world data set was presented by the researchers Dennis F. Kibler, David W. Aha, and Marc K. Albert in a 1989 paper titled Instance-based prediction of real-valued attributes and published in Journal of Computational Intelligence. The data contain the characteristics of different CPU models, such as the cycle time and the amount of cache memory. When deciding between processors, we would like to take all of these things into account, but ideally, we'd like to compare processors on a single numerical scale. For this reason, we often develop programs to benchmark the relative performance of a CPU. Our data set also comes with the published relative performance of our CPUs, and our objective will be to use the available CPU characteristics as features to predict this. The data set can be obtained online from the UCI Machine Learning Repository via this link: http://archive.ics.uci.edu/ml/datasets/Computer+Hardware.

The UCI Machine Learning Repository is a wonderful online resource that hosts a large number of data sets, many of which are often cited by authors of books and tutorials. It is well worth the effort to familiarize yourself with this website and its data sets. A very good way to learn predictive analytics is to practice using the techniques you learn in this book on different data sets, and the UCI repository provides many of these for exactly this purpose.

The machine.data file contains all our data in a comma-separated format, with one line per CPU model. We'll import this in R and label all the columns. Note that there are 10 columns in total, but we don't need the first two for our analysis, as these are just the brand and model name of the CPU.
Similarly, the final column is a predicted estimate of the relative performance that was produced by the researchers themselves; our actual output variable, PRP, is in column 9. We'll store the data that we need in a data frame called machine:

    > machine <- read.csv("machine.data", header = F)
    > names(machine) <- c("VENDOR", "MODEL", "MYCT", "MMIN", "MMAX", "CACH", "CHMIN", "CHMAX", "PRP", "ERP")
    > machine <- machine[, 3:9]
    > head(machine, n = 3)
      MYCT MMIN  MMAX CACH CHMIN CHMAX PRP
    1  125  256  6000  256    16   128 198
    2   29 8000 32000   32     8    32 269
    3   29 8000 32000   32     8    32 220

The data set also comes with the definition of the data columns:

    Column name   Definition
    MYCT          The machine cycle time in nanoseconds
    MMIN          The minimum main memory in kilobytes
    MMAX          The maximum main memory in kilobytes
    CACH          The cache memory in kilobytes
    CHMIN         The minimum channels in units
    CHMAX         The maximum channels in units
    PRP           The published relative performance (our output variable)

The data set contains no missing values, so no observations need to be removed or modified. One thing that we'll notice is that we only have roughly 200 data points, which is generally considered a very small sample. Nonetheless, we will proceed with splitting our data into a training set and a test set, with an 85-15 split, as follows:

    > library(caret)
    > set.seed(4352345)
    > machine_sampling_vector <- createDataPartition(machine$PRP, p = 0.85, list = FALSE)
    > machine_train <- machine[machine_sampling_vector,]
    > machine_train_features <- machine[, 1:6]
    > machine_train_labels <- machine$PRP[machine_sampling_vector]
    > machine_test <- machine[-machine_sampling_vector,]
    > machine_test_labels <- machine$PRP[-machine_sampling_vector]

Now that we have our data set up and running, we'd usually want to investigate further and check whether some of our assumptions for linear regression are valid. For example, we would like to know whether we have any highly correlated features. To do this, we can construct a correlation matrix with the cor() function and use the findCorrelation() function from the caret package to get suggestions for which features to remove:

    > machine_correlations <- cor(machine_train_features)
    > findCorrelation(machine_correlations)
    integer(0)
    > findCorrelation(machine_correlations, cutoff = 0.75)
    [1] 3
    > cor(machine_train$MMIN, machine_train$MMAX)
    [1] 0.7679307

Using the default cutoff of 0.9 for a high degree of correlation, we found that none of our features should be removed. When we reduce this cutoff to 0.75, we see that caret recommends that we remove the third feature (MMAX). As the final line of the preceding code shows, the degree of correlation between this feature and MMIN is 0.768. While the value is not very high, it is still high enough to cause us a certain degree of concern that this will affect our model. Intuitively, of course, if we look at the definitions of our input features, we will certainly tend to expect that a model with a relatively high value for the minimum main memory will also be likely to have a relatively high value for the maximum main memory. Linear regression can sometimes still give us a good model with correlated variables, but we would expect to get better results if our variables were uncorrelated. For now, we've decided to keep all our features for this data set.
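Before summing up, here is a brief illustrative sketch, not taken from the book's code, of how a multiple linear regression could be fit to the training set prepared above; the machine_model name is introduced here purely for the example:

    # Illustrative only: regress PRP on all six CPU characteristics in the training set
    > machine_model <- lm(PRP ~ ., data = machine_train)
    > summary(machine_model)   # inspect coefficients, R-squared, and significance tests

The summary output includes the significance tests mentioned earlier, which help flag features that may not belong in the model.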
Summary

In this article, we studied linear regression, a method that allows us to fit a linear model in a supervised learning setting where we have a number of input features and a single numeric output. Simple linear regression is the name given to the scenario where we have only one input feature, and multiple linear regression describes the case where we have multiple input features. Linear regression is very commonly used as a first approach to solving a regression problem. It assumes that the output is a linear weighted combination of the input features in the presence of an irreducible error component that is normally distributed and has zero mean and constant variance. The model also assumes that the features are independent.

Deploying a Play application on CoreOS and Docker

Packt
11 Jun 2015
8 min read
In this article by Giancarlo Inductivo, author of the book Play Framework Cookbook Second Edition, we will see how to deploy a Play 2 web application using CoreOS and Docker. CoreOS is a new, lightweight operating system ideal for modern application stacks. Together with Docker, a software container management system, this forms a formidable deployment environment for Play 2 web applications that boasts simplified deployments, isolation of processes, ease of scalability, and so on.

(For more resources related to this topic, see here.)

For this recipe, we will utilize the popular cloud IaaS, Digital Ocean. Ensure that you sign up for an account here: https://cloud.digitalocean.com/registrations/new

This recipe also requires Docker to be installed on the developer's machine. Refer to the official Docker documentation regarding installation: https://docs.docker.com/installation/

How to do it...

1. Create a new Digital Ocean droplet using CoreOS as the base operating system. Ensure that you use a droplet with at least 1 GB of RAM for the recipe to work. Note that Digital Ocean does not have a free tier and all droplets are paid instances.
2. Ensure that you select the appropriate droplet region.
3. Select CoreOS 607.0.0 and specify an SSH key to use. Visit the following link if you need more information regarding SSH key generation: https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2
4. Once the Droplet is created, make a special note of the Droplet's IP address, which we will use to log in to the Droplet.
5. Next, create a Docker.com account at https://hub.docker.com/account/signup/
6. Create a new repository to house the play2-deploy-73 docker image that we will use for deployment.
7. Create a new Play 2 webapp using the activator template, computer-database-scala, and change into the project root:

    activator new play2-deploy-73 computer-database-scala && cd play2-deploy-73

8. Edit conf/application.conf to enable automatic database evolutions:

    applyEvolutions.default=true

9. Edit build.sbt to specify Docker settings for the web app:

    import NativePackagerKeys._
    import com.typesafe.sbt.SbtNativePackager._

    name := """play2-deploy-73"""

    version := "0.0.1-SNAPSHOT"

    scalaVersion := "2.11.4"

    maintainer := "<YOUR_DOCKERHUB_USERNAME HERE>"

    dockerExposedPorts in Docker := Seq(9000)

    dockerRepository := Some("<YOUR_DOCKERHUB_USERNAME HERE>")

    libraryDependencies ++= Seq(
      jdbc,
      anorm,
      "org.webjars" % "jquery" % "2.1.1",
      "org.webjars" % "bootstrap" % "3.3.1"
    )

    lazy val root = (project in file(".")).enablePlugins(PlayScala)

10. Next, we build the Docker image and publish it to Docker Hub:

    $ activator clean docker:stage docker:publish
    ..
    [info] Step 0 : FROM dockerfile/java
    [info] ---> 68987d7b6df0
    [info] Step 1 : MAINTAINER ginduc
    [info] ---> Using cache
    [info] ---> 9f856752af9e
    [info] Step 2 : EXPOSE 9000
    [info] ---> Using cache
    [info] ---> 834eb5a7daec
    [info] Step 3 : ADD files /
    [info] ---> c3c67f0db512
    [info] Removing intermediate container 3b8d9c18545e
    [info] Step 4 : WORKDIR /opt/docker
    [info] ---> Running in 1b150e98f4db
    [info] ---> ae6716cd4643
    [info] Removing intermediate container 1b150e98f4db
    [info] Step 5 : RUN chown -R daemon .
    [info] ---> Running in 9299421b321e
    [info] ---> 8e15664b6012
    [info] Removing intermediate container 9299421b321e
    [info] Step 6 : USER daemon
    [info] ---> Running in ea44f3cc8e11
    [info] ---> 5fd0c8a22cc7
    [info] Removing intermediate container ea44f3cc8e11
    [info] Step 7 : ENTRYPOINT bin/play2-deploy-73
    [info] ---> Running in 7905c6e2d155
    [info] ---> 47fded583dd7
    [info] Removing intermediate container 7905c6e2d155
    [info] Step 8 : CMD
    [info] ---> Running in b807e6360631
    [info] ---> c3e1999cfbfd
    [info] Removing intermediate container b807e6360631
    [info] Successfully built c3e1999cfbfd
    [info] Built image ginduc/play2-deploy-73:0.0.2-SNAPSHOT
    [info] The push refers to a repository [ginduc/play2-deploy-73] (len: 1)
    [info] Sending image list
    [info] Pushing repository ginduc/play2-deploy-73 (1 tags)
    [info] Pushing tag for rev [c3e1999cfbfd] on {https://cdn-registry-1.docker.io/v1/repositories/ginduc/play2-deploy-73/tags/0.0.2-SNAPSHOT}
    [info] Published image ginduc/play2-deploy-73:0.0.2-SNAPSHOT

11. Once the Docker image has been published, log in to the Digital Ocean droplet using SSH to pull the uploaded docker image. You will need to use the core user for your CoreOS Droplet:

    ssh core@<DROPLET_IP_ADDRESS HERE>
    core@play2-deploy-73 ~ $ docker pull <YOUR_DOCKERHUB_USERNAME HERE>/play2-deploy-73:0.0.1-SNAPSHOT
    Pulling repository ginduc/play2-deploy-73
    6045dfea237d: Download complete
    511136ea3c5a: Download complete
    f3c84ac3a053: Download complete
    a1a958a24818: Download complete
    709d157e1738: Download complete
    d68e2305f8ed: Download complete
    b87155bee962: Download complete
    2097f889870b: Download complete
    5d2fb9a140e9: Download complete
    c5bdb4623fac: Download complete
    68987d7b6df0: Download complete
    9f856752af9e: Download complete
    834eb5a7daec: Download complete
    fae5f7dab7bb: Download complete
    ee5ccc9a9477: Download complete
    74b51b6dcfe7: Download complete
    41791a2546ab: Download complete
    8096c6beaae7: Download complete
    Status: Downloaded newer image for <YOUR_DOCKERHUB_USERNAME HERE>/play2-deploy-73:0.0.2-SNAPSHOT

12. We are now ready to run our Docker image using the following docker command:

    core@play2-deploy-73 ~ $ docker run -p 9000:9000 <YOUR_DOCKERHUB_USERNAME_HERE>/play2-deploy-73:0.0.1-SNAPSHOT

13. Using a web browser, access the computer-database webapp using the IP address we made note of in an earlier step of this recipe (http://192.241.239.43:9000/computers).

How it works...

In this recipe, we deployed a Play 2 web application by packaging it as a Docker image and then installing and running the same Docker image in a Digital Ocean Droplet.

Firstly, we will need an account on DigitalOcean.com and Docker.com. Once our accounts are ready and verified, we create a CoreOS-based droplet. CoreOS has Docker installed by default, so all we need to install in the droplet is the Play 2 web app Docker image.

The Play 2 web app Docker image is based on the activator template, computer-database-scala, which we named play2-deploy-73. We make two modifications to the boilerplate code. The first modification is in conf/application.conf:

    applyEvolutions.default=true

This setting enables database evolutions by default. The other modification is to be made in build.sbt.
We import the required packages that contain the Docker-specific settings:

    import NativePackagerKeys._
    import com.typesafe.sbt.SbtNativePackager._

The next settings specify the repository maintainer, the exposed Docker ports, and the Docker repository in Docker.com; in this case, supply your own Docker Hub username as the maintainer and Docker repository values:

    maintainer := "<YOUR_DOCKERHUB_USERNAME>"

    dockerExposedPorts in Docker := Seq(9000)

    dockerRepository := Some("<YOUR_DOCKERHUB_USERNAME>")

We can now build Docker images using the activator command, which will generate all the necessary files for building a Docker image:

    activator clean docker:stage

Now, we will use the activator docker command to upload and publish the image to your specified Docker.com repository:

    activator clean docker:publish

To install the Docker image in our Digital Ocean Droplet, we first log in to the droplet using the core user:

    ssh core@<DROPLET_IP_ADDRESS>

We then use the docker command, docker pull, to download the play2-deploy-73 image from Docker.com, specifying the tag:

    docker pull <YOUR_DOCKERHUB_USERNAME>/play2-deploy-73:0.0.1-SNAPSHOT

Finally, we can run the Docker image using the docker run command, exposing the container port 9000:

    docker run -p 9000:9000 <YOUR_DOCKERHUB_USERNAME>/play2-deploy-73:0.0.1-SNAPSHOT

There's more...

Refer to the following links for more information on Docker and Digital Ocean:

https://www.docker.com/whatisdocker/
https://www.digitalocean.com/community/tags/docker

Summary

In this recipe, we deployed a Play 2 web application by packaging it as a Docker image and then installing and running the same Docker image in a Digital Ocean Droplet.

Resources for Article:

Further resources on this subject:

Less with External Applications and Frameworks [article]
SpriteKit Framework and Physics Simulation [article]
Speeding Vagrant Development With Docker [article]