
How-To Tutorials - Server-Side Web Development

406 Articles

Using model serializers to eliminate duplicate code

Packt
23 Sep 2016
12 min read
In this article by Gastón C. Hillar, author of Building RESTful Python Web Services, we will cover the use of model serializers to eliminate duplicate code and the use of the default parsing and rendering options.

(For more resources related to this topic, see here.)

Using model serializers to eliminate duplicate code

The GameSerializer class declares many attributes with the same names that we used in the Game model and repeats information such as the types and the max_length values. The GameSerializer class is a subclass of rest_framework.serializers.Serializer; it declares attributes that we manually mapped to the appropriate types, and it overrides the create and update methods. Now, we will create a new version of the GameSerializer class that inherits from the rest_framework.serializers.ModelSerializer class. The ModelSerializer class automatically populates both a set of default fields and a set of default validators. In addition, the class provides default implementations for the create and update methods. In case you have any experience with the Django web framework, you will notice that the Serializer and ModelSerializer classes are similar to the Form and ModelForm classes.

Now, go to the gamesapi/games folder and open the serializers.py file. Replace the code in this file with the following code, which declares the new version of the GameSerializer class. The code file for the sample is included in the restful_python_chapter_02_01 folder.

from rest_framework import serializers
from games.models import Game


class GameSerializer(serializers.ModelSerializer):
    class Meta:
        model = Game
        fields = ('id', 'name', 'release_date', 'game_category', 'played')

The new GameSerializer class declares a Meta inner class that declares two attributes: model and fields. The model attribute specifies the model related to the serializer, that is, the Game class. The fields attribute specifies a tuple of strings whose values indicate the field names that we want to include in the serialization from the related model. There is no need to override either the create or update methods because the generic behavior will be enough in this case. The ModelSerializer superclass provides implementations for both methods.

We have reduced the boilerplate code that we didn't require in the GameSerializer class. We just needed to specify the desired set of fields in a tuple. Now, the types related to the game fields are included only in the Game class. Press Ctrl + C to quit Django's development server and execute the following command to start it again:

python manage.py runserver

Using the default parsing and rendering options and moving beyond JSON

The APIView class specifies default settings for each view that we can override by specifying appropriate values in the gamesapi/settings.py file or by overriding the class attributes in subclasses. As previously explained, the use of the APIView class under the hood makes the decorator apply these default settings. Thus, whenever we use the decorator, the default parser classes and the default renderer classes will be associated with the function views.

By default, the value for DEFAULT_PARSER_CLASSES is the following tuple of classes:

(
    'rest_framework.parsers.JSONParser',
    'rest_framework.parsers.FormParser',
    'rest_framework.parsers.MultiPartParser'
)

When we use the decorator, the API will be able to handle any of the following content types through the appropriate parsers when accessing the request.data attribute:

application/json
application/x-www-form-urlencoded
multipart/form-data
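These defaults live under the REST_FRAMEWORK settings key. We don't need to touch it for this article, but as a sketch of the standard override mechanism, declaring the same parser defaults explicitly in gamesapi/settings.py would look like this (the values shown simply restate the defaults):

# gamesapi/settings.py
# Only needed if you want to restrict or extend the content types the API accepts.
REST_FRAMEWORK = {
    'DEFAULT_PARSER_CLASSES': (
        'rest_framework.parsers.JSONParser',
        'rest_framework.parsers.FormParser',
        'rest_framework.parsers.MultiPartParser',
    ),
}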
When we access the request.data attribute in the functions, Django REST Framework examines the value of the Content-Type header in the incoming request and determines the appropriate parser to parse the request content. If we use the previously explained default values, Django REST Framework will be able to parse the previously listed content types. However, it is extremely important that the request specifies the appropriate value in the Content-Type header.

We have to remove the usage of the rest_framework.parsers.JSONParser class in the functions so that we can work with all the configured parsers and stop depending on a parser that only works with JSON. The game_list function executes the following two lines when request.method is equal to 'POST':

game_data = JSONParser().parse(request)
game_serializer = GameSerializer(data=game_data)

We will remove the first line, which uses JSONParser, and pass request.data as the data argument for the GameSerializer. The following line will replace the previous lines:

game_serializer = GameSerializer(data=request.data)

The game_detail function executes the following two lines when request.method is equal to 'PUT':

game_data = JSONParser().parse(request)
game_serializer = GameSerializer(game, data=game_data)

We will make the same edits that we made to the code in the game_list function: remove the first line, which uses JSONParser, and pass request.data as the data argument for the GameSerializer. The following line will replace the previous lines:

game_serializer = GameSerializer(game, data=request.data)

By default, the value for DEFAULT_RENDERER_CLASSES is the following tuple of classes:

(
    'rest_framework.renderers.JSONRenderer',
    'rest_framework.renderers.BrowsableAPIRenderer',
)

When we use the decorator, the API will be able to render any of the following content types in the response through the appropriate renderers when working with the rest_framework.response.Response object:

application/json
text/html

By default, the value for DEFAULT_CONTENT_NEGOTIATION_CLASS is the rest_framework.negotiation.DefaultContentNegotiation class. When we use the decorator, the API will use this content negotiation class to select the appropriate renderer for the response based on the incoming request. This way, when a request specifies that it will accept text/html, the content negotiation class selects the rest_framework.renderers.BrowsableAPIRenderer to render the response and generates text/html instead of application/json.

We have to replace the usages of both the JSONResponse and HttpResponse classes in the functions with the rest_framework.response.Response class. The Response class uses the previously explained content negotiation features, renders the received data into the appropriate content type, and returns it to the client.

Now, go to the gamesapi/games folder and open the views.py file. Replace the code in this file with the following code, which removes the JSONResponse class and uses the @api_view decorator for the functions together with the rest_framework.response.Response class. The code file for the sample is included in the restful_python_chapter_02_02 folder.
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response
from games.models import Game
from games.serializers import GameSerializer


@api_view(['GET', 'POST'])
def game_list(request):
    if request.method == 'GET':
        games = Game.objects.all()
        games_serializer = GameSerializer(games, many=True)
        return Response(games_serializer.data)

    elif request.method == 'POST':
        game_serializer = GameSerializer(data=request.data)
        if game_serializer.is_valid():
            game_serializer.save()
            return Response(game_serializer.data, status=status.HTTP_201_CREATED)
        return Response(game_serializer.errors, status=status.HTTP_400_BAD_REQUEST)


@api_view(['GET', 'PUT', 'DELETE'])
def game_detail(request, pk):
    try:
        game = Game.objects.get(pk=pk)
    except Game.DoesNotExist:
        return Response(status=status.HTTP_404_NOT_FOUND)

    if request.method == 'GET':
        game_serializer = GameSerializer(game)
        return Response(game_serializer.data)

    elif request.method == 'PUT':
        game_serializer = GameSerializer(game, data=request.data)
        if game_serializer.is_valid():
            game_serializer.save()
            return Response(game_serializer.data)
        return Response(game_serializer.errors, status=status.HTTP_400_BAD_REQUEST)

    elif request.method == 'DELETE':
        game.delete()
        return Response(status=status.HTTP_204_NO_CONTENT)

After you save the previous changes, run the following command:

http OPTIONS :8000/games/

The following is the equivalent curl command:

curl -iX OPTIONS localhost:8000/games/

The previous command will compose and send the following HTTP request: OPTIONS http://localhost:8000/games/. The request will match and run the views.game_list function, that is, the game_list function declared within the games/views.py file. We added the @api_view decorator to this function, and therefore, it is capable of determining the supported HTTP verbs and the parsing and rendering capabilities. The following lines show the output:

HTTP/1.0 200 OK
Allow: GET, POST, OPTIONS
Content-Type: application/json
Date: Thu, 09 Jun 2016 21:35:58 GMT
Server: WSGIServer/0.2 CPython/3.5.1
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "description": "",
    "name": "Game List",
    "parses": [
        "application/json",
        "application/x-www-form-urlencoded",
        "multipart/form-data"
    ],
    "renders": [
        "application/json",
        "text/html"
    ]
}

The response header includes an Allow key with a comma-separated list of the HTTP verbs supported by the resource collection as its value: GET, POST, OPTIONS. As our request didn't specify the allowed content type, the function rendered the response with the default application/json content type. The response body specifies the content types that the resource collection parses and the content types that it renders.

Run the following command to compose and send an HTTP request with the OPTIONS verb for a game resource. Don't forget to replace 3 with the primary key value of an existing game in your configuration:

http OPTIONS :8000/games/3/

The following is the equivalent curl command:

curl -iX OPTIONS localhost:8000/games/3/

The previous command will compose and send the following HTTP request: OPTIONS http://localhost:8000/games/3/. The request will match and run the views.game_detail function, that is, the game_detail function declared within the games/views.py file. We also added the @api_view decorator to this function, and therefore, it is capable of determining the supported HTTP verbs and the parsing and rendering capabilities.
The following lines show the output:

HTTP/1.0 200 OK
Allow: GET, PUT, DELETE, OPTIONS
Content-Type: application/json
Date: Thu, 09 Jun 2016 20:24:31 GMT
Server: WSGIServer/0.2 CPython/3.5.1
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "description": "",
    "name": "Game Detail",
    "parses": [
        "application/json",
        "application/x-www-form-urlencoded",
        "multipart/form-data"
    ],
    "renders": [
        "application/json",
        "text/html"
    ]
}

The response header includes an Allow key with a comma-separated list of the HTTP verbs supported by the resource as its value: GET, PUT, DELETE, OPTIONS. The response body specifies the content types that the resource parses and the content types that it renders, with the same contents received in the previous OPTIONS request applied to a resource collection, that is, to a games collection.

When we composed and sent POST and PUT commands, we had to use the -H "Content-Type: application/json" option to indicate to curl that it should send the data specified after the -d option as application/json instead of the default application/x-www-form-urlencoded. Now, in addition to application/json, our API is capable of parsing application/x-www-form-urlencoded and multipart/form-data data specified in POST and PUT requests. Thus, we can compose and send a POST command that sends the data as application/x-www-form-urlencoded with the changes made to our API.

We will compose and send an HTTP request to create a new game. In this case, we will use the -f option for HTTPie, which serializes data items from the command line as form fields and sets the Content-Type header key to the application/x-www-form-urlencoded value:

http -f POST :8000/games/ name='Toy Story 4' game_category='3D RPG' played=false release_date='2016-05-18T03:02:00.776594Z'

The following is the equivalent curl command. Notice that we don't use the -H option, so curl will send the data with the default application/x-www-form-urlencoded content type:

curl -iX POST -d 'name=Toy+Story+4&game_category=3D+RPG&played=false&release_date=2016-05-18T03%3A02%3A00.776594Z' localhost:8000/games/

The previous commands will compose and send the following HTTP request: POST http://localhost:8000/games/ with the Content-Type header key set to the application/x-www-form-urlencoded value and the following data:

name=Toy+Story+4&game_category=3D+RPG&played=false&release_date=2016-05-18T03%3A02%3A00.776594Z

The request specifies /games/, and therefore, it will match '^games/$' and run the views.game_list function, that is, the updated game_list function declared within the games/views.py file. As the HTTP verb for the request is POST, the request.method property is equal to 'POST', and therefore, the function will execute the code that creates a GameSerializer instance, passing request.data as the data argument for its creation. The rest_framework.parsers.FormParser class will parse the data received in the request, the code creates a new Game, and, if the data is valid, it saves the new Game. If the new Game was successfully persisted in the database, the function returns an HTTP 201 Created status code and the recently persisted Game serialized to JSON in the response body.
The following lines show an example response for the HTTP request, with the new Game object in the JSON response:

HTTP/1.0 201 Created
Allow: OPTIONS, POST, GET
Content-Type: application/json
Date: Fri, 10 Jun 2016 20:38:40 GMT
Server: WSGIServer/0.2 CPython/3.5.1
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "game_category": "3D RPG",
    "id": 20,
    "name": "Toy Story 4",
    "played": false,
    "release_date": "2016-05-18T03:02:00.776594Z"
}

After the changes we made in the code, we can run the following command to see what happens when we compose and send an HTTP request with an HTTP verb that is not supported:

http PUT :8000/games/

The following is the equivalent curl command:

curl -iX PUT localhost:8000/games/

The previous command will compose and send the following HTTP request: PUT http://localhost:8000/games/. The request will match and try to run the views.game_list function, that is, the game_list function declared within the games/views.py file. The @api_view decorator we added to this function doesn't include 'PUT' in the string list with the allowed HTTP verbs, and therefore, the default behavior returns a 405 Method Not Allowed status code. The following lines show the output with the response from the previous request. The JSON content provides a detail key with a string value that indicates the PUT method is not allowed:

HTTP/1.0 405 Method Not Allowed
Allow: GET, OPTIONS, POST
Content-Type: application/json
Date: Sat, 11 Jun 2016 00:49:30 GMT
Server: WSGIServer/0.2 CPython/3.5.1
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "detail": "Method \"PUT\" not allowed."
}

Summary

This article covered the use of model serializers and how effective they are in removing duplicate code.

Resources for Article:

Further resources on this subject:

Making History with Event Sourcing [article]
Implementing a WCF Service in the Real World [article]
WCF – Windows Communication Foundation [article]


Hello TDD!

Packt
14 Sep 2016
6 min read
In this article by Gaurav Sood, the author of the book Scala Test-Driven Development, we will cover the basics of Test-Driven Development. We will explore:

What is Test-Driven Development?
What is the need for Test-Driven Development?
Brief introduction to Scala and SBT

(For more resources related to this topic, see here.)

What is Test-Driven Development?

Test-Driven Development, or TDD (as it is referred to commonly), is the practice of writing your tests before writing any application code. This consists of the following iterative steps: write a failing test for the behavior you want (red), write just enough application code to make the test pass (green), and then clean up the code while keeping the tests passing (refactor). This process is also referred to as Red-Green-Refactor-Repeat.

TDD became more prevalent with the use of the agile software development process, though it can be used as easily with any of agile's predecessors, such as waterfall. Though TDD is not specifically mentioned in the agile manifesto (http://agilemanifesto.org), it has become a standard methodology used with agile. That said, you can still use agile without using TDD.

Why TDD?

The need for TDD arises from the fact that there can be constant changes to the application code. This becomes more of a problem when we are using an agile development process, as it is inherently an iterative development process. Here are some of the advantages which underpin the need for TDD:

Code quality: Tests in themselves make the programmer more confident of their code. Programmers can be sure of the syntactic and semantic correctness of their code.

Evolving architecture: Purely test-driven application code gives way to an evolving architecture. This means that we do not have to predefine our architectural boundaries and design patterns. As the application grows, so does the architecture. This results in an application that is flexible towards future changes.

Avoids over-engineering: Tests that are written before the application code define and document the boundaries. These tests also document the requirements and the application code. Agile purists normally regard comments inside the code as a smell; according to them, your tests should document your code. Since all the boundaries are predefined in the tests, it is hard to write application code that breaches these boundaries. This, however, assumes that TDD is followed religiously.

Paradigm shift: When I started with TDD, I noticed that the first question I asked myself after looking at a problem was, "How can I solve it?" This, however, is counterproductive. TDD forces the programmer to think about the testability of the solution before its implementation. To understand how to test a problem means a better understanding of the problem and its edge cases. This, in turn, can result in refinement of the requirements or discovery of some new requirements. Now it has become impossible for me not to think about the testability of a problem before the solution. Now the first question I ask myself is, "How can I test it?"

Maintainable code: I have always found it easier to work on an application that has historically been test-driven rather than on one that has not. Why? Only because when I make a change to the existing code, the existing tests make sure that I do not break any existing functionality. This results in highly maintainable code on which many programmers can collaborate simultaneously.

Brief introduction to Scala and SBT

Let us look at Scala and SBT briefly. It is assumed that the reader is already familiar with Scala, so we will not go into the depths of it.

What is Scala

Scala is a general-purpose programming language. Scala is an acronym for Scalable Language.
This reflects the vision of its creators of making Scala a language that grows with the programmer's experience of it. The fact that Scala and Java objects can be freely mixed makes the transition from Java to Scala quite easy. Scala is also a full-blown functional language. Unlike Haskell, which is a pure functional language, Scala allows interoperability with Java and support for object-oriented programming. Scala also allows the use of both pure and impure functions. Impure functions have side effects like mutation, I/O, and exceptions. The purist approach to Scala programming encourages the use of pure functions only. Scala is a type-safe JVM language that incorporates both object-oriented and functional programming into an extremely concise, logical, and extraordinarily powerful language.

Why Scala?

Here are some advantages of using Scala:

A functional solution to a problem is always better: This is my personal view and open to contention. Elimination of mutation from application code allows the application to be run in parallel across hosts and cores without any deadlocks.

Better concurrency model: Scala has an actor model that is better than Java's model of locks on threads.

Concise code: Scala code is more concise than its more verbose cousin, Java.

Type safety/static typing: Scala does type checking at compile time.

Pattern matching: Case statements in Scala are super powerful.

Inheritance: Mixin traits are great, and they definitely reduce code repetition.

There are other features of Scala, such as closures and monads, which will need more understanding of functional language concepts to learn.

Scala Build Tool

Scala Build Tool (SBT) is a build tool that allows compiling, running, testing, packaging, and deployment of your code. SBT is mostly used with Scala projects, but it can as easily be used for projects in other languages. Here, we will be using SBT as a build tool for managing our project and running our tests. SBT is written in Scala and can use many of the features of the Scala language. Build definitions for SBT are also written in Scala. These definitions are both flexible and powerful. SBT also allows the use of plugins and dependency management. If you have used a build tool like Maven or Gradle in any of your previous incarnations, you will find SBT a breeze.

Why SBT?

Better dependency management
Ivy-based dependency management
Only-update-on-request model
Can launch REPL in project context
Continuous command execution
Scala language support for creating tasks

Resources for learning Scala

Here are a few of the resources for learning Scala:

http://www.scala-lang.org/
https://www.coursera.org/course/progfun
https://www.manning.com/books/functional-programming-in-scala
http://www.tutorialspoint.com/scala/index.htm

Resources for SBT

Here are a few of the resources for learning SBT:

http://www.scala-sbt.org/
https://twitter.github.io/scala_school/sbt.html

Summary

In this article we learned what TDD is and why to use it. We also learned about Scala and SBT.

Resources for Article:

Further resources on this subject:

Overview of TDD [article]
Understanding TDD [article]
Android Application Testing: TDD and the Temperature Converter [article]


Code Style in Django

Packt
17 Jun 2015
16 min read
In this article written by Sanjeev Jaiswal and Ratan Kumar, authors of the book Learning Django Web Development, we will cover all the basic topics which you would require to follow, such as coding practices for better Django web development, which IDE to use, version control, and so on. We will learn the following topics in this article:

Django coding style
Using an IDE for Django web development
Django project structure

This article is based on the important fact that code is read much more often than it is written. Thus, before you actually start building your projects, we suggest that you familiarize yourself with all the standard practices adopted by the Django community for web development.

Django coding style

Most of Django's important practices are based on Python. Though chances are you already know them, we will still take a break and write all the documented practices so that you know these concepts even before you begin. To mainstream standard practices, Python enhancement proposals are made, and one such widely adopted standard practice for development is PEP8, the style guide for Python code, authored by Guido van Rossum. The documentation says, "PEP8 deals with semantics and conventions associated with Python docstrings." For further reading, please visit http://legacy.python.org/dev/peps/pep-0008/.

Understanding indentation in Python

When you are writing Python code, indentation plays a very important role. It acts as a block delimiter, like the braces in other languages such as C or Perl. But it's always a matter of discussion amongst programmers whether we should use tabs or spaces, and, if spaces, how many: two, four, or eight. Using four spaces for indentation is better than eight, and if there are a few more nested blocks, using eight spaces for each indentation may take up more characters than can be shown on a single line. But, again, this is the programmer's choice. The following is what incorrect indentation practices lead to:

>>> def a():
...   print "foo"
...     print "bar"
IndentationError: unexpected indent

So, which one should we use: tabs or spaces? Choose either one of them, but never mix up tabs and spaces in the same project, or else it will be a nightmare for maintenance. The most popular way of indenting in Python is with spaces; tabs come in second. If any code you have encountered has a mixture of tabs and spaces, you should convert it to using spaces exclusively.

Doing indentation right – do we need four spaces per indentation level?

There has been a lot of confusion about it, as, of course, Python's syntax is all about indentation. Let's be honest: in most cases, it is. So, it is highly recommended to use four spaces per indentation level, and if you have been following the two-space method, stop using it. There is nothing wrong with it, but when you deal with multiple third-party libraries, you might end up with a spaghetti of different styles, which will ultimately become hard to debug.

Now for indentation. When your code is in a continuation line, you should wrap it vertically aligned, or you can go in for a hanging indent. When you are using a hanging indent, the first line should not contain any argument, and further indentation should be used to clearly distinguish it as a continuation line.

A hanging indent (also known as a negative indent) is a style of indentation in which all lines are indented except for the first line of the paragraph. The preceding paragraph is an example of a hanging indent.
The following example illustrates how you should use a proper indentation method while writing the code:

bar = some_function_name(var_first, var_second,
                         var_third, var_fourth)
# Here indentation of arguments makes them grouped, and stand clear from others.

def some_function_name(
        var_first, var_second, var_third,
        var_fourth):
    print(var_first)
# This example shows the hanging indent.

We do not encourage the following coding style, and it will not work in Python anyway:

# When vertical alignment is not used, arguments on the first line are forbidden.
foo = some_function_name(var_first, var_second,
    var_third, var_fourth)

# Further indentation is required, as the indentation is not distinguishable
# between arguments and source code.
def some_function_name(
    var_first, var_second, var_third,
    var_fourth):
    print(var_first)

Although extra indentation is not required, if you want to use extra indentation to ensure that the code will work, you can use the following coding style:

# Extra indentation is not necessary.
if (this
    and that):
    do_something()

Ideally, you should limit each line to a maximum of 79 characters. This allows for a + or – character used for viewing differences with version control. It is even better to limit lines to 79 characters for uniformity across editors. You can use the rest of the space for other purposes.

The importance of blank lines

The importance of two blank lines and single blank lines is as follows:

Two blank lines: A double blank line can be used to separate top-level functions and class definitions, which enhances code readability.

Single blank lines: A single blank line can be used in several use cases. For example, each function inside a class can be separated by a single line, and related functions can be grouped together with a single line. You can also separate logical sections of source code with a single line.

Importing a package

Importing a package is a direct implication of code reusability. Therefore, always place imports at the top of your source file, just after any module comments and docstrings, and before the module's globals and constants. Each import should usually be on a separate line. The best way to import packages is as follows:

import os
import sys

It is not advisable to import more than one package on the same line, for example:

import sys, os

You may import packages in the following fashion, although it is optional:

from django.http import Http404, HttpResponse

If your import gets longer, you can use the following method to declare them:

from django.http import (
    Http404,
    HttpResponse,
    HttpResponsePermanentRedirect
)

Grouping imported packages

Package imports can be grouped in the following ways:

Standard library imports: Such as sys, os, subprocess, and so on.

import re
import simplejson

Related third-party imports: These are usually downloaded from the Python cheese shop, that is, PyPI (using pip install). Here is an example:

from decimal import *

Local application/library-specific imports: This includes the local modules of your project, such as models, views, and so on.

from models import ModelFoo
from models import ModelBar
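Put together, the top of a module that uses all three groups might look as follows (a sketch that simply combines the examples above, with a blank line separating each group):

# Standard library imports
import os
import sys

# Related third-party imports
from django.http import Http404, HttpResponse

# Local application/library-specific imports
from models import ModelFoo
from models import ModelBar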
Naming conventions in Python/Django

Every programming language and framework has its own naming convention. The naming convention in Python/Django is more or less the same, but it is worth mentioning it here. You will need to follow these rules when creating a variable name or global variable name, and when naming a class, package, module, and so on. This is the common naming convention that we should follow:

Name the variables properly: Never use single characters, for example, 'x' or 'X', as variable names. It might be okay for your normal Python scripts, but when you are building a web application, you must name the variables properly, as this determines the readability of the whole project.

Naming of packages and modules: Lowercase and short names are recommended for modules. Underscores can be used if their use would improve readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged. Since module names are mapped to file names (models.py, urls.py, and so on), it is important that module names be chosen to be fairly short, as some file systems are case-insensitive and truncate long names.

Naming a class: Class names should follow the CamelCase naming convention, and classes for internal use can have a leading underscore in their name.

Global variable names: First of all, you should avoid using global variables, but if you need to use them, prevention of global variables from getting exported can be done via __all__, or by defining them with a prefixed underscore (the old, conventional way).

Function names and method arguments: Names of functions should be in lowercase and separated by underscores, with self as the first argument to instance methods. For classes or methods, use cls or the objects for initialization.

Method names and instance variables: Use the function naming rules—lowercase with words separated by underscores as necessary to improve readability. Use one leading underscore only for non-public methods and instance variables.
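Pulling these rules together, here is a small illustrative sketch (all of the names are hypothetical, not taken from the book's project):

MAX_TWEET_LENGTH = 140          # module-level constant

_internal_registry = {}         # leading underscore: not part of the public API


class TweetManager:             # class names use CamelCase
    def __init__(self, user_name):            # methods: lowercase_with_underscores,
        self.user_name = user_name            # self as the first argument
        self._cache_timeout = 60              # one leading underscore: non-public

    def recent_tweets(self, max_results=10):
        # Placeholder body; a real implementation would query a data store.
        return []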
Using an IDE for faster development

There are many options on the market when it comes to source code editors. Some people prefer full-fledged IDEs, whereas others like simple text editors. The choice is totally yours; pick whatever feels more comfortable. If you already use a certain program to work with Python source files, I suggest that you stick to it, as it will work just fine with Django. Otherwise, I can make a couple of recommendations, such as these:

SublimeText: This editor is lightweight and very powerful. It is available for all major platforms, supports syntax highlighting and code completion, and works well with Python. The editor is open source and you can find it at http://www.sublimetext.com/

PyCharm: This, I would say, is the most intelligent code editor of all and has advanced features, such as code refactoring and code analysis, which make development cleaner. Features for Django include template debugging (which is a winner) and also quick documentation, so this look-up is a must for beginners. The community edition is free, and you can sample a 30-day trial version before buying the professional edition.

Setting up your project with the Sublime text editor

Most of the examples that we will show you in this book will be written using the Sublime text editor. In this section, we will show how to install it and set up a Django project.

Download and installation: You can download Sublime from the download tab of the site www.sublimetext.com. Click on the downloaded file option to install.

Setting up for Django: Sublime has a very extensive plug-in ecosystem, which means that once you have downloaded the editor, you can install plug-ins for adding more features to it. Most important of all is Package Control, which is the manager for installing additional plugins directly from within Sublime. This will be your only manual package installation; it will take care of the rest of the package installations ahead. Some of the recommendations for Python development using Sublime are as follows:

Sublime Linter: This gives instant feedback about your Python code as you write it. It also has PEP8 support; this plugin will highlight, in real time, the things we discussed about better coding in the previous section so that you can fix them.

Sublime CodeIntel: This is maintained by the developer of SublimeLinter. Sublime CodeIntel has some advanced functionalities, such as go-to definition, intelligent code completion, and import suggestions.

You can also explore other plugins for Sublime to increase your productivity.

Setting up the PyCharm IDE

You can use any of your favorite IDEs for Django project development. We will use the PyCharm IDE for this book. This IDE is recommended as it will help you at the time of debugging, using breakpoints that will save you a lot of time figuring out what actually went wrong. Here is how to install and set up the PyCharm IDE for Django:

Download and installation: You can check the features and download the PyCharm IDE from the following link: http://www.jetbrains.com/pycharm/

Setting up for Django: Setting up PyCharm for Django is very easy. You just have to import the project folder and give the manage.py path.

The Django project structure

The Django project structure was changed in the 1.6 release. Django (django-admin.py) also has a startapp command to create an application, so it is high time to tell you the difference between an application and a project in Django. A project is a complete website or application, whereas an application is a small, self-contained Django application. An application is based on the principle that it should do one thing and do it right. To ease the pain of building a Django project right from scratch, Django gives you an advantage by auto-generating the basic project structure files, from which any project can be taken forward for its development and feature addition. Thus, to conclude, we can say that a project is a collection of applications, and an application can be written as a separate entity and can be easily exported to other applications for reusability.

To create your first Django project, open a terminal (or Command Prompt for Windows users), type the following command, and hit Enter:

$ django-admin.py startproject django_mytweets

This command will make a folder named django_mytweets in the current directory and create the initial directory structure inside it. Let's see what kind of files are created. The new structure is as follows:

django_mytweets/
    django_mytweets/
    manage.py

This is the content of django_mytweets/:

django_mytweets/
    __init__.py
    settings.py
    urls.py
    wsgi.py

Here is a quick explanation of what these files are:

django_mytweets (the outer folder): This folder is the project folder. Contrary to the earlier project structure, in which the whole project was kept in a single folder, the new Django project structure somehow hints that every project is an application inside Django. This means that you can import other third-party applications on the same level as the Django project.
This folder also contains the manage.py file, which includes all the project management settings.

manage.py: This utility script is used to manage our project. You can think of it as your project's version of django-admin.py. Actually, both django-admin.py and manage.py share the same backend code. Further clarification about the settings will be provided when we are going to tweak the changes. Let's have a look at the manage.py file:

#!/usr/bin/env python
import os
import sys

if __name__ == "__main__":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "django_mytweets.settings")
    from django.core.management import execute_from_command_line
    execute_from_command_line(sys.argv)

The source code of the manage.py file will be self-explanatory once you read the following code explanation.

#!/usr/bin/env python

The first line is just the declaration that the following file is a Python file, followed by the import section, in which the os and sys modules are imported. These modules mainly contain system-related operations.

import os
import sys

The next piece of code checks whether the file is executed as the main program, which is the first code to be executed, and then loads the Django settings module onto the current path. As you are already running a virtual environment, this will set the path for all the modules to the path of the currently running virtual environment.

if __name__ == "__main__":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE",
        "django_mytweets.settings")

django_mytweets/ (the inner folder)

__init__.py: Django projects are Python packages, and this file is required to tell Python that this folder is to be treated as a package. A package, in Python's terminology, is a collection of modules, and they are used to group similar files together and prevent naming conflicts.

settings.py: This is the main configuration file for your Django project. In it, you can specify a variety of options, including database settings, site language(s), what Django features need to be enabled, and so on. By default, the database is configured to use SQLite, which is advisable for testing purposes. Here, we will only see how to enter the database in the settings file; it also contains the basic setting configuration, and with a slight modification in the manage.py file, it can be moved to another folder, such as config or conf.

To make every other third-party application a part of the project, we need to register it in the settings.py file. INSTALLED_APPS is a variable that contains all the entries about the installed applications. As the project grows, it becomes difficult to manage; therefore, there are three logical partitions for the INSTALLED_APPS variable, as follows (a short sketch of this split follows the list):

DEFAULT_APPS: This parameter contains the default Django installed applications (such as the admin)
THIRD_PARTY_APPS: This parameter contains other applications, like SocialAuth, used for social authentication
LOCAL_APPS: This parameter contains the applications that are created by you
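The article doesn't show the corresponding settings code at this point, but a minimal sketch of the three-way split could look like this (the partition names match the list above; the specific app names are hypothetical):

DEFAULT_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
)

THIRD_PARTY_APPS = (
    'social_auth',   # hypothetical third-party app for social authentication
)

LOCAL_APPS = (
    'tweets',        # hypothetical app created by you
)

INSTALLED_APPS = DEFAULT_APPS + THIRD_PARTY_APPS + LOCAL_APPS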
urls.py: This is another configuration file. You can think of it as a mapping between URLs and the Django view functions that handle them. This file is one of Django's more powerful features.

When we start writing code for our application, we will create new files inside the project's folder. So, the folder also serves as a container for our code. Now that you have a general idea of the structure of a Django project, let's configure our database system.

Summary

We prepared our development environment in this article, created our first project, set up the database, and learned how to launch the Django development server. We learned the best way to write code for our Django project and saw the default Django project structure.

Resources for Article:

Further resources on this subject:

Tinkering Around in Django JavaScript Integration [article]
Adding a developer with Django forms [article]
So, what is Django? [article]


Search Using Beautiful Soup

Packt
20 Jan 2014
6 min read
(For more resources related to this topic, see here.)

Searching with find_all()

The find() method was used to find the first result within a particular search criterion that we applied on a BeautifulSoup object. As the name implies, find_all() will give us all the items matching the search criteria we defined. The different filters that we see in find() can be used in the find_all() method. In fact, these filters can be used in any searching method, such as find_parents() and find_siblings(). Let us consider an example of using find_all().

Finding all tertiary consumers

We saw how to find the first and second primary consumers. If we need to find all the tertiary consumers, we can't use find(). In this case, find_all() comes in handy:

all_tertiaryconsumers = soup.find_all(class_="tertiaryconsumerslist")

The preceding code line finds all the tags with the "tertiaryconsumerslist" class. If we run a type check on this variable, we can see that it is nothing but a list of tag objects:

print(type(all_tertiaryconsumers))
#output
<class 'list'>

We can iterate through this list to display all tertiary consumer names by using the following code:

for tertiaryconsumer in all_tertiaryconsumers:
    print(tertiaryconsumer.div.string)
#output
lion
tiger

Understanding parameters used with find_all()

Like find(), the find_all() method also has a similar set of parameters, with an extra parameter, limit, as shown in the following code line:

find_all(name,attrs,recursive,text,limit,**kwargs)

The limit parameter is used to specify a limit to the number of results that we get. For example, from the e-mail ID sample we saw, we can use find_all() to get all the e-mail IDs. Refer to the following code:

email_ids = soup.find_all(text=emailid_regexp)
print(email_ids)
#output
[u'abc@example.com',u'xyz@example.com',u'foo@example.com']

Here, if we pass limit, it will limit the result set to the limit we impose, as shown in the following example:

email_ids_limited = soup.find_all(text=emailid_regexp,limit=2)
print(email_ids_limited)
#output
[u'abc@example.com',u'xyz@example.com']

From the output, we can see that the result is limited to two. The find() method is find_all() with limit=1.

We can pass True or False values to the find methods. If we pass True to find_all(), it will return all tags in the soup object. In the case of find(), it will be the first tag within the object. The print(soup.find_all(True)) line of code will print out all the tags associated with the soup object. In the case of searching for text, passing True will return all text within the document, as follows:

all_texts = soup.find_all(text=True)
print(all_texts)
#output
[u'\n', u'\n', u'\n', u'\n', u'\n', u'plants', u'\n', u'100000', u'\n', u'\n', u'\n', u'algae', u'\n', u'100000', u'\n', u'\n', u'\n', u'\n', u'\n', u'deer', u'\n', u'1000', u'\n', u'\n', u'\n', u'rabbit', u'\n', u'2000', u'\n', u'\n', u'\n', u'\n', u'\n', u'fox', u'\n', u'100', u'\n', u'\n', u'\n', u'bear', u'\n', u'100', u'\n', u'\n', u'\n', u'\n', u'\n', u'lion', u'\n', u'80', u'\n', u'\n', u'\n', u'tiger', u'\n', u'50', u'\n', u'\n', u'\n', u'\n', u'\n']

The preceding output prints every text content within the soup object, including the newline characters too.

Also, in the case of text, we can pass a list of strings, and find_all() will find every string defined in the list:

all_texts_in_list = soup.find_all(text=["plants","algae"])
print(all_texts_in_list)
#output
[u'plants', u'algae']

This is the same in the case of searching for tags, attribute values of a tag, custom attributes, and the CSS class. For finding all the div and li tags, we can use the following code line:

div_li_tags = soup.find_all(["div","li"])

Similarly, for finding tags with the producerlist and primaryconsumerlist classes, we can use the following code line:

all_css_class = soup.find_all(class_=["producerlist","primaryconsumerlist"])

Both find() and find_all() search an object's descendants (that is, all children coming after it in the tree), their children, and so on. We can control this behavior by using the recursive parameter. If recursive = False, the search happens only on the object's direct children. For example, in the following code, the search happens only at the direct children for the div and li tags. Since the direct child of the soup object is html, the following code will give an empty list:

div_li_tags = soup.find_all(["div","li"],recursive=False)
print(div_li_tags)
#output
[]

If find_all() can't find results, it will return an empty list, whereas find() returns None.
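Since this excerpt uses an existing soup object throughout, here is a minimal, self-contained sketch of how such an object can be built and searched (assuming Beautiful Soup 4 is installed; the markup is a trimmed version of the ecopyramid sample used later in this article):

from bs4 import BeautifulSoup

html_markup = """<div class="ecopyramid">
<ul id="producers">
<li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>
</div>"""

soup = BeautifulSoup(html_markup, "html.parser")

# find_all() returns every matching tag; find() would return only the first.
for producer in soup.find_all(class_="producerlist"):
    print(producer.div.string)
#output
plants
algae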
Navigation using Beautiful Soup

Navigation in Beautiful Soup is almost the same as the searching methods. In navigating, instead of methods, there are certain attributes that facilitate the navigation. So each Tag or NavigableString object will be a member of the resulting tree, with the BeautifulSoup object placed at the top and the other objects as the nodes of the tree. The following code snippet is an example of an HTML tree:

html_markup = """<div class="ecopyramid">
<ul id="producers">
<li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>
</div>"""

For the previous code snippet, an HTML tree is formed in which the BeautifulSoup object is the root of the tree, the Tag objects make up the different nodes of the tree, and the NavigableString objects make up the leaves of the tree.

Navigation in Beautiful Soup is intended to help us visit the nodes of this HTML/XML tree. From a particular node, it is possible to:

Navigate down to the children
Navigate up to the parent
Navigate sideways to the siblings
Navigate to the next and previous objects parsed

We will be using the previous html_markup as an example to discuss the different navigations using Beautiful Soup, starting with the short sketch below.
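These directions are exposed as attributes and sibling-finding methods. The following is a brief sketch that reuses the previous html_markup (assuming the soup object is built with Beautiful Soup 4 as shown earlier; parent, string, and find_next_sibling() are standard Beautiful Soup names):

from bs4 import BeautifulSoup

soup = BeautifulSoup(html_markup, "html.parser")

first_li = soup.ul.li                 # navigate down: first <li> inside the first <ul>
print(first_li.div.string)
#output
plants

print(first_li.parent["id"])          # navigate up: the parent <ul> and its id attribute
#output
producers

second_li = first_li.find_next_sibling("li")   # navigate sideways to the next <li> sibling
print(second_li.div.string)
#output
algae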
Summary

In this article, we discussed in detail the different search methods in Beautiful Soup, namely find(), find_all(), find_next(), and find_parents(); code examples for a scraper using search methods to get information from a website; and the application of search methods in combination. We also discussed in detail the different navigation methods provided by Beautiful Soup: methods specific to navigating downwards, upwards, sideways, and to the previous and next elements of the HTML tree.

Resources for Article:

Further resources on this subject:

Web Services Testing and soapUI [article]
Web Scraping with Python [article]
Plotting data using Matplotlib: Part 1 [article]


Creating your First Complete Moodle Theme

Packt
11 May 2010
11 min read
So let's crack on...

Creating a new theme

Finding a base theme on which to create your Moodle theme is the first thing that you need to do. There are, however, various ways to do this; you can make a copy of the standard theme and rename it, as you did in part of this article, or you can use a parent theme that is also based on the standard theme. The important point here is that the standard theme that comes with Moodle is the cornerstone of the Moodle theming process. Every other Moodle theme should be based upon this theme, and would normally describe the differences from the standard theme. Although this method does work and is a simple way to get started with Moodle theming, it does cause problems, as new features could get added to Moodle that might cause your theme to display or function incorrectly. The standard theme will always be updated before a new release of Moodle is launched. So, if you do choose to make a copy of the standard theme and change its styles, it would be best to make sure that you use a parent theme as well. In this way, the parent theme will be your base theme, along with the changes that you make to your copy of the standard theme.

However, there is another way of creating your first theme, and that is to create a copy of a theme that is very close to the standard theme, such as standardwhite, and use this as your theme. Moodle will then use the standard theme as its base theme and apply any changes that you make to the standardwhite theme on top (parent). All we are doing is describing the differences between the standard and the standardwhite themes. This is better because Moodle developers will sometimes make changes to the standard theme to be up to date with new Moodle features. This means that on each Moodle update, your standard theme folder will be updated automatically, thus avoiding any nasty display problems caused by Moodle updates.

The way in which you configure Moodle themes is completely up to you. If you see a theme that is nearly what you want and there aren't really many changes needed, then using a parent theme makes sense, as most of the styles that you require have already been written. However, if you want to create a theme that is completely different from any other theme, or wish to really get into Moodle theming, then using a copy of one of the standard sheets would be best. So let's get on and see what the differences are when using different theme setups, and see what effect these different methods have on the theming process.

Time for action – copying the standard theme

Browse to your theme folder in C:\Program Files\Apache Software Foundation\Apache 2.2\htdocs\theme.

Copy the standard theme by right-clicking on the theme's folder and choosing Copy.

Paste the copied theme into the theme directory (the same directory that you are currently in).

Rename the copy of the standard folder to blackandblue, or any other name that you wish to choose (remember not to use capitals or spaces).

Open your Moodle site and navigate to Site Administration | Appearance | Themes | Theme Selector, and choose the blackandblue theme that you have just created.

You might have noticed that the theme now has a header that says Black and Blue theme. This is because I have added this to the Full site name in the Front Page settings page.
Go to Site Administration | Appearance | Themes | Theme Selector and choose your blackandblue theme if it is not already selected. Browse to the root of your blackandblue folder, right-click on the config.php file, and choose Open with | WordPad. You need to make four changes to this file so that you can use this theme and a parent theme while ensuring that you still use the default standard theme as your base. Here are the changes: $THEME->sheets = array('user_styles');$THEME->standardsheets = true;$THEME->parent = 'autumn';$THEME->parentsheets = array('styles'); Let's look at each of these statements, in turn. $THEME->sheets = array('user_styles'); This contains the names of all of the stylesheet files that you want to include in this for your blackandblue theme, namely user_styles. $THEME->standardsheets = true; This parameter is used to include the standard theme's stylesheets. If it is set to True, it will use all of the stylesheets in the standard theme. Alternatively, it can be set as an array in order to load individual stylesheets in whatever order is required. We have set this to True, so we will use all of the stylesheets of the standard theme. $THEME->parent = 'autumn'; This variable can be set to use a theme as the parent theme, which is included before the current theme. This will make it easier to make changes to another theme without having to change the actual files. $THEME->parentsheets = array('styles'); This variable can be used to choose either all of the parent theme's stylesheets or individual files. It has been set to include the styles.css file from the parent theme, namely autumn. Because there is only one stylesheet in the Autumn theme, you can set this variable to True. Either way, you will have the same outcome. Save themeblackandblueconfig.php, and refresh your web browser window. You should see something similar to the following screenshot. Note that your blocks may be different to the ones below, but you can ignore this. What just happened? Okay, so now you have a copy of the standard theme that uses the Autumn theme (by Patrick Malley) as its parent. You might have noticed that the header isn't correct and that the proper Autumn theme header isn't showing. Well, this is because you are essentially using the copy of the standard theme and that the header from this theme is the one that you see above. It's only the CSS files that are included in this hierarchy, so any HTML changes will not be seen until you edit your standard theme's header.html file. Have a go hero – choose another parent theme Go back and have a look through some of the themes on Moodle.org and download one that you like. Add this theme as a parent theme to your blackandblue theme's config.php file, but this time choose which stylesheets you want to use from that theme. The Back to School theme is a good one for this exercise, as its stylesheets are clearly labeled. So you Copying the header and footer files To show that you are using the Autumn theme's CSS files and the standard theme's HTML files, you can just go and copy the header.html and footer.html files from Patrick Malley's Autumn theme and paste them into your blackandblue theme's folder. Don't worry about overwriting your header and footer files, as you can always just copy them again from the ac tual standard theme folder. 
Time for action – copying the header.html and footer.html files

Browse to the Autumn theme's folder and highlight both the header.html and footer.html files by holding down the Ctrl key and clicking on them both.

Right-click on the selected files and choose Copy.

Browse to your blackandblue theme's folder, right-click, and choose Paste.

Go back to your browser window and press the F5 button to refresh the page. You will now see the full Autumn theme.

What just happened?

You have copied the Autumn theme's header.html and footer.html files into your blackandblue theme, so you can see the full Autumn theme working. You probably will not actually use the header.html and footer.html files that you just copied, as this was just an example of how the Moodle theming process works. So, you now have an unmodified copy of the standard theme called blackandblue, which is using the Autumn theme as its parent theme. All you need to do now to make changes to this theme is to edit your CSS file in the blackandblue theme folder.

Theme folder housework

However, there are a couple of things that you need to do first, as you have an exact copy of the standard theme apart from the header.html and footer.html files. This copied folder has files that you do not need, as the only file that you set for your theme to use was the user_styles.css file in the config.php file earlier. This was the first change that you made:

$THEME->sheets = array('user_styles');

The user_styles.css file does not exist in your blackandblue theme's folder, so you will need to create it. You will also need to delete any other CSS files that are present, as your new blackandblue theme will use only one stylesheet, namely the user_styles.css file that you will be creating in the following sections.

Time for action – creating our stylesheet

Right-click anywhere in your blackandblue folder and choose New | Text Document.

Rename this text document to user_styles.css by right-clicking again and choosing Rename.

Time for action – deleting CSS files that we don't need

Delete the following CSS files by selecting them, then right-clicking on the selected files and choosing Delete:

styles_color.css
styles_ie6.css
styles_ie7.css
styles_layout.css
styles_moz.css
Type in the following line of CSS for the body element, and then save the file:

body {background: #000000;}

Now refresh your browser window. You will see that the background is now black.

Note: When using Firebug to identify styles that are being used, it might not always be obvious where they are or which style is controlling that element of the page. An example of this is the body {background: #000000;} code that we just pasted into our user_styles.css file. If we had used Firebug to identify that style, we would not have found it. Instead, I just took a look at the CSS file from the Autumn theme. What I am trying to say here is that there will always be an element of poking around and trial and error.

What just happened?

All seems fine there, doesn't it? You have added one style declaration to your empty user_styles.css file to change the background color, and have checked the changes in your browser. You now know how the parent themes work, and you know that you only need to copy the styles from Firebug into your user_styles.css file and edit the style declarations that need to be changed.
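Once Firebug (or a look through the parent theme's styles.css) shows you which rule controls an element, you can paste that selector into user_styles.css and change just the declarations you need. Here is a minimal sketch of the kind of overrides you might add next; the selectors below are illustrative assumptions, so verify the real ones in your own theme first:

/* page background, as entered above */
body {background: #000000;}

/* assumed selectors -- check them with Firebug before relying on them */
h1, h2, h3 {color: #3366cc;}              /* give headings a blue tint */
.sideblock .header {background: #333333;} /* darken the side block headers */

Because your theme's own stylesheets are included after the parent theme's, declarations placed in user_styles.css should win over the inherited Autumn styles without needing !important.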
Advanced aspects of Inserting and Deleting data in MySQL

Packt
01 Oct 2010
10 min read
MySQL Admin Cookbook

99 great recipes for mastering MySQL configuration and administration

Set up MySQL to perform administrative tasks such as efficiently managing data and database schema, improving the performance of MySQL servers, and managing user credentials
Deal with typical performance bottlenecks and lock-contention problems
Restrict access sensibly and regain access to your database in case of loss of administrative user credentials
Part of Packt's Cookbook series: each recipe is a carefully organized sequence of instructions to complete the task as efficiently as possible

Read more about this book

(For more resources on MySQL, see here.)

Inserting new data and updating data if it already exists

Manipulating data in a database is part of everyday work, and the basic SQL means of INSERT, UPDATE, and DELETE make this a pretty straightforward, almost trivial task. But is this always true?

When considering data manipulation, most of the time we think of a situation where we know the content of the database. With this information, it is usually pretty easy to find a way of changing the data the way you intend to. But what if you have to change data in circumstances where you do not know the actual database content beforehand? You might answer: "Well, then look at your data before changing it!" Unfortunately, you do not always have this option. Think of distributed installations of any software that includes a database. If you have to design an update option for this software (and the respective databases), you might easily come to a situation where you simply do not know about the actual database content. One example of a problem arising in these cases is the question of whether to insert or to update data: "Does the data in question already (partially) exist?"

Let us assume a database table config that stores configuration settings. It holds key-value pairs, with name being the name (and thus the key) of the setting and value its value. This table exists in different database installations, one for every branch office of your company. Your task is to create an update package to set a uniform upper limit of 25% for the price discount that is allowed in your sales software. If no such limit has been defined yet, there is no respective entry in the config table, and you have to insert a new record. If the limit, however, has been set before (for example by the local manager), the entry does already exist, in which case you have to update it to hold the new value.

While the update of a potentially existing entry does not pose a problem, an INSERT statement that violates uniqueness constraints will simply cause an error. This is, however, typically not acceptable in an automated update procedure. The following recipe will show you how to solve this problem with only one SQL command.

Getting ready

Besides a running MySQL server, a SQL client, and an account with appropriate user rights (INSERT, UPDATE), we need a table to update. In the earlier example, we assumed a table named sample.config with two character columns name and value. The name column is defined as the primary key:

CREATE TABLE sample.config (
  name VARCHAR(64) PRIMARY KEY,
  value VARCHAR(64));

How to do it...

Connect to your database using your SQL client.
Execute the following command:

mysql> INSERT INTO sample.config VALUES ("maxPriceDiscount","25%") ON DUPLICATE KEY UPDATE value='25%';
Query OK, 1 row affected (0.05 sec)
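As a quick aside (our note, not part of the recipe): MySQL also lets you reference the values from the INSERT clause inside the UPDATE clause via the VALUES() function, which avoids repeating the literals:

mysql> INSERT INTO sample.config VALUES ("maxPriceDiscount","25%")
    -> ON DUPLICATE KEY UPDATE value=VALUES(value);

This behaves identically to the statement above, and it becomes especially convenient once several columns are involved.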
How it works...

This command is easily explained because it simply does what it says: it inserts a new row in the table using the given values, as long as this does not cause a duplicate entry in either the primary key or another unique index. If a duplicate record exists, the existing row is updated according to the clauses defined after ON DUPLICATE KEY UPDATE.

While it is sometimes tedious to enter some of the data and columns two times (once for the INSERT and a second time for the UPDATE), this statement allows for a lot of flexibility when it comes to the manipulation of potentially existing data.

Please note that when executing the above statement, the result differs slightly with respect to the number of affected rows, depending on the actual data present in the database: when the record does not exist yet, it is inserted, which results in one affected row. But if the record is updated rather than inserted, it reports two affected rows instead, even if only one row gets updated.

There's more...

The INSERT INTO … ON DUPLICATE KEY UPDATE construct does not work when there is no UNIQUE or PRIMARY KEY defined on the target table. If you have to provide the same semantics without having appropriate key definitions in place, it is recommended to use the techniques discussed in the next recipe.

Inserting data based on existing database content

In the previous recipe, Inserting new data and updating data if it already exists, we discussed a method to either insert or update records depending on whether the records already exist in the database. A similar problem arises when you need to insert data to your database, but the data to insert depends on the data in your database.

As an example, consider a situation in which you need to insert a record with a certain message into a table logMsgs, but the message itself should be different depending on the current system language that is stored in a configuration table (config). It is fairly easy to achieve a similar behavior for an UPDATE statement, because this supports a WHERE clause that can be used to only perform an update if a certain precondition is met:

UPDATE logMsgs SET message=CONCAT('Last update: ', NOW()) WHERE EXISTS (SELECT value FROM config WHERE name='lang' AND value='en');
UPDATE logMsgs SET message=CONCAT('Letztes Update: ', NOW()) WHERE EXISTS (SELECT value FROM config WHERE name='lang' AND value='de');
UPDATE logMsgs SET message=CONCAT('Actualisation derniere: ', NOW()) WHERE EXISTS (SELECT value FROM config WHERE name='lang' AND value='fr');

Unfortunately, this approach is not applicable to INSERT commands, as these do not support a WHERE clause. Despite this missing option, the following recipe describes a method to make INSERT statements execute only if an appropriate precondition in the database is met.

Getting ready

As before, we assume a database, a SQL client (mysql), and a MySQL user with sufficient privileges (INSERT and SELECT in this case). Additionally, we need a table to insert data into (here: logMsgs) and a configuration table config (please refer to the previous recipe for details).

How to do it...

Connect to your database using your SQL client.
Execute the following SQL commands:

mysql> INSERT INTO sample.logMsgs(message)
    -> SELECT CONCAT('Last update: ', NOW())
    -> FROM sample.config WHERE name='lang' AND value='en';
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

How it works...

Our goal is to have an INSERT statement take into account the present language stored in the database.
The trick to do so is to use a SELECT statement as input for the INSERT. The SELECT command provides a WHERE clause, so you can use a condition that only matches for the respective language. One restriction of this solution is that you can only insert one record at a time, so the size of scripts might grow considerably if you have to insert lots of data and/or have to cover many alternatives.

There's more...

If you have more than just a few values to insert, it is more convenient to have the data in one place rather than distributed over several individual INSERT statements. In this case, it might make sense to consolidate the data by putting it inside a temporary table; the final INSERT statement uses this temporary table to select the appropriate data rows for insertion into the target table. The downside of this approach is that the user needs the CREATE TEMPORARY TABLES privilege, but it typically compensates with much cleaner scripts: after creating the temporary table with the first statement, we insert data into the table with the following INSERT statement. The next statement inserts the appropriate data into the target table sample.logMsgs by selecting the rows from the temporary table that match the language entry from the config table. The temporary table is then removed again. The final SELECT statement is solely for checking the results of the operation. (A sketch of this pattern appears at the end of this excerpt.)

Deleting all data from large tables

Almost everyone who works with databases experiences the constant growth of the data stored in their database, and it is typically well beyond the initial estimates. Because of that, you often end up with rather large data sets. Another common observation is that in most databases, there are some tables that have a special tendency to grow especially big.

If a table's size reaches a virtual threshold (which is hard to define, as it depends heavily on the access patterns and the data structures), it gets harder and harder to maintain, and performance degradation might occur. From a certain point on, it is even difficult to get rid of data in the table again, as the sheer number of records makes deletion a pretty expensive task. This particularly holds true for storage engines with Multi-Version Concurrency Control (MVCC): if you order the database to delete data from the table, it must not be deleted right away because you might still roll back the deletion. So even while the deletion was initiated, a concurrent query on the table still has to be able to see all the records (depending on the transaction isolation level). To achieve this, the storage engine will only mark the records as deleted, but the actual deletion takes place after the operation is committed and all other transactions that access this table are closed as well.

If you have to deal with large data sets, the most difficult task is to operate on the production system while other processes concurrently work on the data. In these circumstances, you have to keep the duration of your maintenance operations as low as possible in order to minimize the impact on the running system. As the deletion of data from a large table (typically starting at several millions of rows) might take quite some time, the following recipe shows a way of minimizing the duration of this operation in order to reduce side effects (like locking effects or performance degradation).

Getting ready

Besides a user account with appropriate privileges (DELETE), you need a sufficiently large table to delete data from.
For this recipe, we will use the employees database, which is an example database available from MySQL: http://dev.mysql.com/doc/employee/en/employee.html

This database provides some tables with sensible data and some pretty large tables, the largest having more than 2.8 million records. We assume that the employees database was installed with the InnoDB storage engine enabled. To delete all rows of the largest table, employees.salaries, in a quick way, please read on.

How to do it...
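The excerpt breaks off at this point, but the temporary-table pattern described in the previous recipe's There's more... section can be sketched in SQL. The following is our illustration of the steps the text lists, not the book's verbatim listing, and the column layout of the temporary table is an assumption:

CREATE TEMPORARY TABLE tmp_msgs (lang VARCHAR(8), message VARCHAR(100));

-- consolidate all language variants in one place
INSERT INTO tmp_msgs VALUES
  ('en', CONCAT('Last update: ', NOW())),
  ('de', CONCAT('Letztes Update: ', NOW())),
  ('fr', CONCAT('Actualisation derniere: ', NOW()));

-- insert only the row that matches the configured language
INSERT INTO sample.logMsgs(message)
  SELECT t.message
  FROM tmp_msgs t JOIN sample.config c
    ON c.name = 'lang' AND c.value = t.lang;

DROP TEMPORARY TABLE tmp_msgs;
SELECT * FROM sample.logMsgs; -- solely for checking the result

As for the truncated How to do it... section on emptying a large table: one commonly used fast path (again our note, not necessarily the recipe's exact approach) is TRUNCATE TABLE employees.salaries;, which drops and recreates the table instead of deleting rows one by one, and therefore finishes quickly regardless of table size.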
Create a User Profile System and use the Null Coalesce Operator

Packt
12 Oct 2016
15 min read
In this article by Jose Palala and Martin Helmich, authors of PHP 7 Programming Blueprints, we will show you how to build a simple profiles page with listed users which you can click on, and create a simple CRUD-like system which will enable us to register new users to the system, and delete users for banning purposes.

(For more resources related to this topic, see here.)

You will learn to use the PHP 7 null coalesce operator so that you can show data if there is any, or just display a simple message if there isn't any.

Let's create a simple UserProfile class. The ability to create classes has been available since PHP 5. A class in PHP starts with the word class, and the name of the class:

class UserProfile {
    private $table = 'user_profiles';
}

We've made the $table property private, and we use it to define which database table the class relates to. Let's add two functions, also known as methods, inside the class to fetch the data from the database:

function fetch_one($id) {
    $link = mysqli_connect('127.0.0.1', 'root', 'apassword', 'my_database');
    $query = "SELECT * FROM " . $this->table . " WHERE `id` = '" . $id . "'";
    $results = mysqli_query($link, $query);
    return $results;
}

function fetch_all() {
    $link = mysqli_connect('127.0.0.1', 'root', 'apassword', 'my_database');
    $query = "SELECT * FROM " . $this->table;
    $results = mysqli_query($link, $query);
    return $results;
}

The null coalesce operator

We can use PHP 7's null coalesce operator to allow us to check whether our results contain anything, or return a defined text which we can check on the views; this will be responsible for displaying any data. Let's put the defined text in a file which will contain all the define statements, and call it definitions.php:

//definitions.php
define('NO_RESULTS_MESSAGE', 'No results found');

require('definitions.php');

function fetch_all() {
    // ...same lines as before...
    $results = $results ?? NO_RESULTS_MESSAGE;
    return $results;
}

On the client side, we'll need to come up with a template to show the list of user profiles. Let's create a basic HTML block to show that each profile can be a div element with several list item elements to output each value. In the following function, we need to make sure that all values have been filled in, with at least the name and the age. Then we simply return the entire string when the function is called:

function profile_template($name, $age, $country) {
    $name = $name ?? null;
    $age = $age ?? null;
    if ($name === null || $age === null) {
        return 'Name or Age need to be set';
    } else {
        return '<div>
            <li>Name: ' . $name . '</li>
            <li>Age: ' . $age . '</li>
            <li>Country: ' . $country . '</li>
            </div>';
    }
}

Separation of concerns

In a proper MVC architecture, we need to separate the view from the models that get our data, and the controllers will be responsible for handling business logic. In our simple app, we will skip the controller layer, since we just want to display the user profiles in one public-facing page. The preceding function is also known as the template render part in an MVC architecture.

While there are frameworks available for PHP that use the MVC architecture out of the box, for now we can stick to what we have and make it work. PHP frameworks can benefit a lot from the null coalesce operator. In some code that I've worked with, we used to use the ternary operator a lot, but still had to add more checks to ensure a value was not falsy. Furthermore, the ternary operator can get confusing, and takes some getting used to. The other alternative is to use the isset function.
However, due to the nature of the isset function, some falsy values will be interpreted by PHP as being set.

Creating views

Now that we have our model complete, and a template render function, we just need to create the view with which we can look at each profile. Our view will be put inside a foreach block, and we'll use the template we wrote to render the right values:

//listprofiles.php
<!doctype html>
<html>
<head>
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css">
</head>
<body>
<?php
foreach ($results as $item) {
    echo profile_template($item->name, $item->age, $item->country);
}
?>
</body>
</html>

Let's put the code above into index.php. While we may install the Apache server, configure it to run PHP, install new virtual hosts and the other necessary features, and put our PHP code into an Apache folder, this will take time. So, for the purposes of testing this out, we can just run PHP's built-in server for development. To run the built-in PHP server (read more at http://php.net/manual/en/features.commandline.webserver.php), run the following inside a terminal, from the folder that contains our code:

php -S localhost:8000

If we open up our browser, we should see nothing yet: No results found. This means we need to populate our database.

If you have an error with your database connection, be sure to supply the correct database credentials in each of the mysqli_connect calls that we made.

To supply data into our database, we can create a simple SQL script like this:

INSERT INTO user_profiles (name, age, country) VALUES ('Chin Wu', 30, 'Mongolia');
INSERT INTO user_profiles (name, age, country) VALUES ('Erik Schmidt', 22, 'Germany');
INSERT INTO user_profiles (name, age, country) VALUES ('Rashma Naru', 33, 'India');

Let's save it in a file such as insert_profiles.sql. In the same directory as the SQL file, log on to the MySQL client by using the following command:

mysql -u root -p

Then type use <name of database>:

mysql> use <database>;

Import the script by running the source command:

mysql> source insert_profiles.sql

Now our user profiles page should show the three profiles we just inserted.

Create a profile input form

Now let's create the HTML form for users to enter their profile data. Our profiles app would be no use if we didn't have a simple way for a user to enter their user profile details. We'll create the profile input form like this:

//create_profile.php
<html>
<body>
<form action="post_profile.php" method="POST">
    <label>Name</label><input name="name">
    <label>Age</label><input name="age">
    <label>Country</label><input name="country">
    <input type="submit" value="Submit">
</form>
</body>
</html>

For this profile post, we'll need to create a PHP script to take care of anything the user posts. It will create an SQL statement from the input values and output whether or not they were inserted. We can use the null coalesce operator again to verify that the user has inputted all values and left nothing undefined or null:

$name = $_POST['name'] ?? "";
$age = $_POST['age'] ?? "";
$country = $_POST['country'] ?? "";

This prevents us from accumulating errors while inserting data into our database.

First, let's create a variable to hold each of the inputs in one array:

$input_values = [
    'name' => $name,
    'age' => $age,
    'country' => $country
];

The preceding code is the PHP 5.4+ way to write arrays. In PHP 5.4+, it is no longer necessary to write out an actual array(); the author personally likes the new syntax better.
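Before we wire these values up, here is a small standalone illustration (our own snippet, not from the original article) of why ?? is the right tool for form input, compared with a plain truthiness check and the isset-plus-ternary idiom mentioned earlier:

<?php
$_POST['age'] = 0; // a falsy, but perfectly valid, submitted value

// short ternary: 0 is falsy, so we would wrongly fall back
$age = $_POST['age'] ?: 'unknown';                       // 'unknown'

// isset + ternary: correct, because 0 is set, but verbose
$age = isset($_POST['age']) ? $_POST['age'] : 'unknown'; // 0

// null coalesce: same result, far less noise
$age = $_POST['age'] ?? 'unknown';                       // 0

The null coalesce operator only falls back when the key is missing or the value is null, which is usually exactly what form handling needs.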
We should create a new method in our UserProfile class to accept these values:

class UserProfile {
    public function insert_profile($values) {
        $link = mysqli_connect('127.0.0.1', 'username', 'password', 'databasename');
        $q = "INSERT INTO " . $this->table . " VALUES ('" . $values['name'] . "', '" . $values['age'] . "', '" . $values['country'] . "')";
        return mysqli_query($link, $q);
    }
}

Instead of creating a parameter in our function to hold each argument, as we did with our profile template render function, we can simply use an array to hold our values. This way, if a new field needs to be inserted into our database, we can just add another field to the SQL insert statement.

While we are at it, let's create the edit profile section. For now, we'll assume that whoever is using this edit profile page is the administrator of the site. We'll need to create a page where, provided $_GET['id'] has been set, we fetch that user from the database and display them in the form:

<?php
require('class/userprofile.php'); //contains the UserProfile class

$id = $_GET['id'] ?? 'No ID';
//if id was a string, i.e. "No ID", this would go into the else block
if (is_numeric($id)) {
    $profile = new UserProfile();
    //get data from our database
    $results = $profile->fetch_one($id);
    if ($results && $results->num_rows > 0) {
        while ($obj = $results->fetch_object()) {
            $name = $obj->name;
            $age = $obj->age;
            $country = $obj->country;
        }
        //display form with a hidden field containing the value of the ID
?>
<form action="post_update_profile.php" method="post">
    <input type="hidden" name="id" value="<?=$id?>">
    <label>Name</label><input name="name" value="<?=$name?>">
    <label>Age</label><input name="age" value="<?=$age?>">
    <label>Country</label><input name="country" value="<?=$country?>">
    <input type="submit" value="Update">
</form>
<?php
    } else {
        exit('No such user');
    }
} else {
    echo $id; //this should be 'No ID'
    exit;
}

Notice that we're using what is known as the shortcut echo statement in the form. It makes our code simpler and easier to read. Since we're using PHP 7, this feature comes out of the box.

Once someone submits the form, the data goes into our $_POST variable, and we'll create a new update function in our UserProfile class.

Admin system

Let's finish off by creating a simple grid for an admin dashboard portal that will be used with our user profiles database. Our requirement for this is simple: we can just set up a table-based layout that displays each user profile in rows. From the grid, we will add the links to be able to edit the profile, or delete it, if we want to. The code to display a table in our HTML view would look like this:

<table>
<tr>
    <td>John Doe</td>
    <td>21</td>
    <td>USA</td>
    <td><a href="edit_profile.php?id=1">Edit</a></td>
    <td><a href="profileview.php?id=1">View</a></td>
    <td><a href="delete_profile.php?id=1">Delete</a></td>
</tr>
</table>

The script for this is the following:

//listprofiles.php
$connection = mysqli_connect('localhost', '<username>', '<password>', '<databasename>');
$sql = "SELECT * FROM user_profiles LIMIT $start, $limit";
$rs_result = mysqli_query($connection, $sql); //run the query
while ($row = mysqli_fetch_assoc($rs_result)) {
?>
<tr>
    <td><?=$row['name'];?></td>
    <td><?=$row['age'];?></td>
    <td><?=$row['country'];?></td>
    <td><a href="edit_profile.php?id=<?=$row['id']?>">Edit</a></td>
    <td><a href="profileview.php?id=<?=$row['id']?>">View</a></td>
    <td><a href="delete_profile.php?id=<?=$row['id']?>">Delete</a></td>
</tr>
<?php
}

There's one thing that we haven't yet created: the delete_profile.php page. The view and edit pages have been discussed already.
Here's how the delete_profile.php page would look:

<?php
//delete_profile.php
$connection = mysqli_connect('localhost', '<username>', '<password>', '<databasename>');

$id = $_GET['id'] ?? 'No ID';

if (!is_numeric($id)) {
    exit('Error: non-numeric id');
}

mysqli_query($connection, "DELETE FROM user_profiles WHERE id = '" . $id . "'");
echo "Profile #" . $id . " has been deleted";
?>

Of course, since we might have a lot of user profiles in our database, we have to create a simple pagination. In any pagination system, you just need to figure out the total number of rows and how many rows you want displayed per page. We can create a function that will be able to return a URL that contains the page number and how many to view per page.

Using our database class, we first create a new function to count the total number of items in our database:

class UserProfile {
    // .... etc ...
    function count_rows($table) {
        $dbconn = new mysqli('localhost', 'root', 'somepass', 'databasename');
        $query = $dbconn->query("SELECT COUNT(*) AS num FROM " . $table);
        $row = mysqli_fetch_array($query);
        return $row['num']; //fetching by array, so element 'num' = count
    }
}

For our pagination, we can create a simple paginate function which accepts the base URL of the page where we have pagination, the rows per page (also known as the number of records we want each page to have), and the total number of records found:

require('definitions.php');
require('db.php'); //our database class

function paginate($baseurl, $rows_per_page, $total_rows) {
    $pagination_links = array(); //instantiate an array to hold our html page links

    //check that the inputs are usable; exit with an error message if the
    //function is called incorrectly
    if ($rows_per_page === null || $total_rows === null) {
        exit('Error: no rows per page or total rows');
    }

    $pages = ceil($total_rows / $rows_per_page);
    for ($pagenum = 1; $pagenum <= $pages; $pagenum++) {
        $pagination_links[] = '<a href="http://' . $baseurl . '?pagenum=' . $pagenum . '&rpp=' . $rows_per_page . '">' . $pagenum . '</a>';
    }
    return $pagination_links;
}

This function will help display the page links in a table:

function display_pagination($links) {
    $display = '<div class="pagination"><table><tr>';
    foreach ($links as $link) {
        $display .= "<td>" . $link . "</td>";
    }
    $display .= '</tr></table></div>';
    return $display;
}

Notice that we're following the principle that there should rarely be any echo statements inside a function. This is because we want to make sure that other users of these functions are not confused when they debug some mysterious output on their page. By requiring the programmer to echo out whatever the functions return, it becomes easier to debug our program. Also, we're following the separation of concerns: our code doesn't output the display, it just formats the display. So any future programmer can just update the function's internal code and return something else. It also makes our function reusable; imagine that in the future someone uses our function. This way, they won't have to double-check that there's some misplaced echo statement within our functions.

A note on alternative short tags

As you know, another way to echo is to use the <?= tag. You can use it like so: <?="helloworld"?>. These are known as short tags. In PHP 7, alternative PHP tags have been removed. The RFC states that <%, <%=, %>, and <script language=php> have been deprecated.
The RFC at https://wiki.php.net/rfc/remove_alternative_php_tags says that the RFC does not remove short opening tags (<?) or short opening tags with echo (<?=).

Since we have laid out the groundwork of creating paginate links, we now just have to invoke our functions. The following script is all that is needed to create a paginated page using the preceding functions:

$connection = mysqli_connect('localhost', '<username>', '<password>', '<dbname>');

$limit = $_GET['rpp'] ?? 10;      //how many items to show per page; default 10
$pagenum = $_GET['pagenum'] ?? 0; //what page we are on

if ($pagenum) {
    $start = ($pagenum - 1) * $limit; //first item to display on this page
} else {
    $start = 0; //if no page var is given, set start to 0
}

/* Display records here */
$sql = "SELECT * FROM user_profiles LIMIT $start, $limit";
$rs_result = mysqli_query($connection, $sql); //run the query

while ($row = mysqli_fetch_assoc($rs_result)) {
?>
<tr>
    <td><?php echo $row['name']; ?></td>
    <td><?php echo $row['age']; ?></td>
    <td><?php echo $row['country']; ?></td>
</tr>
<?php
}

/* Let's show our page links */
/* get the number of records */
$db = new UserProfile();
$record_count = $db->count_rows('user_profiles');
$pagination_links = paginate('listprofiles.php', $limit, $record_count);
echo display_pagination($pagination_links);

The HTML output of our page links in listprofiles.php will look something like this:

<div class="pagination"><table><tr>
    <td><a href="listprofiles.php?pagenum=1&rpp=10">1</a></td>
    <td><a href="listprofiles.php?pagenum=2&rpp=10">2</a></td>
    <td><a href="listprofiles.php?pagenum=3&rpp=10">3</a></td>
</tr></table></div>

Summary

As you can see, we have a lot of use cases for the null coalesce operator. We learned how to make a simple user profile system, and how to use PHP 7's null coalesce feature when fetching data from the database, which returns null if there are no records. We also learned that the null coalesce operator is similar to a ternary operator, except that it returns null by default if there is no data.

Resources for Article:

Further resources on this subject:
Running Simpletest and PHPUnit [article]
Mapping Requirements for a Modular Web Shop App [article]
HTML5: Generic Containers [article]

Interactive Crime Map Using Flask

Packt
12 Jan 2016
18 min read
In this article by Gareth Dwyer, author of the book Flask By Example, we will cover how to set up a MySQL database on our VPS and create a database for the crime data. We'll follow on from this by setting up a basic page containing a map and textbox. We'll see how to link Flask to MySQL by storing data entered into the textbox in our database.

We won't be using an ORM for our database queries or a JavaScript framework for user input and interaction. This means that there will be some laborious writing of SQL and vanilla JavaScript, but it's important to fully understand why tools and frameworks exist, and what problems they solve, before diving in and using them blindly.

(For more resources related to this topic, see here.)

We'll cover the following topics:

Introduction to SQL databases
Installing MySQL on our VPS
Connecting to MySQL from Python and creating the database
Connecting to MySQL from Flask and inserting data

Setting up

We'll create a new git repository for our new code base, since although some of the setup will be similar, our new project should be completely unrelated to our first one. If you need more help with this step, head back to the setup of the first project and follow the detailed instructions there. If you're feeling confident, see if you can do it just with the following summary:

Head over to the website for Bitbucket, GitHub, or whichever hosting platform you used for the first project. Log in and use their Create a new repository functionality. Name your repo crimemap, and take note of the URL you're given.
On your local machine, fire up a terminal and run the following commands:

mkdir crimemap
cd crimemap
git init
git remote add origin <git repository URL>

We'll leave this repository empty for now, as we need to set up a database on our VPS. Once we have the database installed, we'll come back here to set up our Flask project.

Understanding relational databases

In its simplest form, a relational database management system, such as MySQL, is a glorified spreadsheet program, such as Microsoft Excel: we store data in rows and columns. Every row is a "thing" and every column is a specific piece of information about the thing in the relevant row. I put "thing" in inverted commas because we're not limited to storing objects. In fact, the most common example, both in the real world and in explaining databases, is data about people. A basic database storing information about customers of an e-commerce website could look something like the following:

ID | First Name | Surname  | Email Address        | Telephone
1  | Frodo      | Baggins  | fbaggins@example.com | +1 111 111 1111
2  | Bilbo      | Baggins  | bbaggins@example.com | +1 111 111 1010
3  | Samwise    | Gamgee   | sgamgee@example.com  | +1 111 111 1001

If we look from left to right in a single row, we get all the information about one person. If we look at a single column from top to bottom, we get one piece of information (for example, an e-mail address) for everyone. Both can be useful; if we want to add a new person or contact a specific person, we're probably interested in a specific row. If we want to send a newsletter to all our customers, we're just interested in the e-mail column.

So why can't we just use spreadsheets instead of databases then? Well, if we take the example of an e-commerce store further, we quickly see the limitations. If we want to store a list of all the items we have on offer, we can create another table similar to the preceding one, with columns such as "Item name", "Description", "Price", and "Quantity in stock".
Our model continues to be useful. But now, if we want to store a list of all the items Frodo has ever purchased, there's no good place to put the data. We could add 1000 columns to our customer table: "Purchase 1", "Purchase 2", and so on up to "Purchase 1000", and hope that Frodo never buys more than 1000 items. This isn't scalable or easy to work with: How do we get the description for the item Frodo purchased last Tuesday? Do we just store the item's name in our new column? What happens with items that don't have unique names?

Soon, we realise that we need to think about it backwards. Instead of storing the items purchased by a person in the "Customers" table, we create a new table called "Orders" and store a reference to the customer in every order. Thus, an order knows which customer it belongs to, but a customer has no inherent knowledge of what orders belong to them.

While our model still fits into a spreadsheet at the push of a button, as we grow our data model and data size, our spreadsheet becomes cumbersome. We need to perform complicated queries such as "I want to see all the items that are in stock and have been ordered at least once in the last 6 months and cost more than $10."

Enter relational database management systems (RDBMS). They've been around for decades and are a tried and tested way of solving a common problem: storing data with complicated relations in an organized and accessible manner. We won't be touching on their full capabilities in our crime map (in fact, we could probably store our data in a .txt file if we needed to), but if you're interested in building web applications, you will need a database at some point. So, let's start small and add the powerful MySQL tool to our growing toolbox.

I highly recommend learning more about databases. If the taster you experience while building our current project takes your fancy, go read and learn about databases. The history of RDBMS is interesting, and the complexities and subtleties of normalization and database varieties (including NoSQL databases, which we'll see something of in our next project) deserve more study time than we can devote to them in a book that focuses on Python web development.

Installing and configuring MySQL

Installing and configuring MySQL is an extremely common task. You can therefore find it in prebuilt images or in scripts that build entire stacks for you. A common stack is called the LAMP (Linux, Apache, MySQL, and PHP) stack, and many VPS providers provide a one-click LAMP stack image. As we are already using Linux and have already installed Apache manually, after installing MySQL, we'll be very close to the traditional LAMP stack, just using the P for Python instead of PHP. In keeping with our goal of "education first", we'll install MySQL manually and configure it through the command line instead of installing a GUI control panel. If you've used MySQL before, feel free to set it up as you see fit.

Installing MySQL on our VPS

Installing MySQL on our server is quite straightforward. SSH into your VPS and run the following commands:

sudo apt-get update
sudo apt-get install mysql-server

You should see an interface prompting you for a root password for MySQL. Enter a password of your choice and repeat it when prompted.
Once the installation has completed, you can get a live SQL shell by typing the following command and entering the password you chose earlier:

mysql -p

We could create a database and schema using this shell, but we'll be doing that through Python instead, so hit Ctrl + C to terminate the MySQL shell if you opened it.

Installing Python drivers for MySQL

Because we want to use Python to talk to our database, we need to install another package. There are two main MySQL connectors for Python: PyMySQL and MySQLdb. The first is preferable from a simplicity and ease-of-use point of view. It is a pure Python library, meaning that it has no dependencies. MySQLdb is a C extension, and therefore has some dependencies, but is, in theory, a bit faster. They work very similarly once installed. To install PyMySQL, run the following (still on your VPS):

sudo pip install pymysql

Creating our crimemap database in MySQL

Some knowledge of SQL's syntax will be useful for the rest of this article, but you should be able to follow either way. The first thing we need to do is create a database for our web application. If you're comfortable using a command-line editor, you can create the following scripts directly on the VPS, as we won't be running them locally, and this can make them easier to debug. However, developing over an SSH session is far from ideal, so I recommend that you write them locally and use git to transfer them to the server before running. This can make debugging a bit frustrating, so be extra careful in writing these scripts. If you want, you can get them directly from the code bundle that comes with this book. In this case, you simply need to populate the password field correctly and everything should work.

Creating a database setup script

In the crimemap directory where we initialised our git repo in the beginning, create a Python file called db_setup.py, containing the following code:

import pymysql
import dbconfig

connection = pymysql.connect(host='localhost',
                             user=dbconfig.db_user,
                             passwd=dbconfig.db_password)
try:
    with connection.cursor() as cursor:
        sql = "CREATE DATABASE IF NOT EXISTS crimemap"
        cursor.execute(sql)
        sql = """CREATE TABLE IF NOT EXISTS crimemap.crimes (
            id int NOT NULL AUTO_INCREMENT,
            latitude FLOAT(10,6),
            longitude FLOAT(10,6),
            date DATETIME,
            category VARCHAR(50),
            description VARCHAR(1000),
            updated_at TIMESTAMP,
            PRIMARY KEY (id)
            )"""
        cursor.execute(sql)
    connection.commit()
finally:
    connection.close()

Let's take a look at what this code does. First, we import the pymysql library we just installed. We also import dbconfig, which we'll create locally in a bit and populate with the database credentials (we don't want to store these in our repository). Then, we create a connection to our database using localhost (because our database is installed on the same machine as our code) and the credentials that we haven't created yet.

Now that we have a connection to our database, we can get a cursor. You can think of a cursor as being a bit like the blinking object in your word processor that indicates where text will appear when you start typing. A database cursor is an object that points to a place in the database where we want to create, read, update, or delete data. Once we start dealing with database operations, there are various exceptions that could occur.
We'll always want to close our connection to the database, so we create a cursor (and do all subsequent operations) inside a try block with a connection.close() in a finally block (the finally block will get executed whether or not the try block succeeds). The cursor is also a resource, so we'll grab one and use it in a with block so that it'll automatically be closed when we're done with it. With the setup done, we can start executing SQL code.

Creating the database

SQL reads similarly to English, so it's normally quite straightforward to work out what existing SQL does, even if it's a bit more tricky to write new code. Our first SQL statement creates a database (crimemap) if it doesn't already exist (this means that if we come back to this script, we can leave this line in without deleting the entire database every time). We create our first SQL statement as a string and use the variable sql to store it. Then we execute the statement using the cursor we created.

Using the database setup script

We save our script locally and push it to the repository using the following commands:

git add db_setup.py
git commit -m "database setup script"
git push origin master

We then SSH to our VPS and clone the new repository to our /var/www directory using the following commands:

ssh user@123.456.789.123
cd /var/www
git clone <your-git-url>
cd crimemap

Adding credentials to our setup script

Now, we still don't have the credentials that our script relies on. We'll do the following things before using our setup script:

Create the dbconfig.py file with the database user and password.
Add this file to .gitignore to prevent it from being added to our repository.

The following are the steps to do so:

Create and edit dbconfig.py using the nano command:

nano dbconfig.py

Then, type the following (using the password you chose when you installed MySQL):

db_user = "root"
db_password = "<your-mysql-password>"

Save it by hitting Ctrl + X and entering Y when prompted.

Now, use similar nano commands to create, edit, and save .gitignore, which should contain this single line:

dbconfig.py

Running our database setup script

With that done, you can run the following command:

python db_setup.py

Assuming everything goes smoothly, you should now have a database with a table to store crimes. Python will output any SQL errors, allowing you to debug if necessary. If you make changes to the script from the server, run the same git add, git commit, and git push commands that you did from your local machine.

That concludes our preliminary database setup! Now we can create a basic Flask project that uses our database.

Creating an outline for our Flask app

We're going to start by building a skeleton of our crime map application. It'll be a basic Flask application with a single page that:

Displays all data in the crimes table of our database
Allows users to input data and stores this data in the database
Has a "clear" button that deletes all the previously input data

Although what we're going to be storing and displaying can't really be described as "crime data" yet, we'll be storing it in the crimes table that we created earlier. We'll just be using the description field for now, ignoring all the other ones.

The process to set up the Flask application is very similar to what we used before. We're going to separate out the database logic into a separate file, leaving our main crimemap.py file for the Flask setup and routing.

Setting up our directory structure

On your local machine, change to the crimemap directory.
If you created the database setup script on the server or made any changes to it there, then make sure you sync the changes locally. Then, create the templates directory and touch the files we're going to be using, as follows:

cd crimemap
git pull origin master
mkdir templates
touch templates/home.html
touch crimemap.py
touch dbhelper.py

Looking at our application code

The crimemap.py file contains nothing unexpected and should be entirely familiar from our headlines project. The only thing to point out is the DBHelper class, whose code we'll see next. We simply create a global DBHelper instance right after initializing our app and then use it in the relevant methods to grab data from, insert data into, or delete all data from the database:

from dbhelper import DBHelper
from flask import Flask
from flask import render_template
from flask import request

app = Flask(__name__)
DB = DBHelper()

@app.route("/")
def home():
    try:
        data = DB.get_all_inputs()
    except Exception as e:
        print e
        data = None
    return render_template("home.html", data=data)

@app.route("/add", methods=["POST"])
def add():
    try:
        data = request.form.get("userinput")
        DB.add_input(data)
    except Exception as e:
        print e
    return home()

@app.route("/clear")
def clear():
    try:
        DB.clear_all()
    except Exception as e:
        print e
    return home()

if __name__ == '__main__':
    app.run(debug=True)

Looking at our SQL code

There's a little bit more SQL to learn from our database helper code. In dbhelper.py, we need the following:

import pymysql
import dbconfig

class DBHelper:

    def connect(self, database="crimemap"):
        return pymysql.connect(host='localhost',
                               user=dbconfig.db_user,
                               passwd=dbconfig.db_password,
                               db=database)

    def get_all_inputs(self):
        connection = self.connect()
        try:
            query = "SELECT description FROM crimes;"
            with connection.cursor() as cursor:
                cursor.execute(query)
                return cursor.fetchall()
        finally:
            connection.close()

    def add_input(self, data):
        connection = self.connect()
        try:
            # note: interpolating user input into SQL like this is vulnerable
            # to SQL injection; parameterized queries are the safer option
            query = "INSERT INTO crimes (description) VALUES ('{}');".format(data)
            with connection.cursor() as cursor:
                cursor.execute(query)
                connection.commit()
        finally:
            connection.close()

    def clear_all(self):
        connection = self.connect()
        try:
            query = "DELETE FROM crimes;"
            with connection.cursor() as cursor:
                cursor.execute(query)
                connection.commit()
        finally:
            connection.close()

As in our setup script, we need to make a connection to our database and then get a cursor from our connection in order to do anything meaningful. Again, we perform all our operations in try: ...finally: blocks in order to ensure that the connection is closed.

In our helper code, we see three of the four main database operations. CRUD (Create, Read, Update, and Delete) describes the basic database operations. We are either creating and inserting new data, or reading, modifying, or deleting existing data. We have no need to update data in our basic app, but creating, reading, and deleting are certainly useful.

Creating our view code

Python and SQL code is fun to write, and it is indeed the main part of our application. However, at the moment, we have a house without doors or windows; the difficult and impressive bit is done, but it's unusable. Let's add a few lines of HTML to allow the world to interact with the code we've written.
In /templates/home.html, add the following:

<!doctype html>
<html>
<head>
    <title>Crime Map</title>
</head>
<body>
    <h1>Crime Map</h1>
    <form action="/add" method="POST">
        <input type="text" name="userinput">
        <input type="submit" value="Submit">
    </form>
    <a href="/clear">clear</a>
    {% for userinput in data %}
        <p>{{userinput}}</p>
    {% endfor %}
</body>
</html>

There's nothing we haven't seen before. We have a form with a single text input box to add data to our database by calling the /add function of our app, and directly below it, we loop through all the existing data and display each piece within <p> tags.

Running the code on our VPS

Finally, we just need to make our code accessible to the world. This means pushing it to our git repo, pulling it onto the VPS, and configuring Apache to serve it. Run the following commands locally:

git add .
git commit -m "Skeleton CrimeMap"
git push origin master
ssh <username>@<vps-ip-address>

And on your VPS, use the following commands:

cd /var/www/crimemap
git pull origin master

Now, we need a .wsgi file to link our Python code to Apache:

nano crimemap.wsgi

The .wsgi file should contain the following:

import sys
sys.path.insert(0, "/var/www/crimemap")
from crimemap import app as application

Hit Ctrl + X and then Y when prompted to save.

We also need to create a new Apache .conf file and set this as the default (instead of the headlines.conf file that is our current default), as follows:

cd /etc/apache2/sites-available
nano crimemap.conf

This file should contain the following:

<VirtualHost *>
    ServerName example.com
    WSGIScriptAlias / /var/www/crimemap/crimemap.wsgi
    WSGIDaemonProcess crimemap
    <Directory /var/www/crimemap>
        WSGIProcessGroup crimemap
        WSGIApplicationGroup %{GLOBAL}
        Order deny,allow
        Allow from all
    </Directory>
</VirtualHost>

This is so similar to the headlines.conf file we created for our previous project that you might find it easier to just copy that one and substitute code as necessary.

Finally, we need to deactivate the old site (later on, we'll look at how to run multiple sites simultaneously off the same server) and activate the new one:

sudo a2dissite headlines.conf
sudo a2ensite crimemap.conf
sudo service apache2 reload

Now, everything should be working. If you copied the code out manually, it's almost certain that there's a bug or two to deal with. Don't be discouraged by this; remember that debugging is expected to be a large part of development! If necessary, do a tail -f on /var/log/apache2/error.log while you load the site in order to see any errors. If this fails, add some print statements to crimemap.py and dbhelper.py to narrow down the places where things are breaking.

Once everything is working, you should see the skeleton page in your browser: a heading, a text input, and any previously submitted data listed below it. Notice how the data we get from the database is a tuple, which is why it is surrounded by brackets and has a trailing comma. This is because we selected only a single field (description) from our crimes table, when we could, in theory, be dealing with many columns for each crime (and soon will be).

Summary

That's it for the introduction to our crime map project.

Resources for Article:

Further resources on this subject:
Web Scraping with Python [article]
Python 3: Building a Wiki Application [article]
Using memcached with Python [article]
Digging Deep into Requests

Packt
16 Jun 2015
17 min read
In this article by Rakesh Vidya Chandra and Bala Subrahmanyam Varanasi, authors of the book Python Requests Essentials, we are going to deal with advanced topics in the Requests module. There are many more features in the Requests module that make interaction with the web a cakewalk. Let us get to know more about the different ways to use the Requests module, which will help us to understand the ease of using it.

(For more resources related to this topic, see here.)

In a nutshell, we will cover the following topics:

Persisting parameters across requests using Session objects
Revealing the structure of request and response
Using prepared requests
Verifying SSL certificates with Requests
Body content workflow
Using generators for sending chunk-encoded requests
Getting the request method arguments with event hooks
Iterating over streaming APIs
Self-describing the APIs with link headers
Transport adapters

Persisting parameters across requests using Session objects

The Requests module contains a Session object, which has the capability to persist settings across requests. Using this Session object, we can persist cookies, we can create prepared requests, we can use the keep-alive feature, and we can do many more things. The Session object contains all the methods of the Requests API, such as GET, POST, PUT, DELETE, and so on. Before using all the capabilities of the Session object, let us get to know how to use sessions and persist cookies across requests.

Let us use the session's get method to access a web resource:

>>> import requests
>>> session = requests.Session()
>>> response = session.get("https://google.co.in", cookies={"new-cookie-identifier": "1234abcd"})

In the preceding example, we created a Session object with Requests, and its get method was used to access a web resource. The cookie value which we set is accessible via response.request.headers:

>>> response.request.headers
CaseInsensitiveDict({'Cookie': 'new-cookie-identifier=1234abcd', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/2.2.1 CPython/2.7.5+ Linux/3.13.0-43-generic'})
>>> response.request.headers['Cookie']
'new-cookie-identifier=1234abcd'

With a Session object, we can specify default values for the properties that need to be sent to the server using GET, POST, PUT, and so on. We can achieve this by assigning values to properties like headers, auth, and so on, on the Session object:

>>> session.params = {"key1": "value", "key2": "value2"}
>>> session.auth = ('username', 'password')
>>> session.headers.update({'foo': 'bar'})

In the preceding example, we have set some default values for the properties params, auth, and headers using the Session object. We can override them in a subsequent request, as shown in the following example, if we want to:

>>> session.get('http://mysite.com/new/url', headers={'foo': 'new-bar'})

Revealing the structure of request and response

A Request object is the one which is created by the user when he/she tries to interact with a web resource. It will be sent as a prepared request to the server, and it does contain some parameters, which are optional. Let us have an eagle-eye view of the parameters:

Method: This is the HTTP method to be used to interact with the web service. For example: GET, POST, PUT.
URL: The web address to which the request needs to be sent.
headers: A dictionary of headers to be sent in the request.
files: This can be used while dealing with multipart uploads. It is a dictionary of files, with the key being the file name and the value being the file object.
data: This is the body to be attached to the request.
json: There are two cases that come into the picture here: if json is provided, the content-type in the header is changed to application/json, and at this point, json acts as the body of the request; in the second case, if both json and data are provided together, data is silently ignored.
params: A dictionary of URL parameters to append to the URL.
auth: This is used when we need to specify the authentication for the request. It's a tuple containing a username and password.
cookies: A dictionary or a cookie jar of cookies which can be added to the request.
hooks: A dictionary of callback hooks.

A Response object contains the response of the server to an HTTP request. It is generated once Requests gets a response back from the server. It contains all of the information returned by the server and also stores the Request object we created originally.

Whenever we make a call to a server using Requests, two major transactions take place in this context, which are listed as follows:

We construct a Request object, which will be sent out to the server to request a resource
A Response object is generated by the Requests module

Now, let us look at an example of getting a resource from Python's official site:

>>> response = requests.get('https://python.org')

In the preceding line of code, a Request object gets constructed and is sent to 'https://python.org'. The Request object thus obtained is stored in the response.request variable. We can access the headers of the Request object which was sent off to the server in the following way:

>>> response.request.headers
CaseInsensitiveDict({'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/2.2.1 CPython/2.7.5+ Linux/3.13.0-43-generic'})

The headers returned by the server can be accessed with the response's headers attribute, as shown in the following example:

>>> response.headers
CaseInsensitiveDict({'content-length': '45950', 'via': '1.1 varnish', 'x-cache': 'HIT', 'accept-ranges': 'bytes', 'strict-transport-security': 'max-age=63072000; includeSubDomains', 'vary': 'Cookie', 'server': 'nginx', 'age': '557', 'content-type': 'text/html; charset=utf-8', 'public-key-pins': 'max-age=600; includeSubDomains; ..)

The Response object contains different attributes like _content, status_code, headers, url, history, encoding, reason, cookies, elapsed, and request:

>>> response.status_code
200
>>> response.url
u'https://www.python.org/'
>>> response.elapsed
datetime.timedelta(0, 1, 904954)
>>> response.reason
'OK'

Using prepared requests

Every request we send to the server turns into a PreparedRequest by default. The request attribute of the Response object, which is received from an API call or a session call, is actually the PreparedRequest that was used. There might be cases in which we ought to send a request which would incur an extra step of adding a different parameter. Parameters can be cookies, files, auth, timeout, and so on. We can handle this extra step efficiently by using the combination of sessions and prepared requests. Let us look at an example:

>>> from requests import Request, Session
>>> header = {}
>>> request = Request('get', 'some_url', headers=header)

We are trying to send a get request with a header in the previous example. Now, take an instance where we are planning to send the request with the same method, URL, and headers, but we want to add some more parameters to it.
In this condition, we can use the session method to receive the complete session-level state, so that we can access the parameters of the initially sent request. This can be done by using the session object:

>>> from requests import Request, Session
>>> session = Session()
>>> request1 = Request('GET', 'some_url', headers=header)

Now, let us prepare a request using the session object to get the values of the session-level state:

>>> prepare = session.prepare_request(request1)

We can send the request object with more parameters now, as follows:

>>> response = session.send(prepare, stream=True, verify=True)
>>> response.status_code
200

Voila! Huge time saving! The prepare method prepares the complete request with the supplied parameters. In the previous example, the prepare_request method was used. There are also some other methods, like prepare_auth, prepare_body, prepare_cookies, prepare_headers, prepare_hooks, prepare_method, and prepare_url, which are used to prepare the individual properties.

Verifying an SSL certificate with Requests

Requests provides the facility to verify the SSL certificate for HTTPS requests. We can use the verify argument to check whether the host's SSL certificate is verified or not. Let us consider a website which has got no SSL certificate. We shall send a GET request with the argument verify to it. The syntax to send the request is as follows:

requests.get('no ssl certificate site', verify=True)

As the website doesn't have an SSL certificate, it will result in an error similar to the following:

requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))

Let us verify the SSL certificate for a website which is certified. Consider the following example:

>>> requests.get('https://python.org', verify=True)
<Response [200]>

In the preceding example, the result was 200, as the mentioned website is SSL certified. If we do not want to verify the SSL certificate with a request, then we can pass the argument verify=False. By default, the value of verify is True.

Body content workflow

Take an instance where a continuous stream of data is being downloaded when we make a request. In this situation, the client has to listen to the server continuously until it receives the complete data. Consider also the case of accessing the content from the response first, and worrying about the body afterwards. In the above two situations, we can use the parameter stream. Let us look at an example:

>>> requests.get("https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz", stream=True)

If we make a request with the parameter stream=True, the connection remains open and only the headers of the response will be downloaded. This gives us the capability to fetch the content whenever we need, by specifying conditions like the number of bytes of data:

if int(response.headers['content-length']) < TOO_LONG:
    content = response.content

By setting the parameter stream=True, and by accessing the response as a file-like object, that is, response.raw, we can use the method iter_content to iterate over the response data in chunks. This avoids reading larger responses at once. The syntax is as follows:

iter_content(chunk_size=size in bytes, decode_unicode=False)

In the same way, we can iterate through the content using the iter_lines method, which will iterate over the response data one line at a time.
The syntax is as follows: iter_lines(chunk_size=size in bytes, decode_unicode=None, delimiter=None) The important thing to note while using the stream parameter is that setting it to True does not release the connection unless all the data is consumed or response.close is executed.

Keep-alive facility
As urllib3 supports the reuse of the same socket connection for multiple requests, we can send many requests with one socket and receive the responses using the keep-alive feature of the Requests library. Within a session, this is automatic; every request made within a session automatically uses the appropriate connection by default. The connection that is being used will be released after all the data from the body is read.

Streaming uploads
A file-like object of massive size can be streamed and uploaded using the Requests library. All we need to do is supply the contents of the stream as a value to the data attribute in the request call, as shown in the following lines. The syntax is as follows:

with open('massive-body', 'rb') as file:
    requests.post('http://example.com/some/stream/url',
                  data=file)

Using a generator for sending chunk-encoded Requests
Chunked transfer encoding is a mechanism for transferring data in an HTTP request. With this mechanism, the data is sent in a series of chunks. Requests supports chunked transfer encoding for both outgoing and incoming requests. In order to send a chunk-encoded request, we need to supply a generator for the body. The usage is shown in the following example:

>>> def generator():
...     yield "Hello "
...     yield "World!"
...
>>> requests.post('http://example.com/some/chunked/url/path',
...               data=generator())

Getting the request method arguments with event hooks
We can alter portions of the request process with signal event handling using hooks. For example, there is a hook named response which contains the response generated from a request. Hooks are passed as a dictionary parameter to the request. The syntax is as follows: hooks = {hook_name: callback_function, … } The callback_function parameter may or may not return a value. When it returns a value, it is assumed that it is to replace the data that was passed in. If the callback function doesn't return any value, there won't be any effect on the data. Here is an example of a callback function (note that the response hook receives the Response object as its first argument):

>>> def print_attributes(response, *args, **kwargs):
...     print(response.url)
...     print(response.status_code)
...     print(response.headers)

If there is an error in the execution of callback_function, you'll receive a warning message in the standard output. Now let us print some of the attributes of the response, using the preceding callback function:

>>> requests.get('https://www.python.org/',
...              hooks=dict(response=print_attributes))
https://www.python.org/
200
CaseInsensitiveDict({'content-type': 'text/html; ...})
<Response [200]>

Iterating over a streaming API
A streaming API tends to keep the request open, allowing us to collect the stream data in real time. While dealing with a continuous stream of data, to ensure that no messages are missed, we can take the help of iter_lines() in Requests. iter_lines() iterates over the response data line by line. This can be achieved by setting the parameter stream to True while sending the request. It's better to keep in mind that it's not always safe to call the iter_lines() function, as it may result in a loss of received data.
Consider the following example taken from http://docs.python-requests.org/en/latest/user/advanced/#streaming-requests:

>>> import json
>>> import requests
>>> r = requests.get('http://httpbin.org/stream/4', stream=True)
>>> for line in r.iter_lines():
...     if line:
...         print(json.loads(line))

In the preceding example, the response contains a stream of data. With the help of iter_lines(), we print the data by iterating through every line.

Encodings
As specified in the HTTP protocol (RFC 7230), applications can request the server to return the HTTP responses in an encoded format. The process of encoding turns the response content into an understandable format which makes it easy to access. When the HTTP header fails to return the type of encoding, Requests will try to assume the encoding with the help of chardet. If we access the response headers of a request, they contain a content-type key. Let us look at a response header's content-type:

>>> re = requests.get('http://google.com')
>>> re.headers['content-type']
'text/html; charset=ISO-8859-1'

In the preceding example, the content type contains 'text/html; charset=ISO-8859-1'. This happens when Requests finds the charset value to be None and the content-type value to be text: it follows the protocol RFC 7230 and changes the value of charset to ISO-8859-1 in this type of situation. In case we are dealing with different types of encodings, like 'utf-8', we can explicitly specify the encoding by setting the Response.encoding property.

HTTP verbs
Requests supports the usage of the full range of HTTP verbs, which are described in the following list. For most of the supported verbs, url is the only argument that must be passed while using them.

GET: The GET method requests a representation of the specified resource. Apart from retrieving the data, there will be no other effect of using this method. Definition is given as requests.get(url, **kwargs).
POST: The POST verb is used for the creation of new resources. The submitted data will be handled by the server to a specified resource. Definition is given as requests.post(url, data=None, json=None, **kwargs).
PUT: This method uploads a representation of the specified URI. If the URI is not pointing to any resource, the server can create a new object with the given data or it will modify the existing resource. Definition is given as requests.put(url, data=None, **kwargs).
DELETE: This is pretty easy to understand. It is used to delete the specified resource. Definition is given as requests.delete(url, **kwargs).
HEAD: This verb is useful for retrieving meta-information written in response headers without having to fetch the response body. Definition is given as requests.head(url, **kwargs).
OPTIONS: OPTIONS is an HTTP method which returns the HTTP methods that the server supports for a specified URL. Definition is given as requests.options(url, **kwargs).
PATCH: This method is used to apply partial modifications to a resource. Definition is given as requests.patch(url, data=None, **kwargs).

Self-describing the APIs with link headers
Take the case of accessing a resource in which the information is accommodated in different pages. If we need to approach the next page of the resource, we can make use of the link headers. The link headers contain the metadata of the requested resource, that is, the next page information in our case.
>>> url = "https://api.github.com/search/code?q=addClass+user:mozilla&page=1&per_page=4"
>>> response = requests.head(url=url)
>>> response.headers['link']
'<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2&per_page=4>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=250&per_page=4>; rel="last"'

In the preceding example, we specified in the URL that we want to access page number one and that it should contain four records. Requests automatically parses the link headers and updates the information about the next page. When we access the link header, it shows the output with the values of the page and the number of records per page.

Transport Adapter
A transport adapter provides an interface for Requests sessions to connect with HTTP and HTTPS. It allows us to configure requests to fit the web service we opt to use. Requests ships with a transport adapter called HTTPAdapter. Consider the following example:

>>> session = requests.Session()
>>> adapter = requests.adapters.HTTPAdapter(max_retries=6)
>>> session.mount("http://google.co.in", adapter)

In this example, we created a request session in which every request whose URL starts with http://google.co.in retries up to six times when the connection fails.

Summary
In this article, we learnt about creating sessions and using the session with different criteria. We also looked deeply into HTTP verbs and using proxies. We learnt about streaming requests, dealing with SSL certificate verification, and streaming responses. We also got to know how to use prepared requests, link headers, and chunk-encoded requests.

Resources for Article:
Further resources on this subject: Machine Learning [article] Solving problems – closest good restaurant [article] Installing NumPy, SciPy, matplotlib, and IPython [article]
Creating and Using Composer Packages

Packt
29 Oct 2013
7 min read
(For more resources related to this topic, see here.)

Using Bundles
One of the great features in Laravel is the ease with which we can include the class libraries that others have made using bundles. On the Laravel site, there are already many useful bundles, some of which automate certain tasks while others easily integrate with third-party APIs. A recent addition to the PHP world is Composer, which allows us to use libraries (or packages) that aren't specific to Laravel. In this article, we'll get up and running with using bundles, and we'll even create our own bundle that others can download. We'll also see how to incorporate Composer into our Laravel installation to open up a wide range of PHP libraries that we can use in our application.

Downloading and installing packages
One of the best features of Laravel is how modular it is. Most of the framework is built using libraries, or packages, that are well tested and widely used in other projects. By using Composer for dependency management, we can easily include other packages and seamlessly integrate them into our Laravel app. For this recipe, we'll be installing two popular packages into our app: Jeffrey Way's Laravel 4 Generators and the Imagine image processing packages.

Getting ready
For this recipe, we need a standard installation of Laravel using Composer.

How to do it...
For this recipe, we will follow these steps:

1. Go to https://packagist.org/.
2. In the search box, search for way generator as shown in the following screenshot:
3. Click on the link for way/generators:
4. View the details at https://packagist.org/packages/way/generators and take notice of the require line to get the package's version. For our purposes, we'll use "way/generators": "1.0.*".
5. In our application's root directory, open up the composer.json file and add the package to the require section so it looks like this: "require": { "laravel/framework": "4.0.*", "way/generators": "1.0.*" },
6. Go back to http://packagist.org and perform a search for imagine as shown in the following screenshot:
7. Click on the link to imagine/imagine and copy the require code for dev-master:
8. Go back to our composer.json file and update the require section to include the imagine package. It should now look similar to the following code: "require": { "laravel/framework": "4.0.*", "way/generators": "1.0.*", "imagine/imagine": "dev-master" },
9. Open the command line, and in the root of our application, run the Composer update as follows: php composer.phar update
10. Finally, we'll add the Generator Service Provider, so open the app/config/app.php file and in the providers array, add the following line: 'Way\Generators\GeneratorsServiceProvider'

How it works...
To get our package, we first go to packagist.org and search for the package we want. We could also click on the Browse packages link. It will display a list of the most recent packages as well as the most popular. After clicking on the package we want, we'll be taken to the detail page, which lists various links including the package's repository and home page. We could also click on the package's maintainer link to see other packages they have released. Underneath, we'll see the various versions of the package. If we open that version's detail page, we'll find the code we need to use for our composer.json file. We could either choose to use a strict version number, add a wildcard to the version, or use dev-master, which will install whatever is updated on the package's master branch.
For the Generators package, we'll only use Version 1.0, but allow any minor fixes to that version. For the imagine package, we'll use dev-master, so whatever is in their repository's master branch will be downloaded, regardless of version number. We then run update on Composer and it will automatically download and install all of the packages we chose. Finally, to use Generators in our app, we need to register the service provider in our app's config file.

Using the Generators package to set up an app
Generators is a popular Laravel package that automates quite a bit of file creation. In addition to controllers and models, it can also generate views, migrations, seeds, and more, all through a command-line interface.

Getting ready
For this recipe, we'll be using the Laravel 4 Generators package maintained by Jeffrey Way that was installed in the Downloading and installing packages recipe. We'll also need a properly configured MySQL database.

How to do it…
Follow these steps for this recipe:

1. Open the command line in the root of our app and, using the generator, create a scaffold for our cities as follows: php artisan generate:scaffold cities --fields="city:string"
2. In the command line, create a scaffold for our superheroes as follows: php artisan generate:scaffold superheroes --fields="name:string, city_id:integer:unsigned"
3. In our project, look in the app/database/seeds directory and find a file named CitiesTableSeeder.php. Open it and add some data to the $cities array as follows:

<?php

class CitiesTableSeeder extends Seeder {

    public function run()
    {
        DB::table('cities')->delete();

        $cities = array(
            array('id' => 1, 'city' => 'New York', 'created_at' => date('Y-m-d g:i:s', time())),
            array('id' => 2, 'city' => 'Metropolis', 'created_at' => date('Y-m-d g:i:s', time())),
            array('id' => 3, 'city' => 'Gotham', 'created_at' => date('Y-m-d g:i:s', time()))
        );

        DB::table('cities')->insert($cities);
    }
}

4. In the app/database/seeds directory, open SuperheroesTableSeeder.php and add some data to it:

<?php

class SuperheroesTableSeeder extends Seeder {

    public function run()
    {
        DB::table('superheroes')->delete();

        $superheroes = array(
            array('name' => 'Spiderman', 'city_id' => 1, 'created_at' => date('Y-m-d g:i:s', time())),
            array('name' => 'Superman', 'city_id' => 2, 'created_at' => date('Y-m-d g:i:s', time())),
            array('name' => 'Batman', 'city_id' => 3, 'created_at' => date('Y-m-d g:i:s', time())),
            array('name' => 'The Thing', 'city_id' => 1, 'created_at' => date('Y-m-d g:i:s', time()))
        );

        DB::table('superheroes')->insert($superheroes);
    }
}

5. In the command line, run the migration, then seed the database as follows: php artisan migrate php artisan db:seed
6. Open up a web browser and go to http://{your-server}/cities. We will see our data as shown in the following screenshot:
7. Now, navigate to http://{your-server}/superheroes and we will see our data as shown in the following screenshot:

How it works...
We begin by running the scaffold generator for our cities and superheroes tables. Using the --fields tag, we can determine which columns we want in our table and also set options such as data type. For our cities table, we'll only need the name of the city. For our superheroes table, we'll want the name of the hero as well as the ID of the city where they live. When we run the generator, many files will automatically be created for us.
For example, with cities, we'll get City.php in our models, CitiesController.php in controllers, and a cities directory in our views with the index, show, create, and edit views. We then get a migration named Create_cities_table.php, a CitiesTableSeeder.php seed file, and CitiesTest.php in our tests directory. We'll also have our DatabaseSeeder.php file and our routes.php file updated to include everything we need. To add some data to our tables, we opened the CitiesTableSeeder.php file and updated our $cities array with arrays that represent each row we want to add. We did the same thing for our SuperheroesTableSeeder.php file. Finally, we run the migrations and seeders, and our database is created with all the data inserted. The Generators package has already created the views and controllers we need to manipulate the data, so we can easily go to our browser and see all of our data. We can also create new rows, update existing rows, and delete rows.

Using Underscore.js with Collections

Packt
01 Oct 2015
21 min read
In this article by Alex Pop, the author of the book Learning Underscore.js, we will explore Underscore functionality for collections using more in-depth examples. Some of the more advanced concepts related to Underscore functions, such as scope resolution and execution context, will be explained. The topics of the article are as follows: key Underscore functions revisited, and searching and filtering. This article assumes that you are familiar with JavaScript fundamentals such as prototypical inheritance and the built-in data types. The source code for the examples from this article is hosted online at https://github.com/popalexandruvasile/underscorejs-examples/tree/master/collections, and you can execute the examples using the Cloud9 IDE at the address https://ide.c9.io/alexpop/underscorejs-examples from the collections folder.

(For more resources related to this topic, see here.)

Key Underscore functions – each, map, and reduce
This flexible approach means that some Underscore functions can operate over collections: an Underscore-specific term for arrays, array-like objects, and objects (where the collection represents the object properties). We will refer to the elements within these collections as collection items. By providing functions that operate over object properties, Underscore expands JavaScript's reflection-like capabilities. Reflection is a programming feature for examining the structure of a computer program, especially during program execution. JavaScript is a dynamic language without static type system support (as of ES6). This makes it convenient to use a technique named duck typing when working with objects that share similar behaviors. Duck typing is a programming technique used in dynamic languages where objects are identified through their structure, represented by properties and methods, rather than their type (the name of duck typing is derived from the phrase "if it walks like a duck, swims like a duck, and quacks like a duck, then it is a duck"). Underscore itself uses duck typing to assert that an object is an array by checking for a property called length of type Number.

Applying reflection techniques
We will build an example that demonstrates duck typing and reflection techniques through a function that will extract object properties so that they can be persisted to a relational database. Usually, a relational database stores objects represented as data rows with column types that map to regular SQL data types. We will use the _.each() function to iterate over object properties and extract those of type boolean, number, string, and Date, as they can be easily mapped to SQL data types, and ignore everything else:

var propertyExtractor = (function() {
  "use strict";
  return {
    extractStorableProperties: function(source) {
      var storableProperties = {};
      if (!source || source.id !== +source.id) {
        return storableProperties;
      }
      _.each(source, function(value, key) {
        var isDate = typeof value === 'object' && value instanceof Date;
        if (isDate || typeof value === 'boolean' || typeof value === 'number' || typeof value === 'string') {
          storableProperties[key] = value;
        }
      });
      return storableProperties;
    }
  };
}());

You can find the example in the propertyExtractor.js file within the each-with-properties-and-context folder from the source code for this article. The first highlighted code snippet checks whether the object passed to the extractStorableProperties() function has a property called id that is a number.
The + sign converts the id property to a number, and the non-identity operator !== compares the result of this conversion with the unconverted original value. The non-identity operator returns true only if the type of the compared objects is different, or they are of the same type and have different values. This was a duck typing technique used by Underscore up until version 1.7 to assert whether it deals with an array-like instance or an object instance in its collection-related functions. Underscore collection-related functions operate over array-like objects, as they do not strictly check for the built-in Array object. These functions can also work with the arguments object or the HTML DOM NodeList objects. The last highlighted code snippet is the _.each() function that operates over object properties using an iteration function that receives the property value as its first argument and the property name as the optional second argument. If a property has a null or undefined value, it will not appear in the returned object. The extractStorableProperties() function will return a new object with all the storable properties. The return value is used in the test specifications to assert that, given a sample object, the function behaves as expected:

describe("Given propertyExtractor", function() {
  describe("when calling extractStorableProperties()", function() {
    var storableProperties;
    beforeEach(function() {
      var source = {
        id: 2,
        name: "Blue lamp",
        description: null,
        ui: undefined,
        price: 10,
        purchaseDate: new Date(2014, 10, 1),
        isInUse: true,
      };
      storableProperties = propertyExtractor.extractStorableProperties(source);
    });
    it("then the property count should be correct", function() {
      expect(Object.keys(storableProperties).length).toEqual(5);
    });
    it("then the 'price' property should be correct", function() {
      expect(storableProperties.price).toEqual(10);
    });
    it("then the 'description' property should not be defined", function() {
      expect(storableProperties.description).toEqual(undefined);
    });
  });
});

Notice how we used the propertyExtractor global instance to access the function under test, and then we used the ES5 function Object.keys to assert that the number of returned properties has the correct size. In a production-ready application, we need to ensure, among other best practices, that global object names do not clash. You can find the test specifications in the spec/propertyExtractorSpec.js file and execute them by browsing the SpecRunner.html file from the example source code folder. There is also an index.html file that will display the results of the example rendered in the browser using the index.js file.

Manipulating the this variable
Many Underscore functions have a similar signature with _.each(list, iteratee, [context]), where the optional context parameter will be used to set the this value for the iteratee function when it is called for each collection item. In JavaScript, the built-in this variable will be different depending on the context where it is used. When the this variable is used in the global scope context, and in a browser environment, it will return the native window object instance. If this is used in a function scope, then the variable will have different values: if the function is an object method or an object constructor, then this will return the current object instance.
Here is a short example code for this scenario:

var item1 = {
  id: 1,
  name: "Item1",
  getInfo: function() {
    return "Object: " + this.id + "-" + this.name;
  }
};
console.log(item1.getInfo()); // -> "Object: 1-Item1"

If the function does not belong to an object, then this will be undefined in the JavaScript strict mode. In the non-strict mode, this will return its global scope value. With a library such as Underscore that favors a functional style, we need to ensure that the functions used as parameters are using the this variable correctly. Let's assume that you have a function that references this (maybe it was used as an object method) and you want to use it with one of the Underscore functions such as _.each(). You can still use the function as is and provide the desired this value as the context parameter value when calling _.each(). I have rewritten the previous example function to showcase the use of the context parameter:

var propertyExtractor = (function() {
  "use strict";
  return {
    extractStorablePropertiesWithThis: function(source) {
      var storableProperties = {};
      if (!source || source.id !== +source.id) {
        return storableProperties;
      }
      _.each(source, function(value, key) {
        var isDate = typeof value === 'object' && value instanceof Date;
        if (isDate || typeof value === 'boolean' || typeof value === 'number' || typeof value === 'string') {
          this[key] = value;
        }
      }, storableProperties);
      return storableProperties;
    }
  };
}());

The first highlighted snippet shows the use of this, which is typical for an object method. The last highlighted snippet shows the context parameter value that this was set to. The storableProperties value will be passed as this for each iteratee function call. The test specifications for this example are identical with the previous example, and you can find them in the same folder, each-with-properties-and-context, from the source code for this article. You can use the optional context parameter in many of the Underscore functions where applicable, and it is a useful technique when working with functions that rely on a specific this value.

Using map and reduce with object properties
In the previous example, we had some user interface-specific code in the index.js file that was tasked with displaying the results of the propertyExtractor.extractStorableProperties() call in the browser. Let's pull this functionality into another example and imagine that we need a new function that, given an object, will transform its properties into a format suitable for displaying in a browser by returning an array of formatted text for each property. To achieve this, we will use the Underscore _.map() function over object properties as demonstrated in the next example:

var propertyFormatter = (function() {
  "use strict";
  return {
    extractPropertiesForDisplayAsArray: function(source) {
      if (!source || source.id !== +source.id) {
        return [];
      }
      return _.map(source, function(value, key) {
        var isDate = typeof value === 'object' && value instanceof Date;
        if (isDate || typeof value === 'boolean' || typeof value === 'number' || typeof value === 'string') {
          return "Property: " + key + " of type: " + typeof value + " has value: " + value;
        }
        return "Property: " + key + " cannot be displayed.";
      });
    }
  };
}());

With Underscore, we can write compact and expressive code that manipulates these properties with little effort.
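As a minimal standalone sketch of the same idea (the object literal here is just illustrative data), _.map() treats an object as a collection of its property values and passes each value and key to the iteratee, returning a new array:

var pairs = _.map({ id: 2, name: "Blue lamp" }, function(value, key) {
  // build a "key: value" string for each property
  return key + ": " + value;
});
// -> ["id: 2", "name: Blue lamp"]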
The test specifications for the extractPropertiesForDisplayAsArray() function are using Jasmine regular expression matchers to assert the test conditions in the highlighted code snippets from the following example:

describe("Given propertyFormatter", function() {
  describe("when calling extractPropertiesForDisplayAsArray()", function() {
    var propertiesForDisplayAsArray;
    beforeEach(function() {
      var source = {
        id: 2,
        name: "Blue lamp",
        description: null,
        ui: undefined,
        price: 10,
        purchaseDate: new Date(2014, 10, 1),
        isInUse: true,
      };
      propertiesForDisplayAsArray = propertyFormatter.extractPropertiesForDisplayAsArray(source);
    });
    it("then the returned property count should be correct", function() {
      expect(propertiesForDisplayAsArray.length).toEqual(7);
    });
    it("then the 'price' property should be displayed", function() {
      expect(propertiesForDisplayAsArray[4]).toMatch("price.+10");
    });
    it("then the 'description' property should not be displayed", function() {
      expect(propertiesForDisplayAsArray[2]).toMatch("cannot be displayed");
    });
  });
});

The following example shows how _.reduce() is used to manipulate object properties. This will transform the properties of an object into a format suitable for browser display by returning a string value that contains all the properties in a convenient format:

extractPropertiesForDisplayAsString: function(source) {
  if (!source || source.id !== +source.id) {
    return [];
  }
  return _.reduce(source, function(memo, value, key) {
    if (memo && memo !== "") {
      memo += "<br/>";
    }
    var isDate = typeof value === 'object' && value instanceof Date;
    if (isDate || typeof value === 'boolean' || typeof value === 'number' || typeof value === 'string') {
      return memo + "Property: " + key + " of type: " + typeof value + " has value: " + value;
    }
    return memo + "Property: " + key + " cannot be displayed.";
  }, "");
}

The example is almost identical with the previous one, with the exception of the memo accumulator used to build the returned string value. The test specifications for the extractPropertiesForDisplayAsString() function are using a regular expression matcher and can be found in the spec/propertyFormatterSpec.js file:

describe("when calling extractPropertiesForDisplayAsString()", function() {
  var propertiesForDisplayAsString;
  beforeEach(function() {
    var source = {
      id: 2,
      name: "Blue lamp",
      description: null,
      ui: undefined,
      price: 10,
      purchaseDate: new Date(2014, 10, 1),
      isInUse: true,
    };
    propertiesForDisplayAsString = propertyFormatter.extractPropertiesForDisplayAsString(source);
  });
  it("then the returned string has expected length", function() {
    expect(propertiesForDisplayAsString.length).toBeGreaterThan(0);
  });
  it("then the 'price' property should be displayed", function() {
    expect(propertiesForDisplayAsString).toMatch("<br/>Property: price of type: number has value: 10<br/>");
  });
});

The examples from this subsection can be found within the map.reduce-with-properties folder from the source code for this article.

Searching and filtering
The _.find(list, predicate, [context]) function is part of Underscore's comprehensive functionality for searching and filtering collections represented by object properties and array-like objects. We will make a distinction between search and filter functions, with the former tasked with finding one item in a collection and the latter tasked with retrieving a subset of the collection (although sometimes you will find the distinction between these functions thin and blurry).
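Before moving to the richer examples below, here is a minimal sketch of that distinction on a plain array (the numbers are purely illustrative):

var isEven = function(n) { return n % 2 === 0; };
_.find([1, 2, 3, 4], isEven);   // -> 2 (searching: the first match only)
_.filter([1, 2, 3, 4], isEven); // -> [2, 4] (filtering: every match)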
We will revisit the find function and the other search- and filtering-related functions using an example with slightly more diverse data that is suitable for database persistence. We will use the problem domain of a bicycle rental shop and build an array of bicycle objects with the following structure:

var getBicycles = function() {
  return [{
    id: 1,
    name: "A fast bike",
    type: "Road Bike",
    quantity: 10,
    rentPrice: 20,
    dateAdded: new Date(2015, 1, 2)
  }, {
    ...
  }, {
    id: 12,
    name: "A clown bike",
    type: "Children Bike",
    quantity: 2,
    rentPrice: 12,
    dateAdded: new Date(2014, 11, 1)
  }];
};

Each bicycle object has an id property, and we will use the propertyFormatter object built in the previous section to display the example results in the browser for your convenience. The code was shortened here for brevity (you can find its full version alongside the other examples from this section within the searching and filtering folders from the source code for this article). All the examples are covered by tests, and these are the recommended starting points if you want to explore them in detail.

Searching
For the first example of this section, we will define a bicycle-related requirement where we need to search for a bicycle of a specific type and with a rental price under a maximum value. Compared to the previous _.find() example, we will start by writing the test specifications first for the functionality that is yet to be implemented. This is a test-driven development approach where we define the acceptance criteria for the function under test first, followed by the actual implementation. Writing the tests first forces us to think about what the code should do, rather than how it should do it, and this helps eliminate waste by writing only the code required to make the tests pass.

Underscore find
The test specifications for our initial requirement are as follows:

describe("Given bicycleFinder", function() {
  describe("when calling findBicycle()", function() {
    var bicycle;
    beforeEach(function() {
      bicycle = bicycleFinder.findBicycle("Urban Bike", 16);
    });
    it("then it should return an object", function() {
      expect(bicycle).toBeDefined();
    });
    it("then the 'type' property should be correct", function() {
      expect(bicycle.type).toEqual("Urban Bike");
    });
    it("then the 'rentPrice' property should be correct", function() {
      expect(bicycle.rentPrice).toEqual(15);
    });
  });
});

The highlighted function call bicycleFinder.findBicycle() should return one bicycle object of the expected type and price, as asserted by the tests. Here is the implementation that satisfies the test specifications:

var bicycleFinder = (function() {
  "use strict";
  var getBicycles = function() {
    return [{
      id: 1,
      name: "A fast bike",
      type: "Road Bike",
      quantity: 10,
      rentPrice: 20,
      dateAdded: new Date(2015, 1, 2)
    }, {
      ...
    }, {
      id: 12,
      name: "A clown bike",
      type: "Children Bike",
      quantity: 2,
      rentPrice: 12,
      dateAdded: new Date(2014, 11, 1)
    }];
  };
  return {
    findBicycle: function(type, maxRentPrice) {
      var bicycles = getBicycles();
      return _.find(bicycles, function(bicycle) {
        return bicycle.type === type && bicycle.rentPrice <= maxRentPrice;
      });
    }
  };
}());

The code returns the first bicycle that satisfies the search criteria, ignoring the rest of the bicycles that might meet the same criteria. You can browse the index.html file from the searching folder within the source code for this article to see the result of calling the bicycleFinder.findBicycle() function displayed in the browser via the propertyFormatter object.
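As a quick usage sketch (the expected values below come from the test specifications above, since the full sample data is shortened here), calling the function directly looks like this:

var bicycle = bicycleFinder.findBicycle("Urban Bike", 16);
console.log(bicycle.type);      // -> "Urban Bike"
console.log(bicycle.rentPrice); // -> 15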
Underscore some
There is a closely related function to _.find() with the signature _.some(list, [predicate], [context]). This function will return true if at least one item of the list collection satisfies the predicate function. The predicate parameter is optional, and if it is not specified, the _.some() function will return true if at least one item of the collection is not null. This makes the function a good candidate for implementing guard clauses. A guard clause is a function that ensures that a variable (usually a parameter) satisfies a specific condition before it is used any further. The next example shows how _.some() is used to perform checks that are typical for a guard clause:

var list1 = [];
var list2 = [null, , undefined, {}];
var object1 = {};
var object2 = { property1: null, property3: true };
if (!_.some(list1) && !_.some(object1)) {
  alert("Collections list1 and object1 are not valid when calling _.some() over them.");
}
if (_.some(list2) && _.some(object2)) {
  alert("Collections list2 and object2 have at least one valid item and they are valid when calling _.some() over them.");
}

If you execute this code in a browser, you will see both alerts being displayed. The first alert gets triggered when an empty array or an object without any properties defined is found. The second alert appears when we have an array with at least one element that is not null and not undefined, or when we have an object that has at least one property that evaluates as true. Going back to our bicycle data, we will define a new requirement to showcase the use of _.some() in this context. We will implement a function that will ensure that we can find at least one bicycle of a specific type and with a maximum rent price. The code is very similar to the bicycleFinder.findBicycle() implementation, with the difference that the new function returns true if the specific bicycle is found (rather than the actual object):

hasBicycle: function(type, maxRentPrice) {
  var bicycles = getBicycles();
  return _.some(bicycles, function(bicycle) {
    return bicycle.type === type && bicycle.rentPrice <= maxRentPrice;
  });
}

You can find the test specifications for this function in the spec/bicycleFinderSpec.js file from the searching example folder.

Underscore findWhere
Another function similar to _.find() has the signature _.findWhere(list, properties). This compares the property key-value pairs of each collection item from list with the property key-value pairs found on the properties object parameter. Usually, the properties parameter is an object literal that contains a subset of the properties of a collection item. The _.findWhere() function is useful when we need to extract a collection item matching an exact value, compared to _.find(), which can extract a collection item that matches a range of values or more complex criteria. To showcase the function, we will implement a requirement that needs to search for a bicycle that has a specific id value.
This is what the test specifications look like:

describe("when calling findBicycleById()", function() {
  var bicycle;
  beforeEach(function() {
    bicycle = bicycleFinder.findBicycleById(6);
  });
  it("then it should return an object", function() {
    expect(bicycle).toBeDefined();
  });
  it("then the 'id' property should be correct", function() {
    expect(bicycle.id).toEqual(6);
  });
});

And the next code snippet from the bicycleFinder.js file contains the actual implementation:

findBicycleById: function(id) {
  var bicycles = getBicycles();
  return _.findWhere(bicycles, { id: id });
}

Underscore contains
In a similar vein to the _.some() function, there is a _.contains(list, value) function that will return true if there is at least one item from the list collection that is equal to the value parameter. The equality check is based on the strict comparison operator ===, where the operands will be checked for both type and value equality. We will implement a function that checks whether a bicycle with a specific id value exists in our collection:

hasBicycleWithId: function(id) {
  var bicycles = getBicycles();
  var bicycleIds = _.pluck(bicycles, "id");
  return _.contains(bicycleIds, id);
}

Notice how the _.pluck(list, propertyName) function was used to create an array that stores the id property value for each collection item. In its implementation, _.pluck() is actually using _.map(), acting like a shortcut function for it.

Filtering
As we mentioned at the beginning of this section, Underscore provides powerful filtering functions, which are usually tasked with working on a subsection of a collection. We will reuse the same example data as before, and we will build some new functions to explore this functionality.

Underscore filter
We will start by defining a new requirement for our data where we need to build a function that retrieves all bicycles of a specific type and with a maximum rent price. This is what the test specifications look like for the yet-to-be-implemented function bicycleFinder.filterBicycles(type, maxRentPrice):
Underscore where Underscore defines a shortcut function for _.filter() which is _.where(list, properties). This function is similar to the _.findWhere() function, and it uses the properties object parameter to compare and retrieve all the items from the list collection with matching properties. To showcase the function, we defined a new requirement for our example data where we need to retrieve all bicycles of a specific type. This is the code that implements the requirement: filterBicyclesByType: function(type) { var bicycles = getBicycles(); return _.where(bicycles, { type: type }); } By using _.where(), we are in fact using a more compact and expressive version of _.filter() in scenarios where we need to perform exact value matches. Underscore reject and partition Underscore provides a useful function which is the opposite for _.filter() and has a similar signature: _.reject(list, predicate, [context]). Calling the function will return an array of values from the list collection that do not satisfy the predicate function. To show its usage we will implement a function that retrieves all bicycles with a rental price less than or equal with a given value. Here is the function implementation: getAllBicyclesForSetRentPrice: function(setRentPrice) { var bicycles = getBicycles(); return _.reject(bicycles, function(bicycle) { return bicycle.rentPrice > setRentPrice; }); } Using the _.filter() function alongside the _.reject() function with the same list collection and predicate function will allow us to partition the collection in two arrays. One array holds items that do satisfy the predicate function while the other holds items that do not satisfy the predicate function. Underscore has a more convenient function that achieves the same result and this is _.partition(list, predicate). It returns an array that has two array elements: the first has the values that would be returned by calling _.filter() using the same input parameters and the second has the values for calling _.reject(). Underscore every We mentioned _.some() as being a great function for implementing guard clauses. It is also worth mentioning another closely related function _.every(list, [predicate], [context]). The function will check every item of the list collection and will return true if every item satisfies the predicate function or if list is null, undefined or empty. If the predicate function is not specified the value of each item will be evaluated instead. 
If we use the same data from the guard clause example for _.some() we will get the opposite results as shown in the next example: var list1 = []; var list2 = [null, , undefined, {}]; var object1 = {}; var object2 = { property1: null, property3: true }; if (_.every(list1) && _.every(object1)) { alert("Collections list1 and object1 are valid when calling _.every() over them."); } if(!_.every(list2) && !_.every(object2)){ alert("Collections list2 and object2 do not have all items valid so they are not valid when calling _.every() over them."); } To ensure a collection is not null, undefined, or empty and each item is also not null or undefined we should use both _.some() and _.every() as part of the same check as shown in the next example: var list1 = [{}]; var object1 = { property1: {}}; if (_.every(list1) && _.every(object1) && _.some(list1) && _.some(object1)) { alert("Collections list1 and object1 are valid when calling both _some() and _.every() over them."); } If the list1 object is an empty array or an empty object literal calling _.every() for it returns true while calling _some() returns false hence the need to use both functions when validating a collection. These code examples demonstrate how you can build your own guard clauses or data validation rules by using simple Underscore functions. Summary In this article, we explored many of the collection specific functions provided by Underscore and demonstrated additional functionality. We continued with searching and filtering functions. Resources for Article: Further resources on this subject: Packaged Elegance[article] Marshalling Data Services with Ext.Direct[article] Understanding and Developing Node Modules [article]
Exploring Performance Issues in Node.js/Express Applications

Packt
11 May 2016
16 min read
Node.js is an exciting new platform for developing web applications, application servers, any sort of network server or client, and general-purpose programming. It is designed for extreme scalability in networked applications through an ingenious combination of server-side JavaScript, asynchronous I/O, and asynchronous programming, built around JavaScript anonymous functions and a single-threaded event-driven architecture. Companies large and small are adopting Node.js; for example, PayPal is one of the companies converting its application stack over to Node.js. An up-and-coming leading application model, the MEAN stack, combines MongoDB (or MySQL) with Express, AngularJS and, of course, Node.js. A look through current job listings demonstrates how important the MEAN stack and Node.js in general have become.

It's claimed that Node.js is a lean, low-overhead software platform. The excellent performance is supposedly because Node.js eschews the accepted wisdom of more traditional platforms, such as JavaEE and its complexity. Instead of relying on a thread-oriented architecture to fill the CPU cores of the server, Node.js has a simple single-threaded architecture, avoiding the overhead and complexity of threads. Using threads to implement concurrency often comes with admonitions like these: "expensive and error-prone", "the error-prone synchronization primitives of Java", or "designing concurrent software can be complex and error-prone". The complexity comes from the access to shared variables and various strategies to avoid deadlock and competition between threads. The synchronization primitives of Java are an example of such a strategy, and obviously many programmers find them hard to use. There's the tendency to create frameworks, such as java.util.concurrent, to tame the complexity of threaded concurrency, but some might argue that papering over complexity does not make things simpler.

Adopting Node.js is not a magic wand that will instantly make performance problems disappear forever. The development team must approach this intelligently, or else you'll end up with one core on the server running flat out and the other cores twiddling their thumbs. Your manager will want to know how you're going to fully utilize the server hardware. And, because Node.js is single-threaded, your code must return from event handlers quickly, or else your application will be frequently blocked and will provide poor performance. Your manager will want to know how you'll deliver the promised high transaction rate.

In this article by David Herron, author of the book Node.JS Web Development - Third Edition, we will explore this issue. We'll write a program with an artificially heavy computational load. The naive Fibonacci function we'll use is elegantly simple, but is extremely recursive and can take a long time to compute.

(For more resources related to this topic, see here.)

Node.js installation
Before launching into writing code, we need to install Node.js on our laptop. Proceed to the Node.js downloads page by going to http://nodejs.org/ and clicking on the downloads link. It's preferable if you can install Node.js from the package management system for your computer. While the Downloads page offers precompiled binary Node.js packages for popular computer systems (Windows, Mac OS X, Linux, and so on), installing from the package management system makes it easier to update the install later. The Downloads page has a link to instructions for using package management systems to install Node.js.
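For example, on Mac OS X with the Homebrew package manager already installed, the installation is a single command (package names can vary between package managers, so check the instructions linked from the Downloads page for your system):

$ brew install node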
Once you've installed Node.js, you can quickly test it by running a couple of commands: $ node --help This prints out helpful information about using the Node.js command-line tool: $ npm help npm is the default package management system for Node.js and is automatically installed along with Node.js. It lets us download Node.js packages from over the Internet, using them as the building blocks for our applications. Next, let's create a directory and develop an Express application within it to calculate Fibonacci numbers:

$ mkdir fibonacci
$ cd fibonacci
$ npm install express-generator@4.x
$ ./node_modules/.bin/express . --ejs
$ npm install

The application will be written against the current Express version, version 4.x. Specifying the version number this way makes sure of compatibility. The express command generated for us a starting application. You can inspect the package.json file to see what will be installed, and the last command installs those packages. What we'll have in front of us is a minimal Express application. Our first stop is not to create an Express application, but to gather some basic data about computation-dominant code in Node.js.

Heavy-weight computation
Let's start the exploration by creating a Node.js module named math.js, containing:

var fibonacci = exports.fibonacci = function(n) {
  if (n === 1) return 1;
  else if (n === 2) return 1;
  else return fibonacci(n-1) + fibonacci(n-2);
}

Then, create another file named fibotimes.js containing this:

var math = require('./math');
var util = require('util');
for (var num = 1; num < 80; num++) {
  util.log('Fibonacci for '+ num +' = '+ math.fibonacci(num));
}

Running this script produces the following output:

$ node fibotimes.js
31 Jan 14:41:28 - Fibonacci for 1 = 1
31 Jan 14:41:28 - Fibonacci for 2 = 1
31 Jan 14:41:28 - Fibonacci for 3 = 2
31 Jan 14:41:28 - Fibonacci for 4 = 3
31 Jan 14:41:28 - Fibonacci for 5 = 5
31 Jan 14:41:28 - Fibonacci for 6 = 8
31 Jan 14:41:28 - Fibonacci for 7 = 13
…
31 Jan 14:42:27 - Fibonacci for 38 = 39088169
31 Jan 14:42:28 - Fibonacci for 39 = 63245986
31 Jan 14:42:31 - Fibonacci for 40 = 102334155
31 Jan 14:42:34 - Fibonacci for 41 = 165580141
31 Jan 14:42:40 - Fibonacci for 42 = 267914296
31 Jan 14:42:50 - Fibonacci for 43 = 433494437
31 Jan 14:43:06 - Fibonacci for 44 = 701408733

This quickly calculates the first 40 or so members of the Fibonacci sequence. After the 40th member, it starts taking a couple of seconds per result and quickly degrades from there. It is untenable to execute code of this sort on a single-threaded system that relies on a quick return to the event loop. That's an important point, because the Node.js design requires that event handlers quickly return to the event loop. The single-threaded event loop does everything in Node.js, and event handlers that return quickly to the event loop keep it humming. A correctly written application can sustain a tremendous request throughput, but a badly written application can prevent Node.js from fulfilling that promise. This Fibonacci function demonstrates algorithms that churn away at their calculation without ever letting Node.js process the event loop. Calculating fibonacci(44) requires 16 seconds of calculation, which is an eternity for a modern web service. With any server that's bogged down like this, not processing events, the perceived performance is zilch. Your manager will be rightfully angry. This is a completely artificial example, because it's trivial to refactor the Fibonacci calculation for excellent performance.
This is a stand-in for any algorithm that might monopolize the event loop. There are two general ways in Node.js to solve this problem: Algorithmic refactoring: Perhaps, like the Fibonacci function we chose, one of your algorithms is suboptimal and can be rewritten to be faster. Or, if not faster, it can be split into callbacks dispatched through the event loop. We'll look at one such method in a moment. Creating a backend service: Can you imagine a backend server dedicated to calculating Fibonacci numbers? Okay, maybe not, but it's quite common to implement backend servers to offload work from frontend servers, and we will implement a backend Fibonacci server at the end of this article. But first, we need to set up an Express application that demonstrates the impact on the event loop.

An Express app to calculate Fibonacci numbers
To see the impact of a computation-heavy application on Node.js performance, let's write a simple Express application to do Fibonacci calculations. Express is a key Node.js technology, so this will also give you a little exposure to writing an Express application. We've already created the blank application, so let's make a couple of small changes so it uses our Fibonacci algorithm. Edit views/index.ejs to have this code:

<!DOCTYPE html>
<html>
<head>
<title><%= title %></title>
<link rel='stylesheet' href='/stylesheets/style.css' />
</head>
<body>
<h1><%= title %></h1>
<% if (typeof fiboval !== "undefined") { %>
<p>Fibonacci for <%= fibonum %> is <%= fiboval %></p>
<hr/>
<% } %>
<p>Enter a number to see its Fibonacci number</p>
<form name='fibonacci' action='/' method='get'>
<input type='text' name='fibonum' />
<input type='submit' value='Submit' />
</form>
</body>
</html>

This simple template sets up an HTML form where we can enter a number. This number designates the desired member of the Fibonacci sequence to calculate. This is written for the EJS template engine. You can see that <%= variable %> substitutes the named variable into the output, and JavaScript code is written in the template by enclosing it within <% %> delimiters. We use that to optionally print out the requested Fibonacci value if one is available. Next, edit routes/index.js to contain the following:

var express = require('express');
var router = express.Router();
var math = require('../math');

router.get('/', function(req, res, next) {
  if (req.query.fibonum) {
    res.render('index', {
      title: "Fibonacci Calculator",
      fibonum: req.query.fibonum,
      fiboval: math.fibonacci(req.query.fibonum)
    });
  } else {
    res.render('index', {
      title: "Fibonacci Calculator",
      fiboval: undefined
    });
  }
});

module.exports = router;

This router definition handles the home page for the Fibonacci calculator. The router.get function means this route handles HTTP GET operations on the / URL. If the req.query.fibonum value is set, that means the URL had a ?fibonum=# value, which would be produced by the form in index.ejs. If that's the case, the fiboval value is calculated by calling math.fibonacci, the function we showed earlier. Because we're using that function, we can safely predict that there will be performance problems when requesting larger Fibonacci values. On the res.render calls, the second argument is an object defining variables that will be made available to the index.ejs template. Notice how the two res.render calls differ in the values passed to the template, and how the template will differ as a result. There are no changes required in app.js. You can study that file, and bin/www, if you're curious how Express applications work.
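For reference, here is a minimal sketch of the relevant wiring in the generated app.js (trimmed down; the actual file produced by express-generator contains additional middleware such as logging and error handling):

var express = require('express');
var path = require('path');
var routes = require('./routes/index');

var app = express();

// the --ejs flag we passed to the generator selects the EJS view engine
app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'ejs');

// mount our router at the application root
app.use('/', routes);

module.exports = app;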
In the meantime, you run it simply:

$ npm start

> fibonacci@0.0.0 start /Users/david/fibonacci
> node ./bin/www

And this is what it'll look like in the browser, at http://localhost:3000: For small Fibonacci values, the result will return quickly. As implied by the timing results we looked at earlier, at around the 40th Fibonacci number, it'll take a few seconds to calculate the result. The 50th Fibonacci number will take 20 minutes or so. That's enough time to run a little experiment. Open two browser windows onto http://localhost:3000. You'll see the Fibonacci calculator in each window. In one, request the value for 45 or more. In the other, enter 10, which, under normal circumstances, we know would return almost immediately. Instead, the second window won't respond until the first one finishes. Unless, that is, your browser times out and throws an error. What's happening is that the Node.js event loop is blocked from processing events, because the Fibonacci algorithm is running and does not ever yield to the event loop. As soon as the Fibonacci calculation finishes, the event loop starts being processed again. It then receives and processes the request made from the second window.

Algorithmic refactoring
The problem here is applications that stop processing events. We might solve the problem by ensuring events are handled while still performing calculations. In other words, let's look at algorithmic refactoring. To prove that we have an artificial problem on our hands, add this function to math.js:

var fibonacciLoop = exports.fibonacciLoop = function(n) {
  var fibos = [];
  fibos[0] = 0;
  fibos[1] = 1;
  fibos[2] = 1;
  for (var i = 3; i <= n; i++) {
    fibos[i] = fibos[i-2] + fibos[i-1];
  }
  return fibos[n];
}

Change fibotimes.js to call this function, and the Fibonacci values will fly by so fast your head will spin. Some algorithms aren't so simple to optimize as this. For such a case, it is possible to divide the calculation into chunks and then dispatch the computation of those chunks through the event loop. Consider the following code:

var fibonacciAsync = exports.fibonacciAsync = function(n, done) {
  if (n === 0) done(undefined, 0);
  else if (n === 1 || n === 2) done(undefined, 1);
  else {
    setImmediate(function() {
      fibonacciAsync(n-1, function(err, val1) {
        if (err) done(err);
        else setImmediate(function() {
          fibonacciAsync(n-2, function(err, val2) {
            if (err) done(err);
            else done(undefined, val1+val2);
          });
        });
      });
    });
  }
};

This converts the fibonacci function from a synchronous function to an asynchronous one with a callback. By using setImmediate, each stage of the calculation is managed through Node.js's event loop, and the server can easily handle other requests while churning away on a calculation. It does nothing to reduce the computation required; this is still the silly, inefficient Fibonacci algorithm. All we've done is spread the computation through the event loop. To use this new Fibonacci function, we need to change the router function in routes/index.js to the following:

router.get('/', function(req, res, next) {
  if (req.query.fibonum) {
    math.fibonacciAsync(req.query.fibonum, function(err, fiboval) {
      res.render('index', {
        title: "Fibonacci Calculator",
        fibonum: req.query.fibonum,
        fiboval: fiboval
      });
    });
  } else {
    res.render('index', {
      title: "Fibonacci Calculator",
      fiboval: undefined
    });
  }
});

This makes an asynchronous call to fibonacciAsync, and when the calculation finishes, the result is sent to the browser.
With the change to routes/index.js in place, the server no longer freezes when calculating a large Fibonacci number. The calculation, of course, still takes a long time, because fibonacciAsync is still an inefficient algorithm. At least other users of the application aren't blocked, because it regularly yields to the event loop. Repeat the same test used earlier: open two or more browser windows to the Fibonacci calculator, make a large request in one window, and the requests in the other window will be promptly answered.

Creating a backend REST service

The next way to mitigate computationally intensive code is to push the calculation to a backend process. To do that, we'll request computations from a backend Fibonacci server. While Express has a powerful templating system, making it suitable for delivering HTML web pages to browsers, it can also be used to implement a simple REST service. Express supports parameterized URLs in route definitions, so it can easily receive REST API arguments, and Express makes it easy to return data encoded in JSON. Create a file named fiboserver.js containing this code:

var math = require('./math');
var express = require('express');
var logger = require('morgan');
var util = require('util');
var app = express();
app.use(logger('dev'));
app.get('/fibonacci/:n', function(req, res, next) {
    math.fibonacciAsync(Math.floor(req.params.n), function(err, val) {
        if (err) next('FIBO SERVER ERROR ' + err);
        else {
            util.log(req.params.n + ': ' + val);
            res.send({ n: req.params.n, result: val });
        }
    });
});
app.listen(3333);

This is a stripped-down Express application that gets right to the point of providing a Fibonacci calculation service. The one route it supports does the Fibonacci computation using the same fibonacciAsync function used earlier. The res.send function is a flexible way to send data responses. As used here, it automatically detects the object, formats it as JSON text, and sends it with the correct content-type. Then, in package.json, add this to the scripts section:

"server": "node ./fiboserver"

Now, let's run it:

$ npm run server

> fibonacci@0.0.0 server /Users/david/fibonacci
> node ./fiboserver

Then, in a separate command window, use curl to request values from this service:

$ curl -f http://localhost:3333/fibonacci/10
{"n":"10","result":55}

Over in the window where the service is running, we'll see a log of GET requests and how long each took to process. It's easy to create a small Node.js script to directly call this REST service (a sketch of one appears below). But let's instead move directly to changing our Fibonacci calculator application to do so. Make this change to routes/index.js:

router.get('/', function(req, res, next) {
    if (req.query.fibonum) {
        var httpreq = require('http').request({
            method: 'GET',
            host: "localhost",
            port: 3333,
            path: "/fibonacci/" + Math.floor(req.query.fibonum)
        }, function(httpresp) {
            httpresp.on('data', function(chunk) {
                var data = JSON.parse(chunk);
                res.render('index', {
                    title: "Fibonacci Calculator",
                    fibonum: req.query.fibonum,
                    fiboval: data.result
                });
            });
            httpresp.on('error', function(err) { next(err); });
        });
        httpreq.on('error', function(err) { next(err); });
        httpreq.end();
    } else {
        res.render('index', {
            title: "Fibonacci Calculator",
            fiboval: undefined
        });
    }
});

Running the Fibonacci Calculator service now requires starting both processes. In one command window, we run:

$ npm run server

And in the other command window:

$ npm start

In the browser, we visit http://localhost:3000 and see what looks like the same application, because no changes were made to views/index.ejs.
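For completeness, the small client script mentioned above might look like this. It's a sketch under the assumption that the service is running on port 3333; the file name fiboclient.js and the input value 20 are our own inventions:

// fiboclient.js -- a hypothetical standalone client for the REST service.
var http = require('http');
http.request({
    method: 'GET', host: 'localhost', port: 3333, path: '/fibonacci/20'
}, function(res) {
    var body = '';
    // accumulate the whole response before parsing, in case it
    // arrives split across multiple 'data' events
    res.on('data', function(chunk) { body += chunk; });
    res.on('end', function() { console.log(JSON.parse(body)); });
}).end();

Note that it buffers the whole body before calling JSON.parse. For the tiny responses in this example, parsing the first 'data' chunk, as the router code above does, happens to work, but waiting for 'end' is the safer pattern.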
As you make requests in the browser window, the Fibonacci service window prints a log of the requests it receives and the values it sends. You can, of course, repeat the same experiment as before: open two browser windows, request a large Fibonacci number in one, and make smaller requests in the other. You'll see, because the server uses fibonacciAsync, that it's able to respond to every request. Why did we go through this trouble when we could just directly call fibonacciAsync? Because we can now push the CPU load for this heavy-weight calculation to a separate server. Doing so preserves CPU capacity on the frontend server, so it can attend to web browsers. The heavy computation is kept separate, and you could even deploy a cluster of backend servers sitting behind a load balancer, evenly distributing requests. Decisions like this are made all the time to create multitier systems. What we've demonstrated is that it's possible to implement simple multitier REST services in a few lines of Node.js and Express.

Summary

While the Fibonacci algorithm we chose is artificially inefficient, it gave us an opportunity to explore common strategies to mitigate performance problems in Node.js. Optimizing the performance of our systems is as important as correctness, fixing bugs, mobile friendliness, and usability. Inefficient algorithms mean having to deploy more hardware to satisfy load requirements, costing more money and creating a bigger environmental impact. For real-world applications, optimizing away performance problems won't be as easy as it is for the Fibonacci calculator. We could have just used the fibonacciLoop function, since it provides all the performance we'd need. But we needed to explore more typical approaches to performing heavy-weight calculations in Node.js while still keeping the event loop rolling. The bottom line is that in Node.js, the event loop must keep running.

Resources for Article:

Further resources on this subject:

Learning Node.js for Mobile Application Development [article]
Node.js Fundamentals [article]
Making a Web Server in Node.js [article]
Moodle 2.0: What's New in Add a Resource

Packt
20 Sep 2010
7 min read
(For more resources on Moodle, see here.)

New look – new wording

Let's compare the former view of the Add a Resource drop-down menu with the new one. In the following screenshot, the left side shows us the Moodle 2.0 view and the right shows us the Moodle 1.9 view:

The first thing to notice is that the wording is simpler – no more confusion amongst teachers as to what constitutes a web page as opposed to a text page; no more explanation needed that Display a directory actually just means Show a folder of my resources. The developers of Moodle 2.0 have taken on board comments made by trainers and frequent users that certain terms were misunderstood by beginners. In the past, I myself have had to reassure newbies that even though they think they know nothing about web design, it is safe to select Compose a web page because doing so will simply bring up a text box where you can type your information or instructions straight into Moodle. Similarly, not everyone understands that a directory is just a fancy name for a folder that can house a number of your resources. This has now been made clearer. The selection Link to a file or web site has also been altered, as these are now dealt with in two different ways. Let's take each option one at a time and study it in more depth.

Adding a file

The link File replaces the Moodle 1.9 Link to a file or web site option and is the place where, within your course, you would ordinarily upload and display files such as a Microsoft PowerPoint slideshow or a PDF resource. Selecting this from the drop-down gives us the editing screen, the top part of which is shown in the following screenshot:

In our How to Be Happy course, our teacher, Andy, is going to upload an Open Office (odt) file with a tip a day for staying cheerful in February, a somewhat grim month in the Northern Hemisphere. Here's how we do it:

For Name, as in earlier versions of Moodle, type the text you wish students to click on to access the resource.

For Description, type a description which we can later decide to display or not. An Admin can set it so you aren't required to type a description.

Click Add to start uploading the document – the File Picker appears.

If it's not offered by default, click Upload this file…

Add the author name and license type.

Then, as with Moodle 1.9, click Browse… to locate the document and then Upload this file… to upload it. You'll be returned to the main editing screen where the document appears as a blue link.

As we scroll down, and with Advanced set to Show, we see other settings, some new, and some familiar from previous versions of Moodle but with extra functionality:

Display: Choose how you want the file to appear and whether you want the actual file name and/or its description to be shown.

Advanced: With this enabled, you can decide the size of the pop-up window and whether or not to filter the content.

Common Module settings: (as with Moodle 1.9) Decide whether to make the document visible or not and set it for groups/groupings (which are now enabled by default).

Restrict availability: This will only appear if the setting has been enabled in site administration, and is a feature that lets you decide when and under what conditions the file may be accessed.

Activity completion: This will only appear if you've enabled it in your course. It's a feature allowing students to check off what they have done, or teachers to set activities to be automatically checked as complete under certain circumstances.
Save: According to your preference, as with Moodle 1.9.

Displaying a file

As we went through the settings to upload and show our February Beat The Blues tips, we noticed a drop-down option, Display. It gives a variety of ways a file such as our .odt document can appear on the course page in Moodle 2.0. How files display will depend on their file type.

Display: Automatic – Leave this as the default if you want Moodle to decide for you! In the case of Andy's slideshow, it's the traditional way of displaying an uploaded document, where once clicked on, it appears with a prompt box saying something like (depending on your browser) "do you want to open or save this file?"

Display: Embed – This will show the Moodle page with heading, blocks, and footer. It will show the title/description of the item and display the file directly in the page as well, so it is good for videos, flash animations, and so on.

Display: Force download – When a user clicks on the file, the web browser pops up with a "where do you want to save this file?" box.

Display: Open – This offers no Moodle heading, blocks, footer, or description; it just shows the file as it is.

Display: In pop-up – This will cause the link to the file to appear in a pop-up window before prompting you to open or download it. You can set the size of the pop-up window on the Advanced settings page.

Site administration | Plugins | Activity modules | File gives us two other display options if so desired. These are:

Display: In frame – This will show the Moodle heading and the file description, with the file displayed in a resizable area below.

Display: New window – This is very much like In pop-up, but the new window is a full browser window, with menus, address bar, and so on.

Resource administration

If we click to update our file once it has been uploaded, we can see a new area in the Settings block, giving us options to manage this uploaded resource. We've seen this before: when clicking to update an item, we can tweak it from here. Let's take a look at these:

Edit settings – Here's where we update the details, display options, and so on (obviously!).

Locally assigned roles and Permissions – Moodle 1.9 gave us the facility to assign roles and permissions locally to an individual resource, so this is not new. In Moodle 2.0, the site administrator has more control over who can assign which roles by default.

Check Permissions – This is new, however, and enables us to be doubly certain our students are allowed (and not allowed!) to access what we want them to. Let's try an example: suppose Andy hides his February tips until the end of January, but would like one particular student, Emma, to be able to access them in advance. He will allow her to view the February document even though it is hidden. He needs to ensure she doesn't have the right to see any other hidden files until the appropriate time. Here's what to do:

In Locally assigned roles, give Emma the teacher role. This will allow her to view hidden activities, and therefore, our hidden February tips.

In Check permissions, select Emma and click Show this user's permissions. This brings up a table showing what Emma's permissions are in this file. She can view hidden activities, and therefore, would be able to see the hidden February resource in advance of the other students.
We see that because Emma has been locally assigned the role of teacher in our February document, she is allowed to view the hidden file. But as she is still a student in the course as a whole, she doesn't have this right elsewhere, so we are safe!
MySQL Admin: Configuring InnoDB and Installing MySQL as a Windows Service

Packt
01 Oct 2010
3 min read
In order to prevent the transactional nature of InnoDB from completely thwarting its performance, it implements what is called the redo log. In this recipe, we will present the relevant settings to (re-)configure a database server's redo log.

Getting ready

As the redo log setup is a part of the server configuration, you will need an operating system user account and sufficient rights to modify the server's configuration file. You will also need rights to restart the MySQL service, because the redo log cannot be reconfigured on the fly. Moreover, an administrative MySQL user account is required to prepare the server for the shutdown that is necessary as part of the procedure.

Caution: As this recipe will modify the configuration of parameters critical to data integrity, you should make a backup copy of the configuration file before editing it!

How to do it...

Connect to the server using your administrative account.

Issue the following command:

mysql> SET GLOBAL innodb_fast_shutdown=0;
Query OK, 0 rows affected (0.00 sec)

Verify the setting like this:

mysql> SHOW VARIABLES LIKE 'innodb_fast_shutdown';

Log off from MySQL and stop the MySQL server.

Locate the MySQL configuration file, usually called my.cnf or my.ini (on Windows), and open it in a text editor.

Locate the following parameters in the [mysqld] section (your values will vary, of course):

[mysqld]
...
innodb_log_group_home_dir=/var/lib/mysql/redolog
innodb_log_file_size=32M
innodb_log_buffer_size=64M
innodb_log_files_in_group=2
...

Edit the above configuration settings to their new values. If you require help on how to find suitable values, see the There's more... section of this recipe.

Save the configuration file.

Navigate to the directory configured for innodb_log_group_home_dir. If there is no such setting in your configuration file, navigate to MySQL's data directory, which is then taken as the default.

Move the files whose names start with ib_logfile to a backup location. Do not copy them; they must be removed from their original location.

Restart the MySQL server.

Verify that new files are created as you configured them:

$ ls -l /var/lib/mysql/redolog

If you do not see the new files appear and the server does not start up correctly, check the MySQL error log for messages. Usually, the only thing that can go wrong here is that you either mistyped the directory name or did not actually remove the previous ib_logfile files. To restore everything back to the original configuration, restore your configuration file from the backup and move the ib_logfile files you took out back to their original location.

What just happened...

By setting innodb_fast_shutdown to 0, you told the server to finish writing any pending changes to the disk before actually exiting. This makes sure there are no remaining transactions in the current redo logs that could get lost when these files are replaced. After that, you could change the configuration to new values, possibly using a different number of files and different sizes. Then, before restarting, you moved the old redo log files out of the way. This is important because otherwise MySQL would complain about a mismatch between the settings file and the actual situation on disk. When it comes up and finds no redo log files, it will create new ones with the settings just configured.
Using Events, Interceptors, and Logging Services

Packt
03 Oct 2013
19 min read
(For more resources related to this topic, see here.)

Understanding interceptors

Interceptors are defined as part of the EJB 3.1 specification (JSR 318), and are used to intercept Java method invocations and lifecycle events that may occur in Enterprise Java Beans (EJB) or named beans from Contexts and Dependency Injection (CDI). The three main components of interceptors are as follows:

The target class: This class will be monitored or watched by the interceptor. The target class can hold interceptor methods for itself.

The interceptor class: This class groups the interceptor methods.

The interceptor method: This method will be invoked according to the lifecycle events.

As an example, a logging interceptor will be developed and integrated into the Store application. Following the hands-on approach of this article, we will see how to apply the main concepts through the given examples without going into a lot of details. Check the Web Resources section to find more documentation about interceptors.

Creating a log interceptor

A log interceptor is a common requirement in most Java EE projects, as it's a simple yet very powerful solution because of its decoupled implementation and easy distribution among other projects if necessary. Here's a diagram that illustrates this solution:

Log and LogInterceptor are the core of the log interceptor functionality; the former can be thought of as the interface of the interceptor, being the annotation that will decorate the elements of SearchManager that must be logged, and the latter carries the actual implementation of our interceptor. The business rule is to simply call a method of the LogService class, which will be responsible for creating the log entry. Here's how to implement the log interceptor mechanism:

Create a new Java package named com.packt.store.log in the project Store.

Create a new enumeration named LogLevel inside this package. This enumeration will be responsible for matching the level assigned to the annotation with the logging framework:

package com.packt.store.log;

public enum LogLevel {
    // As defined at java.util.logging.Level
    SEVERE, WARNING, INFO, CONFIG, FINE, FINER, FINEST;

    public String toString() {
        return super.toString();
    }
}

We're going to create all objects of this section (LogLevel, Log, LogService, and LogInterceptor) in the same package, com.packt.store.log. This decision makes it easier to extract the logging functionality from the project and build an independent library in the future, if required.

Create a new annotation named Log. This annotation will be used to mark every method that must be logged, and it accepts the log level as a parameter, according to the LogLevel enumeration created in the previous step:

package com.packt.store.log;

import java.lang.annotation.*;
import javax.enterprise.util.Nonbinding;
import javax.interceptor.InterceptorBinding;

@Inherited
@InterceptorBinding
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
public @interface Log {
    @Nonbinding
    LogLevel value() default LogLevel.FINEST;
}

As this annotation will be attached to an interceptor, we have to add the @InterceptorBinding decoration here. When creating the interceptor, we will add a reference that points back to the Log annotation, creating the necessary relationship between them. Also, we can attach an annotation to virtually any Java element.
This is dictated by the @Target decoration, where we can set any combination of the ElementType values, such as ANNOTATION_TYPE, CONSTRUCTOR, FIELD, LOCAL_VARIABLE, METHOD, PACKAGE, PARAMETER, and TYPE (mapping classes, interfaces, and enums), each representing a specific element. The annotation being created can be attached to methods and to class or interface definitions.

Now we must create a new stateless session bean named LogService that is going to execute the actual logging:

import java.util.logging.Level;
import java.util.logging.Logger;
import javax.ejb.Stateless;

@Stateless
public class LogService {
    // Receives the class name decorated with @Log
    public void log(final String clazz, final LogLevel level, final String message) {
        // Logger from package java.util.logging
        Logger log = Logger.getLogger(clazz);
        log.log(Level.parse(level.toString()), message);
    }
}

Create a new class, LogInterceptor, to trap calls from classes or methods decorated with @Log and invoke the LogService class we just created. The main method must be marked with @AroundInvoke, and it is mandatory that it receives an InvocationContext instance and returns an Object element:

import java.io.Serializable;
import java.lang.reflect.Method;
import javax.inject.Inject;
import javax.interceptor.AroundInvoke;
import javax.interceptor.Interceptor;
import javax.interceptor.InvocationContext;

@Log
@Interceptor
public class LogInterceptor implements Serializable {
    private static final long serialVersionUID = 1L;

    @Inject
    LogService logger;

    @AroundInvoke
    public Object logMethod(InvocationContext ic) throws Exception {
        final Method method = ic.getMethod();
        // check if annotation is on class or method
        LogLevel logLevel = method.getAnnotation(Log.class) != null
            ? method.getAnnotation(Log.class).value()
            : method.getDeclaringClass().getAnnotation(Log.class).value();
        // invoke LogService with the target's class name; note that
        // ic.getClass() would name the container's InvocationContext
        // implementation instead of the decorated class
        logger.log(ic.getTarget().getClass().getCanonicalName(), logLevel, method.toString());
        return ic.proceed();
    }
}

As we defined earlier, the Log annotation can be attached to methods and classes or interfaces through its @Target decoration; we need to discover which one raised the interceptor to retrieve the correct LogLevel value. When trying to get the annotation from the class, as in the method.getDeclaringClass().getAnnotation(Log.class) line, the engine will traverse the class hierarchy searching for the annotation, up to the Object class if necessary. This happens because we marked the Log annotation with @Inherited. Remember that this behavior only applies to class inheritance, not to interfaces. Finally, as we marked the value attribute of the Log annotation as @Nonbinding in step 3, all log levels will be handled by the same LogInterceptor function. If you remove the @Nonbinding line, the interceptor would have to be further qualified to handle a specific log level, for example @Log(LogLevel.INFO), so you would need to code several interceptors, one for each existing log level.

Modify the beans.xml file (under /WEB-INF/) to tell the container that our class must be loaded as an interceptor; currently, the file is empty, so add all the following lines:

<beans xmlns="http://java.sun.com/xml/ns/javaee"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
       http://java.sun.com/xml/ns/javaee/beans_1_0.xsd">
    <interceptors>
        <class>com.packt.store.log.LogInterceptor</class>
    </interceptors>
</beans>

Now decorate a business class or method with @Log in order to test what we've done. For example, apply it to the getTheaters() method in SearchManager from the project Store. Remember that it will be called every time you refresh the query page:

@Log(LogLevel.INFO)
public List<Theater> getTheaters() {
    ...
}

Make sure you have no errors in the project and deploy it to the current server by right-clicking on the server name and then clicking on the Publish entry.
Access the theater's page, http://localhost:7001/theater/theaters.jsf, refresh it a couple of times, and check the server output. If you have started your server from Eclipse, it should be under the Console tab:

Nov 12, 2012 4:53:13 PM com.packt.store.log.LogService log
INFO: public java.util.List com.packt.store.search.SearchManager.getTheaters()

Let's take a quick look at what we've accomplished so far: we created an interceptor and an annotation that will perform all common logging operations for any method or class marked with that annotation. All log entries generated from the annotation will follow WebLogic's logging services configuration.

Interceptors and Aspect Oriented Programming

Interceptors and Aspect Oriented Programming share some concepts, but at the same time each provides distinct capabilities, and they can lead to quite different overall solutions. In a sense, interceptors work like an event mechanism, but in reality they are based on a paradigm called Aspect Oriented Programming (AOP). Although AOP is a huge and complex topic, with several books covering it in great detail, the examples shown in this article give a quick introduction to an important AOP concept: method interception. Consider AOP as a paradigm that makes it easier to apply crosscutting concerns (such as logging or auditing) as services to one or multiple objects. Of course, it's almost impossible to capture the many contexts in which AOP can help in just one phrase, but for the context of this article, and for most real-world scenarios, this is good enough.

Using asynchronous methods

A basic programming concept called synchronous execution defines the way our code is processed by the computer, that is, line by line, one at a time, in a sequential fashion. So, when the main execution flow of a class calls a method, it must wait until its completion so that the next line can be processed. Of course, there are structures capable of processing different portions of a program in parallel, but from an external viewpoint, the execution happens in a sequential way, and that's how we think about it when writing code.

When you know that a specific portion of your code is going to take a little while to complete, and there are other things that could be done instead of just sitting and waiting for it, there are a few strategies that you could resort to in order to optimize the code. For example, starting a thread to run things in parallel, or posting a message to a JMS queue and breaking the flow into independent units, are two possible solutions. If your code is running on an application server, you should know by now that thread spawning is a bad practice: only the server itself must create threads, so that solution doesn't apply to this specific scenario.

Another way to deal with such a requirement when using Java EE 6 is to create one or more asynchronous methods inside a stateless session bean by annotating either the whole class or specific methods with javax.ejb.Asynchronous. If the class is decorated with @Asynchronous, all its methods inherit the behavior. When a method marked as asynchronous is called, the server usually spawns a thread to execute the called method; there are cases where the same thread can be used, for instance, if the calling method happens to end right after emitting the command to run the asynchronous method. Either way, the general idea is that things are explicitly going to be processed in parallel, which is a departure from the synchronous execution paradigm.
To see how it works, let's change the LogService method to be an asynchronous one; all we need to do is decorate the class or the method with @Asynchronous:

@Stateless
@Asynchronous
public class LogService {
    ...

As the call to its log method is the last step executed by the interceptor, and its processing is really quick, there is no real benefit in doing so. To make things more interesting, let's force a longer execution cycle by inserting a sleep into the method of LogService:

public void log(final String clazz, final LogLevel level, final String message) {
    Logger log = Logger.getLogger(clazz);
    log.log(Level.parse(level.toString()), message);
    try {
        Thread.sleep(5000);
        log.log(Level.parse(level.toString()), "reached end of method");
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

Using Thread.sleep() when running inside an application server is another classic example of a bad practice, so keep away from this when creating real-world solutions.

Save all files, publish the Store project, and load the query page a couple of times. You will notice that the page is rendered without delay, as usual, and that the reached end of method message is displayed after a few seconds in the Console view. This is a pretty subtle scenario, so you can make it more obvious by commenting out the @Asynchronous line and deploying the project again; this time, when you refresh the browser, you will have to wait for 5 seconds before the page gets rendered.

Our example didn't need a return value from the asynchronous method, making it pretty simple to implement. If you need to get a value back from such methods, you must declare the method using the java.util.concurrent.Future interface:

@Asynchronous
public Future<String> doSomething() {
    ...
}

The returned value must be changed to reflect the following:

return new AsyncResult<String>("ok");

The javax.ejb.AsyncResult class is an implementation of the Future interface that can be used to return asynchronous results. There are other features and considerations around asynchronous methods, such as ways to cancel a request being executed and to check whether the asynchronous processing has finished so that the resulting value can be accessed. For more details, check the Creating Asynchronous methods in EJB 3.1 reference at the end of this article.

Understanding WebLogic's logging service

Before we advance to the event system introduced in Java EE 6, let's take a look at the logging services provided by Oracle WebLogic Server. By default, WebLogic Server creates two log files for each managed server:

access.log: This is a standard HTTP access log, where requests to web resources of a specific server instance are registered with details such as the HTTP return code, the resource path, response time, among others.

<ServerName>.log: This contains the log messages generated by the WebLogic services and deployed applications of that specific server instance.

These files are generated in a default directory structure that follows the pattern $DOMAIN_NAME/servers/<SERVER_NAME>/logs/.

If you are running a WebLogic domain that spans more than one machine, you will find another log file named <DomainName>.log on the machine where the administration server is running. This file aggregates messages from all managed servers of that specific domain, creating a single point of observation for the whole domain. As a best practice, only messages with a higher severity level should be transferred to the domain log, avoiding overhead when accessing this file.
Keep in mind that the messages written to the domain log are also found in the log file of the specific managed server that generated them, so there's no need to redirect everything to the domain log.

Anatomy of a log message

Here's a typical entry of a log file:

####<Jul 15, 2013 8:32:54 PM BRT> <Alert> <WebLogicServer> <sandbox-lap> <AdminServer> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <weblogic> <> <> <1373931174624> <BEA-000396> <Server shutdown has been requested by weblogic.>

The description of each field is given in the following list:

####: Fixed; every log message starts with this sequence.
<Jul 15, 2013 8:32:54 PM BRT>: Locale-formatted timestamp.
<Alert>: Message severity.
<WebLogicServer>: WebLogic subsystem; other examples are WorkManager, Security, EJB, and Management.
<sandbox-lap>: Physical machine name.
<AdminServer>: WebLogic Server name.
<[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'>: Thread ID.
<weblogic>: User ID.
<>: Transaction ID, or empty if not in a transaction context.
<>: Diagnostic context ID, or empty if not applicable; it is used by the Diagnostics Framework to correlate messages of a specific request.
<1373931174624>: Raw time in milliseconds.
<BEA-000396>: Message ID.
<Server shutdown has been requested by weblogic.>: Description of the event.

The Diagnostics Framework presents functionalities to monitor, collect, and analyze data from several components of WebLogic Server.

Redirecting standard output to a log file

The logging solution we've just created is currently using the Java SE logging engine: we can see our messages on the console's screen, but they aren't being written to any log file managed by WebLogic Server. It is this way because of the default configuration of Java SE, as we can see from the following snippet, taken from the logging.properties file used to run the server:

# "handlers" specifies a comma separated list of log Handler
# classes. These handlers will be installed during VM startup.
# Note that these classes must be on the system classpath.
# By default we only configure a ConsoleHandler, which will only
# show messages at the INFO and above levels.
handlers= java.util.logging.ConsoleHandler

You can find this file at $JAVA_HOME/jre/lib/logging.properties.

So, as stated here, the default output destination used by Java SE is the console. There are a few ways to change this:

If you're using this Java SE installation solely to run WebLogic Server instances, you may go ahead and change this file, adding a specific WebLogic handler to the handlers line as follows:

handlers= java.util.logging.ConsoleHandler,weblogic.logging.ServerLoggingHandler

If tampering with Java SE files is not an option (the installation may be shared with other software, for instance), you can duplicate the default logging.properties file into another folder, $DOMAIN_HOME being a suitable candidate, add the new handler, and instruct WebLogic to use this file at startup by adding this argument to the server's start command line:

-Djava.util.logging.config.file=$DOMAIN_HOME/logging.properties

Alternatively, you can use the administration console to set the redirection of the standard output (and error) to the log files. To do so, perform the following steps:

In the left-hand side panel, expand Environment and select Servers.

In the Servers table, click on the name of the server instance you want to configure.

Select Logging and then General.
Find the Advanced section, expand it, and tick the Redirect stdout logging enabled checkbox.

Click on Save to apply your changes. If necessary, the console will show a message stating that the server must be restarted to acquire the new configuration.

If you get no warnings asking to restart the server, then the configuration is already in use. This means that both the WebLogic subsystems and any application deployed to that server are automatically using the new values, which is a very powerful feature for troubleshooting applications without intrusive actions such as modifying the application itself: just change the log level to start capturing more detailed messages!

Notice that there are a lot of other logging parameters that can be configured, and three of them are worth mentioning here:

The Rotation group (found in the inner General tab): The rotation feature instructs WebLogic to create new log files based on the rules set in this group of parameters. It can be set to check for a size limit or to create new files from time to time. By doing so, the server creates smaller files that we can easily handle. We can also limit the number of files retained on the machine to reduce disk usage.

If the partition where the log files are being written reaches 100 percent of utilization, WebLogic Server will start behaving erratically. Always remember to check the disk usage; if possible, set up a monitoring solution such as Nagios to keep track of this and alert you when a critical level is reached.

Minimum severity to log (also in the inner General tab): This entry sets the lowest severity that should be logged by all destinations. This means that even if you set the domain level to debug, the messages will actually be written to the domain log only if this parameter is set to the same or a lower level. It works as a gatekeeper to avoid an overload of messages being sent to the loggers.

HTTP access log enabled (found in the inner HTTP tab): When WebLogic Server is configured in a clustered environment, usually a load-balancing solution is set up to distribute requests between the WebLogic managed servers; the most common options are Oracle HTTP Server (OHS) or Apache Web Server. Both are standard web servers, and as such, they already register the requests sent to WebLogic in their own access logs. If this is the case, disable WebLogic's HTTP access log generation, saving processing power and I/O requests for more important tasks.

Integrating Log4J with WebLogic's logging services

If you already have an application that uses Log4J and want it to write messages to WebLogic's log files, you must add a new weblogic.logging.log4j.ServerLoggingAppender appender to your log4j.properties configuration file. This class works like a bridge between Log4J and WebLogic's logging framework, allowing the messages captured by the appender to be written to the server log files.

As WebLogic doesn't package a Log4J implementation, you must add its JAR to the domain by copying it to $DOMAIN_HOME/tickets/lib, along with another file, wllog4j.jar, which contains the WebLogic appender. This file can be found inside $MW_HOME/wlserver/server/lib. Restart the server, and it's done!

If you're using a *nix system, you can create a symbolic link instead of copying the files; this is great for keeping things consistent when a patch changing these specific files must be applied to the server.
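The Log4J side of the wiring isn't shown in the text. A minimal log4j.properties sketch attaching the bridge appender to the root logger might look like the following; the appender name, level, and layout are illustrative assumptions, and only the appender class name comes from the text above:

# log4j.properties -- hypothetical minimal configuration
log4j.rootLogger=INFO, wlserver
log4j.appender.wlserver=weblogic.logging.log4j.ServerLoggingAppender
# A layout is assumed here, as most Log4J appenders expect one to be set
log4j.appender.wlserver.layout=org.apache.log4j.PatternLayout
log4j.appender.wlserver.layout.ConversionPattern=%m%n

With something like this in place, messages the application sends through Log4J at INFO or above should flow into the server log files described earlier.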
Remember that having a file inside $MW_HOME/wlserver/server/lib doesn't mean that the file is being loaded by the server when it starts up; it is just a central place to hold the libraries. To be loaded by a server, a library must be added to the classpath parameter of that server, or you can add it to the domain-wide lib folder, which guarantees that it will be available to all nodes of the domain on a specific machine.

Accessing and reading log files

If you have direct access to the server files, you can open and search them using a command-line tool such as tail or less, or even use a graphical viewer such as Notepad. But when you don't have direct access to them, you may use WebLogic's administration console to read their content by following the steps given here:

In the left-hand side pane of the administration console, expand Diagnostics and select Log Files.

In the Log Files table, select the option button next to the name of the log you want to check and click on View.

The types displayed on this screen, which were mentioned at the start of the section, are Domain Log, Server Log, and HTTP Access. The others are resource-specific or linked to the Diagnostics Framework. Check the Web resources section at the end of this article for further reference.

The page displays the latest contents of the log file; the default setting shows up to 500 messages in reverse chronological order. The messages at the top of the window are the most recent messages that the server has generated. Keep in mind that the log viewer does not display messages that have been converted into archived log files.