How-To Tutorials

7019 Articles
Packt
19 Nov 2009
11 min read

Creating a Design with eZ Publish 4 Templating System: Part 1

eZ Publish templating

In the first part of this article, we will introduce the basics of the eZ Publish templating system, which will help us to better understand the rest of the article.

Templating

eZ Publish has its own templating system, based on the decoupling of layout and content. This lets us assign a custom layout to any content object in different sections. Moreover, just like other templating platforms, such as Smarty (http://www.smarty.net), eZ Publish has its own markup to help developers with control structure operations, subtemplating, and on-the-fly content editing. It also exposes a dedicated function to fetch and filter content from the database.

The official eZ Publish website maintains an up-to-date reference for the entire templating markup. We suggest that you use the following link whenever you need more details about the available arguments: http://ez.no/doc/ez_publish/technical_manual/4_0/templates/

The templating markup

All eZ Publish templating code should be placed between curly brackets. When the CMS parses our template file and finds the curly brackets, it starts executing the related code.

Escaping the curly brackets
If we need to use curly brackets, for example to write a JavaScript function inside our template, we need to use the {literal} operator.

    {literal}
    <script type="text/javascript">
    function alertMe() {
        window.alert('Harkonen approaching!');
    }
    </script>
    {/literal}

Control structure operators

We can divide these functions into two main families:

  • Conditional (IF-THEN-ELSE)
  • Looping (FOR-FOREACH-WHILE)

Whereas the first one should be used to change the template behavior according to some predefined condition, the other one helps us to traverse and manage arrays and content structures.

Conditional control

Conditional control is sometimes useful for changing the output when some data is received by the system.
For example, we might need a different CSS class for a particular value, or want to change the <span> class if the current month is the same as the one displayed, as shown below:

    {def $current_month=currentdate()|datetime(custom, '%F')}
    {if $node.name|eq($current_month)}
    <span class="this-month">
    {else}
    <span class="default-month">
    {/if}
    {undef $current_month}

In the first line, we define a $current_month variable whose value is the name of the month (for example, October), retrieved by the datetime() operator. Then we use the IF conditional control to choose the correct class. In the last line, we delete the variable we created earlier, releasing it from memory.

Loop control

As stated above, the loop control structure can be used to iterate through an array. We can, for example, create an unordered list (<ul>) from an array of items.

    <ul>
    {foreach $items as $item}
      <li>{node_view_gui content_node=$item view=line}</li>
    {/foreach}
    </ul>

This will be rendered as:

    <ul>
      <li>1st item</li>
      <li>2nd item</li>
      <li>3rd item</li>
      ...
    </ul>

As you can see, the FOREACH structure is similar to the PHP structure. In this example, the most interesting line is the definition of the list item. We can read it literally as: render the content node (node_view_gui) from a specific node (content_node=$item) using the line view template (view=line).

Fetch functions

With the fetch functions, we can retrieve all of the information about a content object for a module. The fetch functions can also be used to create custom queries that retrieve only the information we need, and not everything.

eZ Publish exposes many fetch functions, which are described on the documentation site at http://ez.no/doc/ez_publish/technical_manual/4_0/reference/template_fetch_functions

The most important, and most used, fetch functions are those for the content, section, and user modules.
For example, we can fetch the root content object by using the following code in our template:

    {def $object=fetch('content', 'object', hash('object_id', 1))}

We can then use the $object variable to display the object inside the HTML code.

Generic template functions and operators

The CMS gives us a lot of functions and operators, all of them described in the reference manual on the eZ Systems documentation site. As a rule of thumb, we should remember that to execute a particular function, we have to use the following syntax:

    {function_name parameter1=value1 ... parameterN=valueN}

All parameters are separated by spaces and can be specified in any order.

Operators, on the other hand, accept their parameters in a specific order, separated by commas. Moreover, an operator receives its input value through a pipe (|).

    {$piped_parameter|my_operator( parameter1, ..., parameterN )}

Every time we see a pipe after a variable, we have to remember that we are passing a value to an operator. We used the datetime() operator in the previous example for the conditional control functionality.

As a reference to API functions and operators, you can use the official reference documentation, which is constantly updated on the eZ Systems site:
http://ez.no/doc/ez_publish/technical_manual/4_0/reference/template_operators
http://ez.no/doc/ez_publish/technical_manual/4_0/reference/template_functions

Layout variables

By default, the page layout template can access some of the variables passed by the CMS. These variables, named layout variables, can be used to render system and user information, or to change the output. They are automatically configured by eZ Publish when it analyzes and executes the code related to a view.

One of the most important variables is $module_result, which contains the results generated by the module and the view that is being executed.

A module is an HTTP interface that interacts with eZ Publish.
A module consists of a set of views that contain the code to be executed. For example, if we call the following URL, the system executes the login view code of the user module: http://www.example.com/index.php/user/login.

As an API reference, you can use the official variable documentation, which is constantly updated on the eZ Systems site: http://ez.no/doc/ez_publish/technical_manual/4_0/templates/the_pagelayout/variables_in_pagelayout

Overriding a template

eZ Publish offers a set of standard templates that are useful, but they cannot cover all the possible design needs. To solve this issue, the CMF provides a fallback system that allows us to load different templates based on specific rules.

This system is usually referred to as overriding, and allows us to change the template for each module's view by overriding the default template when the user is in a particular context.

Embedding HTML inside the WYSIWYG XML editor, pt.2

We have to override a standard behavior of eZ Publish to create a generic HTML block inside the WYSIWYG XML editor. We use a content style named html for the online editor, and we will work on it so that the frontend renders it correctly.

First, we have to create a file named literal.tpl and place it in the design folder of our extension. The following commands will do exactly what we need:

    # mkdir -p /var/www/packtmediaproject/extension/packtmedia/design/magazine/templates/datatype/view/ezxmltags/
    # cd /var/www/packtmediaproject/extension/packtmedia/design/magazine/templates/datatype/view/ezxmltags/
    # touch literal.tpl

Next, we will open the literal.tpl file in our preferred IDE.
Now we will add the code that will, by default, render everything surrounded by a <pre> tag, and render the raw HTML code if the class is html:

    {if ne( $classification, 'html' )}
        <pre {if ne( $classification|trim, '' )} class="{$classification|wash}"{/if}>{$content|wash( xhtml )}</pre>
    {else}
        {$content}
    {/if}

This code checks whether the $classification variable is different from the "html" string in order to add the <pre> tag; it then adds a class attribute to the <pre> tag if the $classification variable is not empty.

To use it, we only need to reset the cache from the shell prompt by using the following commands:

    cd /var/www/packtmediaproject/
    php bin/php/ezcache.php --clear-all --purge

The ezcache.php file is a PHP shell script that can be used to clear and manage the eZ Publish cache. This script has many parameters, which can be viewed by using the --help parameter.

Creating a new design

Before starting work on the eZ Webin template code, we need to create a wireframe in order to decide on the layout structure. We will use this structure to override the standard layout files.

A wireframe is a basic visual guide that is used in web design to suggest the structure of a website and the relationships between its pages.

Wireframe editors
There are a lot of commercial and free wireframe editors. To create our site's wireframes, we will use the Firefox plugin called Pencil (http://www.evolus.vn/Pencil/). We have chosen Pencil because it is open source and works on every platform that runs the Firefox browser. If you need something more complete, you should take a look at Balsamiq (http://www.balsamiq.com/) or, if you have an Apple computer, at OmniGraffle (http://www.omnigroup.com/applications/OmniGraffle/).
Our site will have at least six different page layouts:

  • The homepage
  • The issue page, where we will display the cover and the articles list
  • The issue archive page, by month and by year
  • The staff profile page, where we will display the latest articles that the editor has written, along with his profile
  • The article and the forum pages, with the default layout based on the eZ Webin design

Now we will illustrate the first four layouts, because we will work on them by overriding their standard eZ Webin layout.

The homepage

Starting from the homepage, we can see that the site will have, in the top-left corner, a logo for the magazine and a placeholder for a banner. Under these, we will have the main navigation menu and the main content area.

We have chosen a three-column layout in order to easily manage the content that we want to show. On the homepage, the first column will show the latest news and the middle column will show the information and cover of the latest issue. The last column will have two boxes: one with the most important article from the latest issue and the other with the forum thread.

Issue page

The issue page will show some information about a specific magazine issue. On this page, the middle box of the homepage will shift towards the left, and in the right column there will be the highlighted article for the issue. At the bottom of the page, we will find all of the other articles.

The issue archive

We have to remember that our magazine is released monthly, so we need an archive page where we can collect all of the past issues.

The issue archive page, which can be reached by clicking on the main navigation menu, will again show some information from the latest issue. (We need to sell our articles!) The rightmost column of the template will show all of the covers for the current or selected year. At the bottom of the page, we will create a box with links to the past issues, grouped by year and month.
The staff profile page The staff profile page will display information from a staff profile, such as his avatar, biography, and the latest articles that he has written. The staff profile page will have three columns. The first column will show information regarding the editor's profile, the middle column will show all of the articles the editor has written (paged five by five) and the third will be used for banners or other images. >> Continue Reading: Creating a Design with eZ Publish 4 Templating System: Part 2

Packt
19 Nov 2009
7 min read

Calendars in jQuery 1.3 with PHP using jQuery Week Calendar Plugin: Part 1

There are many reasons why you would want to display a calendar. You can use it to display upcoming events, to keep a diary, or to show a timetable. Recently, for example, I combined a calendar with an online store for a client to book meetings and receive payments more intuitively.

Google calendar is probably what springs to mind when people think of calendars online. There is a very good plugin called jquery-week-calendar that shows a week with events in a fashion similar to Google's calendar. Its homepage is at http://www.redredred.com.au/projects/jquery-week-calendar/.

To get the latest copy of the plugin, go to http://code.google.com/p/jquery-week-calendar/downloads/list and get the highest-numbered file. The examples in this article are done with version 1.2.0. Download the library and extract it so that there is a directory named jquery-weekcalendar-1.2.0 in the root of your demo directory.

Displaying the calendar

As usual, the HTML for the simplest configuration is very simple. Save this as calendar.html:

    <html>
      <head>
        <script src="../jquery.min.js"></script>
        <script src="../jquery-ui.min.js"></script>
        <script src="../jquery-weekcalendar-1.2.0/jquery.weekcalendar.js"></script>
        <script src="calendar.js"></script>
        <link rel="stylesheet" type="text/css" href="../jquery-ui.css" />
        <link rel="stylesheet" type="text/css" href="../jquery-weekcalendar-1.2.0/jquery.weekcalendar.css"/>
      </head>
      <body>
        <div id="calendar_wrapper" style="height:500px"></div>
      </body>
    </html>

We will keep all of our JavaScript in an external file called calendar.js, which will initially contain just this:

    $(document).ready(function() {
      $('#calendar_wrapper').weekCalendar({
        'height':function($calendar){
          return $('#calendar_wrapper')[0].offsetHeight;
        }
      });
    });

This is fairly straightforward. The script will apply the widget to the #calendar_wrapper element, and the widget's height will be set to that of the wrapper element.
Even with this tiny bit of code, we already have a good-looking calendar, and when you drag your mouse cursor around it, you'll see that events are created as you lift the mouse up.

It looks good, but it doesn't do anything yet. The events are temporary, and will vanish as soon as you change the week or reload the page. In order to make them permanent, we need to send details of the events to the server and save them there.

Creating an event

What we need to do is to have the client save the event on the server when it is created. In this article, we'll use PHP sessions to save the data for the sake of simplicity. Sessions are chunks of data that are kept on the server side and are tied to the cookie or PHPSESSID parameter that the client uses to access that session.

We will use sessions in these examples because they do not need as much setup as databases. If you are using this article to create a full application, you will want to use something more permanent than sessions, in which case the PHP code should be adapted so that all references to sessions are replaced with database access instead. This is beyond the scope of this book, but as you are a PHP developer, you probably do this every day anyway!

When the event has been created, we want a modal dialog to appear and ask for more details. In this test case, we'll add a text area for further details, which allows for more data than would appear in the small visible area in the calendar itself.

A modal dialog is a "pop up" that appears and blocks all other actions on the page until it has been taken care of. It's useful in cases where the answer to a question must be known before a script can carry on with its work.

Now, let's create an event and add it to our calendar.
Client-side code

In the calendar.js file, add an eventNew event to the weekCalendar call:

    $(document).ready(function() {
      $('#calendar_wrapper').weekCalendar({
        'height':function($calendar){
          return $('#calendar_wrapper')[0].offsetHeight;
        },
        'eventNew':function(calEvent, $event) {
          calendar_new_entry(calEvent,$event);
        }
      });
    });

When an event is created, the calendar_new_entry function will be called with details of the new event in the calEvent parameter. Now, add the function calendar_new_entry:

    function calendar_new_entry(calEvent,$event){
      var ds=calEvent.start, df=calEvent.end;
      // build the form and add it to the page
      $('<div id="calendar_new_entry_form" title="New Calendar Entry">'
        +'event name<br />'
        +'<input value="new event" id="calendar_new_entry_form_title" /><br />'
        +'body text<br />'
        +'<textarea style="width:400px;height:200px" id="calendar_new_entry_form_body">event description</textarea>'
        +'</div>').appendTo($('body'));
      // turn the form into a modal dialog
      $("#calendar_new_entry_form").dialog({
        bgiframe: true,
        autoOpen: false,
        height: 440,
        width: 450,
        modal: true,
        buttons: {
          'Save': function() {
            var $this=$(this);
            // times are sent in seconds, not milliseconds
            $.getJSON('./calendar.php?action=save&id=0&start='
              +ds.getTime()/1000+'&end='+df.getTime()/1000,
              {
                'body':$('#calendar_new_entry_form_body').val(),
                'title':$('#calendar_new_entry_form_title').val()
              },
              function(ret){
                $this.dialog('close');
                $('#calendar_wrapper').weekCalendar('refresh');
                $("#calendar_new_entry_form").remove();
              }
            );
          },
          Cancel: function() {
            $event.remove();
            $(this).dialog('close');
            $("#calendar_new_entry_form").remove();
          }
        },
        close: function() {
          // same element the widget was applied to
          $('#calendar_wrapper').weekCalendar('removeUnsavedEvents');
          $("#calendar_new_entry_form").remove();
        }
      });
      $("#calendar_new_entry_form").dialog('open');
    }

What's happening here is that a form is created and added to the body, then the dialog() call creates a modal window from that form and adds some buttons to it.
Our modal dialog should look like this:

The Save button, when pressed, calls the server-side file calendar.php with the parameters needed to save the event, including the start and end, and the title and body. When the result returns, the calendar is refreshed with the new event's data included.

When either of the buttons is clicked, we close the dialog and remove it from the page completely.

Note how we are sending time information to the server (the ds.getTime()/1000 and df.getTime()/1000 expressions in the code we just saw). JavaScript time functions usually measure in milliseconds, but we want to send the values to PHP, which generally measures time in seconds. So, we convert the value on the client so that PHP can use the received data as it is, without needing to do anything to it. Every little helps!

Server-side code

On the server side, we want to take the new event and save it. Remember that we're doing it in sessions in this example, but you should feel free to adapt this to any other model that you wish.

Create a file called calendar.php and save it with this source in it:

    <?php
    session_start();
    if(!isset($_SESSION['calendar'])){
      $_SESSION['calendar']=array(
        'ids'=>0,
      );
    }
    if(isset($_REQUEST['action'])){
      switch($_REQUEST['action']){
        case 'save': // {
          $start_date=(int)$_REQUEST['start'];
          $data=array(
            'title'=>(isset($_REQUEST['title'])?$_REQUEST['title']:''),
            'body' =>(isset($_REQUEST['body'])?$_REQUEST['body']:''),
            'start'=>date('c',$start_date),
            'end'  =>date('c',(int)$_REQUEST['end'])
          );
          $id=(int)$_REQUEST['id'];
          if($id && isset($_SESSION['calendar'][$id])){
            $_SESSION['calendar'][$id]=$data;
          }
          else{
            $id= ++$_SESSION['calendar']['ids'];
            $_SESSION['calendar'][$id]=$data;
          }
          echo 1;
          exit;
        // }
      }
    }
    ?>

In the server-side code of this project, all the requested actions are handled by a switch statement. This is done for demonstration purposes: whenever we add a new action, we simply add a new switch case.
If you are using this for your own purposes, you may wish to rewrite it using functions instead of large switch cases.

The date function is used to convert the start and end parameters into ISO 8601 date format. That's the format jquery-week-calendar prefers, so we'll try to keep everything in that format.

Visually, nothing appears to happen, but the data is actually being saved. To see what's being saved, create a new file named test.php, and use the var_dump function in it to examine the session data (view it in your browser):

    <?php
    session_start();
    var_dump($_SESSION);
    ?>

Here's an example from my test machine:

Packt
19 Nov 2009
15 min read

Plotting data using Matplotlib: Part 2

Plotting data from a CSV file

A common format to export and distribute datasets is the Comma-Separated Values (CSV) format. For example, spreadsheet applications allow us to export a CSV from a working sheet, and some databases also allow for CSV data export. Additionally, it's a common format to distribute datasets on the Web.

In this example, we'll be plotting the evolution of the world's population divided by continents, between 1950 and 2050 (the later years are, of course, predictions), using a new type of graph: stacked bars.

Using the data available at http://www.xist.org/earth/pop_continent.aspx (which takes its data from the official UN figures at http://esa.un.org/unpp/index.asp), we have prepared the following CSV file:

    Continent,1950,1975,2000,2010,2025,2050
    Africa,227270,418765,819462,1033043,1400184,1998466
    Asia,1402887,2379374,3698296,4166741,4772523,5231485
    Europe,547460,676207,726568,732759,729264,691048
    Latin America,167307,323323,521228,588649,669533,729184
    Northern America,171615,242360,318654,351659,397522,448464
    Oceania,12807,21286,31160,35838,42507,51338

In the first line, we can find the header with a description of what the data in the columns represent. The other lines contain the continent's name and its population (in thousands) for the given years.

There are several ways to parse a CSV file, for example:

  • NumPy's loadtxt() (what we are going to use here)
  • Matplotlib's mlab.csv2rec()
  • The csv module (in the standard library)

but we decided to go with loadtxt() because it's very powerful (and it's what Matplotlib is standardizing on).

Let's look at how we can plot it then:

    # for file opening made easier
    from __future__ import with_statement

We need this because we will use the with statement to read the file.
    # numpy
    import numpy as np

NumPy is used to load the CSV and for its useful array data type.

    # matplotlib plotting module
    import matplotlib.pyplot as plt
    # matplotlib colormap module
    import matplotlib.cm as cm
    # needed for formatting Y axis
    from matplotlib.ticker import FuncFormatter
    # Matplotlib font manager
    import matplotlib.font_manager as font_manager

In addition to the classic pyplot module, we need other Matplotlib submodules:

  • cm (color map): Considering the way we're going to prepare the plot, we need to specify the color map of the graphical elements
  • FuncFormatter: We will use this to change the way the Y-axis labels are displayed
  • font_manager: We want to have a legend with a smaller font, and font_manager allows us to do that

    def billions(x, pos):
        """Formatter for Y axis, values are in billions"""
        return '%1.fbn' % (x*1e-6)

This is the function that we will use to format the Y-axis labels. Our data is in thousands. Therefore, by dividing it by one million, we obtain values in the order of billions. The function is called for every label to draw, and is passed the label value and the position.

    # bar width
    width = .8

As said earlier, we will plot bars, and here we define their width.

The following is the parsing code. We know that it's a bit hard to follow (the data preparation code is usually the hardest part), but we will show how powerful it is.

    # open CSV file
    with open('population.csv') as f:

The function we're going to use, NumPy's loadtxt(), is able to receive either a filename or an open file object, as in this case. We have to open the file here because we have to strip the header line from the rest of the file and set up the data parsing structures.

        # read the first line, splitting the years
        years = list(map(int, f.readline().split(',')[1:]))

Here we read the first line, the header, and extract the years. We do that by calling the split() function and then mapping the int() function to the resulting list, from the second element onwards (as the first one is a string).
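The header parsing and the Y-axis formatter described above can be tried on their own, without a file or a plot. The following is a minimal sketch: the header string and the 2010 figures are copied from the CSV shown earlier, while the variable name total_2010 is only an illustrative assumption.

```python
# Illustrative only: the header string is copied from the CSV in this article.
header = "Continent,1950,1975,2000,2010,2025,2050\n"

# Skip the first field (the 'Continent' label) and convert the rest to integers.
# list() keeps the result indexable on Python 3, where map() returns an iterator.
years = list(map(int, header.split(',')[1:]))


def billions(x, pos):
    """Formatter for Y axis: x is in thousands, the label is in billions."""
    return '%1.fbn' % (x * 1e-6)


# Sum of the 2010 populations (in thousands) taken from the six CSV rows.
total_2010 = 1033043 + 4166741 + 732759 + 588649 + 351659 + 35838

print(years)
print(billions(total_2010, 0))
```

Because the format string uses no decimal places, the label is rounded to the nearest whole billion, which is enough for axis ticks.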
        # we prepare the dtype for extracting data; it's made of:
        # <1 string field> <len(years) integer fields>
        dtype = [('continents', 'S16')] + [('', np.int32)]*len(years)

NumPy is flexible enough to allow us to define new data types. Here, we are creating one ad hoc for our data lines: a string (of at most 16 characters) and as many integers as the length of the years list. Also note how the first element has a name, continents, while the integers that follow have none: we will need this in a bit.

        # we load the file, setting the delimiter and the dtype above
        y = np.loadtxt(f, delimiter=',', dtype=dtype)

With the new data type, we can actually call loadtxt(). Here is the description of the parameters:

  • f: This is the open file object. Please note that it now yields all the lines except the first one (which we read above and which contains the headers), so no data is lost.
  • delimiter: By default, loadtxt() expects the delimiter to be whitespace, but since we are parsing a CSV file, the separator is a comma.
  • dtype: This is the data type that is applied to the text we read. By default, loadtxt() tries to match against float values.

        # "map" the resulting structure to be easily accessible:
        # the first column (made of string) is called 'continents'
        # the remaining values are added to 'data' sub-matrix
        # where the real data are
        y = y.view(np.dtype([('continents', 'S16'), ('data', np.int32, len(years))]))

Here we're using a trick: we view the resulting data structure as made up of two parts, continents and data. It's similar to the dtype that we defined earlier, but with an important difference. Now, the integer values are mapped to a single field name, data. This results in the column continents, with all the continent names, and the matrix data, which contains the values for each year in each row of the file.

    data = y['data']
    continents = y['continents']

We can separate the data and the continents part into two variables for easier usage in the code.
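The view() trick can be checked in isolation. The sketch below is not the article's loading code: it assumes a hand-written two-row array in place of the loadtxt() result, and uses three year columns (the hypothetical n_years) instead of six to keep the sample short.

```python
import numpy as np

n_years = 3  # hypothetical: three year columns instead of six

# One named string field followed by unnamed integer fields, as in the article.
flat_dtype = [('continents', 'S16')] + [('', np.int32)] * n_years

# Hand-written sample rows standing in for what loadtxt() would return.
rows = np.array([(b'Africa', 227270, 418765, 819462),
                 (b'Asia', 1402887, 2379374, 3698296)], dtype=flat_dtype)

# Reinterpret the same bytes: the integer fields become one 'data' sub-array.
grouped_dtype = np.dtype([('continents', 'S16'), ('data', np.int32, n_years)])
y = rows.view(grouped_dtype)

data = y['data']              # 2-D integer matrix, one row per continent
continents = y['continents']  # 1-D array of byte strings
print(data.shape)
print(continents)
```

The view works because both dtypes describe records of the same size, so no data is copied; only the field layout used to read the records changes.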
    # prepare the bottom array
    bottom = np.zeros(len(years))

We prepare an array of zeros of the same length as years. As said earlier, we plot stacked bars, so each dataset is plotted on top of the previous ones; thus we need to know where the bars below finish. The bottom array keeps track of this, containing the height of the bars already plotted.

    # for each line in data
    for i in range(len(data)):

Now that we have our information in data, we can loop over it.

        # create the bars for each element, on top of the previous bars
        bt = plt.bar(range(len(data[i])), data[i], width=width,
                     color=cm.hsv(32*i), label=continents[i],
                     bottom=bottom)

and create the stacked bars. Some important notes:

  • We select the i-th row of data, and plot a bar for each of its elements (data[i]) with the chosen width.
  • As the bars are generated in different iterations, their colors would all be the same. To avoid this, we use a color map (in this case hsv), selecting a different color at each iteration, so the sub-bars will have different colors.
  • We label each bar set with the corresponding continent's name (useful for the legend).
  • As we have said, they are stacked bars. In fact, every iteration adds a piece of the global bars. To do so, we need to know where to start drawing the bar from (the lower limit), and bottom does this. It contains the value at which to start drawing the current bar.

        # update the bottom array
        bottom += data[i]

We update the bottom array. By adding the current data line, we know what the bottom line will be when plotting the next bars on top of it.

    # label the X ticks with years
    plt.xticks(np.arange(len(years))+width/2,
               [int(year) for year in years])

We then add the tick labels, the elements of years, right in the middle of each bar.

    # some information on the plot
    plt.xlabel('Years')
    plt.ylabel('Population (in billions)')
    plt.title('World Population: 1950 - 2050 (predictions)')

Add some information to the graph.
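The bookkeeping done by the bottom array can be followed without drawing anything. This is a small sketch using only the 1950 and 1975 columns of the first three CSV rows; the starts list is a hypothetical helper added here just to record where each segment would begin.

```python
import numpy as np

# Populations in thousands for 1950 and 1975, copied from the first three CSV rows.
data = np.array([[227270, 418765],     # Africa
                 [1402887, 2379374],   # Asia
                 [547460, 676207]])    # Europe

bottom = np.zeros(data.shape[1])
starts = []  # hypothetical helper: the lower edge of each continent's segment

for row in data:
    starts.append(bottom.copy())  # this segment starts where the stack currently ends
    bottom += row                 # the stack grows by the segment just "drawn"

print(starts[2])  # lower edge of the Europe segments
print(bottom)     # total height of each stacked bar
```

After the loop, bottom holds the total height of each bar, which is the combined population of the continents plotted so far.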
    # draw a legend, with a smaller font
    plt.legend(loc='upper left',
               prop=font_manager.FontProperties(size=7))

We now draw a legend in the upper-left position with a small font (to better fit the empty space).

    # apply the custom function as Y axis formatter
    plt.gca().yaxis.set_major_formatter(FuncFormatter(billions))

Finally, we change the Y-axis label formatter to use the custom formatting function that we defined earlier.

The result is the next screenshot, where we can see the composition of the world population divided by continents:

In the preceding screenshot, the whole bar represents the total world population, and the sections in each bar tell us how much each continent contributes to it. Also observe how the custom color map works: from bottom to top, we have represented Africa in red, Asia in orange, Europe in light green, Latin America in green, Northern America in light blue, and Oceania in blue (barely visible at the top of the bars).

Plotting extrapolated data using curve fitting

While plotting the CSV values, we have seen that there were some columns representing predictions of the world population in the coming years. We'd like to show how to obtain such predictions using the mathematical process of extrapolation with the help of curve fitting.

Curve fitting is the process of constructing a curve (a mathematical function) that best fits a series of data points.

This process is related to two other concepts:

  • interpolation: A method of constructing new data points within the range of a known set of points
  • extrapolation: A method of constructing new data points outside a known set of points

The results of extrapolation are subject to a greater degree of uncertainty and are influenced a lot by the fitting function that is used.
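How strongly extrapolation depends on the fitting function can be seen with a tiny experiment. The sketch below uses NumPy's polyfit() and polyval() on a hypothetical set of four points lying on y = x**2; the points and the gap() helper are assumptions made for illustration, not part of the population example.

```python
import numpy as np

# Hypothetical sample: four points lying exactly on y = x**2.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2

line = np.polyfit(x, y, 1)      # straight-line fit
parabola = np.polyfit(x, y, 2)  # quadratic fit


def gap(at):
    """Absolute difference between the two fitted curves at a given x."""
    return abs(np.polyval(line, at) - np.polyval(parabola, at))


inside = gap(2.5)   # within the known range: interpolation
outside = gap(6.0)  # beyond the known range: extrapolation
print(inside, outside)
```

Inside the known range the two fits stay close to each other, while a few units outside it they disagree by a large margin, which is the extra uncertainty mentioned above.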
So it works this way:

  • First, a known set of measures is passed to the curve fitting procedure, which computes a function to approximate these values
  • With this function, we can compute additional values that are not present in the original dataset

Let's first approach curve fitting with a simple example:

    # Numpy and Matplotlib
    import numpy as np
    import matplotlib.pyplot as plt

These are the classic imports.

    # the known points set
    data = [[2,2],[5,0],[9,5],[11,4],[12,7],[13,11],[17,12]]

This is the data we will use for curve fitting. They are points on a plane (so each has an X and a Y component).

    # we extract the X and Y components from previous points
    x, y = zip(*data)

We aggregate the X and Y components in two distinct lists.

    # plot the data points with a black cross
    plt.plot(x, y, 'kx')

Then we plot the original dataset as black crosses on the Matplotlib image.

    # we want a bit more data and more fine grained for
    # the fitting functions
    x2 = np.arange(min(x)-1, max(x)+1, .01)

We prepare a new array for the X values because we wish to have a wider set of values (one unit to the right and one to the left of the original list) and a fine grain to plot the fitting function nicely.

    # lines styles for the polynomials
    styles = [':', '-.', '--']

To differentiate better between the polynomial lines, we now define their styles list.

    # getting style and count one at time
    for d, style in enumerate(styles):

Then we loop over that list, also keeping track of the item count.

        # degree of the polynomial
        deg = d + 1

We define the actual polynomial degree.

        # calculate the coefficients of the fitting polynomial
        c = np.polyfit(x, y, deg)

Then we compute the coefficients of the fitting polynomial, whose general format is:

    c[0]*x**deg + c[1]*x**(deg - 1) + ... + c[deg]

        # we evaluate the fitting function against x2
        y2 = np.polyval(c, x2)

Here, we generate the new values by evaluating the fitting polynomial against the x2 array.
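The coefficient order returned by polyfit() is easy to get backwards, so here is a short check of the general formula above. It is a sketch on hypothetical points generated from y = 2*x**2 - 3*x + 1, not on the dataset used in this example.

```python
import numpy as np

# Hypothetical points generated from y = 2*x**2 - 3*x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x**2 - 3 * x + 1

deg = 2
c = np.polyfit(x, y, deg)  # coefficients, highest degree first

# Evaluate once by hand with the general formula, and once with polyval().
x_new = 4.0
by_hand = sum(c[k] * x_new ** (deg - k) for k in range(deg + 1))
with_polyval = np.polyval(c, x_new)
print(c, by_hand, with_polyval)
```

Both evaluations agree, and since x_new lies outside the fitted range, the value obtained is an extrapolation in the sense defined earlier.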
        # and then we plot it
        plt.plot(x2, y2, label="deg=%d" % deg, linestyle=style)

Then we plot the resulting function, adding a label that indicates the degree of the polynomial and using a different style for each line.

    # show the legend
    plt.legend(loc='upper left')

We then show the legend, and the final result is shown in the next screenshot:

Here, the polynomial with degree=1 is drawn as a dotted blue line, the one with degree=2 is a dash-dot green line, and the one with degree=3 is a dashed red line. We can see that the higher the degree, the better the fit of the function against the data.

Let's now return to our main goal: providing an extrapolation for the population data. First a note: we take the values for 2010 as real data and not predictions (we are quite near to that year), otherwise we would have very few values to create a realistic extrapolation. Let's see the code:

    # for file opening made easier
    from __future__ import with_statement
    # numpy
    import numpy as np
    # matplotlib plotting module
    import matplotlib.pyplot as plt
    # matplotlib colormap module
    import matplotlib.cm as cm
    # Matplotlib font manager
    import matplotlib.font_manager as font_manager

    # bar width
    width = .8

    # open CSV file
    with open('population.csv') as f:
        # read the first line, splitting the years
        # (list() keeps the result sliceable on Python 3 as well)
        years = list(map(int, f.readline().split(',')[1:]))
        # we prepare the dtype for extracting data; it's made of:
        # <1 string field> <6 integers fields>
        dtype = [('continents', 'S16')] + [('', np.int32)]*len(years)
        # we load the file, setting the delimiter and the dtype above
        y = np.loadtxt(f, delimiter=',', dtype=dtype)
        # "map" the resulting structure to be easily accessible:
        # the first column (made of string) is called 'continents'
        # the remaining values are added to 'data' sub-matrix
        # where the real data are
        y = y.view(np.dtype([('continents', 'S16'), ('data', np.int32, len(years))]))

    # extract fields
    data = y['data']
    continents = y['continents']

This is the same code that is used for the CSV example (reported here for completeness).

    x = years[:-2]
    x2 = years[-2:]

We are dividing the years into two groups: before and after 2010. This translates to splitting off the last two elements of the years list.

What we are going to do here is prepare the plot in two phases:

- First, we plot the data we consider certain
- After this, we plot the data from the UN predictions next to our extrapolations

    # prepare the bottom array
    b1 = np.zeros(len(years)-2)

We prepare the array (made of zeros) for the bottom argument of bar().

    # for each line in data
    for i in range(len(data)):
        # select all the data except the last 2 values
        d = data[i][:-2]

For each data line, we extract the information we need, so we remove the last two values.

        # create bars for each element, on top of the previous bars
        bt = plt.bar(range(len(d)), d, width=width,
                     color=cm.hsv(32*(i)), label=continents[i],
                     bottom=b1)
        # update the bottom array
        b1 += d

Then we plot the bar, and update the bottom array.

    # prepare the bottom array
    b2_1, b2_2 = np.zeros(2), np.zeros(2)

We need two arrays because we will display two bars for the same year: one from the CSV and the other from our fitting function.

    # for each line in data
    for i in range(len(data)):
        # extract the last 2 values
        d = data[i][-2:]

Again, for each line in the data matrix, we extract the last two values that are needed to plot the bar for the CSV.

        # select the data to compute the fitting function
        y = data[i][:-2]

We also select the other values, which are needed to compute the fitting polynomial.

        # use a polynomial of degree 3
        c = np.polyfit(x, y, 3)

Here, we set up a polynomial of degree 3; there is no need for higher degrees.

        # create a function out of those coefficients
        p = np.poly1d(c)

This method constructs a polynomial starting from the coefficients that we pass as parameter.

        # compute p on x2 values (we need integers, so the map;
        # list() keeps the result usable on Python 3 as well)
        y2 = list(map(int, p(x2)))

We use the polynomial that was defined earlier to compute its values for x2.
We also map the resulting values to integers, as the bar() function expects them for height.

        # create bars for each element, on top of the previous bars
        bt = plt.bar(len(b1)+np.arange(len(d)), d, width=width/2,
                     color=cm.hsv(32*(i)), bottom=b2_1)

We draw a bar for the data from the CSV. Note how the width is half of that of the other bars. This is because we will draw the two sets of bars in the same width for a better visual comparison.

        # create the bars for the extrapolated values
        bt = plt.bar(len(b1)+np.arange(len(d))+width/2, y2,
                     width=width/2, color=cm.bone(32*(i+2)),
                     bottom=b2_2)

Here, we plot the bars for the extrapolated values, using a dark color map so that we have an even better separation between the two datasets.

        # update the bottom array
        b2_1 += d
        b2_2 += y2

We update both the bottom arrays.

    # label the X ticks with years
    plt.xticks(np.arange(len(years))+width/2,
               [int(year) for year in years])

We add the years as ticks for the X-axis.

    # draw a legend, with a smaller font
    plt.legend(loc='upper left', prop=font_manager.FontProperties(size=7))

To avoid a very big legend, we used only the labels for the data from the CSV, skipping the extrapolated values. We believe it's pretty clear what they're referring to. Here is the screenshot that is displayed on executing this example:

The conclusion we can draw from this is that the United Nations uses a different function to prepare the predictions, especially because they have a continuous set of information, and they can also take into account other environmental circumstances while preparing such predictions.

Tools using Matplotlib

Given that it has an easy and powerful API, Matplotlib is also used inside other programs and tools when plotting is needed. We are about to present a couple of these tools:

- NetworkX
- Mpmath
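One practical note on the population listing above: on Python 3, map() returns an iterator rather than a list, so the year values and the extrapolated values need to be materialised before they can be sliced or added to a NumPy array. The following is a minimal sketch of that step only; the header line and the numbers are made up for illustration and do not come from population.csv.

```python
import numpy as np

# Hypothetical header line and values, standing in for population.csv
header = "continent,1970,1980,1990,2000,2010,2020,2030"

# list() makes the result sliceable on Python 3
years = list(map(int, header.split(',')[1:]))

# known years vs. years to extrapolate
x, x2 = years[:-2], years[-2:]

# made-up values for the known years
y = np.array([3, 4, 5, 6, 7])

# degree-3 fit, as in the article
p = np.poly1d(np.polyfit(x, y, 3))

# integer array instead of map(int, ...); astype(int) truncates like int()
y2 = np.asarray(p(x2)).astype(int)

# the in-place update works because y2 is an array, not an iterator
bottom = np.zeros(2)
bottom += y2
```

Using an integer array here is a design choice for the sketch; wrapping the original call as list(map(int, p(x2))) would serve the same purpose.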

Packt
19 Nov 2009
4 min read

Load, Validate, and Submit Forms using Ext JS 3.0: Part 3

Loading form data from the server An important part of working with forms is loading the data that a form will display. Here's how to create a sample contact form and populate it with data sent from the server. How to do it... Declare the name and company panel: var nameAndCompany = { columnWidth: .5, layout: 'form', items: [ { xtype: 'textfield', fieldLabel: 'First Name', name: 'firstName', anchor: '95%' }, { xtype: 'textfield', fieldLabel: 'Last Name', name: 'lastName', anchor: '95%' }, { xtype: 'textfield', fieldLabel: 'Company', name: 'company', anchor: '95%' }, { xtype: 'textfield', fieldLabel: 'Title', name: 'title', anchor: '95%' } ]} Declare the picture box panel: var picBox = { columnWidth: .5, bodyStyle: 'padding:0px 0px 0px 40px', items: [ { xtype: 'box', autoEl: { tag: 'div', style: 'padding-bottom:20px', html: '<img id="pic" src="' + Ext.BLANK_IMAGE_URL + '" class="img-contact" />' } }, { xtype: 'button', text: 'Change Picture' } ]} Define the Internet panel: var internet = { columnWidth: .5, layout: 'form', items: [ { xtype: 'fieldset', title: 'Internet', autoHeight: true, defaultType: 'textfield', items: [{ fieldLabel: 'Email', name: 'email', vtype: 'email', anchor: '95%' }, { fieldLabel: 'Web page', name: 'webPage', vtype: 'url', anchor: '95%' }, { fieldLabel: 'IM', name: 'imAddress', anchor: '95%' }] }]} Declare the phone panel: var phones = { columnWidth: .5, layout: 'form', items: [{ xtype: 'fieldset', title: 'Phone Numbers', autoHeight: true, defaultType: 'textfield', items: [{ fieldLabel: 'Home', name: 'homePhone', anchor: '95%' }, { fieldLabel: 'Business', name: 'busPhone', anchor: '95%' }, { fieldLabel: 'Mobile', name: 'mobPhone', anchor: '95%' }, { fieldLabel: 'Fax', name: 'fax', anchor: '95%' }] }]} Define the business address panel: var busAddress = { columnWidth: .5, layout: 'form', labelAlign: 'top', defaultType: 'textarea', items: [{ fieldLabel: 'Business', labelSeparator:'', name: 'bAddress', anchor: '95%' }, { xtype: 'radio', 
boxLabel: 'Mailing Address', hideLabel: true, name: 'mailingAddress', value:'bAddress', id:'mailToBAddress' }]} Define the home address panel: var homeAddress = { columnWidth: .5, layout: 'form', labelAlign: 'top', defaultType: 'textarea', items: [{ fieldLabel: 'Home', labelSeparator:'', name: 'hAddress', anchor: '95%' }, { xtype: 'radio', boxLabel: 'Mailing Address', hideLabel: true, name: 'mailingAddress', value:'hAddress', id:'mailToHAddress' }]} Create the contact form: var contactForm = new Ext.FormPanel({ frame: true, title: 'TODO: Load title dynamically', bodyStyle: 'padding:5px', width: 650, items: [{ bodyStyle: { margin: '0px 0px 15px 0px' }, items: [{ layout: 'column', items: [nameAndCompany, picBox] }] }, { items: [{ layout: 'column', items: [phones, internet] }] }, { xtype: 'fieldset', title: 'Addresses', autoHeight: true, hideBorders: true, layout: 'column', items: [busAddress, homeAddress] }], buttons: [{ text: 'Save' }, { text: 'Cancel' }]}); Handle the form's actioncomplete event: contactForm.on({ actioncomplete: function(form, action){ if(action.type == 'load'){ var contact = action.result.data; Ext.getCmp(contact.mailingAddress).setValue(true); contactForm.setTitle(contact.firstName + ' ' + contact.lastName); Ext.getDom('pic').src = contact.pic; } }}); Render the form: contactForm.render(document.body); Finally, load the form: contactForm.getForm().load({ url: 'contact.php', params:{id:'contact1'}, waitMsg: 'Loading'}); How it works... The contact form's building sequence consists of defining each of the contained panels, and then defining a form panel that will serve as a host. 
The following screenshot shows the resulting form, with the placement of each of the panels pinpointed:

Moving on to how the form is populated, the JSON-encoded response to a request to provide form data has a structure similar to this:

    {success:true,
     data:{id:'1',
           firstName:'Jorge',
           lastName:'Ramon',
           company:'MiamiCoder',
           title:'Mr',
           pic:'img/jorger.jpg',
           email:'ramonj@miamicoder.net',
           webPage:'http://www.miamicoder.com',
           imAddress:'',
           homePhone:'',
           busPhone:'555 555-5555',
           mobPhone:'',
           fax:'',
           bAddress:'123 Acme Rd #001\nMiami, FL 33133',
           hAddress:'',
           mailingAddress:'mailToBAddress'}}

The success property indicates whether the request has succeeded or not. If the request succeeds, success is accompanied by a data property, which contains the contact's information.

Although some fields are automatically populated after a call to load(), the form's title, the contact's picture, and the mailing address radio button require further processing. This can be done in the handler for the actioncomplete event:

    contactForm.on({
        actioncomplete: function(form, action){
            if(action.type == 'load'){}
        }
    });

As already mentioned, the contact's information arrives in the data property of the action's result:

    var contact = action.result.data;

The default mailing address comes in the contact's mailingAddress property. Hence, the radio button for the default mailing address is set as shown in the following line of code:

    Ext.getCmp(contact.mailingAddress).setValue(true);

The source for the contact's photo is the value of contact.pic:

    Ext.getDom('pic').src = contact.pic;

And finally, the title of the form:

    contactForm.setTitle(contact.firstName + ' ' + contact.lastName);

There's more...

Although this recipe's focus is on loading form data, you should also pay attention to the layout techniques used (multiple rows, multiple columns, fieldsets) that allow you to achieve rich and flexible user interfaces for your forms.

See Also...
The next recipe, Serving the XML data to a form, explains how to use a form to load the XML data sent from the server.
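Returning to the response format described under How it works: the recipe serves it from contact.php, whose source is not shown. As an illustration only, the sketch below builds the same success/data envelope in Python. The helper name build_load_response is hypothetical, and the failure branch (returning only success: false) is an assumption rather than something the recipe specifies.

```python
import json

def build_load_response(contact):
    """Return the JSON text for a form load request (hypothetical helper)."""
    if contact is None:
        # Assumed failure shape: only the success flag
        return json.dumps({"success": False})
    # Success: the record travels in the "data" property
    return json.dumps({"success": True, "data": contact})

# A shortened version of the contact record shown above
contact = {
    "id": "1",
    "firstName": "Jorge",
    "lastName": "Ramon",
    "bAddress": "123 Acme Rd #001\nMiami, FL 33133",
    "mailingAddress": "mailToBAddress",
}

response = build_load_response(contact)
print(response)
```

Letting a JSON library do the encoding means the line break inside bAddress is escaped automatically, so it cannot be lost or break the response.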

Packt
19 Nov 2009
10 min read

Data Tables and DataTables Plugin in jQuery 1.3 with PHP

In this article by Kae Verens, we will look at:

- How to install and use the DataTables plugin
- How to load data pages on request from the server
- Searching and ordering the data

From time to time, you will want to show data in your website and allow the data to be sorted and searched. It always impresses me that whenever I need to do anything with jQuery, there are usually plugins available, which are exactly or close to what I need. The DataTables plugin allows sorting, filtering, and pagination on your data.

Here's an example screen from the project we will build in this article. The data is from a database of cities of the world, filtered to find out if there is any place called nowhere in the world:

Get your copy of DataTables from http://www.datatables.net/, and extract it into the directory datatables, which is in the same directory as the jquery.min.js file. What the DataTables plugin does is take a large table, paginate it, and allow the columns to be ordered, and the cells to be filtered.

Setting up DataTables

Setting up DataTables involves setting up a table so that it has distinct <thead> and <tbody> sections, and then simply running dataTable() on it. As a reminder, tables in HTML have a header and a body. The HTML elements <thead> and <tbody> are optional according to the specifications, but the DataTables plugin requires that you put them in, so that it knows what to work with. These elements may not be familiar to you, as they are usually not necessary when you are writing your web pages and most people leave them out, but DataTables needs to know what area of the table to turn into a navigation bar, and which area will contain the data, so you need to include them.

Client-side code

The first example in this article is purely a client-side one. We will provide the data in the same page that is demonstrating the table.
Copy the following code into a file in a new demo directory and name it tables.html:

    <html>
      <head>
        <script src="../jquery.min.js"></script>
        <script src="../datatables/media/js/jquery.dataTables.js">
        </script>
        <style type="text/css">
          @import "../datatables/media/css/demo_table.css";
        </style>
        <script>
          $(document).ready(function(){
            $('#the_table').dataTable();
          });
        </script>
      </head>
      <body>
        <div style="width:500px">
          <table id="the_table">
            <thead>
              <tr>
                <th>Artist / Band</th><th>Album</th><th>Song</th>
              </tr>
            </thead>
            <tbody>
              <tr><td>Muse</td>
                <td>Absolution</td>
                <td>Sing for Absolution</td>
              </tr>
              <tr><td>Primus</td>
                <td>Sailing The Seas Of Cheese</td>
                <td>Tommy the Cat</td>
              </tr>
              <tr><td>Nine Inch Nails</td>
                <td>Pretty Hate Machine</td>
                <td>Something I Can Never Have</td>
              </tr>
              <tr><td>Horslips</td>
                <td>The Táin</td>
                <td>Dearg Doom</td>
              </tr>
              <tr><td>Muse</td>
                <td>Absolution</td>
                <td>Hysteria</td>
              </tr>
              <tr><td>Alice In Chains</td>
                <td>Dirt</td>
                <td>Rain When I Die</td>
              </tr>
              <!-- PLACE MORE SONGS HERE -->
            </tbody>
          </table>
        </div>
      </body>
    </html>

When this is viewed in the browser, we immediately have a working data table:

Note that the rows are in alphabetical order according to Artist/Band. DataTables automatically sorts your data initially based on the first column.

The HTML provided has a <div> wrapper around the table, set to a fixed width. The reason for this is that the Search box at the top and the pagination buttons at the bottom are floated to the right, outside the HTML table. The <div> wrapper is provided to try to keep them at the same width as the table.

There are 14 entries in the HTML, but only 10 of them are shown here. Clicking the right-hand arrow in the pagination area at the bottom right loads up the next page:

And finally, we also have the ability to sort by column and search all data:

In this screenshot, we have the data filtered by the word horslips, and have ordered Song in descending order by clicking the header twice.
With just this example, you can probably manage quite a few of your lower-bandwidth information tables. By this, I mean that you could run the DataTables plugin on complete tables of a few hundred rows. Beyond that, the bandwidth and memory usage would start affecting your reader's experience. In that case, it's time to go on to the next section and learn how to serve the data on demand using jQuery and Ajax.

As an example of usage, a user list might reasonably be printed entirely to the page and then converted using the DataTables plugin because, for smaller sites, the user list might only be a few tens of rows and thus, serving it over Ajax may be overkill. It is more likely, though, that the kind of information that you would really want this applied to is part of a much larger data set, which is where the rest of the article comes in!

Getting data from the server

The rest of the article will build up a sample application, which is a search application for cities of the world. This example will need a database, and a large data set. I chose a list of city names and their spelling variants as my data set. You can get a list of this type online by searching.

The exact point at which you decide a data set is large enough to require it to be converted to serve over Ajax, instead of being printed fully to the HTML source, depends on a few factors, which are mostly subjective. A quick test is: if you only ever need to read a few pages of the data, yet there are many pages in the source and the HTML is slow to load, then it's time to convert.

The database I'm using in the example is MySQL (http://www.mysql.com/). It is trivial to convert the example to use any other database, such as PostgreSQL or SQLite.

For your use, here is a short list of large data sets:

- http://wordlist.sourceforge.net/: Links to collections of words.
- http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs: A list of books placed online by Project Gutenberg.
- http://www.world-gazetteer.com/wg.php?men=stdl: A list of all the cities in the world, including populations.

The reason I chose a city name list is that I wanted to provide a realistic large example of when you would use this. In your own applications, you might also use the DataTables plugin to manage large lists of products, objects such as pages or images, and anything else that can be listed in tabular form and might be very large.

The city list I found has over two million variants in it, so it is an extreme example of how to set up a searchable table. It's also a perfect example of why the Ajax capabilities of the DataTables project are important. Just to see the result, I exported all the entries into an HTML table, and the file size was 179 MB. That is obviously too large for a web page. So, let's find out how to break the information into chunks and load it only as needed.

Client-side code

On the client side, we do not need to provide placeholder data. Simply print out the table, leaving the <tbody> section blank, and let DataTables retrieve the data from the server. We're starting a new project here, so create a new directory in your demos section and save the following into it as tables.html:

    <html>
      <head>
        <script src="../jquery.min.js"></script>
        <script src="../datatables/media/js/jquery.dataTables.js">
        </script>
        <style type="text/css">
          @import "../datatables/media/css/demo_table.css";
          table{width:100%}
        </style>
        <script>
          $(document).ready(function(){
            $('#the_table').dataTable({
              'sAjaxSource':'get_data.php'
            });
          });
        </script>
      </head>
      <body>
        <div style="width:500px">
          <table id="the_table">
            <thead>
              <tr>
                <th>Country</th>
                <th>City</th>
                <th>Latitude</th>
                <th>Longitude</th>
              </tr>
            </thead>
            <tbody>
            </tbody>
          </table>
        </div>
      </body>
    </html>

In this example, we've added a parameter to the .dataTable call, sAjaxSource, which is the URL of the script that will provide the data (the file will be named get_data.php).
Server-side code

On the server side, we will start off by providing the first ten rows from the database. DataTables expects the data to be returned as a two-dimensional array named aaData. In my own database, I've created a table like this:

    CREATE TABLE `cities` (
      `ccode` char(2) DEFAULT NULL,
      `city` varchar(87) DEFAULT NULL,
      `longitude` float DEFAULT NULL,
      `latitude` float DEFAULT NULL,
      KEY `city` (`city`(5))
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8

Most of the searching will be done on city names, so I've indexed city. Initially, let's just extract the first page of information. Create a file called get_data.php and save it in the same directory as tables.html:

    <?php
    // { initialise variables
    $amt=10;
    $start=0;
    // }
    // { connect to database
    function dbRow($sql){
      $q=mysql_query($sql);
      $r=mysql_fetch_array($q);
      return $r;
    }
    function dbAll($sql){
      $rs=array(); // make sure an array is returned even with no rows
      $q=mysql_query($sql);
      while($r=mysql_fetch_array($q))$rs[]=$r;
      return $rs;
    }
    mysql_connect('localhost','username','password');
    mysql_select_db('phpandjquery');
    // }
    // { count existing records
    $r=dbRow('select count(ccode) as c from cities');
    $total_records=$r['c'];
    // }
    // { start displaying records
    echo '{"iTotalRecords":'.$total_records.',
      "iTotalDisplayRecords":'.$total_records.',
      "aaData":[';
    // the column order matches the table header in tables.html:
    // Country, City, Latitude, Longitude
    $rs=dbAll("select ccode,city,latitude,longitude from cities
      order by ccode,city limit $start,$amt");
    $f=0;
    foreach($rs as $r){
      if($f++) echo ',';
      echo '["',$r['ccode'],'",
        "',addslashes($r['city']),'",
        "',$r['latitude'],'",
        "',$r['longitude'],'"]';
    }
    echo ']}';
    // }

In a nutshell, what happens is that the script counts how many cities there are in total, and then returns that count along with the first ten entries to the client browser using JSON as the transport.
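The shape of the response is easier to see when a JSON library builds it instead of string concatenation. The sketch below is not the article's PHP script: it is an illustration in Python that uses an in-memory SQLite table with made-up rows as a stand-in for the MySQL cities table, and the function name build_payload is hypothetical. Only the keys iTotalRecords, iTotalDisplayRecords, and aaData come from the article.

```python
import json
import sqlite3

def build_payload(conn, start=0, amount=10):
    """Return one page of cities as the JSON text DataTables expects."""
    total = conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]
    rows = conn.execute(
        "SELECT ccode, city, latitude, longitude FROM cities "
        "ORDER BY ccode, city LIMIT ? OFFSET ?",
        (amount, start),
    ).fetchall()
    return json.dumps({
        "iTotalRecords": total,
        "iTotalDisplayRecords": total,
        # one inner list per table row, every cell sent as a string
        "aaData": [[str(value) for value in row] for row in rows],
    })

# Made-up sample data standing in for the real city list
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cities (ccode TEXT, city TEXT, longitude REAL, latitude REAL)"
)
conn.executemany(
    "INSERT INTO cities (ccode, city, latitude, longitude) VALUES (?, ?, ?, ?)",
    [("IE", "Dublin", 53.35, -6.26),
     ("IE", "Cork", 51.9, -8.47),
     ("US", "O'Fallon", 38.81, -90.7)],
)

payload = build_payload(conn, start=0, amount=2)
print(payload)
```

A JSON encoder takes care of quotes and other special characters in city names, which hand-built strings can get wrong, and the parameterised LIMIT/OFFSET keeps the paging values out of the SQL text.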

Packt
19 Nov 2009
10 min read

Advanced Matplotlib: Part 2

Plotting dates

Sooner or later, we all have had the need to plot some information over time, be it for the bank account balance each month, the total web site accesses for each day of the year, or one of many other reasons. Matplotlib has a plotting function specifically for dates, plot_date(), that treats the data on the X, Y, or both axes as dates, labeling the axis accordingly. As usual, we now present an example, and we will discuss it later:

    In [1]: import matplotlib as mpl
    In [2]: import matplotlib.pyplot as plt
    In [3]: import numpy as np
    In [4]: import datetime as dt
    In [5]: dates = [dt.datetime.today() + dt.timedelta(days=i)
       ...:          for i in range(10)]
    In [6]: values = np.random.rand(len(dates))
    In [7]: plt.plot_date(mpl.dates.date2num(dates), values, linestyle='-');
    In [8]: plt.show()

First, a note about the linestyle keyword argument: without it, there's no line connecting the markers, which are displayed alone. We created the dates array using timedelta(), a datetime class that helps us define a date interval (10 days in this case). Note how we had to convert our date values using the date2num() function. This is because Matplotlib represents dates as float values corresponding to the number of days since 0001-01-01 UTC. Also note how the X-axis labels, the ones that have data values, are badly rendered.

Matplotlib provides ways to address the previous two points: date formatting and conversion, and axes formatting.
Date formatting

Commonly, in Python programs, dates are represented as datetime objects, so we have to first convert other data values into datetime objects, sometimes by using the dateutil companion module, for example:

    import datetime
    date = datetime.datetime(2009, 3, 28, 11, 34, 59, 12345)

or

    import dateutil.parser
    datestrings = ['2008-07-18 14:36:53.494013',
                   '2008-07-20 14:37:01.508990',
                   '2008-07-28 14:49:26.183256']
    dates = [dateutil.parser.parse(s) for s in datestrings]

Once we have the datetime objects, in order to let Matplotlib use them, we have to convert them into floating point numbers that represent the number of days since 0001-01-01 00:00:00 UTC. To do that, Matplotlib itself provides several helper functions contained in the matplotlib.dates module:

- date2num(): This function converts one or a sequence of datetime objects to float values representing days since 0001-01-01 00:00:00 UTC (the fractional parts represent hours, minutes, and seconds)
- num2date(): This function converts one or a sequence of float values representing days since 0001-01-01 00:00:00 UTC to datetime objects (or a sequence, if the input is a sequence)
- drange(dstart, dend, delta): This function returns a date range (a sequence) of float values in Matplotlib date format; dstart and dend are datetime objects while delta is a datetime.timedelta instance

Usually, what we will end up doing is converting a sequence of datetime objects into a Matplotlib representation, such as:

    # dates is a list of datetime objects
    mpl_dates = matplotlib.dates.date2num(dates)

drange() can be useful in situations like this one:

    import matplotlib as mpl
    from matplotlib import dates
    import datetime as dt
    date1 = dt.datetime(2008, 9, 23)
    date2 = dt.datetime(2009, 4, 12)
    delta = dt.timedelta(days=10)
    dates = mpl.dates.drange(date1, date2, delta)

where dates will be a sequence of floats starting from date1 and running up to, but not including, date2, with an interval of delta between consecutive items.
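The conversion helpers above can be tried out without drawing anything. The following is a minimal sketch reusing the dates from the drange() example. One caveat: the absolute float values depend on the Matplotlib version (releases from 3.3 onwards use 1970-01-01 as the default epoch instead of 0001-01-01), so the sketch only relies on differences and round trips.

```python
import datetime as dt
import matplotlib.dates as mdates

start = dt.datetime(2008, 9, 23)
end = dt.datetime(2009, 4, 12)
step = dt.timedelta(days=10)

# datetime -> float (days since the Matplotlib epoch)
as_float = mdates.date2num(start)

# float -> datetime; the result is timezone-aware (UTC), so we drop tzinfo
back = mdates.num2date(as_float).replace(tzinfo=None)

# evenly spaced floats from start up to, but not including, end
dates = mdates.drange(start, end, step)

print(back, len(dates), dates[1] - dates[0])
```

The spacing between consecutive items is 10.0 because the unit of the Matplotlib date format is one day.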
Axes formatting with axes tick locators and formatters

As we have already seen, the X labels on the first image are not that nice looking. We would expect Matplotlib to allow a better way to label the axis, and indeed, there is. The solution is to change the two parts that form the axis ticks: locators and formatters. Locators control the tick's position, while formatters control the formatting of labels. Both have a major and minor mode: the major locator and formatter are active by default and are the ones we commonly see, while minor mode can be turned on by passing the corresponding locator or formatter (minors are turned off by default by assigning NullLocator and NullFormatter to them).

While this is a general tuning operation and can be applied to all Matplotlib plots, there are some specific locators and formatters for date plotting, provided by matplotlib.dates:

- MinuteLocator, HourLocator, DayLocator, WeekdayLocator, MonthLocator, and YearLocator are the available locators that place a tick at the time specified by the name; for example, DayLocator will draw a tick at each day. Of course, a minimum knowledge of the date interval that we are about to draw is needed to select the best locator.
- DateFormatter is the tick formatter that uses strftime() to format strings.

The default locator and formatter are matplotlib.dates.AutoDateLocator and matplotlib.dates.AutoDateFormatter, respectively. Both are set by the plot_date() function when called. So, if you wish to set a different locator and/or formatter, then we suggest doing that after the plot_date() call in order to avoid the plot_date() function resetting them to the default values.
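As a compact illustration of the ordering advice above, the sketch below plots first and only then installs a locator and a formatter. It is a sketch under stated assumptions: it uses the plain plot() method with datetime values (recent Matplotlib documentation discourages plot_date() in favour of plot()), it selects the non-interactive Agg backend so that it runs without a display, and the data values are made up.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Ten consecutive days with made-up values
dates = [dt.datetime(2008, 9, 23) + dt.timedelta(days=i) for i in range(10)]
values = np.linspace(0.0, 1.0, len(dates))

fig, ax = plt.subplots()
ax.plot(dates, values, linestyle='-', marker='o')

# Set locators and formatter after plotting, so they are not overridden
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_minor_locator(mdates.HourLocator(interval=6))

# Rotate the date labels and leave room for them
fig.autofmt_xdate()
fig.canvas.draw()
```

In an interactive session, plt.show() would replace the last line; fig.savefig() with a file name of your choice works with the Agg backend.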
Let's group all this up in an example:

    In [1]: import matplotlib as mpl
    In [2]: import matplotlib.pyplot as plt
    In [3]: import numpy as np
    In [4]: import datetime as dt
    In [5]: fig = plt.figure()
    In [6]: ax2 = fig.add_subplot(212)
    In [7]: date2_1 = dt.datetime(2008, 9, 23)
    In [8]: date2_2 = dt.datetime(2008, 10, 3)
    In [9]: delta2 = dt.timedelta(days=1)
    In [10]: dates2 = mpl.dates.drange(date2_1, date2_2, delta2)
    In [11]: y2 = np.random.rand(len(dates2))
    In [12]: ax2.plot_date(dates2, y2, linestyle='-');
    In [13]: dateFmt = mpl.dates.DateFormatter('%Y-%m-%d')
    In [14]: ax2.xaxis.set_major_formatter(dateFmt)
    In [15]: daysLoc = mpl.dates.DayLocator()
    In [16]: hoursLoc = mpl.dates.HourLocator(interval=6)
    In [17]: ax2.xaxis.set_major_locator(daysLoc)
    In [18]: ax2.xaxis.set_minor_locator(hoursLoc)
    In [19]: fig.autofmt_xdate(bottom=0.18) # adjust for date labels display
    In [20]: fig.subplots_adjust(left=0.18)
    In [21]: ax1 = fig.add_subplot(211)
    In [22]: date1_1 = dt.datetime(2008, 9, 23)
    In [23]: date1_2 = dt.datetime(2009, 2, 16)
    In [24]: delta1 = dt.timedelta(days=10)
    In [25]: dates1 = mpl.dates.drange(date1_1, date1_2, delta1)
    In [26]: y1 = np.random.rand(len(dates1))
    In [27]: ax1.plot_date(dates1, y1, linestyle='-');
    In [28]: monthsLoc = mpl.dates.MonthLocator()
    In [29]: weeksLoc = mpl.dates.WeekdayLocator()
    In [30]: ax1.xaxis.set_major_locator(monthsLoc)
    In [31]: ax1.xaxis.set_minor_locator(weeksLoc)
    In [32]: monthsFmt = mpl.dates.DateFormatter('%b')
    In [33]: ax1.xaxis.set_major_formatter(monthsFmt)
    In [34]: plt.show()

The result of executing the previous code snippet is as shown:

We drew the subplots in reverse order to avoid some minor overlapping problems. fig.autofmt_xdate() is used to nicely format date tick labels. In particular, this function rotates the labels (by using the rotation keyword argument, with a default value of 30°) and gives them more room (by using the bottom keyword argument, with a default value of 0.2).
We can achieve the same result, at least for the additional spacing, with:

    fig = plt.figure()
    fig.subplots_adjust(bottom=0.2)
    ax = fig.add_subplot(111)

This can also be done by creating the Axes instance directly with:

    ax = fig.add_axes([left, bottom, width, height])

while specifying the explicit dimensions.

The subplots_adjust() function allows us to control the spacing around the subplots by using the following keyword arguments:

- bottom, top, left, right: Control the spacing at the bottom, top, left, and right of the subplot(s)
- wspace, hspace: Control the horizontal and vertical spacing between subplots

We can also control the spacing by using these parameters in the Matplotlib configuration file:

    figure.subplot.<position> = <value>

Custom formatters and locators

Even if it's not strictly related to date plotting, tick formatters allow for custom formatters too:

    ...
    import matplotlib.ticker as ticker
    ...
    def format_func(x, pos):
        return <a transformation on x>
    ...
    formatter = ticker.FuncFormatter(format_func)
    ax.xaxis.set_major_formatter(formatter)
    ...

The function format_func will be called for each label to draw, passing its value and position on the axis. With those two arguments, we can apply a transformation (for example, divide x by 10) and then return a value that will be used to actually draw the tick label.

Here's a general note on NullLocator: it can be used to remove axis ticks by simply issuing:

    ax.xaxis.set_major_locator(matplotlib.ticker.NullLocator())

Text properties, fonts, and LaTeX

Matplotlib has excellent text support, including mathematical expressions, TrueType font support for raster and vector outputs, newline separated text with arbitrary rotations, and Unicode. We have total control over every text property (font size, font weight, text location, color, and so on) with sensible defaults set in the rc configuration file.
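As a brief aside on the custom formatters described above, here is one concrete, runnable instance of the format_func template. The transformation (dividing the tick value by 10 and showing one decimal) is just the example mentioned in the text, and the plotted data is made up; the Agg backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

def format_func(x, pos):
    """Label each tick with its value divided by 10 (pos is unused here)."""
    return '%.1f' % (x / 10.0)

formatter = ticker.FuncFormatter(format_func)

fig, ax = plt.subplots()
ax.plot([0, 10, 20, 30], [1, 4, 9, 16])
ax.xaxis.set_major_formatter(formatter)

# Remove the Y-axis ticks altogether with NullLocator
ax.yaxis.set_major_locator(ticker.NullLocator())

fig.canvas.draw()
```

Calling the formatter directly, for example formatter(25, 0), is a quick way to check what a given tick value will look like.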
Specifically for those interested in mathematical or scientific figures, Matplotlib implements a large number of TeX math symbols and commands to support mathematical expressions anywhere in the figure.

We already saw some text functions, but the following list contains all the functions which can be used to insert text with the pyplot interface, presented along with the corresponding API method and a description:

| Pyplot function | API method | Description |
| text() | mpl.axes.Axes.text() | Adds text at an arbitrary location to the Axes |
| xlabel() | mpl.axes.Axes.set_xlabel() | Adds an axis label to the X-axis |
| ylabel() | mpl.axes.Axes.set_ylabel() | Adds an axis label to the Y-axis |
| title() | mpl.axes.Axes.set_title() | Adds a title to the Axes |
| figtext() | mpl.figure.Figure.text() | Adds text at an arbitrary location to the Figure |
| suptitle() | mpl.figure.Figure.suptitle() | Adds a centered title to the Figure |
| annotate() | mpl.axes.Axes.annotate() | Adds an annotation with an optional arrow to the Axes |

All of these commands return a matplotlib.text.Text instance. We can customize the text properties by passing keyword arguments to the functions or by using matplotlib.artist.setp(). Passing keyword arguments looks like this:

    t = plt.xlabel('some text', fontsize=16, color='green')

The equivalent with setp() is:

    t = plt.xlabel('some text')
    plt.setp(t, fontsize=16, color='green')

Handling objects allows for several new possibilities, such as setting the same property on all the objects in a specific group. Matplotlib has several convenience functions to return the objects of a plot. Let's take the example of the tick labels:

    ax.get_xticklabels()

This line of code returns a sequence of object instances (the labels for the X-axis ticks) that we can tune:

    for t in ax.get_xticklabels():
        t.set_fontsize(5.)

or else, still using setp():

    plt.setp(ax.get_xticklabels(), fontsize=5.)

setp() can take a sequence of objects and apply the same property to all of them.
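The two uses of setp() described above, on a single Text instance and on a sequence of them, can be put together in a short sketch. The plotted data is made up, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# A single Text instance: the X-axis label
label = ax.set_xlabel('some text')
plt.setp(label, fontsize=16, color='green')

# A sequence of Text instances: all the X-axis tick labels
tick_labels = ax.get_xticklabels()
plt.setp(tick_labels, fontsize=5.0, rotation=30)

sizes = [t.get_fontsize() for t in tick_labels]
print(label.get_fontsize(), label.get_color(), sizes)
```

The same keyword names (fontsize, color, rotation) work in both calls because every element involved is a Text instance.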
To recap, all of the properties such as color, fontsize, position, rotation, and so on, can be set either:

  • At function call using keyword arguments
  • Using setp() referencing the Text instance
  • Using the modification function

Fonts

Where there is text, there are also fonts to draw it. Matplotlib allows for several font customizations. The most complete documentation on this is currently available in the Matplotlib configuration file, /etc/matplotlibrc. We reproduce that information here. There are six font properties available for modification.

Property name | Values and description

font.family
    It has five values:
      • serif (for example, Times)
      • sans-serif (for example, Helvetica)
      • cursive (for example, Zapf-Chancery)
      • fantasy (for example, Western)
      • monospace (for example, Courier)
    Each of these font families has a default list of font names in decreasing order of priority associated with them (next table). In addition to these generic font names, font.family may also be an explicit name of a font available on the system.

font.style
    Three values: normal (or roman), italic, or oblique. The oblique style will be used for italic, if it is not present.

font.variant
    Two values: normal or small-caps. For TrueType fonts, which are scalable fonts, small-caps is equivalent to using a font size of smaller, or about 83% of the current font size.

font.weight
    Effectively has 13 values: normal, bold, bolder, lighter, 100, 200, 300, ..., 900. normal is the same as 400, and bold is 700. bolder and lighter are relative values with respect to the current weight.

font.stretch
    11 values: ultra-condensed, extra-condensed, condensed, semi-condensed, normal, semi-expanded, expanded, extra-expanded, ultra-expanded, wider, and narrower. This property is not currently implemented. It works if the font supports it, but only a few do.

font.size
    The default font size for text, given in points. 12pt is the standard value.
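The three ways of setting text properties recapped above can be sketched in one short script. This is only an illustration: the sample data, the label strings, and the choice of the non-interactive Agg backend are assumptions made here so that the snippet runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no GUI needed
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([0, 1, 2], [0, 1, 4])  # arbitrary sample data

# 1. Keyword arguments at function call
title = ax.set_title("some title", fontsize=16, fontweight="bold")

# 2. setp() referencing an existing Text instance
xlabel = ax.set_xlabel("some text")
plt.setp(xlabel, fontsize=14, color="green", fontstyle="italic")

# 3. The modification (set_*) methods, applied to each tick label
tick_labels = ax.get_xticklabels()
for label in tick_labels:
    label.set_fontsize(5.0)
```

All three routes end up changing the same Text objects, so they can be mixed freely; setp() is mostly a convenience when the same property has to be applied to a whole sequence of objects.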
Packt
19 Nov 2009
7 min read

Advanced Matplotlib: Part 1

The basis for all of these topics is the object-oriented interface.

Object-oriented versus MATLAB styles

We have seen a lot of examples, and in all of them we used the matplotlib.pyplot module to create and manipulate the plots, but this is not the only way to make use of the Matplotlib plotting power. There are three ways to use Matplotlib:

  • pyplot: The module used so far in this article
  • pylab: A module to merge Matplotlib and NumPy together in an environment closer to MATLAB
  • Object-oriented way: The Pythonic way to interface with Matplotlib

Let's first elaborate a bit about the pyplot module: pyplot provides a MATLAB-style, procedural, state-machine interface to the underlying object-oriented library in Matplotlib. A state machine is a system with a global status, where each operation performed on the system changes its status. matplotlib.pyplot is stateful because the underlying engine keeps track of the current figure and plotting area information, and plotting functions change that information. To make it clearer, we did not use any object references during our plotting; we just issued a pyplot command, and the changes appeared in the figure.

At a higher level, matplotlib.pyplot is a collection of commands and functions that make Matplotlib behave like MATLAB (for plotting). This is really useful when doing interactive sessions, because we can issue a command and see the result immediately, but it has several drawbacks when we need something more, such as low-level customization or application embedding.

If we remember, Matplotlib started as an alternative to MATLAB, where we have at hand both numerical and plotting functions. A similar interface exists for Matplotlib, and its name is pylab.

pylab (do you see the similarity in the names?)
is a companion module, installed next to matplotlib, that merges the matplotlib.pyplot (for plotting) and numpy (for mathematical functions) modules in a single namespace to provide an environment as near to MATLAB as possible, so that the transition would be easy.

We and the authors of Matplotlib discourage the use of pylab, other than for proof-of-concept snippets. While being rather simple to use, it teaches developers the wrong way to use Matplotlib.

The third way to use Matplotlib is through the object-oriented interface (OO, from now on). This is the most powerful way to write Matplotlib code because it allows for complete control of the result; however, it is also the most complex. This is the Pythonic way to use Matplotlib, and it's highly encouraged when programming with Matplotlib rather than working interactively. We will use it a lot from now on as it's needed to go down deep into Matplotlib.

Please allow us to highlight again the preferred style that the author of this article and the authors of Matplotlib want to enforce: a bit of pyplot will be used, in particular for convenience functions, and the remaining plotting code is either done with the OO style or with pyplot, with numpy explicitly imported and used for numerical functions.

In this preferred style, the initial imports are:

    import matplotlib.pyplot as plt
    import numpy as np

In this way, we know exactly which module the function we use comes from (due to the module prefix), and it's exactly what we've always done in the code so far.

Now, let's present the same piece of code expressed in the three possible forms which we just described.
First, we present it in the pyplot-only style:

    In [1]: import matplotlib.pyplot as plt
    In [2]: import numpy as np
    In [3]: x = np.arange(0, 10, 0.1)
    In [4]: y = np.random.randn(len(x))
    In [5]: plt.plot(x, y)
    Out[5]: [<matplotlib.lines.Line2D object at 0x1fad810>]
    In [6]: plt.title('random numbers')
    In [7]: plt.show()

The preceding code snippet results in:

[Figure: the resulting plot of random numbers]

Now, let's see how we can do the same thing using the pylab interface:

    $ ipython -pylab
    ...
    In [1]: x = arange(0, 10, 0.1)
    In [2]: y = randn(len(x))
    In [3]: plot(x, y)
    Out[3]: [<matplotlib.lines.Line2D object at 0x4284dd0>]
    In [4]: title('random numbers')
    In [5]: show()

Note that ipython -pylab is not the same as running ipython and then:

    from pylab import *

This is because ipython's -pylab switch, in addition to importing everything from pylab, also enables a specific ipython threading mode so that both the interactive interpreter and the plot window can be active at the same time.

Finally, let's make the same chart by using the OO style, but with some pyplot convenience functions:

    In [1]: import matplotlib.pyplot as plt
    In [2]: import numpy as np
    In [3]: x = np.arange(0, 10, 0.1)
    In [4]: y = np.random.randn(len(x))
    In [5]: fig = plt.figure()
    In [6]: ax = fig.add_subplot(111)
    In [7]: l, = plt.plot(x, y)
    In [8]: t = ax.set_title('random numbers')
    In [9]: plt.show()

The pylab code is the simplest, pyplot is in the middle, and the OO code is the most complex or verbose.

As the Python Zen teaches us, "Explicit is better than implicit" and "Simple is better than complex", and those statements are particularly true for this example: for simple interactive sessions, pylab or pyplot are the perfect choice because they hide a lot of complexity, but if we need something more advanced, then the OO API makes clearer where things are coming from and what's going on. This expressiveness will be appreciated when we embed Matplotlib inside GUI applications.
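For use outside an interactive session, the OO-style session above can be written as a plain script. The sketch below is a variant rather than a transcript: it calls ax.plot() on the Axes instance instead of plt.plot(), selects the Agg backend, and renders into an in-memory buffer instead of calling plt.show(); those three choices are assumptions made here so that the script runs unattended.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen instead of opening a window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.random.randn(len(x))

# pyplot is used only as a convenience to create the Figure
fig = plt.figure()
ax = fig.add_subplot(111)

# Plot through the Axes object; plot() returns a list of Line2D instances
line, = ax.plot(x, y)
title = ax.set_title("random numbers")

# Render to a PNG held in memory (a file name would work as well)
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Because every object is held in a named variable (fig, ax, line, title), nothing depends on pyplot's notion of the "current" figure or axes.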
From now on, we will start presenting our code using the OO interface mixed with some pyplot functions.

A brief introduction to Matplotlib objects

Before we can go on in a productive way, we need to briefly introduce which Matplotlib objects compose a figure. Let's see from the higher levels to the lower ones how objects are nested:

Object | Description
FigureCanvas | Container class for the Figure instance
Figure | Container for one or more Axes instances
Axes | The rectangular areas to hold the basic elements, such as lines, text, and so on

Our first (simple) example of OO Matplotlib

In the previous pieces of code, we had transformed this:

    ...
    In [5]: plt.plot(x, y)
    Out[5]: [<matplotlib.lines.Line2D object at 0x1fad810>]
    ...

into:

    ...
    In [7]: l, = plt.plot(x, y)
    ...

The new code uses an explicit reference, allowing a lot more customizations. As we can see in the first piece of code, the plot() function returns a list of Line2D instances, one for each line (in this case, there is only one). In the second piece of code, l is a reference to the line object, so every operation allowed on Line2D can be done using l. For example, we can set the line color with:

    l.set_color('red')

instead of using the keyword argument to plot(), so the line information can be changed after the plot() call.

Subplots

In the previous section, we saw a couple of important functions without introducing them. Let's have a look at them now:

  • fig = plt.figure(): This function returns a Figure, where we can add one or more Axes instances.
  • ax = fig.add_subplot(111): This function returns an Axes instance, where we can plot (as done so far), and this is also the reason why we call the variable referring to that instance ax (from Axes). This is a common way to add an Axes to a Figure, but add_subplot() does a bit more: it adds a subplot. So far we have only seen a Figure with one Axes instance, so only one area where we can draw, but Matplotlib allows more than one.
add_subplot() takes three parameters:

    fig.add_subplot(numrows, numcols, fignum)

where:

  • numrows represents the number of rows of subplots to prepare
  • numcols represents the number of columns of subplots to prepare
  • fignum varies from 1 to numrows*numcols and specifies the current subplot (the one used now)

Basically, we describe a matrix of numrows*numcols subplots that we want in the Figure; please note that fignum is 1 at the upper-left corner of the Figure and it's equal to numrows*numcols at the bottom-right corner. The following table should provide a visual explanation of this:

numrows=2, numcols=2, fignum=1 | numrows=2, numcols=2, fignum=2
numrows=2, numcols=2, fignum=3 | numrows=2, numcols=2, fignum=4
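The numbering in the table above can be checked with a short script. This is a minimal sketch: the loop, the titles, and the Agg backend are assumptions added for illustration, while add_subplot() and get_position() are regular Matplotlib methods.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

numrows, numcols = 2, 2
fig = plt.figure()

axes = []
for fignum in range(1, numrows * numcols + 1):
    # fignum runs row by row, starting at the upper-left corner
    ax = fig.add_subplot(numrows, numcols, fignum)
    ax.set_title("fignum=%d" % fignum)
    axes.append(ax)

# Bounding boxes in figure coordinates (origin at the bottom-left)
upper_left = axes[0].get_position()
bottom_right = axes[-1].get_position()
```

In figure coordinates the first subplot has the smaller x0 and the larger y0, which matches the statement that fignum=1 sits at the upper-left corner and fignum=numrows*numcols at the bottom-right.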

Packt
18 Nov 2009
14 min read

User Interaction and Email Automation in Symfony 1.3: Part1

The signup module

We want to provide the users with the functionality to enter their name, email address, and how they found our web site. We want all this stored in a database and to have an email automatically sent out to the users thanking them for signing up.

To start things off, we must first add some new tables to our existing database schema. The structure of our newsletter table will be straightforward. We will need one table to capture the users' information and a related table that will hold the names of all the places where we advertised our site. I have constructed the following entity relationship diagram to show you a visual relationship of the tables:

[Figure: entity relationship diagram of the newsletter_signups and newsletter_adverts tables]

All the code used in this article can be accessed here.

Let's translate this diagram into XML and place it in the config/schema.xml file:

    <table name="newsletter_adverts" idMethod="native" phpName="NewsletterAds">
      <column name="newsletter_adverts_id" type="INTEGER" required="true" autoIncrement="true" primaryKey="true" />
      <column name="advertised" type="VARCHAR" size="30" required="true" />
    </table>
    <table name="newsletter_signups" idMethod="native" phpName="NewsletterSignup">
      <column name="id" type="INTEGER" required="true" autoIncrement="true" primaryKey="true" />
      <column name="first_name" type="VARCHAR" size="20" required="true" />
      <column name="surname" type="VARCHAR" size="20" required="true" />
      <column name="email" type="VARCHAR" size="100" required="true" />
      <column name="activation_key" type="VARCHAR" size="100" required="true" />
      <column name="activated" type="BOOLEAN" default="0" required="true" />
      <column name="newsletter_adverts_id" type="INTEGER" required="true"/>
      <foreign-key foreignTable="newsletter_adverts" onDelete="CASCADE">
        <reference local="newsletter_adverts_id" foreign="newsletter_adverts_id" />
      </foreign-key>
      <column name="created_at" type="TIMESTAMP" required="true" />
      <column name="updated_at" type="TIMESTAMP" required="true" />
    </table>

We will need to populate the
newsletter_adverts table with some test data as well. Therefore, I have also appended the following data to the fixtures.yml file located in the data/fixtures/ directory:

    NewsletterAds:
      nsa1:
        advertised: Internet Search
      nsa2:
        advertised: High Street
      nsa3:
        advertised: Poster

With the database schema and the test data ready to be inserted into the database, we can once again use the Symfony tasks. As we have added two new tables to the schema, we will have to rebuild everything to generate the models using the following command:

    $/home/timmy/workspace/milkshake>symfony propel:build-all-load --no-confirmation

Now we have populated the tables in the database, and the models and forms have been generated for use too.

Binding a form to a database table

Symfony contains a whole framework just for the development of forms. The forms framework makes building forms easier by applying object-oriented methods to their development. Each form class is based on its related table in the database. This includes the fields, the validators, and the way in which the forms and fields are rendered.

A look at the generated base class

Rather than starting off with a simple form, we are going to look at the base form class that has already been generated for us as a part of the build task we executed earlier. Because the code is generated, it will be easier for you to see the initial flow of a form. So let's open the base class for the NewsletterSignupForm form.
The file is located at lib/form/base/BaseNewsletterSignupForm.class.php:

    class BaseNewsletterSignupForm extends BaseFormPropel
    {
      public function setup()
      {
        $this->setWidgets(array(
          'id'                    => new sfWidgetFormInputHidden(),
          'first_name'            => new sfWidgetFormInput(),
          'surname'               => new sfWidgetFormInput(),
          'email'                 => new sfWidgetFormInput(),
          'activation_key'        => new sfWidgetFormInput(),
          'activated'             => new sfWidgetFormInputCheckbox(),
          'newsletter_adverts_id' => new sfWidgetFormPropelChoice(array('model' => 'NewsletterAds', 'add_empty' => false)),
          'created_at'            => new sfWidgetFormDateTime(),
          'updated_at'            => new sfWidgetFormDateTime(),
        ));

        $this->setValidators(array(
          'id'                    => new sfValidatorPropelChoice(array('model' => 'NewsletterSignup', 'column' => 'id', 'required' => false)),
          'first_name'            => new sfValidatorString(array('max_length' => 20)),
          'surname'               => new sfValidatorString(array('max_length' => 20)),
          'email'                 => new sfValidatorString(array('max_length' => 100)),
          'activation_key'        => new sfValidatorString(array('max_length' => 100)),
          'activated'             => new sfValidatorBoolean(),
          'newsletter_adverts_id' => new sfValidatorPropelChoice(array('model' => 'NewsletterAds', 'column' => 'newsletter_adverts_id')),
          'created_at'            => new sfValidatorDateTime(),
          'updated_at'            => new sfValidatorDateTime(),
        ));

        $this->widgetSchema->setNameFormat('newsletter_signup[%s]');
        $this->errorSchema = new sfValidatorErrorSchema($this->validatorSchema);

        parent::setup();
      }
    }

There are five areas in this base class that are worth noting:

  • This base class extends the BaseFormPropel class, which is an empty class. All base classes extend this class, which allows us to add global settings to all our forms.
  • All of the columns in our table are treated as fields in the form, and are referred to as widgets. All of these widgets are then attached to the form by adding them to the setWidgets() method.
  • Looking over the widgets in the array, you will see that they are pretty standard, such as sfWidgetFormInputHidden() and sfWidgetFormInput(). However, there is one widget added that follows the relationship between the newsletter_signups table and the newsletter_adverts table. It is the sfWidgetFormPropelChoice widget. Because there is a 1:M relation between the tables, the default behavior is to use this widget, which creates an HTML drop-down box and is populated with the values from the newsletter_adverts table. As a part of the attribute set, you will see that it has set the model needed to retrieve the values to NewsletterAds and the newsletter_adverts_id column for the actual values of the drop-down box.
  • All the widgets on the form must be validated by default. To do this, we have to call the setValidators() method and add the validation requirements to each widget. At the moment, the generated validators reflect the attributes of our database as set in the schema. For example, the first_name field in the statement 'first_name' => new sfValidatorString(array('max_length' => 20)) demonstrates that the validator checks that the maximum length is 20. If you remember, in our schema too, the first_name column is set to 20 characters.
  • The final part calls the parent's setup() function.

The base class BaseNewsletterSignupForm contains all the components needed to generate the form for us. So let's get the form on a page and take a look at the method to customize it.

There are many widgets that Symfony provides for us. You can find the classes for them inside the widget/ directory of your Symfony installation. The Symfony propel task always generates a form class and its corresponding base class. Of course, not all of our tables will need to have a form bound to them. Therefore, delete all the form classes that are not needed.

Rendering the form

Rendering this basic form requires us to instantiate the form object in the action.
Assigning the form object to the global $this variable means that we can pass the form object to the template just like any other variable. So let's start by implementing the newsletter signup module. In your terminal window, execute the generate:module task like this:

    $/home/timmy/workspace/milkshake>symfony generate:module frontend signup

Now we can start with the application logic. Open the action class from apps/frontend/modules/signup/actions/actions.class.php for the signup module and add the following logic inside the index action:

    public function executeIndex(sfWebRequest $request)
    {
      $this->form = new NewsletterSignupForm();
      return sfView::SUCCESS;
    }

As I had mentioned earlier, the form class deals with the form validation and rendering. For the time being, we are going to stick to the default layout by allowing the form object to render itself. Using this method initially will allow us to create rapid prototypes. Let's open the apps/frontend/modules/signup/templates/indexSuccess.php template and add the following view logic:

    <form action="<?php echo url_for('signup/submit') ?>" method="POST">
      <table><?php echo $form ?></table>
      <input type="submit" />
    </form>

The form class is responsible for rendering the form elements only. Therefore, we have to include the <form> and submit HTML tags that wrap around the form. Also, the default format of the form is set to 'table', so we must also add the start and end tags of the <table>.

At this stage, we would normally be able to view the form in the browser. But doing so will raise a Symfony exception error. The cause of this is that the results retrieved from the newsletter_adverts table are in the form of an array of objects. These results need to populate the select box widget. But in the current format, this is not possible. Therefore, we have to convert each object into its string equivalent. To do this, we need to create a PHP magic function of __toString() in the DAO class NewsletterAds.
The DAO class for NewsletterAds is located at lib/model/NewsletterAds.php just as all of the other models. Here we need to represent each object as its name, which is the value in the advertised column. Remember that we need to add this method to the DAO class as this represents a row within the results, unlike the peer class that represents the entire result set. Let's add the function to the NewsletterAds class as I have done here:

    class NewsletterAds extends BaseNewsletterAds
    {
      public function __toString()
      {
        return $this->getAdvertised();
      }
    }

We are now ready to view the completed form. In your web browser, enter the URL http://milkshake/frontend_dev.php/signup and you will see the result shown in the following screenshot:

[Screenshot: the rendered signup form]

As you can see, although the form has been rendered according to our table structure, the fields which we do not want the user to fill in are also included. Of course, we can change this quite easily. But before we take a look at the layout of the form, let's customize the widgets and widget validators. Now we can begin working on the application logic for submitting the form.

Customizing form widgets and validators

All of the generated form classes are located in the lib/form and the lib/form/base directories. The latter is where the default generated classes are located, and the former is where the customizable classes are located. This follows the same structure as the models.

Each custom form class inherits from its parent. Therefore, we have to override some of the functions to customize the form. Let's customize the widgets and validators for the NewsletterSignupForm.
Open the lib/form/NewsletterSignupForm.class.php file and paste the following code inside the configure() method:

    //Remove unneeded widgets
    unset(
      $this['created_at'], $this['updated_at'],
      $this['activation_key'], $this['activated'],
      $this['id']
    );

    //Modify widgets
    $this->widgetSchema['first_name'] = new sfWidgetFormInput();
    $this->widgetSchema['newsletter_adverts_id'] = new sfWidgetFormPropelChoice(array('model' => 'NewsletterAds', 'add_empty' => true, 'label' => 'Where did you find us?'));
    $this->widgetSchema['email'] = new sfWidgetFormInput(array('label' => 'Email Address'));

    //Add validation
    $this->setValidators(array(
      'first_name' => new sfValidatorString(array('required' => true), array('required' => 'Enter your firstname')),
      'surname' => new sfValidatorString(array('required' => true), array('required' => 'Enter your surname')),
      'email' => new sfValidatorEmail(array('required' => true), array('invalid' => 'Provide a valid email', 'required' => 'Enter your email')),
      'newsletter_adverts_id' => new sfValidatorPropelChoice(array('model' => 'NewsletterAds', 'column' => 'newsletter_adverts_id'), array('required' => 'Select where you found us')),
    ));

    //Set post validators
    $this->validatorSchema->setPostValidator(
      new sfValidatorPropelUnique(array('model' => 'NewsletterSignup', 'column' => array('email')), array('invalid' => 'Email address is already registered'))
    );

    //Set form name
    $this->widgetSchema->setNameFormat('newsletter_signup[%s]');

    //Set the form format
    $this->widgetSchema->setFormFormatterName('list');

Let's take a closer look at the code.

Removing unneeded fields

To remove the fields that we do not want to be rendered, we must call the PHP unset() function and pass in the fields to unset. As mentioned earlier, all of the fields that are rendered need a corresponding validator, unless we unset them. Here we do not want fields such as created_at and activation_key to be entered by the user.
To do so, the unset() call should contain the following code:

    unset(
      $this['created_at'], $this['updated_at'],
      $this['activation_key'], $this['activated'],
      $this['id']
    );

Modifying the form widgets

Although it'll be fine to use the remaining widgets as they are, let's have a look at how we can modify them:

    //Modify widgets
    $this->widgetSchema['first_name'] = new sfWidgetFormInput();
    $this->widgetSchema['newsletter_adverts_id'] = new sfWidgetFormPropelChoice(array('model' => 'NewsletterAds', 'add_empty' => true, 'label' => 'Where did you find us?'));
    $this->widgetSchema['email'] = new sfWidgetFormInput(array('label' => 'Email Address'));

There are several types of widgets available, but our form requires only two of them. Here we have used the sfWidgetFormInput() and sfWidgetFormPropelChoice() widgets. Each of these can be initialized with several values. We have initialized the email and newsletter_adverts_id widgets with a label. This renders the label associated with the widget on the form. We do not have to include a label, because Symfony otherwise adds one according to the column name.
Adding form validators

Let's add the validators in a similar way as we have added the widgets:

    //Add validation
    $this->setValidators(array(
      'first_name' => new sfValidatorString(array('required' => true), array('required' => 'Enter your firstname')),
      'surname' => new sfValidatorString(array('required' => true), array('required' => 'Enter your surname')),
      'email' => new sfValidatorEmail(array('required' => true), array('invalid' => 'Provide a valid email', 'required' => 'Enter your email')),
      'newsletter_adverts_id' => new sfValidatorPropelChoice(array('model' => 'NewsletterAds', 'column' => 'newsletter_adverts_id'), array('required' => 'Select where you found us')),
    ));

    //Set post validators
    $this->validatorSchema->setPostValidator(
      new sfValidatorPropelUnique(array('model' => 'NewsletterSignup', 'column' => array('email')), array('invalid' => 'Email address is already registered'))
    );

Our form will need four different types of validators:

  • sfValidatorString: This checks the validity of a string against given criteria. It takes four arguments: required, trim, min_length, and max_length.
  • sfValidatorEmail: This validates the input against the pattern of an email address.
  • sfValidatorPropelChoice: It validates the value against the values in the newsletter_adverts table. It needs the model and column that are to be used.
  • sfValidatorPropelUnique: Again, this validator checks the value against the values in a given table column for uniqueness. In our case, we want to use the NewsletterSignup model to test if the email column is unique.

As mentioned earlier, all the fields must have a validator. Although it's not recommended, you can allow extra parameters to be passed in. To achieve this, there are two steps. First, you must disable the default option of having all fields validated by calling $this->validatorSchema->setOption('allow_extra_fields', true). Second, although the first step allows the values to bypass validation, they will be filtered out of the results.
To prevent this, you will have to set $this->validatorSchema->setOption('filter_extra_fields', false).

Form naming convention and setting its style

The final part we added is the naming convention for the HTML attributes and the style in which we want the form rendered. The HTML output will use our naming convention. For example, in the following code, we have set the convention to newsletter_signup[fieldname] for each input field's name.

    //Set form name
    $this->widgetSchema->setNameFormat('newsletter_signup[%s]');

    //Set the form format
    $this->widgetSchema->setFormFormatterName('list');

Two formats are shipped with Symfony that we can use to render our form. We can either render it in an HTML table or an unordered list. As we have seen, the default is an HTML table. But by setting this as list, the form is now rendered as an unordered HTML list, just like the following screenshot. (Of course, I had to replace the <table> tags with the <ul> tags.)

Packt
18 Nov 2009
8 min read

User Interaction and Email Automation in Symfony 1.3: Part2

Automated email responses

Symfony comes with a default mailer library that is based on Swift Mailer 4; the detailed documentation is available from their web site at http://swiftmailer.org. After a user has signed up to our mailing list, we would like an email verification to be sent to the user's email address. This will inform the user that he/she has signed up, and will also ask him or her to activate their subscription.

To use the library, we have to complete the following three steps:

  • Store the mailing settings in the application settings file.
  • Add the application logic to the action.
  • Create the email template.

Adding the mailer settings to the application

Just like all the previous settings, we should add all the settings for sending emails to the module.yml file for the signup module. This will make it easier to implement any modifications required later. Initially, we should set variables like the email subject, the from name, the from address, and whether we want to send out emails within the dev environment. I have added the following items to our signup module's setting file, apps/frontend/config/module.yml:

    dev:
      mailer_deliver: true
    all:
      mailer_deliver: true
      mailer_subject: Milkshake Newsletter
      mailer_from_name: Tim
      mailer_from_email: no-reply@milkshake

All of the settings can be contained under the all label. However, you can see that I have introduced a new label called dev. These labels represent the environments, and we have just added a specific variable to the dev environment. This setting will allow us to eventually turn off the sending of emails while in the dev environment.

Creating the application logic

Triggering the email should occur after the user's details have been saved to the database.
To demonstrate this, I have added the following changes to the submit action in the apps/frontend/modules/signup/actions/actions.class.php file:

    public function executeSubmit(sfWebRequest $request)
    {
      $this->form = new NewsletterSignupForm();
      if ($request->isMethod('post') && $this->form->bindAndSave($request->getParameter($this->form->getName())))
      {
        //Include the swift lib
        require_once('lib/vendor/swift-mailer/lib/swift_init.php');
        try
        {
          //Sendmail
          $transport = Swift_SendmailTransport::newInstance();
          $mailBody = $this->getPartial('activationEmail', array('name' => $this->form->getValue('first_name')));
          $mailer = Swift_Mailer::newInstance($transport);
          $message = Swift_Message::newInstance();
          $message->setSubject(sfConfig::get('app_mailer_subject'));
          $message->setFrom(array(sfConfig::get('app_mailer_from_email') => sfConfig::get('app_mailer_from_name')));
          $message->setTo(array($this->form->getValue('email') => $this->form->getValue('first_name')));
          $message->setBody($mailBody, 'text/html');
          if (sfConfig::get('app_mailer_deliver'))
          {
            $result = $mailer->send($message);
          }
        }
        catch (Exception $e)
        {
          var_dump($e);
          exit;
        }
        $this->redirect('@signup');
      }
      //Use the index template as it contains the form
      $this->setTemplate('index');
    }

Symfony comes with a sfMailer class that extends Swift_Mailer. To send mails you could simply implement the following Symfony method:

    $this->getMailer()->composeAndSend('from@example.com', 'to@example.com', 'Subject', 'Body');

Let's walk through the process:

Instantiate the Swift Mailer.

Retrieve the email template (which we will create next) using the $this->getPartial('activationEmail', array('name' => $this->form->getValue('first_name'))) method. Breaking this down, the function itself retrieves a partial template. The first argument is the name of the template to retrieve (that is, activationEmail in our example) which, if you remember, means that the template will be called _activationEmail.php.
The next argument is an array that contains variables related to the partial template. Here, I have set a name variable. The value for the name is important. Notice how I have used the value within the form object to retrieve the first_name value. This is because we know that these values have been cleaned and are safe to use.

Set the subject, from, to, and the body items. These functions are Swift Mailer specific:

  • setSubject(): It takes a string as an argument for the subject
  • setFrom(): It takes the name and the mailing address
  • setTo(): It takes the name and the mailing address
  • setBody(): It takes the email body and MIME type. Here we passed in our template and set the email to text/html

Finally, we send the email.

There are more methods in Swift Mailer. Check out the documentation on the Swift Mailer web site (http://swiftmailer.org/).

The partial email template

Lastly, we need to create a partial template that will be used in the email body. In the templates folder of the signup module, create a file called _activationEmail.php and add the following code to it:

    Hi <?php echo $name; ?>,
    <br /><br />
    Thank you for signing up to our newsletter.
    <br /><br />
    Thank you,
    <br />
    <strong>The Team</strong>

The partial is no different from a regular template. We could have opted to pass on the body as a string, but using the template keeps our code uniform. Our signup process now incorporates the functionality to send an email.

The purpose of this example is to show you how to send an automated email using a third-party library. For a real application, you should most certainly implement a two-phase option wherein the user must verify his or her action.

Flashing temporary values

Sometimes it is necessary to set a temporary variable for one request, or make a variable available to another action after forwarding but before having to delete the variable. Symfony provides this level of functionality within the sfUser object known as a flash variable.
Once a flash variable has been set, it lasts until the end of the overall request before it is automatically destroyed. Setting and getting a flash attribute is managed through two of the sfUser methods. You can also test for a flash variable's existence using the third of the methods listed here:

    $this->getUser()->setFlash($name, $value, $persist = true)
    $this->getUser()->getFlash($name)
    $this->getUser()->hasFlash($name)

Although a flash variable will be available by default when a request is forwarded to another action, setting the $persist argument to false will delete the flash variable before the request is forwarded.

To demonstrate how useful flash variables can be, let's readdress the signup form. After a user submits the signup form, the form is redisplayed. I mentioned earlier that you could create another action to handle a 'thank you' template. However, by using a flash variable we will not have to do so.

As a part of the application logic for the form submission, we can set a flash variable. Then, after the action redirects the request, the template can test whether there is a flash variable set. If there is one, the template should show a message rather than the form.

Let's add the $this->getUser()->setFlash() function to the submit action in the apps/frontend/modules/signup/actions/actions.class.php file:

    //Include the swift lib
    require_once('lib/vendor/swift-mailer/lib/swift_init.php');
    //set Flash
    $this->getUser()->setFlash('Form', 'completed');
    try{

I have added the flash variable just under the require_once() statement. After the user has submitted a valid form, this flash variable will be set with the name Form and the value completed.

Next, we need to address the template logic. The template needs to check whether a flash variable called Form is set. If it is not set, the template shows the form. Otherwise it shows a thank you message.
This is implemented using the following code:

    <?php if(!$sf_user->hasFlash('Form')): ?>
    <form action="<?php echo url_for('@signup_submit') ?>" method="post" name="Newsletter">
      <div style="height: 30px;">
        <div style="width: 150px; float: left">
          <?php echo $form['first_name']->renderLabel() ?></div>
        <?php echo $form['first_name']->render(($form['first_name']->hasError())? array('class'=>'boxError'): array('class'=>'box')) ?>
        <?php echo ($form['first_name']->hasError())? ' <span class="errorMessage">* '.$form['first_name']->getError().'</span>': '' ?>
        <div style="clear: both"></div>
      </div>
      ....
    </form>
    <?php else: ?>
    <h1>Thank you</h1>
    You are now signed up.
    <?php endif ?>

The form is now wrapped inside an if/else block. Accessing the flash variables from a template is done through $sf_user. To test if the variable has been set, I have used the hasFlash() method, $sf_user->hasFlash('Form'). The else part of the statement contains the text rather than the form. Now if you submit your form, you will see the result as shown in the following screenshot:

We have now implemented an entire module for a user to sign up for our newsletter. Wouldn't it be really good if we could add this module to another application without all the copying, pasting, and fixing?

Packt
18 Nov 2009
19 min read

The DataGrid API with IBM WebSphere eXtreme Scale 6: Part 1

In a client-server ObjectGrid interaction, local ObjectGrid instances run in the same memory process as the business application. Access to objects stored in the grid is extremely fast, and there are no network hops or routing done on ObjectGrid operations. The disadvantage with a local ObjectGrid instance is that all objects stored in the grid must fit into the heap space of one JVM.

The client-server distributed ObjectGrid instances overcome that single heap space disadvantage by combining the resources of multiple JVMs on multiple servers. These combined resources hide behind the façade of an ObjectGrid instance. The ObjectGrid instance has far more CPU, memory, and network I/O available to it than the resources available to any single client. In this article, we'll learn how to use those resources held by the ObjectGrid instance to co-locate data and business logic on a single JVM.

The client-server model relies on a client pulling objects across a network from an ObjectGrid shard. The client performs some operations on those objects. Any object whose state has changed must be sent back across the network to the appropriate shard. The client-server programming model co-locates data and code by moving data to the code.

The data grid programming model does the opposite by moving code to the data. Rather than dragging megabytes of objects from an ObjectGrid shard to a client, only to send them right back to the ObjectGrid, we instead send our much smaller application code to an ObjectGrid shard to operate on the data in place. The end result is the same: code and data are co-located. We now have the resources of an entire data grid available to run that code instead of one client process.

What does DataGrid do for me?

The DataGrid API provides encapsulation to send application-specific methods into the grid and operate directly on the objects in shards. The API consists of only five public classes.
These five classes provide us with several patterns to make an ObjectGrid instance do the heavy lifting for a client application. In the client-server model, the client application does a lot of work by operating on the objects in the grid itself. The client requires a network hop to get an object from the grid before it can perform an operation on it, and persisting that object requires another network hop to the grid.

In a single client environment, the probable bottlenecks in dealing with ObjectGrid are all on the client side. A single client will not stress the resources in the ObjectGrid deployment. The client application is most likely the bottleneck. With all computers in a deployment being equal, one client application on one computer will not stress the combined resources of the grid.

In a naïve application that performs single object get and put operations, our application will first notice a bottleneck due to data starvation. This is where a client cannot get the data it needs fast enough because of network latency. Single object get and put operations (and the corresponding Entity API calls) won't saturate a gigabit ethernet connection by any means, but the latency of each RPC is far longer than the time the CPU needs to process the result, so the CPU spends most of its time waiting. The application works, but it's slow.

A smarter application would use the ObjectMap#getAll method. This would go out to the grid and get an object for every key in the list. Instead of waiting for each individual object, the client application waits for the entire list to come over the network. While the cost of the network RPC is amortized over the size of the list, the client still incurs that cost.

In addition to these network latency concerns, we may not want a near-cache that eats up client-side memory. Turning off the near-cache means that every get operation is an RPC. Turning it on means that some of our JVM heap space is used to store objects, which we may not need after the first use.
The fundamental problem is that our objects and client application are architecturally separated. For our application to do anything, it needs to operate on objects that exist in the grid. In the client-server model, we copy data from the server to the client. At this point, our data and code are co-located, and the application can perform some business logic with that data. This model breaks down when there are huge data sets copied between boxes.

Databases co-locate data and code with stored procedures. The processing power of the stored procedure is a product of the CPU and memory resources of the computer running the database. The stored procedure is code compiled into a module and executed by the database. Within that process, the stored procedure accesses data available in the same process.

Through the DataGrid API, ObjectGrid gives us the ability to run code in the same process that holds our objects. Unlike the database example, where the throughput and latency of getting the stored procedure result are limited by the power of the server it's on, ObjectGrid's power is limited by the number of CPUs in the deployment, and it can scale out at any time.

ObjectGrid co-locates our code and objects by sending serialized classes with our application code methods to primary partitions in the grid. There are two ways to do this. The first way sends the code to every primary partition in the grid. The code executes and returns a result to the client. In the second way, we supply a collection of keys to the DataGrid API. With a list of keys, ObjectGrid only sends the application code to the partitions that contain at least one object with a key in the list. This reduces the number of container processes doing work for our client application, and is preferable to making the entire grid service one client request.

Let's look at finding an object by key in the client-server distributed model. The client has a key for an object.
Calling the ObjectMap#get(key) method creates some work for the client. It first needs to determine to which partition the key belongs. The partition is important because the ClientClusterContext, already obtained by the client, knows how to get to the container that holds the primary shard in one hop. We find out the partition ID (pID) for a key with the PartitionManager class:

    BackingMap bMap = grid.getMap("Payment");
    PartitionManager pm = bMap.getPartitionManager();
    int pId = pm.getPartition(key);

After obtaining the partition ID and the host running the container process, the client performs a network hop to request the object. The object is serialized and sent back to the client, where the client performs some operation with the object. Persisting an updated object requires one more network hop to put it back in the primary shard.

We can now repeat that process for every object in our multi-million object collection. On second thought, that may not be such a great idea. Instead, we'll create an agent that we send to the grid. The agent encapsulates the logic we want to perform. An AgentManager serializes the agent and sends it to each primary shard in the deployment. Once on a primary shard, the agent executes and produces a result which is sent back to the client.

Borrowing from functional programming

The DataGrid API borrows the "map" and "reduce" concepts from the world of functional programming. Just so we're all on the same page, let's go over the concepts behind these two functions. Functional programming focuses more on what a program does, instead of how it does it. This is in contrast to the mostly imperative programming we do in the C family of languages. That's not to say we can't follow a functional programming model, it's just that we don't. Other languages, like Lisp and its descendants, make functional programming the natural thing to do. Map and reduce are commonly found in functional programming.
They are known as higher-order functions because they take functions as arguments. This is similar to how we would use a function pointer in C, or an anonymous inner class in Java, to implement callbacks. Though the focus is on what to do, at some point, we need to tell our program how to do it. We do this with the function passed as an argument to map or reduce. Let's look at a simple example in Ruby, which has both functional and imperative programming influences:

    >> numbers = [0,1,2,3,4,5,6,7,8,9]
    >> numbers.map { |number| number * 2 }
    => [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

We assign an array of the numbers 0-9 to the variable numbers. The array has a method called map that we call in the second line. Map is a higher-order function and accepts a function as its argument. The Array#map method calls the passed-in function for each element in the array. It passes the element in the variable number. In this way, we return a new array that contains the results of each call to our function, which performs number * 2.

Let's look at the reduce method. In Ruby, reduce is called inject but the concept is the same:

    >> numbers = [0,1,2,3,4,5,6,7,8,9]
    >> numbers.inject(0) { |sum, number| sum = sum + number }
    => 45

The inject (read as reduce) method takes a function that performs a running total on the numbers in the array. Instead of an array as our return type, we only get one number. The reduce operation returns a single result for an entire data set. The map operation returns a new set based on running the original set through a given function.

These concepts are relevant in the data grid environment because we work with large data sets where we frequently need to work with large segments of data. Pulling raw data across the network, and operating over the data set on one client, are both too slow. Map and reduce help us by using the remote CPU resources of the grid to cut down on the data sent across the network and the CPU power required on the client.
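For readers more at home in Java, the two Ruby one-liners above can be approximated with the standard streams API. This is only a sketch for comparison: the class name MapReduceSketch and its method names are made up for this illustration and are not part of the DataGrid API.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class; the names are illustrative only.
final class MapReduceSketch {

    // "map": apply a function to every element and collect a new list.
    static List<Integer> doubleAll(List<Integer> numbers) {
        return numbers.stream()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    // "reduce": fold the whole list into a single value, starting from 0.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        // The same input as the Ruby example: the numbers 0 through 9.
        List<Integer> numbers = IntStream.rangeClosed(0, 9).boxed().collect(Collectors.toList());
        System.out.println(doubleAll(numbers));
        System.out.println(sum(numbers));
    }
}
```

As in Ruby, the lambda passed to map and the method reference passed to reduce supply the "how", while the stream operations supply the "what".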
This help comes from writing methods that work like map and reduce and sending them to our objects in the grid.

java.util.Map, BackingMaps, ObjectMaps, HashMaps: like we need one more use for the word "map". We just saw the functional origin of the map concept. Let's take a look at a Java implementation. Map implements an algorithm that performs an operation on each element in a collection and returns a new collection of results:

    public Collection doubleOddInts(Collection c) {
      Collection results = new HashSet();
      Iterator iter = c.iterator();
      while (iter.hasNext()) {
        int i = (Integer)iter.next();
        if (i % 2 == 0) {
          results.add(i);
        } else {
          results.add(i*2);
        }
      }
      return results;
    }

Our needs go beyond performing a map function over an array. In order to be useful in a DataGrid environment, the map function must operate on a distributed collection of objects in an ObjectGrid instance. The DataGrid API supports this by giving us the MapGridAgent interface. A business logic class implements the two methods in MapGridAgent to encapsulate the code we intend to run in the grid. The two methods are MapGridAgent#process(Session session, ObjectMap map, Object key) and MapGridAgent#processAllEntries(Session session, ObjectMap map).

Let's implement the doubleOddInts algorithm with MapGridAgent. We first create a class that implements the MapGridAgent interface. We give this class a meaningful name that describes the map operation implemented in the process methods:

    public class DoubleOddIntsMapAgent implements Serializable, MapGridAgent {
      public Object process(Session session, ObjectMap map, Object key) {
        int i = (Integer)map.get(key);
        if (i % 2 == 0) {
          return i;
        } else {
          return i*2;
        }
      }

      public Map processAllEntries(Session session, ObjectMap map) {
        // nothing to do here for now!
        return null;
      }
    }

The map function itself is called by our client code. The process(session, map, key) method performs the how in the map function.
Because ObjectGrid gives us the what for free (the map function), we only need to implement the how part. Like the Ruby example, this process(session, map, key) method is performed for each element in a collection. The Session and ObjectMap arguments are supplied by the AgentManager based on the current session and ObjectMap that starts the map function. The key identifies a given value in the map, and the collection of keys is supplied by us when we run the DoubleOddIntsMapAgent.

After implementing the MapGridAgent#process(session, map, key) method, the DoubleOddIntsMapAgent is ready to run. We want it to run on each shard in an ObjectGrid instance that has a key in the collection we pass to it. We do this with an instance of the AgentManager class. The AgentManager class has two methods to send a MapGridAgent to the grid: AgentManager#callMapAgent(MapGridAgent agent, Collection keys) and AgentManager#callMapAgent(MapGridAgent agent).

The first method provides a set of keys for our agent to use when run on each partition. Using this method is preferable to the non-keyed version because the non-keyed version runs the code on every primary shard in the grid. The AgentManager#callMapAgent(agent, keys) method only runs the code on primary partitions that contain at least one key in the key collection. Whenever we have the choice to use part of the grid instead of the entire grid, we should take the choice that uses only part of the grid. Whenever we use the entire grid for one operation, we limit scalability and throughput.

The AgentManager serializes the DoubleOddIntsMapAgent agent and sends it to each partition that has a key in the keys collection. Once on the primary partition, the process(session, map, key) method is called for each key in the keys collection supplied to AgentManager#callMapAgent(agent, keys). This set of keys is a subset of all of the keys in the BackingMap, and likely a subset of keys in each partition.
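The keyed dispatch described above can be modelled in plain Java, without the eXtreme Scale API. This is a minimal sketch under stated assumptions: the class KeyedDispatchSketch and its methods are hypothetical, keys are assigned to partitions by hashCode modulo the partition count (the product's real routing is not described in this article), and each key's value is taken to be the key itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical in-memory model of keyed agent dispatch; not the eXtreme Scale API.
final class KeyedDispatchSketch {

    // Assumption: a key belongs to partition (hashCode mod partitionCount).
    static int partitionFor(Object key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Group the supplied keys by owning partition; partitions without keys never appear.
    static Map<Integer, List<Integer>> groupByPartition(Collection<Integer> keys, int partitionCount) {
        Map<Integer, List<Integer>> byPartition = new TreeMap<>();
        for (Integer key : keys) {
            byPartition.computeIfAbsent(partitionFor(key, partitionCount), p -> new ArrayList<>()).add(key);
        }
        return byPartition;
    }

    // Run the per-key logic partition by partition, then merge the partial results.
    static Map<Integer, Integer> callMapAgent(Function<Integer, Integer> process,
                                              Collection<Integer> keys, int partitionCount) {
        Map<Integer, Integer> merged = new HashMap<>();
        for (List<Integer> partitionKeys : groupByPartition(keys, partitionCount).values()) {
            Map<Integer, Integer> partial = new HashMap<>(); // result built "inside" one partition
            for (Integer key : partitionKeys) {
                partial.put(key, process.apply(key));
            }
            merged.putAll(partial); // stands in for the client-side merge
        }
        return merged;
    }

    // Same rule as the article's agent: even values unchanged, odd values doubled.
    static int doubleOdd(int i) {
        return (i % 2 == 0) ? i : i * 2;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 13);
        System.out.println(groupByPartition(keys, 4));
        System.out.println(callMapAgent(KeyedDispatchSketch::doubleOdd, keys, 4));
    }
}
```

With the keys 1, 5, and 9 and four partitions, all three keys land in the same group, so only one of the four simulated partitions does any work. That is the point of preferring the keyed call.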
Let's create an instance of this agent and submit it to the grid:

    Collection numbers = new ArrayList();
    for (int i = 0; i < 10000; i++) {
      numbers.add(i);
    }
    MapGridAgent agent = new DoubleOddIntsMapAgent();
    AgentManager am = session.getMap("Integer").getAgentManager();
    am.callMapAgent(agent, numbers);

This example assumes that we have a BackingMap of Integer for both the key and value objects. The numbers collection is a list of keys to use. Once we create the agent, we submit it to the grid with the 10,000 keys to operate on.

Before running the agent, the AgentManager sorts the keys by partition. The agent runs only on partitions that have a list of keys that hash to them. In each primary partition, the DoubleOddIntsMapAgent#process(session, map, key) method is called only for the keys that map to that partition.

GridAgent and Entity

GridAgent works with Entity classes as well. We don't directly use key objects when working with Entity objects. The Entity API hides the key/value implementation from us to make working with Entity objects easier than working with the ObjectMap API. The method definition for MapGridAgent#process(session, map, key) normally expects an object to be used as a key for an ObjectMap. We can still find the value object by converting key and value objects to their Tuple representations, but the DataGrid API makes it much easier for us. By implementing the EntityAgentMixin interface, we can have the primary shard pass the Entity object itself to the process method instead of a key.

EntityAgentMixin has one method, namely, EntityAgentMixin#getClassForEntity(). The implementation of this method should return the class object of the Entity. DataGrid needs this method defined in the grid agent implementation so it can provide the Entity object itself, rather than its key, to the MapGridAgent#process(session, map, key) method.
Let's assume that we have an Entity MyInteger that acts as a wrapper for Integer:

    public class DoubleOddIntsMapAgent implements Serializable, MapGridAgent, EntityAgentMixin {
      public Object process(Session session, ObjectMap map, Object key) {
        MyInteger myInt = (MyInteger)key;
        if (myInt.mod(2) == 0) {
          return myInt;
        } else {
          return myInt.multiplyBy(2);
        }
      }

      public Map processAllEntries(Session session, ObjectMap map) {
        // nothing to do here for now!
        return null;
      }

      public Class getClassForEntity() {
        return MyInteger.class;
      }
    }

Our agent now implements the EntityAgentMixin interface and the getClassForEntity() method. The key is converted to the correct class before the MapGridAgent#process(session, map, key) method is called. Instead of the Tuple key for an Entity, the process method is passed a reference to the Entity itself. Because it is passed as an Object, we must cast the Entity to its defined class. There is no need to look up the Entity in its BackingMap because it's already the Entity we want to work with. This means the collection of keys passed to AgentManager#callMapAgent(agent, keys) is a collection in which all elements are of the class returned by getClassForEntity().

GridAgent with an unknown key set

We may not always know the keys for each object we want to submit to an agent. In this situation, we send an agent into the grid without a key set. The grid agent cannot call the process(session, map, key) method because we don't know which keys to use. Instead, our grid agent method relies on the Query API to narrow the number of objects in each partition we work with.

The MapGridAgent interface gives us the MapGridAgent#processAllEntries(Session session, ObjectMap map) method for this situation. The MapGridAgent#processAllEntries(session, map) method lets us specify what to do when we potentially need to work with all objects in a partition. In particular, it lets us narrow the field with a query.
In the past, we used a query to find card and address objects in a local ObjectGrid instance. This was fine for local instances with only one partition. The real power of the Query API is revealed when used with the DataGrid API. Query does not work across partitions when called from an ObjectGrid client in a distributed environment. It works with just one partition. In a distributed deployment, where we use the DataGrid API, a grid agent instance runs on one partition. Each partition has an instance of the grid agent running in it and each agent can see the objects in its partition. If we have 20 partitions, then we have 20 grid agents running, one in each partition. Because we're working with a single partition in each grid agent, we use the Query API to determine which objects are of interest to the business logic. Now that we know how to run code in the grid, the Query API is suddenly much more useful. Now, we want a query to run against just one partition. Using a query in a GridAgent is a natural fit. 
Each agent runs on one partition, and each query runs on that partition in the primary shard container process:

    public class DoubleOddIntsMapAgent implements Serializable, MapGridAgent, EntityAgentMixin {
      public Object process(Session session, ObjectMap map, Object key) {
        MyInteger myInt = (MyInteger)key;
        if (myInt.mod(2) == 0) {
          return myInt;
        } else {
          return myInt.multiplyBy(2);
        }
      }

      public Map processAllEntries(Session session, ObjectMap map) {
        EntityManager em = session.getEntityManager();
        Query q = em.createQuery("select m from MyInteger m " +
            "where m.integer > 0 " +
            "and m.integer < 10000");
        Iterator iter = q.getResultIterator();
        Map<MyInteger, Integer> results = new HashMap<MyInteger, Integer>();
        while (iter.hasNext()) {
          MyInteger mi = (MyInteger)iter.next();
          results.put(mi, (Integer)process(session, map, mi));
        }
        return results;
      }

      public Class getClassForEntity() {
        return MyInteger.class;
      }
    }

The MapGridAgent#processAllEntries(session, map) method generally follows the same pattern when implemented:

1. Narrow the scope of objects in the partition. This is important in the MapGridAgent because it returns a result for every object it processes. An indiscriminate query can result in hundreds of megabytes of objects sent back to a client from every partition.

2. Create a map to hold the results of each process operation. When using ObjectMap, this map is keyed with the key object or the value object. The client application can perform its own gets if the keys are returned. Otherwise, it works directly with the value objects. We can also return a map of key/value objects. When using Entity, the map is keyed with the Entity itself.

3. Iterate over the query results, calling MapGridAgent#process(session, map, key) for each result. Calling the process method is required here since we didn't pass a collection of keys to the AgentManager#callMapAgent(agent) method. The key set is unknown before the agent runs. The agent finds all objects in a partition that meet our criteria for processing, and then we call process to get each result.

4. Return the results. This map contains an entry for each object that meets our processing criteria in this partition. This map is merged, client-side, with the maps from every other partition where the agent ran. The merged map is the final result, and it is the return value of the AgentManager#callMapAgent(agent) method.

Following the call to AgentManager#callMapAgent(agent), we have a Map that contains the combined agent results from every partition. We also split the workload between N partitions rather than performing all of the processing on the client. The ObjectGrid deployment performed our business logic because we passed the business logic to the grid rather than pulling objects out of the grid.

One of the great things about this pattern is that our task on many partitions completes in about 1/Nth the amount of time it would take for one huge partition containing the same objects running on one computer. Of course, there is the overhead of the merge operation and network connections, but this is amortized over the number of primary partitions used by the agent.

This is distinctly different from scaling up a database server when it needs more CPU speed for stored procedures. Instead of incurring downtime for a database server migration, we simply add more containers on additional computers. The power of our grid increases as easily as starting a few more JVMs.

>> Continue Reading: The DataGrid API with IBM WebSphere eXtreme Scale 6: Part 2
Packt
18 Nov 2009
11 min read

Archiva in a Team: Part 1

Roles and permissions

In preparation for the latter sections of this article, let's familiarize ourselves with the user roles and permissions available in Archiva. The list of available roles can be seen by clicking a user account in User Management and then clicking on the Edit Roles link.

Some of the roles in Archiva are resource-based, with each repository treated as a resource. This means that access is controlled at the repository level.

There are eight types of roles available in Archiva. They are:

- System Administrator: Provides access to Manage and Administration sections, user administration privileges, and read and write permissions to all repositories.
- User Administrator: Provides access to User Management and User Roles pages.
- Global Repository Manager: Provides read and write permissions to all repositories.
- Global Repository Observer: Provides read permission to all repositories.
- Repository Manager (resource level): Provides read and write permissions to a given repository.
- Repository Observer (resource level): Provides read permission to a given repository.
- Registered User: The default role assigned to a user who has registered in Archiva.
- Guest: Provides the same permissions that are enabled for the built-in guest user account, which we will discuss later on.

A user assigned a Global Repository Manager or resource level Repository Manager role automatically gains the Global Repository Observer or resource level Repository Observer role respectively.

Users assigned a Repository Manager role should be able to access the Find section as well as the Upload Artifact and Delete Artifact menus in the web application. On the other hand, users with a Repository Observer role should only be able to access the Find section. Repository-level security applies to each corresponding operation. This means that a user will only be able to search, browse, and upload to or delete artifacts from those repositories that they have permission to access.
When managing roles and permissions, another thing to take note of is the guest account. To enable access without authentication for a specific resource or operation, just assign the guest user the appropriate role.

By default, the guest user is already assigned the Repository Observer role for the internal and snapshots repositories. This allows anyone to browse and search for artifacts from these repositories. If you edit the guest user account, you should be able to see the following configuration:

As you can see, the guest user doesn't yet have read access to the releases repository. In our examples, we will assume that the repository will be available to everyone who can access Archiva. So to make this consistent with the snapshots repository, check the Repository Observer box for releases and submit the form.

You can see for yourself how the guest account works by logging out of Archiva and clicking Browse on the navigation menu. The artifacts that were requested and were downloaded to our proxy repository should be visible in the Browse page, similar to what is seen in the following screenshot:

As we work through the rest of the article, we will cover a few more things about access control in Archiva. Now that we are familiar with the security basics, we are ready to tackle some of the more advanced features of Archiva. In the next section, we will learn techniques for configuring our Archiva repositories.

Introducing repository groups

In Archiva 1.1, the concept of repository groups (also known as virtual repositories) was introduced. Taking the meaning of the term virtual literally, these repositories are physically non-existent repositories. A virtual repository is simply a URL which gives a single interface to a group of managed repositories.

Let's visualize this with a simple scenario. For example, we have a Maven 2 project which has dependencies on artifacts that reside in multiple repositories.
In this case, we will assume that we have a nearby proxy cache configured in Archiva for each of them. Given this scenario, we would have to configure each of these repositories in our settings.xml (or POM) in order to get the needed artifacts and to be able to build our project. If these repositories are secured, we also need to configure our credentials for each. This leaves us with a long (and possibly messy) settings.xml. Remember, a messy configuration attracts errors.

To avoid this problem, we can make use of repository groups in Archiva. We can create a repository group and configure or add multiple repositories under that group. So when an artifact request is made (for example, by Maven) using the repository group URL, the repositories underneath it will be searched until the requested artifact is found and returned to the client. The following section teaches us how to configure repository groups and experience their strength first-hand.

Configuring and using repository groups

Before jumping into configuration, it is good to see how things work without the aid of repository groups. As the Centrepoint project (from the book Apache Maven 2: Effective Implementations) refers to the released version of the organization POM, anyone who builds that project must be able to get the organization POM from the releases repository. This is a perfect setup for using repository groups. Let's begin by wiping out our local repository again and building the Centrepoint project.
    centrepoint$ mvn clean install

The build should fail with the following error:

    [INFO] Scanning for projects...
    Downloading: http://localhost:8081/archiva/repository/internal/com/effectivemaven/effectivemaven-parent/1/effectivemaven-parent-1.pom
    [INFO] ----------------------------------------------------------
    [ERROR] FATAL ERROR
    [INFO] ----------------------------------------------------------
    [INFO] Failed to resolve artifact.

    GroupId: com.effectivemaven
    ArtifactId: effectivemaven-parent
    Version: 1

    Reason: Unable to download the artifact from any repository

      com.effectivemaven:effectivemaven-parent:pom:1

    from the specified remote repositories:
      internal (http://localhost:8081/archiva/repository/internal)

Our organization POM cannot be found because it resides in the Archiva releases repository, and we don't have it configured in our settings.xml. The version in ../effectivemaven-parent/pom.xml is also not used because the versions now differ. To get past this problem, we must add the following configuration in the settings.xml:

    <profiles>
      <profile>
        <id>repositories</id>
        <activation>
          <activeByDefault>true</activeByDefault>
        </activation>
        <repositories>
          <repository>
            <id>releases</id>
            <name>Archiva Managed Releases Repository</name>
            <url>http://localhost:8081/archiva/repository/releases</url>
            <releases>
              <enabled>true</enabled>
            </releases>
            <snapshots>
              <enabled>false</enabled>
            </snapshots>
          </repository>
        </repositories>
      </profile>
    </profiles>

We already configured the <server> credentials for the releases repository when we tried deploying to Archiva using Maven so we no longer need to configure that.

If you try building Centrepoint again, the build will still fail. Notice that Maven didn't even seem to try looking for the artifact from the releases repository we added previously. This is because we have locked down Maven to use only the local mirror repository internal.
This is the effect of the <mirrorOf>*</mirrorOf> configuration in our settings.xml, Staying in Control with Archiva. Just change it to <mirrorOf>*,!releases</mirrorOf> so that Maven would respect the additional repositories. Execute the build again. This time we should be able to get a successful build. However, for every member of the team working on the Centrepoint project, the settings.xml (now over 40 lines long) is needed at the minimum. As the project grows bigger, more artifacts are added. Also, if these new artifacts are located in other repositories, you would need to add this repository to your settings.xml and so on and so forth. We already learned at the start of this section that in situations such as this, a repository group can make things easier for us developers. Let us see how we can create one. Let's go back to our running Archiva instance. Click Repository Groups, then type public in the Identifier field on the upper right-hand corner of the page and click Add Group. We now have a virtual repository named public with the following URL: http://localhost:8081/archiva/repository/public. You may change the name of the repository group to a more appropriate one if the repositories are not really for public consumption. To add managed repositories under the group, just select the repository you would like to add from the list under the created group and click Add Repository. Add the releases and internal repositories (this order is used so that requests for the organization's artifacts are never made on external proxied repositories). Note that we don't want to add the snapshots repository to the group as that might change the behavior of the repository. One example of this is when dealing with version ranges. You might end up getting a snapshot version instead of a released version. 
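The way a repository group answers a request, trying its member repositories in their configured order and returning the first match, can be modelled with a few lines of Python. This is only a sketch of the behaviour described in this section, not Archiva's implementation: the `REPOSITORIES` dictionary, the `PUBLIC_GROUP` list, and the `resolve_in_group` function are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional, Set

# Hypothetical in-memory stand-in for the managed repositories:
# repository id -> set of artifact paths that repository can serve.
REPOSITORIES: Dict[str, Set[str]] = {
    "releases": {
        "com/effectivemaven/effectivemaven-parent/1/effectivemaven-parent-1.pom",
    },
    "internal": {
        "org/apache/maven/plugins/maven-clean-plugin/2.2/maven-clean-plugin-2.2.pom",
    },
}

# Order matters: releases is listed first so that requests for the
# organization's own artifacts are answered before internal is consulted.
PUBLIC_GROUP: List[str] = ["releases", "internal"]


def resolve_in_group(group: List[str], path: str) -> Optional[str]:
    """Return the id of the first repository in the group that holds path."""
    for repo_id in group:
        if path in REPOSITORIES.get(repo_id, set()):
            return repo_id
    return None  # no member repository has the artifact
```

Calling `resolve_in_group(PUBLIC_GROUP, "com/effectivemaven/effectivemaven-parent/1/effectivemaven-parent-1.pom")` stops at `releases` without ever consulting `internal`, which is the reason the section recommends that ordering.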
Now, with this configuration, we are telling Archiva that if an artifact request is made on the repository group public, it should look for the artifact in these two repositories (based on the order they are listed) and return the first matching artifact it sees. You can change the ordering of the repositories to be searched by moving a repository up or down the repository group configuration via the Up and Down icons.

[Screenshot: the Repository Groups page after configuration]

Now that we have a repository group that we can use, let's configure it in our settings.xml. Remove the profile we added previously, and adjust the mirror section as follows:

<mirrors>
  <mirror>
    <id>public</id>
    <url>http://localhost:8081/archiva/repository/public</url>
    <mirrorOf>*</mirrorOf>
  </mirror>
</mirrors>

Notice how much shorter and simpler our settings.xml is now.

Group credentials

The guest user has access to all of the repositories in the group so we don't need a corresponding <server> for the mirror. However, if read access control applies to any repositories in the group, make sure to add a <server> for the ID of the mirror (not the underlying repositories that are no longer visible to Maven). The existing <server> definitions continue to be used for deployment, as deployment cannot be done to a group.

Let's try building Centrepoint again, but this time with a clean local repository, using the new settings.xml.
We should be able to see both com.effectivemaven:effectivemaven-parent:pom:1 and the other dependencies from the central repository being retrieved from our public repository group, ending with a successful build as follows:

[INFO] Scanning for projects...
Downloading: http://localhost:8081/archiva/repository/public//com/effectivemaven/effectivemaven-parent/1/effectivemaven-parent-1.pom
1K downloaded
[INFO] Reactor build order:
...
[INFO] ----------------------------------------------------------
[INFO] Building Centrepoint
[INFO] task-segment: [clean, install]
[INFO] ----------------------------------------------------------
Downloading: http://localhost:8081/archiva/repository/public//org/apache/maven/plugins/maven-clean-plugin/2.2/maven-clean-plugin-2.2.pom
3K downloaded
Downloading: http://localhost:8081/archiva/repository/public//org/apache/maven/plugins/maven-plugins/10/maven-plugins-10.pom

What else can we do with repository groups? Consider, for example, that we added new dependencies to our Centrepoint project and these dependencies are projects being worked on by another team within the company. Let's say the other team has its own deployment repository (separate from ours), managed by Archiva as well. We no longer need to make any changes in our settings.xml (or POM). The repository just needs to be added to the public repository group and the appropriate permissions assigned to the Centrepoint project developers' accounts. Configuration is much simpler now and is concentrated in Archiva itself. Developers and team members won't have to edit their settings.xml each time a new repository is needed.

RSS feeds: discovering new artifacts in your repository

RSS has become the de facto standard for news feeds and updates on the web. The Archiva community has seen how the project can take advantage of this current trend by providing RSS feeds for new artifacts in the repository.
Projects that use or depend on specific libraries would be able to know when a new release is available or when there is a new build. This is especially useful when a project is dependent on a fix that would be available in the next release or in the next build. A Repository Observer role is required at least in order to subscribe to a feed in Archiva. There are two levels of RSS feeds available in Archiva: repository level and artifact level. In the following sections, we will be using Thunderbird's RSS feed reader for demonstration purposes. You can get Thunderbird from http://www.mozillamessaging.com/en-US/thunderbird/ and can set it up using the installation guides at http://www.mozillamessaging.com/en-US/support/. You can also use other RSS feed readers such as Google Reader. If access to your repositories require authentication, your feed reader must support authentication. If security is lenient, you can just disable authentication for read operations to your repository by granting the guest account the Repository Observer role.

Archiva in a Team: Part 2

Packt
18 Nov 2009
7 min read
Deleting artifacts in your repository Sometimes the need for deleting artifacts from the repository arises. For example, if an artifact was deployed by accident to the repository or the artifact has already been released but an old snapshot version is still available. In Archiva, there are different ways of deleting artifacts from the repository—through WebDAV, via the web application, through the scheduled repository purging, or by directly deleting it in the file system. It is not recommended that artifacts be deleted directly from the file system. Not only does it require access to the server itself, it is also prone to error. Artifacts that should not be deleted could be deleted by mistake. In case you still want to directly delete an artifact from the file system, all files related to the artifact such as metadata files and checksums must also be deleted. The repository must be scanned as well in order to update the metadata files. This can be done by clicking the Scan Repository Now button of the repository configuration in the Repositories page. The database scanning also needs to be explicitly executed to immediately remove the deleted artifact from the database. One of the advantages of using the Delete Artifact form in the web application is that you do not need to have direct access to the server. All you need is the required Archiva permissions, which come with the Repository Manager role (without the permissions Delete Artifact will not be visible in the navigation menu). Another advantage is that the repository scanning no longer needs to be explicitly executed as Archiva already executes the repository and database scanning consumers to update the index and the database for you. Now, let's try deleting an old artifact from one of the repositories. If you go to http://localhost:8081/archiva/repository/snapshots/com/effectivemaven/centrepoint/centrepoint, the old 1.0-SNAPSHOT version of the project still exists. 
We will remove this artifact from the repository using the delete artifact web form. First, click Delete Artifact from the navigation menu and then fill in the form as follows: Click the Submit button. After the artifact has been deleted, you should see the confirmation message Artifact 'com.effectivemaven.centrepoint:centrepoint:1.0-SNAPSHOT' was successfully deleted from repository 'snapshots'. If you browse the repository at http://localhost:8081/archiva/repository/snapshots, the related artifacts such as the POM, maven-metadata.xml, and the checksums were also deleted. To delete artifacts through WebDAV, just open the repository using a WebDAV client and delete the artifact like in a regular file system. As for the scheduled repository purging, we will discuss this in the following sections. We have tackled the subjects of repository groups, RSS feeds, and deleting artifacts in the repository. This article would never be complete without covering repository maintenance. The succeeding sections will be all about that. The Archiva reports Archiva generates two types of reports. These are the repository statistics, providing information such as statistical data of a repository's content and the repository health report, which makes us aware of any problems in the repository such as artifacts that have invalid POM files. Both accept different criteria for customizing the generated output as seen in the following screenshot: Now, let's discuss the configuration for each report. Repository statistics This report provides statistical repository information such as the total number of artifacts in the repository, its total size, the number of plugins in the repository, and the likes based on a given repository scan execution time. This report can be used for analyzing the current content of your repositories, and tracking its growth, usage, and evolution over time. The report can be constrained by the given Start Date and End Date. 
If no Start Date and End Date are provided, all statistics right from the start up to the current date will be included in the report (to a maximum of the number of rows given in the Row Count). For the Repository Statistics, we can also configure the Repositories To Be Compared. If only one repository is selected in Repositories To Be Compared, the generated report will contain details of a single repository. The following is a sample report where only one repository is selected: Let's run through the contents of the sample Repository Statistics report given previously for repository internal. The Total File Count pertains to the total number of files in the repository during each execution of the repository scan. The Total Size, on the other hand, is the size (in bytes) of the repository at that time. The number of unique groups and artifact names are broken down in the report as well as the number of plugins, archetypes, JAR, and WAR files. The last two columns—number of deployments and artifact requests—are not yet implemented but will be fixed in the future releases. On the other hand, if more than one repository is selected in the Repositories To Be Compared, the generated report would contain a comparison of the latest statistics of the repositories based on the specified End Date. This is useful for tracking which repositories are the most utilized. For example, if different development groups host their own repositories, the comparison can show which groups are using the most space. Look at the following screenshot for a sample comparison report to see the difference from the previous one: To allow you to view this report outside of the web application, the report can be exported as a CSV file by clicking on the Export to CSV link. You should be able to open the exported file as an Excel spreadsheet. Repository health One of the secrets behind a successful and reproducible build is a clean and healthy repository. 
Corrupt metadata or an invalid or missing POM file are the usual causes for a build to break. To prevent this from happening, we must ensure that the repositories we are getting our artifacts from are in good health. Archiva provides a way of doing this through the Repository Health report and its built-in utilities for updating metadata and fixing checksums. The Repository Health report provides a detailed list of artifacts in the repository that are found to be defective. It gives a starting point for correcting any problems and can be used when diagnosing build errors with a particular artifact. For example, a common reason for an artifact being defective is when the version of the artifact specified in the POM is different from the actual version in its filename. This could easily happen when using deploy:deploy-file (or even using the Archiva web upload form) as the actual filename used for the uploaded artifact is determined based on the supplied parameters. It is a possibility that the included POM in the upload has different coordinates from the provided parameters. These defects are discovered during Archiva's database scan, when the actual POM file is read and added to the database. We can narrow down the report by providing a specific Group ID and/or a Repository ID which will be used for querying defective artifacts that match these criteria. If you try querying for the report using the default configuration, you should be able to see a generated report similar to the following one, which shows a defective POM in repository internal. To repair such an error, you can manually fix the POM in the Archiva repository by updating it in the file system. If the defect is caused by a transfer error when the artifact was proxied, you can delete the artifact (including the metadata and checksums) then force Archiva to retrieve it again by requesting it. A word of caution though—making these changes could affect the reproducibility of a dependent project's build. 
For example, it is possible that the actual artifact in the central repository is the defective one. If you fixed the artifact in your internal Archiva repository, project builds that go through the local proxy may get a successful build. However, the project is built directly off central and the build fails because the dependency artifact is defective. That summarizes monitoring the health of our repositories. The next section discusses the built-in Archiva utilities which in one way or another clean up and repair broken artifacts and metadata in the repositories.    
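The version-mismatch defect described in this section, where the coordinates inside a POM disagree with the filename it was stored under, can be illustrated with a small check. This is a rough sketch and not how Archiva's database scan is implemented: the `is_consistent` function and the sample POM are assumptions made for illustration, and only the standard `<artifactId>` and `<version>` elements of a POM are read.

```python
import xml.etree.ElementTree as ET
from typing import Optional

POM_NS = "http://maven.apache.org/POM/4.0.0"


def _child_text(root: ET.Element, tag: str) -> Optional[str]:
    """Read a direct child of <project>, with or without the POM namespace."""
    for candidate in (f"{{{POM_NS}}}{tag}", tag):
        node = root.find(candidate)
        if node is not None and node.text:
            return node.text.strip()
    return None


def is_consistent(filename: str, pom_xml: str) -> bool:
    """True when filename equals <artifactId>-<version>.pom as declared in the POM."""
    root = ET.fromstring(pom_xml)
    artifact = _child_text(root, "artifactId")
    version = _child_text(root, "version")
    if artifact is None or version is None:
        # Nothing to compare against (for example, the version is inherited
        # from a parent POM); this sketch reports that as not confirmed.
        return False
    return filename == f"{artifact}-{version}.pom"


SAMPLE_POM = (
    "<project>"
    "<groupId>com.effectivemaven</groupId>"
    "<artifactId>effectivemaven-parent</artifactId>"
    "<version>1</version>"
    "</project>"
)
```

With the sample above, `is_consistent("effectivemaven-parent-1.pom", SAMPLE_POM)` holds, while a file uploaded as `effectivemaven-parent-2.pom` with the same POM content would be flagged.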

Joomla! with Flash: Showing maps using YOS amMap

Packt
18 Nov 2009
12 min read
Showing maps using YOS amMap Adding a map to your site may be a necessity in some cases. For example, you want to show the population of countries, or you want to show a world map to your students for teaching geography. Flash maps are always interesting as you can interact with them and can view them as you like. amMap provides tools for showing Flash maps. The amMap tool is ported as a Joomla! component by yOpensource, and the component is released with the name YOS amMap. This component has two versions—free and commercial. The commercial or pro version has some advanced features that are not available in the free version. The YOS amMap component, together with its module, allows you to display a map of the world, a region, or a country. You can choose the map to be displayed, which areas or countries are to be highlighted, and the way in which the viewers can control the map. Generally, maps displayed through the YOS amMap component can be zoomed, centered, or scrolled to left, right, top, or bottom. You can also specify a color in which a region or a country should be displayed. Installing and configuring YOS amMap To use YOS amMap with your Joomla! website, you must first download it from http://yopensource.com/en/component/remository/?func=fileinfo&id=3. After downloading and extracting the compressed package, you get the component and module packages. Install the component and module from the Extensions | Install/Uninstall screen. Once installed, you can administer the YOS amMap component from Components | YOS amMap. This shows the YOS amMap Control Panel, as shown in the following screenshot: YOS amMap Control Panel displays several icons through which you can configure and publish maps. The first thing you should do is to configure the global settings for amMap. In order to do this, click on the Parameters icon in the toolbar. 
Doing so brings up the dialog box, as shown in the following screenshot: In the Global Configuration section, you can enter a license key if you have purchased the commercial or the pro version of this component. For the free version, this is not needed. In this section, you can also configure the legal extensions of files that can be uploaded through this component, the maximum file size for uploads, the legal image extensions, and the allowed MIME types of all uploads. You can also specify whether the Flash uploader will be used or not. Once you have configured these fields, click on the Save button and return to YOS amMap Control Panel. Adding map files You can see the list of available maps by clicking on the Maps icon on the YOS amMap Control Panel screen or by clicking on Components | amMap | Maps. This shows the Maps Manager screen, as shown in the next screenshot. As you can see, the Maps Manager screen displays the list of available maps. By default, you find the world.swf, continents.swf, and world_with_antartica.swf map files. You will find some extra maps with the amMap bundle. You can also download the original amMap package from http://www.ammap.com/download. After downloading the ZIP package, extract it, and you will find many maps in the maps subfolder. Any map from this folder can be uploaded to the Joomla! site from the Maps Manager screen. Creating a map There are several steps for creating a map using YOS amMap. First we need to upload the package for the map. For example, if we want to display the map of the United States of America, then we need to upload the map template, the map data file, and the map settings file for the United States of America. To do this first upload the map template from the Maps Manager screen. You will find the map template for USA in the ammap/maps folder. Then we need to upload the data and the settings files. For doing so, click on the Upload link on the YOS amMap Control Panel screen. 
Then, in the Upload amMap screen, which is shown in the next screenshot, type the map's title (United States) in the Title field. Before clicking on the Browse button besides the Package File field, you first add the ammap_data.xml and the ammap_settings.xml files to a single ZIP file, unitedstates.zip. Now, click on the Browse button, and select this unitedstates.zip file. Then click on the Upload File & Install button. Once uploaded successfully, you see this map listed in the YOS amMap Manager screen, as shown in the next screenshot. You get this screen by clicking on the amMaps link on the toolbar. As you can see, the map that we have added is now listed in the YOS amMap Manager screen. However, the map is yet in an unpublished state, and we need to configure the map before publishing it. We need to configure its data and settings files, which are discussed in the following sections. Map data file The different regions of a map are identified by the map data file. This is an XML file and it defines the areas to be displayed on the map. The typical structure of a map data file can be understood by examining ammap_data.xml. The file has many comments that explain its structure. 
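The packaging step mentioned above, putting ammap_data.xml and ammap_settings.xml into a single unitedstates.zip, can also be scripted. The following is a minimal sketch using Python's standard zipfile module; the `build_map_package` helper is a hypothetical name, and storing both files at the root of the archive is an assumption about what the uploader expects.

```python
import zipfile
from pathlib import Path


def build_map_package(data_file: Path, settings_file: Path, package: Path) -> Path:
    """Bundle the map data and settings files into one ZIP archive."""
    with zipfile.ZipFile(package, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname drops the directory part so both files sit at the archive
        # root (an assumption about the layout the component expects).
        archive.write(data_file, arcname=data_file.name)
        archive.write(settings_file, arcname=settings_file.name)
    return package
```

For instance, `build_map_package(Path("ammap_data.xml"), Path("ammap_settings.xml"), Path("unitedstates.zip"))` would produce the package that is then selected with the Browse button.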
The ammap_data.xml file looks as follows:

<?xml version="1.0" encoding="UTF-8"?>
<map map_file="maps/world.swf" tl_long="-168.49" tl_lat="83.63" br_long="190.3" br_lat="-55.58" zoom_x="0%" zoom_y="0%" zoom="100%">
  <areas>
    <area title="AFGHANISTAN" mc_name="AF"></area>
    <area title="ALAND ISLANDS" mc_name="AX"></area>
    <area title="BANGLADESH" mc_name="BD"></area>
    <area title="BHUTAN" mc_name="BT"></area>
    <area title="CANADA" mc_name="CA"></area>
    <area title="UNITED ARAB EMIRATES" mc_name="AE"></area>
    <area title="UNITED KINGDOM" mc_name="GB"></area>
    <area title="UNITED STATES" mc_name="US"></area>
    <area title="borders" mc_name="borders" color="#FFFFFF" balloon="false"></area>
  </areas>
  <movies>
    <movie lat="51.3025" long="-0.0739" file="target" width="10" height="10" color="#CC0000" fixed_size="true" title="build-in movie usage example"></movie>
    <movie x="59.6667%" y="77.5%" file="icons/pin.swf" title="loaded movie usage example" text_box_width="250" text_box_height="140">
      <description>
        <![CDATA[You can add description text here. This text will appear the user clicks on the movie. this description text can be html-formatted (for a list which html tags are supported, visit <u><a href="http://livedocs.adobe.com/flash/8/main/00001459.html">this page</a></u>. You can add descriptions to areas and labels too.]]>
      </description>
    </movie>
  </movies>
  <labels>
    <label x="0" y="50" width="100%" align="center" text_size="16" color="#FFFFFF">
      <text><![CDATA[<b>World Map]]></text>
      <description><![CDATA[]]></description>
    </label>
  </labels>
  <lines>
    <line long="-0.0739, -74" lat="51.3025, 40.43" arrow="end" width="1" alpha="40"></line>
  </lines>
</map>

This code is a stripped-down version of the default ammap_data.xml file. Let us examine its structure and try to understand the meaning of each markup:

<map> </map>: You define the map's structure using this markup. First, by using the map_file attribute, we declare the map file that should be used to display this map. This markup has some other attributes through which we declare the longitude and latitude of the map's top-left (tl_long, tl_lat) and bottom-right (br_long, br_lat) corners. We can also specify the zooming level using the zoom_x, zoom_y, and zoom attributes.

<areas> </areas>: Areas are the regions or countries on a map. These are defined in the map template. We only need to define the areas that we want to display. For example, in the sample, we have defined eight countries to be displayed, along with the borders area. Each area element has several attributes, among which you need to mention mc_name and title. You specify the area's name in mc_name, which is predefined in the map template. The title attribute will be displayed as the title of that map area. For example, <area mc_name="BD" title="Bangladesh"></area> means the area marked as BD in the map template will be displayed with the title Bangladesh. In order to specify the mc_name attribute, you need to follow the map template designer's instructions.

<movies> </movies>: Movies are extra clips that can be displayed as a separate layer on the map. For example, to display the capital of each country, a movie clip could be displayed at the specified latitude and longitude. You can also display some other animations or text using a movie definition.

<labels> </labels>: The <labels> markup contains the text to be displayed on the map. You can add any text on a map by defining a label element.

To view and edit the map data file, ammap_data.xml, click on the map name on the YOS amMap Manager screen. This opens up the amMap: [Edit] screen.

[Screenshot: the amMap: [Edit] screen]

The amMap: [Edit] screen displays several configurations for the map. From the Details section you can change the map name, publish the map, and enable security. From the Design section you can view and edit the data and the settings files. Clicking on Data will show the data file. You can edit the data file from the online editor.
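To make the structure of the data file concrete, the short sketch below reads a data file and lists which areas it declares. It relies only on the element and attribute names shown in this section (`map`, `areas`, `area`, `mc_name`, `title`, `map_file`); the `list_areas` function and the shortened `SAMPLE` document are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# A shortened data file using the same markup as ammap_data.xml.
SAMPLE = """<map map_file="maps/world.swf" zoom="100%">
  <areas>
    <area title="BANGLADESH" mc_name="BD"></area>
    <area title="CANADA" mc_name="CA"></area>
    <area title="borders" mc_name="borders" color="#FFFFFF" balloon="false"></area>
  </areas>
</map>"""


def list_areas(data_xml: str) -> Dict[Optional[str], Optional[str]]:
    """Map each area's mc_name (its id in the map template) to its title."""
    root = ET.fromstring(data_xml)
    return {
        area.get("mc_name"): area.get("title")
        for area in root.findall("./areas/area")
    }


def map_template(data_xml: str) -> Optional[str]:
    """Return the map template named by the map_file attribute."""
    return ET.fromstring(data_xml).get("map_file")
```

A check like this can be handy before uploading a hand-edited data file, since a misspelled mc_name will not match any region in the map template.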
As we want to display the map of USA, we will make the following changes on the amMap: [Edit] screen:

1. Select usa.swf in the Maps list.
2. Change the data file as follows:

<?xml version="1.0" encoding="UTF-8"?>
<map map_file="maps/usa.swf" zoom="100%" zoom_x="7.8%" zoom_y="0.18%">
  <areas>
    <area mc_name="AL" title="Alabama"/>
    <area mc_name="AK" title="Alaska"/>
    <area mc_name="AZ" title="Arizona"/>
    <area mc_name="AR" title="Arkansas"/>
    <area mc_name="CA" title="California"/>
    <area mc_name="CO" title="Colorado"/>
    <area mc_name="CT" title="Connecticut"/>
    <area mc_name="DE" title="Delaware"/>
    <area mc_name="DC" title="District of Columbia"/>
    <area mc_name="FL" title="Florida"/>
    <area mc_name="GA" title="Georgia"/>
    <area mc_name="HI" title="Hawaii"/>
    <area mc_name="ID" title="Idaho"/>
    <area mc_name="IL" title="Illinois"/>
    <area mc_name="IN" title="Indiana"/>
    <area mc_name="IA" title="Iowa"/>
    <area mc_name="KS" title="Kansas"/>
    <area mc_name="KY" title="Kentucky"/>
    <area mc_name="LA" title="Louisiana"/>
    <area mc_name="ME" title="Maine"/>
    <area mc_name="MD" title="Maryland"/>
    <area mc_name="MA" title="Massachusetts"/>
    <area mc_name="MI" title="Michigan"/>
    <area mc_name="MN" title="Minnesota"/>
    <area mc_name="MS" title="Mississippi"/>
    <area mc_name="MO" title="Missouri"/>
    <area mc_name="MT" title="Montana"/>
    <area mc_name="NE" title="Nebraska"/>
    <area mc_name="NV" title="Nevada"/>
    <area mc_name="NH" title="New Hampshire"/>
    <area mc_name="NJ" title="New Jersey"/>
    <area mc_name="NM" title="New Mexico"/>
    <area mc_name="NY" title="New York"/>
    <area mc_name="NC" title="North Carolina"/>
    <area mc_name="ND" title="North Dakota"/>
    <area mc_name="OH" title="Ohio"/>
    <area mc_name="OK" title="Oklahoma"/>
    <area mc_name="OR" title="Oregon"/>
    <area mc_name="PA" title="Pennsylvania"/>
    <area mc_name="RI" title="Rhode Island"/>
    <area mc_name="SC" title="South Carolina"/>
    <area mc_name="SD" title="South Dakota"/>
    <area mc_name="TN" title="Tennessee"/>
    <area mc_name="TX" title="Texas"/>
    <area mc_name="UT" title="Utah"/>
    <area mc_name="VT" title="Vermont"/>
    <area mc_name="VA" title="Virginia"/>
    <area mc_name="WA" title="Washington"/>
    <area mc_name="WV" title="West Virginia"/>
    <area mc_name="WI" title="Wisconsin"/>
    <area mc_name="WY" title="Wyoming"/>
  </areas>
  <labels>
    <label x="0" y="60" width="100%" color="#FFFFFF" text_size="18">
      <text>Map of the United States of America</text>
    </label>
  </labels>
</map>

As you can see, we have defined regions (states) on the map of USA, and towards the end of the file, we have added a label for the map.

3. Select Yes for the Published field in the Details section.

When you are done making these changes, click on the Save button to save them. Now we will look into the map settings file. Map data files for countries are available with the amMap package. Thus, if you download amMap 2.5.1, you will get the map settings files for different countries. For example, the map data file for USA will be in the amMap_2.5.1/examples/_countries/usa folder.
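Typing fifty-one area elements by hand is error-prone, so a data file like the one for the USA map could instead be generated from a lookup table. The sketch below is one possible way to do that with Python's standard library, assuming only the markup shown in this article; `build_map_data` and the shortened `STATES` table are hypothetical names, and the label attributes simply repeat the values used in the USA example.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Shortened lookup table: mc_name (as defined in usa.swf) -> title.
STATES: Dict[str, str] = {"AL": "Alabama", "AK": "Alaska", "AZ": "Arizona"}


def build_map_data(map_file: str, areas: Dict[str, str], label: str) -> str:
    """Build an amMap-style data document with one <area> per entry."""
    root = ET.Element("map", {"map_file": map_file, "zoom": "100%"})
    areas_el = ET.SubElement(root, "areas")
    for code, title in areas.items():
        ET.SubElement(areas_el, "area", {"mc_name": code, "title": title})
    labels_el = ET.SubElement(root, "labels")
    label_el = ET.SubElement(
        labels_el,
        "label",
        {"x": "0", "y": "60", "width": "100%", "color": "#FFFFFF", "text_size": "18"},
    )
    ET.SubElement(label_el, "text").text = label
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")
```

The string returned by `build_map_data("maps/usa.swf", STATES, "Map of the United States of America")` can be pasted into the online editor for the data file; extending `STATES` to all states reproduces the area list shown above.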

Plotting Geographical Data using Basemap

Packt
18 Nov 2009
3 min read
Basemap is a Matplotlib toolkit, a collection of application-specific functions that extends Matplotlib functionalities, and its complete documentation is available at http://matplotlib.sourceforge.net/basemap/doc/html/index.html. Toolkits are not present in the default Matplotlib installation (in fact, they also have a different namespace, mpl_toolkits), so we have to install Basemap separately. We can download it from http://sourceforge.net/projects/matplotlib/, under the matplotlib-toolkits menu of the download section, and then install it following the instructions in the documentation link mentioned previously. Basemap is useful for scientists such as oceanographers and meteorologists, but other users may also find it interesting. For example, we could parse the Apache log and draw a point on a map using GeoIP localization for each connection. We use the 0.99.3 version of Basemap for our examples.

First example

Let's start playing with the library. It contains a lot of things that are very specific, so we're going to just give an introduction to the basic functions of Basemap.

# pyplot module import
import matplotlib.pyplot as plt
# basemap import
from mpl_toolkits.basemap import Basemap
# Numpy import
import numpy as np

These are the usual imports along with the basemap module.

# Lambert Conformal map of USA lower 48 states
m = Basemap(llcrnrlon=-119, llcrnrlat=22,
            urcrnrlon=-64, urcrnrlat=49,
            projection='lcc', lat_1=33, lat_2=45, lon_0=-95,
            resolution='h', area_thresh=10000)

Here, we initialize a Basemap object, and we can see it has several parameters depending upon the projection chosen. Let's see what a projection is: In order to represent the curved surface of the Earth on a two-dimensional map, a map projection is needed. This conversion cannot be done without distortion. Therefore, there are many map projections available in Basemap, each with its own advantages and disadvantages.
Specifically, a projection can be:

  equal-area (the area of features is preserved)
  conformal (the shape of features is preserved)

No projection can be both (equal-area and conformal) at the same time. In this example, we have used a Lambert Conformal map. This projection requires additional parameters to work with. In this case, they are lat_1, lat_2, and lon_0. Along with the projection, we have to provide the information about the portion of the Earth surface that the map projection will describe. This is done with the help of the following arguments:

Argument     Description
llcrnrlon    Longitude of lower-left corner of the desired map domain
llcrnrlat    Latitude of lower-left corner of the desired map domain
urcrnrlon    Longitude of upper-right corner of the desired map domain
urcrnrlat    Latitude of upper-right corner of the desired map domain

The last two arguments are:

Argument     Description
resolution   Specifies what the resolution is of the features added to the map (such as coast lines, borders, and so on), here we have chosen high resolution (h), but crude, low, and intermediate are also available.
area_thresh  Specifies what the minimum size is for a feature to be plotted. In this case, only features bigger than 10,000 square kilometer

The DataGrid API with IBM WebSphere eXtreme Scale 6: Part 2

Packt
18 Nov 2009
21 min read
Aggregate results One thing to be aware of with the MapGridAgent interface is its potential for a partition to send huge result maps to a client. This is the nature of the map function. Its output size can be proportional to its input size if we don't use a query to select    specific objects to work with or specify a key set. In this case, we need a specific result for every key, with the key set as narrow as we can make it. We then just need to deal with large maps once in a while. What if we need an aggregate result for a key set? Instead of an operation and result for each element, we need an operation over all elements with just one result. Simple examples include the highest or lowest number in a set, and the earliest or total payroll expenses in a management hierarchy. In these examples, we need data from a set of elements in a partition, but we don't need a result for each. We only want one result for the entire set of objects. Going back to our functional programming reference, this is where the reduce function shines. Like the map function, reduce has a corresponding grid agent interface. The reduce function takes a collection of input keys and only produces one result for the entire collection. The result is typically an aggregate result: a sum, product, max, min, average, or any other aggregate function. Classes that implement ReduceGridAgent are used as parameters to the AgentManager#callReduceAgent(ReduceGridAgent agent, Collection keys) and AgentManager#callReduceAgent(agent) methods. The implementation itself is similar to the MapGridAgent pattern. The reduce grid agent we write operates on a collection of known keys or an unknown key set. If we have a known key set, then we will run the agent with AgentManager#callReduceAgent(agent, keys). If the key set is not known, and if we need a query to find the interesting objects, then we will call the AgentManager#callReduceAgent(agent). 
Let's write a ReduceGridAgent that finds the largest integer in a set. We'll start with a naïve implementation for finding the largest integer in an array:

    public int findLargestInteger(Integer[] ints) {
        int largestInt = ints[0];
        for (int i = 0; i < ints.length; i++) {
            if (ints[i] > largestInt) {
                largestInt = ints[i];
            }
        }
        return largestInt;
    }

Implementing ReduceGridAgent requires three methods. Two of those methods look like the process methods in MapGridAgent: ReduceGridAgent#reduce(session, map, keys) and ReduceGridAgent#reduce(session, map). Like its MapGridAgent counterparts, the reduce method that accepts keys in the signature works with keys or Entity objects. The reduce method without keys in the signature should use a Query to find the objects most interesting to our business logic.

    public class LargestIntReduceAgent implements ReduceGridAgent, EntityAgentMixin {

        public Object reduce(Session session, ObjectMap map, Collection keys) {
            MyInteger largestInt = null;
            Iterator iter = keys.iterator();
            while (iter.hasNext()) {
                MyInteger myInt = (MyInteger) iter.next();
                if (myInt.greaterThan(largestInt)) {
                    largestInt = myInt;
                }
            }
            return largestInt;
        }

        public Object reduce(Session session, ObjectMap map) {
            // Nothing to do for now!
            return null;
        }

        public Object reduceResults(Collection results) {
            // Nothing to do for now!
            return null;
        }

        public Class getClassForEntity() {
            return MyInteger.class;
        }
    }

The first reduce method is similar in signature to the MapGridAgent#process(session, map, key) method. The difference is that the third argument in ReduceGridAgent#reduce(session, map, keys) is a collection of keys rather than one key. This immediately illustrates the difference between the two: a map operation takes place on only one element, while reduce operates on the entire collection. With a known key set, the ReduceGridAgent#reduce(session, map, keys) method is called. Without a key set passed to the AgentManager#callReduceAgent(agent) method, the ReduceGridAgent#reduce(session, map) method is called.
This method should use a Query to find the objects we want to use in our business logic. The keys or entity objects can then be passed to the ReduceGridAgent#reduce(session, map, keys) method for the actual business logic.

We submit this agent to the grid in almost the same way as we submit a MapGridAgent to the grid. AgentManager has two callReduceAgent methods. The first takes a collection of keys as an argument, while the second does not. Submitting this agent to the grid looks like this:

    Collection numbers = new ArrayList();
    for (int i = 0; i < 10000; i++) {
        numbers.add(i);
    }
    ReduceGridAgent agent = new LargestIntReduceAgent();
    AgentManager am = session.getMap("MyInteger").getAgentManager();
    am.callReduceAgent(agent, numbers);

This looks so similar to submitting a MapGridAgent to the grid that you may miss the method change to am.callReduceAgent(agent, keys). The programming models are so similar that you may ask why there isn't just one generic callAgent method. Take a look at the ReduceGridAgent, particularly the ReduceGridAgent#reduceResults(results) method. This method is called on the client side after all instances of the agent return their results. At this point, we have a collection holding one result from each partition. It is acceptable for AgentManager#callMapAgent(agent, keys) to return the merged results here. AgentManager#callReduceAgent(agent, keys) must return one result for the entire operation.
The ReduceGridAgent#reduceResults(results) method aggregates each partition's aggregate results:

    public class LargestIntReduceAgent implements ReduceGridAgent, EntityAgentMixin {

        public Object reduce(Session session, ObjectMap map, Collection keys) {
            return findLargestInt(keys);
        }

        public Object reduce(Session session, ObjectMap map) {
            // Nothing to do for now!
            return null;
        }

        public Object reduceResults(Collection results) {
            return findLargestInt(results);
        }

        public Class getClassForEntity() {
            return MyInteger.class;
        }

        private MyInteger findLargestInt(Collection keys) {
            MyInteger largestInt = null;
            Iterator iter = keys.iterator();
            while (iter.hasNext()) {
                MyInteger myInt = (MyInteger) iter.next();
                if (myInt.greaterThan(largestInt)) {
                    largestInt = myInt;
                }
            }
            return largestInt;
        }
    }

ReduceGridAgent#reduceResults(results) is responsible for producing the final result passed back to the AgentManager#callReduceAgent(agent, keys) caller. Sometimes, the reduce operation performed in this final aggregation is the same as the operation performed in the ReduceGridAgent#reduce(session, map, keys) method. Sometimes, the operation is different. In our case, it is the same, so we refactor the reduce operation into a private method.

Finishing off the ReduceGridAgent, we come to ReduceGridAgent#reduce(session, map). The method signature is similar to MapGridAgent#processAllEntries(session, map), which should be a hint that they have a similar purpose. ReduceGridAgent#reduce(session, map) is called when a key list is not provided to AgentManager#callReduceAgent(agent). ReduceGridAgent#reduce(session, map) should limit the number of objects used in the reduce operation. As with MapGridAgent#processAllEntries(session, map), we typically use a Query.
While the reduce agent does not send large results back to the client, we still care about finding objects that meet our criteria to use in the reduce operation:

    public Object reduce(Session session, ObjectMap map) {
        EntityManager em = session.getEntityManager();
        Query q = em.createQuery("select m from MyInteger m " +
            "where m.integer > 0 " +
            "and m.integer < 10000");
        Iterator iter = q.getResultIterator();
        Collection<MyInteger> keys = new ArrayList<MyInteger>();
        while (iter.hasNext()) {
            MyInteger mi = (MyInteger) iter.next();
            keys.add(mi);
        }
        return reduce(session, map, keys);
    }

Though these are not strict rules, this method usually follows a pattern like MapGridAgent#processAllEntries(session, map):

1. Run a query to limit the number of objects used in the reduce operation.
2. Create a collection of keys used by the reduce operation. We're using entities here. Rather than duplicating the reduce operation in this method, we put entities from the Query in the key collection. ReduceGridAgent#reduce(session, map, keys), when using Entities, expects a collection of MyInteger objects.
3. Call ReduceGridAgent#reduce(session, map, keys) using the key collection we just created.

There is no rule against re-implementing the reduce operation in each method, but we'll be good software engineers and keep it DRY. If we can massage the query results into arguments the reduce method accepts, then we have enough reason to reuse it. At this point, we can submit this agent to the grid with or without a set of known keys and get the largest MyInteger back.

In both the MapGridAgent and the ReduceGridAgent, we used a Query to limit the number of objects used in each operation:

    Query q = em.createQuery("select m from MyInteger m " +
        "where m.integer > 0 " +
        "and m.integer < 10000");
    Iterator iter = q.getResultIterator();

Obviously, this query is limited in what it can do. The criteria are hardcoded into the query. This query can only find MyIntegers with values between 0 and 10,000.
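The "select, collect, delegate" pattern can be tried outside the grid with a small stand-in. In the sketch below, the MyInteger class (including its null-tolerant greaterThan) and the reduceAll method are assumptions made for illustration; the article's real MyInteger entity is not shown in this part, and a plain loop stands in for the Query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class QueryThenReduceSketch {

    // Hypothetical stand-in for the article's MyInteger entity.
    static final class MyInteger {
        final int integer;

        MyInteger(int integer) {
            this.integer = integer;
        }

        // A null argument counts as "smaller", so the first element always wins.
        boolean greaterThan(MyInteger other) {
            return other == null || integer > other.integer;
        }
    }

    // Keyed variant: the reduce logic lives here and nowhere else.
    static MyInteger reduce(Collection<MyInteger> keys) {
        MyInteger largest = null;
        for (MyInteger candidate : keys) {
            if (candidate.greaterThan(largest)) {
                largest = candidate;
            }
        }
        return largest;
    }

    // Keyless variant: select first (a loop stands in for the Query),
    // collect the matches, then delegate to the keyed variant.
    static MyInteger reduceAll(Collection<MyInteger> partitionContents) {
        List<MyInteger> keys = new ArrayList<>();
        for (MyInteger m : partitionContents) {
            if (m.integer > 0 && m.integer < 10000) {
                keys.add(m);
            }
        }
        return reduce(keys);
    }

    public static void main(String[] args) {
        List<MyInteger> contents = Arrays.asList(
                new MyInteger(-5), new MyInteger(120),
                new MyInteger(9999), new MyInteger(20000));
        System.out.println(reduceAll(contents).integer); // prints 9999
    }
}
```

Note that 20000 is the largest value stored, but the hardcoded range excludes it, which is exactly the limitation the text points out.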
Initially, we hardcoded these values because the agent runs on a partition in a container. Fortunately, we can pass additional data along with our agent.

Using ephemeral objects in agents

In the previous examples, we hardcoded the query criteria in the process and reduce methods. We should let the client-side program set those parameters instead of dictating what range of numbers the queries operate on. Right now, our queries are limited to exactly what is coded.

A grid agent is just a POJO. It can have fields, getter and setter methods, and any other methods outside of the implemented grid agent interface. It's probably best to limit functionality to grid agent functionality, but that doesn't mean that we can't have fields or other objects on the implementing class. We'll send additional data to the grid by adding fields to the implementing class:

    public class LargestIntReduceAgent implements ReduceGridAgent, EntityAgentMixin {

        private Integer minValue;
        private Integer maxValue;

        // Reduce methods omitted for brevity

        public void setMinValue(Integer min) {
            this.minValue = min;
        }

        public void setMaxValue(Integer max) {
            this.maxValue = max;
        }
    }

The only requirement for sending these additional fields to the grid is that they must each be serializable. Sending these objects to the grid is probably a one-way trip. Unless they're passed back as part of a map result, we cannot use them to communicate state between client and grid. The grid agent instance used on the client side does not get a copy of the state of grid agent variables when the agents finish execution in the grid. Including grid state objects in the result set is bad practice and unnecessary.
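The "one-way trip" follows from how Java serialization works: the grid operates on a copy of the agent, not on the client's instance. The sketch below imitates that with an in-memory serialization round trip. RangeAgent, its objectsSeen counter, and copyViaSerialization are hypothetical names used only for this illustration, and no grid is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class OneWayTripSketch {

    // Hypothetical agent-like POJO carrying query parameters as fields.
    static class RangeAgent implements Serializable {
        private static final long serialVersionUID = 1L;
        Integer minValue;
        Integer maxValue;
        int objectsSeen; // state that only the "grid side" copy will change
    }

    // Imitates shipping the agent to a partition: serialize, then deserialize.
    static RangeAgent copyViaSerialization(RangeAgent agent) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(agent);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RangeAgent) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        RangeAgent clientSide = new RangeAgent();
        clientSide.minValue = 500;
        clientSide.maxValue = 5000;

        RangeAgent gridSide = copyViaSerialization(clientSide);
        gridSide.objectsSeen = 99; // work done "on the grid"

        System.out.println(gridSide.minValue);      // prints 500: parameters arrived
        System.out.println(clientSide.objectsSeen); // prints 0: nothing flowed back
    }
}
```

Anything the client needs to learn from the agent therefore has to come back through the agent's return value.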
Before we pass the agent to AgentManager#callReduceAgent(agent), we set the fields used in the partition-side query. The variable is declared as LargestIntReduceAgent because the setters are not part of the ReduceGridAgent interface:

    LargestIntReduceAgent agent = new LargestIntReduceAgent();
    agent.setMinValue(500);
    agent.setMaxValue(5000);
    AgentManager am = session.getMap("MyInteger").getAgentManager();
    am.callReduceAgent(agent);

The ReduceGridAgent#reduce(session, map) method requires a small change to use our new query parameters:

    public Object reduce(Session session, ObjectMap map) {
        EntityManager em = session.getEntityManager();
        Query q = em.createQuery("select m from MyInteger m " +
            "where m.integer > ?1 " +
            "and m.integer < ?2");
        q.setParameter(1, minValue);
        q.setParameter(2, maxValue);
        Iterator iter = q.getResultIterator();
        Collection<MyInteger> keys = new ArrayList<MyInteger>();
        while (iter.hasNext()) {
            MyInteger mi = (MyInteger) iter.next();
            keys.add(mi);
        }
        return reduce(session, map, keys);
    }

It's almost the same as before. We've just parameterized the query. It now uses the two values we sent into the grid with the agent.

We can send more than query parameters along with an agent. We can send additional, complex business logic. If we obey the principles of object-oriented design, then we favor composition over inheritance. This allows the composition of agents with complex map or reduce operations without cluttering the agent implementation class with business logic. To demonstrate, we'll refactor the findLargestInt(collection) method out of the LargestIntReduceAgent class:

    public interface MyHelper {
        public MyInteger call(Collection keys);
    }

    public class AgentHelper implements MyHelper, Serializable {
        public MyInteger call(Collection keys) {
            MyInteger largestInt = null;
            Iterator iter = keys.iterator();
            while (iter.hasNext()) {
                MyInteger myInt = (MyInteger) iter.next();
                if (myInt.greaterThan(largestInt)) {
                    largestInt = myInt;
                }
            }
            return largestInt;
        }
    }

This is just a class that encapsulates the method formerly known as findLargestInt(collection).
The name changed to conform to an imaginary calling convention used by our agents. The ReduceGridAgent changes a bit to accommodate this calling convention:

    public class LargestIntReduceAgent implements ReduceGridAgent, EntityAgentMixin {

        private Integer minValue;
        private Integer maxValue;
        private MyHelper helper;

        public Object reduce(Session session, ObjectMap map, Collection keys) {
            return helper.call(keys);
        }

        public Object reduce(Session session, ObjectMap map) {
            EntityManager em = session.getEntityManager();
            Query q = em.createQuery("select m from MyInteger m " +
                "where m.integer > ?1 " +
                "and m.integer < ?2");
            q.setParameter(1, minValue);
            q.setParameter(2, maxValue);
            Iterator iter = q.getResultIterator();
            Collection<MyInteger> keys = new ArrayList<MyInteger>();
            while (iter.hasNext()) {
                MyInteger mi = (MyInteger) iter.next();
                keys.add(mi);
            }
            return reduce(session, map, keys);
        }

        public Object reduceResults(Collection results) {
            return helper.call(results);
        }

        public Class getClassForEntity() {
            return MyInteger.class;
        }

        public void setMinValue(Integer min) {
            this.minValue = min;
        }

        public void setMaxValue(Integer max) {
            this.maxValue = max;
        }

        public void setHelper(MyHelper helper) {
            this.helper = helper;
        }
    }

LargestIntReduceAgent's concern is interacting with the grid. Refactoring the findLargestInt method into a different class keeps our code clean and more easily testable. It also allows algorithm replacement. If we come up with a better map or reduce method, then the GridAgent implementation doesn't change. LargestIntReduceAgent calls the helper.call(collection) method.

The AgentHelper class is serialized with the agent and sent to each partition the agent is sent to. Once on the grid, the AgentHelper#call(collection) method is available to the agent. The normal Java serialization process handles agent serialization. Anything serializable in that process is sent to the grid.
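Algorithm replacement through a helper can be illustrated without the grid. In this sketch, IntHelper, LargestHelper, SmallestHelper, and ReduceAgentShell are invented names, plain Integer values stand in for MyInteger entities, and the shell class only mimics the shape of the agent above.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

public class HelperSwapSketch {

    // Hypothetical calling convention; Serializable so it could travel with an agent.
    interface IntHelper extends Serializable {
        Integer call(Collection<Integer> values);
    }

    // Both helpers assume a non-empty collection (Collections.max/min throw otherwise).
    static class LargestHelper implements IntHelper {
        public Integer call(Collection<Integer> values) {
            return Collections.max(values);
        }
    }

    static class SmallestHelper implements IntHelper {
        public Integer call(Collection<Integer> values) {
            return Collections.min(values);
        }
    }

    // Mimics the agent: it only delegates, so swapping helpers never touches it.
    static class ReduceAgentShell implements Serializable {
        private IntHelper helper;

        void setHelper(IntHelper helper) {
            this.helper = helper;
        }

        Integer reduce(Collection<Integer> keys) {
            return helper.call(keys);
        }

        Integer reduceResults(Collection<Integer> results) {
            return helper.call(results);
        }
    }

    public static void main(String[] args) {
        ReduceAgentShell agent = new ReduceAgentShell();

        agent.setHelper(new LargestHelper());
        System.out.println(agent.reduce(Arrays.asList(4, 9, 2))); // prints 9

        agent.setHelper(new SmallestHelper());
        System.out.println(agent.reduce(Arrays.asList(4, 9, 2))); // prints 2
    }
}
```

Because each helper is an ordinary class with one method, it can be unit tested on its own, with no session, map, or container involved.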
Serializing these objects and sending them to the grid requires that the appropriate class files be on the classpath of each ObjectGrid container process before the agent is sent to the grid.

Updates with agents

The agents we've seen so far are idempotent. They do not change any objects in the grid. They create new objects as a result of their operation, but the objects queried by the agents remain unchanged.

There is no rule against updating objects in an agent. Any operation valid inside an ObjectGrid transaction can also be performed in an agent, including inserts, updates, and deletes. In this way, an agent doesn't necessarily need to perform a map or reduce operation. It acts as a code transport between the client and server. We should be cautious with this relaxed approach to agents because there is a lot of potential for abuse. Used with caution, running agents on the grid for inserts, updates, and deletes creates a powerful application controlled by submitting agents to the grid. Building an application around agents reduces the need for running large numbers of client processes.

Let's go back to our payment processor example to look at updates using a GridAgent. Specifically, we'll update a batch of deposit payments with a status of PaymentStatus.SENT_TO_NETWORK after we receive the payments from the merchant and check for duplicates.

We need to make a choice between using a MapGridAgent and a ReduceGridAgent. The choice depends on what our application needs to do with the result of the operation. If we want to do more work with each payment after it is sent to the network, then we choose a MapGridAgent. Because we only care that the payments are updated, we'll choose the ReduceGridAgent. ReduceGridAgent gives one result for the entire operation, which in this case is the status of the operation, either success or failure. We don't have a particular known key set for all of the payments in a batch.
A large batch has payments spread across nearly all partitions. We call our PaymentStatusReduceAgent with the AgentManager#callReduceAgent(agent) method:

    PaymentStatusReduceAgent agent = new PaymentStatusReduceAgent();
    agent.setBatch(batch);
    agent.setFromStatus(PaymentStatus.WAITING);
    agent.setToStatus(PaymentStatus.SENT_TO_NETWORK);
    AgentManager am = session.getMap("Payment").getAgentManager();
    am.callReduceAgent(agent);

We use the AgentManager#callReduceAgent(agent) method because we want all partitions in the grid to participate in the reduce operation. The reduce operation begins by finding all payments that match certain criteria. We want all payments for a batch that have a status of WAITING. We set these properties on the agent so that the AgentManager serializes them and sends them to the grid along with the agent. They are used as query parameters in the ReduceGridAgent#reduce(session, map) method:

    public Object reduce(Session session, ObjectMap map) {
        EntityManager em = session.getEntityManager();
        Query q = em.createQuery("select p from Payment p " +
            "where p.batch = ?1 " +
            "and p.status = ?2");
        q.setParameter(1, batch);
        q.setParameter(2, fromStatus);
        Iterator iter = q.getResultIterator();
        Collection<Payment> keys = new ArrayList<Payment>();
        while (iter.hasNext()) {
            Payment payment = (Payment) iter.next();
            keys.add(payment);
        }
        return reduce(session, map, keys);
    }

We create a collection of payments to pass to the ReduceGridAgent#reduce(session, map, keys) method. In there, we perform the update payment status operations. Instead of an aggregate result based on calculations over objects in the grid, the result is based on the success or failure of the update operation on each object.
The ReduceGridAgent#reduce(session, map, keys) method returns Boolean.TRUE if the update succeeds, and throws an exception if it does not:

    public Object reduce(Session session, ObjectMap map, Collection keys) {
        try {
            Session s = session.getObjectGrid().getSession();
            EntityManager em = s.getEntityManager();
            Iterator iter = keys.iterator();
            while (iter.hasNext()) {
                Payment payment = (Payment) iter.next();
                payment.setStatus(toStatus);
                em.merge(payment);
            }
            return Boolean.TRUE;
        } catch (ObjectGridException e) {
            throw new ObjectGridRuntimeException(e);
        }
    }

Throwing an exception doesn't exactly follow the spirit of the reduce operation. If the update operation fails, then the exception is thrown up the call stack and across the network to AgentManager#callReduceAgent(agent). If the update operation fails, then we have bigger problems to worry about than the exception uncovered by the update operation. We throw the exception here because the situation is unrecoverable by the reduce operation. A call to ReduceGridAgent#reduceResults(results) is meaningless when there is an exception.

Absent from this code are explicit transaction demarcations. When the MapGridAgent and ReduceGridAgent methods are called, they are under an already-active transaction on the session passed in to them. Should the grid agent methods throw an exception, the transaction is rolled back. This transaction is independent of the client transaction and any other active agent transactions. If one of the agent transactions rolls back, then the client transaction rolls back too.

We see a few interesting things from the payment update implemented as a reduce operation. In the happy-path case, each agent returns Boolean.TRUE. We only return Boolean.TRUE to conform to the method signature. A collection of Boolean.TRUE values is passed to the ReduceGridAgent#reduceResults(Collection results) method. There is nothing more to do in the reduce operation.
The values in the results collection do not play any part in the update operation. The update was successful. We know this because an exception wasn't thrown in any of the reduce methods. These two things let us implement a very simple ReduceGridAgent#reduceResults(Collection results) method:

    public Object reduceResults(Collection results) {
        return null;
    }

Either the update succeeds and we don't need to do any more, or we know the update failed by getting an exception thrown out of the AgentManager#callReduceAgent(agent) method. It may seem strange that we don't confirm the update is successful. Do we always explicitly check that JDBC updates were successful? No. We assume that because there was no thrown exception, the update happened. The same goes for our update in the ReduceGridAgent.

For clarity, let's look at the PaymentStatusReduceAgent in its entirety:

    public class PaymentStatusReduceAgent implements ReduceGridAgent, EntityAgentMixin {

        private Batch batch;
        private PaymentStatus fromStatus;
        private PaymentStatus toStatus;

        public Object reduce(Session session, ObjectMap map, Collection keys) {
            try {
                Session s = session.getObjectGrid().getSession();
                EntityManager em = s.getEntityManager();
                Iterator iter = keys.iterator();
                em.getTransaction().begin();
                while (iter.hasNext()) {
                    Payment payment = (Payment) iter.next();
                    payment.setStatus(toStatus);
                    em.merge(payment);
                }
                em.getTransaction().commit();
                return Boolean.TRUE;
            } catch (ObjectGridException e) {
                throw new ObjectGridRuntimeException(e);
            }
        }

        public Object reduce(Session session, ObjectMap map) {
            EntityManager em = session.getEntityManager();
            Query q = em.createQuery("select p from Payment p " +
                "where p.batch = ?1 " +
                "and p.status = ?2");
            q.setParameter(1, batch);
            q.setParameter(2, fromStatus);
            Iterator iter = q.getResultIterator();
            Collection<Payment> keys = new ArrayList<Payment>();
            while (iter.hasNext()) {
                Payment payment = (Payment) iter.next();
                keys.add(payment);
            }
            return reduce(session, map, keys);
        }

        public Object reduceResults(Collection results) {
            return null;
        }

        public Class getClassForEntity() {
            return Payment.class;
        }

        public void setBatch(Batch b) {
            this.batch = b;
        }

        public void setFromStatus(PaymentStatus status) {
            this.fromStatus = status;
        }

        public void setToStatus(PaymentStatus status) {
            this.toStatus = status;
        }
    }

Scheduling agents

The AgentManager methods are blocking methods. A call to any method in AgentManager stays at that point in execution while the data grid runs the agent instances against its primary partitions. The thread that calls the AgentManager method must wait for the call to return before it proceeds. If blocking is unacceptable, we should schedule the calls to the AgentManager methods using the java.util.concurrent API.

There are two cases to consider when thinking about scheduling agents. The first is with the AgentManager#callReduceAgent(agent) and AgentManager#callMapAgent(agent) methods. These methods do not pass any keys to the agents, so the agent is executed on all primary partitions. It may be okay for a client application to block here while it waits for the result from the grid. Scheduling insert, update, and delete operations provides some performance improvement if the client has further work to do that does not depend on those objects being in the grid (or not, as the case may be). A read operation where the client depends on the result before proceeding probably shouldn't schedule the agent.

The second case, and one where scheduling read operations is important, is when we have multiple sets of keys passed to agents of the same type. Given a large object set, where the objects are spread across many different primary partitions, we don't want to pass the entire key set to an agent. For a sufficiently large key set, an agent will spend most of its time processing (ignoring) keys that do not belong to its partition. Instead, we can pre-sort the keys into collections of objects where all belong to the same partition.
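Pre-sorting keys by partition and then scheduling one blocking call per group with java.util.concurrent can be sketched as follows. This is a grid-free model under stated assumptions: partitionOf is a hash-modulo stand-in for asking the grid which partition owns a key (real routing may differ), PARTITION_COUNT is invented, and blockingReduceCall stands in for a blocking AgentManager call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionedSubmitSketch {

    static final int PARTITION_COUNT = 4; // assumed number of partitions

    // Stand-in for the grid's key-to-partition routing.
    static int partitionOf(Integer key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Pre-sort: one key list per partition ID.
    static Map<Integer, List<Integer>> groupByPartition(Collection<Integer> keys) {
        Map<Integer, List<Integer>> groups = new TreeMap<>();
        for (Integer key : keys) {
            groups.computeIfAbsent(partitionOf(key), p -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    // Stand-in for a blocking reduce call made with one pre-sorted key group.
    static Integer blockingReduceCall(List<Integer> keys) {
        return Collections.max(keys);
    }

    // Submits every group at once instead of looping over blocking calls.
    // Assumes a non-empty key collection.
    static Integer largestInParallel(Collection<Integer> keys) {
        Map<Integer, List<Integer>> groups = groupByPartition(keys);
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, groups.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<Integer> group : groups.values()) {
                Callable<Integer> task = () -> blockingReduceCall(group);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // waits for that group's result
            }
            return Collections.max(results); // final client-side combine
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(largestInParallel(Arrays.asList(35, 7, 9000, 12, 64))); // prints 9000
    }
}
```

The client still waits for the final answer, but the per-group calls overlap instead of running one after another.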
We then send the smaller, pre-sorted collections to the grid.

We determine an object's partition with a PartitionManager. Each BackingMap has a PartitionManager associated with it, which is obtained with the BackingMap#getPartitionManager() method. PartitionManager#getPartition(Object key) returns the 0-based number of the partition the PartitionManager puts the object in. This is easy when working with the ObjectMap API. Let's assume:

    MyInteger mi = (MyInteger) myIntMap.get(35);
    BackingMap map = session.getObjectGrid().getMap("MyInteger");
    int partitionId = map.getPartitionManager().getPartition(35);

We don't need the first line. It only shows that we have a MyInteger with a key of 35. We obtain the ObjectGrid reference from the session, and then the BackingMap for the MyInteger map. We then call the getPartition(key) method for that same key. The result of this call is the ID of the partition that holds the MyInteger object with the key 35.

Now, we can use object and entity keys to sort objects based on partitions. After sorting the objects into smaller collections, we pass them to AgentManager#callMapAgent(agent, keys) and AgentManager#callReduceAgent(agent, keys). These calls should be scheduled in different threads, rather than made one after another in a loop. If we make these method calls in a loop, we effectively turn the data grid into an expensive client program, because the client program blocks during each call to the AgentManager methods. If we have 20 key collections that map to 20 different partitions, we will send only one request at a time to the grid if we send the agents in a loop. Instead, we want them to execute in parallel. We can do this by sending each instance of the grid agent to the grid using a java.util.concurrent.ExecutorService.

Summary

We covered a lot of ground again in this article.
Working with objects where they live produces much higher throughput than dragging objects to a client and pushing them back to the grid when we're done. Co-locating logic and data is easy to do with the DataGrid API.

DataGrid gives us a few patterns to follow when writing agents. It also makes us think in terms of map operations and reduce operations. Though these two operations seem limiting at first, they are useful when operating on very large data sets. The map operation gives us a way to perform an algorithm on each object in a set. The reduce operation lets us create aggregate results from a set.

We aren't limited to only sending logic to the grid with an agent. Thanks to Java serialization, we send any serializable object referenced by our agent to the grid along with it. This gives us flexibility in running queries in an agent, and in passing helper logic.

We also looked at pre-sorting objects into collections based on their partition ID. This reduces the number of bytes sent from the client to a partition, and lets the agent run only for keys known to be in the partition the agent runs on.

With a little imagination, we can put more work on the grid. This gives us much higher throughput and scales horizontally with the resources given to our grid. Appropriately partitioned, a grid can scale out and return results at a predictable rate, no matter how many objects it stores.