
How-To Tutorials

7019 Articles
Packt
21 Oct 2009
10 min read

SELinux - Highly Secured Web Hosting for Python-based Web Applications

When contemplating the security of a web application, there are several attack vectors that you must consider. An outsider may attack the operating system by planting a remote exploit, exploiting insecure operating system settings, or using some other method of privilege escalation. Or the outsider may attack other sites hosted on the same server without escalating privileges. (Note that this particular discussion does not touch upon the conditions under which an attack steals data from a single site. Instead, I'm focusing on the ability to attack different applications on the same server.) With hosts providing space for large numbers of PHP-based sites, security can be difficult because the httpd daemon traditionally runs under the same Unix user for all sites.

To prevent these kinds of attacks, you need to concentrate on two areas:

  • Preventing the site from reading or modifying the data of another site, and
  • Preventing the site from escalating privileges to tamper with the operating system and bypass user-based restrictions.

There are two toolboxes you use to accomplish this. In the first case, you need to find a way to run all of your sites under different Linux users. This allows the traditional Linux filesystem security model to provide protection against a hacked site attacking other sites on the same server. In the second case, you need to find a way to prevent a privilege escalation in the first place and, failing that, to prevent damage to the operating system should an escalation occur.

Let's first take a look at a method to run different sites under different users. Python offers several versatile ways to run web applications. There are three common methods: first, using Python's built-in HTTP server; second, running the script as a CGI application; and third, using mod_python under Apache (similar to what mod_perl and mod_php do).
These methods have various disadvantages: respectively, a lack of scalability, performance issues due to CGI application loading, and the aforementioned “all sites under one user” problem. To provide a scalable, secure, high-performance framework, you can turn to a relatively new delivery method: mod_wsgi. This Apache module, created by Graham Dumpleton, provides several methods by which you can run Python applications. In this case, we'll be focusing on the “daemon” mode of mod_wsgi.

Much like mod_python, the daemon mode of mod_wsgi embeds a Python interpreter (and the requisite script) into an httpd instance. As with mod_python, you can configure sites based on mod_wsgi to appear at various locations in the virtual directory tree and under different virtual servers. You can also configure the number and behavior of child daemons on a per-site basis. However, there is one important difference: with mod_wsgi, you can configure each httpd instance to run as a different Linux user. During operation, the main httpd instance dispatches requests to the already-running mod_wsgi children, producing performance that rivals mod_python. Most importantly, since each httpd instance runs under a different Linux user, you can apply Linux security mechanisms to different sites running on one server.

Once you have your sites running on a per-user basis, you should next turn your attention to preventing privilege escalation and protecting the operating system. By default, the Targeted policy of SELinux provided by Red Hat Enterprise Linux 5 (and its free cousins such as CentOS) provides strong protection against intrusions from httpd-based applications. Because of this, you will need to configure SELinux to allow access to resources such as databases and files that reside outside of the normal httpd directories.

To illustrate these concepts, I'll guide you as you install a Trac instance under mod_wsgi. The platform is CentOS 5.
As a side note, it's highly recommended that you perform the installation and SELinux debugging in a Xen instance so that your environment only contains the software that is needed. The sidebar explains how to easily install the environment that was originally used to perform this exercise, and I will assume that is your primary environment. There are a few steps that require the use of a C compiler (namely, the installation of Trac), and I'll guide you through migrating these packages to your Xen-based test environment.

Installing Trac

In this example, you'll use a standard installation of Trac. Following the instructions provided in the URL in the Resources section, begin by installing Trac 0.10.4 with ClearSilver 0.10.5 and SilverCity 0.9.7. (Note that with many Python web applications such as Trac and Django, “installing” the application means that you're actually installing the libraries necessary for Python to run the application. You'll need to run a script to create the actual site.)

Next, create a PostgreSQL user and database on a different machine. If you are using Xen for your development machine, you can use a PostgreSQL database running in your main DOM0 instance; all we are concerned with is that the PostgreSQL instance is accessed on a different machine over the network. (Note that MySQL will also work in this example, but SQLite will not. In this case, we need a database engine that is accessed over the network, not as a disk file.)

After that's done, you'll need to create an actual Trac site. Create a directory under /opt, such as /opt/trac. Next, run the trac-admin command and enter the information when prompted.

    trac-admin /opt/trac initenv

Installing mod_wsgi

You can find mod_wsgi at the source listed in the Resources section. After you make sure the httpd-devel package is installed, installing mod_wsgi is as simple as extracting the tarball and issuing the normal ./configure and 'make install' commands.
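Before pointing mod_wsgi at Trac, it can help to confirm the module works with a trivial application. The following is a minimal sketch of a WSGI callable, not part of the original instructions. It assumes Python 3 syntax (the Python shipped with CentOS 5 is older and would need the byte-string literal adjusted), and the greeting text is purely illustrative. mod_wsgi looks for a module-level object named application, which is the same name the Trac script assigns.

```python
def application(environ, start_response):
    """Smallest useful WSGI app: always answers 200 with a plain-text body."""
    body = b"mod_wsgi is working\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # start_response receives the status line and a list of header tuples.
    start_response("200 OK", headers)
    # A WSGI app returns an iterable of byte strings.
    return [body]


if __name__ == "__main__":
    # Local check without Apache, using the standard library's reference server.
    from wsgiref.simple_server import make_server

    make_server("127.0.0.1", 8000, application).serve_forever()
```

If a script like this serves correctly under the WSGIScriptAlias you configure, any later failure is more likely to come from Trac or SELinux than from mod_wsgi itself.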
Running Trac under mod_wsgi

If you look under /opt/trac, you'll notice two directories: one labeled apache, and one with the label of the project that you assigned when you installed this instance of Trac. You'll start by creating an application script in the apache directory. The application script is shown in Listing 1.

Listing 1: /opt/trac/apache/trac.wsgi

    #!/usr/bin/python
    import sys
    sys.stdout = sys.stderr

    import os
    os.environ['TRAC_ENV'] = '/opt/trac/test_proj'

    import trac.web.main
    application = trac.web.main.dispatch_request

(Note the 'sys.stdout = sys.stderr' line. This is necessary due to the way WSGI handles communications between the Python script and the httpd instance. If there is any code in the script that prints to STDOUT (such as debug messages), then the httpd instance can crash.)

After creating the application script, you'll modify httpd.conf to load the wsgi module and set up the Trac application. After the LoadModule lines, insert a line for mod_wsgi:

    LoadModule wsgi_module modules/mod_wsgi.so

Next, go to the bottom of httpd.conf and insert the text in Listing 2. This text configures the wsgi module for one particular site; it can be used under the default httpd configuration as well as under VirtualHost directives.

Listing 2: Excerpt from httpd.conf

    WSGIDaemonProcess trac user=trac_user group=trac_user threads=25
    WSGIScriptAlias /trac /opt/trac/apache/trac.wsgi
    WSGIProcessGroup trac
    WSGISocketPrefix run/wsgi
    <Directory /opt/trac/apache>
        WSGIApplicationGroup %{GLOBAL}
        Order deny,allow
        Allow from all
    </Directory>

Note the WSGIScriptAlias directive. The /trac keyword (first parameter) specifies where in the directory tree the application will exist. With this configuration, if you go to your server's root address, you'll see the default CentOS splash page. If you add /trac after the address, you'll hit your Trac instance. Save the httpd.conf file. Finally, add a Linux user called trac_user.
It is important that this user should not have login privileges. When the root httpd instance runs and encounters the WSGIDaemonProcess directive noted above, it will fork itself as the user specified in the directive; the fork will then load Python and the indicated script.

Securing Your Site

In this section, I'll focus on the two areas noted in the introduction: user-based security and SELinux. I will touch briefly on the theory of SELinux and explain the nuts and bolts of this particular implementation in more depth. I highly recommend that you read the Red Hat Enterprise Linux Deployment Guide for the particulars about how Red Hat implements SELinux. As with all activities involving some risk, if you plan to implement these methods, you should retain the services of a qualified security consultant to advise you about your particular situation.

Setting up the user-based security is not difficult. Because the httpd instance containing Python and the Trac instance will run under trac_user, you can safely set everything under /opt/trac/test_project to read (plus execute, for directories) for the owning user and no access for group or others. By doing this, you will isolate this site from other sites and users on the system.

Now, let's configure SELinux. First, you should verify that your system is running the proper Policy and Mode. On your development system, you'll be using the Targeted policy in its Permissive mode. If you choose to move your Python applications to a production machine, you would run under the Targeted policy in the Enforcing mode.

The Targeted policy is limited to protecting the most popular network services without making the system so complex as to prevent user-level work from being done. It is the default policy in Red Hat Enterprise Linux 5 and, by extension, CentOS 5. In Permissive mode, SELinux policy violations are trapped and sent to the audit log, but the behavior is allowed. In Enforcing mode, the violation is trapped and the behavior is not allowed.
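The user-based permissions described in this section can be applied with a short script rather than by hand. This is a minimal sketch, assuming directories should end up as 0500 and regular files as 0400; the function name lock_down is hypothetical, and a real Trac environment may need write access in places (for attachments or logs, for example), so treat the modes as a starting point rather than a recommendation.

```python
import os
import stat

# Assumed modes: owner may read and traverse directories, and read files.
# Group and others get no access at all.
DIR_MODE = stat.S_IRUSR | stat.S_IXUSR   # 0500
FILE_MODE = stat.S_IRUSR                 # 0400


def lock_down(root):
    """Recursively restrict everything under root to its owning user."""
    os.chmod(root, DIR_MODE)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            os.chmod(os.path.join(dirpath, name), DIR_MODE)
        for name in filenames:
            os.chmod(os.path.join(dirpath, name), FILE_MODE)


if __name__ == "__main__":
    # Path taken from the article; run as root or as the owning user.
    lock_down("/opt/trac/test_project")
```

The script only changes modes. The files must already be owned by trac_user for the daemon process to read them.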
To verify the Mode, run the Security Level Configuration tool from the Administration menu. The SELinux tab, shown in Figure 1, allows you to adjust the Mode.

After you have verified that SELinux is running in Permissive mode, you need to do two things. First, you need to change the Type of the files under /opt/trac. Second, you need to allow Trac to connect to the PostgreSQL database that you configured when you installed Trac.

First, you need to tweak the SELinux file types attached to the files in your Trac instance. These file types dictate what processes are allowed to access them. For example, /etc/shadow has a very restrictive 'shadow' type that only allows a few applications to read and write it. By default, SELinux expects web-based applications (indeed, anything using Apache) to reside under /var/www. Files created under this directory have the SELinux Type httpd_sys_content_t. When you created the Trac instance under /opt/trac, the files were created as type usr_t. Figure 2 shows the difference between these labels.

To properly label the files under /opt, issue the following commands as root:

    cd /opt
    chcon -R -t httpd_user_content_t trac/

After the file types are configured, there is one final step to do: allow Trac to connect to PostgreSQL. In its default state, SELinux disallows outbound network connections for the httpd type. To allow database connections, issue the following command:

    setsebool -P httpd_can_network_connect_db=1

In this case, we are using the -P option to make this setting persistent. If you omit this option, then the setting will be reset to its default state upon the next reboot.

After the setsebool command has been run, start httpd by issuing the following command:

    /sbin/service httpd start

If you visit the URL http://127.0.0.1/trac, you should see a Trac screen such as that in Figure 3.
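If you repeat this setup on several machines, the three commands above can be collected in one place. The following is a minimal sketch and not part of the original procedure: the function names are hypothetical, the commands are exactly the ones listed in this section (with the cd folded into an absolute path), and the script defaults to a dry run that only prints what it would execute.

```python
import subprocess


def selinux_setup_commands(parent_dir="/opt", site_dir="trac"):
    """Return the article's commands as argument lists, in order."""
    return [
        # Relabel the Trac files so httpd is allowed to read them.
        ["chcon", "-R", "-t", "httpd_user_content_t", f"{parent_dir}/{site_dir}/"],
        # Persistently allow httpd to open database connections.
        ["setsebool", "-P", "httpd_can_network_connect_db=1"],
        # Start Apache.
        ["/sbin/service", "httpd", "start"],
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(selinux_setup_commands())  # dry run: prints only
```

Passing dry_run=False requires root privileges, just as the individual commands do.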

Packt
21 Oct 2009
5 min read

Building a CRUD Application with the ZK Framework

An Online Media Library

There are some traditional applications that could be used to introduce a framework. One condition for the selection is that the application should be a CRUD (Create, Read, Update, Delete) application. Therefore, an 'Online Media Library', which has all four operations, would be appropriate. We start with the description of requirements, which is the beginning of most IT projects. The application will have the following features:

  • Add new media
  • Update existing media
  • Delete media
  • Search for the media (and show the results)
  • User roles (administrator for maintaining the media and user accounts for browsing the media)

In the first implementation round, the application should only have some basic functionality that will be extended step by step. A media item should have the following attributes:

  • A title
  • A type (Song or Movie)
  • An ID which could be defined by the user
  • A description
  • An image

The most important thing at the start of a project is to name it. We will call our project ZK-Medialib.

Setting up Eclipse to Develop with ZK

We use version 3.3 of Eclipse, which is also known as the Europa release. You can download the IDE from http://www.eclipse.org/downloads/. We recommend using the version "Eclipse IDE for Java EE Developers". First we have to make a file association for the .zul files. To do that, open the Preferences dialog with Window | Preferences. After that, do the following steps:

  • Type Content Types into the search dialog.
  • Select Content Types in the tree.
  • Select XML in the tree.
  • Click Add and type *.zul.
  • See the result.

The steps are illustrated in the picture below:

With these steps, we have syntax highlighting of our files. However, to have content assist, we have to take care with the creation of new files. The easiest way is to set up Eclipse to work with zul.xsd. To do that, open the Preferences dialog with Window | Preferences. After that, do the following steps:

  • Type XML Catalog into the search dialog.
  • Select XML Catalog in the tree.
  • Press Add and fill out the dialog (see the second dialog below).
  • See the result.

Now we can easily create new ZUL files with the following steps:

  • File | New | Other, and select XML:
  • Type in the name of the file (for example hello.zul).
  • Press Next.
  • Choose Create XML file from an XML schema file:
  • Press Next.
  • Select "Select XML Catalog entry".
  • Now select zul.xsd:
  • Now select the Root Element of the page (e.g. window).
  • Select Finish.

Now you have a new ZUL file with content assist. Go into the generated attribute element and press Alt+Space.

Setting up a New Project

The first thing we will need for the project is the framework itself. You can download the ZK framework from http://www.zkoss.org. At the time of writing, the latest version of ZK is 2.3.0. After downloading and unzipping the ZK framework, we should define a project structure. A good structure for the project is the directory layout from the Maven project (http://maven.apache.org/). The structure is shown in the figure below.

The directory lib contains the libraries of the ZK framework. To begin with, it's wise to copy all JAR files from the ZK framework distribution. If you unzip the distribution of version 2.3.0, the structure should look like the figure below, which shows the layout of the ZK distribution. Here you can get the files you need for your own application.

For our example, you should copy all JAR files from lib, ext, and zkforge to the WEB-INF/lib directory of your application. It's important that the libraries from ext and zkforge are copied directly to WEB-INF/lib. Additionally, copy the directories tld and xsd to the WEB-INF directory of your application.

Now, after the copy process, we have to create the deployment descriptor (web.xml) for the web application. Here you can use web.xml from the demo application, which is provided with the ZK framework. For our first steps, we need no zk.xml (that configuration file is optional in a ZK application).
The application itself must be run inside a JEE (Java Enterprise Edition) web container. For our example, we used the Tomcat container from the Apache project (http://tomcat.apache.org). However, you can run the application in any JEE container that follows the Java Servlet Specification 2.4 (or higher) and runs under a Java Virtual Machine 1.4 (or higher). We create the zk-media.xml file for Tomcat, which is placed in conf/Catalina/localhost of the Tomcat directory.

    <Context path="/zk-media"
             docBase="D:/Development/workspaces/workspace-zk-medialib/ZK-Medialib/src/main/webapp"
             debug="0" privileged="true" reloadable="true" crossContext="false">
        <Logger className="org.apache.catalina.logger.FileLogger"
                directory="D:/Development/workspaces/workspace-zk-medialib/logs/ZK-Medialib"
                prefix="zkmedia-" suffix=".txt" timestamp="true"/>
    </Context>

With the help of this context file, we can directly see the changes of our development, since we set the root of the web application to the development directory.

Packt
21 Oct 2009
13 min read

Multiple Templates in Django

Considering the different approaches

Though there are different approaches that can be taken to serve content in multiple formats, the best solution will be specific to your circumstances and implementation. Almost any approach you take will have maintenance overhead. You'll have multiple places to update when things change. As copies of your template files proliferate, a simple text change can become a large task.

Some of the cases we'll look at don't require much consideration. Serving a printable version of a page, for example, is straightforward and easily accomplished. Putting a pumpkin in your site header at Halloween or using a heart background around Valentine's Day can make your site seem timely and relevant, especially if you are in a seasonal business. Other techniques, such as serving different templates to different browsers, devices, or user-agents might create serious debate among content authors. Since serving content to mobile devices is becoming a new standard of doing business, we'll make it the focus of this article.

Serving mobile devices

The Mobile Web will remind some old timers (like me!) of the early days of web design where we'd create different sites for Netscape and Internet Explorer. Hopefully, we take lessons from those days as we go forward and don't repeat our mistakes. Though we're not as apt to serve wholly different templates to different desktop browsers as we once were, the mobile device arena creates special challenges that require careful attention.

One way to serve both desktop and mobile devices is a one-size-fits-all approach. Through carefully structured and semantically correct XHTML markup and CSS selectors identified to be applied to handheld output, you can do a reasonable job of making your content fit a variety of contexts and devices. However, this method has a couple of serious shortcomings.
First, it does not take into account the limitations of devices for rich media presentation with Flash, JavaScript, DHTML, and AJAX as they are largely unsupported on all but the highest-end devices. If your site depends on any of these technologies, your users can get frustrated when trying to experience it on a mobile device.

Also, it doesn't address the varying levels of CSS support by different mobile devices. What looks perfect on one device might look passable on another and completely unusable on a third because only some of the CSS rules were applied properly. It also does not take into account the potentially high bandwidth costs for large markup files and CSS for users who pay by the amount of data transferred. For example, putting display: none on an image doesn't stop a mobile device from downloading the file. It only prevents it from being shown.

Finally, this approach doesn't tailor the experience to the user's circumstances. Users tend to be goal-oriented and have specific actions in mind when using the mobile web, and content designers should recognize that simply recreating the desktop experience on a smaller screen might not solve their needs. Limiting the information to what a mobile user is looking for and designing a simplified navigation can provide a better user experience.

Adapting content

You know your users best, and it is up to you to decide the best way to serve them. You may decide to pass on the one-size-fits-all approach and serve a separate mobile experience through content adaptation. The W3C's Mobile Web Initiative best practices guidelines suggest giving users the flexibility and freedom to choose their experience, and provide links between the desktop and mobile templates so that they can navigate between the two. It is generally not recommended to automatically redirect users on mobile devices to a mobile site unless you give them a way to access the full site.
The dark side to this kind of content adaptation is that you will have a second set of template files to keep updated when you make site changes. It can also cause your visitors to search through different bookmarks to find the content they have saved. Before we get into multiple sites, let's start with some examples of showing alternative templates on our current site.

Setting up our example

Since we want to customize the output of our detail page based on the presence of a variable in the URL, we're going to use a view function instead of a generic view. Let us consider a press release application for a company website. The press release object will have a title, body, published date, and author name.

In the root directory of your project (in the directory projects/mycompany), create the press application by using the startapp command:

    $ python manage.py startapp press

This will create a press folder in your site. Edit the mycompany/press/models.py file:

    from django.db import models

    class PressRelease(models.Model):
        title = models.CharField(max_length=100)
        body = models.TextField()
        pub_date = models.DateTimeField()
        author = models.CharField(max_length=100)

        def __unicode__(self):
            return self.title

Create a file called admin.py in the mycompany/press directory, adding these lines:

    from django.contrib import admin
    from mycompany.press.models import PressRelease

    admin.site.register(PressRelease)

Add the press and admin applications to your INSTALLED_APPS variable in the settings.py file:

    INSTALLED_APPS = (
        'django.contrib.auth',
        'django.contrib.admin',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.sites',
        'mycompany.press',
    )

In the root directory of your project, run the syncdb command to add the new models to the database:

    $ python manage.py syncdb

You will be prompted to create a superuser; go ahead and create it. You can then access the admin site by browsing to http://localhost:8000/admin/ and add data.
Create your mycompany/press/urls.py file as shown:

    from django.conf.urls.defaults import *

    # press_list_dict (the arguments for the generic list view) is assumed
    # to be defined above this point; it is not shown in this excerpt.
    urlpatterns = patterns('',
        (r'detail/(?P<pid>\d+)/$', 'mycompany.press.views.detail'),
        (r'list/$', 'django.views.generic.list_detail.object_list', press_list_dict),
        (r'latest/$', 'mycompany.press.views.latest'),
        (r'$', 'django.views.generic.simple.redirect_to', {'url': '/press/list/'}),
    )

In your mycompany/press/views.py file, your detail view should look like this:

    from django.http import HttpResponse
    from django.shortcuts import get_object_or_404
    from django.template import loader, Context
    from mycompany.press.models import PressRelease

    def detail(request, pid):
        '''
        Accepts a press release ID and returns the detail page
        '''
        p = get_object_or_404(PressRelease, id=pid)
        t = loader.get_template('press/detail.html')
        c = Context({'press': p})
        return HttpResponse(t.render(c))

Let's jazz up our template a little more for the press release detail by adding some CSS to it. In mycompany/templates/press/detail.html, edit the file to look like this:

    <html>
    <head>
    <title>{{ press.title }}</title>
    <style type="text/css">
    body {
        text-align: center;
    }
    #container {
        margin: 0 auto;
        width: 70%;
        text-align: left;
    }
    .header {
        background-color: #000;
        color: #fff;
    }
    </style>
    </head>
    <body>
    <div id="container">
    <div class="header">
    <h1>MyCompany Press Releases</h1>
    </div>
    <div>
    <h2>{{ press.title }}</h2>
    <p>
    Author: {{ press.author }}<br/>
    Date: {{ press.pub_date }}<br/>
    </p>
    <p>
    {{ press.body }}
    </p>
    </div>
    </div>
    </body>
    </html>

Start your development server and point your browser to the URL http://localhost:8000/press/detail/1/. You should see something like this, depending on what data you entered before when you created your press release:

If your press release detail page is serving correctly, you're ready to continue. Remember that generic views can save us development time, but sometimes you'll need to use a regular view because you're doing something in a way that requires a view function customized to the task at hand.
The exercise we're about to do is one of those circumstances, and after going through the exercise, you'll have a better idea of when to use one type of view over another.

Serving printable pages

One of the easiest approaches we will look at is serving an alternative version of a page based on the presence of a variable in the URL (also known as a URL parameter). To serve a printable version of an article, for example, we can add ?printable to the end of the URL. To make it work, we'll add an extra step in our view to check the URL for this variable. If it exists, we'll load up a printer-friendly template file. If it doesn't exist, we'll load the normal template file.

Start by adding the highlighted lines to the detail function in the mycompany/press/views.py file:

    def detail(request, pid):
        '''
        Accepts a press release ID and returns the detail page
        '''
        p = get_object_or_404(PressRelease, id=pid)

        if request.GET.has_key('printable'):
            template_file = 'press/detail_printable.html'
        else:
            template_file = 'press/detail.html'

        t = loader.get_template(template_file)
        c = Context({'press': p})
        return HttpResponse(t.render(c))

We're looking at the request.GET object to see if a query string parameter of printable was present in the current request. If it was, we load the press/detail_printable.html file. If not, we load the press/detail.html file. We've also changed the loader.get_template call to use the template_file variable.

To test our changes, we'll need to create a simple version of our template that only has minimal formatting.
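The branching in the view can also be isolated into a small helper, which makes it easy to check without starting Django. This is a minimal sketch in Python 3, not the article's code: the helper name choose_template is hypothetical, a plain dictionary from the standard library stands in for request.GET, and the membership test uses the in operator because dict.has_key no longer exists in Python 3.

```python
from urllib.parse import parse_qs

PRINTABLE_TEMPLATE = "press/detail_printable.html"
DEFAULT_TEMPLATE = "press/detail.html"


def choose_template(query_params):
    """Pick the template name based on the presence of 'printable'."""
    # Only the key's presence matters; ?printable carries no value.
    if "printable" in query_params:
        return PRINTABLE_TEMPLATE
    return DEFAULT_TEMPLATE


if __name__ == "__main__":
    # keep_blank_values=True keeps valueless keys such as ?printable,
    # which parse_qs would otherwise drop.
    params = parse_qs("printable", keep_blank_values=True)
    print(choose_template(params))  # press/detail_printable.html
    print(choose_template({}))      # press/detail.html
```

Keeping the decision in one function also gives a single place to add further variants later, should more query parameters be needed.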
Create a new file called detail_printable.html in the mycompany/templates/press/ directory and add these lines into it:

    <html>
    <head>
    <title>{{ press.title }}</title>
    </head>
    <body>
    <h1>{{ press.title }}</h1>
    <p>
    Author: {{ press.author }}<br/>
    Date: {{ press.pub_date }}<br/>
    </p>
    <p>
    {{ press.body }}
    </p>
    </body>
    </html>

Now that we have both regular and printable templates, let's test our view. Point your browser to the URL http://localhost:8000/press/detail/1/, and you should see our original template as it was before. Change the URL to http://localhost:8000/press/detail/1/?printable and you should see our new printable template:

Creating site themes

Depending on the audience and focus of your site, you may want to temporarily change the look of your site for a season or holiday such as Halloween or Valentine's Day. This is easily accomplished by leveraging the power of the TEMPLATE_DIRS configuration setting.

The TEMPLATE_DIRS variable in the settings.py file allows you to specify the location of the templates for your site. It also allows you to specify multiple locations for your template files. When you specify multiple paths for your template files, Django will look for a requested template file in the first path, and if it doesn't find it, it will keep searching through the remaining paths until the file is located.

We can use this to our advantage by adding an override directory as the first element of the TEMPLATE_DIRS value. When we want to override a template with a special themed one, we'll add the file to the override directory. The next time the template loader tries to load the template, it will find it in the override directory and serve it.

For example, let's say we want to override our press release page from the previous example.
Recall that the view loaded the template like this (from mycompany/press/views.py):

    template_file = 'press/detail.html'
    t = loader.get_template(template_file)

When the template engine loads the press/detail.html template file, it gets it from the mycompany/templates/ directory as specified in the mycompany/settings.py file:

    TEMPLATE_DIRS = (
        '/projects/mycompany/templates/',
    )

If we add an additional directory to our TEMPLATE_DIRS setting, Django will look in the new directory first:

    TEMPLATE_DIRS = (
        '/projects/mycompany/templates/override/',
        '/projects/mycompany/templates/',
    )

Now when the template is loaded, it will first check for the file /projects/mycompany/templates/override/press/detail.html. If that file doesn't exist, it will go on to the next directory and look for the file in /projects/mycompany/templates/press/detail.html.

If you're using Windows, use the Windows-style file path c:/projects/mycompany/templates/ for these examples.

Therein lies the beauty. If we want to override our press release template, we simply drop an alternative version with the same file name into the override directory. When we're done using it, we just remove it from the override directory and the original version will be served (or rename the file in the override directory to something other than detail.html).

If you're concerned about the performance overhead of having a nearly empty override directory that is constantly checked for the existence of template files, you should consider caching techniques as a potential solution.

Testing the template overrides

Let's create a template override to test the concept we just learned. In your mycompany/settings.py file, edit the TEMPLATE_DIRS setting to look like this:

    TEMPLATE_DIRS = (
        '/projects/mycompany/templates/override/',
        '/projects/mycompany/templates/',
    )

Create a directory called override at mycompany/templates/ and another directory underneath that called press.
You should now have these directories:

    /projects/mycompany/templates/override/
    /projects/mycompany/templates/override/press/

Create a new file called detail.html in mycompany/templates/override/press/ and add these lines to the file:

    <html>
    <head>
    <title>{{ press.title }}</title>
    </head>
    <body>
    <h1>Happy Holidays</h1>
    <h2>{{ press.title }}</h2>
    <p>
    Author: {{ press.author }}<br/>
    Date: {{ press.pub_date }}<br/>
    </p>
    <p>
    {{ press.body }}
    </p>
    </body>
    </html>

You'll probably notice that this is just our printable detail template with an extra "Happy Holidays" line added to the top of it. Point your browser to the URL http://localhost:8000/press/detail/1/ and you should see something like this:

By creating a new press release detail template and dropping it in the override directory, we caused Django to automatically pick up the new template and serve it without us having to change the view. To change it back, you can simply remove the file from the override directory (or rename it).

One other thing to notice is that if you add ?printable to the end of the URL, it still serves the printable version of the file we created earlier.

Delete the mycompany/templates/override/ directory and any files in it as we won't need them again.
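The first-match-wins lookup that makes the override directory work can be illustrated without Django. The following is a minimal sketch of the idea only, not Django's actual loader: find_template is a hypothetical helper that walks a list of directories in order and returns the first path where the requested file exists.

```python
import os


def find_template(name, template_dirs):
    """Return the first existing path for name, searching dirs in order."""
    for directory in template_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            # Earlier directories shadow later ones, so stop at the first hit.
            return candidate
    return None


if __name__ == "__main__":
    # Same order as the TEMPLATE_DIRS setting in the example above.
    dirs = [
        "/projects/mycompany/templates/override/",
        "/projects/mycompany/templates/",
    ]
    print(find_template("press/detail.html", dirs))
```

Because the search stops at the first hit, removing or renaming the file in the override directory is enough to fall back to the original template, which matches the behavior seen in the test above. It also explains why ?printable still worked: no detail_printable.html existed in the override directory, so the lookup fell through to the regular one.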

Packt
21 Oct 2009
6 min read

Creating Pseudo-3D Imagery with GIMP: Part 2

The next step would be to play around with Layer Modes which, I believe, is one of the most exciting aspects of graphic design. Let's leave the “lower shine” layer for a while and get back to the “upper shine” layer, changing its layer mode to make it look more appealing and to follow a scheme in accordance with the color of the sphere. To do this, select the “upper shine” layer in the Layers Window. Right above the Opacity Slider is a dropdown menu containing lots of interesting layer modes, each having its own distinct advantages. You can play around and choose whichever suits your vision the best. I chose Overlay. This layer mode has been my best friend for a couple of years, ever since I started GIMPing; it works like a charm most of the time. I wonder, though, why applying the Overlay layer mode produces different results in some applications. In the case of Photoshop, the closest I could get to GIMP's Overlay is the Screen mode. You've got to play around a bit and see which works best for you.

Do the same thing for the “lower shine” layer, choosing Overlay as the layer mode. Then, whenever you see fit, you can duplicate the layers to achieve a multiplied effect of the mode. I did that because it felt that something was still missing in the luminous aspect of the shine. So I selected each layer, duplicated them both, and voila. To duplicate a layer, you can either right click on the layer name and choose Duplicate Layer from the choices or just press the Duplicate Button on the bottom part of the Layers Window.

[Figure: Duplicated Layers]

Next, we'll add additional highlights to better emulate specular reflections. Again, we're exploiting the Ellipse Select Tool, along with a new technique called Feathering. I don't know the exact definition of feathering in CG, but as far as my experience goes, feathering is a technique in which you soften the edges of a selection, creating a subtle transition and blurred edges.
Create a new layer at the top of the layer stack and call it “blurred shine”, then give it a Layer Fill Type of Transparency, just like what we did with the previous layers. With the “blurred shine” layer active, let's create an elliptical selection on the upper left hand part of the sphere, just where the sharp shine has been cast.

[Figure: Creating the Specular Selection]

With the selection active, right click on the Image Window and choose Select > Feather, then input a value for the feather and the unit to be used. I used 50 pixels. You might notice that the selection seems to have become smaller, and that's alright; it means you've done it right. With the marching ants still active, grab the Bucket Fill Tool over at the Toolbox Window or press SHIFT + B to activate it. Change your foreground color to something close to white or simply pure white, then with the Bucket Fill Tool active, click on the active selection. Tadaaaa! You just created a replica of a specular highlight, though not a very close one yet.

What's great about feathering selections as opposed to applying a blur filter is that you only blur the selection border and not the entire selection. So, say, you have a picture of yourself and you want your face to fade out smoothly into a vast landscape that you have photographed. Simply create a selection around your face, apply a Feather to that selection, invert the selection, and delete the outer parts, thus leaving only your face and the landscape behind (supposing you have your picture on a separate layer above the landscape layer).

[Figure: Feathering the Selection]

[Figure: Applying the Color with the Bucket Fill]

Whew, that was pretty quick, wasn't it? I hope you agree with me on that. If so, let's create another one, though smaller and placed just to the left of the blurred shine. Create a new layer for this new blurred shine and name it “small blurred shine”. Follow the same procedure for the feathering and color-filling.
I used the same feather value for the smaller selection (even though the selection is obviously smaller), so that the feather almost reaches the center of the selection and blurs nearly all of it, which is what I want for this part. Then, just like what we did with the upper and lower shine, we'll change the Layer Modes to Overlay and duplicate the layers as we see fit. Doing so results in this image:

[Figure: Blurred Shines Overlay]

Our sphere now looks a lot better than it did when we first added its color. However, the shading still looks a bit flat and volumeless. To deal with that, we'll simulate the strength with which the light diffused our sphere object, creating deeper shadows on the opposite side of the light source.

Duplicating layers is not only a way to multiply the effect of a layer mode; it is also a good way to track your changes or, better yet, to keep safe backups, since working on the duplicate doesn't affect the original and you can go back to the untouched layer anytime you want to see the differences that have been made. Be careful, though: the more layers you have and the more content each layer holds, the more memory is consumed, which will eventually slow the system down.

Let's select the “sphere” layer and duplicate it once. The duplicate layer, which is now named “sphere copy”, automatically becomes the active layer. Right click on the “sphere copy” layer and choose Alpha to Selection to create a selection out of the fully opaque sphere.

The next step is to shrink the selection so that we create a smaller elliptical selection inside the sphere. To do this, right click on the Image Window and choose Select > Shrink. Then, in the pop-up window that appears, type in an appropriate value for the shrinking. I chose 50 pixels.

[Figure: Shrinking the Selection]

[Figure: Shrunken Selection]

Remember how we moved the selection last time? I believe you do.
To translate/move our selection, grab the Ellipse Select Tool and activate the selection by clicking on it (clicking the middle portion of the selection makes this easier) until you see your cursor change into crossed arrows; this means you have just activated the move tool for the selection. Since the light is coming from the upper left direction, we want to move the selection over to the location where the specular reflections are and where the lightest shading is. That's because later on, we'll be using this same selection to create shadows on the opposite side of the shade. Now go ahead and drag the selection over to the upper left portion of the sphere.

Packt
21 Oct 2009
6 min read

Using jQuery Script for Creating Dynamic Table of Contents

A typical jQuery script uses a wide assortment of the methods that the library offers. Selectors, DOM manipulation, event handling, and so forth come into play as required by the task at hand. In order to make the best use of jQuery, we need to keep in mind the wide range of capabilities it provides.

A Dynamic Table of Contents

As an example of jQuery in action, we'll build a small script that will dynamically extract the headings from an HTML document and assemble them into a table of contents for that page. Our table of contents will be nestled in the top right corner of the page. We'll have it collapsed initially, but a click will expand it to full height.

At the same time, we'll add a feature to the main body text. The introduction of the text on the page will not be initially loaded, but when the user clicks on the word Introduction, the introductory text will be inserted in place from another file.

Before we reveal the script that performs these tasks, we should walk through the environment in which the script resides.

Obtaining jQuery

The official jQuery website (http://jquery.com/) is always the most up-to-date resource for code and news related to the library. To get started, we need a copy of jQuery, which can be downloaded right from the home page of the site. Several versions of jQuery may be available at any given moment; the latest uncompressed version will be most appropriate for us.

No installation is required for jQuery. To use jQuery, we just need to place it on our site in a public location. Since JavaScript is an interpreted language, there is no compilation or build phase to worry about. Whenever we need a page to have jQuery available, we will simply refer to the file's location from the HTML document.

Setting Up the HTML Document

There are three sections to most examples of jQuery usage: the HTML document itself, CSS files to style it, and JavaScript files to act on it.
For this example, we'll use a page containing the text of a book:

    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xml:lang="en" lang="en">
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
        <title>Doctor Dolittle</title>
        <link rel="stylesheet" href="dolittle.css" type="text/css" />
        <script src="jquery.js" type="text/javascript"></script>
        <script src="dolittle.js" type="text/javascript"></script>
      </head>
      <body>
        <div id="container">
          <h1>Doctor Dolittle</h1>
          <div class="author">by Hugh Lofting</div>
          <div id="introduction">
            <h2><a href="introduction.html">Introduction</a></h2>
          </div>
          <div id="content">
            <h2>Puddleby</h2>
            <p>ONCE upon a time, many years ago when our grandfathers
               were little children--there was a doctor; and his name was
               Dolittle-- John Dolittle, M.D.  &quot;M.D.&quot; means
               that he was a proper doctor and knew a whole lot.
            </p>
            <!-- More text follows... -->
          </div>
        </div>
      </body>
    </html>

The actual layout of files on the server does not matter. References from one file to another just need to be adjusted to match the organization we choose. In most examples in this book, we will use relative paths to reference files (../images/foo.png) rather than absolute paths (/images/foo.png). This will allow the code to run locally without the need for a web server.

The stylesheet is loaded immediately after the standard <head> elements.
Here are the portions of the stylesheet that affect our dynamic elements:

    /* -----------------------------------
       Page Table of Contents
    -------------------------------------- */
    #page-contents {
      position: absolute;
      text-align: left;
      top: 0;
      right: 0;
      width: 15em;
      border: 1px solid #ccc;
      border-top-width: 0;
      border-right-width: 0;
      background-color: #e3e3e3;
    }
    #page-contents h3 {
      margin: 0;
      padding: .25em .5em .25em 15px;
      background: url(arrow-right.gif) no-repeat 0 2px;
      font-size: 1.1em;
      cursor: pointer;
    }
    #page-contents h3.arrow-down {
      background-image: url(arrow-down.gif);
    }
    #page-contents a {
      display: block;
      font-size: 1em;
      margin: .4em 0;
      font-weight: normal;
    }
    #page-contents div {
      padding: .25em .5em .5em;
      display: none;
      background-color: #efefef;
    }

    /* -----------------------------------
       Introduction
    -------------------------------------- */
    .dedication {
      margin: 1em;
      text-align: center;
      border: 1px solid #555;
      padding: .5em;
    }

After the stylesheet is referenced, the JavaScript files are included. It is important that the script tag for the jQuery library be placed before the tag for our custom scripts; otherwise, the jQuery framework will not be available when our code attempts to reference it.
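To make the goal of the script concrete before any jQuery is written, here is a rough sketch of the same "collect the headings, build a list" idea using Python's standard library. It is not the article's script and uses no jQuery; the class name HeadingCollector, the function build_toc, and the shortened sample markup are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h2> element, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._inside_h2 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._inside_h2 = True
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags (such as the <a> in the Introduction
        # heading) still arrives here while we are inside the <h2>.
        if self._inside_h2:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._inside_h2:
            self.headings.append("".join(self._buffer).strip())
            self._inside_h2 = False


def build_toc(html):
    """Return the list of <h2> heading texts found in `html`."""
    collector = HeadingCollector()
    collector.feed(html)
    return collector.headings


# A shortened stand-in for the Doctor Dolittle page.
SAMPLE = """
<div id="introduction"><h2><a href="introduction.html">Introduction</a></h2></div>
<div id="content"><h2>Puddleby</h2><p>ONCE upon a time...</p></div>
"""

if __name__ == "__main__":
    print(build_toc(SAMPLE))
```

In the browser, jQuery does this work against the live DOM instead of a string, which is why the library has to be loaded before the custom script.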

Packt
21 Oct 2009
17 min read

Sakai Web Services: Connecting to the Enterprise (Part 1)

Connecting to Sakai is straightforward, and simple tasks, such as automatic course creation, take only a few tens of lines of programming effort. There are significant advantages to having web services in the enterprise. If a developer writes an application that calls a number of web services, then the application does not need to know the hidden details behind the services. It just needs to agree on what data to send. This loosely couples the application to the services. Later, you can replace one web service with another. Programmers do not need to change the code on the application side. SOAP works well with most organizations' firewalls (http://en.wikipedia.org/wiki/Firewall), as SOAP uses the same protocol as web browsers. System administrators have a tendency to protect an organization's network by closing unused ports to the outside world. This means that most of the time there is no extra network configuration effort required to enable web services. Another simplifying factor is that a programmer does not need to know the details of SOAP or REST, as there are libraries and frameworks that hide the underlying magic. For the Sakai implementation of SOAP, to add a new service is as simple as writing a small amount of Java code within a text file, which then is automatically compiled and run the first time the service is called. This is great for rapid application development and deployment, as the system administrator does not need to restart Sakai for each change. Just as importantly, the Sakai services use the well-known libraries from the Apache Axis project (http://ws.apache.org/axis/). SOAP is an XML message passing protocol that, in the case of Sakai sites, sits on top of the Hyper Text Transfer Protocol (HTTP). HTTP is the protocol used by web browsers to obtain web pages from a server. 
The client sends messages in XML format to a service, including the information that the service needs, and then the service returns a message with the results or an error message. A readable reference to this interchange is the book Pro Apache XML by Poornachandra Sarang, PhD (http://www.freesoftwaremagazine.com/articles/book_review_pro_apache_xml). The full definition of SOAP is given at http://www.w3.org/TR/soap12-part1.

The architects introduced SOAP-based web services to Sakai first and RESTful services later. Unlike SOAP, which sends XML via HTTP POSTs to one URL that points to a service, REST sends requests to a URL that includes information about the entity, such as a user, with which the client wishes to interact. For example, a REST URL for viewing an address book item could look similar to http://host/direct/addressbook_item/15. Applying URLs in this way makes understandable address spaces that are easier for a human to read. This more intuitive approach simplifies coding. Further, SOAP XML passing requires that the client and server parse the XML, and at times the parsing effort is expensive in CPU cycles and response times.

The Entity Broker is an internal service that makes life easier for programmers and helps them manipulate entities. Entities in Sakai are managed pieces of data such as representations of courses, users, grade books, and so on. In the newer versions of Sakai, the Entity Broker has the power to expose entities as RESTful services. In contrast, for SOAP services, if you wanted a new service, you would need to write it yourself. Over time, the Entity Broker exposes more and more entities RESTfully, delivering more hooks for free to integrate with other enterprise systems.

Both SOAP and REST services sit on top of the HTTP protocol, which is explained in the next section of this article.

Protocols

This section explains how web browsers talk to servers in order to gather web pages.
It explains how to use the telnet command and a visual tool called TCPMON (http://ws.apache.org/commons/tcpmon/tcpmontutorial.html) to gain insight into how web services and Web 2.0 technologies work.

Playing with Telnet

It turns out that message passing occurs via text commands between the browser and the server. Web browsers use HTTP (http://www.w3.org/Protocols/rfc2616/rfc2616.html) to get web pages and the embedded content from the server and to send form information to the server. HTTP talks between the client and server via text (7 bit ASCII) commands. When humans talk with each other, they have a wide vocabulary. However, HTTP uses fewer than twenty words.

You can experiment directly with HTTP using a Telnet client to send your commands to a web server. For example, if your demonstration Sakai instance is running on port 8080, the following commands will get you the login page:

    telnet localhost 8080
    GET /portal/login

The GET command does what it sounds like and gets a web page. Forms can use the GET verb to send data at the end of the URL. For example, GET /portal/login?name=alan&age=15 is sending the variables name=alan and age=15 to the server.

Installing TCPMON

You can use the TCPMON tool to view requests and responses from a web browser such as Firefox. One of TCPMON's abilities is that it can act as an invisible man in the middle, recording the messages between the web browser and the server. Once set up, the requests sent from the browser go to TCPMON and TCPMON passes the request on to the server. The server passes back a response and then TCPMON, a transparent proxy (http://en.wikipedia.org/wiki/Proxy_server), returns the response to the web browser. This allows us to look at all requests and responses graphically.

First, you can set TCPMON up to listen on a given port number (by convention, normally port 8888) and then you can configure your web browser to send its requests through the proxy.
Then, you can type the address of a given page into the web browser, but instead of going directly to the relevant server, the browser sends the request to the proxy, which then passes it on and passes the response back. TCPMON displays both the request and responses in a window. You can download TCPMON from http://ws.apache.org/commons/tcpmon/download.cgi. After downloading and unpacking, you can, from within the build directory, run either tcpmon.bat for the Windows environment or tcpmon.sh for Unix/Linux environments. To configure a proxy, you can click the Admin tab and then set the Listen Port to 8888 and select the Proxy radio button. After that, clicking Add will create a new tab, where the requests and responses will later be displayed. Your favorite web browser now has to recognize the newly set up proxy. For Firefox 3, you can do this by selecting the menu option Edit/Preferences and then choosing the advanced tab and the network tab, as shown next. You will need to set the proxy options HTTP proxy to 127.0.0.1 and the port number to 8888. If you do this, you will need to ensure that the No proxies text input is blank. Clicking the OK button enables the new settings. To use the Proxy from within Internet Explorer 7 for a Local Area Network (LAN), you can edit the dialog box found under Tools | Internet Options | Connections | LAN settings. Once the proxy is working, typing http://localhost:8080/portal/login in the address bar will seamlessly return the login page of your local Sakai instance. Otherwise, you will see an error message similar to Proxy Server Refused Connection for Firefox or Internet Explorer cannot display the webpage. To turn the proxy settings off, simply select the No Proxies radio box and click OK for Firefox 3, or unselect the Use the proxy server for the LAN tick box in Internet Explorer 7 and click OK. 
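The request text that you type into Telnet, or watch passing through TCPMON, can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function name build_get_request and the default host value are assumptions for illustration, and unlike the bare Telnet example it emits a full HTTP/1.1 request with a Host header.

```python
from urllib.parse import urlencode


def build_get_request(path, params=None, host="localhost:8080"):
    """Return the text of an HTTP/1.1 GET request for `path`.

    `params`, if given, is appended to the URL as a query string,
    which is how a form using the GET verb sends its data.
    """
    if params:
        path = f"{path}?{urlencode(params)}"
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_get_request("/portal/login", {"name": "alan", "age": 15}))
```

Nothing is sent over the network here; the function only builds the text, so it can be compared side by side with what TCPMON records from a real browser.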
Requests and returned status codes When TCPMON is running a proxy on port 8888, it allows you to view the requests from the browser and the response in an extra tab, as shown in the following screen grab. Notice the extra information that the browser sends as part of the request. HTTP/1.1 defines the protocol and version level and the lines below the GET are header variables. The User-Agent defines which client sent the request. The Accept headers tell the server what the capabilities of the browser are, and the Cookie header defines the value stored in a cookie. HTTP is stateless, that is, in principle; each response is based only on the current request. However, to get around this, persistent information can be stored in cookies. Web browsers normally store their representation of a cookie as a little text file or in a small database on the end users' computers. Sakai uses the supporting features of a servlet container, such as Tomcat, to maintain state in cookies. A cookie stores a session ID, and when the server sees the session ID, it can look up the request's server-side state. Server-side state contains information such as whether the user is logged in or what he or she has ordered. The web browser deletes the local representation of the cookie each time the browser closes. A cookie that is deleted when a web browser closes is known as a session cookie. The server response starts with the protocol followed by a status number. HTTP/1.1 200 OK tells the web browser that the server is using HTTP version 1.1 and it was able to return the requested web page successfully. 2xx status codes imply success. 3xx status codes imply some form of redirection and tell the web browser where to try to pick up the requested resource. 4xx status codes are for client errors, such as malformed requests or lack of permission to obtain the resource. 4xx states are fertile grounds for security managers to look in log files for attempted hacking. 
5xx status codes mostly have to do with a failure of the server itself and are mostly of interest to system administrators and programmers during the debugging cycle. In most cases, 5xx status numbers are about either high server load or a broken piece of code. Sakai is changing rapidly and even with the most rigorous testing, there are bound to be the occasional hiccups. You will find accurate details of the full range of status codes at: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html.

Another important part of the response is the Content-Type, which tells the web browser which type of material the response is returning so the browser knows how to handle it. For example, the web browser may want to run a plug-in for video types and display text natively. The Content-Length, measured in bytes, is normally also given. After the header information is finished, there is a blank line followed by the content.

Web browsers interpret any redirects that are returned by sending extra requests. Web browsers also interpret any HTML pages and make multiple requests for resources such as JavaScript files and images. Modern browsers do not wait until the server returns all the requests, but render the HTML page live as the server returns the parts.

The GET verb is not very efficient for posting a large amount of data, as the URL has a length limit of around 2000 characters. Further, the end user can see the form data, and the browser may encode entities such as spaces, making the URL hard to read. There is also a security aspect: if you are typing in passwords in forms using GET, others may see your password or other details. This is not a good idea, especially at Internet Cafés where the next user who logs on can see the password in the browsing history. The POST verb is a better choice.

Let us take as an example the Sakai demonstration login page http://localhost:8080/portal/login. The login page itself contains a form tag that points with the POST method to the relogin page.
    <form method="post" action="http://localhost:8080/portal/relogin" enctype="application/x-www-form-urlencoded">

Notice the HTML tag also defines the content type. Key features of the POST request compared to the GET are: the form values are stored as content after the header values, there is a blank line between the end of the header and the data, and the request states the amount of data through the Content-Length header value.

The essential POST values for a login form with user admin (eid=admin) and password admin (pw=admin) will look like:

    POST http://localhost:8080/portal/relogin HTTP/1.1
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 31

    eid=admin&pw=admin&submit=Login

POSTs can contain much more information than GETs, and the request hides the values from the Address bar of the web browser. However, this does not make POST secure. The request body is just as visible on the network as the URL, so POST values are neither hidden nor secure. The only viable solution is for your web browser to encrypt your transactions using SSL/TLS (http://www.ietf.org/rfc/rfc2246.txt), and this occurs every time you connect to a server using an HTTPS URL.

SOAP

Sakai uses the Apache Axis framework, which the developers have configured to accept SOAP calls via POST. SOAP sends messages in a specific XML format with the Content-Type, otherwise known as MIME type, application/soap+xml. A programmer does not need to know much more than that, as client libraries take care of the majority of the excruciating low-level details.
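Before looking at a SOAP message, it may help to see how the plain login POST shown above is put together, since a SOAP call is carried by the same kind of request. The sketch below is illustrative only: build_post_request is a made-up helper, and it builds the request text without sending anything.

```python
from urllib.parse import urlencode


def build_post_request(url, fields):
    """Return the text of a form-encoded HTTP/1.1 POST request."""
    # The form values become the body, e.g. "eid=admin&pw=admin&submit=Login".
    body = urlencode(fields)
    headers = [
        f"POST {url} HTTP/1.1",
        "Content-Type: application/x-www-form-urlencoded",
        # Content-Length counts the bytes of the body, not of the headers.
        f"Content-Length: {len(body.encode('ascii'))}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(build_post_request(
        "http://localhost:8080/portal/relogin",
        {"eid": "admin", "pw": "admin", "submit": "Login"},
    ))
```

The computed Content-Length for these three fields is 31, matching the hand-written request earlier. Note that the password sits in the body as readable text, which is the reason HTTPS is needed.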
An example SOAP message generated by the Perl module SOAP::Lite (http://www.soaplite.com/) for creating a login session in Sakai will look like the following POST data:

    <?xml version="1.0" encoding="UTF-8"?>
    <soap:Envelope soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" >
      <soap:Body>
        <login >
          <c-gensym3 xsi:type="xsd:string">admin</c-gensym3>
          <c-gensym5 xsi:type="xsd:string">admin</c-gensym5>
        </login>
      </soap:Body>
    </soap:Envelope>

There is an envelope with a body containing data for the service to consume. The important point to remember is that both the client and the server have to be able to parse the specific XML schema. SOAP messages can include extra security features, but Sakai does not require these. The architects expect organizations to encrypt web services using SSL/TLS.

The last extra SOAP-related complexity is the Web Service Description Language (http://www.w3.org/TR/wsdl). Web services may change location or exist in multiple locations for redundancy. The service writer can define the location of the services and the data types involved with those services in another file, in XML format.

JSON

Also worth mentioning is JavaScript Object Notation (JSON) (http://tools.ietf.org/html/rfc4627), which is another popular format passed using HTTP. A significant improvement in the quality of the end user experience during web browsing occurred when web developers realized that they could force browsers to load parts of a web page one at a time. This asynchronous loading enables all kinds of whiz-bang features, such as when you type in a search term and can choose from a set of search term completions before pressing submit. Asynchronous loading delivers more responsive and richer web pages that feel more like traditional applications than a plain old web page. JSON is one of the formats of choice for passing asynchronous requests and responses.
The asynchronous communication normally occurs through HTTP GET or POST, but with a specific content structure that is designed to be human readable and script language parser-friendly. JSON calls have the file extension .json as part of the URL. As mentioned in RFC 4627, an example image object communicated in JSON looks like:

    {
        "Image": {
            "Width": 800,
            "Height": 600,
            "Title": "View from 15th Floor",
            "Thumbnail": {
                "Url": "http://www.example.com/image/481989943",
                "Height": 125,
                "Width": "100"
            },
            "IDs": [116, 943, 234, 38793]
        }
    }

To confuse the boundaries between client and server, a lot of the presentation and business logic is locked on the client side in scripting languages such as JavaScript. The scripting language orchestrates the loading of parts of pages and the generation of widget sets. Frameworks such as jQuery (http://jquery.com/) and MyFaces (http://myfaces.apache.org/) significantly ease the client-side programming burden.

REST

To understand REST, you need to understand the other verbs in HTTP (http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html). The full HTTP set is OPTIONS, GET, HEAD, POST, PUT, DELETE, and TRACE. The HEAD verb returns from the server only the headers of the response without the content, and is useful for clients that want to see if the content has changed since the last request. PUT requests that the content in the request be stored at the particular location mentioned in the request. DELETE is for deleting the entity.

REST uses the URL of the request to route to the resource, and the HTTP verb GET is used to get a resource, PUT to update, DELETE to delete, and POST to add a new resource. In general, POST=create an item, PUT=update an item, DELETE=delete an item, and GET=return information on the item.

In SOAP, you are pointing directly towards the service the client calls or indirectly via the web service description. However, in REST, part of the URL describes the resource or resources you wish to work with.
For example, a request to a hypothetical address book application that lists all email addresses in HTML format would look similar to the following:

    GET /email

To list the addresses in XML format or JSON format:

    GET /email.xml
    GET /email.json

To get the first email address in the list:

    GET /email/1

To create a new email address, of course remembering to add the rest of the email details to the body of the POST:

    POST /email

And to delete address 5 in the list:

    DELETE /email/5

To obtain address 5 in other formats such as JSON or XML, use file extensions at the end of the URL, for example:

    GET /email/5.json
    GET /email/5.xml

RESTful services are more intuitively descriptive than SOAP services and they enable easy switching of the format from HTML to JSON to fuel dynamic, asynchronously-loaded web sites. Due to the direct use of HTTP verbs by REST, this methodology also fits well with the most common application type: CRUD (Create, Read, Update, Delete) applications, such as the site or user tools within Sakai.

Now that we have discussed the theory, in the next section, we shall discuss which Sakai-related SOAP services already exist.

Existing web services

Sakai has built in, by default, the most community-requested web services, and there are also a few more services in the contributed section of the source code repository. This section describes the currently available services and the next section explains an example use, creating a new user.

Recapping terminology

In general, developers write web services for other developers' code to connect to (consume). Therefore, terminology can be confusing. In Sakai, a realm is a set of roles and their associated permissions. When you create a site, a copy is made from a specific realm template for that particular site type. The permissions can then be modified for the roles in the site, and members added to the site with one or other of the specific roles. Internally, Sakai uses AuthzGroups to keep track of groups of users.
An AuthzGroup is an authorization group (a group of users, each with a role and a set of permissions of functions assigned to each role). A site contains pages; when you click on the tool menu for a given tool, normally, you will see one tool displayed in a page. However, for the home page tool, you will see more tools contained within a page.
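Returning to the REST conventions described earlier, the mapping from HTTP verb and URL to a CRUD action can be sketched in a few lines of Python. This is a toy illustration for the hypothetical address book URLs only; it is not how Sakai's Entity Broker is implemented, and the names ACTIONS and describe are invented for the example.

```python
# HTTP verb -> CRUD action, as summarized in the REST section.
ACTIONS = {"GET": "read", "POST": "create", "PUT": "update", "DELETE": "delete"}


def describe(method, url_path):
    """Split a REST request into (action, resource, item id, format).

    The format comes from the file extension at the end of the URL and
    defaults to "html" when no extension is given.
    """
    action = ACTIONS[method.upper()]
    path, _, extension = url_path.strip("/").partition(".")
    fmt = extension or "html"
    resource, _, item = path.partition("/")
    return action, resource, item or None, fmt


if __name__ == "__main__":
    for method, path in [
        ("GET", "/email"),
        ("GET", "/email.json"),
        ("GET", "/email/5.xml"),
        ("POST", "/email"),
        ("DELETE", "/email/5"),
    ]:
        print(method, path, "->", describe(method, path))
```

Because the verb and the URL together carry the whole intent of the request, a single small dispatcher like this can serve every entity type, which is part of why REST fits CRUD applications so well.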
Packt
20 Oct 2009
1 min read

Authentication and Authorization in MODx

It is vital to keep this distinction in mind to be able to understand the complexities explained in this article. You will also learn how MODx allows grouping of documents, users, and permissions.

Create web users

Let us start by creating a web user. Web users are users who can access restricted document groups in the web site frontend; they do not have Manager access. Web users can identify themselves at login by using login forms. They are allowed to log in from the user page, but they cannot log in using the Manager interface. To create a web user, perform the following steps:

1. Click on the Web Users menu item in the Security menu.
2. Click on New Web User.
3. Fill in the fields with the following information:

    Field Name       Value
    Username         samira
    Password         samira123
    Email Address    xyz@configurelater.com

Packt
20 Oct 2009
17 min read

SOA—Service Oriented Architecture

What is SOA? SOA is the acronym for Service Oriented Architecture. As it has come to be known, SOA is an architectural design pattern by which several guiding principles determine the nature of the design. Basically, SOA states that every component of a system should be a service, and the system should be composed of several loosely-coupled services. A service here means a unit of a program that serves a business process. "Loosely-coupled" here means that these services should be independent of each other, so that changing one of them should not affect any other services. SOA is not a specific technology, nor a specific language. It is just a blueprint, or a system design approach. It is an architecture model that aims to enhance the efficiency, agility, and productivity of an enterprise system. The key concepts of SOA are services, high interoperability and loose coupling. Several other architecture/technologies such as RPC, DCOM, and CORBA have existed for a long time, and attempted to address the client/server communication problems. The difference between SOA and these other approaches is that SOA is trying to address the problem from the client side, and not from the server side. It tries to decouple the client side from the server side, instead of bundling them, to make the client side application much easier to develop and maintain. This is exactly what happened when object-oriented programming (OOP) came into play 20 years ago. Prior to object-oriented programming, most designs were procedure-oriented, meaning the developer had to control the process of an application. Without OOP, in order to finish a block of work, the developer had to be aware of the sequence that the code would follow. This sequence was then hard-coded into the program, and any change to this sequence would result in a code change. With OOP, an object simply supplied certain operations; it was up to the caller of the object to decide the sequence of those operations. 
The caller could mash up all of the operations, and finish the job in whatever order needed. There was a paradigm shift from the object side to the caller side. This same paradigm shift is happening today. Without SOA, every application is a bundled, tightly coupled solution. The client-side application is often compiled and deployed along with the server-side applications, making it impossible to quickly change anything on the server side. DCOM and CORBA were on the right track to ease this problem by making the server-side components reside on remote machines. The client application could directly call a method on a remote object, without knowing that this object was actually far away, just like calling a method on a local object. However, the client-side applications continue to remain tightly coupled with these remote objects, and any change to the remote object will still result in a recompiling or redeploying of the client application. Now, with SOA, the remote objects are truly treated as remote objects. To the client applications, they are no longer objects; they are services. The client application is unaware of how the service is implemented, or of the signature that should be used when interacting with those services. The client application interacts with these services by exchanging messages. What a client application knows now is only the interfaces, or protocols of the services, such as the format of the messages to be passed in to the service, and the format of the expected returning messages from the service. Historically, there have been many other architectural design approaches, technologies, and methodologies to integrate existing applications. EAI (Enterprise Application Integration) is just one of them. Often, organizations have many different applications, such as order management systems, accounts receivable systems, and customer relationship management systems. 
Each application has been designed and developed by different people using different tools and technologies at different times, and to serve different purposes. However, between these applications, there are no standard common ways to communicate. EAI is the process of linking these applications and others in order to realize financial and operational competitive advantages. It may seem that SOA is just an extension of EAI. The similarity is that they are both designed to connect different pieces of applications in order to build an enterprise-level system for business. But fundamentally, they are quite different. EAI attempts to connect legacy applications without modifying any of the applications, while SOA is a fresh approach to solve the same problem. Why SOA? So why do we need SOA now? The answer is in one word—agility. Business requirements change frequently, as they always have. The IT department has to respond more quickly and cost-effectively to those changes. With a traditional architecture, all components are bundled together with each other. Thus, even a small change to one component will require a large number of other components to be recompiled and redeployed. Quality assurance (QA) effort is also huge for any code changes. The processes of gathering requirements, designing, development, QA, and deployment are too long for businesses to wait for, and become actual bottlenecks. To complicate matters further, some business processes are no longer static. Requirements change on an ad-hoc basis, and a business needs to be able to dynamically define its own processes whenever it wants. A business needs a system that is agile enough for its day-to-day work. This is very hard, if not impossible, with existing traditional infrastructure and systems. This is where SOA comes into play. SOA's basic unit is a service. These services are building blocks that business users can use to define their own processes. 
Services are designed and implemented so that they can serve different purposes or processes, and not just specific ones. No matter what new processes a business needs to build or what existing processes a business needs to modify, the business users should always be able to use existing service blocks, in order to compete with others according to current market conditions. Also, if necessary, some new service blocks can be used. These services are also designed and implemented so that they are loosely coupled, and independent of one another. A change to one service does not affect any other service. Also, the deployment of a new service does not affect any existing service. This greatly eases release management and makes agility possible.

For example, a GetBalance service can be designed to retrieve the balance for a loan. When a borrower calls in to query the status of a specific loan, this GetBalance service may be called by the application that is used by the customer service representatives. When a borrower makes a payment online, this service can also be called to get the balance of the loan, so that the borrower will know the balance of his or her loan after the payment. In the payment posting process, this service can likewise be used to calculate the accrued interest for a loan, by multiplying the balance by the interest rate. Even further, a new process can be created by business users to utilize this service if a loan balance needs to be retrieved. The GetBalance service is developed and deployed independently from all of the above processes. Actually, the service exists without even knowing who the client will be or even how many clients there will be. All of the client applications communicate with this service through its interface, and its interface will remain stable once it is in production.
If we have to change the implementation of this service, for example by fixing a bug, or changing an algorithm inside a method of the service, all of the client applications can still work without any change. When combined with the more mature Business Process Management (BPM) technology, SOA plays an even more important role in an organization's efforts to achieve agility. Business users can create and maintain processes within BPM, and through SOA they can plug a service into any of the processes. The front-end BPM application is loosely coupled to the back-end SOA system. This combination of BPM and SOA will give an organization much greater flexibility in order to achieve agility.

How do we implement SOA?

Now that we've established why SOA is needed by the business, the question becomes: how do we implement SOA? To implement SOA in an organization, three key elements have to be evaluated: people, process, and technology. Firstly, the people in the organization must be ready to adopt SOA. Secondly, the organization must know the processes that the SOA approach will include, including the definition, scope, and priority. Finally, the organization should choose the right technology to implement it. Note that people and processes take precedence over technology in an SOA implementation, but they are out of the scope of this article. In this article, we will assume people and processes are all ready for an organization to adopt SOA. Technically, there are many SOA approaches. To a certain degree, traditional technologies such as RPC, DCOM, and CORBA, or some modern technologies such as IBM WebSphere MQ, Java RMI, and .NET Remoting, could all be categorized as service-oriented, and can be used to implement SOA for an organization. However, all of these technologies have limitations, such as being tied to a specific language or platform, complexity of implementation, or the ability to support binary transports only.
The most important shortcoming of these approaches is that the server-side applications are tightly coupled with the client-side applications, which is against the SOA principle. Today, with the emergence of web service technologies, SOA becomes a reality. Thanks to the dramatic increase in network bandwidth, and given the maturity of web service standards such as WS-Security, and WS-AtomicTransaction, an SOA back-end can now be implemented as a real system. SOA from different users' perspectives However, as we said earlier, SOA is not a technology, but only a style of architecture, or an approach to building software products. Different people view SOA in different ways. In fact, many companies now have their own definitions for SOA. Many companies claim they can offer an SOA solution, while they are really just trying to sell their products. The key point here is—SOA is not a solution. SOA alone can't solve any problem. It has to be implemented with a specific approach to become a real solution. You can't buy an SOA solution. You may be able to buy some kinds of products to help you realize your own SOA, but this SOA should be customized to your specific environment, for your specific needs. Even within the same organization, different players will think about SOA in quite different ways. What follows are just some examples of how different players in an organization judge the success of an SOA initiative using different criteria. [Gartner, Twelve Common SOA Mistakes and How to Avoid Them, Publication Date: 26 October 2007 ID Number: G00152446] To a programmer, SOA is a form of distributed computing in which the building blocks (services) may come from other applications or be offered to them. SOA increases the scope of a programmer's product and adds to his or her resources, while also closely resembling familiar modular software design principles. To a software architect, SOA translates to the disappearance of fences between applications. 
Architects turn to the design of business functions rather than to self-contained and isolated applications. The software architect becomes interested in collaboration with a business analyst to get a clear picture of the business functionality and scope of the application. SOA turns software architects into integration architects and business experts.

For Chief Information Officers (CIOs), SOA is an investment in the future. Expensive in the short term, its long-term promises are lower costs and greater flexibility in meeting new business requirements. Re-use is the primary benefit anticipated as a means to reduce the cost and time of new application development.

For business analysts, SOA is the bridge between them and the IT organization. It carries the promise that IT designers will understand them better, because the services in SOA reflect the business functions in business process models.

For CEOs, SOA is expected to help IT become more responsive to business needs and facilitate competitive business change.

Complexities in SOA implementation

Although SOA will make it possible for business parties to achieve agility, SOA itself is technically not simple to implement. In some cases, it even makes software development more complex than ever, because with SOA you are building for unknown problems. On one hand, you have to make sure that the SOA blocks you are building are useful blocks. On the other, you need a framework within which you can assemble those blocks to perform business activities. The technology issues associated with SOA are more challenging than vendors would like users to believe. Web services technology has turned SOA into an affordable proposition for most large organizations by providing a universally accepted, standard foundation. However, web services play a technology role only for the SOA backplane, which is the software infrastructure that enables SOA-related interoperability and integration.
The following figure shows the technical complexity of SOA. It has been taken from Gartner, Twelve Common SOA Mistakes and How to Avoid Them, Publication Date: 26 October 2007 ID Number: G00152446.

As Gartner says, users must understand the complex world of middleware, and use point-to-point web service connections only for small-scale, experimental SOA projects. If the number of services deployed grows to more than 20 or 30, then use a middleware-based intermediary, the SOA backplane. The SOA backplane could be an Enterprise Service Bus (ESB), a Message-Oriented Middleware (MOM), or an Object Request Broker (ORB). However, we will not cover it in this article. We will build only point-to-point services using WCF.

Web services

There are many approaches to realizing SOA, but the most popular and practical one is using web services.

What is a web service?

A web service is a software system designed to support interoperable machine-to-machine interaction over a network. A web service is typically hosted on a remote machine (provider), and called by a client application (consumer) over a network. After the provider of a web service publishes the service, the client can discover it and invoke it. The communications between a web service and a client application use XML messages. A web service is hosted within a web server and HTTP is used as the transport protocol between the server and the client applications. The following diagram shows the interaction of web services:

Web services were invented to solve the interoperability problem between applications. In the early 90s, along with the LAN/WAN/Internet development, it became a big problem to integrate different applications. An application might have been developed using C++, or Java, and run on a Unix box, a Windows PC, or even a mainframe computer. There was no easy way for it to communicate with other applications.
It was the development of XML that made it possible to share data between applications across hardware boundaries and networks, or even over the Internet. For example, a Windows application might need to display the price of a particular stock. With a web service, this application can make a request to a URL, and/or pass an XML string such as <QuoteRequest><GetPrice Symbol='XYZ'/></QuoteRequest>. The requested URL is actually the Internet address of a web service, which, upon receiving the above quote request, gives a response, <QuoteResponse><QuotePrice Symbol='XYZ'>51.22</QuotePrice></QuoteResponse>. The Windows application then uses an XML parser to interpret the response package, and display the price on the screen.

The reason it is called a web service is that it is designed to be hosted in a web server, such as Microsoft Internet Information Server, and called over the Internet, typically via the HTTP or HTTPS protocols. This is to ensure that a web service can be called by any application, using any programming language, and under any operating system, as long as there is an active Internet connection, and of course, an open HTTP/HTTPS port, which is true for almost every computer on the Internet. Each web service has a unique URL, and contains various methods. When calling a web service, you have to specify which method you want to call, and pass the required parameters to the web service method. Each web service method will also give a response package to tell the caller the execution results. Besides new applications being developed specifically as web services, legacy applications can also be wrapped up and exposed as web services. So, an IBM mainframe accounting system might be able to provide external customers with a link to check the balance of an account.

Web service WSDL

In order to be called by other applications, each web service has to supply a description of itself, so that other applications will know how to call it.
This description is provided in a language called WSDL. WSDL stands for Web Services Description Language. It is an XML format that defines and describes the functionalities of the web service, including the method names, parameter names and types, and returning data types of the web service. For a Microsoft ASMX web service, you can get the WSDL by adding ?WSDL to the end of the web service URL, say http://localhost/MyService/MyService.asmx?WSDL.

Web service proxy

A client application calls a web service through a proxy. A web service proxy is a stub class between a web service and a client. It is normally auto-generated by a tool such as the Visual Studio IDE, according to the WSDL of the web service. It can be re-used by any client application. The proxy contains stub methods mimicking all of the methods of the web service so that a client application can call each method of the web service through these stub methods. It also contains other necessary information required by the client to call the web service such as custom exceptions, custom data and class types, and so on. The address of the web service can be embedded within the proxy class, or it can be placed inside a configuration file. A proxy class is always for a specific language. For each web service, there could be a proxy class for Java clients, a proxy class for C# clients, and yet another proxy class for COBOL clients. To call a web service from a client application, the proper proxy class first has to be added to the client project. Then, with an optional configuration file, the address of the web service can be defined. Within the client application, a web service object can be instantiated, and its methods can be called just as for any other normal method.

SOAP

There are many standards for web services. SOAP is one of them. SOAP was originally an acronym for Simple Object Access Protocol, and was designed by Microsoft.
As this protocol became popular with the spread of web services, and its original meaning was misleading, the original acronym was dropped with version 1.2 of the standard. It is now merely a protocol, maintained by the W3C. SOAP, now, is a protocol for exchanging XML-based messages over computer networks. It is widely used by web services and has become their de facto protocol. With SOAP, the client application can send a request in XML format to a server application, and the server application will send back a response in XML format. The transport for SOAP is normally HTTP / HTTPS, and the wide acceptance of HTTP is one of the reasons why SOAP is widely accepted today.
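The XML request and response exchange described above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions rather than a real client: the QuoteRequest, GetPrice, and QuotePrice elements are the hypothetical stock quote messages from earlier in this article, the envelope uses the SOAP 1.1 namespace, and a canned string stands in for the service's reply so that no network call is made.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace; the quote elements below are hypothetical.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_quote_request(symbol: str) -> str:
    """Wrap a hypothetical GetPrice request in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, "QuoteRequest")
    ET.SubElement(request, "GetPrice", {"Symbol": symbol})
    return ET.tostring(envelope, encoding="unicode")


def parse_quote_response(xml_text: str) -> float:
    """Extract the price from a SOAP-wrapped QuoteResponse message."""
    root = ET.fromstring(xml_text)
    price = root.find(f"{{{SOAP_NS}}}Body/QuoteResponse/QuotePrice")
    if price is None or price.text is None:
        raise ValueError("no QuotePrice element in response")
    return float(price.text)


# A canned response standing in for what a real service would return.
sample_response = (
    f'<soap:Envelope xmlns:soap="{SOAP_NS}"><soap:Body>'
    "<QuoteResponse><QuotePrice Symbol='XYZ'>51.22</QuotePrice></QuoteResponse>"
    "</soap:Body></soap:Envelope>"
)

request_xml = build_quote_request("XYZ")
price = parse_quote_response(sample_response)
```

In practice the request string would be sent to the service URL over HTTP or HTTPS, and a proxy class generated from the WSDL would hide this XML handling from the client code.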

Packt
20 Oct 2009
41 min read

Query Performance Tuning in Microsoft Analysis Services: Part 1

In this two-part article by Chris Webb, we will cover query performance tuning, including how to design aggregations and partitions and how to write efficient MDX. The first part will cover performance-specific design features, along with the concepts of partitions and aggregations. One of the main reasons for building Analysis Services cubes as part of a BI solution is because it should mean you get better query performance than if you were querying your relational database directly. While it's certainly the case that Analysis Services is very fast it would be naive to think that all of our queries, however complex, will return in seconds without any tuning being necessary. This article will describe the steps you'll need to go through in order to ensure your cube is as responsive as possible. How Analysis Services processes queries Before we start to discuss how to improve query performance, we need to understand what happens inside Analysis Services when a query is run. The two major parts of the Analysis Services engine are: The Formula Engine processes MDX queries, works out what data is needed to answer them, requests that data from the Storage Engine, and then performs all calculations needed for the query. The Storage Engine handles all reading and writing of data; it fetches the data that the Formula Engine requests when a query is run and aggregates it to the required granularity. When you run an MDX query, then, that query goes first to the Formula Engine where it is parsed; the Formula Engine then requests all of the raw data needed to answer the query from the Storage Engine, performs any calculations on that data that are necessary, and then returns the results in a cellset back to the user. There are numerous opportunities for performance tuning at all stages of this process, as we'll see. 
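As a rough mental model of this division of labor, the following toy Python sketch separates a "storage engine" function that reads and aggregates fact rows from a "formula engine" function that requests the data and applies a calculation. It is only an analogy and says nothing about how Analysis Services is implemented; the fact rows, the function names, and the margin calculation are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, product, sales, cost).
FACTS = [
    (2003, "Bike", 100.0, 60.0),
    (2003, "Helmet", 20.0, 5.0),
    (2004, "Bike", 150.0, 90.0),
]


def storage_engine_fetch(group_by_year: bool) -> dict:
    """Read the fact rows and aggregate sales and cost to the requested granularity."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for year, _product, sales, cost in FACTS:
        key = year if group_by_year else "All"
        totals[key][0] += sales
        totals[key][1] += cost
    return dict(totals)


def formula_engine_query(group_by_year: bool) -> dict:
    """Work out what data is needed, request it, then apply a calculation."""
    aggregated = storage_engine_fetch(group_by_year)
    # Calculated measure evaluated on top of the aggregated data: margin = sales - cost.
    return {key: sales - cost for key, (sales, cost) in aggregated.items()}


margin_by_year = formula_engine_query(group_by_year=True)
margin_total = formula_engine_query(group_by_year=False)
```

The split matters for tuning: a slow query may be slow because of the data fetch and aggregation step, because of the calculations applied afterwards, or both.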
Performance tuning methodology When doing performance tuning there are certain steps you should follow to allow you to measure the effect of any changes you make to your cube, its calculations or the query you're running: Wherever possible, test your queries in an environment that is identical to your production environment. Otherwise ensure that the size of the cube and the server hardware you're running on is at least comparable, and running the same build of Analysis Services. Make sure that no-one else has access to the server you're running your tests on. You won't get reliable results if someone else starts running queries at the same time as you. Make sure that the queries you're testing with are equivalent to the ones that your users want to have tuned. As we'll see, you can use Profiler to capture the exact queries your users are running against the cube. Whenever you test a query, run it twice: first on a cold cache, and then on a warm cache. Make sure you keep a note of the time each query takes to run and what you changed on the cube or in the query for that run. Clearing the cache is a very important step—queries that run for a long time on a cold cache may be instant on a warm cache. When you run a query against Analysis Services, some or all of the results of that query (and possibly other data in the cube, not required for the query) will be held in cache so that the next time a query is run that requests the same data it can be answered from cache much more quickly. To clear the cache of an Analysis Services database, you need to execute a ClearCache XMLA command. 
To do this in SQL Management Studio, open up a new XMLA query window and enter the following:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ClearCache>
    <Object>
      <DatabaseID>Adventure Works DW 2008</DatabaseID>
    </Object>
  </ClearCache>
</Batch>

Remember that the ID of a database may not be the same as its name; you can check this by right-clicking on a database in the SQL Management Studio Object Explorer and selecting Properties. Alternatives to this method also exist: the MDX Studio tool allows you to clear the cache with a menu option, and the Analysis Services Stored Procedure Project (http://tinyurl.com/asstoredproc) contains code that allows you to clear the Analysis Services cache and the Windows File System cache directly from MDX. Clearing the Windows File System cache is interesting because it allows you to compare the performance of the cube on a warm and cold file system cache as well as a warm and cold Analysis Services cache: when the Analysis Services cache is cold or can't be used for some reason, a warm file system cache can still have a positive impact on query performance.

After the cache has been cleared, before Analysis Services can answer a query it needs to recreate the calculated members, named sets and other objects defined in a cube's MDX script. If you have any reasonably complex named set expressions that need to be evaluated, you'll see some activity in Profiler relating to these sets being built and it's important to be able to distinguish between this and activity that's related to the queries you're actually running. All MDX Script related activity occurs between the Execute MDX Script Begin and Execute MDX Script End events; these are fired after the Query Begin event but before the Query Cube Begin event for the query run after the cache has been cleared.
When looking at a Profiler trace you should either ignore everything between the Execute MDX Script Begin and End events or run a query that returns no data at all to trigger the evaluation of the MDX Script, for example:

SELECT {} ON 0
FROM [Adventure Works]

Designing for performance

Many of the recommendations for designing cubes we've given so far in this article have been given on the basis that they will improve query performance, and in fact the performance of a query is intimately linked to the design of the cube it's running against. For example, dimension design, especially optimizing attribute relationships, can have a significant effect on the performance of all queries, at least as much as any of the optimizations described in this article. As a result, we recommend that if you've got a poorly performing query the first thing you should do is review the design of your cube to see if there is anything you could do differently. There may well be some kind of trade-off needed between usability, manageability, time-to-develop, overall "elegance" of the design and query performance, but since query performance is usually the most important consideration for your users, it will take precedence. To put it bluntly, if the queries your users want to run don't run fast your users will not want to use the cube at all!

Performance-specific design features

Once you're sure that your cube design is as good as you can make it, it's time to look at two features of Analysis Services that are transparent to the end user but have an important impact on performance and scalability: measure group partitioning and aggregations. Both of these features relate to the Storage Engine and allow it to answer requests for data from the Formula Engine more efficiently.

Partitions

A partition is a data structure that holds some or all of the data held in a measure group.
When you create a measure group, by default that measure group contains a single partition that contains all of the data. Enterprise Edition of Analysis Services allows you to divide a measure group into multiple partitions; Standard Edition is limited to one partition per measure group, and the ability to partition is one of the main reasons why you would want to use Enterprise Edition over Standard Edition. Why partition? Partitioning brings two important benefits: better manageability and better performance. Partitions within the same measure group can have different storage modes and different aggregation designs, although in practice they usually don't differ in these respects; more importantly they can be processed independently, so for example when new data is loaded into a fact table, you can process only the partitions that should contain the new data. Similarly, if you need to remove old or incorrect data from your cube, you can delete or reprocess a partition without affecting the rest of the measure group. Partitioning can also improve both processing performance and query performance significantly. Analysis Services can process multiple partitions in parallel and this can lead to much more efficient use of CPU and memory resources on your server while processing is taking place. Analysis Services can also fetch and aggregate data from multiple partitions in parallel when a query is run too, and again this can lead to more efficient use of CPU and memory and result in faster query performance. Lastly, Analysis Services will only scan the partitions that contain data necessary for a query and since this reduces the overall amount of IO needed this can also make queries faster. Building partitions You can view, create and delete partitions on the Partitions tab of the Cube Editor in BIDS. 
When you run the New Partition Wizard or edit the Source property of an existing partition, you'll see you have two options for controlling what data is used in the partition:

Table Binding means that the partition contains all of the data in a table or view in your relational data source, or a named query defined in your DSV. You can choose the table you wish to bind to on the Specify Source Information step of the New Partition Wizard, or in the Partition Source dialog if you choose Table Binding from the Binding Type drop-down box.

Query Binding allows you to specify an SQL SELECT statement to filter the rows you want from a table; BIDS will automatically generate part of the SELECT statement for you, and all you'll need to do is supply the WHERE clause. If you're using the New Partition Wizard, this is the option that will be chosen if you check the Specify a query to restrict rows checkbox on the second step of the wizard; in the Partition Source dialog you can choose this option from the Binding Type drop-down box.

It might seem like query binding is the easiest way to filter your data, and while it's the most widely used approach it does have one serious shortcoming: since it involves hard-coding an SQL SELECT statement into the definition of the partition, changes to your fact table such as the deletion or renaming of a column will cause the SELECT statement to fail when it is run if that column is referenced in it. This in turn will cause the partition processing to fail. If you have a lot of partitions in your measure group (and it's not unusual to have over one hundred partitions on a large cube), altering the query used for each one is somewhat time-consuming. Instead, table-binding each partition to a view in your relational database will make this kind of maintenance much easier, although you do of course now need to generate one view for each partition.
Alternatively, if you're building query-bound partitions from a single view on top of your fact table (which means you have complete control over the columns the view exposes), you could use a query like SELECT * FROM in each partition’s definition.

It's very important that you check the queries you're using to filter your fact table for each partition. If the same fact table row appears in more than one partition, or if fact table rows don't appear in any partition, this will result in your cube displaying incorrect measure values.

On the Processing and Storage Locations step of the wizard you have the chance to create the partition on a remote server instance, functionality that is called Remote Partitions. This is one way of scaling out Analysis Services: you can have a cube and measure group on one server but store some of the partitions for the measure group on a different server, something like a linked measure group but at a lower level. This can be useful for improving processing performance in situations when you have a very small time window available for processing but in general we recommend that you do not use remote partitions. They have an adverse effect on query performance and they make management of the cube (especially backup) very difficult.

Also on the same step you have the chance to store the partition at a location other than the default of the Analysis Services data directory. Spreading your partitions over more than one volume may make it easier to improve the IO performance of your solution, although again it can complicate database backup and restore.

After assigning an aggregation design to the partition (we'll talk about aggregations in detail next), the last important property to set on a partition is Slice. The Slice property takes the form of an MDX member, set or tuple (MDX expressions returning members, sets or tuples are not allowed, however) and indicates what data is present in a partition.
While you don't have to set it, we strongly recommend that you do so, even for MOLAP partitions, for the following reasons: While Analysis Services does automatically detect what data is present in a partition during processing, it doesn't always work as well as you'd expect and can result in unwanted partition scanning taking place at query time in a number of scenarios. The following blog entry on the SQLCat team site explains why in detail: http://tinyurl.com/partitionslicing It acts as a useful safety mechanism to ensure that you only load the data you're expecting into a partition. If, while processing, Analysis Services finds that data is being loaded into the partition that conflicts with what's specified in the Slice property, then processing will fail. More detail on how to set the Slice property can be found in Mosha Pasumansky's blog entry on the subject here: http://tinyurl.com/moshapartition Planning a partitioning strategy We now know why we should be partitioning our measure groups and what to do to create a partition; the next question is: how should we split the data in our partitions? We need to find some kind of happy medium between the manageability and performance aspects of partitioning—we need to split our data so that we do as little processing as possible, but also so that as few partitions are scanned as possible by our users' queries. Luckily, if we partition by our Time dimension we can usually meet both needs very well: it's usually the case that when new data arrives in a fact table it's for a single day, week or month, and it's also the case that the most popular way of slicing a query is by a time period. Therefore, it's almost always the case that when measure groups are partitioned they are partitioned by time. 
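Because time-based partitions are usually defined by date ranges, the requirement that every fact row lands in exactly one partition can be checked mechanically before anything is deployed. The sketch below assumes each partition is described by a half-open range [start, end) of integer date keys; the partition names and key values are made up for illustration, and the function is a hypothetical helper rather than part of any Analysis Services tooling.

```python
def check_partition_ranges(ranges, lo, hi):
    """Report gaps and overlaps among half-open [start, end) ranges.

    `ranges` maps a partition name to a (start, end) pair of comparable keys
    (for example integer date keys such as 20030101). The ranges are expected
    to cover [lo, hi) exactly once.
    """
    problems = []
    ordered = sorted(ranges.items(), key=lambda item: item[1])
    cursor = lo  # first key not yet covered by any partition
    for name, (start, end) in ordered:
        if start >= end:
            problems.append(f"{name}: empty or inverted range")
            continue
        if start > cursor:
            problems.append(f"gap before {name}: [{cursor}, {start})")
        elif start < cursor:
            problems.append(f"overlap at {name}: [{start}, {min(cursor, end)})")
        cursor = max(cursor, end)
    if cursor < hi:
        problems.append(f"gap at end: [{cursor}, {hi})")
    return problems


# Illustrative definitions containing two deliberate mistakes.
partitions = {
    "Internet_Sales_2001": (20010101, 20020101),
    "Internet_Sales_2002": (20020101, 20030101),
    "Internet_Sales_2003": (20021201, 20040101),  # starts a month early
    "Internet_Sales_2004": (20040201, 20050101),  # starts a month late
}
for problem in check_partition_ranges(partitions, 20010101, 20050101):
    print(problem)
```

A check like this complements the Slice property: the script catches mistakes in the range definitions before processing, while Slice catches unexpected data during processing.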
It's also worth considering, though, whether it's a good idea to partition by time and another dimension: for example, in an international company you might have a Geography dimension and a Country attribute, and users may always be slicing their queries by Country too, in which case it might make sense to partition by Country.

Measure groups that contain measures with the Distinct Count aggregation type require their own specific partitioning strategy. While you should still partition by time, you should also partition by non-overlapping ranges of values within the column you're doing the distinct count on. A lot more detail on this is available in the following white paper: http://tinyurl.com/distinctcountoptimize

It's worth looking at the distribution of data over partitions for dimensions we're not explicitly slicing by, as there is often a dependency between data in these dimensions and the Time dimension: for example, a given Product may only have been sold in certain Years or in certain Countries. You can see the distribution of member DataIDs (the internal key values that Analysis Services creates for all members on a hierarchy) for a partition by querying the Discover_Partition_Dimension_Stat DMV, for example:

SELECT * FROM SystemRestrictSchema($system.Discover_Partition_Dimension_Stat
    ,DATABASE_NAME = 'Adventure Works DW 2008'
    ,CUBE_NAME = 'Adventure Works'
    ,MEASURE_GROUP_NAME = 'Internet Sales'
    ,PARTITION_NAME = 'Internet_Sales_2003')

The following screenshot shows what the results of this query look like:

The Analysis Services Stored Procedure Project (a free, community-developed set of sample Analysis Services stored procedures) also includes a useful stored procedure that shows the same data along with any partition overlaps: http://tinyurl.com/partitionhealth.
This blog entry describes how you can take this data and visualise it in a Reporting Services report: http://tinyurl.com/viewpartitionslice

We also need to consider what size our partitions should be. In general between 5 and 20 million rows per partition, or up to around 3GB, is a good size. If you have a measure group with a single partition of below 5 million rows then don't worry, it will perform very well, but it's not worth dividing it into smaller partitions; it's equally possible to get good performance with partitions of 50-60 million rows. It's also best to avoid having too many partitions: if you have more than a few hundred it may make SQL Management Studio and BIDS slow to respond, and it may be worth creating fewer, larger partitions, assuming these partitions stay within the size limits for a single partition we've just given.

Automatically generating large numbers of partitions

When creating a measure group for the first time, it's likely you'll already have a large amount of data and may need to create a correspondingly large number of partitions for it. Clearly the last thing you'll want to do is create tens or hundreds of partitions manually, and it's worth knowing some tricks to create these partitions automatically. One method involves taking a single partition, scripting it out to XMLA and then pasting and manipulating this in Excel, as detailed here: http://tinyurl.com/generatepartitions. The Analysis Services Stored Procedure Project also contains a set of functions for creating partitions automatically based on MDX set expressions: http://tinyurl.com/autopartition.

Unexpected partition scans

Even when you have configured your partitions properly it's sometimes the case that Analysis Services will scan partitions that you don't expect it to be scanning for a particular query. If you see this happening the first thing to determine is whether these extra scans are making a significant contribution to your query times.
If they aren't, then it's probably not worth worrying about; if they are, there are some things you can try to stop it happening. The extra scans could be the result of a number of factors, including:

The way you have written MDX for queries or calculations. In most cases it will be very difficult to rewrite the MDX to stop the scans, but the following blog entry describes how it is possible in one scenario: http://tinyurl.com/moshapart

The LastNonEmpty measure aggregation type may result in multiple partition scans. If you can restructure your cube so you can use the LastChild aggregation type, Analysis Services will only scan the last partition containing data for the current time period.

In some cases, even when you've set the Slice property, Analysis Services has trouble working out which partitions should be scanned for a query. Changing the attributes mentioned in the Slice property may help, but not always. The section on Related Attributes and Almost Related Attributes in the following blog entry discusses this in more detail: http://tinyurl.com/mdxpartitions

Analysis Services may also decide to retrieve more data than is needed for a query to make answering future queries more efficient. This behavior is called prefetching and can be turned off by setting the following connection string properties:

Disable Prefetch Facts=True; Cache Ratio=1

More information on this can be found in the section on Prefetching and Request Ordering in the white paper Identifying and Resolving MDX Query Performance Bottlenecks, available from http://tinyurl.com/mdxbottlenecks. Note that setting these connection string properties can have other, negative effects on query performance.

You can set connection string properties in SQL Management Studio when you open a new MDX Query window. Just click the Options button on the Connect to Analysis Services dialog, then go to the Additional Connection Parameters tab.
Note that in the RTM version of SQL Management Studio there is a problem with this functionality, so that when you set a connection string property it will continue to be set for all connections, even though the textbox on the Additional Connection Parameters tab is blank, until SQL Management Studio is closed down or until you set the same property differently. Aggregations An aggregation is simply a pre-summarised data set, similar to the result of an SQL SELECT statement with a GROUP BY clause, that Analysis Services can use when answering queries. The advantage of having aggregations built in your cube is that it reduces the amount of aggregation that the Analysis Services Storage Engine has to do at query time, and building the right aggregations is one of the most important things you can do to improve query performance. Aggregation design is an ongoing process that should start once your cube and dimension designs have stabilised and which will continue throughout the lifetime of the cube as its structure and the queries you run against it change; in this section we'll talk about the steps you should go through to create an optimal aggregation design. Creating an initial aggregation design The first stage in creating an aggregation design should be to create a core set of aggregations that will be generally useful for most queries run against your cube. This should take place towards the end of the development cycle when you're sure that your cube and dimension designs are unlikely to change much, because any changes are likely to invalidate your aggregations and mean this step will have to be repeated. It can't be stressed enough that good dimension design is the key to getting the most out of aggregations: removing unnecessary attributes, setting AttributeHierarchyEnabled to False where possible, building optimal attribute relationships and building user hierarchies will all make the aggregation design process faster, easier and more effective. 
You should also take care to update the EstimatedRows property of each measure group and partition, and the EstimatedCount of each attribute, before you start, as these values are used by the aggregation design process. BIDS Helper adds a new button to the toolbar in the Partitions tab of the Cube Editor to update all of these count properties with one click.

To build this initial set of aggregations we'll be running the Aggregation Design Wizard, which can be run by clicking the Design Aggregations button on the toolbar of the Aggregations tab of the Cube Editor. This wizard will analyse the structure of your cube and dimensions, look at various property values you've set, and try to come up with a set of aggregations that it thinks should be useful. The one key piece of information it doesn't have at this point is what queries you're running against the cube, so some of the aggregations it designs may not prove to be useful in the long run, but running the wizard is extremely useful for creating a first draft of your aggregation designs.

You can only design aggregations for one measure group at a time; if you have more than one partition in the measure group you've selected then the first step of the wizard asks you to choose which partitions you want to design aggregations for. An aggregation design can be associated with many partitions in a measure group, and a partition can be associated with just one aggregation design or none at all. We recommend that, in most cases, you have just one aggregation design for each measure group for the sake of simplicity. However, if processing time is limited and you need to reduce the overall time spent building aggregations, or if query patterns are different for different partitions within the same measure group, then it may make sense to apply different aggregation designs to different partitions.
The next step of the wizard asks you to review the AggregationUsage property of all the attributes on all of the cube dimensions in your cube; this property can also be set on the Cube Structure tab of the Cube Editor. The following figure shows the Aggregation Design Wizard: The AggregationUsage property controls how dimension attributes are treated in the aggregation design process. The property can be set to the following values: Full: This means the attribute, or an attribute at a lower granularity directly related to it by an attribute relationship, will be included in every single aggregation the wizard builds. We recommend that you use this value sparingly, for at most one or two attributes in your cube, because it can significantly reduce the number of aggregations that get built. You should set it for attributes that will almost always get used in queries. For example, if the vast majority of your queries are at the Month granularity it makes sense that all of your aggregations include the Month attribute from your Time dimension. None: This means the attribute will not be included in any aggregation that the wizard designs. Don't be afraid of using this value for any attributes that you don't think will be used often in your queries, it can be a useful way of ensuring that the attributes that are used often get good aggregation coverage. Note that Attributes with AttributeHierarchyEnabled set to False will have no aggregations designed for them anyway. Unrestricted: This means that the attribute may be included in the aggregations designed, depending on whether the algorithm used by the wizard considers it to be useful or not. Default: The default option applies a complex set of rules, which are: The granularity attribute (usually the key attribute, unless you specified otherwise in the dimension usage tab) is treated as Unrestricted. 
All attributes on dimensions involved in many-to-many relationships, unmaterialised referenced relationships, and data mining dimensions are treated as None. Aggregations may still be built at the root granularity, that is, the intersection of every All Member on every attribute.
All attributes that are used as levels in natural user hierarchies are treated as Unrestricted.
Attributes with IsAggregatable set to False are treated as Full.
All other attributes are treated as None.

The next step in the wizard asks you to verify the values of the EstimatedRows and EstimatedCount properties we've already talked about, and gives the option of setting a similar property that shows the estimated number of members from an attribute that appear in any one partition. This can be an important property to set: if you are partitioning by month, although you may have 36 members on your Month attribute, a partition will only contain data for one of them.

On the Set Aggregation Options step you finally reach the point where some aggregations can be built. Here you can apply one last set of restrictions on the set of aggregations that will be built, by choosing one of the following options:

Estimated Storage Reaches, which means you build aggregations to fill a given amount of disk space.

Performance Gain Reaches, the most useful option. It does not mean that all queries will run n% faster; nor does it mean that a query that hits an aggregation directly will run n% faster. Think of it like this: if the wizard built all the aggregations it thought were useful to build (note: this is not the same thing as all of the possible aggregations that could be built on the cube) then, in general, performance would be better. Some queries would not benefit from aggregations, some would be slightly faster, and some would be a lot faster; and some aggregations would be more often used than others.
So if you set this property to 100% the wizard would build all the aggregations that it could, and you'd get 100% of the performance gain possible from building aggregations. Setting this property to 30%, the default and recommended value, will build the aggregations that give you 30% of this possible performance gain—not 30% of the possible aggregations, usually a much smaller number. As you can see from the screenshot below, the graph drawn on this step plots the size of the aggregations built versus overall performance gain, and the shape of the curve shows that a few, smaller aggregations usually provide the majority of the performance gain. I Click Stop, which means carry on building aggregations until you click the Stop button. Designing aggregations can take a very long time, especially on more complex cubes, because there may literally be millions or billions of possible aggregations that could be built. In fact, it's not unheard of for the aggregation design wizard to run for several days before it's stopped! Do Not Design Aggregations allows you to skip designing aggregations. The approach we suggest taking here is to first select I Click Stop and then click the Start button. On some measure groups this will complete very quickly, with only a few small aggregations built. If that's the case click Next; otherwise, if it's taking too long or too many aggregations are being built, click Stop and then Reset, and then select Performance Gain Reaches and enter 30% and Start again. This should result in a reasonable selection of aggregations being built; in general around 50-100 aggregations is the maximum number you should be building for a measure group, and if 30% leaves you short of this try increasing the number by 10% until you feel comfortable with what you get. On the final step of the wizard, enter a name for your aggregation design and save it. 
It's a good idea to give the aggregation design a name including the name of the measure group to make it easier to find if you ever need to script it to XMLA. It's quite common that Analysis Services cube developers stop thinking about aggregation design at this point. This is a serious mistake: just because you have run the Aggregation Design Wizard does not mean you have built all the aggregations you need, or indeed any useful ones at all! Doing Usage-Based Optimisation and/or building aggregations manually is absolutely essential. Usage-based optimization We now have some aggregations designed, but the chances are that despite our best efforts many of them will not prove to be useful. To a certain extent we might be able to pick out these aggregations by browsing through them; really, though, we need to know what queries our users are going to run before we can build aggregations to make them run faster. This is where usage-based optimisation comes in: it allows us to log the requests for data that Analysis Services makes when a query is run and then feed this information into the aggregation design process. To be able to do usage-based optimization, you must first set up Analysis Services to log these requests for data. This involves specifying a connection string to a relational database in the server properties of your Analysis Services instance and allowing Analysis Services to create a log table in that database. The white paper Configuring the Analysis Services Query Log contains more details on how to do this (it's written for Analysis Services 2005 but is still relevant for Analysis Services 2008), and can be downloaded from http://tinyurl.com/ssasquerylog. The query log is a misleading name, because as you'll see if you look inside it it doesn't actually contain the text of MDX queries run against the cube. 
When a user runs an MDX query, Analysis Services decomposes it into a set of requests for data at a particular granularity and it's these requests that are logged; we'll look at how to interpret this information in the next section. A single query can result in no requests for data, or it can result in as many as hundreds or thousands of requests, especially if it returns a lot of data and a lot of MDX calculations are involved. When setting up the log you also have to specify the percentage of all data requests that Analysis Services actually logs with the QueryLogSampling property—in some cases if it logged every single request you would end up with a very large amount of data very quickly, but on the other hand if you set this value too low you may end up not seeing certain important long-running requests. We recommend that you start by setting this property to 100 but that you monitor the size of the log closely and reduce the value if you find that the number of requests logged is too high. Once the log has been set up, let your users start querying the cube. Explain to them what you're doing and that some queries may not perform well at this stage. Given access to a new cube it will take them a little while to understand what data is present and what data they're interested in; if they're new to Analysis Services it's also likely they'll need some time to get used to whatever client tool they're using. Therefore you'll need to have logging enabled for at least a month or two before you can be sure that your query log contains enough useful information. Remember that if you change the structure of the cube while you're logging then the existing contents of the log will no longer be usable. Last of all, you'll need to run the Usage-Based Optimisation Wizard to build new aggregations using this information. 
The Usage-Based Optimisation Wizard is very similar to the Aggregation Design Wizard, with the added option to filter the information in the query log by date, user and query frequency before it's used to build aggregations. It's a good idea to do this filtering carefully: you should probably exclude any queries you've run yourself, for example, since they're unlikely to be representative of what the users are doing, and make sure that the most important users' queries are over-represented. Once you've done this you'll have a chance to review what data is actually going to be used before you build the aggregations. On the last step of the wizard you have the choice of either creating a new aggregation design or merging the aggregations that have just been created with an existing aggregation design. We recommend the latter: what you've just done is optimize queries that ran slowly on an existing aggregation design, and if you abandon the aggregations you've already got then it's possible that queries which previously had been quick would be slow afterwards.

This exercise should be repeated at regular intervals throughout the cube's lifetime to ensure that you build any new aggregations that are necessary as the queries that your users run change. Query logging can, however, have an impact on query performance so it's not a good idea to leave logging running all the time.

Processing aggregations

When you've created or edited the aggregations on one or more partitions, you don't need to do a full process on the partitions. All you need to do is to deploy your changes and then run a ProcessIndex, which is usually fairly quick, and once you've done that queries will be able to use the new aggregations. When you run a ProcessIndex, Analysis Services does not need to run any SQL queries against the relational data source if you're using MOLAP storage.
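If you have many partitions, running ProcessIndex on each of them through the user interface is laborious, and one option is to generate an XMLA batch instead. The following sketch builds such a batch with Python's standard library; in XMLA the corresponding Type value is ProcessIndexes. The object IDs in the usage example are placeholders we have made up, so substitute the IDs (not the names) of your own database, cube, measure group and partitions.

```python
import xml.etree.ElementTree as ET

XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def process_index_batch(database_id, cube_id, measure_group_id, partition_ids):
    """Return an XMLA Batch that runs ProcessIndexes on each listed partition."""
    # Serialise the XMLA namespace as the default namespace (no prefix).
    ET.register_namespace("", XMLA_NS)
    batch = ET.Element(f"{{{XMLA_NS}}}Batch")
    for partition_id in partition_ids:
        process = ET.SubElement(batch, f"{{{XMLA_NS}}}Process")
        obj = ET.SubElement(process, f"{{{XMLA_NS}}}Object")
        for tag, value in (
            ("DatabaseID", database_id),
            ("CubeID", cube_id),
            ("MeasureGroupID", measure_group_id),
            ("PartitionID", partition_id),
        ):
            ET.SubElement(obj, f"{{{XMLA_NS}}}{tag}").text = value
        ET.SubElement(process, f"{{{XMLA_NS}}}Type").text = "ProcessIndexes"
    return ET.tostring(batch, encoding="unicode")


# Placeholder IDs, for illustration only.
print(process_index_batch(
    "Adventure Works DW 2008", "Adventure Works", "Internet Sales",
    ["Internet_Sales_2003", "Internet_Sales_2004"],
))
```

The resulting text can be pasted into an XMLA query window in SQL Management Studio and executed there.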
Monitoring partition and aggregation usage

Having created and configured your partitions and aggregations, you'll naturally want to be sure that when you run a query Analysis Services is using them as you expect. You can do this very easily by running a trace with SQL Server Profiler or by using MDX Studio (a free MDX Editor that can be downloaded from http://tinyurl.com/mdxstudio).

To use Profiler, start it and then connect to your Analysis Services instance to create a new trace. On the Trace Properties dialog choose the Blank template and go to the Events Selection tab and check the following:

Progress Reports\Progress Report Begin
Progress Reports\Progress Report End
Queries Events\Query Begin
Queries Events\Query End
Query Processing\Execute MDX Script Begin
Query Processing\Execute MDX Script End
Query Processing\Query Cube Begin
Query Processing\Query Cube End
Query Processing\Get Data From Aggregation
Query Processing\Query Subcube Verbose

Then clear the cache and click Run to start the trace. Once you've done this you can either open up your Analysis Services client tool or you can start running MDX queries in SQL Management Studio. When you do this you'll notice that Profiler starts to show information about what Analysis Services is doing internally to answer these queries. The following screenshot shows what you might typically see:

Interpreting the results of a Profiler trace is a complex task and well outside the scope of this article, but it's very easy to pick out some useful information relating to aggregation and partition usage. Put simply:

The Query Subcube Verbose events represent individual requests for data from the Formula Engine to the Storage Engine, which can be answered either from cache, an aggregation or base-level partition data. Each of these requests is at a single granularity, meaning that all of the data in the request comes from a single distinct combination of attributes; we refer to these granularities as "subcubes".
The TextData column for this event shows the granularity of data that is being requested in human-readable form; the Query Subcube event will display exactly the same data but in the less-friendly format that the Usage-Based Optimisation Query Log uses.

Pairs of Progress Report Begin and Progress Report End events show that data is being read from disk, either from an aggregation or a partition. The TextData column gives more information, including the name of the object being read; however, if you have more than one object (for example an aggregation) with the same name, you need to look at the contents of the ObjectPath column to see exactly which object is being queried.

The Get Data From Aggregation event is fired when data is read from an aggregation, in addition to any Progress Report events.

The Duration column shows how long each of these operations takes in milliseconds.

At this point in the cube optimisation process you should be seeing in Profiler that when your users run queries they hit as few partitions as possible and hit aggregations as often as possible. If you regularly see slow queries that scan all the partitions in your cube or which do not use any aggregations at all, you should consider going back to the beginning of the process, rethinking your partitioning strategy and rerunning the aggregation design wizards. In a production system many queries will be answered from cache and therefore be very quick, but you should always try to optimise for the worst-case scenario of a query running on a cold cache.

Building aggregations manually

However good the aggregation designs produced by the wizards are, it's very likely that at some point you'll have to design aggregations manually for particular queries.
Even after running the Usage-Based Optimisation Wizard you may find that it still does not build some potentially useful aggregations: the algorithm the wizards use is very complex and something of a black box, so for whatever reason (perhaps because it thinks it would be too large) it may decide not to build an aggregation that, when built manually, turns out to have a significant positive impact on the performance of a particular query.

Before we can build aggregations manually we need to work out which aggregations we need to build. To do this, we once again need to use Profiler and look at either the Query Subcube or the Query Subcube Verbose events. These events, remember, display the same thing in two different formats (requests for data made to the Analysis Services storage engine during query processing), and the contents of the Duration column in Profiler will show how long in milliseconds each of these requests took. A good rule of thumb is that any Query Subcube event that takes longer than half a second (500 ms) would benefit from having an aggregation built for it; you can expect that a Query Subcube event that requests data at the same granularity as an aggregation will execute almost instantaneously.

The following screenshot shows an example of a trace on an MDX query that takes 700ms:

The single Query Subcube Verbose event is highlighted, and we can see that the duration of this event is the same as that of the query itself, so if we want to improve the performance of the query we need to build an aggregation for this particular request. Also, in the lower half of the screen we can see the contents of the TextData column displayed. This shows a list of all the dimensions and attributes from which data is being requested (the granularity of the request), and the simple rule to follow here is that whenever you see anything other than a zero by an attribute, the granularity of the request includes this attribute.
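The two rules above (look at requests slower than 500 ms, then note every attribute marked with something other than a zero) are easy to apply programmatically once trace data has been exported. The sketch below assumes the trace has already been parsed into plain Python dictionaries; that structure, the key names and the sample attribute names are our own assumptions for illustration and are not the raw Profiler TextData format.

```python
SLOW_MS = 500  # rule-of-thumb threshold from the text


def aggregation_candidates(events, threshold_ms=SLOW_MS):
    """Yield (duration, attributes) for slow Query Subcube Verbose events.

    `attributes` lists the (dimension, attribute) pairs whose marker is
    anything other than "0", that is, the granularity of the request.
    """
    for event in events:
        if event["event"] != "Query Subcube Verbose":
            continue
        if event["duration_ms"] <= threshold_ms:
            continue
        attributes = [
            (dimension, attribute)
            for dimension, markers in event["granularity"].items()
            for attribute, marker in markers.items()
            if marker != "0"
        ]
        yield event["duration_ms"], attributes


# Hypothetical, already-parsed trace rows.
trace = [
    {"event": "Query Begin", "duration_ms": 0, "granularity": {}},
    {
        "event": "Query Subcube Verbose",
        "duration_ms": 700,
        "granularity": {
            "Product": {"Product": "0", "Product Category": "*"},
            "Date": {"Date": "0", "Calendar Year": "0"},
        },
    },
    {
        "event": "Query Subcube Verbose",
        "duration_ms": 12,
        "granularity": {"Date": {"Date": "0", "Calendar Year": "*"}},
    },
]
for duration, attributes in aggregation_candidates(trace):
    print(duration, attributes)
```

Only the 700 ms request is reported here, with the Product Category attribute of the Product dimension as its granularity; the 12 ms request is ignored because it is already fast.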
We need to make a note of all of the attributes which have anything other than a zero next to them and then build an aggregation using them; in this case it's just the Product Category attribute of the Product dimension. The white paper Identifying and Resolving MDX Query Performance Bottlenecks (again, written for Analysis Services 2005 but still relevant for Analysis Services 2008), available from http://tinyurl.com/mdxbottlenecks, includes more detailed information on how to interpret the information given by the Query Subcube Verbose event. So now that we know what aggregation we need to build, we need to go ahead and build it. We have a choice of tools to do this: we can either use the functionality built into BIDS, or we can use some of the excellent functionality that BIDS Helper provides. In BIDS, to view and edit aggregations, you need to go to the Aggregations tab in the cube editor. On the Standard View you only see a list of partitions and which aggregation designs they have associated with them; if you switch to the Advanced View by pressing the appropriate button on the toolbar, you can view the aggregations in each aggregation design for each measure group. If you right-click in the area where the aggregations are displayed you can also create a new aggregation and once you've done that you can specify the granularity of the aggregation by checking and unchecking the attributes on each dimension. For our particular query we only need to check the box next to the Product Categories attribute, as follows: The small tick at the top of the list of dimensions in the Status row shows that this aggregation has passed the built-in validation rules that BIDS applies to make sure this is a useful aggregation. If you see an amber warning triangle here, hover over it with your mouse and in the tooltip you'll see a list of reasons why the aggregation has failed its status check. 
If we then deploy and run a ProcessIndex, we can rerun our original query and watch it use the new aggregation, running much faster as a result:

The problem with the native BIDS aggregation design functionality is that it becomes difficult to use when you have complex aggregations to build and edit. The functionality present in BIDS Helper, while it looks less polished, is far more usable and offers many benefits over the BIDS native functionality, for example:

The BIDS Helper Aggregation Design interface displays the aggregation granularity in the same way (i.e. using 1s and 0s, as seen in the screenshot below) as the Query Subcube event in Profiler does, making it easier to cross-reference between the two.

It also shows attribute relationships when it displays the attributes on each dimension when you're designing an aggregation, as seen on the right-hand side in the screenshot that follows. This is essential to being able to build optimal aggregations.

It also shows whether an aggregation is rigid or flexible.

It has functionality to remove duplicate aggregations and ones containing redundant attributes (see below), and search for similar aggregations.

It allows you to create new aggregations based on the information stored in the Query Log.

It also allows you to delete unused aggregations based on information from a Profiler trace.

Finally, it has some very comprehensive functionality to allow you to test the performance of the aggregations you build (see http://tinyurl.com/testaggs).

Unsurprisingly, if you need to do any serious work designing aggregations manually we recommend using BIDS Helper over the built-in functionality.

Common aggregation design issues

Several features of your cube design must be borne in mind when designing aggregations, because they can influence how Analysis Services storage engine queries are made and therefore which aggregations will be used.
These include:
  • There's no point building aggregations above the granularity you are slicing your partitions by. Aggregations are built on a per-partition basis, so for example if you're partitioning by month there's no value in building an aggregation at the Year granularity, since no aggregation can contain more than one month's worth of data. It won't hurt if you do it; it just means that an aggregation at month will be the same size as one at year but useful to more queries. It follows from this that it might be possible to over-partition data and reduce the effectiveness of aggregations, but we have anecdotal evidence from people who have built very large cubes that this is not an issue.
  • For queries involving a dimension with a many-to-many relationship to a measure group, aggregations must not be built using any attributes from the many-to-many dimension, but instead must be built at the granularity of the attributes with a regular relationship to the intermediate measure group. When a query is run using the Sales Reason dimension, Analysis Services first works out which Sales Orders relate to each Sales Reason, and then queries the main measure group for these Sales Orders. Therefore, only aggregations at the Sales Order granularity on the main measure group can be used. As a result, in most cases it's not worth building aggregations for queries on many-to-many dimensions since the granularity of these queries is often close to that of the original fact table.
  • Queries involving measures which have semi-additive aggregation types are always resolved at the granularity attribute of the time dimension, so you need to include that attribute in all aggregations.
  • Queries involving measures with measure expressions require aggregations at the common granularity of the two measure groups involved.
  • You should not build aggregations including a parent/child attribute; instead you should use the key attribute in aggregations.
  • No aggregation should include an attribute which has AttributeHierarchyEnabled set to False.
  • No aggregation should include an attribute that is below the granularity attribute of the dimension for the measure group.
  • Any attributes which have a default member that is anything other than the All Member, or which have IsAggregatable set to False, should also be included in all aggregations.
  • Aggregations and indexes are not built for partitions with fewer than 4096 rows. This threshold is set by the IndexBuildThreshold property in msmdsrv.ini; you can change it but it's not a good idea to do so.
  • Aggregations should not include redundant attributes, that is to say attributes from the same 'chain' of attribute relationships. For example, if you had a chain of attribute relationships going from month to quarter to year, you should not build an aggregation including month and quarter; it should just include month. This will increase the chance that the aggregation can be used by more queries, as well as reducing the size of the aggregation.

Summary

In this part of the article we covered performance-specific design features such as partitions and aggregations. In the next part, we will cover MDX calculation performance and caching.
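The rule about redundant attributes in the list of design issues above can be checked mechanically. The following is a minimal sketch in Python, not part of Analysis Services or BIDS Helper: the `RELATED_TO` mapping and both function names are hypothetical, and the month, quarter, year chain is the example used in the text.

```python
# Hypothetical attribute-relationship chain from the text: month -> quarter -> year.
# Each key maps to the attribute it rolls up to (None ends the chain).
RELATED_TO = {"Month": "Quarter", "Quarter": "Year", "Year": None}


def ancestors(attribute, related_to):
    """Return every attribute reachable by following relationships upward."""
    found = []
    parent = related_to.get(attribute)
    while parent is not None:
        found.append(parent)
        parent = related_to.get(parent)
    return found


def remove_redundant(aggregation, related_to):
    """Drop attributes already implied by a lower-level attribute in the same chain."""
    implied = set()
    for attribute in aggregation:
        implied.update(ancestors(attribute, related_to))
    return [a for a in aggregation if a not in implied]


print(remove_redundant(["Month", "Quarter"], RELATED_TO))  # ['Month']
```

An aggregation on Month and Quarter collapses to Month alone, because Quarter can always be derived from Month through the attribute relationship.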
Packt
20 Oct 2009
5 min read

Adding Interactive Course Material in Moodle 1.9: Part 3

Editing a Quiz

Immediately after saving the Settings page, you are taken to the Editing Quiz page. This page is divided into five tabs. Each tab enables you to edit a different aspect of the quiz.

This tab... Enables you to...

Quiz
  • Add questions to the quiz.
  • Remove questions from the quiz.
  • Arrange the questions in order.
  • Create page breaks between questions.
  • Assign a point value to each question.
  • Assign a maximum point value to the quiz.
  • Click into the editing page for each question.

Questions
  • Create a new question. Note that you must then add the new question to the quiz under the Quiz tab (see above). Also note that every question must belong to a category.
  • Delete a question, not just from the quiz but from your site's question bank.
  • Move a question from one category to another category.
  • Click into the editing page for each question.
  • Click into the editing page for each category.

Categories
  • Arrange the list of categories in order.
  • Nest a category under another category (they become parent and subcategories).
  • Publish a category, so that questions in that category can be used by other courses on the site.
  • Delete a category (you must choose a new category to move the questions in the deleted category).

Import
  • Import questions from other learning systems.
  • Import questions that were exported from Moodle.

Export
  • Export questions from Moodle, and save them in a variety of formats that Moodle and other learning systems can understand.

Create and Edit Question Categories

Every question belongs to a category. You manage question categories under the Categories tab. There will always be a Default category. But before you create new questions, you might want to check to ensure that you have an appropriate category in which to put them. The categories which you can manage are listed on this page.

To Add a New Category

To add a new category, first select its Parent. If you select Top, the category will be a top-level category.
Or, you can select any other category to which you have access, and then the new category will be a child of the selected category. In the Category field, enter the name for the new category. In the Category Info field, enter a description of the new category. The Publish field determines whether other courses can use the questions in this category. Click the Add button.

To Edit a Category

Next to the category, click the icon. The Edit categories page is displayed. You can edit the Parent, Category name, Category Info, and Publish setting. When you are finished, click the Update button. Your changes are saved and you are returned to the Categories page.

Managing the Proliferation of Questions and Categories

As the site administrator, you might want to monitor the creation of new question categories to ensure that they are logically named, don't have a lot of overlap, and are appropriate for the purpose of your site. As these questions and their categories are shared among course creators, they can be a powerful tool for collaboration. Consider using the site-wide Teachers forum to notify your teachers and course creators of new questions and categories.

Create and Manage Questions

You create and manage questions under the Questions tab. The collection of questions in your site is called the Question bank. As a teacher or the course creator, you have access to some or all of the questions in the question bank. When you create questions, you add them to your site's question bank. When you create a quiz, you choose questions from the question bank for the quiz. Both these functions can be done on the same Editing Quiz page. Pay attention to which part of the page you are using: the one for creating new questions or the one for drawing questions from the question bank.

Display Questions from the Bank

You can display questions from one category at a time. To select that category, use the Category drop-down list.
If a question is deleted when it is still being used by a quiz, then it is not removed from the question bank. Instead, the question is hidden. The setting Also show old questions enables you to see questions that were deleted from the category. These deleted, or hidden, or old questions appear in the list with a blue box next to them. To keep your question bank clean, and to prevent teachers from using deleted questions, you can move all the deleted questions into a category called Deleted questions. Create the category Deleted questions and then use Also show old questions to show the deleted questions. Select them, and move them into Deleted questions. Move Questions between Categories To move a question into a category, you must have access to the target category. This means that the target category must be published, so that the teachers in all the courses can see it. Select the question(s) to move, select the category, and then click the Move to>> button: Create a Question To create a new question, from the Create new question drop-down list, select the type for the next question: This brings you to the editing page for the question: After you save the question, it is added to the list of questions in that category: Question Types The following chart explains the types of questions you can create, and gives some tips for using them.  
Packt
20 Oct 2009
4 min read

Modifying an Existing Theme in Drupal 6: Part 2

Adapting the CSS We've set up Tao as a subtheme of the Zen theme. As a result, the Tao theme relies upon a number of stylesheets, both in the Tao directory and in the parent theme's directory. The good news is that we do not need to concern ourselves with hacking away at all these various stylesheets, we can instead place all our changes in the tao.css file, located in the Tao theme directory. Drupal will give precedence to the styles defined in the theme's .css file, in the event of any conflicting definitions. Precedence and inheritance Where one style definition is in an imported stylesheet and another in the immediate stylesheet, the rule in the immediate stylesheet (the one that is importing the other stylesheet) takes precedence. Where repetitive definitions are in the same stylesheet, the one furthest from the top of the stylesheet takes precedence in the case of conflicts; where repetitive definitions are in the same stylesheet, nonconflicting attributes will be inherited. Setting the page dimensions For this exercise, the goal is to create a fixed width theme optimized for display settings of 1024 x 768. Accordingly, one of the most basic changes we need to make is to the page dimensions. If you look at the page.tpl.php file, you will notice that the entire page area is wrapped with a div with the id=page. Open up the tao.css file and alter it as follows. To help avoid precedence problems, place all your style definitions at the end of the stylesheet. Let's modify the selector #page. #page { width: 980px; margin: 0 auto; border-left: 4px solid #666633; border-right: 4px solid #666633; background-color: #fff;} In this case, I set page width to 980 pixels, a convenient size that works consistently across systems, and applied the margin attribute to center the page. I have also applied the border-left and border-right styles and set the background color. 
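As a quick check of why 980 pixels suits a display that is 1024 pixels wide, the arithmetic can be sketched in Python. This assumes the browser's default content-box sizing, where borders are added to the declared width, and that #page carries no horizontal padding; the function name is purely illustrative.

```python
def rendered_width(content_width, border_left, border_right):
    """Total horizontal space a box occupies under content-box sizing."""
    return content_width + border_left + border_right


# Values taken from the #page rule and the target display in the text.
page_width = rendered_width(content_width=980, border_left=4, border_right=4)
spare = 1024 - page_width  # room left for a scrollbar and window chrome

print(page_width, spare)  # 988 36
```

The framed page occupies 988 pixels, which leaves 36 pixels of slack at the target resolution.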
We also need to add a little space between the frame and the content area to keep the presentation readable and clean. The selector #content-area helps us here as a convenient container:

#content-area { padding: 0 20px; }

Formatting the new regions

Let's begin by using CSS to position and format the two new regions, page top and banner. When we placed the code for the two new regions in our page.tpl.php file, we wrapped them both with divs. Page top was wrapped with the div page-top, so let's create that in our tao.css file:

#page-top { margin: 0; background-color: #676734; width: 980px; height: 25px; text-align: right; }

The region banner was wrapped with a div of the same name, so let's now define that selector as well:

#banner { background-color: #fff; width: 980px; height: 90px; text-align: center; }

Setting fonts and colors

Some of the simplest CSS work is also some of the most important: setting font styles and the colors of the elements. Let's start by setting the default fonts for the site. I'm going to use the body tag as follows:

body { background: #000; min-width: 800px; margin: 0; padding: 0; font: 13px Arial, Helvetica, sans-serif; color: #111; line-height: 1.4em; }

Now, let's add various other styles to cover more specialized text, like links and titles:

a, a:link, a:visited { color: #666633; text-decoration: none; }
a:hover, a:focus { text-decoration: underline; }
h1.title, h1.title a, h1.title a:hover { font-family: Verdana, Arial, Helvetica, sans-serif; font-weight: normal; color: #666633; font-size: 200%; margin: 0; line-height: normal; }
h1, h1 a, h1 a:hover { font-size: 140%; color: #444; font-family: Verdana, Arial, Helvetica, sans-serif; margin: 0.5em 0; }
h2, h2 a, h2 a:hover, .block h3, .block h3 a { font-size: 122%; color: #444; font-family: Verdana, Arial, Helvetica, sans-serif; margin: 0.5em 0; }
h3 { font-size: 107%; font-weight: bold; font-family: Verdana, Arial, Helvetica, sans-serif; }
h4, h5, h6 { font-weight: bold; font-family: Verdana, Arial, Helvetica, sans-serif; }
#logo-title { margin: 10px 0 0 0; position: relative; background-color: #eaebcd; height: 60px; border-top: 1px solid #676734; padding-top: 10px; padding-bottom: 10px; border-bottom: 1px solid #676734; }
#site-name a, #site-name a:hover { font-family: Verdana, Arial, sans-serif; font-weight: normal; color: #000; font-size: 176%; margin-left: 20px; padding: 0; }
#site-slogan { color: #676734; margin: 0; font-size: 90%; margin-left: 20px; margin-top: 10px; }
.breadcrumb { padding-top: 0; padding-bottom: 10px; padding-left: 20px; }
#content-header .title { padding-left: 20px; }

After you have made the changes above, remember to go back and comment out any competing definitions that may cause inheritance problems.
Packt
20 Oct 2009
25 min read

Data Migration Scenarios in SAP Business ONE Application- part 1

Just recently, I found myself in a data migration project that served as an eye-opener. Our team had to migrate a customer system that utilized Act! and Peachtree. Neither system is known for making its data easily accessible. In fact, Peachtree is a non-SQL database that does not enforce data consistency. Act! also uses a proprietary table system that is based on a non-SQL database. The general migration logic was rather straightforward. However, our team found that the migration and consolidation of data into the new system posed multiple challenges, not only on the technical front, but also for the customer when it came to verifying the data. We used the cutting-edge tool xFusion Studio for data migration. This tool allows migrating and synchronizing data by using simple and advanced SQL data massaging techniques. The xFusion Studio tool has a graphical representation of how the data flows from the source to the target. When I looked at one section of this graphical representation, I started humming the song Welcome to the Jungle. Take a look at the following screenshot and find out why Guns N' Roses may have provided the soundtrack for this data migration project: What we learned from the above screenshot is quite obvious, and I have dedicated this article to helping you overcome these potential issues. Keep it simple and focus on information rather than data. Having more data does not always mean you have more information. Sometimes, it just means a data jungle has been created. Making the right decisions at key milestones during the migration can keep the project simple and help ensure its success. Your goal should be to consolidate the islands of data into a more efficient and consistent database that provides real-time information.
What you will learn about data migration In order to accomplish the task of migrating data from different sources into SAP Business ONE application, a strategy must be designed that addresses the individual needs of the project at hand. The data migration strategy uses proven processes and templates. The data migration itself can be managed as a mini project depending on the complexity. During the course of this article, the following key topics will be covered. The goal is to help you make crucial decisions, which will keep a project simple and manageable: Position the data migration tasks in the project plan – We will start by positioning the data migration tasks in the project plan. I will further define the required tasks that you need to complete as a part of the data migration. Data types and scenarios – With the general project plan structure in place, it is time to cover the common terms related to data migration. I will introduce you to the main aspects, such as master data and transactional data, as well as the impact they have on the complexity of data migration. SAP tools available for migration – During the course of our case study, I will introduce you to the data migration tools that come with SAP. However, there are also more advanced tools for complex migrations. You will learn about the main player in this area and how to use it. Process of migration – To avoid problems and guarantee success, the data migration project must follow a proven procedure. We will update the project plan to include the procedure and will also use the process during our case study. Making decisions about what data to bring – I mentioned that it is important to focus on information versus data. With the knowledge of the right tools and procedures, it is a good time to summarize the primary known issues and explain how to tackle them. The project plan We are still progressing in Phase 2 – Analysis and Design. 
The data migration is positioned in the Solution Architecture section and is called Review Data Conversion Needs (Amount and Type of Data). A thorough evaluation of the data conversion needs will also cover the next task in the project plan, called Review Integration Points with any 3rd Party Solution. As you can see, the data migration task stands as a small task in the project plan. But as mentioned earlier, it can wind up being a large project depending on the number and size of data sources that need to be migrated. To honor this, we will add some more details to this task. As the task name suggests, we must review data conversion needs and identify the amount and type of data. This simple task must be structured in phases, just like the entire project. Therefore, data migration needs to go through the following phases to be successful:
1. Design: identify all of the data sources
2. Extraction of data into Excel or SQL for review and consistency
3. Review of data and verification (via customer feedback)
4. Load into the SAP system and verification
Note that the validation process and the consequential load could be iterative processes. For example, if the validated data has many issues, it only makes sense to perform a load into SAP if an additional verification takes place before the load. You only want to load data into an SAP system for testing if you know that the quality of the records to be loaded is good. Therefore, new phases were added in the project plan (seen below). Please do this in your project too, based on the actual complexity and the number of data sources you have. A thorough look at the tasks above will be provided when we talk about the process of migration. Before we do that, the basic terms related to data migration will be covered.

Data sources: where is my data?

There is a great variety in the potential types of data sources. We will now identify the most common sources and explain their key characteristics.
However, if there is a source that is not mentioned here, you can still migrate the data easily by transitioning it into one of the following formats. Microsoft Excel and text data migration The most common format for data migration is Excel, or text-based files. Text-based files are formatted using commas or tabs as field separators. When a comma is used as a field separator, the file format is referred to as Comma Separated Values (CSV). Most of the migration templates and strategies are based on Excel files that have specific columns where you can manually enter data, or copy and paste larger chunks. Therefore, if there is any way for you to extract data from your current system and present it in Excel, you have already done a great deal of data migration work. Microsoft Access An Access database is essentially an Excel sheet on a larger scale with added data consistency capability. It is a good idea to consider extracting Access tables to Excel in order to prepare for data migration. SQL If you have very large sets of data, then instead of using Excel, we usually employ an SQL database. The database then has a set of tables instead of Excel sheets. Using SQL tables, we can create SQL statements that can verify data and analyze results sets. Please note that you can use any SQL database, such as Microsoft SQL Server, Oracle, IBM DB, and so on. SaaS (Netsuite, Salesforce) SaaS stands for Software as a Service. Essentially, it means you can use software functionality based on a subscription. However, you don't own the solution. All of the hardware and software is installed at the service center, so you don't need to worry about hardware and software maintenance. However, keep in mind that these services don't allow you to manage the service packs according to your requirements. You need to adjust your business to the schedule of the SaaS company. 
If you are migrating from a modern SaaS solution, such as Salesforce or Netsuite, you will probably know that the data is not at your site, but rather stored at your solution hosting provider. Getting the data out to migrate to another solution may be done by obtaining reports, which could then be saved in an Excel format. Other legacy data The term legacy data is often mentioned when evaluating larger old systems. Legacy data basically comprises a large set of data that a company is using on mostly obsolete systems. AS/400 or Mainframe The IBM AS/400 is a good example of a legacy data source. Experts who are capable of extracting data from these systems are highly sought after, and so the budget must be on a higher scale. AS/400 data can often be extracted into a text or an Excel format. However, the data may come without headings. The headings are usually documented in a file that describes the data. You need to make sure that you get the file definitions, without which the pure text files may be meaningless. In addition, the media format is worth considering. An older AS/400 system may utilize a backup tape format which is not available on your Intel server. Peachtree, QuickBooks, and Act! Another potential source for data migration may be a smaller PC-based system, such as Peachtree, QuickBooks, or Act!. These systems have a different data format, and are based on non-SQL databases. This means the data cannot be accessed via SQL. In order to extract data from those systems, the proprietary API must be used. For example, if Peachtree displays data in the applications forms, it uses the program logic to put the pieces together from different text files. Getting data out from these types of systems is difficult and sometimes impossible. It is recommended to employ the relevant API to access the data in a structured way. You may want to run reports and export the results to text or Excel. 
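The point about headerless legacy extracts and their separate file definitions can be illustrated with a short Python sketch. Everything named here is hypothetical: the field names, the sample rows, and the `apply_definition` helper are invented for illustration and do not describe any particular AS/400 export.

```python
import csv
import io

# Hypothetical file definition: the column headings documented separately
# from the data, in the order the fields appear in the extract.
FILE_DEFINITION = ["CustomerNo", "Name", "City"]

# Hypothetical headerless, comma-separated extract.
RAW_EXTRACT = "1001,Acme Tools,Chicago\n1002,Blue Sky Ltd,Denver\n"


def apply_definition(raw_text, field_names):
    """Attach documented headings to headerless rows; reject rows of the wrong length."""
    records = []
    for line_no, row in enumerate(csv.reader(io.StringIO(raw_text)), start=1):
        if len(row) != len(field_names):
            raise ValueError(
                f"line {line_no}: expected {len(field_names)} fields, got {len(row)}"
            )
        records.append(dict(zip(field_names, row)))
    return records


for record in apply_definition(RAW_EXTRACT, FILE_DEFINITION):
    print(record)
```

Rejecting rows whose field count does not match the definition is a deliberate choice in this sketch: it surfaces a mismatched or outdated file definition early, before the data reaches the verification stage.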
Data classification in SAP Business ONE There are two main groups of data that we will migrate to the SAP Business ONE application: master data and transaction data. Master data Master data is the basic information that SAP Business ONE uses to record transactions (for example, business partner information). In addition, information about your products, such as items, finished goods, and raw materials are considered master data. Master data should always be migrated if possible. It can easily be verified and structured in an Excel or SQL format. For example, the data could be displayed using Excel sheets. You can then quickly verify that the data is showing up in the correct columns. In addition, you can see if the data is broken down into its required components. For example, each Excel column should represent a target field in SAP Business ONE. You should avoid having a single column in Excel that provides data for more than one target in SAP Business ONE. Transaction data Transaction data are proposals, orders, invoices, deliveries, and other similar information that comprise a combination of master data to create a unique business document. Customers often will want to migrate historical transactions from older systems. However, the consequences of doing this may have a landslide effect. For example, inventory is valuated based on specific settings in the finance section of a system. If these settings are not identical in the new system, transactions may look different in the old and the new system. This makes the migration very risky as the data verification becomes difficult on the customer side. I recommend making historical transactions available via a reporting database. For example, often, sales history must be available when migrating data. You can create a reporting database that provides sales history information. The user can use this data via reports within the SAP Business ONE application. 
Therefore, closed transactions should be migrated via a reporting database . Closed transactions are all of the business-related activities that were fully completed in the old system. Open transactions, on the other hand, are all of the business-related activities that are currently not completed. It makes sense that the open transactions be migrated directly to SAP, and not to a history database because they will be completed within the new SAP system. As a result of the data migration, you would be able to access sales history information from within SAP by accessing a reporting database. Open transactions will be completed within SAP, and then consequently lead to new transactions in SAP. Create a history database for sales history and manually enter open transactions. SAP DI-API Now that we know the main data types for an SAP migration, and the most common sources, we can take a brief look at the way the data is inserted into the SAP system. Based on the SAP guidelines, you are not allowed to insert data directly in the underlying SQL tables. The reason for that is that it can cause inconsistencies. When SAP works with the database, multiple tables are often updated. If you manually update a table to insert data, there is a good chance that another table has a link that also requires updating. Therefore, unless you know the exact table structure for the data you are trying to update, don't mess with the SAP SQL tables. If you carefully read this and understand the table structure, you will now know that there may be situations where you decide to access the tables directly. If you decide to insert data directly into the SAP database tables, you run the risk of losing your warranty. Migration scenarios and key decisions Data migration not only takes place as a part of a new SAP implementation, but also if you have a running system and you want to import leads or a list of new items. 
Therefore, it is a good idea to learn about the scenarios that you may come across and be able to select the right migration and integration tools. As outlined before, data can be divided into two groups: master data and transaction data. It is important that you separate the two, and structure each data migration accordingly. Master data is an essential component for manifesting transactions. Therefore, even if you need to bring over transactional data, the master data must already be in place. Always start with the master data alongside a verification procedure, and then continue with the relevant transaction data. Let’s now briefly look at the most common situations where you may require the evaluation of potential data migration options. New company (start-up) In this setup, you may not have extensive amounts of existing data to migrate. However, you may want to bring over lead lists or lists of items. During the course of this article, we will import a list of leads into SAP using the Excel Import functionality. Many new companies require the capability to easily import data into SAP. As you already know by now, the import of leads and item information will be considered as importing master data. Working with this master data by entering sales orders and so forth, would constitute transaction data. Transaction data is considered closed if all of the relevant actions are performed. For example, a sales order is considered closed if the items are delivered, invoiced, and paid for. If the chain of events is not complete, the transaction is open. Islands of data scenario This is the classic situation for an SAP implementation. You will first need to identify the available data sources and their formats. Then, you select the master data you want to bring over. With multiple islands of data, an SAP master record may result from more than one source. A business partner record may come, in part, from an existing accounting system, such as QuickBooks or Peachtree. 
Whereas other parts may come from a CRM system, such as Act!. For example, the billing information may be retrieved from the finance system and the relevant lead and sales information, such as specific contacts and notes, may come from the CRM system. In such a case, you need to merge this information into a new consistent master record in SAP. For this situation, first manually put the pieces together. Once the manual process works, you can attempt to automate the process. Don't try to directly import all of the data. You should always establish an intermediary level that allows for data verification. Only then import the data into SAP. For example, if you have QuickBooks and Act!, first merge the information into Excel for verification, and then import it into SAP. If the amount of data is large, you can also establish an SQL database. In that case, the Excel sheets would be replaced by SQL tables. IBM legacy data migration The migration of IBM legacy data is potentially the most challenging because the IBM systems are not directly compatible with Windows-based systems. Therefore, almost naturally, you will establish a text-based, or an Excel-formatted, representation of the IBM data. You can then proceed with verifying the information. SQL migration The easiest migration type is obviously the one where all of the data is already structured and consistent. However, you will not always have documentation of the table structure where the data resides. In this case, you need to create queries against the SQL tables to verify the data. The queries can then be saved as views. The views you create should always represent a consistent set of information that you can migrate. For example, if you have one table with address information, and another table with customer ID fields, you can create a view that consolidates this information into a single consistent set. 
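The idea of a view that consolidates an address table and a customer table can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than the SQL server you would actually migrate from; the table names, column names, and sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (customer_id TEXT, street TEXT, city TEXT);
    INSERT INTO customers VALUES ('C001', 'Acme Tools'), ('C002', 'Blue Sky Ltd');
    INSERT INTO addresses VALUES ('C001', '1 Main St', 'Chicago');

    -- The view presents one consistent row per customer for migration.
    CREATE VIEW customer_migration AS
    SELECT c.customer_id, c.name, a.street, a.city
    FROM customers AS c
    LEFT JOIN addresses AS a ON a.customer_id = c.customer_id;
""")

# Verification query: customers that would migrate without an address.
missing = conn.execute(
    "SELECT customer_id FROM customer_migration WHERE street IS NULL"
).fetchall()
print(missing)  # [('C002',)]
```

A LEFT JOIN is used here so that customers without an address still appear in the view and can be flagged during verification instead of silently dropping out of the migration set.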
Process of migration for your project

I briefly touched upon the most common data migration scenarios so you can get a feel for the process. As you can see, whatever the source of data is, we always attempt to create an intermediary platform that allows the data to be verified. This intermediary platform is most commonly Excel or an SQL database. The process of data migration has the following subtasks:
1. Identify available data sources
2. Structure data into master data and transaction data
3. Establish an intermediary platform with Excel or SQL
4. Verify data
5. Match data columns with Excel templates
6. Run migration based on templates and verify data
Based on this procedure, I have added more detail to the project plan. As you can see in this example, based on the required level of detail, we can make adjustments to the project plan to address the requirements:

SAP standard import features

Let's take a look at the available data exchange features in SAP. SAP provides two main tools for data migration. The first option is to use the available menu in the SAP Business ONE client interface to exchange data. The other option is to use the Data Transfer Workbench (DTW).

Standard import/export features: walk-through

You can reach the Import from Excel form via Administration | Data Import/Export. As you can see in the following screenshot, in the top right section of the form, the type of import is a drop-down selection. The options are BP and Items. In the screenshot, we have selected BP, which allows business partner information to be imported. There are drop-down fields that you can select based on the data you want to import. However, keep in mind that certain fields are mandatory, such as the BP Code field, whereas others are optional. The fields you select are associated with a column as you can see here: If you want to find out if a field is mandatory or not, simply open SAP and attempt to enter the data directly in the relevant SAP form.
For example, if you are trying to import business partner information, enter the fields you want to import and see if the record can be saved. If you are missing any mandatory fields, SAP will provide an error message. You can modify the data that you are planning to import based on that. When you click on the OK button in the Import from Excel form (seen above), the Excel sheet with all of the data needs to be selected. In the following screenshot, you can see how the Excel sheet in our example looks. For example, column A has all of the BP Codes. This is in line with the mapping of columns to fields that we can see on the Import from Excel form. Please note that the file we select must be in a .txt format. For this example, I used the Save As feature in Excel (seen in the following screenshot) to save the file in the Text MS-DOS (*.txt) format. I was then able to select the BP Migration.txt file. This is actually a good thing because it points to the fact that you can use any application that can save data in the .txt format as the data source. The following screenshot shows the Save As screen: I imported the file and a success message confirms that the records were imported into SAP: A subsequent check in SAP confirms that the BP records that I had in the text file are now available in SAP: In the example, we only used two records. It is recommended to start out with a limited number of records to verify that the import is working. Therefore, you may start by reducing your import file to five records. This has the advantage of the import not taking a long time and you can immediately verify the result. See the following screenshot: Sometimes, it is not clear what kind of information SAP expects when importing. For example, at first Lead, Customer, Vendor were used in Column C to indicate the type of BP that was to be imported. However, this resulted in an error message upon completion of the import. 
Therefore, system information was activated to check what information SAP requires for the BP Type representation. As you can see in the screenshot of the Excel sheet you get when you click on the OK button in the Import from Excel form, the BP Type information is indicated by only one letter using L, C, or V. In the example screenshot above, you can clearly see L in the lower left section. The same thing is done for Country in the Addresses section. You can try that by navigating to Administration | Sales | Countries, and then hovering over the country you will be importing. In my example, USA was internally represented by SAP as US. It is a minor issue. However, when importing data, all of these issues need to be addressed. Please note that the file you are trying to import should not be open in Excel at the same time, as this may trigger an error. The Excel or text file does not have a header with a description of the data.

Standard import/export features for your own project

SAP's standard import functionality for business partners and items is very straightforward. For your own project, you can prepare an Excel sheet for business partners and items. If you need to import BP or item information from another system, you can get this done quickly. If you get an error during the import process, try to manually enter the data in SAP. In addition, you can use the System Information feature to identify how SAP stores information in the database. I recommend you first create an Excel sheet with a maximum of two records to see if the basic information and data format is correct. Once you have this running, you can add all of the data you want to import. Overall, this functionality is a quick way to get your own data into the system. This feature can also be used in case you regularly receive address information. For example, if you have salespeople visiting trade fairs, you can provide them with the Excel sheet that you may have prepared for BP import.
The salespeople can directly add their information there. Once they return from the trade fair with the Excel files, you can easily import the information into SAP and schedule follow-up activities using the Opportunity Management System. The item import is useful if you work with a vendor who updates his or her price lists and item information on a monthly basis. You can prepare an Excel template where the item information will regularly be entered and you can easily import the updates into SAP.

Data Transfer Workbench (DTW)

The SAP standard import/export features are straightforward, but may not address the full complexity of the data that you need to import. For this situation, you may want to evaluate the SAP Data Transfer Workbench (DTW). The functionality of this tool provides a greater level of detail to address the potential data structures that you want to import. To understand the basic concept of the DTW, it is a good idea to look at the different master data sections in SAP as business objects. A business object groups related information together. For example, BP information can have much more detail than what was previously shown in the standard import.

The DTW templates and business objects

To better understand the business object metaphor, you need to navigate to the DTW directory and evaluate the Templates folder. The templates are organized by business objects. The oBusinessPartners business object is represented by the folder with the same name (seen below). In this folder, you can find Excel template files that can be used to provide information for this type of business object. The following objects are available as Excel templates:

• BPAccountReceivables
• BPAddresses
• BPBankAccounts
• BPPaymentDates
• BPPaymentMethods
• BPWithholdingTax
• BusinessPartners
• ContactEmployees

Please notice that these templates are Excel .xlt files, which is the Excel template extension.
It is a good idea to browse through the list of templates and see the relevant templates. In a nutshell, you essentially add your own data to the templates and use DTW to import the data.

Connecting to DTW

In order to work with DTW, you need to connect to your SAP system using the DTW interface. The following screenshot shows the parameters I used to connect to the Lemonade Stand database:

Once you are connected, a wizard-type interface walks you through the required steps to get started. Look at the next screenshot:

The DTW examples and templates

There is also an example folder in the DTW installation location on your system. This example folder has information about how to add information to your Excel templates. The following screenshot shows an example for business partner migration. You can see that the Excel template does have a header line on top that explains the content in the particular column. The actual template files also have comments in the header file, which provide information about the data format expected, such as String, Date, and so on. See the example in this screenshot:

The actual template is empty and you need to add your information as shown here:

DTW for your own project

If you realize that the basic import features in SAP are not sufficient, and your requirements are more challenging, evaluate DTW. Think of the data you want to import as business objects where information is logically grouped. If you are able to group your data together, you can modify the Excel templates with your own information. The DTW example folder provides working examples that you can use to get started. Please note that you should establish a test database before you start importing data this way. This is because once new data arrives in SAP, you need to verify the results based on the procedure discussed earlier. In addition, be prepared to fine-tune the import.
Often, an import and data verification process takes four attempts of data importing and verification.

Summary

In this article, we covered the tasks related to data migration. This also included some practical examples for simple data imports related to business partner information and items. In addition, more advanced topics were covered by introducing the SAP DTW (Data Transfer Workbench) and the related aspects to get you started. During the course of this article, we positioned the data migration task in the project plan. The project plan was then fine-tuned with more detail to do justice to the potential complexity of a data migration project. The data migration tasks established a process, from design to data mapping and verification of the data. Notably, the establishment of an intermediary data platform was recommended for your projects. This will help you verify data at each step of the migration. The key message of keeping it simple will be the basis for every migration project. The data verification task ensures simplicity and the quality of your data.

If you have read this article you may be interested to view:

• Competitive Service and Contract Management in SAP Business ONE Implementation: Part 1
• Competitive Service and Contract Management in SAP Business ONE Implementation: Part 2
• Data Migration Scenarios in SAP Business ONE Application- part 2

Packt
20 Oct 2009
21 min read

Query Performance Tuning in Microsoft Analysis Services: Part 2

MDX calculation performance

Optimizing the performance of the Storage Engine is relatively straightforward: you can diagnose performance problems easily and you only have two options, partitioning and aggregation, for solving them. Optimizing the performance of the Formula Engine is much more complicated because it requires knowledge of MDX, diagnosing performance problems is difficult because the internal workings of the Formula Engine are hard to follow, and solving the problem is reliant on knowing tips and tricks that may change from service pack to service pack.

Diagnosing Formula Engine performance problems

If you have a poorly-performing query, and if you can rule out the Storage Engine as the cause of the problem, then the issue is with the Formula Engine. We've already seen how we can use Profiler to check the performance of Query Subcube events, to see which partitions are being hit and to check whether aggregations are being used; if you subtract the sum of the durations of all the Query Subcube events from the duration of the query as a whole, you'll get the amount of time spent in the Formula Engine. You can use MDX Studio's Profile functionality to do the same thing much more easily; here's a screenshot of what it outputs when a calculation-heavy query is run:

The following blog entry describes this functionality in detail: http://tinyurl.com/mdxtrace. What this screenshot displays is essentially the same thing that we'd see if we ran a Profiler trace when running the same query on a cold and warm cache, but in a much more easy-to-read format. The column to look at here is the Ratio to Total, which shows the ratio of the duration of each event to the total duration of the query. We can see that on both a cold cache and a warm cache the query took almost ten seconds to run but none of the events recorded took anywhere near that amount of time: the highest ratio to parent is 0.09%. This is typical of what you'd see with a Formula Engine-bound query.
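The subtraction described above can be sketched in a few lines of Python. The durations are made-up figures, not values taken from the screenshot, and the function name is hypothetical; the sketch also assumes the Query Subcube events do not overlap in time, which may not hold when the Storage Engine works on several of them in parallel.

```python
def formula_engine_time_ms(query_duration_ms, subcube_durations_ms):
    """Estimate Formula Engine time as total query time minus Storage Engine time."""
    storage_engine_ms = sum(subcube_durations_ms)
    return query_duration_ms - storage_engine_ms, storage_engine_ms

# Hypothetical trace figures: a query of almost ten seconds with tiny subcube events.
total_ms = 9800
subcube_ms = [9, 4, 6]

fe_ms, se_ms = formula_engine_time_ms(total_ms, subcube_ms)
fe_share = fe_ms / total_ms  # fraction of the query spent outside the Storage Engine
print(fe_ms, se_ms, round(fe_share, 4))
```

A share close to 1 is the pattern the text describes as Formula Engine-bound; a share close to 0 would point back at partitioning and aggregations instead.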
Another hallmark of a query that spends most of its time in the Formula Engine is that it will only use one CPU, even on a multiple-CPU server. This is because the Formula Engine, unlike the Storage Engine, is single-threaded. As a result, if you watch CPU usage in Task Manager while you run a query you can get a good idea of what's happening internally: high usage of multiple CPUs indicates work is taking place in the Storage Engine, while high usage of one CPU indicates work is taking place in the Formula Engine.

Calculation performance tuning

Once you have worked out that the Formula Engine is the cause of a query's poor performance, the next step is to try to tune the query. In some cases you can achieve impressive performance gains (sometimes of several hundred percent) simply by rewriting a query and the calculations it depends on; the problem is knowing how to rewrite the MDX and working out which calculations contribute most to the overall query duration. Unfortunately Analysis Services doesn't give you much information to use to solve this problem and there are very few tools out there which can help either, so doing this is something of a black art.

There are three main ways you can improve the performance of the Formula Engine: tune the structure of the cube it's running on, tune the algorithms you're using in your MDX, and tune the implementation of those algorithms so they use functions and expressions that Analysis Services can run efficiently. We've already talked in depth about how the overall cube structure is important for the performance of the Storage Engine and the same goes for the Formula Engine; the only thing to repeat here is the recommendation that if you can avoid doing a calculation in MDX by doing it at an earlier stage, for example in your ETL or in your relational source, without compromising functionality, you should do so. We'll now go into more detail about tuning algorithms and implementations.
Mosha Pasumansky's blog, http://tinyurl.com/moshablog, is a goldmine of information on this subject. If you're serious about learning MDX we recommend that you subscribe to it and read everything he's ever written.

Tuning algorithms used in MDX

Tuning an algorithm in MDX is much the same as tuning an algorithm in any other kind of programming language: it's more a matter of understanding your problem and working out the logic that provides the most efficient solution than anything else. That said, there are some general techniques that can be used often in MDX and which we will walk through here.

Using named sets to avoid recalculating set expressions

Many MDX calculations involve expensive set operations, a good example being rank calculations where the position of a tuple within an ordered set needs to be determined. The following query displays Dates on the Rows axis and, on columns, shows a calculated measure that returns the rank of each date within the set of all dates, based on the value of the Internet Sales Amount measure:

WITH
MEMBER MEASURES.MYRANK AS
  Rank
  (
    [Date].[Date].CurrentMember
   ,Order
    (
      [Date].[Date].[Date].MEMBERS
     ,[Measures].[Internet Sales Amount]
     ,BDESC
    )
  )
SELECT
  MEASURES.MYRANK ON 0
 ,[Date].[Date].[Date].MEMBERS ON 1
FROM [Adventure Works]

It runs very slowly, and the problem is that every time the calculation is evaluated it has to evaluate the Order function to return the set of ordered dates. In this particular situation, though, you can probably see that the set returned will be the same every time the calculation is called, so it makes no sense to do the ordering more than once.
Instead, we can create a named set to hold the ordered set and refer to that named set from within the calculated measure, like so:

WITH
SET ORDEREDDATES AS
  Order
  (
    [Date].[Date].[Date].MEMBERS
   ,[Measures].[Internet Sales Amount]
   ,BDESC
  )
MEMBER MEASURES.MYRANK AS
  Rank
  (
    [Date].[Date].CurrentMember
   ,ORDEREDDATES
  )
SELECT
  MEASURES.MYRANK ON 0
 ,[Date].[Date].[Date].MEMBERS ON 1
FROM [Adventure Works]

This version of the query is many times faster, simply as a result of improving the algorithm used; the problem is explored in more depth in this blog entry: http://tinyurl.com/mosharank

Since normal named sets are only evaluated once they can be used to cache set expressions in some circumstances; however, the fact that they are static means they can be too inflexible to be useful most of the time. Note that normal named sets defined in the MDX Script are only evaluated once, when the MDX Script executes and not in the context of any particular query, so it wouldn't be possible to change the example above so that the set and calculated measure were defined on the server. Even named sets defined in the WITH clause are evaluated only once, in the context of the WHERE clause, so it wouldn't be possible to crossjoin another hierarchy on columns and use this approach, because for it to work the set would have to be reordered once for each column. The introduction of dynamic named sets in Analysis Services 2008 improves the situation a little, and other more advanced techniques can be used to work around these issues, but in general named sets are less useful than you might hope.
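The algorithmic point, ordering once instead of once per cell, is not specific to MDX. The following Python sketch is only an analogy under stated assumptions: the dates and sales figures are invented, and nothing here reflects how Analysis Services evaluates Rank or Order internally.

```python
# Invented sample data: sales amount per date.
sales = {"2008-01-01": 120.0, "2008-01-02": 340.0, "2008-01-03": 95.0}

def rank_resorting(day):
    # Analogous to calling Order(...) inside the calculated measure:
    # the full set is re-sorted on every single call.
    ordered = sorted(sales, key=sales.get, reverse=True)
    return ordered.index(day) + 1

# Analogous to the named set: order once, then look positions up.
ordered_once = sorted(sales, key=sales.get, reverse=True)
position = {day: i + 1 for i, day in enumerate(ordered_once)}

ranks = {day: position[day] for day in sales}
print(ranks)
```

Both approaches return the same ranks; the difference is that the first sorts the whole set once per date, while the second sorts it a single time for the whole result.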
For further reading on this subject see the following blog posts:

http://tinyurl.com/chrisrank
http://tinyurl.com/moshadsets
http://tinyurl.com/chrisdsets

Using calculated members to cache numeric values

In the same way that you can avoid unnecessary re-evaluations of set expressions by using named sets, you can also rely on the fact that the Formula Engine can (usually) cache the result of a calculated member to avoid recalculating expressions which return numeric values. What this means in practice is that anywhere in your code you see an MDX expression that returns a numeric value repeated across multiple calculations, you should consider abstracting it to its own calculated member; not only will this help performance, but it will improve the readability of your code. For example, take the following slow query which includes two calculated measures:

WITH
MEMBER [Measures].TEST1 AS
  [Measures].[Internet Sales Amount]
  /
  Count
  (
    TopPercent
    (
      {
        [Scenario].[Scenario].&[1]
       ,[Scenario].[Scenario].&[2]
      }
      * [Account].[Account].[Account].MEMBERS
      * [Date].[Date].[Date].MEMBERS
     ,10
     ,[Measures].[Amount]
    )
  )
MEMBER [Measures].TEST2 AS
  [Measures].[Internet Tax Amount]
  /
  Count
  (
    TopPercent
    (
      {
        [Scenario].[Scenario].&[1]
       ,[Scenario].[Scenario].&[2]
      }
      * [Account].[Account].[Account].MEMBERS
      * [Date].[Date].[Date].MEMBERS
      * [Department].[Departments].[Department Level 02].MEMBERS
     ,10
     ,[Measures].[Amount]
    )
  )
SELECT
  {
    [Measures].TEST1
   ,[Measures].TEST2
  } ON 0
 ,[Customer].[Gender].[Gender].MEMBERS ON 1
FROM [Adventure Works]

A quick glance over the code shows that a large section of it appears in both calculations: everything inside the Count function.
If we move that code to its own calculated member as follows:

WITH
MEMBER [Measures].Denominator AS
  Count
  (
    TopPercent
    (
      {
        [Scenario].[Scenario].&[1]
       ,[Scenario].[Scenario].&[2]
      }
      * [Account].[Account].[Account].MEMBERS
      * [Date].[Date].[Date].MEMBERS
     ,10
     ,[Measures].[Amount]
    )
  )
MEMBER [Measures].TEST1 AS
  [Measures].[Internet Sales Amount] / [Measures].Denominator
MEMBER [Measures].TEST2 AS
  [Measures].[Internet Tax Amount] / [Measures].Denominator
SELECT
  {
    [Measures].TEST1
   ,[Measures].TEST2
  } ON 0
 ,[Customer].[Gender].[Gender].MEMBERS ON 1
FROM [Adventure Works]

then the query runs much faster, simply because instead of evaluating the count separately for each of the two visible calculated measures, we evaluate it once, cache the result in the calculated measure Denominator and then reference this in the other calculated measures.

It's also possible to find situations where you can rewrite code to avoid evaluating a calculation that always returns the same result over different cells in the multidimensional space of the cube. This is much more difficult to do effectively though; the following blog entry describes how to do it in detail: http://tinyurl.com/fecache

Tuning the implementation of MDX

Like just about any other software product, Analysis Services is able to do some things more efficiently than others. It's possible to write the same query or calculation using the same algorithm but using different MDX functions and see a big difference in performance; as a result, we need to know which functions we should use and which ones we should avoid. Which ones are these though? Luckily MDX Studio includes functionality to analyse MDX code and flag up such problems (to do this you just need to click the Analyze button), and there's even an online version of MDX Studio that allows you to do this too, available at: http://mdx.mosha.com/. We recommend that you run any MDX code you write through this functionality and take its suggestions on board.
Mosha walks through an example of using MDX Studio to optimise a calculation on his blog here: http://tinyurl.com/moshaprodvol

Block computation versus cell-by-cell

When the Formula Engine has to evaluate an MDX expression for a query it can basically do so in one of two ways. It can evaluate the expression for each cell returned by the query, one at a time, an evaluation mode known as "cell-by-cell"; or it can try to analyse the calculations required for the whole query and find situations where the same expression would need to be calculated for multiple cells and instead do it only once, an evaluation mode known variously as "block computation" or "bulk evaluation". Block computation is only possible in some situations, depending on how the code is written, but is often many times more efficient than cell-by-cell mode. As a result, we want to write MDX code in such a way that the Formula Engine can use block computation as much as possible, and when we talk about using efficient MDX functions or constructs then this is what we in fact mean.

Given that different calculations in the same query, and different expressions within the same calculation, can be evaluated using block computation and cell-by-cell mode, it's very difficult to know which mode is used when. Indeed in some cases Analysis Services can't use block mode anyway, so it's hard to know whether we have written our MDX in the most efficient way possible. One of the few indicators we have is the Perfmon counter MDX\Total Cells Calculated, which basically returns the number of cells in a query that were calculated in cell-by-cell mode; if a change to your MDX increments this value by a smaller amount than before, and the query runs faster, you're doing something right.
The list of rules that MDX Studio applies is too long to list here, and in any case it is liable to change in future service packs or versions; another good guide for Analysis Services 2008 best practices exists in the Books Online topic Performance Improvements for MDX in SQL Server 2008 Analysis Services, available online here: http://tinyurl.com/mdximp. However, there are a few general rules that are worth highlighting:

• Don't use the Non_Empty_Behavior calculation property in Analysis Services 2008, unless you really know how to set it and are sure that it will provide a performance benefit. It was widely misused with Analysis Services 2005 and most of the work that went into the Formula Engine for Analysis Services 2008 was to ensure that it wouldn't need to be set for most calculations. This is something that needs to be checked if you're migrating an Analysis Services 2005 cube to 2008.
• Never use late binding functions such as LookupCube, or StrToMember or StrToSet without the Constrained flag, inside calculations since they have a serious negative impact on performance. It's almost always possible to rewrite calculations so they don't need to be used; in fact, the only valid use for StrToMember or StrToSet in production code is when using MDX parameters. The LinkMember function suffers from a similar problem but is harder to avoid.
• Use the NonEmpty function wherever possible; it can be much more efficient than using the Filter function or other methods. Never use NonEmptyCrossjoin either: it's deprecated, and everything you can do with it you can do more easily and reliably with NonEmpty.
• Lastly, don't assume that whatever worked best for Analysis Services 2000 or 2005 is still best practice for Analysis Services 2008. In general, you should always try to write the simplest MDX code possible initially, and then only change it when you find performance is unacceptable.
Many of the tricks that existed to optimise common calculations for earlier versions now perform worse on Analysis Services 2008 than the straightforward approaches they were designed to replace.

Caching

We've already seen how Analysis Services can cache the values returned in the cells of a query, and how this can have a significant impact on the performance of a query. Both the Formula Engine and the Storage Engine can cache data, but may not be able to do so in all circumstances; similarly, although Analysis Services can share the contents of the cache between users there are several situations where it is unable to do so. Given that in most cubes there will be a lot of overlap in the data that users are querying, caching is a very important factor in the overall performance of the cube and as a result ensuring that as much caching as possible is taking place is a good idea.

Formula cache scopes

There are three different cache contexts within the Formula Engine, which relate to how long data can be stored within the cache and how that data can be shared between users:

• Query Context, which means that the results of calculations can only be cached for the lifetime of a single query and so cannot be reused by subsequent queries or by other users.
• Session Context, which means the results of calculations are cached for the lifetime of a session and can be reused by subsequent queries in the same session by the same user.
• Global Context, which means the results of calculations are cached until the cache has to be dropped because data in the cube has changed (usually when some form of processing takes place on the server). These cached values can be reused by subsequent queries run by other users as well as the user who ran the original query.
Clearly the Global Context is the best from a performance point of view, followed by the Session Context and then the Query Context; Analysis Services will always try to use the Global Context wherever possible, but it is all too easy to accidentally write queries or calculations that force the use of the Session Context or the Query Context. Here's a list of the most important situations when that can happen:

• If you define any calculations (not including named sets) in the WITH clause of a query, even if you do not use them, then Analysis Services can only use the Query Context (see http://tinyurl.com/chrisfecache for more details).
• If you define session-scoped calculations but do not define calculations in the WITH clause, then the Session Context must be used.
• Using a subselect in a query will force the use of the Query Context (see http://tinyurl.com/chrissubfe).
• Use of the CREATE SUBCUBE statement will force the use of the Session Context.
• When a user connects to a cube using a role that uses cell security, then the Query Context will be used.
• When calculations are used that contain non-deterministic functions (functions which could return different results each time they are called), for example the Now() function that returns the system date and time, the Username() function or any Analysis Services stored procedure, then this forces the use of the Query Context.

Other scenarios that restrict caching

Apart from the restrictions imposed by cache context, there are other scenarios where caching is either turned off or restricted. When arbitrary-shaped sets are used in the WHERE clause of a query, no caching at all can take place in either the Storage Engine or the Formula Engine.
An arbitrary-shaped set is a set of tuples that cannot be created by a crossjoin, for example:

(
  {
    ([Customer].[Country].&[Australia], [Product].[Category].&[1])
   ,([Customer].[Country].&[Canada], [Product].[Category].&[3])
  }
)

If your users frequently run queries that use arbitrary-shaped sets then this can represent a very serious problem, and you should consider redesigning your cube to avoid it. The following blog entries discuss this problem in more detail:

http://tinyurl.com/tkarbset
http://tinyurl.com/chrisarbset

Even within the Global Context, the presence of security can affect the extent to which cache can be shared between users. When dimension security is used the contents of the Formula Engine cache can only be shared between users who are members of roles which have the same permissions. Worse, the contents of the Formula Engine cache cannot be shared between users who are members of roles which use dynamic security at all, even if those users do in fact share the same permissions.

Cache warming

Since we can expect many of our queries to run instantaneously on a warm cache, and the majority at least to run faster on a warm cache than on a cold cache, it makes sense to preload the cache with data so that when users come to run their queries they will get warm-cache performance. There are two basic ways of doing this: running CREATE CACHE statements and automatically running batches of queries.

Create Cache statement

The CREATE CACHE statement allows you to load a specified subcube of data into the Storage Engine cache.
Here's an example of what it looks like:

CREATE CACHE FOR [Adventure Works] AS
(
  {[Measures].[Internet Sales Amount]}
 ,[Customer].[Country].[Country].MEMBERS
 ,[Date].[Calendar Year].[Calendar Year].MEMBERS
)

More detail on this statement can be found here: http://tinyurl.com/createcache

CREATE CACHE statements can be added to the MDX Script of the cube so they execute every time the MDX Script is executed, although if the statements take a long time to execute (as they often do) then this might not be a good idea; they can also be run after processing has finished from an Integration Services package using an Execute SQL task or through ASCMD, and this is a much better option because it means you have much more control over when the statements actually execute. You wouldn't want them running every time you cleared the cache, for instance.

Running batches of queries

The main drawback of the CREATE CACHE statement is that it can only be used to populate the Storage Engine cache, and in many cases it's warming the Formula Engine cache that makes the biggest difference to query performance. The only way to do this is to find a way to automate the execution of large batches of MDX queries (potentially captured by running a Profiler trace while users go about their work) that return the results of calculations and so which will warm the Formula Engine cache.
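Before such a batch is run, the captured queries usually need tidying up. The sketch below shows one possible preparation step in Python, assuming the query text has already been exported from a trace into a list of strings; the function name, the sample queries, and the idea of ordering by frequency are illustrative assumptions, and collapsing whitespace would also alter any string literals inside a query.

```python
from collections import Counter

def build_warm_batch(captured_queries, max_queries=100):
    """Deduplicate captured MDX text and return the most frequent queries first."""
    # Collapse whitespace so trivially different copies of a query count as one.
    normalised = (" ".join(q.split()) for q in captured_queries if q.strip())
    counts = Counter(normalised)
    return [query for query, _ in counts.most_common(max_queries)]

# Hypothetical captured queries, including a duplicate and an empty entry.
captured = [
    "SELECT [Measures].[Internet Sales Amount] ON 0 FROM [Adventure Works]",
    "SELECT  [Measures].[Internet Sales Amount]  ON 0\nFROM [Adventure Works]",
    "SELECT [Measures].[Internet Tax Amount] ON 0 FROM [Adventure Works]",
    "   ",
]

batch = build_warm_batch(captured)
print(len(batch))
```

The resulting list can then be handed to whichever tool actually executes the queries against the server; that execution step is deliberately left out of the sketch.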
This automation can be done in a number of ways, for example by using the ASCMD command line utility which is part of the sample code for Analysis Services that Microsoft provides (available for download here: http://tinyurl.com/sqlprodsamples); another common option is to use an Integration Services package to run the queries, as described in the following blog entries: http://tinyurl.com/chriscachewarm and http://tinyurl.com/allancachewarm

This approach is not without its own problems, though: it can be very difficult to make sure that the queries you're running return all the data you want to load into cache, and even when you have done that, user query patterns change over time so ongoing maintenance of the set of queries is important.

Scale-up and scale-out

Buying better or more hardware should be your last resort when trying to solve query performance problems: it's expensive and you need to be completely sure that it will indeed improve matters. Adding more memory will increase the space available for caching but nothing else; adding more or faster CPUs will lead to faster queries but you might be better off investing time in building more aggregations or tuning your MDX. Scaling up as much as your hardware budget allows is a good idea, but may have little impact on the performance of individual problem queries unless you badly under-specified your Analysis Services server in the first place.

If your query performance degenerates as the number of concurrent users running queries increases, consider scaling-out by implementing what's known as an OLAP farm. This architecture is widely used in large implementations and involves multiple Analysis Services instances on different servers, and using network load balancing to distribute user queries between these servers. Each of these instances needs to have the same database on it and each of these databases must contain exactly the same data in it for queries to be answered consistently.
This means that, as the number of concurrent users increases, you can easily add new servers to handle the increased query load. It also has the added advantage of removing a single point of failure, so if one Analysis Services server fails then the others take on its load automatically. Making sure that data is the same across all servers is a complex operation and you have a number of different options for doing this: you can either use the Analysis Services database synchronisation functionality, copy and paste the data from one location to another using a tool like Robocopy, or use the new Analysis Services 2008 shared scalable database functionality. The following white paper from the SQLCat team describes how the first two options can be used to implement a network load-balanced solution for Analysis Services 2005: http://tinyurl.com/ssasnlb. Shared scalable databases have a significant advantage over synchronisation and file-copying in that they don't need to involve any moving of files at all. They can be implemented using the same approach described in the white paper above, but instead of copying the databases between instances you process a database (attached in ReadWrite mode) on one server, detach it from there, and then attach it in ReadOnly mode to one or more user-facing servers for querying while the files themselves stay in one place. You do, however, have to ensure that your disk subsystem does not become a bottleneck as a result. Summary In this article we covered MDX calculation performance and caching, and also how to write MDX to ensure that the Formula Engine works as efficiently as possible. We've also seen how important caching is to overall query performance and what we need to do to ensure that we can cache data as often as possible, and we've discussed how to scale-out Analysis Services using network load balancing to handle large numbers of concurrent users.
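To make the "Running batches of queries" idea more concrete, here is a minimal sketch of a batch runner for ASCMD. It is an illustration under stated assumptions, not part of the original article: it assumes each captured query has been saved as its own .mdx file in one folder, that ascmd.exe is on the PATH, and that the -S, -d, -i and -o flags behave as described in the readme shipped with the ASCMD sample. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_ascmd_commands(query_dir, server, database, ascmd="ascmd.exe"):
    """Return one ASCMD argument list per .mdx file in query_dir, in name order."""
    commands = []
    for query_file in sorted(Path(query_dir).glob("*.mdx")):
        # Write each result next to its query so failures are easy to inspect.
        output_file = query_file.with_suffix(".xml")
        commands.append([
            ascmd,
            "-S", server,            # Analysis Services instance (assumed flag)
            "-d", database,          # database to run against (assumed flag)
            "-i", str(query_file),   # input file holding one MDX query
            "-o", str(output_file),  # file that receives the result
        ])
    return commands


def warm_cache(query_dir, server, database):
    """Run every captured query once; stop at the first failing command."""
    for command in build_ascmd_commands(query_dir, server, database):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution means the list of commands can be reviewed, logged, or scheduled after processing without touching the server.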
Packt
20 Oct 2009
10 min read

Modifying an Existing Theme in Drupal 6: Part 1

Setting up the workspace

There are several software tools that can make your work modifying themes more efficient. Though no specific tools are required to work with Drupal themes, there are a couple of applications that you might want to consider adding to your tool kit.

I work with Firefox as my primary browser, principally because I can add various extensions to Firefox that make my life easier. The Web Developer extension, for example, is hugely helpful when dealing with CSS and related issues. I recommend the combination of Firefox and the Web Developer extension to anyone working with Drupal themes. Another extension popular with many developers is Firebug, which is very similar to the Web Developer extension, and indeed more powerful in several regards.

Pick up Web Developer, Firebug, and other popular Firefox add-ons at https://addons.mozilla.org/en-US/firefox/

When it comes to working with PHP files and the various theme files, you will need an editor. The most popular application is probably Dreamweaver, from Adobe, although any editor that has syntax highlighting would work well too. I use Dreamweaver as it helps me manage multiple projects and provides a number of features that make working with code easier (particularly for designers).

If you choose to use Dreamweaver, you will want to tailor the program a little bit to make it easier to work with Drupal theme files. Specifically, you should configure the application preferences to open and edit the various types of files common to PHPTemplate themes. To set this up, open Dreamweaver, then:

1. Go to the Preferences dialogue.
2. Open file types/editors.
3. Add the following list of file types to Dreamweaver's "open in code view" field: .engine .info .module .install .theme
4. Save the changes and exit.

With these changes, your Dreamweaver application should be able to open and edit all the various PHPTemplate theme files.
Previewing your work

Note that, as a practical matter, previewing Drupal themes requires the use of a server. Themes are really difficult to preview (with any accuracy) without a server environment. A quick solution to this problem is the XAMPP package. XAMPP provides a one-step installer containing everything you need to set up a server environment on your local machine (Apache, MySQL, PHP, phpMyAdmin, and more). Visit http://www.ApacheFriends.org to download XAMPP and you can have your own Dev Server quickly and easily.

Another tool that should be on the top of your list is the Theme developer extension for the popular Drupal Devel module. Theme developer can save you untold hours of digging around trying to find the right function or template. When the module is active, all you need to do is click on an element and the Theme developer pop-up window will show you what is generating the element, along with other useful information. In the example later in this article, we will also use another feature of the Devel module, that is, the ability to automatically generate sample content for your site.

You can download Theme developer as part of the Devel project at Drupal.org: http://drupal.org/project/devel

Note that Theme developer only works on Drupal 6 and, due to the way it functions, is only suitable for use in a development environment; you don't want this installed on a client's public site!

Visit http://drupal.org/node/209561 for more information on the Theme developer aspects of the Devel module. The article includes links to a screencast showing the module in action, which is a good quick start and a solid help in grasping what this useful tool can do.

Planning the modifications

We're going to base our work on the popular Zen theme. We'll take Zen, create a new subtheme, and then modify the subtheme until we reach our final goal. Let's call our new theme "Tao". The Zen theme was chosen for this exercise because it has a great deal of flexibility.
It is a good solid place to start if you wish to build a CSS-based theme. The present version of Zen even comes with a generic subtheme (named "STARTERKIT") designed specifically for themers who wish to take a basic theme and customize it. We'll use the Starterkit subtheme as the way forward in the steps that follow.

The Zen theme is one of the most active theme development projects. Updated versions of the theme are released regularly. We used version 6.x-1.0-beta2 for the examples in this article. Though that version was current at the time this text was prepared, it is unlikely to be current at the time you read this. To avoid difficulties, we have placed a copy of the files used in this article in the software archive that is provided on the Packt website.

Download the files used in this article at http://www.packtpub.com/files/code/5661_Code.zip. You can download the current version of Zen at http://drupal.org/project/zen.

Any time you set off down the path of transforming an existing theme into something new, you need to spend some time planning. The principle here is the same as in many other areas of life: a little time spent planning at the front end of a project can pay off big in savings later. A proper dissertation on site planning and usability is beyond the scope of this article, so for our purposes let us focus on defining some loose goals and then work towards satisfying a specific wish list for the final site functionality.

Our goal is to create a two-column blog-type theme with solid usability and good branding. Our hypothetical client for this project needs space for advertising and a top banner. The theme must also integrate a forum and a user comments functionality.
Specific changes we want to implement include:

  • Main navigation menu in the right column
  • Secondary navigation mirrored at the top and bottom of each page
  • A top banner space below top nav but above the branding area
  • Color scheme and fonts to match brand identity
  • Enable and integrate the Drupal blog, forum, and comments modules

In order to make the example easier to follow and to avoid the need to install a variety of third-party extensions, the modifications we will make in this article will be done using only the default components, excepting only the theme itself, Zen. Arguably, were you building a site like this for deployment in the real world (rather than simply for skills development) you might wish to consider implementing one or more specialized third-party extensions to handle certain tasks.

Creating a new subtheme

Install the Zen theme if you have not done so before now; once that is done we're ready to create a new subtheme. First, make a copy of the directory named STARTERKIT and place the copied files into the directory sites/all/themes. Rename the directory "tao".

Note that in Drupal 5.x, subthemes were kept in the same directory as the parent theme, but for Drupal 6.x this is no longer the case. Subthemes should now be placed in their own directory inside the sites/all/themes/ directory.

Note that the authors of Zen have chosen to vary from the default stylesheet naming. Most themes use a file named style.css for their primary CSS. In Zen, however, the file is named zen.css. We need to grab that file and incorporate it into Tao. Copy the Zen CSS file (zen/zen/zen.css). Rename it tao.css and place it in the Tao directory (tao/tao.css).

When you look in the zen/zen directory, in addition to the key zen.css file, you will note the presence of a number of other CSS files. We need not concern ourselves with the other CSS files.
The styles contained in those stylesheets will remain available to us (we inherit them as Zen is our base theme) and if we need to alter them, we can override the selectors as needed via our new tao.css file.

In addition to renaming the theme directory, we also need to rename any other theme-name-specific files or functions. Do the following:

1. Rename the STARTERKIT.info file to tao.info.
2. Edit the tao.info file to replace all occurrences of STARTERKIT with tao.
3. Open the tao.info file and find this copy:

       ; The name and description of the theme used on the admin/build/themes page.
       name = Zen Themer's StarterKit
       description = Read the <a href="http://drupal.org/node/226507">online docs</a> on how to create a sub-theme.

   Replace that text with this copy:

       ; The name and description of the theme used on the admin/build/themes page.
       name = Tao
       description = A 2-column fixed-width sub-theme based on Zen.

   Make sure the name = and description = lines are not commented out, else they will not register.
4. Edit the template.php file to replace all occurrences of STARTERKIT with tao.
5. Edit the theme-settings.php file to replace all occurrences of STARTERKIT with tao.
6. Copy the file zen/layout-fixed.css and place it in the tao directory, creating tao/layout-fixed.css.
7. Include the new layout-fixed.css by modifying the tao.info file. Change stylesheets[all][] = layout.css to stylesheets[all][] = layout-fixed.css.

The .info file functions much like a .ini file: it provides configuration information, in this case, for your theme.
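The copy-and-rename steps above can also be scripted. The following is a rough sketch rather than anything Zen or Drupal ships: the function name is hypothetical, and it assumes the Zen download unpacks to a directory that holds STARTERKIT plus an inner zen directory containing both zen.css and layout-fixed.css. Adjust the paths if your copy is laid out differently.

```python
import re
import shutil
from pathlib import Path


def create_subtheme(zen_dir, themes_dir, name="tao"):
    """Copy Zen's STARTERKIT to themes_dir/<name> and rename its references."""
    zen_dir, themes_dir = Path(zen_dir), Path(themes_dir)
    target = themes_dir / name
    shutil.copytree(zen_dir / "STARTERKIT", target)

    # Rename the .info file, then bring over the two stylesheets.
    (target / "STARTERKIT.info").rename(target / f"{name}.info")
    css_dir = zen_dir / "zen"  # assumed location of Zen's stylesheets
    shutil.copy(css_dir / "zen.css", target / f"{name}.css")
    shutil.copy(css_dir / "layout-fixed.css", target / "layout-fixed.css")

    # Replace every occurrence of STARTERKIT in the three files the steps name.
    for filename in (f"{name}.info", "template.php", "theme-settings.php"):
        path = target / filename
        text = path.read_text(encoding="utf-8")
        path.write_text(text.replace("STARTERKIT", name), encoding="utf-8")

    # Point the theme at the fixed-width layout instead of layout.css.
    info = target / f"{name}.info"
    text = info.read_text(encoding="utf-8")
    text = re.sub(r"(stylesheets\[all\]\[\]\s*=\s*)layout\.css",
                  r"\1layout-fixed.css", text)
    info.write_text(text, encoding="utf-8")
    return target
```

The name and description lines in tao.info are left for manual editing, since their wording is a design decision rather than a mechanical rename.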
A good discussion of the options available within the .info file can be found on the Drupal.org site at: http://drupal.org/node/171205

Making the transition from Zen to Tao

The process of transforming an existing theme into something new consists of a set of tasks that can be categorized into three groups:

  • Configuring the Theme
  • Adapting the CSS
  • Adapting the Templates & Themable Functions

Configuring the theme

As stated previously, the goal of this redesign is to create a blog theme with solid usability and a clean look and feel. The resulting site will need to support forums and comments and will need advertising space. Let's start by enabling the functionality we need and then we can drop in some sample content. Technically speaking, adding sample content is not 100% necessary, but practically speaking, it is extremely useful: it lets us see the impact of our work with the CSS, the templates, and the themable functions.

Before we begin, enable your new theme, if you have not done so already. Log in as the administrator, then go to the themes manager (Administer | Site building | Themes), and enable the theme Tao. Set it to be the default theme and save the changes. Now we're set to begin customizing this theme, first through the Drupal system's default configuration options, and then through our custom styling.

Enabling Modules

To meet the client's functional requirements, we need to activate several features of Drupal which, although contained in the default distro, are not activated by default. Accordingly, we need to identify the necessary modules and enable them. Let's do that now. Access the module manager screen (Administer | Site building | Modules), and enable the following modules:

  • Blog (enables blog-type presentation of content)
  • Contact (enables the site contact forms)
  • Forum (enables the threaded discussion forum)
  • Search (enables users to search the site)

Save your changes and let's move on to the next step in the configuration process.

Packt
20 Oct 2009
6 min read

The Rotating Post Titles with Post Preview Gadget

The Rotating Post Titles with Post Preview gadget lists all your blog posts classified according to labels or categories. Blogger uses Labels to classify posts while WordPress uses Categories for the same purpose. Clicking on a label or a category in the sidebar of a blog brings up all posts associated with that particular label or category. However, you will see only the posts associated with that one label. In this gadget, post titles are grouped under their respective labels, so in one look you can see all the post titles in the blog and all its labels. Thus you get a full summary of the blog. Hovering over a post title shows the Post Preview in the top pane. You can then click on it to go to that post and read it in full.

What is Google AJAX feed API?

AJAX (shorthand for asynchronous JavaScript and XML) is a web development technique which retrieves data from the server asynchronously in the background without interfering with the display and behaviour of the existing page. The whole page is not refreshed when data is retrieved; only the section of the page that belongs to the gadget shows the new data. With the Google AJAX Feed API, you can retrieve feeds and mash them up using JavaScript. In this gadget, we will retrieve the post titles from the label feeds and display them using JavaScript code. See picture below:

This gadget shows a list of posts grouped by label from my blog http://www.blogdoctor.me. Four post titles from three labels are shown but the code can be modified to show all posts from all labels (categories). This gadget is also shown as Gadget No 4 in my Gadget Showcase blog. The cursor autoscrolls down the post titles, and each post preview is shown at the top as an excerpt for five seconds before moving on to the next post.

Obtaining the Google AJAX API Key

The first step in installing the above gadget is to get the Google AJAX API Key.
It is free and you can easily obtain it for any site, blog, or page by signing up for the key at the API key signup page. Type your blog address in the My web site URL text box and click the "Generate API Key" button. On the resulting page, copy the key and paste it into the code below as shown.

Customizing the code

In the code below, replace PASTE AJAX API KEY HERE with the actual key obtained above.

    <!-- ++Begin Dynamic Feed Wizard Generated Code++ -->
    <!--
      // Created with a Google AJAX Search and Feed Wizard
      // http://code.google.com/apis/ajaxsearch/wizards.html
    -->
    <!--
      // The Following div element will end up holding the actual feed control.
      // You can place this anywhere on your page.
    -->
    <div id="content">
      <span style="color:#676767;font-size:11px;margin:10px;padding:4px;">Loading...</span>
    </div>
    <!-- Google Ajax Api -->
    <script src="http://www.google.com/jsapi?key=PASTE AJAX API KEY HERE" type="text/javascript"></script>
    <!-- Dynamic Feed Control and Stylesheet -->
    <script src="http://www.google.com/uds/solutions/dynamicfeed/gfdynamicfeedcontrol.js" type="text/javascript"></script>
    <style type="text/css">
      @import url("http://www.google.com/uds/solutions/dynamicfeed/gfdynamicfeedcontrol.css");
    </style>
    <script type="text/javascript">
      google.load('feeds', '1');
      function OnLoad() {
        var feeds = [
          { title: 'LABEL_1', url: 'http://MYBLOG.blogspot.com/feeds/posts/default/-/LABEL1?max-results=100' },
          { title: 'LABEL_2', url: 'http://MYBLOG.blogspot.com/feeds/posts/default/-/LABEL2?max-results=100' },
          { title: 'LABEL_3', url: 'http://MYBLOG.blogspot.com/feeds/posts/default/-/LABEL3?max-results=100' }
        ];
        var options = {
          stacked : true,
          horizontal : false,
          title : "Posts from BLOG_TITLE"
        };
        new GFdynamicFeedControl(feeds, 'content', options);
        document.getElementById('content').style.width = "200px";
      }
      google.setOnLoadCallback(OnLoad);
    </script>

In the above code, replace LABEL_1, LABEL_2 and LABEL_3 (the titles) and LABEL1, LABEL2 and LABEL3 (in the feed URLs) with the respective label names, and BLOG_TITLE with the actual title of your blog. Also replace MYBLOG with your actual blog subdomain. This applies to Blogspot blogs only. For a WordPress blog you will have to replace the label feeds:

    http://MYBLOG.blogspot.com/feeds/posts/default/-/LABEL1?max-results=100
    http://MYBLOG.blogspot.com/feeds/posts/default/-/LABEL2?max-results=100
    http://MYBLOG.blogspot.com/feeds/posts/default/-/LABEL3?max-results=100

with the Category feed URLs from your WordPress blog. After customizing the above code, paste it into an HTML gadget in Blogger or into a Text widget in WordPress.

Further Customization

To show more than four posts per label or category, you will have to modify the following JavaScript code file: http://www.google.com/uds/solutions/dynamicfeed/gfdynamicfeedcontrol.js

In that file, alter the following line:

    GFdynamicFeedControl.DEFAULT_NUM_RESULTS = 4;

Change '4' to '1000' and save the file as MODgfdynamicfeedcontrol.js in a text editor like Notepad. Upload the file to a free host and replace the link to the file in the above code.

To change the styling of the gadget, you will have to modify the following file: http://www.google.com/uds/solutions/dynamicfeed/gfdynamicfeedcontrol.css

Then save the modified file, upload it to a free host, and replace its link in the above code.

Working Example

Posts from The Blog Doctor

Change Post Formatting According to Author. by Vin - 30 Apr 2008
If you have a Team Blog made up of two authors you can change the formatting of the posts written by one author to ...

Template
  • Calling all Newbie Bloggers!
  • Upgrade Classic Template without the "UPGRADE YOUR TEMPLATE" button.
  • The Minimalist Minima Photoblog Template.
  • Making the COMMENTS Link more User Friendly.

CSS
  • Change Post Formatting According to Author.
  • Free CSS Navigation Menus in Blogger.
  • Fix the Page Elements Layout Editor No Scrollbar Problem.
  • Frame the Blog Header Image.

Blogger Hacks
  • Rounded Corner Headers for Blogger.
  • Timestamp under the Date and Other Hacks.
  • Add Icon to Post Titles.
  • Many Headers In One Blog.

Summary

The Rotating Post Titles with Post Preview gadget provides an at-a-glance summary of your blog. All the posts are grouped by label or category and are linked to their post pages. The constantly rotating post excerpts at the top draw the attention of readers and get them more involved and eager to explore your blog. This increases the traffic and decreases the bounce rate of visitors from your blog.
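When a blog has many labels, typing each feed entry by hand is error-prone. As a rough sketch (not part of the gadget itself), the script below generates the feeds array to paste into the gadget code. It follows the Blogspot feed URL pattern shown in the article; the function names are hypothetical, and percent-encoding labels that contain spaces is an assumption about what the feed URL expects.

```python
import json
from urllib.parse import quote


def label_feed_url(blog, label, max_results=100):
    """Build a Blogspot label feed URL in the pattern the gadget code uses."""
    # Percent-encode the label so names with spaces stay valid in a URL path.
    return (
        f"http://{blog}.blogspot.com/feeds/posts/default/-/"
        f"{quote(label, safe='')}?max-results={max_results}"
    )


def feeds_array(blog, labels):
    """Return text for the gadget's `feeds` array, one entry per label."""
    feeds = [{"title": label, "url": label_feed_url(blog, label)}
             for label in labels]
    # A JSON array of objects is also a valid JavaScript array literal.
    return json.dumps(feeds, indent=2)


if __name__ == "__main__":
    # MYBLOG is the same placeholder used in the gadget code.
    print(feeds_array("MYBLOG", ["Template", "CSS", "Blogger Hacks"]))
```

The printed array can replace the three hand-written entries assigned to var feeds in the gadget code.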