(For more resources on Drupal, see here.)
Every once in a while, someone makes a site that becomes wildly popular. Having many people visiting all at once can put some serious strain on the server's resources and cause all sorts of problems as the congestion builds.
If you are unsure about what resources are available on your site, check with the hosting service and find out what they provide in the way of disk space, monthly transfer, and transfer speed. Many hosting services will boast unlimited bandwidth, but won't talk about connection speed. In other words, they don't meter how much water you use because they only let you switch the hosepipe on halfway.
It's important to know the limitations of the hardware and network resources, but don't fall into the trap of believing this is the most important thing to know.
Ensuring that there are facilities in place to handle a large amount of traffic will go some way in ensuring that your site scales well.
It's a time-honored tradition in the corporate world to throw extra resources at computing problems: buying the latest, fastest servers to help speed up slow applications, upgrading network hardware to allow data to travel more freely, and so on. Invariably though, software that is poorly designed or poorly tuned for performance finds a way to use all the resources one can throw at it and still wants more.
More often than not, it is better to look at why software is chewing up resources and see what can be done to either stop it or at least alleviate the problem, so that the software utilizes its resources wisely. Drupal already has several strategies in place to help you, the site administrator, decide how and when to use resource-intensive modules and how to maximize the site's efficiency.
This section provides several options to improve the performance of your site, and as nothing in this world is really free, you need to understand that, by and large, obtaining a performance boost comes at the expense of something else—namely, how up-to-date content is.
The following screenshot shows the performance page:
The first option, CLEAR CACHE, is useful while making modifications to a site because it helps to ensure that changes are definitely displayed and not held up while the site cache is still in operation. Having the ability to clear the cache in order to view precisely how pages are being built is useful, but comes at a price. Remember that if you have a large site with lots of content, then Drupal will have to do a lot of work to rebuild its cache, and it is possible that users may notice a slowdown during this time.
It is important to only enable caching on a live site, and not on the development machine, because changes to a page show up only when the cache expires—causing confusion if you are expecting something else during testing.
As you know, Drupal uses PHP to build web pages that are returned to a user's browser. Most of the time, these pages are unchanged between requests, and Drupal ends up repeating the work of building the same page before sending it off to the other users who requested it. It makes sense to tell Drupal that if it has created a web page once, it should store a copy of this page and serve that copy instead of going through the trouble of recreating it.
The process of storing copies of web pages in order to reduce the amount of effort required to repeatedly create a page is known as caching.
The trade-off when using page caching is that any changes to a page are only shown to the users once that cached version has expired and been replaced. This makes caching a suitable method for boosting performance whenever content is not updated very often or when it is not important to have new content presented immediately.
You will need to decide how long you think it is suitable to go before any updates made to a page must be shown—the longer you leave a page cached, the less work Drupal has to do, but the longer it will take for new content to show on the site. If your site is a daily blog, then by all means set the cache for up to a day at a time. If your site is a super busy, breaking news portal, then clearly you would opt for a cache time in minutes.
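The trade-off between cache lifetime and freshness can be illustrated with a small sketch. This is not Drupal's implementation (Drupal is written in PHP and stores its cache differently); the `PageCache` class, its `max_age_seconds` parameter, and the `build_page` callback are hypothetical names used only to show the idea of serving a stored copy until it expires.

```python
import time


class PageCache:
    """Toy time-based page cache; illustrative only, not Drupal's code."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self._store = {}            # path -> (time stored, html)

    def get(self, path, build_page):
        now = self.clock()
        entry = self._store.get(path)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]         # still fresh: serve the stored copy
        html = build_page(path)     # missing or expired: do the expensive work
        self._store[path] = (now, html)
        return html
```

With a lifetime of one day, a page edited in the morning may not show its changes until the next morning; with a lifetime of a few minutes, the page is rebuilt far more often but is never stale for long.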
Drupal also has the ability to cache the content of blocks, which can be a real performance boost for authenticated users (since page caching is only available for anonymous users). Blocks are constructed independently from the page as a whole, and often require expensive database requests or other operations in order to provide the information they contain.
Enabling block caching means that blocks no longer have to query the database (or whatever else it is they are doing) each time a page refreshes. Rather, they simply serve up the cached version and save on all that work.
Again, make sure you carefully weigh up the benefits between having fresh content and having high performance.
BANDWIDTH OPTIMIZATION, shown at the bottom of the page in the previous screenshot, deals with how to best transfer data from your server across the Internet to the users' browsers. The way in which data is transferred plays a big role in optimizing performance. In general, the most important things to remember are:
- Keep files small
- Keep the number of files down
The options on this page for aggregating and compressing CSS and JavaScript files address both points. Again, don't aggregate files during development. Turn this feature on only once the site has gone live, otherwise you are in for some serious frustration when changes to themes or scripts don't show up or behave as expected.
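Keeping the number of files down is what aggregation achieves: several stylesheets or scripts are joined into one file so that the browser makes one request instead of many. The following is a minimal sketch of that idea, assuming stylesheets are available as simple (name, content) pairs; the function names are hypothetical and this is not how Drupal itself performs aggregation.

```python
def aggregate(files):
    """Join several stylesheets into a single payload, preserving order.

    `files` is a list of (name, css_text) pairs. Order matters because
    later CSS rules can override earlier ones.
    """
    parts = []
    for name, css in files:
        parts.append("/* %s */\n%s" % (name, css.strip()))
    return "\n".join(parts)


def requests_needed(files, aggregated):
    """Number of HTTP requests the browser makes for these stylesheets."""
    if not files:
        return 0
    return 1 if aggregated else len(files)
```

It also shows why aggregation gets in the way during development: an edit to one source file is invisible until the combined file is rebuilt.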
I should make the following point very clear:
All major development or changes to a site should be performed on the development machine and thoroughly tested before being implemented or ported to the live site.
There will be times, however, when you simply have to make changes directly on the live site, even if only to apply upgrades that have already been tested on the development server. Rather than allow users to browse a site that is under maintenance, visit the Maintenance mode page in the Development section and select Put site into maintenance mode. Provide a Maintenance mode message to display if the default one is not suitable, and then get on with your work.
Be very careful when working in maintenance mode because once you have logged out you are effectively locked out too. This is because, by default, only one user (that is the administrative user) can do anything on the site while it is offline. If you log off and try logging in again, you are no longer the administrative user; you are instead anonymous and are shown only the offline message:
This is not very helpful if you happen to be the site administrator, so Drupal allows the login page to be accessed as normal. Navigate to http://localhost/drupal/user, and you will be able to log in as the administrator and use the site without hindrance.
Make sure you know the administrator's password before going into maintenance mode.
Everyone else is locked out until the site is no longer under maintenance.
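The access rules described here can be summarized in a few lines. This is a sketch of the behavior as described in this article, not Drupal's code: the `response_for` function and the `LOGIN_PATHS` set are hypothetical, and the set contains only the login path mentioned above.

```python
# Paths an anonymous visitor can still reach while the site is offline.
# Assumption for this sketch: only the login page described in the text.
LOGIN_PATHS = {"user"}


def response_for(path, is_admin, maintenance_mode,
                 message="Site under maintenance."):
    """Decide what a visitor sees for a given path."""
    if not maintenance_mode or is_admin:
        return "page:" + path           # normal operation
    if path in LOGIN_PATHS:
        return "page:" + path           # the login page stays reachable
    return "offline:" + message         # everyone else sees the message
```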
Logging and errors
Go to Logging and errors in the DEVELOPMENT section of the Configuration overlay. This page provides a few options used to control how errors are displayed and logged:
Error messages to display allows you to decide whether to write errors to the screen or not. While you are busy building the site, it's useful to view All messages in order to determine what has gone wrong and when. However, once it is time to go live you should change this to None for security reasons. Doing so prevents Drupal from displaying information to malicious users who might be able to use it in an attack on your site.
The default value of the final setting, Database log entries to keep, is sensible, at least to begin with. You may wish to increase or decrease the number of records stored on the system depending on how much work you have to do in order to maintain the site properly. Remember that Drupal can properly maintain the site's event logs only if the cron jobs are being run regularly.
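The dependence on cron is easier to see in a small model. In the sketch below, new entries are always appended, and old entries are discarded only when the (hypothetical) `run_cron` method is called. The `EventLog` class and the convention that a `keep` value of 0 means "keep everything" are assumptions made for illustration.

```python
class EventLog:
    """Toy event log that is trimmed only when cron runs."""

    def __init__(self, keep):
        self.keep = keep        # number of most recent entries to retain
        self.entries = []       # oldest first

    def record(self, message):
        self.entries.append(message)

    def run_cron(self):
        # Assumption: keep <= 0 means "never discard anything".
        if self.keep > 0:
            self.entries = self.entries[-self.keep:]
```

If cron never runs, the log simply keeps growing, whatever limit has been configured.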
Having only one setting to make is not that exciting, but once the site is live and messages are no longer visible through the pages, you can check the logs in the Reports section. Doing this on a regular basis is a good strategy to ensure that the site continues to run smoothly. Error messages, warnings, and so on are effectively windows into the operations of the site, and are indispensable tools.
It is important to discuss clean URLs early on because they act as a cog in the machinery not only of your site, but also of how your site interacts with the rest of the Internet. The simplicity of the Clean URLs page in the SEARCH AND METADATA section of the Configuration overlay belies its importance.
As you can see, the choice is simple—either enable or disable Clean URLs. Your system should also tell you whether or not it is possible to use clean URLs—if you see something like the following screenshot, then you have problems:
It is critical for SEO purposes that you have Clean URLs enabled on the live site.
The reason for this strong recommendation is that clean URLs are needed in order for your site to be properly indexed by Google and other search engines. Search engines use automated programs (called bots) to traverse the web, and when they come across nice, straightforward URLs like the ones displayed by Drupal when Clean URLs are enabled, for example, http://localhost/drupal/about-us, they happily go about their business.
Indexing allows content to start showing up in web searches, and hence more people can find these pages (more or less). If, however, the bots come across dynamic URLs (ones that contain query strings), then they often don't put the same effort into indexing that page, or worse, ignore it entirely. This can lead to a situation where you have a lot of lovely content just waiting to be read, but no one is able to find it because the search engines are ignoring all the pages of the form: http://localhost/drupal/?q=node/2
The highlighted part of this URL (?q=) is what causes the problem. Drupal navigates around its own pages using a system of internal URLs that it finds using queries in the format shown in the previous URL. In other words, ?q=node/2 is asking Drupal to find the content or page that is held at node/2. The problem is that the Googlebot simply sees the dynamic query and says to itself, "This could be a nasty trick designed to make me index the same page millions of times over so I won't pay any attention to it."
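The relationship between the two URL forms is mechanical: the internal path carried in the q parameter becomes part of the URL path itself. The sketch below shows that mapping using Python's standard library. The `to_clean_url` function is a hypothetical helper for illustration; in practice the translation in the other direction is done by the web server's rewrite rules, not by code like this.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def to_clean_url(url):
    """Rewrite .../?q=node/2 as .../node/2, keeping any other parameters."""
    parts = urlsplit(url)
    internal = None
    rest = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "q" and internal is None:
            internal = value            # Drupal's internal path
        else:
            rest.append((key, value))   # unrelated query parameters
    if internal is None:
        return url                      # nothing to rewrite

    base = parts.path
    if base.endswith("index.php"):
        base = base[: -len("index.php")]
    if not base.endswith("/"):
        base += "/"
    path = base + internal.lstrip("/")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(rest), parts.fragment)
    )
```

For example, `to_clean_url("http://localhost/drupal/?q=node/2")` gives `http://localhost/drupal/node/2`.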
The people at Drupal realized this is the case, so if it is possible on your web server, clean URLs are enabled by default and you don't have to worry about any of this. The problem comes during deployment because it is possible that your Internet service provider's setup does not allow for clean URLs. Now what?
If you already know who is going to host your live site, then try testing things out now by installing a copy of Drupal on the live server and ensuring that it is possible to use clean URLs. If you can't, consider finding another host that allows them. Otherwise, you will end up dealing with their system administrators and waiting until they can properly configure Apache.
Whether you can or can't use clean URLs basically comes down to a configuration setting in Apache. On your development machine, you have direct access to the httpd.conf file that Apache uses for its configuration—this is probably not the case on the live servers since any given host obviously doesn't want to give everyone using their servers total control to mangle everything as they see fit.
In order for Drupal to implement clean URLs, Apache needs to have mod_rewrite enabled. To check, open up httpd.conf and search for the line that reads: LoadModule rewrite_module modules/mod_rewrite.so
This line determines whether Apache loads the module that Drupal relies on for clean URLs. If it is commented out (prefixed with #), you will need to uncomment it and then restart Apache before the change takes effect.
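Checking by eye is easy enough, but the same check can be scripted. The following sketch scans the text of a configuration file and reports whether the LoadModule line for rewrite_module is active, commented out, or absent. The `rewrite_module_status` function is a hypothetical helper, and it assumes everything lives in a single httpd.conf as described here; some Apache installations split their configuration across several files, in which case this check alone is not conclusive.

```python
import re

# Matches an optional run of '#' followed by the LoadModule directive.
LOAD_REWRITE = re.compile(
    r"^\s*(#*)\s*LoadModule\s+rewrite_module\s+\S+", re.IGNORECASE
)


def rewrite_module_status(conf_text):
    """Return 'enabled', 'commented', or 'missing' for mod_rewrite."""
    status = "missing"
    for line in conf_text.splitlines():
        match = LOAD_REWRITE.match(line)
        if not match:
            continue
        if match.group(1) == "":
            return "enabled"        # an active line wins outright
        status = "commented"        # found, but prefixed with '#'
    return status
```

To use it on a real file, read the file's contents first and pass the text in, for example `rewrite_module_status(open(path).read())`, where `path` is wherever your httpd.conf lives.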
If you find that at some stage you fall into the trap of having clean URLs enabled on a system that cannot implement them (causing all sorts of problems), then manually navigating to the following page should allow you to disable clean URLs and use the site as normal: http://localhost/drupal/?q=admin/config/search/clean-urls
Remember to replace the base URL (http://localhost/drupal in this example) with whatever is pertinent for your setup.
One of the best ways to spread the word about your website is to create and maintain a useful RSS feed. The RSS publishing page under WEB SERVICES on the Configuration overlay provides a few options to control the behavior of the site's feeds:
From here, it is possible to specify how many items are to be included in the feed and what content should be shown for each one: the title only, the title plus teaser, or the full content.
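To make the effect of these two settings concrete, here is a minimal sketch that builds an RSS 2.0 document from a list of posts. It is an illustration only: the `build_feed` function, its `limit` and `content` parameters, and the dictionary keys used for each post are all hypothetical, and Drupal generates its feeds in its own way.

```python
import xml.etree.ElementTree as ET


def build_feed(site_title, link, posts, limit=10, content="teaser"):
    """Build a small RSS 2.0 feed.

    `content` is one of "title", "teaser", or "full" and controls how
    much of each post goes into the item's description.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Latest content"

    for post in posts[:limit]:          # honor the item limit
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["link"]
        if content == "teaser":
            ET.SubElement(item, "description").text = post["teaser"]
        elif content == "full":
            ET.SubElement(item, "description").text = post["body"]
        # content == "title": no description element at all
    return ET.tostring(rss, encoding="unicode")
```

A lower item limit and titles-only content keep the feed small, while full content lets subscribers read everything without visiting the site.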
There are plenty of feeds that are automatically created on your site. If you set up a blog, it will also have its own feed. In fact, each blogger has their own blog feed and visitors can subscribe by clicking the small RSS feed icon that Drupal inserts wherever a feed is available.
Clicking on the feed icon of the front page shows a mix of the latest content that has been promoted to the front page:
As you can see from the preceding screenshot, even polls can be easily aggregated into the site feed. This is a really powerful feature that can help to build up a well known and popular blog or website.
Reports are an absolutely crucial part of maintaining a healthy website. They are your eyes and ears on the ground and can often be decisive in isolating attacks or malicious users and programs that have accessed your site. They can also provide analytical information about how people are utilizing a site (through the search phrases report) or can offer some interesting benefits. For example, you might find that a certain site refers a number of people to you and may therefore be worth building a relationship with.
Click on Reports in the Toolbar to bring up the site's list of reports and logs:
This page can change depending on which modules are installed and enabled. For example, enabling the Syslog and Statistics modules in the Modules section, and reviewing this page shows the newly available logs and reports:
Selecting Recent log entries brings up the site's log of events, and you can filter these events by clicking on FILTER LOG MESSAGES and then selecting options in the list under the heading Type and Severity, before clicking on the Filter button.
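Conceptually, this filter keeps only the entries whose type and severity are among those selected, and leaving a list unselected means no restriction on that field. A minimal sketch of that logic follows; the `LogEntry` class and `filter_log` function are hypothetical names, and the sample type and severity values are placeholders rather than an exhaustive list of what Drupal records.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    type: str
    severity: str
    message: str


def filter_log(entries, types=None, severities=None):
    """Keep entries matching the selected types AND severities.

    An empty or missing selection places no restriction on that field.
    """
    return [
        entry
        for entry in entries
        if (not types or entry.type in types)
        and (not severities or entry.severity in severities)
    ]
```

For example, `filter_log(entries, types={"php"}, severities={"error"})` narrows a long log down to PHP errors only.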
Each of the log records has several important features that help to determine its type and importance, who or what initiated it, and what the outcome of the event was. If you want to look at the details of any individual message, click on the link found in the Message column, and its details will be displayed, as shown in the following screenshot:
This logging interface gives you fairly good control over how to locate and deal with site issues. There are several other options that you should explore on your own in the Reports section, the most notable being the Available updates and Status report pages.
While performance is not really a huge consideration at this early stage in a site's development, it serves as a good learning exercise to understand what facilities Drupal puts in place to boost performance through caching and file aggregation. Remember to return to this section once your site goes live.
We then took a quick look at the important topics of clean URLs and explored Drupal's native RSS publishing features.
A look at how logs and reports serve as our eyes and ears on the ground rounded off this article. Remember that different modules can add different reports and logs so you aren't necessarily limited to the ones shown here.