About this book

Drupal is a free and open-source content management system and content management framework written in PHP and distributed under the GNU General Public License. It is used as a back-end system for at least 1.5% of all websites worldwide, ranging from personal blogs to corporate, political, and government sites. SEO, or Search Engine Optimization, is the set of processes and techniques by which you optimize the content and style of your site in order to induce more people to view it.

Drupal SEO will help you develop and execute an effective search engine optimization strategy for your site. From planning to implementation, the book covers best practices in contemporary SEO.

In Drupal SEO you will learn how to develop a dynamic and productive SEO campaign. Covering both the basics of campaign development and the daily work it takes to maintain your SEO competitiveness, this book will show you how to produce a distinct and appropriate strategy for your site. In particular, you will learn key phrase selection, competitor analysis, and the correct groundwork for your dynamic SEO campaign.

Drupal SEO will then show you, by finding the right combination of extensions, how to supercharge your site. You will also be given a guided tour of key SEO services, like Google and Bing Webmaster, in order to implement a progressive and effective link building campaign. You will then learn key expert tips and tricks to enable you to build SEO-effective content which will take your site from invisible to unmissable with little effort.

Publication date:
September 2012


Chapter 1. An Introduction to Search Engine Optimization

This chapter lays the foundation for what's to come later in the book. It introduces basic concepts, terms, and fundamental information needed to understand the rationale behind the techniques discussed in the subsequent chapters. While some of the content in this chapter will be known to experienced users, it will be essential content for newbies and those who are not SEO specialists.

The topics covered in this chapter include:

  • An introduction to the SEO process

  • An SEO vocabulary

  • An explanation of how search engines view your site


What is SEO?

At its most basic, SEO is an acronym for Search Engine Optimization. More importantly, for the purposes of the philosophy espoused in this text, SEO is a process—a series of planning and execution steps that lead to a website being optimized to perform its best on the search engines.

Notice the emphasis on process—SEO is not something you do once and then forget about. While an intensive period of attention to your site's optimization factors can lay a solid foundation and get you off to a proper start, if you do not continue to make efforts to improve and respond to market conditions, your rankings will stagnate and then erode over time. Moreover, your efforts do not exist in isolation; there are others out there competing for rankings and traffic. In order to succeed, you need to do your best to stay ahead of the others fighting for ranking for their sites.


When we talk about the search engines in this text, we mean Google, Bing, Baidu, or other similar sites focused on allowing the general public to search for and find information on the Web. Typically, what works for one search engine will work for others. Though there are peculiarities and optimization strategies that can be applied to target specific engines, most SEO techniques are search-engine agnostic.

The competition for attention online should never be underestimated. If you are in a competitive business vertical (be it travel, finance, gambling, web design, property, or others), the battle for traffic from the search engines is cutthroat. Never forget that the major players out there have dedicated SEO teams that do nothing every day but tweak, optimize, build links, create content, and generally do their best to out-compete all other similar businesses vying for the top spots on the search engines.

In this book, we put forward a methodology for search engine optimization. The process we advocate can be viewed broadly as having two parts—foundations and on-going efforts. We start by looking at how to lay a great foundation for your site, that is, the basics of creating a search engine friendly site. In later chapters, we turn our attention to on-going techniques for maintaining and improving your rankings over time. Along the way, we look at how to formulate and implement a coherent search engine strategy.


Never forget, for most site owners the actual goal is traffic generation, not pure search engine ranking.

While many of the issues in SEO relate to technical aspects of the site, there is much more to SEO than just getting the technical aspects of your Drupal site in order. One of the fundamental principles advocated in this book is to focus on the creation of useful, unique content. There is a strong, positive correlation between high-quality content and high site ranking. This is one of the few areas where the search engines provide specific guidance about what they are looking for in a site. On the subject of quality, Google provides the following guidance:

  • Make pages primarily for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking".

  • Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you. Another useful test is to ask, "Does this help my users? Would I do this if search engines didn't exist?"

Bing also emphasizes the importance of content and advises as follows:

  • Ensure content is built based on keyword research to match content to what users are searching for

  • Produce deep, content-rich pages; be an authority for users by producing excellent content

  • Set a schedule and produce new content frequently

  • Be sure content is unique—don't reuse content from other sources


Don't try to outsmart Google—it's not going to work. Even if you find a way to artificially manipulate your rankings, there will come a day—very soon—when Google will pick up on it and make adjustments to their algorithms. When that happens, your site rankings will plummet and you will go from hero to zero.

While content is critical, it should not be your only concern. SEO practitioners often disagree about the relative importance of various factors in site rankings, but there is general agreement on which factors play a part. The search engine business is very competitive and companies such as Google and Bing do not disclose details of how their algorithms work. Fortunately for us, there is a considerable body of third-party research focused on discerning trends and patterns in search engine ranking. One of the best sources of information on this topic is SeoMoz's Search Ranking Factors, a report they publish free of charge and update annually. The data in the report comes from interviews of more than 130 SEO specialists and from a large data set that seeks to identify correlations between site variables and search engine rank.


View the report online by visiting http://www.seomoz.org/article/search-ranking-factors.

Among the factors that are agreed to be significant are:

  • Keywords in the domain name

  • Keywords in a page's URL

  • Keywords in the content title

  • Keyword placement on page

  • Keyword repetition on page

  • Uniqueness of content

  • Freshness of content

  • Facebook activity

  • Twitter activity, including influence of account tweeting

  • Google+ activity

  • Social media up votes and comments

  • Click through rate for the site

  • Bounce rate for the site

  • Number, quality, and content of links to this site

  • Number of internal links

  • Number of errors on site

  • Speed of the site

In sum, SEO is a process that requires a multifaceted strategy. At a minimum, you need to make an effort to create a site that is search engine friendly, but in order for your site to excel in the rankings, you must do more. SEO requires concerted effort across time and you must also focus on the creation of unique, quality content.


The future of SEO

SEO is a moving target. The search engines are constantly adjusting their algorithms and practitioners are constantly trying new strategies and modifying their approach. While it is impossible to predict with any accuracy what the future of SEO will bring, there is some consensus among experts about which direction it is moving in. Generally speaking, we believe the future will see a continued emphasis on determining the perceived value of each site. This will be done by looking at not only the quality of the site's content, but also social media signals and site traffic patterns. Site performance will also continue to be a factor, with faster, better built sites being preferred over slow, badly engineered sites.

These factors are consistent with what we know about the general goals the search engines aspire to, that is, to be able to perceive sites more like users perceive them, rather than as a purely mathematical exercise.


SEO terminology

The SEO field is replete with esoteric terminology and peculiar expressions. An awareness of the discipline's vocabulary is essential to clear understanding. In this section of the chapter, we provide definitions for the most commonly-used terms.


.htaccess

The .htaccess file is a configuration file for your web server. It helps the web server determine how to route HTTP traffic. In the world of SEO, the .htaccess file is most commonly discussed in the context of URL aliases, which are often used to create search engine friendly URLs.


Note that .htaccess is only applicable to sites running on the Apache web server. The web.config file performs the same tasks on IIS.

301 redirect (also known as Permanent Redirect)

A 301 redirect is an instruction given to the web server, informing it that a page that was previously located at one URL has been moved permanently to a new URL. The 301 redirect is most commonly used in situations where a site has been rebuilt and the URLs have changed. By adding 301 redirects to the site, you are able to avoid missed connections caused by traffic going to the old URL. When a 301 redirect is used, the search engines will also update their indexes to remove the old URL for the page and substitute the new one, thereby preserving the page's indexing.

302 redirect (also known as Temporary Redirect or Found)

A 302 redirect, like a 301 redirect, signals that a page has moved. Unlike a 301 redirect, a 302 redirect indicates that the move is temporary. This option is disfavored, as some search engines will penalize a site for using this sort of redirect.

404 error (also known as Page Not Found)

When a person visits a URL to a page that no longer exists (or has been moved), or types in an incorrect URL, the visitor will automatically be shown a 404 error message. The default message informs the visitor that the page cannot be found. Many sites build custom pages specifically designed to be displayed when a 404 error occurs.
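The three status codes defined above can be illustrated with a minimal Python sketch. The redirect tables and paths here are hypothetical, invented purely for illustration; on a real site these rules would normally live in the web server or CMS configuration rather than in application code.

```python
from http import HTTPStatus

# Hypothetical redirect tables; a real site would define these in its
# web server or CMS configuration rather than in application code.
PERMANENT_MOVES = {"/old-about": "/about-us"}      # answered with 301
TEMPORARY_MOVES = {"/sale": "/holiday-sale"}       # answered with 302
KNOWN_PATHS = {"/", "/about-us", "/holiday-sale"}  # pages that exist


def respond(path):
    """Return (status, redirect target or None) for a requested path."""
    if path in PERMANENT_MOVES:
        return HTTPStatus.MOVED_PERMANENTLY, PERMANENT_MOVES[path]
    if path in TEMPORARY_MOVES:
        return HTTPStatus.FOUND, TEMPORARY_MOVES[path]
    if path in KNOWN_PATHS:
        return HTTPStatus.OK, None
    # Anything else is reported as Page Not Found.
    return HTTPStatus.NOT_FOUND, None
```

Calling `respond("/old-about")` yields the 301 status together with the new location, while an unknown path such as `respond("/missing")` yields 404 with no target.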


AdSense

AdSense is a Google advertising program aimed at website owners. Site owners can sign up for the AdSense program and then display ads on their sites. (The ad inventory is provided by Google, often from the AdWords program, discussed next.) The website owner is paid a percentage of the revenue generated when someone clicks on one of the ads displayed on his or her site.


AdWords

AdWords is a Google commercial advertising program aimed at advertisers. If you want to advertise on the Google network, you can sign up for the AdWords program, build an ad, and set a daily budget for the display of that ad. The ad will then appear in the Google network and you will be charged when someone clicks on it (or, alternatively, you can elect to be charged according to the number of views of the ad).

Alexa Rank

Alexa.com provides a website ranking service that attempts to rate all the sites on the Web in order of their popularity. Like a golf score, the lower the score, the better. The most popular site on the Web (typically Google.com) has an Alexa Rank of 1. The service, though not 100 percent accurate and the subject of some criticism, is yet another way of tracking the success of your efforts to raise your site's profile. To learn more visit http://alexa.com.

Alt attribute

The HTML image tag (img) is used to place images on the page. The tag includes an option to specify a value for the attribute alt. This attribute is intended to allow webmasters to specify an alternative description for the image, typically for the benefit of users who are using screen readers or browsers with the image display disabled.
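As a rough illustration, the following Python sketch lists images that have no alt attribute at all. The class and function names are hypothetical helpers written for this example. It assumes that an empty alt value is intentional (as is common for purely decorative images) and therefore does not flag it.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that lacks an alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # Only a completely absent alt attribute is reported.
            if "alt" not in attributes:
                self.missing.append(attributes.get("src"))


def images_missing_alt(html):
    """Return the src values of images without an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```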


Anchor

Anchors are hyperlinks that allow a user to jump from one place to another within the same page.

Back link (also known as an "inbound link")

A back link is a link on an external site that points to your site.

Bing Webmaster

The Bing Webmaster service is provided by Microsoft to give site owners access to some basic tools that help them diagnose and track their sites. Registration is free of charge.

Black hat

Black hat is a label used to describe the use of SEO techniques that are illegal, unethical, or of questionable propriety.

Bot (also known as Robot, Spider, or Crawler)

A robot, or "bot" for short, is a software agent that indexes web pages. It is also called a "spider" or a "crawler".

Canonical URLs

Canonical URLs are URLs that have been standardized into a consistent form. For the search engines, this typically implies making sure all your pages use consistent URL structures, for example, making sure all your URLs start with "www".
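A minimal Python sketch of this kind of standardization follows. It assumes a site served from a bare domain (no subdomains, ports, or credentials in the URL) that prefers the "www" form; the function name is a hypothetical helper for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url, prefer_www=True):
    """Return a consistent form of url: lowercase host, www prefix, no fragment."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: the site is served from a bare domain, so adding "www."
    # is always appropriate when it is missing.
    if prefer_www and not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    # The fragment is dropped because it only identifies a position in the page.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))
```

With this sketch, `http://Example.com` and `http://www.example.com/#top` both reduce to `http://www.example.com/`.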


Cloaking

Cloaking is a black hat SEO technique that involves presenting the search engine spider with different content than you show a normal site visitor.

Crawl depth

Crawl depth is a measure of how deeply the search engine spider has indexed a website. This is typically an issue relevant for sites with a complex hierarchy of pages. The deeper the spider indexes the site, the better.

Deep link

A deep link is a hyperlink that points to something other than the front page of a website.

Doorway page (also known as a "gateway page")

A doorway page is a page built specifically to point users to another page. This technique is used legitimately when a site owner holds multiple domain names and wishes to channel all the traffic into a primary domain. The technique is also used inappropriately by some black hat SEO practitioners as a way to create highly optimized pages targeting a specific term or terms, then push the users to another site, an online variation of the old bait and switch routine.

Duplicate content penalty

The duplicate content penalty is a theory that the search engines penalize sites that repeat content, or use content that is duplicated from another source. The theory is controversial, with many believing that the penalty may not exist, or may only be enforced in situations where there are other factors that indicate bad intent.

Google Webmaster

The Google Webmaster service is provided by Google to give site owners access to some basic tools that help them diagnose and track their sites. Registration is free of charge.

Internal link density

Internal link density is the number of self-referential links on a site; that is, the number of links on a site pointing to other pages on the same site.


KEI

KEI is an acronym standing for Keyphrase Effectiveness Index. KEI is normally used during keyphrase research in an attempt to find the optimal keyphrases for a site. It is a simple ratio, most often defined as the frequency of search engine queries for the term divided by the number of pages competing for the term.

The greater the number of searches, the greater the potential traffic. The lower the competition, the easier it is to rank highly in the SERP. The ideal term will have low competition and a high number of searches.
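The ratio described above can be sketched in a few lines of Python. The function name is a hypothetical helper, and the figures in the usage note are invented for illustration rather than real search data.

```python
def kei(search_frequency, competing_pages):
    """Keyphrase Effectiveness Index as the simple ratio described above."""
    if competing_pages <= 0:
        raise ValueError("competing_pages must be a positive number")
    return search_frequency / competing_pages
```

For example, a hypothetical phrase with 1,000 searches and 50,000 competing pages scores `kei(1000, 50000)`, or 0.02, while a more specific phrase with 200 searches and 2,000 competing pages scores 0.1. Under this definition, the second phrase is the more attractive target despite its lower search volume.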

Keyphrase density (also known as "keyword density")

Keyphrase density is a calculation done by looking at all the text on a page, then calculating the ratio of the number of times a particular keyphrase or keyword appears on that page to the total number of words on the page.
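One way to compute this figure is sketched below in Python. Definitions of density vary between tools; this sketch assumes the simplest one, the number of occurrences of the phrase divided by the total word count, and uses a deliberately crude tokenizer that only recognizes unaccented letters, digits, and apostrophes.

```python
import re


def keyphrase_density(text, keyphrase):
    """Occurrences of keyphrase divided by the total number of words in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase = re.findall(r"[a-z0-9']+", keyphrase.lower())
    if not words or not phrase:
        return 0.0
    size = len(phrase)
    # Slide a window of the phrase's length across the page's words.
    hits = sum(
        1 for i in range(len(words) - size + 1) if words[i:i + size] == phrase
    )
    return hits / len(words)
```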

Keyword (or Keyphrase)

A keyword is a word being targeted by a site's SEO efforts. A keyphrase is simply the targeting of a phrase instead of a single word.

Keyphrase stuffing

Keyphrase stuffing is the over-optimizing of a page for a particular keyphrase. This is a disfavored practice that can have a negative impact on your site's ranking as it is viewed by the search engines as an attempt to exert inappropriate influence on the rankings for the page.

Landing page

A landing page is a web page that has been optimized to capture a customer, and is typically used as the target for an ad or other promotional campaign, or simply for capturing leads.

Link building

Link building is the process of seeking out or creating links to a site for the purpose of increasing the site's search engine relevance or inbound traffic.

Link farm

A link farm is a site that includes an excessive number of links. These sites are typically built purely to generate links for SEO purposes. Sites of this nature are disfavored by the search engines, which view them as inappropriate attempts to exert influence over rankings.

Link text (also known as "anchor text")

When you create a hyperlink on a page by wrapping a text string with an <a> tag, the text wrapped by the tag is referred to as the link or anchor text. There is a search engine optimization benefit to using text for hyperlinks, as the text can then be indexed in conjunction with the hyperlink.

Long tail

In general terms, the long tail of a distribution is the trailing end of the distribution. In the context of SEO, the term is used to refer to targeting longer and more specific search queries, where there is usually less competition.

Meta tags

Metadata is, quite literally, data about data. On the Web, meta tags are the most common implementation of metadata and in the past were a key part of search engine indexing. Today, meta tags are still in use on the Web and can be found in the head section of web pages.


MozRank

MozRank is a site ranking algorithm formulated by SeoMoz. It is often used in SEO circles as an alternative to Google's PageRank.


nofollow

nofollow is a possible value for the rel attribute inside the <a> tag. If the value of the rel attribute for a link is set to nofollow, the search engines' spiders will not follow or index the link.
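To check how the links on a page are marked, a small Python sketch such as the following can help. The class and function names are hypothetical helpers for this example; the sketch treats rel as a space-separated list of values and reports whether nofollow is among them.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, is_nofollow) pairs for every <a> tag with an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, such as "nofollow external".
        rel_values = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


def classify_links(html):
    """Return a list of (href, is_nofollow) pairs found in html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```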

Organic rank

Organic rank refers to natural search engine ranking, as opposed to paid ranking.

Outbound link

An outbound link is a hyperlink on one site pointing to an external site.


PageRank

PageRank is a ranking algorithm created by and named for Larry Page at Google. The ranking criteria are unknown, but the scale ranges from zero at the low end to ten at the high end. The higher the score, the more authoritative a website is deemed to be. There is argument, however, that the rank is no longer in use at Google and may not continue to evolve.


PPC

PPC is an acronym for Pay Per Click advertising. If you use a PPC advertising scheme, you pay every time someone clicks on one of your ads. The most popular PPC system is the Google AdWords program. It is also sometimes called "pay for performance advertising".

Reciprocal link

A reciprocal link is a link from one site to another, given in exchange for a link back. It is a link exchange between webmasters, done in hopes of boosting both sites' rankings.


Redirect

A redirect is an instruction given to the web server to redirect traffic seeking one URL to a different URL. There are different types of redirects, such as the 301 redirect and the 302 redirect, as we have seen earlier in this chapter.


Robots.txt

Robots.txt is a file containing instructions for search engine robots. This file is located on the server but is not used by human visitors to the website.
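Python's standard library includes a parser for this file, which makes it easy to see how a well-behaved robot interprets the instructions. The rules and URLs below are hypothetical examples, not the contents of any particular site's file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every robot is asked to stay out of /admin/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under these rules, `is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/admin/settings")` is False, while the same call for `http://www.example.com/node/1` is True.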


SEF URLs

SEF URLs stands for Search Engine Friendly URLs. The term refers to the creation of URLs that use natural words and phrases, rather than query strings and other abstract values (such as numbers) not associated with the page content.
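As a simple illustration, the Python sketch below turns a page title into a URL segment made of natural words. The function name is a hypothetical helper, and the rules it applies (lowercase ASCII, hyphens between words) are one common convention assumed for this example, not the behavior of any specific Drupal feature.

```python
import re
import unicodedata


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")
```

A title such as "An Introduction to Search Engine Optimization!" becomes `an-introduction-to-search-engine-optimization`, which is far more descriptive than a URL that identifies the page by number only.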


SEM

SEM is an acronym for Search Engine Marketing. The term is broad and applies not only to search engine optimization, but also to social media, pay per click advertising, and other marketing techniques focused on search engines.


SEOMoz

SEOMoz is a popular commercial SEO consultancy service. Learn more at http://www.seomoz.org.


SERP

SERP is an acronym for Search Engine Results Page.


SMO

SMO is an acronym for Social Media Optimization. It covers the process of using social media to drive traffic to your site and the related process of making your site suitable for social media, for example, by including social bookmarking tools and other social sharing devices on the site's pages.

Splash page

A splash page is an entry page, typically decorative, used to greet visitors to a website.

Stop word

Stop words are words included in search queries that are not actively indexed, unless included in quotations (phrase search). Typical examples include articles and conjunctions such as "the", "a", "and", and "or".

Title attribute

The title attribute is available on a number of HTML elements. It is used to provide a description for a link, a table, a frame, an image, or other elements. Some search engines index the title attribute and it therefore provides another option for on page optimization. Some browsers will also display the content of the title attribute as a tool tip when you move your mouse over the object.

White hat

White hat is a label used to describe the use of SEO techniques that are legal, ethical, or in line with best practices.

XML sitemap

An XML sitemap lists the pages on a website in a format that is easily digestible by search engine agents. The sitemaps follow a standard convention agreed upon by all the major search engines. The XML sitemap is typically not visible to site visitors, and should not be confused with the normal sitemaps often used on the frontend of websites.
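A bare-bones sitemap in the standard format can be produced with a few lines of Python, as sketched below. The function name and URLs are hypothetical, and the sketch includes only the required loc element for each page, leaving out optional fields such as last modification date.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol shared by the major search engines.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap, as a string, listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # Special characters such as & are escaped automatically on output.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

In practice, the resulting string would be saved to a file on the server and its location submitted to the search engines through their webmaster services.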


How search engines assess sites

Search engines all function in approximately the same fashion: a software agent, known as a bot, spider, or crawler, visits a page, gathers the content, and stores it in the search engine's data repository. Once the information is in the repository, it is indexed. The crawling and indexing processes are constant and on-going. Each of the major search engines maintains multiple crawlers that work tirelessly to refresh its index. The spiders find new pages by a variety of methods, typically including XML sitemaps, URLs already in the index, links to pages discovered while indexing, and URLs submitted for inclusion by users. How frequently they visit a specific site, and how deeply they spider the site on each visit, varies.

When a user visits the search engine and runs a search, the search engine extracts (from the search engine's index) a list of pages that are relevant to the query and then displays that list of pages to the user. The output on the search results page is defined according to each search engine's own criteria. The ranking methodology used by each engine is the result of the search engine's secret algorithm.

The search engine's crawler is primarily interested in certain types of information on the page, particularly the URL, the text, and the links on the page. Formatting is not indexed. Images and other media are indexed by most search engines, but to varying degrees of depth. Some types of media, such as Flash or attached files, are rarely indexed, though there are exceptions.
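To get a rough feel for what remains of a page once the formatting is set aside, consider the following Python sketch. It is a simplification written for illustration, not a description of how any real crawler works: it keeps the visible text and the link targets, and it discards markup, scripts, and style rules. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Keeps the text and link targets of a page and ignores formatting."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside <script> and <style> is not page content.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def spider_view(html):
    """Return (text, links) extracted from an HTML string."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links
```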


Seeing what the spider sees

If you have a Google Webmaster account, you can see a web page exactly as the Googlebot (the name of the Google crawler) sees it. To do this, log in to Google Webmaster Tools (http://www.google.com/webmasters/) and click on a site profile. In the navigation menu on the left, select the Diagnostics menu and then select the option Fetch as Googlebot. Type the URL of the page you want to see and, after a delay, the system will produce the results. You can see a web page, as shown in the following screenshot, followed by the Googlebot's view of the same page:

The following is the spider's view of the same page:



Summary

This chapter seeks to acquaint you with the basic principles of search engine optimization, including the terminology used. As noted at the outset, the philosophy that is promoted in this book emphasizes SEO as an on-going process intended to optimize a website to perform its best on the search engines. Throughout this book, the techniques discussed will all reinforce this process-oriented approach to SEO.

At the conclusion of this chapter, you should have gained an awareness of the most commonly used terms in the SEO field and you should have also gained insights into what is indexed by the search engines and how it is used to produce search engine results. At the outset of this chapter we stated the importance of quality and original content; at the end of this chapter, where we provided an example of how a search engine spider views your page, you can once again see how the content is key to your efforts.

In the next chapter, we take our first steps towards laying the foundations of SEO for your site, as we look at the default SEO options that are available on your Drupal site.

About the Author

  • Ric Shreves

    Ric Shreves is a web applications consultant and tech author. He’s been building websites since the mid-90s and writing about tech for almost as long. Ric specializes in open source content management systems and has written texts on each of the big three: WordPress, Joomla! and Drupal. Ric is the founding partner of water&stone, a digital agency that focuses on new media and online marketing. He works with clients on digital marketing strategy and supervises the SEO implementation team. Ric lives in Bali and divides his time between the island and Singapore.
