Sessions and Users in PHP 5 CMS

by Martin Brampton | August 2010 | Content Management Open Source PHP Web Development

In this article by Martin Brampton, author of PHP 5 CMS Framework Development, we get into the detailed questions involved in providing continuity for people using our websites. Almost any framework to support web content needs to handle this issue robustly and efficiently. We will look at the need for sessions, and the PHP mechanism that makes them work. There are security issues to be handled, as sessions are a well-known source of vulnerabilities. Search engine bots can take an alarmingly large portion of your site bandwidth, and special techniques can be used to minimize their impact on session handling. Actual mechanisms for handling sessions are provided. Session data has to be stored somewhere, and it is better to take charge of this task rather than leave it to PHP. A simple but fully effective session data handler is developed using database storage.

The problem

Dealing with sessions can be confusing, and is also a source of security loopholes. So we want our CMS framework to provide basic mechanisms that are robust. We want them to be easy to use by more application-oriented software. To achieve these aims, we need to consider:

  • The need for sessions and how they work
  • The pitfalls that can introduce vulnerabilities
  • Efficiency and scalability considerations

Discussion and considerations

To see what is required for our session handling, we shall first review the need for them and consider how they work in a PHP environment. Then the vulnerabilities that can arise through session handling will be considered. Web crawlers for search engines and more nefarious activities can place a heavy and unnecessary load on session handling, so we shall look at ways to avoid this load. Finally, the question of how best to store session data is studied.

Why sessions?

The need for continuity was mentioned when we first discussed users. But it is worth reviewing the requirement in a little more detail.

If Tim Berners-Lee and his colleagues had known all the developments that would eventually occur in the internet world, maybe the Web would have been designed differently. In particular, the basic web transport protocol HTTP might not have treated each request in isolation. But that is hindsight, and the Web was originally designed to present information in a computer-independent way. Simple password schemes were sufficient to control access to specific pages.

Nowadays, we need to cater for complex user management, or to handle things like shopping carts, and for these we need continuity. Many people have recognized this, and introduced the idea of sessions. The basic idea is that a session is a series of requests from an individual website visitor, and the session provides access to enduring information that is available throughout the session. The shopping cart is an obvious example of information being retained across the requests that make up a session. PHP has its own implementation of sessions, and there is no point reinventing the wheel, so PHP sessions are the obvious tool for us to use to provide continuity.

How sessions work

There are three main choices which have been available for handling continuity:

  • Adding extra information to the URI
  • Using cookies
  • Using hidden fields in the form sent to the browser

All of them can be used at times. Which of them is most suitable for handling sessions?

PHP uses either of the first two alternatives. Web software often makes use of hidden variables, but they do not offer a neat way to provide an unobtrusive general mechanism for maintaining continuity. In fact, whenever hidden variables are used, it is worth considering whether session data would be a better alternative.

For reasons discussed in detail later, we shall consider only the use of cookies, and reject the URI alternative. There was a time when there were lots of scary stories about cookies, and people were inclined to block them. While there will always be security issues associated with web browsing, the situation has changed, and the majority of sites now rely on cookies. It is generally considered acceptable for a site to demand the use of cookies for operations such as user login or for shopping carts and purchase checkout.

The PHP cookie-based session mechanism can seem obscure, so it is worth explaining how it works. First we need to review the working of cookies. A cookie is simply a named piece of data, usually limited to around 4,000 bytes, which is stored by the browser in order to help the web server to retain information about a user. More strictly, the connection is with the browser, not the user. Any cookie is tied to a specific website, and optionally to a particular part of the website, indicated by a path. It also has a lifetime, set by giving an expiry time; an expiry value of zero means that the cookie will be kept for as long as the browser is kept open, and then discarded.

The browser does nothing with cookies, except to save and then return them to the server along with requests. Every cookie that relates to the particular website will be sent if either the cookie is for the site as a whole, or the optional path matches the path to which the request is being sent. So cookies are entirely the responsibility of the server, but the browser helps by storing and returning them. Note that, since the cookies are only ever sent back to the site that originated them, there are constraints on access to information about other sites that were visited using the same browser.
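The path rule can be made concrete with a small sketch. The function below is not part of PHP or of Aliro; cookiePathMatches is a made-up helper that imitates, in simplified form, how a browser decides whether a cookie's path covers the path of a request.

```php
<?php
// Hypothetical helper: a simplified version of the browser's path rule.
function cookiePathMatches($cookiePath, $requestPath)
{
    // A cookie for the site as a whole has the path "/"
    if ($cookiePath === '/' || $cookiePath === $requestPath) {
        return true;
    }
    // Otherwise the cookie path must be a prefix of the request path ...
    if (strpos($requestPath, $cookiePath) !== 0) {
        return false;
    }
    // ... and the prefix must end on a "/" boundary
    return substr($cookiePath, -1) === '/'
        || substr($requestPath, strlen($cookiePath), 1) === '/';
}
```

With this rule a cookie set for /shop is returned with a request for /shop/cart, but not with a request for /shopping.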

In a PHP program, cookies can be written by calling the setcookie function, or implicitly through session handling. The name of the cookie is a string, and the value to be stored is also a string, although the serialize function can be used to make more structured data into a string for storage as a cookie. Take care to keep the cookies within the size limit. PHP makes available the cookies that have been sent back by the browser in the $_COOKIE super-global, keyed by their names.
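As a rough illustration of storing structured data in a cookie, the sketch below packs an array into a string and refuses anything over the size limit. The function names, the cookie name prefs, and the exact limit of 4,000 bytes are assumptions made for the example. Because cookie contents come back from the browser and cannot be trusted, the unpacking step disables object creation, which relies on an option available from PHP 7.

```php
<?php
// Assumed limit, taken from the "around 4,000 bytes" figure in the text.
define('COOKIE_SIZE_LIMIT', 4000);

// Turn an array of simple values into a string usable as a cookie value.
function packCookieValue(array $data)
{
    $packed = serialize($data);
    if (strlen($packed) > COOKIE_SIZE_LIMIT) {
        throw new LengthException('Cookie value too large: ' . strlen($packed) . ' bytes');
    }
    return $packed;
}

// Reverse the packing; anything that is not a serialized array
// is treated as "no data".
function unpackCookieValue($packed)
{
    $data = @unserialize($packed, array('allowed_classes' => false));
    return is_array($data) ? $data : array();
}

// In a web request this might be used as:
//   setcookie('prefs', packCookieValue($prefs), 0, '/');
//   $prefs = unpackCookieValue(isset($_COOKIE['prefs']) ? $_COOKIE['prefs'] : '');
```

Note that setcookie URL-encodes the value, so the cookie as transmitted can be somewhat larger than the packed string.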

Apart from any cookies explicitly written by code, PHP may also write a session cookie. It will do so either as a result of calls to session handling functions, or because the system has been configured to automatically start or resume a session for each request. By default, session cookies do not use the option of setting an expiry time, so they are discarded when the browser is closed down. Commonly, browsers keep this type of cookie in memory so that they are automatically lost on shutdown.

Before looking at what PHP is doing with the session cookie, let's note that there is an important general consideration for writing cookies. In the construction of messages between the server and the browser, cookies are part of the header. That means rules about headers must be obeyed. Headers must be sent before anything else, and once anything else has been sent, it is not permitted to send more headers. So, in the case of server to browser communication, the moment any part of the XHTML has been written by the PHP program, it is too late to send a header, and therefore too late to write a cookie.

For this reason, a PHP session is best started early in the processing. The only purpose PHP has in writing a session cookie is to allocate a unique key to the session, and retrieve it again on the next request. So the session cookie is given an identifying name, and its value is the session's unique key. The session key is usually called the session ID, and is used by PHP to pick out the correct set of persistent values that belong to the session. By default, the session name is PHPSESSID but it can, in most circumstances, be changed by calling the PHP function session_name prior to starting the session. Starting, or more often restarting, a session is done by calling session_start, after which the session ID can be obtained by calling session_id. In a simple situation, you do not need the session ID, as PHP places any existing session data in another superglobal, $_SESSION. In fact, we will have a use for the session ID as you will soon see.

The $_SESSION superglobal is available once session_start has been called, and the PHP program can store whatever data it chooses in it. It is an array, initially empty, and naturally the subscripts need to be chosen carefully in a complex system to avoid any clashes. The neat part of the PHP session is that provided it is restarted each time with session_start, the $_SESSION superglobal will retain any values assigned during the handling of previous requests. The data is thus preserved until the program decides to remove it. The only exception to this would be if the session expired. In a default configuration PHP does not enforce a precise expiry time; stale session data is cleared only by occasional garbage collection, which should not be relied on to end sessions. Later in this article, we will look at ways to deliberately kill sessions after a determinate period of inactivity.

As it is only the session ID that is stored in the cookie, rules about the timing of output do not apply to $_SESSION, which can be read or written at any time after session_start has been called. PHP stores the contents of $_SESSION at the end of processing or on request using the PHP function session_write_close. By default, PHP puts the data in a temporary file whose name includes the session ID. Wherever the session data is stored, PHP retrieves it again at the next session_start.
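A minimal sketch of this cycle is a counter of requests within a session. The function name recordVisit and the key visits are invented for the illustration. The array is passed in by reference so that the same function works on $_SESSION in a web request and on an ordinary array when run from the command line.

```php
<?php
// Count page views for one visitor.
function recordVisit(array &$store)
{
    if (!isset($store['visits'])) {
        $store['visits'] = 0;
    }
    $store['visits']++;
    return $store['visits'];
}

// Only touch a real session when running under a web server.
if (PHP_SAPI !== 'cli') {
    session_start();                  // must come before any output
    $count = recordVisit($_SESSION);  // the value survives into the next request
    session_write_close();            // optional: store the data right away
    echo 'Visits in this session: ' . $count;
}
```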

Session data does not have to be stored in temporary files, and PHP permits the program to provide its own handling routines. We will look at a scheme for storing the session data in a database later in the article.

Avoiding session vulnerabilities

So far, the option to pass the session ID as part of the URI instead of as a cookie has not been considered. Looking at security will show why. The main security issue with sessions is that a cracker may find out the session ID for a user, and then hijack that user's session. Session handling should do its best to guard against that happening. PHP can pass the session ID as part of the URI. This makes it especially vulnerable to disclosure, since URIs can be stored in all kinds of places that may not be as inaccessible as we would like. As a result, secure systems avoid the URI option.

It is also undesirable to find links appearing in search engines that include a session ID as part of the URI. These two points are enough to rule out the URI option for passing session ID. It can be prevented by the following PHP calls:

ini_set('session.use_cookies', 1);
ini_set('session.use_only_cookies', 1);

These calls force PHP to use cookies for session handling, an option that is now considered acceptable. The extent to which the site will function without cookies depends on what a visitor can do with no continuity of data—user login will not stick, and anything like a shopping cart will not be remembered.

It is best to avoid the default name of PHPSESSID for the session cookie, since that is something that a cracker could look for in the network traffic. One step that can be taken is to create a session name that is the MD5 hash of various items of internal information. This makes it harder but not impossible to sniff messages to find out a session ID, since it is no longer obvious what to seek—the well known name of PHPSESSID is not used.
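One way this might look is sketched below. The helper name and the items fed into the hash are assumptions for the example, not the Aliro code. PHP does not accept a session name made up only of digits, and an MD5 hash could in principle contain no letters, so the sketch puts a fixed letter in front of the hash.

```php
<?php
// Hypothetical helper: build an obscure but stable session cookie name.
function makeSessionName($prefix, $sitePath)
{
    // The leading letter guarantees the name is never digits only
    return 's' . md5('example_' . $prefix . $sitePath);
}

// In a web request, before session_start():
//   session_name(makeSessionName('user', __DIR__));
```

The same inputs always give the same name, which is essential, since the name must be identical on every request for the cookie to be found again.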

It is important for the session ID to be unpredictable, but we rely on PHP to achieve that. It is also desirable that the ID be long, since otherwise it might be possible for an attacker to try out all possible values within the life of a session. PHP uses 32 hexadecimal digits, which is a reasonable defense for most purposes.
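To put the 32 hexadecimal digits in perspective, the arithmetic can be sketched as follows. The guessing rates are arbitrary assumptions, the function name is made up, and the calculation supposes that IDs are spread evenly over the whole range.

```php
<?php
// Rough arithmetic only: how long would it take to try every possible ID?
function yearsToEnumerate($hexDigits, $guessesPerSecond)
{
    $possibleIds = pow(16, $hexDigits);   // 16 choices for each hex digit
    $seconds = $possibleIds / $guessesPerSecond;
    return $seconds / (365 * 24 * 60 * 60);
}

// A 4-digit ID falls in about a minute at 1,000 guesses per second,
// whereas 32 digits remain out of reach even at a billion per second.
$shortId = yearsToEnumerate(4, 1000);
$longId  = yearsToEnumerate(32, 1e9);
```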

The other main vulnerability apart from session hijacking is called session fixation. This is typically implemented by a cracker setting up a link that takes the user to your site with a session already established, and known to the cracker.

An important security step that is employed by robust systems is to change the session ID at significant points. So, although a session may be created as soon as a visitor arrives at the site, the session ID is changed at login. This technique is used by Amazon among others so that people can browse for items and build up a shopping cart, but on purchase a fresh login is required. Doing this reduces the available window for a cracker to obtain, and use, the session ID. It also blocks session fixation, since the original session is abandoned at critical points. It is also advisable to change the ID on logout, so although the session is continued, its data is lost and the ID is not the same.
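In PHP, changing the ID is done with session_regenerate_id. The sketch below shows where the calls might sit; the function names and the user_id key are hypothetical, and the check on session_id simply allows the functions to be called when no session is active.

```php
<?php
// Hypothetical login step: call only after the password has been verified.
function markLoggedIn($userId)
{
    // Issue a fresh ID and delete the old session data, so an ID known
    // to an attacker before login is useless afterwards.
    if (session_id() !== '') {
        session_regenerate_id(true);
    }
    $_SESSION['user_id'] = $userId;
}

// Hypothetical logout step: the session continues, but with no data
// and a different ID.
function markLoggedOut()
{
    $_SESSION = array();
    if (session_id() !== '') {
        session_regenerate_id(true);
    }
}
```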

It is highly desirable to provide logout as an option, but this needs to be supplemented by time limits on inactive sessions. A significant part of session handling is devoted to keeping enough information to be able to expire sessions that have not been used for some time. It also makes sense to revoke a session that seems to have been used for any suspicious activity.
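The inactivity test itself is simple, as this sketch suggests. For brevity it keeps the time of the last request inside the session data, which is a simplification adopted for the example; the function names and the last_activity key are invented.

```php
<?php
// Decide whether a session has been idle for too long.
// $lastActivity and $now are Unix timestamps; $lifetime is in seconds.
function sessionHasExpired($lastActivity, $now, $lifetime)
{
    return ($now - $lastActivity) > $lifetime;
}

// Hypothetical per-request check: expire an idle session, otherwise
// note the time of this request ready for the next check.
function touchOrExpire(array &$session, $now, $lifetime)
{
    if (isset($session['last_activity'])
        && sessionHasExpired($session['last_activity'], $now, $lifetime)) {
        $session = array();   // drop everything held for the stale session
        return false;
    }
    $session['last_activity'] = $now;
    return true;
}
```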

Ideally, the session ID is never transmitted unencrypted, but achieving this requires the use of SSL, and is not always practical. It should certainly be considered for high security applications.

Search engine bots

One aspect of website building is, perhaps unexpectedly, the importance of handling the bots that crawl the web. They are often gathering data for search engines, although some have more dubious goals, such as trawling for e-mail addresses to add to spam lists. The load they place on a site can be substantial. Sometimes, search engines account for half or more of the bandwidth being used by a site, which certainly seems excessive.

If no action is taken, these bots can consume significant resources, often for very little advantage to the site owner. They can also distort information about the site, such as when the number of current visitors is displayed but includes bots in the counts.

Matters are made worse by the fact that bots will normally fail to handle cookies. After all, they are not browsers and have no need to implement support for cookies. This means that every request by a bot is separate from every other, as our standard mechanism for linking requests together will not work. If the system starts a new session, it will have to do this for every new request from a bot. There will never be a logout from the bot to terminate the session, so each bot-related session will last for the time set for automatic expiry.

Clearly it is inadvisable to bar bots, since most sites are anxious to gain search engine exposure. But it is possible to build session handling so as to limit the workload created by visitors who do not permit cookies, which will mostly be bots. When we move into implementation techniques, the mechanisms will be demonstrated.

Session data and scalability

We could simply let PHP take care of session data. It does that by writing a serialized version of any data placed into $_SESSION into a file in a temporary directory. Each session has its own file.

But PHP also allows us to implement our own session data handling mechanism. There are a couple of good reasons for using that facility, and storing the information in the database. One is that we can analyze and manage the data better, and especially limit the overhead of dealing with search engine bots. The other is that by storing session data in the database, we make it feasible for the site to be run across multiple servers. There may well be other issues before that can be achieved, but providing session continuity is an essential requirement if load sharing is to be fully effective. Storing session data in a database is a reliable solution to this issue.

Arguments against storing session data in a database include questions about the overhead involved, constraints on database performance, or the possibility of a single point of failure. While these are real issues, they can certainly be mitigated. Most database engines, including MySQL, have many options for building scalable and robust systems. If necessary, the database can be spread across multiple computers linked by a high speed network, although this should never be done unless it is really needed. Design of such a system is outside the scope of this article, but the key point is that the arguments against storing session data in a database are not particularly strong.

PHP 5 CMS Framework Development Expert insight and practical guidance to creating an efficient, flexible, and robust framework for a PHP 5-based content management system
Published: June 2008

Exploring PHP—frameworks of classes

A major reason for committing to PHP5 is to take advantage of the improved possibilities for object orientation. One significant part of OO is the creation of frameworks of classes. In the next section, we shall be building the main elements of the session handling classes, which form a simple framework.

What is a framework of classes? Design has to be done at different levels. Sometimes we look at the whole picture, and sometimes we struggle with the detailed mechanism of a single method. In between, the aim of OO is to break down the problem into objects that act as a model. They represent important features of the problem. But we often find that some of the objects that are used to build the model have very close ties.

The idea of a framework is quite a loose one. There will sometimes be formal connections between the classes that make up a framework, such as one being a subclass of another. Other times, there will be no formal relationship, but the design means that classes work together to achieve a result.

We can find a concrete example in the classes to be developed, used for session handling. It makes sense to have a session object within the CMS framework. The properties of the session and the behaviors needed for session management can be gathered together in a session object. Detailed design indicates that the behaviors and properties of the session object are slightly different, according to whether we are working on the interface that is exclusive to administrators, or on the general interface. In a situation like that, it makes sense to have one class for administrators and another for users, with the common properties and methods held in a parent class.

The first part of the session handling structure is an abstract class, which we can call aliroSession. It is called abstract because it is never actually made into an object. Two slightly different classes extend aliroSession with extra properties and methods, and they are aliroUserSession and aliroAdminSession. They are not abstract classes, because they will be used to create a session object, depending on which interface is being handled. In fact, they could be declared final since we have no intention of extending them further, at least for the moment.

In fact, these classes can behave as singletons, since there is no sense in having more than one session object to handle a single request. And in a case like this, it is often helpful to have a Factory Method (in capitals because it is a well known pattern) to obtain the object. The factory method could be in another class, but there is no particular logic to creating an extra class to hold factory methods, so the factory method can be a static method in the abstract class. Specifically, the session object can be obtained by calling aliroSession::getSession().

Although it may seem unlikely in this case, the scheme we have created allows for the possibility that the decision made in getSession could become more complicated, and there could be a larger number of possible classes used to create session objects. Callers would not need to know about those implementation details, provided the same interfaces could still be honored. Likewise, the class inheritance scheme will be more complicated in many real life situations.
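The shape of such a framework can be sketched in a few lines. The class names, the flag passed to the factory method, and the lifetime values below are purely illustrative and are not the Aliro classes. As in the design described, a single request is expected to use only one of the two concrete classes, so one shared static property is enough to hold the singleton.

```php
<?php
abstract class demoSession
{
    protected static $currentSession = null;
    protected $lifetime = 0;

    // Factory method: callers never name the concrete class.
    public static function getSession($isAdmin)
    {
        return $isAdmin ? demoAdminSession::getInstance()
                        : demoUserSession::getInstance();
    }

    public function getLifetime()
    {
        return $this->lifetime;
    }
}

final class demoUserSession extends demoSession
{
    // Protected constructor: objects can only be made via getInstance
    protected function __construct() { $this->lifetime = 3600; }

    public static function getInstance()
    {
        if (!is_object(self::$currentSession)) {
            self::$currentSession = new self();
        }
        return self::$currentSession;
    }
}

final class demoAdminSession extends demoSession
{
    protected function __construct() { $this->lifetime = 900; }

    public static function getInstance()
    {
        if (!is_object(self::$currentSession)) {
            self::$currentSession = new self();
        }
        return self::$currentSession;
    }
}
```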

Now there is a further twist, and it is the aliroSessionData class. It is not formally related to the other classes, but is obviously involved in the same area, and in Aliro is actually only used by the other session classes. Its design is therefore very closely tied to them, and we can regard the group of four classes as a framework, even though this last class is not formally connected to the others. As its name implies, aliroSessionData is the class that deals with the storage and retrieval of session data. This class also makes calls on the other session-related classes.

From this example, I hope that it becomes clear how frameworks of classes are an important vehicle for system design. Unfortunately, neither PHP nor most OO languages actually provide for restriction of access within a framework. Provided classes are related by subclassing, we can use protected to say that properties or methods are only accessible within the class hierarchy. But there is no way to say, for example, that the methods of aliroSessionData are only available to aliroSession and its subclasses. While that limitation remains, the best that can be done is to document the limitations.

The fact that there are interfaces that are technically public but intended for use only within a framework highlights the need to specify interfaces. A framework of classes should have a truly public interface, which is a description of the methods (and possibly properties) that are intended for use by code outside the framework.

Framework solution

Now let's develop some code that implements the ideas discussed in the first part of this article. It is built as two classes, one for the session itself and one for dealing with the storage of session data.

Building a session handler

Aliro deals with sessions using a singleton object to represent the current session. Another singleton object handles session data, and is described later. The session object is obtained from a very simple factory method that is located in aliroSession:

public static function getSession () {
    return _ALIRO_IS_ADMIN ? aliroAdminSession::getInstance() :
        aliroUserSession::getInstance();
}

Different code is run depending on whether the CMS is entered through the administrator or the general interface, and different definitions are set for a number of symbols, including _ALIRO_IS_ADMIN. The fact that these settings are made in very early processing makes it difficult to subvert the information. The singleton session object is an instance of one of the classes aliroUserSession or aliroAdminSession as appropriate. Both of these classes inherit common features from the abstract class aliroSession.

The getInstance methods are slightly different between the two session classes. First, let us review what goes on in the user session class. The code starts off as follows:

public static function getInstance () {
    if (!is_object(self::$currentSession)) {
        self::$currentSession = new self();
        if (!self::$currentSession->checkValidSession()) {
            // Must be a new visitor
            self::$currentSession->setNew();
            $_SESSION = array();
            self::$currentSession->setNewUserData(new aliroAnyUser());
            $_SESSION['aliro_user_session_start'] = date('Y-M-d H:i:s');
            if (self::$currentSession->orphandata AND
                self::$currentSession->loginCookieValue() AND
                aliroSEF::getInstance()->isValidURI('/login')) {
                aliroRequest::getInstance()->redirect(
                    aliroSEF::getInstance()->urilink('/login'));
            }
        }
        else {
            if (isset($_SESSION['aliro_logout'])) $_SESSION['aliro_logout']++;
            if (isset($_SESSION['aliro_login'])) $_SESSION['aliro_login']++;
        }
    }
    return self::$currentSession;
}

A couple of object properties are declared to supplement those in the abstract base class, and as you might expect they are declared differently in the administrator session class. As a singleton class, the constructor will be called only once, and when it is called, most of the work is done in the parent abstract class. We'll look at that later. One thing that is different between administrator and user is the session life that is set. For both, the time is configurable, with a reasonable minimum imposed by the system.

The method that is called to obtain a session object via the factory method is the static getInstance. This follows standard singleton logic, returning the single instance if it already exists. Otherwise, a new instance of the class is created, followed by a check on session validity. As the checks are common to both user and administrator, they are implemented in checkValidSession, a method in the parent abstract class.

If the session is not one we already know about, then a new visitor is assumed. Any data in $_SESSION is scrapped as irrelevant, and a null user object is used to represent a visitor in the session's store of user data. Partly for testing purposes, the time the session started is recorded.

There is some additional code to help a user who tries to do something but finds that their session has expired. If possible, they are allowed to log in again, and their request is then processed.

Finally, either the existing or the newly created session object is returned. The corresponding code for the administrator is slightly different, and a bit simpler:

class aliroAdminSession extends aliroSession {
    protected $_prefix = 'admin';
    public $isadmin = 1;

    protected function __construct () {
        parent::__construct();
        // Configurable lifetime, with a minimum of 300 seconds
        $this->lifetime = max(aliroCore::getInstance()->getCfg('adminlife'), 300);
    }

    public static function getInstance () {
        if (!is_object(self::$currentSession)) {
            self::$currentSession = new aliroAdminSession();
            if (!self::$currentSession->checkValidSession()) {
                // self::$currentSession->logout(true);
                $_SESSION = array();
                // Delete the cookie by giving it an expiry time in the past
                setcookie ('aliroAdminSession', 0, time()-7*24*60*60, '/');
            }
        }
        return self::$currentSession;
    }
}

As we expected, a couple of properties are set differently. The constructor is exactly the same except that the session lifetime is obtained from a different configuration value. Likewise, the singleton mechanism is exactly the same, as is the check for a valid session. However, if the session is not validated, the intention is that a logout is performed to ensure no administrator data is retained (in the listing shown that call is commented out), in addition to $_SESSION being cleared and the aliroAdminSession cookie being deleted. On the user side, we have the concept of a visitor, someone who is looking at the site but has not logged on. For the administrator services, this makes no sense, so the only outcomes from starting a session are either that the user already has a valid login as administrator, or a login is required.

To make sense of the setting of an aliroAdminSession cookie, an explanation of an Aliro facility is needed. It is common to close a site when tricky maintenance operations are being performed. This blocks normal access to the site. But that is a nuisance to the administrator working on the site who may want to see how the site is functioning without letting everyone have access. The solution is to permit access to the closed site only to people who have another live session (it must be on the same computer and in the same browser so that the cookies can be shared) as an administrator. This way, most people are excluded from the site until work is finished, but anyone with an administrator login can still obtain access. Not only can they obtain access, they are free to work as a visitor, or to log into the user side as any user, not necessarily an administrator user. The feature is implemented by setting the aliroAdminSession cookie when an administrator logs in, hence it needs to be deleted in session processing when an administrator session fails to validate.

As you can see, cookies are deleted by being set with an expiry time that has already passed. Do not attempt too much precision with cookie expiry times, as computer times are often not accurately synchronized.

To see how the cookie is used, let's look at the static helper method that tells whether an administrator is present on the same computer, and browser. The method is aliroSession::isAdminPresent, and it consists of the following code:

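public static function isAdminPresent () {
    if (!isset($_COOKIE['aliroAdminSession'])) return false;
    $admin_session = $_COOKIE['aliroAdminSession'];
    // The cookie comes from the browser, so accept only characters that
    // can occur in a PHP session ID before using the value in SQL
    if (!preg_match('/^[A-Za-z0-9,-]+$/', $admin_session)) return false;
    $database = aliroCoreDatabase::getInstance();
    $database->setQuery("SELECT COUNT(session_id) FROM #__session WHERE session_id = '$admin_session' AND isadmin = 1");
    return $database->loadResult() ? true : false;
}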

Where an administrator cookie is found, its value is checked for whether it's a valid administrator session ID before being counted as acceptable.

Creating a session

The real work of creating a session is done in the constructor of the abstract class aliroSession. You will recall that the concrete class constructors called the parent constructor. It works like this:

protected function __construct() {
    $this->time = time();
    if ('crontrigger' == @$_REQUEST['option']) return true;
    ini_set('session.use_cookies', 1);
    ini_set('session.use_only_cookies', 1);
    $name = md5('aliro_'.$this->_prefix.$this->getIP().
        _ALIRO_ABSOLUTE_PATH);
    session_name($name);
    if (!session_id()) {
        $sh = aliroSessionData::getInstance();
        session_set_save_handler(
            array($sh,'sess_open'), array($sh,'sess_close'),
            array($sh,'sess_read'), array($sh,'sess_write'),
            array($sh,'sess_destroy'), array($sh,'sess_gc'));
        session_start();
    }
}

Because the class is a singleton, the constructor is a protected method. It cannot be called from outside the class, and in fact is only triggered through the getInstance method.

Before a session is actually started, there is a check whether the $_REQUEST superglobal contains a value of crontrigger with a key of option. This occurs only when Aliro is run from a command, such as from a cron timer. In this kind of situation, there is no user, no browser, and it is not appropriate to start a session. Otherwise, we start to set up a session.

A session always needs to know the current time, and then the restrictions on using the URI for passing the session ID, as we just discussed, are enforced. The session name is constructed in an obscured form.

We can check that there is not already a session in operation by checking the return from PHP function session_id. Provided there is no current session, the session data handler is started and PHP is told to use it. Finally, PHP is asked to start the session.
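To show what PHP expects of the six callbacks passed to session_set_save_handler, here is a stripped-down handler. It is only a sketch: the class name is invented, and an array stands in for the database table, so nothing would actually survive from one request to the next. The method names follow those used in the constructor above.

```php
<?php
class demoSessionStore
{
    // session ID => array('data' => string, 'time' => timestamp)
    private $rows = array();

    public function sess_open($savePath, $sessionName) { return true; }

    public function sess_close() { return true; }

    public function sess_read($id)
    {
        // Must return a string; empty when the session is unknown
        return isset($this->rows[$id]) ? $this->rows[$id]['data'] : '';
    }

    public function sess_write($id, $data)
    {
        $this->rows[$id] = array('data' => $data, 'time' => time());
        return true;
    }

    public function sess_destroy($id)
    {
        unset($this->rows[$id]);
        return true;
    }

    public function sess_gc($maxLifetime)
    {
        // Remove sessions not written within the last $maxLifetime seconds
        $cutoff = time() - $maxLifetime;
        foreach ($this->rows as $id => $row) {
            if ($row['time'] < $cutoff) {
                unset($this->rows[$id]);
            }
        }
        return true;
    }
}
```

PHP passes the session data to the write callback already serialized as a string, so the handler only has to store and return it unchanged.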

All of this happens early on in the handling of a request, because the main logic gets a session object from the factory method. The code illustrated so far is sufficient to set up a session, so that the rest of the system is able to operate within an environment where data can be preserved simply by storing it in $_SESSION. What has not yet been done is to sort out what has to happen when the new session is checked. This is where we can enforce time limits on inactivity. But first a digression on finding the IP address of our client, as used in concocting the session name.

Finding the IP address

Getting an IP address sounds easy enough, especially as PHP provides what appears to be the answer in the $_SERVER super-global. Unfortunately, the availability of proxy browsing means that this will sometimes be the wrong answer. There is no completely reliable way to find out the IP address, not least because the forwarding headers are supplied by the client or proxy and can be forged, but this is the best code I have found so far:

public function getIP() {
    if ($this->ipaddress) return $this->ipaddress;
    $ip = false;
    if (!empty($_SERVER['HTTP_CLIENT_IP']))
        $ip = $_SERVER['HTTP_CLIENT_IP'];
    if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        // Entries may be separated by a comma with or without a space
        $ips = array_map('trim', explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']));
        if ($ip != false) {
            array_unshift($ips, $ip);
            $ip = false;
        }
        $count = count($ips);
        // Exclude IP addresses that are reserved for LANs
        // (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16)
        for ($i = 0; $i < $count; $i++) {
            if (!preg_match('/^(10|172\.(1[6-9]|2[0-9]|3[01])|192\.168)\./', $ips[$i])) {
                $ip = $ips[$i];
                break;
            }
        }
    }
    $this->ipaddress = (false == $ip AND isset($_SERVER['REMOTE_ADDR'])) ?
        $_SERVER['REMOTE_ADDR'] : $ip;
    return $this->ipaddress;
}

The test near the end illustrates the technique of error avoidance mentioned earlier. Writing the test the other way round, as $ip == false, will not create an error if only a single equals sign is coded by mistake. But the meaning will be radically changed!
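
The difference is easiest to see side by side. The following is a small illustrative sketch rather than Aliro code; the function names needsFallbackMistyped and needsFallback are invented for the example.

```php
<?php
// Illustrative only: these helper names are hypothetical, not part of Aliro.

// Variable on the left: mistyping "==" as "=" still parses, but it assigns
// false to $ip, so the condition is always false and the original value is lost.
function needsFallbackMistyped($ip) {
    if ($ip = false) {
        return true;
    }
    return false;
}

// Constant on the left: the same slip, "false = $ip", is a parse error,
// so the mistake cannot reach a running system unnoticed.
function needsFallback($ip) {
    if (false == $ip) {
        return true;
    }
    return false;
}
```

When no address has been found, needsFallback(false) reports that a fallback is required, whereas the mistyped version never does, whatever it is given.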

Validating a session

So now we have the makings of a session, but it is not safe to use the data linked to the session until we have checked that it is currently valid. To keep track of sessions, each one is recorded in the database. A bit of organization is needed around the check of session against database, so it is best to implement a checkValidSession method. This is the method that was used in the getInstance methods for user or administrator:

protected function checkValidSession () {
if ('crontrigger' == @$_REQUEST['option']) return true;
$this->session_id = session_id();
if ($this->session_id) {
// We try to update the time stamp in the matching record of the session table
$result = $this->updateTime();
if (!$result) {
setcookie('aliroCookieCheck', 'A', time()+365*24*60*60, '/');
$this->saveOrphanData();
$this->session_id = '';
}
return $result;
}
else {
trigger_error(T_('No session ID found, although aliroSession has been instantiated'));
return false;
}
}

First, a check is needed for the system being run from a command rather than a web request. Otherwise, we perform a sanity check that a session has really been started, and then use another method, updateTime, to attempt to update the session record in the database with the current time. If that fails, then we know that the session we are trying to start is not a valid continuation of any existing session. The session ID is set to an empty string to suppress any attempt to delete the session from the database. A new ID will be created automatically elsewhere in the code. Also, a cookie is written, really as a check to see whether cookies will be handled by this visitor to our site. The importance of that will become clear when we review how to handle visitors who refuse cookies. Any GET or POST data is saved for an attempt to salvage a user request if the user can successfully log in. The return value indicates whether the session passed muster as a continuation of an existing one.

There is no need to go through all of the time check routine, but it is worth looking at how the query is done:

$past = $this->time - $this->lifetime;
$database->doSQL("UPDATE #__session SET time = '$this->time', marker = marker+1 WHERE session_id = '$this->session_id' AND isadmin = $this->isadmin AND time > $past");
return ($database->getAffectedRows()) ? true : false;

First, the earliest time for a valid session is computed by subtracting the maximum session lifetime from the current time. If you remember, the session object's time was set to the current time in the constructor. What we want now is to find out whether there is a session record in the database that matches our session ID, is of the correct type (user or administrator), and has not expired. If we find such a record, we want to update it to the current time.

The whole operation can be done economically with a single SQL operation, because databases do not mind if an UPDATE query fails to match anything. In fact, we can ask the database whether the query matched anything, and the result tells us whether there was a valid, matching session record. If there was, it has been updated.

Just one small but essential point: the database field marker has no particular use in the application, but it is vital to the single query technique. The time is not especially granular (whole seconds), and in cases where a request is redirected back to the site, a session may be checked twice within the same second. Without marker, the database would find a matching record but would not need to update it, since the time has not changed. Our session would be treated as invalid, because no rows were affected! To prevent this, marker is simply a count that is always incremented. This way, if there is a matching record, we can be sure it will be affected by the query, and the database will give us a non-zero count when we ask how many rows were affected.
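
The effect of marker can be seen in a toy model. The sketch below does not talk to a database; it is a hypothetical in-memory stand-in for a single session row, written on the assumption (as described above) that a matched row only counts as affected when a stored value really changes. The function name touchSessionRow and the sample values are invented for the illustration.

```php
<?php
// Toy in-memory stand-in for one row of the session table (hypothetical).
// It mimics the rule described in the text: a matched row only counts as
// "affected" when the update actually changes a stored value.
function touchSessionRow(array &$row, $now, $lifetime, $useMarker) {
    $past = $now - $lifetime;
    if ($row['time'] <= $past) {
        return 0;               // expired: the WHERE clause does not match
    }
    $affected = 0;
    if ($row['time'] != $now) {
        $row['time'] = $now;    // the time stamp really changes
        $affected = 1;
    }
    if ($useMarker) {
        $row['marker']++;       // always changes, so the row is always affected
        $affected = 1;
    }
    return $affected;
}

// Two checks arriving in the same second as the stored time stamp
$row = array('time' => 1000, 'marker' => 0);
$withoutMarker = touchSessionRow($row, 1000, 900, false);
$withMarker = touchSessionRow($row, 1000, 900, true);
```

Here $withoutMarker is zero even though the session is perfectly valid, which is exactly the false negative the marker field is there to prevent; $withMarker is one.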

It is well worth using devices like this. Saving one SQL operation may not be much, but the effect is cumulative. Both for performance and maintenance, the best code is the code that is not there!

Remembering users

A useful facility that some sites like to have is for the login details to be remembered so that there is no need to enter them on each fresh visit. The function is not strictly a part of session handling, but it is included as a kind of helper method. It utilizes the session object's knowledge of whether or not this is a new session. The processing of rememberMe is not appropriate at any other time.

Remembering users is done by setting a cookie with a long expiry time. Clearly it only functions when the user returns with the same browser on the same computer. It is equally obvious that users should be discouraged from using the facility on computers that are open to other people. When we want to see if there is any remembered user login available, these methods are used:

public function rememberMe ($request) {
if (!$this->newsess OR $this->userid) return;
$loginfo = $this->rememberedUser();
if ($loginfo) {
// If the login is successful, then the session data will be updated
// In any case, the return will be set either to user data or to null
$message = aliroUserAuthenticator::getInstance()->systemLogin
($loginfo->username, $loginfo->password, 1);
if ($message) $request->setErrorMessage(T_('Remember Me login failed'),
_ALIRO_ERROR_WARN);
else {
aliroUser::reset();
$this->newsess = false;
}
}
}
private function rememberedUser () {
$usercookie = isset($_COOKIE['usercookie']) ? $_COOKIE['usercookie'] :
null;
if ($usercookie AND !empty($usercookie['username']) AND
!empty($usercookie['password'])) {
return new aliroLoginDetails($usercookie['username'],
$usercookie['password'], true);
}
return null;
}

First come a few checks: that the session is new, that the user is not logged in (has an ID of zero), and that the appropriate cookie is present. The cookie must contain a username and password. Don't forget that the user has control over cookies and therefore can fake them. For example, the very useful web developer add-on for Firefox provides an easy way to view or update any cookie. So it is essential to apply the normal login checks. If all goes well, then the user object for the current session is reset, which gives it all the user session data that was created during login.

The code shown previously stores the username and password in plain text. For greater security, it would be better to encrypt this information. Provided the PHP mcrypt functions are available on the server to provide two-way encryption services, this can be done quite easily. A useful key might be the salt. (A description of the salt is outside the scope of this article.)

PHP 5 CMS Framework Development Expert insight and practical guidance to creating an efficient, flexible, and robust framework for a PHP 5-based content management system
Published: June 2008

Completing session handling

The Aliro session classes have several more methods, but there is little that is worth looking at in detail. All the important principles have been described in the preceding sections. The functions of the main remaining methods are:

  • setSessionData: It is passed a user object when it is invoked, and stores the user details in session data (as part of the $_SESSION superglobal). It is typically called after a login, or on the arrival of a new visitor (represented by a minimal user object), to preserve the user information across multiple requests. The session object is written to the database as a record in the session table.
    Security tip! Whenever this method is called, it regenerates the session ID to reduce the likelihood of the session being hijacked or fixed.

  • purge: It is called at the end of setSessionData to remove any expired sessions from the database session table. It is also called by the session data handling class when its own timeout processing is invoked. Although simply removing entries from the table will effectively terminate a login, it is better to find out which users have logins that are being expired. Then any plugin that wants to know about user logout can be triggered. In Aliro, that processing happens for the user side, but not the administration side. Finally, the purge processing triggers a tidy up of session data, as described later.
  • logout: It is implemented in different methods for user or administration sessions, unlike the previous methods. In either case, the session data is deleted. For an administration session, the only other action is to delete the cookie that shows such a session is active. For a user session, the session is continued, but it retains only the information relating to a visitor who is not logged in to the system. This puts the person using the system in the same position after logout as they would be if they had just arrived at the site as a visitor.
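
The security tip attached to setSessionData can be sketched with PHP's own session_regenerate_id function. This is a minimal illustration and not Aliro's method; the function name storeUserInSession and the 'user' key are assumptions made for the example.

```php
<?php
// Hypothetical helper, not Aliro code.
function storeUserInSession(array $userDetails) {
    // Start a session only if none is active yet
    if ('' === session_id()) {
        session_start();
    }
    // Issue a fresh session ID and discard the old session record, so that
    // an ID known to an attacker before login is useless afterwards.
    session_regenerate_id(true);
    $_SESSION['user'] = $userDetails;
}
```

Passing true asks PHP to delete the data stored under the old ID, which with a custom data handler results in a call to its destroy method.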

Session data

For the reasons discussed earlier, Aliro implements a simple session data handling class using the database. The constructor of the abstract session class started things off by creating an instance of the data handling class, and calling the PHP function session_set_save_handler. Since we will always want to have a single session data handler, the class is written as a singleton in the usual way.

The constructor for the session data handler would be very simple if it were not for the problem of initial installation of the whole system. When the system is being installed, the database does not exist. Because of that, we cannot store any data in it, and the handler has to work differently. The constructor is therefore as follows:

private function __construct () {
if (aliro::getInstance()->installed)
$this->db = aliroCoreDatabase::getInstance();
}

The condition in the constructor is a check on whether installation has been completed, and configuration information written to disk. Only if these setup jobs are out of the way can we get access to the secure database for storing session data.
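
To show how a data handler plugs into PHP, here is a minimal sketch that keeps session data in an array instead of the database. It is not Aliro's class: the name ArraySessionData is invented, and it uses PHP's built-in SessionHandlerInterface (with return type declarations, so PHP 7 or later is assumed) rather than the individual callback methods such as sess_read and sess_write that Aliro registers.

```php
<?php
// Hypothetical in-memory handler; Aliro's real class stores rows in the database.
class ArraySessionData implements SessionHandlerInterface {
    private $store = array();

    public function open($savePath, $sessionName): bool {
        return true;            // nothing to open for an array
    }

    public function close(): bool {
        return true;
    }

    public function read($id): string {
        // An unknown session yields empty data, never an error
        return isset($this->store[$id]) ? $this->store[$id]['data'] : '';
    }

    public function write($id, $data): bool {
        $this->store[$id] = array('data' => $data, 'time' => time());
        return true;
    }

    public function destroy($id): bool {
        unset($this->store[$id]);   // harmless if the entry does not exist
        return true;
    }

    public function gc($maxLifetime): int {
        // Remove entries that have not been written within the lifetime
        $removed = 0;
        foreach ($this->store as $id => $entry) {
            if ($entry['time'] < time() - $maxLifetime) {
                unset($this->store[$id]);
                $removed++;
            }
        }
        return $removed;
    }
}
```

Registration would then be a single call, session_set_save_handler(new ArraySessionData(), true), made before session_start. An array obviously does not survive the end of the request, so this only demonstrates the shape of the interface.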

Session data and bots

If we treat every request the same, then a session will be started for each request that does not provide a cookie showing that it is a continuation of an existing session. When search engine bots are very active, this can result in a lot of data being stored unnecessarily. Normally, the bots will not accept cookies, so each bot request is liable to start another session. Any session data will be stored, entirely fruitlessly since the bot will never present the cookie that is needed for the data to be retrieved. If session data is being stored in files, many useless files are created. Likewise if the database is used, the table is likely to contain many useless records.

To combat this, whenever a new request is encountered, Aliro stores its session data in a cookie. The quantity of data on a first request is not likely to be especially high, so the typical size limit of 4000 characters is not a concern. Obviously, the bots will ignore the cookie, but the data in it was going to be wasted anyway. This way, the session data table in the database will contain only information about real sessions that are ongoing. The write method for session data is therefore as follows:

public function sess_write ($session_id, $session_data) {
if ((!isset($_COOKIE['aliroCookieCheck']) AND !isset($_COOKIE['usercookie'])) OR !$this->db) {
if (!headers_sent()) setcookie ('aliroTempSession', base64_encode($session_data), 0, '/');
return true;
}
if (isset($_COOKIE['aliroTempSession'])) setcookie ('aliroTempSession', null, time()-7*24*60*60, '/');
$session_id = $this->db->getEscaped($session_id);
$session_data = base64_encode($session_data);
$this->db->doSQL("UPDATE #__session_data SET session_data = '$session_data', marker = marker+1 WHERE session_id_crc = CRC32('$session_id') AND session_id = '$session_id'");
if (0 == $this->db->getAffectedRows()) $this->db->doSQL("INSERT INTO #__session_data (session_id, session_id_crc, session_data) VALUES ('$session_id', CRC32('$session_id'), '$session_data')");
return true;
}

The first checks are designed to find out whether cookies are being accepted. If we receive a rememberMe cookie, we know immediately that cookies are being accepted, and we don't need to use the temporary cookie device. Every time a request comes along that is not linked up with an existing session, the session class tries to write a cookie with the name aliroCookieCheck. So if a cookie of that name is received, we know that we are dealing with a follow-up request. If neither of these applies, we write the temporary session data cookie. There is also the situation during installation where no database is yet available, so this is also handled by writing session data as a cookie. Processing during the limited period of a fresh installation is not likely to place serious demands on session data handling, but a call to the database would be disastrous.

As mentioned earlier, cookies cannot be written after headers have been sent to the browser. PHP will tell us whether that has happened through its headers_sent function. Aliro tries to ensure that session data is written before output of XHTML starts and the sending of headers is triggered, but it is difficult to absolutely guarantee this. If it is too late, we simply abandon the session data. On a first or an isolated request, that is unlikely to do too much harm. All being well, the session data is encoded to avoid any problems with difficult characters, and written as a cookie. The expiry time is given as zero, which makes it a session cookie that is deleted automatically when the browser is closed. There is no reason to preserve session data beyond the closing down of the browser.
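
Because base64 encoding enlarges the data by about a third, it can be worth checking the encoded size before relying on a cookie. The sketch below is not part of Aliro; the function name encodeSessionForCookie and the default limit of 4000 characters (taken from the figure mentioned earlier) are assumptions for illustration.

```php
<?php
// Hypothetical helper: encode session data for a cookie, or return null
// when the encoded form would exceed the assumed size limit.
function encodeSessionForCookie($sessionData, $limit = 4000) {
    // Every 3 bytes of input become 4 characters of output
    $encoded = base64_encode($sessionData);
    return (strlen($encoded) <= $limit) ? $encoded : null;
}
```

A caller could abandon the temporary cookie when null is returned, in the same way that the data is abandoned when headers have already been sent. Note that setcookie URL-encodes the value it is given, so the cookie as transmitted may be somewhat larger than the string measured here.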

If we have established that cookies are accepted and the database is available, then we are probably not handling a bot or a fresh installation. It should therefore be possible and worthwhile to write the session data to the database, where there is a much more generous limit on the amount of data that can be stored. If there is still a temporary session data cookie in existence from a previous request, it is deleted.

Note that the session ID is escaped before being used in a SQL statement. Since it comes from a cookie, there is always a risk of it being tampered with by a cracker, so to protect against SQL injection it is necessary to escape it before putting it into SQL. The session data is encoded so as to handle all kinds of special characters without problems. Finally, the database operation is done. An UPDATE is attempted first; only if it affects no rows, because there is not yet a record for the current session, is an INSERT issued. The marker field again guarantees that an existing record always counts as affected.

Retrieving session data

Now that we have figured out how to handle the write operations, reading back the data is relatively simple:

public function sess_read ($session_id) {
if (isset($_COOKIE['aliroTempSession'])) return
base64_decode($_COOKIE['aliroTempSession']);
if (!isset($_COOKIE['aliroCookieCheck']) OR !isset($this->db))
return '';
$session_id = $this->db->getEscaped($session_id);
$this->db->setQuery("SELECT session_data FROM #__session_data WHERE session_id_crc = CRC32('$session_id') AND session_id = '$session_id'");
return base64_decode($this->db->loadResult());
}

If we wrote a temporary session data cookie and received it back again in the $_COOKIE super-global, then we know that cookies are working and that this must be a subsequent request. The data from the cookie is returned as the session data. As we now know that cookies are being accepted, we also know that when this request's session data is written, the temporary session data cookie will be deleted, so it is not necessary to do so just yet.

If we have not already obtained session data from a cookie, some more checks are needed. If the check cookie is not available in $_COOKIE, then we do not yet have a viable session. Likewise, if no database is available because installation is going on, then nothing more can be done. So, in both these cases, empty session data is returned.

Provided all these hurdles are overcome, which they often will be, the session ID is escaped and used to look up the session data in the database. The data is then decoded and returned to the caller.

Keeping session data tidy

Our session data handler can be asked to delete a session, a process that follows similar logic to the one just described:

public function sess_destroy ($session_id) {
setcookie ('aliroTempSession', null, time()-7*24*60*60, '/');
if (!isset($_COOKIE['aliroCookieCheck']) OR !isset($this->db))
return true;
$session_id = $this->db->getEscaped($session_id);
$this->db->doSQL("DELETE FROM #__session_data WHERE session_id_crc = CRC32('$session_id') AND session_id = '$session_id'");
return true;
}

As you can see, deletion is simpler than reading, since the temporary session data cookie can be deleted regardless of whether it presently exists. Provided cookies are accepted and the database is available, the relevant session data record can be deleted. It does not matter if there is no such record, since SQL deletions simply delete whatever matches the WHERE condition, and do not mind if nothing matches.

In principle, keeping things tidy on the basis of expiration is a more complicated task. But here the session class can do nearly all of the work for us. The interface with PHP for a session data handler is required to implement a method to handle session expiry, and is passed a timeout value in seconds. The method is as simple as this:

public function sess_gc ($timeout) {
$session = aliroSession::getSession();
$session->purge($timeout);
}

All that happens is that we get the current session object and ask it to carry out a purge, passing the timeout. This relies on the session handler's ability to deal with the timeout of sessions. The last thing the purge does is to call back to the session data handler to remove any data that is no longer linked to a session. So aliroSessionData also contains this very brief method:

public function sess_destroy_orphans () {
if ($this->db) {
$this->db->doSQL("DELETE LOW_PRIORITY d FROM `#__session_data` AS d LEFT JOIN #__session AS s ON d.session_id = s.session_id WHERE s.session_id IS NULL");
$this->db->doSQL("OPTIMIZE TABLE `#__session_data`");
$this->db->doSQL("OPTIMIZE TABLE `#__session`");
}
}

Provided the database is available, a single request removes any entries from the session data table that do not have corresponding entries in the session table. This relies on the fact that when a session is still valid, its session ID will be found in the session table, so where null is returned by the LEFT JOIN we know that we are dealing with redundant session data. Bear in mind that LEFT JOIN can be a very slow query, and too much use of it should be avoided. But in this case, it is the neatest way to clean up, and happens relatively infrequently.
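
The logic of that query can be pictured with two PHP arrays keyed by session ID. This is only a toy model with invented values, not a substitute for the SQL: array_diff_key plays the part of the LEFT JOIN combined with the IS NULL test.

```php
<?php
// Toy data keyed by session ID (hypothetical values).
$sessionTable = array('aaa' => 1000, 'bbb' => 1005);
$sessionDataTable = array(
    'aaa' => 'data-a',
    'bbb' => 'data-b',
    'ccc' => 'data-c',   // no matching session: an orphan
);

// Session data whose ID has no partner in the session table, the
// equivalent of the joined rows where s.session_id IS NULL.
$orphans = array_diff_key($sessionDataTable, $sessionTable);

// Deleting them leaves only data that belongs to a live session.
$sessionDataTable = array_diff_key($sessionDataTable, $orphans);
```

After this runs, $orphans holds only the entry for 'ccc', and $sessionDataTable retains the entries for 'aaa' and 'bbb'.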

Summary

I hope that this article has dispelled any mystique that may have surrounded sessions. The need for them has been established, and the security problems reviewed and, as far as possible, overcome. The quirky behavior of search engine bots has been reviewed, and session handling adapted to be relatively impervious to their demands.

We have built the greater part of a session handling class, and explored its workings. The full class can be downloaded as part of Aliro, as with all the code discussed here. The benefits of building our own code to handle session data have been considered and a class built to do the work.


About the Author:

Martin Brampton

Now primarily a software developer and writer, Martin Brampton started out studying mathematics at Cambridge University. He then spent a number of years helping to create the so-called legacy, which remained in use far longer than he ever expected. He worked on a variety of major systems in areas like banking and insurance, spiced with occasional forays into technical areas such as cargo ship hull design and natural gas pipeline telemetry.

After a decade of heading IT for an accountancy firm, a few years as a director of a leading analyst firm, and an MA degree in Modern European Philosophy, Martin finally returned to his interest in software, but this time transformed into web applications. He found PHP5, which fits well with his prejudice in favor of programming languages that are interpreted and strongly object oriented.

Utilizing PHP, Martin took on development of useful extensions for the Mambo (and now also Joomla!) systems, then became leader of the team developing Mambo itself. More recently, he has written a complete new generation CMS named Aliro, many aspects of which are described in this book. He has also created a common API to enable add-on applications to be written with a single code base for Aliro, Joomla (1.0 and 1.5) and Mambo.

All in all, Martin is now interested in many aspects of web development and hosting; he consequently has little spare time. But his focus remains on object oriented software with a web slant, much of which is open source. He runs Black Sheep Research, which provides software, speaking and writing services, and also manages web servers for himself and his clients.
