Web Content Management (WCM)
Modern web site implementations are architected with two distinct sets of capabilities: creation and maintenance of content, and delivery of that content to the target audiences. Creation and maintenance of web content is called Web Content Management (WCM) and is a relatively new discipline in the history of web technology.
In the early days, web sites were simple with static content that changed infrequently. Modification of the content was not the biggest of challenges and such changes were typically performed by technical developers. Web technology and the associated disciplines have come a long way since those times. Today there are well-defined disciplines involved in creation of web presence such as information architecture, visual design, content strategy, and site development. Products of these disciplines are merged together and realized through enterprise application technology, which adds functionality and dynamic nature to web sites.
Specialization of roles among these disciplines, as well as the complexity of modern technology, has made content updates difficult and error-prone when performed directly on stored content. A WCM system makes it easier to create and update content by hiding the technical details related to the storage and delivery of content. A WCM system becomes almost imperative for sites with highly dynamic content that may also need to be delivered on multiple channels such as web, handheld devices, and print media. In such situations, content can be created once and rendered automatically for different channels.
The WCM products available today vary widely in their capabilities, their underlying technologies and frameworks, and the extent to which they use open-source products as architecture components. Many WCM systems also offer presentation capabilities in order to facilitate delivery of content, at least for the web medium. When discussing WCM features, it is useful to identify what WCM is and what it isn’t. WCM deals with creation, persistence, and maintenance of content. It may also include some support for presentation in terms of previewing the content being created or edited. However, fully featured presentation frameworks or technologies, such as portals, should be evaluated separately, since the majority of their concerns are significantly different.
Products with integrated WCM and portal capabilities offer the convenience of a one-stop solution for building, managing, and delivering web content. On the other hand, products with only WCM capabilities, or at least clearly separable WCM capabilities, offer the freedom to use rich and mature presentation technologies which may already be in place in the existing infrastructure. Let’s take a look at a few products that represent various points along the spectrum between these two extremes.
Microsoft SharePoint Server is a well-known product with a wide range of content management capabilities that go beyond WCM and portals. Similarly, EMC Documentum is a high-end platform for enterprise content management which can be used in almost any way enterprise content may be intended to be used. EMC Documentum provides products catering to specialized needs: Web Publisher for creating and managing web content, Site Caching Services for pushing content to web or application servers, and integration options with portals for presenting content. Interwoven and Vignette also offer high-end WCM systems.
There is also a wide range of open-source options for WCM which have developed large user bases. Joomla! (as well as Mambo, where Joomla!’s origins lie) offers an integrated WCM and portal technology built on the LAMP (Linux, Apache, MySQL, PHP) platform. Joomla! offers over 100 components for providing specific optional capabilities such as building user communities or incorporating shopping carts. Joomla! makes it very easy to set up a web site in minutes if you are willing to work with its content organization and presentation model.
Alfresco is an open-source platform developed with the stated goal of bringing enterprise content management (ECM) capabilities to open source. The Alfresco leadership team brings content management experience from companies such as Documentum, Business Objects, and SeeBeyond. Alfresco has matured with two major releases over a period of 2.5 years. Alfresco enables Document Management, Collaboration, Records Management, Knowledge Management, Web Content Management and Imaging – some of the most common applications of enterprise content management. Alfresco WCM 2.0 was released recently and offers some exciting built-in features that promote development efficiency and reduce infrastructure demands. We will explore these aspects in this article.
Every WCM system is expected to provide certain fundamental features. The core WCM feature is the ability to edit content through a user-friendly and, as much as possible, technology-neutral interface. The system also needs to provide security and the capability to allow a team to work on the content. WCM systems typically support versioning for content items. Finally, the content needs to be exported to a form suitable for web or application servers for delivery to the target audiences.
There are certain other features that are not necessarily required in a WCM system, but are considered desirable and are supported by many contemporary WCM products. One such feature is the ability to store content in XML format. XML content facilitates publishing the same content through multiple channels such as web, wireless devices, and print media. Another common and desirable feature is support for business processes or workflows. At a minimum, a simple approval process is desired, which can be used to review and approve the promotion of content changes to a live environment.
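To make the multi-channel idea concrete, here is a minimal Python sketch that renders one XML content item for two channels. The element names (pressRelease, title, body) and both rendering functions are assumptions made for illustration; they are not taken from any particular WCM product.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical press-release content item; the element names are assumptions.
PRESS_RELEASE_XML = """\
<pressRelease>
  <title>Acme &amp; Co. Opens New Office</title>
  <body>The new office opens in May.</body>
</pressRelease>
"""


def render_html(xml_text):
    """Render the content item as an HTML fragment for the web channel."""
    root = ET.fromstring(xml_text)
    title = escape(root.findtext("title", default=""))
    body = escape(root.findtext("body", default=""))
    return f"<h1>{title}</h1>\n<p>{body}</p>"


def render_text(xml_text):
    """Render the same content item as plain text, e.g. for a print or text-only channel."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    body = root.findtext("body", default="")
    return f"{title.upper()}\n\n{body}"


if __name__ == "__main__":
    print(render_html(PRESS_RELEASE_XML))
    print(render_text(PRESS_RELEASE_XML))
```

The content is authored once as XML, and each channel only adds its own presentation on top of it.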
The WCM features discussed so far focus on what a WCM system can do. However, some of the challenges for WCM initiatives deal with how a WCM system supports certain capabilities. Such aspects may significantly affect development efficiency, quality assurance, and infrastructure costs. We take a look at some of these concerns below.
Simple static HTML pages seem to belong to a bygone era. Even simple web sites today are usually dynamic, frequently contain media (images, videos), and utilize code and a database. Today, web site management deals with managing content, media, and code together. As a result, the line separating a web application from a web site is blurred. Note that sometimes the term “content” is used to refer to all the resources managed by a CMS. For sites utilizing code and media along with plain content, it becomes important that all of these resources are managed in sync. For example, if a change is made to include a new user attribute in the user profile, the corresponding code or configuration changes need to make it to the live environment, along with any presentation changes (web pages that will use this new attribute). These requirements become more complex when you consider that different team members may be working on related artefacts and that these changes need to go through the review process concurrently.
Another aspect of managing dynamic web sites is that often multiple web pages need to change to reflect one feature change in the web site. For the example above, the new user attribute may require a change to the user registration screen, user profile screen, and possibly some site functionality that utilizes the given feature. Thus, adding an email format preference to the user profile may require changes to three pages, two code components, and one configuration file. These affected artefacts together form a change set – they all need to go into the web site together or none of them should. If one or more pieces in the change set were left out, the change would not function as expected and would likely introduce a bug.
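The all-or-nothing nature of a change set can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the site is modelled as a dictionary of paths to content, and the file names, the promote function, and the not_empty check are all hypothetical.

```python
def promote(live, change_set, validate):
    """Return a new site state with the whole change set applied.

    If any item fails validation, raise an error and apply nothing.
    The live dictionary passed in is never modified.
    """
    rejected = [path for path, content in change_set.items()
                if not validate(path, content)]
    if rejected:
        raise ValueError(f"change set rejected, nothing promoted: {sorted(rejected)}")
    updated = dict(live)
    updated.update(change_set)
    return updated


def not_empty(path, content):
    """Hypothetical review check: an artefact must have some content."""
    return bool(content.strip())


# Hypothetical artefacts for the email format preference example:
# three pages, two code components, and one configuration file.
live_site = {
    "register.html": "v1", "profile.html": "v1", "newsletter.html": "v1",
    "UserProfile.java": "v1", "MailFormatter.java": "v1",
    "user-attributes.xml": "v1",
}
email_pref_change = {path: "v2" for path in live_site}

new_site = promote(live_site, email_pref_change, not_empty)
```

If a single artefact in the change set is rejected, promote raises an error and the live site keeps its previous state in full.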
Change sets bear some similarity to transactions and versions. The description above reflects the similarity to a transaction, since all the changes should either be committed or discarded together. The similarity to versions shows up when a change needs to be undone. Suppose a change set implementing a particular feature was promoted to the live environment. It worked as designed and expected. However, it led to unexpected business impacts and now this change needs to be undone. The WCM system should make it easy to roll back the changes introduced by a change set. If the system kept track of web site snapshots (the state of the complete site after each change set was promoted), it would be simpler to go back to a prior state of the web site.
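A snapshot-based rollback can be sketched as follows. The SiteHistory class and its method names are hypothetical, and a real system would store snapshots far more economically than by copying the whole site; the sketch only shows why keeping a snapshot per promotion makes undoing a change set straightforward.

```python
class SiteHistory:
    """Keeps one complete snapshot of the site per promoted change set."""

    def __init__(self, initial):
        self._snapshots = [dict(initial)]

    @property
    def live(self):
        """A copy of the most recent snapshot, i.e. the current live site."""
        return dict(self._snapshots[-1])

    def promote(self, change_set):
        """Apply a change set on top of the live site and record a new snapshot.

        Returns the id of the new snapshot.
        """
        new_state = dict(self._snapshots[-1])
        new_state.update(change_set)
        self._snapshots.append(new_state)
        return len(self._snapshots) - 1

    def rollback(self, snapshot_id):
        """Make an earlier snapshot live again.

        The rollback is recorded as a new snapshot, so no history is lost.
        """
        self._snapshots.append(dict(self._snapshots[snapshot_id]))


history = SiteHistory({"profile.html": "v1"})
feature_id = history.promote({"profile.html": "v2", "prefs.xml": "email-format"})
history.rollback(feature_id - 1)  # undo the feature by returning to the prior state
```

Because each snapshot describes the complete site, the rollback also removes prefs.xml, which did not exist before the change set was promoted.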
One of the trickier challenges of WCM systems is to provide multiple environments which are isolated but more or less similar in structure. For example, each developer requires an area where he/she can make changes without impacting other developers or the live environment. At the same time, each developer should see only his/her changes on top of the currently live content. Similarly, multiple change sets might need to be reviewed by different reviewers concurrently. Each reviewer should see only the change set under review on top of the currently approved content. Thus, there is a need for an on-demand virtual copy of the web site which includes the approved content and a change set. Such an isolated environment is commonly referred to as a sandbox.
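One way to picture a sandbox is as a thin layer of changes placed over the approved content. The sketch below uses Python's collections.ChainMap to model that layering; the file names and the open_sandbox helper are hypothetical, and this illustrates the concept only, not how any particular WCM product implements it.

```python
from collections import ChainMap

# Currently approved (live) content, shared by every sandbox.
live = {"index.html": "Welcome", "profile.html": "Profile v1"}


def open_sandbox(approved):
    """Create a virtual copy of the site: an empty change layer over approved content.

    Lookups fall through to the approved content; writes land only in the
    change layer, so the approved content is never touched.
    """
    return ChainMap({}, approved)


alice = open_sandbox(live)
bob = open_sandbox(live)

# Alice edits a page in her own sandbox.
alice["profile.html"] = "Profile v2 (email format preference)"

# Alice sees her change on top of the live content; Bob and the live site do not.
change_set = alice.maps[0]  # only the items Alice changed
```

No content is copied when a sandbox is opened, which is what makes such a virtual copy cheap enough to create on demand.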
In the rest of this article, we will explore the core features offered by Alfresco WCM 2.0 and how it addresses the WCM challenges described above.
Alfresco WCM 2.0
Alfresco is an open-source content management platform. Alfresco WCM 2.0 is an optional add-on which can be installed on top of the core platform to enable WCM capabilities. In this section, we will look at typical usage of Alfresco WCM 2.0 and highlight how it handles the challenges described earlier.
Alfresco organizes storage in spaces. A space is a smart folder which can be associated with configurable rules. These rules control what happens to documents or other spaces that are added under the space. Among other things, these rules enable workflows within the platform, automatic versioning, and automatic rendition generation.
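The idea of a folder with rules attached can be modelled conceptually in a few lines of Python. This is not Alfresco's API: the Space class, the rule functions, and the document structure below are all hypothetical and exist only to illustrate how rules might act on items as they are added.

```python
class Space:
    """Conceptual model of a folder whose rules run on every item added to it."""

    def __init__(self, name, rules=()):
        self.name = name
        self.rules = list(rules)
        self.items = {}

    def add(self, doc_name, doc):
        """Apply each rule in order, then store the resulting document."""
        for rule in self.rules:
            doc = rule(doc)
        self.items[doc_name] = doc
        return doc


def add_version_label(doc):
    """Hypothetical rule: give every incoming document an initial version label."""
    return {**doc, "version": "1.0"}


def add_text_rendition(doc):
    """Hypothetical rule: generate a plain-text rendition of the content."""
    return {**doc, "renditions": {"text": doc["content"].strip()}}


press_releases = Space("Press Releases", [add_version_label, add_text_rendition])
stored = press_releases.add("launch.xml", {"content": "  Launch day  "})
```

The point of the sketch is that the behavior belongs to the space rather than to the document: anything added to the space is processed by the same set of rules.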
Installing Alfresco WCM 2.0 adds two spaces to the repository – Web Projects and Web Forms, as shown below in the Alfresco standard web interface. The Web Projects space holds other spaces, where each space represents one web site. The Web Forms space holds templates for creating and publishing content for different content types such as a press release or a company profile.
Two Spaces are Added by Alfresco WCM 2.0
Web content for a site is managed under a web project space which is created under Web Projects. For example, the following screenshot shows alfrescowww as a space created for managing content for this web site.
A Space for Holding Content for a Web Site