Content Modeling

Managing eZ Publish Web Content Management Projects

by Martin Bauer | November 2007 | Content Management Open Source

In this article, author Martin Bauer explains the importance of having the right Content Model, and gives a step-by-step process to determine and create the desired model.

Organizing content in a meaningful way is nothing new. We have been doing it for centuries in our libraries, the Dewey decimal system being a perfect example. So, why can't we take known approaches and apply them to the Web? The main reason is that a web page has more than two dimensions. A page in a book might have footnotes or refer to other pages, but the content only appears in one place. On a web page, content can directly link to other content and even show a summary of it. A web page goes well beyond the content that appears on it: links, related content, reviews, ratings, and so on all bring extra dimensions to the core content of the page and how it is displayed. This is why it's so important to ensure your content model is sound. However, there is no such thing as the "right" content model. Each content model can only be judged on how well it achieves the goals of the website now and in the future.

The Purpose of a Content Model

The idea of a content model is new, but it has similarities to both a database design and an object model. The purpose of both of these is to provide a foundation for the logic of the operation. With a database design, we want to structure the data in a meaningful way to make storage and retrieval effective. With an object model, we define the objects and how they relate to each other so that accessing and managing objects is efficient and effective. The same applies to a content model. It's about structuring the content and the relationships between the classes to allow the content to be accessed and displayed easily.

The following diagram is a simple content model that shows the key content classes and how they relate to each other. In this diagram, we see that resources belong to a collection which in turn belongs to a context. Also, a particular resource can belong to more than one collection.

[Figure: a simple content model in which resources belong to collections and collections belong to a context]
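The relationships in this model can also be written down as a small data structure. The following is a minimal Python sketch, not eZ publish code: the class names come from the description of the diagram, while the `add_to` helper and the field names are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Context:
    name: str


@dataclass(eq=False)
class Collection:
    name: str
    context: Context  # a collection belongs to one context
    resources: list[Resource] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Resource:
    title: str
    # a resource can belong to more than one collection
    collections: list[Collection] = field(default_factory=list, repr=False)

    def add_to(self, collection: Collection) -> None:
        """Link the resource and the collection in both directions (hypothetical helper)."""
        if collection not in self.collections:
            self.collections.append(collection)
            collection.resources.append(self)


# Example data is invented for this sketch.
research = Context("Research")
reports = Collection("Reports", research)
featured = Collection("Featured", research)

annual_report = Resource("Annual report")
annual_report.add_to(reports)
annual_report.add_to(featured)
```

Writing the model this way makes the two rules explicit: a collection holds a single context, whereas a resource holds a list of collections.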

As stated before, there is no such thing as the "right" model. What we are trying to achieve is the most "effective" model for the project at hand. This means coming up with a model that will provide the most effective way of organizing content so that it can be easily displayed in the manner defined in the functional specification. The way a content model is defined will have an impact on how easy it is to code templates, how quickly the code will run, how easy it is for the editors to input content, and also how easy it is to change down the track. From experience, rarely is a project completed and then never touched again. Usually, there are changes, modifications, and updates down the track. If the model is well structured, these changes will be easy; if not, they can require a significant amount of work to implement. In some cases, the project has to be rebuilt entirely and content re-entered to achieve the goals of the client. This is why the model is so important. If done well, it means the client pays less and has a better-running solution. A poor model will take longer to implement and will make later changes more difficult.

What Makes a Good Model?

It's not easy to define exactly what makes a good model. Like any form of design, simplicity is the key. The more elements there are, the more complex it gets. Ideally, a model should be technology independent, but there are certain ways in which eZ publish operates that can influence how we structure the content model.

Do we always need a content model? No, it depends on the scale of the project. Smaller projects don't really need a formal model. It's only when there are specific relationships between content classes that we need to go to the effort of creating a model. For example, a basic website that has a number of sections, e.g., About Us, Services, Articles, Contact, etc., doesn't need a model. There's no need for an underlying structure. It's just content added to sections. The in-built content classes in eZ publish will be enough to cater for that type of site. It's when the content itself has specific relationships e.g., a book belongs to a category or a product belongs to a product group, which belongs to a division of the business—this is when you need to create a model to capture the objects and the relationships between them.

To start with, we need to understand the content we are dealing with. The broad categories are existing/known content and new content. If we know the structure of the content we are dealing with and it already exists, this can help to shape the model. If we are dealing with content that doesn't exist yet (i.e. is to be written or created for this project), it's harder to know if we are on the right track. For example, when dealing with products, generally the product data will already exist in a database or ERP system. This gives us a basis from which to work. We can establish the structure of the content and the relationships from the existing data. That doesn't mean that we simply copy what's there, but it can guide us in the right direction. Sometimes the structure of the data isn't effective for the way it's to be displayed on the Web, or it's missing elements. (As a typical example, in a recent project, the product data was stored in three places: the core details were in the Point of Sale system, product details and categorization were in a spreadsheet, and the images were stored on a file system.)

So, the first step is to get an understanding of all the content we are dealing with. If the content doesn't exist as yet, at least get some examples of what it is likely to be. Without knowing what you are dealing with, you can't be sure your model will accommodate everything.

This means you'll need to allow for modifications down the track. Of course, we want to minimize this, but it's not always possible. Clients change their minds, so the best we can do is hope that our model will accommodate what we think are the likely changes. This really can only be done through experience. There are patterns in content as well as in how it's displayed. Through these patterns, e.g., a related-content box on each page, we can try to foresee the way things might alter and build room for this into the model. A good example comes from a recent project in which each object had its main content but also a number of related objects (widgets) that were to be displayed in the right-hand column of the page. Initially, the content class defined the specific widgets to be associated with the object.

The table below contains the details of a particular resource (as shown in the previous content model). It captures the details of the "research report" resource content class.

Attribute         | Type            | Notes
------------------|-----------------|-------------------------------------------
Title             | Text line       |
Short Title       | Text line       | If present, will be used in menus and URLs
Flash             | Flash           | Navigator object
Hero Image        | Image           | (displays if no flash)
Caption           | Rich text       |
Body*             | Rich text       |
Free Form Widgets | Related objects | Select one or more
Multimedia Widget | Related object  | Select one

This would mean that when the editor added content, they would pick the free-form widgets and then the multimedia widget to be associated with the research report. Displaying the content would be straightforward as from the parent object we would have the object IDs for each widget.

The idea is sound but lacks flexibility. The order in which the widgets were added would dictate the order in which they were displayed. It also means that if the editor wanted to add a different type of widget, they couldn't unless the model was changed, i.e., another attribute was added to the content class.

We updated the content class as follows:

Attribute   | Type            | Notes
------------|-----------------|-------------------------------------------
Title*      | Text line       |
Short Title | Text line       | If present, will be used in menus and URLs
Flash       | Flash           | Navigator object
Hero Image  | Image           | (displays if no flash)
Caption     | Rich text       |
Body*       | Rich text       |
Widgets     | Related objects | Select one or more

This approach is less strict and provides more flexibility. The editor can choose any widget and also select the order. In terms of programming the template, there's the same amount of work. But, if we decide to add another widget type down the track, there's no need to update the content class to accommodate it.
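The difference between the two versions of the content class can be illustrated with a short Python sketch. This is not eZ publish code; the class and method names (`ReportWithFixedSlots`, `ReportWithWidgetList`, `right_column`) and the widget kinds are hypothetical and only mirror the two tables.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Widget:
    kind: str   # e.g. "free_form", "multimedia"; kinds are invented for this sketch
    title: str


@dataclass
class ReportWithFixedSlots:
    """First version: one attribute per widget type."""
    title: str
    free_form_widgets: list[Widget] = field(default_factory=list)
    multimedia_widget: Optional[Widget] = None

    def right_column(self) -> list[Widget]:
        # The template has to walk the attributes one by one, so a new
        # widget type needs a new attribute on the class.
        widgets = list(self.free_form_widgets)
        if self.multimedia_widget is not None:
            widgets.append(self.multimedia_widget)
        return widgets


@dataclass
class ReportWithWidgetList:
    """Updated version: a single ordered list of related widgets."""
    title: str
    widgets: list[Widget] = field(default_factory=list)

    def right_column(self) -> list[Widget]:
        # The editor's chosen order is the display order, whatever the kind.
        return list(self.widgets)


video = Widget("multimedia", "Launch video")
poll = Widget("poll", "Quick poll")  # a widget kind the first version has no slot for
findings = Widget("free_form", "Key findings")

flexible = ReportWithWidgetList("Research report", [video, poll, findings])
fixed = ReportWithFixedSlots("Research report", [findings], video)
```

In the second class, the "poll" widget fits without any change to the class definition, which is the flexibility the updated content class was meant to provide.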

Does this mean that anytime we have a related object we should use the latter approach? No. The reason we did it in this situation is that the content was still being written as we were creating the model, and there was a good chance that once the content was entered and we saw the end result, the client was going to ask something like "can we add widget x to the right-hand column of a context object?" In a different project, in which a particular widget should only be related to a particular content class, it's better to enforce the rule by only allowing that widget to be associated with that content class.

Defining a Content Model

The process of creating a content model requires a number of steps. It's not just a matter of analyzing the content; the modeler also needs to take into consideration the domain, users, groups, and the relationships between different classes within the model. To do this, we start with a walkthrough of the domain.

Step 1: Domain Walkthrough

The client and domain experts walk us through the entire project. This is a vital part of the process. We need to get an understanding of the entire system, not just the part that is captured in the final solution. The model that we end up creating may need to interact with other systems, and knowing what they are and how they work will inform the shape of the model. A good example is e-commerce systems: any information captured on a sale will eventually need to be entered into the existing financial system (whether it is automated or manual).

Without an understanding of the bigger picture, we lack the understanding of how the solution we are creating will fit in with what the business does. That's when there is an existing business process. Sometimes there is no business process and the client is making things up as they go along, e.g. they have decided to do online shopping but they have never dealt with overseas orders so don't know how that will work and have no idea how they would deal with shipping costs.

One of the typical problems that will surface during the domain walkthrough is that the client will try to tell you how they want the solution to work. By doing this, they are actually defining the model and interactions. This is something to be wary of. It is unlikely that they would be aware of how best to structure a solution; what you want to be asking is what they currently do, what's their current business process.

You want to deal with facts that are in existence so that you can decide how best to model the solution. To get the client back on track, ask questions like:

  • How do you currently do "it" (i.e. the business process)?
  • What information do you currently capture?
  • How do you capture that information?
  • What format is that information in?
  • How often is the information updated?
  • Who updates it?

This gives you a picture of what is currently happening. Then you can start to shape the model to ensure that you are dealing with the real world, not what the client thinks they want. Sometimes they won't be able to answer the question and you'll have to get the right person from the business involved to get the answers you want. Sometimes you discover that what the client thought was happening is not really what happens.

Another benefit of this process is gaining a common understanding. If both you and the client are in the room when the process for calculating shipping costs is being explained by the Shipping Manager, you'll both appreciate how complex the process is. If the client thinks it's easy, they won't expect it to cost much. If they are in the room when the shipping manager explains there are five different shipping methods and each has its own way of calculating the costs for a shipment based on their own set of international zones, you know modeling that part of the system is not going to be straightforward unlike what the client initially thought.

What this means is that the domain walkthrough gives you a sense of what's real, not what people think the situation is. It's the most important part of the process. Assumptions that "shipping costs" are straightforward, so you don't need to worry about that, can be a disaster later down the track when you find out it's not the case. Also, don't necessarily rely on requirements documents (unless you have written them yourself). A statement in a requirements document may not reflect what really happens; that's why you want to make sure you go through everything to confirm that you have all the facts. Sometimes, a particular requirement can be stated in the document but when you go through it in more detail, ask a few questions, pose a few scenarios, the client changes their mind on what it is that they really want as they realize what they thought they wanted is going to be difficult or expensive to implement. Or, you put an alternative approach to them and they are happy to achieve the same result in a different manner that is easier to implement. This is a valuable way to work out what's real and what really matters.


Step 2: Identify Users of the System

Hold a workshop in which you identify the main users of the system e.g.:

  • Admin
  • Business User
  • Subscribed users
  • Non-subscribed users

We should have an understanding of how these groupings are viewed by the business and the marketing approach towards each from the domain walkthrough in Step 1. This allows us to design the system to meet these goals.

Once again, we are dealing with the true roles within the business. It might be that these roles are carried out by one person in the business or mixed, but when defining the model we need to be clear on the differences so we can appropriately define the permissions.
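One way to keep roles and people separate is to attach permissions to roles and then give each person one or more roles. The following Python sketch assumes the four user groups listed above; the permission names and the `can` helper are hypothetical and would come from the workshop, not from this example.

```python
# Permissions per role. The role names follow the list above;
# the permission names are invented for this sketch.
ROLE_PERMISSIONS = {
    "admin": {"read", "create", "edit", "remove", "manage_users"},
    "business_user": {"read", "create", "edit"},
    "subscribed_user": {"read", "read_subscriber_content"},
    "non_subscribed_user": {"read"},
}


def can(roles, action):
    """Return True if any of the given roles grants the action."""
    return any(action in ROLE_PERMISSIONS[role] for role in roles)


# One person may carry out several roles in the business.
owner_roles = ["business_user", "admin"]
visitor_roles = ["non_subscribed_user"]
```

Because permissions hang off the role rather than the person, the model stays valid when one person in the business takes on a second role or hands one over.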

Step 3: Identify the Key Classes

Define the key classes within the model:

  • Business Profile
  • Location
  • State
  • Product/Service
  • Category

These are fleshed out with attributes in the functional specification but for preliminary work it should be enough just to identify them. We aren't trying to identify EVERY object in the system; that will come later. There will be key objects that provide the core of the system. We can add other objects at a later date that may have nothing to do with the model and are just for additional content, more to do with sales and marketing than the business process we are automating.

When defining these objects it's important to use the same language as the client. For example, if the client groups products by "variety" then use the term variety rather than product group. If you try to use your own terms then there's the possibility for confusion at a later date when you're talking about details to do with product groups and the client has forgotten that you mean "variety" and doesn't realize that a permission or relationship is wrong because they don't fully understand what we mean by a product group.

Once again, if we are dealing with a known business system, chances are the objects we are talking about will already have known terms. If we are dealing with a new system that is being created as a part of the solution being built, then it's a bit more tricky. Names are very important—although Shakespeare wrote "What's in a name? That which we call a rose, by any other name would smell as sweet;" in a content model, the name given to the object is important. It has intrinsic meaning and if you use an arbitrary or misleading name, it can create confusion not only for the client but developers also, and ultimately the user. When choosing names, there are a couple of simple rules: be clear, keep them short.

Clarity comes from the name reflecting the nature of the object e.g., customer, member, supplier, product, property, etc. Name length is important when it comes to programming and displays. Long names take longer to type, a simple thing, but you don't want to have to be typing "science_ideas_and_concepts_worksheet" each time you want to access that particular object. Even though eZ publish will allow the identifier to be different from the name, it's a good idea to make them the same to avoid confusion.

Long names are also a pain when it comes to displaying the class. Often, the class name will be displayed as a part of the navigation, e.g. in a breadcrumb or as an identifier in a search result. This is where long names can cause problems. Once the name has been set, is understood by the client and developers, and templates have been coded and content entered, changing that name will be a lot of work. This may seem pedantic, but names have intrinsic meanings, and great care needs to be taken in selecting them. What seems like a simple thing upfront can have ramifications down the track if care is not taken.
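The "keep them short" rule and the advice to keep name and identifier the same can be checked mechanically when a list of candidate class names is drawn up. This is a minimal Python sketch; the helper names and the 20-character threshold are arbitrary assumptions for illustration, not limits imposed by eZ publish.

```python
import re

MAX_IDENTIFIER_LENGTH = 20  # arbitrary threshold chosen for this sketch


def to_identifier(name: str) -> str:
    """Lower-case the name and join its words with underscores."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return "_".join(words)


def check_name(name: str) -> list[str]:
    """Return a list of problems found with a proposed class name."""
    problems = []
    identifier = to_identifier(name)
    if len(identifier) > MAX_IDENTIFIER_LENGTH:
        problems.append(
            f"{identifier!r} is {len(identifier)} characters long; consider a shorter name"
        )
    return problems


for candidate in ["Variety", "Science Ideas and Concepts Worksheet"]:
    print(candidate, "->", to_identifier(candidate), check_name(candidate))
```

A check like this only covers length; whether a name is clear to the client still has to be settled in conversation.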

Step 4: Identify Relationships between the Classes

Capture the relationships between the classes which are stated as rules:

  • A Location is in a State (e.g. Brisbane is in Queensland, Gold Coast is in Queensland)
  • A Business is located in a Location

These relationships create rules, set patterns, permissions, and define how the content will be entered, managed, and related. They set what belongs to what. They're fundamental. If a product can only belong to one variety, then this is set as a rule. Changing it later becomes problematic as changes are required on a number of levels, especially if content has already been entered. For instance, if you decide that a product can only belong to one variety, then it's straightforward: you never have to worry about displaying the product in different locations or dealing with the different display rules that could be associated with it. You don't have to worry about it coming up in a search for products listed by variety, etc.
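Rules of this kind can be recorded as data and checked against content before it is entered. The following is a minimal Python sketch under stated assumptions: the `Rule` structure and `check` function are hypothetical, and treating every rule as "exactly one parent" unless marked otherwise is a choice made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    child: str          # class on the "belongs to" side
    verb: str           # wording agreed with the client
    parent: str
    many: bool = False  # False: exactly one parent is allowed


RULES = [
    Rule("Location", "is in", "State"),
    Rule("Business", "is located in", "Location"),
    Rule("Product", "belongs to", "Variety"),
]


def check(child_class, parent_class, parents, rules=RULES):
    """Return a list of problems for one object's proposed parents."""
    rule = next(
        (r for r in rules if r.child == child_class and r.parent == parent_class),
        None,
    )
    if rule is None:
        return [f"No rule relates {child_class} to {parent_class}"]
    if not rule.many and len(parents) != 1:
        return [
            f"A {child_class} {rule.verb} exactly one {parent_class}, got {len(parents)}"
        ]
    return []


print(check("Location", "State", ["Queensland"]))
print(check("Product", "Variety", ["Red", "White"]))
```

If the client later decides that a product may belong to several varieties, the change is a single flag in this sketch, but as the text notes, in a real system the templates, searches, and existing content would also have to change.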

These rules also inform the way that content is entered into the system. We need to take into consideration the people that will be entering and maintaining the content. They will not necessarily understand the model and it's not always apparent from the way things are structured in the administration interface. So making sure we capture the rules properly will help on this level. It will stop mistakes such as an editor trying to add an object to a part of the node tree where it doesn't belong and that has no custom template to display it. A simple example is that of nesting. Let's say we are dealing with a news item. It's a common object in a CMS. One rule could be:

  • News items can be added to folders

Another rule could be:

  • News items can only be added to news listings

At first glance, there doesn't seem to be much difference between these rules: in both, an object can be added to a parent object. When it comes to using the system, however, it can make a big difference. The "folder" object is a standard object within eZ publish. It can contain many different types of objects and there are standard displays associated with it. If you don't want to do anything special with the news item, this will work fine. But you are at the mercy of the editors to decide which is the appropriate content type to use when they are adding content.

The second rule is more specific. It means that a news item can only ever belong to a news listing. It means that the display of the news listing and news item are set in a particular fashion. It means that the "news item" object has a particular meaning because the rule won't let the object be used for different purposes. It will force the editor to consider the nature of the content being added.
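The stricter rule amounts to a table of which class may sit under which parent. The sketch below is plain Python rather than eZ publish configuration; the identifiers and the `can_add` helper are hypothetical, and the entries for `news_listing` and `article` are assumptions added to make the example complete.

```python
# Which parent classes each class may be placed under.
# Only the news_item entry comes from the rule above;
# the other entries are assumptions for this sketch.
ALLOWED_PARENTS = {
    "news_item": {"news_listing"},
    "news_listing": {"folder"},
    "article": {"folder"},
}


def can_add(child_class: str, parent_class: str) -> bool:
    """Return True if the containment rules allow this placement."""
    return parent_class in ALLOWED_PARENTS.get(child_class, set())


print(can_add("news_item", "news_listing"))
print(can_add("news_item", "folder"))
```

Under the looser first rule, the `news_item` entry would list `folder` instead, and the check would no longer stop an editor from placing a news item where no template expects it.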

From a programming perspective, it's also easier to manage as you don't have to consider the display of the item in different contexts. For example, if the rules allow different objects to be displayed in the same parent object, we have to think about the navigation in greater depth. Let's say the folder object can contain articles and news items. A typical approach to navigation is to have all the objects within a folder listed down the left-hand side of the page as links to the full view of each object. How will the user know which object is a news item and which object is an article? Do we have to write extra code to differentiate between the two? Does it matter what order they are displayed in? If the news item has a date, then do we need to order them differently from the articles, which are to be displayed in alphabetical order by name? All of these questions arise if the rules aren't clear and precise.

If the rules are clear and accurate, then development and use of the end solution will be more straightforward, with fewer unexpected situations arising. The vaguer and more open the rules, the more problems arise when content is entered and you discover a situation that hasn't been accounted for, or a result that doesn't make sense to the end user. Then, you have to superimpose rules, which could mean re-entering content.

However, making the rules too strict can limit the flexibility that eZ publish provides. Finding a happy balance is the key. The content itself will suggest the rules, and then you need to confirm them with the client by considering the edge cases and posing questions such as:

  • Can a product ever belong to more than one variety?
  • What happens if a variety has no products?

Step 5: Create a Relationship Diagram

This is a combination of the information from Steps 2 through 4 captured in a single relationship diagram. It consists, at its most basic, of a sketch of circles with object names attached and arrows indicating the relationships.

A relationship diagram is similar to an object role diagram.

The model is where the objects and relationships come together in a single diagram. It captures the foundation of all work to follow. If there are mistakes in the model, there will be problems down the track. However, if you get your client to sign off on the model, then if something changes, you have concrete justification to change the budget and timeline (although the client is rarely happy about this!). It helps to protect against the evils of scope creep.

The example displayed is based on the concept of an object role model and is an effective way to capture and define an object model (for more information see http://www.orm.net/). An object role model is "a fact-oriented approach used to conceptually model information systems" (http://www.objectrolemodeling.com/AboutORM/tabid/34/Default.aspx). Given that what we are doing with a content management system is modeling content (or information, if you wish), the ORM approach works well. But it's not the only way to capture a model; it's just one way to do it. If you have your own approach to modeling, that's fine as long as you capture the information in a way that you, the client, and the developer can understand. That's the key here: to get a common understanding of how things relate to each other.

[Figure: example relationship diagram based on an object role model]
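A diagram of circles and arrows can also be generated from the list of relationships rather than drawn by hand. The following Python sketch emits Graphviz DOT text, which is one possible notation and not the ORM notation referred to above; the `to_dot` function is a hypothetical helper, and the relationships are taken from Step 4.

```python
def to_dot(relationships, name="content_model"):
    """Build Graphviz DOT text with one circle per class and one arrow per relationship."""
    lines = [f"digraph {name} {{", "  node [shape=circle];"]
    for child, label, parent in relationships:
        lines.append(f'  "{child}" -> "{parent}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


relationships = [
    ("Location", "is in", "State"),
    ("Business", "is located in", "Location"),
]

dot_text = to_dot(relationships)
print(dot_text)
```

The resulting text can be rendered by any tool that reads DOT. Keeping the relationships in a list means the diagram and the written rules cannot drift apart.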

Step 6: Create a Glossary

A part of the modeling process uncovers a series of terms that have a particular meaning within the system. It's important to capture the meaning and make sure there's agreement on the definitions of each term. The plain English definition of a word like "group" can have a specific meaning within a business context. The catch is that sometimes the client will use the same term to mean different things depending on the context. In one project, we had the following definitions:

Graduate Destination Survey (GDS)

The graduate destination survey is made up of two parts:

  • Graduate Questions
  • Course Experience Questions

However, the "Graduate Questions" part was often referred to as the GDS, as it is where the actual details of the destination of the graduate were captured. So, "GDS" could refer to either the entire survey or just the first part of the survey. Even though we had the definition in the glossary and the client agreed to it, they still used the same term to mean different things depending on the context. This is an easy trap to fall into, and it requires diligence to check with the client what they are referring to when they use a term. It can muddy the waters greatly when trying to create a model that is accurate and reflects the actual business system being defined.
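A glossary can record not only the agreed definition but also the other senses in which a term is known to be used, so that those terms get confirmed whenever they come up. This is a minimal Python sketch; the `GlossaryEntry` structure and the `terms_to_confirm` helper are hypothetical, and the entry is based on the GDS example above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    term: str
    definition: str
    # Other meanings the client has been heard to use for the same term.
    also_used_for: list[str] = field(default_factory=list)

    @property
    def ambiguous(self) -> bool:
        return bool(self.also_used_for)


glossary = {
    "GDS": GlossaryEntry(
        term="GDS",
        definition=(
            "Graduate Destination Survey: the entire survey, made up of "
            "Graduate Questions and Course Experience Questions"
        ),
        also_used_for=["the Graduate Questions part only"],
    ),
    "Graduate Questions": GlossaryEntry(
        term="Graduate Questions",
        definition="The first part of the Graduate Destination Survey",
    ),
}


def terms_to_confirm(entries):
    """Return the terms whose meaning should be checked with the client when used."""
    return sorted(e.term for e in entries.values() if e.ambiguous)


print(terms_to_confirm(glossary))
```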

Summary

The purpose of a content model is to capture a high-level view of the system to be created. Getting this right allows you to then go into more depth on each of the classes. It's similar in nature to a class diagram, which captures the names, attributes, and methods for all classes and the relationships between them. Everything then flows from the model: the content classes to be created, the permissions that need to be established, and the views of each content class. It's the foundation of the functional specification. Getting the model right is fundamental; if the model doesn't accurately reflect the business domain, then chances are, when the system has been built, there will be problems that will be difficult and expensive to fix.


About the Author

Martin Bauer

Martin Bauer is the Managing Director of designIT, an Australian-based content management specialist practice. Martin has ten years' experience in web development and web-based content management. He is the world's first certified Feature Driven Development Project Manager. Prior to his role as Managing Director, Martin held a variety of roles across a range of industries. This experience includes careers in law, advertising, and IT. Martin's breadth of expertise has culminated in a focus upon the delivery of effective content management solutions.
