Documenting our Application in Apache Struts 2 (part 2)

June 2009

Documenting web applications

Documenting an entire web application can be surprisingly tricky because of the many different layers involved. Some web application frameworks support automatic documentation generation better than others. It's preferable to have fewer disparate parts. For example, Lisp, Smalltalk, and some Ruby frameworks are little more than internal DSLs that can be trivially redefined to produce documentation from the actual application code.

In general, Java frameworks are more difficult to limit to a single layer. Instead, we are confronted with HTML, JSP, JavaScript, Java, the framework itself, its configuration methodologies (XML, annotations, scripting languages, etc.), the service layers, business logic, persistence layers, and so on. Complete documentation generally means aggregating information from many disparate sources and presenting it in a way that is meaningful to the intended audience.

High-level overviews

The site map is an obvious, reasonable overview of a web application. It may look like a simple hierarchy chart, showing the site's pages without detailing all of the possible links between pages, how each page is implemented, and so on.

[Figure: hand-drawn site map showing the basic application flow]

This diagram was created by hand and shows only the basic outline of the application flow. It represents minor maintenance overhead since it would need to be updated when there are any changes to the application.

Documenting JSPs

There doesn't seem to be any general-purpose JSP documentation methodology. It's relatively trivial to create comments inside a JSP page using JSP comments or a regular Javadoc comment inside a scriptlet. Pulling these comments out is then a matter of some simple parsing. This may be done by using one of our favorite tools, regular expressions, or using more HTML-specific parsing and subsequent massaging.
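As a sketch of the regular-expression approach (the function and the sample page fragment here are our own inventions, not part of any standard tool), a few lines of JavaScript can pull JSP comments out of a page's source:

```javascript
// Extract the contents of JSP comments (<%-- ... --%>) from page source.
// A sketch only: scriptlet doc comments, includes, and links would each
// need their own pattern.
function extractJspComments(source) {
  const comments = [];
  const pattern = /<%--([\s\S]*?)--%>/g;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    comments.push(match[1].trim());
  }
  return comments;
}

// Example usage with an inline page fragment:
const page = '<%-- Shows one recipe. --%>\n<h1>Recipe</h1>\n<%-- TODO: paging --%>';
console.log(extractJspComments(page)); // → [ 'Shows one recipe.', 'TODO: paging' ]
```

The same pattern-matching idea scales up to whatever comment conventions a team settles on.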

Where it gets tricky is when we want to start generating documentation that includes elements such as JSP pages, which may be included using many different mechanisms: static includes, <jsp:include.../> tags, Tiles, SiteMesh, insertion via Ajax, and so on. Similarly, generating connections between pages is fraught with custom cases. We might use general-purpose HTML links, Struts 2 link tags, attach a link to a page element with JavaScript, ad infinitum/nauseam.

When we throw in the (perhaps perverse) ability to generate HTML using Java, we have a situation where creating a perfectly general-purpose tool is a major undertaking. However, we can fairly easily create a reasonable set of documentation that is specific to our framework by parsing configuration files (or scanning a classpath for annotations), understanding how we're linking the server-side to our presentation views, and performing (at least limited) HTML/JSP parsing to pull out presentation-side dependencies, links, and anything that we want documented.
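As an illustration of that configuration-driven approach (the function and the sample snippet are hypothetical; a real tool would use a proper XML parser and handle packages, wildcards, and annotation-based configuration), we might map action names to their result pages like so:

```javascript
// Map Struts 2 action names to their declared result pages by scanning
// a struts.xml-style configuration string. Illustrative only; regex-based
// XML handling breaks down quickly on real configurations.
function mapActionsToResults(configXml) {
  const map = {};
  const actionPattern = /<action\s+name="([^"]+)"[\s\S]*?<\/action>/g;
  let action;
  while ((action = actionPattern.exec(configXml)) !== null) {
    const results = [];
    const resultPattern = /<result[^>]*>([^<]+)<\/result>/g;
    let result;
    while ((result = resultPattern.exec(action[0])) !== null) {
      results.push(result[1].trim());
    }
    map[action[1]] = results;
  }
  return map;
}

const sample =
  '<action name="recipe" class="app.RecipeAction">' +
  '<result>/WEB-INF/jsp/recipe.jsp</result>' +
  '</action>';
console.log(mapActionsToResults(sample)); // → { recipe: [ '/WEB-INF/jsp/recipe.jsp' ] }
```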

Documenting JavaScript

If only there were a tool such as Javadoc for JavaScript. There is: the JsDoc Toolkit provides Javadoc-like functionality for JavaScript, with additional features to help handle the dynamic nature of the language. Because of that dynamism, we (as developers) must remain diligent both in how we write our JavaScript and in how we document it.

Fortunately, the JsDoc Toolkit is good at recognizing current JavaScript programming paradigms (within reason), and when it can't, provides Javadoc-like tags we can use to give it hints.

For example, consider our JavaScript Recipe module where we create several private functions intended for use only by the module, and return a map of functions for use on the webpage. The returned map itself contains a map of validation functions. Ideally, we'd like to be able to document all of the different components.

Because of the dynamic nature of JavaScript, it's more difficult for tools to figure out which context things belong to. Java is much simpler in this regard (which is both a blessing and a curse), so we need to give JsDoc hints to help it understand our code's layout and purpose.

A high-level flyby of the Recipe module shows a layout similar to the following:

var Recipe = function () {
    var ingredientLabel;
    var ingredientCount;
    // ...
    function trim(s) {
        return s.replace(/^\s+|\s+$/g, "");
    }
    function msgParams(msg, params) {
        // ...
    }
    return {
        loadMessages: function (msgMap) {
            // ...
        },
        prepare: function (label, count) {
            // ...
        },
        pageValidators: {
            validateIngredientNameRequired: function (form) {
                // ...
            }
            // ...
        }
    };
}();

We see several documentable elements: the Recipe module itself, private variables, private functions, and the return map which contains both functions and a map of validation functions. JsDoc accepts a number of Javadoc-like document annotations that allow us to control how it decides to document the JavaScript elements.

The JavaScript module pattern, exemplified by an immediately-executed function, is understood by JsDoc through the use of the @namespace annotation.

/**
 * @namespace
 * Recipe module.
 */
var Recipe = function () {
    // ...

We can mark private functions with the @private annotation as shown next:

/**
 * @private
 * Trims leading/trailing space.
 */
function trim(s) {
    return s.replace(/^\s+|\s+$/g, "");
}

It gets interesting when we look at the map returned by the Recipe module:

return /** @lends Recipe */ {
    /**
     * Loads message map.
     * <p>
     * This is generally used to pass in text resources
     * retrieved via <s:text.../> or <s:property
     * value="getText(...)"/> tags on a JSP page in lieu
     * of a normalized way for JS to get Java I18N resources.
     * </p>
     */
    loadMessages: function (msgMap) {
        _msgMap = msgMap;
        // ...

The @lends annotation indicates that the functions returned by the Recipe module belong to the Recipe module. Without it, JsDoc doesn't know how to interpret the JavaScript in the way we intend it to be used, so we provide a little prodding.

The loadMessages() function itself is documented as we would document a Java method, including the use of embedded HTML.

The other interesting bit is the map of validation functions. Once again, we apply the @namespace annotation, creating a separate set of documentation for the validation functions, as they're used by our validation template hack and not directly by our page code.

    /**
     * @namespace
     * Client-side page validators used by our template hack.
     * ...
     */
    pageValidators: {
        /**
         * Ensures each ingredient with a quantity
         * also has a name.
         * @param {Form object} form
         * @type boolean
         */
        validateIngredientNameRequired: function (form) {
            // ...

Note also that we can annotate the type of a JavaScript parameter inside curly brackets. JavaScript doesn't have typed parameters, so we need to tell JsDoc (and the reader) what the function expects. The @type annotation documents what the function is expected to return. This gets a little trickier if the function returns different types based on arbitrary criteria; however, we never do that, because it's hard to maintain.

JsDoc has the typical plethora of command-line options, and requires the specification of the application itself (written in JavaScript, and run using Rhino) and the templates defining the output format. An alias to run JsDoc might look like the following, assuming the JsDoc installation is being pointed at by the ${JSDOC} shell variable:

alias jsdoc='java -jar ${JSDOC}/jsrun.jar ${JSDOC}/app/run.js -t=${JSDOC}/templates/jsdoc'

The command line to document our Recipe module (including private functions, using the -p option) and to write the output to the jsdoc-out folder now looks like the following:

jsdoc -p -d=jsdoc-out recipe.js

The homepage looks similar to a typical Javadoc page, but more JavaScript-like:

[Figure: JsDoc-generated homepage for the Recipe module]

A portion of the Recipe module's validators, marked by a @namespace annotation inside the @lends annotation of the return map, looks like the one shown in the next image (the left-side navigation has been removed):

[Figure: JsDoc output for the Recipe module's page validators]

We can get pretty decent, accurate JavaScript documentation using JsDoc, with only a minimal amount of prodding to help with the dynamic aspects of JavaScript, which are difficult to figure out automatically.

Documenting interaction

Documenting interaction can be surprisingly complicated, particularly in today's highly-interactive Web 2.0 applications. There are many different levels of interactivity taking place, and the implementation may live in several different layers, from the JavaScript browser to HTML generated deep within a server-side framework.

UML sequence diagrams can capture much of that interactivity, but fall somewhat short when activities happen in parallel. AJAX, in particular, ends up being a largely concurrent activity: we might send the AJAX request, and then do various things on the browser in anticipation of the result.

More UML and the power of scribbling

The UML activity diagram is able to capture this kind of interactivity reasonably well, as it allows a single process to be split into multiple streams and then joined up again later. As we look at a simple activity diagram, we'll also take a quick look at scribbling, paper, whiteboards, and the humble digital camera.

Don't spend so much time making pretty pictures!

One of the hallmarks of lightweight, agile development is that we don't spend all of our time creating the World's Most Perfect Diagram™. Instead, we create just enough documentation to get our points across. One result of this is that we might not use a $1,000 diagramming package to create all of our diagrams. Believe it or not, sometimes just taking a picture of a sketched diagram from paper or a whiteboard is more than adequate to convey our intent, and is usually much quicker than creating a perfectly-rendered software-driven diagram.

[Figure: digital camera photo of a hand-sketched UML activity diagram]

Yes, the image above is a digital camera picture of a piece of notebook paper with a rough activity diagram. The black bars here indicate a small section of parallel functionality: a server-side search and some activity on the browser. The browser programming is informally indicated by the black triangles. In this case, it might not even be worth sketching out. However, for moderately more complicated use cases, particularly when there is a lot of both server- and client-side activity, a high-level overview is often worth the minimal effort.

The same digital camera technique is also very helpful in meetings where various documentation might be captured on a whiteboard. The resulting images can be posted to a company wiki, used in informal specifications, and so on.

User documentation

Development would be substantially easier if we didn't have to worry about those pesky users, always wanting features, asking questions, and having problems using the applications we've developed. Tragically, users also drive our paycheck. Therefore, at some point, it can be beneficial to acknowledge their presence and throw them the occasional bone, in the form of user documentation.

Developing user documentation is a subject unto itself, but deserves to be brought up here. We can generally assume that it will not include any implementation details, and will focus primarily on the user interface and the processes our applications use.

When writing user documentation, it's often sufficient to take the user stories, decorate them with screenshots and extra expository text, and leave it at that. How much (if any) user-specific documentation is needed really depends on the client's requirements. If the application will be used inside the client's business, it may be sufficient to provide one or more onsite training sessions.

One thing worth mentioning is that a screenshot can often save oodles of writing effort, communicate ideas more clearly, and remain easily deliverable through the application itself, in a training environment, and so on.

Screenshots can be a valuable documentation tool at many levels, including communications with our client when we're trying to illustrate a point that is difficult to communicate via text alone.

Documenting development

The last form of documentation we'll look at is development documentation. This goes beyond our UML diagrams, user manual, functional specification, and so on. Development documentation includes the source control and issue tracking systems, the reasoning behind design decisions, and more. We'll take a quick look at some information we can use from each of these systems to create a path through the development itself.

Source code control systems

A Source Code Control System (SCCS) is an important part of the development process. Our SCCS is more than just a place to dump our source code—it's an opportunity to give a high-level overview of system changes.

The best ways to use an SCCS depend on which one we use. However, a few quick ideas apply across any SCCS and let us extract a variety of information about our development streams.

Most clients will have their preferred SCCS already in place. If our deliverable includes source, it's nice if we can provide it in a way that preserves our work history.

Code and mental history

The history of change can be used on several levels, in several ways. There are products available that can help analyze our SCCS, or we can analyze it ourselves depending on what kind of information we're looking for.

For example, the number of non-trivial changes made to a file provides information in itself: for whatever reason, this file gets changed a lot. It may be an important file, a catchall, a candidate for parameterization, and so on. If two files are always modified together, then there's a chance of an unnecessarily tight coupling between them.

Sometimes, we just need to know what we were working on for a particular date or range of dates. We can retrieve all of our SCCS interaction for auditing purposes, to help determine what we were doing on a given date, as part of a comprehensive change and time tracking system, and so on.

Commit comment commitment

We should view our commit comments as an important part of the development documentation. One way to normalize commit comments is to treat them like Javadoc comments: the first sentence is a succinct summary of the unit of work, and the remaining sentences describe what was actually done.

What that first sentence includes is somewhat dependent on the rest of the development infrastructure. It's reasonable to put an issue tracking reference as the most prominent part of that comment, perhaps followed by the same summary sentence as the issue item, or a summary if that's too long.

The rest of the commit comment should include any information deemed useful: general change information, algorithm changes, new tests, and so on. Having a summary sentence also allows tools to process the output of history or log commands and create new views of existing information when necessary, for example, a list of files changed between certain dates, along with a summary of why they were changed. These views can be used as part of release notes, high-level summaries, and so on.
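A quick sketch of that kind of tooling (the helper and the sample messages are invented for illustration; only the summary-first convention comes from the text): once the first sentence is always a summary, extracting it is trivial:

```javascript
// Given full commit messages, return each message's first sentence,
// relying on the convention that the first sentence is a succinct summary.
function commitSummaries(messages) {
  return messages.map(function (msg) {
    const firstSentence = msg.trim().match(/^[^.?!]*[.?!]/);
    return firstSentence ? firstSentence[0].trim() : msg.trim();
  });
}

const history = [
  'ID #12: add ingredient paging. Reworked the list action and JSP.',
  'ID #15: fix quantity validation. New client-side check plus unit test.'
];
console.log(commitSummaries(history));
// → [ 'ID #12: add ingredient paging.', 'ID #15: fix quantity validation.' ]
```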

When (and what) do we commit?

We should tend to commit more rather than less. The more recently a change was made, the easier it is to remember why and what was modified. Updated the spelling in a single comment? Sure, might as well commit. When that file is changed later and we're reviewing the changes, it's easier to look at only the significant ones, rather than at trivial changes such as a punctuation fix made the day before.

Also, strive to keep commits as granular as possible. For example, let's say we've updated some functionality in an action. As we were doing that, we corrected a couple of spelling errors in some other files. In an ideal world, even minor non-code changes would get their own commit, rather than being lumped in with changes to the code itself. If we see a commit message of "corrected spelling", we can probably ignore it. If it's lumped into an issue-specific commit, we need to check the file itself to know whether it's really part of the issue, and we'll be disappointed to find out it was only a misspelled Javadoc.

In the real world, however, we're not always so disciplined. In that case, the commit comment should describe the actual issue being addressed, while also noting that some spelling changes were included in the commit. Note that some SCCSs make the task of breaking up our commits easier than others.


Experimental branches

Even relatively simple changes in application functionality might warrant an experimental branch. By indicating the start of a unit of work in our SCCS, we allow all of the changes related to that unit of work to be easily reproduced.

It also creates a mini repository within which we can keep revision control of our development spike. It keeps the experimental code and its changes out of our mainline code and isolates the changes based on a distinct unit of work, which makes us feel better about life in general.

If the experimental branch lasts a long time, it should be updated with the current trunk (the head revision) as the mainline development proceeds. This will ease integration of the experimental patch when it's completed and merged back into the mainline code.

Branching discipline

Just as commits should be as granular as possible, any branches we create should be tied as closely as possible to the underlying work being done. For example, if we're working on refactoring in an experimental branch, we shouldn't begin making unrelated changes to another system in the same branch. Instead, hold off on making those changes, or make them in the parent revision and update our revision against the mainline code.

Issue and bug management

It's equally important to maintain a list of defects, enhancements, and so on. Ideally, everyone involved in a project will use the same system. This allows developers, QA, the client, or anybody else involved to create help tickets, address deficiencies, and so on.

Note that the structure for doing this varies wildly across organizations. It will not always be possible or appropriate to use our client's system. In cases like this, it's still a good idea to keep an internal issue management system in place for development purposes.

Using an issue tracking system can consolidate the location of our high-level to-do list, our immediate task list, our defect tracking, and so on. In a perfect world, we can enter all issues into our system and categorize them in a way meaningful to us and/or our clients. A "bug" is different from an "enhancement" and should be treated as such. An enhancement might require authorization to implement, it could have hidden implications, and so on. On the other hand, a bug is something that is not working as expected (whether it's an implementation or specification issue), and should be treated with appropriate urgency.

The categories chosen for issue tracking also depend on the application environment, client, and so on. A few are safe, such as bug, enhancement, and so on. We can also have labels such as "tweak", "refactoring", and so on, primarily intended for internal development use, to indicate that an issue is development-oriented and not necessarily client-driven.

Issue priorities can be used to derive work lists. (And sometimes it's nice to knock off a few easy, low-priority issues to make it seem like something was accomplished.) A set of defined and maintained issue priorities can be used as part of an acceptance specification. One requirement might be that the application shouldn't contain any "bug"-level issues with a priority higher than three, meaning all priority one and priority two bugs must be resolved before the client accepts the deliverable.

This can also lead to endless, wildly entertaining discussions between us and the client, covering the ultimate meaning of "priority", debating the relative importance of various definitions of "urgent", and so on. It's important to have an ongoing dialog with the client in order to avoid running into these discussions late in the game. Always deal with discrepancies early in the process, and always document them.

Linking to the SCCS

Some environments will enjoy a tight coupling between their SCCS and issue tracking systems. This allows change sets for a specific issue to be tracked and understood more easily.

When no such integration is available, it's still relatively easy to link the SCCS to the issue tracking system. The two easiest ways to implement this are providing issue tracking information prominently in the SCCS commit comment or by including change set information in the issue tracking system (for example, when an issue is resolved, include a complete change set list).

Note that by following a convention in commit comments, it's usually possible to extract a complete list of relevant source changes by looking for a known token in the complete history output. For example, if we always referenced issue tracking items by an ID such as (say) "ID #42: Fix login validation issue", we could create a regular expression that matches this, and then get information about each commit that referenced this issue.
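A minimal sketch of that extraction (the helper and the sample comments are our own; only the "ID #42" convention appears in the text above):

```javascript
// Filter commit comments down to those referencing a given issue ID,
// assuming the "ID #<number>:" commit-comment convention.
function commitsForIssue(comments, issueId) {
  const pattern = new RegExp('\\bID #' + issueId + '\\b');
  return comments.filter(function (c) { return pattern.test(c); });
}

const comments = [
  'ID #42: fix login validation. Tightened the interceptor config.',
  'ID #7: update build script.',
  'ID #42: follow-up; add a regression test.'
];
console.log(commitsForIssue(comments, 42).length); // → 2
```

The word boundaries in the pattern keep "ID #42" from matching "ID #421", which is exactly the kind of detail a consistent convention makes easy to get right once.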


Wikis

Wikis lower the cost of information production and management in many ways, particularly when it's not clear up front everything that will be required or generated. By making it easy to enter, edit, and link to information, we can create an organic set of documentation covering all facets of the project. This may include processes used, design decisions, links to various reports and artifacts: anything we need and want.

The collaborative nature of wikis makes them a great way for everyone involved in a project to extend and organize everything related to the project. Developers, managers, clients, testers, deployers, anybody and everybody related to the project may be involved in the care and upkeep of the project's documentation.

Developers might keep detailed instructions on a project's build process, release notes, design documents (or at least links to them), server and data source information, and so on. Some wikis even allow the inclusion of code snippets from the repository, making it possible to create a "literate programming" environment. This can be a great way to give a high-level architectural overview of an application to a developer who may be unfamiliar with the project.

Many wikis also provide a means of exporting their content, allowing all or part of the wiki to be saved in a PDF format suitable for printed documentation. Other export possibilities exist including various help formats, and so on.

Lowering the barrier to collaborative documentation generation enables wide-scale participation in the creation of various documentation artifacts.

RSS and IRC/chat systems

RSS allows us quick, normalized access to (generally) time-based activities. For example, developers can keep an RSS feed detailing their development activities. The feed creation can come from an internal company blog, a wiki, or other means. The RSS feed can also be captured as a part of the development process documentation.

Particularly in distributed development environments, a chat system can be invaluable for handling ad hoc meetings and conversations. Compared to email, more diligence is required in making sure that decisions are captured and recorded in an appropriate location.

Both RSS and IRC/chat can be used by our application itself to report on various issues, status updates, and so on, in addition to more traditional logging and email systems. Another advantage is that there are many RSS and chat clients we can keep on our desktops to keep us updated on the status of our application. And let's face it, watching people log in to our system and trailing them around the website can be addictive.

Word processor documents

We should avoid creating extensive word processor documents as the main documentation format. There are quite a few reasons for that: it can be more difficult to share in their creation, more difficult to extract portions of documents for inclusion in other documents, some word processors will only produce proprietary formats, and so on.

It's substantially more flexible to write in a format that allows collaborative participation, such as a wiki, or a text-based format such as DocBook that can be kept in our SCCS, exported to a wide variety of formats, and linked into and out of other sections or documents.

When proprietary formats must be used, we should take advantage of whatever functionality they offer in terms of version management, annotations, and so on. When a section changes, adding a footnote with the date and rationalization for the change can help track accountability.

Note that some wikis can be edited in and/or exported to various formats, which may help them fit into an established corporate structure more readily. There are also a number of services popping up these days that help manage projects in a more lightweight manner than has typically been available.


Summary

This part of the article started with ways of documenting JSPs and JavaScript. We then looked at ways of documenting development itself, including the source code control system and the issue and bug management system.


You've been reading an excerpt of:

Apache Struts 2 Web Application Development
