Getting Started with IBM FileNet P8 Content Manager
Install, customize, and administer the powerful IBM FileNet Enterprise Content Management platform using this book and eBook
Although we will be using FileNet Enterprise Manager (FEM) to illustrate most of the features, remember that these are really features of the Content Engine (CE) itself. As such, they are sometimes available in other tools and always available via the CE APIs. The point is not to show how to operate FEM itself, but to illustrate Content Manager (CM) features. Because FEM is an administrator's tool, we'll concentrate mostly on administrative concepts and tasks.
In this article by William J. Carpenter, author of Getting Started with IBM FileNet P8 Content Manager, we'll cover:
- A discussion of the P8 Domain and what's in it
- How to use topology levels to configure your environment
- A discussion of the objects that reside directly within a Domain
The following will be covered in the next article:
- A discussion of an Object Store and what's in it
- An example of creating a custom class and adding custom properties to it
FEM must run on a Microsoft Windows machine. Even if you are using virtual machine images or other isolated servers for your CM environment, you might wish to install FEM on a normal Windows desktop machine for your own convenience.
Domain and GCD
Here's a simple question: what is a P8 Domain? It's easy to give a simple answer—it's the top-level container of all P8 things in a given installation. That needs a little clarification, though, because it seems a little circular; things are in a Domain because a Domain knows about them.
In a straightforward technical sense, things are in the same Domain if they share the same Global Configuration Database (GCD). The GCD is, literally, a database. If we were installing additional CE servers, they would share that GCD if we wanted them to be part of the same Domain.
When you first open FEM and look at the tree view in the left-hand panel, most of the things you are looking at are things at the Domain level. We'll be referring to the FEM tree view often, and we're talking about the left-hand part of the user interface, as seen in the following screenshot:
FEM remembers the state of the tree view from session to session. When you start FEM the next time, it will try to open the nodes you had open when you exited. That will often mean something of a delay as it reads extensive data for each open Object Store node. You might find it a useful habit to close up all of the nodes before you exit FEM.
Most things within a Domain know about and can connect directly to each other, and nothing in a given Domain knows about any other Domain.
The GCD, and thus the Domain, contains:
- Simple properties of the Domain object itself
- Domain-level objects
- Configuration objects for more complex aspects of the Domain environment
- Pointers to other components, both as part of the CE environment and external to it
It's a little bit subjective as to which things are objects and which are pointers to other components. It's also a little bit subjective as to what a configuration object is for something and what a set of properties is of that something. Let's not dwell on those philosophical subtleties. Let's instead look at a more specific list:
- Properties: These simple properties control the behavior of or describe characteristics of the Domain itself.
  - Name and ID: Like most P8 objects, a Domain has both a Name and an ID. It's counterintuitive, but you will rarely need to know these, and you might even sometimes forget the name of your own Domain. The reason is that you will always be connecting to some particular CE server, and that CE server is a member of exactly one Domain. Therefore, all of the APIs related to a Domain object are able to use a defaulting mechanism that means "the current Domain".
  - Database schemas: There are properties containing the database schemas for an Object Store for each type of database supported by P8. CM uses this schema, which is an actual script of SQL statements, by default when first fleshing out a new Object Store to create tables and columns. Interestingly, you can customize the schema when you perform the Object Store creation task (either via FEM or via the API), but you should not do so on a whim.
  - Permissions: The Domain object itself is subject to access controls, and so it has a Permissions property. The actual set of access rights available is specific to Domain operations, but it is conceptually similar to access control on other objects.
- Domain-level objects: A few types of objects are contained directly within the Domain itself. We'll talk about configuration objects in a minute, but there are a couple of non-configuration objects in the Domain.
  - AddOns: An AddOn is a bundle of metadata representing the needs of a discrete piece of functionality that is not built into the CE server. Some are provided with the product, and others are provided by third parties. An AddOn must first be created, and it is then available in the GCD for possible installation in one or more Object Stores.
  - Marking Sets: Marking Sets are a Mandatory Access Control mechanism, covered under Security Features and Planning. Individual markings can be applied to objects in an Object Store, but the overall definition resides directly under the Domain so that they may be applied uniformly across all Object Stores.
- Configuration objects:
  - Directories: All CM authentication and authorization ultimately comes down to data obtained from an LDAP directory. Some of those lookups are done by the application server, and some are done directly by the CE server. The directory configuration objects tell the CE server how to communicate with that directory or directories.
  - Subsystem configurations: There are several logical subsystems within the CE that are controlled by their own flavors of subsystem configuration objects. Examples include trace logging configuration and CSE configuration. These are typically configured at the Domain level and inherited by lower-level topology nodes. A description of topology nodes is coming up in the next section of this article.
- Pointers to components:
  - Content Cache Areas: The Domain contains configuration information for content caches, which are handy for distributed deployments.
  - Rendition Engines: The Domain contains configuration and connectivity information for separately installed Rendition Engines (sometimes called publishing engines).
  - Fixed Content Devices: The Domain contains configuration and connectivity information for external devices and federation sources for content.
  - PE Connection Points and Isolated Regions: The Domain contains configuration and connectivity information for the Process Engine.
  - Object Stores: The heart of the CE ecosystem is the collection of Object Stores.
  - Text Search Engine: The Domain contains configuration and connectivity information for a separately installed Content Search Engine.
In addition to the items directly available in the tree view shown above, most of the remainder of the items contained directly within the Domain are available one way or another in the pop-up panel you get when you right-click on the Domain node in FEM and select Properties.
The pop-up panel General tab contains FEM version information. The formatting may look a little strange because the CM release number, including any fix packs, and build number are mapped into the Microsoft scheme for putting version info into DLL properties. In the previous figures, the four-part version number encodes the CM release, the fix pack level, and the build number (build 100 in this case). That's reinforced by the internal designation of the build number, dap451.100, in parentheses. Luckily, you don't really have to understand this scheme. You may occasionally be asked to report the numbers to IBM support, but a faithful copying is all that is required.
There is an explicit hierarchical topology for a Domain. It shows up most frequently when configuring subsystems. For example, CE server trace logging can be configured at any of the topology levels, with the most specific configuration settings being used. What we mean by that should be clearer once we've explained how the topology levels are used. You can see these topology levels in the expanded tree view in the left-hand side of FEM in the following screenshot:
At the highest level of the hierarchy is the Domain, discussed in the previous section. It corresponds to all of the components in the CE part of the CM installation.
Within a Domain are one or more sites. The best way to think of a site is as a portion of a Domain located in a particular geographic area. That matters because networked communications differ in character between geographically separate areas when compared to communications within an area. The difference in character is primarily due to two factors: latency and bandwidth. Latency is a characterization of the amount of time it takes a packet to travel from one end of a connection to another. It takes longer for a network packet to travel a long distance, both because of the laws of physics and because there will usually be more network switching and routing components in the path. Bandwidth is a characterization of how much information can be carried over a connection in some fixed period of time. Bandwidth is almost always more constrained over long distances due to budgetary or capacity limits. Managing network traffic traveling between geographic areas is an important planning factor for distributed deployments.
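To make the latency and bandwidth distinction concrete, here is a rough back-of-the-envelope sketch in Python. The formula (round trips times latency, plus payload size divided by bandwidth) is a common simplification, and every number below is a made-up figure for illustration, not a measurement from any CM deployment.

```python
def transfer_seconds(size_bytes, latency_s, bandwidth_bps, round_trips=1):
    """Rough estimate: time spent waiting on round trips plus time on the wire."""
    return round_trips * latency_s + (size_bytes * 8) / bandwidth_bps


# Hypothetical links: a fast local network versus a slower long-distance one.
LOCAL = {"latency_s": 0.001, "bandwidth_bps": 1_000_000_000}
REMOTE = {"latency_s": 0.150, "bandwidth_bps": 20_000_000}

# One large document (10 MB): bandwidth dominates the remote case.
big_local = transfer_seconds(10_000_000, **LOCAL)
big_remote = transfer_seconds(10_000_000, **REMOTE)

# Fifty small request/response exchanges (2 KB in total): latency dominates.
chatty_local = transfer_seconds(2_000, round_trips=50, **LOCAL)
chatty_remote = transfer_seconds(2_000, round_trips=50, **REMOTE)

print(f"10 MB document: local {big_local:.3f} s, remote {big_remote:.3f} s")
print(f"50 small calls: local {chatty_local:.3f} s, remote {chatty_remote:.3f} s")
```

With these assumed figures, the large transfer is limited mostly by bandwidth, while the many small exchanges are limited almost entirely by latency. That is why both factors matter when deciding what belongs in which site.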
A site contains one or more virtual servers. A virtual server is a collection of CE servers that act functionally as if they were a single server (from the point of view of the applications). Most often, this situation comes about through the use of clustering or farming for high availability or load balancing reasons. A site might contain multiple virtual servers for any reason that makes sense to the enterprise. Perhaps, for example, the virtual servers are used to segment different application mixes or user populations.
A virtual server contains one or more servers. A server is a single, addressable CE server instance running in a J2EE application server. These are sometimes referred to as physical servers, but in the 21st century that is often not literally true. In terms of running software, the only things that tangibly exist are individual CE servers. There is no independently-running piece of software that is the Domain or GCD. There is no separate piece of software that is an Object Store (except in the sense that it's a database mediated by the RDBMS software). All CE activity happens in a CE server.
There may be other servers running software in CM—Process Engine, Content Search Engine, Rendition Engine, and Application Engine. The previous paragraph is just trying to clarify that there is no piece of running software representing the topology levels other than the server. You don't have to worry about runtime requests being handed off to another level up or down the topological hierarchy.
Not every installation will have the need to distinguish all of those topology levels. In our all-in-one installation, the Domain contains a single site. That site was created automatically during installation and is conventionally called Initial Site, though we could change that if we wanted to. The site contains a single virtual server, and that virtual server contains a single server.
This is typical for a development or demo installation, but you should be able to see how it could be expanded with the defined topology levels to any size deployment, even to a deployment that is global in scope. You could use these different topology levels for a scheme other than the one just described; the only downside would be that nobody else would understand your deployment terms.
Using topology levels
We mentioned previously that many subsystems can be configured at any of the levels. Although it's most common to do domain-wide configuration, you might, for example, want to enable trace logging on a single CE server for some troubleshooting purpose. When interpreting subsystem configuration data, the CE server first looks for configuration data for the local CE server (that is, itself). If any is found, it is used. Otherwise, the CE server looks for configuration data for the containing virtual server, then the containing site, and then the Domain. Where present, the most specific configuration data is used.
A set of configuration data, if used, is used as the complete configuration. That is, the configuration objects at different topology levels are not blended to create an "effective configuration".
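The lookup order just described can be sketched in a few lines of Python. This is only a model of the behavior as explained above, not the CE implementation or its API; the class name, the node names other than Initial Site, and the configuration keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TopologyNode:
    """Hypothetical stand-in for a Domain, site, virtual server, or server."""
    name: str
    parent: Optional["TopologyNode"] = None
    # Subsystem name -> the complete configuration set at this level, if any.
    configs: dict = field(default_factory=dict)


def effective_config(node, subsystem):
    """Walk from the most specific node upward and return (source, config).

    The first configuration found is used whole; levels are never blended.
    """
    current = node
    while current is not None:
        if subsystem in current.configs:
            return current.name, current.configs[subsystem]
        current = current.parent
    return None, None


# Illustrative topology: Domain -> site -> virtual server -> two servers.
domain = TopologyNode("Domain", configs={"trace": {"enabled": False, "subsystems": []}})
site = TopologyNode("Initial Site", parent=domain)
vserver = TopologyNode("virtual server", parent=site)
server1 = TopologyNode(
    "server1",
    parent=vserver,
    configs={"trace": {"enabled": True, "subsystems": ["Database"]}},
)
server2 = TopologyNode("server2", parent=vserver)

print(effective_config(server1, "trace"))  # uses server1's own settings
print(effective_config(server2, "trace"))  # falls back to the Domain's settings
```

In this model, server1 keeps its own trace settings no matter what is configured at the Domain, and none of the Domain values leak into it. A server with no configuration of its own simply picks up the nearest one above it.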
CE has a feature called request forwarding. Because the conversation between the CE server and the database holding an Object Store is chattier than the conversation between CE clients and the CE server, there can be a performance benefit to having requests handled by a CE server that is closer, in networking terms, to that database. When a CE server forwards a request internally to another CE server, it uses a URL configured on a virtual server. The site object holds the configuration options for whether CE servers can forward requests and whether they can accept forwarded requests.
Sites are the containers for content cache areas, text index areas, Rendition Engine connections, storage areas, and Object Stores. That is, each of those things is associated with a specific site.
Exploring Domain-level items
Let's spend a little time looking in more detail at some of the Domain objects. In the APIs and some documentation, these are sometimes called non-repository objects to distinguish them from objects that reside within a repository, that is, an Object Store. These objects are primarily of interest to an administrator and are seldom manipulated by end-user applications.
There are two common ways to access these objects and dialog boxes in FEM. Some are accessed directly from the tree view in the left-hand side, sometimes by right-clicking and selecting from the context menu. Others are available via the multi-tabbed pop-up dialog when you select Properties from the context menu of the Domain node in the tree view.
As is generally the pattern in FEM, everything in the Domain is available, though in more of a self-guided form, if you select the Properties tab on that pop-up panel. This property sheet, available for most objects, lets you view or edit individual properties of the object. It is always the case that the CE server is the enforcer of things like allowable property values, security access, and referential integrity. FEM is sophisticated, but it's still just a client application. FEM cannot allow you to violate CE rules, although it cannot always prevent you from doing unwise things. The following screenshot is the dialog box that opens after you select Properties as shown in the previous screenshot:
An AddOn is a CE packaging mechanism containing metadata (and possibly instance data) for an Object Store. An AddOn is expected to represent things needed for some discrete application or piece of functionality, but that's not enforced. AddOns are created in the GCD but are installed into individual Object Stores.
If you click the AddOns node in the tree, you will see a list of AddOns available in the GCD. All of the AddOns you see in the screenshot are provided with CM, but additional AddOns may be present in any installation because AddOns can be created via the CE APIs.
In addition to being Recommended or Optional, an AddOn may have Pre-requisites. The CE server will not let you install an AddOn into an Object Store unless its prerequisites are installed, either beforehand or concurrently.
There is a mechanism for superseding an AddOn with an updated version, but AddOns cannot be uninstalled from an Object Store once installed. You will therefore want to consider carefully before installing an AddOn into an Object Store. A step near the end of the wizard for creating an Object Store offers to install AddOns for you, but you can also install them at any later time (by right-clicking an individual Object Store node and navigating to All Tasks | Install AddOn). The Base Content Engine Extensions AddOn should always be installed into a new Object Store as most applications depend on the few things it provides (for example, the custom property DocumentTitle). For any other AddOn, we suggest you avoid installing it at Object Store creation time unless you are sure you will actually need it.
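The two AddOn rules described above (prerequisites must be installed beforehand or concurrently, and nothing is ever uninstalled) can be modeled with a small Python sketch. This is not how the CE server implements the check, and the function name is hypothetical. Apart from Base Content Engine Extensions, the AddOn name and the prerequisite relationship are invented for illustration.

```python
def install_addons(installed, requested, prerequisites):
    """Return the new set of installed AddOns, or raise if a prerequisite is missing.

    A prerequisite counts as satisfied if it is already installed or is part of
    the same request. The installed set only ever grows.
    """
    result = set(installed) | set(requested)
    for addon in sorted(requested):
        missing = set(prerequisites.get(addon, ())) - result
        if missing:
            raise ValueError(f"{addon} is missing prerequisites: {sorted(missing)}")
    return result


# Hypothetical prerequisite data for illustration only.
PREREQS = {
    "Example Application Extensions": ["Base Content Engine Extensions"],
}

object_store = set()

# Installing the dependent AddOn on its own is refused.
try:
    install_addons(object_store, {"Example Application Extensions"}, PREREQS)
except ValueError as err:
    print("refused:", err)

# Installing both concurrently is accepted.
object_store = install_addons(
    object_store,
    {"Example Application Extensions", "Base Content Engine Extensions"},
    PREREQS,
)
print(sorted(object_store))
```

Because the model offers no way to shrink the installed set, it also reflects why it pays to think carefully before installing an AddOn you might not need.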
Fixed Content Devices
Document content can be stored in the Object Store database, a traditional filesystem, or in what CM calls a Fixed Content Device (FCD). The term FCD includes vendor-provided devices that do not allow changes to content once it is written, but the name is arguably a slight misnomer because it also includes federated content sources. From the point of view of the CE server, none of these devices allow updates to content files.
Before you can use any FCD from an Object Store, you must first register it in the GCD so that the CE server will know how to communicate with it. If you right-click on the Fixed Content Devices node in the FEM tree view and select New Fixed Content Device, a wizard will walk you through the steps of configuring connectivity to the FCD. The configuration parameters that are needed vary with the device type.
Server Cache configuration
The Server Cache tab gives you control over the characteristics of several caches maintained internally by the CE server. As you can probably tell from the names of the items and the general subject of caching, the values set here can have a dramatic impact on performance. The purpose of each of these items is described in the P8 product documentation.
You shouldn't change these values recklessly. The default values are chosen to be reasonable for most use cases, though circumstances do vary. The same advice applies for other performance tuning settings described in this article.
Specific changes will sometimes be recommended by IBM support in response to specific situations. There is also some guidance for appropriate values in the FileNet P8 Performance Tuning Guide.
Near the bottom right corner is a button labeled Copy Values. This is a general mechanism that you will see in several of the configuration panels. We mentioned earlier that many subsystem configurations could be set at the domain, site, virtual server, or server levels, with the more specific entries being used. The Copy Values button gives a convenient way to copy a set of configuration values from one of those other topology nodes to this node. If you click the button, you will see a pop-up panel that lets you navigate to any node from which you wish to copy values.
The Content tab contains settings for tuning the behavior of background operations for content uploads, downloads, and related movements. Compared to other kinds of updates, content uploads (from the client to the CE server) can take a long time to complete. The CE server and APIs avoid transaction timeout issues by arranging to upload the content outside of the transaction that actually commits the content to the repository. Under some circumstances, content is uploaded in chunks that must be reassembled for final storage.
Both of these things lead to a certain amount of behind-the-scenes bookkeeping, background processing, and clean-up. The settings on this tab tune the performance and behavior of that background activity.
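As a loose illustration of that bookkeeping, the following Python sketch stages chunks first and then commits the reassembled content in one short final step. It is a simplified model under stated assumptions, not the CE server's actual mechanism; the class, function, and variable names are all hypothetical, and a plain dictionary stands in for the repository.

```python
class UploadSession:
    """Hypothetical staging area for the content chunks of one document."""

    def __init__(self, expected_chunks):
        self.expected_chunks = expected_chunks
        self._chunks = {}

    def put_chunk(self, index, data):
        # Chunks may arrive in any order, outside of any commit step.
        if not 0 <= index < self.expected_chunks:
            raise IndexError(f"chunk index {index} out of range")
        self._chunks[index] = data

    def assemble(self):
        # Reassembly refuses to proceed until every chunk has arrived.
        missing = [i for i in range(self.expected_chunks) if i not in self._chunks]
        if missing:
            raise ValueError(f"missing chunks: {missing}")
        return b"".join(self._chunks[i] for i in range(self.expected_chunks))

    def discard(self):
        # Clean-up: staged chunks are dropped once they are no longer needed.
        self._chunks.clear()


def commit(repository, doc_id, session):
    """Short final step: store the reassembled content, then clean up staging."""
    repository[doc_id] = session.assemble()
    session.discard()


repository = {}
session = UploadSession(expected_chunks=3)
session.put_chunk(2, b"!")          # slow uploads happen before the commit
session.put_chunk(0, b"Hello, ")
session.put_chunk(1, b"world")
commit(repository, "doc-1", session)
print(repository["doc-1"])
```

The slow part (moving bytes) is kept apart from the quick part (recording the result), which is the general idea behind avoiding transaction timeouts. The staging and clean-up steps are the kind of background activity the settings on this tab are meant to tune.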
This panel has the Copy Values button that we noted earlier. Another thing to notice, directly under the tabs, is the information Configuration Source: This Object. If you looked at the same tab on the site, virtual server, or server topology node, you would see similar text indicating that the configuration came from the Domain level, and all of the input boxes would be disabled. That's a handy way of seeing where the configuration actually lives and whether or not you have overridden the higher-level configuration at a particular topology node. In contrast, the figure below shows the Content tab for server server1.
The CE server writes two kinds of log files. The error log file is for important exceptions and is always enabled. The trace log file is for troubleshooting and has to be specifically enabled. The Trace Control tab is for configuring tracing inside the CE server.
Although similar technologies are used to provide trace logging in the client APIs, the configuration mechanisms are completely separate. The panels in FEM control only tracing within the CE server and do not apply to any client tracing.
The trace log files themselves are intended for use by IBM support personnel. The formats and meanings of the trace log files, therefore, are not precisely documented and often change from release to release. Also, the contents of trace log files are not localized in the way that other IBM components are. You will generally be enabling trace logging under the direction of IBM support, in which case they will tell you specific things to enable.
You may become familiar with some of the contents of trace log files in the natural course of events, and you may decide to gather trace log files for your own purposes. There's nothing wrong with that, but there are a few important things to remember:
- Trace logging can be very verbose in the output it produces. It's generally a good strategy to start with a minimal amount of trace logging and add to it to widen your diagnostic search. Even the most obvious diagnostic clues can be missed when buried amidst tons of irrelevant material.
- Trace logging can have a significant performance impact on the CE server. This is not just because of the volume of logged output that has to be written. When trace logging is enabled, conditional code in the CE server will sometimes do additional lookups, calculations, and other expensive operations to create the logged entries.
Let's look at the Trace Control tab for configuring trace logging:
Trace logging is organized in two ways:
- There are functional subsystems that can be individually enabled or disabled. The names of these subsystems are suggestive (and documented in the P8 product documentation). When troubleshooting a particular situation, it is very rare to need to select more than a handful of subsystems. IBM support will often request logging for only one or two subsystems.
- There are different levels of verbosity: Summary, Moderate, Detail, and Timer. Although the user interface makes it look like these verbosity levels are strictly hierarchical (if you select Detail, the user interface will automatically select Moderate and Summary), they are actually independent. Nevertheless, you will always want to select a verbosity level and all of the less verbose levels.
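The idea of independent verbosity levels that are nevertheless selected together can be modeled with bit flags. The Python sketch below is purely illustrative and does not reflect how the CE server stores trace settings. The names are hypothetical, and leaving Timer outside the Summary, Moderate, Detail ordering is an assumption of this sketch.

```python
from enum import Flag, auto


class TraceLevel(Flag):
    """Hypothetical flags: each level can be switched on independently."""
    SUMMARY = auto()
    MODERATE = auto()
    DETAIL = auto()
    TIMER = auto()


# Least to most verbose. TIMER is deliberately left out of this ordering;
# where it belongs relative to the others is an assumption of the sketch.
_ORDER = [TraceLevel.SUMMARY, TraceLevel.MODERATE, TraceLevel.DETAIL]


def with_less_verbose(level):
    """Return the level combined with every less verbose level."""
    if level not in _ORDER:
        return level
    selected = level
    for lower in _ORDER[: _ORDER.index(level)]:
        selected |= lower
    return selected


detail_only = TraceLevel.DETAIL                     # valid, but rarely what you want
detail_full = with_less_verbose(TraceLevel.DETAIL)  # Detail plus Moderate and Summary

print(TraceLevel.SUMMARY in detail_only)
print(TraceLevel.SUMMARY in detail_full)
```

A helper like `with_less_verbose` mirrors what the user interface does for you when you tick a box, while the flag representation shows why selecting Detail alone is technically possible.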
If you find that performance still drags or that the trace log file continues to grow even after you have disabled trace logging in the Domain configuration, it could be that trace logging is still configured at a more specific level. That's very easy to overlook, especially in more complex deployments or where CM administration duties are shared.
By default, CE server trace log files are written to an application server-specific directory. Unfortunately, the FEM configuration panel won't tell you what the default location is, so you'll have to consult the product documentation. For example, on our WAS setup, the trace log files are written to /opt/ibm/WebSphere/AppServer/profiles/AppSrv02/FileNet/server1/p8_server_trace.log. You can specify a different location for the trace log files by supplying a value for that field. Remember, the location is from the CE server's point of view, not the local filesystem of the machine where FEM is running.
The final domain level feature we'll look at is content caching. Content caching is a CE feature useful in geographically distributed deployments. It enables you to have a transient local copy of document content while the true master copy is stored at a distant site. Although the content is a cached copy, the CE server always performs the normal security access checks and also guarantees that an application will not be seeing a stale copy of the content from the cache. A content cache is affiliated with a specific site, but the configuration of cache tuning parameters can be done for a site or on a domain-wide basis.
In this article, we took a brief tour of some common administrative features of CM. In the next article, we will explore Object Store-level items.
About the Author:
William J. Carpenter is an Enterprise Content Management Architect with IBM in the Seattle, Washington, area. He has worked in the Enterprise Content Management business since 1998 as a developer, development manager, and architect. He is co-author of the books IBM FileNet Content Manager Implementation Best Practices and Recommendations and Developing Applications with IBM FileNet P8 APIs, is a Contributing Author on IBM developerWorks, and is a frequent conference presenter. He has experience in building large software systems at Fortune 50 companies and has also served as the CTO of an Internet startup. He has been a frequent mailing list and patch contributor to several open source projects. He holds degrees in Mathematics and Computer Science from Rensselaer Polytechnic Institute in Troy, New York.