Documentum Content Management Foundations: EMC Proven Professional Certification Exam E20-120 Study Guide

Product type Book
Published in Jun 2007
Publisher Packt
ISBN-13 9781847192400
Pages 284 pages
Edition 1st Edition
Author: Pawan Kumar

Chapter 1. ECM Basics

In this chapter, we will explore the following concepts:

  • Content and metadata

  • Repository and Content Server

  • Various features of the Documentum platform

This chapter introduces key content management concepts in Documentum terminology. The concepts are described at a high level to provide an overview of the breadth of the platform. These concepts are explored in detail in the following chapters.

Content and Metadata

Databases are ubiquitous in modern technology solutions. This is a mature field, and well-known best practices are routinely used for deploying databases. Databases provide standard means for accessing and manipulating structured data. Structured means that the data components (fields) are of specific types (integer, string, etc.), and this knowledge helps in querying and manipulating the data.

On the other hand, files stored on the file system are generally unstructured and can contain information in any form. Such files and the unstructured information contained therein are collectively referred to as content.

While databases provide standard means of managing structured data, content management systems (CMS) are a relatively new phenomenon. Since the content itself is unstructured, it is not possible to read and understand the content without any prior knowledge about it. Therefore, some structured data is attached to each content item, which describes the content item. This data that provides information about the attached content item is called metadata.
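The pairing of unstructured content with structured metadata can be sketched in a few lines of Python. This is only an illustration of the concept, not how Documentum represents objects internally; the `ContentItem` class, the `find_by_metadata` helper, and the sample property values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """An unstructured file plus the structured metadata that describes it."""
    content: bytes                                  # opaque bytes; no structure assumed
    metadata: dict = field(default_factory=dict)    # named, queryable properties


def find_by_metadata(items, **criteria):
    """Return the items whose metadata matches every given property value."""
    return [
        item for item in items
        if all(item.metadata.get(key) == value for key, value in criteria.items())
    ]


# Two hypothetical content items with their metadata.
items = [
    ContentItem(b"%PDF-1.4 ...", {"title": "Q1 Report", "author": "jdoe", "subject": "finance"}),
    ContentItem(b"Meeting notes ...", {"title": "Kickoff Notes", "author": "asmith", "subject": "projects"}),
]

# The search never inspects the content bytes; it relies on metadata alone.
matches = find_by_metadata(items, author="jdoe")
print([item.metadata["title"] for item in matches])
```

The search never has to interpret the file contents, which is the reason metadata is attached in the first place.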

Content management systems utilize metadata extensively to provide sophisticated functionality. For example, metadata is essential for making documents searchable in terms of their author, title, subject, or keywords. It is hard to imagine any functionality of Documentum that does not utilize metadata in one form or another. The following figure shows two content items and their associated metadata:

Repository


Content management systems need to manage both content and metadata. EMC Documentum uses the host file system (by default) to store the content and a database to manage metadata and its association with the content items. Note that the content can also be stored in other types of storage systems, including a Relational Database Management System (RDBMS), content-addressed storage (CAS), or external storage devices.

Note

EMC coined the term content-addressed storage (CAS) in 2002 when it released its Centera product. CAS provides a digital fingerprint for a stored content item. The fingerprint (also known as an ID or logical address) ensures that it is exactly the same item that was saved. No duplicates are ever stored in CAS.
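The idea behind content addressing can be illustrated with a small in-memory sketch. It assumes SHA-256 as the fingerprint function, which is a choice made here for illustration and not a statement about how Centera computes its addresses; the `ContentAddressedStore` class is hypothetical.

```python
import hashlib


class ContentAddressedStore:
    """Toy store in which the address of an item is derived from its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The fingerprint doubles as the logical address (SHA-256 is an assumption).
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored only once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Recomputing the fingerprint confirms the item is exactly what was saved.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self):
        return len(self._blobs)
```

Saving the same bytes twice returns the same address and leaves a single stored copy, which mirrors the no-duplicates property described in the note.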

A repository is a managed unit of content and metadata storage and includes areas on the file system and a database. However, the details of the organization of the files and metadata in a repository are hidden from the users and applications that need to interact with the repository. The repository is managed and made available to the users and applications via standard interfaces by a Content Server process. The following figure shows the basic structure of a repository:

The repository was known as docbase in pre-5.3 versions of the Documentum platform. These two terms are used interchangeably by the Documentum community.
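As a rough sketch of this arrangement, the following toy repository keeps content in a directory and metadata in a database table, while callers see only object identifiers. It is a simplified illustration under stated assumptions: the `ToyRepository` class, its table layout, and the file naming are hypothetical and bear no relation to the actual Documentum storage schema.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path


class ToyRepository:
    """Content lives on the file system; metadata lives in a database."""

    def __init__(self, storage_dir):
        self._storage = Path(storage_dir)
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE objects ("
            "object_id TEXT PRIMARY KEY, title TEXT, author TEXT, file_name TEXT)"
        )

    def save(self, content: bytes, title: str, author: str) -> str:
        object_id = uuid.uuid4().hex
        file_name = f"{object_id}.bin"
        # The content goes to the file store ...
        (self._storage / file_name).write_bytes(content)
        # ... and the metadata, plus its link to the file, goes to the database.
        self._db.execute(
            "INSERT INTO objects VALUES (?, ?, ?, ?)",
            (object_id, title, author, file_name),
        )
        return object_id

    def metadata(self, object_id: str) -> dict:
        title, author = self._db.execute(
            "SELECT title, author FROM objects WHERE object_id = ?", (object_id,)
        ).fetchone()
        return {"title": title, "author": author}

    def content(self, object_id: str) -> bytes:
        # Callers never see file names or paths, only object identifiers.
        (file_name,) = self._db.execute(
            "SELECT file_name FROM objects WHERE object_id = ?", (object_id,)
        ).fetchone()
        return (self._storage / file_name).read_bytes()


with tempfile.TemporaryDirectory() as storage_dir:
    repo = ToyRepository(storage_dir)
    object_id = repo.save(b"Hello, repository", title="Greeting", author="jdoe")
    print(repo.metadata(object_id), repo.content(object_id))
```

Because the storage layout stays behind the interface, it could change without affecting the code that saves and retrieves objects.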

Content Server


Content Server serves content to applications, which in turn provide friendly interfaces to human users. Content Server brings the stored content and metadata to life and manages its lifecycle. It exposes a known interface for using the content while hiding the details of how and where files and metadata are stored.

The term Content Server is used in two contexts: the Content Server software, which is installed and resides on the file system, and the Content Server instance, which is the running process that resides in memory and serves content at run time. There is little chance of confusion, however, since the intended meaning is usually clear from the context. The term Content Server is therefore typically used without additional qualification (software or process/instance).

A Content Server instance is dedicated to and manages only one repository. However, we will see later in the architecture discussion that multiple Content Server instances can be dedicated to the same repository. This is typically done for performance reasons, where the multiple Content Server processes divide up the load for serving content from the same repository. The following figure shows two Content Servers serving one repository:

Content Server is the foundation of the Documentum platform and provides the following services:

  1. Content management services

  2. Process management services (workflows)

  3. Security service for content and metadata in the repository

  4. Distributed services

These features are described here briefly and in more detail in later chapters.

Content Management Services

Content management services include library services (checkin and checkout of objects stored in the repository), version control, and archiving. The Content Server uses an object-oriented model and stores everything as an object in the repository.
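The interplay of checkout, checkin, and version control can be sketched as follows. This is a minimal model that assumes checkout places an exclusive lock and checkin appends a new version; the `VersionedDocument` class is hypothetical and does not use the Content Server API.

```python
class VersionedDocument:
    """Toy model of library services: a lock plus a growing list of versions."""

    def __init__(self, name, content):
        self.name = name
        self.versions = [content]   # every checked-in version is retained
        self.locked_by = None       # user currently holding the checkout lock

    def checkout(self, user):
        # Only one user may hold the object for editing at a time.
        if self.locked_by is not None:
            raise RuntimeError(f"{self.name} is already checked out by {self.locked_by}")
        self.locked_by = user
        return self.versions[-1]

    def checkin(self, user, new_content):
        # Only the lock holder may check in; earlier versions remain available.
        if self.locked_by != user:
            raise PermissionError(f"{user} does not hold the lock on {self.name}")
        self.versions.append(new_content)
        self.locked_by = None
        return len(self.versions)   # number of versions after this checkin
```

A second user who attempts a checkout while the lock is held gets an error, and each checkin leaves the earlier versions intact.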

Metadata can be retrieved using Document Query Language (DQL), which is a superset of Structured Query Language used with databases (ANSI SQL). DQL can query objects and their relationships, in addition to any database tables registered to be queried using DQL.
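To give a feel for how a DQL query resembles SQL, the sketch below builds a query string in Python. It assumes the standard `dm_document` type with the `r_object_id`, `object_name`, `title`, and repeating `authors` properties, and it assumes that single quotes in string literals are escaped by doubling them; verify both against the DQL reference for your version. The helper functions are hypothetical, and the code only constructs the text of the query without executing it.

```python
def dql_literal(value: str) -> str:
    """Quote a string for DQL, doubling embedded single quotes (assumed rule)."""
    return "'" + value.replace("'", "''") + "'"


def documents_by_author_dql(author: str) -> str:
    """Build a DQL query that selects documents by one of their authors."""
    # ANY is used because authors is assumed to be a repeating property.
    return (
        "SELECT r_object_id, object_name, title "
        "FROM dm_document "
        f"WHERE ANY authors = {dql_literal(author)}"
    )


print(documents_by_author_dql("Pawan Kumar"))
```

The resulting text could then be run from a query tool or passed to the server through a client library.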

The data dictionary stores information about object types in the repository. The default data dictionary can be altered by adding user-defined object types and properties. Properties are also known as attributes, and the two terms are used interchangeably to refer to metadata.

Virtual documents link multiple component documents together into a larger document. An individual document can be part of multiple virtual documents. The assembly of virtual documents can also be controlled by business rules and data stored in the repository.
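The structure of a virtual document can be pictured as a tree whose nodes are documents. The sketch below is a simplified model, assuming assembly means collecting component text in order; the `Document` class and the sample document names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document that may also act as a virtual document with components."""
    name: str
    text: str = ""
    components: list = field(default_factory=list)

    def assemble(self):
        """Collect this document's text followed by its components, in order."""
        parts = [self.text] if self.text else []
        for component in self.components:
            parts.extend(component.assemble())
        return parts


# One component (the legal notice) is shared by two virtual documents.
legal = Document("Legal Notice", "All rights reserved.")
user_guide = Document("User Guide", components=[Document("Introduction", "Welcome."), legal])
admin_guide = Document("Admin Guide", components=[Document("Setup", "Install steps."), legal])

print(user_guide.assemble())
print(admin_guide.assemble())
```

Because both guides reference the same `legal` object, a change to the legal notice shows up in both assembled documents.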

Collaborative services can be deployed with an optional license and make collaboration features available in client applications. Collaborative features (Documentum 5.3) include:

  • Room: This is a secured area within a repository with defined membership and access restrictions.

  • Discussion: This is a comment thread associated with an object.

  • Contextual folder: This is a folder with attached description and discussion.

  • Note: This is a simple document with built-in discussion and rich text content.

Note

Documentum 6 is expected to introduce new collaborative features such as polls and calendars.

Retention Policy Services (RPS) is an optional product and enables use of policies to manage the lifecycle of the objects stored in the repository. A retention policy defines the phases through which such an object passes and how it is finally disposed of or exported.

Process Management Services

Process management services (features) include the following:

  • Workflows: Workflows typically represent business processes and model event-oriented applications. Workflows can be defined for documents, folders (representing the contained documents), and virtual documents. A workflow definition acts like a template and multiple workflow instances can be created from one workflow definition.

  • Lifecycle: A lifecycle defines the stages through which a document passes. For each stage, prerequisites can be defined and actions can be defined that are performed prior to an object's entry into that stage.
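The lifecycle description above can be sketched as a small state machine. This is an illustration under stated assumptions rather than the Content Server implementation: the `Stage` class, the `promote` function, and the stage names Draft, Review, and Approved are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


def _always(doc):
    return True


def _nothing(doc):
    return None


@dataclass
class Stage:
    name: str
    prerequisite: Callable[[dict], bool] = _always    # must hold before entry
    entry_action: Callable[[dict], None] = _nothing   # performed prior to entry


def promote(doc: dict, stages: list) -> str:
    """Move the document to the next stage if that stage's prerequisite is met."""
    names = [stage.name for stage in stages]
    position = names.index(doc["stage"])
    if position + 1 >= len(stages):
        raise ValueError("document is already in the final stage")
    target = stages[position + 1]
    if not target.prerequisite(doc):
        raise ValueError(f"prerequisites for {target.name!r} are not met")
    target.entry_action(doc)      # the action runs before the stage changes
    doc["stage"] = target.name
    return doc["stage"]


def mark_approved(doc):
    doc["approved"] = True


stages = [
    Stage("Draft"),
    Stage("Review", prerequisite=lambda doc: bool(doc.get("reviewer"))),
    Stage("Approved", entry_action=mark_approved),
]
```

A document without a reviewer cannot leave Draft in this sketch, and entering Approved sets a flag before the stage changes.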

Security Services

A repository uses access control lists (ACLs), also known as permission sets, as the security mechanism by default. The repository security can be turned off as well.

When ACL-based repository security is in effect, each object has an associated ACL. The ACL provides object-level permissions to users and groups.

When the repository security is enabled, the Content Server enforces seven levels of basic permissions and six levels of extended permissions. There are additional privileges and security components, which are discussed in later chapters.
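The basic permissions are hierarchical, so each level implies the ones below it. The sketch below models this with the seven commonly documented level names (None, Browse, Read, Relate, Version, Write, Delete), which are not listed in this chapter and are included here as background. The `effective_permit` resolution rule (take the highest level granted to the user or any of the user's groups) is a simplifying assumption, and all function and user names are hypothetical.

```python
from enum import IntEnum


class Permit(IntEnum):
    """Seven basic permission levels; a higher level includes the lower ones."""
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7


def effective_permit(acl: dict, user: str, groups=()) -> Permit:
    """Highest level granted to the user directly or via a group (simplified)."""
    granted = [acl[name] for name in (user, *groups) if name in acl]
    return max(granted, default=Permit.NONE)


def can(acl: dict, user: str, needed: Permit, groups=()) -> bool:
    """True if the user's effective permission is at least the needed level."""
    return effective_permit(acl, user, groups) >= needed


# A hypothetical ACL attached to one object.
acl = {"jdoe": Permit.WRITE, "reviewers": Permit.READ}

print(can(acl, "jdoe", Permit.VERSION))                      # WRITE includes VERSION
print(can(acl, "asmith", Permit.WRITE, groups=("reviewers",)))
```

Extended permissions are not hierarchical in this way and are left out of the sketch.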

Content Server provides robust accountability capabilities via auditing and tracing facilities. Auditing can track operations such as checkin or checkout that have been configured to be audited. Tracing can provide detailed run-time information useful for debugging applications.

Electronic signatures can enforce sign-offs in business processes. A sign-off is a way of authorizing or approving a decision similar to signing off on paper.

Distributed Services

A Documentum installation can include multiple repositories and the Content Server is aware of distributed configurations that deployments can take.

The Content Server provides an application programming interface (API) and therefore needs a layer in front of it to expose its capabilities to human users. Documentum provides desktop and web-based client applications and supports creation of custom applications of either type. The following figure shows several client applications:

The full set of Content Server features is exposed via Documentum Foundation Classes (DFC) and Documentum Client Library (DMCL). DFC provides the API for interacting with the Content Server programmatically.

Documentum also provides a Web Development Kit (WDK) to facilitate development and customization of web-based client applications.

Documentum provides two interactive utilities for interacting with the Content Server using queries: IDQL and IAPI. These utilities are typically used by administrators and developers.

Checkpoint


At this point you should be able to answer the following key questions:

  1. What is content and what is metadata?

  2. What is a repository and what is Content Server? What is the relationship between the two?

  3. What services are provided by the EMC Documentum platform? What features are enabled by these services?

Test Your Understanding


  1. A comma-separated value (CSV) file is not content since it contains structured information (True/False).

  2. Where is metadata stored in a repository?

    a. Files

    b. File properties

    c. Database

    d. None of the above

  3. Content and metadata are served by the repository (True/False).

  4. Which of the following statements are correct?

    a. One Content Server instance can serve two repositories

    b. One repository can be served by two Content Server instances

    c. One Content Server instance can serve only one repository

  5. DQL can be used to query database tables (True/False).

  6. The collaborative service feature(s) offered by the Content Server is/are:

    a. Discussions

    b. Calendars

    c. Chat

    d. Notes

  7. Workflows can be defined for:

    a. Documents

    b. Jobs

    c. Folders

    d. Lifecycles

  8. Content Server provides accountability features via:

    a. Jobs

    b. Audit trail

    c. Tracing

    d. Documentum Administrator

  9. Documentum offers the following interactive query utilities:

    a. WDK

    b. IDQL

    c. DMCL

    d. IAPI

  10. The default repository security mechanism is:

    a. ACL

    b. Permission set

    c. Alias set

    d. Login
