Documentum Content Management Foundations: EMC Proven Professional Certification Exam E20-120 Study Guide

By Pawan Kumar
About this book

This is a complete study guide, including study material and practice questions, for the EMC Proven Professional certification Exam E20-120. It can also serve Documentum beginners and practitioners as a handy guide and quick reference to the technical fundamentals, and it is fully up to date for Documentum 5.3. Beginners are introduced to concepts in a logical order, while practitioners can use it as a reference and jump directly to the relevant concepts.

EMC Documentum is a leading enterprise content management technology platform that helps enterprises to streamline the capture, processing, and distribution of business information including documents, records, e-mails, web content, images, reports, and digital assets. It can also automate entire business processes in accordance with business rules.

EMC Proven Professional is an exam-based certification program, which introduced a new EMC Proven Content Management Application Developer (EMCAD) track in early 2007. The first exam in this track is Content Management Foundations (CMF) Associate-level Exam, with exam code E20-120, which tests knowledge about technical fundamentals of Documentum. This book is a study guide to help you prepare for this exam with hundreds of practice questions and an efficient exam-preparation strategy.

Publication date: June 2007
Publisher: Packt
Pages: 284
ISBN: 9781847192400

 

Chapter 1. ECM Basics

In this chapter, we will explore the following concepts:

  • Content and metadata

  • Repository and Content Server

  • Various features of the Documentum platform

This chapter introduces key content management concepts in Documentum terminology. The concepts are described at a high level to provide an overview of the breadth of the platform. These concepts are explored in detail in the following chapters.

Content and Metadata

Databases are ubiquitous in modern technology solutions. This is a mature field and well-known best practices are routinely used for deploying databases. Databases provide standard means for accessing and manipulating structured data. Structured means that the data components (fields) are of specific type (integer, string, etc.) and this knowledge helps in querying and manipulating the data.

On the other hand, files stored on the file system are generally unstructured and can contain information in any form. Such files and the unstructured information contained therein are collectively referred to as content.

While databases provide standard means of managing structured data, content management systems (CMS) are a relatively new phenomenon. Since the content itself is unstructured, it is not possible to read and understand the content without any prior knowledge about it. Therefore, some structured data is attached to each content item, which describes the content item. This data that provides information about the attached content item is called metadata.

Content management systems utilize metadata extensively to provide sophisticated functionality. For example, metadata is essential for making documents searchable in terms of their author, title, subject, or keywords. It is hard to imagine any functionality of Documentum that does not utilize metadata in one form or another. The following figure shows two content items and their associated metadata:
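The relationship between content and metadata can be illustrated with a minimal Python sketch. It is not Documentum code: the `ContentItem` class, the `find_by` helper, and the sample field values are all hypothetical names introduced here for illustration. The point is only that the content itself stays opaque while the search runs entirely against the structured metadata.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """An unstructured payload plus the structured metadata that describes it."""
    content: bytes                                  # opaque, unstructured bytes
    metadata: dict = field(default_factory=dict)    # named, queryable fields


def find_by(items, **criteria):
    """Return the items whose metadata matches every given field exactly."""
    return [
        item for item in items
        if all(item.metadata.get(key) == value for key, value in criteria.items())
    ]


# Two content items with hypothetical metadata values.
items = [
    ContentItem(b"%PDF-...", {"title": "Q1 Report", "author": "jdoe", "subject": "finance"}),
    ContentItem(b"\x89PNG...", {"title": "Logo", "author": "asmith", "subject": "branding"}),
]

# The search never inspects the content bytes, only the metadata.
matches = find_by(items, author="jdoe")
print([m.metadata["title"] for m in matches])  # prints ['Q1 Report']
```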

 

Repository


Content management systems need to manage both content and metadata. By default, EMC Documentum uses the host file system to store the content and a database to manage the metadata and its association with the content items. Note that the content can also be stored in other types of storage systems, including a Relational Database Management System (RDBMS), content-addressed storage (CAS), or external storage devices.

Note

EMC coined the term content-addressed storage (CAS) in 2002 when it released its Centera product. CAS provides a digital fingerprint for a stored content item. The fingerprint (also known as an ID or logical address) ensures that it is exactly the same item that was saved. No duplicates are ever stored in CAS.
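The idea behind content-addressed storage can be sketched in a few lines of Python. This is a conceptual illustration only and says nothing about how Centera works internally: the `ContentAddressedStore` class is hypothetical, and the use of SHA-256 as the fingerprint is an assumption made for the example.

```python
import hashlib


class ContentAddressedStore:
    """Toy store in which the address of an item is derived from its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The fingerprint doubles as the logical address (SHA-256 is an assumption).
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored only once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Recomputing the fingerprint confirms the item is exactly what was saved.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self):
        return len(self._blobs)


store = ContentAddressedStore()
first = store.put(b"annual report")
second = store.put(b"annual report")   # duplicate content
third = store.put(b"meeting minutes")

print(first == second)  # prints True
print(len(store))       # prints 2
```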

A repository is a managed unit of content and metadata storage and includes areas on the file system and a database. However, the details of the organization of the files and metadata in a repository are hidden from the users and applications that need to interact with the repository. The repository is managed and made available to the users and applications via standard interfaces by a Content Server process. The following figure shows the basic structure of a repository:

The repository was known as docbase in pre-5.3 versions of the Documentum platform. These two terms are used interchangeably by the Documentum community.
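The split described above, with content on the file system and metadata in a database behind a single interface, can be sketched as follows. This is a toy model under stated assumptions, not the Documentum storage layout: the `ToyRepository` class, the `objects` table, and its column names are hypothetical, and SQLite stands in for the RDBMS.

```python
import os
import sqlite3
import tempfile
import uuid


class ToyRepository:
    """Keeps content in files and metadata in a database, hiding both from callers."""

    def __init__(self, root: str):
        self._root = root
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE objects ("
            "object_id TEXT PRIMARY KEY, name TEXT, author TEXT, path TEXT)"
        )

    def save(self, name: str, author: str, content: bytes) -> str:
        object_id = uuid.uuid4().hex
        path = os.path.join(self._root, object_id)
        with open(path, "wb") as handle:           # content goes to the file system
            handle.write(content)
        self._db.execute(                           # metadata goes to the database
            "INSERT INTO objects VALUES (?, ?, ?, ?)",
            (object_id, name, author, path),
        )
        return object_id

    def fetch(self, object_id: str):
        row = self._db.execute(
            "SELECT name, author, path FROM objects WHERE object_id = ?",
            (object_id,),
        ).fetchone()
        if row is None:
            raise KeyError(object_id)
        name, author, path = row
        with open(path, "rb") as handle:
            content = handle.read()
        # Callers see metadata and content, never the storage path.
        return {"name": name, "author": author}, content


with tempfile.TemporaryDirectory() as root:
    repo = ToyRepository(root)
    object_id = repo.save("notes.txt", "jdoe", b"hello")
    metadata, content = repo.fetch(object_id)

print(metadata, content)
```

The caller works only with an object identifier, which mirrors how the repository hides the organization of files and metadata from users and applications.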

 

Content Server


Content Server serves content to applications, which in turn provide friendly interfaces to human users. Content Server brings the stored content and metadata to life and manages its lifecycle. It exposes a known interface for using the content while hiding the details of how and where files and metadata are stored.

The term Content Server is used in two contexts — the Content Server software that is installed and resides on the file system and the Content Server instance, which is the running process that resides in memory and serves content at run time. However, there is little chance of confusion since the usage is often clear from the context and the term Content Server is typically used without additional qualification (software or process/instance).

A Content Server instance is dedicated to and manages only one repository. However, we will see later in architecture discussion that multiple Content Server instances can be dedicated to the same repository. This is typically done for performance reasons where the multiple Content Server processes divide up the load for serving content from the same repository. The following figure shows two Content Servers serving one repository:

Content Server is the foundation of the Documentum platform and provides the following services:

  1. Content management services

  2. Process management services (workflows)

  3. Security service for content and metadata in the repository

  4. Distributed services

These features are described here briefly and in more detail in later chapters.

Content Management Services

Content management services include library services (checkin and checkout of objects stored in the repository), version control, and archiving. The Content Server uses an object-oriented model and stores everything as an object in the repository.
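The checkout, checkin, and version control cycle can be pictured with a small Python model. It is a simplification for illustration, not the Content Server object model: the `VersionedObject` class and its methods are hypothetical, and versions are simply kept as a list rather than as labeled versions.

```python
class VersionedObject:
    """Toy object with an exclusive checkout lock and a version history."""

    def __init__(self, name: str, content: bytes):
        self.name = name
        self.versions = [content]   # index 0 is the first version
        self.locked_by = None

    def checkout(self, user: str) -> bytes:
        # Only one user may hold the object for editing at a time.
        if self.locked_by is not None:
            raise PermissionError(f"{self.name} is checked out by {self.locked_by}")
        self.locked_by = user
        return self.versions[-1]

    def checkin(self, user: str, new_content: bytes) -> int:
        if self.locked_by != user:
            raise PermissionError(f"{user} has not checked out {self.name}")
        self.versions.append(new_content)   # earlier versions are preserved
        self.locked_by = None
        return len(self.versions)           # number of versions now stored


doc = VersionedObject("policy.doc", b"draft text")
working_copy = doc.checkout("alice")

try:
    doc.checkout("bob")                     # rejected while alice holds the lock
except PermissionError as error:
    print(error)

version_count = doc.checkin("alice", working_copy + b", revised")
print(version_count)                        # prints 2
```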

Metadata can be retrieved using Document Query Language (DQL), which is a superset of Structured Query Language used with databases (ANSI SQL). DQL can query objects and their relationships, in addition to any database tables registered to be queried using DQL.

Data dictionary stores information about object types in the repository. The default data dictionary can be altered by addition of user-defined object types and properties. Properties are also known as attributes and the two terms are used interchangeably to refer to metadata.

Virtual documents link multiple component documents together into a larger document. An individual document can be part of multiple virtual documents. The assembly of virtual documents can also be controlled by business rules and data stored in the repository.
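A virtual document can be modeled as an ordered list of references to component documents. The sketch below is a conceptual illustration, not the Documentum implementation: the `Document` and `VirtualDocument` classes are hypothetical, and the `include` predicate is a stand-in for the business rules that can control assembly.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    text: str


@dataclass
class VirtualDocument:
    """An ordered collection of references to component documents."""
    name: str
    components: list = field(default_factory=list)

    def assemble(self, include=lambda doc: True) -> str:
        # The predicate decides which components appear in the assembled result.
        return "\n".join(doc.text for doc in self.components if include(doc))


intro = Document("intro", "Welcome")
legal = Document("legal", "Legal notice")

# The same component is referenced by two virtual documents; it is not copied.
manual = VirtualDocument("manual", [intro, legal])
brochure = VirtualDocument("brochure", [legal])

print(manual.assemble())
print(manual.assemble(include=lambda doc: doc.name != "legal"))
```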

Collaborative services can be deployed with an optional license and make collaboration features available in client applications. Collaborative features (Documentum 5.3) include:

  • Room: This is a secured area within a repository with defined membership and access restrictions.

  • Discussion: This is a comment thread associated with an object.

  • Contextual folder: This is a folder with attached description and discussion.

  • Note: This is a simple document with built-in discussion and rich text content.

Note

Documentum 6 is expected to introduce new collaborative features such as polls and calendars.

Retention Policy Services (RPS) is an optional product and enables use of policies to manage the lifecycle of the objects stored in the repository. A retention policy defines the phases through which such an object passes and how it is finally disposed of or exported.

Process Management Services

Process management services (features) include the following:

  • Workflows: Workflows typically represent business processes and model event-oriented applications. Workflows can be defined for documents, folders (representing the contained documents), and virtual documents. A workflow definition acts like a template and multiple workflow instances can be created from one workflow definition.

  • Lifecycle: A lifecycle defines the stages through which a document passes. For each stage, prerequisites can be defined and actions can be defined that are performed prior to an object's entry into that stage.
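The lifecycle description above can be sketched as a sequence of stages, each with an entry prerequisite and an action that runs before the object enters the stage. This is a minimal model under stated assumptions: the `Stage` and `Lifecycle` classes, the stage names, and the sample prerequisites are hypothetical and do not reflect how Documentum stores lifecycle definitions.

```python
class Stage:
    """One lifecycle stage with an entry prerequisite and a pre-entry action."""

    def __init__(self, name, prerequisite=None, entry_action=None):
        self.name = name
        self.prerequisite = prerequisite or (lambda doc: True)
        self.entry_action = entry_action or (lambda doc: None)


class Lifecycle:
    def __init__(self, stages):
        self.stages = list(stages)

    def attach(self, doc):
        """Place a document in the first stage."""
        self._enter(doc, 0)

    def promote(self, doc):
        """Move a document to the next stage, if there is one."""
        next_index = doc["stage_index"] + 1
        if next_index >= len(self.stages):
            raise ValueError("document is already in the final stage")
        self._enter(doc, next_index)

    def _enter(self, doc, index):
        stage = self.stages[index]
        if not stage.prerequisite(doc):
            raise ValueError(f"prerequisite for stage {stage.name!r} not met")
        stage.entry_action(doc)          # runs before the document enters the stage
        doc["stage_index"] = index
        doc["stage"] = stage.name


def stamp_approved(doc):
    doc["approved"] = True


lifecycle = Lifecycle([
    Stage("Draft"),
    Stage("Review", prerequisite=lambda doc: bool(doc.get("title"))),
    Stage("Approved",
          prerequisite=lambda doc: "reviewed_by" in doc,
          entry_action=stamp_approved),
])

doc = {"title": "Travel Policy"}
lifecycle.attach(doc)
lifecycle.promote(doc)                   # Draft -> Review
doc["reviewed_by"] = "asmith"
lifecycle.promote(doc)                   # Review -> Approved
print(doc["stage"], doc["approved"])     # prints Approved True
```

One `Lifecycle` object can be attached to any number of documents, in the same way that a workflow definition acts as a template for many workflow instances.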

Security Services

A repository uses access control lists (ACLs), also known as permission sets, as the security mechanism by default. The repository security can be turned off as well.

When the repository security model is ACL, each object has an associated ACL. The ACL provides object-level permissions to users and groups.

When the repository security is enabled, the Content Server enforces seven levels of basic permissions and six levels of extended permissions. There are additional privileges and security components, which are discussed in later chapters.
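A hierarchical permission check of this kind can be sketched in Python. The seven level names and their numeric order below (None through Delete) are the commonly documented Documentum basic permissions rather than something stated in this chapter, so treat them as an assumption. The `ACL` class is hypothetical, and its rule of taking the highest level granted to the user or any of the user's groups is a simplification that ignores extended permissions and other security components.

```python
from enum import IntEnum


class Permit(IntEnum):
    """Basic permission levels; each level implies all the levels below it."""
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7


class ACL:
    """Toy permission set mapping user or group names to a basic permission."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def effective(self, user, groups=()):
        # Simplifying assumption: the most permissive applicable entry wins.
        grants = [self.entries[name] for name in (user, *groups) if name in self.entries]
        return max(grants, default=Permit.NONE)

    def allows(self, user, required, groups=()):
        return self.effective(user, groups) >= required


acl = ACL({"jdoe": Permit.READ, "editors": Permit.WRITE})

print(acl.allows("jdoe", Permit.BROWSE))                      # prints True
print(acl.allows("jdoe", Permit.WRITE))                       # prints False
print(acl.allows("jdoe", Permit.WRITE, groups=["editors"]))   # prints True
```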

Content Server supports accountability and troubleshooting through its auditing and tracing facilities. Auditing can track operations, such as checkin or checkout, that have been configured to be audited. Tracing can provide detailed run-time information that is useful for debugging applications.

Electronic signatures can enforce sign-offs in business processes. A sign-off is a way of authorizing or approving a decision similar to signing off on paper.

Distributed Services

A Documentum installation can include multiple repositories and the Content Server is aware of distributed configurations that deployments can take.

The Content Server provides an application programming interface (API) and therefore needs a layer in front of it to expose its capabilities to human users. Documentum provides desktop and web-based client applications and supports creation of custom applications of either type. The following figure shows several client applications:

The full set of Content Server features is exposed via Documentum Foundation Classes (DFC) and Documentum Client Library (DMCL). DFC provides the API for interacting with the Content Server programmatically.

Documentum also provides a Web Development Kit (WDK) to facilitate development and customization of web-based client applications.

Documentum provides two interactive utilities for interacting with the Content Server using queries — IDQL and IAPI. These utilities are typically used by administrators and developers.

 

Checkpoint


At this point you should be able to answer the following key questions:

  1. What is content and what is metadata?

  2. What is a repository and what is Content Server? What is the relationship between the two?

  3. What services are provided by the EMC Documentum platform? What features are enabled by these services?

 

Test Your Understanding


  1. A comma-separated value (CSV) file is not content since it contains structured information (True/False).

  2. Where is metadata stored in a repository?

    a. Files

    b. File properties

    c. Database

    d. None of the above

  3. Content and metadata are served by the repository (True/False).

  4. Which of the following statements are correct?

    a. One Content Server instance can serve two repositories

    b. One repository can be served by two Content Server instances

    c. One Content Server instance can serve only one repository

  5. DQL can be used to query database tables (True/False).

  6. The collaborative service feature(s) offered by the Content Server is/are:

    a. Discussions

    b. Calendars

    c. Chat

    d. Notes

  7. Workflows can be defined for:

    a. Documents

    b. Jobs

    c. Folders

    d. Lifecycles

  8. Content Server provides accountability features via:

    a. Jobs

    b. Audit trail

    c. Tracing

    d. Documentum Administrator

  9. Documentum offers the following interactive query utilities:

    a. WDK

    b. IDQL

    c. DMCL

    d. IAPI

  10. The default repository security mechanism is:

    a. ACL

    b. Permission set

    c. Alias set

    d. Login

About the Author
  • Pawan Kumar

    Pawan Kumar is a Technical Architect with current expertise in Enterprise Content Management with EMC Documentum. He has an MS in Computer Science from the University of North Carolina at Chapel Hill and a BS in Electrical Engineering from the Indian Institute of Technology, New Delhi (India). Pawan has experience developing products as well as delivering business solutions on the Documentum platform and has created two products for this platform. He is intimately familiar with effective processes and tools for achieving business objectives through Documentum-based technology solutions. He has led and executed requirements and design workshops, architecture design, scoping, estimation, project planning, resource planning, technical design, software development, software testing, solution roll-out, and ongoing support for the deployed solutions. Pawan has been architecting, designing, and developing enterprise applications for ten years. He has developed software systems for the financial services, healthcare, pharmaceutical, logistics, energy services, and retail industries. His expertise spans solution architecture, document management, system integration, web content management, business process management, imaging and input management, and custom application development. Currently, Pawan provides consulting and training services through doQuent (http://doquent.com), which was founded with the vision of enabling client success in content-related business initiatives. He also believes in giving back to the community. He founded the free online Documentum community dm_cram (http://dmcram.org), which is a test preparation resource for Documentum exams. He is also an active contributor to the Documentum-users Yahoo! user group, where Documentum community members seek help for their technical challenges. He can be reached at pk@doquent.com.
