Managing Multimedia and Unstructured Data in the Oracle Database

About this book
Multimedia is the new digital frontier. Managers, software architects, administrators, and developers need to fully comprehend this exciting new technology, as its widespread use and acceptance cannot be ignored any longer.

"Managing Multimedia and Unstructured Data in the Oracle Database" will give you a complete understanding of how to manage all data, especially multimedia. You will learn all the latest terminology, how to set up a database, load digital objects, search on them, and even how to sell them. Whether you are a manager or database administrator, this book will give you the knowledge you need to take control of this rapidly growing and industry-changing technology, which is transforming our lives.

Starting with the basic principles of unstructured data and detailing the concepts behind multimedia warehouses and digital asset management systems, this book will describe how to load this data, search against it, display it intelligently, and deliver it to customers and users. Learn how all these concepts work within the Oracle 11g R2 database environment and how to tune the database effectively to manage it.

Begin to learn about this new and exciting field and use it to give your business a competitive edge or give yourself the ability to take a leadership role in this exciting new computing genre.
Publication date:
March 2013


Chapter 1. What is Unstructured Data?

There has been a noticeably slow uptake in the use of databases to manage unstructured data, in particular multimedia data. The technology at both the hardware and software levels for the management of multimedia is both mature and stable. What prevents sites from moving to storing multimedia in the database is a lack of expertise and understanding, together with a conservative view fostered by a number of factors, including historical issues with performance and integration software.

Initially it is important to define what multimedia is in relation to structured and unstructured data. Unstructured data is any data that is not stored in a structured format. Structured data is anything that has an enforced composition to the atomic data types(1).

A relational database stores data in a structured format. Some non-relational databases also store their data in a structured format, so relational data can be considered a subset of structured data. XML is also considered structured, as is data stored inside object-oriented databases. Because the structure of XML is fluid, one can consider XML as semi-structured.

There is a large amount of unstructured data in the real world that needs managing. In the last ten years most organizations have begun to recognize that there is a great need to manage it and to understand it. As unstructured data refers to anything that is not structured, it can become very difficult to understand what is out there and how to deal with it. The traditional thinking has been to just treat it as a blob (binary large object), but with a greater understanding of the variety of unstructured data types that exist, the need to manage them has grown.

To help understand this point, think of geometry and the rules (mathematics) associated with it. When mathematicians tried to come to grips with circles, triangles, and other shapes, the subject was seen to be so complex that they started with the basic concepts first. This meant dealing with geometry in a two-dimensional world. In this world view, triangles had three sides with three angles that always added up to 180 degrees. Parallel lines never met. By focusing only on this world view, a greater understanding of geometry was formed. Core principles were established along with a lot of formulas and mathematics. In this analogy, the two-dimensional world is equivalent to structured data.

Once this two-dimensional world had become well studied and understood, focus moved to the real three-dimensional world to see how it would behave. The three-dimensional world proved to be very complex, which forced a focus on key areas that could be understood. These included the study of knots, symmetry, surfaces with holes, and curves. Some of the two-dimensional rules flowed through to the three-dimensional world, but others did not. Parallel lines can meet, and the angles of a triangle can add up to more or less than 180 degrees.

In this analogy the unstructured data is the three-dimensional world and there is a need to understand what is in it. Just like there exists no thorough understanding of three-dimensional geometry, so there is no full understanding of the unstructured data. It is an evolving and growing discipline as more information and experiences are gathered, tested, and learnt. So, like the notion of studying knots, holes, and curves, one can also focus on key areas of the unstructured data and learn from them. One key component is multimedia, which contains video, audio, photographs, and documents.

Multimedia is also referred to as rich media. It is not limited to the four types identified, and some might even debate whether documents are a component of multimedia. As will be shown, when breaking down multimedia into its fundamental components, one can classify these multimedia types and then develop new types from them. These include three-dimensional objects, simulation data, and neural network data.

The analogy of comparing three-dimensional geometry to unstructured data works well, and one also has to consider that mathematicians have gone beyond three-dimensional geometry into multi-dimensional geometry in an effort to help explain some key components of string theory, quantum theory, and astronomy. There are still a lot of unknowns with unstructured data. The recent introduction of quantum computing, which uses qubits to store information, will undoubtedly push the field of unstructured data management into completely new areas(2).

Just as there is overlap between the two-dimensional world and the three-dimensional world, so there is overlap between multimedia and structured data. The two are dependent on each other at the moment, but eventually, with improvements in technology, this might change. The rules formulated today might change tomorrow. It's important to realize that as technology changes, the rules change. Working in multimedia is trying to hit a moving target. What is right today might be invalidated tomorrow.


Digital data

Digital data can be broken down into structured digital data and unstructured digital data. Structured data is best known as relational data, but is really any text-based data stored in such a way that enables it to be accessed and queried to an agreed standard.

Relational data is stored in a well-defined mathematical structure with official rules and standards for accessing and manipulating it. In the market there are other types of databases that store text data and conform to other standards (for example, ADABAS, IMS/DB).

Any data that is not stored in a well-defined structured format can by default be seen as unstructured. The traditional view is that unstructured data is just any binary data.

There is a fuzzy area between structured and unstructured, more akin to saying there are degrees of structure and there is a lot of overlap.

It's possible to store unstructured data in a column in a relational table, which is structured. The physical database files containing structured data are binary, are stored in a proprietary format without well-defined rules, and are considered unstructured. A proprietary format is one where the vendor (the maker of the format) controls and decides its behavior. There is no agreed standard or peer review for its format. There are gray areas here, as can be shown with the Adobe PDF format. Though the format was controlled by Adobe and considered proprietary, in 2008 it was made open and released to the general community(3).

Data stored in NoSQL or XML can be considered to be stored in a semi-structured format. For XML there are rules for accessing and querying it, but the data itself and its structure can vary. It can conform to agreed standards or be stored in a raw format.

Just saying that text data is structured and binary data is unstructured is not sufficient, as a text file (created in notepad or vi) can contain a random set of characters without definition or rules, and without conforming to any standard.

The unstructured data can be broken down into different groups. A well-known group is multimedia or rich media. Here there are types such as digital image, audio, video, and document (though there are more in this list). Some of these types are well defined and can contain embedded XML that conforms to an agreed set of standards (this is covered further in Chapter 2, Understanding Digital Objects). The format of the binary data can also follow agreed rules. The digital image format JPEG is an open standard. For video, MPEG is also an open standard. Multimedia is therefore a category of unstructured data that is well defined. The category is fluid, changes as technology changes, and is unlikely to conform to the mathematical and well-proven relational structure.

So we can now define all data as follows:

  • Structured: The structured data is any data stored in a well-defined, non-proprietary system. This data is primarily text based. It typically conforms to ACID(4).

    The structured data is anything that has an enforced composition to the atomic data types(5).

  • Semi-structured: The semi-structured data is any data stored in a system that conforms to some rules and can be proprietary. This data is primarily text based. It does not have to conform to ACID.

  • Well-defined unstructured: It is the binary data that is well defined and conforms mostly to an agreed standard.

  • Unstructured: It is the binary data that is proprietary.

The challenge is that, even based on these definitions, some data falls across more than one definition. This is typical of what one encounters when dealing with unstructured data. There is no concise and easy-to-use definition. The temptation is to say that unstructured data is just any data that is not structured. But with example data sets such as NoSQL, XML, and a multitude of other storage systems, there is a feeling that they should belong to structured. In that case, is HTML structured or unstructured? HTML closely resembles XML (both descend from SGML, and XHTML is HTML reformulated as XML), but errors are tolerated in HTML and its tag names are not case sensitive, whereas XML is strict on both counts. A raw text file can be labeled as HTML and still be accepted as an HTML file, but you can't do the same with XML. An XML file with one syntax error in it is not XML because it doesn't conform to the XML rule set.
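The difference in strictness between XML and HTML can be sketched with Python's standard library. This is only an illustration: the two sample strings and the helper names (is_well_formed_xml, TagCollector, html_start_tags) are made up for this example, and the lenient HTML parser used here stands in for the tolerant behavior of browsers.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed_xml(text):
    """Return True only if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records start tags; the HTML parser does not reject sloppy markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case, whatever the input case.
        self.tags.append(tag)


def html_start_tags(text):
    """Return the start tags found when the text is read as HTML."""
    parser = TagCollector()
    parser.feed(text)
    parser.close()
    return parser.tags


# Hypothetical samples: upper-case tags and missing end tags versus tidy markup.
sloppy = "<P>Unclosed paragraph<BR><b>bold"
strict = "<p>Closed paragraph<br/><b>bold</b></p>"

print(is_well_formed_xml(sloppy))  # the XML parser rejects it
print(is_well_formed_xml(strict))  # the XML parser accepts it
print(html_start_tags(sloppy))     # the HTML parser still reads the tags
```

The same text that the HTML parser reads without complaint is rejected outright by the XML parser, which is the distinction the paragraph above draws.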

A well-known joke asks: what is the name of a boomerang that doesn't return? A stick! Yet when one looks at the true history of boomerangs, most were designed not to return. We nevertheless think of a boomerang as any object that returns when thrown. An object of any shape can be used as a boomerang. This has been shown by boomerang experts, who have made boomerangs in the shapes of letters of the alphabet just to show how versatile the ability of a thrown object to return can be. The point is that our traditional, innate sense of what something should be, and what it belongs to, is not always right.

One can also say that unstructured data is really structured data that hasn't been defined correctly yet. Because of the exceptions to the rule it might not even be valid to break data up into structured and unstructured. Yet by breaking it up and identifying each set, one can associate rules with it, understand its limitations, and formulate new concepts around it. So it is useful to be able to do this.

When we look at the situation of a digital image being stored in a relational database like Oracle, we actually see two different situations. We see the digital image, which is binary data conforming to a well-defined standard, but it's being stored in a structured system. We can see what the data represents and where it is stored as two different systems.
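A minimal sketch of this separation follows, using Python's built-in sqlite3 module purely as a stand-in for a relational database such as Oracle. The table name, column names, file name, and the placeholder bytes are all assumptions made for illustration. Only the first two bytes (the JPEG start-of-image marker FF D8) reflect the real format.

```python
import sqlite3

# Placeholder bytes standing in for a real image file (hypothetical content).
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"placeholder pixel data"

# The storage system is structured: typed columns inside a relational table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE digital_object ("
    " id INTEGER PRIMARY KEY,"
    " file_name TEXT NOT NULL,"
    " mime_type TEXT NOT NULL,"
    " content BLOB NOT NULL)"
)

# The insert runs inside a transaction, so the binary data is handled
# with the same guarantees as any other column value.
with conn:
    conn.execute(
        "INSERT INTO digital_object (file_name, mime_type, content)"
        " VALUES (?, ?, ?)",
        ("plant.jpg", "image/jpeg", fake_jpeg),
    )

file_name, mime_type, content = conn.execute(
    "SELECT file_name, mime_type, content FROM digital_object WHERE id = 1"
).fetchone()

# What the data represents is decided by the bytes, not by the table.
looks_like_jpeg = content[:2] == b"\xff\xd8"
print(file_name, mime_type, len(content), looks_like_jpeg)
```

The table knows only that the column holds bytes. Whether those bytes form an image is a property of the data itself, which is why the two can be treated as separate systems.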

So let's look at this further. If we now separate the storage mechanism from the data itself, we can have unstructured data stored in a relational database. The unstructured data is a separate entity, and even though it is handled using ACID, that is not important because the data itself is unstructured. Of course, that raises some new issues. Are some of the text elements stored in a structured database structured or unstructured? What if we store a date value, which behaves as structured, is fixed in its definition, and conforms to a mathematical standard? If the date is stored in a varchar field (a variable-length character field), then it's not structured, because any value can be put into it. We could enter 12th Jan 2005, 30-Feb 2012, or 01.02.03. Any value can be stored in it without validation. If we store an address in a varchar field, is that structured or unstructured? If we store the values in an abstract data type, the address can be classified as structured data, as methods can be applied to it and the structure is well defined and controlled. If the address is stored in only a varchar field, then any value can be added in free form and it is unstructured. A similar situation holds for names and a raft of other values (this is covered further in Chapter 3, The Multimedia Warehouse). So it appears that a lot of the individual data items in a structured database might actually be unstructured. This issue is well known in data warehouses, where a lot of time is spent cleaning the data into a structured format.
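The date example can be sketched in a few lines of Python. The free-form strings come from the text above. The accepted format (ISO year-month-day), the two ISO-style sample values, and the function name parse_strict are assumptions chosen for illustration, not something the text prescribes.

```python
from datetime import date, datetime

# Assumed rule for this sketch: only ISO year-month-day strings are accepted.
ACCEPTED_FORMAT = "%Y-%m-%d"


def parse_strict(value):
    """Return a date if the value follows the accepted format, else None."""
    try:
        return datetime.strptime(value, ACCEPTED_FORMAT).date()
    except ValueError:
        return None


# A free-form character column would accept every one of these strings.
free_form_values = [
    "12th Jan 2005",
    "30-Feb 2012",
    "01.02.03",
    "2012-02-30",  # right shape, but not a real calendar date
    "2005-01-12",
]

results = {value: parse_strict(value) for value in free_form_values}
for value, parsed in results.items():
    print(repr(value), "->", parsed)
```

Only the last value survives validation. A typed date column enforces a rule of this kind on every insert, which is what makes the stored value structured rather than free-form text.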

So again we come to a situation where trying to clearly define structured and unstructured data always brings up inconsistencies and exceptions to the rule. At this point we realize that this isn't an issue at all and come to a better understanding of how one has to rethink the whole strategy of working with unstructured data. A document can contain only photos. Is it a document or a photo album? If a video has only an audio track but no picture, is it still a video? Is an animated GIF image a video? Even when looking at two images and comparing them, how can we say they are the same? If one image differs from the other by one byte, is it still the same? If two seemingly identical videos are compared, but one is missing only the final frame, which has no audio or picture, are they the same or different? The world of unstructured data introduces us to a world where our traditional rules for dealing with commonly held concepts break down and don't make sense any more. The strict definitions we are used to and comfortable with for defining relational data fall apart when dealing with unstructured data.
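The one-byte question can be made concrete with a short sketch. The byte strings below are synthetic stand-ins for two image files, and the helper names are hypothetical. The point is only that an exact comparison, whether byte by byte or by digest, treats the two as different objects even though a viewer might not notice the change.

```python
import hashlib

# Synthetic stand-ins for two image files (not real image data).
image_a = bytes(range(256)) * 4
changed = bytearray(image_a)
changed[-1] ^= 0x01  # flip a single bit in the final byte
image_b = bytes(changed)


def sha256_hex(data):
    """Digest used as an exact-identity fingerprint of the bytes."""
    return hashlib.sha256(data).hexdigest()


def differing_bytes(a, b):
    """Count positions where the two byte strings disagree."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


print(differing_bytes(image_a, image_b))
print(sha256_hex(image_a) == sha256_hex(image_b))
```

An exact comparison answers "different", while a person looking at the pictures may well answer "the same". Which answer is right depends on the purpose, and that is the kind of rule that has to be decided explicitly when working with unstructured data.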

For a database management system to begin to correctly handle unstructured data, it must initially have support for objects. An object can be seen as a grouping of fields with associated rules. The grouping of fields can be referred to as an Abstract Data Type (or ADT). The associated rules are called methods. The data as stored can be linked directly to other data items, which is referred to as a reference. The data items themselves can repeat and can be stored hierarchically or in a nested structure. Object-oriented systems are known to conflict with relational systems because they break a number of the rules involved in data normalization(6). In the late 1990s this caused the market to divide between relational and object databases, as each offered strengths and weaknesses. Oracle managed to combine the two in its database, allowing data architects to pick the best method. With the embedding of Online Analytical Processing (OLAP) and XML into the database in later releases, the Oracle database grew from being relational to one supporting most structures.
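The object concepts named above (a grouping of fields, methods, references, and repeating items) can be sketched with Python dataclasses. This is an analogy only and not Oracle's object type syntax. The class names, fields, and sample values are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    """A grouping of fields with an enforced rule, similar in spirit to an ADT."""
    street: str
    city: str
    postcode: str

    def __post_init__(self):
        # Rule enforced on creation (an assumption made for this sketch).
        if not self.postcode.strip():
            raise ValueError("postcode must not be empty")

    def one_line(self):
        """A method: behavior that travels with the data."""
        return f"{self.street}, {self.city} {self.postcode}"


@dataclass
class Person:
    name: str
    address: Address  # a reference to another object
    phone_numbers: List[str] = field(default_factory=list)  # repeating items


home = Address("1 Example Street", "Perth", "6000")
alice = Person("Alice", home, ["555-0100", "555-0101"])
bob = Person("Bob", home)

print(home.one_line())
print(alice.address is bob.address)  # both people refer to the same object
```

The repeating phone_numbers list and the shared address reference are the kind of structures that sit awkwardly with normalized relational tables, which is the conflict the paragraph above describes.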

With the recent rise in popularity of NoSQL, the debate has again been raised about which is better to use: a relational system or a NoSQL one? Experienced data architects who remember the relational/object debate will realize that it's not really one or the other. It's a matter of using the one that can satisfy a number of conditions that are business dependent, including the ability to do the following:

  • Scale (support large numbers of users and/or large volumes of data)

  • Be open (not proprietary) or be locked into a vendor

  • Provide data integrity and prevent data corruption or loss

Most databases can enable unstructured data to be stored in them, but do not support the management, control, and manipulation of that data. Most pay only lip service to unstructured data and encourage it to be stored externally. Even Oracle, which has built-in support for unstructured data and provides a powerful database environment for handling it, still has serious limitations (this is covered further in Chapter 9, Understanding the Limitations of Oracle Products). Even though it is a market leader in unstructured data management, there are still a large number of major improvements the database needs.


Throughout this book, most chapters will cover the usage of metadata. With unstructured data management, metadata is crucial. It is the data that describes the unstructured data and gives meaning to it. Each type of unstructured data object has its own metadata. It might be as simple as a filename, or as complex as a complete set of relational records. Without metadata the unstructured data loses meaning.

The metadata is primarily used for searching. Without it, it's not possible to construct a multimedia warehouse. It is also used for assigning a description. A person might see a photo of a plant. The metadata might have a description of what that photo is, giving meaning and context to the photo.

The metadata is also used to relate unstructured data objects, which in turn adds intelligence and structure to it. It is also used to store information about the object like its name, when it was created, who created it, and who modified it.

The metadata can be used to represent any knowledge about the unstructured object. It's typically stored in a structured format. Currently the trend is to use XML, but this has not always been the case. Additionally, metadata can be matched to data in relational databases or NoSQL databases.
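As a rough sketch of metadata held as XML, the fragment below describes one digital object and is read with Python's standard library. The element names, attribute name, and values are all invented for illustration and do not follow any particular metadata standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record for a single photo.
metadata_xml = """
<metadata>
  <name>plant.jpg</name>
  <created>2012-03-01</created>
  <creator>J. Smith</creator>
  <description>Photo of a flowering plant</description>
  <related ref="plant_closeup.jpg"/>
</metadata>
"""

root = ET.fromstring(metadata_xml)

# Descriptive fields: name, creation date, creator, and description.
record = {
    child.tag: (child.text or "").strip()
    for child in root
    if child.tag != "related"
}

# Relationships to other unstructured objects.
related = [element.get("ref") for element in root.findall("related")]

print(record)
print(related)
```

Once parsed, the record is ordinary structured data, so it can be matched against rows in a relational table or documents in a NoSQL store.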

As will be shown in the following chapters, metadata usage can be rich, varied, and complex. At the moment, because of limitations in computer technology, metadata is crucial for most systems that want to use unstructured data extensively. A computer asked to "find me the video with the picture of the person John in it" would have great difficulty answering. Likewise, a request to "find me all audio files with a lyre bird singing after sunset" would be equally hard to answer. If a human operator attaches metadata containing this information, a search against that metadata can answer such questions.
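The two questions above become simple lookups once a person has attached the relevant tags. The sketch below assumes a tiny in-memory catalog. The file names, tag values, and the find function are hypothetical, and a real system would keep this metadata in the database.

```python
# Hypothetical catalog: each object carries tags attached by a human operator.
catalog = [
    {"file": "holiday.mp4", "type": "video", "tags": {"john", "beach"}},
    {"file": "forest_dusk.wav", "type": "audio",
     "tags": {"lyre bird", "after sunset"}},
    {"file": "forest_noon.wav", "type": "audio",
     "tags": {"lyre bird", "midday"}},
]


def find(objects, media_type, required_tags):
    """Return files of the given type whose tags include every required tag."""
    wanted = {tag.lower() for tag in required_tags}
    return [
        obj["file"]
        for obj in objects
        if obj["type"] == media_type and wanted <= obj["tags"]
    ]


# "Find me the video with the picture of the person John in it."
print(find(catalog, "video", ["John"]))

# "Find me all audio files with a lyre bird singing after sunset."
print(find(catalog, "audio", ["lyre bird", "after sunset"]))
```

The search never inspects the video or audio itself. It succeeds only because someone recorded what the content contains, which is why the quality of the metadata limits the quality of the results.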

Unfortunately, manually attaching metadata is a time-consuming and costly exercise. A number of sites are investigating crowdsourcing to resolve it (see Chapter 3, The Multimedia Warehouse) or just bringing in a number of people to go through and identify the unstructured data.

As computer technology improves and new algorithms are discovered, the need to attach metadata manually will diminish. Computers are already good at facial recognition and can convert speech to text. They do have major limitations, however, and still struggle in complex situations that humans handle easily. It is envisaged that in the next 20 years technology will improve to the point where algorithms that can identify objects and people in a video or photo, and understand sounds and complex speech in audio files, become commonplace. When this point is reached, the need for metadata will be reduced and constrained to a smaller, more tightly controlled subset. Even so, metadata will always exist and always be needed.

As the veil over unstructured data is slowly removed, and as knowledge and understanding grow, so will the use of metadata. As covered in the previous point, its relative use will change and diminish over time, but the market for its use will grow. For example, if the current market represented 100 units, and metadata usage represented 30 percent, that would be 30 units. If its usage over time dropped to 5 percent, that would be 5 units. But if the market expanded to 10,000 units, 5 percent would be 500 units, which is five times bigger than the entire current market. So even though the relative need will be reduced, the market as it grows will demand an increasing usage of metadata.
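The arithmetic in this example can be checked directly. The figures are the illustrative ones from the paragraph above and not measurements. The variable names are chosen for this sketch.

```python
# Illustrative figures from the text (units are arbitrary).
current_market = 100
current_share_percent = 30
future_market = 10_000
future_share_percent = 5

current_units = current_market * current_share_percent // 100
shrunk_units = current_market * future_share_percent // 100
future_units = future_market * future_share_percent // 100

print(current_units)                  # units needing metadata today
print(shrunk_units)                   # same market at the reduced share
print(future_units)                   # expanded market at the reduced share
print(future_units / current_market)  # multiple of the whole current market
print(future_units / current_units)   # multiple of today's metadata usage
```

Compared with the 30 units in use today rather than with the whole 100-unit market, the future figure is more than sixteen times larger, so the conclusion holds on either comparison.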

The uses for metadata will start to strain relational databases, and object-relational databases will be pushed to their limits to identify and handle its changing complexities. Time-based structures (effectively four-dimensional) will be needed. Oracle's flashback capabilities will need to be ramped up in data warehouses to handle large-scale, complex queries. The fuzzy data structures needed to handle the vagaries of some multimedia types are difficult to represent and query in most databases. Neural structures are another story altogether, and most computer systems can't even cope with the basic handling of them. It's feasible in concept to attach a neural network as metadata to an object type, detailing how to recognize and handle components within it(7).


Defining unstructured data

A starting point is needed for defining exactly what unstructured data is. The goal of this section is to begin to describe and define the base components of unstructured data.


In reviewing this book, an important question was raised: what is the best term to describe the concept of storing and delivering digital information? On investigation, a number of terms that came close to the mark were discovered, though none truly described the concept being expressed.

The following is a list of some of the terms discovered and reviewed, including definitions found on the Internet.


An image is a collection of data logically grouped together.

Digital file

A digital file is a collection of binary data represented as bytes, contained and assigned a name to identify it. Digital files traditionally exist within a filesystem. They can also be captured and stored in a database.

Digital image

A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels(8). It is commonly known as a digital photo.

Digital object

In various current usages, a digital object or asset may comprise a single media file or group of files including or excluding some or all associated metadata. The framework's apparent usage of a digital object to denote a single media file excluding its associated metadata should be made explicit to avoid misreading in opposition to the term's other contemporary usages. This recommendation for explicit definition would apply equally to the term digital asset should that language be adopted instead(9).

Digital content

There are a number of definitions available. They are as follows:

  • Any digital data traffic should be viewed as a digital content product

  • Digital content products would seem logically to include those that have a digital representation

  • Digital content products would include any products that are encoded in digital form

  • Products that are in digital format and that form part of the content of a repository, collection, exhibition, or archive(10)

  • The definition of digital content encompasses images, music, and videos(11)

Digital asset

A digital asset is a digital object that can be clearly identified as a singular item or component, which may be ascribed a value. A computer system built to manage these assets is referred to as a Digital Asset Management System (DAMS), which is a system for organizing and managing access to digital materials.

Digital material

This is a broad term encompassing digital surrogates created as a result of converting analogue materials to digital form (digitization), and born digital, for which there has never been and is never intended to be an analogue equivalent, and digital records(12).

Digital library

Digital libraries (DLs) are organized collections of digital information. They combine the structuring and gathering of information, which libraries and archives have always done, with the digital representation that computers have made possible(13).

A DL contains digital representations of the objects found in it. Most understanding of the DL probably also assumes that it will be accessible via the Internet, though not necessarily to everyone. But the idea of digitization is perhaps the only characteristic of a digital library on which there is a universal agreement(14).

Analyzing the digital object

Each of the preceding definitions is correct, but the issue is that none truly conveys the meaning behind what it is to manage unstructured data and deliver it. Each definition is restrictive and does not adapt to changing digital technology. Most assume a digital image is a photo or document, and all assume the objects are owned. As will be shown further, these assumptions do not stand up to closer scrutiny.

What did stand out was that most definitions conveyed the idea of representation, that is the digital information is meant to symbolize something, be it a photo, document, or video.

So which term should be used? After reviewing all terms the one that seems to have the most potential is a digital object. This is the term that will be used throughout most of the book. It is far easier to use an existing term that people are familiar with than it is to create a new one or define an acronym.

It is then important to accurately define what a digital object actually is. With technology changing, any classic definition we give today is likely to be out of date within a couple of years. The standard perception that the general public has of a digital object is a photograph taken by a digital camera. As will be explained later, a digital photograph is just a subset of type Picture. In fact, when looking at digital objects we are looking at ways of representing data, which is ultimately used by one of our traditional five senses.

When looking at the types of digital objects available, they can be broken up as follows:

  • Picture: It is a two-dimensional representation of anything. Examples: photos, drawings, paintings, and icons.

  • Document (text based): It is a set of pictures with each picture optionally representing a character from one or more well-defined character sets. Examples: Microsoft Word, e-mails, Adobe PDF files.

  • Audio: It is a time-based set of sounds. Examples: WAV files, CDs, and MP3s.

  • Video: It is a combination of an optional set of time-based pictures and an optional set of time-based sounds. Examples: a video, DVD.

These are the traditional object types used throughout the world, but one needs to consider what types of images will exist in 50 years' time. It is nearly impossible to predict this, so to accurately define a digital object we need to look at how we as humans deal with digital objects and use this to future-proof our definition. This involves looking at the senses humans use for viewing digital objects and then expanding on this.

So let's redefine a digital object as follows:

A digital object is a representation of anything, stored in binary format, to be used by our senses.

Why "to be used by our senses"? If there is no intention of use for a digital object, it can be classified as a digital file. A Windows DLL, a Unix executable, or security attributes are all digital representations of something, but they are not digital objects because they are not used by our senses. They are used by computers for the management of data. By specifying that the binary representation has to be used by our senses, the boundaries of use for that digital object are captured and can then be further defined.

The traditional view is that we have five senses: sight, sound, touch, taste, and smell. When looking in greater detail at these senses we can break them down as follows:

  • Sight: Picture, document, video. Core physical concept: photons of light. Human physical concept: light-sensitive cells within the eye, tuned to certain wavelengths of light, fire when hit by photons of light.

  • Sound: Core physical concept: vibrations of air. Human physical concept: very fine hair follicles located within the ear fire when vibrated at certain frequencies.

  • Touch: Braille, sculpture. Core physical concept: pressure and temperature. Human physical concept: nerve cells fire when pressure is applied to them or when subjected to a temperature variation.

  • Taste: Human physical concept: cells on the tongue fire when they come into contact with certain chemical substances.

  • Smell (Olfaction): Scratch and sniff card. Human physical concept: cell receptors within the nose fire when they come into contact with chemical substances.

There is not much difference between taste and smell, as both involve chemical reactions. Interestingly, we can actually taste with our nose(15). When it comes to sound we can actually feel certain low vibration sounds. For watching movies on DVDs, this is an important experience and part of the entertainment value. In this case the deep bass, which is emitted by certain speakers, is felt by the body through touch. So one digital object can be used by multiple senses.

By equating it to a sense we can resolve a number of real world problems associated with defining an image. For example, a document that can be viewed or read using sight, can be converted to Braille and then read by touch. For those familiar with the TV show Red Dwarf, in that series they even explored the concept of reading a book using smell.

Just because we currently are not using one of our senses for viewing a digital object, it does not mean it should be excluded. A good example is taste. Currently it is very hard to simulate taste in a digital sense, but this doesn't mean that in ten years' time the concept of artificial taste will not be invented.

Digital object types

A digital object can be broken down into image types. Each image type can be further broken into image subtypes. We can then apply conversion and transformation rules to each subtype to modify the digital object.

By breaking down the senses into their core concepts and then equating them to traditional image concepts, it now becomes possible to identify traditional object types and then define them.

A digital object does not need to have any meaning associated with it, nor does it have to represent a real world scene (which is the traditional view of a photograph). A picture of an abstract painting is a digital object and the white noise of an empty TV channel can be classified as a digital object.

For simplicity we will maintain a digital object as having to be stored in the binary format. Though there are audio and video formats that use analogue signals, these formats can be expressed in a binary digital format. Even when looking at artificial intelligence and the use of neural networks, this can be represented in a binary format.

Core types

Digital objects can be composed of two core types and the dimension of time. By combining these core types into different combinations, a variety of base types can be created.

The core types are as follows:

  • Image

  • Audio

The use of the dimension of time is very fluid and varies based on how it is used. Video uses a very strict definition of time. Animated GIFs use a very simple time-based sequence, whereas heraldry uses a very loose definition of time. The use of time is covered later in greater detail.

When expanding this definition to handle three-dimensional objects, the concept of an artifact is introduced. This is an object created from physical materials, but created digitally. An example is a model created from a three-dimensional printer using resins and glue.


For each object type we introduce object subtypes. For example, we can define a photograph as a real world representation of a picture. A line drawing is a hand drawing. CGI is a computer-generated image.

Grouped by human sense, with examples of image subtypes:

  • Sight: Photo, drawing, DVD, font. Examples: a wedding photo, a line drawing of a plant.

  • Sound: Music, sonar, radar. Example: an MP3 file.

  • Touch: Braille, sculpture. Examples: a pin-map that converts an image into a surface which can be felt, a force feedback glove.

  • Taste: Human taste, animal taste. Example: a recipe.

  • Smell: Chemical formula for smell.



A picture is a two-dimensional representation of anything. A picture can be viewed using all senses. A picture is defined as having a width and height assuming the picture is rectangular. For non-rectangular pictures the width and height describe the upper boundary lengths of the picture.

The following are examples of picture subtypes:

  • Geo-raster

  • Photo

  • Art

  • Line drawing

  • Montage

  • 3D view using a set of 2D images (but still 2D)

  • Stereoscopic image


Audio is a time-based set of sounds. If we investigated hard enough, we could eventually equate a sound to a picture. This is not required and to keep things simple this is not going to be done.

The following are examples of audio subtypes:

  • Music

  • Audio book


A model is a three-dimensional object that is a representation created using binary data. Three-dimensional printers are now available that can produce physical structures from three-dimensional drawings.

Creating new base types

When looking at these definitions it can be seen that the definitions for a document and a video can be expressed using the terms picture and audio. By adding a rule set we can create new digital image types. The two rule definitions used previously are as follows:

  1. Time based: a set of digital objects linked together using the dimension of time.

  2. Well-defined character set: a set of pictures or icons grouped together. UTF8 and US7ASCII are character sets. Egyptian hieroglyphs can be grouped together to form a character set.

Using these rules we can create new digital object types based on the core picture and audio image types.

The following are examples of non-traditional digital object types:


A document is a set of pictures with each picture optionally representing a character from one or more well-defined character sets. Each picture can be classed as a font.

The example subtypes are as follows:

  • Ephemera

  • Structured documents (used for signaling)

  • Forms


A video is a combination of an optional set of time-based pictures and an optional set of time-based sounds. A DVD is an example of a video subtype. A photo montage is not an example of a video subtype. In this case we have a set of pictures but because they are not time based, they can only be classified as type picture.

The example subtypes are as follows:

  • Film

  • TV

  • Documentary

  • Surveillance
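The document and video definitions above can be pictured as a small data model built from the two core types. This is only a sketch: the class names, field names, and the `classify_picture_set` helper are assumptions made for illustration and are not terminology from this book or from Oracle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Picture:
    """Core type: a two-dimensional representation of anything."""
    width: int
    height: int


@dataclass
class Audio:
    """Core type: a time-based set of sounds."""
    duration_seconds: float


@dataclass
class Document:
    """Pictures that may represent characters from well-defined character sets."""
    pages: List[Picture]
    character_sets: List[str] = field(default_factory=list)


@dataclass
class Video:
    """An optional set of time-based pictures plus an optional set of time-based sounds."""
    frames: List[Picture] = field(default_factory=list)
    soundtrack: List[Audio] = field(default_factory=list)


def classify_picture_set(pictures: List[Picture], time_based: bool) -> str:
    """A set of pictures is a video only when linked by time; a montage stays a picture."""
    return "video" if time_based else "picture"


montage = [Picture(800, 600), Picture(800, 600)]
print(classify_picture_set(montage, time_based=False))  # picture
```

The time-based flag is the only thing separating a photo montage from a video here, which mirrors the rule described in the text.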

Multimedia (Rich Media)

Multimedia is a combination of one or more object types that is optionally time based. Usually, they are created to be interactive, such as an educational program or game.

The example subtypes are as follows:

  • VRML

  • SVG

  • SMIL

  • Macromedia Flash

  • Java Applet


Data is a document that is perceived by its users as a collection of tables (and nothing but tables). This is a slight expansion on the original definition of relational. Relational data is treated as image data, as it can be transformed into a picture (creating a graph) or a document (creating a report) or even a video (creating a view using data mining analysis).

The example subtypes are as follows:

  • Relational

  • XML

  • Object

  • Metadata


A simulation is where we take data, convert parts of it into well-defined objects, and then extend it over a well-defined period of time. A simulation can be converted into video. A simulation, which is given a set of tightly enforced rules, can be extended into a self-evolving artificial neural network with the resulting output being an enhanced pattern-matching algorithm. Such an algorithm can be subsequently used for transforming digital object subtypes.


A genealogy is a record of the descent of a person, family, or group from an ancestor or ancestors(17). It involves taking data, documents, photos, video, and audio and extending them over time.

The subtypes include the following:

  • Heraldry: It is the study and classification of armorial bearings and the tracing of genealogies(18)

  • Private record: It is a privately defined record hierarchy position(19)

Virtual digital object

It is possible for a digital object to be categorized into multiple object types. This is because the line on what actually constitutes a digital object can change depending on how it is delivered. For example, an MP3 file is classified as an audio type. If it is delivered using the Real Player server and streamed to a client, it is treated as a video type. Another example is an animated GIF, which is a time-related set of images enclosed within a repeating cycle. An animated GIF is by definition a video, yet for ease of delivery, it is delivered as a specialized GIF (that is, a type of static picture).

This means it is important to separate the storage of the digital object from the delivery mechanism. The delivery mechanism might involve a virtual change of the digital object. The digital object exists in two (or possibly more) states and it isn't until the object is delivered that its true state is determined. When this happens it is called a virtual digital object.
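One way to keep storage and delivery separate is to record the stored type once and resolve the delivered type only when a delivery mechanism is chosen. The sketch below assumes a simple lookup table; the mechanism names (`streaming_server`, `static_image`) and the function name are hypothetical.

```python
# Hypothetical lookup: (stored type, delivery mechanism) -> type as delivered.
DELIVERY_OVERRIDES = {
    ("audio", "streaming_server"): "video",  # an MP3 streamed to a client
    ("video", "static_image"): "picture",    # an animated GIF served as a GIF
}


def delivered_type(stored_type: str, mechanism: str) -> str:
    """Return the type a digital object takes on once it is delivered."""
    return DELIVERY_OVERRIDES.get((stored_type, mechanism), stored_type)


print(delivered_type("audio", "streaming_server"))  # video
print(delivered_type("audio", "download"))          # audio
```

The stored type never changes; only the answer returned at delivery time does.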

In a perfect world there is no difference between a digital object and its delivery mechanism. But because of Internet standards that limit what can be seen (for example, browsers by default only view JPEG and GIF images and not TIFF ones) and due to limitations in network bandwidth and cost of delivery, virtual digital objects have had to be created to address these issues. These issues are subject to current environmental constraints and will change over time. HTML5 is attempting to define a set of supported video standards. This is covered further in Chapter 3, The Multimedia Warehouse.

Digital object delivery

One goal in this book is to describe how to deliver a digital object. This is covered in great detail in Chapter 6, Delivery Techniques. At the moment we have classified what a digital object is, but have not defined what it is to actually deliver it.

We expand on the original definition and add the following:

Only when that digital object has been successfully consumed by one of our senses can it be considered to be delivered.

This means a photograph viewed on the computer screen has been delivered. A DVD streamed to a computer terminal has been delivered and a document viewed and read has been delivered.

It is not important that money has been transacted when delivering a digital image. Buying a digital image is an optional part of the delivery process.

But what about the scenario where an audio file is cut to CD and then shipped to a customer? What if that customer does not listen to it? By the preceding definition it has not been consumed; therefore, it has not been delivered. Common sense indicates that the image has been delivered. From a traditional consumer viewpoint it is on actual receipt of the digital image that the image can be considered to be delivered. This view is now starting to conflict with new e-commerce concepts appearing on the Internet. That is, consumers are now being charged for use of a digital image only when it has been consumed and not when they have received it.
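The difference between the two viewpoints can be made concrete by tracking receipt and consumption separately. This is a minimal sketch; the `DeliveryRecord` class and the rule names are assumptions used only to illustrate the distinction.

```python
from dataclasses import dataclass


@dataclass
class DeliveryRecord:
    received: bool = False  # the customer holds the object (for example, the CD arrived)
    consumed: bool = False  # one of the customer's senses has actually used it

    def delivered(self, rule: str = "consumption") -> bool:
        """'consumption' follows the definition above; 'receipt' is the traditional view."""
        return self.consumed if rule == "consumption" else self.received


unplayed_cd = DeliveryRecord(received=True)
print(unplayed_cd.delivered())           # False
print(unplayed_cd.delivered("receipt"))  # True
```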

So when defining consumption of an image and ensuring that definition is future proof, we have to be careful that our traditional viewpoint of commerce does not interfere with that definition. With e-commerce the rules are changing, consumer habits are changing, and new ideas for image delivery are being tried.

At this point we will leave the definition as it is. In Chapter 6, Delivery Techniques, we will explore this concept in greater detail. Here we will be looking at who the consumer is and who the producer is. With e-commerce our traditional perspectives need to be challenged.

Manipulating digital objects

This section is an introduction to the methods available for managing and manipulating a digital object. When working with digital objects it soon becomes apparent that techniques have to be utilized to view and understand what they actually are. The digital object itself can contain other digital objects and only by processing it can these other objects be discovered.


Conversion is when we change an object subtype into another object subtype. Major conversions occur when we convert between types, for example, when we go from a picture to a document. Minor image conversions occur when we convert an image between subtypes; a common example is converting a JPEG image to a GIF image.

In converting a digital image the process might be irreversible, meaning once converted it cannot be converted back again. For example, in converting a video to a photograph, we cannot convert that photograph back to the same video.

The process might also lose information in the conversion. In converting a JPEG image to a GIF image and back to a JPEG image, color information is lost. Though the image might look like the original image, it is not the original image. This is a lossy conversion and is covered in greater detail in Chapter 2, Understanding Digital Objects. At the end of this chapter there is a chart detailing how it's possible to convert between all the major types.
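The color loss can be illustrated with a simplified stand-in for the conversion. GIF images are limited to a palette of 256 colors, so the sketch below reduces each 24-bit pixel to a fixed 3-3-2 bit color. Real GIF encoders normally pick an adaptive palette, and JPEG encoding is itself lossy, so this fixed palette is an assumption made to keep the example short.

```python
def to_palette_256(pixel):
    """Reduce a 24-bit RGB pixel to a 3-3-2 bit color (256 possible values)."""
    red, green, blue = pixel
    return (red >> 5 << 5, green >> 5 << 5, blue >> 6 << 6)


original = [(200, 123, 57), (12, 250, 99), (192, 96, 0)]
converted = [to_palette_256(pixel) for pixel in original]

# Going back to 24-bit color cannot restore the low bits that were discarded.
print(converted)              # [(192, 96, 0), (0, 224, 64), (192, 96, 0)]
print(converted == original)  # False
```

Two different original pixels end up as the same color, which is why the round trip cannot recover the original image.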


Transformation occurs when a digital object is modified. For example, we can rotate, watermark, or crop a photo. We can convert the bit rate of an audio file, change a Word document into an Adobe document, or add special effects to a video. Transforming does not change the object subtype.


A digital object can be composed of multiple digital objects. The extraction process involves unpacking those digital objects. For example, a DICOM image can be composed of multiple photographs and documents that are in turn digital objects themselves.


We live in a world where storage is limited. Storage includes not only the volume of space a digital object uses, but also the bandwidth required to deliver that object. As such, compression becomes important with digital objects, and for all digital objects we deal with either lossless or lossy compression.

With lossless compression the digital object is compressed (reduced in size) and when uncompressed the original digital object is reconstructed without the loss of any original information.

With lossy compression, the reconstructed object is not the same as the original object. This is covered in greater detail in Chapter 2, Understanding Digital Objects.
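The two kinds of compression can be contrasted with a short sketch using Python's standard `zlib` module. The lossy step here, discarding the low four bits of every byte, is a deliberately crude assumption; real lossy codecs such as JPEG or MP3 use far more sophisticated models.

```python
import zlib

samples = bytes(range(256)) * 4  # stand-in for uncompressed object data

# Lossless: decompressing returns every original byte.
packed = zlib.compress(samples)
print(zlib.decompress(packed) == samples)  # True

# Lossy (simplified): throw away the low 4 bits of each byte, then compress.
reduced = bytes(value & 0xF0 for value in samples)
lossy_packed = zlib.compress(reduced)
restored = zlib.decompress(lossy_packed)
print(restored == samples)  # False

# Compare the sizes to see what the discarded detail bought us.
print(len(samples), len(packed), len(lossy_packed))
```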

Image comparison

After the compression or conversion of an object, we may lose some information in the process. It therefore becomes important to be able to define whether that modified object is still the original object.

The technical definition is that two digital objects are classified as absolutely identical when, compared in an uncompressed format, each byte exactly matches the byte in the same corresponding position.

With a digital object, this definition does not match real world expectations. For example, we can convert a WAV file to MP3 and then back to WAV. The technical definition says the two are different, but to the human ear listening to the original and the converted WAV file, there may be no perceptible difference.

In another example, it is possible to embed hidden watermarks in a JPEG image. To a person viewing the original image and the modified image, they will not be able to tell them apart. They will say they are the same digital image.

To address this we can then add a new definition: two digital objects are classified as observably identical when they are perceived to be identical.

Now that we have defined digital object comparison, we can apply this to our compression definition as follows.

On compressing a digital object, if the object reconstructed from the compressed form matches the original object, the compression is said to be lossless. If they do not match, that compression is lossy.

This in turn raises a new issue. If the resultant lossy image when viewed is not perceived to be identical to the original image, that image is termed as being badly compressed.

Badly compressed

The skill comes in balancing compression to reduce the size of the original digital object without it becoming noticeably badly compressed.
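The comparison definitions can be sketched as three small functions. Perception cannot really be reduced to a formula, so the `tolerance` parameter below is a hypothetical stand-in for "perceived to be identical", and the function names are our own.

```python
def absolutely_identical(a: bytes, b: bytes) -> bool:
    """Every byte matches the byte in the same position (uncompressed data)."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def observably_identical(a: bytes, b: bytes, tolerance: int = 2) -> bool:
    """Hypothetical perception test: each sample differs by at most `tolerance`."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


def classify_compression(original: bytes, restored: bytes) -> str:
    """Label a round trip as lossless, lossy, or badly compressed."""
    if absolutely_identical(original, restored):
        return "lossless"
    if observably_identical(original, restored):
        return "lossy"
    return "badly compressed"


print(classify_compression(b"\x10\x20", b"\x10\x20"))  # lossless
print(classify_compression(b"\x10\x20", b"\x11\x20"))  # lossy
print(classify_compression(b"\x10\x20", b"\x40\x20"))  # badly compressed
```

In practice the tolerance would depend on the object type and on who is doing the comparing, which is exactly the difficulty discussed next.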

It should be understood that stating an object is identical just because it is perceived to be identical is highly dependent on the individual doing the comparison. This moves into a very gray area that leads on to image searching, which is discussed in more detail in Chapter 3, The Multimedia Warehouse, and Chapter 4, Searching the Multimedia Warehouse. It is an imprecise area that is not suited to traditional binary logic but is well suited to neural networks, pattern matching, and fuzzy logic.


A thumbnail is a digital object that has been transformed and/or converted into a format which uses less storage. The goal in creating thumbnails is to improve the performance of object delivery. It is not fair to classify a thumbnail as an index, for the simple reason that it is not transparent. In a relational database the data is perceived by the user as tables (and nothing but tables)(20). An index is an object designed to improve performance. It cannot be seen as a table, so the corollary is that it must be transparent. A thumbnail is seen and yet is designed to improve performance like an index, so it breaks the original relational rule. A thumbnail fits in the structure referred to as a pyramid index.

From an Oracle perspective the closest equivalent is to treat thumbnails as a form of materialized view. Multiple thumbnails of varying size can be created from an original image. Two types of thumbnails are the web quality thumbnail and the standard thumbnail. The standard thumbnail is the smallest size produced, whereas the web quality thumbnail is the largest size produced.

In the case of a Georaster Image (which is a very large digital photograph typically seen as a satellite image), hundreds of thumbnails can be created of varying sizes based on the original.

Thumbnails are optional and do not have to be produced.
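As a rough sketch of how thumbnail dimensions might be derived from an original, the function below scales an image to fit inside a maximum side length while keeping its aspect ratio. The pixel limits chosen for the standard and web quality thumbnails are assumptions; the text only states that the standard thumbnail is the smallest and the web quality thumbnail the largest.

```python
def thumbnail_size(width, height, max_side):
    """Scale (width, height) to fit within max_side, keeping the aspect ratio."""
    scale = min(1.0, max_side / max(width, height))
    return (max(1, round(width * scale)), max(1, round(height * scale)))


# Hypothetical size limits in pixels.
THUMBNAIL_LIMITS = {"standard": 128, "web_quality": 1024}

original = (4000, 3000)
thumbnails = {
    name: thumbnail_size(original[0], original[1], limit)
    for name, limit in THUMBNAIL_LIMITS.items()
}
print(thumbnails)  # {'standard': (128, 96), 'web_quality': (1024, 768)}
```

An image already smaller than the limit is left at its original size rather than enlarged.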

As will be shown throughout this book, a lot of traditional relational concepts are broken when applied to the world of unstructured data. Thumbnails and indexing are just one good example. This can be unsettling for those who have been trained and skilled in the relational database world. Unstructured data is seen as either a threat or an anomaly that is best treated by placing it into a BLOB field or insisting that it be stored externally and not in the database. The psychology behind this resistance to adopting and using unstructured data cannot be easily dismissed and must be factored in by the data designer, database administrator, and developer. The introduction to the market of multimedia-centric devices, such as the iPad or Android devices, is beginning to break down the notion of keeping all unstructured data outside the database, as users become better educated and more fluent in the usage of multimedia and insist on greater use of and access to it in their applications.


Transposition is the act of combining multiple object subtypes into a new object subtype while still keeping each subtype separate and distinct.

The traditional example of this is mapping spatial data over an image. The data is separate and can be searched. For example, we can search for a grid reference point on a map. Another example is seen when attaching metadata to an object. We can add EXIF data to a camera picture. The metadata is a specialized case and will be looked at in more detail in Chapter 3, The Multimedia Warehouse.


A photo of a book is not a transposition, it is a photo. This is because the data within the book is not separate but part of the photo. To do a search on the book title involves running the photo through a transformation process to extract the book title (OCR) before searching on it.


Searching for a digital object is a complex topic and is covered in greater detail in Chapter 3, The Multimedia Warehouse, and Chapter 4, Searching the Multimedia Warehouse.

Due to the complexity in searching a digital object, the current method is to search within a transposed object, with data being transposed over the digital object. Searching against data is simpler than trying to search within the image. Searching using this technique is called Data Transpose Searching. For example, the standard search method for looking for images involves searching against metadata attached to the image.
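A minimal sketch of Data Transpose Searching follows: the search never looks inside the image itself, only at the metadata held alongside it. The catalog layout, field names, and file names are hypothetical.

```python
# Each entry pairs a stored image with metadata transposed over it.
catalog = [
    {"file": "img_001.jpg",
     "metadata": {"caption": "Sunset over the harbour", "keywords": ["sea", "boats"]}},
    {"file": "img_002.jpg",
     "metadata": {"caption": "Oak tree in winter", "keywords": ["tree", "snow"]}},
]


def data_transpose_search(entries, term):
    """Return the files whose caption or keywords contain the search term."""
    term = term.lower()
    hits = []
    for entry in entries:
        meta = entry["metadata"]
        text = " ".join([meta.get("caption", "")] + meta.get("keywords", [])).lower()
        if term in text:
            hits.append(entry["file"])
    return hits


print(data_transpose_search(catalog, "sunset"))  # ['img_001.jpg']
```

The quality of the results depends entirely on the quality of the attached metadata, since the image bytes are never examined.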

One key goal when searching is to search on the actual digital object. For example, find all photos with a tree in them, or find the audio file that contains a lyric, or find the video that has Elvis Presley singing the song "Always on My Mind" in it. Currently, computer technology has not progressed to a stage where this is easily possible. A search using this method is called Actual Searching.

Another form of searching involves expanding on the concept of badly compressed objects and finding related or similar digital objects. We might want to find all pictures that have a sunset in them and use an existing photo as a base for the search engine. This type of searching is called Similarity Searching, and the technology is now available to search on a variety of digital objects.

Similarity searching has the potential to be used in a number of fields, especially in fraud and copyright protection. For example, software is now available for universities where they can find all essays that are similar to ones submitted by students. By adjusting the similarity parameters a teacher can then compare two essays and determine with a high degree of certainty whether one is a copy of the other and has been slightly modified.
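The essay example can be sketched with Python's standard `difflib` module, which scores how much two word sequences have in common. This is not how commercial plagiarism detection products work internally; it is a small illustration of an adjustable similarity threshold, and the function names and the 0.8 default are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio from 0.0 to 1.0; 1.0 means the word sequences are identical."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def find_similar(submission, corpus, threshold=0.8):
    """Return the titles of corpus texts at least `threshold` similar to the submission."""
    return [title for title, text in corpus.items()
            if similarity(submission, text) >= threshold]


essay = "the quick brown fox jumps over the lazy dog"
corpus = {
    "copy": "the quick brown fox leaps over the lazy dog",
    "other": "oracle stores images in tables",
}
print(find_similar(essay, corpus))  # ['copy']
```

Lowering the threshold finds looser matches at the cost of more false positives, which is the adjustment a teacher would be making.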

Product group

A set of images linked together is referred to as a product group. This is not to be confused with the composite type discussed later in the chapter. In an intelligence warehouse, a product group might be a set of digital images of a crime scene. In an electronic commerce system it might be a set of songs, videos, and digital booklets relating to an album.


It is an origin or destination point for a digital object. This can be related to where it was taken or where the digital object is being delivered to.


Defining multimedia in the Oracle database

It's important to exactly define what multimedia is. The common thinking is that multimedia is just a photo taken using a digital camera (or scanned in). Multimedia is much more than this.

To try and define what multimedia is, it's best to look at examples and see how they work with the Oracle database.


The photograph is also referred to as a picture, but the proper usage is a digital image.

It can be taken by a digital camera or can be scanned in. A photograph can have metadata embedded in it (common formats include EXIF, IPTC, Adobe XMP, and DICOM). A photo can be of type JPEG, TIFF, or PNG. There are well over 300 other types. Some camera manufacturers use a raw option when storing their digital images (two of the most common formats being DNG and NEF).

The photo is stored in the Oracle Multimedia ORDSYS.ORDIMAGE data type. More complex photos can be stored in the ORDDICOM data type along with other multimedia types.

A photo can also be of type Georaster, in which case it's best stored using the Oracle Spatial Georaster data type.

A photo can be defined as a two-dimensional object composed of binary data. The photo is typically stored in compressed format using compression software built for that image type.


A video is a time-based set of two-dimensional photographs with optional audio. A video can contain metadata. It can also optionally contain an audio track (audio type) and a caption (document type). The video can be compressed and photographs can be extracted from it. Common examples include MPEG, DivX, AVI, and QuickTime.

In Oracle Multimedia, a video is stored using the ORDSYS.ORDVIDEO data type.


An audio image is a time-based collection of analog-based sounds. An audio image can be compressed. It can also optionally contain a caption (document type).

In Oracle Multimedia, audio is stored using the ORDSYS.ORDAUDIO data type.


A document is a set of two-dimensional pictures that conform to a well-defined set. A document can also contain within it all image types. As a result, a document is stored as binary, not character, data. Microsoft Word and Adobe PDF are two well-known examples. There are over 3000 examples of documents found in the marketplace.

A document can be indexed using Oracle Text.

In Oracle Multimedia, a document is stored using the ORDSYS.ORDDOC data type.
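The data types named so far can be summarized as a lookup from object type to Oracle data type. The sketch below is Python rather than SQL and lives outside the database; the data type names come from the text above, while the file extension table and the function name are hypothetical and deliberately incomplete.

```python
import os

# Object type -> Oracle Multimedia data type, as named in the text above.
ORACLE_DATA_TYPE = {
    "photo": "ORDSYS.ORDIMAGE",
    "video": "ORDSYS.ORDVIDEO",
    "audio": "ORDSYS.ORDAUDIO",
    "document": "ORDSYS.ORDDOC",
}

# Hypothetical extension lookup, for illustration only.
EXTENSION_TO_OBJECT_TYPE = {
    ".jpg": "photo", ".tif": "photo", ".png": "photo",
    ".mpg": "video", ".avi": "video",
    ".wav": "audio", ".mp3": "audio",
    ".doc": "document", ".pdf": "document",
}


def oracle_type_for(filename):
    """Return the Oracle data type for a file name, or None if the extension is unknown."""
    extension = os.path.splitext(filename)[1].lower()
    object_type = EXTENSION_TO_OBJECT_TYPE.get(extension)
    return ORACLE_DATA_TYPE.get(object_type)


print(oracle_type_for("Holiday.JPG"))  # ORDSYS.ORDIMAGE
print(oracle_type_for("notes.xyz"))    # None
```

A file extension is only a hint; a real loader would inspect the file contents before deciding where to store it.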


Text is a document that is not binary but composed of character data only. It can contain structured data (the two best known examples being relational and XML). The type of text determines where it is stored: in an XMLType column, an Oracle table, a CLOB, or a VARCHAR2 field. It can be indexed using Oracle Text.


An artifact is a three-dimensional representation of an object. Though still in its infancy, some cameras can create a 3D view of an object. It can also be referred to as a blueprint, equating to a three-dimensional drawing used by architects(16).

In Oracle Multimedia, an artifact is stored using the ORDSYS.ORDSOURCE data type.

Additional multimedia types

Multimedia is not limited to the types mentioned. It can include anything. In the next decade we will be seeing new types of multimedia containing very large amounts of data. Some of these will be based on life sciences and simulation. Individual multimedia files will, on average, be over 1 TB in size.

For those familiar with VMware, it is feasible (but not currently practical) to store whole VMware instances as a multimedia type. These can be of any size. As more sites move down the virtualization path, the ability to create many installs will be simplified and organizations will be creating more of them. In the next decade we will see computers with a large number of cores and very large amounts of memory being able to host one VMware instance per user. This leads down the path of each user having their own client computer, which is centrally stored and managed. Once more we see the rules for tuning and management change. Smart use of virtualization will ensure that CPU use is fairly distributed. But as the number of these virtual machines grows, they will need to be managed and archived. And the most logical place to put them is in a database.

Other smaller sized types can include e-mail messages, flash files, and executables.

Composite types

A composite type is a set of one or more multimedia types stored in one type. A ZIP, RAR, or TAR file can contain a mixture of multimedia, which can then be extracted. A DICOM file can contain multiple multimedia types. Certain photographic file formats can contain multiple images within them. A GIF can be animated, a TIFF can contain multiple images, and a JPEG can contain other JPEGs within it. How the multimedia is going to be used best determines how it is stored.

A composite type is different to a product group or a container.

A composite type introduces the concept of multiple originals. Our traditional notion of an image needs to extend to deal with a multimedia type that is related to other multimedia. A good example is a DICOM image. This is typically an image that contains information about a medical patient. It can contain patient history, x-rays, ultrasound, and scans. Each is a different image type, but together they all represent the one patient. If we view the patient as an object, then the object is a digital image composed of multiple originals, with each one being another multimedia type. Another example is a museum painting. The one painting can have multiple photographs taken of it. It might have an associated video showing how it was painted and an audio commentary on it by the artist. Each is a separate image but together they create one image with multiple originals in it. Another example is when a photographer takes a mosaic picture. This is a set of photographs of a scene that can be stitched together to create a new picture (like a jigsaw puzzle). Each image is still treated separately.

For a composite type, one digital image is chosen as the representative image and used as the thumbnail. Depending on the context in which the image is used or accessed, this representative image can dynamically change.

Composite types are best handled using the Oracle database's object/relational capabilities.
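The shape of a composite type with multiple originals and a context-dependent representative image can be sketched as follows. This Python model is only an illustration of the idea and does not show Oracle object types; the class names, field names, and context labels are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Original:
    name: str
    media_type: str  # for example "picture", "video", or "audio"


@dataclass
class CompositeObject:
    originals: List[Original]
    # Optional mapping of context -> name of the original to show as the thumbnail.
    representative_by_context: Dict[str, str] = field(default_factory=dict)

    def representative(self, context: str = "default") -> Original:
        """Return the representative original for a context, else the first original."""
        wanted = self.representative_by_context.get(context)
        for original in self.originals:
            if original.name == wanted:
                return original
        return self.originals[0]


painting = CompositeObject(
    originals=[
        Original("front.jpg", "picture"),
        Original("making_of.mp4", "video"),
        Original("commentary.mp3", "audio"),
    ],
    representative_by_context={"video_gallery": "making_of.mp4"},
)
print(painting.representative().name)                 # front.jpg
print(painting.representative("video_gallery").name)  # making_of.mp4
```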


A container is a file type that can have multiple encoding algorithms used within it. A video file of type AVI is a container because different compression formats can be used within it. This is covered in greater detail in Chapter 2, Understanding Digital Objects.

ZIP files

A ZIP file is a specialized composite type whose goal is compression. The term ZIP is now used generically, even though other products can do the same task, including WinRAR, Unix tar, and Unix gzip. With ZIP the idea is to create one or more large files containing all the other files and to compress them within it. This is useful for backups or for delivering or transferring large numbers of images between computer systems.

Within Oracle there are a number of methods for dealing with a ZIP file. The context or how it's designed to be used determines what the ZIP file actually is. The following highlights three different uses of a ZIP file:

  • Delivery: It extracts all the multimedia within it. It treats each file extracted as a separate file and discards the original ZIP file. This is useful for loading up a set of images via the web browser to the database.

  • Index: It extracts one image for display and indexing purposes and stores the original ZIP. It is useful when a large number of images need to be delivered. The original ZIP is delivered to the customer.

  • Composite: It extracts all the files but treats the set of images as a composite type and discards the original ZIP. The one digital image is composed of multiple originals.
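The three uses above differ only in what ends up stored. The following is a minimal sketch of that decision in plain Java, not an Oracle API: the class `ZipUsagePlanner`, its `StoredObject` type, and the choice of the first extracted file as the representative image are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration; none of these names come from Oracle or the book. */
final class ZipUsagePlanner {

    enum Usage { DELIVERY, INDEX, COMPOSITE }

    /** One object that would be stored in the database. */
    static final class StoredObject {
        final String name;            // label for the stored object
        final List<String> originals; // files kept as originals of this object
        final String representative;  // file used for display and indexing

        StoredObject(String name, List<String> originals, String representative) {
            this.name = name;
            this.originals = originals;
            this.representative = representative;
        }
    }

    /** Decides what is stored for a ZIP, given the names of the files extracted from it. */
    static List<StoredObject> plan(Usage usage, String zipName, List<String> entries) {
        if (entries.isEmpty()) {
            throw new IllegalArgumentException("ZIP has no files");
        }
        List<StoredObject> stored = new ArrayList<>();
        String first = entries.get(0); // assumption: the first file is the representative
        switch (usage) {
            case DELIVERY:
                // Every extracted file becomes its own object; the ZIP is discarded.
                for (String entry : entries) {
                    stored.add(new StoredObject(entry, Collections.singletonList(entry), entry));
                }
                break;
            case INDEX:
                // The ZIP itself is the stored original; one file is extracted for display.
                stored.add(new StoredObject(zipName, Collections.singletonList(zipName), first));
                break;
            case COMPOSITE:
                // One object with multiple originals; the ZIP is discarded.
                stored.add(new StoredObject(zipName, new ArrayList<>(entries), first));
                break;
        }
        return stored;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("front.jpg", "back.jpg", "detail.tif");
        for (Usage usage : Usage.values()) {
            List<StoredObject> stored = plan(usage, "vase.zip", entries);
            System.out.println(usage + ": " + stored.size() + " object(s) stored");
        }
    }
}
```

In a real system the representative image would more likely be chosen by the user or by context than by position in the archive.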


The Oracle PL/SQL package UTL_COMPRESS will not prove useful for handling zipped images, because it assumes the ZIP contains exactly one file. Unzipping multiple files requires writing a Java program (which runs in the database and can unzip multiple files, even if they are in subdirectories). Another option is to use Java to shell out to the operating system: dump the ZIP file to a temporary location, invoke the operating system unzip (now supported in Windows as well as Unix), and then load the extracted files.
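As a rough illustration of the Java approach, the sketch below reads every file out of a multi-file ZIP held in memory, using only the standard `java.util.zip` classes. The class name `ZipExtractor` and the `sampleZip` helper are hypothetical, and the byte array stands in for the contents of a BLOB. Loading the class into the database and reading or writing actual LOBs is not shown, and the Java version available inside the database depends on the Oracle release.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: extracts all files from a ZIP held in memory. */
final class ZipExtractor {

    /** Returns entry name -> uncompressed bytes, including files in subdirectories. */
    static Map<String, byte[]> extractAll(byte[] zipBytes) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory markers carry no data
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry, not the archive
                while ((n = zin.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                files.put(entry.getName(), out.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Builds a small two-file ZIP in memory so the example is self-contained. */
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry("notes.txt"));
            zout.write("hello".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
            zout.putNextEntry(new ZipEntry("photos/2013/a.jpg"));
            zout.write(new byte[] {1, 2, 3}); // placeholder bytes, not a real JPEG
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, byte[]> file : extractAll(sampleZip()).entrySet()) {
            System.out.println(file.getKey() + " (" + file.getValue().length + " bytes)");
        }
    }
}
```

Because the entries are kept in memory rather than written to disk, entry names containing path segments such as `../` are harmless here; code that writes extracted files to a filesystem would need to validate them.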


Metadata is text data associated with a digital object for the purpose of searching and providing structured or semi-structured information about the digital object. Metadata is covered in greater detail in Chapter 3, The Multimedia Warehouse. In Oracle, metadata can be stored in tables.

The NULL case

For a multimedia type, the NULL equivalent should be discussed. It's possible for the metadata associated with a digital image to exist while the actual multimedia component does not yet exist. For example, a museum has information about an object that needs to be photographed and stored in the database. The object could be a painting, a vase, a person, or any general collection object. They first store the metadata about the object in the database. At a later time the object is scanned, photographed, or a video is taken of it. The digital image is then associated with the initial metadata.

Because the metadata for the object exists but its associated multimedia does not, this is considered the NULL case for the multimedia type. The definition of NULL is related to the potential of what the image could become and avoids confusing a NULL image with one that is empty or blank.
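The distinction can be made concrete with a small model. This is a sketch in plain Java under stated assumptions: the class `CollectionObject` and its `MediaState` values are hypothetical names, and "empty" is taken to mean media that has been attached but contains zero bytes. A blank image (for example, an all-white photograph) still has content and so counts as present.

```java
/** Hypothetical model of a catalogued object whose media may not exist yet. */
final class CollectionObject {

    enum MediaState { NULL_CASE, EMPTY, PRESENT }

    private final String description; // metadata, recorded first
    private byte[] media;             // stays null until the object is photographed or scanned

    CollectionObject(String description) {
        this.description = description;
    }

    /** Associates the digital image with the metadata recorded earlier. */
    void attachMedia(byte[] bytes) {
        if (bytes == null) {
            throw new IllegalArgumentException("the NULL case is expressed by not attaching media");
        }
        this.media = bytes.clone();
    }

    /** NULL_CASE: no media yet; EMPTY: media attached but zero bytes; PRESENT: media with content. */
    MediaState mediaState() {
        if (media == null) {
            return MediaState.NULL_CASE;
        }
        return media.length == 0 ? MediaState.EMPTY : MediaState.PRESENT;
    }

    String description() {
        return description;
    }

    public static void main(String[] args) {
        CollectionObject vase = new CollectionObject("Vase awaiting photography");
        System.out.println(vase.description() + ": " + vase.mediaState());
        vase.attachMedia(new byte[] {1, 2, 3}); // placeholder bytes standing in for a photo
        System.out.println(vase.description() + ": " + vase.mediaState());
    }
}
```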


Why store unstructured data in a database?

As the business imperative grows for companies to start managing and then publishing their digital image assets, the question of where those image assets are stored is raised. More recently, multiple analysts have estimated that data will grow 800 percent over the next five years. Unstructured information accounts for more than 70–80 percent of all data in organizations and is growing 10–50 times faster than structured data (21).

This question then extends to any type of data. The initial choice is to store it on a disk filesystem, which can be seen as the quickest and simplest approach. Another, better, alternative is to store the images in the database.

A database is not normally considered to be an ideal repository for multimedia or any form of unstructured data. Historically, databases have been known to have performance issues with large-volume data retrieval. There has also been a noticeable lack of support from third-party tools, leaving any data in the database well and truly locked in. With the Oracle database this was seen in the older Oracle7 release and its use of LONG fields.

Only recently has storing multimedia in a database become realistic. Though the capability has been around for some time, the increase in disk capacity and the introduction of low-cost, high-volume SANs have brought a greater push towards moving multimedia from the filesystem into the database. In the past five years, with changes in database technology and improvements in disk performance and storage, the rules have changed and it now makes business sense to use the Oracle database to store and manage all of an organization's digital assets.

Most companies are also now recognizing that large amounts of corporate knowledge and assets are stored within their filesystems. Accessing them is difficult and most do not follow standards for managing and dealing with them. As such, most are now looking to acquire or build some sort of digital asset management system.

Currently the type of unstructured data is limited to those classified as digital assets, which includes multimedia and some other forms of data. The notion of storing a whole operating system inside the database is yet to be reached due to the logistics of ad hoc retrieval versus any perceived performance issues. Given time, the question will be raised: should the database be the operating system? Having the database as the operating system changes the mentality for its use. It already has security, auditing, extensible programming languages, schemas, backup and recovery, diagnostic management, and a built-in web server. Though such a scenario does not exist, it's plausible and would definitely appeal to a niche market that only uses the database on its server. It would not replace a Windows or Mac PC, but it might replace a Unix or Windows server.

The following are the strengths an Oracle database can offer over traditional filesystem storage.


Images stored in the database can be directly linked with metadata. In the one transaction an image can be manipulated, a thumbnail of that image created, and all associated metadata modified. Related information is kept in sync. If an image is stored in a file system, it is possible for external processes to delete or modify that image, causing the image itself to either become orphaned or lose synchronicity with its corresponding relational data. Another common issue is web quality images losing their associated thumbnails, meaning web page displays become broken.

Oracle Multimedia, which extends control over images, allows images to be manipulated inside the database. They can be resized, copied, converted, and rotated. This simplifies their management and allows for the one programming environment (PL/SQL or Java).

Moving multimedia is simplified as only one object is being moved. When any multimedia is deleted, all associated thumbnails and metadata are deleted with it. Management becomes simpler and less prone to error, especially on recovery and when doing general database maintenance.


If all images are stored in a directory, fine-grained control is not possible. That is, it is not possible to restrict access to the images to individual users. Once users gain access to one image in the directory they can access all of them (this is based on the assumption of using Digest Authentication).

By storing an image in the database, fine-grained security becomes possible. Access to an image can be restricted to individual users and it also becomes possible to achieve the following:

  • Attach a timeout to access the image

  • Include check in/check out capabilities

  • Audit who accessed the image and when

  • Offer image exclusivity (one user accesses an image for a set period of time during which no one else can access it)

Using Oracle security it becomes possible to attach roles to images and introduce fine grained access on them. Security can be configured so that a user can access a thumbnail but not the original. Full auditing of who accessed each image and how they accessed it can be tracked. Auditing can also be included to keep track of network capacity used per user, making it possible to track and then charge for network usage.
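To make these capabilities concrete, here is a minimal in-memory sketch of timed access, check in/check out exclusivity, and an audit trail. It is plain Java with hypothetical names (`ImageAccessRegistry` and its methods) and does not use or represent Oracle's security features. It also simplifies exclusivity: a check-out lasts until check-in rather than for a set period.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: per-user, per-image access with timeout, check-out, and auditing. */
final class ImageAccessRegistry {

    private final Map<String, Instant> grantExpiry = new HashMap<>(); // "user|imageId" -> expiry
    private final Map<String, String> checkedOutBy = new HashMap<>(); // imageId -> user
    private final List<String> auditTrail = new ArrayList<>();

    /** Grants a user access to one image for a limited time. */
    void grant(String user, String imageId, Instant now, Duration lifetime) {
        grantExpiry.put(key(user, imageId), now.plus(lifetime));
    }

    /** Checks an image out; fails if another user already holds it. */
    boolean checkOut(String user, String imageId) {
        String holder = checkedOutBy.get(imageId);
        if (holder != null && !holder.equals(user)) {
            return false;
        }
        checkedOutBy.put(imageId, user);
        return true;
    }

    /** Releases the image, but only if this user is the one holding it. */
    void checkIn(String user, String imageId) {
        checkedOutBy.remove(imageId, user);
    }

    /** Decides whether access is allowed and records the attempt either way. */
    boolean access(String user, String imageId, Instant now) {
        Instant expiry = grantExpiry.get(key(user, imageId));
        String holder = checkedOutBy.get(imageId);
        boolean allowed = expiry != null && now.isBefore(expiry)
                && (holder == null || holder.equals(user));
        auditTrail.add(now + " " + user + " " + imageId + " " + (allowed ? "ALLOWED" : "DENIED"));
        return allowed;
    }

    /** Returns a copy of the audit trail: who tried to access which image, and when. */
    List<String> auditTrail() {
        return new ArrayList<>(auditTrail);
    }

    private static String key(String user, String imageId) {
        return user + "|" + imageId;
    }

    public static void main(String[] args) {
        ImageAccessRegistry registry = new ImageAccessRegistry();
        Instant start = Instant.parse("2013-01-01T00:00:00Z");
        registry.grant("alice", "img1", start, Duration.ofMinutes(10));
        registry.access("alice", "img1", start.plusSeconds(60));  // within the timeout
        registry.access("alice", "img1", start.plusSeconds(900)); // after the timeout
        registry.auditTrail().forEach(System.out::println);
    }
}
```

The current time is passed in explicitly so that the behaviour is easy to reason about and to test.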


The one backup program that is used to back up the database will also back up the images. This simplifies the backup process. In the event of failure the whole database can be recovered to the last committed transaction.

The traditional behavior for backing up a filesystem is to back it up daily or weekly. This means in the event of failure the filesystem on recovery will be out of sync with the database by at least one day.

So by having the images in the database only one backup program is required and in the event of failure only one recovery procedure is needed.

Another advantage that can be seen comes when using some of the more advanced database features such as standby databases and replication. Images are automatically replicated if the advanced replication option is used, and for disaster recovery situations, image data is automatically transferred to a standby database.

The Oracle database is designed to handle backing up and recovering very large volumes of data. Using RMAN, incremental backups ensure only the changed blocks are backed up. Backup and recovery can be done in parallel. The database supports full rollback. Recovery is done until the last committed transaction, meaning no data is lost. This is important as it ensures that when the database is recovered all multimedia and associated metadata match. If a filesystem was used to store the multimedia and a database to store the metadata, when failure occurs it becomes possible for the metadata to become out of sync with the files recovered on the filesystem.

Though the latest generation of SANs can do high-volume and high-speed backups of data, this alone does not guarantee complete consistency between the database and the files on the SAN. With changes in technology, a number of SANs now support real-time block-level replication and can ensure consistency, but this capability is vendor and database specific.


All the data is in one location. The digital image becomes an object and can be accessed using the one query. Client/server and web applications can access the one image and retrieve its associated metadata using the one SQL statement. Image management is also subject to the same transactional rules as relational data. Using PL/SQL and/or Java, the one query can access a variety of multimedia types. For example, a photo with its associated metadata and video can be retrieved in the one simple query. It's also possible to write a query that not only retrieves the metadata, but also performs a spatial query to analyze it.

The simplicity with which these highly complex tasks can be performed makes the database a powerful option.


An image stored in the database can be indexed. If an image is a document it can be thematically searched and gists (summaries) can be extracted from it.

An image can be converted from one format to another. Metadata can be extracted from it. It can be copied and resized, and its image quality can be controlled.


When it comes to managing and controlling the images in the database, the Oracle database offers the greatest flexibility. Sets of images can be deleted, updated, or copied as easily as writing a query.

Images can be linked together and metadata can be easily attached to them. All data related to an image or set of images can logically co-exist.

This adds flexibility, which gives a DBA and developer greater control over managing and working with the images.


The Oracle database has various built-in features which, when used, make it easier to manage and deliver multimedia:

  • The Oracle database can extract metadata from a number of multimedia types. The metadata is stored in XML format, making it easier to manipulate and control. For images, metadata can also be written back into the image. Using Oracle's built-in XML handling capabilities, accessing the XML data is easily done.

  • Transportable tablespaces: Multimedia can be migrated en masse by copying it to a transportable tablespace and then moving this tablespace to the new location. This is useful when firewalls are involved. It also allows for the large-scale copying of image databases.

  • Database links: Multimedia can be copied between databases directly using a database link.

  • Oracle supports streaming of videos directly from the database.

  • Photos can be processed within the database. They can be rotated, resized, watermarked, or converted from one type to another.

  • Embedded gateway: Web access, including multimedia loading and retrieval, is simply done using the built-in HTTP gateway.


Why not store the multimedia in the filesystem?

When managing multimedia, the question should now be: why should the files be stored in the filesystem and not the database? There might be business cases for storing multimedia in the filesystem, especially if there are older applications and tools that need to access the files but can only access a filesystem. As will be covered in Chapter 9, Understanding the Limitations of Oracle Products, there is a strong case for the use of the Oracle Database File System.

Only by using web services and integrating access into these tools to access the database can these restrictions be removed. It's possible to build programs that can be integrated into Windows File Explorer, Adobe Photoshop, Microsoft Word, and PowerPoint that can directly access the database and retrieve the files.

The following are some arguments why storing unstructured data in the filesystem might not be a good idea:

  • Security: Different operating systems have different types of filesystem security. Some are quite powerful, but most offer only basic, coarse-grained security which cannot easily integrate with database security. If using Apache and all your images are in one directory, how do you configure it so that a user can access only a set of files, while another user can only access others? It can be done with a lot of effort and using specialized plugins, but it doesn't easily integrate with the database security and it's very hard to monitor, audit, integrate, and control. Trying to implement a tight security policy this way makes it more likely that holes are opened in the security. Applying security to unstructured data stored in the database is so much easier.

  • Backup/recovery: Database backups are well known. The challenge is to try and ensure the filesystem backups are coordinated with the database.

  • Filesystem limitations: Some filesystems can store only about 65,536 files per directory (FAT32 is one example). For a multimedia warehouse it is reasonable to want to store millions of digital objects in one directory.

  • Performance: Filesystems are notoriously slow to access and manage once they hold large numbers of files. Put 10,000 digital objects in a Windows filesystem and try to use File Explorer to look at it. Try to mass rename or change the security on 20,000 digital objects. Try to do a search against a filesystem looking at all directories when the filesystem might contain a million or more digital objects. It's incredibly slow. In some cases it fails. Try highlighting 1,000 objects in File Explorer and moving them to another location. It's painfully slow and difficult to do. Digital objects stored in the database offer the ability to make changes to millions of objects in seconds. There is no real performance comparison. Searching for and manipulating objects in the database is much faster than trying to achieve the same in the operating system. Arguments might be made about load and retrieval times, but with the latest release of Oracle with SecureFiles this argument doesn't hold much weight anymore.


Why use Oracle Multimedia and not a BLOB?

Oracle Multimedia is tightly integrated into the database. Application development can be greatly simplified when the images and all associated metadata are stored together in the database. Oracle Multimedia uses BLOBs within its type definition, which can be accessed and used as required. In addition, it supports a variety of methods that simplify the act of loading and manipulating digital images. This is covered in greater detail in Chapter 7, Techniques for Creating a Multimedia Database.

Addressing the concerns

Even though the focus should always be to store the images in the database first, this approach has yet to be accepted in the marketplace. The attitude is still to store them in the filesystem. Most managers, when confronted with the idea of putting images in the database, invariably come up with the same set of fears that first appeared over ten years ago, when database vendors first tried to push the idea of storing images in the database and failed. What's different now is that the rules have changed, the technology has changed, and experience has shown that a lot of the previously raised issues are not valid anymore.


Isn't it slower to retrieve an image from the database compared to a filesystem?

This might have been true ten years ago on older disk systems, but with improvements in disk technology this issue has subsequently disappeared. Tests have shown that it is just as fast to retrieve an image from a database as it is from a disk filesystem.

In addition, by using optional caching technology, it is possible to cache frequently accessed images thus improving the time to retrieve them.

Fine grained control over where an image is stored and how it's accessed is possible. A thumbnail can be stored on a local disk and cached in the SGA to ensure the fastest possible speed for retrieval. The original can be stored on lower speed disks. The architecture of the Oracle database is one that inherently supports scalability. This makes it simpler to develop applications that load and deliver images.

Oracle's new SecureFiles LOBs offer loading and retrieval speeds nearly twice as fast as previous versions.

Database size

Doesn't it take more storage to put an image in the database compared to a filesystem?

Yes it does. The storage format used in the database adds extra overheads to manage locking and to reserve storage for growth. In addition, Oracle puts indexes on images to improve the time it takes to retrieve and manipulate them.

Though there is extra storage required, it is not significant in the overall storage requirements. When tens of thousands of images are stored in the database, the extra storage required is dwarfed by the overall size of the images. It is also fair to say that disk is cheaper than it once was, and when it comes to database management, the strategy now is to sacrifice storage for performance. When dealing with relational data it is now common practice to add additional indexes and use locally managed tablespaces to improve performance. These extra features come at an additional storage cost. In some cases with data warehouses just for storing relational data, the rule of thumb is to factor in eight times the raw storage to handle all the additional overheads.

So the extra storage used by storing images in the database serves to ensure that the image data is retrieved optimally and consistently.


Isn't it more complicated and time consuming having to put images into the database and retrieve them compared to a filesystem?

This was exactly the same argument used fifteen years ago when relational databases first appeared. But at that time the argument was concerned with storing data in what was known as flat files versus putting it into a relational database. Time has shown that putting data into a relational database, despite the overheads, offers more benefits and ultimately greater control than storing it in a flat file. The same argument can be applied to images. Yes, there is some programming overhead to put them in, but the advantages gained from having them in the database (as explained previously) outweigh it.

So when it comes to storing and managing those digital assets, keep in mind that ultimately it is easier, safer, and better to keep them stored in a database.



Unstructured data is more than just any data which isn't structured. It's a complete set of different types of data that can be categorized into different groups, with rules that can be used to define and manage them.

The introduction of digital objects enables a large set of unstructured data to be classified using the human senses as a base for that classification. By describing multimedia, digital images, video, audio, and documents can be categorized and methods detailed for the handling and manipulation of those digital objects.

The use of a database that can support objects makes it a lot easier to manage large volumes of digital objects. Though these objects can be stored in a filesystem, there are now a lot of advantages to having them stored inside the database.

Chapter 2, Understanding Digital Objects, will go into detail on each of the different multimedia types and how they work in the real world.



These questions are designed to have the reader go beyond the traditional method of answering questions. They involve using the concepts covered in the chapter and doing additional research on the Internet to come up with the best solution to address the questions raised.

  1. Name a human sense not included above that can be digitized.

    Can these senses be digitized?

    • temperature

    • balance

    • pain

  2. 3D printers can now be used to take CAD designs stored digitally and to print them out. How would a recipe for cooking be digitally stored?

  3. How would one store a database in itself?

    What concepts would be required to achieve this?

    How does this relate to the concept of read consistency?

  4. The table below shows how the different types of unstructured data can be converted between the different forms. Expand the table to include:

    Which type conversions are one-way (information is lost on the conversion, so it's impossible to reverse it).

    Name three other unstructured data types that can be added to the table.


Unstructured data conversion table

Each group below is one row of the table (the type being converted from). Each entry within it is one column (the type being converted to).

From image:

  • To document: OCR. Read textual data from an image. Can be a fax or data on an existing image.

  • To audio: Audio commentary is attached to image.

  • To video: Multiple photos are combined into a video or animation like a slideshow.

  • To 3D: 3D imaging such as VRML.

  • To data: Metadata is extracted about the photo (EXIF, IPTC).

  • To neural network: Network visualizes and understands the photo.

From document:

  • To image: Pages are scanned in or converted to an image.

  • To audio: Text to speech conversion.

  • To video: The closest equivalent is animation.

  • To 3D: Pages are transformed into a 3D virtual book.

  • To data: Metadata is extracted about the document (header).

  • To neural network: Network reads and understands the document contents.

From audio:

  • To image: Waveforms of audio turned into an image.

  • To document: Speech is converted to text.

  • To video: Audio is streamed as per video.

  • To 3D: 3D visualization of the audio is created.

  • To data: Metadata is extracted about the audio (ID3).

  • To neural network: Network listens and understands the audio file.

From video:

  • To image: A frame is extracted from the video.

  • To document: Metadata about the video is extracted.

  • To audio: Audio stream is extracted from the video stream.

  • To 3D: An animated 3D version of the video is created.

  • To data: Metadata is extracted about the video.

  • To neural network: Network views and understands the video.

From 3D:

  • To image: A scene, icon, or screen shot is extracted from the program.

  • To document: Instructions, manual, and code are extracted.

  • To audio: Audio is extracted from the visual structure.

  • To video: Animation is extracted from the visual structure.

  • To data: Metadata is extracted.

  • To neural network: Network looks at and can interpret the 3D view.

From data:

  • To image: Data is converted into graphical form (see Excel, Reports).

  • To document: Report is generated.

  • To audio: Data is converted to speech as per document.

  • To video: Data mining analysis converts to video.

  • To 3D: Instructions (similar to SVG) are converted to a visual.

  • To neural network: Network interprets and understands the data.

From neural network:

  • To image: A visual representation of the network is displayed.

  • To document: Report / documentation about the simulation is produced.

  • To audio: Commentary is given on the simulation.

  • To video: The simulation is animated.

  • To 3D: A VRML representation of the network.

  • To data: Metadata is extracted.

About the Author

    Marcelle Kratochvil is an accomplished Oracle database administrator and developer. She is CTO of Piction and has designed and developed industry-leading software for the management and selling of digital assets. She has also developed an award-winning shipping and freight management system, and designed and built a booking system, a sport management system, an e-commerce system, a social network engine, a reporting engine, and numerous search engines. She has been an Oracle beta tester since the original introduction of Oracle Multimedia. She is also a well-known presenter at Oracle conferences and has produced numerous technical podcasts. Born in Australia, she lives in Canberra. She is actively working as a database administrator supporting a large number of customer sites internationally. She is also campaigning with Oracle to promote storing all data and any data in a database. In her spare time she plays field hockey and does core research in artificial intelligence in database systems. Marcelle has a Bachelor of Science degree from the Australian National University, where she majored in computing and mathematics.
