Managing Audio Content in Plone 3.3

(For more resources on Plone, see here.)

There are at least four use cases when we think of integrating audio in a web application:

  1. We want to provide an audio database with static files for download.
  2. We have audio that we want to have streamed to the Internet (for example, as a podcast).
  3. We want an audio file/live show streamed to the Internet as an Internet radio service.
  4. We want some sound to be played when the site is loaded or shown.

In this article series, we will discuss three of the four cases. The streaming support is limited to use case 2: we can stream to one client like a podcast does, but not to many clients at once like an Internet radio does; special software such as Icecast or SHOUTcast is needed for that purpose. We will investigate how to solve use cases 1, 2, and 4 with the Plone CMS and extensions. Technically, these are the topics covered in this article series:

  • Manipulation of audio content stored as File content in Plone
  • The different formats used for the binary storage of audio data
  • Storing and accessing MP3 audio metadata with the ID3 tag format
  • Managing metadata, formats, and playlists with p4a.ploneaudio in Plone
  • Including a custom embedded audio player in Plone
  • Using the Flowplayer product to include an audio player standalone in rich text and as a portlet
  • Previewing the audio element of HTML5
  • Extracting metadata from a FLAC file using mutagen


Uploading audio files with an unmodified Plone installation

The out-of-the-box support of Plone for audio content is limited. What is possible is to upload an audio file to the ZODB utilizing the File content type of Plone. A File is nothing more and nothing less than a simple binary file. Plone does not make any difference between an MP3 file and a ZIP, an EXE, or an RPM binary file.

When adding File content to Plone, we need to upload a file (of course!). We don't necessarily need to specify a title, as the filename is used if the title is omitted. The filename is always used as the short name (ID) of the object. This limits the number of files with any specific name to one per container.

While uploading a file, Plone tries to recognize the MIME type and the size of the file. This is the smallest common subset of information shared by all the binary file types the File content type was intended for. Normally, detecting the MIME type for standard audio formats is not a problem if the file extension is set correctly.
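As a rough, hedged illustration of extension-based detection, Python's standard mimetypes module maps file extensions to MIME types in a similar spirit. Note that Plone uses its own content type registry, not this stdlib call:

```python
import mimetypes

# Extension-based MIME type detection -- an illustration only;
# Plone 3 uses its own machinery, not this stdlib module.
for name in ("song.mp3", "speech.wav", "album.zip"):
    print(name, mimetypes.guess_type(name)[0])
```

A file with a wrong or missing extension yields None here, which mirrors why correctly set extensions matter for detection.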

Clicking on the link in the default view either downloads the file or opens it with the favorite player of your operating system. This behavior depends on the settings of the target browser and corresponds to use case 1 of our audio use cases. It goes without saying that we can add the default metadata to files and organize them in folders.

Like Images, File objects do not have a workflow associated in a default Plone setup. They inherit the read and write permissions from the container they are placed into. Still, we can add an existing workflow to this content type or create a new one via the portal_workflow tool if we want.

That's pretty much it. Fortunately, we can utilize some extensions to enhance the Plone audio story greatly.

What we will see in this article is as follows: First, we will go over some theoretical ground. We will see what formats are available for storing audio content and which is best for which purpose. Later we will investigate the Plone4Artists extension for Plone's File content type—p4a.ploneaudio. We will talk about metadata especially used for audio content and how to manipulate it. As a concrete example, we will use mutagen to extract metadata from a FLAC file to add FLAC support to p4a.ploneaudio. Finally, we will have a word on streaming audio data through the Web and see how to embed a Flash player into our Plone site. We will see how we can do this programmatically and also with the help of a third-party product called collective.flowplayer. At the very end of the article, we have a small technical preview on HTML5 where a dedicated audio element is available. This element allows us to embed audio directly into our HTML page without the detour with Flash.

Accessing audio content in Plone

Once we have uploaded a file we want to work with to Plone, we will link it with other content and display it in one way or another. There are several ways of accessing audio data in Plone: it can be accessed in the visual editor by editors, in templates by integrators, and in Python code by developers.

Kupu access

Unlike for images, there is no special option in the visual editor to embed file/audio content into a page. The only way to access an audio file with Kupu is to use an internal link. The file displays as a normal link and is "executed" when clicked, which means (as for a standalone file) it is saved or opened with the music player of your operating system, just as in the standard view of the File content type. Of course, it is possible to reference external audio files as well.

Page template access

As there is no special access method in Kupu, there is none in page templates. If we need to access a file there, we can use the absolute_url method of the audio content object. This computes a link we can refer to. So the only way to access a file from another context is to refer to its URL.

<a tal:attributes="href audiocontext/absolute_url">Download the audio file</a>

Python script access

If we need to access the content of an (audio) file in a Python script, we can get the binary data with the Archetypes accessor getFile:

>>> binary = context.getFile()

This method returns the data wrapped into a Zope OFS.File object. To access the raw data as a string, we need to do the following:

>>> rawdata = str(binary)

Accessing the raw data of an audio file might be useful if we want to do format transformations on the fly or other direct manipulation of the data.

Field access

If we write our own content type and want to save audio data with an object, we need a file field. This field stores the binary data and takes care of communicating with the browser via adequate view and edit widgets. The file field is defined in the Field module of the Archetypes product. Besides the properties it defines itself, it inherits from the ObjectField base class. The following properties are important:


  Property         Default value
  type             'file'
  default          ''
  primary          False
  content_class    OFS.Image.File
  widget           FileWidget

The type property provides a unique name for the field. We usually don't need to change this. The default property defines the default value for this field. It is normally empty. If we want to change it, we need to specify an instance of the class given by the content_class property.

One field of the schema can be marked as primary. This field can be retrieved by the getPrimaryField accessor. When accessing the content object with FTP, the content of the primary field is transmitted to the client.

Like every other field, the file field needs a widget. The standard FileWidget is defined in the Widget module of the Archetypes product.

The content_class property declares the class whose instances store the actual binary data. By default, the File class from Zope's OFS.Image module is used. This class supports chunk-wise transmission of the data through the publisher.

The field can be accessed like any other field by its accessor method. This method is either defined as a property of the field or constructed from its name: the accessor is generated from the "get" prefix plus the capitalized name of the field. If the name were "audio", the accessor would be getAudio.
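The naming convention can be sketched in a few lines of plain Python; the helper below is illustrative and not part of the Archetypes API:

```python
# Illustrative sketch of Archetypes' accessor naming convention:
# the "get" prefix plus the capitalized field name.
def accessor_name(field_name):
    return "get" + field_name[0].upper() + field_name[1:]

print(accessor_name("audio"))  # getAudio
print(accessor_name("file"))   # getFile
```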

Audio formats

Before we go on with Plone and see how we can enhance the story of audio processing and manipulate audio data, we will glance at audio formats. We will see how raw audio data is compressed to enable effective audio storage and streaming. We need some basic know-how of the terminology to understand how we can process audio effectively for our own purposes.

As with images, there are several formats in which audio content can be stored. We want to learn a bit of theoretical background. This eases the decision of choosing the right format for our use case.

An analog acoustic signal can be displayed as a wave:

If digitized, the wave is approximated by small rectangles below the curve. The more rectangles are used, the better the sound (fidelity) of the digital variant. The number of rectangles taken per second is called the sampling rate.

Usual sampling rates include:

  • 44.1 kHz (44,100 samples per second): CD quality
  • 32 kHz: Speech
  • 14.5 kHz: FM radio bandwidth
  • 10 kHz: AM radio
  • 8 kHz: Telephone speech

Each sample is stored with a fixed number of bits. This value is called the audio bit depth or bit resolution.

Finally, there is a third value that we already know from the analog side. It is the channel. We have one channel for mono and two channels for stereo. For the digital variant, this means a doubling of data if stereo is used.

So let's do a calculation. Let's assume we have an audio podcast with a length of eight minutes, which we want to stream in stereo CD quality. The sampling rate corresponds with the highest frequency of sound that is stored: for accurate reproduction of the original sound, the sampling rate has to be at least twice the highest frequency in the sound. Most humans cannot hear frequencies higher than 20 kHz, and the sampling rate corresponding to 20 kHz is 44,100 samples per second. We want to use a bit resolution of 16, the standard bit depth for audio CDs. Lastly, we have two channels for stereo:

44,100 x 16 x 2 x 60 x 8 = 677,376,000 bits = 84,672,000 bytes ≈ 80.7 MB
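The arithmetic above can be checked with a few lines of Python:

```python
# Size of eight minutes of uncompressed stereo audio at CD quality.
sample_rate = 44100     # samples per second
bit_depth = 16          # bits per sample
channels = 2            # stereo
duration = 8 * 60       # seconds

bits = sample_rate * bit_depth * channels * duration
bytes_total = bits // 8
print(bits)                                  # 677376000
print(bytes_total)                           # 84672000
print(round(bytes_total / 1024.0 ** 2, 1))   # 80.7
```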

This is quite a lot of data for eight minutes of CD-quality sound. We do not want to store so much data and more importantly, we do not want to send so much data over the Internet. So what we do is compress the data. Zipping the data would not give us a big effect because of the binary structure of digital audio data. There are different types of compression for different types of data. ZIPs are good for text, JPEG is good for images, and MP3 is good for music—but why? Each of these algorithms takes the nature of the data into account. ZIP looks for redundant characters, JPEG unifies similar color areas, and MP3 strips the frequencies humans do not hear from the raw data.
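A quick, hedged demonstration of why generic compression helps text far more than dense binary data; random bytes stand in for encoded audio samples here:

```python
import os
import zlib

# Repetitive text compresses very well with a generic compressor...
text = b"the quick brown fox jumps over the lazy dog " * 500
# ...while high-entropy bytes (a stand-in for audio data) barely shrink.
dense = os.urandom(len(text))

print(len(zlib.compress(text)) < len(text) // 10)        # True
print(len(zlib.compress(dense)) > len(dense) * 9 // 10)  # True
```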


Audio compression algorithms are called codecs. There are two kinds of codecs: lossless codecs and lossy codecs. We don't want to go into further detail here. All we need to know is that lossless codecs don't lose data (quality) when they compress, while lossy codecs compress better but lose data. Thus, if we convert a raw stream to a lossy format (such as MP3 or Ogg Vorbis), convert it back to raw, and then back again to MP3, the output will differ from the first encoding. Usually, one won't hear the difference between a raw file and a lossy-encoded one after a single encoding pass, but there is some recognizable quality loss after multiple passes. If we do the same with a lossless codec, the output stays the same no matter how often we encode and decode.

Some commonly used audio codecs are:

  • Lossy: MP3, Ogg Vorbis, Musepack, WMA, and AAC
  • Lossless: FLAC, WavPack, Monkey's Audio, and ALAC/Apple Lossless
  • Raw: WAV and AIFF

Choosing the right audio format

You may ask: there are so many formats, which one shall I use? If you have a choice, which may not always be the case, you can rely on some short guidelines:


Decision guidelines

MP3

  • You want to reach as many people as possible.
  • You want your audio content to be playable with almost all mobile players.
  • Your audio may be used together with Flash; MP3 is easily embedded there.
  • You want a format that is easily streamable.
  • You want small file sizes for storing and streaming.

Ogg Vorbis

  • You want most of the advantages of MP3.
  • You want a patent-free format (this can be helpful if you plan to use HTML5).

A lossless format (for example, FLAC)

  • You have high-quality audio content.
  • You have big disk space.
  • You and your users don't care about Internet bandwidth.

Other formats

  • You have a special reason to do so (for example, if your users stick to iTunes, you may probably use AAC).

Converting audio formats

Sometimes we need to convert one audio format to another. Most web audio players understand only a few formats. Often, they are limited to MP3 only. If we want to play our audio—available in the Ogg Vorbis format—with such players, we have to convert it first. We will see how to do that in this section. If you work a lot with multimedia, you probably know the VLC player from VideoLAN. VLC is a media player and server. It is available on most platforms, including Windows, Mac OS X, and Linux. If VLC doesn't support the format you need, check the home page of the audio format.

Many audio players support encoding audio too. On Windows, the popular audio player Winamp can be used to convert audio formats. On Linux, you probably want to try Amarok. Amarok is a player for KDE and its plugins are scriptable with Python. There are readily available plugins for converting audio data.

Sometimes it is not possible to convert directly. This makes it necessary to convert to raw audio (WAV) first, and then convert it to the desired target format.

Converting audio with VLC

If we use the VLC player for converting audio files, we are utilizing the streaming mode of the player. We open the Streaming/Export Wizard... from the File menu. There we choose the second option, Transcode/Save to file, in the dialog box. Next, we select a file available in the playlist of the player or from the hard disk. After doing so, we select the target format. As stated before, most players support MP3, so MP3 might be a good choice. If the raw format is needed, we have to choose Uncompressed, integer. On the next screen, we have to pick the encapsulation format. If we have selected MP3 before, RAW is a good choice here because we are able to read and manipulate the created file with most audio editing software (for example, Audacity). If we have selected Uncompressed, integer in the earlier step, we don't have a choice now, as WAV is the only supported encapsulation format in this case. As the last step, we choose a filename for the file, which is created with the new format.

After confirming the summary, we are ready and have a new item in our VLC playlist: Streaming Transcoding Wizard (1/1). We need to "play" this item to make the actual transcoding happen. The process might take some time depending on the source and target format. We don't hear anything during the transcoding process. In the case of success, there is a new file on our hard disk that we can test with VLC and then use with our favorite web player in Plone.

Audio metadata

As for most digital photo formats, it is also possible for most audio formats to store some additional data on the binary file. This data contains information on the artist, the album, the genre, the encoding itself, and some more information.

ID3 tag: The metadata format for MP3

The ID3 tag is the metadata format for MP3. ID3 stands for "IDentify an MP3". Before it was introduced, the only way to store metadata was in the filename, which tended to get very long. The ID3v1 tag is capable of storing this information:

  Field                Length
  'TAG' identifier     3 bytes
  Song title           30 bytes
  Artist               30 bytes
  Album                30 bytes
  A four-digit year    4 bytes
  Comment              30 bytes
  Genre                1 byte

The whole tag occupies the last 128 bytes of the file.

There have been some revisions in the format. Nowadays, ID3v2 tags are commonly used. The ID3v2 tag is a complete rewrite of the original ID3 tag implementation. The format is capable of storing icons of the cover art, supports character encoding, and the stored information is not limited to a few characters. The maximum for storing metadata on MP3 with the ID3v2 tag is 256 megabytes. This is enough space for storing karaoke lyrics in several languages.
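As a sketch of how simple the ID3v1 layout is, the reader below pulls the fields out of the last 128 bytes of a file's data; the sample tag is made up for illustration and the function is not part of any library:

```python
# A minimal ID3v1 reader (sketch): the tag is the last 128 bytes,
# starting with the literal "TAG"; field offsets follow the v1 layout.
def parse_id3v1(data):
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None  # no ID3v1 tag present
    def text(raw):
        # Fields are zero-padded fixed-width byte strings.
        return raw.split(b"\x00")[0].decode("latin-1").strip()
    return {
        "title":   text(tag[3:33]),
        "artist":  text(tag[33:63]),
        "album":   text(tag[63:93]),
        "year":    text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre":   tag[127],          # one-byte genre index
    }

# Build a fake MP3 ending in an ID3v1 tag, purely for demonstration.
sample = (b"...audio frames..." + b"TAG"
          + b"My Song".ljust(30, b"\x00")
          + b"Some Artist".ljust(30, b"\x00")
          + b"An Album".ljust(30, b"\x00")
          + b"2009"
          + b"A comment".ljust(30, b"\x00")
          + bytes([17]))
print(parse_id3v1(sample)["artist"])  # Some Artist
```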

Metadata of other audio formats

Most other audio formats support storing metadata information on the file as well. The Ogg Vorbis metadata is called Vorbis comments. They support metadata tags similar to those implemented in the ID3 tag standard for MP3. Music tags are typically implemented as strings of the [TAG]=[VALUE] form (for example, "ARTIST=The Rolling Stones").
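Since a Vorbis comment is just a TAG=VALUE string with case-insensitive tag names, parsing one is trivial; the small helper below is a sketch, not part of any library:

```python
# Split a Vorbis comment into (TAG, value); tag names are
# case-insensitive by convention, so we normalize to upper case.
def parse_comment(comment):
    tag, _, value = comment.partition("=")
    return tag.upper(), value

print(parse_comment("ARTIST=The Rolling Stones"))  # ('ARTIST', 'The Rolling Stones')
print(parse_comment("title=Paint It Black"))       # ('TITLE', 'Paint It Black')
```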

Like the current version of the ID3 tag, users and encoding software are free to use whichever tags are appropriate for the content.

FLAC defines several types of metadata blocks. One of these blocks is special: the STREAMINFO block, the only mandatory one. This block stores audio-centric information such as the sample rate, the number of channels, and so on. Also included in the STREAMINFO block is the MD5 signature of the unencoded audio data. This is useful for checking an entire stream for transmission errors.

Metadata blocks can be any length and new ones can be defined. A decoder is allowed to skip any metadata types it does not understand.

Editing audio metadata

Let's see how the metadata comes into the audio. Most CD encoding programs query an open metadata database such as freedb to generate the metadata for our audio content automatically. If we have MP3 files that we did not encode ourselves, we need a tag editor. Almost every modern player supports accessing the ID3 tag information nowadays. If you have a Mac, you can use iTunes to manipulate the ID3 tag information of every track; use the Get Info command to access the metadata window.

On Windows and Linux, there is a product called EasyTAG, which allows you to manage the metadata of your audio files for whole directories. You can use this software on the Mac too, if you have MacPorts.

EasyTAG also supports writing the Ogg Vorbis and FLAC metadata.

There are other options. Check the manuals of your favorite audio player. Very likely, it comes with some support of reading and writing metadata.

Now we are perfectly prepared to manage our content with Plone: we chose a compression format for our data and structured the data with additional metadata. Next, we want to bring this effort into Plone. A simple File content type is not sufficient any longer. In the next section, we will investigate an extension that aims to solve this issue.

Continue reading: Audio Enhancements with p4a.ploneaudio in Plone 3.3

You've been reading an excerpt of:

Plone 3 Multimedia
