OpenSceneGraph: Methods for Improving Rendering Efficiency

Rui Wang

February 2011


OpenSceneGraph 3.0: Beginner's Guide


The reader can benefit from the previous article on Implementing Multithreaded Operations and Rendering in OpenSceneGraph.

Improving your application

Many tricks can improve the rendering performance of applications that handle a large amount of data, but the essence of them is easy to understand: the fewer resources (geometries, display lists, texture objects, and so on) an application allocates, the faster and smoother it runs.

There are lots of ideas on how to find the bottleneck of an inefficient application. For example, you can replace certain objects by simple boxes, or replace textures in your application by 1x1 images to see if the performance can increase, thanks to the reduction of geometries and texture objects. The statistics class (osgViewer::StatsHandler, or press the S key in the osgviewer) can also provide helpful information.

To keep the scene's resource usage low enough, we can refer to the following table and try to optimize our applications if they are not running in good shape:

Problem: Too many geometries
Influence: Low frame rate and huge resource cost
Possible solutions:
  • Use LOD and culling techniques to reduce the vertices of the drawables.
  • Use primitive sets and the index mechanism rather than duplicate vertices.
  • Merge geometries into one, if possible. This is because one geometry object allocates one display list, and too many display lists occupy too much of the video memory.
  • Share geometries, vertices, and nodes as often as possible.

Problem: Too many dynamic objects (configured with the setDataVariance() method)
Influence: Low frame rate because the DRAW phase must wait until all dynamic objects finish updating
Possible solutions:
  • Don't use the DYNAMIC flag on nodes and drawables that do not need to be modified on the fly.
  • Don't set the root node to be dynamic unless you are sure that you require this, because data variance can be inherited in the scene graph.

Problem: Too many texture objects
Influence: Low frame rate and huge resource cost
Possible solutions:
  • Share rendering states and textures as much as you can. Lower the resolution and compress them using the DXTC format if possible.
  • Use osg::TextureRectangle to handle non-power-of-two sized textures, and osg::Texture2D for regular 2D textures.
  • Use LOD to simplify and manage nodes with large-sized textures.

Problem: The scene graph structure is "loose", that is, nodes are not grouped together effectively.
Influence: Very high cull and draw time, and many redundant state changes
Possible solutions:
  • If there are too many parent nodes, each with only one child, which means the scene has as many group nodes as leaf nodes, and even as many drawables as leaf nodes, the performance will be totally ruined.
  • You should rethink your scene graph and group nodes that have close features and behaviors more effectively.

Problem: Loading and unloading resources too frequently
Influence: Lower and lower running speed and wasteful memory fragmentation
Possible solutions:
  • Use the buffer pool to allocate and release resources. OSG has already done this to textures and buffer objects, by default.
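The advice to use the index mechanism rather than duplicate vertices can be illustrated without any OSG types. The following is a minimal sketch in plain C++; Vertex, IndexedMesh, and buildIndexedMesh() are hypothetical names made up for this illustration and are not part of the OSG API. It only shows the idea of storing each distinct position once and referring to it by index.

```cpp
#include <array>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 3D position; not an OSG type.
typedef std::array<float, 3> Vertex;

struct IndexedMesh
{
    std::vector<Vertex> vertices;        // each distinct position stored once
    std::vector<unsigned int> indices;   // one entry per original vertex
};

// Collapse exact duplicates in a "vertex soup" into a shared vertex
// list plus an index list that preserves the original drawing order.
IndexedMesh buildIndexedMesh( const std::vector<Vertex>& soup )
{
    IndexedMesh mesh;
    std::map<Vertex, unsigned int> seen;
    for ( const Vertex& v : soup )
    {
        std::map<Vertex, unsigned int>::iterator itr = seen.find(v);
        if ( itr==seen.end() )
        {
            // First time this position appears: store it and remember its index
            unsigned int index =
                static_cast<unsigned int>( mesh.vertices.size() );
            itr = seen.insert( std::make_pair(v, index) ).first;
            mesh.vertices.push_back( v );
        }
        mesh.indices.push_back( itr->second );
    }
    return mesh;
}
```

For a quad drawn as two triangles, six input vertices collapse to four stored vertices and six indices. In a real OSG scene the same role would be played by a vertex array combined with an indexed primitive set.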

An additional helper is the osgUtil::Optimizer class. This can traverse the scene graph before starting the simulation loop and do different kinds of optimizations in order to improve efficiency, including removing redundant nodes, sharing duplicated states, checking and merging geometries, optimizing texture settings, and so on. You may start the optimizing operation with the following code segment:

osgUtil::Optimizer optimizer;
optimizer.optimize( node );

Some parts of the optimizer are optional. You can see the header file include/osgUtil/Optimizer for details.

Time for action – sharing textures with a customized callback

We would like to explain the importance of scene optimization by providing an extreme situation in which a massive number of textures are allocated and identical ones are never shared. We have a basic solution to collect and reuse loaded images in a file reading callback, and then share all textures that use the same image object and have the same parameters. The idea of sharing textures can be used to construct massive scene graphs, such as digital cities; otherwise, the video card memory will soon be exhausted, causing the whole application to slow down or crash.

  1. Include the necessary headers:

    #include <osg/Texture2D>
    #include <osg/Geometry>
    #include <osg/Geode>
    #include <osg/Group>
    #include <osgDB/ReadFile>
    #include <osgViewer/Viewer>
    #include <cstdlib>  // rand()
    #include <map>
    #include <string>

  2. The function for quickly producing massive data can be used in this example, once more. This time we will apply a texture attribute to each quad. That means that we are going to have a huge number of geometries, and the same amount of texture objects, which will be a heavy burden for rendering the scene smoothly:

    // Use a float divisor: RAND_MAX+1 overflows where RAND_MAX==INT_MAX
    #define RAND(min, max) \
            ((min) + (float)rand()/(RAND_MAX+1.0f) * ((max)-(min)))

    osg::Geode* createMassiveQuads( unsigned int number,
                                    const std::string& imageFile )
    {
        osg::ref_ptr<osg::Geode> geode = new osg::Geode;
        for ( unsigned int i=0; i<number; ++i )
        {
            osg::Vec3 randomCenter;
            randomCenter.x() = RAND(-100.0f, 100.0f);
            randomCenter.y() = RAND(1.0f, 100.0f);
            randomCenter.z() = RAND(-100.0f, 100.0f);

            osg::ref_ptr<osg::Drawable> quad =
                osg::createTexturedQuadGeometry(
                    randomCenter,
                    osg::Vec3(1.0f, 0.0f, 0.0f),
                    osg::Vec3(0.0f, 0.0f, 1.0f) );

            // Deliberately wasteful: a new texture object for every quad
            osg::ref_ptr<osg::Texture2D> texture = new osg::Texture2D;
            texture->setImage( osgDB::readImageFile(imageFile) );
            quad->getOrCreateStateSet()->setTextureAttributeAndModes(
                0, texture.get() );
            geode->addDrawable( quad.get() );
        }
        return geode.release();
    }

  3. The createMassiveQuads() function is, of course, awkward and ineffective here. However, it demonstrates a common situation: assuming that an application needs to often load image files and create texture objects on the fly, it is necessary to check if an image has been loaded already and then share the corresponding textures automatically. The memory occupancy will be obviously reduced if there are plenty of textures that are reusable. To achieve this, we should first record all loaded image filenames, and then create a map that saves the corresponding osg::Image objects.
  4. Whenever a new readImageFile() request arrives, the osgDB::Registry instance will try using a preset osgDB::ReadFileCallback to perform the actual loading work. If the callback doesn't exist, it will call the readImageImplementation() to choose an appropriate plug-in that will load the image and return the resultant object. Therefore, we can take over the reading image process by inheriting the osgDB::ReadFileCallback class and implementing a new functionality that compares the filename and re-uses the existing image objects, with the customized getImageByName() function:

    class ReadAndShareImageCallback : public osgDB::ReadFileCallback
    {
    public:
        virtual osgDB::ReaderWriter::ReadResult readImage(
            const std::string& filename, const osgDB::Options* options );

    protected:
        // Return the previously loaded image for this filename, or NULL
        osg::Image* getImageByName( const std::string& filename )
        {
            ImageMap::iterator itr = _imageMap.find(filename);
            if ( itr!=_imageMap.end() ) return itr->second.get();
            return NULL;
        }

        typedef std::map<std::string, osg::ref_ptr<osg::Image> > ImageMap;
        ImageMap _imageMap;
    };

  5. The readImage() method should be overridden to replace the current reading implementation. It will return the previously-imported instance if the filename matches an element in the _imageMap, and will add any newly-loaded image object and its name to _imageMap, in order to ensure that the same file won't be imported again:

    osgDB::ReaderWriter::ReadResult ReadAndShareImageCallback::readImage(
        const std::string& filename, const osgDB::Options* options )
    {
        osg::Image* image = getImageByName( filename );
        if ( !image )
        {
            // Not loaded yet: use the default implementation, then record it
            osgDB::ReaderWriter::ReadResult rr;
            rr = osgDB::Registry::instance()->readImageImplementation(
                filename, options);
            if ( rr.success() ) _imageMap[filename] = rr.getImage();
            return rr;
        }
        return image;
    }

  6. Now we get into the main entry. The file-reading callback is set by the setReadFileCallback() method of the osgDB::Registry class, which is designed as a singleton. Meanwhile, we have to enable another important run-time optimizer, named osgDB::SharedStateManager, that can be defined by setSharedStateManager() or getOrCreateSharedStateManager(). The latter will assign a default instance to the registry:

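    osgDB::Registry::instance()->setReadFileCallback( new ReadAndShareImageCallback );
    osgDB::Registry::instance()->getOrCreateSharedStateManager();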

  7. Create the massive scene graph. It consists of two groups of quads, each of which uses a unified image file to decorate the quad geometry. In total, 1,000 quads will be created, along with 1,000 newly-allocated textures. Certainly, there are too many redundant texture objects (because they are generated from only two image files) in this case:

    osg::ref_ptr<osg::Group> root = new osg::Group;
    root->addChild( createMassiveQuads(500, "Images/lz.rgb") );
    root->addChild( createMassiveQuads(500, "Images/osg64.png") );

  8. The osgDB::SharedStateManager is used for maximizing the reuse of textures and state sets. It is actually a node visitor, traversing all child nodes' state sets and comparing them when the share() method is invoked. State sets and textures with the same attributes and data will be combined into one:

    osgDB::SharedStateManager* ssm =
        osgDB::Registry::instance()->getSharedStateManager();
    if ( ssm ) ssm->share( root.get() );

  9. Finalize the viewer:

    osgViewer::Viewer viewer;
    viewer.setSceneData( root.get() );
    return viewer.run();

  10. Now the application starts with a large number of textured quads. With the ReadAndShareImageCallback sharing image objects, and the osgDB::SharedStateManager sharing textures, the rendering process can work without a hitch. Try commenting out the lines that call setReadFileCallback() and getOrCreateSharedStateManager(), restart the application, and see what happens. The Windows Task Manager is helpful for displaying the amount of memory currently in use.


What just happened?

You may be curious about the implementation of osgDB::SharedStateManager. It collects rendering states and textures as they first appear in the scene graph, and then replaces duplicated states of successive nodes with the recorded ones. It compares two states' member attributes in order to decide whether the new state should be recorded (because it's not the same as any of the recorded ones) or replaced (because it is a duplicate of a recorded one).

For texture objects, the osgDB::SharedStateManager will determine if they are exactly the same by checking the data() pointer of the osg::Image object, rather than by comparing every pixel of the image. Thus, the customized ReadAndShareImageCallback class is used here to share image objects with the same filename first, and the osgDB::SharedStateManager shares textures with the same image object and other attributes.
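The two-stage sharing described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the actual osgDB::SharedStateManager code: Image, Texture, and shareTextures() are hypothetical stand-ins, and a single wrapMode integer stands for all other texture parameters. The point it shows is that textures are matched by the identity of their image object, so two images loaded separately from the same file would not be merged.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for osg::Image and osg::Texture2D.
struct Image { std::string fileName; };

struct Texture
{
    std::shared_ptr<Image> image;
    int wrapMode;   // stands for "all other texture parameters"
};

// Replace every texture that references the same image object and has
// the same parameters with the first such texture encountered.
// Returns the number of distinct textures that remain.
std::size_t shareTextures( std::vector< std::shared_ptr<Texture> >& textures )
{
    // Keyed by image pointer identity, not by pixel content or filename
    std::map< const Image*,
              std::map< int, std::shared_ptr<Texture> > > recorded;
    std::size_t distinct = 0;
    for ( std::shared_ptr<Texture>& tex : textures )
    {
        std::shared_ptr<Texture>& slot =
            recorded[tex->image.get()][tex->wrapMode];
        if ( !slot ) { slot = tex; ++distinct; }   // first appearance: record
        else tex = slot;                           // duplicate: replace
    }
    return distinct;
}
```

If the same file is loaded twice into two separate Image objects, the sketch keeps two textures. That is why the example first shares image objects by filename in the read callback and only then lets the state manager share textures.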

The osgDB::DatabasePager also makes use of osgDB::SharedStateManager to share states of external scene graphs when dynamically loading and unloading paged nodes. This is done automatically if getOrCreateSharedStateManager() is executed.

Have a go hero – sharing public models

Can we also share models with the same name in an application? The answer is absolutely yes. The osgDB::ReadFileCallback could be used again by overriding the virtual method readNode(). Other preparations include a member std::map for recording filename and node pointer pairs, and a user-defined getNodeByName() method as we have just done in the last example.
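As a starting point for this exercise, the bookkeeping part can be sketched independently of OSG. The following is a hedged illustration, not a finished readNode() override: Node, NodeCache, and the Loader function type are hypothetical names, and the loader simply stands for whatever actually reads the file.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for osg::Node.
struct Node { std::string name; };

class NodeCache
{
public:
    typedef std::function< std::shared_ptr<Node>(const std::string&) > Loader;

    explicit NodeCache( Loader loader ) : _loader(loader), _loadCount(0) {}

    // Return the cached node for this filename, loading it on first use.
    std::shared_ptr<Node> readNode( const std::string& filename )
    {
        NodeMap::iterator itr = _nodeMap.find(filename);
        if ( itr!=_nodeMap.end() ) return itr->second;

        std::shared_ptr<Node> node = _loader(filename);
        ++_loadCount;
        if ( node ) _nodeMap[filename] = node;   // failed loads are not cached
        return node;
    }

    // Number of times the loader was actually invoked
    unsigned int loadCount() const { return _loadCount; }

protected:
    typedef std::map< std::string, std::shared_ptr<Node> > NodeMap;
    NodeMap _nodeMap;
    Loader _loader;
    unsigned int _loadCount;
};
```

Requesting the same filename twice returns the same object and invokes the loader only once. Keep in mind that a shared model is a single object, so any later modification to it is visible everywhere it is used.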

Paging huge scene data

Are you still struggling with the optimization of huge scene data? Don't always pay attention to the rendering API itself. There is no "super" rendering engine in the world that can work with unlimited datasets. Consider using the scene paging mechanism at this time, which can load and unload objects according to the current viewport and frustum. It is also important to design a better structure for indexing regions of spatial data, such as a quad-tree, an octree, an R-tree, or a binary space partitioning (BSP) tree.

Making use of the quad-tree

A classic quad-tree structure decomposes the whole 2D region into four square children (we call them cells here), and recursively subdivides each cell into four regions, until a cell reaches its target capacity and stops splitting (a so-called leaf). Each cell in the tree either has exactly four children, or has no children. It is mostly useful for representing terrains or scenes on 2D planes.
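The subdivision rule can be made concrete with a small counting sketch. It is written under simplifying assumptions: the region is square, its size is measured in points per side, and a cell stops splitting once it holds no more than a given number of points per side. QuadTreeStats, subdivide(), and countCells() are hypothetical names used only here.

```cpp
#include <cstddef>

struct QuadTreeStats
{
    unsigned int depth;      // level of the deepest leaf (root is level 0)
    std::size_t leafCells;   // number of cells that stopped splitting
    std::size_t totalCells;  // leaves plus all inner cells
};

// Recursively split a square cell of `size` points per side until it
// holds no more than `capacity` points per side.
void subdivide( unsigned int size, unsigned int capacity,
                unsigned int level, QuadTreeStats& stats )
{
    ++stats.totalCells;
    if ( size<=capacity )
    {
        // Leaf: this cell has no children
        ++stats.leafCells;
        if ( level>stats.depth ) stats.depth = level;
        return;
    }
    // Inner cell: exactly four children, each half as wide
    for ( int child=0; child<4; ++child )
        subdivide( size/2, capacity, level+1, stats );
}

QuadTreeStats countCells( unsigned int size, unsigned int capacity )
{
    QuadTreeStats stats = { 0, 0, 0 };
    subdivide( size, capacity, 0, stats );
    return stats;
}
```

With 1024 points per side and a capacity of 64, countCells(1024, 64) reports leaves at level 4, 256 leaf cells, and 341 cells in total (1 + 4 + 16 + 64 + 256).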

The quad-tree structure is useful for view-frustum culling of terrain data. Because the terrain is divided into small pieces, we can easily render only the pieces that fall inside the frustum and discard those that are invisible. This allows a large number of terrain chunks to be unloaded from memory at a time and loaded back when necessary, which is the basic principle of dynamic data paging. This process can be progressive: when the terrain model is far enough from the viewer, we may only handle its root and first levels. But as the viewer draws near, we can traverse down to the corresponding levels of the quad-tree, and cull and unload as many cells as possible, to keep the scene's load balanced.
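The culling benefit can be sketched in two dimensions. The following is a simplified illustration, assuming the visible region is an axis-aligned rectangle rather than a real view frustum; Rect and collectVisibleLeaves() are hypothetical names and this is not how OSG's cull traversal is written. It shows that a cell outside the view is rejected together with all of its descendants.

```cpp
#include <vector>

// Axis-aligned 2D rectangle; a simplified stand-in for the area
// covered by the view frustum.
struct Rect
{
    float xMin, yMin, xMax, yMax;

    bool intersects( const Rect& other ) const
    {
        return xMin<other.xMax && other.xMin<xMax &&
               yMin<other.yMax && other.yMin<yMax;
    }
};

// Collect the leaf cells that overlap `view`. A cell that lies outside
// the view is skipped without visiting any of its children.
void collectVisibleLeaves( const Rect& cell, float leafSize,
                           const Rect& view, std::vector<Rect>& visible )
{
    if ( !cell.intersects(view) ) return;   // cull the whole subtree
    if ( cell.xMax-cell.xMin<=leafSize )
    {
        visible.push_back( cell );
        return;
    }
    float xMid = 0.5f * (cell.xMin + cell.xMax);
    float yMid = 0.5f * (cell.yMin + cell.yMax);
    Rect children[4] = {
        { cell.xMin, cell.yMin, xMid, yMid },
        { xMid, cell.yMin, cell.xMax, yMid },
        { cell.xMin, yMid, xMid, cell.yMax },
        { xMid, yMid, cell.xMax, cell.yMax } };
    for ( int i=0; i<4; ++i )
        collectVisibleLeaves( children[i], leafSize, view, visible );
}
```

For a 1024x1024 region with 64-unit leaves, a view covering only the corner from (0, 0) to (100, 100) touches 4 of the 256 leaf cells; the rest are never visited below the level at which they are rejected.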




Time for action – building a quad-tree for massive rendering

In this example we would like to show how OSG handles massive data (often massive terrain data) with the quad-tree structure and paged nodes (osg::PagedLOD). We are going to construct a terrain model with fake elevation data, and use a recursion to build all child cells of a complete quad-tree. These cells are saved into separate files and managed by the osgDB::DatabasePager.

  1. Include the necessary headers:

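    #include <osg/Geode>
    #include <osg/ShapeDrawable>
    #include <osg/PagedLOD>
    #include <osgDB/WriteFile>
    #include <cfloat>   // FLT_MAX
    #include <cmath>    // powf(), ceilf()
    #include <cstdlib>  // rand()
    #include <sstream>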

  2. Define some global variables. These will define the dimensions of a regularly-spaced grid of elevation points, including the data pointer (g_data), intervals of X and Y directions (g_dx and g_dy), rows and columns of the leaf cell in the quad-tree (g_minCols and g_minRows), and rows and columns of the entire dataset (g_numCols and g_numRows):

    float* g_data = NULL;
    float g_dx = 1.0f;
    float g_dy = 1.0f;
    unsigned int g_minCols = 64;
    unsigned int g_minRows = 64;
    unsigned int g_numCols = 1024;
    unsigned int g_numRows = 1024;

  3. The following figure shows how variables work here:


  4. These preset global values indicate that we have a 1024x1024 area to be rendered, which contains over one million vertices. This already slows down the rendering of normal geometries, but it's far from enough for representing a digital terrain. Fortunately, we have the quad-tree and paging mechanism, which can solve the massive data problem in a nearly perfect way.
  5. We will first fill the elevation grid (g_data) with random points. This is done via a simple createMassiveData() function. To retrieve an elevation at a certain column and row, we have to define an additional getOneData() function. This gets the minimum value between the input column/row number and the total value with the osg::minimum() function, and then finds the elevation data from the g_data pointer:

    // Use a float divisor: RAND_MAX+1 overflows where RAND_MAX==INT_MAX
    #define RAND(min, max) \
            ((min) + (float)rand()/(RAND_MAX+1.0f) * ((max)-(min)))

    void createMassiveData()
    {
        g_data = new float[g_numCols * g_numRows];
        for ( unsigned int i=0; i<g_numRows; ++i )
        {
            for ( unsigned int j=0; j<g_numCols; ++j )
                g_data[i*g_numCols + j] = RAND(0.0f, 0.5f);
        }
    }

    // Clamp the column/row to the grid so border cells can read one past the end
    float getOneData( unsigned int c, unsigned int r )
    {
        return g_data[osg::minimum(r, g_numRows-1) * g_numCols +
                      osg::minimum(c, g_numCols-1)];
    }

  6. The createFileName() function is another important customized function for naming paged data files. It will be used later in this example:

    std::string createFileName( unsigned int lv,
                                unsigned int x, unsigned int y )
    {
        std::stringstream sstream;
        sstream << "quadtree_L" << lv << "_X" << x << "_Y" << y << ".osg";
        return sstream.str();
    }

  7. The core of the quad-tree construction is the outputSubScene() function. This should be called recursively to build all child cells of a quad-tree, until an end condition is reached. The lv, x, and y parameters indicate the depth level of the quad-tree cell, as well as the X/Y indices in the current level. The color parameter is just used for distinguishing cells in a simple way:

    osg::Node* outputSubScene( unsigned int lv,
                               unsigned int x, unsigned int y,
                               const osg::Vec4& color )
    {

  8. The indices of the cell don't equal the real position of the elevation value in the g_data pointer. Thus, we have to compute how many elevation points are contained in the current cell, along with the indices of the start/end column and row, and then save them for later use:

    unsigned int numInUnitCol = g_numCols / (int)powf(2.0f, (float)lv);
    unsigned int numInUnitRow = g_numRows / (int)powf(2.0f, (float)lv);
    unsigned int xDataStart = x * numInUnitCol,
                 xDataEnd = (x+1) * numInUnitCol;
    unsigned int yDataStart = y * numInUnitRow,
                 yDataEnd = (y+1) * numInUnitRow;

  9. Assuming that the root level of a quad-tree is 0, we have a formula that explains the previous code segment: (Points per side of a cell) = (Total points per side) / (level-th power of 2).
  10. We can easily figure out that a level 1 cell contains a quarter of all points, and a level 2 cell contains one sixteenth of them. That means the rendering of four level 1 cells still requires all data to be drawn, if none of them are culled. So, is there a solution that can reduce the vertex number of these lower levels, that is, to downsample the height field in these cells? For example, each level 1 cell of a 1024x1024 dataset has 512x512 points. If these can be downsampled to 64x64, we will only have to render no more than 20,000 points at one time.
  11. The answer is absolutely yes. As we have just discussed, the quad-tree can be progressively traversed as if it is a LOD (level-of-detail) based graph. Low levels work when the model is still far away and can't represent too many details, and leaf cells will come with uncompressed data only when the viewpoint is near enough.
  12. We will create the downsampling height field for the current level using the osg::HeightField class, which is derived from osg::Shape and can be used by osg::ShapeDrawable. Its origin is defined as the bottom-left corner, and the skirt height can prevent gaps between two terrain cells:

    bool stopAtLeafNode = false;
    osg::ref_ptr<osg::HeightField> grid = new osg::HeightField;
    grid->setSkirtHeight( 1.0f );
    grid->setOrigin( osg::Vec3(g_dx*(float)xDataStart,
    g_dy*(float)yDataStart, 0.0f) );

  13. We will first check to see if the current cell reaches the last level, by comparing the start and end indices with the global g_minCols and g_minRows. If it does, we simply allocate the height field with the computed columns and rows and X/Y intervals, and read and set each point of the allocated elevation grid:

    if ( xDataEnd-xDataStart<=g_minCols &&
         yDataEnd-yDataStart<=g_minRows )
    {
        // Leaf cell: copy every elevation point at full resolution
        grid->allocate( xDataEnd-xDataStart+1, yDataEnd-yDataStart+1 );
        grid->setXInterval( g_dx );
        grid->setYInterval( g_dy );
        for ( unsigned int i=yDataStart; i<=yDataEnd; ++i )
        {
            for ( unsigned int j=xDataStart; j<=xDataEnd; ++j )
                grid->setHeight( j-xDataStart, i-yDataStart,
                                 getOneData(j, i) );
        }
        stopAtLeafNode = true;
    }

  14. Otherwise, we should obtain downsampling data and keep the height field to a fixed, low resolution, using specific g_minCols and g_minRows variables. The simplest method here is to pick one point and add it to the osg::HeightField every few points. The X/Y intervals of the elevation grid should also be changed:

    else
    {
        // Inner cell: keep only every jStep-th/iStep-th point
        unsigned int jStep = (unsigned int)ceilf(
            (float)(xDataEnd - xDataStart) / (float)g_minCols);
        unsigned int iStep = (unsigned int)ceilf(
            (float)(yDataEnd - yDataStart) / (float)g_minRows);
        grid->allocate( g_minCols+1, g_minRows+1 );
        grid->setXInterval( g_dx * jStep );
        grid->setYInterval( g_dy * iStep );
        for ( unsigned int i=yDataStart, ii=0; i<=yDataEnd;
              i+=iStep, ++ii )
        {
            for ( unsigned int j=xDataStart, jj=0; j<=xDataEnd;
                  j+=jStep, ++jj )
                grid->setHeight( jj, ii, getOneData(j, i) );
        }
    }

  15. Set the height field to an osg::ShapeDrawable instance, and set the color. Add the shape to osg::Geode. If this is the leaf cell of the quad-tree, the recursion will end:

    osg::ref_ptr<osg::ShapeDrawable> shape =
    new osg::ShapeDrawable( grid.get() );
    shape->setColor( color );
    osg::ref_ptr<osg::Geode> geode = new osg::Geode;
    geode->addDrawable( shape.get() );
    if ( stopAtLeafNode ) return geode.release();

  16. Now we construct the paged nodes for the OSG scene. A quad-tree cell always has four children, except for leaf ones. Their levels and indices should be increased properly before starting the next level's recursion call. We also specify four different colors, red, green, blue and yellow, for rendering different child cells:

    osg::ref_ptr<osg::Group> group = new osg::Group;
    group->addChild(
        outputSubScene(lv+1, x*2, y*2, osg::Vec4(1.0f,0.0f,0.0f,1.0f)) );
    group->addChild(
        outputSubScene(lv+1, x*2, y*2+1, osg::Vec4(0.0f,1.0f,0.0f,1.0f)) );
    group->addChild(
        outputSubScene(lv+1, x*2+1, y*2+1, osg::Vec4(0.0f,0.0f,1.0f,1.0f)) );
    group->addChild(
        outputSubScene(lv+1, x*2+1, y*2, osg::Vec4(1.0f,1.0f,0.0f,1.0f)) );

  17. The paged LOD node representing the current quad-tree level can be made up of two children: a rough model (the downsampled height field) that is cached for displaying at a far distance, and the fine "model" which actually consists of four cells describing the next level in the quad-tree. Because the next level can still be described as paged LOD nodes, we actually build a quad-tree style scene graph full of osg::PagedLOD nodes. The group node of next level cells can be saved into a separate file, with the filename being generated by createFileName():

    osg::ref_ptr<osg::PagedLOD> plod = new osg::PagedLOD;

    std::string filename = createFileName(lv, x, y);
    plod->insertChild( 0, geode.get() );
    plod->setFileName( 1, filename );

    osgDB::writeNodeFile( *group, filename );

  18. The paged LOD node must have a valid bounding sphere in order to make it correctly pass the view-frustum culling. Here, we have to successively set the center mode to user-defined, and define the center and radius of our customized bounding sphere. After that, we will set the visibility ranges of two child levels of the LOD node. The cutoff parameter is just an empirical value:

    plod->setCenterMode( osg::PagedLOD::USER_DEFINED_CENTER );
    plod->setCenter( geode->getBound().center() );
    plod->setRadius( geode->getBound().radius() );
    float cutoff = geode->getBound().radius() * 5.0f;
    plod->setRange( 0, cutoff, FLT_MAX );
    plod->setRange( 1, 0.0f, cutoff );
    return plod.release();
    }   // end of outputSubScene()

  19. In the main entry, the createMassiveData() function must be the first thing executed, in order to allocate the global terrain data. And we can add the root of the quad-tree to an osg::Group root node and save it into a file, too:

    createMassiveData();

    osg::ref_ptr<osg::Group> root = new osg::Group;
    root->addChild(
        outputSubScene(0, 0, 0, osg::Vec4(1.0f, 1.0f, 1.0f, 1.0f)) );
    osgDB::writeNodeFile( *root, "quadtree.osg" );
    delete[] g_data;   // allocated with new[]
    return 0;

  20. Assuming that the executable name is MyProject.exe, we can now open a console and enter:

    # MyProject.exe
    # osgviewer quadtree.osg

  21. The result is smooth and clear. We have just built a terrain model using customized elevation points. Seen from far away, it is obviously divided into four pieces, which are in fact the first four square cells of the quad-tree:


  22. Move towards the terrain and you will see more detailed height fields within different cells of different levels:


  23. The most detailed data will only be rendered when the viewer is very close to the ground. This is because the last four leaf cells of the quad-tree can only be loaded when the highest level of paged nodes is reached by the OSG backend:


What just happened?

In this example, we haven't provided anything new, but have only made use of the same, known node and drawable types (osg::PagedLOD and osg::HeightField) as well as a world famous algorithm—quad-tree, to construct a complex scene graph that can perform dynamic scene paging, smoothly.
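The range settings used for each paged cell can be restated as a small selection rule. The sketch below is an illustration only: Range, makeCellRanges(), and selectChildren() are hypothetical names, and treating each range as a half-open interval [min, max) is an assumption of this sketch rather than a statement about OSG's internals. Child 0 is the coarse height field and child 1 is the file holding the four finer cells, as in the example.

```cpp
#include <cfloat>
#include <vector>

struct Range { float minDistance, maxDistance; };

// Ranges as configured in the example: the coarse model is shown
// beyond the cutoff, the finer cells are shown inside it.
std::vector<Range> makeCellRanges( float radius )
{
    float cutoff = radius * 5.0f;        // empirical factor from the example
    Range coarse = { cutoff, FLT_MAX };  // child 0: downsampled height field
    Range fine = { 0.0f, cutoff };       // child 1: four cells of the next level
    std::vector<Range> ranges;
    ranges.push_back( coarse );
    ranges.push_back( fine );
    return ranges;
}

// Return the indices of all children whose range contains `distance`.
std::vector<unsigned int> selectChildren( const std::vector<Range>& ranges,
                                          float distance )
{
    std::vector<unsigned int> active;
    for ( unsigned int i=0; i<ranges.size(); ++i )
    {
        if ( ranges[i].minDistance<=distance &&
             distance<ranges[i].maxDistance )
            active.push_back( i );
    }
    return active;
}
```

For a cell with a bounding radius of 100, the cutoff is 500: a viewer at distance 1000 sees only the coarse model, while a viewer at distance 100 triggers the finer child, which in the paged case is the point at which the external file is requested.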

Obviously, there is a lot more work to do before we can put this small example into practical use. Terrain data with non-power-of-two row or column numbers may produce incorrect results at present. The concept of a coordinate system datum (for instance, WGS84) is not included, so geometric earth models cannot be created. The randomly-generated height data is also far from ideal. Real imagery and elevation data in .geotiff format would be a good replacement, if you have an interest in extending the example in any way.
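One reason for the non-power-of-two limitation can be seen from the cell-size arithmetic alone. The following sketch repeats the integer division used in the example (written with a bit shift instead of powf(), which is a substitution made here for brevity); CellExtent, pointsPerSide(), cellExtent(), and uncoveredPoints() are hypothetical helper names.

```cpp
struct CellExtent { unsigned int start, end; };

// Points per side of a cell = total points per side / 2^level,
// using integer division as in the example. Assumes level < 32.
unsigned int pointsPerSide( unsigned int total, unsigned int level )
{
    return total / (1u << level);
}

// Start/end data indices of the cell with the given index at this level.
CellExtent cellExtent( unsigned int total, unsigned int level,
                       unsigned int index )
{
    unsigned int n = pointsPerSide( total, level );
    CellExtent extent = { index*n, (index+1)*n };
    return extent;
}

// Number of trailing points per side that no cell at this level covers,
// because the integer division dropped the remainder.
unsigned int uncoveredPoints( unsigned int total, unsigned int level )
{
    unsigned int cells = 1u << level;
    return total - cells * pointsPerSide( total, level );
}
```

With 1024 points per side, level 4 cells hold 64 points each and cover the grid exactly. With 1000 points per side, level 4 cells hold 62 points each, so the 16 cells end at index 992 and the last 8 columns are never assigned to a leaf cell.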

Have a go hero – testing a new terrain creation tool

We would like to introduce an independent terrain database creation tool named VirtualPlanetBuilder (or VPB for short), which reads a wide range of geospatial imagery and elevation data, and builds small pieces of terrain area with layers of paged LOD nodes. The previous example code actually comes from the theory of the VPB project.

VPB mainly depends on the OSG project and the third-party GDAL project. VPB is described and provided at:

And GDAL can be found at:

After downloading the source code, use CMake to build native solutions or makefiles, choose ALL_BUILD in your Visual Studio interface (or use make install in the UNIX shell), and obtain the VPB libraries and utilities. Use the vpbmaster executable to quickly build an OSG native format terrain from .geotiff files, such as:

# vpbmaster -d dem_file.tif -t dom_file.tif -o output.osgb

Because of its ability to handle multi-terabyte databases, create tiles across networks of computers, and read multi-source imagery and DEM file formats, VPB can be used as a complete terrain-creation tool. Take the time to try it and see whether it can build the whole world for your applications. Good luck with it!

A demo database generated by VPB can be found on the web. You may use the osgviewer utility to view it, provided that the osgdb_curl plug-in has been built (it requires the libcurl library):

# osgviewer

For more information about earth and terrain rendering, have a look at the osgEarth project. This does a similar job, alternatively creating terrain tiles offline or at run-time.


In this article we learnt:

  • A basic texture sharing implementation for different external files using the osgDB::ReadFileCallback and osgDB::SharedStateManager
  • The quad-tree structure and its initial implementation for building large terrain data in OSG, as well as a brief introduction of the professional creation tool VPB

You've been reading an excerpt of:

OpenSceneGraph 3.0: Beginner's Guide
