Iteration and Searching Keys

Exclusive offer: get 50% off this eBook here
Getting Started with LevelDB

Getting Started with LevelDB — Save 50%

Store and retrieve key-value based data quickly on iOS and OS X using LevelDB with this book and ebook

€13.99    €7.00
by Andy Dent | November 2013 | Open Source

This article is written by Andy Dent, the author of Getting Started with LevelDB. This article will give us a brief idea about iteration and searching keys. If you don't know what data may be in your keys, then you need a way to search for partial matches or just start from the beginning of the database. This ability to search for and iterate through keys in sorted order is what completes LevelDB's ability to be the foundation of a database. The default sorting order is a BytewiseComparator, effectively ASCII.

(For more resources related to this topic, see here.)

Introducing Sample04 to show you loops and searches

Sample04 uses the same LevelDbHelpers.h as before. Please download the entire sample and look at main04.cpp to see the code in context. Running Sample04 starts by printing the output from the entire database, as shown in the following screenshot:

Console output of listing keys

Creating test records with a loop

The test data being used here was created with a simple loop and forms a linked list as well. It is explained in more detail in the Simple Relational Style section. The loop creating the test data uses the new C++11 range-based for style of the loop:

vector<string> words {"Packt", "Packer", "Unpack", "packing",
"Packt2", "Alpacca"}; stringprevKey; WriteOptionssyncW; syncW.sync = true; WriteBatchwb; for (auto key : words) { wb.Put(key, prevKey + "\tsome more content"); prevKey = key; } assert(db->Write(syncW, &wb).ok() );

Note how we're using a string to hang onto the prevKey. There may be a temptation to use a Slice here to refer to the previous value of key, but remember the warning about a Slice only having a data pointer. This would be a classic bug introduced with a Slice pointing to a value that can be changed underneath it!

We're adding all the keys using a WriteBatch not just for consistency, but also so that the storage engine knows it's getting a bunch of updates in one go and can optimize the file writing. I will be using the term Record regularly from now on. It's easier to say than Key-value Pair and is also indicative of the richer, multi-value data we're storing.

Stepping through all the records with iterators

The model for multiple record reading in LevelDB is a simple iteration. Find a starting point and then step forwards or backwards.

This is done with an Iterator object that manages the order and starting point of your stepping through keys and values. You call methods on Iterator to choose where to start, to step and to get back the key and value. Each Iterator gets a consistent snapshot of the database, ignoring updates during iteration. Create a new Iterator to see changes.

If you have used declarative database APIs such as SQL-based databases, you would be used to performing a query and then operating on the results. Many of these APIs and older, record-oriented databases have a concept of a cursor which maintains the current position in the results which you can only move forward. Some of them allow you to move the cursor to the previous records. Iterating through individual records may seem clunky and old-fashioned if you are used to getting collections from servers. However, remember LevelDB is a local database. Each step doesn't represent a network operation!

The iterable cursor approach is all that LevelDB offers, called an Iterator. If you want some way of mapping a collected set of results directly to a listbox or other containers, you will have to implement it on top of the Iterator, as we will see later.

Iterating forwards, we just get an Iterator from our database and jump to the first record with SeekToFirst():

Iterator* idb = db->NewIterator(ropt); for (idb->SeekToFirst(); idb->Valid(); idb->Next()) cout<<idb->key() <<endl;

Going backwards is very similar, but inherently less efficient as a storage trade-off:

for (idb->SeekToLast(); idb->Valid(); idb->Prev())
cout<<idb->key() <<endl;

If you wanted to see the value as well as the keys, just use the value() method on the iterator (the test data in Sample04 would make it look a bit confusing so it isn't being done here):

cout<<idb->key() << " " <<idb->value() <<endl;

Unlike some other programming iterators, there's no concept of a special forward or backward iterator and no obligation to keep going in the same direction. Consider searching an HR database for the ten highest-paid managers. With a key of Job+Salary, you would iterate through a range until you know you have hit the end of the managers, then iterate backwards to get the last ten.

An iterator is created by NewIterator(), so you have to remember to delete it or it will leak memory. Iteration is over a consistent snapshot of the data, and any data changes through Put, Get, or Delete operations won't show until another NewIterator() is created.

Searching for ranges of keys

The second half of the console output is from our examples of iterating through partial keys, which are case-sensitive by default, with the default BytewiseComparator:

Console output of searches

As we've seen many times, the Get function looks for an exact match for a key. However, if you have an Iterator, you can use Seek and it will jump to the first key that either matches exactly or is immediately after the partial key you specify.

If we are just looking for keys with a common prefix, the optimal comparison is using the starts_with method of the Slice class:

Void listKeysStarting(Iterator* idb, const Slice& prefix) { cout<< "List all keys starting with " <<prefix.ToString() <<endl; for (idb->Seek(prefix); idb-<Valid() &&idb->key().starts_with(prefix); idb-<Next()) cout<<idb->key() <<endl; }

Going backwards is a little bit more complicated. We use a key that is guaranteed to fail. You could think of it as being between the last key starting with our prefix and the next key out of the desired range. When we Seek to that key, we need to step once to the previous key. If that's valid and matching, it's the last key in our range:

Void listBackwardsKeysStarting(Iterator* idb, const Slice& prefix) { cout<<"List all keys starting with " <<prefix.ToString() << " backwards " <<endl; const string keyAfter = prefix.ToString() + "\xFF"; idb->Seek(keyAfter); if (idb->Valid()) idb->Prev(); // step to last key with actual prefix else // no key just after our range, but idb->SeekToLast(); // maybe the last key matches? for(;idb->Valid() &&idb->key().starts_with(prefix); idb->Prev()) cout<<idb->key() <<endl; }

What if you want to get keys within a range? For the first time, I disagree with the documentation included with LevelDB. Their iteration example shows a similar loop to that shown in the following code, but checks the key values with idb->key().ToString() < limit. That is a more expensive way to iterate keys as it's generating a temporary string object for every key being checked, which is expensive if there were thousands in the range:

Void listKeys Between(Iterator* idb, const Slice&startKey, const Slice&endKey) { cout<< "List all keys >= " <<startKey.ToString() << " and < " <<endKey.ToString() <<endl; for (idb->Seek(startKey); idb->Valid() &&idb->key().compare(endKey) < 0; idb->Next()) cout<<idb->key() <<endl; }

We can use another built-in method of Slice; the compare() method, which returns a result <0, 0, or >0 to indicate if Slice is less than, equal to, or greater than the other Slice it is being compared to. This is the same semantics as the standard C memcpy. The code shown in the previous snippet will find keys that are the same, or after the startKey and are before the endKey. If you want the range to include the endKey, change the comparison to compare(endKey) <= 0.

Summary

In this article, we learned the concept of an iterator in LevelDB as a way to step through records sorted by their keys. The database became far more useful with searches to get the starting point for the iterator, and samples showing how to efficiently check keys as you step through a range.

Resources for Article :


Further resources on this subject:


Getting Started with LevelDB Store and retrieve key-value based data quickly on iOS and OS X using LevelDB with this book and ebook
Published: November 2013
eBook Price: €13.99
Book Price: €21.99
See more
Select your format and quantity:

About the Author :


Andy Dent

Andy Dent is a cross-platform developer from Perth, Western Australia, who started programming Macs with a 512 K Fat Mac in 1986. He has worked on mainframe, desktop, and mobile apps in Perth and remotely for US-based clients. The latter US work on retail products includes developing code generators for all the pre-OS X GUI C++ application-generation tools.

Andy’s background in ISAM filesystems and love of writing frameworks coalesced in creation of the OOFILE products, designed to make C++ programming as easy as xBase. These included an object-oriented data layer, forms integration, and report-writing. He created the expatpp lightweight parser framework to simplify writing XML parsers and capped a love of writing XML tooling with several years working on complex geospatial data interchange at Australia’s CSIRO. His search for a more flexible data store led him to LevelDB. He is currently working on a range of iOS apps for his own label and contract clients.

Books From Packt


iOS and OS X Network Programming Cookbook
iOS and OS X Network Programming Cookbook

OCA Oracle Database 11g: Database Administration I: A Real-World Certification Guide
OCA Oracle Database 11g: Database Administration I: A Real-World Certification Guide

Oracle Database XE 11gR2 Jump Start Guide
Oracle Database XE 11gR2 Jump Start Guide

Oracle 10g/11g Data and Database Management Utilities
Oracle 10g/11g Data and Database Management Utilities

Creating your MySQL Database: Practical Design Tips and Techniques
Creating your MySQL Database: Practical Design Tips and Techniques

Core Data iOS Essentials
Core Data iOS Essentials

Pentaho Data Integration 4 Cookbook
Pentaho Data Integration 4 Cookbook

Instant Oracle Database and PowerShell How-to [Instant]
Instant Oracle Database and PowerShell How-to [Instant]

Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software