Hybrid Data Management Gets Traction in Set-Top Boxes

Steve Graves

July 28, 2008


(Editor's Note: In this "Product How-To" design article, Steve Graves of McObject describes how DIRECTV used the company's eXtremeDB in-memory embedded database to build a hybrid database for managing program information stored locally on the customer's set-top box as well as on the company's website.)

Consumer electronics devices often come in two flavors: with hard disks (or other permanent storage media such as flash), and without. A manufacturer's product line for a particular device will often include both disk-less and disk-enabled models, to accommodate different price points or features.

For example, a set-top box product family will include hard disks in those units that offer digital video recording, but will use memory-only run-time storage in boxes that manage only programming information, with this in-memory programming database provisioned with data from EPROM, ROM, the satellite transponder or another source upon startup.

That presents a challenge when a manufacturer also wants to incorporate database management system (DBMS) software in devices - a desire that is increasingly common as advanced product features rely on large amounts of locally stored, complex data.

The problem is that the vast majority of database systems are hard-wired to support either disk-based or memory-based storage. Traditional on-disk DBMSs are designed with the assumption that all data will be stored on permanent media.

These systems cache some records for fast access, but write all updates through the cache to disk. A newer type of database, the in-memory database system (IMDS), keeps data entirely in main memory; this allows for faster data access and a smaller footprint, because it eliminates the logic (and overhead) related to file access, disk I/O, caching and other processes.

But neither type of database provides both on-disk and in-memory data management. Developing a product line with both kinds of data storage would require two different database products. This is undesirable.

It adds to costs through duplication in coding (since the two database systems will likely have their own application programming interfaces and other code-level differences), increased licensing fees, more complicated code maintenance, and time spent learning two new technologies vs. one.

DIRECTV, the leading U.S. satellite television service provider, faced this hurdle. Its new software platform for set-top boxes was meant to incorporate an off-the-shelf embedded database system to store programming information, including TV show descriptions, schedules, ratings, user preferences, and multimedia content.

However, some DIRECTV set-top boxes have hard drives, and some don't. A 100% on-disk database wouldn't do the job, and neither would a "pure" in-memory database.

DIRECTV solved the problem using new hybrid database technology, offered by a handful of vendors, which combines the on-disk and in-memory approaches in a single database system. With the database system used by DIRECTV - McObject's eXtremeDB Fusion - a notation in the database design or "schema" causes certain records to be written to disk, while others are managed entirely in memory.

With this change - and a few others, as described below - DIRECTV simplified its task, compared to developing the set-top box programming application using two different database systems.

When developing an application with eXtremeDB Fusion, on-disk or in-memory storage for any object class is determined in the database schema - the database design, written in a formal Data Definition Language (DDL), that also declares database elements such as object classes (or tables, in relational database terms), fields and indexes (keys).

Specifying one set of data within a database schema as transient (managed in memory), while choosing on-disk storage for other record types, requires only the schema declarations shown below:

transient class classname { [fields] };

persistent class classname { [fields] };

The first step in moving this hybrid database from a disk-enabled set-top box to one that stores data only in memory is to change "persistent" to "transient" for all class declarations in the schema.

This is vastly simpler than the changes required between database schemas when using two different database systems, which likely have proprietary DDLs and incompatible syntaxes (or may have no DDL, at all).

These differences between database products mean that despite the work that has gone into creating the schema for the first application, the developer would essentially be starting from scratch when specifying a database design for the second.
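The effect of that one-keyword change can be modeled in plain C. The sketch below is not eXtremeDB code or DDL: the types, the class names used in the usage note, and both helper functions are hypothetical, chosen only to show that storage is a per-class attribute of the schema rather than something scattered through application logic.

```c
#include <stddef.h>

/* Hypothetical model of a schema; NOT the eXtremeDB API or its DDL. */
typedef enum { STORAGE_TRANSIENT, STORAGE_PERSISTENT } storage_class;

typedef struct {
    const char   *name;     /* class (table) name */
    storage_class storage;  /* where objects of this class are kept */
} class_decl;

/* Retarget a schema for a disk-less box: every class becomes transient. */
void schema_make_transient(class_decl *schema, size_t count)
{
    for (size_t i = 0; i < count; i++)
        schema[i].storage = STORAGE_TRANSIENT;
}

/* Count the classes that still require permanent storage. */
size_t schema_persistent_count(const class_decl *schema, size_t count)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++)
        if (schema[i].storage == STORAGE_PERSISTENT)
            n++;
    return n;
}
```

A schema holding, say, a transient "ProgramInfo" class and a persistent "Recording" class would report one persistent class before schema_make_transient() is called and none afterward. Nothing else in the model has to change.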

Additional code changes are required to move an application containing a hybrid database from a disk-less to a disk-enabled box, but these are largely "tweaks" that touch only the beginning and the closing software operations; they bypass the bulk of application logic and its database interaction. Specifically, an all-in-memory eXtremeDB Fusion database application would use the following sequence of steps to start up and shut down:


body of application


Extending the same application code to take advantage of disk storage would require a few additional function calls to eXtremeDB Fusion's disk manager (shown below) when starting and stopping the application:


body of application

mco_disk_save_cache() /* optional */

In this example, the "disk open" and "disk close" calls prepare the data store for updates, although the term 'disk' shouldn't be taken literally. It is an abstraction and might actually be a disk file, a file on a flash drive, a solid state drive, or a network drive.
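The shape of that change, in which only the bracketing start-up and shut-down steps differ between the two kinds of box, can be sketched in C. Everything in this sketch is a hypothetical stand-in: the step names, the call_log type and run_box() are illustrative only, are not eXtremeDB functions, and do not claim to reproduce the product's actual call order.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical steps for illustration; none of these are eXtremeDB calls. */
typedef enum {
    STEP_DB_OPEN,
    STEP_DISK_OPEN,
    STEP_APP_BODY,
    STEP_DISK_CLOSE,
    STEP_DB_CLOSE
} step;

#define MAX_STEPS 8

typedef struct {
    step   steps[MAX_STEPS];  /* steps in the order they ran */
    size_t count;             /* number of valid entries */
} call_log;

static void record(call_log *log, step s)
{
    if (log->count < MAX_STEPS)
        log->steps[log->count++] = s;
}

/* Run the application on either kind of box. Only the start-up and
 * shut-down brackets depend on has_disk; the body is shared. */
void run_box(call_log *log, bool has_disk)
{
    record(log, STEP_DB_OPEN);
    if (has_disk)
        record(log, STEP_DISK_OPEN);   /* disk-enabled boxes only */

    record(log, STEP_APP_BODY);        /* identical on both boxes */

    if (has_disk)
        record(log, STEP_DISK_CLOSE);  /* disk-enabled boxes only */
    record(log, STEP_DB_CLOSE);
}
```

Running run_box() with has_disk set to false logs three steps, and with true logs five. The application body is the same entry in both logs, which is the property that keeps the port inexpensive.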

The function mco_disk_transaction_policy() takes a handle to the database (the "database context"), as well as the selected policy, which is a choice of no flush, asynchronous flush or synchronous flush, and sets the level of transaction durability. The semantics of these options vary somewhat depending on whether the programmer selects the UNDO transaction logging model or the REDO transaction logging model, but generally speaking, "No flush" means file write operations concerning the transaction are not forced through the file system cache to the physical media.

Forgoing this write-through process increases performance, but lowers data durability. Using asynchronous flush, a transaction will be logged, but the logging might complete after the transaction is committed - potentially lowering performance somewhat compared to the NO_FLUSH policy (again, depending on the choice of REDO or UNDO model), but increasing durability.

"Synchronous flush" ensures that transaction logging, and the transaction itself, will complete together (or both will be rolled back). This provides the highest level of durability, but taxes performance more than the other two options.

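The trade-off between the two extremes can be illustrated with ordinary file I/O. The following is a minimal sketch, assuming a POSIX platform that provides fsync() and fileno(). The policy names and log_commit() are hypothetical and are not eXtremeDB constants or functions. The asynchronous policy is left out because modeling it faithfully would need background I/O.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync(); POSIX assumed */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical policy names for illustration; not eXtremeDB constants. */
typedef enum { POLICY_NO_FLUSH, POLICY_SYNC_FLUSH } flush_policy;

/* Append one log record and apply the chosen durability policy.
 * Returns 0 on success, -1 on any I/O error. */
int log_commit(FILE *log, const char *record, flush_policy policy)
{
    size_t len = strlen(record);

    if (fwrite(record, 1, len, log) != len)
        return -1;

    /* In both cases, hand the bytes to the operating system. */
    if (fflush(log) != 0)
        return -1;

    /* NO_FLUSH stops here: the data may still sit in the file system
     * cache. SYNC_FLUSH waits until the OS reports that the data has
     * been pushed to the storage device. */
    if (policy == POLICY_SYNC_FLUSH && fsync(fileno(log)) != 0)
        return -1;

    return 0;
}
```

With POLICY_NO_FLUSH the call returns as soon as the operating system has accepted the bytes, so a power loss shortly afterward can lose the record. With POLICY_SYNC_FLUSH the caller pays for the wait on every commit in exchange for durability.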
The mco_disk_save_cache() function is optional, and is a unique feature of eXtremeDB Fusion - it enables the application to save the database cache before exiting the application, and restore the cache after restarting the application. In typical on-disk databases, you have no choice but to start with an empty cache every time. A likely use would be to start up an MP3 player from the same context as when it was turned off.
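The general idea of saving a cache image at shutdown and reloading it at start-up can be sketched with standard C file I/O. This is not how eXtremeDB Fusion implements mco_disk_save_cache(), which the article does not describe. The page_cache type and both functions are hypothetical, and writing a raw struct image is only safe when the same build reads it back.

```c
#include <stdio.h>

#define CACHE_SLOTS 4

/* Hypothetical page cache: ids of the resident pages, in slot order. */
typedef struct {
    unsigned int page_id[CACHE_SLOTS];
    unsigned int used;  /* number of valid slots */
} page_cache;

/* Write the cache image to a stream before shutdown. 0 on success. */
int cache_save(const page_cache *cache, FILE *out)
{
    return fwrite(cache, sizeof *cache, 1, out) == 1 ? 0 : -1;
}

/* Reload a saved image after restart. If no usable image is found,
 * the cache is left empty, which is the usual cold start. */
int cache_restore(page_cache *cache, FILE *in)
{
    if (fread(cache, sizeof *cache, 1, in) == 1 &&
        cache->used <= CACHE_SLOTS)
        return 0;

    cache->used = 0;
    return -1;
}
```

An application would call cache_save() just before exiting and cache_restore() right after opening the database. A failed restore falls back to the empty cache that a conventional on-disk database always starts with.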
