Hybrid Data Management Gets Traction in Set-Top Boxes

(Editor's Note: In this “Product How-To” design article, Steve Graves of McObject describes how DIRECTV used the company's eXtremeDB in-memory embedded database to build a hybrid database for managing program information stored locally on the customer's set-top box as well as on the company's website.)

Consumer electronics devices often come in two flavors: with hard disks (or other permanent storage media such as flash), and without. A manufacturer's product line for a particular device will often include both disk-less and disk-enabled models, to accommodate different price points or features.

For example, a set-top box product family will include hard disks in those units that offer digital video recording, but will use memory-only run-time storage in boxes that manage only programming information, with this in-memory programming database provisioned with data from EPROM, ROM, the satellite transponder or another source upon startup.

That presents a challenge when a manufacturer also wants to incorporate database management system (DBMS) software in devices, a desire that is increasingly common as advanced product features rely on large amounts of locally stored, complex data.

The problem is that the vast majority of database systems are hard-wired to support either disk-based or memory-based storage. Traditional on-disk DBMSs are designed with the assumption that all data will be stored on permanent media.

These systems cache some records for fast access, but write all updates through the cache to disk. A newer type of database, the in-memory database system (IMDS), keeps data entirely in main memory; this allows for faster data access and a smaller footprint, because it eliminates the logic (and overhead) related to file access, disk I/O, caching and other processes.

But neither type of database provides both on-disk and in-memory data management. Developing a product line with both kinds of data storage would require two different database products. This is undesirable.

It adds to costs through duplication in coding (since the two database systems will likely have their own application programming interfaces and other code-level differences), increased licensing fees, more complicated code maintenance, and time spent learning two new technologies vs. one.

DIRECTV, the leading U.S. satellite television service provider, faced this hurdle. Its new software platform for set-top boxes was meant to incorporate an off-the-shelf embedded database system to store programming information, including TV show descriptions, schedules, ratings, user preferences, and multimedia content.

However, some DIRECTV set-top boxes have hard drives, and some don't. A 100% on-disk database wouldn't do the job, and neither would a “pure” in-memory database.

DIRECTV solved the problem using new hybrid database technology, offered by a handful of vendors, which combines the on-disk and in-memory approaches in a single database system. With the database system used by DIRECTV – McObject's eXtremeDB Fusion – a notation in the database design or “schema” causes certain records to be written to disk, while others are managed entirely in memory.

With this change – and a few others, as described below – DIRECTV simplified its task, compared to developing the set-top box programming application using two different database systems.

When developing an application with eXtremeDB Fusion, on-disk or in-memory storage for any object class is determined in the database schema – the database design, written in a formal Data Definition Language (DDL), that also declares database elements such as object classes (or tables, in relational database terms), fields and indexes (keys).

Specifying one set of data within a database schema as transient (managed in memory), while choosing on-disk storage for other record types, requires the schema declaration shown below:

transient class classname {
    [fields]
};

persistent class classname {
    [fields]
};

The first step in moving this hybrid database system from a disk-enabled set-top box to one that stores data in memory is to change “persistent” to “transient” for all class declarations in the schema.

This is vastly simpler than the changes required between database schemas when using two different database systems, which likely have proprietary DDLs and incompatible syntaxes (or may have no DDL at all).

These differences between database products mean that despite the work that has gone into creating the schema for the first application, the developer would essentially be starting from scratch when specifying a database design for the second.

Additional code changes are required to move an application containing a hybrid database from a disk-less to a disk-enabled box, but these are largely “tweaks” that touch only the beginning and the closing software operations; they bypass the bulk of application logic and its database interaction. Specifically, an all-in-memory eXtremeDB Fusion database application would use the following sequence of steps to start up and shut down:

mco_error_set_handler()
mco_runtime_start()
mco_db_open()
mco_db_connect()

body of application

mco_db_disconnect()
mco_db_close()
mco_runtime_stop()

Extending the same application code to take advantage of disk storage would require a few additional function calls to eXtremeDB Fusion's disk manager (the mco_disk_ calls in the listing below) when starting and stopping the application:

mco_error_set_handler()
mco_runtime_start()
mco_db_open()
mco_disk_open()
mco_db_connect()
mco_disk_transaction_policy()

body of application

mco_disk_save_cache() /* optional */
mco_db_disconnect()
mco_disk_close()
mco_db_close()
mco_runtime_stop()

In this example, the “disk open” and “disk close” calls prepare the data store for updates, although the term “disk” shouldn't be taken literally. It is an abstraction and might actually be a disk file, a file on a flash drive, a solid state drive, or a network drive.
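One way to keep both variants in a single code base is to put the disk-specific steps behind a flag. The sketch below does not call eXtremeDB at all: it only records the names of the calls listed above, in the order the article gives them, so the two sequences can be compared side by side. The `step_list` type, the `has_disk` flag and the helpers `record_step`, `build_sequence` and `find_step` are hypothetical, and the real functions take arguments that are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_STEPS 16

/* Ordered list of call names; a stand-in for actually invoking the API. */
typedef struct {
    const char *names[MAX_STEPS];
    size_t count;
} step_list;

/* Appends one step name, silently ignoring overflow. */
static void record_step(step_list *s, const char *name)
{
    assert(s != NULL && name != NULL);
    if (s->count < MAX_STEPS)
        s->names[s->count++] = name;
}

/* Builds the startup and shutdown sequence described in the article.
 * has_disk != 0 adds the disk manager steps (hypothetical flag). */
static void build_sequence(step_list *s, int has_disk)
{
    s->count = 0;
    record_step(s, "mco_error_set_handler");
    record_step(s, "mco_runtime_start");
    record_step(s, "mco_db_open");
    if (has_disk) record_step(s, "mco_disk_open");
    record_step(s, "mco_db_connect");
    if (has_disk) record_step(s, "mco_disk_transaction_policy");

    record_step(s, "body of application");  /* identical in both variants */

    if (has_disk) record_step(s, "mco_disk_save_cache");  /* optional */
    record_step(s, "mco_db_disconnect");
    if (has_disk) record_step(s, "mco_disk_close");
    record_step(s, "mco_db_close");
    record_step(s, "mco_runtime_stop");
}

/* Returns the index of a step name, or -1 if it is absent. */
static int find_step(const step_list *s, const char *name)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return (int)i;
    return -1;
}
```

Calling `build_sequence(&s, 0)` yields the all-in-memory sequence and `build_sequence(&s, 1)` the disk-enabled one. The only lines that differ are the ones guarded by `has_disk`, which is the point the listings above make.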

The function mco_disk_transaction_policy() takes a handle to the database (the “database context”), as well as the selected policy, which is a choice of

       MCO_COMMIT_NO_FLUSH
       MCO_COMMIT_ASYNCH_FLUSH
       MCO_COMMIT_SYNCH_FLUSH

and sets the level of transaction durability. The semantics of these options vary somewhat depending on whether the programmer selects the UNDO transaction logging model or the REDO transaction logging model, but generally speaking, “No flush” means file write operations concerning the transaction are not forced through the file system cache to the physical media.

Forgoing this write-through process increases performance, but lowers data durability. Using asynchronous flush, a transaction will be logged, but the logging might complete after the transaction is committed – potentially lowering performance somewhat compared to the NO_FLUSH policy (again, depending on the choice of REDO or UNDO model), but increasing durability.

“Synchronous flush” ensures that transaction logging, and the transaction itself, will complete together (or both will be rolled back). This provides the highest level of durability, but taxes performance more than the other two options.
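The difference between the three policies comes down to when, if ever, a write is forced through the operating system's file cache. The sketch below illustrates that idea with the POSIX `write()` and `fsync()` calls; it is not how eXtremeDB Fusion implements its policies. The `flush_policy` enum, the `log_file` type and the functions `log_open`, `commit_record`, `flush_pending` and `log_close` are hypothetical, and the asynchronous case is modeled, as a simplification, by deferring `fsync()` to a later call rather than by a background thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-ins for the three durability levels. */
typedef enum { POLICY_NO_FLUSH, POLICY_ASYNC_FLUSH, POLICY_SYNC_FLUSH } flush_policy;

typedef struct {
    int fd;
    flush_policy policy;
    int pending;  /* commits written but not yet forced to media */
    int fsyncs;   /* number of successful fsync() calls */
} log_file;

static int log_open(log_file *lf, const char *path, flush_policy policy)
{
    lf->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    lf->policy = policy;
    lf->pending = 0;
    lf->fsyncs = 0;
    return lf->fd >= 0 ? 0 : -1;
}

/* Writes one record; whether it is forced to media depends on the policy. */
static int commit_record(log_file *lf, const char *rec)
{
    size_t len = strlen(rec);
    if (write(lf->fd, rec, len) != (ssize_t)len)
        return -1;
    switch (lf->policy) {
    case POLICY_SYNC_FLUSH:   /* commit returns only after the flush */
        if (fsync(lf->fd) != 0)
            return -1;
        lf->fsyncs++;
        break;
    case POLICY_ASYNC_FLUSH:  /* commit returns now; flush happens later */
        lf->pending++;
        break;
    case POLICY_NO_FLUSH:     /* data stays in the file system cache */
        break;
    }
    return 0;
}

/* Completes any deferred flushes (the "later" of the async policy). */
static int flush_pending(log_file *lf)
{
    if (lf->pending == 0)
        return 0;
    if (fsync(lf->fd) != 0)
        return -1;
    lf->fsyncs++;
    lf->pending = 0;
    return 0;
}

static void log_close(log_file *lf)
{
    close(lf->fd);
    lf->fd = -1;
}
```

In this model a power loss after `commit_record()` returns can lose data under the no-flush and asynchronous policies, but not under the synchronous one, which pays for that guarantee with an `fsync()` on every commit.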

The mco_disk_save_cache() function is optional, and is a unique feature of eXtremeDB Fusion – it enables the application to save the database cache before exiting the application, and restore the cache after restarting the application. In typical on-disk databases, you have no choice but to start with an empty cache every time. A likely use would be to start up an MP3 player from the same context as when it was turned off.
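The general idea of a warm restart can be sketched in a few lines of standard C. The example below is an illustration only and says nothing about how eXtremeDB Fusion stores its cache: the `page_cache` layout, its sizes, and the functions `cache_save` and `cache_restore` are all hypothetical. It writes a small in-memory cache to a file at shutdown and reads it back at startup, falling back to an empty cache when no usable image exists.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 4
#define PAGE_BYTES  8

/* Hypothetical cache: a few fixed-size pages tagged with page numbers. */
typedef struct {
    unsigned page_no[CACHE_SLOTS];
    unsigned char data[CACHE_SLOTS][PAGE_BYTES];
    unsigned used;  /* number of occupied slots */
} page_cache;

/* Saves the whole cache image; returns 0 on success, -1 on failure. */
static int cache_save(const page_cache *c, const char *path)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t n = fwrite(c, sizeof *c, 1, f);
    int rc = fclose(f);
    return (n == 1 && rc == 0) ? 0 : -1;
}

/* Restores a saved image. On any failure the cache is left empty,
 * which corresponds to the usual cold start. */
static int cache_restore(page_cache *c, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        memset(c, 0, sizeof *c);
        return -1;
    }
    size_t n = fread(c, sizeof *c, 1, f);
    fclose(f);
    if (n != 1 || c->used > CACHE_SLOTS) {
        memset(c, 0, sizeof *c);
        return -1;
    }
    return 0;
}
```

Writing the struct image directly keeps the sketch short; it is only valid when the same build of the program reads the file back, since layout and byte order are not portable.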

Beyond the database schema changes and the small differences in the code's beginning and ending, described above, software incorporating the hybrid database can be moved seamlessly between the different set-top box types, disk-enabled and disk-less.

No changes are needed in the body of the application; the system stores data on the hard disk (or flash memory, SD card, etc.) of a device with permanent media, and stores records entirely in memory within a disk-less box.

This contrasts sharply with the major alterations in code required to port an application between two different database systems, as would be required using a dedicated on-disk database system, and one designed for in-memory storage only.

Approaches to error handling, database locking, and transaction models are just some of the areas where database systems can differ radically, even when the separate DBMSs are generally considered “embedded databases.” Accommodating these differences typically demands substantial changes in program logic.

One especially thorny area when moving between DBMSs is application programming interfaces (APIs), which vary widely between database software vendors. “Ripping out the database” is common programmer slang for moving an application from one DBMS to another.

It accurately describes the code disruption of taking out the “stitching” that the API provides between application logic and database system. Functions comprising one database system's API often do not map directly to those of another DBMS, and the replacements can require substantially different syntax and arguments.

Using an API that supports the ANSI SQL and Open Database Connectivity (ODBC) standards can mitigate the pain of moving between databases, but to be 100% portable (or as close as possible) a developer would have to stay away from any vendor's proprietary advantages and be satisfied with a least-common-denominator solution. Developers are rarely eager to do that.

With eXtremeDB Fusion, DIRECTV gained the ability to design a unified software platform, incorporating one embedded database, for both disk-less and disk-enabled set-top boxes. To adapt the applications between these hardware environments, the company's developers still must make changes at both the database design and application code levels.

However, the porting is greatly simplified. It is safe to say that programmer-weeks are saved in initial development of the software platform, and even greater savings should occur post-release, as efficiencies are gained from maintaining and extending one code base rather than two.

As software applications increasingly reside on embedded devices (rather than desktop or server computers), and as manufacturers offer more feature-set choices within product lines, embedded software engineers increasingly face target environments that are split between disk-enabled and disk-less devices.

Hybrid database systems help developers meet this challenge while preserving much of the database coding simplicity of dealing with a single approach to data storage.

Steve Graves is CEO and founder of McObject.
