For example, a set-top box product family will include hard disks in those units that offer digital video recording, but will use memory-only run-time storage in boxes that manage only programming information, with this in-memory programming database provisioned with data from EPROM, ROM, the satellite transponder or another source upon startup.
That presents a challenge when a manufacturer also wants to incorporate database management system (DBMS) software in devices, a desire that is increasingly common as advanced product features rely on large amounts of locally stored, complex data.
The problem is that the vast majority of database systems are hard-wired to support either disk-based or memory-based storage. Traditional on-disk DBMSs are designed with the assumption that all data will be stored on permanent media.
These systems cache some records for fast access, but write all updates through the cache to disk. A newer type of database, the in-memory database system (IMDS), keeps data entirely in main memory; this allows for faster data access and a smaller footprint, because it eliminates the logic (and overhead) related to file access, disk I/O, caching and other processes.
But neither type of database provides both on-disk and in-memory data management. Developing a product line with both kinds of data storage would require two different database products. This is undesirable.
It adds to costs through duplication in coding (since the two database systems will likely have their own application programming interfaces and other code-level differences), increased licensing fees, more complicated code maintenance, and time spent learning two new technologies vs. one.
DIRECTV, the leading U.S. satellite television service provider, faced this hurdle. Its new software platform for set-top boxes was meant to incorporate an off-the-shelf embedded database system to store programming information, including TV show descriptions, schedules, ratings, user preferences, and multimedia content.
However, some DIRECTV set-top boxes have hard drives, and some don't. A 100% on-disk database wouldn't do the job, and neither would a "pure" in-memory database.
DIRECTV solved the problem using new hybrid database technology, offered by a handful of vendors, which combines the on-disk and in-memory approaches in a single database system. With the database system used by DIRECTV - McObject's eXtremeDB Fusion - a notation in the database design or "schema" causes certain records to be written to disk, while others are managed entirely in memory.
With this change - and a few others, as described below - DIRECTV simplified its task, compared to developing the set-top box programming application using two different database systems.
When developing an application with eXtremeDB Fusion, on-disk or in-memory storage for any object class is determined in the database schema - the database design, written in a formal Data Definition Language (DDL), that also declares database elements such as object classes (or tables, in relational database terms), fields and indexes (keys).
Specifying one set of data within a database schema as transient (managed in memory), while choosing on-disk storage for other record types, requires the schema declaration shown below:
transient class classname {
[fields]
};
persistent class classname {
[fields]
};
The first step in moving this hybrid database system from a disk-enabled set-top box to one that stores data in memory is to change "persistent" to "transient" for all class declarations in the schema.
This is vastly simpler than the changes required between database schemas when using two different database systems, which likely have proprietary DDLs and incompatible syntaxes (or no DDL at all).
These differences between database products mean that despite the work that has gone into creating the schema for the first application, the developer would essentially be starting from scratch when specifying a database design for the second.
Additional code changes are required to move an application containing a hybrid database from a disk-less to a disk-enabled box, but these are largely "tweaks" that touch only the beginning and the closing software operations; they bypass the bulk of application logic and its database interaction. Specifically, an all-in-memory eXtremeDB Fusion database application would use the following sequence of steps to start up and shut down:
mco_error_set_handler()
mco_runtime_start()
mco_db_open()
mco_db_connect()
body of application
mco_db_disconnect()
mco_db_close()
mco_runtime_stop()
Extending the same application code to take advantage of disk storage would require a few additional function calls to eXtremeDB Fusion's disk manager (the mco_disk_ calls in the listing below) when starting and stopping the application:
mco_error_set_handler()
mco_runtime_start()
mco_db_open()
mco_disk_open()
mco_db_connect()
mco_disk_transaction_policy()
body of application
mco_disk_save_cache() /* optional */
mco_db_disconnect()
mco_disk_close()
mco_db_close()
mco_runtime_stop()
In this example, the "disk open" and "disk close" calls prepare the data store for updates, although the term 'disk' shouldn't be taken literally. It is an abstraction and might actually be a disk file, a file on a flash drive, a solid state drive, or a network drive.
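The difference between the two sequences can be sketched as a single start-up and shut-down routine with the disk-manager steps guarded by a flag. This is only an illustration, under stated assumptions: the step names below are plain strings that echo the calls listed above, and record(), run_lifecycle() and step_index() are hypothetical helpers invented for this sketch. They do not call eXtremeDB Fusion and say nothing about the real functions' signatures or return codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: each step is only recorded by name so the
   ordering can be inspected. Nothing here calls the real library. */
#define MAX_STEPS 16
const char *steps[MAX_STEPS];
size_t step_count = 0;

void record(const char *name)
{
    assert(step_count < MAX_STEPS);   /* guard the fixed-size log */
    steps[step_count++] = name;
}

/* Runs the sequence from the listings above. has_disk selects the
   disk-enabled variant; save_cache controls the optional cache save.
   body stands for the application logic, which is the same on both
   box types. Returns the number of steps recorded. */
size_t run_lifecycle(bool has_disk, bool save_cache, void (*body)(void))
{
    step_count = 0;
    record("error_set_handler");
    record("runtime_start");
    record("db_open");
    if (has_disk) record("disk_open");
    record("db_connect");
    if (has_disk) record("disk_transaction_policy");

    if (body) body();

    if (has_disk && save_cache) record("disk_save_cache");
    record("db_disconnect");
    if (has_disk) record("disk_close");
    record("db_close");
    record("runtime_stop");
    return step_count;
}

/* Returns the position of a recorded step, or -1 if it was not run. */
int step_index(const char *name)
{
    for (size_t i = 0; i < step_count; i++)
        if (strcmp(steps[i], name) == 0)
            return (int)i;
    return -1;
}
```

Calling run_lifecycle(false, false, NULL) records the seven steps of the all-in-memory sequence, while run_lifecycle(true, true, NULL) records eleven. The body callback is untouched in both cases, which is the point of the design: the storage choice is confined to the edges of the program.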
The function mco_disk_transaction_policy() takes a handle to the database (the "database context"), as well as the selected policy, which is a choice of
MCO_COMMIT_NO_FLUSH
MCO_COMMIT_ASYNCH_FLUSH
MCO_COMMIT_SYNCH_FLUSH
and sets the level of transaction durability. The semantics of these options vary somewhat depending on whether the programmer selects the UNDO transaction logging model or the REDO transaction logging model, but generally speaking, "No flush" means file write operations concerning the transaction are not forced through the file system cache to the physical media.
Forgoing this write-through process increases performance, but lowers data durability. Using asynchronous flush, a transaction will be logged, but the logging might complete after the transaction is committed - potentially lowering performance somewhat compared to the NO_FLUSH policy (again, depending on the choice of REDO or UNDO model), but increasing durability.
"Synchronous flush" ensures that transaction logging, and the transaction itself, will complete together (or both will be rolled back). This provides the highest level of durability, but taxes performance more than the other two options.
The mco_disk_save_cache() function is optional, and is a unique feature of eXtremeDB Fusion - it enables the application to save the database cache before exiting the application, and restore the cache after restarting the application. In typical on-disk databases, you have no choice but to start with an empty cache every time. A likely use would be to start up an MP3 player from the same context as when it was turned off.
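The durability trade-off among the three commit policies described above can be illustrated with a small selection helper. This is a sketch under assumptions: the enum is a local mirror of the three policy names, not the constants from the eXtremeDB Fusion headers, and choose_policy() and durability_rank() are hypothetical functions that encode only the general ordering given in the text (the exact semantics depend on the UNDO or REDO logging model).

```c
#include <assert.h>
#include <stdbool.h>

/* Local, illustrative mirror of the three policies named in the text.
   These are NOT the identifiers defined by the database headers. */
typedef enum {
    POLICY_NO_FLUSH,
    POLICY_ASYNCH_FLUSH,
    POLICY_SYNCH_FLUSH
} flush_policy;

/* Relative durability as described in the text: a higher rank means
   more durable and, generally, more costly in performance. */
int durability_rank(flush_policy p)
{
    switch (p) {
    case POLICY_NO_FLUSH:     return 0;
    case POLICY_ASYNCH_FLUSH: return 1;
    case POLICY_SYNCH_FLUSH:  return 2;
    }
    assert(!"unknown policy");
    return -1;
}

/* Hypothetical helper: derive a policy from two application needs.
   Durability wins when both are requested. */
flush_policy choose_policy(bool must_survive_power_loss, bool latency_critical)
{
    if (must_survive_power_loss)
        return POLICY_SYNCH_FLUSH;    /* log and commit complete together */
    if (latency_critical)
        return POLICY_NO_FLUSH;       /* skip forced write-through */
    return POLICY_ASYNCH_FLUSH;       /* middle ground */
}
```

A recorder that must not lose a committed schedule change might call choose_policy(true, false), whereas a box that can rebuild its programming guide from the broadcast stream might accept choose_policy(false, true).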
Beyond the database schema changes and the small differences in the code's beginning and ending, described above, software incorporating the hybrid database can be moved seamlessly between the different set-top box types, disk-enabled and disk-less.
No changes are needed in the body of the application; the system stores data on the hard disk (or flash memory, SD card, etc.) of a device with permanent media, and stores records entirely in memory within a disk-less box.
This contrasts sharply with the major alterations in code required to port an application between two different database systems, as would be needed when pairing a dedicated on-disk database system with one designed for in-memory storage only.
Approaches to error-handling, database locking, and transaction models are just some of the areas where database systems can differ radically, even when the separate DBMSs are generally considered "embedded databases." Accommodating these differences typically demands substantial changes in program logic.
One especially thorny area when moving between DBMSs is application programming interfaces (APIs), which vary widely between database software vendors. "Ripping out the database" is common programmer slang for moving an application from one DBMS to another.
It accurately describes the code disruption of taking out the "stitching" that the API provides between application logic and database system. Functions comprising one database system's API often do not map directly to those of another DBMS, and instead require substantially different syntax and arguments.
Using a database system that supports standards such as ANSI SQL and the Open Database Connectivity (ODBC) API can mitigate the pain of moving between databases, but to be 100% portable (or as close as possible) a developer would have to forgo every vendor-specific advantage and be satisfied with a least-common-denominator solution. Developers are rarely eager to do that.
With eXtremeDB Fusion, DIRECTV gained the ability to design a unified software platform, incorporating one embedded database, for both disk-less and disk-enabled set-top boxes. To adapt the applications between these hardware environments, the company's developers still must make changes at both the database design and application code levels.
However, the porting is greatly simplified. It is safe to say that programmer-weeks are saved in initial development of the software platform, and even greater savings should occur post-release, as efficiencies are gained from maintaining and extending one code base rather than two.
As software applications increasingly reside on embedded devices (rather than desktop or server computers), and as manufacturers offer more feature-set choices within product lines, embedded software engineers increasingly face target environments that are split between disk-enabled and disk-less devices.
Hybrid database systems help developers meet this challenge while preserving much of the database coding simplicity of dealing with a single approach to data storage.
Steve Graves is CEO and founder of McObject