The near-ubiquitous connectivity of the Internet of Things (IoT) is buffeting virtually every embedded software building block and tool, from RTOSs to compilers, debuggers, code analysis tools, and languages. Embedded database management systems and tools, small-footprint versions of their cousins on desktop and enterprise systems, are also being forced to change drastically.
In the 1990s and early 2000s, embedded databases were used to manage the data flows required for many media-intensive applications over the wired Internet. In some designs, especially on network switches, this involved sorting and managing the various data flows. This key role continued as the world moved toward wireless consumer media players and, ultimately, smart phones. Those kinds of applications still exist, and traditional embedded database programs are still playing a pivotal role and also extending their influence in high reliability automotive, industrial and military/aerospace designs.
But in the Internet of Things, life is becoming much more complicated. All IoT transactions, not just media-intensive applications, will require data management services. As a result, embedded database vendors such as McObject, Ittia, Raima, and others are facing a set of data management challenges never conceived of until recently. Developers building applications for the embedded IoT will need to integrate a new set of design parameters into their thinking. IoT database management will have to become as much second nature to designers as interrupt service routines (ISRs), callbacks, and function pointers are now in traditional embedded designs. Some of these new considerations include:
Schema: The database structure your device will need, such as relational, object, network, or structured.
Persistence: the degree to which an application retains information about devices it is connected to. In the current wireless environment, this depends on the amount of flash available and the efficiency with which it is used.
ACID: Acronym for Atomicity, Consistency, Isolation, and Durability, key features any embedded database design will need in a wireless IoT environment.
- Atomicity requires that each transaction be “all or nothing” such that if the transaction fails, the database state is left unchanged.
- Consistency ensures that any transaction will bring the database from one valid state to another.
- The isolation property ensures that the concurrent execution of transactions will result in a system state that would be obtained if transactions were executed serially.
- Durability ensures that once a transaction has been committed, it will remain so, even in the event of power loss or system crashes.
Database query languages. In addition to C, C++, Java, and a few Web scripting languages, developers of IoT apps will have to be knowledgeable about database query languages, particularly the most common one, the Structured Query Language (SQL), the lingua franca of most cloud-based systems that IoT systems will interact with. Developers will also have to be aware of when and where to use NoSQL, a broad term applied to data storage and retrieval schemes other than those used in relational databases based on SQL. NoSQL serves the needs of large cloud service companies for managing big data and real-time web applications.
Key-value storage. Unlike relational databases used in the cloud environment, a key-value database stores, retrieves, and manages data structures in the form of a dictionary or hash. Each record is stored and retrieved using a key that uniquely identifies it and allows it to be found quickly within the database.
In-process data management. An embedded database technique in which the database and the application software it serves execute within the same address space, reducing latency by eliminating unnecessary client-server communications.
In-memory data management. Often employed in IoT systems that use flash, in-memory databases rely primarily on main memory for computer data storage, versus external disk storage as on cloud-based servers. In-memory is faster since the internal optimization algorithms are simpler and execute fewer CPU instructions.
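Several of the terms above can be seen together in a few lines. The following is only a minimal sketch, using the SQLite engine bundled with Python rather than any of the commercial products discussed in this article. The table name `kv`, the helper functions, and the sample readings are hypothetical and chosen purely for illustration. It opens an in-memory, in-process database, uses it as a simple key-value store through SQL, and shows atomicity: when one write in a transaction fails, the earlier writes in that transaction are undone.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM, and the engine runs inside this
# process, so there is no client/server round trip. (Illustrative setup only.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value REAL NOT NULL)")


def put(key, value):
    """Key-value style write: one record per unique key."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))


def get(key):
    """Key-value style read: look the record up by its unique key."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return None if row is None else row[0]


def update_all_or_nothing(readings):
    """Apply several writes as one transaction; any failure undoes them all."""
    try:
        with conn:
            for key, value in readings:
                conn.execute(
                    "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value)
                )
        return True
    except sqlite3.Error:
        return False


put("temp_c", 21.5)
# The second reading is invalid (NULL in a NOT NULL column), so the first
# write in the same transaction must be rolled back as well.
ok = update_all_or_nothing([("temp_c", 99.0), ("humidity", None)])
print(ok, get("temp_c"), get("humidity"))  # False 21.5 None
```

Durability is deliberately not shown here: a purely in-memory database loses its contents on power loss unless it is paired with persistent storage or logging.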
IoT DBM challenges
A designer who wants to build an embedded IoT-based system has to deal with two different kinds of data management. The first covers the procedures and functions necessary to the operation of the field devices. The second covers operations for managing data flow to and from remote aggregation points, usually on a server in the cloud, or in an intermediary layer in the near-user “fog” closer to the edge devices.
The first data management chore is fundamentally simple: collect small subsets of data, analyze them, and respond with automated, real-time control actions in each case. Because an embedded device is designed for a limited range of jobs, the developer can know quite a lot about its data, including the types, relationships, volume, and velocity of data, all of which make it easier to assess and direct what actions are to be taken. What is different now is the sheer number of devices that must be managed.
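As a rough illustration of this collect, analyze, and respond loop, the sketch below keeps a small fixed-size window of sensor samples and picks a control action from their average. The class name `WindowController`, the action strings, and the threshold values are all hypothetical; a real device would substitute its own sensor interface and control logic.

```python
from collections import deque
from statistics import mean


class WindowController:
    """Keeps a small, bounded subset of recent samples and reacts to them."""

    def __init__(self, window_size, high_limit):
        # A bounded deque discards the oldest sample automatically, so memory
        # use stays fixed no matter how long the device runs.
        self.samples = deque(maxlen=window_size)
        self.high_limit = high_limit

    def on_sample(self, value):
        """Collect one sample, analyze the window, and return an action."""
        self.samples.append(value)
        if len(self.samples) < self.samples.maxlen:
            return "wait"  # not enough data yet to decide
        return "cool" if mean(self.samples) > self.high_limit else "idle"


ctl = WindowController(window_size=3, high_limit=30.0)
actions = [ctl.on_sample(v) for v in (29.0, 31.0, 33.0, 25.0)]
print(actions)  # ['wait', 'wait', 'cool', 'idle']
```

Because the developer knows the type, volume, and rate of the data in advance, the window size and threshold can be fixed at design time, which is what keeps this kind of on-device processing simple.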
And with the cloud infrastructure of the IoT rolled into the equation, things get much more complicated, especially as far as the second operation is concerned. An aggregation point on the cloud or in communication with it must collect and manage data from a large number of field devices, analyze it, and determine if it has actionable intelligence that must either be responded to quickly or passed along for analysis or action in the future. For example: aggregating vending machine data and determining where and when to take an immediate corrective action, or place a request for later repair, or just to refill or change what the machine is dispensing.
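To make the vending machine example more concrete, here is a minimal sketch of how an aggregation point might triage incoming reports with a single SQL query. It again uses Python's bundled SQLite engine as a stand-in, and the `reports` table, its columns, the fault labels, the 20 percent stock threshold, and the action names are all assumptions made for illustration, not part of any product described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports ("
    "machine_id TEXT, seq INTEGER, stock_pct INTEGER, fault TEXT)"
)
# Hypothetical reports collected from four field devices.
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?, ?)",
    [
        ("vm-1", 1, 80, "none"),
        ("vm-1", 2, 15, "none"),  # newer report: stock is running low
        ("vm-2", 1, 60, "minor"),
        ("vm-3", 1, 50, "critical"),
        ("vm-4", 1, 90, "none"),
    ],
)

# Take only the latest report per machine, then classify it: act now,
# schedule a later repair, refill, or do nothing.
TRIAGE_SQL = """
SELECT r.machine_id,
       CASE
         WHEN r.fault = 'critical' THEN 'dispatch_now'
         WHEN r.fault = 'minor'    THEN 'schedule_repair'
         WHEN r.stock_pct < 20     THEN 'refill'
         ELSE 'none'
       END AS action
FROM reports AS r
JOIN (SELECT machine_id, MAX(seq) AS seq
      FROM reports
      GROUP BY machine_id) AS latest
  ON latest.machine_id = r.machine_id AND latest.seq = r.seq
ORDER BY r.machine_id
"""

actions = dict(conn.execute(TRIAGE_SQL).fetchall())
print(actions)
```

In a deployed system the same kind of query would run against a far larger table fed by many devices, which is where the choice of DBMS architecture at the fog or cloud level starts to matter.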
More compelling to many companies rushing into the IoT space is the information about the habits of the users of their devices. Such “historical” data is money in the bank for many companies because it gives them a better idea of how to appeal to the users of their systems: power usage of the sensor-enabled light bulbs in a home or commercial building, the duration and time of operation of web-connected electrical appliances, entertainment devices, and all manner of IoT devices. While neither as real time nor as deterministic as the first category, such uses will put heavy demands on the software used to manage such operations.
The emerging database technologies to handle these complex IoT operations are remarkably diverse. The various client/server, distributed, and embedded in-memory/in-process DBMS architectures are targeted at roles at different levels in the IoT: on the devices themselves, or at the fog or cloud level for data analytics.
Table 1. Embedded DBMSs (Source: McObject)
To meet these requirements, developers have a wide range of small-footprint DBMS tools available, many of them open source, though most as yet ill-adapted for this new environment. However, if the developer wants to play it safe until the data management requirements of the Internet of Things have stabilized somewhat, there are a number of commercial DBMS vendors who are striving to meet the stringent and often bewildering requirements (Table 1). They include:
McObject was one of the first commercial DBMS vendors to enter the embedded market in the early 2000s. The company has continued to upgrade and adapt its offering to the range of IoT data management requirements. Its eXtremeDB is an extremely flexible in-memory, in-process embedded DBMS for use in managing field-deployed end-point devices, and it can also serve as a client/server DBMS for data analytics applications at aggregation points on the cloud, or in the fog nearer to end-point devices.
Its newest offering is eXtremeDB 7.0, an upgrade specifically targeted at the Internet of Things with enhancements such as distributed query processing and database clustering. Among the improvements in eXtremeDB 7.0 is faster performance through incorporation of transaction logging capability, which provides a key tool for database recoverability on both field-based devices and server-based IoT data aggregation points.
Primarily a relational DBMS, RDM from Raima is a flexible embedded DBMS with low-level native APIs with SQL features. It supports in-memory, in-process, and persistent database creation and is flexible enough to be implemented in standalone client/servers or in the distributed architectures that are becoming common in the new cloud-mediated IoT applications.
Empress is a full-function relational database that is ACID compliant and incorporates several transaction isolation levels for real-time embedded applications. It supports both persistent and in-memory storage of data and can operate in standalone and client/server configurations.
Berkeley DB from Oracle
Berkeley DB is a full-function key-value store DBMS that can easily be paired with other interfaces such as SQL. It supports failover and both in-memory and in-process persistent database management. Despite having a simple architecture, it supports many advanced database features such as ACID transactions, fine-grained locking, hot backups, and replication.
From Faircom, ctreeACE is an ACID-compliant, key-value-oriented DBMS that supports multiple relational and non-relational data management interfaces as well as SQL, a flexibility that will be important in this new cloud-mediated IoT environment. It also incorporates a number of NoSQL structures for high performance, low latency, and precise control of data accesses. Important to embedded developers is that it allows the design of data and index structures that closely parallel the requirements of specific IoT applications, allowing tight coupling of the application and database.
We are still early in the evolution and growth of the embedded IoT market. But four things are clear at this stage:
- Embedded databases, which in the past were only necessary in a very small segment of embedded applications, will now be required in virtually every application and will be as necessary as an operating system.
- No one database architecture will be able to span the range of applications that will emerge, so developers will need to rely on several approaches rather than just one and pick their embedded database building blocks accordingly.
- In a market as large and pervasive as IoT is expected to be, there will be room for a multiplicity of choices, both open source and proprietary, to survive and thrive.
- The new IoT environment will require that embedded developers become as familiar with the arcane features and capabilities of DBMS tools as they are with traditional RTOSs, compilers, debuggers and programming languages.