Managing data distributed over multiple mobile and embedded devices and systems


How often have you complained about data that is in another system? More to the point, how often have you complained that that data isn't in your system? The challenges of electronic data interchange between foreign systems are legion, but quite often we are concerned with more mundane problems: sharing data around within our enterprise.

Nowadays, most of us do application development in Java or .NET. Most IT professionals studied Java at university, and today large enterprise environments are often built on a foundation of Java enterprise application servers like IBM's WebSphere or BEA's WebLogic. These applications have grown in size and complexity, but after years of investment they are now the backbone upon which we run our business.

So we find ourselves with rich data models and large amounts of business and application logic written in these platforms. The fact that the data itself is stored in a database is secondary: when we write applications, we work with objects. In most cases we rely on the container to do the persistence for us and on the DBAs to design the schema and table structure so that performance will be acceptable.

And this is the way we want it: most programmers really don't want to be troubled with details of data storage. Indeed, most shops are deliberately structured to provide separation of concerns; application developers don't even have access to the database. The app server provides them with the business objects and they carry on.

And everyone lived happily ever after… until someone tried to pull the plug.

Mobile ubiquity
The problem that occurs again and again is that these mobile devices need to share information with an enterprise data store.

Mobile, disconnected devices with software-enabled features are fast becoming ubiquitous in the world today. Just because they're everywhere, however, doesn't mean that they are easy to develop for.

They have unique challenges that stem from the limited resources they have at their disposal (small CPU, tiny memory, etc.). Despite these limitations, firms developing such products increasingly wish to use modern tools available for languages like Java rather than be constrained by the hassle of coding in lower-level languages.

Enterprises already have significant Java experience in-house. Furthermore, since the systems they will be connecting to are likely coded in Java, sharing code and getting the two sides to talk to each other is a lot easier if the software in your mobile device is likewise written in Java. But that is only a beginning.

While in some environments it may be possible to communicate directly with the main database (say, over a wireless local area network, cellular, radio, or satellite bounce), at the very least this must be deemed unreliable: your appliance needs to keep working even if the connection goes down, and in any case bandwidth is likely limited and probably very expensive.

And all that's if you're lucky! No. In general that direct link is not available, and we have to design around the case where we only get to connect to the corporate network on an infrequent basis.

When it comes time to replicate specific information to a device that will be disconnected from the enterprise data network, you don't want to suddenly saddle your coders with having to deal with the database directly. Even so, these small systems need their own copy of the data to operate on. And that's a problem.

If you're a salesperson and you have a PDA or other device you carry around with you that helps you keep track of customer information, then you want to work in a very restricted family of objects: your `Client`s, those clients' `Address`es, and their `Sales` histories.

You don't want to replicate the entire enterprise data store. For one thing it's huge, many hundreds of gigabytes at least, and you certainly can't carry that around on an embedded device (or even on a laptop). Besides, all you need is the information on your 70 clients or so.
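As a rough illustration of how small that working set is, the following Java sketch models the restricted family of objects and picks out one salesperson's clients. The class layout, the `salesperson` field, and the `DeviceSubset.forSalesperson` helper are assumptions made for this example; they are not part of db4o or of any other product mentioned here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes, named after the objects mentioned in the text.
final class Address {
    final String street;
    final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }
}

final class Sale {
    final String item;
    final long amountCents;

    Sale(String item, long amountCents) {
        this.item = item;
        this.amountCents = amountCents;
    }
}

final class Client {
    final String name;
    final String salesperson; // assumed field: who looks after this account
    final List<Address> addresses = new ArrayList<>();
    final List<Sale> sales = new ArrayList<>();

    Client(String name, String salesperson) {
        this.name = name;
        this.salesperson = salesperson;
    }
}

final class DeviceSubset {
    // Select only the clients one salesperson needs on their device.
    // Each Client carries its own addresses and sales, so nothing else
    // from the enterprise store has to travel with it.
    static List<Client> forSalesperson(List<Client> all, String salesperson) {
        List<Client> mine = new ArrayList<>();
        for (Client c : all) {
            if (c.salesperson.equals(salesperson)) {
                mine.add(c);
            }
        }
        return mine;
    }
}
```

The point of the sketch is only that the device holds whole objects for a handful of clients, rather than a slice of every normalized table.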

For another, the effort of composing your domain model's objects from the disparate normalized tables they might happen to be stored in is significant: it takes a lot of computing power to look up indexes, search for rows, conduct joins, and do object-to-relational mapping.

This is trivial for a spruced up enterprise database server machine, but prohibitively taxing for a small embedded device. Even on a more powerful desktop or laptop machine, we want our applications to be responsive, not grinding away and starved for memory. And, of course, there are security considerations: a user should only have access to the information they need, not the business's entire information store!

People writing small applications for disconnected operation do have the need to persist things, of course. As their users work they make updates to the information they're working on. They certainly don't want to lose that data, so it has to be reliably stored somewhere.

This is the space that the now well-established db4o object-oriented persistence system operates in. Blindingly easy to use, it is an ideal database for embedded systems needing a reliable, simple tool to be the backing store for their data. For almost all circumstances, you just tell it to save an object and it does. Querying is just as easy. No fuss. And fast! Not bad for a 400kB .jar file.

Anticipating Collision
When most of us think of replication, we visualize a setup where a master database server is sending off transactions to one or more read-only slaves. That does indeed work well for centralized online systems, but it runs into trouble when it meets the disconnected world: what happens when a piece of data has changed on both the device and the main system?

This then raises the real issue: the problem is not replicating data (copying it once from one place to another is easy enough) but synchronizing that data so that changes on the mobile device are propagated back. What happens when there is a collision?

The problem arises when an object has changed on both the disconnected device and the parent data repository. This is not something that can be resolved by falling back to a default. While many synchronization systems deal with this by declaring an overriding policy such as “the server always wins”, this is not sufficient for anything other than the most carefully restricted environments.

Continuing the example above, if an `Address` is changed on both the server side and on the mobile device, how does the database know which to accept as correct?
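One common way to notice such a collision, sketched here under stated assumptions, is to keep a modification counter on each copy and remember the value both copies shared at the last synchronization. The `ChangeTracker` class and its version-counter scheme are hypothetical; they are not taken from any particular product.

```java
// Hypothetical sketch: classify what happened to one object since the last sync.
// Assumption: both copies carried the same version number at the last sync, and
// each side increments its own copy's version on every local edit.
final class ChangeTracker {

    enum Outcome { UNCHANGED, DEVICE_TO_SERVER, SERVER_TO_DEVICE, COLLISION }

    static Outcome classify(long lastSyncVersion, long deviceVersion, long serverVersion) {
        boolean deviceChanged = deviceVersion > lastSyncVersion;
        boolean serverChanged = serverVersion > lastSyncVersion;

        if (deviceChanged && serverChanged) {
            // Both sides edited the object: no default answer is safe here.
            return Outcome.COLLISION;
        }
        if (deviceChanged) {
            return Outcome.DEVICE_TO_SERVER;
        }
        if (serverChanged) {
            return Outcome.SERVER_TO_DEVICE;
        }
        return Outcome.UNCHANGED;
    }
}
```

Only the `COLLISION` case is hard. The other three outcomes can be handled mechanically, which is why detection and resolution are worth keeping separate.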

New Replication Alternatives
The answer is that it is not up to the database to decide. Which object wins is a question of business policy! The place where such logic is encoded and acted upon is in the business layer, i.e., in the application code. That is where the decision needs to be made.

Last year, db4objects released a new replication product called dRS, based on this very idea. It is an efficient mechanism for copying your data from one place to another, and enables your software to make intelligent choices regarding how to go about the synchronization of that data. How do you know you need to make a decision? In the specification for dRS, they write:

“If an object is modified in both peers after the last round of replication, the system will allow the user to choose whether to proceed with either one of the copies as master and copies it across to another peer, or not to replicate that object at all.”

dRS provides a callback to a 'Replication Conflict Handler' which will be called when the application needs to make a choice. That's where your knowledge of the environment, the application, the company, and its business rules all come together, allowing you to code the logic to decide which way to propagate the data, or allowing you to write a user interface to ask a person what to do. Either way, the result is the most correct outcome.
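The shape of such a callback might look like the following sketch. The `ConflictHandler` interface, the `AddressRecord` class, and the "verified by the client" rule are all invented for illustration. This is not the dRS API, whose actual types and signatures should be taken from its documentation.

```java
import java.util.Optional;

// Hypothetical record type for the example; not a db4o or dRS class.
final class AddressRecord {
    final String street;
    final boolean verifiedByClient; // assumed business flag

    AddressRecord(String street, boolean verifiedByClient) {
        this.street = street;
        this.verifiedByClient = verifiedByClient;
    }
}

// Hypothetical callback: the replication layer detects the conflict,
// the application decides the outcome.
interface ConflictHandler<T> {
    // Return the copy that should win, or empty to skip replicating this object.
    Optional<T> resolve(T deviceCopy, T serverCopy);
}

// Example business policy: an address the client has personally confirmed
// beats one that has not been confirmed, whichever side it lives on.
final class ClientVerifiedWins implements ConflictHandler<AddressRecord> {
    @Override
    public Optional<AddressRecord> resolve(AddressRecord deviceCopy, AddressRecord serverCopy) {
        if (deviceCopy.verifiedByClient && !serverCopy.verifiedByClient) {
            return Optional.of(deviceCopy);
        }
        if (serverCopy.verifiedByClient && !deviceCopy.verifiedByClient) {
            return Optional.of(serverCopy);
        }
        // The rule cannot separate the two copies: do not replicate,
        // and leave the decision to a person.
        return Optional.empty();
    }
}
```

The three return paths mirror the choices in the quoted specification: take one copy as master, take the other, or do not replicate the object at all.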

db4o-to-db4o replication and synchronization is fascinating in its own right, but dRS goes one further: it can exchange data with traditional enterprise RDBMSes! Through an innovative design leveraging the Hibernate object-relational mapper, developers can replicate and synchronize with primary enterprise relational data systems.

This enables db4o to act as a highly capable view of an underlying data store. All the data that the client device needs is present in the final form needed by the application: objects! Changes will be propagated back to the central database when the device is next docked, and vice versa.

This powerful combination has opened up possibilities of all kinds. In addition to the predictable use cases for assembly robots and medical imaging devices, some of db4objects's customers are using dRS to power high speed caches for their data within the data center. And one team is even using it to synchronize data between an airborne sensor platform and its ground station.

Andrew Frederick Cowie, Managing Director, Operational Dynamics Consulting Pty Ltd., has extensive experience as a Unix/Linux sysadmin and Java developer. He is a contributor to a number of open source efforts, including the GNOME project, where he maintains the “java-gnome” bindings that allow you to write GTK programs in Java.
