Building an effective real-time distributed publish-subscribe framework Part 1


Data-centric design is emerging as a key tenet for building advanced data-critical distributed systems that link diverse control-oriented embedded devices and sensors to data processing systems within large enterprises.

For example, in manufacturing, production equipment is quickly becoming all-electronic and networked; as a result, data from the sensors and controllers on the factory floor is being linked to the data collection and Enterprise Resource Planning (ERP) systems in the enterprise, to create a more nimble and responsive manufacturing organization. Another example is the auto service shop that may send data from a car's sensors to the auto manufacturer's main data collection centers for analysis, diagnostics, and feedback.

Yet another example is the military's “Global Information Grid,” linking the diverse sensors and local decision-making nodes deployed in the field with command and control centers to empower the soldier on the battlefield.

Two middleware API standards emerging to accomplish this task are the Data Distribution Service (DDS) and the Java Message Service (JMS). Both are easy to use and offer the benefits of a publish-subscribe communication model, resulting in loosely coupled, scalable distributed applications. However, their differences have a significant impact on a data-centric design.

DDS and JMS are based on fundamentally different paradigms with respect to data modeling, dataflow routing, discovery, and data typing; yet they offer a similar, easy-to-use experience to the application programmer. They differ significantly in their support for data filtering and transformation, connectivity monitoring, redundancy and replication, and delivery effort. Each also offers some distinct capabilities, and both offer some equivalent capabilities.

When evaluating these alternatives in a distributed control environment, it is important to understand the practical considerations and differences in using the two standards with respect to middleware architecture, platform support, interoperability, transports, security, administration, performance, scalability, real-time application specific support, and enterprise application specific support.

DDS and JMS APIs may be used together in an application. They can leverage each other via JMS-DDS bridging, JMS/DDS bindings, or by using DDS for JMS discovery. We discuss these approaches and their suitability for different data-centric integration scenarios.

What is data-centric design?
As a result of the growing popularity of cheap and widespread embedded data collection “edge” devices, the easy availability of high-performance messaging and database technology, and the increasing adoption of SOA and Web Services, data-centric design is emerging as a key tenet for building advanced data-critical embedded systems.

As computation and storage costs continue to drop faster than network costs, the trend is to keep data and computation local. Choosing the right data distribution method for moving data between the nodes as and when needed is therefore becoming critical in many distributed embedded systems.

Data-centric design is key to systems which exhibit some or all of the following five characteristics: (a) participants are distributed; (b) interactions between participants are data-centric and not object-centric; often these can be viewed as “dataflows” that may carry information about identifiable data-objects; (c) data is critical because of large volumes, or predictable delivery requirements, or the dynamic nature of the entities; (d) computation is time sensitive and may be critically dependent on the predictable delivery of data; and (e) storage is local. Examples of data-centric systems are found in traffic control, command and control, networking equipment, industrial automation, robotics, simulation, medical, supply chain, and financial processing.

Several middleware technologies and standards have been applied to the construction of distributed systems, including DDS and JMS, Enterprise Java Beans (EJB), High Level Architecture (HLA), CORBA, and the CORBA Notification Service. These middleware technologies fit the requirements of data-centric distributed systems to varying degrees. Specific requirements demanded by data-centric distributed systems include (1) ability to specify structured data models; (2) ability to dynamically specify and (re)configure the data flows; (3) ability to describe delivery requirements per data flow; (4) ability to specify and control middleware resources such as queues and buffering; (5) resiliency to individual node or participant failures; and (6) performance and scalability with respect to number of nodes, participants, and data flows.

The Publish-Subscribe Communication Model. Distributed data-centric application architectures often map naturally to a publish-subscribe (P-S) communication model. A P-S communication model (Figure 1, below) uses asynchronous message passing between concurrently operating subsystems. The publish-subscribe model connects anonymous information producers with information consumers. The overall distributed system is composed of processes, each running in a separate address space, possibly on different computers. We will call each of these processes a “participant application.” A participant may be a producer or consumer of data, or both.

Figure 1. Publish-subscribe middleware decouples information producers from consumers.

Data producers declare the topics on which they intend to publish data; data consumers subscribe to the topics of interest. When a data producer publishes some data on a topic, all the consumers subscribing to that topic receive it. The data producers and consumers remain anonymous, resulting in a loose coupling of sub-systems, which is well suited for data-centric distributed applications.
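The topic-based decoupling described above can be sketched in a few lines of plain Java. This is a toy, in-process illustration only: `TopicBusSketch`, `subscribe()`, and `publish()` are hypothetical names chosen for this example and are not part of the DDS or JMS APIs, and real middleware would deliver across processes and machines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy in-process illustration of topic-based publish-subscribe.
// All names here are hypothetical; this is not the DDS or JMS API.
class TopicBusSketch {
    // topic name -> callbacks of the consumers subscribed to that topic
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // A consumer declares interest in a topic; it never names a producer.
    void subscribe(String topic, Consumer<String> consumer) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(consumer);
    }

    // A producer publishes on a topic; it never names a consumer.
    // Returns the number of consumers the data was delivered to.
    int publish(String topic, String data) {
        List<Consumer<String>> targets = subscribers.getOrDefault(topic, List.of());
        for (Consumer<String> target : targets) {
            target.accept(data);
        }
        return targets.size();
    }

    public static void main(String[] args) {
        TopicBusSketch bus = new TopicBusSketch();
        List<String> display = new ArrayList<>();
        List<String> logger = new ArrayList<>();
        bus.subscribe("temperature", display::add);
        bus.subscribe("temperature", logger::add);
        bus.subscribe("pressure", logger::add);

        bus.publish("temperature", "21.5");
        bus.publish("pressure", "101.3");
        // prints: display=[21.5] logger=[21.5, 101.3]
        System.out.println("display=" + display + " logger=" + logger);
    }
}
```

Note that the producer code only knows the topic name, so consumers can be added or removed without touching it.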

The P-S communication model enables a robust service-based application architecture that decouples participants from one another, provides location transparency, and offers the flexibility to dynamically add or remove participants.

Both DDS and JMS support a P-S communication model and often serve as the integration glue or the “data bus” interconnecting the participants producing or consuming data.

The Data Distribution Service (DDS) is a formal standard from the Object Management Group (OMG), popular in embedded systems, especially in industrial automation, aerospace, and defense applications. DDS specifies an API designed for enabling real-time data distribution. It uses a publish-subscribe communication model, and supports both messaging and data-object centric data models.

Java Message Service. JMS is a de facto industry standard popular in enterprise systems for messaging applications. JMS specifies a Java API for wrapping message-oriented middleware (MOM) APIs, so that portable (Java) application code may be written. In that respect, it is similar to other Java APIs such as JDBC for abstracting database access, or JNDI for abstracting naming and directory services. JMS uses a publish-subscribe communication model, and a messaging or eventing data model.

Thus, both DDS and JMS provide standardized APIs to preserve application portability across middleware vendors; both use a publish-subscribe (P-S) communication model. Both DDS and JMS APIs are intuitive and easy to use, and their popularity mitigates the risk in utilizing them for new data-centric designs.

DDS versus JMS. The two standards differ in their ability to cater to the key data-centric design requirements outlined earlier, with respect to (1) data modeling and manipulation, including lifecycle management, data filtering, and transformation; (2) dataflow routing and discovery, including point-to-point connectivity; (3) delivery quality of service (QoS) per data flow, including delivery effort levels, timing control, ordering control, time-to-live, and message priority; (4) resource specification and management, including resource limits and history; (5) resiliency to failures, including redundancy and failover, and status notifications; and (6) performance and scalability.

DDS is a newer standard based on fundamentally different paradigms than JMS with regard to data modeling, dataflow routing, discovery, and data typing; these differences give application designers powerful new architectural possibilities. Despite these differences, the user experience of writing to DDS APIs is similar to that of JMS APIs. Also, they both provide support for persistent delivery, and time-to-live for a data item.

Distinctive DDS capabilities include data modeling and lifecycle management, automatic dataflow routing, spontaneous discovery, content-based filtering and transformation, per-dataflow connectivity monitoring, simple redundancy and replication, delivery ordering, and real-time specific features such as best-effort delivery, predictable delivery, resource management, and status notifications. In addition, DDS offers several enhanced capabilities with respect to data filtering and transformation, connectivity monitoring, redundancy and replication, and delivery effort. DDS offers new capabilities with respect to data-object lifecycle management, predictable delivery, delivery ordering, transport priority, resource management, and status notifications.

JMS offers some capabilities not offered by DDS. Distinctive JMS capabilities include point-to-point delivery to exactly one of many consumers, message priority, and enterprise specific features such as full transactional support and application-level acknowledgements. Unlike DDS, JMS requires administration of the JMS provider (server) and JNDI registries.

Unlike JMS, which is a Java language standard, standard DDS APIs are available in many languages. The API design choices made by DDS can support potentially higher performance (lower latency and higher throughput) and better scalability than JMS. DDS has some capabilities optimized for real-time applications, not found in JMS. JMS has some capabilities optimized for enterprise applications, not found in DDS.

DDS is amenable to a decentralized peer-to-peer architecture, which can be more robust and efficient than the centralized server-based architecture commonly used for JMS.

Neither DDS nor JMS provides an interoperability protocol, although there is one currently under standardization for DDS. Neither specifies a transport model, although there are some capabilities in DDS that are better suited to unreliable transports such as UDP, while JMS can generally benefit from the availability of a reliable transport like TCP. Both DDS and JMS defer security to the application, and only provide support for communicating security credentials.

DDS and JMS merit careful consideration for data-centric design. Using one or both can considerably simplify a data-centric design, and help maintain the focus on application issues, rather than becoming bogged down by communication and data delivery concerns.

The Basics of DDS
DDS targets real-time systems; the API and Quality of Service (QoS) are chosen to balance predictable behavior and implementation efficiency/performance. The DDS specification describes two levels of interfaces:

* A lower-level Data-Centric Publish-Subscribe (DCPS) layer that is targeted towards the efficient delivery of the proper information to the proper recipients.
* An optional higher-level Data-Local Reconstruction Layer (DLRL), which allows for a simpler integration into the application layer.

The DCPS model builds on the idea of a “global data space” of data-objects that any entity can access. Applications that need data from this space declare that they want to subscribe to the data, and applications that want to modify data in the space declare that they want to publish the data. A data-object in the space is uniquely identified by its keys and topic, and each topic must have a specific type. There may be several topics of a given type. A global data space is identified by its domain id; a subscription and a publication must belong to the same domain to communicate.
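As a rough illustration of how a data-object is identified by its topic and key within a domain, the following plain-Java sketch keeps the latest value per (topic, key) pair and separates domains by id. `DataSpaceSketch`, `write()`, and `read()` are hypothetical names for this example, not DDS API calls, and the sketch ignores typing, QoS, and distribution entirely.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the "global data space"; hypothetical names, not the DDS API.
class DataSpaceSketch {
    // domain id -> ((topic, key) -> latest value of that data-object)
    private final Map<Integer, Map<List<String>, String>> domains = new HashMap<>();

    // Publishing updates the data-object identified by (topic, key) in one domain.
    void write(int domainId, String topic, String key, String value) {
        domains.computeIfAbsent(domainId, d -> new HashMap<>())
               .put(List.of(topic, key), value);
    }

    // A reader only sees data-objects written in the same domain.
    // Returns null if no such data-object exists there.
    String read(int domainId, String topic, String key) {
        Map<List<String>, String> space = domains.get(domainId);
        return space == null ? null : space.get(List.of(topic, key));
    }

    public static void main(String[] args) {
        DataSpaceSketch space = new DataSpaceSketch();
        space.write(0, "Track", "flight-42", "pos=1");
        space.write(0, "Track", "flight-42", "pos=2"); // same object, new value
        space.write(0, "Track", "flight-7", "pos=9");  // different key, different object

        System.out.println(space.read(0, "Track", "flight-42")); // prints pos=2
        System.out.println(space.read(0, "Track", "flight-7"));  // prints pos=9
        System.out.println(space.read(1, "Track", "flight-42")); // prints null (other domain)
    }
}
```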

Figure 2, below, illustrates the overall data-centric publish-subscribe model, which consists of the following entities: DomainParticipant, DataWriter, DataReader, Publisher, Subscriber, and Topic.

Figure 2. UML diagram of the DDS data-centric publish-subscribe interfaces.

All these classes extend Entity, representing their ability to be configured through QoS policies, be enabled, be notified of events via listener objects, and support conditions that can be waited upon by the application. Each specialization of the Entity base class has a corresponding specialized listener and a set of QoSPolicy values that are suitable to it.

A Publisher represents the objects responsible for data issuance. A Publisher may publish data of different data types. A DataWriter is a typed facade to a publisher; participants use DataWriter(s) to communicate the value of and changes to data of a given type. Once new data values have been communicated to the publisher, it is the Publisher’s responsibility to determine when it is appropriate to issue the corresponding message and to actually perform the issuance (the Publisher will do this according to its QoS, or the QoS attached to the corresponding DataWriter, and/or its internal state).

A Subscriber receives published data and makes it available to the participant. A Subscriber may receive and dispatch data of different specified types. To access the received data, the participant must use a typed DataReader attached to the subscriber.

The association of a DataWriter object (representing a publication) with DataReader objects (representing the subscriptions) is done by means of the Topic. A Topic associates a name (unique in the system), a data type, and QoS related to the data itself. The type definition provides enough information for the service to manipulate the data (for example, serialize it into a network format for transmission). The definition can be done by means of a textual language (e.g. something like “float x; float y;”) or by means of an operational “plugin” that provides the necessary methods.

The DDS middleware handles the actual distribution of data on behalf of a user application. The distribution of the data is controlled by user-settable Quality of Service (QoS).

The Basics of JMS
JMS targets enterprise messaging; the API is chosen to abstract the programming of a wide variety of message-oriented middleware (MOM) products in a vendor-neutral and portable manner, using the Java programming language.

Figure 3, below, illustrates the structure of the JMS API. A Destination refers to a named physical resource managed by the underlying MOM. It is administered and configured via vendor-provided tools, and typically accessed by a user application via the Java Naming and Directory Interface (JNDI) APIs (external to JMS). A MessageProducer will send messages to a destination and a MessageConsumer can receive messages from a destination. The destination can be thought of as a mini message broker or a channel independent of the producers and consumers.

Figure 3. UML diagram of JMS messaging interfaces.

JMS supports two different “messaging domains” (unrelated to the DDS domain concept): point-to-point (PtP) and publish-subscribe (Pub/Sub). The two messaging domains are provided to support the wide variety of MOM vendors; only one of them is required to be supported by a JMS provider, although many support both. They provide two different sets of derived classes that extend the common abstract APIs, as shown in Figure 4, below.

Figure 4. The PtP and Pub/Sub JMS domains extend common abstract interfaces, and follow the same programming idioms.

The two JMS messaging domains are similar in every respect, except in the following ways:

1) In the PtP messaging domain, only one consumer will receive a message; the policy for choosing that consumer is not specified by JMS and is left up to the vendor. The messages are delivered in the order they are produced (as if put into a shared serial queue). Also, an application can peek ahead using a QueueBrowser.

2) In the PtP messaging domain, the consumers are durable (see below), and therefore don’t have to be running concurrently with the producers to receive messages. This can be achieved in the JMS Pub/Sub messaging domain by using durable subscriptions.
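The first difference can be made concrete with a small plain-Java sketch that delivers the same messages under each messaging domain. The names are hypothetical and nothing here uses the JMS API. In particular, the round-robin choice of consumer in the PtP case is an assumption made for the example, since JMS leaves that policy to the vendor.

```java
import java.util.ArrayList;
import java.util.List;

// Toy comparison of PtP and Pub/Sub delivery; hypothetical names, not the JMS API.
class MessagingDomainsSketch {
    // PtP: each message is received by exactly one consumer.
    // Round-robin is an assumed policy; JMS does not specify one.
    static List<List<String>> deliverPointToPoint(List<String> messages, int consumers) {
        List<List<String>> inboxes = emptyInboxes(consumers);
        for (int i = 0; i < messages.size(); i++) {
            inboxes.get(i % consumers).add(messages.get(i));
        }
        return inboxes;
    }

    // Pub/Sub: each message is received by every subscribed consumer.
    static List<List<String>> deliverPubSub(List<String> messages, int consumers) {
        List<List<String>> inboxes = emptyInboxes(consumers);
        for (String message : messages) {
            for (List<String> inbox : inboxes) {
                inbox.add(message);
            }
        }
        return inboxes;
    }

    private static List<List<String>> emptyInboxes(int count) {
        List<List<String>> inboxes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            inboxes.add(new ArrayList<>());
        }
        return inboxes;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("m1", "m2", "m3");
        System.out.println(deliverPointToPoint(messages, 2)); // prints [[m1, m3], [m2]]
        System.out.println(deliverPubSub(messages, 2));       // prints [[m1, m2, m3], [m1, m2, m3]]
    }
}
```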

A ConnectionFactory refers to a vendor-provided factory for Connection objects, and is also configured and administered using vendor-provided tools, and typically obtained via JNDI APIs. An optional username and password may be supplied when creating a Connection.

A Connection is a heavy-weight object representing the link between the application and the middleware. Its attributes include a clientID. It provides methods to start() and stop() communication and to close() a connection. An ExceptionListener may be registered with it, to trap lost connections. A Connection is used to create Session objects.

A Session represents a single-threaded context for producing and/or consuming data. It provides methods to create Messages, MessageProducers, and MessageConsumers. Its attributes include whether it is transacted and the acknowledgementMode.

In a transacted session, messages are not actually sent (MessageProducer), and received messages are not acknowledged (MessageConsumer), until a commit() operation. A rollback() operation can undo the pending messages to be sent (MessageProducer) or acknowledged (MessageConsumer). The acknowledgementMode determines whether received messages should be automatically acknowledged such that duplicates may (or may not) be received, or whether they must be explicitly acknowledged by the application by calling Message.acknowledge().
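The producer side of this behavior can be sketched as a buffer that is only released on commit. The class below is a toy model with hypothetical names. It mirrors the `commit()` and `rollback()` semantics described above but is not the JMS Session interface, and it leaves out the consumer-side acknowledgement handling.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the producer side of a transacted session.
// Hypothetical names; this is not the JMS Session interface.
class TransactedSendSketch {
    private final List<String> pending = new ArrayList<>();   // sent, but not yet committed
    private final List<String> delivered = new ArrayList<>(); // visible to consumers

    // In a transacted session a send is only buffered.
    void send(String message) {
        pending.add(message);
    }

    // Commit releases all pending messages, in order.
    void commit() {
        delivered.addAll(pending);
        pending.clear();
    }

    // Rollback discards the pending messages; nothing reaches consumers.
    void rollback() {
        pending.clear();
    }

    List<String> delivered() {
        return List.copyOf(delivered);
    }

    public static void main(String[] args) {
        TransactedSendSketch session = new TransactedSendSketch();
        session.send("debit account A");
        session.send("credit account B");
        session.rollback();                      // both sends are undone
        session.send("audit entry");
        System.out.println(session.delivered()); // prints [] (not committed yet)
        session.commit();
        System.out.println(session.delivered()); // prints [audit entry]
    }
}
```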

A Message is a first class object in JMS; it represents an event,and can carry an optional payload. A message is comprised of headers,optional user defined properties, and an optional user data payload.

The JMS provider automatically assigns most message headers, including: destination, delivery mode, message id, timestamp, expiration, redelivery flag, and priority. The user can assign some headers, including: reply to, correlation id, and type.

In addition, the user can associate arbitrary properties consisting of (name, value) pairs. These properties can be used in ‘selectors’, which are expressions specified on a MessageConsumer to sub-select and consume only the matching messages.
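The effect of a selector can be illustrated with a plain-Java sketch in which a predicate over the message properties stands in for the selector expression. `SelectorSketch`, `Msg`, and `consume()` are hypothetical names for this example. Real JMS selectors are written as string expressions evaluated by the provider, which this sketch does not attempt to parse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model of selector-based consumption; hypothetical names, not the JMS API.
class SelectorSketch {
    // A message with user-defined (name, value) properties and a payload.
    static final class Msg {
        final Map<String, Object> properties;
        final String payload;

        Msg(Map<String, Object> properties, String payload) {
            this.properties = properties;
            this.payload = payload;
        }
    }

    // Returns the payloads of the messages whose properties match the selector.
    static List<String> consume(List<Msg> messages, Predicate<Map<String, Object>> selector) {
        List<String> consumed = new ArrayList<>();
        for (Msg message : messages) {
            if (selector.test(message.properties)) {
                consumed.add(message.payload);
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        List<Msg> messages = List.of(
            new Msg(Map.of("plant", "A", "severity", 5), "overheat"),
            new Msg(Map.of("plant", "B", "severity", 9), "leak"),
            new Msg(Map.of("plant", "A", "severity", 1), "heartbeat"));

        // Predicate standing in for a selector such as: plant = 'A' AND severity > 3
        Predicate<Map<String, Object>> selector =
            p -> "A".equals(p.get("plant")) && (Integer) p.get("severity") > 3;

        System.out.println(consume(messages, selector)); // prints [overheat]
    }
}
```

Because the selector only looks at properties, the payload never needs to be inspected to decide whether a consumer is interested.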

JMS defines five message subclasses to conveniently specify the data payload. The message subclasses for unstructured payloads include TextMessage, BytesMessage, and ObjectMessage; those for structured payloads include StreamMessage and MapMessage.

A MessageProducer is used to produce messages. A default destination may be specified when the producer is created; it can also be specified when sending messages. In addition, the delivery mode, priority, and expiration can be specified for the outgoing message headers. A persistent delivery mode means that a message will be delivered once-and-only-once; the message is stored in permanent storage before the send() method returns. A non-persistent delivery mode means that the message will be delivered at most once; a message may be dropped if the JMS provider fails.

A MessageConsumer is used to consume messages from a destination. A selector can be specified when creating a consumer; the consumer will then receive only the messages whose properties match the selector expression. Messages can be delivered asynchronously by registering a MessageListener; the onMessage() method will be called when a message arrives. Alternatively, messages can be received synchronously by calling the receive*() methods; the desired timeout (zero, finite, infinite) can be chosen by the user.

A consumer can be durable; for the Pub/Sub messaging domain this is specified by calling Session.createDurableSubscriber() and specifying a subscription name; in the PtP messaging domain, a QueueReceiver is always durable. A durable consumer receives all messages sent to a destination, including ones that are sent when the consumer is inactive. The JMS provider retains a record of the durable consumer(s) and ensures that all messages from the destination’s producers are retained until the durable consumer acknowledges them or they have expired.
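The difference between a durable and a non-durable consumer can be sketched as follows. This is a toy model with hypothetical names, not the JMS API. It only models retention while a consumer is inactive, and it omits acknowledgement and message expiration, which a real provider would also track.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of durable vs. non-durable subscriptions on one destination.
// Hypothetical names; acknowledgement and expiration are deliberately omitted.
class DurableSubscriptionSketch {
    static final class Subscription {
        final boolean durable;
        boolean active = true;                         // is the consumer currently running?
        final List<String> inbox = new ArrayList<>();  // messages retained for or delivered to it

        Subscription(boolean durable) {
            this.durable = durable;
        }
    }

    private final List<Subscription> subscriptions = new ArrayList<>();

    Subscription subscribe(boolean durable) {
        Subscription subscription = new Subscription(durable);
        subscriptions.add(subscription);
        return subscription;
    }

    // Active consumers always get the message; inactive ones only if durable.
    void publish(String message) {
        for (Subscription subscription : subscriptions) {
            if (subscription.active || subscription.durable) {
                subscription.inbox.add(message);
            }
        }
    }

    public static void main(String[] args) {
        DurableSubscriptionSketch topic = new DurableSubscriptionSketch();
        Subscription durable = topic.subscribe(true);
        Subscription plain = topic.subscribe(false);

        topic.publish("m1");
        durable.active = false;  // both consumers go offline
        plain.active = false;
        topic.publish("m2");
        durable.active = true;   // both consumers come back
        plain.active = true;
        topic.publish("m3");

        System.out.println(durable.inbox); // prints [m1, m2, m3]
        System.out.println(plain.inbox);   // prints [m1, m3]
    }
}
```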

A Session can also create unique temporary destinations (a TemporaryQueue or a TemporaryTopic), which are like administered destinations except that they are only valid for the duration of the connection, and only the consumers associated with the connection can consume the messages. However, anyone can produce on the temporary destinations; their presence is typically conveyed to other producers using the Message.setJMSReplyTo() method.

The DCPS layer of DDS bears a number of resemblances to JMS, as shown in Figure 5, below. However, there is no DLRL counterpart in JMS.

Figure 5. Mapping of key JMS and DDS concepts and terminology.

DDS and JMS APIs are similar in many respects, and correspondences can be observed between the two APIs. For example, a DDS DomainParticipantFactory corresponds to a JMS ConnectionFactory; a DDS DomainParticipant corresponds to a JMS Connection; a DDS Publisher or Subscriber corresponds to a JMS Session; a DDS Topic corresponds to a JMS Destination; a DDS data-object update corresponds to a JMS Message; a DDS DataWriter corresponds to a JMS MessageProducer; and a DDS DataReader corresponds to a JMS MessageConsumer.

The similarities make it easy to switch back and forth between the two APIs, and to carry experience with one API over to the other.

Next in Part 2: The differences between DDS and JMS.

Rajive Joshi, Ph.D., is principal engineer at Real-Time Innovations, Inc.

