Since 2000, when IPv6 was first introduced, and 2007, when the 6LoWPAN wireless networking extension was released, virtually all IP-connected devices, wired and wireless, can be connected. Such networks, known as the “Internet of Things,” are now generating interest among developers of industrial-control networks. These industrial systems connect to both external and internal Internet Protocol (IP) networks through gateways that require custom provisioning and programming to expose the necessary data to the enterprise systems. Invariably, the gateway constrains what information can pass back and forth, and its configuration is difficult to evolve to support new requirements. To provide the benefits of a common IP communications infrastructure, the advanced communications requirements of these systems must be addressed.
The Industrial Internet of Things (IIoT), however, is just one of three main classes of IP-enabled connected devices. The two other main categories are consumer and machine-to-machine devices (Figure 1).
Figure 1: Currently there are three main Internet Protocol (IP) enabled Internet of Things (IoT) categories: consumer, machine to machine and industrial machines to machines.
Consumer IoT. Unlike the industrial environment, consumer IoT is non-real-time and non-deterministic and characterized by a human interacting with a device. Viewing a video on a cell phone, or starting up an exercise monitor to send your statistics to your account in the cloud, are examples of consumer IoT applications. In case of failure, a human is there to recover or restart the application. In the consumer IoT, communications run between client/server and are often streaming large amounts of data.
This is very different than IIoT, where, in terms of reliability and determinism, the requirements are a superset of the main IoT requirements. This segment of the IoT market includes two main categories: Machine to Machine (M2M) application monitoring and a superset of traditional M2M called Machines to Machines networking for autonomous, peer-to-peer distributed control.
Machine to Machine. Typical of client/server-based application-monitoring architecture of M2M are vehicle-tracking systems, systems that monitor a building’s mechanisms for signs of wear, or systems that track mobile hospital equipment. This class of applications uses client/server communications and sends smaller amounts of data. For example, a data record might include the device identifier, position coordinates, and a time stamp.
Most important in these applications is the reliability of communications, because there is no human operator or user to aid in recovery from error. Reliability also matters because the items being tracked are valuable, as is the knowledge of where they are at any given point. Cost is incurred when the information is not available or is unreliable.
Much more demanding than either consumer IoT or traditional M2M are Industrial IoT applications (Figure 2), a communications-intensive class of Machines-to-Machines networking in which the application uses autonomous, peer-to-peer distributed control.
Figure 2: The Industrial Internet of Things is characterized by many-to-many connections where groups of nodes work together on a single task.
Using the plural ‘machines’ versus the singular ‘machine’ is important because in these many-to-many applications, groups of nodes work together to accomplish a single task. For example, a baggage-handling machine in an airport senses luggage moving on a conveyor belt. It identifies the luggage by reading a bar code and then nudges the suitcase to the correct next conveyor belt based on the bar code.
Then, further along, another node makes a routing decision as multiple conveyor belts converge. For these systems, the communications requirements are not merely client/server. Instead, the nodes act as peers on the network, each making decisions and reporting status to the other nodes.
Data transfers are frequent, but typically do not convey large amounts of data. A message may merely convey temperature or the pressure or the status of a switch. Often, these systems process their tasks at rates greater than a human could, so they must run reliably and safely without human intervention. Communication failures risk added costs or can even threaten human safety.
Besides performing their main task, these systems also connect to an enterprise system to issue alarms, archive historical data, and store data as a basis for analytics. This connection can work via a local IP connection or can be hosted in the cloud. When communicating with the enterprise system, the communications model reverts to the client/server M2M model discussed above.
Systems performing these demanding industrial applications have been available for some time. However, they have often depended on hard wiring and purpose-built communications protocols for exchanging data and status between nodes. But now recent advances in Low Power Wireless, Power Line, and high-speed multi-drop twisted-pair communications technologies, coupled with more compact implementations of IP, are enabling a migration to IP-based nodes for even the smallest and most cost-sensitive nodes within these systems.
Requirements for the Industrial Internet of Things
Similar to the broader IoT market, the Industrial IoT market needs inexpensive nodes that work on easy-to-install links such as wireless, power line, and simple twisted pair. These links do not always have the same reliability found with traditional data communication links, so there are problems of error detection and reliability common to the entire IoT, but the consequences of communications failures are much worse in the Industrial IoT due to the investment returns expected from flawless operation.
As with the broader IoT market, it is important for the IIoT to have a rich set of services in an IP-based protocol stack that allows that IP protocol stack to be used across the entire Industrial IoT application space, so that application developers can depend on a common set of communications services as they implement IIoT applications. Because the stakes are high when industry is involved, the Industrial IoT space has a very specific set of performance and reliability requirements that must in most cases (but not all) be satisfied, including:
- Resilience in the face of failures
- Physical connectivity requirements
- Control services
Resilience in the face of failures
Packet recovery. The new Low Power Wireless and Power Line links that these networks run on have very low bandwidth compared to Ethernet connections and, unlike Ethernet, the links are not nearly as reliable.
Packets can be lost due to interference and noise or even collisions. When these events happen, and especially if they happen frequently, more bandwidth is needed to recover from the loss in the form of a packet retransmission. Because these systems typically have real-time constraints, delivering the packet well beyond the application’s timing constraints is not important or even desirable. In such an environment the IIoT protocol stack must recover from intermittent packet loss quickly via packet retransmission, or it must report a message failure to the application.
Real-time requirements. On a factory floor, a material handler might drop something; in a semiconductor fab line, a wafer handler might fail to place a wafer on a probe station. Late packets mean communication failures in most control systems – there is no benefit in delivering a packet late.
To deal with this, the communications network must be engineered such that the real-time requirements of the application are met. This involves being able to:
- Design the network to meet response-time criteria by limiting the number of nodes per link, and tune the communications such that the network will not become overloaded
- Specify that a given communications transaction will either succeed or fail within a specified time, as well as guarantee that the success or failure of that transaction will be known to the application
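The second requirement above, a transaction that either succeeds or fails within a known time, can be sketched as follows. This is a minimal illustration under stated assumptions, not any particular stack's API: `send_with_deadline`, `send_once`, and the parameter names are hypothetical, and `send_once` is assumed to transmit one packet and report whether an acknowledgement arrived.

```python
import time
from typing import Callable


def send_with_deadline(send_once: Callable[[], bool],
                       deadline_s: float,
                       retry_interval_s: float,
                       clock: Callable[[], float] = time.monotonic,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry until acknowledged or until the deadline would be exceeded.

    Returns True on acknowledgement and False on failure, so the
    application always learns the outcome within roughly deadline_s.
    (Hypothetical helper: send_once is assumed to send one packet and
    return True only if the peer acknowledged it.)
    """
    give_up_at = clock() + deadline_s
    while True:
        if send_once():
            return True
        remaining = give_up_at - clock()
        # Waiting one more retry interval would use up the remaining time,
        # so report failure now instead of delivering a late packet.
        if remaining <= retry_interval_s:
            return False
        sleep(retry_interval_s)
```

The clock and sleep functions are passed in so that the timing behavior can be checked without waiting in real time; a real node would typically use a hardware timer instead.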
Failure resistance. A main purpose of distributing control is to make it nearly impossible for the entire system to fail. Building single points of failure into the communications infrastructure – such as non-redundant routers, switches, or communication transceivers that can fail in such a way as to take down the entire link – defeats an important purpose of a distributed system. So in whatever protocol is used, there must be no single failure that can take down the communications for an entire link.
Reliable network-wide delivery. In applications where the message must get through or a major equipment shutdown is required, the sending node must have confirmation that its message was received by all the members of the group (Figure 3). This requirement can be addressed by making sure the protocol used supports network-wide (spanning all links within the system) confirmed multicast messaging.
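One way to picture confirmed multicast is as bookkeeping on the sender: track which group members have not yet acknowledged, and re-send only to those. The sketch below assumes a hypothetical `transmit` callback that sends to a set of members and returns the subset that acknowledged; the names and the retry limit are illustrative only.

```python
from typing import Callable, Iterable, Set


def confirmed_multicast(group: Iterable[str],
                        transmit: Callable[[Set[str]], Set[str]],
                        max_attempts: int = 3) -> Set[str]:
    """Send to every group member and return those that never acknowledged.

    transmit is a hypothetical callback: it sends the message to the given
    members and returns the subset whose acknowledgements were received.
    An empty return value means delivery was confirmed by the whole group.
    """
    pending = set(group)
    for _ in range(max_attempts):
        if not pending:
            break
        # Pass a copy so the callback cannot alter our bookkeeping.
        acked = transmit(set(pending))
        pending -= acked
    return pending
```

A non-empty result tells the application exactly which nodes did not confirm, which is the information it needs to decide whether an equipment shutdown is required.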
Efficient duplicate packet delivery. In the industrial environment, some transactions are inherently not idempotent. (Idempotence is the property of certain computer-based operations that they can be applied multiple times without changing the result beyond the initial application.)
Let’s say that an electricity customer is on a pre-pay contract with the utility, and the customer adds money to her/his account. The additional credit is transferred to the customer’s meter, but the meter acknowledgement is lost. The utility re-sends the “add credit” message. Correct behavior would dictate that the meter add the credit only one time.
To avoid such situations, the protocol stack must support duplicate packet detection and resend the previously generated response without reprocessing or regenerating it.
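The pre-pay meter case can be sketched with a small response cache keyed by sender and transaction identifier. This is an illustration under assumptions, not a description of a specific stack: the `DuplicateFilter` class, the transaction identifier, and the bounded cache size are all hypothetical.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class DuplicateFilter:
    """Process each (sender, transaction id) once and replay the cached
    response for duplicates. Names and cache policy are assumptions."""

    def __init__(self, handler: Callable[[Any], Any], capacity: int = 64):
        self._handler = handler
        self._capacity = capacity
        self._responses: "OrderedDict[Hashable, Any]" = OrderedDict()

    def receive(self, sender: Hashable, txn_id: Hashable, payload: Any) -> Any:
        key = (sender, txn_id)
        if key in self._responses:
            # Duplicate: re-send the earlier response without reprocessing.
            return self._responses[key]
        response = self._handler(payload)
        self._responses[key] = response
        if len(self._responses) > self._capacity:
            # Bound RAM use by forgetting the oldest transaction.
            self._responses.popitem(last=False)
        return response
```

With a handler that adds credit to a meter, delivering the same "add credit" message twice changes the balance once and returns the same acknowledgement both times.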
Message overload. In control systems, sometimes nodes are synchronized to an external event that causes a flood of messages (for instance: the oil refinery is about to catch fire). Not all those messages are important in dealing with the external event, but some messages that could help avoid the impending problem must be propagated quickly across the network. To control this, the protocol stack must support a mechanism that allows emergency messages to be routed in an expedited manner to overcome queuing delays within the nodes as well as queuing delays in routers between links.
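Expedited routing of emergency messages can be approximated inside a node or router by a priority queue in which emergency traffic always leaves first and messages of equal priority keep their arrival order. The sketch below is a simplified illustration; the `ExpeditedQueue` class and its two priority levels are assumptions.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class ExpeditedQueue:
    """Outbound queue in which emergency messages bypass normal traffic.
    Two priority levels are assumed here for illustration."""

    EMERGENCY = 0
    NORMAL = 1

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        # Sequence numbers keep first-in, first-out order within a priority.
        self._seq = itertools.count()

    def put(self, message: Any, priority: int = NORMAL) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def get(self) -> Any:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

For example, if routine temperature and status messages are already queued when a fire alarm arrives, the alarm is the next message returned by `get()`.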
Node response times. Most control systems have supervisory nodes that ping the status of all nodes in the network, and drive an operator display of the system health. In this operation, if a node is down, the update of the entire display will halt until communication with the down node times out after some number of retries, unless the protocol supports having multiple responses outstanding and a means to correlate those responses to original requests. This can be dealt with through the use of a protocol stack that supports a sender node communicating with its peers in sequence without waiting for the response from one node to arrive before going on to the next one.
Physical connectivity
No single link meets all communication needs for the Industrial Internet of Things. Today multiple RF links, multiple power line links, and a variety of wired solutions are needed to implement the various applications. Furthermore, transceiver development is an area of active research and investment, so the protocol stack must be able to take advantage of new technologies as they become available. To make this possible, the protocol stack must be independent of the underlying MAC/PHY.
Today, the vast majority of control networks are not secured. As the world moves from hardwired control systems and closed, unconnected networks to networks that can be connected to the Internet without gateways, strategies for protection on the Internet can be repurposed to attack problems with the Internet of Things.
Fortunately, NIST has created a widely accepted set of best practices for securing systems from cyber-attacks. These best practices have been vetted worldwide and form the basis for effective security policy management. One of the best ways to secure the communications system is to follow the guidelines in NIST’s FIPS 140-2 Level 1, at a minimum.
Requirements for IIoT control services
Without a uniform set of communications services, nodes could not be successfully integrated into a cohesive network. Multiple, conflicting subsets of capabilities would make it cost-prohibitive to integrate nodes from multiple sources. Achieving this in the IIoT environment requires the developer to evaluate the control services requirements within the context of the following:
- memory and performance tradeoffs;
- network scalability;
- ad hoc self-organizing capabilities;
- data logging;
- backwards compatibility;
- node data exchange;
- interoperability; and
- operator access to nodes.
Memory/performance tradeoffs. When application developers pick a platform, most expect to be able to use RAM for their applications and not have it all devoted to the needs of the stack. But in the world of low-cost systems-on-a-chip (SoCs), RAM is the most precious resource. In communications applications, RAM is needed to know when to retry a packet, to detect a duplicate packet, to put packets in correct order for delivery to the application, etc.
Unfortunately these memory requirements are all in direct competition with what the application needs for its memory. The more memory used by the communications, the less the application can use. Given that these nodes are in a cost-sensitive environment, the use of SoCs is required to meet the cost constraints of the application.
Therefore, the required services must fit within the memory and performance capabilities of all devices. To meet this requirement, memory and RAM consumption for the protocol stack must be limited to provide the application with adequate RAM as well.
Scalability. Many building and factory systems today are composed of well more than 1000 nodes spread out among multiple links with a high-speed backbone. As IP is pushed to every device, these networks will only get larger. To satisfy this requirement, the protocol stack should scale to thousands of nodes and multiple links of different speeds in a single logical network. It must also support network-wide (spanning all links within the system) multicast group membership, so that applications do not see every packet and consume the node’s resources discarding packets that are not addressed to them.
Multicast conserves bandwidth and improves response time over multiple serial unicast messages. When closing a control loop over a network, all nodes that subscribe to a sensor value should get that value nearly at the same time.
Applications cannot be constrained to have all the nodes in their multicast group on their link, since some content, like emergency messages, must go to most or even all of the nodes on the network.
Bandwidth versus response time. For response-time reasons, nodes cannot wait to be granted access to the network by a server, nor can they go through a lengthy session set-up sequence with a server. For finely distributed systems, the nodes must interact as groups performing a function across the network, the way light dimmers control lights. For this reason, the protocol stack must support peer-to-peer communications. In addition, it must be possible to provision timers in the protocol stack to indicate when to re-send a packet that has not been confirmed. These timers should be individually provisioned according to the destination address in the packet. This can improve response time and limit bandwidth consumption.
How quickly a node should retry a message is a function of the round-trip delay to/from the destination address. Retrying too early wastes bandwidth and may cause network congestion and packet collisions. Retrying too late makes response time suffer when a packet is lost.
In a network composed of multiple links of differing speeds, a resource-constrained node cannot be expected to know the round-trip delays to all the subscribers of its data. Therefore, for large networks, a node with topology knowledge needs to provision these parameters.
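Per-destination retry timers can be pictured as a small table that a topology-aware node fills in and that the sending node consults before each retransmission. The sketch below is illustrative only: the `RetryTimerTable` class, the millisecond units, and the safety margin applied to the round-trip delay are assumptions rather than values from any specification.

```python
from typing import Dict


class RetryTimerTable:
    """Retry timeouts keyed by destination address (hypothetical structure).

    A node with topology knowledge calls provision(); the sender only
    looks values up, so it needs no knowledge of link speeds itself.
    """

    def __init__(self, default_timeout_ms: float):
        self._default = default_timeout_ms
        self._by_destination: Dict[str, float] = {}

    def provision(self, destination: str, round_trip_ms: float,
                  margin: float = 1.5) -> None:
        # The margin (an assumed value) avoids retrying before a response
        # could reasonably have arrived over the given path.
        self._by_destination[destination] = round_trip_ms * margin

    def timeout_for(self, destination: str) -> float:
        return self._by_destination.get(destination, self._default)
```

A peer on the local link might be provisioned with a 20 ms round trip and a node across a slow power line link with 400 ms, so that each gets a retry interval suited to its path while unprovisioned destinations fall back to the default.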
Ad hoc self-organization. Some IIoT networks must be ad hoc in their formation and allow nodes to come and go without a management station. Therefore it must be possible to discover the application-level information that a node can publish over the network. In this way, a network can self-organize and begin to perform an application function with minimal, if any, human involvement.
Data-logging and provisioning. Data logging is a common control network function. Systems need to upgrade firmware, and provisioning such as linearization tables for sensors and calibration data is often needed. Therefore the protocol must have a means to transfer a sequence of packets as a logical unit, like a firmware upgrade, a data log, or provisioning information.
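Transferring a sequence of packets as one logical unit amounts to numbering the pieces on the sender and accepting the unit on the receiver only when every piece is present. The following is a minimal sketch of that idea; the `(index, total, chunk)` packet layout and the function names are assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional, Tuple

Packet = Tuple[int, int, bytes]  # (index, total, chunk): assumed layout


def segment(data: bytes, chunk_size: int) -> List[Packet]:
    """Split data into numbered packets of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = (len(data) + chunk_size - 1) // chunk_size
    return [(i, total, data[i * chunk_size:(i + 1) * chunk_size])
            for i in range(total)]


def reassemble(packets: Iterable[Packet]) -> Optional[bytes]:
    """Rebuild the unit, or return None if any packet is missing."""
    received = list(packets)
    if not received:
        return None
    total = received[0][1]
    chunks = {index: chunk for index, count, chunk in received
              if count == total}
    if set(chunks) != set(range(total)):
        return None  # incomplete: the unit is not delivered at all
    return b"".join(chunks[i] for i in range(total))
```

Because the receiver keys on the packet index, packets may arrive out of order, and a partial firmware image is never handed to the application.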
Backwards compatibility. Control networks are long-lived, as long as 20+ years in many cases, yet they are networks, so additional nodes and additional applications are added over time. It is unreasonable, and often prohibitively costly, to upgrade all the existing nodes in a network to the new version. The existing nodes may not have enough memory, or might face other constraints. Therefore future versions of the protocol must work with prior versions and provide all the same capabilities as prior versions.
Interoperability. Resource-constrained nodes on an IIoT cannot support parsers of data streams. Today’s XML parsers are code-space and RAM intensive, so there must be some lightweight application interoperability model to facilitate the exchange of data between publishers and subscribers.
Access to nodes. While Industrial IoT systems operate autonomously, they also need to present data and status to system operators. To facilitate this with modern Web user interface (UI) technology, the stacks must support a RESTful API to provide system data. RESTful API was part of the 6LoWPAN wireless networking capability added to IPv6 in 2007. It is well supported by tools and developers and allows system access through ordinary browsers.
Robert Dolin, Echelon system architect as well as Vice President and Chief Technology Officer, has worked for the company since 1989. He is the principal or co-inventor of fourteen Echelon patents, and is one of the designers of the LonWorks protocol, the network development system environment, the Neuron C programming model, and LonWorks network management. In 1995 he was named chief technology officer. Before joining Echelon, Dolin spent 11 years at ROLM Corporation, where he was one of the main developers of its fully distributed PBX telephone system. He also held positions of first- and second-line management as well as system architecture. Dolin has a B.S. degree in Electrical Engineering and Computer Science from the University of California at Berkeley.