Building an IoT for industrial control: Part 1 – What is Industrial IoT?
Since IPv6 was specified in the late 1990s and the 6LoWPAN adaptation layer for low-power wireless networks was released in 2007, virtually any device, wired or wireless, can be connected over IP. Such networks, known as the “Internet of Things,” are now generating interest among developers of industrial-control networks. Today, these industrial systems connect to both external and internal Internet Protocol (IP) networks through gateways that require custom provisioning and programming to expose the necessary data to the enterprise systems. Invariably, the gateway constrains what information can pass back and forth, and its configuration is difficult to evolve to support new requirements. To provide the benefits of a common IP communications infrastructure, the advanced communications requirements of these systems must be addressed.
The Industrial Internet of Things (IIoT), however, is just one of three main classes of IP-enabled connected devices. The two other main categories are consumer and machine-to-machine devices (Figure 1).
Figure 1: Currently there are three main Internet Protocol (IP) enabled Internet of Things (IoT) categories: consumer, machine to machine and industrial machines to machines.
Consumer IoT. Unlike the industrial environment, consumer IoT is non-real-time and non-deterministic and characterized by a human interacting with a device. Viewing a video on a cell phone, or starting up an exercise monitor to send your statistics to your account in the cloud, are examples of consumer IoT applications. In case of failure, a human is there to recover or restart the application. In the consumer IoT, communications run between client/server and are often streaming large amounts of data.
This is very different from IIoT, where, in terms of reliability and determinism, the requirements are a superset of the main IoT requirements. This segment of the IoT market includes two main categories: Machine to Machine (M2M) application monitoring and a superset of traditional M2M called Machines to Machines networking for autonomous, peer-to-peer distributed control.
Machine to Machine. Typical examples of M2M’s client/server-based application-monitoring architecture are vehicle-tracking systems, systems that monitor a building’s mechanisms for signs of wear, and systems that track mobile hospital equipment. This class of applications uses client/server communications and sends smaller amounts of data. For example, a data record might include the device identifier, position coordinates, and a time stamp.
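To make the size of such a record concrete, here is a minimal sketch of how a tracking record could be packed for transmission. The field widths, byte order, and the TrackingRecord name are assumptions chosen for illustration and do not come from any particular M2M standard.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire layout (an assumption, not a standard): big-endian,
# 4-byte device identifier, 8-byte latitude, 8-byte longitude,
# 4-byte Unix time stamp in seconds.
RECORD_FORMAT = ">IddI"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 24 bytes, no padding


@dataclass(frozen=True)
class TrackingRecord:
    device_id: int
    latitude: float
    longitude: float
    timestamp: int

    def pack(self) -> bytes:
        """Serialize the record into its fixed-size wire form."""
        return struct.pack(
            RECORD_FORMAT,
            self.device_id,
            self.latitude,
            self.longitude,
            self.timestamp,
        )

    @classmethod
    def unpack(cls, payload: bytes) -> "TrackingRecord":
        """Rebuild a record from bytes produced by pack()."""
        return cls(*struct.unpack(RECORD_FORMAT, payload))
```

Under these assumptions the whole record fits in 24 bytes, which is tiny compared with the streaming traffic typical of consumer IoT.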
Most important in these communications is reliability, because there is no human operator or user to aid in recovery from error. Reliability also matters because the tracked items are valuable, as is the knowledge of where they are at any given moment. Cost is incurred when the information is not available or is unreliable.
Much more demanding than either consumer IoT or traditional M2M are Industrial IoT applications (Figure 2), a communications-intensive class of Machines-to-Machines networking in which the application uses autonomous, peer-to-peer distributed control.
Figure 2: The Industrial Internet of Things is characterized by many-to-many connections where groups of nodes work together on a single task.
Using the plural ‘machines’ versus the singular ‘machine’ is important because in these many-to-many applications, groups of nodes work together to accomplish a single task. For example, a baggage-handling machine in an airport senses luggage moving on a conveyor belt. It identifies the luggage by reading a bar code and then nudges the suitcase to the correct next conveyor belt based on the bar code.
Then, further along, another node makes a routing decision as multiple conveyor belts converge. For these systems, the communications requirements are not merely client/server. Instead, the nodes act as peers on the network, each making decisions and reporting status to the other nodes.
Data transfers are frequent, but typically do not convey large amounts of data. A message may merely convey temperature or the pressure or the status of a switch. Often, these systems process their tasks at rates greater than a human could, so they must run reliably and safely without human intervention. Communication failures risk added costs or can even threaten human safety.
Besides performing their main task, these systems also connect to an enterprise system to issue alarms, archive historical data, and provide a basis for performing analytics on the data. This connection can work via a local IP connection or can be hosted in the cloud. When communicating with the enterprise system, the communications model reverts to the client/server M2M model discussed above.
Systems performing these demanding industrial applications have been available for some time. However, they have often depended on hard wiring and purpose-built communications protocols for exchanging data and status between nodes. But now recent advances in Low Power Wireless, Power Line, and high-speed multi-drop twisted-pair communications technologies, coupled with more compact implementations of IP, are enabling a migration to IP-based nodes for even the smallest and most cost-sensitive nodes within these systems.
Requirements for the Industrial Internet of Things
Similar to the broader IoT market, the Industrial IoT market needs inexpensive nodes that work on easy-to-install links such as wireless, power line, and simple twisted pair. These links are not always as reliable as traditional data communication links, so the problems of error detection and reliability are common to the entire IoT. The consequences of communications failures are much worse in the Industrial IoT, however, because of the investment returns expected from flawless operation.
As with the broader IoT market, the IIoT needs an IP-based protocol stack with a rich set of services, so that the same stack can be used across the entire Industrial IoT application space and application developers can depend on a common set of communications services as they implement IIoT applications. Because the stakes are high when industry is involved, the Industrial IoT space has a very specific set of performance and reliability requirements that must be satisfied in most (but not all) cases, including:
- Resilience in the face of failures
- Physical connectivity requirements
- Control services
Resilience in the face of failures
Packet recovery. The new Low Power Wireless and Power Line links that these networks run on have very low bandwidth compared to Ethernet connections and, unlike Ethernet, the links are not nearly as reliable.
Packets can be lost due to interference, noise, or even collisions. When these events happen, and especially if they happen frequently, recovering from the loss through packet retransmission consumes additional bandwidth. Because these systems typically have real-time constraints, delivering a packet well beyond the application’s timing constraints is not important or even desirable. In such an environment the IIoT protocol stack must either recover from intermittent packet loss quickly via packet retransmission or report a message failure to the application.
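The retransmit-or-report behavior can be sketched as follows. This is a minimal illustration, not any particular stack’s implementation: send_once is a hypothetical callback that transmits the payload and returns True when an acknowledgement arrives, and the deadline and retry interval are parameters the application would have to choose.

```python
import time
from typing import Callable


class MessageFailure(Exception):
    """Reported to the application when delivery misses its deadline."""


def send_with_deadline(
    send_once: Callable[[bytes], bool],
    payload: bytes,
    deadline_s: float,
    retry_interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retransmit until acknowledged, or fail once the deadline is out of reach.

    Returns the number of attempts made; raises MessageFailure otherwise.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        # Hypothetical transport call: True means an acknowledgement arrived.
        if send_once(payload):
            return attempts
        # Give up if the next retry could not start before the deadline;
        # a late packet has no value to a real-time application.
        if clock() - start + retry_interval_s >= deadline_s:
            raise MessageFailure(f"no acknowledgement after {attempts} attempts")
        sleep(retry_interval_s)
```

Either outcome leaves the application with a definite answer within the deadline, which is the property the text asks for.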
Real-time requirements. On a factory floor, a material handler might drop something; in a semiconductor fab line, a wafer handler might fail to place a wafer on a probe station. Late packets mean communication failures in most control systems - there is no benefit in delivering a packet late.
To deal with this, the communications network must be engineered such that the real-time requirements of the application are met. This involves being able to:
- Design the network to meet response-time criteria by limiting the number of nodes per link, and tune the communications such that the network will not become overloaded
- Specify that a given communications transaction will either succeed or fail within a specified time, as well as guarantee that the success or failure of that transaction will be known to the application
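A rough back-of-the-envelope version of both points is sketched below. The formulas are deliberately simplified assumptions: they ignore acknowledgements, retransmissions, and media-access overhead, and the function names and default utilization ceiling are illustrative only.

```python
import math


def max_nodes_per_link(
    link_bps: float,
    message_bytes: int,
    messages_per_node_per_s: float,
    max_utilization: float = 0.5,
) -> int:
    """Estimate how many nodes a link can carry without becoming overloaded.

    max_utilization is an assumed ceiling that leaves headroom for
    retries and traffic bursts.
    """
    bits_per_node_per_s = message_bytes * 8 * messages_per_node_per_s
    usable_bps = link_bps * max_utilization
    return math.floor(usable_bps / bits_per_node_per_s)


def worst_case_transaction_ms(attempt_timeout_ms: int, max_retries: int) -> int:
    """Upper bound on the time before a transaction is known to have failed."""
    return attempt_timeout_ms * (1 + max_retries)
```

With the illustrative figures of a 100 kbit/s link, 50-byte messages, two messages per second per node, and a 50% utilization ceiling, the estimate is 62 nodes. A 100 ms per-attempt timeout with three retries bounds a transaction at 400 ms.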
Failure resistance. A main purpose of distributing control is to make it nearly impossible for the entire system to fail. Building single points of failure into the communications infrastructure - such as non-redundant routers, switches, or communication transceivers that can fail in such a way as to take down the entire link - defeats an important purpose of a distributed system. So in whatever protocol is used, there must be no single failure that can take down the communications for an entire link.
Reliable network-wide delivery. In applications where the message must get through or a major equipment shutdown is required, the sending node must have confirmation that its message was received by all the members of the group (Figure 3). This requirement can be addressed by making sure the protocol used supports network-wide (spanning all links within the system) confirmed multicast messaging.
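On the sending side, confirmed multicast amounts to tracking which group members have acknowledged a given message. The sketch below shows only that bookkeeping; the ConfirmedMulticast class and its method names are hypothetical, and the actual transmission and retry logic is left out.

```python
from typing import Iterable


class ConfirmedMulticast:
    """Tracks acknowledgements for one multicast message."""

    def __init__(self, message_id: int, group: Iterable[str]):
        self.message_id = message_id
        self.pending = set(group)  # members that have not yet confirmed
        self.confirmed = set()     # members that have confirmed

    def on_ack(self, message_id: int, node: str) -> None:
        """Record an acknowledgement, ignoring stale or duplicate ones."""
        if message_id != self.message_id or node not in self.pending:
            return
        self.pending.discard(node)
        self.confirmed.add(node)

    @property
    def complete(self) -> bool:
        """True only when every group member has confirmed receipt."""
        return not self.pending
```

If complete is still False when the transaction’s time limit expires, the sender knows exactly which members in pending to retry or report as unreachable.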
Efficient duplicate packet detection. In the industrial environment, some transactions are inherently not idempotent. (An idempotent operation is one that can be applied multiple times without changing the result beyond the initial application.)
Let’s say that an electricity customer is on a pre-pay contract with the utility, and the customer adds money to her/his account. The additional credit is transferred to the customer’s meter, but the meter acknowledgement is lost. The utility re-sends the “add credit” message. Correct behavior would dictate that the meter add the credit only one time.
To avoid such situations, the protocol stack must support duplicate packet detection and resend the previously generated response without reprocessing or regenerating it.
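The pre-pay meter case can be sketched with a small response cache keyed by sender and transaction identifier. The class names, the key layout, and the cache capacity are assumptions made for illustration; a real stack would define its own transaction identifiers and cache lifetime.

```python
from collections import OrderedDict


class DuplicateFilter:
    """Runs each transaction once and replays the stored response for duplicates."""

    def __init__(self, handler, capacity=64):
        self._handler = handler
        self._capacity = capacity  # assumed bound on remembered transactions
        self._responses = OrderedDict()

    def receive(self, sender, transaction_id, request):
        key = (sender, transaction_id)
        if key in self._responses:
            # Duplicate: resend the earlier response without reprocessing.
            return self._responses[key]
        response = self._handler(request)
        self._responses[key] = response
        if len(self._responses) > self._capacity:
            self._responses.popitem(last=False)  # forget the oldest entry
        return response


class PrepayMeter:
    """Toy meter whose add_credit operation is not idempotent."""

    def __init__(self):
        self.balance = 0

    def add_credit(self, amount):
        self.balance += amount
        return ("credit added", self.balance)
```

If the utility re-sends an “add credit” message with the same transaction identifier, receive returns the cached acknowledgement and the meter’s balance is left unchanged.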
Message overload. In control systems, sometimes nodes are synchronized to an external event that causes a flood of messages (for instance: the oil refinery is about to catch fire). Not all those messages are important in dealing with the external event, but some messages that could help avoid the impending problem must be propagated quickly across the network. To control this, the protocol stack must support a mechanism that allows emergency messages to be routed in an expedited manner to overcome queuing delays within the nodes as well as queuing delays in routers between links.
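Within a single node or router, one simple way to expedite emergency traffic is a priority queue that always releases emergency messages before normal ones. The sketch below assumes just two priority levels and hypothetical names; it shows the queuing idea only, not how priority is signalled on the wire.

```python
import heapq
import itertools

EMERGENCY = 0  # lower value is served first
NORMAL = 1


class ExpeditedQueue:
    """Outbound queue in which emergency messages bypass normal traffic."""

    def __init__(self):
        self._heap = []
        # The counter preserves first-in, first-out order within a priority.
        self._sequence = itertools.count()

    def put(self, message, priority=NORMAL):
        heapq.heappush(self._heap, (priority, next(self._sequence), message))

    def get(self):
        """Return the next message to transmit."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

A message queued with EMERGENCY priority is transmitted ahead of any backlog of routine status updates, regardless of when it arrived.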
Node response times. Most control systems have supervisory nodes that poll the status of all nodes in the network and drive an operator display of the system health. If a node is down and the supervisor queries one node at a time, the update of the entire display halts until communication with that node times out after some number of retries. The protocol stack can avoid this by supporting multiple outstanding requests and a means to correlate each response with its original request, so that a sender node can query its peers in sequence without waiting for the response from one node to arrive before going on to the next one.
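The idea of multiple outstanding requests with response correlation can be sketched as a small poller. Everything here is illustrative: send is a hypothetical non-blocking transmit callback, request identifiers are simple counters, and the caller supplies the current time.

```python
import itertools


class StatusPoller:
    """Queries every node without waiting for earlier responses."""

    def __init__(self, send, timeout_s):
        self._send = send          # hypothetical: send(node, request_id), non-blocking
        self._timeout_s = timeout_s
        self._ids = itertools.count(1)
        self._pending = {}         # request_id -> (node, deadline)
        self.status = {}           # node -> last known health

    def poll_all(self, nodes, now):
        """Send a status request to each node in sequence."""
        for node in nodes:
            request_id = next(self._ids)
            self._pending[request_id] = (node, now + self._timeout_s)
            self._send(node, request_id)

    def on_response(self, request_id, health):
        """Correlate a response with its original request."""
        entry = self._pending.pop(request_id, None)
        if entry is not None:
            self.status[entry[0]] = health

    def expire(self, now):
        """Mark nodes whose requests have timed out."""
        for request_id, (node, deadline) in list(self._pending.items()):
            if now >= deadline:
                del self._pending[request_id]
                self.status[node] = "no response"
```

Because responses are matched by request identifier, they can arrive in any order, and a single unresponsive node delays only its own entry on the display.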