There are various circumstances in which an embedded system needs to deal with data that is presented as a stream of bits. This is not, in itself, a problem, but bits only come in two flavors - 0 and 1. This makes it difficult to manage the flow of data, as there is no easy way to determine the end of a packet or message, for example. When data traffic was controlled by hardware, this was addressed using a "long 1" or "long 0", but this is rarely an option with modern systems. So, network protocol designers have developed techniques that involve adding extra bits in appropriate ways. This is commonly called "bit stuffing". This article takes a look at those techniques and how they work.
This article is intended for developers who are not networking specialists. If you are an expert in this area, the ideas will be very familiar territory. After some years of working with embedded systems, you are likely to have acquired a certain amount of knowledge about protocols, so learning about a new one is not starting from scratch. For many, networking begins and ends with TCP/IP. However, there are lots of other Internet protocols - FTP, UDP and HTTP, for example. There are also other kinds of connectivity that may or may not be thought of as networking - WiFi, Bluetooth and USB, for example.
It was while studying the operation of the last of these, USB, that a technique showed up that was familiar in form, but unfamiliar in application: bit stuffing.
The term "bit stuffing" broadly refers to a technique whereby extra bits are added to a data stream, which do not themselves carry any information, but either assist in management of the communications or deal with other issues.
Bit stuffing is all about protocol management. Imagine you have a stream of binary data and you periodically want to include a marker to say that a data set has finished. Obviously, you can use a particular bit sequence, but how do you recognize it when any sequence of bits might occur within the data stream? This is where bit stuffing comes in.
For example, we can define a rule that no more than five 1s can occur in a row in the transmitted data. To make this work, the sending software/hardware will insert (stuff) an extra 0 after any sequence of five 1s. It can then send a sequence of six 1s specifically to indicate an end of data set. The receiver, on seeing a sequence of five 1s, checks the next bit. If it is 0, the bit is simply discarded; if it is 1, then it notes that an end of data set has been flagged. This technique is very flexible and can be adapted to various circumstances. It is broadly the same idea as using an "escape" character in byte-oriented protocols. Here is an example:
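The following Python sketch implements the rule just described; the function names are purely illustrative. One practical detail is added beyond the description above: the sender places a single 0 immediately before the six-1s marker, so that any 1s at the end of the data cannot merge with the marker and confuse the receiver - the same trick the HDLC flag pattern (01111110) uses.

```python
def stuff_bits(data):
    """Stuff a 0 after every run of five 1s, then append the end marker.

    'data' is a list of 0/1 integers; returns the transmitted bit stream.
    """
    out = []
    ones = 0
    for b in data:
        out.append(b)
        if b == 1:
            ones += 1
            if ones == 5:
                out.append(0)   # stuffed bit: carries no information
                ones = 0
        else:
            ones = 0
    out.append(0)               # delimiter 0 so trailing data 1s cannot
                                # run into the marker (HDLC-style)
    out.extend([1] * 6)         # end-of-data marker: six 1s in a row
    return out


def unstuff_bits(stream):
    """Recover the original data from a stuffed bit stream.

    A 0 following five 1s is a stuffed bit and is discarded;
    six 1s in a row signal end of data.
    """
    out = []
    ones = 0
    i = 0
    while i < len(stream):
        b = stream[i]
        i += 1
        if b == 1:
            ones += 1
            if ones == 6:           # end-of-data marker seen
                break
        else:
            if ones == 5:
                out.extend([1] * 5)  # five 1s were data; drop the stuffed 0
            else:
                out.extend([1] * ones)
                out.append(0)        # an ordinary data 0
            ones = 0
    if out and out[-1] == 0:
        out.pop()                    # remove the sender's delimiter 0
    return out
```

For instance, stuffing six consecutive 1s produces `1 1 1 1 1 0 1` followed by the delimiter and marker, and unstuffing that stream returns the original six 1s. Note that the sketch returns whatever it has decoded if the stream ends without a marker; a real implementation would treat that as a framing error.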