How to mitigate packet loss in wireless USB networking applications
By Bart Vertenten and Sun Yuxi
Embedded.com
(10/07/08, 06:00:00 AM EDT)
Wired USB is a very reliable networking medium in which collisions cannot happen, so almost no packets are lost or corrupted. As a result, interoperability testing has never exposed the corner cases that arise when packets are lost.

The scenario is different, however, for the emerging wireless USB connection. Due to the likelihood of collisions in the wireless medium, care must be taken when migrating an application from wired to wireless USB.

This article addresses an issue that can occur in both wired and wireless USB. Because the chance of losing packets in a wireless medium is much greater, however, the problem is far more likely to appear in wireless USB than in wired USB. The issue is "the danger of losing one packet."

Losing even one packet can cause serious problems when a developer overlooks the issue. This article aims to help the wireless USB developer understand the problem and provide some clues for the implementation of a solution.

It begins with a detailed description of the problem followed by an analysis of why it becomes a problem in wireless USB when it is not a problem in wired USB. At the end of the article, some clues will be given for the solution of this issue.

Figure 1: When a device needs to send a complete buffer of data to the host and the handshake sent by the host for the last packet is lost, implementation problems may occur.

When it happens
This particular problem occurs when the device needs to send a complete buffer of data to the host and the handshake sent by the host for the last packet (or the last several packets in a burst) of the transfer is corrupted or lost in the air.

Because the host is only required to send the handshake a minimum number of times, there is a possibility that the micro-scheduled management commands (MMCs) that carry the handshake for the last bulk IN packet (or the last several packets in the burst) are never seen by the device.

In this case, the device must keep that part of the data in its buffer until it sees the handshake, if one ever arrives, in order to maintain data integrity. This situation causes problems in the implementation, as illustrated in Figure 1 above.

The device will not raise any interrupt or report any event to tell the higher-layer software that the data was received correctly by the host. In fact, it should not: as far as the device knows, the last part of the data in this transfer has not been successfully delivered, and the host could poll for it again.

If the state machine of the higher-layer protocol (e.g., the class driver) requires this explicit handshake or interrupt to move to the next stage, the whole state machine gets stuck. The result is a deadlock from which the device implementation cannot recover.

Bear in mind that this problem of losing one packet is not unique to wireless USB; it can also happen in wired USB, and designers must take care of it there as well. The difference is that the wireless medium is much less reliable than the wired one.

Figure 2: Some traces are shown to illustrate the problem for mass-storage that occurs when the handshake for the IN is not seen by the device.

Packets are lost far more often than in wired USB, where they are almost never lost. If designers overlook the problem, the chance that it will be found on a wired system is very low, but it is quite high in wireless USB. Let's take a practical example. In a mass-storage device implementation, the following sequence is repeated many times:

CBW: bulk OUT, 31 bytes
Data stage: bulk IN or bulk OUT, multiple transactions of maximum packet size
CSW: bulk IN, 13 bytes
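
For reference, the 31-byte and 13-byte sizes follow from the fixed layouts of the Command Block Wrapper (CBW) and Command Status Wrapper (CSW) in the USB Mass Storage Bulk-Only Transport specification. The Python sketch below packs both wrappers. The helper names build_cbw and build_csw and the sample tag value are illustrative only and do not come from the article.

```python
import struct

# Little-endian layouts from the USB Mass Storage Bulk-Only Transport spec.
CBW_FORMAT = "<IIIBBB16s"   # signature, tag, transfer length, flags, LUN, CB length, CB
CSW_FORMAT = "<IIIB"        # signature, tag, data residue, status
CBW_SIGNATURE = 0x43425355  # reads "USBC" on the wire
CSW_SIGNATURE = 0x53425355  # reads "USBS" on the wire


def build_cbw(tag, transfer_length, direction_in, lun, command):
    """Pack a CBW; `command` is the SCSI command block (1 to 16 bytes)."""
    if not 1 <= len(command) <= 16:
        raise ValueError("command block must be 1 to 16 bytes")
    flags = 0x80 if direction_in else 0x00  # bit 7 set means data stage is IN
    # struct pads the 16s field with zero bytes when the command is shorter.
    return struct.pack(CBW_FORMAT, CBW_SIGNATURE, tag, transfer_length,
                       flags, lun, len(command), command)


def build_csw(tag, residue, status):
    """Pack a CSW; status 0 = passed, 1 = failed, 2 = phase error."""
    return struct.pack(CSW_FORMAT, CSW_SIGNATURE, tag, residue, status)


# Hypothetical example: a 6-byte TEST UNIT READY command with no data stage.
cbw = build_cbw(tag=0x1234, transfer_length=0, direction_in=False,
                lun=0, command=bytes(6))
csw = build_csw(tag=0x1234, residue=0, status=0)
print(len(cbw), len(csw))
```

The CSW echoes the tag of the CBW it answers, which is how the host pairs a status with its command.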

For mass storage in particular, this problem can occur when the handshake for the 13-byte CSW IN transfer is not seen by the device. Some traces are shown in Figure 2 above to illustrate this problem.

In Figure 3 below, wireless USB packet 291 is an MMC from the host that includes a WdtCTA used as the acknowledgment for the previous IN data packet (wireless USB packet 288). Only when the device receives this MMC can it claim that the particular IN data packet has been received by the host. In this particular case, the device cannot claim that the CSW packet has been successfully delivered to the host until it receives the acknowledgment embedded in packet 291. But what if this packet gets lost in the air? This particular host implementation sends the MMC carrying the acknowledgment only once. In the wireless world, the possibility of losing this packet is not negligible.
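
To get a feel for why "not negligible" matters, consider a rough model in which each acknowledgment MMC is lost independently with some probability. The sketch below is only an illustration: the independence assumption, the 1% loss rate, and the count of 1000 commands are made-up figures, not measurements from the traces.

```python
def prob_ack_missed(p_loss, repeats=1):
    """Probability that the device misses every copy of one acknowledgment,
    assuming each copy is lost independently with probability p_loss."""
    return p_loss ** repeats


def prob_any_stall(p_loss, transfers, repeats=1):
    """Probability that at least one of `transfers` CSW acknowledgments
    is missed entirely by the device."""
    return 1.0 - (1.0 - prob_ack_missed(p_loss, repeats)) ** transfers


# Hypothetical figures: 1% loss per MMC, 1000 mass-storage commands.
once = prob_any_stall(0.01, 1000, repeats=1)
thrice = prob_any_stall(0.01, 1000, repeats=3)
print(f"acknowledgment sent once:        {once:.5f}")
print(f"acknowledgment sent three times: {thrice:.5f}")
```

Under these assumed numbers, a device that depends on seeing every CSW acknowledgment is very likely to stall at some point during a long session when the host sends each acknowledgment only once.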

Figure 3: In the wireless world, the possibility of losing packet 291 (an MMC from the host that includes a WdtCTA used as the acknowledgment for the previous IN data packet) is not negligible.

In some traditional USB class drivers written for mass storage, the state machine inside the class driver will only get triggered when it receives a status completion event (in most cases, it is an interrupt from the device controller or an event from lower-level stacks).

More specifically, in the mass-storage state machine, the class driver always expects a completion on the IN pipe for the 13 bytes of CSW before it proceeds any further, for example by moving the state machine to the CBW stage and programming the device controller to receive a CBW on the OUT pipe.

If the ACK for the CSW is lost even though the CSW itself was received by the wireless USB host, the class driver state machines on the host side and the device side fall out of step. The host moves its class state machine to the CBW stage and is ready to send a CBW as soon as it finds the corresponding OUT pipe active.

Unfortunately, enabling the OUT pipe is normally triggered by a software operation in the CBW stage of the device class state machine. This means that no DN_EPRdy for EPxOUT is sent on the bus to indicate that the pipe is active. Meanwhile, the device class state machine remains in the CSW stage, waiting for the CSW to complete (the device has no way of knowing that the host received the CSW packet unless a further transfer is scheduled from the device). This mismatch results in a deadlock.
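
The mismatch can be reproduced with a toy model of the two state machines. The sketch below is a simplification for illustration: the class names, stage labels, and the out_armed flag are hypothetical and do not correspond to any particular controller or stack.

```python
class Host:
    """Host-side class state: moves on as soon as it has the CSW."""

    def __init__(self):
        self.stage = "CSW"

    def receive_csw(self):
        # The CSW arrived, so the host is ready to issue the next CBW.
        self.stage = "CBW"

    def can_send_cbw(self, device):
        # The host only sends the CBW if the device's OUT pipe is active.
        return self.stage == "CBW" and device.out_armed


class NaiveDevice:
    """Device that arms the OUT pipe only after the CSW completion event."""

    def __init__(self):
        self.stage = "CSW"
        self.out_armed = False

    def on_csw_ack(self):
        # Completion event for the CSW: enter the CBW stage and arm OUT.
        self.stage = "CBW"
        self.out_armed = True


def run(ack_delivered):
    """Return (host can send CBW, host stage, device stage)."""
    host, device = Host(), NaiveDevice()
    host.receive_csw()          # the CSW itself reaches the host
    if ack_delivered:
        device.on_csw_ack()     # the handshake MMC is seen by the device
    return host.can_send_cbw(device), host.stage, device.stage


print("ack delivered:", run(True))
print("ack lost:     ", run(False))
```

When the acknowledgment is lost, the host sits in the CBW stage while the device stays in the CSW stage with its OUT pipe disabled, and neither side has an event that would move it forward.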

Device-side management
The class driver on the device side needs to handle this case carefully, with the "packet lost" situation in mind. A typical solution for mass storage is to change the state machine so that the OUT pipe for the CBW is activated inside the CSW stage of the device class state machine; the device then no longer blocks bus activity for the next CBW. This solution addresses the lost ACK for the CSW in the mass-storage class and applies to other, similar situations.

In the wireless USB world, and even in the wired USB world, any device-side class driver whose protocol sends a packet to an OUT endpoint after an IN transfer finishes must enable the reception of OUT packets right after programming the IN transfer. It must not wait until it has seen an IN transfer-complete interrupt before enabling the reception of OUT data.
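
A minimal sketch of this rule follows, using a toy device model in Python. All names here are hypothetical. The sketch also assumes that a CBW arriving while the CSW is still pending can be treated as implicit proof that the host received the CSW; that inference is an assumption of this example, not something the article specifies, and a real controller would additionally have to discard the retained CSW buffer.

```python
class Device:
    """Toy device class driver that arms OUT when it programs the CSW."""

    def __init__(self):
        self.stage = "IDLE"
        self.out_armed = False
        self.csw_pending = False

    def start_csw(self):
        self.stage = "CSW"
        self.csw_pending = True   # program the IN transfer for the CSW
        self.out_armed = True     # arm OUT now; do not wait for completion

    def on_csw_ack(self):
        # Normal path: the handshake for the CSW was seen.
        self.csw_pending = False
        self.stage = "CBW"

    def on_cbw_received(self, cbw):
        if self.csw_pending:
            # Assumption: the host only sends a new CBW after it has the
            # CSW, so treat the pending CSW as delivered.
            self.csw_pending = False
        self.out_armed = False
        self.stage = "DATA"
        return cbw


def next_command(ack_delivered):
    """Return the device stage after the host tries to send the next CBW."""
    device = Device()
    device.start_csw()
    if ack_delivered:
        device.on_csw_ack()
    # The host has the CSW either way; it sends a CBW if OUT is active.
    if device.out_armed:
        device.on_cbw_received(b"next CBW")
    return device.stage


print("ack delivered:", next_command(True))
print("ack lost:     ", next_command(False))
```

In this model the device reaches the data stage of the next command whether or not the acknowledgment for the CSW was seen, because the OUT pipe was already active when the host moved on.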

Bart Vertenten is Chief Architect for Connectivity Solutions, Mobile & Personal Applications, and Sun Yuxi is Senior Systems Engineer in the UWB Group at NXP Semiconductors.