Reliable systems for micro aerial vehicles — Supply resilience

Editor's Note: Embedded designers must contend with a host of challenges in creating systems for harsh environments. Harsh environments present unique characteristics not only in terms of temperature extremes but also in areas including availability, security, very limited power budget, and more. In Rugged Embedded Systems, the authors present a series of papers by experts in each of the areas that can present unusually demanding requirements. A separate excerpt of the book addresses fundamental concerns in reliability and system resiliency.


Adapted from Rugged Embedded Systems: Computing in Harsh Environments, by Augusto Vega, Pradip Bose, and Alper Buyuktosunoglu.

CHAPTER 7. Reliable electrical systems for micro aerial vehicles and insect-scale robots: Challenges and progress (Cont.)
By X. Zhang, Washington University, St. Louis, MO, United States

5 SUPPLY RESILIENCE IN MICROROBOTIC SoC

Among the numerous challenges surrounding SoC design for microrobotic applications, the reliability and performance of the system in the presence of supply noise is one critical problem to be addressed. Like many integrated computing systems, a microrobotic SoC employs synchronous digital logic in its central control unit and is thus susceptible to disturbances on the supply voltage.

However, the crucial weight and form factor constraints set the microrobotic SoC apart from conventional systems. Given the extremely stringent weight budget, extra external components must be avoided at all costs, which leads to the integration of an on-chip DC-DC converter, that is, an integrated voltage regulator (IVR), and the absence of an external frequency reference. With such an IVR powered directly off a discharging battery, the microrobotic SoC experiences supply-noise characteristics different from those of conventional digital systems, so existing supply-noise mitigation techniques cannot be easily applied.

In this section, we describe how an adaptive-frequency clocking scheme is used in our BrainSoC design to exploit the synergy between the IVR and clock generation. The resulting supply-noise resilience and performance improvement have been demonstrated by a prototype SoC developed prior to the BrainSoC [12]. Our proposed adaptive clocking scheme not only delivers better reliability and performance, but also extends error-free operation to a wider battery voltage range, which is beneficial to a microrobotic system.

SIDEBAR: BACKGROUND ON SUPPLY NOISE

Digital computing systems based on synchronous logic circuits typically employ a fixed frequency clock. To guarantee correct operation, final outputs from the datapath must arrive at the next flip-flop stage before the next clock edge by some time margin known as the “setup time”. Since the datapath delay is a function of the supply voltage, it is susceptible to noise on the supply line.

Supply noise is the result of a nonideal power delivery system and load current fluctuation under varying computation workloads. It arises from the parasitic resistance, inductance, and capacitance in the power delivery network, and manifests itself as static IR drop, which is the static voltage drop due to power grid resistance, as well as dynamic L di/dt drop, which is the transient voltage fluctuation caused by the inductance and capacitance in response to load current changes. Also, for systems with integrated switching regulators, the intrinsic voltage ripple of the regulator contributes additional noise to the supply. Supply noise modulates the datapath delay, which may lead to setup-time margin violations and eventually computation errors. To ensure sufficient delay margins under all operating conditions, the most straightforward approach is to lower the clock frequency and provide a “guardband” that tolerates even the worst supply-noise scenario.
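As a rough illustration of what a guardband costs, the sketch below estimates how far a fixed clock must be slowed to survive a worst-case droop. The alpha-power-law delay model, the 1.0 V nominal supply, the 100 mV droop, the 1 ns nominal datapath delay, the 50 ps setup time, and all function names are assumptions chosen for illustration; none of them come from the chapter.

```python
def gate_speed(v, vth=0.3, alpha=1.3):
    """Relative switching speed at supply v (alpha-power law, assumed model)."""
    return (v - vth) ** alpha / v


def datapath_delay_ns(v, nominal_delay_ns=1.0, v_nom=1.0):
    """Datapath delay at supply v, scaled from its delay at the nominal supply."""
    return nominal_delay_ns * gate_speed(v_nom) / gate_speed(v)


def max_safe_frequency_mhz(v_worst, setup_ns=0.05):
    """Highest fixed clock frequency that still meets setup time at v_worst."""
    period_ns = datapath_delay_ns(v_worst) + setup_ns
    return 1e3 / period_ns


f_nominal = max_safe_frequency_mhz(1.0)    # no supply noise at all
f_guarded = max_safe_frequency_mhz(0.9)    # hypothetical 100 mV worst-case droop
guardband_pct = 100.0 * (1.0 - f_guarded / f_nominal)

print(f"nominal: {f_nominal:.0f} MHz, guardbanded: {f_guarded:.0f} MHz, "
      f"loss: {guardband_pct:.1f}%")
```

The fixed clock pays the guardband penalty all the time, even though the worst-case droop may occur only rarely.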

SIDEBAR: SUPPLY RESILIENCE IN CONVENTIONAL COMPUTING SYSTEMS

A conservative design strategy such as a timing guardband is the approach most commonly applied to combat supply noise in conventional computing systems, but it may incur a hefty performance loss and is thus highly undesirable. Instead, a number of alternative techniques have been proposed to mitigate supply noise with less performance penalty. The active management of timing guardband [13] in a prototype IBM POWER7 processor is an example of adaptive clocking: a digital phase-locked loop (DPLL) adjusts the processor core’s clock frequency based on the timing guardband sensed by a critical path timing monitor [14]. Since resonant noise caused by the LC tank formed by the package inductance and the die capacitance has been identified as the dominant component of supply noise in high-performance microprocessors [15], many studies have focused on this particular type of supply noise by proposing adaptive phase-shifting PLLs [16] and clock drivers [17]. Following the duality between clock frequency and supply voltage in synchronous digital systems, the other approach to optimizing performance in the presence of supply noise is to adaptively adjust the voltage level delivered by the power supply for each desired operating frequency. Despite their different implementations, adaptive clocking and adaptive supply follow the same basic approach: a closed feedback loop adjusts the frequency and/or the supply based on the monitored timing margin of the system. Both are therefore subject to the bandwidth limitation of the feedback loop.

In addition to the above-mentioned systems and techniques, there exist other classes of logic implementations, such as asynchronous logic and self-timed logic [18], that do not rely on a global clock for their operation. Unlike synchronous logic, these systems are intrinsically delay-insensitive and thus immune to the negative impact of supply noise. However, these logic implementations lack the full support of standard libraries, IPs, and EDA tools and are thus difficult to incorporate into the digital design flow of a sophisticated SoC.

5.1 UNIQUE RELIABILITY CHALLENGE FOR RoboBee

In many ways, a microrobotic SoC such as the BrainSoC suffers from the same setup-time violations due to supply noise as conventional computing systems, but its weight and form factor constraints present unique design challenges that require supply-noise mitigation techniques different from those employed in a typical microprocessor.

In the BrainSoC, the supply voltage generated by the fully integrated switched-capacitor voltage regulator (SC-IVR), which is directly powered off a battery, can have different noise characteristics from the resonant-noise-dominated supply experienced by the microprocessor. In fact, for the IVR-enabled microrobotic SoC, the worst supply-noise is often caused by sudden load current steps in a very small time scale instead of LC resonance. Therefore, the mitigation techniques developed earlier for slow-changing or periodic supply-noise in microprocessors [13,16,17] are not applicable to the fast-changing supply-noise induced by the load current in a microrobotic SoC.

On the other hand, unlike the microprocessor, RoboBee is a self-sustained autonomous system with no need to synchronize with other systems for communication or I/O transactions. The timing jitter and phase noise requirements of its clock signal can therefore be relaxed from the specifications of typical high-speed I/O interfaces, which demand a clean PLL-generated clock. Moreover, the lack of an external frequency source to serve as the reference signal renders the implementation of a PLL impractical. We therefore conclude that a free-running oscillator is a better candidate as the clock generator for the BrainSoC.

The above discussion explains that the microrobotic SoC differs significantly from conventional digital systems in its integration of a battery-connected IVR and its ability to operate with a free-running clock. The former suggests distinctive supply-noise characteristics dominated by fast load current changes and slow battery discharge, while the latter provides an opportunity for an adaptive clocking scheme.

In the context of the RoboBee’s BrainSoC, the goal is thus to optimize the flight time with respect to the total energy available in its battery and the associated battery discharge profile. More broadly, using the RoboBee system as a platform, we are able to investigate effective supply-noise mitigation techniques that can be applied to more general microrobotic systems with minimal performance penalty. Along this vein, we explore the relative merits of different operational modes offered by the supply regulation mechanisms and the clock generation schemes. Fig. 6 illustrates two possible modes of supply regulation with respect to a typical lithium-ion battery discharge profile. The first (Fig. 6A) is closed-loop operation at a fixed voltage. In this case, with the help of a feedback control loop, the SC-IVR can provide a constant supply voltage that is resilient to changes in the input battery voltage (VBAT) and the output load current (ILOAD). One advantage of fixed-voltage operation is that it provides a relatively constant operating frequency. However, for a target output voltage level (VREF), the SC-IVR’s operating range is limited to VBAT > 4VREF. In contrast, open-loop operation with a variable unregulated voltage (Fig. 6B) exhibits an entirely different set of attributes: with no feedback control, the SC-IVR’s output voltage is roughly one quarter of the input battery voltage, but it varies with both the discharge profile and load fluctuations.
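The two regulation modes can be sketched as a small idealized model. The 4:1 ratio and the VBAT > 4VREF limit come from the text; the function name `sc_ivr_output`, the 0.9 V target, and the sample battery voltages are hypothetical, and conversion losses, ripple, and load dependence are ignored.

```python
CONVERSION_RATIO = 4  # from the text: open-loop output is roughly VBAT / 4


def sc_ivr_output(vbat, vref=None):
    """Idealized SC-IVR output voltage (assumed lossless model).

    vref=None selects open-loop operation. Otherwise the regulator holds
    vref for as long as VBAT > 4 * VREF and returns None once the battery
    has dropped out of the closed-loop operating range.
    """
    if vref is None:
        return vbat / CONVERSION_RATIO
    if vbat > CONVERSION_RATIO * vref:
        return vref
    return None


VREF = 0.9  # hypothetical regulation target, in volts

# Hypothetical sample points along a lithium-ion discharge curve.
for vbat in (4.2, 3.9, 3.7, 3.5, 3.2, 3.0):
    closed = sc_ivr_output(vbat, VREF)
    opened = sc_ivr_output(vbat)
    closed_text = f"{closed:.3f} V" if closed is not None else "out of range"
    print(f"VBAT={vbat:.1f} V  closed-loop: {closed_text}  open-loop: {opened:.3f} V")
```

In this toy model the closed-loop output disappears once VBAT reaches 3.6 V, while the open-loop output keeps following the battery down, which is the wider operating range the open-loop mode trades against a fixed voltage.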


FIG. 6 Illustration of two SC-IVR modes of operation versus a typical battery discharge profile. (A) Closed-loop regulation and (B) open-loop operation.

While open-loop SC-IVR mode allows the system to operate over a wider range, down to the minimum voltage limit of the digital load, performance and energy efficiency depend on the clocking strategy used. The choices are between two clocking schemes: fixed-frequency and adaptive-frequency clocking. Out of the four total combinations, we compare the following three: (1) fixed regulated voltage, fixed frequency; (2) fixed regulated voltage, adaptive frequency; and (3) variable unregulated voltage, adaptive frequency. Intuitively speaking, fixed-frequency (FFIX) clocking requires extra timing margins to account for nonnegligible worst-case voltage ripple, which is an intrinsic artifact of the SC-IVR’s feedback loop; while, alternatively, an adaptive-frequency (FADP) clocking scheme that allows the clock period to track the changes in the supply voltage could offer higher average frequency. Adaptive-frequency clocking also works well for open-loop SC-IVR mode, because it maximizes performance with respect to battery and load conditions.
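The intuition behind comparing schemes (1) and (2) can be checked with a toy calculation. The sketch assumes an alpha-power-law speed model, a 0.9 V regulated output with a ±50 mV sawtooth ripple, and an idealized adaptive clock that always runs at the fastest safe frequency for the instantaneous supply; these values and all names are illustrative assumptions, not measurements from the chapter.

```python
import numpy as np


def gate_speed(v, vth=0.3, alpha=1.3):
    """Relative switching speed at supply v (alpha-power law, assumed model)."""
    return (v - vth) ** alpha / v


def max_frequency(v, f_nom=1.0, v_nom=1.0):
    """Fastest safe clock frequency at supply v, relative to f_nom at v_nom."""
    return f_nom * gate_speed(v) / gate_speed(v_nom)


# One period of a hypothetical sawtooth ripple on a 0.9 V regulated supply.
phase = np.linspace(0.0, 1.0, 1000, endpoint=False)
v_supply = 0.90 + 0.05 * (2.0 * phase - 1.0)   # swings from 0.85 V toward 0.95 V

# Fixed-frequency clocking must be safe at the bottom of the ripple.
f_fix = max_frequency(v_supply.min())

# Adaptive clocking tracks the supply, so its average frequency is the
# time average of the instantaneous safe frequency.
f_adp = max_frequency(v_supply).mean()

gain_pct = 100.0 * (f_adp / f_fix - 1.0)
print(f"FFIX={f_fix:.3f}  FADP(avg)={f_adp:.3f}  gain={gain_pct:.1f}%")
```

Because the safe frequency rises with voltage, the time average always exceeds the worst-case value; how large the gain is depends entirely on the assumed ripple amplitude and delay model.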

5.2 TIMING SLACK ANALYSIS ON ADAPTIVE CLOCKING

The explanation above is intuitive: supply noise erodes the timing margin, and a clock period that tracks the supply voltage can compensate by matching the datapath delay. We now present a more rigorous timing-slack analysis for broadband supply noise with fast transients.


FIG. 7 Simplified diagram of a pipeline circuit.

Fig. 7 shows one stage of a pipeline circuit that is clocked by a free-running digitally controlled oscillator (DCO). The clock signal (CLK) is generated by a clock edge propagating through the delay cells of the DCO, and is then buffered to trigger the flip-flops at the input and output of the datapath. The buffered clock signals are labeled CP1 and CP2. Using a definition similar to that proposed in previous work [4,9], the timing slack can be calculated as:
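Eq. (1) is not reproduced in this excerpt. Judging from the delay terms defined in the next paragraph, it presumably takes the form below, where the datapath delay is written as t_d. Treating the flip-flop setup time as folded into t_d is an assumption made here, not something the text states.

```latex
t_{\mathrm{slack}} = t_{\mathrm{clk}} + t_{\mathrm{cp2}} - t_{\mathrm{cp1}} - t_{d} \tag{1}
```

Read this way, the slack is positive when the second clock edge reaches the capturing flip-flop (at t_clk + t_cp2) after the data launched by the first edge has arrived (at t_cp1 + t_d).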

Here, t = 0 is the time the first clock edge is launched, and it takes t_cp1 to travel through the clock buffers and reach the first flip-flop as CP1. In the meantime, the first clock edge propagates through the delay cells and, at t = t_clk, it completes the round trip in the DCO; the second clock edge is then launched at CLK and takes t_cp2 to reach the second flip-flop as CP2. Instead of resorting to a small-signal approximation, we simply represent the supply voltage as a function of time, v(t). Without loss of generality, let us assume each circuit block (X) has a unique function f_x(v(t)) that measures the rate of propagation delay accumulation as a function of the supply voltage, such that the propagation delay of the circuit, t_x, can be expressed as:
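Eq. (2) is likewise missing from this excerpt. A form consistent with the description of f_x as a rate of delay accumulation is sketched below: the edge leaves block X once the accumulated rate reaches one full delay. The entry time t_0 is a symbol introduced here for illustration.

```latex
\int_{t_0}^{t_0 + t_x} f_x\bigl(v(t)\bigr)\,dt = 1 \tag{2}
```

For a constant supply V this reduces to t_x = 1 / f_x(V), so f_x is simply the reciprocal of the block's delay at that voltage.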

In this way, we can re-write the delay parameters in Eq. (1) as the following:
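Eq. (3) is also absent here. Applying the accumulation form to each block, with integration limits set by the order in which the clock edges traverse the blocks, gives the following reconstruction. It is inferred from the surrounding text rather than copied from the chapter.

```latex
\begin{aligned}
\int_{0}^{t_{\mathrm{clk}}} f_{\mathrm{clk}}\bigl(v(t)\bigr)\,dt &= 1, &
\int_{t_{\mathrm{clk}}}^{t_{\mathrm{clk}} + t_{\mathrm{cp2}}} f_{\mathrm{cp2}}\bigl(v(t)\bigr)\,dt &= 1, \\
\int_{0}^{t_{\mathrm{cp1}}} f_{\mathrm{cp1}}\bigl(v(t)\bigr)\,dt &= 1, &
\int_{t_{\mathrm{cp1}}}^{t_{\mathrm{cp1}} + t_{d}} f_{d}\bigl(v(t)\bigr)\,dt &= 1.
\end{aligned}
```

The limits make the ordering explicit: the CP2 path sees the supply during the DCO round trip and then during the clock driver, while the CP1 path sees it during the clock driver and then during the datapath.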

The important finding derived from Eq. (3) is that ensuring constant positive timing slack under all supply-noise conditions requires matching all the delay accumulation functions (f_clk, f_cp1, f_cp2, f_d), rather than simply those of the DCO and the datapath. Intuitively, this is because the delay is accumulated first at the DCO and then at the clock driver for CP2, whereas it is accumulated first at the clock driver and then at the datapath for CP1. If there is any mismatch between the clock driver and the DCO or the datapath, the impact of the supply noise cannot be fully compensated. Therefore, our design uses the fanout-of-4 delay-tracking DCO as a reasonably good approximation that tracks the typical delay in both the datapaths and the clock drivers, rather than a precise implementation that perfectly matches either the datapaths or the clock drivers.
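The matching argument can be explored numerically. The sketch below integrates an assumed delay-accumulation rate for each block along a supply waveform and evaluates the slack as (t_clk + t_cp2) - (t_cp1 + t_d). The alpha-power-law rate, the nominal delays (1.0 ns DCO, 0.2 ns clock buffers, 0.9 ns datapath), the threshold voltages, and the 300 mV droop are all hypothetical. "Matched" here means every block shares the same voltage dependence; "mismatched" gives the datapath a higher threshold voltage so that it slows down more than the DCO when the supply droops.

```python
import numpy as np

T_NS = np.linspace(0.0, 10.0, 10001)  # time grid in ns (1 ps steps)


def rate(v, k_ns, vth):
    """Delay-accumulation rate f_x(v) in 1/ns (assumed alpha-power law).

    Normalized so that the block's delay is exactly k_ns at a 1.0 V supply.
    """
    alpha = 1.3
    speed = (v - vth) ** alpha / v
    nominal_speed = (1.0 - vth) ** alpha / 1.0
    return speed / nominal_speed / k_ns


def exit_time(t_enter, v, k_ns, vth):
    """Time at which an edge entering at t_enter has accumulated one delay."""
    r = rate(v, k_ns, vth)
    # Cumulative integral of f_x(v(t)) over the grid (trapezoidal rule).
    acc = np.concatenate(([0.0], np.cumsum(0.5 * (r[1:] + r[:-1]) * np.diff(T_NS))))
    start = np.interp(t_enter, T_NS, acc)
    return float(np.interp(start + 1.0, acc, T_NS))


def timing_slack(v, vth_datapath):
    """Slack in ns: arrival of CP2 minus arrival of the data it captures."""
    vth = 0.30                                              # DCO and clock buffers
    t_clk = exit_time(0.0, v, 1.0, vth)                     # second edge at CLK
    t_cp2_edge = exit_time(t_clk, v, 0.2, vth)              # t_clk + t_cp2
    t_cp1_edge = exit_time(0.0, v, 0.2, vth)                # t_cp1
    t_data = exit_time(t_cp1_edge, v, 0.9, vth_datapath)    # t_cp1 + t_d
    return t_cp2_edge - t_data


quiet = np.full_like(T_NS, 1.0)           # constant 1.0 V supply
droop = np.where(T_NS < 0.3, 1.0, 0.7)    # sudden load step at 0.3 ns

for name, v in (("quiet", quiet), ("droop", droop)):
    matched = timing_slack(v, vth_datapath=0.30)
    mismatched = timing_slack(v, vth_datapath=0.45)
    print(f"{name}: matched slack = {matched:.3f} ns, "
          f"mismatched slack = {mismatched:.3f} ns")
```

With these assumed numbers the matched design keeps a positive slack through the droop, whereas the mismatched datapath falls behind the clock even though both start from the same nominal slack. This mirrors the argument above, but only for this particular toy model.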

The next installment in this series takes a deeper look at the system implementation for the RoboBee MAV.  

Reprinted with permission from Elsevier/Morgan Kaufmann, Copyright © 2016


Professor Zhang joined the faculty at Washington University in St. Louis in 2015. Previously, she was a postdoctoral fellow in computer science at Harvard University, where she worked on the RoboBee BrainSoC and energy-efficient computing projects. She has worked as a graduate research assistant at Cornell University studying variability-tolerant circuits. Zhang earned a doctorate in electrical and computer engineering at Cornell University in 2012. She earned a bachelor’s degree in electrical engineering at Tsinghua University in Beijing in 2006.
