Every flip-flop has two critical specifications we violate at our
peril. "Set-up time" is the minimum number of nanoseconds that input
data must be stable before the clock edge arrives. "Hold time" tells us
how long to keep the data present after the clock transitions.
These specs vary depending on the logic device. Some might require
tens of nanoseconds of set-up and/or hold time; others need an order of
magnitude less.
Figure 9.1: Setup and Hold Times
If we tend to our knitting we'll respect these parameters and the
flip-flop will always be totally predictable. But when things are
asynchronous (say, the wrist rotates at its own rate and the software
does a read whenever it needs data) there's a chance that we'll violate
set-up or hold time.
Suppose the flip-flop requires 3 nanoseconds of set-up time. Our
data changes within that window, flipping state perhaps a single
nanosecond before clock transitions. The device will go into a
metastable state where the output gets very strange indeed.
Because we violated the specification, the device really doesn't know
if we presented a zero or a one. Its output goes, not to a logic state,
but either to a half-level (in between the digital norms) or into
oscillation, toggling wildly between states. The flip-flop is metastable.
Figure 9.2: A Metastable State
This craziness doesn't last long; typically after a few to 50
nanoseconds the oscillations damp out or the half-state disappears,
leaving the output at a valid one or zero. But which one is it? This is
a digital system, and we expect ones to be ones, and zeroes zeroes.
The output is random. Bummer, that. You cannot predict which
level it will assume. That sure makes it hard to design predictable
digital systems!
Hardware folks feel that the random output isn't a problem. Since
the input changed at almost exactly the same time the clock strobed,
either a zero or a one is reasonable. If we had clocked just a hair
ahead or behind we'd have gotten a different value, anyway.
Philosophically, who knows which state we measured? Is this really a
big deal? Maybe not to the EEs, but this impacts our software in a big
way, as we'll see shortly.
Metastability occurs only when clock and data arrive almost
simultaneously; the odds increase as clock rates soar. An equally
important factor is the type of logic component used: slower logic
(like 74HCxx) has a much wider metastable window than faster devices
(say, 74FCTxx).
Clearly at reasonable rates the odds of the two asynchronous signals
arriving closely enough in time to cause a metastable situation are
low. Measurable, yes; important, certainly. With a 10 MHz clock and
10 kHz data rate, using typical but not terribly speedy logic, metastable
errors occur about once a minute. Though infrequent, no reliable system
can stand that failure rate.
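How such a failure rate depends on the two frequencies can be sketched with the first-order model commonly used in synchronizer analysis: MTBF = exp(t_r / tau) / (T0 * f_clk * f_data). The Python below is for illustration only. The function name is made up, and the values of tau, T0, and the resolution time t_r are hypothetical placeholders (real ones come from the device's data sheet or characterization), so the result will not necessarily match the once-a-minute figure quoted above.

```python
import math


def metastability_mtbf(f_clk_hz, f_data_hz, t_resolve_s, tau_s, t0_s):
    """Estimate the mean time between metastable failures, in seconds.

    First-order model: MTBF = exp(t_resolve / tau) / (t0 * f_clk * f_data)
      t_resolve : time the flip-flop output is given to settle
      tau       : device's metastability time constant
      t0        : device's effective metastability window
    """
    return math.exp(t_resolve_s / tau_s) / (t0_s * f_clk_hz * f_data_hz)


F_CLK_HZ = 10e6     # 10 MHz clock, from the text
F_DATA_HZ = 10e3    # 10 kHz data rate, from the text
TAU_S = 1.5e-9      # hypothetical device constant
T0_S = 1.0e-10      # hypothetical device constant
T_RESOLVE_S = 20e-9 # hypothetical settling time allowed

base = metastability_mtbf(F_CLK_HZ, F_DATA_HZ, T_RESOLVE_S, TAU_S, T0_S)
faster_clock = metastability_mtbf(2 * F_CLK_HZ, F_DATA_HZ, T_RESOLVE_S, TAU_S, T0_S)

print(f"MTBF at 10 MHz: {base:.3g} s")
print(f"MTBF at 20 MHz: {faster_clock:.3g} s")
```

Because the two frequencies appear as a product in the denominator, doubling either one halves the MTBF. The exponential term suggests why the choice of logic family matters so much: a smaller tau makes the same settling time worth many more time constants.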
The classic metastable fix uses two flip-flops connected in series.
Data goes to the first; its output feeds the data input of the second.
Both use the same clock input. The second flop's output will be
"correct" after two clocks, since the odds of two metastable events
occurring back-to-back are almost nil. With two flip-flops, at
reasonable data rates errors occur millions or even billions of years
apart, good enough for most systems.
However, "correct" means only that the second stage's output will not be
metastable: it's not oscillating, nor is it at an illegal voltage
level. There's still an equal chance the value will be in either legal
logic state.
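The behavior described here can be mimicked with a small behavioral model. This Python sketch is not a circuit simulation: the class and parameter names are invented for illustration, and it assumes a metastable first stage always settles to some legal level before the next clock edge, which real hardware only makes overwhelmingly likely.

```python
import random


class TwoFlopSynchronizer:
    """Behavioral model of two flip-flops in series on a shared clock."""

    def __init__(self, rng=None):
        self.stage1 = 0  # output of the first flip-flop
        self.stage2 = 0  # output of the second flip-flop (what the system reads)
        self.rng = rng if rng is not None else random.Random()

    def clock(self, data, violated=False):
        """Apply one clock edge and return the second stage's output.

        violated=True marks an edge where the data changed inside the
        set-up/hold window of the first stage.
        """
        # Both flops sample on the same edge, so the second stage captures
        # the value the first stage held *before* this edge.
        self.stage2 = self.stage1
        if violated:
            # Assumption: the first stage resolves to an arbitrary legal
            # level, modeled as a fair coin flip.
            self.stage1 = self.rng.choice((0, 1))
        else:
            self.stage1 = data
        return self.stage2


sync = TwoFlopSynchronizer(rng=random.Random(1))
sync.clock(1, violated=True)    # input rises right at the clock edge
first = sync.clock(1)           # a legal level, but it could be 0 or 1
second = sync.clock(1)          # input has been stable; now reliably 1
print(first in (0, 1), second)  # True 1
```

The output is always a clean 0 or 1, yet the value seen one clock after the violation depends on the coin flip, which is exactly the ambiguity described above.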