by Don Morgan
In the last few issues, we've examined some popular digital signal
processors from Motorola, Analog Devices, and Texas Instruments. In
addition to standard arithmetic architectures, these devices each have
special features that make them valuable for particular applications, as
well as general digital signal processing.
Digital signal processing may embody any operation on a data sequence.
It may be a simple AND operation, a two- or three-dimensional mask, a
polynomial filter such as an FIR, or it may be a transform. All of the
DSPs we have discussed offer the arithmetic equipment that makes certain
mathematical procedures easier, such as an optimized
multiply/accumulate. One model adds refinements for automating transform
processing, some include the hardware to easily accept serial A/Ds, and
one actually possesses a dual pipeline. As a rule, the construction of
these DSPs makes arithmetic computation more efficient and faster to
perform. Most of them offer greater overall speed than the common
microprocessor or microcontroller.
These enhancements make the parts valuable, but neither the number of
such devices, their name, nor their arithmetic architecture should be
taken to mean that digital signal processing can only be done on a DSP.
Digital signal processing can be done on any microprocessor or
microcontroller. In this column, we'll examine these and some
alternative methods of performing DSP operations. These alternative
methods are, in fact, becoming more important for both reducing cost and
increasing speed and efficiency.
In this issue, we will look at PLDs and FPGAs, as well as ICs built
expressly to implement some particular aspect of digital signal
processing. Such parts can be put together with standard
microcontrollers to create sophisticated systems.
DSP components
Digital signal processors aim for a generalized target. They are
flexible enough to perform many functions, even those of a CPU, though
this is definitely a stretch. But let's assume you don't have
a generalized need; instead, you have a specific need that could be
satisfied with something less, and perhaps less expensively.
An example of such a case might be found in systems employing A/Ds as an
interface to the real world. The analog system that prepares the signal
for the A/D not only communicates the signal to the A/D, but also
carries any offsets in the preceding circuitry to that A/D. In addition,
the A/D itself can add certain products to the signal that can
require lowpass filtering before the data can be used. In many cases, either
or both of these elements prove clearly undesirable and actually
deleterious. In audio processing, for example, any DC bias on the input
signal can offset compressors, filters, and amplifiers so that the
signal is attenuated and distorted. Sliding filters that use signal
magnitude to determine the cutoff frequency can be fooled by such bias
and never reach the proper operating point.
In a case such as this, it would be nice to have a highpass filter
between the A/D and the processing elements. If you have a DSP in the
system and it has the bandwidth remaining after its application
processing, you may be able to add the code to remove these products
there. However, if you don't already have the DSP or there isn't enough
room or time left for
the filter, you may want to choose another course. (As a side note, some
manufacturers are beginning to include highpass filters in their A/Ds to
help control these offsets.)
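As a rough illustration of the kind of highpass filtering described above, here is a minimal sketch of a one-pole DC-blocking filter. The difference equation y[n] = x[n] - x[n-1] + r*y[n-1] is a common textbook form and is not taken from any particular part mentioned in this column; the function name `dc_block`, the pole value `r`, and the test signal are assumptions chosen for the example.

```python
def dc_block(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    r (an assumed value) sets the cutoff; the closer to 1.0, the lower
    the cutoff and the longer the filter takes to settle.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# Hypothetical input: a small alternating signal riding on a 0.5 DC offset.
biased = [0.5 + (0.1 if n % 2 == 0 else -0.1) for n in range(4000)]
cleaned = dc_block(biased)

# Compare the average (DC) level over the last 1000 samples.
dc_before = sum(biased[-1000:]) / 1000
dc_after = sum(cleaned[-1000:]) / 1000
```

After the filter settles, the offset is gone while the alternating component passes through at nearly its original amplitude. A fixed-point version would need attention to rounding in the feedback term.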
The other problem, involving imaging, can be eliminated with the
appropriate lowpass filter placed after the A/D converter.
COTS parts are available for these tasks from several companies,
including Harris Semiconductor (note, 5/11/00: Harris Semiconductor has
changed its name to Intersil Corporation). The nice thing about these
parts is that they are usually self-contained, require only a small
understanding of signal processing, and don't generally require
programming. They are very useful for low-cost audio and video
applications.
Other applications, such as radio, require parts such as these because
DSPs are not fast enough. Better signal-to-noise performance can be had
by moving FIR (linear phase) elements as far up in the IF chain as
possible. Also, half-band FIR filters are available for use in
quadrature splitting for the removal of unwanted sidebands.
Many companies offer discrete parts that will perform some of these
individual tasks for radio. Harris Semiconductor offers a relatively
full line of DSP components that can be used individually or
in combination to perform many functions. Among their products are FIR
filters, quadrature decoders, multipliers, half-band filters,
numerically controlled oscillators, histogrammers, video image filters,
and convolvers.
As you can see, it's possible to put together a DSP system that relies
on hardware with a minimum of programming. For more information
concerning these components, along with application notes, visit
http://www.intersil.com/sitemap.asp.
In addition to
their DSPs, Analog Devices has an interesting series of non-DSP products
featuring DSP functionality, including a sample rate converter and a
video codec.
The sample rate converters from Analog Devices (the AD1890, AD1891,
and AD1892) allow two digital systems with asynchronous clocks to
share data almost seamlessly. The sample rate of the input, usually
AES/EBU or S/PDIF, clocks the input, while the receiving system clocks
the output side. A complex set of decimating and interpolating FIR
filters within the chip moves the data from one sample clock to the
other with low jitter. These parts accept a wide range of sample clocks,
and the ratio between the two sample clocks can also be quite large.
Applications for these parts include digital mixing consoles and digital
audio interface, CD-R, DAT, DCC and MD recorders, routers, switches,
broadcast equipment, and so on.
The video codec, the ADV601, is a new part that incorporates wavelet
technology in a chip that allows for the real-time compression of video
signals for transmission or storage. The chip, in conjunction with a
DSP, provides for precise compressed bit rate control, with the DSP
setting the bin widths for the compression. The compression ratios range
from visually lossless to 350:1. It will interface to a wide variety of
equipment, including CCIR-656, and has an eight-, 16-, or 32-bit host
interface. The applications for this chip include network and Internet
video, editing, video capture, remote CCTV, digital cameras, archival
systems, and so on. For more information on these parts, visit Analog
Devices' Web site at www.analogdevices.com.
FPGAs and PLDs
FPGAs and PLDs can also be used to perform DSP functions. The nice thing
about these devices is that they are often reprogrammable. Generally
speaking, more expertise is required to use one of these devices than
the COTS ASICs that Harris supplies, but the rewards are many. A
generalized product can be built that is programmed for a particular
purpose just before it is shipped. As with programmable parts in
general, upgrades and modifications are more easily made in the field,
and these parts can run at sample rates well in excess of even the
fastest DSPs available today. The main drawback is that the design
engineer often must program the device personally, which can require
time and money to complete.
Fortunately, several FPGA and PLD manufacturers, including Altera and
Xilinx, offer prepared software for their devices. Visit their Web pages
for information on the offerings of either of these two manufacturers:
www.altera.com/html/mega/mega_devkit.html and
www.xilinx.com/products/logicore/dsp/.
These functions are intellectual property; they are usually sold
as macros or cores and may be expandable or customized for your
particular application. They are placed in the target device and
compiled along with whatever other program you desire. The functions
these companies offer include forward and inverse FFT functions; DCT;
FIR filters; floating-point adders, dividers, and multipliers; JPEG
encoders and decoders; Laplacian edge detection; adaptive filters; and
oscillators.
Distributed arithmetic
How do you implement the multiply/accumulate inside a dedicated IC, PLD,
or FPGA? It can be done a number of ways, including as a masked DSP with
a single purpose. But faster ways exist. Here are two forms of one
technique, distributed arithmetic, that are both useful for PLDs and
FPGAs and interesting in their own right.
Distributed arithmetic is not a new technique, but it is not in common
use on mainstream processors. It may not seem obvious at first glance,
but it is a simple mix of Boolean logic and algebra. To appreciate the
mechanism here, let's look at some familiar forms. First, recall the
expression for linear time-invariant (LTI) systems:
$$y[n] = \sum_{k=1}^{K} A_k x_k \tag{1}$$

This formula actually describes only a single instance in the infinite
sum of time: $y[n]$. $A_k$ is the $k$th coefficient of the polynomial
and represents the system impulse response, and $x_k$ is the $k$th input
sample at time $n$.
Each output is equal to the sum of $K$ products. This technique converts
the input sample into an effective address to a table of scaled
coefficients that are then summed. Distributed arithmetic replaces the
sum of products with a table lookup method.
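We use binary arithmetic, which is expressed, as values in any base are,
with the expression for a polynomial:

$$x_k = -x_{k0} + \sum_{b=1}^{B-1} x_{kb} 2^{-b} \tag{2}$$

Here, $-x_{k0}$ is a sign bit, $x_{kb}$ is the value of the bit position
at $2^{-b}$, and $B$ is the number of bits in the word. This equation
simply states that the binary fraction we are representing is a sum of
the contributions of each of the powers of two necessary for the
required precision. In other words, the binary number 0.11010011 is
equal to the value of the sum:

$$1 \cdot 2^{-1} + 1 \cdot 2^{-2} + 0 \cdot 2^{-3} + 1 \cdot 2^{-4} + 0 \cdot 2^{-5} + 0 \cdot 2^{-6} + 1 \cdot 2^{-7} + 1 \cdot 2^{-8} \tag{3}$$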
One nice thing about base two is that each power can only have a value
of either one or zero, which means that we can turn our original
equation into a series of sums of each bit position multiplied by the
scaled coefficients of the impulse response. Since the bits can only
assume a value of zero or one, the multiplication is really a Boolean
AND operation.
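To make the bit-gating idea concrete, here is a small sketch that evaluates a two's-complement binary fraction bit by bit and then forms a product using nothing but the bits as gates. The function names `fraction_value` and `gated_product`, and the coefficient value used below, are assumptions made for this illustration.

```python
def fraction_value(bits):
    """Value of a two's-complement fraction; bits[0] is the sign bit
    and bits[b] carries a weight of 2**-b."""
    value = -float(bits[0])
    for b in range(1, len(bits)):
        value += bits[b] * 2.0 ** -b
    return value


def gated_product(coeff, bits):
    """coeff times the fraction, formed without a multiplier: each bit
    simply decides whether a scaled copy of coeff joins the sum."""
    total = -coeff if bits[0] else 0.0
    for b in range(1, len(bits)):
        if bits[b]:                      # the bit acts as an AND gate
            total += coeff * 2.0 ** -b   # coeff scaled by the bit weight
    return total


# The binary number 0.11010011: sign bit first, then the eight fraction bits.
bits = [0, 1, 1, 0, 1, 0, 0, 1, 1]
x = fraction_value(bits)
product = gated_product(0.5, bits)   # 0.5 is an arbitrary example coefficient
```

In hardware, the scaling by a power of two is a shift and the gate is a row of AND gates, so no multiplier is involved.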
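We can make this clearer by substituting our expression for a binary
fraction in Equation 2 back into the equation for an LTI system
(Equation 1):

$$y[n] = \sum_{k=1}^{K} A_k \left[ -x_{k0} + \sum_{b=1}^{B-1} x_{kb} 2^{-b} \right] \tag{4}$$

And now we expand the equation:

$$\begin{aligned}
y[n] = {} & -\left[ x_{10} A_1 + x_{20} A_2 + \cdots + x_{K0} A_K \right] \\
& + \left[ x_{11} A_1 + x_{21} A_2 + \cdots + x_{K1} A_K \right] 2^{-1} \\
& + \left[ x_{12} A_1 + x_{22} A_2 + \cdots + x_{K2} A_K \right] 2^{-2} \\
& + \cdots \\
& + \left[ x_{1(B-1)} A_1 + x_{2(B-1)} A_2 + \cdots + x_{K(B-1)} A_K \right] 2^{-(B-1)}
\end{aligned} \tag{5}$$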
What you see here is that each bit is used as a gating function
for summing scaled versions of the coefficients at each power of two. To
put that another way, each bit of the input sample (variable) is ANDed
with all the bits of the particular scaled coefficient and that
result is summed.
A lookup table can be implemented that is addressed by the bits of the
input samples. The contents of that table would be the sums of the
scaled coefficients indicated by that address.
Thus, an LTI system can be implemented with addition, subtraction, and
simple scaling (arithmetic right shifts).
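The table lookup just described can be sketched in a few lines. This is a software model only, under some stated assumptions: the samples are signed integers rather than fractions (so the bit weights are applied with left shifts instead of right shifts, which changes only the overall scale), and the names `build_lut` and `da_sum_of_products`, along with the coefficient and sample values, are invented for the example.

```python
def build_lut(coeffs):
    """Entry at address a holds the sum of the coefficients whose
    corresponding address bit is set (bit i selects coeffs[i])."""
    k = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << k)]


def da_sum_of_products(lut, samples, nbits):
    """Sum of coeffs[i] * samples[i] using lookups, shifts, and adds.

    samples must be signed integers that fit in nbits-bit two's complement.
    """
    mask = (1 << nbits) - 1
    words = [s & mask for s in samples]       # two's-complement bit patterns
    acc = 0
    for b in range(nbits):
        # Gather bit b of every sample into a single table address.
        addr = 0
        for i, w in enumerate(words):
            addr |= ((w >> b) & 1) << i
        partial = lut[addr] << b              # scale by the weight of bit b
        if b == nbits - 1:
            acc -= partial                    # the sign bit has negative weight
        else:
            acc += partial
    return acc


coeffs = [3, -5, 7, 2]                        # arbitrary example coefficients
samples = [100, -128, 127, -1]                # arbitrary 8-bit samples
lut = build_lut(coeffs)
result = da_sum_of_products(lut, samples, nbits=8)
direct = sum(c * s for c, s in zip(coeffs, samples))
```

Note that the table grows as two to the number of taps, which is why practical designs usually split a long filter into several smaller tables and add their outputs. The loop over bit positions runs once per input bit regardless of the number of taps.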
The second form of this technique is more like an extension. It is also
a table lookup technique involving partial products. In the case of
linear phase FIR filters, where the coefficients are symmetric around
the center values, we may fold the input data word about the center and
add the symmetric taps before multiplying them by the coefficients. We
use the input sample bits as described in Equation 2, but we fold on the
center, and the corresponding bits are summed and used as addresses for
partial products. Since we know the coefficients, a table can be made
that will form the products of the sub-bit fields in the input and the
coefficients. These partial products would then be summed to produce the
complete product, which is again summed with the other products to
become the sum of products.
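The folding step on its own can be shown with a short sketch. It covers only the pre-addition of mirrored taps, not the partial-product tables, and it assumes the coefficient list is symmetric; the function names `fir_direct` and `fir_folded` and the sample values are made up for illustration.

```python
def fir_direct(h, window):
    """Plain sum of products: one multiply per tap."""
    return sum(c * x for c, x in zip(h, window))


def fir_folded(h, window):
    """Same output for a symmetric h (h[i] == h[n-1-i]), using roughly
    half the multiplies by adding mirrored samples first."""
    n = len(h)
    acc = 0
    for i in range(n // 2):
        acc += h[i] * (window[i] + window[n - 1 - i])   # pre-add mirrored taps
    if n % 2:
        acc += h[n // 2] * window[n // 2]               # unpaired center tap
    return acc


h = [1, 4, 6, 4, 1]              # symmetric (linear phase) example coefficients
window = [3, -2, 7, 5, -1]       # arbitrary samples currently in the delay line
folded = fir_folded(h, window)
direct = fir_direct(h, window)
```

One consequence of the pre-add is that each folded sum needs one more bit than the input samples, which slightly widens the address or the adder that follows.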
The advantages of these processes are their speed and ease of
implementation in an FPGA.
Specialized DSPs
A number of specialized DSPs are available. In addition to the DSPs we
discussed in previous issues, Analog Devices offers a series of 16-bit
devices for voice band processing that are actually quite inexpensive.
Some of these DSPs have A/D and I/O built into the product to minimize
the number of parts necessary to build a complete system. They also
offer something called "The World's Smallest DSP."
Motorola and Crystal Semiconductor have a series of DSPs that are
tailored for audio. These two companies offer different cores, but both
feature DSPs with AC-3, Pro Logic, and DTS algorithms as masked ROMs
on board.
For high-end transform processing, Sharp Semiconductor offers the
Butterfly DSP that is tailored to performing fast Fourier transform
processing. They also have DSP chipsets available.
Standard microcontrollers
Digital signal processing can be done on any microcontroller, assuming
you can meet all the necessary timing and bus width conditions for your
application. I've written both audio and video processing for the
80C196. In fact, Intel offers a successor to the C196 that
incorporates the MAC and other arithmetic-intensive operations as part
of the instruction set.
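For a sense of what a multiply/accumulate looks like when only integer arithmetic is available, here is a minimal model of a Q15 fixed-point MAC. It is written in Python purely to show the arithmetic and is not code for the 80C196 or any other part; the Q15 format, the helper names `to_q15` and `mac_q15`, and the values are all assumptions made for this sketch.

```python
Q15_ONE = 1 << 15    # 1.0 in Q15 (one sign bit, 15 fraction bits)


def saturate_q15(value):
    """Clamp an integer to the 16-bit range -32768..32767."""
    return max(-Q15_ONE, min(Q15_ONE - 1, value))


def to_q15(value):
    """Convert a float in [-1, 1) to a Q15 integer, saturating at the limits."""
    return saturate_q15(int(round(value * Q15_ONE)))


def mac_q15(coeffs, samples):
    """Sum of products of Q15 integers, returned as a saturated Q15 integer."""
    acc = 0
    for c, x in zip(coeffs, samples):
        acc += c * x                    # each product is in Q30
    acc = (acc + (1 << 14)) >> 15       # round, then scale back to Q15
    return saturate_q15(acc)


coeffs = [to_q15(0.25)] * 4                              # four-tap averager
samples = [to_q15(v) for v in (0.5, -0.25, 0.125, 0.5)]
output = mac_q15(coeffs, samples)                        # Q15 integer result
```

On a real microcontroller the accumulator width is the main constraint: a 32-bit accumulator holds only a handful of full-scale Q30 products before it can overflow, so the number of taps and the coefficient scaling have to be chosen with that in mind.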
I hope I've made it apparent that DSP technology is an
abstraction and not device-dependent. The choice of the device is
application-driven.
Don Morgan is senior engineer at Ultra Stereo Labs and a consultant
with 25 years' experience in signal processing, embedded systems,
hardware, and software. His most recent book is Numerical Methods
for DSP Systems in C. He is also the author of Practical DSP
Modeling, Techniques, and Programming in C.