The FFT is the supreme spectral analysis technique. But there are other, far simpler ways to examine a signal's energy.
The adaptive and predictive mechanisms we have been discussing in recent months are important features in equipment we use all the time, whether for engineering purposes or entertainment. Constantly increasing DSP clock speeds and wider instruction words that execute in fewer cycles have made it possible to implement sophisticated algorithms for analysis and synthesis. As a result, we can implement transforms that took minutes twenty years ago (or even hours forty years ago) fast enough to work at audio speeds. This allows for high-speed analysis using algorithms such as the fast Fourier transform (FFT) and the discrete cosine transform (DCT), which serve compression techniques based on perceptual coding, as in MPEG and AC3.
However, methods for deriving spectral information from a signal existed prior to the FFT and, in fact, are still used in all sorts of applications. They are the subject of this month's column.
These well-known techniques can be helpful in locating the energy in a signal. They are the same methods used in both simple and sophisticated algorithms for analysis, from equalization to steering in compression. We will examine third-octave filters (and their relatives), wavelets, and discrete Fourier transform (DFT) applications. But first, two more adaptive filter types.
Sliding/shelving filters and parametric filters
Two common filters that we can use to implement adaptive characteristics are the sliding/shelving filter and the parametric filter. Both of these filters are used extensively in sound processing and noise management for analysis.
Usually, the sliding/shelving filter changes its corner frequency and transition slope in response to signal frequency and magnitude, though other references are also possible. Here, the corner frequency (the –3dB point) changes to accommodate the frequency content of the input, placing the rolloff just beyond the area where the energy in the input is located.
Normally, a filter's transition band exhibits a rolloff, or attenuation, of approximately –6dB per octave for each pole. Shelving means that the filter is allowed to roll off as it normally would over a given span of frequency and then stop. The shelving point, then, is the point at which the filter no longer rolls off. The shelving point is often adaptive: it moves relative to the signal's overall magnitude.
It is just such a filter that is used to remove media and other additive noises in algorithms such as Dolby's spectral recording. A signal stream is analyzed for frequency content. Once this analysis shows where most of the energy lies, the corner frequency of the sliding filter is adjusted to fit just outside it. The corner frequency adjusts itself quickly enough to follow the audio, but not so quickly that it is tricked by transients. At the same time, a shelf in the filter's transition band is always adjusting itself to the overall magnitude of the input. The higher the levels, the deeper the shelving becomes.
The sliding filter most often takes the form of a low-pass or high-pass filter with shelving. You may construct a bandpass sliding filter structure by using a low-pass sliding filter and a high-pass sliding filter, each moving in accordance with the levels and frequencies of the input.
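The shelving behavior described above can be sketched numerically. This is a minimal illustration, not the filter used in any particular product: it assumes a first-order analog prototype with one pole at the corner and one zero where the rolloff stops, and the names shelf_gain_db and sliding_corner (including the half-octave margin) are hypothetical.

```python
import math

def shelf_gain_db(freq_hz, corner_hz, shelf_db):
    """Gain in dB of a first-order high-cut shelf at freq_hz.

    Assumed prototype: H(s) = (1 + s/wz) / (1 + s/wp).
    The response is flat below corner_hz, rolls off above it, and
    levels out at shelf_db (a negative number) once the zero takes over.
    """
    wp = corner_hz                                  # pole: the corner
    wz = corner_hz * 10.0 ** (-shelf_db / 20.0)     # zero: where the rolloff stops
    s = 1j * freq_hz                                # only frequency ratios matter
    h = (1.0 + s / wz) / (1.0 + s / wp)
    return 20.0 * math.log10(abs(h))

def sliding_corner(dominant_hz, margin_octaves=0.5):
    """Hypothetical rule: park the corner just above the dominant energy."""
    return dominant_hz * 2.0 ** margin_octaves

corner = sliding_corner(1000.0)
for f in (100.0, 1000.0, 4000.0, 16000.0, 64000.0):
    print(f"{f:8.0f} Hz  {shelf_gain_db(f, corner, -12.0):7.2f} dB")
```

Making shelf_db more negative as the input level rises deepens the shelf, and moving the corner as the dominant frequency moves makes the filter slide.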
The parametric filter is one in which the corner frequency, gain, and Q (a measure of resonance; the higher the Q, the narrower the bandwidth at a particular pole location) are adjustable. This filter takes many forms, including low and high pass, but it is most commonly used in a bandpass form for equalization or analysis. Because it is so malleable, we use it to create other filters such as the sliding filter mentioned above.
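As an illustration of how frequency, gain, and Q enter a parametric filter, here is a minimal sketch of a bandpass-style (peaking) section. It assumes the widely circulated Audio EQ Cookbook biquad formulas, which are not taken from this column, and the function names are hypothetical.

```python
import cmath
import math

def peaking_biquad(center_hz, gain_db, q, sample_rate):
    """Coefficients (b, a) of a peaking equalizer biquad, normalized so a[0] == 1.

    Assumes the Audio EQ Cookbook peaking-EQ formulas.
    """
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    return [c / a[0] for c in b], [c / a[0] for c in a]

def response_db(b, a, freq_hz, sample_rate):
    """Magnitude response in dB of the biquad at freq_hz."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / sample_rate)   # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

b, a = peaking_biquad(center_hz=1000.0, gain_db=6.0, q=2.0, sample_rate=48000.0)
for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
    print(f"{f:6.0f} Hz  {response_db(b, a, f, 48000.0):6.2f} dB")
```

Raising q narrows the band that receives the boost or cut; recomputing the coefficients as the input changes is what makes the filter adaptive.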
Where is the energy?
These filters, and others like them, need to know something about the spectrum of the input to produce the appropriate change in, say, frequency response or level. Typically, when one thinks of determining frequency response, one thinks of the Fourier transform.
With the Fourier transform, a moving, time-based data stream, such as an analog signal, is transformed into a static frequency response with coefficients indicating the signal level at the available frequencies. With faster DSPs, the computation of a high-resolution FFT becomes much more reasonable.
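To make the transform concrete, here is a minimal sketch of a direct DFT with a power spectrum derived from it. It is the slow O(N²) form, written for clarity rather than speed, and the test tone, sample rate, and block length are assumptions chosen so that the tone falls exactly on a bin.

```python
import cmath
import math

def dft(samples):
    """Direct discrete Fourier transform: one complex coefficient per bin."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]

def power_spectrum(samples):
    """Squared magnitude of each DFT coefficient, scaled by 1/N."""
    n = len(samples)
    return [abs(c) ** 2 / n for c in dft(samples)]

SAMPLE_RATE = 8000.0   # assumed for the example
N = 64                 # bins are SAMPLE_RATE / N = 125 Hz apart
tone = [math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(N)]

power = power_spectrum(tone)
peak_bin = max(range(N // 2), key=lambda k: power[k])
print("strongest bin:", peak_bin, "at", peak_bin * SAMPLE_RATE / N, "Hz")
```

The spacing between bins is the sample rate divided by the number of points, so finer resolution means a longer block and more arithmetic.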
Fourier analysis can produce a great deal of useful information about a signal. Not only can you get the spectral response of a signal but, with the addition of a small amount of code, you can derive the power spectrum too. This kind of analysis has two major problems, which limit its ability or, at least, make its computation more expensive in terms of time and computing resources.
First, to get high enough resolution in a particular band, you must analyze a longer block of the signal (the spacing between frequency bins is the sample rate divided by the number of points), which means that you must process more samples to complete the calculation.
The other important problem with this form of analysis is that it has no time resolution, only frequency resolution. This means that you can know that a certain frequency is involved in the signal you are analyzing, but you will not know when, exactly, it occurred. Work-arounds exist. One of the most popular is the Gabor transform, which is little more than the Fourier transform constrained to a small time portion of the input stream. This works for many applications, but, again, can involve more expensive calculations.
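The windowed approach can be sketched as follows. This is a minimal illustration that assumes a Gaussian window and evaluates a single frequency at a single window position; the function name, the window width, and the test signal (silence followed by a 1kHz tone) are all assumptions made for the example.

```python
import cmath
import math

def gabor_magnitude(samples, center, width, freq_hz, sample_rate):
    """Magnitude at freq_hz of the signal seen through a Gaussian window.

    center is the sample index the window is centered on and width is
    its standard deviation in samples.
    """
    acc = 0j
    for i, x in enumerate(samples):
        window = math.exp(-0.5 * ((i - center) / width) ** 2)
        acc += x * window * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
    return abs(acc)

SAMPLE_RATE = 8000.0
# 256 samples of silence followed by 256 samples of a 1 kHz tone
signal = [0.0] * 256 + [
    math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(256)
]

early = gabor_magnitude(signal, center=128, width=32.0, freq_hz=1000.0,
                        sample_rate=SAMPLE_RATE)
late = gabor_magnitude(signal, center=384, width=32.0, freq_hz=1000.0,
                       sample_rate=SAMPLE_RATE)
print("window over the silence:", early)
print("window over the tone:   ", late)
```

Sliding the window along the stream tells you when the 1kHz component appears, at the price of repeating the calculation for every window position and every frequency of interest.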
The Goertzel filter is also an option. It acts as a very high-Q filter with extraordinary gain. If you only need to know the magnitudes of a few frequency points, this filter can produce a very fast result at a low cost. The only drawback is that the time efficiency in using this filter drops off as the number of frequency points you require increases. Filters of this sort are used in touch-tone phones for detecting the combination of individual frequencies used. They are also used in analysis in place of the DFT or FFT when only a few bands of narrow frequency are needed.
Don't get the impression that Fourier analysis is too costly and inefficient to be useful. As filter order or the number of frequency bins increases in an application, Fourier analysis becomes the most efficient way to go. But when the information you need is not as specialized or detailed, you might want to consider other means.
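A minimal sketch of the Goertzel recurrence, applied to the touch-tone case, might look like this. The block length of 205 samples at 8kHz and the helper detect_dtmf are assumptions for the example, and the recurrence is evaluated at the exact target frequency rather than at the nearest DFT bin.

```python
import math

def goertzel_power(samples, target_hz, sample_rate):
    """Squared spectral magnitude at target_hz from a second-order recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * target_hz / sample_rate)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

LOW_GROUP = (697, 770, 852, 941)         # touch-tone row frequencies, Hz
HIGH_GROUP = (1209, 1336, 1477, 1633)    # touch-tone column frequencies, Hz

def detect_dtmf(samples, sample_rate):
    """Hypothetical helper: strongest row and column frequency in the block."""
    low = max(LOW_GROUP, key=lambda f: goertzel_power(samples, f, sample_rate))
    high = max(HIGH_GROUP, key=lambda f: goertzel_power(samples, f, sample_rate))
    return low, high

SAMPLE_RATE = 8000.0
N = 205   # assumed block length
digit_5 = [
    math.sin(2.0 * math.pi * 770.0 * i / SAMPLE_RATE)
    + math.sin(2.0 * math.pi * 1336.0 * i / SAMPLE_RATE)
    for i in range(N)
]
print(detect_dtmf(digit_5, SAMPLE_RATE))
```

Each frequency costs one multiply and two adds per sample, which is why the method is attractive for eight tones and unattractive for hundreds of bins.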
Simple sometimes works better
Bandpass filters have been around for a long time. They are simple in nature and require little code or time to process. In sound processing, they have been used for singling out certain bands for amplification, attenuation, or enhancement. But they may also be used to process a signal stream for the amount of energy in a single area of the spectrum. Well-known audio compression techniques such as Dolby A, B, and SR use bandpass filters to determine how much energy is present in any band so that it might be used to control the amount of boost or attenuation needed.
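Measuring the energy in one area of the spectrum can be sketched with a single second-order bandpass section followed by an RMS calculation. This is a minimal illustration, not the detector used in any of the systems named above; it assumes the Audio EQ Cookbook bandpass formulas (unity gain at the center frequency), and the function name is hypothetical.

```python
import math

def bandpass_rms(samples, center_hz, q, sample_rate):
    """RMS level of the signal after one second-order bandpass section."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = alpha / a0                 # b1 is zero for this bandpass form
    b2 = -alpha / a0
    a1 = -2.0 * math.cos(w0) / a0
    a2 = (1.0 - alpha) / a0

    x1 = x2 = y1 = y2 = 0.0
    total = 0.0
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        total += y * y
    return math.sqrt(total / len(samples))

SAMPLE_RATE = 48000.0
in_band_tone = [math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(4800)]
out_of_band_tone = [math.sin(2.0 * math.pi * 8000.0 * i / SAMPLE_RATE) for i in range(4800)]

in_band = bandpass_rms(in_band_tone, 1000.0, 4.0, SAMPLE_RATE)
out_of_band = bandpass_rms(out_of_band_tone, 1000.0, 4.0, SAMPLE_RATE)
print("1 kHz tone through the 1 kHz band:", in_band)
print("8 kHz tone through the 1 kHz band:", out_of_band)
```

The resulting level is the kind of number that could drive a gain control or a sliding filter for that band.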
A-type compression employs four main bands: below 80Hz, 80Hz to about 3kHz, 3kHz to 9kHz, and 9kHz to 22kHz. Each band is an individually tuned servo loop controlling the gain of an amplifier according to a compression law. When the levels are low, the compression is great; as the levels move toward 0dB, the compression is alleviated. This way, the energy in each band determines how much magnitude comes through. This technique has been used to attenuate tape hiss and other background and media noises.
SR compression also employs bandpass filters to quantify the signal magnitudes in certain bands and drive compressors operating in servo loops. This information is also used to control sliding filters that, in turn, are used to wrap their skirts around the areas of greatest audio magnitude, attenuating levels outside the main bands.
Other applications rectify and further filter the outputs of the bandpass filter to drive attack/release networks for time localization of certain frequencies. Bandpass filters can also be used as steering for algorithms such as the SR or A-type. And, as hinted at above, the bandpass filter could even be followed by a Fourier transform for increased frequency resolution.
Combining bandpass filters into filter banks provides another useful analysis tool. Common analytical filter banks employ octave band filters in which the center frequency of each bandpass filter is separated from the next by one octave. Such banks are commonly used in home audio equipment to fit the audio to the room it is playing in.
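The rectify-and-smooth stage can be sketched as a simple envelope follower. This is a minimal illustration that assumes full-wave rectification and one-pole smoothing with separate attack and release time constants; the default times and the function name are hypothetical.

```python
import math

def envelope(samples, sample_rate, attack_s=0.001, release_s=0.050):
    """Full-wave rectify, then smooth with separate attack and release times."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    level = 0.0
    out = []
    for x in samples:
        rectified = abs(x)
        # Rise quickly when the input exceeds the envelope, fall slowly otherwise
        coeff = attack if rectified > level else release
        level = coeff * level + (1.0 - coeff) * rectified
        out.append(level)
    return out

SAMPLE_RATE = 48000.0
# 10 ms at full level followed by 10 ms of silence
burst = [1.0] * 480 + [0.0] * 480
env = envelope(burst, SAMPLE_RATE)
print("end of burst:", env[479], " 10 ms later:", env[959])
```

In practice the input would be the output of one bandpass filter, so the envelope tracks when energy enters and leaves that band.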
The third-octave filter bank is used extensively in audio and sound processing. It is akin to the octave band filter banks whose skirts start at every doubling of frequency, but the third-octave is popular because its resolution is tight enough to pick out certain areas in the band for individual attention. Here, the center frequency of each filter is separated from the next by one-third of an octave, a frequency ratio of 2^(1/3) or about 1.26, thus providing much more resolution and information. These are often used in auditoria, cinema, and noise management analysis, anywhere that fast and accurate, but inexpensive, analysis is needed.
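The layout of a third-octave bank follows from one ratio: each center frequency is 2^(1/3) times the one below it. The sketch below assumes the common convention of referencing the series to 1kHz and placing band edges one-sixth of an octave either side of center; the function names and the default 20Hz to 20kHz range are assumptions for the example.

```python
import math

def third_octave_centers(low_hz=20.0, high_hz=20000.0, ref_hz=1000.0):
    """Exact third-octave center frequencies between low_hz and high_hz."""
    k_min = math.ceil(3.0 * math.log2(low_hz / ref_hz))
    k_max = math.floor(3.0 * math.log2(high_hz / ref_hz))
    return [ref_hz * 2.0 ** (k / 3.0) for k in range(k_min, k_max + 1)]

def band_edges(center_hz):
    """Lower and upper edge of the band, one-sixth octave either side of center."""
    return center_hz * 2.0 ** (-1.0 / 6.0), center_hz * 2.0 ** (1.0 / 6.0)

centers = third_octave_centers()
for fc in centers:
    lo, hi = band_edges(fc)
    print(f"{lo:9.1f}  {fc:9.1f}  {hi:9.1f} Hz")
```

Every third center is exactly one octave above the first, so an octave bank is simply every third filter of this one.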
Of course, there are filter banks with even narrower bins. It was not that long ago that 1/10-octave filter banks were used. But here, the speed of the DSP has made the Fourier transform just as economical. As a result, this type of filter bank is less commonly used.
Filter banks similar to these are also used in such sophisticated algorithms as MPEG, ATRAC, and Dolby's AC3 to find the focus of the input stream and direct the encoding or decoding.
Wavelet filter banks
The wavelet filter bank is another technique finding a great deal of use. In its most elementary form, this filter bank works with simple structures and can be made to work with as few as one set of coefficients. Typically, the filters are half-band low-pass filters with a cutoff at 1/2 of Nyquist. Each node develops a high-pass and a low-pass output. The high-pass output is the output of the transform; the low-pass output is fed to the next node. This results in high-definition spectral information and time localization data, which makes it possible to determine when a certain component entered the spectrum.
The half-band filters are cascaded to form a tree. At each node the bandwidth of the wavelet and the signal are exactly one-half of what they were at the previous node, so that each node can downsample by two. This structure implements the algorithm economically, both in code space and speed.
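The tree can be sketched in a few lines. This minimal illustration assumes the Haar pair, the simplest possible half-band low-pass and high-pass filters (two taps each); the column does not name a particular wavelet, and the function names are hypothetical.

```python
import math

def haar_node(samples):
    """One node: half-band low-pass and high-pass outputs, each downsampled by two."""
    scale = 1.0 / math.sqrt(2.0)
    low = [(samples[i] + samples[i + 1]) * scale for i in range(0, len(samples) - 1, 2)]
    high = [(samples[i] - samples[i + 1]) * scale for i in range(0, len(samples) - 1, 2)]
    return low, high

def wavelet_tree(samples, levels):
    """Cascade nodes: keep each high-pass output, feed the low-pass output onward."""
    outputs = []
    low = list(samples)
    for _ in range(levels):
        low, high = haar_node(low)
        outputs.append(high)
    return outputs, low

# A single click at sample 5 in an otherwise silent block of 16 samples
click = [0.0] * 16
click[5] = 1.0

bands, residue = wavelet_tree(click, levels=3)
for level, band in enumerate(bands, start=1):
    print("level", level, [round(v, 3) for v in band])
print("residue", [round(v, 3) for v in residue])
```

The position of the nonzero coefficient in each band shows where in time the click occurred, and each level covers half the bandwidth of the one before it with half as many samples.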
It has another very attractive feature. The frequency spacing is logarithmic (similar, in fact, to the human ear) so that the information it produces can be used with perceptual coding almost without modification.
The wavelet transform is easy to implement and fast, making it a good competitor for the Fourier transform, especially when the application fits it well.
Don Morgan is a senior engineer at Ultra Stereo Labs and a consultant with 25 years of experience in signal processing, embedded systems, hardware, and software.