Low-cost techniques for sound generation - Embedded.com

When designing a low-cost device where pennies count, we need a creative and inexpensive way for the device to communicate effectively with the user. Using sound as output requires comparatively few system resources relative to the amount of information it can deliver.

We'd all love to have a color display attached to the systems we design. It'd make communicating with the user so much easier. But most of us don't have that option. We get by with a few LEDs, maybe a little monochrome LCD—pretty low-bandwidth stuff. But a user interface doesn't have to be visual. By using sound, we can offer the user more information with only a small increase in system complexity.

Humans are well adapted to communicating with sound. We can learn a lot from a simple grunt uttered at the right time. And sound is omnidirectional; the user need not be looking at the source to get the message. Not having to look at a screen can be important, for example, when driving an automobile.

In this article, I'll explore some of the hows and whys of installing sound output in a system and give you some “sound” suggestions (pardon the pun) for doing so. The “how” of sound covers the equipment and techniques necessary to make useful sounds. We'll start with this, to give you an understanding of the capabilities of several methods and how they can be applied to meet the system requirements.

In the broadest sense, the “why” of sound output is simple—to communicate something to the user. Whether the information is a warning, an item of interest, or entertainment depends on the application. I'll go into more detail on user-interface issues later.

Sound hardware
Any electronic system that emits sound requires a transducer—a device that converts electricity to audible vibration. Sound transducers can range in complexity from a simple electromechanical buzzer to a plasma tweeter. I'll concentrate on the more conventional transducer types—buzzers, piezoelectric transducers, and dynamic speakers—since the more exotic transducers have expensive tastes when it comes to drive requirements.

Possibly the simplest sound output system is the electromechanical or piezoelectric buzzer. Both are driven by simply applying a DC voltage (with sufficient current) to the device. The electromechanical or magnetic buzzer operates by energizing an electromagnet coil that pulls on a metal armature. As the armature moves, a set of electrical contacts opens, de-energizing the coil and releasing the armature. As the armature returns to its resting position, the contacts close, energizing the coil once again. The process repeats at a frequency audible to humans. In applications requiring loud sound, you can connect the armature to a metal diaphragm.


Figure 1: A simple circuit that will suppress inductive spikes caused by an electromechanical buzzer

While the electromechanical buzzer is inexpensive and simply made, it has some pitfalls that limit its use in electronic systems. It draws a lot of current, and the large inductance of the electromagnet causes a huge voltage spike each time the coil is de-energized. If not filtered carefully, this spike can damage or reset more sensitive electronic components in the system. Figure 1 shows a simple way to help suppress the worst spikes, using a small power rectifier (D1) and an NPN transistor (Q1). Liberal use of bypass capacitors (not shown) is recommended. Additional filtering or even a separate power supply for the buzzer may be necessary in particularly sensitive applications.

The piezoelectric buzzer, instead of using an electromagnet, contains an oscillator driving a piezoelectric (piezo) element. The piezo element is a small piece of quartz or other material that flexes when electricity is applied. If a signal varying at an audio rate is applied, it produces audible sound. A piezo buzzer uses less power than an electromechanical one, and is less likely to cause power spikes.

Electromechanical and piezo buzzers usually require more current than can be supplied by the typical digital output or microprocessor port pin. A high-current driver or switching transistor may be required.

Piezo elements without the oscillator are also available. It's up to the system designer to provide an oscillator or other audio-frequency drive. This gives the designer the flexibility of using different tone frequencies to convey more information to the user, at the cost of slightly more complex drive circuitry. In some cases small piezo elements can be driven directly by 5V logic, so a gated audio oscillator or frequency divider implemented in CMOS or TTL logic may be sufficient. It's even possible to use a general-purpose I/O (GPIO) pin, toggled at an audio rate. To reduce stress on the piezo element, make sure the voltage across it is zero when it's not in use (in other words, set the port pin to 0).
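As a rough illustration of the GPIO approach, the sketch below toggles a software "pin level" from a periodic timer tick. It assumes a hypothetical timer interrupt running at `tick_hz`; the names `tone_t`, `tone_start`, `tone_stop`, and `tone_tick` are invented for this example, and writing the returned level to a real port pin is left to the target's own I/O code.

```c
#include <stdint.h>

/* Hypothetical software tone generator: call tone_tick() from a periodic
   timer interrupt and copy the returned level to the port pin. */
typedef struct {
    uint32_t reload;  /* timer ticks between pin toggles; 0 = silent */
    uint32_t count;   /* ticks remaining until the next toggle */
    uint8_t  level;   /* current pin level, 0 or 1 */
} tone_t;

/* One full cycle needs two toggles, so the toggle interval is
   tick_hz / (2 * tone_hz), rounded to the nearest tick. */
uint32_t half_period_ticks(uint32_t tick_hz, uint32_t tone_hz)
{
    if (tone_hz == 0u) {
        return 0u;
    }
    return (tick_hz + tone_hz) / (2u * tone_hz);
}

void tone_start(tone_t *t, uint32_t tick_hz, uint32_t tone_hz)
{
    t->reload = half_period_ticks(tick_hz, tone_hz);
    t->count  = t->reload;
    t->level  = 0u;
}

/* Stopping forces the level to 0 so no DC voltage is left across the piezo. */
void tone_stop(tone_t *t)
{
    t->reload = 0u;
    t->level  = 0u;
}

/* Advance one timer tick and return the level the pin should have. */
uint8_t tone_tick(tone_t *t)
{
    if (t->reload == 0u) {
        t->level = 0u;
        return 0u;
    }
    if (--t->count == 0u) {
        t->count = t->reload;
        t->level ^= 1u;
    }
    return t->level;
}
```

With a 1MHz tick and a 1,000Hz tone, `half_period_ticks` gives 500 ticks between toggles. A slower tick rate works too, but the achievable pitches become coarser.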

The most widely recognized sound output device is the dynamic loudspeaker. Typically it's made from a cone of paper, fiber, plastic, or metal with a coil of wire near the center. The coil is suspended in the field of a permanent magnet. When current passes through the coil, the magnetic field it produces interacts with that of the permanent magnet, causing the cone to move. If the coil current varies at an audio rate, the cone creates audible sound.

A variation of the dynamic loudspeaker commonly used in electronic instruments consists of a flat metal disk that is held in place against a permanent magnet surrounded by a coil. As current passes through the coil, it either aids or impedes the permanent magnet's field, causing the force on the metal disk to change. Again, current varying at an audio rate causes the disk to flex and create audible sound. The metal disk has a narrow range of frequencies at which it vibrates efficiently, making it more suitable for single-frequency applications such as a warning beep. The cone speaker, like a piezo element, has a wider frequency range more suitable for music, voice, or multiple tone frequencies.

Dynamic speakers can require significant current, since the coil may have a DC resistance of less than an ohm and an impedance of 3 to 32 ohms in the audio frequency range. They typically require an analog signal followed by a linear power amplifier or a Class D switching amplifier, although for simple tones the drive signal may be created by a digital oscillator or GPIO port pin and amplified appropriately.

Linear audio-amplifier ICs for driving small speakers have been available for decades. National Semiconductor's venerable LM386 has been used in countless designs and is available in DIP and surface-mount packages. It's inexpensive and operates over a wide range of supply voltages. More recent ICs developed for cellular telephone applications use a Class D switching configuration to provide analog amplification with much higher efficiency than a linear amplifier, at the cost of slightly more complex circuitry.

Sound software
A device that emits sound continuously doesn't convey much information and quickly becomes an annoyance. In order to provide useful information, your system must be able to modulate the sound it produces. Modulation can be as simple as turning a buzzer on and off at the correct times, or it may involve manipulating several parameters such as the volume, pitch, and waveform.

On-off modulation is the simplest to produce and is capable of carrying a lot of information if done intelligently. Morse code is an excellent example. Even if you don't want to make your users learn Morse, you can still tell them a lot with simple timing of sounds. Short beeps can be used for simple status indicators such as “keypress accepted.” A longer beep can warn a user of a condition that requires immediate attention, such as “door ajar.” It's even possible to transmit numeric codes by having the user count beeps, although this does require a lot of attention on the user's part. (It's a great debugging tool, however.) On-off modulation can be applied to any transducer type, even an electromechanical buzzer, and requires only a single output bit.
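The beep-counting idea can be sketched as a pure timing function. This is only an illustration: the 150ms on and off times and the name `beep_code_state` are assumptions chosen for the example, not values from any particular product.

```c
#include <stdint.h>
#include <stdbool.h>

#define BEEP_ON_MS   150u  /* assumed length of each beep */
#define BEEP_OFF_MS  150u  /* assumed gap between beeps */

/* Returns true while the buzzer should sound, given the time elapsed since
   the start of a sequence of `count` beeps. Poll it from a millisecond tick
   and write the result to the single output bit that drives the buzzer. */
bool beep_code_state(uint8_t count, uint32_t elapsed_ms)
{
    uint32_t slot = BEEP_ON_MS + BEEP_OFF_MS;

    if (elapsed_ms >= (uint32_t)count * slot) {
        return false;               /* sequence finished */
    }
    return (elapsed_ms % slot) < BEEP_ON_MS;
}
```

For example, `beep_code_state(3, t)` is true during 0 to 149ms, 300 to 449ms, and 600 to 749ms, which the user hears as three beeps.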

Frequency modulation is possible only if you have control of the pitch of the signal sent to the transducer. Until sound cards were common, IBM-PC compatible computers had a speaker driven by a programmable timer. The software could vary the frequency of the signal driving the speaker and, thus, the pitch of the tone. Even this simple technique allowed great flexibility in output sounds, making even music possible. For alerting purposes, it's possible to create sirens and other sound effects. Frequency modulation can even be performed on a single-bit output, if the software continuously updates it at an audio rate.

Amplitude modulation requires the ability to control volume. It can be used to make urgent sounds more noticeable and may also be required for systems that must simulate speech or play back prerecorded sounds or messages. Usually this would be implemented by using a digital-to-analog converter (DAC) driving an analog amplifier and a speaker. Such a system requires a significant amount of CPU horsepower to feed a constant stream of samples to the DAC. Prerecorded sounds also consume large amounts of memory, even when compressed.

A variation of the amplitude modulation method is to use pulse-width modulation (PWM). With PWM, a periodic digital pulse's duty cycle is proportional to the desired amplitude of the signal at that point in time. Many microcontrollers available today already have PWM outputs, and those that don't usually have some timers that may be pressed into service to create one. A PWM output must usually drive a low-pass filter to convert the signal to analog levels suitable for driving an amplifier and speaker. In extremely cost-sensitive applications it may be acceptable to apply the PWM signal directly to a speaker and let the inertia of the speaker cone or diaphragm perform the filtering. In most cases, however, better sound quality requires a separate filter circuit, which may even be incorporated into the amplifier stage that drives the speaker. In general, the higher the pulse rate of a PWM signal is in relation to the modulating signal, the easier filtering becomes. A PWM frequency of 10 times the highest modulating (tone) frequency is generally sufficient.
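To make the PWM arithmetic concrete, here is a minimal sketch that applies the 10-times rule of thumb and scales an 8-bit sample to a timer compare value. The function names are hypothetical, and the sketch assumes a simple up-counting timer whose output is high while the counter is below the compare value; real PWM peripherals differ in detail.

```c
#include <stdint.h>

/* Rule of thumb: carrier at least 10x the highest tone frequency. */
uint32_t pwm_min_carrier_hz(uint32_t max_tone_hz)
{
    return 10u * max_tone_hz;
}

/* Timer counts in one PWM period for a timer clocked at timer_hz. */
uint32_t pwm_period_counts(uint32_t timer_hz, uint32_t carrier_hz)
{
    if (carrier_hz == 0u) {
        return 0u;
    }
    return timer_hz / carrier_hz;
}

/* Scale an unsigned 8-bit sample (0..255) to a compare value in
   0..period_counts, so duty cycle is proportional to the sample. */
uint32_t pwm_compare(uint8_t sample, uint32_t period_counts)
{
    return ((uint32_t)sample * period_counts + 127u) / 255u;
}
```

With an assumed 8MHz timer clock and a 1,000Hz highest tone, the carrier is 10kHz and each period is 800 counts, so a mid-scale sample of 128 maps to a compare value of 402, or roughly 50% duty.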

A Class D amplifier uses PWM to get its high efficiency—it converts an analog signal to PWM, uses that to drive a high-current switch, then passes the output of the switch through a low-pass filter and on to a speaker. If the signal is already PWM, all that's needed is the switch and the low-pass filter.

Generating tones of an arbitrary frequency with a single-bit output usually requires a programmable timer, or at least a periodic interrupt at a rate much higher than the highest output frequency. Early IBM-PC compatibles used the 8253 programmable timer, which could be placed in auto-reload mode to generate a square wave of the desired frequency.
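As an example of the programmable-timer approach, the sketch below computes the 16-bit divisor for a PC-style 8253. It assumes the standard PC timer input clock of about 1.193182MHz; `pit_divisor` is a name made up for this example, and the port writes needed to load the value are omitted.

```c
#include <stdint.h>

#define PIT_CLOCK_HZ 1193182UL  /* 8253 input clock on PC compatibles */

/* Divisor to load into the timer for a square wave near tone_hz.
   The 8253 treats a count of 0 as 65536, which gives the lowest possible
   pitch (about 18.2Hz); requests below that are clamped to it. */
uint16_t pit_divisor(uint32_t tone_hz)
{
    uint32_t div;

    if (tone_hz == 0u) {
        return 0u;  /* caller should gate the speaker off instead */
    }
    div = (PIT_CLOCK_HZ + tone_hz / 2u) / tone_hz;  /* round to nearest */
    if (div > 65535u) {
        return 0u;
    }
    if (div < 2u) {
        div = 2u;   /* keep both a high and a low phase in the square wave */
    }
    return (uint16_t)div;
}
```

A 1,000Hz tone needs a divisor of 1193, and A440 needs 2712. Because the divisor is an integer, the actual pitch is slightly off the requested one, which matters little for alert tones.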

Creating arbitrary tones with a DAC or PWM is a different task. Since such an output relies on samples at a regular rate, the first task is to choose that rate. According to the Nyquist Theorem, the highest frequency that can be represented by a sampled system is less than half of the sample rate. Therefore, if you want to create a 1,000Hz tone, you'll need a sample rate greater than 2,000Hz. To recreate clear human speech, which occupies a spectrum from 300 to 3,000Hz, you'll need a sample rate of greater than 6,000Hz. (For reference, the highest note on a piano is a bit less than 4,200Hz, while the upper limit of human hearing is in the region of 20kHz.) For simple warning or informational tones, 100 to 1,000Hz is a good range.

A sampled sound system may require low-pass filtering to remove the harsh, hissing sound of high-frequency switching transients caused by the transition from one sample to the next. As with a PWM output, a high sample rate makes filtering easier. If you're using a PWM output already, the low-pass filter that averages the variations in the duty cycle of the PWM pulse train will also even out the steps in the sample values.

After choosing a sample rate, you need a way to determine the analog values of the waveform for each successive sample. While it's possible to use the sine or cosine function (assuming one is available in your code), they tend to be costly in terms of CPU cycles. I prefer to use direct digital synthesis (DDS), which I'll describe later on. It's also possible to store sampled sounds captured with a microphone and digital recorder and play them back as required, but it's important to use a playback sample rate that is the same as the rate at which the sounds were recorded.

Sound design
The kinds of sounds your system should make depend a great deal on what you want to communicate and what kind of output device you've chosen to implement. A simple buzzer can make only a single tone, but it can be switched on and off with different timing. A short beep can be used to indicate success or some other benign condition. A longer beep, a repeating short beep, or a long beep switched at a low audio rate (which gives a discordant, “minor chord” sound) can indicate a warning or error condition. More information can be sent using an encoding scheme such as Morse code or specific numbers of beeps for each message.

(In a printer product I worked on several years ago, I modified the high-pitched error beep tone to be switched off and on at about a 100Hz rate. The resulting raspy-sounding squawk made it easier for the user to recognize that an error had occurred. Our support engineer was even able to hear and count the error beeps over the phone, which made the job of remotely debugging printers easier.)
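The gated beep described above amounts to ANDing a tone square wave with a much slower gate square wave. The sketch below shows the idea as a pure function of time; the 2,000Hz beep pitch used in the example is an assumption (the original tone frequency isn't stated), and `gated_tone_level` is a hypothetical name.

```c
#include <stdint.h>

/* Output level (0 or 1) at time t_us for a tone of tone_hz that is
   switched on and off at gate_hz. Both waves are 50% duty square waves. */
uint8_t gated_tone_level(uint32_t t_us, uint32_t tone_hz, uint32_t gate_hz)
{
    uint32_t tone_half;
    uint32_t gate_half;
    uint8_t tone;
    uint8_t gate;

    if (tone_hz == 0u || gate_hz == 0u) {
        return 0u;
    }
    tone_half = 500000u / tone_hz;  /* half period in microseconds */
    gate_half = 500000u / gate_hz;
    if (tone_half == 0u || gate_half == 0u) {
        return 0u;                  /* frequency too high for 1us resolution */
    }
    tone = (uint8_t)((t_us / tone_half) & 1u);
    gate = (uint8_t)(((t_us / gate_half) & 1u) == 0u);  /* on in first half */
    return (uint8_t)(tone & gate);
}
```

Calling `gated_tone_level(t, 2000u, 100u)` produces 5ms bursts of a 2kHz tone separated by 5ms of silence.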

If you have control over the pitch of your sound output device, either by controlling the frequency of an output pulse train or using amplitude modulation, you can play tones or tone sequences. If you've ever heard a ringing cellular phone play Beethoven's “Ode to Joy,” you know what I mean. While playing songs may sound trite in some applications, a sequence of two or three rising or falling tones can be used to signal various conditions. Don't get carried away with the duration of the notes or the length of the sequence—the purpose of most systems' sound capability is to inform the user, not slow him down or entertain him.
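A short tone sequence can be represented as a table of pitch and duration pairs. The sketch below is one possible layout; the note choices (roughly C5, E5, and G5), the durations, and the names `note_t`, `ok_chirp`, and `sequence_freq_at` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t freq_hz;  /* pitch of the note; 0 could be used for a rest */
    uint16_t dur_ms;   /* how long the note lasts */
} note_t;

/* A brief rising three-note "success" chirp, kept under 300ms overall. */
static const note_t ok_chirp[] = {
    { 523u,  80u },
    { 659u,  80u },
    { 784u, 120u },
};

/* Frequency that should be playing elapsed_ms after the sequence started,
   or 0 once the sequence is over. */
uint16_t sequence_freq_at(const note_t *seq, size_t n, uint32_t elapsed_ms)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (elapsed_ms < seq[i].dur_ms) {
            return seq[i].freq_hz;
        }
        elapsed_ms -= seq[i].dur_ms;
    }
    return 0u;
}
```

Reversing the table gives a falling sequence that could signal the opposite condition.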

If you can control the waveform being played through the speaker or transducer with a DAC or PWM, you have many options. Simple tones, note sequences, musical chords, and even prerecorded voices or sound effects are available to you. Again, don't let your system drone on and on just because it can—sound is best used sparingly. In fact, it's been said that half of all music is silence. A moment of well-placed quiet can say a lot about a system: everything's okay, the keystroke wasn't accepted, or the power is off.

So what should you say with sound? Like any other good engineering question, the honest answer is “it depends.” What does your system do? Is it a pinball machine or some other sort of entertainment device? Then music, sound effects, or voices may be in order. These are best served by a sampled audio scheme played back through a DAC or PWM output.

If your product is a programmable controller, acknowledging key presses and other user inputs with a short, staccato beep may be appropriate. Indicating an error with a longer beep or a dangerous condition with a siren might also be wise. Single tones can be made with a buzzer, but varying tones or a siren requires at least frequency modulation.

Choosing the correct sound for a given situation is tricky. It's a choice heavily influenced by human psychology. People associate certain sounds with certain events: a siren, glass shattering, discordant notes, or even a single note held too long all remind us of unpleasant situations and make good candidates for error or warning signals. Short tones, clicks, or short pleasant melodies remind us of happy times and successful functions. Also consider the user's convenience. Does he really need sound for every event? Should she be able to command the system to silence?

I'm a firm believer in letting the user turn off the sound if he doesn't want it. However, be aware that some situations demand attention regardless of the user's wishes: situations that represent possible loss of life, health, or property. Consider letting the user choose silence, then consider the possibility of ignoring their choice. Safety first!

Sound alternatives
While this article has focused on using sound as a user-interface device, it can also be used as a means of communication between computer systems. The ubiquitous Touch-Tone telephone, for example, uses audio tones to transmit a desired phone number to the switching system. The encoding used is called dual-tone multi-frequency (DTMF) signaling, which uses a set of eight nonharmonically related tones played in pairs to indicate each of 16 possible keys. (The basic telephone handset has just 12 keys [0 through 9, *, and #], but the DTMF standard adds the letters A through D.) Table 1 shows the tone frequency pairs and their corresponding characters.

Integrated circuits that can generate DTMF tone pairs in response to either values written by a controller or read directly from a keypad have existed for many years and can be a viable solution for a DTMF-capable device. But the tones are simple enough to generate in software that the additional cost of a chip may not be necessary.
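For a software generator, the key-to-tone mapping can live in a small lookup. The sketch below uses the standard DTMF frequency assignments (four low-group "row" tones and four high-group "column" tones); the array and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Standard DTMF tone groups, in Hz. */
static const uint16_t dtmf_row_hz[4] = { 697u, 770u, 852u, 941u };
static const uint16_t dtmf_col_hz[4] = { 1209u, 1336u, 1477u, 1633u };

/* The 16 keys in row-major order, matching the keypad layout. */
static const char dtmf_keys[] = "123A456B789C*0#D";

/* Look up the tone pair for a key. Returns 0 on success, -1 if the
   character is not a DTMF key. */
int dtmf_lookup(char key, uint16_t *low_hz, uint16_t *high_hz)
{
    const char *p = (key != '\0') ? strchr(dtmf_keys, key) : NULL;
    int idx;

    if (p == NULL) {
        return -1;
    }
    idx = (int)(p - dtmf_keys);
    *low_hz  = dtmf_row_hz[idx / 4];
    *high_hz = dtmf_col_hz[idx % 4];
    return 0;
}
```

For instance, the key 5 maps to 770Hz and 1336Hz, and # maps to 941Hz and 1477Hz.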

A favorite method for generating a sine wave at a desired frequency is the direct digital synthesizer, or DDS. The DDS uses a free-running counter (the phase accumulator) to index a lookup table with values that reflect the amplitude of a sine wave or other periodic waveform. Once each sample period, the phase accumulator is incremented by the phase increment, a value that determines the speed at which the phase accumulator overflows. The sample period is a regular time interval at which points on the waveform are output. The larger the phase increment, the higher the output frequency. The rate at which the phase accumulator overflows determines the actual output frequency.
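The relationship between the phase increment and the output frequency can be sketched as follows, assuming a 16-bit phase accumulator (the width is a choice made for this example, as are the names `dds_t`, `dds_phase_increment`, and `dds_step`). With an N-bit accumulator, the output frequency is the increment times the sample rate divided by 2^N.

```c
#include <stdint.h>

typedef struct {
    uint16_t phase;      /* phase accumulator; wraps at 65536 */
    uint16_t increment;  /* added to the accumulator once per sample */
} dds_t;

/* Phase increment for tone_hz at sample_hz:
   increment = tone_hz * 2^16 / sample_hz, rounded to nearest.
   Returns 0 (no output) if the tone is at or above half the sample rate. */
uint16_t dds_phase_increment(uint32_t tone_hz, uint32_t sample_hz)
{
    if (sample_hz == 0u || 2u * (uint64_t)tone_hz >= sample_hz) {
        return 0u;
    }
    return (uint16_t)((((uint64_t)tone_hz << 16) + sample_hz / 2u) / sample_hz);
}

/* Advance one sample period. Returns 1 when the accumulator overflowed,
   which marks the completion of one output cycle. */
int dds_step(dds_t *d)
{
    uint16_t before = d->phase;

    d->phase = (uint16_t)(d->phase + d->increment);
    return d->phase < before;
}
```

For a 1,000Hz tone at an 8,000Hz sample rate the increment is 8192, so the accumulator overflows once every eight samples, or 1,000 times per second.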

Values from the lookup table are sent to a suitable DAC or PWM circuit to be converted to analog signals. As with any sampled scheme, you'll need to use some low-pass filtering on the output to remove the stair-step quantization error from the signal. However, the filtering requirements are less stringent than for a simple square-wave output. As I mentioned earlier, the Nyquist Theorem dictates that the sample rate be more than twice the highest expected output frequency. To make filtering easier, use the highest sample rate that you can support in your system.

The DDS is suitable for any transducer that can be driven with an analog signal. Most commonly this is a dynamic speaker, but a piezo element with a frequency response wide enough to accommodate the expected output frequencies will do as well. Narrowband transducers like the metal-disk speaker will limit the versatility of the DDS somewhat.

Given the sample period and frequency, one could determine the analog output value with a sine function. But most trig functions, including the sine, are computationally intensive; the necessary library routines may not even exist in some simple systems. With DDS, a set of sine samples is precalculated and stored in a lookup table before the code is even compiled.

Generating two sine waves with a DDS is almost as easy as generating one; all it takes is a second phase accumulator and phase increment, plus a quick sum or average of the two sample values. Listing 1 shows a software-only DDS that is suitable for generating DTMF tones with a PWM or other unipolar (only positive sample values) DAC scheme. The complete code, including the lookup table, is available at ftp://ftp.embedded.com/pub/2003/06hinerman.

Listing 1: DTMF generation using DDS
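The following is a minimal sketch of the technique rather than the original listing. It assumes a 32-entry unipolar sine table, 16-bit phase accumulators, and an 8-bit output; the table values, structure, and function names are all choices made for this example.

```c
#include <stdint.h>

/* One sine cycle, 32 samples, offset so all values are positive
   (128 + 127*sin), suitable for a PWM or other unipolar output. */
static const uint8_t sine32[32] = {
    128, 153, 177, 199, 218, 234, 245, 253,
    255, 253, 245, 234, 218, 199, 177, 153,
    128, 103,  79,  57,  38,  22,  11,   3,
      1,   3,  11,  22,  38,  57,  79, 103
};

typedef struct {
    uint16_t phase_lo, inc_lo;  /* low-group (row) tone */
    uint16_t phase_hi, inc_hi;  /* high-group (column) tone */
} dtmf_dds_t;

/* increment = tone_hz * 2^16 / sample_hz, rounded to nearest.
   Assumes tone_hz is below half of sample_hz. */
static uint16_t phase_inc(uint32_t tone_hz, uint32_t sample_hz)
{
    return (uint16_t)((((uint64_t)tone_hz << 16) + sample_hz / 2u) / sample_hz);
}

void dtmf_dds_init(dtmf_dds_t *g, uint32_t low_hz, uint32_t high_hz,
                   uint32_t sample_hz)
{
    g->phase_lo = 0u;
    g->phase_hi = 0u;
    g->inc_lo = phase_inc(low_hz, sample_hz);
    g->inc_hi = phase_inc(high_hz, sample_hz);
}

/* Call once per sample period; returns the next 8-bit output value.
   The top 5 bits of each accumulator index the 32-entry table, and the
   two samples are averaged so the result still fits in 8 bits. */
uint8_t dtmf_dds_sample(dtmf_dds_t *g)
{
    uint16_t lo = sine32[g->phase_lo >> 11];
    uint16_t hi = sine32[g->phase_hi >> 11];

    g->phase_lo = (uint16_t)(g->phase_lo + g->inc_lo);
    g->phase_hi = (uint16_t)(g->phase_hi + g->inc_hi);
    return (uint8_t)((lo + hi) / 2u);
}
```

To send the key 5 at an 8,000Hz sample rate, one would call `dtmf_dds_init(&g, 770u, 1336u, 8000u)` and then write the result of `dtmf_dds_sample(&g)` to the PWM or DAC once per sample period. A larger table or interpolation would reduce distortion at the cost of memory or CPU time.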

Another excellent illustration of system sound techniques is an amateur radio project that's documented on the Web. For several years, radio amateurs have used a modulation method called PSK31 for keyboard-to-keyboard chats on the air. PSK31 uses phase-shift keying of a carrier to transmit bits of data. Usually a program running on a personal computer generates a phase-shifted audio carrier and outputs it from the sound card. A single-sideband (SSB) transmitter converts the audio to RF and transmits it out the antenna. An SSB receiver recovers the audio carrier, and a computer equipped with a sound card demodulates the phase-shift keying at the other end and displays the original text.

George Heron (amateur radio callsign N2APB) has developed a simple device that generates a PSK31 audio message. Links to George's schematic, source code, and other documents are at www.njqrp.org/psk31beacon/psk31beacon.html. The module uses a Ubicom SX28AC/DP microcontroller and a simple resistor array instead of a DAC chip. A personal computer running a PSK31 program (available through aintel.bi.ehu.es/psk31.html) and a microphone plugged into its sound card's input can hear and demodulate the audio signal. George's design also illustrates how to use the LM386 amplifier chip.

Bring on the noise
When designing a system, especially a low-cost device where pennies count, we need to use our imaginations to come up with effective yet inexpensive user interfaces. We rarely have the luxury of a color video display. Using sound as an output device requires relatively few system resources for the amount of information it can deliver. Sound can be used to communicate with a user, or it can be used to communicate with other systems over narrow-bandwidth channels. When you factor in the inherent advantages of sound in general (it is omnidirectional, and humans can learn to identify good and bad sounds quickly), you'll find that it gives a lot of “bang for the buck.”

David Hinerman has been developing embedded software and systems for over 20 years and is currently employed by Ametek Power Instruments. His designs have included high-reliability electricity meters, communication and control devices, and printers. David has an AS in electronics technology from Hocking Technical College.
