CMP EMBEDDED.COM

PRODUCT HOW-TO: Improving real-time voice quality in a VoIP-based telephony design



The general-purpose SoCs used by today's cordless or IP phones, integrated access devices and wireless unified communications devices fully support the software DSP (soft-DSP) required for VoIP by integrating a software voice engine within the system software.

Voice engines fit within an embedded processor's system performance capabilities using soft-DSP implementation techniques. To guarantee telephony-quality voice performance for VoIP, the system software must meet the real-time requirements of the voice engine.

Next-generation soft-DSP products that incorporate both real-time processing and wideband (high definition) voice communication achieve greater end user satisfaction and market potential than current technology. These products set a new high definition standard for voice communication.

Figure 1. The use of a DMA peripheral to collect audio samples into a buffer for servicing by the voice engine is a more efficient approach than CPLD implementation.

This article discusses how to integrate a voice engine for soft-DSP processing in order to exceed telephony quality communication.

When the real-time requirements are not met, the symptoms of poor voice quality include voice dropout, noticeable delay, pops or clicks, fax/modem call failure or corrupt fax pages, and incomprehensible speech due to packet loss or excessive delay.

Failure to meet real-time requirements results in a missed deadline; this may be a critical system failure requiring a full system reset, unless the system supports recovery in hardware and software.

Minimize delay
Voice communication in telephony calls is bi-directional: Transmission and reception of audio occur simultaneously. Thus, it is critical to minimize delay within the voice system to ensure audio quality. However, delay-minimizing optimizations conflict with meeting the demands of voice processing.

In traditional playback audio systems, such as audio (MP3) playback or multimedia streaming, buffering can increase significantly to compensate for lack of system processing capability; delay is independent of quality.

The voice engine does not have this option, as an audio buffer must be fully processed within a fixed time. This is architected through interrupt prioritization and software scheduling, leveraging and, in some cases, enhancing the operating system's real-time capabilities to guarantee voice processing completion.

In a voice engine system, a software interrupt service routine exchanges voice samples with a voice hardware codec. The voice hardware codec converts analog signals to and from audio samples with a sampling rate of 8kHz.

For telephony applications, the hardware codec is connected to a subscriber line interface circuit (SLIC) as the telephony physical interface, or to a DECT radio, for cordless handsets.

For IP phones or mobile handsets, the hardware codec is connected to an amplifier, which connects to a microphone and loudspeaker.

Figure 2. Listed are the voice engine timing requirements.

The SoC hardware interfaces play a large role in guaranteeing both real-time performance and accurate scheduling of the voice engine. If the SoC has a TDM or AC97 peripheral, a telephony voice codec is directly interfaced to the processor.

If the embedded processor is missing this peripheral, the lowest-cost solution is to interface a CPLD to the processor. The CPLD sends and receives samples to and from the hardware codec on a sample-by-sample basis, representing the most time-sensitive system solution and the worst-case timing requirements.

Servicing the interrupt
Whether through TDM, AC97 or CPLD, the servicing of the voice hardware must be prioritized to ensure that the interrupt is serviced; other system software must not block this interrupt's critical timing. At 8kHz sampling rate, the interrupt will occur every 125µs.

For an SoC running at 200MHz, the duration of the speed-optimized CPLD interrupt service routine requires processing time of 25µs. This allows the maximum interrupt latency to be calculated as 90µs (125µs - (25µs + 10µs for interrupt servicing setup time)).
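The latency budget above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of any vendor toolkit; the function name `max_interrupt_latency_us` and its parameters are hypothetical, and the numbers are the ones quoted in the text (8kHz sampling, 25µs service routine, 10µs setup).

```python
def max_interrupt_latency_us(sample_rate_hz, isr_duration_us, setup_us):
    """Return the latency budget left before the next codec interrupt.

    One interrupt fires per sample, so the period is 1/sample_rate.
    The routine's own run time and its setup time are subtracted
    from that period; what remains is the tolerable latency.
    """
    period_us = 1_000_000 / sample_rate_hz  # 125 us at 8 kHz
    return period_us - (isr_duration_us + setup_us)


# Figures from the text: 8 kHz codec, 25 us ISR, 10 us setup.
budget = max_interrupt_latency_us(8000, isr_duration_us=25, setup_us=10)
print(budget)  # 90.0
```

Changing the service routine duration or the sampling rate shows how quickly the budget shrinks; a wideband codec with a higher sampling rate would leave proportionally less room.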

For the system to meet real-time deadlines, the OS must invoke the interrupt service routine within 90µs of receiving the codec interrupt, and the OS must allow the servicing to run immediately to completion.

The OS must also guarantee that the interrupt service routine can schedule the voice engine to perform immediate operation on the audio buffers; the interrupt service routine uses a buffer ready signal to activate this scheduling, as shown in the figure. A DMA peripheral is used to collect audio samples into a buffer for servicing by the voice engine, a more efficient approach than the CPLD implementation.
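The hand-off between the interrupt service routine and the voice engine can be modeled in a few lines. This is only an illustrative sketch: it assumes a two-buffer ("ping-pong") scheme and a 10ms buffer at 8kHz (80 samples), neither of which the article prescribes for the capture side, and the class and callback names are hypothetical. On real hardware the per-sample store would be done by the CPLD routine or by DMA, not by Python.

```python
SAMPLE_RATE_HZ = 8000
BUFFER_MS = 10  # assumed buffer length, matching the 10 ms processing interval
SAMPLES_PER_BUFFER = SAMPLE_RATE_HZ * BUFFER_MS // 1000  # 80 samples


class PingPongCapture:
    """Hypothetical model of sample capture into two alternating buffers."""

    def __init__(self, on_buffer_ready):
        self.buffers = [[0] * SAMPLES_PER_BUFFER, [0] * SAMPLES_PER_BUFFER]
        self.active = 0  # index of the buffer currently being filled
        self.index = 0   # next write position in the active buffer
        self.on_buffer_ready = on_buffer_ready

    def isr(self, sample):
        """Stand-in for the codec interrupt: store one sample and return."""
        self.buffers[self.active][self.index] = sample
        self.index += 1
        if self.index == SAMPLES_PER_BUFFER:
            full = self.active
            self.active ^= 1  # swap so capture continues without waiting
            self.index = 0
            # "Buffer ready" signal: hand the full buffer to the voice engine.
            self.on_buffer_ready(self.buffers[full])


# Usage: feed two buffers' worth of samples and record each hand-off.
ready = []
capture = PingPongCapture(lambda buf: ready.append(list(buf)))
for n in range(2 * SAMPLES_PER_BUFFER):
    capture.isr(n)
print(len(ready))  # 2
```

The point of the swap is that the voice engine has one full buffer period to finish its work before the buffer it is reading gets overwritten, which is the deadline discussed in the following paragraphs.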

The requirement for the voice engine is to complete before the next voice buffer is ready. The time required to process voice in the voice engine depends on several factors: the processor, cache size, RAM speed, number of physical voice interfaces (audio channels), the soft-DSP processing required for the buffer and the type of speech coders employed.

Timing needs
For complete analysis of the voice engine timing requirements, refer to the table. The tidle measurement indicates the time remaining for all other system processes and applications; from the voice engine design perspective, this is referred to as idle time.

All lower priority system processing occurs in the idle time after the voice engine completes real-time voice processing. In worst-case processing, the tidle may reach 0ms for several iterations of voice engine processing.

D2 Technologies' vPort software includes performance benchmarks for supported configurations. For example, a vPort release may specify the voice processing of a three-way G.729AB voice conference call, requiring a maximum of 100MHz of processing every 10ms in the voice engine, as worst-case and with cache continually flushed.

If running on a 400MHz RISC processor, tvoice will require 100MHz in worst-case processing (25 percent of CPU processing), which corresponds to 2.5ms of processing time in every 10ms processing interval.
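The conversion from a MHz benchmark to time per interval is a simple ratio. The sketch below uses the figures from the text; the helper name `voice_time_ms` is hypothetical and the calculation assumes the benchmark scales linearly with clock speed.

```python
def voice_time_ms(required_mhz, cpu_mhz, interval_ms):
    """Convert a worst-case MHz requirement into CPU load and time used.

    Returns (load, time_ms), where load is the fraction of the CPU
    consumed and time_ms is the portion of each interval spent in
    voice processing.
    """
    load = required_mhz / cpu_mhz
    return load, load * interval_ms


# Figures from the text: 100 MHz required on a 400 MHz CPU, 10 ms interval.
load, t_voice = voice_time_ms(required_mhz=100, cpu_mhz=400, interval_ms=10)
print(load, t_voice)  # 0.25 2.5
```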

The real-time deadline will be missed if tswitch is greater than 7.5ms (tswitch = tbuffer - (tvoice + tidle), with tbuffer at 10ms, tvoice at 2.5ms and tidle at its worst case of 0ms). This figure does not include the additional overhead introduced during voice engine processing by other peripheral interrupts, bottom halves or tasklets.
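A deadline check based on the formula in the text might look like the following. This is a sketch under stated assumptions: the function names are hypothetical, tidle is taken at its worst case of 0ms unless supplied, and overhead from other interrupts, bottom halves or tasklets is not modeled.

```python
def max_switch_ms(t_buffer_ms, t_voice_ms, t_idle_ms=0.0):
    """Largest tswitch that still fits: tbuffer - (tvoice + tidle)."""
    return t_buffer_ms - (t_voice_ms + t_idle_ms)


def misses_deadline(t_switch_ms, t_buffer_ms, t_voice_ms):
    """True if tswitch exceeds the budget with no idle time left."""
    return t_switch_ms > max_switch_ms(t_buffer_ms, t_voice_ms)


# Figures from the text: 10 ms buffer, 2.5 ms worst-case voice processing.
limit = max_switch_ms(10.0, 2.5)
print(limit)                             # 7.5
print(misses_deadline(8.0, 10.0, 2.5))   # True
print(misses_deadline(7.5, 10.0, 2.5))   # False
```

In practice a designer would measure tswitch on the target under load and keep a margin below this limit, since the unmodeled overheads eat into the same budget.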

These are the most important design criteria for the system designer to consider when integrating a voice engine for soft-DSP processing:

For maximum quality, voice communication requires minimizing system delays.

Voice communication is continuous; missing samples or missing a real-time deadline is a critical error.

The voice hardware has strict timing requirements and needs a method for error recovery in the case of missed timing.

The voice engine must complete real-time processing of each voice buffer within a 10ms software deadline.

The voice engine interrupt service routine has strict timing restrictions based on the CPU peripheral hardware.

Jonathan Cline is Senior Lead Engineer at D2 Technologies Inc.
