Reducing VoIP quality degradation when network conditions are unstable
Delay and packet loss can significantly affect the perceived quality of voice transmitted over packet networks. Packets travelling from source to destination may suffer from delay variation, may arrive out of order, or may even be lost. To compensate for delay variation, de-jitter buffers (more commonly called jitter buffers) are used at the receiving side of packet-based systems. Their role is to restore the correct order of packets and to give the “slower” packets time to arrive.
There are two major classes of jitter buffers: static and adaptive. Static jitter buffers have a fixed size, and the packets leaving them have a constant delay (measured from the moment the packet is produced until the moment it is consumed), whilst adaptive jitter buffers have variable size and variable delay. The management and implementation of jitter buffers are not specified by any standard, which has resulted in different implementations of both static and adaptive algorithms. There is also no general recipe for “a good” jitter buffer; it depends very much on the target application and the environment in which the application is used (e.g. a complex or memory-hungry jitter buffer implementation is unnecessary if the delay variation of the target network is very low).
We have focused our attention on developing an adaptive jitter buffer algorithm for a Voice over IP embedded application with restrictions on both memory consumption and processing power. Our objective was to build a base algorithm that uses less processing power when the network is “behaving properly” and to add mechanisms that minimize the quality degradation of the output streams when the network behavior becomes unpredictable.
Previous approaches
Different approaches to playout buffer algorithms have been studied in the literature. We have classified the algorithms we studied into four categories (a similar classification was proposed by Narbutt et al. in [1]):
- algorithms that establish the playout delay based on a continuous estimation of the network parameters
- statistics-based algorithms
- algorithms that maximize user satisfaction
- algorithms that use various heuristics and monitor certain parameters (e.g. late packets fraction, buffer occupancy, etc.).
The best-known algorithm in the first category is the one presented in [3]. It uses an autoregressive method to estimate the average network delay and its variance.
The algorithm estimates two parameters (delay and delay variance) and uses them to calculate the playout time.

$$d_i = \alpha\, d_{i-1} + (1 - \alpha)\, n_i$$
$$v_i = \alpha\, v_{i-1} + (1 - \alpha)\, |d_i - n_i|$$

where $d_i$ and $v_i$ are the i-th estimates of the delay and its variance respectively, while $n_i$ is the one-way delay of the i-th packet (as defined by RFC 2679).
α is a parameter that controls the jitter buffer adaptation speed. A lower value makes the jitter buffer sensitive to small variations in delay. A higher value makes it less sensitive to small delay variations, but it then adapts more slowly to sudden changes in the network delay. We propose a value of 0.998002, which corresponds to an exponential moving average over about 500 samples.
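As a quick check of the "about 500 samples" figure, assume the common rule of thumb that an exponential moving average with smoothing factor α has an effective window of roughly 1/(1 − α) samples (this convention is an assumption; other definitions of the equivalent window exist):

$$N \approx \frac{1}{1 - \alpha} = \frac{1}{1 - 0.998002} = \frac{1}{0.001998} \approx 500.5$$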
Based on the above estimates, the playout time is computed as:

$$p_i = t_i + d_i + \beta\, v_i$$

where $p_i$ is the playout time and $t_i$ is the send time. β is a factor that determines how much weight the delay variance has in the computation of the playout time; it is determined empirically. The authors proposed a value of 4.
The values $d_i$ and $v_i$ are computed for each packet received, but they are used to calculate the playout time only for the first packet in a talkspurt. (During silence periods an application may send occasional comfort-noise packets or may not send packets at all. The first packet of a talkspurt is the first packet following a silence period. The notion of a talkspurt and talkspurt identification are defined and discussed in RFC 3551 [22].)
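The estimator described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from [3]: it assumes the usual exponential-moving-average form of the updates (new estimate = α × previous estimate + (1 − α) × new sample, with the variation term tracking |d − n|), timestamps in milliseconds on comparable clocks, and seeding of the delay estimate with the first sample. The class name `PlayoutEstimator` and its fields are hypothetical.

```python
from dataclasses import dataclass

ALPHA = 0.998002  # smoothing factor quoted in the text
BETA = 4.0        # variance weight quoted in the text


@dataclass
class PlayoutEstimator:
    """Exponential-moving-average delay estimator (illustrative sketch)."""
    alpha: float = ALPHA
    beta: float = BETA
    d: float = 0.0        # running delay estimate
    v: float = 0.0        # running delay-variation estimate
    offset: float = 0.0   # d + beta * v, frozen at the start of a talkspurt
    primed: bool = False  # becomes True once the first packet has been seen

    def on_packet(self, sent: float, received: float,
                  first_in_talkspurt: bool) -> float:
        """Update the estimates and return the playout time of this packet."""
        n = received - sent  # one-way delay of this packet
        if not self.primed:
            # Assumption: seed the estimates with the first observed delay.
            self.d, self.v, self.primed = n, 0.0, True
            first_in_talkspurt = True
        else:
            # Estimates are updated for every received packet.
            self.d = self.alpha * self.d + (1.0 - self.alpha) * n
            self.v = self.alpha * self.v + (1.0 - self.alpha) * abs(self.d - n)
        if first_in_talkspurt:
            # The estimates are applied only at the start of a talkspurt.
            self.offset = self.d + self.beta * self.v
        return sent + self.offset
```

Within a talkspurt every packet is played out at its send time plus the same frozen offset, so the spacing between packets is preserved; the offset only changes at the next talkspurt boundary.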
Several modifications to this algorithm have been proposed. Some were proposed by the original authors in [3], who introduced detection of spikes in the network delay and different adaptation speeds for delay increase and decrease. Another suggested improvement was to use a dynamic value for α ([4], [10]) that depends on the network conditions (higher values for stable periods and lower values for unstable ones).
The algorithm presented above is in effect an exponential moving average filter; other types of filter have also been proposed (e.g. an NLMS filter in [11]).
In the second category, a significant number of algorithms build the delay distribution function of the received packets. In [7] and [8] a histogram of the previous delays is computed and maintained. In [9] the parameters of the Pareto cumulative distribution function are continuously updated. The playout time is then computed such that the packet loss is kept below a defined threshold.
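The histogram approach can be illustrated with a short Python sketch. It is not the algorithm of [7] or [8]; it only shows the general idea of keeping a histogram of recent delays and picking the smallest playout delay that keeps the fraction of late packets within a threshold. The bin width, the sliding-window length and all names (`DelayHistogram`, `playout_delay`) are assumptions made for the example.

```python
from collections import Counter, deque


class DelayHistogram:
    """Sliding-window histogram of observed one-way delays (illustrative sketch)."""

    def __init__(self, bin_ms: int = 10, window: int = 1000) -> None:
        self.bin_ms = bin_ms      # hypothetical bin width in milliseconds
        self.window = window      # hypothetical number of samples remembered
        self.samples = deque()    # bin index of each sample, oldest first
        self.counts = Counter()   # bin index -> number of samples in that bin

    def add(self, delay_ms: float) -> None:
        """Record one delay sample, evicting the oldest once the window is full."""
        b = int(delay_ms // self.bin_ms)
        self.samples.append(b)
        self.counts[b] += 1
        if len(self.samples) > self.window:
            old = self.samples.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def playout_delay(self, max_late_fraction: float) -> int:
        """Smallest bin upper edge (ms) keeping late packets within the threshold."""
        if not self.samples:
            raise ValueError("no delay samples recorded yet")
        allowed_late = max_late_fraction * len(self.samples)
        late = len(self.samples)
        for b in sorted(self.counts):
            late -= self.counts[b]  # packets in this bin would now arrive in time
            if late <= allowed_late:
                return (b + 1) * self.bin_ms
        # Only reached for a negative threshold: cover every observed delay.
        return (max(self.counts) + 1) * self.bin_ms
```

For example, with 90 samples at 25 ms, 9 at 45 ms and 1 at 95 ms, a 5% late-packet threshold selects a 50 ms playout delay, while a 0% threshold requires 100 ms. A parametric variant, as in [9], would fit a distribution instead of storing the bins.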
The third category contains several algorithms that build functions for measuring user satisfaction. In [8], the authors established a relationship between the MOS value, the packet loss ratio and the playout delay, assuming a Pareto distribution of the network delays. The playout delay is then chosen to maximize the MOS function.
The algorithms in the fourth category monitor parameters such as buffer occupancy, loss percentage, late packets fraction, etc. In [12], the buffer occupancy is monitored over time and the delay is reduced when the buffer consistently contains more than N frames for a defined period of time.
In [13], occupancy watermarks are introduced to define the buffer occupancy thresholds at which there is a risk of overflow or underflow. The area between the two limits (underflow and overflow) is considered the normal mode of operation. When the buffer occupancy falls outside this area, the jitter buffer enters an adaptation phase with the aim of returning the occupancy to the targeted area.
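A rough Python sketch of the occupancy rule described for [12] follows. The source does not give the details of that algorithm, so the periodic sampling of the occupancy, the reset after signalling and the names (`OccupancyMonitor`, `hold_ticks`) are assumptions for illustration only.

```python
class OccupancyMonitor:
    """Signals a delay reduction after sustained high buffer occupancy (sketch)."""

    def __init__(self, max_frames: int, hold_ticks: int) -> None:
        self.max_frames = max_frames  # the threshold N from the text
        self.hold_ticks = hold_ticks  # hypothetical length of the observation period
        self.run = 0                  # consecutive ticks spent above the threshold

    def tick(self, occupancy: int) -> bool:
        """Sample the occupancy once; return True when the delay should be reduced."""
        if occupancy > self.max_frames:
            self.run += 1
        else:
            self.run = 0  # any dip below the threshold restarts the period
        if self.run >= self.hold_ticks:
            self.run = 0  # restart the observation period after signalling
            return True
        return False
```

Requiring the occupancy to stay above the threshold for the whole period avoids reacting to a single burst of packets.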
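The watermark idea can be sketched as a simple classification of the current occupancy followed by a corrective step. This is an illustrative sketch rather than the scheme of [13]: the fixed step size, the direction of the correction (more delay near underflow, less near overflow) and the names `classify` and `delay_adjustment` are assumptions.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    UNDERFLOW_RISK = "underflow_risk"
    OVERFLOW_RISK = "overflow_risk"


def classify(occupancy: int, low_mark: int, high_mark: int) -> Mode:
    """Place the current occupancy relative to the two watermarks."""
    if low_mark >= high_mark:
        raise ValueError("low_mark must be below high_mark")
    if occupancy < low_mark:
        return Mode.UNDERFLOW_RISK
    if occupancy > high_mark:
        return Mode.OVERFLOW_RISK
    return Mode.NORMAL


def delay_adjustment(occupancy: int, low_mark: int, high_mark: int,
                     step_ms: int = 10) -> int:
    """Signed change (ms) to the playout delay; zero while in normal mode."""
    mode = classify(occupancy, low_mark, high_mark)
    if mode is Mode.UNDERFLOW_RISK:
        return step_ms    # assumption: add delay so the buffer can refill
    if mode is Mode.OVERFLOW_RISK:
        return -step_ms   # assumption: remove delay so the buffer drains
    return 0
```

With watermarks at 2 and 8 frames, an occupancy of 5 leaves the delay unchanged, an occupancy of 1 asks for 10 ms more delay, and an occupancy of 9 asks for 10 ms less.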

