# Using statistics to make sense of noisy data

Oscilloscopes do amazing things with the large amounts of data they generate from making measurements. Your oscilloscope lets you “see through” noise and reduce measurement uncertainty. The magic happens through statistics applied to a large data set. While some processing, like histograms, is obviously statistically based, some of the statistics are hidden. In either case, you can take advantage of your oscilloscope's statistical analysis.

Consider a basic oscilloscope measurement of a noisy square wave as shown in Figure 1.

*Figure 1. Noise on a square wave adds difficulty in finding the signal's amplitude.*

Noise on the square wave makes measuring its amplitude difficult. Amplitude forms the basis of other measurements such as width, rise times, fall times, overshoot, and even to some extent frequency and period. For example, the rise time measurement is the time needed for the signal to change from 10% to 90% of its amplitude. Width is the time difference between transitions with opposite slopes at 50% of the signal amplitude. So, determining the amplitude is critical for almost all other measurements.
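To make the dependence on amplitude concrete, here is a minimal sketch of a 10% to 90% rise-time calculation. It is not how any particular oscilloscope implements the measurement; the function names `crossing_time` and `rise_time`, the linear interpolation between samples, and the ideal ramp used as test data are all assumptions made for illustration.

```python
def crossing_time(times, values, level):
    """Return the first upward crossing of `level`, linearly interpolated."""
    for i in range(len(values) - 1):
        v0, v1 = values[i], values[i + 1]
        if v0 < level <= v1:
            t0, t1 = times[i], times[i + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None  # the waveform never crosses this level


def rise_time(times, values, base, top):
    """Time between the 10% and 90% amplitude crossings of a rising edge."""
    amplitude = top - base
    t10 = crossing_time(times, values, base + 0.1 * amplitude)
    t90 = crossing_time(times, values, base + 0.9 * amplitude)
    return t90 - t10


# Hypothetical edge: flat at 0 V, a linear ramp from t = 5 ns to t = 15 ns, flat at 1 V.
times_ns = list(range(21))
volts = [min(max((t - 5) / 10, 0.0), 1.0) for t in times_ns]
edge_rise_ns = rise_time(times_ns, volts, base=0.0, top=1.0)
print(f"rise time: {edge_rise_ns:.2f} ns")
```

Note that both thresholds are derived from `base` and `top`, so any error in those two values shifts the crossing levels and therefore the reported rise time.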

**Averaging**

Almost all oscilloscopes provide an averaging function, the most common statistical process applied to waveforms. The oscilloscope acquires multiple waveforms, adds them point by point, and divides the sum by the number of waveforms. The result is the average, or mean, waveform, as in Figure 2. The upper trace is the acquired waveform. The lower trace is the average of one thousand acquisitions. Averaging has suppressed the noise, leaving a clean waveform.

*Figure 2. Averaging adds multiple waveforms and normalizes the sum by the number of waveforms acquired, thus reducing noise.*

For Gaussian-distributed noise, the noise amplitude decreases in inverse proportion to the square root of the number of waveforms averaged. So, averaging one thousand acquired waveforms decreases the noise amplitude by a factor of 31.6, or about 30 dB. The only downside of the averaging process is the need to acquire a large number of waveforms.
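The square-root relationship can be checked with a short simulation. The sketch below is illustrative only: the waveform shape, the 0.2 V noise level, the record length, and the helper names `noisy_square` and `average_waveforms` are assumptions, not values taken from the figures.

```python
import math
import random
import statistics


def noisy_square(n_points, period, top, base, sigma, rng):
    """One simulated acquisition: an ideal square wave plus Gaussian noise."""
    half = period // 2
    return [(top if (i % period) < half else base) + rng.gauss(0.0, sigma)
            for i in range(n_points)]


def average_waveforms(waveforms):
    """Add the waveforms point by point and divide by their count."""
    n = len(waveforms)
    return [sum(points) / n for points in zip(*waveforms)]


rng = random.Random(1)   # fixed seed so the run is repeatable
n_avg = 1000             # number of acquisitions averaged
sigma = 0.2              # assumed noise level in volts

acquisitions = [noisy_square(200, 100, 1.0, 0.0, sigma, rng) for _ in range(n_avg)]
averaged = average_waveforms(acquisitions)

# Residual noise is what remains after subtracting the ideal square wave.
ideal = [1.0 if (i % 100) < 50 else 0.0 for i in range(200)]
residual = [a - b for a, b in zip(averaged, ideal)]
noise_after = statistics.pstdev(residual)

expected_after = sigma / math.sqrt(n_avg)
improvement_db = 20 * math.log10(math.sqrt(n_avg))
print(f"noise after averaging: {noise_after:.4f} V (expected about {expected_after:.4f} V)")
print(f"improvement: {improvement_db:.1f} dB")
```

The measured residual noise should land close to `sigma / sqrt(n_avg)`, which is the factor-of-31.6 reduction described above.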

**Another approach for statistics**

Statistics can also quantify uncertainty. If each sample value of the square wave is used to create a histogram of its instantaneous amplitude value, you can begin to see the structure of our waveform as in the composite graphic in Figure 3. The histogram was generated, rotated so that its amplitude scale was vertical to match the acquired waveform, and then superimposed on the signal waveform.

*Figure 3. After generating the histogram, I rotated it so that it aligns with the top and base amplitudes of the waveform. Doing so yields the signal's amplitude, even at low signal-to-noise ratios.*

The histogram has two peaks. One corresponds to the higher level of the square wave, called the top, while the other corresponds to the lower level, or base. The mean values of the respective histogram peaks represent the top and base values. The square wave's amplitude is the difference between the top and base amplitudes. Knowing the amplitude allows computation of all the other pulse parameters as shown in Fig. 2. Statistics let you see through the random parts of the data to extract meaningful information from chaotic noise effects. This technique works on a single acquisition and doesn't require multiple acquisitions. Multiple acquisitions do, however, improve the accuracy of the measurement.
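The histogram method can be sketched in a few lines. This is a simplified illustration rather than an oscilloscope's actual algorithm: it assumes the two peaks sit on either side of the midpoint of the amplitude range, and the bin count, noise level, and function names (`histogram`, `weighted_mean`, `top_base_amplitude`) are hypothetical.

```python
import random


def histogram(samples, n_bins):
    """Count samples per amplitude bin; return the bin centers and counts."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in samples:
        # Clamp so the maximum sample falls into the last bin.
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, counts


def weighted_mean(centers, counts):
    """Mean amplitude of a group of bins, weighted by population."""
    return sum(c * n for c, n in zip(centers, counts)) / sum(counts)


def top_base_amplitude(samples, n_bins=100):
    """Estimate top, base, and amplitude from the two histogram peaks."""
    centers, counts = histogram(samples, n_bins)
    half = n_bins // 2  # assumption: one peak lies in each half of the range
    base = weighted_mean(centers[:half], counts[:half])
    top = weighted_mean(centers[half:], counts[half:])
    return top, base, top - base


# Single simulated acquisition: 1 V square wave with 0.1 V Gaussian noise (assumed values).
rng = random.Random(7)
samples = [(1.0 if (i % 100) < 50 else 0.0) + rng.gauss(0.0, 0.1) for i in range(2000)]
top, base, amplitude = top_base_amplitude(samples)
print(f"top={top:.3f} V  base={base:.3f} V  amplitude={amplitude:.3f} V")
```

Although individual samples stray several tenths of a volt from the true levels, the means of the two peaks recover the top and base from this one acquisition.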

**Measurement statistics**

Statistics can also be applied to advantage to any of the measurement parameters available in the oscilloscope. Figure 4 is an example of a measurement of a clock signal's time interval error (TIE).

*Figure 4. A TIE measurement of a 333 MHz clock with measurement statistics including mean, minimum (min), maximum (max), standard deviation (SDEV), and the number of measurements included in the statistics.*

The upper trace of Figure 4 shows a 333-MHz clock signal, while the lower trace is a zoom showing the trace horizontally expanded. TIE is a measure of the time difference between an acquired edge and its ideal location in time. Think of it as a signal's instantaneous phase. The oscilloscope performs TIE measurements on each edge of the acquired waveforms, which is called an “all instance” measurement. The measurement readout field has been expanded to make it easier to read. The readout shows the last TIE measurement made as the value, in this case 8.2 ps. It also shows the mean, minimum, maximum, standard deviation, and the number of measurements included in the statistical values.

In this case, the statistics include over a million values. The mean is the average value of all those measurements, which is zero in this case. Because the individual TIE values are not zero, a zero mean indicates that the TIE measurements take both positive and negative values that average to zero. The minimum is the lowest TIE value encountered and the maximum is the largest. The difference between the maximum and minimum is the statistical range of the measurements. In the example, the minimum is -34.3 ps and the maximum is 40.7 ps, which confirms both positive and negative values. The standard deviation, often referred to as sigma, is a measure of the spread of the sample values about the mean. Since the mean is zero, the standard deviation is equivalent to the TIE's root mean square (rms) value. For a Gaussian distribution, about 68% of all measurement values lie within ± one standard deviation of the mean.
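The readout fields are simple to reproduce on any list of measurements. The sketch below uses simulated data, not the values behind Figure 4: the zero-mean Gaussian model, the 10 ps sigma, and the sample count are assumptions chosen only to show that the rms value matches the standard deviation when the mean is zero.

```python
import math
import random
import statistics

rng = random.Random(42)
# Hypothetical TIE readings in picoseconds: zero-mean Gaussian with a 10 ps sigma.
tie_ps = [rng.gauss(0.0, 10.0) for _ in range(100_000)]

mean = statistics.fmean(tie_ps)
minimum = min(tie_ps)
maximum = max(tie_ps)
value_range = maximum - minimum           # statistical range
sdev = statistics.pstdev(tie_ps)          # standard deviation (sigma)
rms = math.sqrt(statistics.fmean([v * v for v in tie_ps]))

# Fraction of readings within one standard deviation of the mean.
within_one_sigma = sum(1 for v in tie_ps if abs(v - mean) <= sdev) / len(tie_ps)

print(f"mean={mean:.3f} ps  min={minimum:.1f} ps  max={maximum:.1f} ps")
print(f"range={value_range:.1f} ps  sdev={sdev:.3f} ps  rms={rms:.3f} ps")
print(f"within one sigma: {within_one_sigma:.1%}")
```

In general the three quantities are related by rms² = sdev² + mean², so the rms value and the standard deviation coincide only when the mean is zero.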

Under the numerical measurement readout is an iconic histogram, called a “histicon” that shows the distribution of the measurement values. A histogram plots the number of samples with values within a small range, or bin, versus the value. The bell-shaped distribution of the “histicon” is characteristic of the Gaussian or normal distribution.

The measurement statistics provide a concise description of the million measurements. We know the average value, the largest and smallest measurements, and the shape of the histogram distribution, which appears Gaussian.

You can easily look at the histogram for the TIE measurement in greater detail. Clicking on the “histicon” will open a math trace containing the histogram for closer analysis as revealed in Figure 5.

*Figure 5. Expanding the “histicon” lets you study the histogram of the TIE measurement. Histogram parameters read the mean, standard deviation, and range.*

Trace F1 shows the histogram of the TIE measurement values. A histogram can itself be a source of measurements, and three histogram parameters have been enabled to show its mean, standard deviation, and range. The parameter markers graphically show the basis of each histogram parameter. Histograms are not available in all oscilloscopes; in most cases they are an optional feature.

TIE represents a form of timing uncertainty, or jitter. The histogram parameters quantify the jitter readings. The standard deviation is the rms jitter and the range parameter represents the peak-to-peak jitter. These values measure the actual jitter over the number of values included in the histogram. This information can be extrapolated to project jitter values out to 10^{12} or more values. Such extrapolation is required for jitter testing in many high-speed serial data standards, and most high-end oscilloscopes offer serial data analysis options that perform it.
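As a rough sketch of the idea behind such an extrapolation, the code below projects peak-to-peak jitter from the rms value under the assumption that the jitter is purely Gaussian. That is a simplification: serial data analysis packages typically separate random and deterministic jitter before extrapolating. The function name `projected_pk_pk` and the 10 ps sigma are hypothetical.

```python
from statistics import NormalDist


def projected_pk_pk(sigma, n_values):
    """Peak-to-peak span a Gaussian of width `sigma` is expected to reach over
    `n_values` readings, taking a tail probability of 1/n_values on each side."""
    q = -NormalDist().inv_cdf(1.0 / n_values)  # number of sigmas at that tail probability
    return 2.0 * q * sigma


sigma_ps = 10.0  # assumed rms jitter in picoseconds
pk_pk_1e6 = projected_pk_pk(sigma_ps, 1e6)
pk_pk_1e12 = projected_pk_pk(sigma_ps, 1e12)
print(f"projected pk-pk over 1e6 values:  {pk_pk_1e6:.1f} ps")
print(f"projected pk-pk over 1e12 values: {pk_pk_1e12:.1f} ps")
```

For 10^{12} values the multiplier works out to roughly 14 times sigma, which shows why a measured range over a million values understates the jitter expected over much longer observation periods.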