The normal distribution
that the guy's only doing it for some doll --
Stubby Kaye and Johnny Silver, Guys and Dolls, 1955
This column is the fourth in a series on parameter estimation, leading up to the justly famous Kalman filter. The discipline is based on the fact that our knowledge of the state of any real-world system is limited to the measurements we can take -- measurements that are inevitably corrupted by noise.
Our challenge, then, is to determine the true state of the system, based on these imperfect measurements.
In previous columns, I've discussed parameter estimation in the context of curve fitting, taking a graphical approach to arrive at the method of least squares. The general idea is to take more measurements -- usually many more -- than the minimum needed to determine the system state. Then you crank the data through an algorithm that mitigates the effect of noisy data.
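To make that idea concrete, here is a minimal sketch in Python of a straight-line least-squares fit. Everything in it is an assumption made for illustration: the "true" line (true_a, true_b), the uniform noise of plus or minus 1, the 50 sample points, and the helper name fit_line are all hypothetical, not taken from the earlier columns. A line needs only two points, so the sketch fits the first two measurements and then all 50, and prints both results for comparison.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


rng = random.Random(1)          # fixed seed so the run is repeatable
true_a, true_b = 2.0, 0.5       # hypothetical "true" system state
xs = [float(i) for i in range(50)]
# Each measurement is the true value plus assumed uniform noise in [-1, 1].
ys = [true_a + true_b * x + rng.uniform(-1.0, 1.0) for x in xs]

a_min, b_min = fit_line(xs[:2], ys[:2])   # the bare minimum: two points
a_all, b_all = fit_line(xs, ys)           # many more than the minimum

print(f"true line:     a = {true_a:.3f}, b = {true_b:.3f}")
print(f"2-point fit:   a = {a_min:.3f}, b = {b_min:.3f}")
print(f"50-point fit:  a = {a_all:.3f}, b = {b_all:.3f}")
```

The two-point "fit" simply passes through both noisy measurements, so it inherits their errors in full. The 50-point fit has to compromise among all the measurements, which is where the smoothing comes from.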
Jack Crenshaw's Estimation Series
Part 1: Why all the math?
Part 2: Can you give me an estimate?
Part 3: Estimation interruptus
Part 4: The normal distribution
Of course, the whole point of the method of least squares is to smooth out noisy measurements. But we've never addressed the nature of the noise itself. We even estimated statistical parameters like mean, variance, and standard deviation, without ever defining these terms.
That has to change. In this column, we're going to look noise in the eye, and deal with its nature. We'll discuss the behavior of random processes, introducing notions like probability and probability distributions. For reasons that will become clear, we'll focus like a laser on a thing variously called the bell curve, Gaussian distribution, or normal distribution.
Now, I've been dealing with problems involving the normal distribution for many decades. But to my recollection, no one ever derived it for me. They just sort of plunked it down with little or no explanation.
This would usually be the place where I'd start deriving it for you, but I'm not going to do that either. The reason is the same one my professors had: The classical derivation is pretty horrible, involving power series of binomial coefficients.
Instead, I'm going to take a different approach here. I'm going to wave my arms a lot, and give you enough examples to convince you that the normal distribution is not only correct, but is inevitable.
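As a taste of that kind of arm-waving, here is a small Python sketch. It is an assumption on my part that dice make a fair stand-in for a random process, and the helper names sum_of_dice and show, the seed, and the trial count are all hypothetical. The sketch rolls one, two, and eight dice many times, tallies the totals, and prints a crude text histogram of each tally so you can watch the shape change as more dice are added together.

```python
import random
from collections import Counter


def sum_of_dice(num_dice, trials, rng):
    """Tally how often each total occurs when num_dice dice are rolled."""
    return Counter(
        sum(rng.randint(1, 6) for _ in range(num_dice))
        for _ in range(trials)
    )


def show(counts, width=50):
    """Print a text histogram, scaled so the tallest bar is `width` wide."""
    peak = max(counts.values())
    for total in range(min(counts), max(counts) + 1):
        # A Counter returns 0 for totals that never came up.
        bar = "#" * round(width * counts[total] / peak)
        print(f"{total:3d} {bar}")


rng = random.Random(42)         # fixed seed so the run is repeatable
for num_dice in (1, 2, 8):
    print(f"{num_dice} dice:")
    show(sum_of_dice(num_dice, 20000, rng))
    print()
```

This is a simulation, not a derivation, so it proves nothing. It only lets you compare the three printed shapes for yourself.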