CMP EMBEDDED.COM


Why all the math?



Embedded.com

Some notation
If there is truly a linear relationship between x and y, it's described by some linear equation:

$$y = ax + b \tag{2}$$

where a and b are constants with values we don't yet know. Our challenge is to determine those values that give the "best fit" to the experimental data. But to do that, we must first define what we mean, exactly, by the term "best fit." And to do that, we need to establish a few definitions.

Remember, we already have experimental data, consisting of measured values of x and y, collected together as data points (x1, y1), (x2, y2), and so forth. Let the coordinates of the ith point be xi and yi, respectively.

Never forget that these coordinates are not, repeat not, variables. They aren't going to change. They're data, with specific values recorded in our data book.

This concept may take a little getting used to. Years of algebra have conditioned us to think of parameters with names like x and y as unknowns, and parameters like a and b as constants. That isn't the case here. While a and b certainly look like constants in Equation 2, they are in fact the things whose values we seek. They're the unknowns.

Now, for every value xi there are actually two values of y. First, there's the measured value, yi. That, remember, is a constant. Then there's the value y would have, if it were generated by Equation 2. In general, these two values will be different, which is another way of saying that the measured data has an error, given by:

$$e_i = y_i - f(x_i) \tag{3}$$

For our straight-line function:

$$e_i = y_i - (a x_i + b) \tag{4}$$
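To make the idea of a per-point error concrete, here is a minimal Python sketch. It assumes the straight line is written as y = a*x + b and that the error is taken as the measured value minus the value the line predicts; the function name `residuals` and the sample measurements are made up for illustration and are not from the article.

```python
def residuals(xs, ys, a, b):
    """Return the error e_i = y_i - (a*x_i + b) for each measured point.

    Assumption: the line is y = a*x + b, and the error is
    'measured minus predicted'.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return [y - (a * x + b) for x, y in zip(xs, ys)]


# Hypothetical measurements. These are fixed data, not variables.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

# A trial guess for the unknowns a and b.
errors = residuals(xs, ys, a=2.0, b=1.0)
for x, y, e in zip(xs, ys, errors):
    print(f"x={x:.1f}  measured y={y:.1f}  error={e:+.2f}")
```

Notice which arguments we are free to change: the data lists stay fixed, while a and b are the knobs we turn.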

We need one last concept to complete the formulation. Remember that we typically have anywhere from a handful to thousands of data points, but we are looking for the values of only two parameters, a and b. It should be pretty obvious that no data point is any more important than any other (although we can adjust the weighting of them if we choose to). That is, there's not usually anything that requires that the line pass through the first point, or the last, or any other. Whatever criterion we're going to use for "best fit," it's probably going to be a bulk property that involves all the data points.

To that end, we're going to define a penalty function based on the individual errors. That penalty function is the sum of the squares of all the errors:

$$M = \sum_{i=1}^{n} e_i^2 \tag{5}$$

where n is the total number of data points. We'll assert that the best-fit values of a and b are those values that minimize M.
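As a rough illustration of the penalty function, the Python sketch below sums the squared errors for two trial lines over the same data. It assumes the line has the form y = a*x + b; the function name `penalty` and the sample data are hypothetical. Because each error is squared, the sign convention chosen for the error does not affect M.

```python
def penalty(xs, ys, a, b):
    """Return M, the sum of squared errors over all n data points.

    Assumption: the line is y = a*x + b, so each error is
    y_i - (a*x_i + b).
    """
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical measurements, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

# Two trial guesses for (a, b). The smaller M marks the better fit.
m_close = penalty(xs, ys, a=2.0, b=1.0)
m_far = penalty(xs, ys, a=1.0, b=2.0)
print(f"M for a=2, b=1: {m_close:.2f}")
print(f"M for a=1, b=2: {m_far:.2f}")
```

Comparing trial values this way only ranks guesses; it does not by itself find the minimizing a and b.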
