# Beyond Regression: Line-Fitting Algorithms for Exceptional Cases: Part 3

Following **Part 1** and **Part 2**, this last article in a three-part series on line-fitting algorithms deals with embedded control applications that involve **arctangents**.

Many practical embedded system designs have an angle parameter that varies at a constant rate, and that rate must be estimated in order to control the system. In many cases the angle itself cannot be measured directly and must be estimated by means of arctangents.

For instance, a constant frequency offset between a quadrature phase-shift keying (QPSK) transmitter’s carrier frequency and a receiver’s demodulation frequency produces a phase in the received signal that varies linearly with time. This effect is shown in **Figure 1 below**.

Figure 1a. Transmitted and Received QPSK Signals (Sine), with Frequency Offset

To find the rate of phase change, one method is to compute, for each signal sample, the arctangent of the ratio of the quadrature-phase (“sine”) amplitude (**Figure 1a, above**) to the in-phase (“cosine”) amplitude (**Figure 1b, below**). Plotted against time, these arctangents should (apart from noise) lie on a straight line whose slope equals the frequency offset (*see the black line in Figure 1b*).
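As a rough illustration of this step, the following Python sketch computes a phase angle for each sample from its in-phase and quadrature amplitudes. The function name `phase_from_iq` and the test signal (a noise-free, unmodulated tone with a frequency offset of 3 rad per unit time) are assumptions made for the example, not part of the original design. The four-quadrant `math.atan2` is used rather than a plain arctangent of the ratio so that a zero in-phase amplitude causes no division problem.

```python
import math

TWO_PI = 2.0 * math.pi

def phase_from_iq(i_samples, q_samples):
    """Four-quadrant arctangent of Q/I for each sample, mapped into [0, 2*pi)."""
    return [math.atan2(q, i) % TWO_PI for i, q in zip(i_samples, q_samples)]

# Hypothetical noise-free test signal with a constant frequency offset.
offset = 3.0                                   # assumed value, rad per unit time
times = [n / 100.0 for n in range(50)]         # short enough that no wrap occurs
i_sig = [math.cos(offset * t) for t in times]  # in-phase ("cosine") amplitude
q_sig = [math.sin(offset * t) for t in times]  # quadrature ("sine") amplitude

phases = phase_from_iq(i_sig, q_sig)
# Successive phases grow linearly in time with slope equal to the offset.
```

Over a longer time span the computed phases would wrap around, which is the difficulty the rest of the article addresses.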

Figure 1b. Transmitted and Received QPSK Signals (Cosine), with Frequency Offset

**Arctangents, Aliasing, and Phase Jumps**

It would seem then that the rate of phase change could be estimated by applying regression to the arctangents computed from the data. Unfortunately, the situation is not quite so simple. Arctangents are only determined up to a 2π ambiguity – a phenomenon called “aliasing”.

For instance, the phase angle data in **Figure 2 below** clearly represents a phase angle varying with positive slope and some added noise, but applying linear regression to the arctangents yields a fit with negative slope. This is due to the fact that the arctangents “jump” from values near 2π back down to near 0.

Figure 2. Poor Regression Fit when Phase Jumping Occurs

If we adjust the arctangent range so as to minimize the “phase jumping,” then a more correct fit is obtained. In **Figure 3 below**, the phase angle range is chosen from 3 to 3+2π, and the correct linear trend is picked up by the regression line.
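The effect of the range choice can be reproduced with a small Python sketch. The data here is hypothetical and noise-free (a phase of 5 + 2x on [0, 1], so the true slope is 2), and the helper names `rerange` and `least_squares_slope` are introduced only for this illustration.

```python
import math

TWO_PI = 2.0 * math.pi

def rerange(angles, lower):
    """Map each angle into [lower, lower + 2*pi) by adding a multiple of 2*pi."""
    return [lower + (y - lower) % TWO_PI for y in angles]

def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

xs = [k / 100.0 for k in range(101)]
# Arctangent-style angles in [0, 2*pi): the phase jumps down once it passes 2*pi.
wrapped = [(5.0 + 2.0 * x) % TWO_PI for x in xs]

slope_wrapped = least_squares_slope(xs, wrapped)                # negative: misled by the jump
slope_reranged = least_squares_slope(xs, rerange(wrapped, 3.0)) # recovers the true slope of 2
```

With the range starting at 3, as in Figure 3, all of the sample phases fall inside one unbroken interval and the regression sees a straight line.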

Figure 3. Good Regression Fit when Phase Jumping is Avoided

**Minimizing Phase Jumping via Maximizing Variance**

One way to minimize phase jumping is to choose a range for arctangent so that most of the data points lie near the center of the range. This can be done as follows.

First, bin the angle data into K equally spaced bins (K may be adjusted according to the application; I have found that K=10 is usually sufficient). We denote the bin centers as {b_{1}, b_{2}, …, b_{K}} and the corresponding data counts as {c_{1}, c_{2}, …, c_{K}}. Next, for each bin center b_{j} compute a phase angle variance V_{j} as follows:

A relatively large variance V_{j} means that more of the phase data lies far from the bin center b_{j}. The bin center corresponding to maximum variance V_{j} may be chosen to be the lower endpoint of the phase angle range.

This is the best choice because it puts the phase angle discontinuity as far as possible from the bulk of the data, thus minimizing the possibility of misleading phase jumps.

Finally, an even more exact choice of phase angle range can be made using interpolation. If the bin center b_{j} corresponds to maximum V_{j}, then we may use the two adjacent bins b_{j-1} and b_{j+1} to interpolate a more accurate value of the maximum variance angle:

(*Note that if j=1, then the lower adjacent bin b_{0} actually corresponds to b_{K}; and if j=K, the upper adjacent bin b_{K+1} corresponds to b_{1}*).

If we apply this formula with K=10 bins to the data shown in Figure 2, we obtain a value of 2.69 as the lower endpoint of a relatively “jump-free” phase angle range (hence 2.69 + 2π is the upper endpoint). This leads to a fit very similar to the one shown in Figure 3.
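The binning and maximum-variance search can be sketched in Python as follows. Two pieces of this sketch are assumptions rather than the formulas used above: the variance V_j is taken as the count-weighted mean squared circular distance from each bin center b_k to b_j, and the refinement uses ordinary three-point parabolic interpolation through V_{j-1}, V_j and V_{j+1}. The bins are also assumed to start at angle 0, and all function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def circular_distance(a, b):
    """Signed angular distance a - b, wrapped into [-pi, pi)."""
    return (a - b + math.pi) % TWO_PI - math.pi

def bin_variances(angles, num_bins=10):
    """Return (bin centers, variance of the binned data about each center)."""
    width = TWO_PI / num_bins
    centers = [(k + 0.5) * width for k in range(num_bins)]  # ASSUMPTION: bins start at 0
    counts = [0] * num_bins
    for y in angles:
        counts[int((y % TWO_PI) / width) % num_bins] += 1
    total = sum(counts)
    # ASSUMPTION: V_j = sum_k c_k * dist(b_k, b_j)^2 / sum_k c_k, with circular distance.
    variances = [
        sum(c * circular_distance(b_k, b_j) ** 2 for b_k, c in zip(centers, counts)) / total
        for b_j in centers
    ]
    return centers, variances

def lower_endpoint(angles, num_bins=10):
    """Lower endpoint of a phase range whose wrap point lies far from the data."""
    centers, variances = bin_variances(angles, num_bins)
    j = max(range(num_bins), key=lambda k: variances[k])
    v_prev = variances[(j - 1) % num_bins]  # index wraps: b_0 is b_K
    v_next = variances[(j + 1) % num_bins]  # index wraps: b_{K+1} is b_1
    curvature = v_prev - 2.0 * variances[j] + v_next
    # ASSUMPTION: parabolic-vertex interpolation, in units of one bin width.
    shift = 0.0 if curvature == 0.0 else 0.5 * (v_prev - v_next) / curvature
    return (centers[j] + shift * TWO_PI / num_bins) % TWO_PI
```

For data clustered in the first bin, for example, `lower_endpoint` returns an angle roughly π away from the cluster, so the cluster sits near the middle of the resulting range.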

**Handling Large Slopes with Subintervals**

Unfortunately, the maximum variance method fails if the slope of the line is too large and the phase angles run through the entire range. This case is shown in **Figure 4 below**. The data was generated from a time vector X = (x_{1}, x_{2}, …, x_{400}) consisting of 400 equally spaced points on [0,1], and a phase angle vector Y = (y_{1}, y_{2}, …, y_{400}) obtained as:

where (ς_{1}, ς_{2}, …, ς_{400}) are independent Gaussian noise samples with standard deviation 1. Evidently the phase angles wrap twice around the entire [0,2π] range, so phase jumping will occur no matter where the phase angle range is set.
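Data of this kind can be simulated with a short Python sketch. It assumes that each phase angle is the line 13·x_n plus noise ς_n, reduced modulo 2π, with zero intercept and with both endpoints of [0,1] included in the time vector; these details, the fixed random seed, and the function name `make_wrapped_line` are assumptions for illustration.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def make_wrapped_line(slope=13.0, n=400, noise_std=1.0, seed=0):
    """Return (xs, ys): n equally spaced times on [0, 1] and noisy wrapped phases."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    xs = [k / (n - 1) for k in range(n)]      # ASSUMPTION: endpoints 0 and 1 included
    # ASSUMPTION: y_n = (slope * x_n + noise_n) mod 2*pi, with zero intercept.
    ys = [(slope * x + rng.gauss(0.0, noise_std)) % TWO_PI for x in xs]
    return xs, ys

xs, ys = make_wrapped_line()
```

Since 13/(2π) is slightly more than 2, the underlying line passes through the full angle range about twice over the unit interval.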

Figure 4. Noisy Angle Data with Large Frequency Offset

Fortunately, it is still possible to estimate the slope by modifying the maximum-variance procedure described above. If we have at least some prior information about the maximum possible values of slope and noise variance, then we can choose time subintervals that are small enough that the angle data on each subinterval does not wrap. We should then be able to find a relatively “jump-free” range for each subinterval’s angle data.

Suppose for example that in fitting the data in Figure 4 we have prior information that the magnitude of the frequency offset (i.e. slope) is less than 20, and the noise variance is about 1. Simulation can then be used to find an optimal subinterval length.

Simulation was performed as follows. For each of four different slopes (5, 10, 15, 20), 1000 lines were generated with the given slope and different noises. Then for each line and seven different subinterval sizes, the maximum-variance procedure was used to obtain “jump-free” angle ranges for slope estimates on several subintervals of each given size.

For example, for subinterval size 0.1 the slope was estimated on 19 subintervals of length 0.1 spaced 0.05 apart. These subinterval slope estimates were averaged to obtain an overall slope estimate for each line and each subinterval length.

The statistics for the simulation are summarized in **Table 1** and **Table 2**. According to these tables, for lines of slope 20 a subinterval of length 0.1 gave slope estimates that averaged about 10% too low, with a standard deviation of about 1.1. As a result, we choose to use a subinterval length of 0.1, and to add 10% to the estimate.

Table 1. Mean Slope Estimates for Different Actual Slopes and Subinterval Lengths

Table 2. Standard Deviation of Slope Estimates for Different Actual Slopes and Subinterval Lengths

Using the maximum-variance procedure with subinterval length 0.1 on the data in Figure 4 and adjusting for 10% underestimation, we obtain a slope estimate of 11.88. We then subtract this trend from the angle data, obtaining a new “de-trended” angle data set {y_{n}’}:

The de-trended phase angle data is shown in **Figure 5 below**.

Figure 5. Angle Data De-Trended via Subinterval Slope Estimation

The de-trended data no longer exhibits a steep slope, and can be localized within a single phase interval of width 2π using the maximum-variance method. The result is a slope estimate of 1.136 for the de-trended data. We may once again subtract this trend from the data set {y_{n}’} to obtain a second de-trended set {y_{n}”}.

When the maximum-variance method is applied to {y_{n}”}, the resulting slope is of order 1E-16. Evidently the process has converged, and the final slope estimate is:

`11.88+1.136 = 13.016,`

which is within 0.15% of the actual slope of 13.

**Algorithm Pseudocode**

Pseudocode for the algorithm is as follows:

(1) **Initialize:**

X ≡ time vector;

Y ≡ phase angle data;

*delta* ≡ subinterval size (*determined by prior simulation*)

(2) For each time subinterval of the form *[(n-1) × delta/2, (n+1) × delta/2), n = 1, 2, …*

(a) Use the **maximum variance procedure** to determine a “jump-free” phase range for the subinterval’s angle data; and

(b) Estimate the slope for the subinterval’s angle data restricted to the “jump-free” phase range.

(3) Take the mean of the subinterval slope estimates to obtain *slope_est*

(4) Correct *slope_est* by the multiplicative factor determined by simulation

(5) Form de-trended phase angle data *Y’ ≡ Y – slope_est × X*

(6) Use the **maximum variance procedure** to determine a “jump-free” phase angle range, and estimate the slope for de-trended phase angle data restricted to the “jump-free” phase angle range.

(7) Repeat steps (5–6) until the slope value converges.

The **maximum variance procedure** and slope estimation are accomplished as follows:

1) Determine K equally spaced bins for angle data. Denote the bin centers as *{b_{1}, b_{2}, …, b_{K}}* (note *b_{k+1} – b_{k} = 2π/K, k=1,…K*).

2) Count the number of angle data points in each bin. Denote the counts as *{c_{1}, c_{2}, …, c_{K}}*

3) For each j=1,…K, compute a phase angle variance V_{j} :

4) Choose the value J that maximizes V_{J} .

5) Define a lower limit α for the phase range:

6) Revise the angle data {y_{n}} to lie in the range [α, α+2π] according to the formula:

7) Estimate the slope of the range-adjusted angle data as:
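The pseudocode above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the variance in step 3 is assumed to be the count-weighted squared circular distance between bin centers, the lower limit α in step 5 is taken as the maximizing bin center without interpolation, step 6 is assumed to add a multiple of 2π to each angle, and step 7 is assumed to be an ordinary least-squares slope. The function names, the `correction` parameter, and the stopping tolerance are hypothetical. The time vector is assumed sorted and to start near 0.

```python
import math

TWO_PI = 2.0 * math.pi

def circ_dist(a, b):
    """Signed angular distance a - b, wrapped into [-pi, pi)."""
    return (a - b + math.pi) % TWO_PI - math.pi

def jump_free_lower_limit(angles, num_bins=10):
    """Steps 1-5: bin center with maximum (assumed) variance, used as alpha."""
    width = TWO_PI / num_bins
    centers = [(k + 0.5) * width for k in range(num_bins)]
    counts = [0] * num_bins
    for y in angles:
        counts[int((y % TWO_PI) / width) % num_bins] += 1

    def variance(b_j):  # ASSUMPTION: unnormalized weighted squared circular distance
        return sum(c * circ_dist(b_k, b_j) ** 2 for b_k, c in zip(centers, counts))

    return max(centers, key=variance)

def ls_slope(xs, ys):
    """Step 7 (ASSUMED): ordinary least-squares slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

def range_adjusted_slope(xs, ys, num_bins=10):
    """Steps 1-7: re-range the angles into [alpha, alpha + 2*pi), then fit."""
    alpha = jump_free_lower_limit(ys, num_bins)
    adjusted = [alpha + (y - alpha) % TWO_PI for y in ys]  # step 6 (ASSUMED form)
    return ls_slope(xs, adjusted)

def estimate_slope(xs, ys, delta, correction=1.0, tol=1e-9, max_iter=20):
    """Main steps (2)-(7): subinterval estimates, then iterative de-trending."""
    estimates = []
    n = 1
    while (n + 1) * delta / 2 <= xs[-1] + 1e-12:
        lo, hi = (n - 1) * delta / 2, (n + 1) * delta / 2
        idx = [i for i, x in enumerate(xs) if lo <= x < hi]
        if len(idx) >= 2:
            estimates.append(range_adjusted_slope([xs[i] for i in idx],
                                                  [ys[i] for i in idx]))
        n += 1
    total = correction * sum(estimates) / len(estimates)  # steps (3)-(4)
    for _ in range(max_iter):
        detrended = [y - total * x for x, y in zip(xs, ys)]  # step (5)
        step = range_adjusted_slope(xs, detrended)           # step (6)
        total += step                                        # accumulate residual slope
        if abs(step) < tol:                                  # step (7): converged
            break
    return total
```

A call such as `estimate_slope(xs, ys, delta=0.1, correction=1.10)` would correspond to the settings chosen for the Figure 4 data; the value of `correction` should come from simulation for the noise level at hand.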

To read **Part 1**, go to **Line-fitting with minimax**.

To read **Part 2**, go to **Dealing with periodic noise**.

*Chris Thron is currently assistant professor of mathematics at Texas A&M University Central Texas, and does consulting in algorithm design, numerical analysis, system analysis and simulation, and statistical analysis. Previously Chris was with Freescale Semiconductor, doing R&D in cellular baseband processing, amplifier predistortion, internet security, and semiconductor device performance. He has six U.S. patents granted plus three published applications. His web page is www.tarleton.edu/faculty/thron/, and he can be reached at thron@tarleton.edu.*