Least Squares, Part 3

Least Squares

Part 3: The normal equations

If p = b1 + mx is the projection of y onto W = Span(1, x), then we can write

p = Xv,

where X is the matrix with columns 1 and x, and v is the solution vector (b, m)^T. Also, the vector y - p is orthogonal to the entire subspace W and, in particular, to both 1 and x. This orthogonality can be expressed as

X^T(y - p) = 0,

or, equivalently, since p = Xv,

X^Ty = X^Tp = X^TXv.

The system of equations

X^TXv = X^Ty,

in which everything is known except v, constitutes the normal equations. Thus, the coefficients b and m of the best-fitting line are the components of the solution v of the normal equations.
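As a rough illustration of the derivation above, the following Python sketch builds X, solves the normal equations, and checks that y - p is orthogonal to both columns of X. The use of NumPy and the four data points are assumptions made for this example only; they are not the Hanford values. For a matrix X with columns 1 and x, the product X^TX has entries n, sum(x), sum(x), sum(x^2), and X^Ty has entries sum(y), sum(xy), so the system is only 2 by 2.

```python
import numpy as np

# Hypothetical data for illustration only (NOT the Hanford values).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])

# X has the all-ones vector 1 as its first column and x as its second.
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations X^T X v = X^T y for v = (b, m)^T.
v = np.linalg.solve(X.T @ X, X.T @ y)
b, m = v

# The projection p = Xv and the residual y - p.
p = X @ v
residual = y - p

print("b =", b, " m =", m)
# Both entries should be zero up to rounding error.
print("X^T (y - p) =", X.T @ residual)
```

Replacing the hypothetical x and y with the exposure indices and death rates from the data set should give the coefficients asked for in the exercises that follow.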

[Figure: Hanford data with least squares line]

Construct the matrix X for the cancer death data, and solve the normal equations. Make sure your numbers m and b agree with what you see in the figure showing the least squares line.

Confirm that your coefficients m and b give the same projection p as the one you computed in Part 2.
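One possible way to carry out this comparison is sketched below. It assumes that the projection in Part 2 was computed from an orthogonal basis of W, here the vectors 1 and x - mean(x)1; if Part 2 used a different construction, substitute that computation for the second route. The data are again hypothetical placeholders, not the Hanford values.

```python
import numpy as np

# Hypothetical data for illustration only (NOT the Hanford values).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])
ones = np.ones_like(x)

# Route 1: projection from the normal equations, p = Xv.
X = np.column_stack([ones, x])
v = np.linalg.solve(X.T @ X, X.T @ y)
p_normal = X @ v

# Route 2 (assumed Part 2 method): project y onto each vector of an
# orthogonal basis of W = Span(1, x) and add the two pieces.
xc = x - x.mean() * ones          # x with its component along 1 removed
p_orth = (y @ ones) / (ones @ ones) * ones + (y @ xc) / (xc @ xc) * xc

# The two routes should agree up to rounding error.
same_projection = np.allclose(p_normal, p_orth)
print("projections agree:", same_projection)
```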

If there were a location in the Columbia River area with exposure index 5.5, what would you predict about its cancer death rate? What portion of this death rate would you attribute to radioactive wastes from the Hanford plant?
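The arithmetic behind such a prediction can be sketched as follows. The function name `predict` and the coefficient values are hypothetical, chosen only to show the computation; use the b and m you found from the normal equations. The sketch splits the prediction into the intercept b, which is the value of the line at exposure 0, and the term m times the exposure index. Reading the second term as the portion associated with exposure is one interpretation of the question, not something the fitted line establishes on its own.

```python
def predict(b, m, x_new):
    """Evaluate the line b + m*x at x_new.

    Returns (total, intercept_part, slope_part), where
    total = intercept_part + slope_part.
    """
    slope_part = m * x_new
    return b + slope_part, b, slope_part


# Hypothetical coefficients for illustration only (NOT the Hanford fit).
b, m = 1.15, 1.94

total, baseline, exposure_part = predict(b, m, 5.5)
print("predicted value:", total)
print("value at exposure 0:", baseline)
print("part from the slope term:", exposure_part)
```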

modules at math.duke.edu