Curve Fitting, Part 3

Part 3: Linearization

It is common practice to try to fit non-linear models to data by first applying some transformation to the model that "linearizes" it. For example, suppose we want to fit the non-linear exponential model

y = a e^(bt)

to the U.S. population data from Part 1. The standard trick is to linearize the model by taking logs:

ln(y) = ln(a) + b t.

Now we have a model in which the parameters A = ln(a) and b appear linearly. We can fit a least squares line to the data

(T₁, ln(Y₁)), (T₂, ln(Y₂)), ..., (T₁₀, ln(Y₁₀)).

That is, we fit a least squares line to the semi-log plot of ln(y) versus t shown below.

Semi-log plot of U.S. Population data

For the U.S. population data, the vectors

y* = ( ln(Y₁), ln(Y₂), ..., ln(Y₁₀) )^T,

t = ( T₁, T₂, ..., T₁₀ )^T,

and 1 = (1, 1, ..., 1)^T
are defined in your helper application worksheet. Solve the normal equations for the best-fit parameters A = ln(a) and b, then recover a = e^A to obtain the best-fitting exponential model y = a e^(bt).
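The step above can be sketched in plain Python, as an alternative to the helper application worksheet. This is a minimal sketch, not the worksheet's own method: the function names `fit_line` and `fit_exponential` are made up for illustration, and the demo values are synthetic (generated from a = 4, b = 0.3), not the U.S. population data from Part 1.

```python
import math

def fit_line(x, y):
    """Least squares line y ~ A + b*x, found by solving the 2x2 normal equations."""
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Normal equations for the columns 1 and x:
    #   n*A  + sx*b  = sy
    #   sx*A + sxx*b = sxy
    det = n * sxx - sx * sx
    A = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return A, b

def fit_exponential(t, y):
    """Fit y = a*exp(b*t) by fitting a line to (t, ln y), then undoing the log."""
    A, b = fit_line(t, [math.log(yi) for yi in y])
    return math.exp(A), b  # a = e^A

# Synthetic, hypothetical data lying exactly on y = 4 e^(0.3 t).
t_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [4.0 * math.exp(0.3 * ti) for ti in t_demo]

a_fit, b_fit = fit_exponential(t_demo, y_demo)
print("a =", a_fit, "b =", b_fit)
```

Because the synthetic points lie exactly on an exponential curve, the fit should recover a and b up to rounding. With real data the recovered values are only the best fit on the logarithmic scale.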

Plot the exponential function that you just found together with a scatter plot of the original U.S. Population data (not the logarithms). How good is the fit?

Compute the residuals y - p, where p is the vector of values the fitted model predicts at the data times, and the sum of squares S of the residuals. Make a residual plot. What do you learn from the plot about the goodness of fit?
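The residual computation can be sketched as follows. The function name `residuals_and_sse` and the three demo values are hypothetical, chosen only to make the arithmetic easy to check by hand. Here p is assumed to hold the model's predicted values on the original (not logarithmic) scale.

```python
def residuals_and_sse(y, p):
    """Return the residuals y - p and their sum of squares S."""
    r = [yi - pi for yi, pi in zip(y, p)]
    S = sum(ri * ri for ri in r)
    return r, S

# Hypothetical observed values and model predictions.
y_obs = [1.0, 2.0, 3.0]
p_model = [1.5, 2.0, 2.0]

r_demo, S_demo = residuals_and_sse(y_obs, p_model)

# A crude stand-in for a residual plot: the sign of each residual in order.
signs = "".join("+" if ri > 0 else "-" if ri < 0 else "0" for ri in r_demo)

print("residuals:", r_demo)
print("S =", S_demo)
print("sign pattern:", signs)
```

Long runs of the same sign in the pattern suggest that the model misses some systematic trend in the data, which is the kind of information a residual plot makes visible.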

Compare the fit of the exponential model to the quadratic model of Part 1. You should have already computed S for the quadratic model. You should also make a residual plot for the quadratic model for comparison purposes.

**Small Data Set**
t	y
1.0	2.5
2.0	8.0
3.0	19.9
4.0	50.0

Above is a small data set. Let's try to fit this data with a power function of the form

y = a t^b

by first linearizing the model by taking logs:

ln(y) = ln(a) + b ln(t).

For the small data set above, the vectors

y* = ( ln(Y₁), ln(Y₂), ln(Y₃), ln(Y₄) )^T,

t* = ( ln(T₁), ln(T₂), ln(T₃), ln(T₄) )^T,

and
1 = (1, 1, 1, 1)^T
are defined in your helper application worksheet. Fit a line to the log-log data (ln(Tᵢ), ln(Yᵢ)), and use the resulting intercept A = ln(a) and slope b to find the parameters of the corresponding power function y = a t^b.
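For readers working outside the helper application, the log-log fit for the small data set might look like the sketch below. The variable names are illustrative. The 2x2 normal equations are solved by Cramer's rule here, which is one convenient choice and not necessarily what the worksheet does.

```python
import math

# The small data set from the table above.
t_small = [1.0, 2.0, 3.0, 4.0]
y_small = [2.5, 8.0, 19.9, 50.0]

# Linearize: t* = ln(t), y* = ln(y).
t_star = [math.log(ti) for ti in t_small]
y_star = [math.log(yi) for yi in y_small]

# Sums that appear in the normal equations for the columns 1 and t*.
n = len(t_star)
sx = sum(t_star)
sy = sum(y_star)
sxx = sum(xi * xi for xi in t_star)
sxy = sum(xi * yi for xi, yi in zip(t_star, y_star))

# Solve  n*A + sx*b = sy,  sx*A + sxx*b = sxy  by Cramer's rule.
det = n * sxx - sx * sx
A = (sy * sxx - sx * sxy) / det
b = (n * sxy - sx * sy) / det

a = math.exp(A)  # undo the log to recover a from A = ln(a)
print(f"a = {a:.4f}, b = {b:.4f}")
```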

Plot the power function that you just found together with a scatter plot of the original small data set (not the log-log plot). How good is the fit?

Compute the residuals y - p, where p is the vector of values the fitted power function predicts at the data times, and the sum of squares S of the residuals. Make a residual plot. What do you learn from the plot about the goodness of fit?
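A text-only version of this residual check for the small data set is sketched below. The helper name `fit_power` is hypothetical, and the residuals are assumed to be taken on the original scale (y against a t^b), not on the log-log scale where the line was fitted.

```python
import math

def fit_power(t, y):
    """Fit y = a * t**b by least squares on (ln t, ln y); returns (a, b)."""
    x = [math.log(ti) for ti in t]
    z = [math.log(yi) for yi in y]
    n = len(x)
    sx, sz = sum(x), sum(z)
    sxx = sum(xi * xi for xi in x)
    sxz = sum(xi * zi for xi, zi in zip(x, z))
    det = n * sxx - sx * sx
    A = (sz * sxx - sx * sxz) / det
    b = (n * sxz - sx * sz) / det
    return math.exp(A), b

t_small = [1.0, 2.0, 3.0, 4.0]
y_small = [2.5, 8.0, 19.9, 50.0]

a, b = fit_power(t_small, y_small)

# Predicted values and residuals on the original scale.
p = [a * ti ** b for ti in t_small]
r = [yi - pi for yi, pi in zip(y_small, p)]
S = sum(ri * ri for ri in r)

# Table of data, predictions, and residuals in place of a plot.
print("    t        y        p        r")
for ti, yi, pi, ri in zip(t_small, y_small, p, r):
    print(f"{ti:5.1f} {yi:8.2f} {pi:8.2f} {ri:8.2f}")
print(f"S = {S:.3f}")
```

Note that the line was fitted to the logarithms, so this S is not necessarily the smallest sum of squares that any power function could achieve on the original data. The signs and sizes of the entries in the r column carry the same information a residual plot would show.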
