
Testing Statistical Models



When faced with a table of experimental data which is close to
a straight line, one asks, is a quadratic fit better than a
linear one?

One can represent a linear model by
y = a1 + b1.x + e

and a quadratic model by
y = a2 + b2.x + c2.x^2 + e

where e represents the random experimental error.

One immediately seeks to fit the data using the method of least
squares, so that the sum of the squares of the residuals is
minimized.
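
As a minimal sketch of that fitting step, the following solves the
normal equations for a polynomial of a chosen degree and reports the
sum of squared residuals for both models. The function names
(polyfit_ls, rss) and the 12 data pairs are invented for illustration
only; they are not taken from any real experiment or from Mandel.

```python
def polyfit_ls(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] (hypothetical helper)."""
    n = degree + 1
    # Normal equations A c = r, with A[i][j] = sum(x^(i+j)) and r[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    r = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            r[row] -= f * r[col]
    # Back substitution.
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (r[i] - s) / A[i][i]
    return c


def rss(xs, ys, coeffs):
    """Sum of squared residuals of the fitted polynomial."""
    return sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )


# Twelve made-up observation pairs lying close to a straight line.
xs = [float(i) for i in range(12)]
ys = [1.1, 2.9, 5.2, 7.1, 8.8, 11.3, 13.0, 14.8, 17.1, 19.2, 20.9, 23.1]

lin = polyfit_ls(xs, ys, 1)    # a1, b1
quad = polyfit_ls(xs, ys, 2)   # a2, b2, c2
rss_lin = rss(xs, ys, lin)
rss_quad = rss(xs, ys, quad)
print("linear    coefficients:", lin, " RSS:", rss_lin)
print("quadratic coefficients:", quad, " RSS:", rss_quad)
```

Because the linear model is a special case of the quadratic one
(c2 = 0), the quadratic sum of squared residuals can never come out
larger than the linear one, whatever the data.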

In a particular case where there are 12 pairs of observations
one could estimate the error-variance by dividing the sum of the
squared residuals by the degrees of freedom.

The degrees of freedom are found from the number of observation
pairs, less the number of parameters we use to model the data:
in the linear model there are two, a1 and b1; in the quadratic
model there are three, a2, b2, and c2.
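
The arithmetic for the 12-pair case can be sketched as below. The
function name error_variance and the two residual sums of squares are
assumptions made up for the example; only the counts (12 pairs, two
and three parameters) come from the text above.

```python
def error_variance(rss, n_obs, n_params):
    """Estimated error-variance: residual sum of squares / degrees of freedom."""
    dof = n_obs - n_params
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    return rss / dof


N_OBS = 12            # pairs of observations, as in the text

# Hypothetical residual sums of squares, for illustration only.
rss_linear = 12.0
rss_quadratic = 8.1

var_linear = error_variance(rss_linear, N_OBS, 2)        # 12 - 2 = 10 d.o.f.
var_quadratic = error_variance(rss_quadratic, N_OBS, 3)  # 12 - 3 =  9 d.o.f.
print("linear    error-variance estimate:", var_linear)
print("quadratic error-variance estimate:", var_quadratic)
```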

Where the data could support either model, we often find that
the estimated error-variance is somewhat smaller for the quadratic
model, which would lead us to choose the quadratic fit.

However, there is a step which I suspect is not always executed:
that is to acknowledge that both these estimates are uncertain
(the more observations, the less uncertain), and so it would
be good to ask whether the smaller estimate is SUFFICIENTLY smaller
that we can make our choice with confidence.

The method of answering this question is the test of significance.

This test is mechanized in commonly available routines for
analysis of variance (ANOVAR), which compute a ratio called F
as a test statistic. One decides whether to retain or reject the
null-hypothesis (typically the linear model) by comparing F with
the value in a lookup table for some desired level of significance
(often 5%).
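
One common form of this ratio for comparing a model with a larger
one that contains it is the "extra sum of squares" F. The sketch
below assumes that form; a particular ANOVAR routine may arrange the
calculation differently. The function name and the residual sums of
squares are hypothetical. The critical value 5.12 is the tabulated
5% point of F for 1 and 9 degrees of freedom, which is the case for
12 pairs with the linear model tested against the quadratic.

```python
def f_ratio(rss_small, rss_large, n_obs, p_small, p_large):
    """Extra-sum-of-squares F for a smaller model nested in a larger one."""
    extra_dof = p_large - p_small        # parameters added (1 for the c2 term)
    resid_dof = n_obs - p_large          # d.o.f. left for the larger model
    numerator = (rss_small - rss_large) / extra_dof
    denominator = rss_large / resid_dof  # error-variance estimate, larger model
    return numerator / denominator


F_CRIT_1_9_5PCT = 5.12   # from a standard F table: 1 and 9 d.o.f., 5% level


def prefer_quadratic(rss_linear, rss_quadratic, n_obs=12):
    """True if the null-hypothesis (linear model) is rejected at the 5% level."""
    f = f_ratio(rss_linear, rss_quadratic, n_obs, p_small=2, p_large=3)
    return f > F_CRIT_1_9_5PCT


# Hypothetical residual sums of squares, for illustration only.
print(f_ratio(10.0, 5.0, 12, 2, 3), prefer_quadratic(10.0, 5.0))
print(f_ratio(10.0, 9.0, 12, 2, 3), prefer_quadratic(10.0, 9.0))
```

In the first made-up case the quadratic term halves the residual sum
of squares and F exceeds the table value, so the linear model is
rejected. In the second the improvement is slight, F falls below the
table value, and the linear model is retained even though the
quadratic fit has the smaller residuals.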

One accepts that the life-sciences, encumbered with variable and
uncertain data, need to be proficient in this area, and that the
'hard-sciences' are often much more comfortable with black and white
data. It goes without saying that in real life students will
eventually be placed in situations where data are uncertain, and a
little sympathetic exposure to these methods would be fruitful.

(This approach patterned on
"The Statistical Analysis of Experimental Data"
Mandel, Dover ISBN 0-486-64666-1)

Sincerely,
brian whatcott <inet@intellisys.net>
Altus OK