
Re: Systematic vs. Random



At 08:14 AM 8/29/00 -0500, Tim Burgess wrote:
> During a post-lab discussion yesterday, students were presenting
> the results of the standard pendulum lab (l vs. T).

OK.

> A student asserted that because the test plot of T^2 vs. l by another
> team resulted in a y-intercept that was about 4% of the maximum T^2
> value, there was probably a "systematic error". My first reaction
> (while the students discussed this) was that this was not true. The bell
> rang. All left. I will see them again last period today.

> I would like to hear the general thoughts that some on this list might have
> regarding the validity of making the "systematic error" assertion based on
> the y-intercept value (combined with 9 well-fitted data points).

1) I assume we are talking about the y-intercept of the straight line
fitted to the 9 observed points.

2a) The "general thought" is that a chi-square analysis would be appropriate.
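
For concreteness, here is a minimal sketch of the statistic I have in
mind, written in Python/NumPy with hypothetical argument names; the idea
is just a sum of squared residuals, each weighted by the error bar of the
corresponding observation:

    import numpy as np

    def chi_square(y_obs, y_fit, sigma):
        # Weighted sum of squared residuals:
        #   chi^2 = sum_i ((y_obs_i - y_fit_i) / sigma_i)^2
        # y_obs: measured T^2 values; y_fit: T^2 from the fitted line;
        # sigma: one-standard-deviation error bar on each measurement.
        resid = (np.asarray(y_obs) - np.asarray(y_fit)) / np.asarray(sigma)
        return float(np.sum(resid ** 2))

Roughly speaking, a fit is acceptable when this number is comparable to
the number of degrees of freedom (number of points minus number of
fitted parameters).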

2b) The question appears to hinge on the definition of "systematic"
error. Was this concept covered in class? See also item (9) below.

3a) How was the fit performed? By computer? I hope so; eyeball fitting
is notoriously unreliable.
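
For what it's worth, a computer fit takes only a few lines; here is a
minimal sketch in Python/NumPy, using made-up numbers that merely stand
in for the class data:

    import numpy as np

    # Hypothetical stand-ins for the class data (NOT the students' numbers):
    # pendulum lengths in meters and measured periods in seconds.
    length = np.array([0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00])
    period = np.array([0.91, 1.11, 1.28, 1.43, 1.56, 1.69, 1.80, 1.91, 2.01])

    T_squared = period ** 2

    # Two-parameter least-squares fit:  T^2 = slope*length + intercept
    slope, intercept = np.polyfit(length, T_squared, 1)
    print("slope =", slope, "s^2/m   intercept =", intercept, "s^2")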

3b) Extrapolation is a notorious way to amplify any uncertainty (systematic
or otherwise) in the data. The posted question did not specify the
"mechanical disadvantage" (i.e. the length of the extrapolation relative to
the domain of the observations).

4a) The posted question contains no information about the error-bars of the
original data.

4b) I assume that no error-bar information was available in the classroom
at the time. See also item (8) below.

Therefore the student had no logical basis for asserting that systematic
errors were present. On the other side of the same coin, there was no
logical basis for asserting that systematic errors were absent, either.

5) Philosophical tangent: Actually, there is no 100% logical basis for
asserting that there is an experimental "error" at all, systematic or
otherwise. In principle, when experiments fail to match textbook theory,
it could be a sign that the textbook is in error, and the observations will
be found to agree with some new theory.

Philosophically speaking, all we have here is a 4% discrepancy between
theory and experiment. (In this case I rather doubt that the students have
discovered a 4% deviation in the laws of mechanics -- but you can't prove
that from these observations alone.)

6) To return to practical matters: Since we have a strong expectation that
period->0 as length->0, we can incorporate this by adding a tenth point to
the data set: a point at (0,0) with infinite weight (i.e. zero error
bars). See what happens to the chi-square when this point is added. If
the chi-square goes up significantly, you've got a problem. If it doesn't
go up significantly, the theory-guided fit is a better representation of
the period-versus-length relationship.

Note that this is tantamount to doing a one-parameter fit (slope only) to
the 9 points, rather than a two-parameter fit (slope and intercept).

Note that _all_ fits are theory-guided; it's just a question of how much
theory you throw in and how much you leave out. In general, if you put in
a little less theory you need a _lot_ more data.
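
Here is a minimal sketch of the comparison in Python/NumPy, again with
hypothetical data and an assumed common error bar; since a point with
literally infinite weight is awkward numerically, the sketch simply does
the equivalent slope-only fit through the origin:

    import numpy as np

    # Hypothetical (length, T^2) data with an assumed common error bar;
    # these are stand-ins, not the class data.
    length = np.array([0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00])
    T2     = np.array([0.83, 1.23, 1.63, 2.02, 2.43, 2.84, 3.25, 3.64, 4.05])
    sigma  = 0.03    # assumed uncertainty on each T^2, in s^2

    def chi2(y_obs, y_fit, sig):
        return float(np.sum(((y_obs - y_fit) / sig) ** 2))

    # Two-parameter fit: slope and intercept both free.  (With equal error
    # bars the weighting does not change the fitted line, only chi^2.)
    slope2, intercept2 = np.polyfit(length, T2, 1)
    chi2_two = chi2(T2, slope2 * length + intercept2, sigma)   # 7 d.o.f.

    # One-parameter fit: force the line through (0, 0).
    # The least-squares slope through the origin is sum(x*y)/sum(x*x).
    slope1 = np.sum(length * T2) / np.sum(length * length)
    chi2_one = chi2(T2, slope1 * length, sigma)                # 8 d.o.f.

    print("two-parameter:", slope2, intercept2, chi2_two)
    print("one-parameter:", slope1, chi2_one)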

7) If the full-blown chi-square analysis is beyond what the students can
handle, a Monte Carlo analysis should get the point across. Specifically:
a) Keep the abscissas of the 9 observed points. Throw away the ordinates.
b) Manufacture new ordinates mathematically, according to the exact
square law.
c) Starting from the exact manufactured points, add some random noise to
the ordinates _and_ abscissas.
d) Perform a two-parameter fit. Find the y-intercept.
e) Go back to step (c) and iterate. Each time, draw new noise-samples
from the same noise distribution. (Hint: the "recalculate" button in a
spreadsheet program draws random numbers anew.)

Histogram the resulting y-intercepts.

This should give you a handle on how uncertainty in the y-intercept depends
on uncertainty in the observations.
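
Here is a minimal sketch of this procedure in Python/NumPy; the noise
levels below are assumptions and should be replaced by your own
estimates of the measurement uncertainty:

    import numpy as np

    rng = np.random.default_rng()

    # (a) keep the 9 abscissas; these lengths (in m) are stand-ins,
    #     not the class data
    length = np.array([0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00])

    # (b) manufacture exact ordinates from the square law  T^2 = (4*pi^2/g)*l
    g = 9.8
    T2_exact = (4 * np.pi ** 2 / g) * length

    # Assumed noise levels -- replace with your own uncertainty estimates.
    sigma_L  = 0.005   # m
    sigma_T2 = 0.03    # s^2

    intercepts = []
    for _ in range(10000):
        # (c) fresh noise on both coordinates each pass ("recalculate")
        L_noisy  = length   + rng.normal(0.0, sigma_L,  length.size)
        T2_noisy = T2_exact + rng.normal(0.0, sigma_T2, length.size)
        # (d) two-parameter fit; keep the y-intercept
        slope, intercept = np.polyfit(L_noisy, T2_noisy, 1)
        intercepts.append(intercept)

    # (e) done iterating; histogram the intercepts (np.histogram here,
    #     or a chart in a spreadsheet program)
    intercepts = np.array(intercepts)
    print("std dev of intercepts:", intercepts.std())
    print("fraction with |intercept| > 4% of max T^2:",
          np.mean(np.abs(intercepts) > 0.04 * T2_exact.max()))

If an intercept of 4% of the maximum T^2 falls well inside the spread of
this histogram, the observed intercept is unremarkable; if it sits far
out in the tail, there is something worth hunting for.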

8) But this raises the question of how much uncertainty there was in the
original observations. Nine data points is not really enough to perform
the two required tasks: obtaining a decent internal estimate of the
variance of the data _and_ then estimating the slope and intercept (and the
uncertainties thereof). Imagine a group of four points at one abscissa,
and five points at another; that's barely enough to get a rough estimate
of the variance within each group.
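
For example, a repeated-point estimate of the variance might be sketched
as follows, with hypothetical repeated T^2 measurements standing in for
real ones:

    import numpy as np

    # Hypothetical repeated T^2 measurements (s^2) at two abscissas.
    group_a = np.array([1.62, 1.65, 1.60, 1.64])          # four points
    group_b = np.array([3.20, 3.27, 3.22, 3.25, 3.19])    # five points

    # Sample variance within each group (ddof=1 is the unbiased estimator);
    # with only 4 or 5 points these estimates are themselves quite rough.
    print("variance at abscissa A:", np.var(group_a, ddof=1))
    print("variance at abscissa B:", np.var(group_b, ddof=1))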

9) There is no clear-cut, principled distinction between systematic
uncertainty and other types of uncertainty. Professionals just speak of
uncertainty and leave it at that.

For example, suppose the rulers in the classroom expand and contract as a
function of temperature. At any given moment, this leads to a systematic
error. But if the experiment is repeated at different times of the year
(i.e. at different ambient temperatures) the systematic error becomes a
random error. Also, if the ruler is randomly selected from an ensemble
(metal, wood, plastic, ...), another element of randomness is
introduced. The point is that it's not worth obsessing over the details.

At the end of the task you want to understand the _physics_ of each process
that introduces significant uncertainty; applying adjectives like
"systematic" to the processes is of secondary importance or less.