Re: [Phys-l] Basic statistics



On 11/09/2006 06:24 AM, Ludwik Kowalski wrote:
.... Should the
error bar be 1.2 or should it be 0.4,

It depends.

The usual advice is to say what you mean, and mean what you
say ... but that is sometimes hard to do, because the standard
terminology in this area is a mess.

In the following I will use my own colorful but nonstandard
terminology.

There are *three* different sets of statistics in play:
a) the statistics of the underlying distribution from which samples
are drawn.
b) the statistics of the individual measurements ("atoms").
c) the statistics of clusters ("molecules") containing 9 atomic
measurements.

In http://blake.montclair.edu/~kowalskil/basicstat.html
it would be helpful to clearly distinguish the three different
concepts.

In any case:

a) The underlying distribution has some mean μ and some standard
deviation σ, which we might never know exactly.
b) We can estimate μ based on one atomic measurement.
c) We can estimate μ based on a cluster of 9 measurements.
d) We cannot estimate σ from a single atomic measurement.
e) We can estimate σ from a multi-atom cluster.
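
Points (b) through (e) can be sketched in a few lines of Python. This is only an illustration: the function name estimate_mu_sigma and the data values are hypothetical, and I am assuming the usual sample standard deviation (N-1 in the denominator) as the estimator σ'.

```python
import statistics

def estimate_mu_sigma(measurements):
    """Return (mu_est, sigma_est) for a list of measurements.

    sigma_est is None when there are too few points to estimate sigma.
    """
    mu_est = statistics.mean(measurements)
    try:
        # Sample standard deviation (N-1 denominator); needs at least 2 points.
        sigma_est = statistics.stdev(measurements)
    except statistics.StatisticsError:
        sigma_est = None
    return mu_est, sigma_est

# Hypothetical data, made up for illustration only.
atom = [10.3]
molecule = [10.3, 9.1, 11.8, 8.7, 10.9, 9.6, 12.1, 10.0, 9.4]

print(estimate_mu_sigma(atom))      # mu' is available, sigma' is not
print(estimate_mu_sigma(molecule))  # both mu' and sigma' are available
```

A single atom gives an estimate of μ but no handle on σ; the 9-atom molecule gives both.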

Here's the crux of the matter: Given enough measurements, our
estimate μ' will be a rather precise estimate of μ. This process
is known as signal averaging. The uncertainty in μ' is denoted
Δμ' and might be markedly less than σ, the standard deviation of
the underlying distribution.
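
For N independent measurements drawn from the same distribution, the usual estimate is Δμ' ≈ σ'/√N. Here is a minimal sketch; the function name standard_error is hypothetical, and reading the 1.2 and 0.4 in the question as σ' and Δμ' for a 9-atom molecule is my assumption.

```python
import math

def standard_error(sigma_est, n):
    """Uncertainty of the mean of n independent measurements."""
    if n < 1:
        raise ValueError("need at least one measurement")
    return sigma_est / math.sqrt(n)

# Assumed reading of the question: sigma' = 1.2 with N = 9 atoms per molecule.
delta_mu = standard_error(1.2, 9)
print(f"{delta_mu:.2f}")
```

With those assumed numbers the result rounds to 0.40, i.e. one third of σ'.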

Among other things, that means:
-- If we increase the number of measurements, there should not
be any systematic drift in μ', i.e. our estimate of μ. (It may
wander around randomly, but should not systematically drift.)
-- If we increase the number of measurements, there should not
be any systematic drift in σ', i.e. our estimate of σ.
-- If we increase the number of measurements, there will be a
systematic decrease in Δμ', i.e. the uncertainty of our estimate
of μ.
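
The three statements above can be checked with a small simulation. This is a sketch under stated assumptions: the underlying distribution is taken to be Gaussian with μ = 10.0 and σ = 1.2 (values I made up), and summarize is a hypothetical helper name.

```python
import math
import random
import statistics

def summarize(n, mu=10.0, sigma=1.2, seed=0):
    """Draw n simulated measurements; return (mu_est, sigma_est, delta_mu_est)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    data = [rng.gauss(mu, sigma) for _ in range(n)]
    mu_est = statistics.mean(data)
    sigma_est = statistics.stdev(data)       # sample standard deviation
    delta_mu_est = sigma_est / math.sqrt(n)  # uncertainty of mu_est
    return mu_est, sigma_est, delta_mu_est

for n in (9, 90, 900, 9000):
    mu_est, sigma_est, delta_mu_est = summarize(n)
    print(f"N={n:5d}  mu'={mu_est:6.3f}  sigma'={sigma_est:5.3f}  "
          f"delta_mu'={delta_mu_est:5.3f}")
```

As N grows, μ' and σ' should settle near the assumed 10.0 and 1.2 without drifting, while Δμ' shrinks roughly as 1/√N.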

Therefore, the bottom line is that you have a choice:
*) If you choose to consider μ to be the object of interest, then
you should report μ' ± Δμ', which tells people how well you have
estimated μ.
*) If you choose to consider the underlying distribution to be
the object of interest, then you should report μ' ± σ'.
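
The two choices can be made explicit in code, so that the reader of a report knows which one was meant. A minimal sketch: the function name report, its target parameter, and the data values are all hypothetical.

```python
import math
import statistics

def report(measurements, target):
    """Format mu' ± (uncertainty) for the chosen object of interest.

    target="mean":         report mu' ± delta_mu' (how well mu is known)
    target="distribution": report mu' ± sigma'    (spread of the distribution)
    """
    mu_est = statistics.mean(measurements)
    sigma_est = statistics.stdev(measurements)
    if target == "mean":
        half_width = sigma_est / math.sqrt(len(measurements))
    elif target == "distribution":
        half_width = sigma_est
    else:
        raise ValueError("target must be 'mean' or 'distribution'")
    return f"{mu_est:.2f} ± {half_width:.2f}"

# Hypothetical 9-atom molecule, made up for illustration only.
molecule = [10.3, 9.1, 11.8, 8.7, 10.9, 9.6, 12.1, 10.0, 9.4]
print("mean:        ", report(molecule, "mean"))
print("distribution:", report(molecule, "distribution"))
```

For 9 measurements the two half-widths differ by a factor of 3, so it pays to say which one is being reported.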