
Re: [Phys-L] points don't have error bars (distributions do)

On 10/21/2012 02:23 PM, Folkerts, Timothy J wrote:
> in many (all?) cases, the individual data point itself can be
> considered to have been drawn from a random distribution. For
> example, suppose I measure a voltage as 8.00 V using a specific
> voltmeter. The manufacturer tells me that the accuracy of the meter
> is 1 digit ± 0.5%. In effect, the manufacturer is telling me that if
> I measured the SAME voltage with a set of their instruments, then the
> readings of those meters would not all be the same and they should
> range within 0.05 V of the "correct voltage". That single point is a
> single instance of a value drawn from a random distribution. As
> such, the error bars are telling us about the distribution of values
> for that specific point, whereas the parameters of the curve itself
> relate to a different random distribution.

I claim that there is not a "different random distribution" of the
kind described.

I've been thinking some more about this. I now have a concrete,
quantitative, objective, pedagogical example that illustrates
why it is a bad idea to associate error bars with an individual
reading. By "bad" I mean wrong in principle and demonstrably
problematic in practice.

Let's start with some "original" random distribution. In accordance
with the frequentist definition of probability, we make a humongous
number of readings, drawing them from this distribution. The result
is what physicists call an ensemble (and statisticians call a sample).
In the large-N limit, the frequency with which such-and-such event
occurs in the ensemble is the frequentist /definition/ of probability.
See e.g. Feynman volume I chapter 6.

If you plot the ensemble of points, it will have a certain spread.
It must be emphasized that the unadorned points -- without any alleged
error bars -- will exhibit spread. In fact they will exhibit exactly
the correct amount of spread. For instance, in the large-N limit, the
standard deviation of the ensemble of points will converge to the
standard deviation of the "original" theoretical ensemble. Other
statistical properties will converge as well.
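
The convergence claim above can be checked numerically. What follows
is a minimal sketch, not a definitive treatment: it assumes the
"original" distribution is Gaussian with mean 0 and standard deviation
1, and the function name ensemble_std and the fixed seed are
illustrative choices, not anything from the original discussion.

```python
import random
import statistics

def ensemble_std(n, mu=0.0, sigma=1.0, seed=0):
    """Standard deviation of n bare points (no error bars attached).

    Assumption: the "original" distribution is Gaussian(mu, sigma).
    """
    rng = random.Random(seed)
    points = [rng.gauss(mu, sigma) for _ in range(n)]
    # The spread comes from the points themselves, nothing else.
    return statistics.pstdev(points)

# The spread of the unadorned points should approach sigma = 1
# as the ensemble grows.
for n in (10, 1000, 100000):
    print(n, ensemble_std(n))
```

For small N the result fluctuates noticeably; as N grows it should
settle near the sigma of the distribution the points were drawn from.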

My point is that if you attribute to each point some additional width,
perhaps by imagining that it wiggles around a little bit within its
error bars, you will get the wrong answer. It's wrong by a lot, as
you can see by glancing at this figure:

The dashed red line is the right answer, while the black curve is what
you get by attributing error bars to the individual points. The black
curve does not converge to the right answer.
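
One way to see why the smeared version cannot converge to the right
answer: for independent contributions, variances add. The sketch below
is hedged accordingly. It assumes the "original" distribution is
Gaussian with standard deviation sigma, and it assumes each point
"wiggles" by an independent Gaussian amount of width bar (set equal to
sigma here). These are illustrative assumptions, not necessarily the
construction behind the figure, and the name smeared_std is
hypothetical.

```python
import math
import random
import statistics

def smeared_std(n, sigma=1.0, bar=1.0, seed=0):
    """Return (std of bare points, std of points smeared by error bars).

    Assumptions: Gaussian original distribution of width sigma, and a
    Gaussian "wiggle" of width bar attributed to each individual point.
    """
    rng = random.Random(seed)
    points = [rng.gauss(0.0, sigma) for _ in range(n)]
    # Attribute extra width to each point by letting it wiggle
    # within its alleged error bar.
    smeared = [x + rng.gauss(0.0, bar) for x in points]
    return statistics.pstdev(points), statistics.pstdev(smeared)

bare, smeared = smeared_std(100000)
print("bare points:   ", bare)
print("smeared points:", smeared)
print("sqrt(sigma^2 + bar^2):", math.sqrt(1.0**2 + 1.0**2))
```

Under these assumptions the bare points converge to sigma, while the
smeared points converge to sqrt(sigma^2 + bar^2), which is too wide by
a factor of sqrt(2) when bar equals sigma. Taking more data does not
shrink the discrepancy.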

In contrast, if you treat the points as zero-size points, without any
error bars, the ensemble converges to the right answer. For details

That document is new and somewhat drafty. It will need to be rewritten
a few more times before it comes up to my standards. Questions and
suggestions are welcome.


I am quite aware that virtually everyone who ever took a high-school
chemistry course has been "taught" that every reading must have an
"associated" uncertainty.

That doesn't make it right. It's provably not right.

The uncertainty is a property of the distribution as a whole (aka the
ensemble as a whole) ... not of any individual point drawn from the