
Re: t-distribution, was Geiger ...



At 01:21 PM 3/25/00 -0500, Ludwik Kowalski wrote:
I still do not know how the t-distribution was discovered in
1907 by Gosset. Everybody repeats that the beer company
prevented him from using his real name, and other trivial
details, but nobody, as far as I know, tries to explain how
he discovered the distribution. The hints are that it was
first discovered (and tabulated) and then mathematically
derived by others. But this is not clear to me.

I have no sources beyond the Encyclopedia Britannica, but the story I read
seems to hold together. We know where he ended up, so the route to get
there must have been:

*) He was drawing numbers from some random process, which I will call an urn.

*) He wanted to know the mean and S.D. that characterize the urn.

*) He couldn't afford to draw enough samples to get a good estimate for the
mean.

*) Obviously, if you haven't got a good estimate of the mean, you can't get
a good estimate of the variance. In fact, if you just plug in the obvious
estimate (the "sample mean") and average the squared deviations from it,
then your estimate of the variance is _systematically_ wrong: too low on
average. If you doubt it, consider the case of just a single draw: the
sample variance is _zero_, which is unlikely to be a good estimate of the
urn's variance.

*) Gosset wanted to apply appropriate corrections so he could get the best
possible estimates from a few draws. He devised his formulas by a
combination of mathematical analysis and what we would nowadays call Monte
Carlo simulations.
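
The bias described above is easy to check numerically. What follows is
only a sketch, not a reconstruction of anything Gosset did: it assumes the
urn is a normal distribution with mean 0 and variance 1, and the function
name and parameters are made up for illustration. It draws many small
samples and compares the naive variance estimate (divide by n) with the
corrected one (divide by n-1).

```python
import random

def variance_bias_demo(n=3, trials=20000, mu=0.0, sigma=1.0, seed=1):
    """Average two variance estimates over many small samples.

    Assumption: the "urn" is a normal distribution with mean mu and
    standard deviation sigma. Returns (naive_average, corrected_average).
    """
    rng = random.Random(seed)
    sum_naive = 0.0
    sum_corrected = 0.0
    for _ in range(trials):
        draws = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(draws) / n
        # Sum of squared deviations from the *sample* mean, not the true mean.
        ss = sum((x - sample_mean) ** 2 for x in draws)
        sum_naive += ss / n            # obvious estimate, biased low
        sum_corrected += ss / (n - 1)  # Bessel-corrected estimate
    return sum_naive / trials, sum_corrected / trials

if __name__ == "__main__":
    naive, corrected = variance_bias_demo()
    print("true variance      : 1.0")
    print("naive (divide by n):", round(naive, 3))
    print("corrected (n - 1)  :", round(corrected, 3))
```

With n = 3 the naive average should come out near (n-1)/n = 2/3 of the true
variance, while the corrected average should sit close to 1.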


Let me say that I have fantastically good evidence that the
means from experimental (Geiger counter) samples of small
size follow the t-distribution and disagree with the normal
distribution. Every statistics textbook describes this.

... Preferably references
which could easily be found by a typical science teacher.

I'm not sure the Geiger counter is the best pedagogical approach. Unlike
drawing numbers from an urn, where the numbers are imagined to pre-exist,
the Geiger-counter numbers are themselves some sort of average of a
statistical process. This complicates the terminology (what "mean" are you
talking about? which "process" are you talking about?) and makes the
concepts look more complicated than they really are. And Geiger tubes (and
radioactive sources) are not available to the typical science teacher.

Following the spirit of Gosset, why not just do a Monte Carlo with a
spreadsheet? That's available to the typical science teacher.
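
For anyone who would rather script it than build a spreadsheet, here is a
minimal sketch of the same experiment in Python. The assumptions are mine:
the urn is a standard normal distribution, each sample has only 4 draws,
and the names are hypothetical. For each sample it forms the statistic
t = (sample mean - true mean) / (s / sqrt(n)), where s is the sample
standard deviation, and counts how often |t| exceeds 1.96. If t followed
the normal distribution that would happen about 5% of the time.

```python
import math
import random
import statistics

def tail_fraction(n=4, trials=20000, threshold=1.96, seed=2):
    """Fraction of small samples whose t statistic exceeds the threshold.

    Assumption: draws come from a normal "urn" with true mean 0 and
    standard deviation 1, so the true mean is known to the simulation.
    """
    rng = random.Random(seed)
    true_mean = 0.0
    exceed = 0
    for _ in range(trials):
        draws = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        sample_mean = statistics.mean(draws)
        s = statistics.stdev(draws)  # uses the n - 1 divisor
        t = (sample_mean - true_mean) / (s / math.sqrt(n))
        if abs(t) > threshold:
            exceed += 1
    return exceed / trials

if __name__ == "__main__":
    print("fraction with |t| > 1.96, n = 4 :", tail_fraction(n=4))
    print("fraction with |t| > 1.96, n = 30:", tail_fraction(n=30))
    print("normal-distribution prediction  : 0.05")
```

For n = 4 the fraction should come out well above 5%, because s is itself
a noisy estimate; for n = 30 it should be much closer to 5%. Tabulating
such fractions for several thresholds and sample sizes gives an empirical
version of a t table.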