
Re: [Phys-l] Basic statistics



On 11/10/2006 09:46 AM, Brian Blais wrote:

From my personal experience trying to learn basic statistics, I always got hung up
on the notion of a population, and of the standard deviation of the mean.

Yeah. There are inconsistencies in the usual explanation,
and lots of bugs in the terminology.

I found
the Bayesian approach to be more intuitive, easier to apply to real data, and
more mathematically sound (there is a great article by E.T. Jaynes at
http://bayes.wustl.edu/etj/articles/confidence.pdf where he outlines several
pathologies in standard stats).

Bottom line: there is no population in the Bayesian approach. Probability is a
measure of one's state of knowledge, not a property of the system. With that view,
all of the strained attempts at creating a fictitious population out of measurements
vanish (such as, say, analyzing measurements of the mass of the moon by imagining
many hypothetical universes of "identical" measurements). One is instead quantifying
one's state of knowledge.

In almost all easy cases, the Bayesian approach yields the *exact same* numerical
result as the standard approach. The interpretation is a lot easier, and a lot
easier to communicate to students.

That's true as stated. I agree the Bayesian approach is "better"
and "easier", but I think the measure-theory approach is
/even better and easier still/.
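
To make the "same numerical result" point concrete, here is a minimal numpy
sketch (the measurement numbers are invented for the illustration, and sigma is
taken equal to the sample estimate to keep it simple): for repeated Gaussian
measurements with a flat prior on the mean, the Bayesian posterior is centered
on the sample mean with width equal to the usual standard error, so the two
summaries agree digit for digit.

    import numpy as np

    rng = np.random.default_rng(1)
    data = rng.normal(10.0, 2.0, size=25)   # hypothetical repeated measurements
    n = len(data)

    # Standard approach: sample mean and standard error of the mean.
    xbar = data.mean()
    sem  = data.std(ddof=1) / np.sqrt(n)

    # Bayesian approach with a flat prior on the mean: the posterior for
    # the mean is Gaussian with the same center and the same width.
    post_center = xbar
    post_width  = data.std(ddof=1) / np.sqrt(n)

    print(f"standard:  {xbar:.3f} +/- {sem:.3f}")
    print(f"Bayesian:  {post_center:.3f} +/- {post_width:.3f}")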

One advantage of the Bayesian approach is that it is in tune with
the typical physicist's intuitive approach to probability. That's
mostly good, but not without drawbacks.

Specifically, IMHO the Bayesian approach (as usually formulated)
puts too much emphasis on "the" prior and "the" state of nature.
It reminds me of Euclidean geometry, which pretends to know
_a priori_ "the" geometry of the universe. That idea was accepted
for 2000 years, but now we know it is not the smart approach. In
fact there are many inequivalent non-Euclidean geometries. Applying
this analogy to probability, we must keep in mind that there are
many different probability measures, and we do not know
_a priori_ which of them are suited to the situation of interest.
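
As a toy illustration of "many inequivalent probability measures" (the coin-flip
counts and the two priors here are invented for the sketch): the same data,
analyzed under two different priors, yield two different posteriors, and nothing
in the data alone tells you which prior was "the" right one to start from.

    import numpy as np
    from scipy import stats

    heads, tails = 7, 3    # invented coin-flip data

    # Two different probability measures over the coin's bias p:
    priors = {"flat Beta(1,1)":        (1, 1),
              "skeptical Beta(50,50)": (50, 50)}

    for name, (a, b) in priors.items():
        post = stats.beta(a + heads, b + tails)   # conjugate update
        lo, hi = post.interval(0.95)
        print(f"{name:22s} posterior mean = {post.mean():.3f}, "
              f"95% interval = ({lo:.3f}, {hi:.3f})")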

For me, the rubber really met the road when I was building a
neural-network machine-learning optical-character-recognition
system. The machine looked at an input (say, a handwritten
digit) and generated an output that represented the probability
of being in each of the possible categories (ten categories
for digits).

The question arose, what *type* of probability was represented by
the output? Was it the likelihood of image given category P(i|c),
or was it the _a posteriori_ probability P(c|i), or perhaps the
joint probability P(i,c), or something else entirely?

I read a bunch of books and asked a bunch of experts from a variety
of fields. I got many authoritative answers ... mutually inconsistent
authoritative answers. After a while, I decided they were /all/
wrong. The answer is that the process of training the neural network
can be considered an optimization, i.e. a search through the space
of all possible probability measures. You can /train/ the network
to produce P(i|c) or P(c|i) or whatever you want; you just need
to engineer a training procedure that will search out the probability
measure you want.
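
Here is a minimal numpy sketch of that last point (synthetic 1-D "images", two
categories, a one-neuron network; none of it is the OCR system described above).
Training by ordinary cross-entropy makes the optimization search converge toward
the posterior P(c|i) for the training distribution; a different loss, or a
re-weighted training set, would steer the search toward a different probability
measure.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic 1-D inputs: two categories with known Gaussian likelihoods P(i|c).
    n0, n1 = 3000, 1000                               # unequal category frequencies
    x = np.concatenate([rng.normal(-1.0, 1.0, n0),    # category 0: N(-1, 1)
                        rng.normal(+1.0, 1.0, n1)])   # category 1: N(+1, 1)
    y = np.concatenate([np.zeros(n0), np.ones(n1)])

    # A one-neuron "network" trained by gradient descent on the cross-entropy.
    w, b = 0.0, 0.0
    for _ in range(5000):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # network output for each input
        w -= 0.5 * np.mean((p - y) * x)          # gradient of mean cross-entropy
        b -= 0.5 * np.mean(p - y)

    # This particular training procedure searches out the posterior P(c=1|i),
    # which for these Gaussians can be written down exactly and compared.
    def posterior(xs, prior1=n1 / (n0 + n1)):
        like1 = np.exp(-0.5 * (xs - 1.0) ** 2)   # P(i|c=1), up to a common factor
        like0 = np.exp(-0.5 * (xs + 1.0) ** 2)   # P(i|c=0)
        return like1 * prior1 / (like1 * prior1 + like0 * (1.0 - prior1))

    xs = np.array([-2.0, 0.0, 2.0])
    print("network output:", 1.0 / (1.0 + np.exp(-(w * xs + b))))
    print("exact P(c=1|i):", posterior(xs))

Re-weighting the two categories in the training set (or in the loss) would make
the same search converge toward a different measure, which is exactly the sense
in which the training procedure, not the network, decides what probability the
output represents.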

Anything you can do with the Bayesian approach you can do just as
easily with the measure-theory approach.

Bottom-line analogy:
Bayes ~~ Euclidean geometry
Measure theory ~~ modern geometry (which includes Euclidean geometry
as a special case)