
# Re: LSF method vs. averaging

• From: John Denker <jsd@AV8N.COM>
• Date: Wed, 10 Mar 2004 17:59:59 -0800

Quoting Chuck Britton <britton@NCSSM.EDU>:

At 4:12 PM -0500 3/10/04, Bob Sciamanda wrote:

For a summary of the LSF method and the rationale for using squares vs
absolute values, go to:

http://mathworld.wolfram.com/LeastSquaresFitting.html

Interesting comments from Wolfram here.
They say that squares of the 'offsets' works better than absolute
values because the squared function is differentiable.

I am *not* impressed by that argument.

A more thoughtful rationale for using the squares (as opposed
to the absolute values) can be summarized in two words:
log probability.

The following set of assumptions suffices (and there are other
sets of assumptions that lead to similar conclusions):
-- assuming errors are IID (independent and identically distributed)
-- assuming the errors in the ordinate are additive Gaussian
white noise
-- assuming the errors in the abscissa are negligible

... then the probability density of the error on each point is
proportional to exp(-error^2 / (2 sigma^2)), and the overall
probability for a set of points is the product of terms like that,
so the overall log probability is, up to an additive constant,
minus the sum of squares divided by 2 sigma^2. Minimizing the sum
of squares is therefore maximizing the probability. That is, you
are finding parameters such that the model, with those parameters,
would have been maximally likely to generate the observed data.
So far so good. Most non-experts are satisfied with this
explanation.
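
The equivalence can be checked numerically. What follows is only a
minimal sketch under the assumptions listed above; the straight-line
model, the function names, and the synthetic data are invented for
illustration and are not taken from any of the books or pages
mentioned in this thread.

```python
import numpy as np

def fit_line_least_squares(x, y):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    # Design matrix for the hypothetical model y = slope * x + intercept.
    A = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    return slope, intercept

def gaussian_log_likelihood(x, y, slope, intercept, sigma=1.0):
    """Log probability of the data given the model, for IID Gaussian noise."""
    resid = y - (slope * x + intercept)
    n = len(x)
    # Sum of per-point log densities: a constant minus sum(resid^2)/(2 sigma^2).
    return (-0.5 * np.sum(resid ** 2) / sigma ** 2
            - n * np.log(sigma * np.sqrt(2.0 * np.pi)))

# Synthetic data: an assumed "true" line plus additive Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=x.size)

slope, intercept = fit_line_least_squares(x, y)
best = gaussian_log_likelihood(x, y, slope, intercept)

# Nudging either parameter away from the least-squares values
# should lower the log likelihood.
nudged_slope = gaussian_log_likelihood(x, y, slope + 0.01, intercept)
nudged_intercept = gaussian_log_likelihood(x, y, slope, intercept - 0.05)
print(best, nudged_slope, nudged_intercept)
```

The least-squares parameters sit at the top of the log-likelihood
surface because, for fixed sigma, the two objectives differ only by
a sign, a scale factor, and an additive constant.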

There is, however, a fly in the ointment. The probability in
question is, alas, a likelihood, i.e. an _a priori_ probability,
and if you really care about the data you should almost certainly
be doing MAP (maximum _a posteriori_) rather than maximum
likelihood. That is, you want to maximize the probability of the
model *given* the data, not the probability of the data *given*
the model. Still, for typical high-school exercises maximum
likelihood should be good enough.
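
To make the distinction concrete, here is a hedged sketch of one
simple case. It assumes, purely for illustration, a zero-mean
Gaussian prior of width tau on each fit parameter; that prior, the
function names, and the three data points are assumptions of this
example and not something established in this thread. Under that
assumed prior, maximizing the posterior amounts to adding a penalty
term to the sum of squares.

```python
import numpy as np

def fit_ml(A, y):
    """Maximum likelihood for Gaussian noise: ordinary least squares."""
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta

def fit_map(A, y, sigma=1.0, tau=1.0):
    """MAP estimate, ASSUMING a zero-mean Gaussian prior of width tau.

    The log posterior is, up to a constant,
        -|y - A theta|^2 / (2 sigma^2) - |theta|^2 / (2 tau^2),
    whose maximum solves the penalized normal equations below.
    """
    lam = (sigma / tau) ** 2
    n_params = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_params), A.T @ y)

# Three made-up points, fitted with the model y = slope * x + intercept.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.1, 2.9, 5.2])
A = np.column_stack([x, np.ones_like(x)])

theta_ml = fit_ml(A, y)
theta_map = fit_map(A, y, sigma=1.0, tau=1.0)
theta_map_flat = fit_map(A, y, sigma=1.0, tau=1e6)  # nearly flat prior
print(theta_ml, theta_map, theta_map_flat)
```

With a very broad prior the two estimates coincide, which is one
way to see why maximum likelihood is often good enough; with few
data points and a narrow prior they can differ noticeably.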

I would also like to remind people to avoid the book by
Bevington. The programs in the book are macabre object
lessons in how not to write software. And the other parts
of the book are not particularly good, either. The
_Numerical Recipes_ book by Press et al. has reasonable
programs. Gonick's book _The Cartoon Guide to Statistics_
is a very reasonable introduction to the subject; don't be
put off by the title.