
*From*: John Denker <jsd@AV8N.COM>
*Date*: Wed, 10 Mar 2004 17:59:59 -0800

Quoting Chuck Britton <britton@NCSSM.EDU>:

> At 4:12 PM -0500 3/10/04, Bob Sciamanda wrote:
>
>> For a summary of the LSF method and the rationale for using squares
>> vs absolute values, go to:
>> http://mathworld.wolfram.com/LeastSquaresFitting.html
>
> Interesting comments from Wolfram here.
> They say that squares of the 'offsets' works better than absolute
> values because the squared function is differentiable.

I am *not* impressed by that argument.

A more thoughtful rationale for using the squares (as opposed to the
absolute values) can be summarized in two words: log probability.

The following set of assumptions suffices (and there are other sets of
assumptions that lead to similar conclusions):
 -- assuming errors are IID (independent and identically distributed)
 -- assuming the errors in the ordinate are additive Gaussian white noise
 -- assuming the errors in the abscissa are negligible

... then the probability of the error on each point is proportional to
exp(-error^2 / 2 sigma^2), and the overall probability for a set of
points is the product of terms like that, so the overall log
probability is (up to a constant) minus the sum of squares. Minimizing
the sum of squares is therefore maximizing the probability. That is,
you are finding parameters such that the model, with those parameters,
would have been maximally likely to generate the observed data. So far
so good. Most non-experts are satisfied with this explanation.
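The argument above can be checked numerically. Here is a minimal
sketch (mine, not part of the original post, with an invented data set
and noise level): fit a straight line by ordinary least squares, then
verify that nudging the fitted parameters in any direction only lowers
the Gaussian log likelihood.

```python
# Sketch: the least-squares line also maximizes the Gaussian
# likelihood.  Data, true slope/intercept, and sigma are invented.
import random

random.seed(0)
sigma = 0.5
xs = [0.2 * i for i in range(50)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, sigma) for x in xs]  # additive Gaussian noise

def log_likelihood(a, b):
    # log P(data | a, b) = sum over points of
    #   -(y - (a*x + b))^2 / (2 sigma^2)   (plus a constant)
    return sum(-(y - (a * x + b)) ** 2 / (2.0 * sigma ** 2)
               for x, y in zip(xs, ys))

# Closed-form least-squares slope and intercept (normal equations):
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
a_ls = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        / sum((x - xbar) ** 2 for x in xs))
b_ls = ybar - a_ls * xbar

# Nudging the least-squares parameters never raises the likelihood:
best = log_likelihood(a_ls, b_ls)
for da in (-0.01, 0.01):
    for db in (-0.01, 0.01):
        assert log_likelihood(a_ls + da, b_ls + db) < best
```

Because the IID assumption makes sigma the same for every point, it
drops out of the comparison: any sigma gives the same minimizer.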

There is, however, a fly in the ointment. The probability in question
is, alas, a likelihood, i.e. an _a priori_ probability, and if you
really care about the data you should almost certainly be doing MAP
(maximum _a posteriori_) estimation rather than maximum likelihood.
That is, you want to maximize the probability of the model *given* the
data, not the probability of the data *given* the model. Still, for
typical high-school exercises maximum likelihood should be good enough.
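To make the ML-versus-MAP distinction concrete, here is another sketch
of mine (not the poster's; all numbers invented): for a line through
the origin with a Gaussian prior on the slope, maximizing the
posterior means minimizing the sum of squares *plus* a penalty term,
so the MAP estimate is pulled from the maximum-likelihood answer
toward the prior mean.

```python
# Sketch: MAP vs maximum likelihood for a one-parameter model
# y = a*x with Gaussian noise and a Gaussian prior on the slope a.
import random

random.seed(1)
sigma = 0.5          # noise level (assumed known)
prior_mean = 0.0     # prior belief: slope near zero
prior_sigma = 1.0    # how strongly that belief is held

xs = [0.5 * i for i in range(10)]
ys = [2.0 * x + random.gauss(0.0, sigma) for x in xs]

def neg_log_posterior(a):
    # -log P(a | data) = -log P(data | a) - log P(a) + const
    misfit = sum((y - a * x) ** 2 for x, y in zip(xs, ys)) / (2 * sigma ** 2)
    penalty = (a - prior_mean) ** 2 / (2 * prior_sigma ** 2)
    return misfit + penalty

# With one parameter both estimates have closed forms:
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
a_ml = sxy / sxx                                      # maximum likelihood
a_map = ((sxy / sigma ** 2 + prior_mean / prior_sigma ** 2)
         / (sxx / sigma ** 2 + 1.0 / prior_sigma ** 2))  # MAP

# The prior pulls the MAP estimate toward prior_mean, and the MAP
# estimate (not the ML one) minimizes the negative log posterior:
assert abs(a_map - prior_mean) < abs(a_ml - prior_mean)
assert neg_log_posterior(a_map) < neg_log_posterior(a_ml)
```

With lots of data (sxx large) the penalty term is swamped and MAP
collapses back to maximum likelihood, which is why ML is good enough
for the typical classroom exercise.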

I would also like to remind people to avoid the book by Bevington.
The programs in the book are macabre object lessons in how not to
write software. And the other parts of the book are not particularly
good, either. The _Numerical Recipes_ book by Press et al. has
reasonable programs. Gonick's book _The Cartoon Guide to Statistics_
is a very reasonable introduction to the subject; don't be put off by
the title.
