
# [Phys-L] weighted linear regression using spreadsheets

• From: John Denker <jsd@av8n.com>
• Date: Fri, 3 Apr 2020 09:17:42 -0700

On 4/2/20 1:33 PM, bernard cleyet wrote:

> I think the fit (is a Marquardt) treats each datum equally unless one weights the data.

Let's discuss the topic of averaging and/or curve fitting.
Note that I consider averaging to be just a particularly simple
form of curve fitting.

Pedagogical suggestion:
A) When introducing the topic, don't even mention weights.
Let all fits be unweighted, by which we mean equally-weighted.

B) On the next turn of the pedagogical spiral, the motto
should be:
-- All averages are weighted averages.
-- All fits are weighted fits.

In my world, equally-weighted data is the exception not the
rule.
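
To make the "averaging is a simple form of curve fitting" remark concrete, here is a minimal Python sketch. The data values and error bars are made up for illustration, and the choice of weights 1/sigma^2 is the usual assumption for independent Gaussian errors, not something stated above.

```python
import numpy as np

# Hypothetical measurements of one quantity, with unequal error bars.
y = np.array([10.1, 9.8, 10.4, 9.5])
sigma = np.array([0.1, 0.2, 0.4, 0.5])

# Unweighted: fit the one-parameter model y = c using a column of ones.
ones = np.ones_like(y)
c_plain, *_ = np.linalg.lstsq(ones[:, None], y, rcond=None)

# Weighted: assume inverse-variance weights, w_i = 1 / sigma_i^2.
w = 1.0 / sigma**2
c_weighted = np.sum(w * y) / np.sum(w)

print(c_plain[0], np.mean(y))                # the fit reproduces the plain mean
print(c_weighted, np.average(y, weights=w))  # the weighted mean, two ways
```

The unweighted fit of a constant lands on the ordinary mean, and the weighted version is pulled toward the points with the smallest error bars.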

> Whether the scale (ordinate) is linear or log, the fit is the same.

Really? I must be misunderstanding that sentence, because I
don't see how it could be true. Scaling the data has a huge
effect on the weights. Indeed this is an arcane but effective
way of controlling the weights, if that's what you want, as
discussed below.
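
A rough Python sketch of why the scale matters, using made-up data. It fits a straight line to ln(y) twice: once with equal weights, and once weighted as if every y_i had the same absolute error bar. The second case relies on the first-order approximation sigma(ln y) = sigma_y / y, which is my assumption here, not part of the post.

```python
import numpy as np

# Hypothetical data: roughly y = 5 * exp(-x), with made-up scatter.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([5.2, 1.7, 0.75, 0.2, 0.12])

# (a) Straight-line fit to ln(y), every point weighted equally.
slope_a, icpt_a = np.polyfit(x, np.log(y), 1)

# (b) Same fit, weighted as if each y_i had the same absolute error bar.
#     To first order sigma(ln y) = sigma_y / y, so the polyfit weight
#     (which is 1/sigma, not 1/sigma^2) is proportional to y.
slope_b, icpt_b = np.polyfit(x, np.log(y), 1, w=y)

print(slope_a, slope_b)  # the two slopes differ
```

Taking the log without adjusting the weights quietly gives the small-y points much more influence than they have on a linear scale.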

===================================

Recently John Mallinckrodt mentioned using /spreadsheets/.
There's value in that, since there are quite a few students
who can cope with a spreadsheet but would be terrified
by imperative programming languages such as C++, Perl,
Python, etc.

The linest() spreadsheet function can do a lot more than
it's usually given credit for. In particular:

Fun fact #1: Commonly linest() is used to fit a straight
line to data, but actually it can handle polynomials, Fourier
series, and more.

The name says it does "linear" regression. That does not
mean that the fitted function needs to be a linear function
of x. The key requirement is just that the /parameters/
aka /coefficients/ that you are adjusting must appear
linearly in the fitted function. Any linear combination
of basis functions will do. The poster child for this
is a Fourier series (with no DC term), where all of the
basis functions are wildly nonlinear.
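
For readers who do use Python, the same idea can be sketched with numpy's least-squares solver standing in for linest(). The data and the choice of four basis functions are invented for illustration.

```python
import numpy as np

# Hypothetical data: samples of a known combination of sines and cosines.
x = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
y = 3.0 * np.sin(x) - 1.5 * np.cos(2.0 * x)

# Basis functions, each nonlinear in x. One column per basis function.
basis = np.column_stack([np.sin(x), np.cos(x), np.sin(2 * x), np.cos(2 * x)])

# The model is y = basis @ coeffs, which is linear in the coefficients.
coeffs, *_ = np.linalg.lstsq(basis, y, rcond=None)
print(np.round(coeffs, 6))
```

In a spreadsheet, each column of `basis` would be one column of "x" values handed to linest(), with the built-in constant term switched off since there is no DC term.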

Fun fact #2: The instructions for linest() don't mention
it, but it is entirely possible to perform *weighted* fits
with it. Arcane but not difficult.

The trick is to scale the data.
-- scale y_i by the inverse of the ith error bar
-- scale each basis function b(x_i) by the same factor
-- then apply linest() to the scaled data
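
The scaling recipe above can be sketched in Python, with numpy's least-squares solver standing in for linest(). The data and error bars are made up, and the cross-check against np.polyfit (whose `w` argument expects 1/sigma) is my addition.

```python
import numpy as np

# Hypothetical data for a straight line y = a + b*x with unequal error bars.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.5])
sigma = np.array([0.1, 0.1, 0.3, 0.3, 1.0])

# Basis functions: the constant 1 and x itself.
basis = np.column_stack([np.ones_like(x), x])

# Scale y and every basis column by 1/sigma_i, then do an ordinary fit.
scaled_basis = basis / sigma[:, None]
scaled_y = y / sigma
(a, b), *_ = np.linalg.lstsq(scaled_basis, scaled_y, rcond=None)

# Cross-check against polyfit's built-in weighting (w = 1/sigma).
b_ref, a_ref = np.polyfit(x, y, 1, w=1.0 / sigma)
print(a, b)
```

Note that the constant basis function gets scaled too, so its column becomes 1/sigma_i rather than all ones. In a spreadsheet this suggests supplying that column explicitly and turning off linest()'s built-in constant term.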

Additional discussion, with graphs of examples, can be
found starting at:
https://www.av8n.com/physics/linear-least-squares.htm#sec-linear-or-not