
[Phys-L] fluctuations, correlations, Heisenberg, Brünnhilde



Once I had a pet snake, a very well-mannered snake, good with children.
Using a tape measure, I determined that he was 5 feet 10 inches long
when he breathed in, and 6 feet 2 inches when he breathed out.

Now if you ask me what is "the" length of the snake, I cannot answer
exactly. Physics, like politics, is supposed to be "the art of the
possible". So the sensible approach would be to reframe the question.
I can perhaps tell you the mean length or the modal length. OTOH if
you insist on asking about "the" length that will show up in a
snapshot to be taken next month, I cannot make an exact prediction.
It's a moving target.

I mention this because 12 years ago, 8 years ago, and 4 years ago
there was some fun to be had by building sophisticated statistical
models of the presidential election. Twelve years ago this was a
novel idea; nowadays it's almost mainstream. IMHO the most skilled
practitioner is Nate Silver:
http://projects.fivethirtyeight.com/2016-election-forecast/

The snake parable is relevant because less-skilled people are
horribly over-interpreting the statistical data.

The worst mistakes have to do with overlooking the /correlations/.

If you take a snapshot of the snake right now, you can measure
and re-measure the snapshot using 100 different rulers. You will
observe some statistical uncertainty in the measurement. However,
this tells you very little about possible /systematic/ error. If
the snake breathes out between now and the next snapshot, every
one of the rulers will make a wrong prediction. Every prediction
will be off in the same direction.
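
Here is a minimal sketch of the ruler experiment, just to make the
statistical-versus-systematic distinction concrete. The ruler noise
(a quarter inch, Gaussian) and the function name are assumptions made
for illustration; only the two snake lengths come from the story above.

```python
import random
import statistics

LENGTH_IN = 70.0   # 5 ft 10 in: the snapshot we measure today
LENGTH_OUT = 74.0  # 6 ft 2 in: the snake at the next snapshot

def measure_snapshot(true_length, n_rulers=100, ruler_noise=0.25, seed=0):
    """Return n_rulers noisy readings (inches) of one frozen snapshot.

    ruler_noise is an assumed per-ruler standard deviation, in inches.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_length, ruler_noise) for _ in range(n_rulers)]

readings = measure_snapshot(LENGTH_IN)
mean_reading = statistics.mean(readings)

# Statistical uncertainty of the mean: shrinks like 1/sqrt(n_rulers).
std_error = statistics.stdev(readings) / len(readings) ** 0.5

# Treat every reading as a prediction of the next snapshot.
errors = [r - LENGTH_OUT for r in readings]
same_direction = all(e < 0 for e in errors)

print(f"mean of today's readings: {mean_reading:.2f} in")
print(f"standard error of mean:   {std_error:.3f} in")
print(f"mean prediction error:    {statistics.mean(errors):.2f} in")
print(f"all errors same sign?     {same_direction}")
```

Adding more rulers makes the standard error smaller, but it does
nothing to the roughly 4 inch offset, which is shared by every ruler.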

If a certain candidate has a 50/50 shot in each of 6 different
states, and needs to win all of them, then -- hypothetically
speaking -- if the outcomes were uncorrelated that would mean the
overall chance would be 50% to the sixth power, i.e. less than
1.6%. On the other hand, if the outcomes were fully correlated,
you don't have six coin flips, you have only *one* coin flip,
the so-called master coin, and the overall chance is 50%.
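
The two extremes, and the ground in between, can be sketched with a
toy model. This is an assumption for illustration, not anybody's real
forecasting model: each state copies the master coin with probability
"coupling", and otherwise flips its own independent coin. The name
p_win_all and its parameters are hypothetical.

```python
def p_win_all(n_states=6, p=0.5, coupling=0.0):
    """Chance of winning every state in a toy master-coin model.

    Assumed model: with probability `coupling` a state copies a shared
    master coin; otherwise it flips its own independent coin. Either
    coin comes up "win" with probability p, so each state on its own
    is still won with probability p.
    """
    if not 0.0 <= coupling <= 1.0:
        raise ValueError("coupling must be between 0 and 1")
    # Per-state win probability, given how the master coin landed.
    win_if_master_wins = coupling + (1.0 - coupling) * p
    win_if_master_loses = (1.0 - coupling) * p
    # Average over the two master-coin outcomes.
    return (p * win_if_master_wins ** n_states
            + (1.0 - p) * win_if_master_loses ** n_states)

for c in (0.0, 0.5, 0.9, 1.0):
    print(f"coupling={c:.1f}  P(win all 6) = {p_win_all(coupling=c):.4f}")
```

With coupling=0 this reproduces 0.5**6 = 0.015625, and with coupling=1
it gives 0.5. Everything in between depends on a number that the
individual state polls do not tell you.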

Let's be clear: I am not pretending to know what's going to
happen; just the opposite! I am saying the publicly available
data is not good enough to make a confident prediction. Anybody
who claims to know what's going to happen either (a) has some
spectacular private information, or (b) doesn't know what he's
talking about.

As an example of the sort of thing I'm talking about, I cite
the recent Brexit vote. Most of the polls predicted that "stay"
would win, but in fact it lost. Using 20/20 hindsight this
was attributed to intensity on one side and complacency on
the other side.

As a related point, a pundit who asserts that "the election
is over" is at risk of making a /self-defeating/ prediction,
insofar as it promotes complacency. This is like Heisenberg
only worse: quantum mechanics says when you measure something
it messes up something /else/, but here the measurement messes
up the very thing you're trying to measure.

Some pundits are getting this right, but most are not. One guy
has been predicting for months that candidate #1 will win by a
landslide, but recently he switched -- overnight -- to predicting
that candidate #2 will win by a landslide. (The lack of any
middle ground strikes me as odd.)

As the proverb says, it ain't over until the fat lady sings, and
I say Brünnhilde has not even begun her aria. Any statistical
model that overlooks (or underestimates) the systematic errors
is going to be highly misleading.

There's another proverb that says imperfect data is better
than no data. That's true, but still we shouldn't pretend
the data is better than it is. There are lots of reasons why
there could be exceptionally large systematic errors this
time around.

1) The two candidates are both unpopular, to an unprecedented
degree. There is no statistical basis for predicting what
effect this will have.

2) All respectable polls control for /turnout/, as they should.
2a) There is a large, possibly unprecedented disparity in
enthusiasm. Nobody knows exactly how large, or what effect
it will have.
2b) There is a large disparity in get-out-the-vote efforts,
i.e. "ground game".

3) There are unprecedented threats to party loyalty. It is
a matter of guesswork, not statistics, as to how much effect
this will have.

If the polls get this wrong, they will all get it wrong,
all in the same direction, resulting in a "polling miss"
on the scale of Brexit or worse.

4) Remember the snake: It's a moving target. The act of
taking a snapshot creates a /noise amplifier/. You could
conduct the same election on Monday, Tuesday, and Wednesday
and get three different answers, but only one of them counts!
If this were a physics experiment, we might predict the
average and leave it at that, but elections don't work that
way. The average doesn't count. Only the snapshot counts.
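
To put a rough number on "only the snapshot counts", here is a small
sketch. Everything in it is hypothetical: the average margin, the
day-to-day spread, and the assumption that the fluctuation is Gaussian
are made up for illustration, not taken from any poll.

```python
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
MEAN_MARGIN = 1.0    # average margin for candidate A, in points
DAY_TO_DAY_SD = 2.0  # assumed spread of the margin between possible election days

# Assumed Gaussian model of the margin on any one day.
margin = NormalDist(mu=MEAN_MARGIN, sigma=DAY_TO_DAY_SD)

# Probability that a single snapshot lands on the other side of zero.
p_upset = margin.cdf(0.0)

print(f"average margin favors A by {MEAN_MARGIN:+.1f} points")
print(f"chance a single snapshot goes to B: {p_upset:.1%}")
```

With these made-up numbers the average clearly favors A, yet a single
snapshot goes the other way about 31% of the time.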

In more detail, look at Nate Silver's histograms:
http://projects.fivethirtyeight.com/2016-election-forecast/#electoral-vote
Knowing the average score or the modal score does not
guarantee the actual outcome, because the distribution
is treeeemendously broad. (It's broad because of the
aforementioned correlations.)
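
A simplified way to see how correlations broaden the distribution,
assuming n state results X_i that all have the same variance sigma^2
and the same pairwise correlation rho (an idealization for
illustration, not the model behind the histograms at the link):

```latex
\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right)
  = n\sigma^2 + n(n-1)\,\rho\,\sigma^2
  = n\sigma^2\,\bigl[\,1 + (n-1)\rho\,\bigr]
```

With rho = 0 the standard deviation of the total grows like sqrt(n).
With rho = 1 it grows like n, i.e. nothing averages out.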

Actually, if this were a physics experiment, we would
probably be looking for ways to redesign the experiment,
to improve the signal-to-noise ratio, i.e. to make the
distribution less broad.

Let's be clear: If the election outcome differs from what the
mean score would predict, you can be a little bit surprised,
but only a little. You have no grounds for being very surprised.

I have not bothered to fire up my own statistical models,
because I know in advance that the results would not be
reliably informative.

The fat lady has not sung.

Elections have consequences.

There are /at least/ ten states that could go either way.
Last week, Al Gore said "Your vote counts." Ouch. As
Sattler put it, that's like Caesar saying "Watch your back."

Again: I am not pretending to know what's going to happen.
The overall result might be a blowout, or it might be very
very close. Recount close.