Re: A paradox? Why not?



Two people reported resolving the paradox. I suspect
that some of you may still be puzzled by it; I would
probably be in that group myself had I not benefited from
an existing explanation. So let me provide the explanation.
The paradox itself can be found at the end of this
message; I am repeating it for those who missed it several
days ago.

DO YOURSELF A FAVOR: SKIP MY COMMENTS FOR NOW,
READ THE INITIAL MESSAGE, AND ONLY THEN RETURN
TO WHAT FOLLOWS. OTHERWISE YOU WILL DEPRIVE
YOURSELF OF A CHANCE FOR INDEPENDENT CRITICAL
THINKING. THE INITIAL MESSAGE STARTS BELOW THE
LINE OF ASTERISKS.

First, what is a paradox and what does it mean to resolve
it? By a paradox I mean a situation in which two correct
statements seem to be in conflict with each other. And
by "to resolve" a paradox I mean to give convincing
evidence that there is no conflict.

The data are not fictitious; "A" refers to Alaska
Airlines while "B" refers to America West. The
reference given is "How numbers can trick you", by
A. Barnett (Technology Review, October 1994, pages
38 to 45). The data used were actually submitted by
the airlines to the Department of Transportation.

Let us not be distracted by the definitions of
"on-time" and "delayed", or by the possibility of
"errors". In other words, let us assume that the
data are correct. Then we have a situation in
which the same true data support these two
apparently conflicting statements:

1) The airline A wins over B at every single city.
2) The airline B wins over A on combined cities.

How can this be possible? I did not see Barnett's
original article, but the author referring to it writes
that the paradox can be solved by invoking a
"lurking variable". That hidden variable is the
weather (which is very unfavorable in Seattle,
from where most Alaska Airlines flights originate,
and very favorable in Phoenix, from where most
America West flights originate). The tables show
numbers of departures, but the fog and snow in
Seattle are not mentioned.

Here is my way to explain the resolution of the
paradox. Yes, it would be absolutely impossible to
add the numbers in column 4 of each table and create
a situation in which the sum for A is larger than the
sum for B, no matter what these numbers represent. If
the numbers for A are smaller than the numbers for B
on a line-by-line basis, then the sum for A must
also be smaller than the sum for B. This is obvious.
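
In symbols, writing a_i and b_i for the column-4 entries
of A and B in city i (notation introduced here only for
illustration), the line-by-line argument is:

```latex
a_i < b_i \ \text{for every city } i
\quad\Longrightarrow\quad
\sum_i a_i < \sum_i b_i
```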

But the second statement, "B wins ...", is not based on
the sums of column 4. The combined percentages are
calculated from the sums of columns 2 and 3. Each
combined percentage is therefore a weighted average of
the city percentages, weighted by the number of
departures, so the outcome is dominated by what
happens in the city from which most of the departures
take place: Seattle for A and Phoenix for B. Once the
effect of the lurking variable is seen, there is no
paradox. Do you agree?
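
The argument above can be checked numerically. What follows
is only a sketch: the counts are copied from the two tables
at the end of this message, while the names AIRLINE_A,
AIRLINE_B and weighted_breakdown are made up for this
illustration. It writes each combined delay percentage as a
weighted average of the city percentages, with each city
weighted by its share of that airline's departures.

```python
# Counts copied from the tables at the end of this message:
# city -> (on-time departures, delayed departures).
# All names here are hypothetical, chosen for illustration.
AIRLINE_A = {
    "Los Angeles": (497, 62),
    "Phoenix": (221, 12),
    "San Diego": (212, 20),
    "San Francisco": (503, 102),
    "Seattle": (1841, 305),
}
AIRLINE_B = {
    "Los Angeles": (694, 117),
    "Phoenix": (4840, 415),
    "San Diego": (383, 65),
    "San Francisco": (320, 129),
    "Seattle": (201, 61),
}


def weighted_breakdown(table):
    """Return (rows, combined) for one airline.

    Each row is (city, share of the airline's departures,
    delay rate in percent). `combined` is the share-weighted
    average of the city delay rates, which equals
    100 * total delayed / total departures.
    """
    total = sum(on_time + delayed for on_time, delayed in table.values())
    rows = []
    for city, (on_time, delayed) in table.items():
        departures = on_time + delayed
        rows.append((city, departures / total, 100.0 * delayed / departures))
    combined = sum(share * rate for _, share, rate in rows)
    return rows, combined


for name, table in (("A", AIRLINE_A), ("B", AIRLINE_B)):
    rows, combined = weighted_breakdown(table)
    print(f"Airline {name}: combined delay rate {combined:.1f}%")
    for city, share, rate in rows:
        print(f"  {city:<14} weight {share:6.1%}   delayed {rate:5.1f}%")
```

With these counts Seattle carries about 57% of the weight
for A, while Phoenix carries about 73% of the weight for B,
so each combined figure mostly reflects one city.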

John Denker wrote:

[This] even has a name; in the statistics literature
it is known as Richardson's Paradox.

Does anybody know who Richardson was and in
what context the paradox was first published? Was
it discovered in real data or was it conceived as a
theoretical possibility by Richardson?

Nothing profound, only a pedagogical exercise. And
an attempt to share what impressed me last week.

Ludwik Kowalski

****************************************
AND HERE IS THE FIRST MESSAGE AGAIN:

Paradoxes are worth fishing for; they are great for
promoting critical thinking (as you may notice, I am
avoiding the forbidden word UNDERSTANDING).

Here is a paradox I found in a textbook to be used in
Math-109. It is based on the following data about airline
departures, on-time and delayed. Two airlines, A and B,
were compared on the basis of five airports, as below.
Numbers in columns 2 and 3 are the on-time and delayed
departures in a particular month. Column 4 shows the
percentage of departures that were delayed. The first
table is for airline A while the next one is for B.

Note that airline A beats airline B at every airport.
On the other hand the total for airline A is 13.3%
(501 delays out of 3775 departures) while the total
for airline B is 10.9% (787 delays out of 7225
departures). The author asks: "How can it happen that
A wins at every city but B wins when we combine all
the cities?" Can you resolve this "paradox"?

****************************************
Table for          col 2     col 3   col 4
airline A        on-time   delayed   % del

Los Angeles          497        62    11.1
Phoenix              221        12     5.2
San Diego            212        20     8.6
San Francisco        503       102    16.9
Seattle             1841       305    14.2
****************************************
Table for          col 2     col 3   col 4
airline B        on-time   delayed   % del

Los Angeles          694       117    14.4
Phoenix             4840       415     7.9
San Diego            383        65    14.5
San Francisco        320       129    28.7
Seattle              201        61    23.3
***************************************

TRY TO ANSWER THIS BEFORE READING THE
EXPLANATION ABOVE.
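
For anyone who wants to recheck the arithmetic in the two
tables, here is a minimal sketch. The rows are copied from
the tables above; the names TABLE_A, TABLE_B and recheck are
invented for this illustration. It recomputes column 4 from
columns 2 and 3, flags any city whose printed percentage is
off by more than the rounding allowance, and adds up the
totals for each airline.

```python
# Rows copied from the tables above:
# (city, on-time, delayed, printed % delayed).
# TABLE_A, TABLE_B and recheck are hypothetical names.
TABLE_A = [
    ("Los Angeles", 497, 62, 11.1),
    ("Phoenix", 221, 12, 5.2),
    ("San Diego", 212, 20, 8.6),
    ("San Francisco", 503, 102, 16.9),
    ("Seattle", 1841, 305, 14.2),
]
TABLE_B = [
    ("Los Angeles", 694, 117, 14.4),
    ("Phoenix", 4840, 415, 7.9),
    ("San Diego", 383, 65, 14.5),
    ("San Francisco", 320, 129, 28.7),
    ("Seattle", 201, 61, 23.3),
]


def recheck(rows):
    """Recompute column 4 and the totals for one airline's table."""
    mismatches = []
    for city, on_time, delayed, printed in rows:
        computed = 100.0 * delayed / (on_time + delayed)
        # Column 4 is printed to one decimal place, so allow
        # a rounding difference of up to 0.05.
        if abs(computed - printed) > 0.05:
            mismatches.append(city)
    total_on_time = sum(row[1] for row in rows)
    total_delayed = sum(row[2] for row in rows)
    total_departures = total_on_time + total_delayed
    return {
        "on_time": total_on_time,
        "delayed": total_delayed,
        "departures": total_departures,
        "percent_delayed": round(100.0 * total_delayed / total_departures, 1),
        "mismatches": mismatches,
    }


for name, rows in (("A", TABLE_A), ("B", TABLE_B)):
    print(f"Airline {name}: {recheck(rows)}")
```

Run as written, this should report no mismatching cities,
501 delays in 3775 departures for A, and 787 delays in 7225
departures (6438 on-time plus 787 delayed) for B.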