
Re: [Phys-L] From a Math Prof (physics BS major) at my institution ( math challenge)



On 02/18/2014 08:46 AM, Rauber, Joel wrote:
I didn't personally calculate it, but the Math Prof. told me that
the probability of consecutive numbers appearing on a truly random
list is 48%, much higher than most people would guess.

I looked at two factors: (1) the number of times consecutive numbers
appear, which suggests that the 2nd list is random; and (2) the number
of times numbers in the range [30-35] appeared compared to the other
decade ranges, which also lends evidence that the second list was the
random one.

That's very weak evidence.

In the first data set, the prevalence of consecutive pairs
is about what one would expect. In the second set, the
prevalence of consecutive pairs is /more/ than one would
expect ... but this does not make the first set any less
random. It just tells you that fluctuations are huge ...
which is not surprising given the small size of the data
sets.

As for the 30--35 range, the first data set has exactly
the expected number of hits in this range.

OTOH of course this range is under-represented relative to
the "other" decade ranges ... but that's not surprising,
since 30--35 is *not* a decade range ... and more importantly,
in the second data set this range is even /more/ under-
represented. So by this criterion, the second set is
less random.

So far, the only statistic that looks out of whack to me
is the scarcity of numbers ending in 0 (i.e. numbers equal
to 0 mod 10) in the first set. OTOH people have looked at
a lot of statistics, and if you look at enough, sooner or
later you will find /something/ that is out of whack ...
even if the data is truly random.
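
To put a rough number on that, here is a minimal sketch. It
assumes (my assumption, not something measured in this thread)
that the statistics are independent and that each one has the
same false-alarm probability alpha on truly random data. The
function name and the values of k and alpha are illustrative
only. Under those assumptions, 20 checks at alpha = 0.05 give
about a 64% chance that at least one of them looks out of whack.

```python
def prob_at_least_one_false_alarm(num_tests, alpha):
    """Chance that at least one of num_tests independent checks
    flags truly random data, when each check alone has
    false-alarm probability alpha (both values are assumptions)."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Illustrative values only; nobody in the thread counted the checks.
    for k in (1, 5, 20, 50):
        print(k, round(prob_at_least_one_false_alarm(k, 0.05), 3))
```

Real statistics computed from the same small data set are not
independent, so this is only an order-of-magnitude guide.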

=======================================

The lesson I take from this is the importance of proper
experimental controls. That includes double-blinding.
Anybody who knows the "right" answer here can easily
find "factors" to support that conclusion ... but doing
it blind is not so easy.

Another type of essential control is what I call "closing
the loop" i.e. cobbling up some Monte Carlo data and
feeding it through the analysis process, to see how often
the analysis screws up.
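
A minimal sketch of what I mean by closing the loop, for the
consecutive-pairs statistic. Everything specific here is a
placeholder: the list length (30), the number range (1 to 50),
and the definition of "consecutive" (adjacent entries differing
by exactly 1) are my assumptions for illustration, not the
parameters of the actual challenge. The idea is to generate an
ensemble of truly random lists, compute the statistic on each,
and see how widely it scatters before reading anything into a
single observed value.

```python
import random


def count_consecutive_pairs(values):
    """Count adjacent entries that differ by exactly 1,
    e.g. 17 followed by 18, or 17 followed by 16."""
    return sum(1 for a, b in zip(values, values[1:]) if abs(a - b) == 1)


def simulate_counts(num_sets, list_len, low, high, seed=0):
    """Generate num_sets random lists of integers in [low, high]
    and return the consecutive-pair count for each one."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = []
    for _ in range(num_sets):
        data = [rng.randint(low, high) for _ in range(list_len)]
        counts.append(count_consecutive_pairs(data))
    return counts


def tail_fraction(counts, observed):
    """Fraction of simulated lists whose count is at least
    as large as the observed count."""
    return sum(1 for c in counts if c >= observed) / len(counts)


if __name__ == "__main__":
    # Placeholder parameters; substitute the real list length and range.
    counts = simulate_counts(num_sets=10000, list_len=30, low=1, high=50)
    for observed in range(0, 5):
        print(observed, tail_fraction(counts, observed))
```

If the count seen in a given list sits comfortably inside the
simulated spread, that statistic says nothing about whether the
list is random.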

I haven't done the experiment, but I suspect that the two
"factors" suggested above, when applied to this student
data and an /ensemble/ of random data sets, would get the
wrong answer about 50% of the time ... quite possibly even
worse than that, depending on implementation details ...
in other words, worse than random guessing.

Doing experiments on human subjects is very, very hard. It
demands fastidious, elaborate controls. Think of all the
trouble that drug companies go to when conducting field
trials of a new drug ... double blinding, placebos, randomized
controls, long-term longitudinal studies, et cetera.

There's a simple reason why they bother with all that:
If they didn't, the results would not be reliable.

There is a lesson in this for the PER community and the
education bureaucracy in general. Testing a new book or
a new teaching method is an experiment on human subjects.
It is like a drug trial ... only much harder, because
proper blinding is usually impossible. Yet again and
again, the effort that goes into such a test is less than
the effort that goes into a drug trial. Less effort
applied to a harder problem. You know in your bones
that the results cannot possibly be reliable.