
Re: [Phys-L] From a Math Prof (physics BS major) at my institution ( math challenge)



I was greatly encouraged to read Jeff Schnick's contribution below.
I ran his function in MATLAB several times, with the following results.


>> getdist(10)
totalrounds = 0 0 0 0 1 1 2
totalseq2 = 0 0 0 0 0 0 0
total = 0 0 0 0 0 0 0
Elapsed time is 0.053598 seconds.

>> getdist(100)
totalrounds = 0 0 0 1 1 6 6
totalseq2 = 0 0 0 0 0 0 7
total = 0 0 0 0 0 0 1
Elapsed time is 0.056836 seconds.

>> getdist(1000000)
totalrounds = 49 482 2723 9827 25707 52175 86488
totalseq2 = 0 14 136 739 2712 7811 18039
total = 0 0 0 18 147 1063 5289

Elapsed time is 6711.689050 seconds.
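
To read the million-set run as percentages of sets, the tallies can be divided by the number of sets. This is only a small sketch: the counts are copied from the output above, and the names nsets, pct_rounds and pct_seq2 are my own, not part of the original function.

```octave
% Counts copied from the getdist(1000000) run above; entry k+1 is the
% number of sets with exactly k occurrences (k = 0..6).
nsets = 1e6;
totalrounds = [49 482 2723 9827 25707 52175 86488];
totalseq2   = [0 14 136 739 2712 7811 18039];

% Convert raw counts to percentages of all simulated sets.
pct_rounds = 100 * totalrounds / nsets;
pct_seq2   = 100 * totalseq2 / nsets;

for k = 0:6
  fprintf('%d occurrences: %7.4f%% round, %7.4f%% consecutive\n', ...
          k, pct_rounds(k+1), pct_seq2(k+1));
end
```

Each percentage refers to whole sets of 21 rows, not to individual numbers within a set.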

I conclude that a MATLAB and an OCTAVE version of his code provide plausibly similar results:
"In a million sets of 21 rows of five numbers between 1 and 35 inclusive, I got 2784 [MATLAB 2723] cases in which there were exactly 2 round numbers and only 133 [MATLAB 136] cases in which there were exactly 2 sequences of 2 numbers in a row."

I am not sure that I yet correctly grasp his presentation of the results: are we to expect about 1%, 3%, 5% and 9% of his [5,21] arrays, on average, to show 3, 4, 5 and 6 numbers ending in '0' respectively?
Are we to expect about 2% of his [5,21] arrays to show six pairs of consecutive numbers IN A ROW? Probably he meant to indicate their occurrence in a [21,5] array.
The conjunction of these two tests would seem to readily distinguish a student-driven process from a lottery process, if the code holds up. Whether or not there are subtle or obvious errors in the code, it is evident to me that this approach is essentially given in scientific form, i.e. it is open to review and criticism. I contrast it with analyses asserting:

"That's very weak evidence. In the first data set, the prevalence of consecutive pairs is about what one would expect"
What is the data for what one would expect??

"Another type of essential control is what I call "closing the loop" i.e. cobbling up some Monte Carlo data and feeding it through the analysis process, to see how often the analysis screws up."

I agree, but I saw nothing of this from that author.


Sincerely

Brian Whatcott Altus OK

On 2/25/2014 9:58 AM, Jeffrey Schnick wrote:
I ran a Monte Carlo Octave function to investigate occurrences of sequences of 2 numbers in a row and occurrences of round numbers (10,20,30). In a million sets of 21 rows of five numbers between 1 and 35 inclusive, I got 2784 cases in which there were exactly 2 round numbers and only 133 cases in which there were exactly 2 sequences of 2 numbers in a row. In 0 out of a million cases there were both 2 or fewer round numbers and 2 or fewer sequences of 2 numbers in a row. The first set given in this thread met both of these conditions. In 11 cases out of a million there were both 3 or fewer round numbers and 3 or fewer sequences of 2 numbers in a row.
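
As a rough cross-check on these counts, the expected values per set can be worked out by hand. The sketch below assumes each row is 5 distinct numbers drawn uniformly from 1 to 35, as in the function; the symbols R and S (round numbers and consecutive pairs per 21-row set) are labels introduced here for illustration.

```latex
% Round numbers: 3 of the 35 values (10, 20, 30) are round, 5 numbers per row.
E[R] = 21 \cdot 5 \cdot \frac{3}{35} = 9

% Consecutive pairs: there are 34 candidate pairs (k, k+1); both members
% appear in a given row with probability
P(k,\,k+1 \text{ both drawn}) = \frac{\binom{33}{3}}{\binom{35}{5}}
                              = \frac{5 \cdot 4}{35 \cdot 34} = \frac{2}{119}

E[S] = 21 \cdot 34 \cdot \frac{2}{119} = 12
```

On these assumptions a typical set has about 9 round numbers and about 12 consecutive pairs, which fits with how rarely the simulation produces sets with only 2 or 3 of each.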

It is clear that one cannot prove (without looking up the lottery list) which is the student-generated set and which is the lottery-machine-generated set. Each set is equally probable (I think at about 1 in 1e96). Based on what John Clement tells us about the correlation between what our intuition tells us is good for student learning and what PER tells us is good for student learning, Richard Tarara's hypothesis that the students who produced the student-generated set of values would tend to shy away from sequences of numbers in a row, and mine that they would shy away from round numbers, could be exactly wrong. Still: John Denker, if someone offered you a million dollars to correctly pick the student-generated set based on the information in the thread-starting post alone (e.g. without looking up the lottery values on the internet), with the outcome being decided by a coin toss if you chose neither, would you feel that it would be just as good to leave it to a coin toss as to pick the first list based on the paucity of sequences of numbers in a row and the paucity of round (10,20,30) numbers?
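
The order of magnitude of that probability can be checked with a short count. This is a sketch under an assumption that is mine, not stated in the thread: each row is an unordered draw of 5 distinct numbers from 35, and the 21 rows are independent. N is a label introduced here for the number of possible sets.

```latex
\binom{35}{5} = \frac{35 \cdot 34 \cdot 33 \cdot 32 \cdot 31}{5!} = 324632

N = \binom{35}{5}^{21} = 324632^{21} \approx 5.5 \times 10^{115}
```

On this counting any particular set has probability of roughly 1 in 5e115, smaller than the 1 in 1e96 guessed above, but the point that the two sets are equally probable is unaffected.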

Here's the code--it should run in MATLAB (Octave is basically a freeware version of MATLAB)--pointing out mistakes to me would be appreciated:

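function getdist(n)
% Monte Carlo: n sets of 21 rows, each row 5 distinct integers from 1 to 35.
% (The unused output "dummy" is dropped; it was never assigned.)
tic;
nrounds = zeros(1,n);      % round numbers (10, 20, 30) found in each set
nseq2 = zeros(1,n);        % consecutive pairs found in each set
for m=1:n
    a=zeros(21,5);
    for i=1:21
        a(i,1) = fix(1+rand*35);
        for j = 2:5
            rn = fix(1+rand*35);
            % redraw until rn differs from the numbers already in this row
            while any(rn==a(i,1:j-1))
                rn = fix(1+rand*35);
            end
            a(i,j)=rn;
        end
    end
    a=sort(a,2);           % sort each row into ascending order
    nrounds(m)=sum(10==a(:)) + sum(20==a(:)) + sum(30==a(:));
    nseq2(m)=0;
    for i=1:21
        b=diff(a(i,:));    % gaps between neighbours in the sorted row
        nseq2(m)=nseq2(m)+sum(1==b);
    end
end

% Tally the sets with exactly i round numbers, exactly i consecutive pairs,
% and at most i of both, for i = 0..6. The self-assignments print the result.
for i=0:6
    totalrounds(i+1)=sum(i==nrounds);
end
totalrounds=totalrounds
for i=0:6
    totalseq2(i+1)=sum(i==nseq2);
end
totalseq2=totalseq2
for i=0:6
    total(i+1)= sum( nseq2<=i & nrounds<=i );
end
total=total
toc
end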
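
A more compact variant of the same simulation might look like the sketch below. It is not the function above: getdist_sketch is a hypothetical name, it swaps the redraw loop for randperm(35, 5) (which draws 5 distinct integers from 1 to 35), and it returns the per-set counts instead of printing tallies.

```octave
function [nrounds, nseq2] = getdist_sketch(n)
  % Hypothetical variant of getdist: returns, for each of n simulated sets,
  % the number of round numbers and the number of consecutive pairs.
  nrounds = zeros(1, n);
  nseq2 = zeros(1, n);
  for m = 1:n
    a = zeros(21, 5);
    for i = 1:21
      % 5 distinct integers from 1..35, sorted ascending
      a(i, :) = sort(randperm(35, 5));
    end
    % multiples of 10 in the range 1..35 are exactly 10, 20 and 30
    nrounds(m) = sum(mod(a(:), 10) == 0);
    % a gap of 1 between sorted neighbours in a row is a consecutive pair
    nseq2(m) = sum(sum(diff(a, 1, 2) == 1));
  end
end
```

For example, [r, s] = getdist_sketch(1000); sum(r <= 2 & s <= 2) counts the sets with at most 2 round numbers and at most 2 consecutive pairs, the same quantity as total(3) in the original.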

_______________________________________________
Forum for Physics Educators
Phys-l@phys-l.org
http://www.phys-l.org/mailman/listinfo/phys-l