
Re: science for all?



On Wed, 26 Dec 2001, John Clement wrote:

> The problem is that I have used some standard jargon from the standard
> educational literature and alluded to some results from that same
> literature.

That does not mean that either "the standard jargon" or the
"results" quoted are using statistical analysis correctly.

> Effect size is generally defined as the change in the mean on an evaluation
> divided by the standard deviation of the curve.
OK. You have now defined which standard deviation you are
discussing.
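
For concreteness, a rough Python sketch of that definition, with
invented scores (not data from any study):

import statistics

pre_scores  = [30, 45, 38, 55, 42, 50, 35, 41]   # hypothetical pre-test scores
post_scores = [38, 50, 44, 60, 49, 55, 40, 48]   # hypothetical post-test scores

pre_mean  = statistics.mean(pre_scores)
post_mean = statistics.mean(post_scores)
pre_sd    = statistics.stdev(pre_scores)         # SD of the pre-test distribution

effect_size = (post_mean - pre_mean) / pre_sd    # change in mean / SD
print(f"effect size = {effect_size:.2f}")

This sketch uses the pre-test SD in the denominator; a pooled or
control-group SD would give a somewhat different number.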

> In most of the educational
> literature, effect sizes are less than 1.0, and a curriculum that achieves
> anything over 0.5 is usually considered to be very effective.
We're not concerned with "usually". We're doing straight
mathematics. If the curve is taken to represent an estimate of a
probability distribution, and the distribution is normal, then an effect
size of 1.0 may be interpreted as a 30% chance that the mean did not
change.
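
(Assuming the figure refers to the two-tailed normal tail beyond one
standard deviation, a quick check:

from math import erf, sqrt

def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# chance of landing at least 1 SD from the mean under a normal curve
two_tailed = 2.0 * (1.0 - normal_cdf(1.0))
print(f"P(|shift| >= 1 SD) = {two_tailed:.2f}")   # about 0.32

which is where the "roughly 30%" comes from.)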

> Normally this
> would be used to compare the effects of 2 different curricula and the effect
> size would be calculated for the difference between the curricula.
> Obviously the effect size is not a valid comparison tool when the students
> come in with a statistically zero score, or a score that could be produced
> by random guessing.
How does a "statistically zero score" differ from a "zero score"?
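
(If the intended distinction is between a literal zero and a chance-level
score, a binomial guessing model makes it concrete; the 24-item, 4-choice
layout below is hypothetical:

import math

n_items, p_guess = 24, 0.25                      # hypothetical test layout
chance_mean = n_items * p_guess                  # items right by pure luck
chance_sd   = math.sqrt(n_items * p_guess * (1 - p_guess))
print(f"guessing alone: {chance_mean:.0f} +/- {chance_sd:.1f} items correct")

A pre-test mean sitting near that chance level tells us little about what
the students actually knew.)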

> Most teachers and education researchers should be
> familiar with effect size, so it will convey meaning. Whether or not this
> is the best way to compare curricula can certainly be questioned, but it is
> currently the method often used.
As one famous logician put it, the answer to "You don't clean a
watch with butter" is "But it was the best of butter". Isn't
that what you're really saying?

> The initial curve (Lawson test) in JCST (figure 2) essentially looks like a
> normal distribution. The final one is also similar, but moved over by about
> 1 SD. I am judging this by the curve. The result is that the number of
> students who would be classified as concrete is dramatically reduced. I
> have found for the Lawson test that when one looks at individual student
> scores they do not all move up, but rather each student moves a different
> amount, with some making dramatic gains, and others none at all. The curve
> in JCST unfortunately moves so far to the right that the right hand tail is
> cut off by saturation on the test.

A test may be likened to a measurement in the lab. When the
needle pegs, the measurement is invalid.
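
A small sketch of what such saturation does to a roughly normal score
distribution shifted up by one SD; every number here is invented:

import random

random.seed(0)
max_score = 24                                   # hypothetical test maximum
pre  = [min(max_score, max(0, round(random.gauss(12, 4)))) for _ in range(500)]
post = [min(max_score, max(0, round(random.gauss(16, 4)))) for _ in range(500)]

print("pre  scores at the ceiling:", sum(s == max_score for s in pre))
print("post scores at the ceiling:", sum(s == max_score for s in post))

The post-test scores pile up at the maximum, so the right-hand tail, and
whatever further gains those students might have shown, is simply not
measured.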

> Again, the best way to communicate about
> this is to look at the same article. To communicate effectively about an
> article it is helpful if both people have read it.

We don't subscribe to that particular journal. Sorry. At any
rate, I am not communicating with the author of that article; the
article would have to speak for itself.
____________________________________________snip________________________

> Whether or not the data exactly follows the theory you outlined is indeed
> problematical, but the effect size analysis is routinely used when talking
> about test results for students.
I don't recall "outlining a theory". The process of making
statistical inferences involves applications of mathematical logic. If
such logic is misused in educational circles by people with only
rudimentary understanding of mathematical statistics, we need not be bound
by those transgressions.

> This allows one to compare different
> student treatments in a fairly standard manner. A given distribution may
> not be exactly normal, and I don't think that the idea that the ideal
> distribution is the same for each student has any meaning. Students
> generally fall in the same place on the curve and do not fluctuate over the
> curve substantially when retested if you have a well designed test.
> Students do not behave like gas molecules.
Ahh. Then how do you justify using statistical measures that
were designed for analyzing systems of gas molecules?

> One must also look at the error
> on the mean to see if the gain comparisons are significant. If one has a
> fairly large number of students the error on the mean will not be
> significant.
The error on the mean is defined by the curve, not by the
number of students. You seem to be mixing two very different
concepts in the last statement.
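
To spell out the two quantities in the usual textbook convention (the
scores below are invented): the standard deviation describes the spread
of the curve itself, while the conventional "standard error of the mean"
divides that spread by the square root of the number of students:

import math, statistics

scores = [40, 55, 48, 62, 51, 47, 58, 44, 53, 60]   # hypothetical class scores
sd  = statistics.stdev(scores)                       # spread of the distribution
sem = sd / math.sqrt(len(scores))                    # conventional error on the mean
print(f"SD = {sd:.1f}, SEM = {sem:.1f}, N = {len(scores)}")

Keeping those two numbers separate is the point at issue.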


> I am a bit puzzled about how you would ask for the probability that they
> came from the same unknown distribution. Student test scores generally rise
> rather than fall after instruction.

This is called "testing for the null hypothesis". You believe
that the test scores rose, but let's ask for the probability that
they did not.
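
A minimal sketch of that test on matched pre/post scores, done as a
paired t statistic by hand; the numbers are invented:

import math, statistics

pre  = [8, 10, 12, 9, 14, 11, 7, 13, 10, 12]     # hypothetical pre-test scores
post = [12, 11, 15, 14, 16, 13, 9, 17, 12, 15]   # matched post-test scores

gains  = [b - a for a, b in zip(pre, post)]
mean_g = statistics.mean(gains)
se_g   = statistics.stdev(gains) / math.sqrt(len(gains))
t      = mean_g / se_g        # large t makes the "no real rise" hypothesis unlikely
print(f"mean gain = {mean_g:.2f}, t = {t:.2f}, df = {len(gains) - 1}")

The null hypothesis is that the gains are centered on zero; the t value,
checked against a t table, gives the probability of seeing a mean gain
this large by chance if the true gain were zero.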

My impression is that you did not understand my previous posting,
which was about mathematical statistics. Let's just stay with that topic
if you wish to continue this thread; we can omit references to the sins
of other practitioners.
regards,
Jack


--
"But as much as I love and respect you, I will beat you and I will kill
you, because that is what I must do. Tonight it is only you and me, fish.
It is your strength against my intelligence. It is a veritable potpourri
of metaphor, every nuance of which is fraught with meaning."
Greg Nagan from "The Old Man and the Sea" in
<The 5-MINUTE ILIAD and Other Classics>