Most effect sizes in the educational literature are less than 1.0, and a curriculum which achieves anything over 0.5 is usually considered to be very effective.

We're not concerned with "usually". We're doing straight mathematics. If the curve is taken to represent an estimate of a probability distribution, and the distribution is normal, then an effect size of 1.0 may be interpreted as a 30% chance that the mean did not change.
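
For concreteness, here is a minimal sketch of the effect-size arithmetic under discussion, assuming the common Cohen's-d convention (difference of means divided by a pooled standard deviation); the pre/post score arrays are hypothetical, and reading the "30%" figure as the overlap of the two normal curves is my assumption, not necessarily the poster's:

import numpy as np
from scipy.stats import norm

# Hypothetical pre- and post-instruction scores (illustrative only).
pre  = np.array([8, 10, 11, 12, 13, 14, 15, 16, 17, 19], dtype=float)
post = np.array([12, 13, 15, 16, 17, 18, 19, 20, 21, 23], dtype=float)

# Cohen's d: difference of means divided by a pooled standard deviation.
n1, n2 = len(pre), len(post)
pooled_sd = np.sqrt(((n1 - 1) * pre.var(ddof=1) + (n2 - 1) * post.var(ddof=1))
                    / (n1 + n2 - 2))
d = (post.mean() - pre.mean()) / pooled_sd
print(f"effect size d = {d:.2f}")

# One possible reading of the "30%" remark: for two normal curves whose means
# differ by d standard deviations, the overlapping area is 2*Phi(-d/2),
# which is about 0.32 when d = 1.0.
print(f"overlap of the two normal curves = {2 * norm.cdf(-d / 2):.2f}")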
Normally this would be used to compare the effects of two different curricula, and the effect size would be calculated for the difference between the curricula. Obviously the effect size is not a valid comparison tool when the students come in with a statistically zero score, or a score that could be produced by random guessing.

How does a "statistically zero score" differ from a "zero score"?
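
As an aside on the "score that could be produced by random guessing" point, here is a minimal sketch, assuming a hypothetical multiple-choice test with 24 items and 4 choices per item; none of these numbers come from the Lawson test itself:

from scipy.stats import binom

# Hypothetical test: 24 multiple-choice items, 4 choices each,
# so blind guessing succeeds with probability 1/4 per item.
n_items, p_guess = 24, 0.25

def p_at_least_by_guessing(score):
    """Probability of reaching `score` or better by guessing alone."""
    return binom.sf(score - 1, n_items, p_guess)

# A pretest mean near n_items * p_guess = 6 is indistinguishable from chance,
# which is one way to read "statistically zero score".
for score in (6, 9, 12):
    print(score, f"P(at least this score by guessing) = {p_at_least_by_guessing(score):.3f}")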
Most teachers and education researchers should be familiar with effect size, so it will convey meaning. Whether or not this is the best way to compare curricula can certainly be questioned, but it is currently the method often used.

As one famous logician put it, the answer to "You don't clean a watch with butter" is "But it was the best of butter". Isn't that what you're really saying?
The initial curve (Lawson test) in JCST (figure 2) essentially looks like a normal distribution. The final one is also similar, but moved over by about 1 SD. I am judging this by the curve. The result is that the number of students who would be classified as concrete is dramatically reduced. I have found for the Lawson test that when one looks at individual student scores they do not all move up, but rather each student moves a different amount, with some making dramatic gains, and others none at all. The curve in JCST unfortunately moves so far to the right that the right-hand tail is cut off by saturation on the test.
A test may be likened to a measurement in the lab. When the
needle pegs, the measurement is invalid.
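
To make the "pegged needle" point concrete, here is a minimal simulation sketch, assuming a hypothetical test ceiling and a uniform 1 SD true gain; every number below is invented for illustration:

import numpy as np

rng = np.random.default_rng(0)

max_score = 24                        # hypothetical test ceiling
true_pre  = rng.normal(14, 4, 5000)   # hypothetical pre-instruction ability
true_post = true_pre + 4              # a true gain of 1 SD for every student

# Observed scores cannot exceed the ceiling (the "pegged needle").
obs_pre  = np.clip(true_pre,  0, max_score)
obs_post = np.clip(true_post, 0, max_score)

print("true mean gain    :", round(np.mean(true_post - true_pre), 2))
print("observed mean gain:", round(np.mean(obs_post - obs_pre), 2))
# The observed gain and the observed post-test spread are both compressed,
# so an effect size computed from the clipped scores no longer reflects
# the true 1 SD shift.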
One must also look at the error on the mean to see if the gain comparisons are significant. If one has a fairly large number of students the error on the mean will not be significant.

The error on the mean is defined by the curve, not by the number of students. You seem to be mixing two very different concepts in the last statement.
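
For reference, a minimal sketch of one common way to quote the error on the mean gain and to test whether that gain is significant, using hypothetical matched scores; the choice of a paired t-test is mine and is not implied by either poster:

import numpy as np
from scipy import stats

# Hypothetical matched pre/post scores for the same ten students.
pre  = np.array([ 9, 11, 12, 13, 14, 15, 16, 17, 18, 20], dtype=float)
post = np.array([12, 12, 16, 15, 18, 19, 17, 22, 21, 24], dtype=float)

gain = post - pre
sem  = gain.std(ddof=1) / np.sqrt(len(gain))   # standard error of the mean gain
print(f"mean gain = {gain.mean():.2f} +/- {sem:.2f}")

# Paired t-test: is the mean gain significantly different from zero?
t, p = stats.ttest_rel(post, pre)
print(f"t = {t:.2f}, p = {p:.4f}")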
I am a bit puzzled about how you would ask for the probability that they came from the same unknown distribution. Student test scores generally rise rather than fall after instruction.

This is called "testing for the null hypothesis". You believe that the test scores rose, but let's ask for the probability that they did not.
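
A minimal sketch of that null-hypothesis question, assuming one is willing to pose it as a two-sample test of whether the pre and post scores came from the same distribution; the choice of the Mann-Whitney U test (rather than, say, a t-test or a Kolmogorov-Smirnov test) and the scores themselves are my assumptions:

import numpy as np
from scipy import stats

# Hypothetical pre- and post-instruction score samples.
pre  = np.array([ 8, 10, 11, 12, 13, 14, 15, 16, 17, 19], dtype=float)
post = np.array([12, 13, 15, 16, 17, 18, 19, 20, 21, 23], dtype=float)

# Null hypothesis: both samples were drawn from the same (unknown) distribution.
# A small p-value means the observed scores would be unlikely if that were true.
u, p = stats.mannwhitneyu(pre, post, alternative="two-sided")
print(f"Mann-Whitney U = {u:.1f}, p = {p:.4f}")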
My impression is that you did not understand my previous posting,
which was about mathematical statistics. Let's just stay with that topic
if you wish to continue this thread; we can omit references to the sins
of other practitioners.