
Re: [Phys-L] Indicators of quality teaching - Concrete Example



Yes, this is the sort of arithmetic to use, and it can very easily be done
in a spreadsheet. Since most grades are now calculated using a grading
program, you might be able to coerce the grade program into doing the work
for you. But if you use two different tests, you cannot really interpret the
results. So researchers use the same test, or they have an alternative for
which they know how the scores track when given back to back. So it is
possible to use the FCI as a pretest and the FMCE as a post test, but you
need a way of converting the scores on one to the other. You also need some
assurance that they are measuring the same things. I have actually given the
FCI and FMCE back to back, and the scores closely fall on the same line even
though the tests look very different.

It is possible to calculate the gain for each student and then just average
the gains, but this produces essentially the same score.
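
A minimal sketch of that comparison, using the ten pre and post scores from
the example quoted later in this thread. Pairing the two lists student by
student in the order listed is an assumption; the original example does not
say which post score belongs to which student. The function names are
hypothetical.

```python
from statistics import mean

# Scores from the example quoted later in this thread (percent, max 100).
# ASSUMPTION: the two lists are paired student by student in listed order.
pre = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
post = [35, 40, 45, 50, 55, 59, 64, 68, 70, 71]
MAX_SCORE = 100


def class_gain(pre, post, max_score):
    """Normalized gain computed from the class averages."""
    return (mean(post) - mean(pre)) / (max_score - mean(pre))


def mean_individual_gain(pre, post, max_score):
    """Average of each student's own normalized gain."""
    return mean((b - a) / (max_score - a) for a, b in zip(pre, post))


print(round(class_gain(pre, post, MAX_SCORE), 3))
print(round(mean_individual_gain(pre, post, MAX_SCORE), 3))
```

With this pairing the two figures agree to about two decimal places. Note
that the per-student version divides by zero for any student who already has
the maximum score on the pre test, which the class-average version avoids.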

Of course if you have a small sample or a small effect you need to check for
statistical significance. Some tests seldom show a decrease in scores, so
you do not have to be too worried about significance. The hand graded
Lawson test is one of these. It is very difficult to show gain on it.
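
One way to run such a check, sketched with only the Python standard library:
a paired t statistic on the example pre and post scores quoted later in this
thread. The paired t-test is an illustrative choice, not a test named in
this thread, and pairing the lists in listed order is an assumption. The
constant 2.262 is the standard two-sided 5% critical value of Student's t
for 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Example scores quoted later in this thread.
# ASSUMPTION: the two lists are paired student by student in listed order.
pre = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
post = [35, 40, 45, 50, 55, 59, 64, 68, 70, 71]

# Two-sided 5% critical value of Student's t for n - 1 = 9 degrees of freedom.
T_CRIT_DF9 = 2.262


def paired_t(pre, post):
    """Paired t statistic: mean difference over its standard error."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


t = paired_t(pre, post)
print(round(t, 2), "significant at 5%:", t > T_CRIT_DF9)
```

A large t here only says the change is unlikely to be chance; it says
nothing about whether the change is big enough to matter.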

Individual student gain is not as significant as gain for the class. Some
students actually show negative gain. This can happen when the pre test
score is essentially equivalent to random guessing. I interpret this as
meaning they are actually doing more thinking and are hitting the
distractors more, but just guessed without any clue on the pre test. Of
course looking at the data is also a source of information about what is
actually going on. Having a question by question spreadsheet can give
clues as to what they are missing. So Thornton and Sokoloff show in
graphical form the effect on particular questions of specific ILDs.

These evaluations are usually given without any penalty to the students, or
with no penalty for the pre test and a small weight as part of the final
exam. But I count the pre test as a small grade, with 100% credit for taking
and finishing it. I also warn students that if I see they are not taking it
seriously, they will get zero as a grade. The FMCE has 2 questions where the
correct answer lines up with what people generally believe, and these can be
used as checks for random guessing. The pre test is always given on the
first day before any teaching other than class rules and a pep talk. The
post test should be given at the very end, but I am convinced that in HS
that is not the correct timing if it is close to or after the prom, when
students totally check out.

This type of calculation may seem to be a lot, but most HS now require use
of a grading program, so this calculation can be easily done by the
computer. Indeed colleges are going this way also. I can remember giving
grades using just a calculator, and I know my HS teachers did it by hand or
just estimated the final grade. Now as an adjunct I have been required to
turn in all copies of the final exam along with the complete spreadsheet of
grades. I even know a teacher who came to a parent teacher conference and
was surprised to see the parent brought along a graduate student in the
subject, Spanish. The graduate student pored over the recent exam and
declared that it was graded correctly, so the failing grade remained. Next
I suppose they will also bring in a lawyer and a shrink. We all know that
grades are very approximate and often do not reflect many important
qualities of the learning, but now they are being considered as something to
litigate over. No wonder teachers are happy to have grade inflation!

Gain is certainly a term which can have a variety of mathematical meanings.
But I almost always quote normalized gain <G> as do most PER researchers.
And most papers specify explicitly that they are using "normalized gain".
Many education papers specify effect size.

John M. Clement
Houston, TX


Let us suppose that in preparation for a particular class
topic, a teacher finds that the ten students who are
enrolled, score in this way on a hundred question paper:

25, 30, 35, 40, 45, 50, 55, 60, 65, 70

Using an old TI-30XA calculator, he finds that
mean 47.5
std 14.36
std (n-1) = 15.14

Four months later, when the class draws to completion,
he provides another test, and gets the following results:

35, 40, 45, 50, 55, 59, 64, 68, 70, 71

He digests these scores to
mean 55.7
std 12.22
std (n-1) 12.88


Using this as his benchmark text:

Effect size = (post - pre)/STD
"An effect size of 1 is considered enormous and many studies
do not get effect sizes larger than .5."

He finds that the Effect size for the ensemble is
( 55.7 - 47.5)/14.36 = 0.57 with a range of
(59 - 55)/14.36 = 0.28 for the low gain student
to
(70 - 40)/14.36 = 2.09 for the high gain student

He then compares that evaluation with the following benchmark
text due to Hake:

"Normalized Gain = (post-pre)/(max score - pre)"

He finds that Normalized Gain is
(55.7 - 47.5) / (100 - 47.5) = 0.16 with a range of:
(59 - 55)/(100 - 55) = 0.09 for the low gain student to
(70 - 40)/(100 - 40) = 0.5 for the high-gain student
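
The arithmetic above can be reproduced with a short script. This is only a
sketch: the helper names are hypothetical, and it follows the example in
dividing by the pre-test standard deviation. Both the population (n) and
the sample (n-1) versions are shown, since the calculator reports both.

```python
from statistics import mean, pstdev, stdev

# Scores from the example above (hundred-question paper, so max is 100).
pre = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
post = [35, 40, 45, 50, 55, 59, 64, 68, 70, 71]
MAX_SCORE = 100


def effect_size(pre_score, post_score, std):
    """Effect size = (post - pre) / STD."""
    return (post_score - pre_score) / std


def normalized_gain(pre_score, post_score, max_score=MAX_SCORE):
    """Normalized (Hake) gain = (post - pre) / (max score - pre)."""
    return (post_score - pre_score) / (max_score - pre_score)


std_n = pstdev(pre)    # population form, the "std" line above
std_n1 = stdev(pre)    # sample form, the "std (n-1)" line above

print(round(effect_size(mean(pre), mean(post), std_n), 2))   # 0.57
print(round(effect_size(mean(pre), mean(post), std_n1), 2))  # 0.54
print(round(normalized_gain(mean(pre), mean(post)), 2))      # 0.16
print(round(normalized_gain(55, 59), 2))                     # 0.09
print(round(normalized_gain(40, 70), 2))                     # 0.5
```

Switching from the n to the n-1 standard deviation moves the class effect
size only from 0.57 to 0.54, so for a class of ten the choice matters little.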


Is this the sort of data handling you'd expect a teacher to use?
Or am I mishandling the data?

Sincerely


Brian Whatcott Altus OK

On 6/22/2013 10:31 AM, John Clement wrote:
If the pre and post are different the max score would be from the pretest.
This is because the denominator represents what the student does not know
coming into the class. But the same test is usually used for pre and post.
I am not sure if there is much difference in STD between the pre and post
tests, so either would do. But if there is a saturation effect then the
pre test would be the one to use. I have never seen a specification here
as to which is preferred.

Part of the problem with digging out the actual effect of an intervention
is that often the only quoted figure is that the effect is statistically
significant. This gives practically no useful information other than they
saw a result. A 1% gain can be significant, but so small it is not worth
pursuing. So effect size or normalized gain are really the best measures
of importance.

I always refer to normalized gain when looking at PER results. But other
research often just quotes the results of testing, and you have to dig out
the gain. Normalized gain is apparently also used in economic education
research. It is not clear that any evaluation has normalized gain
independent of the pre test value. In reality there is still some
dependence even for the FCI, but it is small. The most important predictor
of FCI gain is the Lawson test. Gain on the Lawson test is more important
than FCI gain because it indicates an increase in the use of higher level
thinking. But gain on the Lawson test is elusive and difficult to get.
Lawson shows graphs with very good gain on his test, but it is not clear
what he is doing to get it. Shayer & Adey show an increase in thinking
which is consistent with good gain on the Lawson test.

I can show some gain on the Lawson test, and others have also shown some
gain using various physics materials. Curiously, using the Knight workbook
shows lower gain than some other materials. Modeling shows gain when
infused with some explicit referencing of the important thinking skills.

It is tough to make improvements that actually produce better results on
evaluations. What you think works is often a dud, and sometimes specific
things that you are doubtful about may actually be beneficial. This is
often called action research and if all teachers did it realistically then
we would have much more data to be able to make rational changes in
teaching. Until teachers do it, the researchers are the only good
resource. Teaching needs to get out of the usual "look what I did in
class, and I think it worked!" It is like physics. People believed that
light was a particle because Newton said so, and eventually the wave
effects forced a conceptual change. Then QM forced more conceptual change,
and it was surprising. So conventional wisdom is just as often wrong as
right, and needs to be examined scientifically when possible.

Please notice that what I am advocating is that the teachers use the gain
figures. When administrators use mandated tests to measure gain, they do
not have good tools to figure out what is actually going on. A class may
have poor gain because it has more low SES students, or because of other
external factors beyond the control of the teacher. The same teacher in a
high SES class may do OK.

John M. Clement
Houston, TX



-----Original Message-----
From: Phys-l [mailto:phys-l-bounces@phys-l.org] On Behalf Of brian whatcott
Sent: Saturday, June 22, 2013 9:14 AM
To: Phys-L@Phys-L.org
Subject: Re: [Phys-L] Indicators of quality teaching

I found this explanation helpful. Perhaps you will indulge me a little
further:
Is the standard deviation calculated from the pre test, or from the post
test results?
Is the maximum possible score selected for the pre test, or for the post
test?

Brian Whatcott

On 6/22/2013 12:03 AM, John Clement wrote:
STD is standard deviation, which is calculated from the aggregate of
scores. Max score would be the maximum score attainable. So if you use the
raw score and there are 23 questions the max would be 23. But if the test
is converted to a percentage then the max score would be 100.
John M. Clement
Houston, TX

On 6/21/2013 7:19 PM, John Clement wrote [in part]:


Effect size = (post - pre)/STD
An effect size of 1 is considered enormous and many studies do not get
effect sizes larger than .5. Many PER practitioners get effect sizes
greater than 1. But this definition of gain has the problem that it is
skewed by the size of the pre-test and also is highly dependent on class
homogeneity. Just a straight post-pre has a large dependence on the pre
test. So Hake came up with Hake gain or normalized gain.
Normalized Gain = (post-pre)/(max score - pre)

_______________________________________________
Forum for Physics Educators
Phys-l@phys-l.org
http://www.phys-l.org/mailman/listinfo/phys-l



