
Re: [Phys-L] Indicators of quality teaching - Concrete Example



Let us suppose that, in preparation for a particular class topic, a teacher finds that the ten enrolled students score as follows on a hundred-question paper:

25, 30, 35, 40, 45, 50, 55, 60, 65, 70

Using an old TI-30XA calculator, he finds that
mean = 47.5
std (n) = 14.36
std (n-1) = 15.14

Four months later, when the class draws to a close,
he gives another test and gets the following results:

35, 40, 45, 50, 55, 59, 64, 68, 70, 71

He reduces these scores to
mean = 55.7
std (n) = 12.22
std (n-1) = 12.88
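
The calculator figures above can be cross-checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` is only illustrative, and it assumes that "std" means the population (divide by n) form while "std (n-1)" means the sample form.

```python
import statistics

# Scores as listed in the message.
pre = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
post = [35, 40, 45, 50, 55, 59, 64, 68, 70, 71]


def summarize(scores):
    """Return mean, population std (n) and sample std (n-1) of a score list."""
    return {
        "mean": statistics.mean(scores),
        "std_n": statistics.pstdev(scores),         # divides by n
        "std_n_minus_1": statistics.stdev(scores),  # divides by n - 1
    }


for label, scores in (("pre", pre), ("post", post)):
    s = summarize(scores)
    print(f"{label}: mean {s['mean']:.1f}, "
          f"std {s['std_n']:.2f}, std (n-1) {s['std_n_minus_1']:.2f}")
```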


Using this as his benchmark text:

Effect size = (post - pre)/STD
"An effect size of 1 is considered enormous and many studies do not
get effect sizes larger than .5."

He finds that the effect size for the ensemble is
(55.7 - 47.5)/14.36 = 0.57 with a range of
(59 - 55)/14.36 = 0.28 for the low-gain student
to
(70 - 40)/14.36 = 2.09 for the high-gain student
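
The effect-size arithmetic above can be reproduced as follows. This is only a sketch: the function name `effect_size` is illustrative, the denominator is assumed to be the pre-test population standard deviation (14.36) as used in the message, and the pre/post pairings for the low-gain and high-gain students are taken as given there.

```python
def effect_size(pre, post, std):
    """Effect size = (post - pre) / STD, per the benchmark text quoted above."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (post - pre) / std


PRE_STD = 14.36  # pre-test population std from the message (an assumed choice)

ensemble = effect_size(47.5, 55.7, PRE_STD)   # class means
low_gain = effect_size(55, 59, PRE_STD)       # pairing as given in the message
high_gain = effect_size(40, 70, PRE_STD)      # pairing as given in the message

print(f"ensemble {ensemble:.2f}, low {low_gain:.2f}, high {high_gain:.2f}")
```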

He then compares that evaluation with the following benchmark text
due to Hake:

"Normalized Gain = (post-pre)/(max score - pre)"

He finds that the normalized gain is
(55.7 - 47.5)/(100 - 47.5) = 0.16 with a range of
(59 - 55)/(100 - 55) = 0.09 for the low-gain student
to
(70 - 40)/(100 - 40) = 0.5 for the high-gain student
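
The normalized-gain figures can be checked the same way. This is a minimal sketch of the quoted formula; the function name `normalized_gain` and the default maximum of 100 are assumptions for illustration, and the student pairings are again taken as given in the message.

```python
def normalized_gain(pre, post, max_score=100):
    """Normalized Gain = (post - pre) / (max score - pre)."""
    if pre >= max_score:
        # No room left to gain, so the ratio is undefined.
        raise ValueError("pre score must be below the maximum score")
    return (post - pre) / (max_score - pre)


g_ensemble = normalized_gain(47.5, 55.7)  # class means
g_low = normalized_gain(55, 59)           # pairing as given in the message
g_high = normalized_gain(40, 70)          # pairing as given in the message

print(f"ensemble {g_ensemble:.2f}, low {g_low:.2f}, high {g_high:.2f}")
```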


Is this the sort of data handling you'd expect a teacher to use?
Or am I mishandling the data?

Sincerely


Brian Whatcott Altus OK

On 6/22/2013 10:31 AM, John Clement wrote:
If the pre and post tests are different, the max score would be from the
pre test. This is because the denominator represents what the student does
not know coming into the class. But the same test is usually used for pre
and post. I am not sure if there is much difference in STD between the pre
and post tests, so either would do. But if there is a saturation effect then
the pre test would be the one to use. I have never seen a specification here
as to which is preferred.

Part of the problem with digging out the actual effect of an intervention is
that often the only quoted figure is that the effect is statistically
significant. This gives practically no useful information other than that they
saw a result. A 1% gain can be significant, but so small it is not worth
pursuing. So effect size or normalized gain are really the best measures of
importance.

I always refer to normalized gain when looking at PER results. But other
research often just quotes the results of testing, and you have to dig out
the gain. Normalized gain is apparently also used in economic education
research. It is not clear that any evaluation has normalized gain
independent of the pre test value. In reality there is still some
dependence even for the FCI, but it is small. The most important predictor
of FCI gain is the Lawson test. Gain on the Lawson test is more important
than FCI gain because it indicates an increase in the use of higher level
thinking. But gain on the Lawson test is elusive and difficult to get.
Lawson shows graphs with very good gain on his test, but it is not clear
what he is doing to get it. Shayer & Adey show an increase in thinking
which is consistent with good gain on the Lawson test.

I can show some gain on the Lawson test, and others have also shown some
gain using various physics materials. Curiously, using the Knight workbook
shows lower gain than some other materials. Modeling shows gain when
infused with some explicit reference to the important thinking skills.

It is tough to make improvements that actually produce better results on
evaluations. What you think works is often a dud, and sometimes specific
things that you are doubtful about may actually be beneficial. This is
often called action research and if all teachers did it realistically then
we would have much more data to be able to make rational changes in
teaching. Until teachers do it, the researchers are the only good resource.
Teaching needs to get away from the usual "look what I did in class, and I think
it worked!" It is like physics. People believed that light was a particle
because Newton said so, and eventually the wave effects forced a conceptual
change. Then QM forced more conceptual change, and it was surprising. So
conventional wisdom is just as often wrong as right, and needs to be
examined scientifically when possible.

Please notice that what I am advocating is that the teachers use the gain
figures. When administrators use mandated tests to measure gain, they do
not have good tools to figure out what is actually going on. A class may
have poor gain because it has more low SES students, or because of other
external factors beyond the control of the teacher. The same teacher in a
high SES class may do OK.

John M. Clement
Houston, TX


-----Original Message-----
From: Phys-l [mailto:phys-l-bounces@phys-l.org] On Behalf Of brian whatcott
Sent: Saturday, June 22, 2013 9:14 AM
To: Phys-L@Phys-L.org
Subject: Re: [Phys-L] Indicators of quality teaching

I found this explanation helpful. Perhaps you will indulge me a little
further:
Is the standard deviation calculated from the pre test, or from the
post test results?
Is the maximum possible score selected for the pre test, or for the
post test?

Brian Whatcott

On 6/22/2013 12:03 AM, John Clement wrote:
STD is standard deviation, which is calculated from the aggregate of scores.
Max score would be the maximum score attainable. So if you use the
raw score and there are 23 questions the max would be 23.
But if the test is converted to a percentage then the max score would be 100.
John M. Clement
Houston, TX
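
One consequence of the reply above is that normalized gain comes out the same whether raw or percentage scores are used, provided the max score matches the scale. A small sketch follows; the raw scores 10 and 15 on a 23-question test are hypothetical values chosen only for illustration, as is the function name.

```python
import math


def normalized_gain(pre, post, max_score):
    """Normalized Gain = (post - pre) / (max score - pre)."""
    return (post - pre) / (max_score - pre)


N_QUESTIONS = 23               # test length mentioned in the reply
raw_pre, raw_post = 10, 15     # hypothetical raw scores, not from the thread

# Same scores converted to percentages, so the max score becomes 100.
pct_pre = 100 * raw_pre / N_QUESTIONS
pct_post = 100 * raw_post / N_QUESTIONS

g_raw = normalized_gain(raw_pre, raw_post, N_QUESTIONS)
g_pct = normalized_gain(pct_pre, pct_post, 100)

print(f"raw: {g_raw:.3f}, percent: {g_pct:.3f}")
print("same gain on both scales:", math.isclose(g_raw, g_pct))
```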

On 6/21/2013 7:19 PM, John Clement wrote [in part]:


Effect size = (post - pre)/STD
An effect size of 1 is considered enormous and many studies do not
get effect sizes larger than .5. Many PER practitioners
get effect sizes greater than 1.
But this definition of gain has the problem that it is skewed by the
size of the pre-test and also is highly dependent on class
homogeneity. Just a straight post-pre has a large dependence
on the pre test. So Hake came up with Hake gain or normalized gain.
Normalized Gain = (post-pre)/(max score - pre)

_______________________________________________
Forum for Physics Educators
Phys-l@phys-l.org
http://www.phys-l.org/mailman/listinfo/phys-l