
[Phys-L] Re: Should Randomized Control Trials Be the Gold Standard of Educational Research?



If you respond to this long (11 kB) post, PLEASE DON'T HIT THE REPLY
BUTTON UNLESS YOU PRUNE THE ORIGINAL MESSAGE NORMALLY CONTAINED IN
YOUR REPLY DOWN TO A FEW LINES; otherwise you may inflict this entire
post yet again on suffering list subscribers.

In his Phys-L post of 16 Apr 2005 under the above title, Jack Uretsky
(2005) wrote [bracketed by lines "UUUUUUU. . . . ."]:

UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
I don't understand. If you can't control the variables, how can you get
statistically "valid" results?

My experimentalist friends (in particle physics) now divide
uncertainties into a "statistical" part and a "systematic" part.
These two parts must be combined somehow (how to combine is still IMO
an unsolved problem) in order to arrive at a meaningful statement of
uncertainty. When several experiments measure the same quantity,
then a comparison of quoted uncertainties gives one an intuitive
feeling for the uncertainty of our knowledge of the quantity being
measured.

I have seen nothing in the field of measuring teaching techniques that
tells me that we have much insight into how to make "statistically valid"
comparisons.
UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU

[I shall politely forgo criticism of Jack's reply-button-pushing,
archive-clogging re-infliction on suffering Phys-L'ers of two ENTIRE
already-archived posts, Shapiro (2005) and Hake (2005a); why not use
"webcites" (Hake 2005b)?]

Jack appears to be either oblivious to or dismissive of:

(a) the approximately two-standard-deviation difference in average
normalized gains <g> between interactive-engagement (IE) and
traditional (T) mechanics courses observed by me [Hake (1998a,b)] and
by MANY OTHER RESEARCH GROUPS [referenced in Hake (2002a,b); a sketch
of how <g> is computed appears just after item (b) below];

(b) Cohen's (1988) effect size d = 2.43 for the average <g>'s of IE
and T courses calculated in Hake (2002a) for the data of Hake
(1998a,b).
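
For readers unfamiliar with the normalized gain, here is a minimal
Python sketch of the course-average <g> of Hake (1998a), i.e., the
class-average actual gain divided by the maximum possible gain. The
pre/post percentages below are invented for illustration and are NOT
data from Hake (1998a,b):

def normalized_gain(pre_avg, post_avg):
    """Course-average normalized gain <g> = (<post%> - <pre%>) / (100 - <pre%>)."""
    return (post_avg - pre_avg) / (100.0 - pre_avg)

# Hypothetical class-average percentages on a pre/post concept test:
g_T  = normalized_gain(pre_avg=45.0, post_avg=58.0)  # traditional course, ~0.24
g_IE = normalized_gain(pre_avg=45.0, post_avg=78.0)  # interactive-engagement course, ~0.60

print(f"<g>_T  = {g_T:.2f}")
print(f"<g>_IE = {g_IE:.2f}")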

Cohen's (1988, p. 24) rule of thumb - based on typical results in
social-science research - is that d = 0.2, 0.5, and 0.8 imply,
respectively, "small," "medium," and "large" effects. But Cohen
cautions that the adjectives "are relative, not only to each other,
but to the area of behavioral science or even more particularly to
the specific content and research method being employed in any given
investigation." Eight reasons for the unusually high d = 2.43 are
given in Hake (2004).
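
As a rough illustration (NOT a reproduction of the calculation in
Hake 2002a), here is a Python sketch of Cohen's d as the difference
of group means divided by the pooled standard deviation, applied to
two invented sets of course-average <g>'s:

import statistics
from math import sqrt

def cohens_d(sample1, sample2):
    """Cohen's d: difference of means divided by the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Invented course-average <g>'s (NOT the Hake 1998a,b data):
g_IE = [0.64, 0.45, 0.58, 0.33, 0.52, 0.69, 0.41, 0.56]
g_T  = [0.18, 0.31, 0.25, 0.14, 0.29, 0.22, 0.35, 0.19]

print(f"d = {cohens_d(g_IE, g_T):.2f}")  # ~2.8 here: "large" on Cohen's scale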

It's possible that Jack seeks statistical validation through Null
Hypothesis Significance Testing (NHST), the significance of which has
been under attack for many years. The effect size is commonly used in
meta-analyses and is strongly recommended by many psychologists and
biologists [for references see Hake (2002a)] as a preferred
alternative (or at least an addition) to the usually inappropriate
t-tests and p values associated with null-hypothesis testing.
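
To make the contrast concrete, here is a small sketch (using
numpy/scipy, with simulated scores that are NOT real course data) of
why a p value alone can mislead: with a large enough N an
educationally trivial difference yields an impressively small p,
while the effect size d stays negligible:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated exam scores for two very large classes whose true means
# differ by only 1 point against a 20-point standard deviation:
a = rng.normal(loc=70.0, scale=20.0, size=50_000)
b = rng.normal(loc=71.0, scale=20.0, size=50_000)

t, p = stats.ttest_ind(a, b)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)  # equal-n pooling
d = (b.mean() - a.mean()) / pooled_sd

print(f"p = {p:.2g}")   # typically far below 0.001 at this sample size
print(f"d = {d:.2f}")   # ~0.05, far below Cohen's "small" benchmark of 0.2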

As related in Hake (2002a), Carver (1993) subjected the Michelson &
Morley (1887) data to a simple analysis of variance (ANOVA) and found
*statistical significance* (p < 0.001) associated with the direction
the light was traveling! He writes [my *emphasis*]:

"It is interesting to speculate how the course of history might have
changed if Michelson and Morley had been trained to use this *corrupt
form of the scientific method,* that is, testing the null hypothesis
first. They might have concluded that there was evidence of
*significant* differences in the speed of light associated with its
direction and that therefore there was evidence for the luminiferous
ether . . . . Fortunately Michelson and Morley . . .(first) . . . .
interpreted their data with respect to their research hypothesis."
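
For those who want to see the mechanics of such an analysis, here is
a toy one-way ANOVA in Python on simulated fringe-displacement
readings binned by direction (these are NOT the Michelson-Morley
data): given enough readings, a physically negligible
direction-dependent offset is declared "statistically significant,"
which is exactly Carver's warning about letting the null hypothesis
rather than the research hypothesis drive the interpretation:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Toy readings in four directions: a tiny direction-dependent offset
# (thousandths of a unit) buried in much larger noise.
n = 20_000
readings = {
    "N": rng.normal(0.010, 0.05, n),
    "E": rng.normal(0.012, 0.05, n),
    "S": rng.normal(0.010, 0.05, n),
    "W": rng.normal(0.013, 0.05, n),
}

F, p = stats.f_oneway(*readings.values())
print(f"F = {F:.1f}, p = {p:.1g}")
# Typically p << 0.001 here, even though the differences among the
# direction means are more than an order of magnitude smaller than
# the scatter of the individual readings.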

Consistent with the scientific methodology of physical scientists
such as Michelson/Morley, Rozeboom (1960) wrote: ". . . the primary
aim of a scientific experiment is not to precipitate decisions, but
to make an appropriate adjustment in the degree to which one accepts,
or believes, the hypothesis or hypotheses being tested."

For a plethora of other anti-NHST references see Hake (2002a). For a
mildly pro-NHST discussion see Wainer & Robinson (2003). The latter
conclude that "NHST is most often useful as an adjunct to other
results (e.g., effect sizes) rather than as a stand-alone result."

Examples of NHST analyses of physics-education-research results
include Beichner (1994), Beichner et al. (1999), Cheng et al. (2004),
and Hake (2002a).

Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<rrhake@earthlink.net>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>

REFERENCES
Beichner, R.J. 1994. Testing student interpretation of kinematics
graphs. Am. J. Phys. 62(8): 750-762.

Beichner, R., L. Bernold, E. Burniston, P. Dail, R. Felder, J.
Gastineau, M. Gjertsen, and J. Risley. 1999. Case study of the
physics component of an integrated curriculum. Physics Ed. Res.
Supplement to Am. J. Phys. 67(7): S16-S24.

Carver, R.P. 1993. "The case against statistical significance
testing, revisited," Journal of Experimental Education 61(4): 287-292.

Cheng, K.K., B.A. Thacker, R.L. Cardenas, & C. Crouch. 2004. "Using
an online homework system enhances students' learning of physics
concepts in an introductory course," Am. J. Phys. 72(11): 1447-1453.

Cohen, J. 1988. "Statistical power analysis for the behavioral
sciences." Lawrence Erlbaum, 2nd ed.

Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A
six-thousand-student survey of mechanics test data for introductory
physics courses," Am. J. Phys. 66: 64-74; online as ref. 24 at
<http://www.physics.indiana.edu/~hake>, or simply click on
<http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB).

Hake, R.R. 1998b. "Interactive-engagement methods in introductory
mechanics courses," online as ref. 25 at
<http://www.physics.indiana.edu/~hake>, or simply click on
<http://www.physics.indiana.edu/~sdi/IEM-2b.pdf> (108 kB), a crucial
companion paper to Hake (1998a).

Hake, R.R. 2002a. "Lessons from the physics education reform effort,"
Ecology and Society 5(2): 28; online at
<http://www.ecologyandsociety.org/vol5/iss2/art28/>. Ecology and
Society
(formerly Conservation Ecology) is a free online "peer-reviewed
journal of integrative science and fundamental policy research" with
about 11,000 subscribers in about 108 countries.

Hake, R.R. 2002b. "Assessment of Physics Teaching Methods,"
Proceedings of the UNESCO-ASPEN Workshop on Active Learning in
Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online
as ref. 29 at
<http://www.physics.indiana.edu/~hake/>, or download directly by clicking on
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-Assessb.pdf> (84 kB)

Hake, R.R. 2004. "Design-Based Research: A Primer for Physics
Education Researchers," submitted to the American Journal of Physics
on 10 June 2004; online as reference 34 at
<http://www.physics.indiana.edu/~hake>, or download directly by clicking on
<http://www.physics.indiana.edu/~hake/DBR-AJP-6.pdf> (310kB).

Hake, R.R. 2005a. "Should Randomized Control Trials Be the Gold
Standard of Educational Research ?" online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0504&L=aera-l&T=0&O=D&P=1945>.
Post of
15 Apr 2005 to AERA-C, AERA-D, AERA-G, AERA-H, AERA-J, AERA-K, AERA-L,
AP-Physics, ASSESS, Biopi-L, Chemed-L, EvalTalk, Math-Learn, Phys-L,
Physhare, POD, STLHE-L, & TIPS.

Hake, R.R. 2005b. "Why Not Webcites?" online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0501&L=aera-d&P=R8129&I=-3>.
Post of 30 Jan 2005 14:35:08-0800 to AERA-D, ASSESS, EdStat,
EvalTalk, PhysLrnR, POD, TeachEdPsych, & Multilevel.

Michelson, A.A. & E.W. Morley. 1887. On the relative motion of earth
and luminiferous ether. American Journal of Science 134:333-345.

Rozeboom, W.W. 1960. The fallacy of the null-hypothesis significance
test. Psychological Bulletin 57:416-428; online at
<http://psychclassics.yorku.ca/Rozeboom/>.

Shapiro, M. 2005. "Re: Should Randomized Control Trials Be the Gold
Standard of Educational Research?" Phys-L post of 16 Apr 2005
10:05:14-0700; online at
<http://lists.nau.edu/cgi-bin/wa?A2=ind0504&L=phys-l&F=&S=&P=18397>.

Uretsky, J. 2005. "Re: Should Randomized Control Trials Be the Gold
Standard of Educational Research?" Phys-L post of 16 Apr 2005
13:18:59-0500; online at
<http://lists.nau.edu/cgi-bin/wa?A2=ind0504&L=phys-l&P=18569>.

Wainer, H. & D.H. Robinson. 2003. "Shaping Up the Practice of Null
Hypothesis Significance Testing," Educational Researcher 32(7):
22-30; online as a 148 kB pdf at
<http://www.aera.net/publications/?id=399>.