Some subscribers to Phys-L might be interested in a discussion-list
post "Re: controlled experiments" [Hake (2011)].
The abstract reads:
*********************************************
ABSTRACT: PhysLrnR's Brian Foley wrote (paraphrasing):
"I would love to be doing controlled experiments much of the time -
but they are darn near impossible to pull off. . . . . . Once you
have your 40 approved classrooms, you RANDOMLY SELECT 20 TEACHERS and
train them on your innovation. . . . . Also your 20 control teachers
need to be using their 'traditional' teaching (and specifically teach
like they have never heard of your innovation). . . . . and if you have
designed some good assessments of learning, then you will finally
have your result. . . . . . and if your innovation makes a difference
YOU JUST MIGHT GET THE MAGICAL p < 0.05 RESULT."
Brian seems to have succumbed to the siren calls of the Gold
Standardistas and the Statistical Significance Cultists. Modesty
forbids mention of these possible antidotes:
a. "Should Randomized Control Trials Be the Gold Standard of
Educational Research?" at <http://bit.ly/qrUfFz>,
b. "Seventeen Statements by Gold-Standard Skeptics #2" at
<http://bit.ly/oRGnBp>,
c. "The Cult of Statistical Significance" at <http://bit.ly/dkTyXP>.
*********************************************
"In some quarters, particularly medical ones, the randomized
experiment is considered the causal 'gold standard.' IT IS CLEARLY
NOT THAT IN EDUCATIONAL CONTEXTS, given the difficulties with
implementing and maintaining randomly created groups, with the
sometimes incomplete implementation of treatment particulars, with
the borrowing of some treatment particulars by control group units,
and with the limitations to external validity that often follow from
how the random assignment is achieved."
- Tom Cook & Monique Payne (2002, p. 174)
"After 4 decades of severe criticism, the ritual of null hypothesis
significance testing - mechanical dichotomous decisions around a
sacred 0.05 criterion - still persists. This article reviews the
problems with this practice, including its near-universal
misinterpretation of p as the probability that Ho . . . .[[the null
hypothesis]]. . . . is false, the misinterpretation that its
complement is the probability of successful replication, and the
mistaken assumption that if one rejects Ho one thereby affirms the
theory that led to the test. Exploratory data analysis and the use of
graphic methods, a steady improvement in and a movement toward
standardization in measurement, an emphasis on effect sizes using
confidence intervals, and the informed use of available statistical
methods is suggested. FOR GENERALIZATION, PSYCHOLOGISTS MUST FINALLY
RELY, AS HAS BEEN DONE IN THE OLDER SCIENCES, ON REPLICATION." [My
CAPS.]
- Jacob Cohen (1994) in "The earth is round (p < .05)"
REFERENCES [URLs shortened by <http://bit.ly/> and accessed on 11 July 2011.]
Cohen, J. 1994. "The earth is round (p < .05)," American Psychologist
49: 997-1003; online as a 1.2 MB pdf at <http://bit.ly/a45I2t> thanks
to Christopher Green <http://www.yorku.ca/christo/>.
Cook, T.D. & M.R. Payne. 2002. "Objecting to the Objections to Using
Random Assignment in Educational Research" in Mosteller & Boruch
(2002).
Hake, R.R. 2011. "Re: controlled experiments," online on the OPEN!
AERA-L archives at <http://bit.ly/onA7jk>. Post of 11 Jul 2011
11:15:54-0700 to AERA-L, Net-Gold, and PhysLrnR. The abstract and
link to the complete post are being transmitted to various discussion
lists and are also on my blog "Hake'sEdStuff" at
<http://bit.ly/qNlxAV> with a provision for comments.
Mosteller, F. & R. Boruch, eds. 2002. "Evidence Matters: Randomized
Trials in Education Research." Brookings Institution. Amazon.com
information at <http://amzn.to/n6T0Uo>. A searchable expurgated
Google Book Preview is online at <http://bit.ly/mTcPIE>.