If you reply to this long (19kB) post, please don't hit the reply
button unless you prune the copy of this post that may appear in your
reply down to a few relevant lines; otherwise the entire,
already-archived post may be needlessly resent to subscribers.
***************************************
ABSTRACT: It is argued that direct measure of students' higher-level
*domain-specific* learning through pre/post testing using (a) valid
and consistently reliable tests *devised by disciplinary experts*,
and (b) traditional courses as controls, can provide a crucial
complement to the top-down assessment of broad-ability areas
advocated by Hersh (2005) and Klein et al. (2005).
***************************************
Michael Sylvester, in his TIPS [Teaching In the Psychological
Sciences with archives at
<http://www.mail-archive.com/tips%40acsun.frostburg.edu/>] post of 4
May 2006 14:03:13-0000 titled "Learning Evaluation" wrote [bracketed
by lines "SSSSSSSS. . . "; slightly edited]:
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
Can anyone recommend a learning evaluation instrument - one that would
assess the extent of students' learning in the classroom?
Relevant items could be: (a) the teacher stimulates my thinking, (b)
I am really learning a lot from this course, (c) I would recommend
this course to others, (d) I learn more in this course than my grade
would indicate, etc.
The problem that I find with the current teacher evaluation is that
it does not address issues as to how the course contributes to
students' learning.
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
To which David Wasieleski, in his TIPS post of 04 May 2006
07:13:21-0700, responded:
"Aren't exams and assignments learning evaluations?"
In this response to Wasieleski & Sylvester (no, I didn't pay them to
serve as straight men), I'll draw upon "The Physics Education Reform
Effort: A Possible Model for Higher Education" [Hake (2005a)]. My
apologies to the few outliers who have read that article.
Regarding David Wasieleski's apparent belief that course exams
constitute learning evaluations, Wilbert McKeachie (1987) has pointed
out that the time-honored gauge of student learning - course exams
and final grades - typically measures lower-level educational
objectives such as memory of facts and definitions rather than
higher-level outcomes such as critical thinking and problem solving.
Regarding Michael Sylvester's criticism of Student Evaluations of
Teaching (SET's), the same criticism - that only lower-level learning
is assessed - applies to SET's, even those that contain questions such
as those suggested by Sylvester, since their primary justification as
measures of student learning appears to lie in the modest correlations
of overall course ratings (+0.47) and overall instructor ratings
(+0.43) with "achievement" *as measured by course exams or final
grades* (Cohen 1981).
HOW THEN CAN WE MEASURE STUDENTS' HIGHER-LEVEL LEARNING IN COLLEGE COURSES?
Several *indirect* (and therefore in my view problematic) gauges have
been developed; e.g., Reformed Teaching Observation Protocol (RTOP),
National Survey of Student Engagement (NSSE), Student Assessment of
Learning Gains (SALG), and Knowledge Surveys (KS's) (Nuhfer & Knipp
2003). For a discussion and references for all but the last see Hake
(2005b). RTOP and NSSE contain questions of the type desired by
Sylvester.
On the other hand, *direct* measures of student learning have been
developed by Hersh (2005) and Klein et al. (2005). Hersh codirects
the "Learning Assessment Project"
<http://www.cae.org/content/pro_collegiate.htm> that "evaluates
students' ability to articulate complex ideas, examine claims and
evidence, support ideas with relevant reasons and examples, sustain a
coherent discussion, and use standard written English." But Shavelson
& Huang (2003) warn that:
". . . learning and knowledge are highly domain-specific - as,
indeed, is most reasoning. Consequently, **the direct impact of
college is most likely to be seen at the lower levels of Chart 1 -
domain-specific knowledge and reasoning** . . . [of the Shavelson &
Huang 2003 "Framework of Cognitive Objectives" (SHFCO)]."
Klein et al. have devised tests that compare student learning across
institutions in both domain-specific and broad-ability areas of the
SHFCO.
In sharp contrast to the invalid (course exams, final grades, SET's),
indirect (RTOP, NSSE, SALG, KS's), and general-ability [Hersh (2005),
Klein et al. (2005)] measures discussed above is the DIRECT MEASURE OF
STUDENTS' HIGHER-LEVEL *DOMAIN-SPECIFIC* LEARNING THROUGH PRE/POST
TESTING, using (a) valid and consistently reliable tests *devised by
disciplinary experts*, and (b) traditional courses as controls. It
should be realized that domain-specific learning is probably coupled
to the broad-ability areas of the SHFCO, as suggested for physics by
the recent research of Coletta & Phillips (2005).
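As a concrete (if oversimplified) illustration of what such pre/post
testing yields, here is a minimal Python sketch of the class-average
normalized gain <g> = (%post - %pre)/(100 - %pre) commonly used in the
physics-education-research literature to summarize pre/post results;
the scores below are invented for illustration only:

def normalized_gain(pre_percent, post_percent):
    # Normalized gain <g> = (%post - %pre) / (100 - %pre),
    # i.e., the fraction of the maximum possible gain actually achieved.
    return (post_percent - pre_percent) / (100.0 - pre_percent)

# Hypothetical class averages (percent correct) on a concept test:
pre_avg, post_avg = 40.0, 70.0
print(normalized_gain(pre_avg, post_avg))  # 0.5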
Yes, I know, as discussed in Hake (2002), content learning should not
be the sole measure of the value of a course. But I think most would
agree that a gauge of content learning is *necessary*, if not
sufficient.
In my opinion, the physics-education reform model - measurement and
improvement of cognitive gains by faculty disciplinary experts *in
their own courses* - can provide a crucial complement to the top-down
approaches of Hersh (2005) and Klein et al. (2005). Such pre/post
testing, pioneered by economists [Paden & Moyer (1969)] and
physicists [Halloun & Hestenes (1985a,b)], is rarely employed in
higher education, in part because of the tired old canonical
objections recently lodged by Suskie (2004) and countered by Hake
(2004a) and Scriven (2004).
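To sketch how traditional courses can serve as controls in such
pre/post testing, one might compare class-average normalized gains for
a reformed section against a traditional section; a minimal Python
example (all scores invented) follows:

from statistics import mean

def g(pre, post):
    # Normalized gain for one student, from pre/post percent scores.
    return (post - pre) / (100.0 - pre)

# Hypothetical (pre, post) percent scores for students in each section:
reformed = [(35, 75), (50, 80), (45, 70)]
traditional = [(40, 55), (30, 45), (50, 60)]

print("reformed    <g> =", round(mean(g(p, q) for p, q in reformed), 2))
print("traditional <g> =", round(mean(g(p, q) for p, q in traditional), 2))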
Despite the nay-sayers, pre/post testing is gradually gaining a
foothold in introductory astronomy, biology, chemistry, computer
science, economics, engineering, and physics courses [see Hake (2004b)
for references].
Unfortunately, psychologists, as a group, have shown zero or even
negative interest in assessing the effectiveness of their own
introductory courses by means of definitive pre/post testing [see
e.g. Hake (2005c,d,e)].
IMHO, this is especially discouraging because psychologists and
psychometricians seem to be in control of (a) the U.S. Dept. of
Education's "What Works Clearinghouse" (WWC) <http://www.w-w-c.org/>
and (b) NCLB testing of "science achievement" to commence in 2007.
The latter threatens to promote California's direct instruction of
science throughout the U.S. [Hake (2005f)]. Why should psychologists be
the arbiters of "What Works" and NCLB testing when, as far as I know,
they haven't even bothered to meaningfully research "What Works" in
their own courses?
For recent scathing criticism of the WWC see Schoenfeld (2006a,b).
REFERENCES [Tiny URL's courtesy <http://tinyurl.com/create.php>]
Cohen, P.A. 1981. "Student Ratings of Instruction and Student
Achievement: A Meta-analysis of Multisection Validity Studies,"
Review of Educational Research 51: 281. For references to Cohen's
1986 and 1987 updates see Feldman (1989).
Feldman, K.A. 1989. "The Association Between Student Ratings of
Specific Instructional Dimensions and Student Achievement: Refining
and Extending the Synthesis of Data from Multisection Validity
Studies," Research in Higher Education 30: 583.
Hake, R.R. 2005c. "Re: Why Don't Psychologists Research the Effectiveness
of Their Own Introductory Courses?" online at
<http://tinyurl.com/muvy6>. Post of 20 Jan 2005 16:29:56-0800 to
PsychTeacher (rejected) & PhysLrnR.
Hake, R.R. 2005e. "Do Psychologists Research the Effectiveness of
Their Courses? Hake Responds to Sternberg," online at
<http://tinyurl.com/n9dp6>. Post of 21 Jul 2005 22:55:31-0700 to
AERA-C, AERA-D, AERA-J, AERA-L, ASSESS, EvalTalk, PhysLrnR, POD,
STLHE-L, & TeachingEdPsych.
Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of
college physics students," Am. J. Phys. 53: 1043-1055; online at
<http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics
Diagnostic" test (omitted from the online version), precursor to the
widely used "Force Concept Inventory" [Hestenes et al. (1992)].
Hersh, R.H. 2005. "What Does College Teach? It's time to put an end
to 'faith-based' acceptance of higher education's quality," Atlantic
Monthly 296(4): 140-143, November; freely online at (a) the Atlantic
Monthly <http://tinyurl.com/dwss8>, and (b) (with hot-linked academic
references) at <http://tinyurl.com/9nqon> (scroll to the APPENDIX).
Hestenes, D., M. Wells, & G. Swackhamer. 1992. "Force Concept
Inventory," Phys. Teach. 30: 141-158; online (except for the test
itself) at
<http://modeling.asu.edu/R&E/Research.html>. The 1995 revision by
Halloun, Hake, Mosca, & Hestenes is online (password protected) at
the same URL, and is available in English, Spanish, German,
Malaysian, Chinese, Finnish, French, Turkish, Swedish, and Russian.
Klein, S.P., G.D. Kuh, M. Chun, L. Hamilton, & R. Shavelson. 2005. "An
Approach to Measuring Cognitive Outcomes Across Higher Education
Institutions." Research in Higher Education 46(3): 251-276; online at
<http://www.stanford.edu/dept/SUSE/SEAL/> // "Reports/Papers" scroll
to "Higher Education," where "//" means "click on."
McKeachie, W.J. 1987. "Instructional evaluation: Current issues and
possible improvements," Journal of Higher Education 58(3): 344-350.
Paden, D.W. & M.E. Moyer. 1969. "The Relative Effectiveness of
Teaching Principles of Economics," Journal of Economic Education 1:
33-45.
Scriven, M. 2004. "Re: pre- post testing in assessment," AERA-D post
of 15 Sept 2004 19:27:14-0400; online at <http://tinyurl.com/942u8>.
Shavelson, R.J. & L. Huang. 2003. "Responding Responsibly To the
Frenzy to Assess Learning in Higher Education," Change Magazine,
January/February; online at <http://www.stanford.edu/dept/SUSE/SEAL/>
// "Reports/Papers" scroll to "Higher Education," where "//" means
"click on."
Suskie, L. 2004. "Re: pre- post testing in assessment," ASSESS post
of 19 Aug 2004 08:19:53-0400; online at <http://tinyurl.com/akz23>.