
Re: [Phys-l] Research into student evaluations



If you reply to this long (15 kB) post, please prune the copy of it that may appear in your reply down to a few relevant lines; otherwise the entire already-archived post will be needlessly resent to subscribers.

*********************************************
ABSTRACT: As far as I know there have been no rigorous measurements of correlations of student evaluation of teaching (SET) ratings in large data sets over many courses with
(a) academic expectations, (b) course delivery techniques, or (c) gains on standardized tests of student learning, such as the Force Concept Inventory (FCI). However, anecdotal evidence suggests a NEGATIVE correlation of normalized gains on the FCI with SET scores. It is argued, yet again, that SET's may be valid for gauging the important *affective* impact of courses and for providing diagnostic feedback to *teachers*, but they are NOT valid as measures of higher education's primary concern: students' higher-order learning. In fact the gross misuse of SET's as gauges of student learning is, in my view, one of the institutional factors that thwarts substantive educational reform.
*********************************************

Gary Turner, in his Phys-L post of 26 Oct 2006 16:31:15-0500 titled "Research into student evaluations" wrote [bracketed by lines "TTTTTTT. . ."; slightly edited]:

TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
Can anyone recommend any research articles on the nature of the correlation between student evaluations of teaching (SET's) and any of the following:

a. academic expectations;

b. course delivery techniques (e.g., lecture vs active);

c. gains on standardized tests, such as the Force Concept Inventory (FCI) [Hestenes et al. (1992)].
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

As far as I know (please correct me if I'm wrong) there have been no rigorous measurements of such correlations in large data sets over many courses. But for anecdotal evidence suggesting a NEGATIVE correlation of normalized gains on the FCI with SET scores see, e.g., "Re: What if students learn better in a course they don't like?" [Hake (2006a)].
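
For readers who want to check such a correlation against their own course data, here is a minimal sketch (Python with NumPy). The class-average normalized gain is defined as <g> = (<%post> - <%pre>)/(100 - <%pre>), i.e., the actual gain divided by the maximum possible gain. All the pre/post FCI percentages and SET ratings below are hypothetical illustrations, NOT data from any study:

  # Minimal sketch: class-average normalized gain <g> on the FCI and its
  # Pearson correlation with SET ratings across courses.
  # ALL NUMBERS ARE HYPOTHETICAL ILLUSTRATIONS, not data from any study.
  import numpy as np

  pre = np.array([30.0, 45.0, 38.0, 50.0])     # class-average FCI pretest (%)
  post = np.array([48.0, 75.0, 70.0, 62.0])    # class-average FCI posttest (%)
  set_rating = np.array([4.2, 3.1, 3.4, 4.5])  # course-average SET (1-5 scale)

  # Normalized gain: actual gain divided by maximum possible gain
  g = (post - pre) / (100.0 - pre)

  # Pearson correlation of <g> with SET ratings across the courses
  r = np.corrcoef(g, set_rating)[0, 1]
  print("normalized gains:", np.round(g, 2), " r(g, SET) = %.2f" % r)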

Nevertheless, SET enthusiasts sometimes justify the use of SET's for gauging the cognitive impact of courses by citing measured correlations of "achievement" on course exams and final grades with SET ratings. But are such correlations significant with regard to what *should* be the primary concern of higher education: students' higher-order learning? [Shavelson & Huang (2003)].

In "The Physics Education Reform Effort: A Possible Model for Higher Education?" [Hake (2005)], I wrote [see that article for the references other than Hake (2002a)]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Investigation of the extent to which a paradigm shift from teaching to learning . . . [Barr & Tagg (1995)]. . . is taking place requires measurement of students' learning in college classrooms. But Wilbert McKeachie (1987) has pointed out that the time-honored gauge of student learning - COURSE EXAMS AND FINAL GRADES - TYPICALLY MEASURES LOWER-LEVEL EDUCATIONAL OBJECTIVES such as memory of facts and definitions rather than higher-level outcomes such as critical thinking and problem solving. The same criticism (Hake 2002a) as to assessing only lower-level learning applies to Student Evaluations of Teaching (SET's), since their primary justification as measures of student learning appears to lie in the modest correlations of overall course ratings (+0.47) and instructor ratings (+0.43) with "achievement" AS MEASURED BY COURSE EXAMS OR FINAL GRADES (Cohen 1981).
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
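
To put Cohen's correlations in perspective (the arithmetic here is mine, not part of the quote): the fraction of variance shared by two measures is r^2, so r = +0.47 gives r^2 = (0.47)^2 ≈ 0.22, and r = +0.43 gives r^2 = (0.43)^2 ≈ 0.18. Even taken at face value, then, SET ratings account for only about a fifth of the variance in exam-measured "achievement," and that achievement is itself, per McKeachie, largely lower-level.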

In response to Turner's post, David Marx replied on 26 Oct 2006 [slightly edited; my CAPS]:

MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
I recommend a look at [Seldin (2006)].

STUDENT EVALUATIONS ARE (surprisingly) VALID MEASURES OF TEACHING PERFORMANCE. There are a lot of misconceptions about student evals. This book includes studies of evals.
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM

SET's are a valid measure of teaching performance? The crucial question is "Valid for what?" In Hake (2002a) I wrote [see that article for references other than Hake & Swihart (1979) and Hake (2002b)]:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
I think SET's can be "valid" in the sense that they can be useful for gauging the *affective* impact of a course and for providing diagnostic feedback to *teachers* [see, e.g., Hake & Swihart (1979)] to assist them in making mid-course corrections. However, IMHO, SET's are NOT valid in their widespread use by *administrators* to gauge the cognitive impact of courses [see, e.g., Williams & Ceci (1997); Hake (2000; 2002c,d); Johnson (2002)]. In fact the gross misuse of SET's as gauges of student learning is, in my view, one of the institutional factors that thwarts substantive educational reform (Hake 2002b, Lesson #12).
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH


Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<rrhake@earthlink.net>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>



REFERENCES [Tiny URL's courtesy <http://tinyurl.com/create.php>.]
Hake, R.R. & J.C. Swihart. 1979. "Diagnostic Student Computerized Evaluation of Multicomponent Courses," Teaching and Learning, Vol. V, No. 3, Indiana University, January 1979 (updated 11/97); online at <http://www.physics.indiana.edu/~sdi/DISCOE2.pdf> (20 kB).

Hake, R.R. 2002a. "Re: Problems with Student Evaluations: Is Assessment the Remedy?" online at <http://www.physics.indiana.edu/~hake/AssessTheRem1.pdf>
(72 kB). Also online in HTML at <http://www.stu.ca/~hunt/hake.htm> as one of the many resources in Russ Hunt's annotated bibliography of articles and books on student evaluation of teaching <http://www.stu.ca/~hunt/evalbib.htm>. See also Hake (2006c,d,e,f,g).

Hake, R.R. 2002b. "Lessons from the physics education reform effort," Ecology and Society 5(2): 28; online at <http://www.ecologyandsociety.org/vol5/iss2/art28/>. Ecology and Society (formerly Conservation Ecology) is a free online "peer-reviewed journal of integrative science and fundamental policy research" with about 11,000 subscribers in about 108 countries.

Hake, R. R. 2005. "The Physics Education Reform Effort: A Possible Model for Higher Education?" online at <http://www.physics.indiana.edu/~hake/NTLF42.pdf> (100 kB). A slightly edited version of an article that was (a) published in the National Teaching and Learning Forum 15(1), December 2005, online to subscribers at <http://www.ntlf.com/FTPSite/issues/v15n1/physics.htm>, and (b) disseminated by the Tomorrow's Professor list <http://ctl.stanford.edu/Tomprof/postings.html> as Msg. 698 on 14 Feb 2006. For an executive summary see Hake (2006b).

Hake, R.R. 2006a. "Re: What if students learn better in a course they don't like?" online at <http://tinyurl.com/yfgu2g>. Post of 29 Jun 2006 14:27:59-0700.

Hake, R.R. 2006b. "A Possible Model For Higher Education: The Physics Reform Effort (Author's Executive Summary)," Spark (American Astronomical Society Newsletter), June, online at <http://www.aas.org/education/spark/SparkJune06.pdf> (1.9 MB). Scroll down about 4/5 of the way to the end of the newsletter.

Hake, R.R. 2006c. "SET's Are Not Valid Gauges of Teaching Performance," online at <http://tinyurl.com/rtfqw>. Post of 20 Jun 2006 07:50:58-0700. ABSTRACT: It is argued that if universities value teaching that leads to student higher-level learning, then student evaluations of teaching (SET's) do NOT afford valid evidence of teaching performance. Instead, institutions should consider the DIRECT measure of students' higher-level *domain-specific* learning through pre/post testing using (a) valid and consistently reliable tests *devised by disciplinary experts*, and (b) traditional courses as controls.

Hake, R.R. 2006d. "SET's Are Not Valid Gauges of Teaching Performance #2," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0606&L=pod&O=D&P=13806>. Post of 21 Jun 2006 21:12:19-0700. ABSTRACT: I respond, in order, to 12 points made by Michael Scriven in his thoughtful response to Hake (2006c).

Hake, R.R. 2006e. "SET's Are Not Valid Gauges of Teaching Performance #3," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0606&L=pod&O=D&P=14961>. Post of 25 Jun 2006 20:58:34-0700. ABSTRACT: I respond, in order, to 5 points made by Wilbert (Bill) McKeachie in his thoughtful response to Hake (2006c).

Hake, R.R. 2006f. "SET's Are Not Valid Gauges of Teaching Performance #4," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0606&L=pod&P=R15773&I=-3>. Post of 27 Jun 2006 17:24:18-0700. ABSTRACT: I respond, in order, to 6 points made by Michael Theall in his response to Hake (2006c).

Hake, R.R. 2006g. "Re: Adjunct Faculty: Improving Results," online at <http://lsv.uky.edu/scripts/wa.exe?A2=ind0607&L=ASSESS&P=R2&I=-3>. Post of 3 Jul 2006 17:00:06-0700. ABSTRACT: I respond, in order, to 20 points made by Dan Tompkins in his ASSESS posts of 7-29 June 2006, titled "Re: Adjunct Faculty: Improving Results," "Re: SET's Are Not Valid Gauges of Teaching Performance," and "Re: What if students learn better in a course they don't like?" A subtitle might be "Is It Possible to Construct a 'Philosophy Concept Test' of students' higher-level learning?"

Hestenes, D., M. Wells, & G. Swackhamer. 1992. "Force Concept Inventory," Phys. Teach. 30: 141-158; online (except for the test itself) at <http://modeling.asu.edu/R&E/Research.html>. The 1995 revision by Halloun, Hake, Mosca, & Hestenes is online (password protected) at the same URL, and is available in English, Spanish, German, Malaysian, Chinese, Finnish, French, Turkish, Swedish, and Russian.

Pallett, W. 2006. "Uses and Abuses of Student Ratings," Chapter 4 of Seldin (2006). At <http://www.ankerpub.com/SeldinEFP-Preface.pdf> it is stated that "William Pallett examines the uses and abuses of student ratings. He contends that they are a valuable resource but should count just 30% to 50% in the overall evaluation of teaching, that such ratings can serve multiple purposes, that administrators sometimes make too much of too little difference in ratings, and that student rating results should be categorized into no more than three to five groups."

Reis, R. 2006. "Uses and Abuses of Student Ratings," Tomorrow's Professor Msg. #756. This posting looks at Pallett (2006). Free subscriptions to "Tomorrow's Professor" are available at <https://mailman.stanford.edu/mailman/listinfo/tomorrows-professor>.

Seldin, P., ed. 2006. "Evaluating Faculty Performance: A Practical Guide to Assessing Teaching, Research, and Service," Anker Publishing. Anker information at <http://tinyurl.com/y7bffv>. [Amazon and Barnes & Noble appear to be aware of only the 1999 version of this book, while Anker gives no indication that previous editions exist.]

Shavelson, R.J. & L. Huang. 2003. "Responding Responsibly To the Frenzy to Assess Learning in Higher Education," Change Magazine, January/February; online at <http://www.stanford.edu/dept/SUSE/SEAL/>. See the first "highlight."