
[Phys-l] SET's Are Not Valid Gauges of Teaching Performance



If you reply to this long (23 kB) post, please don't hit the reply button unless you first prune the quoted copy of this post down to a few relevant lines; otherwise the entire already-archived post may be needlessly resent to subscribers.

******************************************
ABSTRACT: It is argued that if universities value teaching that leads to student higher-level learning, then student evaluations of teaching (SET's) do NOT afford valid evidence of teaching performance. Instead, institutions should consider the DIRECT measure of students' higher-level *domain-specific* learning through pre/post testing using (a) valid and consistently reliable tests *devised by disciplinary experts*, and (b) traditional courses as controls.
******************************************

In response to an ASSESS/POD post of 7 Jun 2006 by Richard Lyons (2006) titled "Adjunct Faculty: Improving Results," Dan Tompkins (2006), in his ASSESS post of 7 Jun 2006 08:37:20 -0400, wrote [bracketed by lines "TTTTTTTTT. . . . ."; my CAPS]:

TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
In this context, it might be of interest to this list that we've begun a study of adjunct teaching effectiveness at Temple University. Assessing performance of large cohorts of faculty is always a challenge, but I think this may have a pay-off. Basically, we looked at teacher performance over a full year in two large programs, USING THE QUESTIONS ON OUR UNIVERSITY-WIDE STUDENT EVALUATION FORMS FOR WHICH KENNETH FELDMAN HAD DEMONSTRATED CORRELATIONS WITH "STUDENT ACHIEVEMENT," I.E. STUDENT LEARNING. Our sample included over 500 sections, so individual anomalies didn't play much of a role.
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
One conclusion is that conditions of work play a huge role in affecting performance. This should not seem a surprise, but for the anecdotal claims one hears that "adjuncts teach as well as anyone else." They can, but they don't, always, and such claims require the immediate response: "what's your evidence?"
It is very interesting that student evaluations, which can be an instrument of surveillance and control, also enable analysis that can be used for progressive ends. Any institution can probably do something like it.
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

In my opinion IF institutions value teaching that leads to student higher-level learning [see, e.g., Shavelson & Huang (2003), Anderson & Krathwohl (2001)] then student evaluations, by themselves, do NOT afford valid evidence of teaching effectiveness for either adjunct or regular faculty, as I have argued in "The Physics Education Reform Effort: A Possible Model for Higher Education" [Hake (2005)]. There I wrote:

"Investigation of the extent to which a paradigm shift from teaching to learning . . . [Barr & Tagg (1995)]. . . is taking place requires measurement of students' learning in college classrooms. But Wilbert McKeachie (1987) has pointed out that the time-honored gauge of student learning - course exams and final grades - typically measures lower-level educational objectives such as memory of facts and definitions rather than higher-level outcomes such as critical thinking and problem solving. The same criticism (Hake 2002) as to assessing only lower-level learning applies to Student Evaluations of Teaching (SET's), since their primary justification as measures of student learning appears to lie in the modest correlation with overall ratings of course (+ 0.47) and instructor (+ 0.43) with "achievement" **as measured by course exams or final grades** (Cohen 1981)."

How then can we measure students' higher-level learning in college courses? In Hake (2005) I advocate the DIRECT measure of students' higher-level *domain-specific* learning through pre/post testing using (a) valid and consistently reliable tests *devised by disciplinary experts*, and (b) traditional courses as controls.
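
For concreteness, here is a minimal sketch (my own illustration, not part of any study cited above, and with made-up scores) of how class-average pre/post percentage scores on such a test might be reduced to the average normalized gain <g> = (%post - %pre)/(100 - %pre) commonly used in physics-education research, with a traditional section serving as the control:

   # Minimal sketch (illustration only): class-average normalized gain <g>
   # from pre/post percentage scores, with a traditional section as control.
   def normalized_gain(pre_percent, post_percent):
       """<g> = (post - pre) / (100 - pre), from class-average percent scores."""
       if pre_percent >= 100:
           raise ValueError("pre-test average must be below 100%")
       return (post_percent - pre_percent) / (100.0 - pre_percent)

   sections = {  # hypothetical class-average scores (percent correct)
       "traditional (control)": (45.0, 56.0),
       "interactive engagement": (44.0, 76.0),
   }
   for name, (pre, post) in sections.items():
       g = normalized_gain(pre, post)
       print(f"{name}: pre = {pre:.0f}%, post = {post:.0f}%, <g> = {g:.2f}")

Run as is, the sketch gives <g> of about 0.20 for the control section and 0.57 for the interactive-engagement section - the kind of contrast that pre/post testing with a traditional-course control is designed to reveal.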

Such pre/post testing of cognitive outcomes in no way implies that the *affective* impact of courses [Krathwohl et al. (1990)] as gauged, say, by student evaluations, is unimportant. As emphasized by, e.g., Marian Diamond (1988), Bob Leamnson (1999), and Ed Nuhfer (2005), the affective and the cognitive are inextricably linked.

Unfortunately, formative pre/post testing, pioneered by economists (Paden & Moyer 1969) and physicists (Halloun & Hestenes 1985a,b), is rarely employed in higher education, in part because of the tired old canonical objections recently lodged by Suskie (2004) and countered by Hake (2004a; 2006a,b), Scriven (2004), Zumbo (1999) . . . [and more recently Nuhfer (2006)]. . . . Despite the naysayers, pre/post testing is gradually gaining a foothold in introductory astronomy, biology, chemistry, computer science, economics, engineering, and physics courses (see Hake 2004b for references).

It should be emphasized that such low-stakes formative pre/post testing is the polar opposite of the high-stakes summative testing mandated by the U.S. Department of Education's No Child Left Behind Act for K-12 (USDE 2005a) that is now contemplated for higher education (USDE 2005b). As the NCLB experience shows, such testing often falls victim to "Campbell's Law" (Campbell 1975, Nichols & Berliner 2005):

"The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."

I see no reason that student learning gains far larger than those in traditional courses could not eventually be achieved and documented in disciplines other than physics, from arts through philosophy to zoology IF their practitioners would (a) reach a consensus on the *crucial* concepts that all beginning students should be brought to understand, (b) undertake the lengthy qualitative and quantitative research required to develop tests of higher-level learning of those concepts, so as to gauge the need for and effects of non-traditional pedagogy, and (c) develop interactive engagement methods suitable to their disciplines.

Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<rrhake@earthlink.net>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>

"What we assess is what we value. We get what we assess, and if we
don't assess it, we won't get it."
Lauren Resnick [quoted by Grant Wiggins (1990)]


REFERENCES [Tiny URL's courtesy <http://tinyurl.com/create.php>.]
Anderson, L.W. & L.A. Sosniak, eds. 1994. "Bloom's Taxonomy: A Forty-Year Retrospective," Ninety-Third Yearbook of The National Society for the Study of Education, Univ. of Chicago Press. Amazon.com information at
<http://tinyurl.com/7bcnm>.

Anderson, L.W. & D. Krathwohl, eds. 2001. "A Taxonomy for Learning, Teaching and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives." Addison Wesley Longman. See also Anderson & Sosniak (1994). The original 1956 Bloom et al. cognitive domain taxonomy has been updated to include important post-1956 advances in cognitive science - see especially Chapters 4 & 5 on the Knowledge and Cognitive Process Dimensions. See also the companion affective taxonomy by Krathwohl et al. (1990). Amazon.com information at <http://tinyurl.com/dh229>.

Barr, R.B. & J. Tagg. 1995. "From Teaching to Learning: A New Paradigm for Undergraduate Education," Change 27(6): 13-25, November/December. Reprinted in D. Dezure, Learning from Change: Landmarks in Teaching and Learning in Higher Education from Change 1969-1999. American Association for Higher Education, pp. 198-200. Also online at <http://tinyurl.com/8g6r4>.

Campbell, D. T. 1975. "Assessing the impact of planned social change," in G. Lyons, ed., Social research and public policies: The Dartmouth/OECD Conference, Chapter 1, pp. 3-45. Dartmouth College Public Affairs Center, p. 35; online at <http://www.wmich.edu/evalctr/pubs/ops/ops08.pdf> (196 kB).

Cohen, P.A. 1981. "Student ratings of Instruction and Student Achievement: A Meta-analysis of Multisection Validity Studies," Review of Educational Research 51: 281. For references to Cohen's 1986 and 1987 updates see Feldman (1989).

Diamond, M.C. 1988. "Enriching Heredity (Impact of the Environment on Brain Development)." Free Press. Amazon.com information at <http://tinyurl.com/pa65j> where the book is misattributed to "Dan Diamond."

Diamond, M.C. 1993. "Hearts, brains, and education: A new alliance for science curriculum," in "Higher Learning in America: 1980-2000," A. Levine ed., pp. 273-283. Johns Hopkins University Press. Amazon.com information at <http://tinyurl.com/h99cz>.

Feldman, K.A. 1989. "The Association Between Student Ratings of Specific Instructional Dimensions and Student Achievement: Refining and Extending the Synthesis of Data from Multisection Validity Studies," Research in Higher Education 30: 583.

Hake, R.R. 2002. "Re: Problems with Student Evaluations: Is Assessment the Remedy?" online at <http://www.physics.indiana.edu/~hake/AssessTheRem1.pdf> (72 kB).

Hake, R.R. 2004a. "Re: pre-post testing in assessment," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0408&L=pod&P=R9135&I=-3>. Post of 19 Aug 2004 13:56:07-0700 to POD.

Hake, R.R. 2004b. "Re: Measuring Content Knowledge," POD posts of 14 &15 Mar 2004, online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0403&L=pod&P=R13279&I=-3> and
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0403&L=pod&P=R13963&I=-3>.

Hake, R. R. 2005. "The Physics Education Reform Effort: A Possible Model for Higher Education," online at
<http://www.physics.indiana.edu/~hake/NTLF42.pdf> (100 kB). This is a
slightly edited version of an article that was (a) published in the National Teaching and Learning Forum 15(1), December 2005, online to subscribers at
<http://www.ntlf.com/FTPSite/issues/v15n1/physics.htm>, and (b) disseminated by the Tomorrow's Professor list
<http://ctl.stanford.edu/Tomprof/postings.html> as Msg. 698 on 14 Feb 2006.

Hake, R.R. 2006a. "Should We Measure Change? YES!" online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0603&L=pod&P=R17226&I=-3> Post of 24 Mar 2006 10:49:00-0800 to AERA-C, AERA-D, AERA-J, AERA-L, ASSESS, ARN-L, EDDRA, EvalTalk, EdStat, MULTILEVEL, PsychTeacher (rejected), PhysLrnR, POD, SEMNET, STLHE-L, TeachingEdPsych, & TIPS.

Hake, R.R. 2006b. "Possible Palliatives for Paralyzing Pre/Post Paranoia," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0606&L=pod&F=&S=&P=3851>, Post of 6 Jun 2006 to AERA-D, ASSESS, EvalTalk, PhysLrnR, and POD. Abstract only sent to AERA-A, AERA-B, AERA-C, AERA-J, AERA-L, Biolab, Biopi-L, Chemed-L, EdStat, IFETS, ITFORUM, RUME, Phys-L, Physhare, PsychTeacher (rejected), TeachingEdPsych, & TIPS.

Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of college physics students," Am. J. Phys. 53: 1043-1055; online at
<http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics Diagnostic" test (omitted from the online version), precursor to the widely used "Force Concept Inventory" [Hestenes et al. (1992)].

Halloun, I. & D. Hestenes. 1985b. "Common sense concepts about motion," Am. J. Phys. 53: 1056-1065; online at
<http://modeling.asu.edu/R&E/Research.html>.

Hestenes, D., M. Wells, & G. Swackhamer, 1992. "Force Concept Inventory," Phys. Teach. 30: 141-158; online (except for the test itself) at
<http://modeling.asu.edu/R&E/Research.html>. The 1995 revision by Halloun, Hake, Mosca, & Hestenes is online (password protected) at the same URL, and is available in English, Spanish, German, Malaysian, Chinese, Finnish, French, Turkish, Swedish, and Russian.

Krathwohl, D.R., B.B. Masia, with B.S. Bloom. 1990. "Taxonomy of Educational Objectives, Book 2: Affective Domain." Longman. Amazon.com information at <http://tinyurl.com/bh6tc>.

Leamnson, R. 1999. "Thinking About Teaching and Learning: Developing Habits of Learning with First Year College and University Students." Stylus. Amazon.com information at <http://tinyurl.com/d38ar>.

Lyons, R. 2006. "Adjunct Faculty: Improving Results," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0606&L=pod&F=&S=&P=3979>. Post of 7 Jun 2006 to POD and ASSESS.

McKeachie, W.J. 1987. "Instructional evaluation: Current issues and possible improvements," Journal of Higher Education 58(3): 344-350.

Nichols, S.L & D.C. Berliner. 2005. "The Inevitable Corruption of Indicators and Educators Through High-Stakes Testing," Arizona State Univ. Education Policy Studies Laboratory, online at <http://tinyurl.com/7butg> (1.7 MB).

Nuhfer, E. 2005. "DeBono's Red Hat on Krathwohl's Head: Irrational Means to Rational Ends - More Fractal Thoughts on the Forbidden Affective: Educating in Fractal Patterns XIII." National Teaching and Learning Forum 14(5), online to subscribers at <http://www.ntlf.com/FTPSite/issues/v14n5/diary.htm>.

Nuhfer, E. 2006. "A Fractal Thinker Looks at Measuring Change: Part 1: Pre-Post Course Tests and Multiple Working Hypotheses- Educating in Fractal Patterns XVI," National Teaching and Learning Forum, 15(4), May. Online to subscribers at <http://www.ntlf.com/>. If your institution doesn't have a subscription, IMHO it should.

Paden, D.W. & M.E. Moyer. 1969. "The Relative Effectiveness of Teaching Principles of Economics," Journal of Economic Education 1: 33-45.

Scriven, M. 2004. "Re: pre- post testing in assessment," AERA-D post of 15 Sept 2004 19:27:14-0400; online at <http://tinyurl.com/942u8>.

Shavelson, R.J. & L. Huang. 2003. "Responding Responsibly To the Frenzy to Assess Learning in Higher Education," Change Magazine, January/February; online at <http://www.stanford.edu/dept/SUSE/SEAL/> // "Reports/Papers" scroll to "Higher Education," where "//" means "click on."

Steen, L.A., ed. 1992. "Heeding the Call for Change: Suggestions for Curricular Action," Mathematical Association of America, pp. 150-162. Amazon.com information at <http://tinyurl.com/gcr37>.

Suskie, L. 2004. "Re: pre- post testing in assessment," ASSESS post 19 Aug 2004 08:19:53-0400; online at <http://tinyurl.com/akz23>.

Tompkins, D. 2006. "Re: Adjunct Faculty: Improving Results," ASSESS post of 7 Jun 2006 08:37:20 -0400; online at <http://lsv.uky.edu/scripts/wa.exe?A2=ind0606&L=assess&T=0&F=&S=&X=4F72394426167D9629&Y=rrhake%40earthlink.net&P=1435>, or more compactly at <http://tinyurl.com/my4zk>.

USDE. 2005a. U.S. Department of Education, No Child Left Behind Act, online at <http://www.ed.gov/nclb/landing.jhtml?src=pb>.

USDE. 2005b. U.S. Dept. of Education, "Secretary Spellings Announces New Commission on the Future of Higher Education," press release online at
<http://tinyurl.com/cxgfz>: "Spellings noted that the achievement gap is closing and test scores are rising among our nation's younger students, due largely to the high standards and accountability measures called for by the No Child Left Behind Act. More and more students are going to graduate ready for the challenges of college, she said, and we must make sure our higher education system is accessible and affordable for all these students."

Wiggins, G. 1990. "The Truth May Make You Free, but the Test May Keep You Imprisoned: Toward Assessment Worthy of the Liberal Arts." The AAHE Assessment Forum, pp. 17-31. [Reprinted in Steen (1992).]

Zumbo, B. D. 1999. "The simple difference score as an inherently poor measure of change: Some reality, much mythology," in Bruce Thompson, ed., "Advances in Social Science Methodology," Volume 5, pp. 269-304. JAI Press; online at
<http://educ.ubc.ca/faculty/zumbo/papers/Zumbo_1999_Difference_Score.pdf> (2.2 MB), or more compactly at <http://tinyurl.com/kuf3t>.