The Tail Wagging the Dog: An Overdue Examination of Student Teaching Evaluations
2020
This study sought to determine what constitutes a minimally meaningful difference in student evaluations of professors when students rate them on the traditional 5-point teaching effectiveness item commonly used in higher education. A minimally meaningful difference is the smallest difference between two ratings that (1) exceeds chance variation and (2) corresponds to a difference deemed meaningful by some external anchor or standard. Data were obtained through a series of surveys given to students at Butler University and to an online nationwide sample. Analyses used both an anchor-based approach, drawing on data from a single survey, and a distribution-based method, drawing on data from two surveys administered two weeks apart. A meaningful difference of 0.84 was found when participants were asked to distinguish between two professors of higher quality, and a meaningful difference of 0.75 when they were asked to distinguish between two professors of lower quality. Both differences exceeded chance variation.
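The distribution-based method described above can be sketched in code. The snippet below is a minimal illustration, not the authors' actual analysis: it uses hypothetical test-retest ratings and the standard error of measurement (SEM) to estimate the smallest difference between two scores that exceeds chance variation at the 95% level.

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation, used here as a test-retest reliability estimate."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )

def distribution_based_mid(ratings_t1, ratings_t2):
    """Distribution-based threshold: the smallest difference between two
    ratings that exceeds chance variation, estimated from the SEM of
    ratings collected twice from the same respondents."""
    sd = statistics.stdev(ratings_t1)
    r = pearson(ratings_t1, ratings_t2)  # test-retest reliability
    sem = sd * math.sqrt(1 - r)
    # 1.96 * sqrt(2) * SEM bounds the chance variation of a difference
    # between two independent scores at the 95% level
    return 1.96 * math.sqrt(2) * sem

# Hypothetical 5-point ratings from the same students two weeks apart
t1 = [4, 5, 3, 4, 4, 5, 2, 4, 3, 5]
t2 = [4, 4, 3, 5, 4, 5, 3, 4, 3, 5]
print(round(distribution_based_mid(t1, t2), 2))
```

With real survey data, a rating gap smaller than this threshold cannot be distinguished from measurement noise; the anchor-based step then checks whether differences above it are also judged meaningful.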
Professor, Student, and Course Attributes that Contribute to Successful Teaching Evaluations
Michael J. Seiler is an Assistant Professor of Finance and Vicky L. Seiler is an Assistant Professor of Marketing at Hawaii Pacific University, Honolulu, HI 96813. Dalen Chiang is a Professor at Cleveland State University. This study examines eight professor, student, and course attributes that affect four specific areas of teaching evaluations. All eight attributes significantly affect at least one of the four groups of student evaluation of teaching (SET) questions. The extant literature has previously ignored the fact that more than one factor exists, which has resulted in contradictory or inconclusive findings. Our study uses a more sophisticated methodology that allows for the delineation of these intricate relationships. As a result, clearer and more robust results emerge. [JEL: I20, I22, A00]
Student evaluations of teaching: teaching quantitative courses can be hazardous to one’s career
PeerJ
Anonymous student evaluations of teaching (SETs) are used by colleges and universities to measure teaching effectiveness and to make decisions about faculty hiring, firing, re-appointment, promotion, tenure, and merit pay. Although numerous studies have found that SETs correlate with various teaching effectiveness irrelevant factors (TEIFs) such as subject, class size, and grading standards, it has been argued that such correlations are small and do not undermine the validity of SETs as measures of professors' teaching effectiveness. However, previous research has generally used inappropriate parametric statistics and effect sizes to examine and evaluate the significance of TEIFs for personnel decisions. Accordingly, we examined the influence of quantitative vs. non-quantitative courses on SET ratings and SET-based personnel decisions using 14,872 publicly posted class evaluations, where each evaluation represents a summary of SET ratings provided by individual students responding ...
Student Evaluations of Teaching (Mostly) Do Not Measure Teaching Effectiveness
ScienceOpen Research
Student evaluations of teaching (SET) are widely used in academic personnel decisions as a measure of teaching effectiveness. We show: SET are biased against female instructors by an amount that is large and statistically significant; the bias affects how students rate even putatively objective aspects of teaching, such as how promptly assignments are graded; the bias varies by discipline and by student gender, among other things; it is not possible to adjust for the bias, because it depends on so many factors; SET are more sensitive to students' gender bias and grade expectations than they are to teaching effectiveness; and gender biases can be large enough to cause more effective instructors to get lower SET than less effective instructors. These findings are based on nonparametric statistical tests applied to two datasets: 23,001 SET of 379 instructors by 4,423 students in six mandatory first-year courses in a five-year natural experiment at a French university, and 43 SET for four sec...
Student evaluations of teaching are ubiquitous and consequential for the careers of college teachers. However, there is limited empirical research documenting how accurately people interpret teaching evaluations. The current research consisted of three studies documenting the effect of small mean differences in teaching evaluations on judgements about teachers. Differences in means small enough to be within the margin of error significantly impacted faculty members' assignment of merit-based rewards (Study 1), department heads' evaluation of teaching techniques (Study 2), and faculty members' evaluation of specific teaching skills (Study 3). The results suggest that faculty and administrators do not apply appropriate statistical principles when evaluating teaching evaluations and instead use a general heuristic that higher evaluations are better.
William & Mary Journal of Women and the …, 2006
Research on student teaching evaluations is vast. An examination of this research reveals wide disagreement but also substantial consensus for the proposition that student evaluations should be used only with extreme care, if at all, in making personnel decisions. Administrators use teaching evaluations for personnel decisions for a number of reasons. The literature, however, is virtually unanimous in its condemnation of norming student evaluations in order to rank classroom performances. Current cases on academic freedom indicate some retrenchment by the Circuits from broader pronouncements in earlier Supreme Court cases. This paper concludes that the use of non-validated student evaluations alone, without any other criteria for teaching effectiveness, raises substantial problems in faculty retention and promotion decisions. It also suggests that such an approach might, in the right case, violate academic freedom and the First Amendment.
1975
Much has occurred in higher education to lead to expectations of change in the process of evaluation. The academic deans of all accredited private liberal arts colleges were asked to report on the procedures used in rating both overall and teaching performance, with 83.5 percent replying. Purposes of the questionnaire were to: (1) determine the relative level of importance placed on classroom teaching in the evaluation of overall performance of faculty members; (2) determine the types of information upon which evaluation of teaching performance is based; and (3) compare faculty evaluation policies and procedures during the contractions of the 1970's with the expansion of the mid-1960's. Important findings included: (1) significant declines in importance between 1966 and 1973 were recorded for research and publication; (2) campus committee work and student advising increased sharply as "major factors" in faculty evaluations; (3) in evaluating teaching performance, the deans' reliance on "systematic student ratings" increased significantly; and (4) whereas the importance of chairman's and dean's evaluations retained a prominent position in teaching evaluation, committee evaluation increased in prominence. Specific recommendations and an extensive bibliography are included in the publication.
Student evaluations of teaching are among the most accepted and important indicators of college teachers' performance. However, faculty and administrators can overinterpret small variations in mean teaching evaluations. The current research examined the effect of including statistical information on the interpretation of teaching evaluations. Study 1 (N = 121) showed that faculty members interpreted small differences between mean course evaluations even when confidence intervals and statistical tests indicated the absence of meaningful differences. Study 2 (N = 183) showed that differences labeled as nonsignificant still influenced perceptions of teaching qualifications and teaching ability. The results suggest the need for increased emphasis on the use of statistics when presenting and interpreting teaching evaluation data.
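The confidence-interval check that faculty in these studies tended to neglect can be sketched as follows. This is an illustrative example with hypothetical summary statistics, not data from the studies; it uses a normal approximation for the difference between two independent means.

```python
import math

def diff_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """95% confidence interval for the difference between two independent
    mean course evaluations (normal approximation)."""
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    d = mean_a - mean_b
    return d - z * se, d + z * se

# Hypothetical 5-point evaluation summaries for two instructors:
# a 0.2-point gap in means with typical class sizes
low, high = diff_ci(4.3, 0.8, 30, 4.1, 0.9, 25)
significant = not (low <= 0 <= high)
print(f"difference CI: ({low:.2f}, {high:.2f}), significant: {significant}")
```

Because the interval for this 0.2-point gap contains zero, the two instructors cannot be distinguished from one another, which is exactly the kind of difference the studies above found faculty treating as meaningful.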
Appraising Teaching Effectiveness: Beyond Student Ratings
2000
Evaluating faculty effectiveness is important in institutions of higher education. Although evaluation is inherently threatening to most faculty members, the vast majority take their assignments seriously and want to conduct them as effectively as possible. Assessing faculty performance is a complex and time-consuming process. If it is done poorly or insensitively, it can have an adverse effect on institutional quality. Whether or not individual institutions elect to commit the resources required for valid evaluations depends on the degree to which they agree with these propositions: (1) all members of the institution should be accountable for their activities and performance; (2) the conduct and use of credible evaluation programs have an important influence on the welfare and future excellence of the individual, the department, and the institution; and (3) when improvement efforts are supported by institutional policy and guided by comprehensive and valid appraisals of current functioning, the well-being of the individual and the institution are affected positively.
Universal Journal of Educational Research, 2019
Course evaluation by students has been widely used to produce essential information for university administrators and faculty in assessing instruction quality. This study examined non-modifiable factors possibly related to student evaluation outcomes by analyzing quantitative data from 259 course evaluations in a teachers college at a Midwest state university. Findings of multiple regression and univariate statistical analyses suggested that the mean score of students' ratings in a course was associated with the rank of the faculty member who taught the course and with students' response rate on the course survey. Courses taught by higher-ranking faculty and rated by a lower percentage of students tended to have lower mean evaluation scores. The mean score of students' evaluation in a course was not correlated with the other variables: faculty gender, course level, course delivery method, and class size. These findings have broader implications for faculty teaching evaluation policy and for course evaluation practices.
Journal of College Teaching & Learning (TLC), 2011
The present study sought to examine the justification of faculty claims that students' assessments of faculty performance are biased by external factors unrelated to the quality of teaching. Specifically, we examined the hypothesis that lecturer rankings correlate with the grades given by lecturers and with background variables. The research combines three levels of analysis (faculty, course, and lecturer) with statistical modeling, creating a more complete picture of reality and thereby offering an answer to one of the most classical questions in the research literature. Findings of this study indicate that the alleged correlation between students' grades and lecturer rankings is non-existent, nothing but a myth amongst the academic body. However, the research still points to additional elements beyond the effectiveness of teaching as we tap into diffe...
Student evaluations of college professors: Are female and male professors rated differently?
Journal of Educational Psychology, 1987
Over 1,000 male and female college students of 16 male and female professors (matched for course division, years of teaching, and tenure status) evaluated their instructors in terms of teaching effectiveness and sex-typed characteristics. Male students gave female professors significantly poorer ratings than they gave male professors on the six teaching evaluation measures; their ratings of female professors were poorer than those of female students on four of the six measures. Female students also evaluated female professors less favorably than male professors on three measures. Student perceptions of a professor's instrumental/active and expressive/nurturant traits, which were positively related to student ratings of teaching, accounted for only a few of these gender-related effects. Student major and student class standing also played a role in the evaluation of professors. The importance of gender variables in teacher evaluation studies is discussed, and implications for future research are noted. Research since the 1960s has documented prejudice against women, particularly if women violate gender stereotypes, for example, by having gender-atypical characteristics or by participating in gender-atypical professions (Etaugh & Riley, 1983; Paludi & Bauer, 1983). Because college teaching is considered a high-status male occupation (Touhey, 1974) and because evaluations made by others influence advancement in such a career, it is important to determine if any biases exist in the evaluation of college professors. Most investigations of bias in the evaluation of professors have produced conflicting results. However, two variables that appear to be important are professor sex and professor sex typing. Although some studies have found relatively few or no differences in the evaluations of male and female professors on the basis of sex alone (
Journal of Personnel Evaluation in Education, 2005
Twenty female and 23 male professors at a liberal arts college participated along with their 803 undergraduate students in a questionnaire study of the effects of professor gender, student gender, and divisional affiliation on student ratings of professors and professor self-ratings. Students rated their professors on 26 questions tapping five teaching factors as well as overall teaching effectiveness. Professors rated themselves on the same questions as well as on nine exploratory ones. On student ratings, there were main effects for both professor gender (female professors were rated higher than male professors on the two interpersonal factors) and division (natural science courses were rated lowest on most factors). These patterns were qualified by significant interactions between professor gender and division. Although professor self-ratings varied by division, there were few significant correlations between professor self-ratings and students' ratings. Implications for future research are discussed. Keywords Student ratings. College professors. Gender. Teaching. Divisional affiliation Because of their importance for employment-related decisions, the validity of student ratings of college teaching has been of great concern. Despite research demonstrating validity through comparing academic performance of students in multiple-section courses on common examinations (higher-rated professors have students who perform better; d'Apollonia & Abrami, 1997), there have been troubling demonstrations of possible biasing factors, such as significant correlations of student ratings with expected course grade (but not actual grade) (Greenwald & Gillmore, 1997).
Educational Assessment, Evaluation and Accountability, 2011
The use of student evaluation of teaching (SET) to evaluate and improve teaching is widespread amongst institutions of higher education. Many authors have searched for a conclusive understanding about the influence of student, course, and teacher characteristics on SET. One hotly debated discussion concerns the interpretation of the positive and statistically significant relationship that has been found between course grades and SET scores. In addition to reviewing the literature, the main purpose of the present study is to examine the influence of course grades and other characteristics of students, courses, and teachers on SET. Data from 1244 evaluations were collected using the SET-37 instrument and analyzed by means of cross-classified multilevel models. The results show positive significant relationships between course grades, class attendance, the examination period in which students receive their highest course grades, and the SET score. These relationships, however, are subject to different interpretations. Future research should focus on providing a definitive and empirically supported interpretation for these relationships. In the absence of such an interpretation, it will remain unclear whether these relationships offer proof of the validity of SET or whether they are a biasing factor.
Student evaluations of teaching: perceptions of faculty based on gender, position, and rank
Teaching in Higher Education, 2010
The current study explores the feelings and thoughts that faculty have about their student evaluations of teaching (SET). To assess the perceptions of SETs, all teaching faculty in one college at a western Land Grant University were asked to complete an anonymous online survey. The survey included demographic questions (i.e. gender; rank such as assistant, associate, and full professor; and positions like non-tenure track, tenure track, and tenured) as well as questions related to faculty's feelings while reading their SETs. While minimal differences were found in responses based on rank or position, several differences were found based on faculty gender. Overall, female faculty appear to be more negatively impacted by student evaluations than male faculty. These gender differences support previous research that suggests males and females receive and react differently to personal evaluation. Resultant suggestions include modifying surveys from anonymous to confidential and offering professional development training for faculty.
FACULTY FORUM Factors Influencing Teaching Evaluations in Higher Education
Past research indicates several factors influence the teaching evaluation ratings instructors receive. We analyzed teaching evaluations from psychology courses during the fall and spring semesters of 2003-2004 to determine if class size, class level, instructor gender, number of publications (faculty instructors), average grade given by the instructor, and instructor rank predicted teaching evaluation ratings. Entering predictor variables into a multiple regression analysis concurrently, results indicated that only average grade given and instructor rank significantly predicted instructor ratings. Specifically, higher average grades given by the instructor predicted higher ratings, and graduate teaching assistants received higher overall ratings than faculty instructors.