The need for an evolving concept of validity in industrial and personnel psychology: Psychometric, legal, and emerging issues

APA handbook of testing and assessment in psychology, Vol. 1: Test theory and testing and assessment in industrial and organizational psychology

American Psychological Association eBooks, 2013

Previous chapters in Part I of this volume have provided comprehensive coverage of issues critical to test quality, including validity, reliability, sensitivity to change, and the consensus standards associated with psychological and educational testing. In-depth analysis of test quality represents a daunting area of scholarship that many test users may understand only partially. The complexity of modern test construction procedures creates ethical challenges for users, who are ultimately responsible for the appropriate use and interpretation of the tests they administer. Concerns over the ethical use of tests intensify as the stakes associated with the results grow. Test manuals should provide detailed information regarding test score validity and applicability for specific purposes. However, translating that technical information into usable guidance may be difficult for test developers, who may be more expert at psychometric research than at making their findings accessible to the general reader. Additionally, the information presented in test manuals can be accurate and understandable yet insufficient for appropriate evaluation by users. Fortunately, several sources of review and technical critique are available for most commercially available tests. These critiques are independent of the tests' authors and publishers. Although test authors and publishers are required by current standards to make vital psychometric information available, experts may be necessary as translators and evaluators of such information. The most well-established source of test reviews is the Buros Institute of Mental Measurements. Established more…

Themes and Variations in Validity Theory

Educational Measurement: Issues and Practice, 2005

Should the Standards reflect the perspective that construct validity is central to all validation efforts? Is the construct-/content-/criterion-related categorization of validity evidence now obsolete? Should the definition of validity include consideration of the consequences of test use? As most of the readers of this journal know, the 1985 Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1985) is under revision. Deciding whether and how to revise the characterization of validity is probably the most significant issue facing the authors of the revised Standards. There is a substantial disjunction between the way validity is characterized in the 1985 Standards and in the published work of many who write about the philosophy of validity. While there are dominant themes that run throughout characterizations of validity in the Standards and elsewhere, there are also substantial variations, both with respect to the boundaries of validity (how it is delimited from other concepts) and with respect to its components (how it is analyzed into aspects, constituent parts, or processes to guide validity research). The purpose of this article is to highlight some of the major questions facing the measurement community in deciding how to conceptualize validity in the revised Standards and thereby to encourage dialogue about how these issues might be resolved. Choices made with respect to the boundaries and components of validity are not just topics for the seminar room; they influence the way validity research is carried out, the responsibilities of assessment developers and users, and the rights of those who are assessed.

Emerging/Evolving Views of the Meaning of Score Validity

1998

The American Psychological Association, in the late 1940s, began work to establish a code of ethics to include and address the needs of members in scientific and applied fields. Out of the ethics work emerged a set of standards for evaluating psychological tests. Four categories, or types of validity, were identified: content, predictive, concurrent, and construct. In subsequent years, predictive and concurrent were combined in a single category labeled criterion validity. The resulting three categories of validity, sometimes called the holy trinity, having survived nearly 40 years of use, are now entrenched concepts in test construction and evaluation. Current trends in the conceptualization of test validity dismiss these three categories as separate entities, but old habits die hard, as apparently do old ideas. This paper reviews the emergence of validity as a unitarian concept, and discusses current views with particular attention to consequential validity. The current theory tha...

On Validity Theory and Test Validation

Educational Researcher, 2007

A recent article proposes a new framework for conceptualizing test validity that separates analysis of test properties from analysis of the construct measured. In response, the author of this article reviews fundamental characteristics of test validity, drawing largely from seminal writings as well as from the accepted standards.

Focus on psychometrics. The multitrait–multimethod approach to construct validity

Research in Nursing & Health, 1991

The multitrait-multimethod matrix approach proposed by Campbell and Fiske was an important contribution to our understanding of the nature of validation procedures. There are, however, problems encountered when using the Campbell and Fiske approach. The purpose of this article is to discuss the method and selected problems, and to propose an alternate approach to address those problems.

The Interaction of Values and Validity Assessment: Does a Test's Level of Validity Depend on a Researcher's Values?

1998

This paper presents a somewhat different framework for considering the validity problem than that proposed by Messick (1989). Validity evaluation is considered as a problem of comparing continua in a multidimensional space corresponding to constructs, tests, and applications. This framework is used to consider the position taken by Markus (1998) and to argue that a test's validity is independent of a researcher's values, and that a completion of Messick's synthesis is not needed.

Corrections for Criterion Reliability in Validity Generalization: A False Prophet in a Land of Suspended Judgment

Industrial and Organizational Psychology, 2014

The results of meta-analytic (MA) and validity generalization (VG) studies continue to be impressive. In contrast to earlier findings that capped the variance accounted for in job performance at roughly 16%, many recent studies suggest that a single predictor variable can account for between 16% and 36% of the variance in some aspect of job performance. This article argues that this “enhancement” in variance accounted for is often attributable not to improvements in science but to a dumbing down of the standards for the values of statistics used in correction equations. With rare exceptions, applied researchers have suspended judgment about what is and is not an acceptable threshold for criterion reliability in their quest for higher validities. We demonstrate a statistical dysfunction that is a direct result of using low criterion reliabilities in corrections for attenuation. Corrections typically applied to a single predictor in a VG study are instead applied to multiple predictors. A multiple correlation analysis is then conducted on the corrected validity coefficients. It is shown that the corrections often used in single-predictor studies yield a squared multiple correlation that appears suspect. Basically, the multiple-predictor study exposes the tenuous statistical foundation of using abjectly low criterion reliabilities in single-predictor VG studies. Recommendations for restoring scientific integrity to the meta-analyses that permeate industrial–organizational (I–O) psychology are offered.
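The correction for attenuation at issue in this abstract is the classical formula r_c = r_obs / sqrt(r_yy), where r_yy is the criterion's reliability. A minimal sketch (all numbers here are hypothetical, not drawn from the article) shows how lowering the assumed criterion reliability mechanically inflates both the corrected validity and the apparent variance accounted for:

```python
import math

def correct_for_criterion_unreliability(r_obs: float, r_yy: float) -> float:
    """Classical correction for attenuation due to criterion
    unreliability: r_c = r_obs / sqrt(r_yy)."""
    return r_obs / math.sqrt(r_yy)

# Hypothetical observed validity coefficient.
r_obs = 0.25

# The same observed r, corrected under successively lower (more
# lenient) criterion reliability estimates.
for r_yy in (0.90, 0.70, 0.52):
    r_c = correct_for_criterion_unreliability(r_obs, r_yy)
    print(f"r_yy={r_yy:.2f}  corrected r={r_c:.3f}  "
          f"variance accounted for={r_c ** 2:.1%}")
```

Running the loop shows the corrected validity (and its square) rising as r_yy falls, even though nothing about the predictor or the data has changed, which is the inflation mechanism the article criticizes when such corrections feed into a multiple correlation across several predictors.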