An Investigation of the Effect of Correlated Abilities on Observed Test Characteristics

Using Multidimensional Item Response Theory to Evaluate Educational and Psychological Tests

Educational Measurement: Issues and Practice, 2005

Many educational and psychological tests are inherently multidimensional, meaning these tests measure two or more dimensions or constructs. The purpose of this module is to illustrate how test practitioners and researchers can apply multidimensional item response theory (MIRT) to understand better what their tests are measuring, how accurately the different composites of ability are being assessed, and how this information can be cycled back into the test development process. Procedures for conducting MIRT analyses, from obtaining evidence that the test is multidimensional, to modeling the test as multidimensional, to illustrating the properties of multidimensional items graphically, are described from both a theoretical and a substantive basis. This module also illustrates these procedures using data from a ninth-grade mathematics achievement test. It concludes with a discussion of future directions in MIRT research.
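The compensatory MIRT model this module builds on expresses the probability of a correct response as a logistic function of a weighted sum of abilities. As a rough illustration of how one might compute such a response surface and Reckase's multidimensional discrimination (MDISC) index, here is a minimal Python/numpy sketch; the item parameters a and d are invented for illustration, not taken from the module's ninth-grade data.

    import numpy as np

    def compensatory_mirt_prob(theta, a, d, c=0.0):
        # P(correct) = c + (1 - c) / (1 + exp(-(a . theta + d)))
        z = theta @ a + d
        return c + (1.0 - c) / (1.0 + np.exp(-z))

    # Grid of ability pairs (theta1, theta2) for a response surface
    t1, t2 = np.meshgrid(np.linspace(-3, 3, 61), np.linspace(-3, 3, 61))
    theta = np.stack([t1.ravel(), t2.ravel()], axis=1)

    a = np.array([1.2, 0.6])   # hypothetical discrimination vector
    d = -0.5                   # hypothetical intercept
    P = compensatory_mirt_prob(theta, a, d).reshape(t1.shape)

    # Reckase's indices: overall discrimination and direction of best measurement
    mdisc = np.linalg.norm(a)
    angles = np.degrees(np.arccos(a / mdisc))
    print(f"MDISC = {mdisc:.3f}; angles with axes = {np.round(angles, 1)} degrees")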

Measurement Accuracy: An Application of Multidimensional Item Response Theory to the Woodcock-Johnson Psycho-Educational Battery-Revised Achievement Scales

1992

A two-dimensional, compensatory item response model and a unidimensional model were fitted to the reading and mathematics items in the Woodcock-Johnson Psycho-Educational Battery-Revised for a sample of 1,000 adults aged 20-39 years. Multidimensional information theory predicts that if the unidimensional abilities can be represented as vectors in the two-dimensional solution, then the multidimensional model can be used to obtain ability scores with smaller standard errors. In reading, the multidimensional model yielded scores with smaller standard errors, but multidimensional scores from subtests within reading were identical to the overall reading score. In mathematics, unidimensional scores were non-linearly related to multidimensional ability estimates, and for some subjects, multidimensional ability scores had larger standard errors.
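The standard-error comparisons in this abstract rest on item and test information: under a compensatory two-dimensional 2PL model, the information a test provides at an ability point depends on the direction of measurement, and the conditional standard error is approximately the inverse square root of that information. A sketch of that computation, using randomly generated parameters rather than the Woodcock-Johnson items:

    import numpy as np

    def directional_test_info(theta, A, d, angle_deg):
        # Information at theta in direction u (Reckase's formulation):
        # I_u = sum_i (u . a_i)^2 * P_i * (1 - P_i), P_i the M2PL probability
        u = np.array([np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))])
        P = 1.0 / (1.0 + np.exp(-(A @ theta + d)))
        return np.sum((A @ u) ** 2 * P * (1.0 - P))

    rng = np.random.default_rng(0)
    A = rng.uniform(0.4, 1.6, size=(20, 2))   # hypothetical discriminations
    d = rng.uniform(-1.5, 1.5, size=20)       # hypothetical intercepts
    theta = np.array([0.0, 0.0])              # ability point of interest

    for angle in (0, 30, 45, 60, 90):
        info = directional_test_info(theta, A, d, angle)
        print(f"direction {angle:>2} deg: info = {info:5.2f}, SE ~ {1 / np.sqrt(info):.3f}")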

Item Response Theory: An Introduction to Latent Trait Models to Test and Item Development

Testing in an educational system performs a number of functions, and the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize tests in educational policies and quality assurance, their validity and reliability estimates are necessary. There are two generally accepted frameworks used to evaluate the quality of tests in educational and psychological measurement: Classical Test Theory (CTT) and Item Response Theory (IRT). Estimates of test item validity and reliability depend on the particular measurement model used. It is vital for a test developer to be familiar with the different test development and item analysis methods in order to facilitate the development of a new test. CTT is a traditional approach that has been widely criticised in the measurement community for shortcomings such as the sample dependency of its coefficient measures and estimates of measurement error. IRT, by contrast, is a modern approach that provides solutions to most of CTT's identified shortcomings. This paper therefore provides a comprehensive overview of IRT and its procedures as applied to test item development and analysis. The paper concludes with some suggestions for test developers and test specialists at all levels to adopt IRT for its identified crucial theoretical and empirical gains over CTT: IRT-based parameter estimates should be superior to, and more reliable than, CTT-based parameter estimates, and with these features IRT can help resolve the problems associated with test design based on CTT.
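For readers comparing the two frameworks, IRT's item-level modeling is usually illustrated with the two-parameter logistic (2PL) model, whose item characteristic curve and Fisher information have closed forms. A minimal sketch, with hypothetical item parameters:

    import numpy as np

    def icc_2pl(theta, a, b):
        # Item characteristic curve: P(theta) = 1 / (1 + exp(-a * (theta - b)))
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def info_2pl(theta, a, b):
        # Fisher information of a 2PL item: I(theta) = a^2 * P * (1 - P)
        P = icc_2pl(theta, a, b)
        return a ** 2 * P * (1.0 - P)

    theta = np.linspace(-4, 4, 9)
    a, b = 1.5, 0.0            # hypothetical discrimination and difficulty
    print("theta:", theta)
    print("P    :", np.round(icc_2pl(theta, a, b), 3))
    print("info :", np.round(info_2pl(theta, a, b), 3))

Unlike a CTT difficulty index, which is tied to the sample that took the test, these item parameters are defined on the latent trait scale, which is the basis of the sample-invariance argument the paper makes.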

A Confirmatory Factor Analytic Study Examining the Dimensionality of Educational Achievement Tests

2008

Along with the increasing popularity of item response theory (IRT) in testing practices, it is important to check a fundamental assumption of most of the popular IRT models, which is unidimensionality. Nevertheless, it is hard for educational and psychological tests to be strictly unidimensional. The tests studied in this paper are from a standardized high-stakes testing program. They feature potential multidimensionality by presenting various item types and item sets. Confirmatory factor analyses with one-factor and bifactor models were conducted, based on both a linear structural equation modeling approach and a nonlinear IRT approach. The competing models were compared, and the implications of the bifactor model for checking essential unidimensionality were discussed.
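Bifactor results of the kind described here are often summarized with indices such as explained common variance (ECV) and omega hierarchical, which quantify how much of the common variance and total-score variance the general factor accounts for. A small numpy sketch using invented standardized loadings, not the loadings from this study:

    import numpy as np

    # Invented standardized bifactor loadings: a general factor plus three
    # group factors covering items 1-3, 4-6, and 7-9
    gen   = np.array([0.6, 0.7, 0.5, 0.6, 0.6, 0.7, 0.5, 0.6, 0.6])
    spec  = np.array([0.4, 0.3, 0.4, 0.3, 0.4, 0.3, 0.4, 0.3, 0.4])
    group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

    # Explained common variance of the general factor
    ecv = np.sum(gen ** 2) / (np.sum(gen ** 2) + np.sum(spec ** 2))

    # Omega hierarchical: share of total-score variance due to the general factor
    unique = 1.0 - gen ** 2 - spec ** 2
    spec_sums = np.array([spec[group == k].sum() for k in np.unique(group)])
    omega_h = gen.sum() ** 2 / (gen.sum() ** 2 + np.sum(spec_sums ** 2) + unique.sum())

    print(f"ECV = {ecv:.3f}, omega_H = {omega_h:.3f}")

High values of both indices are commonly read as evidence of essential unidimensionality, i.e., that fitting a unidimensional IRT model will not badly distort the scale.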

An application of item response theory to psychological test development

Psicologia: Reflexão e Crítica, 2016

Item response theory (IRT) has become a popular methodological framework for modeling response data from assessments in education and health; however, its use is not widespread among psychologists. This paper aims to provide a didactic application of IRT and to highlight some of its advantages for psychological test development. IRT was applied to two scales (a positive and a negative affect scale) of a self-report test. Respondents were 853 university students (57% women) between the ages of 17 and 35 who answered the scales. IRT analyses revealed that the positive affect scale has items with moderate discrimination that measure respondents below the average score more effectively. The negative affect scale also presented items with moderate discrimination that evaluate respondents across the trait continuum, although with much less precision. Some features of IRT are used to show how such results can improve the measurement of the scales. The authors illustrate and emphasize how knowledge of the features of IRT may allow test makers to refine and increase the validity and reliability of other psychological measures.
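The precision claims in this abstract correspond to the test information function and its associated conditional standard errors. The sketch below uses 2PL items for simplicity, whereas an affect scale with ordered response categories would normally be fitted with a graded response model; the parameters are invented to mimic a scale that measures best below the trait mean:

    import numpy as np

    def test_info_2pl(theta, a, b):
        # Test information: I(theta) = sum_i a_i^2 * P_i(theta) * (1 - P_i(theta))
        z = a[None, :] * (theta[:, None] - b[None, :])
        P = 1.0 / (1.0 + np.exp(-z))
        return np.sum(a[None, :] ** 2 * P * (1.0 - P), axis=1)

    # Ten moderately discriminating items located below the trait mean,
    # mimicking the positive-affect pattern described in the abstract
    a = np.full(10, 1.2)
    b = np.linspace(-2.5, 0.5, 10)
    theta = np.linspace(-3, 3, 7)

    tif = test_info_2pl(theta, a, b)
    for t, i in zip(theta, tif):
        print(f"theta {t:+.1f}: info {i:5.2f}, conditional SE {1 / np.sqrt(i):.3f}")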

A comparative examination of structural models of ability tests

Quality and Quantity, 1992

This study was conducted to examine whether the same underlying structure of ability tests emerges when three different data analysis methods are used. The sample consisted of 335 examinees who applied for vocational guidance and were administered a battery of 17 tests. A matrix of intercorrelations between scores, based on the number of correct answers, was obtained. The matrix was subjected to factor analysis, Guttman's SSA, and tree analysis, which resulted in essentially different structures. Comparisons were made, and the theoretical implications of the results are discussed in relation to various structural models of ability tests existing in the literature.
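To give a rough sense of how the same intercorrelation matrix can support different structural representations, the sketch below applies an eigendecomposition (a factor-analytic summary) and an agglomerative tree analysis to a small hypothetical correlation matrix; the study's 17 x 17 matrix would be handled the same way:

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    # Hypothetical correlation matrix for six ability tests (two clusters)
    R = np.array([
        [1.0, 0.7, 0.6, 0.3, 0.2, 0.3],
        [0.7, 1.0, 0.6, 0.2, 0.3, 0.2],
        [0.6, 0.6, 1.0, 0.3, 0.2, 0.3],
        [0.3, 0.2, 0.3, 1.0, 0.6, 0.7],
        [0.2, 0.3, 0.2, 0.6, 1.0, 0.6],
        [0.3, 0.2, 0.3, 0.7, 0.6, 1.0],
    ])

    # Factor-analytic summary: leading eigenvalues of the correlation matrix
    eigvals = np.linalg.eigvalsh(R)[::-1]
    print("largest eigenvalues:", np.round(eigvals[:3], 2))

    # Tree analysis: hierarchical clustering on 1 - r distances
    tree = linkage(squareform(1.0 - R, checks=False), method="average")
    print("two-cluster solution:", fcluster(tree, t=2, criterion="maxclust"))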