Examining Measurement Invariance and Differential Item Functioning With Discrete Latent Construct Indicators: A Note on a Multiple Testing Procedure

Methods of Detecting Differential Item Functioning: A Comparison of Item Response Theory and Confirmatory Factor Analysis

2001

METHODS TO DETECT DIFFERENTIAL ITEM FUNCTIONING: A COMPARISON OF ITEM RESPONSE THEORY AND CONFIRMATORY FACTOR ANALYSIS Ratchaneewan Wanichtanom Old Dominion University, 2001 Co-Directors of Advisory Committee: Dr. Terry L. Dickinson Dr. Glynn D. Coates Differential item functioning (DIF) occurs when an item performs statistically differently for a reference group than for a focal group. DIF is a threat to the validity of a test, and it can lead to illegal use of a test in situations such as employee selection. Thus, DIF has important implications for test construction and practice. This research is a Monte Carlo study that compares Item Response Theory (IRT) and three Confirmatory Factor Analysis (CFA) methods for detecting DIF. The three CFA methods were Model Comparison (MC), Modification Indexes (MI), and Modification Indexes-Divided sample (MI-Divided). The research compared the methods' DIF detection rates for reference and focal groups. Each group consisted of 1000...

Power and Type I Error of the Mean and Covariance Structure Analysis Model for Detecting Differential Item Functioning in Graded Response Items

Multivariate Behavioral Research, 2006

In this simulation study, we investigate the power and Type I error rate of a procedure based on the mean and covariance structure analysis (MACS) model in detecting differential item functioning (DIF) of graded response items with five response categories. The following factors were manipulated: type of DIF (uniform and non-uniform), DIF magnitude (low, medium, and large), equality/inequality of latent trait distributions, sample size (100, 200, 400, and 800), and equality/inequality of the sample sizes across groups. The simulated test was made up of 10 items, of which only 1 contained DIF. One hundred replications were generated for each simulated condition. Results indicate that the MACS-based procedure showed acceptable power levels (≥ .70) for detecting medium-sized uniform and non-uniform DIF when both groups' sample sizes were as low as 200/200 and 400/200, respectively. Power increased as sample sizes and DIF magnitude increased. The procedure controlled its Type I error rate better when the groups' sample sizes and latent trait distributions were equal across groups and when DIF magnitude and sample size were small.
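
The power computation this abstract describes can be sketched as a small Monte Carlo loop. This is a minimal illustration, not the MACS procedure itself: it simulates dichotomous 2PL responses (rather than five-category graded responses) with uniform DIF in one item, flags DIF with a logistic-regression test (item response regressed on rest score and group), and tallies the rejection rate across replications. All parameter values are illustrative assumptions.

```python
import math
import random

def solve(A, b):
    """Gaussian elimination with partial pivoting for small linear systems."""
    n = len(b)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def sim_responses(n, a, b, shift, rng):
    """2PL dichotomous responses; `shift` adds uniform DIF to item 0."""
    data = []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)
        row = []
        for j, (aj, bj) in enumerate(zip(a, b)):
            bj += shift if j == 0 else 0.0
            p = 1.0 / (1.0 + math.exp(-aj * (theta - bj)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data

def dif_wald_z(X, y, iters=12):
    """Newton-Raphson logistic regression; Wald z for the last coefficient."""
    k = len(X[0])
    beta = [0.0] * k
    H = [[0.0] * k for _ in range(k)]
    for _ in range(iters):
        g = [0.0] * k
        H = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = max(min(sum(bc * xc for bc, xc in zip(beta, xi)), 30.0), -30.0)
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for r in range(k):
                g[r] += (yi - p) * xi[r]
                for c in range(k):
                    H[r][c] += w * xi[r] * xi[c]
        beta = [bc + s for bc, s in zip(beta, solve(H, g))]
    var_last = solve(H, [1.0 if i == k - 1 else 0.0 for i in range(k)])[k - 1]
    return beta[-1] / math.sqrt(var_last)

def power_estimate(reps=30, n_per_group=200, dif_shift=0.8, seed=1):
    """Proportion of replications in which the DIF item is flagged (|z| > 1.96)."""
    rng = random.Random(seed)
    a = [1.0] * 10                            # discriminations (illustrative)
    b = [0.0] + [-1.2 + 0.3 * j for j in range(9)]  # difficulties (illustrative)
    hits = 0
    for _ in range(reps):
        X, y = [], []
        for group, shift in ((0.0, 0.0), (1.0, dif_shift)):
            for row in sim_responses(n_per_group, a, b, shift, rng):
                X.append([1.0, float(sum(row[1:])), group])  # intercept, rest score, group
                y.append(row[0])
        if abs(dif_wald_z(X, y)) > 1.96:
            hits += 1
    return hits / reps
```

Under the null (``dif_shift=0.0``) the same function estimates the Type I error rate instead, mirroring the two quantities the study tracks.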

The use of latent variable mixture models to identify invariant items in test construction

Quality of Life Research, 2017

Patient-reported outcome measures (PROMs) are frequently used in heterogeneous patient populations. PROM scores may lead to biased inferences when sources of heterogeneity (e.g., gender, ethnicity, and social factors) are ignored. Latent variable mixture models (LVMMs) can be used to examine measurement invariance (MI) when sources of heterogeneity in the population are not known a priori. The goal of this article is to discuss the use of LVMMs to identify invariant items within the context of test construction. The Draper-Lindley-de Finetti (DLD) framework for the measurement of latent variables provides a theoretical context for the use of LVMMs to identify the most invariant items in test construction. In an expository analysis using 39 items measuring daily activities, LVMMs were conducted to compare 1- and 2-class item response theory (IRT) models. If the 2-class model had better fit, item-level logistic regression differential item functioning (DIF) analyses were conducted to ...

An NCME Instructional Module on Latent DIF Analysis Using Mixture Item Response Models

Educational Measurement: Issues and Practice, 2015

The purpose of this ITEMS module is to provide an introduction to differential item functioning (DIF) analysis using mixture item response models. The mixture item response models for DIF analysis involve comparing item profiles across latent groups, instead of manifest groups. First, an overview of DIF analysis based on latent groups, called latent DIF analysis, is provided and its applications in the literature are surveyed. Then, the methodological issues pertaining to latent DIF analysis are described, including mixture item response models, parameter estimation, and latent DIF detection methods. Finally, recommended steps for latent DIF analysis are illustrated using empirical data.
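
As a toy illustration of comparing item profiles across latent groups, the sketch below fits a two-class mixture by EM. It uses a simple latent class model with class-specific Bernoulli item probabilities as a stand-in for the full mixture item response models the module covers; items whose class-specific probabilities diverge most are latent-DIF candidates. All names and values are illustrative.

```python
import math
import random

def fit_two_class_mixture(data, iters=100, seed=0):
    """EM for a 2-class mixture of independent Bernoulli items.

    Returns (pi, p) where pi is the class-0 weight and p[c][j] is the
    probability of endorsing item j in class c.
    """
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    pi = 0.5
    p = [[rng.uniform(0.3, 0.7) for _ in range(m)] for _ in range(2)]
    for _ in range(iters):
        # E-step: posterior probability of class 0 for each respondent.
        resp = []
        for row in data:
            loglik = []
            for c, weight in ((0, pi), (1, 1.0 - pi)):
                ll = math.log(max(weight, 1e-12))
                for x, pj in zip(row, p[c]):
                    pj = min(max(pj, 1e-6), 1.0 - 1e-6)
                    ll += math.log(pj if x else 1.0 - pj)
                loglik.append(ll)
            mx = max(loglik)
            e0, e1 = (math.exp(ll - mx) for ll in loglik)
            resp.append(e0 / (e0 + e1))
        # M-step: update class weight and class-specific item profiles.
        s0 = sum(resp)
        pi = s0 / n
        for j in range(m):
            p[0][j] = sum(r * row[j] for r, row in zip(resp, data)) / max(s0, 1e-9)
            p[1][j] = sum((1 - r) * row[j] for r, row in zip(resp, data)) / max(n - s0, 1e-9)
    return pi, p

def latent_dif_gaps(p):
    """Absolute between-class gap per item; large gaps flag latent-DIF candidates."""
    return [abs(p0 - p1) for p0, p1 in zip(p[0], p[1])]
```

Because the groups are latent rather than manifest, the gap comparison is only interpretable after the mixture itself fits better than a one-class model, as the module's recommended steps emphasize.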

Performance of exploratory structural equation model (ESEM) in detecting differential item functioning

EUREKA: Social and Humanities

The validity of a standardised test is questioned if a construct-irrelevant factor accounts for examinees' performance but is wrongly modeled as ability on the construct (test items). A test must estimate an examinee's ability precisely irrespective of their sub-population on any demographic variable. This paper explored gender and school location as major covariates on the West African Examinations Council (WAEC) mathematics items among examinees (N=2,866) using Exploratory Structural Equation Modeling (ESEM). The results showed that the test is multidimensional (six factors) with acceptable fit indices (χ2(940)=4882.024, p < 0.05; CFI=0.962; TLI=0.930; RMSEA=0.038, 90 % CI=0.037-0.039; SRMR=0.030; Akaike information criterion (AIC)=147290.577; Bayesian information criterion (BIC)=149585.436; sample-size-adjusted BIC=148362.154). Also, there were 10 (20 %) significant DIF items in the WAEC with respect to gender, while 3 (6 %) of the ...
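
The reported RMSEA can be reproduced from the χ² statistic and degrees of freedom. A minimal sketch of the standard point estimate (one common formula; SEM software may use slight variants):

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max((chi2 - df) / (df * (n - 1)), 0))."""
    return math.sqrt(max((chi2 - df) / (df * (n - 1)), 0.0))

# Plugging in the abstract's values, chi2(940) = 4882.024 with N = 2866:
# rmsea(4882.024, 940, 2866) ≈ 0.038, matching the reported index.
```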

Measurement invariance and differential item functioning of the Bar-On EQ-i: S measure over Canadian, Scottish, South African and Australian samples

Personality and Individual Differences, 2011

The use of latent class analysis, and finite mixture modeling more generally, has become almost commonplace in social and health science domains. Typically, research aims in mixture model applications include investigating predictors and distal outcomes of latent class membership. The most recent developments for incorporating latent class antecedents and consequences are stepwise procedures that decouple the classification and prediction models. It was initially believed that these procedures might avoid the potential misspecification bias in the simultaneous models that include both latent class indicators and predictors. However, if direct effects from the predictors to the indicators are omitted in the stepwise procedure, the prediction model can yield biased estimates. This article presents a logical and principled approach, readily implemented in current software, to testing for direct effects from latent class predictors to indicators using multiple indicator multiple cause modeling. This approach is illustrated with real data and opportunities for future developments are discussed.

Explanatory Secondary Dimension Modeling of Latent Differential Item Functioning

Applied Psychological Measurement, 2011

The models used in this article are secondary dimension mixture models with the potential to explain differential item functioning (DIF) between latent classes, called latent DIF. The focus is on models with a secondary dimension that is at the same time specific to the DIF latent class and linked to an item property. A description of the models is provided along with a means of estimating model parameters using easily available software and a description of how the models behave in two applications. One application concerns a test that is sensitive to speededness and the other is based on an arithmetic operations test where the division items show latent DIF.

Latent variable mixture models to test for differential item functioning: a population-based analysis

Health and quality of life outcomes, 2017

Comparisons of population health status using self-report measures such as the SF-36 rest on the assumption that the measured items have a common interpretation across sub-groups. However, self-report measures may be sensitive to differential item functioning (DIF), which occurs when sub-groups with the same underlying health status have a different probability of item response. This study tested for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scales in population-based data using latent variable mixture models (LVMMs). Data were from the Canadian Multicentre Osteoporosis Study (CaMos), a prospective national cohort study. LVMMs were applied to the ten PF and five MH SF-36 items. A standard two-parameter graded response model with one latent class was compared to multi-class LVMMs. Multivariable logistic regression models with pseudo-class random draws characterized the latent classes on demographic and health variables. The CaMos cohort consisted of 9423 r...
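
The one-class versus multi-class comparison described above is typically decided with information criteria. Below is a minimal sketch of BIC-based class enumeration; the log-likelihoods and parameter counts are illustrative placeholders, not CaMos results:

```python
import math

def bic(loglik, n_params, n_obs):
    """Schwarz's Bayesian information criterion; lower is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def enumerate_classes(fits, n_obs):
    """Pick the model (n_classes, loglik, n_params) with the lowest BIC."""
    return min(fits, key=lambda f: bic(f[1], f[2], n_obs))[0]

# Hypothetical fits for 1-, 2-, and 3-class models of n = 9423 respondents:
fits = [(1, -61250.0, 30), (2, -60800.0, 61), (3, -60780.0, 92)]
best = enumerate_classes(fits, 9423)   # → 2 under these made-up values
```

With these placeholder numbers the two-class model wins: its likelihood gain over one class outweighs the BIC penalty for its extra parameters, while the third class adds too little fit to justify its cost.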