Kevin Krost | Virginia Tech
Papers by Kevin Krost
2019 ASEE Annual Conference & Exposition Proceedings, Sep 10, 2020
and a B.Sc. in Electrical and Electronic Engineering from the University of Technology, Jamaica. Her research interest is eliciting conceptual understanding of introductory engineering concepts using active learning strategies.
Proceedings of the 2019 AERA Annual Meeting, 2019
Computers and Education Open, Dec 1, 2021
The primary purpose of this study was to examine the extent to which students’ course perceptions (i.e., perceptions of empowerment, usefulness, success, interest, and caring) and cost beliefs predict their effort and grades in an online course. We surveyed 1,446 students in an online geography course. Students completed closed- and open-ended items and we used structural equation modeling and qualitative coding to analyze the data. Students’ course perceptions predicted their course effort, which then predicted their final course grade. The quantitative findings demonstrated that students’ situational interest and perceptions of instructor caring were statistically significant predictors of their effort and achievement. The qualitative findings indicated that students’ perceptions of the usefulness of the course content and their interest affected their effort, as did the amount of time that they had available for course activities. The findings were moderated by students’ perceptions of course ease. Students reported decreased effort when they believed that they could succeed and the course was easy, and when they believed it was going to take a lot of time and the course was difficult. This study highlights the importance of designing courses that (a) interest students in the course activities, (b) foster perceptions of caring between the instructor and students, (c) are at an appropriate level of difficulty, and (d) provide a reasonable workload with considerations for students with time constraints. Researchers may use the findings to develop interventions and strategies that instructors can use to encourage students to put forth more effort in online courses.
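The mediated path described in this abstract (course perceptions predicting effort, which in turn predicts the final grade) can be illustrated with a minimal regression-based sketch. This is only an illustration using simulated data and hypothetical variable names (perception, effort, grade); it is not the authors’ structural equation model or data.

```python
# Illustrative path-analysis sketch for a perception -> effort -> grade mediation.
# Variable names and data are hypothetical, not from the study.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
perception = rng.normal(size=n)                 # e.g., perceived instructor caring
effort = 0.5 * perception + rng.normal(size=n)  # effort partly driven by perception
grade = 0.6 * effort + rng.normal(size=n)       # grade driven by effort

# Path a: perception -> effort
a_fit = sm.OLS(effort, sm.add_constant(perception)).fit()
# Paths b and c': effort and perception -> grade
X = sm.add_constant(np.column_stack([effort, perception]))
b_fit = sm.OLS(grade, X).fit()

a = a_fit.params[1]        # perception -> effort
b = b_fit.params[1]        # effort -> grade, controlling for perception
c_prime = b_fit.params[2]  # direct effect of perception on grade
print(f"indirect effect (a*b) = {a * b:.3f}, direct effect = {c_prime:.3f}")
```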
There has been extensive research indicating gender-based differences among STEM subjects, particularly mathematics (Albano & Rodriguez, 2013; Lane, Wang, & Magone, 1996). Similarly, gender-based differential item functioning (DIF) has been researched due to the disadvantages females face in STEM subjects when compared to their male counterparts. Given that, this study will apply the multiple indicators multiple causes (MIMIC) model, a type of structural equation model, to detect the presence of gender-based DIF using the Program for International Student Assessment (PISA) mathematics data from students in the United States of America and then predict the DIF using math-related covariates. This study will build upon a previous study that explored the same data using the hierarchical generalized linear model and will be confirmatory in nature. Based on the results of the previous study, it is expected that several items will exhibit DIF that disadvantages females, and that mathematics-based self-efficacy will predict the DIF. Additional covariates will also be explored, and the two models will be compared in terms of their DIF detection and the subsequent modeling of DIF. Implications of these results include females under-achieving when compared to their male counterparts, thus continuing the current trend. These gender differences can further manifest at the national level, causing US students as a whole to under-perform at the international level. Last, the efficacy of the MIMIC model to detect and predict DIF will be illustrated, so that the model may become more widely used to understand group differences and DIF.
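For reference, one common way to write a MIMIC model for uniform DIF is sketched below; the notation is generic and not necessarily the exact parameterization used in the paper:

$$
y_i^{*} = \lambda_i \eta + \beta_i g + \varepsilon_i, \qquad
\eta = \gamma g + \textstyle\sum_k \gamma_k x_k + \zeta ,
$$

where $\eta$ is the latent mathematics ability, $g$ is the gender indicator, and the $x_k$ are covariates such as mathematics self-efficacy. The path $\gamma$ captures a true group difference on the latent trait (impact), while a nonzero direct effect $\beta_i$ of $g$ on item $i$ flags uniform DIF; regressing that effect on the covariates is what allows the DIF to be predicted or explained.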
Preventive Medicine Reports, Dec 1, 2021
The present study examines public housing residents’ smoking cessation intentions, expectancies, and attempts one year after implementation of the Department of Housing and Urban Development’s mandatory smoke-free rule in public housing. The sample includes 233 cigarette smokers, ages 18–80, who reside in the District of Columbia Housing Authority. Data collection occurred between March and August 2019. Descriptive statistics, chi-square, and Wilcoxon two-sample test analyses assessed smoking cessation intentions, expectancies, and attempts across resident demographics and characteristics. Findings showed 17.2% of residents reported not thinking about quitting, 39.1% reported thinking about quitting, and 48.6% reported thinking about quitting specifically because of the rule. Residents ages 60–80 were more likely to consider quitting because of the rule, compared to residents ages 18–59. Of those thinking of quitting, 58.6% were sure they could quit if they tried. Those thinking of quitting due to the rule (62.0%) were more likely to have made at least one quit attempt in the past 3 months than those not attributing their thinking of quitting to the rule. Residents trying to quit reported an average of 2.7 attempts in the last 3 months; most perceived evidence-based cessation supports as not helpful. A majority reported thinking about quitting and attempting to quit but continuing to smoke, indicating a significant gap between intent to quit and successfully quitting. Results suggest that the rule positively influenced smoking behaviors. However, additional interventions are needed to assist public housing residents with successfully quitting smoking.
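As a quick illustration of the test types named above (a chi-square test for an association between two categorical variables, and a Wilcoxon two-sample rank-sum test for comparing a numeric outcome across two groups), here is a minimal sketch with made-up numbers; it does not reproduce the study’s data or analysis.

```python
# Toy illustration of the test types named in the abstract. The counts and
# attempt values below are fabricated placeholders, NOT the study's data.
import numpy as np
from scipy.stats import chi2_contingency, ranksums

# Quit attempt (yes/no) by whether thinking of quitting was attributed to the rule
table = np.array([[62, 38],    # attributed to rule:     attempt / no attempt
                  [40, 60]])   # not attributed to rule: attempt / no attempt
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")

# Number of quit attempts in the last 3 months for two hypothetical age groups
attempts_18_59 = np.array([0, 1, 1, 2, 3, 0, 2])
attempts_60_80 = np.array([1, 2, 3, 3, 4, 2, 5])
stat, p = ranksums(attempts_18_59, attempts_60_80)
print(f"Wilcoxon rank-sum z = {stat:.2f}, p = {p:.4f}")
```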
International Journal of Environmental Research and Public Health, Aug 24, 2021
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Proceedings of the 2019 AERA Annual Meeting
Computers and Education Open
Preventive Medicine Reports
International Journal of Environmental Research and Public Health
In July 2018, the United States Department of Housing and Urban Development (HUD) implemented a mandatory smoke-free rule in public housing. This study assessed administrator and resident perceptions of rule implementation during its initial year in the District of Columbia Housing Authority (DCHA). Assessment included nine focus groups (n = 69) with residents and in-depth interviews with administrators (n = 7) and residents (n = 26) from 14 DCHA communities (family = 7 and senior/disabled = 7). Semi-structured discussion guides based on the multi-level socio-ecological framework captured dialogue that was recorded, transcribed verbatim, and coded inductively. Emerging major themes for each socio-ecological framework level included: (1) Individual: the rule was supported due to perceived health benefits, with stronger support among non-smokers; (2) Interpersonal: limiting secondhand smoke exposure was perceived as a positive for vulnerable residents; (3) Organizational: communicatio...
2019 ASEE Annual Conference & Exposition Proceedings
Thesis Chapters by Kevin Krost
Differential item functioning (DIF) is a difference in the probability of correctly answering an item on a test between a focal (minority) group and a reference (majority) group after matching on ability. While it is possible to extend the concept to multiple groups, two groups are predominantly used. For this study, students with disabilities (SWDs) were compared to students without disabilities, and English language learners (ELLs) were compared to non-ELLs. Both groups were analyzed due to their increased inclusion in educational testing, their emergence as important but less-researched groups, and the fairness and validity issues associated with their inclusion in standardized testing situations. Additionally, after assessing the presence of DIF, differential test functioning (DTF) was examined to determine the extent to which the item-level differential functioning manifested at the test level. Methodologically, this study compared classical and modern test theory methods of DIF detection to assess the detection rates of each when using real data. The classical test theory (CTT) methods examined were logistic regression and the Mantel-Haenszel procedure, whereas the item response theory (IRT) methods included Raju’s DFIT framework and Lord’s Wald test. Both classical and item response theory methods have advantages and disadvantages; CTT has less restrictive assumptions but can also lack power for detecting DIF. IRT methods, on the other hand, have much stronger assumptions but, if met, are powerful in detecting DIF. Therefore, each method’s ability to detect DIF was compared to assess their efficacy under varying conditions. The results indicated that a large number of items were flagged as exhibiting non-negligible DIF among both ELLs and SWDs, with both reference and focal groups being disadvantaged on different items. Differential test functioning was present: a large number of items flagged as exhibiting DIF had to be removed before the test-level differential functioning became nonsignificant. Methodologically, each method identified several items as exhibiting DIF, and the IRT-based methods detected more non-negligible DIF items than the methods under CTT. Specifically, Lord’s Wald test of b parameters and the noncompensatory DIF (NCDIF) statistic under Raju’s DFIT framework detected several items exhibiting non-negligible DIF.
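One of the CTT procedures named above, the Mantel-Haenszel method, can be sketched compactly: responses are stratified by the matching score, a common odds ratio is pooled across strata, and the result is placed on the ETS delta scale. The sketch below is a simplified illustration of that general procedure, not the code used in the dissertation; the function and variable names are hypothetical.

```python
# Minimal Mantel-Haenszel DIF sketch: pooled odds ratio across score strata
# and the ETS delta transform. Illustrative only; names are hypothetical.
import numpy as np

def mantel_haenszel_dif(correct, group, total_score):
    """correct: 0/1 item responses; group: 0 = reference, 1 = focal;
    total_score: matching variable (e.g., total test score)."""
    num, den = 0.0, 0.0
    for s in np.unique(total_score):
        m = total_score == s
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference correct
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference incorrect
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal correct
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    alpha_mh = num / den                 # common odds ratio across strata
    delta_mh = -2.35 * np.log(alpha_mh)  # ETS delta scale
    return alpha_mh, delta_mh

# Conventionally, |delta_MH| < 1 is treated as negligible DIF (ETS category A),
# and |delta_MH| >= 1.5 (when statistically significant) as large DIF (category C).
```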
Comparison of the Item Response Theory with Covariates Model and Explanatory Cognitive Diagnostic Model for Detecting and Explaining Differential Item Functioning, 2023
In psychometrics, a central concern is that an assessment is fair for all students who take it. The fairness of an assessment can be evaluated in several ways, including the examination of differential item functioning (DIF). An item exhibits DIF if a subgroup has a lower probability of answering the item correctly than another subgroup after matching on academic achievement. Subgroups may be defined by race, spoken language, disability status, or sex. Under item response theory (IRT), a single score is given to each student since IRT assumes that an assessment measures only one construct. However, under cognitive diagnostic modeling (CDM), an assessment measures multiple specific constructs and classifies students as having mastered each construct or not. There are several methods to detect DIF under both types of models, but most methods cannot conduct explanatory modeling. Explanatory modeling consists of predicting item responses and latent traits using relevant observed or latent covariates. If an item exhibits DIF which disadvantages a subgroup, covariates can be modeled to explain the DIF and indicate either true or spurious differences. If an item exhibited statistically significant DIF which became nonsignificant after modeling explanatory variables, then the DIF would be explained and considered spurious. If the DIF remained significant after modeling explanatory variables, then there would be stronger evidence that DIF was present and not spurious. When an item exhibits DIF, the validity of the inferences from the assessment is threatened and group comparisons become inappropriate.
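One generic way to express this explanatory DIF logic for a dichotomous item is shown below; this is a schematic formulation, not necessarily the exact parameterization of the IRT-C or E-CDM models used in the study:

$$
\operatorname{logit} P(Y_{ij}=1 \mid \theta_i) \;=\; a_j\,\theta_i + d_j + \beta_j\, g_i + \boldsymbol{\omega}_j^{\top}\mathbf{x}_i ,
$$

where $g_i$ indicates group membership and $\mathbf{x}_i$ collects explanatory covariates. A nonzero $\beta_j$ when no covariates are in the model flags uniform DIF on item $j$; if $\beta_j$ becomes nonsignificant once the covariates are added, the DIF is treated as explained (spurious), and if it persists, the evidence for true DIF is stronger. A group-by-trait interaction term would correspond to nonuniform DIF.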
This study evaluated the presence of DIF on the Trends in International Mathematics and Science Study (TIMSS) between students who speak English as a first language (EFL) and students who do not (multilingual learners [ML]) in the USA. The 8th-grade science data from 2011 were analyzed because science achievement remains understudied, 8th grade is a critical turning point for K-12 students, and 2011 was the most recent year for which item content is available from this assessment. The item response theory with covariates (IRT-C) model was used as the explanatory IRT model, while the reparameterized deterministic-input, noisy “and” gate (RDINA) model was used as the explanatory CDM (E-CDM). All released items were analyzed for DIF by both models with language status as the key grouping variable. Items which exhibited significant DIF were further analyzed by including relevant covariates. Then, if items still exhibited DIF, their content was evaluated to determine why a group was disadvantaged.
Several items exhibited significant DIF under both the IRT-C and E-CDM, and most of these items disadvantaged ML students. Under the IRT-C, two items that exhibited DIF were explained by quantitative covariates, while two items that did not initially exhibit significant nonuniform DIF became significant after explanatory modeling. Whether or not a student repeated elementary school was the strongest explanatory covariate, while confidence in science explained the most items. Under the E-CDM, five items initially exhibited significant uniform DIF, with one also exhibiting nonuniform DIF. After scale purification, two items exhibited significant uniform DIF and one exhibited marginally significant DIF. After explanatory modeling, no items exhibited significant uniform DIF, and only one item exhibited marginally significant nonuniform DIF. Among the covariates, home educational resources explained the most items (ten) and was the strongest positive covariate, while having repeated elementary school had the strongest absolute effect. Examining the content of the 14 flagged items, most had no apparent causal explanation for the presence of DIF; in four items, a causal mechanism was identified and those items were concluded to exhibit item bias. An item’s cognitive domain was related to DIF, with 79% of the flagged items falling under the Knowing domain. Based on these results, DIF that disadvantaged ML students was present among several items on this science assessment. Both the IRT-C and E-CDM identified several items exhibiting DIF, quantitative covariates explained several of them, and item bias was discovered in several items.
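For context, the (R)DINA family of models referenced above can be written schematically as follows; this is a general textbook-style rendering under stated assumptions, not a transcription of the study’s exact E-CDM specification:

$$
\eta_{ij} = \prod_{k} \alpha_{ik}^{\,q_{jk}}, \qquad
\operatorname{logit} P(Y_{ij}=1 \mid \boldsymbol{\alpha}_i) = f_j + d_j\,\eta_{ij} ,
$$

where $\alpha_{ik}$ indicates whether student $i$ has mastered attribute $k$, $q_{jk}$ is the Q-matrix entry for item $j$, $f_j$ reflects the response level for students missing a required attribute (guessing), and $d_j$ is the gain for students who have mastered all attributes the item requires. In the explanatory version, group main effects (uniform DIF), group-by-$\eta$ interactions (nonuniform DIF), and covariates can be added to this logit.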
Following up on this empirical study, a simulation study was performed to evaluate DIF detection power and Type I error rates of the Wald test and likelihood ratio (LR) test, and parameter recovery when ignoring subgroups, using the compensatory reparameterized unified model (C-RUM). Factors included sample size, DIF magnitude, DIF type, Q-matrix complexity, their interaction effects, and p-value adjustment.
When evaluating DIF under the C-RUM, the DIF detection method had the largest effect on Type I error rates, with the Wald test recovering the nominal p-value much better than the LR test. In terms of power, DIF magnitude was the most important factor, followed by Q-matrix complexity: as DIF magnitude increased and Q-matrix complexity decreased, power rates increased. In terms of parameter recovery, DIF type was the strongest effect, followed by Q-matrix complexity; recovery was better under nonuniform DIF than under uniform DIF, and fewer attributes measured by an item improved recovery. Overall, several factors affected DIF detection power and Type I error, including DIF detection method, DIF magnitude, and Q-matrix complexity, while DIF type, Q-matrix complexity, and DIF magnitude all had an impact on parameter recovery.
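The two DIF tests compared in the simulation can be summarized generically: a Wald test squares an estimated DIF parameter over its standard error, while a likelihood-ratio test doubles the difference in log-likelihoods between models with and without that parameter, with both statistics referred to a chi-square distribution. The sketch below illustrates only this generic comparison with placeholder numbers; it is not the C-RUM estimation code used in the study.

```python
# Generic Wald vs. likelihood-ratio (LR) test comparison for a single DIF
# parameter. Inputs below are placeholders, not results from the study.
from scipy.stats import chi2

def wald_test(beta_hat, se_beta, df=1):
    # Wald statistic: squared z-ratio of the estimated DIF parameter
    stat = (beta_hat / se_beta) ** 2
    return stat, chi2.sf(stat, df)

def lr_test(loglik_full, loglik_reduced, df=1):
    # LR statistic: twice the log-likelihood difference between nested models
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2.sf(stat, df)

w_stat, w_p = wald_test(beta_hat=0.42, se_beta=0.15)
lr_stat, lr_p = lr_test(loglik_full=-10231.4, loglik_reduced=-10235.9)
print(f"Wald: stat={w_stat:.2f}, p={w_p:.4f};  LR: stat={lr_stat:.2f}, p={lr_p:.4f}")
```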