Susan Embretson - Academia.edu

Papers by Susan Embretson

Item response theory

Oxford University Press eBooks, 2000

Response Time Relationships Within Examinees: Implications for Item Response Time Models

Springer Proceedings in Mathematics & Statistics, 2021

A wide variety of models have been developed for item response times. The models vary in both primary purpose and underlying assumptions. As noted by van der Linden (2016), several item response models assume that response time and response accuracy are highly dependent processes. However, the nature of this assumed relationship varies substantially between models; that is, greater accuracy may be associated with either increased or decreased response time. In addition to these conflicting assumptions, examinees may differ in their relative response times across items. In the current study, the relationship of item log response times to item differences in difficulty and content was examined within subjects. Although at the item level mean log response time was positively correlated with difficulty, a broad distribution of these correlations was found within subjects, ranging from positive to negative. These results indicate that existing models may be differentially effective depending on examinees' predominant strategy in item solving.
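
The contrast between item-level and within-person correlations described in this abstract can be illustrated with a minimal sketch. This is not the paper's analysis code; all data, sample sizes, and parameter values below are simulated placeholders chosen only to reproduce the qualitative pattern (positive item-level correlation, but within-person correlations ranging from positive to negative).

```python
# Illustrative sketch (simulated data, not the study's): item-level vs.
# within-examinee correlations of log response time with item difficulty.
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items = 200, 40
item_difficulty = rng.normal(0, 1, n_items)  # assumed known difficulties

# Simulate log response times: some examinees slow down on harder items
# (positive slope), others speed up (negative slope), as in the abstract.
person_slope = rng.normal(0.1, 0.4, n_examinees)
log_rt = (person_slope[:, None] * item_difficulty[None, :]
          + rng.normal(3.5, 0.5, (n_examinees, n_items)))

# Within-examinee correlation of log RT with item difficulty.
within_r = np.array([np.corrcoef(log_rt[i], item_difficulty)[0, 1]
                     for i in range(n_examinees)])

# Item-level correlation: mean log RT per item vs. difficulty.
item_r = np.corrcoef(log_rt.mean(axis=0), item_difficulty)[0, 1]

print(f"item-level r = {item_r:.2f}")
print(f"within-person r: min {within_r.min():.2f}, max {within_r.max():.2f}")
```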

Measuring Human Intelligence with Artificial Intelligence: Adaptive Item Generation

Cambridge University Press eBooks, Nov 1, 2004

The Factorial Validity of Scores from a Cognitively Designed Test: The Spatial Learning Ability Test

Educational and Psychological Measurement, Feb 1, 1997

The Spatial Learning Ability Test (SLAT) was designed from a cognitive processing theory to measure specified aspects of spatial processing. Mathematical modeling of item difficulties and response times has supported SLAT as involving primarily spatial processing. Here, four studies on the factorial validity of SLAT are summarized to elaborate nomothetic span. SLAT was the highest-loading test on the spatial ability factor in the context of either simple or complex spatial tests. Further, SLAT was less related to verbal-analytic coding skills than many other spatial tests, including a test that contained the same item type. Thus, consistent with the construct representation of SLAT as involving spatial processing, the factorial validity studies indicate that SLAT is a purer measure of spatial ability.

Mixed Rasch Models for Measurement in Cognitive Psychology

Springer eBooks, 2007

To apply standard unidimensional IRT models to ability measurement, it must be assumed that individuals at the same level of performance are in fact comparable. That is, individuals with the same estimate on the latent trait are interchangeable and thus can have identical interpretations given to their scores. Such interpretations, of course, depend on the construct validity of the trait measure, which includes both the construct representation and nomothetic span aspects (Embretson, 1983). The construct representation aspect of validity involves the meaning of the construct, in terms of the processes, strategies, and knowledge that are directly involved in performance. Construct representation is an aspect of internal validity. The nomothetic span aspect concerns the significance of the trait, which includes the relationship of trait scores to other measures, demographics, and criteria. Nomothetic span is primarily the external validity of the measure.
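
The interchangeability assumption mentioned above can be made concrete with a minimal Rasch model sketch. The numbers are illustrative, not from the chapter: under a unidimensional Rasch model, two examinees with the same latent trait estimate receive identical predicted response probabilities on every item, which is exactly the comparability that mixture models relax.

```python
# Minimal sketch of the unidimensional Rasch model assumption: equal
# theta estimates imply identical predicted probabilities on all items.
import numpy as np

def rasch_p(theta, b):
    """P(correct) under the Rasch model: logistic(theta - b)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

item_difficulties = np.array([-1.0, 0.0, 1.5])  # illustrative b-parameters

theta_a = 0.5  # examinee A
theta_b = 0.5  # examinee B, same estimate -> treated as interchangeable

print(rasch_p(theta_a, item_difficulties))  # identical vectors:
print(rasch_p(theta_b, item_difficulties))  # [0.818 0.622 0.269]
```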

Multidimensional Measurement from Dynamic Tests: Abstract Reasoning Under Stress

Multivariate Behavioral Research, Oct 1, 2000

Designing Cognitive Complexity in Mathematical Problem-Solving Items

Applied Psychological Measurement, Jun 1, 2010

Psychological measurement: Scaling and analysis

American Psychological Association eBooks, 2023

Understanding examinees' item responses through cognitive modeling of response accuracy and response times

Large-scale Assessments in Education, Mar 21, 2023

Modern Measurement in the Social Sciences

SAGE Publications Ltd eBooks, Apr 30, 2012

FOCUS ARTICLE: The Second Century of Ability Testing: Some Predictions and Speculations

Measurement: Interdisciplinary Research & Perspective, 2004

Implications of a multidimensional latent trait model for measuring change

American Psychological Association eBooks, 1991

Construct Validity and Cognitive Diagnostic Assessment

Cambridge University Press eBooks, May 14, 2007

Confirmatory factor analysis of four general neuropsychological models with a modified Halstead-Reitan battery

Journal of Clinical Neuropsychology, Jun 1, 1983

Four theoretical factor models for a modified Halstead-Reitan battery were formulated, drawing from previous work by Swiercinsky, Royce and co-workers, Christensen and Luria, and Lezak. The relative explanatory power of these four models for this particular battery in an adult neuropsychiatric population was examined using confirmatory factor analysis. None of the models was shown to fit adequately in an absolute sense, but three of them represented substantial, statistically reliable improvements over a null model of mutual independence, and a clear pattern of relative fit was observed. Further improvements were achieved by modifying the best fitting initial model in several ways. A cross-validation with an independent sample supported the results of the model development step. Tentative theoretical and clinical implications for the overall organization of the neuropsychological abilities measured by this battery were drawn, and recommendations were made for further application of this method in neuropsychological research.
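
The comparison against a null model of mutual independence described above follows standard chi-square difference logic, sketched below. The fit statistics are invented placeholders, not values from the study; the point is only the mechanics of the test.

```python
# Hedged sketch of the model-comparison logic: a hypothesized CFA model
# versus a null (independence) model via a chi-square difference test.
# Fit statistics below are made-up placeholders, not the study's values.
from scipy.stats import chi2

null_chisq, null_df = 1450.0, 120    # null model of mutual independence
model_chisq, model_df = 610.0, 98    # hypothesized factor model

diff_chisq = null_chisq - model_chisq
diff_df = null_df - model_df
p_value = chi2.sf(diff_chisq, diff_df)

print(f"chi-square difference = {diff_chisq:.1f} on {diff_df} df, "
      f"p = {p_value:.3g}")
```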

Item Difficulty Modeling of Paragraph Comprehension Items

Applied Psychological Measurement, Sep 1, 2006

The continued search for nonarbitrary metrics in psychology

American Psychologist, 2006

H. Blanton and J. Jaccard examined the arbitrariness of metrics in the context of 2 current issues: (a) the measurement of racial prejudice and (b) the establishment of clinically significant change. According to Blanton and Jaccard, although research findings are not undermined by arbitrary metrics, individual scores and score changes may not be meaningfully interpreted. The author believes that their points are mostly valid and that their examples were appropriate. However, Blanton and Jaccard's article does not lead directly to solutions, nor did it adequately describe the scope of the metric problem. This article has 2 major goals. First, some prerequisites for nonarbitrary metrics are presented and related to Blanton and Jaccard's issues. Second, the impact of arbitrary metrics on psychological research findings is described. In contrast to Blanton and Jaccard (2006), research findings suggest that metrics have a direct impact on statistics for group comparisons and trend analysis.
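
The claim that metrics affect statistics for group comparisons can be demonstrated with a minimal sketch: two groups show identical gains on one metric, but a monotone rescaling of the same scores produces unequal gains. All numbers below are invented for demonstration and are not from the article.

```python
# Illustrative sketch: a monotone transformation of the same scores
# changes a group-by-time comparison. Data are invented placeholders.
import numpy as np

# Two groups, pretest and posttest, each with raw-score gains of 1.0.
group_low  = {"pre": np.array([1.0, 2.0, 3.0]), "post": np.array([2.0, 3.0, 4.0])}
group_high = {"pre": np.array([5.0, 6.0, 7.0]), "post": np.array([6.0, 7.0, 8.0])}

def mean_gain(g, f=lambda x: x):
    """Mean pre-to-post gain after applying a score transformation f."""
    return f(g["post"]).mean() - f(g["pre"]).mean()

# On the raw metric, the gains are identical.
print(mean_gain(group_low), mean_gain(group_high))  # 1.0 1.0

# After a monotone transformation (squaring), an apparent group-by-time
# interaction emerges from the same underlying responses.
print(mean_gain(group_low, np.square),
      mean_gain(group_high, np.square))             # 5.0 13.0
```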

Item Response Theory

Psychology Press eBooks, Sep 5, 2013

The new rules of measurement

Psychological Assessment, Dec 1, 1996

Diagnosing Student Node Mastery: Impact of Varying Item Response Modeling Approaches

Frontiers in Education, Sep 27, 2021

Modeling Item Difficulty in a Dynamic Test

Journal of Cognitive Education and Psychology, Oct 1, 2020

The Concept Formation subtest of the Woodcock-Johnson Tests of Cognitive Abilities represents a dynamic test due to the continual provision of feedback from examiner to examinee. Yet the original scoring protocol for the test largely ignores this dynamic structure. The current analysis applies a dynamic adaptation of an explanatory item response theory model to evaluate the impact of feedback on item difficulty. Additionally, several item features (rule type, number of target shapes) are considered in the item difficulty model. Results demonstrated that all forms of feedback significantly reduced item difficulty, with the exception of corrective feedback that could not be directly applied to the next item in the series. More complex and compound rule types also significantly predicted item difficulty, as did increasing the number of shapes, thereby supporting the response process aspect of validity. Implications for continued use of the Concept Formation subtest for educational programming decisions are discussed.
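
The explanatory IRT approach described above, in which item difficulty is decomposed into item features, can be sketched in LLTM (linear logistic test model) form. The feature matrix, weights, and their signs below are hypothetical placeholders, chosen only to mirror the direction of the abstract's findings (feedback lowers difficulty; compound rules and more shapes raise it); they are not the study's estimates.

```python
# Sketch of an LLTM-style item difficulty model: b_i = sum_k q_ik * eta_k,
# with Rasch response probabilities. All values are hypothetical.
import numpy as np

# Item feature matrix Q: [received_feedback, compound_rule, n_shapes]
Q = np.array([
    [1, 0, 2],
    [0, 1, 3],
    [1, 1, 4],
], dtype=float)

# Assumed feature weights: feedback reduces difficulty; complex rules
# and additional shapes increase it.
eta = np.array([-0.8, 0.6, 0.3])

b = Q @ eta       # model-implied item difficulties
theta = 0.0       # an examinee of average ability

p_correct = 1.0 / (1.0 + np.exp(-(theta - b)))
for bi, pi in zip(b, p_correct):
    print(f"difficulty {bi:+.2f} -> P(correct) = {pi:.2f}")
```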
