Susan Embretson - Academia.edu
Papers by Susan Embretson
Oxford University Press eBooks, 2000
Springer Proceedings in Mathematics & Statistics, 2021
A wide variety of models have been developed for item response times. The models vary in both primary purpose and underlying assumptions. As noted by van der Linden (2016), several item response models assume response time and response accuracy are highly dependent processes. However, the nature of this assumed relationship varies substantially between models; that is, greater accuracy may be associated with either increased or decreased response time. In addition to these conflicting assumptions, examinees may differ in their relative response times across items. In the current study, the relationship of item log response times to item differences in difficulty and content was examined within subjects. Although mean item log response time was positively correlated with difficulty at the item level, a broad distribution of these correlations was found within subjects, ranging from positive to negative. These results indicate that existing models may be differentially effective depending on examinees’ predominant strategy in item solving.
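As a rough illustration of the within-subject analysis this abstract describes, the Python sketch below simulates examinees whose log response times relate to item difficulty through person-specific slopes, then contrasts the item-level correlation with the distribution of within-person correlations. All sample sizes, difficulties, and slope values here are hypothetical, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 200, 30

# Hypothetical item difficulties on a logit-like scale.
difficulty = np.linspace(-2.0, 2.0, n_items)

# Each examinee gets an idiosyncratic slope linking difficulty to log
# response time; slopes span negative to positive, so the within-person
# correlations should too.
slopes = rng.normal(loc=0.2, scale=0.5, size=n_persons)
log_rt = (slopes[:, None] * difficulty[None, :]
          + rng.normal(scale=0.6, size=(n_persons, n_items)))

# Within-person correlation of log response time with item difficulty.
within = np.array([np.corrcoef(log_rt[p], difficulty)[0, 1]
                   for p in range(n_persons)])

# Item-level view: mean log response time per item versus difficulty.
item_level = np.corrcoef(log_rt.mean(axis=0), difficulty)[0, 1]

print(f"item-level correlation: {item_level:.2f}")
print(f"within-person correlations: min={within.min():.2f}, "
      f"median={np.median(within):.2f}, max={within.max():.2f}")
```

Under these settings the item-level correlation is positive while individual examinees range from negative to positive, mirroring the pattern the abstract reports.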
Cambridge University Press eBooks, Nov 1, 2004
Educational and Psychological Measurement, Feb 1, 1997
The Spatial Learning Ability Test (SLAT) was designed from a cognitive processing theory to measure specified aspects of spatial processing. Mathematical modeling of item difficulties and response times has supported SLAT as involving primarily spatial processing. Here, four studies on the factorial validity of SLAT are summarized to elaborate nomothetic span. SLAT was the highest-loading test on the spatial ability factor in the context of either simple or complex spatial tests. Further, SLAT was less related to verbal-analytic coding skills than many other spatial tests, including a test that contained the same item type. Thus, consistent with the construct representation of SLAT as involving spatial processing, the factorial validity studies indicate that SLAT is a purer measure of spatial ability.
Springer eBooks, 2007
To apply standard unidimensional IRT models to ability measurement, it must be assumed that individuals at the same level of performance are in fact comparable. That is, individuals with the same estimate on the latent trait are interchangeable and thus can have identical interpretations given to their scores. Such interpretations, of course, depend on the construct validity of the trait measure, which includes both the construct representation and nomothetic span aspects (Embretson, 1983). The construct representation aspect of validity involves the meaning of the construct in terms of the processes, strategies, and knowledge that are directly involved in performance; it is an aspect of internal validity. The nomothetic span aspect concerns the significance of the trait, which includes the relationship of trait scores to other measures, demographics, and criteria; it is primarily the external validity of the measure.
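For concreteness, one standard unidimensional model of the kind at issue here is the Rasch model, under which the probability of a correct response depends only on the difference between a single person parameter and an item parameter:

```latex
P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}
```

Because \theta_j is the only person parameter, two examinees with equal trait estimates are treated as fully interchangeable, which is exactly the comparability assumption described above.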
Multivariate Behavioral Research, Oct 1, 2000
Applied Psychological Measurement, Jun 1, 2010
American Psychological Association eBooks, 2023
Large-scale Assessments in Education, Mar 21, 2023
SAGE Publications Ltd eBooks, Apr 30, 2012
Measurement: Interdisciplinary Research & Perspectives, 2004
American Psychological Association eBooks, 1991
Cambridge University Press eBooks, May 14, 2007
Journal of Clinical Neuropsychology, Jun 1, 1983
Four theoretical factor models for a modified Halstead-Reitan battery were formulated, drawing from previous work by Swiercinsky, Royce and co-workers, Christensen and Luria, and Lezak. The relative explanatory power of these four models for this particular battery in an adult neuropsychiatric population was examined using confirmatory factor analysis. None of the models was shown to fit adequately in an absolute sense, but three of them represented substantial, statistically reliable improvements over a null model of mutual independence, and a clear pattern of relative fit was observed. Further improvements were achieved by modifying the best-fitting initial model in several ways. A cross-validation with an independent sample supported the results of the model development step. Tentative theoretical and clinical implications for the overall organization of the neuropsychological abilities measured by this battery were drawn, and recommendations were made for further application of this method in neuropsychological research.
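The comparisons against a null model of mutual independence described above follow the standard nested-model logic of a chi-square difference (likelihood-ratio) test. The sketch below shows that generic computation; the fit statistics plugged in are hypothetical, not values from the study.

```python
from scipy.stats import chi2

def chisq_difference(chisq_restricted, df_restricted, chisq_full, df_full):
    """Chi-square difference test for nested covariance structure models.

    The restricted model (e.g., mutual independence of all subtests)
    must be nested in the fuller factor model.
    """
    delta_chisq = chisq_restricted - chisq_full
    delta_df = df_restricted - df_full
    p_value = chi2.sf(delta_chisq, delta_df)
    return delta_chisq, delta_df, p_value

# Hypothetical fit statistics: independence model vs. a factor model.
d_chi, d_df, p = chisq_difference(chisq_restricted=900.0, df_restricted=120,
                                  chisq_full=420.0, df_full=98)
print(f"delta chi-square = {d_chi:.1f} on {d_df} df, p = {p:.3g}")
```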
Applied Psychological Measurement, Sep 1, 2006
American Psychologist, 2006
H. Blanton and J. Jaccard examined the arbitrariness of metrics in the context of 2 current issues: (a) the measurement of racial prejudice and (b) the establishment of clinically significant change. According to Blanton and Jaccard, although research findings are not undermined by arbitrary metrics, individual scores and score changes may not be meaningfully interpreted. The author believes that their points are mostly valid and that their examples were appropriate. However, Blanton and Jaccard's article does not lead directly to solutions, nor did it adequately describe the scope of the metric problem. This article has 2 major goals. First, some prerequisites for nonarbitrary metrics are presented and related to Blanton and Jaccard's issues. Second, the impact of arbitrary metrics on psychological research findings is described. In contrast to Blanton and Jaccard (2006), research findings suggest that metrics have a direct impact on statistics for group comparisons and trend analysis.
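A toy numerical example (not from the article) of how an arbitrary metric can affect group-comparison and trend statistics: a strictly monotone rescaling of the scores preserves every ordinal conclusion, yet can manufacture a group-by-time interaction where the raw metric shows none.

```python
import numpy as np

# Two groups measured before and after an intervention on a raw metric;
# on the raw scale both groups gain exactly 10 points (no interaction).
pre = np.array([20.0, 80.0])   # group A starts low, group B starts high
post = pre + 10.0

def rescale(x):
    # A strictly monotone (order-preserving) transform, standing in for
    # an equally defensible but arbitrary alternative scoring of the
    # same construct.
    return np.sqrt(x)

print("raw gains:     ", post - pre)                    # [10. 10.]
print("rescaled gains:", rescale(post) - rescale(pre))  # unequal gains
```

On the rescaled metric the low-scoring group appears to gain nearly twice as much as the high-scoring group, so the group-by-time interaction depends entirely on the choice of metric.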
Psychology Press eBooks, Sep 5, 2013
Psychological Assessment, Dec 1, 1996
Frontiers in Education, Sep 27, 2021
Journal of Cognitive Education and Psychology, Oct 1, 2020
The Concept Formation subtest of the Woodcock-Johnson Tests of Cognitive Abilities represents a dynamic test due to the continual provision of feedback from examiner to examinee. Yet the original scoring protocol for the test largely ignores this dynamic structure. The current analysis applies a dynamic adaptation of an explanatory item response theory model to evaluate the impact of feedback on item difficulty. Additionally, several item features (rule type, number of target shapes) are considered in the item difficulty model. Results demonstrated that all forms of feedback significantly reduced item difficulty, with the exception of corrective feedback that could not be directly applied to the next item in the series. More complex and compound rule types also significantly predicted item difficulty, as did increasing the number of shapes, thereby supporting the response process aspect of validity. Implications for continued use of the Concept Formation subtest for educational programming decisions are discussed.
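A minimal sketch of the explanatory (LLTM-style) difficulty decomposition this abstract describes: item difficulty is modeled as a weighted sum of item features, here including feedback applicability, rule complexity, and number of target shapes. The Q-matrix entries and feature weights below are hypothetical, not estimates from the study.

```python
import numpy as np

# Q-matrix: one row per item; columns are hypothetical item features
# (intercept, applicable feedback from the prior item, compound rule,
# number of target shapes).
Q = np.array([
    [1, 0, 0, 1],   # early item: no prior feedback, simple rule, 1 shape
    [1, 1, 0, 1],   # feedback applicable from the previous item
    [1, 1, 1, 2],   # compound rule, 2 shapes
    [1, 0, 1, 3],   # feedback not applicable, compound rule, 3 shapes
], dtype=float)

# Hypothetical feature weights (eta): negative values make items easier,
# positive values make them harder.
eta = np.array([0.5, -0.8, 0.9, 0.3])

# LLTM-style decomposition: difficulty is a weighted sum of features.
b = Q @ eta

# Rasch-form probability of success for a person of ability theta.
theta = 0.0
p_correct = 1.0 / (1.0 + np.exp(-(theta - b)))

for i, (bi, pi) in enumerate(zip(b, p_correct), start=1):
    print(f"item {i}: difficulty = {bi:+.2f}, "
          f"P(correct | theta=0) = {pi:.2f}")
```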