Examining Method Effect of Synonym and Antonym Test in Verbal Abilities Measure

Comparing the Performance of Synonym and Antonym Tests in Measuring Verbal Abilities

2016

This study investigates whether synonym and antonym tests measure similar domains of verbal abilities and have comparable psychometric performance. The data used in this study are subsets of the data collected during 2013-2014 graduate admission testing at Gadjah Mada University (UGM), using three forms of the Potensi Akademik Pascasarjana (PAPS) [Graduate Academic Aptitude Test]. Confirmatory factor analysis revealed that synonym and antonym tests assess similar domains of verbal abilities: a model integrating items from both tests to represent a single dimension explained the data better than a model separating the two tests into manifestations of different dimensions. High correlations among dimensions in the unidimensional model showed that domains of verbal abilities such as verbal knowledge, comprehension, and reasoning are interrelated. Additional item-level analysis showed that antonym items tended to be more difficult than synonym items. This finding indicates...
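
To make the model-comparison logic concrete, here is a minimal sketch of a chi-square difference test between nested CFA models, of the kind used to decide between a one-dimensional and a two-dimensional account of the synonym and antonym items. The fit statistics below are hypothetical placeholders, not values from the study.

```python
# Compare a constrained one-factor CFA (synonym and antonym items on a single
# verbal dimension) against a two-factor CFA via a chi-square difference test.
# The fit statistics here are invented for illustration.
from scipy.stats import chi2

chisq_one_factor, df_one_factor = 206.2, 104   # more constrained model
chisq_two_factor, df_two_factor = 205.1, 103   # frees a second dimension

delta_chisq = chisq_one_factor - chisq_two_factor
delta_df = df_one_factor - df_two_factor
p_value = chi2.sf(delta_chisq, delta_df)

# A non-significant difference means the simpler unidimensional model explains
# the data about as well as the two-dimensional one.
print(f"delta chi2 = {delta_chisq:.1f}, df = {delta_df}, p = {p_value:.3f}")
```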

Identifying dimensions of vocabulary knowledge in the word associates test

Vocabulary learning and instruction, 2012

Depth of vocabulary knowledge (DVK), i.e., how much a learner knows about the words he or she knows, is typically conceptualized as a psychologically multidimensional construct comprising various forms of word knowledge. Read's Word Associates Test (WAT) is the most common test of DVK in the literature, assessing knowledge of words' synonyms and collocates. Although the WAT aims to measure two dimensions of vocabulary knowledge, no studies to date have investigated whether these dimensions are psychometrically distinct. The present study seeks to fill that gap. A WAT developed by David Qian, with established reliability and validity, was administered to 530 Japanese university English majors. Confirmatory factor analysis was employed to investigate the psychometric dimensionality of the WAT. A bifactor model, in which the primary explanatory factor is a vocabulary g-factor with additional, uncorrelated factors for synonym and collocate items, demonstrated the best fit. This finding implies that although these dimensions of DVK may be somewhat distinct, they are largely subsumed by general vocabulary knowledge.
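
As a rough illustration of the bifactor structure the abstract describes, the sketch below builds a loading matrix in which every item loads on a general vocabulary factor while synonym and collocate items each get an additional orthogonal group factor, then computes the model-implied covariance matrix. All loading values are invented for illustration.

```python
import numpy as np

n_syn, n_col = 3, 3                      # toy test: 3 synonym + 3 collocate items

# Columns of Lambda: [general vocabulary factor, synonym factor, collocate factor]
Lambda = np.zeros((n_syn + n_col, 3))
Lambda[:, 0] = 0.6                       # all items load on the g-factor
Lambda[:n_syn, 1] = 0.3                  # synonym group-factor loadings
Lambda[n_syn:, 2] = 0.3                  # collocate group-factor loadings

# With orthogonal factors (Phi = I), the model-implied covariance is
# Sigma = Lambda Lambda' + Theta, where Theta holds the unique variances.
Theta = np.diag(1.0 - (Lambda ** 2).sum(axis=1))
Sigma = Lambda @ Lambda.T + Theta

# Items of different types share 0.36 (via g); same-type items share 0.45.
print(np.round(Sigma, 2))
```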

The Contribution of a Response‐Production Component to a Free‐Response Synonym Task

Journal of Educational Measurement, 1996

A cognitive approach to the study of format differences is illustrated using synonym tasks. By means of a multiple regression analysis with latent variables, it is shown that both a response-production component and an evaluation component are involved in answering a free-response synonym task. Given the results of Janssen, De Boeck, and Vander Steene (1996), the format differences between the multiple-choice evaluation task and the free-response synonym task can be explained in terms of the kinds of verbal abilities measured: the evaluation task is a pure measure of verbal comprehension, while the free-response synonym task is affected by both verbal comprehension and verbal fluency. The design used to study format differences controls both for content effects and for the effects of repeating item stems across formats.
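
As an illustration of the regression logic (not the authors' data or estimates), the simulation below generates comprehension and fluency abilities, makes a free-response score depend on both while an evaluation score depends on comprehension alone, and then recovers the invented weights by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
comprehension = rng.standard_normal(n)
fluency = rng.standard_normal(n)

# Invented structural weights: the evaluation task taps comprehension only;
# the free-response task taps both comprehension and fluency.
mc_evaluation = 0.8 * comprehension + 0.6 * rng.standard_normal(n)
free_response = 0.6 * comprehension + 0.5 * fluency + 0.6 * rng.standard_normal(n)

# Recover the weights with ordinary least squares.
X = np.column_stack([comprehension, fluency])
beta_fr, *_ = np.linalg.lstsq(X, free_response, rcond=None)
beta_mc, *_ = np.linalg.lstsq(X, mc_evaluation, rcond=None)

print("free-response weights (comprehension, fluency):", beta_fr.round(2))
print("evaluation-task weights (comprehension, fluency):", beta_mc.round(2))
```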

An Investigation of the Effect of Correlated Abilities on Observed Test Characteristics

1984

To assess the effects of correlated abilities on test characteristics, and to explore the effects of correlated abilities on the use of a multidimensional item response theory model that does not explicitly account for such a correlation, two tests were constructed: one had two relatively unidimensional subsets of items; the other consisted entirely of two-dimensional items. For each test, response data were generated according to a multidimensional two-parameter logistic model using four groups of 2000 simulated examinees, differing in the degree of inter-dimension ability correlation. To evaluate the effects on observed test characteristics, the simulated response data were analyzed using item analysis and factor analysis techniques. To assess the effects on the use of the multidimensional model, the model parameters were estimated and compared to the true parameters. Results of the study indicated that the presence of correlated abilities has important implications: it is necessary to consider latent item structure as well as latent ability structure in test construction and analysis, and use of multidimensional item response theory models that do not explicitly account for correlated abilities may result in misinterpretation of the underlying dimensions. Research is needed to determine the nature of the misinterpretation and perhaps to develop an item response theory analogue to factor rotation.
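
The data-generation step described above can be sketched as follows: draw correlated abilities from a bivariate normal distribution, then score items under a compensatory multidimensional two-parameter logistic model. The parameter ranges are illustrative, not those of the study.

```python
import numpy as np

rng = np.random.default_rng(42)
n_examinees, n_items = 2000, 40
rho = 0.6                                      # inter-dimension ability correlation

# Correlated two-dimensional abilities.
theta = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]],
                                size=n_examinees)

# Per-item discrimination vector (one entry per dimension) and intercept.
a = rng.uniform(0.5, 1.5, size=(n_items, 2))
d = rng.uniform(-1.0, 1.0, size=n_items)

# Compensatory M2PL: P(correct) = logistic(a1*theta1 + a2*theta2 + d).
p_correct = 1.0 / (1.0 + np.exp(-(theta @ a.T + d)))
responses = (rng.uniform(size=p_correct.shape) < p_correct).astype(int)

print("mean proportion correct:", responses.mean().round(3))
```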

An Item Response Theory analysis of the Verb Subordinates Test

2019

This study presents a psychometric evaluation of the Verb Subordinates Test (VST). The VST assesses lexical competence based on knowledge of troponyms in the verb lexicon. Items are true/false statements with the structure "To [hyponym verb] is a way to [hypernym verb]." Using Item Response Theory (IRT) analysis, this study examined the difficulty and discriminatory value of individual items and the difficulty levels of the VST. Statistical analyses showed that the VST is a promising vocabulary assessment measure with high internal consistency and good convergent validity, and that individual VST items, given their frequency range, are differentially informative across the vocabulary trait continuum.
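
The quantities such an IRT evaluation reports can be sketched with the 2PL item characteristic curve and item information function, which show how items of different difficulty are informative at different points of the vocabulary trait continuum. The parameter values below are illustrative, not estimates from the VST.

```python
import numpy as np

def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = icc_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

theta = np.linspace(-3, 3, 7)
easy_info = item_information(theta, a=1.2, b=-1.0)   # peaks at low ability
hard_info = item_information(theta, a=1.2, b=1.5)    # peaks at high ability

for t, e, h in zip(theta, easy_info, hard_info):
    print(f"theta={t:+.1f}  easy-item info={e:.3f}  hard-item info={h:.3f}")
```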

A comparative examination of structural models of ability tests

Quality and Quantity, 1992

This study was conducted to examine whether the same underlying structure of ability tests emerges when three different data analysis methods are used. The sample consisted of 335 examinees who applied for vocational guidance and were administered a battery of 17 tests. A matrix of intercorrelations between scores, based on the number of correct answers, was obtained. The matrix was subjected to factor analysis, Guttman's smallest space analysis (SSA), and tree analysis, which resulted in essentially different structures. Comparisons were made, and the theoretical implications of the results are discussed in relation to various structural models of ability tests in the literature.
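
Of the three methods, factor analysis is the easiest to sketch: extract components from the intercorrelation matrix by eigendecomposition and read off the loadings. The toy correlation matrix below is invented; SSA and tree analysis are not shown.

```python
import numpy as np

# Toy 4-test intercorrelation matrix: tests 1-2 and tests 3-4 form two clusters.
R = np.array([
    [1.0, 0.7, 0.3, 0.2],
    [0.7, 1.0, 0.2, 0.3],
    [0.3, 0.2, 1.0, 0.6],
    [0.2, 0.3, 0.6, 1.0],
])

eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings of each test on the first two components.
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])
print("eigenvalues:", eigvals.round(2))
print("loadings:\n", loadings.round(2))
```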

Linking scores from two written receptive English academic vocabulary tests—The VLT-Ac and the AVT

Language Testing, 2023

The academic section of the Vocabulary Levels Test (VLT-Ac) and the Academic Vocabulary Test (AVT) both assess meaning-recognition knowledge of written receptive academic vocabulary, deemed central for engagement in academic activities. Depending on the purpose and context of the testing, either of the tests can be appropriate, but for research and pedagogical purposes, it is important to be able to compare scores achieved on the two tests between administrations and within similar contexts. Based on a sample of 385 upper secondary school students in university-preparatory programs (independent CEFR B2-level users of English), this study presents a comparison model by linking the VLT-Ac and the AVT using concurrent calibration procedures in Item Response Theory. The key outcome of the study is a score comparison table providing a means for approximate score comparisons. Additionally, the study showcases a viable and valid method of comparing vocabulary scores from an older test with those from a newer one.
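
Once concurrent calibration has put both tests' item parameters on a common scale, a score comparison table can be derived by inverting one test's test characteristic curve (TCC) and evaluating the other's at the same ability. The sketch below uses made-up 2PL parameters, not the calibrated VLT-Ac or AVT items.

```python
import numpy as np

rng = np.random.default_rng(7)

def tcc(theta_grid, a, b):
    """Expected raw score: the sum of 2PL item probabilities at each theta."""
    return (1.0 / (1.0 + np.exp(-a * (theta_grid[:, None] - b)))).sum(axis=1)

# Hypothetical calibrated item parameters for two tests on the same scale.
a1, b1 = rng.uniform(0.8, 1.6, 30), rng.normal(0.0, 1.0, 30)   # "test A"
a2, b2 = rng.uniform(0.8, 1.6, 40), rng.normal(0.2, 1.0, 40)   # "test B"

grid = np.linspace(-4, 4, 2001)
tcc_a, tcc_b = tcc(grid, a1, b1), tcc(grid, a2, b2)

# For each raw score on test A, find the closest theta on A's TCC, then read
# off the expected test B score at that theta.
for score_a in range(0, 31, 5):
    i = np.argmin(np.abs(tcc_a - score_a))
    print(f"test A score {score_a:2d} -> theta {grid[i]:+.2f} "
          f"-> expected test B score {tcc_b[i]:.1f}")
```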

Confirmatory Analyses of Componential Test Structure Using Multidimensional Item Response Theory

Multivariate Behavioral Research, 1999

The componential structure of synonym tasks is investigated using confirmatory multidimensional two-parameter IRT models. It was hypothesized that an open synonym task is decomposable into generating synonym candidates and evaluating these candidate words with respect to their synonymy with the stimulus word. Two subtasks were constructed to identify these two components. Different confirmatory models were estimated both with TESTMAP and with NOHARM. The componential hypothesis was supported, but it was found that the generation subtask also involved some evaluation and that generation and evaluation were highly correlated.
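
The confirmatory constraint pattern can be made concrete by fixing some discriminations to zero: evaluation-subtask items load only on the evaluation dimension, while generation-subtask items load on generation and, reflecting the finding above, partly on evaluation; the two abilities are allowed to correlate highly. All values are invented, and this is not the TESTMAP or NOHARM estimation itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons = 500
rho = 0.8                                 # high generation-evaluation correlation
theta = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n_persons)

# Discrimination matrix, columns = (generation, evaluation); the zeros encode
# the confirmatory restrictions.
a = np.array([
    [0.0, 1.2],   # evaluation item: evaluation only
    [0.0, 1.0],
    [1.1, 0.4],   # generation item: mostly generation, some evaluation
    [1.3, 0.5],
])
d = np.array([0.2, -0.1, -0.4, 0.0])

# Compensatory two-parameter logistic response probabilities and sampled data.
p = 1.0 / (1.0 + np.exp(-(theta @ a.T + d)))
responses = (rng.uniform(size=p.shape) < p).astype(int)
print("item proportions correct:", responses.mean(axis=0).round(2))
```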

Evidence of the Generalization and Construct Representation Inferences for the GRE® revised General Test Sentence Equivalence Item Type

The report is the first systematic evaluation of the sentence equivalence item type introduced by the GRE® revised General Test. We adopt a validity framework to guide our investigation based on Kane's approach to validation, whereby a hierarchy of inferences that should be documented to support score meaning and interpretation is evaluated. We present evidence relevant to the generalization inference as well as evidence of construct representation. We analyzed the pool of sentence equivalence items in three studies. The first and second studies focused on the generalization inference and sought to document the construction principles behind the sentence equivalence items, specifically the nature of the vocabulary tested. The third study focused on construct representation and evaluated the contribution of the stem, the keys, and the distractors to item difficulty. We concluded that the vocabulary tested by the sentence equivalence items is appropriate given the purpose of the GRE, namely, to assist in the selection of graduate students. The difficulty of the items was shown to be, in part, a function of the familiarity of the vocabulary as well as the context in which the vocabulary is tested, which we argue is positive validity evidence.

Effects of Positively and Negatively Worded Items on Estimated Latent Trait Scores

Psychological scales, e.g., anxiety, depression, and stress inventories, tend to combine positively and negatively worded items with ordered item responses on a Likert-type scale. The generalized partial credit model (GPCM) is often applied to ordinal response data, but little research uses the nominal response model (NRM) with these types of instruments. Preston, Reise, Cai, and Hays (2011) compared these models as applied to psychological scales, focusing on the item parameter estimates. We advance that study by comparing the latent trait estimates from the GPCM and the NRM on an instrument constructed with reverse-worded items. The purpose is to compare the estimated latent trait for the two models and for the subsets of positively or negatively worded items.
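
To show what the two models compute, the sketch below evaluates category probabilities for a single four-category Likert item at one trait value under the GPCM and the NRM. All parameters are invented; this is not the study's estimation procedure.

```python
import numpy as np

def gpcm_probs(theta, a, b):
    """GPCM: exponents are cumulative sums of a*(theta - b_k), with z_0 = 0."""
    z = np.cumsum(np.concatenate([[0.0], a * (theta - b)]))
    ez = np.exp(z - z.max())              # numerically stable softmax
    return ez / ez.sum()

def nrm_probs(theta, a_k, c_k):
    """NRM: each category k has its own slope a_k and intercept c_k
    (first category fixed to zero for identification)."""
    z = a_k * theta + c_k
    ez = np.exp(z - z.max())
    return ez / ez.sum()

theta = 0.5
print("GPCM:", gpcm_probs(theta, a=1.0, b=np.array([-1.0, 0.0, 1.0])).round(3))
print("NRM :", nrm_probs(theta,
                         a_k=np.array([0.0, 0.5, 1.0, 1.5]),
                         c_k=np.array([0.0, 0.4, 0.3, -0.2])).round(3))
```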