Linking Composite Scores: Effects of Anchor Test Length and Content Representativeness

A Note on the Choice of an Anchor Test in Equating

ETS Research Report Series, 2020

Since its 1947 founding, ETS has conducted and disseminated scientific research to support its products and services, and to advance the measurement and education fields. In keeping with these goals, ETS is committed to making its research freely available to the professional community and to the general public. Published accounts of ETS research, including papers in the ETS Research Report series, undergo a formal peer-review process by ETS staff to ensure that they meet established scientific and professional standards. All such ETS-conducted peer reviews are in addition to any reviews that outside organizations may provide as part of their own publication processes.

Abstract: Anchor tests play a key role in test score equating. We attempt to find, through theoretical derivations, an anchor test with optimal item characteristics. The correlation between the scores on a total test and on an anchor test is maximized with respect to the item pa...
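The truncated abstract describes choosing anchor items so that the correlation between anchor scores and total-test scores is maximized. As a hedged illustration only (not the report's actual derivation), the sketch below simulates dichotomous item responses under a simple Rasch-style model and compares the anchor-total correlation for two hypothetical five-item anchors; all data, item counts, and anchor choices are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 0/1 responses: 500 examinees x 20 items (hypothetical data).
n_examinees, n_items = 500, 20
ability = rng.normal(size=(n_examinees, 1))
difficulty = rng.normal(size=(1, n_items))
prob = 1.0 / (1.0 + np.exp(-(ability - difficulty)))   # Rasch-style response model
responses = (rng.random((n_examinees, n_items)) < prob).astype(int)

total = responses.sum(axis=1)

def anchor_total_correlation(item_idx):
    """Correlation between an anchor subset's sum score and the total score."""
    anchor = responses[:, item_idx].sum(axis=1)
    return np.corrcoef(anchor, total)[0, 1]

# Compare two 5-item anchors: one spread across the difficulty range,
# one drawn from the easiest items only.
order = np.argsort(difficulty.ravel())
spread_anchor = order[::4][:5]   # every 4th item in difficulty order
narrow_anchor = order[:5]        # the five easiest items

print(anchor_total_correlation(spread_anchor))
print(anchor_total_correlation(narrow_anchor))
```

Comparing candidate anchors this way mirrors, in miniature, the idea of treating the anchor-total correlation as the quantity to optimize.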

Test Score Reporting: Perspectives From the ETS Score Reporting Conference

ETS Research Report Series, 2011

November 4th, 2010, to explore some issues that influence score reports and new advances that contribute to the effectiveness of these reports. Jessica Hullman, Rebecca Rhodes, Fernando Rodriguez, and Priti Shah present the results of recent research on graph comprehension and data interpretation, especially the role of presentation format, the impact of prior quantitative literacy and domain knowledge, the trade-off between reducing cognitive load and increasing active processing of data, and the affective influence of graphical displays. Rebecca Zwick and Jeffrey Sklar present the results of the Instructional Tools in Educational Measurement and Statistics for School Personnel (ITEMS) project, funded by the National Science Foundation and conducted at the University of California, Santa Barbara, to develop and evaluate three web-based instructional modules intended to help educators interpret test scores. Zwick and Sklar discuss the modules and the procedures used to evaluate their effectiveness. Diego Zapata-Rivera presents a new framework for designing and evaluating score reports, based on work on designing and evaluating score reports for particular audiences in the context of the CBAL (Cognitively Based Assessment of, for, and as Learning) project (Bennett & Gitomer, 2009), which has been applied in the development and evaluation of reports for various audiences including teachers, administrators, and students.

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

ETS Research Report Series, 2013


A Practitioner's Introduction to Equating with Primers on Classical Test Theory and Item Response Theory

Council of Chief State School Officers, 2009

The Council of Chief State School Officers (CCSSO) is a nonpartisan, nationwide, nonprofit organization of public officials who head departments of elementary and secondary education in the states, the District of Columbia, the Department of Defense Education Activity, and five U.S. extra-state jurisdictions. CCSSO provides leadership, advocacy, and technical assistance on major educational issues. The Council seeks member consensus on major educational issues and expresses its views to civic and professional organizations, federal agencies, Congress, and the public. The State Collaborative on Assessment and Student Standards (SCASS) projects were created in 1991 by the Council of Chief State School Officers to encourage and assist states in working collaboratively on assessment design and development for a variety of topics and subject areas. These projects are organized and facilitated within CCSSO by the Division of State Services and Technical Assistance. Technical Issues in Large-Scale Assessment (TILSA) is part of the SCASS project, whose mission is to provide leadership, advocacy, and service by focusing on critical research in the design, development, and implementation of standards-based assessment systems that measure the achievement of all students.
TILSA addresses state needs for technical information about large-scale assessment by providing structured opportunities for members to share expertise, learn from each other, and network on technical issues related to valid, reliable, and fair assessment; designing and carrying out research that reflects common needs across states; arranging presentations by experts in the field of educational measurement on current issues affecting the implementation and development of state programs; and developing handbooks and guidelines for implementing various aspects of standards-based assessment systems. This long-standing partnership has conducted a wide variety of research over the years into critical issues affecting the technically sound administration of K-12 assessments, including research on equating, setting cut-scores, consequential validity, generalizability theory, use of multiple measures, alignment of assessments to standards, accommodations, assessing English language learners, and the reliability of aggregate data. In addition, TILSA has provided professional development on critical topics in measurement for its members. The partnership has developed technical reports on each topic it researched and has produced guidelines for developing state assessment programs.

Estimating the Reliability of Composite Scores

The Office of Qualifications and Examinations Regulation (Ofqual) has initiated a research programme looking at the reliability of results from national tests, examinations and qualifications in England. The aim of the programme is to gather evidence to inform Ofqual in developing policy on reliability from a regulatory perspective, with a view to further improving the quality of the assessment systems. As part of the Ofqual reliability programme, this study, through a review of the literature, attempts to: look at the different approaches that are employed to form composite scores from component or unit scores; investigate the implications of the different approaches for the psychometric properties, particularly the reliability, of the composite scores; and identify procedures that are commonly used to estimate the reliability of composite scores. This report summarizes the procedures developed for classical test theory (CTT), generalizability theory (G-theory) and item response theory (IRT) that are widely used for studying the reliability of composite scores composed of weighted scores from component tests. The report is intended for use as a reference by researchers and test developers working in the field of educational measurement.
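Among the CTT procedures such a review typically covers is the standard formula for the reliability of a weighted composite C = Σ w_i X_i, namely ρ_C = 1 − Σ w_i² σ_i² (1 − ρ_i) / σ_C², where σ_C² is the variance of the composite. The sketch below implements that formula; the three-component battery, its weights, standard deviations, reliabilities, and intercorrelations are invented for illustration and do not come from the report.

```python
import numpy as np

def composite_reliability(weights, sds, reliabilities, corr):
    """CTT reliability of a weighted composite of component scores.

    rho_C = 1 - sum_i w_i^2 * sd_i^2 * (1 - rho_i) / var_C,
    where var_C = w' Sigma w and Sigma is the component covariance
    matrix built from the SDs and their intercorrelations.
    """
    w = np.asarray(weights, dtype=float)
    sd = np.asarray(sds, dtype=float)
    rho = np.asarray(reliabilities, dtype=float)
    cov = np.outer(sd, sd) * np.asarray(corr, dtype=float)
    var_c = w @ cov @ w                              # composite variance
    error_var = np.sum(w**2 * sd**2 * (1.0 - rho))   # weighted error variance
    return 1.0 - error_var / var_c

# Hypothetical three-component battery (all numbers invented).
w = [1.0, 1.0, 2.0]
sd = [10.0, 12.0, 8.0]
rel = [0.85, 0.80, 0.90]
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.55],
        [0.5, 0.55, 1.0]]

print(round(composite_reliability(w, sd, rel, corr), 3))   # prints 0.932
```

Because positively correlated components pool true-score variance, the composite reliability here exceeds each component's individual reliability, which is the usual motivation for reporting composites.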

Uncommon measures: Equivalence and linkage among educational tests

1999

A study was conducted of the feasibility of establishing an equivalency scale that would enable commercial and state tests to be linked to one another and to the National Assessment of Educational Progress (NAEP). In evaluating the feasibility of linkages, the study committee focused on the linkage of various fourth-grade reading tests and of various eighth-grade mathematics tests. Committee members concentrated on the factors that affect the validity of the inferences about student performance that users would draw from the linked test scores. The committee concluded that comparing the full array of currently administered commercial and state achievement tests to one another, through the development of a single equivalency or linking scale, is not feasible. Nor is it feasible to report individual student scores from the full array of tests on the NAEP scale or to transform individual scores on these tests and assessments into NAEP achievement levels. Under limited conditions, it may be possible to calculate a linkage between two tests, but multiple factors affect the validity of the inferences drawn from the linked scores. Unless the test to be linked to the NAEP is very similar in content, format, and uses, the resulting linkage is likely to be unstable and potentially misleading. (Contains 3 tables, 3 figures, and 81 references.)
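The "limited conditions" linkage the committee mentions can be illustrated with the simplest case, mean-sigma linear linking between two forms administered to groups assumed equivalent. The sketch below is a minimal example under that assumption; the score samples are invented and far too small for real concordance work, which would also need a defensible data-collection design.

```python
import numpy as np

def linear_linking(x_scores, y_scores):
    """Mean-sigma linear linking: map Form X scores onto the Form Y scale.

    y = mu_Y + (sigma_Y / sigma_X) * (x - mu_X)
    """
    x = np.asarray(x_scores, dtype=float)
    y = np.asarray(y_scores, dtype=float)
    slope = y.std(ddof=1) / x.std(ddof=1)
    intercept = y.mean() - slope * x.mean()
    return lambda score: slope * score + intercept

# Hypothetical score samples from two randomly equivalent groups.
x = [40, 45, 50, 55, 60]
y = [52, 56, 60, 64, 68]

to_y_scale = linear_linking(x, y)
print(to_y_scale(50))   # the Form X mean maps to the Form Y mean: 60.0
```

Even in this toy case the mapping is only as trustworthy as the equivalent-groups assumption behind it, which is the committee's central caution: the arithmetic of a linkage is easy, but the validity of inferences from it is not.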

A Design Framework for the ELTeach Program Assessments

ETS Research Report Series, 2014


The Validity of Inferences From Locally Developed Assessments Administered Globally

ETS Research Report Series, 2018
