Format Effects of Empirically Derived Multiple-Choice Versus Free-Response Instruments When Assessing Graphing Abilities

Assessing students' abilities to construct and interpret line graphs: Disparities between multiple‐choice and free‐response instruments

Science Education, 1994

The author is concerned about the methodology and instrumentation used to assess both graphing abilities and the impact of microcomputer-based laboratories (MBL) on students' graphing abilities for four reasons: (1) the ability to construct and interpret graphs is critical for developing key ideas in science; (2) science educators need to have valid information for making teaching decisions; (3) educators and researchers are heralding the arrival of MBL as a tool for developing graphing abilities; and (4) some of the research which supports using MBL appears to have significant validity problems. In this article, the author will describe the research which challenges the validity of using multiple-choice instruments to assess graphing abilities. The evidence from this research will identify numerous disparities between the results of multiple-choice and free-response instruments. In the first study, 72 subjects in the seventh, ninth, and eleventh grades were administered individual clinical interviews to assess their ability to construct and interpret graphs. A wide variety of graphs and situations were assessed. In three instances during the interview, students drew a graph that would best represent a situation and then explained their drawings. The results of these clinical graphing interviews were very different from similar questions assessed through multiple-choice formats in other research studies. In addition, insights into students' thinking about graphing reveal that some multiple-choice graphing questions from prior research studies and standardized tests do not discriminate between right answers/right reasons, right answers/wrong reasons, and answers scored "wrong" but correct for valid reasons.
These results indicate that in some instances multiple-choice questions are not a valid measure of graphing abilities. In a second study, the researcher continued to pursue the questions raised about the validity of multiple-choice tests to assess graphing, researching the following questions: What can be learned about subjects' graphing abilities when students draw their own graphs compared to assessing by means of a multiple-choice instrument? Does the methodology used to assess graphing abilities: (1) affect the percentage of subjects who answer correctly; (2) alter the percentage of subjects affected by the "picture of the event" phenomenon? Instruments were constructed consisting of

A Comparison of Multiple-Choice and Constructed Figural Response Items

Journal of Educational Measurement, 1991

In contrast to multiple-choice test questions, figural response items call for constructed responses and rely upon figural material, such as illustrations and graphs, as the response medium. Figural response questions in various science domains were created and administered to a sample of 4th-, 8th-, and 12th-grade students. Item and test statistics from parallel sets of figural response and multiple-choice questions were compared. Figural response items were generally more difficult, especially for questions that were difficult (p < .5) in their constructed-response forms. Figural response questions were also slightly more discriminating and reliable than their multiple-choice counterparts, but they had higher omit rates. This article addresses the relevance of guessing to figural response items and the diagnostic value of the item type. Plans for future research on figural response items are discussed.

Graphical artefacts: Taxonomy of students’ response to test items

Educational Studies in Mathematics, 2013

The present study, carried out in the Nordic countries, examines the characteristics of students' scholastic performance on items containing graphical artefacts, that is, bar graphs, pie charts and line graphs, selected from the Programme for International Student Assessment (PISA) survey test. Graphical analysis of statistical data resulted in the observation of two major categories of performance by the students. The results of cluster analysis also confirmed the two approaches. One approach consists of items perceived as requiring identification, that is, focusing primarily on perceptual elements. The other consists of items requiring a critical-analytical approach, that is, involving evaluation of the graphical system, active interaction with subject-specific operators and forms of expression. The general observation is that the pattern of response is similar for all these countries, with items demanding an identification approach showing comparatively higher scores than items perceived as demanding a critical-analytical approach.

What lies behind graphicacy? Relating students' results on a test of graphically represented quantitative information to formal academic achievement

Journal of research in …, 2006

Based on studies carried out on qualitative data, an instrument was constructed for investigating how larger numbers of students handle graphics. This test, consisting of 18 pages, each with its own graphic display(s) and a set of tasks, was distributed to 363 students, 15-16 years of age, from five different schools. The format of the questions varied, as did the format of the graphics. As students' performance was expected to be multidimensional, confirmatory factor analysis was carried out with a structural equation modeling technique. In addition to the identification of a general graphicacy-test factor (Gen) and an end-of-test effect (End'), a narrative dimension (Narr') was vaguely indicated. This model was then related to a six-factor model of students' formal academic achievement measured by their leaving certificates from compulsory education. The strongest correlation obtained was between the general graphicacy-test dimension (Gen) and a mathematic/science factor (MathSc') in the grades model. In addition, substantial relationships were detected between the Gen factor and both an overall school achievement factor (SchAch) and a language factor (Lang') in the grades model. © 2005 Wiley Periodicals, Inc. J Res Sci Teach 43: 43-62, 2006
Graphs, charts, cartograms, thematic maps, etc., are common tools for handling and communicating quantitative information in contemporary society. Successively, through their years of schooling, students encounter increasingly more advanced forms of graphic displays, both in direct educational situations and as illustrations and/or complementary facts on other subject matters. Students' access to modern computers equipped with graphic application software makes possible not only an abundance of graphics (sometimes unnecessarily elaborate) in magazines, newspapers, television, and electronic media, encountered in their everyday life both in and out of school, but also allows them to create these images. Thus, being "graphicate"

The Influence of Graphical and Symbolic Language Manipulations on Responses to Self-Administered Questions

Public Opinion Quarterly, 2004

This article reports results from 14 experimental comparisons designed to test 7 hypotheses about the effects of two types of nonverbal languages (symbols and graphics) on responses to self-administered questionnaires. The experiments were included in a survey of 1,042 university students. Significant differences were observed for most comparisons, providing support for all seven hypotheses. These results support the view that respondents' answers to questions in self-administered surveys are influenced by more than words. Thus, the visual presentation of questions must be taken into consideration when designing such surveys and, especially, when comparing results across surveys in which the visual presentation of questions is varied.

The construction and validation of the Test of Graphing in Science (TOGS)

Journal of Research in Science Teaching, 1986

The objective of this project was to develop a multiple-choice test of graphing skills appropriate for science students from grades seven through twelve. Skills associated with the construction and interpretation of line graphs were delineated, and nine objectives encompassing these skills were developed. Twenty-six items were then constructed to measure these objectives. To establish content validity, items and objectives were submitted to a panel of reviewers. The experts agreed over 94% of the time on assignment of items to objectives and 98% on the scoring of items. TOGS was first administered to 119 7th, 9th, and 11th graders. The reliability (KR-20) was 0.81. Poorly functioning items were rewritten based on the item difficulty and discrimination data. The revised version of the test was given to 377 7th through 12th grade students. Total scores ranged from 2 to 26 correct (mean = 13.3, S.D. = 5.3). The reliability (KR-20) was 0.83 for all subjects and ranged from 0.71 for eighth graders to 0.88 for ninth graders. Point-biserial correlations showed 24 of the 26 items above 0.30 with an average value of 0.43. It was concluded from this and other data that TOGS was a valid and reliable instrument for measuring graphing abilities.
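For readers unfamiliar with the two statistics reported in this abstract, the sketch below shows how KR-20 reliability and an item's point-biserial correlation are conventionally computed from a matrix of right/wrong item scores. It is a minimal illustration and not the authors' procedure: the function names `kr20` and `point_biserial` and the toy data in `toy_responses` are hypothetical, population variance is assumed, and the point-biserial is the uncorrected form in which the item counts toward the total score.

```python
from statistics import mean, pvariance


def kr20(responses):
    """KR-20 reliability for rows of 0/1 item scores (one row per examinee).

    Assumes population variance of total scores; some texts use the
    sample variance instead, which changes the result slightly.
    """
    k = len(responses[0])                      # number of items
    totals = [sum(row) for row in responses]   # total score per examinee
    var_total = pvariance(totals)
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(k):
        p = mean(row[j] for row in responses)
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


def point_biserial(responses, item):
    """Uncorrected point-biserial correlation of one item with total score.

    Raises statistics.StatisticsError if everyone (or no one) answered
    the item correctly, because one group mean is then undefined.
    """
    totals = [sum(row) for row in responses]
    scores = [row[item] for row in responses]
    p = mean(scores)
    mean_right = mean(t for t, s in zip(totals, scores) if s == 1)
    mean_wrong = mean(t for t, s in zip(totals, scores) if s == 0)
    sd_total = pvariance(totals) ** 0.5
    return (mean_right - mean_wrong) / sd_total * (p * (1 - p)) ** 0.5


# Hypothetical toy data: 5 examinees by 4 items (not data from the study).
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(kr20(toy_responses), 2))
print(round(point_biserial(toy_responses, 0), 2))
```

On the toy data, KR-20 comes out at about 0.80 and the point-biserial for the first item at about 0.71. A threshold such as the 0.30 mentioned in the abstract would be applied to the per-item point-biserial values.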

The Effects of Images on Multiple-choice Questions in Computer-based Formative Assessment

Digital Education Review, 2015

Current learning and assessment are evolving into digital systems that can be used, stored, and processed online. In this paper, three different types of questionnaires for assessment are presented. All the questionnaires were filled out online in a web-based format. A study was carried out to determine whether the use of images related to each question in the questionnaires affected the selection of the correct answer. Three questionnaires were used: two questionnaires with images (images used during learning and images not used during learning) and another questionnaire with no images, text-only. Ninety-four children between seven and eight years old participated in the study. The comparison of the scores obtained on the pre-test and on the post-test indicates that the children increased their knowledge after the training, which demonstrates that the learning method is effective. When the post-test scores for the three types of questionnaires were compared, statistically significant differences were found in favour of the two questionnaires with images versus the text-only questionnaire. No statistically significant differences were found between the two types of questionnaires with images. Therefore, to a great extent, the use of images in the questionnaires helps students to select the correct answer. Since this encourages students, adding images to the questionnaires could be a good strategy for formative assessment.

EXTERNAL REPRESENTATIONS AND PROBLEM SOLVING COMPETENCE: DO GRAPHS IMPROVE PROBLEM SOLVING IN STUDENTS?

2008

With the advent of computer technology and the popularity of the Internet as an information-gathering tool, the academic environment has changed. In today's digital age, students expect their education to include technology. Given the shift in information delivery and the expectations of students, it is important to assess how well students are able to learn when presented with different types of information. Graphs have proven to be an effective communication and presentation tool, but as technology has advanced the methods available for displaying data have multiplied. The primary goal of this study was to explore individual differences in graph comprehension. Overall results suggested that students who have high mathematical problem solving scores and are able to correctly identify the function of different displays are better able to accurately extract information from visual displays. Implications for education are discussed.

Levels of line graph question interpretation with intermediate elementary students of varying scientific and mathematical knowledge and ability: A think aloud …

2008

This study examined how intermediate elementary students’ mathematics and science background knowledge affected their interpretation of line graphs and how their interpretations were affected by graph question levels. A purposive sample of 14 6th-grade students engaged in think aloud interviews (Ericsson & Simon, 1993) while completing an excerpted Test of Graphing in Science (TOGS) (McKenzie & Padilla, 1986). Hand gestures were video recorded. Student performance on the TOGS was assessed using an assessment rubric created from previously cited factors affecting students’ graphing ability. Factors were categorized using Bertin’s (1983) three graph question levels. The assessment rubric was validated by Padilla and a veteran mathematics and science teacher. Observational notes were also collected. Data were analyzed using Roth and Bowen’s semiotic process of reading graphs (2001). Key findings from this analysis included differences in the use of heuristics, self-generated questions, science knowledge, and self-motivation. Students with higher prior achievement used a greater number and variety of heuristics and more often chose appropriate heuristics. They also monitored their understanding of the question and the adequacy of their strategy and answer by asking themselves questions. Most used their science knowledge spontaneously to check their understanding of the question and the adequacy of their answers. Students with lower and moderate prior achievement favored one heuristic even when it was not useful for answering the question and rarely asked their own questions. In some cases, if students with lower prior achievement had thought about their answers in the context of their science knowledge, they would have been able to recognize their errors. One student with lower prior achievement motivated herself when she thought the questions were too difficult. 
In addition, students approached the TOGS items in one of three ways: as mathematics word problems, as science data to be analyzed, or, when confused, by guessing. A second set of findings corroborated how science background knowledge affected graph interpretation: correct science knowledge supported students’ reasoning, but it was not necessary to answer any question correctly; correct science knowledge could not compensate for incomplete mathematics knowledge; and incorrect science knowledge often distracted students when they tried to use it while answering a question. Finally, using Roth and Bowen’s (2001) two-stage semiotic model of reading graphs, representative vignettes showed emerging patterns from the study. This study added to our understanding of the role of science content knowledge during line graph interpretation, highlighted the importance of heuristics and mathematics procedural knowledge, and documented the importance of perception attentions, motivation, and students’ self-generated questions. Recommendations were made for future research in line graph interpretation in mathematics and science education and for improving instruction in this area.

The different effects of thinking aloud and writing on graph comprehension

2011

We report an experiment which seeks to determine how novice users' conceptual understanding of graphs differs depending on the nature of the interaction with them. Undergraduate psychology students were asked to interpret three-variable "interaction" data in either bar or line graph form and were required to either think aloud while doing so or to produce written interpretations. Analysis of the verbal protocols and written interpretations showed that producing a written interpretation revealed significantly higher levels of comprehension than interpreting the graphs while thinking aloud. Specifically, a significant proportion of line graph users in the verbal protocol condition was either unable to interpret the graphs, or misinterpreted information presented in them. The occurrence of these errors was substantially lower for the bar graph users in the verbal protocol condition. In contrast, analysis of the written condition revealed no significant difference in the level of comprehension between the two graph types. Possible explanations for these findings are discussed.