Tim McNamara - Academia.edu
Papers by Tim McNamara
1 Cross-cultural pragmatics in oral proficiency 2 Decision dependability of subtests, tests and the overall TOEFL test battery 3 Development of new proficiency based skill level descriptors for translation: theory and practice 4 Comparing test difficulty and text reliability in the evaluation of an extensive reading programme 5 Performance assessment and the components of the oral construct across different tests and rater groups 6 Applying ethical standard of portfolio assessment of writing English as a Second Language 7 Equating national exams of foreign language reading comprehension 8 Taking a multifaceted view of the unidimensional measurement from Rasch analysis in language tests 9 A qualitative approach to monitoring examiner conduct in CASE 10 Linguistics accuracy versus coherence in assessing examination answers in content subjects 11 A study of the Decision-making Behaviour of Composition Markers 12 What raters really pay attention to 13 A study of writing tasks assigned i...
Language Testing Reconsidered
Applied Linguistics
This article discusses the familiar notion of the shibboleth in situations of exclusion, focusing on the non-use, rather than the use, of language, for which I propose the term the anti-shibboleth. The article begins with an introduction to the concept of the shibboleth, giving examples from situations of violent conflict and suppression such as the Holocaust and the Cambodian genocide, and goes on to develop the notion of the anti-shibboleth, using further examples from such contexts. It then considers the situation of Aboriginal Australians in the period up until 1967 in the light of the concept. The article concludes with a discussion of the way in which we may understand the anti-shibboleth theoretically, drawing on insights from poststructuralism.
Revue française de linguistique appliquée
... prospects, and lifelong learning for participation in an international society.[1] [1] Source: <http:/ / www. ... find themselves listening to lectures and reading relevant materials which they then use as input ... will be subject to revision as experience with the test and investigation of its ...
Applied Linguistics
Applied linguists have developed complex theories of the ability to communicate in a second language (L2). However, the perspectives on L2 communication ability of speakers who are not trained language professionals have been incorporated neither into theories of communication ability nor into the criteria for assessing performance on general-purpose oral proficiency tests. This potentially weakens the validity of such tests because the ultimate arbiters of L2 speakers’ oral performance are not trained language professionals. This study investigates the perspectives of these linguistic laypersons on L2 communication ability. Twenty-three native and non-native English-speaking linguistic laypersons judged L2 speakers’ oral performances and verbalized the reasons for their judgments. The results showed that the participants focus not only on the linguistic aspects of the speaker’s output but also on features that applied linguists have paid less attention to. Even where speaker’s lingui...
Language Assessment Quarterly
This article addresses the suitability of the CEFR as the basis for decisions about the readiness of individuals to engage in academic writing tasks in undergraduate university courses, and as a guide to progress. The CEFR offers potentially relevant general scales and subscales, but also more specific subscales for writing in the academic context. However, recent challenges to traditional views of academic writing have potential implications for assessment frameworks such as the CEFR when they are used to identify readiness for, and progress in, academic study. In this article we explore the views of students on what it means to “do” academic writing. Questionnaires, interviews, and short reflective texts were used to investigate the changing perceptions of first-year undergraduate students at an Australian university. The analysis of student data confirms the reality of the more complex view of academic writing suggested by the recent literature. The article then considers what implications this has for the adequacy of the definitions provided in the CEFR. It suggests that the CEFR descriptors underrepresent the complexity of the challenges of academic writing, particularly its cognitive demands. A new and rather different approach will be required to inform assessments used to manage the admission of students into academic writing contexts and the monitoring of their progress.
Language Testing
All educational testing is intended to have consequences, which are assumed to be beneficial, but tests may also have unintended, negative consequences (Messick, 1989). The issue is particularly important in the case of large-scale standardized tests, such as Australia’s National Assessment Program - Literacy and Numeracy (NAPLAN), the intended benefits of which are increased accountability and improved educational outcomes. The NAPLAN purpose is comparable to that of other state and national ‘core skills’ testing programs, which evaluate cross-sections of populations in order to compare results between population sub-groupings. Such comparisons underpin ‘accountability’ in the era of population-level testing. This study investigates the impact of NAPLAN testing on one population grouping that is prominent in the NAPLAN results’ comparisons and public reporting: children in remote Indigenous communities. A series of interviews with principals and teachers documents informants’ first...
Language Testing, 2018
All educational testing is intended to have consequences, which are assumed to be beneficial, but tests may also have unintended, negative consequences (Messick, 1989). The issue is particularly important in the case of large-scale standardised tests, such as Australia's National Assessment Program–Literacy and Numeracy (NAPLAN), the intended benefits of which are increased accountability and improved educational outcomes. The NAPLAN purpose is comparable to that of other state and national 'core skills' testing programs which evaluate cross-sections of populations in order to compare results between population sub-groupings. Such comparisons underpin 'accountability' in the era of population-level testing. This study investigates the impact of NAPLAN testing on one population grouping that is prominent in the NAPLAN results comparisons and public reporting: children in remote Indigenous communities. A series of interviews with principals and teachers documents i...
Language & Communication
Highlights: We consider the scope of current conceptualizations of communicative competence in tests of spoken language. We present studies illustrating aspects of performance underrepresented in traditional criteria for speaking tests. We propose the greater use of non-language specialists' views to determine assessment criteria.
Language Testing
This paper explores the views of nursing and medical domain experts in considering the standards for a specific-purpose English language screening test, the Occupational English Test (OET), for professional registration for immigrant health professionals. Since individuals who score performances in the test setting are often language experts rather than domain experts, there are possible tensions between what is being measured by a language test and what is deemed important by domain experts. Another concern is a lack of qualitative research on the process of the standard setting. To date, no published qualitative work has been identified about the contributions of domain experts in the standard setting for healthcare communication. In this study, a standard-setting exercise was conducted for the speaking component of the OET, using judgements of nursing and medical clinical educators and supervisors. In all, 13 medical and 18 nursing clinical educators and supervisors rated medical and nursing candidate performances respectively. These performances were audio-recorded OET role-plays that were selected across a range of proficiency levels. Domain experts were invited to comment on the basis of their decisions and the extent of alignment between these decisions and the criteria used to assess performance on the OET. Nursing and medical domain experts showed that they attended to all of the OET criteria in making their decisions about standards. However, clinical scenario simulation also invited judgements of clinical competence from participants, even where they knew that clinical competence should be excluded from their decision-making. Another concern related to the authenticity limitations of the role-play tasks as evidence of readiness to handle communication in the workplace.
Overall, findings support the value of qualitative evidence from the standard setting in providing insight into the factors informing and impeding decision-making.
Language Testing
This paper considers how to establish the minimum required level of professionally relevant oral communication ability in the medium of English for health practitioners with English as an additional language (EAL) to gain admission to practice in jurisdictions where English is the dominant language. A theoretical concern is the construct of clinical communicative competence and its separability (or not) from other aspects of professional competence, while a methodological question examines the technical difficulty of determining a defensible minimum standard. The paper reports on a standard-setting study to set a minimum standard of professionally relevant oral competence for three health professions – medicine, nursing, and physiotherapy – as measured by the speaking sub-test of the Occupational English Test, a profession-specific test of clinically related communicative competence. While clinical educators determined the standard, it is to be implemented by raters trained as teach...
A preliminary study is reported of the use of new multifaceted Rasch measurement mechanisms for investigating rater characteristics in language testing. Ratings from four judges of scripts from 50 candidates taking the International English Language Testing System test, a test of English for Academic Purposes, are analyzed. The analysis illustrates how multifaceted Rasch measurement can be used to examine inter-rater consistency, differences in rater harshness, available grades on the rating scale, and the effect that between-rater variation has on the measurement of individual candidates. Although the main focus of the paper is on modeling and estimating rater variation, Rasch modeling also has the potential for practical applications controlling the effects of the variation it describes. One such application is considered: the use of the model to explore the relationship between varying amounts of multiple marking and the resulting ability estimates of candidates, to see if it may be possible to reduce the amount of multiple marking required to produce stable and reliable estimates of ability. Contains 18 references. (LB)
Issues in Applied Linguistics, 2007
Annual Review of Applied Linguistics, 1998
Title: Policy and Social Considerations in Language Assessment. Authors: McNamara, Tim. Source: Annual Review of Applied Linguistics, v18 p304-19 1998.
Current Anthropology, 2009
The role of item response theory (IRT) in determining the validity of second language tests is examined in the case of one specific test, the listening subtest of the Occupational English Test (OET), used in Australia to measure the language skills of non-native English-speaking health professionals. First, the listening subtest is described. Then the debate over the appropriateness of IRT use in language testing research is discussed in some detail, with reference to a number of separate studies. Finally, a study of the use of IRT in validating the OET is reported. The study involved analysis, using the Partial Credit model, of data from 196 candidates taking the test in 1987. It investigated whether it is possible to construct a single measurement dimension of listening ability from data from the subtest's two parts, and if the answer is yes, whether the skills tested in the two parts are essentially the same. Results indicate that the test is indeed unidimensional, and support the use of IRT for such analysis. It is also concluded that the kinds of listening tasks in the two subtest parts represent significantly different tasks in terms of level of ability required to deal successfully with them. (MSE)