Simon Boynton | The University of Hong Kong
Papers by Simon Boynton
In CAES, formal and informal assessments fulfil formative and/or summative purposes, and significant effort is made to ensure the validity and reliability of these assessments. For example, course coordinators establish validity by designing assessment tasks that measure the intended learning outcomes of a course and ensure reliability through standardisation and moderation meetings. In the classroom, teachers enhance student learning through formal or ad hoc formative assessment tasks. In spite of these efforts, however, student and teacher feedback over the years has indicated that there is room for improvement in either the design or the implementation of some course assessments. This seminar thus aims to enhance the assessment literacy of CAES teachers. It first reports the results of a small-scale study that aimed to identify best practices of assessment at CAES. It is then followed by a discussion that encourages teachers to explore contextual and operational factors that impact a key...
This timely and fascinating collection describes the current state of primary school English-language education in China, Japan, Singapore, Korea, India, Vietnam, and Taiwan. The book portrays all these countries as valuing English highly as a language for international communication and wanting to improve the ability of their citizens to use it. English-language education at primary school is seen as an important stepping stone in this process. The chapters about China, Japan, Korea, India and Vietnam incorporate details of research projects undertaken by their authors and these provide useful insights into the views of primary school teachers. In many countries there seems to be tension and disparity between seemingly well-intentioned education authorities who set educational policy and the classroom teachers who have to implement the policy with students. With reference to data from questionnaires, interviews, and classroom observations the authors of these chapters show how prim...
Popular science writing is rarely referred to in the literature yet could be used as a writing task for undergraduate science students. Since 2013-14, a popular science article writing task has been used as the main writing task in an English-in-the-Discipline course for second-year undergraduate science students from multiple scientific disciplines at a university in Hong Kong. In this course students were formally taught the genre features of popular science articles and research articles using the concept of reader-writer proximity (Hyland, 2010) in which the fixed rhetorical features are used to “construct both the reader and writer as people with similar understandings and goals” (Hyland, 2010, p. 116). Samples of students’ writing were analysed for genre features through Hyland’s concept of proximity, and individual interviews were conducted with their authors. The objectives of the study were to determine (1) to what extent students can incorporate and successfully use features of...
All too often, the shift from norm-referenced to criterion-referenced assessment results in tests that reflect holistic (and teacher-biased) expectations of student performance without actually determining whether the criteria hold sufficient standards of validity and reliability, resulting in considerable inter-rater variance. This is particularly felt in L2 contexts where students struggle to achieve grades under criteria that are not produced in the same language (or spirit) as their L1, and where teachers from different backgrounds may interpret individual criteria according to their ideologies and beliefs. However, most validation studies either focus only on quantitative matters (losing the personal, holistic focus) or focus only on qualitative concerns (missing the general picture). The present study validates a criterion-referenced group tutorial discussion speaking assessment for undergraduate EAP at a leading university in Hong Kong in terms of inter-rater variance and criterion validity. I present three complementary quantitative measures (Intraclass Correlation Coefficient, Cronbach's Alpha and Exploratory Factor Analysis) which suggest a number of criteria could safely be removed from the rubric, and show that the grading of some criteria frequently overlaps that of other criteria. However, in attempting to explain the reasons behind the statistical results, qualitative interviews with test raters suggest that 1) raters bring with them their own interpretations of the rubric criteria and 2) there are specific linguistic considerations regarding individual test takers' performance and rater variance on said performance. The results have implications for improving the validity and reliability of criterion-referenced assessment rubrics produced in-house, and hopefully the paper serves as a 'how-to' for language assessment practitioners.
Journal of English for Academic Purposes, 2017
Exploring rater conceptions of academic stance and engagement during group tutorial discussion assessment. English for Academic Purposes, to appear.
International Journal of English Linguistics, 2016
This study attempts to validate an academic group tutorial discussion speaking test for undergraduate freshmen students taking initial EAP training at a university in Hong Kong in terms of task, rater and criterion validity. Three quantitative measures (Cronbach's Alpha, Intraclass Correlation Coefficient, and Exploratory Factor Analysis) are used to assess validity of rater scores for the test using a rubric with considerations for assessment of academic stance presentation, inter-candidate interaction, and individual language proficiency. These results are triangulated with post-hoc interview data from the raters regarding the difficulties they face assessing individual proficiency and group interaction over time. The results suggest that current provisions of the rubric in dealing with the assessment of interaction in group settings (namely visual cues such as "active listening" as well as provisions for interruptions in the form of "domination") are problematic, and that raters are unable to separate the grading of academic stance from the grading of language concerns. We also note affective and cognitive difficulties involved with assessing extended periods of interactional discourse, including student talking time (or lack of it), the group dynamic, and raters' personal beliefs and practice, as threats to validity that the statistical measurements were unable to capture. A new sample rubric and further suggestions for improving the validity of group tutorial assessments are provided.
The Asian Journal of Applied Linguistics, Mar 31, 2015
Popular science writing is rarely referred to in the literature yet could be used as a writing task for undergraduate science students. Since 2013-14, a popular science article writing task has been used as the main writing task in an English-in-the-Discipline course for second-year undergraduate science students from multiple scientific disciplines at a university in Hong Kong. In this course students were formally taught the genre features of popular science articles and research articles using the concept of reader-writer proximity (Hyland, 2010) in which the fixed rhetorical features are used to "construct both the reader and writer as people with similar understandings and goals" (Hyland, 2010, p. 116). Samples of students' writing were analysed for genre features through Hyland's concept of proximity, and individual interviews were conducted with their authors. The objectives of the study were to determine (1) to what extent students can incorporate and successful...