The Development and Validation of an English Speaking Achievement Test for 8th Graders in a Secondary School in Hanoi



VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
FACULTY OF ENGLISH LANGUAGE TEACHER EDUCATION

GRADUATION PAPER

THE DEVELOPMENT AND VALIDATION OF AN ENGLISH SPEAKING ACHIEVEMENT TEST FOR 8TH GRADERS IN A SECONDARY SCHOOL IN HANOI

Supervisor: Dương Thu Mai, PhD.
Student: Nguyễn Vân Anh
Course: QH2014.F1.E1

HÀ NỘI – 2018

ACCEPTANCE PAGE

I hereby state that I, Nguyễn Vân Anh, QH2014.F1.E1, being a candidate for the degree of Bachelor of Arts (Fast-track program), accept the requirements of the College relating to the retention and use of Bachelor's Graduation Papers deposited in the library. Under these conditions, I agree that the origin of my paper deposited in the library should be accessible for the purposes of study and research, in accordance with the normal conditions established by the librarian for the care, loan or reproduction of the paper.

Signature

Date

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my supervisor, Dương Thu Mai, PhD, for her constant support throughout my study. Without her valuable advice and suggestions, together with her encouragement and passion, I could not have completed this graduation paper. I deeply appreciate the testing experts at FELTE, ULIS-VNU, as well as the 8th-grade teachers and students at Dong Da Secondary School, who were enthusiastic respondents in my research. I would also like to thank my lecturers and my classmates for the insightful comments and encouraging words they have given me. Last but not least, I owe my gratitude to my beloved family, who motivated me throughout the writing of this research.
ABSTRACT

Thanks to the trend of communicative language teaching and the introduction of the pilot English program, the role of speaking and speaking assessment is increasingly highlighted in secondary school English education. However, speaking assessment instruments for secondary students are limited, and few of them are of good quality, owing to the complexity of test development and validation. This empirical study is an attempt to design and evaluate the content validity of an English speaking achievement test for 8th graders based on a conceptualized framework of test construction. To select appropriate test content, observation of the English 8 textbook and of teaching practice, together with teachers' opinions collected by questionnaire, were conducted. A test was constructed at the end of the study. It was revealed that responsive tasks were the most appropriate for assessing 8th graders' speaking performance. Survey data subsequently gathered from testing experts showed that the test content is valid in terms of relevance to objectives, duration, instructions and amount of interaction. The rating scale and some tasks in the test, however, need further revision to be more relevant to the course content.

Key terms: language tests, test development, test evaluation, content validity

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES, TABLES, AND ABBREVIATIONS
CHAPTER 1. INTRODUCTION
  1. Background of the study
  2. Aims and objectives of the study
  3. Scope of the study
  4. Significance of the study
  5. Organization of the study
CHAPTER 2. LITERATURE REVIEW
  1. Key concepts of language assessment
    1.1. Language testing
    1.2. Developing classroom language tests
    1.3. Validating a classroom language test
    1.4. Testing speaking/oral production
  2. Review of related studies
    2.1. Studies on developing and validating speaking tests worldwide
    2.2. Studies on developing and validating speaking tests in Vietnam
CHAPTER 3. METHODOLOGY
  1. The instruction and assessment of speaking skill for 8th graders at Dong Da Secondary School
    1.1. The instruction of speaking skill
    1.2. The assessment of speaking skill
  2. Research questions
    2.1. What are the components of an English speaking achievement test for 8th graders?
    2.2. To what extent is the test valid in terms of content validity?
  3. Sampling
  4. Data collection methods
    4.1. Observation
    4.2. Survey
    4.3. Data collection procedure
  5. Data analysis methods
CHAPTER 4. FINDINGS AND DISCUSSION
  1. The development of the speaking achievement test for 8th graders at Dong Da Secondary School
    1.1. Test rationale
    1.2. Content selection
    1.3. Development of specifications (including scoring procedures) and writing of materials
  2. The validation of the speaking achievement test for 8th graders at Dong Da Secondary School
CHAPTER 5. CONCLUSION
  1. Conclusion and implications
  2. Limitations of the study
  3. Suggestions for further research
REFERENCES
APPENDICES
  APPENDIX 1. TEST SPECIFICATIONS
  APPENDIX 2. QUESTIONNAIRE FOR EXPERT INFORMANTS
  APPENDIX 3. QUESTIONNAIRE FOR TESTING EXPERTS
  APPENDIX 4. OBSERVATION SCHEME FOR TEXTBOOK
  APPENDIX 5. OBSERVATION SCHEME FOR ENGLISH LESSON
  APPENDIX 6. BOOK MAP OF ENGLISH 8 TEXTBOOK

LIST OF FIGURES, TABLES, AND ABBREVIATIONS

LIST OF FIGURES
Figure 1. Steps to designing an effective test
Figure 2. Different components involved in communication
Figure 3. Testing experts' opinions on the difficulty of test tasks

LIST OF TABLES
Table 1. A framework for describing the speaking construct
Table 2. Speaking performance objectives of the new English 8 course
Table 3. Teachers' opinions on the frequency of task types in English lessons
Table 4. Teachers' opinions on the suitability of speaking task types
Table 5. Teachers' opinions on the likelihood of using speaking task types in real-life communicative situations
Table 6. Teachers' opinions on the level of interest of students in different speaking tasks
Table 7. Common Reference Levels: global scale
Table 8. Frequency of speaking tasks recommended by the textbook to measure each performance objective
Table 9. Frequency of speaking tasks used by teachers to measure each objective
Table 10. Testing experts' opinions on the relevance of test tasks to course content
Table 11. Testing experts' opinions on the relevance of test tasks to task objectives
Table 12. Testing experts' opinions on the instructions/prompts of test tasks
Table 13. Testing experts' opinions on the types of interaction in test tasks
Table 14. Testing experts' opinions on the duration of each task and of the test as a whole

LIST OF ABBREVIATIONS
VSTEP – Vietnam Standardized Test of English Proficiency
IELTS – International English Language Testing System
TOEFL – Test of English as a Foreign Language
DDSS – Dong Da Secondary School
CEFR – Common European Framework of Reference
ULIS – University of Languages and International Studies
VNU – Vietnam National University

CHAPTER 1. INTRODUCTION

1. Background of the study

The role of assessment in teaching and learning is indispensable. Not only does it measure "the level or magnitude of some attribute of a person" (Mousavi, 2009); assessment also uses the collected information to inform "decisions about students, curricula and programs, and educational policy" (Nitko, 2009). Thanks to such well-timed decisions, learning can be reinforced and students can be motivated (Heaton, 1988). To obtain valuable data about learning and teaching, a variety of assessment instruments can be used, such as tests, portfolios, diaries and conferences. Among these instruments, the test is an outstanding tool for educational assessment in general and language assessment in particular. First, it highlights the "strengths and weaknesses in the learned abilities of the students" (Henning, 1987). Moreover, according to Heaton (1988), testing enables educators to make necessary adjustments in their teaching; it also locates areas of difficulty in the language program and motivates students through fair evaluation. Thus, not only teachers but also learners can benefit from testing.
That explains why testing takes place in schools in various forms and at different levels.

Designing a test takes time and is never an easy procedure. As Abeywickrama and Brown (2010) put it, "constructing a good test is a complex task involving both science and art." This is no exaggeration, especially in Vietnam, where English is not a native language. Constructing a test to measure oral ability is even more challenging, since speaking is "more than just knowing the language" (Chastain, 1988). Unlike reading and listening tests, speaking tests "do not easily fit the conventional assumption about people and testing" (Underhill, 1987). This productive skill requires test-takers to actually perform the language, and is "far too complex a skill to permit any reliable analysis to be made for the purpose of objective testing" (Heaton, 1990).

After the construction process comes validation, in which the validity of the test is evaluated. This procedure can take place before and after the test administration, telling how "sound" and "to the point" the test is (Cumming & Berwick, 1996). As Cumming and Berwick (1996) also claimed, validation in language testing is highly significant since it takes into account "educational and linguistic policies, instructional decisions, pedagogical practices, as well as tenets of language theory and research". Validity is undoubtedly the top concern when it comes to evaluating the value of a test (Bachman & Palmer, 1996). According to Heaton (1988), a test's validity is "the extent to which it measures what it is supposed to measure and nothing else" (p. 159). Among the different dimensions of validity, content validity is one of the most important, as its assessment is the initial stage of establishing the validity of any instrument (Ozer, Fitgerald, Sulbaran & Garvey, 2014).
Evidence of content validity is often reflected in the relevance of the content and the coverage of important parts of the construct domain (Messick, 1995).

2. Aims and objectives of the study

In a context where communicative language teaching is increasingly promoted (Abeywickrama & Brown, 2010), together with the introduction of the Pilot English Program for Secondary School Students (Ministry of Education and Training, 2012), the role of speaking and speaking assessment is being highlighted. Yet, due to the complexity of test development, current classroom speaking achievement tests tend to be either adapted from international standardized tests such as VSTEP, IELTS and TOEFL, or designed by the teachers themselves. The problem with standardized tests, however, lies in the fact that what is tested might not be what is taught (Vu, 2010). On the other hand, tests designed by teachers may well reflect the learning situation, but encounter difficulties regarding validation, because a close and thorough investigation into the qualities of such homemade tests tends to be neglected. This absence of test evaluation may have several consequences, as only when a test is of good quality can teachers and learners make the most of it (Ozer, Fitgerald, Sulbaran & Garvey, 2014) and can the relationship between learning and assessment be strengthened. In addition, to the knowledge of the author, there is still a modest amount of research on developing speaking tests compared to tests of other skills. Studies on speaking test construction and validation that aim specifically at students of lower levels – such as secondary or primary school students – are even fewer. Given the importance of speaking test development and validation, the researcher believes this domain deserves more attention.
All the aforementioned reasons have motivated the researcher to conduct a study in which an English speaking achievement test for 8th graders is designed and validated based on language assessment theories and experts' opinions. Hopefully, the results of the study can highlight the importance of a quality speaking test in facilitating secondary school English learning, and motivate other researchers to investigate test development and validation further. In particular, this research primarily aims at designing and validating an English speaking achievement test for 8th graders at a secondary school in Hanoi. More specifically, the development of the basic components of the test, including test specifications, test items and rating scale, will be featured in this study. The test will then be validated through data collected on its content validity. To accomplish these goals, the paper addresses the following questions:

2.1. What are the components of an English speaking achievement test for 8th graders?

2.2. To what extent is the test valid in terms of content validity?

3. Scope of the study

The study focuses on testing only the speaking skill of secondary students who are using the new English 8 textbook. Speaking is integrated more thoroughly in the new English textbooks, which follow the communicative teaching approach (Ministry of Education and Training, 2017a) and contain an increased number of speaking-related drills and activities. Besides, the research aims at producing an achievement test rather than another kind of test. The reason is that the syllabus of the English 8 course and its learning objectives are publicly available, which provides the necessary conditions for constructing an achievement test. Moreover, the test administration time is near the end of the second semester; hence, an achievement test is the most suitable for the level and amount of knowledge the students have acquired.
Last but not least, although a good test should possess several qualities, namely reliability, validity, washback, authenticity and practicality (Brown, 2004), this study could only examine one aspect: validity. This is firstly because the scale of the study and the availability of resources left the researcher no choice but to be selective. Validity was chosen because it is dubbed an "essential measurement quality" by Bachman and Palmer (1996, p. 19). Moreover, regarding the dimensions of validity, Ozer, Fitgerald, Sulbaran and Garvey (2014) assert that the assessment of a test should begin with content validity. This explains why content validity is the focus of the study when it comes to validation. The literature used in the study is restricted to English as a Second Language and language assessment materials.

4. Significance of the study

The study is expected to provide a reference source for test designers constructing assessment instruments, especially achievement tests, to measure the speaking performance of secondary school students. Specifically, the test specifications drafted in this study might help formulate similar speaking achievement tests for 8th graders. Additionally, the framework adapted and developed in this study might serve as a foundation for developing high-quality speaking tests in the future. With such tests, more valuable information about the practice of teaching and learning with the new English textbooks can be gathered. In addition, the development procedure of the test might reveal the most problematic areas of test design that deserve more attention. Meanwhile, the validation process is likely to highlight several factors that affect the test's content validity. Such information would be valuable for test makers developing more valid assessment instruments in the future.

5.
Organization of the study

The study is divided into five chapters:

Chapter 1. Introduction presents the statement of the problem, the rationale, the scope, the aims and objectives, and the organization of the study.

Chapter 2. Literature review covers the literature related to language testing, test design and test validation.

Chapter 3. Methodology describes the methods of the study, the selection of respondents, the materials, and the methods of data collection and analysis.

Chapter 4. Findings and discussion presents and discusses the results of the data collection and analysis process.

Chapter 5. Conclusion summarizes the study, names some limitations and offers recommendations for further study.

CHAPTER 2. LITERATURE REVIEW

This chapter provides the theoretical background for the research. Key concepts of language assessment – including language testing, test development, and validity and reliability issues – together with international and domestic studies in the domain, are reviewed.

1. Key concepts of language assessment

1.1. Language testing

1.1.1. Definitions of tests

The developing field of language assessment has seen various definitions of the test. Carroll (1968) gives the following definition:

"A psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual." (Carroll, 1968, p. 46)

From this notion, Bachman (1990) asserts that a test is a type of measurement tailored to elicit "a specific sample of an individual's behavior". Abeywickrama and Brown (2010) construe the term more simply: a test is "a method of measuring a person's ability, knowledge, or performance in a given domain" (p. 3). Much as these interpretations differ, they agree on several crucial points.
First, a test is a method, an instrument, or a procedure that requires performance by the test-takers. According to Abeywickrama and Brown (2010), in order to qualify as a test, the method needs to be explicit and structured: multiple-choice questions with an answer key, a writing prompt with a scoring rubric, or an oral interview with a question script or a checklist of anticipated responses. Second, a test must "measure", which might be understood as "a process of quantifying a test taker's performance according to explicit procedures or rules" (Bachman, 1990, pp. 18-19). The measured target might be anything from general ability to specific competencies or objectives (Abeywickrama & Brown, 2010). The communication of results, accordingly, also ranges from letter grades and comments to numerical scores (in standardized tests, for instance), as noted by Abeywickrama and Brown (2010). The third point shared by the aforementioned definitions is that a test must measure the ability, knowledge or performance of "an individual" (Carroll, 1968; Bachman, 1990; Abeywickrama & Brown, 2010). Therefore, as perceived by Abeywickrama and Brown (2010), it is important to have a deep insight into the test-takers' previous language experience and background. These data help the tester decide whether the test suits the test-takers' ability, and assist in appropriate score interpretation (Abeywickrama & Brown, 2010). Among the three sources, Abeywickrama and Brown (2010) explain most clearly what a test can convey by pointing out the three aspects that can be concluded from test results: the "ability", "knowledge" and "performance" of test-takers (Abeywickrama & Brown, 2010, p. 3). Although a test measures performance, it is the test-taker's ability (or competence) that is reflected in the results; sometimes knowledge about the language is tested as well (Abeywickrama & Brown, 2010).

1.1.2.
Testing and assessment

Despite some overlap between the meanings of "testing" and "assessment", the two terms are not equivalent. Whereas testing is understood as above, assessment – as claimed by Mousavi (2009) – is "appraising or estimating the level of magnitude of some attribute of a person" (p. 36). In educational terms, Abeywickrama and Brown (2010) refer to assessment as "an ongoing process that encompasses a wide range of methodological techniques" (p. 3). When a student replies to a question or tries out a new vocabulary item and the teacher observes, assessment occurs. When a student writes an essay and submits it to the teacher for a score, assessment also takes place. Hence, assessment can be formal or informal, conscious or subconscious, incidental or intended (Abeywickrama & Brown, 2010, p. 3). This leads to the conclusion that assessment has a broader meaning than testing, or that tests are only "a subset of assessment, a genre of assessment techniques" (Abeywickrama & Brown, 2010, p. 3).

1.1.3. Classification of tests

Hughes (2003) categorizes tests according to the information they offer. In his work, tests are divided into four types: proficiency tests, achievement tests, diagnostic tests and placement tests. Abeywickrama and Brown (2010) add the aptitude test to the list as a fifth type. However, the recent unpopularity of aptitude tests, due to certain limitations mentioned in Stansfield and Reed's study (2004), explains its absence from the categorization below.

1.1.3.1. Proficiency test

Proficiency tests, as explained by Hughes (2003, p. 9), "are designed to measure people's ability in a language regardless of any training they may have had in that language". Rather than relying on course content or objectives, the content of the test is based on what candidates have to be able to do in order to be considered proficient.

1.1.3.2. Diagnostic test

This kind of test is used to diagnose students' strengths and weaknesses.
They are intended primarily to identify the instruction needed in the future.

1.1.3.3. Placement test

As its name suggests, this kind of test is employed to place students at the stage that suits their current level or ability. Since no single placement test will work in every circumstance (Hughes, 2003), placement tests are better tailor-made.

1.1.3.4. Achievement test

Unlike proficiency tests, achievement tests relate directly to language courses (Hughes, 2003), their purpose being to determine how successful learners have been in achieving course objectives. According to Gronlund (1982), achievement testing plays a central part in all types of educational programs; it is the most widely used method of assessing learner achievement (Gronlund, 1982; Abeywickrama & Brown, 2010). Gronlund (1982) defines the achievement test as follows:

"An achievement test is a systematic procedure for determining the amount a student has learned. Although the emphasis is on measuring learning outcomes, it should not be implied that testing is to be done only at the end of instruction. All too frequently, achievement testing is viewed as an end-of-unit or end-of-course activity that is used primarily for assigning course grades." (Gronlund, 1982, p. 1)

Abeywickrama and Brown (2010) concur with this definition: "the primary role of an achievement test is to determine whether course objectives have been met" (p. 9). From the above definitions, an achievement test can take place at the end of either a unit or a whole course. Thus, regarding classification, Hughes (2003) introduces two kinds of achievement test, namely the final achievement test and the progress achievement test.

Final achievement test

As its name suggests, final achievement tests are "those administered at the end of a course of study" (Hughes, 2003). The test-writers, hence, might be education ministries, official examining authorities or members of teaching institutions.
There has been a debate over the content of final achievement tests, as summarized by Hughes (2003): whether they should be based on the syllabus content or on the course objectives. Achievement tests following the syllabus-content approach seem fairer, since learners are examined on what they are supposed to have learnt. However, if the syllabus, books or materials are badly designed or selected, results can be misleading, since successful performance on the test may not indicate successful achievement of course objectives (Hughes, 2003). The alternative approach bases the test directly on the objectives of the course. The key to this approach lies in the objectives: they need to be explicitly designed. Hughes (2003) himself expresses a preference for this approach, since he believes it gives "more accurate information about individual and group achievement, and it is likely to promote a more beneficial washback effect on teaching" (p. 11).

Progress achievement test

This kind of test is intended to "measure the progress that students are making" (Hughes, 2003). Since the word progress here refers to progress towards achieving course objectives, Hughes (2003) holds that progress achievement tests should relate to the objectives as well. This notion gives rise to two approaches to conducting progress achievement tests. The first involves repeated administration of the final achievement test; after each attempt, the scores are expected to increase, exhibiting the progress made. The obvious flaw in this approach is that students might earn very low scores in the early stages of the course, which can be discouraging. Another way to develop progress achievement tests is to base them on a well-established set of short-term objectives (Hughes, 2003). The objectives for each progress achievement test must relate closely to each other and to the course objectives.
In other words, they must show "a clear progression" towards the final test (Hughes, 2003, p. 12).

The present study opted to design a final achievement test, since it is the most efficient instrument for deciding how successful students are in achieving course objectives. Moreover, the research was conducted with 8th graders, who rarely have to take proficiency, diagnostic or placement tests. Furthermore, the objectives of the course are publicly available, which made it easier to construct a final achievement test.

1.1.4. Two major issues in modern language testing

Before moving on to the development and validation of classroom language tests, it is necessary to have an overview of some current issues in modern language testing.
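Bachman's (1990) notion of measurement as "quantifying a test taker's performance according to explicit procedures or rules" can be illustrated with a small sketch. The criteria, the 0-5 scale, and the band cut-offs below are hypothetical, invented purely for demonstration; they are not the rating scale developed in this study.

```python
# Illustrative sketch only: quantifying a speaking performance with an
# analytic rating scale. All criteria and cut-offs here are hypothetical.

CRITERIA = ("fluency", "vocabulary", "grammar", "pronunciation")

def score_performance(ratings: dict[str, int]) -> float:
    """Quantify one performance: average the 0-5 ratings across criteria."""
    if set(ratings) != set(CRITERIA):
        raise ValueError(f"expected ratings for exactly {CRITERIA}")
    for value in ratings.values():
        if not 0 <= value <= 5:
            raise ValueError("each rating must be on the 0-5 scale")
    return sum(ratings.values()) / len(CRITERIA)

def to_grade(score: float) -> str:
    """Map a numeric score to a reported letter grade (hypothetical bands)."""
    bands = [(4.5, "A"), (3.5, "B"), (2.5, "C")]
    for cutoff, grade in bands:
        if score >= cutoff:
            return grade
    return "D"

performance = {"fluency": 4, "vocabulary": 3, "grammar": 4, "pronunciation": 3}
print(score_performance(performance))            # 3.5
print(to_grade(score_performance(performance)))  # B
```

The point of the sketch is only that the scoring rules are explicit and applied uniformly to every test-taker; any actual rubric would of course define its criteria and descriptors from the course objectives.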
