Study Guide - Final Examination

Author: Camille Arsenault
Course: ESL Testing and Evaluation
Institution: Université Laval

Summary

Course taught by Professor Shahrzad Saif. ...


Description

ESL Testing & Evaluation - Final Examination

Test Development

Pre-development stage
- purpose
- characteristics of test-takers
- context (Target Language Use domain)

Stages of test development (regular exam)
1. Answer 3 main questions (pre-development stage)
   a. purpose
   b. test-takers
   c. context (TLU domain)
      i. for teachers in Quebec public schools, the TLU is syllabus-based
2. Define test specifications / construct definition
   a. direct
      i. valid
      ii. authentic
      iii. interactive
   b. aim for positive washback
      i. *The more valid your test is, the less reliable it is
3. Develop scales
   a. analytic vs. holistic
   b. try to make your test more reliable by creating a good scale

Stages of test development (10 stages, for bigger, more important tests)
1. state the problem
2. write specifications for the test
3. write and moderate items
4. trial items informally on native speakers
5. trial the test on a group of non-native speakers similar to those for whom the test is intended
6. analyze the results of the trials; make changes as necessary
7. calibrate the scales
8. validate

9. write handbooks for test takers, test users and staff
10. train staff

Specifying the Testing Problem
- test purpose and type
- test-takers
- abilities to be tested
- constructs (and %)
- test score consequences
  - high-stakes or not
  - pass-fail
  - class evaluation
- limitations
  - budget
  - staff

Construct Definition
The purpose of construct definition is to
- specify the abilities to be assessed
- determine what it is we have to ask the learners to do

Test Specifications (aka "specs")
- What are test specifications?
- Who needs test specifications?
  - teachers/test developers
  - item writers
  - test validators (for larger-scale, standardized tests)
  - test users

How to draw up test specifications?

Item/task specification (test content)
- input materials
  - text types, topics, vocabulary range, structural range
- instructions
- prompts

Evidence specification
- expected response
  - what will be the outcome of the test? (e.g. written production, oral presentation, etc.)

- scoring procedures
  - how will the test be marked? with what?

Assembly specification
- test structure
- item types
- channel/medium
- timing
- technique

Presentation specification

Delivery specification
- timing
- administration
- security

Common Test Techniques

Multiple-choice items
+ easy to mark
+ quick to mark
- hard to create (difficult to write successful items)
- measures recognition, not production
- discrete-point instead of integrative
- guessing may have a considerable but unknowable effect on test scores
- the technique severely restricts what can be tested
- backwash may be harmful
- cheating may be facilitated

*The distractors (options) in a multiple-choice item must be
● consistent (all verbs, all nouns, etc.)
● the same length
● the same level of difficulty/frequency
● no "all/none of the above"

Cloze procedure
+ easy to mark
+ easy to develop
+ easy to administer

C-test (remove the last half of the word, but keep the root; e.g. bigger = big___)
+ only exact scoring is necessary
+ shorter passages are possible
+ a wider range of topics, styles and levels of ability is possible
- harder to read than a cloze passage

- correct responses can be found in the surrounding text

Dictation
+ easy to create
+ relatively easy to administer (though not as easy as a paper-and-pencil cloze)
+ involves listening ability
- not easy to score

Translation
● ask lower-level students to read a text and then tell you what the text is about in their L1

Oral interview

Composition writing

Summarizing

*All techniques are good, but you must choose according to your constructs (the what and how of the specs).*
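The C-test deletion rule described above (keep the root, delete the second half of the word) can be sketched in a few lines. The rounding convention — keeping at least half the letters — is an assumption, since the notes only give the one example:

```python
def c_test(word: str) -> str:
    """Turn a word into a C-test item by deleting its second half.

    Keeps the first half of the letters (rounded up, an assumed
    convention) and replaces each deleted letter with an underscore,
    e.g. "bigger" -> "big___".
    """
    keep = (len(word) + 1) // 2          # letters to keep
    return word[:keep] + "_" * (len(word) - keep)

print(c_test("bigger"))  # big___
```

In a full C-test the first and last sentences of the passage are usually left intact and only every second word is mutilated; this sketch handles a single word.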

Face validity (everything that makes the test appealing)
- appearance of the test
- pictures
- fonts

Checklist
- test purpose
- description of the test taker
- test level
- construct (theoretical framework for the test)
- number of sections
- time for each section
- weighting for each section
- target language situation (TLU domain)
- text types
- text length
- language skills to be tested (constructs)
- language elements to be tested
- test tasks
- test methods
- rubrics
- criteria for marking
- description of typical performance at each level

Testing Writing

The direct way to evaluate writing is to have candidates write. The constructs appear both in the instructions and in the evaluation grid, which makes the test easy to construct and supports construct validity.

3 steps:

1. Representative tasks

Specify all possible content
● Functions
○ Expressing (thanks, opinion, apology, etc.)
○ Directing (ordering, instructing, warning, etc.)
○ Describing (actions, events, people, etc.)
○ Eliciting (information, directions, service, etc.)
○ Narration (sequence of events)
○ Reporting (description, comment, decisions)
● Types of text
○ Form
○ Letter (personal, business)
○ Message
○ Note
○ Postcard
○ Recipe
○ Set of instructions
● Addressees of texts
○ unspecified - target audience
● Topics
○ unspecified - some tasks are all connected to a common theme
○ don't give choices - only one topic
○ imagination, background knowledge, etc.: the best choice is a topic seen in class, so that students need neither imagination nor background knowledge to understand it
● Dialect and length of texts
○ formal & informal

Include a representative sample of the specified content

Wide range of tasks
Some students will do better at some tasks and have difficulty with others; a wide range ensures that students' abilities are fully explored (e.g. we don't consider a student poor on the basis of a single task - that student might surprise you on another task). If a test includes a wide-ranging and representative sample of the specifications, it is most likely to have a positive washback effect. Of course, the desirability of wide sampling has to be balanced against practicality.

2. Elicit a valid sample of writing ability

Set as many separate tasks as is feasible
- give students opportunities for fresh starts

Test only writing ability and nothing else
- reduce the dependence on reading: instructions should be simple to understand and shouldn't be too long; diagrams and graphics can be used, and a series of pictures can also be used to elicit a narrative
- avoid prompts that depend on imagination or background knowledge, e.g. "Discuss the advantages and disadvantages of being born into a wealthy family" (depends on age, context, perspective)

Restrict candidates
- writing tasks should be well defined: candidates should not be allowed to go too far astray
- provide information in the form of notes or pictures so they don't get carried away

The tasks should be authentic.

3. Ensure valid and reliable scoring

Set tasks which can be reliably scored
- a number of the suggestions made to obtain a representative performance will also facilitate reliable scoring

Set as many tasks as possible
- the more scores for each candidate, the more reliable the total score should be

Restrict candidates
- the greater the restrictions imposed on the candidates, the more directly comparable the performances of different candidates will be

Give no choice of tasks
- making the candidates perform all tasks also makes comparisons between candidates easier

Ensure long enough samples
- e.g. if you evaluate organization, the text needs to be long enough to show organization

Create appropriate scales for scoring
- holistic & analytic: both are valid

Holistic scoring
The assignment of a single score to a piece of writing on the basis of an overall impression of it.
+ fast
+ practical
Level-based; compensatory (a strength in one area makes up for a weakness in another)

Analytic scoring
Methods of scoring which require a separate score for each of a number of aspects of a task, e.g. grammar, vocabulary, mechanics (punctuation, spelling, etc.), fluency, form (organization), etc.
+ disposes of the problem of uneven development of subskills in individuals
+ scorers are compelled to focus on all aspects of performance (they might otherwise have ignored some)
+ a number of scores makes the scoring more reliable
- time (it takes longer) - not practical
- concentration on different aspects may divert attention from the overall effect of the piece of writing
Non-compensatory: your strength in one area will not compensate for your weakness in another
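As a concrete illustration of an analytic scale combined with multiple scoring, here is a minimal sketch. The criterion names, the integer bands, and the equal weighting are assumptions for illustration, not part of the course material:

```python
from statistics import mean

# Hypothetical analytic scale: one criterion per construct.
# The names and equal weights are illustrative assumptions.
CRITERIA = ["grammar", "vocabulary", "mechanics", "fluency", "organization"]

def combine(scorer_reports: list[dict[str, int]]) -> dict[str, float]:
    """Average each criterion across two or more scorers.

    Multiple scoring: averaging independent scorers' marks reduces
    the effect of any one scorer's bias, improving reliability.
    """
    return {c: mean(r[c] for r in scorer_reports) for c in CRITERIA}

def analytic_total(averaged: dict[str, float]) -> float:
    """Analytic total: the sum of the per-criterion averages."""
    return sum(averaged.values())

# Two scorers mark the same script independently.
scorer_a = {"grammar": 4, "vocabulary": 3, "mechanics": 4, "fluency": 3, "organization": 2}
scorer_b = {"grammar": 4, "vocabulary": 4, "mechanics": 3, "fluency": 3, "organization": 3}

avg = combine([scorer_a, scorer_b])
print(avg["organization"])   # 2.5
print(analytic_total(avg))   # 16.5
```

A real analytic scale would attach a band descriptor to each score level; this sketch only shows the arithmetic of per-criterion averaging and summing.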

Calibrate the scale to be used

Select and train scorers

Follow acceptable scoring procedures
- use two or more scorers
- testing should be done in a quiet, well-lit environment
- multiple scoring = reliability

Test usefulness
❖ Specifying test constructs: specification of representative writing tasks (construct definition)
❖ Content validity: inclusion of a representative sample of the specified content
❖ Interactiveness: testing only writing, nothing else (avoiding construct-irrelevant variance)
❖ Authenticity: a writing task that is authentic for one situation may not be authentic for another
❖ Reliability: ensure reliable scoring, set as many tasks as possible, restrict candidates, give no choice of task, create scoring scales based on the constructs (the scales should include as many criteria as there are constructs), train scorers, use multiple scoring



Testing Oral Ability

Involves comprehension as well as production.

3 steps:

1. Representative tasks

Specify all possible content
● Operations
○ Expressing (thanks, opinion, apology, etc.)
○ Directing (ordering, instructing, warning, etc.)
○ Describing (actions, events, people, etc.)
○ Eliciting (information, directions, service, etc.)
○ Narration (sequence of events)
○ Reporting (description, comment, decisions)

● Types of text
○ Discussion
○ Presentation (monologue)
○ Conversation
○ Service encounter
○ Interview
● Addressees (other speakers)
○ may be of equal or higher status
○ may be known or unknown
○ interlocutor (teacher) + candidate
● Topics
○ familiar and interesting to the candidates
● Dialect and length of texts
○ Standard British English or Standard American English
● Style
○ formal and informal
● Vocabulary range
○ non-technical, except as the result of preparation for a presentation

Skills
- Informational skills (e.g. give instructions, elicit help, seek permission)
- Interactional skills (e.g. express agreement, indicate uncertainty, correct themselves or others, etc.)
- Skills in managing interactions (e.g. initiate interactions, change the topic, take their turn in an interaction, end an interaction, etc.)

Include a representative sample of the specified content
Unless the tasks are extremely restrictive, it is not possible to predict all the operations which will be performed in an interactive oral test. The interlocutor can have a considerable influence on the content of an oral test.

2. Elicit a valid sample of oral ability

Choose appropriate techniques
- Interview
  - questions and requests for information
    - yes/no questions should be avoided, except perhaps at the beginning, while the speaker is still warming up
  - requests for elaboration
  - appearing not to understand
  - invitations to ask questions
  - interruption
  - abrupt change of topic
  - pictures
  - role play
  - interpreting
- Prepared monologue
  - should only be used where the ability to make prepared presentations is being tested
- Reading aloud
- Interaction with fellow candidates
  + elicits language that is appropriate to exchanges between equals
  - the performance of one candidate is likely to be affected by that of the others
  - discussion
  - role play
- Responses to audio or video recordings (uniformity of elicitation promotes reliability)
  - described situations
  - remarks in isolation to respond to
  - simulated conversation

Plan and structure the testing carefully:
● Set as many separate tasks as is feasible
● Plan carefully
● Give as many 'fresh starts' as possible
● Use a second tester for interviews
● Set only tasks and topics that would be expected to cause candidates no difficulty in their own language
● Use a quiet room
● Put candidates at ease
● Collect enough information
● The teacher should not talk too much
● Select and train interviewers

3. Ensure valid and reliable scoring

Create appropriate scales for scoring
Calibrate the scale to be used
Select and train scorers

Follow acceptable scoring procedures (ignore personal qualities of the candidates that are irrelevant to an assessment of their language ability)

Testing Reading

*Students are actively involved in reading, just as in speaking and writing; the difference is that they receive input instead of producing output.*

Operations (reading abilities to be tested)
● Skimming
○ read fast to get general information
○ obtain main ideas quickly and efficiently
○ establish quickly the structure of a text
○ decide the relevance of a text (or part of a text) to their needs
● Scanning / search reading
○ look for specific information
● Reading with comprehension / careful reading
○ reading and understanding the text
■ develop ideas and opinions regarding the text
■ make inferences (e.g. infer the meaning of an unknown word from context)

Implications for construct definition
- the constructs will not be the same if we are evaluating skimming vs. reading with comprehension
  - speed when skimming and scanning
  - comprehension when reading with comprehension

Factors to consider before choosing a test task

Text
- Text types: textbooks, magazines, articles, poems, novels, letters, diaries, etc.
- Text forms: description, exposition, argumentation, narration, etc.
- Text topic
  - field-specific (familiar to the students)
  - general knowledge (non-technical, non-specialist)

- Text length
  - usually expressed in number of words
  - will vary depending on the level of the candidates
- Background knowledge
  - *if the text (or a similar text) has been seen in class prior to the task, it is okay*
- Vocabulary and grammar
  - may be given as a list of vocabulary items / grammatical structures
- Language background
  - students' level of proficiency
- Reading speed
  - may be expressed in words per minute
  - different speeds for different operations (skimming vs. reading with comprehension)
  - expect it to take more time for students to read than it takes you

Item types
- multiple choice
- short answer (when evaluating communicatively)
- gap-filling

*For lower-level students, the teacher can ask students to tell what the story is about in their L1 (if the teacher speaks their L1).*
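The reading-speed point above can be turned into a rough timing rule of thumb. A minimal sketch, assuming a words-per-minute figure for the candidates' level and a padding factor; both numbers are illustrative, not from the course:

```python
def time_limit_minutes(word_count: int, wpm: int, padding: float = 1.5) -> float:
    """Estimate a time allowance for a careful-reading task.

    wpm is an assumed reading speed for the candidates' level; the
    padding factor reflects the advice that students will read more
    slowly than the teacher predicts. Both values are illustrative.
    """
    return round(padding * word_count / wpm, 1)

print(time_limit_minutes(600, 120))  # 7.5 minutes for a 600-word text
```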

How to test reading (item types)
- Multiple choice
- Short answer (when evaluating communicatively)
  - the best short-answer items are those with a unique correct response
- Gap-filling
  - particularly useful in testing reading
  - can be used any time the required response is so complex that it may cause writing (and scoring) problems
  - also the basis for "summary cloze" (a cloze test in which the gapped text is a summary of the original text)
- Information transfer
  - the outcome is not necessarily writing, e.g. building something, following a route on a map, etc.

*The wording of reading tests is not meant to cause candidates any difficulties of comprehension.*
*Responses when testing reading should make minimal demands on writing abilities.*
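A fixed-ratio cloze (the basis of the gap-filling and summary-cloze techniques above) can be sketched as follows; the deletion rate and the choice to leave the first few words intact are illustrative conventions, not requirements from the notes:

```python
def cloze(text: str, n: int = 4) -> tuple[str, list[str]]:
    """Build a fixed-ratio cloze passage and its answer key.

    Deletes every n-th word and replaces it with a blank (n=4 here
    for a short demo; published cloze tests often delete every 6th
    or 7th word). The first n words are left intact for context.
    """
    words = text.split()
    answers = []
    for i in range(n, len(words), n):
        answers.append(words[i])
        words[i] = "____"
    return " ".join(words), answers

gapped, key = cloze("the quick brown fox jumps over the lazy dog again and again")
print(gapped)  # the quick brown fox ____ over the lazy ____ again and again
print(key)     # ['jumps', 'dog']
```

Scoring can then be "exact word" (match the key) or "acceptable word" (any contextually plausible answer); the notes' C-test variant only requires exact scoring.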

While writing items to test reading

DO
- present items in order
- make items independent of each other
- use "seen" texts (texts already seen/talked about in class)

DON'T
- ask questions that can be answered without understanding the text
- penalize errors of grammar, spelling, or punctuation if you are only measuring reading ability
- include items that give certain candidates an advantage over others

Procedures for writing items
1. Careful reading of the text, keeping in mind the specified operations (skimming, scanning, etc.)
   a. "What should a competent reader derive from the text?"
   b. Notes should be taken on main points, interesting pieces of information, stages of argument, examples, etc.
2. Decide what tasks it is reasonable to expect candidates to be able to perform in relation to these.
3. Write a draft of the test.
   a. Paragraph numbers and line numbers should be added to the text if items need to refer to them.
4. Present the text and the items to colleagues for moderation.
   a. Colleagues might want to use a moderation grid to evaluate the test.



Testing Listening

Facts to consider
- listening and speaking
- listening and reading
- the transient nature of listening

Listening abilities (the same as for written texts, but the input is oral)

Global activities
- identifying the general idea (the gist) of the text
- following the argument
- grasping the speaker's intentions

Informational operations
- obtain factual information
- follow instructions
- understand requests for information
- understand suggestions, comments, excuses, preferences, etc.

Interactional operations
- understand greetings and introductions
- understand expressions of agreement
- recognize the speaker's purpose
- recognize requests for clarification

Lower-level listening activities
- phonemic discrimination
- recognition of intonation patterns

Listening Tasks

Text types
- Monologues
- Dialogues
- Presentations (live or recorded)
  - live recommended (more authentic)
- Authentic texts vs. recordings
  - recordings = less direct, less authentic
- Passages originally intended for reading = not recommended (they do not represent natural speech)

Text form
- description
- exposition
- argumentation
- instruction
- narration

Text length and speed of speech
- text length may be expressed in seconds or minutes
- speed of speech may be expressed in words per minute (wpm) or syllables per second (sps)

Item types
- multiple choice
- gap-filling
- short answer
- information transfer (jigsaw activity)
- note taking
- transcription
- dictation

Scoring the listening test
- do not deduct points for errors of grammar or spelling when scoring a listening task

While writing items to test listening

DO
- present items in order
- make items independent of each other
- use "seen" texts (texts already seen/talked about in class)

DON'T
- ask questions that can be answered without understanding the text
- penalize errors of grammar, spelling, or punctuation if you are only measuring listening ability
- include items that give certain candidates an advantage over others
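The two speed-of-speech measures above (wpm and sps) are simple ratios; the sample counts below are made up for illustration:

```python
def speech_rate(word_count: int, syllable_count: int, seconds: float) -> tuple[float, float]:
    """Express the speed of a recorded passage two ways:
    words per minute (wpm) and syllables per second (sps)."""
    wpm = word_count / (seconds / 60)
    sps = syllable_count / seconds
    return round(wpm, 1), round(sps, 1)

# A hypothetical 2-minute recording of 300 words / 420 syllables.
print(speech_rate(300, 420, 120))  # (150.0, 3.5)
```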

Testing Young Learners

Why test young learners?
- to help teachers figure out what the students do not know
- the guiding questions are not pass/fail or placement questions, but: Do they know? How much do they know? How can we help them get better?

- assessment rather than testing

Major considerations in testing young learners
- Testing as part of assessment = an ongoing process (every day)
  - dynamic assessment (noticing the gap)
- The role of feedback
  - extremely important for improvement
  - always give students feedback, and always give positive feedback first (as a way to hold the students accountable)
- Self-assessment (smiley faces)
  - "How do you think you did?" (very informal self-assessment)

Tests for young learners

DO
- administer the test in a relaxed setting
- make sure that your test is valid and reliable
- include games, stories, and plays in the test
- use pictures and colour
- include tasks that involve interactions between children
- include integrative tasks only if teaching activities involve such tasks (validity: do not evaluate something that hasn't been seen in class)

DON'T
- make the test long
- make the content of the test ambiguous
- include tasks that children cannot handle in their first language

Recommended test techniques
- identification of people and objects
- multiple-choice pictures
- anagrams with pictures
- cartoon stories
- straightforward questions
- picture description
- placing "object" cards on "scene" cards...

