SUMMARY
Assessing Listening
Observing The Performance Of The Four Skills
One important principle for assessing a learner's competence is to consider the fallibility of the results of a single performance, such as that produced in a test.
test. As with any attempt at
measurement, it is your obligation as a teacher to triangulate your
measurements: consider at least two (or more) performances and/or contexts
before drawing a conclusion. That could take the form of one or more of the following designs:
· several tests that are combined to form an assessment
· a single test with multiple test tasks to account for learning styles and performance variables
· in-class and extra-class graded work
· alternative forms of assessment (e.g., journal, portfolio, conference, observation, self-assessment, peer-assessment).
Multiple
measures will always give you a more reliable and valid assessment than a single measure. A second principle is
one that we teachers often forget. We must rely as much as possible on
observable performance in our assessments of students. Observable means being able to see or
hear the performance of the learner (the senses of touch, taste, and smell don't apply very
often to language testing!). What, then, is observable among the four skills of listening, speaking, reading, and writing?
THE
IMPORTANCE OF LISTENING
Listening has often played second fiddle
to its counterpart, speaking. In the standardized testing industry, a number of
separate oral production tests are available, but it is rare to find just a
listening test. One
reason for this emphasis is that listening is often implied as a component of speaking. How could you
speak a language without also listening? In addition, the overtly observable
nature of speaking renders it more empirically measurable than listening. But perhaps
a deeper cause lies in universal biases toward speaking. A good speaker is often
(unwisely) valued more highly than a good listener. To determine if someone is
a proficient user of a language, people customarily ask, "Do you speak Spanish?"
People rarely ask, "Do you understand and speak Spanish?"
Every teacher of language knows that one's oral production ability (other than monologues, speeches, reading aloud, and the like) is only as good as one's listening comprehension ability.
But of even further impact is the likelihood that input in the aural-oral mode
accounts for a large proportion of successful language acquisition. In a typical day, we do
measurably more listening than speaking (with the exception of one or two of your
friends who may be nonstop chatterboxes!).
Basic
Types Of Listening
As
with all effective tests, designing appropriate assessment tasks in listening
begins with
the specification of objectives, or criteria. Those objectives may be
classified in terms
of several types of listening performance. Think about what you do when you listen. Literally in
nanoseconds, the following processes flash through your brain:
1. You
recognize speech sounds and hold a temporary "imprint" of them in short-term memory.
2. You
simultaneously determine the type of speech event (monologue, interpersonal
dialogue, transactional dialogue) that is being processed and attend to its context (who the
speaker is, location, purpose) and the content of the message.
3. You
use (bottom-up) linguistic decoding skills and/or (top-down) background schemata to bring a
plausible interpretation to the message, and assign a literal and intended
meaning to the utterance.
4. In
most cases (except for repetition tasks, which involve short-term memory only), you delete the exact
linguistic form in which the message was originally received in favor of
conceptually retaining important or relevant information in long-term memory.
Each of these stages represents a
potential assessment objective:
• comprehension of surface structure elements such as phonemes, words, intonation, or a grammatical category
•
understanding of
pragmatic context
•
determining meaning of
auditory input
•
developing the gist, a
global or comprehensive understanding
From these stages we can derive four commonly identified types of listening performance, each of which comprises a category within which to consider assessment tasks and procedures.
1. Intensive.
Listening for perception of the components (phonemes, words, intonation, discourse
markers, etc.) of a larger stretch of language.
2. Responsive.
Listening to a relatively short stretch of language (a greeting, question, command,
comprehension check, etc.) in order to make an equally short response.
3. Selective.
Processing stretches of discourse such as short monologues for several minutes
in order to "scan" for certain information. The purpose of such performance is not
necessarily to look for global or general meanings, but to be able to comprehend
designated information in a context of longer stretches of spoken
language (such as classroom directions from a teacher, TV or radio news items, or stories). Assessment tasks in selective listening could ask students, for example, to listen for names, numbers, a grammatical category, directions (in a map exercise), or certain facts and events.
4. Extensive.
Listening to develop a top-down, global understanding of spoken language. Extensive
performance ranges from listening to lengthy lectures to listening to a
conversation and deriving a comprehensive message or purpose. Listening for the
gist, for the main idea, and making inferences are all part of extensive
listening.
Micro-
And Macroskills Of Listening
The micro- and macroskills of listening (adapted from Richards, 1983) provide 17 different objectives to assess.
Microskills:
1. Discriminate
among the distinctive sounds of English.
2. Retain
chunks of language of different lengths in short-term memory.
3. Recognize
English stress patterns, words in stressed and unstressed positions, rhythmic
structure, intonation contours, and their role in signaling information.
4. Recognize
reduced forms of words.
5. Distinguish
word boundaries, recognize a core of words, and interpret word order patterns and
their significance.
6. Process
speech at different rates of delivery.
7. Process
speech containing pauses, errors, corrections, and other performance variables.
8. Recognize
grammatical word classes (nouns, verbs, etc.), systems (e.g., tense, agreement,
pluralization), patterns, rules, and elliptical forms.
9. Detect
sentence constituents and distinguish between major and minor constituents.
10. Recognize
that a particular meaning may be expressed in different grammatical forms.
11. Recognize
cohesive devices in spoken discourse.
Macroskills:
12. Recognize
the communicative functions of utterances, according to situations,
participants, goals.
13. Infer
situations, participants, goals using real-world knowledge.
14. From
events, ideas, and so on, described, predict outcomes, infer links and connections between
events, deduce causes and effects, and detect such relations as main idea,
supporting idea, new information, given information, generalization, and
exemplification.
15. Distinguish
between literal and implied meanings.
16. Use
facial, kinesic, body language, and other nonverbal clues to decipher meanings.
17. Develop
and use a battery of listening strategies, such as detecting key words, guessing the
meaning of words from context, appealing for help and signaling
comprehension or lack thereof.
Developing a sense of which aspects of listening performance are predictably difficult will help you to challenge your students appropriately and to assign weights to items. Consider the following list of what makes listening difficult (adapted from Richards, 1983; Ur, 1984; Dunkel, 1991):
1. Clustering: attending to appropriate "chunks" of language (phrases, clauses, constituents)
2. Redundancy:
recognizing the kinds of repetitions, rephrasing, elaborations, and insertions that
unrehearsed spoken language often contains, and benefiting from that
recognition
3. Reduced forms: understanding the reduced forms that may not have been a part of an English learner's past learning experiences in classes where only formal "textbook" language has been presented
4. Performance variables: being able to "weed out" hesitations, false starts, pauses, and corrections in natural speech
5. Colloquial
language: comprehending idioms, slang, reduced forms, shared cultural knowledge
6. Rate of delivery: keeping up with the speed of delivery, processing automatically as the speaker continues
7. Stress, rhythm, and intonation: correctly understanding prosodic elements of spoken language, which is almost always much more difficult than understanding the smaller phonological bits and pieces
8. Interaction: managing the interactive flow of language from listening to speaking to listening, etc.
Designing
Assessment Tasks: Intensive Listening
Once you have determined objectives, your next step is to design the task, including making decisions about how you will elicit performance and how you expect the test-taker to respond. We will look at tasks that range from intensive listening performance, such as minimal phonemic pair recognition, to extensive comprehension of language in communicative contexts. The focus in this section is on the microskills of intensive listening.
Recognizing
Phonological and Morphological Elements
A typical form of intensive listening at
this level is the assessment of recognition of phonological and
morphological elements of language. A classic test task gives a spoken stimulus and asks test-takers to identify the stimulus from two or more choices.
Paraphrase
Recognition
The next step up on the scale of
listening comprehension microskills is words phrases and sentences, which
are frequently assessed by providing a stimulus sentence and asking the test-taker
to choose the correct paraphrase from a number of choices. Designing Assessment
Tasks: Responsive Listening. A
question-and-answer format can provide some interactivity in these lower-end
listening tasks. The test-taker's response is the appropriate answer to a
question. Appropriate
response to a question
Test-takers
hear:
How
much time did you take to do your homework?
Test-takers
read:
(a)
In about an hour.
(b)
About an hour.
(c)
About $10.
(d)
Yes, I did.
The objective of this item is recognition of the wh-question how much and its appropriate response. Distractors are chosen to represent common learner errors: (a) responding to how much vs. how much longer; (c) confusing how much in reference to time vs. the more frequent reference to money; (d) confusing a wh-question with a yes/no question.

None of the tasks discussed so far has to be framed in a multiple-choice format. They can be offered in a more open-ended framework in which test-takers write or speak the response.
Designing
Assessment Tasks: Selective Listening
A third type of listening performance is selective listening, in which the test-taker listens to a limited quantity of aural input and must discern within it some specific information. A number of techniques have been used that require selective listening.
Listening
Cloze
Listening cloze tasks (sometimes called
cloze dictations or partial dictations) require the test-taker to listen to a
story, monologue, or conversation and simultaneously read the written text
in which selected words or phrases have been deleted. In its generic form, the test consists of a passage in which every nth word (typically every seventh word) is deleted and the test-taker is asked to supply an appropriate word. In a listening cloze task, test-takers see a transcript of the passage that they are listening to and fill in the blanks with the words or phrases that they hear.
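The fixed-ratio deletion just described is mechanical enough to script. The sketch below is only an illustration (the function name and the default of every seventh word are assumptions drawn from the "typically every seventh word" convention mentioned above); it blanks out every nth word of a transcript and returns the answer key:

```python
import re

def make_cloze(passage, n=7):
    """Delete every nth word of a passage, returning the cloze
    text and the list of deleted words (the answer key)."""
    words = passage.split()
    cloze_words, answers = [], []
    for i, word in enumerate(words, start=1):
        if i % n == 0:
            # Keep trailing punctuation attached to the blank.
            core = re.sub(r"[^\w'-]+$", "", word)
            trail = word[len(core):]
            answers.append(core)
            cloze_words.append("_____" + trail)
        else:
            cloze_words.append(word)
    return " ".join(cloze_words), answers

text = ("The test consists of a passage in which every seventh "
        "word is deleted and the test-taker supplies it.")
cloze, key = make_cloze(text, n=7)
```

A teacher would, of course, inspect the output: purely mechanical deletion sometimes removes function words that are trivial (or impossible) to recover from listening, so rational deletion of selected content words is a common manual alternative.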
Information
Transfer
Selective listening can also be assessed
through an information transfer technique in which aurally processed information
must be transferred to a visual representation, such as labeling a diagram,
identifying an element in a picture, completing a form, or showing
routes on a map. At the lower end of the scale of linguistic complexity, simple picture-cued items are sometimes efficient rubrics for assessing certain selected information.
Sentence
Repetition
The task of simply repeating a sentence
or a partial sentence, or sentence repetition,
is also used as an assessment of listening comprehension. As in a dictation (discussed below), the test-taker must retain a stretch of language long enough to reproduce it, and then must respond with an oral repetition of that stimulus. Incorrect listening comprehension, whether at the phonemic or discourse level, may be manifested in the correctness of the repetition. A miscue in repetition is scored as a miscue in listening.
Sentence repetition is far from a flawless listening assessment task. Buck (2001, p. 79) noted that such tasks are not just tests of listening, but tests of general skills. Further, this task may test only recognition of sounds, and it can easily be contaminated by lack of short-term memory ability, thus invalidating it as an assessment of comprehension alone. And the teacher may never be able to distinguish a listening comprehension error from an oral production error. Therefore, sentence repetition tasks should be used with caution.
Designing
Assessment Tasks: Extensive Listening
1. Can listening performance be distinguished from cognitive processing factors such as memory, associations, storage, and recall?
2. As assessment procedures become more communicative, does the task take into account test-takers' ability to use grammatical expectancies, lexical collocations, semantic interpretations, and pragmatic competence?
3. Are test tasks themselves correspondingly content valid and authentic, that is, do they mirror real-world language and context?
4. As assessment tasks become more and more open-ended, they more closely resemble pedagogical tasks, which leads one to ask what the difference is between assessment and teaching tasks. The answer lies in scoring: the former imply specified scoring procedures, while the latter do not.
Dictation
Dictation is a widely researched genre
of assessing listening comprehension. In a dictation, test-takers
hear a passage, typically of 50 to 100 words, recited three times: first, at normal speed;
then, with long pauses between phrases or natural word groups, during which
time test-takers write down what they have just heard; and finally, at normal speed once more so they can check their work and proofread.
Kinds of errors:
1. spelling error only (the word appears to have been heard correctly)
2. spelling and/or obvious misrepresentation of a word, illegible word
3. grammatical error (for example, test-taker hears I can't do it, writes I can do it)
4. skipped word or phrase
5. permutation of words
6. additional words not in the original
7. replacement of a word with an appropriate synonym
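Several of these error categories can be made operational with a word-level alignment of the original passage against the test-taker's transcript. The sketch below is only an illustration, not Brown's scoring method: it uses Python's difflib and collapses the taxonomy to skipped, added, and replaced words (spelling-only errors and synonym replacements would still need human judgment):

```python
import difflib

def classify_errors(original, transcript):
    """Align the dictation passage with what the test-taker wrote
    and tally skipped, added, and replaced words."""
    orig = original.lower().split()
    written = transcript.lower().split()
    counts = {"skipped": 0, "added": 0, "replaced": 0}
    matcher = difflib.SequenceMatcher(a=orig, b=written)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "delete":        # word in passage, missing from transcript
            counts["skipped"] += i2 - i1
        elif op == "insert":      # word in transcript, not in passage
            counts["added"] += j2 - j1
        elif op == "replace":     # word substituted (grammar, mishearing)
            counts["replaced"] += max(i2 - i1, j2 - j1)
    return counts

errors = classify_errors("I can't do it today",
                         "I can do it now today")
```

On this example the alignment flags one replaced word (can't → can, a grammatical error in the taxonomy above) and one added word (now).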
Here are some other possibilities for assessing extensive listening.
1. Note-taking. In the academic world, classroom lectures by professors are common features of a non-native English-user's experience. One form of a midterm examination at the American Language Institute at San Francisco State University (Kahn, 2002) uses a 15-minute lecture as a stimulus.
2. Editing. Another authentic task provides both a written and a spoken stimulus and requires the test-taker to listen for discrepancies. Scoring achieves relatively high reliability, as there are usually a small number of specific differences that must be identified. Here is the way the task proceeds: editing a written version of an aural stimulus.
3. Interpretive tasks. One of the intensive listening tasks described above was paraphrasing a story or conversation. An interpretive task extends the stimulus material to a longer stretch of discourse and forces the test-taker to infer a response.
Assessing Speaking
Basic
Types Of Speaking
1. Imitative. At one end of a continuum of types of speaking performance is the ability to simply parrot back (imitate) a word or phrase or possibly a sentence. While this is a purely
phonetic level of oral production, a number of prosodic, lexical, and
grammatical properties of language may be included in the criterion performance.
We are interested only in what is traditionally labeled
"pronunciation"; no
inferences
are made about the test-taker's ability to understand or convey meaning or to participate in an
interactive conversation. The only role of listening here is in the short-term storage
of a prompt, just long enough to allow the speaker to retain the short stretch of
language that must be imitated.
2. Intensive.
A second type of speaking frequently employed in assessment contexts is the
production of short stretches of oral language designed to demonstrate
competence in a narrow band of grammatical, phrasal, lexical, or phonological
relationships (such as prosodic elements: intonation, stress, rhythm, juncture).
The speaker must be aware of semantic properties in order to be able to respond, but
interaction with an interlocutor or test administrator is minimal at best.
3. Responsive.
Responsive assessment tasks include interaction and test comprehension but at
the somewhat limited level of very short conversations, standard greetings and small
talk, simple requests and comments, and the like. The stimulus is almost always a
spoken prompt (in order to preserve authenticity), with perhaps only one or two
follow-up questions or retorts.
4. Interactive.
The difference between responsive and interactive speaking is in the length and
complexity of the interaction, which sometimes includes multiple exchanges and/or
multiple participants. Interaction can take the two forms of transactional language, which has the purpose of exchanging specific information, or interpersonal exchanges, which have the purpose of maintaining social relationships.
Micro
And Macroskills Of Speaking
Microskills
1. Produce
differences among English phonemes and allophonic variants.
2. Produce
chunks of language of different lengths.
3. Produce
English stress patterns, words in stressed and unstressed positions, rhythmic structure, and
intonation contours.
4. Produce
reduced forms of words and phrases.
5. Use
an adequate number of lexical units (words) to accomplish pragmatic purposes.
6. Produce
fluent speech at different rates of delivery.
7. Monitor
one's own oral production and use various strategic devices pauses, fillers,
selfcorrections, backtracking-to enhance the clarity of the message.
8. Use
grammatical word classes (nouns, verbs, etc.), systems (e.g., tense, agreement,
pluralization), word order, patterns, rules, and elliptical forms.
9. Produce
speech in natural constituents: in appropriate phrases, pause groups, breath groups,
and sentence constituents.
10. Express
a particular meaning in different grammatical forms.
11. Use
cohesive devices in spoken discourse.
Macroskills
12. Appropriately
accomplish communicative functions according to situations,
participants, and goals.
13. Use
appropriate styles, registers, implicature, redundancies, pragmatic conventions,
conversation rules, floor-keeping and -yielding, interrupting, and other
sociolinguistic features in face-to-face conversations.
14. Convey links and connections between events and communicate such relations as focal and peripheral ideas, events and feelings, new information and given information, and generalization and exemplification.
15. Convey facial features, kinesics, body language, and other nonverbal cues along with verbal language.
16. Develop
and use a battery of speaking strategies, such as emphasizing words, rephrasing,
providing a context for interpreting the meaning of words, appealing for
help, and accurately assessing how well your interlocutor is
understanding you.
There is such an array of oral
production tasks that a complete treatment is almost impossible
within the confines of one chapter in this book. Below is a consideration of
the most common techniques with brief allusions to related tasks. As already noted in the
introduction to this chapter, consider three important issues as you set out to
design tasks:
1. No speaking task is capable of isolating the single skill of oral production. Concurrent involvement of the additional performance of aural comprehension, and possibly reading, is usually necessary.
2. Eliciting the specific criterion you have designated for a task can be tricky because, beyond the word level, spoken language offers a number of productive options to test-takers. Make sure your elicitation prompt achieves its aims as closely as possible.
3. Because
of the above two characteristics of oral production assessment, it is important to carefully
specify scoring procedures for a response so that ultimately you achieve as high a
reliability index as possible.
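One concrete way to monitor the reliability called for in point 3 is to have two raters score the same set of responses and correlate their scores. The sketch below is illustrative only (the rater scores are invented, and a Pearson correlation is just one of several possible inter-rater reliability indices):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two raters' scores
    over the same set of test-takers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores from two raters on five speakers (0-5 scale)
rater_a = [4, 3, 5, 2, 4]
rater_b = [4, 2, 5, 3, 4]
r = pearson(rater_a, rater_b)
```

A low coefficient signals that the scoring procedure needs tighter specification (clearer descriptors, rater training, or both) before scores can be trusted.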
Designing
Assessment Tasks: Imitative Speaking
You may be surprised to see the
inclusion of simple phonological imitation in a consideration of assessment of
oral production. After all, endless repeating of words, phrases, and sentences
was the province of the long-since-discarded Audiolingual Method, and in an era
of communicative language teaching, many believe that non-meaningful imitation
of sounds is fruitless. Such opinions have faded in recent years as we discovered that
an overemphasis on fluency can sometimes lead to the decline of accuracy in
speech. And so we have been paying more attention to pronunciation, especially
suprasegmentals, in an attempt to help learners be more comprehensible.
Test
Of Spoken English (TSE)
Somewhere straddling responsive, interactive, and extensive speaking tasks is another popular commercial oral production assessment, the Test of Spoken English (TSE). The TSE is a 20-minute audiotaped test of oral language ability within an academic or professional environment. TSE scores are used by many North American institutions of higher education to select international teaching assistants.
The scores are also used for selecting
and certifying health professionals such as physicians, nurses,
pharmacists, physical therapists, and veterinarians. The tasks on the TSE
are designed to elicit oral production in various discourse categories rather than
in selected phonological, grammatical, or lexical targets. The following content
specifications for the TSE represent the discourse and pragmatic contexts assessed in
each administration:
1. Describe
something physical.
2. Narrate
from presented material.
3. Summarize
information of the speaker's own choice.
4. Give directions based on visual materials.
5. Give
instructions.
6. Give
an opinion.
7. Support
an opinion.
8. Compare/contrast.
9. Hypothesize.
10. Function "interactively."
11. Define.
Using
these specifications, Lazaraton and Wagner (1996) examined 15 different specific
tasks in collecting background data from native and non-native speakers of English.
1. giving
a personal description
2. describing
a daily routine
3. suggesting
a gift and supporting one's choice
4. recommending
a place to visit and supporting one's choice
5. giving
directions
6. describing
a favorite movie and supporting one's choice
7. telling
a story from pictures
8. hypothesizing
about future action
9. hypothesizing
about a preventative action
10. making
a telephone call to the dry cleaner
11. describing
an important news event
12. giving
an opinion about animals in the zoo
13. defining
a technical term
14. describing
information in a graph and speculating about its implications
15. giving
details about a trip schedule
The final two categories of oral
production assessment (interactive and extensive speaking) include tasks
that involve relatively long stretches of interactive discourse (interviews, role
plays, discussions, games) and tasks of equally long duration but that involve less
interaction (speeches, telling longer stories, and extended explanations and translations). The obvious difference between the two sets of tasks is the degree of interaction with an interlocutor. Also, interactive tasks are what some would describe as interpersonal, while the final category includes more transactional speech events.
Interview
When "oral production
assessment" is mentioned, the first thing that comes to mind is an oral interview: a
test administrator and a test-taker sit down in a direct face-to-face exchange
and proceed through a protocol of questions and directives. The interview, which may be tape-recorded for re-listening, is then scored on one or more parameters such as accuracy in pronunciation and/or grammar, vocabulary usage, fluency, sociolinguistic/pragmatic appropriateness, task accomplishment, and even comprehension. Interviews can vary in
length from perhaps five to forty-five minutes, depending on their
purpose and context. Placement interviews, designed to get a quick spoken sample
from a student in order to verify placement into a course, may need only five minutes
if the interviewer is trained to evaluate the output accurately. Longer comprehensive
interviews such as the OPI (see the next section) are designed to cover
predetermined oral production contexts and may require the better part of an hour.
Every effective interview contains a number of mandatory stages. Two decades ago, Michael Canale (1984) proposed a framework for oral proficiency testing that has withstood the test of time. He suggested that test-takers will perform at their best if they are led through four stages:
1. Warm-up. In a minute or so of preliminary small talk, the interviewer directs mutual introductions, helps the test-taker become comfortable with the situation, apprises the test-taker of the format, and allays anxieties. No scoring of this phase takes place.
2. Level check. Through a series of preplanned questions, the interviewer stimulates the test-taker to respond using expected or predicted forms and functions. If, for example, from previous test information, grades, or other data, the test-taker has been judged to be a "Level 2" (see below) speaker, the interviewer's prompts will attempt to confirm this assumption.
3. Probe. Probe questions and prompts challenge test-takers to go to the heights of their ability, to extend beyond the limits of the interviewer's expectations, through increasingly difficult questions. Probe questions may be complex in their framing and/or complex in their cognitive and linguistic demand. Through probe items, the interviewer discovers the ceiling or limitation of the test-taker's proficiency. This need not be a separate stage entirely, but might be a set of questions that are interspersed into the previous stage. At the lower levels of proficiency, probe items may simply demand a higher range of vocabulary or grammar from the test-taker than predicted.
4. Wind-down. This final phase of the interview is simply a short period of time during which the interviewer encourages the test-taker to relax with some questions, sets the test-taker's mind at ease, and provides information about when and where to obtain the results of the interview.
Discussions
and Conversations
As formal assessment devices,
discussions and conversations with and among students are difficult to specify
and even more difficult to score. But as informal techniques to assess
learners, they offer a level of authenticity and spontaneity that other assessment techniques
may not provide. Discussions may be especially appropriate tasks through which to
elicit and observe such abilities as
topic
nomination, maintenance, and termination.
Games
Among informal assessment devices are a variety of games that directly involve language production. Consider the following types:
Assessment
games
1. "Tinkertoy"
game: A Tinkertoy (or Lego block) structure is built behind a screen. One or two
learners are allowed to view the structure. In successive stages of
construction, the learners tell "runners" (who can't observe the structure)
how to re-create the structure. The runners then tell "builders"
behind another screen how to build the structure. The builders may question or confirm
as they proceed, but only through the two degrees of separation. Object: re-create
the structure as accurately as
possible.
2. Crossword
puzzles are created in which the names of all members of a class are clued by
obscure information about them. Each class member must ask questions of
others to determine who matches the clues in the puzzle.
3. Information gap grids are created such that class members must conduct mini-interviews of other classmates to fill in boxes, e.g., "born in July," "plays the violin," "has a two-year-old child," etc.
4. City
maps are distributed to class members. Predetermined map directions are given to one
student who, with a city map in front of him or her, describes the route to
a partner, who must then trace the route and get to the correct final
destination.
Oral
Proficiency Interview (OPI)
The best-known oral interview format is one that has gone through a considerable metamorphosis over the last half-century, the Oral Proficiency Interview (OPI). Originally known as the Foreign Service Institute (FSI) test, the OPI is the result of a historical progression of revisions under the auspices of several agencies, including the Educational Testing Service and the American Council on the Teaching of Foreign Languages (ACTFL).
Source: Brown, H. Douglas. 2003. Language Assessment: Principles and Classroom Practices (pp. 112-184). San Francisco, California.