SUMMARY
Assessing Grammar
A. Differing notions of ‘grammar’ for assessment
In
reaction to the grammar-translation approach that had become more about
learning a set of abstract linguistic rules than about learning to use a
language for some communicative purpose, some language teachers began to seek
alternative approaches to language teaching based on what students could ‘do’
with the language. These teachers insisted that the grammar should not only be
learned, but also applied to some linguistic or communicative purpose. They
recommended that grammatical analysis be accompanied by application, where
students are asked to answer questions, write illustrative examples, combine
sentences, correct errors, write paragraphs and so forth. To know a language meant
to be able to apply the rules – an approach relatively similar to what is done
in many classrooms today. In this approach, knowledge of grammar was assessed
by having students apply rules to language in some linguistic context.
Most
of the early debates about language teaching have now been resolved; however,
others continue to generate discussion. For example, most language teachers
nowadays would no longer expect their students to devote most of their time to
describing and analyzing language systems, to translating texts or to learning
a language solely for access to its literature; rather, they would want their
students to learn the language for some communicative purpose. In other words,
the primary goal of language learning today is to foster communicative
competence, or the ability to communicate effectively and spontaneously in
real-life settings. Language teachers today would not deny that grammatical
competence is an integral part of communicative language ability, but most
would maintain that grammar should be viewed as an indispensable resource for
effective communication and not, except under special circumstances, an object
of study in itself. Current teaching controversies revolve around the role, if
any, that grammar instruction should play in the language classroom and the
degree to which the grammatical system of a language can be acquired through
instruction. These questions have, since the 1980s, produced an explosion of
empirical research, which is of critical importance to language teachers.
Grammar and Linguistics
Since the 1950s, many
linguistic theories – too numerous to list here – have been proposed to
explain language phenomena. Many of these theories have helped shape how L2
educators currently define grammar in educational contexts. Although it is
beyond the purview of this book to provide a comprehensive review of these
theories, it is, nonetheless, helpful to mention a few, considering both the impact
they have had on L2 education and the role they play in helping define grammar
for assessment purposes. Generally speaking, most linguists have embraced one
of two general perspectives to describe linguistic phenomena. Either they take
a syntactocentric perspective of language, where syntax, or the way in which words
are arranged in a sentence, is the central feature to be observed and analyzed;
or they adopt a communication perspective of language, where the observational
and analytic emphasis is on how language is used to convey meaning (Van Valin
and LaPolla, 1997). I will use these two perspectives to classify some of the
more influential grammatical paradigms in our field.
Form-Based
Perspectives Of Language
Several syntactocentric, or form-based, theories
of language have provided grammatical insights to L2 teachers. I will describe
three: traditional grammar, structural linguistics and
transformational-generative grammar. One of the oldest theories to describe the
structure of language is traditional grammar. Originally based on the study of
Latin and Greek, traditional grammar drew on data from literary texts to
provide rich and lengthy descriptions of linguistic form. Unlike some other
syntactocentric theories, traditional grammar also revealed the linguistic
meanings of these forms and provided information on their usage in a sentence (Celce-Murcia
and Larsen-Freeman, 1999). Traditional grammar supplied an extensive set of
prescriptive rules along with the exceptions. A typical rule in a traditional English
grammar might be:
The
first-person singular of the present tense verb ‘to be’ is ‘I am’. ‘Am’ is used
with ‘I’ in all cases, except in first-person singular negative tag and yes/no
questions, which are contracted. In this case, the verb ‘are’ is used instead
of ‘am’. For example, ‘I’m in a real bind, aren’t I?’ or ‘Aren’t I trying my
best?’
Probably the best-known syntactocentric theory is
Chomsky’s (1965) transformational-generative grammar and its later, broader
instantiation, universal grammar (UG). Unlike the traditional or structural grammars
that aim to describe one particular language, transformational-generative grammar
endeavored to provide a ‘universal’ description of language behavior, revealing
the internal linguistic system to which all humans are predisposed (Radford,
1988). Transformational-generative grammar claims that the underlying properties
of any individual language system can be uncovered by means of a detailed,
sentence-level analysis. In this regard, Chomsky proposed a set of
phrase-structure rules that describe the underlying structures of all languages.
These phrase structure rules join with lexical items to offer a semantic
representation to the rules. Following this, a series of ‘transformation’ rules
are applied to the basic structure to add, delete, move or substitute the underlying
constituents in the sentence. Morphological rules are then applied, followed by
phonological or orthographic rules (for further information, see Radford, 1988,
or Celce-Murcia and Larsen-Freeman, 1999).
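The derivational pipeline sketched above – phrase-structure rules expanded and then joined with lexical items – can be illustrated with a toy example. The rules and lexicon below are invented for illustration only; they are a minimal sketch, not Chomsky’s actual formalism, and they omit the transformational, morphological and phonological stages:

```python
import random

# Invented toy phrase-structure rules (illustrative only, not from Chomsky).
RULES = {
    "S": [["NP", "VP"]],    # a sentence is a noun phrase plus a verb phrase
    "NP": [["Det", "N"]],   # a noun phrase is a determiner plus a noun
    "VP": [["V", "NP"]],    # a verb phrase is a verb plus a noun phrase
}

# Lexical items joined to the phrase structure to give it content.
LEXICON = {
    "Det": ["the"],
    "N": ["student", "teacher"],
    "V": ["saw"],
}

def expand(symbol):
    """Recursively expand a symbol into a list of words."""
    if symbol in LEXICON:
        return [random.choice(LEXICON[symbol])]
    production = random.choice(RULES[symbol])
    words = []
    for sym in production:
        words.extend(expand(sym))
    return words

print(" ".join(expand("S")))  # e.g. "the student saw the teacher"
```

The point of the sketch is only that a small set of underlying rules can generate well-formed surface strings – the ‘generative’ idea in miniature.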
Form-
And Use-Based Perspectives Of Language
The three theories of linguistic analysis described
thus far have provided insights to L2 educators on several grammatical forms.
These insights provide information to explain what structures are theoretically
possible in a language. Other linguistic theories, however, are better equipped
to examine how speakers and writers actually exploit linguistic forms during
language use. For example, if we wish to explain how seemingly similar
structures like I like to read and I like reading connote different meanings,
we might turn to those theories that study grammatical form and use interfaces.
This would address questions such as: Why does a language need two or more
structures that are similar in meaning? Are similar forms used to convey
different specialized meanings? To what degree are similar forms a function of
written versus spoken language, or to what degree are these forms characteristic
of a particular social group or a specific situation? It is important for us to
discuss these questions briefly if we ultimately wish to test grammatical forms
along with their meanings and uses in context.
Biber et al. (1998) identified a second kind of
corpus-based study that relates grammatical forms to different types of texts.
For example, how do academic texts differ from informal conversations in terms
of the passive voice? Besides showing which linguistic features are possible in
texts, corpus linguistics strives to identify which are probable. In other
words, to what degree are linguistic features likely to occur in certain texts
and in what circumstances? For example, in physical descriptions of objects the
majority of the verbs are non-progressive or stative. Unlike descriptive linguistics
or UG, corpus linguistics is not primarily concerned with syntax; rather, it
focuses on how words co-occur with other words in a single sentence or text.
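The kind of ‘possible versus probable’ question corpus linguists ask – for instance, how academic texts differ from conversation in their use of the passive voice – can be sketched with a minimal example. The two mini-corpora and the simple passive pattern below are invented toy data, not drawn from an actual corpus study:

```python
import re

# Invented mini-corpora (illustrative only).
academic = [
    "The samples were analyzed in the laboratory.",
    "The results were reported in the appendix.",
    "The solution was heated to 90 degrees.",
]
conversation = [
    "I heated the soup.",
    "Did you see the game last night?",
    "She reported the results to me.",
]

# A crude approximation of the passive: was/were + a word ending in -ed.
PASSIVE = re.compile(r"\b(was|were)\s+\w+ed\b")

def passive_rate(sentences):
    """Proportion of sentences containing the simple passive pattern."""
    hits = sum(1 for s in sentences if PASSIVE.search(s))
    return hits / len(sentences)

print(passive_rate(academic))      # 1.0 in this toy sample
print(passive_rate(conversation))  # 0.0 in this toy sample
```

Both registers could in principle contain passives (the feature is ‘possible’ in each); what the counts show is how probable the feature is in each text type – the distinction the passage draws.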
Communication-Based
Perspectives Of Language
Other theories have provided grammatical insights
from a communication-based perspective. Such a perspective expresses the notion
that language involves more than linguistic form. It moves beyond the view of language
as patterns of morphosyntax observed within relatively decontextualized sentences
or sentences found within naturally occurring corpora. Rather, a communication-based
perspective views grammar as a set of linguistic norms, preferences and expectations
that an individual invokes to convey a host of pragmatic meanings that are
appropriate, acceptable and natural depending on the situation. The assumption
here is that linguistic form has no absolute, fixed meaning in language use (as
seen in sentences 1.5 and 1.7 above), but is mutable and open to interpretation
by those who use it in a given circumstance. Grammar in this context is often
co-terminous with language itself, and stands not only for form, but also for
meaningfulness and pragmatic appropriacy, acceptability or naturalness – a
topic I will return to later since I believe that a blurring of these concepts
is misleading and potentially problematic for language educators.
What
is pedagogical grammar?
Many language teachers who have taken courses in
linguistic analysis and learned to examine language within the frameworks of
formal, grammatical theories have often felt that these courses did not
adequately meet their immediate needs. This is often because courses in
linguistic analysis rarely address classroom concerns such as what grammar to teach,
how to teach it and how to test it. Furthermore, it is unlikely that language
teachers would attempt to teach phrase-structure rules, parameter setting
conditions or abstract notions of time and space, and certainly, they would
never test students on these principles.
In this chapter, I have attempted to answer the
question ‘What do we mean by grammar?’ In this respect, I have differentiated
between language and language analysis or linguistics. I have also discussed
several schools of linguistics and have shown how each has broadened our understanding
of what is meant by ‘grammar’. Finally, I have shown how these different
notions of grammar provide complementary information that could be drawn on for
purposes of teaching or assessing grammar. In the next chapter I will discuss how
second language grammatical knowledge is acquired. In this respect, we will
examine how grammatical ability has been conceptualized in L2 grammar teaching
and learning, and how L2 grammar teaching and learning are intrinsically linked
to assessment.
B. Research on L2 grammar teaching, learning and assessment
In
recent years, some of these same questions have been addressed by second
language acquisition (SLA) researchers in a variety of empirically based
studies. These studies have principally focused on describing how a
learner’s interlanguage (Selinker, 1972), or L2, develops over
time and on the effects that L2 instruction may have on this progression. In
most of these studies, researchers have investigated the effects of learning
grammatical forms by means of one or more assessment tasks. Based on the
conclusions drawn from these assessments, SLA researchers have gained a much
better understanding of how grammar instruction impacts both language learning
in general and grammar learning in particular. However, in far too many SLA
studies, the ability under investigation has been poorly defined or defined
with no relation to a model of L2 grammatical ability.
Comparative
Methods Studies
The comparative methods studies sought to compare
the effects of different language-teaching methods on the acquisition of an L2.
These studies occurred principally in the 1960s and 1970s, and stemmed from a reaction
to the grammar-translation method, which had dominated language instruction
during the first half of the twentieth century. More generally, these studies
were in reaction to form-focused instruction (referred to as ‘focus on forms’
by Long, 1991), which used a traditional structural syllabus of grammatical
forms as the organizing principle for L2 instruction. According to Ellis
(1997), form-focused instruction contrasts with meaning-focused instruction in
that meaning-focused instruction emphasizes the communication of messages
(i.e., the act of making a suggestion and the content of such a suggestion)
while form-focused instruction stresses the learning of linguistic forms. These
can be further contrasted with form-and-meaning focused instruction (referred
to by Long (1991) as ‘focus-on-form’), where grammar instruction occurs in a
meaning-based environment and where learners strive to communicate meaning
while paying attention to form. (Note that Long’s version of ‘focus-on-form’
stresses a meaning orientation with an incidental focus on forms.) These
comparative methods studies all shared the theoretical premise that grammar has
a central place in the curriculum, and that successful learning depends on the
teaching method and the degree to which it promotes grammar processing.
Empirical
Studies In Support Of Non-Intervention
The non-interventionist position was examined
empirically by Prabhu (1987) in a project known as the Communicational Teaching
Project (CTP) in southern India. This study sought to demonstrate that the
development of grammatical ability could be achieved through a task-based, rather
than a form-focused, approach to language teaching, provided that the tasks
required learners to engage in meaningful communication. In the CTP, Prabhu
(1987) argued against the notion that the development of grammatical ability
depended on a systematic presentation of grammar followed by planned practice.
Possible Implications Of Fixed Developmental Order For Language Assessment
The notion that structures appear to be acquired in
a fixed developmental order and in a fixed developmental sequence might
conceivably have some relevance to the assessment of grammatical ability. First
of all, these findings could give language testers an empirical basis for
constructing grammar tests that would account for the variability inherent in a
learner’s interlanguage. In other words, information on the acquisition order
of grammatical items could conceivably serve as a basis for selecting grammatical
content for tests that aim to measure different levels of developmental
progression, as Chang (2002, 2004) did in examining the underlying
structure of a test that attempted to measure knowledge of relative
clauses. These findings also suggest a substantive approach to defining test
tasks according to developmental order and sequence on the basis of how grammatical
features are acquired over time (Ellis, 2001b). In other words, one task could
potentially tap into developmental level one, while another taps into
developmental level two, and so forth.
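One way such level-by-level test tasks might be scored can be sketched as follows. The levels, the example structures in the comments, the mastery threshold and the scoring rule are all invented for illustration; they do not come from an actual acquisition study and, as the next section argues, any real application would need far stronger empirical grounding:

```python
# Invented mastery threshold (illustrative only).
MASTERY = 0.8

def placement(responses):
    """responses: {level: [True/False per item]}.

    Returns the highest developmental level mastered, counting up from
    level 1 and stopping at the first level below the threshold.
    """
    level = 0
    for lvl in sorted(responses):
        proportion_correct = sum(responses[lvl]) / len(responses[lvl])
        if proportion_correct >= MASTERY:
            level = lvl
        else:
            break
    return level

# A hypothetical learner's item-level results on tasks tagged by level.
learner = {
    1: [True, True, True, True, True],     # e.g. items on an early-acquired form
    2: [True, True, True, True, False],    # e.g. items on a mid-sequence form
    3: [True, False, False, True, False],  # e.g. items on a late-acquired form
}
print(placement(learner))  # 2: levels 1-2 mastered, level 3 not yet
```

The implicational logic (stop at the first non-mastered level) reflects the idea of a fixed sequence: a learner is not credited with level three before mastering level two.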
Problems
With The Use Of Development Sequences As A Basis For Assessment
Although developmental sequence research offers an
intuitively appealing complement to accuracy-based assessments in terms of
interpreting test scores, I believe this method is fraught with a number of
serious problems, and language educators should use extreme caution in applying
this method to language testing. This is because our understanding of natural
acquisitional sequences is incomplete and at too early a stage of research to
be the basis for concrete assessment recommendations (Lightbown, 1985; Hudson,
1993).
Interventionist
Studies
Not all L2 educators are in agreement with the
non-interventionist position on grammar instruction. In fact, several (e.g.,
Schmidt, 1983; Swain, 1991) have maintained that although some L2 learners are
successful in acquiring selected linguistic features without explicit grammar
instruction, the majority fail to do so. Testimony to this is the large number
of non-native speakers who emigrate to countries around the world, live there
all their lives and fail to learn the target language, or fail to learn it well
enough to realize their personal, social and long-term career goals.
Empirical
Studies In Support Of Intervention
Aside from anecdotal evidence, the non-interventionist
position has come under intense attack on both theoretical and empirical
grounds with several SLA researchers affirming that efforts to teach L2 grammar
typically result in the development of L2 grammatical ability. Hulstijn (1989)
and Alanen (1995) investigated the effectiveness of L2 grammar instruction on
SLA in comparison with no formal instruction.
Research
On Instructional Techniques And Their Effects On Acquisition
Much of the recent research on teaching grammar has
focused on four types of instructional techniques and their effects on acquisition.
Although a complete discussion of teaching interventions is outside the purview
of this book (see Ellis, 1997; Doughty and Williams, 1998), these techniques
include form- or rule-based techniques, input-based techniques, feedback-based
techniques and practice-based techniques (Norris and Ortega, 2000).
Grammar
Processing And Second Language Development
It
is important for language teachers and testers to understand these processes, especially
for classroom assessments. For example, I have had students fake their way
through an entire lesson on the second conditional. They knew the form and
could produce it well enough, but it was not until the end of the lesson that I
realized they had not really understood the meaning of the hypothetical or
counterfactual conditional. In other words, meaning was not mapped onto the
form. A short comprehension test earlier in the lesson might have allowed me to
re-teach the meaning of the conditionals before moving ahead.
Implicit grammatical knowledge refers to ‘the
knowledge of a language that is typically manifest in some form of naturally
occurring language behavior such as conversation’ (Ellis, 2001b, p. 252). In
terms of processing time, it is unconscious and is accessed quickly. DeKeyser (1995)
classifies grammatical instruction as implicit when it does not involve rule
presentation or a request to focus on form in the input; rather, implicit
grammatical instruction involves semantic processing of the input without any
degree of awareness of grammatical form.
In this chapter, I have demonstrated how the teaching,
learning and assessment of L2 grammatical ability are intrinsically related.
Language educators depend on linguists for information on the nature of
language, so that teaching, learning and assessment can reflect current notions
of language. Language educators also depend on experience, other language teachers
and SLA researchers for insights on teaching and learning, so that the
processes underlying instruction and acquisition can be obtained and so that
information on how learning can be maximized can be generated. Finally, both
language educators and SLA researchers depend on language testers for expertise
in the design and development of assessments so that samples of learner performance
can be consistently elicited, and so that the information observed from
assessments can be used to make claims about what a learner does or does not
know. In the next two chapters I will discuss how grammar has been defined in
models of language proficiency and will argue for a coherent model of grammatical
ability – one that could be used for test development and test validation
purposes.
C. The role of grammar in models of communicative language ability
In
this chapter I will discuss the role that grammar plays in models of communicative
competence. I will then endeavor to define grammar for assessment purposes. In
this discussion I will describe in some detail the relationships among
grammatical form, grammatical meaning and pragmatic meaning. Finally, I will
present a theoretical model of grammar that will be used in this book as a basis
for a model of grammatical knowledge. This will, in turn, be the basis for
grammar-test construction and validation. In the following chapter I will
discuss what it means for L2 learners to have grammatical ability.
The
Role Of Grammar In Models Of Communicative Competence
Every language educator who has ever attempted to
measure a student’s communicative language ability has wondered: ‘What exactly
does a student need to “know” in terms of grammar to be able to use it well enough
for some real-world purpose?’ In other words, they have been faced with the
challenge of defining grammar for communicative purposes. To complicate matters
further, linguistic notions of grammar have changed over time, as we have seen,
and this has significantly increased the number of components that could be
called ‘grammar’. In short, definitions of grammar and grammatical knowledge
have changed over time and across context, and I expect this will be no
different in the future.
Rea-Dickins’
Definition Of Grammar
In
discussing more specifically how grammatical knowledge might be tested within a
communicative framework, Rea-Dickins (1991) defined ‘grammar’ as the single
embodiment of syntax, semantics and pragmatics. She argued against Canale and
Swain’s (1980) and Bachman’s (1990b) multi-componential view of communicative
competence on the grounds that componential representations overlook the
interdependence and interaction between and among the various components. She
further stated that in Canale and Swain’s (1980) model, the notion of grammatical
competence was limited since it defined grammar as ‘structure’ on the one hand
and as ‘structure and semantics’ on the other, but ignored the notion of
‘structure as pragmatics’. Similarly, she added that in Bachman’s (1990b)
model, grammar was defined as structure at the sentence level and as cohesion
at the suprasentential level, but this model failed to account for the pragmatic
dimension of communicative grammar. Instead, Rea-Dickins (1991) argued that for
grammar to be truly ‘communicative’, it had to ‘allow for the processing of
semantically acceptable syntactic forms, which are in turn governed by
pragmatic principles’ (p. 114), and not be solely an embodiment of
morphosyntax.
Larsen-Freeman’s
Definition Of Grammar
Another
conceptualization of grammar that merits attention is Larsen-Freeman’s (1991,
1997) framework for the teaching of grammar in communicative language teaching
contexts. Drawing on several linguistic theories and influenced by language teaching
pedagogy, she has also characterized grammatical knowledge along three
dimensions: linguistic form, semantic meaning and pragmatic use. Form is
defined as both morphology, or how words are formed, and syntactic patterns, or
how words are strung together. This dimension is primarily concerned with linguistic
accuracy. The meaning dimension describes the inherent or literal message
conveyed by a lexical item or a lexico-grammatical feature.
What
Is Meant By ‘Grammar’ For Assessment Purposes?
Now with a better understanding of how grammar has
been conceptualized in models of language ability, how might we define
‘grammar’ for assessment purposes? It should be obvious from the previous
discussion that there is no one ‘right’ way to define grammar. In one testing
situation the assessment goal might be to obtain information on students’ knowledge
of linguistic forms in minimally contextualized sentences, while in another, it
might be to determine how well learners can use linguistic forms to express a
wide range of communicative meanings. Regardless of the assessment purpose, if
we wish to make inferences about grammatical ability on the basis of a grammar
test or some other form of assessment, it is important to know what we mean by
‘grammar’ when attempting to specify components of grammatical knowledge for
measurement purposes. With this goal in mind, we need a definition of
grammatical knowledge that is broad enough to provide a theoretical basis for the
construction and validation of tests in a number of contexts. At the same time,
we need our definition to be precise enough to distinguish it from other areas
of language ability.
Given the central role that construct definition
plays in test development and validation, my intention in this chapter has been
to discuss the ‘what’ of grammar assessment. I have examined how grammar has
been depicted in models of communicative language ability over the years, and
have argued that for assessment purposes grammar should be clearly differentiated
from pragmatics. Grammar should also be defined to include a form and meaning
component on both the sentence and discourse levels. I have also argued that meaning
can be characterized as literal and intended. Further, the pragmatic dimension of
language constitutes an extrapolation of both the literal meaning and the
speaker’s intended meaning, while using contextual information beyond what is expressed
in grammatical forms. I have argued that pragmatic meanings may be
simultaneously superimposed upon grammatical forms and their meanings (e.g., as
in a joke). In short, grammar should not be viewed solely in terms of
linguistic form, but should also include the role that literal and intended
meaning plays in providing resources for all types of communication. Although
forms and meanings are highly related, it is important for testers to make
distinctions among these components, when possible, so that assessments can be
used to provide more precise information to users of test results. In the next
chapter, I will use this model of grammar as a basis for defining second or
foreign language grammatical ability for assessment.
D. Towards a definition of grammatical ability
What is meant by grammatical ability?
Having described how grammar has
been conceptualized, we are now faced with the challenge of defining what it
means to ‘know’ the grammar of a language so that it can be used to achieve
some communicative goal. In other words, what does it mean to have ‘grammatical
ability’?
Defining
grammatical constructs
Although our basic underlying model of grammar will
remain the same in all testing situations (i.e., grammatical form and meaning),
what it means to ‘know’ grammar for different contexts will most likely change (see
Chapelle, 1998). In other words, the type, range and scope of grammatical features
required to communicate accurately and meaningfully will vary from one
situation to another. For example, the type of grammatical knowledge needed to
write a formal academic essay would be very different from that needed to make
a train reservation. Given the many possible ways of interpreting what it means
to ‘know’ grammar, it is important that we define what we mean by ‘grammatical
knowledge’ for any given testing situation. A clear definition of what we
believe it means to ‘know’ grammar for a particular testing context will then
allow us to construct tests that measure grammatical ability.
Grammatical
Knowledge
Knowledge refers to a set of informational
structures that are built up through experience and stored in long-term memory.
These structures include knowledge of facts that are stored in concepts,
images, networks, production-like structures, propositions, schemata and
representations (Pressley, 1995). Language knowledge is then a mental
representation of informational structures related to language. The exact
components of language knowledge, like any other construct, need to be defined.
In this book, grammar refers to a system of language whereas grammatical knowledge
is defined as a set of internalized informational structures related to the theoretical
model of grammar proposed in Figure 3.2 (p.62). In this model, grammar is
defined in terms of grammatical form and meaning, which are available to be
accessed in language use. To illustrate, suppose a student learning French
knows that the passive voice is constructed with a form of the verb ĂȘtre (to
be) plus a past participle, and is able to produce this form accurately and
with ease.
Grammatical
Ability
Grammatical ability is, then, the combination of
grammatical knowledge and strategic competence; it is specifically defined as
the capacity to realize grammatical knowledge accurately and meaningfully in
testing or other language-use situations. Hymes (1972) distinguished between competence
and performance, stating that communicative competence includes the underlying
potential of realizing language ability in instances of language use, whereas language
performance refers to the use of language in actual language events. Carroll
(1968) refers to language performance as ‘the actual manifestation of linguistic
competence . . . in behavior’ (p. 50).
Metalinguistic
Knowledge
Finally, metalanguage is the language used to
describe a language. It generally consists of technical linguistic or
grammatical terms (e.g., noun, verb). Metalinguistic knowledge, therefore, refers
to informational structures related to linguistic terminology. We must be clear
that metalinguistic knowledge is not a component of grammatical ability;
rather, the knowledge of linguistic terms would more aptly be classified as a
kind of specific topical knowledge that might be useful for language teachers
to possess. Some teachers almost never present metalinguistic terminology to their
students, while others find it useful as a means of discussing the language and
learning the grammar. It is important to remember that knowing the grammatical
terms of a language does not necessarily mean knowing how to communicate in the
language.
What
Is ‘Grammatical Ability’ For Assessment Purposes?
The approach to the assessment of grammatical
ability in this book is based on several specific definitions. First, grammar
encompasses grammatical form and meaning, whereas pragmatics is a separate, but
related, component of language. A second is that grammatical knowledge, along with
strategic competence, constitutes grammatical ability. A third is that grammatical
ability involves the capacity to realize grammatical knowledge accurately and
meaningfully in test-taking or other language-use contexts. The capacity to
access grammatical knowledge to understand and convey meaning is related to a
person’s strategic competence. It is this interaction that enables examinees to
implement their grammatical ability in language use. Next, in tests and other
language-use contexts, grammatical ability may interact with pragmatic ability
(i.e., pragmatic knowledge and strategic competence) on the one hand, and with
a host of non-linguistic factors such as the test-taker’s topical knowledge,
personal attributes, affective schemata and the characteristics of the task on the
other. Finally, in cases where grammatical ability is assessed by means of an
interactive test task involving two or more interlocutors, the way grammatical
ability is realized will be significantly impacted by both the contextual and
the interpretative demands of the interaction.
Knowledge
Of Phonological Or Graphological Form And Meaning
Knowledge of phonological/graphological form enables
us to understand and produce features of the sound or writing system, with the exception
of meaning-based orthographies such as Chinese characters, as they are used to
convey meaning in testing or language-use situations. Phonological form
includes the segmentals (i.e., vowels and consonants) and prosody (i.e.,
stress, rhythm, intonation contours, volume, tempo). These forms can be used
alone or in conjunction with other grammatical forms to encode phonological
meaning. For example, the ability to hear or pronounce meaning-distinguishing
sounds such as the /b/ vs. /v/ could be used to differentiate the meaning
between different nouns (boat/vote), and the ability to hear or pronounce the
prosodic features of the language (e.g., intonation) could allow students to
understand or convey the notion that a sentence is an interrogative.
Knowledge
Of Lexical Form And Meaning
Knowledge of lexical form enables us to understand
and produce those features of words that encode grammar rather than those that
reveal meaning. This includes words that mark gender (e.g., waitress),
countability (e.g., people) or part of speech (e.g., relate, relation). For
example, when the word think in English is followed by the preposition about before
a noun, this is considered the grammatical dimension of lexis, representing a
co-occurrence restriction with prepositions. One area of lexical form that
poses a challenge to learners of some languages is word formation.
Knowledge
Of Morphosyntactic Form And Meaning
Knowledge of morphosyntactic form permits us to
understand and produce both the morphological and syntactic forms of the
language. This includes the articles, prepositions, pronouns, affixes (e.g.,
-est), syntactic structures, word order, simple, compound and complex
sentences, mood, voice and modality. A learner who knows the morphosyntactic form
of the English conditionals would know that: (1) an if-clause sets up a
condition and a result clause expresses the outcome; (2) both clauses can be in
the sentence-initial position in English; (3) if can be deleted under certain
conditions as long as the subject and operator are inverted; and (4) certain
tense restrictions are imposed on if and result clauses.
Knowledge
Of Cohesive Form And Meaning
Knowledge
of cohesive form enables us to use the phonological, lexical and
morphosyntactic features of the language in order to interpret and express
cohesion on both the sentence and the discourse levels. Cohesive form is
directly related to cohesive meaning through cohesive devices (e.g., she, this,
here) which create links between cohesive forms and their referential meanings
within the linguistic environment or the surrounding co-text. Halliday and
Hasan (1976, 1989) list a number of grammatical forms for displaying cohesive
meaning. This can be achieved through the use of personal referents to convey
possession or reciprocity; demonstrative referents to display spatial, temporal
or psychological links; comparative referents
to encode similarity, difference and equality; and logical connectors to signal
a wide range of meanings such as addition, logical conclusion and contrast.
Knowledge
Of Information Management Form And Meaning
Knowledge
of information management form allows us to use linguistic forms as a resource
for interpreting and expressing the information structure of discourse. Some
resources that help manage the presentation of information include, for
example, prosody, word order, tense-aspect and parallel structures. These forms
are used to create information management meaning. In other words, information
can be structured to allow us to organize old and new information (i.e.,
topic/comment), topicalize, emphasize information and provide information
symmetry through parallelism and tense concordance.
Knowledge Of Interactional Form And Meaning
Knowledge of interactional form enables us to
understand and use linguistic forms as a resource for understanding and
managing talk-in-interaction. These forms include discourse markers and
communication management strategies. Discourse markers consist of a set of
adverbs, conjunctions and lexicalized expressions used to signal certain language
functions. For example, well . . . can signal disagreement, ya know or uh-huh can
signal shared knowledge, and by the way can signal topic diversion. Conversation-management
strategies include a wide range of linguistic forms that serve to facilitate smooth
interaction or to repair interaction when communication breaks down. For
example, when interaction stops because a learner does not understand
something, one person might try to repair the breakdown by asking, *What means
that? Here the learner knows the interactional meaning, but not the form.
Given the central role that construct definition
plays in test development and validation, my intention in this chapter has been
to discuss the ‘what’ of grammatical knowledge invoked by grammar assessment.
After describing grammatical constructs and defining key terms in this book, I have
proposed a theoretical model of grammatical ability that relates grammatical
knowledge to pragmatic knowledge and that specifies grammatical form and
meaning on the sentence and discourse levels. I have provided operational
descriptions of each part of the model along with examples that differentiate
knowledge of grammatical form and meaning from knowledge of pragmatic meaning.
This model aims to provide a broad theoretical basis for the definition of
grammatical knowledge in creating and interpreting tests of grammatical ability
in a variety of language use settings. In the next chapter, I will discuss how
this model can be used to design tasks that measure one or more components of
grammatical ability.
E.
Designing test
tasks to measure L2 grammatical ability
In
fact, test scores can vary as a result of the personal attributes of test-takers
such as their age (Farhady, 1983; Zeidner, 1987), gender (Kunnan, 1990;
Sunderland, 1995) and language background (Zeidner, 1986, 1987). They can also
fluctuate due to their strategy use (Cohen, 1994; Purpura, 1999), motivation
(Gardner, 1985) and level of anxiety (Gardner, Lalonde, Moorcroft and Evans,
1987). However, some of the most important factors that affect grammar test
scores, aside from grammatical ability, are the characteristics of the test
itself. In fact, anyone who has ever taken a grammar test, or any test for that
matter, knows that the types of questions on the test can severely impact
performance. For example, some test-takers perform better on multiple-choice
tasks than on oral interview tasks; others do better on essays than on cloze
tasks; and still others score better if asked to write a letter than if asked
to interpret a graph. Each of these tasks has a set of unique characteristics,
called test-task characteristics. These characteristics can potentially
interact with the characteristics of the examinee (e.g., his or her grammatical
knowledge, personal attributes, topical knowledge, affective schemata) to
influence test performance.
How
Does Test Development Begin?
Every
grammar-test development project begins with a desire to obtain (and often
provide) information about how well a student knows grammar in order to convey
meaning in some situation where the target language is used. The information
obtained from this assessment then forms the basis for decision-making. Those situations
in which we use the target language to communicate in real life or in which we
use it for instruction or testing are referred to as the target language use
(TLU) situations (Bachman and Palmer, 1996). Within these situations, the tasks
or activities requiring language to achieve a communicative goal are called the
target language use tasks. A TLU task is one of many language-use tasks that
test-takers might encounter in the target language use domain. It is to this
domain that language testers would like to make inferences about language
ability, or more specifically, about grammatical ability.
What
Are The Characteristics Of Grammatical Test Tasks?
As the goal of grammar assessment is to provide as
useful a measurement as possible of our students’ grammatical ability, we need
to design test tasks in which the variability of our students’ scores is
attributed to the differences in their grammatical ability, and not to
uncontrolled or irrelevant variability resulting from the types of tasks or the
quality of the tasks that we have put on our tests. As all language teachers
know, the kinds of tasks we use in tests and their quality can greatly
influence how students will perform. Therefore, given the role that the effects
of task characteristics play on performance, we need to strive to manage (or at
least understand) the effects of task characteristics so that they will
function the way we designed them to – as measures of the constructs we want to
measure (Douglas, 2000). In other words, specifically designed tasks will work
to produce the types of variability in test scores that can be attributed to
the underlying constructs given the contexts in which they were measured
(Tarone, 1998). To understand the characteristics of test tasks better, we turn
to Bachman and Palmer’s (1996) framework for analyzing target language use
tasks and test tasks.
The
Bachman And Palmer Framework
Bachman and Palmer’s (1996) framework of task
characteristics represents the most recent thinking in language assessment of
the potential relationships between task characteristics and test performance.
In this framework, they outline five general aspects of tasks, each of which is
characterized by a set of distinctive features. These five aspects describe characteristics
of (1) the setting, (2) the test rubrics, (3) the input, (4) the expected
response and (5) the relationship between the input and response.
Characteristics
Of The Setting
The characteristics of the setting include the
physical characteristics, the participants, and the time of the task. Obviously
these characteristics can have a serious, unexpected effect on performance. For
example, I once gave a speaking test to a group of ESL students in my
discussion skills class at UCLA. I randomly placed students into groups of
four, and each was given a problem to solve. Individuals then had to
participate in a discussion in which they had to try to persuade their partners
of their opinion. Each group’s 20-minute discussion was videotaped. After the exam,
I learned that a few students were so nervous being videotaped that they
seriously questioned the quality of their performance. I also learned that, in
one group, a participant became angry when the others did not agree with her
and openly told them their ideas were ‘stupid’. She also berated them for being
quiet. The other students were so embarrassed they hardly said a word. In such
a case, one participant had an undue effect on the others’ ability to perform
their best.
Characteristics
Of The Test Rubrics
The test rubrics include the instructions, the
overall structure of the test, the time allotment and the method used to score
the response. These characteristics can obviously influence test scores in
unexpected ways (Madden, 1982; Cohen, 1984, 1993). The overall test
instructions (when included) introduce test-takers to the entire test. They
make explicit the purpose of the overall test and the area(s) of language
ability being measured. They also introduce examinees to the different parts of
the test and their relative importance. The instructions make explicit the
procedures for taking the entire test. Overall test instructions are common in
all high-stakes tests.
Characteristics
Of The Input
According to Bachman and Palmer (1996), the
characteristics of the input (sometimes called the stimulus) are critical
features of performance in all test and TLU tasks. The input is the part of the
task that test-takers must process in order to answer the question. It is
characterized in terms of the format and language.
Characteristics
Of The Expected Response
When we design a test task, we specify the rubric
and input so that test takers will respond in a way that will enable us to make
inferences about the aspect of grammar ability we want to measure. The
‘expected response’ thus refers to the type of grammatical performance we want
to elicit. The characteristics of the expected response are also considered in terms
of the format and language. Similar to the input, the expected response of
grammar tasks can vary according to channel (aural or visual), form (verbal,
non-verbal), language (native or target) and vehicle (live or reproduced).
Relationship
Between The Input And Response
A final category of task characteristics to consider
in examining how test tasks impact performance is seen in how characteristics
of the input can interact with characteristics of the response. One
characteristic of this relationship involves ‘the extent to which the input or
the response affects subsequent input and responses’ (Bachman and Palmer, 1996,
p. 55). This is known as reactivity. Reciprocal tasks, which involve both interaction
and feedback between two or more examinees, are examples of tasks that have a
high degree of reactivity. However, non-reciprocal tasks, such as writing in a
journal, have no reactivity since no interaction or feedback is required to
complete the task. Finally, in adaptive test tasks there is no feedback, but
there is interaction in the sense that the responses influence subsequent
language use. For example, in computer adaptive tests such as the BEST Plus
(Center for Applied Linguistics, 2002), students are presented with test questions
tailored to their ability level. In other words, as the student responds to
input, subsequent input is tailored to their proficiency level.
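The adaptive logic just described can be sketched as a simple loop in which each response moves the difficulty of the next item up or down. This is a deliberately simplified illustration, not the BEST Plus algorithm (operational adaptive tests estimate ability statistically, e.g. with item response theory); the item pool and step rule below are invented.

```python
# Illustrative sketch of adaptive item selection (not the BEST Plus
# algorithm): a correct answer raises the difficulty of the next item,
# an incorrect answer lowers it, within the bounds of the item pool.

def run_adaptive_test(items_by_difficulty, answer_fn, start, n_items):
    """items_by_difficulty maps a difficulty level to a list of items;
    answer_fn(item) returns True if the test-taker answers correctly."""
    level = start
    administered = []
    for _ in range(n_items):
        item = items_by_difficulty[level].pop(0)
        correct = answer_fn(item)
        administered.append((item, level, correct))
        # Subsequent input is tailored to the response.
        if correct:
            level = min(level + 1, max(items_by_difficulty))
        else:
            level = max(level - 1, min(items_by_difficulty))
    return administered
```

A test-taker who keeps answering correctly is stepped up toward the hardest items in the pool; one who keeps missing is stepped down, so each examinee sees questions near his or her level.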
Describing
Grammar Test Tasks
When language teachers consider tasks for grammar
tests, they call to mind a large repertoire of task types that have been
commonly used in teaching and testing contexts. We now know that these holistic
task types constitute collections of task characteristics for eliciting
performance and that these holistic task types can vary on a number of
dimensions. We also need to remember that the tasks we include on tests should
strive to match the types of language-use tasks found in real-life or language
instructional domains. Traditionally, there have been many attempts at
categorizing the types of tasks found on tests. Some have classified tasks
according to scoring procedure. For example, objective test tasks (e.g.,
true–false tasks) are those in which no expert judgment is required to evaluate
performance with regard to the criteria for correctness. Subjective test tasks
(e.g., essays) are those that require expert judgment to interpret and evaluate performance
with regard to the criteria for correctness.
Selected-Response
Task Types
Selected-response tasks present input in the form of
an item, and test takers are expected to select the response. Other than that,
all other task characteristics can vary. For example, the form of the input can
be language, non-language or both, and the length of the input can vary from a word
to larger pieces of discourse. In terms of the response, selected response tasks
are intended to measure recognition or recall of grammatical form and/or
meaning. They are usually scored right/wrong, based on one criterion for
correctness; however, in some instances, partial-credit scoring may be useful,
depending on how the construct is defined. Finally, selected-response tasks can
vary in terms of reactivity, scope and directness.
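The scoring options mentioned above, dichotomous right/wrong scoring against one criterion for correctness versus partial-credit scoring tied to the construct definition, can be contrasted in a brief sketch. The conditional item and the credit weights below are invented for illustration.

```python
# Dichotomous vs. partial-credit scoring of a selected-response item.
# The item and the credit weights are invented for illustration.

def score_dichotomous(response, key):
    """One criterion for correctness: full credit or none."""
    return 1 if response == key else 0

def score_partial(response, credit_map):
    """Partial credit: options reflecting partial mastery of the
    construct earn a fraction of the full score."""
    return credit_map.get(response, 0)

# 'If she ___ earlier, she would have caught the train.'
credit_map = {
    'had left': 1.0,    # target form
    'left': 0.5,        # right clause type, wrong tense restriction
    'leaves': 0.0,
    'will leave': 0.0,
}
```

Under dichotomous scoring, choosing *left* earns nothing; under partial-credit scoring it earns half credit, on the reasoning that the response shows some knowledge of the conditional's morphosyntactic form.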
Given the central role of task in the development of
grammar tests, this chapter has addressed the notion of task and task
specification in the test development process. I discussed how task was
originally conceptualized as a holistic method of eliciting performance and
argued that the notion of task as a monolithic entity falls short of providing
an adequate framework from which to specify tasks for the measurement of
grammatical ability. I also argued that given the diversity of tasks that could
emerge from real-life and instructional domains, a broad conceptualization of task
is needed in grammatical assessment – one that could accommodate selected-response,
limited-production and extended-production tasks. For assessment, the process
of operationalizing test constructs and the specification of test tasks are
extremely important. They provide a means of controlling what is being measured,
what evidence needs to be observed to support the measurement claims, what
specific features can be manipulated to elicit the evidence of performance, and
finally how the performance should be scored. This process is equally important
for language teachers, materials writers and SLA researchers since any
variation in the individual task characteristics can potentially influence what
is practiced in classrooms or elicited on language tests. In this chapter, I argued
that in developing grammar tasks, we needed to strive to control, or at least
understand, the effects of these tasks in light of the inferences we make about
examinees’ grammatical ability. Finally, I described Bachman and Palmer’s
(1996) framework for characterizing test tasks and showed how it could be used
to characterize L2 grammar tasks. This framework allows us to examine tasks
that are currently in use, and more interestingly, it allows us to show how
variations in task characteristics can be used to create new task types that
might better serve our educational needs and goals. In the next chapter, I will
discuss the process of constructing a grammar test consisting of several tasks.
Assessing Vocabulary
Chapter One
The Place Of Vocabulary
In Language Assessment
At
first glance, it may seem that assessing the vocabulary knowledge of second
language learners is both necessary and reasonably straightforward. It is
necessary in the sense that words are the basic building blocks of language,
the units of meaning from which larger structures such as sentences, paragraphs
and whole texts are formed. For native speakers, although the most rapid growth
occurs in childhood, vocabulary knowledge continues to develop naturally in
adult life in response to new experiences, inventions, concepts, social trends
and opportunities for learning. For learners, on the other hand, acquisition of
vocabulary is typically a more conscious and demanding process. Even at an
advanced level, learners are aware of limitations in their knowledge of second
language (or L2) words.
Recent Trends In
Language Testing
However,
scholars in the field of language testing have a rather different perspective
on vocabulary-test items of the conventional kind. Such items fit neatly into
what language testers call the discrete point approach to testing. This
involves designing tests to assess whether learners have knowledge of
particular structural elements of the language: word meanings, word forms,
sentence patterns, sound contrasts and so on. In the last thirty years of the
twentieth century, language testers progressively moved away from this
approach, to the extent that such tests are now quite out of step with current thinking
about how to design language tests, especially for proficiency assessment.
Three Dimensions Of
Vocabulary Assessment
Up
to this point, I have outlined two contrasting perspectives on the role of
vocabulary in language assessment. One point of view is that it is perfectly
sensible to write tests that measure whether learners know the meaning and
usage of a set of words, taken as independent semantic units. The other view is
that vocabulary must always be assessed in the context of a language-use task,
where it interacts in a natural way with other components of language
knowledge. To some extent, the two views are complementary in that they relate
to different purposes of assessment. Conventional vocabulary tests are most likely
to be used by classroom teachers for assessing progress in vocabulary learning and diagnosing areas of
weakness. Other users of these tests are researchers in second language
acquisition with a special interest in how learners develop their knowledge of,
and ability to use, target-language words. On the other hand, researchers in
language testing and those who undertake large testing projects tend to be more
concerned with the design of tests that assess learners' achievement or
proficiency on a broader scale. For such purposes, vocabulary knowledge has a
lower profile, except to the extent that it contributes to, or detracts from,
the performance of communicative tasks. As with most dichotomies, the
distinction I have made between the two perspectives on vocabulary assessment
oversimplifies the matter. There is a whole range of reasons for assessing
vocabulary knowledge and use, with a corresponding variety of testing
procedures. In order to map out the scope of the subject, I propose three
dimensions. The dimensions represent ways in which we can expand our conventional
ideas about what a vocabulary test is in order to include a wider range of
lexical assessment procedures. I introduce the dimensions here, then
illustrate and discuss them at various points in the following chapters. Let us
look at each one in turn.
Discrete - Embedded
The
first dimension focuses on the construct which underlies the assessment
instrument. In language testing, the term construct refers to the mental
attribute or ability that a test is designed to measure. In the case of a
traditional vocabulary test, the construct can usually be labelled as
'vocabulary knowledge' of some kind. The practical significance of defining
the construct is that it allows us to clarify the meaning of the test results.
Normally we want to interpret the scores on a vocabulary test as a measure of
some aspect of the learners' vocabulary knowledge, such as their progress in
learning words from the last several units in the course book, their ability to
supply derived forms of base words (like scientist and scientific, from
science), or their skill at inferring the meaning of unknown words in a reading
passage.
Selective -
Comprehensive
The
second dimension concerns the range of vocabulary to be included in the
assessment. A conventional vocabulary test is based on a set of target words
selected by the test-writer, and the test-takers are assessed according to how
well they demonstrate their knowledge of the meaning or use of those words.
This is what I call a selective vocabulary measure. The target words may either
be selected as individual words and then incorporated into separate test items,
or alternatively the test-writer first chooses a suitable text and then uses certain
words from it as the basis for the vocabulary assessment.
Context-Independent -
Context-Dependent
The
role of context, which is an old issue in vocabulary testing, is the basis for
the third dimension. Traditionally contextualisation has meant that a word is
presented to test-takers in a sentence rather than as an isolated element. From
a contemporary perspective, it is necessary to broaden the notion of context to
include whole texts and, more generally, discourse. In addition, we need to
recognise that contextualisation is more than just a matter of the way in which
vocabulary is presented. The key question is to what extent the test-takers
are being assessed on the basis of their ability to engage with the context
provided in the test. In other words, do they have to make use of contextual
information in order to give the appropriate response? Judgements about appropriateness take
us beyond the text to consider the wider social context. For instance, take a
proficiency test in which the test-takers are doctors and the test task is a
role play simulating a consultation with a patient. If vocabulary use is one of
the criteria used in rating the doctors' performance, they need to demonstrate
an ability to meet the lexical requirements of the situation; for example:
understanding the colloquial expressions that patients use for common symptoms
and ailments, explaining medical concepts in lay terms, avoiding medical
jargon, offering reassurance to someone who is upset or anxious, giving advice
in a suitable tone and so on. Vocabulary use in the task is thus influenced by
the doctor's status as a highly educated professional, the expected role relationship
in a consultation and the affective dimension of the situation. This is a much
broader view of context than we are used to thinking of in relation to vocabulary
testing, but a necessary one nonetheless if we are to assess vocabulary in
contemporary performance tests.
An Overview Of The Book
The
three dimensions are not intended to form a comprehensive model of vocabulary
assessment. Rather, they provide a basis for locating the variety of assessment
procedures currently in use within a common framework and, in particular, they
offer points of contact between tests which treat words as discrete units and
ones that assess vocabulary more integratively in a task-based testing context.
At various points through the book I refer to the dimensions and exemplify them.
Since a large proportion of work on vocabulary assessment to date has involved
instruments which are relatively discrete, selective and context independent in
nature, this approach may seem to be predominant in several of the following
chapters. However, my aim is to present a balanced view of the subject, and I
discuss measures that are more embedded, comprehensive and context-dependent
wherever the opportunity arises, and especially in the last two chapters of the
book.
Chapter Two
The Nature Of
Vocabulary
Before
we start to consider how to test vocabulary, it is necessary first to explore
the nature of what we want to assess. Our everyday concept of vocabulary is
dominated by the dictionary. We tend to think of it as an inventory of
individual words, with their associated meanings. This view is shared by many
second language learners, who see the task of vocabulary learning as a matter
of memorising long lists of L2 words, and their immediate reaction when they encounter
an unknown word is to reach for a bilingual dictionary. From this perspective,
vocabulary knowledge involves knowing the meanings of words and therefore the
purpose of a vocabulary test is to find out whether the learners can match each
word with a synonym, a dictionary-type definition or an equivalent word in
their own language. However, when we look more closely at vocabulary in the
light of current developments in language teaching and applied linguistics, we
find that we have to address a number of questions that have the effect of
progressively broadening the scope of what we need to assess. The first question is: What is a word?
This is an issue that is of considerable interest to linguists on a theoretical
level, but for testing purposes we have more practical reasons for asking it.
For example, it becomes relevant if we want to make an estimate of the size of
a learner's vocabulary. Researchers who have attempted to measure how many
words native speakers of English know have produced wildly varying figures, at
least partly because of their different ways of defining what a word is.
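The estimation logic behind such figures is simple proportional sampling, which makes clear why the definition of 'word' matters: the resulting figure scales directly with the size of the word list used. The sketch below is a generic illustration with invented numbers, not any particular researcher's procedure.

```python
# Proportional estimate of vocabulary size: test a sample drawn evenly
# from a word list and scale the proportion known up to the whole list.
# The estimate scales with list_size, so the choice of counting unit
# (word forms vs. word families) directly changes the resulting figure.

def estimate_vocab_size(known_in_sample, sample_size, list_size):
    proportion_known = known_in_sample / sample_size
    return proportion_known * list_size
```

With the same test performance, say 50 of 100 sampled items known, a list of 3,000 word families and a list of 10,000 word forms yield estimates of 1,500 and 5,000 respectively, which is one reason published figures vary so widely.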
What Is A Word?
A
basic assumption in vocabulary testing is that we are assessing knowledge of
words. But the word is not an easy concept to define, either in theoretical
terms or for various applied purposes. There are some basic points that we need
to spell out from the start. One is the distinction between tokens and types,
which applies when we count the running words in a text. A related question is whether
it is the individual word form that is being assessed or the whole
word family to which that word form belongs. One further complication in defining
what words are is the existence of homographs. These are single word forms that
have at least two meanings that are so different that they obviously belong to different
word families. One commonly cited example is the noun bank, which has two major
meanings: an institution which provides financial services, and the sloping
ground beside a river. It also refers to a row of dials or switches, and to the
tilting of an aircraft's wings as it turns. There is no underlying meaning that
can usefully link all four of these definitions, so in a real sense we have
several distinct word families here. In dictionaries, they are generally
recognised as such by being given separate entries (rather than separate senses
under a single entry). In the testing context, we cannot assume, just because learners
demonstrate knowledge of one meaning, that they have acquired any of the
others.
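The token/type distinction mentioned above is easy to make concrete: tokens count every running word, while types count distinct word forms. A minimal sketch follows; note that it treats a homograph such as *bank* as a single type even though it spans several word families, which is precisely the limitation just noted, and it makes no attempt to group inflected forms into families.

```python
# Tokens vs. types: tokens count every running word; types count
# distinct word forms. A homograph such as 'bank' is counted as one
# type even though it belongs to several word families, and no
# grouping of inflected forms into families is attempted here.

def count_tokens_and_types(text):
    tokens = text.lower().split()
    return len(tokens), len(set(tokens))

counts = count_tokens_and_types("the bank by the river and the bank in town")
# ten tokens, but only seven distinct types
```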
What About Larger
Lexical Items?
The
second major point about vocabulary is that it consists of more than just
single words. For a start, there are the phrasal verbs (get across, move out,
put up with) and compound nouns (fire fighter, love letters, practical joke,
personal computer, applied social science, milk of magnesia), which are
generally recognised as lexical units consisting of more than one word form.
Then there are idioms like a piece of cake, the Good Book, to go the whole hog,
let the cat out of the bag. These are phrases and sentences that cause great
difficulty for second language learners because the whole unit has a meaning
that cannot be worked out just from knowing what the individual words mean. Working
from a similar point of view, Nattinger and DeCarrico (1992) have developed the
concept of a lexical phrase, which is a group of words that looks like a grammatical
structure but operates as a unit, with a particular function in spoken or
written discourse. They identify four categories of lexical phrases:
1.
Polywords: short fixed phrases that perform a
variety of functions, such as for the most part (which they call a qualifier),
at any rate and so to speak (fluency devices), and hold your horses (disagreement
marker).
2.
Institutionalised expressions: longer
utterances that are fixed in form and include proverbs, aphorisms and formulas
for social interaction. Examples are: a watched pot never boils, how do you
do?, long time no see, and once upon a time ... and they lived happily ever
after.
3.
Phrasal constraints: short- to medium-length
phrases consisting of a basic frame with one or two slots that can be filled
with various words or phrases. These include a (day / year / long time) ago,
yours (sincerely / truly), as far as I (know / can tell / am aware), and the
(sooner) the (better).
4.
Sentence builders: phrases that provide the
framework for a complete sentence, with one or more slots in which a whole idea
can be expressed. Examples are: I think that X; not only X, but also Y; and that
reminds me of X.
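Phrasal constraints in particular, a fixed frame with one or two constrained slots, can be modelled quite directly as a pattern. The sketch below uses a regular expression for one of the frames cited above; this is merely one way of representing 'frame plus licensed fillers', not a claim about Nattinger and DeCarrico's own formalism.

```python
import re

# One of the 'phrasal constraints' cited above, modelled as a fixed
# frame with a constrained slot: the frame is literal text, the slot
# admits only the licensed fillers.
FRAME = re.compile(r"as far as I (know|can tell|am aware)")

def is_licensed(utterance):
    """True if the utterance instantiates the frame with one of the
    licensed slot fillers."""
    return FRAME.fullmatch(utterance) is not None
```

Here is_licensed('as far as I can tell') holds, while 'as far as I believe', though grammatical, does not instantiate the fixed phrase.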
Pawley
and Syder (1983: 206-208) offer a lengthy list of longer utterances of a
similar kind. Here are some of their items:
It's
on the tip of my tongue.
I'll
be home all weekend.
Have
you heard the news?
What
does it mean to know a lexical item?
Let
us now leave aside the question of what units vocabulary is composed of and
take up the issue of what it means to know lexical items of various kinds. To
put it another way, how do we go about describing the nature of vocabulary
knowledge? One approach is to try to spell out all that the learners should
know about a word if they are to fully acquire it. An influential statement along
these lines was produced by Richards (1976). In his article he outlined a
series of assumptions about lexical competence, growing out of developments in
linguistic theory in the 1960s and 1970s.
1.
The first assumption is that the
vocabulary knowledge of native speakers continues to expand in adult life, in contrast to the
relative stability of their grammatical competence. The other seven assumptions
cover various aspects of what is meant by knowing a word:
2.
Knowing a word means knowing the degree
of probability of encountering that word in speech or print. For many words we
also know the sort of words most likely to be found associated with the word.
3.
Knowing a word implies knowing the
limitations on the use of the word according to variations of function and
situation.
4.
Knowing a word means knowing the
syntactic behaviour associated with the word.
5.
Knowing a word entails knowledge of the
underlying form of a word and the derivations that can be made from it.
6.
Knowing a word entails knowledge of the
network of associations between that word and other words in the language.
7.
Knowing a word means knowing the
semantic value of a word.
8.
Knowing a word means knowing many of the
different meanings associated with a word. (Richards, 1976: 83)
What Is Vocabulary
Ability?
Three dimensions of
vocabulary assessment represent one attempt to incorporate the two perspectives
within a single framework. However, a more ambitious effort has been undertaken
by Chapelle (1994), who proposed a definition of vocabulary ability based on
Bachman's (1990; see also Bachman and Palmer, 1996) general construct of
language ability.
The Context Of
Vocabulary Use
Traditionally
in vocabulary testing, the term context has referred to the sentence or
utterance in which the target word occurs. For instance, in a multiple-choice
vocabulary item, it is normally recommended that the stem should consist of a
sentence containing the word to be tested, as in the following example:
The committee endorsed
the proposal.
a. discussed
b. supported
c. knew about
d. prepared
Under
the influence of integrative test formats, such as the cloze procedure, our
notion of context has expanded somewhat beyond the sentence level. Advocates of
the cloze test, especially Oller (1979), pointed out that many of the blanks
could be filled successfully only by picking up on contextual clues in other
sentences or paragraphs of the text. Thus, in this sense the whole text forms
the context that we draw on to interpret the individual lexical items within
it. However, from a communicative point of view, context is more than just a
linguistic phenomenon.
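The fixed-ratio cloze procedure mentioned above can be sketched in a few lines: every nth word is deleted, and filling the gaps often requires clues from elsewhere in the text. The deletion ratio and the sample passage here are illustrative choices, not prescriptions from the testing literature.

```python
def make_cloze(text: str, n: int = 7, blank: str = "______"):
    """Delete every nth word; return the gapped text and the answer key."""
    words = text.split()
    gapped, answers = [], []
    for i, word in enumerate(words, start=1):
        if i % n == 0:
            # This word becomes a blank; record it in the answer key.
            gapped.append(blank)
            answers.append(word)
        else:
            gapped.append(word)
    return " ".join(gapped), answers

passage = ("The whole text forms the context that readers draw on "
           "to interpret the individual lexical items within it.")
gapped, key = make_cloze(passage, n=5)
```

A deletion ratio of one word in five to one in seven is typical of classroom cloze tests; the smaller the ratio, the more heavily the test-taker must rely on the surrounding discourse.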
Vocabulary Knowledge
And Fundamental Processes
The
second component in Chapelle's (1994) framework of vocabulary ability is the
one that has received the most attention from applied linguists and second
language teachers. Chapelle outlines four dimensions of this component:
1.
Vocabulary size: This refers to the
number of words that a person knows. In work with native speakers scholars have
attempted to measure the total size of their vocabulary by taking a sample
of words from a large unabridged
dictionary. In the case of second language learners the goal is normally more
modest: it is to estimate how many of the more common words they know based on a
test of their knowledge of a sample of items from a word-frequency list.
I discuss this further in Chapter 4, and in Chapter 5 we look at two
vocabulary-size tests. As Chapelle (1994: 165) points out, though, if we follow
the logic of a communicative approach to vocabulary ability, we should not just
seek to measure vocabulary size in an absolute sense, but rather in relation to
particular contexts of use.
2.
Knowledge of word characteristics: I
discussed the frameworks developed by Richards (1976) and Nation (1990) earlier
in the chapter, and this is where they fit into Chapelle's definition. Just as
native speakers do, second language learners know more about some words than
others. Their understanding of particular lexical items may range from vague to
more precise (Cronbach, 1942). As Laufer (1990) points out, learners are likely
to be confused about some of the words that they have learned, because the
words share certain common features, e.g. affect, effect; quite, quiet;
simulate, stimulate; embrace, embarrass. And again, as with vocabulary size,
the extent to which a learner knows a word varies according to the context in which
it is used.
3.
Lexicon organization: This concerns the
way in which words and other lexical items are stored in the brain. Aitchison's
book Words in the Mind (1994) provides a comprehensive and very readable account
of psycholinguistic research on the mental lexicon of proficient language
users. There is a research role here for vocabulary tests to explore the
developing lexicon of second language learners and the ways in which their
lexical storage differs from that of native speakers. Meara (1984; 1992b) has
worked in this area using word-association and lexical-network tasks.
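The sampling logic behind the vocabulary-size measures described under the first dimension can be sketched as follows: a learner is tested on a sample of items from each frequency band of a word list, and the proportion known is extrapolated to the whole band. The band sizes, sample sizes and scores below are invented for illustration.

```python
def estimate_vocab_size(band_results):
    """Extrapolate vocabulary size from samples of frequency bands.

    band_results: list of (band_size, items_tested, items_correct) tuples,
    one per frequency band of the word list.
    """
    total = 0.0
    for band_size, tested, correct in band_results:
        # Proportion known in the sample, scaled up to the whole band.
        total += band_size * (correct / tested)
    return round(total)

# e.g. three 1,000-word frequency bands, 30 items sampled from each
results = [(1000, 30, 27), (1000, 30, 18), (1000, 30, 9)]
estimate = estimate_vocab_size(results)  # 900 + 600 + 300 = 1800
```

The declining scores across bands mirror the usual finding that knowledge thins out as words become less frequent, which is why tests of this kind sample each band separately rather than the list as a whole.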
Metacognitive Strategies For Vocabulary Use
This is the third component of Chapelle's definition of vocabulary ability, and is what Bachman (1990) refers to as 'strategic competence'. The strategies are employed by all language users to manage the ways that they use their vocabulary knowledge in communication. Most of the time, we operate these strategies without being aware of it. It is only when we have to undertake unfamiliar or cognitively demanding communication tasks that the strategies become more conscious. For example, I am carefully choosing my words as I write this chapter, trying (or should that be attempting or striving, perhaps?) both to express the ideas clearly and to achieve the level of (in)formality that the editors seem to be looking for. Here are some other situations in which native speakers may need to apply more conscious strategies: deciphering illegible handwriting in a personal letter, reading aloud a scripted speech, breaking the news of a relative's death to a young child, or conversing with a foreigner. As language teachers we become skilled at modifying the vocabulary that we use so that our learners can readily understand us. By contrast, you have probably observed inexperienced native speakers failing to communicate with foreigners because they use slang expressions, they do not articulate key words clearly, they are unable to rephrase an utterance that the other person has obviously not understood and so on. Of course, more is involved in all these cases than just vocabulary, but the point is that lexical strategies play a significant role.
Chapter Three
Research On Vocabulary
Acquisition And Use
The
focus of this chapter is on research in second language vocabulary acquisition
and use. There are three reasons for reviewing this research in a book on
vocabulary assessment. The first is that the researchers are significant users
of vocabulary tests as instruments in their studies. In other words, the purpose
of vocabulary assessment is not only to make decisions about what individual
learners have achieved in a teaching/learning context but also to advance our
understanding of the processes of vocabulary acquisition. Secondly, in the
absence of much recent interest in vocabulary among language testers,
acquisition researchers have often had to deal with assessment issues
themselves as they devised the instruments for their research. The third reason
is that the results of their research can contribute to a better understanding
of the nature of the construct of vocabulary ability, which - as I explained in
the previous chapter – is important for the validation of vocabulary tests.
Systematic Vocabulary
Learning
Given
the number of words that learners need to know if they are to achieve any kind
of functional proficiency in a second language, it is understandable that
researchers on language teaching have been interested in evaluating the
relative effectiveness of different ways of learning new words.
The
findings of studies that address these questions have been reviewed by a
number of authors (e.g. Higa, 1965; Nation, 1982; Cohen, 1987; Nation, 1990:
Chapter 3; Ellis and Beaton, 1993b; Laufer, 1997b). In brief, some of the
significant findings are as follows:
1.
Words belonging to different word
classes vary according to how difficult they are to learn. Rodgers (1969) found
that nouns are easiest to learn, followed by adjectives; on the other hand,
verbs and adverbs were the most difficult. Ellis and Beaton (1993b) confirmed
that nouns are easier than verbs, because learners can form mental images of
them more readily.
2.
Mnemonic techniques are very effective
methods for gaining an initial knowledge of word meanings in a second language
(Cohen, 1987; Hulstijn, 1997). One method in particular, the keyword technique,
has been extensively researched (see, for example, Paivio and Desrochers, 1981;
Pressley, Levin and McDaniel, 1987). It involves teaching learners to form
vivid mental images which link the meanings of an L2 word and an L1 word that
has a similar sound. This technique works best for the receptive learning of
concrete words.
3.
In order to be able to retrieve L2 words
from memory - rather than just recognising them when presented - learners need
to say the word to themselves as they learn it (Ellis and Beaton, 1993a).
4.
Words which are hard to pronounce are
learned more slowly than ones that do not have significant pronunciation
difficulty (Rodgers, 1969; Ellis and Beaton, 1993b).
5.
Learners at a low level of language
learning store vocabulary according to the sound of words, whereas at more
advanced levels words are stored according to meaning (Henning, 1973).
6.
Lists of words which are strongly associated with each other – such as near-synonyms, opposites or members of the same lexical set – are more difficult to learn than lists of unrelated words.
Incidental Vocabulary Learning
Since
the early 1980s a number of reading researchers have focused on vocabulary
acquisition by native speakers of English. While there is a great deal of
variation in the estimates of the number of words known by native speakers of
various ages and levels of education, there is general agreement that vocabulary
acquisition occurs at an impressively fast rate from childhood throughout the
years of formal education and at a slower pace on into adult life. On the face
of it, a large proportion of these words are not taught by parents or teachers,
or indeed learned in any formal way. The most plausible explanation for this is
that native speakers acquire words 'incidentally' as they encounter them in the
speech and writing of other people.
Research With Native
Speakers
The
first step in investigating this kind of vocabulary acquisition was to obtain
evidence that it actually occurs. Teams of reading researchers in the United
States (Jenkins, Stein and Wysocki, 1984; Nagy, Herman and Anderson, 1985;
Nagy, Anderson and Herman, 1987) undertook a series of studies with
native-English-speaking school children. The basic research design involved
asking the subjects to read texts appropriate to their age level that contained
unfamiliar words. The children were not told that the researchers were interested
in vocabulary. After they had completed the reading task, they were given,
without prior announcement, at least one test of their knowledge of the target words in the
text. Then the researchers obtained a measure of vocabulary learning by
comparing the test scores of the students who had read a particular text with those
of other students who had not. The results showed that, in these terms, a
small, statistically significant amount of learning had indeed occurred. In
their 1985 study, Nagy, Herman and Anderson estimated that the probability of learning
a word while reading was between 10 and 25 per cent (depending how strict the
criterion was for knowing a word), whereas in the 1987 study they calculated a
probability of just 5 per cent. One reason for the discrepancy was that in the
latter study the test was administered six days after the children did the
reading task rather than immediately afterwards.
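The way such probabilities are derived from the reader/non-reader comparison can be sketched as follows. The formula, a gain over the control-group baseline rescaled by the room left for learning, is one common way to express incidental gains; it is offered as an illustration rather than as the exact computation Nagy and his colleagues used, and the proportions are invented.

```python
def learning_probability(p_readers: float, p_controls: float) -> float:
    """Estimate the proportion of initially unknown target words
    that were picked up through reading.

    p_readers:  proportion of target words answered correctly by readers
    p_controls: proportion answered correctly by students who did not read
    """
    # Gain over baseline, rescaled by the words the controls did not know.
    return (p_readers - p_controls) / (1.0 - p_controls)

p = learning_probability(p_readers=0.40, p_controls=0.30)  # about 0.14
```

A figure of this order, somewhere between the 5 per cent and the 10 to 25 per cent reported in the two studies, shows why the gains look small per encounter yet substantial when accumulated over years of reading.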
Second Language
Research
Now,
how about incidental learning of second language vocabulary? In a study that
predates the L1 research in the US, Saragi, Nation and Meister (1978) gave a
group of native speakers of English the task of reading Anthony Burgess's novel
A Clockwork Orange, which contains a substantial number of Russian-derived words
functioning as an argot used by the young delinquents who are the main
characters in the book. When the subjects were subsequently tested, it was
found on average that they could recognise the meaning of 76 per cent of the 90
target words. Pitts, White and Krashen (1989) used just excerpts from the novel
with two groups of American university students and also found some evidence of
vocabulary learning; however, as you might expect, the reduced scope of the
study resulted in fewer target words being learned. In another study, a
researcher examined incidental learning from the video disc program Raiders of the Lost Ark. She was
particularly interested in factors that influenced the learning of unfamiliar
words. It appeared that, in terms of frequency, learning was associated with
the general frequency of words in the language rather than how often they
occurred in that particular text. In addition, she found that words were more
likely to be learned if they were salient, in the sense of being important for
understanding a specific part of the program.
Chapter Four
Research On Vocabulary
Assessment
In
the previous chapter, we saw how tests play a role in research on vocabulary
within the field of second language acquisition (SLA). Now we move on to
consider research in the field of language testing, where the focus is not so
much on understanding the processes of vocabulary learning as on measuring the
level of vocabulary knowledge and ability that learners have reached. Language
testing is concerned with the design of tests to assess learners for a variety
of practical purposes that can be summarised under labels such as placement,
diagnosis, achievement and proficiency. However, in practice this distinction
between second language acquisition research and assessment is difficult to maintain
consistently, because, on the one hand, language testing researchers have paid
relatively little attention to vocabulary tests and, on the other hand, second language
acquisition researchers working on vocabulary acquisition have often needed to
develop tests as an integral part of their research design. Thus, some of the
important work on how to measure vocabulary knowledge and ability has been produced
by vocabulary acquisition researchers rather than language testers; the latter
have tended either to take vocabulary tests for granted or, in the 1990s, to be
interested in more integrative and communicative measures of language
proficiency.
Multiple-Choice
Vocabulary Items
Although
the multiple-choice format is one of the most widely used methods of vocabulary
assessment, both for native speakers and for second language learners, its
limitations have also been recognised for a long time. Wesche and Paribakht
summarise the criticisms of these items as follows:
1.
They are difficult to construct, and
require laborious field-testing, analysis and refinement.
2.
The learner may know another meaning for
the word, but not the one sought.
3.
The learner may choose the right word by
a process of elimination, and has in any case a 25 per cent chance of guessing
the correct answer in a four-alternative format.
4.
Items may test students' knowledge of
distractors rather than their ability to identify an exact meaning of the target
word.
5.
The learner may miss an item either for
lack of knowledge of words or lack of understanding of syntax in the
distractors.
6.
This format permits only a very limited
sampling of the learner's total vocabulary (for example, a 25-item
multiple-choice test samples one word in 400 from a 10,000-word vocabulary).
(Wesche and Paribakht, 1996: 17)
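Two of the criticisms in the list above reduce to simple arithmetic, sketched here: the sampling rate of a short test against a large vocabulary (point 6), and the score a test-taker can expect from blind guessing on four-option items (point 3). The figures are those quoted in the list.

```python
def sampling_rate(items: int, vocab_size: int) -> float:
    """Fraction of the target vocabulary a test of this length samples."""
    return items / vocab_size

def expected_chance_score(items: int, options: int = 4) -> float:
    """Expected number of items answered correctly by pure guessing."""
    return items / options

rate = sampling_rate(25, 10_000)       # 0.0025, i.e. one word in 400
chance = expected_chance_score(25, 4)  # 6.25 items right by luck alone
```

Seen this way, a 25-item multiple-choice score blends a very thin sample of the learner's lexicon with a non-trivial guessing component, which is why longer checklists and corrected scoring formats have been attractive alternatives.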
Chapter Five
Vocabulary Tests: Four
Case Studies
In
this chapter I discuss four tests that assess vocabulary knowledge as case
studies of test design and validation. I have referred to all four of them in
earlier chapters, especially Chapter 4, and so the case studies give me the opportunity to explore
issues raised earlier in greater depth, in relation to particular well-known
language tests. The four tests are:
1.
The Vocabulary Levels Test;
2.
The Eurocentres Vocabulary Size Test
(EVST);
3.
The Vocabulary Knowledge Scale (VKS);
and
4.
The Test of English as a Foreign
Language (TOEFL).
These
tests do not represent the full range of measures covered by the three
dimensions of vocabulary assessment. Three of them are discrete,
context-independent tests and all four are selective rather than comprehensive.
However, I have chosen them because they are widely known and reasonably well
documented in the literature. More specifically, there is research evidence
available concerning their validity as assessment procedures for their intended
purpose. They also represent innovations in vocabulary assessment and serve to
highlight interesting issues in test design. However, there is a limited number
of instruments that I could have considered for inclusion as case studies in
this chapter, which reflects the fact that, despite the upsurge in second language
vocabulary studies since the early 1980s, the design of tests that could
function as standard instruments for research or other assessment purposes has
been a neglected area.
The Eurocentres
Vocabulary Size Test
Like
the Vocabulary Levels Test, the Eurocentres Vocabulary Size Test (EVST) makes
an estimate of a learner's vocabulary size using a graded sample of words covering
numerous frequency levels. However, there are several differences in the way
that the two tests are designed and so it is worthwhile to look at the EVST in some
detail as well. The EVST is a check-list test which presents learners with a
series of words and simply requires them to indicate whether they know each one
or not. It includes a substantial proportion of non-words to provide a basis for
adjusting the test-takers' scores if they appear to be overstating their
vocabulary knowledge. Another distinctive feature of the EVST is that it is
administered by computer rather than as a pen-and-paper test. Let us now look
at the test from two perspectives: first as a placement instrument and then as
a measure of vocabulary size.
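The adjustment for over-claiming can be sketched with the standard correction used in yes/no vocabulary tests: the false-alarm rate on non-words discounts the hit rate on real words. This particular formula is one common choice in the checklist-testing literature, not necessarily the exact algorithm implemented in the EVST, and the response counts are invented.

```python
def adjusted_knowledge(hits: int, real_words: int,
                       false_alarms: int, nonwords: int) -> float:
    """Estimate the proportion of real words genuinely known,
    discounting 'yes' responses attributable to over-claiming."""
    h = hits / real_words        # 'yes' responses to real words
    f = false_alarms / nonwords  # 'yes' responses to non-words
    if f >= 1.0:
        # Test-taker claimed to know every non-word: no usable signal.
        return 0.0
    return max(0.0, (h - f) / (1.0 - f))

score = adjusted_knowledge(hits=40, real_words=60,
                           false_alarms=5, nonwords=20)
```

A learner who says yes to many non-words thus has the same raw hit rate rescaled sharply downwards, which is precisely the role the non-words play in the test design.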
Chapter Six
The Design Of Discrete
Vocabulary Tests
In
this chapter I review various considerations that influence the design of
discrete vocabulary tests. Discrete tests most commonly focus on vocabulary
knowledge: whether the test-takers know the meaning or use of a selected set of
content words in the target language. They may also assess particular
strategies of vocabulary learning or vocabulary use. Such tests are to be distinguished
from broader measures of vocabulary ability that are embedded in the assessment
of learners' performance of language-use tasks.
The
discussion of vocabulary-test design in the first part of this chapter is based
on the framework for language-test development presented in Bachman and
Palmer's (1996) book Language Testing in Practice. Since the full framework is
too complex to cover here, I have chosen certain key steps in the test-development
process as the basis for a discussion of important issues in the design of
discrete vocabulary tests in particular. In the second part of the chapter, I
offer a practical perspective on the development of vocabulary tests by means
of two examples. One looks at the preparation of classroom progress tests, and
the other describes the process by which I developed the word-associates
format as a measure of depth of vocabulary knowledge.
Receptive And
Productive Vocabulary
From
our experience as users of both first and second languages, we can all vouch
for the fact that the number of words we can recognise and understand is rather
larger than the number we use in our own speech and writing. This distinction
between receptive and productive vocabulary is one that is accepted by scholars
working on both first and second language vocabulary development, and it is
often referred to by the alternative terms passive and active. As Melka (1997)
points out, though, there are still basic problems in conceptualising and
measuring the two types of vocabulary, in spite of a lengthy history of
research on the subject. The difficulty at the conceptual level is to find
criteria for distinguishing words that have receptive status from those which
are part of a person's productive vocabulary. It is generally assumed that words
are known receptively first and only later become available for productive use.
Melka (1997) suggests that it is most useful to think in terms of a receptive
to productive continuum, representing increasing degrees of knowledge or
familiarity with a word. Thus, when they first encounter a new word, learners
have limited knowledge of it and may not even remember it until they come across
it again. It is only after they gain more knowledge of its pronunciation,
spelling, grammar, meaning, range of use and so on that they are able to use it
themselves. The problem is to locate the threshold at which the word passes
from receptive to productive status. Is there a certain minimum amount of word
knowledge that is required before productive use is possible? Melka acknowledges
that, if there is a continuum here, it is not a simple smooth one; furthermore,
there is a fluid boundary and a great deal of interaction between receptive and
productive vocabulary.
Chapter Seven
Comprehensive Measures
Of Vocabulary
Comprehensive
measures are particularly suitable for assessment procedures in which
vocabulary is embedded as one component of the measurement of a larger construct,
such as communicative competence in speaking, academic writing ability or
listening comprehension. However, we cannot simply say that all comprehensive measures
are embedded ones, because they can also be used on a discrete basis. For
example, a number of the studies which have applied lexical statistics to
learner compositions have been conducted by L2 vocabulary researchers who were
not interested in an overall assessment of writing ability but just in making
inferences about the learners' productive vocabulary knowledge. These
researchers are clearly treating vocabulary as a separate construct and not
making any more general assessment of the quality of the learners' writing.
Statistical Measures Of
Writing
One
way in which we can assess the written production of learners is by calculating
various statistics that reflect their use of vocabulary. Some of these
measurements were originally developed by literary scholars to analyse the
stylistic features of major authors and to date they have been applied to
second language writing only to a limited extent. Researchers in second language
acquisition have certainly been interested in quantitative, or 'objective',
measures of learner production but, in keeping with the general orientation of
their work, they have mostly counted grammatical units, such as the length of sentences
or of clauses. Those scholars who have worked with lexical measures have used
them for research purposes rather than for assessment of learners. The relative
complexity of the procedures involved in calculating the statistics makes it
difficult to apply them in an operational writing test, although the results of
such studies may provide valuable input into the design of the rating scales
that are most commonly used for the assessment of learner production in a less
time-consuming, qualitative manner.
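Two of the simplest such statistics can be sketched directly: the type-token ratio (a measure of lexical variation) and lexical density (the proportion of content words). The small stop-word set standing in for 'grammatical items' is an illustrative assumption; operational studies use much fuller lists.

```python
# Illustrative stand-in for a list of grammatical (function) words.
GRAMMATICAL = {"the", "a", "an", "and", "or", "but", "of", "to", "in",
               "is", "are", "was", "were", "it", "that", "this", "on"}

def type_token_ratio(tokens):
    """Distinct word forms divided by total running words."""
    return len(set(tokens)) / len(tokens)

def lexical_density(tokens):
    """Proportion of tokens that are lexical (content) words."""
    lexical = [t for t in tokens if t not in GRAMMATICAL]
    return len(lexical) / len(tokens)

text = "the learners wrote the essays and the essays were short"
tokens = text.split()
ttr = type_token_ratio(tokens)     # 7 distinct forms / 10 tokens = 0.7
density = lexical_density(tokens)  # 5 lexical tokens / 10 tokens = 0.5
```

Both measures are sensitive to text length, which is one reason researchers such as Laufer and Nation (1995) developed length-adjusted alternatives for comparing learner compositions.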
The researchers have
used the lexical statistics to investigate a variety of research questions:
1.
Do these measures give consistent
results when they are applied to two compositions written by the same learners,
with only a short time interval in between? (Arnaud, 1992; Laufer and Nation,
1995).
2.
How do the compositions of second
language learners compare with those of native speakers of a similar age and/or
educational level? (Arnaud, 1984; Linnarud, 1986; Waller, 1993).
3.
What is the relationship between the
lexical statistics and holistic ratings of the quality of the learners'
compositions? (Nihalani, 1981; Linnarud, 1986; Engber, 1995).
4.
What is the relationship between the
lexical quality of learners' writing and their vocabulary knowledge, as
measured by a discrete point vocabulary test? (Arnaud, 1984; 1992; Laufer and
Nation, 1995).
5.
Does the lexical quality of advanced
learners' writing increase after one or two semesters of English study?
(Laufer, 1991; 1994).
Chapter Eight
Further Developments In
Vocabulary Assessment
In
earlier chapters, I have surveyed a diverse range of work on second language
vocabulary assessment and proposed three dimensions which allow us to locate
the different types of measure within a common framework. Conventional vocabulary
tests - which I would describe as predominantly discrete, selective and context
independent - are effective research tools for certain purposes and are routinely
administered in second language teaching programmes around the world. Existing
tests of this kind will continue to be used and new
ones devised. They work
best in assessment situations where it makes sense to focus on vocabulary as a
discrete form of language knowledge and to treat lexical items as individual
units of meaning. At a time when the pendulum in language-teaching methodology
is moving back to a greater emphasis on form-focused instruction, there is
renewed interest in giving explicit attention to learners' mastery of the
structural features of the language, including its lexical forms.
The Vocabulary Of
Informal Speech
In
the preceding section, I have made reference to the lexical features of spoken
English in discussing Skehan's work. The vocabulary of speech is the second
area of vocabulary study that has received less attention than it should have,
as indicated by the fact that perhaps the most frequently cited research study
is the one conducted by Schonell et al. (1956) in the 1950s on the spoken
vocabulary of Australian workers. I mention this not to cast aspersions on
research coming from Australia (tempting though it may be for a New Zealander
to do so) but simply to highlight the limited number of more recent studies
from anywhere else. There are several reasons for this (see also McCarthy and
Carter, 1997: 20): As I have frequently noted, in Chapter 4 and elsewhere, a
large proportion of the research on vocabulary has been undertaken by reading
researchers, who obviously focus on words in written texts. There is no equivalent
research tradition on the vocabulary of spoken language, especially in informal
settings.
1.
Almost all the established
word-frequency lists have been compiled by counting words in corpora of written
texts. Although spoken corpora are becoming more common now, they are usually
much smaller than the corresponding written ones because samples of spoken
language are difficult both to collect and to store. Making recordings of
natural speech is quite a challenge: people tend to be self-conscious when they
know they are being recorded, and there are legal and ethical constraints on
recording speakers without their knowledge or consent. Once the speech has been
recorded, it then has to be painstakingly transcribed before being entered into
the computer for analysis.
2.
Spoken language also creates problems of
analysis. Speech is not 'grammatical', at least according to the rules for the
sentences of written language, and McCarthy and Carter (1997: 28-29) point out various
difficulties in the identification of vocabulary items as well. For instance,
are vocalisations like mm, er and um to be considered as lexical items? Should
contracted forms like don't, it's and gonna be counted as one word form or two?
O'Loughlin (1995) faced such problems when he applied the lexical density
statistic to speaking test data and I showed in Chapter 7 (Table 7.2) how he
had to develop quite an elaborate set of rules for distinguishing lexical and grammatical
items.
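The counting problems that McCarthy and Carter raise have to be resolved by explicit rules before any lexical statistic can be computed over speech. This sketch adopts one defensible set of rules, excluding filler vocalisations and splitting common contracted forms; O'Loughlin's actual rules, as noted above, were considerably more elaborate, and the word lists here are small illustrative samples.

```python
# Illustrative rule sets, not O'Loughlin's full scheme.
FILLERS = {"mm", "er", "um"}
CONTRACTIONS = {"don't": ["do", "not"],
                "it's": ["it", "is"],
                "gonna": ["going", "to"]}

def tokenise_speech(utterance: str):
    """Tokenise transcribed speech under explicit counting rules."""
    tokens = []
    for w in utterance.lower().split():
        if w in FILLERS:
            continue  # vocalisations are not counted as lexical items
        # Contracted forms are counted as their two component words.
        tokens.extend(CONTRACTIONS.get(w, [w]))
    return tokens

tokens = tokenise_speech("um it's gonna rain er don't worry")
# → ['it', 'is', 'going', 'to', 'rain', 'do', 'not', 'worry']
```

The point of making the rules explicit in code is that any lexical-density figure for speech is only interpretable relative to decisions like these.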
Source:
·
Purpura, James E. 2004. Assessing Grammar (1–145). Cambridge University Press.
·
Read, John. 2000. Assessing Vocabulary (1–236). Cambridge University Press.