Wynne Wong

The Research Supporting the Comprehensible Input Hypothesis and C.I. Instruction

Research shows that

  • languages are acquired only when people get aural or written comprehensible input
  • comprehensible reading in the target language improves acquisition a lot
  • grammar practice and explanations, most metacognition, performance feedback, and output are of minimal or no value
  • drills and any other kind of output practice don’t help acquisition
  • there are predictable, unavoidable, error-involving stages and sequences of acquisition of “grammar” which cannot be changed
  • learners’ speaking the target language does not help learners acquire it, and often slows acquisition
  • comprehensible input methods (including T.P.R.S., narrative paraphrase a.k.a. Movietalk, free voluntary reading, watching subtitled target-language video and Story Listening) do more for acquisition than do legacy methods that involve drills, rule-teaching and practice, forced output, etc.
  • despite superficial differences, children and adults learn languages in the same way

Here is the evidence supporting what we know about language acquisition.  Thanks to Eric Herman for digging a lot of this up, and thanks to Karen Lichtman, Bill VanPatten, Ray Hull, Stephen D. Krashen, Wynne Wong, Reed Riggs and Paul Nation for sending papers, comments, etc.

Want a live crash course in research?  See Bill VanPatten’s presentation (in 6 parts) here.  His weekly podcast is archived here.  Lance Piantaggini’s condensed Tea With BVP episodes are archived here. Sarah Cottrell’s Musicuentos podcasts are also worth a listen.

1) Should students be taught and practice specific grammar points?  NO.  Truscott reviews research and says that “overall the evidence against grammar teaching is quite strong.”  Krashen annihilates the grammarians’ arguments here. Wong and VanPatten (2003) also dismiss the grammar-practice argument in “The Evidence Is In: Drills Are Out,” and VanPatten, Keating & Leeser (2012) conclude that “things like person-number endings on verbs must be learnt from the input like anything else; they can’t be taught and practiced in order to build a mental representation of them.”

VanPatten also notes that “what we call grammar rules are what we end up with, and are not how we learn or what the brain actually does” (MIWLA presentation, 2013), and that “classroom rule learning is not the same as acquisition.” Lightbown writes that “structured input works as well as structured input plus explanation” (in VanPatten, 2004): in other words, explanations don’t aid acquisition (though some students may feel good getting them).

Bardovi-Harlig (2000) found, as VanPatten and Wong put it, that “learners– again, both in and out of the classroom– have demonstrated that acquisition of the tense and aspectual systems (e.g. the use of the preterit/passé composé and the imperfect) is piecemeal and unaffected by instructional intervention.”

VanPatten (1998) also notes that “[a] reading of the literature on second language acquisition and use suggests that communication is not the result of learning discrete bits of language and then putting them together.”

VanPatten (2013) also echoes Susan Gross when he notes that “building up in a learner’s brain [are] simultaneously lexicon and morphology, syntactic features and constraints, pragmatics and discourse, interfaces between components, communicative discourse [and] skill” and that “these happen all at once.  They are almost impossible to isolate and practice one at a time, because they don’t operate one at a time.”

In a fascinating study, Batterink & Neville (2013) found evidence for the longstanding hypothesis “that syntactic processing occurs outside of conscious awareness, relying upon computational mechanisms that are autonomous and automatic” (what Krashen calls the Monitor model).

2) How much vocabulary, grammar and general language skill do students pick up via free voluntary reading (FVR)? LOTS, and loads more than from direct instruction. There are estimates that readers acquire an average of a word every twenty minutes of FVR, that FVR works about twenty times as quickly as classroom instruction, and that 75% of an adult’s vocabulary comes from reading.  See Lehman (2007), summarised in IJFLT, July 2007.  Additional free voluntary reading research is detailed on Krashen’s site and Japanese researcher Beniko Mason has also done a ton of good FVR research.  There is very good research on the Fijian Book Flood experiment detailed here, which shows, among other things, that some “focus on form”– grammar and writing feedback– is useful for second-language acquisition at later and higher levels, even while comprehensible input does 95% of the work and remains the sine qua non of language acquisition.  In a recent study (abstract here), non-native speakers of Spanish who had a Spanish reading habit had much greater vocabulary than native Spanish speakers who did not read.

Stephen Krashen notes that “Nagy, Herman, and Anderson (1985) concluded that for English as a first language, each time readers encountered a new word in a comprehensible context, they acquired about five to ten percent of the meaning of the word. This may not seem like very much, but Nagy et al. point out that with enough comprehensible input, this is more than enough to account for what is known of vocabulary development.”

VanPatten writes that “for maximum vocabulary development, learners need to read all along the way, since most vocabulary development in both L1 and L2 is incidental, meaning that vocabulary is learned as a by-product of some other intention (normally reading).” Warwick Elley here examines free voluntary reading, grammar instruction, etc., and comes to the same conclusions that Krashen, VanPatten, Wong, Lightbown & Spada etc. do. Waring (2015) here makes the “inescapable case” for reading.  Mason and Krashen’s look at F.V.R. among Japanese learners of English showed significant positive effects. Self-selected, comprehensible, interesting reading in the target (or native) language boosts acquisition for the following reasons:

  • it delivers masses of comprehensible input
  • learners can pause, slow down, go back and seek extra (e.g. online or dictionary) help, which they cannot do nearly as well with a live speaker, and especially not with many native speakers (who often do not adjust vocabulary and speed to non-native-speakers’ needs)
  • readers can (and generally do) select books (input) tailored to their level
  • there is no output pressure, so the affective filter is low
  • for beginners, prosodic features like word differentiation are easier to see than to hear (but others, such as tone and accent, are harder to grasp)
  • the brain’s visual system is acute and, especially for monolinguals, better developed than the hearing processing system.

3) Do people acquire language via comprehensible input? YES. Krashen here summarises the comprehension hypothesis and destroys its rivals. Lightbown and Spada (2013) state that “comprehensible input remains the foundation of all language acquisition.”  VanPatten and Wong (2003) note that “Acquisition of a linguistic system is input dependent.”  Krashen also takes a look at savants, polyglots and ordinary folk who have learned languages via comprehensible input in this fascinating paper.  In a study of Spanish learners, comprehensible input teaching worked about six times as quickly as traditional instruction.  There is a great short comprehensible input demo by Krashen here, and here (starts at about 12:30) is a longer and more detailed lecture.

Krashen also lists the academic research supporting comprehensible input here.

Karen Lichtman lists the T.P.R.S.-supportive research here, and another giant literature review is here.

Note: For reading to help L2 acquisition, it must

  • be 98% comprehended
  • restrict vocabulary load to learners’ levels
  • be interesting in and of itself
  • recycle vocabulary
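
A teacher who wants a rough check on the first two conditions could estimate how much of a text a class already knows. The sketch below is only an illustration: it assumes that “98% comprehended” can be approximated as the share of running words in the text that appear on a known-word list, which is a simplification, and the names `coverage`, `is_readable` and the sample word list are hypothetical.

```python
import re

def coverage(text, known_words):
    """Return the fraction of running words in `text` found in `known_words`."""
    # Split the text into lowercase word tokens (letters only, apostrophes allowed inside).
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    if not tokens:
        return 0.0
    known = {w.lower() for w in known_words}
    return sum(1 for t in tokens if t in known) / len(tokens)

def is_readable(text, known_words, threshold=0.98):
    """True if known-word coverage meets the threshold (assumed here to be 98%)."""
    return coverage(text, known_words) >= threshold

# Hypothetical example: the class knows three of the four distinct words.
known = {"el", "gato", "come"}
sample = "El gato come. El perro come."
print(round(coverage(sample, known), 3))
print(is_readable(sample, known))
```

Here one unknown word in six running words gives coverage well under the threshold, so this text would need glossing or simplifying before it counts as comprehensible reading for this class.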

As Hulstijn notes, “most of the incidental L2 vocabulary learning studies. . . their results are valid, and educationally relevant, only as far as this initial encounter is concerned. What is far more relevant for educational practice is that long-term retention of new vocabulary normally requires frequent exposures or rehearsal, regardless of the conditions under which new words have initially been encountered” (2003, p. 367).

Nation writes that “Unsimplified text clearly provides poor conditions for reading and incidental vocabulary learning for learners whose vocabulary sizes are less than 9,000 word families [i.e. almost all learners in educational settings]” (2014, p. 9).

4) Should we organise curriculum thematically?  NO.  Among other reasons, it turns out that it’s harder to remember clusters of similar vocab than collections of thematically disparate vocab. As Paul Nation writes, “research on learning related vocabulary, such as lexical sets, … shows that learning related words at the same time [e.g. in thematic/semantic units such as “clothes” or “chores”] makes learning them more difficult. This learning difficulty can be avoided if related words are learned separately, as they are when learning from normal language use.” See Paul Nation on lexical sets and Rob Waring’s paper on vocab learning.

5) Should we “shelter” (limit) vocab?  YES. Evidence from children’s language acquisition suggests that we should, while “upping” prosodic variation (“wacky” or differentiated voices), reading rituals, and responses to student output (the paper is forthcoming). There is some processing research (VanPatten) that suggests that the amount of “mental energy” available for comprehension is limited, and that a minimal amount of new vocab should be introduced in structured patterns over a broad overlay of well-known vocab, so that “mental energy” can be devoted to acquiring newer items. VanPatten: “any model of L2 input processing [must] consider in some way the impact of capacity issues in working memory on what learners can do at a given point in time.”  In other words, overload = bad.

Children also acquire vocabulary more quickly if it is “framed”: delivered in interactive, structured and limited speech-and-response sets (see chapter 10 of Nurture Shock for details). It is estimated (Nation, 2006) that in most languages, the top 1000 most-frequently-used words account for about 85% of all oral language use, and the top 2000 for ~95%.  Best practice is probably to teach “along the frequency list” where the most emphasis is on words that are most used (with variations that cater to student needs and interests).
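
The coverage figures above can be checked against any sample of language a teacher has on hand. This is a minimal sketch, not the method behind the figures Nation reports: it assumes the sample is already split into word tokens and counts raw word forms rather than word families, and `top_n_coverage` is a hypothetical name.

```python
from collections import Counter

def top_n_coverage(tokens, n):
    """Return the share of all running words accounted for by the n most frequent words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(n) returns the n highest-count (word, count) pairs.
    covered = sum(count for _, count in counts.most_common(n))
    return covered / total

# Toy sample: "a" appears three times, "b" twice, "c" once.
tokens = "a a a b b c".split()
print(top_n_coverage(tokens, 1))
print(round(top_n_coverage(tokens, 2), 3))
```

Run on a real transcript or corpus with n set to 1000 or 2000, the same function shows how quickly a small set of high-frequency words comes to cover most of what is said.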

6) Do learners “learn” the “grammar” that teachers “teach?”  Not on teachers’ or texts’ schedules.  VanPatten (2010) argues in this very comprehensive paper that “some domains [aspects of language acquisition] may be more or less amenable to explicit instruction and practice [e.g. vocabulary], while others are stubborn or resistant to external influences [e.g. grammar].”  VanPatten, echoing Krashen, concludes that there is limited transfer of conscious knowledge “about” language into functional fluency and comprehension, and notes that “[n]ot only does instruction not alter the order of acquisition, neither does practice” (2013).

Ellis (1993) says that “what is learned is controlled by the learner and not the teacher, not the text books, and not the syllabus.”

7) Should we use L1– the “mother tongue”– in class? YES (albeit as little as possible): as Krashen notes, this avoids both ambiguity AND incomprehensibility, neither of which helps acquisition. Here are some ideas about why L1 should be used in the languages classroom (Immersion teachers take note…all the _______ in the world won’t help kids who do not understand it).  Nation (2003) notes “There are numerous ways of conveying the meaning of an unknown word […] However, studies comparing the effectiveness of various methods for learning always come up with the result that an L1 translation is the most effective (Lado, Baldwin and Lobo 1967; Mishima 1967; Laufer and Shmueli 1997).”

Here is some 2020 research where students in an L1-supported L2 class outperformed an L2-only (immersion-style) class.

8) Can we change the order of acquisition? NO. Krashen’s books have examples of order of acquisition. More recently, Lightbown and Spada (2013) reiterate Krashen’s contentions, showing how acquisition order of verb forms (in English-learning children) is fixed. Wong and VanPatten (2003) make the same point.  There is very little we can do to “speed up” acquisition of any “foreign” grammar rule (e.g. English speakers learning the Spanish subjunctive) or vocabulary, other than providing lots of comprehensible input that contains the rule in question.

VanPatten (2013) notes that instruction “does not alter the order of acquisition,” and Long (1997) says that “[t]he idea that what you teach is what they learn, and when you teach it is when they learn it, is not just simplistic, it is wrong.” We also know that L2 mistakes are partially a function of L1, have partly to do with L1-L2 differences, but mostly to do with learners not being mentally ready to produce the new form (which is a result of a lack of input).

For example, L1 German learners of L2 French make mistakes with subject-verb inversion…despite German having exactly the same rule as French for s-v inversion.  Arika Okrent documents children’s L1 acquisition errors; note that errors 5-8 are also classic adult L2 acquisition errors (stages).

Bardovi-Harlig (2000) found, as VanPatten and Wong (2003) put it, that “learners […] have demonstrated that acquisition of the tense and aspectual systems (e.g. the use of the preterit/passé composé and the imperfect) is piecemeal and unaffected by instructional intervention.”  In Lightbown (1984), French-speaking students’ English output did not “match” the input they were given.  Students “do not simply learn linguistic elements as they are taught– adding them one after another in neat progression.  Rather, the students process the input in ways which are more ‘acquisition-like’ and not often consistent with what the teacher intends for them to ‘learn’.”

9) Does correcting or properly re-stating learner mistakes– recasting– improve learner performance? Generally, NO. Lightbown and Spada (2013) point out that while teachers like recasting (and do it a lot), and while students can and do immediately generate improved output as a result, “these interactions were not associated with improved performance on […] subsequent test[s].”  VanPatten writes “[d]irect error correction by the instructor does not promote linguistic accuracy and the absence of error correction in the early stages of acquisition does not impede the development of linguistic accuracy” (1986 p.212).

Feedback regarding meaning, however, works: a student who points at a picture of a cat and says “dog” can benefit from being told “no, that’s a cat.” However, feedback directed at the implicit system– e.g. you should say vengo, not veno– is useless.

My view: if there is a place for recasts in the languages classroom, it is in ensuring that student output– which is also input for other students– is comprehensible and accurate.

10) Is there broad agreement among second-language-acquisition researchers about what constitutes effective practice? YES. In this paper, Ellis lays out the “ten principles” of second languages teaching.  He notes

  • comprehensible input is the sine qua non of second language acquisition
  • we must provide some “focus on form” (grammar explanations) to support meaning
  • there is no transfer from explicit knowledge of grammar to implicit language competence
  • the use of quite a lot of “formulaic” expressions– a.k.a. “lexical chunks”– is essential esp. for beginners
  • curricula organised along grammar sequential lines are probably not brain-friendly
  • instruction must primarily focus on meaning
  • drills don’t work
  • some output is necessary for acquisition in much later stages as this focuses learner attention on some aspects of form

S.L.A. researcher Patsy Lightbown here explains the “known facts” about second language acquisition.  Here is a video of S.L.A. research and what works/does not work by Bill VanPatten.

11) Do “learning styles” or “multiple intelligences” exist?  NO.  In this paper, psychologist Daniel Willingham puts the boots to the idea that teachers need to kill themselves providing nineteen different ways to learn the verb “to run.”  While people often have preferences about learning, and while some people definitely have better skills in some areas than others, there is no evidence to suggest that language acquisition is positively affected by anything other than the presence of masses of comprehensible input, and the absence of counterproductive activities (grammar practice, forced output, grammar lectures, etc).

VanPatten has said that “No research has found a link between learning styles and individual differences on the one hand, and on the other the processes involved in language acquisition.”

12) Do students like speaking in a second-language class?  Generally, no.  Krashen first made this point, and Baker and MacIntyre note that “Speaking has been found to be the most anxiety-provoking form of communication” (references to MacIntyre & Gardner (1991) and McCroskey & Richmond (1987)) and also note that production anxiety in classes is high among non-Immersion students.

Best practice is probably to let those who want to talk do so, and to delay any output for others while asking them to signal comprehension or lack thereof (as natural approach, A.I.M., Narrative Paraphrase and T.P.R.S. do).

13)  Does speaking improve acquisition?  NO.  Despite (a few) studies which try to make the case for output, there isn’t a strong one. See Krashen’s response to one such study here, and his examination of Swain’s output hypothesis– and the research testing it– here. In another study, English-speaking students were taught Spanish structures (subjunctive and conditional) via various mixes of input and practice output. In this study, students who

  • got input only did very well
  • got input and did limited output (“practise”) did no better than input-only students
  • did more output (“practise”) than getting input did significantly worse than those who got more input.

Wong and VanPatten (2003) note that “[a]cquisition of a linguistic system is input-dependent, meaning that learners must be engaged in comprehension in order to construct that system […] Production is not comprehension and thus produced language is not input for the learner. That input must come from others.” They also note that “drills are unnecessary and in some cases hinder acquisition,” and VanPatten (2013) remarks that “traditional ‘practice’ may result in language-like behaviour, but not acquisition” and that “practice is not a substitute for input.”  He goes on to ask “if input is so important, what does traditional practice do?” and answers “essentially very little, if anything.  It does not help mental representation.  It is not clear it helps skills.”

VanPatten also says that when “mechanical drills attempt to get the learner to acquire the thing they are asked to produce, the cart has been put before the horse,” and notes that “research conducted since the early 1990s has shown that traditional approaches to teaching grammar that involve the use of mechanical, meaningful and communicative drills do not foster acquisition in the way that practice [listening/reading] with structured input does.”

Caveat: “Output hones a learner’s ability to access the implicit system with accuracy and speed” (Keating, 2016, cited in Hawkins and Henshaw, 2022). This means that fluency is– in part– a function of speaking. HOWEVER: since one cannot speak– let alone speak well– without a small ocean of language in one’s brain, what teachers often refer to as “speaking practice” should be minimized. Another way of putting it: the more language you have in your head, the easier it will eventually be to speak, so the best use of classroom time– since most learners will get much less than the 1000 or so hours of comprehended input that they need to acquire something like fluency– is getting input. Or, as Blaine Ray puts it, “you will get back out what you put in.”

14) Should we speak s.l.o.w.l.y. in class? YES. Audiologist Ray Hull writes “[f]or an adolescent, spoken speech at around 135 words per minute is perfect for speech understanding, particularly when the student is learning a new language. So, 130 WPM may be even better. It will seem very slow to you, but the central auditory system of the student will appreciate it.” Adult native-language output is 170-180 words per minute, so slowness is essential (for all teachers, not just those of languages).  Note that there is no way to speed up auditory processing speed.
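
To see what those numbers mean in practice, a teacher could time a stretch of their own classroom talk and compare it with Hull’s figure. The sketch below is just arithmetic under stated assumptions: the word count and duration come from the teacher’s own recording, the 135 WPM default is taken from the quote above, and the function names are hypothetical.

```python
def words_per_minute(word_count, seconds):
    """Speech rate for a timed sample of talk."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return word_count * 60 / seconds

def stretch_factor(current_wpm, target_wpm=135):
    """How many times longer the same words should take at the target rate."""
    return current_wpm / target_wpm

# A two-minute sample containing 350 words is 175 WPM, mid-range for adult speech.
rate = words_per_minute(350, 120)
print(rate)
print(round(stretch_factor(rate), 2))
```

At 175 WPM the factor is about 1.3, so the same story would need to take roughly 30% longer to tell in order to land near 135 WPM.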

15) Do learners need many repetitions of vocab items to acquire them? YES.  In this study, scientists concluded that 160 repetitions of an item resulted in new items being “wired in” like older (or L1) items.  However, acquisition rates vary and depend on various factors:  is the word an L1 cognate?  Is it being used comprehensibly?  Is its use meaningful?  And so on.

16) Does feedback about performance in a language (e.g. correction, explicit information, etc) help acquisition?  NO. Sanz and Morgan-Short (2002) replicated with computer-delivered input what VanPatten & Cadierno (1993) did with spoken and written input.  And, as VanPatten & Wong (2003) put it, they found that “neither explicit information nor explicit feedback seemed to be crucial for a change in performance; practice in decoding structured input alone […] was sufficient.”  In other words, explaining to people how a grammar rule in a language works, and/or pointing out, explaining and recasting (correcting) errors has no effect on acquisition.  VanPatten also writes that “Overt correction does little good in the long run” but “indirect correction may be useful,” but notes that the research on indirect feedback is far from clear.

17)  Are some people better language learners than others?  NO.  Older research (as VanPatten, 2013, watch it here, video 5, says) suggested different people had different aptitudes.  New research (VanPatten 2013b, 2014) suggests, echoing Krashen, that on traditional tests of aptitude that measure conscious learning– e.g. knowing grammar rules– there are “better” and “worse” students.

HOWEVER, in terms of processing (understanding) ability, there is no difference among people.  If they get comprehensible input, they acquire at roughly the same rate, in the same way.  A classroom that foregrounds grammar practice and output should produce a more varied mix of outcomes than one which focuses on input.  VanPatten notes that working memory– roughly, how much “stuff” one can keep in their head consciously at a time– varies between individuals, and that those with greater working memory may find language acquisition easier.

18) Do children and adults learn languages in the same way? Mostly, yes.  Children must develop a linguistic system while simultaneously acquiring a language.  For example, kids need to develop basic competencies (which adults take for granted), such as knowing that words can represent reality, that there are such things as individual words, etc.  Once this “linguistic foundation” has been laid, kids and adults acquire languages in the same way. We know this because kids and adults make similar errors, have similar sequences of acquiring grammar, etc. As VanPatten notes, “adults and children appear to be constrained by the same mechanisms during language acquisition regardless of context, and the fundamental ingredients of language acquisition are at play in both situations: input (communicatively embedded language that learners hear or see, if sign language); Universal Grammar coupled with general learning architecture; and processing mechanisms that mediate between input and the internal architecture. In short, much of what we observe as differences between adults and children are externally imposed differences; not differences in underlying linguistic and psycholinguistic aspects of acquisition. And some of those externally imposed differences are a direct result of myths about language acquisition.”

19) Do we have data showing how well comprehensible input methods work in comparison with legacy methods?  YES. (Note: Nov 14, 2015– this section is being updated; please comment if you have things to add.)

  • C.A.L.A. testing shows T.P.R.S.-taught students outperforming other students despite having less in-class time than other students
  • Joe Dziedzic found that T.P.R.S. outperformed “communicative” teaching, with the biggest gains for T.P.R.S.-taught students being in oral and written output, despite T.P.R.S. students not being forced to speak or write outside of evaluation.
  • Ray & Seely’s Fluency Through T.P.R. Storytelling (7th ed.) has a research appendix.  Summary:  T.P.R.S. never works worse than, sometimes performs as well as, but mostly performs better than traditional methods.
  • Ashley Hastings’ “focal skills” C.I. approach– where what we call “Movietalk” comes from– significantly beats traditional teaching.
  • Grant Boulanger has shown that C.I. teaching both works better than the textbook in terms of student outcomes and increases retention of students who typically do not stick around in language classes (people of colour, boys, poor people, etc).
  • There are as of Nov 2018 twenty-nine studies that compare one C.I. approach (TPRS) with other methods. TPRS mostly comes out much better.
  • Beniko Mason’s “Story Listening” C.I. method also beats traditional instruction hands down. See her research here.

20) Do learners acquire words more easily if they hear masses of repetition at one time, or the same number of repetitions spaced out? It makes no difference. Brown investigated this and found, for vocab, that “Significant gains were observed […] and so the influence of several factors was explored: frequency of occurrence within the class and variation in word form were found to have significant positive effects on gains, while distribution of occurrences (massed or distributed) had no effect.” (Brown, D. Incidental vocabulary learning in a Japanese university L2-English language classroom over a semester. TESOL J. 2021; 12:e595. https://doi.org/10.1002/tesj.595)

21) Was Krashen right? Forty-some years after Stephen Krashen proposed his five hypotheses, the verdict on his claims is in.

Finally, there is no evidence suggesting that the following legacy language practices are effective:

  • grammar teaching and practice
  • forced and/or early output
  • any kind of drill
  • teacher-led chanting, or call-and-response
  • error correction and/or recasts
  • minimal reading; “fragmented” one-dimensional reading (e.g. lists, informational text, etc)
  • sequenced grammar instruction

Got a study, paper, etc that needs adding? Email me or add a comment and I’ll update this.