Bowles & Montrul 2008

Are explicit grammar instruction and feedback effective and worthwhile? A look at bad research & wrong conclusions.

I have been discussing research on grammar teaching and feedback for a while on Twitter with Steve S. and others.  I maintain that there is essentially no value (in terms of acquisitional gains for students) in explicitly teaching grammar or providing corrective feedback.  Steve sent me a paper, Bowles and Montrul (2008), which seems to suggest the opposite.  This is a classic problem for language teachers:  somebody does (very bad) research about Grammar Intervention Technique X, “finds” that it “works,” and then textbook publishers and grammarians use this to torture their poor students.  SO…

Today’s question:  are grammar instruction and feedback both effective and worthwhile?

Bowles and Montrul took English speakers learning Spanish, and wanted to see whether appropriate forms of the personal a in Spanish could best be acquired (for recognition) via regular exposure to Spanish, or via a three-part treatment.  The treatment group got explicit instruction (“this is the personal a, and ____ is how/where you use it”).  They then read sentences, some containing the personal a and some not, some grammatical and some not.  Finally, they got feedback:  if they screwed up, they were told so, they got an explanation, and they could redo the exercise as often as they wanted.  They were also told to aim for a score of 90% correct.

When the treatment finished, they were tested, and statistical analyses confirmed that, yes, the people who got the instructional treatment of instruction, sample sentences, and feedback did better than the others (and by “did better,” we mean “were able to recognise proper/improper uses of the personal a”).

So, Steve S. appears to be right.  Grammar instruction and feedback are prima facie effective.  BUT…but…but… there are so many problems with this study that, frankly, we might as well throw it out.  Here we go:  Stolzie versus the Professors.

First, Bowles and Montrul made several mistakes with their control group.

1.  Their study compared a treatment group with a non-treatment group, with insufficient differentiation of treatment variables.  This raises the question of cause: whether the treatment group’s gains came from instruction and feedback, or from simple exposure to Spanish.  If the treatment group got exposure to comprehensible language containing the instructional target (the personal a), and instruction and feedback, we do not know whether it was simple exposure to the target, or instruction and feedback about the target that made changes in understanding.

To address a concern like this, study design would have to expose a control group to lots of language containing the target, and the treatment group to that same language, as well as instruction plus feedback, so that the only difference between the groups would be the instruction and feedback.  This would allow us to tell what made the difference.

2.  Their study also failed to account for the quantity of language the students were exposed to.  They note that both groups got regular course instruction, but only the treatment group got the treatment (outside of class time).  So…if the treatment group got more Spanish than the controls, how do we know that the outcomes were a result of treatment?  Perhaps the treatment group’s gains came simply from getting more Spanish.  This is a confound: a potential and untested alternative explanation.

To address this concern, both groups should have received the same amount of exposure to Spanish– ideally only in class.
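
To see why unequal exposure matters, here is a toy sketch of the confound.  Every number in it is invented for illustration (none comes from Bowles and Montrul), and it deliberately assumes that instruction and feedback add nothing at all.  Even so, the group with extra hours of Spanish comes out ahead, and the gap disappears once exposure is matched.

```python
# Toy model of the exposure confound.  All values are hypothetical,
# chosen only to illustrate the logic; none are from the study.
GAIN_PER_HOUR = 2.0       # assumed score points gained per hour of Spanish exposure
INSTRUCTION_EFFECT = 0.0  # assume explicit instruction and feedback add nothing


def post_test_score(exposure_hours, got_instruction):
    """Score driven by exposure, plus whatever instruction adds (here: zero)."""
    score = GAIN_PER_HOUR * exposure_hours
    if got_instruction:
        score += INSTRUCTION_EFFECT
    return score


CLASS_HOURS = 10.0  # hypothetical in-class exposure, same for both groups
EXTRA_HOURS = 3.0   # hypothetical out-of-class time spent on the treatment

control = post_test_score(CLASS_HOURS, got_instruction=False)
treatment_as_run = post_test_score(CLASS_HOURS + EXTRA_HOURS, got_instruction=True)
treatment_matched = post_test_score(CLASS_HOURS, got_instruction=True)

# Gap when the treatment group also gets more Spanish.
print("unequal exposure gap:", treatment_as_run - control)
# Gap when both groups get the same amount of Spanish.
print("matched exposure gap:", treatment_matched - control)
```

In this made-up world the “treatment” looks effective only because it came bundled with more Spanish, which is exactly the alternative explanation the study design cannot rule out.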

Second, Bowles and Montrul severely limited themselves with their treatment.  If you want to determine the best way to improve language acquisition (even of a simple item), you cannot just take one intervention, compare it to a control, and from that make a general statement such as “grammar interventions work.”  Their experiment does not look at other possibilities.  How about just simple comprehensible input containing the target in class?  Or, how about VanPatten’s processing instruction?  How about free voluntary reading in Spanish?

Norris and Ortega (2000), in their massive meta-analysis of the effectiveness of instructional intervention (that’s jargon for “does teaching people languages actually help them acquire languages?”), noted that basically any exposure to the target language, if it is meaningful, will produce some acquisition.  The question is not “does _____ work?”, but “how well, compared to other approaches, does _____ work?”  A grammarian who likes his worksheets and a “communicative” teacher who loves having her first-years do “dialogues” will both say “but they are learning!” and they are right.  The question, however, is:  how MUCH are they learning compared to what other methods would deliver?

From the teacher’s point of view– outside of the control-group flaws noted above– this study does not provide us with anything useful.  All it (in my view wrongly) claims is that some “focus on form” (allegedly) worked better than whatever else the students were doing.  But since we have a lot of instructional options, research that doesn’t compare them is useless.

A better design would have looked at different ways of helping people acquire the personal a (other than just having it present in input, as it was for the control group) and compared their effectiveness.

Third, there was no examination of durability of intervention.  OK, a week after intervention, tests found the intervention group picked up the personal a.  How about a year later:  did they still have it?  If there is no look at durability of intervention, why bother?  If I have to decide what to do with my students, and I have zero guarantee that the effects of Intervention ____ will last, why do it, especially if, as we will see, it’s boring?  Krashen proposed a three-months-delayed post-test as one criterion of validity.  This study does not deliver on that.

Fourth, any classroom teacher can see the massive holes in this kind of thing right off the bat.

(A) It’s boring.  Would YOU want to read and listen to two-dimensional writing for days?  Juan vio a Juana.  Juana le dio un regalo a su mamá.  I cannot imagine any set of students paying attention to this.  If you wanted to diversify instruction (i.e. not present just tedious lists of sentences and grammar info), you would also be severely restricted in what you could actually do in the classroom, as you would have to build everything around rule ______.

(B) The “number of rules” problem rears its head.  Bowles and Montrul targeted the personal a because we don’t have that in English.  Spanish also has a ton of other grammar we don’t have in English.  Off the top of my head, umm,

  • subject position in questions
  • differences in use of past tenses with auxiliary verbs
  • major differences in uses of reflexive verbs…e.g. why does a Spanish speaker say comí una pizza, but me comí tres pizzas?

Any Spanish teacher could go on and come up with zillions more “non-Englishy” rules that need to be learned.  If a teacher wants to design teaching around rule-focused input and feedback, the problem is that they will never be able to address all the rules:  not only is the number of rules functionally infinite, but nobody knows them all.

Fifth, the opportunity cost of grammar reinforcement etc. is both high and unaddressed in this study.  Basically, what we have is a bandwidth problem.  We have X amount of time per day/course/year to teach Spanish (or whatever).  Any focus on Rule A means, by definition, that we will have less time to devote to Rule B.  Even the doddering grammarian with his verb charts and grammar notes can see the problem (oh no!  If we spend too much time on the personal a, I won’t be able to benefit the kids with my mesmerising object pronoun worksheets!), but it’s worse than that.

In terms of input, focus on a grammar rule/item/etc means losing out on two crucial things:

1. Language that is multidimensional in terms of content.  As noted, if the personal a is your target, you are seriously restricted in what you can say, write, etc (it’s boring) but, beyond being boring, students are losing out on whatever could be said without using the personal a.

2.  Language that is grammatically multidimensional.  If I must teach focused on the personal a, the other “rules” will be less present in the input, and so we’re starving Peter to feed Paul.

My guess is that– even if you did this study without all the flaws I note above and got positive results– you would find a cost elsewhere, as the quantity and variety of language students would be exposed to would have dropped and been simplified.  So they might master the personal a, but they acquire less of grammar rule ____ or vocab _____.

(Krashen and many others have looked at almost exactly this question for the acquisition of vocab and writing skills, asking whether free voluntary reading (in L1 or L2) or classroom instruction works best.  You can teach people vocab, or phonics, or word-decoding, or writing rules, or you can let them read (or listen to) interesting stuff.  The research is unanimous and clear: free voluntary reading beats everything in terms of how fast things are picked up, how interesting learning is, and how “multidimensional” the learning (measured in various ways, from word recognition to improved writing) is.)

What we need is a holistic look at acquisition, which one-item studies of this kind cannot show us.  What did these students not acquire while they were doing their personal a grammar work?  What did the students who got multidimensional input pick up?  Language is much more complex than knowing Rule ____, and looking at an instructional intervention that targets 0.1% of what needs to be learned, while ignoring the other 99.9%, is silly at best.

If you really want to know whether an instructional intervention, or technique, works, you have to look at all aspects of language use, not just whether or not one rule has been acquired.

SO…do grammar-focused instruction, vocab presentation and corrective feedback work to help people acquire the personal a?

  • maybe (but Bowles and Montrul don’t know why)
  • we have no idea for how long
  • sure…for one item at a time
  • in a boring way
  • in a way that sacrifices essential multidimensional input (of grammar and vocab)

So.  Next?
