
There Are No Shortcuts

It has been oft-observed that no matter what your first language is, your brain acquires additional languages in the same way (ie via comprehended input, in stages, following a set order, etc).

One study looked at L1 German and English speakers acquiring L2 French. For example, to say do you like to work? in French, you have to say aimes-tu travailler? (literally, like-you to work). In German we say magst du arbeiten? Note that in both French and German, we reverse subject pronoun and first verb to make the question.

The study found that both German and English speakers made the same mistakes with subject-verb inversion during question formation, despite German having the same “rule” as French.

This should be comforting to language teachers, who often see “errors” persisting seemingly forever. Why can’t the kids just use plural verbs?, ask Spanish teachers. What is so difficult about the fartitive article? whine our French-teaching colleagues. Well, here is a story that may shed some light.

I’m a native German speaker who learned English starting in kindergarten, French in grade seven, and Spanish at age twenty-two. I also acquired a lot of Cantonese from neighbourhood kids around age three, but I forgot it.

In Spanish, when you say I wash my hands, you don’t say *lavo mis manos. You actually say me lavo las manos, which literally means something like “for myself I wash the hands.” The me makes it clear that these are my and not somebody else’s hands.

This “rule” took me forever to acquire. Like, years. And then it hit me.

In German, we have exactly the same “rule” as Spanish. To say I wash my hands, you don’t say *ich wasche meine Hände. You say ich wasche mir die Hände, or “I wash for me the hands.” (The only difference between Spanish and German is where the reflexive pronoun me/mir goes.)

I had to acquire the same “rule” in Spanish that I had already acquired in German, and I had to acquire it the same way that I— and anyone else— acquires it: from the input.

So if your kids are taking forever to say eg estoy bien instead of soy bien, or whatever, relax. Even if their L1 “rule” is like the L2 “rule” they are acquiring— and equally so if there is no similarity— they still have to work through ordered development.

And if there is one lesson here, it might be: resist the urge for grammatical explanations, or cleverly-disguised “practice”, or God help you worksheets, when your kids’ emergent grammars raise your teacherly hackles. Patience, my good sir and madame– there are no shortcuts.

The Rule of Three: Simpler Evaluation

Teachers are uhhhh obsessive, especially about marking. We write and rewrite assessment instruments, when we could be hitting a bachata class, ripping up the Grand Wall after work, or kicking back with our five-year-old.


We spend too much time thinking about grading. Luckily for us, I’m gonna make the rest of your teaching career waaaaaay simpler by showing you how to make marking simple.

Various assessment gurus will tell you something fairly similar regarding attaching Numberz to Performancez: there are only three (or maybe four) real levels of skill that one can accurately describe. These are, basically:
1. not yet proficient
2. functionally proficient
3. fully proficient

Breaking things down further is complicated, and therefore makes marking slower (and rubrics more complex and therefore harder for students to understand). The more you refine descriptors and levels, the harder it is to distinguish between them. 

Yes, sometimes more complex rubrics are called for, but not in a language class. And why not? Because the only teacher action which makes a difference for language learners is the amount and quality of input.

So…imagine if you got marked on partying. They give you a Number for how well you party.
Q: what would the rubric look like?
A: like this…

1 You are on your way to the party.
2 You are standing in the doorway, chatting with the host, eyeing a nice martini.
3 You are shaking it on the dancefloor with thirty others, with your second drink, and the sexiest person at the party is checking you out.

Works? Sure! It’s simple, quick and accurate. Your Party Mark will be 33%, 67% or 100%. Now, say we also wanted to grade outfits. So we add this:

1 Sweats and slides are kinda basic…but hey, you got out of bed!
2 Business casual? You look good and respectable but no eyeballs/mentions for you.
3 Oh yeah! What’s yr Insta, gorgeous? 😁

If we mark our partiers on both behaviour and dress, we could get from 2/6 to 6/6, or 33%, 50%, 67%, 83%, 100%. (Note the minimum is 2/6, since everyone scores at least 1 on each criterion.) This is pretty good. We could add another criterion– say, flirting skills– and then our marks would range from 3/9 to 9/9, or 33%, 44%, 56%, 67%, 78%, 89% and 100%.
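If you want to double-check this kind of rubric arithmetic, here’s a quick sketch. (The function name is mine, not from any gradebook software; it just enumerates every total a rubric can produce, given some number of criteria each scored 1–3.)

```python
from itertools import product

def possible_marks(criteria, levels=3):
    """Every distinct total a rubric can produce, as 'total/top = pct%',
    with `criteria` criteria each scored 1..levels."""
    top = criteria * levels
    totals = sorted({sum(combo)
                     for combo in product(range(1, levels + 1), repeat=criteria)})
    return [f"{t}/{top} = {round(100 * t / top)}%" for t in totals]

print(possible_marks(2))  # behaviour + dress: 2/6 (33%) up to 6/6 (100%)
print(possible_marks(3))  # add flirting: 3/9 (33%) up to 9/9 (100%)
```

Note that the lowest possible two-criterion mark is 2/6, not 1/6: a partier gets at least a 1 on each criterion just for showing up.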

So here is our Rule of Three for Evaluation:

1. We focus on three levels of skill (not yet, just got it, fully proficient).
2. There is a clear difference between each level.
3. We do not mark more than three criteria.

Now, I’ma show y’all how this works for a language class. Here’s our oral interaction rubric (end of year, zero prep, totally 100% spontaneous & unplanned Q&A with a student, Level 2 and up in any language).

Here is the rubric. We are evaluating comprehension, functional accuracy and quantity of speech. 

3. I understand everything said. My errors have minimal impact on how understandable I am. I ask and answer questions, and keep conversation going, appropriately.

2. I understand much of what is said with some obvious gaps. My errors occasionally make me impossible to understand. I try to keep conversation going but sometimes have problems adding to/elaborating on what has been said.

1. I don’t understand much of what is said. My errors often make me hard to understand. I have consistent problems keeping the convo going.

This rubric is a 3×3 and generates marks between 3 and 9 out of 9 (ie 33%-100% in roughly 11% intervals). It’s a nice mix of detail, speed and simplicity. You basically never want a rubric more complex than 3×3 cos it gets too texty for kids to read.

There you go. Use it if you want it.

Anyway, a few notes to go with this (and with marking writing, or anything else):

A. You can mark via selective sample. Eg, for writing, say your kids pump out 300-word stories (mid Level 1). I’ll bet you dinner and a movie that marking any three sentences will show you their proficiency as accurately as reading the entire thing. Same goes for answering questions about a reading, or listening. Pick a small sample and go.

B. You will generally see marks “clustering.” The kid who understands all the questions/comments in an oral interview will probably also be able to speak well. This is cos most “skills” develop in concert. With our partying rubric, it is likely that Mr Dressed To Kill is also quite sociable, a good and enthusiastic dancer, etc. Yes, there will be the odd kid who understands everything but can’t say much, but this is uncommon.

Now would somebody please make rubrics for spontaneous written output and reading comp also? Create & share.

Let’s be DONE with marking questions and focus on what matters: finding cool input for kids, and making our grading quick & simple, so that we can relax after work & show up energised. Remember, one of C.I.’s greatest innovators at one point said that their method was developed to boost their golf score. The logic? Well-rested, happy teacher = good teacher 😁😁.


What grades should kids get? Notes on evaluation for the mathematically-challenged.

Here is a part of a post from Ben’s. A teacher– let’s call him Mr John Speaking– who uses T.P.R.S. in his language class writes:

“I was told by a Defartment Chair a few weeks ago that my grades were too high across the board (all 90s/100s) and that I needed more of a range for each assessment. Two weeks later I had not fixed this “problem” and this same Defartment Chair pulled me out of class and proceeded to tell me, referencing gradebook printouts for all my classes, that these high grades “tell me there is not enough rigor in your class, or that you’re not really grading these assessments.” After this accusation, this Defartment Chair told me I was “brought on board [for a maternity leave replacement] in the hopes of being able to keep me, but that based on what he’d seen the past few weeks, I’m honestly not tenure track material.”

Obviously, Mr John Speaking’s Defartment Chair is an idiot, but, as idiots do, he does us a favour:  he brings up things worth thinking about.

There are two issues here:

a) Should– or do— student scores follow any predictable distribution?  I.e., should there be– or are there–a set percentage of kids in a class who get As, Bs, Cs, Ds and Fs?

b) How do you know when scores are “too low” or “too high”?

Today’s question: what grades should students get?

First, a simple, math idiot’s detour into grading systems and stats.  The math idiot is me.  Hate stats?  Bad at math? Read on!  If I can get it, anyone can get it!

It is important to note that there are basically two kinds of grading systems. We have criterion-referenced grading and curved (norm-referenced) grading.

First, we have criterion-referenced grading. That is, we have a standard: to get an A, a student does X; to get a B, a student does Y; etc. For example, we want to see what our Samoyed Dogs’ fetching skills are and assign them fetching marks. Here is our Stick Fetching Rubric:

A:  the dog runs directly and quickly to the thrown stick, picks it up, brings it back to its owner, and drops it at owner’s feet.

B: the dog dawdles on its way to the stick, plays with it, dawdles on the way back, and doesn’t drop it until asked.

C: the dog takes seemingly forever to find the stick, bring it back, and refuses to drop it.

So we take our pack of five Samoyed Dogs, and we test them on their retrieval skills.  Max, who is a total idiot, can’t find the stick forever, then visits everyone else in the park, then poos, then brings the stick an hour later but won’t drop it because, hell, wrestling with owner is more fun.  Samba dutifully retrieves and drops.  Rorie is a total diva and prances around the park before bringing the stick back.  Arabella is like her mother, Rorie, but won’t drop the stick.  Sky, who is so old he can remember when dinosaurs walked the Earth, goes straight there, gets the stick, and slowly trudges back.  So we have one A, one B, one C, one C- (Max– we mercy passed him) and one A- (Sky, cos he’s good and focused, but slow).

Here are our Samoyeds:

[photo: the Samoyeds]

Now note–

1. Under this scheme, we could theoretically get five As (if all the Dogs were like Samba), or five Fs (if everybody was as dumb and lovable as Max).  We could actually get pretty much any set of grades at all.

2.  The Samoyed is a notoriously hard-to-train Dog.  These results are from untrained Samoyeds.  But suppose we trained them?  We used food, praise, hand signals etc etc to get them to fetch better and we did lots of practice.  Now, Sky is faster, Rorie and Arabella don’t prance around the park, and even silly Max can find the stick and bring it.  In other words, all the scores went up, and because there is an upper limit– what Samba does– and nobody is as bad as Max was at fetching, the scores are now clustered closer together.

The new scores, post-training, are:

Sky and Samba: A

Rorie, Max and Arabella: B

Variation, in other words, has been reduced.

3.  Suppose we wanted– for whatever reason– to lower their scores.  So, we play fetch, but we coat the sticks in a nasty mix of chocolate and chili powder, so that whenever the Dogs get near them, they get itchy noses, and very sick if they eat them.  The Dogs stop wanting to fetch our sticks.  Some of them will dutifully do it (e.g. Samba), but they aren’t idiots, and so most of them will decide to forget or ignore their training.

4.  Also note who we don’t have in our Dog Pool: Labrador Retrievers (the geniuses of the fetching world), and three-legged Samoyeds. There are no Labs because they are three orders of magnitude better than Samoyeds at fetch, and we don’t have three-legged Samoyeds because, well, they can’t run.

In other words, we could reasonably get any mix of scores, and we could improve the scores, or we could– theoretically– lower them. Also, we don’t have any Einstein-level retrievers or, uhh, “challenged” retrievers– there are no “outliers.”

Now, let’s look at “bell curve” (a.k.a. norm-referenced) grading.  In this case, we decide– in advance— how many of each score we want to assign.  We don’t want any random number of As or Fs or whatever– we want one A, one F, etc.  We want the scores to fit into a bell curve, which looks like this:

[image: a bell curve]

We are saying “we want a certain # of As, Bs, Cs, Ds and Fs.” Now, we have a problem. In our stick-fetching example above, we got an A, an A-, a B, a C and a C-. We have no Ds or Fs, because all of the Dogs could perform. None of them were totally useless. (After training, we would get two As– Samba and Sky– and three Bs– Rorie, Max and Arabella.) But if we have decided to bell curve, or norm-reference, our scores, we must “force” them to fit this distribution.

So Samba gets an A, Sky gets a B, Rorie gets a C, Arabella gets a D, and Max fails.

Now, why would anyone do this?  The answer is simple: norm referencing is only a way to sort students into ranks where the only thing that matters is where each person ranks in regard to others.  We are not interested in being able to say “in reference to criteria ____, Max ranks at C.”  All we want to do here is to say where everyone is on the marks ladder compared to everyone else.
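To make the contrast concrete, here is a small sketch. The raw scores and the grade cutoffs are invented for illustration; the point is that criterion-referencing lets grades repeat, while norm-referencing hands out one rank per slot no matter how close the dogs actually are.

```python
# Invented raw fetch scores for the trained Samoyeds, for illustration only.
scores = {"Samba": 95, "Sky": 91, "Rorie": 80, "Arabella": 79, "Max": 78}

def criterion_grade(mark):
    # Criterion-referenced: everyone who clears the bar gets the grade.
    # (Cutoffs are made up.)
    return "A" if mark >= 86 else "B" if mark >= 73 else "C"

print({dog: criterion_grade(m) for dog, m in scores.items()})
# -> two As (Samba, Sky) and three Bs: grades can repeat freely.

# Norm-referenced ("bell-curved"): grades are handed out by rank,
# one slot each, no matter how close the raw scores are.
ranked = sorted(scores, key=scores.get, reverse=True)
print(dict(zip(ranked, "ABCDF")))
# -> Samba A, Sky B, Rorie C, Arabella D, Max F.
```

One point separates Arabella from Max, but the forced curve turns that into the difference between a D and an F.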

Universities, law schools, etc sometimes do this, because they have to sort students into ranks for admissions purposes, get into the next level qualifiers, etc etc.  For example, law firm Homo Hic Ebrius Est goes to U.B.C. and has 100 students from which to hire their summer slav– err, articling students.  If they can see bell-curved scores, they can immediately decide to not interview the bottom ___ % of the group, etc.  Which U.B.C. engineers get into second year Engineering?  Why, the top 40% of first-year Engineering students, of course!

Now I am pretty sure you can see the problem with norm referencing:  when we norm reference (bell curve), we don’t necessarily say anything about what students actually know/can do.  In the engineering example, every student could theoretically fail…but the people with the highest marks (say between 40 and 45 per cent) would still be the top ones and get moved on.  In the law example, probably 95% of the students are doing very well, yet a lot of them won’t be considered for hire. Often, bell-curves generate absurd results.  For example, with the law students, you could have an overall mark of 75% (which is pretty good) but be ranked at the bottom of the class.

So where does the idea for norm referencing (“bell curving”) student scores come from? Simple: the idea that scores should distribute along bell-curve lines comes from a set of wrong assumptions about learning and about “nature.” In Nature, lots of numbers are distributed along bell-curve lines. For example, take the height of, say, adult men living in Vancouver. There will be a massive cluster who are within two inches of 5’11” (from 5’9″ to 6’1″). There will be a smaller number who are 5’6″ to 5’8″ (and also who are 6’1.5″ to 6’3″). There will be an even smaller number who are shorter than 5’6″ and taller than 6’3″. Get it? If you graphed their heights, you’d get a bell curve like this:

[image: bell curve of men’s heights]

If you graphed adult women, you’d also get a bell curve, but it would be “lower” as women (as dating websites tell us) are generally shorter than men.

Now– pay attention, this is where we gotta really focus– there are THREE THINGS WE HAVE TO REMEMBER ABOUT BELL CURVES:

a)  Bell curve distributions only happen when we have an absolutely massive set of numbers. If you looked at five men, they might all be the same height, short, tall, mixed, whatever (i.e. you could get any curve at all). But when you up your sample to a thousand, a bell curve emerges.

b) Bell curve distributions only happen when the sample is completely random.  In other words, if you sampled only elderly Chinese-born Chinese men (who are generally shorter than their Caucasian counterparts), the curve would look flatter and the left end would be higher.  If you didn’t include elderly Chinese men, the curve would look “pointier” and the left end would be smaller. A bell curve emerges when we include all adult men in Vancouver.  If you “edit out” anyone, or any group, from the sample, the distribution skews.

c)  Bell curves raise one student’s mark at the expense of another’s. When we trained our Samoyed Dogs, then marked them on the Stick Fetching Rubric, we got two As and three Bs. When we convert this into a curve, however, what happens is, each point on the curve can only have one Dog on it. Or, to put it another way, each Dog gets a different mark, no matter how well they actually do. So, our two As and three Bs become an A, a B, a C, a D and an F. If Rorie gets a B, that automatically (for math-geek reasons) means that Max will get a different mark, even if they are actually equally skilled.

As you can see in (c), bell curves are absolutely the wrong thing to do with student marks.
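Points (a) and (b) are easy to see in a quick simulation. (The mean and spread below are rough guesses at the Vancouver numbers, not census data.)

```python
import random
import statistics

random.seed(42)

def sample_heights(n, mean=71.0, sd=3.0):
    # Heights in inches: mean ~5'11", as in the Vancouver example.
    return [random.gauss(mean, sd) for _ in range(n)]

five, many = sample_heights(5), sample_heights(100_000)
# (a) Five men can land anywhere; a huge random sample hugs the true mean.
print(round(statistics.mean(five), 1), round(statistics.mean(many), 1))

# (b) Edit out part of the population (say, everyone under 5'9")
# and you no longer have the full bell: the mean shifts up
# and the left tail disappears.
tall_only = [h for h in many if h >= 69.0]
print(round(statistics.mean(tall_only), 1))
```

A class of thirty is much closer to the five-man sample than to the hundred-thousand-man one, which is exactly the problem with expecting classroom marks to “bell.”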

And now we can address the issues that Mr John Speaking’s Defartment Head brings up.  Mr Defartment Head seems to think that there are too many high marks, and not enough variation within the marks.

First, there is no way one class– even of 35 kids– has enough members to form an adequate sample size for a bell-curve distribution.  If Mr Defartment Head thinks, “by golly, if that damned Mr John Speaking were teaching rigorously, we’d have only a few As, a few Ds, and far more Bs and Cs,” he’s got it dead wrong: there aren’t enough kids to make that distribution  possible.  Now, it could happen, but it certainly doesn’t have to happen.

Second, Mr John Speaking does not have a statistically random selection of kids in his class.  First, he probably doesn’t have any kids with special challenges (e.g. severe autism, super-low I.Q., deaf, etc etc).  BOOM!– there goes the left side of the bell curve and up go the scores.  He probably also doesn’t have Baby Einstein or Baby Curie in his class– those kids are in the gifted program, or they’ve dropped out and started hi-techs in Silicon Valley.  BOOM!– there goes the right side of your curve.  He’ll still have a distribution, and it could be vaguely bell-like, but it sure won’t be a classic bell curve.

Or he could have something totally different. Let’s say in 4th block there are zero shop classes, and zero Advanced Placement calculus classes. All of the kids who take A.P. calculus and shop– and who also take Spanish– therefore get put in Mr Speaking’s 4th block Spanish class. So we now have fifteen totally non-academic kids, and fifteen college-bound egg-heads. Mr Speaking, if he used poor methods, could get a double-peaked curve: a bunch of scores clustering in the C range, and another bunch in the A range, with fewer Bs and Ds.

Third, instruction can– and does– make a massive difference in scores. Remember what happened when we trained our Samoyeds to give them mad fetching skillz, yo? Every Dog got better. If Mr Speaking gave the kids a text, said “here, learn it yourself,” then put his feet up and did Sudoku on his phone or read the newspaper for a year (I have a T.O.C. who comes in and literally does this), his kids would basically suck at the language (our curve just sank down).  On the other hand, if he used excellent methods, his kids’ scores would rise (curve goes up).  Or, he is awesome, but gets sick, and misses half the year, and his substitute is useless, so his kids’ scores come out average.  Or, he sucks, gets sick, and for half the year his kids have Blaine Ray teaching them Spanish, so, again, his kids’ scores are average:  Blaine giveth, and Speaking taketh away.

“Fine,” says the learned Defartment Chair, “Mr John Speaking is a great teacher, and obviously his students’ scores are high as a result of his great teaching, but there should still be a greater range of scores in his class.”

To this, we say a few things:

a)  How do we know what the “right” variability of scores is? The answer: there is no way of knowing without doing various kinds of statistical comparisons. This is because it’s possible that Mr Speaking has a bunch of geniuses in his class. Or, wait, maybe they just love him (or Spanish) and so all work their butts off. No, no, maybe they are all exactly the same in IQ? No, that’s not it. Perhaps the weak ones get extra tutoring to make up for their weakness. Unless you are prepared to do– and have the data for– something called least-squares regression analysis, you are not even going to have the faintest idea about what the scores “should” be.

b)  Score variability is reduced by effective teaching. There are zillions of real-world examples of how appropriate, specific instruction reduces the variation in performance. Any kid speaks their native language quite well. Sure, some kids have more vocab than others, but no two Bengali (or English-speaking) ten year olds are significantly different in their basic speaking skills. 95% of drivers are never going to have an accident worse than a minor parking-lot fender-bender. U.S. studies show that an overwhelming majority of long-gun firearm owners store and handle guns properly (the rate is a bit lower for handgun owners). Teach them right, and– if they are paying attention– they will learn.

Think about this.  The top possible score is 100%, and good teaching by definition raises marks.  This means that all marks should rise, and because there is a top end, there will be less variation.
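That ceiling effect takes only a few lines to demonstrate. (The class scores below are made up; the shape of the result is the point.)

```python
import statistics

# Made-up class scores (percent) before effective instruction.
before = [55, 64, 72, 81, 90]

# Good teaching raises everyone, but marks are capped at 100%,
# so the top of the class compresses against the ceiling:
after = [min(100, s + 25) for s in before]

print(statistics.mean(before), statistics.pstdev(before))  # lower mean, wider spread
print(statistics.mean(after), statistics.pstdev(after))    # higher mean, narrower spread
```

Everyone improved by the same amount, yet the spread shrank, purely because 100% is as high as anyone can go.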

Most importantly,  good teaching works for all students. In the case of a comprehensible input class, all of the teaching is working through what Chomsky called the “universal grammar” mechanism.  It is also restricted in vocab, less (or not) restricted in grammar, and the teacher keeps everything comprehensible and focuses on input.  This is how everyone learns languages– by getting comprehensible input– so it ought to work well (tho not to exactly the same extent) on all learners.

Because there is an upper end of scores (100%), because we have no outliers, and because good teaching by definition reaches everyone, we will have reduced variation in scores in a comprehensible input class.

So, Mr Speaking’s response to his Defartment Head should be “low variation in scores is an indication of the quality of my work. If my work were done poorly, I would have greater variation, as well as lower marks.” High marks plus low variation = good teaching. How could it be otherwise?

In a grammar class, or a “communicative” class, you would expect much more variation in scores. This is because the teaching– which focuses on grammar and/or output, and downplays input– does not follow language acquisition brain rules. How does this translate into greater score variation?

a) Some kids won’t get enough input– or the input won’t be comprehensible enough– and so they will pick up less.  Now you have more lower scores.

b) Some kids will be OK with that.  Some kids won’t, and they’ll do extra work to catch up.  Result: variation in acquisition.  Now, there will be a few high scores and more low ones.

c) Some kids will hate speaking and so will do poorly on the speaking assessments, which will increase variation.

d) Many kids don’t learn well from grammar teaching, so in a grammar-focused class, you’d expect one or two As, and a lot of lower marks.

e) If the teacher is into things like “self-reflection on one’s language skills and areas for growth” or such edubabble, and the kids are supposed to go back and rework/redo assignments, things could go either way. If, for example, they re-do a dialogue from the start of the course at the end, they might– if the vocab has been recycled all year– do better. If, however, it’s the “check your grammar” stuff, you’d again expect variation: only a very few kids can do that, even if their language skills have grown during the year.

And, of course, there is the “grammar bandwidth” problem: any effort to focus on a specific aspect of grammar means that other areas suffer, because our conscious minds have limited capacity. A District colleague told me that, for Level 5 (grade 12) French, the kids self-edit portfolio work. They have an editing checklist– subject-verb agreement, adjective agreement, etc– and they are supposed to go and revise their work.

The problems with this, of course, are two: in their mad hunt for s-v errors, the kids will miss out on other stuff, and we know that little to no conscious learning makes it into long-term memory.

Some real-life examples of how good instruction narrows variation  in scores:

At Half-Baked School, in the Scurvy School District (names have been changed to protect the guilty), TPRS teacher Alicia Rodriguez has Beginning Spanish. So does her Defartment Chair, Michelle Double-Barreled. When, at the end of the semester, they have to decide on awards– who is the best Beginning Spanish student?– Alicia has 16 kids getting an A, another 12 getting a B, and two getting a C+. None fail. Michelle Double-Barreled has one kid getting an A, a bunch of Bs and Cs, a couple of Ds, and a few failures.

What this means is, 16 of Alicia’s kids can

a) write 100 excellent words in Spanish in 5 min, on topics ranging from “describe yourself” to “describe [a picture].”

b) Write a 600-1000 word story in 45 min.

Both will have totally comprehensible, minor-errors-only Spanish.

Michelle Double-Barreled, on the other hand, has one A. Her “A” kid can

a) do grammar stuff

b) write a 100-word paragraph on one of the topics from the text (e.g. shopping, eating in a restaurant, sports s/he plays, family).

This will be not-bad Spanish.

Now, who’s doing a better job?  Alicia has more kids doing more and better work.  Michelle has a classic bell-curve distribution.  According to Mr John Speaking’s Defartment Chair, Mrs Double-Barreled has a “normal” range of scores.  Yet Alicia is clearly getting her kids to kick major butt. Hmm…

The point is, with appropriate and effective instruction– good Dog training, or good Spanish teaching– we are going to get a cluster of generally higher scores.  Poor or no teaching might produce something like a bell curve.

So…what do T.P.R.S. and other comprehensible input teaching methods do for student outcomes?

In my class, T.P.R.S. did the following:

a) all scores rose.

b) the difference between top and bottom scores (variation) decreased.

c) I.E.P. kids all passed.

d)  First-year kids in second-year classes did about 85% as well as second year kids, despite having missed a year of class.

e) In terms of what the kids could actually do, it was light-years ahead of the communicative grammar grind.  Kids at end of 2nd year were telling and writing 400-600 word stories in 3-5 verb tenses, in fluent and comprehensible (though not perfect) Spanish.  Oral output was greater in quality and quantity too.

f) Nobody failed.

My colleague Leanda Monro (3rd year French via T.P.R.S.) explains what T.P.R.S. did in her classes:

“[I saw a] huge change in overall motivation. I attribute this to a variety of well-grounded theories, including “emotion precedes cognition” (John Dewey), Krashen’s affective filter, and the possible power of the 9th type of intelligence, drama and creativity (Fels, Gardner). There is a general feeling of excitement, curiosity, eagerness to speak French, incorporation of new vocabulary, spontaneous speech.

All but one student has an A or a B. The one student in the C range has significant learning challenges, and despite excellent attendance in all courses is failing both math and English. No one is failing.

[There was] far less variation. Overall, far greater success for all students. My contribution to the “Your scores are too high” comment is this: As educators we need to pose an important question:  Are we trying to identify talent, or are we trying to nurture and foster talent?  T.P.R.S. works to nurture and foster.”

And here are Steve Bruno’s comments on the effect of T.P.R.S. on his kids’ scores:

“I now get more  As and Bs [than before]. A few C+s and very few Cs. Let’s put it this way, in the past I’ve had to send between 20 and 25 interims/I reports (total 7 classes); this year, so far, I’ve sent just THREE! Of these, two had poor attendance; the other one is an L.A.C. student who is taking a language for the first time (Gr. 9).

Marks are also closer together.  Anyone who has been teaching C.I. will understand why this is the case:  Students feel more confident, less stressed and simply love T.P.R.S. They don’t like doing drills, or memorizing grammar rules, etc.

Here’s another example [of how comprehensible input has changed student behaviour]. Last year, I had a few students who, on the day of an announced test (usually with one week of warning), would suddenly become ill and/or skip. Some of my L.A.C. students would have to write the test in a separate room. Others would show all sorts of anxiety as I handed out the tests. Many of these students would end up either failing the test or doing very poorly.

This year, I have the same students in my class, and the day of an unannounced test, they don’t have to go to another room, nobody complains and they just get down to it and, yes, they do quite well, thank you very much!

OK, people…you want to report on how things are going with T.P.R.S.? Post some comments, or email.

Why aren’t there more C.I. teachers?


I’ve been getting emails and tweets from people all over and one of the questions that often comes up is, why are there not more language teachers using T.P.R.S. or other comprehensible input methodologies?

Let’s get a few things out of the way first. The evidence is in: all you need for second-language acquisition is lots of meaningful, repetitive, interesting comprehensible input– aurally and through reading. The research is very clear: we do not need to ask for output, do grammar drills, provide grammar feedback, explicitly teach grammar (in any sense other than brief explanations, like “-amos goes with we in Spanish”), ask students to self-reflect on their linguistic skills, etc., to enable students to acquire a language. These practices are at best a waste of time and at worst a barrier to acquisition.

But for now, the question remains: how come more teachers don’t use C.I. generally, or T.P.R.S. specifically? 

So here’s a list.  What do you all think?

Adriana Ramirez told me before I started C.I. that I’d need 3 years to make it properly work. I am near the end of year 2 and I couldn’t agree more.  This is a major learning curve…but even done by a beginner, it’s more fun and more effective than anything else.  Now, I am far from the smartest guy in the room, but even I can learn how to teach via comprehensible input.  If I can do it, how come more people don’t?

OK; here we go:

a) “It worked for me.” Most language teachers survived the grammar grind, or the “communicative approach,” in high school.  99% of the time, they went to Europe or Japan or wherever after high school, where the amount of input in the language they’d learned was so high that they became fluent, or close to it.  For many of them, high-school is just the “groundwork” for the real world.  Nevermind that most kids won’t ever end up in ____ to “really” learn or use the language.

These folks basically see it this way: grammar grind/communicative was “real world” preparation; immersion (or Uni) finished off their language skills; this is good enough (or ideal).  We also teach the way we learn– unless we make a strong effort to step out of our mental box– so…

b) “I’m the only C.I. teacher.” If you are the lone wolf– or even if there’s two or three of you– and you have a department, department head or administrator who does not understand, or (more commonly) “believe in,” C.I., it’s tough to kick against the pricks.  This happens in lots of places.  C.I. teachers get badmouthed behind their backs, or openly at department meetings, and since most teachers– especially women teachers, who are socialised to “play nice”– don’t want to ruffle feathers (“come on, team!”-style thinking)– it’s hard for a lot of us to do our own thing.

Thankfully, the Internet keeps us connected to our community.

This is not to say that life is hostile for all C.I. practitioners– in my department, we agree to disagree, and we get along great– but it’s still harder to innovate if you’re the only one.

c) “C.I. doesn’t fit the textbook.”  As noted in earlier posts, you cannot change the order of acquisition of “grammar.”  The only thing you can change is the speed, and the only way you can do that is to provide loads of interesting comprehensible input.  Textbooks are the antithesis of how language is actually acquired.  If a C.I. teacher is forced to “follow” a textbook, C.I. goes out the window.  Anyone who has ever seen the TPRS addendum for the Avancemos textbook knows how stupid, though well-meaning, that addendum is.

d) “What I do works, and I’m not reinventing the wheel.”  Young teachers are overwhelmed, then they have kids; many of us old farts (and I am one of them) lose our edge.  “I worked for years, my system works, I’m not changing” is what one person flat-out said to me not long ago when I asked whether they’d want to try C.I. I get it.  It’s hard.  The question, I suppose, is philosophical: why, really, are you here?  I like Nietzsche’s way of assessing the “rightness” of a choice (what he calls “the eternal recurrence”): you are doing something right if you could do it an infinite number of times and it would still be interesting. This is what I love about C.I.: I can never step twice into the same story.

Is the C.I. workload too high?  Hmm… when I look back at the work I did pre-TPRS, I am amazed at how MUCH stuff I made, found, put together, modified, etc, and how bad my results were.  I made a few hundred bingo (and other) games, I made conversation cards, I designed culture projects, I made conversation systems…and at the end of the year the kids still couldn’t say “I like running.”

With C.I., all I really need is a story outline or idea, the props etc., a reading that uses these structures, and a novel.  Indeed, after 20 classes, my kids are reading Berto y sus Buenas Ideas, and their output at the end of the year is MILES ahead of where it used to be (and there are zero management issues).  So it’s not more work… it’s different work.

e) “I’m not sure that will work.” Most of us teachers are conservative and cautious by nature (unless we are young, single and on a Pro-D day which ends with alcohol– OK, I am being facetious here, but you get the point).   We are passing on tradition; we have old-school language teaching hammered into our heads….

f) “This is what we do.” Institutional memory is long. I read somewhere that educational research takes from 8 to 50 years to filter down into shared practice. We still have senior English teachers in my district who give spelling tests fer Gawd’s sake!  If you learned the Blablabian language via Activity ____ in high school, then you end up teaching Blablabian in a high school, doing Activity ____ is an easy default.

95% of teacher learning does not happen during methods class, as all of us (except for University education-program designers) know. We learn most of our skills in real time, in a real job.  This means you are going to get support, advice, materials etc from established people, and, in languages, that means grammarians and “eclectic”-approach people.

g) “Not all students can succeed in French.  Some aren’t smart enough and others don’t have the skills” is what my first language department head told me.  She then added “that’s why many of them take Spanish or Punjabi. They think it’s easier and they don’t want to do the work that French requires.” Standard grammar or communicative teaching works as a “weeder” of students who do not want to learn via grammar instruction or stupid “communicative tasks”– like “ask your neighbour if he likes running.”  This reinforces what most people think: that most people cannot learn languages cos they don’t “work hard enough.”

I recently read a great article about Alcoholics Anonymous which pointed out that A.A.’s success rate is abysmal– at most 10% of A.A. attendees stay off the sauce (or whatever) for any length of time– yet its rep is solid because the 90% for whom it fails don’t go around talking about it; the 10% who make it join, mentor others, speak publicly, etc., and so the “success” rate appears high and the failures are blamed for their failure by those who have succeeded.  According to A.A. boosters, if A.A. works, it’s because of the method; when it fails, it’s the fault of those who failed.  This is like saying “we made a drug that cures ___ 10% of the time, and the other 90% of the time it’s the fault of the patient that the cure did not take.”

If we use Method X, and it works poorly for many students, we could come up with many explanations.  Bad method?  Students who don’t have skills or motivation?  Who knows for sure…but given that anyone and everyone, even the severely mentally challenged, can learn a language, and that most people in the world learn two or more languages without formal teaching, it’s a long shot to say that failure to learn is the kids’ fault.

What we do know is that in Canada, as Netten and Germain (2012) argue in a paper (see bibliography for details), “Core French” (what most kids get: 5-6 hours/week of French from Gr5-Gr11 or so) doesn’t work very well.  Lots of kids drop out, many don’t like it, and those who do finish have poor skills.  The ones who do best tend to be white (and, increasingly, East Asian), wealthier and with more educated parents (the same is broadly true of French Immersion kids).  These are the kids who go on exchange trips to Quebec, whose parents buy them Dora and Duolingo and Rosetta Stone, etc.  Hmmm…anybody see the problem here?

Looking around at various B.C. school districts, what I have heard, over and over, is that generally 75% of kids drop out of languages by grade 12.  If a 1500-kid school has 8 blocks of French (250 kids) in grade 8 (level 1 for you Yanks), by the time they get to grade 12 often there will only be one or maybe two blocks left (25-50 kids).  Partly this is cos taking a second language is only necessary to the grade 11 level (in B.C. and a few other provinces) for Uni admission purposes.  But you have to wonder.  What if math was taught this way?  What if more than 75% of math students dropped math by grade 12?  There would be an uproar.  National crisis!  We’re losing our edge, bla bla bla.

My guess is that traditional grammar or communicative teaching “weeds out” the kids who don’t naturally learn in those ways, and so the ones who do finish are held up as examples of the “success” of these older methods, which reinforces bias against change:  “Johnny got an A in French 12; that means other kids should also be able to.”

At ____ last year (names have been changed to protect the guilty), the C.I. teacher and the dept head (a grammarian and I.B. teacher) each had Intro Spanish. At the end of the year, when they had to decide who the “top kids in Spanish” in each section were for award purposes, the C.I. teacher had literally 15 kids per block at an A/A+ level, while the dept head had one.  The C.I. teacher said it was C.I. that did it, while the department head said something like “a lot of these students [in my class] don’t put in the work or don’t have the skills.” (Same kinds of kids in both groups.)  I’ve seen the C.I. results– they’re stellar.  The department head refuses to try C.I., bad-mouths the department’s two C.I. teachers (who by any standard get amazing results), and still maintains that it is student work habits that drive acquisition, not the teaching method.

Yet, for the department head, the failure of many kids is a benefit to her: she ends up with the egg-head kids who slave away over grammar when they get to the International Baccalaureate year of language study, she doesn’t have to change her teaching style, and the kids still do “well enough.”  She would get annihilated in a non-egg-head school.

h) “The textbook provides structure.” In B.C., there was a provincial committee ten years ago that looked at Spanish resources. They allowed ¡Díme! (the dumbest book ever made), Avancemos, Paso a Paso, etc., because anybody can go in with those texts and just follow the instructions. ¡Juntos!– which is as good as it gets for the communicative approach– was rejected cos in order to use it, you had to be pretty creative, have reasonable Spanish, be into all kinds of manipulatives and out-of-seat activities, etc.

In other words, the Ministry of Education and the Board consortium which collectively reviewed Spanish materials wanted a teacher-proof program where anyone could “teach” Spanish.  I get this, sort of– I know loads of people (I am one) who weren’t Uni-trained in language teaching and had to learn to teach it on the fly– but it’s pretty frikkin’ bad long-term policy.  We like to say “students will rise to our expectations.”  Surely we can expect the same of teachers.  This btw is one of the reasons why A.I.M. is popular: it is so rigidly organised and laid out that you can literally walk in on Day 1 and follow the book for an entire year.

i) There’s no real pressure to “succeed” in languages teaching.   One of the ironies of the idiotic, standards-driven U.S. testing mania is that there has been real pressure to figure out what works.  Blaine Ray, T.P.R.S.’s inventor, was fired from his first and second jobs because his principals wanted better results and higher enrolment.  Ray’s quest for a better method– which began with Asher’s T.P.R. and then moved into narratives after he read Krashen– produced what appears right now to be the best second-language-teaching method. We Canadians have less testing, less accountability– all in my view good things– and, above all, much less socioeconomic inequality than the U.S., which leads to better outcomes for poorer students.  But the price for our ease is a lack of innovation, especially in languages pedagogy.  And too often we can just say “well, those kids didn’t learn French (or whatever) because, well, they weren’t working hard enough. People who really need to learn a second language have to take it in Uni, and usually go away to where it’s spoken to pick it up.”

j) “Why did I not learn about C.I. in Uni?” In B.C., none of the languages teacher training programs

  • offer teacher candidates a solid grounding in tested, systematised comprehensible input methods (TPRS, narrative paraphrase, Story Listening, etc).  If you can teach people the basics of TPRS in two days, why are student teachers not being taught?
  • teach, or ask teacher candidates to demonstrate, understanding of current research into how languages are acquired
  • hire professors/instructors who know S.L.A. research.  In the U.S., Bill VanPatten found that fewer than two per cent of Uni-level languages teachers knew anything at all about S.L.A.
  • ask that teacher candidates demonstrate competence in C.I. methods

Much the same is true for most of the U.S. and Canada.  How do people actually learn effective methods?  From a colleague willing to experiment, or from people like Blaine Ray, Carol Gaab or the IFLT/NTPRS conferences.  I do TPRS workshops every year; without fail, I am asked, “Why did we not learn this in teacher training?”

k) Institutional pressure is strongly against C.I.  Krashen, who has done more to advance language teaching than anyone, was unable to go to the ACTFL conference for years, because he has courageously and correctly called out ACTFL’s textbook sponsors about the high cost and low effectiveness of their materials.  They finally got him in there in 2016 because Bill VanPatten put his foot down.  As I have noted, using T.P.R.S. is way cheaper than using a textbook as well as being far more effective…which Houghton-Mifflin etc. do not want to hear.

OK, there’s my list of the reasons why C.I. teachers– and not grammarians or “communicative” teachers– are still in the minority.