
What grades should kids get? Notes on evaluation for the mathematically-challenged.

Here is part of a post from Ben’s blog.  A teacher– let’s call him Mr John Speaking– who uses T.P.R.S. in his language class writes:

“I was told by a Defartment Chair a few weeks ago that my grades were too high across the board (all 90s/100s) and that I needed more of a range for each assessment. Two weeks later I had not fixed this “problem” and this same Defartment Chair pulled me out of class and proceeded to tell me, referencing gradebook printouts for all my classes, that these high grades “tell me there is not enough rigor in your class, or that you’re not really grading these assessments.” After this accusation, this Defartment Chair told me I was “brought on board [for a maternity leave replacement] in the hopes of being able to keep me, but that based on what he’d seen the past few weeks, I’m honestly not tenure track material.”

Obviously, Mr John Speaking’s Defartment Chair is an idiot, but, as idiots do, he does us a favour:  he brings up things worth thinking about.

There are two issues here:

a) Should– or do— student scores follow any predictable distribution?  I.e., should there be– or are there–a set percentage of kids in a class who get As, Bs, Cs, Ds and Fs?

b) How do you know when scores are “too low” or “too high”?

Today’s question: what grades should students get?

First, a simple, math idiot’s detour into grading systems and stats.  The math idiot is me.  Hate stats?  Bad at math? Read on!  If I can get it, anyone can get it!

It is important to note that there are basically two kinds of grading systems. We have criterion-referenced grading and curved (norm-referenced) grading.

First, we have criterion-referenced grading.  That is, we have a standard– to get an A, a student does X; to get a B, a student does Y; and so on.  For example, we want to see what our Samoyed Dogs’ fetching skills are and assign them fetching marks. Here is our Stick Fetching Rubric:

A:  the dog runs directly and quickly to the thrown stick, picks it up, brings it back to its owner, and drops it at owner’s feet.

B: the dog dawdles on its way to the stick, plays with it, dawdles on the way back, and doesn’t drop it until asked.

C: the dog takes seemingly forever to find the stick and bring it back, and then refuses to drop it.

So we take our pack of five Samoyed Dogs, and we test them on their retrieval skills.  Max, who is a total idiot, spends forever looking for the stick, then visits everyone else in the park, then poos, then brings the stick back an hour later but won’t drop it because, hell, wrestling with his owner is more fun.  Samba dutifully retrieves and drops.  Rorie is a total diva and prances around the park before bringing the stick back.  Arabella is like her mother, Rorie, but won’t drop the stick.  Sky, who is so old he can remember when dinosaurs walked the Earth, goes straight there, gets the stick, and slowly trudges back.  So we have one A, one B, one C, one C- (Max– we mercy passed him) and one A- (Sky, cos he’s good and focused, but slow).

Here are our Samoyeds:

[photo: the Samoyeds]

Now note–

1. Under this scheme, we could theoretically get five As (if all the Dogs were like Samba), or five Fs (if everybody was as dumb and lovable as Max).  We could actually get pretty much any set of grades at all.

2.  The Samoyed is a notoriously hard-to-train Dog.  These results are from untrained Samoyeds.  But suppose we trained them?  We used food, praise, hand signals etc etc to get them to fetch better and we did lots of practice.  Now, Sky is faster, Rorie and Arabella don’t prance around the park, and even silly Max can find the stick and bring it.  In other words, all the scores went up, and because there is an upper limit– what Samba does– and nobody is as bad as Max was at fetching, the scores are now clustered closer together.

The new scores, post-training, are:

Sky and Samba: A

Rorie, Max and Arabella: B

Variation, in other words, has been reduced.

3.  Suppose we wanted– for whatever reason– to lower their scores.  So, we play fetch, but we coat the sticks in a nasty mix of chocolate and chili powder, so that whenever the Dogs get near them, they get itchy noses, and very sick if they eat them.  The Dogs stop wanting to fetch our sticks.  Some of them will dutifully do it (e.g. Samba), but they aren’t idiots, and so most of them will decide to forget or ignore their training.

4.  Also note who we don’t have in our Dog Pool:  Labrador Retrievers (the genius of the fetching world), and three-legged Samoyeds.  There’s no Labs because they are three orders of magnitude better than Samoyeds at fetch, and we don’t have three-legged Samoyeds because, well, they can’t run.

In other words, we could reasonably get any mix of scores, and we could improve the scores, or we could– theoretically– lower them.  Also, we don’t have any Einstein-level retrievers or, uhh, “challenged” retrievers– there are no “outliers.”

Now, let’s look at “bell curve” (a.k.a. norm-referenced) grading.  In this case, we decide– in advance— how many of each score we want to assign.  We don’t want any random number of As or Fs or whatever– we want one A, one F, etc.  We want the scores to fit into a bell curve, which looks like this:

[image: bell curve]

We are saying “we want a certain # of As, Bs, Cs, Ds and Fs.”  Now, we have a problem.  In our above stick fetching example, we got an A, an A-, a B, a C and a C-.  We have no Ds or Fs, because all of the Dogs could perform.  None of them were totally useless.  (After doing some training, we would get two As (Samba, Sky) and three Bs (Rorie, Max and Arabella).)  But if we have decided to bell curve, or norm reference, our scores, we must “force” them to fit this distribution.

So Samba gets an A, Sky gets a B, Rorie gets a C, Arabella gets a D, and Max fails.
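That forced fit can be sketched in a few lines of code.  The numeric scores here are invented for illustration, but the grades they produce match the Samoyed example above:

```python
# Hypothetical post-training fetch scores (numbers invented for illustration).
post_training = {"Samba": 95, "Sky": 93, "Rorie": 86, "Arabella": 85, "Max": 84}

def criterion_grade(score):
    """Criterion-referenced: any number of dogs can earn any grade."""
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"

# Criterion grading: two As and three Bs, because everyone can now fetch.
print({dog: criterion_grade(s) for dog, s in post_training.items()})
# {'Samba': 'A', 'Sky': 'A', 'Rorie': 'B', 'Arabella': 'B', 'Max': 'B'}

# Norm-referenced ("curved") grading: one grade per rank slot,
# no matter how close the underlying scores are.
curve_slots = ["A", "B", "C", "D", "F"]
ranked = sorted(post_training, key=post_training.get, reverse=True)
print(dict(zip(ranked, curve_slots)))
# {'Samba': 'A', 'Sky': 'B', 'Rorie': 'C', 'Arabella': 'D', 'Max': 'F'}
```

Note that a single point of difference between Arabella and Max becomes the difference between a D and an F once we curve.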

Now, why would anyone do this?  The answer is simple: norm referencing is a way to sort students into ranks, where the only thing that matters is where each person stands relative to the others.  We are not interested in being able to say “in reference to criteria ____, Max ranks at C.”  All we want to do here is to say where everyone is on the marks ladder compared to everyone else.

Universities, law schools, etc. sometimes do this, because they have to sort students into ranks for admissions, for entry into next-level programs, and so on.  For example, law firm Homo Hic Ebrius Est goes to U.B.C. and has 100 students from which to hire their summer slav– err, articling students.  If they can see bell-curved scores, they can immediately decide to not interview the bottom ___ % of the group, etc.  Which U.B.C. engineers get into second year Engineering?  Why, the top 40% of first-year Engineering students, of course!

Now I am pretty sure you can see the problem with norm referencing:  when we norm reference (bell curve), we don’t necessarily say anything about what students actually know/can do.  In the engineering example, every student could theoretically fail…but the people with the highest marks (say between 40 and 45 per cent) would still be the top ones and get moved on.  In the law example, probably 95% of the students are doing very well, yet a lot of them won’t be considered for hire. Often, bell-curves generate absurd results.  For example, with the law students, you could have an overall mark of 75% (which is pretty good) but be ranked at the bottom of the class.
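Here is a toy sketch of that law-firm cutoff, with marks invented for illustration:

```python
# Hypothetical law-class marks: nearly everyone is doing well in absolute terms.
marks = [75, 76, 78, 80, 82, 84, 85, 88, 90, 92]

# The firm interviews only the top 40% by rank.
cutoff = int(len(marks) * 0.4)
interviewed = sorted(marks, reverse=True)[:cutoff]

print(interviewed)  # [92, 90, 88, 85]

# A 75% -- a perfectly respectable mark -- ranks dead last, so that
# student never gets an interview, even though they did well.
print(min(marks))  # 75
```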

So where does the idea for norm referencing (“bell curving”) student scores come from?  Simple: the idea that scores should distribute along bell-curve lines comes from a set of wrong assumptions about learning and about “nature.”  In Nature, lots of numbers are distributed along bell-curve lines.  For example, take the height of, say, adult men living in Vancouver.  There will be a massive cluster who are within two inches of 5’11” (from 5’9″ to 6’1″).  There will be a smaller # who are 5’6″ to 5’8″ (and also who are 6’1.5″ to 6’3″).  There will be an even smaller number who are shorter than 5’6″ or taller than 6’3″.  Get it?  If you graphed their heights, you’d get a bell curve like this:

[image: bell curve of men’s heights]

If you graphed adult women, you’d also get a bell curve, but it would be shifted to the left, as women (as dating websites tell us) are generally shorter than men.

Now– pay attention, this is where we gotta really focus– there are THREE THINGS WE HAVE TO REMEMBER ABOUT BELL CURVES:

a)  Bell curve distributions only happen when we have an absolutely massive set of numbers.  If you looked at five men, they might all be the same height, short, tall, mixed, whatever (i.e. you could get any curve at all). But when you up your sampling to a thousand, a bell curve emerges.

b) Bell curve distributions only happen when the sample is completely random.  In other words, if you sampled only elderly Chinese-born Chinese men (who are generally shorter than their Caucasian counterparts), the curve would look flatter and the left end would be higher.  If you didn’t include elderly Chinese men, the curve would look “pointier” and the left end would be smaller. A bell curve emerges when we include all adult men in Vancouver.  If you “edit out” anyone, or any group, from the sample, the distribution skews.

c)  Bell curves raise one student’s mark at the expense of another’s.  When we trained our Samoyed Dogs, then marked them on the Stick Fetching Rubric, we got three As and two Bs.  When we convert this into a curve, however, what happens is, each point on the curve can only have one Dog on it.  Or, to put it another way, each Dog has a different mark, no matter how well they actually do.  So, our three As and two Bs become an A, a B, a C, a D and an F.  If Rorie gets a B, that automatically (for math-geek reasons) means that Max will get a different mark, even if they are actually equally skilled.
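Points (a) and (b) are easy to see in a quick simulation.  This is only a sketch– the mean and standard deviation below are made-up stand-ins for Vancouver men’s heights:

```python
import random

random.seed(42)  # reproducible illustration

def sample_heights(n, mean=71.0, sd=2.5):
    """Draw n simulated adult-male heights in inches (5'11" = 71")."""
    return [random.gauss(mean, sd) for _ in range(n)]

five = sample_heights(5)
thousand = sample_heights(1000)

# Five men can come out short, tall, or all over the place:
print([round(h) for h in five])

# With a thousand, roughly 68% land within one standard deviation
# of the mean -- the classic bell shape emerging.
within_one_sd = sum(abs(h - 71.0) <= 2.5 for h in thousand) / len(thousand)
print(round(within_one_sd, 2))  # somewhere near 0.68
```

Restricting the sample– point (b)– amounts to drawing from a different distribution, which is exactly why a single classroom never reproduces the full curve.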

As you can see in (c), bell curves are absolutely the wrong thing to do with student marks.

And now we can address the issues that Mr John Speaking’s Defartment Head brings up.  Mr Defartment Head seems to think that there are too many high marks, and not enough variation within the marks.

First, there is no way one class– even of 35 kids– has enough members to form an adequate sample size for a bell-curve distribution.  If Mr Defartment Head thinks, “by golly, if that damned Mr John Speaking were teaching rigorously, we’d have only a few As, a few Ds, and far more Bs and Cs,” he’s got it dead wrong: there aren’t enough kids to make that distribution  possible.  Now, it could happen, but it certainly doesn’t have to happen.

Second, Mr John Speaking does not have a statistically random selection of kids in his class.  First, he probably doesn’t have any kids with special challenges (e.g. severe autism, super-low I.Q., deaf, etc etc).  BOOM!– there goes the left side of the bell curve and up go the scores.  He probably also doesn’t have Baby Einstein or Baby Curie in his class– those kids are in the gifted program, or they’ve dropped out and started hi-techs in Silicon Valley.  BOOM!– there goes the right side of your curve.  He’ll still have a distribution, and it could be vaguely bell-like, but it sure won’t be a classic bell curve.

Or he could have something totally different.  Let’s say in 4th block there are zero shop classes, and zero Advanced Placement calculus classes.  All of the kids who take A.P. calculus and shop– and who also take Spanish– therefore get put in Mr Speaking’s 4th block Spanish class.  So we now have fifteen totally non-academic kids, and fifteen college-bound egg-heads.  Mr Speaking, if he used poor methods, could get a double-peaked curve:  a bunch of scores clustering in the C range, and another bunch in the A range, with fewer Bs and Ds.

Third, instruction can– and does– make a massive difference in scores. Remember what happened when we trained our Samoyeds to give them mad fetching skillz, yo? Every Dog got better. If Mr Speaking gave the kids a text, said “here, learn it yourself,” then put his feet up and did Sudoku on his phone or read the newspaper for a year (I have a T.O.C. who comes in and literally does this), his kids would basically suck at the language (our curve just sank down).  On the other hand, if he used excellent methods, his kids’ scores would rise (curve goes up).  Or, he is awesome, but gets sick, and misses half the year, and his substitute is useless, so his kids’ scores come out average.  Or, he sucks, gets sick, and for half the year his kids have Blaine Ray teaching them Spanish, so, again, his kids’ scores are average:  Blaine giveth, and Speaking taketh away.

“Fine,” says the learned Defartment Chair, “Mr John Speaking is a great teacher, and obviously his students’ scores are high as a result of his great teaching, but there should still be a greater range of scores in his class.”

To this, we say a few things:

a)  How do we know what the “right” variability of scores is?  The answer:  there is no way of knowing without doing various kinds of statistical comparisons.  This is because it’s possible that Mr Speaking has a bunch of geniuses in his class.  Or, wait, maybe they just love him (or Spanish) and so all work their butts off.  No, no, maybe they are all exactly the same in IQ?  No, that’s not it.  Perhaps the weak ones get extra tutoring to make up for their weakness. Unless you are prepared to do– and have the data for– regression analysis, you are not even going to have the faintest idea about what the scores “should” be.

b)  Score variability is reduced by effective teaching.  There are zillions of real-world examples where appropriate, specific instruction reduces the variation in performance. Any kid speaks their native language quite well.  Sure, some kids have more vocab than others, but no two Bengali (or English-speaking) ten year olds are significantly different in their basic speaking skills.  95% of drivers are never going to have an accident worse than a minor parking-lot fender-bender.  U.S. studies show that an overwhelming majority of long-gun firearm owners store and handle guns properly (the rate is a bit lower for handgun owners). Teach them right, and– if they are paying attention– they will learn.

Think about this.  The top possible score is 100%, and good teaching by definition raises marks.  This means that all marks should rise, and because there is a top end, there will be less variation.
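A sketch of that ceiling effect, with made-up before-and-after scores:

```python
import statistics

# Hypothetical class scores before good instruction (invented numbers).
before = [55, 62, 70, 78, 90]

# Good teaching pushes everyone up, but 100% is a hard ceiling.
after = [min(100, s + 25) for s in before]

print(after)  # [80, 87, 95, 100, 100]
print(round(statistics.pstdev(before), 1))  # 12.2
print(round(statistics.pstdev(after), 1))   # 7.8 -- less variation
```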

Most importantly,  good teaching works for all students. In the case of a comprehensible input class, all of the teaching is working through what Chomsky called the “universal grammar” mechanism.  It is also restricted in vocab, less (or not) restricted in grammar, and the teacher keeps everything comprehensible and focuses on input.  This is how everyone learns languages– by getting comprehensible input– so it ought to work well (tho not to exactly the same extent) on all learners.

Because there is an upper end of scores (100%), because we have no outliers, and because good teaching by definition reaches everyone, we will have reduced variation in scores in a comprehensible input class.

So, Mr Speaking’s response to his Defartment Head should be “low variation in scores is an indication of the quality of my work. If my work were done poorly, I would have greater variation, as well as lower marks.” High marks plus low variation = good teaching. How could it be otherwise?

In a grammar class, or a “communicative” class, you would expect much more variation in scores.  This is because the teaching– which focuses on grammar and/or output, and downplays input– does not follow language acquisition brain rules.  How does this translate into greater score variation?

a) Some kids won’t get enough input– or the input won’t be comprehensible enough– and so they will pick up less.  Now you have more lower scores.

b) Some kids will be OK with that.  Some kids won’t, and they’ll do extra work to catch up.  Result: variation in acquisition.  Now, there will be a few high scores and more low ones.

c) Some kids will hate speaking and so will do poorly on the speaking assessments, which will increase variation.

d) Many kids don’t learn well from grammar teaching, so in a grammar-focused class, you’d expect one or two As, and a lot of lower marks.

e) if the teacher is into things like “self-reflection on one’s language skills and areas for growth” or such edubabble and the kids are supposed to go back and rework/redo assignments, things could go either way.  If, for example, they re-do a dialogue from the start of the course at the end, they might– if the vocab has been recycled all year– do better.  If, however, it’s the check your grammar stuff, you’d again expect variation: only a very few kids can do that, even if their language skills have grown during the year.

And, of course, there is the “grammar bandwidth” problem: any effort to focus on a specific aspect of grammar means that other areas suffer, because our conscious minds have limited capacity. A District colleague told me that, for Level 5 (grade 12) French, the kids self-edit portfolio work. They have an editing checklist– subject-verb agreement, adjective agreement, etc– and they are supposed to go and revise their work.

The problems with this, of course, are two: in their mad hunt for s-v errors, the kids will miss out on other stuff, and we know that little to no conscious learning makes it into long-term memory.

Some real-life examples of how good instruction narrows variation  in scores:

At Half-Baked School, in the Scurvy School District (names have been changed to protect the guilty), TPRS teacher Alicia Rodriguez has Beginning Spanish.  So does her Defartment Chair, Michelle Double-Barreled.  When, at the end of the semester, they have to decide on awards– who is the best Beginning Spanish student?– Alicia has 16 kids getting an A, another 12 getting a B, and two getting a C+.  None fail.  Michelle Double-Barreled has one kid getting an A, a bunch of Bs and Cs, a couple of Ds, and a few failures.

What this means is, 16 of Alicia’s kids can

a) write 100 excellent words in Spanish in 5 min, on topics ranging from “describe yourself” to “describe [a picture].”

b) Write a 600-1000 word story in 45 min.

Both will have totally comprehensible, minor-errors-only Spanish.

Michelle Double-Barreled, on the other hand, has one A.  Her “A” kid can

a) do grammar stuff

b) write a 100-word paragraph on one of the topics from the text (e.g. shopping, eating in restaurant, sports s/he plays, family).

This will be not-bad Spanish.

Now, who’s doing a better job?  Alicia has more kids doing more and better work.  Michelle has a classic bell-curve distribution.  According to Mr John Speaking’s Defartment Chair, Mrs Double-Barreled has a “normal” range of scores.  Yet Alicia is clearly getting her kids to kick major butt. Hmm…

The point is, with appropriate and effective instruction– good Dog training, or good Spanish teaching– we are going to get a cluster of generally higher scores.  Poor or no teaching might produce something like a bell curve.

So…what does T.P.R.S. and other comprehensible input teaching do for student outcomes?

In my class, T.P.R.S. did the following

a) all scores rose.

b) the difference between top and bottom scores (variation) decreased.

c) I.E.P. kids all passed.

d)  First-year kids in second-year classes did about 85% as well as second year kids, despite having missed a year of class.

e) In terms of what the kids could actually do, it was light-years ahead of the communicative grammar grind.  Kids at end of 2nd year were telling and writing 400-600 word stories in 3-5 verb tenses, in fluent and comprehensible (though not perfect) Spanish.  Oral output was greater in quality and quantity too.

f) Nobody failed.

My colleague Leanda Monro (3rd year French via T.P.R.S.) explains what T.P.R.S. did in her classes:

“[I saw a ] huge change in overall motivation. I attribute this to a variety of well-grounded theories including “emotion precedes cognition” (John Dewey), Krashen’s affective filter, and the possible power of the 9th type of intelligence, drama and creativity. (Fels, Gardener).   There is a  general feeling of excitement, curiosity, eagerness to speak French, incorporation of new vocabulary, spontaneous speech.

All but one student has an A or a B. The one student in the C range has significant learning challenges, and despite excellent attendance in all courses is failing both math and English. No one is failing.

[There was] far less variation. Overall, far greater success for all students. My contribution to the “Your scores are too high” comment is this: As educators we need to pose an important question:  Are we trying to identify talent, or are we trying to nurture and foster talent?  T.P.R.S. works to nurture and foster.”

And here are Steve Bruno’s comments on the effect of T.P.R.S. on his kids’ scores:

“I now get more  As and Bs [than before]. A few C+s and very few Cs. Let’s put it this way, in the past I’ve had to send between 20 and 25 interims/I reports (total 7 classes); this year, so far, I’ve sent just THREE! Of these, two had poor attendance; the other one is an L.A.C. student who is taking a language for the first time (Gr. 9).

Marks are also closer together.  Anyone who has been teaching C.I. will understand why this is the case:  Students feel more confident, less stressed and simply love T.P.R.S. They don’t like doing drills, or memorizing grammar rules, etc.

Here’s another example [of how comprehensible input has changed student behaviour].  Last year, I had a few students who on the day of an announced test (usually, with one week of warning) would suddenly become ill, and/or skip. Some of my L.A.C. students would have to write the test in a separate room. Others would show all sorts of anxiety as I handed out the tests. Many of these students would end up either failing the test or doing very poorly.

This year, I have the same students in my class, and the day of an unannounced test, they don’t have to go to another room, nobody complains and they just get down to it and, yes, they do quite well, thank you very much!

OK, people…you want to report on how things are going with T.P.R.S.? Post some comments, or email.


What are the pros and the cons of A.I.M.?

I was recently chatting with a couple of Vancouver teachers who used to use the Accelerated Integrative Method (A.I.M.) of language teaching.  A.I.M., developed by Wendy Maxwell, is both a method and a program.  It begins  with “total immersion”: the teacher speaks only the target language in class, and uses gestures to support meaning.  Students are expected to speak from Day 1, and to also use the gestures.  There is reading, some grammar instruction (not a ton), and the whole thing is built around a set of stories, which are read, listened to, acted, watched, acted with puppets, etc, as well as responded to.  Oral output is rehearsing a play, which is performed at the end of the year/semester.  They have some reading materials.  The curriculum is super-structured:  you need to “do” all the stories in order to perform the play and they have very detailed lesson plans (and procedures) starting day 1.

Now, I have not used A.I.M.– I found out about it at the same time as T.P.R.S. and the latter intuitively appealed to me more– but I get asked a lot about what I think.  So since I can’t speak for A.I.M., I’ll let Catherine and Natasha explain what they did and didn’t like about it:

Natasha:

  • used AIM for about 2 years for French
  • liked the intense “immersion” it offered– lots of French spoken in class and the T.P.R. (total physical response– words accompanied with gesture) aspect
  • initially appreciated the rigorous structure: it was “easy to start” and there was no need to copy/borrow/adapt “materials” and “resources” from others.

Natasha abandoned A.I.M. and here is why:

  • the TPR was only superficially and initially useful and eventually became a pain in the butt.  Students also generally refused to do it.
  • TPR created problems with ambiguity, and errors fossilised.  For example, if a gesture accompanied “walks,” Natasha found that they would keep using “walks” in the wrong place with the gesture (e.g. “we walks”).
  • the oral assessment– can the kids recite their lines in the play?– in her view was silly as it wasn’t even close to real language use.  She also noted that the performers didn’t always know what they were saying.
  • she found it very difficult to keep the kids focused on the stories, because they are the same in all their iterations.  E.g. they would listen to it, read it, watch it, act it out, act it out with puppets, etc.  There was, according to Natasha, no variation.  No parallel characters, student-centered improv a la t.p.r.s., etc.

Catherine also used A.I.M. for two years and repeated most of Natasha’s comments (both positive and negative), with a few of her own.  On the upside:

  • if the whole languages department in a school is using A.I.M., the transition between grades– i.e. “what should they know when they start grade ___?”– is very easy, as the curriculum is majorly locked in.
  • the theatre pieces in which each year or semester culminates are pretty cool to look at (and, if your school has the resources for costumes etc, can be a lot of fun to put on)

On the downside:

  • because the curriculum is so rigid, it inevitably leaves some students out.  If students have not acquired ___, the curriculum marches ahead anyway.
  • there is very little room for improvisation in stories
  • teachers with a creative bent will be severely limited, because the whole A.I.M. package is “unified” and one has to “do” or “cover” everything for the final goal– theatre pieces– to work.  This means that teachers’ ideas will have very limited room for exploration.
  • much of the introductory stuff is boring.  E.g. the class sits in a circle and the teacher says “this is a pen,” and “this is a desk,” etc.

(One of the interesting things for me was oral assessment:  A.I.M. uses “real” language– i.e. student-generated output– right from the get-go, but assesses something other than “real language” in the theatre piece, while T.P.R.S. uses “fake” language– acted-out stories with simple dialogue– but assesses “real” language– teacher interviewing the kids one-on-one.)

T.P.R.S. answers a few of these criticisms:

  1. T.P.R. is only (and optionally) used for awhile, and generally with true beginners
  2. The method is infinitely flexible.  We have Blaine’s “holy trinity” of story asking, PQA and reading…and we now also have Ben Slavic’s PictureTalk, Ashley Hastings’ MovieTalk, dictation…and even when we are using a “text” such as Blaine Ray’s Look, I Can Talk, or Adriana Ramírez’ Teaching Spanish Through Comprehensible Input Storytelling, we– and the KIDS– can change story details, locations, etc etc.
  3. The comprehension checks in T.P.R.S.– if regularly done– will provide super-clear feedback about whether or not students have acquired (on understanding level) whatever they are being taught.  If a teacher gets a weak choral response, or slow/poor responses from the actor(s), we go back, add a character, etc.
  4. There is no “end goal” in T.P.R.S.  If we are in the target language, and the kids understand, and we don’t overload them with vocab, they are acquiring.  Blaine Ray has famously remarked that he spent four months doing ONE story with his grade 9s.  We are not working toward an exam, a play, a portfolio.  All we want to do is tell the kids interesting fun stories with vocab we can repeat zillions of times.
  5. If a story is boring, we add a parallel character, or bail out and start another one, or throw something random in.  While we do want to stick to our structures, we can basically do whatever we want with them.
  6. If there’s ambiguity we just translate.

Another colleague, Katy-Ann, has this to say about A.I.M.:

“I loved using the AIM program!!  It was a lot of work at the beginning to learn all the gestures, but I found that it worked so well. I could speak French for the entire time with my 8’s, and the majority of the kids loved the way the program worked. At the end of the year the students were capable of telling a story (based on the play that we read) in their own words, with a partner. The activity was completely unrehearsed, and as the students alternated back and forth telling the story, they had to listen for details and continue on where their partner left off. Most groups talked back and forth in this way for a good 10 minutes. They were also capable of writing a massive story. I loved hearing them create more complex sentences and I could help them with the words they were stuck on without actually telling them the word. I could gesture and it would jog their memory. I found that this gave the students confidence. They were actually recalling things and not just repeating words back to me. At the end of the year the feedback from the students was overwhelmingly positive and the parents were very supportive of the method as well.

I’m a fairly animated teacher, so I felt comfortable making a bit of a fool of myself with the gestures, songs and games. My colleague and I collaborated a lot during the process and reworked the songs into raps to make them a little cooler. This style really suited my personality and I loved that I could actually stick to my French only rule in the classroom.  I haven’t used TPRS in the classroom and unfortunately I’m not teaching French this year, so I can’t really compare the methods. If I was teaching French (and I had some pull at my new school) I would totally beg to do the AIM program again with the jr French classes. I’m not sure how the older kids would react to it.

Anyways, I hope that this helps. I think that the program is AMAZING. The kit that my school purchased is called Salut, mon ami. I only got through one kit in the year, because we added in a couple things, but I would recommend two per grade – or if you are just starting, then one.  Of course there are some holes in the program, but the main thing that I noticed is that the kids were speaking in full sentences every day, they were successful and engaged. I could really go on and on about it because I’m a believer. I would totally take the seminar if you can. I did the three day course and by the end I knew it was for me.”

Anyway there you have it, some A.I.M. ideas.  Anyone with experience with A.I.M. please leave some comments.

Is Output Useful in the 2nd Language Classroom?

I’ve been arguing with Sarah Cottrell, Kari R. and others about the role of output in a FL classroom.  TPRS is primarily an input-based methodology:  students learn via hearing (and reading) comprehensible input.  So…should students in a TPRS class talk?  And, if so, when?

You can read Krashen’s views here.

My first answer is, talking is not useful– at all– when we start with beginners.  Most people don’t like talking at first, because they quite logically feel self-conscious because they know their accent is “wrong” or “off,” and because they know that they sound like a two-year-old.  There’s lots of research that suggests that talking in TL tops the list of “things I don’t like” in FL classrooms.

If Krashen is right about the affective filter– that people need to be happy and comfy (he says “open to input”) to learn a FL– then talking (for beginners) is totally the wrong strategy.

Output (from most learners) is often flawed, which means, bad input.  Ask any Spanish teacher what happens when you first teach “gustar” (“to be pleasing” or “to like”).  When you ask “¿te gusta ver la tele?” the kids answer with “Sí, te gusta ver la tele.” (Do you like watching TV? Yes, you like watching TV).  This is a classic beginner mistake.  Now, we understand (and so does the kid doing the listening)…but what they need to hear– “Sí, me gusta…”– they don’t.  Why would we encourage poor modeling?

Bill VanPatten (2003) argues that language acquisition happens only when a learner is processing input. “Output,” he writes, “is not a substitute for input, which must come from other speakers.”

We might say, “poor modeling, fine, the kids will pick it up eventually.”  Sure…but that’s a waste of time; why dither?  If we can provide quality input via stories and reading, acquisition is much faster.

Another problem with output exercises– the “communication gap” activities that communicative teachers use, where kids are supposed to use (and thereby acquire) the TL to acquire information they want– is that they’re boring, and they can be bypassed via L1 use.

If you are teaching “to like,” and you tell the kids “OK, look at the list of things in your book on P. ___, and ask your partner what they like,” a lot of them will just point and say “do you like ___?” in English rather than saying “Aimes-tu les chats?”  Even if they do use the TL, they’ll only do it once or twice, cos, let’s face it, this is a boring activity.  Which slows the acquisition even more.  Why would you use the target language if you don’t have to?  And everyone knows it feels fake using a non-native language with other learners.  Plus, this turns the teacher into a police officer dutifully patrolling the class for TL, hounding the bad English away.  Fair enough…but don’t we have better things to do?

So if early output = bad modeling and slow acquisition, is there ever a place for output?  And we are talking about output other than yes/no answers, one-word answers, or scripted story dialogue.  I think there is…but under some conditions.

a) Output must be perfect.  If a kid says it in class, and there are any errors at all, it has to be immediately recast by the teacher into perfection, and then circled.

b) Output must only come when students want to do it.  It must emerge organically.  Forced output is not language– it is drama, recitation, what VanPatten calls “language-like behaviour,” but it’s not language.  Language is what you say and understand without having to think.

With beginners, this is fairly easy: we allow only super-simple perfect output from the willing initially.  With my last batch of beginners, the first sentence they learned was Rochelle juega fútbol con David Beckham en Los Angeles.  I circled that (and a parallel sentence: Breleigh baila en Cork con Seamus Ennis.).

I then wrote “juego = I play” on the board and asked Rochelle “¿Juegas fútbol con David Beckham?” and she was able to answer by reading off the board– I pointed to “juego” and then “fútbol con David Beckham.”  I then did a pop-up: “what do the “a” and “o” on “jueg-” mean?”  Note that I could do this because it was obvious from the first minute that Rochelle was an outgoing kid who was eager to talk (as were her friends Jasmin and Rasna, and a boy named Fahim…but they were the only ones).  Breleigh, however, wanted nothing to do with talking– she was OK answering questions chorally and in English– but she did not want to say anything in Spanish.  Fair enough.  After a few weeks, she started to want to answer in Spanish.

So I set it up so that the only thing they could say was meaningful to them, and perfect– they simply had to read off the board.  If they wanted vocab specific to them, I wrote it on the board with a translation.

Sarah Cottrell said to me on Twitter “My students don’t want to have language; they want to use language.”  Great…but how does quality input come from learners?  Your 4th-year kids probably won’t make errors with “te gusta?”, which they learned in first year…but they’ll inevitably make interlanguage and other errors with the subjunctive or whatever they have recently started seeing.

I would do some or all of the following with upper-level kids:

a) limited discussion (with teacher recasts) about texts or images

b) assignments where students have to interact with native speakers and document the results (e.g. go find a Spanish speaker, and interview them, and record questions and answers with your phone or camera).

c) encourage kids to go and do stuff with the language outside of class.

Some teachers say “don’t kids in TPRS classes get bored just listening?” and I respond with “not if what they are listening to [and reading] is interesting.”  We personalise stories, we do simple Q&A about kid interests, we weave kids into stories, and we do readings about teen characters with real problems that kids care about, and we use our sense of humour above all.  We use stories because they are the most universal and oldest and most compelling form of packaging communication, period.  We always want to find out what happens in a story!

Merrill Swain often comes up here.  Swain essentially says two things: that output is important in acquisition because it “provokes” comprehensible input, and that output can make speakers aware of errors and problems, and in their desire to “fix” these problems they will acquire some language.  There are a few problems with Swain’s ideas.

Say I am in France and I am hungry, and I walk into a boulangerie.  If I stammer and point to a loaf of bread and say “Je veux acheter un…un…un…”, the boulanger is probably going to say “baguette.”  A classic communication-gap scenario.  And I used metalinguistic strategies:  I pointed, I said “uhh” (or “euh”) over and over, etc.

Great.  I provoked output from him that became comprehensible input for me.  But there are two problems.

a)  how am I going to remember “une baguette”?  Evidence suggests that I am going to need to hear (or read) it 20-50 times to get it hammered into my long-term memory, and more to be able to spit it out without thinking.  How are communication gaps going to do that?  This might work in an immersion environment– if the situation comes up every day, I’ll eventually pick up baguette— but we don’t have that kind of time in class.

b)  when I get input from the baker, I am getting perfect native-speaker French.   This is not what happens in a classroom, even one full of motivated, experienced and attentive students.   No matter how good the activities etc, students are still getting impoverished, error-filled input from other learners.

Swain isn’t wrong…but her theory, properly speaking, addresses learning conditions, not actual acquisition.  If I want to acquire “baguette,” I need to hear it and read it over and over– and be focused on it– otherwise my encounter with Monsieur Boulanger will be a one-off that will find me scratching my head next time I want bread.

The second part of Swain’s theory– that output will increase the speaker’s self-awareness of problems in their grammar, vocab and pronunciation, etc, and that this will lead to acquisition– is also problematic.

Let’s say I am in India and I want some water in a restaurant.  I ask my server kanna he? and get a puzzled response.  He nods and brings me the menu.   But I don’t want a menu– I want water.  I must have screwed up.  But how did I screw up?  Did I get the word for “water” wrong?  Is my pronunciation off?  Did the guy not hear me?  Does he not speak Hindi?  (Perhaps he is a Malayalam or Bengali speaker.)  As it turns out, I have the word for “water” wrong– it’s not kanna but rather paani.

First, it’s not clear what the problem is.  I know there’s a problem…but what is it?  As an adult language learner, with developed metacognitive skills, I can figure it out: wrong word.  Could a 15-year-old?  A tourist with little interest in Hindi?  What if the problem is more complex– say I get a verb tense wrong, or I forget to add a crucial postposition?  Concrete nouns are easy to figure out– point and ask with raised eyebrows– but how about complex grammar?

Second, even if I figure out that I got the word wrong– and I managed to ‘rescue’ the conversation by pointing to a water fountain or bottle– I will probably, as in the boulanger example above, get one mention of the word paani and that’s it.  This might work in an immersion environment, but in a classroom this is not a viable strategy.

This, to me, is the essence of the problem of language teaching: how do we provide comprehensible input that is compelling enough to bear the repetition students need to acquire it?  Of everything I’ve seen and tried– natural approach, T.P.R.S., Grammar Grind, communicative-experiential, audiolingual, “eclectic”– T.P.R.S. is the best solution to the problem, outside of an immersion environment, because it allows us to provide masses of comprehensible input that is personalised and interesting (via the personal and/or weird details, and because we use stories).