The Medium

Talking about language is already tough. Try discussing a brand-new language via Skype with two hearing linguists, plus a third, who happens to be deaf, via text, and see what you learn.

Langdon Graves, Francine the Machine, 2013. Courtesy the artist and VICTORI CONTEMPORARY.

Fact-checking an article normally doesn’t make for a compelling scene. I sit in my home office, usually at my desk, a laptop open next to the phone handset, with the speaker on. For some reason I imagine my source on the other end cuddling a dog; I don’t know why, perhaps because I want them to be comfortable, perhaps because this seems to take place in the evening, when they should be at home. Once my source is comfortable, my normal process is to read the story aloud, as much of it as the source wants, with the proviso that we’re checking dates, spellings, quotes, paraphrases of ideas. Analyses and conclusions are my responsibility, I tell them. I was gratified to read in John McPhee’s latest piece in the New Yorker that he does this too with science stories—there’s simply too much to get wrong, and checking with such thoroughness, beyond the boundaries of merely someone’s quote, always makes the piece better. When people ask why I can’t just send them the text, I explain that this seems too much like ceding control, which makes a writer nervous. Like he’s doing public relations, not journalism.

This spring, I experienced a fact-checking session with three linguists that was so remarkable, it showed me I should always be paying more attention to every session. Because this particular session was about language at all levels, I experienced an off-the-chart dopamine response. Oh, this is why I do what I do. It was also like a moonshot, logistics-wise, compared to most read-back situations I’ve done. And this session turned an ordinary conversation inside out so you could see all its ribs, but things were also so deliberately complicated that it no longer worked like an ordinary conversation.

And it all started because I couldn’t get the video to work on my Skype.


I was wrapping up an article about “young” sign languages, only three or four generations old, which spring up all over the world, mainly in isolated villages where there’s a high prevalence of deafness. Over time, many people in the village come to have a deaf person in the family. Because these places usually have little contact with the national educational system, there’s no access to the national sign language, so people make signs up in their homes. Over a couple of generations—not very long in the life of a language—these signs become systematized for the whole community.

A graduate student of linguistics at the University of Texas at Austin, Lynn Hou, had just received funding from the National Science Foundation and the National Institutes of Health to research one of these languages in Oaxaca, Mexico. It’s called Chatino Sign Language, so named because it is used in a few villages in an area where the Chatino people live. Lynn is one of the few deaf field researchers who work on these new sign languages, so her involvement was additionally noteworthy. For my article, I had interviewed her, using Google Chat.

I also interviewed Lynn’s collaborator, a graduate student named Kate Mesh, as well as Hilaria Cruz, a linguist who uses home sign with deaf members of her family and who alerted Lynn to the existence of Chatino Sign Language. I’d spoken with each of them on the phone.

Reading back the piece to Kate and Hilaria would be easy enough. But what was it going to mean to “read it back” to Lynn? The obvious option: Send her the text of the story. I felt I could make this accommodation. In the end, I didn’t have to. Lynn, Kate, and Hilaria decided that they wanted to do the read-back at the same time, sitting in the same room. I’m leery of committees, but I agreed, mainly because I thought it would be more convenient for everybody.

This could work only because Lynn had arranged for a real-time captioner named Rabin´ Monroe, who is contracted through the university. Trained as a court reporter, Rabin´ sits with deaf and hard-of-hearing students during their classes and takes down everything in the acoustic environment — speaking, sneezing, doors slamming, coughing — on a stenography machine, which outputs via a cable to her laptop screen. For Lynn, the text was set to scroll in a large-sized, light yellow sans-serif font on a blue background. At the end, she gets a verbatim transcript of everything that Rabin´ typed. Though she was sitting with them, part of the captioner’s job is to be invisible; she’s not supposed to insert herself into any part of the interaction, even if it’s falling apart. She was so good at this that I didn’t know until long after the interaction was over that she had, in fact, been in the room with Lynn, Kate, and Hilaria—I’d assumed she, like me, was off-site.

In order for Rabin´ to do her job, she had to hear me, so Lynn had her laptop, which had Skype. And since Lynn already had Skype open on her machine, she and I had a chat window open as well.

You can already see how this was going to shape up. In case you’ve lost track, this is what we were dealing with: We were located in two places, and between us there were three laptops and one stenography machine. We were working in two languages (English and American Sign Language, or ASL) and across three communication channels (voice, sign, and text). They were sitting at a rectangular table, all on the same side: first Hilaria, then Kate, then Lynn, then Rabin´. That made five of us, four of whom brought constraints to the situation, ranging from the permanent to the temporary: Lynn is deaf, Hilaria is a non-native speaker of English, Rabin´ is supposed to be silent and invisible, and I couldn’t see, because I had no video on my Skype.


I’m not going to say that our execution was perfectly frictionless, but it’s worth noting that 20 years ago this conversation would have been cumbersome to the point of impossibility. Even if the telecommunications had been cheap, we didn’t have accessible, quality keyboard and screen technologies. The real-time text channel wouldn’t have been so easily available, either. Before the Americans with Disabilities Act was passed in 1990, people didn’t have the leverage to get institutions to pay for accommodations like real-time captioning. Without captioning, the article would have been translated into ASL by an interpreter. (And really, this would have been unacceptable—you wouldn’t translate the article into Spanish or some other language to be checked, so why ASL?) Even meeting as a group for a teleconference probably wouldn’t have been so easy to execute. Nowadays we think nothing of talking in groups mediated by some technology, whether it’s work or family groups, but two decades ago, this option wasn’t exactly top-of-mind in the culture of communications. The three linguists I was working with were old-fashioned in insisting on being in the same room rather than jumping on a conference call from their individual offices or homes.

A read-back isn’t a performance, me showing off a finished product to an audience; it’s an activity we do together to make a text better, which can only happen when the sources interrupt my reading as soon as they hear something they want to flag, correct, or discuss. This requirement is like deliberately breaking a conversation. Some people are too polite to interrupt, and I don’t blame them; most people in my middle-class, educated, English-speaking corner of the world are trained to avoid interrupting, as a rule. (Not so where Hilaria comes from, as I learned.) So, at the top of a read-back, I always give explicit instructions about interruptions, which Rabin´ captured in her transcript:

MICHAEL: And…I—I—so…one part about this process, too, that in order for it to work in the way that—you know, in order for the corrections to get made, you have to violate one principle of human conversation, which is you have to interrupt.

KATE: OK. So if we want to do that, Lynn, do you wanna just wave for a second and we’ll tell Michael “stop, stop, stop,” and you can type?

(Lynn signing.)

KATE: Lynn’s gonna type this.

LYNN (typing to Michael): Yes, we’ll interrupt you when one of us waves to one another, and then Kate will explain what’s the interruption is about.

KATE: Hilaria, can you see that?

HILARIA: Yes, that’s fine. That’s fine.

KATE: We have three modalities goin’ on.

HILARIA: Yeah. That’s fine. I can see that.

MICHAEL: OK. Yeah, I’m lovin’ that multimodality of this.

KATE: So Lynn’s asking do I want to…well, she’ll type it up here.

LYNN (typing): Kate can warn you that somebody is interrupting and then you can pause… does that sound OK?

MICHAEL (typing): yes.

(Michael coughed.)

KATE: OK. Perfect. So no matter who wants to interrupt at any point, we’ll make some kind of visual signal to one another here, and then I’ll say, “OK, we’re interrupting, Michael.”

MICHAEL: OK. And then I’ll pause.

KATE: Super. Thank you.

This transcript makes for an interesting record, not only of my own fluency (as if I needed to be reminded of its spottiness), but of the negotiations that everyone does to keep things on track. In an ordinary conversation, all these negotiations are hidden, immediate, and automatic, at least between highly proficient speakers of the same language. But in this conversation, we kept tripping over basic elements. Who’s going to talk when? How fast should anyone talk? What’s the duration of a silence that means affirmation? How will Michael—who can’t see anyone in the room—know if someone has a comment? From whom is a comment really coming? Just because it’s spoken doesn’t mean it doesn’t come from Lynn; just because it’s text doesn’t mean it does.


Back in the 1950s and ’60s, some illustrious researchers in linguistics, psychiatry, and anthropology set out to analyze some minutes-long segments of film that featured, in various configurations and interactions, a woman named Doris, her son Billy, her husband, and an interviewer, the anthropologist Gregory Bateson. It was all known as the “Doris” film. The researchers were hoping to extract the basic principles of human interaction (and, at a more basic level, create guidelines for psychiatric interviews). Bateson called it “the natural history of an interview,” because the “Doris” film showed “the natural history of two human beings over a brief span of time” and because “a minimum of theory guided the collection of the data,” leaving it “uncorrupted by theory.”

Those were heady days for social scientists, the boom years of American higher education, when they were drunk on research funding and eager to shake off the influence of Freud in favor of cybernetics and linguistics. As Bateson put it, they were interested in questions like, “What signals are emitted, and what orders of awareness does the signaler show by emitting other signals about these signals? Can he plan them? Can he recall them?” The researchers collected and collected, sifted and sifted, finding ever deeper layers of structure and signification in the “Doris” film, and as they did, the bigger the project became, because as more researchers came on board, they saw more stuff to explain, which meant bringing on more researchers. I love to imagine a collaboration that ended up growing and splintering until it spilled out of people’s offices, down the hallways, all the way to the campus pub, where the best conversations were being held. Eventually this giant cocktail party collapsed under its own weight and the findings (which filled five volumes) were never officially published as a book or scholarly articles.

As this effort showed, even the most ordinary of verbal interactions are always more complex than we think. In part, it’s because we’re so good at conversing (or most of us, anyway), that we’re not aware of the jockeying, stalling, and playing along that goes on. A surprising amount of what we mean isn’t in the words. It’s in our eyes, our hands, our posture. And a lot of what we’re doing, both with the words and things like eye gaze, is managing the interaction itself. The fascinating thing is, we usually don’t know we’re doing it at the time, and we can’t remember it. Yet it’s entirely learned.

When you don’t have that conversational infrastructure, the interaction becomes sort of like rowing a boat with a bottom made of plastic bags. Every time you propel the boat forward, you risk putting a foot or an oar through the bottom and sinking the whole thing. What was interesting about the fact-checking session was that we knew all that. We knew we were going to have difficulty, we tried to adapt, we explicitly stated some of the parameters (such as how fast I would read or how we were going to interrupt) and still there were moments of tension, even though Kate, Lynn, and Hilaria have a lot of practice communicating together. It was multilingual, multimodal, multichannel, and multidevice—but “multi” didn’t necessarily mean “more” in all regards.

I showed the transcript that Rabin´ worked on to a linguistic anthropologist who has studied deaf-hearing interactions to help me understand some moments in the interaction. For instance, turn-taking. When people talk, they use a variety of cues to know when one person’s turn is ending and the next person can start talking. (For people who are signing, turn-taking has to be signaled visually.) But it wasn’t easy to get Lynn’s attention, which was split between two screens, or mine, because I couldn’t see. Later, Lynn told me that there’s a slight delay in the text display, which is another challenge. “By the time I see what’s going on, the turn-taking has already started or even has been completed with one person and moved on to another person.” (And Rabin´ pointed out that the captioning scrolls quickly—you either catch it or you don’t.)

Because of the time delay, among other factors, Lynn wasn’t fully and perfectly incorporated into the speaking turn-taking either. “Generally,” Lynn told me later, “most hearing people are not accustomed to checking in before they speak up.” (But she said that she and Kate are sensitive to issues of inclusion and accessibility.) The anthropologist also pointed out that the text chat between Lynn and me wasn’t integrated into either the signing or the speaking turn-taking. Even though Lynn and I had quite a bit of back-and-forth—she interrupted me, I explained, she clarified, I repeated, she asked me to continue—Kate knew this better than Hilaria did. (Both could certainly see—and hear—Lynn typing.) Though there were two conversations going on, the one happening in voice was considered the main one. “They almost treated [the conversation that you and Lynn were having] as something that was going on in the background,” the anthropologist said.

Now, the transcript that Rabin´ produced did capture one moment that showed that Kate and Hilaria must have been looking at Lynn’s screen for at least part of the conversation:

MICHAEL: Sure. Yeah.

LYNN (typing to Michael): [She agrees with Hilaria about a problem with the beginning of the article, where I’ve talked about the “discovery” of this language.]

(Michael sneeze-coughed.)

(Michael typing.)

MICHAEL (typing): [I tell Lynn that I know why this is problematic and can try to change it but I can’t promise anything.] I can go in with certain intentions and intellectual and ethical commitments. I don’t say that to beg powerlessness but to say that it’s a long game

(Everybody laughing.)

(Michael sneeze-coughed.)

(Michael typing.)

LYNN (typing): Yes, I realize that the editors seem to have the final say in how the article will be written and published. [She nodded to signal to Kate and Hilaria that she was done typing.]

KATE: Wonderful.

We were lucky that Kate offered to play the conversational manager across voice and sign, but because she also had a part to play in the conversation, it was hard to keep the roles straight. We needed another signal for when she was switching roles. “I think it’s a great example of how habituated we are to the way we orchestrate interactions with other people and make turn-taking unproblematic for the most part,” the anthropologist said.

I asked her if people who study conversations are themselves good conversationalists. The anthropologist laughed. “Not usually. The people who study conversational structures are more aware of them. But that doesn’t make them necessarily change their interaction skills.”


For a while during our fact-checking session, it looked like things were going to work smoothly. Sometimes Kate and Hilaria spoke to me via voice; I suppose (because I couldn’t see) they were turning their faces so that Lynn could read their lips. Sometimes Lynn was reading their words transcribed by the captioner. Sometimes Kate and Lynn were signing in ASL (invisible to me and incomprehensible to Hilaria, who doesn’t know ASL), or Kate was speaking and Lynn was reading her lips; at other times, Lynn was signing and Kate was voicing the sign so that the utterance would end up in the transcript. Sometimes I was typing with Lynn, and she was sharing the screen with Kate and Hilaria. Mostly I spoke my responses.

It wasn’t until we were connected via Skype that I learned just how fast Lynn types. With the voice channel open I could hear her keystrokes, blazingly fast, very few backspaces, and then, plink, her text appeared on my side, perfectly spelled and punctuated. I also hadn’t anticipated her voice. At one crucial back-and-forth, Lynn said something to Hilaria and Kate. It was like getting an unexpected handwritten note from someone you’ve known, up to then, via email.

It became clear that there was a sixth participant: the text of the article I’d written. While the humans were slipping around, negotiating this and navigating that, the article itself was the anchor to which the whole thing was tied—but without the slipping and sliding, the text wouldn’t have been improved.

Hilaria had the first correction. I was at the end of my article’s second paragraph when:

KATE: We have a visual interruption here. So one second. We’re gonna hash out what’s goin’ on.

HILARIA: I take an issue with that…with the statement of “linguists discover new languages,” because I feel that languages existed there, and it’s not only like linguists are like…you know…Indiana Jones, just kind of discovering something. Because I feel that these are languages that communities—people in the communities use to communicate with one another, and here comes the—a savior, the linguist, who comes and discovers this new language. I have a problem with that.

Once we cleared that up, she and I joked a bit:

MICHAEL: OK. Good interruption, Hilaria.

HILARIA: Well, thank you. I come from a culture where we interrupt freely.

A little later, we hit a bump. An interesting bump, but a bump. The topic was the age of Chatino Sign Language. Both Kate and Lynn had told me that the oldest signer was a woman in her 50s, from which I had extrapolated that the language itself was no older than she was. But they pointed out that they knew only that the woman was the oldest living signer, not that she was the first. It was possible that the sign language had had an older form that died out but was resurrected. Possible but unknowable, like much about languages with no written record.

MICHAEL: So then in the next paragraph, it starts, “Because Chatino Sign Language is probably only two generations old,” that’s not correct. You don’t know how many generations old it is.

KATE: Um… yeah, we don’t. That’s right. Let me interrupt and add a comment. [Here she was speaking for Lynn, who was signaling that she wanted to type something.]

LYNN (typing to Michael): It is possible that Chatino Sign Language may be older, but the older deaf signers died before the language could be transmitted to the current signers…Or the language went dormant and re-emerged with the death and birth of deaf signers.

(Michael’s typing is very loud.)

MICHAEL (typing): Could i say that “chatino sign language in its current form is probably only two generations old”?

LYNN (typing): I know that the Central Taurus Sign Language goes back about seven generations, but there weren’t always deaf people in every generation. It skipped one or two generations at some point, so only the hearing people remembered the sign language.

KATE: Lynn is looking at your comment and saying “yes,” and she thinks that’s nice.

LYNN (typing): Yes. That’s nice.


KATE: So the complexity that won’t make it into the article is that right now we’re not sure how the sign system has influenced the gestural system of the surrounding area. So we don’t know whether having deaf signers in this area has gotten—has introduced changes to the co-speech and non-speech-linked gestures. So one—

MICHAEL: Oh, my god. That…

LYNN (typing): Can you please repeat the last paragraph please? [Then she turned to ask Kate in sign if “the whole thing” had been read.]

KATE: [Who doesn’t understand what Lynn means by “the whole thing”] Did I finish reading your comment, Lynn?

LYNN: No, [did he finish reading] his article. [Then she signs: “Did he read the entire paragraph to us?”]

KATE: No, he’s not finished.

LYNN: OK. [Rabin´ explained to me later that this was an instance where Kate voiced a sign from Lynn, which Rabin´ attributed to Lynn as if she’d spoken it.]

LYNN (typing): OK, can you please repeat the question aloud?

As John McPhee would tell you, the best reason to do read-backs is that they turn up material that the interviews, for whatever reason, didn’t. When I interjected with “Oh, my God,” I had just realized that the evolution of Chatino Sign Language might have taken another path. (Looking at this transcript now, I am abashed to realize I never picked up the topic of sign languages preserved by hearing people, which is interesting.) From the interviews, I had learned that speakers in that part of the world use a lot of manual gestures called “emblems” as they’re speaking. These emblems can parallel speech or even replace it. One example is a thumbs-up gesture, which can be inserted into an American English sentence smoothly, like this:

That cheesecake was totally [makes thumbs-up gesture]

Most speakers wouldn’t notice that they’d gotten a message in two modalities. Chatino Sign Language uses many of these emblems as signs, so when we got to talking about discontinuous transmission of the language, I realized that an earlier form of Chatino Sign Language could have survived as the emblem repertoire. I voiced this possibility:

MICHAEL: So is it not also possible that the emblem repertoire that you’re looking at is… not a… co-speech… system? As a lingua franca? “Lingua franca” being my word, not your word, but…. But some kind of remnant of an earlier signed language?

KATE: It’s very unlikely….

Kate answered the question before Rabin´ had finished transcribing and before Lynn could read my question. And this is when four linguists talking through an article about language in a communicatively complicated situation broke down. Not irreparably. And it wasn’t over the fact-checking interruption. It was a normal interruption, which means that people were speaking out of turn. It wouldn’t have happened this way if I’d had a video link, or if Lynn and I weren’t working in the text channel that was apart from the main voice channel.

Because I couldn’t see, I didn’t know what happened, but all of a sudden:

KATE: I’m sorry. [She’s apologizing because she responded before Lynn could read the question.] I interrupted. And I should have allowed…. I interrupted. And I should have allowed you to think about it for a second. I’m sorry. The timing is off.

LYNN: Do you mind letting me read what is typed before you respond, Kate?

KATE: You’re right. I should. Go ahead.

MICHAEL (typing): My question about the emblem repertoire wasn’t in the story.

LYNN (typing): OK, can you please repeat the question aloud?


MICHAEL (typing): let me keep reading, because the question about the emblem repertoire will come up again.


This moment gives you a very tiny look at the communication slippages that deaf people have to navigate in mixed situations all the time. “It’s really difficult for deaf people, if they’re the only deaf person and everyone else is hearing, to hold their ground in terms of getting and keeping the floor, and in terms of organizing the turn-taking,” said the anthropologist. You don’t have to be an anthropologist to know that people who are deaf get put in the position of having to be very assertive to make sure they’re not excluded—even to the point of standing up to the people who are helping to manage aspects of the turn-taking, as Lynn had to with Kate. Lynn and Kate work together a lot, so I assume they have a rapport that allowed them to bounce back from moments like these. Afterwards, I circled back to Lynn to ask if she’d been as amazed by the mechanics of the read-back as I was. She said that the multichannel, multimodal, and multilingual aspects were all routine for her—as were the communication gaps, she said, which happen more than she’d like.

I wish I could have layered the final article with the experiences of the read-back, how we’d all gotten online, four linguists, and had an amazing linguistic experience working together on this article about these really interesting languages. Everywhere you turned, there was something linguistic to talk about; the whole thing was about the urge to communicate and the complexities of language, all the way down.

I realized something about even the most ordinary read-back session for fact-checking purposes that I or anyone else does. Most of what I’ve read about fact-checking during various brouhahas in the last couple of years has focused on the boundaries of authorial discretion (and indiscretion), the shifting categories of the factual and the truthy, and who should be trusted with maintaining those boundaries—fact-checkers, authors, editors, the institutions of journalism itself? In other words, it’s focused on essences: facts, traditions, professions.

But no one looks at the fact-checking session as a conversation, at the way that what’s going to be printed emerges from an interaction between two (or more) people and the very subtle negotiations that are part of having a conversation. If a source can’t be politely impolite (by interrupting, by not agreeing to give up a turn in the conversation, by demanding to hear how a paraphrase or correction will be worded exactly), then their corrections may not make it in. That’s no one’s fault. We’re limited by our conversational habits as much as (if not more than) by our epistemological ones. In that sense, this story, any story, full of its facts, is the thing that emerges unscathed from these conversations—which can present numerous hazards even under the best of circumstances. But especially so when your Skype video isn’t working.

Postscript: In case you’re wondering, yes, I did a read-back of this piece with Rabin´, while Hilaria, Lynn, and Kate all received the text because they were all on the road. All four of them supplied perspectives that improved my article and that suggest we could be signing, writing, and talking about this interaction for a very long time.

TMN Contributing Writer Michael Erard lives in Portland, Maine, with his wife and son. His book about the science of polyglots, Babel No More, is now out in stores.