Relearn the Linguistic World in “Arrival”: An Interview with Jessica Coon

Computer science and machine learning scholars are increasingly interested in languages, and natural language processing has shown great progress as a result.

In the movie Arrival, language is one of the most powerful weapons. On top of that, the film’s consultant, Jessica Coon, believes that language is its own origin; that language originates from itself.

Jessica Coon is an Associate Professor of linguistics at McGill University, Canada Research Chair in Syntax and Indigenous Languages, and an expert in the Mayan language Ch’ol (Chol). Coon’s research is focused on “ergativity, split ergativity, case and agreement systems, nominalization, verb-initial languages, and field methodology.” She graduated from Reed College as a linguistics-anthropology major. Coon later earned a master’s degree and a doctoral degree in linguistics from MIT, where she was advised by the famous linguist David Pesetsky, a former student of Noam Chomsky. Since 2011, Coon has worked as a linguistics professor in the Department of Linguistics at McGill. She became an Associate Professor in 2015.

Arrival, a film about an alien visit to Earth, excited not only sci-fi fans but the whole linguistics community. The movie brought the study of language to the public eye in a way that hasn’t really been done before. Jessica Coon, Associate Professor in the Department of Linguistics at McGill University in Canada, is pleased with what Denis Villeneuve’s 2016 hit has done for her field. Coon was hired as a consultant for the movie, and she’s willing to work on more like it in the future.

Jessica Coon working for “Arrival”

Arrival is based on “Story of Your Life,” a short story written by Ted Chiang and published in 1998. The story tells how a linguistics professor named Louise Banks (played by Amy Adams in the film) and a physicist, Ian Donnelly (Jeremy Renner), team up to interpret the language of seven-limbed aliens called Heptapods. The movie grossed US$198 million at the box office, and linguistics, a subject that rarely gets mainstream attention, suddenly got a lot of it.

Linguistics is sometimes a little misunderstood. It is easy to assume that the discipline is concerned with learning how to speak languages. However, according to Coon, “Linguistics is the scientific study of human language. So it’s not necessarily studying a specific language, or learning to speak a language, but looking at language as a system of grammar.”

In Arrival, Louise Banks successfully decodes an extraterrestrial communication system by discovering the structure of the aliens’ language. This process reveals a relationship between perception and science, one that informs research in machine intelligence and explains how a theory of language became the basis for a science fiction tale.

Coon says she had fun working on the movie and was surprised by the level of detail that went into the production.

“The film does a lot of things right. There are of course points throughout where all the linguists in the audience will sort of cringe, because there are some things that are a little off, but I think in terms of the big picture, they get a lot of things right… They rented all the books off my office shelf. That’s only a scene of a minute right, like nobody notices what books are there, but all the linguists know what books are there… Collectively once you put all these things together, you sort of get a nice, holistic movie, it gave me a new appreciation for film making.”

This set from Arrival is based on Coon’s colleague’s office. Details like the Apple computer, a green book written by Noam Chomsky, and a picture of him on the shelf show the care the filmmakers took to represent a linguist’s work.

Part I

Arrival opens at a deliberate pace with a dream-like sequence accompanied by ambient themes from Max Richter’s On the Nature Of Daylight. The music is sad and mysterious. A young girl is sick and her mother comforts her.

The mother is Louise Banks, a linguistics professor at a university in Massachusetts. In class one day, news reports announce that 12 unidentified objects have appeared at different points on Earth. An alarm sounds, classes are cancelled and students evacuate the university. Banks watches as fighter jets shoot through the sky overhead.

The next day, Colonel G.T. Weber brings a strange recording to Banks’ office and asks her to help decode a “conversation.” Voices are heard asking, “Why have you come? Can you understand?” Something responds in a series of low-pitched hums and growls.

“Is that..?”

“Yes.” Weber confirms that it’s a recording of the visitors.

Confused, Banks asks, “How many?… Did they have mouths?”

Weber is terse with her. He wants to know what she can understand from the tape.

“I don’t know…” Banks replies, “I can tell you that it’s impossible to translate from an audio file, I would need to be there to interact with them.”

Morgan Sonderegger

Banks is right. Coon says, “As linguists, we’re interested in the more abstract properties of languages, but you can’t get at those directly. You have to interact with speakers of those languages, whether that be human languages or alien languages.”

In 2000, Coon went to Reed College to study linguistics. Discussing her choice, she explains,

“Oh I think I got into linguistics the way that many people do, I was really interested in foreign languages… I liked language classes, but when I got into university I didn’t want to be a Spanish literature major or a German major. I was really less interested in you know, the literature, but more in the languages themselves, sort of the structure of these languages. And so I came across a course called linguistics. I had had no idea what that meant, as many people entering university don’t… I thought it sounded interesting and relevant and so I took my first linguistics class and I was hooked.”

John B. Haviland, Honorary Professor in Linguistics and Anthropology at the University of California in San Diego.

Linguistic anthropologist John B. Haviland later became Coon’s advisor. Haviland is an expert in Mayan Tzotzil. Under his tutelage, Coon chose to concentrate on Mayan Ch’ol.

Mayan languages have been spoken in Mesoamerica and northern Central America for at least 5,000 years. Currently at least six million people speak a Mayan language as their mother tongue. The family has subgroups like Ch’olan-Tzeltalan, which further branches into languages such as Tzotzil and Ch’ol.

In 2004, Jessica Coon published a paper on nominal structure and split ergativity in Ch’ol. Her interest in the subject hasn’t faded since.

In this map of Mayan languages, Tzotzil and Ch’ol belong to the Ch’olan-Tzeltalan branch and Chuj belongs to the Q’anjob’alan branch. Chuj is spoken by “around 40,000 members of the Chuj community in Guatemala and around 10,000 in Mexico.”

Part II

Flying in an army helicopter to the alien ship in Montana, Banks meets the physicist Donnelly. He’s reading her book and immediately tells her he disagrees that language is the foundation of civilization. Science, he says, is the foundation.

They land in a valley obscured by the gigantic spherical object, or “shell,” and are prepped in a hurry at an army camp. Icelandic musician Jóhann Jóhannsson’s soundtrack of deep electronic pulses begins to set a discomforting mood. The spooked scientists are made to defy gravity as they walk up a vertical tunnel into the shell. The music’s dull, non-linear rhythm provides an eerie contrast to Banks’ bewilderment when she faces the Heptapods for the first time.

Coon is familiar with the feelings of excitement and nervousness that come with first meeting speakers in the field. In 2002, during the summer after her sophomore year, Coon traveled to Chiapas, Mexico to study Ch’ol. It was her first time conducting research in the field. In an article for Sloan Science and Film she writes about the experience:

Just after finishing my sophomore year in college, I arrived in Chiapas, Mexico for my first summer of linguistic fieldwork. My linguistics professor, renowned Mayanist John Haviland, drove us six hours down winding mountain roads from the city of San Cristóbal de las Casas into the hot Chiapan lowlands, to a Ch’ol-speaking Mayan village called Campanario. After negotiating my stay with a surprised host family to-be, Haviland got ready to head back to the city. Overwhelmed, with only rudimentary Spanish and my courage quickly slipping away, I asked him to remind me again what exactly I was supposed to do. “Make some friends,” he said casually, “learn some Ch’ol.”

Communication between Banks and the Heptapods starts with simple words like “Human,” “Louise,” “Ian…” In one scene, she risks her life by removing her protective suit, walks toward them, points at herself and says, “Louise.”

The scene resonates with Coon: “She’s the first one to take off her helmet and go up to the glass and try to introduce herself, and you know, that of course is important.” It also reminds Coon of the need for planning when conducting field research. A researcher can never anticipate what their partner will do or what situations could develop once on the ground. A field linguist can’t just walk straight into a community and start addressing abstract and theoretical problems. The first task is always to set up a positive relationship. This was one of Coon’s core working principles during her field projects in Mexico.

In Arrival, Banks makes slow but steady progress, teaching the Heptapods simple words like “eat” and “walk.”

Weber isn’t convinced by her approach. He struggles to see the purpose of the elementary-level words.

Banks explains that it’s to avoid misunderstandings. The objective is to teach the Heptapods to understand questions like “What is your purpose on earth?” But to accomplish that, the aliens first have to understand what a question is, learn how to distinguish between “a single you” and “a plural you,” grasp concepts like purpose and intent, and of course, command a basic vocabulary.

“She does a nice job of showing that. In linguistics and I guess any scientific disciplines you can’t go straight to a big complicated question if you don’t understand the smaller pieces,” Coon comments, “You have to start with simple things…if you want to understand the grammar as a whole, you do need to build from these smaller parts first.”

“Human” in Heptapod script

In the movie, Banks is under incredible pressure to get things right. However, misunderstandings can be one of the most enjoyable parts of field research.

In one interview, Coon recounts a fun story from her early fieldwork. When she was learning Ch’ol, she kept asking the speakers she worked with if she sounded ok. They would reply positively, until one day she asked if they would say it the way she did. They responded that they wouldn’t say it that way at all. Even if her pronunciation was bad, she was encouraged to keep speaking.

“I really enjoy the task of getting to go to a community and learn about the language. Not by reading books about it, because for many languages, there aren’t books…but by working with speakers of these languages and you know… what’s the structure of this language? How does it compare to other languages? What are the theoretically interesting parts? But of course for communities, a lot of identity gets tied into language and so really seeing how important languages are to communities got me involved in vitalizing efforts, because many of the languages I work with are endangered. I think that’s what really keeps me in the field and makes me excited to do the work I do.”

David Pesetsky

In 2004, driven by her passion for linguistics, Coon enrolled at MIT to do her Master’s and Ph.D. MIT’s linguistics department is widely regarded as one of the best in the world, made famous by the father of modern linguistics, Noam Chomsky. Coon was later supervised by renowned linguist David Pesetsky, who was one of Chomsky’s students. Pesetsky is an expert in syntax and has spent his career focusing on generative grammar.

For the following six years, Coon continued her study of Ch’ol. She published important papers in top linguistics journals like Natural Language and Linguistic Theory, Lingua and Linguistic Inquiry. Her publications discuss subjects like VOS (Verb-Object-Subject) as Predicate Fronting in Mayan Ch’ol, and Interrogative Possessors and the Problem with Pied-piping in Ch’ol. According to Google Scholar, her most referenced publication is her doctoral thesis, “Complementation in Ch’ol (Mayan): A Theory of Split Ergativity.” In the acknowledgement section of the thesis, she first thanks the native speakers of Mayan Ch’ol with whom she worked.

Part III

The first word Banks learns from the Heptapods is “human.” As she studies their symbols, she starts to have visions of herself with her daughter, a little girl playing in the grass. The more of the language she decodes, the more intense the visions become: a girl playing, asking her about words…

When Donnelly realizes what Banks is experiencing, he suggests the visions may be related to the Sapir-Whorf hypothesis on linguistic relativity. But she disagrees.

As the film climaxes, nations become divided over the meaning of the Heptapods’ messages and the world is on the brink of war. Banks is forced to ask the Heptapods why they have come. The answer she interprets is one that no one wants to hear: “Offer weapons.”

“Weapon” in Heptapod script

Banks is not convinced she got it right. Her expertise tells her she needs a more accurate translation—do they mean weapons or tools? As the threat of violence looms, the army is ordered to retreat from the shell. Louise pays the Heptapods one last visit and learns the mystery of her visions. They are apparitions of the future. She will marry Donnelly and have a child who dies of a rare disease. The “weapon” the Heptapods offered her is the ability to transcend time.

Ted Chiang, the writer of the original story, doesn’t have a background in linguistics. But he has studied how linguists interpret a new language. He questions how the human race could make progress by learning the languages of more advanced species.

The hypothesis behind Chiang’s story was developed by Edward Sapir and advanced by Benjamin Lee Whorf in the 1930s. The Sapir-Whorf hypothesis states that each language has its own unique structure and forms of expression that determine the speaker’s perceptions and categorizations of experience. Speakers of different languages perceive the world in different ways and develop restricted understandings of reality.

Whorf supported the hypothesis with his studies of Hopi and other Indigenous American languages. The underlying idea of linguistic relativity had taken nearly a century to mature, but by the 1940s it had accumulated a strong theoretical foundation.

However, in the 1970s, Chomsky’s linguistic revolution rejected the idea. Universal Grammar holds that all languages share the same set of structural rules. They are independent of sensory experience and have little influence over the way people think. Even if each language builds these rules into its own unique forms, Chomsky believes that the Sapir-Whorf hypothesis exaggerates the differences.

Outside the context of science fiction, Coon is skeptical that a human brain could be rewired by learning an alien language. “I’m trying to think of other sci-fi contexts where there is some kind of created language, George Orwell’s 1984 is one example where the government makes up this language that’s supposed to be really simple as a way of mind control… But our human brains seem to be fixated on the kind of language that we have,” she explains.

A creole, a natural language developed from a pidgin, is a good example of how Universal Grammar could work. When two different language groups encounter each other (normally due to immigration or trade), the speakers of these groups improvise messages that mix words and phrases from the respective languages. What results is a pidgin, an incomplete language that in academic terms has “low prestige,” or is informal.

Some linguists believe that children who grow up in this kind of linguistic environment unconsciously fill in the communication gaps. By doing so they create a complete language, which becomes a creole. This natural ability to formalize an improvised communication system could be evidence that a meta-language is stored in the brain.

When asked if she thinks learning a language has changed the way she thinks, Coon responds,

“I think it has, but not for the Sapir-Whorf type reason you’re asking… Learning a new language can change your life in all kinds of ways. You might meet interesting new people. Or learning the language might open all kinds of new doors in terms of who you are able to communicate with, what you’re able to learn, what kind of job you’re able to get… But I don’t think that… there’s no evidence from the scientific studies that have been done, that the grammar of a different language affects in any meaningful or significant way how we see the world.”

Over the last twenty years, there has been an increase in the number of studies that attempt to prove how language affects thought. In 2011, Scientific American published an article by Lera Boroditsky, an Associate Professor at Stanford University, that summarizes the findings of this research. One researcher the article mentions is John B. Haviland, Coon’s advisor at Reed College.

For two decades, Haviland partnered with Stephen C. Levinson from the Max Planck Institute in the Netherlands to find evidence. They were able to demonstrate that people whose native language relies on absolute directions can remember the routes they take in a non-native environment more accurately. In fact, their directional memory was far stronger than that of the natives of the environment. In this context, a language has provided the people that speak it with a certain advantage.

Nevertheless, these studies are not enough to establish a strong link between language and human thinking. Boroditsky’s findings are still at odds with mainstream opinion.

Part IV

Nowadays, under Chomsky’s influence, most linguists agree that understanding a new language is only a small part of their task. The focus is not on the details, but on bigger questions, like how languages are acquired.

Noam Chomsky

Chomsky’s earliest theoretical hypothesis was that children are born with “a certain set of structural rules innate to humans.” This set of rules is written in human DNA and is called Universal Grammar. It provides the human brain with functions that automatically react to the input of linguistic material and attempt to perfect it. Gifted with this grammar toolkit for all the world’s languages, infants can learn to speak any of them fluently.

But research in ergative languages like Basque and Urdu challenged the premise of Universal Grammar. In ergative languages, the way a subject functions in a sentence differs from the way it functions in most European languages.

One example is Basque, spoken in a region that straddles the westernmost Pyrenees in parts of northern Spain and southwestern France. It is a language isolate, unrelated to the neighbouring Indo-European languages, which makes it notoriously difficult to study.

English is a nominative-accusative language. Languages like it treat the agent of a transitive verb and the single subject of an intransitive verb the same. The object of a transitive verb is treated differently.

Take this sentence as an example:

I traveled; I invited him to travel

Ergative languages like Basque treat the object of a transitive verb and the subject of an intransitive verb the same. The agent of a transitive verb is treated differently.

In ergative form, the sentence would become:

Me traveled; I invited him to travel
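
The contrast between the two systems can be sketched in a few lines of Python. This is only an illustrative toy, not a model of Basque or of any real grammar: the role labels S (sole argument of an intransitive verb), A (agent of a transitive verb) and O (object of a transitive verb) are common typological shorthand that the article itself does not use, and the names `ALIGNMENTS`, `case_for` and `grouped_with_s` are hypothetical.

```python
from enum import Enum


class Role(Enum):
    """Grammatical roles, using common typological shorthand (an assumption here)."""
    S = "sole argument of an intransitive verb"  # "I" in "I traveled"
    A = "agent of a transitive verb"             # "I" in "I invited him"
    O = "object of a transitive verb"            # "him" in "I invited him"


# Which case label each role receives under each alignment pattern.
ALIGNMENTS = {
    "nominative-accusative": {
        Role.S: "nominative",
        Role.A: "nominative",
        Role.O: "accusative",
    },
    "ergative-absolutive": {
        Role.S: "absolutive",
        Role.A: "ergative",
        Role.O: "absolutive",
    },
}


def case_for(alignment: str, role: Role) -> str:
    """Return the case label a role receives under the given alignment."""
    return ALIGNMENTS[alignment][role]


def grouped_with_s(alignment: str) -> Role:
    """Return the transitive role that is marked like the intransitive subject."""
    marking = ALIGNMENTS[alignment]
    return next(r for r in (Role.A, Role.O) if marking[r] == marking[Role.S])


if __name__ == "__main__":
    for name in ALIGNMENTS:
        print(f"{name}: S patterns with {grouped_with_s(name).name}")
```

In the nominative-accusative table the intransitive subject shares its marking with the transitive agent, as in English “I traveled” and “I invited him.” In the ergative-absolutive table it shares its marking with the transitive object, which is what the “Me traveled” rendering above is meant to suggest.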

In the 1980s, in order to explain these types of phenomena, Chomsky and his followers modified the definition of Universal Grammar with the “Principles and Parameters” theory. It postulates that a common set of principles governs the structures of all languages, replacing the notion of one universal grammar. The principles that are shared are primary. The unique characteristics that differentiate languages, or parameters, are secondary.

Work on the “Principles and Parameters” theory has mainly focused on English and French, which are well studied and more conducive to research. However, linguists increasingly emphasize that a more diverse set of languages should be used for developing and verifying theoretical assumptions. Coon says,

“If we are developing a theory of Universal Grammar, of human language generally, it’s important that we have theories that explain not only how English works, but also how Mayan languages work, because these are all human languages, they’re all equally learnable by humans. If I take a human baby from Canada and I drop her off in Mexico she’ll learn a Mayan language, there’s no barrier there.”

Coon’s doctoral thesis is a study of Ch’ol split ergativity. It attempts to define the conditions under which Ch’ol shows, one, ergativity, and two, features similar to nominative-accusative languages. After starting at McGill, she published the book Aspects of Split Ergativity (the title evoking Chomsky’s Aspects of the Theory of Syntax). From 2011 onwards, her publications in top linguistics journals have continued to focus on ergativity, with titles like “Ergativity and the complexity of extraction: A view from Mayan” and “The ergativity of TAM (Tense Aspect Mood)”. In 2015, Coon was given the positions of Associate Professor in the linguistics department at McGill and the Canada Research Chair in Syntax and Indigenous Languages.

In recent years, Coon has also studied Chuj, which belongs to the Q’anjob’alan branch of the Mayan family. As its name suggests, it is spoken by the Chuj people of Guatemala and Mexico. She says,

“Part of my work is looking at our existing theories of how human languages work and trying to see, do these theories work for less-studied language? Do they work for Mayan languages? What do we need to modify? Or what can Mayan languages tell us about the nature of Universal Grammar and the nature of human language? Because these languages are very different from a language like English or a language like Spanish, but they’re still equally important when it comes to our theory of Universal Grammar.”

A major setback for linguists is the increasing rate at which languages are going extinct. Currently, there are 6,000 to 7,000 languages in the world. However, researchers predict that up to 90% of these might not survive the century. We are losing key pieces of an important puzzle.

Part V

The syntactic theory that Coon studies is at the traditional core of linguistics. But newer fields have evolved, like pragmatics and discourse analysis, along with interdisciplinary fields like cognitive linguistics and neurolinguistics. At the crossroads of linguistics and computer science are computational linguistics (which includes machine translation), corpus linguistics, and speech recognition and synthesis.

Computer science and machine learning scholars are increasingly interested in languages, and natural language processing (NLP) has shown great progress as a result. NLP combines artificial intelligence, computational linguistics and computer science to essentially make machines better at human languages. It’s worth asking whether, in the future, machine translation tools could replace the need to learn human languages.

Coon’s views on machine learning and natural language processing are informed by a computational linguist from the University of Washington named Emily Bender.

“She does really relevant and interesting work where she has argued for the importance of understanding cross-linguistic variation and how this is going to help machine learning make better language independent learners… She argues in favour of using the typological knowledge that we’ve developed as linguists to improve how the learning works.” 

But there is still not enough research available on minority languages. The data is scarce and it is simply impossible for linguists to document the world’s 6,000 languages and use them to train machines. It’s also important to remember that even if the translation tools we have now are effective, translation is not a transparent equivalent to communication. To perform tasks and create productive relationships with native speakers, their language must be learned.

Another question: if a Universal Grammar is programmed in the human brain, then can entities with similar processing capabilities also learn human languages? Heptapods? Computers? Some neural network research shows that machines can learn languages. Do they?

“With machine learning the idea is we’re trying to develop an algorithm such that when exposed to enough data, the system repeats back, or can produce language that mimics human language.” Coon chooses her words carefully. She’s apprehensive about what it means for a machine to “comprehend.”

“One important question is: what is it about our human brains that makes some types of systems learnable and some types of systems unlearnable, when in fact we can imagine or we can show that machines could learn them? As linguists we’re interested in: what is it about the human brain that makes human language shaped the way it is and how do babies learn it. But also, what can babies not learn and what does that tell us about human cognition?”

Since it is possible for machines to learn a language that humans can imagine, but not learn themselves, there is an indication that humans and machines process language in very different ways.

Based on this, Coon says, “I guess machines aren’t probably going to comprehend in the same way that humans do.”

Modern technology is not the linguist’s nemesis either, even for those who work with ancient languages all the time. Fifteen years ago, Coon used notebooks to record her field data. But going through stacks of notebooks on her shelf is not the most efficient method of retrieval. Linguists now have databases that ease the process of inputting, storing and extracting linguistic material. The data can also be shared through the web with scholars and communities working on similar languages.

One issue Coon faces is that a lot of corpus tools don’t contain enough data on the minority languages she researches. The tools are also either too expensive for students, too difficult to use, or not compatible across different computers. She says, “I use this database tool that a postdoc and some other students here have slowly been developing, because there wasn’t anything useful on the market… There is no single database tool that linguists or language learners can use that’s easily searchable.”

In May 2016, Ted Chiang published a short essay called “Bad Character” in The New Yorker. He argues that social issues in mainland China could be the result of Mandarin not using a phonographic script. His theory is based on principles similar to the plot of Arrival.

Linguists can’t conduct their research without computers

On the other hand, in The Language of Food: A Linguist Reads the Menu, Dan Jurafsky, Professor of Linguistics and Computer Science at Stanford, concludes:

“In other words, the linguistic and culinary habits of our own tribe or nation are not the habits of all tribes and nations. Yet all languages and cultures share a deep commonality, the social and cognitive traits that make us human. These facets—respect for our differences, and faith in our shared humanity—are the ingredients in the recipe for compassion. That’s the final lesson of the language of food.”

If Jessica Coon read this, she would probably agree.

Author: Jiaxin Su | Reviewer: Rita Chen | Editors: Xiang Chen, Nicholas Richards

