Earlier this year, Google artificial intelligence researcher Timnit Gebru sent a Twitter message to University of Washington professor Emily Bender. Gebru asked Bender if she had written about the ethical questions raised by recent advances in AI software that processes text. Bender hadn’t, but the pair fell into a conversation about the limitations of such technology, such as evidence that it can replicate biased language found online.
Bender found the DM discussion enlivening and suggested building it into an academic paper. “I hoped to provoke the next turn in the conversation,” Bender says. “We’ve seen all this excitement and success, let’s step back and see what the possible risks are and what we can do.” The draft was written in a month with five additional coauthors from Google and academia and submitted in October to an academic conference. It would soon become one of the most notorious research works in AI.
Last week, Gebru said she was fired by Google after objecting to a manager’s request to retract or remove her name from the paper. Google’s head of AI said the work “didn’t meet our bar for publication.” Since then, more than 2,200 Google employees have signed a letter demanding more transparency around the company’s handling of the draft. Saturday, Gebru’s manager, Google AI researcher Samy Bengio, wrote on Facebook that he was “stunned,” declaring “I stand by you, Timnit.” AI researchers outside Google have publicly castigated the company’s treatment of Gebru.
The furor gave the paper that catalyzed Gebru’s sudden exit an aura of unusual power. It circulated in AI circles like samizdat. But the most remarkable thing about the 12-page document, seen by WIRED, is how uncontroversial it is. The paper does not attack Google or its technology and seems unlikely to have hurt the company’s reputation if Gebru had been allowed to publish it with her Google affiliation.
The paper surveys previous research on the limitations of AI systems that analyze and generate language. It doesn’t present new experiments. The authors cite prior studies showing that language AI can consume vast amounts of electricity and echo unsavory biases found in online text. And they suggest ways AI researchers can be more careful with the technology, including by better documenting the data used to create such systems.
Google’s contributions to the field—some now deployed in its search engine—are referenced but not singled out for special criticism. One of the studies cited, showing evidence of bias in language AI, was published by Google researchers earlier this year.
“This article is a very solid and well researched piece of work,” says Julien Cornebise, an honorary associate professor at University College London who has seen a draft of the paper. “It is hard to see what could trigger an uproar in any lab, let alone lead to someone losing their job over it.”
Google’s reaction might be evidence that company leaders feel more vulnerable to ethical critiques than Gebru and others realized—or that her departure was about more than just the paper. The company did not respond to a request for comment. In a blog post Monday, members of Google’s AI ethics research team suggested that managers had turned Google’s internal research review process against Gebru. Gebru said last week that she may have been removed for criticizing Google’s diversity programs and suggesting in a recent group email that coworkers stop participating in them.
The draft paper that set the controversy in motion is titled “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” (It includes a parrot emoji after the question mark.) It turns a critical eye on one of the most lively strands of AI research.
Tech companies such as Google have invested heavily in AI since the early 2010s, when researchers discovered they could make speech and image recognition much more accurate using a technique called machine learning. These algorithms can refine their performance at a task, say transcribing speech, by digesting example data annotated with labels. An approach called deep learning enabled stunning new results by coupling learning algorithms with much larger collections of example data, and more powerful computers.
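The core idea described above—an algorithm refining its performance at a task by digesting labeled example data—can be illustrated with a minimal sketch. This toy perceptron classifier is purely illustrative (the data and learning rule are invented for this example, and bear no relation to Google’s production systems), but it shows the same loop: make a prediction, compare it to the label, and nudge the model toward the correct answer.

```python
# A minimal sketch of supervised machine learning: a toy perceptron
# that "digests example data annotated with labels" and adjusts its
# weights whenever it predicts a label wrong. Illustrative only.

# Each example pairs a feature vector (e.g., two measurements of some
# input) with the desired label (0 or 1).
examples = [
    ([1.0, 0.0], 1),
    ([0.9, 0.2], 1),
    ([0.1, 1.0], 0),
    ([0.0, 0.8], 0),
]

weights = [0.0, 0.0]
bias = 0.0

def predict(features):
    """Return 1 if the weighted sum of features exceeds zero, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

# Training loop: repeatedly pass over the labeled data, moving the
# weights toward each correct answer whenever the prediction is wrong.
for _ in range(10):
    for features, label in examples:
        error = label - predict(features)  # 0 when correct
        weights = [w + 0.1 * error * x for w, x in zip(weights, features)]
        bias += 0.1 * error
```

After a few passes over this tiny dataset the model classifies all four examples correctly. Deep learning follows the same principle, but with millions of weights, far larger datasets, and gradient-based updates rather than this simple rule.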