Machine translation: a game changer in science

The English language may be king in science, but neural machine translation could well put an end to its dominance.

Google Translate celebrated its 17th birthday in April. Like anyone coming of age, this online translation service continues to mature every year. So much so that one recent study concluded that “translation [of languages] can serve as both a short- and a long-term solution for making science more resilient, accessible, globally representative and impactful beyond the academy.”

Lynne Bowker, a professor at the University of Ottawa’s school of translation and interpretation, co-authored the study. “A multilingual knowledge production system can only be a good thing,” she argued. “It would enable scientists to do research without coming up against language barriers.” The advent of translation engines that use artificial neural networks makes this science-fiction scenario a reality. “This class of algorithms relies on advanced statistics and huge quantities of data to produce high-quality translations,” said Dr. Bowker.

Benoît Dubreuil, Quebec’s French language commissioner, shares this view. The Quebec national assembly appointee – who speaks multiple languages and acts as a kind of auditor general of the French language – is preparing a report on the role of machine translation in the province’s agencies, to be published later this fall. “It’s no longer 2015: what we laughed at back then now defies imagination. The translations generated by these tools have become acceptable, even if they’re not beyond reproach.”

Some limits remain

Nuance matters. Translation tools like Google Translate, DeepL and Microsoft Translator still make mistakes. They’re just more subtle. “In terms of grammar and spelling, they’re nearly perfect. But there are real shortcomings when it comes to phraseology and terminology,” Dr. Dubreuil explained. For example, “peuple autochtone” is often translated as “Aboriginals” rather than “Indigenous people.”

Vincent Larivière, a professor in the school of library and information science at the Université de Montréal and Canada Research Chair in the transformations of scholarly communication, believes that these limitations call for the utmost caution. “In the humanities and social sciences, machine translation can strip a sentence of nuance, and therefore of meaning. However, this is not necessarily the case for more codified disciplines such as physics,” Dr. Larivière pointed out.

In this sense, he echoed the comments made by several speakers during consultations by the standing committee on science and research regarding French-language scientific research and publishing in Canada. For example, Yves Gingras, a professor of science history and sociology at the Université du Québec à Montréal, stated before the committee that “translating everything is irrational in economic and scientific terms.” In his view, systematically translating scientific literature from English into French (or vice versa) would be inefficient in a context where the two official languages coexist.

The real benefit of machine translation lies elsewhere, said Dr. Larivière. “In an ideal world, the metadata, abstracts and titles of scientific articles would be available in a whole range of languages, increasing their discoverability.” This minimum translation threshold would make life easier for all scientists with a poor command of English, the lingua franca of the academic world. “Using these tools, readers could then choose to translate the text into any language they want. We wouldn’t always have to use English.”

A reasoned approach

But that’s just part of the story. Editing graduate student manuscripts, initial filtering for a literature review, preparing slides for a conference abroad – the potential uses of machine translation in the everyday life of scientists are vast. However, not all are created equal. “Some uses are low-risk, others much higher. You have to consider the context in which these tools are used and their impact,” Dr. Dubreuil explained.

This reasoned approach requires a minimum understanding of what’s under the hood of these technologies. For example, the corpus needed to train neural machine translation systems is not available in certain languages, since too few publications are translated into these languages. This makes translating to or from these languages a labour-intensive exercise. “Then there’s language pairs,” Dr. Bowker pointed out. “Do we really translate that many texts from Hindi to French and vice versa? I don’t think so.”

You also have to think about disciplines. Generally speaking, more specialized fields have fewer resources translated into languages other than English. That’s why Dr. Bowker does not believe translators are about to disappear, even if that seems counterintuitive. “A critical mass of quality data is needed to make these systems better. And unless we see a publishing boom in languages other than English, we’ll need professional translators to produce that critical mass,” she argued.

Machine translation systems should not be trained solely on the mediocre output generated by similar translation tools. This vicious cycle would impair these technologies and, by the same token, undermine science in French. “Thanks to these tools, there’s no longer any reason why French shouldn’t be at least as present as English in Canada’s scientific community,” Dr. Dubreuil noted. “It’s now a matter of changing habits to take account of this new reality.”
