By: Kate Deimling
On May 20, 2019, Jon Ritzdorf gave a presentation on machine translation at the NYCT’s monthly meeting. Ritzdorf, a long-time friend of the Circle, is the Senior Solutions Architect and Machine Translation lead at RWS Moravia and also teaches translation technology at institutions including the University of Maryland, the Middlebury Institute of International Studies, and NYU. In a very engaging presentation to a full house, Ritzdorf covered the history of machine translation (MT), gave a demonstration of its use, and described its limitations and potential.
History and Future of MT
MT has gone through several phases. The first MT, developed in the 1950s, was “rule-based,” meaning that researchers attempted to “teach” the computer all the rules of the languages it was translating. This system was not very effective. The next phase, known as the “statistical” or “phrase-based” model, came about in the 2000s. Drawing on the resources of the internet, it was based on a corpus of samples in source and target languages, with no rules involved. Now we have “neural MT.” Conceptually, this is not a drastic step beyond statistical MT, but it requires millions of aligned segments of training data.
Ritzdorf pointed out that MT definitely needs human post-editing. While it has improved vastly in the last 10 years, there are still adequacy issues: it can leave out sentences or turn a negative statement into an affirmative one. He specified that MT should not be used in high-risk situations, where a financial penalty or life-or-death consequences could be involved. He predicts that CAT tools will be gone in 10 years (and he is not alone in saying this).
Evaluating MT Results
In order to evaluate a machine translation, we need a triplet sample: a source text, a human translation of this text, and a machine translation of this text. Using a sample French source and an English translation provided by NYCT vice president Kate Deimling, Ritzdorf produced MT translations of the same text using five MT engines: Google Translate, Amazon Translate, Microsoft Translator, DeepL Pro, and SYSTRAN PNMT. There are several approaches to evaluating a machine translation. BLEU (bilingual evaluation understudy) is a metric available online that measures the overlap between the machine translation and the human translation.
A BLEU evaluation gives the machine translation a score between 0 and 100. Ritzdorf recommends the following breakdown:
Score below 40 – Do not use (30-40 could be OK, but heavy editing will be necessary)
Score of 40-50 – Very usable; post-editing necessary
Score of 50-60 – Good; involves some post-editing
Score of 60-70 – Very good; not much effort to correct
A score of above 70 is practically impossible, as not even two human translations will overlap 70 percent or more.
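The intuition behind BLEU can be sketched in a few lines of Python: count how many n-grams (for n = 1 to 4) of the machine output also appear in the human reference, take the geometric mean of those precisions, and apply a brevity penalty for outputs shorter than the reference. This is a simplified illustration with invented example sentences, not the exact formula used by production scoring tools such as sacreBLEU:

```python
# Minimal BLEU sketch: modified n-gram precision plus brevity penalty,
# assuming a single reference translation and whitespace tokenization.
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Return a BLEU-like score in 0-100 for one candidate vs. one reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped matches: an n-gram only counts as often as it appears
        # in the reference.
        overlap = sum((cand_counts & ref_counts).values())
        total = max(sum(cand_counts.values()), 1)
        # Crude smoothing so a zero-match n-gram order doesn't crash log().
        log_precision_sum += math.log(max(overlap, 1e-9) / total)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * math.exp(log_precision_sum / max_n)
```

An identical candidate and reference score 100, while the kind of word-order difference mentioned below (“the lottery of asylum” vs. “the asylum lottery”) drags the score down sharply, since few longer n-grams still match.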
The French sample analyzed in the meeting was from a report on asylum policy in Europe. Four of the five MT engines produced scores ranging from 45 to 47, meaning that this could be a time-saving method for translating this text.
Do We Want to Be Translators or Post-Editors?
However, as some members brought up after the meeting, even if it may save time, MT replaces the experience of translating with the experience of post-editing. Generally, translators have chosen this profession because we enjoy the challenges of thinking through the translation process. And many translators do not accept jobs editing other people’s translations because these kinds of corrections are tedious to make. As a post-editor, the translator will need to compare the original with the translation to make sure no meaning was lost. Then stylistic changes will be necessary. For instance, in the sample analyzed, the heading “the lottery of asylum” needed to be changed to “the asylum lottery.”
My personal opinion is that, for now, many translators will choose to forgo MT and focus on fields where it will not be used or will be slow to penetrate. Many clients are skeptical of MT and have confidence in the expertise of human translators. Although there is a huge corpus of materials on the web for MT to learn from, much of it is not of good quality. In particular, quality writing and creativity are needed in the fields of marketing and advertising, journalism and reports, and literary translation. Also, due to security and confidentiality concerns, much legal and medical translation will not be appropriate for MT.
However, if MT is indeed the wave of the future, translators may need to come to terms with it and learn to be post-editors, just as many translators previously found it necessary to learn CAT tools. Many of us have already been asked by agencies we work with if we are willing to do post-editing. For more information, NYCT members can log in to the member area of the Circle’s website and go to our meeting archives for audio of the presentation and a PDF of the slides, which Jon Ritzdorf kindly provided. In the meeting archives, you will also find links to useful resources, such as the BLEU scores of the translation sample that was analyzed, MT engines available online, and the online BLEU scoring tool. Ritzdorf suggests that translators run BLEU evaluations on samples of texts they’ve already translated to get a feel for how useful MT may be as we enter this brave new world.
NYCT vice-president Kate Deimling has been a freelance French-to-English translator for over 12 years, after obtaining a Ph.D. in French from Columbia University and working in academia. ATA-certified since 2009, she specializes in advertising and marketing, art and culture, international development, and fiction and non-fiction. She has spoken on translation at Duke University and the Middlebury Institute of International Studies at Monterey and has translated books on wine, fashion, and art history. She also directs the Circle’s mentoring program, which she founded in 2015.