Category Archives: machine translation

Non-standard terminology

Certainly most non-translators would be surprised at how often the translator encounters words in a foreign language for which there is no generally agreed-upon translation. This is clearly one factor that severely limits the capabilities of translation software. Google Translate works by sifting mountains of reference translations. For standard terms in clearly formulated sentences, this sifting strategy can work quite well. As soon as non-standard terms crop up, however, Google Translate stumbles, not least because many reference translations are of questionable quality or applicability. The problems of ambiguity that plague the task of translation are regularly apparent when one searches for hard-to-translate terms in online dictionaries like LEO or reference sources such as the EU’s database of legal translations.

I confront terms for which there is no preexisting entry at LEO or clearly understandable direct equivalent in English nearly every day. Here are a few:

  • tiefenstufenabhängige Baumdurchwurzelungsstrategien (soil-depth-dependent tree rooting strategies)
  • Holzhackschnitzelheizkraftwerk (combined heat and power plant that runs on wood chips; try to say that one three times fast)
  • Kommunikationsaufforderungsakte (acts by which one prompts another to communicate)
  • Verfüllkörper (the body of backfilled material within a revegetated strip mine)
  • Legalitätszentriert (adjective indicating a focus on aspects of legality; literally, “legality-centered”)
  • Nachverhandlungsanfälligkeiten (noun designating things which are subject to future negotiation)
  • Rovingsgelege (I forgot what this is; something to do with repair of wind turbine rotors)
  • Granulatmusterzugschublade (component in a roller compactor for the manufacture of pharmaceutical products)
  • Ver- und Entsorgungsmedia (funny compound in German designating “media” for both “supply” and “disposal” – a highly ambiguous term when translated directly)

Note that none of these terms (except for Holzhackschnitzelheizkraftwerk) yields even a single hit at Google. So how does Google Translate handle them? Well, it doesn’t.

As

“As” is an interesting word. Ever looked it up in the dictionary? Mine contains 43 different definitions for the term. “As” can be used in so many different contexts it almost eludes definition. Yet in its multipurpose utility, this tiny, seemingly irrelevant grammatical particle serves an essential linguistic function. As an adverb, conjunction, pronoun, or preposition, “as” plays many roles, interlinking parts of speech and giving sentences form. One could describe it as the glue that holds the language together.

Needless to say, the German word “als” is not directly equivalent to its English counterpart. Like “as,” it is used as a comparative particle (diese Schuhe sind bequemer als die anderen) and conjunction (ich war froh, als sie endlich anriefen), but on the whole, it is used less frequently and has a much more restrictive range of use. However, “als” does take on a particular function that “as” lacks. The differences are subtle at first glance. Take the following sentence as an example: Die Beamten sind als Vertreter das öffentliche Gesicht der Verwaltung (“The officials are as representatives the public face of local government”). Here “als” is used to set up an equivalence between two things; the “officials” are in effect stated to be the equivalent of “representatives.” The direct English translation is acceptable and fairly clear, but rings a little bit strange. Why is this? In English, “as” is also used as a preposition to set up an equivalence, but this equivalence is a relative one, and usually does not have the 1:1 substitutional meaning found in many German constructions. For example: Der Auftragnehmer übernimmt die Aufbereitung am Standort X als technischer Betriebsführer für die Auftraggeberin als Betreiber (“The contracted party assumes responsibility for processing at location X as the technical manager for the contracting party as operator”). Translated directly, this sentence is somewhat confusing in English. What is meant by the “contracting party as operator”? “As” in English lacks the rigorous 1:1 substitutional equivalence implied by “als” in the German source sentence. A more readable translation would simply read: “for the contracting party, who is the operator.”

This is actually a fairly common problem when translating from German to English. An awareness of the incompatibility of “als” and “as” in certain contexts can help one to identify why the target sentence is not working and how it can be fixed.

Probability-based translation

The more I translate, the more convinced I become that developing accurate translation software is a nearly impossible task, and one that certainly can’t be achieved with probability-based models alone, such as those used by Google Translate. Aside from the idiosyncratic and cultural properties of language (as previously discussed here), machine translation is complicated by the incompleteness of reference databases. Essentially, it’s impossible for a piece of software to translate a term for which no dictionary entry or prior translation exists. This problem is much more pressing than one might initially suspect, considering the frequency with which the translator encounters little-used terms for which no translation is immediately forthcoming. Translating the other day, I kept a list of uncommon terms and the number of Google hits each term yielded. Here are a few: Patentfeld (7 hits); patentstark (3 hits); versatzfähig (1 hit); bestandeskundlich (2 hits); tiefenstufenabhängig (3 hits).
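To make the point about incomplete reference data concrete, here is a minimal sketch of a lookup that picks the most frequent rendering seen in a set of reference translations. It is a toy under stated assumptions, not a description of how Google Translate is actually built: the reference pairs are invented for illustration, and the names REFERENCE_PAIRS, build_table, translate_term, and translate_terms are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical reference translations: (German term, observed English rendering).
# These pairs are invented for illustration; they are not real corpus data.
REFERENCE_PAIRS = [
    ("Haus", "house"),
    ("Haus", "house"),
    ("Haus", "home"),
    ("Bank", "bank"),
    ("Bank", "bench"),
    ("Bank", "bank"),
]


def build_table(pairs):
    """Count how often each source term was rendered as each target term."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate_term(term, table):
    """Return (most frequent rendering, relative frequency), or (None, 0.0) if unseen."""
    counts = table.get(term)  # .get() avoids creating an empty entry for unseen terms
    if not counts:
        return None, 0.0
    best, hits = counts.most_common(1)[0]
    return best, hits / sum(counts.values())


def translate_terms(terms, table):
    """Translate term by term; terms absent from the table pass through unchanged."""
    output = []
    for term in terms:
        rendering, _ = translate_term(term, table)
        output.append(rendering if rendering is not None else term)
    return output


if __name__ == "__main__":
    table = build_table(REFERENCE_PAIRS)
    print(translate_term("Haus", table))
    print(translate_terms(["Haus", "tiefenstufenabhängig"], table))
```

With a table like this, a common word such as “Haus” gets a confident answer, while a rare term such as tiefenstufenabhängig has no counts at all. The sketch can only hand the German word back untouched, which is the kind of failure described above.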

In the absence of an ability to consider the larger context of a text and deconstruct meaning – in short, without the ability to think – translation software is unable to effectively deal with non-standard terms. Yet the complicating factor of rare terminology is just one example of the many situations in translation in which a 1:1 rendering is not possible. Clearly, the dynamic transformation of the signifiers in a source text necessary to produce an accurate and legible translation is an act of creative interpretation that is totally beyond the present capabilities of translation software, particularly software based on probability models.

Pseudo-anglicisms

Pseudo-anglicisms, that is, words borrowed from English and invested with new meaning in another language, are particularly abundant in German marketing texts. PR and communications agencies in Germany regularly employ mountains of English marketing jargon in their client presentations. Frequently, highly specialized terms are employed incorrectly; in other cases, one encounters pseudo-English terminology that has been codified with new definitions. A harrowing job for the translator. I was working on a PowerPoint presentation yesterday in which a list of proposed marketing activities appeared; one bullet point read: “Claim als Crowner.” I found this rather funny – two putatively English words side-by-side that an English speaker would not understand. The term “Claim” is quite common and used in German to mean “slogan.” I had to do some research on “Crowner,” however, which turns out to be an advertising sticker on a display case (a riff on Krönung, perhaps?). I’m often curious as to the origin of these terms, which, in many instances, seem to be rooted in a misunderstanding of English. The term “claim,” of course, can be used in English to indicate an argument made about the merits of a product. I would suppose it was simply misinterpreted some decades ago by an exchange student who went on to become an influential marketing guru in Germany. “Crowner,” on the other hand, has clearly been spun from whole cloth, much like the German marketing terms “BlowUp” (for an extremely large billboard) and “Sell-Out-Unterstützung” (the meaning of which I am still trying to ascertain).

Proper names that aren’t so proper

There’s a gym near our apartment in Berlin that advertises itself as a “health and fitness club” on a sign above the main entrance. I walked past the gym regularly for weeks and always thought to myself, “Now, what’s the actual name of this place? They should display the name more prominently.” This morning my patience was finally exhausted. Determined to identify the gym’s name, I stopped to inspect the sign and peer through the windows. Lo and behold: the place is simply called “Health and Fitness Club.”

Proper names based on everyday English words are actually not that rare in Germany, and I find them to be highly annoying. In Stuttgart, for example, there’s a company called “Financial Consulting,” a business name you could never actually register in the U.S., because it fails to identify the business uniquely. (I can just imagine the founder filling out the registration papers: Name of business? Financial consulting. Type of business? Financial consulting…).

Loanwords and misnomers

English loanwords are proliferating in the German language, and new words are adopted on a daily basis. Borrowing is particularly pronounced in the business world, where mastery of English neologisms carries a certain cachet. Scores of English words invariably crop up in German marketing texts. Take the following extreme example, which I came across recently: “Last not least sorgen unter anderem vier Bands mit heißem Sound für eine tolle Stimmung und echtes Partyfeeling.”

The particular problem for the translator, which the above sentence demonstrates fairly well, relates to the way in which borrowed words are often invested with new meaning in German. “Partyfeeling” is of course an invented compound which only the most foolhardy would dare to transcribe directly. Yet the most probable renderings (“party atmosphere” or similar) are also fraught with the potential for inaccuracy, as the term “Party” in German can be an acceptable designation for a rather staid corporate reception, which is not the case in English.

False cognates await the unwary translator in a myriad of seemingly innocuous contexts. In German, for example, the term “Trailer” designates all manner of short promotional videos. In English, however, a “trailer” is specifically a promotional clip for a feature film; a translator unaware of this fact can make a serious error. In a similar vein, the term “Headline” is used in German to indicate the heading of any document; in English, by contrast, we only speak of headlines in the newspaper. An even stranger example is the German use of the term “Wording,” which refers generally to a company’s internal language; “wording” in English, of course, merely refers to the way something is phrased.

The German use of the word “team” in business contexts is also a source of particular frustration for the translator. In Germany the staff of any company is often referred to as the “team”; in English, by contrast, the term is used much more sparingly to designate an inter-organizational group with a specific task, if it is used at all. Confronted with the term in many contexts, the translator may feel compelled to select a substitute, yet he can only do so at the risk of arousing the resistance of the client, for whom “team” is the firmly established designation.

Can’t computers do that now?

One question I often hear when telling people I’m a translator is: “Can’t computers do that now?” I find this question surprising, as it exhibits a disturbing lack of awareness of what a translator actually does. Although it may not be all too surprising to hear this perception voiced by someone who has never learned a foreign language, even those with a grasp of the difficulties that pervade language translation often contend that the hurdles are merely technical, and that sooner or later machine translation will reach maturity. Indeed, if a computer program can beat Kasparov at chess, why can’t we develop one to master the task of translation?

Google seems to think it can, and has been touting the positive results of its statistics-based approach to machine translation. Whereas previous attempts to write translation programs involved efforts by linguists to define rules for transcribing text from one language to another, Google has thrown its weight behind a different approach in which reams of text are fed into a computer for the development of probability-based models. Although advances have been made with this approach, and computers are likely to close the gap on human translators in coming years, it seems doubtful that a computer will ever surmount all of the hurdles facing machine translation. An interesting article in The Atlantic highlights some of the basic problems faced by machine translation, such as variance in word order and grammatical structure between languages. Far more problematic for the development of an accurate and reliable translation program, however, are the idiomatic and cultural properties of language.

The panacea of intercultural exchange envisioned by some is complicated by the fact that computers don’t understand that institutional and cultural environments often inform specific texts. Translation is an act of negotiation, and sometimes suitable equivalents for certain expressions or terms do not exist. Oftentimes, the translator must heavily modify the target text to arrive at an appropriate adaptation. Take the following sentence, for example: “Die Gebaeude im Bezirk sind zu über 80 Prozent von gründerzeitlicher Altbausubstanz geprägt.” The real problem here, of course, is the word gründerzeitlich, a term that has no equivalent in English. (Google’s translation software doesn’t even attempt to deal with it, yielding “The buildings in the district are more than 80 percent from gründerzeitliche houses marked substance.”)

German readers know that the Gründerzeit was an historical period that generated a specific architectural style in Germany and Austria. English-speaking readers lack this context. An effective translation of the above sentence would take this into account and perhaps offer a gloss. Here’s what I came up with: “Over 80% of the buildings in the district were originally constructed in the German architectural period known as the Gründerzeit (“founding epoch”).”

This doesn’t seem all that complicated on the face of it, but it requires a certain sensitivity to intercultural contexts, something that a computer program running on probability models lacks. There is literally no way to arrive at the formulation “originally constructed in the architectural period” without an act of interpretation and awareness of one’s reader.