"a fish, a barrel, and a smoking gun"
for 25 November 1996. Updated every WEEKDAY.

Royale with Cheese


[English Only]

It would take a cunning linguist

indeed to lick the nasty problem

of international communication

on the web. Those who don't read

the online lingua franca are

limited to a smattering of

resources even more

idiosyncratic than the

English-language fare. The web

may be world-wide, but most of

the content is tongue-tied -

bound to a particular language

group, and bound to be of little

use to the rest of the world.

Multilingual resources are

scarce, and scarcely

multilingual. The United Nations

makes its web information

available in a whopping three

languages. Meanwhile,

Anglophones commercially and

intellectually pistol-whip the

rest of the web world into

publishing their (marginally)

more interesting content in

English. Still, even those of us

in the Tower of Babel's

penthouse can feel like we're

missing out on the fun.

Sometimes it seems the only hope

is that some deus ex machina

will make alien texts more




Even dedicated decoders often

mangle meaning. While Coca-Cola

didn't ever make it over to

China in bottles labeled "Bite

the Wax Tadpole," marketing

marks hankering for homophony

did consider the corrupt

cognate. Worse, an ad campaign

launched in Taiwan actually did

claim "Pepsi brings your

ancestors back from the dead" -

raising all sorts of

generational issues, if not



If texts can get tangled so

severely when there's human

oversight, imagine the problems

when algorithms are your

interlocutors. Automatic machine

translators - whose synapses are

still less complex than even

advertising executives' - have

just as amusing a history of

failure. Illustrative, if

probably apocryphal, is the

story of an early Department of

Defense-funded translator that

morphed "The spirit is willing

but the flesh is weak" into

Russian for "The vodka is strong

but the meat is rotten."



A simple way to avoid confusing

the poor software? Stop hamming

it up about the human spirit and

stick to interlingual

recipe-swapping, the inane

banter of online chat, or other

formulaic verbiage. Automatic

translators designed to function

within certain dry topic areas,

such as software documentation

or business correspondence, meet

with better success than

general-purpose programs. After

all, when your domain of

discourse is corporate

bureaucracy, you have far less

potential for confusion than

with more poetic topics.

Although there's also that story

about the translator that

rendered "Vice President" as

what literally meant "President

of Vice." For the most part,

translation programs trained to

work with specific types of

boring drivel have managed to do

as well as human translators a

few years out of college.

Proffer to such a system a text

slightly out of its topic area,

however, and the results will be

nonsense. Since such narrowly

focused solutions are the only

ones that have worked well,

there would seem little hope for

automatically boiling down web

babble into more widely

comprehensible babble.


[Spanish Stamps]

Two English-to-Japanese web

translation programs have

already appeared, Pensee and

NetSurfer/ej. Both bring

translation ability - of a sort -

to the browser itself, loosing

this rather restricted

translation ability on the web,

the least restricted reservoir

of words the world has yet

known. The immediate effect is

on the client side, but as

translators like these become

widespread, content creators

will no doubt begin to do some

preprocessing - lending a hand

to autopolyglots by providing

output that's easy to

regurgitate. It's easier, after

all, than actually serving up

multilingual fare.



Such translation programs have

gone unnoticed by all but the

most Asian-obsessed site

builders in the States. But

clumsy tools will eventually

encourage the nuts and bolts to

be made bigger and simpler. The

web, already written at a

grade-school level, now has yet

another impetus to become bland.

In order to make web text

amenable to automatic

translation, writers will

further restrict their

vocabulary to avoid tripping up

such software. The lowest common

denominator will mechanically be

forced even lower; the verbal

baby food served up online will

be pureed into an even more

homogeneous and flavorless




Automatic translators aren't

solely to blame for this

broadening of the Internet

audience. We are, too, but as

most web-publishing companies'

financial statements prove,

writing for those without a clue

simply won't simplify things

enough - the writing truly

suited for the web will be

crafted for consumption by

expert systems and rule-based

processes instead of people.

Bots have already joined the

ranks of content consumers, and

will begin reading in more

complex - if not humanlike -

ways. Search bots will be

scanning to see how many times

those critical key words are

repeated in the actual body

text. Summarizing bots will try

to "gist" web writing based on

simple cues and clues.

Matchmaking bots will be reading

the web like a giant personals

section. All of these

innovations will improve upon

any current standard of web

literacy - no one actually reads

the web now, anyways.

courtesy of The Internick