A Trail of Translations

I’ve returned from London with plenty of notes and useful feedback on my RSA presentation about what kind of information on JavaScript, HTML5, and security would be more interesting or more helpful to developers. It’ll take another week or so to sort them into coherent thoughts to post here. Two major topics to cover are more examples of JavaScript anti-patterns and suggestions for default CSP headers, or at least suggestions on a CSP deployment scheme.

Of course, being in London I also had to deal with a foreign language. “Bubble & squeak” appears on breakfast menus, but I have no idea what that is. All the (Queen’s) English I know was taught by years of Dr. Who, Red Dwarf, Monty Python, Fawlty Towers, Blackadder, and so on; I’m sure you’ll notice a theme. As a result, I can name fifty types of aliens that’ve threatened (or destroyed) Big Ben, a dozen kings (not hard, most were Harries), and a few counties and shires — although they might be the same thing, which makes that feat less impressive. Yet that exhaustive studying left a culinary void touched only by Wensleydale, pickle, and fish and chips. So, for that particular breakfast I chose the Full English.

Creating a faithful translation poses interesting challenges. A major decision is weighing literal fidelity against meaning and style, especially when faced with idioms or, as you may have noticed from articles here, subtle (only sometimes) references to pop-culture from music, movies, or books.

Using Google Translate makes for a fun exercise in observing semantic accuracy. Here’s a look at the Chinese and Japanese translations of an article I wrote that highlights key points in HTML5 security. Rather than a pop-culture metaphor, I went with a more…down to earth…choice. (The article is entirely too short, but the editors have a dismal view of readers’ ability to last beyond 1,000 words. Sorry.)

The Chinese version needs to transliterate my name in addition to translating the content. Google informs me the transliterated, translated author was Mike (迈克) Proxima (施玛). It chose a phonetic translation for Mike, but not for the last name. If you separated the characters of “Proxima” (施玛), you’d get nonsense like “impose mary”. But decomposing the characters like that is silly. It’d be like translating the syllables of Shakespeare to call him “rattle pike” or “jiggle fellow”. Bing’s translation engine seems to understand this phonetic concept more clearly, rendering the characters simply, “Shi Ma”.

The article’s original translator chose a different title, which Google says is something like “HTML5 powerful behind security trap”. No problem with a different title. It conveys a similar message even if I have to rely on Google’s creaky English to understand it.

The original translator helpfully added a note about cookies (I told you the word limit was strict). The point is that we need to care about how sites use the Web Storage API just as much as we care about how they use cookies in ways that affect our privacy. Here’s Google’s version of the text:

The easiest and best compatibility of programs Cookie, but as a true client-side storage, Cookie, and there are many fatal. IE6 and above can use the user Data Behavior, global Storage can be used in Firefox, Flash Local Storage can be used in the environment of the Flash plug-in, this in several ways, however, there are limitations of compatibility, So the real use is not ideal.

I’m curious here what the translator thought of “Here be dragons”. The phrase is probably familiar enough in English as shorthand for unknown territory (despite having an almost nonexistent cartographic pedigree). But it probably evokes different imagery for Chinese readers due to cultural background.

The Japanese translator also chose to retitle the article. I like Bing’s version, “HTML5 security – view of its priorities”. It’s a less catchy title, but matches the content just fine.

A cool thing about this version is that the translator ran with the geography metaphor. Google Translate makes the English sound bad, but the gist is that Web Storage data is easily pilfered if there’s an XSS vuln.

It is waiting, whatever that is stored in the browser cross-site scripting and I’ll steal the other side of the thin wall.
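To make that gist concrete, here is a minimal sketch of why an XSS vuln puts Web Storage at risk: any script running in the page's origin can walk every key and copy every value. The `StorageLike` interface, the `MemoryStorage` class, and the `dumpStorage` function are hypothetical names invented for this illustration. The in-memory class only exists so the sketch runs outside a browser, and the stored values are made up.

```typescript
// Minimal subset of the Web Storage interface used in this sketch.
// In a browser, localStorage and sessionStorage both satisfy it.
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical in-memory stand-in so the example runs without a browser.
class MemoryStorage implements StorageLike {
  private items: Record<string, string> = {};

  get length(): number {
    return Object.keys(this.items).length;
  }

  key(index: number): string | null {
    const name = Object.keys(this.items)[index];
    return name === undefined ? null : name;
  }

  getItem(key: string): string | null {
    const value = this.items[key];
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.items[key] = value;
  }
}

// What an injected script could do: enumerate every key and copy its value.
// Nothing here needs special privileges, only the ability to run in the origin.
function dumpStorage(storage: StorageLike): Record<string, string> {
  const copied: Record<string, string> = {};
  for (let i = 0; i < storage.length; i++) {
    const name = storage.key(i);
    if (name !== null) {
      const value = storage.getItem(name);
      copied[name] = value === null ? "" : value;
    }
  }
  return copied;
}

// Made-up data standing in for whatever a site chose to store client-side.
const demoStorage = new MemoryStorage();
demoStorage.setItem("sessionToken", "abc123");
demoStorage.setItem("email", "user@example.com");

const loot = dumpStorage(demoStorage);
```

In a browser the same function should accept `localStorage` directly, which is the point: there is no wall, thin or otherwise, between stored data and a script that has been injected into the page.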

And here’s another sentence that sticks with the metaphor nicely (again, blame Google for the awkwardness of the English, not the article’s translator):

Data in the browser might be secure by their function, when the journey their data toward the wilderness, such as database servers and the external needs your help for their safety.

And finally, Bing has a nice translation that summarizes the relation between HTML5 and security:

Browser is the place of battle.

Yes, it is.

But we don’t have to end this article here. At one point in my RSA presentation I commented on abstracting JavaScript development to a more strongly-typed language. It’s another type of translation, if you will, where working in one language provides a semantic strictness or understanding that is easily lost when working with JavaScript directly. From a security perspective, strictness and clarity are preferable.
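As a rough illustration of that strictness, here is a small sketch in TypeScript. TypeScript is my choice of stand-in for "a more strongly-typed language" here, not something the presentation prescribed. The `SafeHtml` type and the `escapeHtml` and `renderComment` functions are hypothetical names for this example. The idea is that the compiler, rather than code review, rejects an unescaped string where escaped markup is expected.

```typescript
// A "branded" string type: only values returned by escapeHtml carry it.
type SafeHtml = string & { readonly __brand: "SafeHtml" };

// Escapes the characters that matter when text is placed inside element
// content or a quoted attribute. Other contexts (URLs, script blocks, CSS)
// need different handling and are out of scope for this sketch.
function escapeHtml(untrusted: string): SafeHtml {
  const escaped = untrusted
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
  return escaped as SafeHtml;
}

// Accepts only SafeHtml, so a raw string fails type checking.
function renderComment(body: SafeHtml): string {
  return "<p>" + body + "</p>";
}

const userInput = "<img src=x onerror=alert(1)>";
const rendered = renderComment(escapeHtml(userInput));

// The next line would not compile, because a plain string is not SafeHtml:
// renderComment(userInput);
```

The brand exists only at compile time and a determined developer can still cast around it, so this is a guard rail rather than a guarantee. In plain JavaScript, both calls would run without complaint.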

The coolest JavaScript translation I’ve come across is Emscripten. It turns C and C++ into JavaScript. That’s a feat in itself. Even better is compiling C++ into a pure HTML5 (and JavaScript) first-person shooter.

Oh, look. We’re getting close to that 1,000 word limit. At least now you have an excuse to replace this browser tab with one running a 3D game in HTML5, JavaScript, and WebGL. No plugins required. Enjoy.

In the meantime, I’ll be working on those notes.