• TOCTOU Twins

    Effective security boundaries require conclusive checks (data is or is not valid) with well-defined outcomes (access is or is not granted). Yet the passage between boundaries is fraught with danger. As the twin-faced Roman god Janus watched over doors and gates – areas of transition – so does the twin-faced demon of insecurity, TOCTOU, infiltrate web apps.

    This demon’s two faces watch the state of data and resources within an app. They are named:

    • Time of check (TOC) – When the data is inspected, such as whether an email address is well formed or text contains a <script> tag. Data received from the browser is considered “tainted” because a malicious user may have manipulated it. If the data passes a validation function, then the taint may be removed and the data permitted entry deeper into the app.
    • Time of use (TOU) – When the app performs an operation with the data. For example, inserting it into a SQL statement or web page. Weaknesses occur when the app assumes the data has not changed since it was last checked. Vulns occur when the change relates to a security control.

    Boundaries protect web apps from vulns like unauthorized access and arbitrary code execution. They may be enforced by programming patterns like parameterized SQL statements that ensure data can’t corrupt a query’s grammar, or access controls that ensure a user has permission to view data.

    We see security boundaries when encountering invalid certs. The browser blocks the request with a dire warning that “bad things” might be happening, then asks the user if they want to continue anyway. (As browser boundaries go, that one’s probably the most visible and least effective.)

    Ideally, security checks are robust enough to prevent malicious data from entering the app. But there are subtle problems to avoid in the time between when a resource is checked (TOC) and when the app uses that resource (TOU), specifically duration and transformation.

    One way to illustrate a problem in the duration of time between TOC and TOU is in an access control mechanism. User Alice has been granted admin access to a system on Monday. She logs in on Monday and keeps her session active. On Tuesday her access is revoked. But the security check for revoked access only occurs at login time. Since she maintained an active session, her admin rights remain valid.

    Another example would be bidding for an item. Alice has a set of tokens with which she can bid. The bidding algorithm requires users to have sufficient tokens before they may bid on an item. In this case, Alice starts off with enough tokens for a bid, but bids with a lower number than her total. The bid is accepted. Then, she bids again with an amount far beyond her total. But the app failed to check the second bid, having already seen that she has more than enough to cover her bid. Or, she could bid the same total on a different item. Now she’s committed more than her total to two different items, which will be problematic if she wins both.

    In both cases, state information has changed between the TOC and TOU. The resource was not marked as newly tainted upon each change, which let the app assume the outcome of a previous check remained valid. You might apply a technical solution to the first problem: conduct the privilege check upon each use rather than first use (the privilege checks in the Unix sudo command can be configured as never, always, or first use within a specific duration). You might solve the second problem with a policy solution: punish users who fail to meet bids with fines or account suspension. One of the challenges of state transitions is that the app doesn’t always have omniscient perception.
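
    Here's a minimal sketch of both ideas, not a definitive implementation. Everything in it is hypothetical: the in-memory adminUsers set, the sessions map, and the TokenLedger class stand in for whatever database or identity service a real app would use. The first half contrasts a check cached at login with one repeated at time of use. The second half swaps the policy solution for a technical one by reserving tokens in the same step that checks them, so a second bid can't reuse the first bid's check.

```javascript
// Hypothetical in-memory stores; a real app would query a database or identity service.
const adminUsers = new Set(["alice"]);
const sessions = new Map(); // sessionId -> { user, isAdminAtLogin }

function login(sessionId, user) {
  // Time of check: privileges are inspected once, at login.
  sessions.set(sessionId, { user, isAdminAtLogin: adminUsers.has(user) });
}

// Weak: trusts the answer cached at login, however old it is.
function canAdministerStale(sessionId) {
  const session = sessions.get(sessionId);
  return session !== undefined && session.isAdminAtLogin;
}

// Stronger: repeats the privilege check at the time of use.
function canAdministerFresh(sessionId) {
  const session = sessions.get(sessionId);
  return session !== undefined && adminUsers.has(session.user);
}

// Bids: check and reserve in one synchronous step so that every bid is
// measured against the tokens that are still uncommitted.
class TokenLedger {
  constructor(total) {
    this.total = total;
    this.reserved = 0;
  }

  placeBid(amount) {
    if (!Number.isInteger(amount) || amount <= 0) return false;
    if (amount > this.total - this.reserved) return false; // check...
    this.reserved += amount; // ...and use, with nothing in between
    return true;
  }
}
```

    If Alice logs in and her name is then removed from adminUsers, canAdministerStale still says yes while canAdministerFresh says no. A ledger created with 100 tokens accepts a bid of 60 and rejects a second bid of 60. The sketch leans on JavaScript running placeBid without interruption; an app with several processes or a shared database would need a transaction or lock to get the same guarantee.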

    Transformation of data between the TOC and TOU is another potential security weakness. A famous web-related example was the IIS “overlong UTF-8” vulnerability from 2000 – it was successfully exploited by worms for months after Microsoft released patches.

    Web servers must be careful to restrict file access to the web document root. Otherwise, an attacker could use a directory traversal attack to gain unauthorized access to the file system. For example, the IIS vuln was exploited to reach cmd.exe when the app’s pages were stored on the same volume as the Windows system32 directory. All the attacker needed to do was submit a URL like:

    https://iis.site/dir/..%c0%af..%c0%af..%c0%af../winnt/system32/cmd.exe?/c+dir+c:\

    Normally, IIS knew enough to limit directory traversals to the document root. However, the %c0%af combination didn’t appear to be a path separator. The security check was unaware of overlong UTF-8 encoding for a forward slash (/). Thus, IIS received the URL, accepted the path, decoded the characters, then served the resource.
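
    The ordering problem can be sketched in a few lines. This is an illustration under stated assumptions, not a reconstruction of IIS internals: lenientDecode is a hypothetical decoder written to mimic the flaw by accepting overlong two-byte sequences, and hasTraversal, checkThenDecode, and decodeThenCheck are made-up names. The safer path relies on the standard decodeURIComponent function, which throws a URIError on invalid UTF-8 such as %c0%af.

```javascript
// Hypothetical lenient decoder: percent-decodes to bytes, then accepts any
// two-byte UTF-8 sequence without rejecting overlong forms such as C0 AF.
// Assumes ASCII input outside of the percent escapes.
function lenientDecode(path) {
  const bytes = [];
  for (let i = 0; i < path.length; i++) {
    const hex = path.slice(i + 1, i + 3);
    if (path[i] === "%" && /^[0-9a-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      bytes.push(path.charCodeAt(i) & 0xff);
    }
  }
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    const next = bytes[i + 1];
    if ((b & 0xe0) === 0xc0 && next !== undefined && (next & 0xc0) === 0x80) {
      // No minimum-value test here, so C0 AF quietly becomes 0x2F, a slash.
      out += String.fromCharCode(((b & 0x1f) << 6) | (next & 0x3f));
      i++;
    } else {
      out += String.fromCharCode(b);
    }
  }
  return out;
}

// True when any path segment is a parent-directory reference.
function hasTraversal(path) {
  return path.split(/[\\/]/).includes("..");
}

// Flawed order: the check runs on the raw path, the use runs on the decoded one.
function checkThenDecode(rawPath) {
  if (hasTraversal(rawPath)) return null;
  return lenientDecode(rawPath);
}

// Safer order: decode strictly, then check the exact value that will be used.
function decodeThenCheck(rawPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(rawPath); // throws on overlong or malformed UTF-8
  } catch {
    return null;
  }
  return hasTraversal(decoded) ? null : decoded;
}
```

    Given /dir/..%c0%af..%c0%af../winnt/system32/cmd.exe, checkThenDecode sees no ".." segment, then hands back a path full of them. decodeThenCheck rejects the same input outright.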

    Unchecked data transformation also leads to HTML injection vulnerabilities when the app doesn’t normalize data consistently upon input or encode it properly for output. For example, %22 is a perfectly safe encoding for an href value. But if the %22 is decoded and the href’s value created with string concatenation, then it’s a short step to busting the app’s HTML. Normalization needs to be done carefully.
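
    As a rough sketch of that short step, assume a hypothetical app that decodes an href value and then glues it into markup. The function names and the sample value are invented for illustration, and escapeHtmlAttribute covers only a double-quoted attribute context.

```javascript
// Minimal output encoder for a double-quoted HTML attribute value.
function escapeHtmlAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Flawed: the value is transformed (decoded) and then concatenated as-is.
function linkByConcatenation(encodedHref) {
  const href = decodeURIComponent(encodedHref);
  return '<a href="' + href + '">link</a>';
}

// Safer: the decoded value is encoded again for the context it lands in.
function linkWithOutputEncoding(encodedHref) {
  const href = decodeURIComponent(encodedHref);
  return '<a href="' + escapeHtmlAttribute(href) + '">link</a>';
}
```

    With the value /page?q=%22%20onmouseover=%22alert(9), the first function closes the href early and creates an onmouseover attribute. The second keeps the decoded quotation marks inside the attribute as &quot; entities.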

    TOCTOU problems are usually discussed in terms of file system race conditions, but there’s no reason to limit the concept to file states. Reading source code can be as difficult as decoding Cocteau Twins lyrics. But you should still review your app for important state changes or data transformations and consider whether security controls are sufficient against attacks like input validation bypass, replay, or repudiation:

    • What happens between a security check and a state change?
    • How are concurrent read/writes handled for the resource? Is it a “global” resource that any thread, process, or parallel operation might act on?
    • How long does the resource live? Is it checked before each use or only on first use?
    • When is data considered tainted?
    • When is it considered safe?
    • When is it normalized?
    • When is it encoded?

    January was named after the Roman god Janus. As you look ahead to the new year, consider looking back at your code for the boundaries where a TOCTOU demon might be lurking.

    • • •
  • HIQR for the SPQR

    Friends, Romans, coding devs, lend me your eyes. I’ve created an HTML Injection Quick Reference (HIQR). More details here.

    It’s not in iambic pentameter, but there’s a certain rhythm to the placement of quotation marks, less-than signs, and alert functions.

    For those unfamiliar with HTML injection (or cross-site scripting in the Latin Vulgate), it’s a vuln that can be exploited to modify a web page in a way that changes the DOM or executes arbitrary JavaScript. In the worst cases, the app delivers malicious content to anyone who visits the infected page. Insecure string concatenation is the most common programming error that leads to this flaw.

    Imagine an app that allows users to include <img> tags in comments, perhaps to show off cute pictures of spiders. Thus, the app expects image elements whose src attribute points anywhere on the web. For example:

    <img src="https://web.site/image.png">
    

    If users were limited to nicely formed https links, all would be well in the world. (Sort of, there’d still be an issue of what content that link pointed to, whether obscene, copyrighted, malware, multi-GB images that would DoS browsers or sites they’re sourced from, and so on. But those are threat models for a different day.)

    There’s already trouble brewing in the form of javascript: schemes. For example, an attacker could inject arbitrary JavaScript into the page – a dangerous situation considering it would be executing within the page’s Same Origin Policy.

    <img src="javascript:alert(9)">
    

    Then there’s the trouble with attributes. Even if the site restricted schemes to https:, an uncreative hacker could simply add an inline event handler. For example:

    <img src="https://&" onerror="alert(9)">
    

    Now the attacker has two ways of executing JavaScript in their victim’s browsers – javascript: schemes and event handlers.

    There’s more.

    Suppose the app writes anything the user submits into the web page. We’ll even imagine that the app’s developers have decided to enforce an https: scheme and the tag may only contain a src value. In an attempt to be more secure, the app writes the user’s src value into an <img> element with no event handlers. This is where string concatenation rears its ugly, insecure head. For example, the hacker submits the following src attribute:

    https:">alert(9)
    

    The app drops this value into the src attribute and, presto!, a new element appears. Notice the two characters at the end of the line, ">, these were the intended end of the src attribute and <img> tag, which the attacker’s payload subverted:

    <img src="https:">alert(9)>">
    

    A few more tweaks to the payload, such as creating some <script> tags, and the page is fully compromised.
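
    One way to close off all three routes (schemes, extra attributes, and broken-out quotes) is to parse the value, allow only the expected scheme, and encode what's left for the attribute. The sketch below assumes the standard URL class found in browsers and Node.js; renderCommentImage and escapeAttr are hypothetical names, and a real app would more likely lean on a templating framework or DOM APIs than hand-built strings.

```javascript
// Encode a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Returns markup for an allowed image source, or null when the value is rejected.
function renderCommentImage(src) {
  let url;
  try {
    url = new URL(src); // parsing doubles as normalization
  } catch {
    return null; // not an absolute URL the parser understands
  }
  if (url.protocol !== "https:") return null; // inclusion list of one scheme
  // Encode the normalized value, not the raw input, for the attribute context.
  return '<img src="' + escapeAttr(url.href) + '">';
}
```

    A javascript: scheme comes back as null. Values that try to smuggle in a quotation mark are either rejected by the parser or end up encoded, so they stay inside the src attribute.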

    HTML injection attacks become increasingly complex depending on the context of where the payload is rendered, whether characters are affected by validation filters, whether regexes are used to deny malicious payloads, and how payloads are encoded before being placed on the page.

    SPQR (Senātus Populusque Rōmānus) was the Latin abbreviation used to refer to the collective citizens of the Roman empire. Read up on HTML injection and you’ll become SPQH (Senātus Populusque Haxxor) soon enough.

    • • •
  • Escape from Normality

    John Carpenter fans know the only way you’ll escape from New York is if Snake Plissken is there to get you out. When it comes to web security, don’t bother waiting for Kurt Russell’s help. You’re on your own.

    And if you’re dealing with escape characters in JavaScript strings, you’ll want to make sure your application is a maximum security environment.

    Imagine an app with a search function. It takes a form field named q and, instead of reflecting the search term in the field’s value, it updates the value attribute with a one-line JavaScript call. Normally, you’d expect an app to just rewrite the <input> field like so:

    <input id="searchResult" type="text" name="q" value="abc">
    

    It’s not necessarily a bad idea to update the element’s value with JavaScript. Building HTML with string concatenation is a notorious vector for XSS. Writing the value with JavaScript might be more secure than rebuilding the HTML every time because the assignment avoids several encoding problems. This works if you’re keeping the HTML static and trading JSON messages with the server.

    On the other hand, if you move the server-side string concatenation from the <input> field to a <script> tag, then you’ve shifted the XSS problem to a different vector. In our target app, the <input> field’s value was delimited with quotation marks ("). The JavaScript code uses apostrophes (') to delimit the string, as follows:

    <script>
    document.getElementById('searchResult').value = 'abc';
    </script>
    

    Rather than strip apostrophes from the search variable’s value, the developers decided to escape them with backslashes. Here’s how it’s expected to work when a user searches for abc'.

    document.getElementById('searchResult').value = 'abc\'';
    

    Escaping the payload’s apostrophe preserves the original string delimiters, prevents the JavaScript syntax from being manipulated, and blocks HTML injection attacks – so it seems.

    What if the escape is escaped? Perhaps by throwing a backslash of your own into a search term like abc\'.

    document.getElementById('searchResult').value = 'abc\\'';
    

    The developers caught the apostrophe, but missed the backslash. When JavaScript tokenizes the string it sees the escape working on the second backslash instead of the apostrophe. This corrupts the syntax, as follows:

    //            ⬇ end of string token
    value = 'abc\\'';
    //             ⬆ dangling apostrophe
    

    From here we just start throwing HTML injection payloads against the app. JavaScript interprets \\ as a single backslash, accepts the apostrophe as the string terminator, and parses the rest of our payload.

    https://web.site/search?q=abc\';alert(9)//

    document.getElementById('searchResult').value = 'abc\\';alert(9)//';
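
    To see the failure outside a browser, here's a small sketch. naiveEscape is a guess at what the developers' escaping might look like, and toScriptStringLiteral is one possible alternative rather than the only fix: it lets JSON.stringify produce the string literal (which escapes backslashes and quotation marks together) and additionally escapes < so the value can't close the surrounding <script> element. All function names are hypothetical.

```javascript
// Flawed: escapes apostrophes but leaves backslashes alone.
function naiveEscape(term) {
  return term.replace(/'/g, "\\'");
}

// Safer sketch: JSON.stringify emits a complete double-quoted literal.
// The extra replacements keep "</script>" and line separators out of the page.
function toScriptStringLiteral(term) {
  return JSON.stringify(String(term))
    .replace(/</g, "\\u003c")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// Builds the one-line script the way the target app does.
function renderNaive(term) {
  return "document.getElementById('searchResult').value = '" + naiveEscape(term) + "';";
}

// Builds the same line with the literal generated as a single unit.
function renderSafer(term) {
  return "document.getElementById('searchResult').value = " + toScriptStringLiteral(term) + ";";
}
```

    Passing the search term abc\';alert(9)// to renderNaive reproduces the broken line above. renderSafer doubles the backslash inside a double-quoted literal, so the apostrophe never acts as a delimiter.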
    

    JavaScript’s semantics are lovely from an attacker’s perspective. Here’s an example payload using the String concatenation operator (+) to glue the alert function to the value:

    https://web.site/search?q=abc\'%2balert(9)//

    document.getElementById('searchResult').value = 'abc\\'+alert(9)//';
    

    Or we could try a payload that uses the modulo operator (%) between the String and our alert.

    abc\'%alert(9)//
    

    Maybe the developers added the alert function to a denylist, e.g. with a regex like alert\( that looks for the function name followed by an opening parenthesis. In that case, call the function via the window object’s property list. This makes it look like an innocuous string to naive regexes:

    abc\'%window["alert"](9)//
    

    What happens if the denylist contained the word alert altogether? Build the string character by character:

    abc\'%window[String.fromCharCode(0x61,0x6c,0x65,0x72,0x74)](9)//
    

    By now we’ve turned an evasion of an escaped apostrophe into an exercise in obfuscation and filter bypasses. These examples focused on all the permutations of escape sequences in JavaScript strings. Check out the HIQR for more anti-regex patterns and JavaScript obfuscation techniques.
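
    A quick sketch makes the denylist problem concrete. The two regexes are guesses at what a naive filter might contain, and the payloads are the ones from this post; denylists, payloads, and blockedBy are names made up for the illustration.

```javascript
// Hypothetical denylist patterns a developer might reach for.
const denylists = {
  callSyntax: /alert\(/, // "alert" followed by an opening parenthesis
  anyAlert: /alert/i, // the word "alert" anywhere
};

// Payloads from the examples above, written as JavaScript string literals.
const payloads = [
  "abc\\';alert(9)//",
  "abc\\'%window[\"alert\"](9)//",
  "abc\\'%window[String.fromCharCode(0x61,0x6c,0x65,0x72,0x74)](9)//",
];

// True when the pattern would have stopped the payload.
function blockedBy(pattern, payload) {
  return pattern.test(payload);
}

// Prints one row per payload showing which filters caught it.
for (const payload of payloads) {
  console.log(
    payload,
    "callSyntax:", blockedBy(denylists.callSyntax, payload),
    "anyAlert:", blockedBy(denylists.anyAlert, payload)
  );
}
```

    Each pattern catches the payload it was written for and misses the next variation. The last payload passes both filters, yet String.fromCharCode still rebuilds the word alert at run time.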

    A few additional tips when defending against the payloads:

    • In code reviews, be suspicious of string concatenation. Use safer methods to bind user-supplied data to HTML.
    • If you create output encoding methods rather than relying on frameworks like React, make sure they match the DOM context where the data will be written.
    • Normalize data before operating on it, whether this entails character set conversion, character encoding, substitution, or removal.
    • Apply security checks after normalization, preferring inclusion lists over exclusion lists – it’s a lot easier to guess what’s safe than assume what’s dangerous.
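
    The last two tips can be sketched together. This is one possible shape under stated assumptions, not a complete input filter: it uses Unicode NFKC normalization as the example transformation, and the inclusion list (letters, digits, and spaces, up to 64 characters) is an arbitrary policy chosen for a search field. SAFE_TERM, normalizeTerm, and acceptSearchTerm are hypothetical names.

```javascript
// Inclusion list: letters, digits, and spaces only, 1 to 64 characters.
const SAFE_TERM = /^[\p{L}\p{N} ]{1,64}$/u;

// Normalize first so that look-alike forms collapse to what the app will use.
function normalizeTerm(raw) {
  return String(raw).normalize("NFKC").trim();
}

// Returns the normalized term when it passes the inclusion list, otherwise null.
function acceptSearchTerm(raw) {
  const term = normalizeTerm(raw);
  return SAFE_TERM.test(term) ? term : null;
}
```

    A fullwidth apostrophe (U+FF07) shows why the order matters. The raw value contains no ASCII apostrophe, so a check made before normalization would miss it, but NFKC turns it into one. Checking after normalization rejects it, while a term like abc passes through unchanged.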

    Normalization is an important first step. Any time you transform data you should reapply security checks. Snake Plissken was never one for offering advice. Instead, think of The Hitchhiker’s Guide to the Galaxy and recall Trillian’s report as the Infinite Improbability Drive powers down (p. 61):

    …we have normality, I repeat we have normality….Anything you still can’t cope with is therefore your own problem.

    Good luck with normality and trying to correctly escape data. Security isn’t a certainty, but one thing is, at least according to Queen – there’s “no escape from reality.”

    • • •