I have yet to create a full taxonomy of the mistakes developers make that lead to insecure code. As a brief note towards that effort, here’s an HTML injection (aka cross-site scripting) example that’s due to a series of tragic assumptions that conspire to not only leave the site vulnerable, but waste lines of code doing so.
The first clue lies in the querystring’s state parameter. The site renders the state’s value into a title element. Naturally, a first probe for HTML injection would be attempting to terminate that tag. If successful, then it’s trivial to append arbitrary markup such as <script> tags. A simple probe appends a closing </title> tag, plus some trailing markup, to an innocuous value such as abc.
The site responds by stripping the payload’s </title> tag (plus any subsequent characters). Only the text leading up to the injected tag is rendered within the title element:
<HTML> <HEAD> <TITLE>abc</TITLE>
This seems to have effectively countered the attack without exposing any vuln. Of course, if you’ve been reading this blog for any length of time, you’ll know this trope of deceitful appearances always leads to a vuln. That which seems secure shatters under scrutiny.
The developers knew that an attacker might try to inject a closing </title> tag. Consequently, they created a filter to watch for such things and strip them. This could be implemented as a basic case-insensitive string comparison or a trivial regex.
And it could be bypassed by just a few characters.
Consider the following closing tags. Regardless of whether they seem surprising or silly, the extraneous characters are meaningless to HTML yet meaningful to our exploit because they foil the assumption that regexes make good parsers.
<%00/title> <""/title> </title""> </title id="">
After inspecting how the site responds to each of the tags, it’s apparent that the site’s filter only expected a so-called “good” </title> tag. Browsers don’t care about an attribute on the closing tag. (They’ll ignore such characters as long as they don’t violate parsing rules.)
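To make the failure concrete, here is a minimal sketch of what such a filter might look like. The site’s real code is unknown; the function name strip_title_close and the regex are assumptions chosen to reproduce the behavior observed above (strip a literal </title>, in any case, plus everything after it).

```python
import re

# Hypothetical filter: matches only a "well-formed" closing tag, then
# swallows the rest of the value. This is a guess at the site's logic.
NAIVE_CLOSE_TITLE = re.compile(r"</title>.*", re.IGNORECASE | re.DOTALL)


def strip_title_close(value: str) -> str:
    """Remove a literal </title> and any characters that follow it."""
    return NAIVE_CLOSE_TITLE.sub("", value)


# The textbook probe is stripped back to the harmless prefix.
blocked = strip_title_close("abc</TITLE><script>alert(9)</script>")

# An attribute on the closing tag no longer matches the pattern,
# so the whole payload passes through untouched.
bypassed = strip_title_close('abc</title id="a"><img src=x onerror=alert(9)>')

print(repr(blocked))
print(repr(bypassed))
```

The filter does exactly what it was written to do, which is the problem: it recognizes one spelling of a closing tag while the browser accepts many.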
Next, we combine the filter bypass with a payload. In this case, we’ll use an image tag whose onerror handler calls alert(9).
The attack works! We should have been less sloppy and added an opening <TITLE> tag to match the newly orphaned closing one. A good exploit should not leave the page messier than it was before.
<HTML> <HEAD> <TITLE>abc</title id="a"><img src=x onerror=alert(9)> Vulnerable & Exploited Information Resource Center</TITLE>
The tragedy of this vuln is that it proves the site’s developers were aware of the concept of HTML injection exploits, but failed to grasp the fundamental characteristics of the vuln. The effort spent blocking an attack (i.e. countering an injected closing tag) not only wasted lines of code on an incorrect fix, but left the naive developers with a false sense of security. The code became more complex and less secure.
The mistake also highlights the danger of assuming that well-formed markup is the only kind of markup. Browsers are capricious beasts; they must dance around typos, stomp upon (or skirt around) errors, and walk bravely amongst bizarrely nested tags. This syntactic havoc is why regexes are notoriously worse at dealing with HTML than proper parsers.
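As a rough illustration of that tolerance, the sketch below feeds the exploited title markup to Python’s standard html.parser. This is not a browser engine and its error handling differs in places, so treat it only as an approximation; the TagLogger class is a hypothetical helper that records which tags the parser reports.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: record start and end tags in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


logger = TagLogger()
# Same shape as the exploited response: an attribute on the closing tag.
logger.feed('<title>abc</title id="a"><img src=x onerror=alert(9)> rest</title>')
logger.close()

for event in logger.events:
    print(event)
```

The parser reports the decorated closing tag as an ordinary end of the title element and then sees the img element as live markup, which a pattern that insists on a literal </title> never accounts for.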
There’s an ancillary lesson here in terms of automated testing (or quality manual pen testing, for that matter). A scan of the site might easily miss the vuln if it uses a payload that the filter blocks, or doesn’t apply any attack variants. This is one way sites “become” vulnerable when code doesn’t change, but attacks do.
And it’s one way developers must change their attitudes from trying to outsmart attackers to focusing on basic security principles.
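The post doesn’t name a specific fix, but one widely cited basic principle is to encode untrusted data for the context it lands in rather than hunting for bad tags. A minimal sketch, assuming the title is built by simple string formatting (render_title is a hypothetical function, not the site’s code):

```python
from html import escape


def render_title(state: str) -> str:
    """Place untrusted text inside a title element as text, not markup."""
    # escape() turns &, <, > and quote characters into character
    # references, so no injected tag can open or close an element.
    return f"<TITLE>{escape(state)}</TITLE>"


page = render_title('abc</title id="a"><img src=x onerror=alert(9)>')
print(page)
```

With this approach there is no list of dangerous tags to maintain, so there are no variants to forget. The payload simply renders as text in the title.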