
Write It Like It’s Stolen

Source code. How would the risks to your web site change if its source code were stolen? Hard-coded passphrases? String concatenation of SQL statements? How much of your security relies on secrecy of functionality versus secrecy of data? Think of it in terms of Kerckhoffs’s Principle, roughly: “The system must not require secrecy, and it must be able to fall into the enemy’s hands without causing trouble.” Kerckhoffs was writing about cryptography, but the concept applies well to software (setting aside certain issues like intellectual property; here we’re focusing on web security).
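
To make the thought experiment concrete, here’s a minimal sketch (Python, with invented names; the env var is an assumption for illustration, not from any real codebase) of the difference between failing and passing Kerckhoffs’s test:

```python
import os

# Fails the test: the secret ships with the source, so stolen code
# means a stolen credential.
DB_PASSPHRASE = "s3cr3t-passphrase"  # hard-coded

# Passes the test: the source can be public because only the deployment
# environment knows the secret. (The env var name is illustrative.)
db_passphrase = os.environ.get("DB_PASSPHRASE")
```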

In January 2012 Reuters reported that Symantec source code had been stolen. A month later the source started appearing publicly. The compromise, initially dismissed as affecting only some five-year-old code, unleashed a slew of updates to the pcAnywhere products. Of the several notable points about this hack, one to emphasize is the speed with which vulnerabilities were identified after the initial compromise. The implication: attackers reading the stolen source found vulnerabilities that the original developers, with far more convenient access to the same code, had not.

Eyeballs and bugs make an unconditionally decent witch’s brew, whereas the rule “Given enough eyeballs, all bugs are shallow” fails as an unqualified statement. The most famous counter-example is the approximately two-year window of non-random randomness in the Debian OpenSSL package. The rule’s important caveat is that not all eyeballs are created equal (nor are all eyeballs trained on the same bugs). Still, the transparency of open source provided not only the eventual discovery and fix but, perhaps more important, a solid understanding of the exact timeframe and the software packages affected. Such confidence, even when it amounts to knowing software is broken, is a boon to security. (There are other fun reformulations of the rule, such as “Given enough bugs, all software is exploitable” or “Given enough shallowness, no bug is a security concern.” But that takes us off topic…)
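
For a rough sense of why the Debian bug was so devastating, consider a key generator whose only entropy is the process ID. This is a toy Python illustration of the keyspace collapse, not the actual OpenSSL code:

```python
import random

def weak_keygen(pid: int) -> int:
    rng = random.Random(pid)      # seeded by nothing but the PID
    return rng.getrandbits(128)   # a "128-bit" key in name only

# Linux PIDs default to a maximum of 32768, so an attacker can simply
# enumerate every key the generator could ever produce.
all_possible_keys = {weak_keygen(pid) for pid in range(1, 32768)}
print(len(all_possible_keys))     # at most 32767 candidate keys, not 2**128
```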

In any case, write code like it’s going to be stolen. Or at least as if it’s going to be peer reviewed. Reviewed, that is, by someone who might be smarter, a better programmer, more knowledgeable about a security topic, or, at the opposite end of the spectrum, fresh eyes that’ll notice a typo. Pick any of the web sites compromised by SQL injection in the last few years. Not only is SQL injection an inexcusable offense; finding such a simple problem tends to lead to finding other egregious mistakes, like storing passwords unsalted. Having your source code stolen might lead to heckling over stupid programming mistakes. Having your customers’ email addresses and passwords stolen has worse consequences.
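
Both mistakes have boring, well-known fixes. Here’s a minimal sketch using only Python’s standard library; the schema and function names are invented for illustration:

```python
import hashlib
import hmac
import os
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, salt BLOB, pw_hash BLOB)")

def store_user(email: str, password: str) -> None:
    salt = os.urandom(16)  # unique random salt per user
    pw_hash = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Parameterized query: the driver keeps data out of the SQL grammar.
    conn.execute("INSERT INTO users VALUES (?, ?, ?)", (email, salt, pw_hash))

def check_login(email: str, password: str) -> bool:
    # The inexcusable version, for contrast:
    #   conn.execute("SELECT ... WHERE email = '" + email + "'")
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        return False
    salt, pw_hash = row
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, pw_hash)

store_user("alice@example.com", "correct horse battery staple")
print(check_login("alice@example.com", "correct horse battery staple"))  # True
```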

In fact, we should lump privacy into this thought experiment as well. The explosion of mobile apps (along with their HTML5 and similar web site counterparts) has significantly intertwined privacy and security. After all, much of privacy relates to control over the security of our own data. Malicious mobile apps aren’t always the ones trying to pop root on your phone or open a backdoor; they’re also the ones scraping your phone’s brain for everything it knows. It’s one thing to claim an app does or doesn’t perform some action; it’s another to see what it’s actually doing, intended or not.

Your code doesn’t have to be stolen or open source to be secure, but your programmers need to care and need to know what to look for. Source code scanners are one way to add eyeballs, though not necessarily the easiest way. In fact, the future of code compilation (including interpreted languages like PHP and Python) is more likely to bring source code scanning concepts into the compiler itself. After all, why bolt on extra tools, and the all-too-often cumbersome configuration they entail, when you could get the same feedback from the compiler? Clang (and tools like cppcheck) work wonders for cleaning up problematic C++ code. They don’t generate warnings for web security concepts like XSS or SQL injection, but there’s no reason they couldn’t evolve to do so.
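
As a taste of what “scanning in the compiler” might look like, here’s a toy checker built on Python’s own ast module that flags queries assembled from runtime data. The heuristic and all names are mine, purely illustrative:

```python
import ast

SOURCE = '''
db.execute("SELECT * FROM users WHERE email = '" + email + "'")
db.execute("SELECT * FROM users WHERE email = ?", (email,))
'''

class SqlConcatChecker(ast.NodeVisitor):
    def visit_Call(self, node: ast.Call) -> None:
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr == "execute":
            query = node.args[0] if node.args else None
            # A BinOp (concatenation) or JoinedStr (f-string) means the
            # query text varies with runtime data: warn about it.
            if isinstance(query, (ast.BinOp, ast.JoinedStr)):
                print(f"line {node.lineno}: query built from runtime data "
                      "(possible SQL injection)")
        self.generic_visit(node)

SqlConcatChecker().visit(ast.parse(SOURCE))
# -> line 2: query built from runtime data (possible SQL injection)
```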

So which would be cooler: a source code analyzer into which you have to dump your web site, then configure and tweak, or a compiler/interpreter that generates the security warnings for you? Imagine a mod_php for production alongside a mod_phpcheck that natively performs variable taint checking and flags function misuse for you. Not only should web sites shift more of their browser-based computing onto established JavaScript frameworks in order to minimize reinventing the wheel (and the associated security vulns); web languages should also move toward building security analysis into their compilation/interpretation environments. While a project like Emscripten isn’t a direct example of this, it’s a great example of bringing a notoriously loosely typed language like JavaScript into contact with a serious compiler infrastructure like LLVM, and of the potential for better code. Imagine an LLVM “optimizer” that knew how to detect and warn about certain types of DOM-based XSS.
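
And here’s one crude way to picture that kind of taint checking, sketched in Python rather than PHP; the Tainted class, the sink, and the function names are all invented for this example:

```python
import html

class Tainted(str):
    """A string that remembers it came from an untrusted source."""

def from_request(value: str) -> str:
    return Tainted(value)  # everything arriving from the client is tainted

def render_html(fragment: str) -> str:
    # The HTML sink refuses tainted data outright.
    if isinstance(fragment, Tainted):
        raise ValueError("tainted data reached an HTML sink unescaped")
    return fragment

name = from_request("<script>alert(1)</script>")
safe = html.escape(name)    # escaping returns a plain (untainted) str
print(render_html(safe))    # &lt;script&gt;alert(1)&lt;/script&gt;
# render_html(name) would raise: taint flowed straight to the sink.
```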

The day your IDE draws red squiggles underneath code because it’s insecure, rather than because it has a typo, a type error, or a signed/unsigned mismatch, will be a day that web security takes a positive evolutionary step. Evolution favors those best adapted to an environment, and in this case security is best served by the tools used to write and execute code. Until then, we’ll be stuck with the biotic diversity of cumbersome tools and the variably vigilant eyeballs of developers and hackers alike.