You’ve Violated APE Law!

Developers who wish to defend their code should be aware of Advanced Persistent Exploitability. It is a situation where breaking code remains possible due to broken code.

La Planète des Singes

Code has errors. Writing has errors. Consider the pervasiveness of spellcheckers and how often the red squiggle complains about a misspelling in as common an activity as composing email. Mistakes happen; they’re a natural consequence of writing, whether code, blog, email, or book. The danger here is that in code these mistakes lead to exploits.

Sometimes coding errors arise from a stubborn refusal to acknowledge fundamental principles, as seen in the Advanced Persistent Ignorance that lets SQL injection persist almost a decade after programming languages first provided countermeasures. That vuln is so old that anyone with sqlmap and a URL can trivially exploit it.

Other coding errors are due to the lack of follow-through to address the fundamental causes of a vuln; the defender fixes the observed exploit as opposed to understanding and fixing the underlying issue. This approach fails when the attacker merely needs to tweak an exploit in order to compromise the vuln again.

We’ll use the following PHP snippet as an example. It has an obvious flaw in the arg parameter:

<?php
$arg = $_GET['arg'];
$r = exec('/bin/ls ' . $arg);
?>
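To make the flaw concrete, here is a minimal sketch that builds the same command string but prints it instead of running it. The `$arg` value is a hypothetical attacker-supplied input, hard-coded here in place of `$_GET['arg']`.

```php
<?php
// Hypothetical attacker input; in the snippet above it would arrive via $_GET['arg'].
$arg = '/tmp; id';

// Same concatenation as the vulnerable code, but printed rather than passed to exec().
$command = '/bin/ls ' . $arg;

// A shell would read ';' as a command separator: first ls, then id.
echo $command, PHP_EOL;
```

Nothing here is specific to `id`; any command could follow the separator.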

Confronted with an exploit that uses a semicolon to execute an arbitrary command, a developer might remember to apply input validation. This is not necessarily wrong, but it is a first step on the dangerous path of the “Clever Factor”. In this case, the developer chose to restrict the parameter to alphabetic characters.

<?php
$arg = $_GET['arg'];
# did one better than escapeshellarg
if(preg_match('/[a-zA-Z]+/', $arg)) {
    $r = exec('/bin/ls ' . $arg);
}
?>

As a first offense, the regex should have been anchored to match the complete input string, i.e. '/^[a-zA-Z]+$/'. Unanchored, it accepts any input that contains at least one letter. That mistake alone should cast doubt on this dev’s understanding of the problem and dismiss any claim to a clever solution. But let’s continue the exercise with three more questions:

Is the intention clear? Is it resilient? Is it maintainable?

This developer declared they “did one better” than the documented solution by restricting input to mixed-case letters. One possible interpretation is that they only expected directories with mixed-case alpha names. A subsequent dev may point out the need to review directories that include numbers or a dot (.) and, as a consequence, relax the regex. That change may still be in the spirit of the validation approach (after all, it’s restricting input to expectations), but if the regex ever changes to allow a space or shell metacharacters, then it’ll be exploited. Again.

This leads to resilience against code churn. The initial code might be clear to someone who understands the regex to be an input filter (albeit an incorrect one in the first version). But the regex’s security requirements are ambiguous enough that someone else may mistakenly change it to allow metacharacters or introduce a typo that weakens it. Additionally, what kind of unit tests accompanied the original version? Merely some strings of known directories and a few negative tests with “./” and “..”? None of those tests would have demonstrated the vulnerability or conveyed the intended security aspect of the regex.
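The kind of negative tests that would have conveyed the security intent can be sketched as follows. The function names and the payload list are illustrative assumptions, not part of the original code. The anchored pattern here uses `\z` rather than `$` because, without the `D` modifier, PCRE’s `$` also matches just before a trailing newline.

```php
<?php
// Hypothetical wrapper around the first (unanchored) version of the filter.
function passes_unanchored(string $arg): bool {
    return preg_match('/[a-zA-Z]+/', $arg) === 1;
}

// Hypothetical wrapper around an anchored version of the filter.
function passes_anchored(string $arg): bool {
    // \z matches only at the very end of the subject string.
    return preg_match('/^[a-zA-Z]+\z/', $arg) === 1;
}

// Illustrative payloads, each carrying a shell metacharacter or a newline.
$payloads = ['tmp; id', 'tmp|id', '$(id)', "tmp\n"];

foreach ($payloads as $payload) {
    printf(
        "%-12s unanchored=%s anchored=%s\n",
        json_encode($payload),
        var_export(passes_unanchored($payload), true),
        var_export(passes_anchored($payload), true)
    );
}
```

Each payload contains at least one letter, so the unanchored filter accepts all of them, while the anchored one rejects them. Tests like these state the regex’s purpose in a way that a list of known directory names does not.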

Code must be maintained over time. In the PHP example, the point of validation is right next to the point of usage. In more complex code, especially long-lived code and projects with multiple committers, the validation check could easily drift further and further from the location where its argument is used. Think of this as the spatial version of the time-of-check to time-of-use (TOCTOU) flaw. The drift dilutes the original developer’s intention, since someone else may not realize the validation context and may re-taint the parameter (such as by concatenating it with other input parameters) or otherwise misuse it.
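A minimal sketch of that drift, assuming a hypothetical second request parameter named `sort` that a later commit appends to the already-validated value. The request array is hard-coded so the sketch runs without a web server, and the command is printed rather than executed.

```php
<?php
// Hypothetical request values standing in for $_GET.
$request = ['arg' => 'tmp', 'sort' => '; id'];

// The check: the directory name is validated when it is first read.
$arg = $request['arg'];
$checked = preg_match('/^[a-zA-Z]+\z/', $arg) === 1;

// ... imagine many lines and several commits in between ...

// The re-taint: a later change appends a second, unvalidated parameter.
$arg = $arg . $request['sort'];

// The use: the earlier check no longer describes what reaches the shell.
$command = '/bin/ls ' . $arg;
echo var_export($checked, true), ' ', $command, PHP_EOL;
```

The validation still reports success, yet the string built for the shell contains a command separator.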

In this scenario, the solution isn’t even difficult. PHP’s documentation gives clear, prominent warnings about how to secure calls to the entire family of exec-style commands.

$r = exec('/bin/ls ' . escapeshellarg($arg));

The recommended solution has a clear intent — escape shell arguments passed to a command. It’s resilient — the PHP function will handle all shell metacharacters, not to mention the character encoding (like UTF-8). And it’s easy to maintain — whatever manipulation the $arg parameter suffers throughout the code, it will be properly secured at its point of usage.
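As a rough illustration of escaping at the point of use, the sketch below wraps the call in a hypothetical helper, `build_ls_command()`, and prints the resulting command string instead of executing it. The helper name and the sample payload are assumptions made for the example.

```php
<?php
// Hypothetical helper: keeps the escaping right where the command is assembled.
function build_ls_command(string $arg): string {
    // escapeshellarg() quotes the value so the shell treats it as one argument.
    return '/bin/ls ' . escapeshellarg($arg);
}

// Printed, not executed, to show what the shell would receive.
echo build_ls_command('/tmp; id'), PHP_EOL;
```

On a Unix-like system the payload should come back wrapped in single quotes, so the semicolon reaches ls as part of a (nonexistent) path name and never acts as a separator. Quoting does not stop a value that begins with a dash from being read by ls as an option; placing `--` before the argument is a common convention for that case, which the original example does not cover.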

It also requires less typing than the back-and-forth of multiple bug comments required to explain the pitfalls of regexes and the necessity of robust defenses. Applying a fix to stop an exploit is not the same as applying a fix to solve a vulnerability’s underlying problem.

There is a wealth of examples of this phenomenon: string-matching on “alert” to block cross-site scripting attacks, renaming files to prevent repeat exploitation (oh, the obscurity!), and stopping a service only to have it restart when the system reboots.

What does the future hold for programmers? Pierre Boulle’s vacationing astronauts perhaps summarized it best in the closing chapter of La Planète des Singes:

Des hommes raisonnables ? … Non, ce n’est pas possible (“Rational men? … No, that is not possible”)

May your interplanetary voyages lead to less strange worlds.