Designing a web application scanner is easy. A good design requires a few sentences; a great design might need two paragraphs or so. It’s easy to find messages on e-mail lists that describe the One True Way to scan a web site.
Implementing a scanner is hard. The core of a web vulnerability scanner performs two functions: find a link, test that link. The task of finding links falls to a crawling engine. The crawler must be fundamentally sound; otherwise links will be missed, and a missed link is an untested link. Untested links leave holes in the site coverage, which raises uncertainty about the state of the site’s security. Yet it’s rarely necessary to hit every link of a web site in order to adequately scan it. Security testing requires comprehensive coverage of the site’s functionality, which is different from covering every single link. A SQL injection vulnerability in the thread ID of a forum can be found by crawling a few sample discussion threads. It’s not necessary to fully enumerate 100,000 threads about nerfing warlocks or debating Mal vs. Kirk.
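The sampling idea can be sketched in a few lines: group links that differ only in parameter values and keep a handful of representatives of each URL "shape." This is a minimal sketch with illustrative URLs and thresholds, not a production crawler.

```python
# Sketch: cap crawling of templated URLs (e.g. forum threads) by grouping
# links that share a path and query-parameter names, then sampling a few
# from each group. The per-signature limit of 3 is an arbitrary choice.
from urllib.parse import urlparse, parse_qs
from collections import defaultdict

def url_signature(url):
    """Reduce a URL to its path plus sorted query-parameter names."""
    parts = urlparse(url)
    return (parts.path, tuple(sorted(parse_qs(parts.query))))

def sample_links(links, per_signature=3):
    """Keep only a few representatives of each URL shape."""
    seen = defaultdict(int)
    kept = []
    for link in links:
        sig = url_signature(link)
        if seen[sig] < per_signature:
            seen[sig] += 1
            kept.append(link)
    return kept

links = ["http://forum.example/thread.php?id=%d" % i for i in range(100000)]
links.append("http://forum.example/profile.php?user=alice")
print(len(sample_links(links)))  # 4: three thread samples plus the profile page
```

Three thread URLs are enough to test the `id` parameter; the other 99,997 add crawl time without adding coverage.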
In addition to crawling strategies, scanners must also be able to crawl a site as an authenticated user. Maintaining an authenticated state requires coordinating several pieces of information (tracking the session cookie, avoiding logout links). But first the scanner must find and submit the login form.
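The session-tracking pieces can be sketched with the standard library: a cookie-aware opener keeps the session alive across requests, and a simple filter keeps the crawler away from links that would end it. The URLs, form field names, and logout keywords below are illustrative assumptions.

```python
# Sketch of authenticated-crawl plumbing: a cookie jar preserves the
# session cookie, and a heuristic filter skips logout links.
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlencode

LOGOUT_HINTS = ("logout", "logoff", "signout", "sign-out")

def is_logout_link(url):
    """Heuristic: skip any link that looks like it ends the session."""
    return any(hint in url.lower() for hint in LOGOUT_HINTS)

def make_session():
    """Build an opener that stores session cookies across requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))

def login(opener, login_url, username, password):
    """Submit the login form; the cookie jar captures the session cookie.
    Field names are assumptions -- real forms vary, as discussed below."""
    data = urlencode({"username": username, "password": password}).encode()
    return opener.open(login_url, data)  # POST, because data is supplied

print(is_logout_link("http://site.example/account/Logout.aspx"))  # True
```

The hard part, as the next paragraphs show, is not maintaining the cookie jar but finding and filling the login form in the first place.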
Simple login forms have a text field, password field, and submit button. The HTML standard provides the markup to create these forms. The standard only defines syntax, not usage. This gives web developers leeway to abuse HTML through ignorance, inefficiency, and what can only be termed outright malice.
Consider the login form created by Sun’s OpenSSO Enterprise 8.0. The HTML roughly breaks down to the following:
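An illustrative sketch of the pattern (the exact markup varies by version; the IDToken1/IDToken2 field names and script-driven submission follow OpenSSO's general approach, but the other attribute names here are approximations):

```html
<form name="Login" action="/UI/Login" method="post">
  <input type="text" name="IDToken1" value="">
  <input type="password" name="IDToken2" value="">
  <input type="hidden" name="IDButton" value="">
  <a href="javascript:LoginSubmit('Log In')">Log In</a>
</form>
<script>
  function LoginSubmit(value) {
    // The "button" is a hidden field set by script; there is no
    // submit input for a scanner to find and click.
    document.Login.IDButton.value = value;
    document.Login.submit();
  }
</script>
```

Nothing in the form itself says "username" or "password," and submission happens through JavaScript rather than a submit button, so a naive scanner finds nothing to click.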
Even uglier login form patterns exist in the wild. In some cases the login form is wrapped within its own HTML element:
<html>
  <head></head>
  <body>
    ...other content
    <div>
      <html><form>
        ...
      </form></html>
    </div>
    ...other content
  </body>
</html>
In other cases, the input fields lack the name attributes needed to submit their values:

Username: <input type="text" value="">
Password: <input type="password" value="">
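Even sloppy markup like this can often be handled with positional heuristics. The sketch below uses one such heuristic (an assumption, not a standard): treat the password input as the anchor and pair it with the nearest preceding text input.

```python
# Sketch: locate likely login fields even when the markup omits form
# elements or name attributes. The pairing heuristic is an assumption.
from html.parser import HTMLParser

class LoginFieldFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.last_text_field = None
        self.pairs = []  # (username_field_attrs, password_field_attrs)

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        kind = (attrs.get("type") or "text").lower()
        if kind == "text":
            self.last_text_field = attrs
        elif kind == "password":
            # Pair the password field with the most recent text field.
            self.pairs.append((self.last_text_field, attrs))

page = ('Username: <input type="text" value=""> '
        'Password: <input type="password" value="">')
finder = LoginFieldFinder()
finder.feed(page)
print(finder.pairs)  # one (text, password) pair, despite the missing names
```

The scanner still has to invent field names or fall back to positional submission when the name attributes are genuinely absent, but at least it knows which inputs matter.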
Then there’s the __doPostBack function in .NET sites, along with their penchant for multiple submit buttons (e.g., one for authentication, another for a search). Now the scanner has to identify the salient fields for authentication and hit the correct submit button; it’s no good to fill out the username and password only to submit the search button.
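Choosing the right button comes down to scoring candidates and including only the winner in the POST body, since browsers submit only the button that was clicked. This is a minimal sketch; the keyword list and field names are assumptions, and real scanners weigh more signals.

```python
# Sketch: pick the submit button that looks like a login action, and
# build a POST body that includes only that button's name/value pair.
LOGIN_WORDS = ("login", "log in", "logon", "sign in", "signin")

def pick_submit_button(buttons):
    """buttons: list of dicts with 'name' and 'value' from submit inputs."""
    for button in buttons:
        label = (button.get("value") or button.get("name") or "").lower()
        if any(word in label for word in LOGIN_WORDS):
            return button
    return buttons[0] if buttons else None  # fall back to the first button

def build_post_data(fields, buttons, username, password):
    data = dict(fields)            # hidden fields, __VIEWSTATE, etc.
    data["username"] = username    # assumed field names, for illustration
    data["password"] = password
    chosen = pick_submit_button(buttons)
    if chosen:
        data[chosen["name"]] = chosen["value"]  # only the chosen button
    return data

buttons = [{"name": "btnSearch", "value": "Search"},
           {"name": "btnLogin", "value": "Log In"}]
print(pick_submit_button(buttons)["name"])  # btnLogin
```

Submitting `btnSearch` along with the credentials would run a search instead of a login, which is exactly the failure mode described above.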
Sure, a user could manually coax the scanner through any of these login processes, but that places an unnecessary burden on the user’s time. This is less of a problem when dealing with a single web site, but becomes overwhelming when trying to scan a dozen or even hundreds of web applications.
These types of logins also highlight the difficulty scanners have with understanding the logic of a web page, let alone the logic of a group of pages or some workflow within the site.
It’s still possible to automate the login process for these forms, but doing so requires customization at the expense of a generic authentication mechanism. In the end, dealing with login pages often provides insight into the madness of HTML editing (it’s hard to call some of these methods programming) and the bizarre steps developers will take just to “make it work.”
Scanners should automate the crawl and test phases as much as possible. After all, it’s dangerous to tie too much of a scan’s effectiveness to the user’s knowledge of web security. It may not be every day that a web developer answers your question about the robots.txt file with, “I don’t know what that is,” but it’s a good idea to have a scanner that will be comprehensive and accurate regardless of whether the user knows the UTF-7 encoding for an angle bracket or wonders why web sites don’t just strip the alert function to prevent XSS attacks.