The security pitfalls of the .NET ViewState object have been well-known since its introduction in 2002. The worst mistake is for a developer to treat the object as a black box that will be controlled by the web server and opaque to the end user. Before diving into ViewState security problems we need to explore its internals. This article digs into more technical language1 than others on this site and focuses on reverse engineering the ViewState. Subsequent articles will cover security. To invoke Bette Davis: “Fasten your seat belts. It’s going to be a bumpy night.”2
The ViewState enables developers to capture transient values of a page, form, or server variables within a hidden form field. The ability to track the “state of the view” (think model-view-controller) within a web page alleviates burdensome server-side state management for situations like re-populating fields during multi-step form submissions, or catching simple form entry errors before the server must get involved in their processing. (MSDN has several articles that explain this in more detail.)
This serialization of a page’s state involves objects like numbers, strings, arrays, and controls. These “objects” are not just conceptual. The serialization process encodes .NET objects (in the programming sense) into a sequence of bytes in order to take it out of the server’s memory, transfer it inside the web page, and reconstitute it when the browser submits the form.
Our venture into the belly of the ViewState starts with a blackbox perspective that doesn’t rely on any prior knowledge of the serialization process or content. The exploration doesn’t have to begin this way. You could write .NET introspection code or dive into ViewState-related areas of the Mono project for hints on unwrapping this object. I merely chose this approach as an intellectual challenge because the technique can be generalized to analyzing any unknown binary content.
The first step is trivial and obvious: decode from Base64. As we’re about to see, the ViewState contains bytes values forbidden from touching the network via an HTTP request. The data must be encoded with Base64 to ensure survival during the round trip from server to browser. If a command-line
pydoc base64 or
perldoc MIME::Base64 doesn’t help you get started, a simple web search will turn up several ways to decode from Base64. Here’s the beginning of an encoded ViewState:
Now we’ll break out the
xxd command to examine the decoded ViewState. One of the easiest steps in reverse engineering is to look for strings because our brains evolved to pick out important words like “donut”, “Password”, and “zombies!” quickly. The following line shows the first 16 bytes that
xxd produces from the previous example. To the right of the bytes
xxd has written matching ASCII characters for printable values – in this case the string 720702894.
0000000: ff01 0f0f 0509 3732 3037 3032 3839 340f ......720702894.
Strings have a little more complexity than this example conveys. In an English-centric world words are nicely grouped into arrays of ASCII characters. This means that a programming language like C treats strings as a sequence of bytes followed by a NULL. In this way a program can figure out that the bytes
0x627261696e7300 represent a six-letter word by starting at the string’s declared beginning and stopping at the first NULL (0x00). I’m going to do some hand-waving about the nuances of characters, code points, character encodings and their affect on “strings” as I’ve just described. For the purpose of investigating ViewState we only need to know that strings are not (or are very rarely) NULL-terminated.
Take another look at the decoded example sequence. I’ve highlighted the bytes that correspond to our target string. As you can see, the byte following 720702894 is 0x0f – not a NULL. Plus, 0x0f appears twice before the string starts, which implies it has some other meaning:
ff01 0f0f 0509 **3732 3037 3032 3839 34**0f ......**720702894**.
The lack of a common terminator indicates that the ViewState serializer employs some other hint to distinguish a string from a number or other type of data. The most common device in data structures or protocols like this is a length delimiter. If we examine the byte before our visually detected string, we’ll see a value that coincidentally matches its length. Count the characters in 720702894.
ff01 0f0f 05**09** 3732 3037 3032 3839 340f ......720702894.
Congratulations to anyone who immediately wondered if ViewState strings are limited to 255 characters (the maximum value of a byte). ViewState numbers are a trickier beast to handle. It’s important to figure these out now because we’ll need to apply them to other containers like arrays.3 Here’s an example of numbers and their corresponding ViewState serialization. We need to examine them on the bit level to deduce the encoding scheme.
Decimal Hex Binary
1 01 00000001
9 09 00001001
128 8001 10000000 00000001
655321 09ffd9 11011001 11111111 0100111
The important hint is the transition from values below 128 to those above. Seven bits of each byte are used for the number. The high bit tells the parser, “Include the next byte as part of this numeric value.”
Here’s the same number with the unused “high” bit removed and reordered with the most significant bits first.
MSB … LSB
0101011 0000110 (5510, 0x1586)
Now that we’ve figured out how to pick out strings and their length it’s time to start looking for ways to identify different objects. Since we have strings on the mind, let’s walk back along the ViewState to the byte before the length field. We see 0x05.
ff01 0f0f **05**09 3732 3037 3032 3839 340f ......720702894.
That’s the first clue that 0x05 identifies a string. We confirm this by examining other suspected strings and walking the ViewState until we find a length byte (or bytes) preceded by the expected identifier. There’s a resounding correlation until we find a series of strings back-to-back that lack the 0x05 identifier. Suddenly, we’re faced with an unknown container. Oh dear. The length field for the three strings has been highlighted:
0000000: 1503 **07**74 6f70 5f6e 6176 **3f**68 7474 703a ...top_nav?http: 0000010: 2f2f 7777 772e 5f5f 5f5f 5f5f 5f2e 636f //www._______.co 0000020: 6d2f 4162 6f75 7455 732f 436f 6e74 6163 m/AboutUs/Contac 0000030: 7455 732f 7461 6269 642f 3634 392f 4465 tUs/tabid/649/De 0000040: 6661 756c 742e 6173 7078 **0a**43 6f6e 7461 fault.aspx.Conta 0000050: 6374 2055 73 ct Us
Moving to the first string in this list we see that the preceding byte, 0x03, is a number that luckily matches the amount of strings in our new, unknown object. We peek at the byte before the number and see 0x15. We’ll call this the identifier for a String Array.
At this point the reverse engineering process is easier if we switch from a completely black box approach to one that references MSDN documentation and output from other tools.
Two of the most common objects inside a ViewState are Pairs and Triplets. As the name implies, these containers (also called tuples) have two or three members. There’s a catch here, though: They may have empty members. Recall the analysis of numbers. We wondered how upper boundaries (values greater than 255) might be handled, but we didn’t consider the lower bound. How might empty containers be handled? Do they have a length of zero (0x00)? Without diverging too far off course, I’ll provide the hint that NULL strings are 0x658 and the number zero (0) is 0x66.
The root object of a ViewState is either a Pair or Triplet. Thus, it’s easy to inspect different samples in order to figure out that 0x0f identifies a Pair and 0x10 a Triple. Now we can descend the members to look for other kinds of objects.
A Pair has two members. This also implies that it doesn’t need a size identifier since there’s no point in encoding “2” for a container that is designed to hold two members. (Likewise “3” for Triplets.) Now examine the ViewState using a recursive descent parser. This basically means that we encounter a byte, update the parsing context based on what the byte signifies, then consume the next byte based on the current context. In practice, this means a sequence of bytes like the following example demonstrates nested Pairs:
0000000: ff01 0f0f 0509 3732 3037 3032 3839 340f ......720702894. 0000010: 6416 0666 0f16 021e 0454 6578 7405 793c d..f.....Text.y<
- Member 1: Pair
- Member 1: String
- Member 2: Pair
- Member 1: ArrayList (0x16) of 6 elements
- Member 2: Empty
- Member 1: ArrayList (0x16) of 6 elements
- Member 1: String
- Member 2: Empty
Don’t worry if you finish parsing with 16 or 20 leftover bytes. These correspond to the MD5 or SHA1 hash of the contents. In short, this hash prevents tampering of ViewState data. Recall that the ViewState travels back and forth between the client and server. There are many reasons why the server wants to ensure the integrity of the ViewState data. We’ll explore integrity (hashing), confidentiality (encryption), and other security issues in a future article.
I haven’t hit every possible control object that might sneak into a ViewState. You can find a NULL-terminated string. You can find RGBA color definitions. And a lot more.
More technical, but not rigorously so. Given the desire for brevity, some programming terms like objects, controls, NULL, strings, and numbers (integers signed or unsigned) are thrown about rather casually. ↩
All About Eve. http://www.imdb.com/title/tt0042192/ (Then treat yourself to Little Foxes and Whatever Happened to Baby Jane?) ↩
I say “other containers” because strings can simply be considered a container of bytes, albeit bytes with a particular meaning and restrictions. ↩