The security pitfalls of the .NET ViewState object have been well-known since its introduction in 2002. The worst mistake is for a developer to treat the object as a black box that will be controlled by the web server and opaque to the end user. Before diving into ViewState security problems we need to explore its internals. This article digs into more technical language1 than others on this site and focuses on reverse engineering the ViewState. Subsequent articles will cover security. To invoke Bette Davis: “Fasten your seat belts. It’s going to be a bumpy night.”2

The ViewState enables developers to capture transient values of a page, form, or server variables within a hidden form field. The ability to track the “state of the view” (think model-view-controller) within a web page alleviates burdensome server-side state management for situations like re-populating fields during multi-step form submissions, or catching simple form entry errors before the server must get involved in their processing. (MSDN has several articles that explain this in more detail.)

This serialization of a page’s state involves objects like numbers, strings, arrays, and controls. These “objects” are not just conceptual. The serialization process encodes .NET objects (in the programming sense) into a sequence of bytes in order to take it out of the server’s memory, transfer it inside the web page, and reconstitute it when the browser submits the form.

Our venture into the belly of the ViewState starts with a blackbox perspective that doesn’t rely on any prior knowledge of the serialization process or content. The exploration doesn’t have to begin this way. You could write .NET introspection code or dive into ViewState-related areas of the Mono project for hints on unwrapping this object. I merely chose this approach as an intellectual challenge because the technique can be generalized to analyzing any unknown binary content.

The first step is trivial and obvious: decode from Base64. As we’re about to see, the ViewState contains bytes values forbidden from touching the network via an HTTP request. The data must be encoded with Base64 to ensure survival during the round trip from server to browser. If a command-line pydoc base64 or perldoc MIME::Base64 doesn’t help you get started, a simple web search will turn up several ways to decode from Base64. Here’s the beginning of an encoded ViewState:


Now we’ll break out the xxd command to examine the decoded ViewState. One of the easiest steps in reverse engineering is to look for strings because our brains evolved to pick out important words like “donut”, “Password”, and “zombies!” quickly. The following line shows the first 16 bytes that xxd produces from the previous example. To the right of the bytes xxd has written matching ASCII characters for printable values – in this case the string 720702894.

0000000: ff01 0f0f 0509 3732 3037 3032 3839 340f ......720702894.

Strings have a little more complexity than this example conveys. In an English-centric world words are nicely grouped into arrays of ASCII characters. This means that a programming language like C treats strings as a sequence of bytes followed by a NULL. In this way a program can figure out that the bytes 0x627261696e7300 represent a six-letter word by starting at the string’s declared beginning and stopping at the first NULL (0x00). I’m going to do some hand-waving about the nuances of characters, code points, character encodings and their affect on “strings” as I’ve just described. For the purpose of investigating ViewState we only need to know that strings are not (or are very rarely) NULL-terminated.

Take another look at the decoded example sequence. I’ve highlighted the bytes that correspond to our target string. As you can see, the byte following 720702894 is 0x0f – not a NULL. Plus, 0x0f appears twice before the string starts, which implies it has some other meaning:

ff01 0f0f 0509 **3732 3037 3032 3839 340f ......720702894.

The lack of a common terminator indicates that the ViewState serializer employs some other hint to distinguish a string from a number or other type of data. The most common device in data structures or protocols like this is a length delimiter. If we examine the byte before our visually detected string, we’ll see a value that coincidentally matches its length. Count the characters in 720702894.

ff01 0f0f 0509 3732 3037 3032 3839 340f ......720702894.

Congratulations to anyone who immediately wondered if ViewState strings are limited to 255 characters (the maximum value of a byte). ViewState numbers are a trickier beast to handle. It’s important to figure these out now because we’ll need to apply them to other containers like arrays.3 Here’s an example of numbers and their corresponding ViewState serialization. We need to examine them on the bit level to deduce the encoding scheme.

Decimal   Hex     Binary  
1         01      00000001  
9         09      00001001  
128       8001    10000000 00000001  
655321    09ffd9  11011001 11111111 0100111

The important hint is the transition from values below 128 to those above. Seven bits of each byte are used for the number. The high bit tells the parser, “Include the next byte as part of this numeric value.”

LSB           MSB  
10000110 00101011

Here’s the same number with the unused “high” bit removed and reordered with the most significant bits first.

MSB   ...   LSB  
0101011 0000110 (5510, 0x1586)

Now that we’ve figured out how to pick out strings and their length it’s time to start looking for ways to identify different objects. Since we have strings on the mind, let’s walk back along the ViewState to the byte before the length field. We see 0x05.

ff01 0f0f 0509 3732 3037 3032 3839 340f ......720702894.

That’s the first clue that 0x05 identifies a string. We confirm this by examining other suspected strings and walking the ViewState until we find a length byte (or bytes) preceded by the expected identifier. There’s a resounding correlation until we find a series of strings back-to-back that lack the 0x05 identifier. Suddenly, we’re faced with an unknown container. Oh dear. Look for the length field for the three strings:

0000000: 1503 0774 6f70 5f6e 6176 3f68 7474 703a ...top_nav?http:
0000010: 2f2f 7777 772e 5f5f 5f5f 5f5f 5f2e 636f //
0000020: 6d2f 4162 6f75 7455 732f 436f 6e74 6163 m/AboutUs/Contac
0000030: 7455 732f 7461 6269 642f 3634 392f 4465 tUs/tabid/649/De
0000040: 6661 756c 742e 6173 7078 0a43 6f6e 7461 fault.aspx.Conta
0000050: 6374 2055 73 ct Us

Moving to the first string in this list we see that the preceding byte, 0x03, is a number that luckily matches the amount of strings in our new, unknown object. We peek at the byte before the number and see 0x15. We’ll call this the identifier for a String Array.

At this point the reverse engineering process is easier if we switch from a completely black box approach to one that references MSDN documentation and output from other tools.

Two of the most common objects inside a ViewState are Pairs and Triplets. As the name implies, these containers (also called tuples) have two or three members. There’s a catch here, though: They may have empty members. Recall the analysis of numbers. We wondered how upper boundaries (values greater than 255) might be handled, but we didn’t consider the lower bound. How might empty containers be handled? Do they have a length of zero (0x00)? Without diverging too far off course, I’ll provide the hint that NULL strings are 0x658 and the number zero (0) is 0x66.

The root object of a ViewState is either a Pair or Triplet. Thus, it’s easy to inspect different samples in order to figure out that 0x0f identifies a Pair and 0x10 a Triple. Now we can descend the members to look for other kinds of objects.

A Pair has two members. This also implies that it doesn’t need a size identifier since there’s no point in encoding “2” for a container that is designed to hold two members. (Likewise “3” for Triplets.) Now examine the ViewState using a recursive descent parser. This basically means that we encounter a byte, update the parsing context based on what the byte signifies, then consume the next byte based on the current context. In practice, this means a sequence of bytes like the following example demonstrates nested Pairs:

0000000: ff01 0f0f 0509 3732 3037 3032 3839 340f ......720702894.
0000010: 6416 0666 0f16 021e 0454 6578 7405 793c d..f.....Text.y<
 - Member 1: Pair
    - Member 1: String
    - Member 2: Pair
       - Member 1: ArrayList (0x16) of 6 elements
           Number 0
       - Member 2: Empty
 - Member 2: Empty

Don’t worry if you finish parsing with 16 or 20 leftover bytes. These correspond to the MD5 or SHA1 hash of the contents. In short, this hash prevents tampering of ViewState data. Recall that the ViewState travels back and forth between the client and server. There are many reasons why the server wants to ensure the integrity of the ViewState data. We’ll explore integrity (hashing), confidentiality (encryption), and other security issues in a future article.

I haven’t hit every possible control object that might sneak into a ViewState. You can find a NULL-terminated string. You can find RGBA color definitions. And a lot more.

This was a brief introduction to the ViewState. It’s necessary to understand its basic structure and content before we dive into its security implications. In the next part of this series I’ll expand the analysis to more objects while showing how to use the powerful parsing available from the Boost.Spirit C++ library. We could even dive into JavaScript parsing for those who don’t want to leave the confines of the browser. After that, we’ll look at the security problems due to unexpected ViewState manipulation and the countermeasures for web apps to deploy. In the mean time, I’ll answer questions that pop up in the comments.

  1. More technical, but not rigorously so. Given the desire for brevity, some programming terms like objects, controls, NULL, strings, and numbers (integers signed or unsigned) are thrown about rather casually. 

  2. All About Eve. (Then treat yourself to Little Foxes and Whatever Happened to Baby Jane?

  3. I say “other containers” because strings can simply be considered a container of bytes, albeit bytes with a particular meaning and restrictions.