

PDB Type Theft

2015-10-12 10:45

In August of this year, Microsoft published an update to NTDLL and, along with it, released updated symbols for debugging. These symbols are available as PDBs (program databases). Unfortunately, the type information in the released symbols is missing standard structures and enumerations. As a result, debugging applications on Windows became a far more involved task. Microsoft is aware of the issue but has yet to release corrected PDBs.

While they are working on it, I found myself wondering if I could avoid their involvement altogether. Barring any changes to the structures and enumerations, the information from previous versions of the PDBs should still be valid. As such, if I could copy the type information from a previous PDB and inject it into the current PDB, I'd theoretically be able to have everything I expect from a working build process.


So that is what I did.


I initially started with pdbparse, an open-source Python module for parsing Microsoft PDB files. Another tool I considered is pdbparser, which is written in C/C++ and seems a bit more complete. Both have their advantages, but neither quite fit my needs. I ended up implementing my own version in an attempt to make the code a little cleaner for both reading and writing PDBs.


Fortunately, the PDB file format is simple enough that I was able to do this with relative ease. A PDB is broken up into pages of data, where the size of a page is specified within the header. As one can imagine, the header is in the first page of the file, and it describes the location of the pages for the Root stream. The Root stream then specifies the number of streams, the sizes of the respective streams, and the pages the streams occupy. There are a couple of hard-coded indexes within the streams. For example, the first stream described within the Root stream is always a copy of the Root stream. The stream we are primarily interested in is the third stream, which describes type information (the TPI stream). The header for the type information stream references another stream named TpiHash. I do not yet understand the values it contains, only that they are required for the type information to be usable.
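
To make the layout above concrete, here is a minimal sketch of reading the header and the Root stream. It is not taken from pdb_type_theft.py. The field order in the header, the 32-byte magic string, and the 0xFFFFFFFF marker for a nonexistent stream are assumptions based on publicly available descriptions of the PDB7 (MSF 7.00) container, and all function names are hypothetical.

```python
import struct

# Assumed 32-byte signature of a PDB7 (MSF 7.00) file.
MSF7_MAGIC = b"Microsoft C/C++ MSF 7.00\r\n\x1aDS\x00\x00\x00"
NIL_STREAM = 0xFFFFFFFF  # assumed size marker for a stream that does not exist


def parse_msf_header(data):
    """Return (page_size, num_pages, root_size, root_map_page) from page 0."""
    if data[:32] != MSF7_MAGIC:
        raise ValueError("not an MSF 7.00 (PDB7) file")
    # Six little-endian 32-bit fields follow the signature (assumed order).
    page_size, _free_map, num_pages, root_size, _reserved, root_map_page = \
        struct.unpack_from("<6I", data, 32)
    return page_size, num_pages, root_size, root_map_page


def pages_needed(size, page_size):
    """Number of pages a stream of `size` bytes occupies."""
    if size == NIL_STREAM:
        return 0
    return (size + page_size - 1) // page_size


def parse_root_stream(root, page_size):
    """Return a list of (size, [page numbers]) tuples, one per stream."""
    (num_streams,) = struct.unpack_from("<I", root, 0)
    sizes = struct.unpack_from("<%dI" % num_streams, root, 4)
    offset = 4 + 4 * num_streams
    streams = []
    for size in sizes:
        count = pages_needed(size, page_size)
        pages = list(struct.unpack_from("<%dI" % count, root, offset))
        offset += 4 * count
        streams.append((0 if size == NIL_STREAM else size, pages))
    return streams


def read_stream(data, pages, size, page_size):
    """Concatenate a stream's pages from the file image and trim to size."""
    raw = b"".join(data[p * page_size:(p + 1) * page_size] for p in pages)
    return raw[:size]
```

With these helpers, the type information stream would be the third entry (index 2) of the list returned by `parse_root_stream`, and `read_stream` would reassemble its bytes from the pages it occupies.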


Most of the code within the script is responsible for parsing the structures as part of reading a PDB and for preparing structures as part of writing a PDB. The code to handle copying the type information is pretty straightforward and ultimately boils down to the following:

  • Open the source PDB, which contains the type information to copy.
  • Open the destination PDB, which is to receive the type information.
  • Replace the type information (TPI) stream in the destination PDB with the one from the source PDB.
  • Find the TpiHash stream in the destination PDB and overwrite it with the TpiHash stream from the source PDB.
  • Update the TPI stream to point to the TpiHash stream.
  • Write out the updated PDB.
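
The steps above can be sketched on an in-memory model in which each PDB has already been parsed into a list of stream contents. This is an illustration, not the code from pdb_type_theft.py: the function names are hypothetical, and the location of the hash stream index (a 16-bit value at byte offset 20 of the TPI header) is an assumption based on public descriptions of the TPI stream. Reading and writing the files themselves are left out.

```python
import struct

TPI_STREAM = 2          # the third stream holds the type information
HASH_INDEX_OFFSET = 20  # assumed offset of the 16-bit hash stream index
NO_STREAM = 0xFFFF      # assumed marker meaning "no hash stream"


def hash_stream_index(tpi):
    """Read the index of the TpiHash stream from a TPI stream header."""
    return struct.unpack_from("<H", tpi, HASH_INDEX_OFFSET)[0]


def steal_types(src_streams, dst_streams):
    """Return a copy of dst_streams carrying the type info of src_streams.

    Both arguments are lists of bytes objects, one entry per stream.
    """
    src_tpi = src_streams[TPI_STREAM]
    src_hash_idx = hash_stream_index(src_tpi)
    dst_hash_idx = hash_stream_index(dst_streams[TPI_STREAM])
    if NO_STREAM in (src_hash_idx, dst_hash_idx):
        raise ValueError("one of the PDBs has no TpiHash stream")

    out = list(dst_streams)
    # Replace the destination's TPI stream with the source's, but keep the
    # header pointing at the destination's own TpiHash slot.
    new_tpi = bytearray(src_tpi)
    struct.pack_into("<H", new_tpi, HASH_INDEX_OFFSET, dst_hash_idx)
    out[TPI_STREAM] = bytes(new_tpi)
    # Overwrite the destination's TpiHash stream with the source's.
    out[dst_hash_idx] = src_streams[src_hash_idx]
    return out
```

Reusing the destination's existing TpiHash slot means no new stream has to be added to the Root stream; only the sizes and page lists of the two replaced streams would change when the file is written back out.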

You can find the script itself, dubbed pdb_type_theft.py, here. Usage is straightforward -- just call the script with the first argument as the PDB to copy types from and the second argument as the PDB to dump types into.


This script requires a PDB that already contains the type information you want to copy into another PDB. If you are not in the habit of snapshotting your VMs after every update, the following links may be helpful -- just make sure to set your user-agent to 'Microsoft-Symbol-Server/6.6.0007.5' when downloading them. Note that the downloaded files will need to be extracted with a tool that can handle .cab files. Windows 7 can extract them if you rename the file to end with a .cab extension.
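
As a rough illustration of the user-agent requirement, the sketch below builds a download request with Python's standard urllib module. The helper names are hypothetical, and the URL is whatever symbol link you intend to fetch; none is assumed here.

```python
import urllib.request

# User-agent string given in the text above.
SYMBOL_SERVER_UA = "Microsoft-Symbol-Server/6.6.0007.5"


def build_symbol_request(url):
    """Create a request that identifies itself as the symbol server client."""
    return urllib.request.Request(url, headers={"User-Agent": SYMBOL_SERVER_UA})


def download_symbol_file(url, dest_path):
    """Fetch `url` with the required user-agent and save it to `dest_path`."""
    with urllib.request.urlopen(build_symbol_request(url)) as response:
        with open(dest_path, "wb") as out:
            out.write(response.read())
```

Saving the result with a .cab extension as `dest_path` would leave the file ready for extraction as described above.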


Windows 7 32-bit:




Windows 7 64-bit:





Here is the current (incomplete) module information:


Figure 1 - NTDLL Module Information


Now we’ll try to run !gflag, which references the _PEB structure:


Figure 2 - Output from !gflag


Next we run the script to copy the type information stream from the older PDB:


Figure 3 - Script execution


Here is a comparison of the patched file versus original file:


Figure 4 - File comparison


Then after we rename it and reload symbols:


Figure 5 - Successful load of symbols




The script was also tested against the symbols for ntoskrnl.exe, since they have the exact same issue. One caveat with the script is that it does not support the older PDB2 format. It may also have trouble with different versions of the TPI stream. Modifying the script to support those versions would likely be doable, but I focused solely on fixing the issue with the Microsoft symbols. Hopefully this helps make debugging on Windows 7 a little easier until Microsoft manages to officially fix the symbols.


Jasiel “WanderingGlitch” Spelman
