devilution icon indicating copy to clipboard operation
devilution copied to clipboard

RFC: Verifying the accuracy of decompiled code

Open nomdenom opened this issue 7 years ago • 24 comments

One of the issues with the decompiled code (besides the fact that it looks like an IOCCC entry) is that sometimes the decompilation is not accurate.

The current approaches to verification are roughly these I think:

  • Stare at it & the disassembly for a long time, tweak it and run it until it looks like the original (I guess the most common so far).
    • (side question: is anyone using a debugger on the original code to discover how it works?)
  • Tweak it iteratively until the output of VC6 starts to resemble the original assembly (eg. @seritools' PRs)

I have been thinking about more "automated" approaches to this, here's dump of my thoughts on how to do it:

Proving the new assembly matched the original assembly

Could theoretically use a automated theorem prover / model checker eg. Z3 to prove two pieces of assembly equivalent. Requires translating x86 assembly to something a theorem prover understands (this has been done). Might not work for more complicated code. Modeling memory reads/writes is still an open research question.

Plausibility: Low, unless you are looking to get a PhD in this field. Nice to dream about.

Testing the new code and original code

"Golden" "unit" tests

Write some "unit tests" on parts of the code that are "pure" (do not interact with the outside world), eg. dungeon generation or rendering. (the unit tests would need some "partial initialization" procedures to set up the state of the necessary global variables)

Steps:

  • Execute these tests on both the original and new binary.
  • Dump the "results".
  • Use the original's result as the "golden" reference for comparison.

The input parameters (seeds etc.) could be randomly generated, similar to QuickCheck. Combined with code coverage checking, this could achieve a pretty high level of confidence in the equivalence of the code.

This needs the ability to call into the original binary and being able to inspect its global variables. This could be done by:

  • Modifying the original EXE to be a DLL and generating a corresponding import library & headers. (bonus point: allows using a debugger to step through the original code)
  • Using a custom PE loader to just load it into the address space and calling it directly.

(note: the new EXE needs a different base virtual address since the original EXE would occupy the default 0x00400000 address)

Run-time equivalence checking

This is a bit more advanced:

  • For the same "pure" parts of the code, execute both in parallel.
  • Verify that the state of the world is equal after "leaving" the pure parts.

What's the state of the world? Well, roughly any memory the code can touch, meaning global variables and any heap-allocated blocks.

Matching up / aligning two different "worlds" for comparison might be tricky, so ideally it would execute two different pieces of code on the same global state.

To do make the old and new binary operate on the same state, there are two symmetric options:

  1. Make the new code's globals be located at the same addresses as the original code's. Problems:
    • The decompiled globals do not have the original addresses attached to it in most cases (solvable)
    • The compiler places globals however it desires. This can be worked around though, eg. GCC supports "linker scripts" to manually lay out sections at certain addresses, or we could just directly generate eg. a "stub" assembly file that lays out the global symbols at the original offsets.
  2. Make the old code use the new code's offsets. Approaches:
    1. Can dump assembly out of IDA and re-assemble / re-link it with the new code.
    2. Run the code using an x86 emulator library and redirect memory accesses to the right place. (bonus: allows to safely run any code, since it can just stop simulation if it tries to access "off-limits" memory or call something that would have a side effect).

There's also the "minor" problem of undoing the memory writes of eg. the old code after it has executed:

  • On Unix, can just fork(). Unfortunately, this does not run on Unix currently.
  • Copy all the memory to a backup location, then copy it back.
  • If using the emulator, can just log all the writes and undo then.

Summary

As you can see, all of these approaches require a 1:1 mapping between old and new procedures & global variables.

I think an interesting approach to take would be to add a "pseudo-annotation" to the current headers with the address of it in the original code (that can then be extracted by automated tools).

Something like this:

annotations.h:

#define ADDR(x) // dummy, only used by tools

example.h:

extern int pSomeVar ADDR(0x123456);
void __cdecl SomeFunc(int a1) ADDR(0x123456);

This would also allow automatically generating the IDA scripts that are in the scalpel repository and make sure they do not get out-dated as we eg. add names to more variables.

nomdenom avatar Aug 20 '18 23:08 nomdenom

Write some "unit tests" on parts of the code that are "pure" (do not interact with the outside world), eg. dungeon generation or rendering. (the unit tests would need some "partial initialization" procedures to set up the state of the necessary global variables)

This is the approach taken by https://github.com/sanctuary/djavul. For the dungeon generation algorithms, see https://github.com/sanctuary/djavul/tree/master/cmd/l1. Example test case: https://github.com/sanctuary/djavul/blob/master/d1/l1/l1_test.go#L41

This needs the ability to call into the original binary and being able to inspect its global variables. This could be done by:

Modifying the original EXE to be a DLL and generating a corresponding import library & headers. (bonus point: allows using a debugger to step through the original code) Using a custom PE loader to just load it into the address space and calling it directly.

The approach taken by Djavul is to disassemble the original binary into a a set of assembly files, which when assemled with nasm produces a SHA1 identical binary. Then the import section is updated to include a new DLL, so that basically the original Diablo game invokes new functions of the DLL which we can write in any language, e.g. C or Rust (or Go as used in Djavul). Following this approach, one concept at the time is extracted and thoroughly tested until proven identical. For instance, @7i and I have tried running through the entire seed space (2^32) for dungeon generation and it is doable. Took about two weeks on @7i's gaming box.

As you can see, all of these approaches require a 1:1 mapping between old and new procedures & global variables.

This work is tracked by https://github.com/diasurgical/scalpel and https://github.com/sanctuary/notes

mewmew avatar Aug 21 '18 06:08 mewmew

For the morbidly curious, a thought dump for the design of Djavul is present at https://gitter.im/sanctuary/notes?at=5abc40bde3d0b1ff2c7b859c, https://github.com/sanctuary/djavul/blob/master/DESIGN.md and https://github.com/sanctuary/djavul/#djavul. Perhaps some ideas may be of use and transferable to Devilution .

mewmew avatar Aug 21 '18 06:08 mewmew

is that sometimes the decompilation is not accurate.

It really depends on your definition of accurate. If Hex-Rays and the compiler used to recompile were assumed to be bug-free, the resulting binary should be accurate in a sense that the code, in the end, would be equivalent, even though they aren't the same.

On the other hand, what I'm trying to achieve in those PRs is basically source code accuracy: Recreating the source to be as close to the original code as needed to make the original compiler used generate the same binary output. That of course is a long and tedious process since it goes a step further than logical verification.

But yeah, a verification/testing process would be awesome for catching remaining bugs in the code and preventing adding new bugs/inaccuracies in the cleanup process 👍

seritools avatar Aug 21 '18 12:08 seritools

@nomdenom wrote:

Requires translating x86 assembly to something a theorem prover understands (this has been done). Might not work for more complicated code. Modeling memory reads/writes is still an open research question.

Plausibility: Low, unless you are looking to get a PhD in this field. Nice to dream about.

That was intense, wow! It's interesting, thanks.

sskras avatar Aug 21 '18 17:08 sskras

@mewmew Cool, that looks pretty much what I had in mind for the unit tests, perhaps a bit higher-level due to using Go. For Devilution it might make more sense to transplant these tests into C/C++ (since the "new" side of the code is already in C and easily callable that way).

Was the tool for converting the EXE to assembly this one? https://github.com/sanctuary/exp/tree/master/cmd/foo

nomdenom avatar Aug 21 '18 18:08 nomdenom

@seritools Yes, I was oversimplifying a bit. The decompilation process by itself is probably quite accurate in the sense that it does not change the semantics of the code, but has various problems eg.:

  • Using the wrong global variables (neighboring ones) due to the compiler optimizing array access and generating eg. out-of-bounds offsets.
  • The code does not strictly conform to the C standard due to the ridiculous amount of casts added (I am pretty sure some of these are a violation of the infamous "strict pointer aliasing" rules and could cause optimized code to be incorrect).

Once you start fixing these issues by manually modifying the code, it is pretty easy to introduce issues. That is what I am really trying to get at - being safely able to refactor the code while having confidence in its accuracy, as you mentioned.

nomdenom avatar Aug 21 '18 18:08 nomdenom

I am pretty sure some of these are a violation of the infamous "strict aliasing" rules and could cause optimized code to be incorrect

Oh yeah, definitely.

BTW: From what I've gathered looking through the VC6 MSDN Docs, the optimizer option "Assume no aliasing" and "Assume aliasing only across function calls" are both disabled by default, and not enabled with any of the "default" optimization levels.

seritools avatar Aug 21 '18 18:08 seritools

Was the tool for converting the EXE to assembly this one? https://github.com/sanctuary/exp/tree/master/cmd/foo

The tool is called bin2asm, the source code of which is located at https://github.com/decomp/exp/tree/master/cmd/bin2asm

To install, simply run go get github.com/decomp/exp/cmd/bin2asm.

And, to use, do as follows (the sanctuary/exp repo contains json files which specify the basic blocks and functions used in diablo, to improve the output of the disassembly):

[u@x61s ~]$ export GOPATH=/tmp/go
[u@x61s ~]$ export PATH=$GOPATH/bin:$PATH
[u@x61s ~]$ go get github.com/decomp/exp/cmd/bin2asm
[u@x61s ~]$ cd /tmp
[u@x61s tmp]$ git clone https://github.com/sanctuary/exp
[u@x61s tmp]$ exp/cmd/diabolic
[u@x61s diabolic]$ cp ~/Desktop/diablo/idb/diablo-v1.09b.exe/diablo-v1.09b.exe .
[u@x61s diabolic]$ bin2asm -last 0x46B682 diablo-v1.09b.exe 
bin2asm: creating "_dump_/main.asm"
bin2asm: creating "_dump_/common.inc"
bin2asm: creating "_dump_/pe-hdr.asm"
bin2asm: creating "_dump_/_text.asm"
bin2asm: creating "_dump_/_rdata.asm"
bin2asm: creating "_dump_/_data.asm"
bin2asm: creating "_dump_/_rsrc.asm"
[u@x61s diabolic]$ cd _dump_/
[u@x61s _dump_]$ sha1sum d1.exe ../diablo-v1.09b.exe 
ebaee2acb462a0ae9c895a0e33079c94796cb0b6  d1.exe
ebaee2acb462a0ae9c895a0e33079c94796cb0b6  ../diablo-v1.09b.exe

Edit: The .idata section is still a raw hex dump in bin2asm. The dump_imports tool (https://github.com/decomp/exp/tree/master/cmd/dump_imports) is used for now to recreate the import data section. When the mood strikes, the dump_imports tool will be incorporated into the bin2asm tool; at which point dump_imports will be deprecated.

mewmew avatar Aug 22 '18 01:08 mewmew

I like the idea of adding unit test to this project. I was experimenting with googletest but am having issues isolating the code. Maybe this is not possible with the way the project is structured or makes it much more difficult (or just beyond my capabilities). Any feedback or suggestions would greatly be appreciated.

https://github.com/ttdonovan/devil-nightly/compare/master...wip-googletest

ttdonovan avatar Aug 23 '18 18:08 ttdonovan

@ttdonovan You probably want to "stub out" all the non-essential Windows APIs (as calling them from unit tests probably won't do any good), or provide alternative "pure" implementations of such functions. I have something like that (a Windows API stub) as an experiment (this allows compiling everything as eg. a native Linux executable), I'll comment once it's a bit more ready. For most functions, they require the initialization of global variables (which then might require loading things from the game's MPQ), so that needs to be dealt with somehow as well.

nomdenom avatar Aug 23 '18 21:08 nomdenom

For most functions, they require the initialization of global variables (which then might require loading things from the game's MPQ), so that needs to be dealt with somehow as well.

My suggestion would be to have Devilution look for a directory named diabdat in the game folder, and if present then load files from there directly. Then we could have a preprocess step for the game to extract game assets (e.g. MPQ Editor), and for the future also to convert game assets to open source formats :)

This is the goal of https://github.com/sanctuary/mpq and https://github.com/sanctuary/formats, both of which can be used to dump all CEL and CL2 files to PNG images.

mewmew avatar Aug 23 '18 22:08 mewmew

Has anyone verified the CreateItem random code is accurate? Sorry, searching is not giving me anything.

I don't think the decompiled code is correct for item creation. I've dusted off a really old Diablo 1 trainer suite that I made, and tried adding in the only thing it was missing: the item seed and recreating items! I was excited to see it should be pretty simple after viewing this code here. However, even doing a simple test is failing at step 1 so the code is inaccurate or I missed something?

My test consists of taking an item seed from a legit item (some magic leather armor I found). Run the RecreateItem with that given seed. First thing that uses the seed is setting a random AC in GetItemAttrs, and this is wrong. Most importantly are the prefix/suffix randomization and these are far off :(

I've been running numerous iterations and can't get it to line up so I'm thinking something is amiss. Anyone know if this has been validated before?

Meth962 avatar Jul 05 '19 23:07 Meth962

@Meth962 only one function isn't currently binary identical (instruction wise) to original: https://github.com/diasurgical/devilution/milestone/3 I just checked and GetItemAttrs is still bin exact so the only difference that is really possible is that it is operating on the wrong data addresses.

AJenbo avatar Jul 06 '19 00:07 AJenbo

Thanks for the quick reply. I'm guessing it's something in engine.cpp or items.cpp then since those functions are heavily called to recreate the items :(

Meth962 avatar Jul 06 '19 01:07 Meth962

I'm guessing it's something in engine.cpp or items.cpp

Engine.cpp is only responsible for drawing CEL files as well as providing the random function. It could be that the trainer you used messed with something else or used a different method to create items. The code we have is accurate and correct and generates everything the same as the vanilla .exe.

ghost avatar Jul 06 '19 01:07 ghost

Those are the random routines I'm talking about. SetRndSeed GetRndSeed random

Are you sure those are accurate? I saw a few lines of "guessing the type for..." in there so I didn't trust it.

Right now I'm just try to prove the reverse engineered code is accurate so I'm not even using the trainer. I'm just doing some simple math to see if the random item code is correct.

Meth962 avatar Jul 06 '19 01:07 Meth962

Yes they are all bin exact, here is random: image The original 1.09b on the left and Devilution on the right.

If you check my link you can see that the only function that isn't bin exact is scrollrt_draw_dungeon, but it's down to one instruction being done differently and doesn't involve items obviously.

The types have been verified, but not all the comments have been removed yet.

AJenbo avatar Jul 06 '19 01:07 AJenbo

Ah, thanks for sharing that. I haven't learned assembly as I find it much harder to follow than assembly from a RISC32.

I have one last question, if you have this ready. Can you check items.cpp::GetItemPower? I feel like the if check for prefix and suffix is wrong. The if statement checks if only a suffix is present and then flips a coin (50/50), but the outcomes make no sense. It either, removes the suffix or adds a prefix. That means there are no items that have only suffixes??

void GetItemPower(int i, int minlvl, int maxlvl, int flgs, BOOL onlygood)
{
	int pre, post, nt, nl, j, preidx, sufidx;
	int l[256];
	char istr[128];
	BYTE goe;

	pre = random(23, 4);
	post = random(23, 3);
	if (pre && !post) { // if suffix only (0 = false = has a suffix)
		if (random(23, 2)) // flip a coin
			post = 1; // suffix is removed
		else
			pre = 0; // prefix is added

Thanks in advance :)

Meth962 avatar Jul 06 '19 02:07 Meth962

In case your not aware we have manually verified every single function in multiple ways. And also we have build tools that show the overall difference in the project. So the chance that there are faults like that is very unlikely. The only thing that hasn't been verified is what global variables the code is operating on.

I doubled checked the function and it is bin exact, I also went the extra mile and compared the relative position of the global variables and they also appear correct. A fault with a global variable would be something like PL_Prefix[j].PLPower being checked instead of PL_Prefix[j].PLMinLvl (none of the code in your snippet operates on global variables).

AJenbo avatar Jul 06 '19 02:07 AJenbo

@Meth962

My calculations for GetItemPower (correct me if I'm wrong):

pre==0 - means prefix is generated (which is weird but whatever) post!=0 - means suffix is generated

Before if: Out of 12: 3 cases - neither 6 cases - only suffix. 1 case - only prefix 2 cases - both

After if: Half of neither cases goes to suffix, half to prefix (since this function is called only items which are expected to be magical): 15/24 only suffix, 5/24 - only prefix, 4/24 - both.

(Also found this in Jarulf's Guide, alas in % form but easily checked)

In general I think you should look into direction of RecreateItem function since it heavily points out that not only seed but also ic (_iCreateInfo) is important for restoring item generation process.

Edit: added link to Jarulf's guide.

Predelnik avatar Jul 06 '19 06:07 Predelnik

@Predelnik Thanks for pointing out a small '!' I missed. You're right, that code is weird. if(!pre) means prefix on value equals 0 but, if(post) means suffix non zero. I had added a not operator on post thinking I copied it wrong. Makes sense though for the majority of items having suffix, just those little binary operations are easily overlooked.

I was able to get my mod to generate legitimate items sourced from Griswold's Premium at least. Haven't gotten it to use the correct seed on monster drops but that's another day's problem!

Heh, trust me, I know all too well about createInfo and the seed. Many years ago I gave up trying to figure out how the game did it and settled for adding items live-time every game reload. I'm pretty happy to finish off that one missing thing from my mod though :)

Thanks for the help!

Meth962 avatar Jul 06 '19 19:07 Meth962

Thanks for pointing out a small '!' I missed. You're right, that code is weird. if(!pre) means prefix on value equals 0 but, if(post) means suffix non zero.

Oh hey, another case for comparing numbers with numbers instead of saving 3 chars just because. :D

if (pre == 0) / if (post != 0)

seritools avatar Jul 06 '19 19:07 seritools

@Meth962 wrote:

Many years ago I gave up trying to figure out how the game did it and settled for adding items live-time every game reload. I'm pretty happy to finish off that one missing thing from my mod though :)

Out of curiousity, what is the name of your mod, please:)?

sskras avatar Jul 07 '19 12:07 sskras

Name? I guess it would be "Diablo Editor" but I didn't release it to the public or anything :P My friends and I just use it to make our Diablo 1 playthroughs more convenient (item saving/stash/enemy hp visible etc). It gets pretty boring at level 40-50 so I always aspired to create legitimate items with it since we don't find upgrades anymore. Ah, looks like I re-uploaded an old video on part of the mod here https://www.youtube.com/watch?v=094bpnUS6MA It only shows some in game stuff and not the item creator or monster viewer.

Meth962 avatar Jul 07 '19 15:07 Meth962