Automate Solving Anti-Debug binary with Angr

This post shows how to use Angr to automate solving anti-debug RE challenges.

AngrControlFlow - Naive binary

You can grab the challenges here.

The graph above is generated by radare2 with the command agfw graph.gif.

A small note on reversing large binaries: when you look at any graph, read it from the bottom up. It gives you a sense of where the flow goes. Start from the end, then work up to where it begins.

When a crypto guy looks at the graph, there are 2 things going on in his head:

  • Non-constant time comparison, vulnerable to timing attack.
  • Simple comparison. Easy to reverse.

Let me remind you once again: what can be solved with Z3 can be solved with Angr, because Angr uses Z3 as its internal solver. However, we rarely deal with Z3 directly in Angr; we use Claripy as the abstract symbolic layer.

The more I use Angr, the more I realize that Claripy has more capabilities tailored to reverse engineering on bitvectors than to the mathematical side that Z3 covers. By this I mean there is no performance gain, but the implementation is more comfortable. (I have used Z3 long enough that both are fine for me.)

About the binaries:

  • No anti-debug (if it had any, I would patch it; later in this post I will show how to hook anti-debug functions instead of patching)
  • No SIMD instructions

So here is my plan to solve the challenge:

  • Decompile the code with IDA and solve it with Z3 (the last resort)
  • Apply automated binary analysis with Angr
  • Use Intel Pintool to count instructions: each time we pass a comparison check, the number of executed instructions increases, so the input can be recovered this way. However, because the comparison walks the array in non-consecutive order, we would have to customize the Pintool.
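
The instruction-counting idea in the last bullet can be tried out without Pin at all. The sketch below is a toy model, not the challenge itself: SECRET, ORDER, check and recover are made-up names, and the comparison counter stands in for the instruction count a Pintool would report. It assumes the checker exits at the first mismatch and that we do not know the order in which the positions are compared.

```python
import string

# Hypothetical stand-ins for the real binary: a secret and a
# non-consecutive order in which its characters are compared.
SECRET = "flag{pin}"
ORDER = [3, 0, 7, 1, 8, 2, 5, 4, 6]
ALPHABET = string.ascii_lowercase + string.digits + "{}_"


def check(guess):
    """Early-exit comparison. Returns (ok, comparisons executed)."""
    executed = 0
    for i in ORDER:
        executed += 1
        if guess[i] != SECRET[i]:
            return False, executed
    return True, executed


def recover(length, alphabet):
    """Recover the secret using only the comparison count as feedback."""
    guess = ["\x00"] * length      # filler that never matches the alphabet
    fixed = set()
    ok, baseline = check(guess)
    while not ok:
        progressed = False
        for pos in range(length):
            if pos in fixed:
                continue
            for ch in alphabet:
                trial = guess.copy()
                trial[pos] = ch
                trial_ok, count = check(trial)
                # A higher count (or a full pass) means this position was
                # the one being compared and ch is the right character.
                if trial_ok or count > baseline:
                    guess, ok, baseline = trial, trial_ok, count
                    fixed.add(pos)
                    progressed = True
                    break
            if progressed:
                break
        if not progressed:
            raise ValueError("no candidate increased the count")
    return "".join(guess)


print(recover(len(SECRET), ALPHABET))
```

Because every position is tried at every step, the sketch does not need to know ORDER, which is the kind of customization the non-consecutive comparison forces on a real Pintool.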

Let's go with Angr first. If we can't make it, we will be back to Z3.

As usual, we load the binary, let it run at main(), define good point and bad point, and pray Angr gets to the win branch.

import angr

p = angr.Project('angrycontrolflow')
In [5]: p = angr.Project('angrycontrolflow')
WARNING | 2019-03-03 02:39:04,834 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.

Oops, this time the binary is a position-independent executable (PIE), that is, built as position-independent code (PIC). That's not what we are used to. However, the binary is loaded with a base address of 0x400000. We note that and move on.

For now, we write it the way we always do: find the good point and the bad point, and tell the simulation manager (simgr) to reach the good point and avoid the bad point.

main = 0x000007e4

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = 0xa66
bad = 0xa88


simgr.explore(find=good, avoid=bad)
Error. What the hell! 

Oh man, we get errors.

This happens because God doesn't like you, or because you don't treat people well so the drama comes back to you, or maybe because you use a Mac.

The real reason is that we pinned the addresses to raw offsets. Those numbers are suspiciously low, and that is the sign that something is off: Angr loaded the binary at base 0x400000, so nothing is mapped at 0x7e4. When Angr starts at our main address, it starts at a location that does not exist in the loaded image. As a result, simgr starts from nowhere and ends nowhere, which is why it doesn't work.

I gave you a hint earlier: when we load the binary, the base address is 0x400000. Now let's modify the code and add the base address to the addresses we chose.

base = 0x400000
main = base + 0x000007e4

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0xa66
bad = base + 0xa88


simgr.explore(find=good, avoid=bad)

Wow, we found the result.

Let's dump the result out.

result = simgr.found[0]

# Always print this 
for i in range(3):
    print (result.posix.dumps(i))

Yay we have the flag.
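
The hard-coded base can also be read from the loader instead of being copied from the warning message. A minimal sketch, assuming the offsets from radare2 are relative to a zero base: rebase is a hypothetical helper, and fake_project is only a stand-in so the snippet runs without angr. With a real angr.Project, the same attribute chain (loader.main_object.mapped_base) reports the base chosen by cle.

```python
from types import SimpleNamespace


def rebase(project, offset):
    """Turn a zero-based offset into the address where the loader mapped it."""
    return project.loader.main_object.mapped_base + offset


# Stand-in for angr.Project('angrycontrolflow'), used here only so the
# sketch is self-contained; pass the real project object in practice.
fake_project = SimpleNamespace(
    loader=SimpleNamespace(main_object=SimpleNamespace(mapped_base=0x400000))
)

main = rebase(fake_project, 0x7E4)
good = rebase(fake_project, 0xA66)
bad = rebase(fake_project, 0xA88)
print(hex(main), hex(good), hex(bad))
```

This keeps the script working if the loader ever picks a different base for the binary.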

The whole script here:

import angr 

p = angr.Project('angrycontrolflow')

base = 0x400000
main = base + 0x000007e4

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0xa66
bad = base + 0xa88


simgr.explore(find=good, avoid=bad)

result = simgr.found[0]

# Always print this 
for i in range(3):
    print (result.posix.dumps(i))
Sorry for typo exit =))

ReMe - Antidebug

Not long ago I wrote a challenge, Introduction to SMT, for Efiens Round 2 CTF. You can see the writeup here. The download link is also on midas's GitHub.

Wow, this time there are multiple comparisons but not many branches (agfw reme.gif).

I developed the code myself, so if you look at the comparison graph, it is not vulnerable to a timing attack.

Let's go with Angr. midas's solution uses Z3, and as I said, what can be done with Z3 can be done with Angr.

Check the binary:

  • There are anti-debug functions. We will hook the address of each anti-debug call and replace it with a function that does nothing; for the calls to fork, we modify the eax register so execution does not jump into a black hole
  • The code is PIC (so we need to add the base)
  • Patch the obfuscated code at the beginning, which takes forever to complete (using radare2). At address 0x00000bb6 we patch jmp 0xc01 to jmp 0xc0a
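
radare2 assembles the patched jump for you, but the arithmetic behind it is short enough to check by hand. The sketch below assumes the instruction at 0xbb6 is a 2-byte short jmp (opcode EB followed by an 8-bit displacement); if the binary uses the 5-byte near form instead, the encoding differs. encode_short_jmp is a hypothetical helper, not part of any tool.

```python
def encode_short_jmp(src, dst):
    """Bytes for a 2-byte x86 short jmp (EB rel8) located at src, landing on dst."""
    rel = dst - (src + 2)          # rel8 is counted from the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jmp")
    return bytes([0xEB, rel & 0xFF])


original = encode_short_jmp(0xBB6, 0xC01)
patched = encode_short_jmp(0xBB6, 0xC0A)
print(original.hex(), "->", patched.hex())
```

Under that assumption only the displacement byte changes, so the patch is a one-byte edit.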

As usual, let's start loading the binary and read the base given to us.

import angr 

p = angr.Project('REme-a28b7273a8e6671e6f0a361addbba87e')
WARNING | 2019-03-03 03:43:33,089 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.

As always, we start at main, create the simulation manager, and pick the good point and the bad point. Remember, this binary has base 0x400000, so we must add the base to each address to make Angr work.

base = 0x400000
main = base + 0x00000b2f 

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0x38ec
bad = base + 0x38ff

In the previous blog post about Angr, I wrote a hook function in this style:

@p.hook(0x123)
def bla(state):
	pass

But we can also write it like this. Both work the same way, no worries.

def bla(state):
	pass

p.hook(0x123, bla)

The hook function takes state as its input. You may ask why: when Angr reaches the address we hooked, it passes its current state to the hook function and asks, "What do you want to do with this state? You asked for a hook." That's why the hook must accept state. Otherwise we get an error and Angr exits.
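
To make the state hand-off concrete, here is a toy model of an engine that calls hooks. It is not Angr's implementation: PROGRAM, run, call_fork and record are made up for illustration. It also models the optional length argument that p.hook accepts, which, as I understand it, is the number of bytes of original code to skip after the hook and defaults to 0, meaning the hooked instruction itself still runs afterwards.

```python
# Toy model only: every name here is hypothetical and merely imitates
# how an engine hands its current state to a hook.

def call_fork(state):
    state["eax"] = 0               # pretend the real call returned 0


def record(state):
    state["seen"] = state["eax"]   # the check that follows the call


# address -> (instruction size in bytes, effect on the state)
PROGRAM = {0x10: (5, call_fork), 0x15: (2, record)}


def run(hooks, start=0x10, end=0x17):
    state = {"eax": None, "seen": None}
    pc = start
    while pc != end:
        if pc in hooks:
            hook, length = hooks[pc]
            hook(state)            # the engine passes its current state in
            if length:
                pc += length       # skip `length` bytes of original code
                continue
        size, effect = PROGRAM[pc]
        effect(state)              # length == 0: original instruction still runs
        pc += size
    return state


def set_eax(state):
    state["eax"] = 1


print(run({0x10: (set_eax, 0)})["seen"])   # hook runs, then the call overwrites eax
print(run({0x10: (set_eax, 5)})["seen"])   # hook replaces the 5-byte call
```

With a real project the second case would correspond to something like p.hook(addr, set_eax, length=5); whether a given call site really is 5 bytes long is something to confirm in the disassembly.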

Next step: instead of patching a bunch of places, I decided to hook ptrace with a function that does nothing and hook fork with a function that sets eax=1.

grep call and grep fork, then select the 1st column of the radare2 output

Let's write the hooks.

def nothing(state):
    pass

def set_eax(state):
    state.regs.eax = 1


Alright, back to our script. We see that ptrace has an XREF at address 0x341f, so let's hook it with nothing(), and hook the fork call sites with set_eax().

ptrace = base+ 0x341f

p.hook(ptrace, nothing)

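# call sites of fork, as offsets from radare2
gdb_debug_list = [0x00000b4e, 0x00001340, 0x00001805, 0x00001dd3, 0x000026ff, 0x0000302e]
# a list (not a one-shot map iterator) so it can be reused on Python 3
gdb_debug_list = [base + x for x in gdb_debug_list]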

for addr in gdb_debug_list:
    p.hook(addr, set_eax)

Alright, let's let the simulation manager find its way to the good point.

simgr.explore(find=good,avoid=bad)

Nice, there are 64 solutions that reach the good point. Let's grab the result.

Here is the full script for you to grab and play with:

import angr 

def nothing(state):
    pass

def set_eax(state):
    state.regs.eax = 1

p = angr.Project('REme-a28b7273a8e6671e6f0a361addbba87e')


base = 0x400000
main = base + 0x00000b2f 

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0x38ec
bad = base + 0x38ff

ptrace = base+ 0x341f

p.hook(ptrace, nothing)

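# call sites of fork, as offsets from radare2
gdb_debug_list = [0x00000b4e, 0x00001340, 0x00001805, 0x00001dd3, 0x000026ff, 0x0000302e]
# a list (not a one-shot map iterator) so it can be reused on Python 3
gdb_debug_list = [base + x for x in gdb_debug_list]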

for addr in gdb_debug_list:
    p.hook(addr, set_eax)

simgr.explore(find=good,avoid=bad)

result = simgr.found[0]

# Always print this 
for i in range(3):
    print (result.posix.dumps(i))

Well, you have reached the end of this post and may wonder why we deal with Angr instead of Z3 to solve ReMe. In terms of mindset, Z3 is the better fit for the ReMe challenge, since I wrote it in a way meant to hinder Angr.

The AngrControlFlow challenge has no anti-debug, so it is faster to solve with Angr.

I solved AngrControlFlow with Angr within 10 minutes, while solving my own ReMe took 25 minutes.

Based on how fast I can parse decompiled code from IDA, solving ReMe with Z3 would in some ways be faster. However, that solution would only be usable for the ReMe challenge, whereas the Angr approach can be reused.

So far, we have learned how to use Angr, deal with anti-debug functions, add constraints to the solver, and add the base address when the binary is PIC.

Next time, we will try to unpack a self-obfuscating binary with Angr.
