Experimenting with Binary Ninja IL rewriting
Every year after Flare-On ends, I tell myself that I should really take the time to learn more about deobfuscation. One of the projects I’ve been meaning to do for a while is to figure out how to use Binary Ninja’s IL modification API, so I decided to pick out a challenge from crackmes.one and see if I could improve the decompilation by modifying ILs.
The crackme I used was The Obfuscator’s Riddle, an obfuscated binary of medium difficulty. My goal was to solve the crackme entirely statically by using IL modification to produce a clean decompilation. Unfortunately, I didn’t succeed at doing this, but there are so few writeups on IL modification that it’s still probably helpful to document everything that I tried. Also, I want to come back to this project at some point, and if I write it all down in a blog post I’ll at least have a prayer of remembering what I was doing.
I did manage to deobfuscate the binary, but it was through binary patching, not IL modification. The patching script and the solution to the crackme are included at the end ofthe writeup.
Challenge Overview
As usual for these types of challenges, we’re prompted for a password:
140001081 write_console("Enter password: ", 0x10)
140001091 int128_t buffer = zx.o(0)
140001099 uint32_t numberOfCharsRead[0x6]
140001099 numberOfCharsRead[0] = 0
1400010ad uint64_t rbx = 0x10
1400010c1 ReadConsoleA(hConsoleInput: GetStdHandle(nStdHandle: STD_INPUT_HANDLE),
1400010c1 lpBuffer: &buffer, nNumberOfCharsToRead: 0x10,
1400010c1 lpNumberOfCharsRead: &numberOfCharsRead, pInputControl: nullptr)
However, after the password prompt we’re faced with a long switch statement that has 0xef different cases. At the end of each case, the variable rax_2 is set to a different value between 0 and 0xef, followed by a jump to a “dispatcher” block at 0x31b8. Weirdly, Binary Ninja didn’t decompile the switch case at first, but rebasing to a base address of 0 fixed it.
0000119d switch (rcx_5)
000011ca case 1
000011ca rax_2.b = 0xbe
000011cc goto label_31b8
0000211b case 3
0000211b rax_2.b = 0xb5
0000211d goto label_31b8
00001d9c case 4
00001d9c rax_2.b = 0xe7
00001d9e goto label_31b8
00002d8e case 5
00002d8e rax_2.b = 0x2b
00002d90 goto label_31b8
000015e8 case 6
000015e8 rax_2.b = 0x15
000015ea goto label_31b8
This is a classic case of control flow flattening. rax_2 stores a state variable that’s used as an offset into a jump table, and the dispatcher at 0x31b8 jumps to the address stored at [jumptable base address] + 4 * rax_2. Since the decompiler doesn’t understand how the state variable is used by the dispatcher to calculate where to jump next, it can’t reconstruct the control flow graph of the binary, leading to a “flattened” view where every basic block ends with a jump to the dispatcher. (If you’re unfamiliar with control flow flattening, there are a lot of detailed writeups on different forms of it, like this one from OLLVM or this one from Tigress.)
Binary Ninja IL Modification
Patching with JUMP instructions
My first thought was to look through the MLIL for jumps to the dispatcher block, then replace them with direct jumps to the target address that would be calculated within the dispatcher block based on the state variable. Finding the jumps was easy, as the jumps to the dispatcher block followed a predictable pattern of setting the al register to the next state value, followed by a jump to the dispatcher:
70 @ 000011b5 rax_2.al = 0x2c
71 @ 000011b7 goto 1026 @ 0x31b8
This made it relatively easy to write a function to check whether a set of two MLIL instructions corresponded to a CFF-obfuscated jump.
def is_cff(first, second):
if len(first.operands) < 3 or len(second.operands) < 1:
return False
if second.operands[0] != DISPATCHER_LOC:
return False
if type(first) != MediumLevelILSetVarField or type(second) != MediumLevelILGoto:
return False
if type(first.operands[0]) != Variable or type(first.operands[2]) != MediumLevelILConst:
return False
return True
I then tried searching through the MLIL to find all instances of the CFF pattern, using the replace_expr function to replace the jumps to the dispatcher with jumps to the target address. (Like a lot of the other IL modification functions, the instruction to be replaced by replace_expr is specified by its “expression index”. Notably, the expression index of an instruction is not the index where the instruction appears in the IL: the first MLIL instruction in the crackme binary has expression index 4, not 0. The expression indices 0 through 3 are actually assigned to the the operands of the first instruction, which are themselves expressions.)
for block in analysis_context.mlil.basic_blocks:
for i in range(len(block) - 1):
first = block[i]
second = block[i + 1]
if is_cff(first, second):
idx = first.operands[2]
jump_target = JUMPTABLE_ADDR + bv.read_int(JUMPTABLE_ADDR + 4 * idx.value.value, 4)
log_info(f'Found jump offset {hex(idx.value.value)} at address {hex(first.address)}')
analysis_context.mlil.replace_expr(second.expr_index, analysis_context.mlil.jump(analysis_context.mlil.const(8, jump_target)))
log_info(f'Replaced jump at {hex(second.address)} with jump to {hex(jump_target)}')
I then added the modification function as a step in the analysis workflow so that it would be called whenever a function was analyzed. Most of the examples of MLIL modification suggest inserting the modification step right after core.function.generateMediumLevelIL, but when I tried to do this, I got an exception in core.function.analyzeIndirectBranches step with the message invalid access to LLIL instruction. I’m not sure why this happened, but I’m guessing it’s because the new MLIL instructions inserted into the function aren’t actually backed by any underlying LLIL instructions. Inserting the modification function after core.function.analyzeIndirectBranches fixed the issue.
wf = Workflow("core.function.metaAnalysis").clone("core.function.metaAnalysis")
wf.register_activity(Activity(
configuration=json.dumps({
"name": "extension.fix_cff",
"title": "Fix CFF",
"description": "Replace control flow flattening instructions with direct jumps.",
"eligibility": {
"auto": {
"default": True
}
}
}),
action=fix_cff
))
wf.insert_after("core.function.analyzeIndirectBranches", [
"extension.fix_cff"
])
wf.register()
In general, it seems like IL modification is somewhat experimental, with very little error handling when things go wrong. I ran into some difficult-to-diagnose errors, especially when I tried modifying LLIL rather than MLIL - sometimes Binary Ninja would segfault, and other times the analysis workflow appeared to get stuck in an infinite loop of analyzing and reanalyzing. My guess is a lot of these problems have to do with the fact that my IL modifications resulted in major changes to the control flow, and that IL modification might work better if I used it to clean up data-level obfuscation (e.g., opaque constants) as opposed to control flow obfuscation.
This was the working script that I ended up with:
DISPATCHER_LOC = 380
JUMPTABLE_ADDR = 0x418c
def is_cff(first, second):
if len(first.operands) < 3 or len(second.operands) < 1:
return False
if second.operands[0] != DISPATCHER_LOC:
return False
if type(first) != MediumLevelILSetVarField or type(second) != MediumLevelILGoto:
return False
if type(first.operands[0]) != Variable or type(first.operands[2]) != MediumLevelILConst:
return False
return True
def fix_cff(analysis_context: AnalysisContext):
log_info(f"Starting CFF replacement for function {analysis_context.function}")
for block in analysis_context.mlil.basic_blocks:
for i in range(len(block) - 1):
first = block[i]
second = block[i + 1]
if is_cff(first, second):
idx = first.operands[2]
jump_target = JUMPTABLE_ADDR + bv.read_int(JUMPTABLE_ADDR + 4 * idx.value.value, 4)
log_info(f'Found jump offset {hex(idx.value.value)} at address {hex(first.address)}')
analysis_context.mlil.replace_expr(second.expr_index, analysis_context.mlil.jump(analysis_context.mlil.const(8, jump_target)))
log_info(f'Replaced jump at {hex(second.address)} with jump to {hex(jump_target)}')
analysis_context.mlil.finalize()
analysis_context.mlil.generate_ssa_form()
wf = Workflow("core.function.metaAnalysis").clone("core.function.metaAnalysis")
wf.register_activity(Activity(
configuration=json.dumps({
"name": "extension.fix_cff",
"title": "Fix CFF",
"description": "Replace control flow flattening instructions with direct jumps.",
"eligibility": {
"auto": {
"default": True
}
}
}),
action=fix_cff
))
wf.insert_after("core.function.analyzeIndirectBranches", [
"extension.fix_cff"
])
wf.register()
While this script did, in fact, successfully replace the jumps to the dispatcher with jumps to the target address, it didn’t produce an unflattened decompilation. The long switch case was still there, and it was full of jumps to hard-coded addresses, e.g, jump(0x2209):
0000119d switch (jump_table_14000418c[zx.q(rcx_5 - 1)])
0000119d case 0xffffcff2
0000119d continue
000011a0 case 0xffffd014
000011a0 rax_2 = 0
000031b8 label_31b8:
000031b8 rcx_4 = rbp_1
000031ba rdx_1 = r14_1
000031bd continue
000031bd continue
000011a7 case 0xffffd01b
000011a7 rax_2.b = 0xe8
00003214 jump(0x2209)
000011ae case 0xffffd022
000011ae rax_2.b = 0x33
00003214 jump(0x14db)
000011b5 case 0xffffd029
As it turns out, the primary instructions used in MLIL control flow are not JUMP instructions, but GOTO instructions. A JUMP instruction targets a specific memory address, whereas a GOTO targets an MLIL expression index. Since the documentation on IL modification mentioned that adding new GOTO expressions can be more complicated than adding other types of IL instructiions, I had hoped that it would be possible to get a good result by using a JUMP in place of a GOTO, but apparently not.
Patching with GOTO instructions
Labels
The Binary Ninja documentation gives the following explanation of how to add new GOTO IL instructions:
When you are trying to insert *LIL_GOTO, *LIL_IF, and *LIL_JUMP_TO instructions, you will need to specify the IL destination as a *LevelILLabel. Your destination must be a properly marked label, which is actually rather tricky to obtain. This is because there is currently no way to get a label for already-emitted IL Instructions, so if you want to modify the control flow of a function, you will need to do a Copy Transformation as described above. This may change in future versions.
I was a little confused about what exactly this meant, but I eventually came across this GitHub issue that clarified things slightly:
The general idea is that you need to mark labels at the start of basic blocks and have the goto target those labels. If the goto is emitted before the label though, what do you do? In that case, make the label ahead of time and mark it when you get to the block.
Unfortunately, there aren’t many examples that demonstrate the correct use of labels, but after some experimenting I was eventually able to figure out the following:
- IL basic blocks are constructed by emitting one instruction at a time. It’s not possible to insert new IL instructions at arbitrary locations in a basic block, which is why any modification more complicated than replacing one IL instruction with one other IL instruction requires a completely new basic block to be constructed.
- Any IL instruction that can be targeted by a direct branch (
GOTO,JUMP_TO) or an indirect branch (IF) needs to have a “label” attached to it. When a branch instruction is constructed, the label needs to be passed to it to specify the destination of the branch. - Labels need to be “emitted” just like instructions. You can’t attach a label to an arbitrary IL instruction, the label has to be attached to the instruction at the same time the instruction is being emitted.
- It’s possible to construct a label without immediately specifying the location that the label points to. The
mark_label()function associates a label with a particular IL location, and it can be called anytime after a label is created. Whenmark_label()is called on a label, the label points to the next IL instruction to be emitted. - What happens if the branch instruction comes before the branch target in the basic block? Since the target instruction hasn’t been emitted yet, there’s no label that can be used to construct the branch instruction. In this case, you need to create a new label without marking it, then keep track of that label and call
mark_label()on it later when the target instruction is emitted.
According to the documentation, when copying an IL function, we’re supposed to be able to use the function MediumLevelILFunction.get_label_for_source_instruction() to obtain a label that corresponds to the start of a basic block. In practice, however, this function often returned None without giving me much indication as to why. I ultimately didn’t end up using get_label_for_source_instruction() in my script.
Writing the replacement script
An MLIL GOTO expression takes a label for a basic block as its target, but a jump table is just a list of memory addresses. In order to figure out the equivalent GOTO instruction to construct for each jump in the jump table, I needed to figure out which MLIL instructions were present at each of the targeted addresses. Since the dispatcher is represented in MLIL as a long switch case targeting each location in the jump table, I figured I could construct a map from addresses to MLIL just by finding the target address of each case in the switch statement.
blocks = {}
dispatcher = old_func.get_basic_block_at(DISPATCHER_LOC)
for e in dispatcher.outgoing_edges:
insn = e.target[0]
blocks[insn.address] = insn.expr_index
However, in my initial attempt at writing the script I ran into an issue where some of the jump target addresses weren’t the addresses of any MLIL instructions. This has to do with how addresses get assigned to more complex MLIL instructions like if/then/else. For instance, the following MLIL instruction is assigned an address of 0x12a9:
1542 @ 000012a9 if (rdx_4 == i_23) then 1650 @ 0x2db2 else 1652 @ 0x12af
However, looking at the disassembly that was lifted into this MLIL instruction, it actually begins at 0x12a7. If a jump table were to jump there, it would jump to 0x12a7, not 0x12a9.
000012a7 cmp edx, ecx
000012a9 je 0x2dad
The issue here is that an if/then/else MLIL instruction is lifted from two instructions: a comparison and a branch. The comparison comes before the branch, but it’s the address of the branch instruction that’s considered to be the “address” of the MLIL instruction. More generally, MLIL instructions can be constructed out of complicated trees of expressions and sub-expressions, and the underlying disassembly behind those instructions can come from completely different locations in memory. As a result of this, it doesn’t necessarily make sense to talk about “the address” of an MLIL instruction, so the approach I took to figure out jump targets probably wasn’t a very good one.
The hacky workaround I used in my script was to iterate through the operands of each of the MLIL instructions and associate their addresses with the MLIL instruction as well. In the case of the comparison above, for example, its first operand is the MediumLevelILCmpE expression rdx_4 == i_23. Since it’s lifted from the cmp edx, ecx instruction, its address is 0x12a7.
mlil_addresses = {}
for i in range(len(old_func)):
insn = old_func[i]
if insn.address not in mlil_addresses:
mlil_addresses[insn.address] = insn.address
for op in insn.operands:
try:
if op.address not in mlil_addresses:
mlil_addresses[op.address] = insn.address
except:
pass
Having (mostly) associated the MLIL instructions with their addresses, it was then possible to construct a new IL function. The jumps to the dispatcher were replaced with GOTO instructions that pointed to the jump target, and all other MLIL instructions were copied over unchanged. While copying each instruction, I maintained a dictionary called labels that associated each label with the memory address it targeted.
if is_cff(first, second, dispatcher[0].expr_index):
idx = first.operands[2]
jump_target = JUMPTABLE_ADDR + bv.read_int(JUMPTABLE_ADDR + 4 * (idx.value.value - 1), 4)
if jump_target in mlil_addresses:
jump_target = mlil_addresses[jump_target]
if jump_target not in labels:
label = MediumLevelILLabel()
labels[jump_target] = label
else:
label = labels[jump_target]
new_expr = new_func.goto(label, loc=ILSourceLocation.from_instruction(second))
new_func.append(new_expr, ILSourceLocation.from_instruction(second))
log_info(f'Replaced jump at {hex(second.address)} with jump to {hex(jump_target)}')
else:
try:
emit_insn(new_func, second, labels)
except Exception as e:
log_warn(f'Could not emit instruction: {e}. Emitting NOP instead.')
new_func.append(new_func.nop(ILSourceLocation.from_instruction(second)), ILSourceLocation.from_instruction(second))
To copy MLIL instructions, I defined a function called emit_insn. Before copying each instruction, the function checked to see if its address was associated with an existing label that needed to be marked. If so, it marked the label, otherwise it created a new label for the instruction and marked it immediately. In this way, all of the newly created GOTO instructions ended up with correctly marked labels once the new MLIL was done being emitted.
def emit_insn(new_func, insn, labels):
if insn.address in labels:
new_func.mark_label(labels[insn.address])
else:
label = MediumLevelILLabel()
labels[insn.address] = label
new_func.mark_label(labels[insn.address])
new_func.append(insn.copy_to(new_func), ILSourceLocation.from_instruction(insn))
The full script:
DISPATCHER_LOC = 380
JUMPTABLE_ADDR = 0x418c
# Look for the following pattern:
# state.al = [next state]
# goto 380 @ 0x31b8
def is_cff(first, second, dispatcher):
if len(first.operands) < 3 or len(second.operands) < 1:
return False
if second.operands[0] != DISPATCHER_LOC:
return False
if type(first) != MediumLevelILSetVarField or type(second) != MediumLevelILGoto:
return False
if type(first.operands[0]) != Variable or type(first.operands[2]) != MediumLevelILConst:
return False
return True
def emit_insn(new_func, insn, labels):
if insn.address in labels:
new_func.mark_label(labels[insn.address])
else:
label = MediumLevelILLabel()
labels[insn.address] = label
new_func.mark_label(labels[insn.address])
new_func.append(insn.copy_to(new_func), ILSourceLocation.from_instruction(insn))
def fix_cff(analysis_context: AnalysisContext):
labels = {}
old_func = analysis_context.mlil
new_func = MediumLevelILFunction(old_func.arch, low_level_il=analysis_context.llil)
blocks = {}
dispatcher = old_func.get_basic_block_at(DISPATCHER_LOC)
for e in dispatcher.outgoing_edges:
insn = e.target[0]
blocks[insn.address] = insn.expr_index
# Deal with instructions like if/then/else where the operands are at a different memory address than the MLIL instruction itself
mlil_addresses = {}
for i in range(len(old_func)):
insn = old_func[i]
if insn.address not in mlil_addresses:
mlil_addresses[insn.address] = insn.address
for op in insn.operands:
try:
if op.address not in mlil_addresses:
mlil_addresses[op.address] = insn.address
except:
pass
new_func.prepare_to_copy_function(old_func)
for old_block in old_func.basic_blocks:
new_func.prepare_to_copy_block(old_block)
first_insn = old_block[0]
new_func.set_current_address(first_insn.address, old_block.arch)
emit_insn(new_func, first_insn, labels)
for i in range(1, len(old_block)):
first: MediumLevelILInstruction = old_block[i - 1]
second: MediumLevelILInstruction = old_block[i]
new_func.set_current_address(second.address, old_block.arch)
if is_cff(first, second, dispatcher[0].expr_index):
idx = first.operands[2]
jump_target = JUMPTABLE_ADDR + bv.read_int(JUMPTABLE_ADDR + 4 * (idx.value.value - 1), 4)
if jump_target in mlil_addresses:
jump_target = mlil_addresses[jump_target]
if jump_target not in labels:
label = MediumLevelILLabel()
labels[jump_target] = label
else:
label = labels[jump_target]
new_expr = new_func.goto(label, loc=ILSourceLocation.from_instruction(second))
new_func.append(new_expr, ILSourceLocation.from_instruction(second))
log_info(f'Replaced jump at {hex(second.address)} with jump to {hex(jump_target)}')
else:
try:
emit_insn(new_func, second, labels)
except Exception as e:
log_warn(f'Could not emit instruction: {e}. Emitting NOP instead.')
new_func.append(new_func.nop(ILSourceLocation.from_instruction(second)), ILSourceLocation.from_instruction(second))
new_func.finalize()
new_func.generate_ssa_form()
analysis_context.mlil = new_func
wf = Workflow("core.function.metaAnalysis").clone("core.function.metaAnalysis")
wf.register_activity(Activity(
configuration=json.dumps({
"name": "extension.fix_cff",
"title": "Fix CFF",
"description": "Replace control flow flattening instructions with direct jumps.",
"eligibility": {
"auto": {
"default": True
}
}
}),
action=fix_cff
))
wf.insert_after("core.function.analyzeIndirectBranches", [
"extension.fix_cff",
])
wf.register()
Unfortunately, this script wasn’t any more successful than the previous one at getting me an unflattened binary. As a reminder, this is what the decompilation looked like before running the script:
0000119d switch (rcx_5)
000011ca case 1
000011ca rax_2.b = 0xbe
000011cc goto label_31b8
0000211b case 3
0000211b rax_2.b = 0xb5
0000211d goto label_31b8
00001d9c case 4
00001d9c rax_2.b = 0xe7
00001d9e goto label_31b8
// [...]
And this is what it looked like after:
0000119d switch (jump_table_14000418c[zx.q(rcx_5 - 1)])
0000119d case 0xffffcff2
0000119d continue
0000119d case 0xffffd014
0000119d goto label_11a0
0000119d case 0xffffd01b
0000119d goto label_11a7
0000119d case 0xffffd022
0000119d goto label_11ae
// [...]
The long switch case is still there, even though all the GOTO instructions have been changed. There were a few CFF blocks that my script missed, so it’s possible the decompilation would’ve worked better if I’d managed to detect and rewrite them. However, it’s also possible that swapping out individual MLIL instructions doesn’t do much to rewrite the control flow graph, and I would’ve had to do a much more substantial rewrite of the function in order to get rid of the dispatcher block entirely. Additionally, I’m not sure exactly where in the analysis process Binary Ninja tries to simplify the control flow graph, so maybe the script runs too late in the workflow to be able to make a difference (although, one of the official examples of the API performs unflattening at the MLIL level, so that’s unlikely).
At this point, my script had gotten kind of hacky and unmanageable, so I decided to stop trying to solve the crackme this way. I still think IL rewriting has a lot of promise as a deobfuscation technique, but it’s probably easier to use in situations that don’t involve drastic modification to the control flow. In addition, as the script got to be longer and more complicated, it probably would’ve been better to switch to using the C++ or Rust APIs. I think the C++ API is the most supported way of doing this, so maybe I’ll try that next time.
Binary Patching
I did eventually manage to get a nice-looking decompilation, but it was with binary patching using keystone and capstone. This is a much more well-documented topic than IL modification, so I’m not going to go through the script line by line, but I’ll still provide it here for reference. Essentially, it replaces jumps to the dispatcher with jumps to the real target address, and it replaces the cmove/cmovne/cmovb instructions that set the state with the corresponding conditional jump instructions.
I looked for jumps to the dispatcher by using capstone to search for the pattern mov al, ??; jmp 0x31b8, which worked almost all of the time but wasn’t quite perfect. The manual patches in the script were to deal with dispatcher jumps that didn’t quite fit the pattern. For example, mov al, 0x9f; xor edi, edi; jmp 0x31b8 wasn’t detected as a jump to be unflattened because of the extra xor edi, edi in the middle. There’s probably a better way to do this than trying to match exact sequences of instructions, which will likely be the topic of a future blog post once I look into it more.
The script:
import struct
import pefile
from capstone import *
from capstone.x86 import *
from keystone import *
baseaddr = 0x0
dispatcher_rva = baseaddr + 0x31b8
jumptable_rva = baseaddr + 0x418c
pe = pefile.PE('crackme.exe')
mapped = bytearray(pe.get_memory_mapped_image())
text = mapped[0x1000:0x3250]
def get_jump_target(idx):
jump_offset = jumptable_rva + (idx - 1) * 4 - baseaddr
target = struct.unpack('<i', mapped[jump_offset:jump_offset+4])[0] + jumptable_rva
return target
def get_cond_targets(cmov, first_op, second_op):
if first_op.operands[0].reg == X86_REG_EAX:
eax_target = get_jump_target(first_op.operands[1].imm)
other_target = get_jump_target(second_op.operands[1].imm)
else:
eax_target = get_jump_target(second_op.operands[1].imm)
other_target = get_jump_target(first_op.operands[1].imm)
return f'{cond_mappings[cmov.mnemonic]} {hex(other_target)}; jmp {hex(eax_target)}'
md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True
md.skipdata = True
ks = Ks(KS_ARCH_X86, KS_MODE_64)
cond_mappings = {'cmove': 'je', 'cmovne': 'jne', 'cmovb': 'jb'}
# disassemble the entire .text section
instructions = []
for i in md.disasm(text, baseaddr+0x1000):
instructions.append(i)
# find all the jumps to the dispatcher
dispatcher_jmps = []
for i in range(len(instructions)):
insn = instructions[i]
if insn.mnemonic == 'jmp':
for op in insn.operands:
if op.type == X86_OP_IMM:
if op.imm == dispatcher_rva:
dispatcher_jmps.append(i)
compare_locs = {}
for jmp_loc in dispatcher_jmps:
insn = instructions[jmp_loc]
prev = instructions[jmp_loc - 1]
equiv = None
# Patch direct jumps to dispatcher:
# 00002c30 b0d1 mov al, 0xd1
# 00002c32 e981050000 jmp 0x31b8
if prev.mnemonic == 'mov':
patch_len = insn.size + prev.size
patch_loc = prev.address
target = get_jump_target(prev.operands[1].imm)
equiv = f'jmp {hex(target)}'
# Patch conditional jumps to dispatcher:
# 00002c1e b936000000 mov ecx, 0x36
# 00002c23 b8c8000000 mov eax, 0xc8
# 00002c28 0f45c1 cmovne eax, ecx {0x36}
# 00002c2b e988050000 jmp 0x31b8
elif 'cmov' in prev.mnemonic:
compare_locs[prev.address] = jmp_loc - 1
first_op = instructions[jmp_loc - 3]
second_op = instructions[jmp_loc - 2]
patch_len = insn.size + prev.size + first_op.size + second_op.size
patch_loc = first_op.address
equiv = get_cond_targets(prev, first_op, second_op)
if equiv is not None:
patch_opcodes = ks.asm(equiv, as_bytes=True, addr=patch_loc)[0]
padded = patch_opcodes + b'\x90' * (patch_len - len(patch_opcodes))
pe.set_bytes_at_rva(patch_loc - baseaddr, padded)
# Find the conditional mov instructions reached by a jump, ex:
# 00001267 483b5c2438 cmp rbx, qword [rsp+0x38 {var_e0}]
# 0000126c b999000000 mov ecx, 0x99
# 00001271 b8cd000000 mov eax, 0xcd
# 00001276 e96e0e0000 jmp 0x20e9
# [...]
# 000020e9 0f42c1 cmovb eax, ecx
# 000020ec e9c7100000 jmp 0x31b8
compare_jumps = {}
for i in range(len(instructions)):
insn = instructions[i]
if insn.mnemonic == 'jmp':
for op in insn.operands:
if op.type == X86_OP_IMM:
if op.imm in compare_locs:
compare_jumps[i] = compare_locs[op.imm]
for i, compare_loc in compare_jumps.items():
insn = instructions[i]
first_op = instructions[i - 2]
second_op = instructions[i - 1]
patch_len = insn.size + first_op.size + second_op.size
patch_loc = first_op.address
cmov = instructions[compare_loc]
equiv = get_cond_targets(cmov, first_op, second_op)
patch_opcodes = ks.asm(equiv, as_bytes=True, addr=patch_loc)[0]
padded = patch_opcodes + b'\x90' * (patch_len - len(patch_opcodes))
pe.set_bytes_at_rva(patch_loc - baseaddr, padded)
# Patch the last few cases where the instruction pattern is different from the others
# Ex., an extra instruction between the mov and the jump to the dispatcher:
# 000024cf b09f mov al, 0x9f
# 000024d1 31ff xor edi, edi {0x0}
# 000024d3 e9e00c0000 jmp 0x31b8
pe.set_bytes_at_rva(0x1272, b'\xe9\x6a\x03\x00\x00')
pe.set_bytes_at_rva(0x1381, b'\xe9\x48\x0d\x00\x00')
pe.set_bytes_at_rva(0x1518, b'\xe9\x89\x0e\x00\x00')
pe.set_bytes_at_rva(0x24d3, b'\xe9\x16\xf6\xff\xff')
pe.set_bytes_at_rva(0x28eb, b'\xe9\xf2\xeb\xff\xff')
# Insert a mov ebp, ecx instruction after the XOR loop, since we're not reaching the mov in the dispatcher
pe.set_bytes_at_rva(0x28e6, b'\x89\xcd')
pe.write('patched_cs.exe')
Solving The Crackme
With the control flow flattening removed, we obtain the following decompilation:
1400014f1 while (true)
1400014f1 input = var_58
1400014f6 int32_t state = -0x5a4c3827
1400014fc rax_2.b = 0x78
1400014fe int64_t i = 0
140001505 int64_t var_e8_1 = 0
14000281c BOOL input_char
14000281c int32_t or_chars
14000281c input_char, or_chars = IsDebuggerPresent()
14000281c
140002824 if (input_char == 0)
140001b95 if (rcx_3 != 0)
14000126c for (; i u< rcx_3; i += 1)
1400024b4 uint8_t xored
1400024b4
1400024b4 if (i u< 0x10)
1400024ba input_char.b = input[i]
1400017e8 uint64_t substituted
1400017e8 substituted.b = sbox[zx.q(input_char.b)]
1400012d7 xored = (state u>> 8).b ^ substituted.b
1400012d7
1400012be if (i u>= 0x10 || i u>= 0x10)
1400031f3 fail()
1400031f3 noreturn
1400031f3
1400012c4 xor_buf[i] = xored
14000264b state = (zx.d(xored) << 8 ^ state) * 0xc2b2ae3d + 0x1f2e3d4c
14000264b
14000140f while (i_1 u< 0x10)
1400028ca if (i_1 u>= 0x10)
140003200 fail()
140003200 noreturn
140003200
1400028d7 or_chars.b = target[i_1]
1400028db or_chars.b ^= xor_buf[i_1]
1400028e0 or_chars.b |= or_chars_1.b
1400028e3 i_1 += 1
1400028e6 or_chars_1 = or_chars
1400028e6
140002878 input_char.b = or_chars_1.b == 0
140002878
140002c1e if ((input_char.b & 1) != 0)
14000137f input_char.b = 0xe0
14000137f
1400012e6 for (; i_2 u< 0x20; i_2 += 1)
140002a9f if (i_2 u>= 0x20)
140003214 label_140003214:
140003214 fail()
140003214 noreturn
140003214
140002ab1 console_msg[i_2] = access_granted[i_2] ^ 0xcd
140002ab1
14000278b write_console(check_alnum(&console_msg, 0x20), 0x20)
140002790 continue
140002790
1400024cf input_char.b = 0x9f
1400024cf
1400020df for (i_2 = 0; i_2 u< 0xe; i_2 += 1)
140001f85 if (i_2 u>= 0xe)
140001f85 goto label_140003214
140001f85
140001f97 console_msg[i_2] = access_denied[i_2] ^ 0xcd
140001f97
1400017c9 write_console(check_alnum(&console_msg, 0xe), 0xe)
After all that, it turns out the validation function is very simple. First, there’s a debugger check, which we don’t care about since we’re solving everything statically. Then, each character of the input is encrypted using a custom algorithm that uses a substitution box followed by an XOR with a running keystream generated by an LCG. The encrypted input is XORed with the target value 8b4bb25cd9a29e4a009aac337f6d359d, and each byte of the result is ORed together and compared to 0, a way of checking if all bytes in the XORed result are 0. If the comparison to the target value succeeds, the string Congratulations, Access granted is XOR decoded with the key 0xcd and printed to the console.
The encryption algorithm is the following:
MASK32 = 0xffff_ffff
sbox = bytearray.fromhex("""51c25966c4e9e36c1e1a19cab4db17717241fa3b3739c626a8e8f99502da556f
702449b957d9de69a4150386bd625d6553fdfe6d892dd56bf7ec21a1602c22f4
18c5cb807f4c52768442f5932b48835bb18e0fba081661f8bf0d05dc1f9e9b0e
df38e1c3b7999d7e5aeabba340ac3eab00f0b690964acf63bc27ce3faf752ff1
cc58aa44f6740c28871bed2e68ad4b8c01097d46368288d4549a50a2d37847d2
6e0a7c8def35a9dda025d710e6f3b23381d0ee731cfb8556ffc1113cd177310b
6406290713d82a98325cc9794f9f8fb5e4b36aeb91b8ae7b43cdfc5f8b30e7d6
c05e4da6453d7a141d679c2092e5f2344e94a7c7be97c8a5e023b0048ae23a12""")
def encrypt(data):
state = 0xa5b3c7d9
xored = bytearray()
for i in data:
substituted = sbox[i]
xored_char = ((state >> 8) & 0xff) ^ substituted
xored.append(xored_char)
state = ((((i << 8 ^ state) * 0xc2b2ae3d) & MASK32) + 0x1f2e3d4c) & MASK32
return xored
Inverting the algorithm is straightforward. The generation of the keystream from the LCG is the same for encryption and decryption, so we just need to run the ciphertext through an inverse S-box and apply the XOR with the keystream again.
inv_sbox = bytearray(b'\x00'*0x100)
for i in range(len(sbox)):
inv_sbox[sbox[i]] = i
def decrypt(data):
state = 0xa5b3c7d9
xored = bytearray()
for i in data:
xored_char = ((state >> 8) & 0xff) ^ i
xored.append(inv_sbox[xored_char])
state = ((((i << 8 ^ state) * 0xc2b2ae3d) & MASK32) + 0x1f2e3d4c) & MASK32
return xored
Applying the decryption function to the target sequence of bytes 8b4bb25cd9a29e4a009aac337f6d359d, we obtain the password: ElementaryMyDr!!.