I stumbled on the M/o/Vfuscator compiler that only use
It proves to be surprisingly easy to generate such code. This blog post shows how it is done using a somewhat idealized version of the CPU, but it should be easy to map it on the real X86 ISA. The examples use registers
mov
instructions in the generated code1 using ideas from the "mov
is Turing-complete" paper.It proves to be surprisingly easy to generate such code. This blog post shows how it is done using a somewhat idealized version of the CPU, but it should be easy to map it on the real X86 ISA. The examples use registers
r0
, r1
, …, and a mov
instruction that can be in one of the three forms
mov rdest, rsrc mov rdest, [rsrc + roffset] ; Load indexed mov [rdest + roffset], rsrc ; Store indexedIt is possible to use an immediate constant instead of
rsrc
, or for the rdest
and roffset
in the addresses. The width of the move can be 8 bits (movb
) or 32 bits (movl
). We'll only use 8-bit values for the computations — wider widths can be handled by working on one byte at a time. Pointers are assumed to be 32 bits wide.if-statements
We'll begin with if-statements. Comparing two registersr1 == r2
and placing the result in r0
can be done as
movb [&scratch + r1], 0 movb [&scratch + r2], 1 movb r0, [&scratch + r1]where
scratch
is an array
uint8_t scratch[256];We cannot have any flow control when only using
mov
instructions, so the idea is to handle if-statements by executing both the true and false parts, and ignoring the result from the false part. E.g. code of the form
if (r1 == r2) r0 = r3; else r0 = r4;is compiled as if written
r0 = (r1 == r2) ? r3 : r4;which can be done as
movb [&scratch + r1], 0 movb [&scratch + r2], 1 movb r0, [&scratch + r1] movb [&scratch + 1], r3 movb [&scratch + 0], r4 movb r0, [&scratch + r0]This works fine as long as everything is in registers, or both the true and false parts write to the same memory. The case where they write different memory need to be handled by writing to a dummy location for the cases where no value should be stored. For example,
if (r1 == r2)
a = r3;
is compiled to
movb [&scratch + r1], 0 movb [&scratch + r2], 4 movb r0, [&scratch + r1] movl [&tmp + 4], &a movl [&tmp + 0], &dummy movl r0, [&tmp + r0] movb [r0 + 0], r3where
tmp
and dummy
are defined as
uint32_t tmp[2]; uint8_t dummy;
Addition
Addition by a constant can be implemented by leveraging the addition done within the addressing mode by indexing in a constant arrayconst uint8_t constants[512] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, ... 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, ... 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff };so e.g. adding 23 to
r0
is done as
movb r0, [&(constants+23) + r0]
Loops etc.
We now have nearly everything needed to implement a Turing machine where the tape is implemented as an array that is indexed by the current position. The only problem is that we only can do straight line code... So we cheat by adding a jmp
instruction at the end of our program, jumping back to the first instruction. This works fine for implementing a Turing machine, as it is typically written as
while (1) { if (current_state == 0) { /* Do the work for this state */ current_state = /* State transition */ } else if (current_state == 1) { /* Do the work for this state */ current_state = /* State transition */ } else if ... }But the same approach works for normal languages too, by treating each basic block as a state, and to do a state transition for each branch.
Program termination
Finally, we need a way to terminate the program. This is done by writing to address 0
movb [0 + 0], 0which terminates the process with a segmentation fault. Conditional termination is done with the same dummy-variable trick as for the conditional store above.
1. This is not completely true – it need exactly one
jmp
instruction in addition to the mov
instructions.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.