I have written a program that reads a binary executable compiled for the x86 (Intel) architecture and interprets the machine code it contains, executing it instruction by instruction. Reading the executable, extracting the sections, and building a virtual memory image that includes the executable code all work smoothly, and I was able to run some very simple programs (for example: int main() {return 0;}).
To decode the instructions I am relying on the Intel manual (in English). I am also using objdump -d to display the executable's disassembly and compare it with my results.
My problem is decoding the following sequence of bytes (hexadecimal):
67 89 04 18
objdump correctly states that this means:
mov %eax, (%eax, %ebx, 1)
My problem arises when I decode it by hand, following the manual:
67: address-size override prefix;
89: opcode for mov from a register to a memory/register operand;
04: ModR/M byte indicating that the first argument is %eax, that a SIB byte is needed, and that the displacement is zero;
18: SIB byte indicating that the last argument is %eax+%ebx.
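The field split used in the steps above can be sketched as follows. This is only a minimal illustration that assumes the 32-bit addressing tables of the manual; the type and function names (ModRM, Sib, split_modrm, split_sib, needs_sib32, check_example) are hypothetical and are not taken from the manual or from my interpreter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper types for this sketch. */
typedef struct { uint8_t mod, reg, rm; } ModRM;
typedef struct { uint8_t scale, index, base; } Sib;

/* ModR/M layout: mod = bits 7..6, reg = bits 5..3, rm = bits 2..0. */
ModRM split_modrm(uint8_t b) {
    ModRM m = { (uint8_t)(b >> 6), (uint8_t)((b >> 3) & 7), (uint8_t)(b & 7) };
    return m;
}

/* SIB layout: scale = bits 7..6 (factor is 1 << scale),
   index = bits 5..3, base = bits 2..0. */
Sib split_sib(uint8_t b) {
    Sib s = { (uint8_t)(b >> 6), (uint8_t)((b >> 3) & 7), (uint8_t)(b & 7) };
    return s;
}

/* Assumption: 32-bit addressing tables, where mod != 3 and rm == 4
   mean that a SIB byte follows the ModR/M byte. */
int needs_sib32(ModRM m) {
    return m.mod != 3 && m.rm == 4;
}

/* Usage: the ModR/M and SIB bytes from the sequence 67 89 04 18. */
void check_example(void) {
    ModRM m = split_modrm(0x04);
    assert(m.mod == 0 && m.reg == 0 && m.rm == 4);   /* reg 0 = %eax */
    assert(needs_sib32(m));
    Sib s = split_sib(0x18);
    assert(s.scale == 0 && s.index == 3 && s.base == 0); /* %ebx * 1 + %eax */
}
```

Register numbers follow the usual encoding order (0 = %eax, 1 = %ecx, 2 = %edx, 3 = %ebx), which is how 04 and 18 lead to (%eax, %ebx, 1).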
The catch is that I interpreted both the ModR/M and the SIB bytes using the 32-bit tables. That means that at this point both the operand size and the address size are 32 bits. However, the address-size prefix had to be used, which would mean that the original instruction (without the prefix) has a 32-bit operand and a 16-bit address. Is that correct?
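To make the size question concrete, here is a minimal sketch of the rule as the manual states it for 16-bit and 32-bit code segments: the prefixes 66 and 67 select the non-default operand size and address size, respectively. The names Sizes and effective_sizes are hypothetical; the sketch ignores all other prefixes and 64-bit mode, and it takes the segment default as a parameter rather than assuming one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    int operand_bits;   /* effective operand size */
    int address_bits;   /* effective address size */
    size_t prefix_len;  /* number of 66/67 prefix bytes consumed */
} Sizes;

/* Scans leading 0x66 / 0x67 bytes. default_bits must be 16 or 32 and
   represents the default size of the current code segment. */
Sizes effective_sizes(const uint8_t *code, size_t len, int default_bits) {
    Sizes s = { default_bits, default_bits, 0 };
    int other = (default_bits == 32) ? 16 : 32;
    while (s.prefix_len < len) {
        uint8_t b = code[s.prefix_len];
        if (b == 0x66) {
            s.operand_bits = other;   /* operand-size override */
        } else if (b == 0x67) {
            s.address_bits = other;   /* address-size override */
        } else {
            break;                    /* first non-66/67 byte ends the scan */
        }
        s.prefix_len++;
    }
    return s;
}

/* Usage: the sequence in question, under both possible defaults. */
void check_sizes(void) {
    const uint8_t code[] = { 0x67, 0x89, 0x04, 0x18 };
    Sizes in32 = effective_sizes(code, sizeof code, 32);
    assert(in32.operand_bits == 32 && in32.address_bits == 16);
    Sizes in16 = effective_sizes(code, sizeof code, 16);
    assert(in16.operand_bits == 16 && in16.address_bits == 32);
}
```

Under this rule neither default gives a 32-bit operand together with a 32-bit address for a sequence that starts with 67, which is exactly what makes the objdump output puzzling to me.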
How is it possible to have an instruction with a 32-bit operand and a 16-bit address? I tried to assemble code containing such an instruction with gas (GNU Assembler), and it returns an error stating that this combination is impossible. Why, then, is it the default?
A simple question: what kind of C program, when compiled, generates the sequence 67 89 04 18? When you run that program, what does this instruction do when the addresses and registers hold values that do not fit in 16 bits? – Victor Stafusa
One can generate quite a similar instruction: int main() { volatile int a[] = {1, 2, 3, 4, 5}; volatile int i = 2; a[i] = i; }. Compiling it with gcc -O3 produces mov %edx, 8(%esp, %eax, 4). Quite similar to the example I used. – Guilherme Bernal
Interestingly, this is encoded as 89 54 84 08. No prefix is used. Now I'm confused... Why does one case require the prefix and the other not? – Guilherme Bernal
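For reference, the unprefixed encoding 89 54 84 08 can be decoded by hand with a small sketch like the one below. It assumes 32-bit addressing and covers only opcode 89 with a SIB byte and either no displacement or an 8-bit displacement. The names REG32, format_mov89 and check_format are hypothetical, and the output format simply mimics the AT&T-style strings quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* 32-bit register names in encoding order. */
static const char *const REG32[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Formats opcode 0x89 (mov r32 -> r/m32) for the SIB forms with mod 0 or 1.
   Returns the instruction length in bytes, or -1 for forms this sketch
   does not cover. */
int format_mov89(const uint8_t *p, size_t len, char *out, size_t outsz) {
    if (len < 3 || p[0] != 0x89) return -1;
    unsigned mod = p[1] >> 6, reg = (p[1] >> 3) & 7, rm = p[1] & 7;
    if (rm != 4 || mod > 1) return -1;        /* SIB forms, disp0/disp8 only */
    unsigned scale = p[2] >> 6, index = (p[2] >> 3) & 7, base = p[2] & 7;
    if (index == 4 || (mod == 0 && base == 5)) /* no-index, disp32-only forms */
        return -1;
    if (mod == 1) {
        if (len < 4) return -1;
        int disp = (int8_t)p[3];              /* disp8 is sign-extended */
        snprintf(out, outsz, "mov %%%s, %d(%%%s, %%%s, %d)",
                 REG32[reg], disp, REG32[base], REG32[index], 1 << scale);
        return 4;
    }
    snprintf(out, outsz, "mov %%%s, (%%%s, %%%s, %d)",
             REG32[reg], REG32[base], REG32[index], 1 << scale);
    return 3;
}

/* Usage: the gcc -O3 encoding from the comment above. */
void check_format(void) {
    const uint8_t code[] = { 0x89, 0x54, 0x84, 0x08 };
    char buf[64];
    int n = format_mov89(code, sizeof code, buf, sizeof buf);
    assert(n == 4);
    assert(strcmp(buf, "mov %edx, 8(%esp, %eax, 4)") == 0);
}
```

Here 54 splits into mod = 1, reg = 2 (%edx), rm = 4, and 84 splits into scale = 2, index = 0 (%eax), base = 4 (%esp), followed by the displacement byte 08.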