Tuesday, 23 May 2017

assembly - Understanding x86 MOV Syntax



I think this is an easy (perhaps stupidly easy) question to answer, but after almost two hours of Googling, I've struck out. I'm pretty sure my problem is that I just don't understand what the syntax is doing.



I'm looking at some disassembled code in IDA and I have no idea what the following is doing:



mov dl, byte_404580[eax]


If I jump to byte_404580, I find .data:00404580 byte_404580 db 69h, which tells me that the value stored there is 0x69. But I don't see how this is used.



Let me provide the context this code appears in:



mov eax, 0x73             ; Move hex 73 to EAX
and eax, 0x0F             ; Keep only the low 4 bits of EAX
mov dl, byte_404580[eax]  ; MAGIC


With the above assumption that EAX is initially 0x73, I get DL=0x76. I have tried varying the values of EAX to find some pattern, but I haven't been able to figure out what is happening.


Answer



This syntax is used to denote memory addressing, similar to C's array syntax (array[index]). Your example is equivalent to computing the expression 0x404580 + (eax & 0x0F), treating it as an address, and taking one byte from this address. This suggests that the data at 0x404580 is an array of bytes (most likely 0x10 elements, based on the mask).
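
As a rough illustration of the address computation described above, here is a minimal Python sketch. The 16-entry table size is only the guess made above, and the table contents are mostly placeholders: the question reveals just two bytes (0x69 at index 0 and 0x76 at index 3). The names BASE, table and load_byte are made up for this example.

```python
BASE = 0x404580  # address of byte_404580, taken from the question

# Only two entries are known from the question: index 0 (0x69) and
# index 3 (0x76). Every other entry is a placeholder, not real data.
table = bytearray(16)
table[0] = 0x69
table[3] = 0x76


def load_byte(eax):
    """Model `and eax, 0x0F` followed by `mov dl, byte_404580[eax]`."""
    index = eax & 0x0F           # and eax, 0x0F
    address = BASE + index       # effective address of the memory operand
    dl = table[address - BASE]   # read one byte from that address
    return address, dl


address, dl = load_byte(0x73)
print(hex(address), hex(dl))  # 0x404583 0x76
```

With EAX = 0x73 the mask leaves 3, so the byte is read from 0x404583, which matches the DL = 0x76 observed in the question.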



You can stop reading here if that answers your question.






If you go into Options > General and set "Show Opcode Bytes" to a non-zero value, you will see the actual values of the instruction bytes and be able to cross-reference them with the processor documentation to understand what's happening. This is usually not required, but it can be educational. For example:



mov dl, byte_404580[eax]


can be expressed as a sequence of bytes:



8A 14 05 80 45 40 00


Using Intel's Architecture Manual, Volume 2A, this can be decoded as follows:



8A - instruction opcode for MOV r8, r/m8 - determines the operand sizes

14 - the Mod R/M byte:
      | 00010100b
  Mod | 00
  R/M |      100
  Reg |   010

Mod R/M combination 00-100 is specified as "followed by the SIB byte".
Reg 010 stands for register DL/DX/EDX, the destination operand.

05 - the SIB byte:
        | 00000101b
  Scale | 00
  Index |   000
  Base  |      101

With Mod 00, this combination is specified as [scaled value of EAX] + a 32-bit displacement, with no base register. Scale 00 means the index is multiplied by 1.

80 45 40 00 - the displacement itself, 0x404580


Adding these together, you get:



this instruction takes one byte from EAX + 0x404580 and moves it into the DL register.
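
The decoding steps above can be double-checked with a short Python sketch. It is not a general x86 decoder: it handles only this one form (opcode 8A, Mod 00, R/M 100, SIB base 101), and the function name decode_mov_r8_sib is made up for this example. The register tables follow the standard x86 register numbering.

```python
import struct

REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_r8_sib(code):
    """Decode `8A /r` with Mod=00, R/M=100 and SIB base=101 only."""
    opcode, modrm, sib = code[0], code[1], code[2]

    # Mod R/M byte layout: Mod (2 bits) | Reg (3 bits) | R/M (3 bits)
    mod = modrm >> 6
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111

    # SIB byte layout: Scale (2 bits) | Index (3 bits) | Base (3 bits)
    scale = sib >> 6
    index = (sib >> 3) & 0b111
    base = sib & 0b111

    if opcode != 0x8A or mod != 0b00 or rm != 0b100 or base != 0b101:
        raise ValueError("form not handled by this sketch")
    if index == 0b100:
        raise ValueError("index 100 means 'no index'; not handled here")

    # The 32-bit displacement follows the SIB byte, little-endian.
    (disp32,) = struct.unpack_from("<I", code, 3)
    return f"mov {REG8[reg]}, [{REG32[index]}*{1 << scale} + {disp32:#x}]"


print(decode_mov_r8_sib(bytes.fromhex("8A 14 05 80 45 40 00")))
# mov dl, [eax*1 + 0x404580]
```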






IDA uses this information to infer that there's an array of byte-sized values at 0x404580. It tries to name the location if it doesn't yet have a name, tries to resize the named item there to span the right number of bytes, and transforms the displayed expression to byte_404580[eax]. Because it doesn't necessarily know how many elements the array has, it doesn't actually create an array there.

