Stack and memory alignment in MASM x64

Viewed 237 times

18

I have an application that takes TAC code (an IL) and generates assembly for x64 MASM. The problem is not a compilation error but rather (at least I think so) the way the stack is being built. The program below computes the LCM (least common multiple), so if I enter 3 and 5 it should return 15, but it returns random numbers such as 1281237 and 230932811.
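
For reference, the expected result can be written in C as a minimal sketch. The names gcd and lcm are illustrative and do not come from the generated code. A Euclidean loop finds the GCD, then the LCM is the product divided by it, which is what the loop at LABEL_1 and the MUL/DIV after it appear to be doing.

```c
/* Greatest common divisor via the Euclidean algorithm:
 * repeatedly replace (a, b) with (b, a mod b) until b is 0. */
unsigned long long gcd(unsigned long long a, unsigned long long b)
{
    while (b != 0) {
        unsigned long long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Least common multiple. Dividing before multiplying keeps the
 * intermediate value small. Returns 0 if either input is 0. */
unsigned long long lcm(unsigned long long a, unsigned long long b)
{
    if (a == 0 || b == 0)
        return 0;
    return a / gcd(a, b) * b;
}
```

Calling lcm(3, 5) gives 15, the value the assembly program is expected to print.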

I know that on x86 the stack is aligned to 4 bytes, and I have read in several places that on x64 it should be aligned to 16 bytes. But when I use 16 I get a memory error at run time, so I am aligning to 8 bytes instead. That may be wrong; I am not sure, since there is almost no MASM64 documentation available.

extern ExitProcess:proc
extern printf:proc
extern scanf:proc

includelib kernel32.lib
includelib user32.lib
includelib msvcrt.lib
include invoke_macros.asm

.data
scan BYTE 'scanf:',0
formatInt BYTE '%d',0
msg BYTE 'Return = %d',0
printInt BYTE 'printf: %d', 0ah, 0h
f1  BYTE 'Fake parameter #1 ( 137 - 279 ):',0

.data?
din dq ?

.code

start PROC
invoke  printf, addr f1
invoke  scanf, addr formatInt, addr din
MOV rax, din
PUSH rax
CALL sub_411420
ADD rsp, 8
invoke printf, addr msg, rax
RET
start ENDP

sub_411B00 proc
PUSH rbp
MOV rbp, rsp
SUB rsp, 48
MOV rax, [rbp + 16]
MOV [rbp - 24], rax
MOV rax, [rbp + 12]
MOV [rbp - 16], rax
LABEL_1:
MOV rax, [rbp - 24]
MOV rbx, [rbp - 16]
CDQ
DIV rbx
MOV rax, rdx
MOV [rbp - 8], rax
MOV rax, [rbp - 16]
MOV [rbp - 24], rax
MOV rax, [rbp - 8]
MOV [rbp - 16], rax
MOV rax, [rbp - 8]
CMP rax, 0
JG LABEL_1
MOV rax, [rbp + 12]
MOV rbx, [rbp + 16]
MUL rbx
MOV rbx, [rbp - 24]
CDQ
DIV rbx
MOV [rbp - 48], rax
JMP LABEL_4
LABEL_4:
MOV eax, [rbp - 48]
ADD rsp, 48
POP rbp
RET
sub_411B00 endp

sub_411420 proc
PUSH rbp
MOV rbp, rsp
SUB rsp, 48
PUSH [rbp - 24]
invoke  printf, addr scan
invoke  scanf, addr formatInt, addr din
MOV rax, din
MOV [rbp - 24], rax
PUSH [rbp - 16]
invoke  printf, addr scan
invoke  scanf, addr formatInt, addr din
MOV rax, din
MOV [rbp - 16], rax
PUSH [rbp - 24]
PUSH [rbp - 16]
PUSH [rbp + 8]
CALL sub_411B00
MOV [rbp - 8], rax
PUSH [rbp - 8]
POP rax
invoke  printf, addr  printInt, rax
PUSH rax
MOV eax, [rbp - 8]
ADD rsp, 96
POP rbp
RET
sub_411420 endp

end

So that is my question: how do stack and memory alignment work in x64? Thank you!

  • 2

    I edited the topic, Sérgio.

  • 2

    I think a question at this "level" will be answered more easily on the English Stack Overflow. There is also another forum that seems to be quite active: http://masm32.com/board/index.php?board=13.0 .

  • 1

    @Bernardomeneghini add comments to this code, young man! Make life easier for those who will help you! :)

  • x86-64 does not crash a program on a misaligned access; it only runs slower. That is why many C programs run on x86/x86-64 and break when ported to ARM or another platform.

1 answer

1

Hello, you can take a peek at this code:

https://github.com/osdeving/asm-snake

You will notice that all the stack frames are set up manually, so you can take some of the functions that receive parameters and/or allocate stack space for local variables and adapt them to your code.

For example:

;
; Function ListAppend.
; Parameters:  DWORD pList, DWORD x, DWORD y
; Returns:     the head of the list in EAX.
; Description: appends to the end of the list.
;
ListAppend:
    PUSH EBP                         ; Set up the stack frame.
    MOV EBP, ESP
    SUB ESP, 8                       ; Allocate space for two pointers: new_list and last.

    PUSH 12                          ; dwBytes = 12 bytes.
    PUSH 0x0040                      ; uFlags = GPTR.
    CALL [GlobalAlloc]
    MOV DWORD [EBP - 8], EAX         ; Store in the local variable new_list.

etc...

Being 64-bit will not change much about the way you handle the stack. I believe that by working from this code, with its more 'raw' use of local variables, returns, and function calls, you can solve your problem. The excerpt above, for example, is from the implementation of a linked list. The code I wrote is thoroughly commented and easy to follow.
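
The one stack detail I would double-check on 64-bit Windows is the 16-byte alignment mentioned in the question. As a rough sketch in C (not taken from the asm-snake repository; the function name rsp_aligned_at_call and the starting value 0x1000 are made up for illustration), this models the arithmetic under the Microsoft x64 calling convention: RSP is a multiple of 16 just before a CALL, the CALL pushes an 8-byte return address, and every PUSH or SUB rsp, N in the callee moves RSP further down.

```c
#include <stdint.h>

/* Models RSP inside a function that was entered through CALL.
 * pushes      = number of 8-byte PUSH instructions executed so far
 * local_bytes = N from "SUB rsp, N"
 * Returns 1 if RSP is a multiple of 16 at that point, i.e. if it is
 * safe to issue the next CALL under the assumed convention. */
int rsp_aligned_at_call(unsigned pushes, unsigned local_bytes)
{
    uint64_t rsp = 0x1000;          /* hypothetical 16-aligned RSP in the caller */
    rsp -= 8;                       /* CALL pushed the return address */
    rsp -= 8ull * pushes;           /* each PUSH moves RSP down by 8 */
    rsp -= local_bytes;             /* SUB rsp, N */
    return (rsp % 16) == 0;
}
```

With PUSH rbp followed by SUB rsp, 48, as in the question, rsp_aligned_at_call(1, 48) returns 1, but one more PUSH before the next call (rsp_aligned_at_call(2, 48)) returns 0. A function with no prologue at all is also misaligned (rsp_aligned_at_call(0, 0) returns 0), while SUB rsp, 40 (32 bytes of shadow space plus 8 bytes of padding) gives 1. What the invoke macro in the question does to RSP is not shown, so this only models the instructions written by hand.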

I hope this helps, even though it does not solve your problem directly (I would have to debug your code, and with so many MASM directives it is hard to do that just by eyeballing it).
