How to extract hexadecimal code from a nasm-compiled executable?

I have an executable written in assembly language and assembled with NASM.

Is there a way to obtain the hexadecimal values of the bytes the assembler produced, so that I can feed them to a disassembler (i.e., discover the opcodes that were generated)?

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    FILE *file;
    char *buffer;
    unsigned long fileLen;
    file = fopen( "teste.o", "rb");
    if (!file) {
        printf("error\n");
        return 1; /* stop here: every call below needs a valid FILE* */
    }
    fseek(file, 0, SEEK_END);
    fileLen=ftell(file);
    fseek(file, 0, SEEK_SET);
    buffer=(char *)malloc(fileLen+1);
    if (!buffer) {
        fprintf(stderr, "Memory error!");
        fclose(file);
        return 0;
    }
    fread(buffer, fileLen, 1, file);
    fclose(file);

    for (unsigned long c=0;c<fileLen;c++) {
        printf("%.2hhx ", buffer[c]);
        if (c % 4 == 3) {
            printf(" ");
        }
        if (c % 16 == 15) {
            printf("\n");
        }
    }
    printf("\n");
    free(buffer);
}
  • Try to explain more clearly what you want: be more specific and show what you have done so far in terms of code. "I need the code" is not a good way to start; here we help you solve problems in code you are implementing yourself. See how to ask in the help center.

  • #include <stdio.h> #include <stdlib.h> #include <string.h> int main() { FILE *file; char *buffer; unsigned long fileLen; file = fopen("teste.o", "rb"); if (!file) { printf("error\n"); } fseek(file, 0, SEEK_END); fileLen=ftell(file); fseek(file, 0, SEEK_SET); buffer=(char *)malloc(fileLen+1); if (!buffer) { fprintf(stderr, "Memory error!"); fclose(file); return 0; } fread(buffer, fileLen, 1, file); fclose(file);

  • for (unsigned int c=0;c<fileLen;c++) { printf("%.2hhx ", buffer[c]); if (c % 4 == 3) { printf(" "); } if (c % 16 == 15) { printf("\n"); } } printf("\n"); free(buffer); }

  • This is the code so far. It came out badly formatted here, but I think it will be more readable once pasted into an editor.

    I already edited your question and put the code there.

1 answer

The language or compiler you used has little influence on the format of the final executable. On Linux it is very likely an ELF (Executable and Linkable Format) file; on Windows it will be a PE (Portable Executable) file. Once you know the format of your executable, you need to extract its sections. (You can also write code that handles both formats, or others, by checking the magic bytes to tell them apart.)
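As a rough illustration of the magic-byte check, here is a minimal C sketch. The names `exe_format` and `detect_format` are made up for this example. The signatures are the usual ones: an ELF file starts with the bytes 0x7F 'E' 'L' 'F', and a PE file starts with "MZ" and stores, at byte 0x3C, the little-endian offset of the "PE\0\0" signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names, introduced only for this sketch. */
typedef enum { FMT_UNKNOWN, FMT_ELF, FMT_PE } exe_format;

/* Guess the container format from the first bytes of the file. */
exe_format detect_format(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    /* ELF: 0x7F followed by the letters "ELF". */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return FMT_ELF;

    /* PE: DOS header "MZ"; bytes 0x3C..0x3F hold the offset of "PE\0\0". */
    if (len >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        uint32_t pe_off = (uint32_t)buf[0x3c]
                        | (uint32_t)buf[0x3d] << 8
                        | (uint32_t)buf[0x3e] << 16
                        | (uint32_t)buf[0x3f] << 24;
        if (pe_off <= len - 4 && memcmp(buf + pe_off, "PE\0\0", 4) == 0)
            return FMT_PE;
    }
    return FMT_UNKNOWN;
}
```

You would call it on the buffer you already read with `fread`, for example `detect_format((unsigned char *)buffer, fileLen)`. A COFF object file built for Windows carries neither signature, so this sketch would report it as unknown.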

How the sections are stored in the file differs between formats, but there is always a header with general information such as the architecture and the location of the section table and the symbol table. Walk through the section table and check each section's flags. Compilers often produce sections that are neither code nor data, such as .comment. The flags associated with each section let you identify the ones that contain code (there may be more than one).
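For the ELF case, walking the section table could look roughly like the sketch below. It assumes a 64-bit ELF whose byte order matches the machine running the code, and it relies on the `<elf.h>` header shipped with glibc on Linux. The function name `list_code_sections` is invented for the example; `SHF_EXECINSTR` is the ELF flag that marks a section as containing executable instructions.

```c
#include <assert.h>
#include <elf.h>      /* ELF structure definitions (Linux/glibc) */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every section flagged as executable code.
 * Returns how many were found, or -1 if the image is malformed or is
 * not a 64-bit ELF. Assumes the file's byte order matches the host. */
int list_code_sections(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;
    Elf64_Shdr sh;
    int found = 0;

    assert(buf != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof sh)
        return -1;

    /* The whole section header table must lie inside the buffer. */
    if (eh.e_shoff > len || (len - eh.e_shoff) / sizeof sh < eh.e_shnum)
        return -1;

    for (unsigned i = 0; i < eh.e_shnum; i++) {
        memcpy(&sh, buf + eh.e_shoff + (size_t)i * sizeof sh, sizeof sh);

        /* Keep only sections with file contents that are executable. */
        if (sh.sh_type != SHT_PROGBITS || !(sh.sh_flags & SHF_EXECINSTR))
            continue;

        printf("section %u: file offset 0x%llx, size 0x%llx, address 0x%llx\n",
               i,
               (unsigned long long)sh.sh_offset,
               (unsigned long long)sh.sh_size,
               (unsigned long long)sh.sh_addr);
        found++;
    }
    return found;
}
```

In an object file such as teste.o the address is normally 0, because the final addresses are only assigned at link time.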

This gives you a list of code sections, for each of which three pieces of information matter: the section's size in bytes, its location in virtual memory (this affects some instructions, such as a CALL that crosses sections), and its offset in the file. The compiled machine code can be read directly from the executable file by reading size bytes starting at offset.
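Reading size bytes starting at offset can be sketched in C as follows. Both function names, `read_span` and `print_hex`, are hypothetical; the offset and size are whatever you obtained from the section table.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd copy of `size` bytes starting
 * at `offset` in the file, or NULL on any error (the caller frees it). */
unsigned char *read_span(FILE *f, long offset, size_t size)
{
    unsigned char *out;

    assert(f != NULL);
    if (size == 0)
        return NULL;
    out = malloc(size);
    if (!out)
        return NULL;
    if (fseek(f, offset, SEEK_SET) != 0 ||
        fread(out, 1, size, f) != size) {
        free(out);          /* seek failed or the file is too short */
        return NULL;
    }
    return out;
}

/* Print the bytes as two-digit hex values, 16 per line. */
void print_hex(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%02x%c", p[i], (i % 16 == 15 || i + 1 == n) ? '\n' : ' ');
}
```

Using `unsigned char` for the buffer avoids any sign-extension surprises when the bytes are printed.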

If you want to know the names of functions or variables, you will also need the symbol table. It helps because the code alone does not clearly separate one function from the next. Each symbol is associated with a section and points to the memory address where the function or variable starts.
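For ELF again, listing the named symbols might look like the sketch below. It makes the same assumptions as before: a 64-bit ELF, host byte order, and glibc's `<elf.h>`. The names `list_symbols` and `in_bounds` are invented for the example. In ELF, a symbol table section (`SHT_SYMTAB`) refers to its string table through its `sh_link` field, and each symbol's `st_name` is an offset into that string table.

```c
#include <assert.h>
#include <elf.h>      /* ELF structure definitions (Linux/glibc) */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if [off, off + size) fits in len bytes. */
static int in_bounds(unsigned long long off, unsigned long long size, size_t len)
{
    return off <= len && size <= len - off;
}

/* Hypothetical helper: prints each named symbol with its section index
 * and value. Returns the number printed, or -1 on a malformed image. */
int list_symbols(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;
    Elf64_Shdr symtab, strtab;
    Elf64_Sym sym;
    int count = 0;

    assert(buf != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof symtab)
        return -1;
    if (eh.e_shoff > len || (len - eh.e_shoff) / sizeof symtab < eh.e_shnum)
        return -1;

    for (unsigned i = 0; i < eh.e_shnum; i++) {
        memcpy(&symtab, buf + eh.e_shoff + (size_t)i * sizeof symtab, sizeof symtab);
        if (symtab.sh_type != SHT_SYMTAB || symtab.sh_link >= eh.e_shnum)
            continue;

        /* sh_link of a symbol table is the index of its string table. */
        memcpy(&strtab, buf + eh.e_shoff + (size_t)symtab.sh_link * sizeof strtab,
               sizeof strtab);
        if (!in_bounds(symtab.sh_offset, symtab.sh_size, len) ||
            !in_bounds(strtab.sh_offset, strtab.sh_size, len))
            return -1;

        size_t n = symtab.sh_size / sizeof sym;
        for (size_t j = 1; j < n; j++) {    /* entry 0 is the reserved null symbol */
            memcpy(&sym, buf + symtab.sh_offset + j * sizeof sym, sizeof sym);
            if (sym.st_name == 0 || sym.st_name >= strtab.sh_size)
                continue;                   /* unnamed or out of range */

            const char *name = (const char *)buf + strtab.sh_offset + sym.st_name;
            if (!memchr(name, 0, strtab.sh_size - sym.st_name))
                continue;                   /* name is not NUL-terminated */

            printf("%-20s section=%u value=0x%llx\n", name,
                   (unsigned)sym.st_shndx, (unsigned long long)sym.st_value);
            count++;
        }
    }
    return count;
}
```

In an object file the value is an offset from the start of the symbol's section, so together with the section's file offset it tells you where each function begins in the dumped bytes.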
