What is the Assembler?

15

I keep reading things about the Assembler and getting confused. At first I thought it was just sloppy usage and that Assembler was the same thing as Assembly, but it turns out it isn't.

What is the Assembler? And what is its relationship with Assembly?

  • This is related.

4 answers

15


Assembler

The Assembler, as its name says, is an assembler and not a compiler, although it works very similarly. It takes text that is program source code and turns it into binary code (machine code). What sets it apart from a compiler is precisely that the instructions in the language map one-to-one onto binary code. Analysing and transforming the code is much simpler in an assembler than in a compiler (the lexer is more or less the same, the parser is simpler, and semantic analysis generally does not exist).
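
To make the one-to-one idea concrete, here is a minimal sketch of a toy assembler in Python. It is not how NASM or any real assembler is implemented: the OPCODES table and the assemble function are names invented for this illustration, and only a few operand-less 32-bit x86 instructions are covered. The byte values are the usual encodings of those instructions.

```python
# Toy assembler: every source line becomes exactly one machine-code byte.
# OPCODES and assemble() are illustrative names, not part of any real tool.
OPCODES = {
    "nop": 0x90,
    "pushad": 0x60,
    "popad": 0x61,
    "leave": 0xC9,
    "ret": 0xC3,
}


def assemble(source):
    """Translate operand-less instructions, one per line, into bytes."""
    out = bytearray()
    for line in source.splitlines():
        text = line.split(";", 1)[0].strip().lower()  # drop comments
        if not text:
            continue
        if text not in OPCODES:
            raise ValueError(f"unknown instruction: {text!r}")
        out.append(OPCODES[text])  # a plain table lookup, no analysis
    return bytes(out)


program = """
    pushad      ; save registers
    nop
    popad       ; restore registers
    ret
"""
print(assemble(program).hex(" "))
```

There is no optimization and no semantic check here: the whole translation is a table lookup, which is the point being made above.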

Assembly

The low-level programming language is Assembly (assembly language), which is assembled by an Assembler (the English suffix -er corresponds to the Portuguese -dor, as in montador, and indicates the agent of an action). We write it capitalized because it is a proper name.

There are several dialects for each architecture, whether physical (x86, ARM, MIPS, etc.) or virtual (JVM, CIL, and many dynamic languages have their own Assembly). The semantics depend on the architecture, but the syntax does not, and each specific assembler may adopt whichever syntax it prefers.

Some people confuse machine code and Assembly. The former is binary; Assembly sits at a level that humans (normal ones, although some will say these are not so normal :P) can understand. It consists of mnemonics that define the instructions the processor must execute.
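
As a rough illustration of the difference, the sketch below prints the same four bytes first as machine code and then as Assembly mnemonics. The MNEMONICS table and the disassemble function are hypothetical names made up for this example, and it only works because every instruction chosen here is a single byte long in 32-bit x86; a real disassembler has to decode variable-length instructions.

```python
# Byte value -> mnemonic for a few single-byte 32-bit x86 instructions.
# MNEMONICS and disassemble() are illustrative names only.
MNEMONICS = {
    0x60: "pushad",
    0x61: "popad",
    0x90: "nop",
    0xC3: "ret",
    0xC9: "leave",
}


def disassemble(code):
    """Turn raw bytes into readable lines, one line per byte."""
    # Unknown bytes are shown as raw data using the "db" directive.
    return [MNEMONICS.get(byte, f"db 0x{byte:02x}") for byte in code]


machine_code = bytes([0x60, 0x90, 0x61, 0xC3])  # what the processor sees
print(machine_code.hex(" "))
for line in disassemble(machine_code):          # what a human reads
    print(line)
```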

Assembly is entirely imperative, and every mnemonic is a very simple instruction: manipulating a value in registers, moving data between registers and memory, or controlling the execution flow as simply as possible, with no abstractions, no syntactic sugar, and no ready-made design patterns.
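
To give a feel for how little each instruction does, here is a small sketch of a register machine in Python. Everything in it is an assumption made for illustration: the run function, the tuple format for instructions, and the single zero flag are invented here and only loosely mimic how x86 behaves.

```python
def run(program, registers):
    """Execute a list of (mnemonic, operand, ...) tuples on a register dict."""
    # Map each label name to its position in the program.
    labels = {args[0]: i for i, (op, *args) in enumerate(program) if op == "label"}

    def value(operand):
        # A string names a register; anything else is an immediate value.
        return registers[operand] if isinstance(operand, str) else operand

    zero = False  # set by arithmetic instructions, read by jnz
    pc = 0        # index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "mov":
            registers[args[0]] = value(args[1])
        elif op == "add":
            registers[args[0]] += value(args[1])
            zero = registers[args[0]] == 0
        elif op == "dec":
            registers[args[0]] -= 1
            zero = registers[args[0]] == 0
        elif op == "jnz":
            if not zero:
                pc = labels[args[0]]
        elif op != "label":
            raise ValueError(f"unknown instruction: {op!r}")
    return registers


# Sum 5 + 4 + 3 + 2 + 1 using only moves, arithmetic and a conditional jump.
program = [
    ("mov", "eax", 0),
    ("mov", "ecx", 5),
    ("label", "again"),
    ("add", "eax", "ecx"),
    ("dec", "ecx"),
    ("jnz", "again"),
]
print(run(program, {}))
```

Even a loop has to be spelled out as a counter, a decrement, and a jump back to a label.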

Someone who says they will program in Assembler makes the same mistake as someone who says they are going to program in Visual Studio.

assembly

But assembly, in lower case, is that file with binary CLR code (a.k.a. .NET).

Examples

Example of x86 Assembly code in Intel syntax (32-bit, assembled with NASM):

; Assembler (x86) version of 99 Bottles of beer
; 
; This version is for the NASM assembler but doesn't use any 
; macros, just basic x86 instructions.
; Also only putchar() function is used to print character
; onto screen, and the whole rest is in code.
;
; nasm -fwin32 99.asm
; gcc -o 99.exe 99.obj

        global  _main
        extern  _putchar
        
        segment .data

_line_1_1        db ' bottles of beer on the wall, ', 0
_line_1_2        db ' bottles of beer.', 13, 10, 0
_line_2_1        db 'Take one down and pass it around, ', 0
_line_2_2        db ' bottles of beer on the wall.', 13, 10, 13, 10, 0
_line_2_2_one    db ' bottle of beer on the wall.', 13, 10, 13, 10, 0
_ending_lines    db '1 bottle of beer on the wall, 1 bottle of beer.', 13, 10
                 db 'Take one down and pass it around, no more bottles of beer on the wall.', 13, 10, 13, 10
                 db 'No more bottles of beer on the wall, no more bottles of beer. ', 13, 10
                 db 'Go to the store and buy some more, 99 bottles of beer on the wall.', 13, 10, 0
         
        segment .text

; this function converts integer in range 0-99 to string
_integer_to_string:
        mov     eax, dword [esp + 08h]    ; get the value
        mov     ecx, 10                   ; 
        sub     edx, edx                  
        div     ecx                       ; divide it by 10
        mov     ecx, dword [esp + 04h]    ; get the output offset
        test    eax, eax                  ; is greater than 9
        jz      .skip_first_digit         ; skip saving 0 char if no
        add     al, 030h                  ; convert number to ascii char
        mov     byte [ecx], al            ; save
        inc     ecx                       ; increase pointer
        jmp     .dont_test_second_digit   ; 
     .skip_first_digit:                   ; only if less than 10
        test    edx, edx
        jz      .skip_second_digit
     .dont_test_second_digit:             ; if it was 10 or greater
        add     dl, 030h                  ; then second digit must be
        mov     byte [ecx], dl            ; written unconditionally
        inc     ecx                     
     .skip_second_digit:                  ; only skip if value was 0
        mov     byte [ecx], ah            ; save the null ending char
        retn    8                         ; ret and pop both arguments
; function prints null-terminated line to stdout
_show_line:
        push    edi                       ; function save registers
        push    esi
        mov     edi, dword [esp + 0Ch]    ; get the pointer to string
        sub     eax, eax                  ; look for zeros
        sub     ecx, ecx                        
        dec     ecx                       ; set ecx to -1
        repnz   scasb                     ; search for 0 in string
        neg     ecx
        sub     ecx, 2                    ; get the string length w/o zero
        mov     esi, dword [esp + 0Ch]    ; get pointer once again
     .putchar_loop:
        push    ecx                       ; keep the counter
        lodsb                             ; get the char
        push    eax                       
        call    _putchar                  ; print char to stdout
        add     esp, 4                    ; correct stack 
        pop     ecx                       ; get back the counter
        dec     ecx                     
        jnz     .putchar_loop             ; if not last char then get next
        pop     esi                       ; restore registers
        pop     edi
        retn    4
; prints string for only one number
_bottles:
        push    ebp                       ; keep the offset to call params
        mov     ebp, esp
        sub     esp, 4                    ; reserve one local variable
        mov     eax, dword [ebp + 08h]    ; get number of bottles
        dec     eax                       ; is it 1?
        jnz     .more_than_one            ; nope, it's not
        push    _ending_lines             ; print the last lines
        call    _show_line
        jmp     .end                      ; exit function
     .more_than_one:
        inc     eax                       ; get the original value
        push    eax                       ; convert it to string
        lea     eax, [ebp - 04h]
        push    eax                       ; string will be stored here
        call    _integer_to_string
        lea     eax, [ebp - 04h]
        push    eax
        call    _show_line                ; 'xx'
        push    _line_1_1
        call    _show_line                ; ' bottles of beer on the wall, '
        lea     eax, [ebp - 04h]
        push    eax
        call    _show_line                ; 'xx'
        push    _line_1_2
        call    _show_line                ; ' bottles of beer.'
        mov     eax, dword [ebp + 08h]
        dec     eax                       ; in second line the value is one less
        push    eax
        lea     eax, [ebp - 04h]
        push    eax
        call    _integer_to_string        ; convert it to string
        push    _line_2_1
        call    _show_line                ; 'Take one down and pass it around, '
        lea     eax, [ebp - 04h]
        push    eax
        call    _show_line                ; 'xx'
        cmp     dword [ebp + 08h], 2
        jnz     .second_line_for_more_than_one
        push    _line_2_2_one             ; ' bottle of beer on the wall.'
        jmp     .show_line
     .second_line_for_more_than_one:   
        push    _line_2_2                 ; ' bottles of beer on the wall.'
     .show_line:
        call    _show_line
     .end:
        leave
        retn    4
; main function, the command line arguments are not important
_main:        
        pushad
        mov     ecx, 99                   ; printf from 99
     .main_loop:
        push    ecx
        push    ecx
        call    _bottles                  ; print lines for this value
        pop     ecx
        loop    .main_loop                ; if still greater than zero
        popad
        sub     eax, eax                  ; That's all folks!
        retn

On ARM:

;99 Bottles of Beer generator
;For ARM processors running RISCOS
;Using built in BASIC assembler
;

MOV R7, #99              ;bottle count kept in R7
MOV R12, R14             ;store caller return address

.beginverse              ;(_prints verses then returns to caller_)
BL  bottlesofbeer
ADR R0, onthewall
SWI "OS_Write0"          ;prints string at address in R0
BL  bottlesofbeer
SWI "OS_NewLine"
ADR R0, take
SWI "OS_Write0"
SUBS R7,R7,#1            
BLNE bottlesofbeer       ;beer left
BLEQ nobeer              ;no beer left
ADR R0, onthewall
SWI "OS_Write0"
SWI "OS_NewLine"
SWI "OS_NewLine"
BNE beginverse           ;go again if there's beer left
BL buymorebeer           ;print last verse
MOV PC, R12              ;exit to caller

.bottlesofbeer           ;(_prints "x bottle(s) of beer"_)
MOV R0, R7               ;arg1- number of bottles
ADR R1, bottlenum        ;arg2- buffer address
MOV R2, #3               ;arg3- buffer length
SWI "OS_ConvertInteger3" ;convert number of beers to string
SWI "OS_Write0"          ;and print it
CMP R7, #1             
ADR R0, bottles          ;
ADREQ R0, bottle         ;bottles is replaced with bottle if 1 bottle left
SWI "OS_Write0"
ADR R0, ofbeer
SWI "OS_Write0"
CMP R1, #0               ;unset zero flag so "nobeer" doesnt execute after return
MOV PC, R14              ;return

.buymorebeer             ;(_prints final verse_)
MOV R11, R14             ;save return address
BL nobeer
ADR R0, onthewall
SWI "OS_Write0"
ADR R0, comma
SWI "OS_Write0"
BL nobeer
SWI "OS_NewLine"
ADR R0, gotostore
SWI "OS_Write0"
MOV PC, R11              ;return to saved address

.nobeer                  ;(_prints "no more bottles of beer"_)
ADR R0, nomore
SWI "OS_Write0"
ADR R0, bottles
SWI "OS_Write0"
ADR R0, ofbeer
SWI "OS_Write0"
MOV PC, R14

;string components

.ofbeer
EQUS "of beer"           ;string contents
EQUB 0                   ;zero terminator

.onthewall
EQUS " on the wall "
EQUB 0

.bottle
EQUS " bottle "
EQUB 0

.bottles
EQUS " bottles "
EQUB 0

.take
EQUS "Take one down and pass it around, "
EQUB 0

.nomore
EQUS "no more"
EQUB 0

.bottlenum
EQUS "  "
EQUB 0

.comma
EQUS ","
EQUB 0

.gotostore
EQUS "Go to the store and buy some more...99 bottles of beer."
EQUB 0

Source.

Now do you see why they say you should comment your code? This is where you should.

  • I’m not questioning the answer, I’m trying to understand: so the Assembler doesn’t follow this http://answall.com/a/104818/3635 ? It’s simpler, and that is why it’s an assembler? I mean, it would be interesting to link that answer here, with something like "difference between compiler and assembler".

  • @Guilhermenascimento Not exactly. The process is similar, but the rules are much simpler. What really differs is the code generation. There is no complex transformation, and it does not analyze the code more broadly. It reads an instruction and generates its binary code, without worrying much about whether the semantics are correct. It does not perform optimizations. It is almost a converter. Compilers perform more complex analyses and transformations.

  • Yes, that’s what I understood (I had already read your other answer). I only commented because sooner or later someone would ask. I mean, the other answer is very well explained and it would be really cool to link it here, what do you think?

  • 1

    The Assembly code generated by GCC is translated by the Assembler into machine code, and that machine code is then ready to run, am I right?

  • Oh, great ;D

  • 2

    @cat yes. In general the assembler already creates an executable with the parts the operating system needs in order to understand this code and load it properly into memory. There are compilers that generate Assembly and leave the Assembler responsible for generating the binary.

9

What is the Assembler? And what is its relationship with Assembly?

The Assembler is a compiler. It converts code written in the Assembly language into native code.

  • 5

    Native code would be binary code or machine code, correct?

  • 1

    @cat exactly, machine code. There are binary codes that are not native, for example MSIL or bytecodes.

  • Is this Assembler compiler part of the operating system, or is it something that is already in the hardware?

  • 1

    Several OSes already come with a ready-to-use Assembler, but there is no specific dependency.

9

Assembly is a programming language, but not a typical one. It is a low-level programming language made up of instructions in a rigid, simple format that does not allow substructures, plus some labels (targets for jump instructions). It usually (almost always, though there are exceptions) maps one-to-one onto the instructions executed by the processor (one processor instruction = one Assembly instruction).

Each instruction is defined by a mnemonic. For example, in the instructions MOV eax, 1 or JN algum_label, MOV and JN are the mnemonics, which also give the instruction its name.

The Assembler is the program that converts code written in Assembly into the instructions themselves (encoded as a byte sequence). That is, the Assembler is the compiler. Despite this, confusion between the terms Assembly and Assembler is common, and many people say "programming in Assembler" when it should actually be "programming in Assembly".
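
As a hedged sketch of what "converting Assembly into a byte sequence" involves once labels enter the picture, here is a tiny two-pass assembler in Python. The names SIZES, parse and assemble are invented for this example, and the supported subset is deliberately minimal (mov ecx with an immediate, dec ecx, jnz to a label, ret). The byte values are the standard 32-bit x86 encodings; real assemblers handle far more cases.

```python
import struct

# Size in bytes of each supported instruction (32-bit x86, tiny subset).
SIZES = {"mov": 5, "dec": 1, "jnz": 2, "ret": 1}


def parse(source):
    """Return (kind, name, operands) tuples for every non-empty line."""
    items = []
    for line in source.splitlines():
        text = line.split(";", 1)[0].strip()  # drop comments
        if not text:
            continue
        if text.endswith(":"):
            items.append(("label", text[:-1], []))
        else:
            mnemonic, _, rest = text.partition(" ")
            operands = [o.strip() for o in rest.split(",")] if rest.strip() else []
            items.append(("ins", mnemonic.lower(), operands))
    return items


def assemble(source):
    items = parse(source)
    # Pass 1: walk the program and record the address of every label.
    labels, address = {}, 0
    for kind, name, _ in items:
        if kind == "label":
            labels[name] = address
        else:
            address += SIZES[name]
    # Pass 2: emit the bytes, now that every label address is known.
    code = bytearray()
    for kind, name, operands in items:
        if kind == "label":
            continue
        if name == "mov" and operands[0] == "ecx":
            code += b"\xB9" + struct.pack("<I", int(operands[1], 0))
        elif name == "dec" and operands == ["ecx"]:
            code.append(0x49)
        elif name == "jnz":
            next_address = len(code) + 2  # jumps are relative to the next instruction
            code += b"\x75" + struct.pack("<b", labels[operands[0]] - next_address)
        elif name == "ret":
            code.append(0xC3)
        else:
            raise ValueError(f"unsupported: {name} {operands}")
    return bytes(code)


source = """
    mov ecx, 3
loop:
    dec ecx
    jnz loop
    ret
"""
print(assemble(source).hex(" "))
```

The first pass exists only because a jump may refer to a label whose address is not yet known when the jump is read; otherwise each line still turns into its bytes independently.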

Since there are several types of processors, each with its own set of instructions, this means that for each processor we have at least one dialect of Assembly. Different assembler developers may use different notations or different mnemonics for the instructions, and therefore, even on the same architecture, there may be several distinct Assembly dialects.
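
One well-known case of two dialects on the same architecture is Intel syntax versus AT&T syntax on x86. The sketch below, with the made-up helper names operand_to_att and intel_to_att, rewrites a very small subset of Intel-style lines (two operands, 32-bit registers or numeric immediates only) in AT&T style. It is an illustration of the notational difference, not a general converter.

```python
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def operand_to_att(operand):
    """Rewrite one Intel-style operand (register or number) in AT&T style."""
    operand = operand.strip().lower()
    if operand in REGISTERS:
        return "%" + operand      # registers get a % prefix
    int(operand, 0)               # raises ValueError if it is not a number
    return "$" + operand          # immediates get a $ prefix


def intel_to_att(line):
    """Convert e.g. 'mov eax, 1' into 'movl $1, %eax'."""
    mnemonic, operands = line.strip().split(None, 1)
    dest, src = operands.split(",")
    # AT&T puts the source first and adds a size suffix ('l' = 32 bits).
    return f"{mnemonic.lower()}l {operand_to_att(src)}, {operand_to_att(dest)}"


for intel in ("mov eax, 1", "add ecx, edx"):
    print(f"{intel:<14} -> {intel_to_att(intel)}")
```

Both spellings describe the same processor instruction and assemble to the same bytes; only the notation differs.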

  • I know a lot of people who say they program in Dreamweaver, NetBeans, Eclipse, etc.

  • Great answer, Victor ;D. I just got confused about mnemonics: I do not know what they are. Could you clarify?

  • @cat is it better now?

  • @Victorstafusa of course! Mnemonics are quite different from traditional programming languages; in this case they would be synonymous with the "instruction or statement" used in high-level languages.

  • 1

    @cat Not quite a statement, no. These are actual instructions. For example, instead of saying a = b, in Assembly it would be MOV a, b (destination first, as in Intel syntax). Instead of c = c + d, in Assembly it would be ADD c, d. Obviously, it still depends on the specific Assembly dialect.
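
A minimal sketch of that correspondence, assuming destination-first (Intel-style) operand order, so a = b comes out as MOV a, b; a source-first dialect would swap the operands. The to_assembly function is a hypothetical helper written for this illustration and only understands the two statement shapes mentioned in the comment.

```python
import re


def to_assembly(statement):
    """Translate 'x = y' or 'x = x + y' into a single instruction."""
    statement = statement.strip()
    # x = x + y  ->  ADD x, y   (the destination must also be the first operand)
    match = re.fullmatch(r"(\w+)\s*=\s*(\w+)\s*\+\s*(\w+)", statement)
    if match and match.group(1) == match.group(2):
        return f"ADD {match.group(1)}, {match.group(3)}"
    # x = y  ->  MOV x, y
    match = re.fullmatch(r"(\w+)\s*=\s*(\w+)", statement)
    if match:
        return f"MOV {match.group(1)}, {match.group(2)}"
    raise ValueError(f"needs more than one instruction: {statement!r}")


for statement in ("a = b", "c = c + d"):
    print(f"{statement:<10} -> {to_assembly(statement)}")
```

Anything richer, such as c = a + d, is rejected because it no longer fits in one instruction of this kind.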

3

I found this definition very relevant:

Machine code

It is the compiled output of an Assembler.

Assembly

It is the human-readable form of machine code.

Assembly Language

Refers to a specific machine code language, such as x86 Assembly.

Assembler

It is the tool used to compile source code into machine code.

Assembler Language

It is the language used by a particular Assembler.

I translated this from an answer on SO-en. I thought it was good, but my English is not "the best"...
