How does an "if" work internally?

10

The if statement is used everywhere in programming and plays several important roles in a programmer’s everyday life. The code seems to run as if by magic when the expression passed to the if is true; otherwise that code is simply skipped.

But how does it work internally?

Take this code as a basis (pseudocode):

var x = 8;
if (x < 4) {
    print("x é menor do que 4!");
} else {
    print("x é maior do que 4!");
}
  • Would this answer help you?

  • @Luizaugusto I’m asking how an if works internally, how the "magic" happens.

2 answers

16


Suppose there were no statement blocks; the code would then look something like this (no optimization):

var x = 8;
if (x < 4) goto TRUE;
print("x é maior do que 4!");
goto FIM;
:TRUE
print("x é menor do que 4!");
:FIM

This assumes you understand the concept of goto and of the labels I used (TRUE and FIM); if you don’t, that is a separate question.

But that still doesn’t explain how the if is executed.

Well, the processor knows how to make a decision: there is a built-in instruction that either does something or doesn’t, and that’s all; it does not evaluate the condition itself. This is usually called a branch, i.e. execution takes one branch or the other. On Intel processors it is the jz instruction (or je, jne, or others; see a list).

[Image: a railway track with a switch]

Internally there are a lot of logic gates that implement this decision, but basically what the instruction does is divert execution (it changes the PC register, which is the register that stores the address of the next instruction to be executed) if a certain value is set in a flag of the flags register (see their list).

That flags register holds values set by a previously executed instruction, usually a comparison (though not necessarily), which changed one of the flags that will be checked. So the condition is evaluated separately from the if. I have always said that, deep down, what happens is this (many people don’t understand it and think the condition is part of the if):

var x = 8;
var cond = x < 4;
if (cond) goto TRUE;
print("x é maior do que 4!");
goto FIM;    
:TRUE
print("x é menor do que 4!");
:FIM

Note that the if and the goto are still a single thing: just one instruction that diverts execution conditionally (unlike the JMP instruction, which diverts unconditionally and is the plain goto).
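The same lowering can be tried out in real C, which still has goto. This is only a sketch: the function names `branch_structured` and `branch_lowered` and the labels `L_TRUE` and `L_FIM` are made up for the illustration, and the boolean return value exists only so the two versions can be compared.

```c
#include <stdio.h>
#include <stdbool.h>

/* Structured version: returns true when the "x < 4" branch ran. */
bool branch_structured(int x) {
    if (x < 4) {
        puts("x é menor do que 4!");
        return true;
    } else {
        puts("x é maior do que 4!");
        return false;
    }
}

/* Lowered version: the condition is computed first, then one
   conditional jump decides where execution continues. */
bool branch_lowered(int x) {
    bool taken;
    bool cond = x < 4;        /* the comparison happens on its own */
    if (cond) goto L_TRUE;    /* conditional jump: the only "decision" */
    puts("x é maior do que 4!");
    taken = false;
    goto L_FIM;               /* unconditional jump, the plain goto */
L_TRUE:
    puts("x é menor do que 4!");
    taken = true;
L_FIM:
    return taken;
}
```

For any x, both functions print the same message and return the same value; only the shape of the control flow differs.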

Of course, in some languages this works a little differently because they run on virtual machines, but in essence it’s the same thing. Same goes for different processors.

In fact, in C it would compile to more or less this:

.LC0:
        .string "x \303\251 menor do que 4!"
.LC1:
        .string "x \303\251 maior do que 4!"
teste:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     DWORD PTR [rbp-4], 8
        cmp     DWORD PTR [rbp-4], 3
        jg      .L2
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        jmp     .L3
.L2:
        mov     edi, OFFSET FLAT:.LC1
        mov     eax, 0
        call    printf
.L3:
        nop
        leave
        ret

See on Compiler Explorer.

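Here jg is the instruction that makes the decision (jump if greater). The CMP instruction is the one that makes the comparison and records the outcome in the flags register; jg then checks ZF, SF and OF, and jumps when ZF is 0 and SF equals OF. Note that the compiler turned x < 4 into a comparison with 3: the jump to the else branch is taken when x > 3.

In C# it would be this: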
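To make the split between comparing and deciding concrete, here is a small C model of what cmp and jg do on 32-bit operands. It is a sketch, not how any real tool works: the `Flags` struct and the functions `cmp32` and `jg_taken` are hypothetical names, and only the three flags that jg reads are modelled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the three flags that jg inspects. */
typedef struct {
    bool zf;  /* zero flag: the result was zero */
    bool sf;  /* sign flag: top bit of the result */
    bool of;  /* overflow flag: signed overflow occurred */
} Flags;

/* Model of "cmp a, b": computes a - b and keeps only the flags. */
Flags cmp32(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r = ua - ub;                 /* wraps around, no undefined behavior */
    Flags f;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    /* Signed overflow on subtraction: the operands differ in sign and
       the result's sign differs from the first operand's sign. */
    f.of = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;
    return f;
}

/* Model of "jg": the jump is taken when ZF == 0 and SF == OF. */
bool jg_taken(Flags f) {
    return !f.zf && (f.sf == f.of);
}
```

With x = 8, `jg_taken(cmp32(8, 3))` is true, which corresponds to jumping to .L2 in the listing; with x = 3 or x = 2 the jump is not taken and execution falls through to the first printf.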

.class private auto ansi '<Module>'
{
} // end of class <Module>

.class public auto ansi beforefieldinit C
    extends [System.Private.CoreLib]System.Object
{
    // Methods
    .method public hidebysig 
        instance void M () cil managed 
    {
        // Method begins at RVA 0x2050
        // Code size 26 (0x1a)
        .maxstack 8

        IL_0000: ldc.i4.8
        IL_0001: ldc.i4.4
        IL_0002: bge.s IL_000f

        IL_0004: ldstr "x é menor do que 4!"
        IL_0009: call void [System.Console]System.Console::WriteLine(string)
        IL_000e: ret

        IL_000f: ldstr "x é maior do que 4!"
        IL_0014: call void [System.Console]System.Console::WriteLine(string)
        IL_0019: ret
    } // end of method C::M

    .method public hidebysig specialname rtspecialname 
        instance void .ctor () cil managed 
    {
        // Method begins at RVA 0x206b
        // Code size 7 (0x7)
        .maxstack 8

        IL_0000: ldarg.0
        IL_0001: call instance void [System.Private.CoreLib]System.Object::.ctor()
        IL_0006: ret
    } // end of method C::.ctor

} // end of class C

See on Sharplab.

And in Assembly:

; Core CLR v4.700.19.46205 (coreclr.dll) on x86.

C..ctor()
    L0000: ret

C.M()
    L0000: mov ecx, [0x10998940]
    L0006: call System.Console.WriteLine(System.String)
    L000b: ret

Check it on Sharplab. Then you ask: where is the condition? The compiler realized it could resolve it at compile time and killed the if :). In C, if you turn on all optimizations, it will do the same. Some languages know how to optimize well.

I’m not going to get into the logic gates, which is too low-level, but it is only manipulation of data already present in the processor (it was put there earlier). As said before, the condition produces a value in one of the processor’s flags, which are themselves held in a register, and a value is changed in the PC (IP for some); execution then continues at the memory address stored in that register.

Since this is a harder topic for most people, let me know if some point was not clear; I can improve the answer as needed.
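A sketch of why the if can disappear: in the first function below the value of x is known at compile time, so an optimizing compiler is allowed to keep only the branch that runs; in the second, x comes from the caller, so the comparison and the conditional jump normally have to stay. The function names `teste_constant` and `teste_runtime` are assumptions for this example, and whether a given compiler really removes the branch depends on the compiler and its flags.

```c
#include <stdio.h>

/* x is a compile-time constant, so (x < 4) is known before the program runs.
   Returns 0 for the "menor" branch and 1 for the "maior" branch. */
int teste_constant(void) {
    int x = 8;
    if (x < 4) {
        puts("x é menor do que 4!");
        return 0;
    }
    puts("x é maior do que 4!");
    return 1;
}

/* x is only known at run time, so the decision cannot be made in advance. */
int teste_runtime(int x) {
    if (x < 4) {
        puts("x é menor do que 4!");
        return 0;
    }
    puts("x é maior do que 4!");
    return 1;
}
```

Comparing the generated assembly of both functions with optimizations on and off (for instance on Compiler Explorer) is a quick way to see the difference.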
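The role of the PC can be imitated with a toy machine written in C. Everything here is invented for the illustration (the `Op` instruction set, the `Instr` struct, the function `run_toy_program`); it is a minimal sketch of the idea that a jump is nothing more than writing a new value into the program counter.

```c
#include <stdio.h>

/* Hypothetical toy instruction set, invented for this illustration. */
typedef enum { OP_COMPARE_LT, OP_JUMP_IF_FLAG, OP_JUMP, OP_PRINT, OP_HALT } Op;

typedef struct {
    Op op;
    int arg;  /* constant to compare with, jump target, or message index */
} Instr;

static const char *const MESSAGES[] = {
    "x é menor do que 4!",
    "x é maior do que 4!",
};

/* Runs the toy program for a given x.
   Returns the index of the message that was printed, or -1 if none. */
int run_toy_program(int x) {
    static const Instr program[] = {
        /* 0 */ { OP_COMPARE_LT,   4 },  /* flag = (x < 4) */
        /* 1 */ { OP_JUMP_IF_FLAG, 4 },  /* if flag is set, pc = 4 */
        /* 2 */ { OP_PRINT,        1 },  /* else branch */
        /* 3 */ { OP_JUMP,         5 },  /* skip over the then branch */
        /* 4 */ { OP_PRINT,        0 },  /* then branch */
        /* 5 */ { OP_HALT,         0 },
    };
    int pc = 0;        /* program counter: index of the next instruction */
    int flag = 0;      /* stands in for the flags register */
    int printed = -1;

    for (;;) {
        Instr in = program[pc];
        pc = pc + 1;   /* default: continue with the following instruction */
        switch (in.op) {
        case OP_COMPARE_LT:   flag = (x < in.arg);          break;
        case OP_JUMP_IF_FLAG: if (flag) pc = in.arg;        break;
        case OP_JUMP:         pc = in.arg;                  break;
        case OP_PRINT:        puts(MESSAGES[in.arg]);
                              printed = in.arg;             break;
        case OP_HALT:         return printed;
        }
    }
}
```

Note that OP_JUMP_IF_FLAG never looks at x: it only reads the flag left behind by the earlier comparison and, depending on it, overwrites pc.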

  • 1

    You mentioned branches, and of course I remembered this answer: https://stackoverflow.com/a/11227902/4438007

  • 2

    @Jeffersonquesado now you can enjoy the photo here too, I stole it :D

3

A compiler converts your source code to machine code, which the computer can understand. During the conversion, the source code goes through several compiler phases.

One of these is the intermediate code generation phase, in which intermediate code is generated for your source code. That code is then optimized in the code optimization phase to make it more efficient overall, and finally converted into machine code in the last phase, code generation.

Now to answer your question.

Intermediate code is often represented in three-address code (3AC). A three-address instruction has the following form:

Result := Argument_1 operator Argument_2

It is called three-address code because each instruction refers to at most three addresses: two operands and one result.

For example, take this code block (this is the answer you are looking for):

var x = 8;
if (x < 4) {
    print("x é menor do que 4!");
} else {
    print("x é maior do que 4!");
}

Its three-address code representation is as follows:

1) var x = 8;
2) var Result = x < 4;
3) if Result then goto (5)
4) goto (7)
5) print("x é menor do que 4!");
6) goto (8)
7) print("x é maior do que 4!");
8) 

Yes, the if works using goto. This is how it works:

  • First it evaluates x < 4 and stores the result in a variable, using three-address notation.

  • If the condition is true, the program goes to line 5.

  • If the condition is false, it falls through to the next line, which sends the program to line 7.

  • Lines 5 and 6 are the code block inside the if. Line 6 sends the program to line 8, which amounts to leaving the if-else block and continuing with the rest of the code.

  • Line 7 is the code in the else block. Since the else comes last, when it finishes running the program automatically continues with the code after the if-else block, so no goto is needed.

So this is how the if works. goto is used extensively in the code that compilers generate, and not only for if blocks.

Note that in some cases the condition can also be evaluated directly in the if, as in if x < 4 then goto (5), based on the example above.
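As a rough illustration of how a compiler could produce those numbered lines mechanically, the C function below emits the three-address form of a single if-else into a buffer. It is a sketch under stated assumptions: `emit_if_else` and its parameters are hypothetical, the condition and the statements are passed in as ready-made text, and a real compiler works on a syntax tree rather than on strings.

```c
#include <stdio.h>

/* Emits the numbered three-address form of
       if (cond) { then_stmt } else { else_stmt }
   starting at line number `first`, into `out` (at most `cap` bytes).
   Returns the line number of the first instruction after the if-else. */
int emit_if_else(char *out, size_t cap, int first,
                 const char *cond, const char *then_stmt,
                 const char *else_stmt) {
    int then_line = first + 3;  /* where the then block starts */
    int else_line = first + 5;  /* where the else block starts */
    int end_line  = first + 6;  /* first line after the whole if-else */

    snprintf(out, cap,
             "%d) var Result = %s;\n"
             "%d) if Result then goto (%d)\n"
             "%d) goto (%d)\n"
             "%d) %s\n"
             "%d) goto (%d)\n"
             "%d) %s\n",
             first,     cond,
             first + 1, then_line,
             first + 2, else_line,
             then_line, then_stmt,
             first + 4, end_line,
             else_line, else_stmt);
    return end_line;
}
```

Called with first = 2, the condition x < 4 and the two print statements, it reproduces lines 2 to 7 of the listing above and returns 8, the line where the rest of the program would continue.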

  • 1

    I have the impression that, to be closer to machine language, instruction 8 should be a nop (no operation), unless there is some command after the conditional.

  • 3AC is very important in the optimization of the code generated by the compiler.
