C operator precedence table


11

In C, why does y = (y = 2, y + 3) evaluate to 5, if + has higher precedence than both = and ,?

2 answers

8


Several precedence levels are in play here. The first thing that will be evaluated is

y = 2

This sub-expression ends there because of the comma operator.

Then the code executes

y + 3

Since y was 2 just before, this sub-expression is worth 5. As the result of a comma-separated list of expressions is the result of the last one, the result of

(y = 2, y + 3)

is the result of the latter, i.e. 5. And this is the value that ends up stored in y, since that is the purpose of the whole expression.

y = (y = 2, y + 3)

Your doubt is probably about the comma operator. It sequences expressions: what is on its left is completely evaluated before what is on its right. This operator is often forgotten, but it is there in the precedence table. Note that it has the lowest precedence of all.

Perhaps your doubt is due to the fact that the assignment operator, =, has right-to-left associativity, so the value on its right must be computed before it can be stored.

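Parentheses are often optional, but here they are needed: the comma has lower precedence than =, so without them the expression would group as (y = y = 2), (y + 3), and the 5 would be computed and then discarded instead of being stored in y.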

Written another way:

y = 2;
temp = y + 3;
y = temp;

Behind the scenes, with exaggerated parentheses to make the grouping more visible, it can be seen as:

(y = ((y = 2), (y + 3)))

I put this on GitHub for future reference.

Why is the last expression used? Because C defines it that way, and because it seemed the most logical way to use a result. Only one value could be used, so the choice was between the first and the last (any other would make little sense, since it might not even exist), and you are more likely to need the last result than the first, which may well still be manipulated by the expressions that follow.

  • My doubt was: if the comma operator has the lowest precedence, wouldn't it run last? Then y + 3 would run first.

  • 3

    Last relative to what? This is important. When we speak of operator precedence, it is only in relation to the operation performed directly by the operator, not to the sub-expressions that form its operands. So what is before the comma runs first, and then what is after the comma, because both are sub-expressions that must be completed to produce a result for the larger expression in which the comma is used. Hence this operator ignores everything before the last comma and takes as its result what was obtained after the…

4

Just to point out that the comma operator (a , b) is one of the few operators in C that guarantee an order of evaluation for their operands.

Only the following operators guarantee the order of evaluation:

a && b
a || b
a ? b : c
a , b

Examples of some expressions with undefined semantics:

i = ((y=2) + y+3); // undefined behavior
i = ++i + i++;    // undefined behavior
i = i++ + 1;      // undefined behavior
f(++i, ++i);      // undefined behavior
f(i = -1, i = 2); // undefined behavior
f(i, i++);        // undefined behavior
a[i] = i++;       // undefined behavior
a = i + (i=2);    // undefined behavior
a = (i=2) + i;    // undefined behavior

cout << i << i++; // undefined behavior (C++)

UPDATE: About undefined semantics

Undefined semantics is one of the major causes of software bugs. From the compiler's point of view it corresponds to: this construct is not permitted/valid, but I will not check for it so as not to waste time. If the programmer uses it, I take no responsibility: anything goes!

What is the output of this program?

#include <stdio.h>
int main() {
    int x=1;
    printf("%d %d %d\n", x++,x++,x++);
    return 0;
}

With my gcc 5.2 it printed 3 2 1; I tried another machine and it printed 1 2 3.

Both conform to the behavior prescribed by C, and both are valid; if a compiler decided to format the disk, that would also be a valid implementation of this statement.

By the way, gcc -Wall warns about some of these cases.

A rather interesting illustrative example of how these things arise was posted in this answer on SOen (what follows is not a translation as such, but a retelling, originally in Portuguese):

Imagine two implementations of a C compiler: the one from the company Foocoorp (which I'll call fcc) and the one from the company Barcoorp (which I'll call bcc). Suppose that, when compiling the code M(A(), B()), the compiler fcc interprets it as "first execute tempA = A(), then execute tempB = B(), and then execute M(tempA, tempB)". And suppose that, when compiling the same code, the compiler bcc interprets it as "first execute tempB = B(), then execute tempA = A(), and then execute M(tempA, tempB)". In this example the final result will be the same. But whether or not it produces the same result, the order of execution is arbitrary. Who is right? The committee that defined the semantics of the language could specify an order for this particular example, but what about all the other possible combinations, not necessarily thought of yet? It may well be that such a committee decides to leave the matter open (i.e., both options are valid, depending on the choices of whoever implements the compiler, whether out of personal preference or for optimization reasons). This is especially the case when Foocoorp and Barcoorp both sit on the decision committee. :)

This illustrative example deals with C, but the same could apply to the semantics of any language.


Thanks to Luis Vieira for the suggestions and the fcc / bcc example.

  • It would be good to make clear to future readers what exactly you mean by "undefined semantics". :)

  • @Luizvieira, you’re absolutely right, but it’s really a concept with some... I added a few lines: if you have patience, feel free to complete / correct.

  • It is much better now, thanks! I had already voted +1, but I would vote again if I could. :)

  • I made an edit to try to complement it with a nice example from SOen, ok?

  • 1

    @Luizvieira, thank you twice ☺

  • 1

    In case it is not clear to anyone, the comma separating function arguments is not the "comma operator". Neither is the comma separating variables in a declaration. I think the most common use of the true comma operator would be in a for(;; ++i, ++j){}.

  • @Mysteries: Sacred Heart. (It is a subtlety that easily goes unnoticed.)

  • There is a proposal to make the rules (a little) easier in C++: http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0145r1.pdf, because certain details cause problems even for the best language experts. But for now it's just a proposal (and only for C++, not C), so it's good to keep avoiding expressions that change the same variable more than once.
