How to see the implementation of a function?

Viewed 476 times

2

From time to time I have doubts about how efficient a piece of code is, and I think that if I could see the implementation of a function I am using, I could tell how efficient it is.

Some examples of implementations that would resolve some doubts:

1) What is the implementation of <= or >=? How do they differ from <, > and ==? Would the >= comparison be two separate functions or a single one?

2) What is the implementation of getchar(), gets() and scanf() ? What is the difference between them?

3) What is the implementation of do..while(), while() and for()? The latter two seem to be the same thing with a different way of setting the parameters. Do they share the same implementation? How efficient is it to use for() instead of while(), and vice versa?

I have read some documentation for the C language and for the compiler I use, GCC, but I would really like to be able to see these implementations, or something close enough to clear up these doubts.

  • 4

    The only way is to read the compiler’s source code and see what it generates. But for a quick answer: 1) very likely each of these will be a single machine instruction, all with the same performance; 2) I have no idea; 3) it depends a lot on the architecture the program is compiled for, but there should be no significant difference. The most common pattern is "if condition X is false, jump to instruction A" at the test and "jump unconditionally to instruction B" at the end of the loop. In a do..while, by contrast, there is no such final jump, and the condition is tested at the end (jumping back when it is true).

  • 1

    +1 to mgibsonbr’s comment. And for mathematical operators, I think it’s more interesting to see how they work at the hardware level.

  • One more question: where do I find the GCC source code? I found several sites with "release" versions, but I don’t know which one I can trust. Thanks.

3 answers

4


Edit, complementing the original answer below: the implementation of these functions and operators is very complex, and if you are a beginner I recommend not stressing about it for now. As a rule, assume that the implementations in low-level languages are about as efficient as they can be. Wanting to know how these things work is one of the hallmarks of a good programmer, but until you master language and compiler construction techniques, it is more important to know the algorithm and the correct use of these functions than their implementation.


The implementation of a language’s functions is separate from the language itself ;)

Greater-than and less-than comparisons, such as a > b, can be carried out in different ways by different processors. Your mobile phone’s ARM may do it one way, whereas your PC’s i3/5/7 may do it another. But a line of code like:

if (a < 3) { /* .. snip .. */ }

... should work on both.

So what’s the secret? Each processor has its own machine language (and a matching assembly language). C, C++ and other low-level languages are a way for you to express commands to the machine, but at the end of the day the compiler takes what you’ve written and translates it into the machine language of the processor you’re compiling for.

To make it even more fun, two different compilers can generate different assembly code for the same source code.

And to make it more fun still: because the compiler is written to work on an operating system, the compiler’s own libraries, which contain these functions, can vary from one operating system to another.

If you want to see the implementation of these functions, the way I suggest is to read the source code of the C compilers and libraries.

Here is the source of the GNU C library (glibc): http://fossies.org/dox/glibc-2.20/files.html

There you will find, among others, scanf and printf as implemented for Linux. Note that the part that deals with the target architecture sits underneath that, so happy digging ;)

  • Since you mentioned documentation, where can I find all the original GCC and C documentation? Both as implementations of functions and things like ? Is the link you posted ( http://fossies.org/dox/glibc-2.20/files.html) the original? And is this link I found, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf, the original for the C language? Thanks.

  • I don’t know about the link you posted, but the link I gave you is the official one. The first link at the top of that page takes you to the compiler’s main page, with links to the documentation.

4

What’s the difference between getchar(), gets() and scanf()?

getchar reads one character at a time, gets reads an entire line at a time, and scanf is a generic function that does different things depending on the format string you give it. These functions do very different things, and it is better to choose the most appropriate one than to worry about micro-optimizations (the cost of doing input and output operations and calls to the operating system is probably much higher than any difference between them).
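To make the contrast concrete, here is a minimal sketch of the three reading styles. It is only an illustration: the function names are made up for this example, and it uses the stream versions fgetc, fgets and fscanf so it can run on any FILE *. getchar() and scanf() do the same work on stdin, and fgets is the bounded replacement for gets.

```c
#include <stdio.h>
#include <string.h>

/* Character at a time: fgetc(in) is what getchar() does for stdin.
   The result is kept in an int so EOF can be told apart from any character. */
static long count_chars(FILE *in)
{
    long n = 0;
    int c;
    while ((c = fgetc(in)) != EOF)
        n++;
    return n;
}

/* Line at a time: fgets() is the bounded counterpart of gets().
   Returns 1 if something was read, 0 at end of input. */
static int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return 1;
}

/* Formatted: fscanf() parses according to a format string, like scanf().
   Returns 1 only if both integers were converted. */
static int read_two_ints(FILE *in, int *a, int *b)
{
    return fscanf(in, "%d %d", a, b) == 2;
}
```

Calling any of these with stdin as the stream gives the behaviour of the plain getchar/scanf family.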


For the rest of your questions, I suggest you learn to read the code your compiler generates. With gcc you can use the -S flag to get the assembly language version of your program:

gcc -S meuprograma.c -o meuprograma.S

The compiler has a lot of freedom to change the structure of your program, as long as the result is the same as the original. I think you will be surprised by the result in some cases :) Just a tip: to simplify things, write only a function containing the code you are interested in and leave out the input and output.

Note that all of this depends on the compiler you use, the optimization level (-O0, -O1, -O2) and the architecture of your processor (x86, x86-64, ARM, etc.).

1) What is the implementation of <= or >=? How do they differ from <, > and ==? Would the >= comparison be two separate functions or a single one?

Don’t worry about it. Your processor will probably spend the same amount of time on any of these comparisons, and even if it did not, your compiler would probably be able to do the micro-optimizations itself (for example, if(a < b){ XXX }else{ YYY } is the same as if(a >= b){ YYY }else{ XXX }).
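If you want to check this yourself, a small file like the sketch below is a convenient thing to feed to gcc -S. The function names and the file name compare.c are just assumptions for the example; the point is to keep each comparison in its own tiny function so the generated assembly is easy to find.

```c
#include <stdbool.h>

/* One comparison operator per function, so each shows up
   as its own short routine in the assembly listing. */
bool is_less(int a, int b)          { return a < b; }
bool is_less_or_equal(int a, int b) { return a <= b; }
bool is_equal(int a, int b)         { return a == b; }

/* Two spellings of the same branch: the condition is negated
   and the two arms are swapped, so the result is identical. */
int pick_v1(int a, int b)
{
    if (a < b) { return 1; } else { return 2; }
}

int pick_v2(int a, int b)
{
    if (a >= b) { return 2; } else { return 1; }
}
```

Something like gcc -S -O2 compare.c -o compare.s then lets you compare the routines side by side. What you see will vary with compiler, optimization level and architecture.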

3) What is the implementation of do..while(), while() and for()? The latter two seem to be the same thing with a different way of setting the parameters. Do they share the same implementation? How efficient is it to use for() instead of while(), and vice versa?

It’s all equally efficient. Your compiler will convert all the structured control constructs (if, while, for, etc.) into a lower-level control-flow graph and in the end will spit out a soup of unstructured gotos.
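As an illustration of that idea, here is the same sum written as a for, as a while, and as a hand-written goto version. The goto version is only a sketch of the kind of test-and-jump structure a loop boils down to, not actual compiler output, and the function names are invented for the example.

```c
/* Sum of 1..n with a for loop. */
int sum_for(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* The same loop as a while: initialization and increment
   simply move outside and inside the body. */
int sum_while(int n)
{
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* The same loop spelled with explicit jumps: a conditional jump
   out at the test and an unconditional jump back at the end. */
int sum_goto(int n)
{
    int total = 0;
    int i = 1;
loop_test:
    if (i > n)
        goto loop_end;
    total += i;
    i++;
    goto loop_test;
loop_end:
    return total;
}
```

All three return the same value for the same n, and you can run them through gcc -S to see how close the generated code is on your own setup.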


Just a warning for your adventures: it is very difficult to guess how long the computer will take to do each operation, and it is even harder to predict beforehand which part of your program is the most performance-sensitive (it is useless to double the speed of a stretch of code responsible for 1% of the total execution time). Always use a profiler to take empirical measurements of the time spent, and remember that the CPU does not perform all operations at the same speed (for example, nowadays memory access speed tends to be a much bigger bottleneck than the number of operations the CPU performs).
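A real profiler is the right tool for this, but as a rough first step you can time a stretch of code with clock() from the standard library. The sketch below is only an assumption about how you might set that up: work() is a made-up workload standing in for the code you care about, and clock() reports processor time, not wall-clock time.

```c
#include <time.h>

/* Hypothetical workload, standing in for the code under test. */
static unsigned long work(unsigned long n)
{
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; i++)
        acc += i % 7;
    return acc;
}

/* Runs work(n), stores its result in *result, and returns the
   processor time it used, in seconds. */
double time_work_seconds(unsigned long n, unsigned long *result)
{
    clock_t start = clock();
    *result = work(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Using the result of work() matters: if the value is never used, an optimizing compiler is free to remove the loop, and you end up timing nothing.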

  • I wish I could give +1 extra for the last two paragraphs.

0

Answering the first question:

There is no difference as far as the implementation is concerned; possibly the >= or <= comparator "does the job" of two comparators, because it first checks whether the values are equal and then checks whether one is greater or smaller.

Answering the second question:

The getchar() function reads one character and returns an integer, which is either the character code or the value EOF (typically -1), which signals end of file.

The gets function, from the standard C library (stdio), can cause a big problem for the programmer who uses it: since it does not limit the number of characters read from standard input (stdin), there may be a buffer overflow or, even worse, injection of malicious code into the program.

The solution is to use fgets, which limits the number of characters read into the buffer.
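A minimal sketch of such a bounded read is shown below. The helper name and its complete flag are assumptions made for this example; the only library behaviour relied on is that fgets stores at most size - 1 characters plus the terminating '\0'.

```c
#include <stdio.h>
#include <string.h>

/* Reads at most size - 1 characters of one line into buf.
   Returns 1 if something was read, 0 at end of input or on error.
   *complete is set to 1 when the newline was seen (the whole line fit)
   and to 0 when it was not: the line was longer than the buffer, or it
   was a final line with no newline. */
int read_bounded_line(FILE *in, char *buf, size_t size, int *complete)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* strip the newline */
        *complete = 1;
    } else {
        *complete = 0;         /* the rest stays in the stream */
    }
    return 1;
}
```

With a 5-byte buffer, a 10-character line comes back in pieces of 4 characters, and nothing is ever written past the end of buf.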

The function used to read "generic" values (it accepts any primitive type) is scanf.

Answering the third question:

WHILE vs. FOR:

for is typically used for a limited, fixed number of steps, while a while loop can keep going indefinitely.

For example, if I make a loop to add 1 to a variable, already declared, n times, the loop will always run n times. The initial and final result will always be the same.

while should be used when it is the user who sets the initial value of this variable and the loop condition only enforces the maximum value.

To be clearer, with an example:

The program declares x as 1 at the beginning and then runs the for loop 10 times, adding 1 to x each time. The final result will always be 11.

If it is a while loop written so that x counts up to 10 (while x < 10), the final result will always be at most 10. But if the user sets x to 9 initially, the result will be 10 and the loop will run only once.

If the program is fixed, with no user intervention, the for is the better fit. For example, a program that always prints the squares of the integers from 1 to 10.

But if it is a program that calculates the squares of the integers within a range typed in by the user (from x to y, for example), the while is the better fit.

DO WHILE vs. WHILE: the do..while loop assumes that something must be done first, and only then checks the condition to see whether the loop will run once more.
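A small sketch makes the difference visible by counting how many times each loop body runs. The function names are made up for the example.

```c
/* Counts how many times the body runs when the condition
   is tested BEFORE each iteration. */
int runs_while(int x, int limit)
{
    int runs = 0;
    while (x < limit) {
        x++;
        runs++;
    }
    return runs;
}

/* Counts how many times the body runs when the condition
   is tested AFTER each iteration: the body always runs once. */
int runs_do_while(int x, int limit)
{
    int runs = 0;
    do {
        x++;
        runs++;
    } while (x < limit);
    return runs;
}
```

When the condition is already false at the start (x equal to limit, say), the while body never runs and the do..while body runs exactly once. In every other case the two counts are the same.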

SUMMARY:
The for is simpler to write, which can reduce the mistakes the programmer may make. The speed depends on the size of the loop, and there is no significant difference between the two for an equal number of iterations. for has more parts, that is, more built-in features than while: in addition to the condition (which both while and for have), for has the initialization of a variable and its increment.

while can be used in functions that perform an initially unknown number of steps, so it can finish sooner if the user sets a small number of steps. If the number of runs is fixed, it is simpler to use a for.
