The main function and its arguments: how do the parameters passed on the terminal end up in its argc and argv[] arguments?

1

To make my question clearer: suppose my program, called main, needs to receive any number of command-line parameters.

To do that, I would have to write this:

int main( int argc, char *argv[] ){
    ...
    return 0;
}

Correct?

Then I compile it and run it, passing the parameters.

$ gcc main.c -o main
$ ./main par1 par2 ... parn

I would like to know how the main function can take the number of parameters passed (which can be arbitrary) and put it in the argc variable, and put all the parameters themselves in the argv[] array.

Furthermore, is it possible to reproduce this behavior in an ordinary function? If so, how? Could you provide an example?

I ask this question because I’m studying C and I want to understand this behavior.

3 answers

3

First, read "What is the difference between parameter and argument?".

It is hard to explain this without writing a book about operating systems, how an executable works, and memory management.

Also, the arguments do not have to come from a terminal.

The arguments are passed by the operating system. It is its job to take the values from the command line that invokes the executable and put them in memory where the executable can read them. It does this in a standardized way, so that every executable that follows the convention can get the information, and there is nothing secret about it.

From the function's signature we already know that there will be two pieces of information. One is an integer that tells how many arguments are available, that is, the size of the array that comes next. The other is an array of pointers. Again, it is the operating system that puts these values somewhere in memory, which I will get to shortly.

The arguments themselves are also placed in memory: the texts that were typed, each terminated by a null character, as every C string must be. Yes, the operating system follows what C determined. Any other language has to work the same way; at least that is the practice, although nothing would prevent it from being different. The addresses where each text was placed are the addresses stored in the array of pointers.
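To make that layout concrete, here is a minimal sketch. The helper names count_until_null and show_args are hypothetical, made up for this illustration. The one fact relied on is that in a hosted C program the array received by main() ends with a null pointer, that is, argv[argc] is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the strings in an argv-style array by
   walking the pointers until the terminating null pointer is found. */
int count_until_null(char *argv[])
{
    assert(argv != NULL);
    int n = 0;
    while (argv[n] != NULL)
        n += 1;
    return n;
}

/* Hypothetical helper: prints each slot of the pointer array and the
   null-terminated text that the slot points to. */
void show_args(int argc, char *argv[])
{
    for (int i = 0; i < argc; i += 1)
        printf("argv[%d] -> \"%s\"\n", i, argv[i]);
}
```

Inside main() you could call show_args(argc, argv), and count_until_null(argv) should give the same value as argc.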

All this data is placed in an area that belongs to the executable. The operating system's memory management takes care of that; controlling all this is its job. The data you see directly in the function signature goes in a special area that we can informally call the pre-stack. This is an implementation detail, though, and it does not have to be exactly like this. As a programmer, unless you are writing the language's entire runtime mechanism, you do not need to know any of it.

But understand that main() is not very special. It receives the values in its parameters in the same way as any other function, and those parameters are nothing more than local variables of the function.
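A small sketch of the "local variables" point, assuming a hypothetical function skip_program_name that has the same kind of parameters as main(). Changing argc and argv inside it only changes the function's own copies; the caller's array is untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordinary function with main()-style parameters.
   It steps over the first string (the program name), stores a pointer
   to the first real argument in *first (NULL if there is none), and
   returns how many arguments remain. */
int skip_program_name(int argc, char *argv[], char **first)
{
    assert(argv != NULL && first != NULL);
    if (argc > 0)
    {   /* these assignments change only the local copies */
        argc -= 1;
        argv += 1;
    }
    *first = (argc > 0) ? argv[0] : NULL;
    return argc;
}
```

main() could simply forward what it received, for example skip_program_name(argc, argv, &first), and its own argc and argv would still be the same afterwards.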

main() is an ordinary function. The only difference is that it is called by the operating system. And if you want to know whether you can make that happen for another function: no, you can't, and it doesn't make sense.

If something here is unclear, you have probably skipped a step and are trying to learn something more advanced without the fundamentals it depends on.

1

The short answer is yes.

You can do this. And as you can imagine, it's a very common need. By coincidence, yesterday I posted a program on this site that does exactly that, and the link is this: for the post in C

However, that example shows the case of an int vector.

I even saw an answer here that says quite the opposite, that "not only can't you, it doesn't make sense", but I don't think I understood it, or I don't know what to say.

So I'm going to show a program and explain the mechanics, because, I repeat, it's a common need and I've explained it countless times. And I'm not an instructor or anything like that. Or maybe I don't understand anything and I'm posting a program for nothing.

Why this is common:

It fits a large number of abstractions, such as

  • creating and traversing a vector of structures after reading the total number of structures from somewhere
  • loading a file into memory as a char** linha, with linha[i] of course corresponding to line i of the file on disk; a text editor, for example
  • loading records from a CSV file, the kind with one record per line and fields separated by commas. A CSV is actually a char***, but that would be another topic
  • loading something like a spreadsheet (Workbook) into memory as text. Underneath, this would be a scary char****, but back to the topic
  • for a student, starting most of those data structure programs by loading the nodes of whatever it is into memory before, or in order to, build the structure

The Example Program

The program will prepare the arguments and call a declared function

int nova_main(int, char**);

with this VERY FAMILIAR implementation

int nova_main(int argc, char** argv)
{
    printf("\n\tEm \"main\": %d argumentos\n\n", argc);
    for (int i = 0; i < argc; i += 1)
        printf("%8d\t'%s'\n", i, argv[i]);
    return 0;
}   // nova_main()

It is very familiar because it has the prototype of main(). Let's even set up argv[0] as the "program name".

The program uses this structure

typedef struct
{
    int     argc;
    char**  argv;
}   Bloco;

This is just so you understand that you can declare an array of them and prepare an arbitrary number of argument lists. It also saves you from hunting for declarations through the program, even though it's tiny.

To keep the program compact and self-contained the input will be a vector

    const int   n_parm = 12;
    const char* teste[12] =
    {
        "criando", "um", "bloco", "de",
        "parametros", "como", "o", "sistema",
        "prepara", "para", "main()", "..."
    };

I had used a file at first, but I thought it better to keep everything together. After all, it makes no difference: they are just strings, as the arguments on the command line would be.

Memory allocation for arguments

The logic is very naive but very efficient: memory is allocated in blocks of pointers whose size is set by this constant

#define _TAMANHO_ 4

When the block runs out it is extended by the same amount and life goes on. After the last argument, any excess is released and the block is passed to the function nova_main(). Just to be polite, at the end the parameters are freed in main() before the program exits.

In a serious program you choose the size of the allocation block with great care: if it is too small the program reallocates all the time and can be slow, and if it is too big memory is wasted. Alternatively, you allocate in pages and use a linked list of pages, playing operating system.
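The growth step on its own can be sketched like this. It is not the program from this answer, only an illustration: the names BLOCK, ArgList, arglist_push and arglist_free are hypothetical, with BLOCK playing the role of _TAMANHO_ and ArgList being a Bloco that also remembers how many pointers are allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK 4  /* hypothetical name; plays the role of _TAMANHO_ */

typedef struct
{
    int    argc;      /* pointers in use */
    int    capacity;  /* pointers allocated */
    char** argv;
}   ArgList;          /* hypothetical type, in the spirit of Bloco */

/* Appends a copy of text, growing the pointer block by BLOCK when it
   is full. Returns 0 on success, or -1 if an allocation fails, in
   which case no string is added. */
int arglist_push(ArgList* list, const char* text)
{
    assert(list != NULL && text != NULL);
    if (list->argc >= list->capacity)
    {   // block is full: ask for BLOCK more pointers
        int    bigger = list->capacity + BLOCK;
        char** novo = realloc(list->argv, (size_t)bigger * sizeof(char*));
        if (novo == NULL)
            return -1;  // the old block is still valid
        list->argv = novo;
        list->capacity = bigger;
    }
    char* copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    list->argv[list->argc] = copy;
    list->argc += 1;
    return 0;
}

/* Frees every string and then the pointer block itself. */
void arglist_free(ArgList* list)
{
    assert(list != NULL);
    for (int i = 0; i < list->argc; i += 1)
        free(list->argv[i]);
    free(list->argv);
    list->argv = NULL;
    list->argc = 0;
    list->capacity = 0;
}
```

Starting from an ArgList with argc 0, capacity 0 and argv NULL works because realloc() with a null pointer behaves like malloc(). Checking the value returned by realloc() before overwriting the old pointer is what keeps the existing block usable if the request fails.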

Here is the output of one run of the example program

Realocado bloco para 8 ponteiros
Realocado bloco para 12 ponteiros
Realocado bloco para 16 ponteiros
        13 strings no vetor de argumentos:
                1 de 13: 'nome do programa'
                2 de 13: 'criando'
                3 de 13: 'um'
                4 de 13: 'bloco'
                5 de 13: 'de'
                6 de 13: 'parametros'
                7 de 13: 'como'
                8 de 13: 'o'
                9 de 13: 'sistema'
                10 de 13: 'prepara'
                11 de 13: 'para'
                12 de 13: 'main()'
                13 de 13: '...'
        Alocados 16 ponteiros
        Lidos 13 argumentos
        3 ponteiros a liberar
        Bloco reduzido para 13 ponteiros
        Chamando nova_main() com esses argumentos

        Em "main": 13 argumentos

       0        'nome do programa'
       1        'criando'
       2        'um'
       3        'bloco'
       4        'de'
       5        'parametros'
       6        'como'
       7        'o'
       8        'sistema'
       9        'prepara'
      10        'para'
      11        'main()'
      12        '...'


        "main()" retornou 0
        Agora apaga o bloco todo e encerra


Fim

Here is the program

#define _TAMANHO_ 4

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int     argc;
    char**  argv;
}   Bloco;

int nova_main(int, char**);

int main(void)
{
    const int   n_parm = 12;
    const char* teste[12] =
    {
        "criando", "um", "bloco", "de",
        "parametros", "como", "o", "sistema",
        "prepara", "para", "main()", "..."
    };

    // memory is allocated in blocks of _TAMANHO_ pointers
    // each time memory runs out a new block is
    // added. At the end the block is trimmed
    // to exactly the size in use.
    // E.g. for 12 strings with _TAMANHO_ 5 it allocates
    // 5 + 5 + 5, at the end frees 3 of the 15, and returns
    // argc = 12 and the 12 pointers, as expected
    //
    // the first argument is the program name:
    // purely cosmetic, this is just an example

    Bloco ex; // example
    int N = _TAMANHO_; // size of the first block
    ex.argc = 0;
    ex.argv = (char**) malloc(sizeof(char*) * _TAMANHO_);
    if (ex.argv == NULL) return 1; // allocation failed
    const char* programa = "nome do programa";
    ex.argv[ex.argc] = (char*)malloc(1 + strlen(programa));
    if (ex.argv[ex.argc] == NULL) return 1;
    strcpy(ex.argv[ex.argc], programa);
    ex.argc += 1;

    for (int i = 0; i < n_parm; i += 1)
    {   // load each string
        if (ex.argc >= N)
        {   // block is full: grow it by _TAMANHO_ pointers
            N = N + _TAMANHO_;
            char** novo = realloc(ex.argv, (N * sizeof(char*)) );
            if (novo == NULL) return 1; // out of memory
            printf("Realocado bloco para %d ponteiros\n", N);
            ex.argv = novo;
        }
        ex.argv[ex.argc] = (char*)malloc(1 + strlen(teste[i]));
        if (ex.argv[ex.argc] == NULL) return 1;
        strcpy(ex.argv[ex.argc], teste[i]);
        ex.argc += 1;
    }   // for()
    printf("\t%d strings no vetor de argumentos:\n", ex.argc);
    for (int i = 0; i < ex.argc; i += 1)
    {
        printf("\t\t%d de %d: '%s'\n", 1+i, ex.argc, ex.argv[i]);
    }
    // now tidy up the end, releasing the pointers that
    // may be left over in the block:
    // N pointers were allocated, argc were used
    printf("\tAlocados %d ponteiros\n", N);
    printf("\tLidos %d argumentos\n", ex.argc);
    if (N == ex.argc)
        printf("\tNada a liberar\n");
    else
    {
        printf("\t%d ponteiros a liberar\n", N - ex.argc);
        char** novo = realloc(ex.argv, (ex.argc * sizeof(char*)));
        if (novo != NULL)
        {   // if shrinking fails the old block is still valid
            ex.argv = novo;
            printf("\tBloco reduzido para %d ponteiros\n", ex.argc);
        }
    }
    printf("\tChamando nova_main() com esses argumentos\n");
    int res = nova_main(ex.argc, ex.argv);
    printf("\n\n\t\"main()\" retornou %d\n", res);
    printf("\tAgora apaga o bloco todo e encerra\n");
    for (int i = 0; i < ex.argc; i += 1)
        free(ex.argv[i]);
    free(ex.argv);
    printf("\n\nFim\n");
    return 0;
}

int nova_main(int argc, char** argv)
{
    printf("\n\tEm \"main\": %d argumentos\n\n", argc);
    for (int i = 0; i < argc; i += 1)
        printf("%8d\t'%s'\n", i, argv[i]);
    return 0;
}   // nova_main()

Without getting into religious discussions here: I compiled only with CL 19.27 and ran it in the Windows terminal. It's what I have most readily available.

0

Basically, the command line is split at the space character " ". This is done by the shell or by the runtime startup code before main is called, not by the compiler. That is how the number of arguments is counted (not forgetting that ./main itself is the first argument of the program). So if you need to pass an argument that contains spaces you have to put it in quotes. For example, to pass the name José Barros you have to write "José Barros"; otherwise it will count as 2 arguments because of the space between the names. Once the number of arguments is known, an array of pointers to char is created, in which each pointer corresponds to one of the arguments passed.

I think with this information it is easier to create a function that does the same. A useful function for this kind of parsing is strtok.
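A minimal sketch of that idea, assuming a hypothetical function split_by_spaces. It only splits at spaces; it does not handle quotes, so "José Barros" would still become two arguments here. Note also that strtok writes null characters into the string it is given, so the line must be a modifiable array and not a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place at spaces and stores up to
   max_args pointers to the pieces in argv. Quotes are NOT handled.
   Returns the number of pieces stored. */
int split_by_spaces(char *line, char *argv[], int max_args)
{
    assert(line != NULL && argv != NULL && max_args >= 0);
    int   argc = 0;
    char *token = strtok(line, " ");
    while (token != NULL && argc < max_args)
    {
        argv[argc] = token;         /* points inside line, no copy made */
        argc += 1;
        token = strtok(NULL, " ");  /* continue from where it stopped */
    }
    return argc;
}
```

The pointers stored in argv point into the original line, so the line has to stay alive for as long as the pieces are used. strtok also keeps internal state between calls, so this sketch should not be interleaved with other strtok parsing.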
