How abstract are pointers in C?

I have a view, which from time to time seems wrong, that C pointers are simply and literally memory addresses.

That view starts from the misconception that memory is a linear thing, accessed directly at its absolute positions. Maybe that was true in the days of MS-DOS. I never really learned that part.

In fact the operating system abstracts this into relative addresses that start at 0 (which can be mapped to any absolute address, and nowadays there is also address randomization for security reasons) and go up to a maximum value allocated to that program. Or something like that.

Anyway, I am still studying this, but I ran into this question about pointers. How abstract are they from the point of view of C? Are they a number representing a memory address, or only abstractions that expose this address to the programmer? I think they only expose it, because otherwise it would be enough to use, for example, an int or a long.

And why is it relevant to differentiate a pointer to char from a pointer to int, if both are just an address of a (relative) memory position? For example, in the case of the return value of malloc()?

A valid answer, therefore, is "Keep studying and at some point you will understand".

In fact, if the answer is only that a pointer is an abstraction whose implementation hides some details from the programmer, I can already consider my question answered. Of course, detailing the implementation would be a very interesting addition.

  • The reason to differentiate a pointer to char from one to int is only for type checking during compilation. Both pointers have the same representation in memory, but isn't it good to know what kind of data the pointer is pointing to when reading/writing the code?

  • For type checking at compile time I think it is useful information, yes. But in theory I can circumvent this and declare, for example, int *str = malloc(sizeof(char)); (or vice versa) and the compiler accepts it, doesn't it? I will test it.

  • I do not know the practical consequences of this in the program, but it accepts both: https://ideone.com/qUsO1V

  • The type makes a difference when doing pointer arithmetic: basically &x + 1 takes the address of x and adds sizeof x bytes (not 1). It also makes a difference when reading the value it points to. Ex: https://ideone.com/VoghTL

  • @hkotsubo Very well noted, I didn't know/remember that there was a difference in pointer arithmetic, and also when retrieving content from memory. Two great reasons to include the type in the declaration :)

  • But there is still a doubt: internally, is the pointed-to type part of the pointer? Or is all of this, as was said, only resolved at compile time, after which everything becomes "dangerous" pointers, so to speak, in the sense of memory positions/structures with no content type associated?

  • That I don't know. I think it makes sense for the pointer to carry the type information of what it points to, otherwise it would be difficult to perform operations like arithmetic and dereferencing (is that a word?). But I don't know if the language specification says anything about it (and I'm too lazy to look it up :-p)

  • Supplementary reading: https://stackoverflow.com/q/15151377 | https://stackoverflow.com/q/1352500 | https://stackoverflow.com/q/950972 | https://stackoverflow.com/q/44345148
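
The pointer arithmetic point raised in the comments can be checked with a small sketch. The function names here (int_stride, char_stride, check_strides) are made up for illustration and are not taken from the linked snippets.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes between p and p + 1 when p is an int pointer. */
size_t int_stride(void)
{
    int values[2] = {0, 0};
    int *p = &values[0];
    /* Casting to char * measures the step in bytes instead of elements. */
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes between p and p + 1 when p is a char pointer. */
size_t char_stride(void)
{
    char text[2] = {'a', 'b'};
    char *p = &text[0];
    return (size_t)((p + 1) - p);
}

/* States the claim from the comment: the step equals sizeof the pointed-to type. */
void check_strides(void)
{
    assert(int_stride() == sizeof(int));
    assert(char_stride() == sizeof(char));
}
```

Calling check_strides() from a main should pass silently: the same "+ 1" moves a different number of bytes depending on the declared pointed-to type.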

1 answer


We can say that conceptually pointers are abstract.

In fact almost everything is abstract; the only truly concrete things are excited electrons (people freak out when I talk like this :D) moving among static electrons in the computer's materials. The bit is already an abstract concept.

Things are more abstract or less abstract, but there is always some level of abstraction, at least in things created by humans. We should always think in terms of levels of abstraction, not whether something is abstract or not. Even a word that names something is already an abstraction.

A pointer is a concept that a human created to express that we have an indirection. Most commonly it is in memory, but it does not need to be; it exists in other contexts too. For example, one can have a pointer in a database (which, if people knew how to do it, could bring some savings).

In memory, the context most commonly used in programming, a pointer is in fact represented by a memory address.

From an abstract point of view (i.e., from the moment memory is abstracted) it does not matter whether physical memory is linear or not. That is precisely what abstraction is for.

Measuring the level of abstraction is not always easy; it becomes subjective. What is the minimum and maximum abstraction? Does it go from 0 to 10, or from 0 to 100? Or maybe that is not even how you measure it. We have no parameters for it.

We can say that a pointer is more concrete than a reference.

And the mechanism is a more abstract way of accessing memory.

In the abstract you shouldn't worry much about how pointers are implemented, but the lower the level you work at, the more important it is to know this.

I don't know whether the dichotomy the question implies really exists, or even whether it is relevant. But I will try to answer.

The pointer exists even in Assembly; it is a type that deals with something the processor understands, just like the numeric types. Can you see the level of concreteness? It is more concrete than a string, a date, or even an array, which is more abstract than a pointer because it depends on the pointer to exist.

The way it is put, the question seems to be about the type that a pointer is, not only about the mechanism. And then the doubt about whether it is a concrete type or an abstract type makes sense.

My understanding is that it is more concrete; it is a concept that exists in the computer, and it is not the same as an int or a long. It is not another type composed solely of an integer; it has a form of its own. This is the opposite of what happens with a date, for example, which holds a number of seconds or fractions of seconds represented as an integer or another "concrete" numeric type, or a combination of numeric fields representing years, months, days and perhaps hours and other fractions of a day.

Can an int be interpreted as a pointer? It can, but this is circumstantial; it is not always true.
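
A minimal sketch of that "circumstantial" conversion, assuming the platform provides uintptr_t (it is optional in the C standard, though mainstream compilers have it). The names read_via_integer and check_round_trip are hypothetical. A plain int is often too narrow to hold an address, which is one reason the conversion is not always valid.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trips a pointer through an integer and reads through the result. */
int read_via_integer(int *p)
{
    uintptr_t as_number = (uintptr_t)(void *)p;  /* pointer -> integer */
    int *back = (int *)(void *)as_number;        /* integer -> pointer */
    return *back;
}

/* The round trip through uintptr_t gives back a usable pointer. */
void check_round_trip(void)
{
    int value = 42;
    assert(read_via_integer(&value) == 42);
}
```

The explicit casts are the point: the language makes you say that you are crossing from one kind of type to the other.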

"Pointer type"

Another question was asked that is completely different from the one in the title, perhaps from not understanding this.

Distinguishing what the pointer points to has nothing to do with the pointer itself. Saying what it points to matters in order to know the size of the pointed-to object and how many bytes an access should cover, or to calculate the position in memory of an element of a sequential structure (array).
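
As a rough illustration of both uses of the pointed-to type, the sketch below assumes 8-bit bytes and uses made-up helper names (first_byte_of, element_address, check_pointed_type_matters).

```c
#include <assert.h>
#include <stddef.h>

/* Reads only the first byte of an unsigned by going through unsigned char *. */
unsigned first_byte_of(const unsigned *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    return *bytes;  /* one byte; which one depends on endianness */
}

/* Computes the address of element i by hand: base + i * sizeof(element). */
const double *element_address(const double *base, size_t i)
{
    const char *raw = (const char *)base;
    return (const double *)(const void *)(raw + i * sizeof *base);
}

void check_pointed_type_matters(void)
{
    unsigned value = 0x1122u;
    double samples[4] = {1.0, 2.0, 3.0, 4.0};

    assert(*(&value) == 0x1122u);             /* unsigned * reads the whole object */
    assert(first_byte_of(&value) != 0x1122u); /* unsigned char * reads one byte    */
    assert(element_address(samples, 3) == &samples[3]);
}
```

The address is the same in both reads of value; only the declared pointed-to type changes how much memory the access covers. The hand-computed element address is what the compiler derives for samples[3].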

The compiler needs this information to make choices and generate proper code, and in some cases to enforce typing (not that much, because C has weak typing, but some implementations have options to strengthen typing, and then it becomes even more important).

The malloc() issue is just a case of what I described above, but it is orthogonal to the pointer or its type: malloc() does not need to know the size of the pointer or its pointed-to type.

malloc() allocates bytes, always; sizeof can be used to get the size of something. sizeof can be used to get the size of the pointer because, depending on the architecture, its size is different. But it makes no sense to take the size of the pointer according to the type, because the size of the pointer does not change; what changes is the size of the pointed-to object.

sizeof(char) makes no sense because the language specification guarantees it is 1. (sizeof is an operator, not a function: the parentheses are required around a type name and optional around an expression.) Those who write it are usually flat-earthers who think it may change one day :D.

Being accepted doesn't mean anything; C accepts almost everything. Allocating 1 byte (char) is as accepted as allocating typically 4 bytes (int, which may have another size), but the amounts allocated are quite different. This has nothing to do with the pointer.
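
A hedged sketch of the malloc()/sizeof relationship described above. The helper names make_ints and check_sizes are invented for this example, and the multiplication is not checked for overflow.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates room for count ints; sizeof *p follows the pointed-to type. */
int *make_ints(size_t count)
{
    int *p = malloc(count * sizeof *p);  /* bytes = count * sizeof(int) */
    return p;                            /* NULL if the allocation failed */
}

void check_sizes(void)
{
    int *p = make_ints(4);

    assert(sizeof(char) == 1);         /* guaranteed by the C standard */
    assert(sizeof p == sizeof(int *)); /* size of the pointer itself, not of the block */

    if (p != NULL) {
        p[3] = 7;                      /* the last of the four ints is usable */
        assert(p[3] == 7);
        free(p);
    }
}
```

malloc() only ever sees a byte count. Writing sizeof *p rather than sizeof(int) keeps that count tied to whatever type p is declared to point to.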

The complete data type when using a pointer is "pointer to something"; it is never just pointer, and it is not just the pointed-to type. You can make a pointer to nothing (void), which in practice means it can point to anything; you give up the size of the pointed-to object.
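
To make the void case concrete, here is a small sketch with hypothetical names (copy_object, check_void_pointer). Because void * carries no size, the caller has to pass it separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies one object of any type; the caller supplies the size void * lacks. */
void copy_object(void *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
}

void check_void_pointer(void)
{
    double from = 2.5, to = 0.0;
    void *anything = &from;    /* any object pointer converts to void * */
    double *typed = anything;  /* converting back restores the pointed-to type */

    copy_object(&to, &from, sizeof from);
    assert(to == 2.5);
    assert(*typed == 2.5);
}
```

A void * cannot be dereferenced directly; it has to be converted back to a typed pointer first, as typed does here.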

In C, pointers tend to be relatively safe if you use them right. In Assembly it is pure arithmetic, but if the addresses were generated by a C compiler they work perfectly.

Just remember that the C specification, not only on this point, leaves a lot open for implementations to do as they wish, but in practice it works the way I explained here.

There are interesting additions in the comments above and below.

  • We can call it flat-earthism, but couldn't sizeof(char) be a matter of expressing what one wants to do, at least for educational purposes? For example, char *new_str = malloc(sizeof(char) * (size + 1));? It's just that I saw some material doing this.

  • You can, but 1 does the same, because you only ever use 1 for that. And it only makes sense when it really is a char, yet it is very common to use it to mean a byte, that is, to say something that is not what you mean, even though the result is the same. I would not use this example; it is noise, and the code may well have another problem before that line :D

  • "Pointer size does not change; what changes is the size of the pointed-to object". That translates to saying that malloc(sizeof(int *)) (or whatever the exact syntax is) does not take into account the int, which is the type the pointer points to, but only the fact that it is a pointer, whose size is basically determined by the architecture, correct?

  • That's right. That's it. The same goes if you put void *.

  • "C accepts almost everything" and "This has nothing to do with the pointer". So we can basically say that malloc has only an incidental relationship with pointers. Saying that a pointer is an ADT (abstract data type) that abstracts a memory address (which is already an abstraction by the operating system), purposely allowing other addresses to be hacked, I think settles the question.

  • Yes, it returns a pointer to the location it has reserved, and that is all. Any other relation to pointers has nothing to do with malloc(); it may have to do with sizeof. I didn't say it's an ADT; I said the opposite.

  • Hahaha, so I got it all wrong. :D And a reference, is it an ADT? Or were ADTs cited as counter-examples?

  • A reference is something more complicated, but I think it doesn't quite amount to an ADT; in practice a reference is only conceptual. It could be an ADT whose only member is a pointer (or it could even have something else). It gets a bit complicated because it is built into the language, but I would say it is not wrong to call it an ADT in disguise. Pointers exist even in Assembly.
