What is the difference between pointer and reference?


58

One of the first things I learned about Java is that the language "has no pointers, only references", followed by some generic statements about how the former are complex and the latter are simpler. But I never understood exactly what the difference between the two is. Is there any difference, or are they conceptually the same thing?

To clarify, I know that neither Java nor most languages that only support references allow arithmetic on those references. Something like:

Object x = ...;
x++;

is not allowed. But is this just a restriction on the operations that the language allows on what is really the same type (a pointer)? Or are they in fact different concepts?

5 answers

39


TL;DR

A pointer is a low-level abstraction that holds the memory address of some object. The address is its focus, and the application can manipulate this value freely, like any other data. Many languages hide pointers completely.

[Figure: a pointer shown in memory]

A reference is a more abstract concept and is not always visible in the language. It forms a pair made up of the object, which is its focus, and the reference that indicates where the object is. The reference may be implemented with a pointer. Manipulation of the reference itself is limited, in part because the mechanism favors manipulating the object.

They are two ways of conceptualizing something very similar, and often identical. This has implications for the details of use, but they serve the same purpose.

Introduction

I do not have information based on reliable published studies, so I base my answer mainly on Wikipedia and on what I have learned over the years.

I will try to give a universal interpretation, which seems to me to be the intent of the question, rather than one tied to a specific technology, and I will point out when something is specific to one language without going into details.

Pointer

A pointer is a more concrete construct. It exists even at the lowest level of programming: a processor handles pointers. According to Wikipedia, a pointer ("ponteiro", or "apontador" for the guys and girls of the old continent) is a value that refers to another value stored in another area of memory.

I don't like this definition very much because it implies that pointers only point to data. Granted, anything on a computer can be considered data in some sense, even code, but that is not an unambiguous interpretation. So I prefer to see the pointer as something that points to some area of memory. I do like that the definition mentions memory. It seems to me that the concept of a pointer is specific to memory. Although I myself use the term in other situations, it seems more appropriate to use it only when dealing with memory.

Some languages allow direct manipulation of pointers to a greater or lesser degree. This is one of the criteria that help classify languages by their level of abstraction. If the language manipulates pointers in a very liberal, concrete and direct way, it is lower level, as is the case with assembly languages. Other languages such as C, C++, D, Rust, Go and Pascal, to name just a few, also make liberal use of pointers, but with some limitations and each with less encouragement, providing other, more abstract ways to get the same result. Other languages abandon pointer manipulation completely, such as Java, Python and JavaScript. Others impose major limitations, such as C#. These languages are higher level.

C is probably the mainstream language that most encourages, and that most popularized, the use of pointers. In fact, C has no references (directly), and many definitions of what a pointer is are conflated with how pointers are used in C.

Pointers are always values, so pointer-typed parameters are passed by value; that is, the address stored in the pointer is copied. The pointer is independent of what it points to. When you refer to the pointer, you are referring to the address it contains. If you want the object at that address, you need to say so explicitly.
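As a rough illustration of the paragraph above, here is a minimal C++ sketch. The function names (`retarget`, `write_through`, `pointer_is_passed_by_value`) are made up for this example. It shows that the address is copied into the parameter, so reassigning the parameter does not affect the caller, while reaching the object takes an explicit dereference.

```cpp
#include <cassert>

// The parameter p is a copy of the caller's address.
// Reassigning it only changes the local copy.
void retarget(int* p, int* other) {
    p = other;  // the caller's pointer is unaffected
    (void)p;    // p is intentionally discarded
}

// Dereferencing is explicit: *p names the object at the stored address.
void write_through(int* p, int value) {
    assert(p != nullptr);  // a pointer may hold no valid address
    *p = value;
}

// Hypothetical driver: returns true when both observations hold.
bool pointer_is_passed_by_value() {
    int a = 1;
    int b = 2;
    int* p = &a;

    retarget(p, &b);               // p still points to a
    bool same_target = (p == &a);

    write_through(p, 10);          // a is modified through the pointer
    return same_target && a == 10 && b == 2;
}
```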

It is common to use pointers for performance, since they are close to what the machine actually does. One of the common pointer operations is arithmetic, which makes access to memory positions easy and fast. Of course, there is always the risk of an invalid operation causing unwanted results. Pointers are usually not safe to manipulate indiscriminately, although the flexibility to perform even potentially invalid operations is what gives them great power. With great power comes great responsibility: the programmer has to know exactly what to do and how to do it. The pointer automates nothing.
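To make the arithmetic concrete, a small C++ sketch follows. The function names are hypothetical, and the code assumes the caller passes a correct element count. It walks an array by advancing the pointer itself, which is exactly the kind of operation that is fast but unchecked.

```cpp
#include <cassert>
#include <cstddef>

// Sums `count` ints starting at `first` by advancing the pointer itself.
// Nothing verifies that `count` matches the real array size: a larger
// value would read memory that does not belong to the array.
int sum(const int* first, std::size_t count) {
    assert(first != nullptr);
    int total = 0;
    const int* end = first + count;  // address one past the last element
    for (const int* p = first; p != end; ++p) {
        total += *p;
    }
    return total;
}

// first + n moves by n elements, not by n bytes.
int third_element(const int* first) {
    assert(first != nullptr);
    return *(first + 2);  // same as first[2]
}
```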

But make no mistake, the pointer is still an abstraction. The pointer is a way of referencing objects in memory.

Reference

A reference seems to me to be something broader. You may have references to something that goes beyond memory. Even in memory you may have references that, by themselves, cannot be considered pointers. A reference, as the name says, refers to something, but that something can be less constrained. The Wikipedia article says that it is an object containing information that indicates data stored somewhere else, rather than containing the data itself.

A reference is composed of two parts: an address indicating where the data is, and the data itself. This is different from the pointer, which has no direct relationship with the data. You cannot do many operations on references; there is no arithmetic, for example. You can change the value of the reference, but you cannot manipulate it freely. Of course, in memory a direct reference will probably be implemented via a pointer, but it is possible to have a reference that is not a pointer.

The reference is a more abstract concept, so the level of abstraction may vary depending on how it is used, and so may the way you deal with it. In some cases the programmer might not even notice that it is a reference.

One thing I learned from reading Eric Lippert's excellent articles is that a reference should be interpreted as an alias for a piece of data, for an object.

That is, it is just a name we give to the object, and you can have as many aliases for it as you want. Having only an alias has implications for how the data is manipulated. Of course, internally this alias will probably be handled with pointers, but the form is different and, more importantly, it is not the programmer's problem.
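The alias idea can be sketched with C++ reference variables. This is only an illustration: the function name `through_aliases` is invented, and other languages expose aliases differently. Every alias designates the same object, and no special operator is needed to reach it.

```cpp
#include <cassert>

// Hypothetical demo: writes through two aliases and returns the final value.
int through_aliases() {
    int value = 1;
    int& alias = value;     // a second name for `value`, not a new object
    int& another = alias;   // as many aliases as you want

    // All names designate one object, so they share one address.
    assert(&alias == &value);
    assert(&another == &value);

    alias = 5;              // writes to `value`, no dereference operator
    another += 1;           // still the same object
    return value;
}
```

Calling `through_aliases()` returns 6, because both writes landed on the single underlying `int`.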

When we say that we pass something by reference, it means that we pass the value of the reference (the reference is copied), but our intent is to pass the data it refers to. Even lower-level languages that have their own reference mechanisms commonly give no direct access to the address held by the reference, because what matters is the referenced data, not the address. Hence references are much safer than pointers. When we access a reference, it is implied that we want to access the referenced object. It is a matter of clear semantics.

Very low-level languages, like assembly and C, have no mechanisms that deal with references; you simulate them with pointers. In others, such as C++ and D, references can be used explicitly. In the highest-level languages their use is so opaque that programmers can use them without being aware that they are using a reference. No matter how much the language hides the implementation, a reference is much easier to manipulate than a pointer. Because it is less flexible, it gives more guarantees, and many concerns become unnecessary. Even the compiler can benefit, since the guarantees provided by the mechanism let it optimize more aggressively, which can yield a performance gain.

References often carry additional information besides the address of the other part. At least its size and the type of the data it contains are common, even if, in the simplest languages, that information is only available at compile time.

C++ is probably one of the languages where the distinction matters most, since it explicitly has both pointers and references. In C++, pointer and reference are data types. Of course, some languages have references embedded in other mechanisms, as is the case with a slice in Go. And it is not that C, for example, has no references; it just has no specific mechanism for handling them. They are conceptualized and manipulated entirely at the programmer's discretion.

Both pointers and references are forms of indirection.

Mixing up the terms

Everything we do with data structures uses references. Until it needs to be made explicit that a reference is implemented through a pointer, "reference" should be the preferred term. I and half the world swap the terms when the one we pick is not the most appropriate, without causing great harm or misunderstanding. In documentation and other formal publications, this mix-up should not occur.

The intention of this answer is not to exhaust the subject, but only to sort out the difference between the terms, since we all use them all the time and do not always stop to think about exactly what they mean.

Conclusion

As I said at the start, I cannot state this with authority: I have never read anything canonical and indisputable on the subject, and I would like to see something like that. What I can do is show that they are different concepts: one is a concrete mechanism, powerful and flexible, and the other is more abstract, safer (more automated), more universal, and easier to understand.

A reference, in higher-level languages such as Java, is a more abstract concept that denotes data accessed indirectly, and in most of them it is not even a mechanism accessible to the programmer. The pointer is one of the most concrete mechanisms used to implement a reference. It is important to have some awareness of how references work to avoid surprises, but it is not necessary to understand their inner workings deeply. We can say that in these languages pointers, strictly speaking, should not even be mentioned. Not that I am preaching that programmers should be kept in the dark.

I do not know if I fully answered the question, and I know that this answer cannot be considered definitive. Not that it is wrong, either. But I believe that the links provided are a good starting point for finding more information on the subject.

  • 1

    Difficult to choose, but I accepted this answer because it is not only quite complete but - although extensive - clearly highlights the main difference between the concepts in my opinion (see the parts in bold). The other answers are also good, in particular that of Miguel Angelo - more concise and didactic, but less precise and more focused on C#.

  • I think exactly the same :P I did not improve mine afterwards because it would look like envy :) But you should have chosen the least-voted one, to put an end to the seriousness of the *site once and for all. Humor is good at the right time and place. An answer was not the best place, but the time was right. Now the moment has passed, and making it permanent was a bad decision. What was meant to bring lightness ended up bringing weight.

28

Both pointers and references deal with something I'll call the "pointing phenomenon". It is just like electromagnetism, where we have a single foundation and two ways of observing its effect.

I'm going to talk about this from the point of view of C#. Still, I believe this point of view carries over to other languages without too much friction.

Pointer

From the point of view of C#, at least, a pointer is a data type that holds a memory address and must either point to a given structure or be null. If it is not null, the pointer is considered invalid when the structure at that address is not of the correct type. It is also possible to have void pointers, which point to something of undefined type (i.e., they are mere memory addresses).

  • void pointer: void*

    [Figure: void pointer]

Pointers allow the address to be changed with mathematical operations, and in general they look more like an integer than like an object.

[Figure: adding numbers to a pointer]

To access the pointed-to object, you need to use the * or -> operator. The first yields the pointed-to structure as a value, and the second gives direct access to a member of the pointed-to structure.

[Figure: focusing on the pointed-to object with *, and going straight to a member with ->]
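The same two operators exist in C and C++, so here is a minimal C++ sketch of the idea, offered as an analogy to the C# behavior rather than C# code. The `Point` struct and the function names are invented for illustration.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
};

// *p yields the whole pointed-to structure as a value (here, a copy).
Point copy_of(const Point* p) {
    assert(p != nullptr);
    return *p;
}

// p->x goes straight to a member; it is shorthand for (*p).x.
void move_right(Point* p, int dx) {
    assert(p != nullptr);
    p->x += dx;
}
```

Because `copy_of` returns a value, later changes made through `move_right` do not show up in the copy.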

Since pointers allow any numeric value to be assigned to the variable, they can only be used in an unsafe context, marked with the keyword unsafe. The reason is that the type of the data at the memory address the pointer holds may not match the pointer's type:

[Figure: the pointed-to type does not match the pointer's type]

With pointers it is possible to create a chain of indirections that leads to the final object.

  • DateTime****

    [Figure: a chain of pointers ending at a final object]
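A chain like this can be sketched in C++ as well. The example below is an assumption-laden simplification: it uses `int` instead of `DateTime`, three levels instead of four, and a hypothetical function name `follow`.

```cpp
#include <cassert>

// Follows a chain of three pointers to reach the final int.
int follow(int*** ppp) {
    assert(ppp != nullptr && *ppp != nullptr && **ppp != nullptr);
    return ***ppp;  // one * per level of indirection
}
```

Each level stores the address of the level below it, so a change to the final object is visible from the top of the chain.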

Reference

A reference is a data type that has a pointer behind the scenes, but it does not allow mathematical operations and gives direct access to the pointed-to object.

[Figure: the address is hidden from the programmer]

In C# there are two forms of reference:

  • when passing a variable to a function

    void Metodo(ref int valor)
    
     == or ==
    
    void Metodo(out int valor)
    

    With these references it is not possible to change the numeric value of the address; assigning a value to the reference changes the pointed-to value, not the reference itself.

    [Figure: making x = y]

  • using reference types (classes, in C#)

    class Xpto // Xpto is now a reference type
    

    With reference types it is possible to assign a new address by using another reference of the same reference type, or by setting the value to null. These are the only ways to change the numeric value of the pointer that sits under the hood of a reference type.

    [Figure: assigning class instances]

C# allows a double reference only when a reference type is passed to a method by reference.

  • void Metodo(out Xpto classeXpto)

The structure is accessed without any special operator. The reference is used as if it were the referenced object itself.

Comparisons

  • Both references and pointers allow changing the pointed-to object without changing the numeric value of the address that represents it. Therefore both can be used for "passing by reference", where the caller passes something for a method to change.

  • Both have a memory address under the hood.

Same phenomenon, two points of view

Pointers and references can be understood as two ways of observing the same "pointing" phenomenon. Pointers observe it from the origin of the pointing. References observe it from the destination of the pointing.

[Figure: the pointer observes from the origin, the reference observes from the destination]

A duality over the same foundation.

14

Pointers are variables that store memory addresses and allow reference to them.

References refer to specific objects or variables, totally abstracting the place and form of storage.

As pointers store (and allow you to manipulate) addresses, the languages that make pointers visible give you some properties like pointer arithmetic, aliasing (treating the same memory address as different data types) and access to invalid addresses (although the latter is normally undesirable).

As higher-level languages tend to abstract memory allocation away, it is convenient to abandon or minimize the pointer concept and use more abstract concepts, such as reference variables in C++ (which is not even that high-level):

void bleah(int &a)
{
  a = 2;
}

void test()
{
  int var = 1;
  bleah(var);
  // at this point var has the value 2.
}

In C++ a reference variable means "create an alias (nickname) for the same object/variable" in another scope. Whether the object was allocated on the heap or is an integer in a register with no memory address does not matter: all of this is abstracted away and managed by the compiler.

In C and C++, parameters are passed to functions by value by default (and the value may be a pointer). In C++, a reference variable indicates that you want access to the same object without copying it and without exposing a pointer, providing a specific mechanism for passing parameters to functions by reference.

The "trick" is that you already have a variable, which represents an object, and the language allows you to reference it without having to worry about its location.

So there are two concepts here: "reference variables" and "passing by reference". The first is specific, the second is more general.

ISO/ANSI C, for example, has no reference variables, and parameters are passed by value (although the value may be a pointer, and a pointer lets you refer to an object as long as it is stored in a memory location).

In C++, a reference variable has neither the syntax nor the semantics of a pointer, although you can take the address of a reference variable and store it in a pointer.

Between pointers and reference variables there is at least one other substantial difference: you usually need an object of the correct type already declared (and often already initialized) in order to pass it by reference. Since you do not manipulate the address, in theory this avoids "wild" pointers (whose value points to the wrong place, including a non-existent area, a common bug when using pointers).

The reference variable in C# is similar. If you try to pass a null pointer where a reference is expected (which is possible through unmanaged code or P/Invoke), the runtime will usually throw an exception. This is intentional because references, as a rule, need to be valid!

In Java, there is no "reference variable" type. Arguments are not passed by reference but by the value of the reference, and this is almost totally abstracted away.

Since you do not manipulate addresses in Java, every variable of an object type is implicitly a variable that stores a reference, but the semantics are different, since you do not access the reference itself, only the referenced object.

When you pass a Java object between functions, you are passing "the value of the reference" in a new variable. If you try to write the example above in Java, you will have to remove the "&" from the parameter declaration and, regardless of the type of object passed to the function, you will get the following behavior:

void bleah(Object a)
{
  // This does not change the object referenced by a: it makes a reference another object.
  // The local variable "a" has no relation to the variable in the calling function
  // (except that the value originally assigned to it was the same).
  a = new String("b");
}

void test()
{
  Object var = new String("a");
  bleah(var);
  // at this point var.toString() still returns "a".
}

What changes is the effect of the assignment operator: every variable implicitly stores a reference, but assigning to a variable assigns a new reference; it does not change the object pointed to by the previous reference. In C++ you would be calling the '=' operator of the referenced object. In Java this does not happen: the object whose reference was passed is simply not used for anything in the function above.
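For contrast, here is a hedged C++ sketch of the two assignment behaviors. The function names are invented, and the pointer version is only the closest C++ analogue of what Java does, not a description of how a JVM works.

```cpp
#include <cassert>
#include <string>

// C++ reference parameter: `a` is an alias of the caller's string, so the
// assignment invokes std::string::operator= on the caller's object.
void assign_through_reference(std::string& a) {
    a = "b";
}

// Closest C++ analogue of the Java function: the address is copied into
// `a`, and assigning to `a` only redirects the local copy.
void rebind_local_pointer(const std::string* a) {
    static const std::string other = "b";
    a = &other;  // the caller's string and pointer are untouched
    (void)a;     // a is intentionally discarded
}
```

With a caller-side `std::string s = "a";`, calling `rebind_local_pointer(&s)` leaves `s` as "a", while `assign_through_reference(s)` changes it to "b".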

Now, if instead of a String (which is immutable), you had another object with an "append" method that changed its properties and you invoked the method:

void bleah(Weird a)
{
  a.append("whatever");
}

Then the function that called bleah would see any effects of the call to append, because the reference still points to the same object. The variable 'a' contains an implicit reference, but it is not a "reference variable" as in C++ because it does not try to be an alias of the original variable: the assignment semantics are different (along with other subtleties). If the object were passed by value, 'a' would be a copy of it, not the same object.

There are languages that accept reference variables (including some that do not accept pointers, such as C#: you cannot compile code with pointers in "safe" mode, but you can use references), and there are languages that do not accept them.

There are languages that pass parameters by value, by reference, by the value of the reference (!!!), or by some combination of these (such as C++, which allows passing by value or by reference, the latter only when explicitly requested).

The abstract concept behind passing by reference contrasts with passing by value: passing an object by value means passing a copy of it. An object such as an integer, a floating-point number or even a pointer is usually passed by value for performance, since copying the object costs less than maintaining pointers to it internally. But then you cannot change the value of the original variable (and this has a lot to do with the discussion of pure versus impure functions).

This means that by copying, the reference (!) to the original object is lost. Passing a reference (either by a reference variable, or by a pointer, or by the value of the reference, as in Java) means having as reference (ha!) the same object, and it also avoids copies, which matters for large or complex objects.

This is possible (and more convenient) with reference variables, but Java demonstrates that, even without pointers, they are not necessary as long as passing involves at least the value of the reference.

Pointers are a form of reference to memory addresses, and references to variables or objects can be implemented with pointers; hence the confusion.

5

Moderator's note: this answer was published as an April Fools' joke. This type of content is not usually accepted on the site, but an exception is made on that specific date. After the date, the post was locked and can no longer receive votes or be edited. Enjoy!

Although the previous answers are very enlightening, I cannot help but add one more answer to clarify these important concepts. Pointers and References are completely different things.

Pointers

(From the Latin ponte+eiro: bridge maker)

Although current machines have subtly reduced the use of pointers, these are still of high importance. There are several types of pointers (usually 2 or 3 types), and they do not always have the same dimension.

References

(From the Greek re+wound: to hurt repeatedly; as @bfavaretto points out, there is a clear relationship here with the sharp end of the pointers)

[Figure: sharp pointers]

As far as programming languages are concerned, there are two essential types of reference:

  • reference manuals (every respectable language should have at least one, which by definition should avoid talking about undocumented features)
  • reference letters (to guarantee and attest to the suitability of their author and the absence of pernicious side effects such as formatting the disk)

Conclusion

  • As stated by the OP, languages such as Java deny the existence of pointers, in a clear reference to the fact that it is always a good time for coffee.

  • Some pointer operations put a heavy load on the processor and sometimes cause synchronization problems (just this weekend the clocks changed, with all the arithmetic that involves)

    (*pointer)++

  • In order to understand this answer correctly, the following program should be run
    echo lirba 1 aid zilef | rev

  • If pointers are the ones with sharp tips, then why aren't they the ones that hurt repeatedly?

  • 1

    @bfavaretto, the observation you raised is very true and pertinent, and I will try to include it in the answer. Some scientists even refer to the "scars of time" (although "time heals everything"). Feel free to fix any other errors you find.

  • 1

    Finally I understood. The illustrated examples are very good and clearly point out the important details. I will keep this as a reference.

  • @Bacco, thank you very much. It is comforting to know that there are those who take the time to answer so learned involving so much technology :)

  • @bfavaretto, have you never thought of creating a bad behavior badge for those who never answer the way they are supposed to and who end up giving the moderators and meta a headache? :)

  • No, but we are considering creating a good-humor badge with hints of improper language :)

  • Best SE group response ever! :)

0

In C++, a reference is a pointer that is guaranteed to point to an object of its type. For example, A* can point to an A object, can be null, or can even be invalid, whereas A& is guaranteed by the compiler to point to a valid object.
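A small C++ sketch of that difference follows. The struct `A` follows the naming in the paragraph, while its `name` field and the three functions are assumptions made for illustration. Note that the guarantee concerns binding at the call site: a reference can still dangle if the object is destroyed while the reference is in use.

```cpp
#include <cassert>
#include <string>

struct A {
    std::string name;
};

// A* may be null, so the callee has to check before dereferencing.
std::string name_or_default(const A* a) {
    if (a == nullptr) {
        return "(none)";
    }
    return a->name;
}

// A& has to be bound to an object at the call site; there is no null
// reference to test for.
std::string name_of(const A& a) {
    return a.name;
}

// Converting a pointer into a reference is where the null check belongs.
const A& require(const A* a) {
    assert(a != nullptr);
    return *a;
}
```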

Java partially hides pointers (you can't do pointer arithmetic with a reference, or force it to point to an invalid object) but still allows the reference to be null, which makes Java references something like C++ pointers.

Languages such as Swift and Kotlin let you specify whether a reference is nullable. Type "A" is a reference guaranteed to be valid, whereas type "A?" can be null. The programmer needs to use the exclamation mark to force an A? variable to be treated as valid.
