Manipulation of items during iteration

0

In Python it is fairly common to loop over the items of an iterable, manipulating them, checking whether a particular item exists, or performing whatever other operation is needed. I am rereading the Python documentation and came across something that I did not consider relevant when I first saw it. For example:

>>> animals = ['Cat', 'Dog', 'Elephant']
>>> for animal in animals:
...     print(animal, len(animal))

Cat 3
Dog 3
Elephant 8

As you can see above, I have a list of animal names, and in the loop I print the current animal along with the number of characters in its name. The only operation in this example is printing the name and its length; I do not try to change the values in any way. My question arises exactly when I think about changing these values.

>>> animals = ['Cat', 'Dog', 'Elephant']
>>> for animal in animals:
...     if len(animal) > 3:
...         animal = animal[:3]
...     print(animal, len(animal))

Cat 3
Dog 3
Ele 3

>>> print(animals)

['Cat', 'Dog', 'Elephant']

The output inside the loop shows that the loop variable was changed, but when I print the list after the loop I see that the list itself was not. This baffled me, because frankly the examples here are just things that came to mind and that I believe we have all thought of doing at some point: altering a variable during a loop. Another fact is that, when studying Python, we learn that some of its most basic types, such as string, int, float and even tuples, are immutable. So I wondered whether the reassignment of the current loop variable had generated a new string (in this case a three-character string), and I decided to check whether the variable in the current iteration was the same object as the corresponding item of the list being iterated, as follows:

>>> animals = ['Cat', 'Dog', 'Elephant']
>>> for animal in animals:
...     print(animal, id(animal))

Cat 140226536948264
Dog 140226536948040
Elephant 140226536875248

>>> animals[0], id(animals[0])
('Cat', 140226536948264)
>>> animals[1], id(animals[1])
('Dog', 140226536948040)
>>> animals[2], id(animals[2])
('Elephant', 140226536875248)

Do you see? The ids are the same. I don't know about other languages, but the Python documentation says that one should make a copy of the iterable before the loop, since the loop does not do this implicitly. My question remains, though: why can't I change the current item of the for loop?

  • I couldn’t reproduce your example. Here it showed length 3 for all of them, as you expected: https://repl.it/repls/StrikingIllustriousNumbers

  • During iteration the item is changed to 3 characters but after iteration the list itself is unchanged.

  • Yes, but that’s not what’s in your question. You changed the variable and displayed it inside the loop. From your text, you asked why animal was still Elephant inside the same loop after assigning animal = animal[:3]. If the intention was to ask why the change is not reflected in the original list, I think you need to rephrase the question.

  • Sorry if it wasn’t clear; I have edited the question.

2 answers

2


The question was a little confusing because the result cannot be reproduced. It seems that this is what you wanted to demonstrate:

animals = ['Cat', 'Dog', 'Elephant']
for animal in animals:
    print(animal, id(animal))
for animal in animals:
    if len(animal) > 3:
        animal = animal[:3]
    print(animal, len(animal), id(animal))
for animal in animals:
    print(animal, len(animal), id(animal))

See it working on ideone and on repl.it. I also put it on GitHub for future reference.

Note that I first showed the address of each item, then showed the items with their ids so you can see when the id changes, and then the ids again, indicating that the list is intact. When you run a test you need a control: show the initial state, show the phenomenon occurring, and then show the state of everything afterwards. Otherwise you can be misled.

The question seems to be why the list was not changed. The question was amended in one detail of the text, but everything else still gives the wrong idea of what was intended.

The loop variable is not what you think it is. It does not hold the item's value but an immutable reference to an item of the collection being traversed by the loop. So you are not allowed to change the data in the collection. You think you have a variable that carries an isolated value, but it does not, until you try to write to it. At that point what is called COW (Copy On Write) comes in: a new reference is created and assigned to the variable, pointing to a new memory location to which the element's value is copied, and for its own protection it does not let you change this value in the list.

In a direct, index-based loop there would be no such protection. The foreach design pattern, common in several languages, exists to make iterating over items safe. If you need the flexibility, you give up that safety by using a raw loop.

Note that the id of the third item before the loop is one value, and inside the loop, when the item is changed, it is another. So your test did not analyze the situation correctly and gave you false information. When the variable animal is changed it has another value. When it is not changed, as an optimization, it does not need to have another value.

I think the fundamental concept here is COW: it caused the illusion, and the original test did not allow it to be observed. It is common in all languages for value types to be immutable and to use COW to optimize access. Reference types are not usually immutable, so COW makes no sense for them, and a type like that would let you change its value. Note that string is actually a reference type, as an optimization, but it has value semantics, so it is immutable and follows the same rule.
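
A minimal sketch of the control experiment described above, assuming CPython and using `is` instead of comparing printed ids. The names `before`, `inside` and `unchanged` are hypothetical, introduced only for illustration. It checks that the loop variable starts out as the very same object as the list item, that the slice leaves it referring to a different object, and that the list still holds the original objects afterwards:

```python
animals = ['Cat', 'Dog', 'Elephant']

# Control: remember the objects the list holds before the loop.
before = list(animals)

inside = []
for index, animal in enumerate(animals):
    # At this point the loop variable is the same object as the list item.
    assert animal is animals[index]
    if len(animal) > 3:
        # Slicing builds a new string; the name `animal` now refers to it.
        animal = animal[:3]
        assert animal is not animals[index]
    inside.append(animal)

# Final state: the list still holds the original objects.
unchanged = all(a is b for a, b in zip(animals, before))
print(inside)     # ['Cat', 'Dog', 'Ele']
print(animals)    # ['Cat', 'Dog', 'Elephant']
print(unchanged)  # True
```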

  • Assignment to the loop's intermediate variable works like a normal assignment; therefore, it is immutable only for immutable objects. Iterating over a list of lists, for example, allows modifying the elements (since the intermediate variable points to the list by reference, as expected).

  • @Pedrovonhertwig I made that clear in the answer, thank you.

2

This section of the documentation has relevant information.

What happens is not very intuitive, but it is easily explained. When iterating over a list, each item is assigned, one at a time, to an intermediate loop variable, which is a reference to an object in the iterable.

That is, this:

minha_lista = [1, 2, 3, 4, 5]

for item in minha_lista:
    if item == 2:
        item = 10
print(minha_lista)
# [1, 2, 3, 4, 5]

is actually equivalent to this:

minha_lista = [1, 2, 3, 4, 5]

for i in range(len(minha_lista)):
    item = minha_lista[i]
    if item == 2:
        item = 10
print(minha_lista)
# [1, 2, 3, 4, 5]

Can you see the difference? When we assign to this intermediate variable, we are changing what the variable refers to, not the original item in the list. That is what happens with immutable types.
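
To make the rebinding visible, here is a small sketch that spells the loop out with `iter()` and `next()`, which is roughly what a `for` statement does. The name `it` is hypothetical, used only for illustration. Each pass assigns the next item to `item`; assigning to `item` again only makes that name refer to a different object:

```python
minha_lista = [1, 2, 3, 4, 5]

it = iter(minha_lista)
while True:
    try:
        item = next(it)  # the same kind of assignment the for loop performs
    except StopIteration:
        break
    if item == 2:
        item = 10  # rebinds the name `item`; minha_lista[1] is still 2

print(item)         # 5
print(minha_lista)  # [1, 2, 3, 4, 5]
```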

Assignment to this intermediate variable works in the same way as a normal assignment. That is, if we iterate over a list of lists, for example, the variable will hold a reference to each of these lists (and not a copy of its value). We can observe that the following behaves differently:

lista_de_listas = [[1, 2, 3], [4, 5, 6]]

for item in lista_de_listas:
    if item == [1, 2, 3]:
        item[1] = 10
print(lista_de_listas)
# [[1, 10, 3], [4, 5, 6]]
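
The distinction here is between mutating the object the variable refers to and rebinding the variable. A short sketch, with the hypothetical names `rebound` and `mutated`, showing that even for a list of lists a plain assignment to the loop variable leaves the outer list alone, while an in-place operation such as slice assignment does change it:

```python
rebound = [[1, 2, 3], [4, 5, 6]]
for item in rebound:
    # Plain assignment: `item` now names a brand-new list.
    item = [0, 0, 0]

mutated = [[1, 2, 3], [4, 5, 6]]
for item in mutated:
    # Slice assignment: the contents of the existing inner list are replaced.
    item[:] = [0, 0, 0]

print(rebound)  # [[1, 2, 3], [4, 5, 6]]
print(mutated)  # [[0, 0, 0], [0, 0, 0]]
```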

One way around this difficulty when dealing with immutable objects is to assign to the list directly by index. You can do this without losing the convenience of an intermediate variable by using enumerate:

minha_lista = [1, 2, 3, 4, 5]

for i, item in enumerate(minha_lista):
    if item == 2:
        minha_lista[i] = 10
print(minha_lista)
# [1, 10, 3, 4, 5]
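
Applied to the list from the question, the same idea might look like the sketch below. The second variant, which builds the new values with a list comprehension and writes them back with slice assignment, is an alternative not mentioned above; `truncated` is a hypothetical name.

```python
animals = ['Cat', 'Dog', 'Elephant']

# Variant 1: write back through the index provided by enumerate.
for i, animal in enumerate(animals):
    if len(animal) > 3:
        animals[i] = animal[:3]
print(animals)  # ['Cat', 'Dog', 'Ele']

# Variant 2: build the new values first, then replace the contents in place.
truncated = ['Cat', 'Dog', 'Elephant']
truncated[:] = [name[:3] for name in truncated]
print(truncated)  # ['Cat', 'Dog', 'Ele']
```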
