How to initialize a list of empty lists?

12

Using Python 2.7.12

I need to create a list as follows:

lista = [[],[],[],[],.........,[]]

This list needs to contain a very large number of sub-lists (hence the dots). I found the following way to do it:

lista = [[]]*n

Where "n" is the number of sub-lists. But this leads to a problem. Below is an example (with a much smaller number of sub-lists, for illustration).

lista = [[]]*3
lista[0].append(1)
lista[1].append(2)
print lista

The expected output is:

[[1], [2], []]

But the actual output of that code is:

[[1, 2], [1, 2], [1, 2]]

I have no idea why this happens, and I have not found another way to create such a list.

3 answers

15


The problem with the code you tried:

lista = [[]]*n

is that the object being repeated, [], is created only once, and that same reference is then reused in all the other positions. To demonstrate this, iterate over the list and print the id of each element:

lista = [[]]*3

for l in lista:
    print id(l)


The three values will be the same, as in:

47056207983464
47056207983464
47056207983464

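To see what happens in more detail, inspect the bytecode with the dis module (the listing below comes from Python 3, where dis.dis accepts a source string; the exact opcodes vary between versions):

>>> import dis
>>> dis.dis('[[]]*3')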
  1           0 BUILD_LIST               0
              2 BUILD_LIST               1
              4 LOAD_CONST               0 (3)
              6 BINARY_MULTIPLY
              8 RETURN_VALUE

Note that BUILD_LIST is executed twice, once for the inner list and once for the outer one; then the constant 3 is loaded and the outer list is multiplied by it. That is, only one inner list is ever created, and the reference to it is repeated 3 times.

To get around this problem, you can use a list comprehension:

lista = [[] for _ in xrange(n)]


Thus, n distinct references are defined.
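A quick way to confirm the difference is to count the distinct ids in each list. This is only a sketch: the names shared and distinct are illustrative, and it uses print() and range so that it also runs on Python 3.

```python
n = 3

# Repetition: every position holds a reference to the same inner list.
shared = [[]] * n
# Comprehension: the expression [] is evaluated again on each iteration.
distinct = [[] for _ in range(n)]

shared_ids = {id(item) for item in shared}
distinct_ids = {id(item) for item in distinct}
print(len(shared_ids))    # 1
print(len(distinct_ids))  # 3

# Appending to one sub-list no longer affects the others.
distinct[0].append(1)
distinct[1].append(2)
print(distinct)  # [[1], [2], []]
```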

For the same solution in Python 3, simply replace xrange with range.

The same happens with every mutable type in Python. For example, if you have a class Foo and want a list of distinct instances, you cannot write:

lista = [Foo()]*3

Here it is even clearer that Foo is instantiated only once and that the reference to that single object is repeated 3 times.
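To make this visible, the sketch below uses a hypothetical Foo class that counts its own instantiations (the created counter and the items attribute are assumptions added only for illustration), and contrasts repetition with a list comprehension:

```python
class Foo(object):
    """Hypothetical class that counts how many instances were created."""
    created = 0

    def __init__(self):
        Foo.created += 1
        self.items = []


# Foo() is evaluated once; the same instance is referenced 3 times.
repeated = [Foo()] * 3
print(Foo.created)                 # 1
print(repeated[0] is repeated[2])  # True

# Reset the counter, then build the list with a comprehension.
Foo.created = 0
separate = [Foo() for _ in range(3)]
print(Foo.created)                 # 3
print(separate[0] is separate[2])  # False
```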

4

Just to complement @Woss's answer...

Sequence multiplication

In the documentation on sequences, the addition and multiplication operators between sequences and integers are described as follows:


-- Free translation of the parts relevant to the problem --

Let s and t be sequences of the same type and n an integer:

  • s + t: the concatenation of s and t.

    Example:

    [0, 1, 2] + [3, 4, 5]
    # [0, 1, 2, 3, 4, 5]
    
  • s * n or n * s: equivalent to adding s to itself n times.

    Note: The items in the sequence s are not copied, but referenced multiple times. [Note 2 in the documentation]

    Example:

    [0] * 3
    # [0, 0, 0]
    
    # Same as
    [0] + [0] + [0]
    # [0, 0, 0]
    

Note 2 in the documentation states that the items of the multiplied sequence are not copied but referenced multiple times (which matters for mutable items, as already explained in @Woss's answer and in the Python documentation).

So:

s = [[]] * 3

This is the same as creating a single empty list and referencing it 3 times in the outer list:

_tmp = []
s = [_tmp, _tmp, _tmp]

print(s)
# [[], [], []]

Consequently, any change to _tmp is reflected in every element of s, since they all reference the same list:

s[0].append(1)

print(s)
# [[1], [1], [1]]

_tmp.append(2)

print(s)
# [[1, 2], [1, 2], [1, 2]]
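The shared reference only shows up when the item is mutated in place. As a small sketch (the variable nums is illustrative), assigning to a position replaces just that one reference, which is why repetition is harmless with immutable items such as integers:

```python
# Immutable items: "changing" one means rebinding that position
# to a different object, so the other positions are untouched.
nums = [0] * 3
nums[0] += 1
print(nums)  # [1, 0, 0]

s = [[]] * 3
# Assignment replaces the reference at position 0 only.
s[0] = [9]
print(s)  # [[9], [], []]

# In-place mutation: positions 1 and 2 still share one list.
s[1].append(1)
print(s)  # [[9], [1], [1]]
```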

Implementation of id() in CPython

The documentation of id() explains that the function returns the "identity" of an object: an integer that is guaranteed to be unique and constant during the lifetime of that object.

Finally, there is a note saying:

CPython implementation detail: This is the address of the object in memory.

That is why we use id(): in CPython, if the memory address is the same, the variables are guaranteed to reference the same object.

Example:

lista = []
print(id(lista))
# 139942416872384

ref = lista
print(id(ref))
# 139942416872384

outra_lista = []
print(id(outra_lista))
# 139942416112048

Notice that lista and ref point to the same memory address and therefore have the same id, whereas outra_lista is a different object in memory. Keep in mind that the numbers above change on each run of the program; they are only examples.
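In everyday code, the same identity check is usually written with the is operator rather than by comparing id() values by hand. A minimal sketch, reusing the names from the example above:

```python
lista = []
ref = lista
outra_lista = []

# "is" compares identity: do both names refer to the same object?
print(ref is lista)          # True
print(outra_lista is lista)  # False

# "==" compares contents, so two distinct empty lists are still equal.
print(outra_lista == lista)  # True

# For objects that are alive at the same time, "is" agrees with id().
print((ref is lista) == (id(ref) == id(lista)))  # True
```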

-1

  • 2

    Jeferson, wasn't this already covered in the other answers?

  • Jeferson, when answering, try to bring a new point of view or knowledge different from what has already been published in the other answers. If you want to focus on knowledge already covered, show what has not yet been addressed, explain the reason for that focus and the advantages of your approach over the others, and whenever possible provide a [mcve].
