Read file with list comprehension only works the first time

2

fhand = open('text.txt', mode = 'r')
txtfile = [item.split()[0] for item in fhand]
txtfile2 = [item.split()[1] for item in fhand]
print(txtfile)
print(txtfile2)

Why is the second list comprehension not working?

Even when it is exactly the same, down to the same index, txtfile2 always comes out as an empty list when I print it.

  • 3

    fhand is a generator and generators can only be iterated once. The first list comprehension already consumes the whole generator, so the second one is empty. Since you use the same value in both lists, ideally do both in a single loop.

  • 1

    @Woss No, fhand is a TextIOWrapper. Unlike a generator object, it has a seek method, and calling it can move the pointer back to the starting position of the stream.

  • Is the goal of the program to split the text in the file into lines or into words?

1 answer

3


After you run the first for:

txtfile = [item.split()[0] for item in fhand]

The file has been read to the end. The next iteration then has nothing left to read, so the second list comes out empty. Imagine that fhand has an "internal pointer" marking its current position in the file: the first for reads the entire file and leaves this pointer at the end, so the second for finds nothing more to read.
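
A small sketch of that behavior, using io.StringIO as a stand-in for the opened file (an assumption, so that it runs without text.txt; the sample contents are made up too):

```python
import io

# io.StringIO is iterated line by line just like the object returned by open().
fhand = io.StringIO("a b\nc d\n")

first = [line.split()[0] for line in fhand]   # consumes every line
second = [line.split()[1] for line in fhand]  # nothing left to iterate over
rest = fhand.read()                           # already at the end, so empty

print(first)       # ['a', 'c']
print(second)      # []
print(repr(rest))  # ''
```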

In this case, an alternative is to skip the list comprehension and process each line of the file once, taking the data for each list and appending it separately:

txtfile = []
txtfile2 = []
with open('text.txt', mode='r') as fhand:
    for item in fhand:
        itens = item.split()
        txtfile.append(itens[0])
        txtfile2.append(itens[1])
print(txtfile)
print(txtfile2)

Note that I used with, which ensures that the file is closed at the end.


Of course, if you want, you can still do it with a list comprehension:

with open('text.txt', mode='r') as fhand:
    txtfile, txtfile2 = map(list, zip(*[ item.split() for item in fhand ]))

First I build a list containing each line already split. Then I pass this list to zip, which returns tuples made of the corresponding elements of each sublist: the first tuple holds the first element of every split line, and the second tuple holds the second. Finally, map applies list to these tuples to turn them into lists, which are then assigned to txtfile and txtfile2.
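
To make the steps easier to follow, here is the same expression broken apart. The lines list is a hypothetical sample standing in for the contents of text.txt, and rows and columns are names I introduced only for illustration:

```python
# Hypothetical sample standing in for the lines read from text.txt.
lines = ["a b\n", "c d\n", "e f\n"]

rows = [line.split() for line in lines]  # [['a', 'b'], ['c', 'd'], ['e', 'f']]
columns = list(zip(*rows))               # [('a', 'c', 'e'), ('b', 'd', 'f')]
txtfile, txtfile2 = map(list, columns)   # turn each tuple into a list

print(txtfile)   # ['a', 'c', 'e']
print(txtfile2)  # ['b', 'd', 'f']
```

Keep in mind that this assumes every line splits into exactly two parts. With an empty file, or with lines that all have three or more fields, the unpacking into two names raises ValueError.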

But in my opinion the first loop above is simpler. I understand that list comprehensions are nice and all, but you should not force the use of a feature when it is not a good fit (and here I believe it is not, because I think the code became unnecessarily complicated).


As mentioned in the comments, another alternative is to use seek so that the "pointer" goes back to the start of the file:

fhand = open('text.txt', mode = 'r')
txtfile = [item.split()[0] for item in fhand]

fhand.seek(0) # go back to the start of the file
txtfile2 = [item.split()[1] for item in fhand]

But this way you read the file twice (and split every line again), which seems unnecessary here, because the previous solution reads the file only once and already does everything you need.
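
If you really want to keep two separate comprehensions, a possible middle ground (a variation of mine, not something from the question or the comments) is to read and split the lines once into a list, and then build both lists from it. Unlike the file, a list can be iterated as many times as you like. Here io.StringIO and its contents are again just a stand-in for text.txt:

```python
import io

# Stand-in for open('text.txt'); the contents are a made-up sample.
fhand = io.StringIO("a b\nc d\n")

rows = [line.split() for line in fhand]  # the file is consumed only once here

# rows is a plain list, so it can be iterated more than once.
txtfile = [row[0] for row in rows]
txtfile2 = [row[1] for row in rows]

print(txtfile)   # ['a', 'c']
print(txtfile2)  # ['b', 'd']
```

The cost is that all the split lines stay in memory at once, which should only matter for very large files.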
