Python - Dismembering text file

Asked

Viewed 651 times

3

Good night,

I need to separate some lines from the file and according to the add line in another file. That is, a file that contains 6 words will be written according to the word for a specific file.

These 6 words can increase to 8, 10, etc and then you will have to create 8, 10 files, and so on.

I first tried to create an array in which each row would be responsible for a line containing the word.

But I could not, because when I tried to add in the line, I could not, because I could not unless I specified the input with row and column.

For example, I want all lines containing orange to be directed to the orange.txt file and that all lines containing plum go to the plum.txt file.

The idea would be to make the code without "if plum", "if orange".

I tried to play the words in a vector but to play in the file I could not without having the if... Ex:

frutas = ['laranja', 'ameixa']

with open('frutas.txt', 'r') as arq_fruta:
  for line in arq_fruta:
    coluna = line.split()
    for i in range (len(frutas)):
      if(coluna[1] == variaveis[0]):
        laranja.append(coluna[0] +' '+ coluna[3]+'\n')

The last line I could no longer place as a vector for example, something like:

fruta[i].append(coluna[0] +' '+ coluna[3]+'\n') #só como exemplo, nao funciona

fruit[0] would be the vector of all lines containing only orange and fruit[1] all lines with plum.

I tried to create an array, but it did not work, because the matrix asks for the row and column for input, but I do not have these hints, since I will read and supposedly play for the archive.

And speaking of file, I also tried to do something that was "direct" but also does not work.

for i in range(1, len(frutas)):
   arq = open(frutas[i]+'.txt','w')
   arq.writelines(fruta[i])

Is there a more "right" way to do this? I did not succeed, only with the code with "if" which would cause a lot of change if I had to include another fruit for example.

1 answer

1


A more direct way would be to create a list of open files, where you have a file for each fruit. So you can have a single code that writes to all files directly without having to split into lists in memory. The code will be able to handle files of any size because it writes directly to the destination.

frutas = ['laranja', 'ameixa']
arquivos = [open(fruta + '.txt', 'w') for fruta in frutas]

with open('frutas.txt', 'r') as arq:
    for linha in arq:
        for fruta, arquivo in zip(frutas, arquivos):
            if fruta in linha:
                arquivo.write(linha)

If you really want to separate into variables in memory, a solution is to combine dictionaries with lists, can be facilitated by collections.defaultdict:

import collections

frutas = ['laranja', 'ameixa']
por_fruta = collections.defaultdict(list)

with open('frutas.txt', 'r') as arq:
    for linha in arq:
        for fruta, arquivo in zip(frutas, arquivos):
            if fruta in linha:
                por_fruta[fruta].append(linha)

So you have all the lists in the dictionary por_fruta... to write to file later:

for fruta, linhas in por_fruta.items():
    with open(fruta + '.txt', 'w') as f:
         f.writelines(linhas)
  • Thanks nosklo, it was something like the first suggestion I was looking for even! better yet it didn’t need to save in memory before sending to the archive and this part was very dynamic! The code is very different from what I usually code, like for example! if you have any document tips where I can learn more about and can suggest, I appreciate!

Browser other questions tagged

You are not signed in. Login or sign up in order to post.