How to load .txt files from a directory and include each file's name in the list or DataFrame in Python?

I'm working with Python in Jupyter Notebook. I have a directory of .txt files. I can iterate over the directory and load these files; however, I also need the name of each file as a column of the list or DataFrame.

Below is a figure of the directory. [Image: directory with files]

Below are the current code and its result.

import os

allLines = []
path = 'C:/data/txt/'
fileList = os.listdir(path)
for i in fileList:
    # Join the directory and file name, and close each file after reading
    with open(os.path.join(path, i), 'r', encoding='UTF-8') as file:
        allLines.append(file.read())
print(allLines)

Then I load the result into a DataFrame.

# Code
import pandas as pd
raw_data = pd.DataFrame(data=allLines, columns=['texto'])

Expected result

As a final result I want to get both the content and the file name. Is that possible? Can you help? Thank you.

1 answer

If I have understood correctly, is this the result you want?

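import os
import pandas as pd

allLines = []
path = 'C:/data/txt/'
fileList = os.listdir(path)
for i in fileList:
    # Open each file with a context manager so it is closed after reading
    with open(os.path.join(path, i), 'r', encoding='UTF-8') as file:
        allLines.append(file.read())

# Both lists follow the order of fileList, so row n pairs file n with its text
dados = {
    "conteudo": allLines,
    "arquivo": fileList
}
raw_data = pd.DataFrame(dados)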
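
As a possible variation on the code above, here is a minimal sketch that builds one record per file, so each file name stays attached to its own text instead of relying on two parallel lists. It assumes the directory should be filtered to names ending in .txt and read in sorted order, which the original code does not do. The function names read_txt_records and records_to_dataframe are hypothetical, chosen only for illustration; the column names reuse "arquivo" and "conteudo" from the answer.

```python
from pathlib import Path


def read_txt_records(directory):
    """Return one dict per .txt file: its name and its full text."""
    records = []
    # Assumption: only files matching *.txt are wanted, in sorted order
    for txt_path in sorted(Path(directory).glob("*.txt")):
        records.append({
            "arquivo": txt_path.name,  # file name without the directory
            "conteudo": txt_path.read_text(encoding="utf-8"),
        })
    return records


def records_to_dataframe(records):
    """Turn the records into a two-column pandas DataFrame."""
    # Imported here so read_txt_records can be used without pandas installed
    import pandas as pd
    return pd.DataFrame(records, columns=["arquivo", "conteudo"])
```

With the path from the question, this could be called as raw_data = records_to_dataframe(read_txt_records('C:/data/txt/')). Keeping the name and text in the same dict means a skipped or filtered file cannot shift one column relative to the other.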

  • Hi, thanks, friend. I got the file names; however, in this DataFrame I need the file names and their respective texts. Is there a way to do that?

  • Would it be one DataFrame per file, or a single DataFrame with all the files and their respective names alongside the contents?

  • A single DataFrame with all the data and two columns: one with the file name and another with the respective file text.

  • I edited the answer; see if it's the way you want it.

  • Exactly. I'll try a few other things here. Thank you very much, friend, and good morning.
