I'm working with Python in a Jupyter Notebook.
I import the txt files with the following code:
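import os
import pandas as pd

allLines = []
path = '../data/txt/'
fileList = os.listdir(path)
for i in fileList:
    # Join directory and file name; 'with' closes the file after reading
    with open(os.path.join(path, i), 'r', encoding='UTF-8') as file:
        # strip() removes only leading and trailing whitespace
        allLines.append(file.read().strip())
dados = {
    "nome_arquivo": fileList,
    "texto": allLines
}
raw_data = pd.DataFrame(dados)
raw_data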
The above code shows the following result:
The files I’m working on have the following features:
I was able to remove the leading and trailing line breaks with .strip(). However, some \n still appear inside the text. The expected result is that the files are imported without the "\n" characters, which are the line breaks of the txt files.
This solves it:
raw_data.texto = [k.replace("\n", "") for k in raw_data.texto]
However, I believe your question is a duplicate. – Lucas
Does this answer your question? How to remove \n from a Python string
– Lucas
Thanks, it works. I already had this solution, but I thought there might be another one that works inside the loop. As for the duplicate, I didn't understand that part.
– Perciliano
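For the "inside the loop" variant mentioned above, something like the following might work. This is only a sketch: the function name `load_texts` and its `replacement` parameter are hypothetical and not part of the original code, and it assumes every file of interest ends in `.txt`.

```python
import os


def load_texts(folder, replacement=""):
    """Read every .txt file in `folder`, dropping internal line breaks.

    `load_texts` and `replacement` are illustrative names (assumptions),
    not part of the code in the question.
    """
    names, texts = [], []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue  # skip anything that is not a txt file
        with open(os.path.join(folder, name), "r", encoding="UTF-8") as f:
            text = f.read().strip()  # removes leading/trailing whitespace
        # splitlines() cuts at \n, \r\n, \r and other line boundaries;
        # joining the pieces removes the breaks inside the text
        texts.append(replacement.join(text.splitlines()))
        names.append(name)
    return names, texts
```

The two returned lists can be passed to the same dictionary used in the question ("nome_arquivo" and "texto"). Passing `replacement=" "` instead of the default keeps the last word of one line from being glued to the first word of the next.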
@Perciliano, glad that solved it. Also take a look at the readlines() method.
– Paulo Marques
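As a rough illustration of the readlines() suggestion: readlines() returns a list of lines, and each line generally keeps its trailing "\n", so the lines still need to be stripped before joining. The helper name `read_joined` and the `sep` parameter below are hypothetical, chosen only for this sketch.

```python
def read_joined(path, sep=" "):
    """Return the file content as one string without line breaks.

    `read_joined` and `sep` are illustrative names (assumptions).
    """
    with open(path, "r", encoding="UTF-8") as f:
        lines = f.readlines()  # list of lines, line endings included
    # strip() drops the line ending and surrounding whitespace of each line
    stripped = (line.strip() for line in lines)
    # empty lines are discarded so they do not produce doubled separators
    return sep.join(s for s in stripped if s)
```

Inside the loop from the question, `allLines.append(read_joined(os.path.join(path, i)))` would then replace the `file.read().strip()` call.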