I need to open several HTML files, extract their text, and save it to a single txt file in sequence, but I don't know how to do that.
I can do this with a single HTML file, but I need to do it with several, in order, because they come from an EPUB and the text has to end up in the correct order.
Here is my code:
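from bs4 import BeautifulSoup

# Read one HTML file and extract its visible text
with open("index_split_001.html", encoding="utf8") as f:
    soup = BeautifulSoup(f.read(), "html.parser")
text = soup.get_text()

# Write the extracted text to the output file; "with" closes it afterwards
with open("pfv.txt", "w", encoding="utf8") as arquivo:
    arquivo.write(text)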
Nicolas, they are local files. I can rename them to anything I like. Let me see if I understood you: write a for loop that opens every HTML file and writes to the txt on each pass? How do I guarantee that what has already been written will not be overwritten by the next HTML?
– user124673
When you write to the txt file, add a line break after each insert, so the next one will not overwrite the same line! Have a look at this link: http://www.devfuria.com.br/python/manipulando-textfiles/
– Nicolas Pereira
Thanks man, it helped a lot.
– user124673
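For anyone landing here later, the loop discussed in the comments could look roughly like the sketch below. It is only an illustration under some assumptions: the files are taken to follow the `index_split_NNN.html` pattern from the question, and `split_number`, `html_to_text` and `merge_html_to_txt` are made-up helper names, not part of BeautifulSoup. The output file is opened once in `"w"` mode and kept open for the whole loop, so each write continues where the previous one ended and nothing is overwritten.

```python
import re
from pathlib import Path


def split_number(path):
    """Sort key: first number in the file name (index_split_012.html -> 12)."""
    match = re.search(r"\d+", Path(path).name)
    return int(match.group()) if match else 0


def html_to_text(html):
    """Extract the visible text with BeautifulSoup, as in the question's code."""
    from bs4 import BeautifulSoup  # requires the beautifulsoup4 package
    return BeautifulSoup(html, "html.parser").get_text()


def merge_html_to_txt(folder, out_name="pfv.txt",
                      pattern="index_split_*.html", extract=html_to_text):
    """Write the text of every matching HTML file into one txt, in numeric order."""
    folder = Path(folder)
    # Sort by the number in the name so 2 comes before 10 (assumed naming scheme)
    files = sorted(folder.glob(pattern), key=split_number)
    # Open the output once; successive writes are appended one after another
    with open(folder / out_name, "w", encoding="utf8") as out:
        for path in files:
            out.write(extract(path.read_text(encoding="utf8")))
            out.write("\n")  # line break between one HTML file and the next
    return files
```

Calling `merge_html_to_txt(".")` from the folder that holds the extracted EPUB files would write `pfv.txt` there. Sorting by the number instead of the raw name is a precaution in case the file names are not zero-padded.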