First: opening and closing the file on every iteration is the wrong way to work with files anyway.
Second, your problem is probably caused by the line where you call `arq.close()` inside the `for` loop: `arq` was opened only once, while the file in the variable `arq1` is the one that is opened on every iteration.
You will be able to "untangle" a lot of your code by implementing two good practices there:
The first is to separate the directory paths and filenames from your code.
The exact folder and file names are things you do have to care about, but they have nothing to do with the logic of the program. Hard-coding them also limits your program to running only with those names and folders, so the code cannot be reused, not even by you if you switch computers. (You already do this for the URL, by keeping it in the `root` variable.)
This is solved by simply putting these names into separate variables that are easy to find, right at the beginning of the file. Then you can either use these variables directly in your code or, better yet, create a function that receives the filename as a parameter. That function can then be used as part of an improved program, one with a graphical interface, for example, and you don't need to touch a single line of code that already works: just call the function with the new filename.
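As a minimal sketch of that idea: the names `LINKS_FILE` and `read_links`, and the sample file written to the temp folder, are hypothetical and not taken from your program. The path lives in one variable at the top, and the logic only ever sees a parameter.

```python
import os
import tempfile

# Hypothetical configuration: every machine-specific name sits here,
# at the top of the file, away from the program logic.
LINKS_FILE = os.path.join(tempfile.gettempdir(), "videos_example.txt")


def read_links(filename):
    """Return the non-empty lines of the given file as a list of strings."""
    with open(filename, encoding="utf-8") as arq:
        return [line.strip() for line in arq if line.strip()]


# Create a small sample file so the sketch can run on its own.
with open(LINKS_FILE, "w", encoding="utf-8") as arq:
    arq.write("link-a\nlink-b\n")

links = read_links(LINKS_FILE)
print(links)
```

Because `read_links` never mentions a concrete path, a different caller (a GUI, a test, another script) can reuse it by passing its own filename.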
The second good practice is less universal, but in this case it helps: use Python's context managers to open your files, instead of storing the result of `open` in a variable and calling `close` explicitly. Just use `open` together with Python's `with` statement. You will notice that it becomes impossible to tie the knot you tied by closing the file more than once: since the file is closed at the end of the `with` block, your `for` loop runs either entirely with `arq` open or entirely with `arq` closed.
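A small illustration of that behavior, using a hypothetical scratch file in the temp folder rather than your real files:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.gettempdir(), "with_example.txt")

with open(path, "w", encoding="utf-8") as arq:
    # The whole loop runs while the file is open.
    for n in range(3):
        arq.write(f"line {n}\n")
    open_inside = not arq.closed

# Leaving the block closed the file; no explicit arq.close() is needed.
closed_after = arq.closed
print(open_inside, closed_after)
```

The loop sits inside the `with` block, so there is no place where the file could be closed halfway through the iterations.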
A third nice thing is that, from Python 3.6 on, the `pathlib.Path` class can be used instead of strings for filenames. It has some advantages, such as centralizing all the file functionality you need in a single object.
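For example, a `Path` object can build the full path, create the folder, and write and read the file, without any separate call to `os.path` or `open`. The folder and file names below are hypothetical, chosen only so the sketch runs anywhere:

```python
import tempfile
from pathlib import Path

# Hypothetical folder and file names, used only for this illustration.
folder = Path(tempfile.gettempdir()) / "pathlib_example"
folder.mkdir(exist_ok=True)
links_file = folder / "videos.txt"

# Writing and reading are methods of the same object.
links_file.write_text("link-a\nlink-b\n", encoding="utf-8")
lines = links_file.read_text(encoding="utf-8").splitlines()

print(links_file.name, links_file.suffix, links_file.exists(), lines)
```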
Well, I did a general reorganization. The biggest change is that I work with the data in memory instead of reading the file again on every iteration (you also had another logic error there, in the check for whether a link had already been downloaded). This version loads everything into memory (text data of this kind uses a negligible amount of memory) and uses the `finally` clause of a `try` block to ensure that the new links are written to the file even if the process is interrupted. (I haven't tested this program, but even if some detail is wrong, it may give you an idea for a nicer way to organize the code.)
    from pathlib import Path
    import urllib.request as con

    import pytube

    # Paths and the playlist URL live here, apart from the program logic.
    video_file_path = Path(r'C:\Users\joaov\AppData\Local\Programs\Python\Python37-32\videos.txt')
    download_folder = Path(r"C:\Users\joaov\Dropbox\Músicas")
    root = "https://www.youtube.com/watch?v=vrQWhFysPKY&list=PLViTG4xZB5MiktfNXT0OaxANraqLRgSQr"


    def baixar(video_id):
        print("Downloading")
        yt = pytube.YouTube(video_id)
        # Take the first audio-only stream.
        aud = yt.streams.filter(only_audio=True).first()
        # External libraries sometimes do not work with "Path"
        # objects yet, but try passing it without the
        # conversion to "str" below to see if it works:
        aud.download(str(download_folder))
        print("Download finished")


    def baixa_todas():
        k = 0
        texto = con.urlopen(root).read().decode("utf-8")
        tag = '&index='
        # The call below opens the file, reads all of its
        # content, splits it into lines and stores them in a
        # "set", which is more efficient for lookups with "in":
        links_baixados = set(video_file_path.read_text().splitlines())
        links_novos = []
        try:
            while True:
                sta = texto.find(tag, k)
                if sta == -1:
                    # No more occurrences of the tag in the page.
                    break
                # Keeps the original assumption of a two-character index.
                fnl = sta + len(tag) + 2
                link = root + texto[sta:fnl]
                if link not in links_baixados:
                    print(f"Downloading new link {link}")
                    baixar(link)
                    links_novos.append(link + "\n")
                    links_baixados.add(link)
                # Resume the search right after this occurrence.
                k = fnl
        finally:
            # Runs even if a download fails, so finished links are recorded.
            with video_file_path.open("at") as arq:
                arq.writelines(links_novos)


    if __name__ == "__main__":
        baixa_todas()
        print("Downloaded!!!")
But within the loop that runs through the lines you are closing the file itself.
– Woss