Why is it wrong for the file to be closed if in each iteration it should open again?

Asked

Viewed 50 times

3

I’m trying to make a code so that on my Youtube playlist.

The program compares if the link of such a playlist video is included in a txt file. If it is, it does not download the video, if it is not, download and write the link in that txt file so that the next time it runs it does not download the video again.

However it is giving error after the first iteration that the file is closed, and I open it in each new iteration. What to do?

import urllib.request as con
import pytube
def baixar(y):
    print("Baixando")
    yt = pytube.YouTube(y)
    audios = yt.streams.filter(only_audio=True).all()
    aud = audios[0]
    aud.download(r"C:\Users\joaov\Dropbox\Músicas")
    print("Download terminado")
k = 0 
root = "https://www.youtube.com/watch?v=vrQWhFysPKY&list=PLViTG4xZB5MiktfNXT0OaxANraqLRgSQr"
texto = con.urlopen(root).read().decode("utf-8")
tag = '&index='
while k <= len(texto):
    sta = texto.find(tag,k)
    fnl = sta + len(tag) + 2
    link = root + texto[sta:fnl]
    arq = open(r'C:\Users\joaov\AppData\Local\Programs\Python\Python37-32\videos.txt','r')
    for linha in arq:
        if linha != link and linha != "":
            arq.close()
            arq1 = open(r'C:\Users\joaov\AppData\Local\Programs\Python\Python37-32\videos.txt','a')
            arq1.write(link + "\n")
            baixar(link)
            arq1.close()
    k += fnl
print("Baixados!!!")
  • But within the loop that runs through the lines you are closing the file itself.

1 answer

2

First: Opening and closing the file in each interaction is an incorrect way to use files anyway.

Second, your problem is probably occurring because of the line you do arq.close() within the for - that arq was opened only once - the file in the variable arq1 is that it is open to every interaction.

You will be able to "untangle" a lot of your code by implementing two good practices there: The first is to separate the directory paths and filenames from your code - Note that the exact folder and file name are concerns you have to have, but that has nothing to do with the logic of the program. In addition, of course, they limit your program to running only with these names and folders - the code thus cannot be reused, nor for yourself, if you switch computers. (you already do this by separating the variable with the url root)

This is solved by simply putting these names into distinct variables and easy to know which ones are right at the beginning of the file. Then you can either use these variables directly in your code - or, better yet, create a function that receives the filename as a parameter - this function can then be used as part of an improved program, which has a graphical interface, for example (and you don’t need to mess with a line of code that already works for this - just call the function with the new filename).

The second good practice - it’s less universal, but in this case it helps: it’s using Python’s "context managers" to open your files instead of store the result of open in a variable and call the close explicitly. Just use the open together with the command with Python - you will notice that it is impossible to give the "node" you gave by closing the file more than once - once the file is closed at the end of the block with - or your bow tie for is executed with arq open, or with arq closed.

A third cool thing is that from Python 3.6 the class pathlib.Path Python can be used instead of strings for filenames - it has some advantages - such as centralizing all the desired file functionality in a single object:

well, I gave a general reorganization - the biggest change is that I work with the changes in memory, instead of reading the file again in each interaction - you had another logic error there to check if the link had already been downloaded or not - this version brings everything to memory, (text data of this type use a negligible amount of memory) - and use the clause finally of a Try except to ensure that new links are written to the file - even if some interruption occurs in the process. (I haven’t tested this program - but even if there’s some detail wrong, it might give you an idea for a nicer code organization)

from pathlib import Path

import urllib.request as con
import pytube

video_file_path = Path(r'C:\Users\joaov\AppData\Local\Programs\Python\Python37-32\videos.txt')

download_folder = Path(r"C:\Users\joaov\Dropbox\Músicas")

root = "https://www.youtube.com/watch?v=vrQWhFysPKY&list=PLViTG4xZB5MiktfNXT0OaxANraqLRgSQr"


def baixar(video_id):
    print("Baixando")
    yt = pytube.YouTube(video_id)
    audios = yt.streams.filter(only_audio=True).all()
    aud = audios[0]
    # Bibliotecas externas as vezes não funcionam ainda
    # com objetos "Path" - mas tente passar sem a
    # conversão para "str" abaixo pra ver se funciona:
    aud.download(str(download_folder))
    print("Download terminado")


def baixa_todas():
    k = 0 

    texto = con.urlopen(root).read().decode("utf-8")
    tag = '&index='

    # A chamada abaixo abre o arquivo, le todo o seu
    # conteúdo, separa em linhas e guarda num objeto do tipo
    # "set" que é mais eficiente para busca com o operador "in":
    links_baixados = set(video_file_path.read_text().split("\n"))
    links_novos = []
    try:
        while k <= len(texto):
            sta = texto.find(tag,k)
            fnl = sta + len(tag) + 2
            link = root + texto[sta:fnl]

            if link not in links_baixados:
                baixar(link)
                print(f"Baixando novo link {link}")
                links_novos.append(link + "\n")
                links_baixados.append(link)

            k += fnl
    finally:
        with video_file_path.open("at") as arq:
            arq.write_lines(links_novos)


if __name__ == "__main__":        
    baixa_todas()
    print("Baixados!!!")
  • A good practice, sometimes error, is to mix Portuguese with English. It would be better to leave the initial variations of the path in Portuguese, or not?

  • At some point in the past, Python’s style recommendation said "only leave names other than English if you’re 120% sure that no one outside your country will interact with the code" (and it was "120%" anyway) - I think that phrase fell - but the ideal is not to mix - if you have names in English, leave everything in English and vice versa.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.