Getting HTML attributes with Python

I want to get the values of the aria-label, href and title attributes from the <a> tag below:

    <a aria-label="AS MAIS TOCADAS NO BAILE FUNK 2019 #1 - SET DE FUNK by Funk 24por48 10 months ago 39 minutes 3,186,126 views" class="yt-simple-endpoint style-scope ytd-video-renderer" href="/watch?v=vTakYj4802U" id="video-title" title="AS MAIS TOCADAS NO BAILE FUNK 2019 #1 - SET DE FUNK">
        <yt-formatted-string aria-label="AS MAIS TOCADAS NO BAILE FUNK 2019 #1 - SET DE FUNK by Funk 24por48 10 months ago 39 minutes 3,186,126 views" class="style-scope ytd-video-renderer">AS MAIS TOCADAS NO BAILE FUNK 2019 #1 - SET DE FUNK</yt-formatted-string>
    </a>

I got this HTML snippet using Selenium and BeautifulSoup (code below):

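    from bs4 import BeautifulSoup
    from selenium import webdriver
    self.driver = webdriver.Firefox(options=self.options)
    self.driver.get('https://www.youtube.com/results?search_query=funk+baile')
    self.html = self.driver.find_elements_by_xpath('//*[@id="contents"]')[0].get_attribute('outerHTML')
    # Assumed step (not shown in the original snippet): parse the extracted HTML
    self.soap = BeautifulSoup(self.html, 'html.parser')
    self.html_musicas = self.soap.findAll(id="video-title", href=True)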

How can I get the values of the attributes mentioned above (aria-label, href and title)?

1 answer

I only managed it by converting each tag stored in self.html_musicas to a string, splitting it on double quotes, and storing the pieces I needed in a list:

    self.dados_musicas[self.generos] = []
    for dados_tag in self.html_musicas:
        # Splitting on '"' puts the href value at index 5 and the title value at index 9
        self.dados = str(dados_tag).split('"')
        self.dados_musicas[self.generos].append({self.dados[9]: self.dados[5]})

The output was:

    [{'MC João - Baile de Favela (KondZilla)': '/watch?v=kzOkza_u3Z8'},]
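Splitting on quotes depends on the attributes always appearing in the same order, so it may break if the markup changes. As a possible alternative, the tags returned by BeautifulSoup can be read like dictionaries. The sketch below is only an illustration under a few assumptions: the sample HTML is a shortened copy of the tag from the question, and `extract_attributes` is a hypothetical helper name, not something from the original code.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <a> tag from the question, used as sample input.
html = (
    '<a aria-label="AS MAIS TOCADAS NO BAILE FUNK 2019 #1 - SET DE FUNK by Funk 24por48" '
    'class="yt-simple-endpoint style-scope ytd-video-renderer" '
    'href="/watch?v=vTakYj4802U" id="video-title" '
    'title="AS MAIS TOCADAS NO BAILE FUNK 2019 #1 - SET DE FUNK"></a>'
)

soup = BeautifulSoup(html, "html.parser")


def extract_attributes(tags):
    """Hypothetical helper: collect aria-label, href and title from each tag."""
    results = []
    for tag in tags:
        results.append({
            # .get() returns None instead of raising if the attribute is missing
            "aria-label": tag.get("aria-label"),
            # Indexing is safe here because the search below requires href
            "href": tag["href"],
            "title": tag.get("title"),
        })
    return results


# Same search as in the question: id="video-title" and an href attribute present
links = soup.find_all(id="video-title", href=True)
dados = extract_attributes(links)
print(dados)
```

In the code from the question, the same loop could presumably run over `self.html_musicas` directly, since those are already BeautifulSoup tags.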
