I always use variations of this code on other tables and it usually works, but it doesn't work on this one. What am I doing wrong?
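import requests
from bs4 import BeautifulSoup

url = "https://www.reddit.com/r/movies/comments/hhfalb/what_is_the_best_film_you_watched_last_week/"
req = requests.get(url)
soup = BeautifulSoup(req.content, 'html.parser')
# find() returns a single Tag (or None); findAll() returns a ResultSet, which has no findAll() of its own
tabela = soup.find('table', class_='MRH-njmSb5ZTkfb1o4dqv')
# Guard against None: the table is missing when the page is rendered by JavaScript
colunas = tabela.find_all('td') if tabela is not None else []
print(tabela)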
<table class="MRH-njmSb5ZTkfb1o4dqv"><thead> {....}
{....} <td class="_1LHijgw3WoeCUe8AUewfUB"><a href="https://www.reddit.com/r/movies/comments/hd7rzi/what_is_the_best_film_you_watched_last_week/fvluiyn/" class="_3t5uN8xUmg0TOwRCOGQEcU" rel="noopener nofollow ugc" target="_blank">"Miss Juneteenth"</a></td> {.....}
In my first attempts I was able to print the table's HTML, but now not even that works. The result is usually [] (empty) or None in the log.
I have tried soup.find and soup.findAll, with no luck. How would you do it?
I wanted to avoid using Selenium because it would be a bit heavy (it takes a long time to install on the server, for example). I believe the problem arises precisely because the page is generated by JavaScript or something like that, which is why it worked with Selenium. I guess there's no other way with plain BeautifulSoup, huh? Anyway, I found another solution with a workaround: I got the table through Reddit's RSS feed. It worked great. Thanks for the solution anyway!
– diegooli
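For anyone landing here later, the feed workaround mentioned in the comment could look roughly like the sketch below. It is not the commenter's actual code. It assumes the feed is Atom and that each entry carries the post body as escaped HTML inside a `content` element; the helper names (`CellCollector`, `table_cells_from_feed`) and the sample feed are made up for illustration. It uses only the standard library, so nothing heavy has to be installed on the server.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumption: the feed uses the Atom namespace.
ATOM = "{http://www.w3.org/2005/Atom}"


class CellCollector(HTMLParser):
    """Collects the text of every <td> in an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Text inside nested tags such as <a> still belongs to the open cell.
        if self._in_td:
            self.cells[-1] += data


def table_cells_from_feed(feed_xml):
    """Return the text of all <td> cells found in the entries' HTML content."""
    root = ET.fromstring(feed_xml)
    collector = CellCollector()
    for content in root.iter(ATOM + "content"):
        # ElementTree has already unescaped the embedded HTML (&lt;td&gt; becomes <td>).
        collector.feed(content.text or "")
    collector.close()
    return [cell.strip() for cell in collector.cells]


# Tiny made-up feed standing in for the real one, so the sketch runs offline.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <content type="html">&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;a href="#"&gt;"Miss Juneteenth"&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</content>
  </entry>
</feed>"""

print(table_cells_from_feed(sample_feed))
```

With the real feed you would download the XML first (for example with `requests.get`, as in the question) and pass the response body to `table_cells_from_feed`. The exact feed URL and layout should be checked against what Reddit actually returns.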