Print the elements of a regex match in sequence in Python

Good afternoon everyone. I am new to Python and still learning, so I would like to ask for your help. I have the following HTML source taken from a site:

<div class='numerando'>1 - 1</div><div class='episodiotitle'><a href='https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x1-dublado-e-legendado-online-hd/'>Extreme Aggressor</a> <span class='date'>Sep. 22, 2005</span></div></li><li class='mark-2'><div class='imagen'><img src='https://image.tmdb.org/t/p/w154/d46a4eVjnECDSzKGJDNCSRQGrRo.jpg'></div><div class='numerando'>1 - 2</div><div class='episodiotitle'><a href='https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x2-dublado-e-legendado-online-hd/'>Compulsion</a> <span class='date'>Sep. 28, 2005</span></div></li>

Based on that, I wrote this small piece of code with the following regex:

import re
import requests

site = "https://www.assistirseriesflix.com/series/mente-criminosa-dublado-hd/"
response = requests.get(site)
# response.text is already a decoded str, so no manual decode() call is needed
data = response.text

# findall returns a list of tuples: (numbering, link, title)
match = re.findall(r"<div class='numerando'>(.*?)</div><div class='episodiotitle'><a href='(.*?)'>(.*?)</a>", data)

I have tried several ways to print the results in the following format:

Episode 1 - 1 Extreme Aggressor | Link : https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x1-dublado-e-legendado-online-hd/

Episode 1 - 2 Compulsion | Link : https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x2-dublado-e-legendado-online-hd/

where 1 - 1 comes from <div class='numerando'>1 - 1</div>

and the episode name and link come from <div class='episodiotitle'><a href='https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x1-dublado-e-legendado-online-hd/'>Extreme Aggressor</a>

Calling print(match) correctly shows all the data I want, but I can't format it the way I described above.

I tried several snippets from tutorials on the internet and from other questions here on Stack Overflow, but none of them worked. I also tried to understand BeautifulSoup, without much success. I really do not know what the best way to do this in Python is. Thank you in advance for your help!

  • Do not use regex to work with HTML: https://answall.com/a/440262/112052 (this link shows how a regex becomes more and more complicated, while the right tool, such as Beautiful Soup, handles the job much better)
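    To illustrate the point of the comment above, here is a minimal sketch of how the pattern from the question can stop matching when the markup changes slightly. The shortened example.com URL and the double-quoted variant are assumptions made up for this illustration; they are not taken from the real site.

```python
import re

# Pattern from the question: it assumes single-quoted attributes
# and no whitespace between the two <div> elements.
PATTERN = re.compile(
    r"<div class='numerando'>(.*?)</div>"
    r"<div class='episodiotitle'><a href='(.*?)'>(.*?)</a>"
)

# Markup shaped like the sample in the question (URL is a placeholder).
original = (
    "<div class='numerando'>1 - 1</div>"
    "<div class='episodiotitle'><a href='https://example.com/1x1/'>Extreme Aggressor</a>"
)

# Hypothetical variant: same content, but double quotes and a line break.
variant = (
    '<div class="numerando">1 - 1</div>\n'
    '<div class="episodiotitle"><a href="https://example.com/1x1/">Extreme Aggressor</a>'
)

matches_original = PATTERN.findall(original)
matches_variant = PATTERN.findall(variant)

print(matches_original)  # one (numbering, link, title) tuple
print(matches_variant)   # empty list: the pattern no longer matches
```

    An HTML parser treats both versions as the same tree, which is why it tends to survive this kind of change without any adjustment.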

1 answer

BeautifulSoup is certainly the most suitable library for your case. See:

import sys
import requests
from bs4 import BeautifulSoup

url = 'https://www.assistirseriesflix.com/series/mente-criminosa-dublado-hd/'

response = requests.get(url)

# Stop early if the page could not be fetched
if response.status_code != 200:
    print(f'Erro HTTP: {response.status_code}')
    sys.exit(1)

soup = BeautifulSoup(response.text, 'html.parser')

# First <ul> element whose class is "episodios"
episodios = soup.find('ul', {'class': 'episodios'})

# Each <li> inside that list describes one episode
for episodio in episodios.find_all('li'):
    # Episode numbering, e.g. "1 - 1"
    div = episodio.find('div', {'class': 'numerando'})
    numeracao = div.text

    # Title and link come from the <a> inside the "episodiotitle" div
    div = episodio.find('div', {'class': 'episodiotitle'})
    a = div.find('a', href=True)
    titulo, link = a.text, a['href']

    print(f'Episodio: {numeracao} {titulo} | Link: {link}')

Output:

Episodio: 1 - 1 Extreme Aggressor | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x1-dublado-e-legendado-online-hd/
Episodio: 1 - 2 Compulsion | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x2-dublado-e-legendado-online-hd/
Episodio: 1 - 3 Won't Get Fooled Again | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x3-dublado-e-legendado-online-hd/
Episodio: 1 - 4 Plain Sight | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x4-dublado-e-legendado-online-hd/
Episodio: 1 - 5 Broken Mirror | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x5-dublado-e-legendado-online-hd/
Episodio: 1 - 6 L.S.D.K. | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x6-dublado-e-legendado-online-hd/
Episodio: 1 - 7 The Fox | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x7-dublado-e-legendado-online-hd/
Episodio: 1 - 8 Natural Born Killer | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x8/
Episodio: 1 - 9 Derailed | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x9/
Episodio: 1 - 10 The Popular Kids | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x10-dublado-e-legendado-online-hd/
Episodio: 1 - 11 Bloody Hungry | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x11-dublado-e-legendado-online-hd/
Episodio: 1 - 12 What Fresh Hell? | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x12/
Episodio: 1 - 13 Poison | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x13-dublado-e-legendado-online-hd/
Episodio: 1 - 14 Riding the Lightning | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x14-dublado-e-legendado-online-hd/
Episodio: 1 - 15 Unfinished Business | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x15/
Episodio: 1 - 16 The Tribe | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x16/
Episodio: 1 - 17 A Real Rain | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x17-dublado-e-legendado-online-hd/
Episodio: 1 - 18 Somebody's Watching | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x18-dublado-e-legendado-online-hd/
Episodio: 1 - 19 Machismo | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x19-dublado-e-legendado-online-hd/
Episodio: 1 - 20 Charmed And Harm | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x20/
Episodio: 1 - 21 Secrets And Lies | Link: https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x21-dublado-e-legendado-online-hd/
Episodio: 1 - 22 The Fisher King (1) | Link: https://www.assistirseriesflix.com/episodios/mentes-criminosas-1x22/
  • Thanks for the help. From your code I understand how BeautifulSoup is used in this case. If it is not asking too much, could you also give an example using regex, please? Thank you.
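
For comparison, a regex-based version could look roughly like the sketch below. It runs on the HTML sample copied from the question instead of fetching the page, and it assumes the markup keeps exactly that shape (single quotes, no whitespace between the two divs). The names EPISODE_RE and format_episodes are made up for this example.

```python
import re

# HTML sample copied from the question (two episodes).
html = (
    "<div class='numerando'>1 - 1</div><div class='episodiotitle'>"
    "<a href='https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x1-dublado-e-legendado-online-hd/'>Extreme Aggressor</a> "
    "<span class='date'>Sep. 22, 2005</span></div></li><li class='mark-2'>"
    "<div class='imagen'><img src='https://image.tmdb.org/t/p/w154/d46a4eVjnECDSzKGJDNCSRQGrRo.jpg'></div>"
    "<div class='numerando'>1 - 2</div><div class='episodiotitle'>"
    "<a href='https://www.assistirseriesflix.com/episodios/assistir-mentes-criminosas-1x2-dublado-e-legendado-online-hd/'>Compulsion</a> "
    "<span class='date'>Sep. 28, 2005</span></div></li>"
)

# Same pattern as in the question, with named groups for readability.
EPISODE_RE = re.compile(
    r"<div class='numerando'>(?P<number>.*?)</div>"
    r"<div class='episodiotitle'><a href='(?P<link>.*?)'>(?P<title>.*?)</a>"
)


def format_episodes(text):
    """Return one formatted line per episode found in text."""
    return [
        f"Episode {m['number']} {m['title']} | Link : {m['link']}"
        for m in EPISODE_RE.finditer(text)
    ]


lines = format_episodes(html)
for line in lines:
    print(line)
```

With the original re.findall call, each item in the result is a tuple of the three captured groups, so a loop such as "for number, link, title in match:" unpacks them in the same way.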
