You can use the Beautiful Soup library to parse the HTML.
Install it with:
pip install beautifulsoup4
Then, in your code, fetch the HTML as you already did:
import urllib.request
url = "https://www.youtube.com/watch?v=2MpUj-Aua48"
f = urllib.request.urlopen(url)
html = f.read().decode('utf-8')
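As a slightly more defensive variant of the fetch step, the sketch below closes the connection with a with block and reads the encoding from the response headers instead of assuming UTF-8. The helper name fetch_html and the UTF-8 fallback are my own assumptions, not part of the original code.

```python
import urllib.request


def fetch_html(url):
    """Download a page and return it as text (hypothetical helper)."""
    # The with block closes the connection even if reading fails.
    with urllib.request.urlopen(url) as response:
        # Charset declared in the Content-Type header, or None if absent.
        charset = response.headers.get_content_charset()
        # Assumption: fall back to UTF-8 when no charset is declared.
        return response.read().decode(charset or "utf-8")


# Usage (needs network access):
# html = fetch_html("https://www.youtube.com/watch?v=2MpUj-Aua48")
```

The rest of the answer works the same way with the html string returned by this helper.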
Beautiful Soup then does the hard part, which is parsing the HTML and finding the tags you need:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
meta_tag = soup.head.find('meta', attrs={'name': 'keywords'})
keywords = [keyword.strip() for keyword in meta_tag['content'].split(',')]
Explanation:
Create the BeautifulSoup object:
soup = BeautifulSoup(html, 'html.parser')
Find the first <meta> tag inside <head> whose name attribute has the value keywords:
meta_tag = soup.head.find('meta', attrs={'name': 'keywords'})
The find() method returns the first matching tag, or None if no tag matches the given filters. In the example above, I ask Beautiful Soup for a <meta> element whose name attribute has the value keywords. If no such element exists in the parsed HTML, meta_tag is set to None.
Split the string into a list of keywords (I use str.split() to break the string at the commas and str.strip() to remove the surrounding whitespace):
keywords = [keyword.strip() for keyword in meta_tag['content'].split(',')]
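To see what each half of the list comprehension above contributes, here is a small sketch on a made-up string. The sample value is hypothetical; a real page supplies its own content attribute.

```python
# Hypothetical sample value standing in for meta_tag['content'].
content = " 4k video test, 4k video demo ,samsung tv demo"

# split(',') cuts at every comma but keeps the surrounding spaces.
raw_parts = content.split(",")
print(raw_parts)
# [' 4k video test', ' 4k video demo ', 'samsung tv demo']

# strip() removes leading and trailing whitespace from each piece.
keywords = [part.strip() for part in raw_parts]
print(keywords)
# ['4k video test', '4k video demo', 'samsung tv demo']
```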
Result:
[
"4k video test",
"4k video demo",
"ultra tv video",
"video 4k for shop mode",
"ultra video tv demo play",
"2160p video test",
"hd sourround video test",
"samsung tv demo",
"s...",
]
Full example:
from bs4 import BeautifulSoup
import urllib.request
url = "https://www.youtube.com/watch?v=2MpUj-Aua48"
f = urllib.request.urlopen(url)
html = f.read().decode('utf-8')
soup = BeautifulSoup(html, 'html.parser')
meta_tag = soup.head.find('meta', attrs={'name': 'keywords'})
keywords = [keyword.strip() for keyword in meta_tag['content'].split(',')]
print('=== Keywords ===')
for k in keywords:
    print(f' - {k}')
Working code on Repl.it
Result:
=== Keywords ===
- 4k video test
- 4k video demo
- ultra tv video
- video 4k for shop mode
- ultra video tv demo play
- 2160p video test
- hd sourround video test
- samsung tv demo
- s...
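Note that the full example raises a TypeError at meta_tag['content'] when the page has no keywords meta tag, because meta_tag is then None. A minimal defensive sketch follows. The helper name extract_keywords, the choice to return an empty list, and the inline sample documents are my assumptions. It also searches the whole document rather than only <head>, which is a small change from the code above.

```python
from bs4 import BeautifulSoup


def extract_keywords(html):
    """Return the page keywords, or [] if there are none (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Searching the whole document avoids an AttributeError when the
    # page has no <head>, in which case soup.head would be None.
    meta_tag = soup.find("meta", attrs={"name": "keywords"})
    if meta_tag is None:
        return []
    # get() returns the default instead of raising KeyError when the
    # tag has no content attribute.
    content = meta_tag.get("content", "")
    # Skip empty pieces left by stray or trailing commas.
    return [part.strip() for part in content.split(",") if part.strip()]


# Small inline documents (made up for illustration) instead of a live page.
with_keywords = (
    '<html><head><meta name="keywords" '
    'content="4k video test, samsung tv demo,"></head></html>'
)
without_keywords = "<html><head><title>No keywords</title></head></html>"

print(extract_keywords(with_keywords))     # ['4k video test', 'samsung tv demo']
print(extract_keywords(without_keywords))  # []
```

Returning an empty list keeps the printing loop unchanged, since iterating over it simply prints nothing.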