raise ValueError("unknown url type: %r" % self.full_url)

I'm trying to create a tool that requests a site's /robots.txt and prints what is written in it. Code:

import urllib.request

site = input("Enter the domain:")
api = f'{site}/robots.txt'
pagina = urllib.request.urlopen(api)

print(pagina.read())

But it raises this error:

raise ValueError("unknown url type: %r" % self.full_url)

ValueError: unknown url type: 'uol.com.br/robots.txt'
  • It worked perfectly here: https://repl.it/@acwoss/Verifiableuncommonide. How did you execute the code? What input did you put in?

  • @Andersoncarloswoss did you notice that you can now rename the 'repls' on repl.it? The names it generates are terrible. You always used to get lost when looking for something; now you can even create folders. :-)

1 answer

I know what you're doing wrong: a site's address is not just the name + ".com.br" (the suffix varies). urllib needs the full URL, which is usually composed like this (the "www." part also depends on the site):

protocol + "www." + name + domain, more or less like this:

https:// www. google .com.br. If you leave out the protocol, you get exactly that error. I adapted your code and added error handling, so it should now work correctly, and if you enter an invalid link it prints a message instead of raising an exception.

Run this code on Python 3, and remember that some sites are not https; they may be http, for example. Other sites do not end in .com.br; they may end in .me, .gov, etc.

# -*- coding: utf-8 -*-
import urllib.error
import urllib.request

# Input: the full URL, including the protocol
site = input("Enter the domain, e.g. https://www.google.com: ")

# Join the input with the robots.txt path
api = site + "/robots.txt"

# Error handling in case the link is invalid
try:
    pagina = urllib.request.urlopen(api)
except (ValueError, urllib.error.URLError):
    # ValueError: missing or unknown protocol; URLError: host or HTTP failure
    print("The link '{}' is invalid!".format(site))
else:
    # If no errors occurred, read the page
    print(pagina.read())
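
If you would rather accept input such as uol.com.br without forcing the user to type the protocol, the idea above can be sketched as a small helper. This is only a minimal sketch: the function name `build_robots_url` and the `default_scheme` parameter are hypothetical names made up for illustration, and defaulting to https is an assumption that will not hold for every site.

```python
from urllib.parse import urlsplit


def build_robots_url(site, default_scheme="https"):
    """Return the robots.txt URL for `site`, adding a scheme if it is missing.

    Hypothetical helper: the name and the https default are assumptions.
    """
    site = site.strip()
    # urlopen only accepts URLs that start with a scheme such as http:// or https://
    if "://" not in site:
        site = "{}://{}".format(default_scheme, site)
    parts = urlsplit(site)
    # Keep only the scheme and host; robots.txt lives at the root of the site
    return "{}://{}/robots.txt".format(parts.scheme, parts.netloc)


print(build_robots_url("uol.com.br"))
print(build_robots_url("http://example.com/some/page"))
```

The result can then be passed to urllib.request.urlopen in place of the hand-built string. A site that only answers over http would still fail with the https default, so the error handling is still needed.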
