The error happens in the following line:
text = page.read().decode('utf8')
It attempts to decode the page of the aforementioned site as UTF-8, but fails as soon as it finds a byte sequence that is not valid UTF-8. The content of the page is as follows:
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=shift_jis"><meta http-equiv="Content-Language" content="ja,en"><script type="text/javascript">\r\n\r\n var _gaq = _gaq || [];\r\n _gaq.push([\'_setAccount\', \'UA-20569835-2\']);\r\n _gaq.push([\'_trackPageview\']);\r\n\r\n (function() {\r\n var ga = document.createElement(\'script\'); ga.type = \'text/javascript\'; ga.async = true;\r\n ga.src = (\'https:\' == document.location.protocol ? \'https://ssl\' : \'http://www\') + \'.google-analytics.com/ga.js\';\r\n var s = document.getElementsByTagName(\'script\')[0]; s.parentNode.insertBefore(ga, s);\r\n })();\r\n\r\n</script><title>404 Not Found</title></head><body oncontextmenu="return false;" style="width: 100% !important; height: 2600px !important;">\r\n<center><a href="http://cgi.i-mobile.co.jp/ad_link.aspx?guid=on&asid=32341&pnm=0&asn=1"><img border="0" src="http://cgi.i-mobile.co.jp/ad_img.aspx?guid=on&asid=32341&pnm=0&asn=1&asz=0&atp=2&lnk=6666ff&bg=&txt=000000&pbb=1"></a></center>\r\n<center><a href="http://cgi.i-mobile.co.jp/ad_link.aspx?guid=on&asid=32341&pnm=0&asn=2"><img border="0" src="http://cgi.i-mobile.co.jp/ad_img.aspx?guid=on&asid=32341&pnm=0&asn=2&asz=0&atp=2&lnk=6666ff&bg=&txt=000000"></a></center>\r\n\r\n\r\n<center><FONT SIZE="2">ミンナ�ホが選んだ�ゥ11/07のランキング�ソ</FONT></center>\r\n<center><FONT SIZE="2">�ソ ��位 �ソ</FONT></center>\r\n\r\n<br>\r\n<center><FONT SIZE="2">�ソ ��位 �ソ</FONT></center>\r\n\r\n<a name="madop"></a>\r\n<br>\r\n<center><font size="2">他のキーワードで探してみる</FONT></center><center>\r\n<form method="get" action="/genre23.php">\r\n<font size="2"><input type="text" name="query2" value="" size="8"><font size="4">\r\n<SELECT name="genre">\r\n<OPTION value="3">��</OPTION>\r\n\r\n</SELECT>\r\n</FONT><input type="submit" value=" 探す�マ "></FONT>\r\n<input type="hidden" name="cache" value=""><input type="hidden" name="fname" value="">\r\n</form>\r\n</center><br>\r\n<center><font size="2" color="red"><b><a 
href="/inq/disclaimer.php?ngdom=beans-r-us.biz&ngk=retire%20your%20vehicle">利用規約・削除依頼</a></b></FONT></center>\r\n<br></body></html>'
As you can see, the page contains Japanese characters, and its meta tag declares charset=shift_jis rather than UTF-8. The decoder most likely failed on some of these characters.
So how do I make it work?
– Tagaky
If your intention is just to make the script work, then you can try removing the
.decode('utf8')
call, which leaves the page content as raw bytes.
– Emoon
It worked, thank you
– Tagaky
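If you need the text rather than the raw bytes, one option is to decode with the charset the page itself declares (shift_jis in the meta tag above). The following is only a minimal sketch: decode_page is a hypothetical helper, not part of the original script, and the sample bytes stand in for page.read() so the example runs without a network connection. The fallback to errors="replace" is an assumption about what you would want when some bytes do not match the declared charset.

```python
def decode_page(raw: bytes, declared: str = "shift_jis") -> str:
    """Decode raw page bytes using the charset the page declares.

    Hypothetical helper: falls back to replacing undecodable bytes
    so the script keeps running instead of raising.
    """
    try:
        return raw.decode(declared)
    except UnicodeDecodeError:
        # Some bytes do not match the declared charset;
        # substitute U+FFFD for each undecodable sequence.
        return raw.decode(declared, errors="replace")


# Stand-in for page.read(): Japanese text encoded as Shift_JIS.
raw = "利用規約・削除依頼".encode("shift_jis")

# Decoding the same bytes as UTF-8 reproduces the original error.
try:
    raw.decode("utf8")
except UnicodeDecodeError as exc:
    print("utf8 failed:", exc.reason)

print(decode_page(raw))
```

Note that errors="replace" silently loses information, so it is better suited to scraping or logging than to anything that must round-trip the text.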