Encoding: converting bytes to strings

I have a problem that I have been researching, but none of the solutions I find work. I am fetching a page (a .txt file) and cannot convert it from bytes to a string so that I can work with the data (e.g. page.split("\n")).

import urllib.request

def open_url(url):
    # Fetch the URL and return the raw response body as bytes
    with urllib.request.urlopen(url) as data:
        return data.read()

def main():
    url = "http://openweathermap.org/help/city_list.txt"
    page = open_url(url)
    print(page)

main()

So far so good: the page is returned and printed as bytes. Now I would like to convert it to a string. I have tried:

print(page.decode('utf-8'))

But it raises an error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 289664: invalid start byte

All the other solutions I have seen are equivalent to this. The syntax changes a little, but I believe they do the same thing, e.g. page.decode(encoding='UTF-8'), and the error is the same as the one described above.

I would like to know a way around this so that I can turn the bytes into a string I can work with.

1 answer


The code worked normally for me, but you can try printing it like this:

print(str(page, 'iso-8859-1'))

For me it worked both ways, with utf-8 and with iso-8859-1.
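If the utf-8 decode fails on your side, one possible explanation is that the file is not UTF-8 at all. Byte 0x96 is an en dash in Windows-1252 (cp1252), so that encoding is a plausible guess, but it is only a guess about this particular file. Below is a minimal sketch of a fallback decoder; decode_page and the sample bytes are made up for illustration and are not part of the original code.

```python
def decode_page(raw, encodings=("utf-8", "cp1252")):
    """Return raw decoded with the first encoding in the list that succeeds."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one
            continue
    # Last resort: iso-8859-1 maps every byte to a code point, so it never fails
    return raw.decode("iso-8859-1")


# Hypothetical sample containing the problematic byte 0x96
sample = b"Sao Paulo \x96 BR\nLisbon\n"

text = decode_page(sample)   # utf-8 fails, cp1252 succeeds
lines = text.split("\n")     # now a list of str, ready to process
print(lines)
```

Note that iso-8859-1 never raises an error because every byte value is valid in it, but it decodes 0x96 to an invisible control character rather than a dash. If the exact characters do not matter, page.decode('utf-8', errors='replace') is another option; it substitutes U+FFFD for each undecodable byte.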
