Encoding problem while extracting zip file - edited

Question

Asked 5 years, 7 months ago

Viewed 179 times

1

A webhook calls my API with a POST request whose body contains the URL of a ZIP file.

Using the requests library, I perform a GET on the file URL. I need to extract a few files from this ZIP and run several processing steps on them. The problem is that extracting the archive fails with the following error:

'ascii' codec can't encode character '\xa2' in position 45: ordinal not in range(128)

Request code and attempt to extract the file:

import io
import requests
from zipfile import ZipFile

response = requests.get(url)

with ZipFile(io.BytesIO(response.content)) as thezip:  # response.content is the ZIP file as bytes, hence io.BytesIO()
    thezip.extractall()

When I print out the list of file names:

with ZipFile(io.BytesIO(response.content)) as thezip:
    print(thezip.namelist())

['Nao_Consistido/', 'Nao_Consistido/Relat\xc2\xa2rio de Previs\xc3\x86o de Vaz\xc3\x86o - Limite Inferior - LI.xls', 'Nao_Consistido/Relat\xc2\xa2rio de Previs\xc3\x86o de Vaz\xc3\x86o - Limite Superior - LS.xls', 'Nao_Consistido/Relat\xc2\xa2rio_de_Previs\xc3\x86o de Vaz\xc3\xa4es_PMO_de_DEZEMBRO_2019-preliminar.xls', 'Nao_Consistido/Todos_LI.prv', 'Nao_Consistido/Todos_LS.prv', 'Nao_Consistido/Todos_VE.prv']

I already set the PYTHONIOENCODING environment variable to utf-8 and it didn't help. EDIT: After some tests I realized that the problem occurs only on the server (a Linux system); it does not occur locally on Windows 10.
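
Since the failure appears only on the Linux server, one thing worth checking there is which encodings Python actually uses. This is a diagnostic sketch, not a confirmed cause from this thread. PYTHONIOENCODING only changes the encoding of stdin/stdout/stderr, so it has no effect on how file names are encoded when files are created. The helper name encoding_report is hypothetical.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python uses in this process (hypothetical helper)."""
    return {
        # Used to turn str file names into bytes for the operating system.
        "filesystem": sys.getfilesystemencoding(),
        # Locale default used for text files opened without an explicit encoding.
        "preferred": locale.getpreferredencoding(False),
        # The only one of the three that PYTHONIOENCODING overrides.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for key, value in encoding_report().items():
    print(f"{key}: {value}")
```

If "filesystem" reports ascii on the server but utf-8 locally, the locale settings of the server process (for example LANG or LC_ALL) would be the next thing to look at.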

1

And what would be the "list of file names"?

– Woss

2019/12/03 at 20:31
The .namelist() method lists the names of the files inside the zip.

– Gustavo Serafim

2019/12/04 at 00:31
What namelist method? There is nothing like that in the code in your question. Could you check whether you posted the full code?

– Woss

2019/12/04 at 01:56
I edited the question.

– Gustavo Serafim

2019/12/04 at 12:39
Gustavo, I still can't understand your problem, but could you explain why you are using io.BytesIO to pass a path to ZipFile? Why don't you pass a normal string? Or, what is the content of response.content? \xa2 is the character ¢; if your file name contains this character, that may be the source of your problem. Anyway, these are suggestions to improve your question; as it stands, it is hard to pin down your problem.

– fernandosavio

2019/12/05 at 19:44
I use io.BytesIO because response.content returns the file to me as bytes.

– Gustavo Serafim

2019/12/05 at 20:32
But with BytesIO you will still have bytes... I advise you to convert your bytes into a string; you can use str(response.content, encoding='utf-8') (use the correct encoding for your request; see the docs).

– fernandosavio

2019/12/05 at 20:57
When I try the conversion I get a similar error: UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-12: ordinal not in range(128)

– Gustavo Serafim

2019/12/05 at 21:17

2 answers

by Igor Gabriel • **530** points · Answer 1 · 2019-12-04T14:27:51+00:00

One solution is to encode each name back to its raw bytes and decode it with the correct encoding:

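import io
from pathlib import Path
from zipfile import ZipFile

import requests

# Assumption: the real encoding of the names stored in the ZIP. The bytes in the
# question's listing (A2, C6, E4) read as ó, ã, õ in cp850; adjust if yours differs.
encoding = 'cp850'

response = requests.get(url)  # url comes from the webhook's POST body

with ZipFile(io.BytesIO(response.content)) as thezip:
    for name in thezip.namelist():
        # zipfile decodes names without the UTF-8 flag as cp437;
        # recover the raw bytes and decode them with the real encoding
        n = Path(name.encode('cp437').decode(encoding))
        print(n)
        if name.endswith('/'):
            # directory entry
            n.mkdir(parents=True, exist_ok=True)
        else:
            # make sure the parent directory exists before writing the file
            n.parent.mkdir(parents=True, exist_ok=True)
            with n.open('wb') as w:
                w.write(thezip.read(name))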
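
The encode/decode round trip can be tried without downloading anything. The sketch below simulates what happens to a name on its own, with no real ZIP involved: it assumes the archive stored the name as cp850 bytes and that zipfile decoded those bytes as cp437, which is what it does when the entry's UTF-8 flag is not set. The helper fix_name, its real_encoding parameter, and the sample file name are illustrative assumptions.

```python
def fix_name(name, real_encoding="cp850"):
    """Undo a cp437 decoding and decode the raw bytes again (hypothetical helper)."""
    return name.encode("cp437").decode(real_encoding)


# Assumed sample name, modelled on the listing in the question.
original = "Nao_Consistido/Relatório de Previsão de Vazão.xls"

# Bytes as an archiver using cp850 would have stored them.
raw = original.encode("cp850")

# What a cp437 decoding of those bytes produces: some characters come out wrong.
garbled = raw.decode("cp437")

# The round trip restores the intended name.
print(fix_name(garbled) == original)  # True
```

Plain ASCII names pass through unchanged, so the same function can be applied to every entry. On Python 3.11 or later, ZipFile also accepts a metadata_encoding argument when reading, which may make the manual round trip unnecessary.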

by Jorge Borges • 63 points · Answer 2 · 2019-12-04T14:33:01+00:00

-1

Python, like other languages, interprets \ as a string escape character. Try separating the components of the ZIP file's path with / instead.

Could you give an example? Should I split the path before extracting?

– Gustavo Serafim

2019/12/04 at 14:47
Use a direct path, e.g. C:/Users/Name/Desktop/File.zip. The requests module takes the path it is given and splits it into directories on the / separator until it reaches the desired file.

– Jorge Borges

2019/12/04 at 15:20