Problem with Python encode/decode

I have two functions. The first builds some HTML and writes it to a .txt file; the second opens that file and generates an email through Outlook, placing the HTML, with its formatting, in the body of the message. The first part works perfectly: the .txt file contains the HTML without any errors. But while Outlook is opening, it closes and the following exception is raised:

'ascii' codec can't encode character u'\xe7' in position 529: ordinal not in range(128)

I know that "\xe7" is 'ç', but I can't fix it. I've tried calling .decode("utf-8") and .encode("utf-8") on the 'email_html_leitura' variable, but the codec error persists. Here is the code of the two functions, in case I did something wrong:

Function 1:

import sys
import codecs
import os.path

def gerar_html_do_email(self):
    texto_solic = u'Solicitação Grupo '
    with codecs.open('html.txt', 'w+', encoding='utf8') as email_html:
        try:
            for k, v in self.dicionario.iteritems():
                email_html.write('<h1>'+k+'</h1>'+'\n')
                for v1 in v:
                    if (v1 in u'Consulte o documento de orientação.') or (v1 in u'Confira o documento de orientação.'):
                        for x, z in self.tit_nome_pdf.iteritems():
                            if x in k:
                                email_html.write('<a href='+'%s/%s'%(self.hiperlink,z+'>')+'Mais detalhes'+'</a>'+'\n')
                    else:
                        email_html.write('<p>'+v1+'</p>'+'\n')
                email_html.write('<p><i>'+texto_solic+'</i></p>'+'\n')
            email_html.close()
        except Exception as erro:
            self.log.write('gerar_html_para_o_email: \n%s\n'%erro)

Function 2:

def gerar_email(self):
    import win32com.client as com
    try:
        outlook       = com.Dispatch("Outlook.Application")
        mail          = outlook.CreateItem(0)
        mail.To       = u"Lista Liberação de Versões Sistema"
        mail.CC       = u"Lista GCO"
        mail.Subject  = u"Atualização Semanal Sistema Acrool"
        with codecs.open('html.txt', 'r+', encoding='utf8') as email_html_leitura:
            mail.HTMLBody = """
                            <html>
                                <head></head>
                                <body>
                                    <style type=text/css>
                                        h1{
                                            text-align: center;
                                            font-family: "Arial";
                                            font-size: 1.1em;
                                            font-weight: bold;
                                        }
                                        p{
                                            text-align: justify;
                                            font-family: "Arial";
                                            font-size: 1.1em;
                                        }
                                        a{
                                            font-family: "Arial";
                                            font-size: 1.1em;
                                        }
                                    </style>
                                    %s
                                </body>
                            </html>
                            """%(email_html_leitura.read().decode("utf-8"))
        email_html_leitura.close()
        mail.BodyFormat = 2
        mail.Display(True)
    except Exception as erro:
        self.log.write('gerar_email: \n%s\n'%erro)

If anyone can help, I would be grateful. Every week I have to do the tedious task of creating this email and formatting it to a fixed pattern, all by hand, because the data changes every week and there is usually a lot of it. Automating this will save me practically a whole morning. Thank you.

  • For starters: use Python 3.

  • To understand what is going wrong, please read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)", which has a Portuguese translation on the local.joelonsoftware.com wiki.

1 answer


I think the problem is that the source file does not declare its encoding.

You can put this snippet right at the beginning of your file:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

Another solution is to use Python 3.
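
As a rough sketch of what that could look like (the same code should also run on Python 2.7, because it uses io.open): the file is opened with an explicit encoding, so read() already returns decoded text and the extra .decode("utf-8") from gerar_email is not needed. TEMPLATE and build_html_body are hypothetical names, and the template is a shortened stand-in for the one in the question.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical, shortened stand-in for the HTML template in gerar_email.
TEMPLATE = u"<html><head></head><body>%s</body></html>"


def build_html_body(path):
    """Return the full HTML body, with the fragment stored at `path` inserted."""
    # io.open with an encoding returns already-decoded text
    # (unicode in Python 2.7, str in Python 3), so no .decode() is needed.
    with io.open(path, 'r', encoding='utf-8') as fragment_file:
        fragment = fragment_file.read()
    return TEMPLATE % fragment


# Usage: write a sample fragment to a temporary file, then build the body.
sample_path = os.path.join(tempfile.mkdtemp(), 'html.txt')
with io.open(sample_path, 'w', encoding='utf-8') as sample_file:
    sample_file.write(u'<p><i>Solicita\xe7\xe3o Filial</i></p>\n')

body = build_html_body(sample_path)
```

The resulting text could then be assigned to mail.HTMLBody as in the question.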

#Edit: Added solution that solved the problem

reload(sys)
sys.setdefaultencoding('utf8')
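
A likely explanation of why this works, offered as an assumption since the error could not be reproduced from the posted code alone: codecs.open(..., encoding='utf8') already returns unicode text, and in Python 2 calling .decode("utf-8") on unicode text first encodes it implicitly with the default codec, which is 'ascii'. That implicit step fails on 'ç'. Changing the default to 'utf8' makes the implicit step succeed, but it does so for the whole process, third-party libraries included. The sketch below makes the implicit step explicit; roundtrip is a hypothetical helper, not something Python provides.

```python
# -*- coding: utf-8 -*-

# Already-decoded text, like what codecs.open(..., encoding='utf8').read() returns.
text = u'Solicita\xe7\xe3o'


def roundtrip(value, implicit_codec):
    """Mimic Python 2's unicode.decode('utf-8'): encode first, then decode."""
    return value.encode(implicit_codec).decode('utf-8')


# With the 'ascii' default, the hidden encode step fails on the non-ASCII character.
try:
    roundtrip(text, 'ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# With 'utf-8' as the default, the round trip succeeds and returns the same text.
utf8_result = roundtrip(text, 'utf-8')
```

If this is indeed the cause, removing the .decode("utf-8") call in gerar_email would avoid the implicit step without changing the process-wide default.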

  • Thanks for the help. My file already has this line at the beginning; in fact, without it the source does not even compile. I use some external libraries such as Mechanize, and when I tested them in Python 3 they didn't work properly, so I still use Python 2.7.

  • Got it. Can you produce a very simple example that isolates the part where the problem occurs? Maybe some code that only reads the input file and writes to another one. I can't reproduce the problem using only the code you posted.

  • To make your analysis easier, below is the content of the html.txt file that is inserted into the mail.HTMLBody variable through %s. The surprising thing is that when I paste the content directly in place of %s, the email opens normally, without the error and with all the formatting intact. I'm thinking about keeping the content in a variable in memory instead, but I still don't know how to do it. I'm burning my neurons here.

  • <h1>PRO240 – Mudança 1132</h1> <p>Uma Filial relatou erro Oracle...</p> <p>Existiam na base 2 registros na item sem indícios de que foram...</p> <p><i>Solicitação Filial</i></p> <h1>PAG109 – Mudança 1133</h1> <p>Na mudança 1109 foi solicitado à Filial...</p> <p>Foi solicitada a alteração “Data do Pagamento”...</p> <p>O programa foi alterado para listar...</p> <p><i>Solicitação Filial</i></p> <h1>CAD400 – Mudança 1134</h1> <p>Uma Filial relatou que endereço não estava sendo...</p> <p>O campo foi adicionado e o problema foi resolvido...</p> <p><i>Solicitação Filial</i></p>

  • To test, create the html.txt file and copy this content into it, saving it as 'UTF-8' and not 'Unicode'. Then run only the function that generates the email.

  • I managed to run it without the error by replacing %(email_html_leitura.read().decode("utf-8")) with %(email_html_leitura.read())

  • And did Outlook open with the correct formatting?

  • Since I shortened the HTML quite a bit so you could paste it into the html.txt file, it may have worked because of that, but I'll test it at home later; hopefully it works. I remember trying without the .decode("utf-8") and getting the same error, but I may have done it the wrong way, so I'll test again. Thanks for helping me, Klaus.

  • Maybe you are saving the .txt file in an encoding other than UTF-8. This may also solve it: https://stackoverflow.com/questions/21129020/how-to-fix-unicodedecodeerror-ascii-codec-cant-decode-byte

  • It worked, Klaus. I followed the link you posted and put the lines reload(sys) and sys.setdefaultencoding('utf8') just below import sys, and then it worked perfectly. Thank you very much, brother, thanks a lot. Please add this solution to your answer so I can give you credit for your help. Hugs.
