Selenium Webdriver error accentuation

Question

Selenium Webdriver error accentuation

Asked 6 years, 11 months ago

Viewed 735 times

-2

I am using Selenium Webdriver to realign test with python

when running this line:

mov.find_elements_by_xpath("td")[3].text.encode('utf-8')

I have as answer:

{'descricao': 'PROTOCOLIZADA PETI\xc3\x87\xc3\x83O'},

When I should:

{'descricao': 'PROTOCOLIZADA PETIÇÃO'},

Complete code

#encoding: utf-8
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.support import ui
from app.browser_automation.browser_automation import BrowserAutomation
import time
import CONST
from datetime import datetime

from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support import ui

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


class BrowserAutomationTrt19(BrowserAutomation):

    def page_is_loaded(self, driver):
        time.sleep(2)
        return True #driver.find_element_by_id("Div_HeaderMedium") != None

    # consultas dos processo  na base afim de procurar no portal
    def make_consulta(self, numero):

        #preencher os elementos da consulta
        #dividir os numero em array
        partes_numero_processo = self.get_partes_do_numero(numero)

        #preencher os campos no formulario da página

        self.browser.find_element_by_xpath("//input[contains(@id,'seq')]").send_keys(partes_numero_processo[0])
        self.browser.find_element_by_xpath("//input[contains(@id,'dig')]").send_keys(partes_numero_processo[1])
        self.browser.find_element_by_xpath("//input[contains(@id,'ano')]").send_keys(partes_numero_processo[2])
        self.browser.find_element_by_xpath("//input[contains(@id,'org')]").send_keys(partes_numero_processo[5])

        #clicar no butão buscar
        self.browser.find_element_by_xpath("//input[contains(@class,'btnBuscar')]").click()
        #atualizar tela
        self.browser.refresh

        #se nao encontrar processo
        #if not self.processo_naoencontrado():
            #wait = ui.WebDriverWait(self.browser, 1000000000)
        #else:
            #CONST.PROCESSO_ENCONTRADO = False

        return True

    #se nao encontra processo
    def processo_naoencontrado(self):

        try:
            return "Nenhum registro encontrado!" in self.browser.find_element_by_id('idDivBlocoMensagem').text
        except:
            return False

    #obtendo as movimentacoes
    def read_movimentacoes(self,processo):
        movimentacoes = []
        time.sleep(5)

        #clicar no link <a>     
        self.browser.find_element_by_xpath("//div[contains(@class,'row-fluid')]/p[contains(@class,'lead')]/a").click()
        time.sleep(2)

        print 'Lendo Movimentacoes...1'
        moviments = self.browser.find_elements_by_xpath("//table[contains(@id,'tableMovimentacoes1inst')]/tbody/tr")

        print "teste 0"
        for mov in moviments:
            data_hora = mov.find_elements_by_xpath("td")
            data_hora = data_hora[0].text + " " + data_hora[1].text + ":00"
            data_mov = datetime.strptime(data_hora.strip(), '%d/%m/%Y %H:%M:%S')

            if processo['ultima_movimentacao'] <= data_mov:

                fase_movimentacao = mov.find_elements_by_xpath("td")[3].text.encode('utf-8')

                mov = {'data': data_hora,
                    'faseMovimentacao': {'descricao': fase_movimentacao}
                    }

                if mov not in movimentacoes:
                    movimentacoes.append(mov)
                    # print mov
            else:
                print "teste3"
                return movimentacoes

        print "teste4"
        time.sleep(2)
        return movimentacoes

Error image

fase_motion = mov.find_elements_by_xpath("td")[3].text.Decode('utf-8') File "C: Python27 Lib encodings utf_8.py", line 16, in Decode Return codecs.utf_8_decode(input, errors, True) Unicodeencodeerror: 'ascii' codec can’t Encode characters in position 6-7: ordinal not in range(128)

To avoid these coding problems I would recommend using Python3+ ;) And what you want to do is decode('utf-8') and not encode()

– Tuxpilgrim

2018/08/28 at 13:06
I can’t... The system is configured to use Python 2.7

– alexjosesilva

2018/08/28 at 13:10
@Tuxpilgrim can contribute more consistently to problem resolution ?

– alexjosesilva

2018/08/28 at 13:11
I just edited my comment with a suggestion, and I believe that suggesting another version is consistent with your problem, since it was not clear in the question that you could not change version. Try to do what I suggested by putting mov.find_elements_by_xpath("td")[3].text.decode('utf-8')

– Tuxpilgrim

2018/08/28 at 13:13
In the image, has decode() but in the line that showed have encode(), which one are you using?

– Tuxpilgrim

2018/08/28 at 13:15
I used both and the problem persists...in some questions here at steckoverflow used Decode to solve

– alexjosesilva

2018/08/28 at 13:23
i receive this message: fase_motion = mov.find_elements_by_xpath('td')[3].text.Decode('utf-8') File "C: Python27 Lib encodings utf_8.py", line 16, in Decode Return codecs.utf_8_decode(input, errors, True) Unicodeencorror: 'ascii' codec can’t Find characters in position 6-7: ordinal not in range(128)

– alexjosesilva

2018/08/28 at 13:24
@Tuxpilgrim where I find the documentation for Selenium webdriver ?

– alexjosesilva

2018/08/28 at 13:26
Here has the doc

– Tuxpilgrim

2018/08/28 at 13:27
I did fase_move = mov.find_elements_by_xpath('td')[3].text.Encode('ISO-8859-1') the error persists!!

– alexjosesilva

2018/08/28 at 13:29
Let’s go continue this discussion in chat.

– Tuxpilgrim

2018/08/28 at 13:40

Show 6 more comments

1 answer

Browser other questions tagged python selenium selenium-webdriver

You are not signed in. Login or sign up in order to post.

by Guilherme Nascimento • **98,651** points · Answer 1 · 2018-08-28T13:56:42+00:00

The problem is not in Selenium, probably the problem is in the site you are trying to encode or decode to utf8, simply there may be invalid characters, which probably in the page you try to access should be displayed as:

�

Maybe you don’t even need the encode in mov.find_elements_by_xpath("td")[3].text.encode('utf-8') because it is likely that the site is already in utf-8 as well.

Generally site use iso-8859-1/windows-1252 or utf-8, I believe your intention is to convert to a string, in case you could probably do this:

fase_movimentacao = b"abc"

try:
    minha_str = fase_movimentacao.decode('utf-8')
except ValueError:
    minha_str = fase_movimentacao.decode('iso-8859-1')

print(minha_str)

In your code it would be something like:

fase_movimentacao = mov.find_elements_by_xpath("td")[3].text

try:
    minha_str = fase_movimentacao.decode('utf-8')
except ValueError:
    minha_str = fase_movimentacao.decode('iso-8859-1')

mov = {'data': data_hora,
    'faseMovimentacao': {'descricao': minha_str }
}

But I can’t say if it’ll take care of everything, especially if you have coding problems on the source site.