How to remove spaces at the beginning and end of an XLS output

Asked

Viewed 1,883 times

3

I have the code below, for him to open a spreadsheet on XLS and show me the header. however I would like the left and right spaces to be removed and the internal spaces to be replaced by underline there is some function in the xlrd do that?

   import xlrd

    def xlread(arq_xls):

        xls = xlrd.open_workbook(arq_xls)
        # Pega a primeira planilha do arquivo
        plan = xls.sheets()[0]

        # Para i de zero ao numero de linhas da planilha
        for i in xrange(plan.nrows):
            # Le os valores nas linhas da planilha
            yield plan.row_values(0)

    for linha in xlread("teste.xls"):
        print linha

Example of Saida;

[u'Cod.Lab.', u'Laudo', u'Cliente', u'Fazenda', u'Talh\xe3o', u'Profundidade', u'Descri\xe7\xe3o', u' pH          ', u'pH', u' pH           ', u'CE', u'MO ', u'P resina', u'S-SO4', u'K (res)', u'Na', u'Ca', u'Mg', u'Al'

As you can see where it has ~ it is not figured out and in "pH" there are several spaces.

  • Hi William. Can post an example of the output you are getting and the output you want?

  • edited. As I want, it would all be in the same format, lowcase, without accent and spaces.

2 answers

4


To remove the whitespace before and after the text, you can use the .strip().

To print the special characters, you can put the .encode('utf-8').

To put everything in Lower case, put a .lower() at the end of the sentence.

To swap the various spaces in the middle of the string for a single _, you can make a .split() to separate all words within the string and then join them with the "_".join().

Putting the four together would look like this:

print "_".join(celula.encode('utf-8').strip().lower().split())

Whereas celula is an iteration of each element of linha.

Besides you have a logic error in this excerpt:

yield plan.row_values(0)

Where instead of 0 should be i.

Putting everything back into your code would look like this:

import xlrd


def xlread(arq_xls):

    xls = xlrd.open_workbook(arq_xls)
    # Pega a primeira planilha do arquivo
    plan = xls.sheets()[0]

    # Para i de zero ao numero de linhas da planilha
    for i in xrange(plan.nrows):
        # Le os valores nas linhas da planilha
        yield plan.row_values(i)

for linha in xlread("teste.xlsx"):
    for celula in linha:
        print "_".join(celula.encode('utf-8').strip().lower().split())
  • Very nice, solved a lot of the problem. However as I described in the text (my error maybe), to put everything as lowercase and also remove the spaces inside...maybe replace by underline

  • To put in lowercase is only for one .lower() after strip. About the spaces in the middle of the sentence you have to decide, is to take or to put _? Besides, it would be nice if you edit your question, because that part was hidden in the comments.

  • I think the ideal would be to put a _

  • OK, editing the answer

  • 1

    Perfect Math .

0

I believe that spaces can be easily removed as follows:

for linha in xlread("teste.xls"):
    for item in linha:
        print item.strip()

As for the special characters I think you should leave so, when they are read from the list they should return to their original format.

  • 'list' Object has in attribute'strip'.. the following error occurred.

  • @Guilhermelima I made a mistake when writing the code. It’s already fixed

  • This solved the question of spaces at the beginning, but in the center or even the uppercase letters. But thank you very much.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.