Read rows and columns merged in excel, using python

Asked

Viewed 2,188 times

4

I’m having trouble reading the information from a spreadsheet that has rows and columns merged. I tried to use merged_cell to get the value, but I don’t understand how it works. Can anyone help me?

Using the script below, I can get the values, except for for merged cells. How can I get all cells, including merged cells?

from xlrd import open_workbook,cellname

workbook = open_workbook('file.xls')
sheet = workbook.sheet_by_index(0)

for row_index in range(sheet.nrows):       
for col_index in range(sheet.ncols):    

print cellname(row_index,col_index),'-',    
print sheet.cell_value(row_index,col_index)
  • Your problem is that the merged cells give no value, or that only one of them (top-left) gives some value?

  • 1

    Hi, I have a table 515x258. I have for example five columns merged, and within merge I have 2 rows. The first row is normal and the second row has 4 columns merged. Can you visualize mentally? Of all these rows and columns, I get value from: some cells that are merged and I get no unmixed cell value. All cells contain values.

1 answer

2


The functioning of merged_cells is as follows: each entry in that list contains 4 values - the first row (top), the last row, the first column (left) and the last column. These lines and columns form a rectangle. In the worksheet data, only the upper left cell has data and formatting, the others appear as if they were blank.

A way to map all cells to their actual value would then be as follows:

from collections import defaultdict
mescladas = defaultdict(lambda: '')

for crange in sheet.merged_cells:    # Para cada intervalo (conjunto de células mescladas)
    rlo, rhi, clo, chi = crange      # Descobre o "retângulo" que elas formam
    for rowx in range(rlo, rhi):     # Para cada linha
        for colx in range(clo, chi): # e cada coluna
            mescladas[(rowx,colx)] = sheet.cell_value(rlo, clo) # Copie o valor da célula
                                                                # superior esquerda

for row_index in range(sheet.nrows):
   for col_index in range(sheet.ncols):
       print cellname(row_index,col_index),'-',
       print sheet.cell_value(row_index,col_index) or mescladas[(row_index,col_index)]
  • Thank you for the explanation so detailed, I will try here now!

  • Hi, it returns me Nameerror: name 'thesheet' is not defined . How do I define/declare it? I thought it was the name of my spreadsheet, but it’s not. Can you guide me? Thank you

  • @Marytortugo It was just a mistake to copy the example of documentation, the right one is to use sheet same (that was the name you gave to variable). Fixed

  • It worked Super Right!! Thank you so much!

Browser other questions tagged

You are not signed in. Login or sign up in order to post.