My dataframe skips a line in excel when Columns receives more than one line (pandas)

Question

My dataframe skips a line in excel when Columns receives more than one line (pandas)

Asked 4 years, 3 months ago

Viewed 54 times

1

I’m trying to make dataframes and pass them to excel, but when I do a dataframe with more than one header line, it skips a line before plotting the data in the spreadsheet (leaves an empty line). It seems that the problem is tied to the fact of my argument Columns be a list of lists, because it doesn’t happen in dataframes where Columns is a single list. Does anyone have any suggestions on how to resolve this, take away these empty lines?

import pandas as pd
import numpy as np

def multiple_dfs(df_list, sheets, file_name, spaces):
    writer = pd.ExcelWriter(file_name,engine='xlsxwriter')
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1
    writer.save()
    
array1 = [['one', 'one', 'one', 'two', 'two', 'two'],['a', 'b', 'c', 'a', 'b', 'c']]
df1 = pd.DataFrame(np.random.randn(5, 6),columns=array1)

array2 =[[1, 2, 3], ['Sharon', 'Nick', 'Bailey'],[22,23,24]]
midx = pd.MultiIndex.from_arrays(array2, 
            names =('Number', 'Names', 'Ages')) 
df2 = pd.DataFrame(np.random.randn(5, 3),columns=midx)

array3 = ['one', 'one', 'one', 'two', 'two', 'two']
df3 = pd.DataFrame(np.random.randn(5, 6),columns=array3)

df_lista = [df1, df2, df3]
multiple_dfs(df_lista,'Frames','Frames.xlsx',4)

Change anything if you put row = 1?

– Paulo Marques

2021/03/19 at 20:56
He would start printing the dataframes from the second line of the spreadsheet

– gabrielouverney

2021/03/19 at 21:17
Yes, exactly. I understood that the problem is that you are skipping a line between the header and the data. But, you tried row = 1?

– Paulo Marques

2021/03/19 at 21:50
I just tried, it didn’t work

– gabrielouverney

2021/03/19 at 22:23

1 answer

Browser other questions tagged python excel pandas

You are not signed in. Login or sign up in order to post.

by Paulo Marques • **3,739** points · Answer 1 · 2021-03-21T19:13:12+00:00

Did you find or knot that needs to be untied by the people of pandas.

It is a matter of their choice. Although it is not a bug, the result is as if it were.

This is only perceived when the column name is the type MultiIndex.

See below for why this happens:

Case 1 - visually a bug

m = pd.MultiIndex.from_tuples([(1,1),(1,2)])
df = pd.DataFrame([[1,2],[3,4]], columns=m)
df.to_excel('Frames.xls', sheet_name="Frames", startrow=0, startcol=0)

Case 2 - starting to understand the problem

m = pd.MultiIndex.from_tuples([(1,1),(1,2)], names=['linha 1','linha 2'])
df = pd.DataFrame([[1,2],[3,4]], columns=m)
df.to_excel('Frames.xls', sheet_name="Frames", startrow=0, startcol=0)

The modification was the inclusion of names for the columns

Case 3 - The reason for the problem

m = pd.MultiIndex.from_tuples([(1,1),(1,2)], names=['linha 1','linha 2'])
df = pd.DataFrame([[1,2],[3,4]], columns=m)
df.to_excel('Frames.xls', sheet_name="Frames", startrow=0, startcol=0, index_label="Indices")

The modification was the inclusion of names for the columns and labels for the indexes.

Note that that blank line in Case 1 is actually a line reserved for the index label.

It seems that they beat the hammer to when the columns nay are of the type MultiIndex, the index label is in the same row as the column name and ignores the column label.

Case 4 - Described above

m = pd.Index([1,2], name="coluna")
df = pd.DataFrame([[1,2],[3,4]], columns=m)
df.to_excel('Frames.xls', sheet_name="Frames", startrow=0, startcol=0, index_label="Indices")