I need to join several excel files in a single split into python folders. I did this method but wanted to create a function:

Asked

Viewed 392 times

0

I made this function that receives the destination directory and the month that was generated the reports, I wanted to make it smaller using a for that goes through this list:

ufv_name = ['CB1', 'CB2', 'CB3', 'GM1', 'IM1', 'JD2', 'PIU', 'STR'] #sigla das ufvs

Follows the function:

def unir(dir, mes):
    #IMPORTANDO OS RELATORIOS PARA UNIR EM UM SÓ
    CB1 = pd.read_excel('{}PR_CB1_{}.xlsx'.format(dir, mes))
    CB2 = pd.read_excel('{}PR_CB2_{}.xlsx'.format(dir, mes))
    CB3 = pd.read_excel('{}PR_CB3_{}.xlsx'.format(dir, mes))
    PIU = pd.read_excel('{}PR_PIU_{}.xlsx'.format(dir, mes))
    IM1 = pd.read_excel('{}PR_IM1_{}.xlsx'.format(dir, mes))
    STR = pd.read_excel('{}PR_STR_{}.xlsx'.format(dir, mes))
    JD2 = pd.read_excel('{}PR_JD2_{}.xlsx'.format(dir, mes))
    GM1 = pd.read_excel('{}PR_GM1_{}.xlsx'.format(dir, mes))

    with pd.ExcelWriter('{}Relatorio_total.xlsx'.format(dir), engine='xlsxwriter') as writer:
        CB1.to_excel(writer, sheet_name='CB1', index = False)
        CB2.to_excel(writer, sheet_name='CB2', index = False)
        CB3.to_excel(writer, sheet_name='CB3', index = False)
        PIU.to_excel(writer, sheet_name='PIU', index = False)
        IM1.to_excel(writer, sheet_name='IM1', index = False)
        STR.to_excel(writer, sheet_name='STR', index = False)
        JD2.to_excel(writer, sheet_name='JD2', index = False)
        GM1.to_excel(writer, sheet_name='GM1', index = False)

    print('Relatórios Unidos em um só arquivo')

You may notice that imported reports always have the same pattern.

  • Test this https://ideone.com/uPVSQv code, I did it in my head because I don’t have the files to test.

  • I managed to do with your code, thank you very much!

2 answers

0

I hope it helps you.

import pandas as pd

ufv_names = ['CB1','CB2','CB3']

def unir(dir, mes):
    files = [_load_file(f'{dir}PR_{name}_{mes}.xlsx') for name in ufv_names]
    names_files = list(zip(ufv_names, files))
    
    with pd.ExcelWriter('{}Relatorio_total.xlsx'.format(dir), engine='xlsxwriter') as writer:
        [_write_file(writer, f, name) for name, f in names_files]
        print('Relatórios Unidos em um só arquivo')

def _load_file(file):
    return pd.read_excel(file)

def _write_file(writer, file, name):
    file.to_excel(writer,sheet_name=name, index= False)
  • I took the test here and it really worked out, thank you very much! Congratulations on your knowledge

0

A possible refactoring using a partial function:

from functools import partial
import pandas as pd
 
ufv_name = ['CB1', 'CB2', 'CB3', 'GM1', 'IM1', 'JD2', 'PIU', 'STR']
 
def readxlsx(ufv, d, m):
  return (pd.read_excel(f'{d}PR_{ufv}_{m}.xlsx'), ufv)
 
def unir(dir, mes):
  f = partial(readxlsx, d=dir, m=mes)
  planilhas = list(map(f, ufv_name))
  with pd.ExcelWriter(f'{dir}Relatorio_total.xlsx', engine='xlsxwriter') as writer:
    for p in planilhas:
      p[0].to_excel(writer, sheet_name=p[1], index = False)

A partial function is a function created from another function where part of its argument is previously defined. In Python the module is available on functools the function partial() to create partial functions.

The logic of the algorithm is simple:

  • was created the readxlsx(ufv, d, m) who reads a spreadsheet on the way f'{d}PR_{ufv}_{m}.xlsx' and associates it with ufv on a tuple.
  • a partial function is created f about readxlsx() fixing the d in dia and m in mes and leaving ufv as a valid parameter.
  • then in planilhas = list(map(f, ufv_name)) a list is obtained from a mapping between each element of ufv_name and a worksheet.
  • Then just go through the list and add to each sheet the file Relatorio_total.xlsx.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.