I suggest you take a look at the function zip
and - if you do not yet have this concept - in list comprehensions (list comprehensions). This will help you a lot to structure your code.
If you have a ready code that handles a single file:
workbook_r = open_workbook('arquivo.xls', formatting_info=True)
worksheet_r = workbook_r.sheet_by_index(0)
...
The "natural" way to move to another file is the use of functions:
def ler_arquivo(nome):
workbook_r = open_workbook(nome, formatting_info=True)
worksheet_r = workbook_r.sheet_by_index(0)
...
return resultados # Aquilo que você quer extrair do arquivo
ler_arquivo('arquivo.xls')
If you have N files then you will call this function N times by extracting the individual results from each one (if the files have different structures then you will have to do different functions for each "file type", of course):
arquivos = ['arquivo1.xls', 'arquivo2.xls', 'arquivo3.xls']
resultados = [ler_arquivo(nome) for nome in arquivos]
Okay, so far I probably didn’t say anything you didn’t already know. The problem now is - how combine the processing results of each file? That’s where the function comes in zip
. Let’s say part of your results are values from a column in the archive:
def ler_arquivo(nome):
...
coluna_que_me_interessa = []
... # Extrai os valores de uma coluna que te interesse e coloque na lista
return {
"minha_coluna":coluna_que_me_interessa,
... # outros dados
}
For example, let’s say that the Excel A file contains "weights" and the B file contains "height measurements" - and you want to use both to make a calculation.
resultados1 = ler_arquivo('pessoas_pesos.xls')
resultados2 = ler_arquivo('pessoas_alturas.xls')
pesos = resultados1['minha_coluna']
alturas = resultados2['minha_coluna']
for i in range(len(pesos)):
imc = pesos[i] / alturas[i]*alturas[i]
Instead of making this "weird" loop, you can use the function zip
to combine the elements of the lists pesos
and alturas
two-two:
pesos = [50,70,60]
alturas = [160,180,165]
zip(pesos, alturas) # [(50,160), (70,180), (60,165)]
So that you can use the pair in a list comprehension:
def calcular_imc(peso, altura): # Exemplo; na prática, suas funções serão mais complexas
...
imcs = [calcular_imc(x,y) for (x,y) in zip(pesos, alturas)]
Finally, to answer your question: what if I have a number arbitrary files? In this case (and assuming the files are homogeneous, i.e. with the same structure) you can pass a list of pro arguments zip
dynamically. Example:
arquivos = ['arquivo1.xls', 'arquivo2.xls', 'arquivo3.xls']
resultados = [ler_arquivo(nome) for nome in arquivos]
minha_coluna = [r['minha_coluna'] for r in resultados]
calculo = [meu_calculo(x) for x in zip(*minha_coluna)]
This will make it meu_calculo
receive a list containing the values of each column [of each file] corresponding to row 1, then a list for row 2, etc.
Updating: then adapting the technique described for your specific case (in this case, the zip
was unnecessary):
def ler_arquivo(nome):
inputWorkbook = xlrd.open_workbook(nome, formatting_info=True)
inputSheet = inputWorkbook.sheet_by_name('idade')
return {
"ncols":inputSheet.ncols,
# As duas linhas que interessam
"linha1":[inputSheet.cell_value(1, coluna) for coluna in range(inputSheet.ncols)],
"linha2":[inputSheet.cell_value(2, coluna) for coluna in range(inputSheet.ncols)],
}
arquivos = ["entrada{}.xls".format(i+1) for i in range(10)]
resultados = [ler_arquivo(nome) for nome in arquivos]
outputWorkbook = xlwt.Workbook(encoding="utf-8")
outputSheet = outputWorkbook.add_sheet('idade')
headerStyle = xlwt.easyxf('font: height 200, bold 1, color white;''pattern: pattern solid, fore_color black; ')
# Percorre os resultados de cada arquivo, incrementando a linha em 2
linha = 0
for resultado in resultados:
for coluna in range(resultado["ncols"]):
outputSheet.write(linha+0, coluna, resultado["linha1"][coluna], headerStyle)
outputSheet.write(linha+1, coluna, resultado["linha2"][coluna])
linha = linha + 2
outputWorkbook.save("resultado.xls")
It is very difficult to answer your question like this in the abstract, because I do not know exactly what you want to do with these files, whether you need to combine your results or not, etc. I tried to answer as best I could, but if this does not clarify your question, please [Dit] your question by asking the maximum of context possible (i.e. say what you want to do, how you are trying to do it, any relevant chunk of code, etc).
– mgibsonbr
Hi, I edited the question to improve what I want to do.
– Mary Tortugo
Improved a lot, but still left some doubts: 1) all files
entradaN.xls
have equal structure? i.e. a page calledidade
, where you intend to extract the lines1
and2
of all columns (as you are doing in this example); or not? 2) Where will the information of the other entries be placed? The first was for the lines0
and1
from the exit, the next will go to the lines2
and3
? Or is it all in the first two rows, and the next one will simply add more columns? Or even: only the first will have headers, the following will not... Etc.– mgibsonbr
That’s good. 1) Yes, all files entered in N.xls have the same structure. For example, it has 10 columns and 2 rows. 2) That’s right, I need to take the information from all columns and rows 0 and 1 of the input file, and for the pulled row it will be added on row 0 and 1, and then 2 and 3 and so on. I am studying on how to create equal role you taught below, but I am still struggling. But I am attentive to your teachings. Thank you
– Mary Tortugo
I updated the answer with a full example. In this case, the
zip
was unnecessary, but the overall structure of the solution remained the same.– mgibsonbr