Hello, this is my first question here, so I'm not sure I'm asking it correctly, but here goes.
I am combining several Excel files with the "concat" function from the pandas package. I found the code below here on Stack Overflow and modified it slightly to suit my final format.
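import pandas as pd

# Workbooks to combine
excel_names = ["1.xlsx", "2.xlsx", "3.xlsx", "4.xlsx", "5.xlsx"]
excels = [pd.ExcelFile(name) for name in excel_names]
# Read the first sheet of each workbook without treating any row as a header
frames = [x.parse(x.sheet_names[0], header=None, index_col=None) for x in excels]
# Drop the first row from every frame except the first one
frames[1:] = [df[1:] for df in frames[1:]]
# Place the frames side by side (axis=1), aligning rows on the index
combined = pd.concat(frames, join='outer', axis=1, sort=False)
combined.to_excel("final.xlsx", header=False, index=False)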
The problem is that I would like to relate the tables through a shared column and don't know how. All the tables have at least one column in common, the ID. In "final.xlsx", I would like a single ID column, with each row holding the data from every table for that ID and, if possible, with the repeated IDs removed.
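To make the goal concrete, here is a minimal sketch of the kind of result I am after, using made-up data instead of my real files. It assumes the shared column is literally named ID and that each ID appears at most once per table. The other column names (name, price, stock) are hypothetical, and an outer merge on ID is only my guess at the right tool.

```python
from functools import reduce

import pandas as pd

# Hypothetical stand-ins for the sheets read from the Excel files;
# the column names other than ID are invented for illustration.
frames = [
    pd.DataFrame({"ID": [1, 2, 3], "name": ["a", "b", "c"]}),
    pd.DataFrame({"ID": [2, 3, 4], "price": [20.0, 30.0, 40.0]}),
    pd.DataFrame({"ID": [1, 4], "stock": [5, 8]}),
]

# Outer-merge the frames pairwise on ID, so each ID ends up in a single row
# and IDs missing from a table get NaN in that table's columns.
combined = reduce(
    lambda left, right: pd.merge(left, right, on="ID", how="outer"),
    frames,
)
combined = combined.sort_values("ID").reset_index(drop=True)
print(combined)
```

With real files, I assume the sheets would have to be read with their header row kept (not header=None) so that the ID column can be referenced by name.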
I would also accept non-code solutions, such as Excel scripts, as long as I can at least organize the rows and columns by relating them to the ID column.