I have 4 folders, each filled with CSV files of 3 types (ap, peers, visits).
I'm a beginner in Python, and I want to write a script that merges the peers files into a single file containing the lines of every peers file found. I also want to add a column called "student" to the header and, for each line written to the final peers file, put the respective student at the end.
    import glob
    import sys

    # The main folder is the first command-line argument (it must end with a backslash).
    mainfolder = sys.argv[1]
    allfolders = glob.glob(mainfolder + '*\\')

    with open(mainfolder + "finalpeers\\totalpeers.csv", "w") as finalPeersFile:
        newpheader = '"_id","ssid","bssid","dateTime","latitude","longitude","student"\n'
        finalPeersFile.write(newpheader)
        for folder in allfolders:
            # Each sub-folder is named after a student.
            student = folder.split('\\')[-2]
            filesTomerge = glob.glob(folder + '*.csv')
            for filename in filesTomerge:
                # isPeers (not shown) tells whether the file is a peers file.
                if isPeers(filename):
                    with open(filename, 'r') as p:
                        # Copies every line, including each file's header.
                        for line in p:
                            finalPeersFile.write(line)
My code mostly does that, but since every file has the same header and some files contain only a header, I end up with the header repeated many times. I also can't just take the header from the first file and add "student", because there is a "hidden" newline at the end of the line; I think it's something particular to Python. And although I have the student to add at the end of each line, I can't simply append it to the string (line + student).
Final file: [sample of the merged output, showing the repeated header lines, is missing]
How can I remove the repeated headers, or merge the files without copying their headers in the first place?
P.S.: Sorry if I am asking a question that has already been asked (I have searched a lot, but nothing I found helped me solve the problem).
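One possible approach, as a minimal sketch rather than a definitive solution: read the first line of each peers file separately so the header is written only once, and strip the trailing newline from every line before appending the student. The "hidden" newline is the `\n` that each line keeps when you iterate over a file, which is why `line + student` puts the student on the next line. The names `merge_peers`, `out_path` and `is_peers` are made up for this example; `is_peers` stands in for the question's `isPeers`, and the sketch assumes each sub-folder is named after a student and that the values never contain embedded line breaks.

```python
import glob
import os


def merge_peers(mainfolder, out_path, is_peers):
    """Merge every peers CSV found in the sub-folders of mainfolder.

    is_peers is a caller-supplied function (hypothetical here) that
    returns True when a file path points to a peers file.
    """
    header_written = False
    with open(out_path, "w", newline="") as out:
        # A trailing separator makes glob match directories only.
        for folder in sorted(glob.glob(os.path.join(mainfolder, "*", ""))):
            # Assumption: the sub-folder name is the student name.
            student = os.path.basename(os.path.dirname(folder))
            for filename in sorted(glob.glob(os.path.join(folder, "*.csv"))):
                if not is_peers(filename):
                    continue
                # Never read the output file back in if it lives in a scanned folder.
                if os.path.abspath(filename) == os.path.abspath(out_path):
                    continue
                with open(filename, "r", newline="") as p:
                    # The first line is the header; drop its line ending.
                    header = p.readline().rstrip("\r\n")
                    if not header:
                        continue  # completely empty file
                    if not header_written:
                        out.write(header + ',"student"\n')
                        header_written = True
                    # The remaining lines are data; header-only files add nothing.
                    for line in p:
                        line = line.rstrip("\r\n")
                        if line:
                            out.write(line + ',"' + student + '"\n')
```

It could be called as `merge_peers(sys.argv[1], "totalpeers.csv", isPeers)`. Writing the output outside the student folders avoids it being picked up as one more peers file on a later run. If the values may contain quoted commas or line breaks, the standard `csv` module is a safer choice than plain string handling.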
Thanks for the quick reply! I had to make some changes, but it is now working as I wanted, thanks to your suggestions. José
– José Soares