I am trying to automate a process that I currently do manually in Excel: extract the company's employee base from an Excel file, select a few specific columns (because the file is too large), remove certain hierarchy levels, and filter by company. So far it works, and any suggestions for improving it are very welcome. However, the name column contains some duplicate names that belong to genuinely different people, so the duplicates must be kept. In Excel I append "." to the end of a name to tell them apart. Can I do the same in Python? I am using Google Colab.
Note: when I run it, I get two warnings:
- WARNING *** file size (7827463) not 512 + multiple of sector size (512)
- /usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:6: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
Is this normal?
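import pandas as pd

# Load the full employee base
view = pd.read_excel("/content/View.xls")

# Both masks are built from the full DataFrame
filtro = view['Emp'] < 4
filtro2 = view['Hierarquia Cargo'] > 3

view1 = view[filtro]
# This line raises the UserWarning: filtro2 is indexed like view, not view1
view5 = view1[filtro2]

# Keep only the columns needed and sort by name
bd = view5[['Nome', 'Emp', 'EST', 'Matr', 'Nome Estabelecimento', 'Descr Unid Lotacao', 'Descr CC', 'Desc Afast']]
bd = bd.sort_values(by='Nome', ascending=True)
display(bd)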
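
One possible approach, as a minimal sketch rather than a definitive solution: the UserWarning appears because the second mask was built from the full DataFrame but is applied to an already-filtered one, so pandas has to realign it. Combining both conditions into a single mask with `&` should avoid that. For the duplicate names, `groupby(...).cumcount()` numbers each repeat of a name (0, 1, 2, ...), and that number can be turned into trailing dots. The small DataFrame below is made-up sample data standing in for View.xls, with only a few of the real columns; the variable names `mask` and `repeat_index` are my own. As far as I know, the "file size ... sector size" message comes from the .xls reader and concerns the file's internal layout, not your code.

```python
import pandas as pd

# Hypothetical sample data standing in for /content/View.xls
view = pd.DataFrame({
    "Nome": ["Ana Silva", "Ana Silva", "Bruno Costa", "Ana Silva", "Carla Dias"],
    "Emp": [1, 2, 3, 1, 5],
    "Hierarquia Cargo": [4, 5, 6, 7, 8],
    "Matr": [101, 102, 103, 104, 105],
})

# Combine both conditions into one mask built from the same DataFrame,
# so no reindexing is needed and the UserWarning does not appear
mask = (view["Emp"] < 4) & (view["Hierarquia Cargo"] > 3)

# Filter rows and select columns in one step; copy() so the result
# can be modified safely
bd = view.loc[mask, ["Nome", "Emp", "Matr"]].copy()

# Number each occurrence of a name: first is 0, second is 1, and so on
repeat_index = bd.groupby("Nome").cumcount()

# Append that many dots: the first occurrence stays unchanged,
# the second gets ".", the third gets "..", etc.
bd["Nome"] = bd["Nome"] + repeat_index.map(lambda n: "." * int(n))

bd = bd.sort_values(by="Nome", ascending=True)
print(bd)
```

With the real file, the same steps would start from `pd.read_excel("/content/View.xls")` and use the full column list. If two people with the same name should always be told apart reliably, keeping `Matr` alongside the name is probably a safer identifier than the dots alone.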