I have this Excel sheet:
NÚMERO "URL NÚMERO 16571 SICAN" "URL DECRETOS 2011 PRINCIPAL"
1 CCIVIL_03/Atos/decretos/1991/D00001.html CCIVIL_03/decreto/1990-1994/D0001.htm
4 CCIVIL_03/Atos/decretos/1889/D00004.html CCIVIL_03/decreto/1851-1899/D0004.htm
5 CCIVIL_03/Atos/decretos/1889/D00005.html CCIVIL_03/decreto/1851-1899/D0005.htm
5 CCIVIL_03/Atos/decretos/1934/D00005.html
5 CCIVIL_03/Atos/decretos/1991/D00005.html CCIVIL_03/decreto/1990-1994/D0005.htm
7 CCIVIL_03/Atos/decretos/1991/D00007.html CCIVIL_03/decreto/1990-1994/D0007.htm
8 CCIVIL_03/Atos/decretos/1991/D00008.html CCIVIL_03/decreto/1990-1994/D0008.htm
9 CCIVIL_03/Atos/decretos/1991/D00009.html CCIVIL_03/decreto/1990-1994/D0009.htm
12 CCIVIL_03/Atos/decretos/1934/D00012.html
and wrote the following code to process the values:
import pandas as pd
import numpy as np
from time import time

def truncus04(filein='../brito procv.xlsx'):
    df = pd.read_excel(filein, names=['num', 'url1', 'url2'], sheet_name='Plan7')
    df2 = df.dropna()
    df2['num'] = df['num'].dropna().apply(np.int64)
    result = (df2.url1.str.split('/'))
    print(type(result))
    # print(result)
    #df2['ano'] = df2.url1.str.split('/')[:][6]
    df2['ano'] = df2.url1.str.split('/').loc[:, (6)]
    # df2.to_csv('../brito_procv_{}.css'.format(int(time())), index=False)
    df2 = df2.sort_values(['ano', 'num'])
    df2[['num', 'ano', 'url1', 'url2']].to_csv('../brito_procv_{}.css'.format(int(time())), index=False)
    return True
My goal is to create a new listing with the columns ['num', 'ano', 'url1', 'url2'], where 'ano' is the year taken from url1 and rows with empty fields are dropped. The expected output looks like this:
num,ano,url1,url2
4,1889, CCIVIL_03/Atos/decretos/1889/D00004.html, CCIVIL_03/decreto/1851-1899/D0004.htm
5,1889, CCIVIL_03/Atos/decretos/1889/D00005.html, CCIVIL_03/decreto/1851-1899/D0005.htm
291,1890, CCIVIL_03/Atos/decretos/1890/D00291.html, CCIVIL_03/decreto/1851-1899/D291.htm
456,1890, CCIVIL_03/Atos/decretos/1890/D00456.html, CCIVIL_03/decreto/1851-1899/D00456.html
How can I fix my code to create the year ('ano') column?
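For reference, one possible direction is sketched below. `str.split('/')` returns a one-dimensional Series of lists, so the two-dimensional indexer `.loc[:, 6]` does not apply to it. The `.str` accessor (for example `.str.split('/').str[i]`) or a regular expression works element-wise instead. The sketch uses `str.extract` and assumes the year is always the four-digit path segment right after `/decretos/` in `url1`. The inline sample frame is copied from the rows shown above and stands in for the Excel file. `.copy()` is added so that the later assignments act on an independent frame rather than on a slice of `df`.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real data would come from read_excel.
df = pd.DataFrame({
    'num': [1, 4, 5, 5, 5, 12],
    'url1': [
        'CCIVIL_03/Atos/decretos/1991/D00001.html',
        'CCIVIL_03/Atos/decretos/1889/D00004.html',
        'CCIVIL_03/Atos/decretos/1889/D00005.html',
        'CCIVIL_03/Atos/decretos/1934/D00005.html',
        'CCIVIL_03/Atos/decretos/1991/D00005.html',
        'CCIVIL_03/Atos/decretos/1934/D00012.html',
    ],
    'url2': [
        'CCIVIL_03/decreto/1990-1994/D0001.htm',
        'CCIVIL_03/decreto/1851-1899/D0004.htm',
        'CCIVIL_03/decreto/1851-1899/D0005.htm',
        np.nan,
        'CCIVIL_03/decreto/1990-1994/D0005.htm',
        np.nan,
    ],
})

# Drop rows with any empty field; copy so later assignments do not touch df.
df2 = df.dropna().copy()
df2['num'] = df2['num'].astype(np.int64)

# Assumption: the year is the four-digit segment right after "/decretos/".
# expand=False returns a Series (of strings) instead of a one-column DataFrame.
df2['ano'] = df2['url1'].str.extract(r'/decretos/(\d{4})/', expand=False)

# Four-digit year strings sort correctly as text.
df2 = df2.sort_values(['ano', 'num'])
result = df2[['num', 'ano', 'url1', 'url2']]
print(result.to_csv(index=False))
```

If the position of the year in the path is fixed, `df2['url1'].str.split('/').str[i]` with the appropriate index `i` would be an alternative. The regular expression has the advantage of not depending on how many segments precede `decretos`.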