Addition of columns in csv file - Python

1

I can't create a CSV file with a new column (month). When I try, the whole column is filled with December (12), but each row should contain the month number taken from that row's date. Here is the code:

import pandas as pd

dfdados = pd.read_csv('DadosClimaticos2018Londrina.csv', sep =';')

x = 0

while x <= 1096:
    linha = dfdados.iloc[x]
    data = linha ['Data']
    dia_mes_ano = data.split('/')
    print(dia_mes_ano)
    x = x + 1
    

dfdados['Mês'] = dia_mes_ano[1]

dfdados.to_csv('Dadoss.csv',sep = ';',index = False)

2 answers

1

Your code has two problems. First, the dia_mes_ano variable only keeps the value from the last iteration of the loop, so every row ends up with the same month. Second, pandas is an efficient tool for working with very large spreadsheets, and a loop that traverses every row is less efficient than what pandas can do on its own. Here is a much more efficient way to use pandas features to get the months without scanning the spreadsheet row by row:

dfdados = pd.read_csv('DadosClimaticos2018Londrina.csv', sep =';')

meses = dfdados.Data.str.split('/').str[1]
dfdados['Mês']=meses

Note that the command meses = dfdados.Data.str.split('/').str[1] does exactly what you wanted: it splits the date and selects the element at position 1 of the resulting list. The difference is that it does this for all elements of the Data column at once.
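The difference between the loop and the vectorized split can be sketched on a small in-memory table. This is only an illustration: the three sample dates are made up and stand in for the real CSV file, and the conversion to int is an extra step assumed here because the question asks for whole numbers, while the split itself returns strings such as '01'.

```python
import pandas as pd

# Hypothetical sample data standing in for DadosClimaticos2018Londrina.csv
dfdados = pd.DataFrame({"Data": ["15/01/2018", "05/03/2018", "31/12/2018"]})

# Loop version: the variable is overwritten on every pass,
# so after the loop it only holds the parts of the LAST date
for data in dfdados["Data"]:
    dia_mes_ano = data.split("/")
ultimo_mes = dia_mes_ano[1]  # month of the last row only

# Vectorized version: split every date and take element 1 for all rows at once
meses = dfdados["Data"].str.split("/").str[1]

# The split yields strings; convert to int to get whole month numbers
dfdados["Mês"] = meses.astype(int)

print(ultimo_mes)
print(dfdados)
```

Assigning ultimo_mes to the column would repeat the same value on every row, which is the behavior described in the question.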

0

You can do it this way:

Creating the month column:

dfdados['Mes'] = pd.DatetimeIndex(dfdados['Data']).month

Saving the csv:

dfdados.to_csv('./novo.csv',sep = ';', index = False)

More about DatetimeIndex


Complete code:

import pandas as pd

dfdados = pd.read_csv('DadosClimaticos2018Londrina.csv', sep =';')
dfdados['Mes'] = pd.DatetimeIndex(dfdados['Data']).month
dfdados.to_csv('./novo.csv',sep = ';', index = False)

  • If the day appears first you can set dayfirst = True
  • If the year appears first you can set yearfirst = True

Example:

dfdados['Mes'] = pd.DatetimeIndex(dfdados['Data'], yearfirst = True).month
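If you would rather not depend on how the parser guesses the order of day and month, one option is to state the format explicitly. The sketch below swaps in pd.to_datetime with a format string instead of DatetimeIndex; the sample dates are made up, and the day/month/year layout is an assumption based on the split('/') code in the question.

```python
import pandas as pd

# Hypothetical sample data; the real file is assumed to use day/month/year
dfdados = pd.DataFrame({"Data": ["15/01/2018", "05/03/2018", "31/12/2018"]})

# Parse with an explicit format so "05/03/2018" is always read as 5 March
datas = pd.to_datetime(dfdados["Data"], format="%d/%m/%Y")

# .dt.month returns the month as an integer for every row
dfdados["Mes"] = datas.dt.month

print(dfdados)
```

With an explicit format, a date that does not match raises an error instead of being silently read with day and month swapped.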
  • Your code gives the wrong result if you do not specify the date format. At least on my computer it assumed month/day/year, so I would have to use .day in place of .month

  • Yes, but in the example the month will be element 1 regardless of how your system is configured. In your solution you could use dayfirst=True

  • Thank you! It helped a lot. If you don’t mind, can you take a look at my other question? https://answall.com/q/481747/212837

  • Gabriela, good morning! I'm glad you solved it! If the answer solved your problem, consider marking it as accepted (not required, but it is good practice for future users with the same problem). See how. I visited the other question and left a suggestion to solve the problem. Hugs!
