Convert a monetary value (string) to a Python/pandas float


I’m reading a CSV file in which one of the columns holds monetary values such as '10,000.00', and pandas interprets them as strings.

I wonder whether I will have to convert this by brute force (iterating over every item in the column) or whether there is an easier way to do it.

I would like the value as a float, e.g. 10000.00

  • And why would you convert a monetary value to float?

  • I want to do mathematical operations on the values, which is not possible with strings.

  • Float is not exact and will give rounding errors, so it makes no sense.

  • I see the point... But I was curious, hehe, so I searched here and found 'numpy.float128' as an alternative to increase the precision of decimal values. Do you think it is more appropriate? What would you use?

  • I said it is not exact; it is not a matter of precision. You cannot use any kind of binary floating-point type for this. In fact, the bigger the type, the worse. https://docs.python.org/3.7/library/decimal.html

  • Great!! Thank you very much! You are right, I will make the changes to use the decimal module. This should fix a small bug in another application of mine!!

  • It is small only because you didn’t multiply by a million :)
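
To make the point in the comments above concrete, here is a minimal sketch of the difference between binary floats and the standard-library decimal module. The helper name parse_money is hypothetical, and it assumes the '10,000.00' format from the question (comma as thousands separator, point as decimal separator).

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so even a trivial sum drifts away from the expected value.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)                  # False

# Decimal values built from strings keep the exact decimal digits.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))     # True


def parse_money(text):
    """Hypothetical helper: turn '10,000.00' into Decimal('10000.00')."""
    # Assumes a comma thousands separator and a point decimal separator.
    return Decimal(text.replace(",", ""))


amount = parse_money("10,000.00")
total = amount * 3
print(total)                             # 30000.00
```

Building the Decimal from the string, rather than from an already converted float, is what avoids carrying the binary rounding error into the result.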


1 answer

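A way that I find simple is to use apply. The example below assumes values in the '10.000,00' format (period as thousands separator, comma as decimal separator):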

import pandas as pd

df = pd.DataFrame({'col1': pd.date_range('2015-01-02 15:00:07', periods=3),
                   'col2': pd.date_range('2015-05-02 15:00:07', periods=3),
                   'col3': pd.date_range('2015-04-02 15:00:07', periods=3),
                   'col4': pd.date_range('2015-09-02 15:00:07', periods=3),
                   'col5': [5, 3, 6],
                   'col6': ['10.000,00', '10.000,00', '10.000,00']})

# Remove the thousands separator, turn the decimal comma into a point,
# then convert each string to float.
df['col6'] = df['col6'].apply(lambda x: float(x.replace('.', '').replace(',', '.')))
print(df)

Output:

            col1                col2                col3  \
0 2015-01-02 15:00:07 2015-05-02 15:00:07 2015-04-02 15:00:07
1 2015-01-03 15:00:07 2015-05-03 15:00:07 2015-04-03 15:00:07
2 2015-01-04 15:00:07 2015-05-04 15:00:07 2015-04-04 15:00:07

             col4  col5     col6
0 2015-09-02 15:00:07     5  10000.0
1 2015-09-03 15:00:07     3  10000.0
2 2015-09-04 15:00:07     6  10000.0
  • Thanks Willian, I will try this. I was wary of the for loop because the file has 150K lines and later I want to process much larger files. But it is a good solution; I will test it tomorrow and check the performance. Thanks!!
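
On the performance concern, two alternatives that avoid a Python-level function call per row may be worth testing. This is only a sketch: the column names col6_float, id and valor and the CSV contents are made up for illustration, and the values are assumed to follow the '10.000,00' format from the answer above. The first option uses the vectorized .str methods of a Series; the second uses the thousands and decimal parameters of pandas.read_csv so the column is parsed as a number at load time.

```python
import io

import pandas as pd

# Sample values in the '10.000,00' style (illustrative data only).
df = pd.DataFrame({"col6": ["10.000,00", "1.234,56", "7,50"]})

# Option 1: vectorized string methods on the whole column.
# Drop the thousands separator, turn the decimal comma into a point,
# then cast the column to float in one step.
df["col6_float"] = (
    df["col6"]
    .str.replace(".", "", regex=False)
    .str.replace(",", ".", regex=False)
    .astype(float)
)
print(df)

# Option 2: declare the separators when reading the CSV, so pandas
# parses the column as numeric directly (hypothetical file contents).
csv_text = "id;valor\n1;10.000,00\n2;1.234,56\n"
parsed = pd.read_csv(io.StringIO(csv_text), sep=";", thousands=".", decimal=",")
print(parsed.dtypes)
```

For a file in the '10,000.00' format from the question, the same idea would apply with thousands="," and the default decimal point. Whether either option is actually faster than apply on a 150K-line file is something to measure rather than assume.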
