Convert a column with NaN and strings to integer

I have a dataframe with the following column:

     Years

0    1990 
1    1990
2    1990
3    1991 
5    NaN
4    1994 
6    NaN
...  ...

Name: Years, Length: 9742, dtype: object

I have already done part of the cleaning, including setting missing values to np.nan. Now I want to change the column's type: the data is currently of type object, and I would like to convert it to int64 for easier analysis. Is it possible to make this change even with NaN present?

In addition, some of the values are strings, such as '1996', rather than the integer 1996.

How should I proceed?

1 answer


pandas will not convert np.nan to int, because it treats NaN as a float. It can, however, convert the column to the nullable integer type Int64 (or Int16 or Int32). The NaN then becomes <NA> (pd.NA), which is the null value for nullable integers, and functions that check for nulls, such as .isnull(), work with both pd.NA and np.nan.

So df['Years'].astype('Int64') solves the problem of the nulls.
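
To illustrate the difference between the two dtypes, here is a minimal sketch. The sample Series is made up for the example (it is not the asker's data) and holds only numbers and NaN, no strings:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: numeric years with one missing value (stored as float64).
years = pd.Series([1990, 1991, np.nan, 1994], name="Years")

# Lowercase "int64" is the NumPy integer dtype, which cannot hold NaN.
try:
    years.astype("int64")
    numpy_int_ok = True
except ValueError:
    numpy_int_ok = False

# Capital "Int64" is the pandas nullable integer dtype; NaN becomes pd.NA.
nullable = years.astype("Int64")

print(numpy_int_ok)             # whether the plain int64 cast succeeded
print(nullable.dtype)           # dtype after the nullable cast
print(nullable.isnull().sum())  # number of missing values kept
```

Note the capital "I": 'Int64' and 'int64' are different dtypes.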

But we still have strings such as '1996', which cannot be converted to 'Int64' directly. We can therefore convert them to float first:

df['Years'] = df['Years'].astype('float').astype('Int64')
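
A small end-to-end sketch of this, under some assumptions: the DataFrame below is invented to mimic the question (integers, numeric strings and NaN in an object column). The pd.to_numeric variant is not part of the answer above; it is a possible alternative that, with errors="coerce", turns any value it cannot parse into NaN instead of raising.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mimicking the question: ints, numeric strings and NaN,
# all stored in a column of dtype object.
df = pd.DataFrame(
    {"Years": pd.Series([1990, "1996", np.nan, "1994", 1991], dtype=object)}
)

# Approach from the answer: object -> float -> nullable integer.
via_float = df["Years"].astype("float").astype("Int64")

# Alternative (an assumption, not from the answer): let pandas parse the values.
# errors="coerce" replaces unparseable entries with NaN rather than raising.
via_to_numeric = pd.to_numeric(df["Years"], errors="coerce").astype("Int64")

# astype returns a new Series, so assign it back to keep the result.
df["Years"] = via_to_numeric

print(df["Years"].dtype)
print(df["Years"])
```

On clean data like this both routes give the same column. They differ only when a value is not a valid number: astype('float') raises an error, while to_numeric with errors="coerce" silently turns it into a missing value.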
