I have a DataFrame with several columns (only two are shown in this post). I need to fill the NaN values of one column with certain values from another. See below:
Creating the test DataFrame
>>> import pandas as pd
>>> df = pd.DataFrame({"base": [2, 2, 3, 3, 4, 4, 5, 5], "valores":[3, None, 100, 3, None, None, 15, None]})
>>> df
base valores
0 2 3.0
1 2 NaN
2 3 100.0
3 3 3.0
4 4 NaN
5 4 NaN
6 5 15.0
7 5 NaN
Expected output:
>>> df
base valores
0 2 3.0
1 2 3.0 # value of the base column at index 3
2 3 100.0
3 3 3.0
4 4 5.0 # value of the base column at index 6
5 4 5.0 # value of the base column at index 6
6 5 15.0
7 5 NaN # no later value
That is, each NaN in "valores" should be replaced with the "base" value of the next row whose "valores" is valid. If no valid row follows, as with the last row, the NaN should be kept.
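To make the rule above concrete, here is a minimal loop-based sketch of it. The helper name fill_from_next_valid is hypothetical and not part of pandas; it only restates the expected output in code, and it assumes "next valid row" means the nearest row below whose "valores" is not NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "base": [2, 2, 3, 3, 4, 4, 5, 5],
    "valores": [3, None, 100, 3, None, None, 15, None],
})

def fill_from_next_valid(frame):
    """Hypothetical helper: fill NaN in 'valores' with 'base' of the next valid row."""
    original = frame["valores"].tolist()
    filled = list(original)
    next_base = None  # 'base' of the nearest row below whose 'valores' is valid
    # Walk from the bottom up so the "next valid row" is always already known.
    for pos in range(len(frame) - 1, -1, -1):
        if pd.notna(original[pos]):
            next_base = frame["base"].iloc[pos]
        elif next_base is not None:
            filled[pos] = float(next_base)
        # Otherwise no valid row follows, so the NaN is kept.
    return pd.Series(filled, index=frame.index, name="valores")

result = fill_from_next_valid(df)
print(result)
```

The loop is slow on large frames; it is meant only as a readable statement of the rule.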
What I tried
I tried fillna(), which replaces NaN with a fixed value or, with method='bfill', with the next non-NaN value of the same column, as below:
>>> df["valores"].fillna(method='bfill')
0 3.0
1 100.0
2 100.0
3 3.0
4 15.0
5 15.0
6 15.0
7 NaN
I also tried fillna() with the values of the "base" column, as below:
>>> df["valores"].fillna(df["base"])
0 3.0
1 2.0
2 100.0
3 3.0
4 4.0
5 4.0
6 15.0
7 5.0
Name: valores, dtype: float64
However, the replacement values are taken from the same index, not from the next valid row.
I need to combine the two behaviours, or find another way to get the expected result.
Other ideas
By the way, another method I thought might help is isna() or notna():
>>> df["valores"].isna()
0 False
1 True
2 False
3 False
4 True
5 True
6 False
7 True
Name: valores, dtype: bool
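One possible way to build on the notna() mask, offered as a sketch rather than a definitive answer: keep "base" only on rows where "valores" is valid, back-fill that masked column, and pass the result to fillna(). The variable names valid, next_base and filled are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "base": [2, 2, 3, 3, 4, 4, 5, 5],
    "valores": [3, None, 100, 3, None, None, 15, None],
})

# True where 'valores' already holds a value.
valid = df["valores"].notna()

# Blank out 'base' on rows where 'valores' is NaN, then back-fill so each
# row sees the 'base' of the next row whose 'valores' is valid.
next_base = df["base"].where(valid).bfill()

# Fill only the NaN positions; a trailing NaN stays NaN because nothing
# valid follows it.
filled = df["valores"].fillna(next_base)
print(filled)
```

Because fillna() aligns on the index, existing values in "valores" are left untouched and only the gaps receive the back-filled "base" values.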