0
I have the table
data ativo valor ajuste arquivo
17/07/2020 teste1 4,35 nan 9032800000190015197.txt
16/07/2020 teste1 nan nan 9032800000190015197.txt
15/07/2020 teste1 3,12 nan 9032800000190015197.txt
14/07/2020 teste1 nan nan 9032800000190015197.txt
13/07/2020 teste1 nan 6367.55 9032800000190015197.txt
17/07/2020 teste2 3,7 -1462.825 9032800000190015197.txt
16/07/2020 teste2 nan nan 9032800000190015197.txt
15/07/2020 teste2 nan nan 9032800000190015197.txt
14/07/2020 teste2 nan -235.825 9032800000190015197.txt
13/07/2020 teste2 nan nan 9032800000190015197.txt
17/07/2020 teste3 5 nan 9032800000190015197.txt
16/07/2020 teste3 nan nan 9032800000190015197.txt
15/07/2020 teste3 6,7 nan 9032800000190015197.txt
14/07/2020 teste3 nan nan 9032800000190015197.txt
13/07/2020 teste3 nan nan 9032800000190015197.txt
and I need to get the following result
date ativo valor ajuste arquivo
17/07/2020 teste1 4,35 nan 9032800000190015197.txt
16/07/2020 teste1 4,35 nan 9032800000190015197.txt
15/07/2020 teste1 3,12 nan 9032800000190015197.txt
14/07/2020 teste1 3,12 nan 9032800000190015197.txt
13/07/2020 teste1 3,12 6367.55 9032800000190015197.txt
17/07/2020 teste2 3,7 -1462.825 9032800000190015197.txt
16/07/2020 teste2 3,7 nan 9032800000190015197.txt
15/07/2020 teste2 3,7 nan 9032800000190015197.txt
14/07/2020 teste2 3,7 -235.825 9032800000190015197.txt
13/07/2020 teste2 3,7 nan 9032800000190015197.txt
17/07/2020 teste3 5 nan 9032800000190015197.txt
16/07/2020 teste3 5 nan 9032800000190015197.txt
15/07/2020 teste3 6,7 nan 9032800000190015197.txt
14/07/2020 teste3 6,7 nan 9032800000190015197.txt
13/07/2020 teste3 6,7 nan 9032800000190015197.txt
I have values absent in the field value and I need to fill in the previous values of the same table, considering that the value must be of the same active and date lower. And the value is not the highest or the lowest, but the value of the previous day that has a non-zero value.
I couldn’t find a solution that met these conditions. Does anyone have any idea how to do it? Using pandas.DataFrame.backfill, couldn’t.
I tried SQL, and I couldn’t, I made this code
select a.dt_data, a.co_ativo,
nullif(a.vr_valor, (select b.vr_valor from tabela b
where b.co_ativo = a.co_ativo and b.dt_data between '2020-07-01' and '2020-07-31' and b.vr_valor is not null order by b.dt_data desc limit 1))
from tabela a
where a.dt_data between '2020-07-01' and '2020-07-31'
order by a.dt_data desc
unsuccessful.
Thank you, Terry, for your contribution. Playing the code in my data frame, did not execute the replacements this is due to the fact that I have more columns in the file?
– Kabelo
@Kabelo Having more columns shouldn’t be a problem. In your DF the column date was already in the format of
Date
? 'Cause if she was already this could go wrong– Terry
I also added another option using
assign
to return the filled values to the columnvalor
. If you can test :)– Terry
If the problem persists, I will ask you to modify your available test data in the question so that I can reproduce on my machine what is happening to you. Using the data you have provided so far, my output is equal to the way you expected it to be correct
– Terry
I inserted the columns I have in the file with the same data types. This alternative did not work.
– Kabelo
@Kabelo For me the code keeps working, giving the expected output.
– Terry
Terry, it worked, man!! Re-start the kernel here and rolled!! Thank you so much!!!
– Kabelo