I have a dataframe with the following columns:
COL1  COL2  COL3  NEW_COL*
A     asd   1     8
B     adf   2     9
A     adg   8     1
B     adh   9     2
C     adj   7     7
D     adk   1     1
Where NEW_COL = (sum of COL3 over all rows with the same COL1 value - the row's own COL3 value) / (number of rows with the same COL1 value - 1)
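To make the formula concrete, here is a minimal plain-Python sketch (not PySpark) that recomputes NEW_COL for the sample rows. It reads "total sum by type" as the sum of COL3 within each COL1 group and "Qtd" as the number of rows in that group. For groups with a single row (C and D) the formula would be 0/0; the sample table shows NEW_COL equal to COL3 there, so the sketch assumes that fallback. The names `group_sum`, `group_count` and `new_col` are illustrative only.

```python
from collections import defaultdict

# Sample rows from the question: (COL1, COL2, COL3)
rows = [
    ("A", "asd", 1),
    ("B", "adf", 2),
    ("A", "adg", 8),
    ("B", "adh", 9),
    ("C", "adj", 7),
    ("D", "adk", 1),
]

# Sum of COL3 and number of rows for each COL1 value
group_sum = defaultdict(int)
group_count = defaultdict(int)
for col1, _, col3 in rows:
    group_sum[col1] += col3
    group_count[col1] += 1


def new_col(col1, col3):
    n = group_count[col1]
    if n == 1:
        # Assumption inferred from rows C and D: a single-row group keeps COL3
        return float(col3)
    # (group sum - this row's value) / (group size - 1)
    return (group_sum[col1] - col3) / (n - 1)


result = [new_col(col1, col3) for col1, _, col3 in rows]
print(result)  # [8.0, 9.0, 1.0, 2.0, 7.0, 1.0]
```

In other words, NEW_COL is the mean of COL3 over the other rows that share the same COL1 value.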
This is the column I need help with. Does anyone know how to compute it on a DataFrame with PySpark?
Thanks!
It would look like this for the first row: sum of type A = 9, minus the row's own COL3 value = 1 => 8. Quantity of type A = 2, minus 1 => 1. So we have 8/1 = 8. :)
– Adriana Cavalcanti