1
I’m trying to perform a groupby with the pd.cut() function, but I’m getting weird results. I have the following columns:
jul/16 Ago/16
R$ 34.252,00 R$ 52.000,00
R$ 33.850,00 R$ 33.850,00
R$ 34.850,00 R$ 44.590,00
R$ 35.850,00 R$ 45.590,00
R$ 36.850,00 R$ 46.590,00
R$ 37.850,00 R$ 47.590,00
R$ 38.850,00 R$ 48.590,00
R$ 39.850,00 R$ 49.590,00
R$ 40.850,00 R$ 50.590,00
R$ 41.850,00 R$ 51.590,00
R$ 42.850,00 R$ 52.590,00
And when I do pd.cut() in the column of Jul/16...
quartis = pd.cut(teste['Jul/16'], 4)
I have the following rooms:
[(33841.0, 36100.0] < (36100.0, 38350.0] < (38350.0, 40600.0] < (40600.0, 42850.0]]
But when I use the groupby with the quartiles found, to apply descriptive functions in the column Aug/16, I have strange results:
min max mean count
Jul/16
(33841.0, 36100.0] 33850 52000 44007.5 4
(36100.0, 38350.0] 46590 47590 47090.0 2
(38350.0, 40600.0] 48590 49590 49090.0 2
(40600.0, 42850.0] 50590 52590 51590.0 3
Like the maximum for the first group, which goes from 33841 to 36100, is holding a value of 52000? It’s like the groups are being ignored, they can help me?
The answer is: You are setting up the quartiles based on July, but using the August values. Within the first quarter of July (33841 to 36100), you have, in the August column, the value 53000. I hope to have been able to explain.
– Paulo Marques
I get it now, he’s not filtering Aug/16 like I thought he would.. You know how to do that? For example, I have this categorical Jul/16 data with ranges of values, and I would like Aug/16 to follow these ranges, there is some way to do this?
– Yan Lucas Gabor Sampaio
I don’t understand what you want to do. You can update the post with the desired output for what you need?
– Paulo Marques
For example, I found the intervals for Jul/16, correct? Now, I wanted to filter Aug/16 with the same intervals, that is, if the first interval is this: (33841.0, 36100.0], I wanted him to give me a minimum, maximum and counting, of the numbers of Aug/16, that fit the proposed interval. In this case, it would be only 1 number, the R $ 33.850,00, since all others are larger than the proposed intervention. In this case the count would be 1, and the minimum and maximum would be equal. I could explain better?
– Yan Lucas Gabor Sampaio