I am working with data from IBGE and have two spreadsheets from which I need to compute a percentage.
The formula is very simple:
percentage = (dividend / divisor) * 100
For example, take the following two DataFrames:
import pandas as pd

data1 = {'local': ['São Paulo', 'Rio de Janeiro', 'Curitiba', 'Salvador'],
         'prod_1': [576, 456, 789, 963]}
divisor = pd.DataFrame(data1)
data2 = {'local': ['São Paulo', 'Rio de Janeiro', 'Curitiba', 'Salvador'],
         'prod_2': [123, '-', 231, '-']}  # '-' marks cells without data
dividendo = pd.DataFrame(data2)
When I apply the formula to get the percentages:
quociente = ( dividendo['prod_2'] / divisor['prod_1'] ) * 100
I get the following error, which is expected:
TypeError: unsupported operand type(s) for /: 'str' and 'int'
The question is: how do I work around this so that the percentages are computed and the cells containing '-' are ignored?
Using for and if is out of the question, since there are about 70 tables with 500 rows each. Besides, I am told it is not good practice in pandas/Python.
In the end, I will need to merge all these spreadsheets into a single one containing the 70 tables, but I am stuck because I cannot compute the percentage efficiently.
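One possible loop-free approach is sketched below. It assumes that both tables list the same 'local' values and that any non-numeric cell (such as '-') should simply produce a missing result. pd.to_numeric with errors='coerce' converts non-numeric cells to NaN, and the division then propagates NaN for those rows. Indexing by 'local' is an added choice here, so that rows are matched by city name rather than by position.

```python
import pandas as pd

divisor = pd.DataFrame({
    'local': ['São Paulo', 'Rio de Janeiro', 'Curitiba', 'Salvador'],
    'prod_1': [576, 456, 789, 963],
})
dividendo = pd.DataFrame({
    'local': ['São Paulo', 'Rio de Janeiro', 'Curitiba', 'Salvador'],
    'prod_2': [123, '-', 231, '-'],
})

# Index by 'local' so the division aligns rows by city name, not position.
# errors='coerce' turns anything that is not a number (e.g. '-') into NaN.
num = pd.to_numeric(dividendo.set_index('local')['prod_2'], errors='coerce')
den = pd.to_numeric(divisor.set_index('local')['prod_1'], errors='coerce')

# Vectorized percentage; rows with a NaN dividend stay NaN.
quociente = (num / den) * 100
```

Because the whole column is converted and divided at once, the same two lines would apply unchanged to each of the 70 tables.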
What result do you expect to get from '-' / 456? – Augusto Vasques
Well, according to IBGE, a '-' or an 'X' indicates that data collection was insufficient or that there was no production. So where the input is '-', the result should be the same '-', or some text indicating this. – R. C. Junior
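A minimal sketch of the behaviour described in this comment, under the assumption that the original marker ('-' or 'X') should be carried over unchanged into the result: compute the percentage on a numeric copy, then use Series.where to put the original text back wherever the dividend was not a number. The names num, pct and resultado, and the rounding to two decimals, are illustrative choices only.

```python
import pandas as pd

prod_1 = pd.Series([576, 456, 789, 963])
prod_2 = pd.Series([123, '-', 231, '-'])

# Numeric copy of the dividend; '-' (or 'X') becomes NaN.
num = pd.to_numeric(prod_2, errors='coerce')
pct = (num / prod_1) * 100

# Keep the computed value where the dividend was numeric;
# otherwise restore the original marker from prod_2.
resultado = pct.round(2).astype(object).where(num.notna(), prod_2)
```

Note that the resulting column mixes numbers and text, so its dtype is object. It may be more convenient to keep the NaN version for further calculations and apply this step only when producing the final table.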