TL;DR
Given how simple the question is: convert the data to numeric, or create a new numeric variable derived from the categorical one you have. I do not know what you are using to process the dataset, so below is a very simple example using pandas.DataFrame, which is very common in the Python world:
import pandas as pd
dataset = {'aluno': [1, 1, 1, 2, 2],
           'periodo': [1, 2, 1, 1, 1],
           'nota': ['A', 'D', 'C', 'D', 'A']}
df = pd.DataFrame(dataset, columns = ['aluno', 'periodo', 'nota'])
print('','DataFrame Original:', df, sep='\n')
In the example, the original dataset has two numeric columns, aluno and periodo, and one categorical column, nota. The output of the code above is:
DataFrame Original:
aluno periodo nota
0 1 1 A
1 1 2 D
2 1 1 C
3 2 1 D
4 2 1 A
Now, in the code below, we create a new numeric column (nota_numerica) based on the nota column:
notas = {'A': 100, 'B': 80, 'C': 60, 'D': 40}
df['nota_numerica'] = df['nota'].apply(notas.get)
print('','DataFrame Modificado:', df, sep='\n')
The output of this new code would be:
DataFrame Modificado:
aluno periodo nota nota_numerica
0 1 1 A 100
1 1 2 D 40
2 1 1 C 60
3 2 1 D 40
4 2 1 A 100
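A variation on the mapping above, as a minimal sketch: Series.map accepts the dictionary directly, and any grade missing from it becomes NaN, which makes unmapped values easy to find. The grade 'E' and the name sem_mapeamento are hypothetical, added only for illustration.

```python
import pandas as pd

# 'E' is a hypothetical grade with no entry in the mapping
df = pd.DataFrame({'nota': ['A', 'D', 'C', 'E']})
notas = {'A': 100, 'B': 80, 'C': 60, 'D': 40}

# Series.map looks each value up in the dict; values not found become NaN
df['nota_numerica'] = df['nota'].map(notas)

# Rows whose grade had no numeric equivalent
sem_mapeamento = df[df['nota_numerica'].isna()]

print(df)
print(sem_mapeamento)
```

Because of the NaN, the new column is stored as float rather than int in this case.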
As I said, like the question itself, this is a very simple example. Depending on the complexity of the real problem, you can of course build a more elaborate solution with pandas itself.
See it working on repl.it.
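One possible direction for a more elaborate solution, sketched under the assumption that the grades have a natural order (D worst, A best): declare the column as an ordered categorical and use its integer codes. The list ordem and the column name nota_codigo are assumptions for illustration, not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({'nota': ['A', 'D', 'C', 'D', 'A']})

# Assumed ordering, from worst to best grade
ordem = ['D', 'C', 'B', 'A']
df['nota'] = pd.Categorical(df['nota'], categories=ordem, ordered=True)

# cat.codes gives the position of each value in `ordem` (D=0, C=1, B=2, A=3)
df['nota_codigo'] = df['nota'].cat.codes

print(df)
```

Unlike the dictionary, the codes only carry rank, not a score, so this fits cases where the order matters but the exact numeric value does not.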
I didn’t quite understand the question. Could you add examples of what you have as input/output and what is expected? Are the non-numeric entries categorical (such as azul or vermelho) or text (such as names, descriptions)? – AlexCiuffa