I have a dataset with 789 customer reviews of a particular product, with the columns reviews and stars. I converted the star ratings to binary labels: positive (stars >= 3) is 1 and negative is 0.
outputs = data_frame['estrelas']
rotulo = list()
for output in outputs:
    if output >= 3:
        rotulo.append(1)
    else:
        rotulo.append(0)
Then I counted the positives and negatives in the dataset: there are 738 positives and 51 negatives. I need the two classes to be balanced, with 51 negatives and 51 positives, that is, 102 records in total. I'm using Python and pandas.
I'm not sure I understand the problem. The DataFrame has 789 rows, of which 738 have a value >= 3 in the estrelas column and 51 have a value < 3. Is the goal to pick only 51 of those 738? Is there any criterion for choosing those 51? – AlexCiuffa
That's right! No, there is no other criterion; the value just needs to be >= 3. – Tauane Marton
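Since any 51 positive rows will do, one possible approach is random undersampling of the majority class. The following is only a minimal sketch, not a confirmed solution from this thread: the function name balance_by_undersampling, the new rotulo column, and the fixed seed are assumptions introduced for illustration, and it assumes the star ratings live in the estrelas column as in the code above.

```python
import pandas as pd


def balance_by_undersampling(data_frame, star_column="estrelas", seed=42):
    """Return a copy with equal numbers of positive and negative rows.

    Hypothetical helper: the name, the 'rotulo' column, and the seed
    are illustrative assumptions, not part of the original question.
    """
    labeled = data_frame.copy()
    # Vectorized version of the labeling loop: 1 if stars >= 3, else 0.
    labeled["rotulo"] = (labeled[star_column] >= 3).astype(int)

    # Size of the smaller class (51 for the dataset described above).
    minority_size = labeled["rotulo"].value_counts().min()

    # Draw the same number of rows at random from each class.
    parts = [
        group.sample(n=minority_size, random_state=seed)
        for _, group in labeled.groupby("rotulo")
    ]

    # Combine both classes and shuffle so they are not stored in blocks.
    balanced = pd.concat(parts)
    return balanced.sample(frac=1, random_state=seed).reset_index(drop=True)
```

Calling balanced = balance_by_undersampling(data_frame) and then balanced["rotulo"].value_counts() should show the same count for both labels. Note that undersampling discards most of the positive reviews, so whether this trade-off is acceptable depends on what the balanced data is used for.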