I would like to calculate the weight of each class in a multi-label dataset, to pass as the class_weight parameter of Keras's fit_generator. For a single-label dataset, since each sample belongs to exactly one class, I can calculate the weights as in the example below:
def calc_weight_single_label(label_count):
    max_value = max(label_count.values())
    class_weight = {}
    for key in label_count.keys():
        class_weight[key] = max_value/label_count[key]
    return class_weight
>>> # class A:10%, class B:50% and class C:40%
>>> labels_dict = {'A':10, 'B':50, 'C':40}
>>> calc_weight_single_label(labels_dict)
{'A': 5.0, 'B': 1.0, 'C': 1.25}
This means the loss for misclassifying a sample of class A will be 5 times higher than the loss for misclassifying a sample of class B.
However, in a multi-label dataset, I can have classifications such as: only A, A and B, A and C, and so on. How can I calculate the weight of each class in this case?
An example would be this dictionary of occurrences, labels_dict = {'A':10, 'B':50, 'C':40, 'D':20}, with a total number of samples equal to 100.
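One possible way to approach the example above, offered as a sketch rather than an established answer: treat each class as its own binary problem and weight its positive samples by the ratio of negatives to positives. The function name `calc_pos_weight_multi_label` and the negatives/positives convention are assumptions for illustration, not something defined by Keras.

```python
def calc_pos_weight_multi_label(label_count, n_samples):
    """Per-class weight for positive samples: negatives / positives.

    Assumption: each class is an independent binary label, so a class
    seen in `count` of `n_samples` samples has `n_samples - count` negatives.
    """
    pos_weight = {}
    for key, count in label_count.items():
        if count <= 0 or count > n_samples:
            raise ValueError(f"invalid count for class {key!r}: {count}")
        pos_weight[key] = (n_samples - count) / count
    return pos_weight


labels_dict = {'A': 10, 'B': 50, 'C': 40, 'D': 20}
weights = calc_pos_weight_multi_label(labels_dict, n_samples=100)
print(weights)  # {'A': 9.0, 'B': 1.0, 'C': 1.5, 'D': 4.0}
```

Unlike the single-label formula, these weights depend on the total number of samples, because the occurrences of different classes no longer add up to 100. Whether class_weight accepts such values directly for multi-label targets is something to check against the Keras documentation.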
How can a classification of A and B exist? I don't understand! For example, say you want to classify whether a person is healthy, sick or dead. By your description, could you classify that person as healthy and dead, or sick and dead? Strange, huh...
– Octávio Santana
In this example of being healthy, sick or dead, the classification is single-label. An example of multi-label would be classes such as: class A is 0 for age < 18 and 1 for age > 18; class B is 0 for not being a student and 1 for being one. In that case, a sample can be classified as both A and B, only A, only B, or neither.
– AlexCiuffa
@Alexciuffa what Keras method are you using to do this training? I need to understand the multi-label strategy being used to give a more appropriate response.
– Arthur Ferraz
I'm using .fit_generator(). The generator is something like flow_from_dataframe(), but customized. My labels are in a dataframe, in a column named labels. An example of two rows would be: [1,0,0,1] and [0,0,1,1]
– AlexCiuffa
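To get per-class occurrence counts from rows like these, a column-wise count is enough. A minimal sketch, assuming the labels column has already been converted to a plain list of multi-hot lists (for instance with `df['labels'].tolist()`), and that positions 0 to 3 correspond to classes named A to D. The class names and the helper `count_labels` are both hypothetical.

```python
def count_labels(rows, class_names):
    """Count how many rows have each class set to 1."""
    counts = {name: 0 for name in class_names}
    for row in rows:
        if len(row) != len(class_names):
            raise ValueError("row length does not match number of classes")
        for name, value in zip(class_names, row):
            counts[name] += int(value)
    return counts


# The two example rows from the comment above.
rows = [[1, 0, 0, 1], [0, 0, 1, 1]]
label_count = count_labels(rows, ['A', 'B', 'C', 'D'])
print(label_count)  # {'A': 1, 'B': 0, 'C': 1, 'D': 2}
```

A class that never occurs, like B here, gets a count of 0, so any weight formula that divides by the count needs to handle that case explicitly.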
I don't think I was very clear, but I want to know which classifier you are using. I'm not an expert on Keras, but from what I understand, fit_generator() is just a way to train with batches more flexibly.
– Arthur Ferraz
I don’t quite understand the question. I’m using a CNN architecture, followed by a Fully-Connected network and, in the last layer, output with sigmoid as activation function. If that doesn’t answer the question, could you give me an example of classifiers?
– AlexCiuffa
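With one sigmoid output per class, one common way to use per-class weights is a weighted binary cross-entropy, where the loss term of each positive label is scaled by that class's weight. The sketch below only shows the arithmetic in plain Python for a single sample. `weighted_bce` and `pos_weight` are hypothetical names, the weight values are made up for illustration, and this is neither Keras code nor the loss actually used in this thread.

```python
import math


def weighted_bce(y_true, y_pred, pos_weight, eps=1e-7):
    """Mean binary cross-entropy over the labels of one sample.

    The term of each positive label (t == 1) is multiplied by that
    class's entry in `pos_weight`; negative labels are left unscaled.
    """
    total = 0.0
    for t, p, w in zip(y_true, y_pred, pos_weight):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(w * t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


y_true = [1, 0, 0, 1]          # multi-hot target
y_pred = [0.5, 0.5, 0.5, 0.5]  # sigmoid outputs, one per class

unweighted = weighted_bce(y_true, y_pred, [1.0, 1.0, 1.0, 1.0])
weighted = weighted_bce(y_true, y_pred, [9.0, 1.0, 1.5, 4.0])
print(unweighted, weighted)
```

With every prediction at 0.5, each label contributes log(2) before weighting, so the weighted result is larger only because the two positive labels are scaled by their weights.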