How to select methods in a function and add the results to a list?

0

I’m trying to call a function like this one from my main block:

if __name__ == "__main__":
    run_experiment(outros_parametros,['LA', 'RF', 'DT', 'ARD', 'PCA'])

The goal is for only the chosen methods to be selected and calculated, with the results added to a list named medidas_importances, like this:

def run_experiment(FEATURE_METHODS... other parameters):

    for j in FEATURE_METHODS:

        for add in medidas_importances:

            # Select the methods and add to the list:
            if FEATURE_METHODS == "LA":
                ranks1 = frk.ranks_Lasso(x_train, y_train, features_train, LASSO_CV_K, RESULT_PATH)
                medidas_importances.append(add)

            elif FEATURE_METHODS == "RF":
                ranks2 = frk.ranks_RF(x_train, y_train, features_train, RESULT_PATH)
                medidas_importances.append(add)

            elif FEATURE_METHODS == "DT":
                ranks3 = frk.ranks_DT(x_train, y_train, features_train, RESULT_PATH)
                medidas_importances.append(add)

            elif FEATURE_METHODS == "PCA":
                ranks4 = frk.ranks_PCA(x_train, y_train, features_train, RESULT_PATH)
                medidas_importances.append(add)

            elif FEATURE_METHODS == "ARD":
                ranks5 = frk.ranks_ARD(x_train, y_train, features_train, RESULT_PATH)
                medidas_importances.append(add)

            else:
                print("not found")

    print("Printing importances")
    print(medidas_importances)

    frk.plot_ranks(FEATURE_METHODS, features_train, 'Cumulative Ranks', RESULT_PATH)

However, the calculations that should run for the methods selected in the call to run_experiment with ['LA', 'RF', 'DT', 'ARD', 'PCA'] add nothing to the list that is filled in the second for loop.

What can I do to make it work?

  • 1

    It’s unclear where the medidas_importances list comes from, but it looks like you are iterating over a list and appending elements to that same list. See https://docs.python.org/3/tutorial/controlflow.html: "Code that modifies a collection while iterating over that same collection can be tricky to get right. Instead, it is usually more straight-forward to loop over a copy of the collection or to create a new collection."

2 answers

0


Since you do for j in FEATURE_METHODS, I assume FEATURE_METHODS is a list (or some other iterable object) containing the values "LA", "RF", etc. But inside the loop you do:

if FEATURE_METHODS == "LA":

That is, you’re comparing the whole list with the string "LA". That comparison is always false, so none of these if branches is ever entered. What you probably want is to compare the value of each element, i.e.:

if j == "LA":
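
The difference is easy to check in isolation. A minimal sketch (the variable names list_equals_string and matches are only illustrative): comparing the whole list with a string gives False, while comparing the loop variable finds the match.

```python
FEATURE_METHODS = ['LA', 'RF', 'DT', 'ARD', 'PCA']

# A list never compares equal to a string, whatever the list contains.
list_equals_string = (FEATURE_METHODS == "LA")

# Comparing each element (the loop variable j) finds the match.
matches = [j for j in FEATURE_METHODS if j == "LA"]

print(list_equals_string)
print(matches)
```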

But beyond that, there’s another problem. You loop over medidas_importances, and inside that loop you add elements to that same list. This is a problem; see what happens in this simple and seemingly "innocent" example:

# WARNING: this code loops forever
lista = [ 'a', 'b', 'c' ]
for i, s in enumerate(lista):
    print(f'checking position {i}, value {s}')
    if s == 'a':
        print(f'inserting {s} into the list')
        lista.append(s)

This code goes into an infinite loop. The output is:

checking position 0, value a
inserting a into the list
checking position 1, value b
checking position 2, value c
checking position 3, value a
inserting a into the list
checking position 4, value a
inserting a into the list
checking position 5, value a
inserting a into the list
checking position 6, value a
inserting a into the list
checking position 7, value a
inserting a into the list
....

That is, while the loop iterates over the list, elements are appended to its end, so the loop never finishes: there is always a new element at the end (and after that element is visited, another one is added, and so on...).

And that’s exactly what you’re doing. In a simplified version of your code:

medidas_importances = [ 1, 2, 3 ]
FEATURE_METHODS = ['LA', 'RF', 'DT', 'ARD', 'PCA']
for j in FEATURE_METHODS:
    for add in medidas_importances:
        if j == 'LA':
            # do some calculation
            print('calculation for LA')
            medidas_importances.append(add)
        else:
            print('not found')

On entering for add in medidas_importances, execution reaches if j == 'LA' and appends the element add to the end of the list. That is why the loop doesn’t end: the next iteration falls into the same if again (because we are still in the first iteration of for j) and appends the element again, and again, and again... So the code above prints its message for LA indefinitely.


It’s unclear why you need to add the same element to the list again. In any case, the documentation says that instead of modifying a list while iterating over it, it is better to create another list, or to iterate over a copy.
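
For the "iterate over a copy" option, a minimal sketch using the same small lista as in the example earlier: the slice lista[:] creates a shallow copy, so appending to lista inside the loop no longer extends the iteration.

```python
lista = ['a', 'b', 'c']

# lista[:] is a shallow copy; the loop walks the copy (3 elements),
# so appending to the original list cannot make the loop run forever.
for s in lista[:]:
    if s == 'a':
        lista.append(s)

print(lista)
```

The loop runs exactly three times, and the original list ends up with one extra 'a'.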

But for this case, we can also create another list to store the measures that have been calculated:

medidas_importances = [ 1, 2, 3 ]
FEATURE_METHODS = ['LA', 'RF', 'DT', 'ARD', 'PCA']
calculadas = []
for j in FEATURE_METHODS:
    for add in medidas_importances:
        if j == 'LA':
            # do some calculation
            calculadas.append(add) # add to the list of calculated measures, without changing the original
        elif j == "RF":
            # do another calculation
            calculadas.append(add)
        elif j == "DT":
            pass  # etc ...
        else:
            print('not found')

# At the end, if you really want to add them to the original list
medidas_importances.extend(calculadas)

That is, I create another list to hold the measures, and only at the end do I use extend to add its elements to medidas_importances. It still seems strange to me to add the same elements to the same list again, but anyway...


But, if you just want to add the results of the calculations to the list, then you don’t need that second for:

FEATURE_METHODS = ['LA', 'RF', 'DT', 'ARD', 'PCA']
medidas_importances = []
for j in FEATURE_METHODS:
    if j == 'LA':
        ranks1 = ...
        medidas_importances.append(ranks1)
    elif j == "RF":
        ranks2 = ...
        medidas_importances.append(ranks2)
    elif j == "DT":
        ...  # etc.
    else:
        print('not found')

There is still a bit of repetition that you can simplify, though. With the exception of "LA", all the calls receive the same parameters and only the function name changes, so it could also be written like this:

medidas_importances = []
funcoes = {
    'LA': frk.ranks_Lasso,
    'RF': frk.ranks_RF,
    'DT': frk.ranks_DT,
    'PCA': frk.ranks_PCA,
    'ARD': frk.ranks_ARD,
}
for j in FEATURE_METHODS:
    if j in funcoes:
        if j == 'LA':
            params = (x_train, y_train, features_train, LASSO_CV_K, RESULT_PATH)
        else:
            params = (x_train, y_train, features_train, RESULT_PATH)
        ranks = funcoes[j](*params)
        medidas_importances.append(ranks)
    else:
        print('not found')
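
One possible variation, sketched under assumptions: functools.partial can bind the extra Lasso argument ahead of time, so every entry in the dictionary is called the same way and the special case for 'LA' disappears. The functions ranks_lasso and ranks_rf below are hypothetical stand-ins for frk.ranks_Lasso and frk.ranks_RF, the parameter names cv_k and result_path are assumed, and the value of LASSO_CV_K is made up for illustration. Note that result_path has to be passed by keyword, otherwise it would collide with the bound cv_k.

```python
from functools import partial

# Hypothetical stand-ins for frk.ranks_Lasso and frk.ranks_RF; the real
# functions live in the asker's own module and may differ.
def ranks_lasso(x, y, features, cv_k, result_path):
    return f"lasso(cv_k={cv_k})"

def ranks_rf(x, y, features, result_path):
    return "rf"

LASSO_CV_K = 5  # assumed value, for illustration only

funcoes = {
    # partial binds cv_k in advance, so all entries accept the same call
    'LA': partial(ranks_lasso, cv_k=LASSO_CV_K),
    'RF': ranks_rf,
}

def collect(methods, x, y, features, result_path):
    results = []
    for j in methods:
        func = funcoes.get(j)
        if func is None:
            print(f'not found: {j}')
            continue
        # result_path is passed by keyword so it cannot land in the cv_k slot
        results.append(func(x, y, features, result_path=result_path))
    return results

medidas_importances = collect(['LA', 'RF', 'XX'], None, None, None, 'out/')
print(medidas_importances)
```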

0

You could create a dictionary associating your identifiers with the corresponding functions, which avoids the if/elif/else block. For example:

# Dictionary associating each identifier
# with the corresponding method/function
mapa = {
    'LA'  : frk.ranks_Lasso,
    'RF'  : frk.ranks_RF,
    'DT'  : frk.ranks_DT,
    'ARD' : frk.ranks_ARD,
    'PCA' : frk.ranks_PCA,
}

def run_experiment(FEATURE_METHODS):
    for m in FEATURE_METHODS:
        try:
            func = mapa[m]
        except KeyError:
            print("Identifier not found: {}".format(m))
            continue  # skip unknown identifiers; func would be undefined or stale

        # Note: in the question, ranks_Lasso also receives LASSO_CV_K
        ranks = func(x_train, y_train, features_train, RESULT_PATH)

if __name__ == "__main__":
    run_experiment(['LA', 'RF', 'DT', 'ARD', 'PCA'])
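
Since the question asks for the results to be collected, here is a rough, self-contained sketch of how the same lookup could return them. It rests on assumptions: ranks_rf and ranks_dt are hypothetical stand-ins for the frk.ranks_* functions, their return values are invented, and the training data is passed in as parameters instead of being read from the enclosing scope.

```python
# Hypothetical stand-ins for the frk.ranks_* functions (invented return values).
def ranks_rf(x, y, features, result_path):
    return [0.7, 0.3]

def ranks_dt(x, y, features, result_path):
    return [0.6, 0.4]

mapa = {'RF': ranks_rf, 'DT': ranks_dt}

def run_experiment(feature_methods, x, y, features, result_path):
    results = {}
    for m in feature_methods:
        try:
            func = mapa[m]
        except KeyError:
            print("Identifier not found: {}".format(m))
            continue  # skip unknown identifiers and move on to the next one
        # Store each result under its identifier so it can be looked up later
        results[m] = func(x, y, features, result_path)
    return results

results = run_experiment(['RF', 'XX', 'DT'], None, None, None, 'out/')
print(results)
```

Keying the results by identifier, rather than appending to a plain list, keeps it obvious which method produced which ranks.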
