Different global variables: changing a value in one changes it in all of them

I have a class responsible for managing certain variables called clusters. Each cluster is a list of lists, and each inner list contains a word and an associated value. But every time I adjust the associated value, that value is changed in all the other clusters as well (they are global variables).

Within this class there are three methods. Find_palavra takes a phrase as a parameter and checks whether its words already exist in the clusters created so far. add_palavra takes a word and adds it to all clusters, and Ajust_Cluster adds a value to the word.

There must be several clusters because certain words need to take different values in different situations. To have this flexibility, a value changed in one cluster needs to stay in that cluster alone, without being mirrored in all the others.

Note: there is also a global variable called data, which contains all the words in the clusters but with the associated values zeroed.

from random import randint

class Clustering:
    def Find_palavra(frase):
        global data
        i=0
        x=0
        controle=0
        palavras=frase.split(' ')
        while(len(palavras)>i):
            if(str(palavras[i]) in data):
                controle=1
            if(controle==0):
                Clustering.add_palavra(palavras[i])
            i=i+1


    def add_palavra(palavra):
        global data
        global cluster1
        global cluster2
        global cluster3
        global cluster4
        global cluster5
        global cluster6
        global cluster7
        global cluster8
        global cluster9
        global cluster10
        global cluster11

        array=[]
        array.append(palavra)
        array.append(0)
        data.append(array)
        cluster1.append(array)
        cluster2.append(array)
        cluster3.append(array)
        cluster4.append(array)
        cluster5.append(array)
        cluster6.append(array)
        cluster7.append(array)
        cluster8.append(array)
        cluster9.append(array)
        cluster10.append(array)
        cluster11.append(array)
        i=i+1

    # The res parameter holds the words that need to be changed in the cluster:
    # if a word is in res, its associated value is increased; otherwise it is decreased.
    def Ajust_Cluster(palavras,res,cluster,p):

        global data
        global cluster1
        global cluster2
        global cluster3
        global cluster4
        global cluster5
        global cluster6
        global cluster7
        global cluster8
        global cluster9
        global cluster10
        global cluster11
        k=0
        i=0
        w=0
        difer=(1/1)
        words=res.split(' ')
        while(len(words)>k):
            i=0
            while((len(cluster))>i):
                if(words[k]==cluster[i][0]):
                    cluster[i][1]=float(cluster[i][1])+difer
                i=i+1
            k=k+1
        while((len(cluster)-1)>w):
            if(cluster[w][0] in words):
                pass
            else:
                T=randint(1,2)
                if(T==1):
                    cluster[w][1]=float(cluster[w][1])-difer
            w=w+1

        print("Valor alterado:",cluster)

        if(300>palavras and palavras>230):
            cluster10=cluster


        elif(230>palavras and palavras>200):
            cluster11=cluster

  • I find this code very confusing. If you simplify it, it will be easier to work with, whatever it is you want to do.

1 answer


You're missing some data-structure concepts there, and your program gets confused as a result. I'll try to give you some tips, but it's important that you take a break and digest them well, preferably by going to Python's interactive mode and doing some trial-and-error operations until you understand things well.

I don't even know if I can write these in order of importance.

  • What is causing your error: a Python object is created when you build it, either by calling its class or by using a literal such as [] (which creates an empty list). You give objects names, and these names are what we call variables. When you point another variable at the same object, or place the same object in several different lists or dictionaries, it is still the same object; you just have references to it in several places. In your code you create a single list named array, initialize it with some data, and insert that same list into all of your "clusters". When this list is changed, however you reached it (directly through the array variable, or through one of the clusters), you are changing that one list, and the change is visible whether you retrieve it from cluster1 or from cluster11: it is the same list. Python has the is operator, which checks whether two references point to the same object. If, at the end of your add_palavra method, you use is to compare array with the last element inserted into any of the clusters, you will see that it returns True. And if you change array, you will see the change reflected in all clusters (which is the "problem" you reported).

  • How to fix it directly: add to each cluster a copy of the original list named array, not the list itself. You can create a copy by importing the copy module and using its copy function:

    import copy

    ...
    def add_palavra(...):
        ...
        cluster1.append(copy.copy(array))

In the specific case of lists, you do not need copy.copy to make a copy: slicing can do it. You take a slice that runs from the first to the last element of the list (just leave the start and end indices around the : blank), as in cluster1.append(array[:]).
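To make the difference concrete, here is a minimal sketch that reproduces the sharing and then the fix. The helper names add_shared and add_copied are hypothetical, and two clusters stand in for the eleven in the question. Both copy.copy and array[:] make shallow copies, which is enough here because each pair holds only a string and a number.

```python
import copy

def add_shared(word, clusters):
    """Insert the SAME [word, 0] list into every cluster (the bug)."""
    pair = [word, 0]
    for cluster in clusters:
        cluster.append(pair)

def add_copied(word, clusters):
    """Insert an independent copy of [word, 0] into every cluster (the fix)."""
    pair = [word, 0]
    for cluster in clusters:
        cluster.append(copy.copy(pair))  # pair[:] would work as well

shared = [[], []]
add_shared("word", shared)
shared[0][0][1] += 1                        # change the value in cluster 0 only
print(shared[0][0] is shared[1][0])         # True: one object, two references
print(shared[1][0][1])                      # 1: the change shows up in cluster 1

independent = [[], []]
add_copied("word", independent)
independent[0][0][1] += 1
print(independent[0][0] is independent[1][0])  # False: separate objects
print(independent[1][0][1])                    # 0: cluster 1 is untouched
```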

It is vital that, before you continue with your program, you go to an interactive Python prompt, create a variable holding a list, put that list in another variable and in other data structures, put copies of the list in other variables, and experiment with the is operator and the id function. Nothing will replace about 40 minutes of playing with this:

Example:

In [1]: a = []

In [2]: b = a

In [3]: c = a[:]

In [4]: a.append("teste")

In [5]: a
Out[5]: ['teste']

In [6]: b
Out[6]: ['teste']

In [7]: c
Out[7]: []

In [8]: id(a), id(b), id(c)
Out[8]: (140510059971000, 140510059971000, 140510060067080)

  • Second thing: lists in Python are "lists". Calling them array is not good practice and can confuse you: "array" is best reserved for a data structure whose elements have a uniform size, laid out one after another in memory, with a "more or less" fixed size (although it can be changed). Python lists hold objects of any type and can freely change size.
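As a small illustration of that distinction, the sketch below contrasts a list with the standard library's array.array type, which does enforce a uniform element type. The variable names are only for illustration.

```python
from array import array

# An array.array stores elements of one fixed C type ("d" = double).
numbers = array("d", [1.0, 2.5])
numbers.append(3)              # the int is converted to a float

try:
    numbers.append("word")     # a string does not fit the element type
    rejected = False
except TypeError:
    rejected = True

# A list accepts objects of any type, including other lists.
mixed = ["word", 0, 2.5, None]
mixed.append(["nested"])

print(numbers.tolist())        # [1.0, 2.5, 3.0]
print(rejected)                # True
print(len(mixed))              # 5
```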

  • Third thing: global variables are generally not good practice. I am a controversial author on this point, because I advocate their use in various contexts, since in Python they are isolated inside modules (each module is its own namespace). But the way you are using them really is bad.

  • Fourth thing: if you have a list or other mutable object in a variable, you do not need to declare that variable as global in a function that only changes things inside the object. The declaration is needed only when you reassign the variable to another object. That is, in your case, since the "clusters" are already created outside your methods and you only add elements to them, you do not need that pile of global statements in each function. In Python a global variable can be read normally from inside a function: when you access the variable, Python returns the object it refers to (in this case, lists), and you operate on those lists.

That is to say:

cluster1.append("teste") does not need cluster1 declared as global in the function. cluster1 = ["teste"], on the other hand, does: in that case a new list object is created and bound to the variable.
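The sketch below shows the three cases side by side. The function names mutate, rebind_local and rebind_global are hypothetical, chosen only for this illustration.

```python
cluster1 = []

def mutate():
    # Reads the global name and changes the list in place: no "global" needed.
    cluster1.append("teste")

def rebind_local():
    # Without "global", assignment creates a LOCAL variable named cluster1;
    # the module-level list is left untouched.
    cluster1 = ["local"]

def rebind_global():
    # "global" is required to make the module-level name point to a new list.
    global cluster1
    cluster1 = ["replaced"]

mutate()
after_mutate = list(cluster1)
rebind_local()
after_local = list(cluster1)
rebind_global()
after_global = list(cluster1)

print(after_mutate)   # ['teste']
print(after_local)    # ['teste']
print(after_global)   # ['replaced']
```

This is also why the assignments cluster10=cluster and cluster11=cluster in the question are the only places where a global declaration actually matters.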

  • Fifth thing, which is perhaps the most important and should have come first: the most important thing computers do is repeat tasks thousands of times. You seem to forget this when you create 11 virtually identical variables, without even distinct names, and repeat the same lines dozens of times to deal with them. You need only one variable, clusters, itself a list (or possibly better, a dictionary), and wherever you have to perform an operation on all clusters, use a for. Wherever you act on one specific cluster, use that cluster's index. (This is why a dictionary may be better: clusters can be identified by more meaningful names than the numbers 1 to 11.)

With these considerations, your add_palavra method becomes just:

# Create 11 distinct lists to be used as clusters:
clusters = [[] for _ in range(11)]
data = []  # assumed here: the global word list described in the question

class Clustering:

    def add_palavra(palavra):
        array = [palavra, 0]
        # Each container receives its own copy of the pair
        data.append(array[:])
        for cluster in clusters:
            cluster.append(array[:])

  • Sixth thing: here you are associating a word with a number by using a 2-element list. That looks like a bad idea; you should probably use a dictionary instead. That is, instead of a structure like [['palavra1', 2], ['palavra2', 0]], which you have to scan linearly to find each word, use a dictionary directly: {'palavra1': 2, 'palavra2': 0}. With Python dictionaries a key is found in constant time (on average) by the language, and you do not need to "program" the search for a key in order to change its value. As with references to the same object, it is important that you play with dictionaries at the interactive prompt until you understand them. (And if you are using dictionaries just to store the number of occurrences of each word, look at collections.Counter, which provides several conveniences for this.)
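Putting the fifth and sixth points together, one possible layout is a dictionary of named clusters, each mapping words to values. This is only a sketch: the cluster names and the helpers add_word and adjust are hypothetical and are not taken from the question.

```python
from collections import Counter

# Hypothetical cluster names; the question only numbers them 1 to 11.
CLUSTER_NAMES = ["short", "medium", "long"]

# One dictionary per cluster, mapping word -> associated value.
clusters = {name: {} for name in CLUSTER_NAMES}

def add_word(word):
    """Register a word in every cluster with an initial value of 0."""
    for values in clusters.values():
        values.setdefault(word, 0)

def adjust(cluster_name, word, delta=1.0):
    """Change the value of one word in one cluster only."""
    clusters[cluster_name][word] += delta

add_word("palavra1")
add_word("palavra2")
adjust("short", "palavra1")

print(clusters["short"]["palavra1"])   # 1.0
print(clusters["long"]["palavra1"])    # 0: other clusters are unaffected

# Counter covers the simpler case of counting word occurrences.
counts = Counter("a b a c a b".split())
print(counts["a"])                     # 3
print(counts.most_common(1))           # [('a', 3)]
```

Because the values are plain numbers stored under each cluster's own dictionary, there is no shared inner list left to alias.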

So, to start with, these are the tips. It is critical that you understand what is going on, and don't just make the changes I've indicated.

  • I will follow all the tips and study the points presented. Thank you very much for pointing out everything I should improve! :D
