Are class methods recreated for each instance in Python?

Question

Asked 6 years, 4 months ago

Viewed 707 times

7

From what I observed, when an instance of a class is created, all of the class's methods are recreated at a different memory location, as in the excerpt below:

class Foo():
    def __init__ (self):
        self.x = 10
    def getFoo(self):
        return self.x
    def setFoo(self,xnew):
        self.x = xnew

a = Foo()
b = Foo()

>>> a.getFoo
Out[79]: <bound method Foo.getFoo of <__main__.Foo object at 0x000001C0EF1F6198>>

>>> b.getFoo
Out[80]: <bound method Foo.getFoo of <__main__.Foo object at 0x000001C0EF1F6518>>

When you are scanning, creating and testing many objects of this class with different parameters, this can eat up memory and make execution take a long time. Is there any way to have a single copy of the instance methods for the class? I thought @classmethod was the answer (from what I read at https://python-textbok.readthedocs.io/en/1.0/Classes.html#classmethod), but what I tested did not work either:

class Foo():
    x = 5
    def __init__ (self):
        self.x = 10
    def getFoo(self):
        return self.x
    @classmethod
    def setFoo(cls,xnew):
        cls.x = xnew

a = Foo()
b = Foo()
print(a.x, b.x,Foo.x) #10 10 5

a.setFoo(29)
print(a.x,b.x,Foo.x)  #10 10 29

print(a.getFoo == b.getFoo) #False
print(a.setFoo == b.setFoo) #True

Just one addendum: in Python we do not normally use getters and setters. Where you would need them, use properties instead (and even that only in very specific cases).

– Luis Eduardo

2019/03/16 at 00:56
I’m on mobile and can’t write an answer, but the idea is that the class itself holds the functions, and when the class is instantiated Python creates a descriptor that defines the method. That is, the getFoo of the instance will be a descriptor that references the original function of the class. It is this descriptor that is responsible for defining the value of self, which will be passed as the first parameter. In other words, the function itself is not recreated; what happens is that each instance has its own descriptor, all of them referring to the same function.

– Woss

2019/03/16 at 02:58

2 answers

7

tl;dr:

Methods are temporary objects - they are actually created each time they are accessed, by combining the "self" attribute with the function that is declared in the class. That is: an instance method in Python does not even exist in memory while it is not in use, for any instance of a class.

"Calm down, that’s not what you’re thinking at all"

Yes, of course: looking at the excerpt you made, a logical hypothesis is precisely that "each instance has its own copies of the methods" - and, if that were really the case, your concern would be well placed - each instance of a class would take up a lot of memory with nearly identical copies of the methods.

But the first hypothesis we arrive at, however simple it seems, is not always the truth.

What happens is that method objects are not created when the class is created, nor when an instance is created. Methods are produced "on the fly" when accessed - whether in the context of the class or of an instance - and consume no memory while not in use.
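A minimal sketch of that claim, reusing the Foo class from the question (trimmed to one method): the function is stored once in the class namespace, the instances hold only their data, and every bound method wraps that same function.

```python
class Foo:
    def __init__(self):
        self.x = 10

    def getFoo(self):
        return self.x


a = Foo()
b = Foo()

# The function is stored once, in the class namespace...
func = Foo.__dict__["getFoo"]

# ...and the instances only store their own data attributes.
print(vars(a))                       # {'x': 10}
print("getFoo" in vars(a))           # False

# Each access builds a new bound method around that one function.
print(a.getFoo.__func__ is func)     # True
print(b.getFoo.__func__ is func)     # True
```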

Yes, that may have some impact on performance, but nothing close to the memory impact that worried you, and there are ways to mitigate that performance loss if profiling shows it really affects your code. (It also bothered me a lot when I found out, until I realized that the impact is really minimal.)

First, let’s extend your finding that methods on different instances are different objects: in fact, the same method, on the same instance of a class, is a different object each time it is retrieved as an attribute:

In [38]: class A: 
    ...:     def b(self): 
    ...:         pass 
    ...:                                                                                                                       

In [39]: a = A()                                                                                                               

In [40]: a.b is a.b                                                                                                            
Out[40]: False

Why is comparing the id of the methods no use?

I could compare the ids of the methods above, and Python might print the same result - but that would be misleading. Because when calling id(a.b), the function id returns the value, and the object a.b that was passed as a parameter is left without any reference and is destroyed. A subsequent call to id(a.b) may, by bad luck (or good luck), create the new method at exactly the same memory address, and the comparison id(a.b) == id(a.b) may result in True, even though the two a.b are distinct objects. If I keep a reference to the first method object, however, the second will be created with a distinct id:

In [42]: c = a.b                                                                                                               

In [43]: print(id(c), id(a.b), id(a.b))                                                                                        
140180715412680 140180937921736 140180937921736

Note exactly what I described: the first method object has an extra reference, in the variable c - so it continues to exist after the call to id(c) - but the second object is destroyed the instant id finishes executing, and the third call to id gets an a.b at exactly the same position as in the second call.
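To take address reuse out of the picture entirely, a small sketch can keep both method objects alive at once. The names first and second are just illustrative. Two objects that exist at the same time can never share an id, while == on bound methods compares what they wrap rather than their identity.

```python
class A:
    def b(self):
        pass


a = A()

# Keep both method objects alive so their ids cannot be recycled.
first = a.b
second = a.b

print(first is second)            # False: two distinct method objects
print(id(first) == id(second))    # False: both are alive at the same time
print(first == second)            # True: same __self__ and same __func__
```

This is also why a.getFoo == b.getFoo in the question is False: the two methods wrap the same function but different instances.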

But how can that be??

Getting back to the main point - what mechanism does Python use to create these method objects? This may be the coolest part of all: the mechanism used internally by the language is 100% exposed as a language feature, and is customizable in pure Python! That is: you can create your own decorators equivalent to @classmethod and @staticmethod that change the behavior of a method (this among other possibilities).

What the language does is rely on the descriptor protocol: any attribute of the class (which includes functions defined in the class body) that implements one of the methods __get__, __set__ or __delete__, when retrieved (either with the "Class.attr" or "instance.attr" notation, or with "getattr(Class, 'attr')"), is not returned directly; instead its __get__ method is called, and whatever __get__ returns is used as the attribute value.

Typically, the descriptor protocol is most visible when using the built-in @property, which is a shortcut for turning a method into a descriptor object that calls that method.

However, every function in Python 3 has a __get__ method, and what a function's __get__ does is precisely turn it into an instance method! And a method is actually a very simple object: __get__ receives as a parameter the instance on which the attribute is being accessed - the method object stores this reference as an attribute and, when called (in Python, any object that has a __call__ method can be called), it calls the original function, passing the instance as the first parameter. That’s where the argument self is injected into a method call. (That is, the "self" that "seems magical", used as a parameter in all methods, is added by a well-documented language mechanism that is open to use and modification.)
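As a rough illustration of the protocol, here is a tiny descriptor that implements only __get__. The names Announce and Box are made up for this sketch. It shows that all three access forms go through __get__, and that only a direct read from the class namespace returns the descriptor object itself.

```python
class Announce:
    """Hypothetical non-data descriptor: it only implements __get__."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner):
        # instance is None when the attribute is read through the class.
        where = "class" if instance is None else "instance"
        return f"{self.value} (read via {where} of {owner.__name__})"


class Box:
    label = Announce("hello")


print(Box.label)               # hello (read via class of Box)
print(Box().label)             # hello (read via instance of Box)
print(getattr(Box, "label"))   # hello (read via class of Box)

# Reading straight from the class namespace bypasses the protocol:
print(type(Box.__dict__["label"]).__name__)   # Announce
```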

In [48]: class A: 
    ...:     def b(self): 
    ...:         pass 
    ...:                                                                                                                       

In [49]: A().b                                                                                                                 
Out[49]: <bound method A.b of <__main__.A object at 0x7f7e9ee2dd30>>

In [50]: A.b                                                                                                                   
Out[50]: <function __main__.A.b(self)>

In [51]: print(dir(A().b))                                                                                                     
['__call__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__func__', '__ge__', '__get__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__self__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

In [52]: A().b.__self__, A().b.__func__                                                                                        
Out[52]: (<__main__.A at 0x7f7e9ee076d8>, <function __main__.A.b(self)>)

In [53]: A().b.__func__ is A.b                                                                                                 
Out[53]: True

That is, at the moment Python evaluates the expression a.b, the __get__ method of the function A.b is called, receiving a as a parameter. That __get__ then creates the method object, with the attributes __func__ and __self__ set. When this method object is called, Python enters its __call__ method, and what it executes is equivalent to:

def __call__(self, *args, **kw):
    return self.__func__(self.__self__, *args, **kw)

(The Python code of a method object would be exactly that; it just isn't written that way because the method object itself is defined in C code in CPython.)
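Putting the pieces together, a hedged pure-Python sketch of such a method object could look like the following. PyMethod is a made-up name for illustration, not CPython's real implementation. The example also calls the function's __get__ by hand, which is what attribute access does for us.

```python
class PyMethod:
    """Rough pure-Python stand-in for a bound method (illustration only)."""

    def __init__(self, func, instance):
        self.__func__ = func
        self.__self__ = instance

    def __call__(self, *args, **kw):
        return self.__func__(self.__self__, *args, **kw)


class A:
    def __init__(self, x):
        self.x = x

    def double(self, extra=0):
        return 2 * self.x + extra


a = A(21)
plain_function = A.__dict__["double"]

# What attribute access does for us: call the function's __get__.
real_method = plain_function.__get__(a, A)
fake_method = PyMethod(plain_function, a)

print(real_method())             # 42
print(fake_method())             # 42
print(fake_method(extra=1))      # 43
print(real_method == a.double)   # True
```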

@classmethod and @staticmethod

These two built-ins are implemented in native code, but now you can understand how they work: you can write a "classmethod" of your own, one that makes a method receive the class instead of the instance as its first parameter, by creating an object that keeps a reference to the original function and has appropriate __get__ and __call__ methods. See how it looks in a few lines:

In [29]: class MyClassMethod:
    ...:     def __init__(self, func, owner=None):
    ...:         self.func = func
    ...:         self.owner = owner
    ...:
    ...:     def __get__(self, instance, owner):
    ...:         # Create a new object each time the
    ...:         # attribute is retrieved - this avoids potential
    ...:         # problems in multithreaded programs
    ...:         # with class inheritance:
    ...:         return MyClassMethod(self.func, owner)
    ...:         # Without worrying about multithreading,
    ...:         # this method could simply do:
    ...:         # self.owner = owner
    ...:         # return self
    ...:
    ...:     def __call__(self, *args, **kw):
    ...:         print("Class method called")
    ...:         return self.func(self.owner, *args, **kw)
    ...:
    ...:
    ...: class A:
    ...:     @MyClassMethod
    ...:     def b(cls):
    ...:         print(f"I am in class {cls!r}")
    ...:
    ...:

In [30]: A().b()
Class method called
I am in class <class '__main__.A'>
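By the same reasoning, a stand-in for @staticmethod is even shorter. This is a sketch under the same assumptions as MyClassMethod above, with MyStaticMethod as a made-up name: its __get__ returns the original function untouched, so nothing is injected as the first argument.

```python
class MyStaticMethod:
    """Hypothetical pure-Python stand-in for the built-in staticmethod."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        # Hand back the plain function: nothing is bound, so neither
        # the instance nor the class is injected as first argument.
        return self.func


class A:
    @MyStaticMethod
    def add(x, y):
        return x + y


print(A.add(1, 2))        # 3
print(A().add(1, 2))      # 3
print(A.add is A().add)   # True: the very same function object each time
```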

Memory cost of an instance

Going back to your initial concern: we have seen that instances do not get their own copies of the methods - so what is in memory for each object?

An instance creates in memory a generic Python object that holds a reference to its class (in Python, the class is an object like any other) in its __class__ attribute, a new dictionary in its __dict__ attribute, and a structure for tracking weak references in __weakref__. In addition, it holds a reference to every attribute set in __init__.

An empty dictionary takes about 250 bytes, the empty __weakref__ about 80, and the "PyObject" itself about 60 bytes (Python 3.7, 64-bit - on 32-bit these values may be smaller). That is, a "new" instance of an ordinary class will use about 390 bytes.

If it is an instance with well-defined attributes that will be instantiated many times (say, a Point class that only stores "x" and "y" coordinates and has methods to operate on them), it is possible to suppress the creation of the instance's internal dictionary (and of __weakref__) - in that case, each instance will use only its 60 bytes plus the space for the attributes, without the 250 bytes of the __dict__. To do this, just define the __slots__ attribute in the class body - Python creates a class with a special layout, with direct space for the predefined attributes, and without the __dict__:

In [51]: class BadPoint:
    ...:     def __init__(self, x, y):
    ...:         self.x = x
    ...:         self.y = y
    ...:         

In [52]: class Point:
    ...:     __slots__ = "x", "y"
    ...:     def __init__(self, x, y):
    ...:         self.x = x
    ...:         self.y = y
    ...:     def distance(self, other):
    ...:         return ((self.x - other.x) ** 2 + (self.y - other.y) ** 2) ** 0.5
    ...:
    ...:     def __repr__(self):
    ...:         return f"P<{self.x}, {self.y}>"
    ...:

In [53]: a = [Point(i, i) for i in range(1000)]

In [54]: get_size(a)
Out[54]: 65024

In [55]: b = [BadPoint(i, i) for i in range(1000)]

In [56]: get_size(b)
Out[56]: 205120

(get_size is a function that recursively calls sys.getsizeof on an object and on what it references - I used the implementation from this recipe: https://goshippo.com/blog/measure-real-size-any-python-object/)
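For readers who want to reproduce the measurement without the recipe at hand, here is a simplified stand-in. It is not the linked implementation, only a sketch of the same idea. WithDict and WithSlots are made-up classes, and the absolute numbers will vary with the Python version and platform.

```python
import sys


def get_size(obj, seen=None):
    """Roughly sum sys.getsizeof over an object and what it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(get_size(k, seen) + get_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(get_size(item, seen) for item in obj)
    if hasattr(obj, "__dict__"):
        size += get_size(vars(obj), seen)
    # Objects using __slots__ keep their attributes outside a __dict__.
    # (Assumes __slots__ was declared as a tuple or list of names.)
    for name in getattr(type(obj), "__slots__", ()):
        if hasattr(obj, name):
            size += get_size(getattr(obj, name), seen)
    return size


class WithDict:
    def __init__(self, x, y):
        self.x, self.y = x, y


class WithSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y


size_slots = get_size([WithSlots(i, i) for i in range(1000)])
size_dict = get_size([WithDict(i, i) for i in range(1000)])
print(size_slots, size_dict)
```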

As you can see, the extra methods make no difference in size - there is only one copy of their "originals" (as function objects), in the class. On the other hand, eliminating the internal dictionary makes quite a difference for simple objects.
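The trade-off of __slots__ can be checked directly. In this small sketch, which redefines minimal versions of the two classes, the slotted instance has no __dict__ at all and, as a consequence, refuses attributes that were not declared:

```python
class BadPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p, bad = Point(1, 2), BadPoint(1, 2)

print(hasattr(bad, "__dict__"))   # True
print(hasattr(p, "__dict__"))     # False

bad.z = 3                         # fine: stored in bad.__dict__
try:
    p.z = 3                       # no slot named "z" and no __dict__
except AttributeError as exc:
    print("rejected:", exc)
```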

Possible optimizations

As written above, this creation/destruction of method objects may affect some part of an application - in general only if, within some other piece of code, the same method of the same instance is called many times (that is, inside a for or while loop).

In such cases, all you need to do to avoid wasting resources is keep a reference to the method alive for the duration of the loop. That is, instead of:

for character in big_text:
    myobject.transmogrify(character)

just write:

transmogrify = myobject.transmogrify
for character in big_text:
    transmogrify(character)

Note that the same idea actually holds for any attribute access - every time we write instance.attribute, the language has to check several things, including whether the attribute is a descriptor, before retrieving it. Simply placing the attribute in a local variable before the for makes this mechanism run only once instead of once per iteration.
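Whether this matters in practice is best measured. Below is a rough timing sketch using the standard timeit module. Transformer and its transmogrify method are placeholders for your own object. The numbers depend on the machine and interpreter, and recent CPython versions already optimise the plain obj.method(...) call pattern, so the gap may turn out small.

```python
import timeit


class Transformer:
    def transmogrify(self, character):
        return character.upper()


myobject = Transformer()
big_text = "some reasonably long text " * 1000


def lookup_every_time():
    for character in big_text:
        myobject.transmogrify(character)


def lookup_once():
    # Bind the method a single time, outside the loop.
    transmogrify = myobject.transmogrify
    for character in big_text:
        transmogrify(character)


for func in (lookup_every_time, lookup_once):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds:.4f}s")
```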

I could not have explained better +1!

– Woss

2019/03/16 at 11:19
I’ve fleshed it out a little more now.

– jsbueno

2019/03/16 at 14:01

by Elton Nunes • **490** points · Answer 1 · 2019-03-16T01:26:29+00:00

If you are curious, I suggest using a function to create a function. My suggestion is the following code:

class Foo():
    x = 5
    def __init__ (self):
        self.x = 10
        self.setFoo = self.setarFoo()

    def getFoo(self):
        return self.x

    def setarFoo(self):
        def setFoo(xnew):
            self.x = xnew
        return setFoo


a = Foo()
b = Foo()
print(a.x, b.x,Foo.x) 

a.setFoo(29)
print(a.x,b.x,Foo.x)  

print(a.getFoo == b.getFoo) 
print(a.setFoo == b.setFoo)

Notice that __init__ calls the method setarFoo, and setarFoo returns the function created in that call.

@classmethod, if I’m not mistaken, is for adding a new constructor to the class.
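A small sketch of that use, with made-up names (Point, NamedPoint, from_string): the class method receives the class as cls and builds an instance from an alternative input format.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_string(cls, text):
        # cls is the class the method was called on, so subclasses
        # get instances of themselves.
        x, y = (float(part) for part in text.split(","))
        return cls(x, y)


class NamedPoint(Point):
    pass


p = Point.from_string("1.5,2")
print(p.x, p.y)                                       # 1.5 2.0
print(type(NamedPoint.from_string("0,0")).__name__)   # NamedPoint
```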