Why should descriptor instances in Python be class attributes?

I’m studying descriptors in Python and I found that they should be defined as class attributes, for example:

class Descriptor:

    def __init__(self, obj):
        self.obj = obj

    def __get__(self, instance, owner=None):
        print('Accessing __get__')
        return self.obj


class Grok:

    attr = Descriptor('value')

# Output
>>> g = Grok()
>>> g.attr
Accessing __get__
'value'

This works, but if I do it this way instead:

class Grok:

    def __init__(self, attr):
        self.attr = Descriptor(attr)


# Output
>>> g = Grok('value')
>>> g.attr
<__main__.Descriptor at 0x7fe5bca77550>

It doesn’t work. My question is: why?

2 answers

As discussed in "What is the function of Python descriptors?", the interpreter follows a fixed lookup order when you write g.attr. Since g is an instance in this case, the interpreter runs g.__getattribute__('attr'). In Python terms, what it tries to access is:

type(g).__dict__['attr'].__get__(g, type(g))

That is, it looks for attr in the class of g, not in g itself. This explains why it works when the descriptor is a class attribute, but it is not enough to show that it does not work for an instance attribute. For that, we need to go deeper and look at the C code that is executed.
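The equivalence can be checked from Python itself. This is a small sketch reusing the Descriptor and Grok classes from the question (the print call is left out to keep the output short); the variable names are only illustrative.

```python
class Descriptor:
    def __init__(self, obj):
        self.obj = obj

    def __get__(self, instance, owner=None):
        return self.obj


class Grok:
    attr = Descriptor('value')


g = Grok()

# Normal attribute access goes through the descriptor protocol.
via_dot = g.attr

# The same lookup spelled out by hand: fetch the raw descriptor object from
# the class dictionary and call its __get__ explicitly.
raw = type(g).__dict__['attr']
via_protocol = raw.__get__(g, type(g))

print(type(raw).__name__)       # Descriptor
print(via_dot == via_protocol)  # True
```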

The C implementation of the __getattribute__ method is PyObject_GenericGetAttr, which lives in Objects/object.c. Let’s take a quick look.

The function is:

PyObject *
PyObject_GenericGetAttr(PyObject *obj, PyObject *name)
{
    return _PyObject_GenericGetAttrWithDict(obj, name, NULL);
}

So we need to look at the implementation of _PyObject_GenericGetAttrWithDict:

PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name, PyObject *dict)
{
    PyTypeObject *tp = Py_TYPE(obj);
    PyObject *descr = NULL;
    PyObject *res = NULL;
    descrgetfunc f;
    Py_ssize_t dictoffset;
    PyObject **dictptr;

    ...
}

Important information to continue:

  1. The function receives obj, a reference to the object g;
  2. The function receives name, the name of the attribute being accessed;
  3. The function receives dict, a dictionary that in this case is NULL;
  4. The type of obj, Grok, is fetched and stored in the variable tp;
  5. The pointers descr (a possible descriptor), res (the function's return value) and f (the __get__ function of the possible descriptor) are declared, along with a few other pointers; descr and res start out as NULL.

Next, the name of the accessed attribute is validated: an error is returned if it is not a string. If it is, the reference count of name is incremented with Py_INCREF.

if (!PyUnicode_Check(name)){
    PyErr_Format(PyExc_TypeError,
                 "attribute name must be string, not '%.200s'",
                 name->ob_type->tp_name);
    return NULL;
}
Py_INCREF(name);
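The effect of this check can be observed from Python by calling object.__getattribute__ directly with a name that is not a string. This is only a small illustration; the exact wording of the error message may differ between CPython versions, so only the exception type is shown.

```python
class Grok:
    pass


g = Grok()
error = None

try:
    # Attribute names must be str; an int is rejected before any lookup.
    object.__getattribute__(g, 1)
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
```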

Then the internal dictionary of g's type, tp, is checked. If it has not been initialized yet, PyType_Ready is called, and the function exits on failure:

if (tp->tp_dict == NULL) {
    if (PyType_Ready(tp) < 0)
        goto done;
}

Next, the attribute is looked up in the class of g, Grok (and its bases), and the result is stored in descr. If something is found, its reference count is incremented and f is set to the __get__ function of its type. If that function exists and the descriptor is a data descriptor (it has a __set__ method), res is set to the result of __get__ and the function ends:

descr = _PyType_Lookup(tp, name);

f = NULL;
if (descr != NULL) {
    Py_INCREF(descr);
    f = descr->ob_type->tp_descr_get;
    if (f != NULL && PyDescr_IsData(descr)) {
        res = f(descr, obj, (PyObject *)obj->ob_type);
        goto done;
    }
}
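As a rough mental model, the class lookup done by _PyType_Lookup can be approximated in Python by walking the MRO and reading each class dictionary directly. The helper name type_lookup is hypothetical, and the real function also uses a method cache that this sketch ignores. The important point is that the raw object is returned without __get__ being called.

```python
def type_lookup(tp, name):
    """Rough Python analogue of _PyType_Lookup (hypothetical helper)."""
    for klass in tp.__mro__:
        if name in vars(klass):
            # Return the object exactly as stored; no descriptor binding here.
            return vars(klass)[name]
    return None


class Descriptor:
    def __get__(self, instance, owner=None):
        return 'from descriptor'


class Base:
    attr = Descriptor()


class Child(Base):
    pass


found = type_lookup(Child, 'attr')
print(type(found).__name__)             # Descriptor
print(hasattr(type(found), '__get__'))  # True
print(type_lookup(Child, 'missing'))    # None
```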

PyDescr_IsData, which checks whether something is a data descriptor, is a macro defined as:

#define PyDescr_IsData(d) (Py_TYPE(d)->tp_descr_set != NULL)

This basically checks whether the type of the object defines the __set__ method (the tp_descr_set slot).
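A Python-level approximation of this macro might look like the sketch below. The helper name is_data_descriptor is hypothetical; it also checks __delete__, because defining either method on a class fills the tp_descr_set slot.

```python
def is_data_descriptor(obj):
    """Python-level approximation of PyDescr_IsData (hypothetical helper)."""
    tp = type(obj)
    return hasattr(tp, '__set__') or hasattr(tp, '__delete__')


class NonData:
    def __get__(self, instance, owner=None):
        return 'non-data'


class Data:
    def __get__(self, instance, owner=None):
        return 'data'

    def __set__(self, instance, value):
        raise AttributeError('read-only')


print(is_data_descriptor(NonData()))          # False
print(is_data_descriptor(Data()))             # True
print(is_data_descriptor(property()))         # True
print(is_data_descriptor(lambda self: None))  # False
```

Plain functions only define __get__, which is why methods are non-data descriptors, while property defines __set__ as well.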

This is as far as execution gets when a data descriptor is a class attribute. (A non-data descriptor on the class, such as the one in the question, also continues past this point and is handled further down.) For an instance attribute, execution continues. Since we are now dealing with the instance itself, its internal dictionary has to be considered as well. The next step therefore locates the instance's dictionary through tp_dictoffset and stores the resulting pointer in dict:

if (dict == NULL) {
    /* Inline _PyObject_GetDictPtr */
    dictoffset = tp->tp_dictoffset;
    if (dictoffset != 0) {
        if (dictoffset < 0) {
            Py_ssize_t tsize;
            size_t size;

            tsize = ((PyVarObject *)obj)->ob_size;
            if (tsize < 0)
                tsize = -tsize;
            size = _PyObject_VAR_SIZE(tp, tsize);
            assert(size <= PY_SSIZE_T_MAX);

            dictoffset += (Py_ssize_t)size;
            assert(dictoffset > 0);
            assert(dictoffset % SIZEOF_VOID_P == 0);
        }
        dictptr = (PyObject **) ((char *)obj + dictoffset);
        dict = *dictptr;
    }
}
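From Python, tp_dictoffset is visible as the __dictoffset__ attribute of a type. A value of 0 means instances have no dictionary at all, which is what happens with an empty __slots__. The sketch below only illustrates that case; how the dictionary is stored for ordinary classes is an implementation detail that may differ from the C code quoted here in newer CPython versions.

```python
class WithDict:
    pass


class WithoutDict:
    __slots__ = ()  # no per-instance dictionary is allocated


a = WithDict()
b = WithoutDict()

print(hasattr(a, '__dict__'))      # True
print(hasattr(b, '__dict__'))      # False
print(WithoutDict.__dictoffset__)  # 0
```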

After that, the attribute is looked up in the dictionary dict and, if it is found, its value is returned:

if (dict != NULL) {
    Py_INCREF(dict);
    res = PyDict_GetItem(dict, name);
    if (res != NULL) {
        Py_INCREF(res);
        Py_DECREF(dict);
        goto done;
    }
    Py_DECREF(dict);
}

Note that here, since the instance attribute exists in the dictionary, PyDict_GetItem returns the descriptor instance itself. Because that value is not NULL, it is returned as it is, regardless of whether it defines __get__ or not.
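This is exactly the second case from the question, and it can be verified from Python. A minimal sketch, again with the print call removed from the descriptor:

```python
class Descriptor:
    def __init__(self, obj):
        self.obj = obj

    def __get__(self, instance, owner=None):
        return self.obj


class Grok:
    def __init__(self, attr):
        self.attr = Descriptor(attr)


g = Grok('value')

# The descriptor object sits in the instance dictionary ...
print('attr' in vars(g))               # True
# ... and nothing named 'attr' exists on the class, so descr stays NULL.
print('attr' in vars(Grok))            # False
# The lookup returns the stored object untouched; __get__ is never called.
print(g.attr is vars(g)['attr'])       # True
print(isinstance(g.attr, Descriptor))  # True
```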

If the attribute is not found in the instance dictionary, the function checks whether the descriptor found in the class is a non-data descriptor (one without a __set__ method) and, if so, calls its __get__:

if (f != NULL) {
    res = f(descr, obj, (PyObject *)Py_TYPE(obj));
    goto done;
}
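The resulting precedence (data descriptor, then instance dictionary, then non-data descriptor) can be seen in a short experiment. The class names NonData and Data are made up for this sketch, and the instance dictionary is written to directly so that __set__ is bypassed.

```python
class NonData:
    def __get__(self, instance, owner=None):
        return 'from non-data descriptor'


class Data:
    def __get__(self, instance, owner=None):
        return 'from data descriptor'

    def __set__(self, instance, value):
        raise AttributeError('read-only')


class Grok:
    n = NonData()
    d = Data()


g = Grok()
before = g.n
print(before)  # from non-data descriptor

# Put entries with the same names straight into the instance dictionary.
g.__dict__['n'] = 'from instance dict'
g.__dict__['d'] = 'from instance dict'

print(g.n)  # from instance dict      (instance dict beats non-data descriptor)
print(g.d)  # from data descriptor    (data descriptor beats instance dict)
```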

If none of the conditions above was satisfied, the function checks whether descr is not NULL (that is, something was found for the attribute in the type of g). If so, descr itself becomes the result and is returned:

if (descr != NULL) {
    res = descr;
    descr = NULL;
    goto done;
}

Finally, if nothing has worked so far, an AttributeError is raised to report that the attribute was not found:

PyErr_Format(PyExc_AttributeError,
             "'%.50s' object has no attribute '%U'",
             tp->tp_name, name);

To finish, the reference counts are adjusted and the value of res is returned:

done:
    Py_XDECREF(descr);
    Py_DECREF(name);
    return res;
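The steps walked through so far can be condensed into a simplified Python model. This is only a sketch: generic_getattr is a hypothetical helper, it ignores reference counting, caching and the dict parameter, and it is not how CPython is actually written. It does reproduce both outcomes from the question.

```python
_MISSING = object()


def generic_getattr(obj, name):
    """Simplified Python model of the C lookup (hypothetical helper)."""
    if not isinstance(name, str):
        raise TypeError('attribute name must be string')

    tp = type(obj)

    # 1. Look the name up on the type and its bases, without binding.
    descr = _MISSING
    for klass in tp.__mro__:
        if name in vars(klass):
            descr = vars(klass)[name]
            break

    # 2. A data descriptor found on the type wins immediately.
    getter = None
    if descr is not _MISSING:
        getter = getattr(type(descr), '__get__', None)
        is_data = (hasattr(type(descr), '__set__')
                   or hasattr(type(descr), '__delete__'))
        if getter is not None and is_data:
            return getter(descr, obj, tp)

    # 3. Otherwise the instance dictionary is consulted; values are
    #    returned as stored, even if they define __get__.
    instance_dict = getattr(obj, '__dict__', None)
    if instance_dict is not None and name in instance_dict:
        return instance_dict[name]

    # 4. A non-data descriptor from the type is called next.
    if getter is not None:
        return getter(descr, obj, tp)

    # 5. A plain class attribute is returned as it is.
    if descr is not _MISSING:
        return descr

    raise AttributeError(f'{tp.__name__!r} object has no attribute {name!r}')


class Descriptor:
    def __init__(self, obj):
        self.obj = obj

    def __get__(self, instance, owner=None):
        return self.obj


class ClassLevel:
    attr = Descriptor('value')


class InstanceLevel:
    def __init__(self, attr):
        self.attr = Descriptor(attr)


class_result = generic_getattr(ClassLevel(), 'attr')
instance_result = generic_getattr(InstanceLevel('value'), 'attr')

print(class_result)                    # value
print(type(instance_result).__name__)  # Descriptor
```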

The whole function, for easier reading:

PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name, PyObject *dict)
{
    PyTypeObject *tp = Py_TYPE(obj);
    PyObject *descr = NULL;
    PyObject *res = NULL;
    descrgetfunc f;
    Py_ssize_t dictoffset;
    PyObject **dictptr;

    if (!PyUnicode_Check(name)){
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     name->ob_type->tp_name);
        return NULL;
    }
    Py_INCREF(name);

    if (tp->tp_dict == NULL) {
        if (PyType_Ready(tp) < 0)
            goto done;
    }

    descr = _PyType_Lookup(tp, name);

    f = NULL;
    if (descr != NULL) {
        Py_INCREF(descr);
        f = descr->ob_type->tp_descr_get;
        if (f != NULL && PyDescr_IsData(descr)) {
            res = f(descr, obj, (PyObject *)obj->ob_type);
            goto done;
        }
    }

    if (dict == NULL) {
        /* Inline _PyObject_GetDictPtr */
        dictoffset = tp->tp_dictoffset;
        if (dictoffset != 0) {
            if (dictoffset < 0) {
                Py_ssize_t tsize;
                size_t size;

                tsize = ((PyVarObject *)obj)->ob_size;
                if (tsize < 0)
                    tsize = -tsize;
                size = _PyObject_VAR_SIZE(tp, tsize);
                assert(size <= PY_SSIZE_T_MAX);

                dictoffset += (Py_ssize_t)size;
                assert(dictoffset > 0);
                assert(dictoffset % SIZEOF_VOID_P == 0);
            }
            dictptr = (PyObject **) ((char *)obj + dictoffset);
            dict = *dictptr;
        }
    }
    if (dict != NULL) {
        Py_INCREF(dict);
        res = PyDict_GetItem(dict, name);
        if (res != NULL) {
            Py_INCREF(res);
            Py_DECREF(dict);
            goto done;
        }
        Py_DECREF(dict);
    }

    if (f != NULL) {
        res = f(descr, obj, (PyObject *)Py_TYPE(obj));
        goto done;
    }

    if (descr != NULL) {
        res = descr;
        descr = NULL;
        goto done;
    }

    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%U'",
                 tp->tp_name, name);
  done:
    Py_XDECREF(descr);
    Py_DECREF(name);
    return res;
}
  • There are a few more things I need to comment on. I’ll see if I can do it tonight, if no one answers first.

  • Thank you for the reply.


The other answer is very good - including code snippets from the reference implementation. But I will write a shorter answer here, addressing another aspect of the question.

I think of it as a "guideline" that helps in understanding many of the language's behaviors, which, in my opinion, makes Python very pleasant: it brings very few surprises once you understand these guidelines.

In Python, all the "magic", that is, methods called transparently by the language itself, is tied to class attributes, not instance attributes.

For ease of recognition, and to avoid name clashes, these features are also generally given names that begin and end with a double underscore: the famous __dunder__ names.

So one of the reasons this behavior was chosen is this: descriptors imply special behavior on attribute access, in the same way that __len__, __init__ and so on are special, and therefore they have to be defined in the class.
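A quick way to see this rule at work with an ordinary special method: attaching __len__ to an instance does not change what len() does, because the language looks the method up on the class. The class name Sized is just an example.

```python
class Sized:
    def __len__(self):
        return 3


s = Sized()

# Attaching a "special method" to the instance does not affect len().
s.__len__ = lambda: 99

print(len(s))       # 3   (implicit call: looked up on the class)
print(s.__len__())  # 99  (explicit access: ordinary attribute lookup)
```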

Beyond this motivation, which is more of a "feeling", there are practical reasons, such as implementation: attribute access on an instance works on top of a normal dictionary, and that mechanism would have to be changed so that retrieving an attribute from the instance performed some other operation instead of simply returning the attribute.

And how would a descriptor be "installed" on an instance to begin with? It would be strange if the first objeto.attr = MeuDescriptor() were a normal assignment and, from then on, later objeto.attr = ... statements called the descriptor's __set__ method instead of placing the value in the instance's __dict__. This would make assignment depend on state, which has the potential to complicate code considerably, since what an assignment does would depend on execution order.

In fact, even a descriptor defined in the class allows you to write code that does different things depending on the order in which values are assigned to it: just keep a state variable controlled by the descriptor. But this is almost never done.
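As an illustration of such order-dependent behavior, here is a sketch of a write-once descriptor defined in the class. WriteOnce and Config are made-up names; the "state" is simply whether the instance dictionary already holds a value.

```python
class WriteOnce:
    """Hypothetical descriptor: the first assignment stores, later ones fail."""

    def __set_name__(self, owner, name):
        self.key = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        try:
            return instance.__dict__[self.key]
        except KeyError:
            raise AttributeError(self.key) from None

    def __set__(self, instance, value):
        if self.key in instance.__dict__:
            raise AttributeError(f'{self.key} was already assigned')
        instance.__dict__[self.key] = value


class Config:
    token = WriteOnce()


c = Config()
c.token = 'first'  # stored

message = None
try:
    c.token = 'second'  # same statement, different outcome
except AttributeError as exc:
    message = str(exc)

print(c.token)  # first
print(message)  # token was already assigned
```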

Finally

This is an implementation choice, but the language is dynamic enough to let you build your own classes that work with "descriptors in the instance", and it wouldn’t even be much work. Just create a base class that overrides __getattribute__ and __setattr__ to handle descriptors (and, of course, the crazy effect I mentioned above would then apply).

A read-only version would look more or less like this:

class InstanceDescriptableBase:
    def __getattribute__(self, attrname):
        attr = super().__getattribute__(attrname)
        if hasattr(attr, "__get__"):
            attr = attr.__get__(self, self.__class__)
        return attr

And at the terminal:

In [3]: class D:
   ...:     def __get__(self, instance, owner):
   ...:         return 42
   ...:     

In [4]: class Test(InstanceDescriptableBase):
   ...:     def __init__(self):
   ...:         self.attr = D()
   ...:         

In [5]: t = Test()

In [6]: t.attr
Out[6]: 42
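For completeness, a version that also honors __set__ could be sketched as below. This is an assumption about how the idea might be extended, not code from the answer: InstanceDescriptorBase, Celsius and Room are hypothetical names, and the descriptor protocol is applied only to values found in the instance dictionary. It shows the order-dependent effect described earlier: the first assignment installs the descriptor, later ones are routed to its __set__.

```python
class InstanceDescriptorBase:
    """Hypothetical base class honoring descriptors stored on instances."""

    def __getattribute__(self, name):
        value = super().__getattribute__(name)
        instance_dict = super().__getattribute__('__dict__')
        if name in instance_dict and hasattr(type(value), '__get__'):
            return type(value).__get__(value, self, type(self))
        return value

    def __setattr__(self, name, value):
        instance_dict = super().__getattribute__('__dict__')
        current = instance_dict.get(name)
        if hasattr(type(current), '__set__'):
            # A descriptor is already installed under this name: delegate.
            type(current).__set__(current, self, value)
        else:
            # First assignment (or a plain value): store as usual.
            super().__setattr__(name, value)


class Celsius:
    """Hypothetical descriptor that keeps its own state."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __get__(self, instance, owner=None):
        return self.degrees

    def __set__(self, instance, value):
        self.degrees = float(value)


class Room(InstanceDescriptorBase):
    def __init__(self):
        # The first assignment installs the descriptor on the instance.
        self.temperature = Celsius(20.0)


room = Room()
print(room.temperature)  # 20.0

room.temperature = '23'  # routed to Celsius.__set__
print(room.temperature)  # 23.0
print(type(vars(room)['temperature']).__name__)  # Celsius
```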
  • It was something like this that I wanted to complement in the answer, but I imagined that soon you would answer :D

  • Hahahaha! There are some things I can’t resist. I’m still holding back from writing the code on yesterday’s Abstract Static methods question.
