What is a virtual subclass in Python and what are its advantages?

Asked

Viewed 408 times

11

In Python we can implement an abstract class from the module abc and one of the forms is the class inheriting from abc.ABC:

from abc import ABC

class AbstractClass(ABC):
    @abstractmethod
    def method(self):
        ...

From this class we can define a subclass from the direct inheritance:

class Subclass(AbstractClass):
    def method(self):
        return "Metodo da subclasse"

Or by defining a virtual subclass from AbstractClass.register:

class VirtualSubclass:
    def method(self):
        return "Metodo da subclasse virtual"

AbstractClass.register(VirtualSubclass)

When checking from the function issubclass both classes will satisfy the condition:

print(issubclass(Subclass, AbstractClass))  # True
print(issubclass(VirtualSubclass, AbstractClass))  # True

See working on Ideone

So what is the difference between implementing a subclass from the inheritance or virtually from the register? When using the virtual subclass?

  • Here’s something I miss about Java: classes inherit static methods. I bet there’s also a superscript of static functions there

  • @Jeffersonquesado I’m not sure I understand what you mean xD

  • the method register belongs to the abc.ABCMeta. And he becomes a AbstractClass the moment you do AbstractClass(abc.ABC).

  • 1

    @Jeffersonquesado If I’m not mistaken each class stores a set of references to the subclasses it has. The function issubclass checks if the class is part of this set. In this case, the function register only add the reference to the class in that set, without going through the construction of the inheritance.

  • From what I understand, the virtual subclass is a means of saying that class B meets the contract defined by A even while B is not A.

  • With classes resident in their own code there will be no difference. However If you have to use a class external to your code, not related to any class of yours, existing in a third party package and the use of that class requires that you fulfill a contract (implicit interface), so that you cannot change the code then use the register() to register it as a virtual subclass and get around all these obstacles.

  • @Augustovasques But if Python doesn’t care about the type and class meets the expected contract, what’s the reason for registering it?

  • The language doesn’t mind but if you are using, for example, a Pattern design of strategy and method where the dependency injection is made requires that the passed class be specifically subclass of a certain type ai makes the difference.

  • 1

    The discussion was going in a good direction - I wrote the answer without reading the comments here, and I think I covered both sides of the conversation. : -) Comment there.

  • So - until Friday, when I answered, I didn’t see any practical case possible - but today I did, and for a "noble reason" - I’ll probably complete the answer with these cases later: https://github.com/jsbueno/terminedia/blob/484f163cec1e5f74ba8afb9c9b4d26f8705b353b/terminedia/image.py#L339

  • TL;DR: I used virtualsubclassing in a class that will be a proxy for "virtual super class" hierarchy objects - and the logic contained in the proxies themselves is different from the classes that are "proxiated". Only that the places that will receive these proxies have to "perceive" them as being of that hierarchy.

  • 2

    I increased the response now, explaining better this use I made of the resource.

  • 1

    By "coincidence" there is an article written last week about this: https://www.hillelwayne.com/post/negatypes. (I am reading now)

  • @jsbueno "How is this Useful? No idea" haha sincerity is all :D

  • Interesting is that the author writes more or less the same as I do about "Collections.abc not having become as popular as predicted".

  • 1

    (and in fact, finding "iterables that are not strings" in particular is quite useful in Python - I think I’ll leave it in my utility bag)

Show 11 more comments

1 answer

10


TL;DR: - updating - The main motivation for creating this language functionality (not the call to .register, and yes, the whole concept and mechanism of "virtual subclassing") was almost certainly able to register built-in classes in native code such as dict and list as instances of the most generic types collections.abc.Mapping and collections.abc.Sequence (and other compatible types). The use of this mechanism for user-created classes has limited utility, as detailed below:

The ugly

The simplest answer is, you’ll practically never need!

But if you’re in a project that uses Static type hinting with Mypy, you might need - (and then, you’ll probably find that not everything is flowers by increasing complexity, and your choice, although logic and the right thing will not work, because there are things done badly in that way there)

Let’s go in pieces:

What to implement a class MinhaCLasse as a virtual subclass of another ClasseBase does is that when at some point in the code, yours, or anyone else using your MinhaClasse is an instance of ClasseBase, with issubclass(MinhaClasse, ClasseBase) return True.

So, going to a more concrete example - you create a class that behaves like a Python sequence - implements from the inside __getitem__, __len__, and other things, but does not directly inherit nor list, nor of collections.abc.Sequence. Then you This makes, for example, that if you were to use your class with a library that will use Python "sequences". If your team wrote this other code, and you’ve combined to always test whether an object is passed by testing isinstance(obj, collections.abc.Sequence), cool - your call will work, and merging your code with the rest of your team’s code will work fine.

Only - how many times have you tested whether an object you’re going to use as a sequence is a sequence using that? In the real world, there are rare libraries that will check if your parameter is a sequence even using this comparison - in practice, your object will be plugged into a for on the other side, and, if he doesn’t respond right to the "iterable" protocol, an exception happens - and that’s it, everybody’s happy. :-)

That’s - you run it here:

...
class MinhaSeq:
   ...

collections.abc.Sequence.register(MinhaSeq)
...

def qualquer_funcao():
    items = MinhaSeq(...)

    biblioteca_x.funcao_y(items)

And if on the other side, in the library:

def funcao_y(obj):
    for item in obj:
        funcao_z(item)

Hence, if by chance, your MinhaSeq did not function as a sequence or iterable, the line that has the for would cause a TypeError.

Now, if back in the library they "thought about it," the code could be like this:

import collections.abc

def funcao_y(obj):
    if not isinstance(obj, collections.abc.Iterable):
         raise TypeError("funcao_y precisa receber um iterável")
    for item in obj:
        funcao_z(item)

Congratulations - now Typeerror happens exactly one line before! E - the library_x will only work for who or inherit from a official Python sequence, or remember to call the "Register". That is - on the library_x side, it did not facilitate the use for third party, and did not win nothingness with that check.

(and you have an extra "isinstance" check, which if you go on a tight loop can give performance difference, though this is a rare situation in Python code (that is: a single call to isinstance impact on the performance of a code snippet))

the right thing to do

Now, yes, with PEP 484, and static check, the library_x could have the code like this:

import typing as T

def funcao_y(obj: T.Iterable[T.Any]):
    for item in obj:
        funcao_z(item)

Note that in this case the library_x can help the user who is worried about making calls with correct typing in a large project: the project will include the execution of "mypy" at test time/q.a. and if the call in your code passes an object that is not an "Iterable" that will be charged before the code is being executed or produced. And, if, on the other hand, who will use the library_x is not worried about this check, will not be checking his project with "mypy" and will only make the call - which will work as in the first case above, with no extra code at runtime.

Hence the problem with the "virtual subclass": she nay It’ll work in this case! Because "mypy" (and other similar tools), there is no way to find out that you will call "Collections.abc...Register" for your class in static analysis: it will error your call in the same way.


Just to give a very real example of when I say that the "bibitoeca_x" won’t test with isinstance: it’s not a joke - it doesn’t happen in the real world - it doesn’t even happen in the standard Python library. The JSON Encounter, for example, requires real instances of dict and list (or direct subclassses) to work, and will not work with `Collections.abc.Sequence. This week even had a highly complex question on Soen about this, involving users with the highest reputation, and complex issues (such as "metaclasses") - and discovering bugs in the standard Python implementation itself: https://stackoverflow.com/a/58031309/108205 (Disclaimer: I was involved in the question and mine was the answer accepted).


(continuing the correct): What to do then?

Good, you want to use some typing check in Python and do things right from the OO point of view? So the right thing to do is to remember that when the "Register" engine and virtualsubclassing were created, no thought had been given to the optional static type check of Python that is gaining popularity today. What would happen in the above case is that instead of worrying about registering MinhaSeq as a virtual subclass of collections.abc.Sequence, you would advise with the recommendations of PEP 484 and the Typing module that your class respects the interface typing.Sequence.

The problem? It is that typed Python code to work with Mypy gets boring to write. For example, if you have declared the class MinhaSeq without being compatible, by its inheritance, with the type of sequence, can not simply make a "=" declaring that now it is compatible - ie, this here: (TL;DR: the example below is the recommended form in Python modermo to create other classes with a user-friendly interface by Mypy -- that is, when formally concerned about typing in the project):

import collections.abc
import typing as T

class MinhaSeq(collections.abc.Sequence[T.Any]):
    ... 

It works - but it’s obviously not equivalent to calling the .register after the class was created. The closest would be to use the typing.cast - "Typing.cast" is something that does nothing at runtime - returns the same object it was called with - but passes information to the static checker, in this case mypy, about the type of the returned object. The problem?? The static mapper will already have learned about the MinhaSeq and does not let you change her type after the statement - so the return of the cast, which he will understand is a type "Sequence" has to be to a similar name, but not the same:

import collections.abc
import typing as T

class _MinhaSeq:
    ...

MinhaSeq = T.cast(T.Type[T.Sequence[T.Any]], _MinhaSeq) 

And here, you’d be ready to call funcao_x(obj: T.Iterable[T.Any]): passing instances of MinhaSeq

The fun

Although it has no practical use - even in super-pedantic code regarding typing, the interesting thing about virtual classes is precisely the "concept". It may be that the idea comes back with more strength in a few years (if "mypy" pays the same attention to the call to "Register" that he pays in the call to "Typing.cast" for example, the thing would already work)

Behind the scenes, they are mechanisms to let super-classes respond programmatically to the calls of isinstance and issubclass that are at stake - the "Register" method of the ABC classes is just a form of the ABC classes annotate how their special methods __subclasscheck__ and __subclasshook__ are used - and it is possible to have some project that makes a legal use and with practical applications of it. But, as it is defined today, it would be difficult to use the virtual subclasses in a project with practical application in addition to proof of concept.

A case of use

As an "exception to confirm the rule", during the last weekend I was implementing a feature in which "virtual subclassing" seemed to provide an interesting feature. I wouldn’t have remembered that feature if I hadn’t interacted with that question - and I would have just used a subclass.

There is this free project that I’m developing - a library for drawings and arts with Unicode in the terminal - https://github.com/jsbueno/terminedia - In it, I have a class hierarchy that starts with "Shape" and provides some specialized classes to contain graphic and textual data (for example, one of the classes loads a binary image file, and keeps the internal data as an image of PIL, another keeps the data as strings, and has a color map which acts as a palette: each character can represent a distinct color, etc...).

All "Shape" classes have in common that their data is read and modified through the methods __getitem__ and __setitem__ - and I wanted to provide a "view" class that would allow choosing a smaller area within a Shape - for example, the rectangle between positions (5,5) and (15,10) - and be able to change data in that view as if it were a "Shape" - but the contents of the "0,0" position of the view would transparently alter the content of the original at position "5,5". And so on for any graphical operation in the view: it presents the same methods and attributes as a "Shape", but the internal data are those of the original instance, and the whole addressing is done only within the region of interest (ROI).

The logic for such a view is quite simple - it has to worry about providing a few attributes, and to transparently access all the other attributes of the original instance. In Python this is possible by customizing the method __getattribute__ of a class, carefully.
So it doesn’t make sense that I have to have all the logic of Shape on Shapeview, even in an inherited way - (and it would be inherited if I wasn’t using virtualsubclassing - it wouldn’t consume "extra" resources). All Shapeview has to do is "worry" about making the necessary changes to the coordinates for its proxi role.

Well, it turns out that in the rest of the project, there are some points where an object of type "Shape" is expected. Since Shapeview - being a proxy for a Shape - can do everything a Shape can, it makes sense that it can "say it’s a Shape, even without inheriting it directly from one". And then virtualsubclassing does exactly what would be interesting: the code that checks the class with isinstance(data, Shape) You’ll think you’re a Black Hat. And even if within the project this only happens in a few places, with virtualsubclassing, I can encourage this Pattern for the users of bilbliteca - will continue working.

What did I need to do? First, the base class "Shape" has to become a class that allows the registration of virtual subclasses. This can be done simply by inheriting abc.ABC of the standard library. And interestingly, exactly the Shape base had some "abstract" methods - which need to be overriden by the subclasses, but I had not given myself the job of inherited abc.ABC only on account of this - as the focus of the project (at least at the time), is not to be 100% in accordance with all good practices of OO, I simply had a raise NotImplementedError in those methods that need to be rewritten in the subclass. The @abstractmethod module abc Python does little more than that, so I wasn’t using it. But since I was going to use the base ABC for virtualsubclassing, no reason not to use the decorator @abstractmethod in the 3 places where it makes sense.

Another point that’s cool to note is that, so I needed to call Shape.register(ShapeView) to "activate" the virtualsubclassing. And, like any callable in Python that receives a callable as a single argument, this can be done with the syntax of Developer - that is, the registration of the virtual subclass could be done like this:

@Shape.register
class ShapeView:
    ... 

https://github.com/jsbueno/terminedia/blob/140c934da66c0186e52741cbb0dacfa6bc16f0b7/terminedia/image.py#L351

(note: the call .register works as a Developer why it returns the original argument - if it returns None, this would not work - in the documentation there is a note that in Python version 3.3 they realized this and changed to work).

Good - then, finally, to put water in the boil - I was excited that all the content of the class ShapeView had gotten minimal - she just needed to customize access to attributes, implement __setitem__, __getitem__ In addition to width, height and size to work as a Shape that does a lot more things -- and I realized what the implementation is like, I needed to replicate in it the namespaces with the drawing methods themselves - then it was not so far away from a "Shape". So - on that side - Shapeview had to reimplement a lot of Shape - virtualsubclassing didn’t bring that many gains. But on the other hand, a Shapeview can be associated with any of the other subclasses of Shape - and provide transparent access to attributes such as PixelClass - maybe for this to work right, I’d end up having to do all shapes have to have the features that are in "Shapeview" - or create dynamic subclasses every time a shapeview is created. - then perhaps the virtualsubclassing balance is still quite positive.

Disclaimer: by the size of the text used only to explain the case of concrete use, you can see that virtualsubclassing is not a feature that will be used all the time.

  • Just commenting that I have already read the answer, but I have not yet been able to analyze it completely, because it seems to have a lot of content to extract from it. You’ve already won my vote.

  • In the example I quoted, Strategy Pattern, then it’s no use case. I’d like to know about this proxy (I left the star in git) class Shape about the class ShapeView, I ask because I want to learn, it would not be the case to use an adapter pattern, Adapter Pattern, rather than a proxy in view of what was put in that answer.

  • 1

    Good - first, this is a hobby project, "blue sky" - in which my goal is (1) to have Apis that allow you to create drawings and graphics with Nicode in the terminal (and soon other backends), and (2) to make these Pis as succinct and intuitive as possible for external users. 'good practices' and 'Patterns' are adopted as they occur to me in development. Shapeview allows access to a subset of the image data of an associated object, exposing the same drawing methods and etc... and does so from a "Slice" of the original object, without data copy.

  • 1

    I’m still not sure if keeping shapeview in a separate hierarchy from the Shapes is an unnecessary preciousness, and if it couldn’t just be a subclass of 'Shape". Refactoring by a mixin last night made it even more similar - but even if it is a subclass of Shape, it would still act as a proxy for the associated Shape data - that seems to be right. It is not the case of "Adapter" because who will use the shapeview will use the same interface that exists in a Shape (AFAIK, "Adapter" is, roughly speaking, a way to translate an interface to another)

  • 1

    As for the example you created - "Strategy Pattern" - yes, it can be a use case - but in a very different project, and less used in Python because of the dynamism of the language.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.