How does Python's structural pattern matching (the match statement) work?


15

Structural pattern matching was recently introduced in Python 3.10 (still in alpha). It adds the keywords match and case.

The release notes include several examples, among which I can quote:

def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something's wrong with the Internet"

That said, I ask:

  • How does match work in Python?
  • Is it exhaustive? What happens if the value being matched does not match any arm?
  • Does it allow values to be unpacked?
  • Historically, Python has had no switch. Can we now say that this construct exists in the language, or are they different things?
  • My friend, if you take a look at this PEP, https://www.python.org/dev/peps/pep-0622/, you will see that pattern matching is something broader than the switch that originally came from C (for example), which is natural, since languages and their features evolve. In Python's case, the PEP explicitly cites the syntactically similar feature in Scala and Erlang (among others). Take a look at a very similar question I asked about Swift: https://answall.com/questions/137912/switch-case-muito-interessante-em-swift-quais-outras-linguagem-suportam-isso

  • 5

    @Matthew, thank you for the links. The question really is similar, but I already know what pattern matching is (although I don't yet know its peculiarities in Python). My idea with this question is to create a "canonical" reference for this new feature, specifically in the Python world. That is also why I included the last bullet: I'm sure it is something many people will wonder about with this new statement. I'll also leave another related one here, this time about PHP.

  • 5

    Is this really the syntax? In Python, of all languages, which calls itself expressive? If people are consistent, soon everyone will move from Python to C#. In other words, Python wants to become PHP, adding things without thinking and without following its own philosophy.

  • @Luizfelipe, forgive me, I did not mean to imply that you did not know what pattern matching was (I wrote in a hurry, without taking care). I thought the implementation of the protocols in PEP 622 was cool; check this out: https://www.python.org/dev/peps/pep-0622/#custom-matching-Protocol

  • @Maniero (take what I'm about to write as a joke, eh... promise?) Just because your only tool is a hammer doesn't mean all your problems are nails, hehehe. If you want to keep driving screws in Python with a chisel, hahahaha, go ahead. The feature is very welcome.

  • 3

    @Matthew The idea of [pt.so] is to build a repository of programming knowledge, which is why it is common (nowadays less so, but it is still allowed and encouraged) for people to ask questions about a subject they already know, so that the knowledge is on the site, accessible to all. Of course, most questions end up being asked by those who do not know the subject, but there is no problem if the asker already knows the answer.

  • 5

    (Just pointing out that the comments above mention PEP 622: it was the first proposal, but it was not the one implemented. The syntax and examples that are valid are those described in PEP 634 (https://www.python.org/dev/peps/pep-0634/), 635 and 636.)


2 answers

11


Pattern matching is not the same as a switch. The example code in the question may make it look like one, but once we go into the details we will see that it is quite different.

According to PEP 635, pattern matching "builds a generalized concept of iterable unpacking. Binding values extracted from a data structure is at the very core of the concept and hence the most common use case".

In addition, this article (whose authors include the creator of Python himself) states that "pattern matching allows both to extract specific information from complex data types, as well as to branch on the structure of data and thus apply specialized actions to different forms of data".

That is, the idea of pattern matching is to extract values based on some pattern. The fact that it can be used much like a switch (or a chain of if/else) is incidental. But let's get to the details...


The specification of pattern matching is in PEP 634, the motivation for it is in PEP 635, and a full tutorial can be found in PEP 636, so I will only point out some general lines, to give a sense of what it is.

The first difference from a switch is that match has no fall-through: once it enters one option, the rest are ignored.

That is, the code below only prints "Not found":

status = 404

match status:
    case 400:
        print("Bad request")
    case 404:
        print("Not found")
    case 418:
        print("I'm a teapot")
    case _:
        print("Something's wrong with the Internet")

But in languages that have switch (such as JavaScript), fall-through occurs if there is no break within each case:

var st = 404;

switch (st) {
  case 400:
    console.log('Bad request');
  case 404:
    console.log('Not found');
  case 418:
    console.log("I'm a teapot");
  default:
    console.log("Something's wrong with the Internet");
}

The code above prints:

Not found
I'm a teapot
Something's wrong with the Internet

For it to print only "Not found", there must be a break within each case.

So the first difference is that match does not need this break. But there's much more...


PEP 636, already mentioned, has all the complete examples, but as a summary, I have combined several into one. The examples use a scenario in which the user types a command and the match analyzes what was typed in order to take an action:

command = input('command: ')

match command.split():
    case ["quit"]:
        print("Goodbye!")
        quit_game()
    case ["look"]:
        current_room.describe()
    case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
        character.get(obj, current_room)
    case ["go", direction] if direction in current_room.exits:
        current_room = current_room.neighbor(direction)
    case ["go", _]:
        print("Sorry, you can't go that way")
    case ["drop", *objects]:
        for obj in objects:
            do_something(obj)
    case _:
        print('invalid command')

In the first two cases (case ["quit"] and case ["look"]), the pattern matches specific values. That is, if the typed string is exactly "quit" or "look", execution falls into the respective case (remember that split returns a list, which is why the case is written as a list containing the string).

Then we see another case: case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]. Here the | means "or", so there are 3 different alternatives that lead into this case. The subject is either a list containing the string "get" and some other string (whose value is bound to the variable obj), or a list of 3 strings containing "pick", "up" and some other string (which becomes obj), where the order varies between the alternatives (i.e. I could type "pick up object" or "pick object up").

Next, we have the option of adding a guard to the case:

case ["go", direction] if direction in current_room.exits:
    current_room = current_room.neighbor(direction)
case ["go", _]:
    print("Sorry, you can't go that way")

That is, if the subject matches case ["go", direction], the condition is then evaluated (in this case, whether the value of direction is one of those I consider valid). If it is, execution enters that case; otherwise it moves on to the case below.

Finally, we have a case that uses unpacking: case ["drop", *objects]. That is, if you type "drop obj1 obj2 obj3", the values "obj1", "obj2" and "obj3" end up in the list objects (if you type just "drop", the list objects will be empty).


It is also possible to match on the structure of a dictionary:

dados = {"text": "Lorem ipsum", "color": "azul"}

match dados:
    case { "text": texto, "color": cor }: # matches this case
        print(f'imprimir "{texto}" na cor {cor}')
    case { "sound": arquivo, "format": formato }:
        print(f'Tocando {arquivo} (formato: {formato})')

This way, I can check whether the dictionary has certain keys and assign their values to variables at the same time. In the example above, since the dictionary has the keys "text" and "color", the variables texto and cor receive the respective values.

It is worth remembering that the dictionary may have other keys besides those indicated, and that matching is done in the order in which the cases appear, entering the first one that matches. For example, the code below enters the first case (even though the dictionary also has the keys that would match the second case):

dados = { "sound": "musica", "format": "mp3", "text": "Lorem ipsum", "color": "azul"}
match dados:
    case { "text": texto, "color": cor }: # enters this case
        print(f'imprimir "{texto}" na cor {cor}')
    case { "sound": arquivo, "format": formato }:
        print(f'Tocando {arquivo} (formato: {formato})')

But if I reversed the order of the cases, it would find the keys "sound" and "format" first:

dados = { "sound": "musica", "format": "mp3", "text": "Lorem ipsum", "color": "azul"}
match dados:
    case { "sound": arquivo, "format": formato }: # enters this case
        print(f'Tocando {arquivo} (formato: {formato})')
    case { "text": texto, "color": cor }:
        print(f'imprimir "{texto}" na cor {cor}')

And it is also possible to check the types of the values:

dados = { "nome": "Fulano de tal", "idade": 30 }
match dados:
    case { "nome": str(nome), "idade": int(idade) }:
        print(f'{nome} tem {idade} anos')
    case _:
        print('matching not found')

That is, the first case matches only if nome is a string and idade is an integer (for example, if I changed the value of idade to the string "30", it would not match, and execution would enter the second case).

And if you want to capture the remaining keys (if the dictionary has extra keys), just use a double-star pattern:

dados = { "nome": "Fulano de tal", "idade": 30, "email": "[email protected]", "filhos": 2 }
match dados:
    case { "nome": str(nome), "idade": int(idade), **resto }:
        print(f'{nome} tem {idade} anos', resto)

In the example above, resto will be a dictionary containing the remaining keys ("email" and "filhos") and their respective values.


Anyway, as you can see, it is much more than a simple switch: besides analyzing the value, match can also examine its internal structure, including classes, as in the example below:

# obviously, assuming that the classes Click, KeyPress and Quit exist
match event.get():
    case Click(position=(x, y)):
        handle_click_at(x, y)
    case KeyPress(key_name="Q") | Quit():
        game.quit()
    case KeyPress(key_name="up arrow"):
        game.go_north()

That is, it is possible not only to check that the event is an instance of a certain class, but also to obtain the values of its attributes directly.


Another detail is that, unlike in other languages (for example Rust, cited in the comments), match is not exhaustive. That is, we do not need to cover every possibility in the cases.

Therefore, the example in the question could omit the default clause:

status = 444
match status:
    case 400:
        print("Bad request")
    case 404:
        print("Not found")
    case 418:
        print("I'm a teapot")

In this case, the code above prints nothing, as none of the cases matches.


And just to finish, another difference from switch is that in many languages the default clause does not have to be the last one. For example, in JavaScript, the code below prints "Something's wrong with the Internet":

var st = 444;
switch(st) {
    case 400:
        console.log("Bad request");
        break;
    default:
        console.log("Something's wrong with the Internet");
        break;
    case 404:
        console.log("Not found");
        break;
    case 418:
        console.log("I'm a teapot");
        break;
}

But in Python's match, this is an error:

status = 444
match status:
    case 400:
        print("Bad request")
    case _: # error: the "default" clause must be the last one
        print("Something's wrong with the Internet")
    case 404:
        print("Not found")
    case 418:
        print("I'm a teapot")

That is, case _ must be the last one.


Anyway, in general, that's it. For all the details, read the PEPs listed at the beginning.
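
The error is raised when the code is compiled, before anything runs. A minimal way to observe it without breaking the surrounding script is to compile the offending snippet from a string (the variable names here are mine):

```python
source = '''
match status:
    case _:
        print("default")
    case 404:
        print("Not found")
'''

rejected = False
try:
    # compile() only parses and compiles; `status` is never evaluated.
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    rejected = True
    print("SyntaxError:", exc.msg)

print("rejected at compile time:", rejected)
```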

  • 1

    Sensational. Do you know whether there is any similarity with Rust's match?

  • 2

    @Lucas Unfortunately I don’t know Rust well enough to answer...

  • 3

    @Lucas, in essence it is almost the same thing. However, since Python's match is a statement, it does not have to be exhaustive, because it does not need to "return a value" when evaluated. Rust's match, being an expression, must always produce a value, which implies that it always has to be exhaustive.

  • 2

    I would like to leave three examples of implementations that take pattern matching to another level: LISP, Haskell and Scala. Personally, I would like Python to follow at least one of them.

8

I followed the development from the first proposal, through its various iterations, to the final proposal that was implemented.

How does Python's match work?

The idea of match is that, starting from an object or the result of an expression, you run a code snippet that can extract values and normalize them into variables to be consumed further down in the code.

For this, there are two resources: the patterns themselves, which allow variables to be bound at the moment of comparison, in the case clause, and, when they are not sufficient, any valid Python code in the body of the case.

Beyond that, because of this structure, in which each case clause runs a code snippet whenever its pattern matches, match/case also works like C's switch/case. But this is a "bonus", since the conditional execution of code blocks that switch/case provides is already fully covered by if/elif/else in Python, as I showed in the answer you linked in the question.

Is it exhaustive? What happens if the value being matched does not match any arm?

The same as with an if/elif that did not enter any block: at runtime, nothing. The program simply continues executing at the first line after the match block. However, the recommendation for static code analysis tools (such as Pyflakes, mypy, etc.) is to give at least a warning in such cases.

Recall that static code analysis is an optional step that lets Python developers find errors and "strange things" that statically typed languages find at compile time. In Python, being a dynamic language, the same errors would otherwise only show up at runtime.

Does it allow values to be unpacked?

Yes, and this is the main justification for including the new syntax: "value unpacking", or "object deconstruction".

So one of the main examples in the PEPs that describe the feature is a function that performs an operation on the "x" and "y" coordinates of a geometric point, except that it can receive this point in several different forms: as a tuple of two values, as a dictionary with the keys "x" and "y", or as an object of type "Point" that has the attributes "x" and "y". The match/case structure allows more elegant code than a series of if/elifs to ensure that these values end up in the variables x and y for the rest of the code:


import math


def distance(point):
    # Point is the dataclass defined in the terminal session below
    match point:
        case Point(x, y):
            pass
        case [x, y]:
            pass
        case {"x": x, "y": y}:
            pass
        case _:
            raise TypeError("Could not recognize the coordinates")
    return math.hypot(x, y)

And in the terminal, with a development version of Python 3.10, it works like this:


In [6]: from dataclasses import dataclass

In [7]: @dataclass
   ...: class Point:
   ...:     x: float
   ...:     y: float
   ...: 

In [8]: distance([3,4])
Out[8]: 5.0

In [9]: distance(Point(3, 4))
Out[9]: 5.0

In [10]: distance({"x":3, "y":4})
Out[10]: 5.0

As you can see, the main point is that the case clause uses a new syntax, in which "loose" variables receive the values from the matched patterns. In the first clause, these are the attributes of Point, in the order used to construct the Point (though the variables could have other names too); in the second clause, any sequence of length 2; in the third, any "Mapping" object (objects that work like a dictionary) that has the keys "x" and "y"; and the fourth clause is the "default". In this case, no code was needed inside the case blocks: the desired values had already been deconstructed into the variables x and y (the pass statement could even be on the same line).

Historically, Python has had no switch, but can we now say that this construct exists in the language? Or are they different things?

As I wrote above: the "switch" functionality ends up entering the language as a "bonus" with match/case. It is almost a second way of doing the same thing, with some extra expressiveness allowed by the match syntax (i.e., code using "match" can be shorter and more readable than the same logic using "if", at the cost of having to understand a new syntax).

  • "any sequence of length 2": interesting, I thought that case [x, y] would match only lists, but I tested it here passing a tuple and a range(3, 5) and both worked. So would it work with any iterator/iterable?

  • 4

    Not iterators: an iterator has no length (no __len__ method). Any sequence, yes: a range is more than an iterable; it is a sequence and has a length.

  • 1

    Yes, I tested with a generator function and it did not match (because there is no len)... On another subject, do you know why they preferred case _ over default (or else, or any other option)?

  • I'll take the opportunity to ask whether you also know why they chose to make match a statement rather than an expression, which seems to be more common for pattern matching in other languages.

  • 3

    After the first proposal, written by 5 or 6 authors, including Guido, hundreds of public emails were exchanged, asking and arguing for every possible form and possibility. At that stage, there were requests for it to be an expression, but I don't remember why it was agreed that the statement form was better. I think that, in the end, it provides more flexibility: with case blocks you can add any extra code you need, so it has more possible uses.

  • 3

    As for why they didn't use default or else: there were dozens of emails asking for exactly that, and I don't remember the argument in favor of case _:. I would have preferred else too, since it is used in so many other statements in Python.
