Structural Pattern Matching

Python 3.10 was released on 4th October 2021 and introduced several new features and improvements to the language. One of the biggest additions is structural pattern matching, which on the face of it is similar to case, switch, or match statements in other languages. Python 3.10 has its own twist on this, however, with some nifty new approaches.

To demonstrate Python structural pattern matching we can use parsing a Logo program as an example. A great way to learn and play with the Logo language is Turtle Academy.

A simple version of the Logo language has the following commands:

Command   Short command   Argument   Example
forward   fd              distance   forward 50 (go forwards 50 steps)
back      bk              distance   back 50 (go backwards 50 steps)
left      lt              degrees    left 90 (turn left 90 degrees)
right     rt              degrees    right 180 (turn right 180 degrees)
penup     pu              (none)     penup (lift the pen up so it no longer writes)
pendown   pd              (none)     pendown (put the pen down so it writes)

For example, we can draw a spiral by repeatedly drawing an almost-square shape, using the following turtle program:

pendown
forward 1
left 91
forward 2
left 91
forward 3
left 91
forward 4
left 91
 
# ...
 
forward 499
left 91
forward 500
penup
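
Since the program above is elided in the middle, it can be handy to generate the full listing rather than type it out. The following is a minimal sketch, assuming the pattern shown continues unchanged (forward by one more step each time, then left 91, with no turn after the final forward); the function name spiral_program and its parameters are made up for illustration.

```python
def spiral_program(steps: int = 500, angle: int = 91) -> list[str]:
    """Build the spiral as a list of Logo command strings."""
    lines = ["pendown"]
    for distance in range(1, steps + 1):
        lines.append(f"forward {distance}")
        # The listing above has no turn after the last forward.
        if distance < steps:
            lines.append(f"left {angle}")
    lines.append("penup")
    return lines


program = spiral_program()
print("\n".join(program[:5]))
```

Each string in the returned list is one command, which is the shape of input the match examples in the rest of this article work on.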

Pattern matching variables

Python 3.10 introduces the new match statement, which we can use to decide whether we are parsing an action or a move command. The match statement behaves somewhat like a combination of an if-elif-else block and tuple unpacking.

First, we provide the thing we want to match against. In our example, this is command.split(), written as match command.split(). Next, we provide a list of cases that we want to match: in our example, any command that consists of a single action, or of a move followed by an amount.

command = "forward 49"
match command.split():
    case [action]:
        ...  # handle penup / pendown actions
    case [move, amount]:
        ...  # handle forward / back / left / right amounts

If a case matches then the contents of that block are executed, and the match statement is exited. This is different from some other languages, where execution can fall through into several cases.

In Python 3.10 only the first matching case is executed.

Without the new match statement perhaps we would have written this block as

command_split = command.split()
if len(command_split) == 1:
    [action] = command_split
    # handle penup / pendown actions
elif len(command_split) == 2:
    move, amount = command_split
    # handle forward / back / left / right amounts

which is much less semantically meaningful than the version using the new match statement. Through combining the conditions and unpacking, the match statement allows us to write much more intuitive code.

Pattern matching literals

If we want to match specific input we can also match pattern literals. In this example, we match the input command “pendown”, which we can then process. In this way, we can match exactly the input that we want, as opposed to just the pattern on the input.

match command.split():
    case ["pendown"]:
        ...  # handle pen down

Pattern matching wildcards

If no case matches then the entire block is gracefully skipped over, just as an if-elif block without an else would be. To handle the situation where we would usually add an else statement, we can finish our match block with a wildcard match. As elsewhere in Python, where an underscore conventionally marks a value we do not care about, we specify the wildcard using the underscore character. For example:

match command.split():
    case ["pendown"]:
        ...  # handle pen down
    case _:
        ...  # handle default case

will try to match the input "pendown", but if it is not found, will execute the final case block. The specific pattern case _ can only occur as the final case of a match statement, since it would be impossible to ever reach anything beyond it.

Mixing pattern types

So far we have covered the three primary pattern types that the match statement works with:

  1. Capture Patterns so called because they capture a placeholder in the input into a variable

  2. Literal Patterns that allow us to match exact inputs

  3. Wildcard Pattern that allows us to describe a pattern to match any given input

The match statement allows any combination of these three types of patterns, which can be used to make much more complex patterns. In an earlier example we matched for a case [move, amount], but our Logo program allows several different moves: forward, back, left, and right. This means we would have to do a further nested match inside that case, which would go against the Zen of Python ("Flat is better than nested").

Instead, we can combine the literal pattern "forward" with a capture pattern amount:

match command.split():
    case ["forward", amount]:
        ...  # handle forward amount

Now inside this case we know that the command was “forward”, not some other command. This removes the need for further logic inside our code, making it much more readable and maintainable.

Alternatives inside a pattern

If we want to add alternatives to our patterns, we can do so using the pipe operator, |, between each alternative. For example, if we want to be able to match both the full command "forward" and the short version "fd" in a single statement, whilst also capturing the amount into a variable, we can use the following snippet.

match command.split():
    case ["forward", amount] | ["fd", amount]:
        ...  # handle forward amount

This case will enter the matching block if the user types either forward 90 or fd 90 for example.

Python 3.10 enforces that the captured variables must match exactly when using the pipe operator to express alternatives. If this were not the case, it would be impossible to know which variables would be set or not within the block prior to runtime, leading to a potential source of bugs and confusion.

Sub patterns

Structural pattern matching also allows for sub patterns in case statements, where a single element is expressed by a complex pattern, rather than one of the simpler capture, literal, and wildcard pattern types. In our Logo program, users can rotate either right or left by some number of degrees.

We can express this case with the following pattern:

match command.split():
    case [("right" | "left"), degrees]:
        ...  # handle right/left degrees

The sub-pattern is denoted within round brackets, and is an alternative pattern of the strings “right” or “left”.

Capturing a sub pattern

In the previous sub pattern example we matched either a right or left turn by some degrees, but once inside the case block we have no way of knowing whether we matched right or left! Literal patterns are not captured, because they can only possibly be one thing. In our example, though, we created a complex pattern with two alternative string literals.

Thankfully Python 3.10 introduces a mechanism for us to explicitly capture any part of our pattern using the as keyword, similar to how it is used when importing. Following is the same pattern match, but this time with the direction captured:

match command.split():
    case [("right" | "left") as direction, degrees]:
        ...  # handle direction degrees

When this case is matched and the code block entered, we now have access to two captured variables: direction and degrees.

Conditional matches

So far we have matched on patterns, and any logic around that match had to be pushed inside the case block. Similar to how Python works in comprehensions, though, we can also attach conditional logic, known as a guard, to case statements.

In the following example, our case matches a right turn command, but only if the number of degrees specified is between 0 and 360.

match command.split():
    case ["right", degrees] if 0 <= int(degrees) <= 360:
        ...  # handle right degrees

It is only possible to have one conditional match per case, since the conditions are applied on the entire case statement, not on sub patterns. This is because access to all variables is potentially needed, and so the case must be fully matched before the logic is introduced. This makes the feature slightly less powerful than logic in comprehensions, but still a very useful tool.

More on wildcards

So far we have just seen wildcards as a way to express the final case in a structural pattern matching block. However, they can be much more powerful than that. The wildcard can be used in a similar way to tuple unpacking. Let’s look at a couple of examples.

The following snippet matches the literal “right”, followed by a wildcard, which is not captured. This is helpful when we want a case to match without the need for access to variables describing what was captured.

match command.split():
    case ["right", _]:
        ...  # handle right

The following snippet matches, for example, right 90 followed by anything else. This could be useful to allow a string of commands, for example, with the match called recursively.

match command.split():
    case ["right", degrees, *_]:
        ...  # handle right degrees

Matching types

Pattern matching isn’t limited to matching lists, and literals aren’t limited to strings – we can be much more expressive! Here are a few examples:

Dictionaries

The following snippet matches a command given as a dictionary instead of a list. In this case only the keys specified in the pattern need to match and all other parts of the dictionary are ignored.

program = {"command": "right", "degrees": 100}
 
match program:
    case {"command": "right"}:
        ...  # handle right turn without knowing degrees

To capture any remaining keys in the dictionary we can modify the example slightly so that we capture them in a similar way to **kwargs in method definitions:

match program:
    case {"command": "right", **other}:
        ...  # handle right turn capturing other parts of dictionary

Classes

The following snippet demonstrates how to match classes, where we can specify a pattern based on the attributes of that class.

from dataclasses import dataclass
 
@dataclass
class TurnCommand:
    command: str
    degrees: int
 
command = TurnCommand("right", 90)
match command:
    case TurnCommand("right"):
        ...  # handle turn right command

This pattern matching style works with any class that has a defined attribute ordering, for example data classes or named tuples. For classes where there is no explicitly defined ordering, a new special class attribute has been added to the language: __match_args__. For example, we could have written TurnCommand as a normal Python class, but with our constructor accepting arguments the opposite way around:

class TurnCommand:
    __match_args__ = ("command", "degrees")
    def __init__(self, deg: int, cmd: str):
        self.degrees = deg
        self.command = cmd
 
command = TurnCommand(90, "right")
match command:
    case TurnCommand("right"):
        ...  # handle turn right command

Builtin types

Similar to classes, we can also match against any builtin type, not just string literals. Note that command.split() always produces strings, so for an int pattern to match, the amount must already be an int; here the program is given as a list of pre-parsed tokens. We can capture an integer input with the following snippet:

program = ["forward", 10]
match program:
    case ["forward", int() as amount]:
        ...  # handle forward amount

This captures the amount only if it is an instance of int, rather than matching anything which is given in this position. The pattern is a type check and does not convert the value.

As a convenience, for some builtin types it is also possible to capture directly within the type by using a positional parameter. PEP 634 defines that the following types work in this way: bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple. The following snippet is equivalent to the previous example:

match program:
    case ["forward", int(amount)]:
        ...  # handle forward amount

Final thoughts

Structural pattern matching is a long awaited new feature for the Python language which finally cleans up those pesky long if-elif-else blocks with hidden logic inside them. The way matching works is very intuitive to those who are familiar with tuple unpacking, and should remove many use cases where people reach for the dreaded re module.

All told, this new feature should make code significantly more expressive, semantically meaningful, and easier to maintain. Personally I’m super excited to get more familiar with it in practice, and learn what works and what doesn’t.