Tuesday, February 23, 2016

Python 3 Type Hints and Static Analysis

Python 3.5 introduced the new typing module, which provides standard library support for using function annotations as optional type hints. That opens the door to new and interesting tools for static type checking, like mypy, and possibly to automated type-based optimization in the future. Type hints are specified in PEP-483 and PEP-484.

In this tutorial I explore the possibilities that type hints present and show you how to use mypy to statically analyze your Python programs and significantly improve the quality of your code.

Type Hints

Type hints are built on top of function annotations. Check out Python 3 Function Annotations for a detailed deep dive into function annotations. Briefly, function annotations let you annotate the arguments and return value of a function or method with arbitrary metadata. Type hints are a special case of function annotations that specifically annotate function arguments and the return value with standard type information. Function annotations in general and type hints in particular are totally optional. Let’s take a look at a quick example:
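A minimal sketch of such a function follows. The name reverse_slice and its parameters are hypothetical, chosen only for illustration:

```python
def reverse_slice(text: str, start: int, end: int) -> str:
    """Return the slice text[start:end], reversed."""
    return text[start:end][::-1]


print(reverse_slice('abcdef', 3, 5))   # 'ed'

# The hints are stored on the function object; Python does not act on them.
print(reverse_slice.__annotations__)
```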

The arguments were annotated with their types, and so was the return value. But it’s critical to realize that Python ignores this completely. It makes the type information available through the __annotations__ attribute of the function object, but that’s about it.

To verify that Python really ignores the type hints, let’s totally mess up the type hints:
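As a sketch, here is the same hypothetical reverse_slice function with deliberately wrong hints. The body is unchanged, and so is the behavior at runtime:

```python
# Deliberately wrong hints: none of them match how the function is used.
def reverse_slice(text: float, start: str, end: bool) -> dict:
    return text[start:end][::-1]


# The interpreter never consults the hints, so the call still works.
print(reverse_slice('abcdef', 3, 5))   # 'ed'
```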

As you can see, the code behaves the same, regardless of the type hints.

Motivation for Type Hints

OK. Type hints are optional. Type hints are totally ignored by Python. What’s the point of them, then? Well, there are several good reasons:

  • static analysis
  • IDE support
  • standard documentation

I’ll dive into static analysis with Mypy later. IDE support already started with PyCharm 5’s support for type hints. Standard documentation helps developers, who can figure out the types of the arguments and the return value just by looking at a function signature, and it helps automated documentation generators, which can extract the type information from the hints.

The typing Module

The typing module contains types designed to support type hints. Why not just use existing Python types like int, str, list and dict? You can definitely use these types, but due to Python’s dynamic typing, beyond basic types you don’t get a lot of information. For example, if you want to specify that an argument may be a mapping between a string and an integer, there is no way to do it with standard Python types. With the typing module, it’s as easy as:
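For illustration, a mapping from strings to integers is written Dict[str, int]. The function total below is a hypothetical example that uses it as an argument hint:

```python
from typing import Dict


def total(counts: Dict[str, int]) -> int:
    """Sum the integer values of a str -> int mapping."""
    return sum(counts.values())


print(total({'spam': 3, 'eggs': 1}))   # 4
```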

Let’s look at a more complete example: a function that takes two arguments. One of them is a list of dictionaries where each dictionary contains keys that are strings and values that are integers. The other argument is either a string or an integer. The typing module allows precise specifications of such complicated arguments.
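One way such a signature might look is sketched below. The function name sum_values and its behavior are assumptions made up for this example; only the two argument types come from the description above:

```python
from typing import Dict, List, Union


def sum_values(records: List[Dict[str, int]], key: Union[str, int]) -> int:
    """Add up the values stored under key in every dictionary."""
    # The dictionaries use string keys, so an integer key is converted first.
    key = str(key)
    return sum(record.get(key, 0) for record in records)


records = [{'1': 5, 'x': 2}, {'1': 7}]
print(sum_values(records, 1))     # 12
print(sum_values(records, 'x'))   # 2
```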

Useful Types

Let’s see some of the more interesting types from the typing module.

The Callable type allows you to specify the signature of a function that is passed as an argument or returned as a result, since Python treats functions as first-class citizens. The syntax for callables is Callable[[Arg1Type, Arg2Type], ReturnType]: a list of argument types (again, possibly from the typing module) followed by the return type. If that’s confusing, here is an example:
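A small sketch of a callback hint follows. The process and record_error functions are hypothetical; the on_error parameter has the Callable[[Exception, int], None] type:

```python
from typing import Callable, List


def process(items: List[str],
            on_error: Callable[[Exception, int], None]) -> List[int]:
    """Convert strings to integers, reporting failures through on_error."""
    results = []
    for index, item in enumerate(items):
        try:
            results.append(int(item))
        except ValueError as e:
            # Pass the exception and the position of the bad item.
            on_error(e, index)
    return results


errors = []


def record_error(error: Exception, index: int) -> None:
    errors.append((type(error).__name__, index))


parsed = process(['1', 'two', '3'], record_error)
print(parsed)   # [1, 3]
print(errors)   # [('ValueError', 1)]
```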

The on_error callback function is specified as a function that takes an Exception and an integer as arguments and returns nothing.

The Any type means that a static type checker should allow any operation as well as assignment to any other type. Every type is compatible with Any, and Any is compatible with every type.

The Union type you saw earlier is useful when an argument can have multiple types, which is very common in Python. In the following example, the verify_config() function accepts a config argument, which can be either a Config object or a file name. If it’s a file name, it calls another function to parse the file into a Config object and return it.
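The sketch below shows one possible shape for this. The Config class, the parse_config_file() helper and the "key=value" file format are all assumptions invented for the example:

```python
from typing import Union


class Config:
    """Hypothetical configuration object holding a dict of settings."""

    def __init__(self, settings: dict) -> None:
        self.settings = settings


def parse_config_file(filename: str) -> Config:
    """Hypothetical parser: reads one 'key=value' pair per line."""
    settings = {}
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line:
                key, _, value = line.partition('=')
                settings[key.strip()] = value.strip()
    return Config(settings)


def verify_config(config: Union[Config, str]) -> Config:
    # A string is treated as a file name and parsed into a Config first.
    if isinstance(config, str):
        config = parse_config_file(config)
    # Hypothetical check; a real project would validate its own settings.
    if 'name' not in config.settings:
        raise ValueError('missing "name" setting')
    return config


cfg = verify_config(Config({'name': 'demo'}))
print(cfg.settings)   # {'name': 'demo'}
```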

The Optional type means the argument may be None too. Optional[T] is equivalent to Union[T, None].
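A short sketch, using a hypothetical greet() function, shows Optional in a signature and confirms the equivalence with Union:

```python
from typing import Optional, Union


def greet(name: Optional[str] = None) -> str:
    """Greet by name, or use a generic greeting when name is None."""
    if name is None:
        return 'Hello, stranger'
    return 'Hello, ' + name


print(greet())          # Hello, stranger
print(greet('Guido'))   # Hello, Guido

# Optional[T] is shorthand for Union[T, None].
print(Optional[str] == Union[str, None])   # True
```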

There are many more types that denote various capabilities such as Iterable, Iterator, Reversible, SupportsInt, SupportsFloat, Sequence, MutableSequence and IO. Check out the typing module documentation for the full list.

The main thing is that you can specify the type of arguments in a very fine-grained way that supports the Python type system at a high fidelity and allows generics and abstract base classes too.

Forward References

Sometimes you want to refer to a class in a type hint within one of its methods. For example, let’s assume that class A can perform some merge operation that takes another instance of A, merges with itself and returns the result. Here is a naive attempt to use type hints to specify it:
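A sketch of the naive version is shown below. The class definition is wrapped in try/except only so that the snippet runs to completion. This assumes an interpreter that evaluates annotations when the function is defined, as Python 3.5 does; interpreters that defer annotation evaluation will not raise here:

```python
error = None

try:
    class A:
        # The name A is not bound yet while the class body is executing.
        def merge(self, other: A) -> A:
            return self
except NameError as e:
    error = e

# With eager evaluation this prints the NameError message;
# with deferred evaluation it prints None.
print(error)
```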

What happened? The class A is not defined yet when the type hint for its merge() method is checked by Python, so the class A can’t be used at this point (directly). The solution is quite simple, and I’ve seen it used before by SQLAlchemy. You just specify the type hint as a string. Python will understand that it is a forward reference and will do the right thing:
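A minimal sketch of the string form follows. The items attribute and the merge logic are hypothetical, added only to make the class usable:

```python
class A:
    def __init__(self, items: list) -> None:
        self.items = items

    # 'A' is a forward reference: a string naming a class defined later.
    def merge(self, other: 'A') -> 'A':
        return A(self.items + other.items)


merged = A([1]).merge(A([2]))
print(merged.items)   # [1, 2]
```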

Type Aliases

One downside of using type hints for long type specifications is that it can clutter the code and make it less readable, even if it provides a lot of type information. You can alias types just like any other object. It’s as simple as:
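As an illustration, the aliases Records and Key below are hypothetical names for two longer type specifications, and sum_values() is a made-up function that uses them:

```python
from typing import Dict, List, Union

# Aliases are ordinary assignments; the names are chosen for this example.
Records = List[Dict[str, int]]
Key = Union[str, int]


def sum_values(records: Records, key: Key) -> int:
    """Add up the values stored under key in every dictionary."""
    return sum(record.get(str(key), 0) for record in records)


print(sum_values([{'a': 1}, {'a': 2, 'b': 3}], 'a'))   # 3
```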

The get_type_hints() Helper Function

The typing module provides the get_type_hints() function, which provides information about the argument types and the return value. While the __annotations__ attribute returns type hints because they are just annotations, I still recommend that you use the get_type_hints() function because it resolves forward references. Also, if you specify a default of None to one of the arguments, the get_type_hints() function will automatically return its type as Union[T, NoneType] if you just specified T (this was the behavior at the time of writing; since Python 3.11 the annotation is returned unchanged). Let’s see the difference using the A.merge() method defined earlier:

The __annotations__ attribute simply returns the annotation value as is. In this case it’s just the string ‘A’ and not the A class object, to which ‘A’ is just a forward reference.
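The comparison can be sketched as follows, with a hypothetical version of A.merge() that has a None default. How the None default is reported depends on the Python version, so the comments only state what holds in either case:

```python
from typing import get_type_hints


class A:
    def merge(self, other: 'A' = None) -> 'A':
        return self


# Raw annotations: the forward references are still plain strings.
print(A.merge.__annotations__)

# get_type_hints() resolves the strings to the actual class A.
hints = get_type_hints(A.merge)
print(hints)
print(hints['return'] is A)   # True
```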

The get_type_hints() function converted the type of the other argument to a Union of A (the class) and NoneType because of the None default argument. The return type was also converted to the class A.

The Decorators

Type hints are a specialization of function annotations, and they can also work side by side with other function annotations.

In order to do that, the typing module provides two decorators: @no_type_check and @no_type_check_decorator. The @no_type_check decorator can be applied to either a class or a function. It adds the __no_type_check__ attribute to the function (or each method of the class). This way, type checkers will know to ignore annotations that are not type hints.

It is a little cumbersome because if you write a library that will be used broadly, you must assume that a type checker will be used, and if you want to annotate your functions with non-type hints, you must also decorate them with @no_type_check.

A common scenario when using regular function annotations is also to have a decorator that operates over them. You also want to turn off type checking in this case. One option is to use the @no_type_check decorator in addition to your decorator, but that gets old. Instead, the @no_type_check_decorator can be used to decorate your decorator so that it also behaves like @no_type_check (adds the __no_type_check__ attribute).

Let me illustrate all these concepts. If you try to call get_type_hints() (as any type checker will do) on a function that is annotated with a regular string annotation, get_type_hints() will interpret it as a forward reference:

To avoid it, add the @no_type_check decorator, and get_type_hints simply returns an empty dict, while the __annotations__ attribute returns the annotations:
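Both cases can be sketched with two hypothetical functions, plain() and checked_off(). The exact exception raised for the undecorated function is not relied on here, so the snippet catches Exception broadly:

```python
from typing import get_type_hints, no_type_check


def plain(a: 'some annotation'):
    pass


# The string is treated as a forward reference and fails to resolve.
try:
    plain_hints = get_type_hints(plain)
except Exception as e:
    plain_hints = None
    print(type(e).__name__)


@no_type_check
def checked_off(a: 'some annotation'):
    pass


# With @no_type_check, get_type_hints() gives up early and returns {}.
print(get_type_hints(checked_off))    # {}
# The raw annotations are still available.
print(checked_off.__annotations__)    # {'a': 'some annotation'}
```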

Now, suppose we have a decorator that prints the annotations dict. You can decorate it with the @no_type_check_decorator and then decorate the function and not worry about some type checker calling get_type_hints() and getting confused. This is probably a best practice for every decorator that operates on annotations. Don’t forget the @functools.wraps, otherwise the annotations will not be copied to the decorated function and everything will fall apart. This is covered in detail in Python 3 Function Annotations.

Now, you can decorate the function just with @print_annotations, and whenever it is called it will print its annotations.

Calling get_type_hints() is also safe and returns an empty dict.
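A sketch of such a decorator is given below. Note the substitution: instead of @no_type_check_decorator, which has been deprecated in recent Python releases, this version applies no_type_check() to the wrapper directly, which has the same effect of setting __no_type_check__. The names print_annotations and f are illustrative:

```python
import functools
from typing import get_type_hints, no_type_check


def print_annotations(func):
    @functools.wraps(func)   # copies metadata, including annotations
    def decorated(*args, **kwargs):
        print(func.__annotations__)
        return func(*args, **kwargs)
    # Mark the wrapper so that get_type_hints() skips it.
    return no_type_check(decorated)


@print_annotations
def f(a: 'some annotation'):
    return a


result = f(4)               # prints {'a': 'some annotation'}
print(result)               # 4
print(get_type_hints(f))    # {}
```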

Static Analysis With Mypy

Mypy is a static type checker that was the inspiration for type hints and the typing module. Guido van Rossum himself is a co-author of both PEP-483 and PEP-484.

Installing Mypy

Mypy is in very active development, and as of this writing the package on PyPI is out of date and doesn’t work with Python 3.5. To use Mypy with Python 3.5, get the latest from Mypy’s repository on GitHub. It’s as simple as:

Playing With Mypy

Once you have Mypy installed, you can just run Mypy on your programs. The following program defines a function that expects a list of strings. It then invokes the function with a list of integers.
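A program of that shape might look like the sketch below. The function name case_insensitive_dedupe is hypothetical, and the bad call is wrapped in try/except only so that the snippet runs to completion and can show the error message:

```python
from typing import List


def case_insensitive_dedupe(data: List[str]) -> List[str]:
    """Lowercase all values and remove duplicates."""
    return sorted(set(x.lower() for x in data))


print(case_insensitive_dedupe(['a', 'A', 'b']))   # ['a', 'b']

# Wrong argument type: a list of integers instead of strings.
try:
    case_insensitive_dedupe([1, 2])
except AttributeError as e:
    runtime_error = str(e)
    print(runtime_error)   # 'int' object has no attribute 'lower'
```

Without the try/except, the second call ends the program with an AttributeError traceback.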

When running the program, it obviously fails at runtime with the following error:

What’s the problem with that? The problem is that it’s not clear immediately even in this very simple case what the root cause is. Is it an input type problem? Or maybe the code itself is wrong and shouldn’t try to call the lower() method on the ‘int’ object. Another issue is that if you don’t have 100% test coverage (and, let’s be honest, none of us do), then such issues can lurk in some untested, rarely used code path and be detected at the worst time in production.

Static typing, aided by type hints, gives you an extra safety net by making sure you always call your functions (annotated with type hints) with the right types. Here is the output of Mypy:

This is straightforward, points directly to the problem, and doesn’t require running a lot of tests. Another benefit of static type checking is that if you commit to it, you can skip dynamic type checking except when parsing external input (reading files, incoming network requests or user input). It also builds a lot of confidence as far as refactoring goes.

Conclusion

Type hints and the typing module are totally optional additions to the expressiveness of Python. While they may not suit everyone’s taste, for large projects and large teams they can be indispensable. The evidence is that large teams already use static type checking. Now that type information is standardized, it will be easier to share code, utilities and tools that use it. IDEs like PyCharm already take advantage of it to provide a better developer experience.


by Gigi Sayfan via Envato Tuts+ Code
