Python Enhancement Proposals

PEP 589 – TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys

Author:
Jukka Lehtosalo <jukka.lehtosalo at iki.fi>
Sponsor:
Guido van Rossum <guido at Python.org>
BDFL-Delegate:
Guido van Rossum <guido at Python.org>
Discussions-To:
Typing-SIG list
Status:
Final
Type:
Standards Track
Topic:
Typing
Created:
20-Mar-2019
Python-Version:
3.8
Post-History:

Resolution:
Typing-SIG message

Attention

This PEP is a historical document: see TypedDict and typing.TypedDict for up-to-date specs and documentation. Canonical typing specs are maintained at the typing specs site; runtime typing behaviour is described in the CPython documentation.

See the typing specification update process for how to propose changes to the typing spec.

Abstract

PEP 484 defines the type Dict[K, V] for uniform dictionaries, where each value has the same type, and arbitrary key values are supported. It doesn’t properly support the common pattern where the type of a dictionary value depends on the string value of the key. This PEP proposes a type constructor typing.TypedDict to support the use case where a dictionary object has a specific set of string keys, each with a value of a specific type.

Here is an example where PEP 484 doesn’t allow us to annotate satisfactorily:

movie = {'name': 'Blade Runner',
         'year': 1982}

This PEP proposes the addition of a new type constructor, called TypedDict, to allow the type of movie to be represented precisely:

from typing import TypedDict

class Movie(TypedDict):
    name: str
    year: int

Now a type checker should accept this code:

movie: Movie = {'name': 'Blade Runner',
                'year': 1982}

Motivation

Representing an object or structured data using (potentially nested) dictionaries with string keys (instead of a user-defined class) is a common pattern in Python programs. Representing JSON objects is perhaps the canonical use case, and this is popular enough that Python ships with a JSON library. This PEP proposes a way to allow such code to be type checked more effectively.

More generally, representing pure data objects using only Python primitive types such as dictionaries, strings and lists has had certain appeal. They are easy to serialize and deserialize even when not using JSON. They trivially support various useful operations with no extra effort, including pretty-printing (through str() and the pprint module), iteration, and equality comparisons.

PEP 484 doesn’t properly support the use cases mentioned above. Let’s consider a dictionary object that has exactly two valid string keys, 'name' with value type str, and 'year' with value type int. The PEP 484 type Dict[str, Any] would be suitable, but it is too lenient, as arbitrary string keys can be used, and arbitrary values are valid. Similarly, Dict[str, Union[str, int]] is too general, as the value for key 'name' could be an int, and arbitrary string keys are allowed. Also, the type of a subscription expression such as d['name'] (assuming d to be a dictionary of this type) would be Union[str, int], which is too wide.

Dataclasses are a more recent alternative to solve this use case, but there is still a lot of existing code that was written before dataclasses became available, especially in large existing codebases where type hinting and checking has proven to be helpful. Unlike dictionary objects, dataclasses don’t directly support JSON serialization, though there is a third-party package that implements it [1].

Specification

A TypedDict type represents dictionary objects with a specific set of string keys, and with specific value types for each valid key. Each string key can be either required (it must be present) or non-required (it doesn’t need to exist).

This PEP proposes two ways of defining TypedDict types. The first uses a class-based syntax. The second is an alternative assignment-based syntax that is provided for backwards compatibility, to allow the feature to be backported to older Python versions. The rationale is similar to why PEP 484 supports a comment-based annotation syntax for Python 2.7: type hinting is particularly useful for large existing codebases, and these often need to run on older Python versions. The two syntax options parallel the syntax variants supported by typing.NamedTuple. Other proposed features include TypedDict inheritance and totality (specifying whether keys are required or not).

This PEP also provides a sketch of how a type checker is expected to support type checking operations involving TypedDict objects. Similar to PEP 484, this discussion is left somewhat vague on purpose, to allow experimentation with a wide variety of different type checking approaches. In particular, type compatibility should be based on structural compatibility: a more specific TypedDict type can be compatible with a smaller (more general) TypedDict type.

Class-based Syntax

A TypedDict type can be defined using the class definition syntax with typing.TypedDict as the sole base class:

from typing import TypedDict

class Movie(TypedDict):
    name: str
    year: int

Movie is a TypedDict type with two items: 'name' (with type str) and 'year' (with type int).

A type checker should validate that the body of a class-based TypedDict definition conforms to the following rules:

  • The class body should only contain lines with item definitions of the form key: value_type, optionally preceded by a docstring. The syntax for item definitions is identical to attribute annotations, but there must be no initializer, and the key name actually refers to the string value of the key instead of an attribute name.
  • Type comments cannot be used with the class-based syntax, for consistency with the class-based NamedTuple syntax. (Note that it would not be sufficient to support type comments for backwards compatibility with Python 2.7, since the class definition may have a total keyword argument, as discussed below, and this isn’t valid syntax in Python 2.7.) Instead, this PEP provides an alternative, assignment-based syntax for backwards compatibility, discussed in Alternative Syntax.
  • String literal forward references are valid in the value types.
  • Methods are not allowed, since the runtime type of a TypedDict object will always be just dict (it is never a subclass of dict).
  • Specifying a metaclass is not allowed.

An empty TypedDict can be created by only including pass in the body (if there is a docstring, pass can be omitted):

class EmptyDict(TypedDict):
    pass
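
As a small illustration of the docstring remark above, the sketch below defines an empty TypedDict whose body consists of a docstring only. The class name NoItems is invented for this example, and the runtime inspection through __annotations__ is just one convenient way to look at the result, not something this section requires.

```python
from typing import TypedDict


class NoItems(TypedDict):
    """A TypedDict with no items; the docstring stands in for ``pass``."""


# The type defines no keys, so the empty dictionary is the only valid value.
empty: NoItems = {}

print(NoItems.__annotations__)  # no item definitions are recorded
print(type(NoItems()))          # calling the type yields a plain dict
```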

Using TypedDict Types

Here is an example of how the type Movie can be used:

movie: Movie = {'name': 'Blade Runner',
                'year': 1982}

An explicit Movie type annotation is generally needed, as otherwise an ordinary dictionary type could be assumed by a type checker, for backwards compatibility. When a type checker can infer that a constructed dictionary object should be a TypedDict, an explicit annotation can be omitted. A typical example is a dictionary object as a function argument. In this example, a type checker is expected to infer that the dictionary argument should be understood as a TypedDict:

def record_movie(movie: Movie) -> None: ...

record_movie({'name': 'Blade Runner', 'year': 1982})

Another example where a type checker should treat a dictionary display as a TypedDict is in an assignment to a variable with a previously declared TypedDict type:

movie: Movie
...
movie = {'name': 'Blade Runner', 'year': 1982}

Operations on movie can be checked by a static type checker:

movie['director'] = 'Ridley Scott'  # Error: invalid key 'director'
movie['year'] = '1982'  # Error: invalid value type ("int" expected)

The code below should be rejected, since 'title' is not a valid key, and the 'name' key is missing:

movie2: Movie = {'title': 'Blade Runner',
                 'year': 1982}

The created TypedDict type object is not a real class object. Here are the only uses of the type a type checker is expected to allow:

  • It can be used in type annotations and in any context where an arbitrary type hint is valid, such as in type aliases and as the target type of a cast.
  • It can be used as a callable object with keyword arguments corresponding to the TypedDict items. Non-keyword arguments are not allowed. Example:
    m = Movie(name='Blade Runner', year=1982)

    When called, the TypedDict type object returns an ordinary dictionary object at runtime:

    print(type(m))  # <class 'dict'>
    
  • It can be used as a base class, but only when defining a derived TypedDict. This is discussed in more detail below.

In particular, TypedDict type objects cannot be used in isinstance() tests such as isinstance(d, Movie). The reason is that there is no existing support for checking types of dictionary item values, since isinstance() does not work with many PEP 484 types, including common ones like List[str]. This would be needed for cases like this:

class Strings(TypedDict):
    items: List[str]

print(isinstance({'items': [1]}, Strings))    # Should be False
print(isinstance({'items': ['x']}, Strings))  # Should be True

The above use case is not supported. This is consistent with how isinstance() is not supported for List[str].
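
To show what "not supported" looks like in practice, here is a minimal sketch that wraps the isinstance() call in a try block. It assumes the behaviour of the typing module in current CPython releases, where such a check raises TypeError; the helper name isinstance_is_rejected is made up for this example.

```python
from typing import List, TypedDict


class Strings(TypedDict):
    items: List[str]


def isinstance_is_rejected(value: object) -> bool:
    """Return True if isinstance() refuses the TypedDict type at runtime."""
    try:
        isinstance(value, Strings)
    except TypeError:
        # Assumed CPython behaviour: TypedDict types reject instance checks.
        return True
    return False


print(isinstance_is_rejected({'items': ['x']}))
```

The point of the sketch is only that the check fails loudly rather than silently returning a misleading answer.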

Inheritance

It is possible for a TypedDict type to inherit from one or more TypedDict types using the class-based syntax. In this case the TypedDict base class should not be included. Example:

class BookBasedMovie(Movie):
    based_on: str

Now BookBasedMovie has keys name, year, and based_on. It is equivalent to this definition, since TypedDict types use structural compatibility:

class BookBasedMovie(TypedDict):
    name: str
    year: int
    based_on: str

Here is an example of multiple inheritance:

class X(TypedDict):
    x: int

class Y(TypedDict):
    y: str

class XYZ(X, Y):
    z: bool

The TypedDict XYZ has three items: x (type int), y (type str), and z (type bool).

A TypedDict cannot inherit from both a TypedDict type and a non-TypedDict base class.
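
To make the equivalence between an inherited and a flattened definition concrete, the following sketch compares the two at runtime. FlatBookBasedMovie is a name invented here so that both variants can coexist in one module; comparing __annotations__ is an informal runtime view and not how a type checker establishes structural compatibility.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


class BookBasedMovie(Movie):
    based_on: str


class FlatBookBasedMovie(TypedDict):  # hypothetical flattened twin
    name: str
    year: int
    based_on: str


# Both definitions record the same keys with the same value types.
print(BookBasedMovie.__annotations__ == FlatBookBasedMovie.__annotations__)

# The same dictionary display is therefore valid for either type.
inherited: BookBasedMovie = {'name': 'Blade Runner', 'year': 1982,
                             'based_on': 'Do Androids Dream of Electric Sheep?'}
flat: FlatBookBasedMovie = dict(inherited)  # a checker may want a literal here
```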

Additional notes on TypedDict class inheritance:

  • Changing a field type of a parent TypedDict class in a subclass is not allowed. Example:
    class X(TypedDict):
        x: str

    class Y(X):
        x: int  # Type check error: cannot overwrite TypedDict field "x"


    In the example above, the class annotations of Y still report type str for key x:

    print(Y.__annotations__)  # {'x': <class 'str'>}

  • Multiple inheritance does not allow conflicting types for a field with the same name:
    class X(TypedDict):
        x: int

    class Y(TypedDict):
        x: str

    class XYZ(X, Y):  # Type check error: cannot overwrite TypedDict field "x" while merging
        xyz: bool
    

Totality

By default, all keys must be present in a TypedDict. It is possible to override this by specifying totality. Here is how to do this using the class-based syntax:

class Movie(TypedDict, total=False):
    name: str
    year: int

This means that a Movie TypedDict can have any of the keys omitted. Thus these are valid:

m: Movie = {}
m2: Movie = {'year': 2015}

A type checker is only expected to support a literal False or True as the value of the total argument. True is the default, and makes all items defined in the class body be required.

The totality flag only applies to items defined in the body of the TypedDict definition. Inherited items won’t be affected, and instead use totality of the TypedDict type where they were defined. This makes it possible to have a combination of required and non-required keys in a single TypedDict type.
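
The combination of required and non-required keys described above can be sketched as follows. The class names MovieBase and Movie are chosen for illustration. The __required_keys__ and __optional_keys__ attributes are introspection helpers offered by the typing module in Python 3.9 and later; they are an assumption of this example and are not defined by this PEP.

```python
from typing import TypedDict


class MovieBase(TypedDict):
    name: str  # required: MovieBase uses the default total=True


class Movie(MovieBase, total=False):
    year: int  # non-required: defined in a total=False class body


ok: Movie = {'name': 'Alien'}                     # 'year' may be omitted
also_ok: Movie = {'name': 'Alien', 'year': 1979}
# bad: Movie = {'year': 1979}  # a checker should reject this: 'name' is missing

# Assumed Python 3.9+ introspection attributes.
print(sorted(Movie.__required_keys__))
print(sorted(Movie.__optional_keys__))
```

The inherited key keeps the totality of the class where it was defined, so only 'year' becomes optional.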

Alternative Syntax

This PEP also proposes an alternative syntax that can be backported to older Python versions such as 3.5 and 2.7 that don’t support the variable definition syntax introduced in PEP 526. It resembles the traditional syntax for defining named tuples:

Movie = TypedDict('Movie', {'name': str, 'year': int})

It is also possible to specify totality using the alternative syntax:

Movie = TypedDict('Movie',
                  {'name': str, 'year': int},
                  total=False)

The semantics are equivalent to the class-based syntax. This syntax doesn’t support inheritance, however, and there is no way to have both required and non-required fields in a single type. The motivation for this is keeping the backwards compatible syntax as simple as possible while covering the most common use cases.

A type checker is only expected to accept a dictionary display expression as the second argument to TypedDict. In particular, a variable that refers to a dictionary object does not need to be supported, to simplify implementation.
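
As a rough check of the claim that both syntaxes have the same semantics, the sketch below defines one type each way and compares what the runtime records. MovieA and MovieB are hypothetical names used so both definitions can live side by side; __total__ is the runtime attribute through which the typing module exposes the totality flag.

```python
from typing import TypedDict


class MovieA(TypedDict, total=False):  # class-based syntax
    name: str
    year: int


# Alternative syntax: the second argument is a dictionary display.
MovieB = TypedDict('MovieB', {'name': str, 'year': int}, total=False)

print(MovieA.__annotations__ == MovieB.__annotations__)
print(MovieA.__total__, MovieB.__total__)

# Either type accepts a dictionary with any subset of the keys.
a: MovieA = {'year': 2015}
b: MovieB = {'year': 2015}
```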

Type Consistency

Informally speaking, type consistency is a generalization of the is-subtype-of relation to support the Any type. It is defined more formally in PEP 483. This section introduces the new, non-trivial rules needed to support type consistency for TypedDict types.

First, any TypedDict type is consistent with Mapping[str, object]. Second, a TypedDict type A is consistent with TypedDict B if A is structurally compatible with B. This is true if and only if both of these conditions are satisfied:

  • For each key in B, A has the corresponding key and the corresponding value type in A is consistent with the value type in B. For each key in B, the value type in B is also consistent with the corresponding value type in A.
  • For each required key in B, the corresponding key is required in A. For each non-required key in B, the corresponding key is not required in A.
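
The two conditions above can be approximated at runtime with a few lines of code. This is only a sketch under stated assumptions: the function name is_structurally_compatible is invented, value types are compared with ==, which models invariance but ignores Any and every other subtlety of type consistency, and the __required_keys__ attribute is the Python 3.9+ introspection helper from the typing module rather than anything this PEP defines. A real type checker works on static types, not on class objects.

```python
from typing import Any, Optional, TypedDict


def is_structurally_compatible(a: Any, b: Any) -> bool:
    """Approximate whether TypedDict type a is consistent with b."""
    a_items, b_items = a.__annotations__, b.__annotations__
    for key, b_type in b_items.items():
        # Condition 1: a has the key and the value types match both ways
        # (simplified here to plain equality of the annotations).
        if key not in a_items or a_items[key] != b_type:
            return False
        # Condition 2: the key is required in a exactly when it is in b.
        if (key in a.__required_keys__) != (key in b.__required_keys__):
            return False
    return True


class Movie(TypedDict):
    name: str
    year: int


class BookBasedMovie(Movie):
    based_on: str


class PartialMovie(TypedDict, total=False):
    name: str
    year: int


class MaybeYearMovie(TypedDict):
    name: str
    year: Optional[int]


print(is_structurally_compatible(BookBasedMovie, Movie))  # extra key is fine
print(is_structurally_compatible(Movie, BookBasedMovie))  # 'based_on' missing
print(is_structurally_compatible(Movie, PartialMovie))    # requiredness differs
print(is_structurally_compatible(Movie, MaybeYearMovie))  # value types differ
```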

Discussion:

  • Value types behave invariantly, since TypedDict objects are mutable. This is similar to mutable container types such as List and Dict. Example where this is relevant:
    class A(TypedDict):
        x: Optional[int]

    class B(TypedDict):
        x: int

    def f(a: A) -> None:
        a['x'] = None

    b: B = {'x': 0}
    f(b)  # Type check error: 'B' not compatible with 'A'
    b['x'] + 1  # Runtime error: None + 1
    
  • A TypedDict type with a required key is not consistent with a TypedDict type where the same key is a non-required key, since the latter allows keys to be deleted. Example where this is relevant:
    class A(TypedDict, total=False):
        x: int

    class B(TypedDict):
        x: int

    def f(a: A) -> None:
        del a['x']

    b: B = {'x': 0}
    f(b)  # Type check error: 'B' not compatible with 'A'
    b['x'] + 1  # Runtime KeyError: 'x'
    
  • A TypedDict type A with no key 'x' is not consistent with a TypedDict type with a non-required key 'x', since at runtime the key 'x' could be present and have an incompatible type (which may not be visible through A due to structural subtyping). Example:
    class A(TypedDict, total=False):
        x: int
        y: int

    class B(TypedDict, total=False):
        x: int

    class C(TypedDict, total=False):
        x: int
        y: str

    def f(a: A) -> None:
        a['y'] = 1

    def g(b: B) -> None:
        f(b)  # Type check error: 'B' incompatible with 'A'

    c: C = {'x': 0, 'y': 'foo'}
    g(c)
    c['y'] + 'bar'  # Runtime error: int + str
    
  • A TypedDict isn’t consistent with any Dict[...] type, since dictionary types allow destructive operations, including clear(). They also allow arbitrary keys to be set, which would compromise type safety. Example:
    class A(TypedDict):
        x: int

    class B(A):
        y: str

    def f(d: Dict[str, int]) -> None:
        d['y'] = 0

    def g(a: A) -> None:
        f(a)  # Type check error: 'A' incompatible with Dict[str, int]

    b: B = {'x': 0, 'y': 'foo'}
    g(b)
    b['y'] + 'bar'  # Runtime error: int + str
    
  • A TypedDict with all int values is not consistent with Mapping[str, int], since there may be additional non-int values not visible through the type, due to structural subtyping. These can be accessed using the values() and items() methods in Mapping, for example. Example:
    class A(TypedDict):
        x: int

    class B(TypedDict):
        x: int
        y: str

    def sum_values(m: Mapping[str, int]) -> int:
        n = 0
        for v in m.values():
            n += v  # Runtime error
        return n

    def f(a: A) -> None:
        sum_values(a)  # Error: 'A' incompatible with Mapping[str, int]

    b: B = {'x': 0, 'y': 'foo'}
    f(b)
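
In contrast to the Mapping[str, int] case, the first rule of this section says that every TypedDict type is consistent with Mapping[str, object]. A minimal sketch of why that is safe: a function that only reads values as object cannot be surprised by keys hidden through structural subtyping. The function name describe is made up for this example.

```python
from typing import Mapping, TypedDict


class Movie(TypedDict):
    name: str
    year: int


def describe(m: Mapping[str, object]) -> str:
    # Only read-only Mapping operations are used, and every value is
    # treated as a plain object, so no assumption about hidden keys is made.
    return ', '.join(f'{key}={value!r}' for key, value in m.items())


movie: Movie = {'name': 'Blade Runner', 'year': 1982}
print(describe(movie))  # accepted: Movie is consistent with Mapping[str, object]
```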
    

Supported and Unsupported Operations

Type checkers should support restricted forms of most dict operations on TypedDict objects. The guiding principle is that operations not involving Any types should be rejected by type checkers if they may violate runtime type safety. Here are some of the most important type safety violations to prevent:

  1. A required key is missing.
  2. A value has an invalid type.
  3. A key that is not defined in the TypedDict type is added.

A key that is not a literal should generally be rejected, since its value is unknown during type checking, and thus can cause some of the above violations. (Use of Final Values and Literal Types generalizes this to cover final names and literal types.)

The use of a key that is not known to exist should be reported as an error, even if this wouldn’t necessarily generate a runtime type error. These are often mistakes, and these may insert values with an invalid type if structural subtyping hides the types of certain items. For example, d['x'] = 1 should generate a type check error if 'x' is not a valid key for d (which is assumed to be a TypedDict type).

Extra keys included in TypedDict object construction should also be caught. In this example, the director key is not defined in Movie and is expected to generate an error from a type checker:

m: Movie = dict(
    name='Alien',
    year=1979,
    director='Ridley Scott')  # error: Unexpected key 'director'

Type checkers should reject the following operations on TypedDict objects as unsafe, even though they are valid for normal dictionaries:

  • Operations with arbitrary str keys (instead of string literals or other expressions with known string values) should generally be rejected. This involves both destructive operations such as setting an item and read-only operations such as subscription expressions. As an exception to the above rule, d.get(e) and e in d should be allowed for TypedDict objects, for an arbitrary expression e with type str. The motivation is that these are safe and can be useful for introspecting TypedDict objects. The static type of d.get(e) should be object if the string value of e cannot be determined statically.
  • clear() is not safe since it could remove required keys, some of which may not be directly visible because of structural subtyping. popitem() is similarly unsafe, even if all known keys are not required (total=False).
  • del obj['key'] should be rejected unless 'key' is a non-required key.
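
The exception for d.get(e) and e in d can be illustrated with a short sketch. The function name lookup is hypothetical, and whether a particular type checker accepts exactly this code is up to that checker; the example only mirrors the rules listed above.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


def lookup(movie: Movie, key: str) -> object:
    """Read an item through a key whose value is not known statically."""
    if key in movie:           # allowed for an arbitrary str expression
        return movie.get(key)  # static type is object, not Union[str, int]
    return None


movie: Movie = {'name': 'Alien', 'year': 1979}
print(lookup(movie, 'year'))
print(lookup(movie, 'director'))

# By contrast, movie[key] with key: str, movie.clear(), movie.popitem()
# and del movie['name'] are operations a checker should reject.
```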

Type checkers may allow reading an item using d['x'] even if the key 'x' is not required, instead of requiring the use of d.get('x') or an explicit 'x' in d check. The rationale is that tracking the existence of keys is difficult to implement in full generality, and that disallowing this could require many changes to existing code.

The exact type checking rules are up to each type checker to decide. In some cases potentially unsafe operations may be accepted if the alternative is to generate false positive errors for idiomatic code.

Use of Final Values and Literal Types

Type checkers should allow final names (PEP 591) with string values to be used instead of string literals in operations on TypedDict objects. For example, this is valid:

YEAR: Final = 'year'

m: Movie = {'name': 'Alien', 'year': 1979}
years_since_epoch = m[YEAR] - 1970

Similarly, an expression with a suitable literal type (PEP 586) can be used instead of a literal value:

def get_value(movie: Movie,
              key: Literal['year', 'name']) -> Union[int, str]:
    return movie[key]

Type checkers are only expected to support actual string literals, not final names or literal types, for specifying keys in a TypedDict type definition. Also, only a boolean literal can be used to specify totality in a TypedDict definition. The motivation for this is to make type declarations self-contained, and to simplify the implementation of type checkers.

Backwards Compatibility

To retain backwards compatibility, type checkers should not infer a TypedDict type unless it is sufficiently clear that this is desired by the programmer. When unsure, an ordinary dictionary type should be inferred. Otherwise existing code that type checks without errors may start generating errors once TypedDict support is added to the type checker, since TypedDict types are more restrictive than dictionary types. In particular, they aren’t subtypes of dictionary types.

Reference Implementation

The mypy [2] type checker supports TypedDict types. A reference implementation of the runtime component is provided in the typing_extensions [3] module. The original implementation was in the mypy_extensions [4] module.

Rejected Alternatives

Several proposed ideas were rejected. The current set of features seems to cover a lot of ground, and it was not clear which of the proposed extensions would be more than marginally useful. This PEP defines a baseline feature that can be potentially extended later.

These are rejected on principle, as incompatible with the spirit of this proposal:

  • TypedDict isn’t extensible, and it addresses only a specific use case. TypedDict objects are regular dictionaries at runtime, and TypedDict cannot be used with other dictionary-like or mapping-like classes, including subclasses of dict. There is no way to add methods to TypedDict types. The motivation here is simplicity.
  • TypedDict type definitions could plausibly be used to perform runtime type checking of dictionaries. For example, they could be used to validate that a JSON object conforms to the schema specified by a TypedDict type. This PEP doesn’t include such functionality, since the focus of this proposal is static type checking only, and other existing types do not support this, as discussed in Class-based syntax. Such functionality can be provided by a third-party library using the typing_inspect [5] third-party module, for example.
  • TypedDict types can’t be used in isinstance() or issubclass() checks. The reasoning is similar to why runtime type checks aren’t supported in general with many type hints.
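
To give a feel for the runtime validation that this PEP leaves to third parties, here is a deliberately small sketch. It does not use typing_inspect; it is a hand-rolled helper with the invented name conforms, and it assumes a total TypedDict whose value types are plain classes that isinstance() can handle. Generic value types such as List[str] are out of scope for it.

```python
from typing import Any, TypedDict


class Movie(TypedDict):
    name: str
    year: int


def conforms(value: Any, typed_dict: Any) -> bool:
    """Hypothetical helper: check a dict against a TypedDict's items.

    Assumes all keys are required and every value type is a plain class.
    """
    if not isinstance(value, dict):
        return False
    items = typed_dict.__annotations__
    if set(value) != set(items):
        return False  # a key is missing or an undeclared key is present
    return all(isinstance(value[key], tp) for key, tp in items.items())


print(conforms({'name': 'Blade Runner', 'year': 1982}, Movie))
print(conforms({'title': 'Blade Runner', 'year': 1982}, Movie))
print(conforms({'name': 'Blade Runner', 'year': '1982'}, Movie))
```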

These features were left out from this PEP, but they are potential extensions to be added in the future:

  • TypedDict doesn’t support providing a default value type for keys that are not explicitly defined. This would allow arbitrary keys to be used with a TypedDict object, and only explicitly enumerated keys would receive special treatment compared to a normal, uniform dictionary type.
  • There is no way to individually specify whether each key is required or not. No proposed syntax was clear enough, and we expect that there is limited need for this.
  • TypedDict can’t be used for specifying the type of a **kwargs argument. This would allow restricting the allowed keyword arguments and their types. According to PEP 484, using a TypedDict type as the type of **kwargs means that the TypedDict is valid as the value of arbitrary keyword arguments, but it doesn’t restrict which keyword arguments should be allowed. The syntax **kwargs: Expand[T] has been proposed for this [6].
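
The PEP 484 reading of a TypedDict-annotated **kwargs, mentioned in the last item, can be sketched like this. The function name release_years and the keyword names are invented for the example.

```python
from typing import List, TypedDict


class Movie(TypedDict):
    name: str
    year: int


def release_years(**kwargs: Movie) -> List[int]:
    # Under PEP 484 the annotation types each keyword argument's value:
    # any keyword name is accepted, and every value must be a Movie.
    return sorted(movie['year'] for movie in kwargs.values())


print(release_years(
    first={'name': 'Alien', 'year': 1979},
    second={'name': 'Blade Runner', 'year': 1982},
))
```

Nothing in this signature restricts which keyword names a caller may pass, which is the limitation the item above describes.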

Acknowledgements

David Foster contributed the initial implementation of TypedDict types to mypy. Improvements to the implementation have been contributed by at least the author (Jukka Lehtosalo), Ivan Levkivskyi, Gareth T, Michael Lee, Dominik Miedzinski, Roy Williams and Max Moroz.

References


Source: https://github.com/python/peps/blob/main/peps/pep-0589.rst

Last modified:2024-06-11 22:12:09 GMT