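Descriptor Guide
- Author: Raymond Hettinger
- Contact: <python at rcn dot com>
Descriptors let objects customize attribute lookup, storage, and deletion.
This guide has four major sections:
The "primer" gives a basic overview, moving gently from simple examples and adding one feature at a time. Start here if you're new to descriptors.
The second section shows a complete, practical descriptor example. If you already know the basics, start there.
The third section provides a more technical tutorial that goes into the detailed mechanics of how descriptors work. Most people don't need this level of detail.
The last section has pure Python equivalents for built-in descriptors that are written in C. Read this if you're curious about how functions turn into bound methods or about the implementation of common tools like classmethod(), staticmethod(), property(), and __slots__.
Primer
In this primer, we start with the most basic possible example and then add new capabilities one by one.
Simple example: A descriptor that returns a constant
The Ten class is a descriptor whose __get__() method always returns the constant 10:
class Ten:
    def __get__(self, obj, objtype=None):
        return 10
To use the descriptor, it must be stored as a class variable in another class:
class A:
    x = 5                       # Regular class attribute
    y = Ten()                   # Descriptor instance
An interactive session shows the difference between normal attribute lookup and descriptor lookup:
>>> a = A()                     # Make an instance of class A
>>> a.x                         # Normal attribute lookup
5
>>> a.y                         # Descriptor lookup
10
In the a.x attribute lookup, the dot operator finds 'x': 5 in the class dictionary. In the a.y lookup, the dot operator finds a descriptor instance, recognized by its __get__ method. Calling that method returns 10.
Note that the value 10 is not stored in either the class dictionary or the instance dictionary. Instead, the value 10 is computed on demand.
This example shows how a simple descriptor works, but it isn't very useful. For retrieving constants, normal attribute lookup would be better.
In the next section, we'll create something more useful: a dynamic lookup.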
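Dynamic lookups
Interesting descriptors typically run computations instead of returning constants:
import os
class DirectorySize:
    def __get__(self, obj, objtype=None):
        return len(os.listdir(obj.dirname))
class Directory:
    size = DirectorySize()              # Descriptor instance
    def __init__(self, dirname):
        self.dirname = dirname          # Regular instance attribute
An interactive session shows that the lookup is dynamic: it computes a different, updated answer each time:
>>> s = Directory('songs')
>>> g = Directory('games')
>>> s.size                              # The songs directory has twenty files
20
>>> g.size                              # The games directory has three files
3
>>> os.remove('games/chess')            # Delete a game
>>> g.size                              # File count is automatically updated
2
Besides showing how descriptors can run computations, this example also reveals the purpose of the parameters to __get__(). The self parameter is size, an instance of DirectorySize. The obj parameter is either g or s, an instance of Directory. It is the obj parameter that lets the __get__() method learn the target directory. The objtype parameter is the class Directory.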
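To make the roles of the three __get__() parameters concrete, here is a minimal sketch. The Reporter and Folder classes are hypothetical names introduced only for illustration; the descriptor simply echoes back the arguments it receives. The sketch also shows a case not covered above: when the lookup starts from the class rather than from an instance, obj is None.

```python
class Reporter:
    "Hypothetical descriptor that returns the arguments passed to __get__()"

    def __get__(self, obj, objtype=None):
        return (self, obj, objtype)


class Folder:
    info = Reporter()                   # Descriptor instance stored on the class


f = Folder()
descr = vars(Folder)['info']            # Fetch the descriptor without triggering it

# Lookup from an instance: obj is the instance, objtype is its class.
assert f.info == (descr, f, Folder)

# Lookup from the class: obj is None, objtype is the class.
assert Folder.info == (descr, None, Folder)
```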
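Managed attributes
A popular use for descriptors is managing access to instance data. The descriptor is assigned to a public attribute in the class dictionary while the actual data is stored as a private attribute in the instance dictionary. The descriptor's __get__() and __set__() methods are triggered when the public attribute is accessed.
In the following example, age is the public attribute and _age is the private attribute. When the public attribute is accessed, the descriptor logs the lookup or update:
import logging
logging.basicConfig(level=logging.INFO)
class LoggedAgeAccess:
    def __get__(self, obj, objtype=None):
        value = obj._age
        logging.info('Accessing %r giving %r', 'age', value)
        return value
    def __set__(self, obj, value):
        logging.info('Updating %r to %r', 'age', value)
        obj._age = value
class Person:
    age = LoggedAgeAccess()             # Descriptor instance
    def __init__(self, name, age):
        self.name = name                # Regular instance attribute
        self.age = age                  # Calls __set__()
    def birthday(self):
        self.age += 1                   # Calls both __get__() and __set__()
An interactive session shows that all access to the managed attribute age is logged, but that the regular attribute name is not logged:
>>> mary = Person('Mary M', 30)         # The initial age update is logged
INFO:root:Updating 'age' to 30
>>> dave = Person('David D', 40)
INFO:root:Updating 'age' to 40
>>> vars(mary)                          # The actual data is in a private attribute
{'name': 'Mary M', '_age': 30}
>>> vars(dave)
{'name': 'David D', '_age': 40}
>>> mary.age                            # Access the data and log the lookup
INFO:root:Accessing 'age' giving 30
30
>>> mary.birthday()                     # Updates are logged as well
INFO:root:Accessing 'age' giving 30
INFO:root:Updating 'age' to 31
>>> dave.name                           # Regular attribute lookup isn't logged
'David D'
>>> dave.age                            # Only the managed attribute is logged
INFO:root:Accessing 'age' giving 40
40
One major issue with this example is that the private name _age is hardwired in the LoggedAgeAccess class. That means that each instance can only have one logged attribute and that its name is unchangeable. In the next example, we'll fix that problem.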
Customized names
When a class uses descriptors, it can inform each descriptor about which variable name was used.
In this example, the Person class has two descriptor instances, name and age. When the Person class is defined, it makes a callback to __set_name__() in LoggedAccess so that the field names can be recorded, giving each descriptor its own public_name and private_name:
import logging
logging.basicConfig(level=logging.INFO)
class LoggedAccess:
    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = '_' + name
    def __get__(self, obj, objtype=None):
        value = getattr(obj, self.private_name)
        logging.info('Accessing %r giving %r', self.public_name, value)
        return value
    def __set__(self, obj, value):
        logging.info('Updating %r to %r', self.public_name, value)
        setattr(obj, self.private_name, value)
class Person:
    name = LoggedAccess()               # First descriptor instance
    age = LoggedAccess()                # Second descriptor instance
    def __init__(self, name, age):
        self.name = name                # Calls the first descriptor
        self.age = age                  # Calls the second descriptor
    def birthday(self):
        self.age += 1
An interactive session shows that the Person class has called
__set_name__() so that the field names would be recorded. Here
we call vars() to look up the descriptor without triggering it:
>>> vars(vars(Person)['name'])
{'public_name': 'name', 'private_name': '_name'}
>>> vars(vars(Person)['age'])
{'public_name': 'age', 'private_name': '_age'}
The new class now logs access to both name and age:
>>> pete = Person('Peter P', 10)
INFO:root:Updating 'name' to 'Peter P'
INFO:root:Updating 'age' to 10
>>> kate = Person('Catherine C', 20)
INFO:root:Updating 'name' to 'Catherine C'
INFO:root:Updating 'age' to 20
The two Person instances contain only the private names:
>>> vars(pete)
{'_name': 'Peter P', '_age': 10}
>>> vars(kate)
{'_name': 'Catherine C', '_age': 20}
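Closing thoughts
A descriptor is what we call any object that defines __get__(), __set__(), or __delete__().
Optionally, descriptors can have a __set_name__() method. This is only used in cases where a descriptor needs to know either the class where it was created or the name of the class variable it was assigned to. (This method, if present, is called even if the class is not a descriptor.)
Descriptors get invoked by the dot operator during attribute lookup. If a descriptor is accessed indirectly with vars(some_class)[descriptor_name], the descriptor instance is returned without invoking it.
Descriptors only work when used as class variables. When put in instances, they have no effect.
The main motivation for descriptors is to provide a hook allowing objects stored in class variables to control what happens during attribute lookup.
Traditionally, the calling class controls what happens during lookup. Descriptors invert that relationship and allow the data being looked up to have a say in the matter.
Descriptors are used throughout the language. It is how functions turn into bound methods. Common tools like classmethod(), staticmethod(), property(), and functools.cached_property() are all implemented as descriptors.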
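Two of the points above are easy to check directly: a descriptor is active only when stored as a class variable, and indirect access through vars() returns the descriptor itself without invoking it. The following is a small sketch that reuses the Ten descriptor from the primer; the attribute name z is chosen only for illustration.

```python
class Ten:
    def __get__(self, obj, objtype=None):
        return 10


class A:
    y = Ten()                           # Class variable: the descriptor is active


a = A()
a.z = Ten()                             # Instance variable: just an ordinary object

assert a.y == 10                        # The dot operator calls Ten.__get__()
assert isinstance(a.z, Ten)             # No __get__() call; the Ten object comes back
assert isinstance(vars(A)['y'], Ten)    # Indirect access bypasses __get__()
```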
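Complete Practical Example
In this example, we create a practical and powerful tool for locating notoriously hard to find data corruption bugs.
Validator class
A validator is a descriptor for managed attribute access. Prior to storing any data, it verifies that the new value meets various type and range restrictions. If those restrictions aren't met, it raises an exception to prevent data corruption at its source.
This Validator class is both an abstract base class and a managed attribute descriptor:
from abc import ABC, abstractmethod
class Validator(ABC):
    def __set_name__(self, owner, name):
        self.private_name = '_' + name
    def __get__(self, obj, objtype=None):
        return getattr(obj, self.private_name)
    def __set__(self, obj, value):
        self.validate(value)
        setattr(obj, self.private_name, value)
    @abstractmethod
    def validate(self, value):
        pass
Custom validators need to inherit from Validator and must supply a validate() method to test various restrictions as needed.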
Custom validators
Here are three practical data validation utilities:
OneOf verifies that a value is one of a restricted set of options.
Number verifies that a value is either an int or float. Optionally, it verifies that a value is between a given minimum or maximum.
String verifies that a value is a str. Optionally, it validates a given minimum or maximum length. It can validate a user-defined predicate as well.
class OneOf(Validator):
    def __init__(self, *options):
        self.options = set(options)
    def validate(self, value):
        if value not in self.options:
            raise ValueError(
                f'Expected {value!r} to be one of {self.options!r}'
            )
class Number(Validator):
    def __init__(self, minvalue=None, maxvalue=None):
        self.minvalue = minvalue
        self.maxvalue = maxvalue
    def validate(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError(f'Expected {value!r} to be an int or float')
        if self.minvalue is not None and value < self.minvalue:
            raise ValueError(
                f'Expected {value!r} to be at least {self.minvalue!r}'
            )
        if self.maxvalue is not None and value > self.maxvalue:
            raise ValueError(
                f'Expected {value!r} to be no more than {self.maxvalue!r}'
            )
class String(Validator):
    def __init__(self, minsize=None, maxsize=None, predicate=None):
        self.minsize = minsize
        self.maxsize = maxsize
        self.predicate = predicate
    def validate(self, value):
        if not isinstance(value, str):
            raise TypeError(f'Expected {value!r} to be a str')
        if self.minsize is not None and len(value) < self.minsize:
            raise ValueError(
                f'Expected {value!r} to be no smaller than {self.minsize!r}'
            )
        if self.maxsize is not None and len(value) > self.maxsize:
            raise ValueError(
                f'Expected {value!r} to be no bigger than {self.maxsize!r}'
            )
        if self.predicate is not None and not self.predicate(value):
            raise ValueError(
                f'Expected {self.predicate} to be true for {value!r}'
            )
Practical application
Here's how the data validators can be used in a real class:
class Component:
    name = String(minsize=3, maxsize=10, predicate=str.isupper)
    kind = OneOf('wood', 'metal', 'plastic')
    quantity = Number(minvalue=0)
    def __init__(self, name, kind, quantity):
        self.name = name
        self.kind = kind
        self.quantity = quantity
The descriptors prevent invalid instances from being created:
>>> Component('Widget', 'metal', 5)      # Blocked: 'Widget' is not all uppercase
Traceback (most recent call last):
    ...
ValueError: Expected <method 'isupper' of 'str' objects> to be true for 'Widget'
>>> Component('WIDGET', 'metle', 5)      # Blocked: 'metle' is misspelled
Traceback (most recent call last):
    ...
ValueError: Expected 'metle' to be one of {'metal', 'plastic', 'wood'}
>>> Component('WIDGET', 'metal', -5)     # Blocked: -5 is negative
Traceback (most recent call last):
    ...
ValueError: Expected -5 to be at least 0
>>> Component('WIDGET', 'metal', 'V')    # Blocked: 'V' isn't a number
Traceback (most recent call last):
    ...
TypeError: Expected 'V' to be an int or float
>>> c = Component('WIDGET', 'metal', 5)  # Allowed: The inputs are valid
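Technical Tutorial
What follows is a more technical tutorial for the mechanics and details of how descriptors work.
Abstract
Defines descriptors, summarizes the protocol, and shows how descriptors are called. Provides an example showing how object relational mappings work.
Learning about descriptors not only provides access to a larger toolset, it creates a deeper understanding of how Python works.
Definition and introduction
In general, a descriptor is an attribute value that has one of the methods in the descriptor protocol. Those methods are __get__(), __set__(), and __delete__(). If any of those methods are defined for an attribute, it is said to be a descriptor.
The default behavior for attribute access is to get, set, or delete the attribute from an object's dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the method resolution order of type(a). If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined.
Descriptors are a powerful, general purpose protocol. They are the mechanism behind properties, methods, static methods, class methods, and super(). They are used throughout Python itself. Descriptors simplify the underlying C code and offer a flexible set of new tools for everyday Python programs.
Descriptor protocol
descr.__get__(self, obj, type=None)
descr.__set__(self, obj, value)
descr.__delete__(self, obj)
That is all there is to it. Define any of these methods and an object is considered a descriptor and can override default behavior upon being looked up as an attribute.
If an object defines __set__() or __delete__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are often used for methods but other uses are possible).
Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance's dictionary. If an instance's dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If an instance's dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.
To make a read-only data descriptor, define both __get__() and __set__() with the __set__() raising an AttributeError when called. Defining the __set__() method with an exception raising placeholder is enough to make it a data descriptor.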
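The precedence rules and the read-only recipe can be checked with a short experiment. This is only a sketch: the class names NonData, ReadOnly, and Demo are hypothetical, and the instance dictionary is written to directly through vars() so that __set__() is bypassed and both kinds of descriptor face a competing instance entry.

```python
class NonData:
    "Defines only __get__(), so it is a non-data descriptor"

    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class ReadOnly:
    "Defines __get__() and a raising __set__(), so it is a read-only data descriptor"

    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only attribute')


class Demo:
    nd = NonData()
    ro = ReadOnly()


d = Demo()
# Write straight into the instance dictionary, bypassing __set__().
vars(d)['nd'] = 'from instance dict'
vars(d)['ro'] = 'from instance dict'

assert d.nd == 'from instance dict'         # Instance entry beats non-data descriptor
assert d.ro == 'from data descriptor'       # Data descriptor beats instance entry

try:
    d.ro = 'new value'                      # Normal assignment goes through __set__()
except AttributeError:
    blocked = True
else:
    blocked = False
assert blocked
```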
Overview of descriptor invocation
A descriptor can be called directly with desc.__get__(obj) or desc.__get__(None, cls).
But it is more common for a descriptor to be invoked automatically from attribute access.
The expression obj.x looks up the attribute x in the chain of
namespaces for obj. If the search finds a descriptor outside of the
instance __dict__, its __get__() method is
invoked according to the precedence rules listed below.
The details of invocation depend on whether obj is an object, class, or
instance of super.
Invocation from an instance
Instance lookup scans through a chain of namespaces giving data descriptors the highest priority, followed by instance variables, then non-data descriptors, then class variables, and lastly __getattr__() if it is provided.
If a descriptor is found for a.x, then it is invoked with desc.__get__(a, type(a)).
The logic for a dotted lookup is in object.__getattribute__(). Here is a pure Python equivalent:
def find_name_in_mro(cls, name, default):
    "Emulate _PyType_Lookup() in Objects/typeobject.c"
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return default
def object_getattribute(obj, name):
    "Emulate PyObject_GenericGetAttr() in Objects/object.c"
    null = object()
    objtype = type(obj)
    cls_var = find_name_in_mro(objtype, name, null)
    descr_get = getattr(type(cls_var), '__get__', null)
    if descr_get is not null:
        if (hasattr(type(cls_var), '__set__')
            or hasattr(type(cls_var), '__delete__')):
            return descr_get(cls_var, obj, objtype)     # data descriptor
    if hasattr(obj, '__dict__') and name in vars(obj):
        return vars(obj)[name]                          # instance variable
    if descr_get is not null:
        return descr_get(cls_var, obj, objtype)         # non-data descriptor
    if cls_var is not null:
        return cls_var                                  # class variable
    raise AttributeError(name)
Note, there is no __getattr__() hook in the __getattribute__()
code. That is why calling __getattribute__() directly or with
super().__getattribute__ will bypass __getattr__() entirely.
Instead, it is the dot operator and the getattr() function that are
responsible for invoking __getattr__() whenever __getattribute__()
raises an AttributeError. Their logic is encapsulated in a helper
function:
def getattr_hook(obj, name):
    "Emulate slot_tp_getattr_hook() in Objects/typeobject.c"
    try:
        return obj.__getattribute__(name)
    except AttributeError:
        if not hasattr(type(obj), '__getattr__'):
            raise
    return type(obj).__getattr__(obj, name)             # __getattr__
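The bypass described above can be observed with a tiny class. This is a sketch, with Fallback as a hypothetical class name: the dot operator and getattr() fall back to __getattr__(), while a direct call to __getattribute__() lets the AttributeError propagate.

```python
class Fallback:
    def __getattr__(self, name):
        # Reached only after normal lookup raises AttributeError.
        return f'default for {name}'


obj = Fallback()

# The dot operator and getattr() both fall back to __getattr__().
assert obj.missing == 'default for missing'
assert getattr(obj, 'missing') == 'default for missing'

# Calling __getattribute__() directly skips the fallback.
try:
    obj.__getattribute__('missing')
except AttributeError:
    bypassed = True
else:
    bypassed = False
assert bypassed
```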
Invocation from a class
The logic for a dotted lookup such as A.x is in
type.__getattribute__(). The steps are similar to those for
object.__getattribute__() but the instance dictionary lookup is replaced
by a search through the class's method resolution order.
If a descriptor is found, it is invoked with desc.__get__(None, A).
The full C implementation can be found in type_getattro() and
_PyType_Lookup() in Objects/typeobject.c.
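As a quick check of the class-level call, the sketch below uses a hypothetical Reveal descriptor that returns the (obj, objtype) pair it receives. Looking the attribute up on a subclass shows that the search follows the method resolution order while the class passed to __get__() is the one the lookup started from.

```python
class Reveal:
    "Hypothetical descriptor that returns the (obj, objtype) pair it was given"

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class A:
    x = Reveal()


class B(A):
    pass


assert A.x == (None, A)                 # Same as desc.__get__(None, A)
assert B.x == (None, B)                 # Found via B's MRO, invoked with B

descr = vars(A)['x']                    # The raw descriptor, not invoked
assert descr.__get__(None, A) == A.x    # Direct call matches the dotted lookup
```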
Invocation from super
The logic for super's dotted lookup is in the __getattribute__() method for
the object returned by super().
A dotted lookup such as super(A, obj).m searches obj.__class__.__mro__
for the base class B immediately following A and then returns
B.__dict__['m'].__get__(obj, A). If not a descriptor, m is returned
unchanged.
The full C implementation can be found in super_getattro() in
Objects/typeobject.c. A pure Python equivalent can be found in
Guido's Tutorial.
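The equivalence stated above can be verified for a simple single-inheritance case. This is only a sketch with the class names A and B taken from the description and a trivial method m added for illustration.

```python
class B:
    def m(self):
        return 'B.m'


class A(B):
    def m(self):
        return 'A.m'


obj = A()

via_super = super(A, obj).m                     # Dotted lookup through super
by_hand = B.__dict__['m'].__get__(obj, A)       # The manual equivalent

assert via_super == by_hand                     # Same function bound to the same object
assert via_super() == 'B.m'                     # B's version runs, not A's override
assert via_super.__self__ is obj
```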
Summary of invocation logic
The mechanism for descriptors is embedded in the __getattribute__()
methods for object, type, and super().
The important points to remember are:
- Descriptors are invoked by the __getattribute__() method.
- Classes inherit this machinery from object, type, or super().
- Overriding __getattribute__() prevents automatic descriptor calls because all the descriptor logic is in that method.
- object.__getattribute__() and type.__getattribute__() make different calls to __get__(). The first includes the instance and may include the class. The second puts in None for the instance and always includes the class.
- Data descriptors always override instance dictionaries.
- Non-data descriptors may be overridden by instance dictionaries.
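The point about overriding __getattribute__() can be seen in a small experiment. In this sketch, Raw is a hypothetical class whose override returns the raw class attribute and never calls __get__(); lookups on the class itself are unaffected because they go through type.__getattribute__().

```python
class Ten:
    def __get__(self, obj, objtype=None):
        return 10


class Normal:
    x = Ten()


class Raw:
    x = Ten()

    def __getattribute__(self, name):
        # Hypothetical override: fetch from the class dict, never call __get__().
        return vars(type(self))[name]


assert Normal().x == 10                 # Inherited machinery invokes the descriptor
assert isinstance(Raw().x, Ten)         # The override hands back the descriptor itself
assert Raw.x == 10                      # Class lookup still uses type.__getattribute__()
```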
Automatic name notification
Sometimes it is desirable for a descriptor to know what class variable name it
was assigned to. When a new class is created, the type metaclass
scans the dictionary of the new class. If any of the entries are descriptors
and if they define __set_name__(), that method is called with two
arguments. The owner is the class where the descriptor is used, and the
name is the class variable the descriptor was assigned to.
The implementation details are in type_new() and
set_names() in Objects/typeobject.c.
Since the update logic is in type.__new__(), notifications only take
place at the time of class creation. If descriptors are added to the class
afterwards, __set_name__() will need to be called manually.
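The difference between automatic and manual notification can be shown as follows. This is a sketch with hypothetical names (Named, Early, Late): the descriptor stored in the class body is notified when the class is created, while the one attached afterwards stays unnamed until __set_name__() is called by hand.

```python
class Named:
    "Hypothetical descriptor that reports the name it was told about"

    def __set_name__(self, owner, name):
        self.owner = owner
        self.name = name

    def __get__(self, obj, objtype=None):
        return getattr(self, 'name', None)      # None until notified


class Early:
    field = Named()                     # type.__new__() calls __set_name__()


class Late:
    pass


Late.field = Named()                    # Added after class creation: no notification

assert Early().field == 'field'
assert Late().field is None

vars(Late)['field'].__set_name__(Late, 'field')     # Manual notification
assert Late().field == 'field'
```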
ORM example
The following code is a simplified skeleton showing how data descriptors could be used to implement an object relational mapping.
The essential idea is that the data is stored in an external database. The Python instances only hold keys to the database's tables. Descriptors take care of lookups or updates:
class Field:
    def __set_name__(self, owner, name):
        self.fetch = f'SELECT {name} FROM {owner.table} WHERE {owner.key}=?;'
        self.store = f'UPDATE {owner.table} SET {name}=? WHERE {owner.key}=?;'
    def __get__(self, obj, objtype=None):
        return conn.execute(self.fetch, [obj.key]).fetchone()[0]
    def __set__(self, obj, value):
        conn.execute(self.store, [value, obj.key])
        conn.commit()
We can use the Field class to define models that describe the schema for
each table in a database:
class Movie:
    table = 'Movies'                    # Table name
    key = 'title'                       # Primary key
    director = Field()
    year = Field()
    def __init__(self, key):
        self.key = key
class Song:
    table = 'Music'
    key = 'title'
    artist = Field()
    year = Field()
    genre = Field()
    def __init__(self, key):
        self.key = key
To use the models, first connect to the database:
>>> import sqlite3
>>> conn = sqlite3.connect('entertainment.db')
An interactive session shows how data is retrieved from the database and how it can be updated:
>>> Movie('Star Wars').director
'George Lucas'
>>> jaws = Movie('Jaws')
>>> f'Released in {jaws.year} by {jaws.director}'
'Released in 1975 by Steven Spielberg'
>>> Song('Country Roads').artist
'John Denver'
>>> Movie('Star Wars').director = 'J.J. Abrams'
>>> Movie('Star Wars').director
'J.J. Abrams'
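The session above assumes that entertainment.db already exists and is populated. To try the idea without that file, here is a self-contained sketch that swaps in an in-memory SQLite database. The table layout and the two sample rows are assumptions made for illustration, not the contents of the original database.

```python
import sqlite3

# Assumption: an in-memory database stands in for entertainment.db.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE Movies (title TEXT PRIMARY KEY, director TEXT, year INTEGER)'
)
conn.executemany(
    'INSERT INTO Movies VALUES (?, ?, ?)',
    [('Star Wars', 'George Lucas', 1977),       # Sample rows for illustration
     ('Jaws', 'Steven Spielberg', 1975)],
)
conn.commit()


class Field:
    def __set_name__(self, owner, name):
        # Build the SQL once, when the owning class is created.
        self.fetch = f'SELECT {name} FROM {owner.table} WHERE {owner.key}=?;'
        self.store = f'UPDATE {owner.table} SET {name}=? WHERE {owner.key}=?;'

    def __get__(self, obj, objtype=None):
        return conn.execute(self.fetch, [obj.key]).fetchone()[0]

    def __set__(self, obj, value):
        conn.execute(self.store, [value, obj.key])
        conn.commit()


class Movie:
    table = 'Movies'                    # Table name
    key = 'title'                       # Primary key column
    director = Field()
    year = Field()

    def __init__(self, key):
        self.key = key                  # Instance holds only the row's key


jaws = Movie('Jaws')
assert (jaws.year, jaws.director) == (1975, 'Steven Spielberg')

Movie('Star Wars').director = 'J.J. Abrams'             # Runs the UPDATE statement
assert Movie('Star Wars').director == 'J.J. Abrams'     # A fresh instance sees it
```

Note that the instances never cache column values. Every read goes back to the database, which is why a newly created Movie('Star Wars') sees the update.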
Pure Python Equivalents
The descriptor protocol is simple and offers exciting possibilities. Several use cases are so common that they have been prepackaged into built-in tools. Properties, bound methods, static methods, class methods, and __slots__ are all based on the descriptor protocol.
Properties
Calling property() is a succinct way of building a data descriptor that
triggers a function call upon access to an attribute. Its signature is:
property(fget=None, fset=None, fdel=None, doc=None) -> property
The documentation shows a typical use to define a managed attribute x:
class C:
    def getx(self): return self.__x
    def setx(self, value): self.__x = value
    def delx(self): del self.__x
    x = property(getx, setx, delx, "I'm the 'x' property.")
To see how property() is implemented in terms of the descriptor protocol,
here is a pure Python equivalent that implements most of the core functionality:
class Property:
    "Emulate PyProperty_Type() in Objects/descrobject.c"
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc
    def __set_name__(self, owner, name):
        self.__name__ = name
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError
        return self.fget(obj)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError
        self.fset(obj, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError
        self.fdel(obj)
    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)
    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)
    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
The property() builtin helps whenever a user interface has granted
attribute access and then subsequent changes require the intervention of a
method.
For instance, a spreadsheet class may grant access to a cell value through
Cell('b10').value. Subsequent improvements to the program require the cell
to be recalculated on every access; however, the programmer does not want to
affect existing client code accessing the attribute directly. The solution is
to wrap access to the value attribute in a property data descriptor:
class Cell:
    ...
    @property
    def value(self):
        "Recalculate the cell before returning value"
        self.recalc()
        return self._value
Either the built-in property() or our Property() equivalent would
work in this example.
Functions and methods
Python's object oriented features are built upon a function based environment. Using non-data descriptors, the two are merged seamlessly.
Functions stored in class dictionaries get turned into methods when invoked. Methods only differ from regular functions in that the object instance is prepended to the other arguments. By convention, the instance is called self but could be called this or any other variable name.
Methods can be created manually with types.MethodType which is
roughly equivalent to:
class MethodType:
    "Emulate PyMethod_Type in Objects/classobject.c"
    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj
    def __call__(self, *args, **kwargs):
        func = self.__func__
        obj = self.__self__
        return func(obj, *args, **kwargs)
    def __getattribute__(self, name):
        "Emulate method_getset() in Objects/classobject.c"
        if name == '__doc__':
            return self.__func__.__doc__
        return object.__getattribute__(self, name)
    def __getattr__(self, name):
        "Emulate method_getattro() in Objects/classobject.c"
        return getattr(self.__func__, name)
    def __get__(self, obj, objtype=None):
        "Emulate method_descr_get() in Objects/classobject.c"
        return self
To support automatic creation of methods, functions include the
__get__() method for binding methods during attribute access. This
means that functions are non-data descriptors that return bound methods
during dotted lookup from an instance. Here's how it works:
class Function:
    ...
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        if obj is None:
            return self
        return MethodType(self, obj)
Running the following class in the interpreter shows how the function descriptor works in practice:
class D:
    def f(self):
        return self
class D2:
    pass
The function has a qualified name attribute to support introspection:
>>> D.f.__qualname__
'D.f'
Accessing the function through the class dictionary does not invoke
__get__(). Instead, it just returns the underlying function object:
>>> D.__dict__['f']
<function D.f at 0x00C45070>
Dotted access from a class calls __get__() which just returns the
underlying function unchanged:
>>> D.f
<function D.f at 0x00C45070>
The interesting behavior occurs during dotted access from an instance. The
dotted lookup calls __get__() which returns a bound method object:
>>> d = D()
>>> d.f
<bound method D.f of <__main__.D object at 0x00B18C90>>
Internally, the bound method stores the underlying function and the bound instance:
>>> d.f.__func__
<function D.f at 0x00C45070>
>>> d.f.__self__
<__main__.D object at 0x00B18C90>
If you have ever wondered where self comes from in regular methods or where cls comes from in class methods, this is it!
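The binding step can also be performed by hand, which makes the role of __get__() explicit. The sketch below calls the function's __get__() directly and, as an alternative, builds the same bound method with types.MethodType.

```python
import types


class D:
    def f(self):
        return self


d = D()
func = D.__dict__['f']                  # Plain function, no binding yet

bound = func.__get__(d, D)              # What the dot operator does for d.f
assert bound == d.f
assert bound() is d                     # The instance was supplied as self

manual = types.MethodType(func, d)      # Equivalent manual construction
assert manual == d.f
assert manual.__func__ is func and manual.__self__ is d

assert func.__get__(None, D) is func    # Class access returns the function unchanged
```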
Kinds of methods
Non-data descriptors provide a simple mechanism for variations on the usual patterns of binding functions into methods.
To recap, functions have a __get__() method so that they can be converted
to a method when accessed as attributes. The non-data descriptor transforms an
obj.f(*args) call into f(obj, *args). Calling cls.f(*args)
becomes f(*args).
This chart summarizes the binding and its two most useful variants:
Transformation    Called from an object    Called from a class
function          f(obj, *args)            f(*args)
staticmethod      f(*args)                 f(*args)
classmethod       f(type(obj), *args)      f(cls, *args)
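Each cell of the chart can be confirmed with functions that simply return whatever positional arguments they receive. The class name T and the method names plain, static, and klass are hypothetical and chosen only for this sketch.

```python
class T:
    def plain(*args):                   # Regular function
        return args

    @staticmethod
    def static(*args):
        return args

    @classmethod
    def klass(*args):
        return args


t = T()

# function: the instance is prepended only when called from an object.
assert t.plain(1, 2) == (t, 1, 2)
assert T.plain(1, 2) == (1, 2)

# staticmethod: nothing is prepended in either case.
assert t.static(1, 2) == (1, 2)
assert T.static(1, 2) == (1, 2)

# classmethod: the class is prepended in both cases.
assert t.klass(1, 2) == (T, 1, 2)
assert T.klass(1, 2) == (T, 1, 2)
```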
Static methods
Static methods return the underlying function without changes. Calling either
c.f or C.f is the equivalent of a direct lookup into
object.__getattribute__(c, "f") or object.__getattribute__(C, "f"). As a
result, the function becomes identically accessible from either an object or a
class.
Good candidates for static methods are methods that do not reference the
self variable.
For instance, a statistics package may include a container class for
experimental data. The class provides normal methods for computing the average,
mean, median, and other descriptive statistics that depend on the data. However,
there may be useful functions which are conceptually related but do not depend
on the data. For instance, erf(x) is a handy conversion routine that comes up
in statistical work but does not directly depend on a particular dataset.
It can be called either from an object or the class: s.erf(1.5) --> 0.9661
or Sample.erf(1.5) --> 0.9661.
Since static methods return the underlying function with no changes, the example calls are unexciting:
class E:
    @staticmethod
    def f(x):
        return x * 10
>>> E.f(3)
30
>>> E().f(3)
30
Using the non-data descriptor protocol, a pure Python version of
staticmethod() would look like this:
import functools
class StaticMethod:
    "Emulate PyStaticMethod_Type() in Objects/funcobject.c"
    def __init__(self, f):
        self.f = f
        functools.update_wrapper(self, f)
    def __get__(self, obj, objtype=None):
        return self.f
    def __call__(self, *args, **kwds):
        return self.f(*args, **kwds)
    @property
    def __annotations__(self):
        return self.f.__annotations__
The functools.update_wrapper() call adds a __wrapped__ attribute
that refers to the underlying function. Also it carries forward
the attributes necessary to make the wrapper look like the wrapped
function, including __name__, __qualname__,
and __doc__.
Class methods
Unlike static methods, class methods prepend the class reference to the argument list before calling the function. This format is the same whether the caller is an object or a class:
class F:
    @classmethod
    def f(cls, x):
        return cls.__name__, x
>>> F.f(3)
('F', 3)
>>> F().f(3)
('F', 3)
This behavior is useful whenever the method only needs to have a class
reference and does not rely on data stored in a specific instance. One use for
class methods is to create alternate class constructors. For example, the
classmethod dict.fromkeys() creates a new dictionary from a list of
keys. The pure Python equivalent is:
class Dict(dict):
    @classmethod
    def fromkeys(cls, iterable, value=None):
        "Emulate dict_fromkeys() in Objects/dictobject.c"
        d = cls()
        for key in iterable:
            d[key] = value
        return d
Now a new dictionary of unique keys can be constructed like this:
>>> d = Dict.fromkeys('abracadabra')
>>> type(d) is Dict
True
>>> d
{'a': None, 'b': None, 'r': None, 'c': None, 'd': None}
Using the non-data descriptor protocol, a pure Python version of
classmethod() would look like this:
import functools
class ClassMethod:
    "Emulate PyClassMethod_Type() in Objects/funcobject.c"
    def __init__(self, f):
        self.f = f
        functools.update_wrapper(self, f)
    def __get__(self, obj, cls=None):
        if cls is None:
            cls = type(obj)
        return MethodType(self.f, cls)
The functools.update_wrapper() call in ClassMethod adds a
__wrapped__ attribute that refers to the underlying function. Also
it carries forward the attributes necessary to make the wrapper look
like the wrapped function: __name__,
__qualname__, __doc__,
and __annotations__.
Member objects and __slots__
When a class defines __slots__, it replaces instance dictionaries with a
fixed-length array of slot values. From a user point of view that has
several effects:
1. Provides immediate detection of bugs due to misspelled attribute
assignments. Only attribute names specified in __slots__ are allowed:
class Vehicle:
    __slots__ = ('id_number', 'make', 'model')
>>> auto = Vehicle()
>>> auto.id_nubmer = 'VYE483814LQEX'
Traceback (most recent call last):
...
AttributeError: 'Vehicle' object has no attribute 'id_nubmer'
2. Helps create immutable objects where descriptors manage access to private
attributes stored in __slots__:
class Immutable:
    __slots__ = ('_dept', '_name')          # Replace the instance dictionary
    def __init__(self, dept, name):
        self._dept = dept                   # Store to private attribute
        self._name = name                   # Store to private attribute
    @property                               # Read-only descriptor
    def dept(self):
        return self._dept
    @property
    def name(self):                         # Read-only descriptor
        return self._name
>>> mark = Immutable('Botany', 'Mark Watney')
>>> mark.dept
'Botany'
>>> mark.dept = 'Space Pirate'
Traceback (most recent call last):
...
AttributeError: property 'dept' of 'Immutable' object has no setter
>>> mark.location = 'Mars'
Traceback (most recent call last):
...
AttributeError: 'Immutable' object has no attribute 'location'
3. Saves memory. On a 64-bit Linux build, an instance with two attributes
takes 48 bytes with __slots__ and 152 bytes without. This flyweight
design pattern likely only
matters when a large number of instances are going to be created.
4. Improves speed. Reading instance variables is 35% faster with
__slots__ (as measured with Python 3.10 on an Apple M1 processor).
5. Blocks tools like functools.cached_property() which require an
instance dictionary to function correctly:
from functools import cached_property
class CP:
    __slots__ = ()                          # Eliminates the instance dict
    @cached_property                        # Requires an instance dict
    def pi(self):
        return 4 * sum((-1.0)**n / (2.0*n + 1.0)
                       for n in reversed(range(100_000)))
>>> CP().pi
Traceback (most recent call last):
...
TypeError: No '__dict__' attribute on 'CP' instance to cache 'pi' property.
It is not possible to create an exact drop-in pure Python version of
__slots__ because it requires direct access to C structures and control
over object memory allocation. However, we can build a mostly faithful
simulation where the actual C structure for slots is emulated by a private
_slotvalues list. Reads and writes to that private structure are managed
by member descriptors:
null = object()
class Member:
    def __init__(self, name, clsname, offset):
        'Emulate PyMemberDef in Include/structmember.h'
        # Also see descr_new() in Objects/descrobject.c
        self.name = name
        self.clsname = clsname
        self.offset = offset
    def __get__(self, obj, objtype=None):
        'Emulate member_get() in Objects/descrobject.c'
        # Also see PyMember_GetOne() in Python/structmember.c
        if obj is None:
            return self
        value = obj._slotvalues[self.offset]
        if value is null:
            raise AttributeError(self.name)
        return value
    def __set__(self, obj, value):
        'Emulate member_set() in Objects/descrobject.c'
        obj._slotvalues[self.offset] = value
    def __delete__(self, obj):
        'Emulate member_delete() in Objects/descrobject.c'
        value = obj._slotvalues[self.offset]
        if value is null:
            raise AttributeError(self.name)
        obj._slotvalues[self.offset] = null
    def __repr__(self):
        'Emulate member_repr() in Objects/descrobject.c'
        return f'<Member {self.name!r} of {self.clsname!r}>'
The type.__new__() method takes care of adding member objects to class
variables:
class Type(type):
    'Simulate how the type metaclass adds member objects for slots'
    def __new__(mcls, clsname, bases, mapping, **kwargs):
        'Emulate type_new() in Objects/typeobject.c'
        # type_new() calls PyTypeReady() which calls add_methods()
        slot_names = mapping.get('slot_names', [])
        for offset, name in enumerate(slot_names):
            mapping[name] = Member(name, clsname, offset)
        return type.__new__(mcls, clsname, bases, mapping, **kwargs)
The object.__new__() method takes care of creating instances that have
slots instead of an instance dictionary. Here is a rough simulation in pure
Python:
class Object:
    'Simulate how object.__new__() allocates memory for __slots__'
    def __new__(cls, *args, **kwargs):
        'Emulate object_new() in Objects/typeobject.c'
        inst = super().__new__(cls)
        if hasattr(cls, 'slot_names'):
            empty_slots = [null] * len(cls.slot_names)
            object.__setattr__(inst, '_slotvalues', empty_slots)
        return inst
    def __setattr__(self, name, value):
        'Emulate _PyObject_GenericSetAttrWithDict() Objects/object.c'
        cls = type(self)
        if hasattr(cls, 'slot_names') and name not in cls.slot_names:
            raise AttributeError(
                f'{cls.__name__!r} object has no attribute {name!r}'
            )
        super().__setattr__(name, value)
    def __delattr__(self, name):
        'Emulate _PyObject_GenericSetAttrWithDict() Objects/object.c'
        cls = type(self)
        if hasattr(cls, 'slot_names') and name not in cls.slot_names:
            raise AttributeError(
                f'{cls.__name__!r} object has no attribute {name!r}'
            )
        super().__delattr__(name)
To use the simulation in a real class, just inherit from Object and
set the metaclass to Type:
class H(Object, metaclass=Type):
    'Instance variables stored in slots'
    slot_names = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y
At this point, the metaclass has loaded member objects for x and y:
>>> from pprint import pp
>>> pp(dict(vars(H)))
{'__module__': '__main__',
 '__doc__': 'Instance variables stored in slots',
 'slot_names': ['x', 'y'],
 '__init__': <function H.__init__ at 0x7fb5d302f9d0>,
 'x': <Member 'x' of 'H'>,
 'y': <Member 'y' of 'H'>}
When instances are created, they have a _slotvalues list where the
attributes are stored:
>>> h = H(10, 20)
>>> vars(h)
{'_slotvalues': [10, 20]}
>>> h.x = 55
>>> vars(h)
{'_slotvalues': [55, 20]}
Misspelled or unassigned attributes will raise an exception:
>>> h.xz
Traceback (most recent call last):
...
AttributeError: 'H' object has no attribute 'xz'
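For comparison with the simulation, the real __slots__ machinery can be inspected in a similar way. This sketch reuses the Vehicle class from earlier in this section; the value 'Acme' is a sample chosen for illustration. It shows that each slot name is backed by a built-in member descriptor stored in the class and that instances have no __dict__.

```python
class Vehicle:
    __slots__ = ('id_number', 'make', 'model')


auto = Vehicle()
auto.make = 'Acme'                              # Sample value for illustration

member = vars(Vehicle)['make']                  # Descriptor created for the slot
assert type(member).__name__ == 'member_descriptor'
assert member.__get__(auto, Vehicle) == 'Acme'  # Same result as auto.make

assert not hasattr(auto, '__dict__')            # No instance dictionary

try:
    auto.model                                  # Slot exists but was never assigned
except AttributeError:
    unassigned = True
else:
    unassigned = False
assert unassigned
```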