A Information to Python’s Weak References Utilizing the weakref Module | by Martin Heinz | Jun, 2024
Study all about weak references in Python: reference counting, rubbish assortment, and sensible makes use of of the weakref
module
Likelihood is that you just by no means touched and perhaps haven’t even heard about Python’s weakref
module. Whereas it may not be generally utilized in your code, it is elementary to the inside workings of many libraries, frameworks and even Python itself. So, on this article we’ll discover what it’s, how it’s useful, and the way you could possibly incorporate it into your code as properly.
To know weakref
module and weak references, we first want somewhat intro to rubbish assortment in Python.
Python makes use of reference counting as a mechanism for rubbish assortment — in easy phrases — Python retains a reference depend for every object we create and the reference depend is incremented at any time when the item is referenced in code; and it’s decremented when an object is de-referenced (e.g. variable set to None
). If the reference depend ever drop to zero, the reminiscence for the item is deallocated (garbage-collected).
Let’s have a look at some code to grasp it somewhat extra:
import sysclass SomeObject:
def __del__(self):
print(f"(Deleting {self=})")
obj = SomeObject()
print(sys.getrefcount(obj)) # 2
obj2 = obj
print(sys.getrefcount(obj)) # 3
obj = None
obj2 = None
# (Deleting self=<__main__.SomeObject object at 0x7d303fee7e80>)
Right here we outline a category that solely implements a __del__
technique, which known as when object is garbage-collected (GC’ed) – we do that in order that we are able to see when the rubbish assortment occurs.
After creating an occasion of this class, we use sys.getrefcount
to get present variety of references to this object. We might anticipate to get 1
right here, however the depend returned by getrefcount
is usually one larger than you would possibly anticipate, that is as a result of after we name getrefcount
, the reference is copied by worth into the perform’s argument, quickly bumping up the item’s reference depend.
Subsequent, if we declare obj2 = obj
and name getrefcount
once more, we get 3
as a result of it is now referenced by each obj
and obj2
. Conversely, if we assign None
to those variables, the reference depend will lower to zero, and finally we’ll get the message from __del__
technique telling us that the item obtained garbage-collected.
Nicely, and the way do weak references match into this? If solely remaining references to an object are weak references, then Python interpreter is free to garbage-collect this object. In different phrases — a weak reference to an object will not be sufficient to maintain the item alive:
import weakrefobj = SomeObject()
reference = weakref.ref(obj)
print(reference) # <weakref at 0x734b0a514590; to 'SomeObject' at 0x734b0a4e7700>
print(reference()) # <__main__.SomeObject object at 0x707038c0b700>
print(obj.__weakref__) # <weakref at 0x734b0a514590; to 'SomeObject' at 0x734b0a4e7700>
print(sys.getrefcount(obj)) # 2
obj = None
# (Deleting self=<__main__.SomeObject object at 0x70744d42b700>)
print(reference) # <weakref at 0x7988e2d70590; useless>
print(reference()) # None
Right here we once more declare a variable obj
of our class, however this time as an alternative of making second sturdy reference to this object, we create weak reference in reference
variable.
If we then examine the reference depend, we are able to see that it didn’t improve, and if we set the obj
variable to None
, we are able to see that it instantly will get garbage-collected despite the fact that the weak reference nonetheless exist.
Lastly, if attempt to entry the weak reference to the already garbage-collected object, we get a “useless” reference and None
respectively.
Additionally discover that after we used the weak reference to entry the item, we needed to name it as a perform ( reference()
) to retrieve to object. Subsequently, it’s typically extra handy to make use of a proxy as an alternative, particularly if it is advisable entry object attributes:
obj = SomeObject()reference = weakref.proxy(obj)
print(reference) # <__main__.SomeObject object at 0x78a420e6b700>
obj.attr = 1
print(reference.attr) # 1
Now that we all know how weak references work, let’s have a look at some examples of how they may very well be helpful.
A typical use-case for weak references is tree-like information constructions:
class Node:
def __init__(self, worth):
self.worth = worth
self._parent = None
self.youngsters = []def __repr__(self):
return "Node({!r:})".format(self.worth)
@property
def mother or father(self):
return self._parent if self._parent is None else self._parent()
@mother or father.setter
def mother or father(self, node):
self._parent = weakref.ref(node)
def add_child(self, youngster):
self.youngsters.append(youngster)
youngster.mother or father = self
root = Node("mother or father")
n = Node("youngster")
root.add_child(n)
print(n.mother or father) # Node('mother or father')
del root
print(n.mother or father) # None
Right here we implement a tree utilizing a Node
class the place youngster nodes have weak reference to their mother or father. On this relation, the kid Node
can stay with out mother or father Node
, which permits mother or father to be silently eliminated/garbage-collected.
Alternatively, we are able to flip this round:
class Node:
def __init__(self, worth):
self.worth = worth
self._children = weakref.WeakValueDictionary()@property
def youngsters(self):
return checklist(self._children.objects())
def add_child(self, key, youngster):
self._children[key] = youngster
root = Node("mother or father")
n1 = Node("youngster one")
n2 = Node("youngster two")
root.add_child("n1", n1)
root.add_child("n2", n2)
print(root.youngsters) # [('n1', Node('child one')), ('n2', Node('child two'))]
del n1
print(root.youngsters) # [('n2', Node('child two'))]
Right here as an alternative, the mother or father retains a dictionary of weak references to its youngsters. This makes use of WeakValueDictionary
— at any time when a component (weak reference) referenced from the dictionary will get dereferenced elsewhere in this system, it robotically will get faraway from the dictionary too, so we do not have handle lifecycle of dictionary objects.
One other use of weakref
is in Observer design pattern:
class Observable:
def __init__(self):
self._observers = weakref.WeakSet()def register_observer(self, obs):
self._observers.add(obs)
def notify_observers(self, *args, **kwargs):
for obs in self._observers:
obs.notify(self, *args, **kwargs)
class Observer:
def __init__(self, observable):
observable.register_observer(self)
def notify(self, observable, *args, **kwargs):
print("Obtained", args, kwargs, "From", observable)
topic = Observable()
observer = Observer(topic)
topic.notify_observers("take a look at", kw="python")
# Obtained ('take a look at',) {'kw': 'python'} From <__main__.Observable object at 0x757957b892d0>
The Observable
class retains weak references to its observers, as a result of it would not care in the event that they get eliminated. As with earlier examples, this avoids having to handle the lifecycle of dependant objects. As you in all probability seen, on this instance we used WeakSet
which is one other class from weakref
module, it behaves identical to the WeakValueDictionary
however is carried out utilizing Set
.
Remaining instance for this part is borrowed from weakref
docs:
import tempfile, shutil
from pathlib import Pathclass TempDir:
def __init__(self):
self.title = tempfile.mkdtemp()
self._finalizer = weakref.finalize(self, shutil.rmtree, self.title)
def __repr__(self):
return "TempDir({!r:})".format(self.title)
def take away(self):
self._finalizer()
@property
def eliminated(self):
return not self._finalizer.alive
tmp = TempDir()
print(tmp) # TempDir('/tmp/tmp8o0aecl3')
print(tmp.eliminated) # False
print(Path(tmp.title).is_dir()) # True
This showcases yet another characteristic of weakref
module, which is weakref.finalize
. Because the title recommend it permits executing a finalizer perform/callback when the dependant object is garbage-collected. On this case we implement a TempDir
class which can be utilized to create a short lived listing – in preferrred case we’d at all times bear in mind to scrub up the TempDir
after we do not want it anymore, but when we neglect, we’ve the finalizer that can robotically run rmtree
on the listing when the TempDir
object is GC’ed, which incorporates when program exits fully.
The earlier part has proven couple sensible usages for weakref
, however let’s additionally check out real-world examples—one among them being making a cached occasion:
import logging
a = logging.getLogger("first")
b = logging.getLogger("second")
print(a is b) # Falsec = logging.getLogger("first")
print(a is c) # True
The above is primary utilization of Python’s builtin logging
module – we are able to see that it permits to solely affiliate a single logger occasion with a given title – which means that after we retrieve similar logger a number of instances, it at all times returns the identical cached logger occasion.
If we needed to implement this, it might look one thing like this:
class Logger:
def __init__(self, title):
self.title = title_logger_cache = weakref.WeakValueDictionary()
def get_logger(title):
if title not in _logger_cache:
l = Logger(title)
_logger_cache[name] = l
else:
l = _logger_cache[name]
return l
a = get_logger("first")
b = get_logger("second")
print(a is b) # False
c = get_logger("first")
print(a is c) # True
And at last, Python itself makes use of weak references, e.g. in implementation of OrderedDict
:
from _weakref import proxy as _proxyclass OrderedDict(dict):
def __new__(cls, /, *args, **kwds):
self = dict.__new__(cls)
self.__hardroot = _Link()
self.__root = root = _proxy(self.__hardroot)
root.prev = root.subsequent = root
self.__map = {}
return self
The above is snippet from CPython’s collections
module. Right here, the weakref.proxy
is used to forestall round references (see the doc-strings for extra particulars).
weakref
is pretty obscure, however at instances very great tool that it is best to preserve in your toolbox. It may be very useful when implementing caches or information constructions which have reference loops in them, comparable to doubly linked lists.
With that stated, one ought to pay attention to weakref
help — every part stated right here and within the docs is CPython particular and totally different Python implementations could have totally different weakref
habits. Additionally, lots of the builtin sorts do not help weak references, comparable to checklist
, tuple
or int
.
This text was initially posted at martinheinz.dev