Add __slots__ entries.
Hello, all this does is add a __slots__ entry to a few classes. this small change makes an outsized impact, reducing the size of instances dramatically, and leads the way to efficient and fully typed page extraction.
from pympler import asizeof
class Rect:
def __init__(self, a, b, c, d):
self.x0: float = float(a)
self.y0: float = float(b)
self.x1: float = float(c)
self.y1: float = float(d)
class RectWSlot:
__slots__ = ("x0", "y0", "x1", "y1")
def __init__(self, a, b, c, d):
self.x0: float = float(a)
self.y0: float = float(b)
self.x1: float = float(c)
self.y1: float = float(d)
print(asizeof.asizeof(Rect(1, 2, 3, 4))) # 624
print(asizeof.asizeof(RectWSlot(1, 2, 3, 4))) # 160
I noticed the reasoning for non-typed values on the Structure of Dictionary Outputs section.
If this pull gets merged, would you accept a pull that types the dictionaries?
Thanks for the idea - and you are quite right: we should have considered doing this.
Before we accept: could you please extend this PR to the Matrix class please? The slots here would carry the names "a" through "f".
please extend this idea to the remaining "geometry" class Matrix.
Gladly! Note, this might be kind of limited in it's effectiveness because of some upstream code, namely the swig generated code. Seems easy enough to add.
I think the test failures can be fixed by deleting IdentityMatrix's __setattr__() method. Could you try doing this in your PR?
[Presumably this was somehow forcing the creation of a __dict__ and so messing up lookups of in __slots__.]
Apologies, I've been trying to get a job, and I'm helping someone move. I also just found out how to run your tests locally via act, so it's a bit easier to make commits/changes.
I think the test failures can be fixed by deleting
IdentityMatrix's__setattr__()method.
You have to be careful with that, as you don't want someone to change the identity matrix to be anything else.
I haven't tested it but the only reason for the existence of pymupdf.IdentityMatrix is its immutability. This is ensured by __setattr__().
So deleting this method also deletes the reason for having IdentityMatrix at all.
~~I could think of a way out:~~
~~Define this object as a @property~~
Sorry, that will not work either!
I think best is to give up supplying Matrix with slots.
This passes tests for me:
class IdentityMatrix(Matrix):
def __init__(self):
# Deliberately do not call Matrix.__init__(self).
pass
@property
def a(self):
return 1
@property
def b(self):
return 0
@property
def c(self):
return 0
@property
def d(self):
return 1
@property
def e(self):
return 0
@property
def f(self):
return 0
Identity = IdentityMatrix()
assert Identity.a == 1
assert Identity.b == 0
try:
Identity.a = 23
except Exception:
pass
else:
assert 0
- The error from
Identity.a = 23isAttributeError: property 'a' of 'IdentityMatrix' object has no setter. - Instances of
IdentityMatrix()will not benefit from the__slots__optimisation, but this probably doesn't matter.