Digest Equivalence

The fleche.digest.digest() function turns Python objects into SHA-256 hex strings. Those strings are the only thing the cache compares, so two objects that produce the same digest are interchangeable for caching purposes. Most of the time this is exactly what you want: digest(1) == digest(1.0) == digest(1+0j) because all three hold the same numeric value, and digest((1, 2)) != digest([1, 2]) because the container type matters. This page documents the deliberate equivalences fleche guarantees and the boundaries you should keep in mind.

The digest function is content-based: it walks each value, salts the running hash with type(value).__name__ for type discrimination, and folds in a representation that is stable across processes and Python versions (where the underlying object’s representation allows it). When two values of different concrete Python types are considered equivalent at the value level (1 and 1.0, an int subclass with the same numeric value, …), the digest deliberately collapses them to the same string.
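The walk described above can be pictured with a small stand-alone sketch. It is not fleche's implementation: toy_digest, the "number" tag, and the byte layout are all assumptions made for illustration, and it handles only numbers, strings, tuples, and lists. It shows how a type salt can keep containers apart while a shared canonical form lets equal numbers collapse.

```python
import hashlib


def _feed(h, value) -> None:
    """Fold one value into the running hash (toy rules, not fleche's)."""
    if isinstance(value, (int, float, complex)):
        # Assumption: every number shares one tag and is reduced to a
        # canonical (real, imag) pair, so 1, 1.0 and 1+0j feed the same bytes.
        c = complex(value)
        h.update(b"number:" + repr((c.real, c.imag)).encode())
    elif isinstance(value, str):
        data = value.encode("utf-8")
        # Length prefix keeps adjacent strings from running together.
        h.update(b"str:%d:" % len(data) + data)
    elif isinstance(value, (tuple, list)):
        # Containers are salted with their type name, so (1, 2) != [1, 2].
        h.update(type(value).__name__.encode() + b"[")
        for item in value:
            _feed(h, item)
            h.update(b",")
        h.update(b"]")
    else:
        raise TypeError(f"toy_digest does not handle {type(value).__name__}")


def toy_digest(value) -> str:
    """Return a SHA-256 hex string for the supported toy value types."""
    h = hashlib.sha256()
    _feed(h, value)
    return h.hexdigest()


# Equal numeric values collapse; container types stay distinct.
assert toy_digest(1) == toy_digest(1.0) == toy_digest(1 + 0j)
assert toy_digest((1, 2)) != toy_digest([1, 2])
```

The real function covers far more types and may encode values quite differently; only the two properties asserted at the end are taken from this page.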

Dataclasses and attrs classes

A stdlib dataclasses.dataclass() instance and an attrs-decorated instance produce the same digest when:

  • they share the same class __name__, and

  • they expose the same (field_name, field_value) mapping.

This means you can convert a class from one record framework to the other without invalidating any already-cached call that took an instance of that class as an argument (or returned one).

from dataclasses import dataclass
import attrs
from fleche.digest import digest

# before the migration
@dataclass
class Point:
    x: int
    y: int

before = digest(Point(x=1, y=2))

# after the migration — same name, same fields
@attrs.define
class Point:
    x: int
    y: int

after = digest(Point(x=1, y=2))

assert before == after
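The two conditions can also be read as a comparison key. The sketch below is a toy model, not fleche's code: record_key and the declare_* helpers are hypothetical names, and both declarations are stdlib dataclasses so that it runs without attrs. It suggests which refactors would keep the key intact and which would change it.

```python
from dataclasses import dataclass, fields


def record_key(obj):
    """Toy key: the class __name__ plus its (field_name, field_value) pairs."""
    pairs = tuple(sorted((f.name, getattr(obj, f.name)) for f in fields(obj)))
    return (type(obj).__name__, pairs)


def declare_before():
    @dataclass
    class Point:
        x: int
        y: int
    return Point


def declare_after():
    # Same name and fields, declared with different options.
    @dataclass(frozen=True)
    class Point:
        x: int
        y: int
    return Point


def declare_extended():
    @dataclass
    class Point:
        x: int
        y: int
        z: int = 0  # an extra field changes the mapping, even with a default
    return Point


before_key = record_key(declare_before()(x=1, y=2))
after_key = record_key(declare_after()(x=1, y=2))
extended_key = record_key(declare_extended()(x=1, y=2))

assert before_key == after_key      # same name, same mapping
assert before_key != extended_key   # mapping differs
```

In this toy model, changing the declaration style leaves the key alone, while renaming the class or adding a field does not.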

Why we make them equivalent

  • Persistence is the point. fleche is a persistent cache. Migrating from @dataclass to @attrs.define (or back) is a routine refactor that should not silently throw away potentially expensive cached results.

  • Both are record types. The digest already only inspects attribute names and values; it never reads __init__, __eq__, __hash__, or any of the other generated dunders. Through that lens, an attrs record and a dataclass record with identical contents are identical.

  • Symmetric with numeric equivalence. digest(1) == digest(1.0) == digest(1+0j) for the same reason: when two objects of different concrete types denote the same value, the digest collapses them.

Boundaries to keep in mind

  • Equivalence is by class name, not module path. Two declarations with the same __name__ but in different modules collide. This was already true for dataclasses; it now extends to attrs. If you have multiple classes named Point in your project that mean different things, give them distinct names or override the digest per class (see Customizing Digests).

  • Class-level construction logic is bypassed on load. When fleche reads a destructured attrs / dataclass instance back from value storage, it reconstructs it via object.__new__() plus object.__setattr__(), intentionally bypassing __init__, __post_init__, attrs converters, and attrs validators. The same instance going into the cache may have been constructed through any of those; coming back out it will not. In particular, validators do not re-run on load. Make sure your validators express invariants of the data, not side effects to perform on construction.

  • Runtime semantics can still differ. Options such as slots, frozen, custom __eq__ / __hash__, attrs’s eq_key/order_key, … do not affect the digest. Two records with the same digest may still compare or iterate differently. The digest tells you when fleche will reuse a cached result; it does not tell you the two instances are observationally identical.
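The load-time behaviour described in the second bullet can be reproduced with the standard library alone. This is a minimal sketch of the object.__new__() plus object.__setattr__() technique, not fleche's loader: rebuild and Celsius are hypothetical names, and a __post_init__ check stands in for an attrs validator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Celsius:
    degrees: float

    def __post_init__(self):
        # Runs on normal construction only.
        if self.degrees < -273.15:
            raise ValueError("below absolute zero")


def rebuild(cls, mapping):
    """Recreate an instance without calling __init__ or __post_init__."""
    obj = object.__new__(cls)
    for name, value in mapping.items():
        # object.__setattr__ sidesteps the frozen-dataclass guard as well.
        object.__setattr__(obj, name, value)
    return obj


# A valid record round-trips to an equal instance.
restored = rebuild(Celsius, {"degrees": 21.5})
assert restored == Celsius(21.5)

# The check is skipped on rebuild: an invalid value comes back unchallenged.
invalid = rebuild(Celsius, {"degrees": -500.0})
assert invalid.degrees == -500.0

# The regular constructor still rejects the same value.
try:
    Celsius(-500.0)
except ValueError:
    constructor_rejected = True
else:
    constructor_rejected = False
assert constructor_rejected
```

The practical consequence is that a check which passed when the value was first cached is the only check the data ever receives.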

Opting out per type

If a particular class needs stricter scoping than “same name + same fields”, give it a custom __digest__ (see Customizing Digests). For example, to scope by fully qualified path:

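from dataclasses import dataclass
from fleche.digest import digest

@dataclass
class Point:
    x: int
    y: int

    def __digest__(self):
        # Include the module and qualified name so that same-named classes
        # in different modules no longer share a digest.
        return digest((f"{type(self).__module__}.{type(self).__qualname__}",
                       self.x, self.y))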

A custom __digest__ short-circuits the built-in dataclass / attrs handling and takes precedence over it.
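To see why the qualified path is a stricter scope than the bare class name, the following stand-alone sketch compares the two for a pair of same-named classes. It does not call fleche; qualified_name and the two factory functions are hypothetical, and nesting the classes in functions stands in for declaring them in different modules.

```python
def make_geometry_point():
    class Point:
        pass
    return Point


def make_billing_point():
    class Point:
        pass
    return Point


def qualified_name(cls) -> str:
    """Build the same module-plus-qualname string used in the example above."""
    return f"{cls.__module__}.{cls.__qualname__}"


GeometryPoint = make_geometry_point()
BillingPoint = make_billing_point()

# The bare names collide, which is what name-based equivalence sees.
assert GeometryPoint.__name__ == BillingPoint.__name__ == "Point"

# The qualified strings differ, so a digest built from them would too.
assert qualified_name(GeometryPoint) != qualified_name(BillingPoint)
```

In a real project the difference would usually come from __module__ rather than __qualname__. Note the trade-off: moving or renaming the module then changes the string, so results cached under the old path are no longer matched.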