Digest Equivalence
==================

The :func:`fleche.digest.digest` function turns Python objects into SHA-256 hex strings.
Those strings are the *only* thing the cache compares, so two objects that produce the
same digest are, for caching purposes, interchangeable.  Most of the time this is
exactly what you want: ``digest(1) == digest(1.0) == digest(1+0j)`` because all three
hold the same numeric value, and ``digest((1, 2)) != digest([1, 2])`` because the
container type matters.  This page documents the deliberate equivalences ``fleche``
guarantees, and the boundaries you should keep in mind.

The digest function is content-based: it walks each value, salts the running hash
with ``type(value).__name__`` for type discrimination, and folds in a representation
that is stable across processes and Python versions (where the underlying object's
representation allows it).  Type discrimination is relaxed deliberately when two values
*of different concrete Python types* are equivalent at the value level (for example ``1``
and ``1.0``, or an ``int`` subclass with the same numeric value): in that case they
produce the same digest.
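
As a rough illustration of the idea (not ``fleche``'s actual implementation), the
sketch below builds a toy content-based digest with :mod:`hashlib`.  The names
``toy_digest``, ``_feed`` and ``_canonical_number`` are hypothetical, and the way
numbers are canonicalised is an assumption made for the example.

.. code-block:: python

    import hashlib


    def _canonical_number(value):
        """Collapse int / float / complex values that denote the same number."""
        if isinstance(value, complex) and value.imag == 0:
            value = value.real
        if isinstance(value, float) and value.is_integer():
            value = int(value)
        if isinstance(value, int):
            # Strips int subclasses.  bool is an int subclass, so this toy also
            # folds True into 1; the real library may treat bool differently.
            value = int(value)
        return repr(value)


    def _feed(h, value):
        """Fold one value into the running hash, salted with a type tag."""
        if isinstance(value, (int, float, complex)):
            h.update(b"number:" + _canonical_number(value).encode("utf-8"))
        elif isinstance(value, str):
            data = value.encode("utf-8")
            h.update(f"str:{len(data)}:".encode("utf-8") + data)
        elif isinstance(value, (tuple, list)):
            # The container's type name is part of the hash input.
            h.update(type(value).__name__.encode("utf-8") + b"[")
            for item in value:
                _feed(h, item)
                h.update(b",")
            h.update(b"]")
        else:
            raise TypeError(f"unsupported type: {type(value).__name__}")


    def toy_digest(value):
        """Return a SHA-256 hex string for a small set of supported types."""
        h = hashlib.sha256()
        _feed(h, value)
        return h.hexdigest()


    assert toy_digest(1) == toy_digest(1.0) == toy_digest(1 + 0j)
    assert toy_digest((1, 2)) != toy_digest([1, 2])

Numbers share one tag and a canonical form, so they collapse; containers keep their
type name, so a tuple and a list with the same items do not.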

Dataclasses and ``attrs`` classes
---------------------------------

A stdlib :func:`dataclasses.dataclass` instance and an ``attrs``-decorated instance
hash identically when:

* they share the same class ``__name__``, and
* they expose the same ``(field_name, field_value)`` mapping.

This means you can convert a class from one record framework to the other without
invalidating any already-cached call that took an instance of that class as an
argument (or returned one).

.. code-block:: python

    from dataclasses import dataclass
    import attrs
    from fleche.digest import digest

    # before the migration
    @dataclass
    class Point:
        x: int
        y: int

    before = digest(Point(x=1, y=2))

    # after the migration — same name, same fields
    @attrs.define
    class Point:
        x: int
        y: int

    after = digest(Point(x=1, y=2))

    assert before == after
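
One way to picture why the two frameworks can agree is to reduce each instance to
the same framework-neutral key before hashing.  The helper below is a hypothetical
sketch, not the library's code: ``record_key`` is an invented name, and treating the
fields as an order-insensitive mapping is an assumption.  It relies only on
:func:`dataclasses.fields` and on the ``__attrs_attrs__`` class attribute that
``attrs`` sets on decorated classes.

.. code-block:: python

    import dataclasses


    def record_key(obj):
        """Return (class name, sorted field items) for a record instance."""
        cls = type(obj)
        if dataclasses.is_dataclass(obj):
            names = [f.name for f in dataclasses.fields(obj)]
        elif hasattr(cls, "__attrs_attrs__"):
            # attrs keeps its per-field metadata on the class under this name
            names = [a.name for a in cls.__attrs_attrs__]
        else:
            raise TypeError(f"not a record type: {cls.__name__}")
        # Only the bare class name is used; the module path is ignored.
        return cls.__name__, tuple(sorted((n, getattr(obj, n)) for n in names))


    @dataclasses.dataclass
    class Point:
        x: int
        y: int


    key = record_key(Point(x=1, y=2))
    # key == ("Point", (("x", 1), ("y", 2)))

An ``attrs`` class named ``Point`` with the same two fields would reduce to the same
key under this sketch, which is the property the migration example relies on.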

Why we make them equivalent
~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Persistence is the point.**  ``fleche`` is a *persistent* cache.  Migrating from
  ``@dataclass`` to ``@attrs.define`` (or back) is a routine refactor that should not
  silently throw away potentially expensive cached results.
* **Both are record types.**  The digest already only inspects attribute names and
  values; it never reads ``__init__``, ``__eq__``, ``__hash__``, or any of the other
  generated dunders.  Through that lens, an attrs record and a dataclass record with
  identical contents *are* identical.
* **Symmetric with numeric equivalence.**  ``digest(1) == digest(1.0) == digest(1+0j)``
  for the same reason: when two objects of different concrete types denote the same
  value, the digest collapses them.

Boundaries to keep in mind
~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Equivalence is by class name, not module path.**  Two declarations with the same
  ``__name__`` but in different modules collide.  This was already true for
  dataclasses; it now extends to attrs.  If you have multiple classes named ``Point``
  in your project that mean different things, give them distinct names or override
  the digest per class (see :doc:`/dev/custom_digests`).
* **Class-level construction logic is bypassed on load.**  When ``fleche`` reads a
  destructured attrs / dataclass instance back from value storage, it reconstructs it
  via :py:meth:`object.__new__` plus :py:meth:`object.__setattr__`, intentionally
  bypassing ``__init__``, ``__post_init__``, attrs converters, and attrs validators.
  The same instance going *into* the cache may have been constructed through any of
  those; coming back out it will not.  In particular, validators do not re-run on
  load.  Make sure your validators express *invariants of the data*, not *side
  effects to perform on construction*.
* **Different runtime semantics still differ at runtime.**  Options such as ``slots``,
  ``frozen``, custom ``__eq__`` / ``__hash__``, and attrs's ``eq_key`` / ``order_key``
  do not affect the digest.  Two records that hash the same may still compare or
  iterate differently.  The digest tells you when ``fleche`` will reuse a cached
  result; it does not tell you the two instances are observationally identical.
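
The load-time behaviour described in the second point can be reproduced with the
standard library alone.  The following is a minimal sketch under stated assumptions:
``Ratio`` and ``rebuild`` are hypothetical names, and the stored form is assumed to be
a plain ``{field_name: value}`` dictionary, which may not match how ``fleche`` lays
out value storage.

.. code-block:: python

    from dataclasses import dataclass, fields


    @dataclass(frozen=True)
    class Ratio:
        num: int
        den: int

        def __post_init__(self):
            # Runs on normal construction only.
            if self.den == 0:
                raise ValueError("den must be non-zero")


    def rebuild(cls, stored):
        """Recreate an instance from stored field values without calling __init__."""
        obj = object.__new__(cls)
        for f in fields(cls):
            # object.__setattr__ also sidesteps the frozen-instance guard.
            object.__setattr__(obj, f.name, stored[f.name])
        return obj


    # Ratio(1, 0) would raise ValueError, but rebuilding skips __post_init__.
    restored = rebuild(Ratio, {"num": 1, "den": 0})
    assert restored.den == 0

Because nothing re-validates the restored object, a check placed in ``__post_init__``
protects construction but not loading.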

Opting out per type
~~~~~~~~~~~~~~~~~~~

If a particular class needs stricter scoping than "same name + same fields", give it
a custom ``__digest__`` (see :doc:`/dev/custom_digests`).  For example, to scope by
fully qualified path:

.. code-block:: python

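    from dataclasses import dataclass

    from fleche.digest import digest

    @dataclass
    class Point:
        x: int
        y: int

        def __digest__(self):
            # Include the module and qualified name so that same-named classes
            # in different modules no longer share a digest.
            qualified = f"{type(self).__module__}.{type(self).__qualname__}"
            return digest((qualified, self.x, self.y))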

A custom ``__digest__`` short-circuits the dataclass / attrs path and takes
precedence over both built-in cases.
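
The precedence rule can be pictured as a dispatcher that checks for the hook before
falling back to the record path.  This is a hypothetical sketch rather than the
library's dispatch code: ``toy_record_digest``, ``Plain`` and ``Scoped`` are invented
names, and having ``__digest__`` return a hex string is an assumption made for the
example.

.. code-block:: python

    import hashlib
    from dataclasses import dataclass, fields, is_dataclass


    def toy_record_digest(obj):
        """Hypothetical dispatcher: a class-level __digest__ wins over the record path."""
        custom = getattr(type(obj), "__digest__", None)
        if custom is not None:
            return custom(obj)  # the custom hook short-circuits everything below
        if is_dataclass(obj):
            items = sorted((f.name, repr(getattr(obj, f.name))) for f in fields(obj))
            payload = repr((type(obj).__name__, items))
            return hashlib.sha256(payload.encode("utf-8")).hexdigest()
        raise TypeError(f"unsupported type: {type(obj).__name__}")


    @dataclass
    class Plain:
        x: int


    @dataclass
    class Scoped:
        x: int

        def __digest__(self):
            # Assumed convention for this sketch: return a hex string.
            scope = f"{type(self).__module__}.{type(self).__qualname__}"
            return hashlib.sha256(f"{scope}:{self.x}".encode("utf-8")).hexdigest()


    plain = toy_record_digest(Plain(x=1))    # built-in record path
    scoped = toy_record_digest(Scoped(x=1))  # custom __digest__ path
    assert scoped == Scoped(x=1).__digest__()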