fleche.caches

Attributes

`logger`
`DigestedIterable`
`DigestedDict`

Exceptions

Rejected

Cache refused to cache the call for some reason or other.

Classes

`BaseCache`	Minimal base that exposes the `_operation_context()` hook.
`Cache`	Mixin that locks per-key so concurrent ops on different keys proceed in parallel.
`CacheWrapper`	Forwarding base class: all BaseCache methods delegate to `self.cache`.
`ReadOnlyMixin`	Read-only behaviour: `save` and `evict` raise `Rejected`.
`ReadOnlyCache`	A cache that can only be read from.
`FilteringMixin`	Filters `load` and `_query` results by a predicate.
`FilteredCache`	A read-only view of a cache that only exposes calls matching a predicate.
`RefreshingCache`	A cache that forces re-execution by always missing on load.
`_MultiCache`	Shared read fan-out for caches that aggregate several member caches.
`CacheStack`	A combination of caches with a shared traversal policy.
`CachePool`	A read-only collection of caches queried as one.
`SizeLimitedMixin`	Mixin that enforces a maximum number of cached calls with random eviction.
`SizeLimitedCache`	A `Cache` that enforces a maximum number of cached calls.

Functions

_combine_shrink(→ fleche.digest.Digest)

Reduce sub-storage shrink results to the longest (safest) prefix.

Module Contents

fleche.caches.logger[source]

exception fleche.caches.Rejected[source]

Bases: Exception

Cache refused to cache the call for some reason or other.

fleche.caches.DigestedIterable[source]

fleche.caches.DigestedDict[source]

class fleche.caches.BaseCache[source]

Bases: fleche.storage.base.OperationContext

Minimal base that exposes the _operation_context() hook.

Both KeyManagement (storage layer) and BaseCache (cache layer) inherit from OperationContext so that the same thread-safety mixins (SerializingMixin, PerKeyLockMixin) can attach to either layer without duplication.

classmethod from_config(config: dict[str, Any] | list[dict[str, Any]]) → BaseCache[source]

Build a cache from a config dict/list.

Thin wrapper around fleche.config.cache_from_config(). The concrete cache type is chosen from the shape of config (plain cache, stack, pool, size-limited, read-only, template shorthand, …), so the returned instance is not necessarily of type cls.

fleche.config is imported lazily here because it imports this module at import time; a module-level import would be circular.

abstractmethod save(call: fleche.call.Call) → str[source]

abstractmethod load(key: str) → fleche.call.LazyCall[source]

abstractmethod load_value(key: str) → Any[source]

abstractmethod evict(key: str | fleche.digest.Digest) → None[source]

contains(key: str) → bool[source]

transfer(other: BaseCache, pop: bool = False, overwrite: bool = False) → None[source]

Transfer all calls from this cache to another cache.

Parameters:

other – The destination cache.
pop – If True, evict transferred keys from the source cache after moving.
overwrite – If True, overwrite existing entries in the target cache. If False (default), skip entries that already exist in the target.

_transfer_one(c: fleche.call.LazyCall, *, overwrite: bool = False) → bool[source]

Atomically replay one call into this cache, honouring overwrite.

Holds this cache’s per-key operation context across the whole contains → save sequence, so the “skip if already present” decision cannot race a concurrent writer (the #452 contains → save TOCTOU). The context is reentrant, so contains / save re-entering it for the same key do not deadlock. Keeping this on the cache (rather than in transfer()) means the query layer only ever calls public cache methods, and each cache encapsulates its own locking: CacheWrapper and CacheStack override _operation_context() so wrapper/stack targets lock their real inner Cache rather than the no-op base context.

Parameters:

c – the call to transfer; fetched from its source cache only on the non-conflict path, so a skipped transfer pays no deserialisation.
overwrite – if True, write even when a conflicting entry exists.

Returns:

True if the call was written, False if it was skipped because a conflicting entry already exists.
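The check-then-write pattern described above can be sketched with a hypothetical in-memory `ToyCache`. None of these names come from fleche: the per-key `RLock` merely stands in for `_operation_context()`, and `fetch` stands in for reading the call from its source cache.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class ToyCache:
    """Hypothetical in-memory cache used only for illustration."""

    def __init__(self):
        self._data = {}
        self._table_lock = threading.Lock()
        self._key_locks = defaultdict(threading.RLock)

    @contextmanager
    def _operation_context(self, key):
        # Find the per-key lock under the table lock, then hold only the key lock.
        with self._table_lock:
            lock = self._key_locks[key]
        with lock:
            yield

    def contains(self, key):
        with self._operation_context(key):  # re-entrant, safe while already held
            return key in self._data

    def save(self, key, value):
        with self._operation_context(key):
            self._data[key] = value

    def transfer_one(self, key, fetch, *, overwrite=False):
        """Write fetch() under key unless an entry exists and overwrite is False."""
        with self._operation_context(key):
            if not overwrite and self.contains(key):
                return False  # skipped, so fetch() is never called
            self.save(key, fetch())
            return True


target = ToyCache()
target.save("abc", "old")
fetched = []


def fetch():
    fetched.append(1)
    return "new"


skipped = target.transfer_one("abc", fetch)                  # entry exists
written = target.transfer_one("abc", fetch, overwrite=True)  # forced write
```

Because the lock is held across both the check and the write, no other thread can insert the key between the two steps; the skipped path also never pays for `fetch()`.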

readonly() → ReadOnlyCache[source]: Return a read-only view of this cache.

push(cache: BaseCache) → CacheStack[source]

abstractmethod expand(key: fleche.digest.Digest | str) → fleche.digest.Digest[source]

Expand a short digest prefix to its full-length digest.

Parameters:

key (str or Digest) – the short digest prefix to expand

Returns:

the full-length digest

Return type:

Digest

Raises:

KeyError – if the key is not found
AmbiguousDigestError – if the prefix matches more than one entry
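As a rough illustration of the contract above, the following sketch expands a prefix against a plain list of string keys. `AmbiguousPrefixError` and the free function `expand` are hypothetical stand-ins, not fleche's `AmbiguousDigestError` or the real method.

```python
class AmbiguousPrefixError(Exception):
    """Hypothetical stand-in for AmbiguousDigestError."""


def expand(prefix, keys):
    """Return the single key starting with prefix, or raise."""
    matches = [k for k in keys if k.startswith(prefix)]
    if not matches:
        raise KeyError(prefix)
    if len(matches) > 1:
        raise AmbiguousPrefixError(prefix)
    return matches[0]


stored = ["a1b2c3", "a1ff00", "9e8d7c"]


def describe(prefix):
    # Small helper that turns the three possible outcomes into strings.
    try:
        return expand(prefix, stored)
    except AmbiguousPrefixError:
        return "ambiguous"
    except KeyError:
        return "missing"
```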

shrink(key: fleche.digest.Digest | str, /) → fleche.digest.Digest[source]

shrink(key: fleche.digest.Digest | str, /, *keys: fleche.digest.Digest | str) → tuple[Digest, ...]

Find the shortest substring(s) that unambiguously reference each call.

With a single key, returns one Digest. With multiple keys, returns a tuple of Digest in the same order as the inputs; the batched form lets sub-storages list their keys once instead of per-key, which matters on backends where listing is expensive (e.g. SQL, filesystem).

Each input key must belong to one of the sub-storages (call or value). Mixing call keys and value keys in a single call is undefined behaviour — the result depends on internal partitioning order and may change without notice.

Warning

The result depends on how many values are currently in your storage! A key returned from this function may become ambiguous in the future when more values are added. Do not rely on this function in your programs; it is provided as a convenience for users only!

Parameters: *keys (str or Digest) – one or more keys to shorten
Returns: Digest (single key) or tuple of Digest (multiple)
Raises: AmbiguousDigestError – if no shorter key is possible for any input
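To make the warning concrete, here is a minimal sketch of shortest-unique-prefix shrinking over plain string keys. The free function `shrink` is hypothetical and ignores the call/value partitioning of the real method; it only shows why a short key can stop being unique once more entries are stored.

```python
def shrink(key, keys):
    """Return the shortest prefix of key that no other stored key shares."""
    others = [k for k in keys if k != key]
    for n in range(1, len(key) + 1):
        prefix = key[:n]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    raise ValueError(f"no unambiguous prefix for {key!r}")


stored = ["a1b2c3", "a1ff00", "9e8d7c"]
before = shrink("a1b2c3", stored)

# Adding one more entry makes the previously returned prefix ambiguous.
stored.append("a1b999")
after = shrink("a1b2c3", stored)
stale_matches = [k for k in stored if k.startswith(before)]
```

The prefix returned before the insertion now matches two entries, which is why shortened keys should not be persisted or hard-coded.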

abstractmethod _shrink(*keys: fleche.digest.Digest | str) → tuple[Digest, ...][source]: Partition and shrink all keys; always returns a same-length tuple of short digests.

abstractmethod _query(call: BaseCache._query.call) → Iterable[fleche.call.LazyCall][source]

query(template: fleche.call.QueryCall | None = None, **kwargs) → fleche.query.QueryIterator[source]

Query the cache for matching calls.

Accepts either a QueryCall as the first positional argument, or the same keyword arguments that QueryCall accepts. Omitted fields default to None (wildcard). Passing both a template and keyword arguments raises TypeError.

Examples:

cache.query(name="my_func")
cache.query(name="my_func", arguments={"x": 1})
cache.query(QueryCall(name="my_func"))  # existing form still works
cache.query()  # all calls

Returns: QueryIterator

table(arguments: Iterable[str] | str | Literal[True] = (), results=False, shrink_keys: bool = True) → pandas.DataFrame[source]

Return a pandas DataFrame summarizing cached calls via query().

This implementation uses a fully-wildcard Call template to retrieve all calls through self.query and then flattens metadata keys into top-level columns for convenience.

By default, arguments and results are elided.

The DataFrame index will be the lookup key (digest) of each call. Columns are:

name: the function name

module: the module name

result: the call result, included only if the results argument is True

metadata fields are flattened and added as columns directly

If given argument names collide with any of the above columns, they are prefixed by ‘a_’. Only requested arguments are loaded from cache.

Parameters:

arguments – add the given arguments (of the queried calls) as columns to the table. Pass True to add all arguments, or a single string as a shortcut for a one-element tuple.
results (bool) – if True, add results of queried calls to table
shrink_keys (bool) – if True (default), shrink each index entry to its shortest unambiguous prefix. Set to False to keep full-length digests.

Returns:

table of all calls on cache

Return type:

pandas.DataFrame

filter(predicate: Callable[[fleche.call.Call | fleche.call.LazyCall], bool] | fleche.call.QueryCall) → FilteredCache[source]

Create a read-only view of this cache that only exposes calls matching the predicate.

Parameters: predicate – A function that takes a Call or LazyCall and returns True if it should be included in the new cache, or a QueryCall object to use as a template.
Returns: A read-only view of the cache.
Return type: FilteredCache

fleche.caches._combine_shrink(key: Digest | str, results: Iterable[Digest]) → fleche.digest.Digest[source]

Reduce sub-storage shrink results to the longest (safest) prefix.

Raises: KeyError – if no results were found.
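One plausible reading of "longest (safest) prefix", sketched on plain strings: each sub-storage returns a prefix of the same key that is unambiguous within that storage, and any longer prefix of that key stays unambiguous there, so the longest result is safe in every storage that reported one. The function below is an assumption for illustration, not the actual implementation.

```python
def combine_shrink(key, results):
    """Pick the longest of several prefixes of the same key."""
    results = list(results)
    if not results:
        raise KeyError(key)
    return max(results, key=len)


combined = combine_shrink("a1b2c3", ["a1", "a1b"])

try:
    combine_shrink("a1b2c3", [])
    empty_raises = False
except KeyError:
    empty_raises = True
```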

class fleche.caches.Cache[source]

Bases: fleche.storage.thread_safe.PerKeyLockMixin, BaseCache

Mixin that locks per-key so concurrent ops on different keys proceed in parallel.

A lightweight threading.Lock guards the lock-table itself; once the per-key RLock is obtained the table lock is released, so two threads operating on different keys never block each other. Operations on the same key are serialized by the per-key lock, which is reentrant to allow nested calls (e.g. expand inside load).

Instances must be hashable. Place before the concrete storage class in the MRO:

@dataclass(frozen=True)
class PerKeyValuePickle(PerKeyLockMixin, ValuePickleFile): ...
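The lock-table design described above can be sketched independently of fleche. `PerKeyLocks` is a hypothetical helper, not `PerKeyLockMixin` itself: a plain `Lock` guards the dictionary of locks, and each key gets its own `RLock`.

```python
import threading


class PerKeyLocks:
    """Hypothetical lock table used only for illustration."""

    def __init__(self):
        self._table_lock = threading.Lock()  # guards the dict, nothing else
        self._locks = {}

    def lock_for(self, key):
        with self._table_lock:
            if key not in self._locks:
                self._locks[key] = threading.RLock()
            return self._locks[key]


locks = PerKeyLocks()
results = {}


def try_acquire(key):
    # Non-blocking attempt made from a second thread.
    lock = locks.lock_for(key)
    acquired = lock.acquire(blocking=False)
    if acquired:
        lock.release()
    results[key] = acquired


with locks.lock_for("a"):
    with locks.lock_for("a"):  # re-entrant: same thread, same key
        for key in ("a", "b"):
            worker = threading.Thread(target=try_acquire, args=(key,))
            worker.start()
            worker.join()
```

While the main thread holds the lock for "a", the second thread cannot take "a" but acquires "b" immediately, and the nested `with` on "a" does not deadlock.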

values: fleche.storage.ValueStorage[source]

calls: fleche.storage.CallStorage[source]

load_value(key)[source]

save(call: fleche.call.Call) → str[source]

load(key: str) → fleche.call.LazyCall[source]

contains(key: str) → bool[source]

expand(key: fleche.digest.Digest | str) → fleche.digest.Digest[source]

Expand a short digest prefix to its full-length digest.

Parameters:

key (str or Digest) – the short digest prefix to expand

Returns:

the full-length digest

Return type:

Digest

Raises:

KeyError – if the key is not found
AmbiguousDigestError – if the prefix matches more than one entry

_shrink(*keys: fleche.digest.Digest | str) → tuple[Digest, ...][source]: Partition and shrink all keys; always returns a same-length tuple of short digests.

_query(call: Cache._query.call) → Iterable[fleche.call.LazyCall][source]

Query for cached calls that match a template and return decoded results.

This delegates to the underlying CallStorage.query() using the provided template call. Any digested argument values and the result are decoded via this cache’s value storage before yielding.

Parameters: call – A Call instance used as a template; fields set to None act as wildcards. For arguments and result, comparisons follow digest semantics (i.e., values are matched by their digest).
Yields: Call | LazyCall – Matching calls with arguments and result decoded from digests where possible.

evict(key: str | fleche.digest.Digest) → None[source]

redigest() → None[source]

Ensures consistent cache keys in case digest function changed.

This may take time depending on cache size.

gc() → set[fleche.digest.Digest][source]

Evict value entries not reachable from any stored call.

Brute-force mark-and-sweep: walks every call record to build the set of directly-referenced value digests, then transitively follows destructured sub-references (via DestructuringMixin.child_digests() on storages that satisfy HasChildDigests), and evicts every values key outside the reachable set. Call records are left untouched.

Returns: The set of digests that were evicted from value storage.
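The mark-and-sweep idea can be sketched with plain dictionaries. The data layout here is an assumption made for the example (`calls` maps a call key to the value digests it references, `children` maps a value digest to its sub-references) and does not reflect fleche's storage classes.

```python
def gc(calls, values, children):
    """Delete every entry of values not reachable from calls; return what was deleted."""
    reachable = set()
    # Mark: start from digests referenced directly by call records.
    pending = [digest for refs in calls.values() for digest in refs]
    while pending:
        digest = pending.pop()
        if digest in reachable:
            continue
        reachable.add(digest)
        # Follow destructured sub-references transitively.
        pending.extend(children.get(digest, ()))
    # Sweep: evict everything outside the reachable set.
    evicted = set(values) - reachable
    for digest in evicted:
        del values[digest]
    return evicted


calls = {"call1": ["v1", "v2"]}
values = {"v1": 1, "v2": "parent", "v3": "child of v2", "v4": "orphan", "v5": "child of v4"}
children = {"v2": ["v3"], "v4": ["v5"]}
evicted = gc(calls, values, children)
```

Note that `v3` survives only because it is reachable through `v2`, while `v5` is swept together with its unreachable parent `v4`; the call records themselves are never modified.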

class fleche.caches.CacheWrapper[source]

Bases: BaseCache

Forwarding base class: all BaseCache methods delegate to self.cache.

Combine with behaviour mixins (ReadOnlyMixin, FilteringMixin) to build concrete wrapper classes without redeclaring cache.

cache: BaseCache[source]

save(call: fleche.call.Call) → str[source]

load(key: str) → fleche.call.LazyCall[source]

load_value(key: str) → Any[source]

contains(key: str) → bool[source]

evict(key: str | fleche.digest.Digest) → None[source]

expand(key: fleche.digest.Digest | str) → fleche.digest.Digest[source]

Expand a short digest prefix to its full-length digest.

Parameters:

key (str or Digest) – the short digest prefix to expand

Returns:

the full-length digest

Return type:

Digest

Raises:

KeyError – if the key is not found
AmbiguousDigestError – if the prefix matches more than one entry

_shrink(*keys: fleche.digest.Digest | str) → tuple[Digest, ...][source]: Partition and shrink all keys; always returns a same-length tuple of short digests.

_query(call: CacheWrapper._query.call) → Iterable[fleche.call.LazyCall][source]

_operation_context(key, *, intent: fleche.storage.base.Intent = Intent.WRITE)[source]

Context manager entered around every operation on key.

The base implementation is a no-op. Override in a mixin to inject any resource scoped to the operation — a threading lock, a SQLAlchemy session, an open file handle, a decompression stream, etc.

Receiving key lets implementations choose between a single global resource (ignore the key) or per-key resources (e.g. a striped lock table or a key-specific file handle).

intent describes the kind of operation being performed. Mixins may use it to choose between exclusive and shared locks. Currently the only defined value is Intent.WRITE (the default).

Composing multiple mixins: use super() to chain so that every mixin in the MRO gets to wrap the operation:

@contextlib.contextmanager
def _operation_context(self, key, *, intent=Intent.WRITE):
    with self._lock:                   # this mixin's resource
        with super()._operation_context(key, intent=intent):
            yield
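A self-contained variant of the snippet above shows the order in which chained contexts are entered and left. The classes `Base`, `Locking`, `Tracing` and `Store` are hypothetical, and the `intent` parameter is omitted for brevity.

```python
import contextlib

events = []


class Base:
    @contextlib.contextmanager
    def _operation_context(self, key):
        yield  # no-op base context


class Locking(Base):
    @contextlib.contextmanager
    def _operation_context(self, key):
        events.append("lock")
        try:
            with super()._operation_context(key):
                yield
        finally:
            events.append("unlock")


class Tracing(Base):
    @contextlib.contextmanager
    def _operation_context(self, key):
        events.append("trace-enter")
        try:
            with super()._operation_context(key):
                yield
        finally:
            events.append("trace-exit")


class Store(Locking, Tracing):
    def save(self, key):
        with self._operation_context(key):
            events.append("save " + key)


Store().save("k")
```

Each mixin wraps the next one in the MRO, so the first base listed becomes the outermost context.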

class fleche.caches.ReadOnlyMixin[source]

Read-only behaviour: save and evict raise Rejected.

Field-free and base-free, so it composes onto any cache layout — a single-cache wrapper (ReadOnlyCache, FilteredCache) or a multi-cache view (CachePool). Place it first in the bases so its save/evict win over a forwarding/aggregating implementation.

It is also the marker fleche.remote._is_read_only() keys on, so any cache mixing it in is recognised as read-only by the SSH layer (which then short-circuits save/evict without a round-trip).
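Why the mixin has to come first in the bases can be seen in a small sketch with hypothetical classes (`Forwarding`, `ReadOnly` and the two views are not fleche types; a dict plays the wrapped cache).

```python
class Rejected(Exception):
    """Hypothetical stand-in for fleche.caches.Rejected."""


class Forwarding:
    def __init__(self, inner):
        self.inner = inner

    def save(self, key, value):
        self.inner[key] = value

    def load(self, key):
        return self.inner[key]


class ReadOnly:
    def save(self, key, value):
        raise Rejected("read-only")


class ReadOnlyView(ReadOnly, Forwarding):  # mixin first: its save() wins
    pass


class BrokenView(Forwarding, ReadOnly):  # mixin last: the forwarding save() wins
    pass


store = {}
try:
    ReadOnlyView(store).save("k", 1)
    refused = False
except Rejected:
    refused = True

BrokenView(store).save("k", 2)  # silently writes through
```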

save(call: fleche.call.Call)[source]

evict(key: str | fleche.digest.Digest) → None[source]

class fleche.caches.ReadOnlyCache[source]

Bases: ReadOnlyMixin, CacheWrapper

A cache that can only be read from.

class fleche.caches.FilteringMixin[source]

Bases: CacheWrapper

Filters load and _query results by a predicate.

predicate: Callable[[fleche.call.Call | fleche.call.LazyCall], bool][source]

load(key: str) → fleche.call.LazyCall[source]

_query(call: FilteringMixin._query.call) → Iterable[fleche.call.LazyCall][source]

class fleche.caches.FilteredCache[source]

Bases: ReadOnlyMixin, FilteringMixin

A read-only view of a cache that only exposes calls matching a predicate.

class fleche.caches.RefreshingCache[source]

Bases: CacheWrapper

A cache that forces re-execution by always missing on load.

It forwards saves and value loads to an underlying cache, allowing new results to be stored while ensuring that existing ones are ignored for the duration of its use.

This is necessary to handle nested fleche calls during a rerun; otherwise, forcing them to re-execute would be awkward.
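The effect of an always-missing wrapper can be sketched as follows. `ToyRefreshing` and `cached_call` are hypothetical, a dict stands in for the wrapped cache, and signalling a miss with `KeyError` is an assumption of this example.

```python
class ToyRefreshing:
    """Hypothetical wrapper: every load misses, every save is forwarded."""

    def __init__(self, inner):
        self.inner = inner

    def load(self, key):
        raise KeyError(key)  # always a miss, even if inner holds the key

    def contains(self, key):
        return False

    def save(self, key, value):
        self.inner[key] = value


def cached_call(cache, key, compute):
    # Typical memoization flow: try the cache, fall back to computing.
    try:
        return cache.load(key)
    except KeyError:
        value = compute()
        cache.save(key, value)
        return value


inner = {"k": "stale"}
runs = []


def compute():
    runs.append(1)
    return "fresh"


result = cached_call(ToyRefreshing(inner), "k", compute)
```

The stale entry is ignored, the function runs again, and the fresh result replaces it in the underlying store.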

load(key: str) → fleche.call.LazyCall[source]

contains(key: str) → bool[source]

class fleche.caches._MultiCache[source]

Bases: BaseCache

Shared read fan-out for caches that aggregate several member caches.

Subclasses expose their members via _members and choose their own write / load() policy. Everything that only reads across the members — contains(), load_value(), expand(), _shrink(), _query() — plus the three private traversal helpers lives here, so CacheStack (an ordered, writable hierarchy) and CachePool (an unordered, read-only collection) share one implementation of the fan-out.

Each traversal helper implements one of the recurring patterns:

_first_hit() — return on the first success; raise if all miss.
_collect() — gather every success; caller combines the results.
_foreach() — apply to every member; swallow expected refusals.
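The three patterns listed above can be sketched as free functions over a list of dicts. This is an illustration under assumptions, not the private helpers themselves; in particular, this `first_hit` raises a new exception at the end rather than re-raising the last one.

```python
class Rejected(Exception):
    """Hypothetical stand-in for fleche.caches.Rejected."""


def first_hit(members, op, exc=KeyError):
    # Return on the first success; raise if every member misses.
    for member in members:
        try:
            return op(member)
        except exc:
            continue
    raise exc("no member could satisfy the request")


def collect(members, op, exc=KeyError):
    # Gather every success; misses are skipped.
    found = []
    for member in members:
        try:
            found.append(op(member))
        except exc:
            pass
    return found


def foreach(members, op, exc=(Rejected, KeyError)):
    # Apply to every member; swallow expected refusals.
    for member in members:
        try:
            op(member)
        except exc:
            pass


members = [{"a": 1}, {"a": 2, "b": 3}, {}]
hit = first_hit(members, lambda m: m["a"])
found = collect(members, lambda m: m["a"])
foreach(members, lambda m: m.pop("a"))  # pop raises KeyError on the empty dict
```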

abstract property _members: tuple[BaseCache, ...][source]

The member caches to fan out over, in traversal order.

load_value(key)[source]

contains(key: str) → bool[source]

expand(key: fleche.digest.Digest | str) → fleche.digest.Digest[source]

Expand a short digest prefix to its full-length digest.

Parameters:

key (str or Digest) – the short digest prefix to expand

Returns:

the full-length digest

Return type:

Digest

Raises:

KeyError – if the key is not found
AmbiguousDigestError – if the prefix matches more than one entry

_shrink(*keys: fleche.digest.Digest | str) → tuple[Digest, ...][source]: Partition and shrink all keys; always returns a same-length tuple of short digests.

_query(call: _MultiCache._query.call) → Iterable[fleche.call.LazyCall][source]

Aggregate query results across the members, avoiding duplicates.

The members are queried in order. Results are deduplicated by their lookup key (via Call.to_lookup_key()) and yielded in the order they are first seen.

Parameters: call – A template Call where None fields act as wildcards.
Yields: Call | LazyCall – Matching calls from any member, without duplicates.
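A minimal sketch of order-preserving deduplication across members, using tuples of `(lookup_key, name)` in place of real calls. The `matches` and `lookup_key` callables are hypothetical parameters introduced for the example.

```python
def merged_query(members, matches, lookup_key):
    """Yield matching calls from all members, keeping the first copy of each key."""
    seen = set()
    for member in members:  # members are visited in order
        for call in member:
            if not matches(call):
                continue
            key = lookup_key(call)
            if key in seen:
                continue  # a copy from an earlier member was already yielded
            seen.add(key)
            yield call


team = [("k1", "fit"), ("k2", "plot")]
archive = [("k2", "plot"), ("k3", "fit"), ("k1", "fit")]

everything = list(merged_query([team, archive], lambda c: True, lambda c: c[0]))
fits = list(merged_query([team, archive], lambda c: c[1] == "fit", lambda c: c[0]))
```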

_first_hit(op: Callable[[BaseCache], Any], *, exc: type[BaseException] = KeyError) → Any[source]

Return the first successful result from iterating the members.

Invokes op(cache) on each member in _members in order and returns immediately when a call does not raise exc. If every member raises exc the exception is re-raised.

This is the first-hit-wins pattern: used when any single cache can satisfy the request and earlier members are preferred (e.g. load_value()). The caller supplies the per-cache operation as a lambda so the key (or other closure state) is always available in the traceback without adding an extra helper argument.

Parameters:

op – Callable that accepts a single BaseCache and returns the desired result. Called at most once per member.
exc – Exception class treated as a cache miss. Defaults to KeyError. Must be a single type (not a tuple) because it is also used in the raise at the end.

Raises:

exc – If every member raises exc.

_collect(op: Callable[[BaseCache], Any], *, exc: type[BaseException] = KeyError) → list[source]

Collect one result per member, skipping misses.

Invokes op(cache) on every member in _members and appends each non-raising result to a list. Members that raise exc are silently skipped; all other exceptions propagate normally.

This is the collect-and-combine pattern: used when all members may hold relevant data and the caller needs to aggregate results before returning (e.g. expand() and _shrink(), which pass the collected list to _resolve_prefix/_combine_shrink).

Parameters:

op – Callable that accepts a single BaseCache and returns a result to collect. Called exactly once per member.
exc – Exception class to treat as a miss and skip. Defaults to KeyError.

Returns:

A list of all non-raising results in member order. May be empty when every member misses; the caller is responsible for handling that case (typically by raising KeyError).

_foreach(op: Callable[[BaseCache], None], *, exc: type[BaseException] | tuple[type[BaseException], ...] = (Rejected, KeyError)) → None[source]

Apply an operation to every member, swallowing refusals.

Invokes op(cache) on every member in _members unconditionally. Exceptions of type exc are caught and discarded; any other exception propagates normally.

This is the apply-everywhere pattern: used when an operation should be attempted on all members regardless of whether individual caches support it (e.g. CacheStack.evict(), where read-only caches raise Rejected and empty caches raise KeyError, and both are expected non-fatal outcomes).

Parameters:

op – Callable that accepts a single BaseCache. Its return value is ignored. Called exactly once per member.
exc – Exception type(s) to swallow. Defaults to (Rejected, KeyError) — the two standard refusal signals used across the cache hierarchy. Pass a tuple to swallow multiple types.

class fleche.caches.CacheStack[source]

Bases: fleche.storage.thread_safe.PerKeyLockMixin, _MultiCache

A combination of caches with a shared traversal policy.

Saving always targets the lowest level (stack[0]); loading traverses from stack[0] upward and back-fills any hit into stack[0]. The back-fill is serialized per key (via PerKeyLockMixin) so that concurrent loads of the same missing key do not all run the base cache’s non-atomic check-evict-save at once.

All multi-cache fan-out is inherited from _MultiCache’s three private traversal helpers.

stack: tuple[BaseCache, ...][source]

property _members: tuple[BaseCache, ...][source]: The member caches to fan out over, in traversal order.

__post_init__()[source]

save(call: fleche.call.Call)[source]

_operation_context(key, *, intent: fleche.storage.base.Intent = Intent.WRITE)[source]

Context manager entered around every operation on key.

The base implementation is a no-op. Override in a mixin to inject any resource scoped to the operation — a threading lock, a SQLAlchemy session, an open file handle, a decompression stream, etc.

Receiving key lets implementations choose between a single global resource (ignore the key) or per-key resources (e.g. a striped lock table or a key-specific file handle).

intent describes the kind of operation being performed. Mixins may use it to choose between exclusive and shared locks. Currently the only defined value is Intent.WRITE (the default).

Composing multiple mixins: use super() to chain so that every mixin in the MRO gets to wrap the operation:

@contextlib.contextmanager
def _operation_context(self, key, *, intent=Intent.WRITE):
    with self._lock:                   # this mixin's resource
        with super()._operation_context(key, intent=intent):
            yield

load(key) → fleche.call.LazyCall[source]

_backfill(key, lc: fleche.call.LazyCall) → None[source]

Transfer a hit from a higher cache into the base cache.

Serialized per key so that concurrent loads of the same missing key do not all run the base cache’s non-atomic check-evict-save at once. All concurrent loaders block on the per-key _operation_context() lock; the first one past the lock does the transfer, and every later waiter finds the key already present via contains() and returns without repeating the save.
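The back-fill behaviour can be sketched with dicts as cache levels. `ToyStack` is hypothetical, and for brevity it uses one lock for all keys, whereas the text above describes a per-key lock.

```python
import threading


class ToyStack:
    """Hypothetical stack of dict levels; levels[0] is the base."""

    def __init__(self, *levels):
        self.levels = levels
        self._lock = threading.RLock()  # simplification: one lock for all keys
        self.backfills = 0

    def load(self, key):
        for index, level in enumerate(self.levels):
            if key in level:
                value = level[key]
                if index > 0:
                    self._backfill(key, value)
                return value
        raise KeyError(key)

    def _backfill(self, key, value):
        with self._lock:
            if key in self.levels[0]:
                return  # an earlier loader already copied it
            self.levels[0][key] = value
            self.backfills += 1


base, upper = {}, {"k": 42}
stack = ToyStack(base, upper)
results = []
threads = [
    threading.Thread(target=lambda: results.append(stack.load("k")))
    for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Every loader gets the value, but only the first one past the lock performs the copy into the base level.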

push(cache: BaseCache) → CacheStack[source]

evict(key: str | fleche.digest.Digest) → None[source]

class fleche.caches.CachePool[source]

Bases: ReadOnlyMixin, _MultiCache

A read-only collection of caches queried as one.

Where CacheStack is an ordered, writable hierarchy (saves land on stack[0] and hits back-fill downward), a CachePool is an unordered, read-only aggregate: it never writes to any member. Use it to expose several independent caches — a teammate’s results directory, a shared read-only archive, last month’s run — as a single cache you can load(), contains(), query(), expand() and shrink() against without risking a write to any of them.

All reads fan out across caches:

load() / load_value() — first member to hold the key wins.
contains() — true if any member holds the key.
query() — union across members, deduplicated by lookup key.
expand() / shrink() — combined across members.

Read-only-ness is inherited from ReadOnlyMixin (so save and evict raise Rejected, and the SSH layer recognises the pool as read-only); the members are kept exactly as the caller supplied them. Unlike CacheStack, load does not back-fill a hit anywhere, so members are never mutated as a side effect of reading. The member order only decides which cache’s copy is returned on a load() collision; every member is an equally valid read source.

caches: tuple[BaseCache, ...][source]

property _members: tuple[BaseCache, ...][source]: The member caches to fan out over, in traversal order.

load(key: str) → fleche.call.LazyCall[source]

class fleche.caches.SizeLimitedMixin[source]

Bases: BaseCache

Mixin that enforces a maximum number of cached calls with random eviction.

Combine this with Cache (mixin first in MRO) to get a size-limited cache:

@dataclass
class SizeLimitedCache(SizeLimitedMixin, Cache):
    max_size: int

When a new call is saved and the number of cached calls exceeds max_size, a call record is selected for eviction via _pick_eviction_target(). Value storage is intentionally left untouched.

The concrete class must provide a max_size integer, which is provided automatically when mixed with Cache.

max_size: int[source]

_lock: fleche.storage.thread_safe._PicklableRLock[source]

_keys: set[str][source]

__post_init__(*args, **kwargs)[source]

_pick_eviction_target(keys: list[str]) → str[source]

Select the call to evict from a sample of cached call keys.

The default implementation chooses uniformly at random. Override this method to implement a different eviction policy without touching any other part of the class.

Parameters: keys – A non-empty list of all tracked call keys.
Returns: The key of the call that should be evicted.
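How an overridable eviction hook might look is sketched below with a hypothetical `ToySizeLimited` class backed by a dict. The `EvictOldest` subclass is an invented example policy and is not something the library is stated to provide.

```python
import random


class ToySizeLimited:
    """Hypothetical size-limited store with a pluggable eviction policy."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.calls = {}

    def _pick_eviction_target(self, keys):
        return random.choice(keys)  # default: uniformly at random

    def _enforce_size_limit(self):
        while len(self.calls) > self.max_size:
            victim = self._pick_eviction_target(list(self.calls))
            del self.calls[victim]

    def save(self, key, call):
        self.calls[key] = call
        self._enforce_size_limit()


class EvictOldest(ToySizeLimited):
    def _pick_eviction_target(self, keys):
        return keys[0]  # dicts preserve insertion order, so this is the oldest


randomized = ToySizeLimited(max_size=2)
oldest_first = EvictOldest(max_size=2)
for key in ("a", "b", "c"):
    randomized.save(key, object())
    oldest_first.save(key, object())
```

Only the policy method is overridden; the enforcement loop and `save` stay untouched. With the random default, the entry that was just saved may itself be the one evicted.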

_enforce_size_limit() → None[source]: Evict call records until the cache is within max_size.

save(call: SizeLimitedMixin.save.call) → str[source]

evict(key: str | fleche.digest.Digest) → None[source]

class fleche.caches.SizeLimitedCache[source]

Bases: SizeLimitedMixin, Cache

A Cache that enforces a maximum number of cached calls.

When a new call is saved and the number of cached calls exceeds max_size, a call record is selected for eviction via _pick_eviction_target(). The default policy evicts uniformly at random; override _pick_eviction_target() to change this.

Parameters:

values – Value storage (forwarded to Cache).
_calls – Call storage (forwarded to Cache).
max_size – Maximum number of calls to keep.

max_size: int[source]