[1]:

!rm .fleche -rf

Getting Started with Fleche

This notebook demonstrates the main features of fleche, a caching library for Python.

Long-running calculation

[2]:

import time
from fleche import fleche, cache, tags, project
from fleche.digest import Digest

No config file found. Using default memory cache.

[3]:

@fleche
def long_running_calculation(x):
    print(f'Running calculation for {x}...')
    time.sleep(2)
    return x * x

[4]:

start = time.time()
long_running_calculation(2)
end = time.time()
print(f'First call took {end - start:.2f} seconds.')

Running calculation for 2...
First call took 2.00 seconds.

[5]:

start = time.time()
long_running_calculation(2)
end = time.time()
print(f'Second call took {end - start:.2f} seconds.')

Second call took 0.00 seconds.

As you can see, the second call returns almost instantly, because the result was cached.

Recursive function

[6]:

@fleche
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

[7]:

start = time.time()
fib(20)
end = time.time()
print(f'fib(20) took {end - start:.4f} seconds with caching.')

fib(20) took 0.0127 seconds with caching.

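Without caching, this would be much slower: the naive recursion recomputes the same fib(k) values over and over, so the number of calls grows exponentially with n. With the cache, each distinct argument is computed only once.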
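To put a number on the difference, the sketch below counts how often the function body runs for fib(20) with and without memoization. It uses a plain dictionary as a stand-in for the cache; it is not how fleche stores results, and the names naive_fib, memo_fib and calls are illustrative only.

```python
# Count function-body executions with and without a cache.
calls = {"naive": 0, "memo": 0}


def naive_fib(n):
    calls["naive"] += 1
    if n < 2:
        return n
    return naive_fib(n - 1) + naive_fib(n - 2)


_memo = {}  # plain dict standing in for a cache backend


def memo_fib(n):
    if n in _memo:
        return _memo[n]  # cache hit: the body below is skipped
    calls["memo"] += 1
    result = n if n < 2 else memo_fib(n - 1) + memo_fib(n - 2)
    _memo[n] = result
    return result


naive_result = naive_fib(20)
memo_result = memo_fib(20)
print(naive_result, memo_result)  # 6765 6765
print(calls)  # {'naive': 21891, 'memo': 21}
```

The 21 memoized evaluations are one per distinct argument, fib(0) through fib(20), which matches the 21 fib rows in the metadata table shown later in this notebook.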

Caching Methods of User-defined Types

fleche can also cache methods of classes. For this to work, the class must be “digest-compatible”. You can make a class digest-compatible by implementing a __digest__ method or by using a dataclass.

[8]:

class MyClass:
    def __init__(self, val):
        self.val = val

    def __digest__(self):
        # The digest defines how the instance is identified in the cache
        return Digest(str(self.val))

    @fleche
    def compute(self, x):
        print(f"Computing {self.val} + {x}...")
        time.sleep(1)
        return self.val + x

[9]:

obj = MyClass(10)

start = time.time()
print(f"Result: {obj.compute(5)}")
print(f"First call took {time.time() - start:.2f} seconds.")

start = time.time()
print(f"Result: {obj.compute(5)}")
print(f"Second call (same instance) took {time.time() - start:.2f} seconds.")

Computing 10 + 5...
Result: 15
First call took 1.00 seconds.
Result: 15
Second call (same instance) took 0.00 seconds.

If you mutate the instance so that its digest changes, the next call is a cache miss and the method is recomputed.

[10]:

obj.val = 20
start = time.time()
print(f"Result: {obj.compute(5)}")
print(f"Call after mutation took {time.time() - start:.2f} seconds.")

Computing 20 + 5...
Result: 25
Call after mutation took 1.00 seconds.
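One way to picture why the mutation causes a miss: if the cache key is derived from the instance's digest together with the arguments, then changing the state changes the key. The sketch below illustrates that idea with hashlib only. The helper call_key, the Point class and its digest method are hypothetical, and the way fleche actually builds its keys may differ.

```python
import hashlib


def call_key(func_name, instance_digest, *args):
    """Hash the function name, the instance digest and the arguments into one key."""
    payload = repr((func_name, instance_digest, args)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class Point:
    def __init__(self, val):
        self.val = val

    def digest(self):
        # Hypothetical helper: identify the instance by its state only.
        return hashlib.sha256(str(self.val).encode("utf-8")).hexdigest()


p = Point(10)
key_before = call_key("compute", p.digest(), 5)
key_same = call_key("compute", Point(10).digest(), 5)

p.val = 20  # mutate the state that feeds the digest
key_after = call_key("compute", p.digest(), 5)

print(key_before == key_same)   # True: equal state gives an equal key
print(key_before == key_after)  # False: mutated state gives a different key
```

A consequence of this design is that two separate instances with the same state share cache entries, so the digest should cover every attribute that influences the method's result.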

Passing Digests as Arguments

fleche supports passing Digest objects directly to cached functions. When a function receives a Digest, fleche automatically expands it to the actual value stored in the cache before executing the function. You can use the convenience wrapper D to mark a string as a digest; as the example below shows, a short prefix of the digest is enough.

[11]:

from fleche import D
from fleche.digest import digest as value_digest

@fleche
def double(x):
    print(f"Doubling {x}...")
    return x * 2

# 1. Run the calculation to ensure it is cached
long_running_calculation(10)

# 2. Compute the value digest for 100 (the cached result)
v = long_running_calculation(10)
val_dig = value_digest(v)
print(f"Value Digest: {val_dig}")

# 3. Pass a short digest prefix of the value to double(); it will expand to 100.
short = str(val_dig)[:8]
print(f"Short digest: {short}")
print(f"Result: {double(D(short))}")

Running calculation for 10...
Value Digest: 60079f7901a9295349d1796c037afc132e81286f785ddeeb763104ef02363102
Short digest: 60079f79
Doubling 100...
Result: 200
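Expanding a short digest can be thought of as a prefix lookup over the stored keys. The sketch below shows one possible way to do this on a plain dictionary. The function resolve_prefix and the store layout are assumptions made for illustration, and whether fleche rejects ambiguous prefixes in the same way is not something this notebook states.

```python
import hashlib


def resolve_prefix(store, prefix):
    """Return the value whose key starts with prefix; fail if none or several match."""
    matches = [key for key in store if key.startswith(prefix)]
    if not matches:
        raise KeyError(f"no entry matches prefix {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")
    return store[matches[0]]


# Build a tiny content-addressed store: key = SHA-256 of the value's repr.
store = {}
for value in (100, 200, 300):
    store[hashlib.sha256(repr(value).encode("utf-8")).hexdigest()] = value

full_key = hashlib.sha256(repr(100).encode("utf-8")).hexdigest()
resolved = resolve_prefix(store, full_key[:8])
print(resolved)  # 100

try:
    resolve_prefix(store, "")  # the empty prefix matches every key
except ValueError as err:
    print(err)
```

Failing loudly on an ambiguous prefix is a deliberate choice in this sketch: silently picking one of several matches would make results depend on storage order.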

Metadata

fleche allows you to add metadata to your cached functions using the tags context manager. This can be useful for organizing and querying your results.

[12]:

@fleche
def another_calculation(a, b):
    return a + b

[13]:

with tags(project='my_project', category='testing'):
    another_calculation(1, 2)
    another_calculation(3, 4)

This metadata is stored alongside the cached result. You can then use the cache().table() method to view the metadata for all cached results.

[14]:

cache().table()

[14]:

	name	module	timestart	timestop	walltime	project	category
4435e2927c194eff1bb90270ab3f353f0db1a8b50ec04932de88f7a4e5745953	long_running_calculation	__main__	1.773968e+09	1.773968e+09	2.000198	NaN	NaN
698e29f05ba00ee23503848cd166215f62cd976d54431594036979d8d56f254f	fib	__main__	1.773968e+09	1.773968e+09	0.000009	NaN	NaN
405dfbadf453a9d5bbe3482fbb05251a6fc18b90045f8b4edeafb1c6f236fc80	fib	__main__	1.773968e+09	1.773968e+09	0.000007	NaN	NaN
24231d9bc47f7abc0ea485d178fc8457dce8082790f8c948b936ebe906352225	fib	__main__	1.773968e+09	1.773968e+09	0.002490	NaN	NaN
dabebddba19859bdb1a075e422b1842dde433c7d6ca51c2b9a284dc1bb743d79	fib	__main__	1.773968e+09	1.773968e+09	0.003173	NaN	NaN
e2b3a4eaf9036f83560677b9bad64cbb8263cf3496ba6faeefc15b4231178912	fib	__main__	1.773968e+09	1.773968e+09	0.003788	NaN	NaN
981a54293b3027931a47d21c275955d6dab2fe6d8fa4c2f4fcd3190b6187b0ac	fib	__main__	1.773968e+09	1.773968e+09	0.004265	NaN	NaN
fdb0a9b8b1df91293924ad8fc03bcf041f3e50f924e9d199b8606eed60579d91	fib	__main__	1.773968e+09	1.773968e+09	0.004732	NaN	NaN
db6b5632c77a01a692e13c486c54536a0c46907972d93900a60a1eab59110b93	fib	__main__	1.773968e+09	1.773968e+09	0.005261	NaN	NaN
cd94bda61e484ca82c470b6ab50c0f1b98055e6a905d82f18b57abc5e07b6457	fib	__main__	1.773968e+09	1.773968e+09	0.005755	NaN	NaN
f4efb0e68400d195d8d57051088964d98b9ae4f7dbb78dad4feb055b56fd47de	fib	__main__	1.773968e+09	1.773968e+09	0.006238	NaN	NaN
bc7f8b67d293d9fbdec343f49c44d46e655516537cc492ba3a990edb7f86f176	fib	__main__	1.773968e+09	1.773968e+09	0.006701	NaN	NaN
6fbe26ceafe80276c7714b2aae14c25bd38e24683d6624f01322946f5e53e7cf	fib	__main__	1.773968e+09	1.773968e+09	0.007159	NaN	NaN
f3cefd42fd07119792c38084ecbb33f1955ccd83d8021e1bd078db3377cca4c3	fib	__main__	1.773968e+09	1.773968e+09	0.007635	NaN	NaN
2b7c9e8c5e0a73afddb28a49d47415462b3c84adc05c35567b81514b90c053f2	fib	__main__	1.773968e+09	1.773968e+09	0.008112	NaN	NaN
60bf26e7cdec62d1a8f9ebfbebb336850cf588d547badc7a18a0001fd0b65cf9	fib	__main__	1.773968e+09	1.773968e+09	0.008859	NaN	NaN
9b23a905e639149d35169118ffefadec03a181cbd290006620e671655b1499c3	fib	__main__	1.773968e+09	1.773968e+09	0.009603	NaN	NaN
18e77dab7a4a0385a66512547336e7a02ee62803634fd801e6dd49c6e5444e2f	fib	__main__	1.773968e+09	1.773968e+09	0.010176	NaN	NaN
e1a78503325bcbb94116e03febed590b14bdd7f955c4f15b8649e613fbd6bf5c	fib	__main__	1.773968e+09	1.773968e+09	0.010721	NaN	NaN
34814b3262b7920c7cef43eaf8a1e86e849611ed0d0317c4342b29231a8213ea	fib	__main__	1.773968e+09	1.773968e+09	0.011218	NaN	NaN
9214caae9e9819230882f7e9f0f1970fd5f2fc354c3404d1fd7ef30aa2c75873	fib	__main__	1.773968e+09	1.773968e+09	0.011720	NaN	NaN
c5a5bb6cf5b69c7eb5e244ed540af44391dff12a77fcecf19505cc660b50e6d9	fib	__main__	1.773968e+09	1.773968e+09	0.012221	NaN	NaN
81d4df70836ded7d8a0dd70348976c431210ff824d870acb2932b984e7936e14	compute	__main__	1.773968e+09	1.773968e+09	1.000190	NaN	NaN
e9f9ca077d0566f335e54df480d202815d7b5b5ac5e7ccf7ba47c5bf2fa219ae	compute	__main__	1.773968e+09	1.773968e+09	1.000223	NaN	NaN
a7ce6824785adc406ca3561dcf98b3c64ddf1539d2467e1b9e6318e4a97368e8	long_running_calculation	__main__	1.773968e+09	1.773968e+09	2.000249	NaN	NaN
dda964d172cde9eb85bc22b048444a149e7536453b23c8a90155a6c9dcc6f035	double	__main__	1.773968e+09	1.773968e+09	0.000051	NaN	NaN
a8a3061653183cff08e5c414f9a4087550ab5e33cc186ae7a2e91aa1fbac8c80	another_calculation	__main__	1.773968e+09	1.773968e+09	0.000013	my_project	testing
1e352b538b9219d4e5fcaf429882a4b3587f3bc6358ef0feca453b301b40156a	another_calculation	__main__	1.773968e+09	1.773968e+09	0.000011	my_project	testing
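The project and category columns are filled only for calls made inside the tags block. A context manager with that behaviour can be sketched with contextvars, as below. This is an illustration of the general pattern, not fleche's implementation; tagging, record_call and log are hypothetical names.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Tags that apply to the code currently running; empty outside any block.
_active_tags = ContextVar("active_tags", default={})


@contextmanager
def tagging(**new_tags):
    """Add tags for the duration of the with-block, then restore the old ones."""
    token = _active_tags.set({**_active_tags.get(), **new_tags})
    try:
        yield
    finally:
        _active_tags.reset(token)


log = []


def record_call(name):
    # Snapshot the active tags at call time, as a cache might when storing a result.
    log.append((name, dict(_active_tags.get())))


record_call("untagged")
with tagging(project="my_project"):
    with tagging(category="testing"):
        record_call("tagged")
record_call("after")

for entry in log:
    print(entry)
```

Nested blocks merge their tags, and leaving a block restores the previous set, so calls outside the block stay untagged.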

Filtering

The metadata table is a pandas DataFrame, so you can query and filter it with the usual pandas tools.

[15]:

cache().table().query('name!="fib"')

[15]:

	name	module	timestart	timestop	walltime	project	category
4435e2927c194eff1bb90270ab3f353f0db1a8b50ec04932de88f7a4e5745953	long_running_calculation	__main__	1.773968e+09	1.773968e+09	2.000198	NaN	NaN
81d4df70836ded7d8a0dd70348976c431210ff824d870acb2932b984e7936e14	compute	__main__	1.773968e+09	1.773968e+09	1.000190	NaN	NaN
e9f9ca077d0566f335e54df480d202815d7b5b5ac5e7ccf7ba47c5bf2fa219ae	compute	__main__	1.773968e+09	1.773968e+09	1.000223	NaN	NaN
a7ce6824785adc406ca3561dcf98b3c64ddf1539d2467e1b9e6318e4a97368e8	long_running_calculation	__main__	1.773968e+09	1.773968e+09	2.000249	NaN	NaN
dda964d172cde9eb85bc22b048444a149e7536453b23c8a90155a6c9dcc6f035	double	__main__	1.773968e+09	1.773968e+09	0.000051	NaN	NaN
a8a3061653183cff08e5c414f9a4087550ab5e33cc186ae7a2e91aa1fbac8c80	another_calculation	__main__	1.773968e+09	1.773968e+09	0.000013	my_project	testing
1e352b538b9219d4e5fcaf429882a4b3587f3bc6358ef0feca453b301b40156a	another_calculation	__main__	1.773968e+09	1.773968e+09	0.000011	my_project	testing

Querying Cached Calls via Function Wrapper

You can retrieve previously cached calls that match some of your function’s arguments and metadata using the function wrapper’s query method. Any field left as None is treated as a wildcard. Arguments and results are compared by digest internally, but the wrapper decodes them back to Python objects when returning matches.

Example using the another_calculation wrapper we created above:

[16]:

# Query by metadata presence (tags) and a specific key-value filter
for call in another_calculation.query(1, 2, metadata={"tags": {}}):
    # presence-only: any call with 'tags'
    print(call.name, call.arguments, call.metadata.get("tags"))

for call in another_calculation.query(3, 4, metadata={"tags": {"project": "my_project"}}):
    # equality filter on metadata
    assert call.metadata["tags"]["project"] == "my_project"
    # arguments and result are decoded if they were stored as digests
    print(call.arguments, call.result)

another_calculation {'a': 1, 'b': 2} {'project': 'my_project', 'category': 'testing'}
{'a': 3, 'b': 4} 7
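The wildcard rule described above, where a field left as None matches anything, can be expressed in a few lines. The sketch below applies it to plain dictionaries. The function matches and the record layout are hypothetical and compare values directly rather than by digest.

```python
def matches(record, pattern):
    """Return True if every non-None field in pattern equals the record's field."""
    return all(
        expected is None or record.get(name) == expected
        for name, expected in pattern.items()
    )


records = [
    {"a": 1, "b": 2, "result": 3},
    {"a": 3, "b": 4, "result": 7},
    {"a": 1, "b": 5, "result": 6},
]

# b is left as None, so only a is constrained.
found = [r for r in records if matches(r, {"a": 1, "b": None})]
print(found)

# A pattern made only of wildcards matches every record.
everything = [r for r in records if matches(r, {"a": None, "b": None})]
print(len(everything))  # 3
```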

Lazy Loading

When dealing with large results or many cached calls, you might not want to load everything at once. fleche supports “lazy loading”, where arguments and results are only fetched from the cache when you actually access them.

[17]:

from fleche.call import LazyCall

# 1. Get the digest for a known call
key = fib.digest(20)

# 2. Load it lazily
lazy_call = cache().load(key, lazy=True)
print(f"Obtained {type(lazy_call).__name__} for {lazy_call.name}(20)")

# 3. Accessing .result or .arguments will now trigger the load
print("Accessing result now (triggers load)...")
print(f"Result: {lazy_call.result}")

Obtained LazyCall for fib(20)
Accessing result now (triggers load)...
Result: 6765

Lazy loading is on by default. To load everything at once, pass lazy=False or call the .fetch method on a lazily loaded call.

[18]:

cache().load(key).fetch() == cache().load(key, lazy=False)

[18]:

True
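The lazy pattern itself is easy to reproduce with functools.cached_property: the object holds only a key until an attribute is read, then loads once and remembers the value. The sketch below is a generic illustration under that assumption. LazyRecord and slow_loader are hypothetical names and do not reflect how fleche's LazyCall is implemented.

```python
from functools import cached_property


class LazyRecord:
    """Holds a key and defers loading the payload until .result is first read."""

    def __init__(self, key, loader):
        self.key = key
        self._loader = loader

    @cached_property
    def result(self):
        # Runs on first access only; the value is then stored on the instance.
        return self._loader(self.key)


loads = []  # records every time the backing store is actually hit


def slow_loader(key):
    loads.append(key)
    return {"fib-20": 6765}[key]


record = LazyRecord("fib-20", slow_loader)
print(len(loads))     # 0: nothing has been loaded yet
print(record.result)  # 6765: first access triggers the load
print(record.result)  # 6765: second access reuses the stored value
print(len(loads))     # 1: the loader ran exactly once
```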