{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Fleche Storage Backends\n", "\n", "This notebook demonstrates the usage of the different storage backends available in `fleche`.\n", "In `fleche`, a `Cache` is composed of two storage components:\n", "1. **values**: Stores the actual results of the functions.\n", "2. **calls**: Stores the metadata about the function calls (arguments, function name, etc.) and references to the stored values.\n", "\n", "You can mix and match different storage backends for values and calls to suit your needs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Memory Storage\n", "\n", "The `Memory` storage backend keeps all the cached data in memory. This is the simplest backend and is useful for testing or when you don't need to persist the cache beyond the current process." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:49.949163Z", "iopub.status.busy": "2026-03-22T19:56:49.948989Z", "iopub.status.idle": "2026-03-22T19:56:50.449807Z", "shell.execute_reply": "2026-03-22T19:56:50.448807Z" } }, "outputs": [], "source": "from fleche import fleche, cache\nfrom fleche.caches import Cache\nfrom fleche.storage import ValueMemory, CallMemory\n\n# Using Memory for both values and calls\nmemory_cache = Cache(values=ValueMemory({}), calls=CallMemory({}))\n\nwith cache(memory_cache):\n    @fleche\n    def add(a, b):\n        print(f\"Executing add({a}, {b})\")\n        return a + b\n\n    print(f\"Result 1: {add(2, 3)}\")\n    print(f\"Result 2: {add(2, 3)}\")  # This should be cached\nprint('Cache keys in calls storage:', list(memory_cache.calls.list()))" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PickleFile Storage\n", "\n", "The `PickleFile` storage backend serializes data using Python's standard `pickle` module and stores it in individual files. It requires a `root` directory."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:50.451880Z", "iopub.status.busy": "2026-03-22T19:56:50.451542Z", "iopub.status.idle": "2026-03-22T19:56:50.575246Z", "shell.execute_reply": "2026-03-22T19:56:50.574139Z" } }, "outputs": [], "source": "import shutil\nfrom pathlib import Path\nfrom fleche.storage import ValuePickleFile, CallPickleFile\n\nshutil.rmtree('.pickle_cache', ignore_errors=True)\npickle_cache = Cache(\n    values=ValuePickleFile.with_pickle(root='.pickle_cache/values'),\n    calls=CallPickleFile.with_pickle(root='.pickle_cache/calls')\n)\n\nwith cache(pickle_cache):\n    @fleche\n    def greet(name):\n        print(f\"Executing greet({name})\")\n        return f\"Hello, {name}!\"\n\n    print(greet(\"World\"))\n    print(greet(\"World\"))\n\nprint(\"Files in calls storage:\")\n!ls .pickle_cache/calls" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Compressed PickleFile Storage\n", "\n", "You can also enable gzip compression for `PickleFile` (and its variants `CloudpickleFile` and `DillFile`) by passing `compress=True`."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:50.577316Z", "iopub.status.busy": "2026-03-22T19:56:50.577099Z", "iopub.status.idle": "2026-03-22T19:56:50.585392Z", "shell.execute_reply": "2026-03-22T19:56:50.584495Z" } }, "outputs": [], "source": "import shutil\nshutil.rmtree('.compressed_pickle_cache', ignore_errors=True)\ncompressed_cache = Cache(\n    values=ValuePickleFile.with_pickle(root='.compressed_pickle_cache/values', compress=True),\n    calls=CallPickleFile.with_pickle(root='.compressed_pickle_cache/calls', compress=True)\n)\n\nwith cache(compressed_cache):\n    @fleche\n    def big_result(n):\n        return \"a\" * n\n\n    big_result(1000)\n\nfile_path = list(Path('.compressed_pickle_cache/values').iterdir())[0]\nwith open(file_path, 'rb') as f:\n    header = f.read(2)\n    # check for gzip magic number 0x1f 0x8b\n    is_compressed = (header[0] == 0x1f) and (header[1] == 0x8b)\n    print(f\"Is gzip compressed: {is_compressed}\")" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CloudpickleFile Storage\n", "\n", "The `CloudpickleFile` storage backend is similar to `PickleFile` but uses `cloudpickle` for serialization. `cloudpickle` can handle more complex Python objects, like lambdas or functions defined interactively, that standard `pickle` might struggle with."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:50.587178Z", "iopub.status.busy": "2026-03-22T19:56:50.586992Z", "iopub.status.idle": "2026-03-22T19:56:50.706524Z", "shell.execute_reply": "2026-03-22T19:56:50.705580Z" } }, "outputs": [], "source": "shutil.rmtree('.cloudpickle_cache', ignore_errors=True)\ncp_cache = Cache(\n    values=ValuePickleFile.with_cloudpickle(root='.cloudpickle_cache/values'),\n    calls=CallPickleFile.with_cloudpickle(root='.cloudpickle_cache/calls')\n)\n\nwith cache(cp_cache):\n    @fleche\n    def mul(a, b):\n        print(f\"Executing mul({a}, {b})\")\n        return a * b\n\n    print(f\"Result: {mul(3, 4)}\")\n    print(f\"Result: {mul(3, 4)}\")\n\nprint(\"Files in values storage:\")\n!ls .cloudpickle_cache/values" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DillFile Storage\n", "\n", "The `DillFile` storage backend is similar to `CloudpickleFile` but uses `dill` for serialization. Like `cloudpickle`, `dill` can serialize objects that standard `pickle` cannot, such as lambdas and nested functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:50.708571Z", "iopub.status.busy": "2026-03-22T19:56:50.708378Z", "iopub.status.idle": "2026-03-22T19:56:50.829061Z", "shell.execute_reply": "2026-03-22T19:56:50.827888Z" } }, "outputs": [], "source": "shutil.rmtree('.dill_cache', ignore_errors=True)\ndill_cache = Cache(\n    values=ValuePickleFile.with_dill(root='.dill_cache/values'),\n    calls=CallPickleFile.with_dill(root='.dill_cache/calls')\n)\n\nwith cache(dill_cache):\n    @fleche\n    def add(a, b):\n        print(f\"Executing add({a}, {b})\")\n        return a + b\n\n    print(f\"Result: {add(3, 4)}\")\n    print(f\"Result: {add(3, 4)}\")\n\nprint(\"Files in values storage:\")\n!ls .dill_cache/values" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## BagOfHoldingH5File Storage\n", "\n", "The `BagOfHoldingH5File` storage backend uses the `bagofholding` library to store data in HDF5 files. This is particularly efficient for large numerical arrays (e.g., NumPy arrays)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:50.831273Z", "iopub.status.busy": "2026-03-22T19:56:50.831064Z", "iopub.status.idle": "2026-03-22T19:56:50.959794Z", "shell.execute_reply": "2026-03-22T19:56:50.958659Z" } }, "outputs": [], "source": "import numpy as np\nfrom fleche.storage import ValueBagOfHoldingH5File\n\nshutil.rmtree('.boh_cache', ignore_errors=True)\nboh_cache = Cache(\n    values=ValueBagOfHoldingH5File(root='.boh_cache/values'),\n    calls=CallPickleFile.with_cloudpickle(root='.boh_cache/calls')  # We can use Cloudpickle for calls\n)\n\nwith cache(boh_cache):\n    @fleche\n    def make_array(n):\n        print(f\"Executing make_array({n})\")\n        return np.ones((n, n))\n\n    print(f\"Array sum: {make_array(5).sum()}\")\n    print(f\"Array sum: {make_array(5).sum()}\")\n\nprint(\"Files in H5 values storage:\")\n!ls .boh_cache/values" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sql Storage (Call Storage Only)\n", "\n", "The `Sql` storage backend uses SQLAlchemy to store call metadata in a SQL database (like SQLite). It provides advanced querying capabilities but can only be used for the **calls** component of a `Cache`."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:50.961998Z", "iopub.status.busy": "2026-03-22T19:56:50.961789Z", "iopub.status.idle": "2026-03-22T19:56:51.006219Z", "shell.execute_reply": "2026-03-22T19:56:51.005233Z" } }, "outputs": [], "source": "from fleche.storage import Sql\nimport os\n\nif os.path.exists('calls.db'):\n    os.remove('calls.db')\n\nsql_cache = Cache(\n    values=ValueMemory({}),\n    calls=Sql(url='sqlite:///calls.db')\n)\n\nwith cache(sql_cache):\n    @fleche\n    def power(a, b):\n        print(f\"Executing power({a}, {b})\")\n        return a ** b\n\n    power(2, 8)\n    power(2, 8)\n\nprint(\"Calls in SQL storage:\")\nprint(list(sql_cache.calls.list()))" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mix and Match\n", "\n", "As seen in the `BagOfHoldingH5File` and `Sql` examples, you can mix different backends for values and calls. For instance, you might want to use `BagOfHoldingH5File` for large values but `Sql` for calls to enable efficient metadata querying." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:51.008027Z", "iopub.status.busy": "2026-03-22T19:56:51.007846Z", "iopub.status.idle": "2026-03-22T19:56:51.032418Z", "shell.execute_reply": "2026-03-22T19:56:51.031506Z" } }, "outputs": [], "source": "# Start from a clean directory; SQLite does not create missing parent directories\nshutil.rmtree('.mixed_cache', ignore_errors=True)\nPath('.mixed_cache').mkdir(parents=True, exist_ok=True)\n\nmixed_cache = Cache(\n    values=ValueBagOfHoldingH5File(root='.mixed_cache/values'),\n    calls=Sql(url='sqlite:///.mixed_cache/calls.db')\n)\n\nwith cache(mixed_cache):\n    @fleche\n    def compute_heavy(x):\n        return np.random.rand(x, x)\n\n    compute_heavy(10)\n\nprint(f\"Value storage: {type(mixed_cache.values).__name__}\")\nprint(f\"Call storage: {type(mixed_cache.calls).__name__}\")" }, { "cell_type": "markdown", "metadata": {}, "source": "## Clean Up\n\nRemove the temporary cache directories and the SQLite database file created above."
}, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-03-22T19:56:51.034211Z", "iopub.status.busy": "2026-03-22T19:56:51.033999Z", "iopub.status.idle": "2026-03-22T19:56:51.041707Z", "shell.execute_reply": "2026-03-22T19:56:51.040810Z" } }, "outputs": [], "source": [ "import shutil, os\n", "for d in ('.pickle_cache', '.compressed_pickle_cache', '.cloudpickle_cache',\n", "          '.dill_cache', '.boh_cache', '.mixed_cache'):\n", "    shutil.rmtree(d, ignore_errors=True)\n", "if os.path.exists('calls.db'):\n", "    os.remove('calls.db')\n", "print('Cleaned up temporary cache directories.')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.13" } }, "nbformat": 4, "nbformat_minor": 4 }