( ideas licensed public domain / product of https://operand.online )

@kura - splendid issue to raise;
this is the main reason I'm focusing on the `import_export` module before I find my app in a bind.

My search for a db began with a blog post by fjall:
https://fjall-rs.github.io/post/recreating-webtable
The fjall blog made me see the mechanisms of LSM trees more clearly,
and this page specifically emphasized how practical they can be for page indexing.

I feel like a small Rustler package could locate a fjall db on disc,
and use a query to load a subset into a khepri db.
Elixir could use the shared khepri store to coordinate across nodes.

Perhaps khepri should be used only for "receipts" of scraped pages:
```
@example.com/addr/@cache -> @example.com/addr/@2026-04-15_12-59-59
@example.com/addr/@2026-04-15_12-59-59.hash -> <hash>
@example.com/addr/@2026-04-15_12-59-59.api_key -> <key>
@example.com/addr/@2026-04-15_12-59-59.module -> Scraper.ExampleCom
```
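A rough sketch of how those receipt keys could be built, in Python only to keep it short (the real thing would presumably be Elixir). The function name `receipt_entries`, its parameters, and the choice of SHA-256 for the hash are my assumptions; only the key layout comes from the example above.

```python
from datetime import datetime, timezone
import hashlib

def receipt_entries(site, addr, body, api_key, module, scraped_at):
    """Build the receipt entries for one scraped page (hypothetical helper)."""
    stamp = scraped_at.strftime("%Y-%m-%d_%H-%M-%S")
    snapshot = f"@{site}/{addr}/@{stamp}"
    return {
        # @cache points at the most recent snapshot of this address
        f"@{site}/{addr}/@cache": snapshot,
        # SHA-256 is an assumption; any stable content hash would do
        f"{snapshot}.hash": hashlib.sha256(body).hexdigest(),
        f"{snapshot}.api_key": api_key,
        f"{snapshot}.module": module,
    }

entries = receipt_entries(
    "example.com", "addr", b"<html></html>", "<key>", "Scraper.ExampleCom",
    datetime(2026, 4, 15, 12, 59, 59, tzinfo=timezone.utc),
)
```

Only these small entries would sit in khepri; the page body itself stays in fjall.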

So, how to bring the scraped pages from the nodes back into fjall?
Seems like this could easily be RabbitMQ / Broadway / plain old Elixir.

The full process is:

0. spin up cpu-node and connect to disc-node.
1. cpu-node queries for a record to process.
2. disc-node loads the record & its receipts: fjall (in rustler) -> khepri.
3. cpu-node checks receipts, enriches original record.
4. rabbitmq sends enriched fields back to disc-node.
5. disc-node pushes records & receipts to fjall.
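
The steps above can be walked through with in-memory stand-ins. This is only a sketch: the dict `fjall`, the dict `khepri`, and the `outbox` queue are placeholders for the real fjall db, khepri store, and RabbitMQ channel, and every function name here is hypothetical.

```python
from queue import Queue

# Stand-ins (assumptions): plain dicts and a queue instead of the real systems.
fjall = {"example.com/addr": {"body": "see github.com/fjall-rs", "receipts": {}}}
khepri = {}
outbox = Queue()

def disc_load(key):
    # step 2: disc-node copies the record's receipts from fjall into khepri
    record = fjall[key]
    khepri[key] = dict(record["receipts"])
    return record

def cpu_process(key, enrich):
    # steps 1 and 3: query a record, check its receipts, enrich if needed
    record = disc_load(key)
    if "enriched" in khepri[key]:
        return False  # another run already handled this record
    # step 4: send the enriched fields back towards the disc-node
    outbox.put((key, enrich(record["body"])))
    return True

def disc_drain():
    # step 5: disc-node pushes enriched fields and a receipt into fjall
    while not outbox.empty():
        key, fields = outbox.get()
        fjall[key].setdefault("fields", {}).update(fields)
        fjall[key]["receipts"]["enriched"] = True

def find_keywords(body):
    # toy enrichment: tag pages that mention github
    return {"keywords": ["github"] if "github" in body else []}

first = cpu_process("example.com/addr", find_keywords)
disc_drain()
second = cpu_process("example.com/addr", find_keywords)
```

The second call is skipped because the receipt is already in place, which is the whole point of keeping receipts in the shared store.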

In 1. and 3., the same query could be used to decide if the enriched keys are added to khepri;
this is an optimistic update that khepri can use to keep nodes coordinated.
For example, links between pages and their associated keywords & addresses.
That way, other nodes in the cluster could do downstream processing based on keywords like "github".
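
One way to read that, as a sketch: the query that picked the record is reused as a filter before enriched fields go into the shared store. `optimistic_put` and `wants_github` are hypothetical names, and a plain dict stands in for khepri.

```python
def optimistic_put(shared, key, fields, query):
    """Add enriched fields to the shared store only if the query accepts them."""
    if not query(key, fields):
        return False
    shared.setdefault(key, {}).update(fields)
    return True

def wants_github(key, fields):
    # example query: this cluster only cares about pages tagged "github"
    return "github" in fields.get("keywords", [])

shared = {}
kept = optimistic_put(shared, "example.com/a", {"keywords": ["github"]}, wants_github)
dropped = optimistic_put(shared, "example.com/b", {"keywords": ["recipes"]}, wants_github)
```

Downstream nodes can then react to `example.com/a` without waiting for the write to reach fjall.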

This general approach means that many small khepri clusters *should* be able to sync from the same fjall db,
where each khepri cluster uses a group of queries designed for that cluster's unique purpose.
This semantic partitioning seems like it could keep memory use small even for complex domains.
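
A toy illustration of that partitioning, with a dict standing in for the fjall db and one predicate per cluster. The cluster names and `load_subset` are made up for the example.

```python
def load_subset(store, query):
    """Return only the entries that match one cluster's query."""
    return {key: value for key, value in store.items() if query(key, value)}

# Stand-in for the single big fjall db (assumption).
store = {
    "example.com/a": {"keywords": ["github", "rust"]},
    "example.com/b": {"keywords": ["recipes"]},
    "example.org/c": {"keywords": ["github"]},
}

# Each cluster gets its own query, so each one holds a different slice.
clusters = {
    "code_hosts": lambda key, value: "github" in value["keywords"],
    "example_com": lambda key, value: key.startswith("example.com/"),
}

views = {name: load_subset(store, query) for name, query in clusters.items()}
```

Each view holds two of the three records here; with a large db and narrow queries the slices would be far smaller than the whole.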