docs(tvix): document Store configuration
This describes the current composition system used for BlobService /
DirectoryService / PathInfoService, why it's hidden, how to expose it, and adds
some common examples to explain it.

Change-Id: I2ce7da40992cc988947c3e924499f8157c5e4937
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12749
Tested-by: BuildkiteCI
Reviewed-by: yuka <yuka@yuka.dev>
parent
1428ea4e19
commit
0b8ec03797
3 changed files with 175 additions and 2 deletions
|
@ -120,8 +120,6 @@ Extend the other pages in here. Some ideas on what should be tackled:
|
|||
and trait-focused?
|
||||
- Restructure docs on castore vs store, this seems to be duplicated a bit and
|
||||
is probably still not too clear.
|
||||
- Describe store composition(s) in more detail. There's some notes on granular
|
||||
fetching which probably can be repurposed.
|
||||
- Absorb the rest of //tvix/website into this.
|
||||
|
||||
## Features
|
||||
|
|
|
@ -57,6 +57,8 @@ The flexibility of this doesn't need to be exposed to the user in the default
|
|||
case; in most cases we should be fine with some form of on-disk storage and a
|
||||
bunch of substituters with different priorities.
|
||||
|
||||
Check [Store Configuration](./store-configuration.md) for more details.
|
||||
|
||||
### gRPC Clients
|
||||
Clients are encouraged to always read blobs in a chunked fashion (asking for a
|
||||
list of chunks for a blob via `BlobService.Stat()`, then fetching chunks via
|
||||
|
|
173
tvix/docs/src/castore/store-configuration.md
Normal file
|
@ -0,0 +1,173 @@
|
|||
# Store Configuration
|
||||
|
||||
Currently, tvix-store (and tvix-cli) expose three different `--*-service-addr`
|
||||
CLI args, describing how to talk to the three different stores.
|
||||
|
||||
Depending on the CLI entrypoint, they have different defaults:
|
||||
|
||||
- `tvix-cli` defaults to in-memory variants (`ServiceUrlsMemory`).
|
||||
- `tvix-store daemon` defaults to using a local filesystem-based backend for
|
||||
blobs, and redb backends for `DirectoryService` and `PathInfoService`
|
||||
(`ServiceUrls`).
|
||||
- other `tvix-store` entrypoints, as well as `nar-bridge` default to talking to
|
||||
a `tvix-store` gRPC daemon (`ServiceUrlsGrpc`).
|
||||
|
||||
The exact config and paths can be inspected by invoking `--help` on each of
|
||||
these entrypoints, and it's of course possible to change this config, for
|
||||
example in case everything should be done from a single binary, without a daemon
|
||||
in between.
|
||||
There is currently no client-side caching wired up, and there are some (known)
|
||||
unnecessary roundtrips (which can be removed after some refactoring), so for
|
||||
everything except testing purposes you might want to directly connect to the
|
||||
data stores, or use Store Composition to get caching (and to describe more
|
||||
complicated fetch-through configs).
|
||||
|
||||
## Store Composition
|
||||
Internally, `tvix-castore` supports composing multiple instances of `BlobService`,
|
||||
`DirectoryService` (and `PathInfoService`) together.
|
||||
|
||||
It allows describing more complicated "hierarchies"/"tiers" of different
|
||||
service types. It supports combining different storage backends/substituters/
|
||||
caches, and combining them in a DAG of some sort, ultimately exposing the same
|
||||
(trait) interface as a single store.
|
||||
|
||||
The three individual URLs exposed in the CLI currently are internally converted
|
||||
to a composition with just one instance of each store (at the "root" name).
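As a rough illustration of that conversion, pointing all three `--*-service-addr` args at the same gRPC daemon would correspond to something like the sketch below. The address is borrowed from the examples further down and is only an assumption about where a daemon listens; it also assumes the `grpc` type takes the same `url` key for all three service kinds.

```toml
# Sketch: one instance per store kind, each registered under the name "root".
[blobservices.root]
type = "grpc"
url = "grpc+http://[::1]:8000"

[directoryservices.root]
type = "grpc"
url = "grpc+http://[::1]:8000"

[pathinfoservices.root]
type = "grpc"
url = "grpc+http://[::1]:8000"
```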
|
||||
|
||||
Keep in mind the config format is very granular and low-level, and due to this,
|
||||
potentially subject to larger, unannounced breaking changes, which is why
|
||||
it is not exposed by default yet.
|
||||
|
||||
In the long term, for "user-facing" configuration, we might want to expose a
|
||||
more opinionated middle ground between only a single instance and the super
|
||||
granular store composition instead.
|
||||
|
||||
For example, users could configure things like "a list of substituters"
|
||||
and "caching args", and internally this could be transformed to a low-level
|
||||
composition - potentially leaving this granular format for library/power users
|
||||
only.
|
||||
|
||||
### CLI usage
|
||||
However, if you're okay with these caveats, and want to configure some caching
|
||||
today, using the existing CLI entrypoints, you can enable the
|
||||
`xp-composition-cli` feature flag in the `tvix-store` crate.
|
||||
|
||||
With `cargo`, this can be enabled by passing
|
||||
`--features tvix-store/xp-composition-cli` to a `cargo build` / `cargo run`
|
||||
invocation.
|
||||
|
||||
If enabled, CLI entrypoints get a `--experimental-store-composition` arg, which
|
||||
accepts a TOML file describing a composition for all three stores (causing the
|
||||
other `--*-service-addr` args to be ignored if set).
|
||||
|
||||
It expects all `BlobService` instances to be inside a `blobservices` namespace/
|
||||
attribute (`DirectoryService`s in `directoryservices`, and `PathInfoService`s
|
||||
in `pathinfoservices` respectively), and requires one named "root".
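The smallest file of that shape might look like the following sketch. It assumes a `memory` type that needs no further keys is registered for all three service kinds (matching the in-memory defaults of `tvix-cli`); treat the type name as an assumption and check the registered types in the code before relying on it.

```toml
# Each namespace holds named instances; the one called "root" is the entrypoint.
[blobservices.root]
type = "memory" # assumed type name

[directoryservices.root]
type = "memory" # assumed type name

[pathinfoservices.root]
type = "memory" # assumed type name
```

Any additional instances in a namespace only take effect if they are reachable from its "root" instance, as the examples in the config format section show.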
|
||||
|
||||
### Library usage
|
||||
The store composition code can be accessed via `tvix_castore::composition`, and
|
||||
`tvix_store::composition`.
|
||||
|
||||
A global "registry" can be used to make other (out-of-tree) "types" of stores
|
||||
known to the composition machinery.
|
||||
|
||||
In terms of config format, you're also not required to use TOML, but anything
|
||||
`serde` can deserialize.
|
||||
|
||||
Make sure to check the module-level docstrings and code examples for
|
||||
`tvix_castore::composition`.
|
||||
|
||||
### Composition config format
|
||||
The examples below are in the format accepted by the CLI, using the
|
||||
`blobservices` / `directoryservices` / `pathinfoservices` namespace/attribute to
|
||||
describe all three services.
|
||||
|
||||
However, as expressed above, for library users this doesn't need to be TOML (but
|
||||
anything serde can deserialize), and the composition hierarchy needs to be built
|
||||
separately for each `{Blob,Directory,PathInfo}Service`, dropping the namespaces
|
||||
present in the TOML.
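To illustrate what dropping the namespaces means, a config covering only the `BlobService` composition could look like the sketch below. It is shown as TOML purely for readability; the keys are the ones used in the CLI examples, and how the result is fed into the composition machinery is left to the `tvix_castore::composition` module docs.

```toml
# Sketch: BlobService instances only, no `blobservices.` prefix.
[root]
type = "combined"
near = "near"
far = "far"

[near]
type = "objectstore"
object_store_url = "file:///tmp/tvix/blobservice"
object_store_options = {}

[far]
type = "grpc"
url = "grpc+http://[::1]:8000"
```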
|
||||
|
||||
#### Example: combined remote/local blobservice
|
||||
This fetches blobs from a local store. If not found there, a remote store is
|
||||
queried, and results are returned to the client and inserted into the local
|
||||
store, to make subsequent lookups not query the remote again.
|
||||
|
||||
```toml
|
||||
[blobservices.root]
|
||||
type = "combined"
|
||||
near = "near"
|
||||
far = "far"
|
||||
|
||||
[blobservices.near]
|
||||
type = "objectstore"
|
||||
object_store_url = "file:///tmp/tvix/blobservice"
|
||||
object_store_options = {}
|
||||
|
||||
[blobservices.far]
|
||||
type = "grpc"
|
||||
url = "grpc+http://[::1]:8000"
|
||||
|
||||
# […] directoryservices/pathinfoservices go here […]
|
||||
```
|
||||
|
||||
#### Example: LRU cache wrapping pathinfoservice
|
||||
This keeps the last 1000 requested `PathInfo`s around in a local cache.
|
||||
```toml
|
||||
[pathinfoservices.root]
|
||||
type = "cache"
|
||||
near = "near"
|
||||
far = "far"
|
||||
|
||||
[pathinfoservices.near]
|
||||
type = "lru"
|
||||
capacity = 1000
|
||||
|
||||
[pathinfoservices.far]
|
||||
type = "grpc"
|
||||
url = "grpc+http://localhost:8000"
|
||||
|
||||
# […] blobservices/directoryservices go here […]
|
||||
```
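To make the elided part of the LRU example concrete, here is one possible complete file. It assumes that blobs and directories are served, uncached, by the same gRPC daemon as the `PathInfo`s, which the example above does not state; the `grpc` type and `url` key are reused from the blobservice example.

```toml
# PathInfo lookups hit the in-process LRU first, then the daemon.
[pathinfoservices.root]
type = "cache"
near = "near"
far = "far"

[pathinfoservices.near]
type = "lru"
capacity = 1000

[pathinfoservices.far]
type = "grpc"
url = "grpc+http://localhost:8000"

# Assumption: blobs and directories come straight from the same daemon.
[blobservices.root]
type = "grpc"
url = "grpc+http://localhost:8000"

[directoryservices.root]
type = "grpc"
url = "grpc+http://localhost:8000"
```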
|
||||
|
||||
#### Example: Self-contained fetch-through tvix-store for `cache.nixos.org`
|
||||
This provides a `PathInfoService` "containing" the `PathInfo`s that are in
|
||||
`cache.nixos.org`.
|
||||
|
||||
To construct the `PathInfo` initially, we need to ingest the NAR to add missing
|
||||
castore contents to `BlobService` / `DirectoryService` and return the resulting
|
||||
root node.
|
||||
|
||||
To not do this every time, the resulting `PathInfo` is saved in a local (`redb`)
|
||||
database.
|
||||
|
||||
This also showcases how `PathInfoService`s can refer to other store types (blob
|
||||
services, directory services).
|
||||
|
||||
```toml
|
||||
[blobservices.root]
|
||||
type = "objectstore"
|
||||
object_store_url = "file:///var/lib/tvix-store/blobs.object_store"
|
||||
object_store_options = {}
|
||||
|
||||
[directoryservices.root]
|
||||
type = "redb"
|
||||
is_temporary = false
|
||||
path = "/var/lib/tvix-store/directories.redb"
|
||||
|
||||
[pathinfoservices.root]
|
||||
type = "cache"
|
||||
near = "redb"
|
||||
far = "cache-nixos-org"
|
||||
|
||||
[pathinfoservices.redb]
|
||||
type = "redb"
|
||||
is_temporary = false
|
||||
path = "/var/lib/tvix-store/pathinfo.redb"
|
||||
|
||||
[pathinfoservices.cache-nixos-org]
|
||||
type = "nix"
|
||||
base_url = "https://cache.nixos.org"
|
||||
public_keys = ["cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="]
|
||||
blob_service = "root"
|
||||
directory_service = "root"
|
||||
```
|