docs(tvix): document Store configuration

This describes the current composition system used for BlobService /
DirectoryService / PathInfoService, why it's hidden, how to expose it, and
adds some common examples to explain it.

Change-Id: I2ce7da40992cc988947c3e924499f8157c5e4937
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12749
Tested-by: BuildkiteCI
Reviewed-by: yuka <yuka@yuka.dev>
parent 1428ea4e19
commit 0b8ec03797
3 changed files with 175 additions and 2 deletions
@@ -120,8 +120,6 @@ Extend the other pages in here. Some ideas on what should be tackled:
   and trait-focused?
 - Restructure docs on castore vs store, this seems to be duplicated a bit and
   is probably still not too clear.
-- Describe store composition(s) in more detail. There's some notes on granular
-  fetching which probably can be repurposed.
 - Absorb the rest of //tvix/website into this.

 ## Features
@@ -57,6 +57,8 @@ The flexibility of this doesn't need to be exposed to the user in the default
 case; in most cases we should be fine with some form of on-disk storage and a
 bunch of substituters with different priorities.

+Check [Store Configuration](./store-configuration.md) for more details.
+
 ### gRPC Clients
 Clients are encouraged to always read blobs in a chunked fashion (asking for a
 list of chunks for a blob via `BlobService.Stat()`, then fetching chunks via
173
tvix/docs/src/castore/store-configuration.md
Normal file
@@ -0,0 +1,173 @@
# Store Configuration

Currently, `tvix-store` (and `tvix-cli`) expose three different `--*-service-addr`
CLI args, describing how to talk to the three different stores.

Depending on the CLI entrypoint, they have different defaults:

- `tvix-cli` defaults to in-memory variants (`ServiceUrlsMemory`).
- `tvix-store daemon` defaults to using a local filesystem-based backend for
  blobs, and redb backends for `DirectoryService` and `PathInfoService`
  (`ServiceUrls`).
- Other `tvix-store` entrypoints, as well as `nar-bridge`, default to talking to
  a `tvix-store` gRPC daemon (`ServiceUrlsGrpc`).

The exact config and paths can be inspected by invoking `--help` on each of
these entrypoints, and it's of course possible to change this config, for
example in case everything should be done from a single binary, without a daemon
in between.
There is currently no client-side caching wired up, and there are some (known)
unnecessary roundtrips (which can be removed after some refactoring), so for
everything except testing purposes you might want to directly connect to the
data stores, or use Store Composition to add caching (and to describe more
complicated fetch-through configs).

## Store Composition
Internally, `tvix-castore` supports composing multiple instances of `BlobService`,
`DirectoryService` (and `PathInfoService`) together.

It allows describing more complicated "hierarchies"/"tiers" of different
service types. It supports combining different storage backends/substituters/
caches in a DAG of some sort, ultimately exposing the same
(trait) interface as a single store.

The three individual URLs currently exposed in the CLI are internally converted
to a composition with just one instance of each store (at the "root" name).
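
As a rough illustration, such a single-instance composition could look like the sketch below when written out in the TOML format described further down. The backend types, keys and paths are borrowed from the examples later in this document and are illustrative only; they are not necessarily what the CLI generates from its default URLs.

```toml
# Sketch only: one instance per store, each named "root".
# Paths are placeholders, not necessarily the actual CLI defaults.
[blobservices.root]
type = "objectstore"
object_store_url = "file:///var/lib/tvix-store/blobs.object_store"
object_store_options = {}

[directoryservices.root]
type = "redb"
is_temporary = false
path = "/var/lib/tvix-store/directories.redb"

[pathinfoservices.root]
type = "redb"
is_temporary = false
path = "/var/lib/tvix-store/pathinfo.redb"
```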

Keep in mind the config format is very granular and low-level, and due to this,
potentially subject to larger breaking and unannounced changes, which is why
it is not exposed by default yet.

In the long term, for "user-facing" configuration, we might want to expose a
more opinionated middle ground between only a single instance and the super
granular store composition instead.

For example, users could configure things like "a list of substituters"
and "caching args", and internally this could be transformed to a low-level
composition - potentially leaving this granular format for library/power users
only.

### CLI usage
However, if you're okay with these caveats, and want to configure some caching
today, using the existing CLI entrypoints, you can enable the
`xp-composition-cli` feature flag in the `tvix-store` crate.

With `cargo`, this can be enabled by passing
`--features tvix-store/xp-composition-cli` to a `cargo build` / `cargo run`
invocation.

If enabled, CLI entrypoints get a `--experimental-store-composition` arg, which
accepts a TOML file describing a composition for all three stores (causing the
other `--*-service-addr` args to be ignored if set).

It expects all `BlobService` instances to be inside a `blobservices` namespace/
attribute (`DirectoryService`s in `directoryservices`, and `PathInfoService`s
in `pathinfoservices` respectively), and requires one named "root".
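
The file layout this implies can be sketched as follows: every namespace carries its own `root` entry, while other instance names appear to be free-form and only matter when another instance refers to them. The `grpc` type and `url` key are taken from the blob and path info examples later in this document; using the same keys for a directory service is an assumption here, as is the address.

```toml
# Sketch: all three stores delegated to a tvix-store gRPC daemon.
[blobservices.root]
type = "grpc"
url = "grpc+http://[::1]:8000"

# Assumption: directory services accept the same "grpc" type and "url" key.
[directoryservices.root]
type = "grpc"
url = "grpc+http://[::1]:8000"

[pathinfoservices.root]
type = "grpc"
url = "grpc+http://[::1]:8000"
```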

### Library usage
The store composition code can be accessed via `tvix_castore::composition`, and
`tvix_store::composition`.

A global "registry" can be used to make other (out-of-tree) "types" of stores
known to the composition machinery.

In terms of config format, you're also not required to use TOML; anything
`serde` can deserialize works.

Make sure to check the module-level docstrings and code examples for
`tvix_castore::composition`.

### Composition config format
The examples below are in the format accepted by the CLI, using the
`blobservices` / `directoryservices` / `pathinfoservices` namespace/attribute to
describe all three services.

However, as expressed above, for library users this doesn't need to be TOML (but
anything serde can deserialize), and the composition hierarchy needs to be built
separately for each `{Blob,Directory,PathInfo}Service`, dropping the namespaces
present in the TOML.
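
To illustrate the last point: assuming a library user still chose TOML, the input for the blob side alone would presumably contain just the instance names at the top level, without the `blobservices` prefix. This is a sketch inferred from the description above, not taken from the `tvix_castore::composition` documentation; the types and keys mirror the CLI examples in this document.

```toml
# Blob service hierarchy only; note the missing "blobservices." prefix.
# Directory and path info hierarchies would be separate inputs.
[root]
type = "combined"
near = "near"
far = "far"

[near]
type = "objectstore"
object_store_url = "file:///tmp/tvix/blobservice"
object_store_options = {}

[far]
type = "grpc"
url = "grpc+http://[::1]:8000"
```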

#### Example: combined remote/local blobservice
This fetches blobs from a local store. If not found there, a remote store is
queried, and results are returned to the client and inserted into the local
store, to make subsequent lookups not query the remote again.

```toml
[blobservices.root]
type = "combined"
near = "near"
far = "far"

[blobservices.near]
type = "objectstore"
object_store_url = "file:///tmp/tvix/blobservice"
object_store_options = {}

[blobservices.far]
type = "grpc"
url = "grpc+http://[::1]:8000"

# […] directoryservices/pathinfoservices go here […]
```
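
Because instances refer to each other by name, it should be possible to nest this pattern into more than two tiers, as the DAG description earlier suggests. The following sketch assumes that `far` may itself point at another `combined` instance; the instance names (`local`, `upstream`, `mirror`, `origin`) and both gRPC addresses are made up for illustration.

```toml
[blobservices.root]
type = "combined"
near = "local"
far = "upstream"

[blobservices.local]
type = "objectstore"
object_store_url = "file:///tmp/tvix/blobservice"
object_store_options = {}

# Second tier: ask a nearby mirror first, then the origin.
# Assumption: a "combined" instance can be the "far" side of another one.
[blobservices.upstream]
type = "combined"
near = "mirror"
far = "origin"

# Hypothetical hosts, for illustration only.
[blobservices.mirror]
type = "grpc"
url = "grpc+http://mirror.example.org:8000"

[blobservices.origin]
type = "grpc"
url = "grpc+http://origin.example.org:8000"
```

Going by the behaviour described above, blobs found only in `origin` would also be inserted into `mirror`, so in such a setup the mirror would have to accept writes.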

#### Example: LRU cache wrapping pathinfoservice
This keeps the last 1000 requested `PathInfo`s around in a local cache.
```toml
[pathinfoservices.root]
type = "cache"
near = "near"
far = "far"

[pathinfoservices.near]
type = "lru"
capacity = 1000

[pathinfoservices.far]
type = "grpc"
url = "grpc+http://localhost:8000"

# […] blobservices/directoryservices go here […]
```

#### Example: Self-contained fetch-through tvix-store for `cache.nixos.org`
This provides a `PathInfoService` "containing" `PathInfo`s that are in
`cache.nixos.org`.

To construct the `PathInfo` initially, we need to ingest the NAR to add missing
castore contents to `BlobService` / `DirectoryService` and return the resulting
root node.

To not do this every time, the resulting `PathInfo` is saved in a local (`redb`)
database.

This also showcases how PathInfo services can refer to other store types (blob
services, directory services).

```toml
[blobservices.root]
type = "objectstore"
object_store_url = "file:///var/lib/tvix-store/blobs.object_store"
object_store_options = {}

[directoryservices.root]
type = "redb"
is_temporary = false
path = "/var/lib/tvix-store/directories.redb"

[pathinfoservices.root]
type = "cache"
near = "redb"
far = "cache-nixos-org"

[pathinfoservices.redb]
type = "redb"
is_temporary = false
path = "/var/lib/tvix-store/pathinfo.redb"

[pathinfoservices.cache-nixos-org]
type = "nix"
base_url = "https://cache.nixos.org"
public_keys = ["cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="]
blob_service = "root"
directory_service = "root"
```
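
The two pathinfoservice examples can presumably be stacked: an in-memory LRU in front of the on-disk `redb` database, which in turn sits in front of `cache.nixos.org`. This sketch only shows the `pathinfoservices` entries that change and assumes that a `cache` instance can serve as the `far` side of another `cache`; the instance names `lru` and `persistent` are made up.

```toml
[pathinfoservices.root]
type = "cache"
near = "lru"
far = "persistent"

[pathinfoservices.lru]
type = "lru"
capacity = 1000

# Assumption: "cache" instances can be nested.
[pathinfoservices.persistent]
type = "cache"
near = "redb"
far = "cache-nixos-org"

# [pathinfoservices.redb] and [pathinfoservices.cache-nixos-org], as well as
# the blob and directory services, stay as in the example above.
```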