When a directory or file is open()'ed, we already put some data into
a lookup table, and subsequent operations then use the returned handle
id.
By also adding the span that's been created during these calls into the
lookup table, we can properly set the span parent for these requests,
nicely connecting the individual operations to the bigger picture.
Change-Id: Ia354842fccdbc7f45c2d3efda3acf058b2dbc48e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11429
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Brian Olsen <me@griff.name>
Instead of creating another child span, we can use
`tracing::Span::current().record(k,v)` to add an additional field to the
current span.
Change-Id: I337faac0e73a0da6eb0a52cb75c2e8c026eff774
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11428
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This currently shares some code with readdir, except it's also providing
a second `fuse_backend_rs::api::filesystem::Entry` argument to the
`add_entry` function call.
Refactoring this to reduce some duplication is left for a future CL.
Change-Id: I282c8dfc6a711d00a4482c87cbb84d4950c0aee9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11426
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Use rq.handle in `release` too, and remove interpolating it into the
log message itself.
Also update the comment, we don't get ownership, just simply drop, and
change the level to warn!, as suggested in cl/11425.
Change-Id: If4e6cff6d8b580671b1548ae3862851db4af6694
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11427
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Similar to how we already handle opening files, implement opendir/
releasedir, and keep a map of dir_handles. They point to the rx side of
a channel.
This greatly improves performance listing the root of the filesystem
when used inside tvix-store, as we don't need to re-request the listing
(and skip to the desired position) all the time.
Change-Id: I0d3ec4cb70a8792c5a1343439cf47d78d9cbb1d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11425
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This allows us acquiring the lock in sync code still. Also, simplify
some of the error handling a bit.
Change-Id: I29e83b715f92808e95ecb0ae9de787339d1a371d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11424
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
We can just pass an async move closure to `self.tokio_handle.block_on`
and make this a bit shorter.
Change-Id: Iba674f34f22ba7a7de7c5bae59d64584884cb17c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11423
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This exposes `user.tvix.castore.{blob,directory}.digest` xattr keys for
files and directories:
```
❯ getfattr -d /tmp/tvix/06jrrv6wwp0nc1m7fr5bgdw012rfzfx2-nano-7.2-info
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.directory.digest="b3:SuYDcUM9RpWcnA40tYB1BtYpR0xw72v3ymhKDQbBfe4="
❯ getfattr -d /tmp/tvix/156a89x10c3kaby9rgf3fi4k0p6r9wl1-etc-shells
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.blob.digest="b3:pZkwZoHN+/VQ8wkaX0wYVXZ0tV/HhtKlSqiaWDK7uRs="
```
It's currently mostly used for debugging, though it might be useful for
tvix-castore-aware syncing programs using the filesystem too.
Change-Id: I26ac3cb9fe51ffbf7f880519f26741549cb5ab6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11422
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
We don't need to copy if we explicitly say that the returned
Option<Path> may hold onto bytes from the passed in &DirEntry.
Change-Id: Ib46b6fd2f8f19a45f8bef79c4c1d2fa6b490cad7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11410
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This is a stream of DirEntry, so let's call it direntry_stream.
Change-Id: I5b3cb4efba899d746393f75f6ece7eaa79424717
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11401
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This adds a Directory service using
https://cloud.google.com/bigtable/docs/ as a K/V store.
Directory (closures) are put in individual keys.
We don't do any bucketed upload of directory closures (yet), as castore/
fs does query individually, does not request recursively (and buffers).
This will be addressed by store composition at some point.
Change-Id: I7fada45bf386a78b7ec93be38c5f03879a2a6e22
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11212
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
This (and more) should now be covered by the generic testsuite
(in crate::blobservice::tests).
Change-Id: Ib3afc4f19f7e37a561b7398d43663dc941971f5c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11365
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
We need to define behaviours and add tests for these.
Change-Id: Id5825fafbf47897d8de42503ea6006eb131b1082
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11281
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This drops pretty much all of castore/utils.rs.
There were only two things left in there, both a bit messy and only used
for tests:
Some `gen_*_service()` helper functions. These can be expressed by
`from_addr("memory://")`.
The other thing was some plumbing code to test the gRPC layer, by
exposing a in-memory implementation via gRPC, and then connecting to
that channel via a gRPC client again.
Previous CLs moved the connection setup code to
{directory,blob}service::tests::utils, close to where we exercise them,
the new rstest-based tests.
The tests interacting directly on the gRPC types are removed, all
scenarios that were in there show now be covered through the rstest ones
on the trait level.
Change-Id: I450ccccf983b4c62145a25d81c36a40846664814
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11223
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
This introduces rstest-based tests. We also add fixtures for creating
some BlobService / DirectoryService out of thin air.
To test a PathInfoService, we don't really care too much about its
internal storage - ensuring they work is up to the castore tests.
Change-Id: Ia62af076ef9c9fbfcf8b020a781454ad299d972e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11272
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This allows us to use containers around BlobServices as BlobServices too.
Change-Id: I3c7feb074f42b4e07c550fb8dfa63cf81d448ab5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11249
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This creates test scenarios (using the DirectoryService trait) that we
want all DirectoryService implementations to pass.
Some of these tests are ported from proto::tests::grpc_directoryservice,
which tested this on the gRPC interface (rather than the trait),
some others ensure certain behaviour for which we only recently
introduced general checking logic (through ClosureValidator).
We also borrow some code related to setting up a gRPC DirectoryService
client (connecting to a server exposing a in-memory DiretoryService)
from castore::utils, this will be deleted once it's all ported over.
Change-Id: I6810215a76101f908e2aaecafa803c70d85bc552
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11247
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
This allows us to use containers around DirectoryServices as DirectoryServices too.
Change-Id: I56cca27b3212858db8b12b874df0e567dd868711
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11248
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This uses DirectoryClosureValidator for validation and the sled batch
API to insert multiple directories at once.
Change-Id: I2d6dc513ccbc02e638f8d22173da5463e73182ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11222
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This greatly simplifies the code in this function, replacing it with a
much better tested (and more capable!) version of the validation logic.
It also enables the gRPC server frontend to make use of the
DirectoryPutter interface. While this might not be too visible in terms
of latency thanks to gRPC streams bursting, it also enables further
optimizations later (such as bucketing of directory closures).
Change-Id: I21f805aa72377dd5266de3b525905d9f445337d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11221
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This simplifies a bunch of code, and gets rid of some TODOs.
Also, move it out of castore/utils, and into its own file.
Change-Id: Ie63e05a6cdfb2a73e878cf7107f9172aed1cdf13
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11224
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
This can be used to validate a Directory closure (connected DAG of
Directories), and their insertion order.
Directories need to be inserted (via `add`), in an order from the leaves
to the root. During insertion, we validate as much as we can at that
time:
- individual validation of Directory messages
- validation of insertion order (no upload of not-yet-known Directories)
- validation of size fields of referred Directories
Internally it keeps all received Directories (and their sizes) in a HashMap,
keyed by digest.
Once all Directories have been inserted, a drain() function can be
called to get a (deduplicated and) validated list of directories, in
from-leaves-to-root order (to be stored somewhere).
While assembling that list, a check for graph connectivity is performed
too, to ensure there's no separate components being sent (and only one
root).
It adds a test suite for these cases, which is much nicer to test than
where we previously had these checks (only in the gRPC server wrapper).
Followup CLs will move the existing putters to use this.
Change-Id: Ie88c832924c170a24626e9e3e91d868497b5d7a4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11220
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
We need to ensure the Directories are successfully uploaded before doing
any testing with them.
Change-Id: Iafa8deb86b3d5eb302ebfba3ced34385f67a7229
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11244
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
This allows these messages to be put in HashSets.
Change-Id: Ia58094cafe53eb624578821d3d8d969c5d21a1d7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11219
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Log the entire span with "trace" level, not just its `ret` level.
The level of the error value event defaults to ERROR, so we don't loose
these.
B3Digest implements Debug and Display the same way, so we can omit the
`(Display)` part in `ret(Display)` for them.
Change-Id: Id00d123a5798e5bdc9820dd97ae2b4d4eb5455f0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11218
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This is no public API to construct this, there's exactly one caller,
and it's perfectly fine to directly populate the struct there.
Change-Id: Idae43a0162ee9bc687d21c550e0c9df33f12d263
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11217
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This makes it easier to see what's going wrong when uploading multiple
Directories.
Change-Id: Ieb71424b9761777c5f719b2f365962644de82baf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11209
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This functionality is provided by the object store backend too
(using `objectstore+file://$some_path`).
This backend also supports content-defined chunking and compresses
chunks with zstd.
Change-Id: I5968c713112c400d23897c59db06b6c713c9d8cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11205
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This controls whether tvix-castore has support for various cloud
backends or not.
Use this to control the set of feature flags for the object_store
backend, and only enable the aws, azure and gcp ones if it's set.
In the future this can be used to enable/disable other cloud backends
too.
Without feature flags, `object_store` already supports the `InMemory`
and `LocalFilesystem` backends, and we also want to unconditionally
enable the `http` one. Make sure at least the construction of these
services is covered in the tests.
Similarly, the tvix-store crate, which provides the tvix-store CLI has a
`cloud` feature flag too (defaulting to enabled).
Change-Id: I9fb9c87b740e7dc83f8ff7a0862905d036d513f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11204
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
The rust trait was missing to document the order of the elements in the
stream. Document that, and also the reasoning behind this.
Change-Id: I27ef0b2020082783fc41c2015233175e2b8e716d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11203
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This will allow feature-flagging some of the backends.
Change-Id: Iddbdb89d3cf9c966a2c25b06b03e6917b284cae5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11201
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
This will allow feature-flagging some of the backends.
Change-Id: Idffbf8b3fd154f5a3d938225c3871feffea8ff8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11200
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
We don't need to use BASE64 here on our own, B3Digest has a Display
impl.
This will also make sure the `b3:` digest is present in field values.
Change-Id: I0ce6ee0f7e7e99fb9b16872953a1b742e99be291
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11192
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Have derive_{blob,chunk}_path emit trace-level events for both the
values they're called with, as well as the return value.
With RUST_LOG in place, it doesn't get lost in other unrelated noise.
Change-Id: Id2451e3657324eff482841eb26a22d19e22bde30
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11136
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Whenever this encounters an open_read(), it'll first check for more
granular chunking. If there's more granular chunking data available, a
ChunkedReader is constructed (which supports seeking backwards).
This currently is still a bit stupid, and doesn't compose, as
`ChunkedReader` uses `self` as the `BlobService` to ask for the
individual chunks.
In store composition future, we might want to compose this differently,
essentially constructing `ChunkedReader` with another `BlobService`
representing the entire hierarchy, so there's a chance to locally cache
things, and do less requests.
Change-Id: I22e0df4d6245f666d083b4f0b7114d3ac41d1dce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11185
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>