Commit graph

228 commits

Author SHA1 Message Date
Florian Klink
b70744fda6 refactor(tvix/*/import): rename direntry_stream, entries_per_depths
Align these names and comments with the two users, to make it more
obvious we're doing the same thing here, just use a different method to
come up with entries_per_depths.

Change-Id: I42058e397588b6b57a6299e87183bef27588b228
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11415
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:06:50 +00:00
Florian Klink
bcc00fba8f refactor(tvix/castore/import): inline process_entry
This did very little, and especially the part of relying on the outside
caller to pass in a Directory if the type is a directory required having
per-entry-type specific logic anyways.

It's cleaner to just inline it.

Change-Id: I997a8513ee91c67b0a2443cb5cd9e8700f69211e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11414
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:05:49 +00:00
Florian Klink
d47bd4f4bc refactor(tvix/castore/import): move process_entry to the end of the file
This makes it easier to understand the code.

Change-Id: I0a9047433000551a6ba1f50a8c5c93527bc86216
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11413
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:05:18 +00:00
Florian Klink
c19bc95b8a fix(tvix/castore/blobservice): update bytes_read only on successful read
We previously updated this.pos also in case the underlying read returned
an error.

Also, use the ready! macro to remove the match block, and instrument
errors returned during start_seek.

Change-Id: Ic32e26579d964a76b45687134acc48d72d67c36f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11421
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 14:04:47 +00:00
Florian Klink
bccb4c92c3 refactor(tvix/castore/fs): add InodeData::as_fuse_{entry,file_attr}
Remove the now unused gen_file_attr (which had a wrong docstring).

Change-Id: Ie86b14d1ad798e6233bc44c43ace3f8b95c67ea9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11430
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 13:59:46 +00:00
Florian Klink
50065afd36 refactor(tvix/castore/fs): remove "add … handle" debug messages
Change-Id: Iac22bbef96a2afa0416f011d073934b52b19975d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11433
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-15 13:54:14 +00:00
Florian Klink
d3d88431f3 feat(tvix/castore/fs): assign read[dir*]/release[dir] ops to parent span
When a directory or file is open()'ed, we already put some data into
a lookup table, and subsequent operations then use the returned handle
id.

By also adding the span that's been created during these calls into the
lookup table, we can properly set the span parent for these requests,
nicely connecting the individual operations to the bigger picture.

Change-Id: Ia354842fccdbc7f45c2d3efda3acf058b2dbc48e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11429
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Brian Olsen <me@griff.name>
2024-04-15 13:54:14 +00:00
Florian Klink
34d9d54aae fix(tvix/castore/fs): use record to add fields to current span
Instead of creating another child span, we can use
`tracing::Span::current().record(k,v)` to add an additional field to the
current span.

Change-Id: I337faac0e73a0da6eb0a52cb75c2e8c026eff774
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11428
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 13:23:33 +00:00
Florian Klink
ff5835008f feat(tvix/castore/fs): implement readdirplus
This currently shares some code with readdir, except it's also providing
a second `fuse_backend_rs::api::filesystem::Entry` argument to the
`add_entry` function call.

Refactoring this to reduce some duplication is left for a future CL.

Change-Id: I282c8dfc6a711d00a4482c87cbb84d4950c0aee9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11426
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-15 13:22:08 +00:00
Florian Klink
456191e69e refactor(tvix/castore/fs): use consistent span field name for handle
Use rq.handle in `release` too, and remove interpolating it into the
log message itself.

Also update the comment, we don't get ownership, just simply drop, and
change the level to warn!, as suggested in cl/11425.

Change-Id: If4e6cff6d8b580671b1548ae3862851db4af6694
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11427
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 13:20:07 +00:00
Florian Klink
2a47ad1d90 feat(tvix/castore/fs): implement opendir/releasedir
Similar to how we already handle opening files, implement opendir/
releasedir, and keep a map of dir_handles. They point to the rx side of
a channel.

This greatly improves performance listing the root of the filesystem
when used inside tvix-store, as we don't need to re-request the listing
(and skip to the desired position) all the time.

Change-Id: I0d3ec4cb70a8792c5a1343439cf47d78d9cbb1d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11425
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 13:18:36 +00:00
Florian Klink
3ed7eda79b refactor(tvix/castore/fs): use std::sync::Mutex
This allows us acquiring the lock in sync code still. Also, simplify
some of the error handling a bit.

Change-Id: I29e83b715f92808e95ecb0ae9de787339d1a371d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11424
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-15 09:27:04 +00:00
Florian Klink
1bf6b9f5a0 refactor(tvix/castore/fs): simplify some separate spawn and blocks
We can just pass an async move closure to `self.tokio_handle.block_on`
and make this a bit shorter.

Change-Id: Iba674f34f22ba7a7de7c5bae59d64584884cb17c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11423
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 09:27:04 +00:00
Florian Klink
515bfa18fb feat(tvix/castore/fs): support extended attributes
This exposes `user.tvix.castore.{blob,directory}.digest` xattr keys for
files and directories:

```
❯ getfattr -d /tmp/tvix/06jrrv6wwp0nc1m7fr5bgdw012rfzfx2-nano-7.2-info
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.directory.digest="b3:SuYDcUM9RpWcnA40tYB1BtYpR0xw72v3ymhKDQbBfe4="

❯ getfattr -d /tmp/tvix/156a89x10c3kaby9rgf3fi4k0p6r9wl1-etc-shells
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.blob.digest="b3:pZkwZoHN+/VQ8wkaX0wYVXZ0tV/HhtKlSqiaWDK7uRs="
```

It's currently mostly used for debugging, though it might be useful for
tvix-castore-aware syncing programs using the filesystem too.

Change-Id: I26ac3cb9fe51ffbf7f880519f26741549cb5ab6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11422
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
2024-04-15 09:27:04 +00:00
Florian Klink
b70e01a4db feat(tvix/castore/import): remove copying in find_ancestor
We don't need to copy if we explicitly say that the returned
Option<Path> may hold onto bytes from the passed in &DirEntry.

Change-Id: Ib46b6fd2f8f19a45f8bef79c4c1d2fa6b490cad7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11410
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-13 13:20:29 +00:00
Florian Klink
31e3382129 feat(tvix/*store/bigtable): limit retries connecting to cbtemulator
This kept retrying indefinitely if the socket didn't appear.

Change-Id: I4d4ef61df73cef6abda698501432f370abc8a82c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11406
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-13 12:01:00 +00:00
Florian Klink
cc42cac39c refactor(tvix/castore/import): rename ingest_entries function arg
This is a stream of DirEntry, so let's call it direntry_stream.

Change-Id: I5b3cb4efba899d746393f75f6ece7eaa79424717
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11401
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-13 10:51:58 +00:00
Florian Klink
7bcb896e48 feat(tvix/castore/directory/grpc): instrument functions
Change-Id: I9cc0a6a32184773597556ab5f9250257aa18ca4e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11399
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-12 22:32:37 +00:00
Florian Klink
f8800ba189 chore(tvix): bump rstest to 0.19.0
Change-Id: Ib2f5e84fdb8be1210b3507da67d4fe84f061651e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11387
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-12 22:16:56 +00:00
Florian Klink
17849c5c00 feat(tvix/castore/directory): add bigtable backend
This adds a Directory service using
https://cloud.google.com/bigtable/docs/ as a K/V store.

Directory (closures) are put in individual keys.

We don't do any bucketed upload of directory closures (yet), as castore/
fs does query individually, does not request recursively (and buffers).
This will be addressed by store composition at some point.

Change-Id: I7fada45bf386a78b7ec93be38c5f03879a2a6e22
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11212
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-09 15:50:34 +00:00
Florian Klink
289b3126db feat(tvix/castore): drop test-case crate dep
Change-Id: I5049a3682a58ce848d80f413b2964331025a90a8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11370
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
936a175b2f refactor(tvix/castore/directoryservice/from_addr): migrate to rstest
Change-Id: I52b14652822766421d66e95bc646ed7baecc705f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11369
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
Tested-by: BuildkiteCI
2024-04-07 14:51:47 +00:00
Florian Klink
d94ff54d42 refactor(tvix/castore): migrate closure_validator to rstest
Change-Id: I6c594d2e670a681484b858c3e04bc25b9e5a2077
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11368
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
c715b6d448 refactor(tvix/castore/tonic): migrate to rstest
Change-Id: Ie088bf03c739bf64abf3432175362a8f92f501c2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11367
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
71fb99a265 refactor(tvix/castore/hashing_reader): migrate to rstest
Change-Id: I99ae0e27b4db4799db8af7cd6b9cc8d7f09227de
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11366
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
c7c66abd85 refactor(tvix/castore/blobservice/object_store): drop individual tests
This (and more) should now be covered by the generic testsuite
(in crate::blobservice::tests).

Change-Id: Ib3afc4f19f7e37a561b7398d43663dc941971f5c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11365
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
32ac9bd110 refactor(tvix/blobservice/from_addr): move from test_case to rstest
This allows conditionalizing test cases on feature flags.

Change-Id: Ic9ed9ef52f703c973fda2ca4aae5f425e33b67de
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11364
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
Tested-by: BuildkiteCI
2024-04-07 14:51:47 +00:00
Florian Klink
6e8046bec7 feat(tvix/castore/*service/tests): add objectstore to tests, sort
Change-Id: If3a45d2f8008b75524ead704b05320364cc60d92
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11282
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 21:15:01 +00:00
Florian Klink
508fcd65ee feat(tvix/castore/directoryservice): log more TODOs
We need to define behaviours and add tests for these.

Change-Id: Id5825fafbf47897d8de42503ea6006eb131b1082
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11281
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 07:58:10 +00:00
Florian Klink
74023a07a4 refactor(tvix/castore/*): drop utils.rs and grpc directorysvc tests
This drops pretty much all of castore/utils.rs.

There were only two things left in there, both a bit messy and only used
for tests:

Some `gen_*_service()` helper functions. These can be expressed by
`from_addr("memory://")`.

The other thing was some plumbing code to test the gRPC layer, by
exposing a in-memory implementation via gRPC, and then connecting to
that channel via a gRPC client again.

Previous CLs moved the connection setup code to
{directory,blob}service::tests::utils, close to where we exercise them,
the new rstest-based tests.

The tests interacting directly on the gRPC types are removed, all
scenarios that were in there show now be covered through the rstest ones
on the trait level.

Change-Id: I450ccccf983b4c62145a25d81c36a40846664814
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11223
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 07:58:10 +00:00
Florian Klink
07a51c7dc9 feat(tvix/store): add rstest-based PathInfoService tests
This introduces rstest-based tests. We also add fixtures for creating
some BlobService / DirectoryService out of thin air.
To test a PathInfoService, we don't really care too much about its
internal storage - ensuring they work is up to the castore tests.

Change-Id: Ia62af076ef9c9fbfcf8b020a781454ad299d972e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11272
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 07:02:18 +00:00
Florian Klink
fe6ae58ba5 feat(tvix/castore): add rstest-based BlobService tests
Change-Id: Ifae9c41e1e3fa5e96f9b7e188181a44650ff8a04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11250
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-24 20:03:01 +00:00
Florian Klink
f5cf659245 feat(tvix/castore): AsRef<dyn BlobService> impl BlobService
This allows us to use containers around BlobServices as BlobServices too.

Change-Id: I3c7feb074f42b4e07c550fb8dfa63cf81d448ab5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11249
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 20:01:22 +00:00
Florian Klink
3ece32bbf9 feat(tvix/castore): add rstest-based DirectoryService tests
This creates test scenarios (using the DirectoryService trait) that we
want all DirectoryService implementations to pass.

Some of these tests are ported from proto::tests::grpc_directoryservice,
which tested this on the gRPC interface (rather than the trait),
some others ensure certain behaviour for which we only recently
introduced general checking logic (through ClosureValidator).

We also borrow some code related to setting up a gRPC DirectoryService
client (connecting to a server exposing a in-memory DiretoryService)
from castore::utils, this will be deleted once it's all ported over.

Change-Id: I6810215a76101f908e2aaecafa803c70d85bc552
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11247
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-24 20:00:40 +00:00
Florian Klink
6f5474bf02 feat(tvix/castore): AsRef<dyn DirectoryService> impl DirectoryService
This allows us to use containers around DirectoryServices as DirectoryServices too.

Change-Id: I56cca27b3212858db8b12b874df0e567dd868711
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11248
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-24 20:00:40 +00:00
Florian Klink
21fcc1c9df feat(tvix/castore/directory): add SledDirectoryPutter
This uses DirectoryClosureValidator for validation and the sled batch
API to insert multiple directories at once.

Change-Id: I2d6dc513ccbc02e638f8d22173da5463e73182ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11222
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 19:56:55 +00:00
Florian Klink
f7281d8fd5 refactor(tvix/castore/directory/grpc_wrapper): use ClosureValidator
This greatly simplifies the code in this function, replacing it with a
much better tested (and more capable!) version of the validation logic.

It also enables the gRPC server frontend to make use of the
DirectoryPutter interface. While this might not be too visible in terms
of latency thanks to gRPC streams bursting, it also enables further
optimizations later (such as bucketing of directory closures).

Change-Id: I21f805aa72377dd5266de3b525905d9f445337d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11221
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 19:55:42 +00:00
Florian Klink
5f069a3eb8 refactor(tvix/castore/directory): have SimplePutter use Validator
This simplifies a bunch of code, and gets rid of some TODOs.

Also, move it out of castore/utils, and into its own file.

Change-Id: Ie63e05a6cdfb2a73e878cf7107f9172aed1cdf13
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11224
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-24 17:42:30 +00:00
Florian Klink
c92ef2df64 feat(tvix/castore/directory): add ClosureValidator
This can be used to validate a Directory closure (connected DAG of
Directories), and their insertion order.

Directories need to be inserted (via `add`), in an order from the leaves
to the root. During insertion, we validate as much as we can at that
time:

 - individual validation of Directory messages
 - validation of insertion order (no upload of not-yet-known Directories)
 - validation of size fields of referred Directories

Internally it keeps all received Directories (and their sizes) in a HashMap,
keyed by digest.

Once all Directories have been inserted, a drain() function can be
called to get a (deduplicated and) validated list of directories, in
from-leaves-to-root order (to be stored somewhere).

While assembling that list, a check for graph connectivity is performed
too, to ensure there's no separate components being sent (and only one
root).

It adds a test suite for these cases, which is much nicer to test than
where we previously had these checks (only in the gRPC server wrapper).

Followup CLs will move the existing putters to use this.

Change-Id: Ie88c832924c170a24626e9e3e91d868497b5d7a4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11220
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-24 17:39:49 +00:00
Florian Klink
110734dfed docs(tvix/castore): fix missing slash in docstring
Change-Id: I5f39d0e613b651458b168cfd9df0693d7f01d704
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11246
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-23 22:06:27 +00:00
Florian Klink
e449f423e7 fix(tvix/castore/directory/tests): close upload handle
We need to ensure the Directories are successfully uploaded before doing
any testing with them.

Change-Id: Iafa8deb86b3d5eb302ebfba3ced34385f67a7229
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11244
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-23 22:00:16 +00:00
Florian Klink
9793745459 feat(tvix/castore): derive Eq and Hash automatically
This allows these messages to be put in HashSets.

Change-Id: Ia58094cafe53eb624578821d3d8d969c5d21a1d7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11219
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-03-20 21:19:02 +00:00
Florian Klink
345a639e79 refactor(tvix/castore): instrument DirectoryPutter impls consistently
Log the entire span with "trace" level, not just its `ret` level.

The level of the error value event defaults to ERROR, so we don't loose
these.

B3Digest implements Debug and Display the same way, so we can omit the
`(Display)` part in `ret(Display)` for them.

Change-Id: Id00d123a5798e5bdc9820dd97ae2b4d4eb5455f0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11218
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 21:02:44 +00:00
Florian Klink
60b47b336b refactor(tvix/castore/directory): remove GRPCPutter::new
This is no public API to construct this, there's exactly one caller,
and it's perfectly fine to directly populate the struct there.

Change-Id: Idae43a0162ee9bc687d21c550e0c9df33f12d263
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11217
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 21:02:44 +00:00
Florian Klink
9fb213f47a feat(tvix/castore): record errors for some failures in SimplePutter
This makes it easier to see what's going wrong when uploading multiple
Directories.

Change-Id: Ieb71424b9761777c5f719b2f365962644de82baf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11209
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-20 12:21:09 +00:00
Florian Klink
5627dc04e1 feat(tvix/castore/blob): document missing objectstore+*:// URL
Change-Id: I3cbc75e636efd467ddda7fc3c3c8f19ad42204ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11206
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 12:17:42 +00:00
Florian Klink
345cebaebb refactor(tvix/castore/blob): drop simplefs
This functionality is provided by the object store backend too
(using `objectstore+file://$some_path`).

This backend also supports content-defined chunking and compresses
chunks with zstd.

Change-Id: I5968c713112c400d23897c59db06b6c713c9d8cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11205
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 12:17:42 +00:00
Florian Klink
2798803f76 refactor(tvix/castore): introduce "cloud" feature flag
This controls whether tvix-castore has support for various cloud
backends or not.

Use this to control the set of feature flags for the object_store
backend, and only enable the aws, azure and gcp ones if it's set.
In the future this can be used to enable/disable other cloud backends
too.

Without feature flags, `object_store` already supports the `InMemory`
and `LocalFilesystem` backends, and we also want to unconditionally
enable the `http` one. Make sure at least the construction of these
services is covered in the tests.

Similarly, the tvix-store crate, which provides the tvix-store CLI has a
`cloud` feature flag too (defaulting to enabled).

Change-Id: I9fb9c87b740e7dc83f8ff7a0862905d036d513f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11204
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-20 12:17:42 +00:00
Florian Klink
591edf0d5b docs(tvix/castore/directory): update docstring for get_recursive
The rust trait was missing to document the order of the elements in the
stream. Document that, and also the reasoning behind this.

Change-Id: I27ef0b2020082783fc41c2015233175e2b8e716d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11203
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-20 12:17:42 +00:00
Florian Klink
c0566985b0 refactor(tvix/castore/directory/from_addr): use match guards
This will allow feature-flagging some of the backends.

Change-Id: Iddbdb89d3cf9c966a2c25b06b03e6917b284cae5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11201
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-20 11:53:04 +00:00
Florian Klink
7deadd50d5 refactor(tvix/castore/blob/from_addr): use match guards
This will allow feature-flagging some of the backends.

Change-Id: Idffbf8b3fd154f5a3d938225c3871feffea8ff8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11200
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 11:52:29 +00:00
Florian Klink
499bc2f7ee refactor(tvix/castore/blobsvc): use B3Digest Display impl
We don't need to use BASE64 here on our own, B3Digest has a Display
impl.

This will also make sure the `b3:` digest is present in field values.

Change-Id: I0ce6ee0f7e7e99fb9b16872953a1b742e99be291
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11192
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-18 16:10:05 +00:00
Florian Klink
05bdb68523 feat(tvix/blobservice/object_store) more logging
Have derive_{blob,chunk}_path emit trace-level events for both the
values they're called with, as well as the return value.

With RUST_LOG in place, it doesn't get lost in other unrelated noise.

Change-Id: Id2451e3657324eff482841eb26a22d19e22bde30
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11136
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-18 16:10:05 +00:00
Florian Klink
82f8ce8b7d feat(tvix/castore/blobsvc/grpc): read data in chunks
Whenever this encounters an open_read(), it'll first check for more
granular chunking. If there's more granular chunking data available, a
ChunkedReader is constructed (which supports seeking backwards).

This currently is still a bit stupid, and doesn't compose, as
`ChunkedReader` uses `self` as the `BlobService` to ask for the
individual chunks.

In store composition future, we might want to compose this differently,
essentially constructing `ChunkedReader` with another `BlobService`
representing the entire hierarchy, so there's a chance to locally cache
things, and do less requests.

Change-Id: I22e0df4d6245f666d083b4f0b7114d3ac41d1dce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11185
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-18 16:10:05 +00:00
Florian Klink
70bbf23767 refactor(tvix/castore/src/blobservice): remove useless else case
Change-Id: I09000371a1d8ff212ab46050d1a480509c6ffe70
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11183
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-18 14:25:37 +00:00
Florian Klink
3b1c9172f6 feat(tvix/castore): impl Debug for B3Digest
Use the same format as Display, b3: followed by the base64
representation. This makes the debug implementation of everything
containing a b3 digest much nicer to read.

Change-Id: I3ca3154d0b6fb07781c8f9c83ece3ff1a6957902
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11181
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-17 22:18:20 +00:00
Florian Klink
dbf87f3057 chore(tvix): bump tonic to 0.11.0
This bumps tonic and surrounding crates to 0.11.x.

We added support for tonic 0.11.x into tokio-listener
(https://github.com/vi/tokio-listener/pull/4), so that's bumped as well.

Change-Id: Icfade5894403228299836fefb21b2f9ae59dbebb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11156
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-16 17:04:12 +00:00
Florian Klink
fdf9657654 fix(tvix): don't emit rerun-if-changed
`build.rs` emits rerun-if-changed statements for all proto files, as
well as all include paths we pass it.

Unfortunately, due to protobufs include path rules, we need to specify
the path to the depot root itself as an include path, at least when
building impurely with `cargo`. This causes cargo to essentially always
rebuild, as it also puts its own temporary files in there.

Unfortunately, tonic-build does not chase down to individual .proto
files that are included.

Disable emitting these `rerun-if-changed` statements for now.

This could cause cargo to not rebuild protos every time, causing stale
data until the next local `cargo clean`, but considering the protos
change not that frequently, and it'll immediately surface if trying to
build via Nix (either locally or in CI), it's a good-enough compromise.

Change-Id: Ifd279a2216222ef3fc0e70c5a2fe6f87997f562e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11157
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2024-03-16 09:34:10 +00:00
Florian Klink
f22d5b3d11 feat(tvix/castore/blobsvc/from_addr): support object_store
The object_store crate supports a ton of different stores, with different schemes.

For now, use a objectstore+ scheme prefix to enable these.

Change-Id: I946f76e32a0fb0867ef59060217894cda5b959b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11080
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-03-11 22:42:01 +00:00
Florian Klink
1c2db676a0 feat(tvix/castore/blobsvc): add object storage implementation
This uses the `object_store` crate to expose a tvix-castore BlobService
backed by object storage.

It's using FastCDC to chunk blobs into smaller chunks when writing to
it.

These are exposed at the .chunks() method.

Change-Id: I2858c403d4d6490cdca73ebef03c26290b2b3c8e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11076
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
2024-03-11 22:42:01 +00:00
Florian Klink
4e78de7393 fix(tvix/castore/grpc/directory): skip_all fields in instrument
This only contains the outer metadata wrapping, and that's not too interesting:

> Request { metadata: MetadataMap { headers: {"content-type":
> "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers",
> "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions:
> Extensions }

Drop these fields for now, and rely on the underlying implementations to
add instrumentation for the application-specific fields.

Clean up the error logging a bit.

Change-Id: Ife1090ed411766a61e1fa60fd4c9570f38de1e98
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11102
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-09 05:47:41 +00:00
Florian Klink
05deb37f44 fix(tvix/castore/grpc/blob): skip_all fields in instrument
This only contains the outer metadata wrapping, and that's not too interesting:

> Request { metadata: MetadataMap { headers: {"content-type":
> "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers",
> "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions:
> Extensions }

Drop these fields for now, and rely on the underlying implementations to
add instrumentation for the application-specific fields.

Log errors in some places where we didn't so far.

Change-Id: Ia68d6c526987d3716be62a0809195401cf28512b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11101
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-09 05:47:36 +00:00
Florian Klink
4954a39de4 fix(tvix/castore): also set SSL_CERT_FILE for tests there
For everything using reqwest here during test cases, we also need to
set SSL_CERT_FILE.

Change-Id: If8aeda65f3d75cb9ac5c9bc64e37a0cb7dffc17c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11092
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-03 17:43:42 +00:00
Florian Klink
ef3f8936cb refactor(tvix/*/from_addr): improve test debuggability
If there's an unexpected test failure, print it out, rather than just
saying something is false even though it should be true.

Use .expect() for this, which displays the error if it failed.
We can't use expect_err(), as our stores are not display'able, so use an
assertion with a message there.

Change-Id: I2d88861d979d107edc0717fbdb3cdac9a6bfc5e4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11091
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: flokli <flokli@flokli.de>
2024-03-03 16:54:19 +00:00
Florian Klink
4b4443240e feat(tvix/castore): add HashingReader, B3HashingReader
HashingReader wraps an existing AsyncRead, and allows querying for the
digest of all data read "through" it.
The hash function is configurable by type parameter, and we define
B3HashingReader.

Change-Id: Ic08142077566fc08836662218f5ec8c3aff80be5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11087
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-03 15:31:31 +00:00
Florian Klink
8383e9e02e feat(tvix/castore/digests): impl From digest::Output<_> for B3Digest
This allows calling .into() to get a B3Digest.

Change-Id: I6e63b496413cd00d84acfcd15c7de0f64c79721f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11086
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-03 15:18:19 +00:00
Florian Klink
7bebf492ec refactor(tvix/castore/blobsvc/chunked_reader): refactor, document
The public-consumable thing here is ChunkedReader, not ChunkedBlob.

ChunkedBlob is a helper that can be used to get a new AsyncRead, but
not AsyncSeek. It is used internally by ChunkedReader whenever the
client seeks.

Make this more obvious, by extending the documentation, and putting
ChunkedReader at the top of this file.

Also make ChunkedBlob and its methods private, and give ChunkedReader a
more useful constructor (from_chunks, instead of from_chunked_blob).

Change-Id: I2399867591df923faa73927b924e7c116ad98dc0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11079
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-03 11:22:56 +00:00
Florian Klink
53fb9ff4c6 feat(tvix/castore/blobsvc): BlobReader for more trivial types
Change-Id: I80e4f26c41a504fa4c6a013c2a1e76de613ba294
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11078
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-02 17:05:23 +00:00
Florian Klink
982459d343 fix(tvix/castore/blobwriter): don't require Sync + 'static
There's no reason for these two.

Change-Id: Ie6f238bbb0b17971c9877b11b61ea7ebca573c13
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11075
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-02 05:56:21 +00:00
Florian Klink
b38badf206 docs(tvix/castore/directorysvc): K/V is not necessarily flat
Some implementations of DirectoryService might not allow retrieval of
intermediate Directory nodes, that are not at the "root".

Think about an object store implementation. The client is doing a
get_recursive anyways to reduce the number of roundtrips.

By documenting the fact we don't need to support looking up intermediate
Directory messages, we can just batch all directories into the same
object, keyed by the root.

Change-Id: I019d720186d03c4125cec9191e93d20586a20963
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10988
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2024-02-20 09:17:38 +00:00
Peter Kolloch
5777050821 feat(tvix/castore): Compile fix for Darwin
Towards https://b.tvl.fyi/issues/264

Change-Id: If8fa912ae3fb2987b761f649ab738529ebf3b2e8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10970
Autosubmit: Peter Kolloch <info@eigenvalue.net>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-19 17:14:24 +00:00
Florian Klink
28f5c13c53 fix(tvix/castore): don't emit ret as INFO
This otherwise gets a bit spammy.

Change-Id: I288350a600d79a394c239f253424ad55bc3cefc5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10954
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-18 07:12:27 +00:00
Florian Klink
34a1ff291a feat(tvix/castore/fs): make allow_other configurable
Also add a cli argument to the `tvix-store` binary.

Change-Id: Id07d7fedb60d6060543b195f3a810a46137f9ad5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10945
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-17 07:00:41 +00:00
Florian Klink
d10c5309bc feat(tvix/castore/blobsvc): add Chunked{Blob,Reader}
These provide seekable access into a Blob for which we have more
granular chunking information.

There's no support for verified streaming in here yet, this simply
produces a stream of readers for each chunk, skipping irrelevant chunks
and data from the first chunk at the beginning.

A seek simply does produce a new reader using the same process.

Change-Id: I37f76b752adce027586770475435f3990a6dee0b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10731
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-10 14:24:51 +00:00
Florian Klink
40d81d0c74 docs(tvix/castore/blobstore): reorganize docs
docs/verified-streaming.md explained how CDC and verified streaming can
work together, but didn't really highlight enough how chunking in
general also helps with seeking.

In addition, a lot of the thoughts w.r.t. the BlobStore protocol, both
gRPC and Rust traits, as well as why there's no support for seeking
directly in gRPC, as well as how clients should behave w.r.t. chunked
fetching was missing, or mixed together with the verified streaming
bits.

While there is no verified streaming version yet, a chunked one is
coming soon, and documenting this a bit better is gonna make it easier
to understand, as well as provide some lookout on where this is heading.

Change-Id: Ib11b8ccf2ef82f9f3a43b36103df0ad64a9b68ce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10733
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-06 18:28:00 +00:00
Florian Klink
cb2cf3f6b7 fix(tvix/castore/grpc/svc_wrapper): expose chunks() over gRPC
The Stat() method was just always signalling no granular chunks are
available. However, as we now have a .chunks() method, we can expose it
over gRPC.

Change-Id: I74f0890ae083f301bb0cec62f1ea4a95463ac590
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10736
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 16:27:10 +00:00
Florian Klink
9504015031 feat(tvix/castore/blobsvc): validate StatBlobResponse
All chunks must have valid blake3 digests. It is allowed to send an
empty list, if no more granular chunking is available.

Change-Id: I7ecb53579cdf40fd938bb68a85685751b4d3626f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10726
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-02-02 16:26:38 +00:00
Florian Klink
5ad5a0da00 refactor(tvix/castore/grpc/blobsvc): inline stream_mapper
This can be written without the additional function.

Change-Id: Ib11c5d5254d3e44c8fa9661414835b0622eb1ac4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10735
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-02 16:25:37 +00:00
Florian Klink
1157eea710 docs(tvix/castore/blobsvc): fix doc comments on trait
The readers implement AsyncRead/AsyncSeek, not their sync counterparts.
Also update expectations around chunks.

Change-Id: Ic266688039d80d16d33f651b96ce2bcdedecfa00
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10734
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 16:24:06 +00:00
Florian Klink
4c5d9fa356 feat(tvix/castore/docs/verified-streaming): clarify reply
"given chunksize" is misleading here. It's up to the backend to decide
if it does chunking at all, and how it chunks.

Change-Id: I4f130ca9ac34db79f18ef1d6475295806ac7f9a4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10728
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-02-02 08:56:48 +00:00
Florian Klink
459a564ff1 refactor(tvix/castore/blobsvc/combinator): compact trait bounds
BlobService already implies Send and Sync, we don't need to explicitly
list it here.

Change-Id: I58a4c5912be61a60acd961565979aa01d94ee0f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10727
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-02-02 08:55:16 +00:00
Ryan Lahfa
68bba48d59 feat(tvix/castore): process_entry cannot process unsupported nodes
In the past, we had a `todo!` on unsupported node types, this returns a proper error
that can be caught by the caller.

Change-Id: Icba4c1dab33c0d670a97f162c9b358d1ed5855cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10675
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-01-22 14:15:42 +00:00
Connor Brewster
4e341fb5d9 chore(tvix/store): Use BoxStream type alias
The BoxStream type alias is a more concise and easier to read than
the full `Pin<Box<dyn Stream<Item = ...> + Send + ...>>` type.

Change-Id: I5b7bccfd066ded5557e01f7895f4cf5c4a33bd44
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10677
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
2024-01-21 19:41:02 +00:00
Ryan Lahfa
7275288f0e refactor(tvix/castore): break down ingest_path
In one function that does the heavy lifting: `ingest_entries`, and three additional helpers:

- `walk_path_for_ingestion` which perform the tree walking in a very naive way and can be replaced by the user
- `leveled_entries_to_stream` which transforms a list of a list of
  entries ordered by their depth in the tree to a stream of entries in
  the bottom to top order (Merkle-compatible order I will say in the
  future).
- `ingest_path` which calls the previous functions.

Change-Id: I724b972d3c5bffc033f03363255eae448f017cef
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10573
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: raitobezarius <tvl@lahfa.xyz>
2024-01-20 18:26:17 +00:00
Ryan Lahfa
1f1a42b4da feat(tvix/castore): ingestion does DFS and invert it
To make use of the filtering feature, we need to revert the internal walker to a real DFS.

We will therefore just invert the whole tree by storing all of its
contents in a level-keyed vector.

This is horribly expensive in memory, this is a compromise between CPU
and memory, here is the fundamental reason for why:

When you encounter a directory, it's either a leaf or not, i.e. it
contains subdirectories or not.

To know this fact, you can:

- wait until you notice subdirectories under it, i.e. you need to store
  any intermediate nodes you see in the meantime -> memory penalty.
- getdents or readdir on it to determine *NOW* its subdirectories -> CPU
  penalty and I/O penalty.

This is an implementation of the first proposal, we pay memory.

In practice, we are paying O(#nb of nodes) in memory.

There's a smarter albeit much more complicated algorithm that pays only
O(\sum_i #siblings(p_i)) nodes where (p_1, ..., p_n) is the path to a leaf.

which means for:

             A
            / \
           B   C
          /   / \
         D   E   F

We would never store D, E, F but only E, F at a given time.
But we would still store B, C no matter what.

Change-Id: I456ed1c3f0db493e018ba1182665d84bebe29c11
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10567
Tested-by: BuildkiteCI
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
2024-01-20 17:16:01 +00:00
sterni
526295a71d chore(3p/sources): Bump channels & overlays
- Adjust to ecl 23.9.9 release

- Regenerate go protos after protoc-gen-go update

- Drop dhall fork which hasn't kept up with 1.42.*

- Address new clippy warnings:

  - Variant naming of Error::ValidationError
  - Simplify .try_into().unwrap()
  - Drop unnecessary identity function
  - Test module must be last in file
  - Drop unused `pub use`

- Update agenix to 0.15.0. Current master has a installCheckPhase that
  doesn't work with C++ Nix 2.3.*:
  a23aa271be (commitcomment-137185861)

Change-Id: Ic29eef20d6fd1362ce1031364a5ca6b4edf195bd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10615
Reviewed-by: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
Autosubmit: sterni <sternenseemann@systemli.org>
2024-01-19 21:47:32 +00:00
Ryan Lahfa
93afc711f6 feat(tvix/castore): convert import error to std::io::Error
So that we can just `map_err` easily in functions returning `std::io::Error` but calling functions
returning `castore::import::Error`.

Change-Id: Id181b95e8431c69e95f3a8cd569ca10306656e1d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10572
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-18 14:40:06 +00:00
Florian Klink
c5e2832cbd feat(tvix/castore): implement Ord for node::Node
This allows assembling BTreeSets of node::Node.

Change-Id: I97b83be5ffc3e891307a8ef2b5fc31e38b747a62
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10625
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-01-15 18:19:15 +00:00
Florian Klink
4d135bcfa2 feat(tvix/castore): implement CombinedBlobService
First attempt on composition of BlobServices.

Change-Id: I6e70248007edfd322a503fd40c1c4b4300cbc30c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10587
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-01-09 17:32:03 +00:00
Florian Klink
719cbad871 feat(tvix/castore/blobsvc): add chunks method
This adds support to retrieve a list of chunks for a given blob to the
BlobService interface.

While theoretically all chunk-awareness could be kept private inside
each BlobService reader, we'd not be able to resolve individual chunks
from different Blobservices - and due to this, not able to substitute
chunks we already have in a more local store.

This function allows asking a BlobService for the list of chunks,
leaving any actual fetching up to the caller (be it through individual
calls to open_read), or asking another store for it.

Change-Id: I1d33c591195ed494be3aec71a8c804743cbe0dca
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10586
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-01-09 17:31:32 +00:00
Florian Klink
9596c5caff refactor(tvix/castore): do clone inside a scope
Make it clear this is only used inside the scope.

Change-Id: Ie94f88d7f0fb58cd4bf9c2f1176000b272e6f2e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10585
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 16:20:19 +00:00
Florian Klink
9de1ebf23e feat(tvix/castore/grpc): instrument some more functions
Change-Id: Icedb148c88c5f4a3b2242ed12df1dd8692af94fd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10584
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 16:18:48 +00:00
Florian Klink
0009383c07 refactor(tvix/castore/directorysvc): AsRef traverse_to
Change-Id: I641bd4ab3de591a013f03137f1e16295946315f3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10579
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-09 14:15:55 +00:00
Florian Klink
b1c556b7e1 refactor(tvix/castore/blobservice/grpc): remove fn pointer hack
It looks like the workaround isn't necessary anymore.

Change-Id: Ifbcef1d631b3f369cac3db25a2c793480043f697
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10583
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:13:24 +00:00
Florian Klink
89882ff9b1 refactor(tvix): use AsRef<dyn …> instead of Deref<Target= …>
Removes some more needs for Arcs.

Change-Id: I9a9f4b81641c271de260e9ffa98313a32944d760
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10578
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:08:22 +00:00
Florian Klink
8fbdf72825 feat(tvix/castore/blobsvc/grpc): rm VecDec, fix docstring
The docstrings were not updated once we made the BlobService trait async.
There's no more need to turn things into a sync reader.

Also, rearrange the stream manipulation a bit, and remove the need to
create a new VecDeque for each element in the stream. bytes::Bytes
implements the Buf trait.

Fixes b/289.

Change-Id: Id2bbedca5876b462e630c144b74cc289c3916c4d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10582
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:03:21 +00:00
Ryan Lahfa
cbcd078684 chore(tvix/castore): fix the docstring for process_entry
It was a `//` not a `///`.

Change-Id: Iee3e8c116d73b5dd8a41c027153714415a66695f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10566
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-01-06 01:39:41 +00:00
Florian Klink
5a82736122 chore(tvix): bump test-case dep to 3.3.1
Change-Id: I643548d95a5fab84563c7cbe51ca2ce640c186a9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10537
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-01-05 16:43:34 +00:00
Florian Klink
6b42aef88d fix(tvix/castore): validate Option<Node>
Extend our validation function to also check for the None case.

Change-Id: Ib75f880646d7fb3d66588f1988e61ec18be816a2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10534
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-01-05 16:43:34 +00:00
Florian Klink
f20969de9b refactor(tvix/castore): relax trait bounds for DS
Make this an `AsRef<dyn DirectoryService>`.

This helps dropping some Clone requirements.

Unfortunately, we can't thread this through to TvixStoreIO just yet.

Change-Id: I3f07eb28d6c793d3313fe21506ada84d5a8aa3ac
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10533
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-05 16:43:34 +00:00