Commit graph

307 commits

Author SHA1 Message Date
Florian Klink
5fc403587f refactor(tvix/castore): ingest filesystem entries in parallel
Rather than carrying around an Future in the IngestionEntry::Regular,
simply carry the plain B3Digest.

Code reading through a non-seekable data stream has no choice but to
read and upload blobs immediately, and code seeking through something
seekable (like a filesystem) probably knows better what concurrency to
pick when ingesting, rather than the consuming side.

(Our only) one of these seekable source implementations is now doing
exactly that. We produce a stream of futures, and then use
[StreamExt::buffered] to process more than one, concurrently.

We still keep the same order, to avoid shuffling things and violating
the stream order.

This also cleans up walk_path_for_ingestion in castore/import, as well
as ingest_dir_entries in glue/tvix_store_io.

Change-Id: I5eb70f3e1e372c74bcbfcf6b6e2653eba36e151d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11491
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-20 18:54:28 +00:00
Connor Brewster
b0bdeb2e89 feat(tvix/castore): Fix build warnings in release mode
Fixes some build warnings that only happen when building in release mode
which disables `debug_assertions`.

Change-Id: I554d5fce7c869c23cf4aa93179f0ee9f7f7c834e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11490
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
2024-04-20 16:47:12 +00:00
Connor Brewster
18ab59ed70 fix(tvix/castore): ensure all directories are present during ingestion
`ingest_entries` requires that all directories referenced by entries in
the ingestion stream have an explicit entry in the stream.

For example, if the stream contains a file with path `foo/bar`, there
must be an entry that comes later in the stream for the directory `foo`.

Change-Id: I61b4fbbb73ea7278715e04271d8073b484e05e61
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11488
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-20 16:46:41 +00:00
Aspen Smith
3107961428 feat(tvix/eval): Implement builtins.fetchTarball
Implement a first pass at the fetchTarball builtin.

This uses much of the same machinery as fetchUrl, but has the extra
complexity that tarballs have to be extracted and imported as store
paths (into the directory- and blob-services) before hashing. That's
reasonably involved due to the structure of those two services.

This is (unfortunately) not easy to test in an automated way, but I've
tested it manually for now and it seems to work:

    tvix-repl> (import ../. {}).third_party.nixpkgs.hello.outPath
    => "/nix/store/dbghhbq1x39yxgkv3vkgfwbxrmw9nfzi-hello-2.12.1" :: string

Co-authored-by: Connor Brewster <cbrewster@hey.com>
Change-Id: I57afc6b91bad617a608a35bb357861e782a864c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11020
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:58:04 +00:00
Florian Klink
f34e0fa342 feat(tvix/castore/import): only allow normal components in entry paths
Explicitly document and add a debug assertion for that.

It's up to callers to ensure this doesn't happen.

Change-Id: Ib5d154809c2ad2920258e239993d0b790d846dc8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11487
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:14:19 +00:00
Florian Klink
e9db0449e7 refactor(tvix/castore/import): make module, split off fs and error
Move error types and filesystem-specific functions to a separate file,
and keep the fs:: namespace in public exports.

Change-Id: I5e9e83ad78d9aea38553fafc293d3e4f8c31a8c1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11486
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-20 14:14:19 +00:00
Florian Klink
c4cb099823 refactor(tvix/castore/import): rename ingest_entries arg
This is not a stream of direntries anymore, but a stream of ingestion
entries.

Change-Id: I387f4497b6567066b24c58ca0262e710348180e9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11485
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:14:19 +00:00
Connor Brewster
259d7a3cfa refactor(tvix/castore): generalize store ingestion streams
Previously the store ingestion code was coupled to `walkdir::DirEntry`s
produced by the `walkdir` crate which made it impossible to reuse
ingesting from other sources like tarballs or NARs.

This introduces a `IngestionEntry` which carries enough information for
store ingestion and a future for computing the Blake3 digest of files.
This allows the producer to perform file uploads in a way that makes
sense for the source, ie. the filesystem upload could concurrently
upload multiple files at the same time, while the NAR ingestor will need
to ingest the entire blob before yielding the next blob in the stream.
In the future we can buffer small blobs and upload them concurrently,
but the full blob still needs to be read from the NAR before advancing.

Change-Id: I6d144063e2ba5b05e765bac1f27d41b3c8e7b283
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11462
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-19 20:37:05 +00:00
Connor Brewster
150106610e feat(tvix/castore): add convenience add method to Directory
This adds `Directory::add` which is a convenience helper for adding
nodes into a `Directory` while preserving sorted order.

This implements `Ord` and `PartialOrd` for `FileNode`, `SymlinkNode`,
and `DirectoryNode` so `binary_search` can be used.

Change-Id: I94b86bdef5d0da55aa352e098988b9704cafca19
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11481
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-04-19 20:10:14 +00:00
Florian Klink
538d5fc8ee fix(tvix/castore/blobservice/grpc): don't use NaiveSeeker for now
Userland likes to seek backwards, and until we have store composition
and can serve chunks from a local cache, we need to buffer the
individual chunks in memory.

Change-Id: I66978a0722d5f55ed4a9a49d116cecb64a01995d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11448
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
8107678632 fix(tvix/castore/src): record rq.handle field in read()
This makes it easier to separate concurrent requests on the same inode.

Change-Id: I7637c1d889336beeb0d186182ce22fbf60fd16c3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11447
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
99bc926d1e fix(tvix/castore/fs): use io::copy to fill kernel-provided buffer
The docs state we must fill all of the buffer, except on EOF.

Change-Id: Id977ba99c0b15132422474ebbf82bb92b79d55ba
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11446
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
bfd342873c feat(tvix/castore/blob/naive_seeker): add some more tracing
Change-Id: Iecf4a82a7d84008a8620825570b34e9094e6d590
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11445
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
9d9c731147 feat(tvix/castore/blob/chunked_reader): add some more traces
Change-Id: I2408707a7bc0e1c0cd8bd2933f8d68805b9e12c9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11444
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
28e98af9bc fix(tvix/castore/blobservice/chunk_rd): only skip *first* chunk bytes
When (re)initializing a chunked reader, we were erroneously skipping the
first n bytes from all chunks, not just the first one.

Fix this, by passing in an enumerated list of chunks, and only calling
SeekFrom::Start() on the first chunk in the stream.

With this, I'm able to invoke b3sum on bin/bash successfully.

Change-Id: I52ea480569267e093b0ac9d6bcd5c2d1b4db25f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11443
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
9398bc46b6 refactor(tvix/castore/blob/naive_seeker): rework skipping for clarity
Increase the discard_buf to 4096 (as I've seen this size).
Use the ready! macro to propagate pendings.
Make it more clear what exactly should be skipped in total, and what
during the current iteration.

Also write down that poll_read call already takes care of updating
self.pos, as I ran into that trap earlier (and added it here).

Change-Id: I2d22e1c8a835c0f3dd0c648917009b2bad4fd57c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11442
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
4d802fa0ae feat(tvix/castore/blob/chunked_reader): only reassemble on real seek
If the resulting offset equals to our current position, there's no need
to recreate a reader.

Change-Id: I855f0c79c514c16ca48a78e12978af2835fbbd6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11441
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
80d0b305a7 docs(tvix/castore/blobservice): explain open_read for small blobs more
State that this case applies if the blob is small enough to fit inside a
single chunk.

Change-Id: I0383514729e686799599b629cf1303b284147bb4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11440
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 19:33:37 +00:00
Florian Klink
e958cb0251 feat(tvix/castore/blobs/object_store): chunks() method for small blobs
We previously returned Ok(None) when being asked for more granular
chunking info, signalling the blob does not exist at all.

This is however incorrect, we should return an empty Vec instead, as
documented in the trait.

Change-Id: I83ecc2027e0767134c7598792c2ee6d964853c66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11439
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 19:33:06 +00:00
Florian Klink
c936c1c042 fix(tvix/castore/blob/object_store): tweak log levels
Don't log with info! here, bug debug!.

Change-Id: I57bd5f2a45276090b893a4051fd175e3948ddfa4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11438
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 19:32:35 +00:00
Florian Klink
9dc621cd95 fix(tix/castore/blobservice): don't warn if chunk list is empty
It's perfectly normal if we ask for more granular chunking info and the
backend responds it does not have it.

Change-Id: I593ab3e53b4f4e70c99f39b266546d2ac8eb10c1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11437
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 19:32:04 +00:00
Florian Klink
7156697010 feat(tvix/castore/blob/grpc_wrapper): add blob.digest field
We're receiving bytes over the wire, and encode them the same way
B3Digest does internally, but don't use it for formatting, as we're
discarding that string.

In case the sent bytes don't have the right length, the string will be
short, but it's better to still have it as a field, even if it's not a
valid b3 digest.

Change-Id: I6ef08275d51c8a0d98f5e46844b15dfd05d17cd8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11436
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 19:32:04 +00:00
Florian Klink
d1da9f5c84 refactor(tvix/castore/fs): add parenthesis for readability
As suggested in cl/11426.

Change-Id: Ic2bb8cf2838bf0be09fb8bc62b8e598a3d153699
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11434
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 15:09:49 +00:00
Florian Klink
b025a30d27 refactor(tvix/castore/fs): remove From<Node> for InodeData
These were copying unnecessarily. Instead, have a
InodeData::from_node(), which *consumes* the Node entirely, returns
`InodeData` and the split-off name (which is not part of InodeData).

Callers can then use the result in various helper functions, like:

 - InodeData::as_fuse_type
 - InodeData::as_fuse_file_attr
 - InodeData::as_fuse_entry

… to prepare their replies to the kernel.

This removes not only a bunch of clones, but also a lot of copy-pasted
code.

Change-Id: Idbca5f25cc29e96c1f4c614b33dff2becb0a8738
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11435
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:49:44 +00:00
Florian Klink
fb852b0245 fix(tvix/castore/blobs): reply to has() for chunks
We allow reading individual chunks via open_read(), it's inconsistent if
a has() would return Ok(false).

Change-Id: Ie713d968172ccd2687d2e6e0dfef89ee152ef511
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11420
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-15 14:47:12 +00:00
Florian Klink
f1349caf3f refactor(tvix/castore): relax trait bounds on BlobService
We don't need to clone BlobService anymore.

Change-Id: I2f3b9a595f604ec0f1e081f6e90cd8b67cbb8961
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11419
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 14:47:12 +00:00
Florian Klink
9498ac936e fix(tvix/castore/directory): fix graph traversal
Use a proper graph library to ensure all nodes are reachable from the
root.

We had a bit of that handrolled during add(), as well as later, which
had an annoying bug:

Redundant nodes were omitted during insert, but when returning the list
during finalize, we did not properly account they need to be introduced
before their parents are sent.

We now simply populate a petgraph DiGraph during insert (skipping
inserting nodes we already saw), and use petgraph's DfsPostOrder to
traverse the graph during finalize.

If the number of returned indices equals the total number of nodes in
the graph, all nodes are reachable from the root, we can consume the
graph and return the nodes as a vec, in the same order as the traversal
(and insertion).

Providing a regression test for the initial bug is challenging, as the
current code uses a bunch of HashSets. I manually tested ingesting a
full NixOS closure using this mechanism (via gRPC, which exposes this
problem, as it validates twice), and it now works.

Change-Id: Ic1d5e3e981f2993cc08c5c6b60ad895e578326dc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11418
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-15 14:47:12 +00:00
Florian Klink
6ebaa7b88a refactor(tvix/castore/import): restructure directory uploader a bit
Have a Option<Box<dyn DirectoryPutter>>, which is lazily initialized
whenever we first want to upload a directory.

Have the loop explicitly break when it encounters the root_node, and
deal with the flushing after the loop.

Deal with the FUTUREWORK (assertion for root directory digest matching
what the DirectoryPutter returns).

Change-Id: Iefc4904d8b8387e868fb752d40e3e4e4218c7407
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11417
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:47:12 +00:00
Florian Klink
c088123d4e refactor(tvix/castore/import): put invariant checker into a .inspect()
Separate this a bit stronger from the main application flow.

Change-Id: I2e9bd3ec47cc6e37256ba6afc6e0586ddc9a051f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11416
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:37:35 +00:00
Florian Klink
b70744fda6 refactor(tvix/*/import): rename direntry_stream, entries_per_depths
Align these names and comments with the two users, to make it more
obvious we're doing the same thing here, just use a different method to
come up with entries_per_depths.

Change-Id: I42058e397588b6b57a6299e87183bef27588b228
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11415
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:06:50 +00:00
Florian Klink
bcc00fba8f refactor(tvix/castore/import): inline process_entry
This did very little, and especially the part of relying on the outside
caller to pass in a Directory if the type is a directory required having
per-entry-type specific logic anyways.

It's cleaner to just inline it.

Change-Id: I997a8513ee91c67b0a2443cb5cd9e8700f69211e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11414
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:05:49 +00:00
Florian Klink
d47bd4f4bc refactor(tvix/castore/import): move process_entry to the end of the file
This makes it easier to understand the code.

Change-Id: I0a9047433000551a6ba1f50a8c5c93527bc86216
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11413
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:05:18 +00:00
Florian Klink
c19bc95b8a fix(tvix/castore/blobservice): update bytes_read only on successful read
We previously updated this.pos also in case the underlying read returned
an error.

Also, use the ready! macro to remove the match block, and instrument
errors returned during start_seek.

Change-Id: Ic32e26579d964a76b45687134acc48d72d67c36f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11421
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 14:04:47 +00:00
Florian Klink
bccb4c92c3 refactor(tvix/castore/fs): add InodeData::as_fuse_{entry,file_attr}
Remove the now unused gen_file_attr (which had a wrong docstring).

Change-Id: Ie86b14d1ad798e6233bc44c43ace3f8b95c67ea9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11430
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 13:59:46 +00:00
Florian Klink
50065afd36 refactor(tvix/castore/fs): remove "add … handle" debug messages
Change-Id: Iac22bbef96a2afa0416f011d073934b52b19975d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11433
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-15 13:54:14 +00:00
Florian Klink
d3d88431f3 feat(tvix/castore/fs): assign read[dir*]/release[dir] ops to parent span
When a directory or file is open()'ed, we already put some data into
a lookup table, and subsequent operations then use the returned handle
id.

By also adding the span that's been created during these calls into the
lookup table, we can properly set the span parent for these requests,
nicely connecting the individual operations to the bigger picture.

Change-Id: Ia354842fccdbc7f45c2d3efda3acf058b2dbc48e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11429
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Brian Olsen <me@griff.name>
2024-04-15 13:54:14 +00:00
Florian Klink
34d9d54aae fix(tvix/castore/fs): use record to add fields to current span
Instead of creating another child span, we can use
`tracing::Span::current().record(k,v)` to add an additional field to the
current span.

Change-Id: I337faac0e73a0da6eb0a52cb75c2e8c026eff774
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11428
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 13:23:33 +00:00
Florian Klink
ff5835008f feat(tvix/castore/fs): implement readdirplus
This currently shares some code with readdir, except it's also providing
a second `fuse_backend_rs::api::filesystem::Entry` argument to the
`add_entry` function call.

Refactoring this to reduce some duplication is left for a future CL.

Change-Id: I282c8dfc6a711d00a4482c87cbb84d4950c0aee9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11426
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-15 13:22:08 +00:00
Florian Klink
456191e69e refactor(tvix/castore/fs): use consistent span field name for handle
Use rq.handle in `release` too, and remove interpolating it into the
log message itself.

Also update the comment, we don't get ownership, just simply drop, and
change the level to warn!, as suggested in cl/11425.

Change-Id: If4e6cff6d8b580671b1548ae3862851db4af6694
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11427
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 13:20:07 +00:00
Florian Klink
2a47ad1d90 feat(tvix/castore/fs): implement opendir/releasedir
Similar to how we already handle opening files, implement opendir/
releasedir, and keep a map of dir_handles. They point to the rx side of
a channel.

This greatly improves performance listing the root of the filesystem
when used inside tvix-store, as we don't need to re-request the listing
(and skip to the desired position) all the time.

Change-Id: I0d3ec4cb70a8792c5a1343439cf47d78d9cbb1d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11425
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 13:18:36 +00:00
Florian Klink
3ed7eda79b refactor(tvix/castore/fs): use std::sync::Mutex
This allows us acquiring the lock in sync code still. Also, simplify
some of the error handling a bit.

Change-Id: I29e83b715f92808e95ecb0ae9de787339d1a371d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11424
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-15 09:27:04 +00:00
Florian Klink
1bf6b9f5a0 refactor(tvix/castore/fs): simplify some separate spawn and blocks
We can just pass an async move closure to `self.tokio_handle.block_on`
and make this a bit shorter.

Change-Id: Iba674f34f22ba7a7de7c5bae59d64584884cb17c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11423
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 09:27:04 +00:00
Florian Klink
515bfa18fb feat(tvix/castore/fs): support extended attributes
This exposes `user.tvix.castore.{blob,directory}.digest` xattr keys for
files and directories:

```
❯ getfattr -d /tmp/tvix/06jrrv6wwp0nc1m7fr5bgdw012rfzfx2-nano-7.2-info
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.directory.digest="b3:SuYDcUM9RpWcnA40tYB1BtYpR0xw72v3ymhKDQbBfe4="

❯ getfattr -d /tmp/tvix/156a89x10c3kaby9rgf3fi4k0p6r9wl1-etc-shells
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.blob.digest="b3:pZkwZoHN+/VQ8wkaX0wYVXZ0tV/HhtKlSqiaWDK7uRs="
```

It's currently mostly used for debugging, though it might be useful for
tvix-castore-aware syncing programs using the filesystem too.

Change-Id: I26ac3cb9fe51ffbf7f880519f26741549cb5ab6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11422
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
2024-04-15 09:27:04 +00:00
Florian Klink
b70e01a4db feat(tvix/castore/import): remove copying in find_ancestor
We don't need to copy if we explicitly say that the returned
Option<Path> may hold onto bytes from the passed in &DirEntry.

Change-Id: Ib46b6fd2f8f19a45f8bef79c4c1d2fa6b490cad7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11410
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-13 13:20:29 +00:00
Florian Klink
31e3382129 feat(tvix/*store/bigtable): limit retries connecting to cbtemulator
This kept retrying indefinitely if the socket didn't appear.

Change-Id: I4d4ef61df73cef6abda698501432f370abc8a82c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11406
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-13 12:01:00 +00:00
Florian Klink
cc42cac39c refactor(tvix/castore/import): rename ingest_entries function arg
This is a stream of DirEntry, so let's call it direntry_stream.

Change-Id: I5b3cb4efba899d746393f75f6ece7eaa79424717
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11401
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-13 10:51:58 +00:00
Florian Klink
7bcb896e48 feat(tvix/castore/directory/grpc): instrument functions
Change-Id: I9cc0a6a32184773597556ab5f9250257aa18ca4e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11399
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-12 22:32:37 +00:00
Florian Klink
f8800ba189 chore(tvix): bump rstest to 0.19.0
Change-Id: Ib2f5e84fdb8be1210b3507da67d4fe84f061651e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11387
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-12 22:16:56 +00:00
Florian Klink
17849c5c00 feat(tvix/castore/directory): add bigtable backend
This adds a Directory service using
https://cloud.google.com/bigtable/docs/ as a K/V store.

Directory (closures) are put in individual keys.

We don't do any bucketed upload of directory closures (yet), as castore/
fs does query individually, does not request recursively (and buffers).
This will be addressed by store composition at some point.

Change-Id: I7fada45bf386a78b7ec93be38c5f03879a2a6e22
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11212
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-09 15:50:34 +00:00
Florian Klink
289b3126db feat(tvix/castore): drop test-case crate dep
Change-Id: I5049a3682a58ce848d80f413b2964331025a90a8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11370
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
936a175b2f refactor(tvix/castore/directoryservice/from_addr): migrate to rstest
Change-Id: I52b14652822766421d66e95bc646ed7baecc705f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11369
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
Tested-by: BuildkiteCI
2024-04-07 14:51:47 +00:00
Florian Klink
d94ff54d42 refactor(tvix/castore): migrate closure_validator to rstest
Change-Id: I6c594d2e670a681484b858c3e04bc25b9e5a2077
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11368
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
c715b6d448 refactor(tvix/castore/tonic): migrate to rstest
Change-Id: Ie088bf03c739bf64abf3432175362a8f92f501c2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11367
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
71fb99a265 refactor(tvix/castore/hashing_reader): migrate to rstest
Change-Id: I99ae0e27b4db4799db8af7cd6b9cc8d7f09227de
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11366
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
c7c66abd85 refactor(tvix/castore/blobservice/object_store): drop individual tests
This (and more) should now be covered by the generic testsuite
(in crate::blobservice::tests).

Change-Id: Ib3afc4f19f7e37a561b7398d43663dc941971f5c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11365
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
32ac9bd110 refactor(tvix/blobservice/from_addr): move from test_case to rstest
This allows conditionalizing test cases on feature flags.

Change-Id: Ic9ed9ef52f703c973fda2ca4aae5f425e33b67de
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11364
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
Tested-by: BuildkiteCI
2024-04-07 14:51:47 +00:00
Florian Klink
6e8046bec7 feat(tvix/castore/*service/tests): add objectstore to tests, sort
Change-Id: If3a45d2f8008b75524ead704b05320364cc60d92
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11282
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 21:15:01 +00:00
Florian Klink
508fcd65ee feat(tvix/castore/directoryservice): log more TODOs
We need to define behaviours and add tests for these.

Change-Id: Id5825fafbf47897d8de42503ea6006eb131b1082
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11281
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 07:58:10 +00:00
Florian Klink
74023a07a4 refactor(tvix/castore/*): drop utils.rs and grpc directorysvc tests
This drops pretty much all of castore/utils.rs.

There were only two things left in there, both a bit messy and only used
for tests:

Some `gen_*_service()` helper functions. These can be expressed by
`from_addr("memory://")`.

The other thing was some plumbing code to test the gRPC layer, by
exposing a in-memory implementation via gRPC, and then connecting to
that channel via a gRPC client again.

Previous CLs moved the connection setup code to
{directory,blob}service::tests::utils, close to where we exercise them,
the new rstest-based tests.

The tests interacting directly on the gRPC types are removed, all
scenarios that were in there show now be covered through the rstest ones
on the trait level.

Change-Id: I450ccccf983b4c62145a25d81c36a40846664814
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11223
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 07:58:10 +00:00
Florian Klink
07a51c7dc9 feat(tvix/store): add rstest-based PathInfoService tests
This introduces rstest-based tests. We also add fixtures for creating
some BlobService / DirectoryService out of thin air.
To test a PathInfoService, we don't really care too much about its
internal storage - ensuring they work is up to the castore tests.

Change-Id: Ia62af076ef9c9fbfcf8b020a781454ad299d972e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11272
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 07:02:18 +00:00
Florian Klink
fe6ae58ba5 feat(tvix/castore): add rstest-based BlobService tests
Change-Id: Ifae9c41e1e3fa5e96f9b7e188181a44650ff8a04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11250
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-24 20:03:01 +00:00
Florian Klink
f5cf659245 feat(tvix/castore): AsRef<dyn BlobService> impl BlobService
This allows us to use containers around BlobServices as BlobServices too.

Change-Id: I3c7feb074f42b4e07c550fb8dfa63cf81d448ab5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11249
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 20:01:22 +00:00
Florian Klink
3ece32bbf9 feat(tvix/castore): add rstest-based DirectoryService tests
This creates test scenarios (using the DirectoryService trait) that we
want all DirectoryService implementations to pass.

Some of these tests are ported from proto::tests::grpc_directoryservice,
which tested this on the gRPC interface (rather than the trait),
some others ensure certain behaviour for which we only recently
introduced general checking logic (through ClosureValidator).

We also borrow some code related to setting up a gRPC DirectoryService
client (connecting to a server exposing a in-memory DiretoryService)
from castore::utils, this will be deleted once it's all ported over.

Change-Id: I6810215a76101f908e2aaecafa803c70d85bc552
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11247
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-24 20:00:40 +00:00
Florian Klink
6f5474bf02 feat(tvix/castore): AsRef<dyn DirectoryService> impl DirectoryService
This allows us to use containers around DirectoryServices as DirectoryServices too.

Change-Id: I56cca27b3212858db8b12b874df0e567dd868711
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11248
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-24 20:00:40 +00:00
Florian Klink
21fcc1c9df feat(tvix/castore/directory): add SledDirectoryPutter
This uses DirectoryClosureValidator for validation and the sled batch
API to insert multiple directories at once.

Change-Id: I2d6dc513ccbc02e638f8d22173da5463e73182ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11222
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 19:56:55 +00:00
Florian Klink
f7281d8fd5 refactor(tvix/castore/directory/grpc_wrapper): use ClosureValidator
This greatly simplifies the code in this function, replacing it with a
much better tested (and more capable!) version of the validation logic.

It also enables the gRPC server frontend to make use of the
DirectoryPutter interface. While this might not be too visible in terms
of latency thanks to gRPC streams bursting, it also enables further
optimizations later (such as bucketing of directory closures).

Change-Id: I21f805aa72377dd5266de3b525905d9f445337d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11221
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 19:55:42 +00:00
Florian Klink
5f069a3eb8 refactor(tvix/castore/directory): have SimplePutter use Validator
This simplifies a bunch of code, and gets rid of some TODOs.

Also, move it out of castore/utils, and into its own file.

Change-Id: Ie63e05a6cdfb2a73e878cf7107f9172aed1cdf13
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11224
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-24 17:42:30 +00:00
Florian Klink
c92ef2df64 feat(tvix/castore/directory): add ClosureValidator
This can be used to validate a Directory closure (connected DAG of
Directories), and their insertion order.

Directories need to be inserted (via `add`), in an order from the leaves
to the root. During insertion, we validate as much as we can at that
time:

 - individual validation of Directory messages
 - validation of insertion order (no upload of not-yet-known Directories)
 - validation of size fields of referred Directories

Internally it keeps all received Directories (and their sizes) in a HashMap,
keyed by digest.

Once all Directories have been inserted, a drain() function can be
called to get a (deduplicated and) validated list of directories, in
from-leaves-to-root order (to be stored somewhere).

While assembling that list, a check for graph connectivity is performed
too, to ensure there's no separate components being sent (and only one
root).

It adds a test suite for these cases, which is much nicer to test than
where we previously had these checks (only in the gRPC server wrapper).

Followup CLs will move the existing putters to use this.

Change-Id: Ie88c832924c170a24626e9e3e91d868497b5d7a4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11220
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-24 17:39:49 +00:00
Florian Klink
110734dfed docs(tvix/castore): fix missing slash in docstring
Change-Id: I5f39d0e613b651458b168cfd9df0693d7f01d704
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11246
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-23 22:06:27 +00:00
Florian Klink
e449f423e7 fix(tvix/castore/directory/tests): close upload handle
We need to ensure the Directories are successfully uploaded before doing
any testing with them.

Change-Id: Iafa8deb86b3d5eb302ebfba3ced34385f67a7229
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11244
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-23 22:00:16 +00:00
Florian Klink
9793745459 feat(tvix/castore): derive Eq and Hash automatically
This allows these messages to be put in HashSets.

Change-Id: Ia58094cafe53eb624578821d3d8d969c5d21a1d7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11219
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-03-20 21:19:02 +00:00
Florian Klink
345a639e79 refactor(tvix/castore): instrument DirectoryPutter impls consistently
Log the entire span with "trace" level, not just its `ret` level.

The level of the error value event defaults to ERROR, so we don't loose
these.

B3Digest implements Debug and Display the same way, so we can omit the
`(Display)` part in `ret(Display)` for them.

Change-Id: Id00d123a5798e5bdc9820dd97ae2b4d4eb5455f0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11218
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 21:02:44 +00:00
Florian Klink
60b47b336b refactor(tvix/castore/directory): remove GRPCPutter::new
This is no public API to construct this, there's exactly one caller,
and it's perfectly fine to directly populate the struct there.

Change-Id: Idae43a0162ee9bc687d21c550e0c9df33f12d263
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11217
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 21:02:44 +00:00
Florian Klink
9fb213f47a feat(tvix/castore): record errors for some failures in SimplePutter
This makes it easier to see what's going wrong when uploading multiple
Directories.

Change-Id: Ieb71424b9761777c5f719b2f365962644de82baf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11209
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-20 12:21:09 +00:00
Florian Klink
5627dc04e1 feat(tvix/castore/blob): document missing objectstore+*:// URL
Change-Id: I3cbc75e636efd467ddda7fc3c3c8f19ad42204ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11206
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 12:17:42 +00:00
Florian Klink
345cebaebb refactor(tvix/castore/blob): drop simplefs
This functionality is provided by the object store backend too
(using `objectstore+file://$some_path`).

This backend also supports content-defined chunking and compresses
chunks with zstd.

Change-Id: I5968c713112c400d23897c59db06b6c713c9d8cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11205
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 12:17:42 +00:00
Florian Klink
2798803f76 refactor(tvix/castore): introduce "cloud" feature flag
This controls whether tvix-castore has support for various cloud
backends or not.

Use this to control the set of feature flags for the object_store
backend, and only enable the aws, azure and gcp ones if it's set.
In the future this can be used to enable/disable other cloud backends
too.

Without feature flags, `object_store` already supports the `InMemory`
and `LocalFilesystem` backends, and we also want to unconditionally
enable the `http` one. Make sure at least the construction of these
services is covered in the tests.

Similarly, the tvix-store crate, which provides the tvix-store CLI has a
`cloud` feature flag too (defaulting to enabled).

Change-Id: I9fb9c87b740e7dc83f8ff7a0862905d036d513f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11204
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-20 12:17:42 +00:00
Florian Klink
591edf0d5b docs(tvix/castore/directory): update docstring for get_recursive
The rust trait was missing to document the order of the elements in the
stream. Document that, and also the reasoning behind this.

Change-Id: I27ef0b2020082783fc41c2015233175e2b8e716d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11203
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-20 12:17:42 +00:00
Florian Klink
c0566985b0 refactor(tvix/castore/directory/from_addr): use match guards
This will allow feature-flagging some of the backends.

Change-Id: Iddbdb89d3cf9c966a2c25b06b03e6917b284cae5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11201
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-20 11:53:04 +00:00
Florian Klink
7deadd50d5 refactor(tvix/castore/blob/from_addr): use match guards
This will allow feature-flagging some of the backends.

Change-Id: Idffbf8b3fd154f5a3d938225c3871feffea8ff8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11200
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 11:52:29 +00:00
Florian Klink
499bc2f7ee refactor(tvix/castore/blobsvc): use B3Digest Display impl
We don't need to use BASE64 here on our own, B3Digest has a Display
impl.

This will also make sure the `b3:` digest is present in field values.

Change-Id: I0ce6ee0f7e7e99fb9b16872953a1b742e99be291
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11192
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-18 16:10:05 +00:00
Florian Klink
05bdb68523 feat(tvix/blobservice/object_store) more logging
Have derive_{blob,chunk}_path emit trace-level events for both the
values they're called with, as well as the return value.

With RUST_LOG in place, it doesn't get lost in other unrelated noise.

Change-Id: Id2451e3657324eff482841eb26a22d19e22bde30
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11136
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-18 16:10:05 +00:00
Florian Klink
82f8ce8b7d feat(tvix/castore/blobsvc/grpc): read data in chunks
Whenever this encounters an open_read(), it'll first check for more
granular chunking. If there's more granular chunking data available, a
ChunkedReader is constructed (which supports seeking backwards).

This currently is still a bit stupid, and doesn't compose, as
`ChunkedReader` uses `self` as the `BlobService` to ask for the
individual chunks.

In store composition future, we might want to compose this differently,
essentially constructing `ChunkedReader` with another `BlobService`
representing the entire hierarchy, so there's a chance to locally cache
things, and do less requests.

Change-Id: I22e0df4d6245f666d083b4f0b7114d3ac41d1dce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11185
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-18 16:10:05 +00:00
Florian Klink
70bbf23767 refactor(tvix/castore/src/blobservice): remove useless else case
Change-Id: I09000371a1d8ff212ab46050d1a480509c6ffe70
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11183
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-18 14:25:37 +00:00
Florian Klink
3b1c9172f6 feat(tvix/castore): impl Debug for B3Digest
Use the same format as Display, b3: followed by the base64
representation. This makes the debug implementation of everything
containing a b3 digest much nicer to read.

Change-Id: I3ca3154d0b6fb07781c8f9c83ece3ff1a6957902
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11181
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-17 22:18:20 +00:00
Florian Klink
dbf87f3057 chore(tvix): bump tonic to 0.11.0
This bumps tonic and surrounding crates to 0.11.x.

We added support for tonic 0.11.x into tokio-listener
(https://github.com/vi/tokio-listener/pull/4), so that's bumped as well.

Change-Id: Icfade5894403228299836fefb21b2f9ae59dbebb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11156
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-16 17:04:12 +00:00
Florian Klink
fdf9657654 fix(tvix): don't emit rerun-if-changed
`build.rs` emits rerun-if-changed statements for all proto files, as
well as all include paths we pass it.

Unfortunately, due to protobufs include path rules, we need to specify
the path to the depot root itself as an include path, at least when
building impurely with `cargo`. This causes cargo to essentially always
rebuild, as it also puts its own temporary files in there.

Unfortunately, tonic-build does not chase down to individual .proto
files that are included.

Disable emitting these `rerun-if-changed` statements for now.

This could cause cargo to not rebuild protos every time, causing stale
data until the next local `cargo clean`, but considering the protos
change not that frequently, and it'll immediately surface if trying to
build via Nix (either locally or in CI), it's a good-enough compromise.

Change-Id: Ifd279a2216222ef3fc0e70c5a2fe6f87997f562e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11157
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2024-03-16 09:34:10 +00:00
Florian Klink
f22d5b3d11 feat(tvix/castore/blobsvc/from_addr): support object_store
The object_store crate supports a ton of different stores, with different schemes.

For now, use a objectstore+ scheme prefix to enable these.

Change-Id: I946f76e32a0fb0867ef59060217894cda5b959b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11080
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-03-11 22:42:01 +00:00
Florian Klink
1c2db676a0 feat(tvix/castore/blobsvc): add object storage implementation
This uses the `object_store` crate to expose a tvix-castore BlobService
backed by object storage.

It's using FastCDC to chunk blobs into smaller chunks when writing to
it.

These are exposed at the .chunks() method.

Change-Id: I2858c403d4d6490cdca73ebef03c26290b2b3c8e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11076
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
2024-03-11 22:42:01 +00:00
Florian Klink
4e78de7393 fix(tvix/castore/grpc/directory): skip_all fields in instrument
This only contains the outer metadata wrapping, and that's not too interesting:

> Request { metadata: MetadataMap { headers: {"content-type":
> "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers",
> "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions:
> Extensions }

Drop these fields for now, and rely on the underlying implementations to
add instrumentation for the application-specific fields.

Clean up the error logging a bit.

Change-Id: Ife1090ed411766a61e1fa60fd4c9570f38de1e98
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11102
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-09 05:47:41 +00:00
Florian Klink
05deb37f44 fix(tvix/castore/grpc/blob): skip_all fields in instrument
This only contains the outer metadata wrapping, and that's not too interesting:

> Request { metadata: MetadataMap { headers: {"content-type":
> "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers",
> "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions:
> Extensions }

Drop these fields for now, and rely on the underlying implementations to
add instrumentation for the application-specific fields.

Log errors in some places where we didn't so far.

Change-Id: Ia68d6c526987d3716be62a0809195401cf28512b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11101
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-09 05:47:36 +00:00
Florian Klink
4954a39de4 fix(tvix/castore): also set SSL_CERT_FILE for tests there
For everything using reqwest here during test cases, we also need to
set SSL_CERT_FILE.

Change-Id: If8aeda65f3d75cb9ac5c9bc64e37a0cb7dffc17c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11092
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-03 17:43:42 +00:00
Florian Klink
ef3f8936cb refactor(tvix/*/from_addr): improve test debuggability
If there's an unexpected test failure, print it out, rather than just
saying something is false even though it should be true.

Use .expect() for this, which displays the error if it failed.
We can't use expect_err(), as our stores are not display'able, so use an
assertion with a message there.

Change-Id: I2d88861d979d107edc0717fbdb3cdac9a6bfc5e4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11091
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: flokli <flokli@flokli.de>
2024-03-03 16:54:19 +00:00
Florian Klink
4b4443240e feat(tvix/castore): add HashingReader, B3HashingReader
HashingReader wraps an existing AsyncRead, and allows querying for the
digest of all data read "through" it.
The hash function is configurable by type parameter, and we define
B3HashingReader.

Change-Id: Ic08142077566fc08836662218f5ec8c3aff80be5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11087
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-03 15:31:31 +00:00
Florian Klink
8383e9e02e feat(tvix/castore/digests): impl From digest::Output<_> for B3Digest
This allows calling .into() to get a B3Digest.

Change-Id: I6e63b496413cd00d84acfcd15c7de0f64c79721f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11086
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-03 15:18:19 +00:00
Florian Klink
7bebf492ec refactor(tvix/castore/blobsvc/chunked_reader): refactor, document
The public-consumable thing here is ChunkedReader, not ChunkedBlob.

ChunkedBlob is a helper that can be used to get a new AsyncRead, but
not AsyncSeek. It is used internally by ChunkedReader whenever the
client seeks.

Make this more obvious, by extending the documentation, and putting
ChunkedReader at the top of this file.

Also make ChunkedBlob and its methods private, and give ChunkedReader a
more useful constructor (from_chunks, instead of from_chunked_blob).

Change-Id: I2399867591df923faa73927b924e7c116ad98dc0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11079
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-03 11:22:56 +00:00
Florian Klink
53fb9ff4c6 feat(tvix/castore/blobsvc): BlobReader for more trivial types
Change-Id: I80e4f26c41a504fa4c6a013c2a1e76de613ba294
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11078
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-02 17:05:23 +00:00
Florian Klink
982459d343 fix(tvix/castore/blobwriter): don't require Sync + 'static
There's no reason for these two.

Change-Id: Ie6f238bbb0b17971c9877b11b61ea7ebca573c13
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11075
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-02 05:56:21 +00:00
Florian Klink
b38badf206 docs(tvix/castore/directorysvc): K/V is not necessarily flat
Some implementations of DirectoryService might not allow retrieval of
intermediate Directory nodes, that are not at the "root".

Think about an object store implementation. The client is doing a
get_recursive anyways to reduce the number of roundtrips.

By documenting the fact we don't need to support looking up intermediate
Directory messages, we can just batch all directories into the same
object, keyed by the root.

Change-Id: I019d720186d03c4125cec9191e93d20586a20963
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10988
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2024-02-20 09:17:38 +00:00
Peter Kolloch
5777050821 feat(tvix/castore): Compile fix for Darwin
Towards https://b.tvl.fyi/issues/264

Change-Id: If8fa912ae3fb2987b761f649ab738529ebf3b2e8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10970
Autosubmit: Peter Kolloch <info@eigenvalue.net>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-19 17:14:24 +00:00
Florian Klink
28f5c13c53 fix(tvix/castore): don't emit ret as INFO
This otherwise gets a bit spammy.

Change-Id: I288350a600d79a394c239f253424ad55bc3cefc5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10954
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-18 07:12:27 +00:00
Florian Klink
34a1ff291a feat(tvix/castore/fs): make allow_other configurable
Also add a cli argument to the `tvix-store` binary.

Change-Id: Id07d7fedb60d6060543b195f3a810a46137f9ad5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10945
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-17 07:00:41 +00:00
Florian Klink
d10c5309bc feat(tvix/castore/blobsvc): add Chunked{Blob,Reader}
These provide seekable access into a Blob for which we have more
granular chunking information.

There's no support for verified streaming in here yet, this simply
produces a stream of readers for each chunk, skipping irrelevant chunks
and data from the first chunk at the beginning.

A seek simply does produce a new reader using the same process.

Change-Id: I37f76b752adce027586770475435f3990a6dee0b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10731
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-10 14:24:51 +00:00
Florian Klink
40d81d0c74 docs(tvix/castore/blobstore): reorganize docs
docs/verified-streaming.md explained how CDC and verified streaming can
work together, but didn't really highlight enough how chunking in
general also helps with seeking.

In addition, a lot of the thoughts w.r.t. the BlobStore protocol, both
gRPC and Rust traits, as well as why there's no support for seeking
directly in gRPC, as well as how clients should behave w.r.t. chunked
fetching was missing, or mixed together with the verified streaming
bits.

While there is no verified streaming version yet, a chunked one is
coming soon, and documenting this a bit better is gonna make it easier
to understand, as well as provide some lookout on where this is heading.

Change-Id: Ib11b8ccf2ef82f9f3a43b36103df0ad64a9b68ce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10733
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-06 18:28:00 +00:00
Florian Klink
cb2cf3f6b7 fix(tvix/castore/grpc/svc_wrapper): expose chunks() over gRPC
The Stat() method was just always signalling no granular chunks are
available. However, as we now have a .chunks() method, we can expose it
over gRPC.

Change-Id: I74f0890ae083f301bb0cec62f1ea4a95463ac590
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10736
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 16:27:10 +00:00
Florian Klink
9504015031 feat(tvix/castore/blobsvc): validate StatBlobResponse
All chunks must have valid blake3 digests. It is allowed to send an
empty list, if no more granular chunking is available.

Change-Id: I7ecb53579cdf40fd938bb68a85685751b4d3626f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10726
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-02-02 16:26:38 +00:00
Florian Klink
5ad5a0da00 refactor(tvix/castore/grpc/blobsvc): inline stream_mapper
This can be written without the additional function.

Change-Id: Ib11c5d5254d3e44c8fa9661414835b0622eb1ac4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10735
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-02 16:25:37 +00:00
Florian Klink
1157eea710 docs(tvix/castore/blobsvc): fix doc comments on trait
The readers implement AsyncRead/AsyncSeek, not their sync counterparts.
Also update expectations around chunks.

Change-Id: Ic266688039d80d16d33f651b96ce2bcdedecfa00
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10734
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 16:24:06 +00:00
Florian Klink
4c5d9fa356 feat(tvix/castore/docs/verified-streaming): clarify reply
"given chunksize" is misleading here. It's up to the backend to decide
if it does chunking at all, and how it chunks.

Change-Id: I4f130ca9ac34db79f18ef1d6475295806ac7f9a4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10728
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-02-02 08:56:48 +00:00
Florian Klink
459a564ff1 refactor(tvix/castore/blobsvc/combinator): compact trait bounds
BlobService already implies Send and Sync, we don't need to explicitly
list it here.

Change-Id: I58a4c5912be61a60acd961565979aa01d94ee0f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10727
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-02-02 08:55:16 +00:00
Ryan Lahfa
68bba48d59 feat(tvix/castore): process_entry cannot process unsupported nodes
In the past, we had a `todo!` on unsupported node types, this returns a proper error
that can be caught by the caller.

Change-Id: Icba4c1dab33c0d670a97f162c9b358d1ed5855cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10675
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-01-22 14:15:42 +00:00
Connor Brewster
4e341fb5d9 chore(tvix/store): Use BoxStream type alias
The BoxStream type alias is a more concise and easier to read than
the full `Pin<Box<dyn Stream<Item = ...> + Send + ...>>` type.

Change-Id: I5b7bccfd066ded5557e01f7895f4cf5c4a33bd44
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10677
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
2024-01-21 19:41:02 +00:00
Ryan Lahfa
7275288f0e refactor(tvix/castore): break down ingest_path
In one function that does the heavy lifting: `ingest_entries`, and three additional helpers:

- `walk_path_for_ingestion` which perform the tree walking in a very naive way and can be replaced by the user
- `leveled_entries_to_stream` which transforms a list of a list of
  entries ordered by their depth in the tree to a stream of entries in
  the bottom to top order (Merkle-compatible order I will say in the
  future).
- `ingest_path` which calls the previous functions.

Change-Id: I724b972d3c5bffc033f03363255eae448f017cef
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10573
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: raitobezarius <tvl@lahfa.xyz>
2024-01-20 18:26:17 +00:00
Ryan Lahfa
1f1a42b4da feat(tvix/castore): ingestion does DFS and invert it
To make use of the filtering feature, we need to revert the internal walker to a real DFS.

We will therefore just invert the whole tree by storing all of its
contents in a level-keyed vector.

This is horribly expensive in memory, this is a compromise between CPU
and memory, here is the fundamental reason for why:

When you encounter a directory, it's either a leaf or not, i.e. it
contains subdirectories or not.

To know this fact, you can:

- wait until you notice subdirectories under it, i.e. you need to store
  any intermediate nodes you see in the meantime -> memory penalty.
- getdents or readdir on it to determine *NOW* its subdirectories -> CPU
  penalty and I/O penalty.

This is an implementation of the first proposal, we pay memory.

In practice, we are paying O(#nb of nodes) in memory.

There's a smarter albeit much more complicated algorithm that pays only
O(\sum_i #siblings(p_i)) nodes where (p_1, ..., p_n) is the path to a leaf.

which means for:

             A
            / \
           B   C
          /   / \
         D   E   F

We would never store D, E, F but only E, F at a given time.
But we would still store B, C no matter what.

Change-Id: I456ed1c3f0db493e018ba1182665d84bebe29c11
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10567
Tested-by: BuildkiteCI
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
2024-01-20 17:16:01 +00:00
sterni
526295a71d chore(3p/sources): Bump channels & overlays
- Adjust to ecl 23.9.9 release

- Regenerate go protos after protoc-gen-go update

- Drop dhall fork which hasn't kept up with 1.42.*

- Address new clippy warnings:

  - Variant naming of Error::ValidationError
  - Simplify .try_into().unwrap()
  - Drop unnecessary identity function
  - Test module must be last in file
  - Drop unused `pub use`

- Update agenix to 0.15.0. Current master has a installCheckPhase that
  doesn't work with C++ Nix 2.3.*:
  a23aa271be (commitcomment-137185861)

Change-Id: Ic29eef20d6fd1362ce1031364a5ca6b4edf195bd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10615
Reviewed-by: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
Autosubmit: sterni <sternenseemann@systemli.org>
2024-01-19 21:47:32 +00:00
Ryan Lahfa
93afc711f6 feat(tvix/castore): convert import error to std::io::Error
So that we can just `map_err` easily in functions returning `std::io::Error` but calling functions
returning `castore::import::Error`.

Change-Id: Id181b95e8431c69e95f3a8cd569ca10306656e1d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10572
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-18 14:40:06 +00:00
Florian Klink
c5e2832cbd feat(tvix/castore): implement Ord for node::Node
This allows assembling BTreeSets of node::Node.

Change-Id: I97b83be5ffc3e891307a8ef2b5fc31e38b747a62
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10625
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-01-15 18:19:15 +00:00
Florian Klink
4d135bcfa2 feat(tvix/castore): implement CombinedBlobService
First attempt on composition of BlobServices.

Change-Id: I6e70248007edfd322a503fd40c1c4b4300cbc30c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10587
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-01-09 17:32:03 +00:00
Florian Klink
719cbad871 feat(tvix/castore/blobsvc): add chunks method
This adds support to retrieve a list of chunks for a given blob to the
BlobService interface.

While theoretically all chunk-awareness could be kept private inside
each BlobService reader, we'd not be able to resolve individual chunks
from different Blobservices - and due to this, not able to substitute
chunks we already have in a more local store.

This function allows asking a BlobService for the list of chunks,
leaving any actual fetching up to the caller (be it through individual
calls to open_read), or asking another store for it.

Change-Id: I1d33c591195ed494be3aec71a8c804743cbe0dca
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10586
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-01-09 17:31:32 +00:00
Florian Klink
9596c5caff refactor(tvix/castore): do clone inside a scope
Make it clear this is only used inside the scope.

Change-Id: Ie94f88d7f0fb58cd4bf9c2f1176000b272e6f2e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10585
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 16:20:19 +00:00
Florian Klink
9de1ebf23e feat(tvix/castore/grpc): instrument some more functions
Change-Id: Icedb148c88c5f4a3b2242ed12df1dd8692af94fd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10584
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 16:18:48 +00:00
Florian Klink
0009383c07 refactor(tvix/castore/directorysvc): AsRef traverse_to
Change-Id: I641bd4ab3de591a013f03137f1e16295946315f3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10579
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-09 14:15:55 +00:00
Florian Klink
b1c556b7e1 refactor(tvix/castore/blobservice/grpc): remove fn pointer hack
It looks like the workaround isn't necessary anymore.

Change-Id: Ifbcef1d631b3f369cac3db25a2c793480043f697
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10583
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:13:24 +00:00
Florian Klink
89882ff9b1 refactor(tvix): use AsRef<dyn …> instead of Deref<Target= …>
Removes some more needs for Arcs.

Change-Id: I9a9f4b81641c271de260e9ffa98313a32944d760
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10578
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:08:22 +00:00
Florian Klink
8fbdf72825 feat(tvix/castore/blobsvc/grpc): rm VecDec, fix docstring
The docstrings were not updated once we made the BlobService trait async.
There's no more need to turn things into a sync reader.

Also, rearrange the stream manipulation a bit, and remove the need to
create a new VecDeque for each element in the stream. bytes::Bytes
implements the Buf trait.

Fixes b/289.

Change-Id: Id2bbedca5876b462e630c144b74cc289c3916c4d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10582
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:03:21 +00:00
Ryan Lahfa
cbcd078684 chore(tvix/castore): fix the docstring for process_entry
It was a `//` not a `///`.

Change-Id: Iee3e8c116d73b5dd8a41c027153714415a66695f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10566
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-01-06 01:39:41 +00:00
Florian Klink
5a82736122 chore(tvix): bump test-case dep to 3.3.1
Change-Id: I643548d95a5fab84563c7cbe51ca2ce640c186a9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10537
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-01-05 16:43:34 +00:00
Florian Klink
6b42aef88d fix(tvix/castore): validate Option<Node>
Extend our validation function to also check for the None case.

Change-Id: Ib75f880646d7fb3d66588f1988e61ec18be816a2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10534
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-01-05 16:43:34 +00:00
Florian Klink
f20969de9b refactor(tvix/castore): relax trait bounds for DS
Make this an `AsRef<dyn DirectoryService>`.

This helps dropping some Clone requirements.

Unfortunately, we can't thread this through to TvixStoreIO just yet.

Change-Id: I3f07eb28d6c793d3313fe21506ada84d5a8aa3ac
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10533
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-05 16:43:34 +00:00
Florian Klink
597a6b6205 refactor(tvix/castore/tests): let gen_*_service return Boxes
Only convert to and reuse an Arc<…> where needed.

Change-Id: I2c1bc69cca5a4a3ebd3bdb33d6e28e1f5fb86cb9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10514
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-01 14:45:17 +00:00
Florian Klink
1b62f82b10 refactor(tvix/castore/blobsvc/grpc/wrapper): don't require Arc<_>
Change-Id: I9655f5588c7dc98427de6af47d74b4ab7ce22071
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10516
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-01 14:42:36 +00:00
Florian Klink
96aa220dcf refactor(tvix/castore/directorysvc/grpc/wrapper): no Arc<_>
We can also drop the Clone requirement. Because the trait is async since
some time, there's no need to clone before moving into an async closure,
allowing us to simplify the code a bit.

Change-Id: I9b0a0e10077d8c548d218207b908bfd92c5b8de0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10515
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-01-01 14:40:35 +00:00
Florian Klink
54fe97e725 refactor(tvix/castore): make directorysvc more generic
This works on Box<dyn DirectoryService> too.

Change-Id: Ib869f0f4d963ef4dbaeab22db03ff6afb71ede04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10513
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-01-01 02:09:20 +00:00
Florian Klink
ddae4860c2 feat(tvix/castore/import): generalize ingest_path
We don't actually care if it's an Arc<dyn BlobService>, or something
else, as long as we can Deref to a BlobService and clone.

Change-Id: I0852aaf723f51c5e6b820be8db1199d17309ab08
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10510
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-01 00:54:14 +00:00
Florian Klink
41935fab70 refactor(tvix/castore/directorysvc): return Box, not Arc
While we currently mostly use it in an Arc, as we need to clone it
inside PathInfoService, there might be other usecases not requiring it
to be Clone.

Change-Id: Ia05bb370340792a048e2036be30e285ef1e63870
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10483
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-31 22:18:14 +00:00
Florian Klink
9ca1353122 refactor(tvix/castore/blobsvc): return Box, not Arc
While we currently mostly use it in an Arc, as we need to clone it
inside PathInfoService, there might be other usecases not requiring it
to be Clone.

Change-Id: I7bd337cd2e4c2d4154b385461eefa62c9b78345d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10482
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-31 22:18:14 +00:00
Florian Klink
8a52c7f1c5 feat(tvix/castore/fs): borrow some matches
We only do things with the reference, so we don't need to locally borrow
it.

Change-Id: I6073f7ec7aff717ae3069e28a00b1cb408a50ceb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10455
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-12-29 17:18:14 +00:00
Florian Klink
46a372d5d7 feat(tvix/castore/fs): instrument FuseDaemon functions
Change-Id: I696b7ab6b4c08004db147c0fda7312bbebaa0eec
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10451
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-29 15:57:26 +00:00
Florian Klink
acbb613e61 chore(tvix): switch to upstream futures 0.3.30
The bugs have been fixed,
https://github.com/rust-lang/futures-rs/pull/2801 and
https://github.com/rust-lang/futures-rs/pull/2812 were merged and ended
up in that release.

Change-Id: Iefd990d2d1719b884504093343e54e9c5258e2e2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10414
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-12-24 21:45:04 +00:00
Florian Klink
a5865ec7fa refactor(tvix/castore/fs/tests): drop unused args
There's no need to pass in an unused directory service into the
populate_blob_* method, and considering we have one or two invocation of
each of these, we don't really gain much from having all these functions
follow the same structure, at least for now.

Also, update some function names to better describe what they're doing.

Change-Id: I92f680745c157fb0a602b07342f8838bfad23ecd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10411
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-24 16:05:52 +00:00
Florian Klink
8d86d2f409 refactor(tvix/castore): add RootNode impl for BTreeMap, mv fs tests
cl/10378 did already move store/fs to castore/fs, but we kept the tests
in tvix-store, as they were populating a PathInfoService to make nodes
appear in the mount root.

Update these tests to now just insert root nodes into a BTreeMap, and
ensure we can use that as a RootNodes too.

Change-Id: Iad7d1ee4f9423eb6e3a1da33f433842c9ae0de1f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10410
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2023-12-24 15:44:30 +00:00
Florian Klink
a5167c508c chore(tvix): move store/fs to castore/fs
With the recent introduction of the RootNodes trait, there's nothing in
the fs module pulling in tvix-store dependencies, so it can live in
tvix-castore.

This allows other crates to make use of TvixStoreFS, without having to
pull in tvix-store.

For example, a tvix-build using a fuse mountpoint at /nix/store doesn't
need a PathInfoService to hold the root nodes that should be present,
but just a list.

tvix-store now has a pathinfoservice/fs module, which contains the
necessary glue logic to implement the RootNodes trait for a
PathInfoService.

To satisfy Rust orphan rules for trait implementations, we had to add a
small wrapper struct. It's mostly hidden away by the make_fs helper
function returning a TvixStoreFs.

It can't be entirely private, as its still leaking into the concrete
type of TvixStoreFS.

tvix-store still has `fuse` and `virtiofs` features, but they now simply
enable these features in the `tvix-castore` crate they depend on.

The tests for the fuse functionality stay in tvix-store for now, as
they populate the root nodes through a PathInfoService.

Once above mentioned "list of root nodes" implementation exists, we
might want to shuffle this around one more time.

Fixes b/341.

Change-Id: I989f664827a5a361b23b34368d242d10c157c756
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10378
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2023-12-22 16:55:18 +00:00
Florian Klink
9627ef15de docs(tvix/castore/protos): remove reference
This is not gonna end up as a interlinked docstring.

Change-Id: I2b0ca106aa75bae0156c0b411da5931da60c725d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10406
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
2023-12-21 16:44:48 +00:00
Florian Klink
783a1e314c docs(tvix/castore): fix reference
Change-Id: I00b1d56d58c4d3779b57ab0056cff1c7e6053b9b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10401
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
2023-12-21 16:43:44 +00:00
Ryan Lahfa
0ae32d45f6 feat(tvix/castore): simple filesystem blob service
The simple filesystem `BlobService` enable a user to write blob store
on an existing filesystem using a prefix-style layout in the provided root directory,
e.g. the two first bytes of the blake3 hashes are used as directories prefixes.

Change-Id: I3451a688a6f39027b9c6517d853b95a87adb3a52
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10071
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-12-17 14:34:13 +00:00
Florian Klink
923a5737e6 refactor(tvix/castore): drop is_closed() from impl DirectoryPutter
This is only used in the gRPC version (GRPCPutter), during the test
automation.

So define it as a method there, behind #[cfg(test)], and remove from
the trait.

Change-Id: Idf170884e3a10be0e96c75d946d9c431171e5e88
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10340
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-12-16 23:07:31 +00:00
Florian Klink
8b0047e277 docs(tvix/castore/directorysvc): update comment
This comment didn't make a lot of sense before.

Change-Id: Ie057a133ca4b1a099ed3c885e32316b0d87c5eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10339
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-12-13 21:11:58 +00:00
Florian Klink
3a32963b78 docs(tvix/castore): document expectations about DirectoryService
Namely, all trait implementations should reject invalid data being fed,
and detect invalid data being returned.

b/355 tracks writing some more tests for this, to ensure we're compliant
with this.

Change-Id: I3b05752932837ce208785efb21ffc21508b4b33a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10338
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
Autosubmit: flokli <flokli@flokli.de>
2023-12-13 19:57:45 +00:00
Florian Klink
d236b08916 docs(tvix/castore): fix docstrings
There's been some copypasta errors.

Change-Id: I8fcad6cfc951ead6c789e0dce823c798adbfcf97
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10337
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
2023-12-13 19:40:40 +00:00
Florian Klink
81ef26ba3f fix(tvix/castore/import): don't unwrap entry
If the path specified doesn't exist, construct a proper error instead
of panicking.

Part of b/344.

Change-Id: Id5c6a91248b0a387f3e8f138f8e686e402009e8f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10330
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-12 18:07:11 +00:00