Commit graph

266 commits

Author SHA1 Message Date
Florian Klink
4c5c810c6f refactor(tvix/castore/import): move upload_blob_at_path into fs mod
This is only useful for when we have access to a filesystem, so it
shouldn't be in the root.

Change-Id: I9923aaed1aef9d3a1e8fad41f58821d51c2eb34b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11555
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: firefly <firefly@firefly.nu>
Tested-by: BuildkiteCI
2024-04-30 15:53:58 +00:00
Florian Klink
5e8cfcfcd6 fix(tvix/castore/import): symlink targets are Vec<u8>
These can be arbitrary bytes in theory. Some of our libraries might
be more strict, or inconsistent w.r.t. their representation of path
separators.

Change-Id: I7981b74fc7d3dd79f5589cf2ef52ced7b71dd003
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11551
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
2024-04-30 13:18:03 +00:00
Florian Klink
ca64881cb3 docs(tvix/castore): fix tvix_castore::import sub-mod docstrings
The one for `fs` was wrong, and ended up being attached to ingest_path,
and the one for `archive` was missing entirely.

Change-Id: I8a4c32fb5293badb1ea0764c278a88e4ca33c018
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11552
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
2024-04-30 10:06:17 +00:00
Alice Carroll
8d49ff3d64 test(tvix): Fix tvix tests on macOS
Prior to this, some tests would not build
or would fail in an obscure way.

Change-Id: I68587cc7592492ebfd71ca02fc7ccc9ff7c0196f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11544
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-04-30 00:55:34 +00:00
sterni
69e4a78818 chore(3p/sources): Bump channels & overlays
- //tvix: address new clippy lints

- //users/tazjin: Satisfy gonic module's new need for a playlist folder.

- //users/aspen/games: adjust for changed location of df's default
  init.txt and d_init.txt.

Change-Id: I00a2adb506ae866206fb6f88c39c9a6af320380f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11509
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
Reviewed-by: aspen <root@gws.fyi>
2024-04-28 16:39:26 +00:00
edef
d93633937c fix(tvix): typo
Change-Id: Ibe4741b8086e9da442232c14cdb337556704cef6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11514
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-25 23:47:49 +00:00
Connor Brewster
d2e67f021e refactor(tvix/castore): add separate Error enum for archives
The `Error` enum for the `imports` crate has both filesystem and archive
specific errors and was starting to get messy.

This adds a separate `Error` enum for archive-specific errors and then
keeps a single `Archive` variant in the top-level import `Error` for all
archive errors.

Change-Id: I4cd0746c864e5ec50b1aa68c0630ef9cd05176c7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11498
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
2024-04-24 15:41:38 +00:00
Connor Brewster
79698c470c feat(tvix/castore): upload blobs concurrently when ingesting archives
Ingesting tarballs with a lot of small files is very slow because of the
round trip time to the `BlobService`. To mitigate this, small blobs can
be buffered into memory and uploaded concurrently in the background.

Change-Id: I3376d11bb941ae35377a089b96849294c9c139e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11497
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
2024-04-23 17:02:07 +00:00
Connor Brewster
fa69becf4d refactor(tvix/castore): switch to ingest_entries for tarball ingestion
With `ingest_entries` being more generalized, we can now use it for
ingesting the directory entries generated from tarballs.

Change-Id: Ie1f7a915c456045762e05fcc9af45771f121eb43
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11489
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-23 15:31:22 +00:00
Florian Klink
5fc403587f refactor(tvix/castore): ingest filesystem entries in parallel
Rather than carrying around an Future in the IngestionEntry::Regular,
simply carry the plain B3Digest.

Code reading through a non-seekable data stream has no choice but to
read and upload blobs immediately, and code seeking through something
seekable (like a filesystem) probably knows better what concurrency to
pick when ingesting, rather than the consuming side.

(Our only) one of these seekable source implementations is now doing
exactly that. We produce a stream of futures, and then use
[StreamExt::buffered] to process more than one, concurrently.

We still keep the same order, to avoid shuffling things and violating
the stream order.

This also cleans up walk_path_for_ingestion in castore/import, as well
as ingest_dir_entries in glue/tvix_store_io.

Change-Id: I5eb70f3e1e372c74bcbfcf6b6e2653eba36e151d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11491
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-20 18:54:28 +00:00
Connor Brewster
b0bdeb2e89 feat(tvix/castore): Fix build warnings in release mode
Fixes some build warnings that only happen when building in release mode
which disables `debug_assertions`.

Change-Id: I554d5fce7c869c23cf4aa93179f0ee9f7f7c834e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11490
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
2024-04-20 16:47:12 +00:00
Connor Brewster
18ab59ed70 fix(tvix/castore): ensure all directories are present during ingestion
`ingest_entries` requires that all directories referenced by entries in
the ingestion stream have an explicit entry in the stream.

For example, if the stream contains a file with path `foo/bar`, there
must be an entry that comes later in the stream for the directory `foo`.

Change-Id: I61b4fbbb73ea7278715e04271d8073b484e05e61
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11488
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-20 16:46:41 +00:00
Aspen Smith
3107961428 feat(tvix/eval): Implement builtins.fetchTarball
Implement a first pass at the fetchTarball builtin.

This uses much of the same machinery as fetchUrl, but has the extra
complexity that tarballs have to be extracted and imported as store
paths (into the directory- and blob-services) before hashing. That's
reasonably involved due to the structure of those two services.

This is (unfortunately) not easy to test in an automated way, but I've
tested it manually for now and it seems to work:

    tvix-repl> (import ../. {}).third_party.nixpkgs.hello.outPath
    => "/nix/store/dbghhbq1x39yxgkv3vkgfwbxrmw9nfzi-hello-2.12.1" :: string

Co-authored-by: Connor Brewster <cbrewster@hey.com>
Change-Id: I57afc6b91bad617a608a35bb357861e782a864c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11020
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:58:04 +00:00
Florian Klink
f34e0fa342 feat(tvix/castore/import): only allow normal components in entry paths
Explicitly document and add a debug assertion for that.

It's up to callers to ensure this doesn't happen.

Change-Id: Ib5d154809c2ad2920258e239993d0b790d846dc8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11487
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:14:19 +00:00
Florian Klink
e9db0449e7 refactor(tvix/castore/import): make module, split off fs and error
Move error types and filesystem-specific functions to a separate file,
and keep the fs:: namespace in public exports.

Change-Id: I5e9e83ad78d9aea38553fafc293d3e4f8c31a8c1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11486
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-20 14:14:19 +00:00
Florian Klink
c4cb099823 refactor(tvix/castore/import): rename ingest_entries arg
This is not a stream of direntries anymore, but a stream of ingestion
entries.

Change-Id: I387f4497b6567066b24c58ca0262e710348180e9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11485
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:14:19 +00:00
Connor Brewster
259d7a3cfa refactor(tvix/castore): generalize store ingestion streams
Previously the store ingestion code was coupled to `walkdir::DirEntry`s
produced by the `walkdir` crate which made it impossible to reuse
ingesting from other sources like tarballs or NARs.

This introduces a `IngestionEntry` which carries enough information for
store ingestion and a future for computing the Blake3 digest of files.
This allows the producer to perform file uploads in a way that makes
sense for the source, ie. the filesystem upload could concurrently
upload multiple files at the same time, while the NAR ingestor will need
to ingest the entire blob before yielding the next blob in the stream.
In the future we can buffer small blobs and upload them concurrently,
but the full blob still needs to be read from the NAR before advancing.

Change-Id: I6d144063e2ba5b05e765bac1f27d41b3c8e7b283
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11462
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-19 20:37:05 +00:00
Connor Brewster
150106610e feat(tvix/castore): add convenience add method to Directory
This adds `Directory::add` which is a convenience helper for adding
nodes into a `Directory` while preserving sorted order.

This implements `Ord` and `PartialOrd` for `FileNode`, `SymlinkNode`,
and `DirectoryNode` so `binary_search` can be used.

Change-Id: I94b86bdef5d0da55aa352e098988b9704cafca19
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11481
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-04-19 20:10:14 +00:00
Florian Klink
538d5fc8ee fix(tvix/castore/blobservice/grpc): don't use NaiveSeeker for now
Userland likes to seek backwards, and until we have store composition
and can serve chunks from a local cache, we need to buffer the
individual chunks in memory.

Change-Id: I66978a0722d5f55ed4a9a49d116cecb64a01995d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11448
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
8107678632 fix(tvix/castore/src): record rq.handle field in read()
This makes it easier to separate concurrent requests on the same inode.

Change-Id: I7637c1d889336beeb0d186182ce22fbf60fd16c3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11447
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
99bc926d1e fix(tvix/castore/fs): use io::copy to fill kernel-provided buffer
The docs state we must fill all of the buffer, except on EOF.

Change-Id: Id977ba99c0b15132422474ebbf82bb92b79d55ba
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11446
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
bfd342873c feat(tvix/castore/blob/naive_seeker): add some more tracing
Change-Id: Iecf4a82a7d84008a8620825570b34e9094e6d590
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11445
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
9d9c731147 feat(tvix/castore/blob/chunked_reader): add some more traces
Change-Id: I2408707a7bc0e1c0cd8bd2933f8d68805b9e12c9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11444
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
28e98af9bc fix(tvix/castore/blobservice/chunk_rd): only skip *first* chunk bytes
When (re)initializing a chunked reader, we were erroneously skipping the
first n bytes from all chunks, not just the first one.

Fix this, by passing in an enumerated list of chunks, and only calling
SeekFrom::Start() on the first chunk in the stream.

With this, I'm able to invoke b3sum on bin/bash successfully.

Change-Id: I52ea480569267e093b0ac9d6bcd5c2d1b4db25f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11443
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
9398bc46b6 refactor(tvix/castore/blob/naive_seeker): rework skipping for clarity
Increase the discard_buf to 4096 (as I've seen this size).
Use the ready! macro to propagate pendings.
Make it more clear what exactly should be skipped in total, and what
during the current iteration.

Also write down that poll_read call already takes care of updating
self.pos, as I ran into that trap earlier (and added it here).

Change-Id: I2d22e1c8a835c0f3dd0c648917009b2bad4fd57c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11442
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-16 18:45:52 +00:00
Florian Klink
4d802fa0ae feat(tvix/castore/blob/chunked_reader): only reassemble on real seek
If the resulting offset equals to our current position, there's no need
to recreate a reader.

Change-Id: I855f0c79c514c16ca48a78e12978af2835fbbd6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11441
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-16 18:45:52 +00:00
Florian Klink
80d0b305a7 docs(tvix/castore/blobservice): explain open_read for small blobs more
State that this case applies if the blob is small enough to fit inside a
single chunk.

Change-Id: I0383514729e686799599b629cf1303b284147bb4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11440
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 19:33:37 +00:00
Florian Klink
e958cb0251 feat(tvix/castore/blobs/object_store): chunks() method for small blobs
We previously returned Ok(None) when being asked for more granular
chunking info, signalling the blob does not exist at all.

This is however incorrect, we should return an empty Vec instead, as
documented in the trait.

Change-Id: I83ecc2027e0767134c7598792c2ee6d964853c66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11439
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 19:33:06 +00:00
Florian Klink
c936c1c042 fix(tvix/castore/blob/object_store): tweak log levels
Don't log with info! here, bug debug!.

Change-Id: I57bd5f2a45276090b893a4051fd175e3948ddfa4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11438
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 19:32:35 +00:00
Florian Klink
9dc621cd95 fix(tix/castore/blobservice): don't warn if chunk list is empty
It's perfectly normal if we ask for more granular chunking info and the
backend responds it does not have it.

Change-Id: I593ab3e53b4f4e70c99f39b266546d2ac8eb10c1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11437
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 19:32:04 +00:00
Florian Klink
7156697010 feat(tvix/castore/blob/grpc_wrapper): add blob.digest field
We're receiving bytes over the wire, and encode them the same way
B3Digest does internally, but don't use it for formatting, as we're
discarding that string.

In case the sent bytes don't have the right length, the string will be
short, but it's better to still have it as a field, even if it's not a
valid b3 digest.

Change-Id: I6ef08275d51c8a0d98f5e46844b15dfd05d17cd8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11436
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 19:32:04 +00:00
Florian Klink
d1da9f5c84 refactor(tvix/castore/fs): add parenthesis for readability
As suggested in cl/11426.

Change-Id: Ic2bb8cf2838bf0be09fb8bc62b8e598a3d153699
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11434
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 15:09:49 +00:00
Florian Klink
b025a30d27 refactor(tvix/castore/fs): remove From<Node> for InodeData
These were copying unnecessarily. Instead, have a
InodeData::from_node(), which *consumes* the Node entirely, returns
`InodeData` and the split-off name (which is not part of InodeData).

Callers can then use the result in various helper functions, like:

 - InodeData::as_fuse_type
 - InodeData::as_fuse_file_attr
 - InodeData::as_fuse_entry

… to prepare their replies to the kernel.

This removes not only a bunch of clones, but also a lot of copy-pasted
code.

Change-Id: Idbca5f25cc29e96c1f4c614b33dff2becb0a8738
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11435
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:49:44 +00:00
Florian Klink
fb852b0245 fix(tvix/castore/blobs): reply to has() for chunks
We allow reading individual chunks via open_read(), it's inconsistent if
a has() would return Ok(false).

Change-Id: Ie713d968172ccd2687d2e6e0dfef89ee152ef511
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11420
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-15 14:47:12 +00:00
Florian Klink
f1349caf3f refactor(tvix/castore): relax trait bounds on BlobService
We don't need to clone BlobService anymore.

Change-Id: I2f3b9a595f604ec0f1e081f6e90cd8b67cbb8961
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11419
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 14:47:12 +00:00
Florian Klink
9498ac936e fix(tvix/castore/directory): fix graph traversal
Use a proper graph library to ensure all nodes are reachable from the
root.

We had a bit of that handrolled during add(), as well as later, which
had an annoying bug:

Redundant nodes were omitted during insert, but when returning the list
during finalize, we did not properly account they need to be introduced
before their parents are sent.

We now simply populate a petgraph DiGraph during insert (skipping
inserting nodes we already saw), and use petgraph's DfsPostOrder to
traverse the graph during finalize.

If the number of returned indices equals the total number of nodes in
the graph, all nodes are reachable from the root, we can consume the
graph and return the nodes as a vec, in the same order as the traversal
(and insertion).

Providing a regression test for the initial bug is challenging, as the
current code uses a bunch of HashSets. I manually tested ingesting a
full NixOS closure using this mechanism (via gRPC, which exposes this
problem, as it validates twice), and it now works.

Change-Id: Ic1d5e3e981f2993cc08c5c6b60ad895e578326dc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11418
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-15 14:47:12 +00:00
Florian Klink
6ebaa7b88a refactor(tvix/castore/import): restructure directory uploader a bit
Have a Option<Box<dyn DirectoryPutter>>, which is lazily initialized
whenever we first want to upload a directory.

Have the loop explicitly break when it encounters the root_node, and
deal with the flushing after the loop.

Deal with the FUTUREWORK (assertion for root directory digest matching
what the DirectoryPutter returns).

Change-Id: Iefc4904d8b8387e868fb752d40e3e4e4218c7407
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11417
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:47:12 +00:00
Florian Klink
c088123d4e refactor(tvix/castore/import): put invariant checker into a .inspect()
Separate this a bit stronger from the main application flow.

Change-Id: I2e9bd3ec47cc6e37256ba6afc6e0586ddc9a051f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11416
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:37:35 +00:00
Florian Klink
b70744fda6 refactor(tvix/*/import): rename direntry_stream, entries_per_depths
Align these names and comments with the two users, to make it more
obvious we're doing the same thing here, just use a different method to
come up with entries_per_depths.

Change-Id: I42058e397588b6b57a6299e87183bef27588b228
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11415
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:06:50 +00:00
Florian Klink
bcc00fba8f refactor(tvix/castore/import): inline process_entry
This did very little, and especially the part of relying on the outside
caller to pass in a Directory if the type is a directory required having
per-entry-type specific logic anyways.

It's cleaner to just inline it.

Change-Id: I997a8513ee91c67b0a2443cb5cd9e8700f69211e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11414
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:05:49 +00:00
Florian Klink
d47bd4f4bc refactor(tvix/castore/import): move process_entry to the end of the file
This makes it easier to understand the code.

Change-Id: I0a9047433000551a6ba1f50a8c5c93527bc86216
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11413
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 14:05:18 +00:00
Florian Klink
c19bc95b8a fix(tvix/castore/blobservice): update bytes_read only on successful read
We previously updated this.pos also in case the underlying read returned
an error.

Also, use the ready! macro to remove the match block, and instrument
errors returned during start_seek.

Change-Id: Ic32e26579d964a76b45687134acc48d72d67c36f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11421
Reviewed-by: Brian Olsen <me@griff.name>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 14:04:47 +00:00
Florian Klink
bccb4c92c3 refactor(tvix/castore/fs): add InodeData::as_fuse_{entry,file_attr}
Remove the now unused gen_file_attr (which had a wrong docstring).

Change-Id: Ie86b14d1ad798e6233bc44c43ace3f8b95c67ea9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11430
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 13:59:46 +00:00
Florian Klink
50065afd36 refactor(tvix/castore/fs): remove "add … handle" debug messages
Change-Id: Iac22bbef96a2afa0416f011d073934b52b19975d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11433
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-15 13:54:14 +00:00
Florian Klink
d3d88431f3 feat(tvix/castore/fs): assign read[dir*]/release[dir] ops to parent span
When a directory or file is open()'ed, we already put some data into
a lookup table, and subsequent operations then use the returned handle
id.

By also adding the span that's been created during these calls into the
lookup table, we can properly set the span parent for these requests,
nicely connecting the individual operations to the bigger picture.

Change-Id: Ia354842fccdbc7f45c2d3efda3acf058b2dbc48e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11429
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Brian Olsen <me@griff.name>
2024-04-15 13:54:14 +00:00
Florian Klink
34d9d54aae fix(tvix/castore/fs): use record to add fields to current span
Instead of creating another child span, we can use
`tracing::Span::current().record(k,v)` to add an additional field to the
current span.

Change-Id: I337faac0e73a0da6eb0a52cb75c2e8c026eff774
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11428
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 13:23:33 +00:00
Florian Klink
ff5835008f feat(tvix/castore/fs): implement readdirplus
This currently shares some code with readdir, except it's also providing
a second `fuse_backend_rs::api::filesystem::Entry` argument to the
`add_entry` function call.

Refactoring this to reduce some duplication is left for a future CL.

Change-Id: I282c8dfc6a711d00a4482c87cbb84d4950c0aee9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11426
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-15 13:22:08 +00:00
Florian Klink
456191e69e refactor(tvix/castore/fs): use consistent span field name for handle
Use rq.handle in `release` too, and remove interpolating it into the
log message itself.

Also update the comment, we don't get ownership, just simply drop, and
change the level to warn!, as suggested in cl/11425.

Change-Id: If4e6cff6d8b580671b1548ae3862851db4af6694
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11427
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-15 13:20:07 +00:00
Florian Klink
2a47ad1d90 feat(tvix/castore/fs): implement opendir/releasedir
Similar to how we already handle opening files, implement opendir/
releasedir, and keep a map of dir_handles. They point to the rx side of
a channel.

This greatly improves performance listing the root of the filesystem
when used inside tvix-store, as we don't need to re-request the listing
(and skip to the desired position) all the time.

Change-Id: I0d3ec4cb70a8792c5a1343439cf47d78d9cbb1d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11425
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 13:18:36 +00:00
Florian Klink
3ed7eda79b refactor(tvix/castore/fs): use std::sync::Mutex
This allows us acquiring the lock in sync code still. Also, simplify
some of the error handling a bit.

Change-Id: I29e83b715f92808e95ecb0ae9de787339d1a371d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11424
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-15 09:27:04 +00:00
Florian Klink
1bf6b9f5a0 refactor(tvix/castore/fs): simplify some separate spawn and blocks
We can just pass an async move closure to `self.tokio_handle.block_on`
and make this a bit shorter.

Change-Id: Iba674f34f22ba7a7de7c5bae59d64584884cb17c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11423
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-15 09:27:04 +00:00
Florian Klink
515bfa18fb feat(tvix/castore/fs): support extended attributes
This exposes `user.tvix.castore.{blob,directory}.digest` xattr keys for
files and directories:

```
❯ getfattr -d /tmp/tvix/06jrrv6wwp0nc1m7fr5bgdw012rfzfx2-nano-7.2-info
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.directory.digest="b3:SuYDcUM9RpWcnA40tYB1BtYpR0xw72v3ymhKDQbBfe4="

❯ getfattr -d /tmp/tvix/156a89x10c3kaby9rgf3fi4k0p6r9wl1-etc-shells
getfattr: Removing leading '/' from absolute path names
user.tvix.castore.blob.digest="b3:pZkwZoHN+/VQ8wkaX0wYVXZ0tV/HhtKlSqiaWDK7uRs="
```

It's currently mostly used for debugging, though it might be useful for
tvix-castore-aware syncing programs using the filesystem too.

Change-Id: I26ac3cb9fe51ffbf7f880519f26741549cb5ab6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11422
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
2024-04-15 09:27:04 +00:00
Florian Klink
b70e01a4db feat(tvix/castore/import): remove copying in find_ancestor
We don't need to copy if we explicitly say that the returned
Option<Path> may hold onto bytes from the passed in &DirEntry.

Change-Id: Ib46b6fd2f8f19a45f8bef79c4c1d2fa6b490cad7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11410
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-13 13:20:29 +00:00
Florian Klink
31e3382129 feat(tvix/*store/bigtable): limit retries connecting to cbtemulator
This kept retrying indefinitely if the socket didn't appear.

Change-Id: I4d4ef61df73cef6abda698501432f370abc8a82c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11406
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-13 12:01:00 +00:00
Florian Klink
cc42cac39c refactor(tvix/castore/import): rename ingest_entries function arg
This is a stream of DirEntry, so let's call it direntry_stream.

Change-Id: I5b3cb4efba899d746393f75f6ece7eaa79424717
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11401
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-13 10:51:58 +00:00
Florian Klink
7bcb896e48 feat(tvix/castore/directory/grpc): instrument functions
Change-Id: I9cc0a6a32184773597556ab5f9250257aa18ca4e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11399
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-04-12 22:32:37 +00:00
Florian Klink
f8800ba189 chore(tvix): bump rstest to 0.19.0
Change-Id: Ib2f5e84fdb8be1210b3507da67d4fe84f061651e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11387
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-12 22:16:56 +00:00
Florian Klink
17849c5c00 feat(tvix/castore/directory): add bigtable backend
This adds a Directory service using
https://cloud.google.com/bigtable/docs/ as a K/V store.

Directory (closures) are put in individual keys.

We don't do any bucketed upload of directory closures (yet), as castore/
fs does query individually, does not request recursively (and buffers).
This will be addressed by store composition at some point.

Change-Id: I7fada45bf386a78b7ec93be38c5f03879a2a6e22
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11212
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-09 15:50:34 +00:00
Florian Klink
289b3126db feat(tvix/castore): drop test-case crate dep
Change-Id: I5049a3682a58ce848d80f413b2964331025a90a8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11370
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
936a175b2f refactor(tvix/castore/directoryservice/from_addr): migrate to rstest
Change-Id: I52b14652822766421d66e95bc646ed7baecc705f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11369
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
Tested-by: BuildkiteCI
2024-04-07 14:51:47 +00:00
Florian Klink
d94ff54d42 refactor(tvix/castore): migrate closure_validator to rstest
Change-Id: I6c594d2e670a681484b858c3e04bc25b9e5a2077
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11368
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
c715b6d448 refactor(tvix/castore/tonic): migrate to rstest
Change-Id: Ie088bf03c739bf64abf3432175362a8f92f501c2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11367
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
71fb99a265 refactor(tvix/castore/hashing_reader): migrate to rstest
Change-Id: I99ae0e27b4db4799db8af7cd6b9cc8d7f09227de
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11366
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
c7c66abd85 refactor(tvix/castore/blobservice/object_store): drop individual tests
This (and more) should now be covered by the generic testsuite
(in crate::blobservice::tests).

Change-Id: Ib3afc4f19f7e37a561b7398d43663dc941971f5c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11365
Tested-by: BuildkiteCI
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
2024-04-07 14:51:47 +00:00
Florian Klink
32ac9bd110 refactor(tvix/blobservice/from_addr): move from test_case to rstest
This allows conditionalizing test cases on feature flags.

Change-Id: Ic9ed9ef52f703c973fda2ca4aae5f425e33b67de
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11364
Reviewed-by: picnoir picnoir <picnoir@alternativebit.fr>
Tested-by: BuildkiteCI
2024-04-07 14:51:47 +00:00
Florian Klink
6e8046bec7 feat(tvix/castore/*service/tests): add objectstore to tests, sort
Change-Id: If3a45d2f8008b75524ead704b05320364cc60d92
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11282
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 21:15:01 +00:00
Florian Klink
508fcd65ee feat(tvix/castore/directoryservice): log more TODOs
We need to define behaviours and add tests for these.

Change-Id: Id5825fafbf47897d8de42503ea6006eb131b1082
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11281
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 07:58:10 +00:00
Florian Klink
74023a07a4 refactor(tvix/castore/*): drop utils.rs and grpc directorysvc tests
This drops pretty much all of castore/utils.rs.

There were only two things left in there, both a bit messy and only used
for tests:

Some `gen_*_service()` helper functions. These can be expressed by
`from_addr("memory://")`.

The other thing was some plumbing code to test the gRPC layer, by
exposing a in-memory implementation via gRPC, and then connecting to
that channel via a gRPC client again.

Previous CLs moved the connection setup code to
{directory,blob}service::tests::utils, close to where we exercise them,
the new rstest-based tests.

The tests interacting directly on the gRPC types are removed, all
scenarios that were in there show now be covered through the rstest ones
on the trait level.

Change-Id: I450ccccf983b4c62145a25d81c36a40846664814
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11223
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 07:58:10 +00:00
Florian Klink
07a51c7dc9 feat(tvix/store): add rstest-based PathInfoService tests
This introduces rstest-based tests. We also add fixtures for creating
some BlobService / DirectoryService out of thin air.
To test a PathInfoService, we don't really care too much about its
internal storage - ensuring they work is up to the castore tests.

Change-Id: Ia62af076ef9c9fbfcf8b020a781454ad299d972e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11272
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 07:02:18 +00:00
Florian Klink
fe6ae58ba5 feat(tvix/castore): add rstest-based BlobService tests
Change-Id: Ifae9c41e1e3fa5e96f9b7e188181a44650ff8a04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11250
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-24 20:03:01 +00:00
Florian Klink
f5cf659245 feat(tvix/castore): AsRef<dyn BlobService> impl BlobService
This allows us to use containers around BlobServices as BlobServices too.

Change-Id: I3c7feb074f42b4e07c550fb8dfa63cf81d448ab5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11249
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 20:01:22 +00:00
Florian Klink
3ece32bbf9 feat(tvix/castore): add rstest-based DirectoryService tests
This creates test scenarios (using the DirectoryService trait) that we
want all DirectoryService implementations to pass.

Some of these tests are ported from proto::tests::grpc_directoryservice,
which tested this on the gRPC interface (rather than the trait),
some others ensure certain behaviour for which we only recently
introduced general checking logic (through ClosureValidator).

We also borrow some code related to setting up a gRPC DirectoryService
client (connecting to a server exposing a in-memory DiretoryService)
from castore::utils, this will be deleted once it's all ported over.

Change-Id: I6810215a76101f908e2aaecafa803c70d85bc552
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11247
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-24 20:00:40 +00:00
Florian Klink
6f5474bf02 feat(tvix/castore): AsRef<dyn DirectoryService> impl DirectoryService
This allows us to use containers around DirectoryServices as DirectoryServices too.

Change-Id: I56cca27b3212858db8b12b874df0e567dd868711
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11248
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-24 20:00:40 +00:00
Florian Klink
21fcc1c9df feat(tvix/castore/directory): add SledDirectoryPutter
This uses DirectoryClosureValidator for validation and the sled batch
API to insert multiple directories at once.

Change-Id: I2d6dc513ccbc02e638f8d22173da5463e73182ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11222
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 19:56:55 +00:00
Florian Klink
f7281d8fd5 refactor(tvix/castore/directory/grpc_wrapper): use ClosureValidator
This greatly simplifies the code in this function, replacing it with a
much better tested (and more capable!) version of the validation logic.

It also enables the gRPC server frontend to make use of the
DirectoryPutter interface. While this might not be too visible in terms
of latency thanks to gRPC streams bursting, it also enables further
optimizations later (such as bucketing of directory closures).

Change-Id: I21f805aa72377dd5266de3b525905d9f445337d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11221
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-24 19:55:42 +00:00
Florian Klink
5f069a3eb8 refactor(tvix/castore/directory): have SimplePutter use Validator
This simplifies a bunch of code, and gets rid of some TODOs.

Also, move it out of castore/utils, and into its own file.

Change-Id: Ie63e05a6cdfb2a73e878cf7107f9172aed1cdf13
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11224
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-24 17:42:30 +00:00
Florian Klink
c92ef2df64 feat(tvix/castore/directory): add ClosureValidator
This can be used to validate a Directory closure (connected DAG of
Directories), and their insertion order.

Directories need to be inserted (via `add`), in an order from the leaves
to the root. During insertion, we validate as much as we can at that
time:

 - individual validation of Directory messages
 - validation of insertion order (no upload of not-yet-known Directories)
 - validation of size fields of referred Directories

Internally it keeps all received Directories (and their sizes) in a HashMap,
keyed by digest.

Once all Directories have been inserted, a drain() function can be
called to get a (deduplicated and) validated list of directories, in
from-leaves-to-root order (to be stored somewhere).

While assembling that list, a check for graph connectivity is performed
too, to ensure there's no separate components being sent (and only one
root).

It adds a test suite for these cases, which is much nicer to test than
where we previously had these checks (only in the gRPC server wrapper).

Followup CLs will move the existing putters to use this.

Change-Id: Ie88c832924c170a24626e9e3e91d868497b5d7a4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11220
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-24 17:39:49 +00:00
Florian Klink
110734dfed docs(tvix/castore): fix missing slash in docstring
Change-Id: I5f39d0e613b651458b168cfd9df0693d7f01d704
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11246
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-23 22:06:27 +00:00
Florian Klink
e449f423e7 fix(tvix/castore/directory/tests): close upload handle
We need to ensure the Directories are successfully uploaded before doing
any testing with them.

Change-Id: Iafa8deb86b3d5eb302ebfba3ced34385f67a7229
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11244
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-23 22:00:16 +00:00
Florian Klink
9793745459 feat(tvix/castore): derive Eq and Hash automatically
This allows these messages to be put in HashSets.

Change-Id: Ia58094cafe53eb624578821d3d8d969c5d21a1d7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11219
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-03-20 21:19:02 +00:00
Florian Klink
345a639e79 refactor(tvix/castore): instrument DirectoryPutter impls consistently
Log the entire span with "trace" level, not just its `ret` level.

The level of the error value event defaults to ERROR, so we don't loose
these.

B3Digest implements Debug and Display the same way, so we can omit the
`(Display)` part in `ret(Display)` for them.

Change-Id: Id00d123a5798e5bdc9820dd97ae2b4d4eb5455f0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11218
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 21:02:44 +00:00
Florian Klink
60b47b336b refactor(tvix/castore/directory): remove GRPCPutter::new
This is no public API to construct this, there's exactly one caller,
and it's perfectly fine to directly populate the struct there.

Change-Id: Idae43a0162ee9bc687d21c550e0c9df33f12d263
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11217
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 21:02:44 +00:00
Florian Klink
9fb213f47a feat(tvix/castore): record errors for some failures in SimplePutter
This makes it easier to see what's going wrong when uploading multiple
Directories.

Change-Id: Ieb71424b9761777c5f719b2f365962644de82baf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11209
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-20 12:21:09 +00:00
Florian Klink
5627dc04e1 feat(tvix/castore/blob): document missing objectstore+*:// URL
Change-Id: I3cbc75e636efd467ddda7fc3c3c8f19ad42204ee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11206
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 12:17:42 +00:00
Florian Klink
345cebaebb refactor(tvix/castore/blob): drop simplefs
This functionality is provided by the object store backend too
(using `objectstore+file://$some_path`).

This backend also supports content-defined chunking and compresses
chunks with zstd.

Change-Id: I5968c713112c400d23897c59db06b6c713c9d8cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11205
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 12:17:42 +00:00
Florian Klink
2798803f76 refactor(tvix/castore): introduce "cloud" feature flag
This controls whether tvix-castore has support for various cloud
backends or not.

Use this to control the set of feature flags for the object_store
backend, and only enable the aws, azure and gcp ones if it's set.
In the future this can be used to enable/disable other cloud backends
too.

Without feature flags, `object_store` already supports the `InMemory`
and `LocalFilesystem` backends, and we also want to unconditionally
enable the `http` one. Make sure at least the construction of these
services is covered in the tests.

Similarly, the tvix-store crate, which provides the tvix-store CLI has a
`cloud` feature flag too (defaulting to enabled).

Change-Id: I9fb9c87b740e7dc83f8ff7a0862905d036d513f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11204
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-03-20 12:17:42 +00:00
Florian Klink
591edf0d5b docs(tvix/castore/directory): update docstring for get_recursive
The rust trait was missing to document the order of the elements in the
stream. Document that, and also the reasoning behind this.

Change-Id: I27ef0b2020082783fc41c2015233175e2b8e716d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11203
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-20 12:17:42 +00:00
Florian Klink
c0566985b0 refactor(tvix/castore/directory/from_addr): use match guards
This will allow feature-flagging some of the backends.

Change-Id: Iddbdb89d3cf9c966a2c25b06b03e6917b284cae5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11201
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
2024-03-20 11:53:04 +00:00
Florian Klink
7deadd50d5 refactor(tvix/castore/blob/from_addr): use match guards
This will allow feature-flagging some of the backends.

Change-Id: Idffbf8b3fd154f5a3d938225c3871feffea8ff8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11200
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 11:52:29 +00:00
Florian Klink
499bc2f7ee refactor(tvix/castore/blobsvc): use B3Digest Display impl
We don't need to use BASE64 here on our own, B3Digest has a Display
impl.

This will also make sure the `b3:` digest is present in field values.

Change-Id: I0ce6ee0f7e7e99fb9b16872953a1b742e99be291
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11192
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-03-18 16:10:05 +00:00
Florian Klink
05bdb68523 feat(tvix/blobservice/object_store) more logging
Have derive_{blob,chunk}_path emit trace-level events for both the
values they're called with, as well as the return value.

With RUST_LOG in place, it doesn't get lost in other unrelated noise.

Change-Id: Id2451e3657324eff482841eb26a22d19e22bde30
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11136
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-18 16:10:05 +00:00
Florian Klink
82f8ce8b7d feat(tvix/castore/blobsvc/grpc): read data in chunks
Whenever this encounters an open_read(), it'll first check for more
granular chunking. If there's more granular chunking data available, a
ChunkedReader is constructed (which supports seeking backwards).

This currently is still a bit stupid, and doesn't compose, as
`ChunkedReader` uses `self` as the `BlobService` to ask for the
individual chunks.

In store composition future, we might want to compose this differently,
essentially constructing `ChunkedReader` with another `BlobService`
representing the entire hierarchy, so there's a chance to locally cache
things, and do less requests.

Change-Id: I22e0df4d6245f666d083b4f0b7114d3ac41d1dce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11185
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-18 16:10:05 +00:00
Florian Klink
70bbf23767 refactor(tvix/castore/src/blobservice): remove useless else case
Change-Id: I09000371a1d8ff212ab46050d1a480509c6ffe70
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11183
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-18 14:25:37 +00:00
Florian Klink
3b1c9172f6 feat(tvix/castore): impl Debug for B3Digest
Use the same format as Display, b3: followed by the base64
representation. This makes the debug implementation of everything
containing a b3 digest much nicer to read.

Change-Id: I3ca3154d0b6fb07781c8f9c83ece3ff1a6957902
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11181
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-17 22:18:20 +00:00
Florian Klink
dbf87f3057 chore(tvix): bump tonic to 0.11.0
This bumps tonic and surrounding crates to 0.11.x.

We added support for tonic 0.11.x into tokio-listener
(https://github.com/vi/tokio-listener/pull/4), so that's bumped as well.

Change-Id: Icfade5894403228299836fefb21b2f9ae59dbebb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11156
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-03-16 17:04:12 +00:00
Florian Klink
fdf9657654 fix(tvix): don't emit rerun-if-changed
`build.rs` emits rerun-if-changed statements for all proto files, as
well as all include paths we pass it.

Unfortunately, due to protobufs include path rules, we need to specify
the path to the depot root itself as an include path, at least when
building impurely with `cargo`. This causes cargo to essentially always
rebuild, as it also puts its own temporary files in there.

Unfortunately, tonic-build does not chase down to individual .proto
files that are included.

Disable emitting these `rerun-if-changed` statements for now.

This could cause cargo to not rebuild protos every time, causing stale
data until the next local `cargo clean`, but considering the protos
change not that frequently, and it'll immediately surface if trying to
build via Nix (either locally or in CI), it's a good-enough compromise.

Change-Id: Ifd279a2216222ef3fc0e70c5a2fe6f87997f562e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11157
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2024-03-16 09:34:10 +00:00
Florian Klink
f22d5b3d11 feat(tvix/castore/blobsvc/from_addr): support object_store
The object_store crate supports a ton of different stores, with different schemes.

For now, use a objectstore+ scheme prefix to enable these.

Change-Id: I946f76e32a0fb0867ef59060217894cda5b959b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11080
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-03-11 22:42:01 +00:00
Florian Klink
1c2db676a0 feat(tvix/castore/blobsvc): add object storage implementation
This uses the `object_store` crate to expose a tvix-castore BlobService
backed by object storage.

It's using FastCDC to chunk blobs into smaller chunks when writing to
it.

These are exposed at the .chunks() method.

Change-Id: I2858c403d4d6490cdca73ebef03c26290b2b3c8e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11076
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
2024-03-11 22:42:01 +00:00
Florian Klink
4e78de7393 fix(tvix/castore/grpc/directory): skip_all fields in instrument
This only contains the outer metadata wrapping, and that's not too interesting:

> Request { metadata: MetadataMap { headers: {"content-type":
> "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers",
> "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions:
> Extensions }

Drop these fields for now, and rely on the underlying implementations to
add instrumentation for the application-specific fields.

Clean up the error logging a bit.

Change-Id: Ife1090ed411766a61e1fa60fd4c9570f38de1e98
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11102
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-09 05:47:41 +00:00
Florian Klink
05deb37f44 fix(tvix/castore/grpc/blob): skip_all fields in instrument
This only contains the outer metadata wrapping, and that's not too interesting:

> Request { metadata: MetadataMap { headers: {"content-type":
> "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers",
> "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions:
> Extensions }

Drop these fields for now, and rely on the underlying implementations to
add instrumentation for the application-specific fields.

Log errors in some places where we didn't so far.

Change-Id: Ia68d6c526987d3716be62a0809195401cf28512b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11101
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-09 05:47:36 +00:00