This uses our own data type to deal with Directories in the castore model.
It makes some undesired states unrepresentable, removing the need for conversions and checking in various places:
- In the protobuf, blake3 digests could have a wrong length, as proto doesn't know fixed-size fields. We now use `B3Digest`, which makes cloning cheaper, and removes the need to do size-checking everywhere.
- In the protobuf, we had three different lists for `files`, `symlinks` and `directories`. This was mostly a protobuf size optimization, but made interacting with them a bit awkward. This has now been replaced with a list of enums, and convenience iterators to get various nodes, and add new ones.
Change-Id: I7b92691bb06d77ff3f58a5ccea94a22c16f84f04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12057
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This provides a DirectoryService implementation which uses
redb (https://github.com/cberner/redb) as the database. It provides both
in-memory and persistent on-filesystem implementations.
Change-Id: Id8f7c812e2cf401cccd1c382b19907b17a6887bc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12038
Tested-by: BuildkiteCI
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: flokli <flokli@flokli.de>
When retrieving a closure with get_recursive, the following could happen in the GRPC client:
- The first reference to the deduplicated directory is added to expected_directory_digests
- The deduplicated directory is obtained removed from expected_directory_digests
- The second reference to the deduplicated directory is added to expected_directory_digests
- The deduplicated directory has already been sent, but is still in the
expected_directory_digests. It looks to the GRPC client like the
closure is incomplete and the stream ended prematurely.
Change-Id: Ic62bca12e7f8fb85af5fa4dacd199f0f3b8eea8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12033
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Use the ChunkedReader in CombinedBlobService instead which also supports seeking.
Change-Id: I681331a80763172c27e55362b7044fe81aaa323b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12031
Autosubmit: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This provides a PathInfoService implementation using redb
(https://github.com/cberner/redb) as the underlying storage engine.
Both an in-memory variant, as well as a filesystem one is provided,
similar how it's done with the sled implementation.
Supersedes: https://cl.tvl.fyi/c/depot/+/11692
Change-Id: I744619c51bf2efd0fb63659b12a27cbe0b2fd6fc
Signed-off-by: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11995
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
We still have the unique store name to identify which instantiation caused the error. For recursion errors, the full chain is still retained inside the CompositionError.
Change-Id: Iaddcece445a5df331e578d7c69d710db3d5f8dcd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12002
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
Reviewed-by: flokli <flokli@flokli.de>
This allows avoiding a `.node.unwrap()`` after validation.
Change-Id: Ieef1ffebab16cdca94c979ca6831a7ab4f6007da
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11989
Reviewed-by: Brian Olsen <me@griff.name>
Tested-by: BuildkiteCI
This will be necessary for the PathInfoService composition, as some
PathInfoService implementations require a BlobService & DirectoryService
to ingest into.
Using the Extend trait for creating compositions allows extending the same
composition with configs of various types e.g. BlobStore, DirectoryStore
Generics are moved from the Composition struct to the functions.The storage of
the InstantiatonStates uses the TypeId in the key and a Box<dyn Any> in the
value, which is downcasted to InstantiatonState<T>.
Change-Id: I2af11f26c535029adfb1c62905e0e7c4aaed7b51
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11980
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
this is useful when oneshot-instantiating a store from a single config
Change-Id: I08538fdee1d0bb26b3ae2da7d3b2339b2e93bc0a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11975
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Previously we had to make a mutable Config instance and set bytes and
other values in it because they were not exposed to the builder pattern
(https://github.com/hyperium/tonic/issues/908) but now they are, so we
just set them through the builder.
Change-Id: I8904c6b93f09173b56586024b1ced59d622bce66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11966
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
reqwest wants to be able to read a file of trust roots when constructed,
but as it doesn't actually do any HTTPS connections inside the nix
build, an empty list of trust roots is totally sufficient.
Thankfully /dev/null provides such a file.
Change-Id: I9bd1619b2c9f8ff2a6640d2ac410d4de5b20c2ea
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11961
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: aspen <root@gws.fyi>
We previously called ObjectStoreBlobService::parse_url, which passes an
empty list of options when constructing the ObjectStore.
This is most likely not what we want. The more reasonable thing to do is
pass along the query string (pairs) as options to
`object_store::parse_url_opts`, and remove them from the plain URL we
pass to object_store itself.
Change-Id: Ic2cb1dca2a2980a863165d81baa3323a355cf3fe
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11897
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Get rid of the `let grpc_client` and `let resp` in some cases.
Change-Id: Idc1c0f566a3b1b48da62e6f1977b07620656b16c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11884
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
We don't need to require these things for these impl blocks yet.
Change-Id: I3cec958a637a4f900bdd38abd00e9133bf75ce46
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11865
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Tested-by: BuildkiteCI
This introduces optional helper function in tvix/tracing for trace
propagation and uses these helper in the `tvix-store`.
The GRPCBlobService, GRPCDirectoryService and GRPCPathInfoService now
accept a generic client, meaning the client can be generated with either
`::new` or `::with_interceptor`.
This was tested and validated by starting a `tvix-store daemon` and
`tvix-store import`.
Change-Id: I4b194483bf09266820104b4b56e4a135dca2b77a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11863
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
By default tokio::spawn does not instrument the spawned task with the
current spawn (https://github.com/tokio-rs/tokio/discussions/6008), do
this manually for all tokio::spawn functions in functions that are
instrumented.
Change-Id: I83dd8145b3a62421454aff57d34180cebbee8304
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11864
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
Don't show an empty spinner for daemon commands.
Move the bar to the right, so the text is better aligned between spinner
progress and bar progress styles.
Generally, push progress bars a bit more down to the place where we can
track progress. This includes adding one in the upload_blob span.
Introduce another progress style template for transfers, which
interprets the counter as bytes (not just a plain integer), and also a data rate.
Use it for here and in the fetching code, and also make the progress bar
itself a bit less wide.
Change-Id: I15c2ea3d2b24b5186cec19cd3dbd706638497f40
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11845
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Both umounts happening from another process, as well as tvix-store
itself calling umount() on FuseDaemon will cause the FUSE worker threads
to terminate.
So far there was no nice way to wait on these threads to be terminated
from multiple places, causing the `tvix-store mount` command to only be
terminated if interrupted via ctrl-c, not via an external umount.
Update FuseDaemon to use a ThreadPool, which gives us a join primitive
over all threads, that can also be called from multiple places.
Await on a join() from there to end the program, not the ctrl-c signal
handler as it was before.
Using FuseDaemon from multiple tasks requires Arc<>-ing both the
ThreadPool as well as the inner FuseSession (which also needs to be
inside a Mutex if we want to unmount), but now we can clone FuseDaemon
around and use it in two places. We could probably also have used an
Option and drop the FuseSession after the first umount, but this looks
cleaner.
Change-Id: Id635ef59b560c111db52ad0b3ca3d12bc7ae28ca
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11825
Reviewed-by: Brian Olsen <me@griff.name>
Tested-by: BuildkiteCI
Use the new helper introduced in CL 11708 instead of rolling our own.
Change-Id: I292a9bc8baf73a6c75efe784031bcda1835bb645
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11709
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
Reviewed-by: flokli <flokli@flokli.de>
ClosureValidator was previously only suitable for a very narrow use case:
Validating incoming uploads, which are in leaves-to-root order.
This is because the ordering validation was hard-wired into the add()
function.
This
- Re-name ClosureValidator to DirectoryGraph, which is more suitable
since it actually stores the Directory structs and is drained in the end.
- Move the ordering-related logic to a separate OrderValidator, which
can be used independently.
- re-write DirectoryGraph to be a general purpose
validator which can accept the input in both orders
and can be drained in both orders as well.
This means the DirectoryGraph and OrderValidator can now serve
multiple new purposes:
- Validating the incoming closure on the client while downloading.
- Validating the incoming closure downloaded in a caching layer from the
`far` cache, and re-order it for insertion into the `near` cache.
Change-Id: I2b4b226348416912d7a31935bec050e53d911b70
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11708
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: yuka <yuka@yuka.dev>
Fix some typos found while reading various documents, mostly those
relating to the castore.
Here is a summary of the edits.
- fix broken link between documents in the store and castore directories
- clarify expression in castore's data model document that indicates
that the *name* of each child node of a directory must be unique
across all three lists of children
- add missing closing parenthesis in castore's data model document
- replace "how" with "what" in the phrase "unclear how a ... would even
look like" in castore's why-not-git-trees document
- remove unnecessary articles in castore's blobstore chunking document
- add missing "y" to "optionall" in eval's compilation of bindings
document
Change-Id: I1997ea91bb4e9c40abcd81e0cde9405968580ba6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11763
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This adds the tracing-indicatif crate, and configures it as a layer in
our tracing_subscriber pipeline to emit progress for every span that's
configured so.
It also moves from using std::io::stderr to write logs to using their
writer, to avoid clobbering output.
Progress bar styles are defined in a lazy_static, moving this into a
general tracing is left for later.
This adds some usage of this to the `imports` and `copy` commands.
The output can still be improved a bit - we should probably split each
task up into a smaller (instrumented) helper functions, so we can create
a progress bar for each task.
Change-Id: I59a1915aa4e0caa89c911632dec59c4cbeba1b89
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11747
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Closes: https://b.tvl.fyi/issues/401
With this change all crate features (and their combinations) will be built and
tested in CI.
From now on, when adding/removing a Cargo feature for a crate,
you will want to add it to the features power set that gets tested in CI.
For each crate there's a default.nix with a `mkFeaturePowerset` invocation,
modify the list to include/remove the feature.
Note that you don't want to add "collection" features,
such as `fs` for tvix-[ca]store or `default`.
Change-Id: I966dde1413d057770787da3296cce9c1924570e0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11717
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
These tests only interact with the FUSE layer, and
import super::fuse to do its work.
However, this only works if the `fuse` feature is enabled, which we
don't do if we enable the `virtiofs` feature only, causing the tests
to fail:
```
❯ cargo test --no-default-features --features virtiofs
Compiling tvix-castore v0.1.0 (/home/flokli/dev/nixos/code.tvl.fyi-submit2/tvix/castore)
error[E0432]: unresolved import `super::fuse`
--> castore/src/fs/tests.rs:14:13
|
14 | use super::{fuse::FuseDaemon, TvixStoreFs};
| ^^^^ could not find `fuse` in `super`
```
We move src/fs/tests.rs to src/fs/fuse/tests.rs
(and src/fs/fuse.rs to src/fs/fuse/mod.rs) to better structure this,
which will automatically cause both tests and code to only be built if
we have the `fuse` feature enabled.
Change-Id: I8fbbad3e4457e326bdfd171aa5c43d25d3187b5b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11715
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
The archive ingester has a mechanism for concurrently uploading small
blobs to the blob service in order to hide round trip latency with the
blob service when ingesting many small blobs.
Other ingestion sources like NARs also need a similar mechanism, this
extracts the concurrent blob uploading mechanism into its own struct to
make it more reusable.
Change-Id: I05020419ff4b9ad5829fbfb5cd08d36db983b8c0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11693
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Using remove_node messed up the extraction of nodes from the graph. Use
into_nodes_edges() instead, to remove the nodes without cloning.
Change-Id: Id76c7935d082d6f26192cc3cd490483594f1d1e2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11684
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
Reviewed-by: flokli <flokli@flokli.de>
We can just use the `BoxStream` directly, or a `once` with the single
`Directory`.
In the recursive case, we also did not properly close the channel after
the first error.
Change-Id: Ifad56d307fc7861107b6d3cffd28d35631d526e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11635
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Necessary to directly use this in the GRPC DirectoryService wrapper
directly.
Change-Id: Ic6a0038a40dc30071d145af5035345fcd93288ae
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11634
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Use try_stream! rather than stream!, and a bit more map_err and ok_err
to make things a bit more concise. Once we have proper error types here,
and impl Froms, a lot of the error mapping would disappear entirely.
Change-Id: I5240a6b0ff7818b94c151322774242b2c142e33b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11633
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Replace the loop manually driving the iterator with a for … in, and some
of the match with ok_or_else.
Change-Id: I6d7b3ef1bf1c7aa128bd6adef09390b54f79479e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11632
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
… in Cargo.toml.
This gets an imperative `cargo clippy` run to pick up that config,
so `-A clippy::blocks_in_conditions` doesn't need to be explicitly
specified anymore.
Change-Id: I32b6cc50c77c22cba0d816d0db508c2f94b2c383
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11659
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
We don't produce these erorrs anymore, no need to provide a conversion
to it.
Change-Id: I37933e436ad15c5d90b3ac270c4ef5742980513d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11614
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
We don't want to block here, and this also means there's no poisoning to
deal with.
Change-Id: Ic375571970c48beace0005ae2c012135086a4d67
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11613
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This one doesn't require us to deal with poisoning, is upgradeable and
the right thing to use when locking access to data, not IO resources.
Change-Id: I78634953a73404500d28f51f1d93a87e215c8149
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11612
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This does IO, which might take a longer amount of time than what we want
to be blocking the normal executor.
Use spawn_blocking instead. I didn't add it for the constructors, as we
only call these once.
Change-Id: I96231fcff8d10abe90cafde25a099a2db6ea9414
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11617
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
This never did any chunking, and sled (rightfully) performs really bad
if values get too large.
We switched the default to using the objectstore backend with the local
filesystem a while ago, no need to keep this footgun around anymore.
Change-Id: I2c12672f2ea6a22e40d0cbf9161560baddd73d4a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11616
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Once we break out with the root node, there may be no more elements in
the stream.
Change-Id: I6f5fc5662095aa2b2a56bcad506d25520d9ad00c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11592
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
We got away with not properly dealing with this for the archive case,
where everything is contained inside a toplevel dir, but NARs can encode
a single file/symlink.
Properly break if the IngestionEntry path has the ROOT as parent, and
only create filling directories in the other case.
Change-Id: Ib378d0d1040de7c3fe310912a0b0488c55afee83
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11590
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
There's no need for this to be a &PathBuf.
Change-Id: I2d4126d57cfd8ddaad5dd327943b70b83d45c749
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11589
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
We implement DirectoryService for Arc<DirectoryService> and
Box<DirectoryService>, this is sufficient.
Change-Id: I0a5a81cbc4782764406b5bca57f908ace6090737
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11586
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
The emulator and bigtable client are quite big. Remove them from the
default //tvix:shell.
Put the tests behind a `integration` feature flag, and add a variant
with that enabled to CI, and drop the bigtable tools from //tvix:shell.
Change-Id: Ie042097a0d6fc26542faa96c139b77298ccb160a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11582
Reviewed-by: edef <edef@edef.eu>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This switches from using std::path::Path to using castore paths.
We can drop some error handling in descend_to, as absolute (or redundant)
paths are not representable.
We however now need to convert from a std::path::Path to our
representation, and decide to accept .. canonicalization, as paths in
EvalIO might contain this. Dealing .. to hop into another store path, if
we encounter this, should be dealt with in a previous step.
Change-Id: I5e94693808420c5d56587c68731252b54755bf93
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11575
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
This allows converting from std::path::Path to castore PathBufs.
A flag is present to control .. canonicalization, and the usual caveats
about platform-specific differences apply.
Currently only added for unix, we'll carefully consider other platforms
on a case-by-case basis.
Change-Id: If289a92f75a2e5c3eec132b6a91a28d225fc1989
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11577
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
This allows using both Path and PathBuf in a function argument taking
`impl AsRef<Path>`.
Change-Id: Ibd3ba6fac538069d2fe729d1ef399fdef301668f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11574
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
These are fallible methods, so they should be named accordingly.
Change-Id: I6dc271c42989dd6500173488190f65381835d6fe
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11572
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
The empty path (Path::ROOT) is explicitly a valid path, and "foo" is
simply a child of "". The root itself is the only path without a parent.
Change-Id: Iff00dc8aed89eaf98702b664c0df658bd5a1d88a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11569
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Use the shared code for validating node names, since that is what path
components represent.
Change-Id: I12109c1306b224718faa66cf1f2874c78c1436a7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11566
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Since PathBuf doesn't have inherent methods anymore, these just forward
to Path itself.
Change-Id: I30f44adc9994337c367bad985ada0e8fcb98dd6a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11570
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
We implement Debug explicitly, so that we don't just see raw integers.
Change-Id: I11213094728f3e0c674562ee71c092a950041632
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11565
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This contains Path and PathBuf, representing platform-independent paths
representable by the castore model.
These are always relative, and platform-independent, which distinguishes
them from the ones provided in the standard library.
A subsequent CL will move IngestionEntry (and more) to use them.
Change-Id: Ib85857f4159ebc2f3c00192c95d4e5b54ffd4fcf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11558
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
Have ingest_entries return an Error type with only three kinds:
- Error while uploading a specific Directory
- Error while finalizing the directory upload
- Error from the producer
Move all ingestion method-specific errors to the individual
implementations.
Change-Id: I2a015cb7ebc96d084cbe2b809f40d1b53a15daf3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11557
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
We shouldn't try to represent non-representable things in the ingestion
entries (only to throw an error).
It's cleaner to throw the error directly in the part producing the
stream.
Change-Id: I6b6f6d8c2f677425210142a39f1829ddeefec812
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11556
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: firefly <firefly@firefly.nu>
This is only useful for when we have access to a filesystem, so it
shouldn't be in the root.
Change-Id: I9923aaed1aef9d3a1e8fad41f58821d51c2eb34b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11555
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: firefly <firefly@firefly.nu>
Tested-by: BuildkiteCI
These can be arbitrary bytes in theory. Some of our libraries might
be more strict, or inconsistent w.r.t. their representation of path
separators.
Change-Id: I7981b74fc7d3dd79f5589cf2ef52ced7b71dd003
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11551
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
The one for `fs` was wrong, and ended up being attached to ingest_path,
and the one for `archive` was missing entirely.
Change-Id: I8a4c32fb5293badb1ea0764c278a88e4ca33c018
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11552
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
Prior to this, some tests would not build
or would fail in an obscure way.
Change-Id: I68587cc7592492ebfd71ca02fc7ccc9ff7c0196f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11544
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
The `Error` enum for the `imports` crate has both filesystem and archive
specific errors and was starting to get messy.
This adds a separate `Error` enum for archive-specific errors and then
keeps a single `Archive` variant in the top-level import `Error` for all
archive errors.
Change-Id: I4cd0746c864e5ec50b1aa68c0630ef9cd05176c7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11498
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
Ingesting tarballs with a lot of small files is very slow because of the
round trip time to the `BlobService`. To mitigate this, small blobs can
be buffered into memory and uploaded concurrently in the background.
Change-Id: I3376d11bb941ae35377a089b96849294c9c139e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11497
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
With `ingest_entries` being more generalized, we can now use it for
ingesting the directory entries generated from tarballs.
Change-Id: Ie1f7a915c456045762e05fcc9af45771f121eb43
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11489
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Rather than carrying around an Future in the IngestionEntry::Regular,
simply carry the plain B3Digest.
Code reading through a non-seekable data stream has no choice but to
read and upload blobs immediately, and code seeking through something
seekable (like a filesystem) probably knows better what concurrency to
pick when ingesting, rather than the consuming side.
(Our only) one of these seekable source implementations is now doing
exactly that. We produce a stream of futures, and then use
[StreamExt::buffered] to process more than one, concurrently.
We still keep the same order, to avoid shuffling things and violating
the stream order.
This also cleans up walk_path_for_ingestion in castore/import, as well
as ingest_dir_entries in glue/tvix_store_io.
Change-Id: I5eb70f3e1e372c74bcbfcf6b6e2653eba36e151d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11491
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Fixes some build warnings that only happen when building in release mode
which disables `debug_assertions`.
Change-Id: I554d5fce7c869c23cf4aa93179f0ee9f7f7c834e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11490
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
`ingest_entries` requires that all directories referenced by entries in
the ingestion stream have an explicit entry in the stream.
For example, if the stream contains a file with path `foo/bar`, there
must be an entry that comes later in the stream for the directory `foo`.
Change-Id: I61b4fbbb73ea7278715e04271d8073b484e05e61
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11488
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Implement a first pass at the fetchTarball builtin.
This uses much of the same machinery as fetchUrl, but has the extra
complexity that tarballs have to be extracted and imported as store
paths (into the directory- and blob-services) before hashing. That's
reasonably involved due to the structure of those two services.
This is (unfortunately) not easy to test in an automated way, but I've
tested it manually for now and it seems to work:
tvix-repl> (import ../. {}).third_party.nixpkgs.hello.outPath
=> "/nix/store/dbghhbq1x39yxgkv3vkgfwbxrmw9nfzi-hello-2.12.1" :: string
Co-authored-by: Connor Brewster <cbrewster@hey.com>
Change-Id: I57afc6b91bad617a608a35bb357861e782a864c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11020
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Explicitly document and add a debug assertion for that.
It's up to callers to ensure this doesn't happen.
Change-Id: Ib5d154809c2ad2920258e239993d0b790d846dc8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11487
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Move error types and filesystem-specific functions to a separate file,
and keep the fs:: namespace in public exports.
Change-Id: I5e9e83ad78d9aea38553fafc293d3e4f8c31a8c1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11486
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
This is not a stream of direntries anymore, but a stream of ingestion
entries.
Change-Id: I387f4497b6567066b24c58ca0262e710348180e9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11485
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Previously the store ingestion code was coupled to `walkdir::DirEntry`s
produced by the `walkdir` crate which made it impossible to reuse
ingesting from other sources like tarballs or NARs.
This introduces a `IngestionEntry` which carries enough information for
store ingestion and a future for computing the Blake3 digest of files.
This allows the producer to perform file uploads in a way that makes
sense for the source, ie. the filesystem upload could concurrently
upload multiple files at the same time, while the NAR ingestor will need
to ingest the entire blob before yielding the next blob in the stream.
In the future we can buffer small blobs and upload them concurrently,
but the full blob still needs to be read from the NAR before advancing.
Change-Id: I6d144063e2ba5b05e765bac1f27d41b3c8e7b283
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11462
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This adds `Directory::add` which is a convenience helper for adding
nodes into a `Directory` while preserving sorted order.
This implements `Ord` and `PartialOrd` for `FileNode`, `SymlinkNode`,
and `DirectoryNode` so `binary_search` can be used.
Change-Id: I94b86bdef5d0da55aa352e098988b9704cafca19
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11481
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Userland likes to seek backwards, and until we have store composition
and can serve chunks from a local cache, we need to buffer the
individual chunks in memory.
Change-Id: I66978a0722d5f55ed4a9a49d116cecb64a01995d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11448
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This makes it easier to separate concurrent requests on the same inode.
Change-Id: I7637c1d889336beeb0d186182ce22fbf60fd16c3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11447
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI