ConcurrentBlobUploader buffers small blobs in memory, and then uploads
them to the BlobService in the background.
In these cases, we know the hash of the whole blob, so we could check if
it exists first before, uploading it.
We were however not, and this caused rate limiting issues in GCS, as it
has an update limit of one write per second on the same key, which we
ran into especially frequently with the empty blob.
This reduces the amount of writes of the same blob considerably.
In the future, we might be able to drop this, as our chunked blob
uploading protocol gets smarter and covers these cases.
Change-Id: Icf482df815812f80a0b65cec0426f8e686308abb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12497
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
From now on we will add the dependencies and their version in the root
Cargo.toml and in order to enable the dependency for a workspace member
we set `workspace = true` in the member's Cargo.toml.
Change-Id: I9738c1cf99810b7ace87ca712c3ea965ba846e25
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12389
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This updates all the dependencies and their "minimum" versions in
Cargo.{lock,toml} to the latest compatible version using `cargo-edit`'s
`cargo upgrade` command that will eventually be merged into `cargo
update`.
Change-Id: Iccb2aa4a1c84a0465222244a0bd0cafe2a82e781
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12388
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
This bumps bigtable_rs to
https://github.com/liufuyang/bigtable_rs/pull/86, allowing us to drop
our second set of prost/tonic/http/axum crates.
Change-Id: I70f9150289c3e8611ebe8a7d99490e3dfd085a6e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12384
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
tonic-reflection 0.12.x moved from the v1alpha to v1 of the reflection
protocol.
However, most clients, like Postman, Kreya and evans don't support that
one yet.
Bump tonic-reflection to 0.12.2, which re-introduces v1alpha support
alongside the v1 version of it, registering both services.
This fixes the example documented in tvix/store/README.md, it was
previously failing as evans couldn't find the v1alpha reflection
service.
See https://github.com/hyperium/tonic/pull/1888 for details.
Change-Id: I55438877317f82dc39face13afeb9594cda07a4e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12353
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Fix a discrepancy in the error message for DirectoryError::SizeOverflow.
The message indicates that the SizeOverflow error occurs when total size
exceeds u32::MAX, but that's not true. All size fields within the
castore's internal Directory ADT are u64, and the SizeOverflow error is
only returned after a call to the checked_add implementation on u64.
See tvix/castore/nodes/directory.rs +111
and tvix/castore/nodes/directory.rs +88
as of this commit.
Change-Id: I74d161ea8927362e1cb601ba163489aa96fb91b1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12259
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
We can avoid depending on things outside //tvix by just using a similar
util from nixpkgs.
Change-Id: I9ea3e1f0a8a059ea10caaec173569ba9f316aec6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12247
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Don't use ValidateNodeError, but SymlinkTargetError.
Also, add checks for too long symlink targets.
Change-Id: I4b533325d494232ff9d0b3f4f695f5a1a0a36199
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12230
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Don't use DirectoryError, but PathComponentError.
Also add checks for too long path components.
Change-Id: Ia9deb9dd0351138baadb2e9c9454c3e019d5a45e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12229
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Replace the hand-rolled code comparing names with a try_fold. Also, make
it slightly stricter here, detecting duplicates in the same fields
earlier.
Change-Id: I9c560838ece88c3b8b339249a8ecbf3b05969538
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12226
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
This provides a batched variant to construct a Directory, which reuses
the previously calculated size.
Checking and inserting code is factored out into a check_insert_node
function, taking the current size as a parameter and returning the new
size.
Change-Id: Ia6c2970a0c12181b7c40e63cf7ce8c93298ea37c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12225
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
Make this a proper struct with named fields. We apparently never access
B3Digest in there, so it can be removed.
Change-Id: Ifc07310393e1afb0a835778eae137a19b54070b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12224
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Provide a into_nodes() function on a Directory, which consumes self and
returns owned PathComponent and Node.
Use it to provide a proper conversion from Directory to the proto
variant that doesn't clone.
There's no need for the one taking only &Directory, we don't use it
anywhere, and once someone needs that they might as well clone Directory
before converting it.
Update all other users of the `.nodes()` function to use `.into_nodes()`
where applicable, and avoid some more cloning there.
Change-Id: Id4577b9eb173c012e225337458898d3937112bcb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12218
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This encodes a verified component on the type level. Internally, it
contains a bytes::Bytes.
The castore Path/PathBuf component() and file_name() methods now
return this type, the old ones returning bytes were renamed to
component_bytes() and component_file_name() respectively.
We can drop the directory_reject_invalid_name test - it's not possible
anymore to pass an invalid name to Directories::add.
Invalid names in the Directory proto are still being tested to be
rejected in the validate_invalid_names tests.
Change-Id: Ide4d16415dfd50b7e2d7e0c36d42a3bbeeb9b6c5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12217
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Add a `SymlinkTarget` type to represent validated symlink targets.
With this, no invalid states are representable, so we can make `Node` be
just an enum of all three kind of types, and allow access to these
fields directly.
Change-Id: I20bdd480c8d5e64a827649f303c97023b7e390f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12216
Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Nodes only have names if they're contained inside a Directory, or if
they're a root node and have something else possibly giving them a name
externally.
This removes all `name` fields in the three different Nodes, and instead
maintains it inside a BTreeMap inside the Directory.
It also removes the NamedNode trait (they don't have a get_name()), as
well as Node::rename(self, name), and all [Partial]Ord implementations
for Node (as they don't have names to use for sorting).
The `nodes()`, `directories()`, `files()` iterators inside a `Directory`
now return a tuple of Name and Node, as does the RootNodesProvider.
The different {Directory,File,Symlink}Node struct constructors got
simpler, and the {Directory,File}Node ones became infallible - as
there's no more possibility to represent invalid state.
The proto structs stayed the same - there's now from_name_and_node and
into_name_and_node to convert back and forth between the two `Node`
structs.
Some further cleanups:
The error types for Node validation were renamed. Everything related to
names is now in the DirectoryError (not yet happy about the naming)
There's some leftover cleanups to do:
- There should be a from_(sorted_)iter and into_iter in Directory, so
we can construct and deconstruct in one go.
That should also enable us to implement conversions from and to the
proto representation that moves, rather than clones.
- The BuildRequest and PathInfo structs are still proto-based, so we
still do a bunch of conversions back and forth there (and have some
ugly expect there). There's not much point for error handling here,
this will be moved to stricter types in a followup CL.
Change-Id: I7369a8e3a426f44419c349077cb4fcab2044ebb6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12205
Tested-by: BuildkiteCI
Reviewed-by: yuka <yuka@yuka.dev>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
*Node and Directory are types of the tvix-castore model, not the tvix
DirectoryService model. A DirectoryService only happens to send
Directories.
Move types into individual files in a nodes/ subdirectory, as it's
gotten too cluttered in a single file, and (re-)export all types from
the crate root.
This has the effect that we now cannot poke at private fields directly
from other files inside `crate::directoryservice` (as it's not all in
the same file anymore), but that's a good thing, it now forces us to go
through the proper accessors.
For the same reasons, we currently also need to introduce the `rename`
functions on each *Node directly.
A followup is gonna move the names out of the individual enum kinds, so
we can better represent "unnamed nodes".
Change-Id: Icdb34dcfe454c41c94f2396e8e99973d27db8418
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12199
Reviewed-by: yuka <yuka@yuka.dev>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This uses our own data type to deal with Directories in the castore model.
It makes some undesired states unrepresentable, removing the need for conversions and checking in various places:
- In the protobuf, blake3 digests could have a wrong length, as proto doesn't know fixed-size fields. We now use `B3Digest`, which makes cloning cheaper, and removes the need to do size-checking everywhere.
- In the protobuf, we had three different lists for `files`, `symlinks` and `directories`. This was mostly a protobuf size optimization, but made interacting with them a bit awkward. This has now been replaced with a list of enums, and convenience iterators to get various nodes, and add new ones.
Change-Id: I7b92691bb06d77ff3f58a5ccea94a22c16f84f04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12057
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This provides a DirectoryService implementation which uses
redb (https://github.com/cberner/redb) as the database. It provides both
in-memory and persistent on-filesystem implementations.
Change-Id: Id8f7c812e2cf401cccd1c382b19907b17a6887bc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12038
Tested-by: BuildkiteCI
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: flokli <flokli@flokli.de>
When retrieving a closure with get_recursive, the following could happen in the GRPC client:
- The first reference to the deduplicated directory is added to expected_directory_digests
- The deduplicated directory is obtained removed from expected_directory_digests
- The second reference to the deduplicated directory is added to expected_directory_digests
- The deduplicated directory has already been sent, but is still in the
expected_directory_digests. It looks to the GRPC client like the
closure is incomplete and the stream ended prematurely.
Change-Id: Ic62bca12e7f8fb85af5fa4dacd199f0f3b8eea8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12033
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Use the ChunkedReader in CombinedBlobService instead which also supports seeking.
Change-Id: I681331a80763172c27e55362b7044fe81aaa323b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12031
Autosubmit: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This provides a PathInfoService implementation using redb
(https://github.com/cberner/redb) as the underlying storage engine.
Both an in-memory variant, as well as a filesystem one is provided,
similar how it's done with the sled implementation.
Supersedes: https://cl.tvl.fyi/c/depot/+/11692
Change-Id: I744619c51bf2efd0fb63659b12a27cbe0b2fd6fc
Signed-off-by: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11995
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
We still have the unique store name to identify which instantiation caused the error. For recursion errors, the full chain is still retained inside the CompositionError.
Change-Id: Iaddcece445a5df331e578d7c69d710db3d5f8dcd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12002
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
Reviewed-by: flokli <flokli@flokli.de>
This allows avoiding a `.node.unwrap()`` after validation.
Change-Id: Ieef1ffebab16cdca94c979ca6831a7ab4f6007da
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11989
Reviewed-by: Brian Olsen <me@griff.name>
Tested-by: BuildkiteCI
This will be necessary for the PathInfoService composition, as some
PathInfoService implementations require a BlobService & DirectoryService
to ingest into.
Using the Extend trait for creating compositions allows extending the same
composition with configs of various types e.g. BlobStore, DirectoryStore
Generics are moved from the Composition struct to the functions.The storage of
the InstantiatonStates uses the TypeId in the key and a Box<dyn Any> in the
value, which is downcasted to InstantiatonState<T>.
Change-Id: I2af11f26c535029adfb1c62905e0e7c4aaed7b51
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11980
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
this is useful when oneshot-instantiating a store from a single config
Change-Id: I08538fdee1d0bb26b3ae2da7d3b2339b2e93bc0a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11975
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Previously we had to make a mutable Config instance and set bytes and
other values in it because they were not exposed to the builder pattern
(https://github.com/hyperium/tonic/issues/908) but now they are, so we
just set them through the builder.
Change-Id: I8904c6b93f09173b56586024b1ced59d622bce66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11966
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
reqwest wants to be able to read a file of trust roots when constructed,
but as it doesn't actually do any HTTPS connections inside the nix
build, an empty list of trust roots is totally sufficient.
Thankfully /dev/null provides such a file.
Change-Id: I9bd1619b2c9f8ff2a6640d2ac410d4de5b20c2ea
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11961
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: aspen <root@gws.fyi>
We previously called ObjectStoreBlobService::parse_url, which passes an
empty list of options when constructing the ObjectStore.
This is most likely not what we want. The more reasonable thing to do is
pass along the query string (pairs) as options to
`object_store::parse_url_opts`, and remove them from the plain URL we
pass to object_store itself.
Change-Id: Ic2cb1dca2a2980a863165d81baa3323a355cf3fe
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11897
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Get rid of the `let grpc_client` and `let resp` in some cases.
Change-Id: Idc1c0f566a3b1b48da62e6f1977b07620656b16c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11884
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
We don't need to require these things for these impl blocks yet.
Change-Id: I3cec958a637a4f900bdd38abd00e9133bf75ce46
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11865
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Tested-by: BuildkiteCI