Commit graph

13 commits

Author SHA1 Message Date
Florian Klink
c7845f3c88 refactor(tvix/castore): move *Node and Directory to crate root
*Node and Directory are types of the tvix-castore model, not the tvix
DirectoryService model. A DirectoryService only happens to send
Directories.

Move types into individual files in a nodes/ subdirectory, as it's
gotten too cluttered in a single file, and (re-)export all types from
the crate root.

This has the effect that we now cannot poke at private fields directly
from other files inside `crate::directoryservice` (as it's not all in
the same file anymore), but that's a good thing: it now forces us to go
through the proper accessors.

For the same reasons, we currently also need to introduce the `rename`
functions on each *Node directly.

A followup will move the names out of the individual enum kinds, so
we can better represent "unnamed nodes".

Change-Id: Icdb34dcfe454c41c94f2396e8e99973d27db8418
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12199
Reviewed-by: yuka <yuka@yuka.dev>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-08-13 18:39:49 +00:00
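A minimal sketch of the pattern the commit above describes: a type with private fields lives in its own module, is re-exported from the enclosing scope, and is only reachable through accessors and a `rename` helper. The `FileNode` fields and method signatures here are illustrative assumptions, not the actual tvix-castore definitions.

```rust
// Hypothetical layout sketch; names and fields are assumptions for illustration.
mod nodes {
    /// A file node whose fields are private to this module.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct FileNode {
        name: Vec<u8>,
        size: u64,
    }

    impl FileNode {
        pub fn new(name: Vec<u8>, size: u64) -> Self {
            Self { name, size }
        }

        /// Read-only accessor; code outside `nodes` cannot touch `name` directly.
        pub fn name(&self) -> &[u8] {
            &self.name
        }

        /// Read-only accessor for the size.
        pub fn size(&self) -> u64 {
            self.size
        }

        /// Consumes the node and returns it with a different name.
        pub fn rename(self, name: Vec<u8>) -> Self {
            Self { name, ..self }
        }
    }
}

// Re-export, so callers do not need to know about the `nodes` module.
pub use nodes::FileNode;
```

Because `name` is private, a caller that wants a different name has to go through `rename`, which is the kind of constraint the commit message calls "a good thing".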
Yureka
3ca0b53840 refactor(tvix/castore): use Directory struct separate from proto one
This uses our own data type to deal with Directories in the castore model.

It makes some undesired states unrepresentable, removing the need for conversions and checking in various places:

 - In the protobuf, blake3 digests could have a wrong length, as proto doesn't know fixed-size fields. We now use `B3Digest`, which makes cloning cheaper, and removes the need to do size-checking everywhere.
 - In the protobuf, we had three different lists for `files`, `symlinks` and `directories`. This was mostly a protobuf size optimization, but made interacting with them a bit awkward. This has now been replaced with a list of enums, plus convenience iterators to get the various kinds of nodes and methods to add new ones.

Change-Id: I7b92691bb06d77ff3f58a5ccea94a22c16f84f04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12057
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-08-13 12:17:01 +00:00
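The two points in the commit above can be sketched roughly as follows. `Digest32`, `Node` and `Directory` are hypothetical stand-ins written for this illustration; they are not the real `B3Digest` or castore types, and the field choices are assumptions.

```rust
use std::convert::TryFrom;

/// Hypothetical fixed-size digest; the length is part of the type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Digest32([u8; 32]);

impl TryFrom<Vec<u8>> for Digest32 {
    /// The offending length, for error reporting.
    type Error = usize;

    fn try_from(bytes: Vec<u8>) -> Result<Self, Self::Error> {
        let len = bytes.len();
        // The size check happens once, at the boundary.
        <[u8; 32]>::try_from(bytes).map(Digest32).map_err(|_| len)
    }
}

/// One list of enums instead of three separate lists.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Node {
    Directory { name: String, digest: Digest32 },
    File { name: String, digest: Digest32, size: u64 },
    Symlink { name: String, target: Vec<u8> },
}

#[derive(Debug, Default)]
pub struct Directory {
    nodes: Vec<Node>,
}

impl Directory {
    /// Adds a node of any kind.
    pub fn add(&mut self, node: Node) {
        self.nodes.push(node);
    }

    /// Convenience iterator over directory nodes only.
    pub fn directories(&self) -> impl Iterator<Item = &Node> + '_ {
        self.nodes.iter().filter(|n| matches!(n, Node::Directory { .. }))
    }

    /// Convenience iterator over file nodes only.
    pub fn files(&self) -> impl Iterator<Item = &Node> + '_ {
        self.nodes.iter().filter(|n| matches!(n, Node::File { .. }))
    }

    /// Convenience iterator over symlink nodes only.
    pub fn symlinks(&self) -> impl Iterator<Item = &Node> + '_ {
        self.nodes.iter().filter(|n| matches!(n, Node::Symlink { .. }))
    }
}
```

Once a `Digest32` exists, no later code needs to re-check its length; a 31-byte input is rejected at conversion time.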
Connor Brewster
b0aaff25fa refactor(tvix/castore): extract concurrent blob uploader
The archive ingester has a mechanism for concurrently uploading small
blobs to the blob service in order to hide round trip latency with the
blob service when ingesting many small blobs.

Other ingestion sources like NARs also need a similar mechanism, so this
extracts the concurrent blob uploading mechanism into its own struct to
make it more reusable.

Change-Id: I05020419ff4b9ad5829fbfb5cd08d36db983b8c0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11693
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-05-20 15:21:46 +00:00
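As a rough illustration of the idea in the commit above, the sketch below wraps "start an upload in the background, wait for all of them at the end" into a reusable struct. It uses plain `std::thread` and an in-memory store as a swapped-in technique to stay self-contained; `InMemoryBlobStore` and `ConcurrentUploader` are hypothetical names, and nothing here reflects the actual implementation.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a remote blob service.
#[derive(Clone, Default)]
pub struct InMemoryBlobStore {
    blobs: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl InMemoryBlobStore {
    /// Stores a blob; a real service would pay a network round trip here.
    pub fn put(&self, name: String, data: Vec<u8>) {
        self.blobs.lock().unwrap().insert(name, data);
    }

    /// Number of blobs stored so far.
    pub fn count(&self) -> usize {
        self.blobs.lock().unwrap().len()
    }
}

/// Reusable helper that uploads blobs in the background.
pub struct ConcurrentUploader {
    store: InMemoryBlobStore,
    pending: Vec<JoinHandle<()>>,
}

impl ConcurrentUploader {
    pub fn new(store: InMemoryBlobStore) -> Self {
        Self { store, pending: Vec::new() }
    }

    /// Starts an upload on a new thread and returns immediately.
    pub fn upload(&mut self, name: String, data: Vec<u8>) {
        let store = self.store.clone();
        self.pending.push(thread::spawn(move || store.put(name, data)));
    }

    /// Waits for all pending uploads and returns how many completed.
    pub fn join(self) -> thread::Result<usize> {
        let mut done = 0;
        for handle in self.pending {
            handle.join()?;
            done += 1;
        }
        Ok(done)
    }
}
```

This sketch spawns one thread per blob without any limit; a production version would presumably bound both the number of in-flight uploads and the memory held by buffered blobs.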
Florian Klink
281bd46a43 feat(tvix-castore/import): have IngestionEntry.path() return &Path
There's no need for this to be a &PathBuf.

Change-Id: I2d4126d57cfd8ddaad5dd327943b70b83d45c749
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11589
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-05-05 14:54:19 +00:00
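A short sketch of why returning `&Path` is preferable to `&PathBuf`: the borrowed form is all a reader needs, and it composes with any function that accepts `&Path`. The `Entry` struct below is a hypothetical stand-in, not the real `IngestionEntry`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical entry type that owns its path.
pub struct Entry {
    path: PathBuf,
}

impl Entry {
    pub fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into() }
    }

    /// Returns the borrowed slice type; `&PathBuf` derefs to `&Path` here.
    pub fn path(&self) -> &Path {
        &self.path
    }
}

/// Accepts any borrowed path, whether it came from a `PathBuf` or not.
pub fn depth(path: &Path) -> usize {
    path.components().count()
}
```

The relationship mirrors `&str` versus `&String`: exposing the owned type behind a reference only leaks a storage detail.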
Florian Klink
ba00f0c695 refactor(tvix/*store): use DS: DirectoryService
We implement DirectoryService for Arc<DirectoryService> and
Box<DirectoryService>, so a generic `DS: DirectoryService` bound is
sufficient.

Change-Id: I0a5a81cbc4782764406b5bca57f908ace6090737
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11586
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-05-04 21:27:26 +00:00
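The reasoning in the commit above can be sketched as follows: once the trait is implemented for `Arc<T>` and `Box<T>`, a function with a plain `DS: DirectoryService` bound accepts the concrete type, a shared handle, or a boxed trait object. The trait below has a single made-up method (`name`) purely for illustration; the real trait looks different.

```rust
use std::sync::Arc;

/// Hypothetical, heavily simplified service trait.
pub trait DirectoryService {
    fn name(&self) -> String;
}

/// Forwarding impl: a shared handle is itself a service.
impl<T: DirectoryService + ?Sized> DirectoryService for Arc<T> {
    fn name(&self) -> String {
        (**self).name()
    }
}

/// Forwarding impl: a boxed (possibly dyn) service is itself a service.
impl<T: DirectoryService + ?Sized> DirectoryService for Box<T> {
    fn name(&self) -> String {
        (**self).name()
    }
}

/// Hypothetical concrete implementation.
pub struct Memory;

impl DirectoryService for Memory {
    fn name(&self) -> String {
        "memory".to_string()
    }
}

/// One generic bound covers all three ways of passing a service.
pub fn describe<DS: DirectoryService>(ds: DS) -> String {
    format!("using {}", ds.name())
}
```

Callers that need shared ownership pass an `Arc`, and callers that need dynamic dispatch pass a `Box<dyn DirectoryService>`; the function signature stays the same.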
Florian Klink
516c6dc572 refactor(tvix/castore/import): use crate Path[Buf] in IngestionEntry
This explicitly splits ingestion-method-specific path types from the
castore types.

Change-Id: Ia3b16105fadb8d52927a4ed79dc4b34efdf4311b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11563
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-05-02 15:26:29 +00:00
Florian Klink
c9d3946cb5 refactor(tvix/castore/import): restructure error types
Have ingest_entries return an Error type with only three kinds:

 - Error while uploading a specific Directory
 - Error while finalizing the directory upload
 - Error from the producer

Move all ingestion method-specific errors to the individual
implementations.

Change-Id: I2a015cb7ebc96d084cbe2b809f40d1b53a15daf3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11557
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-30 17:12:39 +00:00
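One way to picture the error structure described in the commit above is an enum with three variants that is generic over the producer's error type, so each ingestion method can bring its own errors. The names `IngestError` and `ingest`, and the payload types, are assumptions made for this sketch.

```rust
use std::fmt;

/// Hypothetical error type with the three kinds listed above.
#[derive(Debug, PartialEq)]
pub enum IngestError<E> {
    /// Error while uploading a specific directory.
    UploadDirectory { path: String, reason: String },
    /// Error while finalizing the directory upload.
    FinalizeDirectoryUpload(String),
    /// Error from the producer (filesystem walker, archive reader, ...).
    Producer(E),
}

impl<E: fmt::Display> fmt::Display for IngestError<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UploadDirectory { path, reason } => {
                write!(f, "failed to upload directory at {}: {}", path, reason)
            }
            Self::FinalizeDirectoryUpload(reason) => {
                write!(f, "failed to finalize directory upload: {}", reason)
            }
            Self::Producer(err) => write!(f, "producer error: {}", err),
        }
    }
}

/// Simplified ingestion loop: counts entries and stops at the first
/// producer error, wrapping it without knowing its concrete type.
pub fn ingest<I, E>(entries: I) -> Result<usize, IngestError<E>>
where
    I: IntoIterator<Item = Result<String, E>>,
{
    let mut count = 0;
    for entry in entries {
        let _path = entry.map_err(IngestError::Producer)?;
        count += 1;
    }
    Ok(count)
}
```

The generic parameter keeps method-specific errors out of the shared type, which matches the stated goal of moving them to the individual implementations.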
Florian Klink
5e8cfcfcd6 fix(tvix/castore/import): symlink targets are Vec<u8>
These can be arbitrary bytes in theory. Some of our libraries might
be more strict, or inconsistent w.r.t. their representation of path
separators.

Change-Id: I7981b74fc7d3dd79f5589cf2ef52ced7b71dd003
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11551
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
2024-04-30 13:18:03 +00:00
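To illustrate why a byte vector is the safer representation here, the sketch below stores a symlink target that is not valid UTF-8. The `Symlink` struct and `display_target` helper are hypothetical names for this example only.

```rust
/// Hypothetical symlink entry; the target is kept as raw bytes.
pub struct Symlink {
    pub name: String,
    pub target: Vec<u8>,
}

/// Lossy rendering for logs only; the stored bytes stay untouched.
pub fn display_target(target: &[u8]) -> String {
    String::from_utf8_lossy(target).into_owned()
}
```

A target such as the bytes `f`, `o`, `0xff` cannot be converted to a `String` without loss, so a string-typed field would either reject it or alter it.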
Florian Klink
ca64881cb3 docs(tvix/castore): fix tvix_castore::import sub-mod docstrings
The one for `fs` was wrong, and ended up being attached to ingest_path,
and the one for `archive` was missing entirely.

Change-Id: I8a4c32fb5293badb1ea0764c278a88e4ca33c018
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11552
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
2024-04-30 10:06:17 +00:00
Connor Brewster
d2e67f021e refactor(tvix/castore): add separate Error enum for archives
The `Error` enum for the `import` module has both filesystem- and
archive-specific errors and was starting to get messy.

This adds a separate `Error` enum for archive-specific errors and then
keeps a single `Archive` variant in the top-level import `Error` for all
archive errors.

Change-Id: I4cd0746c864e5ec50b1aa68c0630ef9cd05176c7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11498
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
Reviewed-by: flokli <flokli@flokli.de>
2024-04-24 15:41:38 +00:00
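The shape described in the commit above, a dedicated archive error enum wrapped by a single variant of the top-level error, might look roughly like this. The variant names and the `read_entry` / `import_entry` helpers are invented for the sketch.

```rust
/// Hypothetical archive-specific errors.
#[derive(Debug, PartialEq)]
pub enum ArchiveError {
    UnsupportedEntryType(String),
    UnexpectedEof,
}

/// Hypothetical top-level import error with one variant for all archive errors.
#[derive(Debug, PartialEq)]
pub enum ImportError {
    Archive(ArchiveError),
    Filesystem(String),
}

impl From<ArchiveError> for ImportError {
    fn from(err: ArchiveError) -> Self {
        ImportError::Archive(err)
    }
}

/// Archive-level code only knows about `ArchiveError`.
fn read_entry(kind: &str) -> Result<(), ArchiveError> {
    match kind {
        "file" | "dir" | "symlink" => Ok(()),
        other => Err(ArchiveError::UnsupportedEntryType(other.to_string())),
    }
}

/// Top-level code converts through `From`, so `?` just works.
pub fn import_entry(kind: &str) -> Result<(), ImportError> {
    read_entry(kind)?;
    Ok(())
}
```

Adding a new archive failure mode then touches only `ArchiveError`, and the top-level enum stays small.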
Connor Brewster
79698c470c feat(tvix/castore): upload blobs concurrently when ingesting archives
Ingesting tarballs with a lot of small files is very slow because of the
round trip time to the `BlobService`. To mitigate this, small blobs can
be buffered into memory and uploaded concurrently in the background.

Change-Id: I3376d11bb941ae35377a089b96849294c9c139e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11497
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Connor Brewster <cbrewster@hey.com>
2024-04-23 17:02:07 +00:00
Connor Brewster
fa69becf4d refactor(tvix/castore): switch to ingest_entries for tarball ingestion
With `ingest_entries` being more generalized, we can now use it for
ingesting the directory entries generated from tarballs.

Change-Id: Ie1f7a915c456045762e05fcc9af45771f121eb43
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11489
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-23 15:31:22 +00:00
Aspen Smith
3107961428 feat(tvix/eval): Implement builtins.fetchTarball
Implement a first pass at the fetchTarball builtin.

This uses much of the same machinery as fetchUrl, but has the extra
complexity that tarballs have to be extracted and imported as store
paths (into the directory- and blob-services) before hashing. That's
reasonably involved due to the structure of those two services.

This is (unfortunately) not easy to test in an automated way, but I've
tested it manually for now and it seems to work:

    tvix-repl> (import ../. {}).third_party.nixpkgs.hello.outPath
    => "/nix/store/dbghhbq1x39yxgkv3vkgfwbxrmw9nfzi-hello-2.12.1" :: string

Co-authored-by: Connor Brewster <cbrewster@hey.com>
Change-Id: I57afc6b91bad617a608a35bb357861e782a864c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11020
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:58:04 +00:00