This connects to a (remote) tvix-store BlobService over gRPC.
Change-Id: If31f706738a5c3445886c117feca8b61f3203e9e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8552
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Whether chunking is involved or not, is an implementation detail of each
Blobstore. Consumers of a whole blob shouldn't need to worry about that.
It currently is not visible in the gRPC interface either. It
shouldn't bleed into everything.
Let the BlobService trait provide `open_read` and `open_write` methods,
which return handles providing io::Read or io::Write, and leave the
details up to the implementation.
This means, our custom BlobReader module can go away, and all the
chunking bits in there, too.
In the future, we might still want to add more chunking-aware syncing,
but as a syncing strategy some stores can expose, not as a fundamental
protocol component.
This currently needs "SyncReadIntoAsyncRead", taken and vendored in from
https://github.com/tokio-rs/tokio/pull/5669.
It provides a AsyncRead for a sync Read, which is necessary to connect
our (sync) BlobReader interface to a GRPC server implementation.
As an alternative, we could also make the BlobReader itself async, and
let consumers of the trait (EvalIO) deal with the async-ness, but this
is less of a change for now.
In terms of vendoring, I initially tried to move our tokio crate to
these commits, but ended up in version incompatibilities, so let's
vendor it in for now.
Change-Id: I5969ebbc4c0e1ceece47981be3b9e7cfb3f59ad0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8551
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Before there was code scattered about (e.g. text hashing module and
derivation output computation) constructing store paths from low level
building blocks --- there was some duplication and it was easy to make
nonsense store paths.
Now, we have roughly the same "safe-ish" ways of constructing them as
C++ Nix, and only those are exposed:
- Make text hashed content-addressed store paths
- Make other content-addressed store paths
- Make input-addressed fixed output hashes
Change-Id: I122a3ee0802b4f45ae386306b95b698991be89c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8411
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
The logic validating connectivity of Directory nodes should be moved
to SimplePutter, and this use whatever DirectoryPutter the store comes
with.
Change-Id: Id68a86a96cc49ff73920017839788859ea9c5161
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8358
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This should allow import_path to communicate to a gRPC remote store,
that actually verifies the Directory nodes are interconnected.
Change-Id: Ic5d28c33518f50dedec15f1732d81579a3afaff1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8357
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This provides a handle to upload multiple proto::Directory as part of
the same closure.
Change-Id: I9213dde257a260c8622239918ea541064b270484
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8356
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
This doesn't have anything to do with ATerms, we just happen to be using
the aterm representation of a Derivation as contents.
Moving this into store_path/utils.rs makes these things much cleaner -
Have a build_store_path_from_references function, and a
build_store_path_from_fingerprint helper function that makes use of it.
build_store_path_from_references is invoked from the derivation module
which can be used to calculate the derivation path.
In the derivation module, we also invoke
build_store_path_from_fingerprint during the output path calculation.
Change-Id: Ia8d61a5e8e5d3f396f93593676ed3f5d1a3f1d66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8367
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
We do compare for equality. This comment probably was when I tried to
compare the `Result<T, E>`, and as `E` doesn't derive PartialEq it was
annoying.
Change-Id: I18bb19528c76af91c9d24d88d55dd46d0c092d20
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8354
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This moves the recursive BFS traversal of Directory closures from the
GRPCDirectoryServiceWrapper out into a a DirectoryTraverser struct
implementing Iterator.
It is then used from various implementors of DirectoryService in the
`get_recursive()` method.
This allows distinguishing between recursive requests and non-recursive
requests in the gRPC client trait implementation.
Change-Id: I50bfd4a0d9eb11832847329b78c587ec7c9dc7b1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8351
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This provides a GRPCDirectoryService struct implementing
DirectoryService, allowing a client to Directory objects from a (remote)
tvix-store.
Remote in this case is anything outside the current process, be it
another process, or an endpoint on the network.
To keep the sync interface in the `DirectoryService` trait, a handle to
some tokio runtime needs to be passed into the constructor, and the two
methods use `self.tokio_handle.spawn` to start an async function, and
`self.tokio_handle.block_on` to wait for its completion.
The client handle, called `grpc_client` itself is easy to clone, and
treats concurrent requests internally. This means, even though we keep
the `DirectoryService` trait sync, there's nothing preventing it from
being used concurrently, let's say from multiple threads.
There's still two limitations for now:
1) The trait doesn't make use of the `recursive` request, which
currently leads to a N+1 query problem. This can be fixed
by `GRPCDirectoryService` having a reference to another
`DirectoryService` acting as the local side.
I want to wait for general store composition code to pop up before
manually coding this here.
2) It's currently only possible to put() leaf directory nodes, as the
request normally requires uploading a whole closure. We might want
to add another batch function to upload a whole closure, and/or do
this batching in certain cases. This still needs some more thinking.
Change-Id: I7ffec791610b72c0960cf5307cefbb12ec946dc9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8336
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
We query the blob service for detailled blob info, not the chunk
service.
Change-Id: I85a6a57b1dae74a950f734be7d4455c5c35ae355
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8348
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
After ingestion of the contents into the store, this will use the
NonCachingNARCalculationService to create a NAR stream or the contents
of the path, and use our Derivation output path calculation machinery to
determine the output path (using recursive hashing strategy).
In a real-world scenario, we obviously want to cache these calculations,
but this should be sufficient to tinker around with it.
Change-Id: I9b2e69384414f0be1bdcb5a99a4bfd46e8db9932
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8317
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Passing in a &proto::node::Node into all this allows us consumers to
keep ownership of the proto::node::Node.
Change-Id: I44882a86c46826b06a8a8a0b24c18adfc7052662
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8316
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
Apparently, having multiple packages with the same path is a bad thing:
```
The bin target `tvix-store` in package `tvix-store-bin v0.1.0 (/home/flokli/tvl/tvix/store)` has the same output filename as the lib target `tvix_store` in package `tvix-store-bin v0.1.0 (/home/flokli/tvl/tvix/store)`.
Colliding filename is: /home/flokli/tvl/tvix/target/doc/tvix_store/index.html
The output filenames should be unique.
This is a known bug where multiple crates with the same name use
the same path; see <https://github.com/rust-lang/cargo/issues/6313>.
```
Change-Id: Ic785c0349070783baf5e8fd23f5fb60603a3c995
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8308
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
All code initially using this has been replaced by the simpler and more
performant implementation with StreamCDC and read_all_and_chunk.
Change-Id: I08889e9a6984de91c5debcf2b612cb68ae5072d1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8265
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This was the last piece of code using BlobWriter.
We can also use `read_all_and_chunk`, it's just requires a bit more
plumbing:
- The data coming from the client (stream) needs to be mapped (we
extract the .data field).
- The stream needs to be turned into an (async) reader
- The reader needs to be made sync, and that code using the sync reader
needs to be in a `task::spawn_blocking`.
Change-Id: I4e374e1a9f47d5a0933f59a8f5c121185a5f3e95
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8260
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This moves the logic from src/import.rs that
- reads over the contents of a file
- chunks them up and uploads individual chunks
- keeps track of the uploaded chunks in a BlobMeta structure
- returns the hash of the blob and the BlobMeta structure
… into a generic read_all_and_chunk function in
src/chunkservice/util.rs.
It will work on anything implementing io::Read, not just files, which
will help us in a bit.
Change-Id: I53bf628114b73ee2e515bdae29974571ea2b6f6f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8259
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Make use of the helper function here as well.
Change-Id: Ia0afd84eb3903bb897ee6aee884dc291f3e4371c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8258
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
When building tvix-store without default features, this variable doesn't
need to be mutable. Silence the warning.
Change-Id: Iec61be0064c0cef276a29ef22e5c4af3b052efe8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8267
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
This removes the use of Box::new, switching fastcdc to version 3.0.2
with https://github.com/nlfiedler/fastcdc-rs/issues/25 fixed.
Change-Id: I64f388b9e0a7f358e25a8bb7ca0e4df1d3bb01c4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8249
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
We're using this in a bunch of places. Let's move it into a helper
function.
Change-Id: I118fba35f6d343704520ba37280e4ca52a61da44
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8251
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This is useful not only in blobwriter contexts.
Change-Id: I4c584b5264ff7b4bb3b1a9671affc39e18bf4ccf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8245
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This already is a &CS, so we don't need to clone().
Change-Id: I5397a5948ae7fe4781f18df760a79047f83dca01
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8243
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Look at the data that's written to us, and upload all chunks but the
rest in parallel, using rayon. This required moving `upload_chunk`
outside the struct, and accepting a ChunkService to use for upload
(which it was previously getting from `self.chunk_service`).
This doesn't speed up things too much for now, because things are still
mostly linear.
Change-Id: Id785b5705c3392214d2da1a5b6a182bcf5048c8d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8195
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
The Directory service does already reject inserting invalid (wrongly
sorted) Directory messages, but our test case didn't provoke it.
Change-Id: I228e201925e8999186659a2d8da0118db184d9ab
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8167
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This exposes the previous default behavior at the `tvix-store daemon`
subcommand.
It also adds a `tvix-store import` command, which will ingest a given
path into the store.
Change-Id: Ide14f1d409b9364e7f98090690c744326486e470
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8166
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This provides a service using /dev/shm, that's deleted once the
reference is dropped.
Refactor all tests to use these, which allows getting rid of most
TempDir usage in the tests.
The only place where we still use TempDir is in the importer tests,
which work on a filesystem path.
Change-Id: I08a950aa774bf9b46d9f5c92edf5efba36053242
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8193
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI