This trait is eval-specific, there's no point in dealing with these
things in tvix-store.
This implements the EvalIO interface for a Tvix store.
The proper place for this glue code (for now) is tvix-cli, which knows
about both tvix-store and tvix-eval.
There's one annoyance with this move: The `tvix-store import` subcommand
previously also used the TvixStoreIO implementation (because it
conveniently did what we wanted).
Some of this code had to be duplicated, mostly logic to calculate the
NAR-based output path and create the PathInfo object.
Some, but potentially more of this can be extracted into helper
functions in a shared crate, and then be used from both TvixStoreIO in
tvix-cli as well as the tvix-store CLI entrypoint.
Change-Id: Ia7515e83c1b54f95baf810fbd8414c5521382d40
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9212
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
There was a NixHash::new() before, which didn't perform any validation
of the digest length. We had some length validation when parsing nix
hashes or SRI hashes, but some places didn't perform validation and/or
constructed the struct directly.
Replace NixHash::new() with a
`impl TryFrom<(HashAlgo, Vec<u8>)> for NixHash`, which does do this
validation, and update constructing code to use that, rather than
populating structs directly. In some rare cases where we're sure the
digest length is correct we still populate the struct manually.
Fixes b/291.
Change-Id: I7a323c5b18d94de0ec15e391b3e7586df42f4229
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9109
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Previously, Output deserialization would silence validation errors and
provide `None` for `hash_with_mode` as soon as a validation error would
happen inside of the `NixHashWithMode` deserialization, e.g. invalid
hash length would not provide an validation error but a silent `None`
value.
This is problematic, we workaround a serde limitation here by writing
our own Deserializer.
As you can see, we write some boilerplate code unfortunately, as, for
example:
- `#[serde(fail_if_unparsed_as="Option::is_none")]` is not a thing,
otherwise, we could have been able to just bubble up errors in case of
"not fully parsed" (and not missing) values.
- `From<&serde_json::Value> for serde:🇩🇪:Unexpected` is not a thing,
otherwise, we could just map invalid type errors and reuse the
existing types instead of doing extremely bizarre things with
`serde:🇩🇪:Unexpected::Other`, note: this is a not problem for
expected, we know what we expect, we don't know what we received in
practice.
I decided to write a `NixHashWithMode::from_map` which will eat a map
deserialized via `serde_json`, so our serde magic is not totally "data
model" agnostic.
I wanted to go for data model agnosticity and enable maximal
performance, e.g. building the structure as the map values are streamed
in the Visitor, this is needlessly painful because `Output` and
`NixHashWithMode` are in different files and this really makes sense
only if we write the full implementation in one file, indeed, otherwise,
you end up duplicating the work or having strange coupling.
So, for now, we will allocate a full map of the fields inside the
`Output`, i.e. if any "unknown field" is in that map, we will
deserialize it for no reason.
Doing it properly, as I repeat it in the code and to flokli at C3Camp
2023, requires to patch serde upstream IMHO.
Change-Id: I46fe6ccb8c390c48d6934fd3e3f02a0dfe59557b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9107
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This `.into_iter()` call is equivalent to `.iter()` and will not consume
the `BTreeMap`.
Change-Id: Ie26637ebecb0bea5b09c447cc45ed207f8b50913
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9088
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
We only need bstr::ByteSlice to be able to use replace, it doesn't need
to return a BString.
Change-Id: I811948436fb89652e880970c2c05356183f3e439
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9084
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Make it more obvious if these are bytes pointing to JSON or ATerm, so we
don't get confused.
Change-Id: I2402c687b7ba9c05aac20ed63b0df54e4e96a9d8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8998
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This is more concise than a io::copy of a Cursor to bytes, and we have
everything to be written in memory.
Change-Id: I81f34666aa61aef4e16b33423ce4a69c3781efc3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8997
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
write_input_derivations shouldn't need to write a comma to separate it
from the previous output from write_outputs.
This is better placed in the function calling all of these helper
functions.
Change-Id: I9ccc440e4665b52369ef39e75151b9a29469ce48
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8995
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
We're happy with any &[S], as long as <S: AsRef<[u8]>.
This allows passing both strings and &[u8].
Change-Id: If2a80d9b1ee33ba328c9cdab4fa83ca7b98a71e2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8994
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Let the escape function only take care of string escaping, not quoting.
Let write_array_elements always quote and escape strings it consumes.
Move the business of writing additional wrapping characters around it to
the caller.
Change-Id: Ib8dea69c409561b49862c531ba5a3fe6c2f061f8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8993
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Derivations can have non-unicode strings in their env values, so the
ATerm representations are not necessarily String anymore, but Vec<u8>.
Change-Id: Ic23839471eb7f68d9c3c30667c878830946b6607
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8990
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
This allows sorting Store Paths. We delegate the sorting business to the
PartialOrd, Ord impls for our digest fields only, as two StorePaths with
the same digest, but different names can't exist.
Change-Id: I5f81631e5f5063893b316c63a240c5266b7e5bad
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8988
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Add the position in the string where the name is problematic.
Change-Id: If6fd8be6100b718f8d68568eafc77ebb3cfb82d0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8979
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Some paths might use names that are not valid UTF-8. We should be able
to represent them.
We don't actually need to touch the PathInfo structures, as they need to
represent StorePaths, which come with their own harder restrictions,
which can't encode non-UTF8 data.
While this doesn't change any of the wire format of the gRPC messages,
it does however change the interface of tvix_eval::EvalIO - its
read_dir() method does now return a list of Vec<u8>, rather than
SmolStr. Maybe this should be OsString instead?
Change-Id: I821016d9a58ec441ee081b0b9f01c9240723af0b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8974
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
The primary constructor for this is now from_bytes, from_string is
simply calling .as_bytes() on the string, passing it along.
The InvalidName error now contains a Vec<u8>, to encode the invalid name
(which might not be a string anymore).
from_absolute_path now accepts a &[u8] (even though we might want to
make this a OSString of some sort).
StorePath::validate_name has been degraded to a pub(crate) function.
It's still used in src/derivation, even though it probably shouldn't at
all - that cleanup is left for cl/8412 though.
Change-Id: I6b4e62a6fa5c4bec13b535279e73444f0b83ad35
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8973
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This being a nested error makes things more complicated than necessary.
Also, this caused BuildStorePathError to only hold NameError,
so refactor these utility functions to either return Error, or
BuildStorePathError.
Change-Id: I046fb403780cc5135df8b8833a291fc2a90fd913
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8972
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
This allows using a StorePath as a key in a hashmap.
Change-Id: Id3eed623da4e1fc44a970a3982c7caa21d2495c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8666
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This allows decomposing a path consisting of a store path AND a suffix
into a StorePath, and a PathBuf containing the rest.
Change-Id: I81290e2fd804cdc9d1e88c71cb22c0fb882d7936
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8567
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Make it cleaner that StorePath only does encode the first path component
after the STORE_DIR prefix. Also, move some of the comments around a
bit, so it makes more sense what's using what.
Change-Id: Ibb57373a13526e30c58ad561ca50e1336b091d94
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8566
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Make it more explicit that we expect the from_string calls to fail here.
Change-Id: Ib3d46fc0850e364125e3548670ef301eeea2e45c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8565
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
We already returned UnexpectedEof in case the reader stopped returning
bytes too early, but similarly we should also fail if there's still
bytes left to be read in the reader passed.
We normally use the NAR writer to produce new NAR files, so the readers
point to the blobs we actually want to render, and having some data left
in there should be an error.
If for some reason the reader points to more data than just the blob,
the `.take` method can be used to limit it to the (known) size.
Change-Id: I9e8fa0a6dd9c794492abb6dc9e55995e619cb3bb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8553
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Before there was code scattered about (e.g. text hashing module and
derivation output computation) constructing store paths from low level
building blocks --- there was some duplication and it was easy to make
nonsense store paths.
Now, we have roughly the same "safe-ish" ways of constructing them as
C++ Nix, and only those are exposed:
- Make text hashed content-addressed store paths
- Make other content-addressed store paths
- Make input-addressed fixed output hashes
Change-Id: I122a3ee0802b4f45ae386306b95b698991be89c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8411
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
When building store paths we can just construct the thing.
Change-Id: Ife5d461d6a440ecbb22f32a86a6d51d212a2035b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8409
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
They can go under `nixhash`
Change-Id: Ia15835c57130b66d58f5df80ae9595dceee00941
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8408
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
It is moved into `store_path::utils` with the other path builders.
Change-Id: I3257170e442af5d83bcf79e63fa7387dd914597c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8410
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This doesn't have anything to do with ATerms, we just happen to be using
the aterm representation of a Derivation as contents.
Moving this into store_path/utils.rs makes these things much cleaner -
Have a build_store_path_from_references function, and a
build_store_path_from_fingerprint helper function that makes use of it.
build_store_path_from_references is invoked from the derivation module
which can be used to calculate the derivation path.
In the derivation module, we also invoke
build_store_path_from_fingerprint during the output path calculation.
Change-Id: Ia8d61a5e8e5d3f396f93593676ed3f5d1a3f1d66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8367
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This only added a suffix to the input argument, if build_store_path was
building the path of a Derivation.
As we need to also add the `.drv` suffix to the name we pass into
text_hash_string inside calculate_derivation_path, we can simply add the
suffix there and drop the parameter from build_store_path.
Change-Id: Icd5343dd1458f112b9296b389e81ce2ebdd16a9f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8365
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Use it to calculate the text_hash_string, which is then used in the
calculate_derivation_path and path_with_references functions.
Relates to b/263.
Change-Id: I7478825e2a23a11224212fea5e3fd06daa97d5e5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8364
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
… and keep the pub exports as is.
Change-Id: I2ad21660577553395f05b5ba71083626429b0dfc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8363
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
… and keep the pub exports as is.
Change-Id: I9f89a738c508c478ddba61303c21ea294f01ee9f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8362
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
The behaviour of this function is a bit unintuitive, and cl/8310 already
inlined the other consumer of it.
Rewrite the last consumer of the function, so we can drop it.
Change-Id: I59c8486037ce3f777667d1d9e4f4a9316d5a0cb9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8311
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Instead of having two very similar match branches for the FOD and non-
FOD case, detect the FOD case while looping over all outputs.
In the case of anything other than recursive sha256 FODs, the
fingerprint and output path calculation is exactly the same.
Change-Id: Ieb6995653d008766e595cf29d7cd4fb1334e33dd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8310
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
This stops using our own custom Hash structure, which was mostly only
used because we had to parse the JSON representation somehow.
Since cl/8217, there's a `NixHash` struct, which is better suited to
hold this data. Converting the format requires a bit of serde labor
though, but that only really matters when interacting with JSON
representations (which we mostly don't).
Change-Id: Idc5ee511e36e6726c71f66face8300a441b0bf4c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8304
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This only trims the output paths from a Derivation struct, not the
output hashes.
Change-Id: I9250fec4602ed05bb64540c4a89ddb6fb052be1f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8303
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
Call this function derivation_or_fod_hash, and return a NixHash.
This is more in line with how cppnix calls this, and allows using
to_nix_hash_string() in some places.
Change-Id: Iebf5355f08ed5c9a044844739350f829f874f0ce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8293
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Some of these strings are actually just the nix hash representation of
the hash calculated earlier.
There's another one passed around via calculate_drv_replacement_str, but
that's left for a followup.
Change-Id: Id99a2a926a980d679eb49c34ee6a36bf224699b0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8218
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
This provides a way to construct a NixHash struct without parsing
strings.
Change-Id: I947d96e15e51e72d5b02929cda8c5fc31d81253a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8217
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
We ironically didn't add support parsing for the "native" format that
Nix uses under the hood.
This extends the from_str method to peek at the prefix of the string to
determine whether to try decoding as SRI, Nix string, or whether it
should be a bare digest.
Change-Id: I33efd24968b16f86eff18305b4ca8f112c7131d7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8216
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This will be used for both Nix hash strings and hash strings without the
algo specified.
Change-Id: Iedfe5494fba5f2be00614ba0fc38bf659eafd447
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8215
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Nix accepts SRI hashes that are missing their padding characters
in base64, as seen in
7e49471316/pkgs/development/libraries/kerberos/krb5.nix .
It only seems to work in the SRI case, not with `sha256` being set to a
(nopad) base64 string.
Add regression tests for this, and document why we don't want to support
*additional* characters afterwards.
Reported in https://b.tvl.fyi/issues/252
Change-Id: I9ffc2b417501b426ced1894a9cbf95ff5f0e5159
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8037
Reviewed-by: Alyssa Ross <hi@alyssa.is>
Tested-by: BuildkiteCI
Instead of implementing `std::fmt::Display for Derivation` and relying
on the `to_string` method, introduce a `to_aterm_string()` method, which
does the same thing, but makes it clearer what we're producing, rather
than just calling `to_string()``.
Change-Id: I21823de9096a0f2c2eb6f4591e48c1aa9fd94161
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7998
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This module takes care of parsing various hashes and algorithms.
It will get used to modify derivation output hashes in the next CL.
Change-Id: Idc07c401dbb7510f49883ac02b8379b9a5d930c7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7990
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
The module `src/derivation/derivation.rs` is a sign of module inception.
Move the Derivation struct definiton up into `src/derivation/mod.rs`,
and some of the helpers in a `util.rs`.
Change-Id: Ib24a5f8a27bdd45df8b1fa2b3482a79b33cab8d5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7997
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
clippy says:
> This expression creates a reference which is immediately dereferenced
> by the compiler
Change-Id: Ic2c093b043ebee9ae80912075083107e4d216cf1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7995
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
This doesn't require any other corresponding handling *yet*, as the
actual replacements happen in the builder logic (which we delegate to
cppnix at the moment).
Change-Id: I034147c933f05ae427c7a8794647132d108d0ede
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7972
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Put this in its src/derivation.
Change-Id: Ic047ab1c2da555a833ee454e10ef60c77537b617
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7967
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Move nixbase32 and store_path into this.
This allows //tvix/cli to not pull in //tvix/store for now.
Change-Id: Id3a32867205d95794bc0d33b21d4cb3d9bafd02a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7964
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>