We already have the parsed output_hash from above, no need to construct
it again.
Change-Id: Ie6d924ab446137c25c29fbeaf671aa7e5418262d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9110
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Derivations can have non-unicode strings in their env values, so the
ATerm representations are not necessarily String anymore, but Vec<u8>.
Change-Id: Ic23839471eb7f68d9c3c30667c878830946b6607
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8990
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Unfortunately, nixpkgs has at least one case[1] where the out environment
variable is shadowed -- though it doesn't cause a problem, since it's
shadowed with the correct value, odd as this may be!
[1]: c7c2984716/pkgs/development/python-modules/pybind11/default.nix (L19)
Change-Id: Ibf6790d2484dc9cce8e424feeb5886664d498dc3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8696
Autosubmit: tazjin <tazjin@tvl.su>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Before there was code scattered about (e.g. text hashing module and
derivation output computation) constructing store paths from low level
building blocks --- there was some duplication and it was easy to make
nonsense store paths.
Now, we have roughly the same "safe-ish" ways of constructing them as
C++ Nix, and only those are exposed:
- Make text hashed content-addressed store paths
- Make other content-addressed store paths
- Make input-addressed fixed output hashes
Change-Id: I122a3ee0802b4f45ae386306b95b698991be89c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8411
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This doesn't have anything to do with ATerms, we just happen to be using
the aterm representation of a Derivation as contents.
Moving this into store_path/utils.rs makes these things much cleaner -
Have a build_store_path_from_references function, and a
build_store_path_from_fingerprint helper function that makes use of it.
build_store_path_from_references is invoked from the derivation module
which can be used to calculate the derivation path.
In the derivation module, we also invoke
build_store_path_from_fingerprint during the output path calculation.
Change-Id: Ia8d61a5e8e5d3f396f93593676ed3f5d1a3f1d66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8367
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Walking the arguments might encounter an `outputs` output, which might
explicitly (for whatever reason) specify the `out` output.
To prevent dropping FOD settings in this case, we have to populate
that part of the configuration after walking the other attributes.
Change-Id: Iee6a7f0a71e9c9699e79d35e6cb19e1ddb49395d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8312
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This stops using our own custom Hash structure, which was mostly only
used because we had to parse the JSON representation somehow.
Since cl/8217, there's a `NixHash` struct, which is better suited to
hold this data. Converting the format requires a bit of serde labor
though, but that only really matters when interacting with JSON
representations (which we mostly don't).
Change-Id: Idc5ee511e36e6726c71f66face8300a441b0bf4c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8304
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Call this function derivation_or_fod_hash, and return a NixHash.
This is more in line with how cppnix calls this, and allows using
to_nix_hash_string() in some places.
Change-Id: Iebf5355f08ed5c9a044844739350f829f874f0ce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8293
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Warning: This is probably the biggest refactor in tvix-eval history,
so far.
This replaces all instances of trampolines and recursion during
evaluation of the VM loop with generators. A generator is an
asynchronous function that can be suspended to yield a message (in our
case, vm::generators::GeneratorRequest) and receive a
response (vm::generators::GeneratorResponsee).
The `genawaiter` crate provides an interpreter for generators that can
drive their execution and lets us move control flow between the VM and
suspended generators.
To do this, massive changes have occured basically everywhere in the
code. On a high-level:
1. The VM is now organised around a frame stack. A frame is either a
call frame (execution of Tvix bytecode) or a generator frame (a
running or suspended generator).
The VM has an outer loop that pops a frame off the frame stack, and
then enters an inner loop either driving the execution of the
bytecode or the execution of a generator.
Both types of frames have several branches that can result in the
frame re-enqueuing itself, and enqueuing some other work (in the
form of a different frame) on top of itself. The VM will eventually
resume the frame when everything "above" it has been suspended.
In this way, the VM's new frame stack takes over much of the work
that was previously achieved by recursion.
2. All methods previously taking a VM have been refactored into async
functions that instead emit/receive generator messages for
communication with the VM.
Notably, this includes *all* builtins.
This has had some other effects:
- Some test have been removed or commented out, either because they
tested code that was mostly already dead (nix_eq) or because they
now require generator scaffolding which we do not have in place for
tests (yet).
- Because generator functions are technically async (though no async
IO is involved), we lose the ability to use much of the Rust
standard library e.g. in builtins. This has led to many algorithms
being unrolled into iterative versions instead of iterator
combinations, and things like sorting had to be implemented from scratch.
- Many call sites that previously saw a `Result<..., ErrorKind>`
bubble up now only see the result value, as the error handling is
encapsulated within the generator loop.
This reduces number of places inside of builtin implementations
where error context can be attached to calls that can fail.
Currently what we gain in this tradeoff is significantly more
detailed span information (which we still need to bubble up, this
commit does not change the error display).
We'll need to do some analysis later of how useful the errors turn
out to be and potentially introduce some methods for attaching
context to a generator frame again.
This change is very difficult to do in stages, as it is very much an
"all or nothing" change that affects huge parts of the codebase. I've
tried to isolate changes that can be isolated into the parent CLs of
this one, but this change is still quite difficult to wrap one's mind
and I'm available to discuss it and explain things to any reviewer.
Fixes: b/238, b/237, b/251 and potentially others.
Change-Id: I39244163ff5bbecd169fe7b274df19262b515699
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8104
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Tested-by: BuildkiteCI
We need to distinguish explicitly between the paths used for the
scanner, and the paths that populate the derivation inputs. The full
paths must be accessible from the result of the refscanner to populate
drv fields correctly.
This was previously hidden by debug changes that masked actual IO
operations with no-ops.
Change-Id: I037af6e6bbe2b573034d695f8779bee1b56bc125
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8022
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Switch out the string-scanning algorithm used in the reference scanner.
The construction of aho-corasick automata made up the vast majority of
runtime when evaluating nixpkgs previously. While the actual scanning
with a constructed automaton is relatively fast, we almost never scan
for the same set of strings twice and the cost is not worth it.
An algorithm that better matches our needs is the Wu-Manber multiple
string match algorithm, which works efficiently on *long* and *random*
strings of the *same length*, which describes store paths (up to their
hash component).
This switches the refscanner crate to a Rust implementation[0][1] of
this algorithm.
This has several implications:
1. This crate does not provide a way to scan streams. I'm not sure if
this is an inherent problem with the algorithm (probably not, but
it would need buffering). Either way, related functions and
tests (which were actually unused) have been removed.
2. All strings need to be of the same length. For this reason, we
truncate the known paths after their hash part (they are still
unique, of course).
3. Passing an empty set of matches, or a match that is shorter than
the length of a store path, causes the crate to panic. We safeguard
against this by completely skipping the refscanning if there are no
known paths (i.e. when evaluating the first derivation of an eval),
and by bailing out of scanning a string that is shorter than a
store path.
On the upside, this reduces overall runtime to less 1/5 of what it was
before when evaluating `pkgs.stdenv.drvPath`.
[0]: Frankly, it's a random, research-grade MIT-licensed
crate that I found on Github:
https://github.com/jneem/wu-manber
[1]: We probably want to rewrite or at least fork the above crate, and
add things like a three-byte wide scanner. Evaluating large
portions of nixpkgs can easily lead to more than 65k derivations
being scanned for.
Change-Id: I08926778e1e5d5a87fc9ac26e0437aed8bbd9eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This doesn't require any other corresponding handling *yet*, as the
actual replacements happen in the builder logic (which we delegate to
cppnix at the moment).
Change-Id: I034147c933f05ae427c7a8794647132d108d0ede
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7972
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Put this in its src/derivation.
Change-Id: Ic047ab1c2da555a833ee454e10ef60c77537b617
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7967
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
We only do logic here if algo and hash_mode are Some(_)
(and there's an `out` output).
The fact we don't do anything in all in other cases is a bit hidden at
the bottom. Use if let for the destructuring, and drop the other case.
Change-Id: Icc0e38e62947d52d48ef610f754749737977fca9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7966
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This helper function only was created because
populate_output_configuration was hard to test before cl/7939.
With that out of the way, we can pull it in.
Change-Id: I64b36c0eb34343290a8d84a03b0d29392a821fc7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7961
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This is repetitive and error prone (e.g. switching around
to_string/as_str has drastic consequences) due to the ToString
overloads.
Change-Id: I9b16a2e0e05e4c21e83f43e9f603746eb42e53f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7947
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: tazjin <tazjin@tvl.su>
Instead of being called with `md5`, `sha1`, `sha256` or `sha512`,
`fetchurl.nix` (from corepkgs / `<nix`) can also be called with a `hash`
attribute, being an SRI hash.
In that case, `builtin.derivation` is called with `outputHashAlgo` being
an empty string, and `outputHash` being an SRI hash string.
In other cases, an SRI hash is passed as outputHash, but outputHashAlgo
is set too.
Nix does modify these values in (single, fixed) output specification it
serializes to ATerm, but keeps it unharmed in `env`.
Move this into a construct_output_hash helper function, that can be
tested better in isolation.
Change-Id: Id9d716a119664c44ea7747540399966752e20187
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7933
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This adds an implementation of this builtin which correctly calculates
paths, but does not actually write anything to the store or verify
references.
Change-Id: Ie9764cbc1d13a73d8dc9350910304e2b7cad3fe8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7910
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Implements the logic for converting an evaluator value supplied as
arguments to builtins.derivationStrict into an actual,
fully-functional derivation struct.
This skips the implementation of structuredAttrs, which are left for a
subsequent commit.
Note: We will need to port some eval tests over to CLI to test this
correct, which will be done in a separate commit later on.
Change-Id: I0db69dcf12716180de0eb0b126e3da4683712966
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7756
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Adds a helper function which handles special parameters to
`builtins.derivation` that are not just blindly passed through to the
builder environment, but populate other specific fields of the
derivation (outside of the ones handled by other, more complex helpers
from previous commits).
Change-Id: I82d1edf9af714fc4591e9071c0b83ece83be7eee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7901
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This threads through the fields that control whether a derivation is a
fixed-output derivation or not.
Change-Id: I49739de178fed9f258291174ca1a2c15a7cf5c2a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7900
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This adds a helper function which takes the output of the reference
scanner used on derivation inputs and populates the `input_sources`
and `input_derivations` field of the derivation accordingly.
Note that we have a divergence from C++ Nix here, as we do not
populate the entire FS closure of a literally referred derivation (and
our standing theory is that this is unnecessary for nixpkgs).
Change-Id: Id0f605dd8c0a82973c56605c2b8f478fc17777d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7899
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Adds a small helper function which uses a Nix value supplied to
`builtins.derivation{Strict}` to populate the `outputs` field of the
`Derivation` struct.
Change-Id: Iccc7a4f293b3d913140aed576a573a8992241e46
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7898
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI