We need to distinguish explicitly between the paths used for the
scanner, and the paths that populate the derivation inputs. The full
paths must be accessible from the result of the refscanner to populate
drv fields correctly.
This was previously hidden by debug changes that masked actual IO
operations with no-ops.
Change-Id: I037af6e6bbe2b573034d695f8779bee1b56bc125
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8022
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Switch out the string-scanning algorithm used in the reference scanner.
The construction of aho-corasick automata made up the vast majority of
runtime when evaluating nixpkgs previously. While the actual scanning
with a constructed automaton is relatively fast, we almost never scan
for the same set of strings twice and the cost is not worth it.
An algorithm that better matches our needs is the Wu-Manber multiple
string match algorithm, which works efficiently on *long* and *random*
strings of the *same length*, which describes store paths (up to their
hash component).
This switches the refscanner crate to a Rust implementation[0][1] of
this algorithm.
This has several implications:
1. This crate does not provide a way to scan streams. I'm not sure if
this is an inherent problem with the algorithm (probably not, but
it would need buffering). Either way, related functions and
tests (which were actually unused) have been removed.
2. All strings need to be of the same length. For this reason, we
truncate the known paths after their hash part (they are still
unique, of course).
3. Passing an empty set of matches, or a match that is shorter than
the length of a store path, causes the crate to panic. We safeguard
against this by completely skipping the refscanning if there are no
known paths (i.e. when evaluating the first derivation of an eval),
and by bailing out of scanning a string that is shorter than a
store path.
On the upside, this reduces overall runtime to less than 1/5 of what it
was before when evaluating `pkgs.stdenv.drvPath`.
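The truncation and the two safeguards described above can be sketched
roughly as follows. This is a minimal illustration, not the code in the
refscanner crate: the names (`truncate_to_hash`, `scan`, `CANDIDATE_LEN`)
are hypothetical, the store prefix and 32-character hash length are
assumptions, and a naive substring search stands in for Wu-Manber.

```rust
use std::collections::BTreeSet;

// Assumed layout: the store directory prefix followed by a 32-character hash.
const STORE_DIR_PREFIX: &str = "/nix/store/";
const HASH_LEN: usize = 32;
const CANDIDATE_LEN: usize = STORE_DIR_PREFIX.len() + HASH_LEN;

/// Truncate a store path after its hash part, so that all candidates have
/// the same length. Returns None for anything that is not long enough or
/// does not start with the assumed store prefix.
fn truncate_to_hash(path: &str) -> Option<&str> {
    if path.starts_with(STORE_DIR_PREFIX) {
        path.get(..CANDIDATE_LEN)
    } else {
        None
    }
}

/// Hypothetical stand-in for the scanner. It uses a naive substring search
/// (NOT Wu-Manber) and only demonstrates the two early exits.
fn scan<'a>(known: &'a BTreeSet<String>, haystack: &str) -> BTreeSet<&'a str> {
    // Safeguard 1: no known paths yet (e.g. first derivation of an eval).
    // Safeguard 2: the haystack is shorter than any candidate.
    if known.is_empty() || haystack.len() < CANDIDATE_LEN {
        return BTreeSet::new();
    }

    known
        .iter()
        .map(String::as_str)
        .filter(|candidate| haystack.contains(*candidate))
        .collect()
}
```

Callers would insert `truncate_to_hash(path)` for every known path and
map the truncated matches back to full paths afterwards.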
[0]: Frankly, it's a random, research-grade MIT-licensed
crate that I found on Github:
https://github.com/jneem/wu-manber
[1]: We probably want to rewrite or at least fork the above crate, and
add things like a three-byte wide scanner. Evaluating large
portions of nixpkgs can easily lead to more than 65k derivations
being scanned for.
Change-Id: I08926778e1e5d5a87fc9ac26e0437aed8bbd9eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This doesn't require any other corresponding handling *yet*, as the
actual replacements happen in the builder logic (which we delegate to
cppnix at the moment).
Change-Id: I034147c933f05ae427c7a8794647132d108d0ede
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7972
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Put this in its own src/derivation.
Change-Id: Ic047ab1c2da555a833ee454e10ef60c77537b617
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7967
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
We only do logic here if algo and hash_mode are Some(_)
(and there's an `out` output).
The fact that we don't do anything at all in other cases is a bit hidden
at the bottom. Use if let for the destructuring, and drop the other case.
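The shape of this change can be sketched like so. The `Output` struct,
its fields and `mark_fixed_output` are hypothetical names made up for
illustration and do not mirror the real function signature.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified output type for this sketch.
#[derive(Debug, Default, PartialEq)]
struct Output {
    hash_algo: Option<String>,
    hash_mode: Option<String>,
}

/// Only acts if algo and hash_mode are both Some(_) and an `out` output
/// exists. All other combinations fall through without doing anything,
/// so no explicit else branch is needed.
fn mark_fixed_output(
    outputs: &mut BTreeMap<String, Output>,
    algo: Option<&str>,
    hash_mode: Option<&str>,
) {
    if let (Some(algo), Some(hash_mode), Some(out)) =
        (algo, hash_mode, outputs.get_mut("out"))
    {
        out.hash_algo = Some(algo.to_string());
        out.hash_mode = Some(hash_mode.to_string());
    }
}
```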
Change-Id: Icc0e38e62947d52d48ef610f754749737977fca9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7966
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This helper function was only created because
populate_output_configuration was hard to test before cl/7939.
With that out of the way, we can pull it in.
Change-Id: I64b36c0eb34343290a8d84a03b0d29392a821fc7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7961
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This is repetitive and error prone (e.g. switching around
to_string/as_str has drastic consequences) due to the ToString
overloads.
Change-Id: I9b16a2e0e05e4c21e83f43e9f603746eb42e53f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7947
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: tazjin <tazjin@tvl.su>
Instead of being called with `md5`, `sha1`, `sha256` or `sha512`,
`fetchurl.nix` (from corepkgs / `<nix>`) can also be called with a `hash`
attribute containing an SRI hash.
In that case, `builtins.derivation` is called with `outputHashAlgo` being
an empty string, and `outputHash` being an SRI hash string.
In other cases, an SRI hash is passed as outputHash, but outputHashAlgo
is set too.
Nix does modify these values in the (single, fixed) output specification
it serializes to ATerm, but keeps them unmodified in `env`.
Move this into a construct_output_hash helper function, that can be
tested better in isolation.
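For illustration, the case distinction between the two attributes could
look roughly like the sketch below. `split_output_hash` is a hypothetical
name, not the actual construct_output_hash helper. It does not decode or
re-encode the digest, and rejecting a mismatch between the SRI prefix and
outputHashAlgo is an assumption made for this sketch only.

```rust
/// Returns (algorithm, still-encoded digest) for the output specification.
/// The values passed in are left untouched, as they would be in `env`.
fn split_output_hash<'a>(
    output_hash: &'a str,
    output_hash_algo: &'a str,
) -> Result<(&'a str, &'a str), String> {
    const ALGOS: [&str; 4] = ["md5", "sha1", "sha256", "sha512"];

    // SRI strings have the form "<algo>-<base64 digest>".
    if let Some((algo, digest)) = output_hash.split_once('-') {
        if ALGOS.contains(&algo) {
            // Assumption for this sketch: a conflicting outputHashAlgo
            // is treated as an error.
            if !output_hash_algo.is_empty() && output_hash_algo != algo {
                return Err(format!(
                    "hash algorithm mismatch: {} vs {}",
                    algo, output_hash_algo
                ));
            }
            return Ok((algo, digest));
        }
    }

    // Not an SRI string: the algorithm has to be given explicitly.
    if output_hash_algo.is_empty() {
        return Err("outputHashAlgo is empty and outputHash is not SRI".to_string());
    }
    Ok((output_hash_algo, output_hash))
}
```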
Change-Id: Id9d716a119664c44ea7747540399966752e20187
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7933
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This adds an implementation of this builtin which correctly calculates
paths, but does not actually write anything to the store or verify
references.
Change-Id: Ie9764cbc1d13a73d8dc9350910304e2b7cad3fe8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7910
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Implements the logic for converting an evaluator value supplied as
arguments to builtins.derivationStrict into an actual,
fully-functional derivation struct.
This skips the implementation of structuredAttrs, which are left for a
subsequent commit.
Note: We will need to port some eval tests over to CLI to test this
correctly, which will be done in a separate commit later on.
Change-Id: I0db69dcf12716180de0eb0b126e3da4683712966
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7756
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Adds a helper function which handles special parameters to
`builtins.derivation` that are not just blindly passed through to the
builder environment, but populate other specific fields of the
derivation (outside of the ones handled by other, more complex helpers
from previous commits).
Change-Id: I82d1edf9af714fc4591e9071c0b83ece83be7eee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7901
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This threads through the fields that control whether a derivation is a
fixed-output derivation or not.
Change-Id: I49739de178fed9f258291174ca1a2c15a7cf5c2a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7900
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This adds a helper function which takes the output of the reference
scanner used on derivation inputs and populates the `input_sources`
and `input_derivations` fields of the derivation accordingly.
Note that we have a divergence from C++ Nix here, as we do not
populate the entire FS closure of a literally referred derivation (and
our standing theory is that this is unnecessary for nixpkgs).
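A rough sketch of the idea, covering only the two basic cases (a plain
store path and an output of another derivation). The `Found` enum, the
`Inputs` struct and `populate_inputs` are hypothetical names for this
illustration, and the field types are assumptions rather than the real
`Derivation` definition.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical classification of a path reported by the reference scanner.
enum Found {
    /// A plain store path, e.g. an imported source file.
    Plain(String),
    /// An output `name` of the derivation at path `drv`.
    Output { drv: String, name: String },
}

// Assumed shape of the two input fields for this sketch.
#[derive(Default)]
struct Inputs {
    input_sources: BTreeSet<String>,
    input_derivations: BTreeMap<String, BTreeSet<String>>,
}

fn populate_inputs(inputs: &mut Inputs, found: Vec<Found>) {
    for reference in found {
        match reference {
            Found::Plain(path) => {
                inputs.input_sources.insert(path);
            }
            Found::Output { drv, name } => {
                // Collect all referenced output names per derivation path.
                inputs.input_derivations.entry(drv).or_default().insert(name);
            }
        }
    }
}
```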
Change-Id: Id0f605dd8c0a82973c56605c2b8f478fc17777d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7899
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Adds a small helper function which uses a Nix value supplied to
`builtins.derivation{Strict}` to populate the `outputs` field of the
`Derivation` struct.
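As a sketch of what such a helper might do: the `Output` type,
`populate_outputs` and the plain `Vec<String>` argument are hypothetical
simplifications (the real helper works on evaluator values). Defaulting
to a single `out` output and rejecting duplicate names are assumptions
of this sketch.

```rust
use std::collections::BTreeMap;

// Hypothetical output type; the path is filled in later, once it is known.
#[derive(Debug, Default, PartialEq)]
struct Output {
    path: String,
}

/// Build the `outputs` map from an optional list of output names.
fn populate_outputs(
    names: Option<Vec<String>>,
) -> Result<BTreeMap<String, Output>, String> {
    // Assumption: without an explicit list, there is a single `out` output.
    let names = names.unwrap_or_else(|| vec!["out".to_string()]);

    let mut outputs = BTreeMap::new();
    for name in names {
        // insert() returns the previous value if the key already existed.
        if outputs.insert(name.clone(), Output::default()).is_some() {
            return Err(format!("duplicate output: {}", name));
        }
    }
    Ok(outputs)
}
```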
Change-Id: Iccc7a4f293b3d913140aed576a573a8992241e46
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7898
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI