17 commits

Author SHA1 Message Date
Vincent Ambo
38e8c2e959 fix(tvix/cli): keep tracking full paths in known_paths
We need to distinguish explicitly between the paths used for the
scanner, and the paths that populate the derivation inputs. The full
paths must be accessible from the result of the refscanner to populate
drv fields correctly.

This was previously hidden by debug changes that masked actual IO
operations with no-ops.
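
A minimal sketch of the distinction described above, not the actual tvix code: the scanner only sees paths truncated after their hash part, while a lookup table keeps the full path for each one. The `KnownPaths` type, its methods and the `HASH_PREFIX_LEN` constant are hypothetical names chosen for illustration.

```rust
use std::collections::BTreeMap;

/// `/nix/store/` is 11 bytes, followed by a 32-character hash part.
const HASH_PREFIX_LEN: usize = 11 + 32;

/// Hypothetical registry: truncated scanner candidate -> full store path.
#[derive(Default)]
struct KnownPaths {
    paths: BTreeMap<String, String>,
}

impl KnownPaths {
    /// Registers a full store path. Returns `false` if the string is too
    /// short to contain a hash part and was therefore not registered.
    fn insert(&mut self, full_path: &str) -> bool {
        match full_path.get(..HASH_PREFIX_LEN) {
            Some(prefix) => {
                self.paths.insert(prefix.to_string(), full_path.to_string());
                true
            }
            None => false,
        }
    }

    /// The equal-length strings that would be handed to the scanner.
    fn candidates(&self) -> impl Iterator<Item = &str> + '_ {
        self.paths.keys().map(String::as_str)
    }

    /// Maps a scanner hit back to the full path needed for the drv fields.
    fn resolve(&self, candidate: &str) -> Option<&str> {
        self.paths.get(candidate).map(String::as_str)
    }
}
```

The point of the two-step lookup is that a scanner hit alone is not enough to fill in a derivation field; the full path has to be recoverable from it.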

Change-Id: I037af6e6bbe2b573034d695f8779bee1b56bc125
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8022
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-02-02 23:37:34 +00:00
Vincent Ambo
9d6f29a72b refactor(tvix/cli): use Wu-Manber string scanning for drv references
Switch out the string-scanning algorithm used in the reference scanner.

The construction of aho-corasick automata made up the vast majority of
runtime when evaluating nixpkgs previously. While the actual scanning
with a constructed automaton is relatively fast, we almost never scan
for the same set of strings twice and the cost is not worth it.

An algorithm that better matches our needs is the Wu-Manber multiple
string match algorithm, which works efficiently on *long* and *random*
strings of the *same length*, which describes store paths (up to their
hash component).

This switches the refscanner crate to a Rust implementation[0][1] of
this algorithm.

This has several implications:

1. This crate does not provide a way to scan streams. I'm not sure if
   this is an inherent problem with the algorithm (probably not, but
   it would need buffering). Either way, related functions and
   tests (which were actually unused) have been removed.

2. All strings need to be of the same length. For this reason, we
   truncate the known paths after their hash part (they are still
   unique, of course).

3. Passing an empty set of matches, or a match that is shorter than
   the length of a store path, causes the crate to panic. We safeguard
   against this by completely skipping the refscanning if there are no
   known paths (i.e. when evaluating the first derivation of an eval),
   and by bailing out of scanning a string that is shorter than a
   store path.

On the upside, this reduces overall runtime to less than 1/5 of what
it was before when evaluating `pkgs.stdenv.drvPath`.
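
The truncation and the two safeguards from points 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the refscanner code: the function names are hypothetical, and a naive substring search stands in for the Wu-Manber matcher so that the example stays self-contained.

```rust
use std::collections::BTreeSet;

/// `/nix/store/` is 11 bytes, followed by a 32-character hash part.
const STORE_PATH_PREFIX_LEN: usize = 11 + 32;

/// Truncates a store path after its hash part, so that all candidates
/// have the same length. Returns `None` for strings that are too short.
fn truncate_to_hash(path: &str) -> Option<&str> {
    path.get(..STORE_PATH_PREFIX_LEN)
}

/// Hypothetical scanner entry point. The substring search below is a
/// placeholder for the real multi-pattern matcher.
fn scan_references<'a>(known: &'a BTreeSet<String>, haystack: &str) -> Vec<&'a str> {
    // Safeguard 1: with no known paths there is nothing to scan for.
    if known.is_empty() {
        return Vec::new();
    }

    // Safeguard 2: a string shorter than a candidate cannot contain one.
    if haystack.len() < STORE_PATH_PREFIX_LEN {
        return Vec::new();
    }

    known
        .iter()
        .map(String::as_str)
        .filter(|candidate| haystack.contains(*candidate))
        .collect()
}
```

Both early returns happen before any matcher would be constructed, which is what keeps the panicking inputs away from the crate.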

[0]: Frankly, it's a random, research-grade, MIT-licensed
     crate that I found on GitHub:

     https://github.com/jneem/wu-manber

[1]: We probably want to rewrite or at least fork the above crate, and
     add things like a three-byte wide scanner. Evaluating large
     portions of nixpkgs can easily lead to more than 65k derivations
     being scanned for.

Change-Id: I08926778e1e5d5a87fc9ac26e0437aed8bbd9eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-02-02 17:50:44 +00:00
Florian Klink
99d5cf822a refactor(tvix/cli): use nixhash module for output hash calculation
This covers all the weird corner cases.

Change-Id: I85637e82e8929828064ab562dc8a1c8bf161fffa
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7991
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2023-02-01 15:29:48 +00:00
Vincent Ambo
759f9dbf39 feat(tvix/cli): implement builtins.placeholder
This doesn't require any other corresponding handling *yet*, as the
actual replacements happen in the builder logic (which we delegate to
cppnix at the moment).

Change-Id: I034147c933f05ae427c7a8794647132d108d0ede
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7972
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-02-01 10:01:40 +00:00
Florian Klink
2d24c5f260 refactor(tvix/nix-compat): absorb //tvix/derivation
Move it into nix-compat's src/derivation.

Change-Id: Ic047ab1c2da555a833ee454e10ef60c77537b617
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7967
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2023-01-31 15:16:39 +00:00
Florian Klink
8ea93bb646 refactor(tvix/cli/derivation): use if let to destructure
We only do logic here if algo and hash_mode are Some(_)
(and there's an `out` output).

The fact that we don't do anything at all in the other cases is a bit
hidden at the bottom. Use if let for the destructuring, and drop the
other case.

Change-Id: Icc0e38e62947d52d48ef610f754749737977fca9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7966
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-01-31 13:26:11 +00:00
Florian Klink
6fa91349a9 refactor(tvix/cli): remove unneeded clone
Change-Id: I6f4cb24bdd636af8918a2ade44075af92161c97d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7965
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-01-31 13:26:11 +00:00
Florian Klink
4b3ccd205a refactor(tvix/cli): absorb construct_output_hash
This helper function was only created because
populate_output_configuration was hard to test before cl/7939.

With that out of the way, we can pull it in.

Change-Id: I64b36c0eb34343290a8d84a03b0d29392a821fc7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7961
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2023-01-31 08:52:00 +00:00
Vincent Ambo
e4bb750b3b refactor(tvix/cli): force outside of output configuration helper
Change-Id: I28357fe131cefedcef9761b08a72f675f4a10789
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7939
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-01-31 08:51:03 +00:00
Vincent Ambo
124af9e5d5 refactor(tvix/cli): add helper method for strong string coercion
This is repetitive and error prone (e.g. switching around
to_string/as_str has drastic consequences) due to the ToString
overloads.

Change-Id: I9b16a2e0e05e4c21e83f43e9f603746eb42e53f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7947
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: tazjin <tazjin@tvl.su>
2023-01-29 17:39:36 +00:00
Florian Klink
a94a1434cc fix(tvix/cli): handle SRI hashes in outputHash
Instead of being called with `md5`, `sha1`, `sha256` or `sha512`,
`fetchurl.nix` (from corepkgs / `<nix`) can also be called with a `hash`
attribute, being an SRI hash.

In that case, `builtins.derivation` is called with `outputHashAlgo` being
an empty string, and `outputHash` being an SRI hash string.

In other cases, an SRI hash is passed as outputHash, but outputHashAlgo
is set too.

Nix does modify these values in the (single, fixed) output specification
it serializes to ATerm, but keeps them unmodified in `env`.

Move this into a construct_output_hash helper function, that can be
tested better in isolation.
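
A rough sketch of the case distinction described above, not the actual `construct_output_hash` helper: `split_output_hash` is a hypothetical name, treating a mismatch between the SRI prefix and `outputHashAlgo` as an error is an assumption of this sketch, and decoding the base64 digest into the form used in the ATerm output is left out.

```rust
/// Splits `outputHash` / `outputHashAlgo` into (algorithm, digest as given).
fn split_output_hash<'a>(
    output_hash: &'a str,
    output_hash_algo: &'a str,
) -> Result<(&'a str, &'a str), String> {
    const ALGOS: [&str; 4] = ["md5", "sha1", "sha256", "sha512"];

    // An SRI hash looks like `<algo>-<base64 digest>`.
    if let Some((algo, digest)) = output_hash.split_once('-') {
        if ALGOS.contains(&algo) {
            // outputHashAlgo may be empty, or set alongside the SRI hash.
            if !output_hash_algo.is_empty() && output_hash_algo != algo {
                return Err(format!(
                    "hash algorithm mismatch: '{}' vs '{}'",
                    algo, output_hash_algo
                ));
            }
            return Ok((algo, digest));
        }
    }

    // Not an SRI hash: the algorithm must be given explicitly.
    if output_hash_algo.is_empty() {
        return Err("outputHashAlgo is empty and outputHash is not an SRI hash".to_string());
    }

    Ok((output_hash_algo, output_hash))
}
```

Only the value written to the output specification would be normalised this way; the strings passed through to `env` stay as the caller supplied them.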

Change-Id: Id9d716a119664c44ea7747540399966752e20187
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7933
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-01-27 14:06:13 +00:00
Vincent Ambo
f22b9cb0d7 feat(tvix/cli): faux implementation of builtins.toFile
This adds an implementation of this builtin which correctly calculates
paths, but does not actually write anything to the store or verify
references.

Change-Id: Ie9764cbc1d13a73d8dc9350910304e2b7cad3fe8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7910
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-01-27 12:21:41 +00:00
Vincent Ambo
8a9aa018dc feat(tvix/cli): implement builtins.derivationStrict
Implements the logic for converting an evaluator value supplied as
arguments to builtins.derivationStrict into an actual,
fully-functional derivation struct.

This skips the implementation of structuredAttrs, which are left for a
subsequent commit.

Note: We will need to port some eval tests over to CLI to test this
correctly, which will be done in a separate commit later on.

Change-Id: I0db69dcf12716180de0eb0b126e3da4683712966
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7756
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-01-27 12:21:41 +00:00
Vincent Ambo
3d7c371e22 feat(tvix/cli): add helper for handling special drv parameters
Adds a helper function which handles special parameters to
`builtins.derivation` that are not just blindly passed through to the
builder environment, but populate other specific fields of the
derivation (outside of the ones handled by other, more complex helpers
from previous commits).

Change-Id: I82d1edf9af714fc4591e9071c0b83ece83be7eee
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7901
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-01-27 12:21:41 +00:00
Vincent Ambo
dfc50c9ef5 feat(tvix/cli): add helper for populating drv output configuration
This threads through the fields that control whether a derivation is a
fixed-output derivation or not.

Change-Id: I49739de178fed9f258291174ca1a2c15a7cf5c2a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7900
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-01-27 12:21:41 +00:00
Vincent Ambo
fdca93d6ed feat(tvix/cli): add helper for populating derivation inputs
This adds a helper function which takes the output of the reference
scanner used on derivation inputs and populates the `input_sources`
and `input_derivations` field of the derivation accordingly.

Note that we have a divergence from C++ Nix here, as we do not
populate the entire FS closure of a literally referred derivation (and
our standing theory is that this is unnecessary for nixpkgs).
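
As an illustration only, the population step could look roughly like this. The `PathKind` enum, the simplified `Derivation` struct and `populate_inputs` are assumptions made for this sketch; literal references to `.drv` files are left out entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical classification of a known store path.
enum PathKind {
    /// A plain store path, e.g. a source file.
    Plain,
    /// An output of a derivation.
    Output { drv_path: String, output_name: String },
}

/// Simplified stand-in for the derivation struct, inputs only.
#[derive(Default)]
struct Derivation {
    input_sources: BTreeSet<String>,
    input_derivations: BTreeMap<String, BTreeSet<String>>,
}

/// Sorts the paths found by the reference scanner into the two input fields.
fn populate_inputs<'a, I>(drv: &mut Derivation, known: &BTreeMap<String, PathKind>, found: I)
where
    I: IntoIterator<Item = &'a str>,
{
    for path in found {
        match known.get(path) {
            Some(PathKind::Plain) => {
                drv.input_sources.insert(path.to_string());
            }
            Some(PathKind::Output { drv_path, output_name }) => {
                // Group the referenced output names by their derivation.
                drv.input_derivations
                    .entry(drv_path.clone())
                    .or_default()
                    .insert(output_name.clone());
            }
            // Paths the evaluator has never seen are ignored here.
            None => {}
        }
    }
}
```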

Change-Id: Id0f605dd8c0a82973c56605c2b8f478fc17777d6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7899
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-01-27 12:21:41 +00:00
Vincent Ambo
c6811f0cea feat(tvix/cli): add helper for populating derivation outputs
Adds a small helper function which uses a Nix value supplied to
`builtins.derivation{Strict}` to populate the `outputs` field of the
`Derivation` struct.
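
A small sketch of what such a helper might do, under assumptions: the `Output` struct, the `populate_outputs` signature and the error text are hypothetical, output paths are left empty to be filled in later, and the default of a single `out` output follows the usual Nix behaviour.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a derivation output; the path is computed later.
#[derive(Debug, Default, PartialEq)]
struct Output {
    path: String,
}

/// Fills the `outputs` map from the (optional) list of output names.
fn populate_outputs(
    outputs: &mut BTreeMap<String, Output>,
    names: Option<Vec<String>>,
) -> Result<(), String> {
    // Without an explicit `outputs` argument there is a single `out` output.
    let names = names.unwrap_or_else(|| vec!["out".to_string()]);

    for name in names {
        // `insert` returns the previous value if the name was already present.
        if outputs.insert(name.clone(), Output::default()).is_some() {
            return Err(format!("duplicate derivation output '{}'", name));
        }
    }

    Ok(())
}
```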

Change-Id: Iccc7a4f293b3d913140aed576a573a8992241e46
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7898
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-01-27 12:21:41 +00:00