Commit graph

4 commits

Author SHA1 Message Date
Vincent Ambo
26b55f8cda fix(tvix/cli): use tvlfyi/wu-manber fork for refscanner
Our fork fixes a small bug (https://github.com/jneem/wu-manber/pull/1)
but it's not clear whether upstream will accept patches, so for now
lets point this directly at our fork.

Change-Id: Iccdcedae3e9a8b783241431787c952561d032694
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8031
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-02-04 12:44:59 +00:00
Vincent Ambo
9d6f29a72b refactor(tvix/cli): use Wu-Manber string scanning for drv references
Switch out the string-scanning algorithm used in the reference scanner.

The construction of aho-corasick automata made up the vast majority of
runtime when evaluating nixpkgs previously. While the actual scanning
with a constructed automaton is relatively fast, we almost never scan
for the same set of strings twice and the cost is not worth it.

An algorithm that better matches our needs is the Wu-Manber multiple
string match algorithm, which works efficiently on *long* and *random*
strings of the *same length*, which describes store paths (up to their
hash component).

This switches the refscanner crate to a Rust implementation[0][1] of
this algorithm.

This has several implications:

1. This crate does not provide a way to scan streams. I'm not sure if
   this is an inherent problem with the algorithm (probably not, but
   it would need buffering). Either way, related functions and
   tests (which were actually unused) have been removed.

2. All strings need to be of the same length. For this reason, we
   truncate the known paths after their hash part (they are still
   unique, of course).

3. Passing an empty set of matches, or a match that is shorter than
   the length of a store path, causes the crate to panic. We safeguard
   against this by completely skipping the refscanning if there are no
   known paths (i.e. when evaluating the first derivation of an eval),
   and by bailing out of scanning a string that is shorter than a
   store path.

On the upside, this reduces overall runtime to less 1/5 of what it was
before when evaluating `pkgs.stdenv.drvPath`.

[0]: Frankly, it's a random, research-grade MIT-licensed
     crate that I found on Github:

     https://github.com/jneem/wu-manber

[1]: We probably want to rewrite or at least fork the above crate, and
     add things like a three-byte wide scanner. Evaluating large
     portions of nixpkgs can easily lead to more than 65k derivations
     being scanned for.

Change-Id: I08926778e1e5d5a87fc9ac26e0437aed8bbd9eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-02-02 17:50:44 +00:00
Florian Klink
ab8486e5b8 chore(tvix/store): add tonic-mock
Upstream seems to be dead, so we're using https://github.com/tyrchen/
tonic-mock/pull/3 here.

According to https://github.com/tyrchen/tonic-mock/pull/1#issuecomment-
1241164173, we might not need this crate at all, but for now, it gets
the job done and is less code to write in the tests.

Change-Id: Ia77fa19b998a5bbabd0311cc714b85a2ee30f36a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7869
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-01-21 09:34:15 +00:00
Vincent Ambo
9d6ee5b6a6 fix(tvix/eval): use test-generator fork that supports workspaces
This should make no difference in Nix builds, but allows running tests
locally again with `cargo test` for //tvix/eval.

Change-Id: I97d61840143d5c14db61d5862781bf635f9a28e7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7590
Reviewed-by: grfn <grfn@gws.fyi>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2022-12-21 13:23:38 +00:00