Vincent Ambo 9d6f29a72b refactor(tvix/cli): use Wu-Manber string scanning for drv references
Switch out the string-scanning algorithm used in the reference scanner.

The construction of aho-corasick automata made up the vast majority of
runtime when evaluating nixpkgs previously. While the actual scanning
with a constructed automaton is relatively fast, we almost never scan
for the same set of strings twice and the cost is not worth it.

An algorithm that better matches our needs is the Wu-Manber multiple
string match algorithm, which works efficiently on *long* and *random*
strings of the *same length*, which describes store paths (up to their
hash component).
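
As a rough illustration of why equal-length patterns suit this algorithm, here is a minimal sketch of Wu-Manber-style scanning with two-byte blocks. It is not the code of the crate referenced below; the `Scanner` type, its `new` and `find_all` methods, and the `BLOCK` constant are hypothetical names chosen for this example, and the tables are plain hash maps rather than the compact arrays a tuned implementation would use.

```rust
use std::collections::HashMap;

/// Width of the blocks used to index the tables (assumption: 2 bytes).
const BLOCK: usize = 2;

/// Hypothetical Wu-Manber-style scanner for patterns of equal length.
struct Scanner<'a> {
    len: usize,
    patterns: Vec<&'a [u8]>,
    /// How far the window may safely advance after seeing a block.
    shift: HashMap<[u8; BLOCK], usize>,
    /// Patterns grouped by their final block, checked when the shift is 0.
    buckets: HashMap<[u8; BLOCK], Vec<usize>>,
}

impl<'a> Scanner<'a> {
    /// Returns None for an empty set, for patterns shorter than a block,
    /// or for patterns of differing lengths.
    fn new(patterns: Vec<&'a [u8]>) -> Option<Self> {
        let len = patterns.first()?.len();
        if len < BLOCK || patterns.iter().any(|p| p.len() != len) {
            return None;
        }
        let mut shift = HashMap::new();
        let mut buckets: HashMap<[u8; BLOCK], Vec<usize>> = HashMap::new();
        for (idx, pat) in patterns.iter().enumerate() {
            for end in BLOCK..=len {
                let block = [pat[end - 2], pat[end - 1]];
                // Distance from this block to the end of the pattern.
                let dist = len - end;
                let entry = shift.entry(block).or_insert(dist);
                *entry = (*entry).min(dist);
            }
            buckets
                .entry([pat[len - 2], pat[len - 1]])
                .or_default()
                .push(idx);
        }
        Some(Scanner { len, patterns, shift, buckets })
    }

    /// Returns the indices of all patterns that occur in `haystack`.
    fn find_all(&self, haystack: &[u8]) -> Vec<usize> {
        let mut found = Vec::new();
        let mut end = self.len; // exclusive end of the current window
        while end <= haystack.len() {
            let block = [haystack[end - 2], haystack[end - 1]];
            // Blocks that occur in no pattern allow the maximum shift.
            let default = self.len - BLOCK + 1;
            match self.shift.get(&block).copied().unwrap_or(default) {
                0 => {
                    let window = &haystack[end - self.len..end];
                    for &idx in &self.buckets[&block] {
                        if self.patterns[idx] == window && !found.contains(&idx) {
                            found.push(idx);
                        }
                    }
                    end += 1;
                }
                n => end += n,
            }
        }
        found
    }
}
```

Building the tables is a single pass over the patterns, which is much cheaper than constructing an automaton. Long, random patterns mean that most blocks in the scanned text occur in no pattern, so the window usually advances by almost a full pattern length.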

This switches the refscanner crate to a Rust implementation[0][1] of
this algorithm.

This has several implications:

1. This crate does not provide a way to scan streams. I'm not sure if
   this is an inherent problem with the algorithm (probably not, but
   it would need buffering). Either way, related functions and
   tests (which were actually unused) have been removed.

2. All strings need to be of the same length. For this reason, we
   truncate the known paths after their hash part (they are still
   unique, of course).

3. Passing an empty set of matches, or a match that is shorter than
   the length of a store path, causes the crate to panic. We safeguard
   against this by completely skipping the refscanning if there are no
   known paths (i.e. when evaluating the first derivation of an eval),
   and by bailing out of scanning a string that is shorter than a
   store path.
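
The truncation and the two safeguards from points 2 and 3 could look roughly like the sketch below. It is not the refscanner code from this change: `hash_prefix` and `find_references` are hypothetical names, the constants assume the usual `/nix/store/<32-character hash>-<name>` layout, and a plain substring search stands in for the Wu-Manber scanner.

```rust
/// Assumed store directory prefix (the default Nix store location).
const STORE_DIR: &str = "/nix/store/";
/// Assumed length of the hash component of a store path.
const HASH_LEN: usize = 32;
/// Length of a store path up to and including its hash component.
const PREFIX_LEN: usize = STORE_DIR.len() + HASH_LEN;

/// Truncates a store path after its hash part, so that all candidates
/// have the same length. Returns None for anything that is too short
/// or does not start with the store directory.
fn hash_prefix(path: &str) -> Option<&str> {
    if !path.starts_with(STORE_DIR) {
        return None;
    }
    path.get(..PREFIX_LEN)
}

/// Returns the known paths that are referenced in `haystack`.
fn find_references<'a>(known: &[&'a str], haystack: &str) -> Vec<&'a str> {
    // Safeguard 1: with no known paths there is nothing to scan for.
    if known.is_empty() {
        return Vec::new();
    }
    // Safeguard 2: a string shorter than a truncated store path
    // cannot contain a reference.
    if haystack.len() < PREFIX_LEN {
        return Vec::new();
    }
    known
        .iter()
        .copied()
        .filter(|path| match hash_prefix(path) {
            // Stand-in for the real scanner: naive substring search.
            Some(prefix) => haystack.contains(prefix),
            None => false,
        })
        .collect()
}
```

Both early returns happen before any scanner would be constructed, so the panicking cases described in point 3 are never reached.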

On the upside, this reduces overall runtime to less than 1/5 of what it
was before when evaluating `pkgs.stdenv.drvPath`.

[0]: Frankly, it's a random, research-grade MIT-licensed
     crate that I found on GitHub:

     https://github.com/jneem/wu-manber

[1]: We probably want to rewrite or at least fork the above crate, and
     add things like a three-byte wide scanner. Evaluating large
     portions of nixpkgs can easily lead to more than 65k derivations
     being scanned for.

Change-Id: I08926778e1e5d5a87fc9ac26e0437aed8bbd9eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-02-02 17:50:44 +00:00
.gcroots feat(.envrc): gcroot third_party.sources 2022-09-15 11:27:53 +00:00
.nixery feat(ops/modules): Add module for running Nixery 2021-08-12 14:55:59 +00:00
corp chore(corp/tvixbolt): update Cargo.lock 2023-02-01 23:06:02 +00:00
docs docs: change email address mentions to depot@tvl.su 2022-12-27 19:46:11 +00:00
fun chore(3p/sources): Bump channels & overlays 2022-12-24 12:42:41 +00:00
lisp chore(gerrit): migrate OWNERS files to code-owners style 2022-09-19 11:13:28 +00:00
net chore(3p/sources): Bump channels & overlays 2022-09-28 08:02:31 +00:00
nix feat(nix/bufCheck): always run from repo root 2022-12-27 13:27:40 +00:00
ops fix(ops/buildkite): set default_branch explicitly 2023-02-01 17:25:06 +00:00
third_party chore(third_party/nixpkgs): drop permittedInsecurePackages 2023-02-01 11:50:58 +00:00
tools chore(tools/cheddar): bump to syntect 5.0 2023-01-07 08:02:37 +00:00
tvix refactor(tvix/cli): use Wu-Manber string scanning for drv references 2023-02-02 17:50:44 +00:00
users feat(tazjin/tverskoy): re-enable virtualbox 2023-01-24 19:59:55 +00:00
views feat(views/tvix): add buildkite pipeline 2023-02-01 17:11:50 +00:00
web feat(web/inbox): add landing page for inbox.tvl.su 2022-12-28 08:17:45 +00:00
.envrc feat(.envrc): gcroot third_party.sources 2022-09-15 11:27:53 +00:00
.git-blame-ignore-revs fix: add cl/4397 (treewide nixpkgs-fmt) to git-blame-ignore-revs 2022-02-07 18:15:09 +00:00
.gitignore feat(.envrc): gcroot third_party.sources 2022-09-15 11:27:53 +00:00
.hgignore chore(hgignore): ignore .git for hg 2020-06-14 18:23:13 +00:00
.mailmap chore(mailmap): add my name to mailmap 2020-07-18 18:15:05 +00:00
.rgignore chore: Only exclude //third_party/git from ripgrep 2020-05-17 23:58:22 +01:00
buf.gen.yaml feat(nix/bufCheck): ensure .pb.go is up to date 2022-12-27 13:27:40 +00:00
buf.yaml chore(buf): Use nixpkgs-provided buf 2022-10-21 18:39:03 +00:00
default.nix feat(3p/overlays): Build overlaid packages in CI explicitly 2022-12-26 15:43:45 +00:00
LICENSE chore(LICENSE): happy new year! 2022-11-26 00:40:57 +00:00
OWNERS chore(gerrit): migrate OWNERS files to code-owners style 2022-09-19 11:13:28 +00:00
README.md docs(README.md): reflect recent upheaval in depot 2022-05-27 23:24:28 +00:00
RULES feat(whitby): Let sterni bear the wheel 2021-05-23 19:06:15 +00:00
rustfmt.toml feat(depotfmt): Check & format Rust code with rustfmt 2022-02-08 12:06:39 +00:00

depot


This repository is the monorepo for the community around The Virus Lounge, containing our personal tools and infrastructure. Everything in here is built using Nix.

A large portion of the software here is very self-referential, meaning that it exists to sustain the operation of the repository. This is the case because we partially see this as an experiment in tooling for monorepos.

Highlights

Services

  • Source code is available primarily through Sourcegraph on cs.tvl.fyi, where it is searchable and even semantically indexed. A lower-tech view of the repository is also available via cgit-pink on code.tvl.fyi.

    The repository can be cloned using git from https://cl.tvl.fyi/depot.

  • All code in the depot, with the exception of code that is checked in to individual //users folders, needs to be reviewed. We use Gerrit on cl.tvl.fyi for this.

  • Issues are tracked via our own issue tracker on b.tvl.fyi. Its source code lives at //web/panettone/.

  • Smaller todo-list entries which do not warrant a separate issue are listed at todo.tvl.fyi.

  • We use Buildkite for CI. Recent builds are listed on tvl.fyi/builds and pipelines are configured dynamically via //ops/pipelines.

  • A search service, atward, makes TVL services available via textual shortcuts.

All services that we host are deployed on NixOS machines that we manage. Their configuration is tracked in //ops/{modules,machines}.

Nix

  • //nix/readTree contains the Nix code which automatically registers projects in our Nix attribute hierarchy based on their in-tree location
  • //tools/nixery contains the source code of Nixery, a container registry that can build images ad-hoc from Nix packages
  • //nix/yants contains Yet Another Nix Type System, which we use for a variety of things throughout the repository
  • //nix/buildGo implements a Nix library that can build Go software in the style of Bazel's rules_go. Go programs in this repository are built using this library.
  • //nix/buildLisp implements a Nix library that can build Common Lisp software. Currently only SBCL is supported. Lisp programs in this repository are built using this library.
  • //web/blog and //web/atom-feed: A Nix-based static site generator which generates the web page and Atom feed for tazj.in (//users/tazjin/homepage) and tvl.fyi (//web/tvl)
  • //web/bubblegum contains a CGI-based web framework written in Nix.
  • //nix/nint: A shebang-compatible interpreter wrapper for Nix.
  • //tvix contains initial work towards a modular architecture for Nix.

We have a variety of other tools and libraries in the //nix folder which may be of interest.

Packages / Libraries

  • //net/alcoholic_jwt contains an easy-to-use JWT-validation library for Rust
  • //net/crimp contains a high-level HTTP client using cURL for Rust
  • //tools/emacs-pkgs contains various useful Emacs libraries, for example:
    • dottime.el provides dottime in the Emacs modeline
    • nix-util.el provides editing utilities for Nix files
    • term-switcher.el is an ivy-function for switching between vterm buffers
    • tvl.el provides helper functions for interacting with the TVL monorepo
  • //lisp/klatre provides a grab-bag utility library for Common Lisp

User packages

Contributors to the repository have user directories under //users, which can be used for personal or experimental code that does not require review.

Some examples:

  • //users/grfn/xanthous: A (WIP) TUI RPG, written in Haskell.
  • //users/tazjin/emacs: tazjin's Emacs & EXWM configuration
  • //users/tazjin/finito: A persistent finite-state machine library for Rust.

Licensing

Unless otherwise stated in a subdirectory, all code is licensed under the MIT license. See LICENSE for details.

Contributing

If you'd like to contribute to any of the tools in here, please check out the contribution guidelines and our code of conduct.

IRC users can find us in #tvl on hackint, which is also reachable via XMPP at #tvl@irc.hackint.org (sic!).

Hackint also provides a web chat.