docs(tvix/eval): why context strings now

Explain why we had to take the bullet on context strings now
and move the historical approach in a historical section.

Change-Id: Ie3bcc2213b391c6ba06547cc05c850891a41d06b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10446
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This commit is contained in:
Ryan Lahfa 2023-12-27 14:32:54 +01:00 committed by clbot
parent eba5c1757a
commit fc182061c6

View file

@ -97,6 +97,9 @@ C++ Nix has several builtins that interface directly with string contexts:
* `appendContext`: adds a given string context to the string in the same format
as returned by `getContext`
Most of the string manipulation operations will propagate the context to the
result based on their parameters' contexts.
## Placeholders
C++ Nix has `builtins.placeholder`, which given the name of an output (e.g.
@ -111,16 +114,24 @@ passes etc.
## Tvix & string contexts
Tvix does not track string contexts in its evaluator at all. Instead we are
investigating implementing a system which allows us to drop string contexts in
favour of reference scanning derivation attributes.
In the past, Tvix did not track string contexts in its evaluator at all, see
the historical section for more information about that.
Tvix tracks string contexts in every `NixString` structure via a
`HashSet<BuildReference>` and offers an API to combine the references while
keeping the exact internal structure of that data private.
## Historical attempt: Persistent reference tracking
We were investigating implementing a system which allows us to drop string
contexts in favour of reference scanning derivation attributes.
This means that instead of maintaining and passing around a string context data
structure in eval, we maintain a data structure of *known paths* from the same
evaluation elsewhere in Tvix, and scan each derivation attribute against this
set of known paths when instantiating derivations.
Until proven otherwise, we take the stance that the system of string contexts as
We believed we could take the stance that the system of string contexts as
implemented in C++ Nix is likely an implementation detail that should not be
leaking to the language surface as it does now.
@ -173,3 +184,71 @@ discrepancies between Tvix and C++ Nix, which can make initial comparisons of
derivations generated by the two systems difficult. If this occurs we need to
discuss how to approach it, but initially we will implement the mutating
builtins as no-ops.
### Why this did not work for us?
Nix has a feature to perform environmental checks of your derivation, e.g.
"these derivation outputs should not be referenced in this derivation", this was
introduced in Nix 2.2 by
https://github.com/NixOS/nix/commit/3cd15c5b1f5a8e6de87d5b7e8cc2f1326b420c88.
Unfortunately, this feature introduced a very unfortunate and critical bug: all
usage of this feature with contextful strings will actually force the
derivation to depend at least at build time on those specific paths, see
https://github.com/NixOS/nix/issues/4629.
For example, if you wanted to `disallowedReferences` to a package and you used a
derivation as a path, you would actually register that derivation as a input
derivation of that derivation.
This bug is still unfixed in Nix and it seems that fixing it would require
introducing different ways to evaluate Nix derivations to preserve the
output path calculation for Nix expressions so far.
All of this would be fine if the bug behavior was uniform in the sense that no
one tried to force-workaround it. Since Nixpkgs 23.05, due to
https://github.com/NixOS/nixpkgs/pull/211783 this is not true anymore.
If you let nixpkgs be the disjoint union of bootstrapping derivations $A$ and
`stdenv.mkDerivation`-built derivations $B$.
$A$ suffers from the bug and $B$ doesn't by the forced usage of
`unsafeDiscardStringContext` on those special checking fields.
This means that to build hash-compatible $A$ **and** $B$, we need to
distinguish $A$ and $B$. A lot of hacks could be imagined to support this
problem.
Let's assume we have a solution to that problem, it means that we are able to
detect implicitly when a set of specific fields are
`unsafeDiscardStringContext`-ed.
Thus, we could use that same trick to implement `unsafeDiscardStringContext`
entirely for all fields actually.
Now, to implement `unsafeDiscardStringContext` in the persistent reference
tracking model, you will need to store a disallowed list of strings that should
not trigger a reference when we are scanning a derivation parameters.
But assume you have something like:
```nix
derivation {
buildInputs = [
stdenv.cc
];
disallowedReferences = [ stdenv.cc ];
}
```
If you unregister naively the `stdenv.cc` reference, it will silence the fact
that it is part of the `buildInputs`, so you will observe that Nix will fail
the derivation during environmental check, but Tvix would silently force remove
that reference.
Until proven otherwise, it seems highly difficult to have the fine-grained
information to prevent reference tracking of those specific fields. It is not a
failure of the persistent reference tracking, it is an unresolved critical bug
of Nix that only nixpkgs really workarounded for `stdenv.mkDerivation`-based
derivations.