docs(tvix/eval): add document about how binding construction works
Change-Id: I216be1b75eb9f18a58ab2164a77b3c51de8bf784 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11583 Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI Reviewed-by: Adam Joseph <adam@westernsemico.com>
This commit is contained in:
parent
1c7d319164
commit
f2f12d1556
1 changed files with 133 additions and 0 deletions
133
tvix/eval/docs/bindings.md
Normal file
133
tvix/eval/docs/bindings.md
Normal file
|
@ -0,0 +1,133 @@
|
|||
Compilation of bindings
|
||||
=======================
|
||||
|
||||
Compilation of Nix bindings is one of the most mind-bending parts of Nix
|
||||
evaluation. The implementation of just the compilation is currently almost 1000
|
||||
lines of code, excluding the various insane test cases we dreamt up for it.
|
||||
|
||||
## What is a binding?
|
||||
|
||||
In short, any attribute set or `let`-expression. Tvix currently does not treat
|
||||
formals in function parameters (e.g. `{ name ? "fred" }: ...`) the same as these
|
||||
bindings.
|
||||
|
||||
They have two very difficult features:
|
||||
|
||||
1. Keys can mutually refer to each other in `rec` sets or `let`-bindings,
|
||||
including out of definition order.
|
||||
2. Attribute sets can be nested, and parts of one attribute set can be defined
|
||||
in multiple separate bindings.
|
||||
|
||||
Tvix resolves as much of this logic statically (i.e. at compile-time) as
|
||||
possible, but the procedure is quite complicated.
|
||||
|
||||
## High-level concept
|
||||
|
||||
The idea behind the way we compile bindings is to fully resolve nesting
|
||||
statically, and use the usual mechanisms (i.e. recursion/thunking/value
|
||||
capturing) for resolving dynamic values.
|
||||
|
||||
This is done by compiling bindings in several phases:
|
||||
|
||||
1. An initial compilation phase *only* for plain inherit statements (i.e.
|
||||
`inherit name;`), *not* for namespaced inherits (i.e. `inherit (from)
|
||||
name;`).
|
||||
|
||||
2. A declaration-only phase, in which we use the compiler's scope tracking logic
|
||||
to calculate the physical runtime stack indices (further referred to as
|
||||
"stack slots" or just "slots") that all values will end up in.
|
||||
|
||||
In this phase, whenever we encounter a nested attribute set, it is merged
|
||||
into a custom data structure that acts like a synthetic AST node.
|
||||
|
||||
This can be imagined similar to a rewrite like this:
|
||||
|
||||
```nix
|
||||
# initial code:
|
||||
{
|
||||
a.b = 1;
|
||||
a.c = 2;
|
||||
}
|
||||
|
||||
# rewritten form:
|
||||
{
|
||||
a = {
|
||||
b = 1;
|
||||
c = 2;
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
The rewrite applies to attribute sets and `let`-bindings alike.
|
||||
|
||||
At the end of this phase, we know the stack slots of all namespaces for
|
||||
inheriting from, all values inherited from them, and all values (and
|
||||
optionall keys) of bindings at the current level.
|
||||
|
||||
Only statically known keys are actually merged, so any dynamic keys that
|
||||
conflict will lead to a "key already defined" error at runtime.
|
||||
|
||||
3. A compilation phase, in which all values (and, when necessary, keys) are
|
||||
actually compiled. In this phase the custom data structure used for merging
|
||||
is encountered when compiling values.
|
||||
|
||||
As this data structure acts like an AST node, the process begins recursively
|
||||
for each nested attribute set.
|
||||
|
||||
At the end of this process we have bytecode that leaves the required values (and
|
||||
optionally keys) on the stack. In the case of attribute sets, a final operation
|
||||
is emitted that constructs the actual attribute set structure at runtime. For
|
||||
`let`-bindings a final operation is emitted that removes these locals from the
|
||||
stack when the scope ends.
|
||||
|
||||
## Moving parts
|
||||
|
||||
WARNING: This documents the *current* implementation. If you only care about the
|
||||
conceptual aspects, see above.
|
||||
|
||||
There's a few types involved:
|
||||
|
||||
* `PeekableAttrs`: peekable iterator over an attribute path (e.g. `a.b.c`)
|
||||
* `BindingsKind`: enum defining the kind of bindings (attrs/recattrs/let)
|
||||
* `AttributeSet`: struct holding the bindings kind, the AST nodes with inherits
|
||||
(both namespaced and not), and an internal representation of bindings
|
||||
(essentially a vector of tuples of the peekable attrs and the expression to
|
||||
compile for the value).
|
||||
* `Binding`: enum describing the kind of binding (namespaced inherit, attribute
|
||||
set, plain binding of *any other value type*)
|
||||
* `KeySlot`: enum describing the location in which a key slot is placed at
|
||||
runtime (nowhere, statically known value in a slot, dynamic value in a slot)
|
||||
* `TrackedBinding`: struct representing statically known information about a
|
||||
single binding (its key slot, value slot and `Binding`)
|
||||
* `TrackedBindings`: vector of tracked bindings, which implements logic for
|
||||
merging attribute sets together
|
||||
|
||||
And quite a few methods on `Compiler`:
|
||||
|
||||
* `compile_bindings`: entry point for compiling anything that looks like a
|
||||
binding, this calls out to the functions below.
|
||||
* `compile_plain_inherits`: takes all inherits of a bindings node and compiles
|
||||
the ones that are trivial to compile (i.e. just plain inherits without a
|
||||
namespace). The `rnix` parser does not represent namespaced/plain inherits in
|
||||
different nodes, so this function also aggregates the namespaced inherits and
|
||||
returns them for further use
|
||||
* `declare_namespaced_inherits`: passes over all namespaced inherits and
|
||||
declares them on the locals stack, as well as inserts them into the provided
|
||||
`TrackedBindings`
|
||||
* `declare_bindings`: declares all regular key/value bindings in a bindings
|
||||
scope, but without actually compiling their keys or values.
|
||||
|
||||
There's a lot of heavy lifting going on here:
|
||||
|
||||
1. It invokes the various pieces of logic responsible for merging nested
|
||||
attribute sets together, creating intermediate data structures in the value
|
||||
slots of bindings that can be recursively processed the same way.
|
||||
2. It decides on the key slots of expressions based on the kind of bindings,
|
||||
and the type of expression providing the key.
|
||||
* `bind_values`: runs the actual compilation of values. Notably this function is
|
||||
responsible for recursively compiling merged attribute sets when it encounters
|
||||
a `Binding::Set` (on which it invokes `compile_bindings` itself).
|
||||
|
||||
In addition to these several methods (such as `compile_attr_set`,
|
||||
`compile_let_in`, ...) invoke the binding-kind specific logic and then call out
|
||||
to the functions above.
|
Loading…
Reference in a new issue