Compilation of bindings
=======================

Compilation of Nix bindings is one of the most mind-bending parts of Nix
evaluation. The implementation of just the compilation is currently almost 1000
lines of code, excluding the various insane test cases we dreamt up for it.

## What is a binding?

In short, any attribute set or `let`-expression. Tvix currently does not treat
formals in function parameters (e.g. `{ name ? "fred" }: ...`) the same as these
bindings.

Bindings have two features that make them very difficult to compile:

1. Keys can mutually refer to each other in `rec` sets or `let`-bindings,
   including out of definition order.
2. Attribute sets can be nested, and parts of one attribute set can be defined
   in multiple separate bindings.

Tvix resolves as much of this logic statically (i.e. at compile-time) as
possible, but the procedure is quite complicated.

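To illustrate why out-of-order references can be resolved statically, here is a
minimal sketch in Rust. It is not Tvix's code: `ToyBinding`, `declare` and
`resolve` are hypothetical names, and assigning slots in definition order is an
assumption made for the example. The point is only that once every key has been
declared, a reference to a key defined later resolves like any other.

```rust
use std::collections::HashMap;

/// A toy binding: a key plus the names its value expression refers to.
/// (Hypothetical type for illustration; not part of Tvix.)
pub struct ToyBinding<'a> {
    pub key: &'a str,
    pub refs: Vec<&'a str>,
}

/// Pass 1: declare every key, assigning slots in definition order.
/// (The slot numbering is an assumption of this sketch.)
pub fn declare<'a>(bindings: &[ToyBinding<'a>]) -> HashMap<&'a str, usize> {
    bindings
        .iter()
        .enumerate()
        .map(|(slot, binding)| (binding.key, slot))
        .collect()
}

/// Pass 2: resolve each reference to a slot. All keys were declared up
/// front, so references to keys defined later resolve like earlier ones.
pub fn resolve<'a>(
    bindings: &[ToyBinding<'a>],
    slots: &HashMap<&'a str, usize>,
) -> Result<Vec<Vec<usize>>, String> {
    bindings
        .iter()
        .map(|binding| {
            binding
                .refs
                .iter()
                .map(|name| {
                    slots
                        .get(name)
                        .copied()
                        .ok_or_else(|| format!("undefined variable '{}'", name))
                })
                .collect::<Result<Vec<usize>, String>>()
        })
        .collect()
}
```

For the bindings of `rec { a = b; b = 1; }`, `declare` maps `a` to slot 0 and
`b` to slot 1, and `resolve` then finds that the value of `a` refers to slot 1
even though `b` is defined after it.
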
## High-level concept

The idea behind the way we compile bindings is to fully resolve nesting
statically, and use the usual mechanisms (i.e. recursion/thunking/value
capturing) for resolving dynamic values.

This is done by compiling bindings in several phases:

1. An initial compilation phase *only* for plain inherit statements (i.e.
   `inherit name;`), *not* for namespaced inherits (i.e. `inherit (from)
   name;`).

2. A declaration-only phase, in which we use the compiler's scope tracking logic
   to calculate the physical runtime stack indices (further referred to as
   "stack slots" or just "slots") that all values will end up in.

   In this phase, whenever we encounter a nested attribute set, it is merged
   into a custom data structure that acts like a synthetic AST node.

   This can be thought of as a rewrite like this:

   ```nix
   # initial code:
   {
     a.b = 1;
     a.c = 2;
   }

   # rewritten form:
   {
     a = {
       b = 1;
       c = 2;
     };
   }
   ```

   The rewrite applies to attribute sets and `let`-bindings alike.

   At the end of this phase, we know the stack slots of all namespaces for
   inheriting from, all values inherited from them, and all values (and
   optionally keys) of bindings at the current level.

   Only statically known keys are actually merged, so any dynamic keys that
   conflict will lead to a "key already defined" error at runtime.

3. A compilation phase, in which all values (and, when necessary, keys) are
   actually compiled. In this phase the custom data structure used for merging
   is encountered when compiling values.

   As this data structure acts like an AST node, the whole process begins
   again, recursively, for each nested attribute set.

At the end of this process we have bytecode that leaves the required values (and
optionally keys) on the stack. In the case of attribute sets, a final operation
is emitted that constructs the actual attribute set structure at runtime. For
`let`-bindings a final operation is emitted that removes these locals from the
stack when the scope ends.

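The static merging of nested keys described in phase 2 can be sketched as
follows. This is a simplified illustration, not the actual implementation:
`Node`, `insert` and `merge` are hypothetical names, paths are given as
dot-separated strings, integers stand in for arbitrary value expressions, and
only static keys are modelled.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a value in a bindings scope (hypothetical, not a Tvix type).
#[derive(Debug, PartialEq)]
pub enum Node {
    /// A plain value; an integer stands in for an arbitrary expression.
    Value(i64),
    /// A (possibly synthetic) nested attribute set.
    Set(BTreeMap<String, Node>),
}

/// Merges one `path = value;` binding into `set`, creating or reusing an
/// intermediate set for every path segment except the last.
pub fn insert(
    set: &mut BTreeMap<String, Node>,
    path: &[&str],
    value: i64,
) -> Result<(), String> {
    match path {
        [] => Err("empty attribute path".to_string()),
        [key] => {
            // The last segment must not collide with an existing key.
            if set.contains_key(*key) {
                return Err(format!("key '{}' already defined", key));
            }
            set.insert(key.to_string(), Node::Value(value));
            Ok(())
        }
        [key, rest @ ..] => {
            let entry = set
                .entry(key.to_string())
                .or_insert_with(|| Node::Set(BTreeMap::new()));
            match entry {
                // Descend into the (new or existing) nested set.
                Node::Set(inner) => insert(inner, rest, value),
                // A plain value already occupies this key.
                Node::Value(_) => Err(format!("key '{}' already defined", key)),
            }
        }
    }
}

/// Merges a list of `("a.b", value)` bindings into one nested structure.
pub fn merge(bindings: &[(&str, i64)]) -> Result<BTreeMap<String, Node>, String> {
    let mut root = BTreeMap::new();
    for (path, value) in bindings {
        let segments: Vec<&str> = path.split('.').collect();
        insert(&mut root, &segments, *value)?;
    }
    Ok(root)
}
```

`merge(&[("a.b", 1), ("a.c", 2)])` yields a single key `a` holding a nested set
with `b` and `c`, matching the rewritten form in the Nix example above, while
`merge(&[("a", 1), ("a.b", 2)])` is rejected because `a` is already a plain
value.
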
## Moving parts

WARNING: This documents the *current* implementation. If you only care about the
conceptual aspects, see above.

There are a few types involved:

* `PeekableAttrs`: peekable iterator over an attribute path (e.g. `a.b.c`)
* `BindingsKind`: enum defining the kind of bindings (attrs/recattrs/let)
* `AttributeSet`: struct holding the bindings kind, the AST nodes with inherits
  (both namespaced and not), and an internal representation of bindings
  (essentially a vector of tuples of the peekable attrs and the expression to
  compile for the value).
* `Binding`: enum describing the kind of binding (namespaced inherit, attribute
  set, plain binding of *any other value type*)
* `KeySlot`: enum describing the location in which a key slot is placed at
  runtime (nowhere, statically known value in a slot, dynamic value in a slot)
* `TrackedBinding`: struct representing statically known information about a
  single binding (its key slot, value slot and `Binding`)
* `TrackedBindings`: vector of tracked bindings, which implements logic for
  merging attribute sets together

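To show why a peekable iterator is convenient for attribute paths, here is a
small sketch using `std::iter::Peekable` over string segments. `Step` and
`next_step` are hypothetical names, and the real `PeekableAttrs` works on the
parser's representation of a path rather than on strings; the sketch only
demonstrates the look-ahead that separates a nested key from a final one.

```rust
use std::iter::Peekable;

/// What the head of an attribute path tells us about a binding.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
pub enum Step<'a> {
    /// Last segment: the value belongs directly under this key.
    Leaf(&'a str),
    /// More segments follow: this key names a nested attribute set.
    Nested(&'a str),
}

/// Consumes one path segment and peeks ahead to classify it.
pub fn next_step<'a, I>(path: &mut Peekable<I>) -> Option<Step<'a>>
where
    I: Iterator<Item = &'a str>,
{
    let head = path.next()?;
    if path.peek().is_some() {
        Some(Step::Nested(head))
    } else {
        Some(Step::Leaf(head))
    }
}
```

Called repeatedly on `"a.b.c".split('.').peekable()`, this classifies `a` and
`b` as nested keys and `c` as the leaf, without consuming a segment just to
find out whether it was the last one.
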
And quite a few methods on `Compiler`:

* `compile_bindings`: entry point for compiling anything that looks like a
  binding; it calls out to the functions below.
* `compile_plain_inherits`: takes all inherits of a bindings node and compiles
  the ones that are trivial to compile (i.e. just plain inherits without a
  namespace). The `rnix` parser does not represent namespaced/plain inherits in
  different nodes, so this function also aggregates the namespaced inherits and
  returns them for further use.
* `declare_namespaced_inherits`: passes over all namespaced inherits and
  declares them on the locals stack, as well as inserts them into the provided
  `TrackedBindings`.
* `declare_bindings`: declares all regular key/value bindings in a bindings
  scope, but without actually compiling their keys or values.

  There's a lot of heavy lifting going on here:

  1. It invokes the various pieces of logic responsible for merging nested
     attribute sets together, creating intermediate data structures in the value
     slots of bindings that can be recursively processed the same way.
  2. It decides on the key slots of expressions based on the kind of bindings,
     and the type of expression providing the key.
* `bind_values`: runs the actual compilation of values. Notably this function is
  responsible for recursively compiling merged attribute sets when it encounters
  a `Binding::Set` (on which it invokes `compile_bindings` itself).

In addition to these, several methods (such as `compile_attr_set`,
`compile_let_in`, ...) invoke the binding-kind specific logic and then call out
to the functions above.
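
The key-slot decision made in `declare_bindings` could look roughly like the
sketch below. This is one plausible reading of the description above and not
the real code: the enum shapes, the `Key` type and `decide_key_slot` are
hypothetical, and two rules are assumptions of the sketch, namely that
`let`-bindings never need their keys on the stack and that dynamic keys are
rejected in `let`.

```rust
/// The three kinds of bindings (attrs/recattrs/let).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BindingsKind {
    Attrs,
    RecAttrs,
    LetIn,
}

/// Toy key expression: statically known, or only known at runtime.
/// (Hypothetical; strings stand in for AST nodes.)
#[derive(Clone, Debug, PartialEq)]
pub enum Key {
    Static(String),
    Dynamic(String),
}

/// Where a key ends up at runtime.
#[derive(Debug, PartialEq)]
pub enum KeySlot {
    /// The key is not placed on the stack at all.
    Nowhere,
    /// A statically known key placed in a stack slot.
    Static { slot: usize, name: String },
    /// A key that must be computed at runtime, placed in a stack slot.
    Dynamic { slot: usize, expr: String },
}

/// Picks a key slot from the bindings kind and the kind of key expression.
pub fn decide_key_slot(
    kind: BindingsKind,
    key: Key,
    next_slot: &mut usize,
) -> Result<KeySlot, String> {
    // Hands out the next free stack slot.
    let mut take_slot = || {
        let slot = *next_slot;
        *next_slot += 1;
        slot
    };

    match (kind, key) {
        // Assumption: `let` only needs values on the stack, not keys.
        (BindingsKind::LetIn, Key::Static(_)) => Ok(KeySlot::Nowhere),
        // Assumption: dynamic keys are an error in `let`.
        (BindingsKind::LetIn, Key::Dynamic(_)) => {
            Err("dynamic keys are not allowed in let".to_string())
        }
        // Attribute sets need their keys at runtime to build the set.
        (_, Key::Static(name)) => Ok(KeySlot::Static { slot: take_slot(), name }),
        (_, Key::Dynamic(expr)) => Ok(KeySlot::Dynamic { slot: take_slot(), expr }),
    }
}
```

In this sketch only attribute sets consume stack slots for keys, which fits the
earlier statement that the final bytecode leaves values "and optionally keys"
on the stack.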