Compilation of bindings
Compilation of Nix bindings is one of the most mind-bending parts of Nix evaluation. The implementation of just the compilation is currently almost 1000 lines of code, excluding the various insane test cases we dreamt up for it.
What is a binding?
In short, any attribute set or let-expression. Tvix currently does not treat formals in function parameters (e.g. { name ? "fred" }: ...) the same as these bindings.
They have two very difficult features:
- Keys can mutually refer to each other in rec sets or let-bindings, including out of definition order.
- Attribute sets can be nested, and parts of one attribute set can be defined in multiple separate bindings.
Tvix resolves as much of this logic statically (i.e. at compile-time) as possible, but the procedure is quite complicated.
High-level concept
The idea behind the way we compile bindings is to fully resolve nesting statically, and use the usual mechanisms (i.e. recursion/thunking/value capturing) for resolving dynamic values.
This is done by compiling bindings in several phases:
- An initial compilation phase only for plain inherit statements (i.e. inherit name;), not for namespaced inherits (i.e. inherit (from) name;).
- A declaration-only phase, in which we use the compiler's scope tracking logic to calculate the physical runtime stack indices (further referred to as "stack slots" or just "slots") that all values will end up in.
In this phase, whenever we encounter a nested attribute set, it is merged into a custom data structure that acts like a synthetic AST node.
This can be imagined similar to a rewrite like this:
    # initial code:
    {
      a.b = 1;
      a.c = 2;
    }

    # rewritten form:
    {
      a = {
        b = 1;
        c = 2;
      };
    }
The rewrite applies to attribute sets and let-bindings alike.

At the end of this phase, we know the stack slots of all namespaces for inheriting from, all values inherited from them, and all values (and optionally keys) of bindings at the current level.
Only statically known keys are actually merged, so any dynamic keys that conflict will lead to a "key already defined" error at runtime.
- A compilation phase, in which all values (and, when necessary, keys) are actually compiled. In this phase the custom data structure used for merging is encountered when compiling values.
As this data structure acts like an AST node, the process begins recursively for each nested attribute set.
At the end of this process we have bytecode that leaves the required values (and optionally keys) on the stack. In the case of attribute sets, a final operation is emitted that constructs the actual attribute set structure at runtime. For let-bindings a final operation is emitted that removes these locals from the stack when the scope ends.
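The static merging step from the declaration phase can be sketched in a few lines of Rust. This is only an illustration of the idea, not Tvix's actual code: the Node type and the insert and merge functions are hypothetical names invented here, strings stand in for the expressions to compile, and only statically known keys are handled.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the merged, AST-like structure.
#[derive(Debug, PartialEq)]
enum Node {
    /// A plain value; the string stands in for the expression to compile.
    Value(String),
    /// A nested attribute set assembled from one or more bindings.
    Set(BTreeMap<String, Node>),
}

/// Inserts `value` at the static attribute path `path`, merging into any
/// nested set that already exists along the way.
fn insert(set: &mut BTreeMap<String, Node>, path: &[&str], value: &str) -> Result<(), String> {
    let (head, rest) = path.split_first().ok_or("empty attribute path")?;

    if rest.is_empty() {
        // Last path segment: this is where the value itself goes.
        if set.contains_key(*head) {
            return Err(format!("key '{head}' already defined"));
        }
        set.insert(head.to_string(), Node::Value(value.to_string()));
        return Ok(());
    }

    // Intermediate segment: reuse the nested set if present, else create it.
    let entry = set
        .entry(head.to_string())
        .or_insert_with(|| Node::Set(BTreeMap::new()));

    match entry {
        Node::Set(inner) => insert(inner, rest, value),
        Node::Value(_) => Err(format!("key '{head}' already defined")),
    }
}

/// Merges a flat list of (attribute path, value) bindings into nested sets.
fn merge(bindings: &[(Vec<&str>, &str)]) -> Result<BTreeMap<String, Node>, String> {
    let mut root = BTreeMap::new();
    for (path, value) in bindings {
        insert(&mut root, path, value)?;
    }
    Ok(root)
}
```

Calling merge with the paths a.b and a.c yields a single entry a whose value is a nested set containing b and c, mirroring the rewrite shown above. Dynamic keys are deliberately left out of this sketch, since they are not merged statically.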
Moving parts
WARNING: This documents the current implementation. If you only care about the conceptual aspects, see above.
There are a few types involved:
- PeekableAttrs: peekable iterator over an attribute path (e.g. a.b.c)
- BindingsKind: enum defining the kind of bindings (attrs/recattrs/let)
- AttributeSet: struct holding the bindings kind, the AST nodes with inherits (both namespaced and not), and an internal representation of bindings (essentially a vector of tuples of the peekable attrs and the expression to compile for the value).
- Binding: enum describing the kind of binding (namespaced inherit, attribute set, plain binding of any other value type)
- KeySlot: enum describing the location in which a key slot is placed at runtime (nowhere, statically known value in a slot, dynamic value in a slot)
- TrackedBinding: struct representing statically known information about a single binding (its key slot, value slot and Binding)
- TrackedBindings: vector of tracked bindings, which implements logic for merging attribute sets together
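To make the relationships between these types easier to picture, here is a rough sketch of how they could fit together. The type names and Binding::Set come from this document; every other variant name, all field names and the key_slot_for helper are assumptions made for illustration and will differ from the real definitions. Strings stand in for AST expressions, and PeekableAttrs and AttributeSet are omitted.

```rust
/// Kind of bindings being compiled (attrs/recattrs/let).
#[derive(Debug, Clone, Copy, PartialEq)]
enum BindingsKind {
    Attrs,
    RecAttrs,
    LetIn,
}

/// Location of a binding's key at runtime.
#[derive(Debug, PartialEq)]
enum KeySlot {
    /// The key is not placed on the stack at all.
    None,
    /// A statically known key that occupies a stack slot.
    Static { slot: usize, name: String },
    /// A key computed at runtime that occupies a stack slot.
    Dynamic { slot: usize },
}

/// Kind of a single binding.
#[derive(Debug, PartialEq)]
enum Binding {
    /// A namespaced inherit, i.e. `inherit (from) name;`.
    InheritFrom { namespace_slot: usize, name: String },
    /// A nested attribute set produced by merging.
    Set(Vec<TrackedBinding>),
    /// A plain binding of any other value type.
    Plain { expr: String },
}

/// Statically known information about one binding.
#[derive(Debug, PartialEq)]
struct TrackedBinding {
    key_slot: KeySlot,
    value_slot: usize,
    binding: Binding,
}

/// Assumption for illustration: let-bindings become plain locals and need
/// no key at runtime, while attribute sets need every key on the stack.
fn key_slot_for(kind: BindingsKind, static_name: Option<&str>, slot: usize) -> KeySlot {
    match (kind, static_name) {
        (BindingsKind::LetIn, _) => KeySlot::None,
        (_, Some(name)) => KeySlot::Static { slot, name: name.to_string() },
        (_, None) => KeySlot::Dynamic { slot },
    }
}
```

The Vec inside Binding::Set is what lets a merged attribute set be processed recursively in the same way as the top-level bindings.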
And quite a few methods on Compiler:
- compile_bindings: entry point for compiling anything that looks like a binding, this calls out to the functions below.
- compile_plain_inherits: takes all inherits of a bindings node and compiles the ones that are trivial to compile (i.e. just plain inherits without a namespace). The rnix parser does not represent namespaced/plain inherits in different nodes, so this function also aggregates the namespaced inherits and returns them for further use.
- declare_namespaced_inherits: passes over all namespaced inherits and declares them on the locals stack, as well as inserts them into the provided TrackedBindings.
- declare_bindings: declares all regular key/value bindings in a bindings scope, but without actually compiling their keys or values.

  There's a lot of heavy lifting going on here:

  - It invokes the various pieces of logic responsible for merging nested attribute sets together, creating intermediate data structures in the value slots of bindings that can be recursively processed the same way.
  - It decides on the key slots of expressions based on the kind of bindings, and the type of expression providing the key.
- bind_values: runs the actual compilation of values. Notably this function is responsible for recursively compiling merged attribute sets when it encounters a Binding::Set (on which it invokes compile_bindings itself).
In addition to these, several methods (such as compile_attr_set, compile_let_in, ...) invoke the binding-kind specific logic and then call out to the functions above.
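The split between declaring bindings and binding their values is what makes out-of-order references work. A toy model of that ordering, assuming a drastically simplified expression type and two made-up opcodes (Expr, Op, declare and compile_values are all hypothetical names, not part of Tvix), might look like this:

```rust
use std::collections::HashMap;

/// Toy expression: an integer literal or a reference to another binding.
#[derive(Debug)]
enum Expr<'a> {
    Int(i64),
    Ref(&'a str),
}

/// Hypothetical opcodes, invented for this sketch.
#[derive(Debug, PartialEq)]
enum Op {
    Constant(i64),
    GetLocal(usize),
}

/// Declaration pass: assign a stack slot to every name without looking at
/// any value expression.
fn declare<'a>(bindings: &[(&'a str, Expr<'a>)]) -> HashMap<&'a str, usize> {
    bindings
        .iter()
        .enumerate()
        .map(|(slot, (name, _))| (*name, slot))
        .collect()
}

/// Compilation pass: emit code for each value. All slots are already known,
/// so a reference to a later binding resolves just like an earlier one.
fn compile_values<'a>(
    bindings: &[(&'a str, Expr<'a>)],
    slots: &HashMap<&'a str, usize>,
) -> Result<Vec<Op>, String> {
    bindings
        .iter()
        .map(|(_, expr)| match expr {
            Expr::Int(n) => Ok(Op::Constant(*n)),
            Expr::Ref(name) => slots
                .get(name)
                .map(|slot| Op::GetLocal(*slot))
                .ok_or_else(|| format!("undefined variable '{name}'")),
        })
        .collect()
}
```

For bindings equivalent to rec { a = b; b = 1; }, declare assigns slot 0 to a and slot 1 to b, and compile_values then emits GetLocal(1) for a even though b is defined later. The sketch ignores thunking, which is needed so that such a reference is only evaluated once the slot has actually been filled.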