tvl-depot/tvix/docs/value-pointer-equality.md
sterni 2aaabb709f docs(tvix): update C++ Nix code links for pointer equality
Change-Id: Icfd79b36c09607b4183e7378cd3c17f6238297b2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8853
Autosubmit: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-07-01 21:45:53 +00:00

8.8 KiB

Value Pointer Equality in Nix

Introduction

It is a piece of semi-obscure Nix trivia that while functions are generally not comparable, they can be compared in certain situations. This is actually quite an important fact, as it is essential for the evaluation of nixpkgs: The attribute sets used to represent platforms in nixpkgs, like stdenv.buildPlatform, contain functions, such as stdenv.buildPlatform.canExecute. When writing cross logic, one invariably ends up writing expressions that compare these sets, e.g. stdenv.buildPlatform != stdenv.hostPlatform. Since attribute set equality is the equality of their attribute names and values, we also end up comparing the functions within them. We can summarize the relevant part of this behavior for platform comparisons in the following (true) Nix expressions:

  • stdenv.hostPlatform.canExecute != stdenv.hostPlatform.canExecute
  • stdenv.hostPlatform == stdenv.hostPlatform

This fact is commonly referred to as pointer equality of functions (or function pointer equality) which is not an entirely accurate name, as we'll see. This account of the behavior states that, while functions are incomparable in general, they are comparable insofar, as they occupy the same spot in an attribute set.

However, a maybe lesser known trick is to write a function such as the following to allow comparing functions:

let
  pointerEqual = lhs: rhs: { x = lhs; } == { x = rhs; };

  f = name: "Hello, my name is ${name}";
  g = name: "Hello, my name is ${name}";
in
[
  (pointerEqual f f) # => true
  (pointerEqual f g) # => false
]

Here, clearly, the function is not contained at the same position in one and the same attribute set, but at the same position in two entirely different attribute sets. We can also see that we are not comparing the functions themselves (e.g. their AST), but rather if they are the same individual value (i.e. pointer equal).

To figure out the actual semantics, we'll first have a look at how value (pointer) equality works in C++ Nix, the only production ready Nix implementation currently available.

Nix (Pointer) Equality in C++ Nix

TIP: The summary presented here is up-to-date as of 2023-06-20 and was tested with Nix 2.3, 2.11 and 2.15.

EvalState::eqValues and ExprOpEq::eval

The function implementing equality in C++ Nix is EvalState::eqValues which starts with the following bit of code:

bool EvalState::eqValues(Value & v1, Value & v2)
{
    forceValue(v1);
    forceValue(v2);

    /* !!! Hack to support some old broken code that relies on pointer
       equality tests between sets.  (Specifically, builderDefs calls
       uniqList on a list of sets.)  Will remove this eventually. */
    if (&v1 == &v2) return true;

So this immediately looks more like pointer equality of arbitrary values instead of functions. In fact there is no special code facilitating function equality:

        /* Functions are incomparable. */
        case nFunction:
            return false;

So one takeaway of this is that pointer equality is neither dependent on functions nor attribute sets. In fact, we can also write our pointerEqual function as:

lhs: rhs: [ lhs ] == [ rhs ]

It's interesting that EvalState::eqValues forces the left and right-hand value before trying pointer equality. It explains that let x = throw ""; in x == x does not evaluate successfully, but it is puzzling why let f = x: x; in f == f does not return true. In fact, why do we need to wrap the values in a list or attribute set at all for our pointerEqual function to work?

The answer lies in the code that evaluates ExprOpEq, i.e. an expression involving the == operator:

void ExprOpEq::eval(EvalState & state, Env & env, Value & v)
{
    Value v1; e1->eval(state, env, v1);
    Value v2; e2->eval(state, env, v2);
    v.mkBool(state.eqValues(v1, v2));
}

As you can see, two distinct Value structs are created, so they can never be pointer equal even if the union inside points to the same bit of memory. We can thus understand what actually happens when we check the equality of an attribute set (or list), by looking at the following expression:

let
  x = { name = throw "nameless"; };
in

x == x # => causes an evaluation error

Because x can't be pointer equal, as it'll end up in the distinct structs v1 and v2, it needs to be compared by value. For this reason, the name attribute will be forced and an evaluation error caused. If we rewrite the expression to use…

{ inherit x; } == { inherit x; } # => true

…, it'll work: The two attribute sets are compared by value, but their x attribute turns out to be pointer equal after forcing it. This does not throw, since forcing an attribute set does not force its attributes' values (as forcing a list doesn't force its elements).

As we have seen, pointer equality can not only be used to compare function values, but also other otherwise incomparable values, such as lists and attribute sets that would cause an evaluation error if they were forced recursively. We can even switch out the throw for an abort. The limitation is of course that we need to use a value that behaves differently depending on whether it is forced “normally” (think builtins.seq) or recursively (think builtins.deepSeq), so thunks will generally be evaluated before pointer equality can kick into effect.

Other Comparisons

The != operator uses EvalState::eqValues internally as well, so it behaves exactly as !(a == b).

The >, <, >= and <= operators all desugar to CompareValues eventually which generally looks at the value type before comparing. It does, however, rely on EvalState::eqValues for list comparisons (introduced in Nix 2.5), so it is possible to compare lists with e.g. functions in them, as long as they are equal by pointer:

let
  f = x: x + 42;
in

[
  ([ f 2 ] > [ f 1 ]) # => true
  ([ f 2 ] > [ (x: x) 1]) # => error: cannot compare a function with a function
  ([ f ] > [ f ]) # => false
]

Finally, since builtins.elem relies on EvalState::eqValues, you can check for a function by pointer equality:

let
  f = x: f x;
in
builtins.elem f [ f 2 3 ] # => true

Summary

When comparing two Nix values, we must force both of them (non-recursively!), but are allowed to short-circuit the comparison based on pointer equality, i.e. if they are at the same exact value in memory, they are deemed equal immediately. This is completely independent of what type of value they are. If they are not pointer equal, they are (recursively) compared by value as expected.

However, when evaluating the Nix expression a == b, we must invoke our implementation's value equality function in a way that a and b themselves can never be deemed pointer equal. Any values we encounter while recursing during the equality check must be compared by pointer as described above, though.

Stability of the Feature

Keen readers will have noticed the following comment in the C++ Nix source code, indicating that pointer comparison may be removed in the future.

    /* !!! Hack to support some old broken code that relies on pointer
       equality tests between sets.  (Specifically, builderDefs calls
       uniqList on a list of sets.)  Will remove this eventually. */

Now, I can't speak for the upstream C++ Nix developers, but sure can speculate. As already pointed out, this feature is currently needed for evaluating nixpkgs. While its use could realistically be eliminated (only bothersome spot is probably the emulator function, but that should also be doable), removing the feature would seriously compromise C++ Nix's ability to evaluate historical nixpkgs revision which is arguably a strength of the system.

Another indication that it is likely here to stay is that it has already outlived builderDefs, even though it was (apparently) reintroduced just for this use case. More research into the history of this feature would still be prudent, especially the reason for its original introduction (maybe performance?).