tvl-depot/tvix/eval/src/opcode.rs
Adam Joseph d978b556e6 feat(tvix/eval): deduplicate overlap between Closure and Thunk
This commit deduplicates the Thunk-like functionality from Closure
and unifies it with Thunk.

Specifically, we now have one and only one way of breaking reference
cycles in the Value-graph: Thunk.  No other variant contains a
RefCell.  This should make it easier to reason about the behavior of
the VM.  InnerClosure and UpvaluesCarrier are no longer necessary.

This refactoring allowed an improvement in code generation:
`Rc<RefCell<>>`s are now created only for closures which have
self-references or deferred upvalues, instead of for all closures.
OpClosure has been split into two separate opcodes:

- OpClosure creates non-recursive closures with no deferred
  upvalues.  The VM will not create an `Rc<RefCell<>>` when executing
  this instruction.

- OpThunkClosure is used for closures with self-references or
  deferred upvalues.  The VM will create a Thunk when executing this
  opcode, but the Thunk will start out already in the
  `ThunkRepr::Evaluated` state, rather than in the
  `ThunkRepr::Suspended` state.

To avoid confusion, OpThunk has been renamed OpThunkSuspended.

Thanks to @sterni for suggesting that all this could be done without
adding an additional variant to ThunkRepr.  This does however mean
that there will be mutating accesses to `ThunkRepr::Evaluated`,
which was not previously the case.  The field `is_finalised: bool`
has been added to `Closure` to ensure that these mutating accesses
are performed only on finalised Closures.  Both the check and the
field are present only if `#[cfg(debug_assertions)]`.

Change-Id: I04131501029772f30e28da8281d864427685097f
Signed-off-by: Adam Joseph <adam@westernsemico.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7019
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-19 10:38:54 +00:00
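
The split described in the commit message can be illustrated with a minimal sketch. Everything here is a simplified, hypothetical stand-in rather than the actual tvix types: `Value`, `Closure`, `ThunkRepr` and `Thunk` only borrow the names, the payloads are reduced to what the example needs, and the methods `new_closure`, `resolve_deferred` and `upvalue_is_self` are invented for illustration. The point is that a closure created the OpThunkClosure way lives behind an `Rc<RefCell<_>>` in the `Evaluated` state, so a deferred upvalue can later be filled in, including with a reference back to the closure itself.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical, heavily simplified value type.
#[derive(Clone)]
pub enum Value {
    Int(i64),
    /// Models the OpClosure path: shared, but no `RefCell` involved.
    Closure(Rc<Closure>),
    /// Models the OpThunkClosure / OpThunkSuspended path.
    Thunk(Thunk),
}

/// Hypothetical closure; a `None` slot is a deferred upvalue that has not
/// been resolved yet.
pub struct Closure {
    pub upvalues: Vec<Option<Value>>,
}

/// Two states only, as in the commit message (payloads simplified).
pub enum ThunkRepr {
    Suspended,
    Evaluated(Closure),
}

/// The only type in this sketch that contains a `RefCell`.
#[derive(Clone)]
pub struct Thunk(Rc<RefCell<ThunkRepr>>);

impl Thunk {
    /// Models OpThunkClosure: the thunk starts out already `Evaluated`,
    /// with all upvalue slots still unresolved.
    pub fn new_closure(upvalue_count: usize) -> Self {
        let closure = Closure { upvalues: vec![None; upvalue_count] };
        Thunk(Rc::new(RefCell::new(ThunkRepr::Evaluated(closure))))
    }

    /// Fills one deferred slot by mutating the `Evaluated` state in place.
    pub fn resolve_deferred(&self, idx: usize, value: Value) {
        match &mut *self.0.borrow_mut() {
            ThunkRepr::Evaluated(closure) => closure.upvalues[idx] = Some(value),
            ThunkRepr::Suspended => panic!("this sketch only resolves closures"),
        }
    }

    /// Returns true if the upvalue at `idx` points back at this very thunk.
    pub fn upvalue_is_self(&self, idx: usize) -> bool {
        match &*self.0.borrow() {
            ThunkRepr::Evaluated(closure) => match &closure.upvalues[idx] {
                Some(Value::Thunk(other)) => Rc::ptr_eq(&self.0, &other.0),
                _ => false,
            },
            ThunkRepr::Suspended => false,
        }
    }
}
```

Calling `resolve_deferred(0, Value::Thunk(t.clone()))` on a thunk `t` creates exactly the kind of reference cycle that the commit message says is now routed through Thunk alone. The sketch leaves out the `is_finalised` check.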


//! This module implements the instruction set running on the abstract
//! machine implemented by tvix.
use std::ops::{AddAssign, Sub};
/// Index of a constant in the current code chunk.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ConstantIdx(pub usize);
/// Index of an instruction in the current code chunk.
#[repr(transparent)]
#[derive(Clone, Copy, Debug)]
pub struct CodeIdx(pub usize);
impl AddAssign<usize> for CodeIdx {
    fn add_assign(&mut self, rhs: usize) {
        *self = CodeIdx(self.0 + rhs)
    }
}

impl Sub<usize> for CodeIdx {
    type Output = Self;

    fn sub(self, rhs: usize) -> Self::Output {
        CodeIdx(self.0 - rhs)
    }
}
/// Index of a value in the runtime stack.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd)]
pub struct StackIdx(pub usize);
/// Index of an upvalue within a closure's upvalue list.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct UpvalueIdx(pub usize);
/// Offset by which an instruction pointer should change in a jump.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct JumpOffset(pub usize);
/// Provided count for an instruction (could represent e.g. a number
/// of elements).
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Count(pub usize);
/// All variants of this enum carry a bounded amount of data to
/// ensure that no heap allocations are needed for an Opcode.
#[warn(variant_size_differences)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OpCode {
    /// Push a constant onto the stack.
    OpConstant(ConstantIdx),

    /// Discard a value from the stack.
    OpPop,

    // Push a literal value.
    OpNull,
    OpTrue,
    OpFalse,

    // Unary operators
    OpInvert,
    OpNegate,

    // Arithmetic binary operators
    OpAdd,
    OpSub,
    OpMul,
    OpDiv,

    // Comparison operators
    OpEqual,
    OpLess,
    OpLessOrEq,
    OpMore,
    OpMoreOrEq,

    // Logical operators & generic jumps
    OpJump(JumpOffset),
    OpJumpIfTrue(JumpOffset),
    OpJumpIfFalse(JumpOffset),
    OpJumpIfNotFound(JumpOffset),

    // Attribute sets
    /// Construct an attribute set from the given number of key-value pairs on the top of the stack
    ///
    /// Note that this takes the count of *pairs*, not the number of *stack values* - the actual
    /// number of values popped off the stack will be twice the argument to this op
    OpAttrs(Count),
    OpAttrsUpdate,
    OpAttrsSelect,
    OpAttrsTrySelect,
    OpHasAttr,

    /// Throw an error if the attribute set at the top of the stack has any attributes
    /// other than those listed in the formals of the current lambda
    ///
    /// Panics if the current frame is not a lambda with formals
    OpValidateClosedFormals,

    // `with`-handling
    OpPushWith(StackIdx),
    OpPopWith,
    OpResolveWith,

    // Lists
    OpList(Count),
    OpConcat,

    // Strings
    OpInterpolate(Count),
    /// Force the Value on the stack and coerce it to a string, always using
    /// `CoercionKind::Weak`.
    OpCoerceToString,

    // Paths
    /// Attempt to resolve the Value on the stack using the configured [`NixSearchPath`][]
    ///
    /// [`NixSearchPath`]: crate::nix_search_path::NixSearchPath
    OpFindFile,

    /// Attempt to resolve a path literal relative to the home dir
    OpResolveHomePath,

    // Type assertion operators
    OpAssertBool,

    /// Access local identifiers with statically known positions.
    OpGetLocal(StackIdx),

    /// Close scopes while leaving their expression value around.
    OpCloseScope(Count), // number of locals to pop

    /// Return an error indicating that an `assert` failed
    OpAssertFail,

    // Lambdas & closures
    OpCall,
    OpTailCall,
    OpGetUpvalue(UpvalueIdx),
    // A Closure which has upvalues but no self-references
    OpClosure(ConstantIdx),
    // A Closure which has self-references (direct or via upvalues)
    OpThunkClosure(ConstantIdx),
    // A suspended thunk, used to ensure laziness
    OpThunkSuspended(ConstantIdx),
    OpForce,

    /// Finalise initialisation of the upvalues of the value in the
    /// given stack index after the scope is fully bound.
    OpFinalise(StackIdx),

    // [`OpClosure`], [`OpThunkSuspended`], and [`OpThunkClosure`] have a
    // variable number of arguments to the instruction, which is
    // represented here by making their data part of the opcodes.
    // Each of these three opcodes has a `ConstantIdx`, which must
    // reference a `Value::Blueprint(Lambda)`. The `upvalue_count`
    // field in that `Lambda` indicates the number of arguments it
    // takes, and the opcode must be followed by exactly this number
    // of `Data*` opcodes. The VM skips over these by advancing the
    // instruction pointer.
    //
    // It is illegal for a `Data*` opcode to appear anywhere else.
    DataLocalIdx(StackIdx),
    DataDeferredLocal(StackIdx),
    DataUpvalueIdx(UpvalueIdx),
    DataCaptureWith,
}
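
The operand convention described in the comment above the `Data*` variants can be sketched roughly as follows. This is not the tvix VM code: `Op` is a hypothetical four-variant stand-in for `OpCode`, `read_operands` is an invented helper, and `upvalue_count` is passed in directly, whereas the comment says the real count comes from the `Lambda` referenced by the opcode's `ConstantIdx`.

```rust
/// Hypothetical miniature instruction set mirroring the relevant opcodes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Op {
    /// Stands in for `OpClosure(ConstantIdx)`.
    Closure(usize),
    /// Stands in for `DataLocalIdx(StackIdx)`.
    DataLocalIdx(usize),
    /// Stands in for `DataUpvalueIdx(UpvalueIdx)`.
    DataUpvalueIdx(usize),
    /// Any ordinary instruction.
    Pop,
}

/// Reads the `upvalue_count` operands that follow the opcode at `*ip`.
///
/// On success, `*ip` is advanced past the operands so that it points at the
/// next real instruction. On failure, `*ip` is left unchanged.
pub fn read_operands(
    code: &[Op],
    ip: &mut usize,
    upvalue_count: usize,
) -> Result<Vec<Op>, String> {
    let start = *ip + 1;
    let end = start + upvalue_count;

    // The opcode must be followed by exactly `upvalue_count` entries.
    let operands = code
        .get(start..end)
        .ok_or_else(|| "chunk ends inside the operand list".to_string())?;

    // Every one of those entries must be a `Data*` opcode.
    for op in operands {
        if !matches!(op, Op::DataLocalIdx(_) | Op::DataUpvalueIdx(_)) {
            return Err(format!("expected a Data* operand, found {:?}", op));
        }
    }

    // Skip over the operands by advancing the instruction pointer.
    *ip = end;
    Ok(operands.to_vec())
}
```

With the chunk `[Closure(0), DataLocalIdx(2), DataUpvalueIdx(1), Pop]` and an upvalue count of 2, the helper returns the two `Data*` entries and leaves the instruction pointer on `Pop`. Asking for 3 operands is rejected, because `Pop` is not a `Data*` opcode.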