tvl-depot/tvix/eval/src/builtins/impure.rs

180 lines
5.6 KiB
Rust
Raw Normal View History

use builtin_macros::builtins;
use smol_str::SmolStr;
use std::{
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
collections::BTreeMap,
env,
rc::Weak,
time::{SystemTime, UNIX_EPOCH},
};
use crate::{
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
compiler::GlobalsMap,
errors::ErrorKind,
io::FileType,
observer::NoOpObserver,
spans::LightSpan,
value::{Builtin, BuiltinArgument, NixAttrs, Thunk},
vm::VM,
SourceCode, Value,
};
#[builtins]
mod impure_builtins {
use super::*;
use crate::builtins::coerce_value_to_path;
#[builtin("getEnv")]
fn builtin_get_env(_: &mut VM, var: Value) -> Result<Value, ErrorKind> {
Ok(env::var(var.to_str()?).unwrap_or_else(|_| "".into()).into())
}
#[builtin("pathExists")]
fn builtin_path_exists(vm: &mut VM, s: Value) -> Result<Value, ErrorKind> {
let path = coerce_value_to_path(&s, vm)?;
vm.io().path_exists(path).map(Value::Bool)
}
#[builtin("readDir")]
fn builtin_read_dir(vm: &mut VM, path: Value) -> Result<Value, ErrorKind> {
let path = coerce_value_to_path(&path, vm)?;
let res = vm.io().read_dir(path)?.into_iter().map(|(name, ftype)| {
(
name,
Value::String(
SmolStr::new(match ftype {
FileType::Directory => "directory",
FileType::Regular => "regular",
FileType::Symlink => "symlink",
FileType::Unknown => "unknown",
})
.into(),
),
)
});
Ok(Value::attrs(NixAttrs::from_iter(res)))
}
#[builtin("readFile")]
fn builtin_read_file(vm: &mut VM, path: Value) -> Result<Value, ErrorKind> {
let path = coerce_value_to_path(&path, vm)?;
vm.io()
.read_to_string(path)
.map(|s| Value::String(s.into()))
}
}
/// Return all impure builtins, that is all builtins which may perform I/O
/// outside of the VM and so cannot be used in all contexts (e.g. WASM).
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
pub(super) fn builtins() -> BTreeMap<&'static str, Value> {
let mut map: BTreeMap<&'static str, Value> = impure_builtins::builtins()
.into_iter()
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
.map(|b| (b.name(), Value::Builtin(b)))
.collect();
// currentTime pins the time at which evaluation was started
{
let seconds = match SystemTime::now().duration_since(UNIX_EPOCH) {
Ok(dur) => dur.as_secs() as i64,
// This case is hit if the system time is *before* epoch.
Err(err) => -(err.duration().as_secs() as i64),
};
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
map.insert("currentTime", Value::Integer(seconds));
}
map
}
/// Constructs and inserts the `import` builtin. This builtin is special in that
/// it needs to capture the [crate::SourceCode] structure to correctly track
/// source code locations while invoking a compiler.
// TODO: need to be able to pass through a CompilationObserver, too.
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
pub fn builtins_import(globals: &Weak<GlobalsMap>, source: SourceCode) -> Builtin {
// This (very cheap, once-per-compiler-startup) clone exists
// solely in order to keep the borrow checker happy. It
// resolves the tension between the requirements of
// Rc::new_cyclic() and Builtin::new()
let globals = globals.clone();
Builtin::new(
"import",
&[BuiltinArgument {
strict: true,
name: "path",
}],
None,
move |mut args: Vec<Value>, vm: &mut VM| {
let mut path = super::coerce_value_to_path(&args.pop().unwrap(), vm)?;
if path.is_dir() {
path.push("default.nix");
}
let current_span = vm.current_span();
if let Some(cached) = vm.import_cache.get(&path) {
return Ok(cached.clone());
}
let contents = vm.io().read_to_string(path.clone())?;
let parsed = rnix::ast::Root::parse(&contents);
let errors = parsed.errors();
let file = source.add_file(path.to_string_lossy().to_string(), contents);
if !errors.is_empty() {
return Err(ErrorKind::ImportParseError {
path,
file,
errors: errors.to_vec(),
});
}
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
let result = crate::compiler::compile(
&parsed.tree().expr().unwrap(),
Some(path.clone()),
file,
feat(tvix/eval): builtins.import without RefCell CL/6867 added support for builtins.import, which required a cyclic reference import->globals->builtins->import. This was implemented using a RefCell, which makes it possible to mutate the builtins during evaluation. The commit message for CL/6867 expressed a desire to eliminate this possibility: This opens up a potentially dangerous footgun in which we could mutate the builtins at runtime leading to different compiler invocations seeing different builtins, so it'd be nice to have some kind of "finalised" status for them or some such, but I'm not sure how to represent that atm. This CL replaces the RefCell with Rc::new_cyclic(), making the globals/builtins immutable once again. At VM runtime (once opcodes start executing) everything is the same as before this CL, except that the Rc<RefCell<>> introduced by CL/6867 is turned into an rc::Weak<>. The function passed to Rc::new_cyclic works very similarly to overlays in nixpkgs: a function takes its own result as an argument. However instead of laziness "breaking the cycle", Rust's Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent memory leaks rather than divergence. This CL also resolves the following TODO from CL/6867: // TODO: encapsulate this import weirdness in builtins The main disadvantage of this CL is the fact that the VM now must ensure that it holds a strong reference to the globals while a program is executing; failure to do so will cause a panic when the weak reference in the builtins is upgrade()d. In theory it should be possible to create strong reference cycles the same way Rc::new_cyclic() creates weak cycles, but these cycles would cause a permanent memory leak -- without either an rc::Weak or RefCell there is no way to break the cycle. At some point we will have to implement some form of cycle collection; whatever library we choose for that purpose is likely to provide an "immutable strong reference cycle" primitive similar to Rc::new_cyclic(), and we should be able to simply drop it in. Signed-off-by: Adam Joseph <adam@westernsemico.com> Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2022-10-26 14:16:04 +02:00
// The VM must ensure that a strong reference to the
// globals outlives any self-references (which are
// weak) embedded within the globals. If the
// expect() below panics, it means that did not
// happen.
globals
.upgrade()
.expect("globals dropped while still in use"),
&mut NoOpObserver::default(),
)
.map_err(|err| ErrorKind::ImportCompilerError {
path: path.clone(),
errors: vec![err],
})?;
if !result.errors.is_empty() {
return Err(ErrorKind::ImportCompilerError {
path,
errors: result.errors,
});
}
// Compilation succeeded, we can construct a thunk from whatever it spat
// out and return that.
let res = Value::Thunk(Thunk::new_suspended(
result.lambda,
LightSpan::new_actual(current_span),
));
vm.import_cache.insert(path, res.clone());
for warning in result.warnings {
vm.push_warning(warning);
}
Ok(res)
},
)
}