This implements a simple DFS locator in listing structures.
It is interoperable with the Rust standard library paths, we also build our
own errors to restrict path values to reasonable secure defaults, e.g.
relative paths with no `..` component.
Tests are added for this new feature for a positive and a negative
check.
In addition, a path validation test was added. The Windows-style prefix
is gated on the Windows platform as UNIX does not parse `C:\\` as a
`Component::Prefix(_)` but as a `Component::Normal(_)`.
Change-Id: Iae2a80bebd8138e41af94aa7d09f2842c3c5a786
Signed-off-by: Ryan Lahfa <tvl@lahfa.xyz>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12255
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Requiring `name` to be a `&str` means it'll get annoying to pass around
`Signature`, but being able to pass them around in an owned fashion is
kinda a requirement for a stronger typed `PathInfo` struct, where we
want to have full ownership.
Rework the `Signature` struct to become generic over the type of the
`name` field. This means, it becomes possible to have owned versions
of it.
We don't want to impose `String` or `SmolStr` for example, but want to
leave it up to the nix-compat user to decide.
Provide a type alias for the existing `&str` variant (`SignatureRef`),
and use it where we previously used the non-generic `Signature` one.
Add some tests to ensure it's possible to *use* `Signature` with both
`String` and `SmolStr` (but only pull in `smol_str` as dev dependency
for the tests).
Also, add some more docstrings, these were a bit sparse.
Change-Id: I3f75691498c6bda9cd072d2d9dac83c4f6c57287
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12253
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
.ls files are useful to seek in a NAR without parsing it entirely.
The responsibility of validating the files is on the caller.
Change-Id: I5d1da28b5479c38f20ca5babe60e362a2217c9ea
Signed-off-by: Ryan Lahfa <tvl@lahfa.xyz>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12196
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
We can avoid depending on things outside //tvix by just using a similar
util from nixpkgs.
Change-Id: I9ea3e1f0a8a059ea10caaec173569ba9f316aec6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12247
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Using pkgs.nixos directly allows us to create a smaller nixos closure
for the tests and also not depend on things in depot.ops which can be
beneficial for extending the tvix josh workspace.
Change-Id: Ic6ad2122733418114b43aa692d6e42ac1e308eeb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12251
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
It's a `[u8; SIGNATURE_LENGTH]` type alias, and conveys what we're
accepting or returning a bit nicer.
Change-Id: I974cd97d56d383e51417eb0f26e1431a05711922
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12252
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Makes it possible to construct a GermanString from an owned byte vector, without
having to clone the data.
This is done by "disowning" the vector using ManuallyDrop to access its internal
pointer. For transient strings, this memory is then owned (and freed) by the
GermanString instance.
Small strings are copied out of the heap and stored inline as before, to avoid
any dereferencing operations.
Change-Id: I754736099f71d646d430aed73e558a5a7626c394
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12249
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This can short-circuit two large string comparisons.
Change-Id: If45e7cf33921fe571482dc710c27ef8cc7c70885
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12245
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This is where one of the advantages of this string representation starts to
shine: For small strings there's no derefencing any heap memory at all, and for
the long representation we can compare the prefix before moving on.
Change-Id: Iac333a52e8d7c9dd09e33dbcf51754e321c880e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12238
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
//tvix depending less on other parts of depot is prefered as it will
help with extending the josh workspace of tvix.
Change-Id: Ifcac3af1782dfd82e7543cb4c3ae57fbd186edff
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12250
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
It's easier to implement readTree/depot polyfills for gitignoreSource
when it's imported from third_party.sources, rather than in a file at
//third_party.gitignoreSource.
Change-Id: I1323f932bd0feeb2c50ccc76397a80e035842992
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12248
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
All of these strings are currently transient (the storage class is not yet
represented anywhere), and the ones that are heap allocated need to be
deallocated when the transient string dies.
Change-Id: Iba0ca926df5db7594f304c5d5318db9d69c6f42c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12235
Autosubmit: tazjin <tazjin@tvl.su>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This adds an initial implementation of the so-called "German Strings" in Rust.
https://cedardb.com/blog/german_strings/
This implementation is *far from* complete, the only thing that can be done
right now is construct a string, and even that I'm not fully happy with.
Change-Id: I2697932a0ef373be76ffd14d59677493a5783b58
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12234
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Removes imbl::OrdMap in favour of an Rc over the standard library's BTreeMap,
which allows us to drop the imbl dependency completely.
In my local tests this is actually slightly faster for `hello` and `firefox`.
Change-Id: Ic9597ead4e98bf9530f290c6a94a3c5c3efd0acc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12201
Reviewed-by: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
This vector type has served us well for now, but it contains internal refcounts
which are incompatible with upcoming changes related to garbage collection.
The performance impact of this change within all benchmarks I ran was within the
margin of error:
[nix-shell:/tmp/perf]$ hyperfine "./before -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings"
Benchmark 1: ./u64 -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings
Time (mean ± σ): 7.528 s ± 0.272 s [User: 6.578 s, System: 0.631 s]
Range (min … max): 7.160 s … 8.012 s 10 runs
nix-shell:/tmp/perf]$ hyperfine "./std-vec -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings"
Benchmark 1: ./std-vec -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings
Time (mean ± σ): 7.515 s ± 0.178 s [User: 6.508 s, System: 0.652 s]
Range (min … max): 7.276 s … 7.861 s 10 runs
Change-Id: Ib95f871956e336a1e5771f6293583854b1efb276
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12197
Reviewed-by: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
This replaces the OpCode enum with a new Op enum which is guaranteed to fit in a
single byte. Instead of carrying enum variants with data, every variant that has
runtime data encodes it into the `Vec<u8>` that a `Chunk` now carries.
This has several advantages:
* Less stack space is required at runtime, and fewer allocations are required
while compiling.
* The OpCode doesn't need to carry "weird" special-cased data variants anymore.
* It is faster (albeit, not by much). On my laptop, results consistently look
approximately like this:
Benchmark 1: ./before -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings
Time (mean ± σ): 8.224 s ± 0.272 s [User: 7.149 s, System: 0.688 s]
Range (min … max): 7.759 s … 8.583 s 10 runs
Benchmark 2: ./after -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings
Time (mean ± σ): 8.000 s ± 0.198 s [User: 7.036 s, System: 0.633 s]
Range (min … max): 7.718 s … 8.334 s 10 runs
See notes below for why the performance impact might be less than expected.
* It is faster while at the same time dropping some optimisations we previously
performed.
This has several disadvantages:
* The code is closer to how one would write it in C or Go.
* Bit shifting!
* There is (for now) slightly more code than before.
On performance I have the following thoughts at the moment:
In order to prepare for adding GC, there's a couple of places in Tvix where I'd
like to fence off certain kinds of complexity (such as mutating bytecode, which,
for various reaons, also has to be part of data that is subject to GC). With
this change, we can drop optimisations like retroactively modifying existing
bytecode and *still* achieve better performance than before.
I believe that this is currently worth it to pave the way for changes that are
more significant for performance.
In general this also opens other avenues of optimisation: For example, we can
profile which argument sizes actually exist and remove the copy overhead of
varint decoding (which does show up in profiles) by using more adequately sized
types for, e.g., constant indices.
Known regressions:
* Op::Constant is no longer printing its values in disassembly (this can be
fixed, I just didn't get around to it, will do separately).
Change-Id: Id9b3a4254623a45de03069dbdb70b8349e976743
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12191
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Since we switched from reference scanning to string context, this only
handles the `__corepkgs__` hack. Update the docstrings.
Change-Id: Ie857c8c99ae1cdb4697323ec738f88be0580df3e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12246
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
cl/12228 did enable automatic retries for some flaky tests, which
generally did work, as can be seen in
https://buildkite.com/tvl/depot/builds/35893
However, "🦆" still reports as failing, because we check the number
of steps to be nonzero, which is not the case if retries have happened.
We cannot check for the overall status of the build, as it's still
"RUNNING", but instead of counting all failed steps so far, we can query
all failed jobs and then filter out the ones that were already retried.
Change-Id: Ib9d27587c8a8ba7970850812c4302fecdc4482e7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12233
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
We started using josh from nixpkgs since cl/11457, but forgot to update
this documentation.
Change-Id: I6e07227bcd3e955076d46146024edd89b69f08f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12244
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
I'm gonna be doing some poking around in the internals of Context, so in
preparation this pulls it out into its own module.
Change-Id: I72ea7df80b5f36f838934ee07bdba66874c334c9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12189
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Don't use ValidateNodeError, but SymlinkTargetError.
Also, add checks for too long symlink targets.
Change-Id: I4b533325d494232ff9d0b3f4f695f5a1a0a36199
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12230
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Don't use DirectoryError, but PathComponentError.
Also add checks for too long path components.
Change-Id: Ia9deb9dd0351138baadb2e9c9454c3e019d5a45e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12229
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Replace the hand-rolled code comparing names with a try_fold. Also, make
it slightly stricter here, detecting duplicates in the same fields
earlier.
Change-Id: I9c560838ece88c3b8b339249a8ecbf3b05969538
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12226
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
This provides a batched variant to construct a Directory, which reuses
the previously calculated size.
Checking and inserting code is factored out into a check_insert_node
function, taking the current size as a parameter and returning the new
size.
Change-Id: Ia6c2970a0c12181b7c40e63cf7ce8c93298ea37c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12225
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
The boot tests are sometimes flaky, and we don't want them to
periodically fail other build. Have Buildkite auto-retry them 2 times
on failure.
Logs for individual failures are still available, so it won't hinder
flakiness debuggability.
See https://buildkite.com/docs/pipelines/command-step#retry-attributes
for a documentation of a parameter, and cl/8983 for the introduction of
that feature to //nix/buildkite.
Change-Id: I1c0d25fa1d0ca940b3bdcd145ede87154b0c28eb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12228
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
This is covered by clippy.toml these days.
Change-Id: I2330af5781844d5f9d975793d770efcea48d371b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12223
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Make this a proper struct with named fields. We apparently never access
B3Digest in there, so it can be removed.
Change-Id: Ifc07310393e1afb0a835778eae137a19b54070b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12224
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Provide a into_nodes() function on a Directory, which consumes self and
returns owned PathComponent and Node.
Use it to provide a proper conversion from Directory to the proto
variant that doesn't clone.
There's no need for the one taking only &Directory, we don't use it
anywhere, and once someone needs that they might as well clone Directory
before converting it.
Update all other users of the `.nodes()` function to use `.into_nodes()`
where applicable, and avoid some more cloning there.
Change-Id: Id4577b9eb173c012e225337458898d3937112bcb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12218
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This encodes a verified component on the type level. Internally, it
contains a bytes::Bytes.
The castore Path/PathBuf component() and file_name() methods now
return this type, the old ones returning bytes were renamed to
component_bytes() and component_file_name() respectively.
We can drop the directory_reject_invalid_name test - it's not possible
anymore to pass an invalid name to Directories::add.
Invalid names in the Directory proto are still being tested to be
rejected in the validate_invalid_names tests.
Change-Id: Ide4d16415dfd50b7e2d7e0c36d42a3bbeeb9b6c5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12217
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Add a `SymlinkTarget` type to represent validated symlink targets.
With this, no invalid states are representable, so we can make `Node` be
just an enum of all three kind of types, and allow access to these
fields directly.
Change-Id: I20bdd480c8d5e64a827649f303c97023b7e390f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12216
Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Nodes only have names if they're contained inside a Directory, or if
they're a root node and have something else possibly giving them a name
externally.
This removes all `name` fields in the three different Nodes, and instead
maintains it inside a BTreeMap inside the Directory.
It also removes the NamedNode trait (they don't have a get_name()), as
well as Node::rename(self, name), and all [Partial]Ord implementations
for Node (as they don't have names to use for sorting).
The `nodes()`, `directories()`, `files()` iterators inside a `Directory`
now return a tuple of Name and Node, as does the RootNodesProvider.
The different {Directory,File,Symlink}Node struct constructors got
simpler, and the {Directory,File}Node ones became infallible - as
there's no more possibility to represent invalid state.
The proto structs stayed the same - there's now from_name_and_node and
into_name_and_node to convert back and forth between the two `Node`
structs.
Some further cleanups:
The error types for Node validation were renamed. Everything related to
names is now in the DirectoryError (not yet happy about the naming)
There's some leftover cleanups to do:
- There should be a from_(sorted_)iter and into_iter in Directory, so
we can construct and deconstruct in one go.
That should also enable us to implement conversions from and to the
proto representation that moves, rather than clones.
- The BuildRequest and PathInfo structs are still proto-based, so we
still do a bunch of conversions back and forth there (and have some
ugly expect there). There's not much point for error handling here,
this will be moved to stricter types in a followup CL.
Change-Id: I7369a8e3a426f44419c349077cb4fcab2044ebb6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12205
Tested-by: BuildkiteCI
Reviewed-by: yuka <yuka@yuka.dev>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
The tvix-cli tests now include evaluating the NixOS GNOME installer ISO
image and making sure that its drvPath and outPath matches the one
evaluated by Nix. (This required extending the helper function a bit and
adding docs).
NixOS docs generation is disabled for now, see comments in diff.
Change-Id: Ia510f209b1ec3ef9a823f1e5ac0ef2f5f193976f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12177
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This splits the existing ReferenceScanner into a ReferenceScanner and
ReferencePattern as well as adds an AsyncRead implementation that can
do a scan while you read from it.
The reason to split the scanner in two is that generating the pattern
is expensive and when ingesting build results with multiple outputs you
want to do several independant scans that look for the same pattern.
The reader is for scanning files without having to load the entire file
into memory.
Change-Id: I993f5a32308c12d9035840f8e04fe82e8dc1d962
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12052
Autosubmit: Brian Olsen <me@griff.name>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This comment is a bit misleading - everything that's not in this field
is referenced in a content-addressed fashion (by their digest).
Change-Id: I5097131530fd188173393063643c057f588ea2c4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12219
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
These were out of date.
Change-Id: Ideaf888c2851bb9ec36ae01b11d93165b62cd7a5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12209
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
When using the runTests feature of crate2nix the derivation that runs the
tests is put into passthru.test but all default.nix files for Rust crates
in Tvix threw that away.
This commit retains passthru so that you can get access to the test
derivation.
Change-Id: I8b7b7db57a49069348f08c12c00a3b1a41a0c05b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12215
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
These were out of date.
Change-Id: I89df37f088ad6c53b676d965f0dc1023f40481f4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12208
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>