This is the most naive version of string interning possible - we store a
map from the string itself to the pointer behind a global mutex, and
memoize the allocation of all strings below a threshold length (16
bytes, for now) into that map. This requires leaking /all/ strings,
since it's not easy to know just from the pointer that a string has been
interned - so interning is disabled if string leaking is also disabled.
In the case where we're leaking strings (the default), even the naive
version of this gets us a pretty nice performance boost:
hello outpath time: [742.54 ms 745.89 ms 749.14 ms]
change: [-2.8722% -2.0135% -1.0654%] (p = 0.00 < 0.05)
Performance has improved.
However, in the case where we're not leaking strings, we have to keep
track of which strings have and haven't been interned, which makes this
a little worse:
hello outpath time: [779.30 ms 792.82 ms 808.74 ms]
change: [+2.5258% +4.0884% +5.8931%] (p = 0.00 < 0.05)
Performance has regressed.
Hopefully we can close the gap here a bit with some clever
tricks (coming next).
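
A minimal sketch of the naive scheme described above, for illustration
only. The names `intern`, `INTERNER` and `INTERN_THRESHOLD` are
hypothetical, and reading "below a threshold length" as "strictly
shorter than 16 bytes" is an assumption; the real implementation may
differ in both respects.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Strings shorter than this many bytes are interned (assumed cutoff).
const INTERN_THRESHOLD: usize = 16;

/// Global map from string contents to the leaked allocation, behind a mutex.
static INTERNER: OnceLock<Mutex<HashMap<&'static str, &'static str>>> = OnceLock::new();

/// Hypothetical helper: always returns a leaked string, but reuses the
/// existing allocation when a short string has been seen before.
fn intern(s: &str) -> &'static str {
    if s.len() >= INTERN_THRESHOLD {
        // Too long to intern: leak a fresh allocation every time.
        return Box::leak(s.to_owned().into_boxed_str());
    }
    let mut map = INTERNER
        .get_or_init(|| Mutex::new(HashMap::new()))
        .lock()
        .unwrap();
    if let Some(&existing) = map.get(s) {
        // Already interned: hand back the same pointer.
        return existing;
    }
    // First sighting: leak once and remember the pointer.
    let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
    map.insert(leaked, leaked);
    leaked
}
```

With this sketch, two calls to `intern("hello")` return the same
pointer, while two calls with a long string return equal contents at
different addresses.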
Change-Id: If08cb48ede703c7fe3bdd8d617443f8a561ad09b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12047
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: aspen <root@gws.fyi>
Per https://nnethercote.github.io/perf-book/hashing.html, we have
basically no reason to use the default hasher over a faster,
non-DoS-resistant hasher. This gives a nice perf boost basically for
free:
hello outpath time: [704.76 ms 714.91 ms 725.63 ms]
change: [-7.2391% -6.1018% -4.9189%] (p = 0.00 < 0.05)
Performance has improved.
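
For illustration, this is roughly how a map can be switched to a
non-DoS-resistant hasher using only the standard library. The message
does not name the hasher that was adopted, so the hand-written FNV-1a
below is a stand-in, and `Fnv1a`, `FastMap` and `count_words` are
hypothetical names.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// FNV-1a (64-bit): simple and fast, but not DoS-resistant.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // 64-bit FNV offset basis.
        Fnv1a(0xcbf29ce484222325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            // 64-bit FNV prime.
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap that uses the fast hasher instead of the default SipHash.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Small usage example: count occurrences of each word.
fn count_words(words: &[&str]) -> FastMap<String, usize> {
    let mut counts = FastMap::default();
    for w in words {
        *counts.entry((*w).to_owned()).or_insert(0) += 1;
    }
    counts
}
```

The only change at the use site is the map type, which is why a swap
like this tends to be cheap to make.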
Change-Id: If5587f444ed3af69f8af4eead6af3ea303b4ae68
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12046
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Autosubmit: aspen <root@gws.fyi>
This depends on the ChunkedReader work.
Change-Id: I38878d0f822c312151131e55baee4db6ef1c3650
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12142
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Autosubmit: flokli <flokli@flokli.de>
Explain the current caveats as far as performance tuning is concerned.
Change-Id: I1a9c11c81de09350240fb61e3c130fc401ef6618
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12141
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Default to always leaking strings, and copying strings by copying
pointers rather than cloning the underlying allocation. This (somewhat
bafflingly) doesn't seem to affect any benchmarks, but paves the way for
some tricks around string allocation that do.
Unfortunately, we can't do this (yet?) for contextful strings, for
reasons I don't currently understand but which I will address later,
when I address contextful strings more holistically.
I've left a flag in here to disable this, both to test the cloning logic
for strings for when/if we decide to bring this back, and to allow
people who care more about memory usage than perf to disable leaking.
Change-Id: Iec44bcbfe9b3d20389d2450b9a551792a79b9b26
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12045
Autosubmit: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Use redb instead of sled for the default filesystem implementation of
PathInfoService and DirectoryService. In the future we'll also drop sled
support completely.
Change-Id: I513ff0c2ff953d59714aa50b9aa1301b02f53d40
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12085
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This is a partial revert of https://cl.tvl.fyi/c/depot/+/12068 where I
changed tvix/crate2nix-check to use depotfmt.check. It turns out that we
don't actually want to use it in this case, as the wrapper sets
`--fail-on-change`, which would always fail because the Cargo.nix
generated by crate2nix will always need to be changed (it's not
formatted properly).
Change-Id: Ife35c812ca69c90459ce4445f4252d0a24c218fa
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12132
Tested-by: BuildkiteCI
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: flokli <flokli@flokli.de>
This provides a DirectoryService implementation which uses
redb (https://github.com/cberner/redb) as the database. It provides both
in-memory and persistent on-filesystem implementations.
Change-Id: Id8f7c812e2cf401cccd1c382b19907b17a6887bc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12038
Tested-by: BuildkiteCI
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: flokli <flokli@flokli.de>
The "path" key in the arguments to builtins.path supports any
path-coercible type (a string, a path...). Coerce it to a path in the
argument rather than just requiring it already be one and throwing an
error if it's not.
This is... annoying to test, since it requires a file with known
contents that's available in the build sandbox. But it works! Trust me!
Fixes: b/412
Change-Id: I3c8e339bf344a208d5ed5990193942651f318745
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12053
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: aspen <root@gws.fyi>
This change exposes the already existing wrapper for treefmt/depotfmt
that supports running it inside a sandbox. We now reuse it inside
//tvix/crate2nix-check, where we previously duplicated the code. The
check is now stricter and will also fail on changes, so I had to set the
rust edition in the treefmt config.
Change-Id: I000e52421258979c038ba6b1f1ff2db14e391b0c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12068
Tested-by: BuildkiteCI
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: flokli <flokli@flokli.de>
The environment variables caused both nar-bridge and tvix-store daemon
to try to connect to the same store, which fails due to locking issues.
Pass the config to `tvix-store daemon` directly. Also add
`--otlp=false` to the instructions to reduce log spam, since most users
don't have an OTLP collector running.
Explain when to use gRPC in `tvix-store virtiofs` and when not.
Also point out that a NixOS closure needs to be copied into the store
before it can be booted.
Change-Id: If4eda07bba28ad0bbe70e468cb727441a21b0588
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12067
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
It was previously pointing to a sled implementation which no longer
exists. It was also using nar-bridge-go instead of the new nar-bridge
(Rust).
Closes: https://b.tvl.fyi/issues/413
Change-Id: Id0df61d4728198c3ae95a8e27ba7303434892966
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12063
Reviewed-by: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
This is nice to test too - it's similar to hello, but runs for a lot
longer (like 7.5 seconds on my laptop) which means we get even better
stats for stuff.
Change-Id: I7935818f10a6d846d446e685b9263a72d7e2aabd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12061
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: aspen <root@gws.fyi>
We already do this for redb and for sled in SledDirectoryService.
Change-Id: I34c7178257a6a04e9f12ed4037a4ef585d7b0d54
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12060
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Add more logging and remove context from errors, because the logs
already provide it. (The errors need to be refactored anyway: StorageError
and InvalidRequest are currently used inconsistently.)
Change-Id: Ia43c0d237d9075152490c635b05fb3fb343abcc8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12058
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
The decode function didn't check that the input had a valid length and
so would panic when given input with invalid length.
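
The message does not say which decoder is affected, so the following is
only a sketch of the general shape of such a fix, using a hypothetical
hex decoder (`decode_hex`, `DecodeError`) as a stand-in: validate the
length up front and return an error instead of indexing out of bounds.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    /// The input length cannot correspond to a whole number of bytes.
    InvalidLength(usize),
    /// The input contains a byte outside the alphabet.
    InvalidChar(u8),
}

/// Decode a single lowercase hex digit.
fn nibble(c: u8) -> Result<u8, DecodeError> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        _ => Err(DecodeError::InvalidChar(c)),
    }
}

/// Hypothetical stand-in decoder: rejects bad lengths before decoding.
fn decode_hex(input: &[u8]) -> Result<Vec<u8>, DecodeError> {
    // Check the length first, so the loop below never sees a partial pair.
    if input.len() % 2 != 0 {
        return Err(DecodeError::InvalidLength(input.len()));
    }
    let mut out = Vec::with_capacity(input.len() / 2);
    for pair in input.chunks_exact(2) {
        out.push((nibble(pair[0])? << 4) | nibble(pair[1])?);
    }
    Ok(out)
}
```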
Change-Id: Ie27d006b8fe20f005b4a47a1763821a61e9a95c7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12051
Reviewed-by: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
Autosubmit: Brian Olsen <me@griff.name>
Previously, OpConstant would display some detail about its
ConstantIdx: whether it's a thunk or closure, and what its address
is. This has been expanded to also show when the ConstantIdx is a
blueprint, along with the blueprint's address, and to the other
opcodes that use a ConstantIdx.
Currently, it seems like blueprint addresses don't correspond to the
address of the thunk listed in the bytecode output, but it's still
useful to see that the constant being grabbed is a blueprint, and
maybe this pointer can be made to make more sense in the future.
Change-Id: Ia212b0d52b004c87051542c093274e7106ee08e4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12044
Tested-by: BuildkiteCI
Reviewed-by: aspen <root@gws.fyi>
Autosubmit: chickadee <matthewktromp@gmail.com>
path_exists was returning an error when certain common IO errors were
encountered. e.g. in the path "/dev/null/.", path_exists would throw
an error because the underlying call to Path::try_exists threw an
error because null isn't a directory. But if null isn't a directory,
then the path is invalid, so this should really be returning
false. That's what nix's behavior is and that's what makes sense.
The trait function isn't being changed because some other
implementers (e.g. tvix_store_io) have actual errors they can throw.
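
A minimal sketch of the behaviour described above. The free function
`path_exists` here is a simplified stand-in for the trait method, and
treating every error from `Path::try_exists` as "does not exist" is an
assumption; the actual change may only map certain error kinds.

```rust
use std::io;
use std::path::Path;

/// Simplified stand-in for the std-backed `path_exists`: the signature
/// still returns a Result, but IO errors from the lookup become `false`.
fn path_exists(path: &Path) -> io::Result<bool> {
    // Assumption: any lookup error (e.g. a non-directory component, as
    // in "/dev/null/.") means the path is not valid, hence not present.
    Ok(path.try_exists().unwrap_or(false))
}
```

Keeping the `io::Result<bool>` return type mirrors the point above that
other implementers still have real errors to report.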
Fixes: b/411
Change-Id: I9e810e7a198bffe61365697c6d3d7e71f264c20b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12042
Tested-by: BuildkiteCI
Autosubmit: chickadee <matthewktromp@gmail.com>
Reviewed-by: aspen <root@gws.fyi>
This is the `{fixed,fixed:r,text}:{sha*,md5}` prefix used in various
string representations.
Factor that code out, and use it in the two places it can be used in.
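
As an illustration of the prefix format, here is a small sketch. The
types `CaKind` and `HashAlgo` and the function `ca_prefix` are
hypothetical and do not reflect the actual API.

```rust
/// Hypothetical hash algorithm enum covering `{sha*,md5}`.
enum HashAlgo {
    Md5,
    Sha1,
    Sha256,
    Sha512,
}

/// Hypothetical enum for the `{fixed,fixed:r,text}` part.
enum CaKind {
    Flat,
    Recursive,
    Text,
}

/// Render the `{fixed,fixed:r,text}:{sha*,md5}` prefix.
fn ca_prefix(kind: &CaKind, algo: &HashAlgo) -> String {
    let kind = match kind {
        CaKind::Flat => "fixed",
        CaKind::Recursive => "fixed:r",
        CaKind::Text => "text",
    };
    let algo = match algo {
        HashAlgo::Md5 => "md5",
        HashAlgo::Sha1 => "sha1",
        HashAlgo::Sha256 => "sha256",
        HashAlgo::Sha512 => "sha512",
    };
    format!("{kind}:{algo}")
}
```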
Change-Id: Ic9555fa9e1884198d435e55c7f630b8d3ba2a032
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12041
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Brian Olsen <me@griff.name>
When retrieving a closure with get_recursive, the following could
happen in the gRPC client:
- The first reference to the deduplicated directory is added to
  expected_directory_digests
- The deduplicated directory is obtained and removed from
  expected_directory_digests
- The second reference to the deduplicated directory is added to
  expected_directory_digests
- The deduplicated directory has already been sent, but is still in
  expected_directory_digests. It looks to the gRPC client like the
  closure is incomplete and the stream ended prematurely.
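
One way to avoid the sequence above is to also remember which
directories have already been received, and to skip adding those to the
expected set again. The sketch below is an assumption about the shape of
such a fix, not the actual client code; `Directory`, `Digest` and
`validate_stream` are hypothetical.

```rust
use std::collections::HashSet;

/// Stand-in for a real directory digest.
type Digest = &'static str;

/// Simplified directory: its own digest plus the digests of its children.
struct Directory {
    digest: Digest,
    children: Vec<Digest>,
}

/// Check that a root-to-leaves stream contains exactly the closure of `root`.
fn validate_stream(root: Digest, stream: &[Directory]) -> Result<(), String> {
    let mut expected: HashSet<Digest> = HashSet::from([root]);
    let mut received: HashSet<Digest> = HashSet::new();

    for dir in stream {
        if !expected.remove(dir.digest) {
            return Err(format!("unexpected directory {}", dir.digest));
        }
        received.insert(dir.digest);
        for child in &dir.children {
            // Only expect children that have not been sent already;
            // this is what handles deduplicated directories.
            if !received.contains(child) {
                expected.insert(*child);
            }
        }
    }

    if expected.is_empty() {
        Ok(())
    } else {
        Err("stream ended prematurely".to_string())
    }
}
```

For a root with children `a` and `b` that both reference `d`, the stream
`root, a, d, b` is accepted: when `b` arrives, `d` is already in
`received` and is not expected a second time.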
Change-Id: Ic62bca12e7f8fb85af5fa4dacd199f0f3b8eea8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12033
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Use the ChunkedReader in CombinedBlobService instead, which also
supports seeking.
Change-Id: I681331a80763172c27e55362b7044fe81aaa323b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12031
Autosubmit: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Ensure nar-bridge is healthy before connecting to it, rather than just
checking for the unix socket to be present.
We don't have a proper /health endpoint yet, but nix-cache-info works
fine for now.
Change-Id: I22df2c3b7bffcf52dbd3d00f3ba5382dc06ab03d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12030
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Ensure the service is healthy before connecting to it, rather than just
checking for the unix socket to be present.
Change-Id: If6501828677c247910d91f35b860960802084691
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12029
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This provides a PathInfoService implementation using redb
(https://github.com/cberner/redb) as the underlying storage engine.
Both an in-memory variant and a filesystem one are provided, similar to
how it's done with the sled implementation.
Supersedes: https://cl.tvl.fyi/c/depot/+/11692
Change-Id: I744619c51bf2efd0fb63659b12a27cbe0b2fd6fc
Signed-off-by: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11995
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Parsing of the narinfo file sets the compression field to None instead
of Some("none"). The mapping selecting the decompression reader in
//tvix/store/src/pathinfoservice/nix_http.rs expected the latter.
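
To illustrate the kind of mapping involved: since the parser yields None
for uncompressed NARs, the reader selection needs a `None` arm. The
`Decoder` enum, the `select_decoder` function and the list of supported
compression names below are hypothetical, not taken from nix_http.rs.

```rust
/// Hypothetical set of decompression readers.
#[derive(Debug, PartialEq)]
enum Decoder {
    Identity,
    Bzip2,
    Xz,
    Zstd,
}

/// Pick a decoder from the parsed `Compression` field of a narinfo.
fn select_decoder(compression: Option<&str>) -> Result<Decoder, String> {
    match compression {
        // The parser represents "no compression" as None.
        None => Ok(Decoder::Identity),
        Some("bzip2") => Ok(Decoder::Bzip2),
        Some("xz") => Ok(Decoder::Xz),
        Some("zstd") => Ok(Decoder::Zstd),
        Some(other) => Err(format!("unsupported compression: {other}")),
    }
}
```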
Change-Id: I254a825b88a4016aab087446bdc0c7b6286de40c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12007
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This adds a generic `SigningKey` struct that can be used to sign
NARInfos with signers.
It also includes tooling to parse keypairs from bytes generated by Nix,
returning a specialized ed25519_dalek variant.
Change-Id: Ic9780c370939af54e7177c93cde3321adf189fc3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12014
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Align these with the way it's called in the ed25519 crates.
Change-Id: Ia52d3bb9bf831dc6b5f7d5356f5ac62135672883
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12013
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
This documents some thoughts and goals of the Tvix Build protocol, and
how it is possible to express Nix builds with it.
Additionally, it explains a proposed design for reference scanning.
Change-Id: I4b1f3feb2278e3c7ce06de831eb8eb1715cba1c9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12012
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
We still have the unique store name to identify which instantiation
caused the error. For recursion errors, the full chain is still retained
inside the CompositionError.
Change-Id: Iaddcece445a5df331e578d7c69d710db3d5f8dcd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12002
Tested-by: BuildkiteCI
Autosubmit: yuka <yuka@yuka.dev>
Reviewed-by: flokli <flokli@flokli.de>