Allow taking advantage of the buffer of the underlying reader to avoid
unnecessary copies of file data.
We can't easily implement the methods of BufRead directly, since we
have some extra I/O to perform in the final consume() invocation.
That could be resolved at the cost of additional bookkeeping, but this
will suffice for now.
Change-Id: I8100cf0abd79e7469670b8596bd989be5db44a91
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10089
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
We rely on being able to make small reads cheaply, so this was already
an implicit practical requirement. Requiring it explicitly removes a
performance footgun, and makes further optimisations possible.
Change-Id: I7f65880a41b1d6b5e6bf2e52dfe47d4c49b34bcd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10088
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
We don't need to validate UTF-8 separately, since valid names are
a strict subset of ASCII, and therefore a strict subset of UTF-8.
Change-Id: I3261bf0efe3480b5b315074efafcf5e47a6c5a65
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10087
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
This suggests it's cheap to convert around, but name actually does
allocate.
Move to a `to_owned(&self) -> StorePath`, to better signal that this
does allocate.
Change-Id: Ifaf7c21599e2a467d06e2b4ae1364228370275db
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10066
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
This appears in the cache.nixos.org dataset.
Change-Id: I2eadafe8441e0132a448828026553da2dc7c12aa
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9994
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This appears in the cache.nixos.org dataset.
Change-Id: I35921f7ef148f6681081a4e371abb8c9cc98854d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9993
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Rather than having our own error type, just make decoding errors use
the same common error type.
Change-Id: Ie2c86972f3745c695253adc3214444ac0ab8db6e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9995
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This appears in the cache.nixos.org dataset.
Change-Id: I055b60b9950a1a6a36c1b0576b957e11e1d4264b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9990
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This small tool formats A-Term in a more readable format. It's a lossy
conversion for non-valid UTF-8 environment values.
Change-Id: I65a51054d7faf528321bc2d9fc4425180a7813f5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9970
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Walking a btree_map twice is more expensive than copying a string,
especially because the cloning only happens in the (non-hot) error
path.
This fixes a clippy lint, so it's related to b/321.
Change-Id: I2ccfd0bc46792a45d277f47564e595b87107d8be
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9962
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This already has the right type.
Change-Id: I8f5850a41f9e97f1bc5f2a45ca05cf7439665c9d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9954
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
We also switch the MissingField error to &'static str, since we only
parse a fixed set of fields.
Together, this makes the performance impact of error handling
negligible in batch happy-path parsing.
Change-Id: I2bd0ef2f5b35fcaced56b32d238eca75ac199ef1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9867
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: edef <edef@edef.eu>
Tested-by: BuildkiteCI
We primarily want to measure the speed of the happy path.
Change-Id: Iad0146dde86fc262e2a4b8295bde4eb297b8bf30
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9866
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: edef <edef@edef.eu>
This provides more info about where a NARInfo failed to parse, rather
than just returning None and leaving a library user to manually debug.
Change-Id: I9a28ddd8e5712101483ebe686fdc474c7bbc8e4e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9831
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
This also takes input validation out of the loop, leaving the loop
backedge as the sole branch in the hot path.
Change-Id: Id08e6fb9cf5b074780efa09a7ad389352a601bcc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9847
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
If the output was fixed, we broke out of the for loop too early, before
actually validating individual outputs.
Change-Id: I2259697dfa2a157764358f6d326a1f7f6610647c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9815
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This specific struct is only used to represent content-addressed paths
(in case a Derivation has a fixed-output hash, for example).
Rename `Output`'s `hash_with_mode` to `ca_hash`.
We now also include `CAHash::Text`, and update the `validate` function
of the `Output` struct to reject text hashes there.
This allows cleaning up the various output path calculation functions
inside nix-compat/src/store_path/utils.rs, as they can now match on
the type.
`make_type` is renamed to `make_references_string`,
`build_regular_ca_path` is renamed to `build_ca_path`, and
`build_text_path` has a disclaimer added, because you might not actually
want to use it.
Change-Id: I674d065f2ed5c804012ddfed56e161ac49d23931
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9814
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
It is *eventually* followed by a Node, but there is some stuff in
between.
Change-Id: Ie7c7b462828bd3e066f4a7e774895f30b82763ef
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9768
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Equivalent to the existing code, but a little less cryptic.
Change-Id: Ib9b2f9aedddc84d0e79840bba4cce01f92d9bc56
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9766
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
The previous CLs did already absorb all the logic into the common tests,
no need to write this here again.
Change-Id: I7ba84ba86d5445ed247e5d11d5e59b7fa815670e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9732
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Due to the lack of a ATerm parser, we were previously loading the JSON
fixtures to construct our Derivation structs to run the output path
calculations with.
However, as we now have a ATerm parser, we can load the ATerm
representation directly.
This also means we can test the output path calculation for non-UTF8
Derivations.
Change-Id: I0e53f41a23566b5ad5f0fed12724e02a10b02707
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9731
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This provides a nom-based parser for Nix derivations in ATerm format,
which can be reached via `Derivation::from_aterm_bytes`.
Some of the lower-level ATerm primitives are moved into a (new) aterm
module, and some more higher-level ones that construct derivation-
specific types.
Also, move the escape_bytes function into there, this is a generic ATerm
thing.
Change-Id: I2b03b8a1461c7ea2fcb8640c2fc3d1fa3ea719fb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9730
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
We don't actually care if it's a BTreeMap of strings, bstrings or any of
that sort.
The only thing we want to be able to do is get a reference to the bytes
from the keys and values.
Change-Id: I21b85811a9ea47fe06afa3108836ef9295e5d89b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9737
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
We use test_resources and globbing for some of the test cases, so adding
additional files in there will also create new test cases, which we
don't always want.
Move it down one level to make some more space.
Change-Id: I619867dc80a4ced59d45096d0703678663b559cd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9729
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Limit the amount of memory consumed on the stack for NixHash. Sha512
isn't used that often, so it's fine if we heap-allocate it.
Change-Id: I4a9eecd20c6184610124dc130c41bfa5d0dc04c5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9726
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This is of known length.
Change-Id: Iba48ccc486d5bf9e38ec1a2da6e7b80997d2c6ca
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9723
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
This uses the newly introduced StorePath message type to add a Deriver
field to the PathInfo message.
Support for validation is added to both the golang and rust
implementation. This includes extending unit tests.
Change-Id: Ifc3eb3263fa25b9eec260db354cd74234c40af7e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9647
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
This is preparation for adding an async port.
Change-Id: Id638ec1f6f46e2f3935448184eed51e2233263fe
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9618
Tested-by: BuildkiteCI
Autosubmit: edef <edef@edef.eu>
Reviewed-by: flokli <flokli@flokli.de>
Consistent error messages, and slightly nicer code layout. We avoid
printing the input data, since we primarily want to point out the
specific violated invariant. In the one place where we do want to,
we use BStr's Debug implementation, since byte slices don't print
nicely.
Change-Id: I3a9a0c37be270ea5f16cf124922c254608fb849e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9617
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: edef <edef@edef.eu>
Found by Clippy, which we should probably run in CI.
Change-Id: Id79c30b63f681021ab79358e02d29454d43c0aa6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9614
Autosubmit: edef <edef@edef.eu>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
The point of clearing and reusing the same Vec is to avoid transiently
allocating for every directory entry. This was lost in cl/8974 when we
switched from String to Vec<u8>.
Change-Id: I65647e5c4e54e88f1fe45e9a752cb5154d98fb33
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9607
Autosubmit: edef <edef@edef.eu>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Nix has historically rejected these. The current behaviour was
accidentally introduced in Nix 2.4, and is considered a bug.
Link: https://github.com/NixOS/nix/pull/9095
Change-Id: I38ffa911f0a413086479bd972de09671dbe85121
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9507
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: edef <edef@edef.eu>
Suggested in https://cl.tvl.fyi/c/depot/+/9108/5, but this disallows
adding a .gitignore file to the store.
Remove the check, and add a testcase.
Change-Id: Ieb78c417934756b2dbeb493040fe79726d1b9079
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9447
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Similar to cl/9350. Most of the times, this is still a regular string,
so let's print it like that if we can, and resort to base64 encoding if
we can't.
We were also wrongly always outputting the second character in the
string.
Change-Id: Id0e2a9d9f1ad3d2d7b554893ecd89a7e6383e9c2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9445
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
We've decided to asyncify all of the services to reduce some of the
pains going back and for between sync<->async. The end goal will be for
all the tvix-store internals to be async and then expose a sync
interface for things like tvix eval io.
Change-Id: I97c71f8db1d05a38bd8f625df5087d565705d52d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9369
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This trait is eval-specific, there's no point in dealing with these
things in tvix-store.
This implements the EvalIO interface for a Tvix store.
The proper place for this glue code (for now) is tvix-cli, which knows
about both tvix-store and tvix-eval.
There's one annoyance with this move: The `tvix-store import` subcommand
previously also used the TvixStoreIO implementation (because it
conveniently did what we wanted).
Some of this code had to be duplicated, mostly logic to calculate the
NAR-based output path and create the PathInfo object.
Some, but potentially more of this can be extracted into helper
functions in a shared crate, and then be used from both TvixStoreIO in
tvix-cli as well as the tvix-store CLI entrypoint.
Change-Id: Ia7515e83c1b54f95baf810fbd8414c5521382d40
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9212
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
There was a NixHash::new() before, which didn't perform any validation
of the digest length. We had some length validation when parsing nix
hashes or SRI hashes, but some places didn't perform validation and/or
constructed the struct directly.
Replace NixHash::new() with a
`impl TryFrom<(HashAlgo, Vec<u8>)> for NixHash`, which does do this
validation, and update constructing code to use that, rather than
populating structs directly. In some rare cases where we're sure the
digest length is correct we still populate the struct manually.
Fixes b/291.
Change-Id: I7a323c5b18d94de0ec15e391b3e7586df42f4229
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9109
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Previously, Output deserialization would silence validation errors and
provide `None` for `hash_with_mode` as soon as a validation error would
happen inside of the `NixHashWithMode` deserialization, e.g. invalid
hash length would not provide an validation error but a silent `None`
value.
This is problematic, we workaround a serde limitation here by writing
our own Deserializer.
As you can see, we write some boilerplate code unfortunately, as, for
example:
- `#[serde(fail_if_unparsed_as="Option::is_none")]` is not a thing,
otherwise, we could have been able to just bubble up errors in case of
"not fully parsed" (and not missing) values.
- `From<&serde_json::Value> for serde:🇩🇪:Unexpected` is not a thing,
otherwise, we could just map invalid type errors and reuse the
existing types instead of doing extremely bizarre things with
`serde:🇩🇪:Unexpected::Other`, note: this is a not problem for
expected, we know what we expect, we don't know what we received in
practice.
I decided to write a `NixHashWithMode::from_map` which will eat a map
deserialized via `serde_json`, so our serde magic is not totally "data
model" agnostic.
I wanted to go for data model agnosticity and enable maximal
performance, e.g. building the structure as the map values are streamed
in the Visitor, this is needlessly painful because `Output` and
`NixHashWithMode` are in different files and this really makes sense
only if we write the full implementation in one file, indeed, otherwise,
you end up duplicating the work or having strange coupling.
So, for now, we will allocate a full map of the fields inside the
`Output`, i.e. if any "unknown field" is in that map, we will
deserialize it for no reason.
Doing it properly, as I repeat it in the code and to flokli at C3Camp
2023, requires to patch serde upstream IMHO.
Change-Id: I46fe6ccb8c390c48d6934fd3e3f02a0dfe59557b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9107
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This `.into_iter()` call is equivalent to `.iter()` and will not consume
the `BTreeMap`.
Change-Id: Ie26637ebecb0bea5b09c447cc45ed207f8b50913
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9088
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
We only need bstr::ByteSlice to be able to use replace, it doesn't need
to return a BString.
Change-Id: I811948436fb89652e880970c2c05356183f3e439
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9084
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Make it more obvious if these are bytes pointing to JSON or ATerm, so we
don't get confused.
Change-Id: I2402c687b7ba9c05aac20ed63b0df54e4e96a9d8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8998
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This is more concise than a io::copy of a Cursor to bytes, and we have
everything to be written in memory.
Change-Id: I81f34666aa61aef4e16b33423ce4a69c3781efc3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8997
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
write_input_derivations shouldn't need to write a comma to separate it
from the previous output from write_outputs.
This is better placed in the function calling all of these helper
functions.
Change-Id: I9ccc440e4665b52369ef39e75151b9a29469ce48
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8995
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
We're happy with any &[S], as long as <S: AsRef<[u8]>.
This allows passing both strings and &[u8].
Change-Id: If2a80d9b1ee33ba328c9cdab4fa83ca7b98a71e2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8994
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI