users.sterni.nix.utf8 implements UTF-8 decoding in pure nix. We
implement the decoding as a simple state machine which is fed one byte
at a time. Decoding whole strings is possible by subsequently calling
step. This is done in decode which uses builtins.foldl' to get around
recursion restrictions and a neat trick using builtins.deepSeq puck
showed me limiting the size of the thunks in a foldl' (which can also
cause a stack overflow).
This makes decoding arbitrarily large UTF-8 files into codepoints using
nix theoretically possible, but it is not really practical: Decoding a
36KB LaTeX file I had lying around takes ~160s on my laptop.
Change-Id: Iab8c973dac89074ec280b4880a7408e0b3d19bc7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2590
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
switch would probably otherwise be called match, but has been renamed so
it isn't confused with string.match and the enum matching capabilities
yants has.
It implements the closest to pattern matching nix can come which is
still flexible enough to not be painful: Syntactically it works like
cond, but is given a value. Instead of booleans it checks passed
predicates or equality if simple values are passed. Both types of checks
can be mixed.
Change-Id: I40f000979cfd469316e15fd58d6c3a80312c1cc4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2589
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
Since nix ends the substring at the end of the string anyways we can
just statically use the largest nix integer as the length of the string.
According to my testing this it ever so slightly faster as well.
Change-Id: I64566e91c7b223f03dcebe3bc5710696dc4261bc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2587
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
After all it only matches strings.
Change-Id: I3d2e5221ef43f692de69028e78ed98b6b11f82d1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2586
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
These aren't particularly useful without side effects, but one step at
a time.
This diverges slightly from the book, in that OpPop retains the last
value it "forgot" from the stack in a special field on the
interpreter.
This makes it possible to return values from expression statements,
which helps in cases where Lox is embedded as a scripting
language (please don't do this ever) or in tests.
Change-Id: Ided0bc04c6e80ddb23ba4693d61ac9e08b002d58
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2584
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This is again a step closer to the book, but there are some notable
differences:
* Only constants encountered by the compiler are interned, all other
string operations (well, concatenation) happen with heap objects.
* OpReturn will always ensure that a returned string value is newly
heap allocated and does not reference the interner.
Change-Id: If4f04309446e01b8ff2db51094e9710d465dbc50
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2582
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This is based on this matklad post:
https://matklad.github.io/2020/03/22/fast-simple-rust-interner.html
It's modified slightly to provide a safer interface and slightly more
readable implementation:
* interned string IDs are wrapped in a newtype that is not publicly
constructible
* unsafe block is reduced to only the small scope in which it is
needed
* lookup lifetime is pinned explicitly to make the intent clearer when
reading this code
Change-Id: Ia3dae988f33f8e5e7d8dc0c1a9216914a945b036
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2578
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
... including concatenation.
This diverges significantly from the book, as I'm using std::String
instead of implementing the book's whole heap object management
system.
It's possible that Lox in Rust actually doesn't need a GC and the
ownership model works just fine.
Change-Id: I374a0461d627cfafc26b2b54bfefac8b7c574d00
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2577
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
What you see here is mostly the fallout of me implementing a correct
urlencode implementation in nix for Profpatsch's blog implementation
(although they'll probably keep it at arm's length).
Where I want to go from here:
* Extend this library towards general purpose nix™, mainly by
implementing missing interfaces which you'd still have to use
<nixpkgs/lib> for right now. Reexposing parts of <nixpkgs/lib>
with better naming is fine for now, at some point I'd contemplate
making this depend on nothing outside of depot, maybe even itself
(should be easy we only use yants for an easily replaceable check).
* Improve error messages possibly by carefully reintroducing yants. I
originally typed essentially everything using yants, but turns out
this can a) be dangerous when stuff you are handling throws because
type checking means evaluating and b) has a incredible performance
cost in some cases.
* Reexpose builtins with better naming and slightly wrapped so they
don't unrecoverably throw in cases where a null or something would
suffice.
Change-Id: I33ab08ca4e62dbc16b86c66c653935686e6b0e79
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2541
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: Profpatsch <mail@profpatsch.de>
Tested-by: BuildkiteCI
This makes it possible to specify the input & output types of the
binary_op macro. If only one type is specified, it is assumed that the
input and output types are the same.
Change-Id: Idfcc9ba462db3976b69379b6693d091e1a525a3b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2573
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Adds support for true, false & nil. These each come with a new
separate opcode and are pushed directly on the stack.
Change-Id: I405b5b09496dcf99d514d3411c083e0834377167
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2571
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Introduces a new enum which represents the different types of possible
values, and modifies the rest of the existing code to wrap/unwrap
these enum variants correctly.
Notably in the vm module, a new macro has been introduced that makes
it possible to encode a type expectation and return a runtime error in
case of a type mismatch.
Change-Id: I325b5e31e395c62d8819ab2af6d398e1277333c0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2570
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
If I was adding any dependencies, this might be a good one for a
property-based test thing, but I'm not going to.
Change-Id: Ia801d041479d1a88c59ef9e0fe1460b3640382e3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2569
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Without this fix we would keep parsing in the same precedence level
and get weird things like:
10 - -10 + 10
=> 10
Change-Id: If2bed4569fbf566027011037165a9b3c09b7427c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2567
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This should clean up everything in the way of actually running this
end-to-end.
Change-Id: Ie89d82472a458256a251a4fddc1c36d88d21f5f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2563
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Defines a new precedence levels enum which can be used to restrict the
parser precedence in any given location. As an example, unary
expressions and grouping are implemented, as these have a different
precedence from e.g. expression()
Change-Id: I91f299fc77530f76c3aba717f638985428104ee5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2558
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This lets us suppress reporting of additional errors from the compiler
until a synchronisation point is reached.
Change-Id: Iacf90949f868fbdb4349750065b5e458cf74d32a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2557
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This one necessarily has to diverge more from the book than the
treewalk interpreter did, so some of this is expected to change, but
I'm happy with the rough shape.
Since we're reusing the old scanner, the compiler/parser struct owns
an iterator over all tokens with which the pull-scanner from the
bytecode chapters is simulated.
Change-Id: Icfa0bd4729d9df786e08f7e49a25cba1b9989a91
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2556
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
This makes it easier to transition between the single/multi error
functions via ?
Change-Id: Ie027f4700da463a549be6f0d4a0022a9b8dc0d61
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2555
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
Go 1.16 makes "go list all" not work. "go list std" is what we should be
using instead anyway.
Change-Id: I3f867fde477030d2358085b3d64b5856fb9c421b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2551
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
hibernate on low battery, and when the power button is pressed
Change-Id: I6560fc770ee5707e59fb2763614de2b8000e156e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2550
Reviewed-by: glittershark <grfn@gws.fyi>
Tested-by: BuildkiteCI
Uses project.el to anchor the ripgrep search. In combination with my
project detection logic, this means that grepping in TVL subprojects
works automatically.
Change-Id: I2705466d1de156c08ff0401a71112864aa24f976
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2542
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
When a file is added to the depot tree that is picked up by read-tree,
but it’s not a function like ({...}: {}), `readTree` will fail on the
function application, leading to a bad error message.
We can do slightly better, by checking the type and throwing a nicer
trace message.
`assertMsg` is copied from `nixpkgs/lib/assert.nix`, since at this
point we don’t have a reference to the lib.
There is another evaluation failure that can happen, which is when the
function we try to call does not have dots; however, nix does not
provide any inflection capabilies for checking whether a function
attrset is open (`builtins.functionArgs` only tells us the attrs it
mentions explicitly). Maybe the locality of the error could be
improved somehow.
Change-Id: Ibe38ce78bb56902075f7c31f2eeeb93485b34be3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2469
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
We stopped using this in favour of //web/panettone quite a while ago,
so lets clean it up.
Change-Id: I8aa8d86288933d470ab3962ffbb60294eaddd27b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2540
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: lukegb <lukegb@tvl.fyi>
Previously, for types defined using typedef (like all primitive types)
type.checkType would return a boolean. This is largely fine since in
most places `type.checkToBool (type.checkType x)` or similar is used.
However, some functions actually take type.checkType up on the promise
that it returns a set of the form:
{
ok = <bool>;
err = <option string>;
}
This is the case for restrict which has checkToBool = v: v.ok; and will
generate a proper set except if `t.checkToBool (t.checkType v) == false`
in which case it will return t.checkType v. If t was a primitive type or
defined using typedef, previously `t.checkType v` would be a boolean
which meant as soon as (restrict …).checkToBool was called on a restrict
checkType result in cases where the wrapped type didn't match, an
unrelated error would be thrown:
nix-repl> with nix.yants; restrict "foo" (_: true) int "lol"
error: value is a boolean while a set was expected, at /home/lukas/src/depot/nix/yants/default.nix:38:39
This is fixed by making typedef return a proper set from checkType and
adjusting its checkToBool accordingly.
Unfortunately I don't think we can easily add test cases for this except
by using recursive nix or VM tests as there is no way to introspect
error messages.
Change-Id: I96a7be065630f04ca33358f21809284911ec14fe
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2536
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
Reviewed-by: Profpatsch <mail@profpatsch.de>