The C code from which this is translated uses sentinel values for
various things, this commit replaces them with standard Rust types
instead (amongst a bunch of other small improvements).
Change-Id: I892811a7afebb5a0f3b825824fc493ab0b399e44
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3735
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
Since this code is essentially a fairly plain translation from C, it
is a bit confusing to deal with the original untyped code. This is an
attempt to try and clean some of it up.
Change-Id: Icd21f531932e1a811c0d6dbf2e9acba61ca9c45d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3734
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
Please read b/108 to make sense of this.
This gets rid of the explicit list of exposed packages from nixpkgs,
and instead makes the entire package set available at
`third_party.nixpkgs`.
To accommodate this, a LOT of things have to be very slightly shuffled
around. Some of this was done in already submitted CLs, but this
change is unfortunately still quite noisy.
Pay extra attention to:
* overlay-like functionality that was partially moved to actual
overlays (partially as in, the minimum required to get a green
build)
* modified uses of the package set path, esp. in NixOS systems
Special notes:
* xanthous has been disabled in CI because of issues with the Haskell
overlay
* //third_party/nix has been disabled because of other unclear
dependency issues
Both of these will be tackled in a followup CL.
Change-Id: I2f9c60a4d275fdb5209264be0addfd7e06c53118
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2910
Reviewed-by: glittershark <grfn@gws.fyi>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
This also includes a fix for an issue where the identifiers of
variables were pushed onto the stack, which is incorrect.
Change-Id: Id89b388268efad295f29978d767aa4b33c4ded14
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2594
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
identifier_str might look a bit overengineered, but we want to reuse
this bit of code and it needs a reference to the token from which to
pick the identifier.
The problem with this is that the token would be owned by self, but
the function needs to mutate (the interner), so this implementation is
the most straightforward way of acquiring and working with an
immutable reference to the token before interning the identifier.
Change-Id: I618ce8f789cb59b3a9c5b79a13111ea6d00b2424
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2592
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Making this function a macro instead makes it possible to match
arbitrary token kinds, even the ones that carry data, without changing
the syntax too much.
Change-Id: I5cda9e36d6833bd9c259f7d4d8340db6e783b4e8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2593
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
These aren't particularly useful without side effects, but one step at
a time.
This diverges slightly from the book, in that OpPop retains the last
value it "forgot" from the stack in a special field on the
interpreter.
This makes it possible to return values from expression statements,
which helps in cases where Lox is embedded as a scripting
language (please don't do this ever) or in tests.
Change-Id: Ided0bc04c6e80ddb23ba4693d61ac9e08b002d58
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2584
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This is again a step closer to the book, but there are some notable
differences:
* Only constants encountered by the compiler are interned, all other
string operations (well, concatenation) happen with heap objects.
* OpReturn will always ensure that a returned string value is newly
heap allocated and does not reference the interner.
Change-Id: If4f04309446e01b8ff2db51094e9710d465dbc50
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2582
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This is based on this matklad post:
https://matklad.github.io/2020/03/22/fast-simple-rust-interner.html
It's modified slightly to provide a safer interface and slightly more
readable implementation:
* interned string IDs are wrapped in a newtype that is not publicly
constructible
* unsafe block is reduced to only the small scope in which it is
needed
* lookup lifetime is pinned explicitly to make the intent clearer when
reading this code
Change-Id: Ia3dae988f33f8e5e7d8dc0c1a9216914a945b036
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2578
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
... including concatenation.
This diverges significantly from the book, as I'm using std::String
instead of implementing the book's whole heap object management
system.
It's possible that Lox in Rust actually doesn't need a GC and the
ownership model works just fine.
Change-Id: I374a0461d627cfafc26b2b54bfefac8b7c574d00
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2577
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
This makes it possible to specify the input & output types of the
binary_op macro. If only one type is specified, it is assumed that the
input and output types are the same.
Change-Id: Idfcc9ba462db3976b69379b6693d091e1a525a3b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2573
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Adds support for true, false & nil. These each come with a new
separate opcode and are pushed directly on the stack.
Change-Id: I405b5b09496dcf99d514d3411c083e0834377167
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2571
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Introduces a new enum which represents the different types of possible
values, and modifies the rest of the existing code to wrap/unwrap
these enum variants correctly.
Notably in the vm module, a new macro has been introduced that makes
it possible to encode a type expectation and return a runtime error in
case of a type mismatch.
Change-Id: I325b5e31e395c62d8819ab2af6d398e1277333c0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2570
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
If I was adding any dependencies, this might be a good one for a
property-based test thing, but I'm not going to.
Change-Id: Ia801d041479d1a88c59ef9e0fe1460b3640382e3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2569
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Without this fix we would keep parsing in the same precedence level
and get weird things like:
10 - -10 + 10
=> 10
Change-Id: If2bed4569fbf566027011037165a9b3c09b7427c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2567
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This should clean up everything in the way of actually running this
end-to-end.
Change-Id: Ie89d82472a458256a251a4fddc1c36d88d21f5f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2563
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Defines a new precedence levels enum which can be used to restrict the
parser precedence in any given location. As an example, unary
expressions and grouping are implemented, as these have a different
precedence from e.g. expression()
Change-Id: I91f299fc77530f76c3aba717f638985428104ee5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2558
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This lets us suppress reporting of additional errors from the compiler
until a synchronisation point is reached.
Change-Id: Iacf90949f868fbdb4349750065b5e458cf74d32a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2557
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This one necessarily has to diverge more from the book than the
treewalk interpreter did, so some of this is expected to change, but
I'm happy with the rough shape.
Since we're reusing the old scanner, the compiler/parser struct owns
an iterator over all tokens with which the pull-scanner from the
bytecode chapters is simulated.
Change-Id: Icfa0bd4729d9df786e08f7e49a25cba1b9989a91
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2556
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
This makes it easier to transition between the single/multi error
functions via ?
Change-Id: Ie027f4700da463a549be6f0d4a0022a9b8dc0d61
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2555
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
In the book, the clox interpreter has its own scanner which uses a
pull-based model for a single pass compiler.
I can't be bothered to write another scanner, or amend this one into
pull-mode to work with the treewalk interpreter, so instead I will
just reuse it and pull from a vector of tokens.
The tokens are shared between both interpreters and the scanner is not
what I'm interested in here.
Change-Id: Ib07e89127fce2b047f9b3e1ff7e9908d798b3b2b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2420
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
It's unclear if the second part of the book can reuse anything from
the first part (I'm guessing probably the scanner, but I'll move that
back if it turns out to be the case).
Change-Id: I9411355929e31ac6e953599e51665406b1f48d55
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2415
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
This is significantly simplified from the version in the book, since
I'm using Rust's Vec and not implementing dynamic arrays manually.
We'll see if I run into issues with that ...
Change-Id: Ie3446ac3884b850f3ba73a4b1a6ca14e68054188
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2413
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI
Right now this introduces a simple mechanism to flip between the
interpreters.
Change-Id: I92ee920c53d76ab6b664ac671993a6d6426af61a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2412
Reviewed-by: tazjin <mail@tazj.in>
Tested-by: BuildkiteCI