This turned out a lot nicer than I expected it to be.
Change-Id: I427670644eba789ea2037423fa9af8e632b19b34
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8695
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This makes more sense to me.
Change-Id: I013bf9457f20a31a9762768607f4094358e1b7cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8693
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Mounts the privacy policy at `/privacy-policy`. Using yew_router
"properly" is difficult in components that don't make use of macros
and context magic, so I've opted to use the gloo history handling
directly to parse the location here.
Change-Id: Icde11485f9947bc860a7b2c43772bb0f4cdf2ea1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8653
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Doesn't actually have bucket serving or access configuration yet, one
step at a time!
Change-Id: I0ce9b3b077252395bd807fad44cbdca40cdeac49
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8649
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This makes it possible to embed long texts from Markdown files instead
of dealing with writing the weird HTML-tags inside the yew macros,
which will be much easier for content editors to deal with.
Change-Id: Idc4e67404fcfe2b8d5083cf556df1c701ba17660
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8648
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This doesn't actually submit anything to the (not-yet-existing)
backend, but will help the designers figure out what we're actually
looking for here.
Change-Id: I680d88151fb0706953f18eb6256da6f205da7ffb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8489
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Sets up a virtual machine image that is bootable on Yandex Cloud.
There are some slightly wonky behaviours still, like cloud-init
apparently putting all keys into root's authorized_keys no matter what
is specified in the metadata, but it does work now.
Change-Id: I57dcb7fcfa6872a28855dc1347f73a6db3c56828
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8496
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This was a bit trickier than I anticipated, because there's no good
ways to avoid passing the credentials around manually.
What's basically happening now is that the credentials for the state
bucket are checked in (encrypted), and sourcing `creds.fish` uses the
cloud HSM to decrypt and load them into the environment.
Change-Id: I3f5ce1c9bd9d5efbf1013414f94771a09ea3a488
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8494
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Doesn't actually contain any configuration yet, just setting up TF
with the right providers and so on.
Change-Id: Ia7128dd977b4ff69eebaa36c6cad6ac104cafcdb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8492
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This module contains the request/response types for generators
requesting actions from the VM.
For most of these, an async helper function is added that will be used
inside of generator functions to make use of these requests/responses
instead of constructing them directly.
Change-Id: I1e085f88adaf784a34867957a0e82532d3a83d7c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8148
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Included fixes:
* //3p/overlays: tdlib override no longer needed (bump has landed upstream)
* //corp/{predlozhnik,tvixbolt}: bump wasm-bindgen to match nixpkgs
Home-manager has not been bumped as it has introduced an
incompatibility with Nix 2.3
Change-Id: I96ac3462b82c73db1ba23be03d7968f10abc9b53
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8033
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: sterni <sternenseemann@systemli.org>
Make it clear that Tvixbolt is a project of TVL LLC, and link to the
community website too.
See https://b.tvl.fyi/issues/248
Change-Id: Iefefe0263fa5ef01587d49c5a130a38b78ca7981
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8019
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Adds a multi-lingual version of the page, with the standard English
page being served at `/` and `/en`, and the new Russian version at
`/ru`.
Change-Id: I54ceea91d1442ee7b8717b59083e5d07c36ca8b0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7940
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This allows parsing TOML from Tvix. We can enable the eval-okay-fromTOML
testcase from nix_tests. It uses the `toml` crate, and the serde
integration it brings with it.
Change-Id: Ic6f95aacf2aeb890116629b409752deac49dd655
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7920
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Apparently our naive implementation of float formatting, which simply
used {:.5}, and trimmed trailing "0" strings not sufficient.
It wrongly trimmed numbers with zeroes but no decimal point, like
`10000` got trimmed to `1`.
Nix uses `std::to_string` on the double, which according to
https://en.cppreference.com/w/cpp/string/basic_string/to_string
is equivalent to `std::sprintf(buf, "%f", value)`.
https://en.cppreference.com/w/cpp/io/c/fprintf mentions this is treated
like this:
> Precision specifies the exact number of digits to appear after
> the decimal point character. The default precision is 6. In the
> alternative implementation decimal point character is written even if
> no digits follow it. For infinity and not-a-number conversion style
> see notes.
This doesn't seem to be the case though, and Nix uses scientific
notation in some cases.
There's a whole bunch of strategies to determine which is a more compact
notation, and which notation should be used for a given number.
https://github.com/rust-lang/rust/issues/24556 provides some pointers
into various rabbit holes for those interested.
This gist seems to be that currently a different formatting is not
exposed in rust directly, at least not for public consumption.
There is the
[lexical-core](https://github.com/Alexhuszagh/rust-lexical) crate
though, which provides a way to format floats with various strategies
and formats.
Change our implementation of `TotalDisplay` for the `Value::Float` case
to use that. We still need to do some post-processing, because Nix
always adds the sign in scientific notation (and there's no way to
configure lexical-core to do that), and lexical-core in some cases keeps
the trailing zeros.
Even with all that in place, there as a difference in `eval-okay-
fromjson.nix` (from tvix-tests), which I couldn't get to work. I updated
the fixture to a less problematic number.
With this, the testsuite passes again, and does for the upcoming CL
introducing builtins.fromTOML, and enabling the nix testsuite bits for
it, too.
Change-Id: Ie6fba5619e1d9fd7ce669a51594658b029057acc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7922
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
This table maps the grammemes for individual word forms (*not* for
lemmata in either corpus!) to the corresponding grammemes from the
other dataset.
These have drastically different shapes, so the mapping is not
perfect, but will help in determining which forms are intended to be
the same on both sides.
Change-Id: Ib0717e2f7a79d96bcb5e955a20f551e391fcd759
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7918
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: tazjin <tazjin@tvl.su>
This CL fixes the bug where output of a nix
evaluation is not set.
Change-Id: I8ae2759a7ec26e1de2e57dd43302129347a8c302
Signed-off-by: Aaqa Ishtyaq <aaqaishtyaq@gmail.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7896
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
The original dataset contains translations into different languages,
but only the English ones are imported here.
Note that translations are for lemmata only.
Change-Id: Ifb9c32c25fda44c38ad899efca9d205c520c0fa3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7895
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This is the full morphological set table for all the words from the
lemmata table, which they don't call it that.
Change-Id: I6f5be673c5f59f11e36bd8c8c935844a7d4fd170
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7894
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This is actually the lemmata table of this corpus, not the forms of
all words (they're in a separate table).
Change-Id: I89a2c2817ccce840f47406fa2a636f4ed3f49154
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7893
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This is the second dataset I want to integrate as it contains some
more practically useful, but somewhat less structured, information.
Change-Id: Ib46b2597a33e76f59e030f889a0961ecc5a144eb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7873
Tested-by: BuildkiteCI
Autosubmit: tazjin <tazjin@tvl.su>
Reviewed-by: tazjin <tazjin@tvl.su>
I'm changing strategies to importing both OC and another dataset
before continuing to normalise the data, as it might be easier to do
in a set of table-constructing queries inside of SQLite with all raw
data in place.
Change-Id: I26b41af80586fc1bfd8e26a6be20579068a82507
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7872
Autosubmit: tazjin <tazjin@tvl.su>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This makes the actual imported database of the ~whole Russian
language (all lemmas, grammemes, forms etc.) a Nix build target which
is built in CI.
This still needs schema normalisation (it's fairly directly mapped to
the raw data), but it's already starting to be a useful data set.
This also happens to be a pretty cool demonstration of the power of
Nix. You can do `nix-build -A corp.russian.data-import.database` and
out comes a perfectly valid SQLite database with a valid external data
import!
Change-Id: I5d6d15e67d0e4a7ff590fad06252be34f5d561fd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7866
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Otherwise up to 1000 elements might be missing.
Change-Id: I20d6238424eec27f0e758e7737c9c31bcb81b23d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7862
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
This is an initial and kind of dumb table structure, but there's some
massaging that needs to be done before this makes more sense.
Change-Id: I441288b684ef86be507099bcc4ebf984598789c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7861
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Adds the beginning of a tool which can import OpenCorpora data into a
SQLite database. This is quite a lot of toil and there's probably a
better way to do this, but overall becoming this intimately familiar
with the data structures is quite helpful for understanding what I
can/can't do with only this dataset.
Change-Id: Ieab33a8ce07ea4ac87917b9c8132226bbc6523b1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7859
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This is currently hosted by the company, and I'm assigning my
copyright to the company, which also runs an ad placement on the page.
Note that the NixOS module for hosting it has not been moved yet.
Change-Id: Iba9e1cab9370faa79e43c3344fbfbbbabead50b3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7857
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
This uses the `im::OrdMap` for `NixAttrs` to enable sharing of memory
between different iterations of a map.
This slightly speeds up eval, but not significantly. Future work might
include benchmarking whether using a `HashMap` and only ordering in
cases where order is actually required would help.
This switches to a fork of `im` that fixes some bugs with its OrdMap
implementation.
Change-Id: I2f6a5ff471b6d508c1e8a98b13f889f49c0d9537
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7676
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
This is a persistent, structurally sharing data structure which is
more efficient in some of our use-cases. I have verified the
efficiency improvement using `hyperfine` repeatedly over expressions
on nixpkgs.
Lists are not the most performance-critical structure in Nix (that
would be attribute sets), but we can already see a small (~5-10%)
improvement.
Note that there are a handful of cases where we still go via `Vec`
that need to be fixed, most notable for `builtins.sort` which can not
currently be implemented directly using `im::Vector` because of a
restrictive type bound.
Change-Id: I237cc50cbd7629a046e5a5e4601fbb40355e551d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7670
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
As expected, this ends up being significantly nicer to use than the
previous API.
While doing this, I've combined the error fields into one. This is
because there would only ever be one of those anyways, and combining
them ensures that we have consistent formatting (for example,
parser errors would previously not be run through the pretty
formatter but are now).
Change-Id: I6074ec8a4a3901ea82d5d07174b76a345210967b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7547
Autosubmit: tazjin <tazjin@tvl.su>
Reviewed-by: grfn <grfn@gws.fyi>
Tested-by: BuildkiteCI