We produce traces bigger than what tempo accepts by default, causing
traces to be rejected with TRACE_TOO_LARGE and to then be incomplete.
Bump the max size.
Change-Id: I8caa245d14db683853485ee5625c9662ea51ce29
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12926
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
There's no need to mkForce anything in that list.
Nix reads nix-cache-info to determine priority.
Change-Id: I08797ed25348f52f5696f80558d206b73d20dead
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12925
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
The mount didn't get applied for some reason, explicitly configure the
path.
Change-Id: Ie41eb3c1d5f6416493211fb77709aaeecf61edf0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12924
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
The data source defaults to 15s of time interval. As alloy scrapes every
60s only, this causes watching dashboards with a smaller time range to
just not show any data, like the CPU graph being empty for a time range
< last 12h.
Fix by setting time interval to 60s.
Co-Authored-By: WilliButz <willibutz@posteo.de>
Change-Id: Ife306b2fda968654cad818a82f99e0011819be3c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12923
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Putting this into UnitConfig won't work, so the bind mount didn't
happen, causing the blobs to be created on the SSD too.
This was already deployed and the data migrated over.
Change-Id: Ie30c8f458cdad8b764817a48a048ec3ca3c18e64
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12922
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Move the Directory and PathInfo storage to the SSD, and only bind-mount
the blob storage from the HDD.
This should improve IO for random access.
Change-Id: Icf9408a879dec8a52541953682ffac25b31e73d3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12921
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
This is a bandaid until we have a proper fix.
Change-Id: Id9f0bab5f309a7796c1efee23071013618c6dd12
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12896
Autosubmit: Jonas Chevalier <zimbatm@zimbatm.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
With nar-bridge supporting zstd content-encoding, we don't need the
nginx zstd module and can re-enable http2.
We also need to propagate the Accept-Encoding sent by the client to
nar-bridge, so it actually knows it can send zstd.
This reduces the time measured in the microbenchmark from ~13s to this:
```
hyperfine 'rm -rf /tmp/cache; nix copy --from https://nixos.tvix.store/ --to "file:///tmp/cache?compression=none" /nix/store/jlkypcf54nrh4n6r0l62ryx93z752hb2-firefox-132.0'
Benchmark 1: rm -rf /tmp/cache; nix copy --from https://nixos.tvix.store/ --to "file:///tmp/cache?compression=none" /nix/store/jlkypcf54nrh4n6r0l62ryx93z752hb2-firefox-132.0
Time (mean ± σ): 4.880 s ± 0.207 s [User: 4.661 s, System: 2.377 s]
Range (min … max): 4.700 s … 5.274 s 10 runs
```
Change-Id: Id092307423636163ae95ef87ec8fa558b83ce0bb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12835
Reviewed-by: Jörg Thalheim <joerg@thalheim.io>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
We don't need a separate instance of opentelemetry-collector, alloy can
also do this job for us.
Change-Id: I1b671ba57d70b080f7db112e1afcfe2e0cbdd13e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12829
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: Jonas Chevalier <zimbatm@zimbatm.com>
Tested-by: BuildkiteCI
These are quite bursty, and I've seen messages about getting rate
limited.
Change-Id: I73058140957cb5718971fa432c003c2d1b0305e3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12828
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Use grafana-alloy to collect system metrics.
Change-Id: I592e64ca722701d4f12e69a531a434b54954955a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12827
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
This enables routing of metrics to an instance of VictoriaMetrics, and
configures opentelemetry-collector to route metrics there.
Change-Id: If765191a4cc70ddcaad821d45132b96a10a12148
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12812
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Jonas Chevalier <zimbatm@zimbatm.com>
This is a fetch-through mirror of cache.nixos.org, hosted by NumTide.
The current machine is a SX65 Hetzner dedicated server with 4x22TB SATA disks,
and 2x1TB NVMe disks.
The goals of this machine:
- Exercise tvix-store and nar-bridge code
- Collect usage metrics (see https://nixos.tvix.store/grafana)
- Identify bottlenecks
- Replace cache.nixos.org?
Be however aware that there's zero availability guarantees. Since Tvix doesn't
support garbage collection yet, we either will delete data or order a bigger
box.
Change-Id: Id24baa18cae1629a06caaa059c0c75d4a01659d5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12811
Tested-by: BuildkiteCI
Reviewed-by: Jonas Chevalier <zimbatm@zimbatm.com>
Reviewed-by: flokli <flokli@flokli.de>
This got lost somehow, but is necessary to keep `mg` in `$PATH`.
Change-Id: I2100d68225284bfe825bcc5ab01628891ebd09a3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12810
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: flokli <flokli@flokli.de>
The machine at Hetzner is gone, only the one in EC2 remains.
Change-Id: Ia7266d56ef1174267b95086c51e6d80015c2f905
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10711
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: flokli <flokli@flokli.de>
This adds a `parse-bucket-logs.{service,timer}`, running once every
night at 3AM UTC, figuring out the last time it was run and parsing
bucket logs for all previous days.
It invokes the `archeology-parse-bucket-logs` script to produce
a .parquet file with the bucket logs in `s3://nix-cache-log/log/` for
that day (inside a temporary directory), then on success uploads the
produced parquet file to
`s3://nix-archeologist/nix-cache-bucket-logs/yyyy-mm-dd.parquet`.
Change-Id: Ia75ca8c43f8074fbaa34537ffdba68350c504e52
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10011
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
This adds a `archeology-parse-bucket-logs` CLI tool to `$PATH`.
It can be invoked like this:
```
archeology-parse-bucket-logs http://nix-cache-log.s3.amazonaws.com/log/2023-11-10-00-* bucket_logs_2023-11-10-00.pq.zstd
````
… and will produce a zstd-compressed Parquet file for (roughly) that
time range.
As the EC2 instance credentials don't give access to the logs bucket
(yet), other AWS credentials need to be provided.
This can be accomplished by using "AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN" from
"Option 2: Manually add a profile to your AWS credentials file (Short-
term credentials)" in AWS IAM Identity Center.
Processing logs for a one-hour range takes a minute or two, the
resulting zstd-compressed Parquet file is around 40-80M in size.
Processing logs for a whole day takes some 25mins, due to the sheer
amount of data (12 GB of raw log data, distributed among 450k individual
files, 20Mio log lines), but at least clickhouse isn't able to parse the
resulting parquet file back in:
> Code: 36. DB::Exception: IOError: Couldn't deserialize thrift: MaxMessageSize reached
For future automation tasks, it's probably better to run this once an
hour, and further join the data later on.
Change-Id: I6c8108c0ec17dc8d4e2dbe923175553325210a5c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10007
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Expose `deps` separately, add a direnv with PATH_add for it to bring
tooling into $PATH.
Change-Id: I432cd2b082cad89e08bef78dc4653e10e137cd6b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9842
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Avoid having to re-enter the shell whenever the config is changed.
Change-Id: Ib9f6bb4075e29acaeb4863d64c017695ca85b60b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9841
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This add the EC2 box config to the repo.
Change-Id: Id7a888a2cfbf1454cd9f9465018df377e14b4e9f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9836
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
This adds a deploy-archeology script.
I tried getting morph to work first, but passing it a
depot.ops.nixos.nixosFor seems to be very hard - the NixOS module system
doesn't like the arguments it's called with.
Replace morph with a 3 line bash script, which assumes your ssh_config
contains config for an `archeology` host.
Change-Id: I2bf694c60ded39c201efbbb899f3b5512aa4d0f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9835
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>