tvl-depot/users/flokli/archeology
Florian Klink 46964f6d8f fix(users/flokli/archaeology): don't use file but column compression
ClickHouse also has column-level Parquet compression, configurable with the
output_format_parquet_compression_method setting.

It defaults to lz4, so the previous setting produced a zstd-compressed
parquet file containing lz4-compressed data.

Set output_format_parquet_compression_method to zstd instead, and sort
by timestamp before assembling the parquet file.

The existing files were updated to the same format with the following query:

```
SELECT * FROM file('bucket_logs_2023-11-11*.pq', 'Parquet', 'auto') ORDER BY timestamp ASC INTO OUTFILE 'bucket_logs_2023-11-11.parquet' SETTINGS output_format_parquet_compression_method = 'zstd'
```
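
A possible way to check which codec the rewritten file uses is sketched below. It assumes a ClickHouse version that provides the ParquetMetadata input format and that the output file from the query above sits in the current directory; neither is stated in this change.

```
-- Sketch: list the compression codec recorded for each column chunk.
-- Assumes the ParquetMetadata input format is available in this ClickHouse version.
SELECT
    tupleElement(col, 'name') AS column_name,
    tupleElement(col, 'compression') AS compression
FROM file('bucket_logs_2023-11-11.parquet', 'ParquetMetadata')
ARRAY JOIN columns AS col
```

If the setting took effect, the compression column should no longer report lz4.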

Change-Id: Id63b14c82e7bf4b9907a500528b569a51e277751
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10008
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-11-11 19:49:13 +00:00
default.nix fix(users/flokli/archeology): make clickhouse use ambient AWS creds 2023-11-11 12:24:23 +00:00
OWNERS feat(users/flokli/archeology): init parse-bucket-logs 2023-11-11 12:24:23 +00:00
parse_bucket_logs.rs fix(users/flokli/archaeology): don't use file but column compression 2023-11-11 19:49:13 +00:00
README.md feat(users/flokli/archeology): init parse-bucket-logs 2023-11-11 12:24:23 +00:00

archeology

This directory contains various scripts and helpers used for nix-archeology tasks.

It's used by some of the archeology instances, and can also be used standalone.