14 commits

Author SHA1 Message Date
Simon Hauser
1515a970be feat(tvix/tracing): http propagation for axum
It introduces a new accept_trace function for axum 0.7 which can be used
to accept a header trace from a received request. This function can be
used for tonic 0.12 once that version is released, and the specific
`accept_trace` function within `tvix_tracing::propagate::tonic` can then
be removed.

This also integrates http propagation into the nar_bridge crate.

Change-Id: I46dcc797d494bb3977c2633753e7060d88d29129
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11925
Reviewed-by: Brian Olsen <me@griff.name>
Tested-by: BuildkiteCI
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Reviewed-by: flokli <flokli@flokli.de>
2024-07-21 05:45:19 +00:00
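
To illustrate what accepting a trace from a received request involves: the header being propagated is the W3C `traceparent` header (named in the reqwest commit below). The following is a minimal, std-only sketch of parsing that header, not the `accept_trace` implementation from the commit above. The `TraceParent` struct and `parse_traceparent` function are hypothetical names chosen for this example.

```rust
/// Parsed fields of a W3C `traceparent` header (version 00).
/// Hypothetical type for illustration; not part of tvix-tracing.
#[derive(Debug, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: u128,
    pub parent_id: u64,
    pub sampled: bool,
}

/// Parse `00-<32 hex>-<16 hex>-<2 hex>`; returns `None` for malformed input.
pub fn parse_traceparent(value: &str) -> Option<TraceParent> {
    let mut parts = value.trim().split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let parent_id = parts.next()?;
    let flags = parts.next()?;

    // Version 00 has exactly four fields.
    if parts.next().is_some() || version != "00" {
        return None;
    }
    if trace_id.len() != 32 || parent_id.len() != 16 || flags.len() != 2 {
        return None;
    }

    // The header uses lowercase hex digits only.
    let is_lower_hex =
        |s: &str| s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !(is_lower_hex(trace_id) && is_lower_hex(parent_id) && is_lower_hex(flags)) {
        return None;
    }

    let trace_id = u128::from_str_radix(trace_id, 16).ok()?;
    let parent_id = u64::from_str_radix(parent_id, 16).ok()?;
    let flags = u8::from_str_radix(flags, 16).ok()?;

    // All-zero ids are invalid.
    if trace_id == 0 || parent_id == 0 {
        return None;
    }

    Some(TraceParent {
        trace_id,
        parent_id,
        // Bit 0 of the flags byte is the "sampled" flag.
        sampled: flags & 0x01 == 1,
    })
}
```

A server-side handler would read the `traceparent` request header, pass its value to a parser like this, and attach the resulting ids as the parent of the span it opens for the request.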
Simon Hauser
618aacaa61 feat(tvix/tracing): http trace propagation
Introduces a helper function within tvix-tracing that returns a reqwest
tracing middleware that will ingest the traceparent if otlp is enabled.

It is feature flagged in tvix-tracing so not every consumer of that
library automatically has reqwest in its dependencies.

Tested using netcat to verify that the `traceparent` header is there if
otlp is enabled and missing if otlp feature is disabled.

Change-Id: I5abccae777b725f5ff7382e3686165383c477a39
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11886
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-07-02 13:43:09 +00:00
Simon Hauser
6a9a4d56a4 feat(tvix/tracing): expose stdout_writer and stderr_writer
Using std::io::{Stdout,Stderr} directly will cause the output to be
clobbered by an active progress bar. To resolve this issue, the exposed
writers should be preferred over `println!` and `eprintln!`.

Change-Id: Ic79465cd4e8b9dad5a138f6b08c5f0de9dcf54a1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11860
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-07-01 13:55:02 +00:00
Florian Klink
89361b2a7f fix(tvix/tracing): make cargo check and clippy happy
In case the otlp feature is not enabled, these generate warnings during
`cargo check`.
Fix by moving some imports into their functions, or using the
fully-qualified name (and one #[allow(unused_mut)])

Change-Id: I5afd89dcd4c772b6002cebdd5d0469932eacfdac
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11873
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-26 06:27:15 +00:00
Simon Hauser
639a00e2ab feat(tvix/tracing): gRPC trace context propagation
This introduces optional helper functions in tvix/tracing for trace
propagation and uses these helpers in `tvix-store`.

The GRPCBlobService, GRPCDirectoryService and GRPCPathInfoService now
accept a generic client, meaning the client can be generated with either
`::new` or `::with_interceptor`.

This was tested and validated by starting a `tvix-store daemon` and
`tvix-store import`.

Change-Id: I4b194483bf09266820104b4b56e4a135dca2b77a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11863
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-06-20 19:21:01 +00:00
Simon Hauser
bd8d74a3ee feat(tvix/tracing): optional progressbar
Disable the progressbar by default and provide an interface for
optionally enabling the progressbar.

Change-Id: I0e31b1957e80cf64a8dcf65c6ceb3713975b8220
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11861
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-20 10:28:54 +00:00
Florian Klink
28b692fd50 feat(tvix/tvix-store): improve progress bars
Don't show an empty spinner for daemon commands.
Move the bar to the right, so the text is better aligned between spinner
progress and bar progress styles.

Generally, push progress bars a bit more down to the place where we can
track progress. This includes adding one in the upload_blob span.

Introduce another progress style template for transfers, which
interprets the counter as bytes (not just a plain integer), and also shows
a data rate. Use it here and in the fetching code, and also make the
progress bar itself a bit less wide.

Change-Id: I15c2ea3d2b24b5186cec19cd3dbd706638497f40
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11845
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-17 12:57:34 +00:00
Florian Klink
6b6a34065e feat(tvix/tracing): add tracing-tracy support
This introduces another feature flag, "tracy" to the `tvix-tracing` crate.

If enabled (not enabled by default), it'll add an additional layer
emitting packets in a format that https://github.com/wolfpld/tracy can
display.

I had to be a bit tricky with the combinatorial complexity when adding
this, but the resulting code still seems manageable.

Change-Id: Ica824496728fa276ceae3f7a9754be0166e6558f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10952
Tested-by: BuildkiteCI
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Reviewed-by: flokli <flokli@flokli.de>
2024-06-14 19:33:44 +00:00
Florian Klink
d25f047b9d refactor(tvix/tracing): move otlp setup into helper function
Having all this in the main control flow makes it a bit hard to read.
Moving it into a helper function makes it a bit cleaner.

Change-Id: Ibdb739dbd1e013b4f8c4aaf9b036a6bd556a1871
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11814
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Simon Hauser <simon.hauser@helsinki-systems.de>
Tested-by: BuildkiteCI
2024-06-14 14:52:19 +00:00
Simon Hauser
118e69d81d fix(tvix/tracing): reduce the error logs of otlp if collector is offline
The problem is that the opentelemetry_otlp tonic batch exporter tries to
export when either the `scheduled_delay` has elapsed or the
`max_export_batch_size` is reached. By default the
`max_export_batch_size` is set to 512 spans, which means that we try to
export these spans once that counter is reached. Each export will then
try to connect to the exporter (if that is not already happening) and will
result in a `tcp connect error`.
Increasing the max_export_batch_size to 4096 will then ensure that the
export only happens once the `scheduled_delay` of 10 seconds has
elapsed.
`max_queue_size` is also increased, because `max_export_batch_size`
should not be greater than `max_queue_size`, so similar to the default
config it's set to `max_export_batch_size * 4`.
This will reduce the number of connection attempts if the collector is
not available and otlp is enabled.

Change-Id: Ic3430006e8a104fa3b34d274678cae55b3620ce9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11791
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-14 10:20:31 +00:00
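
The two export triggers described in the commit above can be modelled in a few lines. This is a simplified sketch of the decision only, not the opentelemetry_otlp implementation: `BatchConfig`, `with_batch_size` and `should_export` are hypothetical names, and the numbers (512, 4096, the factor of 4 and the 10 second delay) are the ones quoted in the commit message.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the batch exporter settings named in the commit.
#[derive(Debug, Clone, Copy)]
pub struct BatchConfig {
    pub max_export_batch_size: usize,
    pub max_queue_size: usize,
    pub scheduled_delay: Duration,
}

impl BatchConfig {
    /// Build a config where the queue is four times the batch size,
    /// mirroring the `max_export_batch_size * 4` rule from the commit.
    pub fn with_batch_size(max_export_batch_size: usize) -> Self {
        BatchConfig {
            max_export_batch_size,
            max_queue_size: max_export_batch_size * 4,
            // 10 seconds, as stated in the commit message.
            scheduled_delay: Duration::from_secs(10),
        }
    }
}

/// Simplified model: an export is attempted when the batch is full
/// or when the scheduled delay has elapsed since the last export.
pub fn should_export(cfg: &BatchConfig, queued: usize, since_last_export: Duration) -> bool {
    queued >= cfg.max_export_batch_size || since_last_export >= cfg.scheduled_delay
}
```

With a batch size of 512, a burst of 600 spans triggers an export attempt (and, with the collector offline, a connect error) after only a second; with 4096 the same burst waits for the scheduled delay.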
Simon Hauser
a857a2b978 feat(tvix/tracing): apply EnvFilter to all layers
Currently we apply the EnvFilter only to the stderr output writer.
This didn't affect any other layer, like the otlp layer, causing spans
from `h2`, `tokio_util` or other third party crate dependencies to be
always sent out via OTLP.

This changes that behaviour, applying EnvFilter to all exports, leading
to far fewer spans being exported.

Change-Id: I9f3a7233e9d0aeaa81fe08914579f0b3c80d134e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11813
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-14 10:18:59 +00:00
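
The difference between filtering one output and filtering all of them, as described in the commit above, can be shown with a toy model. This sketch does not use tracing-subscriber; `Event`, `allowed` and `dispatch` are hypothetical names, and the prefix match is only a rough stand-in for what an EnvFilter directive does.

```rust
/// A toy event: the emitting module path and a message.
pub struct Event<'a> {
    pub target: &'a str,
    pub message: &'a str,
}

/// True if `target` is one of the allowed crates or a module inside one.
pub fn allowed(target: &str, allowed_prefixes: &[&str]) -> bool {
    allowed_prefixes
        .iter()
        .any(|p| target == *p || target.starts_with(&format!("{p}::")))
}

/// Fan events out to two sinks and return `(stderr, otlp)` messages.
/// With `filter_all == false` only stderr is filtered (the old behaviour);
/// with `filter_all == true` the filter applies to both sinks.
pub fn dispatch<'a>(
    events: &[Event<'a>],
    allowed_prefixes: &[&str],
    filter_all: bool,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut stderr = Vec::new();
    let mut otlp = Vec::new();
    for e in events {
        let pass = allowed(e.target, allowed_prefixes);
        if pass {
            stderr.push(e.message);
        }
        if pass || !filter_all {
            otlp.push(e.message);
        }
    }
    (stderr, otlp)
}
```

In this model, an event from `h2` reaches the second sink only when `filter_all` is false, which corresponds to the third-party spans the commit stops exporting.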
Simon Hauser
fa7ed39bf4 feat(tvix/tracing): correctly close otlp on exit
Provide a new interface for forcing a flush of otlp traces and use this
interface to shut down otlp prior to exiting tvix-store, either if the
tool was stopped with a SIGTERM or ended regularly.
This also fixes an issue where traces were not even exported if for
example we just imported 10 paths and never even emitted more than 256
traces. The implementation uses an mpsc channel so a flush can be done
without having to wait for it to complete. If you want to wait for a
flush to complete you can provide a oneshot channel which will receive a
message once flushing is complete.

Because of an otlp bug, `force_flush` as well as
`shutdown_tracer_provider` need to be executed using `spawn_blocking`,
otherwise the function will deadlock. See
https://github.com/open-telemetry/opentelemetry-rust/issues/1395#issuecomment-1953280335

Change-Id: I0a828391adfb1f72dc8305f62ced8cba0515847c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11803
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-14 09:34:51 +00:00
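
The flush interface described in the commit above (a channel for fire-and-forget flush requests, plus an optional reply channel for callers that want to wait) follows a common pattern. Below is a minimal sketch of that pattern using only `std::sync::mpsc` and a thread, as a stand-in for the async mpsc and oneshot channels the commit mentions. `FlushRequest` and `spawn_flusher` are hypothetical names, and the `flush` closure stands in for the real otlp flush call.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// A flush request. `done` is the optional completion notification,
/// playing the role of the oneshot channel from the commit message.
pub struct FlushRequest {
    pub done: Option<Sender<()>>,
}

/// Spawn a worker that runs `flush` once per request and, if asked,
/// reports completion. The worker exits when every sender is dropped.
pub fn spawn_flusher<F>(mut flush: F) -> (Sender<FlushRequest>, JoinHandle<()>)
where
    F: FnMut() + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<FlushRequest>();
    let handle = thread::spawn(move || {
        for req in rx {
            flush();
            if let Some(done) = req.done {
                // The caller may have stopped waiting; ignore send errors.
                let _ = done.send(());
            }
        }
    });
    (tx, handle)
}
```

A caller that does not care about completion sends `FlushRequest { done: None }` and moves on. A caller that must wait, for example right before process exit, creates its own channel, passes the sender in `done`, and blocks on the receiver.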
Florian Klink
79bfa931ed feat(tvix/tracing): set release_max_level_debug for tracing
This allows explicitly opting in to get DEBUG-level log lines, by
setting RUST_LOG.

It currently also causes traces to be emitted in all cases, so we might
do some runtime filtering there too, as discussed in cl/11791.

Change-Id: I2865bb06a62465836d63196422f5f734f7165386
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11801
Tested-by: BuildkiteCI
Reviewed-by: aspen <root@gws.fyi>
Autosubmit: flokli <flokli@flokli.de>
2024-06-13 16:18:47 +00:00
Simon Hauser
825d498908 feat(tvix/tracing): introduce common tvix-tracing crate
Introduce a new common crate that contains tracing boilerplate which then
can be used in the cli, tvix-store and tvix-build crates.
It has otlp as an optional feature, which is currently only used by
tvix-store.

Change-Id: I41468ac4d9c65174515d721513b96fea463d6ed2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11758
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Simon Hauser <simon.hauser@helsinki-systems.de>
2024-06-10 16:35:08 +00:00