22 commits

Author SHA1 Message Date
Vincent Ambo
743bee8686 fix(ops/pipelines): Allow steps to run immediately after upload
This fix was recommended by Buildkite and is explained in the comment.

Change-Id: I3f1c1c07cba0b417857d69c021c8af4750d645c4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/4334
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2021-12-15 16:55:03 +00:00
Vincent Ambo
38ec27e834 fix(ops/pipelines): Chunk build pipeline into multiple uploads
The number of jobs in the depot pipeline is reaching the limit of what
the Buildkite backend can handle in a single pipeline upload. Based on
a conversation with their support, my understanding is that this has
to do with internal locking mechanisms at Buildkite.

To work around this, we can instead chunk the pipeline into several
smaller chunks that are uploaded serially.

This commit introduces logic to chunk the pipeline accordingly. The
chunk size chosen is 256 for now (a multiple of our number of agents,
which is useful if we can get builds from the first chunk to start
before the next ones are uploaded).

Note that this chunk size is significantly below even the current
number of targets (~460 as of this commit), but choosing a lower chunk
size might alleviate problems we've been seeing with timeouts during
pipeline uploads.
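
The chunking described above can be sketched roughly as follows. This is
a minimal illustration, not the code from this commit: the name
`chunk_steps` and the placeholder step names are hypothetical, and only
the chunk size of 256 and the figure of about 460 targets come from the
message.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

CHUNK_SIZE = 256  # the chunk size named in the commit message


def chunk_steps(steps: Sequence[T], size: int = CHUNK_SIZE) -> List[List[T]]:
    """Split steps into consecutive chunks of at most `size` items."""
    if size <= 0:
        raise ValueError("chunk size must be positive")
    return [list(steps[i:i + size]) for i in range(0, len(steps), size)]


# Placeholder step names; roughly 460 targets end up in two chunks.
chunks = chunk_steps([f"target-{n}" for n in range(460)])
print([len(chunk) for chunk in chunks])  # [256, 204]
```

Each chunk would then be handed to its own upload, one after the other.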

Change-Id: I77030aaf8b874c330218b78c77d15216e13b9af7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/4332
Tested-by: BuildkiteCI
Reviewed-by: wpcarro <wpcarro@gmail.com>
Autosubmit: tazjin <mail@tazj.in>
2021-12-15 15:49:40 +00:00
sterni
9f22b4f1c8 docs(ops/pipelines/depot): correct comment about fallback build cmd
We can gcroot the derivation files and drop this step, but have
elected not to do so for the moment; see cl/3436.

Change-Id: I993a1f3921e9f21e18fa260e76d3dd15ffa556bd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/4327
Tested-by: BuildkiteCI
Autosubmit: sterni <sternenseemann@systemli.org>
Reviewed-by: tazjin <mail@tazj.in>
2021-12-14 17:02:29 +00:00
Vincent Ambo
fc14c21bb9 fix(ops/pipelines): Move to static pipeline
This step would get inserted at the wrong point in the build pipeline
otherwise, causing a dependency cycle and causing the pipeline to fail.

Change-Id: I534568eec77f74ae6c47276820f8a9e99493a3ea
2021-12-10 11:01:21 +03:00
Vincent Ambo
e4231c9816 refactor(ops/pipelines): Move 🦆 logic into static pipeline
This simplifies the fallback logic used in case of Nix evaluation
failure and makes it so that the evaluation step itself is the one
that is marked as failed in Buildkite.

This is possible because the pipeline upload command will insert new
steps at the point where it runs in the pipeline, and not later.

Change-Id: I870534c004ebc457a1602623c4e5f9c0c68e28fc
2021-12-10 07:55:34 +00:00
Vincent Ambo
6edfdd0773 refactor(ops/pipelines): Query build status from Buildkite API
Instead of manually tracking the build status through Buildkite
metadata, use the Buildkite GraphQL API in the `🦆` build
step (i.e. the one that determines the status of the entire pipeline
to be reported back to Gerrit) to fetch the number of failed jobs.

This way we have less manual state accounting in the pipeline.

The downside is that the GraphQL query embedded here is a little hard
to read.

Notes:

  * This needs an access token for Buildkite. We already have one for
    besadii which is also run by the agents, so I've given it GraphQL
    permissions and reused it.

  * I almost introduced a very rare bug here: My initial intuition was
    to simply `exit $FAILED_JOBS` - in the extremely rare case where
    `$FAILED_JOBS % 256 = 0` this would mean we would ... fail to fail
    the build :)
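
The near-miss in the second note can be illustrated with a small model.
This is a sketch only: `reported_status` imitates how a POSIX parent
process sees just the low 8 bits of an exit code, and `exit_code_for` is
a hypothetical helper, not the logic used in the pipeline.

```python
def reported_status(code: int) -> int:
    """Model a POSIX exit status: only the low 8 bits reach the parent."""
    return code & 0xFF


def exit_code_for(failed_jobs: int) -> int:
    """Map any positive failure count to 1 so it cannot wrap around to 0."""
    return 1 if failed_jobs > 0 else 0


for failed in (0, 3, 256):
    naive = reported_status(failed)               # like `exit $FAILED_JOBS`
    safe = reported_status(exit_code_for(failed))
    print(f"failed={failed} naive={naive} safe={safe}")
```

With 256 failed jobs the naive status is 0 (success), while the clamped
one is 1.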

Change-Id: I61976b11b591d722494d3010a362b544efe2cb25
2021-11-29 23:38:24 +03:00
Vincent Ambo
4b33401a36 refactor(ops/pipelines): Move revision tagging into static pipeline
This makes the revision number available much earlier (before the rest
of the pipeline runs, while Nix eval is happening) which should only
be a few seconds after a commit to canon.

It is also more readable in this shape.

Change-Id: Iccbb17dfef6afe68f54fda41e8d10c4dc52b08c2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3775
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
2021-11-05 14:24:53 +00:00
Vincent Ambo
00ae396eeb feat(ops/pipelines): Create revision numbers in CI
This automatically pushes a new ref at refs/r/$revision to Gerrit
whenever a CI run completes on canon.

Revision numbers can be fetched from Gerrit with this command:

    git fetch gerrit "refs/r/*:refs/r/*"

Note that this build step requires credentials to be provisioned on
the CI runner machine.

Change-Id: I37bb14346832f891240aa47bb55affaace3d5f21
2021-11-04 15:16:08 +01:00
Vincent Ambo
43269730e6 refactor(ops/pipelines): Move failure status zeroing to setup
We changed the configured pipeline in Buildkite to upload
`static-pipeline.yaml` instead of containing the steps of that
pipeline itself.

This makes it easier to test changes to builds and such, but adds
another build step with scheduling overhead etc.

However - we can work around this by killing one of the existing build
steps. There's no reason the failure status zeroing (required for
status reporting) shouldn't be part of the pipeline setup, so I've
moved it there instead and nuked that step.

This should mean that the pipeline is configurable from within the
repo, but without slowing anything down.

Change-Id: I206ecc02647de42a461e33c02879ab84daf5ed2b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3461
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2021-08-29 12:37:04 +00:00
Vincent Ambo
60b25b49de fix(ops/pipelines/depot): Buildkite branches use full ref names
... otherwise the filtering also applies to canon.

Change-Id: Ia1c67b99282fb8fd0e4d22e997535170f0326e33
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3432
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: grfn <grfn@gws.fyi>
Tested-by: BuildkiteCI
2021-08-26 16:35:44 +00:00
Vincent Ambo
d5ddfb7b96 feat(pipelines/depot): Skip build steps if their out paths exist
Skip build steps if they have already been built, reducing pipelines
to the things that actually changed between builds. On canon all
targets are always built (we require this for anchoring).

Note that this is not perfect: garbage collection and competing
pipelines may affect each other.

Also note that we have some impure targets that change on every
commit.
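
The skip decision described above could look roughly like this. It is a
sketch under assumptions: `should_skip` and its parameters are
hypothetical names, and the real check lives in the generated pipeline
rather than in Python.

```python
import os
from typing import Iterable


def should_skip(out_paths: Iterable[str], on_canon: bool) -> bool:
    """Skip a step only off canon, and only if every output path exists."""
    if on_canon:
        return False  # canon always builds all targets
    paths = list(out_paths)
    # An empty list gives no evidence of a previous build, so do not skip.
    return bool(paths) and all(os.path.exists(path) for path in paths)
```

As the message notes, a path that exists at check time may still be
garbage collected afterwards, so this is a best-effort optimisation.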

Change-Id: Ic6bae3b6c8e1e7fd2116ec252f5089f471854ab6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3427
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: grfn <grfn@gws.fyi>
2021-08-26 16:29:32 +00:00
sterni
17d78867bb feat(ops/pipelines/depot): only evaluate once if possible
We currently evaluate every target twice -- once when the depot pipeline
is built and once when actually running the build step in question. Nix
evaluation is quite slow, especially given the heavy use of import from
derivation in depot, so avoiding the second evaluation is desirable.

Evaluating a derivation yields a `drv` file in the Nix store, which can
be passed to `nix-store --realise` in order to build it, eliminating the
need to wait for evaluation. We can obtain the path to the `drv` file
while building the pipeline via `target.drvPath` and remember it for the
build later.

However, we need to work around a flaw (or oversight) in Nix's dependency
tracking via string context: it is based on derivations, not output
paths (likely because this is what evaluation deals with). This is no
problem per se, but Nix can't express a dependency on a `drv` file
without any of its output paths. For us this means that we either have
to build all output paths at evaluation time (which we obviously don't
want), or deal with the fact that the `drv` file we need may be garbage
collected at any moment after discarding the string context, because
Nix is then unable to track the reference from the pipeline to the
`drv` file in the store.

So to prevent a race condition between the pipeline and the garbage
collector we fall back to the normal nix-build invocation as we did
before.
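
The "realise first, fall back to nix-build" order can be sketched as
below. This is an illustration under assumptions, not the generated
build step: the helper names are hypothetical, and the fallback is shown
as a plain `nix-build -A` call for simplicity.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def build_attempts(drv_path: str, attr_path: str) -> List[Command]:
    """Commands to try in order: the remembered drv file, then re-evaluation."""
    return [
        ["nix-store", "--realise", drv_path],  # fast path, no evaluation
        ["nix-build", "-A", attr_path],        # fallback if the drv is gone
    ]


def run_first_success(
    attempts: Sequence[Command],
    run: Callable[[Command], int] = subprocess.call,
) -> int:
    """Run commands until one exits with 0; return the last exit status."""
    status = 1
    for command in attempts:
        status = run(command)
        if status == 0:
            break
    return status
```

Trying the fast path and reacting to its failure avoids checking for the
`drv` file up front, which would itself race with the garbage collector.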

Change-Id: I9ef8bd233085dc6e30eba54f403ea03ac2d35748
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3426
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
2021-08-26 15:24:33 +00:00
sterni
0e6ac814c6 feat(ops/pipelines): pass --show-trace to nix-build
--show-trace should make it easier to debug tricky evaluation errors
without running nix-build -A ops.pipelines.depot locally again.

Change-Id: Ice540562c3b389fc2a49ec1fc0adacb17db2a528
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2947
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
2021-04-12 15:50:43 +00:00
Vincent Ambo
9073ac18c4 fix(pipelines/depot): Buildkite refers to branches by full ref
This change is required to run the  step on canon builds.

Change-Id: Ib3cebac67c9f5337b27a948f120b0a9ba834ef2a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2932
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: glittershark <grfn@gws.fyi>
2021-04-11 21:18:59 +00:00
Vincent Ambo
d7b89df748 feat(ops/pipelines): Add gcroots for depot builds on canon
Adds a conditional build step that only runs on the canon branch, and
only if 🦆 (the status reporting step) succeeds, which creates a
new Nix GC root for all depot targets named `depot-canon`.

In practice this might be a bit racy, as canon builds are not
guaranteed to succeed in order (though it is likely). This shouldn't
matter much: we only want to prevent rebuilds of the whole world.

This fixes b/102

Change-Id: Id3d0bf4158bffcb1ed6929888a29d31609b6ece1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2904
Tested-by: BuildkiteCI
Reviewed-by: glittershark <grfn@gws.fyi>
2021-04-11 20:09:53 +00:00
Vincent Ambo
9c482d6238 feat(ci): Add subtarget support for builds
We have naturally evolved a distinction between logical and physical
targets.

Physical targets are those which correspond directly to a tree
location on disk and can be built with `-A path.to.files`, while
logical targets are those that are exported from within an expression
but do not have a corresponding file on disk.

This change adds support for exporting logical targets from any tree
location by adding a `meta.targets` attribute containing keys into
itself, which will be consumed by the CI target gathering logic and
included in the generated pipeline.

Note that the labels for subtargets are syntactically different to
emphasise that they do not correspond to a file location. For example,
this change enables 'ops.nixos.whitbySystem' as a subtarget, which is
labeled in CI as `ops/nixos:whitbySystem`.
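
The labelling scheme can be sketched as a small function. The name
`target_label` is hypothetical; only the example input and output are
taken from the message above.

```python
from typing import Optional, Sequence


def target_label(path: Sequence[str], subtarget: Optional[str] = None) -> str:
    """Join the tree location with '/', and append ':<subtarget>' if present."""
    label = "/".join(path)
    return f"{label}:{subtarget}" if subtarget else label


print(target_label(["ops", "nixos"]))                  # ops/nixos
print(target_label(["ops", "nixos"], "whitbySystem"))  # ops/nixos:whitbySystem
```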

Change-Id: Ied09647a62c2ba98e3914548e3742ad422c63ecf
Reviewed-on: https://cl.tvl.fyi/c/depot/+/1893
Tested-by: BuildkiteCI
Reviewed-by: glittershark <grfn@gws.fyi>
2020-08-31 23:14:11 +00:00
Vincent Ambo
61d2d2d503 feat(ops/pipelines): Dynamically generate CI pipeline from targets
Create the pipeline by outputting a file that contains nix-build
invocations for each target's *derivation path*.

Each invocation has a generated Nix expression passed to it with `-E`
which fetches the correct target from the tree while correctly
handling targets with strange characters (such as in Go-packages).
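
One way to build such an expression is to quote every path component as
a Nix string, so that names containing dots or other unusual characters
stay a single attribute. This is a sketch, not the generator from this
commit: the helper names are hypothetical, `(import ./. {})` is an
assumed entry point, and only the common escapes are handled.

```python
from typing import Sequence


def nix_string(value: str) -> str:
    """Quote value as a Nix double-quoted string (backslash, quote, ${ only)."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("${", "\\${")
    )
    return f'"{escaped}"'


def target_expression(path: Sequence[str]) -> str:
    """Build an expression that selects `path`, quoting each attribute name."""
    attrs = ".".join(nix_string(part) for part in path)
    return f"(import ./. {{}}).{attrs}"


print(target_expression(["third_party", "gopkgs", "github.com"]))
```

The resulting string is what would be passed to `nix-build -E`.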

This makes it possible to run target-level granular pipelines. We're
getting somewhere!

Change-Id: Ia6946e389dafd1d4926130bb8891446d6e17133b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/1855
Tested-by: BuildkiteCI
Reviewed-by: glittershark <grfn@gws.fyi>
Reviewed-by: lukegb <lukegb@tvl.fyi>
2020-08-31 23:14:11 +00:00
Vincent Ambo
4ff9d5dee8 feat: Implement automatic CI target detection for the depot
Automatically walk the entire depot tree and pick out things that are
"buildable", then include them in the attribute `ci.targets` (which is
now also the target for CI builds).

A long time ago, in a land far away, we (well, I, at the time) had a
prototype of this which ran into constant issues with infinite
recursions while trying to walk the tree. In fact, this is why
readTree originally gained the `__readTree`-attribute which marks
things that were imported automatically.

Based on some code edef whipped up earlier (with the breakthrough
being that we also add the attribute to top-level folders, which
suddenly resolves a whole bunch of problems), I've now implemented
this actually working version.
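
The idea of recursing only into marked nodes can be modelled with nested
dictionaries. This is a loose Python analogy, not the Nix implementation:
`gather` and the sample tree are hypothetical, and derivations are
represented as dicts with `type` set to `derivation`.

```python
from typing import Any, Dict, List, Tuple

Tree = Dict[str, Any]


def is_derivation(value: Any) -> bool:
    """Treat a dict tagged with type 'derivation' as a buildable target."""
    return isinstance(value, dict) and value.get("type") == "derivation"


def gather(tree: Tree, prefix: Tuple[str, ...] = ()) -> List[str]:
    """Collect dotted paths of derivations, descending only into marked nodes."""
    targets: List[str] = []
    for name, value in sorted(tree.items()):
        if name == "__readTree":
            continue  # the marker itself is not a target
        path = prefix + (name,)
        if is_derivation(value):
            targets.append(".".join(path))
        elif isinstance(value, dict) and value.get("__readTree"):
            targets.extend(gather(value, path))
    return targets


depot = {
    "__readTree": True,
    "ops": {
        "__readTree": True,
        "besadii": {"type": "derivation"},
        "lib": {"helper": {"type": "derivation"}},  # unmarked, not walked
    },
    "web": {"__readTree": True, "blog": {"type": "derivation"}},
}
print(gather(depot))  # ['ops.besadii', 'web.blog']
```

Because unmarked attribute sets are never entered, arbitrary values in
the tree cannot drag the walk into unbounded recursion.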

At the moment all builds still happen as one big bag of builds, but at
some point we will granularise this.

Change-Id: I86f12ce7f63dae98e7e5c6646a4e9d220de783f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/1854
Tested-by: BuildkiteCI
Reviewed-by: kanepyork <rikingcoding@gmail.com>
Reviewed-by: glittershark <grfn@gws.fyi>
2020-08-26 23:49:32 +00:00
Kane York
4dd236be53 feat(ci): run buf check lint in CI
Breaking change detection will run, but is not enforced.

The water buffalo emoji was chosen by fiat of @pedge in the bufbuild Slack.

Change-Id: Ie292f2bfddc0e3bc512e4a138c0b5d0fa2603bad
Reviewed-on: https://cl.tvl.fyi/c/depot/+/1247
Tested-by: BuildkiteCI
Reviewed-by: tazjin <mail@tazj.in>
Reviewed-by: glittershark <grfn@gws.fyi>
2020-07-17 22:55:13 +00:00
Griffin Smith
93d1ab7a54 feat(pipelines/depot): Run with --show-trace
This way we get a stack trace if an evaluation fails.

Change-Id: I54cdc9e93c765ef7cf3a4d0cd79e6d067f4789d3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/733
2020-06-29 00:38:32 +00:00
Vincent Ambo
15335e7b3d feat(besadii): Enable automatic builds for CLs
This expands builds to also be triggered for updates to CL refs.

The message displayed on Buildkite will contain a link back to the
CL (& patchset) from which the build was triggered.

Change-Id: Ib36dee454aeb11d623b89c78b384359ee7ea3477
Reviewed-on: https://cl.tvl.fyi/c/depot/+/708
Reviewed-by: ericvolp12 <ericvolp12@gmail.com>
Reviewed-by: isomer <isomer@tvl.fyi>
2020-06-28 17:56:10 +00:00
Vincent Ambo
22b8a49b87 feat(ops/pipelines): Add Buildkite pipeline configuration
This adds configuration which generates the structure expected for
Buildkite pipelines, which can then be dynamically ingested by
Buildkite when a pipeline is triggered.

Change-Id: I61e3dc3affb19c1f2550ef827fa73b17f8d8ae47
Reviewed-on: https://cl.tvl.fyi/c/depot/+/627
Reviewed-by: ericvolp12 <ericvolp12@gmail.com>
Reviewed-by: lukegb <lukegb@tvl.fyi>
2020-06-27 16:55:18 +00:00