Previously, background contexts were created where necessary (e.g. in
GCS interactions). Should I begin to use request timeouts or other
context-dependent things in the future, it's useful to have the actual
HTTP request context around.
This threads the request context through the application to all places
that need it.
The point at which files are moved happens to also (initially) be the
point where the `layers` directory is created. For this reason
renaming must ensure that all path components exist, which this commit
takes care of.
The filesystem storage backend can be enabled by setting
`NIXERY_STORAGE_BACKEND` to `filesystem` and `STORAGE_PATH` to a disk
location from which Nixery can serve files.
This abstracts over the functionality of Google Cloud Storage and
other potential underlying storage backends to make it possible to
replace these in Nixery.
The GCS backend is not yet reimplemented.
The JSON file generated for service account keys already contains the
required information for signing URLs in GCS, thus the environment
variables for toggling signing behaviour have been removed.
Signing is now enabled automatically in the presence of service
account credentials (i.e. `GOOGLE_APPLICATION_CREDENTIALS`).
Some Nix download mechanisms will add a second hash in the store path,
which had been added to the source hash output (breaking argument
interpolation).
Instead of compressing & decompressing again to get the underlying tar
hash, use a similar mechanism as for store path layers for the symlink
layer and only compress it once while uploading.
Docker expects hashes of compressed tarballs in the manifest (as these
are used to fetch from the content-addressable layer store), but for
some reason it expects hashes in the configuration layer to be of
uncompressed tarballs.
To achieve this an additional SHA256 hash is calculated while
creating the layer tarballs, but before passing them to the gzip
writer.
In the current setup the symlink layer is first compressed and
then decompressed again to calculate its hash. This can be refactored
in a future change.
This has become an issue recently with changes such as GZIP
compression, where CI runs no longer work because they conflict with
the production bucket for the public instance.
Makes use of the `.WithError` and `.WithField` convenience functions
in logrus to simplify log statement construction.
This has the added benefit of making it easier to correctly log
errors.
This rewrites all existing log statements into the structured logrus
format. For consistency, all errors are always logged separately from
the primary message in a field called `error`.
Only the "info", "error" and "warn" severities are used.
Uses a hash of Nixery's sources as the version displayed when Nixery
launches or logs an error. This makes it possible to distinguish
between errors logged from different versions.
The source hashes should be reproducible between different checkouts
of the same source tree.
This formatter has basic support for the Stackdriver Error Reporting
format, but several things are still lacking:
* the service version (preferably git commit?) needs to be included in
  the server somehow
* log streams should be split between stdout/stderr as that is how
  AppEngine (and several other GCP services?) seemingly differentiate
  between info/error logs
These two packages almost always end up being required by programs,
but people don't necessarily consider them.
They will now always be added and their popularity is artificially
inflated to ensure they end up at the top of the layer list.
Image layers in manifests are now sorted in a stable (descending)
order based on their merge rating, meaning that layers more likely to
be shared between images come first.
The reason for this change is Docker's handling of image layers on
overlay2: Images are condensed into a single representation on disk
after downloading.
Due to this Docker will constantly redownload all layers that are
applied in a different order in different images (layer order matters
in imperatively created images), based on something it calls the
'ChainID'.
Sorting the layers this way raises the likelihood of a long chain of
matching layers at the beginning of an image.
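
A minimal sketch of such a sort, assuming a `layer` type with a `MergeRating` field (both names are illustrative, not taken from the code). `sort.SliceStable` keeps layers with equal ratings in their original relative order, so the result is deterministic for a given input.

```go
package main

import (
	"fmt"
	"sort"
)

// layer is a hypothetical representation of an image layer.
type layer struct {
	Digest      string
	MergeRating uint64
}

// sortLayers orders layers by descending merge rating, keeping the
// relative order of layers with equal ratings.
func sortLayers(layers []layer) {
	sort.SliceStable(layers, func(i, j int) bool {
		return layers[i].MergeRating > layers[j].MergeRating
	})
}

func main() {
	layers := []layer{
		{"a", 10},
		{"b", 500},
		{"c", 10},
		{"d", 90},
	}
	sortLayers(layers)
	for _, l := range layers {
		fmt.Println(l.Digest, l.MergeRating)
	}
}
```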
This relates to #39.
Implements a local manifest cache that uses the temporary directory to
cache manifest builds.
This is necessary due to the size of manifests: Keeping them entirely
in-memory would quickly balloon the memory usage of Nixery, unless
some mechanism for cache eviction is implemented.
The functions used for layer creation are now easier to follow and
have clear points at which the layer cache is checked and populated.
This relates to #50.
MD5 hash checking is no longer performed by Nixery (it does not seem
to be necessary), hence the layer cache now only keeps the SHA256 hash
and size in the form of the manifest entry.
This makes it possible to restructure the builder code to perform
cache-fetching and cache-populating for layers in the same place.
The new builder now caches and reads cached manifests to/from GCS. The
in-memory cache is disabled, as manifests are no longer written to a
local file and the caching of file paths does not work (unless we
reintroduce reading/writing from temp files as part of the local
cache).
This cache is no longer required: the layer cache (mapping store path
hashes to layer hashes) already implies that a layer has been seen.
Implements the new build process to the point where it can actually
construct and serve image manifests.
It is worth noting that this build process works even if the Nix
sandbox is enabled!
It is also worth noting that none of the caching functionality that
the new build process enables (such as per-layer build caching) is
actually in use yet, hence running Nixery at this commit is prone to
doing more work than previously.
This relates to #50.
The new manifest package creates image manifests and their
configuration. This previously happened in Nix, but is now part of the
server's workload.
This relates to #50.
The new build process can now call out to Nix to create layers and
upload them to the bucket if necessary.
The layer cache is populated, but not yet used.