Setting the environment variable NIX_COUNT_CALLS to 1 enables some
basic profiling in the evaluator. It will count calls to functions
and primops as well as evaluations of attributes.
For example, to see where evaluation of a NixOS configuration spends
its time:
$ NIX_SHOW_STATS=1 NIX_COUNT_CALLS=1 ./src/nix-instantiate/nix-instantiate '<nixos>' -A system --readonly-mode
...
calls to 39 primops:
239532 head
233962 tail
191252 hasAttr
...
calls to 1595 functions:
224157 `/nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs/pkgs/lib/lists.nix:17:19'
221767 `/nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs/pkgs/lib/lists.nix:17:14'
221767 `/nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs/pkgs/lib/lists.nix:17:10'
...
evaluations of 7088 attributes:
167377 undefined position
132459 `/nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs/pkgs/lib/attrsets.nix:119:41'
47322 `/nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs/pkgs/lib/attrsets.nix:13:21'
...
This is a problem because one process may set the immutable bit before
the second process has created its link.
Addressed random Hydra failures such as:
error: cannot rename `/nix/store/.tmp-link-17397-1804289383' to
`/nix/store/rsvzm574rlfip3830ac7kmaa028bzl6h-nixos-0.1pre-git/upstart-interface-version':
Operation not permitted
Channels are implemented using a profile now, and profiles contain a
manifest.nix file. This should be ignored to prevent bogus packages
from showing up in nix-env.
Since SubstitutionGoal::finished() in build.cc computes the hash
anyway, we can prevent the inefficiency of computing the hash twice by
letting the substituter tell Nix about the expected hash, which can
then verify it.
The generated attrset has drvPath and outPath with the right string context, type 'derivation', outputName with
the right name, all with a list of outputs, and an attribute for each output.
I see three uses for this (though certainly there may be more):
* Using derivations generated by something besides nix-instantiate (e.g. guix)
* Allowing packages provided by channels to be used in nix expressions. If a channel installed a valid deriver
for each package it provides into the store, then those could be imported and used as dependencies or installed
in environment.systemPackages, for example.
* Enable hydra to be consistent in how it treats inputs that are outputs of another build. Right now, if an
input is passed as an argument to the job, it is passed as a derivation, but if it is accessed via NIX_PATH
(i.e. through the <> syntax), then it is a path that can be imported. This is problematic because the build
being depended upon may have been built with non-obvious arguments passed to its jobset file. With this
feature, hydra can just set the name of that input to the path to its drv file in NIX_PATH
Incremental optimisation requires creating links in /nix/store/.links
to all files in the store. However, this means that if we delete a
store path, no files are actually deleted because links in
/nix/store/.links still exists. So we need to check /nix/store/.links
for files with a link count of 1 and delete them.
optimiseStore() now creates persistent, content-addressed hard links
in /nix/store/.links. For instance, if it encounters a file P with
hash H, it will create a hard link
P' = /nix/store/.link/<H>
to P if P' doesn't already exist; if P' exist, then P is replaced by a
hard link to P'. This is better than the previous in-memory map,
because it had the tendency to unnecessarily replace hard links with a
hard link to whatever happened to be the first file with a given hash
it encountered. It also allows on-the-fly, incremental optimisation.
To implement binary caches efficiently, Hydra needs to be able to map
the hash part of a store path (e.g. "gbg...zr7") to the full store
path (e.g. "/nix/store/gbg...kzr7-subversion-1.7.5"). (The binary
cache mechanism uses hash parts as a key for looking up store paths to
ensure privacy.) However, doing a search in the Nix store for
/nix/store/<hash>* is expensive since it requires reading the entire
directory. queryPathFromHashPart() prevents this by doing a cheap
database lookup.
queryValidPaths() combines multiple calls to isValidPath() in one.
This matters when using the Nix daemon because it reduces latency.
For instance, on "nix-env -qas \*" it reduces execution time from 5.7s
to 4.7s (which is indistinguishable from the non-daemon case).
Instead make a single call to querySubstitutablePathInfo() per
derivation output. This is faster and prevents having to implement
the "have" function in the binary cache substituter.
Getting substitute information using the binary cache substituter has
non-trivial latency overhead. A package or NixOS system configuration
can have hundreds of dependencies, and in the worst case (when the
local info cache is empty) we have to do a separate HTTP request for
each of these. If the ping time to the server is t, getting N info
files will take tN seconds; e.g., with a ping time of 0.1s to
nixos.org, sequentially downloading 1000 info files (a typical NixOS
config) will take at least 100 seconds.
To fix this problem, the binary cache substituter can now perform
requests in parallel. This required changing the substituter
interface to support a function querySubstitutablePathInfos() that
queries multiple paths at the same time, and rewriting queryMissing()
to take advantage of parallelism. (Due to local caching,
parallelising queryMissing() is sufficient for most use cases, since
it's almost always called before building a derivation and thus fills
the local info cache.)
For example, parallelism speeds up querying all 1056 paths in a
particular NixOS system configuration from 116s to 2.6s. It works so
well because the eccentricity of the top-level derivation in the
dependency graph is only 9. So we only need 10 round-trips (when
using an unlimited number of parallel connections) to get everything.
Currently we do a maximum of 150 parallel connections to the server.
Thus it's important that the binary cache server (e.g. nixos.org) has
a high connection limit. Alternatively we could use HTTP pipelining,
but WWW::Curl doesn't support it and libcurl has a hard-coded limit of
5 requests per pipeline.
In a private PID namespace, processes have PIDs that are separate from
the rest of the system. The initial child gets PID 1. Processes in
the chroot cannot see processes outside of the chroot. This improves
isolation between builds. However, processes on the outside can see
processes in the chroot and send signals to them (if they have
appropriate rights).
Since the builder gets PID 1, it serves as the reaper for zombies in
the chroot. This might turn out to be a problem. In that case we'll
need to have a small PID 1 process that sits in a loop calling wait().
In chroot builds, set the host name to "localhost" and the domain name
to "(none)" (the latter being the kernel's default). This improves
determinism a bit further.
P.S. I have to idea what UTS stands for.
This improves isolation a bit further, and it's just one extra flag in
the unshare() call.
P.S. It would be very cool to use CLONE_NEWPID (to put the builder in
a private PID namespace) as well, but that's slightly more risky since
having a builder start as PID 1 may cause problems.
On Linux it's possible to run a process in its own network namespace,
meaning that it gets its own set of network interfaces, disjunct from
the rest of the system. We use this to completely remove network
access to chroot builds, except that they get a private loopback
interface. This means that:
- Builders cannot connect to the outside network or to other processes
on the same machine, except processes within the same build.
- Vice versa, other processes cannot connect to processes in a chroot
build, and open ports/connections do not show up in "netstat".
- If two concurrent builders try to listen on the same port (e.g. as
part of a test), they no longer conflict with each other.
This was inspired by the "PrivateNetwork" flag in systemd.
Systemd can start the Nix daemon on demand when the Nix daemon socket
is first accessed. This is signalled through the LISTEN_FDS
environment variable, so all we need to do is check for that and then
use file descriptor 3 as the listen socket instead of creating one
ourselves.
We can't open a SQLite database if the disk is full. Since this
prevents the garbage collector from running when it's most needed, we
reserve some dummy space that we can free just before doing a garbage
collection. This actually revives some old code from the Berkeley DB
days.
Fixes#27.
There is a race condition when doing parallel builds with chroots and
the immutable bit enabled. One process may call makeImmutable()
before the other has called link(), in which case link() will fail
with EPERM. We could retry or wrap the operation in a lock, but since
this condition is rare and I'm lazy, we just use the existing copy
fallback.
Fixes#9.
This should fix rare Hydra errors of the form:
error: symlinking `/nix/var/nix/gcroots/per-user/hydra/hydra-roots/7sfhs5fdmjxm8sqgcpd0pgcsmz1kq0l0-nixos-iso-0.1pre33785-33795' to `/nix/store/7sfhs5fdmjxm8sqgcpd0pgcsmz1kq0l0-nixos-iso-0.1pre33785-33795': File exists
Setting the UNAME26 personality causes "uname" to return "2.6.x",
regardless of the kernel version. This improves determinism in
a few misbehaved packages.
Make the garbage collector more concurrent by deleting valid paths
outside the region where we're holding the global GC lock. This
should greatly reduce the time during which new builds are blocked,
since the deletion accounts for the vast majority of the time spent in
the GC.
To ensure that this is safe, the valid paths are invalidated and
renamed to some arbitrary path while we're holding the lock. This
ensures that we when we finally delete the path, it's not a (newly)
valid or locked path.
Nix now requires SQLite and bzip2 to be pre-installed. SQLite is
detected using pkg-config. We required DBD::SQLite anyway, so
depending on SQLite is not a big problem.
The --with-bzip2, --with-openssl and --with-sqlite flags are gone.
By moving the destructor object to libstore.so, it's also run when
download-using-manifests and nix-prefetch-url exit. This prevents
them from cluttering /nix/var/nix/temproots with stale files.
Not all SQLite builds have the function sqlite3_table_column_metadata.
We were only using it in a schema upgrade check for compatibility with
databases that were probably never seen in the wild. So remove it.
The variable ‘useChroot’ was not initialised properly. This caused
random failures if using the build hook. Seen on Mac OS X 10.7 with Clang.
Thanks to KolibriFX for finding this :-)
Chroots are initialised by hard-linking inputs from the Nix store to
the chroot. This doesn't work if the input has its immutable bit set,
because it's forbidden to create hard links to immutable files. So
temporarily clear the immutable bit when creating and destroying the
chroot.
Note that making regular files in the Nix store immutable isn't very
reliable, since the bit can easily become cleared: for instance, if we
run the garbage collector after running ‘nix-store --optimise’. So
maybe we should only make directories immutable.
I was bitten one time too many by Python modifying the Nix store by
creating *.pyc files when run as root. On Linux, we can prevent this
by setting the immutable bit on files and directories (as in ‘chattr
+i’). This isn't supported by all filesystems, so it's not an error
if setting the bit fails. The immutable bit is cleared by the garbage
collector before deleting a path. The only tricky aspect is in
optimiseStore(), since it's forbidden to create hard links to an
immutable file. Thus optimiseStore() temporarily clears the immutable
bit before creating the link.