Caching path info is generally useful. For instance, it speeds up "nix
path-info -rS /run/current-system" (i.e. showing the closure sizes of
all paths in the closure of the current system) from 5.6s to 0.15s.
This also eliminates some APIs like Store::queryDeriver() and
Store::queryReferences().
Also, move a few free-standing functions into StoreAPI and Derivation.
Also, introduce a non-nullable smart pointer, ref<T>, which is just a
wrapper around std::shared_ptr ensuring that the pointer is never
null. (For reference-counted values, this is better than passing a
"T&", because the latter doesn't maintain the refcount. Usually, the
caller will have a shared_ptr keeping the value alive, but that's not
always the case, e.g., when passing a reference to a std::thread via
std::bind.)
Especially in WAL mode on a highly loaded machine, this is not a good
idea because it results in a WAL file of approximately the same size
ad the database, which apparently cannot be deleted while anybody is
accessing it.
If a root is a regular file, then its name must denote a store
path. For instance, the existence of the file
/nix/var/nix/gcroots/per-user/eelco/hydra-roots/wzc3cy1wwwd6d0dgxpa77ijr1yp50s6v-libxml2-2.7.7
would cause
/nix/store/wzc3cy1wwwd6d0dgxpa77ijr1yp50s6v-libxml2-2.7.7
to be a root.
This is useful because it involves less I/O (no need for a readlink()
call) and takes up less disk space (the symlink target typically takes
up a full disk block, while directory entries are packed more
efficiently). This is particularly important for hydra.nixos.org,
which has hundreds of thousands of roots, and where reading the roots
can take 25 minutes.
I.e. "nix-store -q --roots" will now show (for example)
/home/eelco/Dev/nixpkgs/result
rather than
/nix/var/nix/gcroots/auto/53222qsppi12s2hkap8dm2lg8xhhyk6v
But this time it's *obviously* correct! No more segfaults due to
infinite recursions for sure, etc.
Also, move directories to /nix/store/trash instead of renaming them to
/nix/store/bla-gc-<pid>. Then we can just delete /nix/store/trash at
the end.
This prevents zillions of derivations from being kept, and fixes an
infinite recursion in the garbage collector (due to an obscure cycle
that can occur with fixed-output derivations).
The outputs of a derivation can refer to each other (even though they
cannot have cycles), so they have to be deleted in the right order.
http://hydra.nixos.org/build/3026118
If the options gc-keep-outputs and gc-keep-derivations are both
enabled, you can get a cycle in the liveness graph. There was a hack
to handle this, but it didn't work with multiple-output derivations,
causing the garbage collector to fail with errors like ‘error: cannot
delete path `...' because it is in use by `...'’. The garbage
collector now handles strongly connected components in the liveness
graph as a unit and decides whether to delete all or none of the paths
in an SCC.