optimiseStore() now creates persistent, content-addressed hard links
in /nix/store/.links. For instance, if it encounters a file P with
hash H, it will create a hard link
P' = /nix/store/.link/<H>
to P if P' doesn't already exist; if P' exist, then P is replaced by a
hard link to P'. This is better than the previous in-memory map,
because it had the tendency to unnecessarily replace hard links with a
hard link to whatever happened to be the first file with a given hash
it encountered. It also allows on-the-fly, incremental optimisation.
To implement binary caches efficiently, Hydra needs to be able to map
the hash part of a store path (e.g. "gbg...zr7") to the full store
path (e.g. "/nix/store/gbg...kzr7-subversion-1.7.5"). (The binary
cache mechanism uses hash parts as a key for looking up store paths to
ensure privacy.) However, doing a search in the Nix store for
/nix/store/<hash>* is expensive since it requires reading the entire
directory. queryPathFromHashPart() prevents this by doing a cheap
database lookup.
We can't open a SQLite database if the disk is full. Since this
prevents the garbage collector from running when it's most needed, we
reserve some dummy space that we can free just before doing a garbage
collection. This actually revives some old code from the Berkeley DB
days.
Fixes#27.
Make the garbage collector more concurrent by deleting valid paths
outside the region where we're holding the global GC lock. This
should greatly reduce the time during which new builds are blocked,
since the deletion accounts for the vast majority of the time spent in
the GC.
To ensure that this is safe, the valid paths are invalidated and
renamed to some arbitrary path while we're holding the lock. This
ensures that we when we finally delete the path, it's not a (newly)
valid or locked path.
derivations added to the store by clients have "correct" output
paths (meaning that the output paths are computed by hashing the
derivation according to a certain algorithm). This means that a
malicious user could craft a special .drv file to build *any*
desired path in the store with any desired contents (so long as the
path doesn't already exist). Then the attacker just needs to wait
for a victim to come along and install the compromised path.
For instance, if Alice (the attacker) knows that the latest Firefox
derivation in Nixpkgs produces the path
/nix/store/1a5nyfd4ajxbyy97r1fslhgrv70gj8a7-firefox-5.0.1
then (provided this path doesn't already exist) she can craft a .drv
file that creates that path (i.e., has it as one of its outputs),
add it to the store using "nix-store --add", and build it with
"nix-store -r". So the fake .drv could write a Trojan to the
Firefox path. Then, if user Bob (the victim) comes along and does
$ nix-env -i firefox
$ firefox
he executes the Trojan injected by Alice.
The fix is to have the Nix daemon verify that derivation outputs are
correct (in addValidPath()). This required some refactoring to move
the hash computation code to libstore.
while checking the contents, since this operation can take a very
long time to finish. Also, fill in missing narSize fields in the DB
while doing this.
* If a path has disappeared, check its referrers first, and don't try
to invalidate paths that have valid referrers. Otherwise we get a
foreign key constraint violation.
* Read the whole Nix store directory instead of statting each valid
path, which is slower.
* Acquire the global GC lock.
doesn't work because the garbage collector doesn't actually look at
locks. So r22253 was stupid. Use addTempRoot() instead. Also,
locking the temporary directory in exportPath() was silly because it
isn't even in the store.
complete set of live and dead paths before starting the actual
deletion, but determines liveness on demand. I.e. for any path in
the store, it first tries to delete all the referrers, and then the
path itself. This means that the collector can start deleting paths
almost immediately.
UTC) rather than 0 (00:00:00). 1 is a better choice because some
programs use 0 as a special value. For instance, the Template
Toolkit uses a timestamp of 0 to denote the non-existence of a file,
so it barfs on files in the Nix store (see
template-toolkit-nix-store.patch in Nixpkgs). Similarly, Maya 2008
fails to load script directories with a timestamp of 0 and can't be
patched because it's closed source.
This will also shut up those "implausibly old time stamp" GNU tar
warnings.
SHA-256 outputs of fixed-output derivations. I.e. they now produce
the same store path:
$ nix-store --add x
/nix/store/j2fq9qxvvxgqymvpszhs773ncci45xsj-x
$ nix-store --add-fixed --recursive sha256 x
/nix/store/j2fq9qxvvxgqymvpszhs773ncci45xsj-x
the latter being the same as the path that a derivation
derivation {
name = "x";
outputHashAlgo = "sha256";
outputHashMode = "recursive";
outputHash = "...";
...
};
produces.
This does change the output path for such fixed-output derivations.
Fortunately they are quite rare. The most common use is fetchsvn
calls with SHA-256 hashes. (There are a handful of those is
Nixpkgs, mostly unstable development packages.)
* Documented the computation of store paths (in store-api.cc).
order of ascending last access time. This is useful in conjunction
with --max-freed or --max-links to prefer deleting non-recently used
garbage, which is good (especially in the build farm) since garbage
may become live again.
The code could easily be modified to accept other criteria for
ordering garbage by changing the comparison operator used by the
priority queue in collectGarbage().
again. (After the previous substituter mechanism refactoring I
didn't update the code that obtains the references of substitutable
paths.) This required some refactoring: the substituter programs
are now kept running and receive/respond to info requests via
stdin/stdout.