image: Refactor to use cas/ref engines instead of walkers by wking · Pull Request #159 · opencontainers/image-spec (original) (raw)
I've pushed c071139 → 7e96d03, rebasing around #212, #229, and #230
(master has been active in the past few days ;). I'm still waiting on
maintainer feedback for:
On Tue, Aug 30, 2016 at 11:08:41PM -0700, W. Trevor King wrote 1:
The main outstanding issue from @stevvooe's recent review is whether I
need to reroll the tar handling to use the filesystem for scratch
space instead of using memory for scratch space [2 and many other
recent comments ;)]. I don't think we need to put in the work to make
tar writing more performant 3, but I'll put in the time if it's a
blocker for landing this PR.…
[2]: #159 (comment)
3: #159 (comment)
I'm expecting something like one of:
a. “We agree that the current PR's tar handling isn't performant, but
that it's not broken for constructing small image-layout tarballs
from scratch, for append-only CAS operations, for new-key ref
operations, or for ‘overwriting the same key with larger serialized
descriptor JSON’ ref operations [2]. That's enough for the
tar-backed engines, and we'll accept it for this PR. We realize
that there is a chance of a corrupted tarball if a writer crashes
(discussed in 469bbb6) and are ok with that because nobody should
be using the tar-writer in production if they aren't prepared to
catch crashed writes and start over from scratch.”,
b. “You need to use temporary files to avoid holding blobs in memory
while you hash their contents and to avoid holding tarballs in
memory while you rewrite them. Using TempFile 3 with the default
TempDir 4 is fine for this. We realize that moving a blob from a
temporary-disk file into a tarball is not an atomic operation, and
that copying from TempDir over the original tar path may require
futher copies [5]. We also realize that this will slow the
read-only use-case, because readers will have to stat or re-open
the file to detect changes made by other process [6]. We still
want you to use temporary files.”, or
c. “Everything from (b), except you cannot rewrite the whole tar file
when appending a new entry. You must seek to before the trailing
two zero-byte records and write into the existing tar file. We
realize that this leaves a window where the content of the tar file
at the original tar path is incomplete, and accept the risk of
corruption if a writer crashes (see (a)) in return for the
efficiency gains from avoiding (b)'s temporary tar file.”.
[2]: Is there really no truncate(2) in Go's high-level file ops? More
in 469bbb6.
[5]: E.g. if TempFile and the original tar path are on different
filesystems.
[6]: Both readers and writers currently seek back to the beginning of
the file without re-opening it.