bug#6131: [PATCH]: fiemap support for efficient sparse file copy (original) (raw)
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
From: | Joel Becker |
---|---|
Subject: | bug#6131: [PATCH]: fiemap support for efficient sparse file copy |
Date: | Tue, 15 Jun 2010 15:10:19 -0700 |
User-agent: | Mutt/1.5.20 (2009-06-14) |
On Tue, Jun 15, 2010 at 02:28:43PM -0700, Paul Eggert wrote:
I looked at the proposed fiemap patch for cp and see another issue with it, which needs to be thought through more carefully.
The proposed fiemap code tries to copy the physical holes in the source file. But currently cp copies the logical holes: that is, it skips over writing (and thus attempts to create physical holes) whenever the source file contains a block of zeros, regardless of whether that block of zeros is physically stored in the file system. The current cp behavior is often the desired behavior (particularly for cp --sparse=always) and the fiemap code shouldn't alter that.
For example, if a fiemapextent has the FIEMAPEXTENTUNWRITTEN flag set, cp should treat that as a hole, because the extent is all zeros. (This will greatly help performance in some cases.) Also, if an input extent is read and a block of it is found to be zeros, cp should skip over that block when writing.
I pretty much agree with everything you wrote above. Fiemap is
introduced into cp(1) to optimize the existing --sparse behavior. So we should be holding to that behavior. In fact, I'll go one further. While I haven't looked at the actual cp code in a little bit, I do not think that fiemap is used for the non-sparse copy. I contend that fiemap should be used for all copies if the capability exists. Even if cp is doing a non-sparse copy, there's no need for it to call read(2) on a 1GB hole in the file. It can write zeros just fine on its own. Basically, these capabilities (fiemap, SEEK_HOLE) are ways to avoid reading zeros we can determine without reading and scanning. We should always take advantage of that.
If the proposed fiemap code was intended to copy the physical holes, then the code wasn't successful in implementing the intent, as there's no guarantee that the extents of the destination are the same as the extents of the source. If it is useful to have 'cp' copy the physical holes, this could be supported via a new cp option or options, but surely the default should continue to be to copy logical holes.
The goal was not physical holes, it was logical ones as you
surmise. I can't see any way to ensure the file has identical extents. I don't know any filesystem that guarantees such allocations on write(2). I do think there might be an interesting performance optimization to be had, but it is independent of fiemap. Above, you describe the usual "cp --sparse" behavior of reading allocated data and checking for zeroes. If you're optimizing for space usage -- what "cp --sparse" is defined to do -- you definitely want to skip writing blocks that are all zero. But it actually creates a crappier allocation and I/O pattern for some files. Imagine you have a data extent that is alternating blocks of zeroes and ones:
hexdump -C /tmp/blk 00000000 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 |................| * 00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00002000 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 |................| * 00003000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00004000 ... (continued for 4MB in this example) 00400000
Classic "cp --sparse" will spend CPU time scanning this extent, which can be megabytes in size. It will then write out every other block, which is terribly slow, and it leavs a horribly fragmented destination file. The result is even more disastrous if those holes ever get filled in. I'm, of course, exaggerating a worst-case here. But folks looking at I/O performance would love detection like "here's a linear MB, sure it has some zero blocks, but it's faster to write the entire MB than to chop it up."
Joel
--
Life's Little Instruction Book #456
"Send your loved one flowers. Think of a reason later."
Joel Becker Principal Software Developer Oracle E-mail: address@hidden Phone: (650) 506-8127
- bug#6131: [PATCH]: fiemap support for efficient sparse file copy, (continued)
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Jim Meyering, 2010/06/15
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Paul Eggert, 2010/06/15
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Sunil Mushran, 2010/06/15
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Joel Becker, 2010/06/15
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Pádraig Brady, 2010/06/15
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Jim Meyering, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Pádraig Brady, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Jim Meyering, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Joel Becker, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Pádraig Brady, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy,Joel Becker <=
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, jeff.liu, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Joel Becker, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Tao Ma, 2010/06/16
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Sunil Mushran, 2010/06/10
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, jeff.liu, 2010/06/11
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Jim Meyering, 2010/06/11
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Jim Meyering, 2010/06/13
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Eric Sandeen, 2010/06/11
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, jeff.liu, 2010/06/12
* bug#6131: [PATCH]: fiemap support for efficient sparse file copy, Paul Eggert, 2010/06/15
- Prev by Date:bug#6131: [PATCH]: fiemap support for efficient sparse file copy
- Next by Date:bug#6131: [PATCH]: fiemap support for efficient sparse file copy
- Previous by thread:bug#6131: [PATCH]: fiemap support for efficient sparse file copy
- Next by thread:bug#6131: [PATCH]: fiemap support for efficient sparse file copy
- Index(es):