bug#6131: [PATCH]: fiemap support for efficient sparse file copy (original) (raw)

[Top][All Lists]


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


From: Jim Meyering
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Fri, 21 May 2010 16:27:49 +0200

jeff.liu wrote: ...

> [*] I tried to count syscalls with strace but got a segfault. > Using valgrind I get errors, so debugged enough to get a clean > run, but possibly at the expense of correctness. We'll need more > tests to ensure that the non-sparse blocks in the copy all have > the same offset/length as in the original. Is it make sense if we write a utility in C through FIEMAP to show the extent info of a file? then wrap it in our current test scripts or a new test script to compare the non-sparse blocks offset and length?

If there were no adequate tool already available, that would be good.

filefrag(8) can do such thing(http://e2fsprogs.sourceforge.net/), but maybe we can implement a compacted version focus on furture extent maping related testing only for coreutils.

Or maybe just use filefrag, when it's available. On F13, with -v (verbose), it prints this:

$ filefrag -v big
Filesystem type is: ef53
File size of big is 2147483648 (524288 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0   254527               1
big: 1 extent found

> =========================================================== > The segv just above is due to hitting this line with i==0: > > fiemap->fmstart = (fmext[i-1].felogical + fmext[i-1].felength); strange, code should break if there is no extent allocated for a file. /* If 0 extents are returned, then more ioctls are not needed. */ if (fiemap->fmmappedextents == 0) break;

There is one extent, and it is while processing it, with i == 0 that would trigger the failure when referencing fm_ext[i-1] (aka fm_ext[-1]).

> the obvious fix is probably to do this instead: > > fiemap->fmstart = (fmext[i].felogical + fmext[i].felength); I just found a bug for dealing with the 'fiemap->fmstart', maybe it is the root cause of the segment fault. above line still need to write as 'fmext[i-1].felogical +....' to calculate the offset for the next ioctl(2).

"i" can be 0 there, so it sounds like you're saying we need to reference fm_ext[-1]. If you mean that, you'll have to demonstrate how we guarantee that i > 0 there.

> All of the used-uninitialized errors can be papered over by > clearing the fiemapbuf array, like this: > > + memset (fiemapbuf, 0, sizeof fiemapbuf); I recalled why I initialized this buf before when you ask me the reason, I was intented to initialize the 'fiemap->fmstart', so below line 'fiemap->fmstart = 0ULL' should be removed from the loop.

> do > { > fiemap->fmstart = 0ULL; > > However, if these are all due solely to F13's valgrind not yet knowing the > semantics of the FIEMAP ioctl, then that may be adequate. as what I mentioned above, this line should be removed or remove out of the loop if we do not initialize the fiemap buf.

I agree. Leaving the initialization in the loop would provoke an infinite loop, for a file with many extents.

This demonstrates it:

$ perl -e 'for (1..100) { sysseek(STDOUT,4096,1)' \
       -e '&& syswrite(STDOUT,"."x4096) or die "$!"}' > j
$ ./cp --sparse=always j j2
<INFLOOP!>
^C

With this statement "fiemap->fm_start = 0ULL;" in the do-while loop, the use of ./cp above would infloop. Without it, it works properly:

$ env time -f %E ./cp --sparse=always j j2
0:00.01

And we can compare the extents in the two: (the awk is mainly to exclude the physical block numbers, which will always differ)

$ diff -u <(filefrag -v j|awk '/^ / {print <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo separator="true">,</mo></mrow><annotation encoding="application/x-tex">1,</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8389em;vertical-align:-0.1944em;"></span><span class="mord">1</span><span class="mpunct">,</span></span></span></span>2,$NF}') \
          <(filefrag -v j2|awk '/^ / {print <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo separator="true">,</mo></mrow><annotation encoding="application/x-tex">1,</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8389em;vertical-align:-0.1944em;"></span><span class="mord">1</span><span class="mpunct">,</span></span></span></span>2,$NF}')
$

For reference, here's what filefrag -v output looks like, given a file with a nontrivial list of extents:

$ perl -e 'BEGIN{$n=16*1024; *F=*STDOUT}'
-e 'for (1..5) { sysseek(*F,$n,1)'
-e '&& syswrite *F,"."x$n or die "$!"}' > j $ filefrag -v j Filesystem type is: ef53 File size of j is 163840 (40 blocks, blocksize 4096) ext logical physical expected length flags 0 4 6258884 4 1 12 6258892 6258887 4 2 20 6258900 6258895 4 3 28 6258908 6258903 4 4 36 6258916 6258911 4 eof j: 6 extents found