bug#6131: [PATCH]: fiemap support for efficient sparse file copy (original) (raw)

[Top][All Lists]


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


From: Jim Meyering
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Wed, 12 May 2010 10:48:44 +0200

jeff.liu wrote:

Hello All, ... Would you guys please review and consider apply the patches if there is no other issue?

Thanks for yet another iteration, and sorry it's taking so long.

>From f8c78794a70f1fb45a2c61c8bbeca344087287ab Mon Sep 17 00:00:00 2001 From: Jie Liu <address@hidden> Date: Fri, 7 May 2010 20:48:45 +0800 Subject: [PATCH 1/3] Add fiemap.h for fiemap ioctl(2) support. It does not shipped by default, so I copy it from kernel at the moment. I have update its code style respect to GNU coding style.

Your log message should look like this: (one-line summary, then ChangeLog-style after a blank line)

add fiemap.h for fiemap ioctl(2) support

Signed-off-by: Jie Liu <address@hidden> --- gnulib | 2 +-

Please omit the gnulib part. It is irrelevant to your change.

>From 8822b8e3f3ee70b49efb8b8aebff373792956422 Mon Sep 17 00:00:00 2001 From: Jie Liu <address@hidden> Date: Fri, 7 May 2010 21:31:56 +0800 Subject: [PATCH 3/3] Add test script for cp(1) fiemap copy.

Signed-off-by: Jie Liu <address@hidden> --- tests/cp/sparse-fiemap | 58 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 58 insertions(+), 0 deletions(-) create mode 100755 tests/cp/sparse-fiemap

diff --git a/tests/cp/sparse-fiemap b/tests/cp/sparse-fiemap new file mode 100755 index 0000000..25a8fd6 --- /dev/null +++ b/tests/cp/sparse-fiemap @@ -0,0 +1,58 @@ +#!/bin/sh +# Test cp --sparse=always through fiemap copy + +# Copyright (C) 2006-2010 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +if test "$VERBOSE" = yes; then + set -x + cp --version +fi + +. $srcdir/test-lib.sh + +cporig=cp +cpnew="$abstopbuilddir/src/cp" + _+test -d "/ext4" _ _+ && sparse="/ext4/sparse" _ _+ && normal="/ext4/sparse1" _ _+ && fiemap="/ext4/sparse2" _ + || skip=1

Be careful about writing to well-known files like these. That's risky if /ext4 is world-writable. Work in a temporary directory created by mktemp -d, just in case.

It'd be better still to make the test work when run as root, in which case you can create and mount a file-backed ext4 partition, and then use that, rather than relying on the existence of a preexisting, fixed-name directory that may or may not be of the right FS type.

But don't worry about this part if you're not inclined. We can make it do that later.

+test $skip = 1 && skiptest "/ext4 does not exists"

s/exists/exist/

+size=expr 10 \* 1024

Here, just use size=10240.

In general, please use $(...), not .... (multiple places)

+dd if=/dev/zero bs=4k count=1 seek=$size of=$sparse > /dev/null 2>&1 || frameworkfailure + +# Using time(1) instead of shell built-in time' command._ _+# It support "--format" option which is more convinent to calculate_ _+# the expense time for different cp' by combine with bc(1) for +# the performance measurement. +TIME=which time || skiptest "time(1) does not exists"

Don't use "which". It has portability problems. Instead,

env time --version | grep GNU > /dev/null
|| skip_test_ "you lack the GNU time program"

Another alternative: record $(date +%s.%N), before and after. Elapsed time should be adequate. Advantage: date is always available, while time may not be.

+x=$(echo "1+2" | bc) +test $x = 3 || skiptest "bc(1) does not exists" + +t1=$($TIME -f "%U + %S" cporig−−sparse=alwayscporig --sparse=always cporigsparse=alwayssparse $normal 2>&1 | bc) || fail=1 +t2=$($TIME -f "%U + %S" cpnew−−sparse=alwayscpnew --sparse=always cpnewsparse=alwayssparse $fiemap 2>&1 | bc) || fail=1 + +test $fail = 1 && skiptest "at least one sparse file copy failed" + +# Ensure that the sparse file copied through fiemap has the same size in bytes as the original. +test stat --printf %s <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>s</mi><mi>p</mi><mi>a</mi><mi>r</mi><mi>s</mi><mi>e</mi><mi mathvariant="normal">‘</mi><mo>−</mo><mi>e</mi><mi>q</mi><mi mathvariant="normal">‘</mi><mi>s</mi><mi>t</mi><mi>a</mi><mi>t</mi><mo>−</mo><mo>−</mo><mi>p</mi><mi>r</mi><mi>i</mi><mi>n</mi><mi>t</mi><mi>f</mi></mrow><annotation encoding="application/x-tex">sparse -eq stat --printf %s </annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">s</span><span class="mord mathnormal">p</span><span class="mord mathnormal">a</span><span class="mord mathnormal">rse</span><span class="mord">‘</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">e</span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord">‘</span><span class="mord mathnormal">s</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord">−</span><span class="mord mathnormal">p</span><span class="mord mathnormal" style="margin-right:0.02778em;">r</span><span class="mord mathnormal">in</span><span class="mord mathnormal">t</span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span></span></span></span>fiemap || fail=1

s/-eq/=/

+echo "$t2 < $t1" | bc || fail=1

At first I wrote this:

Can't you require something stronger than merely
that the new code is faster than the old?
How do the times compare in general?
Can you construct a worst-case scenario that
makes the difference as large as possible, yet
that still completes (with the new code) in say
5 or 10 seconds on modern equipment?

Use awk rather than bc for comparison.
For an example that's similar, I made parted fail a test
when an efficiency-sensitive operation takes more than a minute here:

http://git.debian.org/?p=parted/parted.git;a=commitdiff;h=eedc6d77dc4b3488decd4dce9cb8cafaa95755ce

You can depend on some version of awk being available.
Invoke it via $AWK, but for that to work in the test script,
you'll have to add AWK=$(AWK) to tests/Makefile.am's TESTS_ENVIRONMENT,
as I did here:

http://git.debian.org/?p=parted/parted.git;a=commitdiff;h=246c953b53c1bd49b1f835f84a1ca29a6d2fbc1c

Then I remembered that here we have timeout(1), so: you may ignore the above and consider this a suggestion to use timeout:

But that was in Parted, where we can't guarantee that the timeout program is available. Here in coreutils, you're guaranteed to have timeout(1) (just built), so you might want to use it, too: Contrive a test that takes a very long time without FIEMAP support yet that runs in a couple seconds with it. Then run cp via timeout with a 10-second limit. If timeout's exit status is not 0, then make the test fail.

That has the advantage of letting you use an example that would take far longer that we typically want to wait for a non-FIEMAP test. I.e., perform only the FIEMAP-copy and ensure that it's "quick enough". You don't have to perform a non-FIEMAP one.

Another advantage: if you don't do the old/slow sparse copy, there's no need for comparison (and bc or awk) at all.