Re: Supporting reflinked/cow files in coreutils (original) (raw)


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


From: Jeff liu
Subject: Re: Supporting reflinked/cow files in coreutils
Date: Mon, 8 Oct 2012 21:10:44 +0800

Hi Philipp,

Thanks for taking a look at this feature, and sorry for my late response!

Hi Jim,

Recently, I have discussed with a colleague(Bo Liu CC-ed) who is focus on Btrfs development, and we decided to teach Btrfs aware of reflinked extents per file for du(1) purpose. :)

Philipp Thomas wrote: > Jim, > > * Jeff liu (address@hidden) [20110418 15:54]: > >> FYI, I once implemented a similar patch to teach du(1) to show the disk >> footprint with reflinked stuff for OCFS2 only. >> >> It is make use of rb-tree to figure out the shared extents and >> calculate the exactly used space eventually. >> http://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html >> http://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007288.html >> http://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007287.html >> >> I have not submited the patch set to Coreutils since I guess it is hard to >> accept by upstream as reflink and FIEMAPEXTENTSHARED flag are not spread >> over the general filesystems. > > in this thread Jeff asked you on your opinions regarding such a patch set > but somehow you never responded. Could you possibly do so now as I would > still like something like this to be included upstream.

Hi Philipp,

Glancing through the patch in the first message,

http://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html

I have a hard time justifying the addition of so much code for a relatively small feature that will be useful only for OCFS2 file systems:

coreutils-6.9/src/du.c | 479 +++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 473 insertions(+), 6 deletions(-)

That's a lot of new always-compiled code that will be used only by a very small fraction of users. And to add insult to injury, it extends the ugly tentacles of FIEMAP into du.c. It was bad enough to have to use it in copy.c, and we're all impatient for something better, but SEEKHOLE and SEEKDATA aren't yet widely available.

That's a lot of new code, not even counting these:

* lib/rbtree.c: Source file of rbtree. * lib/rbtree.h: Header file of rbtree. * lib/Makefile.am (libcoreutilsaSOURCES): Add both of them. --- coreutils-6.9/lib/Makefile.am | 3 +- coreutils-6.9/lib/Makefile.in | 4 +- coreutils-6.9/lib/rbtree.c | 403 +++++++++++++++++++++++++++++++++++++++++ coreutils-6.9/lib/rbtree.h | 143 +++++++++++++++ 4 files changed, 550 insertions(+), 3 deletions(-)

Besides which, there is no reason to add new rbtree.[ch] code when there is already plenty of rbtree code in gnulib. Yes. Originally, I wrote this patch set for a shared-du utility which only available on our Oracle Linux release for OCFS2 purpose, so I didn't make use of rbtree code in gnulib at that time.

If it can be made to work cleanly with BTRFS as well, and preferably without FIEMAP (if possible, though I fear that FIEMAP is the only interface we have for now), that would be even better. I also hate FIEMAP, however, for BTRFS, we have to make it works through FIEMAP as well...

If you and/or Jeff are motivated enough, a good first step would be to post a rebased patch to bug-coreutils, preferably after you've switched out that rbtree.[ch] for gnulib's code. I hesitate even to suggest that you rebase, because we might end up deciding it's not yet worth the cost, but let's hear what others have to say. Maybe enough people use du and are bothered because the space used by their reflinked files is over-reported.

We can find time to rebase it according to your suggestions if you like.

A final question: why use an rbtree rather than a hash table? du is already using hash tables at the required scale, and new code would be required at all.

Consider scalability, I think we can use hash table, but rbtree is also works well, except bitmap.

Thanks, -Jeff