[FYI] tux3: Core changes
David Lang
david at lang.hm
Tue May 19 13:33:31 PDT 2015
On Tue, 19 May 2015, Daniel Phillips wrote:
>> I understand that Tux3 may avoid these issues due to some other mechanisms
>> it internally has but if page forking should get into mm subsystem, the
>> above must work.
>
> It does work, and by example, it does not need a lot of code to make
> it work, but the changes are not trivial. Tux3's delta writeback model
> will not suit everyone, so you can't just lift our code and add it to
> Ext4. Using it in Ext4 would require a per-inode writeback model, which
> looks practical to me but far from a weekend project. Maybe something
> to consider for Ext5.
>
> It is the job of new designs like Tux3 to chase after that final drop
> of performance, not our trusty Ext4 workhorse. Though stranger things
> have happened - as I recall, Ext4 had O(n) directory operations at one
> time. Fixing that was not easy, but we did it because we had to. Fixing
> Ext4's write performance is not urgent by comparison, and the barrier
> is high, you would want jbd3 for one thing.
>
> I think the meta-question you are asking is, where is the second user
> for this new CoW functionality? With a possible implication that if
> there is no second user then Tux3 cannot be merged. Is that the
> question?

I don't think they are asking for a second user. What they are saying is that
for this functionality to be accepted in the mm subsystem, these problem cases
need to work reliably, not just work for Tux3 because of your implementation.

So for things that you don't use, you need to either make it an error if they
get used on a page that's been forked, or make them 'do the right thing'.

For cases where it doesn't matter because Tux3 controls the writeback, and it's
undefined in general what happens if writeback is triggered twice on the same
page, you will need to figure out how to either prevent the second writeback
from triggering if there's one in progress, or define how the two writebacks
are going to happen so that you can't end up with them re-ordered by some other
filesystem.

I think that's what's meant by the top statement that I left in the quote.
Even if your implementation details make it safe, these need to be safe even
without your implementation details to be acceptable in the core kernel.
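
As a rough illustration of the two options above (fail loudly on a forked
page, or refuse a second writeback while one is in flight), here is a small
userspace model. It is only a sketch: struct model_page and every model_*
function are hypothetical names invented for this example, not Tux3 code and
not the kernel's struct page or writeback API.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a page; NOT the kernel's struct page. */
struct model_page {
	atomic_bool writeback;	/* true while a writeback is in flight */
	bool forked;		/* true once the page has been forked */
};

static void model_page_init(struct model_page *p)
{
	atomic_init(&p->writeback, false);
	p->forked = false;
}

/*
 * Option 1: an operation that has no defined meaning on a forked page
 * returns an error instead of silently doing something unspecified.
 */
static int model_unsupported_op(struct model_page *p)
{
	if (p->forked)
		return -EINVAL;
	return 0;
}

/*
 * Option 2: at most one writeback per page at a time. The atomic exchange
 * returns the previous value, so exactly one caller sees "false" and wins;
 * any caller arriving while writeback is in flight gets -EBUSY.
 */
static int model_start_writeback(struct model_page *p)
{
	if (atomic_exchange(&p->writeback, true))
		return -EBUSY;
	return 0;
}

/* Completing a writeback that was never started is a caller bug. */
static void model_end_writeback(struct model_page *p)
{
	bool was_in_flight = atomic_exchange(&p->writeback, false);

	assert(was_in_flight);
	(void)was_in_flight;	/* silence unused warning under NDEBUG */
}
```

The point of the model is that the guarantee lives in the generic helpers, so
it holds for any caller rather than depending on one filesystem's writeback
ordering. Defining an order for two overlapping writebacks, the other option
mentioned above, would need a queue instead of a flag and is not shown here.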
David Lang