[FYI] tux3: Core changes

OGAWA Hirofumi hirofumi at mail.parknet.co.jp
Sun Aug 9 06:42:42 PDT 2015


Jan Kara <jack at suse.cz> writes:

> I'm not sure about which ENOSPC issue you are speaking BTW. Can you
> please elaborate?

1. GUP simulates a page fault and prepares the page for modification
2. writeback clears the dirty flag and makes the PTE read-only
3. a snapshot/reflink makes the block COW
4. the driver that called GUP modifies the page and dirties it, without
   simulating a page fault
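
The sequence above can be replayed with a toy model. This is only a
sketch, not kernel code: the Page fields and every function name below
are hypothetical, and it assumes the ENOSPC problem is that step 4
dirties a COW block without passing through the fault path, which is
where space for the copy could have been reserved.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy page state; all fields are hypothetical, not kernel structures."""
    dirty: bool = False
    pte_writable: bool = False
    block_is_cow: bool = False    # block is shared with a snapshot/reflink
    space_reserved: bool = False  # space for the COW copy has been reserved


def write_fault(page):
    """Normal write path: the fault gives the filesystem a chance to prepare."""
    if page.block_is_cow:
        # Assumption: this is the point where ENOSPC could still be reported.
        page.space_reserved = True
    page.pte_writable = True
    page.dirty = True


def gup_for_write(page):
    """Step 1: GUP simulates a write fault and hands out a long-lived reference."""
    write_fault(page)
    return page


def writeback(page):
    """Step 2: writeback cleans the page and write-protects the PTE."""
    page.dirty = False
    page.pte_writable = False
    page.space_reserved = False


def snapshot(page):
    """Step 3: a snapshot/reflink turns the block into a COW block."""
    page.block_is_cow = True


def driver_write(page):
    """Step 4: the driver writes through its old reference; no fault happens."""
    page.dirty = True


def unreserved_cow_write(page):
    """True if a dirty COW page exists with no space reserved for its copy."""
    return page.dirty and page.block_is_cow and not page.space_reserved


def replay_race():
    """Run steps 1 to 4 in order and return the resulting page state."""
    page = Page()
    pinned = gup_for_write(page)
    writeback(page)
    snapshot(page)
    driver_write(pinned)
    return page


if __name__ == "__main__":
    print("unreserved COW write:", unreserved_cow_write(replay_race()))
```

In this model, a write that goes through write_fault() after the
snapshot does reserve space, so only the GUP path ends up with a dirty
COW page that nothing prepared for.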

>> If your claim were that this strange logic is already widely used and
>> that, of course, we can't simply break it for compatibility reasons, I
>> would be able to agree. But your claim sounds like that logic is sane,
>> well-designed behavior. So I disagree.
>
> To me the rule: "Do not detach a page from a radix tree if it has an elevated
> refcount unless explicitly requested by a syscall" looks like a sane one.
> Yes.
>
>> > And frankly I fail to see why you and Daniel care so much about this
>> > corner case because from performance POV it's IMHO a non-issue and you
>> > bother with page forking because of performance, don't you?
>> 
>> We should first try to penalize the corner-case path instead of the
>> normal path. Penalizing the normal path to accommodate the corner-case
>> path is basically insane.
>>
>> Making the normal path faster and more reliable is what we are trying
>> to do.
>
> Elevated refcount of a page is in my opinion a corner case path. That's why
> I think that penalizing that case by waiting for IO instead of forking is
> acceptable cost for the improved compatibility & maintainability of the
> code.

What is an "elevated refcount"? How does it differ from a normal
refcount? Are you saying "refcount >= specified threshold +
waitq/wakeup" or something like that? If so, it is not a path; it is a
state. IOW, some groups may rarely hit it, but other groups may hit it
often, on the normal path.

So it sounds like yet another "stable page", i.e. unpredictable
performance. (BTW, recalling "stable page", I noticed that "stable
page" would not provide stable page data for that logic either.)

Well, assuming "elevated refcount == threshold + waitq/wakeup", IMO it
is not attractive. As a design choice, it is rather the last option,
for when there are no others.

Thanks.
-- 
OGAWA Hirofumi <hirofumi at mail.parknet.co.jp>


