[FYI] tux3: Core changes

OGAWA Hirofumi hirofumi at mail.parknet.co.jp
Sun Aug 16 12:42:04 PDT 2015


Jan Kara <jack at suse.cz> writes:

> On Sun 09-08-15 22:42:42, OGAWA Hirofumi wrote:
>> Jan Kara <jack at suse.cz> writes:
>> 
>> > I'm not sure about which ENOSPC issue you are speaking BTW. Can you
>> > please elaborate?
>> 
>> 1. GUP simulates a page fault and prepares the page for modification
>> 2. writeback clears dirty and makes the PTE read-only
>> 3. snapshot/reflink makes the block COW
>
> I assume by point 3. you mean that snapshot / reflink happens now and thus
> the page / block is marked as COW. Am I right?

Right.

>> 4. the driver that called GUP modifies the page, and dirties it
>>    without simulating a page fault
>
> OK, but this doesn't hit ENOSPC because as you correctly write in point 4.,
> the page gets modified without triggering another page fault so COW for the
> modified page isn't triggered. Modified page contents will be in both the
> original and the reflinked file, won't it?

The result above can also be ENOSPC, depending on the implementation and
on how the race plays out. Also, if the FS converts zeroed blocks to
holes, as hammerfs does, ENOSPC simply happens: another process uses up
all the free space, and there is then no ->page_mkwrite() callback to
check for ENOSPC.
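
To make the sequence concrete, here is a toy userspace model of steps
1-4 followed by one more writeback. It is only a sketch: every name in
it (toy_fs, toy_page, toy_page_mkwrite and so on) is made up for
illustration and is not a kernel API, and it assumes an FS that
allocates the new block for a COW page at writeback time when no
write fault reserved it earlier.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy userspace model of the race; none of these names are kernel APIs. */
struct toy_fs {
    int free_blocks;          /* blocks still available for allocation */
};

struct toy_page {
    bool pinned;              /* a GUP caller holds a reference */
    bool pte_writable;        /* a write through the mapping would not fault */
    bool dirty;
    bool block_is_cow;        /* backing block is shared with a snapshot */
    int  mkwrite_calls;       /* how often the write-fault path ran */
};

/* Stand-in for ->page_mkwrite(): the point where the writer can see ENOSPC. */
static int toy_page_mkwrite(struct toy_fs *fs, struct toy_page *p)
{
    p->mkwrite_calls++;
    if (p->block_is_cow) {
        if (fs->free_blocks == 0)
            return -ENOSPC;   /* the writer sees the error here */
        fs->free_blocks--;
        p->block_is_cow = false;
    }
    p->pte_writable = true;
    p->dirty = true;
    return 0;
}

/* Step 1: GUP simulates a write fault and pins the page. */
static int toy_gup(struct toy_fs *fs, struct toy_page *p)
{
    int err = toy_page_mkwrite(fs, p);
    if (!err)
        p->pinned = true;
    return err;
}

/* Step 2 and the final flush: a dirty COW page needs a new block now. */
static int toy_writeback(struct toy_fs *fs, struct toy_page *p)
{
    if (!p->dirty)
        return 0;
    if (p->block_is_cow) {
        if (fs->free_blocks == 0)
            return -ENOSPC;   /* too late: no writer is waiting for it */
        fs->free_blocks--;
        p->block_is_cow = false;
    }
    p->dirty = false;
    p->pte_writable = false;
    return 0;
}

/* Step 4: the driver writes through its pinned reference, with no fault. */
static void toy_driver_write(struct toy_page *p)
{
    assert(p->pinned);
    p->dirty = true;
}

/* Runs steps 1-4 plus a final writeback; returns that writeback's result. */
static int toy_race(struct toy_page *p)
{
    struct toy_fs fs = { .free_blocks = 1 };
    int err;

    err = toy_gup(&fs, p);            /* step 1 */
    if (err)
        return err;
    err = toy_writeback(&fs, p);      /* step 2 */
    if (err)
        return err;
    p->block_is_cow = true;           /* step 3: snapshot/reflink */
    fs.free_blocks = 0;               /* another process uses all space */
    toy_driver_write(p);              /* step 4 */
    return toy_writeback(&fs, p);     /* ENOSPC surfaces only here */
}
```

Calling toy_race() on a zero-initialized toy_page leaves mkwrite_calls
at 1: the write in step 4 never goes through the fault path, so in this
model the space check can only fail later, inside writeback.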

> And I agree that the fact that the snapshotted file's original contents
> can still get modified is a bug, and one which is difficult to fix.

Yes, that is why I think this logic is an issue in itself, even before
page forking.

>> So it sounds like yet another "stable page", i.e. unpredictable
>> performance. (BTW, recalling "stable pages", I noticed that they
>> would not provide stable page data for that logic either.)
>> 
>> Well, assuming "elevated refcount == threshold + waitq/wakeup", IMO
>> it is not attractive. As a design choice, it is rather the last
>> option, if there are no others.
>
> I agree the performance will be less predictable and that is not good. But
> changing what is visible in the file when writeback races with GUP is a
> worse problem to me.
>
> Maybe if GUP marked pages it got ref for so that we could trigger the slow
> behavior only for them (Peter Zijlstra proposed in [1] an infrastructure so
> that pages pinned by get_user_pages() would be properly accounted and then
> we could use PG_mlocked and elevated refcount as a more reliable indication
> of pages that need special handling).

I have not read Peter's patchset fully, but it looks good, and it may be
similar to the strategy I currently have in mind. I'm also thinking of
adding callbacks for the FS at the start and end of GUP's pin window.
(Just as an example, the FS could use the callbacks to stop writeback
if it wants to.)
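
As a rough illustration of that callback idea, a sketch could look like
the following. All names here (pin_ops, pin_begin, pin_end, pin_inode
and the demo_* policy) are hypothetical and exist only for this
example; nothing in it is an existing kernel interface, and the policy
shown (block writeback while any pin window is open) is just one thing
an FS might choose to do.

```c
#include <stdbool.h>

/* Hypothetical hooks; not an existing kernel interface. */
struct pin_inode;

struct pin_ops {
    void (*pin_begin)(struct pin_inode *inode);  /* GUP pin window opens */
    void (*pin_end)(struct pin_inode *inode);    /* GUP pin window closes */
};

struct pin_inode {
    const struct pin_ops *ops;   /* left unset if the FS does not care */
    int writeback_blocked;       /* FS-private: number of open pin windows */
};

/* Example FS policy: hold off writeback while any pin window is open. */
static void demo_pin_begin(struct pin_inode *inode)
{
    inode->writeback_blocked++;
}

static void demo_pin_end(struct pin_inode *inode)
{
    inode->writeback_blocked--;
}

static const struct pin_ops demo_pin_ops = {
    .pin_begin = demo_pin_begin,
    .pin_end   = demo_pin_end,
};

/* What a GUP-like caller would invoke around its pin window. */
static void pin_window_open(struct pin_inode *inode)
{
    if (inode->ops && inode->ops->pin_begin)
        inode->ops->pin_begin(inode);
}

static void pin_window_close(struct pin_inode *inode)
{
    if (inode->ops && inode->ops->pin_end)
        inode->ops->pin_end(inode);
}

/* Writeback consults the FS-private state before touching the pages. */
static bool pin_writeback_allowed(const struct pin_inode *inode)
{
    return inode->writeback_blocked == 0;
}
```

An FS that does not want this behavior would simply leave ops unset, in
which case opening and closing a pin window changes nothing.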

Thanks.
-- 
OGAWA Hirofumi <hirofumi at mail.parknet.co.jp>


