[FYI] tux3: Core changes
OGAWA Hirofumi
hirofumi at mail.parknet.co.jp
Thu Jul 30 21:44:44 PDT 2015
Jan Kara <jack at suse.cz> writes:
>> > Yes, if userspace truncates the file, the situation we end up with is
>> > basically the same. However for truncate to happen some malicious process
>> > has to come and truncate the file - a failure scenario that is acceptable
>> > for most use cases since it doesn't happen unless someone is actively
>> > trying to screw you. With page forking it is enough for flusher thread
>> > to start writeback for that page to trigger the problem - event that is
>> > basically bound to happen without any other userspace application
>> > interfering.
>>
>> Acceptable conclusion is where came from? That pseudocode logic doesn't
>> say about usage at all. And even if assume it is acceptable, as far as I
>> can see, for example /proc/sys/vm/drop_caches is enough to trigger, or a
>> page on non-exists block (sparse file. i.e. missing disk space check in
>> your logic). And if really no any lock/check, there would be another
>> races.
>
> So drop_caches won't cause any issues because it avoids mmaped pages.
> Also page reclaim or page migration don't cause any issues because
> they avoid pages with increased refcount (and increased refcount would stop
> drop_caches from reclaiming the page as well if it was not for the mmaped
> check before). Generally, elevated page refcount currently guarantees page
> isn't migrated, reclaimed, or otherwise detached from the mapping (except
> for truncate where the combination of mapping-index becomes invalid) and
> your page forking would change that assumption - which IMHO has a big
> potential for some breakage somewhere.
Lifetime and visibility from user are different topic. The issue here
is visibility. Of course, those has relation more or less though,
refcount doesn't stop to drop page from radix-tree at all.
Well, anyway, your claim seems to be assuming the userspace app
workarounds the issues. And it sounds like still not workarounds the
ENOSPC issue (validate at page fault/GUP) even if assuming userspace
behave as perfect. Calling it as kernel assumption is strange.
If you claim, there is strange logic widely used already, and of course,
we can't simply break it because of compatibility. I would be able to
agree. But your claim sounds like that logic is sane and well designed
behavior. So I disagree.
> And frankly I fail to see why you and Daniel care so much about this
> corner case because from performance POV it's IMHO a non-issue and you
> bother with page forking because of performance, don't you?
Trying to penalize the corner case path, instead of normal path, should
try at first. Penalizing normal path to allow corner case path is insane
basically.
Make normal path faster and more reliable is what we are trying.
> So you can have a look for example at
> drivers/media/v4l2-core/videobuf2-dma-contig.c which implements setting up
> of a video device buffer at virtual address specified by user. Now I don't
> know whether there really is any userspace video program that sets up the
> video buffer in mmaped file. I would agree with you that it would be a
> strange thing to do but I've seen enough strange userspace code that I
> would not be too surprised.
>
> Another example of similar kind is at
> drivers/infiniband/core/umem.c where we again set up buffer for infiniband
> cards at users specified virtual address. And there are more drivers in
> kernel like that.
Unfortunately, I'm not looking those yet though. I guess those would be
helpful to see the details.
Thanks.
--
OGAWA Hirofumi <hirofumi at mail.parknet.co.jp>
More information about the Tux3
mailing list