From hirofumi at mail.parknet.co.jp Sun Jul 5 05:54:45 2015 From: hirofumi at mail.parknet.co.jp (OGAWA Hirofumi) Date: Sun, 05 Jul 2015 21:54:45 +0900 Subject: [FYI] tux3: Core changes In-Reply-To: <20150623161247.GP2427@quack.suse.cz> (Jan Kara's message of "Tue, 23 Jun 2015 18:12:47 +0200") References: <5563F5C8.2040806@redhat.com> <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> Message-ID: <87k2ueepd6.fsf@mail.parknet.co.jp> Jan Kara writes: >> I'm not sure I'm understanding your pseudocode logic correctly though. >> This logic doesn't seems to be a page forking specific issue. And >> this pseudocode logic seems to be missing the locking and revalidate of >> page. >> >> If you can show more details, it would be helpful to see more, and >> discuss the issue of page forking, or we can think about how to handle >> the corner cases. >> >> Well, before that, why need more details? >> >> For example, replace the page fork at (4) with "truncate", "punch >> hole", or "invalidate page". >> >> Those operations remove the old page from radix tree, so the >> userspace's write creates the new page, and HW still refererences the >> old page. (I.e. situation should be same with page forking, in my >> understand of this pseudocode logic.) > > Yes, if userspace truncates the file, the situation we end up with is > basically the same. However for truncate to happen some malicious process > has to come and truncate the file - a failure scenario that is acceptable > for most use cases since it doesn't happen unless someone is actively > trying to screw you. With page forking it is enough for flusher thread > to start writeback for that page to trigger the problem - event that is > basically bound to happen without any other userspace application > interfering. 
Where does the conclusion that this is acceptable come from? That pseudocode logic says nothing about usage. And even if we assume it is acceptable, as far as I can see /proc/sys/vm/drop_caches, for example, is enough to trigger it, as is a page over a non-existent block (a sparse file, i.e. the disk space check missing from your logic). And if there really is no lock or check at all, there would be other races as well. >> IOW, this pseudocode logic seems to be broken without page forking if >> no lock and revalidate. Usually, we prevent unpleasant I/O by >> lock_page or PG_writeback, and an obsolated page is revalidated under >> lock_page. > > Well, good luck with converting all the get_user_pages() users in kernel to > use lock_page() or PG_writeback checks to avoid issues with page forking. I > don't think that's really feasible. What does converting all get_user_pages() users mean? Well, it may be more or less right; I also think there is an issue in and around get_user_pages() that we have to tackle. IMO, if there is code that actually follows that pseudocode logic, that code is broken. And I don't think "it is an acceptable limitation, so give up on fixing it" is the right way to go. If there really is code broken in the way your logic describes, I think we should fix it. Could you point to the code that uses your logic? It looks so racy that I can't yet believe such code actually exists. >> For page forking, we may also be able to prevent similar situation by >> locking, flags, and revalidate. But those details might be different >> with current code, because page states are different. > > Sorry, I don't understand what do you mean in this paragraph. Can you > explain it a bit more? This just means that a forked page (the old page) and a truncated page have different sets of flags and state, so we may have to adjust the revalidation. Thanks. 
-- OGAWA Hirofumi From jack at suse.cz Thu Jul 9 09:05:28 2015 From: jack at suse.cz (Jan Kara) Date: Thu, 9 Jul 2015 18:05:28 +0200 Subject: [FYI] tux3: Core changes In-Reply-To: <87k2ueepd6.fsf@mail.parknet.co.jp> References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> Message-ID: <20150709160528.GK2900@quack.suse.cz> On Sun 05-07-15 21:54:45, OGAWA Hirofumi wrote: > Jan Kara writes: > > >> I'm not sure I'm understanding your pseudocode logic correctly though. > >> This logic doesn't seems to be a page forking specific issue. And > >> this pseudocode logic seems to be missing the locking and revalidate of > >> page. > >> > >> If you can show more details, it would be helpful to see more, and > >> discuss the issue of page forking, or we can think about how to handle > >> the corner cases. > >> > >> Well, before that, why need more details? > >> > >> For example, replace the page fork at (4) with "truncate", "punch > >> hole", or "invalidate page". > >> > >> Those operations remove the old page from radix tree, so the > >> userspace's write creates the new page, and HW still refererences the > >> old page. (I.e. situation should be same with page forking, in my > >> understand of this pseudocode logic.) > > > > Yes, if userspace truncates the file, the situation we end up with is > > basically the same. However for truncate to happen some malicious process > > has to come and truncate the file - a failure scenario that is acceptable > > for most use cases since it doesn't happen unless someone is actively > > trying to screw you. 
With page forking it is enough for flusher thread > > to start writeback for that page to trigger the problem - event that is > > basically bound to happen without any other userspace application > > interfering. > > Acceptable conclusion is where came from? That pseudocode logic doesn't > say about usage at all. And even if assume it is acceptable, as far as I > can see, for example /proc/sys/vm/drop_caches is enough to trigger, or a > page on non-exists block (sparse file. i.e. missing disk space check in > your logic). And if really no any lock/check, there would be another > races. So drop_caches won't cause any issues because it avoids mmaped pages. Also page reclaim or page migration don't cause any issues because they avoid pages with increased refcount (and increased refcount would stop drop_caches from reclaiming the page as well if it was not for the mmaped check before). Generally, elevated page refcount currently guarantees page isn't migrated, reclaimed, or otherwise detached from the mapping (except for truncate where the combination of mapping-index becomes invalid) and your page forking would change that assumption - which IMHO has a big potential for some breakage somewhere. And frankly I fail to see why you and Daniel care so much about this corner case because from performance POV it's IMHO a non-issue and you bother with page forking because of performance, don't you? > >> IOW, this pseudocode logic seems to be broken without page forking if > >> no lock and revalidate. Usually, we prevent unpleasant I/O by > >> lock_page or PG_writeback, and an obsolated page is revalidated under > >> lock_page. > > > > Well, good luck with converting all the get_user_pages() users in kernel to > > use lock_page() or PG_writeback checks to avoid issues with page forking. I > > don't think that's really feasible. > > What does all get_user_pages() conversion mean? 
Well, maybe right more > or less, I also think there is the issue in/around get_user_pages() that > we have to tackle. > > > IMO, if there is a code that pseudocode logic actually, it is the > breakage. And "it is acceptable and limitation, and give up to fix", I > don't think it is the right way to go. If there is really code broken > like your logic, I think we should fix. > > Could you point which code is using your logic? Since that seems to be > so racy, I can't believe yet there are that racy codes actually. So you can have a look for example at drivers/media/v4l2-core/videobuf2-dma-contig.c which implements setting up of a video device buffer at virtual address specified by user. Now I don't know whether there really is any userspace video program that sets up the video buffer in mmaped file. I would agree with you that it would be a strange thing to do but I've seen enough strange userspace code that I would not be too surprised. Another example of similar kind is at drivers/infiniband/core/umem.c where we again set up buffer for infiniband cards at users specified virtual address. And there are more drivers in kernel like that. Honza -- Jan Kara SUSE Labs, CR From hirofumi at mail.parknet.co.jp Thu Jul 30 21:44:44 2015 From: hirofumi at mail.parknet.co.jp (OGAWA Hirofumi) Date: Fri, 31 Jul 2015 13:44:44 +0900 Subject: [FYI] tux3: Core changes In-Reply-To: <20150709160528.GK2900@quack.suse.cz> (Jan Kara's message of "Thu, 9 Jul 2015 18:05:28 +0200") References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> Message-ID: <874mklaqbn.fsf@mail.parknet.co.jp> Jan Kara writes: >> > Yes, if userspace truncates the file, the situation we end up with is >> > basically the same. 
However for truncate to happen some malicious process >> > has to come and truncate the file - a failure scenario that is acceptable >> > for most use cases since it doesn't happen unless someone is actively >> > trying to screw you. With page forking it is enough for flusher thread >> > to start writeback for that page to trigger the problem - event that is >> > basically bound to happen without any other userspace application >> > interfering. >> >> Acceptable conclusion is where came from? That pseudocode logic doesn't >> say about usage at all. And even if assume it is acceptable, as far as I >> can see, for example /proc/sys/vm/drop_caches is enough to trigger, or a >> page on non-exists block (sparse file. i.e. missing disk space check in >> your logic). And if really no any lock/check, there would be another >> races. > > So drop_caches won't cause any issues because it avoids mmaped pages. > Also page reclaim or page migration don't cause any issues because > they avoid pages with increased refcount (and increased refcount would stop > drop_caches from reclaiming the page as well if it was not for the mmaped > check before). Generally, elevated page refcount currently guarantees page > isn't migrated, reclaimed, or otherwise detached from the mapping (except > for truncate where the combination of mapping-index becomes invalid) and > your page forking would change that assumption - which IMHO has a big > potential for some breakage somewhere. Lifetime and visibility to the user are different topics, and the issue here is visibility. Of course the two are more or less related, but a refcount does not stop a page from being dropped from the radix tree at all. Well, anyway, your claim seems to assume that the userspace app works around the issues. And it sounds like that still does not work around the ENOSPC issue (validation at page fault/GUP), even assuming userspace behaves perfectly. Calling that a kernel assumption is strange. 
If your claim were that this strange logic is already widely used and that, of course, we can't simply break it for compatibility reasons, I would be able to agree. But your claim sounds as if that logic were sane, well-designed behavior, so I disagree. > And frankly I fail to see why you and Daniel care so much about this > corner case because from performance POV it's IMHO a non-issue and you > bother with page forking because of performance, don't you? We should first try to penalize the corner-case path rather than the normal path. Penalizing the normal path to accommodate the corner-case path is basically insane. Making the normal path faster and more reliable is what we are trying to do. > So you can have a look for example at > drivers/media/v4l2-core/videobuf2-dma-contig.c which implements setting up > of a video device buffer at virtual address specified by user. Now I don't > know whether there really is any userspace video program that sets up the > video buffer in mmaped file. I would agree with you that it would be a > strange thing to do but I've seen enough strange userspace code that I > would not be too surprised. > > Another example of similar kind is at > drivers/infiniband/core/umem.c where we again set up buffer for infiniband > cards at users specified virtual address. And there are more drivers in > kernel like that. Unfortunately, I have not looked at those yet. I guess they will be helpful for seeing the details. Thanks. 
-- OGAWA Hirofumi From shentino at gmail.com Fri Jul 31 08:37:35 2015 From: shentino at gmail.com (Raymond Jennings) Date: Fri, 31 Jul 2015 08:37:35 -0700 Subject: [FYI] tux3: Core changes In-Reply-To: <874mklaqbn.fsf@mail.parknet.co.jp> References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> Message-ID: Returning ENOSPC when you have free space you can't yet prove is safer than not returning it and risking a data loss when you get hit by a write/commit storm. :) On Thu, Jul 30, 2015 at 9:44 PM, OGAWA Hirofumi wrote: > Jan Kara writes: > > >> > Yes, if userspace truncates the file, the situation we end up with is > >> > basically the same. However for truncate to happen some malicious > process > >> > has to come and truncate the file - a failure scenario that is > acceptable > >> > for most use cases since it doesn't happen unless someone is actively > >> > trying to screw you. With page forking it is enough for flusher thread > >> > to start writeback for that page to trigger the problem - event that > is > >> > basically bound to happen without any other userspace application > >> > interfering. > >> > >> Acceptable conclusion is where came from? That pseudocode logic doesn't > >> say about usage at all. And even if assume it is acceptable, as far as I > >> can see, for example /proc/sys/vm/drop_caches is enough to trigger, or a > >> page on non-exists block (sparse file. i.e. missing disk space check in > >> your logic). And if really no any lock/check, there would be another > >> races. > > > > So drop_caches won't cause any issues because it avoids mmaped pages. 
> > Also page reclaim or page migration don't cause any issues because > > they avoid pages with increased refcount (and increased refcount would > stop > > drop_caches from reclaiming the page as well if it was not for the mmaped > > check before). Generally, elevated page refcount currently guarantees > page > > isn't migrated, reclaimed, or otherwise detached from the mapping (except > > for truncate where the combination of mapping-index becomes invalid) and > > your page forking would change that assumption - which IMHO has a big > > potential for some breakage somewhere. > > Lifetime and visibility from user are different topic. The issue here > is visibility. Of course, those has relation more or less though, > refcount doesn't stop to drop page from radix-tree at all. > > Well, anyway, your claim seems to be assuming the userspace app > workarounds the issues. And it sounds like still not workarounds the > ENOSPC issue (validate at page fault/GUP) even if assuming userspace > behave as perfect. Calling it as kernel assumption is strange. > > If you claim, there is strange logic widely used already, and of course, > we can't simply break it because of compatibility. I would be able to > agree. But your claim sounds like that logic is sane and well designed > behavior. So I disagree. > > > And frankly I fail to see why you and Daniel care so much about this > > corner case because from performance POV it's IMHO a non-issue and you > > bother with page forking because of performance, don't you? > > Trying to penalize the corner case path, instead of normal path, should > try at first. Penalizing normal path to allow corner case path is insane > basically. > > Make normal path faster and more reliable is what we are trying. > > > So you can have a look for example at > > drivers/media/v4l2-core/videobuf2-dma-contig.c which implements setting > up > > of a video device buffer at virtual address specified by user. 
Now I > don't > > know whether there really is any userspace video program that sets up the > > video buffer in mmaped file. I would agree with you that it would be a > > strange thing to do but I've seen enough strange userspace code that I > > would not be too surprised. > > > > Another example of similar kind is at > > drivers/infiniband/core/umem.c where we again set up buffer for > infiniband > > cards at users specified virtual address. And there are more drivers in > > kernel like that. > > Unfortunately, I'm not looking those yet though. I guess those would be > helpful to see the details. > > Thanks. > -- > OGAWA Hirofumi From daniel at phunq.net Fri Jul 31 10:27:13 2015 From: daniel at phunq.net (Daniel Phillips) Date: Fri, 31 Jul 2015 10:27:13 -0700 Subject: [FYI] tux3: Core changes In-Reply-To: References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> Message-ID: <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> On Friday, July 31, 2015 8:37:35 AM PDT, Raymond Jennings wrote: > Returning ENOSPC when you have free space you can't yet prove is safer than > not returning it and risking a data loss when you get hit by a write/commit > storm. :) Remember when delayed allocation was scary and unproven, because proving that ENOSPC will always be returned when needed is extremely difficult? But the performance advantage was compelling, so we just worked at it until it worked. 
There were times when it didn't work properly, but the code was in the tree so it got fixed. It's like that now with page forking - a new technique with compelling advantages, and some challenges. In the past, we (the Linux community) would rise to the challenge and err on the side of pushing optimizations in early. That was our mojo, and that is how Linux became the dominant operating system it is today. Do we, the Linux community, still have that mojo? Regards, Daniel From david at lang.hm Fri Jul 31 11:29:51 2015 From: david at lang.hm (David Lang) Date: Fri, 31 Jul 2015 11:29:51 -0700 (PDT) Subject: [FYI] tux3: Core changes In-Reply-To: <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> Message-ID: On Fri, 31 Jul 2015, Daniel Phillips wrote: > Subject: Re: [FYI] tux3: Core changes > > On Friday, July 31, 2015 8:37:35 AM PDT, Raymond Jennings wrote: >> Returning ENOSPC when you have free space you can't yet prove is safer than >> not returning it and risking a data loss when you get hit by a write/commit >> storm. :) > > Remember when delayed allocation was scary and unproven, because proving > that ENOSPC will always be returned when needed is extremely difficult? > But the performance advantage was compelling, so we just worked at it > until it worked. There were times when it didn't work properly, but the > code was in the tree so it got fixed. > > It's like that now with page forking - a new technique with compelling > advantages, and some challenges. 
In the past, we (the Linux community) > would rise to the challenge and err on the side of pushing optimizations > in early. That was our mojo, and that is how Linux became the dominant > operating system it is today. Do we, the Linux community, still have that > mojo? We, the Linux community, have less tolerance for losing people's data and preventing them from operating than we used to when it was all tinkerers' personal data and secondary systems. So rather than pushing optimizations out to everyone and seeing what breaks, we now do more testing and checking for failures before pushing things out. This means that when something new is introduced, we default to the safe, slightly slower way initially (there will be enough other bugs to deal with in any case), and then as we gain experience from the tinkerers enabling the performance optimizations, we make those optimizations reliable and only then push them out to all users. If you define this as "losing our mojo", then yes we have. But most people see the pace of development as still being high, just with more testing and polishing before it gets out to users. David Lang From daniel at phunq.net Fri Jul 31 11:40:23 2015 From: daniel at phunq.net (Daniel Phillips) Date: Fri, 31 Jul 2015 11:40:23 -0700 Subject: [FYI] tux3: Core changes In-Reply-To: References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> Message-ID: <8dc3b4e6-51dc-4205-af33-6f2baeb7f107@phunq.net> On Friday, July 31, 2015 11:29:51 AM PDT, David Lang wrote: > If you define this as "losing our mojo", then yes we have. A pity. 
There remains so much to do that simply will not get done in the absence of mojo. Regards, Daniel From daniel at phunq.net Fri Jul 31 13:33:33 2015 From: daniel at phunq.net (Daniel Phillips) Date: Fri, 31 Jul 2015 13:33:33 -0700 Subject: [FYI] tux3: Core changes In-Reply-To: References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> Message-ID: On Friday, July 31, 2015 11:29:51 AM PDT, David Lang wrote: > We, the Linux Community have less tolerance for losing people's > data and preventing them from operating than we used to when it > was all tinkerer's personal data and secondary systems. > > So rather than pushing optimizations out to everyone and seeing > what breaks, we now do more testing and checking for failures > before pushing things out. 
By the way, I am curious about whose data you think will get lost as a result of pushing out Tux3 with a possible theoretical bug in a wildly improbable scenario that has not actually been described with sufficient specificity to falsify, let alone demonstrated. 
Regards, Daniel From david at lang.hm Fri Jul 31 15:27:12 2015 From: david at lang.hm (David Lang) Date: Fri, 31 Jul 2015 15:27:12 -0700 (PDT) Subject: [FYI] tux3: Core changes In-Reply-To: References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> Message-ID: On Fri, 31 Jul 2015, Daniel Phillips wrote: > On Friday, July 31, 2015 11:29:51 AM PDT, David Lang wrote: >> We, the Linux Community have less tolerance for losing people's data and >> preventing them from operating than we used to when it was all tinkerer's >> personal data and secondary systems. >> >> So rather than pushing optimizations out to everyone and seeing what >> breaks, we now do more testing and checking for failures before pushing >> things out. > > By the way, I am curious about whose data you think will get lost > as a result of pushing out Tux3 with a possible theoretical bug > in a wildly improbable scenario that has not actually been > described with sufficient specificity to falsify, let alone > demonstrated. You weren't asking about any particular feature of Tux, you were asking if we were still willing to push out stuff that breaks for users and fix it later. Especially for filesystems that can lose the data of whoever is using it, the answer seems to be a clear no. There may be bugs in what's pushed out that we don't know about. But we don't push out potential data corruption bugs that we do know about (or think we do). So if you think this should be pushed out with this known corner case that's not handled properly, you have to convince people that it's _so_ improbable that they shouldn't care about it. 
David Lang From daniel at phunq.net Fri Jul 31 17:00:43 2015 From: daniel at phunq.net (Daniel Phillips) Date: Fri, 31 Jul 2015 17:00:43 -0700 Subject: [FYI] tux3: Core changes In-Reply-To: References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> Message-ID: On Friday, July 31, 2015 3:27:12 PM PDT, David Lang wrote: > On Fri, 31 Jul 2015, Daniel Phillips wrote: > >> On Friday, July 31, 2015 11:29:51 AM PDT, David Lang wrote: ... > > you weren't asking about any particular feature of Tux, you > were asking if we were still willing to push out stuff that > breaks for users and fix it later. I think you left a key word out of my ask: "theoretical". > Especially for filesystems that can loose the data of whoever > is using it, the answer seems to be a clear no. > > there may be bugs in what's pushed out that we don't know > about. But we don't push out potential data corruption bugs that > we do know about (or think we do) > > so if you think this should be pushed out with this known > corner case that's not handled properly, you have to convince > people that it's _so_ improbable that they shouldn't care about > it. There should also be an onus on the person posing the worry to prove their case beyond a reasonable doubt, which has not been done in the case we are discussing here. Note: that is a technical assessment to which a technical response is appropriate. I do think that we should put a cap on this fencing and make a real effort to get Tux3 into mainline. 
We should at least set a ground rule that a problem should be proved real before it becomes a reason to derail a project in the way that our project has been derailed. Otherwise, it's hard to see what interest is served. OK, let's get back to the program. I accept your assertion that we should convince people that the issue is improbable. To do that, I need a specific issue to address. So far, no such issue has been provided with specificity. Do you see why this is frustrating? Please, community. Give us specific issues to address, or give us some way out of this eternal limbo. Or better, let's go back to the old way of doing things in Linux, which is what got us where we are today. Not this. Note: Hirofumi's email is clear, logical and speaks to the question. This branch of the thread is largely pointless, though it essentially says the same thing in non-technical terms. Perhaps your next response should be to Hirofumi, and perhaps it should be technical. Regards, Daniel From daniel at phunq.net Fri Jul 31 17:16:45 2015 From: daniel at phunq.net (Daniel Phillips) Date: Fri, 31 Jul 2015 17:16:45 -0700 Subject: [FYI] tux3: Core changes In-Reply-To: References: <67294911-1776-46b8-916d-0e5642a38725@phunq.net> <20150526070910.GA3307@quack.suse.cz> <20150526090058.GA8024@quack.suse.cz> <5564D60E.6000306@phunq.net> <20150527084138.GD2590@quack.suse.cz> <87a8vtdqfz.fsf@mail.parknet.co.jp> <20150623161247.GP2427@quack.suse.cz> <87k2ueepd6.fsf@mail.parknet.co.jp> <20150709160528.GK2900@quack.suse.cz> <874mklaqbn.fsf@mail.parknet.co.jp> <1981a91e-30a9-43ce-9a05-14aa777e46a5@phunq.net> Message-ID: On Friday, July 31, 2015 5:00:43 PM PDT, Daniel Phillips wrote: > Note: Hirofumi's email is clear, logical and speaks to the > question. This branch of the thread is largely pointless, though > it essentially says the same thing in non-technical terms. Perhaps > your next response should be to Hirofumi, and perhaps it should be > technical. 
Now, let me try to lead the way, by being specific. RDMA was raised as a potential failure case for Tux3 page forking. But the RDMA API does not let you use memory mmaped by Tux3 as a source or destination of IO. Instead, it sets up its own pages and hands them out to the RDMA app from a pool. So no issue. One down, right? Regards, Daniel