Re: [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock

From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>, Ted Tso <tytso@mit.edu>,
	Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, linux-xfs@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock
Date: Tue, 13 Jul 2021 13:11:39 +0200
Message-ID: <20210713111139.GG12142@quack2.suse.cz>
In-Reply-To: <20210713012514.GB22402@magnolia>

On Mon 12-07-21 18:25:14, Darrick J. Wong wrote:
> On Mon, Jul 12, 2021 at 06:55:54PM +0200, Jan Kara wrote:
> > @@ -2967,6 +2992,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	pgoff_t max_off;
> >  	struct page *page;
> >  	vm_fault_t ret = 0;
> > +	bool mapping_locked = false;
> >  
> >  	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> >  	if (unlikely(offset >= max_off))
> > @@ -2988,15 +3014,30 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  		count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
> >  		ret = VM_FAULT_MAJOR;
> >  		fpin = do_sync_mmap_readahead(vmf);
> > +	}
> > +
> > +	if (!page) {
> 
> Is it still necessary to re-evaluate !page here?

No, you are right, it is not necessary. I'll remove it.

> >  retry_find:
> > +		/*
> > +		 * See comment in filemap_create_page() why we need
> > +		 * invalidate_lock
> > +		 */
> > +		if (!mapping_locked) {
> > +			filemap_invalidate_lock_shared(mapping);
> > +			mapping_locked = true;
> > +		}
> >  		page = pagecache_get_page(mapping, offset,
> >  					  FGP_CREAT|FGP_FOR_MMAP,
> >  					  vmf->gfp_mask);
> >  		if (!page) {
> >  			if (fpin)
> >  				goto out_retry;
> > +			filemap_invalidate_unlock_shared(mapping);
> >  			return VM_FAULT_OOM;
> >  		}
> > +	} else if (unlikely(!PageUptodate(page))) {
> > +		filemap_invalidate_lock_shared(mapping);
> > +		mapping_locked = true;
> >  	}
> >  
> >  	if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
> > @@ -3014,8 +3055,20 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	 * We have a locked page in the page cache, now we need to check
> >  	 * that it's up-to-date. If not, it is going to be due to an error.
> >  	 */
> > -	if (unlikely(!PageUptodate(page)))
> > +	if (unlikely(!PageUptodate(page))) {
> > +		/*
> > +		 * The page was in cache and uptodate and now it is not.
> > +		 * Strange but possible since we didn't hold the page lock all
> > +		 * the time. Let's drop everything get the invalidate lock and
> > +		 * try again.
> > +		 */
> > +		if (!mapping_locked) {
> > +			unlock_page(page);
> > +			put_page(page);
> > +			goto retry_find;
> > +		}
> >  		goto page_not_uptodate;
> > +	}
> >  
> >  	/*
> >  	 * We've made it this far and we had to drop our mmap_lock, now is the
> > @@ -3026,6 +3079,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  		unlock_page(page);
> >  		goto out_retry;
> >  	}
> > +	if (mapping_locked)
> > +		filemap_invalidate_unlock_shared(mapping);
> >  
> >  	/*
> >  	 * Found the page and have a reference on it.
> > @@ -3056,6 +3111,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  
> >  	if (!error || error == AOP_TRUNCATED_PAGE)
> >  		goto retry_find;
> > +	filemap_invalidate_unlock_shared(mapping);
> 
> Hm.  I /think/ it's the case that mapping_locked==true always holds here
> because the new "The page was in cache and uptodate and now it is not."
> block above will take the invalidate_lock and retry pagecache_get_page,
> right?

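Yes. The page_not_uptodate block can only be entered with mapping_locked ==
true. The only place that jumps to it is the check below, and that check
goes back to retry_find (which takes invalidate_lock) whenever
mapping_locked is false: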

        if (unlikely(!PageUptodate(page))) {
                /*
                 * The page was in cache and uptodate and now it is not.
                 * Strange but possible since we didn't hold the page lock all
                 * the time. Let's drop everything get the invalidate lock and
                 * try again.
                 */
                if (!mapping_locked) {
                        unlock_page(page);
                        put_page(page);
                        goto retry_find;
                }
                goto page_not_uptodate;
        }
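
As a rough illustration of that argument (not part of the patch), the
control flow can be reduced to a small userspace model. Everything here is
hypothetical: struct fault_model, struct fault_result and model_fault() are
made-up names, and the three booleans stand in for what find_get_page() and
PageUptodate() would report. The model only tracks mapping_locked; it does
no real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Inputs to the toy model; these fields are hypothetical, not kernel state. */
struct fault_model {
	bool page_cached;        /* first lookup found a page */
	bool uptodate_unlocked;  /* PageUptodate() before taking the page lock */
	bool uptodate_locked;    /* PageUptodate() once the page is locked */
};

struct fault_result {
	bool reached_read;       /* control reached page_not_uptodate */
	bool locked_at_read;     /* mapping_locked at that point */
	bool locked_at_map;      /* mapping_locked when the page was mapped */
};

static struct fault_result model_fault(struct fault_model m)
{
	struct fault_result r = { false, false, false };
	bool mapping_locked = false;
	bool have_page = m.page_cached;

	/* Cached but visibly not uptodate: lock before going further. */
	if (have_page && !m.uptodate_unlocked)
		mapping_locked = true;

	for (;;) {
		if (!have_page) {
			/* retry_find: page creation always happens under the lock. */
			mapping_locked = true;
			have_page = true;
		}
		/* The page is locked from here on. */
		if (!m.uptodate_locked) {
			if (!mapping_locked) {
				/* Drop the page and go back to retry_find. */
				have_page = false;
				continue;
			}
			/* page_not_uptodate */
			r.reached_read = true;
			r.locked_at_read = mapping_locked;
			return r;
		}
		r.locked_at_map = mapping_locked;
		return r;
	}
}

/* Returns true if no input combination reaches the read path unlocked. */
static bool read_path_always_locked(void)
{
	for (int bits = 0; bits < 8; bits++) {
		struct fault_model m = {
			(bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0
		};
		struct fault_result r = model_fault(m);

		if (r.reached_read && !r.locked_at_read)
			return false;
	}
	return true;
}
```

In this model read_path_always_locked() holds for all eight input
combinations, and a page that is cached and uptodate at both checks is
mapped without mapping_locked ever being set, which is the lockless fast
path discussed above.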

> >  
> >  	return VM_FAULT_SIGBUS;
> >  
> > @@ -3067,6 +3123,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	 */
> >  	if (page)
> >  		put_page(page);
> > +	if (mapping_locked)
> > +		filemap_invalidate_unlock_shared(mapping);
> 
> Hm.  I think this looks ok, even though this patch now contains the
> subtlety that we've both hoisted the xfs mmaplock to page cache /and/
> reduced the scope of the invalidate_lock.
> 
> As for fancy things like remap_range, I think they're still safe with
> this latest iteration because those functions grab the invalidate_lock
> in exclusive mode and invalidate the mappings before proceeding, which
> means that other programs will never find the lockless path (i.e. page
> locked, uptodate, and attached to the mapping) and will instead block on
> the invalidate lock until the remap operation completes.   Is that
> right?

Correct. For operations such as hole punch or the destination of
remap_range, we take invalidate_lock exclusively and invalidate the page
cache in the affected range. No new pages can be created in that range
until invalidate_lock is dropped: the places that create pages without
holding i_rwsem are read, readahead, and fault, and all of them take
invalidate_lock before they create a page.
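
A minimal userspace sketch of that exclusion, assuming a single-threaded
toy lock with try-only semantics (the real invalidate_lock is a sleeping
rwsem, so a page creator would block rather than fail). struct toy_mapping
and all toy_* functions are hypothetical names made up for this
illustration; none of them are kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy stand-in for a mapping and its invalidate_lock. */
struct toy_mapping {
	int shared_holders;
	bool exclusive_held;
	bool cached[NPAGES];     /* cached[i]: page i is in the page cache */
};

static bool toy_trylock_shared(struct toy_mapping *m)
{
	if (m->exclusive_held)
		return false;
	m->shared_holders++;
	return true;
}

static void toy_unlock_shared(struct toy_mapping *m)
{
	assert(m->shared_holders > 0);
	m->shared_holders--;
}

static bool toy_trylock_exclusive(struct toy_mapping *m)
{
	if (m->exclusive_held || m->shared_holders > 0)
		return false;
	m->exclusive_held = true;
	return true;
}

static void toy_unlock_exclusive(struct toy_mapping *m)
{
	assert(m->exclusive_held);
	m->exclusive_held = false;
}

/* Read, readahead and fault would all create pages through a path like this. */
static bool toy_create_page(struct toy_mapping *m, size_t idx)
{
	if (!toy_trylock_shared(m))
		return false;    /* the real code would sleep here instead */
	m->cached[idx] = true;
	toy_unlock_shared(m);
	return true;
}

/* First half of a hole punch: lock exclusively, then drop cached pages. */
static bool toy_punch_begin(struct toy_mapping *m, size_t start, size_t end)
{
	if (!toy_trylock_exclusive(m))
		return false;
	for (size_t i = start; i < end && i < NPAGES; i++)
		m->cached[i] = false;
	return true;
}

/* Second half: the blocks are gone, so let page creators back in. */
static void toy_punch_end(struct toy_mapping *m)
{
	toy_unlock_exclusive(m);
}
```

Between toy_punch_begin() and toy_punch_end(), toy_create_page() fails for
every index, so the punched range cannot be repopulated with stale pages
until the exclusive holder is done.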

There is also the case, which someone pointed out, that the *source* of
remap_range needs to be protected (but only from modifications through
mmap). This is achieved by taking invalidate_lock in the .page_mkwrite
handlers, so it is not affected by these changes to filemap_fault().

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
