* [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache
@ 2014-02-11  3:05 Kirill A. Shutemov
  2014-02-11  3:05 ` [PATCH 1/2] mm: extend ->fault interface to fault in few pages around fault address Kirill A. Shutemov
From: Kirill A. Shutemov @ 2014-02-11  3:05 UTC
  To: Linus Torvalds, Andrew Morton, Mel Gorman, Rik van Riel
  Cc: Andi Kleen, Matthew Wilcox, Dave Hansen, linux-mm,
	Kirill A. Shutemov

Okay, it's RFC only. I haven't stabilized it yet. And it's 5 AM...

It kind of works on small test cases in KVM, but it hung my laptop shortly
after boot. So no benchmark data yet.

The patches are on top of my __do_fault() cleanup.

The idea is to minimize the number of minor page faults by mapping pages
around the fault address if they are already in the page cache.

With the patches we try to map up to 32 pages (subject to change) on a
read page fault. Later this can be extended to write page faults on
shared mappings if it works well.

The pages must be covered by the same page table so we can change all the
ptes under one lock.

I tried to avoid additional latency, so we don't wait for a page to become
ready; we just skip to the next one.

The only place where we can get stuck for a relatively long time is
do_async_mmap_readahead(): it allocates pages and submits IO. We can't
just skip readahead, otherwise it will stop working and we will miss all
the time. On the other hand, keeping do_async_mmap_readahead() there will
probably break the readahead heuristics: interleaved access looks
sequential.

Any comments are welcome.

Kirill A. Shutemov (2):
  mm: extend ->fault interface to fault in few pages around fault
    address
  mm: implement FAULT_FLAG_AROUND in filemap_fault()

 include/linux/mm.h | 24 +++++++++++++++++
 mm/filemap.c       | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 mm/memory.c        | 61 +++++++++++++++++++++++++++++++++++++-----
 3 files changed, 152 insertions(+), 10 deletions(-)

-- 
1.8.5.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>


* [PATCH 1/2] mm: extend ->fault interface to fault in few pages around fault address
  2014-02-11  3:05 [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache Kirill A. Shutemov
@ 2014-02-11  3:05 ` Kirill A. Shutemov
  2014-02-11  3:05 ` [PATCH 2/2] mm: implement FAULT_FLAG_AROUND in filemap_fault() Kirill A. Shutemov
  2014-02-11 21:39 ` [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache Andrew Morton
From: Kirill A. Shutemov @ 2014-02-11  3:05 UTC
  To: Linus Torvalds, Andrew Morton, Mel Gorman, Rik van Riel
  Cc: Andi Kleen, Matthew Wilcox, Dave Hansen, linux-mm,
	Kirill A. Shutemov

If (flags & FAULT_FLAG_AROUND) is set, the fault handler asks ->fault to
fill the ->pages array with up to ->nr_pages pages, if they are ready to
be mapped.

If a page is not ready to be mapped, there is no need to wait for it:
skip to the next one.

It's okay to have some (or all) elements of the array set to NULL.

Page indexes must be in the range between ->min and ->max, inclusive.
The array must not contain the page with index ->pgoff; that one goes in
->page as usual.

->fault must set the VM_FAULT_AROUND bit in the return code if it fills
the array.

Pages in the array must be returned locked.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 include/linux/mm.h | 24 +++++++++++++++++++++
 mm/memory.c        | 61 ++++++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 79 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f28f46eade6a..fe5629bc9e5b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -191,6 +191,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 #define FAULT_FLAG_TRIED	0x40	/* second try */
 #define FAULT_FLAG_USER		0x80	/* The fault originated in userspace */
+#define FAULT_FLAG_AROUND	0x100   /* Try to get a few pages at a time */
 
 /*
  * vm_fault is filled by the the pagefault handler and passed to the vma's
@@ -210,6 +211,28 @@ struct vm_fault {
 					 * is set (which is also implied by
 					 * VM_FAULT_ERROR).
 					 */
+
+	/*
+	 * If (flags & FAULT_FLAG_AROUND) is set, ->fault is asked to fill
+	 * the ->pages array with up to ->nr_pages pages ready to be mapped.
+	 *
+	 * If a page is not ready, there is no need to wait for it: just
+	 * skip to the next one.
+	 *
+	 * It's okay to have some (or all) elements of the array set to NULL.
+	 *
+	 * Page indexes must be in the range ->min to ->max, inclusive.
+	 * The array must not contain the page with index ->pgoff; that one
+	 * goes in ->page as usual.
+	 *
+	 * ->fault must set the VM_FAULT_AROUND bit in the return code if
+	 * it fills the array.
+	 *
+	 * Pages in the array must be returned locked.
+	 */
+	int nr_pages;
+	pgoff_t min, max;
+	struct page **pages;
 };
 
 /*
@@ -1004,6 +1027,7 @@ static inline int page_mapped(struct page *page)
 #define VM_FAULT_LOCKED	0x0200	/* ->fault locked the returned page */
 #define VM_FAULT_RETRY	0x0400	/* ->fault blocked, must retry */
 #define VM_FAULT_FALLBACK 0x0800	/* huge page fault failed, fall back to small */
+#define VM_FAULT_AROUND 0x1000	/* ->pages is filled */
 
 #define VM_FAULT_HWPOISON_LARGE_MASK 0xf000 /* encodes hpage index for large hwpoison */
 
diff --git a/mm/memory.c b/mm/memory.c
index 68c3dc141059..47ab9d6e1666 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3287,27 +3287,52 @@ oom:
 }
 
 static int __do_fault(struct vm_area_struct *vma, unsigned long address,
-		pgoff_t pgoff, unsigned int flags, struct page **page)
+		pgoff_t pgoff, unsigned int flags, struct page **page,
+		struct page **pages, int nr_pages)
 {
 	struct vm_fault vmf;
-	int ret;
+	int i, ret;
 
 	vmf.virtual_address = (void __user *)(address & PAGE_MASK);
 	vmf.pgoff = pgoff;
 	vmf.flags = flags;
 	vmf.page = NULL;
 
+	if (flags & FAULT_FLAG_AROUND) {
+		vmf.pages = pages;
+		vmf.nr_pages = nr_pages;
+
+		/*
+		 * From the page for the address, aligned down to a
+		 * FAULT_AROUND_PAGES boundary, to the end of the page table.
+		 */
+		vmf.min = pgoff - ((address >> PAGE_SHIFT) & (nr_pages - 1));
+		vmf.min = min(pgoff, vmf.min); /* underflow */
+		vmf.max = pgoff + PTRS_PER_PTE - 1 -
+			((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
+		/* Both should be inside the vma */
+		vmf.min = max(vma->vm_pgoff, vmf.min);
+		vmf.max = min(vma_pages(vma) + vma->vm_pgoff - 1, vmf.max);
+	}
+
 	ret = vma->vm_ops->fault(vma, &vmf);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf.page))) {
+		for (i = 0; (ret & VM_FAULT_AROUND) && i < nr_pages; i++) {
+			if (!pages[i])
+				continue;
+			unlock_page(pages[i]);
+			page_cache_release(pages[i]);
+		}
 		if (ret & VM_FAULT_LOCKED)
 			unlock_page(vmf.page);
 		page_cache_release(vmf.page);
 		return VM_FAULT_HWPOISON;
 	}
 
+	/* pages in ->pages are always returned locked */
 	if (unlikely(!(ret & VM_FAULT_LOCKED)))
 		lock_page(vmf.page);
 	else
@@ -3341,16 +3366,21 @@ static void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 	update_mmu_cache(vma, address, pte);
 }
 
+#define FAULT_AROUND_PAGES 32
 static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		unsigned long address, pmd_t *pmd,
 		pgoff_t pgoff, unsigned int flags, pte_t orig_pte)
 {
 	struct page *fault_page;
+	struct page *pages[FAULT_AROUND_PAGES];
 	spinlock_t *ptl;
 	pte_t *pte;
-	int ret;
+	int i, ret;
 
-	ret = __do_fault(vma, address, pgoff, flags, &fault_page);
+	if (!(flags & FAULT_FLAG_NONLINEAR))
+		flags |= FAULT_FLAG_AROUND;
+	ret = __do_fault(vma, address, pgoff, flags, &fault_page,
+			pages, ARRAY_SIZE(pages));
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		return ret;
 
@@ -3362,6 +3392,25 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		return ret;
 	}
 	do_set_pte(vma, address, fault_page, pte, false, false);
+	for (i = 0; (ret & VM_FAULT_AROUND) && i < ARRAY_SIZE(pages); i++) {
+		pte_t *_pte;
+		unsigned long addr;
+		if (!pages[i])
+			continue;
+		VM_BUG_ON_PAGE(!PageLocked(pages[i]), pages[i]);
+		if (PageHWPoison(pages[i]))
+			goto skip;
+		_pte = pte + pages[i]->index - pgoff;
+		if (!pte_none(*_pte))
+			goto skip;
+		addr = address + PAGE_SIZE * (pages[i]->index - pgoff);
+		do_set_pte(vma, addr, pages[i], _pte, false, false);
+		unlock_page(pages[i]);
+		continue;
+skip:
+		unlock_page(pages[i]);
+		put_page(pages[i]);
+	}
 	pte_unmap_unlock(pte, ptl);
 	unlock_page(fault_page);
 	return ret;
@@ -3388,7 +3437,7 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		return VM_FAULT_OOM;
 	}
 
-	ret = __do_fault(vma, address, pgoff, flags, &fault_page);
+	ret = __do_fault(vma, address, pgoff, flags, &fault_page, NULL, 0);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		goto uncharge_out;
 
@@ -3423,7 +3472,7 @@ static int do_shared_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	int dirtied = 0;
 	int ret, tmp;
 
-	ret = __do_fault(vma, address, pgoff, flags, &fault_page);
+	ret = __do_fault(vma, address, pgoff, flags, &fault_page, NULL, 0);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		return ret;
 
-- 
1.8.5.2



* [PATCH 2/2] mm: implement FAULT_FLAG_AROUND in filemap_fault()
  2014-02-11  3:05 [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache Kirill A. Shutemov
  2014-02-11  3:05 ` [PATCH 1/2] mm: extend ->fault interface to fault in few pages around fault address Kirill A. Shutemov
@ 2014-02-11  3:05 ` Kirill A. Shutemov
  2014-02-11 21:39 ` [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache Andrew Morton
From: Kirill A. Shutemov @ 2014-02-11  3:05 UTC
  To: Linus Torvalds, Andrew Morton, Mel Gorman, Rik van Riel
  Cc: Andi Kleen, Matthew Wilcox, Dave Hansen, linux-mm,
	Kirill A. Shutemov

If FAULT_FLAG_AROUND is set, filemap_fault() will use find_get_pages()
for a batched page lookup.

Pages returned by find_get_pages() are handled differently: the page
with index vmf->pgoff takes the normal filemap_fault() code path.

For all other pages we don't retry locking or wait for a page to become
up to date; we just give up and move on to the next page.

I'm not sure how we should deal with readahead here. For now I just call
do_async_mmap_readahead(). It probably breaks the readahead heuristics:
interleaved access looks sequential.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/filemap.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 73 insertions(+), 4 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d56d3c145b9f..4d00fc0094f6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1593,6 +1593,64 @@ static void do_async_mmap_readahead(struct vm_area_struct *vma,
 					   page, offset, ra->ra_pages);
 }
 
+static struct page *lock_secondary_pages(struct vm_area_struct *vma,
+		struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct address_space *mapping = file->f_mapping;
+	struct page *primary_page = NULL, **pages = vmf->pages;
+	pgoff_t size;
+	int i;
+
+	for (i = 0; i < vmf->nr_pages; i++) {
+		if (!pages[i])
+			continue;
+		if (pages[i]->index == vmf->pgoff) {
+			primary_page = pages[i];
+			pages[i] = NULL;
+			continue;
+		}
+		if (pages[i]->index > vmf->max)
+			goto put;
+		do_async_mmap_readahead(vma, &file->f_ra, file,
+				pages[i], pages[i]->index);
+		if (!trylock_page(pages[i]))
+			goto put;
+		/* Truncated? */
+		if (unlikely(pages[i]->mapping != mapping))
+			goto unlock;
+		if (unlikely(!PageUptodate(pages[i])))
+			goto unlock;
+		size = (i_size_read(mapping->host) + PAGE_CACHE_SIZE - 1)
+			>> PAGE_CACHE_SHIFT;
+		if (unlikely(pages[i]->index >= size))
+			goto unlock;
+		continue;
+unlock:
+		unlock_page(pages[i]);
+put:
+		put_page(pages[i]);
+		pages[i] = NULL;
+	}
+
+	return primary_page;
+}
+
+static void unlock_and_put_secondary_pages(struct vm_fault *vmf)
+{
+	int i;
+
+	if (!(vmf->flags & FAULT_FLAG_AROUND))
+		return;
+	for (i = 0; i < vmf->nr_pages; i++) {
+		if (!vmf->pages[i])
+			continue;
+		unlock_page(vmf->pages[i]);
+		page_cache_release(vmf->pages[i]);
+		vmf->pages[i] = NULL;
+	}
+}
+
 /**
  * filemap_fault - read in file data for page fault handling
  * @vma:	vma in which the fault was taken
@@ -1624,7 +1682,15 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	/*
 	 * Do we have something in the page cache already?
 	 */
-	page = find_get_page(mapping, offset);
+	if (vmf->flags & FAULT_FLAG_AROUND) {
+		ret = find_get_pages(mapping, vmf->min, vmf->nr_pages,
+				vmf->pages);
+		memset(vmf->pages + ret, 0,
+				sizeof(struct page *) * (vmf->nr_pages - ret));
+		page = lock_secondary_pages(vma, vmf);
+		ret = VM_FAULT_AROUND;
+	} else
+		page = find_get_page(mapping, offset);
 	if (likely(page) && !(vmf->flags & FAULT_FLAG_TRIED)) {
 		/*
 		 * We found the page, so try async readahead before
@@ -1636,7 +1702,7 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 		do_sync_mmap_readahead(vma, ra, file, offset);
 		count_vm_event(PGMAJFAULT);
 		mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
-		ret = VM_FAULT_MAJOR;
+		ret |= VM_FAULT_MAJOR;
 retry_find:
 		page = find_get_page(mapping, offset);
 		if (!page)
@@ -1644,12 +1710,14 @@ retry_find:
 	}
 
 	if (!lock_page_or_retry(page, vma->vm_mm, vmf->flags)) {
+		unlock_and_put_secondary_pages(vmf);
 		page_cache_release(page);
-		return ret | VM_FAULT_RETRY;
+		return (ret & ~VM_FAULT_AROUND) | VM_FAULT_RETRY;
 	}
 
 	/* Did it get truncated? */
 	if (unlikely(page->mapping != mapping)) {
+		unlock_and_put_secondary_pages(vmf);
 		unlock_page(page);
 		put_page(page);
 		goto retry_find;
@@ -1691,7 +1759,7 @@ no_cached_page:
 	 */
 	if (error >= 0)
 		goto retry_find;
-
+	unlock_and_put_secondary_pages(vmf);
 	/*
 	 * An error return from page_cache_read can result if the
 	 * system is low on memory, or a problem occurs while trying
@@ -1719,6 +1787,7 @@ page_not_uptodate:
 
 	if (!error || error == AOP_TRUNCATED_PAGE)
 		goto retry_find;
+	unlock_and_put_secondary_pages(vmf);
 
 	/* Things didn't work out. Return zero to tell the mm layer so. */
 	shrink_readahead_size_eio(file, ra);
-- 
1.8.5.2



* Re: [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache
  2014-02-11  3:05 [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache Kirill A. Shutemov
  2014-02-11  3:05 ` [PATCH 1/2] mm: extend ->fault interface to fault in few pages around fault address Kirill A. Shutemov
  2014-02-11  3:05 ` [PATCH 2/2] mm: implement FAULT_FLAG_AROUND in filemap_fault() Kirill A. Shutemov
@ 2014-02-11 21:39 ` Andrew Morton
  2014-02-11 23:52   ` Linus Torvalds
From: Andrew Morton @ 2014-02-11 21:39 UTC
  To: Kirill A. Shutemov
  Cc: Linus Torvalds, Mel Gorman, Rik van Riel, Andi Kleen,
	Matthew Wilcox, Dave Hansen, linux-mm

On Tue, 11 Feb 2014 05:05:55 +0200 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:

> Okay, it's RFC only. I haven't stabilized it yet. And it's 5 AM...
> 
> It kind of works on small test cases in KVM, but it hung my laptop shortly
> after boot. So no benchmark data yet.
> 
> The patches are on top of my __do_fault() cleanup.
> 
> The idea is to minimize the number of minor page faults by mapping pages
> around the fault address if they are already in the page cache.
> 
> With the patches we try to map up to 32 pages (subject to change) on a
> read page fault. Later this can be extended to write page faults on
> shared mappings if it works well.
> 
> The pages must be covered by the same page table so we can change all the
> ptes under one lock.
> 
> I tried to avoid additional latency, so we don't wait for a page to become
> ready; we just skip to the next one.
> 
> The only place where we can get stuck for a relatively long time is
> do_async_mmap_readahead(): it allocates pages and submits IO. We can't
> just skip readahead, otherwise it will stop working and we will miss all
> the time. On the other hand, keeping do_async_mmap_readahead() there will
> probably break the readahead heuristics: interleaved access looks
> sequential.
> 

hm, we tried that a couple of times, many years ago.  Try
https://www.google.com/#q="faultahead" then spend a frustrating hour
trying to work out what went wrong.

Of course, the implementation might have been poor and perhaps we can
get this to work.

It would seem to make most sense to tie the faultahead into linear
reads of mmapped files.  The disk readahead code already tries to
recognise and optimise such read patterns, but tying faultahead into
readahead won't work well because the pages will often already be in
pagecache.

A starting point for this work would be to get all the tracepoints in
place and then perform some analysis of what the access patterns really
look like.  Based on that (statistical) analysis we can then design a
feature to optimise it and make some predictions about how effective it
might be.


I have vague memories of writing code which, within the first fault,
would read the entire file into pagecache and then map everything.
It was really fast (mainly from linearising the read of executables and
libraries) but was wasteful and unserious.



* Re: [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache
  2014-02-11 21:39 ` [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache Andrew Morton
@ 2014-02-11 23:52   ` Linus Torvalds
  2014-02-11 23:58     ` Kirill A. Shutemov
From: Linus Torvalds @ 2014-02-11 23:52 UTC
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Mel Gorman, Rik van Riel, Andi Kleen,
	Matthew Wilcox, Dave Hansen, linux-mm

On Tue, Feb 11, 2014 at 1:39 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> hm, we tried that a couple of times, many years ago.  Try
> https://www.google.com/#q="faultahead" then spend a frustrating hour
> trying to work out what went wrong.
>
> Of course, the implementation might have been poor and perhaps we can
> get this to work.

Kirill's patch looks good, and shouldn't have much overhead, but the
fact that it doesn't work is obviously something of a strike against
it.. ;)

I don't see anything obviously wrong in it, although I think 32
fault-around pages might be excessive (it uses stack space, and there
are expenses wrt accounting and tear-down). But the patch is also
against some odd kernel (presumably -mm) with lots of other changes,
so I don't even know what it might be missing.

               Linus



* Re: [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache
  2014-02-11 23:52   ` Linus Torvalds
@ 2014-02-11 23:58     ` Kirill A. Shutemov
  2014-02-12  0:25       ` Linus Torvalds
From: Kirill A. Shutemov @ 2014-02-11 23:58 UTC
  To: Linus Torvalds
  Cc: Andrew Morton, Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Andi Kleen, Matthew Wilcox, Dave Hansen, linux-mm

Linus Torvalds wrote:
> On Tue, Feb 11, 2014 at 1:39 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> >
> > hm, we tried that a couple of times, many years ago.  Try
> > https://www.google.com/#q="faultahead" then spend a frustrating hour
> > trying to work out what went wrong.
> >
> > Of course, the implementation might have been poor and perhaps we can
> > get this to work.
> 
> Kirill's patch looks good, and shouldn't have much overhead, but the
> fact that it doesn't work is obviously something of a strike against
> it.. ;)
> 
> I don't see anything obviously wrong in it, although I think 32
> fault-around pages might be excessive (it uses stack space, and there
> are expenses wrt accounting and tear-down). But the patch is also
> against some odd kernel (presumably -mm) with lots of other changes,
> so I don't even know what it might be missing.

It's on top of v3.14-rc1 + the __do_fault() clean up[1].

It's also on git:

git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux fault_around/v1

[1] http://thread.gmane.org/gmane.linux.kernel.mm/113364

-- 
 Kirill A. Shutemov



* Re: [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache
  2014-02-11 23:58     ` Kirill A. Shutemov
@ 2014-02-12  0:25       ` Linus Torvalds
  2014-02-12  0:44         ` Kirill A. Shutemov
From: Linus Torvalds @ 2014-02-12  0:25 UTC
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Mel Gorman, Rik van Riel, Andi Kleen,
	Matthew Wilcox, Dave Hansen, linux-mm

On Tue, Feb 11, 2014 at 3:58 PM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> Linus Torvalds wrote:
>
> It's on top of v3.14-rc1 + the __do_fault() clean up[1].
>
> It's also on git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux fault_around/v1
>
> [1] http://thread.gmane.org/gmane.linux.kernel.mm/113364

Ok, that patch-series looks good to me too.

And I still see nothing wrong that would cause it not to boot. I think
the "do_async_mmap_readahead()" in lock_secondary_pages() is silly and
shouldn't really be done, but I don't think it should cause any
problems per se, it just feels very wrong to do that inside the loop.

             Linus



* Re: [RFC, PATCH 0/2] mm: map few pages around fault address if they are in page cache
  2014-02-12  0:25       ` Linus Torvalds
@ 2014-02-12  0:44         ` Kirill A. Shutemov
From: Kirill A. Shutemov @ 2014-02-12  0:44 UTC
  To: Linus Torvalds
  Cc: Kirill A. Shutemov, Andrew Morton, Mel Gorman, Rik van Riel,
	Andi Kleen, Matthew Wilcox, Dave Hansen, linux-mm

Linus Torvalds wrote:
> On Tue, Feb 11, 2014 at 3:58 PM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> > Linus Torvalds wrote:
> >
> > It's on top of v3.14-rc1 + the __do_fault() clean up[1].
> >
> > It's also on git:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux fault_around/v1
> >
> > [1] http://thread.gmane.org/gmane.linux.kernel.mm/113364
> 
> Ok, that patch-series looks good to me too.
> 
> And I still see nothing wrong that would cause it not to boot.

It actually boots to the UI and kinda works until I try to rebuild the
kernel. Then all IO stops, but my window manager still works. I can
switch between windows :-/

> I think the "do_async_mmap_readahead()" in lock_secondary_pages() is silly
> and shouldn't really be done, but I don't think it should cause any problems
> per se, it just feels very wrong to do that inside the loop.

I tried to replace do_async_mmap_readahead() locally with this:

diff --git a/mm/filemap.c b/mm/filemap.c
index 0661358db958..b28d19cafefc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1612,8 +1612,8 @@ static struct page *lock_secondary_pages(struct vm_area_struct *vma,
 		}
 		if (pages[i]->index > vmf->max)
 			goto put;
-		do_async_mmap_readahead(vma, &file->f_ra, file,
-				pages[i], pages[i]->index);
+		if (PageReadahead(pages[i]))
+			goto put;
 		if (!trylock_page(pages[i]))
 			goto put;
 		/* Truncated? */
@@ -1625,6 +1625,8 @@ static struct page *lock_secondary_pages(struct vm_area_struct *vma,
 			>> PAGE_CACHE_SHIFT;
 		if (unlikely(pages[i]->index >= size))
 			goto unlock;
+		if (file->f_ra.mmap_miss > 0)
+			file->f_ra.mmap_miss--;
 		continue;
 unlock:
 		unlock_page(pages[i]);
-- 
 Kirill A. Shutemov


