From: Jane Chu <jane.chu@oracle.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
nvdimm@lists.linux.dev, Dan Williams <dan.j.williams@intel.com>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
linux-mm@kvack.org
Subject: Re: [PATCH] mm: Convert DAX lock/unlock page to lock/unlock folio
Date: Thu, 24 Aug 2023 14:30:21 -0700 [thread overview]
Message-ID: <b2b0fce8-b7f8-420e-0945-ab9581b23d9a@oracle.com> (raw)
In-Reply-To: <ZOeq4HJwCULHPtaU@casper.infradead.org>
On 8/24/2023 12:09 PM, Matthew Wilcox wrote:
> On Thu, Aug 24, 2023 at 11:24:20AM -0700, Jane Chu wrote:
>>
>> On 8/22/2023 4:13 PM, Matthew Wilcox (Oracle) wrote:
>> [..]
>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>> index a6c3af985554..b81d6eb4e6ff 100644
>>> --- a/mm/memory-failure.c
>>> +++ b/mm/memory-failure.c
>>> @@ -1717,16 +1717,11 @@ static int mf_generic_kill_procs(unsigned long long pfn, int flags,
>>> struct dev_pagemap *pgmap)
>>> {
>>> struct page *page = pfn_to_page(pfn);
>>
Looks like the above line, i.e. the 'page' pointer, is no longer needed.
>
> So ...
>
> It seems to me that currently handling of hwpoison for DAX memory is
> handled on a per-allocation basis but it should probably be handled
> on a per-page basis eventually?
My recollection is that since the inception of
memory_failure_dev_pagemap(), DAX poison handling has been on a per-page
basis, in that .si_addr points to the subpage vaddr while .si_lsb
indicates the user mapping size.
>
> If so, we'd want to do something like this ...
>
> +++ b/mm/memory-failure.c
> @@ -1755,7 +1755,9 @@ static int mf_generic_kill_procs(unsigned long long pfn, int flags,
> * Use this flag as an indication that the dax page has been
> * remapped UC to prevent speculative consumption of poison.
> */
> - SetPageHWPoison(&folio->page);
> + SetPageHWPoison(page);
> + if (folio_test_large(folio))
> + folio_set_has_hwpoisoned(folio);
>
> /*
> * Unlike System-RAM there is no possibility to swap in a
> @@ -1766,7 +1768,8 @@ static int mf_generic_kill_procs(unsigned long long pfn, int flags,
> flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
> collect_procs(&folio->page, &to_kill, true);
>
> - unmap_and_kill(&to_kill, pfn, folio->mapping, folio->index, flags);
> + unmap_and_kill(&to_kill, pfn, folio->mapping,
> + folio->index + folio_page_idx(folio, page), flags);
> unlock:
> dax_unlock_folio(folio, cookie);
> return rc;
>
This change makes sense to me.
mf_generic_kill_procs() is the generic path, used when the DAX
filesystem does not register/provide
dax_dev->holder_ops->notify_failure. Currently only XFS on DAX does the
registration and utilizes the specific version, mf_dax_kill_procs().
> But this is a change in current behaviour and I didn't want to think
> through the implications of all of this. Would you like to take on this
> project? ;-)
>
Sure, I'd be happy to.
>
> My vague plan for hwpoison in the memdesc world is that poison is always
> handled on a per-page basis (by means of setting page->memdesc to a
> hwpoison data structure). If the allocation contains multiple pages,
> then we set a flag somewhere like the current has_hwpoisoned flag.
Could you elaborate on "by means of setting page->memdesc to a hwpoison
data structure" please?
As for the PG_has_hwpoisoned flag, I see one reference for now -
$ git grep folio_test_has_hwpoisoned
mm/shmem.c: folio_test_has_hwpoisoned(folio)))
$ git grep folio_clear_has_hwpoisoned
<none>
A DAX THP is a THP that is potentially recoverable from hwpoison(s). If
a DAX THP has multiple hwpoisons, folio_clear_has_hwpoisoned() can be
called only when the last poisoned subpage is recovered - which implies
keeping track of the number of poisoned subpages per THP somehow. Any
thoughts on the best thing to do? Hmm, maybe add a field in pgmap? Or
add a query to the driver to return whether there is any hwpoison in a
given range?
thanks!
-jane
Thread overview: 5+ messages
2023-08-22 23:13 [PATCH] mm: Convert DAX lock/unlock page to lock/unlock folio Matthew Wilcox (Oracle)
2023-08-23 5:51 ` Naoya Horiguchi
2023-08-24 18:24 ` Jane Chu
2023-08-24 19:09 ` Matthew Wilcox
2023-08-24 21:30 ` Jane Chu [this message]