From: "Huang, Kai" <kai.huang@intel.com>
To: "linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"bp@alien8.de" <bp@alien8.de>,
"jarkko@kernel.org" <jarkko@kernel.org>,
"Chatre, Reinette" <reinette.chatre@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"x86@kernel.org" <x86@kernel.org>,
"haitao.huang@linux.intel.com" <haitao.huang@linux.intel.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"mingo@redhat.com" <mingo@redhat.com>
Cc: "kristen@linux.intel.com" <kristen@linux.intel.com>,
"Mehta, Sohil" <sohil.mehta@intel.com>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"Hansen, Dave" <dave.hansen@intel.com>,
"Christopherson, Sean" <seanjc@google.com>
Subject: Re: [PATCH v5] x86/sgx: Resolves SECS reclaim vs. page fault for EAUG race
Date: Thu, 27 Jul 2023 23:21:26 +0000
Message-ID: <6ccb705bc4345420e6c730245f871ba1d9413203.camel@intel.com>
In-Reply-To: <op.18qu84gewjvjmi@hhuan26-mobl.amr.corp.intel.com>
On Thu, 2023-07-27 at 09:16 -0500, Haitao Huang wrote:
> On Wed, 26 Jul 2023 21:50:02 -0500, Huang, Kai <kai.huang@intel.com> wrote:
>
> > On Wed, 2023-07-26 at 18:02 -0700, Haitao Huang wrote:
> > > Under heavy load, the SGX EPC reclaimer (ksgxd) may reclaim the SECS EPC
> >
> > If I read correctly, Dave suggested not using "high" ("heavy" in this
> > sentence) or "low" pressure:
> >
> > https://lore.kernel.org/lkml/op.179a4xs0wjvjmi@hhuan26-mobl.amr.corp.intel.com/T/#m9120eac6a4a94daa7c9fcc47709f241cd181e5dc
> >
> > And I agree. For instance, consider the case where this happens to one
> > extremely "small" enclave while a new "big" enclave starts to run. I don't
> > think we should say this is "under heavy load". Just stick to the fact that
> > the reclaimer may reclaim the SECS page.
> >
> Maybe I have some confusion here, but I did not think Dave had issues with
> 'heavy load'. When this happens, the last page causing #PF (page A below)
> should be the "youngest" in PTE, and it got paged out together with the
> SECS before the #PF is even handled. Given that ksgxd moves 'young'
> pages to the back of the reclaim queue, for that to happen almost
> all EPC pages must be paged out for all enclaves at that time, so it means
> heavy load to me. That is also consistent with my tests.
I already provided an example: swapping out an "extremely small" enclave.
Anyway, no big deal to me.
>
> > > page for an enclave and set encl->secs.epc_page to NULL. But the SECS
> > > EPC page is used for EAUG in the SGX page fault handler without checking
> > > for NULL and reloading.
> > >
> > > Fix this by checking if SECS is loaded before EAUG and loading it if it
> > > was reclaimed.
> > >
> > > The SECS page holds global enclave metadata. It can only be reclaimed
> > > when there are no other enclave pages remaining. At that point,
> > > virtually nothing can be done with the enclave until the SECS page is
> > > paged back in.
> ...
> > > But it is still possible for a #PF for a non-SECS page to race
> > > with paging out the SECS page: when the last resident non-SECS page A
> > > triggers a #PF in a non-resident page B, and then page A and the SECS
> > > both are paged out before the #PF on B is handled.
> > >
> > > Hitting this bug requires that race to be triggered by a #PF for EAUG.
> >
> > The above race can happen for the normal ELDU path too, so I suppose it
> > would be better to mention why the normal ELDU path doesn't have this
> > issue: it already does what this fix does.
> >
> Should we focus on the bug and the fix itself instead of explaining a
> non-bug case? The simple changes in this patch also show that clearly if
> people look for it.
So you spent a lot of text explaining the race condition, but that race
condition applies to both ELDU and EAUG. I personally went to the code to check
whether ELDU has this issue too, and it turned out only EAUG does. If you had
mentioned this in the changelog, perhaps I wouldn't have needed to read the code.
Anyway, just my 2 cents.
I don't want these comments to block this patch, so feel free to add my tag:
Reviewed-by: Kai Huang <kai.huang@intel.com>
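
For anyone following the thread without the code at hand, the race in the
quoted changelog and the shape of the fix can be sketched as a small user-space
C model. This is only an illustration under stated assumptions: the structs and
every model_* function are hypothetical names made up for this sketch, not the
kernel's SGX code, and locking, backing storage and the real EAUG/ELDU
instructions are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; not the kernel's data structures. */
struct model_epc_page { int id; };

struct model_encl {
	struct model_epc_page *secs_epc_page; /* NULL once the SECS is reclaimed */
	int resident_children;                /* non-SECS pages currently in EPC */
	int secs_reloads;                     /* how often the SECS was paged back in */
};

static struct model_epc_page model_secs_backing = { .id = 0 };

/* Reclaimer side: the SECS may only go once no child pages remain. */
static void model_reclaim_secs(struct model_encl *encl)
{
	assert(encl->resident_children >= 0);
	if (encl->resident_children == 0)
		encl->secs_epc_page = NULL;
}

/* Stand-in for paging the SECS back in. */
static void model_load_secs(struct model_encl *encl)
{
	encl->secs_epc_page = &model_secs_backing;
	encl->secs_reloads++;
}

/*
 * Fault path for a dynamically added page (the EAUG case).
 * check_secs == 0 models the behaviour before the fix,
 * check_secs != 0 models "check if SECS is loaded and load it if not".
 * Returns 0 on success, -1 where the unfixed path would use a NULL SECS.
 */
static int model_eaug_fault(struct model_encl *encl, int check_secs)
{
	if (check_secs && !encl->secs_epc_page)
		model_load_secs(encl);

	if (!encl->secs_epc_page)
		return -1;

	encl->resident_children++;
	return 0;
}

/* Replays the sequence from the changelog: A faults on B, then A and SECS go. */
static int model_run_race(int check_secs)
{
	struct model_encl encl = {
		.secs_epc_page = &model_secs_backing,
		.resident_children = 1, /* page A is the last resident page */
		.secs_reloads = 0,
	};

	/* #PF on page B is pending; before it is handled, page A is paged out. */
	encl.resident_children = 0;
	/* With no children left, the reclaimer takes the SECS as well. */
	model_reclaim_secs(&encl);

	/* Only now is the #PF on page B handled. */
	return model_eaug_fault(&encl, check_secs);
}
```

In this model, model_run_race(0) returns -1 (the fault path finds no SECS) and
model_run_race(1) returns 0 (the SECS is reloaded first). The same
check-and-reload step is what the normal ELDU path is described above as
already doing.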