QEMU-Devel Archive mirror
 help / color / mirror / Atom feed
From: Shiyang Ruan via <qemu-devel@nongnu.org>
To: Ira Weiny <ira.weiny@intel.com>
Cc: linux-cxl@vger.kernel.org, qemu-devel@nongnu.org,
	Jonathan.Cameron@huawei.com, dan.j.williams@intel.com,
	dave@stgolabs.net, alison.schofield@intel.com
Subject: Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
Date: Fri, 3 May 2024 18:42:31 +0800	[thread overview]
Message-ID: <0d0b087f-9aa4-45d3-9412-0b1a6a005c52@fujitsu.com> (raw)
In-Reply-To: <6627f68be4ff5_cd8e0294c3@iweiny-mobl.notmuch>



在 2024/4/24 1:57, Ira Weiny 写道:
> Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
>> could handle poison pages in time.  Per CXL spec, the device error event
>> could be signaled through FW-First and OS-First methods.
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
> 
> I'm a bit confused by the need for this patch.  Perhaps a bit more detail
> here?

Yes, I should have wrote more details.

I want to check and make sure the HWPOISON on a CXL device (type3) is 
working properly.  For example, a FSDAX filesystem created on a 
type3-pmem device, then it gets a POISON bit, the OS should be able to 
handle this POISON event: find the relevant process

Currently I'm using Qemu with several simulated CXL devices, and using 
poison injection API of Qemu to create POISON records, but OS isn't 
notified.  Only when we actively call list POISON records (cxl list -L) 
will the driver fetch them and log into trace events, then we see the 
POISON records.  Memory failure wasn't triggered too.

That's why I said "poison creation on cxl memdev is silent".  Per spec, 
POISON creation should be notified to OS.  Since not familiar with 
firmware part, I'm try adding this notification for OS-First.

> 
> More comments below.
> 
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
>> +			      u64 dpa)
>>   {
>> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
>> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
>> +	unsigned long pfn = PHYS_PFN(hpa);
>> +
>> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
>> +		return;
>> +
>> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
> 
> I thought that ras daemon was supposed to take care of this when the trace
> event occurred.  Alison is working on the HPA data for that path.

It seems to save CXL trace events/memory-failures to DB and report to 
others, but cannot let OS call memory failure.

> 
>> +}
>> +
>> +static int __cxl_report_poison(struct device *dev, void *arg)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_memdev *cxlmd;
>> +	u64 dpa = *(u64 *)arg;
>> +
>> +	cxled = to_cxl_endpoint_decoder(dev);
>> +	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
>> +		return 0;
>> +
>> +	if (cxled->mode == CXL_DECODER_MIXED) {
>> +		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
>> +		return 0;
>> +	}
>> +
>> +	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
>> +		return 0;
>> +
>> +	cxlmd = cxled_to_memdev(cxled);
>> +	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
>> +
>> +	return 1;
>> +}
>> +
>> +static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
>> +{
>> +	struct cxl_port *port = cxlmd->endpoint;
>> +
>> +	/*
>> +	 * No region is mapped to this endpoint, that is to say no HPA is
>> +	 * mapped.
>> +	 */
>> +	if (!port || !is_cxl_endpoint(port) ||
>> +	    cxl_num_decoders_committed(port) == 0)
>> +		return;
>> +
>> +	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
>> +}
>> +
>> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
>> +					   enum cxl_event_log_type type,
>> +					   struct cxl_event_gen_media *rec)
>> +{
>> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
>> +
>> +	if (type == CXL_EVENT_TYPE_FAIL) {
> 
> Why only FAIL and not FATAL?

POISON is recoverable(could be cleared by sending CLEAR_POISON command) 
and won't cause serious problems, so only use failure event here.

> 
>> +		switch (rec->transaction_type) {
>> +		case CXL_EVENT_TRANSACTION_READ:
>> +		case CXL_EVENT_TRANSACTION_WRITE:
>> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> 
> Why not scan media?

Yes, this type should be handled too.  Will add.

> 
>> +			cxl_event_handle_poison(cxlmd, dpa);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>> +}
>> +
>> +static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
>> +				  enum cxl_event_log_type type,
>> +				  struct cxl_event_dram *rec)
>> +{
>> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
>> +
>> +	if (type == CXL_EVENT_TYPE_FAIL) {
>> +		switch (rec->transaction_type) {
>> +		case CXL_EVENT_TRANSACTION_READ:
>> +		case CXL_EVENT_TRANSACTION_WRITE:
>> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
>> +			cxl_event_handle_poison(cxlmd, dpa);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>> +}
>> +
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt)
>> +{
>> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
>> -	else if (event_type == CXL_CPER_EVENT_DRAM)
>> +		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
>> +	} else if (event_type == CXL_CPER_EVENT_DRAM) {
>>   		trace_cxl_dram(cxlmd, type, &evt->dram);
>> -	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>> +		cxl_event_handle_dram(cxlmd, type, &evt->dram);
>> +	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>>   		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
>>   	else
>>   		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
>>   }
>> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
>> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>>
> 
> Why all the churn with changing the names of functions?

After adding cxl_report_poison(), these functions not only traces events 
but also contains event handler.  So rename them.

--
Thanks,
Ruan.

> 
> Ira
> 
>>   
>> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -				     enum cxl_event_log_type type,
>> -				     struct cxl_event_record_raw *record)
>> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +				      enum cxl_event_log_type type,
>> +				      struct cxl_event_record_raw *record)
>>   {
>>   	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>>   	const uuid_t *uuid = &record->id;
>> @@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>>   	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>>   		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>>   
>> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
>> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>>   }
>>   
>>   static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>> @@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>>   			break;
>>   
>>   		for (i = 0; i < nr_rec; i++)
>> -			__cxl_event_trace_record(cxlmd, type,
>> -						 &payload->records[i]);
>> +			__cxl_event_handle_record(cxlmd, type,
>> +						  &payload->records[i]);
>>   
>>   		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>>   			trace_cxl_overflow(cxlmd, type, payload);
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 36cee9c30ceb..ba1347de5651 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>   void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>   				  unsigned long *cmds);
>>   void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt);
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt);
>>   int cxl_set_timestamp(struct cxl_memdev_state *mds);
>>   int cxl_poison_state_init(struct cxl_memdev_state *mds);
>>   int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
>> index 03fa6d50d46f..8189bed76c12 100644
>> --- a/include/linux/cxl-event.h
>> +++ b/include/linux/cxl-event.h
>> @@ -23,6 +23,20 @@ struct cxl_event_generic {
>>   	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
>>   } __packed;
>>   
>> +/*
>> + * Event transaction type
>> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
>> + */
>> +enum cxl_event_transaction_type {
>> +	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
>> +	CXL_EVENT_TRANSACTION_READ,
>> +	CXL_EVENT_TRANSACTION_WRITE,
>> +	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
>> +	CXL_EVENT_TRANSACTION_INJECT_POISON,
>> +	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
>> +	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
>> +};
>> +
>>   /*
>>    * General Media Event Record
>>    * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
>> @@ -33,7 +47,7 @@ struct cxl_event_gen_media {
>>   	__le64 phys_addr;
>>   	u8 descriptor;
>>   	u8 type;
>> -	u8 transaction_type;
>> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>>   	u8 validity_flags[2];
>>   	u8 channel;
>>   	u8 rank;
>> @@ -52,7 +66,7 @@ struct cxl_event_dram {
>>   	__le64 phys_addr;
>>   	u8 descriptor;
>>   	u8 type;
>> -	u8 transaction_type;
>> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>>   	u8 validity_flags[2];
>>   	u8 channel;
>>   	u8 rank;
>> -- 
>> 2.34.1
>>
> 
> 


  reply	other threads:[~2024-05-03 10:44 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-17  7:50 [PATCH v3 0/2] cxl: add poison creation event handler Shiyang Ruan via
2024-04-17  7:50 ` [PATCH v3 1/2] cxl/core: correct length of DPA field masks Shiyang Ruan via
2024-04-23 17:35   ` Ira Weiny
2024-04-23 17:35   ` Dan Williams
2024-04-23 17:42   ` Alison Schofield
2024-04-23 21:04     ` Ira Weiny
2024-04-25 10:05       ` Shiyang Ruan via
2024-04-25 16:04         ` Ira Weiny
2024-04-30 21:00   ` Alison Schofield
2024-05-03 11:37     ` Shiyang Ruan via
2024-04-17  7:50 ` [PATCH v3 2/2] cxl/core: add poison creation event handler Shiyang Ruan via
2024-04-17 17:30   ` Dave Jiang
2024-04-18  9:01     ` Shiyang Ruan via
2024-04-21 12:14   ` kernel test robot
2024-04-23 17:57   ` Ira Weiny
2024-05-03 10:42     ` Shiyang Ruan via [this message]
2024-05-08 16:15       ` Jonathan Cameron via
2024-04-23 18:40   ` Dan Williams
2024-05-03 11:32     ` Shiyang Ruan via
2024-05-21  5:35       ` Shiyang Ruan via
2024-05-22  6:45         ` Dan Williams
2024-05-24 15:15           ` Shiyang Ruan via
2024-05-28 10:13             ` Shiyang Ruan via

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0d0b087f-9aa4-45d3-9412-0b1a6a005c52@fujitsu.com \
    --to=qemu-devel@nongnu.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave@stgolabs.net \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=ruansy.fnst@fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).