From: Lucas De Marchi <lucas.demarchi@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>,
	"Umesh Nerlige Ramappa" <umesh.nerlige.ramappa@intel.com>,
	Tvrtko Ursulin <tursulin@ursulin.net>
Subject: Re: [PATCH v2 3/6] drm/xe: Add helper to accumulate exec queue runtime
Date: Wed, 24 Apr 2024 09:51:58 -0500
Message-ID: <ji47qyldawgmoj4fmdmzaitupigutgqxgzcadfmx3owems4bsy@lwsoy6p5o5jx>
In-Reply-To: <ZiiFvZYWhpdi8ZKL@DUT025-TGLU.fm.intel.com>

On Wed, Apr 24, 2024 at 04:08:29AM GMT, Matthew Brost wrote:
>On Tue, Apr 23, 2024 at 04:56:48PM -0700, Lucas De Marchi wrote:
>> From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>
>> Add a helper to accumulate per-client runtime of all its
>> exec queues. Currently that is done in 2 places:
>>
>> 	1. when the exec_queue is destroyed
>> 	2. when the sched job is completed
>>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>> ---
>>  drivers/gpu/drm/xe/xe_device_types.h |  9 +++++++
>>  drivers/gpu/drm/xe/xe_exec_queue.c   | 37 ++++++++++++++++++++++++++++
>>  drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
>>  drivers/gpu/drm/xe/xe_sched_job.c    |  2 ++
>>  4 files changed, 49 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> index 2e62450d86e1..33d3bf93a2f1 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -547,6 +547,15 @@ struct xe_file {
>>  		struct mutex lock;
>>  	} exec_queue;
>>
>> +	/**
>> +	 * @runtime: hw engine class runtime in ticks for this drm client
>> +	 *
>> +	 * Only stats from xe_exec_queue->lrc[0] are accumulated. For multi-lrc
>> +	 * case, since all jobs run in parallel on the engines, only the stats
>> +	 * from lrc[0] are sufficient.
>> +	 */
>> +	u64 runtime[XE_ENGINE_CLASS_MAX];
>> +
>>  	/** @client: drm client */
>>  	struct xe_drm_client *client;
>>  };
>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>> index 395de93579fa..b7b6256cb96a 100644
>> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>> @@ -214,6 +214,8 @@ void xe_exec_queue_fini(struct xe_exec_queue *q)
>>  {
>>  	int i;
>>
>> +	xe_exec_queue_update_runtime(q);
>> +
>>  	for (i = 0; i < q->width; ++i)
>>  		xe_lrc_finish(q->lrc + i);
>>  	if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
>> @@ -769,6 +771,41 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
>>  		q->lrc[0].fence_ctx.next_seqno - 1;
>>  }
>>
>> +/**
>> + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
>> + * @q: The exec queue
>> + *
>> + * Update the timestamp saved by HW for this exec queue and save runtime
>> + * calculated by using the delta from last update. On multi-lrc case, only the
>> + * first is considered.
>> + */
>> +void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
>> +{
>> +	struct xe_file *xef;
>> +	struct xe_lrc *lrc;
>> +	u32 old_ts, new_ts;
>> +
>> +	/*
>> +	 * Jobs that are run during driver load may use an exec_queue, but are
>> +	 * not associated with a user xe file, so avoid accumulating busyness
>> +	 * for kernel specific work.
>> +	 */
>> +	if (!q->vm || !q->vm->xef)
>> +		return;
>> +
>> +	xef = q->vm->xef;
>> +	lrc = &q->lrc[0];
>> +
>> +	new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
>> +
>> +	/*
>> +	 * Special case the very first timestamp: we don't want the
>> +	 * initial delta to be a huge value
>> +	 */
>> +	if (old_ts)
>> +		xef->runtime[q->class] += new_ts - old_ts;
>> +}
>> +
>>  void xe_exec_queue_kill(struct xe_exec_queue *q)
>>  {
>>  	struct xe_exec_queue *eq = q, *next;
>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h
>> index 02ce8d204622..45b72daa2db3 100644
>> --- a/drivers/gpu/drm/xe/xe_exec_queue.h
>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.h
>> @@ -66,5 +66,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *e,
>>  					       struct xe_vm *vm);
>>  void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
>>  				  struct dma_fence *fence);
>> +void xe_exec_queue_update_runtime(struct xe_exec_queue *q);
>>
>>  #endif
>> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
>> index cd8a2fba5438..6a081a4fa190 100644
>> --- a/drivers/gpu/drm/xe/xe_sched_job.c
>> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
>> @@ -242,6 +242,8 @@ bool xe_sched_job_completed(struct xe_sched_job *job)
>>  {
>
>This seems like the wrong place. xe_sched_job_completed is a helper
>which determines *if* a job completed, *not* when it is completed. The

indeed, not the right place.


>DRM scheduler free_job callback is probably the right place
>(guc_exec_queue_free_job or execlist_job_free). So just call
>xe_exec_queue_update_runtime there?

yeah, I will add it there and do some tests.
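
As a rough illustration of the split being discussed, here is a small
self-contained model. All names in it (model_queue, model_job,
model_update_runtime, model_job_completed, model_free_job) are
hypothetical stand-ins and not the real xe or DRM scheduler API. It only
assumes what the patch above shows: runtime is accumulated as a u32
timestamp delta and the very first sample is skipped. The completion
check stays a pure predicate, and the accounting happens in the
free-job path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real xe structures. */
struct model_queue {
	uint32_t hw_ts;      /* timestamp the "hardware" saved most recently */
	uint32_t last_ts;    /* timestamp observed at the previous update */
	uint64_t runtime;    /* ticks accumulated for the owning client */
	uint32_t seqno_done; /* last seqno reported as finished */
};

struct model_job {
	struct model_queue *q;
	uint32_t seqno;
};

/*
 * Same delta logic as the helper in the patch: remember the new
 * timestamp, and add (new - old) unless this is the first sample.
 * The subtraction is done in 32 bits so a counter wrap still yields
 * the right delta.
 */
static void model_update_runtime(struct model_queue *q)
{
	uint32_t old_ts, new_ts;

	assert(q);

	old_ts = q->last_ts;
	new_ts = q->hw_ts;
	q->last_ts = new_ts;

	if (old_ts)
		q->runtime += (uint32_t)(new_ts - old_ts);
}

/* Pure predicate: answers *if* the job completed, with no side effects. */
static bool model_job_completed(const struct model_job *job)
{
	return (int32_t)(job->q->seqno_done - job->seqno) >= 0;
}

/* Stand-in for a scheduler free_job callback: runs once per finished job. */
static void model_free_job(struct model_job *job)
{
	model_update_runtime(job->q);
}
```

In this model the predicate can be polled any number of times without
touching the accounting, while each call to model_free_job() folds in
whatever delta the timestamp has advanced since the previous update.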

thanks for catching this.

Lucas De Marchi

>
>Matt
>
>>  	struct xe_lrc *lrc = job->q->lrc;
>>
>> +	xe_exec_queue_update_runtime(job->q);
>> +
>>  	/*
>>  	 * Can safely check just LRC[0] seqno as that is last seqno written when
>>  	 * parallel handshake is done.
>> --
>> 2.43.0
>>
