[PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)

* [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
@ 2021-06-30 10:10 YuBiao Wang
  2021-06-30 11:14 ` Christian König
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: YuBiao Wang @ 2021-06-30 10:10 UTC
  To: amd-gfx
  Cc: YuBiao Wang, Andrey Grodzovsky, Jack Xiao, Feifei Xu, horace.chen,
	Kevin Wang, Tuikov Luben, Deucher Alexander, Evan Quan,
	Christian König, Monk Liu, Hawking Zhang

[Why]
Under SR-IOV, GPU timing counters are read via KIQ, which introduces
a delay.

[How]
Read the counters directly via MMIO instead.

v2: Add an additional check to prevent a carryover issue.
v3: Check for carryover only once to avoid a performance issue.
v4: Add a comment on roughly how often carryover happens.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Horace Chen <horace.chen@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index ff7e9f49040e..9355494002a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
 
 static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 {
-	uint64_t clock;
+	uint64_t clock, clock_lo, clock_hi, hi_check;
 
 	amdgpu_gfx_off_ctrl(adev, false);
 	mutex_lock(&adev->gfx.gpu_clock_mutex);
@@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
 		break;
 	default:
-		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
-			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);
+		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
+		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
+		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
+		/* Carryover happens every 4 Giga time cycles counts which is roughly 42 secs */
+		if (hi_check != clock_hi) {
+			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
+			clock_hi = hi_check;
+		}
+		clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
 		break;
 	}
 	mutex_unlock(&adev->gfx.gpu_clock_mutex);
-- 
2.25.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
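
The upper/lower/upper read order in the patch above can be tried outside the driver with a small userspace model. This is only a sketch under stated assumptions: `mock_tsc`, `read_upper`, `read_lower`, `read_counter` and `read_counter_naive` are hypothetical names, and the mock counter simply advances by `step` ticks on every register read instead of touching real hardware.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit hardware counter; not driver code. */
struct mock_tsc {
	uint64_t value;	/* current counter value */
	uint64_t step;	/* ticks that elapse during each register read */
};

/* Return the upper 32 bits, then let time advance. */
static uint32_t read_upper(struct mock_tsc *t)
{
	uint32_t v = (uint32_t)(t->value >> 32);

	t->value += t->step;
	return v;
}

/* Return the lower 32 bits, then let time advance. */
static uint32_t read_lower(struct mock_tsc *t)
{
	uint32_t v = (uint32_t)t->value;

	t->value += t->step;
	return v;
}

/* Same read order as the patch: upper, lower, upper again, one retry. */
static uint64_t read_counter(struct mock_tsc *t)
{
	uint32_t hi = read_upper(t);
	uint32_t lo = read_lower(t);
	uint32_t hi_check = read_upper(t);

	if (hi_check != hi) {
		/* A carry happened between the two upper reads. */
		lo = read_lower(t);
		hi = hi_check;
	}
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Lower-then-upper read without any check, for comparison. */
static uint64_t read_counter_naive(struct mock_tsc *t)
{
	uint32_t lo = read_lower(t);
	uint32_t hi = read_upper(t);

	return (uint64_t)lo | ((uint64_t)hi << 32);
}
```

Starting the mock at 0xFFFFFFFF with a step of 1, `read_counter` returns 0x100000002, a value the counter really held during the call, while `read_counter_naive` returns 0x1FFFFFFFF, which is about 2^32 ticks ahead of the real counter.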

* Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
  2021-06-30 10:10 [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4) YuBiao Wang
@ 2021-06-30 11:14 ` Christian König
  2021-06-30 11:17   ` Liu, Monk
  2021-06-30 11:16 ` Liu, Monk
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Christian König @ 2021-06-30 11:14 UTC
  To: YuBiao Wang, amd-gfx
  Cc: Andrey Grodzovsky, Jack Xiao, Feifei Xu, horace.chen, Kevin Wang,
	Tuikov Luben, Deucher Alexander, Evan Quan, Christian König,
	Monk Liu, Hawking Zhang

Am 30.06.21 um 12:10 schrieb YuBiao Wang:
> [Why]
> GPU timing counters are read via KIQ under sriov, which will introduce
> a delay.
>
> [How]
> It could be directly read by MMIO.
>
> v2: Add additional check to prevent carryover issue.
> v3: Only check for carryover for once to prevent performance issue.
> v4: Add comments of the rough frequency where carryover happens.
>
> Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
> Acked-by: Horace Chen <horace.chen@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index ff7e9f49040e..9355494002a1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
>   
>   static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>   {
> -	uint64_t clock;
> +	uint64_t clock, clock_lo, clock_hi, hi_check;
>   
>   	amdgpu_gfx_off_ctrl(adev, false);
>   	mutex_lock(&adev->gfx.gpu_clock_mutex);
> @@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>   			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
>   		break;
>   	default:
> -		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
> -			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);

If you want to be extra sure you could add a preempt_disable(); here.

> +		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
> +		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
> +		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
> +		/* Carryover happens every 4 Giga time cycles counts which is roughly 42 secs */
> +		if (hi_check != clock_hi) {
> +			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
> +			clock_hi = hi_check;
> +		}

And a preempt_enable(); here. This way the critical section is also not 
interrupted by a task switch.

But either way the patch is
Reviewed-by: Christian König <christian.koenig@amd.com>

Regards,
Christian.

> +		clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
>   		break;
>   	}
>   	mutex_unlock(&adev->gfx.gpu_clock_mutex);

* RE: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
  2021-06-30 10:10 [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4) YuBiao Wang
  2021-06-30 11:14 ` Christian König
@ 2021-06-30 11:16 ` Liu, Monk
  2021-06-30 14:15 ` Jay Cornwall
  2021-06-30 16:52 ` Luben Tuikov
  3 siblings, 0 replies; 7+ messages in thread
From: Liu, Monk @ 2021-06-30 11:16 UTC
  To: Wang, YuBiao, amd-gfx@lists.freedesktop.org
  Cc: Wang, YuBiao, Grodzovsky, Andrey, Xiao, Jack, Xu, Feifei,
	Chen, Horace, Wang, Kevin(Yang), Tuikov, Luben,
	Deucher, Alexander, Quan, Evan, Koenig, Christian, Zhang, Hawking

[AMD Official Use Only]

Reviewed-by: Monk Liu <monk.liu@amd.com>
Thanks

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: YuBiao Wang <YuBiao.Wang@amd.com> 
Sent: Wednesday, June 30, 2021 6:10 PM
To: amd-gfx@lists.freedesktop.org
Cc: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Quan, Evan <Evan.Quan@amd.com>; Chen, Horace <Horace.Chen@amd.com>; Tuikov, Luben <Luben.Tuikov@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Xiao, Jack <Jack.Xiao@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>; Liu, Monk <Monk.Liu@amd.com>; Xu, Feifei <Feifei.Xu@amd.com>; Wang, Kevin(Yang) <Kevin1.Wang@amd.com>; Wang, YuBiao <YuBiao.Wang@amd.com>
Subject: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)

[Why]
GPU timing counters are read via KIQ under sriov, which will introduce a delay.

[How]
It could be directly read by MMIO.

v2: Add additional check to prevent carryover issue.
v3: Only check for carryover for once to prevent performance issue.
v4: Add comments of the rough frequency where carryover happens.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Horace Chen <horace.chen@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index ff7e9f49040e..9355494002a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
 
 static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 {
-	uint64_t clock;
+	uint64_t clock, clock_lo, clock_hi, hi_check;
 
 	amdgpu_gfx_off_ctrl(adev, false);
 	mutex_lock(&adev->gfx.gpu_clock_mutex);
@@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
 		break;
 	default:
-		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
-			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);
+		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
+		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
+		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
+		/* Carryover happens every 4 Giga time cycles counts which is roughly 42 secs */
+		if (hi_check != clock_hi) {
+			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
+			clock_hi = hi_check;
+		}
+		clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
 		break;
 	}
 	mutex_unlock(&adev->gfx.gpu_clock_mutex);
--
2.25.1

* RE: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
  2021-06-30 11:14 ` Christian König
@ 2021-06-30 11:17   ` Liu, Monk
  2021-06-30 11:21     ` Christian König
  0 siblings, 1 reply; 7+ messages in thread
From: Liu, Monk @ 2021-06-30 11:17 UTC
  To: Christian König, Wang, YuBiao, amd-gfx@lists.freedesktop.org
  Cc: Grodzovsky, Andrey, Xiao, Jack, Xu, Feifei, Chen, Horace,
	Wang, Kevin(Yang), Tuikov,  Luben, Deucher, Alexander, Quan, Evan,
	Koenig,  Christian, Zhang, Hawking

[AMD Official Use Only]

>> And a preempt_enable(); here. This way the critical section is also not interrupted by a task switch.

Do you mean to put a "preempt_disable()" here?

Thanks 

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@gmail.com> 
Sent: Wednesday, June 30, 2021 7:15 PM
To: Wang, YuBiao <YuBiao.Wang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Xiao, Jack <Jack.Xiao@amd.com>; Xu, Feifei <Feifei.Xu@amd.com>; Chen, Horace <Horace.Chen@amd.com>; Wang, Kevin(Yang) <Kevin1.Wang@amd.com>; Tuikov, Luben <Luben.Tuikov@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Quan, Evan <Evan.Quan@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Liu, Monk <Monk.Liu@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)

Am 30.06.21 um 12:10 schrieb YuBiao Wang:
> [Why]
> GPU timing counters are read via KIQ under sriov, which will introduce 
> a delay.
>
> [How]
> It could be directly read by MMIO.
>
> v2: Add additional check to prevent carryover issue.
> v3: Only check for carryover for once to prevent performance issue.
> v4: Add comments of the rough frequency where carryover happens.
>
> Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
> Acked-by: Horace Chen <horace.chen@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index ff7e9f49040e..9355494002a1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
>   
>   static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>   {
> -	uint64_t clock;
> +	uint64_t clock, clock_lo, clock_hi, hi_check;
>   
>   	amdgpu_gfx_off_ctrl(adev, false);
>   	mutex_lock(&adev->gfx.gpu_clock_mutex);
> @@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>   			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
>   		break;
>   	default:
> -		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
> -			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);

If you want to be extra sure you could add a preempt_disable(); here.

> +		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
> +		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
> +		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
> +		/* Carryover happens every 4 Giga time cycles counts which is roughly 42 secs */
> +		if (hi_check != clock_hi) {
> +			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
> +			clock_hi = hi_check;
> +		}

And a preempt_enable(); here. This way the critical section is also not interrupted by a task switch.

But either way the patch is Reviewed-by: Christian König <christian.koenig@amd.com>

Regards,
Christian.

> +		clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
>   		break;
>   	}
>   	mutex_unlock(&adev->gfx.gpu_clock_mutex);

* Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
  2021-06-30 11:17   ` Liu, Monk
@ 2021-06-30 11:21     ` Christian König
  0 siblings, 0 replies; 7+ messages in thread
From: Christian König @ 2021-06-30 11:21 UTC
  To: Liu, Monk, Christian König, Wang, YuBiao,
	amd-gfx@lists.freedesktop.org
  Cc: Grodzovsky, Andrey, Xiao, Jack, Xu, Feifei, Chen, Horace,
	Wang, Kevin(Yang), Tuikov, Luben, Deucher, Alexander, Quan, Evan,
	Zhang, Hawking

Am 30.06.21 um 13:17 schrieb Liu, Monk:
> [AMD Official Use Only]
>
>>> And a preempt_enable(); here. This way the critical section is also not interrupted by a task switch.
> Do you mean put a "preempt_disable()" here ?

No? We need to disable preemption before the critical section and enable 
it again when we are done.

Or is the code called under a spinlock or similar? In that case this 
would be superfluous.

Anyway it is rather unlikely that the task is not scheduled again for 
the next 42 seconds.

Christian.
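
The placement described here, with preemption disabled before the first register read and enabled again after the last one, could look roughly like the sketch below. It is written to compile in userspace, so every helper is a hypothetical stand-in: `stub_preempt_disable()` and `stub_preempt_enable()` mark where the kernel's `preempt_disable()` and `preempt_enable()` would go, and `stub_read_upper()` / `stub_read_lower()` stand in for the `RREG32_SOC15_NO_KIQ()` reads from the patch.

```c
#include <stdint.h>

/* Userspace stand-ins (assumptions, not kernel code). */
static int preempt_depth;
static uint64_t fake_counter;

static void stub_preempt_disable(void) { preempt_depth++; }
static void stub_preempt_enable(void)  { preempt_depth--; }

/* Each mock read returns one half, then advances the counter by one tick. */
static uint32_t stub_read_upper(void)
{
	uint32_t v = (uint32_t)(fake_counter >> 32);

	fake_counter++;
	return v;
}

static uint32_t stub_read_lower(void)
{
	uint32_t v = (uint32_t)fake_counter;

	fake_counter++;
	return v;
}

static uint64_t read_counter_no_preempt(void)
{
	uint32_t hi, lo, hi_check;

	stub_preempt_disable();		/* before the first read */
	hi = stub_read_upper();
	lo = stub_read_lower();
	hi_check = stub_read_upper();
	if (hi_check != hi) {
		lo = stub_read_lower();
		hi = hi_check;
	}
	stub_preempt_enable();		/* after the last read */

	return (uint64_t)lo | ((uint64_t)hi << 32);
}
```

In the real function the reads already sit inside `mutex_lock()` / `mutex_unlock()`. Since a mutex may sleep, the disable/enable pair would presumably have to go inside the locked region and bracket only the register reads, as in the sketch.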

>
> Thanks
>
> ------------------------------------------
> Monk Liu | Cloud-GPU Core team
> ------------------------------------------
>
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Wednesday, June 30, 2021 7:15 PM
> To: Wang, YuBiao <YuBiao.Wang@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Xiao, Jack <Jack.Xiao@amd.com>; Xu, Feifei <Feifei.Xu@amd.com>; Chen, Horace <Horace.Chen@amd.com>; Wang, Kevin(Yang) <Kevin1.Wang@amd.com>; Tuikov, Luben <Luben.Tuikov@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Quan, Evan <Evan.Quan@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Liu, Monk <Monk.Liu@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
> Subject: Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
>
> Am 30.06.21 um 12:10 schrieb YuBiao Wang:
>> [Why]
>> GPU timing counters are read via KIQ under sriov, which will introduce
>> a delay.
>>
>> [How]
>> It could be directly read by MMIO.
>>
>> v2: Add additional check to prevent carryover issue.
>> v3: Only check for carryover for once to prevent performance issue.
>> v4: Add comments of the rough frequency where carryover happens.
>>
>> Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
>> Acked-by: Horace Chen <horace.chen@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
>>    1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> index ff7e9f49040e..9355494002a1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> @@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
>>    
>>    static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>>    {
>> -	uint64_t clock;
>> +	uint64_t clock, clock_lo, clock_hi, hi_check;
>>    
>>    	amdgpu_gfx_off_ctrl(adev, false);
>>    	mutex_lock(&adev->gfx.gpu_clock_mutex);
>> @@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>>    			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
>>    		break;
>>    	default:
>> -		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
>> -			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);
> If you want to be extra sure you could add a preempt_disable(); here.
>
>> +		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
>> +		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
>> +		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
>> +		/* Carryover happens every 4 Giga time cycles counts which is roughly 42 secs */
>> +		if (hi_check != clock_hi) {
>> +			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
>> +			clock_hi = hi_check;
>> +		}
> And a preempt_enable(); here. This way the critical section is also not interrupted by a task switch.
>
> But either way the patch is Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Regards,
> Christian.
>
>> +		clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
>>    		break;
>>    	}
>>    	mutex_unlock(&adev->gfx.gpu_clock_mutex);

* Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
  2021-06-30 10:10 [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4) YuBiao Wang
  2021-06-30 11:14 ` Christian König
  2021-06-30 11:16 ` Liu, Monk
@ 2021-06-30 14:15 ` Jay Cornwall
  2021-06-30 16:52 ` Luben Tuikov
  3 siblings, 0 replies; 7+ messages in thread
From: Jay Cornwall @ 2021-06-30 14:15 UTC
  To: YuBiao Wang, amd-gfx
  Cc: Andrey Grodzovsky, Jack Xiao, Feifei Xu, horace.chen, Kevin Wang,
	Tuikov Luben, Deucher Alexander, Evan Quan, Christian König,
	Monk Liu, Hawking Zhang

On Wed, Jun 30, 2021, at 05:10, YuBiao Wang wrote:
> [Why]
> GPU timing counters are read via KIQ under sriov, which will introduce
> a delay.
> 
> [How]
> It could be directly read by MMIO.
> 
> v2: Add additional check to prevent carryover issue.
> v3: Only check for carryover for once to prevent performance issue.
> v4: Add comments of the rough frequency where carryover happens.
> 
> Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
> Acked-by: Horace Chen <horace.chen@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index ff7e9f49040e..9355494002a1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
>  
>  static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>  {
> -	uint64_t clock;
> +	uint64_t clock, clock_lo, clock_hi, hi_check;
>  
>  	amdgpu_gfx_off_ctrl(adev, false);

This clock can be read with gfxoff enabled.

>  	mutex_lock(&adev->gfx.gpu_clock_mutex);

Is the mutex relevant with this clock? It doesn't snapshot like RLC.
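
One way to read the mutex question: a snapshot-style counter keeps a shared latch, so two unserialized readers can mix halves from different snapshots, whereas a free-running counter has no such shared state. The sketch below models only the snapshot case. It is a userspace illustration with hypothetical names (`mock_gpu`, `capture`, `latched_lsb`, `latched_msb`, `unlocked_snapshot_read`), based on the "snapshot" behaviour the message attributes to RLC rather than on the actual register interface.

```c
#include <stdint.h>

/* Hypothetical model: a free-running counter plus a single shared latch. */
struct mock_gpu {
	uint64_t now;		/* free-running counter */
	uint64_t latched;	/* value captured by the last snapshot request */
};

/* A snapshot request copies the running counter into the latch. */
static void capture(struct mock_gpu *g)
{
	g->latched = g->now;
}

static uint32_t latched_lsb(const struct mock_gpu *g)
{
	return (uint32_t)g->latched;
}

static uint32_t latched_msb(const struct mock_gpu *g)
{
	return (uint32_t)(g->latched >> 32);
}

/*
 * Reader A is interrupted between its two reads by reader B's capture,
 * which is what a mutex around the sequence would prevent.
 */
static uint64_t unlocked_snapshot_read(struct mock_gpu *g, uint64_t ticks_between)
{
	uint32_t lsb, msb;

	capture(g);			/* reader A latches */
	lsb = latched_lsb(g);		/* reader A reads the low half */
	g->now += ticks_between;	/* time passes */
	capture(g);			/* reader B latches, replacing A's snapshot */
	msb = latched_msb(g);		/* reader A gets the high half of B's snapshot */

	return (uint64_t)lsb | ((uint64_t)msb << 32);
}
```

With the counter at 0xFFFFFFFF and six ticks between the two captures, reader A ends up with 0x1FFFFFFFF, a value the counter never held at either capture. For a free-running counter there is no latch for a second reader to overwrite, which is presumably why the mutex may not be needed there.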

* Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
  2021-06-30 10:10 [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4) YuBiao Wang
                   ` (2 preceding siblings ...)
  2021-06-30 14:15 ` Jay Cornwall
@ 2021-06-30 16:52 ` Luben Tuikov
  3 siblings, 0 replies; 7+ messages in thread
From: Luben Tuikov @ 2021-06-30 16:52 UTC
  To: YuBiao Wang, amd-gfx
  Cc: Andrey Grodzovsky, Jack Xiao, Feifei Xu, horace.chen, Kevin Wang,
	Deucher Alexander, Evan Quan, Christian König, Monk Liu,
	Hawking Zhang

On 2021-06-30 6:10 a.m., YuBiao Wang wrote:
> [Why]
> GPU timing counters are read via KIQ under sriov, which will introduce
> a delay.
>
> [How]
> It could be directly read by MMIO.
>
> v2: Add additional check to prevent carryover issue.
> v3: Only check for carryover for once to prevent performance issue.
> v4: Add comments of the rough frequency where carryover happens.
>
> Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
> Acked-by: Horace Chen <horace.chen@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index ff7e9f49040e..9355494002a1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
>  
>  static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>  {
> -	uint64_t clock;
> +	uint64_t clock, clock_lo, clock_hi, hi_check;
>  
>  	amdgpu_gfx_off_ctrl(adev, false);
>  	mutex_lock(&adev->gfx.gpu_clock_mutex);
> @@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
>  			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
>  		break;
>  	default:
> -		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
> -			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);
> +		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
> +		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
> +		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
> +		/* Carryover happens every 4 Giga time cycles counts which is roughly 42 secs */

I'd rather have put the clock frequency here, rather than some interpretation thereof.
This would make this maintainable in the future should the clock frequency change.
"4 Giga time cycles" isn't a standard expression.

Something like:
"The GFX clock frequency is ..., so the lower 32 bits carry over roughly every 42 seconds."

It'll also allow anyone to check the math.

Regards,
Luben
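
To make the arithmetic checkable, the carry period is just 2^32 ticks divided by the counter frequency. The thread does not state that frequency, so the 100 MHz value below is purely an assumption chosen because it reproduces the "roughly 42 secs" figure; `carry_period_seconds` and `ASSUMED_TSC_HZ` are hypothetical names.

```c
#include <stdint.h>

/* ASSUMPTION: 100 MHz counter clock. The thread does not give this number. */
#define ASSUMED_TSC_HZ 100000000.0

/*
 * Seconds until the lower 32-bit half wraps and carries into the upper
 * half, for a counter ticking at freq_hz.
 */
static double carry_period_seconds(double freq_hz)
{
	return 4294967296.0 / freq_hz;	/* 2^32 ticks per carry */
}
```

Under that assumption, `carry_period_seconds(ASSUMED_TSC_HZ)` evaluates to about 42.95 seconds. A different clock frequency would change the figure proportionally, which is the maintainability point raised above.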

> +		if (hi_check != clock_hi) {
> +			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
> +			clock_hi = hi_check;
> +		}
> +		clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
>  		break;
>  	}
>  	mutex_unlock(&adev->gfx.gpu_clock_mutex);

