All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] drm/panthor: Collection of tiler heap related fixes
@ 2024-04-30 11:28 Boris Brezillon
  2024-04-30 11:28 ` [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering Boris Brezillon
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-04-30 11:28 UTC (permalink / raw
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, kernel

This is a collection of tiler heap fixes for bugs/oddities found while
looking at incremental rendering.

Ideally, we want to land those before 6.10 is released, so we don't need
to increment the driver version to reflect the ABI changes.

Regards,

Boris

Antonino Maniscalco (1):
  drm/panthor: Fix tiler OOM handling to allow incremental rendering

Boris Brezillon (3):
  drm/panthor: Make sure the tiler initial/max chunks are consistent
  drm/panthor: Relax the constraints on the tiler chunk size
  drm/panthor: Fix an off-by-one in the heap context retrieval logic

 drivers/gpu/drm/panthor/panthor_heap.c  | 58 +++++++++++++++++--------
 drivers/gpu/drm/panthor/panthor_sched.c |  7 ++-
 include/uapi/drm/panthor_drm.h          | 14 ++++--
 3 files changed, 58 insertions(+), 21 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering
  2024-04-30 11:28 [PATCH v2 0/4] drm/panthor: Collection of tiler heap related fixes Boris Brezillon
@ 2024-04-30 11:28 ` Boris Brezillon
  2024-04-30 15:27   ` Liviu Dudau
  2024-05-02 14:03   ` Steven Price
  2024-04-30 11:28 ` [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent Boris Brezillon
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-04-30 11:28 UTC (permalink / raw
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, Antonino Maniscalco, kernel

From: Antonino Maniscalco <antonino.maniscalco@collabora.com>

If the kernel couldn't allocate memory because we reached the maximum
number of chunks but no render passes are in flight
(panthor_heap_grow() returning -ENOMEM), we should defer the OOM
handling to the FW by returning a NULL chunk. The FW will then call
the tiler OOM exception handler, which is supposed to implement
incremental rendering (execute an intermediate fragment job to flush
the pending primitives, release the tiler memory that was used to
store those primitives, and start over from where it stopped).

Instead of checking for both ENOMEM and EBUSY, make panthor_heap_grow()
return ENOMEM no matter the reason of this allocation failure, the FW
doesn't care anyway.

v2:
- Make panthor_heap_grow() return -ENOMEM for all kind of allocation
  failures
- Document the panthor_heap_grow() semantics

Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
Signed-off-by: Antonino Maniscalco <antonino.maniscalco@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_heap.c  | 12 ++++++++----
 drivers/gpu/drm/panthor/panthor_sched.c |  7 ++++++-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index 143fa35f2e74..c3c0ba744937 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -410,6 +410,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
  * @renderpasses_in_flight: Number of render passes currently in-flight.
  * @pending_frag_count: Number of fragment jobs waiting for execution/completion.
  * @new_chunk_gpu_va: Pointer used to return the chunk VA.
+ *
+ * Return:
+ * - 0 if a new heap was allocated
+ * - -ENOMEM if the tiler context reached the maximum number of chunks
+ *   or if too many render passes are in-flight
+ *   or if the allocation failed
+ * - -EINVAL if any of the arguments passed to panthor_heap_grow() is invalid
  */
 int panthor_heap_grow(struct panthor_heap_pool *pool,
 		      u64 heap_gpu_va,
@@ -439,10 +446,7 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
 	 * handler provided by the userspace driver, if any).
 	 */
 	if (renderpasses_in_flight > heap->target_in_flight ||
-	    (pending_frag_count > 0 && heap->chunk_count >= heap->max_chunks)) {
-		ret = -EBUSY;
-		goto out_unlock;
-	} else if (heap->chunk_count >= heap->max_chunks) {
+	    heap->chunk_count >= heap->max_chunks) {
 		ret = -ENOMEM;
 		goto out_unlock;
 	}
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index b3a51a6de523..fd928362d45e 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -1354,7 +1354,12 @@ static int group_process_tiler_oom(struct panthor_group *group, u32 cs_id)
 					pending_frag_count, &new_chunk_va);
 	}
 
-	if (ret && ret != -EBUSY) {
+	/* If the heap context doesn't have memory for us, we want to let the
+	 * FW try to reclaim memory by waiting for fragment jobs to land or by
+	 * executing the tiler OOM exception handler, which is supposed to
+	 * implement incremental rendering.
+	 */
+	if (ret && ret != -ENOMEM) {
 		drm_warn(&ptdev->base, "Failed to extend the tiler heap\n");
 		group->fatal_queues |= BIT(cs_id);
 		sched_queue_delayed_work(sched, tick, 0);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent
  2024-04-30 11:28 [PATCH v2 0/4] drm/panthor: Collection of tiler heap related fixes Boris Brezillon
  2024-04-30 11:28 ` [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering Boris Brezillon
@ 2024-04-30 11:28 ` Boris Brezillon
  2024-04-30 15:31   ` Liviu Dudau
  2024-05-02 14:03   ` Steven Price
  2024-04-30 11:28 ` [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size Boris Brezillon
  2024-04-30 11:28 ` [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic Boris Brezillon
  3 siblings, 2 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-04-30 11:28 UTC (permalink / raw
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, kernel

It doesn't make sense to have a maximum number of chunks smaller than
the initial number of chunks attached to the context.

Fix the uAPI header to reflect the new constraint, and mention the
undocumented "initial_chunk_count > 0" constraint while at it.

v2:
- Fix the check

Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_heap.c | 3 +++
 include/uapi/drm/panthor_drm.h         | 8 ++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index c3c0ba744937..3be86ec383d6 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -281,6 +281,9 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
 	if (initial_chunk_count == 0)
 		return -EINVAL;
 
+	if (initial_chunk_count > max_chunks)
+		return -EINVAL;
+
 	if (hweight32(chunk_size) != 1 ||
 	    chunk_size < SZ_256K || chunk_size > SZ_2M)
 		return -EINVAL;
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index dadb05ab1235..5db80a0682d5 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -895,13 +895,17 @@ struct drm_panthor_tiler_heap_create {
 	/** @vm_id: VM ID the tiler heap should be mapped to */
 	__u32 vm_id;
 
-	/** @initial_chunk_count: Initial number of chunks to allocate. */
+	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
 	__u32 initial_chunk_count;
 
 	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
 	__u32 chunk_size;
 
-	/** @max_chunks: Maximum number of chunks that can be allocated. */
+	/**
+	 * @max_chunks: Maximum number of chunks that can be allocated.
+	 *
+	 * Must be at least @initial_chunk_count.
+	 */
 	__u32 max_chunks;
 
 	/**
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size
  2024-04-30 11:28 [PATCH v2 0/4] drm/panthor: Collection of tiler heap related fixes Boris Brezillon
  2024-04-30 11:28 ` [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering Boris Brezillon
  2024-04-30 11:28 ` [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent Boris Brezillon
@ 2024-04-30 11:28 ` Boris Brezillon
  2024-04-30 13:08   ` Adrián Larumbe
  2024-04-30 16:10   ` Liviu Dudau
  2024-04-30 11:28 ` [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic Boris Brezillon
  3 siblings, 2 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-04-30 11:28 UTC (permalink / raw
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, kernel

The field used to store the chunk size if 12 bits wide, and the encoding
is chunk_size = chunk_header.chunk_size << 12, which gives us a
theoretical [4k:8M] range. This range is further limited by
implementation constraints, and all known implementations seem to
impose a [128k:8M] range, so do the same here.

We also relax the power-of-two constraint, which doesn't seem to
exist on v10. This will allow userspace to fine-tune initial/max
tiler memory on memory-constrained devices.

v2:
- Turn the power-of-two constraint into a page-aligned constraint to allow
  fine-tune of the initial/max heap memory size
- Fix the panthor_heap_create() kerneldoc

Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_heap.c | 8 ++++----
 include/uapi/drm/panthor_drm.h         | 6 +++++-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index 3be86ec383d6..683bb94761bc 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -253,8 +253,8 @@ int panthor_heap_destroy(struct panthor_heap_pool *pool, u32 handle)
  * @pool: Pool to instantiate the heap context from.
  * @initial_chunk_count: Number of chunk allocated at initialization time.
  * Must be at least 1.
- * @chunk_size: The size of each chunk. Must be a power of two between 256k
- * and 2M.
+ * @chunk_size: The size of each chunk. Must be page-aligned and lie in the
+ * [128k:2M] range.
  * @max_chunks: Maximum number of chunks that can be allocated.
  * @target_in_flight: Maximum number of in-flight render passes.
  * @heap_ctx_gpu_va: Pointer holding the GPU address of the allocated heap
@@ -284,8 +284,8 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
 	if (initial_chunk_count > max_chunks)
 		return -EINVAL;
 
-	if (hweight32(chunk_size) != 1 ||
-	    chunk_size < SZ_256K || chunk_size > SZ_2M)
+	if (!IS_ALIGNED(chunk_size, PAGE_SIZE) ||
+	    chunk_size < SZ_128K || chunk_size > SZ_8M)
 		return -EINVAL;
 
 	down_read(&pool->lock);
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 5db80a0682d5..b8220d2e698f 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -898,7 +898,11 @@ struct drm_panthor_tiler_heap_create {
 	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
 	__u32 initial_chunk_count;
 
-	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
+	/**
+	 * @chunk_size: Chunk size.
+	 *
+	 * Must be page-aligned and lie in the [128k:8M] range.
+	 */
 	__u32 chunk_size;
 
 	/**
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-04-30 11:28 [PATCH v2 0/4] drm/panthor: Collection of tiler heap related fixes Boris Brezillon
                   ` (2 preceding siblings ...)
  2024-04-30 11:28 ` [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size Boris Brezillon
@ 2024-04-30 11:28 ` Boris Brezillon
  2024-04-30 16:40   ` Liviu Dudau
  2024-05-02 14:03   ` Steven Price
  3 siblings, 2 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-04-30 11:28 UTC (permalink / raw
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, kernel, Eric Smith

ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
[1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.

v2:
- New patch

Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
Reported-by: Eric Smith <eric.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Eric Smith <eric.smith@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_heap.c | 35 +++++++++++++++++++-------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index 683bb94761bc..b1a7dbf25fb2 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -109,7 +109,11 @@ static int panthor_heap_ctx_stride(struct panthor_device *ptdev)
 
 static int panthor_get_heap_ctx_offset(struct panthor_heap_pool *pool, int id)
 {
-	return panthor_heap_ctx_stride(pool->ptdev) * id;
+	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
+	 * is [1:MAX_HEAPS_PER_POOL], which we need to turn into a
+	 * [0:MAX_HEAPS_PER_POOL-1] context index, hence the minus one here.
+	 */
+	return panthor_heap_ctx_stride(pool->ptdev) * (id - 1);
 }
 
 static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
@@ -118,6 +122,21 @@ static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
 	       panthor_get_heap_ctx_offset(pool, id);
 }
 
+static int panthor_get_heap_ctx_id(struct panthor_heap_pool *pool,
+				   u64 heap_ctx_gpu_va)
+{
+	u64 offset = heap_ctx_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
+	u32 heap_idx = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
+
+	if (offset > U32_MAX || heap_idx >= MAX_HEAPS_PER_POOL)
+		return -EINVAL;
+
+	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
+	 * is [1:MAX_HEAPS_PER_POOL], hence the plus one here.
+	 */
+	return heap_idx + 1;
+}
+
 static void panthor_free_heap_chunk(struct panthor_vm *vm,
 				    struct panthor_heap *heap,
 				    struct panthor_heap_chunk *chunk)
@@ -364,14 +383,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
 			      u64 heap_gpu_va,
 			      u64 chunk_gpu_va)
 {
-	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
-	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
+	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);
 	struct panthor_heap_chunk *chunk, *tmp, *removed = NULL;
 	struct panthor_heap *heap;
 	int ret;
 
-	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
-		return -EINVAL;
+	if (heap_id < 0)
+		return heap_id;
 
 	down_read(&pool->lock);
 	heap = xa_load(&pool->xa, heap_id);
@@ -427,14 +445,13 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
 		      u32 pending_frag_count,
 		      u64 *new_chunk_gpu_va)
 {
-	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
-	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
+	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);
 	struct panthor_heap_chunk *chunk;
 	struct panthor_heap *heap;
 	int ret;
 
-	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
-		return -EINVAL;
+	if (heap_id < 0)
+		return heap_id;
 
 	down_read(&pool->lock);
 	heap = xa_load(&pool->xa, heap_id);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size
  2024-04-30 11:28 ` [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size Boris Brezillon
@ 2024-04-30 13:08   ` Adrián Larumbe
  2024-05-02 14:03     ` Steven Price
  2024-04-30 16:10   ` Liviu Dudau
  1 sibling, 1 reply; 19+ messages in thread
From: Adrián Larumbe @ 2024-04-30 13:08 UTC (permalink / raw
  To: Boris Brezillon; +Cc: Steven Price, Liviu Dudau, dri-devel, kernel

Hi Boris,

On 30.04.2024 13:28, Boris Brezillon wrote:
> The field used to store the chunk size if 12 bits wide, and the encoding
> is chunk_size = chunk_header.chunk_size << 12, which gives us a
> theoretical [4k:8M] range. This range is further limited by
> implementation constraints, and all known implementations seem to
> impose a [128k:8M] range, so do the same here.
> 
> We also relax the power-of-two constraint, which doesn't seem to
> exist on v10. This will allow userspace to fine-tune initial/max
> tiler memory on memory-constrained devices.
> 
> v2:
> - Turn the power-of-two constraint into a page-aligned constraint to allow
>   fine-tune of the initial/max heap memory size
> - Fix the panthor_heap_create() kerneldoc
> 
> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_heap.c | 8 ++++----
>  include/uapi/drm/panthor_drm.h         | 6 +++++-
>  2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 3be86ec383d6..683bb94761bc 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -253,8 +253,8 @@ int panthor_heap_destroy(struct panthor_heap_pool *pool, u32 handle)
>   * @pool: Pool to instantiate the heap context from.
>   * @initial_chunk_count: Number of chunk allocated at initialization time.
>   * Must be at least 1.
> - * @chunk_size: The size of each chunk. Must be a power of two between 256k
> - * and 2M.
> + * @chunk_size: The size of each chunk. Must be page-aligned and lie in the
> + * [128k:2M] range.

Probably a typo, but I guess this should be [128k:8M] ?

>   * @max_chunks: Maximum number of chunks that can be allocated.
>   * @target_in_flight: Maximum number of in-flight render passes.
>   * @heap_ctx_gpu_va: Pointer holding the GPU address of the allocated heap
> @@ -284,8 +284,8 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
>  	if (initial_chunk_count > max_chunks)
>  		return -EINVAL;
>  
> -	if (hweight32(chunk_size) != 1 ||
> -	    chunk_size < SZ_256K || chunk_size > SZ_2M)
> +	if (!IS_ALIGNED(chunk_size, PAGE_SIZE) ||
> +	    chunk_size < SZ_128K || chunk_size > SZ_8M)
>  		return -EINVAL;
>  
>  	down_read(&pool->lock);
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 5db80a0682d5..b8220d2e698f 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -898,7 +898,11 @@ struct drm_panthor_tiler_heap_create {
>  	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
>  	__u32 initial_chunk_count;
>  
> -	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
> +	/**
> +	 * @chunk_size: Chunk size.
> +	 *
> +	 * Must be page-aligned and lie in the [128k:8M] range.
> +	 */
>  	__u32 chunk_size;
>  
>  	/**
> -- 
> 2.44.0


Adrian Larumbe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering
  2024-04-30 11:28 ` [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering Boris Brezillon
@ 2024-04-30 15:27   ` Liviu Dudau
  2024-05-02 14:03   ` Steven Price
  1 sibling, 0 replies; 19+ messages in thread
From: Liviu Dudau @ 2024-04-30 15:27 UTC (permalink / raw
  To: Boris Brezillon
  Cc: Steven Price, Adrián Larumbe, dri-devel, Antonino Maniscalco,
	kernel

On Tue, Apr 30, 2024 at 01:28:49PM +0200, Boris Brezillon wrote:
> From: Antonino Maniscalco <antonino.maniscalco@collabora.com>
> 
> If the kernel couldn't allocate memory because we reached the maximum
> number of chunks but no render passes are in flight
> (panthor_heap_grow() returning -ENOMEM), we should defer the OOM
> handling to the FW by returning a NULL chunk. The FW will then call
> the tiler OOM exception handler, which is supposed to implement
> incremental rendering (execute an intermediate fragment job to flush
> the pending primitives, release the tiler memory that was used to
> store those primitives, and start over from where it stopped).
> 
> Instead of checking for both ENOMEM and EBUSY, make panthor_heap_grow()
> return ENOMEM no matter the reason of this allocation failure, the FW
> doesn't care anyway.
> 
> v2:
> - Make panthor_heap_grow() return -ENOMEM for all kind of allocation
>   failures
> - Document the panthor_heap_grow() semantics
> 
> Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
> Signed-off-by: Antonino Maniscalco <antonino.maniscalco@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

> ---
>  drivers/gpu/drm/panthor/panthor_heap.c  | 12 ++++++++----
>  drivers/gpu/drm/panthor/panthor_sched.c |  7 ++++++-
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 143fa35f2e74..c3c0ba744937 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -410,6 +410,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
>   * @renderpasses_in_flight: Number of render passes currently in-flight.
>   * @pending_frag_count: Number of fragment jobs waiting for execution/completion.
>   * @new_chunk_gpu_va: Pointer used to return the chunk VA.
> + *
> + * Return:
> + * - 0 if a new heap was allocated
> + * - -ENOMEM if the tiler context reached the maximum number of chunks
> + *   or if too many render passes are in-flight
> + *   or if the allocation failed
> + * - -EINVAL if any of the arguments passed to panthor_heap_grow() is invalid
>   */
>  int panthor_heap_grow(struct panthor_heap_pool *pool,
>  		      u64 heap_gpu_va,
> @@ -439,10 +446,7 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
>  	 * handler provided by the userspace driver, if any).
>  	 */
>  	if (renderpasses_in_flight > heap->target_in_flight ||
> -	    (pending_frag_count > 0 && heap->chunk_count >= heap->max_chunks)) {
> -		ret = -EBUSY;
> -		goto out_unlock;
> -	} else if (heap->chunk_count >= heap->max_chunks) {
> +	    heap->chunk_count >= heap->max_chunks) {
>  		ret = -ENOMEM;
>  		goto out_unlock;
>  	}
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index b3a51a6de523..fd928362d45e 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -1354,7 +1354,12 @@ static int group_process_tiler_oom(struct panthor_group *group, u32 cs_id)
>  					pending_frag_count, &new_chunk_va);
>  	}
>  
> -	if (ret && ret != -EBUSY) {
> +	/* If the heap context doesn't have memory for us, we want to let the
> +	 * FW try to reclaim memory by waiting for fragment jobs to land or by
> +	 * executing the tiler OOM exception handler, which is supposed to
> +	 * implement incremental rendering.
> +	 */
> +	if (ret && ret != -ENOMEM) {
>  		drm_warn(&ptdev->base, "Failed to extend the tiler heap\n");
>  		group->fatal_queues |= BIT(cs_id);
>  		sched_queue_delayed_work(sched, tick, 0);
> -- 
> 2.44.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent
  2024-04-30 11:28 ` [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent Boris Brezillon
@ 2024-04-30 15:31   ` Liviu Dudau
  2024-05-02 14:03   ` Steven Price
  1 sibling, 0 replies; 19+ messages in thread
From: Liviu Dudau @ 2024-04-30 15:31 UTC (permalink / raw
  To: Boris Brezillon; +Cc: Steven Price, Adrián Larumbe, dri-devel, kernel

On Tue, Apr 30, 2024 at 01:28:50PM +0200, Boris Brezillon wrote:
> It doesn't make sense to have a maximum number of chunks smaller than
> the initial number of chunks attached to the context.
> 
> Fix the uAPI header to reflect the new constraint, and mention the
> undocumented "initial_chunk_count > 0" constraint while at it.
> 
> v2:
> - Fix the check
> 
> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_heap.c | 3 +++
>  include/uapi/drm/panthor_drm.h         | 8 ++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index c3c0ba744937..3be86ec383d6 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -281,6 +281,9 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
>  	if (initial_chunk_count == 0)
>  		return -EINVAL;
>  
> +	if (initial_chunk_count > max_chunks)
> +		return -EINVAL;
> +

Is is just me that feels like a lost opportunity to merge the check with the one above?

	if (!initial_chunk_count || initial_chunk_count > max_chunks)
		return -EINVAL;


Otherwise, Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

>  	if (hweight32(chunk_size) != 1 ||
>  	    chunk_size < SZ_256K || chunk_size > SZ_2M)
>  		return -EINVAL;
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index dadb05ab1235..5db80a0682d5 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -895,13 +895,17 @@ struct drm_panthor_tiler_heap_create {
>  	/** @vm_id: VM ID the tiler heap should be mapped to */
>  	__u32 vm_id;
>  
> -	/** @initial_chunk_count: Initial number of chunks to allocate. */
> +	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
>  	__u32 initial_chunk_count;
>  
>  	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
>  	__u32 chunk_size;
>  
> -	/** @max_chunks: Maximum number of chunks that can be allocated. */
> +	/**
> +	 * @max_chunks: Maximum number of chunks that can be allocated.
> +	 *
> +	 * Must be at least @initial_chunk_count.
> +	 */
>  	__u32 max_chunks;
>  
>  	/**
> -- 
> 2.44.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size
  2024-04-30 11:28 ` [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size Boris Brezillon
  2024-04-30 13:08   ` Adrián Larumbe
@ 2024-04-30 16:10   ` Liviu Dudau
  1 sibling, 0 replies; 19+ messages in thread
From: Liviu Dudau @ 2024-04-30 16:10 UTC (permalink / raw
  To: Boris Brezillon; +Cc: Steven Price, Adrián Larumbe, dri-devel, kernel

On Tue, Apr 30, 2024 at 01:28:51PM +0200, Boris Brezillon wrote:
> The field used to store the chunk size if 12 bits wide, and the encoding
> is chunk_size = chunk_header.chunk_size << 12, which gives us a
> theoretical [4k:8M] range. This range is further limited by
> implementation constraints, and all known implementations seem to
> impose a [128k:8M] range, so do the same here.
> 
> We also relax the power-of-two constraint, which doesn't seem to
> exist on v10. This will allow userspace to fine-tune initial/max
> tiler memory on memory-constrained devices.
> 
> v2:
> - Turn the power-of-two constraint into a page-aligned constraint to allow
>   fine-tune of the initial/max heap memory size
> - Fix the panthor_heap_create() kerneldoc
> 
> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

With the typo that Adrián mentioned fixed,

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

> ---
>  drivers/gpu/drm/panthor/panthor_heap.c | 8 ++++----
>  include/uapi/drm/panthor_drm.h         | 6 +++++-
>  2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 3be86ec383d6..683bb94761bc 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -253,8 +253,8 @@ int panthor_heap_destroy(struct panthor_heap_pool *pool, u32 handle)
>   * @pool: Pool to instantiate the heap context from.
>   * @initial_chunk_count: Number of chunk allocated at initialization time.
>   * Must be at least 1.
> - * @chunk_size: The size of each chunk. Must be a power of two between 256k
> - * and 2M.
> + * @chunk_size: The size of each chunk. Must be page-aligned and lie in the
> + * [128k:2M] range.
>   * @max_chunks: Maximum number of chunks that can be allocated.
>   * @target_in_flight: Maximum number of in-flight render passes.
>   * @heap_ctx_gpu_va: Pointer holding the GPU address of the allocated heap
> @@ -284,8 +284,8 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
>  	if (initial_chunk_count > max_chunks)
>  		return -EINVAL;
>  
> -	if (hweight32(chunk_size) != 1 ||
> -	    chunk_size < SZ_256K || chunk_size > SZ_2M)
> +	if (!IS_ALIGNED(chunk_size, PAGE_SIZE) ||
> +	    chunk_size < SZ_128K || chunk_size > SZ_8M)
>  		return -EINVAL;
>  
>  	down_read(&pool->lock);
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 5db80a0682d5..b8220d2e698f 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -898,7 +898,11 @@ struct drm_panthor_tiler_heap_create {
>  	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
>  	__u32 initial_chunk_count;
>  
> -	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
> +	/**
> +	 * @chunk_size: Chunk size.
> +	 *
> +	 * Must be page-aligned and lie in the [128k:8M] range.
> +	 */
>  	__u32 chunk_size;
>  
>  	/**
> -- 
> 2.44.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-04-30 11:28 ` [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic Boris Brezillon
@ 2024-04-30 16:40   ` Liviu Dudau
  2024-04-30 17:07     ` Boris Brezillon
  2024-05-02 14:03   ` Steven Price
  1 sibling, 1 reply; 19+ messages in thread
From: Liviu Dudau @ 2024-04-30 16:40 UTC (permalink / raw
  To: Boris Brezillon
  Cc: Steven Price, Adrián Larumbe, dri-devel, kernel, Eric Smith

On Tue, Apr 30, 2024 at 01:28:52PM +0200, Boris Brezillon wrote:
> ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
> [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
> in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.
> 
> v2:
> - New patch
> 
> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> Reported-by: Eric Smith <eric.smith@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Tested-by: Eric Smith <eric.smith@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_heap.c | 35 +++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 683bb94761bc..b1a7dbf25fb2 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -109,7 +109,11 @@ static int panthor_heap_ctx_stride(struct panthor_device *ptdev)
>  
>  static int panthor_get_heap_ctx_offset(struct panthor_heap_pool *pool, int id)

Can we make id and the return type here u32? I keep thinking about returning large negative
values here and how they can end up being exploited.

>  {
> -	return panthor_heap_ctx_stride(pool->ptdev) * id;
> +	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
> +	 * is [1:MAX_HEAPS_PER_POOL], which we need to turn into a
> +	 * [0:MAX_HEAPS_PER_POOL-1] context index, hence the minus one here.
> +	 */
> +	return panthor_heap_ctx_stride(pool->ptdev) * (id - 1);
>  }
>  
>  static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
> @@ -118,6 +122,21 @@ static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
>  	       panthor_get_heap_ctx_offset(pool, id);
>  }
>  
> +static int panthor_get_heap_ctx_id(struct panthor_heap_pool *pool,
> +				   u64 heap_ctx_gpu_va)
> +{
> +	u64 offset = heap_ctx_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> +	u32 heap_idx = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> +
> +	if (offset > U32_MAX || heap_idx >= MAX_HEAPS_PER_POOL)
> +		return -EINVAL;
> +
> +	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
> +	 * is [1:MAX_HEAPS_PER_POOL], hence the plus one here.
> +	 */
> +	return heap_idx + 1;
> +}
> +
>  static void panthor_free_heap_chunk(struct panthor_vm *vm,
>  				    struct panthor_heap *heap,
>  				    struct panthor_heap_chunk *chunk)
> @@ -364,14 +383,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
>  			      u64 heap_gpu_va,
>  			      u64 chunk_gpu_va)
>  {
> -	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> -	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> +	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);

I would keep heap_id here u32. Why do we need to change it? Also, I don't see how
panthor_get_heap_ctx_id() can ever return negative values unless we expect MAX_HEAPS_PER_POOL
to be S32_MAX at some moment.

>  	struct panthor_heap_chunk *chunk, *tmp, *removed = NULL;
>  	struct panthor_heap *heap;
>  	int ret;
>  
> -	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
> -		return -EINVAL;
> +	if (heap_id < 0)
> +		return heap_id;

This can then be removed if heap_id is u32.

>  
>  	down_read(&pool->lock);
>  	heap = xa_load(&pool->xa, heap_id);
> @@ -427,14 +445,13 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
>  		      u32 pending_frag_count,
>  		      u64 *new_chunk_gpu_va)
>  {
> -	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> -	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> +	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);

Again, keep u32 unless you have good reasons ...

>  	struct panthor_heap_chunk *chunk;
>  	struct panthor_heap *heap;
>  	int ret;
>  
> -	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
> -		return -EINVAL;
> +	if (heap_id < 0)
> +		return heap_id;

... and we will not need this.

Best regards,
Liviu


>  
>  	down_read(&pool->lock);
>  	heap = xa_load(&pool->xa, heap_id);
> -- 
> 2.44.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-04-30 16:40   ` Liviu Dudau
@ 2024-04-30 17:07     ` Boris Brezillon
  0 siblings, 0 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-04-30 17:07 UTC (permalink / raw
  To: Liviu Dudau
  Cc: Steven Price, Adrián Larumbe, dri-devel, kernel, Eric Smith

On Tue, 30 Apr 2024 17:40:54 +0100
Liviu Dudau <liviu.dudau@arm.com> wrote:

> On Tue, Apr 30, 2024 at 01:28:52PM +0200, Boris Brezillon wrote:
> > ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
> > [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
> > in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.
> > 
> > v2:
> > - New patch
> > 
> > Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> > Reported-by: Eric Smith <eric.smith@collabora.com>
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Tested-by: Eric Smith <eric.smith@collabora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_heap.c | 35 +++++++++++++++++++-------
> >  1 file changed, 26 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> > index 683bb94761bc..b1a7dbf25fb2 100644
> > --- a/drivers/gpu/drm/panthor/panthor_heap.c
> > +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> > @@ -109,7 +109,11 @@ static int panthor_heap_ctx_stride(struct panthor_device *ptdev)
> >  
> >  static int panthor_get_heap_ctx_offset(struct panthor_heap_pool *pool, int id)  
> 
> Can we make id and the return type here u32? I keep thinking about returning large negative
> values here and how they can end up being exploited.

Actually, I had the prototype changed to take/return an u32 locally, but
decided to drop it to both keep the amount of changes minimal and keep
prototype consistent with the new helper. I'm fine switching to u32s
though.

> 
> >  {
> > -	return panthor_heap_ctx_stride(pool->ptdev) * id;
> > +	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
> > +	 * is [1:MAX_HEAPS_PER_POOL], which we need to turn into a
> > +	 * [0:MAX_HEAPS_PER_POOL-1] context index, hence the minus one here.
> > +	 */
> > +	return panthor_heap_ctx_stride(pool->ptdev) * (id - 1);
> >  }
> >  
> >  static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
> > @@ -118,6 +122,21 @@ static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
> >  	       panthor_get_heap_ctx_offset(pool, id);
> >  }
> >  
> > +static int panthor_get_heap_ctx_id(struct panthor_heap_pool *pool,
> > +				   u64 heap_ctx_gpu_va)
> > +{
> > +	u64 offset = heap_ctx_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> > +	u32 heap_idx = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> > +
> > +	if (offset > U32_MAX || heap_idx >= MAX_HEAPS_PER_POOL)
> > +		return -EINVAL;
> > +
> > +	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
> > +	 * is [1:MAX_HEAPS_PER_POOL], hence the plus one here.
> > +	 */
> > +	return heap_idx + 1;
> > +}
> > +
> >  static void panthor_free_heap_chunk(struct panthor_vm *vm,
> >  				    struct panthor_heap *heap,
> >  				    struct panthor_heap_chunk *chunk)
> > @@ -364,14 +383,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
> >  			      u64 heap_gpu_va,
> >  			      u64 chunk_gpu_va)
> >  {
> > -	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> > -	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> > +	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);  
> 
> I would keep heap_id here u32. Why do we need to change it? Also, I don't see how
> panthor_get_heap_ctx_id() can ever return negative values unless we expect MAX_HEAPS_PER_POOL
> to be S32_MAX at some moment.

The reason I made it a signed value is so we could return an error code
too

-  > 0 => valid
-  < 0 error, with the value encoding the error

though we're likely to always return EINVAL anyway.

> 
> >  	struct panthor_heap_chunk *chunk, *tmp, *removed = NULL;
> >  	struct panthor_heap *heap;
> >  	int ret;
> >  
> > -	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
> > -		return -EINVAL;
> > +	if (heap_id < 0)
> > +		return heap_id;  
> 
> This can then be removed if heap_id is u32.

We need to replace that by an extra check on the VA, and given this is
done in two different places, I thought having an helper that does both
the VA to offset and the offset consistency check was simpler. I mean,
I could make this helper return an u32, and consider 0 as the
'no-context-found' special-value, but I can't drop this check.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering
  2024-04-30 11:28 ` [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering Boris Brezillon
  2024-04-30 15:27   ` Liviu Dudau
@ 2024-05-02 14:03   ` Steven Price
  1 sibling, 0 replies; 19+ messages in thread
From: Steven Price @ 2024-05-02 14:03 UTC (permalink / raw
  To: Boris Brezillon, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, Antonino Maniscalco, kernel

On 30/04/2024 12:28, Boris Brezillon wrote:
> From: Antonino Maniscalco <antonino.maniscalco@collabora.com>
> 
> If the kernel couldn't allocate memory because we reached the maximum
> number of chunks but no render passes are in flight
> (panthor_heap_grow() returning -ENOMEM), we should defer the OOM
> handling to the FW by returning a NULL chunk. The FW will then call
> the tiler OOM exception handler, which is supposed to implement
> incremental rendering (execute an intermediate fragment job to flush
> the pending primitives, release the tiler memory that was used to
> store those primitives, and start over from where it stopped).
> 
> Instead of checking for both ENOMEM and EBUSY, make panthor_heap_grow()
> return ENOMEM no matter the reason of this allocation failure, the FW
> doesn't care anyway.
> 
> v2:
> - Make panthor_heap_grow() return -ENOMEM for all kind of allocation
>   failures
> - Document the panthor_heap_grow() semantics
> 
> Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
> Signed-off-by: Antonino Maniscalco <antonino.maniscalco@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed-by: Steven Price <steven.price@arm.com>

Thanks,
Steve

> ---
>  drivers/gpu/drm/panthor/panthor_heap.c  | 12 ++++++++----
>  drivers/gpu/drm/panthor/panthor_sched.c |  7 ++++++-
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 143fa35f2e74..c3c0ba744937 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -410,6 +410,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
>   * @renderpasses_in_flight: Number of render passes currently in-flight.
>   * @pending_frag_count: Number of fragment jobs waiting for execution/completion.
>   * @new_chunk_gpu_va: Pointer used to return the chunk VA.
> + *
> + * Return:
> + * - 0 if a new heap was allocated
> + * - -ENOMEM if the tiler context reached the maximum number of chunks
> + *   or if too many render passes are in-flight
> + *   or if the allocation failed
> + * - -EINVAL if any of the arguments passed to panthor_heap_grow() is invalid
>   */
>  int panthor_heap_grow(struct panthor_heap_pool *pool,
>  		      u64 heap_gpu_va,
> @@ -439,10 +446,7 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
>  	 * handler provided by the userspace driver, if any).
>  	 */
>  	if (renderpasses_in_flight > heap->target_in_flight ||
> -	    (pending_frag_count > 0 && heap->chunk_count >= heap->max_chunks)) {
> -		ret = -EBUSY;
> -		goto out_unlock;
> -	} else if (heap->chunk_count >= heap->max_chunks) {
> +	    heap->chunk_count >= heap->max_chunks) {
>  		ret = -ENOMEM;
>  		goto out_unlock;
>  	}
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index b3a51a6de523..fd928362d45e 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -1354,7 +1354,12 @@ static int group_process_tiler_oom(struct panthor_group *group, u32 cs_id)
>  					pending_frag_count, &new_chunk_va);
>  	}
>  
> -	if (ret && ret != -EBUSY) {
> +	/* If the heap context doesn't have memory for us, we want to let the
> +	 * FW try to reclaim memory by waiting for fragment jobs to land or by
> +	 * executing the tiler OOM exception handler, which is supposed to
> +	 * implement incremental rendering.
> +	 */
> +	if (ret && ret != -ENOMEM) {
>  		drm_warn(&ptdev->base, "Failed to extend the tiler heap\n");
>  		group->fatal_queues |= BIT(cs_id);
>  		sched_queue_delayed_work(sched, tick, 0);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent
  2024-04-30 11:28 ` [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent Boris Brezillon
  2024-04-30 15:31   ` Liviu Dudau
@ 2024-05-02 14:03   ` Steven Price
  1 sibling, 0 replies; 19+ messages in thread
From: Steven Price @ 2024-05-02 14:03 UTC (permalink / raw
  To: Boris Brezillon, Liviu Dudau, Adrián Larumbe; +Cc: dri-devel, kernel

On 30/04/2024 12:28, Boris Brezillon wrote:
> It doesn't make sense to have a maximum number of chunks smaller than
> the initial number of chunks attached to the context.
> 
> Fix the uAPI header to reflect the new constraint, and mention the
> undocumented "initial_chunk_count > 0" constraint while at it.
> 
> v2:
> - Fix the check
> 
> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed-by: Steven Price <steven.price@arm.com>

Thanks,
Steve

> ---
>  drivers/gpu/drm/panthor/panthor_heap.c | 3 +++
>  include/uapi/drm/panthor_drm.h         | 8 ++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index c3c0ba744937..3be86ec383d6 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -281,6 +281,9 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
>  	if (initial_chunk_count == 0)
>  		return -EINVAL;
>  
> +	if (initial_chunk_count > max_chunks)
> +		return -EINVAL;
> +
>  	if (hweight32(chunk_size) != 1 ||
>  	    chunk_size < SZ_256K || chunk_size > SZ_2M)
>  		return -EINVAL;
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index dadb05ab1235..5db80a0682d5 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -895,13 +895,17 @@ struct drm_panthor_tiler_heap_create {
>  	/** @vm_id: VM ID the tiler heap should be mapped to */
>  	__u32 vm_id;
>  
> -	/** @initial_chunk_count: Initial number of chunks to allocate. */
> +	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
>  	__u32 initial_chunk_count;
>  
>  	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
>  	__u32 chunk_size;
>  
> -	/** @max_chunks: Maximum number of chunks that can be allocated. */
> +	/**
> +	 * @max_chunks: Maximum number of chunks that can be allocated.
> +	 *
> +	 * Must be at least @initial_chunk_count.
> +	 */
>  	__u32 max_chunks;
>  
>  	/**


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size
  2024-04-30 13:08   ` Adrián Larumbe
@ 2024-05-02 14:03     ` Steven Price
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Price @ 2024-05-02 14:03 UTC (permalink / raw
  To: Adrián Larumbe, Boris Brezillon; +Cc: Liviu Dudau, dri-devel, kernel

On 30/04/2024 14:08, Adrián Larumbe wrote:
> Hi Boris,
> 
> On 30.04.2024 13:28, Boris Brezillon wrote:
>> The field used to store the chunk size if 12 bits wide, and the encoding
>> is chunk_size = chunk_header.chunk_size << 12, which gives us a
>> theoretical [4k:8M] range. This range is further limited by
>> implementation constraints, and all known implementations seem to
>> impose a [128k:8M] range, so do the same here.
>>
>> We also relax the power-of-two constraint, which doesn't seem to
>> exist on v10. This will allow userspace to fine-tune initial/max
>> tiler memory on memory-constrained devices.
>>
>> v2:
>> - Turn the power-of-two constraint into a page-aligned constraint to allow
>>   fine-tune of the initial/max heap memory size
>> - Fix the panthor_heap_create() kerneldoc
>>
>> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Other than the typo Adrián pointed out below...

Reviewed-by: Steven Price <steven.price@arm.com>

>> ---
>>  drivers/gpu/drm/panthor/panthor_heap.c | 8 ++++----
>>  include/uapi/drm/panthor_drm.h         | 6 +++++-
>>  2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
>> index 3be86ec383d6..683bb94761bc 100644
>> --- a/drivers/gpu/drm/panthor/panthor_heap.c
>> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
>> @@ -253,8 +253,8 @@ int panthor_heap_destroy(struct panthor_heap_pool *pool, u32 handle)
>>   * @pool: Pool to instantiate the heap context from.
>>   * @initial_chunk_count: Number of chunk allocated at initialization time.
>>   * Must be at least 1.
>> - * @chunk_size: The size of each chunk. Must be a power of two between 256k
>> - * and 2M.
>> + * @chunk_size: The size of each chunk. Must be page-aligned and lie in the
>> + * [128k:2M] range.
> 
> Probably a typo, but I guess this should be [128k:8M] ?
> 
>>   * @max_chunks: Maximum number of chunks that can be allocated.
>>   * @target_in_flight: Maximum number of in-flight render passes.
>>   * @heap_ctx_gpu_va: Pointer holding the GPU address of the allocated heap
>> @@ -284,8 +284,8 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
>>  	if (initial_chunk_count > max_chunks)
>>  		return -EINVAL;
>>  
>> -	if (hweight32(chunk_size) != 1 ||
>> -	    chunk_size < SZ_256K || chunk_size > SZ_2M)
>> +	if (!IS_ALIGNED(chunk_size, PAGE_SIZE) ||
>> +	    chunk_size < SZ_128K || chunk_size > SZ_8M)
>>  		return -EINVAL;
>>  
>>  	down_read(&pool->lock);
>> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
>> index 5db80a0682d5..b8220d2e698f 100644
>> --- a/include/uapi/drm/panthor_drm.h
>> +++ b/include/uapi/drm/panthor_drm.h
>> @@ -898,7 +898,11 @@ struct drm_panthor_tiler_heap_create {
>>  	/** @initial_chunk_count: Initial number of chunks to allocate. Must be at least one. */
>>  	__u32 initial_chunk_count;
>>  
>> -	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
>> +	/**
>> +	 * @chunk_size: Chunk size.
>> +	 *
>> +	 * Must be page-aligned and lie in the [128k:8M] range.
>> +	 */
>>  	__u32 chunk_size;
>>  
>>  	/**
>> -- 
>> 2.44.0
> 
> 
> Adrian Larumbe


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-04-30 11:28 ` [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic Boris Brezillon
  2024-04-30 16:40   ` Liviu Dudau
@ 2024-05-02 14:03   ` Steven Price
  2024-05-02 14:15     ` Boris Brezillon
  1 sibling, 1 reply; 19+ messages in thread
From: Steven Price @ 2024-05-02 14:03 UTC (permalink / raw
  To: Boris Brezillon, Liviu Dudau, Adrián Larumbe
  Cc: dri-devel, kernel, Eric Smith

On 30/04/2024 12:28, Boris Brezillon wrote:
> ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
> [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
> in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.

This might be a silly question, but do we need ID 0 to be
"no-tiler-heap"? Would it be easier to e.g. use a negative number for
that situation and avoid all the off-by-one problems?

I'm struggling to find the code which needs the 0 value to be special -
where is it exactly that we encode this "no-tiler-heap" value?

Steve

> 
> v2:
> - New patch
> 
> Fixes: 9cca48fa4f89 ("drm/panthor: Add the heap logical block")
> Reported-by: Eric Smith <eric.smith@collabora.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Tested-by: Eric Smith <eric.smith@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_heap.c | 35 +++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 683bb94761bc..b1a7dbf25fb2 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -109,7 +109,11 @@ static int panthor_heap_ctx_stride(struct panthor_device *ptdev)
>  
>  static int panthor_get_heap_ctx_offset(struct panthor_heap_pool *pool, int id)
>  {
> -	return panthor_heap_ctx_stride(pool->ptdev) * id;
> +	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
> +	 * is [1:MAX_HEAPS_PER_POOL], which we need to turn into a
> +	 * [0:MAX_HEAPS_PER_POOL-1] context index, hence the minus one here.
> +	 */
> +	return panthor_heap_ctx_stride(pool->ptdev) * (id - 1);
>  }
>  
>  static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
> @@ -118,6 +122,21 @@ static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
>  	       panthor_get_heap_ctx_offset(pool, id);
>  }
>  
> +static int panthor_get_heap_ctx_id(struct panthor_heap_pool *pool,
> +				   u64 heap_ctx_gpu_va)
> +{
> +	u64 offset = heap_ctx_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> +	u32 heap_idx = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> +
> +	if (offset > U32_MAX || heap_idx >= MAX_HEAPS_PER_POOL)
> +		return -EINVAL;
> +
> +	/* ID 0 is reserved to encode 'no-tiler-heap', the valid range
> +	 * is [1:MAX_HEAPS_PER_POOL], hence the plus one here.
> +	 */
> +	return heap_idx + 1;
> +}
> +
>  static void panthor_free_heap_chunk(struct panthor_vm *vm,
>  				    struct panthor_heap *heap,
>  				    struct panthor_heap_chunk *chunk)
> @@ -364,14 +383,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
>  			      u64 heap_gpu_va,
>  			      u64 chunk_gpu_va)
>  {
> -	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> -	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> +	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);
>  	struct panthor_heap_chunk *chunk, *tmp, *removed = NULL;
>  	struct panthor_heap *heap;
>  	int ret;
>  
> -	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
> -		return -EINVAL;
> +	if (heap_id < 0)
> +		return heap_id;
>  
>  	down_read(&pool->lock);
>  	heap = xa_load(&pool->xa, heap_id);
> @@ -427,14 +445,13 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
>  		      u32 pending_frag_count,
>  		      u64 *new_chunk_gpu_va)
>  {
> -	u64 offset = heap_gpu_va - panthor_kernel_bo_gpuva(pool->gpu_contexts);
> -	u32 heap_id = (u32)offset / panthor_heap_ctx_stride(pool->ptdev);
> +	int heap_id = panthor_get_heap_ctx_id(pool, heap_gpu_va);
>  	struct panthor_heap_chunk *chunk;
>  	struct panthor_heap *heap;
>  	int ret;
>  
> -	if (offset > U32_MAX || heap_id >= MAX_HEAPS_PER_POOL)
> -		return -EINVAL;
> +	if (heap_id < 0)
> +		return heap_id;
>  
>  	down_read(&pool->lock);
>  	heap = xa_load(&pool->xa, heap_id);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-05-02 14:03   ` Steven Price
@ 2024-05-02 14:15     ` Boris Brezillon
  2024-05-02 14:26       ` Steven Price
  0 siblings, 1 reply; 19+ messages in thread
From: Boris Brezillon @ 2024-05-02 14:15 UTC (permalink / raw
  To: Steven Price
  Cc: Liviu Dudau, Adrián Larumbe, dri-devel, kernel, Eric Smith

On Thu, 2 May 2024 15:03:51 +0100
Steven Price <steven.price@arm.com> wrote:

> On 30/04/2024 12:28, Boris Brezillon wrote:
> > ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
> > [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
> > in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.  
> 
> This might be a silly question, but do we need ID 0 to be
> "no-tiler-heap"? Would it be easier to e.g. use a negative number for
> that situation and avoid all the off-by-one problems?
> 
> I'm struggling to find the code which needs the 0 value to be special -
> where is it exactly that we encode this "no-tiler-heap" value?

Hm, I thought we were passing the heap handle to the group creation
ioctl, but heap queue/heap association is actually done through a CS
instruction, so I guess you have a point. The only thing that makes a
bit hesitant is that handle=0 is reserved for all other kind of handles
we return, and I think I'd prefer to keep it the same for heap handles.

This being said, we could do the `+- 1` in
panthor_ioctl_tiler_heap_{create,destroy}() to keep things simple in
panthor_heap.c.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-05-02 14:15     ` Boris Brezillon
@ 2024-05-02 14:26       ` Steven Price
  2024-05-02 14:36         ` Boris Brezillon
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Price @ 2024-05-02 14:26 UTC (permalink / raw
  To: Boris Brezillon
  Cc: Liviu Dudau, Adrián Larumbe, dri-devel, kernel, Eric Smith

On 02/05/2024 15:15, Boris Brezillon wrote:
> On Thu, 2 May 2024 15:03:51 +0100
> Steven Price <steven.price@arm.com> wrote:
> 
>> On 30/04/2024 12:28, Boris Brezillon wrote:
>>> ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
>>> [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
>>> in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.  
>>
>> This might be a silly question, but do we need ID 0 to be
>> "no-tiler-heap"? Would it be easier to e.g. use a negative number for
>> that situation and avoid all the off-by-one problems?
>>
>> I'm struggling to find the code which needs the 0 value to be special -
>> where is it exactly that we encode this "no-tiler-heap" value?
> 
> Hm, I thought we were passing the heap handle to the group creation
> ioctl, but heap queue/heap association is actually done through a CS
> instruction, so I guess you have a point. The only thing that makes a
> bit hesitant is that handle=0 is reserved for all other kind of handles
> we return, and I think I'd prefer to keep it the same for heap handles.
> 
> This being said, we could do the `+- 1` in
> panthor_ioctl_tiler_heap_{create,destroy}() to keep things simple in
> panthor_heap.c.

The heap handles returned to user space have the upper 16 bits encoding
the VM ID - so hopefully no one is doing anything crazy and splitting it
up to treat the lower part specially. And (unless I'm mistaken) the VM
IDs start from 1 so we'd still not have IDs of 0. So I don't think we
need the +- 1 part anywhere for tiler heaps.

I'd certainly consider it a user space bug to treat the handles as
anything other than opaque. Really user space shouldn't be treating 0 as
special either: the uAPI doesn't say it's not valid. But I'd be open to
updating the uAPI to say 0 is invalid if there's some desire for that.

Steve


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-05-02 14:26       ` Steven Price
@ 2024-05-02 14:36         ` Boris Brezillon
  2024-05-02 14:47           ` Boris Brezillon
  0 siblings, 1 reply; 19+ messages in thread
From: Boris Brezillon @ 2024-05-02 14:36 UTC (permalink / raw
  To: Steven Price
  Cc: Liviu Dudau, Adrián Larumbe, dri-devel, kernel, Eric Smith

On Thu, 2 May 2024 15:26:55 +0100
Steven Price <steven.price@arm.com> wrote:

> On 02/05/2024 15:15, Boris Brezillon wrote:
> > On Thu, 2 May 2024 15:03:51 +0100
> > Steven Price <steven.price@arm.com> wrote:
> >   
> >> On 30/04/2024 12:28, Boris Brezillon wrote:  
> >>> ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
> >>> [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
> >>> in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.    
> >>
> >> This might be a silly question, but do we need ID 0 to be
> >> "no-tiler-heap"? Would it be easier to e.g. use a negative number for
> >> that situation and avoid all the off-by-one problems?
> >>
> >> I'm struggling to find the code which needs the 0 value to be special -
> >> where is it exactly that we encode this "no-tiler-heap" value?  
> > 
> > Hm, I thought we were passing the heap handle to the group creation
> > ioctl, but heap queue/heap association is actually done through a CS
> > instruction, so I guess you have a point. The only thing that makes a
> > bit hesitant is that handle=0 is reserved for all other kind of handles
> > we return, and I think I'd prefer to keep it the same for heap handles.
> > 
> > This being said, we could do the `+- 1` in
> > panthor_ioctl_tiler_heap_{create,destroy}() to keep things simple in
> > panthor_heap.c.  
> 
> The heap handles returned to user space have the upper 16 bits encoding
> the VM ID - so hopefully no one is doing anything crazy and splitting it
> up to treat the lower part specially. And (unless I'm mistaken) the VM
> IDs start from 1 so we'd still not have IDs of 0. So I don't think we
> need the +- 1 part anywhere for tiler heaps.

Ah, I forgot about that too. Guess we're all good with a
[0,MAX_HEAPS_PER_POOL-1] range then.

> 
> I'd certainly consider it a user space bug to treat the handles as
> anything other than opaque. Really user space shouldn't be treating 0 as
> special either: the uAPI doesn't say it's not valid. But I'd be open to
> updating the uAPI to say 0 is invalid if there's some desire for that.

Will do that in v3 then.

Thanks!

Boris


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic
  2024-05-02 14:36         ` Boris Brezillon
@ 2024-05-02 14:47           ` Boris Brezillon
  0 siblings, 0 replies; 19+ messages in thread
From: Boris Brezillon @ 2024-05-02 14:47 UTC (permalink / raw
  To: Steven Price
  Cc: Liviu Dudau, Adrián Larumbe, dri-devel, kernel, Eric Smith

On Thu, 2 May 2024 16:36:02 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Thu, 2 May 2024 15:26:55 +0100
> Steven Price <steven.price@arm.com> wrote:
> 
> > On 02/05/2024 15:15, Boris Brezillon wrote:  
> > > On Thu, 2 May 2024 15:03:51 +0100
> > > Steven Price <steven.price@arm.com> wrote:
> > >     
> > >> On 30/04/2024 12:28, Boris Brezillon wrote:    
> > >>> ID 0 is reserved to encode 'no-tiler-heap', the heap ID range is
> > >>> [1:MAX_HEAPS_PER_POOL], which we occasionally need to turn into an index
> > >>> in the [0:MAX_HEAPS_PER_POOL-1] when we want to access the context object.      
> > >>
> > >> This might be a silly question, but do we need ID 0 to be
> > >> "no-tiler-heap"? Would it be easier to e.g. use a negative number for
> > >> that situation and avoid all the off-by-one problems?
> > >>
> > >> I'm struggling to find the code which needs the 0 value to be special -
> > >> where is it exactly that we encode this "no-tiler-heap" value?    
> > > 
> > > Hm, I thought we were passing the heap handle to the group creation
> > > ioctl, but heap queue/heap association is actually done through a CS
> > > instruction, so I guess you have a point. The only thing that makes a
> > > bit hesitant is that handle=0 is reserved for all other kind of handles
> > > we return, and I think I'd prefer to keep it the same for heap handles.
> > > 
> > > This being said, we could do the `+- 1` in
> > > panthor_ioctl_tiler_heap_{create,destroy}() to keep things simple in
> > > panthor_heap.c.    
> > 
> > The heap handles returned to user space have the upper 16 bits encoding
> > the VM ID - so hopefully no one is doing anything crazy and splitting it
> > up to treat the lower part specially. And (unless I'm mistaken) the VM
> > IDs start from 1 so we'd still not have IDs of 0. So I don't think we
> > need the +- 1 part anywhere for tiler heaps.  
> 
> Ah, I forgot about that too. Guess we're all good with a
> [0,MAX_HEAPS_PER_POOL-1] range then.
> 
> > 
> > I'd certainly consider it a user space bug to treat the handles as
> > anything other than opaque. Really user space shouldn't be treating 0 as
> > special either: the uAPI doesn't say it's not valid. But I'd be open to
> > updating the uAPI to say 0 is invalid if there's some desire for that.  
> 
> Will do that in v3 then.

Taking that back. I don't think it needs to be enforced in the uAPI. As
you said, it's supposed to be opaque, so I'm tempted to update the
drm_panthor_tiler_heap_destroy::handle kerneldoc saying it must be
a valid handle returned by DRM_IOCTL_PANTHOR_TILER_HEAP_CREATE instead.

It's just that making the handle non-zero is kinda nice for debugging
purposes, and as I said, this way it's consistent with other kind of
handles (GEMs, VMs, syncobjs, ...).

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-05-02 14:47 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-30 11:28 [PATCH v2 0/4] drm/panthor: Collection of tiler heap related fixes Boris Brezillon
2024-04-30 11:28 ` [PATCH v2 1/4] drm/panthor: Fix tiler OOM handling to allow incremental rendering Boris Brezillon
2024-04-30 15:27   ` Liviu Dudau
2024-05-02 14:03   ` Steven Price
2024-04-30 11:28 ` [PATCH v2 2/4] drm/panthor: Make sure the tiler initial/max chunks are consistent Boris Brezillon
2024-04-30 15:31   ` Liviu Dudau
2024-05-02 14:03   ` Steven Price
2024-04-30 11:28 ` [PATCH v2 3/4] drm/panthor: Relax the constraints on the tiler chunk size Boris Brezillon
2024-04-30 13:08   ` Adrián Larumbe
2024-05-02 14:03     ` Steven Price
2024-04-30 16:10   ` Liviu Dudau
2024-04-30 11:28 ` [PATCH v2 4/4] drm/panthor: Fix an off-by-one in the heap context retrieval logic Boris Brezillon
2024-04-30 16:40   ` Liviu Dudau
2024-04-30 17:07     ` Boris Brezillon
2024-05-02 14:03   ` Steven Price
2024-05-02 14:15     ` Boris Brezillon
2024-05-02 14:26       ` Steven Price
2024-05-02 14:36         ` Boris Brezillon
2024-05-02 14:47           ` Boris Brezillon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.