* convert nvme to atomic queue limits updates
@ 2024-02-28 18:11 Christoph Hellwig
  2024-02-28 18:11 ` [PATCH 01/21] block: add a queue_limits_set helper Christoph Hellwig
                   ` (21 more replies)
  0 siblings, 22 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:11 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Hi all,

This series converts nvme to the new atomic queue limits update API
and, as part of that, refactors a lot of the setup code.  The first two
patches are block layer patches already sent out as part of the md
queue limits conversion and will hopefully be merged into Jens' tree
soon.
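
For reference, the atomic update API used throughout the series follows
this pattern (a minimal sketch; the limit and value touched here are
just placeholders):

	struct queue_limits lim;
	int err;

	/* snapshot the current limits and take q->limits_lock */
	lim = queue_limits_start_update(q);
	lim.max_hw_sectors = new_max_hw_sectors;	/* placeholder value */
	/* validate and apply the new limits, dropping the lock */
	err = queue_limits_commit_update(q, &lim);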

Diffstat:
 block/blk-settings.c          |   43 ++++
 drivers/nvme/host/apple.c     |   18 +-
 drivers/nvme/host/core.c      |  378 ++++++++++++++++++++----------------------
 drivers/nvme/host/fc.c        |    8 
 drivers/nvme/host/multipath.c |   13 -
 drivers/nvme/host/nvme.h      |   12 -
 drivers/nvme/host/rdma.c      |   13 -
 drivers/nvme/host/zns.c       |   24 --
 drivers/nvme/target/loop.c    |    5 
 include/linux/blkdev.h        |    3 
 10 files changed, 272 insertions(+), 245 deletions(-)



* [PATCH 01/21] block: add a queue_limits_set helper
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
@ 2024-02-28 18:11 ` Christoph Hellwig
  2024-02-28 18:11 ` [PATCH 02/21] block: add a queue_limits_stack_bdev helper Christoph Hellwig
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:11 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Add a small wrapper around queue_limits_commit_update for stacking
drivers that don't want to update existing limits, but instead apply an
entirely new set.
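
A hypothetical caller would look roughly like this (sketch only, with
made-up limit values):

	struct queue_limits lim = { };
	int err;

	/* made-up example limits */
	lim.logical_block_size = 4096;
	lim.max_hw_sectors = 255;
	err = queue_limits_set(q, &lim);
	if (err)
		return err;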

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c   | 18 ++++++++++++++++++
 include/linux/blkdev.h |  1 +
 2 files changed, 19 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index b6bbe683d218fa..1989a177be201b 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -266,6 +266,24 @@ int queue_limits_commit_update(struct request_queue *q,
 }
 EXPORT_SYMBOL_GPL(queue_limits_commit_update);
 
+/**
+ * queue_limits_set - apply queue limits to queue
+ * @q:		queue to update
+ * @lim:	limits to apply
+ *
+ * Apply the limits in @lim that were freshly initialized to @q.
+ * To update existing limits use queue_limits_start_update() and
+ * queue_limits_commit_update() instead.
+ *
+ * Returns 0 if successful, else a negative error code.
+ */
+int queue_limits_set(struct request_queue *q, struct queue_limits *lim)
+{
+	mutex_lock(&q->limits_lock);
+	return queue_limits_commit_update(q, lim);
+}
+EXPORT_SYMBOL_GPL(queue_limits_set);
+
 /**
  * blk_queue_bounce_limit - set bounce buffer limit for queue
  * @q: the request queue for the device
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a14ea934413850..dd510ad7ce4b45 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -889,6 +889,7 @@ queue_limits_start_update(struct request_queue *q)
 }
 int queue_limits_commit_update(struct request_queue *q,
 		struct queue_limits *lim);
+int queue_limits_set(struct request_queue *q, struct queue_limits *lim);
 
 /*
  * Access functions for manipulating queue properties
-- 
2.39.2




* [PATCH 02/21] block: add a queue_limits_stack_bdev helper
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
  2024-02-28 18:11 ` [PATCH 01/21] block: add a queue_limits_set helper Christoph Hellwig
@ 2024-02-28 18:11 ` Christoph Hellwig
  2024-02-28 18:11 ` [PATCH 03/21] nvme: set max_hw_sectors unconditionally Christoph Hellwig
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:11 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Add a small wrapper around blk_stack_limits that allows passing a bdev
for the bottom device and prints a warning in case of a misaligned
device.  The name fits into the new queue limits API, and the intent is
to eventually replace disk_stack_limits.
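
As an illustration, a stacking driver might use it roughly like this
(a sketch; nr_devs/bdevs are made up, and queue_limits_set is the
helper from the previous patch):

	struct queue_limits lim;
	int i, err;

	blk_set_stacking_limits(&lim);
	for (i = 0; i < nr_devs; i++)	/* hypothetical component list */
		queue_limits_stack_bdev(&lim, bdevs[i], 0, disk->disk_name);
	err = queue_limits_set(disk->queue, &lim);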

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c   | 25 +++++++++++++++++++++++++
 include/linux/blkdev.h |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 1989a177be201b..865fe4ebbf9b83 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -891,6 +891,31 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 }
 EXPORT_SYMBOL(blk_stack_limits);
 
+/**
+ * queue_limits_stack_bdev - adjust queue_limits for stacked devices
+ * @t:	the stacking driver limits (top device)
+ * @bdev:  the underlying block device (bottom)
+ * @offset:  offset to beginning of data within component device
+ * @pfx: prefix to use for warnings logged
+ *
+ * Description:
+ *    This function is used by stacking drivers like MD and DM to ensure
+ *    that all component devices have compatible block sizes and
+ *    alignments.  The stacking driver must provide a queue_limits
+ *    struct (top) and then iteratively call the stacking function for
+ *    all component (bottom) devices.  The stacking function will
+ *    attempt to combine the values and ensure proper alignment.
+ */
+void queue_limits_stack_bdev(struct queue_limits *t, struct block_device *bdev,
+		sector_t offset, const char *pfx)
+{
+	if (blk_stack_limits(t, &bdev_get_queue(bdev)->limits,
+			get_start_sect(bdev) + offset))
+		pr_notice("%s: Warning: Device %pg is misaligned\n",
+			pfx, bdev);
+}
+EXPORT_SYMBOL_GPL(queue_limits_stack_bdev);
+
 /**
  * disk_stack_limits - adjust queue limits for stacked drivers
  * @disk:  MD/DM gendisk (top)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index dd510ad7ce4b45..285e82723d641f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -924,6 +924,8 @@ extern void blk_set_queue_depth(struct request_queue *q, unsigned int depth);
 extern void blk_set_stacking_limits(struct queue_limits *lim);
 extern int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 			    sector_t offset);
+void queue_limits_stack_bdev(struct queue_limits *t, struct block_device *bdev,
+		sector_t offset, const char *pfx);
 extern void disk_stack_limits(struct gendisk *disk, struct block_device *bdev,
 			      sector_t offset);
 extern void blk_queue_update_dma_pad(struct request_queue *, unsigned int);
-- 
2.39.2




* [PATCH 03/21] nvme: set max_hw_sectors unconditionally
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
  2024-02-28 18:11 ` [PATCH 01/21] block: add a queue_limits_set helper Christoph Hellwig
  2024-02-28 18:11 ` [PATCH 02/21] block: add a queue_limits_stack_bdev helper Christoph Hellwig
@ 2024-02-28 18:11 ` Christoph Hellwig
  2024-02-29 10:46   ` Max Gurtovoy
  2024-02-28 18:11 ` [PATCH 04/21] nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard Christoph Hellwig
                   ` (18 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:11 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

All transports set a max_hw_sectors value in the nvme_ctrl, so make
the code using it unconditional and clean it up using a little helper.
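
As a quick sanity check of the helper added below (assuming the usual
4k NVME_CTRL_PAGE_SIZE, i.e. 8 sectors per page): max_hw_sectors = 1024
(512k) gives 1024 / 8 + 1 = 129 segments, the + 1 presumably covering a
transfer that does not start page-aligned; the result is then capped by
ctrl->max_segments (if set) and USHRT_MAX.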

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index eed3e22e24d913..74cd384ca5fc73 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1944,19 +1944,19 @@ static int nvme_configure_metadata(struct nvme_ctrl *ctrl,
 	return 0;
 }
 
+static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
+{
+	return ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> SECTOR_SHIFT) + 1;
+}
+
 static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
 		struct request_queue *q)
 {
 	bool vwc = ctrl->vwc & NVME_CTRL_VWC_PRESENT;
 
-	if (ctrl->max_hw_sectors) {
-		u32 max_segments =
-			(ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> 9)) + 1;
-
-		max_segments = min_not_zero(max_segments, ctrl->max_segments);
-		blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
-		blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
-	}
+	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
+	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
+		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
 	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
 	blk_queue_dma_alignment(q, 3);
 	blk_queue_write_cache(q, vwc, vwc);
-- 
2.39.2




* [PATCH 04/21] nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (2 preceding siblings ...)
  2024-02-28 18:11 ` [PATCH 03/21] nvme: set max_hw_sectors unconditionally Christoph Hellwig
@ 2024-02-28 18:11 ` Christoph Hellwig
  2024-02-29 10:48   ` Max Gurtovoy
  2024-02-28 18:11 ` [PATCH 05/21] nvme: remove nvme_revalidate_zones Christoph Hellwig
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:11 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Move the handling of the NVME_QUIRK_DEALLOCATE_ZEROES quirk out of
nvme_config_discard so that it is combined with the normal write_zeroes
limit handling.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 74cd384ca5fc73..ee1e13a658c314 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1816,9 +1816,6 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl, struct gendisk *disk,
 	else
 		blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
 	queue->limits.discard_granularity = queue_logical_block_size(queue);
-
-	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
-		blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
 }
 
 static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
@@ -2029,8 +2026,12 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 	set_capacity_and_notify(disk, capacity);
 
 	nvme_config_discard(ctrl, disk, head);
-	blk_queue_max_write_zeroes_sectors(disk->queue,
-					   ctrl->max_zeroes_sectors);
+
+	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
+		blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
+	else
+		blk_queue_max_write_zeroes_sectors(disk->queue,
+				ctrl->max_zeroes_sectors);
 }
 
 static bool nvme_ns_is_readonly(struct nvme_ns *ns, struct nvme_ns_info *info)
-- 
2.39.2




* [PATCH 05/21] nvme: remove nvme_revalidate_zones
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (3 preceding siblings ...)
  2024-02-28 18:11 ` [PATCH 04/21] nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard Christoph Hellwig
@ 2024-02-28 18:11 ` Christoph Hellwig
  2024-02-28 23:47   ` Damien Le Moal
  2024-02-28 18:12 ` [PATCH 06/21] nvme: move max_integrity_segments handling out of nvme_init_integrity Christoph Hellwig
                   ` (16 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:11 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Handle setting the zone size / chunk_sectors and max_zone_append_sectors
limits together with the other ZNS limits, and just open code the call
to blk_revalidate_disk_zones at its current call site.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c |  2 +-
 drivers/nvme/host/nvme.h |  1 -
 drivers/nvme/host/zns.c  | 12 ++----------
 3 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ee1e13a658c314..4a05045e409904 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2154,7 +2154,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	blk_mq_unfreeze_queue(ns->disk->queue);
 
 	if (blk_queue_is_zoned(ns->queue)) {
-		ret = nvme_revalidate_zones(ns);
+		ret = blk_revalidate_disk_zones(ns->disk, NULL);
 		if (ret && !nvme_first_scan(ns->disk))
 			goto out;
 	}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 4a484fc8a073c8..01e8bae7886584 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1036,7 +1036,6 @@ static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
 }
 #endif /* CONFIG_NVME_MULTIPATH */
 
-int nvme_revalidate_zones(struct nvme_ns *ns);
 int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
 		unsigned int nr_zones, report_zones_cb cb, void *data);
 #ifdef CONFIG_BLK_DEV_ZONED
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 499bbb0eee8d09..852261d7891362 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -7,16 +7,6 @@
 #include <linux/vmalloc.h>
 #include "nvme.h"
 
-int nvme_revalidate_zones(struct nvme_ns *ns)
-{
-	struct request_queue *q = ns->queue;
-
-	blk_queue_chunk_sectors(q, ns->head->zsze);
-	blk_queue_max_zone_append_sectors(q, ns->ctrl->max_zone_append);
-
-	return blk_revalidate_disk_zones(ns->disk, NULL);
-}
-
 static int nvme_set_max_append(struct nvme_ctrl *ctrl)
 {
 	struct nvme_command c = { };
@@ -113,6 +103,8 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	disk_set_max_open_zones(ns->disk, le32_to_cpu(id->mor) + 1);
 	disk_set_max_active_zones(ns->disk, le32_to_cpu(id->mar) + 1);
+	blk_queue_chunk_sectors(ns->queue, ns->head->zsze);
+	blk_queue_max_zone_append_sectors(ns->queue, ns->ctrl->max_zone_append);
 free_data:
 	kfree(id);
 	return status;
-- 
2.39.2




* [PATCH 06/21] nvme: move max_integrity_segments handling out of nvme_init_integrity
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (4 preceding siblings ...)
  2024-02-28 18:11 ` [PATCH 05/21] nvme: remove nvme_revalidate_zones Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 10:58   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 07/21] nvme: cleanup the nvme_init_integrity calling conventions Christoph Hellwig
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

max_integrity_segments is just a hardware limit and doesn't need to be
in nvme_init_integrity with the PI setup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4a05045e409904..94d835f9dc90e9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1724,8 +1724,7 @@ int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 }
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static void nvme_init_integrity(struct gendisk *disk,
-		struct nvme_ns_head *head, u32 max_integrity_segments)
+static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
 {
 	struct blk_integrity integrity = { };
 
@@ -1773,11 +1772,9 @@ static void nvme_init_integrity(struct gendisk *disk,
 	integrity.tuple_size = head->ms;
 	integrity.pi_offset = head->pi_offset;
 	blk_integrity_register(disk, &integrity);
-	blk_queue_max_integrity_segments(disk->queue, max_integrity_segments);
 }
 #else
-static void nvme_init_integrity(struct gendisk *disk,
-		struct nvme_ns_head *head, u32 max_integrity_segments)
+static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
 {
 }
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
@@ -1954,6 +1951,7 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
 	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
 	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
 		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
+	blk_queue_max_integrity_segments(q, ctrl->max_integrity_segments);
 	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
 	blk_queue_dma_alignment(q, 3);
 	blk_queue_write_cache(q, vwc, vwc);
@@ -2017,8 +2015,7 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 	if (head->ms) {
 		if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
 		    (head->features & NVME_NS_METADATA_SUPPORTED))
-			nvme_init_integrity(disk, head,
-					    ctrl->max_integrity_segments);
+			nvme_init_integrity(disk, head);
 		else if (!nvme_ns_has_pi(head))
 			capacity = 0;
 	}
-- 
2.39.2




* [PATCH 07/21] nvme: cleanup the nvme_init_integrity calling conventions
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (5 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 06/21] nvme: move max_integrity_segments handling out of nvme_init_integrity Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 12:33   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 08/21] nvme: move blk_integrity_unregister into nvme_init_integrity Christoph Hellwig
                   ` (14 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Handle the no metadata support case in nvme_init_integrity as well to
simplify the calling convention and prepare for future changes in the
area.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 94d835f9dc90e9..76e548902ec33e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1723,11 +1723,21 @@ int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 	return 0;
 }
 
-#ifdef CONFIG_BLK_DEV_INTEGRITY
-static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
+static bool nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
 {
 	struct blk_integrity integrity = { };
 
+	if (!head->ms)
+		return true;
+
+	/*
+	 * PI can always be supported as we can ask the controller to simply
+	 * insert/strip it, which is not possible for other kinds of metadata.
+	 */
+	if (!IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) ||
+	    !(head->features & NVME_NS_METADATA_SUPPORTED))
+		return nvme_ns_has_pi(head);
+
 	switch (head->pi_type) {
 	case NVME_NS_DPS_PI_TYPE3:
 		switch (head->guard_type) {
@@ -1772,12 +1782,8 @@ static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
 	integrity.tuple_size = head->ms;
 	integrity.pi_offset = head->pi_offset;
 	blk_integrity_register(disk, &integrity);
+	return true;
 }
-#else
-static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
-{
-}
-#endif /* CONFIG_BLK_DEV_INTEGRITY */
 
 static void nvme_config_discard(struct nvme_ctrl *ctrl, struct gendisk *disk,
 		struct nvme_ns_head *head)
@@ -2012,13 +2018,8 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 	 * I/O to namespaces with metadata except when the namespace supports
 	 * PI, as it can strip/insert in that case.
 	 */
-	if (head->ms) {
-		if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
-		    (head->features & NVME_NS_METADATA_SUPPORTED))
-			nvme_init_integrity(disk, head);
-		else if (!nvme_ns_has_pi(head))
-			capacity = 0;
-	}
+	if (!nvme_init_integrity(disk, head))
+		capacity = 0;
 
 	set_capacity_and_notify(disk, capacity);
 
-- 
2.39.2




* [PATCH 08/21] nvme: move blk_integrity_unregister into nvme_init_integrity
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (6 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 07/21] nvme: cleanup the nvme_init_integrity calling conventions Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 12:36   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk Christoph Hellwig
                   ` (13 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Move unregistering the existing integrity profile into the helper
dealing with all the other integrity / metadata setup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 76e548902ec33e..105046a7957272 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1727,6 +1727,8 @@ static bool nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
 {
 	struct blk_integrity integrity = { };
 
+	blk_integrity_unregister(disk);
+
 	if (!head->ms)
 		return true;
 
@@ -1980,8 +1982,6 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 		bs = (1 << 9);
 	}
 
-	blk_integrity_unregister(disk);
-
 	atomic_bs = phys_bs = bs;
 	if (id->nabo == 0) {
 		/*
-- 
2.39.2




* [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (7 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 08/21] nvme: move blk_integrity_unregister into nvme_init_integrity Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 12:47   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 10/21] nvme: move a few things out of nvme_update_disk_info Christoph Hellwig
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Currently nvme_update_ns_info_block calls nvme_update_disk_info both for
the namespace-attached disk and the multipath one (if it exists).  This
is very different from how other stacking drivers work, and leads to
a lot of complexity.

Switch to setting the disk capacity and initializing the integrity
profile, and let blk_stack_limits, which is already called just below,
deal with updating the other limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 105046a7957272..0bf30580f29d5b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2159,7 +2159,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 
 	if (nvme_ns_head_multipath(ns->head)) {
 		blk_mq_freeze_queue(ns->head->disk->queue);
-		nvme_update_disk_info(ns->ctrl, ns->head->disk, ns->head, id);
+		nvme_init_integrity(ns->head->disk, ns->head);
+		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
 		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
 		nvme_mpath_revalidate_paths(ns);
 		blk_stack_limits(&ns->head->disk->queue->limits,
-- 
2.39.2




* [PATCH 10/21] nvme: move a few things out of nvme_update_disk_info
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (8 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 11/21] nvme: move setting the write cache flags out of nvme_set_queue_limits Christoph Hellwig
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Move setting up the integrity profile and setting the disk capacity out
of nvme_update_disk_info, so that nvme_update_disk_info can eventually
be reduced to just setting queue_limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 46 +++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0bf30580f29d5b..3d839945edd85d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1965,12 +1965,13 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
 	blk_queue_write_cache(q, vwc, vwc);
 }
 
-static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
-		struct nvme_ns_head *head, struct nvme_id_ns *id)
+static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id)
 {
-	sector_t capacity = nvme_lba_to_sect(head, le64_to_cpu(id->nsze));
+	struct gendisk *disk = ns->disk;
+	struct nvme_ns_head *head = ns->head;
 	u32 bs = 1U << head->lba_shift;
 	u32 atomic_bs, phys_bs, io_opt = 0;
+	bool valid = true;
 
 	/*
 	 * The block layer can't support LBA sizes larger than the page size
@@ -1978,8 +1979,8 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 	 * allow block I/O.
 	 */
 	if (head->lba_shift > PAGE_SHIFT || head->lba_shift < SECTOR_SHIFT) {
-		capacity = 0;
 		bs = (1 << 9);
+		valid = false;
 	}
 
 	atomic_bs = phys_bs = bs;
@@ -1992,7 +1993,7 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 		if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawupf)
 			atomic_bs = (1 + le16_to_cpu(id->nawupf)) * bs;
 		else
-			atomic_bs = (1 + ctrl->subsys->awupf) * bs;
+			atomic_bs = (1 + ns->ctrl->subsys->awupf) * bs;
 	}
 
 	if (id->nsfeat & NVME_NS_FEAT_IO_OPT) {
@@ -2012,24 +2013,14 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
 	blk_queue_io_min(disk->queue, phys_bs);
 	blk_queue_io_opt(disk->queue, io_opt);
 
-	/*
-	 * Register a metadata profile for PI, or the plain non-integrity NVMe
-	 * metadata masquerading as Type 0 if supported, otherwise reject block
-	 * I/O to namespaces with metadata except when the namespace supports
-	 * PI, as it can strip/insert in that case.
-	 */
-	if (!nvme_init_integrity(disk, head))
-		capacity = 0;
-
-	set_capacity_and_notify(disk, capacity);
-
-	nvme_config_discard(ctrl, disk, head);
+	nvme_config_discard(ns->ctrl, disk, head);
 
-	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
+	if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
 		blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
 	else
 		blk_queue_max_write_zeroes_sectors(disk->queue,
-				ctrl->max_zeroes_sectors);
+				ns->ctrl->max_zeroes_sectors);
+	return valid;
 }
 
 static bool nvme_ns_is_readonly(struct nvme_ns *ns, struct nvme_ns_info *info)
@@ -2103,6 +2094,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		struct nvme_ns_info *info)
 {
 	struct nvme_id_ns *id;
+	sector_t capacity;
 	unsigned lbaf;
 	int ret;
 
@@ -2121,6 +2113,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	lbaf = nvme_lbaf_index(id->flbas);
 	ns->head->lba_shift = id->lbaf[lbaf].ds;
 	ns->head->nuse = le64_to_cpu(id->nuse);
+	capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
+
 	nvme_set_queue_limits(ns->ctrl, ns->queue);
 
 	ret = nvme_configure_metadata(ns->ctrl, ns->head, id);
@@ -2129,7 +2123,19 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		goto out;
 	}
 	nvme_set_chunk_sectors(ns, id);
-	nvme_update_disk_info(ns->ctrl, ns->disk, ns->head, id);
+	if (!nvme_update_disk_info(ns, id))
+		capacity = 0;
+
+	/*
+	 * Register a metadata profile for PI, or the plain non-integrity NVMe
+	 * metadata masquerading as Type 0 if supported, otherwise reject block
+	 * I/O to namespaces with metadata except when the namespace supports
+	 * PI, as it can strip/insert in that case.
+	 */
+	if (!nvme_init_integrity(ns->disk, ns->head))
+		capacity = 0;
+
+	set_capacity_and_notify(ns->disk, capacity);
 
 	if (ns->head->ids.csi == NVME_CSI_ZNS) {
 		ret = nvme_update_zone_info(ns, lbaf);
-- 
2.39.2




* [PATCH 11/21] nvme: move setting the write cache flags out of nvme_set_queue_limits
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (9 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 10/21] nvme: move a few things out of nvme_update_disk_info Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 13:11   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 12/21] nvme: move common logic into nvme_update_ns_info Christoph Hellwig
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

nvme_set_queue_limits is used on the admin queue and all gendisks
including hidden ones that don't support block I/O.  The write cache
setting on the other hand only makes sense for block I/O.  Move the
blk_queue_write_cache call to nvme_update_ns_info_block instead.
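
(For reference, blk_queue_write_cache() takes the write cache and FUA
enable flags as its two bool arguments; nvme derives both from the same
VWC bit, hence the repeated vwc argument in the hunk below.)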

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3d839945edd85d..dcb8d0590ed0f3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1954,15 +1954,12 @@ static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
 static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
 		struct request_queue *q)
 {
-	bool vwc = ctrl->vwc & NVME_CTRL_VWC_PRESENT;
-
 	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
 	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
 		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
 	blk_queue_max_integrity_segments(q, ctrl->max_integrity_segments);
 	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
 	blk_queue_dma_alignment(q, 3);
-	blk_queue_write_cache(q, vwc, vwc);
 }
 
 static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id)
@@ -2093,6 +2090,7 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
 static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		struct nvme_ns_info *info)
 {
+	bool vwc = ns->ctrl->vwc & NVME_CTRL_VWC_PRESENT;
 	struct nvme_id_ns *id;
 	sector_t capacity;
 	unsigned lbaf;
@@ -2154,6 +2152,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	if ((id->dlfeat & 0x7) == 0x1 && (id->dlfeat & (1 << 3)))
 		ns->head->features |= NVME_NS_DEAC;
 	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
+	blk_queue_write_cache(ns->disk->queue, vwc, vwc);
 	set_bit(NVME_NS_READY, &ns->flags);
 	blk_mq_unfreeze_queue(ns->disk->queue);
 
-- 
2.39.2




* [PATCH 12/21] nvme: move common logic into nvme_update_ns_info
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (10 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 11/21] nvme: move setting the write cache flags out of nvme_set_queue_limits Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 13:30   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper Christoph Hellwig
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

nvme_update_ns_info_generic and nvme_update_ns_info_block share a
fair amount of logic for handling not fully supported namespace
formats and for updating the multipath information.  Move this logic
into the common caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 79 +++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 42 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dcb8d0590ed0f3..78667ba89ec491 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2070,21 +2070,8 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
 	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
 	blk_mq_unfreeze_queue(ns->disk->queue);
 
-	if (nvme_ns_head_multipath(ns->head)) {
-		blk_mq_freeze_queue(ns->head->disk->queue);
-		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
-		nvme_mpath_revalidate_paths(ns);
-		blk_stack_limits(&ns->head->disk->queue->limits,
-				 &ns->queue->limits, 0);
-		ns->head->disk->flags |= GENHD_FL_HIDDEN;
-		blk_mq_unfreeze_queue(ns->head->disk->queue);
-	}
-
 	/* Hide the block-interface for these devices */
-	ns->disk->flags |= GENHD_FL_HIDDEN;
-	set_bit(NVME_NS_READY, &ns->flags);
-
-	return 0;
+	return -ENODEV;
 }
 
 static int nvme_update_ns_info_block(struct nvme_ns *ns,
@@ -2104,7 +2091,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		/* namespace not allocated or attached */
 		info->is_removed = true;
 		ret = -ENODEV;
-		goto error;
+		goto out;
 	}
 
 	blk_mq_freeze_queue(ns->disk->queue);
@@ -2162,54 +2149,62 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 			goto out;
 	}
 
-	if (nvme_ns_head_multipath(ns->head)) {
-		blk_mq_freeze_queue(ns->head->disk->queue);
-		nvme_init_integrity(ns->head->disk, ns->head);
-		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
-		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
-		nvme_mpath_revalidate_paths(ns);
-		blk_stack_limits(&ns->head->disk->queue->limits,
-				 &ns->queue->limits, 0);
-		disk_update_readahead(ns->head->disk);
-		blk_mq_unfreeze_queue(ns->head->disk->queue);
-	}
-
 	ret = 0;
 out:
-	/*
-	 * If probing fails due an unsupported feature, hide the block device,
-	 * but still allow other access.
-	 */
-	if (ret == -ENODEV) {
-		ns->disk->flags |= GENHD_FL_HIDDEN;
-		set_bit(NVME_NS_READY, &ns->flags);
-		ret = 0;
-	}
-
-error:
 	kfree(id);
 	return ret;
 }
 
 static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 {
+	int ret;
+
 	switch (info->ids.csi) {
 	case NVME_CSI_ZNS:
 		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) {
 			dev_info(ns->ctrl->device,
 	"block device for nsid %u not supported without CONFIG_BLK_DEV_ZONED\n",
 				info->nsid);
-			return nvme_update_ns_info_generic(ns, info);
+			ret = nvme_update_ns_info_generic(ns, info);
+			break;
 		}
-		return nvme_update_ns_info_block(ns, info);
+		ret = nvme_update_ns_info_block(ns, info);
+		break;
 	case NVME_CSI_NVM:
-		return nvme_update_ns_info_block(ns, info);
+		ret = nvme_update_ns_info_block(ns, info);
+		break;
 	default:
 		dev_info(ns->ctrl->device,
 			"block device for nsid %u not supported (csi %u)\n",
 			info->nsid, info->ids.csi);
-		return nvme_update_ns_info_generic(ns, info);
+		ret = nvme_update_ns_info_generic(ns, info);
+		break;
+	}
+
+	/*
+	 * If probing fails due to an unsupported feature, hide the block device,
+	 * but still allow other access.
+	 */
+	if (ret == -ENODEV) {
+		ns->disk->flags |= GENHD_FL_HIDDEN;
+		set_bit(NVME_NS_READY, &ns->flags);
+		ret = 0;
+	}
+
+	if (!ret && nvme_ns_head_multipath(ns->head)) {
+		blk_mq_freeze_queue(ns->head->disk->queue);
+		if (!(ns->disk->flags & GENHD_FL_HIDDEN))
+			nvme_init_integrity(ns->head->disk, ns->head);
+		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
+		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
+		nvme_mpath_revalidate_paths(ns);
+		blk_stack_limits(&ns->head->disk->queue->limits,
+				 &ns->queue->limits, 0);
+		disk_update_readahead(ns->head->disk);
+		blk_mq_unfreeze_queue(ns->head->disk->queue);
 	}
+
+	return ret;
 }
 
 #ifdef CONFIG_BLK_SED_OPAL
-- 
2.39.2




* [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (11 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 12/21] nvme: move common logic into nvme_update_ns_info Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-29 13:52   ` Max Gurtovoy
  2024-02-28 18:12 ` [PATCH 14/21] nvme: don't query identify data in configure_metadata Christoph Hellwig
                   ` (8 subsequent siblings)
  21 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Split the logic that queries the Identify Namespace Data Structure, NVM
Command Set, into a separate helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 38 ++++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 78667ba89ec491..adcd11720d1bb4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1831,12 +1831,35 @@ static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
 		a->csi == b->csi;
 }
 
+static int nvme_identify_ns_nvm(struct nvme_ctrl *ctrl, unsigned int nsid,
+		struct nvme_id_ns_nvm **nvmp)
+{
+	struct nvme_command c = {
+		.identify.opcode	= nvme_admin_identify,
+		.identify.nsid		= cpu_to_le32(nsid),
+		.identify.cns		= NVME_ID_CNS_CS_NS,
+		.identify.csi		= NVME_CSI_NVM,
+	};
+	struct nvme_id_ns_nvm *nvm;
+	int ret;
+
+	nvm = kzalloc(sizeof(*nvm), GFP_KERNEL);
+	if (!nvm)
+		return -ENOMEM;
+
+	ret = nvme_submit_sync_cmd(ctrl->admin_q, &c, nvm, sizeof(*nvm));
+	if (ret)
+		kfree(nvm);
+	else
+		*nvmp = nvm;
+	return ret;
+}
+
 static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
 		struct nvme_id_ns *id)
 {
 	bool first = id->dps & NVME_NS_DPS_PI_FIRST;
 	unsigned lbaf = nvme_lbaf_index(id->flbas);
-	struct nvme_command c = { };
 	struct nvme_id_ns_nvm *nvm;
 	int ret = 0;
 	u32 elbaf;
@@ -1849,18 +1872,9 @@ static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
 		goto set_pi;
 	}
 
-	nvm = kzalloc(sizeof(*nvm), GFP_KERNEL);
-	if (!nvm)
-		return -ENOMEM;
-
-	c.identify.opcode = nvme_admin_identify;
-	c.identify.nsid = cpu_to_le32(head->ns_id);
-	c.identify.cns = NVME_ID_CNS_CS_NS;
-	c.identify.csi = NVME_CSI_NVM;
-
-	ret = nvme_submit_sync_cmd(ctrl->admin_q, &c, nvm, sizeof(*nvm));
+	ret = nvme_identify_ns_nvm(ctrl, head->ns_id, &nvm);
 	if (ret)
-		goto free_data;
+		goto set_pi;
 
 	elbaf = le32_to_cpu(nvm->elbaf[lbaf]);
 
-- 
2.39.2




* [PATCH 14/21] nvme: don't query identify data in configure_metadata
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (12 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 15/21] nvme: cleanup nvme_configure_metadata Christoph Hellwig
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Move reading the Identify Namespace Data Structure, NVM Command Set,
out of nvme_configure_metadata into the caller.  This allows issuing
the Identify call outside the frozen I/O queues, and prepares for using
data from the Identify data structure for other purposes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 49 ++++++++++++++++------------------------
 1 file changed, 19 insertions(+), 30 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index adcd11720d1bb4..531230ccd12210 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1855,32 +1855,26 @@ static int nvme_identify_ns_nvm(struct nvme_ctrl *ctrl, unsigned int nsid,
 	return ret;
 }
 
-static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
-		struct nvme_id_ns *id)
+static void nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
+		struct nvme_id_ns *id, struct nvme_id_ns_nvm *nvm)
 {
 	bool first = id->dps & NVME_NS_DPS_PI_FIRST;
 	unsigned lbaf = nvme_lbaf_index(id->flbas);
-	struct nvme_id_ns_nvm *nvm;
-	int ret = 0;
 	u32 elbaf;
 
 	head->pi_size = 0;
 	head->ms = le16_to_cpu(id->lbaf[lbaf].ms);
-	if (!(ctrl->ctratt & NVME_CTRL_ATTR_ELBAS)) {
+	if (!nvm || !(ctrl->ctratt & NVME_CTRL_ATTR_ELBAS)) {
 		head->pi_size = sizeof(struct t10_pi_tuple);
 		head->guard_type = NVME_NVM_NS_16B_GUARD;
 		goto set_pi;
 	}
 
-	ret = nvme_identify_ns_nvm(ctrl, head->ns_id, &nvm);
-	if (ret)
-		goto set_pi;
-
 	elbaf = le32_to_cpu(nvm->elbaf[lbaf]);
 
 	/* no support for storage tag formats right now */
 	if (nvme_elbaf_sts(elbaf))
-		goto free_data;
+		goto set_pi;
 
 	head->guard_type = nvme_elbaf_guard_type(elbaf);
 	switch (head->guard_type) {
@@ -1894,8 +1888,6 @@ static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
 		break;
 	}
 
-free_data:
-	kfree(nvm);
 set_pi:
 	if (head->pi_size && head->ms >= head->pi_size)
 		head->pi_type = id->dps & NVME_NS_DPS_PI_MASK;
@@ -1906,22 +1898,17 @@ static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
 		head->pi_offset = 0;
 	else
 		head->pi_offset = head->ms - head->pi_size;
-
-	return ret;
 }
 
-static int nvme_configure_metadata(struct nvme_ctrl *ctrl,
-		struct nvme_ns_head *head, struct nvme_id_ns *id)
+static void nvme_configure_metadata(struct nvme_ctrl *ctrl,
+		struct nvme_ns_head *head, struct nvme_id_ns *id,
+		struct nvme_id_ns_nvm *nvm)
 {
-	int ret;
-
-	ret = nvme_init_ms(ctrl, head, id);
-	if (ret)
-		return ret;
+	nvme_init_ms(ctrl, head, id, nvm);
 
 	head->features &= ~(NVME_NS_METADATA_SUPPORTED | NVME_NS_EXT_LBAS);
 	if (!head->ms || !(ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
-		return 0;
+		return;
 
 	if (ctrl->ops->flags & NVME_F_FABRICS) {
 		/*
@@ -1930,7 +1917,7 @@ static int nvme_configure_metadata(struct nvme_ctrl *ctrl,
 		 * remap the separate metadata buffer from the block layer.
 		 */
 		if (WARN_ON_ONCE(!(id->flbas & NVME_NS_FLBAS_META_EXT)))
-			return 0;
+			return;
 
 		head->features |= NVME_NS_EXT_LBAS;
 
@@ -1957,7 +1944,6 @@ static int nvme_configure_metadata(struct nvme_ctrl *ctrl,
 		else
 			head->features |= NVME_NS_METADATA_SUPPORTED;
 	}
-	return 0;
 }
 
 static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
@@ -2092,6 +2078,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		struct nvme_ns_info *info)
 {
 	bool vwc = ns->ctrl->vwc & NVME_CTRL_VWC_PRESENT;
+	struct nvme_id_ns_nvm *nvm = NULL;
 	struct nvme_id_ns *id;
 	sector_t capacity;
 	unsigned lbaf;
@@ -2108,6 +2095,12 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		goto out;
 	}
 
+	if (ns->ctrl->ctratt & NVME_CTRL_ATTR_ELBAS) {
+		ret = nvme_identify_ns_nvm(ns->ctrl, info->nsid, &nvm);
+		if (ret < 0)
+			goto out;
+	}
+
 	blk_mq_freeze_queue(ns->disk->queue);
 	lbaf = nvme_lbaf_index(id->flbas);
 	ns->head->lba_shift = id->lbaf[lbaf].ds;
@@ -2115,12 +2108,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
 
 	nvme_set_queue_limits(ns->ctrl, ns->queue);
-
-	ret = nvme_configure_metadata(ns->ctrl, ns->head, id);
-	if (ret < 0) {
-		blk_mq_unfreeze_queue(ns->disk->queue);
-		goto out;
-	}
+	nvme_configure_metadata(ns->ctrl, ns->head, id, nvm);
 	nvme_set_chunk_sectors(ns, id);
 	if (!nvme_update_disk_info(ns, id))
 		capacity = 0;
@@ -2165,6 +2153,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 
 	ret = 0;
 out:
+	kfree(nvm);
 	kfree(id);
 	return ret;
 }
-- 
2.39.2




* [PATCH 15/21] nvme: cleanup nvme_configure_metadata
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (13 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 14/21] nvme: don't query identify data in configure_metadata Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 16/21] nvme-rdma: initialize max_hw_sectors earlier Christoph Hellwig
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Fold nvme_init_ms into nvme_configure_metadata after splitting out a
little helper to deal with the extended LBA formats.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 47 ++++++++++++++++------------------------
 1 file changed, 19 insertions(+), 28 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 531230ccd12210..f4ec8683b3b725 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1855,26 +1855,14 @@ static int nvme_identify_ns_nvm(struct nvme_ctrl *ctrl, unsigned int nsid,
 	return ret;
 }
 
-static void nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
+static void nvme_configure_pi_elbas(struct nvme_ns_head *head,
 		struct nvme_id_ns *id, struct nvme_id_ns_nvm *nvm)
 {
-	bool first = id->dps & NVME_NS_DPS_PI_FIRST;
-	unsigned lbaf = nvme_lbaf_index(id->flbas);
-	u32 elbaf;
-
-	head->pi_size = 0;
-	head->ms = le16_to_cpu(id->lbaf[lbaf].ms);
-	if (!nvm || !(ctrl->ctratt & NVME_CTRL_ATTR_ELBAS)) {
-		head->pi_size = sizeof(struct t10_pi_tuple);
-		head->guard_type = NVME_NVM_NS_16B_GUARD;
-		goto set_pi;
-	}
-
-	elbaf = le32_to_cpu(nvm->elbaf[lbaf]);
+	u32 elbaf = le32_to_cpu(nvm->elbaf[nvme_lbaf_index(id->flbas)]);
 
 	/* no support for storage tag formats right now */
 	if (nvme_elbaf_sts(elbaf))
-		goto set_pi;
+		return;
 
 	head->guard_type = nvme_elbaf_guard_type(elbaf);
 	switch (head->guard_type) {
@@ -1887,29 +1875,32 @@ static void nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
 	default:
 		break;
 	}
-
-set_pi:
-	if (head->pi_size && head->ms >= head->pi_size)
-		head->pi_type = id->dps & NVME_NS_DPS_PI_MASK;
-	else
-		head->pi_type = 0;
-
-	if (first)
-		head->pi_offset = 0;
-	else
-		head->pi_offset = head->ms - head->pi_size;
 }
 
 static void nvme_configure_metadata(struct nvme_ctrl *ctrl,
 		struct nvme_ns_head *head, struct nvme_id_ns *id,
 		struct nvme_id_ns_nvm *nvm)
 {
-	nvme_init_ms(ctrl, head, id, nvm);
-
 	head->features &= ~(NVME_NS_METADATA_SUPPORTED | NVME_NS_EXT_LBAS);
+	head->pi_type = 0;
+	head->pi_size = 0;
+	head->pi_offset = 0;
+	head->ms = le16_to_cpu(id->lbaf[nvme_lbaf_index(id->flbas)].ms);
 	if (!head->ms || !(ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
 		return;
 
+	if (nvm && (ctrl->ctratt & NVME_CTRL_ATTR_ELBAS)) {
+		nvme_configure_pi_elbas(head, id, nvm);
+	} else {
+		head->pi_size = sizeof(struct t10_pi_tuple);
+		head->guard_type = NVME_NVM_NS_16B_GUARD;
+	}
+
+	if (head->pi_size && head->ms >= head->pi_size)
+		head->pi_type = id->dps & NVME_NS_DPS_PI_MASK;
+	if (!(id->dps & NVME_NS_DPS_PI_FIRST))
+		head->pi_offset = head->ms - head->pi_size;
+
 	if (ctrl->ops->flags & NVME_F_FABRICS) {
 		/*
 		 * The NVMe over Fabrics specification only supports metadata as
-- 
2.39.2




* [PATCH 16/21] nvme-rdma: initialize max_hw_sectors earlier
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (14 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 15/21] nvme: cleanup nvme_configure_metadata Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 17/21] nvme-loop: " Christoph Hellwig
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Initialize max_fr_pages and the values that depend on it a little
earlier so that nvme_alloc_admin_tag_set can rely on them to set the
initial queue limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/rdma.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 20fdd40b1879f5..04a69f7bd48f46 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -797,6 +797,12 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 
 	ctrl->max_fr_pages = nvme_rdma_get_max_fr_pages(ctrl->device->dev,
 							pi_capable);
+	ctrl->ctrl.max_segments = ctrl->max_fr_pages;
+	ctrl->ctrl.max_hw_sectors = ctrl->max_fr_pages << (ilog2(SZ_4K) - 9);
+	if (pi_capable)
+		ctrl->ctrl.max_integrity_segments = ctrl->max_fr_pages;
+	else
+		ctrl->ctrl.max_integrity_segments = 0;
 
 	/*
 	 * Bind the async event SQE DMA mapping to the admin queue lifetime.
@@ -826,13 +832,6 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 	if (error)
 		goto out_stop_queue;
 
-	ctrl->ctrl.max_segments = ctrl->max_fr_pages;
-	ctrl->ctrl.max_hw_sectors = ctrl->max_fr_pages << (ilog2(SZ_4K) - 9);
-	if (pi_capable)
-		ctrl->ctrl.max_integrity_segments = ctrl->max_fr_pages;
-	else
-		ctrl->ctrl.max_integrity_segments = 0;
-
 	nvme_unquiesce_admin_queue(&ctrl->ctrl);
 
 	error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
-- 
2.39.2




* [PATCH 17/21] nvme-loop: initialize max_hw_sectors earlier
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (15 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 16/21] nvme-rdma: initialize max_hw_sectors earlier Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 18/21] nvme-fc: " Christoph Hellwig
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Initialize max_hw_sectors a little earlier so that
nvme_alloc_admin_tag_set can rely on it to set the initial queue limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/target/loop.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index e589915ddef85c..d8e33427a921bb 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -351,6 +351,8 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
 	if (error)
 		return error;
 	ctrl->ctrl.queue_count = 1;
+	ctrl->ctrl.max_hw_sectors =
+		(NVME_LOOP_MAX_SEGMENTS - 1) << PAGE_SECTORS_SHIFT;
 
 	error = nvme_alloc_admin_tag_set(&ctrl->ctrl, &ctrl->admin_tag_set,
 			&nvme_loop_admin_mq_ops,
@@ -372,9 +374,6 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
 	if (error)
 		goto out_cleanup_tagset;
 
-	ctrl->ctrl.max_hw_sectors =
-		(NVME_LOOP_MAX_SEGMENTS - 1) << PAGE_SECTORS_SHIFT;
-
 	nvme_unquiesce_admin_queue(&ctrl->ctrl);
 
 	error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
-- 
2.39.2




* [PATCH 18/21] nvme-fc: initialize max_hw_sectors earlier
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (16 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 17/21] nvme-loop: " Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 19/21] nvme-apple: " Christoph Hellwig
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Initialize max_segments and max_hw_sectors a little earlier so that
nvme_alloc_admin_tag_set can rely on them to set the initial queue
limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/fc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 68a5d971657bb5..537008dd2c616c 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3113,10 +3113,6 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 	if (ret)
 		goto out_disconnect_admin_queue;
 
-	ctrl->ctrl.max_segments = ctrl->lport->ops->max_sgl_segments;
-	ctrl->ctrl.max_hw_sectors = ctrl->ctrl.max_segments <<
-						(ilog2(SZ_4K) - 9);
-
 	nvme_unquiesce_admin_queue(&ctrl->ctrl);
 
 	ret = nvme_init_ctrl_finish(&ctrl->ctrl, false);
@@ -3542,6 +3538,10 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	/* at this point, teardown path changes to ref counting on nvme ctrl */
 
+	ctrl->ctrl.max_segments = ctrl->lport->ops->max_sgl_segments;
+	ctrl->ctrl.max_hw_sectors =
+		ctrl->ctrl.max_segments << (ilog2(SZ_4K) - 9);
+
 	ret = nvme_alloc_admin_tag_set(&ctrl->ctrl, &ctrl->admin_tag_set,
 			&nvme_fc_admin_mq_ops,
 			struct_size_t(struct nvme_fcp_op_w_sgl, priv,
-- 
2.39.2




* [PATCH 19/21] nvme-apple: initialize max_hw_sectors earlier
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (17 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 18/21] nvme-fc: " Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 20/21] nvme: use the atomic queue limits update API Christoph Hellwig
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Initialize max_hw_sectors and max_segments a little earlier so they
can be passed directly to blk_mq_alloc_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/apple.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index a480cdeac2883c..12281fc6932d40 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -1046,14 +1046,6 @@ static void apple_nvme_reset_work(struct work_struct *work)
 
 	dev_dbg(anv->dev, "ANS booted successfully.");
 
-	/*
-	 * Limit the max command size to prevent iod->sg allocations going
-	 * over a single page.
-	 */
-	anv->ctrl.max_hw_sectors = min_t(u32, NVME_MAX_KB_SZ << 1,
-					 dma_max_mapping_size(anv->dev) >> 9);
-	anv->ctrl.max_segments = NVME_MAX_SEGS;
-
 	dma_set_max_seg_size(anv->dev, 0xffffffff);
 
 	/*
@@ -1516,6 +1508,14 @@ static int apple_nvme_probe(struct platform_device *pdev)
 		goto put_dev;
 	}
 
+	/*
+	 * Limit the max command size to prevent iod->sg allocations going
+	 * over a single page.
+	 */
+	anv->ctrl.max_hw_sectors = min_t(u32, NVME_MAX_KB_SZ << 1,
+					 dma_max_mapping_size(anv->dev) >> 9);
+	anv->ctrl.max_segments = NVME_MAX_SEGS;
+
 	anv->ctrl.admin_q = blk_mq_alloc_queue(&anv->admin_tagset, NULL, NULL);
 	if (IS_ERR(anv->ctrl.admin_q)) {
 		ret = -ENOMEM;
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 20/21] nvme: use the atomic queue limits update API
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (18 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 19/21] nvme-apple: " Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-02-28 18:12 ` [PATCH 21/21] nvme-multipath: pass queue_limits to blk_alloc_disk Christoph Hellwig
  2024-03-01 16:20 ` convert nvme to atomic queue limits updates Keith Busch
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC (permalink / raw
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

Change the call chains that update queue_limits to build an on-stack
queue_limits structure and update it atomically.  Note that for now
only the admin queue actually passes it to the queue allocation
function.  Doing the same for the gendisks used for the namespaces
will require a little more work.
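
The resulting update pattern for a live namespace queue looks like this
(distilled from the core.c hunks below; error handling and the ZNS
branch are elided):

	struct queue_limits lim;
	int ret;

	blk_mq_freeze_queue(ns->disk->queue);
	lim = queue_limits_start_update(ns->disk->queue);
	nvme_set_ctrl_limits(ns->ctrl, &lim);
	/* ... fill in the namespace specific limits ... */
	ret = queue_limits_commit_update(ns->disk->queue, &lim);
	blk_mq_unfreeze_queue(ns->disk->queue);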

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/apple.c |   2 +
 drivers/nvme/host/core.c  | 126 +++++++++++++++++++-------------------
 drivers/nvme/host/nvme.h  |  11 +---
 drivers/nvme/host/zns.c   |  16 ++---
 4 files changed, 76 insertions(+), 79 deletions(-)

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index 12281fc6932d40..8b275d6afd9dfa 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -1382,6 +1382,7 @@ static void devm_apple_nvme_mempool_destroy(void *data)
 
 static int apple_nvme_probe(struct platform_device *pdev)
 {
+	struct queue_limits lim = { };
 	struct device *dev = &pdev->dev;
 	struct apple_nvme *anv;
 	int ret;
@@ -1516,6 +1517,7 @@ static int apple_nvme_probe(struct platform_device *pdev)
 					 dma_max_mapping_size(anv->dev) >> 9);
 	anv->ctrl.max_segments = NVME_MAX_SEGS;
 
+	nvme_set_ctrl_limits(&anv->ctrl, &lim);
 	anv->ctrl.admin_q = blk_mq_alloc_queue(&anv->admin_tagset, NULL, NULL);
 	if (IS_ERR(anv->ctrl.admin_q)) {
 		ret = -ENOMEM;
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f4ec8683b3b725..957fa388808b90 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1787,40 +1787,27 @@ static bool nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
 	return true;
 }
 
-static void nvme_config_discard(struct nvme_ctrl *ctrl, struct gendisk *disk,
-		struct nvme_ns_head *head)
+static void nvme_config_discard(struct nvme_ns *ns, struct queue_limits *lim)
 {
-	struct request_queue *queue = disk->queue;
-	u32 max_discard_sectors;
-
-	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(head, UINT_MAX)) {
-		max_discard_sectors = nvme_lba_to_sect(head, ctrl->dmrsl);
-	} else if (ctrl->oncs & NVME_CTRL_ONCS_DSM) {
-		max_discard_sectors = UINT_MAX;
-	} else {
-		blk_queue_max_discard_sectors(queue, 0);
-		return;
-	}
+	struct nvme_ctrl *ctrl = ns->ctrl;
 
 	BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
 			NVME_DSM_MAX_RANGES);
 
-	/*
-	 * If discard is already enabled, don't reset queue limits.
-	 *
-	 * This works around the fact that the block layer can't cope well with
-	 * updating the hardware limits when overridden through sysfs.  This is
-	 * harmless because discard limits in NVMe are purely advisory.
-	 */
-	if (queue->limits.max_discard_sectors)
-		return;
+	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns->head, UINT_MAX))
+		lim->max_hw_discard_sectors =
+			nvme_lba_to_sect(ns->head, ctrl->dmrsl);
+	else if (ctrl->oncs & NVME_CTRL_ONCS_DSM)
+		lim->max_hw_discard_sectors = UINT_MAX;
+	else
+		lim->max_hw_discard_sectors = 0;
+
+	lim->discard_granularity = lim->logical_block_size;
 
-	blk_queue_max_discard_sectors(queue, max_discard_sectors);
 	if (ctrl->dmrl)
-		blk_queue_max_discard_segments(queue, ctrl->dmrl);
+		lim->max_discard_segments = ctrl->dmrl;
 	else
-		blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
-	queue->limits.discard_granularity = queue_logical_block_size(queue);
+		lim->max_discard_segments = NVME_DSM_MAX_RANGES;
 }
 
 static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
@@ -1942,20 +1929,20 @@ static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
 	return ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> SECTOR_SHIFT) + 1;
 }
 
-static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
-		struct request_queue *q)
+void nvme_set_ctrl_limits(struct nvme_ctrl *ctrl, struct queue_limits *lim)
 {
-	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
-	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
-		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
-	blk_queue_max_integrity_segments(q, ctrl->max_integrity_segments);
-	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
-	blk_queue_dma_alignment(q, 3);
+	lim->max_hw_sectors = ctrl->max_hw_sectors;
+	lim->max_segments = min_t(u32, USHRT_MAX,
+		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments));
+	lim->max_integrity_segments = ctrl->max_integrity_segments;
+	lim->virt_boundary_mask = NVME_CTRL_PAGE_SIZE - 1;
+	lim->max_segment_size = UINT_MAX;
+	lim->dma_alignment = 3;
 }
 
-static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id)
+static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id,
+		struct queue_limits *lim)
 {
-	struct gendisk *disk = ns->disk;
 	struct nvme_ns_head *head = ns->head;
 	u32 bs = 1U << head->lba_shift;
 	u32 atomic_bs, phys_bs, io_opt = 0;
@@ -1991,23 +1978,19 @@ static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id)
 		io_opt = bs * (1 + le16_to_cpu(id->nows));
 	}
 
-	blk_queue_logical_block_size(disk->queue, bs);
 	/*
 	 * Linux filesystems assume writing a single physical block is
 	 * an atomic operation. Hence limit the physical block size to the
 	 * value of the Atomic Write Unit Power Fail parameter.
 	 */
-	blk_queue_physical_block_size(disk->queue, min(phys_bs, atomic_bs));
-	blk_queue_io_min(disk->queue, phys_bs);
-	blk_queue_io_opt(disk->queue, io_opt);
-
-	nvme_config_discard(ns->ctrl, disk, head);
-
+	lim->logical_block_size = bs;
+	lim->physical_block_size = min(phys_bs, atomic_bs);
+	lim->io_min = phys_bs;
+	lim->io_opt = io_opt;
 	if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
-		blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
+		lim->max_write_zeroes_sectors = UINT_MAX;
 	else
-		blk_queue_max_write_zeroes_sectors(disk->queue,
-				ns->ctrl->max_zeroes_sectors);
+		lim->max_write_zeroes_sectors = ns->ctrl->max_zeroes_sectors;
 	return valid;
 }
 
@@ -2022,7 +2005,8 @@ static inline bool nvme_first_scan(struct gendisk *disk)
 	return !disk_live(disk);
 }
 
-static void nvme_set_chunk_sectors(struct nvme_ns *ns, struct nvme_id_ns *id)
+static void nvme_set_chunk_sectors(struct nvme_ns *ns, struct nvme_id_ns *id,
+		struct queue_limits *lim)
 {
 	struct nvme_ctrl *ctrl = ns->ctrl;
 	u32 iob;
@@ -2050,25 +2034,33 @@ static void nvme_set_chunk_sectors(struct nvme_ns *ns, struct nvme_id_ns *id)
 		return;
 	}
 
-	blk_queue_chunk_sectors(ns->queue, iob);
+	lim->chunk_sectors = iob;
 }
 
 static int nvme_update_ns_info_generic(struct nvme_ns *ns,
 		struct nvme_ns_info *info)
 {
+	struct queue_limits lim;
+	int ret;
+
 	blk_mq_freeze_queue(ns->disk->queue);
-	nvme_set_queue_limits(ns->ctrl, ns->queue);
+	lim = queue_limits_start_update(ns->disk->queue);
+	nvme_set_ctrl_limits(ns->ctrl, &lim);
+	ret = queue_limits_commit_update(ns->disk->queue, &lim);
 	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
 	blk_mq_unfreeze_queue(ns->disk->queue);
 
 	/* Hide the block-interface for these devices */
-	return -ENODEV;
+	if (!ret)
+		ret = -ENODEV;
+	return ret;
 }
 
 static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		struct nvme_ns_info *info)
 {
 	bool vwc = ns->ctrl->vwc & NVME_CTRL_VWC_PRESENT;
+	struct queue_limits lim;
 	struct nvme_id_ns_nvm *nvm = NULL;
 	struct nvme_id_ns *id;
 	sector_t capacity;
@@ -2098,11 +2090,26 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	ns->head->nuse = le64_to_cpu(id->nuse);
 	capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
 
-	nvme_set_queue_limits(ns->ctrl, ns->queue);
+	lim = queue_limits_start_update(ns->disk->queue);
+	nvme_set_ctrl_limits(ns->ctrl, &lim);
 	nvme_configure_metadata(ns->ctrl, ns->head, id, nvm);
-	nvme_set_chunk_sectors(ns, id);
-	if (!nvme_update_disk_info(ns, id))
+	nvme_set_chunk_sectors(ns, id, &lim);
+	if (!nvme_update_disk_info(ns, id, &lim))
 		capacity = 0;
+	nvme_config_discard(ns, &lim);
+	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
+	    ns->head->ids.csi == NVME_CSI_ZNS) {
+		ret = nvme_update_zone_info(ns, lbaf, &lim);
+		if (ret) {
+			blk_mq_unfreeze_queue(ns->disk->queue);
+			goto out;
+		}
+	}
+	ret = queue_limits_commit_update(ns->disk->queue, &lim);
+	if (ret) {
+		blk_mq_unfreeze_queue(ns->disk->queue);
+		goto out;
+	}
 
 	/*
 	 * Register a metadata profile for PI, or the plain non-integrity NVMe
@@ -2115,14 +2122,6 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 
 	set_capacity_and_notify(ns->disk, capacity);
 
-	if (ns->head->ids.csi == NVME_CSI_ZNS) {
-		ret = nvme_update_zone_info(ns, lbaf);
-		if (ret) {
-			blk_mq_unfreeze_queue(ns->disk->queue);
-			goto out;
-		}
-	}
-
 	/*
 	 * Only set the DEAC bit if the device guarantees that reads from
 	 * deallocated data return zeroes.  While the DEAC bit does not
@@ -3184,7 +3183,6 @@ static int nvme_init_identify(struct nvme_ctrl *ctrl)
 	ctrl->max_hw_sectors =
 		min_not_zero(ctrl->max_hw_sectors, max_hw_sectors);
 
-	nvme_set_queue_limits(ctrl, ctrl->admin_q);
 	ctrl->sgls = le32_to_cpu(id->sgls);
 	ctrl->kas = le16_to_cpu(id->kas);
 	ctrl->max_namespaces = le32_to_cpu(id->mnan);
@@ -4347,6 +4345,7 @@ EXPORT_SYMBOL_GPL(nvme_complete_async_event);
 int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 		const struct blk_mq_ops *ops, unsigned int cmd_size)
 {
+	struct queue_limits lim = {};
 	int ret;
 
 	memset(set, 0, sizeof(*set));
@@ -4366,7 +4365,8 @@ int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 	if (ret)
 		return ret;
 
-	ctrl->admin_q = blk_mq_alloc_queue(set, NULL, NULL);
+	nvme_set_ctrl_limits(ctrl, &lim);
+	ctrl->admin_q = blk_mq_alloc_queue(set, &lim, NULL);
 	if (IS_ERR(ctrl->admin_q)) {
 		ret = PTR_ERR(ctrl->admin_q);
 		goto out_free_tagset;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 01e8bae7886584..f58c5c18daef65 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -756,6 +756,7 @@ static __always_inline void nvme_complete_batch(struct io_comp_batch *iob,
 	blk_mq_end_request_batch(iob);
 }
 
+void nvme_set_ctrl_limits(struct nvme_ctrl *ctrl, struct queue_limits *lim);
 blk_status_t nvme_host_path_error(struct request *req);
 bool nvme_cancel_request(struct request *req, void *data);
 void nvme_cancel_tagset(struct nvme_ctrl *ctrl);
@@ -1038,8 +1039,9 @@ static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
 
 int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
 		unsigned int nr_zones, report_zones_cb cb, void *data);
+int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf,
+		struct queue_limits *lim);
 #ifdef CONFIG_BLK_DEV_ZONED
-int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf);
 blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns, struct request *req,
 				       struct nvme_command *cmnd,
 				       enum nvme_zone_mgmt_action action);
@@ -1050,13 +1052,6 @@ static inline blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns,
 {
 	return BLK_STS_NOTSUPP;
 }
-
-static inline int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
-{
-	dev_warn(ns->ctrl->device,
-		 "Please enable CONFIG_BLK_DEV_ZONED to support ZNS devices\n");
-	return -EPROTONOSUPPORT;
-}
 #endif
 
 static inline struct nvme_ns *nvme_get_ns_from_dev(struct device *dev)
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 852261d7891362..722384bcc765cd 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -35,10 +35,10 @@ static int nvme_set_max_append(struct nvme_ctrl *ctrl)
 	return 0;
 }
 
-int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
+int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf,
+		struct queue_limits *lim)
 {
 	struct nvme_effects_log *log = ns->head->effects;
-	struct request_queue *q = ns->queue;
 	struct nvme_command c = { };
 	struct nvme_id_ns_zns *id;
 	int status;
@@ -99,12 +99,12 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
 		goto free_data;
 	}
 
-	disk_set_zoned(ns->disk);
-	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
-	disk_set_max_open_zones(ns->disk, le32_to_cpu(id->mor) + 1);
-	disk_set_max_active_zones(ns->disk, le32_to_cpu(id->mar) + 1);
-	blk_queue_chunk_sectors(ns->queue, ns->head->zsze);
-	blk_queue_max_zone_append_sectors(ns->queue, ns->ctrl->max_zone_append);
+	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, ns->queue);
+	lim->zoned = 1;
+	lim->max_open_zones = le32_to_cpu(id->mor) + 1;
+	lim->max_active_zones = le32_to_cpu(id->mar) + 1;
+	lim->chunk_sectors = ns->head->zsze;
+	lim->max_zone_append_sectors = ns->ctrl->max_zone_append;
 free_data:
 	kfree(id);
 	return status;
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 21/21] nvme-multipath: pass queue_limits to blk_alloc_disk
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (19 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 20/21] nvme: use the atomic queue limits update API Christoph Hellwig
@ 2024-02-28 18:12 ` Christoph Hellwig
  2024-03-01 16:20 ` convert nvme to atomic queue limits updates Keith Busch
  21 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-28 18:12 UTC (permalink / raw
  To: Hector Martin, Sven Peter, Keith Busch, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

The multipath disk starts out with the stacking default limits.
The one interesting part here is that blk_set_stacking_limits
sets max_zone_append_sectors to UINT_MAX, which fails the
validation for non-zoned devices.  With the old one-call-per-limit
scheme this was fine because nothing verified this weird mismatch,
and blk_stack_limits fixed it up a little later, before I/O could
be issued.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/multipath.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index dc5d0d0a82d0e2..5397fb428b242c 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -516,6 +516,7 @@ static void nvme_requeue_work(struct work_struct *work)
 
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
 {
+	struct queue_limits lim;
 	bool vwc = false;
 
 	mutex_init(&head->lock);
@@ -532,7 +533,12 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
 	    !nvme_is_unique_nsid(ctrl, head) || !multipath)
 		return 0;
 
-	head->disk = blk_alloc_disk(NULL, ctrl->numa_node);
+	blk_set_stacking_limits(&lim);
+	lim.dma_alignment = 3;
+	if (head->ids.csi != NVME_CSI_ZNS)
+		lim.max_zone_append_sectors = 0;
+
+	head->disk = blk_alloc_disk(&lim, ctrl->numa_node);
 	if (IS_ERR(head->disk))
 		return PTR_ERR(head->disk);
 	head->disk->fops = &nvme_ns_head_ops;
@@ -553,11 +559,6 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
 	    ctrl->tagset->map[HCTX_TYPE_POLL].nr_queues)
 		blk_queue_flag_set(QUEUE_FLAG_POLL, head->disk->queue);
 
-	/* set to a default value of 512 until the disk is validated */
-	blk_queue_logical_block_size(head->disk->queue, 512);
-	blk_set_stacking_limits(&head->disk->queue->limits);
-	blk_queue_dma_alignment(head->disk->queue, 3);
-
 	/* we need to propagate up the VMC settings */
 	if (ctrl->vwc & NVME_CTRL_VWC_PRESENT)
 		vwc = true;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 05/21] nvme: remove nvme_revalidate_zones
  2024-02-28 18:11 ` [PATCH 05/21] nvme: remove nvme_revalidate_zones Christoph Hellwig
@ 2024-02-28 23:47   ` Damien Le Moal
  0 siblings, 0 replies; 38+ messages in thread
From: Damien Le Moal @ 2024-02-28 23:47 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme

On 2024/02/28 10:11, Christoph Hellwig wrote:
> Handle setting the zone size / chunk_sectors and max_append_sectors
> limits together with the other ZNS limits, and just open code the
> call to blk_revalidate_disk_zones in the current place.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

[...]

>  static int nvme_set_max_append(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_command c = { };
> @@ -113,6 +103,8 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
>  	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
>  	disk_set_max_open_zones(ns->disk, le32_to_cpu(id->mor) + 1);
>  	disk_set_max_active_zones(ns->disk, le32_to_cpu(id->mar) + 1);

Not directly related to this patch, but I think that since you moved the max
open & active limits to queue->limits, we really should rename these to
blk_queue_max_open_zones() and blk_queue_max_active_zones() to be consistent
with all the other limits, and maybe do the same for disk_set_zoned() while
we are at it.

I can send a patch if you want...
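
For illustration, a hypothetical sketch of what such wrappers could look
like, assuming the limits now live in queue->limits:

static inline void blk_queue_max_open_zones(struct request_queue *q,
		unsigned int max_open_zones)
{
	q->limits.max_open_zones = max_open_zones;
}

static inline void blk_queue_max_active_zones(struct request_queue *q,
		unsigned int max_active_zones)
{
	q->limits.max_active_zones = max_active_zones;
}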

> +	blk_queue_chunk_sectors(ns->queue, ns->head->zsze);
> +	blk_queue_max_zone_append_sectors(ns->queue, ns->ctrl->max_zone_append);
>  free_data:
>  	kfree(id);
>  	return status;

-- 
Damien Le Moal
Western Digital Research



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/21] nvme: set max_hw_sectors unconditionally
  2024-02-28 18:11 ` [PATCH 03/21] nvme: set max_hw_sectors unconditionally Christoph Hellwig
@ 2024-02-29 10:46   ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 10:46 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:11, Christoph Hellwig wrote:
> All transports set a max_hw_sectors value in the nvme_ctrl, so make
> the code using it unconditional and clean it up using a little helper.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index eed3e22e24d913..74cd384ca5fc73 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1944,19 +1944,19 @@ static int nvme_configure_metadata(struct nvme_ctrl *ctrl,
>   	return 0;
>   }
>   
> +static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
> +{
> +	return ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> SECTOR_SHIFT) + 1;
> +}
> +
>   static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
>   		struct request_queue *q)
>   {
>   	bool vwc = ctrl->vwc & NVME_CTRL_VWC_PRESENT;
>   
> -	if (ctrl->max_hw_sectors) {
> -		u32 max_segments =
> -			(ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> 9)) + 1;
> -
> -		max_segments = min_not_zero(max_segments, ctrl->max_segments);
> -		blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
> -		blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
> -	}
> +	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
> +	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
> +		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
>   	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
>   	blk_queue_dma_alignment(q, 3);
>   	blk_queue_write_cache(q, vwc, vwc);

Looks good,
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 04/21] nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard
  2024-02-28 18:11 ` [PATCH 04/21] nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard Christoph Hellwig
@ 2024-02-29 10:48   ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 10:48 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:11, Christoph Hellwig wrote:
> Move the handling of the NVME_QUIRK_DEALLOCATE_ZEROES quirk out of
> nvme_config_discard so that it is combined with the normal write_zeroes
> limit handling.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 74cd384ca5fc73..ee1e13a658c314 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1816,9 +1816,6 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl, struct gendisk *disk,
>   	else
>   		blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
>   	queue->limits.discard_granularity = queue_logical_block_size(queue);
> -
> -	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
> -		blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
>   }
>   
>   static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
> @@ -2029,8 +2026,12 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
>   	set_capacity_and_notify(disk, capacity);
>   
>   	nvme_config_discard(ctrl, disk, head);
> -	blk_queue_max_write_zeroes_sectors(disk->queue,
> -					   ctrl->max_zeroes_sectors);
> +
> +	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
> +		blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
> +	else
> +		blk_queue_max_write_zeroes_sectors(disk->queue,
> +				ctrl->max_zeroes_sectors);
>   }
>   
>   static bool nvme_ns_is_readonly(struct nvme_ns *ns, struct nvme_ns_info *info)

Looks good,
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 06/21] nvme: move max_integrity_segments handling out of nvme_init_integrity
  2024-02-28 18:12 ` [PATCH 06/21] nvme: move max_integrity_segments handling out of nvme_init_integrity Christoph Hellwig
@ 2024-02-29 10:58   ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 10:58 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
> max_integrity_segments is just a hardware limit and doesn't need to be
> in nvme_init_integrity with the PI setup.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 11 ++++-------
>   1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4a05045e409904..94d835f9dc90e9 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1724,8 +1724,7 @@ int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
>   }
>   
>   #ifdef CONFIG_BLK_DEV_INTEGRITY
> -static void nvme_init_integrity(struct gendisk *disk,
> -		struct nvme_ns_head *head, u32 max_integrity_segments)
> +static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
>   {
>   	struct blk_integrity integrity = { };
>   
> @@ -1773,11 +1772,9 @@ static void nvme_init_integrity(struct gendisk *disk,
>   	integrity.tuple_size = head->ms;
>   	integrity.pi_offset = head->pi_offset;
>   	blk_integrity_register(disk, &integrity);
> -	blk_queue_max_integrity_segments(disk->queue, max_integrity_segments);
>   }
>   #else
> -static void nvme_init_integrity(struct gendisk *disk,
> -		struct nvme_ns_head *head, u32 max_integrity_segments)
> +static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
>   {
>   }
>   #endif /* CONFIG_BLK_DEV_INTEGRITY */
> @@ -1954,6 +1951,7 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
>   	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
>   	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
>   		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
> +	blk_queue_max_integrity_segments(q, ctrl->max_integrity_segments);
>   	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
>   	blk_queue_dma_alignment(q, 3);
>   	blk_queue_write_cache(q, vwc, vwc);
> @@ -2017,8 +2015,7 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
>   	if (head->ms) {
>   		if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
>   		    (head->features & NVME_NS_METADATA_SUPPORTED))
> -			nvme_init_integrity(disk, head,
> -					    ctrl->max_integrity_segments);
> +			nvme_init_integrity(disk, head);
>   		else if (!nvme_ns_has_pi(head))
>   			capacity = 0;
>   	}

Looks good,
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/21] nvme: cleanup the nvme_init_integrity calling conventions
  2024-02-28 18:12 ` [PATCH 07/21] nvme: cleanup the nvme_init_integrity calling conventions Christoph Hellwig
@ 2024-02-29 12:33   ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 12:33 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
> Handle the no metadata support case in nvme_init_integrity as well to
> simplify the calling convention and prepare for future changes in the
> area.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 29 +++++++++++++++--------------
>   1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 94d835f9dc90e9..76e548902ec33e 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1723,11 +1723,21 @@ int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
>   	return 0;
>   }
>   
> -#ifdef CONFIG_BLK_DEV_INTEGRITY
> -static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
> +static bool nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
>   {
>   	struct blk_integrity integrity = { };
>   
> +	if (!head->ms)
> +		return true;
> +
> +	/*
> +	 * PI can always be supported as we can ask the controller to simply
> +	 * insert/strip it, which is not possible for other kinds of metadata.
> +	 */
> +	if (!IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) ||
> +	    !(head->features & NVME_NS_METADATA_SUPPORTED))
> +		return nvme_ns_has_pi(head);
> +
>   	switch (head->pi_type) {
>   	case NVME_NS_DPS_PI_TYPE3:
>   		switch (head->guard_type) {
> @@ -1772,12 +1782,8 @@ static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
>   	integrity.tuple_size = head->ms;
>   	integrity.pi_offset = head->pi_offset;
>   	blk_integrity_register(disk, &integrity);
> +	return true;
>   }
> -#else
> -static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
> -{
> -}
> -#endif /* CONFIG_BLK_DEV_INTEGRITY */
>   
>   static void nvme_config_discard(struct nvme_ctrl *ctrl, struct gendisk *disk,
>   		struct nvme_ns_head *head)
> @@ -2012,13 +2018,8 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
>   	 * I/O to namespaces with metadata except when the namespace supports
>   	 * PI, as it can strip/insert in that case.
>   	 */
> -	if (head->ms) {
> -		if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
> -		    (head->features & NVME_NS_METADATA_SUPPORTED))
> -			nvme_init_integrity(disk, head);
> -		else if (!nvme_ns_has_pi(head))
> -			capacity = 0;
> -	}
> +	if (!nvme_init_integrity(disk, head))
> +		capacity = 0;
>   
>   	set_capacity_and_notify(disk, capacity);
>   

Looks good,
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 08/21] nvme: move blk_integrity_unregister into nvme_init_integrity
  2024-02-28 18:12 ` [PATCH 08/21] nvme: move blk_integrity_unregister into nvme_init_integrity Christoph Hellwig
@ 2024-02-29 12:36   ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 12:36 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
Move unregistering the existing integrity profile into the helper
dealing with all the other integrity / metadata setup.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 76e548902ec33e..105046a7957272 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1727,6 +1727,8 @@ static bool nvme_init_integrity(struct gendisk *disk, struct nvme_ns_head *head)
>   {
>   	struct blk_integrity integrity = { };
>   
> +	blk_integrity_unregister(disk);
> +
>   	if (!head->ms)
>   		return true;
>   
> @@ -1980,8 +1982,6 @@ static void nvme_update_disk_info(struct nvme_ctrl *ctrl, struct gendisk *disk,
>   		bs = (1 << 9);
>   	}
>   
> -	blk_integrity_unregister(disk);
> -
>   	atomic_bs = phys_bs = bs;
>   	if (id->nabo == 0) {
>   		/*

Looks good,
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk
  2024-02-28 18:12 ` [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk Christoph Hellwig
@ 2024-02-29 12:47   ` Max Gurtovoy
  2024-02-29 13:02     ` Max Gurtovoy
  0 siblings, 1 reply; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 12:47 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
> Currently nvme_update_ns_info_block calls nvme_update_disk_info both for
> the namespace-attached disk and the multipath one (if it exists).  This
> is very different from how other stacking drivers work, and leads to
> a lot of complexity.
> 
> Switch to setting the disk capacity and initializing the integrity
> profile, and let blk_stack_limits, which is already called just below,
> deal with updating the other limits.

Where is blk_stack_limits called in nvme/core?
Is it part of some preparation series?

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 105046a7957272..0bf30580f29d5b 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2159,7 +2159,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   
>   	if (nvme_ns_head_multipath(ns->head)) {
>   		blk_mq_freeze_queue(ns->head->disk->queue);
> -		nvme_update_disk_info(ns->ctrl, ns->head->disk, ns->head, id);
> +		nvme_init_integrity(ns->head->disk, ns->head);
> +		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
>   		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
>   		nvme_mpath_revalidate_paths(ns);
>   		blk_stack_limits(&ns->head->disk->queue->limits,


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk
  2024-02-29 12:47   ` Max Gurtovoy
@ 2024-02-29 13:02     ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 13:02 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 29/02/2024 14:47, Max Gurtovoy wrote:
> 
> 
> On 28/02/2024 20:12, Christoph Hellwig wrote:
>> Currently nvme_update_ns_info_block calls nvme_update_disk_info both for
>> the namespace-attached disk and the multipath one (if it exists).  This
>> is very different from how other stacking drivers work, and leads to
>> a lot of complexity.
>>
>> Switch to setting the disk capacity and initializing the integrity
>> profile, and let blk_stack_limits, which is already called just below,
>> deal with updating the other limits.
> 
> Where is blk_stack_limits called in nvme/core?
> Is it part of some preparation series?

please ignore, I missed the last line in the chunk..

> 
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>   drivers/nvme/host/core.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 105046a7957272..0bf30580f29d5b 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -2159,7 +2159,8 @@ static int nvme_update_ns_info_block(struct 
>> nvme_ns *ns,
>>       if (nvme_ns_head_multipath(ns->head)) {
>>           blk_mq_freeze_queue(ns->head->disk->queue);
>> -        nvme_update_disk_info(ns->ctrl, ns->head->disk, ns->head, id);
>> +        nvme_init_integrity(ns->head->disk, ns->head);
>> +        set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
>>           set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
>>           nvme_mpath_revalidate_paths(ns);
>>           blk_stack_limits(&ns->head->disk->queue->limits,
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/21] nvme: move setting the write cache flags out of nvme_set_queue_limits
  2024-02-28 18:12 ` [PATCH 11/21] nvme: move setting the write cache flags out of nvme_set_queue_limits Christoph Hellwig
@ 2024-02-29 13:11   ` Max Gurtovoy
  0 siblings, 0 replies; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 13:11 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
> nvme_set_queue_limits is used on the admin queue and all gendisks
> including hidden ones that don't support block I/O.  The write cache
> setting on the other hand only makes sense for block I/O.  Move the
> blk_queue_write_cache call to nvme_update_ns_info_block instead.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3d839945edd85d..dcb8d0590ed0f3 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1954,15 +1954,12 @@ static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
>   static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
>   		struct request_queue *q)
>   {
> -	bool vwc = ctrl->vwc & NVME_CTRL_VWC_PRESENT;
> -
>   	blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
>   	blk_queue_max_segments(q, min_t(u32, USHRT_MAX,
>   		min_not_zero(nvme_max_drv_segments(ctrl), ctrl->max_segments)));
>   	blk_queue_max_integrity_segments(q, ctrl->max_integrity_segments);
>   	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
>   	blk_queue_dma_alignment(q, 3);
> -	blk_queue_write_cache(q, vwc, vwc);
>   }
>   
>   static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id)
> @@ -2093,6 +2090,7 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
>   static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   		struct nvme_ns_info *info)
>   {
> +	bool vwc = ns->ctrl->vwc & NVME_CTRL_VWC_PRESENT;
>   	struct nvme_id_ns *id;
>   	sector_t capacity;
>   	unsigned lbaf;
> @@ -2154,6 +2152,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   	if ((id->dlfeat & 0x7) == 0x1 && (id->dlfeat & (1 << 3)))
>   		ns->head->features |= NVME_NS_DEAC;
>   	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
> +	blk_queue_write_cache(ns->disk->queue, vwc, vwc);
>   	set_bit(NVME_NS_READY, &ns->flags);
>   	blk_mq_unfreeze_queue(ns->disk->queue);

Looks good,
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>

>   


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 12/21] nvme: move common logic into nvme_update_ns_info
  2024-02-28 18:12 ` [PATCH 12/21] nvme: move common logic into nvme_update_ns_info Christoph Hellwig
@ 2024-02-29 13:30   ` Max Gurtovoy
  2024-02-29 13:40     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 13:30 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
> nvme_update_ns_info_generic and nvme_update_ns_info_block share a
> fair amount of logic related to not fully supported namespace
> formats and updating the multipath information.  Move this logic
> into the common caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 79 +++++++++++++++++++---------------------
>   1 file changed, 37 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index dcb8d0590ed0f3..78667ba89ec491 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2070,21 +2070,8 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
>   	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
>   	blk_mq_unfreeze_queue(ns->disk->queue);
>   
> -	if (nvme_ns_head_multipath(ns->head)) {
> -		blk_mq_freeze_queue(ns->head->disk->queue);
> -		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
> -		nvme_mpath_revalidate_paths(ns);
> -		blk_stack_limits(&ns->head->disk->queue->limits,
> -				 &ns->queue->limits, 0);
> -		ns->head->disk->flags |= GENHD_FL_HIDDEN;
> -		blk_mq_unfreeze_queue(ns->head->disk->queue);
> -	}
> -
>   	/* Hide the block-interface for these devices */
> -	ns->disk->flags |= GENHD_FL_HIDDEN;
> -	set_bit(NVME_NS_READY, &ns->flags);
> -
> -	return 0;
> +	return -ENODEV;
>   }
>   
>   static int nvme_update_ns_info_block(struct nvme_ns *ns,
> @@ -2104,7 +2091,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   		/* namespace not allocated or attached */
>   		info->is_removed = true;
>   		ret = -ENODEV;
> -		goto error;
> +		goto out;
>   	}
>   
>   	blk_mq_freeze_queue(ns->disk->queue);
> @@ -2162,54 +2149,62 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   			goto out;
>   	}
>   
> -	if (nvme_ns_head_multipath(ns->head)) {
> -		blk_mq_freeze_queue(ns->head->disk->queue);
> -		nvme_init_integrity(ns->head->disk, ns->head);
> -		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
> -		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
> -		nvme_mpath_revalidate_paths(ns);
> -		blk_stack_limits(&ns->head->disk->queue->limits,
> -				 &ns->queue->limits, 0);
> -		disk_update_readahead(ns->head->disk);
> -		blk_mq_unfreeze_queue(ns->head->disk->queue);
> -	}
> -
>   	ret = 0;
>   out:
> -	/*
> -	 * If probing fails due an unsupported feature, hide the block device,
> -	 * but still allow other access.
> -	 */
> -	if (ret == -ENODEV) {
> -		ns->disk->flags |= GENHD_FL_HIDDEN;
> -		set_bit(NVME_NS_READY, &ns->flags);
> -		ret = 0;
> -	}
> -
> -error:
>   	kfree(id);
>   	return ret;
>   }
>   
>   static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>   {
> +	int ret;
> +
>   	switch (info->ids.csi) {
>   	case NVME_CSI_ZNS:
>   		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) {
>   			dev_info(ns->ctrl->device,
>   	"block device for nsid %u not supported without CONFIG_BLK_DEV_ZONED\n",
>   				info->nsid);
> -			return nvme_update_ns_info_generic(ns, info);
> +			ret = nvme_update_ns_info_generic(ns, info);
> +			break;
>   		}
> -		return nvme_update_ns_info_block(ns, info);
> +		ret = nvme_update_ns_info_block(ns, info);
> +		break;
>   	case NVME_CSI_NVM:
> -		return nvme_update_ns_info_block(ns, info);
> +		ret = nvme_update_ns_info_block(ns, info);
> +		break;
>   	default:
>   		dev_info(ns->ctrl->device,
>   			"block device for nsid %u not supported (csi %u)\n",
>   			info->nsid, info->ids.csi);
> -		return nvme_update_ns_info_generic(ns, info);
> +		ret = nvme_update_ns_info_generic(ns, info);
> +		break;
> +	}
> +
> +	/*
> +	 * If probing fails due an unsupported feature, hide the block device,
> +	 * but still allow other access.
> +	 */
> +	if (ret == -ENODEV) {
> +		ns->disk->flags |= GENHD_FL_HIDDEN;
> +		set_bit(NVME_NS_READY, &ns->flags);
> +		ret = 0;
> +	}
> +
> +	if (!ret && nvme_ns_head_multipath(ns->head)) {
> +		blk_mq_freeze_queue(ns->head->disk->queue);
> +		if (!(ns->disk->flags & GENHD_FL_HIDDEN))
> +			nvme_init_integrity(ns->head->disk, ns->head);

the logic in nvme_update_ns_info_generic() today is to also hide
ns->head->disk...


> +		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
> +		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
> +		nvme_mpath_revalidate_paths(ns);
> +		blk_stack_limits(&ns->head->disk->queue->limits,
> +				 &ns->queue->limits, 0);
> +		disk_update_readahead(ns->head->disk);

we don't call disk_update_readahead() in "nvme_update_ns_info_generic()" 
today..

> +		blk_mq_unfreeze_queue(ns->head->disk->queue);
>   	}
> +
> +	return ret;
>   }
>   
>   #ifdef CONFIG_BLK_SED_OPAL


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 12/21] nvme: move common logic into nvme_update_ns_info
  2024-02-29 13:30   ` Max Gurtovoy
@ 2024-02-29 13:40     ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-29 13:40 UTC (permalink / raw
  To: Max Gurtovoy
  Cc: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni, Alyssa Rosenzweig,
	asahi, linux-nvme

On Thu, Feb 29, 2024 at 03:30:22PM +0200, Max Gurtovoy wrote:
>> +	/*
>> +	 * If probing fails due an unsupported feature, hide the block device,
>> +	 * but still allow other access.
>> +	 */
>> +	if (ret == -ENODEV) {
>> +		ns->disk->flags |= GENHD_FL_HIDDEN;
>> +		set_bit(NVME_NS_READY, &ns->flags);
>> +		ret = 0;
>> +	}
>> +
>> +	if (!ret && nvme_ns_head_multipath(ns->head)) {
>> +		blk_mq_freeze_queue(ns->head->disk->queue);
>> +		if (!(ns->disk->flags & GENHD_FL_HIDDEN))
>> +			nvme_init_integrity(ns->head->disk, ns->head);
>
> the logic in nvme_update_ns_info_generic() today is to also hide
> ns->head->disk...

Indeed, that got lost.
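
Something like this should bring it back in the next version (untested
sketch on top of this patch):

	if (!ret && nvme_ns_head_multipath(ns->head)) {
		blk_mq_freeze_queue(ns->head->disk->queue);
		/* propagate the hidden flag instead of setting up integrity */
		if (ns->disk->flags & GENHD_FL_HIDDEN)
			ns->head->disk->flags |= GENHD_FL_HIDDEN;
		else
			nvme_init_integrity(ns->head->disk, ns->head);
		/* remainder as in this patch */
		blk_mq_unfreeze_queue(ns->head->disk->queue);
	}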

>
>
>> +		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
>> +		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
>> +		nvme_mpath_revalidate_paths(ns);
>> +		blk_stack_limits(&ns->head->disk->queue->limits,
>> +				 &ns->queue->limits, 0);
>> +		disk_update_readahead(ns->head->disk);
>
> we don't call disk_update_readahead() in "nvme_update_ns_info_generic()" 
> today..

Yes, but it's harmless as it just propagates the limits from the
queue_limits to the bdi.  That being said, I failed to send out a last
patch that actually removes it by switching to the queue_limits_set
API for the multipath node as well.
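
Roughly along these lines (a rough sketch of the direction only, not
the actual patch; error handling elided):

	struct queue_limits lim;
	int ret;

	blk_mq_freeze_queue(ns->head->disk->queue);
	blk_set_stacking_limits(&lim);
	blk_stack_limits(&lim, &ns->queue->limits, 0);
	ret = queue_limits_set(ns->head->disk->queue, &lim);
	blk_mq_unfreeze_queue(ns->head->disk->queue);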


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper
  2024-02-28 18:12 ` [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper Christoph Hellwig
@ 2024-02-29 13:52   ` Max Gurtovoy
  2024-02-29 13:53     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Max Gurtovoy @ 2024-02-29 13:52 UTC (permalink / raw
  To: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni
  Cc: Alyssa Rosenzweig, asahi, linux-nvme



On 28/02/2024 20:12, Christoph Hellwig wrote:
Split the logic to query the Identify Namespace Data Structure, NVM
Command Set, into a separate helper.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/core.c | 38 ++++++++++++++++++++++++++------------
>   1 file changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 78667ba89ec491..adcd11720d1bb4 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1831,12 +1831,35 @@ static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
>   		a->csi == b->csi;
>   }
>   
> +static int nvme_identify_ns_nvm(struct nvme_ctrl *ctrl, unsigned int nsid,
> +		struct nvme_id_ns_nvm **nvmp)
> +{
> +	struct nvme_command c = {
> +		.identify.opcode	= nvme_admin_identify,
> +		.identify.nsid		= cpu_to_le32(nsid),
> +		.identify.cns		= NVME_ID_CNS_CS_NS,
> +		.identify.csi		= NVME_CSI_NVM,
> +	};
> +	struct nvme_id_ns_nvm *nvm;
> +	int ret;
> +
> +	nvm = kzalloc(sizeof(*nvm), GFP_KERNEL);
> +	if (!nvm)
> +		return -ENOMEM;
> +
> +	ret = nvme_submit_sync_cmd(ctrl->admin_q, &c, nvm, sizeof(*nvm));
> +	if (ret)
> +		kfree(nvm);
> +	else
> +		*nvmp = nvm;
> +	return ret;
> +}
> +
>   static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
>   		struct nvme_id_ns *id)
>   {
>   	bool first = id->dps & NVME_NS_DPS_PI_FIRST;
>   	unsigned lbaf = nvme_lbaf_index(id->flbas);
> -	struct nvme_command c = { };
>   	struct nvme_id_ns_nvm *nvm;
>   	int ret = 0;
>   	u32 elbaf;
> @@ -1849,18 +1872,9 @@ static int nvme_init_ms(struct nvme_ctrl *ctrl, struct nvme_ns_head *head,
>   		goto set_pi;
>   	}
>   
> -	nvm = kzalloc(sizeof(*nvm), GFP_KERNEL);
> -	if (!nvm)
> -		return -ENOMEM;
> -
> -	c.identify.opcode = nvme_admin_identify;
> -	c.identify.nsid = cpu_to_le32(head->ns_id);
> -	c.identify.cns = NVME_ID_CNS_CS_NS;
> -	c.identify.csi = NVME_CSI_NVM;
> -
> -	ret = nvme_submit_sync_cmd(ctrl->admin_q, &c, nvm, sizeof(*nvm));
> +	ret = nvme_identify_ns_nvm(ctrl, head->ns_id, &nvm);
>   	if (ret)
> -		goto free_data;
> +		goto set_pi;
>   
>   	elbaf = le32_to_cpu(nvm->elbaf[lbaf]);

I actually had similar logic in one of the patches of the PI series
I sent :)
BTW, we only need the elbaf field from the nvm response structure, so we
could simplify the code a bit and free nvm inside nvme_identify_ns_nvm()
instead of in the caller...
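
Something like this hypothetical variant is what I have in mind
(untested sketch):

static int nvme_identify_ns_nvm(struct nvme_ctrl *ctrl, unsigned int nsid,
		unsigned lbaf, u32 *elbaf)
{
	struct nvme_command c = {
		.identify.opcode	= nvme_admin_identify,
		.identify.nsid		= cpu_to_le32(nsid),
		.identify.cns		= NVME_ID_CNS_CS_NS,
		.identify.csi		= NVME_CSI_NVM,
	};
	struct nvme_id_ns_nvm *nvm;
	int ret;

	nvm = kzalloc(sizeof(*nvm), GFP_KERNEL);
	if (!nvm)
		return -ENOMEM;

	ret = nvme_submit_sync_cmd(ctrl->admin_q, &c, nvm, sizeof(*nvm));
	if (!ret)
		*elbaf = le32_to_cpu(nvm->elbaf[lbaf]);
	kfree(nvm);
	return ret;
}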

Otherwise, looks good
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>

>   


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper
  2024-02-29 13:52   ` Max Gurtovoy
@ 2024-02-29 13:53     ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2024-02-29 13:53 UTC (permalink / raw
  To: Max Gurtovoy
  Cc: Christoph Hellwig, Hector Martin, Sven Peter, Keith Busch,
	Sagi Grimberg, James Smart, Chaitanya Kulkarni, Alyssa Rosenzweig,
	asahi, linux-nvme

On Thu, Feb 29, 2024 at 03:52:08PM +0200, Max Gurtovoy wrote:
> I actually had similar logic in one of the patches of the PI series
> I sent :)
> BTW, we only need the elbaf field from the nvm response structure, so we
> could simplify the code a bit and free nvm inside nvme_identify_ns_nvm()
> instead of in the caller...

I thought about that, and while it might be a tad cleaner currently,
as soon as the identify_ns_nvm structure grows more fields that we'll
use, that'll get uglier.



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: convert nvme to atomic queue limits updates
  2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
                   ` (20 preceding siblings ...)
  2024-02-28 18:12 ` [PATCH 21/21] nvme-multipath: pass queue_limits to blk_alloc_disk Christoph Hellwig
@ 2024-03-01 16:20 ` Keith Busch
  2024-03-02 13:59   ` Christoph Hellwig
  21 siblings, 1 reply; 38+ messages in thread
From: Keith Busch @ 2024-03-01 16:20 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Hector Martin, Sven Peter, Sagi Grimberg, James Smart,
	Chaitanya Kulkarni, Alyssa Rosenzweig, asahi, linux-nvme

On Wed, Feb 28, 2024 at 10:11:54AM -0800, Christoph Hellwig wrote:
> Hi all,
> 
> this series converts nvme to the new atomic queue limit updates, and
> as part of that refactors a lot of the setup code.  The first two
> patches are block layer patches already sent out as part of the md
> queue limits conversion and will hopefully get merged into Jens' tree
> soon.

The whole series looks good. I see Jens has got the first two patches,
so I'll rebase nvme-6.9 to that and apply the rest of the nvme patches.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: convert nvme to atomic queue limits updates
  2024-03-01 16:20 ` convert nvme to atomic queue limits updates Keith Busch
@ 2024-03-02 13:59   ` Christoph Hellwig
  2024-03-02 23:21     ` Keith Busch
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2024-03-02 13:59 UTC (permalink / raw
  To: Keith Busch
  Cc: Christoph Hellwig, Hector Martin, Sven Peter, Sagi Grimberg,
	James Smart, Chaitanya Kulkarni, Alyssa Rosenzweig, asahi,
	linux-nvme

On Fri, Mar 01, 2024 at 09:20:31AM -0700, Keith Busch wrote:
> On Wed, Feb 28, 2024 at 10:11:54AM -0800, Christoph Hellwig wrote:
> > Hi all,
> > 
> > this series converts nvme to the new atomic queue limit updates, and
> > as part of that refactors a lot of the setup code.  The first two
> > patches are block layer patches already sent out as part of the md
> > queue limits conversion and will hopefully get merged into Jens' tree
> > soon.
> 
> The whole series looks good. I see Jens has got the first two patches,
> so I'll rebase nvme-6.9 to that and apply the rest of the nvme patches.

I had to fix up the little thing Max noticed and also added another
patch for multipath.  I'll resend with those updates once you have
rebased.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: convert nvme to atomic queue limits updates
  2024-03-02 13:59   ` Christoph Hellwig
@ 2024-03-02 23:21     ` Keith Busch
  0 siblings, 0 replies; 38+ messages in thread
From: Keith Busch @ 2024-03-02 23:21 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Hector Martin, Sven Peter, Sagi Grimberg, James Smart,
	Chaitanya Kulkarni, Alyssa Rosenzweig, asahi, linux-nvme

On Sat, Mar 02, 2024 at 02:59:54PM +0100, Christoph Hellwig wrote:
> On Fri, Mar 01, 2024 at 09:20:31AM -0700, Keith Busch wrote:
> > 
> > The whole series looks good. I see Jens has got the first two patches,
> > so I'll rebase nvme-6.9 to that and apply the rest of the nvme patches.
> 
> I had to fix up the little thing Max noticed and also added another
> patch for multipath.  I'll resend with those updates once you have
> rebased.

Done, nvme-6.9 is rebased to the current for-6.9/block.


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread

Thread overview: 38+ messages
2024-02-28 18:11 convert nvme to atomic queue limits updates Christoph Hellwig
2024-02-28 18:11 ` [PATCH 01/21] block: add a queue_limits_set helper Christoph Hellwig
2024-02-28 18:11 ` [PATCH 02/21] block: add a queue_limits_stack_bdev helper Christoph Hellwig
2024-02-28 18:11 ` [PATCH 03/21] nvme: set max_hw_sectors unconditionally Christoph Hellwig
2024-02-29 10:46   ` Max Gurtovoy
2024-02-28 18:11 ` [PATCH 04/21] nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard Christoph Hellwig
2024-02-29 10:48   ` Max Gurtovoy
2024-02-28 18:11 ` [PATCH 05/21] nvme: remove nvme_revalidate_zones Christoph Hellwig
2024-02-28 23:47   ` Damien Le Moal
2024-02-28 18:12 ` [PATCH 06/21] nvme: move max_integrity_segments handling out of nvme_init_integrity Christoph Hellwig
2024-02-29 10:58   ` Max Gurtovoy
2024-02-28 18:12 ` [PATCH 07/21] nvme: cleanup the nvme_init_integrity calling conventions Christoph Hellwig
2024-02-29 12:33   ` Max Gurtovoy
2024-02-28 18:12 ` [PATCH 08/21] nvme: move blk_integrity_unregister into nvme_init_integrity Christoph Hellwig
2024-02-29 12:36   ` Max Gurtovoy
2024-02-28 18:12 ` [PATCH 09/21] nvme: don't use nvme_update_disk_info for the multipath disk Christoph Hellwig
2024-02-29 12:47   ` Max Gurtovoy
2024-02-29 13:02     ` Max Gurtovoy
2024-02-28 18:12 ` [PATCH 10/21] nvme: move a few things out of nvme_update_disk_info Christoph Hellwig
2024-02-28 18:12 ` [PATCH 11/21] nvme: move setting the write cache flags out of nvme_set_queue_limits Christoph Hellwig
2024-02-29 13:11   ` Max Gurtovoy
2024-02-28 18:12 ` [PATCH 12/21] nvme: move common logic into nvme_update_ns_info Christoph Hellwig
2024-02-29 13:30   ` Max Gurtovoy
2024-02-29 13:40     ` Christoph Hellwig
2024-02-28 18:12 ` [PATCH 13/21] nvme: split out a nvme_identify_ns_nvm helper Christoph Hellwig
2024-02-29 13:52   ` Max Gurtovoy
2024-02-29 13:53     ` Christoph Hellwig
2024-02-28 18:12 ` [PATCH 14/21] nvme: don't query identify data in configure_metadata Christoph Hellwig
2024-02-28 18:12 ` [PATCH 15/21] nvme: cleanup nvme_configure_metadata Christoph Hellwig
2024-02-28 18:12 ` [PATCH 16/21] nvme-rdma: initialize max_hw_sectors earlier Christoph Hellwig
2024-02-28 18:12 ` [PATCH 17/21] nvme-loop: " Christoph Hellwig
2024-02-28 18:12 ` [PATCH 18/21] nvme-fc: " Christoph Hellwig
2024-02-28 18:12 ` [PATCH 19/21] nvme-apple: " Christoph Hellwig
2024-02-28 18:12 ` [PATCH 20/21] nvme: use the atomic queue limits update API Christoph Hellwig
2024-02-28 18:12 ` [PATCH 21/21] nvme-multipath: pass queue_limits to blk_alloc_disk Christoph Hellwig
2024-03-01 16:20 ` convert nvme to atomic queue limits updates Keith Busch
2024-03-02 13:59   ` Christoph Hellwig
2024-03-02 23:21     ` Keith Busch
