From: Anoob Joseph <anoobj@marvell.com>
To: Chengwen Feng <fengchengwen@huawei.com>,
Kevin Laatz <kevin.laatz@intel.com>,
Bruce Richardson <bruce.richardson@intel.com>,
"Jerin Jacob" <jerinj@marvell.com>,
Thomas Monjalon <thomas@monjalon.net>
Cc: Vidya Sagar Velumuri <vvelumuri@marvell.com>,
Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>,
<dev@dpdk.org>
Subject: [PATCH v3 7/7] dma/odm: add remaining ops
Date: Fri, 19 Apr 2024 12:13:19 +0530
Message-ID: <20240419064319.149-8-anoobj@marvell.com>
In-Reply-To: <20240419064319.149-1-anoobj@marvell.com>
From: Vidya Sagar Velumuri <vvelumuri@marvell.com>
Add the remaining dmadev ops: fill, submit, completed, completed_status and
burst_capacity. Also update the documentation.
Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
---
MAINTAINERS | 1 +
doc/guides/dmadevs/index.rst | 1 +
doc/guides/dmadevs/odm.rst | 92 +++++++++++++
drivers/dma/odm/odm.h | 4 +
drivers/dma/odm/odm_dmadev.c | 250 +++++++++++++++++++++++++++++++++++
5 files changed, 348 insertions(+)
create mode 100644 doc/guides/dmadevs/odm.rst
diff --git a/MAINTAINERS b/MAINTAINERS
index b8d2f7b3d8..38293008aa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1273,6 +1273,7 @@ M: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
M: Vidya Sagar Velumuri <vvelumuri@marvell.com>
T: git://dpdk.org/next/dpdk-next-net-mrvl
F: drivers/dma/odm/
+F: doc/guides/dmadevs/odm.rst
NXP DPAA DMA
M: Gagandeep Singh <g.singh@nxp.com>
diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst
index 5bd25b32b9..ce9f6eb260 100644
--- a/doc/guides/dmadevs/index.rst
+++ b/doc/guides/dmadevs/index.rst
@@ -17,3 +17,4 @@ an application through DMA API.
hisilicon
idxd
ioat
+ odm
diff --git a/doc/guides/dmadevs/odm.rst b/doc/guides/dmadevs/odm.rst
new file mode 100644
index 0000000000..a2eaab59a0
--- /dev/null
+++ b/doc/guides/dmadevs/odm.rst
@@ -0,0 +1,92 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2024 Marvell.
+
+Odyssey ODM DMA Device Driver
+=============================
+
+The ``odm`` DMA device driver provides a poll-mode driver (PMD) for the Marvell
+Odyssey DMA hardware accelerator block found in the Odyssey SoC. The block
+supports only memory-to-memory DMA transfers.
+
+The ODM DMA device supports up to 32 queues and 16 VFs.
+
+Prerequisites and Compilation procedure
+---------------------------------------
+
+Device Setup
+-------------
+
+The ODM DMA device is initialized by the kernel PF driver, which is part of the
+Marvell software packages for Odyssey.
+
+The kernel module can be inserted as in the example below::
+
+ $ sudo insmod odyssey_odm.ko
+
+The ODM DMA device supports up to 16 VFs::
+
+ $ echo 16 | sudo tee /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
+
+The above command creates 16 VFs with 2 queues each.
+
+The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the
+presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma``
+will show all the Odyssey ODM DMA devices.
+
+Devices using VFIO drivers
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The HW devices to be used will need to be bound to a user-space IO driver.
+The ``dpdk-devbind.py`` script can be used to view the state of the devices
+and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
+For example::
+
+ $ dpdk-devbind.py -b vfio-pci 0000:08:00.1
+
+Device Probing and Initialization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Applications access the device through the dmadev API. The device must first be
+configured with ``rte_dma_configure()`` and a vchan set up with
+``rte_dma_vchan_setup()``. The device can then be made ready for use by calling
+the ``rte_dma_start()`` API.
+
+Performing Data Copies
+~~~~~~~~~~~~~~~~~~~~~~
+
+Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section
+of the dmadev library documentation for details on operation enqueue and
+submission API usage.
+
+Performance Tuning Parameters
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To achieve higher performance, the DMA device can be tuned using module
+parameters of the PF kernel driver.
+
+The kernel PF driver exposes the following options via the devlink interface
+for performance tuning.
+
+``eng_sel``
+
+ The ODM DMA device has two engines internally. The engine-to-queue mapping
+ is decided by a hardware register, which can be configured as below::
+
+ $ /sbin/devlink dev param set pci/0000:08:00.0 name eng_sel value 3435973836 cmode runtime
+
+ Each bit in the register corresponds to one queue, and each queue is
+ associated with one engine. If the value of the bit corresponding to a
+ queue is 0, engine 0 is used for that queue; if the value is 1, engine 1
+ is used.
+
+ In the above command, the register is set to
+ ``1100 1100 1100 1100 1100 1100 1100 1100`` (0xcccccccc), which assigns
+ alternate engines to alternate VFs (assuming the system has 16 VFs with
+ 2 queues each).
+
+``max_load_request``
+
+ Specifies the maximum number of outstanding load requests on the internal bus.
+ Valid values range from 1 to 512; set 512 for maximum requests in flight::
+
+ $ /sbin/devlink dev param set pci/0000:08:00.0 name max_load_request value 512 cmode runtime
diff --git a/drivers/dma/odm/odm.h b/drivers/dma/odm/odm.h
index e1373e0c7f..1d60d2d11a 100644
--- a/drivers/dma/odm/odm.h
+++ b/drivers/dma/odm/odm.h
@@ -75,6 +75,10 @@ extern int odm_logtype;
rte_log(RTE_LOG_INFO, odm_logtype, \
RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__, \
RTE_FMT_TAIL(__VA_ARGS__, )))
+#define odm_debug(...) \
+ rte_log(RTE_LOG_DEBUG, odm_logtype, \
+ RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__, \
+ RTE_FMT_TAIL(__VA_ARGS__, )))
#define ODM_MEMZONE_FLAGS \
(RTE_MEMZONE_1GB | RTE_MEMZONE_16MB | RTE_MEMZONE_16GB | RTE_MEMZONE_256MB | \
diff --git a/drivers/dma/odm/odm_dmadev.c b/drivers/dma/odm/odm_dmadev.c
index b21be83a89..57bd6923f1 100644
--- a/drivers/dma/odm/odm_dmadev.c
+++ b/drivers/dma/odm/odm_dmadev.c
@@ -320,6 +320,251 @@ odm_dmadev_copy_sg(void *dev_private, uint16_t vchan, const struct rte_dma_sge *
return vq->desc_idx++;
}
+static int
+odm_dmadev_fill(void *dev_private, uint16_t vchan, uint64_t pattern, rte_iova_t dst,
+ uint32_t length, uint64_t flags)
+{
+ uint16_t pending_submit_len, pending_submit_cnt, iring_sz_available, iring_head;
+ const int num_words = ODM_IRING_ENTRY_SIZE_MIN;
+ struct odm_dev *odm = dev_private;
+ uint64_t *iring_head_ptr;
+ struct odm_queue *vq;
+ uint64_t h;
+
+ vq = &odm->vq[vchan];
+
+ union odm_instr_hdr_s hdr = {
+ .s.ct = ODM_HDR_CT_CW_NC,
+ .s.nfst = 0,
+ .s.nlst = 1,
+ };
+
+ h = (uint64_t)length;
+
+ switch (pattern) {
+ case 0:
+ hdr.s.xtype = ODM_XTYPE_FILL0;
+ break;
+ case 0xffffffffffffffff:
+ hdr.s.xtype = ODM_XTYPE_FILL1;
+ break;
+ default:
+ return -ENOTSUP;
+ }
+
+ const uint16_t max_iring_words = vq->iring_max_words;
+
+ iring_sz_available = vq->iring_sz_available;
+ pending_submit_len = vq->pending_submit_len;
+ pending_submit_cnt = vq->pending_submit_cnt;
+ iring_head_ptr = vq->iring_mz->addr;
+ iring_head = vq->iring_head;
+
+ if (iring_sz_available < num_words)
+ return -ENOSPC;
+
+ if ((iring_head + num_words) >= max_iring_words) {
+
+ iring_head_ptr[iring_head] = hdr.u;
+ iring_head = (iring_head + 1) % max_iring_words;
+
+ iring_head_ptr[iring_head] = h;
+ iring_head = (iring_head + 1) % max_iring_words;
+
+ iring_head_ptr[iring_head] = dst;
+ iring_head = (iring_head + 1) % max_iring_words;
+
+ iring_head_ptr[iring_head] = 0;
+ iring_head = (iring_head + 1) % max_iring_words;
+ } else {
+ iring_head_ptr[iring_head] = hdr.u;
+ iring_head_ptr[iring_head + 1] = h;
+ iring_head_ptr[iring_head + 2] = dst;
+ iring_head_ptr[iring_head + 3] = 0;
+ iring_head += num_words;
+ }
+
+ pending_submit_len += num_words;
+
+ if (flags & RTE_DMA_OP_FLAG_SUBMIT) {
+ rte_wmb();
+ odm_write64(pending_submit_len, odm->rbase + ODM_VDMA_DBELL(vchan));
+ vq->stats.submitted += pending_submit_cnt + 1;
+ vq->pending_submit_len = 0;
+ vq->pending_submit_cnt = 0;
+ } else {
+ vq->pending_submit_len = pending_submit_len;
+ vq->pending_submit_cnt++;
+ }
+
+ vq->iring_head = iring_head;
+ vq->iring_sz_available = iring_sz_available - num_words;
+
+ /* No extra space to save. Skip entry in extra space ring. */
+ vq->ins_ring_head = (vq->ins_ring_head + 1) % vq->cring_max_entry;
+
+ return vq->desc_idx++;
+}
+
+static uint16_t
+odm_dmadev_completed(void *dev_private, uint16_t vchan, const uint16_t nb_cpls, uint16_t *last_idx,
+ bool *has_error)
+{
+ const union odm_cmpl_ent_s cmpl_zero = {0};
+ uint16_t cring_head, iring_sz_available;
+ struct odm_dev *odm = dev_private;
+ union odm_cmpl_ent_s cmpl;
+ struct odm_queue *vq;
+ uint64_t nb_err = 0;
+ uint32_t *cmpl_ptr;
+ int cnt;
+
+ vq = &odm->vq[vchan];
+ const uint32_t *base_addr = vq->cring_mz->addr;
+ const uint16_t cring_max_entry = vq->cring_max_entry;
+
+ cring_head = vq->cring_head;
+ iring_sz_available = vq->iring_sz_available;
+
+ if (unlikely(vq->stats.submitted == vq->stats.completed)) {
+ *last_idx = (vq->stats.completed_offset + vq->stats.completed - 1) & 0xFFFF;
+ return 0;
+ }
+
+ for (cnt = 0; cnt < nb_cpls; cnt++) {
+ cmpl_ptr = RTE_PTR_ADD(base_addr, cring_head * sizeof(cmpl));
+ cmpl.u = rte_atomic_load_explicit((RTE_ATOMIC(uint32_t) *)cmpl_ptr,
+ rte_memory_order_relaxed);
+ if (!cmpl.s.valid)
+ break;
+
+ if (cmpl.s.cmp_code)
+ nb_err++;
+
+ /* Free space for enqueue */
+ iring_sz_available += 4 + vq->extra_ins_sz[cring_head];
+
+ /* Clear instruction extra space */
+ vq->extra_ins_sz[cring_head] = 0;
+
+ rte_atomic_store_explicit((RTE_ATOMIC(uint32_t) *)cmpl_ptr, cmpl_zero.u,
+ rte_memory_order_relaxed);
+ cring_head = (cring_head + 1) % cring_max_entry;
+ }
+
+ vq->stats.errors += nb_err;
+
+ if (unlikely(has_error != NULL && nb_err))
+ *has_error = true;
+
+ vq->cring_head = cring_head;
+ vq->iring_sz_available = iring_sz_available;
+
+ vq->stats.completed += cnt;
+
+ *last_idx = (vq->stats.completed_offset + vq->stats.completed - 1) & 0xFFFF;
+
+ return cnt;
+}
+
+static uint16_t
+odm_dmadev_completed_status(void *dev_private, uint16_t vchan, const uint16_t nb_cpls,
+ uint16_t *last_idx, enum rte_dma_status_code *status)
+{
+ const union odm_cmpl_ent_s cmpl_zero = {0};
+ uint16_t cring_head, iring_sz_available;
+ struct odm_dev *odm = dev_private;
+ union odm_cmpl_ent_s cmpl;
+ struct odm_queue *vq;
+ uint32_t *cmpl_ptr;
+ int cnt;
+
+ vq = &odm->vq[vchan];
+ const uint32_t *base_addr = vq->cring_mz->addr;
+ const uint16_t cring_max_entry = vq->cring_max_entry;
+
+ cring_head = vq->cring_head;
+ iring_sz_available = vq->iring_sz_available;
+
+ if (vq->stats.submitted == vq->stats.completed) {
+ *last_idx = (vq->stats.completed_offset + vq->stats.completed - 1) & 0xFFFF;
+ return 0;
+ }
+
+#ifdef ODM_DEBUG
+ odm_debug("cring_head: 0x%" PRIx16, cring_head);
+ odm_debug("Submitted: 0x%" PRIx64, vq->stats.submitted);
+ odm_debug("Completed: 0x%" PRIx64, vq->stats.completed);
+ odm_debug("Hardware count: 0x%" PRIx64, odm_read64(odm->rbase + ODM_VDMA_CNT(vchan)));
+#endif
+
+ for (cnt = 0; cnt < nb_cpls; cnt++) {
+ cmpl_ptr = RTE_PTR_ADD(base_addr, cring_head * sizeof(cmpl));
+ cmpl.u = rte_atomic_load_explicit((RTE_ATOMIC(uint32_t) *)cmpl_ptr,
+ rte_memory_order_relaxed);
+ if (!cmpl.s.valid)
+ break;
+
+ status[cnt] = cmpl.s.cmp_code;
+
+ if (cmpl.s.cmp_code)
+ vq->stats.errors++;
+
+ /* Free space for enqueue */
+ iring_sz_available += 4 + vq->extra_ins_sz[cring_head];
+
+ /* Clear instruction extra space */
+ vq->extra_ins_sz[cring_head] = 0;
+
+ rte_atomic_store_explicit((RTE_ATOMIC(uint32_t) *)cmpl_ptr, cmpl_zero.u,
+ rte_memory_order_relaxed);
+ cring_head = (cring_head + 1) % cring_max_entry;
+ }
+
+ vq->cring_head = cring_head;
+ vq->iring_sz_available = iring_sz_available;
+
+ vq->stats.completed += cnt;
+
+ *last_idx = (vq->stats.completed_offset + vq->stats.completed - 1) & 0xFFFF;
+
+ return cnt;
+}
+
+static int
+odm_dmadev_submit(void *dev_private, uint16_t vchan)
+{
+ struct odm_dev *odm = dev_private;
+ uint16_t pending_submit_len;
+ struct odm_queue *vq;
+
+ vq = &odm->vq[vchan];
+ pending_submit_len = vq->pending_submit_len;
+
+ if (pending_submit_len == 0)
+ return 0;
+
+ rte_wmb();
+ odm_write64(pending_submit_len, odm->rbase + ODM_VDMA_DBELL(vchan));
+ vq->pending_submit_len = 0;
+ vq->stats.submitted += vq->pending_submit_cnt;
+ vq->pending_submit_cnt = 0;
+
+ return 0;
+}
+
+static uint16_t
+odm_dmadev_burst_capacity(const void *dev_private, uint16_t vchan)
+{
+ const struct odm_dev *odm = dev_private;
+ const struct odm_queue *vq;
+
+ vq = &odm->vq[vchan];
+ return (vq->iring_sz_available / ODM_IRING_ENTRY_SIZE_MIN);
+}
+
static int
odm_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *rte_stats,
uint32_t size)
@@ -419,6 +664,11 @@ odm_dmadev_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_dev
dmadev->fp_obj->copy = odm_dmadev_copy;
dmadev->fp_obj->copy_sg = odm_dmadev_copy_sg;
+ dmadev->fp_obj->fill = odm_dmadev_fill;
+ dmadev->fp_obj->submit = odm_dmadev_submit;
+ dmadev->fp_obj->completed = odm_dmadev_completed;
+ dmadev->fp_obj->completed_status = odm_dmadev_completed_status;
+ dmadev->fp_obj->burst_capacity = odm_dmadev_burst_capacity;
odm->pci_dev = pci_dev;
--
2.25.1