From: Nicolin Chen <nicolinc@nvidia.com>
To: <will@kernel.org>, <robin.murphy@arm.com>, <jgg@nvidia.com>
Cc: <joro@8bytes.org>, <thierry.reding@gmail.com>,
<vdumpa@nvidia.com>, <jonathanh@nvidia.com>,
<linux-kernel@vger.kernel.org>, <iommu@lists.linux.dev>,
<linux-arm-kernel@lists.infradead.org>,
<linux-tegra@vger.kernel.org>
Subject: [PATCH v7 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
Date: Tue, 7 May 2024 22:56:48 -0700
Message-ID: <cover.1715147377.git.nicolinc@nvidia.com>
NVIDIA's Tegra241 (Grace) SoC has CMDQ-Virtualization (CMDQV) hardware
that extends the standard ARM SMMUv3 to support multiple command queues
with virtualization capabilities. Though this is similar to the ECMDQ in
SMMUv3.3, CMDQV provides additional Virtual Interfaces (VINTFs), allowing
VMs to have their own VINTFs and Virtual Command Queues (VCMDQs). Compared
to the standard SMMUv3 CMDQ, a VCMDQ can execute only a limited set of
commands, mainly invalidation commands, when used exclusively by a VM.
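To illustrate the command restriction described above, here is a minimal user-space sketch in C of an opcode allow-list with a fallback to the standard CMDQ. It is not taken from the series: every `ex_`/`EX_` name is hypothetical, the opcode values are illustrative, and the exact set of commands a guest-owned VCMDQ accepts is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcode names; the numeric values are illustrative only. */
enum ex_cmdq_op {
	EX_OP_CFGI_STE     = 0x03,
	EX_OP_TLBI_NH_ASID = 0x11,
	EX_OP_TLBI_NH_VA   = 0x12,
	EX_OP_ATC_INV      = 0x40,
	EX_OP_CMD_SYNC     = 0x46,
};

/* The two kinds of queue a command can be routed to in this sketch. */
enum ex_queue {
	EX_QUEUE_SMMU_CMDQ,	/* standard SMMUv3 command queue */
	EX_QUEUE_VCMDQ,		/* virtual command queue */
};

/*
 * Assumed allow-list for a guest-owned VCMDQ: invalidation commands plus
 * the sync that completes them. The real list is defined by the hardware
 * and the driver, not by this example.
 */
static bool ex_guest_vcmdq_supports(uint8_t opcode)
{
	switch (opcode) {
	case EX_OP_TLBI_NH_ASID:
	case EX_OP_TLBI_NH_VA:
	case EX_OP_ATC_INV:
	case EX_OP_CMD_SYNC:
		return true;
	default:
		return false;
	}
}

/* Route unsupported commands from a guest-owned VINTF to the standard CMDQ. */
static enum ex_queue ex_select_queue(bool guest_owned, uint8_t opcode)
{
	if (guest_owned && !ex_guest_vcmdq_supports(opcode))
		return EX_QUEUE_SMMU_CMDQ;
	return EX_QUEUE_VCMDQ;
}

/* Small usage example: an invalidation stays on the VCMDQ, a CFGI does not. */
static void ex_self_check(void)
{
	assert(ex_select_queue(true, EX_OP_TLBI_NH_VA) == EX_QUEUE_VCMDQ);
	assert(ex_select_queue(true, EX_OP_CFGI_STE) == EX_QUEUE_SMMU_CMDQ);
}
```

In this sketch a host-owned VINTF is treated as unrestricted, which matches the statement that the limit applies when the VCMDQs are used exclusively by VMs.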
Support for CMDQV is added in two patch series: the basic in-kernel
support as part 1, and the user-space support as part 2.
The in-kernel support detects and configures the CMDQV hardware, then
allocates a VINTF with some VCMDQs for the kernel/hypervisor to use. Like
ECMDQ, CMDQV also allows the kernel to use multiple VCMDQs, giving a
limited performance improvement: a multi-threaded DMA unmap benchmark
measured up to a 20% reduction in TLB invalidation time, compared to a
single queue.
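One way to spread invalidation traffic across several queues, as described above, is to pick a queue from the issuing CPU's number. The C sketch below is an assumption for illustration only, not the driver's code: `struct ex_vintf`, its fields, and `ex_pick_vcmdq()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a VINTF owned by the kernel. */
struct ex_vintf {
	int enabled;		/* non-zero once the VINTF is configured */
	unsigned int nr_vcmdqs;	/* number of VCMDQs allocated to it */
};

/*
 * Return the index of the VCMDQ that the given CPU should use, or -1 to
 * mean "fall back to the standard SMMU CMDQ". CPUs are mapped round-robin
 * onto the available queues, so threads on different CPUs tend to contend
 * on different queues.
 */
static int ex_pick_vcmdq(const struct ex_vintf *vintf, unsigned int cpu)
{
	assert(vintf != NULL);

	if (!vintf->enabled || vintf->nr_vcmdqs == 0)
		return -1;

	return (int)(cpu % vintf->nr_vcmdqs);
}
```

With two VCMDQs, for example, CPUs 0 and 2 would share queue 0 while CPUs 1 and 3 share queue 1; a disabled VINTF always yields the fallback value.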
The user-space support provides uAPIs (via IOMMUFD) for hypervisors in
user space to pass VCMDQs through to VMs, allowing these VMs to access
the VCMDQs directly without trapping, i.e. no VM Exits. This gives huge
performance improvements: various DMA unmap tests running in a guest OS
measured 70% to 90% reductions in TLB invalidation time, compared to a
nested SMMU CMDQ (with trapping).
This is the part-1 series:
- Preparatory changes to share the existing SMMU functions
- A new CMDQV driver and extending the SMMUv3 driver to interact with
the new driver
- Limiting the commands for a guest kernel
The series is available on GitHub:
https://github.com/nicolinc/iommufd/commits/vcmdq_in_kernel-v7
And the part-2 RFC series is also sent for discussion:
https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
Note that this in-kernel support isn't confined to host kernels running
on Grace-powered servers, but is also used by guest kernels running on
VMs virtualized on those servers. Those VMs must therefore install the
driver, ideally before part 2 is merged, so that later those servers
only need to upgrade their host kernels without touching the VMs.
Thank you!
Changelog
v7:
* Moved all public symbols into one single patch
* Enforced a command batch to use the same cmdq
* Enforced the use of arm_smmu_cmdq_build_sync_cmd()
* Reworked the tegra241-cmdqv driver patch
- Dropped logging macros, cmdqv->dev, and atomic
- Dropped devm_* and added tegra241_cmdqv_device_remove()
- Moved all structure allocations to cmdqv's probe() from
device_reset() where only register configurations remain
- Switched the config macros to inline functions
- Optimized the ISR with 64-bit MMIO reads
- Scan once per batch against command list
- Reorganized function locations
- Minor readability changes
v6:
https://lore.kernel.org/all/cover.1714451595.git.nicolinc@nvidia.com/
* Reordered the patch sequence to fix git-bisect break
* Added a status cache to cmdqv/vintf/vcmdq structure
* Added gerror/gerrorn value match in hw_deinit()
* Minimized changes in __arm_smmu_cmdq_skip_err()
* Preallocated VCMDQs to VINTFs for stability
v5:
https://lore.kernel.org/all/cover.1712977210.git.nicolinc@nvidia.com/
* Improved print/mmio helpers
* Added proper register reset routines
* Reorganized init/deinit functions to share with VIOMMU callbacks in
the upcoming part-2 user-space series (RFC)
v4:
https://lore.kernel.org/all/cover.1711690673.git.nicolinc@nvidia.com/
* Rebased on v6.9-rc1
* Renamed to "tegra241-cmdqv", following other Grace kernel patches
* Added a set of print and MMIO helpers
* Reworked the guest limitation patch
v3:
https://lore.kernel.org/all/20211119071959.16706-1-nicolinc@nvidia.com/
* Dropped VMID and mdev patches to redesign later based on IOMMUFD
* Separated HYP_OWN part for guest support into a new patch
* Added new preparatory changes
v2:
https://lore.kernel.org/all/20210831025923.15812-1-nicolinc@nvidia.com/
* Added mdev interface support for hypervisor and VMs
* Added preparatory changes for mdev interface implementation
* PATCH-12 Changed ->issue_cmdlist() to ->get_cmdq() for a better
integration with recently merged ECMDQ-related changes
v1:
https://lore.kernel.org/all/20210723193140.9690-1-nicolinc@nvidia.com/
Nate Watterson (1):
iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace)
CMDQV
Nicolin Chen (5):
iommu/arm-smmu-v3: Make symbols public for CONFIG_TEGRA241_CMDQV
iommu/arm-smmu-v3: Issue a batch of commands to the same cmdq
iommu/arm-smmu-v3: Enforce arm_smmu_cmdq_build_sync_cmd
iommu/arm-smmu-v3: Add CS_NONE quirk for CONFIG_TEGRA241_CMDQV
iommu/tegra241-cmdqv: Limit CMDs for guest owned VINTF
MAINTAINERS | 1 +
drivers/iommu/Kconfig | 11 +
drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 164 ++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 67 +-
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 911 ++++++++++++++++++
6 files changed, 1088 insertions(+), 67 deletions(-)
create mode 100644 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
--
2.43.0
Thread overview: 16+ messages
2024-05-08 5:56 Nicolin Chen [this message]
2024-05-08 5:56 ` [PATCH v7 1/6] iommu/arm-smmu-v3: Make symbols public for CONFIG_TEGRA241_CMDQV Nicolin Chen
2024-05-08 5:56 ` [PATCH v7 2/6] iommu/arm-smmu-v3: Issue a batch of commands to the same cmdq Nicolin Chen
2024-05-12 15:34 ` Jason Gunthorpe
2024-05-08 5:56 ` [PATCH v7 3/6] iommu/arm-smmu-v3: Enforce arm_smmu_cmdq_build_sync_cmd Nicolin Chen
2024-05-12 15:39 ` Jason Gunthorpe
2024-05-12 20:56 ` Nicolin Chen
2024-05-08 5:56 ` [PATCH v7 4/6] iommu/arm-smmu-v3: Add CS_NONE quirk for CONFIG_TEGRA241_CMDQV Nicolin Chen
2024-05-08 5:56 ` [PATCH v7 5/6] iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV Nicolin Chen
2024-05-12 15:54 ` Jason Gunthorpe
2024-05-12 21:00 ` Nicolin Chen
2024-05-08 5:56 ` [PATCH v7 6/6] iommu/tegra241-cmdqv: Limit CMDs for guest owned VINTF Nicolin Chen
2024-05-12 16:06 ` Jason Gunthorpe
2024-05-12 22:09 ` Nicolin Chen
2024-05-14 15:15 ` Jason Gunthorpe
2024-05-14 22:20 ` Nicolin Chen