QEMU-Devel Archive mirror
 help / color / mirror / Atom feed
From: Mattias Nissler <mnissler@rivosinc.com>
To: "Philippe Mathieu-Daudé" <philmd@linaro.org>
Cc: stefanha@redhat.com, qemu-devel@nongnu.org, peterx@redhat.com,
	 jag.raman@oracle.com, "Michael S. Tsirkin" <mst@redhat.com>,
	 Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	 Richard Henderson <richard.henderson@linaro.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	john.levon@nutanix.com
Subject: Re: [PATCH v9 2/5] softmmu: Support concurrent bounce buffers
Date: Tue, 7 May 2024 16:04:33 +0200	[thread overview]
Message-ID: <CAGNS4Tb1FVFmzVLvSZD84n0q80bwDiAMaSr752jb4nDwkGHSjA@mail.gmail.com> (raw)
In-Reply-To: <447ddc5a-ae1e-4fd1-b03a-dd7e1faa46e9@linaro.org>

On Tue, May 7, 2024 at 2:57 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> On 7/5/24 11:42, Mattias Nissler wrote:
> > When DMA memory can't be directly accessed, as is the case when
> > running the device model in a separate process without shareable DMA
> > file descriptors, bounce buffering is used.
> >
> > It is not uncommon for device models to request mapping of several DMA
> > regions at the same time. Examples include:
> >   * net devices, e.g. when transmitting a packet that is split across
> >     several TX descriptors (observed with igb)
> >   * USB host controllers, when handling a packet with multiple data TRBs
> >     (observed with xhci)
> >
> > Previously, qemu only provided a single bounce buffer per AddressSpace
> > and would fail DMA map requests while the buffer was already in use. In
> > turn, this would cause DMA failures that ultimately manifest as hardware
> > errors from the guest perspective.
> >
> > This change allocates DMA bounce buffers dynamically instead of
> > supporting only a single buffer. Thus, multiple DMA mappings work
> > correctly also when RAM can't be mmap()-ed.
> >
> > The total bounce buffer allocation size is limited individually for each
> > AddressSpace. The default limit is 4096 bytes, matching the previous
> > maximum buffer size. A new x-max-bounce-buffer-size parameter is
> > provided to configure the limit for PCI devices.
> >
> > Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
> > ---
> >   hw/pci/pci.c                |  8 ++++
> >   include/exec/memory.h       | 14 +++----
> >   include/hw/pci/pci_device.h |  3 ++
> >   system/memory.c             |  5 ++-
> >   system/physmem.c            | 82 ++++++++++++++++++++++++++-----------
> >   5 files changed, 76 insertions(+), 36 deletions(-)
>
>
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index d417d7f363..2ea1e99da2 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -1117,13 +1117,7 @@ typedef struct AddressSpaceMapClient {
> >       QLIST_ENTRY(AddressSpaceMapClient) link;
> >   } AddressSpaceMapClient;
> >
> > -typedef struct {
> > -    MemoryRegion *mr;
> > -    void *buffer;
> > -    hwaddr addr;
> > -    hwaddr len;
> > -    bool in_use;
> > -} BounceBuffer;
> > +#define DEFAULT_MAX_BOUNCE_BUFFER_SIZE (4096)
> >
> >   /**
> >    * struct AddressSpace: describes a mapping of addresses to #MemoryRegion objects
> > @@ -1143,8 +1137,10 @@ struct AddressSpace {
> >       QTAILQ_HEAD(, MemoryListener) listeners;
> >       QTAILQ_ENTRY(AddressSpace) address_spaces_link;
> >
> > -    /* Bounce buffer to use for this address space. */
> > -    BounceBuffer bounce;
> > +    /* Maximum DMA bounce buffer size used for indirect memory map requests */
> > +    uint32_t max_bounce_buffer_size;
>
> Alternatively size_t.

While switching things over, I was surprised to find that
DEFINE_PROP_SIZE wants a uint64_t field rather than a size_t field.
There is a DEFINE_PROP_SIZE32 variant for uint32_t though. Considering
my options, assuming that we want to use size_t for everything other
than the property:

(1) Make PCIDevice::max_bounce_buffer_size size_t and have the
preprocessor select DEFINE_PROP_SIZE/DEFINE_PROP_SIZE32. This makes
the qdev property type depend on the host. Ugh.

(2) Make PCIDevice::max_bounce_buffer_size uint64_t and clamp if
needed when used. Weird to allow larger values that are then clamped,
although it probably doesn't matter in practice since address space is
limited to 4GB anyways.

(3) Make PCIDevice::max_bounce_buffer_size uint32_t and accept the
limitation that the largest bounce buffer limit is 4GB even on 64-bit
hosts.

#3 seemed most pragmatic, so I'll go with that.


>
> > +    /* Total size of bounce buffers currently allocated, atomically accessed */
> > +    uint32_t bounce_buffer_size;
>
> Ditto.
>
> >       /* List of callbacks to invoke when buffers free up */
> >       QemuMutex map_client_list_lock;
> >       QLIST_HEAD(, AddressSpaceMapClient) map_client_list;
> > diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
> > index d3dd0f64b2..253b48a688 100644
> > --- a/include/hw/pci/pci_device.h
> > +++ b/include/hw/pci/pci_device.h
> > @@ -160,6 +160,9 @@ struct PCIDevice {
> >       /* ID of standby device in net_failover pair */
> >       char *failover_pair_id;
> >       uint32_t acpi_index;
> > +
> > +    /* Maximum DMA bounce buffer size used for indirect memory map requests */
> > +    uint32_t max_bounce_buffer_size;
>
> Ditto.
>
> >   };
>
>
> > diff --git a/system/physmem.c b/system/physmem.c
> > index 632da6508a..cd61758da0 100644
> > --- a/system/physmem.c
> > +++ b/system/physmem.c
> > @@ -3046,6 +3046,20 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len)
> >                                        NULL, len, FLUSH_CACHE);
> >   }
> >
> > +/*
> > + * A magic value stored in the first 8 bytes of the bounce buffer struct. Used
> > + * to detect illegal pointers passed to address_space_unmap.
> > + */
> > +#define BOUNCE_BUFFER_MAGIC 0xb4017ceb4ffe12ed
> > +
> > +typedef struct {
> > +    uint64_t magic;
> > +    MemoryRegion *mr;
> > +    hwaddr addr;
> > +    uint32_t len;
> > +    uint8_t buffer[];
> > +} BounceBuffer;
>
> Eh, you moved it back here. Never mind.
>
> > +
> >   static void
> >   address_space_unregister_map_client_do(AddressSpaceMapClient *client)
> >   {
> > @@ -3071,9 +3085,9 @@ void address_space_register_map_client(AddressSpace *as, QEMUBH *bh)
> >       qemu_mutex_lock(&as->map_client_list_lock);
> >       client->bh = bh;
> >       QLIST_INSERT_HEAD(&as->map_client_list, client, link);
> > -    /* Write map_client_list before reading in_use.  */
> > +    /* Write map_client_list before reading bounce_buffer_size. */
> >       smp_mb();
> > -    if (!qatomic_read(&as->bounce.in_use)) {
> > +    if (qatomic_read(&as->bounce_buffer_size) < as->max_bounce_buffer_size) {
> >           address_space_notify_map_clients_locked(as);
> >       }
> >       qemu_mutex_unlock(&as->map_client_list_lock);
> > @@ -3203,28 +3217,40 @@ void *address_space_map(AddressSpace *as,
> >       mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs);
> >
> >       if (!memory_access_is_direct(mr, is_write)) {
> > -        if (qatomic_xchg(&as->bounce.in_use, true)) {
> > +        uint32_t used = qatomic_read(&as->bounce_buffer_size);
>
> Nitpicking again, size_t seems clearer. Otherwise LGTM.


  reply	other threads:[~2024-05-07 14:05 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-07  9:42 [PATCH v9 0/5] Support message-based DMA in vfio-user server Mattias Nissler
2024-05-07  9:42 ` [PATCH v9 1/5] softmmu: Per-AddressSpace bounce buffering Mattias Nissler
2024-05-07 12:31   ` Philippe Mathieu-Daudé
2024-05-07  9:42 ` [PATCH v9 2/5] softmmu: Support concurrent bounce buffers Mattias Nissler
2024-05-07 12:57   ` Philippe Mathieu-Daudé
2024-05-07 14:04     ` Mattias Nissler [this message]
2024-05-07 14:46       ` Philippe Mathieu-Daudé
2024-05-08  6:33         ` Mattias Nissler
2024-05-13  6:29           ` Mattias Nissler
2024-05-07  9:42 ` [PATCH v9 3/5] Update subprojects/libvfio-user Mattias Nissler
2024-05-07  9:42 ` [PATCH v9 4/5] vfio-user: Message-based DMA support Mattias Nissler
2024-05-07  9:42 ` [PATCH v9 5/5] vfio-user: Fix config space access byte order Mattias Nissler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGNS4Tb1FVFmzVLvSZD84n0q80bwDiAMaSr752jb4nDwkGHSjA@mail.gmail.com \
    --to=mnissler@rivosinc.com \
    --cc=david@redhat.com \
    --cc=elena.ufimtseva@oracle.com \
    --cc=jag.raman@oracle.com \
    --cc=john.levon@nutanix.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).