QEMU-Devel Archive mirror
 help / color / mirror / Atom feed
From: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: qemu-devel@nongnu.org, jgross@suse.com,
	"Edgar E. Iglesias" <edgar.iglesias@amd.com>,
	"Anthony Perard" <anthony.perard@citrix.com>,
	"Paul Durrant" <paul@xen.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	xen-devel@lists.xenproject.org, Xenia.Ragiadakou@amd.com
Subject: Re: [PATCH v4 15/17] xen: mapcache: Remove assumption of RAMBlock with 0 offset
Date: Tue, 7 May 2024 19:18:57 +0200	[thread overview]
Message-ID: <CAJy5ezoTVyP5Rs=zM3jdY2NKhwQqDuCWD9oG72m4Xsdc-kmc+A@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2405021301400.624854@ubuntu-linux-20-04-desktop>

On Thu, May 2, 2024 at 10:02 PM Stefano Stabellini
<sstabellini@kernel.org> wrote:
>
> On Thu, 2 May 2024, Edgar E. Iglesias wrote:
> > On Thu, May 2, 2024 at 8:53 PM Stefano Stabellini
> > <sstabellini@kernel.org> wrote:
> > >
> > > +Xenia
> > >
> > > On Thu, 2 May 2024, Edgar E. Iglesias wrote:
> > > > On Wed, May 1, 2024 at 11:24 PM Stefano Stabellini
> > > > <sstabellini@kernel.org> wrote:
> > > > >
> > > > > On Tue, 30 Apr 2024, Edgar E. Iglesias wrote:
> > > > > > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>
> > > > > >
> > > > > > The current mapcache assumes that all memory is mapped
> > > > > > in a single RAM MR (the first one with offset 0). Remove
> > > > > > this assumption and propagate the offset to the mapcache
> > > > > > so it can do reverse mappings (from hostptr -> ram_addr).
> > > > > >
> > > > > > This is in preparation for adding grant mappings.
> > > > > >
> > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
> > > > >
> > > > >
> > > > > Looking at xen_remap_bucket, it is only using address_index (without
> > > > > adding ram_offset) to map foreign memory. From xen_remap_bucket, I would
> > > > > understand that address_index already includes the ram_offset.
> > > > >
> > > > > Meaning that if we want to map foreign mapping at address 0x5000, then
> > > > > address_index would be 0x5000, even if ram_offset is 0x1000.
> > > > >
> > > > > But then looking xen_ram_addr_from_mapcache_single ram_offset is added
> > > > > to paddr_index to calculate the physical address. So in that case we
> > > > > would want address_index to be 0x4000 and ram_offset to be 0x1000. But
> > > > > xen_remap_bucket would have to sum address_index and ram_offset to map
> > > > > foreign memory.
> > > > >
> > > > > So I am a bit confused, did I get it wrong? One more comment below.
> > > > >
> > > >
> > > > Thanks Stefano,
> > > >
> > > > I think the confusion is that this ram_addr_offset is not related to
> > > > guest address-space.
> > > > It's a QEMU internal thing and it shouldn't be included in the address
> > > > used to map foreign memory.
> > > > The mapcache can treat this ram_addr offset like a cookie that we keep
> > > > around to be able to do
> > > > reverse mappings from host pointers into ram_addr space
> > > > (xen_ram_addr_from_mapcache).
> > > >
> > > > The current mapcache implementation works because we've really only
> > > > been using foreign mappings
> > > > on RAMBlocks with offset 0. We're also creating RAM's such that the
> > > > offset into the RAM is also
> > > > the guest physical address, for x86 this is natural since RAM starts
> > > > at zero (for lowmem) but for
> > > > ARM we're creating larger than needed RAM's (GUEST_RAM0_BASE + ram-size) to
> > > > make this assumption true. Anyway, In this series I'm not addressing
> > > > this second assumption.
> > >
> > > Let's see if I understand correctly.
> > >
> > > The ram_addr space is an internal QEMU address space which is different
> > > from the guest physical address space and thus cannot and should not be
> > > used to do foreign mappings (foreign mapping hypercalls take a guest
> > > physical or a real physical address to map). Is that correct?
> > >
> > > If so, then I understand.
> > >
> >
> > Yes, that matches my understanding.
> >
> > >
> > >
> > > > There's a second call in physmem.c to xen_map_cache using the
> > > > block->offset as an address.
> > > > I was considering removing that second call since I can't see how it can work
> > > > (except perhaps in some specific use-case by luck?). Anyway, for now
> > > > I've left it unmodified.
> > >
> > > Yes, that code was written with the assumption that block->offset is an
> > > offset in the guest physical address space and could be used as a guest
> > > physical address. Actually, you might have spotted a real bug.
> > >
> > > The intent was for smaller regions (not the bit RAM region, things like
> > > a ROM region for instance) we could map them in full. So here we were
> > > trying to map the whole thing from start to finish using block->offset
> > > as start.
> > >
> > >
> > > > > > ---
> > > > > >  hw/xen/xen-mapcache.c         | 25 ++++++++++++++++++-------
> > > > > >  include/sysemu/xen-mapcache.h |  2 ++
> > > > > >  system/physmem.c              |  8 ++++----
> > > > > >  3 files changed, 24 insertions(+), 11 deletions(-)
> > > > > >
> > > > > > diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> > > > > > index 09b5f36d9c..1b32d0c003 100644
> > > > > > --- a/hw/xen/xen-mapcache.c
> > > > > > +++ b/hw/xen/xen-mapcache.c
> > > > > > @@ -43,6 +43,9 @@ typedef struct MapCacheEntry {
> > > > > >  #define XEN_MAPCACHE_ENTRY_DUMMY (1 << 0)
> > > > > >      uint8_t flags;
> > > > > >      hwaddr size;
> > > > > > +
> > > > > > +    /* Keep ram_addr offset for reverse mappings (hostptr -> ram_addr).  */
> > > > > > +    ram_addr_t ram_offset;
> > > > > >      struct MapCacheEntry *next;
> > > > > >  } MapCacheEntry;
> > > > > >
> > > > > > @@ -165,7 +168,8 @@ static void xen_remap_bucket(MapCache *mc,
> > > > > >                               void *vaddr,
> > > > > >                               hwaddr size,
> > > > > >                               hwaddr address_index,
> > > > > > -                             bool dummy)
> > > > > > +                             bool dummy,
> > > > > > +                             ram_addr_t ram_offset)
> > > > > >  {
> > > > > >      uint8_t *vaddr_base;
> > > > > >      xen_pfn_t *pfns;
> > > > > > @@ -244,6 +248,7 @@ static void xen_remap_bucket(MapCache *mc,
> > > > > >      entry->size = size;
> > > > > >      entry->valid_mapping = g_new0(unsigned long,
> > > > > >                                    BITS_TO_LONGS(size >> XC_PAGE_SHIFT));
> > > > > > +    entry->ram_offset = ram_offset;
> > > > > >
> > > > > >      if (dummy) {
> > > > > >          entry->flags |= XEN_MAPCACHE_ENTRY_DUMMY;
> > > > > > @@ -264,6 +269,7 @@ static void xen_remap_bucket(MapCache *mc,
> > > > > >
> > > > > >  static uint8_t *xen_map_cache_unlocked(MapCache *mc,
> > > > > >                                         hwaddr phys_addr, hwaddr size,
> > > > > > +                                       ram_addr_t ram_offset,
> > > > > >                                         uint8_t lock, bool dma, bool is_write)
> > > > > >  {
> > > > > >      MapCacheEntry *entry, *pentry = NULL,
> > > > > > @@ -335,14 +341,16 @@ tryagain:
> > > > > >      if (!entry) {
> > > > > >          entry = g_new0(MapCacheEntry, 1);
> > > > > >          pentry->next = entry;
> > > > > > -        xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy);
> > > > > > +        xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy,
> > > > > > +                         ram_offset);
> > > > > >      } else if (!entry->lock) {
> > > > > >          if (!entry->vaddr_base || entry->paddr_index != address_index ||
> > > > > >                  entry->size != cache_size ||
> > > > > >                  !test_bits(address_offset >> XC_PAGE_SHIFT,
> > > > > >                      test_bit_size >> XC_PAGE_SHIFT,
> > > > > >                      entry->valid_mapping)) {
> > > > > > -            xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy);
> > > > > > +            xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy,
> > > > > > +                             ram_offset);
> > > > > >          }
> > > > > >      }
> > > > > >
> > > > > > @@ -389,13 +397,15 @@ tryagain:
> > > > > >
> > > > > >  uint8_t *xen_map_cache(MemoryRegion *mr,
> > > > > >                         hwaddr phys_addr, hwaddr size,
> > > > > > +                       ram_addr_t ram_addr_offset,
> > > > > >                         uint8_t lock, bool dma,
> > > > > >                         bool is_write)
> > > > > >  {
> > > > > >      uint8_t *p;
> > > > > >
> > > > > >      mapcache_lock(mapcache);
> > > > > > -    p = xen_map_cache_unlocked(mapcache, phys_addr, size, lock, dma, is_write);
> > > > > > +    p = xen_map_cache_unlocked(mapcache, phys_addr, size, ram_addr_offset,
> > > > > > +                               lock, dma, is_write);
> > > > > >      mapcache_unlock(mapcache);
> > > > > >      return p;
> > > > > >  }
> > > > > > @@ -432,7 +442,8 @@ static ram_addr_t xen_ram_addr_from_mapcache_single(MapCache *mc, void *ptr)
> > > > > >          raddr = RAM_ADDR_INVALID;
> > > > > >      } else {
> > > > > >          raddr = (reventry->paddr_index << mc->bucket_shift) +
> > > > > > -             ((unsigned long) ptr - (unsigned long) entry->vaddr_base);
> > > > > > +             ((unsigned long) ptr - (unsigned long) entry->vaddr_base) +
> > > > > > +             entry->ram_offset;
> > > > > >      }
> > > > > >      mapcache_unlock(mc);
> > > > > >      return raddr;
> > > > > > @@ -627,8 +638,8 @@ static uint8_t *xen_replace_cache_entry_unlocked(MapCache *mc,
> > > > > >
> > > > > >      trace_xen_replace_cache_entry_dummy(old_phys_addr, new_phys_addr);
> > > > > >
> > > > > > -    xen_remap_bucket(mapcache, entry, entry->vaddr_base,
> > > > > > -                     cache_size, address_index, false);
> > > > > > +    xen_remap_bucket(mc, entry, entry->vaddr_base,
> > > > > > +                     cache_size, address_index, false, entry->ram_offset);
> > > > > >      if (!test_bits(address_offset >> XC_PAGE_SHIFT,
> > > > > >                  test_bit_size >> XC_PAGE_SHIFT,
> > > > > >                  entry->valid_mapping)) {
> > > > > > diff --git a/include/sysemu/xen-mapcache.h b/include/sysemu/xen-mapcache.h
> > > > > > index 1ec9e66752..b5e3ea1bc0 100644
> > > > > > --- a/include/sysemu/xen-mapcache.h
> > > > > > +++ b/include/sysemu/xen-mapcache.h
> > > > > > @@ -19,6 +19,7 @@ typedef hwaddr (*phys_offset_to_gaddr_t)(hwaddr phys_offset,
> > > > > >  void xen_map_cache_init(phys_offset_to_gaddr_t f,
> > > > > >                          void *opaque);
> > > > > >  uint8_t *xen_map_cache(MemoryRegion *mr, hwaddr phys_addr, hwaddr size,
> > > > > > +                       ram_addr_t ram_addr_offset,
> > > > > >                         uint8_t lock, bool dma,
> > > > > >                         bool is_write);
> > > > > >  ram_addr_t xen_ram_addr_from_mapcache(void *ptr);
> > > > > > @@ -37,6 +38,7 @@ static inline void xen_map_cache_init(phys_offset_to_gaddr_t f,
> > > > > >  static inline uint8_t *xen_map_cache(MemoryRegion *mr,
> > > > > >                                       hwaddr phys_addr,
> > > > > >                                       hwaddr size,
> > > > > > +                                     ram_addr_t ram_addr_offset,
> > > > > >                                       uint8_t lock,
> > > > > >                                       bool dma,
> > > > > >                                       bool is_write)
> > > > > > diff --git a/system/physmem.c b/system/physmem.c
> > > > > > index 1a5ffcba2a..5b16eeccca 100644
> > > > > > --- a/system/physmem.c
> > > > > > +++ b/system/physmem.c
> > > > > > @@ -2228,13 +2228,13 @@ static void *qemu_ram_ptr_length(RAMBlock *block, ram_addr_t addr,
> > > > > >           * In that case just map the requested area.
> > > > > >           */
> > > > > >          if (xen_mr_is_memory(block->mr)) {
> > > > > > -            return xen_map_cache(block->mr, addr, len, lock, lock,
> > > > > > -                                 is_write);
> > > > > > +            return xen_map_cache(block->mr, addr, len, block->offset,
> > > > > > +                                 lock, lock, is_write);
> > > > >
> > > > > Have you considered not tracking offset and address separately and
> > > > > simply do this?
> > > > >
> > > > >             return xen_map_cache(block->mr, addr + block->offset, len,
> > > > >                                  lock, lock, is_write);
> > > > >
> > > >
> > > > Unfortunately this won't work since block->offset is not related to where this
> > > > ram is mapped in guest address-space. In the case of grant's, we'd get the
> > > > wrong grant ref. See my previous comment.
> > >
> > > OK, this code below (the second xen_map_cache call passing block->offset
> > > as start address) was wrong before this patch. Can we fix it before
> > > changing it further with this patch? I worry about making things even
> > > worse.
> > >
> >
> > I'll dig around and see if we can find something that explains more.
> > There's some older code that implements some sort of address-translation
> > for x86 between ram_addr space and guest physical addresses but
> > that code is turned off with newer Xen versions (disabled in my build).
> >
> > https://github.com/qemu/qemu/blob/master/hw/xen/xen-mapcache.c#L330
> > https://github.com/qemu/qemu/blob/master/hw/i386/xen/xen-hvm.c#L193
>
> I don't have any more insights but I think Xenia might have a better
> idea as she has fixed bugs related to this to get virtio-gpu to work.

Hi again,

I found the reason this works today is because for RAM's created after
xen_memory,
QEMU hypercalls into Xen to populate the guest memory map with RAM at the given
ram_offset.
https://github.com/qemu/qemu/blob/master/hw/xen/xen-hvm-common.c#L44

The current grant series does not affect this current behaviour but I
think it would be
good to make the offset handling a little more explicit by adding the
addr+ram_offset
in the map-cache (except for the special grant region) and always
passing them in
the same argument slots. I'll try to improve this in the next version
and add some
comments and we can discuss from there.

Cheers,
Edgar


  reply	other threads:[~2024-05-07 17:20 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-30 16:49 [PATCH v4 00/17] xen: Support grant mappings Edgar E. Iglesias
2024-04-30 16:49 ` [PATCH v4 01/17] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length() Edgar E. Iglesias
2024-05-01 15:56   ` David Hildenbrand
2024-05-01 16:48   ` Peter Xu
2024-04-30 16:49 ` [PATCH v4 02/17] xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry Edgar E. Iglesias
2024-04-30 16:49 ` [PATCH v4 03/17] xen: mapcache: Refactor lock functions for multi-instance Edgar E. Iglesias
2024-05-01 20:39   ` Stefano Stabellini
2024-05-06  9:52   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 04/17] xen: mapcache: Refactor xen_map_cache " Edgar E. Iglesias
2024-05-01 20:40   ` Stefano Stabellini
2024-05-06  9:53   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 05/17] xen: mapcache: Refactor xen_remap_bucket " Edgar E. Iglesias
2024-05-01 20:42   ` Stefano Stabellini
2024-05-06  9:54   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 06/17] xen: mapcache: Break out xen_ram_addr_from_mapcache_single Edgar E. Iglesias
2024-05-01 20:43   ` Stefano Stabellini
2024-05-06 10:22   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 07/17] xen: mapcache: Refactor xen_replace_cache_entry_unlocked Edgar E. Iglesias
2024-05-01 20:46   ` Stefano Stabellini
2024-05-02  6:32     ` Edgar E. Iglesias
2024-05-06 10:21       ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 08/17] xen: mapcache: Refactor xen_invalidate_map_cache_entry_unlocked Edgar E. Iglesias
2024-05-01 20:47   ` Stefano Stabellini
2024-05-06  9:55   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 09/17] xen: mapcache: Break out xen_invalidate_map_cache_single() Edgar E. Iglesias
2024-05-01 20:48   ` Stefano Stabellini
2024-05-06 10:21   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 10/17] xen: mapcache: Break out xen_map_cache_init_single() Edgar E. Iglesias
2024-05-01 20:51   ` Stefano Stabellini
2024-04-30 16:49 ` [PATCH v4 11/17] xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable Edgar E. Iglesias
2024-05-01 20:55   ` Stefano Stabellini
2024-04-30 16:49 ` [PATCH v4 12/17] xen: mapcache: Unmap first entries in buckets Edgar E. Iglesias
2024-05-01 21:01   ` Stefano Stabellini
2024-05-02  7:34   ` Edgar E. Iglesias
2024-04-30 16:49 ` [PATCH v4 13/17] softmmu: Pass RAM MemoryRegion and is_write xen_map_cache() Edgar E. Iglesias
2024-05-01 16:48   ` Peter Xu
2024-05-01 21:03   ` Stefano Stabellini
2024-05-02  7:24   ` David Hildenbrand
2024-05-02  7:31     ` Edgar E. Iglesias
2024-05-06  9:56   ` Philippe Mathieu-Daudé
2024-04-30 16:49 ` [PATCH v4 14/17] xen: Add xen_mr_is_memory() Edgar E. Iglesias
2024-05-01 21:06   ` Stefano Stabellini
2024-05-02  7:26   ` David Hildenbrand
2024-05-06  9:59     ` Philippe Mathieu-Daudé
2024-05-06 13:26       ` Edgar E. Iglesias
2024-04-30 16:49 ` [PATCH v4 15/17] xen: mapcache: Remove assumption of RAMBlock with 0 offset Edgar E. Iglesias
2024-05-01 21:24   ` Stefano Stabellini
2024-05-02  7:22     ` Edgar E. Iglesias
2024-05-02 18:53       ` Stefano Stabellini
2024-05-02 19:42         ` Edgar E. Iglesias
2024-05-02 20:02           ` Stefano Stabellini
2024-05-07 17:18             ` Edgar E. Iglesias [this message]
2024-04-30 16:49 ` [PATCH v4 16/17] xen: mapcache: Add support for grant mappings Edgar E. Iglesias
2024-05-02 19:18   ` Stefano Stabellini
2024-05-02 19:49     ` Edgar E. Iglesias
2024-04-30 16:49 ` [PATCH v4 17/17] hw/arm: xen: Enable use of " Edgar E. Iglesias
2024-05-01 22:11   ` Stefano Stabellini
2024-05-06 10:36 ` [PATCH v4 00/17] xen: Support " Philippe Mathieu-Daudé

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJy5ezoTVyP5Rs=zM3jdY2NKhwQqDuCWD9oG72m4Xsdc-kmc+A@mail.gmail.com' \
    --to=edgar.iglesias@gmail.com \
    --cc=Xenia.Ragiadakou@amd.com \
    --cc=anthony.perard@citrix.com \
    --cc=david@redhat.com \
    --cc=edgar.iglesias@amd.com \
    --cc=jgross@suse.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).