From: Nir Soffer <nsoffer@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-block <qemu-block@nongnu.org>,
Markus Armbruster <armbru@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
Max Reitz <mreitz@redhat.com>,
libguestfs@redhat.com
Subject: Re: [PATCH 2/2] nbd: Add new qemu:joint-allocation metadata context
Date: Sat, 12 Jun 2021 02:39:44 +0300 [thread overview]
Message-ID: <CAMRbyyuyFf4F9S6+rW2j+YPQyV3PECifq1_4S6mQ+8V2hREsKA@mail.gmail.com> (raw)
In-Reply-To: <20210609180118.1003774-3-eblake@redhat.com>
On Wed, Jun 9, 2021 at 9:01 PM Eric Blake <eblake@redhat.com> wrote:
>
> When trying to reconstruct a qcow2 chain using information provided
> over NBD, ovirt had been relying on an unsafe assumption that any
> portion of the qcow2 file advertised as sparse would defer to the
> backing image; this worked with what qemu 5.2 reports for a qcow2 BSD
> loaded with "backing":null. However, in 6.0, commit 0da9856851 (nbd:
> server: Report holes for raw images) also had a side-effect of
> reporting unallocated zero clusters in qcow2 files as sparse. This
> change is correct from the NBD spec perspective (advertising bits has
> always been optional based on how much information the server has
> available, and should only be used to optimize behavior when a bit is
> set, while not assuming semantics merely because a bit is clear), but
> means that a qcow2 file that uses an unallocated zero cluster to
> override a backing file now shows up as sparse over NBD, and causes
> ovirt to fail to reproduce that cluster (ie. ovirt was assuming it
> only had to write clusters where the bit was clear, and the 6.0
> behavior change shows the flaw in that assumption).
>
> The correct fix is for ovirt to additionally use the
> qemu:allocation-depth metadata context added in 5.2: after all, the
> actual determination for what is needed to recreate a qcow2 file is
> not whether a cluster is sparse, but whether the allocation-depth
> shows the cluster to be local. But reproducing an image is more
> efficient when handling known-zero clusters, which means that ovirt
> has to track both base:allocation and qemu:allocation-depth metadata
> contexts simultaneously. While NBD_CMD_BLOCK_STATUS is just fine
> sending back information for two contexts in parallel, it comes with
> some bookkeeping overhead at the client side: the two contexts need
> not report the same length of replies, and it involves more network
> traffic.
Since this change is not simple, and the chance that we also get the dirty
bitmap included in the result seems to be very low, I decided to check the
direction of merging multiple extents.
I started with merging "base:allocation" and "qemu:dirty-bitmap:xxx" since
we already have both. It was not hard to do, although it is not completely
tested yet.
Here is the merging code:
https://gerrit.ovirt.org/c/ovirt-imageio/+/115216/1/daemon/ovirt_imageio/_internal/nbdutil.py
To make merging easy and safe, we map the NBD_STATE_DIRTY bit to a private bit
so it cannot clash with the NBD_STATE_HOLE bit:
https://gerrit.ovirt.org/c/ovirt-imageio/+/115215/1/daemon/ovirt_imageio/_internal/nbd.py
Here is a functional test using qemu-nbd showing that it works:
https://gerrit.ovirt.org/c/ovirt-imageio/+/115216/1/daemon/test/client_test.py
I'll try to use "qemu:allocation-depth" in a similar way next week, probably
mapping depth > 0 to EXTENT_EXISTS, to use when reporting holes in
single qcow2 images.
If this is successful, we can start using this in the next ovirt release, and we
don't need "qemu:joint-allocation".
Nir
next prev parent reply other threads:[~2021-06-11 23:41 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-09 18:01 [RFC PATCH 0/2] New NBD metacontext Eric Blake
2021-06-09 18:01 ` [PATCH 1/2] iotests: Improve and rename test 309 to nbd-qemu-allocation Eric Blake
2021-06-10 12:14 ` Vladimir Sementsov-Ogievskiy
2021-06-09 18:01 ` [PATCH 2/2] nbd: Add new qemu:joint-allocation metadata context Eric Blake
2021-06-09 23:52 ` Nir Soffer
2021-06-10 12:30 ` Vladimir Sementsov-Ogievskiy
2021-06-10 13:47 ` Eric Blake
2021-06-10 14:10 ` Vladimir Sementsov-Ogievskiy
2021-06-10 13:16 ` Nir Soffer
2021-06-10 14:04 ` Eric Blake
2021-06-10 14:43 ` Vladimir Sementsov-Ogievskiy
2021-06-10 13:31 ` Eric Blake
2021-06-11 23:39 ` Nir Soffer [this message]
2021-06-14 13:56 ` Eric Blake
2021-06-14 14:06 ` Nir Soffer
2021-06-09 21:31 ` [RFC libnbd PATCH] info: Add support for new qemu:joint-allocation Eric Blake
2021-06-09 22:20 ` Nir Soffer
2021-06-10 13:06 ` Eric Blake
2021-06-10 13:19 ` Nir Soffer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMRbyyuyFf4F9S6+rW2j+YPQyV3PECifq1_4S6mQ+8V2hREsKA@mail.gmail.com \
--to=nsoffer@redhat.com \
--cc=armbru@redhat.com \
--cc=eblake@redhat.com \
--cc=kwolf@redhat.com \
--cc=libguestfs@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).