From: Lee Yarwood <1929710@bugs.launchpad.net>
To: qemu-devel@nongnu.org
Subject: [Bug 1929710] Re: virDomainGetBlockJobInfo fails during swap_volume as disk '$disk' not found in domain
Date: Thu, 10 Jun 2021 13:59:44 -0000	[thread overview]
Message-ID: <162333358438.28173.7908886005390234544.launchpad@wampee.canonical.com>
In-Reply-To: 162203621182.4387.4299404156046469363.malonedeb@gac.canonical.com

** No longer affects: qemu

** Changed in: nova
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1929710

Title:
  virDomainGetBlockJobInfo fails during swap_volume as disk '$disk' not
  found in domain

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===========

  The error handling around swap_volume misses the following failure
  when calling virDomainGetBlockJobInfo(): the entire device appears to
  be detached by QEMU (?) after it hits a failure during the block copy
  job, which first pauses and then somehow resumes:

  https://8a5fc27780098c5ee1bc-3ac81d180a9c011938b2cbb0293272f3.ssl.cf5.rackcdn.com/790660/5/gate/nova-next/e915ed4/controller/logs/screen-n-cpu.txt

  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver [None req-7cfcd661-29d4-4cc3-bc54-db0e7fed1a6e tempest-TestVolumeSwap-1841575704 tempest-TestVolumeSwap-1841575704-project-admin] Failure rebasing volume /dev/sdb on vdb.: libvirt.libvirtError: invalid argument: disk 'vdb' not found in domain
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver Traceback (most recent call last):
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2107, in _swap_volume
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     while not dev.is_job_complete():
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 800, in is_job_complete
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     status = self.get_job_info()
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 707, in get_job_info
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     status = self._guest._domain.blockJobInfo(self._disk, flags=0)
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 190, in doit
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     result = proxy_call(self._autowrap, f, *args, **kwargs)
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 148, in proxy_call
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     rv = execute(f, *args, **kwargs)
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 129, in execute
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     six.reraise(c, e, tb)
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/usr/local/lib/python3.8/dist-packages/six.py", line 719, in reraise
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     raise value
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 83, in tworker
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     rv = meth(*args, **kwargs)
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver   File "/usr/local/lib/python3.8/dist-packages/libvirt.py", line 985, in blockJobInfo
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver     raise libvirtError('virDomainGetBlockJobInfo() failed')
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver libvirt.libvirtError: invalid argument: disk 'vdb' not found in domain
  May 26 09:49:47.314813 ubuntu-focal-vexxhost-ca-ymq-1-0024823853 nova-compute[114649]: ERROR nova.virt.libvirt.driver
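  A minimal sketch of how the polling loop around is_job_complete() could
  tolerate the disk vanishing mid-copy instead of leaking the exception.
  This is hypothetical, not the actual nova fix: FakeLibvirtError and
  FakeDev are stand-ins for the real libvirt bindings, and only the error
  code constant mirrors libvirt's VIR_ERR_INVALID_ARG:

```python
VIR_ERR_INVALID_ARG = 8  # libvirt's "invalid argument" error code


class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError (hypothetical, for illustration)."""
    def __init__(self, msg, code):
        super().__init__(msg)
        self._code = code

    def get_error_code(self):
        return self._code


class FakeDev:
    """Stand-in block device whose job query fails like in this bug."""
    def is_job_complete(self):
        # Mirrors virDomainGetBlockJobInfo() failing once QEMU has
        # removed the device from the domain.
        raise FakeLibvirtError("disk 'vdb' not found in domain",
                               VIR_ERR_INVALID_ARG)


def wait_for_job(dev):
    """Poll the copy job; report failure instead of raising to the caller."""
    try:
        while not dev.is_job_complete():
            pass
    except FakeLibvirtError as exc:
        if exc.get_error_code() == VIR_ERR_INVALID_ARG:
            # The disk is gone from the domain: treat the swap as failed.
            return False
        raise
    return True


print(wait_for_job(FakeDev()))
```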

  https://zuul.opendev.org/t/openstack/build/e915ed4aeb9346bba83910bd79e9502b/log/controller/logs/libvirt/libvirtd_log.txt

  2021-05-26 09:49:40.189+0000: 79419: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7fc4bc07e7d0 msg={"execute":"blockdev-add","arguments":{"node-name":"libvirt-4-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-4-storage"},"id":"libvirt-375"}

  2021-05-26 09:49:46.154+0000: 79422: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7fc4bc07e7d0 msg={"execute":"blockdev-add","arguments":{"node-name":"libvirt-5-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-5-storage"},"id":"libvirt-379"}

  2021-05-26 09:49:46.165+0000: 79422: debug : qemuMonitorBlockdevMirror:3112 : jobname=copy-vdb-libvirt-4-format, persistjob=1, device=libvirt-4-format, target=libvirt-5-format, bandwidth=0, granularity=0, buf_size=0, shallow=0
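  For reference, the qemuMonitorBlockdevMirror parameters logged above
  correspond roughly to a QMP blockdev-mirror command of this shape
  (reconstructed from the logged values, with shallow=0 mapped to
  sync=full; not the exact message libvirt sent):

```json
{
  "execute": "blockdev-mirror",
  "arguments": {
    "job-id": "copy-vdb-libvirt-4-format",
    "device": "libvirt-4-format",
    "target": "libvirt-5-format",
    "sync": "full"
  }
}
```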

  2021-05-26 09:49:46.167+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'created'(1)

  2021-05-26 09:49:46.167+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'running'(2)

  2021-05-26 09:49:46.763+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'paused'(3)

  2021-05-26 09:49:46.763+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'running'(2)

  2021-05-26 09:49:46.841+0000: 79417: debug : qemuProcessHandleDeviceDeleted:1362 : Device virtio-disk1 removed from domain 0x7fc4b416b0e0 instance-0000000b

  2021-05-26 09:49:47.457+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'aborting'(8)

  2021-05-26 09:49:47.458+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'concluded'(9)

  2021-05-26 09:49:47.459+0000: 79417: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-4-format'(domain: 0x7fc4b416b0e0,instance-0000000b) state changed to 'null'(11)
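  The state transitions above can be summarized as follows. This is a
  readable restatement only; the (state, code) pairs are copied from the
  log, and the list is just the sequence observed here, not libvirt's
  full job-state enum:

```python
# Job lifecycle observed in the libvirtd log for copy-vdb-libvirt-4-format.
observed = [
    ("created", 1),
    ("running", 2),
    ("paused", 3),
    ("running", 2),
    # <-- DEVICE_DELETED fires here: virtio-disk1 is removed from the domain
    ("aborting", 8),
    ("concluded", 9),
    ("null", 11),
]

names = [name for name, _ in observed]
# The ordering that matters for the bug: the guest device disappears while
# the job is nominally 'running' again, before the job starts aborting.
assert names[-3:] == ["aborting", "concluded", "null"]
print(" -> ".join(names))
```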

  Steps to reproduce
  ==================

  $ cat queries/virDomainGetBlockJobInfo.yaml
  query: >
   message:"virDomainGetBlockJobInfo() failed" AND
   tags:"screen-n-cpu.txt"

  $ elastic-recheck-query queries/virDomainGetBlockJobInfo.yaml
  total hits: 6
  build_branch
    100% master
  build_change
    50% 786588
    50% 792322
  build_hostids
    50% 1b47a855be51bba01ac6d5e6fdc4859bc17ebe2c8faaeb83392f8ff3 79fb0487675c0137b7ac30f24b5de71c70afb836e46746de770fa0c0
    50% 33381c047c348ffefebf6b10cb7f0473c2359757d0bf11cc101eec54 33381c047c348ffefebf6b10cb7f0473c2359757d0bf11cc101eec54
  build_name
    100% nova-next
  build_node
    100% ubuntu-focal
  build_queue
    100% check
  build_status
    100% FAILURE
  build_zuul_url
    100% N/A
  filename
    100% controller/logs/screen-n-cpu.txt
  log_url
    50% https://89bc735e8a094e3d60b7-4f6db7cd5400cfa66e1c80fde6bd4076.ssl.cf1.rackcdn.com/792322/1/check/nova-next/de697b4/controller/logs/screen-n-cpu.txt
    50% https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_035/786588/6/check/nova-next/0357703/controller/logs/screen-n-cpu.txt
  loglevel
    100% ERROR
  module
    33% nova.compute.manager
    33% nova.virt.libvirt.driver
    33% oslo_messaging.rpc.server
  node_provider
    50% ovh-bhs1
    50% rax-iad
  port
    50% 48014
    50% 58238
  project
    100% openstack/nova
  syslog_pid
    50% 107528
    50% 108261
  syslog_program
    50% ubuntu-focal-ovh-bhs1-0024748800 nova-compute
    50% ubuntu-focal-rax-iad-0024745546 nova-compute
  tags
    100% screen-n-cpu.txt screen oslofmt
  voting
    100% 1
  zuul_attempts
    100% 1
  zuul_executor
    50% ze01.opendev.org
    50% ze07.opendev.org

  Expected result
  ===============
  swap_volume at least fails gracefully, leaving the original device attached.

  Actual result
  =============
  swap_volume fails and the original device appears to be detached from the domain.

  Environment
  ===========
  1. Exact version of OpenStack you are running. See the following
    list for all releases: http://docs.openstack.org/releases/

     master

  2. Which hypervisor did you use?
     (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
     What's the version of that?

     libvirt + QEMU (no KVM in the gate)

  3. Which storage type did you use?
     (For example: Ceph, LVM, GPFS, ...)
     What's the version of that?

     images_type=default=qcow2

  4. Which networking type did you use?
     (For example: nova-network, Neutron with OpenVSwitch, ...)

     N/A

  Logs & Configs
  ==============

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1929710/+subscriptions



Thread overview: 4+ messages
     [not found] <162203621182.4387.4299404156046469363.malonedeb@gac.canonical.com>
2021-05-27  8:21 ` [Bug 1929710] Re: virDomainGetBlockJobInfo fails during swap_volume as disk '$disk' not found in domain Lee Yarwood
2021-05-27 15:42 ` melanie witt
2021-05-28 19:11 ` [Bug 1929710] Related fix merged to nova (master) OpenStack Infra
2021-06-10 13:59 ` Lee Yarwood [this message]
