From: Salvatore Bonaccorso <carnil@debian.org>
To: Timo Lindfors <timo.lindfors@iki.fi>, 1054514@bugs.debian.org
Cc: Thomas Zimmermann <tzimmermann@suse.de>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org,
	Gerd Hoffmann <kraxel@redhat.com>,
	spice-devel@lists.freedesktop.org,
	Dave Airlie <airlied@redhat.com>,
	regressions@lists.linux.dev
Subject: Re: Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently
Date: Tue, 24 Oct 2023 23:09:10 +0200
Message-ID: <ZTgydqRlK6WX_b29@eldamar.lan>
In-Reply-To: <alpine.DEB.2.20.2310242308150.28457@mail.home>

Hi Timo,

On Tue, Oct 24, 2023 at 11:14:32PM +0300, Timo Lindfors wrote:
> Package: src:linux
> Version: 6.1.55-1
> Severity: normal
> 
> Steps to reproduce:
> 1) Install Debian 12 as a virtual machine using virt-manager, choose qxl
>    graphics card. You only need basic installation without wayland or X.
> 2) Log in from the console and save the following to reproduce.bash:
> 
> #!/bin/bash
> 
> chvt 3
> for j in $(seq 80); do
>     echo "$(date) starting round $j"
>     if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
>         echo "bug was reproduced after $j tries"
>         exit 1
>     fi
>     for i in $(seq 100); do
>         dmesg > /dev/tty3
>     done
> done
> 
> echo "bug could not be reproduced"
> exit 0
> 
> 
> 3) Run chmod a+x reproduce.bash
> 4) Run ./reproduce.bash and wait for up to 20 minutes.
> 
> Expected results:
> 4) The system prints a steady flow of text without kernel error messages
> 
> Actual messages:
> 4) At some point the text stops flowing and the script prints "bug was
>    reproduced". If you run "journalctl --boot" you see
> 
> kernel: [TTM] Buffer eviction failed
> kernel: qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
> kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
> 
> 
> 
> More info:
> 1) The bug does not occur if I downgrade the kernel to
>    linux-image-5.10.0-26-amd64_5.10.197-1_amd64.deb from Debian 11.
> 2) I used the following test_linux.bash to bisect this issue against
>    upstream source:
> 
> #!/bin/bash
> set -x
> 
> gitversion="$(git describe HEAD|sed 's@^v@@')"
> 
> git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
> git show bec771b5e0901f4b0bc861bcb58056de5151ae3a | patch -p1
> # Build
> cp ~/kernel.config .config
> # cp /boot/config-$(uname -r) .config
> # scripts/config --enable LOCALVERSION_AUTO
> # scripts/config --disable DEBUG_INFO
> # scripts/config --disable SYSTEM_TRUSTED_KEYRING
> # scripts/config --set-str SYSTEM_TRUSTED_KEYS ''
> # scripts/config --disable STACKPROTECTOR_STRONG
> make olddefconfig
> # make localmodconfig
> make -j$(nproc --all) bindeb-pkg
> rc="$?"
> if [ "$rc" != "0" ]; then
>     exit 125
> fi
> git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
> 
> package="$(ls --sort=time ../linux-image-*_amd64.deb|head -n1)"
> version=$(echo $package | cut -d_ -f1|cut -d- -f3-)
> 
> if [ "$gitversion" != "$version" ]; then
>     echo "Expected version $gitversion but build produced $version, ignoring"
>     #exit 255
> fi
> 
> # Deploy
> scp $package target:a.deb
> ssh target sudo apt install ./a.deb
> ssh target rm -f a.deb
> ssh target ./grub_set_default_version.bash $version
> ssh target sudo shutdown -r now
> sleep 40
> 
> detected_version=$(ssh target uname -r)
> if [ "$detected_version" != "$version" ]; then
>     echo "Booted to $detected_version but expected $version"
>     exit 255
> fi
> 
> # Test
> exec ssh target sudo ./reproduce.bash
> 
> 
> Bisect printed the following log:
> 
> git bisect start
> # bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
> git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
> # bad: [762949bb1da78941b25e63f7e952af037eee15a9] drm: fix drm_mode_create_blob comment
> git bisect bad 762949bb1da78941b25e63f7e952af037eee15a9
> # bad: [e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc] drm/gma500: Drop DRM_GMA600 config option
> git bisect bad e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc
> # bad: [5a838e5d5825c85556011478abde708251cc0776] drm/qxl: simplify qxl_fence_wait
> git bisect bad 5a838e5d5825c85556011478abde708251cc0776
> # bad: [d2b6f8a179194de0ffc4886ffc2c4358d86047b8] Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
> git bisect bad d2b6f8a179194de0ffc4886ffc2c4358d86047b8
> # bad: [68a32ba14177d4a21c4a9a941cf1d7aea86d436f] Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm
> git bisect bad 68a32ba14177d4a21c4a9a941cf1d7aea86d436f
> # bad: [0698b13403788a646073fcd9b2294f2dce0ce429] drm/amdgpu: skip PP_MP1_STATE_UNLOAD on aldebaran
> git bisect bad 0698b13403788a646073fcd9b2294f2dce0ce429
> # bad: [e1a5e6a8c48bf99ea374fb3e535661cfe226bca4] drm/doc: Add RFC section
> git bisect bad e1a5e6a8c48bf99ea374fb3e535661cfe226bca4
> # bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
> git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
> # bad: [2c8ab3339e398bbbcb0980933e266b93bedaae52] drm/i915: Pin timeline map after first timeline pin, v4.
> git bisect bad 2c8ab3339e398bbbcb0980933e266b93bedaae52
> # bad: [2eb8e1a69d9f8cc9c0a75e327f854957224ba421] drm/i915/gem: Drop relocation support on all new hardware (v6)
> git bisect bad 2eb8e1a69d9f8cc9c0a75e327f854957224ba421
> # bad: [b5b6f6a610127b17f20c0ca03dd27beee4ddc2b2] drm/i915/gem: Drop legacy execbuffer support (v2)
> git bisect bad b5b6f6a610127b17f20c0ca03dd27beee4ddc2b2
> # bad: [06debd6e1b28029e6e77c41e59a162868f377897] Merge tag 'drm-intel-next-2021-03-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
> git bisect bad 06debd6e1b28029e6e77c41e59a162868f377897
> # good: [e19eede54240d64b4baf9b0df4dfb8191f7ae48b] Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
> git bisect good e19eede54240d64b4baf9b0df4dfb8191f7ae48b
> # good: [1e28eed17697bcf343c6743f0028cc3b5dd88bf0] Linux 5.12-rc3
> git bisect good 1e28eed17697bcf343c6743f0028cc3b5dd88bf0
> # bad: [6af70eb3b40edfc8bdf2373cdc2bcf9d5a20c8c7] drm/atmel-hlcdc: Rename custom plane state variable
> git bisect bad 6af70eb3b40edfc8bdf2373cdc2bcf9d5a20c8c7
> # good: [4ca77c513537700d3fae69030879f781dde1904c] drm/qxl: release shadow on shutdown
> git bisect good 4ca77c513537700d3fae69030879f781dde1904c
> # bad: [4a11bd1e88af130f50a72e0f54391c1c7d268e03] drm/ast: Add constants for VGACRCB register bits
> git bisect bad 4a11bd1e88af130f50a72e0f54391c1c7d268e03
> # bad: [5c209d8056b9763ce544ecd7dadb3782cdaf96ed] drm/gma500: psb_spank() doesn't need it's own file
> git bisect bad 5c209d8056b9763ce544ecd7dadb3782cdaf96ed
> # bad: [db0c6bd2c0c0dada8927cd46a7c34c316a3a6c04] drm/gem: Export drm_gem_vmap() and drm_gem_vunmap()
> git bisect bad db0c6bd2c0c0dada8927cd46a7c34c316a3a6c04
> # bad: [f4a84e165e6d58606097dd07b5b78767a94b870c] drm/qxl: allocate dumb buffers in ram
> git bisect bad f4a84e165e6d58606097dd07b5b78767a94b870c
> # good: [a7709b9b89a67f3ead2d188b1d0c261059b1f291] drm/qxl: handle shadow in primary destroy
> git bisect good a7709b9b89a67f3ead2d188b1d0c261059b1f291
> # bad: [5a838e5d5825c85556011478abde708251cc0776] drm/qxl: simplify qxl_fence_wait
> git bisect bad 5a838e5d5825c85556011478abde708251cc0776
> # good: [5f6c871fe919999774e8535ea611a6f84ee43ee4] drm/qxl: properly free qxl releases
> git bisect good 5f6c871fe919999774e8535ea611a6f84ee43ee4
> # first bad commit: [5a838e5d5825c85556011478abde708251cc0776] drm/qxl: simplify qxl_fence_wait
> 
> I took a look at
> 
> commit 5a838e5d5825c85556011478abde708251cc0776 (refs/bisect/bad)
> Author: Gerd Hoffmann <kraxel@redhat.com>
> Date:   Thu Feb 4 15:57:10 2021 +0100
> 
>     drm/qxl: simplify qxl_fence_wait
> 
>     Now that we have the new release_event wait queue we can just
>     use that in qxl_fence_wait() and simplify the code a lot.
> 
>     Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
>     Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
>     Link: http://patchwork.freedesktop.org/patch/msgid/20210204145712.1531203-10-kraxel@redhat.com
> 
> 
> and noticed that the bug does not occur if I boot a 6.1 kernel with this
> patch reverted (see attached file).
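
For readers following the quoted test_linux.bash: its exit codes (125 on a
failed build, 255 on a wrong boot, otherwise the status of reproduce.bash)
line up with the way "git bisect run" documents its handling of a script's
exit status. A minimal sketch of that mapping follows; the function name
bisect_verdict is only illustrative and is not part of the report.

```shell
#!/bin/sh
# Map an exit status to the verdict "git bisect run" derives from it:
# 0 = good, 125 = skip, 1..127 (except 125) = bad, anything else aborts.
bisect_verdict() {
    if [ "$1" -eq 0 ]; then
        echo "good"
    elif [ "$1" -eq 125 ]; then
        echo "skip"
    elif [ "$1" -ge 1 ] && [ "$1" -le 127 ]; then
        echo "bad"
    else
        echo "abort"
    fi
}

# The statuses used by test_linux.bash and reproduce.bash.
for status in 0 1 125 255; do
    echo "exit $status -> $(bisect_verdict "$status")"
done
```

So a build failure skips the commit, while a mismatch between the booted and
the expected kernel stops the bisection instead of recording a wrong verdict.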

Thanks for the excellently constructed report! I think it's best to
forward this directly to upstream, including the people involved in the
bisected commit, to get their input.

Can you reproduce the issue with 6.5.8-1 in unstable as well?
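
A rough sketch of how that check could look on the guest after booting the
newer kernel. It reuses the error string from the quoted reproduce.bash; the
helper name qxl_failure_seen is made up for illustration.

```shell
#!/bin/sh
# Succeeds (status 0) if the log text on stdin contains the failure
# signature that reproduce.bash looks for.
qxl_failure_seen() {
    grep -q -F "failed to allocate VRAM BO"
}

# Self-contained demonstration using a line from the quoted journal output.
sample='kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO'
if printf '%s\n' "$sample" | qxl_failure_seen; then
    echo "signature found"
fi

# Intended use on the guest after running reproduce.bash (not executed here):
#   uname -r
#   journalctl --boot | qxl_failure_seen && echo "still reproducible"
```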

If not, are you able to isolate the upstream fix, so that it can be
backported to the 6.1.y series as well?
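
One possible way to hunt for such a fix is a bisection with custom terms,
where the old state is "broken" and the new state is "fixed". Since
"git bisect run" treats exit status 0 as the old term, the status of
reproduce.bash (1 = reproduced, 0 = not reproduced) has to be swapped. The
sketch below assumes a hypothetical wrapper; invert_for_fix_hunt and
hunt_fix.bash are illustrative names and the revisions are placeholders.

```shell
#!/bin/sh
# Run a command and swap "reproduced" (1) with "not reproduced" (0).
# Every other status is passed through unchanged, so 125 still means
# "skip" and statuses above 127 still abort the bisection.
invert_for_fix_hunt() {
    "$@"
    status=$?
    case "$status" in
        0) return 1 ;;
        1) return 0 ;;
        *) return "$status" ;;
    esac
}

# Self-contained demonstration: "false" stands in for a run that
# reproduced the bug (status 1), which must count as the old, broken state.
invert_for_fix_hunt false && echo "bug reproduced -> reported as broken (old)"

# Hypothetical session (not executed here):
#   git bisect start --term-old=broken --term-new=fixed
#   git bisect broken <known-broken-rev>
#   git bisect fixed <known-fixed-rev>
#   git bisect run ./hunt_fix.bash   # would call: invert_for_fix_hunt ./test_linux.bash
```

Statuses between 2 and 124 are passed through and would count as "fixed", so
the wrapped script should only return 0, 1, 125 or a value above 127.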

Regards,
Salvatore
