On Mon, May 06, 2024 at 07:50:57PM GMT, Laurent Pinchart wrote:
> On Mon, May 06, 2024 at 10:57:17AM -0400, Sean Anderson wrote:
> > On 5/6/24 03:35, Laurent Pinchart wrote:
> > > On Mon, May 06, 2024 at 09:29:36AM +0200, Maxime Ripard wrote:
> > >> Hi Laurent, Sean,
> > >>
> > >> On Sat, May 04, 2024 at 03:21:18PM GMT, Laurent Pinchart wrote:
> > >> > On Fri, May 03, 2024 at 05:54:32PM -0400, Sean Anderson wrote:
> > >> > > I have discovered a bug in the displayport driver on drm-misc-next. To
> > >> > > trigger it, run
> > >> > >
> > >> > > echo fd4a0000.display > /sys/bus/platform/drivers/zynqmp-dpsub/unbind
> > >> > >
> > >> > > The system will become unresponsive and (after a bit) splat with a hard
> > >> > > LOCKUP. One core will be unresponsive at the first zynqmp_dp_read in
> > >> > > zynqmp_dp_bridge_detect.
> > >> > >
> > >> > > I believe the issue is due the registers being unmapped and the block
> > >> > > put into reset in zynqmp_dp_remove instead of zynqmp_dpsub_release.
> > >> >
> > >> > That is on purpose. Drivers are not allowed to access the device at all
> > >> > after .remove() returns.
> > >>
> > >> It's not "on purpose" no. Drivers indeed are not allowed to access the
> > >> device after remove, but the kernel shouldn't crash. This is exactly
> > >> why we have drm_dev_enter / drm_dev_exit.
> > >
> > > I didn't mean the crash was on purpose :-) It's the registers being
> > > unmapped that is, as nothing should touch those registers after
> > > .remove() returns.
> >
> > OK, so then we need to have some kind of flag in the driver or in the drm
> > subsystem so we know not to access those registers.
>
> To avoid race conditions, the .remove() function should mark the device
> as removed, wait for all ongoing access from userspace to be complete,
> and then proceed to unmapping registers and doing other cleanups.
> Userspace may still have open file descriptors to the device at that
> point. Any new userspace access should be disallowed (by checking the
> removed flag), with the only userspace-initiated operations that still
> need to run being the release-related operations (unmapping memory,
> closing file descriptors, ...).

And for the record, this is exactly what drm_dev_unplug and
drm_dev_enter/drm_dev_exit do.

Maxime