* [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
@ 2002-12-17  1:49 Ed Tomlinson
  2002-12-17  8:06 ` Paul P Komkoff Jr
  2002-12-17 12:33 ` Dave Jones
  0 siblings, 2 replies; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-17  1:49 UTC
  To: linux-kernel; +Cc: Dave Jones

I am getting the above message in 2.5.51, 52, and 52+bk current.  
Pci info follows:

oscar# lspci -vv
00:00.0 Host bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3] (rev 04)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR+
        Latency: 16
        Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 1.0
                Status: RQ=7 SBA+ 64bit- FW- Rate=x1,x2
                Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>


with a mga G400

01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04) (prog-if 00 [VGA])
        Subsystem: Matrox Graphics, Inc. Millennium G400 MAX/Dual Head 32Mb
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (4000ns min, 8000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at e8000000 (32-bit, prefetchable) [size=32M]
        Region 1: Memory at e4000000 (32-bit, non-prefetchable) [size=16K]
        Region 2: Memory at e5000000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at <unassigned> [disabled] [size=64K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [f0] AGP version 2.0
                Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
                Command: RQ=31 SBA+ AGP+ 64bit- FW- Rate=x1

What else would help to debug this?  The drm error above is all I find in the logs...

TIA
Ed Tomlinson




* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-17  1:49 Ed Tomlinson
@ 2002-12-17  8:06 ` Paul P Komkoff Jr
  2002-12-17 12:50   ` Ed Tomlinson
  2002-12-17 12:33 ` Dave Jones
  1 sibling, 1 reply; 18+ messages in thread
From: Paul P Komkoff Jr @ 2002-12-17  8:06 UTC
  To: linux-kernel

Replying to Ed Tomlinson:
> I am getting the above message in 2.5.51, 52, and 52+bk current.  
> Pci info follows:
> What else would help to debug this?  The drm error above is all I find in the logs...

If you mount devfs somewhere, do you also not find misc/agpgart inside?
:)))

And nothing about agp aperture in dmesg?

-- 
Paul P 'Stingray' Komkoff 'Greatest' Jr /// (icq)23200764 /// (http)stingr.net
  When you're invisible, the only one really watching you is you (my keychain)


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-17  1:49 Ed Tomlinson
  2002-12-17  8:06 ` Paul P Komkoff Jr
@ 2002-12-17 12:33 ` Dave Jones
  1 sibling, 0 replies; 18+ messages in thread
From: Dave Jones @ 2002-12-17 12:33 UTC
  To: Ed Tomlinson; +Cc: linux-kernel, rusty

On Mon, Dec 16, 2002 at 08:49:16PM -0500, Ed Tomlinson wrote:
 > I am getting the above message in 2.5.51, 52, and 52+bk current.  
 > Pci info follows:
 > 
 > What else would help to debug this?  The drm error above is all I find in the logs...

There are a bunch of pending fixes at bk://linux-dj.bkbits.net/agpgart,
but nothing that should be relevant to this problem.
Are you using agpgart as modules? Which ones loaded ?
I've a feeling agpgart.ko loaded here, but not via-agp.ko

What needs to happen is when agpgart.ko is loaded, all the
chipset drivers also get pulled in as dependencies.
Since the new module stuff went in, I'm not sure how this
works, if it works at all.[1]

        Dave

[1] I'm yet another developer who has had a rough time with the
    new modules stuff. I'll try it again soon.

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-17  8:06 ` Paul P Komkoff Jr
@ 2002-12-17 12:50   ` Ed Tomlinson
  2002-12-20  1:30     ` Matt Bernstein
  0 siblings, 1 reply; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-17 12:50 UTC
  To: Paul P Komkoff Jr, linux-kernel, Dave Jones, Rusty Russell

Paul P Komkoff Jr wrote:

> Replying to Ed Tomlinson:
>> I am getting the above message in 2.5.51, 52, and 52+bk current.
>> Pci info follows:
>> What else would help to debug this?  The drm error above is all I find in
>> the logs...
> 
> If you mount devfs somewhere you also don't find misc/agpgart inside ?
> :)))
> 
> And nothing about agp aperture in dmesg?

Not normally.  If I modprobe via-agp, modprobe segfaults (one of Rusty's
bugs), but via_agp and agpgart get loaded (note that the - is changed to _
when the module is loaded; the file in the directory has a dash).  Doing it
this time gets an oops (52bk as of last night):

Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected VIA MVP3 chipset
Unable to handle kernel paging request at virtual address e0db9080
 printing eip:
e0db9080
*pde = 1ed12067
*pte = 00000000
Oops: 0000
CPU:    0
EIP:    0060:[<e0db9080>]    Not tainted
EFLAGS: 00010297
EIP is at 0xe0db9080
eax: 00000000   ebx: c15d5800   ecx: c0291b68   edx: 00000282
esi: 00000000   edi: c15d584c   ebp: c15d5800   esp: d93e5f08
ds: 0068   es: 0068   ss: 0068
Process modprobe (pid: 19122, threadinfo=d93e4000 task=df09aca0)
Stack: e0de816f c15d5800 c15d5800 e0dc81a1 c15d5800 e0dc84a8 c019f5c8 c15d5800
       e0dcd2ac c15d584c e0dc84a8 ffffffed e0dc84a8 e0dc8480 c01ac0af c15d584c
       c15d584c c15d5854 c02bd934 c01ac183 c15d584c e0dc84a8 c02bd840 e0dc83b8
Call Trace:
 [<e0de816f>] agp_register_driver+0x27/0x9c [agpgart]
 [<e0dc81a1>] agp_via_probe+0x35/0x3c [via_agp]
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<c019f5c8>] pci_device_probe+0x40/0x5c
 [<e0dcd2ac>] agp_via_pci_table+0x0/0x38 [via_agp]
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<e0dc8480>] agp_via_pci_driver+0x0/0xa0 [via_agp]
 [<c01ac0af>] bus_match+0x37/0x6c
 [<c01ac183>] driver_attach+0x37/0x60
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<e0dc83b8>] __module_pci_device_size+0x10/0x18 [via_agp]
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<c01ac44c>] bus_add_driver+0xa4/0xc4
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<e0dc84c8>] agp_via_pci_driver+0x48/0xa0 [via_agp]
 [<c01ac83c>] driver_register+0x34/0x38
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<c019f6c2>] pci_register_driver+0x42/0x50
 [<e0dc84a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
 [<e0dcd1af>] agp_via_init+0xb/0x44 [via_agp]
 [<e0dc8480>] agp_via_pci_driver+0x0/0xa0 [via_agp]
 [<c0125752>] sys_init_module+0x116/0x1a4
 [<c0108937>] syscall_call+0x7/0xb

Code:  Bad EIP value.

The module load occurred after X was started.

Ed Tomlinson



* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
       [not found] <20021218094714.43C712C076@lists.samba.org>
@ 2002-12-18 12:57 ` Ed Tomlinson
  2002-12-18 16:29   ` Dave Jones
       [not found] ` <200212181803.23279.tomlins@cam.org>
  1 sibling, 1 reply; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-18 12:57 UTC
  To: Rusty Russell, Dave Jones; +Cc: linux-kernel

On December 18, 2002 04:46 am, Rusty Russell wrote:
> In message <20021218102004.A4947@suse.de> you write:
> > On Wed, Dec 18, 2002 at 06:07:48PM +1100, Rusty Russell wrote:
> >  > Dave, it's true that it's my fault, but I'm afraid it looks like your
> >  > bug 8).  This is most likely karma for not reporting bugs in the
> >  > module code when you had problems 8)
> >
> > Bah, my bad. 8)
>
> That's OK, you're in good company as the first I heard from Alan was
> that he gave up kernel work because "modules was so broken". 8(
>
> >  > static int __init agp_backend_initialize(struct pci_dev *dev)
> >  >
> >  > ....
> >  > int agp_register_driver (struct pci_dev *dev)
> >  > {
> >  > ....
> >  > 	ret_val = agp_backend_initialize(dev);
> >
> > Whoops. Thanks, fixed up here.
> > I'll bet I duped that bug in other places too, as I've rejigged
> > stuff around. I'll double check those paths in a mo.
> >
> >  > Ed, does it work if you take all the __init out of the agp code?
> >
> > My money's on it working. The oops looked like it was jumping to oblivion
> > when it called agp_backend_initialize. So the new modutils discards
> > __init sections ? That's a new feature isn't it ?
>
> Yeah, kind of a bonus.  It's actually arch-dependent.
>
> I've added an item in my TODO list to check for relocations on the
> non-init section which point into the init sections.  It'd be cute.
> It'll probably never go into the main kernel, and it's not actually an
> error, you can imagine code which does:
>
> 	if (initializing)
> 		some_init_func();

Dave when you have this in a bk tree let me know and I will pull and 
verify it working here.

Thanks to both of you

Ed Tomlinson


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-18 12:57 ` [drm:drm_init] *ERROR* Cannot initialize the agpgart module Ed Tomlinson
@ 2002-12-18 16:29   ` Dave Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Jones @ 2002-12-18 16:29 UTC
  To: Ed Tomlinson; +Cc: Rusty Russell, linux-kernel

On Wed, Dec 18, 2002 at 07:57:53AM -0500, Ed Tomlinson wrote:
 > > >  > Ed, does it work if you take all the __init out of the agp code?
 > > > My money's on it working. The oops looked like it was jumping to oblivion
 > > > when it called agp_backend_initialize.
 > Dave when you have this in a bk tree let me know and I will pull and 
 > verify it working here.

bk://linux-dj.bkbits.net/agpgart

I've compile-tested it, but not booted it yet.
Scream if necessary.

	Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-17 12:50   ` Ed Tomlinson
@ 2002-12-20  1:30     ` Matt Bernstein
  2002-12-20  1:43       ` Randy.Dunlap
  0 siblings, 1 reply; 18+ messages in thread
From: Matt Bernstein @ 2002-12-20  1:30 UTC
  To: Ed Tomlinson; +Cc: Paul P Komkoff Jr, linux-kernel, Dave Jones, Rusty Russell

On Dec 17 Ed Tomlinson wrote:

>Not normally.  If I modprobe via-agp modprobe segfaults (a Rusty's bug),
>but via_agp and agpgart get loaded (note that - changed to _ when the module 
>is loaded - it has dash in file in the directory).  Doing it this time gets 
>an oops (52bk as of last night):
[snip]

I get a very similar oops, but with amd_k7_agp (2.5.52-mm2). I'm not 
bk-savvy as yet, but if pointed at a diff, would be happy to verify it.

Matt


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-20  1:30     ` Matt Bernstein
@ 2002-12-20  1:43       ` Randy.Dunlap
  2002-12-20  9:33         ` Dave Jones
  0 siblings, 1 reply; 18+ messages in thread
From: Randy.Dunlap @ 2002-12-20  1:43 UTC
  To: Matt Bernstein
  Cc: Ed Tomlinson, Paul P Komkoff Jr, linux-kernel, Dave Jones,
	Rusty Russell

On Fri, 20 Dec 2002, Matt Bernstein wrote:

| On Dec 17 Ed Tomlinson wrote:
|
| >Not normally.  If I modprobe via-agp modprobe segfaults (a Rusty's bug),
| >but via_agp and agpgart get loaded (note that - changed to _ when the module
| >is loaded - it has dash in file in the directory).  Doing it this time gets
| >an oops (52bk as of last night):
| [snip]
|
| I get a very similar oops, but with amd_k7_agp (2.5.52-mm2). I'm not
| bk-savvy as yet, but if pointed at a diff, would be happy to verify it.

2.5.zz kernel diff snapshots (from bk) are available at
  http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/
e.g., latest is:
  http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/patch-2.5.52-bk4.bz2

-- 
~Randy



* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-20  1:43       ` Randy.Dunlap
@ 2002-12-20  9:33         ` Dave Jones
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Jones @ 2002-12-20  9:33 UTC
  To: Randy.Dunlap
  Cc: Matt Bernstein, Ed Tomlinson, Paul P Komkoff Jr, linux-kernel,
	Rusty Russell

On Thu, Dec 19, 2002 at 05:43:14PM -0800, Randy.Dunlap wrote:
 > |
 > | I get a very similar oops, but with amd_k7_agp (2.5.52-mm2). I'm not
 > | bk-savvy as yet, but if pointed at a diff, would be happy to verify it.
 > 
 > 2.5.zz kernel diff snapshots (from bk) are available at
 >   http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/
 > e.g., latest is:
 >   http://www.kernel.org/pub/linux/kernel/v2.5/snapshots/patch-2.5.52-bk4.bz2

Latest AGP bits aren't in Linus' tree yet. A few more bits to nail
down, and then I'll ask him to pull again.

        Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
       [not found]   ` <20021219105909.GE29122@suse.de>
@ 2002-12-20 23:29     ` Ed Tomlinson
  2002-12-21 14:22       ` Dave Jones
  0 siblings, 1 reply; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-20 23:29 UTC
  To: Dave Jones; +Cc: linux-kernel

On December 19, 2002 05:59 am, Dave Jones wrote:
> On Wed, Dec 18, 2002 at 06:03:23PM -0500, Ed Tomlinson wrote:
>  > both with and without agp 3.0 enabled I get:
>  >
>  > Dec 18 17:51:10 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones
>  > Dec 18 17:51:29 oscar kernel: via_agp: Unknown symbol agp_generic_agp_3_0_enable
>
> I don't get this. Can you mail me your .config ?

Dave, with the pull from this morning (8am EST), it almost works modular.
I get:

Dec 20 18:20:19 oscar upsd[636]: Communication established
Dec 20 18:20:47 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones
Dec 20 18:20:47 oscar kernel: agpgart: Detected VIA MVP3 chipset
Dec 20 18:20:47 oscar kernel: agpgart: AGP aperture is 64M @ 0xe0000000
Dec 20 18:20:58 oscar kernel: [drm] Initialized mga 3.1.0 20021029 on minor 0
Dec 20 18:20:58 oscar kernel: Module agpgart cannot be unloaded due to unsafe usage in drivers/char/agp/backend.c:58

but find this in the X startup log.

(EE) MGA: Failed to load module "mga_hal" (module does not exist, 0)
(EE) MGA(0): [agp] Out of memory (-1014)
(EE) MGA(0): [drm] failed to remove DRM signal handler
DRIUnlock called when not locked

ideas?

Ed Tomlinson



* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-20 23:29     ` Ed Tomlinson
@ 2002-12-21 14:22       ` Dave Jones
  2002-12-22  4:41         ` Ed Tomlinson
  2002-12-23  1:10         ` Rusty Russell
  0 siblings, 2 replies; 18+ messages in thread
From: Dave Jones @ 2002-12-21 14:22 UTC
  To: Ed Tomlinson; +Cc: Rusty Russell, linux-kernel

On Fri, Dec 20, 2002 at 06:29:18PM -0500, Ed Tomlinson wrote:
 > Dave, with the pull from this morning (8am EST), it almost works modular.
 > I get:
 > 
 > Dec 20 18:20:19 oscar upsd[636]: Communication established
 > Dec 20 18:20:47 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones
 > Dec 20 18:20:47 oscar kernel: agpgart: Detected VIA MVP3 chipset
 > Dec 20 18:20:47 oscar kernel: agpgart: AGP aperture is 64M @ 0xe0000000
 > Dec 20 18:20:58 oscar kernel: [drm] Initialized mga 3.1.0 20021029 on minor 0
 > Dec 20 18:20:58 oscar kernel: Module agpgart cannot be unloaded due to unsafe usage in drivers/char/agp/backend.c:58

This one is due to the way AGPGART does (or has done for the last 3
years) its module locking. It does a MOD_INC_USE_COUNT as soon as
someone calls the acquire routines. (So you can't unload agpgart
whilst you've a 3d using app (like X) open).
This seems quite sensible, but these days you can't unload agpgart.ko
anyway because the chipset module (via-agp.ko in your case) already
has it 'in use', so I'm tempted to drop those bits.

 > but find this in the X startup log.
 > (EE) MGA: Failed to load module "mga_hal" (module does not exist, 0)

That's matrox's binary only X blob. Not my fault.

 > (EE) MGA(0): [agp] Out of memory (-1014)

This one is. But it may be a knock-on effect from the bug above.
I'll nail that one first.

 > (EE) MGA(0): [drm] failed to remove DRM signal handler
 > DRIUnlock called when not locked

That one's a problem for the DRI folks.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-21 14:22       ` Dave Jones
@ 2002-12-22  4:41         ` Ed Tomlinson
  2002-12-22 12:28           ` Dave Jones
  2002-12-23  1:10         ` Rusty Russell
  1 sibling, 1 reply; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-22  4:41 UTC
  To: Dave Jones; +Cc: linux-kernel

On December 21, 2002 09:22 am, Dave Jones wrote:
> On Fri, Dec 20, 2002 at 06:29:18PM -0500, Ed Tomlinson wrote:

>  > but find this in the X startup log.
>  > (EE) MGA: Failed to load module "mga_hal" (module does not exist, 0)
>
> That's matrox's binary only X blob. Not my fault.

Know all about this one... My G400 runs just fine without it in my X setups.

>  > (EE) MGA(0): [agp] Out of memory (-1014)
>
> This one is. But it may be a knock-on effect from the bug above.
> I'll nail that one first.

Thanks

>  > (EE) MGA(0): [drm] failed to remove DRM signal handler
>  > DRIUnlock called when not locked
>
> That one's a problem for the DRI folks.

Yep

Now for something new.  With bk current (6pm EST) I get:

Dec 21 23:30:56 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones
Dec 21 23:30:56 oscar kernel: agpgart: Detected VIA MVP3 chipset
Dec 21 23:30:56 oscar kernel:  printing eip:
Dec 21 23:30:56 oscar kernel: e0db9000
Dec 21 23:30:56 oscar kernel: Oops: 0000
Dec 21 23:30:56 oscar kernel: CPU:    0
Dec 21 23:30:56 oscar kernel: EIP:    0060:[<e0db9000>]    Not tainted
Dec 21 23:30:56 oscar kernel: EFLAGS: 00010296
Dec 21 23:30:56 oscar kernel: EIP is at 0xe0db9000
Dec 21 23:30:56 oscar kernel: eax: 00000000   ebx: c15d8800   ecx: 000000a4   edx: 07000203
Dec 21 23:30:56 oscar kernel: esi: 00000000   edi: c15d884c   ebp: c9fc3ee0   esp: c9fc3ec8
Dec 21 23:30:56 oscar kernel: ds: 0068   es: 0068   ss: 0068
Dec 21 23:30:56 oscar kernel: Process modprobe (pid: 2175, threadinfo=c9fc2000 task=d60e92a0)
Dec 21 23:30:56 oscar kernel: Stack: e0de9108 c15d8800 00000000 c15d884c 00000000 00000000 c9fc3ef0 e0de92f8
Dec 21 23:30:56 oscar kernel:        c15d8800 c15d8800 c9fc3f04 e0dd035a c15d8800 e0dcb4a8 a0dcb4a8 c9fc3f28
Dec 21 23:30:56 oscar kernel:        c019d252 c15d8800 e0dd046c c15d884c e0dcb4a8 ffffffed c15d8800 e0dcb480
Dec 21 23:30:56 oscar kernel: Call Trace:
Dec 21 23:30:56 oscar kernel:  [<e0de9108>] agp_backend_initialize+0x1c/0x168 [agpgart]
Dec 21 23:30:56 oscar kernel:  [<e0de92f8>] agp_register_driver+0x2c/0xac [agpgart]
Dec 21 23:30:56 oscar kernel:  [<e0dd035a>] agp_via_probe+0x62/0x6c [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [pci_device_probe+70/96] pci_device_probe+0x46/0x60
Dec 21 23:30:56 oscar kernel:  [<e0dd046c>] agp_via_pci_table+0x0/0x38 [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dcb480>] agp_via_pci_driver+0x0/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [bus_match+56/108] bus_match+0x38/0x6c
Dec 21 23:30:56 oscar kernel:  [driver_attach+66/108] driver_attach+0x42/0x6c
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [bus_add_driver+172/204] bus_add_driver+0xac/0xcc
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dcb4c8>] agp_via_pci_driver+0x48/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [driver_register+54/56] driver_register+0x36/0x38
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [pci_register_driver+68/80] pci_register_driver+0x44/0x50
Dec 21 23:30:56 oscar kernel:  [<e0dcb4a8>] agp_via_pci_driver+0x28/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dd0372>] agp_via_init+0xe/0x44 [via_agp]
Dec 21 23:30:56 oscar kernel:  [<e0dcb480>] agp_via_pci_driver+0x0/0xa0 [via_agp]
Dec 21 23:30:56 oscar kernel:  [sys_init_module+274/408] sys_init_module+0x112/0x198
Dec 21 23:30:56 oscar kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Dec 21 23:30:56 oscar kernel:
Dec 21 23:30:56 oscar kernel: Code:  Bad EIP value.

when modprobing via_agp

Ed




* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-22  4:41         ` Ed Tomlinson
@ 2002-12-22 12:28           ` Dave Jones
  2002-12-22 14:08             ` Ed Tomlinson
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Jones @ 2002-12-22 12:28 UTC
  To: Ed Tomlinson; +Cc: linux-kernel

On Sat, Dec 21, 2002 at 11:41:29PM -0500, Ed Tomlinson wrote:
 > 
 > Now for something new.  With bk current (6pm EST) I get:
 > Dec 21 23:30:56 oscar kernel: Call Trace:
 > Dec 21 23:30:56 oscar kernel:  [<e0de9108>] agp_backend_initialize+0x1c/0x168 [agpgart]
 > Dec 21 23:30:56 oscar kernel:  [<e0de92f8>] agp_register_driver+0x2c/0xac [agpgart]

I already fixed a bug with the same call-trace. This looks like
you've still got old .o files around. Can you make clean and rebuild
just to make sure ?

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-22 12:28           ` Dave Jones
@ 2002-12-22 14:08             ` Ed Tomlinson
  0 siblings, 0 replies; 18+ messages in thread
From: Ed Tomlinson @ 2002-12-22 14:08 UTC
  To: Dave Jones; +Cc: linux-kernel

On December 22, 2002 07:28 am, Dave Jones wrote:
> On Sat, Dec 21, 2002 at 11:41:29PM -0500, Ed Tomlinson wrote:
>  > Now for something new.  With bk current (6pm EST) I get:
>  > Dec 21 23:30:56 oscar kernel: Call Trace:
>  > Dec 21 23:30:56 oscar kernel:  [<e0de9108>] agp_backend_initialize+0x1c/0x168 [agpgart]
>  > Dec 21 23:30:56 oscar kernel:  [<e0de92f8>] agp_register_driver+0x2c/0xac [agpgart]
>
> I already fixed a bug with the same call-trace. This looks like
> you've still got old .o files around. Can you make clean and rebuild
> just to make sure ?

Rebuilt with make clean - I am getting the same oops...

Ed


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-21 14:22       ` Dave Jones
  2002-12-22  4:41         ` Ed Tomlinson
@ 2002-12-23  1:10         ` Rusty Russell
  2002-12-23 12:10           ` Dave Jones
  1 sibling, 1 reply; 18+ messages in thread
From: Rusty Russell @ 2002-12-23  1:10 UTC
  To: Dave Jones; +Cc: Ed Tomlinson, linux-kernel

In message <20021221142226.GA24941@suse.de> you write:
> On Fri, Dec 20, 2002 at 06:29:18PM -0500, Ed Tomlinson wrote:
>  > Dave, with the pull from this morning (8am EST), it almost works modular.
>  > I get:
>  > 
>  > Dec 20 18:20:19 oscar upsd[636]: Communication established
>  > Dec 20 18:20:47 oscar kernel: Linux agpgart interface v0.100 (c) Dave Jones
>  > Dec 20 18:20:47 oscar kernel: agpgart: Detected VIA MVP3 chipset
>  > Dec 20 18:20:47 oscar kernel: agpgart: AGP aperture is 64M @ 0xe0000000
>  > Dec 20 18:20:58 oscar kernel: [drm] Initialized mga 3.1.0 20021029 on minor 0
>  > Dec 20 18:20:58 oscar kernel: Module agpgart cannot be unloaded due to unsafe usage in drivers/char/agp/backend.c:58
> 
> This one is due to the way AGPGART does (or has done for the last 3
> years) its module locking. It does a MOD_INC_USE_COUNT as soon as
> someone calls the acquire routines.

Which is racy under SMP, and under preempt, which is why it's
deprecated.

> (So you can't unload agpgart whilst you've a 3d using app (like X)
> open).  This seems quite sensible, but these days you can't unload
> agpgart.ko anyway because the chipset module (via-agp.ko in your
> case) already has it 'in use', so I'm tempted to drop those bits.

If this is true (it usually is), you can simply drop them.  There are
other cases where the caller is not grabbing references, so
MOD_INC_USE_COUNT is better than nothing (should the warning stay for
2.6?  Good question).

Hope that helps,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-23  1:10         ` Rusty Russell
@ 2002-12-23 12:10           ` Dave Jones
  2002-12-26  3:51             ` Rusty Russell
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Jones @ 2002-12-23 12:10 UTC
  To: Rusty Russell; +Cc: Ed Tomlinson, linux-kernel

On Mon, Dec 23, 2002 at 12:10:47PM +1100, Rusty Russell wrote:
 > > This one is due to the way AGPGART does (or has done for the last 3
 > > years) its module locking. It does a MOD_INC_USE_COUNT as soon as
 > > someone calls the acquire routines.
 > Which is racy under SMP, and under preempt, which is why it's
 > deprecated.

Crapola. I've just realised why this is no longer relevant.
I've moved what this was protecting into the per chipset modules.
Right now it's possible to load modules, start X (which loads DRM),
then rmmod via_agp from under its feet. Result: boom when something
tries to use 3d.

Sure it's unlikely someone would be crazy enough to try and do this,
and they deserve what they get, but it's not exactly clean, or nice.

So where is the documentation describing module locking du jour ?

 > > (So you can't unload agpgart whilst you've a 3d using app (like X)
 > > open).  This seems quite sensible, but these days you can't unload
 > > agpgart.ko anyway because the chipset module (via-agp.ko in your
 > > case) already has it 'in use', so I'm tempted to drop those bits.
 > If this is true (it usually is), you can simply drop them.

I'll need 'something' in the chipset drivers. The first thing that
jumps to mind is to give the chipset drivers an 'acquire' op
which does the locking much like the old agp_backend_acquire() does.
 
 > There are other cases where the caller is not grabbing references, so
 > MOD_INC_USE_COUNT is better than nothing (should the warning stay for
 > 2.6?  Good question).

Why exactly isn't it safe any more?  If there's documentation on this,
I'd love to read it. If there isn't, there really should be.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-23 12:10           ` Dave Jones
@ 2002-12-26  3:51             ` Rusty Russell
  2002-12-28  0:20               ` Roman Zippel
  0 siblings, 1 reply; 18+ messages in thread
From: Rusty Russell @ 2002-12-26  3:51 UTC
  To: Dave Jones; +Cc: Ed Tomlinson, linux-kernel

In message <20021223121022.GA32080@suse.de> you write:
> On Mon, Dec 23, 2002 at 12:10:47PM +1100, Rusty Russell wrote:
>  > > This one is due to the way AGPGART does (or has done for the last 3
>  > > years) its module locking. It does a MOD_INC_USE_COUNT as soon as
>  > > someone calls the acquire routines.
>  > Which is racy under SMP, and under preempt, which is why it's
>  > deprecated.
> 
> Crapola. I've just realised why this is no longer relevant.
> I've moved what this was protecting into the per chipset modules.
> Right now it's possible to load modules, start X (which loads DRM),
> then rmmod via_agp from under its feet. Result: boom when something
> tries to use 3d.
> 
> Sure it's unlikely someone would be crazy enough to try and do this,
> and they deserve what they get, but it's not exactly clean, or nice.
> 
> So where is the documentation describing module locking du jour ?

Here's the FAQ again.  Note that the init stuff is not currently true
(you can enter a module doing init, with possibly bad results),
because Linus patched my code.  It's on my TODO list to revisit this
issue.

>  > > (So you can't unload agpgart whilst you've a 3d using app (like X)
>  > > open).  This seems quite sensible, but these days you can't unload
>  > > agpgart.ko anyway because the chipset module (via-agp.ko in your
>  > > case) already has it 'in use', so I'm tempted to drop those bits.
>  > If this is true (it usually is), you can simply drop them.
> 
> I'll need 'something' in the chipset drivers. The first thing that
> jumps to mind is to give the chipset drivers an 'acquire' op
> which does the locking much like the old agp_backend_acquire() does.

The problem is, a module basically can't lock itself.  Hence you
should expose an "owner" field when you register an interface, and
have the caller do the locking.  There are other possible solutions,
but none of them were as simple as "lock down before you call in".
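
The "owner" approach can be sketched in plain userspace C.  This is only a
model under stated assumptions, not kernel code: struct demo_module,
demo_try_module_get(), demo_module_put(), struct demo_agp_ops and the
acquire op are hypothetical stand-ins for struct module,
try_module_get(), module_put() and whatever interface a chipset driver
would register.  The point it illustrates is that the caller pins the
owner before calling through the function pointer and drops the
reference afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a module; NOT the kernel's struct module. */
struct demo_module {
	bool live;	/* false while initializing or being removed */
	int refcount;	/* references currently held by callers */
};

/* Models try_module_get(): a NULL owner (built-in code) always succeeds. */
static bool demo_try_module_get(struct demo_module *owner)
{
	if (owner == NULL)
		return true;
	if (!owner->live)
		return false;
	owner->refcount++;
	return true;
}

/* Models module_put(): a NULL owner is accepted and ignored. */
static void demo_module_put(struct demo_module *owner)
{
	if (owner == NULL)
		return;
	assert(owner->refcount > 0);
	owner->refcount--;
}

/* Hypothetical interface a chipset driver registers, exposing its owner. */
struct demo_agp_ops {
	struct demo_module *owner;
	int (*acquire)(void);
};

/* Hypothetical op living "inside" the chipset module. */
static int demo_via_acquire(void)
{
	return 0;
}

/* The caller, not the callee, locks the module down before calling in. */
static int demo_call_acquire(const struct demo_agp_ops *ops)
{
	int ret;

	if (!demo_try_module_get(ops->owner))
		return -1;	/* module not enterable: do not call into it */
	ret = ops->acquire();
	demo_module_put(ops->owner);
	return ret;
}
```

Because the reference is taken outside the module, there is no window in
which code inside the module runs without a reference held, which is the
window the MOD_INC_USE_COUNT example leaves open.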

>  > There are other cases where the caller is not grabbing references, so
>  > MOD_INC_USE_COUNT is better than nothing (should the warning stay for
>  > 2.6?  Good question).
> 
> Why exactly isn't it safe any more?  If there's documentation on this,
> I'd love to read it. If there isn't, there really should be.

Consider the fictitious my_module.c:

	static void my_function_called_through_ptr(void)
	{
		MOD_INC_USE_COUNT;
		...

		MOD_DEC_USE_COUNT;
		return;
	}

If you get preempted before the MOD_INC_USE_COUNT, an unload could
occur, boom.  If you get preempted after the MOD_DEC_USE_COUNT, but
before the return, and an unload occurs, boom.

If it were not called through a pointer, (ie. called by name) then it
means the module calling it has a reference (this is done
automatically in symbol resolution, or symbol_get()), so no problem.

[ As an implementation detail, this is *not* a problem on SMP without
  preemption.  But that's an implementation detail, and it wasn't true
  with the previous module implementation. ]

Hope this helps,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Golden Rule: If you are calling through a function pointer into a
(different) module, you must hold a reference to that module.
Otherwise you risk sleeping in the module while it is unloaded.

Q: How do I get a reference to a module?
A: Usually, a successful call to try_module_get(owner).  You don't
   need to check for owner != NULL, BTW.

Q: When does try_module_get(owner) fail?
A: When the module is not ready to be entered (ie. still in
   init_module) or it is being removed.  This prevents you
   entering the module as it is being discarded (init might fail, or
   it's being removed).

Q: But the module's init routine calls my register() routine which
   wants to call back into one of the function pointers immediately,
   and so try_module_get() fails! (because the module is not finished
   initializing yet)
A: You're being called from the module, so someone already has a
   reference (unless there's a bug), so you don't need a
   try_module_get().

   This does mean that if you were to register a structure for
   *another* module (does anyone do this?) you'd need to have a
   reference to it.

Q: How do I put the reference back?
A: Using module_put(owner) (owner == NULL is OK).

Q: Do I really need to put try_module_get() before every function ptr
   call?
A: If the function does not sleep (and cannot be preempted) ie. is
   called in softirq or hardirq context, you can omit this step, since
   you obviously won't sleep inside the module.

   Also, most structs have clear "start" and "stop" functions
   (eg. mount/umount), so you only need one try_module_get()
   on start, and module_put() on stop.
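
The start/stop pairing might be modelled like this.  Everything here
is a hypothetical userspace stand-in (struct fs_type, struct my_mount,
fake_read_block() and the do_mount()/do_umount() helpers are invented
for the sketch, and struct module is reduced to a flag and a counter);
it only illustrates taking one reference at "start" and dropping it
at "stop".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a live flag plus a plain counter. */
struct module {
	bool live;
	int refcount;
};

static bool try_module_get(struct module *owner)
{
	if (owner == NULL)
		return true;
	if (!owner->live)
		return false;
	owner->refcount++;
	return true;
}

static void module_put(struct module *owner)
{
	if (owner != NULL)
		owner->refcount--;
}

/* Hypothetical filesystem-like object with clear start and stop. */
struct fs_type {
	struct module *owner;
	int (*read_block)(int block);
};

struct my_mount {
	struct fs_type *fs;
};

/* Hypothetical operation implemented inside the module. */
static int fake_read_block(int block)
{
	return block + 100;
}

/* "start": the single reference that pins the module while mounted. */
static int do_mount(struct my_mount *mnt, struct fs_type *fs)
{
	if (!try_module_get(fs->owner))
		return -1;
	mnt->fs = fs;
	return 0;
}

/* In between, no get/put per call: the mount already holds one. */
static int mount_read(struct my_mount *mnt, int block)
{
	return mnt->fs->read_block(block);
}

/* "stop": give the reference back. */
static void do_umount(struct my_mount *mnt)
{
	module_put(mnt->fs->owner);
	mnt->fs = NULL;
}
```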

Q: Is it safe to call try_module_get() and module_put() from an
   interrupt / softirq?
A: Yes.

Q: My code uses "MOD_INC_USE_COUNT".  Do I still need to adjust my
   module count when someone calls one of my functions?
A: No, you never need to adjust your own module count.  There are five
   ways a function in your module can get called: firstly, it could be
   your module_init() function, in which case the module code holds a
   reference.  It could be another module using one of your
   EXPORT_SYMBOL'ed functions, in which case you cannot be removed
   since they would have to be removed first.  It could be a module
   which found an EXPORT_SYMBOL'ed function using symbol_get(), in
   which case they hold a reference count.  It could be through a
   function pointer which your module gave out previously, which is
   discussed above.  Finally, it could be from within your own module,
   in which case someone must already hold a reference.

Q: My code uses "__MOD_INC_USE_COUNT(reg->owner)", but now I get a
   warning at runtime that it is unsafe.  What do I need to do?
A: You need to use try_module_get(), and not call into the module if
   it fails (act as if it hasn't registered yet).  Note that you no
   longer need to check for NULL yourself, try_module_get() does that.

Q: My code used "GET_USE_COUNT(module)" to get the reference count.
A: Don't do that.  If module unloading is disabled, there is no
   reference count, and there is never a single value you can assign
   to.

Q: My code used "try_inc_mod_count(module)" to get the reference
   count.  Should I change it?
A: No hurry.  try_module_get() is exactly the same: the new name
   reflects that this is now the only way to get a reference.

Q: How does the code in try_module_get() work?
A: It disables preemption for a moment, checks the live flag, and then
   increments a per-cpu counter if the module is live.  This is even
   lighter-weight (in icache and cycles) than using a brlock, but has
   the same effect.  If CONFIG_MODULE_UNLOAD=n, it just becomes a
   check that the module is live.
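
A rough userspace model of that description, not the kernel source.
NR_CPUS is fixed at an arbitrary 4, the explicit `cpu` argument stands
in for "the CPU we are running on" (which real code would determine
itself with preemption disabled), and module_refcount() is a helper
name chosen for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary CPU count for the model */

struct module {
	bool live;
	long ref[NR_CPUS];	/* one counter per CPU */
};

static bool try_module_get(struct module *mod, unsigned int cpu)
{
	bool ok = true;

	if (mod == NULL)
		return true;
	/* Preemption would be disabled from here ... */
	if (mod->live)
		mod->ref[cpu]++;
	else
		ok = false;
	/* ... to here, so check and increment stay on one CPU. */
	return ok;
}

/* The put may run on a different CPU from the matching get. */
static void module_put(struct module *mod, unsigned int cpu)
{
	if (mod != NULL)
		mod->ref[cpu]--;
}

/* Individual slots can go negative; only the sum means anything. */
static long module_refcount(const struct module *mod)
{
	long total = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		total += mod->ref[cpu];
	return total;
}
```

Because no counter is shared between CPUs, the common path touches
only local data; the price is that reading the total needs every CPU
to be quiescent, which is what the removal code arranges.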

Q: How does the module remove code work?
A: It stops the machine by scheduling threads for every other CPU,
   then they all disable interrupts.  At this stage we know that no one
   is in try_module_get(), so we can reliably read the counter.  If
   zero, or the rmmod user specified --wait, we set the live flag to
   false.  After this, the reference count should not increase, and
   each module_put() will wake us up, so we can check the counter
   again.
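
The decision made while the machine is stopped can be sketched as a
small state model.  This is single-threaded userspace code with
hypothetical names (begin_remove(), remove_can_finish(), the wakeups
field, the remove_result enum); stopping the other CPUs is represented
only by a comment, and `wait` stands for rmmod's --wait.

```c
#include <stdbool.h>

/* Hypothetical model: one summed counter instead of per-cpu slots. */
struct module {
	bool live;
	int refcount;
	int wakeups;	/* times the waiting remover was woken */
};

enum remove_result { REMOVE_BUSY, REMOVE_DONE, REMOVE_WAITING };

static bool try_module_get(struct module *mod)
{
	if (!mod->live)
		return false;
	mod->refcount++;
	return true;
}

/* After deactivation, every put wakes the remover to recheck. */
static void module_put(struct module *mod)
{
	mod->refcount--;
	if (!mod->live)
		mod->wakeups++;
}

static enum remove_result begin_remove(struct module *mod, bool wait)
{
	/*
	 * Real code would have stopped every other CPU by now, so
	 * nobody is halfway through try_module_get() and the count
	 * can be read reliably.
	 */
	if (mod->refcount != 0 && !wait)
		return REMOVE_BUSY;	/* in use, and no --wait */
	mod->live = false;	/* from now on try_module_get() fails */
	return mod->refcount == 0 ? REMOVE_DONE : REMOVE_WAITING;
}

/* What the remover checks each time it is woken. */
static bool remove_can_finish(const struct module *mod)
{
	return !mod->live && mod->refcount == 0;
}
```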

Q: Are these changes all so you could implement an in-kernel module
   linker?
A: No, they were to prevent load and unload races without altering
   every module, nor introducing drastic new requirements.

Q: Doesn't putting linking code into the kernel just add bloat?
A: The total linking code is about 200 generic lines, 100
   x86-specific lines.  The ia64 linking code is about 500 lines (it's
   the most complex).  Richard Henderson has a great suggestion for
   simplifying it further, which I'm implementing now.  "insmod" is
   now a single portable system call, meaning insmod can be written in
   about 20 lines of code.

   The previous code, which had to implement the two module loading
   system calls, the module querying system call, and the /proc/ksyms
   output, required a little more code than the current x86 linker.
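
To illustrate the "single system call" point, a userspace loader
might look roughly like this.  It assumes a Linux system where
SYS_init_module takes the three-argument form documented in
init_module(2) (image, length, parameter string); read_whole_file()
and insmod_file() are names invented for the sketch, and a real tool
would report errno and cope with short reads.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: slurp a file, returning NULL on any failure. */
static void *read_whole_file(const char *path, unsigned long *len)
{
	struct stat st;
	void *buf;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) != 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	buf = malloc((size_t)st.st_size);
	/* Simplification: one read() is assumed to return the file. */
	if (buf != NULL &&
	    read(fd, buf, (size_t)st.st_size) != st.st_size) {
		free(buf);
		buf = NULL;
	}
	close(fd);
	*len = (unsigned long)st.st_size;
	return buf;
}

/* Returns 0 on success, -1 on failure (needs root to succeed). */
static int insmod_file(const char *path, const char *args)
{
	unsigned long len = 0;
	void *image = read_whole_file(path, &len);
	long ret;

	if (image == NULL)
		return -1;
	/* The kernel links the image; userspace only hands it over. */
	ret = syscall(SYS_init_module, image, len, args);
	free(image);
	return ret == 0 ? 0 : -1;
}
```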

Q: Why not just fix the old code?
A: Because having something so intimate with the kernel in userspace
   greatly restricts what changes the kernel can make.  Moving into
   the kernel means I have implemented modversions, typesafe
   extensible module parameters and kallsyms without altering
   userspace in any way.  Future extensions won't have to worry about
   the version of modutils problem.

Q: Why not implement two-stage insert / two-stage delete?
A: Because I implemented it first and it sucked.  And because this
   *is* two-stage insert and two-stage delete, without exposing it to
   the modules using it, with the added advantage that the second
   stage is atomic (activation/deactivation is simply changing
   mod->live, modulo locking implementation magic detailed above).
   This prevents the race between deactivating the module and finding
   that someone has started using it as you are deactivating it.


* Re: [drm:drm_init] *ERROR* Cannot initialize the agpgart module.
  2002-12-26  3:51             ` Rusty Russell
@ 2002-12-28  0:20               ` Roman Zippel
  0 siblings, 0 replies; 18+ messages in thread
From: Roman Zippel @ 2002-12-28  0:20 UTC
  To: Rusty Russell; +Cc: Dave Jones, Ed Tomlinson, linux-kernel

Hi,

Rusty Russell wrote:

> > So where is the documentation describing module locking de jour ?
> 
> Here's the FAQ again.  Note that the init stuff is not currently true
> (you can enter a module doing init, with possibly bad results),
> because Linus patched my code.  It's on my TODO list to revisit this
> issue.

Rusty, you should do this rather soon. Could you please explain how you
intend to solve the module init races? The "don't enter module until
init is finished" concept doesn't work, so what else do you want to do?

> Q: How does the code in try_module_get() work?
> A: It disables preemption for a moment, checks the live flag, and then
>    increments a per-cpu counter if the module is live.  This is even
>    lighter-weight (in icache and cycles) than using a brlock, but has
>    the same effect.

Q: But if I have a large number of modules (e.g. netfilter) and I have
to use try_module_get() for each of them, won't it still affect
performance?

> Q: Are these changes all so you could implement an in-kernel module
>    linker?
> A: No, they were to prevent load and unload races without altering
>    every module, nor introducing drastic new requirements.

Q: Wasn't it possible to do the rewrite in several steps? What made it
so urgent that everything had to be in 2.6?

> Q: Doesn't putting linking code into the kernel just add bloat?
> A: The total linking code is about 200 generic lines, 100
>    x86-specific lines. [...]
> 
>    The previous code, which had to implement the two module loading
>    system calls, the module querying system call, and the /proc/ksyms
>    output, required a little more code than the current x86 linker.

2.4.20: wc kernel/module.c include/linux/module.h
   1283    3800   29616 kernel/module.c
    415    1609   13173 include/linux/module.h
   1698    5409   42789 total

2.5.52: wc kernel/module.c kernel/params.c arch/i386/kernel/module.c
include/linux/module*
   1414    4400   35284 kernel/module.c
    338    1093    8182 kernel/params.c
    124     412    3347 arch/i386/kernel/module.c
    368    1174    9954 include/linux/module.h
     61     244    1958 include/linux/moduleloader.h
    127     619    5199 include/linux/moduleparam.h
   2432    7942   63924 total

Q: Doesn't the querying system call provide the same information as
/proc/modules and /proc/ksyms? Is procfs now required to load modules?
Q: How can I debug modules with ksymoops?
Q: Why was it necessary to remove the system calls? Wasn't it possible
to emulate them?

> Q: Why not just fix the old code?
> A: Because having something so intimate with the kernel in userspace
>    greatly restricts what changes the kernel can make.  Moving into
>    the kernel means I have implemented modversions, typesafe
>    extensible module parameters and kallsyms without altering
>    userspace in any way.  Future extensions won't have to worry about
>    the version of modutils problem.

Q: Can I see the modversions implementation?
Q: Was there an analysis that discussed the limitations of the old
code? Was it really impossible to design a more flexible interface,
which left as much as possible in userspace?

Rusty, could you _please_ start answering questions instead of ignoring
them, or is there no discussion required anymore, now that your code is
in the kernel?

bye, Roman




end of thread, other threads:[~2002-12-28  1:48 UTC | newest]

Thread overview: 18+ messages
     [not found] <20021218094714.43C712C076@lists.samba.org>
2002-12-18 12:57 ` [drm:drm_init] *ERROR* Cannot initialize the agpgart module Ed Tomlinson
2002-12-18 16:29   ` Dave Jones
     [not found] ` <200212181803.23279.tomlins@cam.org>
     [not found]   ` <20021219105909.GE29122@suse.de>
2002-12-20 23:29     ` Ed Tomlinson
2002-12-21 14:22       ` Dave Jones
2002-12-22  4:41         ` Ed Tomlinson
2002-12-22 12:28           ` Dave Jones
2002-12-22 14:08             ` Ed Tomlinson
2002-12-23  1:10         ` Rusty Russell
2002-12-23 12:10           ` Dave Jones
2002-12-26  3:51             ` Rusty Russell
2002-12-28  0:20               ` Roman Zippel
2002-12-17  1:49 Ed Tomlinson
2002-12-17  8:06 ` Paul P Komkoff Jr
2002-12-17 12:50   ` Ed Tomlinson
2002-12-20  1:30     ` Matt Bernstein
2002-12-20  1:43       ` Randy.Dunlap
2002-12-20  9:33         ` Dave Jones
2002-12-17 12:33 ` Dave Jones
