All the mail mirrored from lore.kernel.org
* gdb 5.0 display arguments problem
@ 2001-03-19 22:49 Jun Sun
  2001-03-20  0:18   ` Kevin D. Kissell
  0 siblings, 1 reply; 8+ messages in thread
From: Jun Sun @ 2001-03-19 22:49 UTC (permalink / raw
  To: linux-mips


I am using the gdb 5.0 client to debug the kernel, and found a bug in gdb 5.0
when it tries to display a function argument.

The following is the relevant code segment, where the breakpoint is set at the
first instruction of serial_console_write().

00000000801270ac <serial_console_write>:
    801270ac:   3c04801d        lui     $a0,0x801d
    801270b0:   8c84bc1c        lw      $a0,-17380($a0)
    801270b4:   27bdffc0        addiu   $sp,$sp,-64
    801270b8:   afb40028        sw      $s4,40($sp)
    801270bc:   00a0a021        move    $s4,$a1
    801270c0:   24050001        li      $a1,1

For whatever reason, the gdb client on the host side apparently thinks the
second arg is stored in register s4.  When the breakpoint is hit, gdb tries to
display the value of s4 (which is 0x4 in this case).  Since the type of this
argument is char *, gdb further tries to read the content at 0x4, which causes
a kernel panic.

I believe I have seen this problem before (and in most cases the symptom is
wrong argument values instead of a kernel panic).  Does someone have an idea
how to fix it or work around it?

Does this problem exist in native debugging?

I assume we can stop gdb from displaying char strings by default.  Does someone
know how to do it?

Thanks.

Jun


* Re: gdb 5.0 display arguments problem
@ 2001-03-20  0:18   ` Kevin D. Kissell
  0 siblings, 0 replies; 8+ messages in thread
From: Kevin D. Kissell @ 2001-03-20  0:18 UTC (permalink / raw
  To: Jun Sun, linux-mips

> I am using the gdb 5.0 client to debug the kernel, and found a bug in gdb
> 5.0 when it tries to display a function argument.
...
> For whatever reason, the gdb client on the host side apparently thinks the
> second arg is stored in register s4.  When the breakpoint is hit, gdb tries
> to display the value of s4 (which is 0x4 in this case).  Since the type of
> this argument is char *, gdb further tries to read the content at 0x4, which
> causes a kernel panic.
>
> I believe I have seen this problem before (and in most cases the symptom is
> wrong argument values instead of a kernel panic).  Does someone have an idea
> how to fix it or work around it?

I had exactly the same problem with earlier versions of gdb (4.18
if I recall), though for me the problem was invariably provoked by my
asking "where" on a deeply nested set of stack frames.  I had always
believed that the problem stemmed from the fact that the compiler,
when invoked with the options used to build the Linux kernel in any
case, is under no obligation to preserve the values of its incoming
arguments past their useful life within the called function.  If the
arguments are consumed before the next use of the register,
they are never saved.  So sooner or later, the back trace comes to
a function for which the argument storage on the stack frame is
uninitialized garbage.

That problem should, in theory, be fixable by compiling with
a set of options that force the arguments to be stored in the
stack frame.  Yours may well be slightly different, but fatal
for the same reason.

> Does this problem exist in native debugging?

Yes, but in that case all you get is a bogus reported argument.
The problem is that the kgdb "agent" in the kernel is being passed
a bogus pointer, and is dereferencing it mindlessly (and fatally).

> I assume we can stop gdb from displaying char strings by default.  Does
> someone know how to do it?

I tried fixing it with a hack to the exception handling code,
inspired by the old Cobalt MIPS kernel code, such that the
kgdb agent's proxy references could fail in a non-fatal manner.
I never did get it to work.  It probably would be easy to cripple
gdb to not automatically dereference pointer arguments, perhaps
only in remote mode and perhaps only if some magic flag is set.
But it *is* nice to see the string arguments when they are sane.
A cleaner approach might be to have the proxy use a high level
VM routine to check the validity of each address before
dereferencing, but it's not clear that it's actually safe to invoke
such routines from the level at which the kgdb proxy is executing.
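
A minimal sketch of the address-check idea, assuming a 32-bit MIPS kernel.
Rather than calling into the VM code, it simply refuses anything outside the
unmapped kseg0/kseg1 segments, where a load cannot take a TLB miss.  The name
kgdb_addr_is_safe and the conservative policy are illustrative assumptions,
not existing kernel code, and the price is that user and kseg2 memory cannot
be inspected at all.

```c
#include <stdint.h>
#include <stddef.h>

/* MIPS32 unmapped kernel segments: kseg0 and kseg1 together span
 * 0x80000000..0xbfffffff; kseg2 (mapped) starts right after them. */
#define KSEG0_BASE 0x80000000u
#define KSEG2_BASE 0xc0000000u

/*
 * Hypothetical helper: return 1 if [addr, addr + len) lies entirely in
 * kseg0/kseg1, 0 otherwise.  This only rules out TLB faults; an address
 * with no RAM or device behind it could still raise a bus error.
 */
int kgdb_addr_is_safe(uint32_t addr, size_t len)
{
    uint64_t end = (uint64_t)addr + len;   /* 64-bit sum avoids wraparound */

    if (len == 0)
        return 0;
    return addr >= KSEG0_BASE && end <= KSEG2_BASE;
}
```

With this policy the stub would accept 0x801270ac (kernel text in kseg0) but
reject both 0x4 and a mistyped address such as 0x8021f34, since each of those
falls in the mapped user segment.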

            Kevin K.


* Re: gdb 5.0 display arguments problem
@ 2001-03-20  9:09     ` Kevin D. Kissell
  0 siblings, 0 replies; 8+ messages in thread
From: Kevin D. Kissell @ 2001-03-20  9:09 UTC (permalink / raw
  To: Kevin D. Kissell, Jun Sun, linux-mips

Followups on my own message of last night.

> I had exactly the same problem with earlier versions of gdb (4.18
> if I recall), though for me the problem was invariably provoked by my
> asking "where" on a deeply nested set of stack frames.  I had always
> believed that the problem stemmed from the fact that the compiler,
> when invoked with the options used to build the Linux kernel in any
> case, is under no obligation to preserve the values of its incoming
> arguments past their useful life within the called function.  If the
> arguments are consumed before the next use of the register,
> they are never saved.  So sooner or later, the back trace comes to
> a function for which the argument storage on the stack frame is
> uninitialized garbage.
> 
> That problem should, in theory, be fixable by compiling with
> a set of options that force the arguments to be stored in the
> stack frame.

I've experimented a bit, and so far the only way I can find
to ensure that the argument values are preserved all the way
back down a calling chain is to use no "-O" optimisation
whatsoever.  The code is huge and grotesque, but arguments
are systematically saved in the stack frame in their allotted,
caller-allocated slots.  Even then I also need to compile with -g
if I want gdb (native 4.17) to be able to do a correct backtrace
even of user-mode code.

> Yours may well be slightly different, but fatal for the same reason.

Indeed, I wonder if your gdb isn't looking on the stack frame
for an argument that isn't there - and which may never be
there - and finding the value that also happens to be in s4.

> > Does this problem exist in native debugging?
> 
> Yes, but in most cases all you get is a bogus reported argument.

Or a truncated back trace.  In the worst case, you do get a gdb
crash/core dump.

> The problem is that the kgdb "agent" in the kernel is being passed
> a bogus pointer, and is dereferencing it mindlessly (and fatally).
>
> > I assume we can disable gdb to display char strings by default.  
> > Does someone know how to do it?
> 
> I tried fixing it with a hack to the exception handling code,
> inspired by the old Cobalt MIPS kernel code, such that the
> kgdb agent's proxy references could fail in a non-fatal manner.
> I never did get it to work.  It probably would be easy to cripple
> gdb to not automatically dereference pointer arguments, perhaps
> only in remote mode and perhaps only if some magic flag is set.
> But it *is* nice to see the string arguments when they are sane.
> A cleaner approach might be to have the proxy use a high level
> VM routine to check the validity of each address before
> dereferencing, but it's not clear that it's actually safe to invoke
> such routines from the level at which the kgdb proxy is executing.

Another reason to fix things in the gdb proxy/exception code
rather than cripple gdb backtrace is that, even with the backtrace
fixed, the current kgdb situation is such that the slightest typo
at the debugger operator interface can generate a bad address
and blow the system sky high.  It's happened to me on more than
one occasion.  Fortunately, what I was debugging at the time
was readily reproducible (if not, I would have fixed the kgdb
problem then and there!).

            Regards,

            Kevin K.


* Re: gdb 5.0 display arguments problem
  2001-03-20  9:09     ` Kevin D. Kissell
  (?)
@ 2001-03-20 19:26     ` Jun Sun
  2001-03-20 22:16         ` Kevin D. Kissell
  -1 siblings, 1 reply; 8+ messages in thread
From: Jun Sun @ 2001-03-20 19:26 UTC (permalink / raw
  To: Kevin D. Kissell; +Cc: linux-mips

"Kevin D. Kissell" wrote:
> 
> > Yours may well be slightly different, but fatal for the same reason.
> 
> Indeed, I wonder if your gdb isn't looking on the stack frame
> for an argument that isn't there - and which may never be
> there - and finding the value that also happens to be in s4.
> 

I think I figured out the reason.  If you take a look at the code segment I
attached in my first posting, you will see that the s4 register indeed holds
the value of a1, and presumably remains so for the rest of the function.
However, the actual assignment of a1 to s4 does not happen until a couple of
instructions into the function (the move at 801270bc).  So if the breakpoint
is set at the first instruction of the function, gdb would still think
(wrongly) that s4 holds the 2nd argument and, even worse, try to dereference
it if it is a char * pointer.
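
To make the timing concrete, here is a small sketch that scans the instruction
words from the disassembly in my first posting and reports the byte offset
from which s4 really holds a1.  The helper names are made up for illustration,
and it only recognises the "addu rd, rs, $zero" and "or rd, rs, $zero"
encodings of move, so treat it as a sketch rather than a real prologue
analyser.

```c
#include <stdint.h>
#include <stddef.h>

/* Instruction words copied from the serial_console_write listing. */
const uint32_t serial_console_write_prologue[] = {
    0x3c04801d, 0x8c84bc1c, 0x27bdffc0,
    0xafb40028, 0x00a0a021, 0x24050001,
};

enum { REG_ZERO = 0, REG_A1 = 5, REG_S4 = 20 };

/* Return 1 if insn encodes "move rd, rs" as addu/or with $zero. */
static int is_move(uint32_t insn, unsigned rd, unsigned rs)
{
    unsigned opcode = insn >> 26;
    unsigned i_rs   = (insn >> 21) & 0x1f;
    unsigned i_rt   = (insn >> 16) & 0x1f;
    unsigned i_rd   = (insn >> 11) & 0x1f;
    unsigned shamt  = (insn >> 6) & 0x1f;
    unsigned funct  = insn & 0x3f;

    if (opcode != 0 || shamt != 0)
        return 0;                          /* not a plain R-type ALU op */
    if (funct != 0x21 && funct != 0x25)
        return 0;                          /* neither addu nor or */
    return i_rd == rd && i_rs == rs && i_rt == REG_ZERO;
}

/*
 * Byte offset of the first instruction that executes after rs has been
 * copied into rd, or -1 if no such copy is found in code[0..n).
 */
long offset_where_copy_is_live(const uint32_t *code, size_t n,
                               unsigned rd, unsigned rs)
{
    for (size_t i = 0; i < n; i++)
        if (is_move(code[i], rd, rs))
            return (long)((i + 1) * 4);
    return -1;
}
```

For this listing the result is offset 20, i.e. address 801270c0.  A breakpoint
at any earlier address sees whatever the caller left in s4.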

> 
> Another reason to fix things in the gdb proxy/exception code
> rather than cripple gdb backtrace is that, even with the backtrace
> fixed, the current kgdb situation is such that the slightest typo
> at the debugger operator interface can generate a bad address
> and blow the system sky high.  It's happened to me on more than
> one occasion.  Fortunately, what I was debugging at the time
> was readily reproducible (if not, I would have fixed the kgdb
> problem then and there!).
> 

This sounds pretty cool, but I don't see a clean algorithm.  So in the
exception code you would decide not to crash if 1) kgdb is configured, and
2) the exception is caused by kgdb code (how?).  Also, if you decide not to
crash, what would be reasonable return values?

Disabling automatic char * dereferencing is not that bad.  You always have the
option to manually dereference it.  However, I could not find an option to
disable it.  Maybe gdb does not provide that yet.

Jun


* Re: gdb 5.0 display arguments problem
@ 2001-03-20 22:16         ` Kevin D. Kissell
  0 siblings, 0 replies; 8+ messages in thread
From: Kevin D. Kissell @ 2001-03-20 22:16 UTC (permalink / raw
  To: Jun Sun; +Cc: linux-mips

> "Kevin D. Kissell" wrote:
> >
> > > Yours may well be slightly different, but fatal for the same reason.
> >
> > Indeed, I wonder if your gdb isn't looking on the stack frame
> > for an argument that isn't there - and which may never be
> > there - and finding the value that also happens to be in s4.
> >
>
> I think I figured out the reason.  If you take a look at the code segment
> I attached in my first posting, you will see that the s4 register indeed
> holds the value of a1, and presumably remains so for the rest of the
> function.  However, the actual assignment of a1 to s4 does not happen until
> a couple of instructions later into the function.  So if the breakpoint is
> set at the first instruction of the function, gdb would still think
> (wrongly) that s4 holds the 2nd argument and, even worse, try to
> dereference it if it is a char * pointer.

It's not unusual to need to step one line to "see" arguments
correctly.  A pity about the pointer dereference.  ;-)

> > Another reason to fix things in the gdb proxy/exception code
> > rather than cripple gdb backtrace is that, even with the backtrace
> > fixed, the current kgdb situation is such that the slightest typo
> > at the debugger operator interface can generate a bad address
> > and blow the system sky high.  It's happened to me on more than
> > one occasion.  Fortunately, what I was debugging at the time
> > was readily reproducible (if not, I would have fixed the kgdb
> > problem then and there!).
> >
>
> This sounds pretty cool, but I don't see a clean algorithm.  So in the
> exception code you would decide not to crash if 1) kgdb is configured, and
> 2) the exception is caused by kgdb code (how?).  Also if you decide not to
> crash, what should be reasonable return values?

There are a number of possible ways of going about it.  One is
to set some kind of flag during the kgdb proxy access, which
is tested by the page fault code.  That's the stuff that's derived
from Cobalt code and that is partly in place in some existing
kernel gdb support ("debugmem_got_flt", etc.).  I have not seen
it work correctly, and it isn't SMP safe.  Another scheme, which is
actually a bit cleaner, would be to check explicitly for the faulting
address to be that of the kgdb proxy load.  In any case, once the
case has been identified, one tweaks the EPC value to return to
something other than a repeat of the bad dereference, and returns
zero or an error value to gdb - the protocol and message formats
are described in comments in gdb-stub.c.
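
As a rough user-space analogue of the flag-based scheme (not kernel code, and
not what gdb-stub.c actually does), the sketch below sets a flag around the
risky load and has the fault handler jump to a recovery point instead of
retrying the access.  POSIX signals stand in for the page fault path,
sigsetjmp/siglongjmp stand in for adjusting EPC, and the names probe_active,
fault_handler and probe_read_byte are hypothetical.  Like the flag scheme, it
relies on a single global and is not safe with concurrent callers.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;                /* recovery point ("new EPC") */
static volatile sig_atomic_t probe_active;  /* flag set during the access */

static void fault_handler(int sig)
{
    if (probe_active)
        siglongjmp(probe_env, 1);  /* resume past the load, do not retry it */
    signal(sig, SIG_DFL);          /* genuine fault: let the default action run */
}

/* Read one byte from addr; return 0 on success, -1 if the access faulted. */
int probe_read_byte(const void *addr, unsigned char *out)
{
    struct sigaction sa = {0}, old_segv, old_bus;
    int ret;

    sa.sa_handler = fault_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(probe_env, 1) == 0) {
        probe_active = 1;
        *out = *(const volatile unsigned char *)addr;  /* the risky load */
        ret = 0;
    } else {
        ret = -1;                  /* reached via siglongjmp from the handler */
    }
    probe_active = 0;

    /* Put the previous handlers back so unrelated faults behave as before. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ret;
}
```

A stub built along these lines would turn the -1 into an error reply to gdb
rather than letting the bad dereference take the system down.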

> Disabling automatic char * dereferencing is not that bad.  You always have
> the option to manually dereference it.  However, I could not find such an
> option.  Maybe gdb does not provide that yet.

And even if it did, accidentally trying to dump "0x8021f34" would
blow you out of the water.  It's the kernel proxy code that really needs
to be fixed.

            Kevin K.

