* Re: update re: fork() failures in 2.1.101
From: Rik van Riel @ 1998-06-12 4:36 UTC
To: Paul Kimoto; +Cc: Linux MM
[Paul gets "cannot fork" errors after 60 or more hours of
uptime. This suggests fragmentation problems.]
On Thu, 11 Jun 1998, Paul Kimoto wrote:
> > Hmm, the 'cannot fork' issue only starting after some
> > days of uptime... This suggests fragmentation. Is your
> > box very heavily loaded, or just lightly (VM-wise)?
>
> Light, I think; I have 48MB of RAM and usually end up with 8--16MB in swap.
> In normal operation I don't have to wait much for paging except for larger
> programs (netscape, xemacs, or big compilations).
Ahh, I think I see it now. The fragmentation on your system
persists because of the swap cache. The swap cache 'caches'
swap pages and kinda makes sure they are reloaded to the
same physical address.
Stephen, Ben: should we disable the swap cache when
fragmentation is high?
Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
* Re: update re: fork() failures in 2.1.101
From: Stephen C. Tweedie @ 1998-06-12 22:58 UTC
To: Rik van Riel; +Cc: Paul Kimoto, Linux MM
Hi,
On Fri, 12 Jun 1998 06:36:53 +0200 (MET DST), Rik van Riel
<H.H.vanRiel@phys.uu.nl> said:
> [Paul gets "cannot fork" errors after 60 or more hours of
> uptime. This suggests fragmentation problems.]
Kernel version?
> Ahh, I think I see it now. The fragmentation on your system persists
> because of the swap cache. The swap cache 'caches' swap pages and
> kinda makes sure they are reloaded to the same physical address.
No. As it stands right now, the "caching" component of the swap cache
is an *on disk* cache of resident pages. Once the pages are swapped
out they are paged back in anywhere appropriate. That part of the
fragmentation does not persist.
The real problem is not swapper, I suspect, but the various consumers of
slab cache (especially dcache). The slab allocator has some really
nasty properties; just one single in-use object will pin an entire slab
(up to 32k) into memory. If the slabs become small, then it will be 4k
pages which get so pinned, and at that point we cannot allocate any
stack pages. There are a number of ways we may tackle this in 2.1, but
disabling the swap cache won't help at all.
--Stephen
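The pinning effect Stephen describes can be illustrated with a small
sketch. This is not kernel code: the 4 kB slab, the 128-byte object
size, and the function name are assumptions chosen for illustration.
It counts how many slabs must stay in memory for a given set of live
objects.

```python
def slabs_pinned(live_objects, objs_per_slab):
    """Return how many slabs hold at least one live object.

    live_objects: indices of objects still in use, numbered in
    allocation order. A slab can only be handed back to the page
    allocator once it holds no live objects at all.
    """
    return len({index // objs_per_slab for index in live_objects})


# Assumed geometry: 4 kB slabs of 128-byte objects, so 32 objects per slab.
OBJS_PER_SLAB = 4096 // 128

# 100 live objects packed next to each other occupy 4 slabs (16 kB).
packed = slabs_pinned(range(100), OBJS_PER_SLAB)

# The same number of objects, one survivor per slab, pins 100 slabs (400 kB).
scattered = slabs_pinned([i * OBJS_PER_SLAB for i in range(100)], OBJS_PER_SLAB)

print(packed, scattered)
```

With the same amount of live data, the scattered case keeps 25 times
as many pages unavailable to other allocations.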
* Re: update re: fork() failures in 2.1.101
From: Rik van Riel @ 1998-06-19 7:33 UTC
To: Paul Kimoto; +Cc: Linux MM
[CC-ed to linux-mm, and it should stay that way...]
On Thu, 18 Jun 1998, Paul Kimoto wrote:
> For completeness, here is the fragmentation report for each:
> > Jun 18 01:24:48 ( 48*4kB 7*8kB 1*16kB 1*32kB 4*64kB 1*128kB = 680kB)
> > Jun 18 18:03:53 ( 1*4kB 28*8kB 39*16kB 2*32kB 1*64kB 1*128kB = 1108kB)
Damn, this looks near-perfect for normal system load...
I really don't understand what's wrong.
> If you have other suggestions for things to try, with the reduction in
> memory (from 48 MB) the problems seem to arise in about half the time.
I wonder what kind of software / networking app you are using,
and what memory usage those programs have...
Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
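The two fragmentation reports quoted above can be checked
mechanically. The sketch below parses the "count*size" pairs, confirms
the stated totals, and counts the free blocks of 8 kB or more. The
function names are made up for this example, and the 8 kB threshold
assumes the two-page kernel stack that i386 kernels of this era
allocate on fork().

```python
import re


def parse_free_report(report):
    """Return {block_size_kB: count} from text like '48*4kB 7*8kB = 248kB'."""
    blocks = {}
    # Only "count*sizekB" pairs match; the trailing "= NNNkB" total does not.
    for count, size in re.findall(r"(\d+)\*(\d+)kB", report):
        blocks[int(size)] = int(count)
    return blocks


def total_kb(blocks):
    """Total free memory in kB across all block sizes."""
    return sum(size * count for size, count in blocks.items())


def blocks_at_least(blocks, min_kb):
    """Number of free blocks that are min_kb or larger."""
    return sum(count for size, count in blocks.items() if size >= min_kb)


early = parse_free_report("48*4kB 7*8kB 1*16kB 1*32kB 4*64kB 1*128kB = 680kB")
late = parse_free_report("1*4kB 28*8kB 39*16kB 2*32kB 1*64kB 1*128kB = 1108kB")

for name, blocks in (("01:24:48", early), ("18:03:53", late)):
    print(name, total_kb(blocks), "kB free,",
          blocks_at_least(blocks, 8), "blocks of 8 kB or more")
```

Both reports add up to their stated totals and both contain free
blocks of 8 kB or larger, which fits Rik's reading that the free
lists do not look badly fragmented.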
* Re: update re: fork() failures [in 2.1.103]
From: Paul Kimoto @ 1998-06-19 15:01 UTC
To: Linux MM; +Cc: Rik van Riel
On Fri, Jun 19, 1998 at 09:33:54AM +0200, Rik van Riel wrote:
> I wonder what kind of software / networking app you are using,
> and what memory usage those programs have...
It's a mixed libc5/libc6 system.
Here is a snapshot of the Top 20 in RSS:
%CPU %MEM  SIZE  RSS
 1.3 18.9 13552 5876 Xwrapper     XFree 3.3.2.2
 1.0 18.4 10612 5716 netscape     3.01
 0.0  5.6  4508 1740 kermitbeta   6.1.193 Beta.05
 1.2  5.1  4072 1584 rvplayer     5.0.0.35
 0.0  3.7  4372 1176 kermitbeta
 0.0  3.7  4372 1168 kermitbeta
 0.0  2.9  1824  908 named        8.1.2
 0.0  2.9   960  908 xntpd        3-5.91 (locked into memory)
 0.0  2.8  2584  876 xterm
 0.0  2.4  2420  748 xterm
 0.0  2.3  1448  716 zsh          3.1.4
 0.0  2.1  1380  676 zsh
 0.0  2.1  1380  676 zsh
 0.0  2.1  1404  668 perl         5.004_04
 0.0  1.9  1512  592 gnuplot_x11  3.5 (3.50.1.17)
 0.0  1.9  2164  592 xload
 0.0  1.6   932  520 pppd         2.3.5
95.7  1.6  9364  520 mprime       15.4.2 (internet Mersenne prime search)
 0.0  1.5  1756  496 gnuplot
 0.0  1.5   836  488 ps           1.2.4
-Paul <kimoto@lightlink.com>
[please cc: relevant messages to me]
* Re: update re: fork() failures [in 2.1.103]
From: Rik van Riel @ 1998-06-19 16:59 UTC
To: Paul Kimoto; +Cc: Linux MM
On Fri, 19 Jun 1998, Paul Kimoto wrote:
> On Fri, Jun 19, 1998 at 09:33:54AM +0200, Rik van Riel wrote:
> > I wonder what kind of software / networking app you are using,
> > and what memory usage those programs have...
>
> It's a mixed libc5/libc6 system.
> Here is a snapshot of the Top 20 in RSS:
>
> %CPU %MEM SIZE RSS
> 1.3 18.9 13552 5876 Xwrapper XFree 3.3.2.2
> 1.0 18.4 10612 5716 netscape 3.01
> 95.7 1.6 9364 520 mprime 15.4.2 (internet Mersenne prime search)
Shouldn't be much of a problem... But 'eh, does the
Mersenne program regularly do memory I/O?
It could be that it loads large chunks of memory and
frees small portions from the middle of it. The Linux
MM system could have a problem with that...
Of course we should be able to handle such stuff, but
with the current buddy allocator things might just get
a little bit tricky :(
The reason I picked this process is that its RSS is
only one 18th of its total size, which is somewhat
weird for a 'normal' Unix process.
Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
* Re: update re: fork() failures [in 2.1.103]
From: Paul Kimoto @ 1998-06-19 20:14 UTC
To: Rik van Riel; +Cc: Linux MM, woltman
On Fri, Jun 19, 1998 at 06:59:56PM +0200, Rik van Riel wrote:
>> %CPU %MEM SIZE RSS
>> 95.7 1.6 9364 520 mprime 15.4.2 (internet Mersenne prime search)
> Shouldn't be much of a problem... But 'eh, does the
> Mersenne program regularly do memory I/O?
> It could be that it loads large chunks of memory and
> frees small portions from the middle of it. The Linux
> MM system could have a problem with that...
> The reason I picked this process is that its RSS is
> only one 18th of its total size, which is somewhat
> weird for a 'normal' Unix process.
I *think* that it allocates a huge amount of memory,
then uses only a small portion of it.
The above shows an inconsistency between "ps" and "top":
according to "ps", SIZE=9364, RSS=404;
but according to "top", SIZE= 500, RSS=404, SWAP=96.
"grep '^Vm' /proc/<pid>/status" says
> VmSize: 9364 kB
> VmLck: 0 kB
> VmRSS: 464 kB
> VmData: 8400 kB
> VmStk: 12 kB
> VmExe: 72 kB
> VmLib: 580 kB
-Paul <kimoto@lightlink.com>
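The Vm* lines above are easy to compare programmatically. The
following sketch parses them into a dictionary and computes how much
of the address space is actually resident. The function name is
hypothetical, and the sample text is copied from the message above
rather than read from a live /proc.

```python
def parse_vm_fields(status_text):
    """Return {field: kB} for the Vm* lines of a /proc/<pid>/status dump.

    Leading '>' quote markers and spaces are stripped so that text
    pasted from an email reply can be used as-is.
    """
    fields = {}
    for line in status_text.splitlines():
        line = line.lstrip("> ").strip()
        if line.startswith("Vm"):
            name, value = line.split(":", 1)
            fields[name] = int(value.split()[0])  # drop the "kB" unit
    return fields


SAMPLE = """\
> VmSize: 9364 kB
> VmLck: 0 kB
> VmRSS: 464 kB
> VmData: 8400 kB
> VmStk: 12 kB
> VmExe: 72 kB
> VmLib: 580 kB
"""

vm = parse_vm_fields(SAMPLE)
resident_share = vm["VmRSS"] / vm["VmSize"]
print(f"resident: {resident_share:.1%} of {vm['VmSize']} kB")
```

Almost all of the size is VmData, of which only a small part is
resident, which matches Paul's guess that the program allocates far
more than it touches.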
* Re: update re: fork() failures [in 2.1.103]
From: George Woltman @ 1998-06-20 0:48 UTC
To: Paul Kimoto, Rik van Riel; +Cc: Linux MM
At 04:14 PM 6/19/98 -0400, Paul Kimoto wrote:
>
>I *think* that it allocates a huge amount of memory,
>then uses only a small portion of it.
This is indeed the case. I know it's a sloppy programming practice,
but it was the easiest way for me to interface with all my assembly
code that assumes the FFT data is at a fixed address. Mprime actually
has 16MB of global variables. Unless you are testing an exponent above
20,000,000 then you are only using a small fraction of the 16MB.
Best regards,
George
* Re: update re: fork() failures in 2.1.103
From: Paul Kimoto @ 1998-06-21 20:19 UTC
To: Linux MM
RECAP: In 2.1.99, 2.1.101, 2.1.103, and 2.1.104-pre1, my system has been
usable for only ~1 day with 32 MB of memory, or ~2.5 days with 48 MB.
Then my system has trouble forking, typically with EAGAIN. The situation
can be alleviated temporarily by killing off a few processes, but the
errors always reappear soon thereafter. I have sent in the results of
Shift-ScrollLock, which Rik thinks are not typical of excessive memory
fragmentation.
Now, I have scripts that run "ifconfig ppp0" hourly (to check whether PPP
is "UP"). Recently I joined the modern era by changing from net-tools
1.432 to 1.45. The forking errors have gone away (at least for uptimes
twice the above). When I changed these scripts to run "/sbin/ifconfig.old
ppp0" instead, they came back.
Running the old ifconfig (when the problem arises) would put "kmod: fork
failed, errno 11" messages in the logfiles. The new ifconfig doesn't.
Running strace on "ifconfig ppp0" shows that the old version makes the
following system calls that the new one doesn't:
> socket(PF_??? (0x4), SOCK_DGRAM, , 0) = -1 ENOSYS (Function not implemented)
> socket(PF_??? (0x4), SOCK_DGRAM, , 0) = -1 ENOSYS (Function not implemented)
> socket(PF_??? (0x4), SOCK_DGRAM, , 0) = -1 EINVAL (Invalid argument)
> socket(PF_??? (0x3), SOCK_DGRAM, , 0) = -1 ENOSYS (Function not implemented)
> socket(PF_??? (0x3), SOCK_DGRAM, , 0) = -1 ENOSYS (Function not implemented)
> socket(PF_??? (0x3), SOCK_DGRAM, , 0) = -1 EINVAL (Invalid argument)
> socket(PF_??? (0x5), SOCK_DGRAM, , 0) = -1 ENOSYS (Function not implemented)
> socket(PF_??? (0x5), SOCK_DGRAM, , 0) = -1 ENOSYS (Function not implemented)
> socket(PF_??? (0x5), SOCK_DGRAM, , 0) = -1 EINVAL (Invalid argument)
(I am not sure whether these system calls have been taken out of the
new ifconfig, or whether I merely configured net-tools to be ignorant
of appletalk, etc.)
Something about my old ifconfig must be triggering a bug (or hardware
error?) somewhere. I am willing to take further suggestions for
experiments to try, if anyone is still interested.
-Paul <kimoto@lightlink.com>
(please cc: relevant messages to me)
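The family numbers in the strace output (0x3, 0x4, 0x5) are the Linux
values of AF_AX25, AF_IPX, and AF_APPLETALK. A rough way to repeat
what the old ifconfig does is to try opening a datagram socket in each
family and record the result. The sketch below does only that; the
helper name is made up, and the link to the "kmod: fork failed"
messages (a kernel built with kmod trying to load a protocol module
for each unknown family) is an assumption, not something established
in this thread.

```python
import errno
import socket

# Linux address family numbers seen in the strace output above.
FAMILIES = {3: "AX.25", 4: "IPX", 5: "AppleTalk"}


def probe_family(family):
    """Try to open a datagram socket; return 'ok' or the errno name."""
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM)
    except OSError as exc:
        # Map the numeric errno to its symbolic name where one is known.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    sock.close()
    return "ok"


results = {name: probe_family(number) for number, name in FAMILIES.items()}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

The outcome depends on which protocols the running kernel supports,
so no particular output is expected. Running it while watching the
kernel log would show whether each probe triggers a module request.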