All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: "Stephen C. Tweedie" <sct@redhat.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: "Eric W. Biederman" <ebiederm+eric@ccr.net>,
	Andrea Arcangeli <andrea@e-mind.com>,
	steve@netplus.net, brent verner <damonbrent@earthlink.net>,
	"Garst R. Reese" <reese@isn.net>,
	Kalle Andersson <kalle.andersson@mbox303.swipnet.se>,
	Zlatko Calusic <Zlatko.Calusic@CARNet.hr>,
	Ben McCann <bmccann@indusriver.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	bredelin@ucsd.edu, "Stephen C. Tweedie" <sct@redhat.com>,
	linux-kernel@vger.rutgers.edu,
	Rik van Riel <H.H.vanRiel@phys.uu.nl>,
	linux-mm@kvack.org
Subject: Re: arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm , improvement , [Re: 2.2.0 Bug summary]]]
Date: Sat, 9 Jan 1999 02:13:08 GMT	[thread overview]
Message-ID: <199901090213.CAA05306@dax.scot.redhat.com> (raw)
In-Reply-To: <Pine.LNX.3.95.990107093746.4270H-100000@penguin.transmeta.com>

Hi,

On Thu, 7 Jan 1999 09:56:03 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:

> That way I can be reasonably hopeful that there are no new bugs introduced
> even though performance is very different. I _do_ have some early data
> that seems to say that this _has_ uncovered a very old deadlock condition: 
> something that could happen before but was almost impossible to trigger. 

> The deadlock I suspect is:
>  - we're low on memory
>  - we allocate or look up a new block on the filesystem. This involves
>    getting the ext2 superblock lock, and doing a "bread()" of the free
>    block bitmap block.
>  - this causes us to try to allocate a new buffer, and we are so low on
>    memory that we go into try_to_free_pages() to find some more memory.
>  - try_to_free_pages() finds a shared memory file to page out.
>  - trying to page that out, it looks up the buffers on the filesystem it
>    needs, but deadlocks on the superblock lock.

Hmm, I think that's a new one to me, but add to that one which I think
we've come across before and which I have not even thought about for a
couple of years at least: a large write() to a mmap()ed file can
deadlock for a similar reason, but on the inode write lock instead of
the superblock lock.

> The positive news is that if I'm right in my suspicions it can only happen
> with shared writable mappings or shared memory segments. The bad news is
> that the bug appears rather old, and no immediate solution presents
> itself. 

A couple solutions which come to mind: (1) make the superblock lock
recursive (ugh, horrible and it only works if we have an additional
mechanism to pin down bitmap buffers in the bitmap cache), or (2) allow
load_block_bitmap and friends to drop the superblock if it finds that it
needs to do an IO, and repeat if it happened.  However, what we're
basically saying here is that all operations on the superblock_lock have
to drop the lock if they want to allocate memory, and that's not a great
deal of fun: we might as well use the kernel spinlock.

It gets worse, because of course we cannot even rely on kswapd to
function correctly in this situation --- it will block on the superblock
lock just as happily as the current process's try_to_free_pages call
will.

I think the cleanest solution may be to reimplement some form of the old
GFP_IO flag, to prevent us from trying to use IO inside
try_to_free_pages() if we know we already have a lock which could
deadlock.  The easiest way I can see of achieving something like this is
to set current->flags |= PF_MEMALLOC while we hold the superblock lock,
or create another PF_NOIO flag which prevents try_to_free_pages from
doing anything with dirty pages.  I suspect that the PF_MEMALLOC option
might be good enough for starters; it will only do the wrong thing if we
have entirely exhausted the free page list.

The inode deadlock at least is relatively easy to fix, either by making
the inodelock recursive, or by having a separate sharable truncate lock
to prevent pages from being invalidated in the middle of the pageout
(which was the reason for the down() in the filemap write-page code in
the first place).  The truncate lock (or allocation/deallocation lock,
if you want to do it that way) makes a ton of sense; it avoids
serialising all writes while still making sure that truncates themselves
are exclusively locked.

>> 2) I have tested using PG_dirty from shrink_mmap and it is a
>> performance problem because it loses all locality of reference,
>> and because it forces shrink_mmap into a dual role, of freeing and
>> writing pages, which need seperate tuning.

> Exactly. This is part of the complexity.

> The right solution (I _think_) is to conceptually always mark it PG_dirty
> in vmscan, and basically leave all the nasty cases to the filemap physical
> page scan. But in the simple cases (ie a swap-cached page that is only
> mapped by one process and doesn't have any other users), you'd start the
> IO "early".

The trouble is that when we come to do the physical IO, we really want
to cluster the IOs.  Doing the swap cache allocation from vmscan means
that we'll still be allocating virtually adjacent memory pages to
adjacent swap pages, but if we don't do the IO itself until
shrink_mmap(), we'll lose the IO clustering which we need for good
swapout performance.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

  parent reply	other threads:[~1999-01-09  2:13 UTC|newest]

Thread overview: 243+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <199812290146.BAA12687@terrorserver.swansea.linux.org.uk>
1998-12-31 18:00 ` 2.2.0 Bug summary Andrea Arcangeli
1998-12-31 18:34   ` [patch] new-vm improvement [Re: 2.2.0 Bug summary] Andrea Arcangeli
1999-01-01  0:16     ` Steve Bergman
1999-01-01 17:16       ` Andrea Arcangeli
1999-01-01 16:44     ` Andrea Arcangeli
1999-01-01 20:02       ` Andrea Arcangeli
1999-01-01 23:46         ` Steve Bergman
1999-01-02  6:55           ` Linus Torvalds
1999-01-02  8:33             ` Steve Bergman
1999-01-02 14:48             ` Andrea Arcangeli
1999-01-02 15:38             ` Andrea Arcangeli
1999-01-02 18:10               ` Linus Torvalds
1999-01-02 20:52               ` Andrea Arcangeli
1999-01-03  2:59                 ` Andrea Arcangeli
1999-01-04 18:08                   ` [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm improvement , [Re: 2.2.0 Bug summary]] Andrea Arcangeli
1999-01-04 20:56                     ` Linus Torvalds
1999-01-04 21:10                       ` Rik van Riel
1999-01-04 22:04                       ` Alan Cox
1999-01-04 21:55                         ` Linus Torvalds
1999-01-04 22:51                           ` Andrea Arcangeli
1999-01-05  0:32                             ` Andrea Arcangeli
1999-01-05  0:52                               ` Zlatko Calusic
1999-01-05  3:02                               ` Zlatko Calusic
1999-01-05 11:49                                 ` Andrea Arcangeli
1999-01-05 13:23                                   ` Zlatko Calusic
1999-01-05 15:42                                     ` Andrea Arcangeli
1999-01-05 16:16                                       ` Zlatko Calusic
1999-01-05 15:35                               ` arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm , improvement , [Re: 2.2.0 Bug summary]]] Andrea Arcangeli
1999-01-06 14:48                                 ` Andrea Arcangeli
1999-01-06 23:31                                   ` Andrea Arcangeli
1999-01-07  3:32                                     ` Results: 2.2.0-pre5 vs arcavm10 vs arcavm9 vs arcavm7 Steve Bergman
1999-01-07 12:02                                       ` Andrea Arcangeli
1999-01-07 20:27                                         ` Linus Torvalds
1999-01-07 23:56                                           ` Andrea Arcangeli
1999-01-07 17:35                                       ` Linus Torvalds
1999-01-07 18:44                                         ` Zlatko Calusic
1999-01-07 19:33                                           ` Linus Torvalds
1999-01-07 21:10                                             ` Zlatko Calusic
1999-01-07 19:38                                           ` Zlatko Calusic
1999-01-07 19:40                                           ` Andrea Arcangeli
1999-01-09  6:28                                           ` 2.2.0-pre[56] swap performance poor with > 1 thrashing task Dax Kelson
1999-01-09  6:32                                             ` Zlatko Calusic
1999-01-09  6:44                                               ` Linus Torvalds
1999-01-09 18:58                                                 ` Andrea Arcangeli
1999-01-11  9:21                                                 ` Buffer handling (setting PG_referenced on access) Zlatko Calusic
1999-01-11 17:44                                                   ` Linus Torvalds
1999-01-11 20:14                                                     ` Zlatko Calusic
1999-01-16 17:35                                                 ` 2.2.0-pre[56] swap performance poor with > 1 thrashing task Andrea Arcangeli
1999-01-09  7:48                                             ` Benjamin Redelings I
1999-01-09  6:53                                               ` Linus Torvalds
1999-01-09 22:39                                       ` Results: pre6 vs pre6+zlatko's_patch vs pre5 vs arcavm13 Steve Bergman
1999-01-10  0:28                                         ` Steve Bergman
1999-01-10  5:35                                           ` Linus Torvalds
1999-01-10 18:33                                             ` Andrea Arcangeli
1999-01-10 18:43                                             ` Steve Bergman
1999-01-10 19:08                                               ` Linus Torvalds
1999-01-10 19:23                                                 ` Vladimir Dergachev
1999-01-10 20:09                                                 ` Andrea Arcangeli
1999-01-10 20:29                                                 ` Steve Bergman
1999-01-10 21:41                                                   ` Linus Torvalds
1999-01-10 23:33                                                     ` testing/pre-7 and do_poll() Chip Salzenberg
1999-01-11  6:02                                                       ` Linus Torvalds
1999-01-11  6:26                                                         ` Chip Salzenberg
1999-01-11  6:46                                                           ` Linus Torvalds
1999-01-11  6:59                                                             ` Chip Salzenberg
1999-01-11  7:02                                                               ` Linus Torvalds
1999-01-11 22:08                                                                 ` Shawn Leas
1999-01-11 22:13                                                                   ` Linus Torvalds
1999-01-12  0:25                                                                     ` estafford
1999-01-12  8:25                                                                       ` Shawn Leas
1999-01-12  7:06                                                                     ` Gregory Maxwell
1999-01-11 20:20                                                       ` Adam Heath
1999-01-11 16:57                                                   ` Results: pre6 vs pre6+zlatko's_patch vs pre5 vs arcavm13 Steve Bergman
1999-01-11 19:36                                                     ` Andrea Arcangeli
1999-01-11 23:03                                                       ` Andrea Arcangeli
1999-01-11 23:38                                                         ` Zlatko Calusic
1999-01-12  2:02                                                         ` Steve Bergman
1999-01-12  3:21                                                           ` Results: Zlatko's new vm patch Steve Bergman
1999-01-12  5:33                                                             ` Linus Torvalds
1999-01-12 14:49                                                               ` Andrea Arcangeli
1999-01-12 16:58                                                               ` Joseph Anthony
1999-01-12 18:16                                                                 ` Stephen C. Tweedie
1999-01-12 20:15                                                                   ` Michael K Vance
1999-01-13 19:25                                                                     ` Stephen C. Tweedie
1999-01-12 18:24                                                               ` Michael K Vance
1999-01-13  0:01                                                               ` Where to find pre7. Was: " Robert Thorncrantz
1999-01-13 20:47                                                             ` [patch] arca-vm-19 [Re: Results: Zlatko's new vm patch] Andrea Arcangeli
1999-01-14 12:30                                                               ` Andrea Arcangeli
1999-01-15 23:56                                                                 ` [patch] NEW: arca-vm-21, swapout via shrink_mmap using PG_dirty Andrea Arcangeli
1999-01-16 16:49                                                                   ` Andrea Arcangeli
1999-01-17 23:47                                                                     ` Andrea Arcangeli
1999-01-18  5:11                                                                       ` Linus Torvalds
1999-01-18  7:28                                                                         ` Eric W. Biederman
1999-01-18 10:00                                                                           ` Andrea Arcangeli
1999-01-18  9:15                                                                         ` Andrea Arcangeli
1999-01-18 17:49                                                                           ` Linus Torvalds
1999-01-18 19:22                                                                       ` Andrea Arcangeli
1999-01-10 20:40                                               ` Results: pre6 vs pre6+zlatko's_patch vs pre5 vs arcavm13 Andrea Arcangeli
1999-01-10 20:50                                                 ` Linus Torvalds
1999-01-10 21:01                                                   ` Andrea Arcangeli
1999-01-10 21:51                                                     ` Steve Bergman
1999-01-10 22:50                                                       ` Results: arcavm15, et. al Steve Bergman
1999-01-11  0:20                                                         ` Steve Bergman
1999-01-11 13:21                                                         ` Andrea Arcangeli
1999-01-11  3:47                                         ` Results: pre6 vs pre6+zlatko's_patch vs pre5 vs arcavm13 Gregory Maxwell
1999-01-06 23:35                                   ` arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm , improvement , [Re: 2.2.0 Bug summary]]] Linus Torvalds
1999-01-07  4:30                                     ` Eric W. Biederman
1999-01-07 17:56                                       ` Linus Torvalds
1999-01-07 18:18                                         ` Rik van Riel
1999-01-07 19:19                                           ` arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] Alan Cox
1999-01-07 18:55                                         ` arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm , improvement , [Re: 2.2.0 Bug summary]]] Zlatko Calusic
1999-01-07 22:57                                         ` Linus Torvalds
1999-01-08  1:16                                           ` Linus Torvalds
1999-01-08 10:45                                             ` Andrea Arcangeli
1999-01-08 19:06                                               ` Linus Torvalds
1999-01-09  9:43                                           ` MM deadlock [was: Re: arca-vm-8...] Savochkin Andrey Vladimirovich
1999-01-09 18:00                                             ` Linus Torvalds
1999-01-09 18:41                                               ` Andrea Arcangeli
1999-01-10 21:41                                                 ` Stephen C. Tweedie
1999-01-10 21:47                                                   ` Linus Torvalds
1999-01-09 21:50                                               ` Linus Torvalds
1999-01-10 11:56                                                 ` Savochkin Andrey Vladimirovich
1999-01-10 17:59                                                   ` Andrea Arcangeli
1999-01-10 22:33                                                   ` Stephen C. Tweedie
1999-01-10 16:59                                                 ` Stephen C. Tweedie
1999-01-10 18:13                                                   ` Andrea Arcangeli
1999-01-10 18:35                                                   ` Linus Torvalds
1999-01-10 19:45                                                     ` Alan Cox
1999-01-10 19:03                                                       ` Andrea Arcangeli
1999-01-10 21:39                                                         ` Stephen C. Tweedie
1999-01-10 19:09                                                       ` Linus Torvalds
1999-01-10 20:33                                                         ` Alan Cox
1999-01-10 20:07                                                           ` Linus Torvalds
1999-01-10 22:18                                                     ` Stephen C. Tweedie
1999-01-10 22:49                                                     ` Stephen C. Tweedie
1999-01-11  6:04                                                       ` Eric W. Biederman
1999-01-12 16:06                                                         ` Stephen C. Tweedie
1999-01-12 17:54                                                           ` Linus Torvalds
1999-01-12 18:44                                                             ` Zlatko Calusic
1999-01-12 19:05                                                               ` Andrea Arcangeli
1999-01-13 17:48                                                                 ` Stephen C. Tweedie
1999-01-13 18:07                                                                   ` 2.2.0-pre6 ain't nice =( Kalle Andersson
1999-01-13 19:05                                                                   ` MM deadlock [was: Re: arca-vm-8...] Alan Cox
1999-01-13 19:23                                                                     ` MOLNAR Ingo
1999-01-13 19:26                                                                     ` Andrea Arcangeli
1999-01-14 11:02                                                                       ` Mike Jagdis
1999-01-14 22:38                                                                         ` Andrea Arcangeli
1999-01-15  7:40                                                                       ` Agus Budy Wuysang
1999-01-14 10:48                                                                   ` Mike Jagdis
1999-01-12 21:46                                                               ` Rik van Riel
1999-01-13  6:52                                                                 ` Zlatko Calusic
1999-01-13 13:45                                                                 ` Andrea Arcangeli
1999-01-13 13:58                                                                   ` Chris Evans
1999-01-13 15:07                                                                     ` Andrea Arcangeli
1999-01-13 22:11                                                                       ` Stephen C. Tweedie
1999-01-13 14:59                                                                   ` Rik van Riel
1999-01-13 18:10                                                                     ` Andrea Arcangeli
1999-01-13 22:14                                                                       ` Stephen C. Tweedie
1999-01-14 14:53                                                                         ` Dr. Werner Fink
1999-01-21 16:50                                                                           ` Stephen C. Tweedie
1999-01-21 19:53                                                                             ` Andrea Arcangeli
1999-01-22 13:55                                                                               ` Stephen C. Tweedie
1999-01-22 19:45                                                                                 ` Andrea Arcangeli
1999-01-23 23:20                                                                             ` Alan Cox
1999-01-24  0:19                                                                               ` Linus Torvalds
1999-01-24 18:33                                                                                 ` Gregory Maxwell
1999-01-25  0:21                                                                                   ` Linus Torvalds
1999-01-25  1:28                                                                                     ` Alan Cox
1999-01-25  3:35                                                                                       ` pmonta
1999-01-25  4:17                                                                                       ` Linus Torvalds
1999-01-24 20:33                                                                                 ` Alan Cox
1999-01-25  0:27                                                                                   ` Linus Torvalds
1999-01-25  1:38                                                                                     ` Alan Cox
1999-01-25  1:04                                                                                       ` Andrea Arcangeli
1999-01-25  2:10                                                                                         ` Alan Cox
1999-01-25  3:16                                                                                           ` Garst R. Reese
1999-01-25 10:49                                                                                             ` Alan Cox
1999-01-25 14:06                                                                                           ` Rik van Riel
1999-01-25 21:59                                                                                       ` Gerard Roudier
1999-01-26 11:45                                                                                         ` Thomas Sailer
1999-01-26 20:48                                                                                           ` Gerard Roudier
1999-01-26 21:24                                                                                             ` Thomas Sailer
1999-01-27  0:25                                                                                             ` David Lang
1999-01-27 16:05                                                                                             ` Stephen C. Tweedie
1999-01-27 20:11                                                                                               ` Gerard Roudier
1999-01-26 13:06                                                                                         ` Stephen C. Tweedie
1999-01-26 14:28                                                                                           ` Alan Cox
1999-01-26 14:15                                                                                             ` MOLNAR Ingo
1999-01-26 14:36                                                                                               ` yodaiken
1999-01-26 15:21                                                                                                 ` MOLNAR Ingo
1999-01-27 10:31                                                                                                   ` yodaiken
1999-01-26 15:46                                                                                                 ` Alan Cox
1999-01-26 16:45                                                                                                   ` Stephen C. Tweedie
1999-01-30  7:01                                                                                                     ` yodaiken
1999-02-01 13:07                                                                                                       ` Stephen C. Tweedie
1999-01-26 16:37                                                                                               ` Stephen C. Tweedie
1999-01-27 11:35                                                                                               ` Jakub Jelinek
1999-01-26 14:21                                                                                             ` Rik van Riel
1999-01-25 16:25                                                                                 ` Stephen C. Tweedie
1999-01-25 16:52                                                                                   ` Andrea Arcangeli
1999-01-25 18:27                                                                                   ` Linus Torvalds
1999-01-25 18:43                                                                                     ` Stephen C. Tweedie
1999-01-25 18:49                                                                                       ` Linus Torvalds
1999-01-25 18:43                                                                                   ` Linus Torvalds
1999-01-25 19:15                                                                                     ` Stephen C. Tweedie
1999-01-26  1:57                                                                                   ` Andrea Arcangeli
1999-01-26 18:37                                                                                     ` Andrea Arcangeli
1999-01-27 12:13                                                                                     ` Stephen C. Tweedie
1999-01-22 16:29                                                                           ` Eric W. Biederman
1999-01-25 13:14                                                                             ` Dr. Werner Fink
1999-01-25 17:56                                                                               ` Stephen C. Tweedie
1999-01-25 19:10                                                                               ` Andrea Arcangeli
1999-01-25 20:49                                                                                 ` Dr. Werner Fink
1999-01-25 20:56                                                                                   ` Linus Torvalds
1999-01-26 12:23                                                                                     ` Rik van Riel
1999-01-26 15:44                                                                                       ` Andrea Arcangeli
1999-01-27 14:52                                                                                   ` Stephen C. Tweedie
1999-01-28 19:12                                                                                     ` Dr. Werner Fink
1999-01-13 17:55                                                                   ` [PATCH] " Stephen C. Tweedie
1999-01-13 18:52                                                                     ` Andrea Arcangeli
1999-01-13 22:10                                                                       ` Stephen C. Tweedie
1999-01-13 22:30                                                                         ` Linus Torvalds
1999-01-11 11:20                                                       ` Pavel Machek
1999-01-11 17:35                                                         ` Stephen C. Tweedie
1999-01-11 14:11                                                     ` Savochkin Andrey Vladimirovich
1999-01-11 17:55                                                       ` Linus Torvalds
1999-01-11 18:37                                                         ` Andrea Arcangeli
1999-01-08  2:56                                         ` arca-vm-8 [Re: [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm , improvement , [Re: 2.2.0 Bug summary]]] Eric W. Biederman
1999-01-09  0:50                                         ` David S. Miller
1999-01-09  2:13                                         ` Stephen C. Tweedie [this message]
1999-01-09  2:34                                           ` Andrea Arcangeli
1999-01-09  9:30                                             ` Stephen C. Tweedie
1999-01-09 12:11                                           ` Andrea Arcangeli
1999-01-07 14:11                                     ` Andrea Arcangeli
1999-01-07 18:19                                       ` Linus Torvalds
1999-01-07 20:35                                         ` Andrea Arcangeli
1999-01-07 23:51                                           ` Linus Torvalds
1999-01-08  0:04                                             ` Andrea Arcangeli
1999-01-04 22:43                         ` [patch] arca-vm-6, killed kswapd [Re: [patch] new-vm improvement , [Re: 2.2.0 Bug summary]] Andrea Arcangeli
1999-01-04 22:29                       ` Andrea Arcangeli
1999-01-05 13:33                   ` [patch] new-vm improvement [Re: 2.2.0 Bug summary] Ben McCann
1999-01-02 20:04             ` Steve Bergman
1999-01-02  3:03         ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=199901090213.CAA05306@dax.scot.redhat.com \
    --to=sct@redhat.com \
    --cc=H.H.vanRiel@phys.uu.nl \
    --cc=Zlatko.Calusic@CARNet.hr \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=andrea@e-mind.com \
    --cc=bmccann@indusriver.com \
    --cc=bredelin@ucsd.edu \
    --cc=damonbrent@earthlink.net \
    --cc=ebiederm+eric@ccr.net \
    --cc=kalle.andersson@mbox303.swipnet.se \
    --cc=linux-kernel@vger.rutgers.edu \
    --cc=linux-mm@kvack.org \
    --cc=reese@isn.net \
    --cc=steve@netplus.net \
    --cc=torvalds@transmeta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.