* RAWIO-5 and O_DIRECT-3 updates
From: Andrea Arcangeli @ 2001-04-24  6:56 UTC
  To: Stephen C. Tweedie, Linus Torvalds; +Cc: linux-kernel, kiobuf-io-devel

I fixed a new bug pointed out by Andrew and discussed on the kiobuf list
(thanks Andrew!): lock_kiovec was not handling a failed trylockpage
correctly and could unlock pages locked by other people. Not a big deal
in practice, though, as the function is never called anywhere in pre6, and
I'm wondering whether it makes sense at all to allow the pinned pages to
be locked down in 2.4 (it is certainly still necessary in 2.2 for all the
reads from disk). However, I didn't kill the function, just in case there's
a useful use for it. The new updated rawio patch against vanilla 2.4.4pre6
is here:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre6/rawio-5.bz2
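(For the record, here is a minimal illustrative sketch of the bug class --
this is not the actual 2.4 lock_kiovec code, and the function name is
invented: on a failed trylock, the error path must unwind only the pages it
locked itself, never pages locked by someone else.)

	/*
	 * Illustrative sketch only, not the real lock_kiovec().  Shows the
	 * correct unwind after a failed TryLockPage(): unlock only [0, i),
	 * i.e. the pages *we* locked so far.
	 */
	static int lock_pages_sketch(struct page **pages, int nr)
	{
		int i;

		for (i = 0; i < nr; i++) {
			/* TryLockPage() returns nonzero if the page was
			 * already locked: pages[i] is not ours to unlock */
			if (TryLockPage(pages[i]))
				goto out_unlock;
		}
		return 0;

	out_unlock:
		/* the buggy version unlocked all nr pages here, dropping
		 * locks held by other tasks */
		while (--i >= 0)
			UnlockPage(pages[i]);
		return -EAGAIN;
	}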

Then I integrated a new feature into the O_DIRECT support to make it
possible to set/unset O_DIRECT via fcntl (I intentionally delayed that
feature in the first release because I thought it was low priority, but
Michael Susæg just asked for it in order to sometimes do I/O at misaligned
offsets). The way I preferred to implement it is to leave the kiobuf
allocated all the time, until we destroy the file structure in fput (the
kiobuf is a huge thing, even more so with my rawio x2 boost patch above,
which handles 512k of atomic I/O with everything needed for the I/O
preallocated, and we don't want to allocate/deallocate the whole thing
every time we switch in or out of buffered mode). I'm also abusing
inode->i_sem in fcntl to serialize the case of two tasks doing
fcntl(O_DIRECT) at the same time on two file descriptors pointing to the
same file (the kiobuf allocation can sleep; it even does a vmalloc...).
Abusing it looks safe, and this way we save the memory of one semaphore
per file structure. The new patch is here:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre6/o_direct-3
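(A hedged sketch of the serialization pattern just described -- the function
itself and the f_iobuf field name are illustrative, not the exact patch
code:)

	/* Sketch only: toggle O_DIRECT from the fcntl(F_SETFL) path.
	 * inode->i_sem serializes two tasks flipping O_DIRECT on two file
	 * descriptors that point at the same file. */
	static int setfl_direct_sketch(struct inode *inode, struct file *filp,
				       int on)
	{
		int err = 0;

		down(&inode->i_sem);
		if (on && !filp->f_iobuf)
			/* can sleep (it even does a vmalloc), hence the
			 * semaphore around it */
			err = alloc_kiovec(1, &filp->f_iobuf);
		if (!err) {
			if (on)
				filp->f_flags |= O_DIRECT;
			else
				filp->f_flags &= ~O_DIRECT;
		}
		/* note: the kiobuf is never freed here; it stays allocated
		 * until fput() destroys the file structure */
		up(&inode->i_sem);
		return err;
	}

From userspace the toggle is then the usual flag dance, something like
fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_DIRECT).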

As usual, o_direct-3 has to be applied after rawio-5. rawio-5 is
recommended for integration into pre7, as it's not only a performance
optimization but also includes strictly necessary fixes for race
conditions, mm corruption, etc. (again, credit for some of the stability
fixes goes to SCT!)

Andrea


* RAWIO-6 update [was Re: RAWIO-5 and O_DIRECT-3 updates]
From: Andrea Arcangeli @ 2001-04-26  2:58 UTC
  To: Stephen C. Tweedie; +Cc: linux-kernel, kiobuf-io-devel

On Tue, Apr 24, 2001 at 08:56:00AM +0200, Andrea Arcangeli wrote:
> I fixed a new bug pointed out by Andrew and discussed on the kiobuf list
> (thanks Andrew!): lock_kiovec was not handling a failed trylockpage
> correctly and could unlock pages locked by other people. Not a big deal
> in practice, though, as the function is never called anywhere in pre6, and
> I'm wondering whether it makes sense at all to allow the pinned pages to
> be locked down in 2.4 (it is certainly still necessary in 2.2 for all the
> reads from disk). However, I didn't kill the function, just in case there's
> a useful use for it. The new updated rawio patch against vanilla 2.4.4pre6
> is here:
> 
> 	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre6/rawio-5.bz2

My 2.4 rawio-* patches have had, from the start, an annoying bug that can
only be triggered on machines with more than 1G of RAM (I know, strictly
speaking it's some hundred MB less than 1G because of vmalloc and friends,
but you get the idea).

The bug is caused by this debugging check in the highmem bounce buffer
code, which I didn't trap while keeping the slab cache constructed to
improve the allocation of the buffer heads:

	memset(&bh->b_wait, -1, sizeof(bh->b_wait));

The side effect of the bug is that the kernel locks up hard during paging
init at boot (as said above, only if you have more than 1G of RAM). You
can guess why I didn't trap it during regression testing. So this is the
incremental fix against rawio-5:

diff -urN rawio-5/mm/highmem.c rawio-6/mm/highmem.c
--- rawio-5/mm/highmem.c	Thu Dec 14 22:34:14 2000
+++ rawio-6/mm/highmem.c	Thu Apr 26 02:37:07 2001
@@ -207,6 +207,10 @@
 
 	bh_orig->b_end_io(bh_orig, uptodate);
 	__free_page(bh->b_page);
+#ifdef HIGHMEM_DEBUG
+	/* Don't clobber the constructed slab cache */
+	init_waitqueue_head(&bh->b_wait);
+#endif
 	kmem_cache_free(bh_cachep, bh);
 }
 
@@ -260,12 +264,14 @@
 	bh->b_count = bh_orig->b_count;
 	bh->b_rdev = bh_orig->b_rdev;
 	bh->b_state = bh_orig->b_state;
+#ifdef HIGHMEM_DEBUG
 	bh->b_flushtime = jiffies;
 	bh->b_next_free = NULL;
 	bh->b_prev_free = NULL;
 	/* bh->b_this_page */
 	bh->b_reqnext = NULL;
 	bh->b_pprev = NULL;
+#endif
 	/* bh->b_page */
 	if (rw == WRITE) {
 		bh->b_end_io = bounce_end_io_write;
@@ -274,7 +280,9 @@
 		bh->b_end_io = bounce_end_io_read;
 	bh->b_private = (void *)bh_orig;
 	bh->b_rsector = bh_orig->b_rsector;
+#ifdef HIGHMEM_DEBUG
 	memset(&bh->b_wait, -1, sizeof(bh->b_wait));
+#endif
 
 	return bh;
 }


(side note: right now HIGHMEM_DEBUG is enabled by default, and that's OK
of course)
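(To make the "constructed slab cache" interaction explicit, here is an
illustrative sketch -- not the exact fs/buffer.c code -- of the 2.4-era
pattern: the cache constructor leaves each buffer head with a valid b_wait,
and every object freed back to the cache must preserve that state, which is
exactly what the HIGHMEM_DEBUG memset(&bh->b_wait, -1, ...) was violating.)

	/* Sketch of a constructed slab cache for buffer heads (2.4 slab
	 * API).  Objects come out of, and must go back into, the cache
	 * with b_wait initialized. */
	static void bh_ctor_sketch(void *obj, kmem_cache_t *cachep,
				   unsigned long flags)
	{
		struct buffer_head *bh = obj;

		if (flags & SLAB_CTOR_CONSTRUCTOR)
			init_waitqueue_head(&bh->b_wait);
	}

	static kmem_cache_t *bh_cachep;

	void __init bh_cache_init_sketch(void)
	{
		bh_cachep = kmem_cache_create("buffer_head",
					      sizeof(struct buffer_head), 0,
					      SLAB_HWCACHE_ALIGN,
					      bh_ctor_sketch, NULL);
	}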

And here is an updated patch that is now supposed to work on highmem
machines as well:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre7/rawio-6

Andrea
