prepare_write problem for small write when page not uptodate

All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed

* prepare_write problem for small write when page not uptodate
@ 2004-04-29  4:02 Steve French
  2004-04-30 14:08 ` Dave Kleikamp
  0 siblings, 1 reply; 4+ messages in thread
From: Steve French @ 2004-04-29  4:02 UTC (permalink / raw
  To: linux-fsdevel

 I was able to isolate a case in which data at the beginning of the 1st 
page of a file is overwritten with zero when a series of writes begins 
midpage.   The problem is that simple_prepare_write zeroes the beginning 
of a page that has been invalidated even if there is good data that 
could be read into that.  See below (from fs/libfs.c):

319 int simple_prepare_write(struct file *file, struct page *page,
320                         unsigned from, unsigned to)
321 {
322         if (!PageUptodate(page)) {
323                 if (to - from != PAGE_CACHE_SIZE) {
324                         void *kaddr = kmap_atomic(page, KM_USER0);
325                         memset(kaddr, 0, from);
326                         memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);

At one point I had stopped using simple_prepare_write in cifs and like 
NFS, checked for pending writes that would overlap, but it looks like 
that could not be the case with current code since writepage was 
synchronous in the cifs case and in any case had the same problem with 
the partial page zeroing logic as simple_prepare_write..

The sequence that showed this problem was the following:
create file
write 128 bytes at offset zero
flush
close file

open file - which causes cifs to invalidate cache (since we could not be 
certain if file had changed in this particular case)
    filemap_fdatawrite
    filemap_fdatawait
    invalidate_remote_inode (which basically just calls 
invalidate_mapping_pages to do the work -
        which presumably caused the page to be marked not uptodate)
    write 128 bytes at offset 128
    prepare_write (which zeroed 0 to 127 and 256 to PAGE_CACHE_SIZE)
    and then eventually writepage (which will write a 1/2 corrupt 256 
bytes which incorrectly includes zeroes in 0 to 127, to the server)

Seems like invalidate_remote_inode (eventually calling 
invalidate_complete_page) does not always actually free the page 
(perhaps if there was an mmap on it?), in this case just marks it not up 
to date so the first write to the page in this case is not preceeded by 
a readpage as one would expect.   Is it the responsibility of the 
filesystem's prepare_write code to reread in a page on a partial page 
write if page not uptodate?   Or is there a way to call something after 
invalidate_remote_inode to force a reread of the page before the next 
prepare_write?  

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: prepare_write problem for small write when page not uptodate
@ 2004-04-29 21:39 Steve French
  2004-04-29 22:13 ` Steve French
  0 siblings, 1 reply; 4+ messages in thread
From: Steve French @ 2004-04-29 21:39 UTC (permalink / raw
  To: linux-fsdevel

> Seems like invalidate_remote_inode (eventually calling 
> invalidate_complete_page) does not always actually free the page

Switching from invalidate_remote_inode (invalidate_mapping_pages) to
invalidate_inode_pages2 in cifs worked and fixed the problem (partial
page write corrupts a page that is marked not uptodate due to
invalidate_remote_inode).  Still not clear to me when you would use
truncate_inode_pages and when you would use invalidate_inode_pages2.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: prepare_write problem for small write when page not uptodate
  2004-04-29 21:39 prepare_write problem for small write when page not uptodate Steve French
@ 2004-04-29 22:13 ` Steve French
  0 siblings, 0 replies; 4+ messages in thread
From: Steve French @ 2004-04-29 22:13 UTC (permalink / raw
  To: linux-fsdevel

On Thu, 2004-04-29 at 16:39, Steve French wrote:
> > Seems like invalidate_remote_inode (eventually calling 
> > invalidate_complete_page) does not always actually free the page
> 
> Switching from invalidate_remote_inode (invalidate_mapping_pages) to
> invalidate_inode_pages2 in cifs worked and fixed the problem (partial
> page write corrupts a page that is marked not uptodate due to
> invalidate_remote_inode).  Still not clear to me when you would use
> truncate_inode_pages and when you would use invalidate_inode_pages2.

False alarm - none of the three worked.

Looks like I am going to have to fall back to the hard way - in
prepare_write if page is not up to date - read it from the server since
I can't seem to force the mm code to do a readpage first by tossing the
page via any of the three mechanisms.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: prepare_write problem for small write when page not uptodate
  2004-04-29  4:02 Steve French
@ 2004-04-30 14:08 ` Dave Kleikamp
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Kleikamp @ 2004-04-30 14:08 UTC (permalink / raw
  To: Steve French; +Cc: fsdevel

On Wed, 2004-04-28 at 23:02, Steve French wrote:
>  I was able to isolate a case in which data at the beginning of the 1st 
> page of a file is overwritten with zero when a series of writes begins 
> midpage.   The problem is that simple_prepare_write zeroes the beginning 
> of a page that has been invalidated even if there is good data that 
> could be read into that.  See below (from fs/libfs.c):
> 
> 319 int simple_prepare_write(struct file *file, struct page *page,
> 320                         unsigned from, unsigned to)
> 321 {
> 322         if (!PageUptodate(page)) {
> 323                 if (to - from != PAGE_CACHE_SIZE) {
> 324                         void *kaddr = kmap_atomic(page, KM_USER0);
> 325                         memset(kaddr, 0, from);
> 326                         memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
> 
> At one point I had stopped using simple_prepare_write in cifs and like 
> NFS, checked for pending writes that would overlap, but it looks like 
> that could not be the case with current code since writepage was 
> synchronous in the cifs case and in any case had the same problem with 
> the partial page zeroing logic as simple_prepare_write.

Don't use simple_prepare_write!  The simple_* routines are for file
systems like ramfs that don't have backing store.  The premise with
these functions is that if it's not cached, it's zero.

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-04-30 14:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-04-29 21:39 prepare_write problem for small write when page not uptodate Steve French
2004-04-29 22:13 ` Steve French
  -- strict thread matches above, loose matches on Subject: below --
2004-04-29  4:02 Steve French
2004-04-30 14:08 ` Dave Kleikamp

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.