* prepare_write problem for small write when page not uptodate
@ 2004-04-29 4:02 Steve French
2004-04-30 14:08 ` Dave Kleikamp
0 siblings, 1 reply; 4+ messages in thread
From: Steve French @ 2004-04-29 4:02 UTC (permalink / raw
To: linux-fsdevel
I was able to isolate a case in which data at the beginning of the 1st
page of a file is overwritten with zero when a series of writes begins
midpage. The problem is that simple_prepare_write zeroes the beginning
of a page that has been invalidated even if there is good data that
could be read into that. See below (from fs/libfs.c):
319 int simple_prepare_write(struct file *file, struct page *page,
320 unsigned from, unsigned to)
321 {
322 if (!PageUptodate(page)) {
323 if (to - from != PAGE_CACHE_SIZE) {
324 void *kaddr = kmap_atomic(page, KM_USER0);
325 memset(kaddr, 0, from);
326 memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
At one point I had stopped using simple_prepare_write in cifs and like
NFS, checked for pending writes that would overlap, but it looks like
that could not be the case with current code since writepage was
synchronous in the cifs case and in any case had the same problem with
the partial page zeroing logic as simple_prepare_write..
The sequence that showed this problem was the following:
create file
write 128 bytes at offset zero
flush
close file
open file - which causes cifs to invalidate cache (since we could not be
certain if file had changed in this particular case)
filemap_fdatawrite
filemap_fdatawait
invalidate_remote_inode (which basically just calls
invalidate_mapping_pages to do the work -
which presumably caused the page to be marked not uptodate)
write 128 bytes at offset 128
prepare_write (which zeroed 0 to 127 and 256 to PAGE_CACHE_SIZE)
and then eventually writepage (which will write a 1/2 corrupt 256
bytes which incorrectly includes zeroes in 0 to 127, to the server)
Seems like invalidate_remote_inode (eventually calling
invalidate_complete_page) does not always actually free the page
(perhaps if there was an mmap on it?), in this case just marks it not up
to date so the first write to the page in this case is not preceeded by
a readpage as one would expect. Is it the responsibility of the
filesystem's prepare_write code to reread in a page on a partial page
write if page not uptodate? Or is there a way to call something after
invalidate_remote_inode to force a reread of the page before the next
prepare_write?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: prepare_write problem for small write when page not uptodate
@ 2004-04-29 21:39 Steve French
2004-04-29 22:13 ` Steve French
0 siblings, 1 reply; 4+ messages in thread
From: Steve French @ 2004-04-29 21:39 UTC (permalink / raw
To: linux-fsdevel
> Seems like invalidate_remote_inode (eventually calling
> invalidate_complete_page) does not always actually free the page
Switching from invalidate_remote_inode (invalidate_mapping_pages) to
invalidate_inode_pages2 in cifs worked and fixed the problem (partial
page write corrupts a page that is marked not uptodate due to
invalidate_remote_inode). Still not clear to me when you would use
truncate_inode_pages and when you would use invalidate_inode_pages2.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: prepare_write problem for small write when page not uptodate
2004-04-29 21:39 prepare_write problem for small write when page not uptodate Steve French
@ 2004-04-29 22:13 ` Steve French
0 siblings, 0 replies; 4+ messages in thread
From: Steve French @ 2004-04-29 22:13 UTC (permalink / raw
To: linux-fsdevel
On Thu, 2004-04-29 at 16:39, Steve French wrote:
> > Seems like invalidate_remote_inode (eventually calling
> > invalidate_complete_page) does not always actually free the page
>
> Switching from invalidate_remote_inode (invalidate_mapping_pages) to
> invalidate_inode_pages2 in cifs worked and fixed the problem (partial
> page write corrupts a page that is marked not uptodate due to
> invalidate_remote_inode). Still not clear to me when you would use
> truncate_inode_pages and when you would use invalidate_inode_pages2.
False alarm - none of the three worked.
Looks like I am going to have to fall back to the hard way - in
prepare_write if page is not up to date - read it from the server since
I can't seem to force the mm code to do a readpage first by tossing the
page via any of the three mechanisms.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: prepare_write problem for small write when page not uptodate
2004-04-29 4:02 Steve French
@ 2004-04-30 14:08 ` Dave Kleikamp
0 siblings, 0 replies; 4+ messages in thread
From: Dave Kleikamp @ 2004-04-30 14:08 UTC (permalink / raw
To: Steve French; +Cc: fsdevel
On Wed, 2004-04-28 at 23:02, Steve French wrote:
> I was able to isolate a case in which data at the beginning of the 1st
> page of a file is overwritten with zero when a series of writes begins
> midpage. The problem is that simple_prepare_write zeroes the beginning
> of a page that has been invalidated even if there is good data that
> could be read into that. See below (from fs/libfs.c):
>
> 319 int simple_prepare_write(struct file *file, struct page *page,
> 320 unsigned from, unsigned to)
> 321 {
> 322 if (!PageUptodate(page)) {
> 323 if (to - from != PAGE_CACHE_SIZE) {
> 324 void *kaddr = kmap_atomic(page, KM_USER0);
> 325 memset(kaddr, 0, from);
> 326 memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
>
> At one point I had stopped using simple_prepare_write in cifs and like
> NFS, checked for pending writes that would overlap, but it looks like
> that could not be the case with current code since writepage was
> synchronous in the cifs case and in any case had the same problem with
> the partial page zeroing logic as simple_prepare_write.
Don't use simple_prepare_write! The simple_* routines are for file
systems like ramfs that don't have backing store. The premise with
these functions is that if it's not cached, it's zero.
Shaggy
--
David Kleikamp
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2004-04-30 14:08 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-04-29 21:39 prepare_write problem for small write when page not uptodate Steve French
2004-04-29 22:13 ` Steve French
-- strict thread matches above, loose matches on Subject: below --
2004-04-29 4:02 Steve French
2004-04-30 14:08 ` Dave Kleikamp
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.