* xlog_write: reservation ran out. Need to up reservation
[not found] <159192779.3859815.1408461799560.JavaMail.zimbra@klaube.net>
@ 2014-08-19 15:34 ` Thomas Klaube
2014-08-19 22:55 ` Dave Chinner
0 siblings, 1 reply; 3+ messages in thread
From: Thomas Klaube @ 2014-08-19 15:34 UTC
To: xfs
Hi all,
I am currently testing/benchmarking xfs on top of a bcache. When I run a heavy
IO workload (fio with 64 threads, read/write) on the device for ~30-45min I get
[ 9092.978268] XFS (bcache1): xlog_write: reservation summary:
[ 9092.978268] trans type = (null) (42)
[ 9092.978268] unit res = 18730384 bytes
[ 9092.978268] current res = -1640 bytes
[ 9092.978268] total reg = 512 bytes (o/flow = 1163749592 bytes)
[ 9092.978268] ophdrs = 655304 (ophdr space = 7863648 bytes)
[ 9092.978268] ophdr + reg = 1171613752 bytes
[ 9092.978268] num regions = 2
[ 9092.978268]
[ 9092.978272] XFS (bcache1): region[0]: LR header - 512 bytes
[ 9092.978273] XFS (bcache1): region[1]: commit - 0 bytes
[ 9092.978274] XFS (bcache1): xlog_write: reservation ran out. Need to up reservation
[ 9092.978303] XFS (bcache1): xfs_do_force_shutdown(0x2) called from line 2036 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa04433c8
[ 9092.979189] XFS (bcache1): Log I/O Error Detected. Shutting down filesystem
[ 9092.979210] XFS (bcache1): Please umount the filesystem and rectify the problem(s)
[ 9092.979238] XFS (bcache1): xfs_do_force_shutdown(0x2) called from line 1497 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa0443b57
[ 9093.183869] XFS (bcache1): xfs_log_force: error 5 returned.
[ 9093.489944] XFS (bcache1): xfs_log_force: error 5 returned.
Kernel is 3.16.1 but this also happens with Ubuntu 3.13.0.34.
With the bcache the fio puts ~30k IOps on the filesystem.
xfs_info:
meta-data=/dev/bcache1 isize=256 agcount=8, agsize=268435455 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=1949957886, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
umount/mount recovers the fs and the fs seems ok.
I can reproduce this behavior. Is there anything I could try to debug
this?
Regards
Thomas
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xlog_write: reservation ran out. Need to up reservation
2014-08-19 15:34 ` xlog_write: reservation ran out. Need to up reservation Thomas Klaube
@ 2014-08-19 22:55 ` Dave Chinner
2014-08-21 7:24 ` Thomas Klaube
0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2014-08-19 22:55 UTC
To: Thomas Klaube; +Cc: xfs
On Tue, Aug 19, 2014 at 05:34:30PM +0200, Thomas Klaube wrote:
> Hi all,
>
> I am currently testing/benchmarking xfs on top of a bcache. When I run a heavy
> IO workload (fio with 64 threads, read/write) on the device for ~30-45min I get
Can you post the fio job configuration?
> [ 9092.978268] XFS (bcache1): xlog_write: reservation summary:
> [ 9092.978268] trans type = (null) (42)
> [ 9092.978268] unit res = 18730384 bytes
> [ 9092.978268] current res = -1640 bytes
> [ 9092.978268] total reg = 512 bytes (o/flow = 1163749592 bytes)
> [ 9092.978268] ophdrs = 655304 (ophdr space = 7863648 bytes)
> [ 9092.978268] ophdr + reg = 1171613752 bytes
> [ 9092.978268] num regions = 2
Oh, my:
> [ 9092.978268] ophdr + reg = 1171613752 bytes
That's 1,171,613,752 bytes, or 1.1GB of journal data in that
checkpoint. It's more than half the size of the journal, so it's
violated fundamental constraints (i.e. no checkpoint should be
larger than half the log).
We should be committing the checkpoint once the queued metadata is
beyond 12.5% of log space, or about 250MB in this case. The question
is how did that get delayed for so long that we overran the push
threshold by a factor of 3.5?
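A quick back-of-the-envelope check of those numbers against the
xfs_info below (a sketch; the 12.5% CIL push threshold is the
assumption here):

```python
# Sketch: check the checkpoint size against the log constraints.
# Log geometry comes from xfs_info (blocks * bsize); the checkpoint
# size is the "ophdr + reg" figure from the shutdown report. The
# 12.5% CIL push threshold is an assumed rule of thumb, not read
# from the filesystem.
log_bytes = 521728 * 4096          # internal log: 521728 blocks of 4096 bytes
checkpoint = 1171613752            # "ophdr + reg" from the xlog_write report
half_log = log_bytes // 2          # no checkpoint may exceed half the log
push_threshold = log_bytes // 8    # CIL should push at ~12.5% of log space

print(checkpoint > half_log)       # half-log constraint violated?
print((checkpoint - push_threshold) / push_threshold)  # overrun factor
```

The checkpoint indeed exceeds half the log, and overruns the push
threshold by roughly 3.4x.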
Hmmmm - I wonder if bcache is causing some kind of kworker or
workqueue starvation? I really need to see that fio job config and
find out a whole lot more about the hardware and storage config you
are running:
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> [ 9092.978268]
> [ 9092.978272] XFS (bcache1): region[0]: LR header - 512 bytes
> [ 9092.978273] XFS (bcache1): region[1]: commit - 0 bytes
> [ 9092.978274] XFS (bcache1): xlog_write: reservation ran out. Need to up reservation
> [ 9092.978303] XFS (bcache1): xfs_do_force_shutdown(0x2) called from line 2036 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa04433c8
> [ 9092.979189] XFS (bcache1): Log I/O Error Detected. Shutting down filesystem
> [ 9092.979210] XFS (bcache1): Please umount the filesystem and rectify the problem(s)
> [ 9092.979238] XFS (bcache1): xfs_do_force_shutdown(0x2) called from line 1497 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa0443b57
> [ 9093.183869] XFS (bcache1): xfs_log_force: error 5 returned.
> [ 9093.489944] XFS (bcache1): xfs_log_force: error 5 returned.
>
> Kernel is 3.16.1 but this also happens with Ubuntu 3.13.0.34.
> With the bcache the fio puts ~30k IOps on the filesystem.
Which is not very much. I do that sort of thing all the time.
> xfs_info:
> meta-data=/dev/bcache1 isize=256 agcount=8, agsize=268435455 blks
> = sectsz=512 attr=2
> data = bsize=4096 blocks=1949957886, imaxpct=5
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal bsize=4096 blocks=521728, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
> umount/mount recovers the fs and the fs seems ok.
>
> I can reproduce this behavior. Is there anything I could try to debug
> this?
Run the workload directly on the SSD rather than with bcache. Use
mkfs parameters to give you 8 ags and the same size log, and see
if you get the same problem.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: xlog_write: reservation ran out. Need to up reservation
2014-08-19 22:55 ` Dave Chinner
@ 2014-08-21 7:24 ` Thomas Klaube
0 siblings, 0 replies; 3+ messages in thread
From: Thomas Klaube @ 2014-08-21 7:24 UTC
To: xfs
----- Original Message -----
> From: "Dave Chinner" <david@fromorbit.com>
> To: "Thomas Klaube" <thomas@klaube.net>
> CC: xfs@oss.sgi.com
> Sent: Wednesday, 20 August 2014 00:55:07
> Subject: Re: xlog_write: reservation ran out. Need to up reservation
Hi,
> Can you post the fio job configuration?
First I run this job for 600 seconds:
wtk@ubuntu ~ $ cat write.fio
[rnd]
rw=randwrite
ramp_time=30
runtime=600
time_based
gtod_reduce=1
size=100g
refill_buffers=1
directory=.
iodepth=64
direct=1
blocksize=16k
numjobs=64
nrfiles=1
group_reporting
ioengine=libaio
loops=1
Then I run this job for 2 hours:
wtk@ubuntu ~ $ cat random.fio
[rnd]
rw=randrw
ramp_time=30
runtime=7200
time_based
rwmixread=30
size=100g
refill_buffers=1
directory=.
iodepth=64
direct=1
blocksize=4k
numjobs=64
group_reporting
ioengine=libaio
loops=1
I run this workload on 2 devices in parallel. One is the bcache device (with xfs), the other is
a non-cached device. The random.fio job causes the problem on the bcache device after ~30-75 mins.
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
I have sent a mail with all collected data to Dave.
> Run the workload directly on the SSD rather than with bcache. Use
> mkfs parameters to give you 8 ags and the same size log, and see
> if you get the same problem.
I created an xfs directly on the SSD:
mkfs.xfs -f -d agcount=8 -l size=521728b /dev/sdc1
Then I started the fio jobs as described above for 10 hours. I could not reproduce the
problem. I will send a mail to the bcache mailing list as well...
Thanx and Regards
Thomas