From: Felix Rubio <felix@kngnt.org>
To: fio@vger.kernel.org
Subject: Re: help understanding the output of fio
Date: Sat, 06 Apr 2024 09:03:38 +0200 [thread overview]
Message-ID: <1bdba632a389d002a443eeb233b19573@kngnt.org> (raw)
In-Reply-To: <CAFCYAsdXh08yEP5wM1BBQ8dLLVyKSksFFG3eBngbBdWNk=nRhQ@mail.gmail.com>
Hi Jeff,
I have run the write test using libaio, with the primary and
secondary caches set to "none":
fio --name=write_throughput_rpool --numjobs=1 --time_based --runtime=60s
--ramp_time=2s --iodepth=8 --ioengine=libaio --verify=0
--group_reporting=1 --bs=16k --rw=write --size=1G
--filename=/var/cache/test
These are the results:
fio-3.33
Starting 1 process
write_throughput_rpool: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [W(1)][51.6%][w=256KiB/s][w=16 IOPS][eta 00m:59s]
write_throughput_rpool: (groupid=0, jobs=1): err= 0: pid=1330097: Sat Apr 6 08:56:27 2024
  write: IOPS=550, BW=8808KiB/s (9019kB/s)(519MiB/60315msec); 0 zone resets
    slat (usec): min=185, max=2865.8k, avg=1810.21, stdev=32343.38
    clat (usec): min=19, max=2869.5k, avg=12720.49, stdev=85213.20
     lat (msec): min=2, max=2870, avg=14.53, stdev=91.03
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    5],
     | 30.00th=[    6], 40.00th=[    6], 50.00th=[    7], 60.00th=[    8],
     | 70.00th=[    9], 80.00th=[   10], 90.00th=[   13], 95.00th=[   16],
     | 99.00th=[   34], 99.50th=[  313], 99.90th=[ 1217], 99.95th=[ 1687],
     | 99.99th=[ 2869]
   bw (  KiB/s): min=  256, max=42068, per=100.00%, avg=11304.49, stdev=9239.48, samples=94
   iops        : min=   16, max= 2629, avg=706.47, stdev=577.45, samples=94
  lat (usec)   : 20=0.01%
  lat (msec)   : 2=0.07%, 4=14.03%, 10=68.18%, 20=15.56%, 50=1.40%
  lat (msec)   : 100=0.23%, 250=0.04%, 500=0.06%, 750=0.13%, 1000=0.06%
  lat (msec)   : 2000=0.21%, >=2000=0.04%
  cpu          : usr=0.46%, sys=2.57%, ctx=33743, majf=0, minf=38
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,33194,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
  WRITE: bw=8808KiB/s (9019kB/s), 8808KiB/s-8808KiB/s (9019kB/s-9019kB/s), io=519MiB (544MB), run=60315-60315msec
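As an aside, fio prints bandwidth in binary units (KiB/s) with the decimal
equivalent (kB/s) in parentheses; the figures above can be sanity-checked
with a couple of lines of Python (just a sketch of the conversion):

```python
# fio reports bandwidth as KiB/s (binary) with kB/s (decimal) in
# parentheses; 1 KiB/s = 1.024 kB/s.
def kib_to_kb(kib_per_s: float) -> float:
    """Convert binary KiB/s to decimal kB/s."""
    return kib_per_s * 1024 / 1000

# Figures from the two write runs in this message:
print(round(kib_to_kb(8808)))  # single-job run  -> 9019
print(round(kib_to_kb(8712)))  # eight-job run   -> 8921
```

Both match the parenthesized values fio printed, so the two unit systems
are not hiding any discrepancy between the runs.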
Should I use multiple jobs? Running
fio --name=write_throughput_rpool --numjobs=8 --time_based --runtime=60s
--ramp_time=2s --iodepth=8 --ioengine=libaio --verify=0
--group_reporting=1 --bs=16k --rw=write --size=128M
--filename=/var/cache/test
I get the following results:
write_throughput_rpool: (g=0): rw=write, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=8
...
fio-3.33
Starting 8 processes
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
write_throughput_rpool: Laying out IO file (1 file / 128MiB)
Jobs: 8 (f=2): [f(4),W(1),f(2),W(1)][35.6%][w=2818KiB/s][w=176 IOPS][eta 01m:54s]
write_throughput_rpool: (groupid=0, jobs=8): err= 0: pid=1338427: Sat Apr 6 09:02:00 2024
  write: IOPS=543, BW=8712KiB/s (8921kB/s)(511MiB/60023msec); 0 zone resets
    slat (usec): min=733, max=2260.7k, avg=14703.28, stdev=63546.14
    clat (usec): min=4, max=2385.0k, avg=102868.92, stdev=164897.88
     lat (usec): min=774, max=2405.7k, avg=117555.81, stdev=175738.91
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   29], 10.00th=[   36], 20.00th=[   44],
     | 30.00th=[   50], 40.00th=[   55], 50.00th=[   62], 60.00th=[   70],
     | 70.00th=[   81], 80.00th=[  103], 90.00th=[  133], 95.00th=[  359],
     | 99.00th=[ 1045], 99.50th=[ 1167], 99.90th=[ 1385], 99.95th=[ 1754],
     | 99.99th=[ 2366]
   bw (  KiB/s): min=  384, max=33551, per=100.00%, avg=9727.65, stdev=850.37, samples=853
   iops        : min=   24, max= 2095, avg=607.90, stdev=53.14, samples=853
  lat (usec)   : 10=0.01%, 20=0.02%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.02%, 10=0.10%, 20=1.01%, 50=30.81%
  lat (msec)   : 100=47.01%, 250=15.29%, 500=2.31%, 750=2.00%, 1000=0.53%
  lat (msec)   : 2000=1.04%, >=2000=0.02%
  cpu          : usr=0.11%, sys=0.45%, ctx=33751, majf=0, minf=295
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,32620,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
  WRITE: bw=8712KiB/s (8921kB/s), 8712KiB/s-8712KiB/s (8921kB/s-8921kB/s), io=511MiB (535MB), run=60023-60023msec
So... I am getting more or less the same bandwidth in both cases.
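For what it's worth, the two runs look consistent with a fixed per-IO
service pipeline rather than a queueing limit: by Little's Law, sustained
IOPS is roughly (outstanding IOs) / (average completion latency). A quick
sketch, using the numbers from the fio output above:

```python
# Little's Law: concurrency = throughput x latency, so
# sustained IOPS ~= outstanding IOs / average latency (in seconds).
def predicted_iops(outstanding: int, avg_lat_s: float) -> float:
    """IOPS a fully occupied queue of `outstanding` IOs sustains."""
    return outstanding / avg_lat_s

# Run 1: numjobs=1, iodepth=8 -> 8 in flight, avg lat ~14.53 ms
# (fio reported 550 IOPS)
run1 = predicted_iops(1 * 8, 14.53e-3)
# Run 2: numjobs=8, iodepth=8 -> 64 in flight, avg lat ~117.56 ms
# (fio reported 543 IOPS)
run2 = predicted_iops(8 * 8, 117.56e-3)
print(round(run1), round(run2))  # -> 551 544
```

The eight-fold increase in concurrency was absorbed almost entirely by an
eight-fold increase in latency, which is why the bandwidth did not move:
the bottleneck is in the write path itself (ZFS sync/txg handling or the
device), not in the number of IOs queued.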
---
Felix Rubio
"Don't believe what you're told. Double check."
On 2024-04-05 20:39, Jeff Johnson wrote:
> Felix,
>
> You're using the psync ioengine with a queue depth of 1. This will
> slow zfs down a lot. Use libaio and a queue depth of 8-16.
>
> To temporarily disable ZFS arc caching on your volume, say your
> benchmark volume is mypool/myvol mounted at /myvol you can do:
> zfs get all mypool/myvol > ~/myvol-settings-save.txt
> zfs set primarycache=none mypool/myvol
> zfs set secondarycache=none mypool/myvol
>
> then run your fio tests. After you are done benchmarking your ZFS
> refer to the settings saved in ~/myvol-settings-save.txt to set
> primarycache and secondarycache back to original.
>
> --Jeff
>
>
> On Thu, Apr 4, 2024 at 10:41 PM Felix Rubio <felix@kngnt.org> wrote:
>>
>> Hi Jeff,
>>
>> I have done the test you suggested, and this is what came up:
>>
>> read_throughput_rpool: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
>> fio-3.33
>> Starting 1 process
>> read_throughput_rpool: Laying out IO file (1 file / 1024MiB)
>> Jobs: 1 (f=1): [R(1)][100.0%][r=904MiB/s][r=903 IOPS][eta 00m:00s]
>> read_throughput_rpool: (groupid=0, jobs=1): err= 0: pid=3284411: Fri Apr 5 07:31:18 2024
>>   read: IOPS=1092, BW=1092MiB/s (1145MB/s)(64.0GiB/60001msec)
>>     clat (usec): min=733, max=6031, avg=903.71, stdev=166.68
>>      lat (usec): min=734, max=6032, avg=904.88, stdev=166.84
>>     clat percentiles (usec):
>>      |  1.00th=[  742],  5.00th=[  750], 10.00th=[  750], 20.00th=[  758],
>>      | 30.00th=[  766], 40.00th=[  791], 50.00th=[  824], 60.00th=[  873],
>>      | 70.00th=[ 1037], 80.00th=[ 1106], 90.00th=[ 1139], 95.00th=[ 1188],
>>      | 99.00th=[ 1254], 99.50th=[ 1287], 99.90th=[ 1401], 99.95th=[ 1500],
>>      | 99.99th=[ 2474]
>>    bw (  MiB/s): min=  871, max= 1270, per=100.00%, avg=1092.97, stdev=166.88, samples=120
>>    iops        : min=  871, max= 1270, avg=1092.78, stdev=166.95, samples=120
>>   lat (usec)   : 750=6.17%, 1000=60.30%
>>   lat (msec)   : 2=33.51%, 4=0.02%, 10=0.01%
>>   cpu          : usr=2.04%, sys=97.52%, ctx=2457, majf=0, minf=37
>>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      issued rwts: total=65543,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>>      latency   : target=0, window=0, percentile=100.00%, depth=1
>>
>> Run status group 0 (all jobs):
>>   READ: bw=1092MiB/s (1145MB/s), 1092MiB/s-1092MiB/s (1145MB/s-1145MB/s), io=64.0GiB (68.7GB), run=60001-60001msec
>>
>> I take it the file was being cached (since 1145 MB/s from a single
>> SSD is hard to believe)? For the write test, these are the results:
>>
>> write_throughput_rpool: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
>> fio-3.33
>> Starting 1 process
>> write_throughput_rpool: Laying out IO file (1 file / 1024MiB)
>> Jobs: 1 (f=1): [W(1)][100.0%][w=4100KiB/s][w=4 IOPS][eta 00m:00s]
>> write_throughput_rpool: (groupid=0, jobs=1): err= 0: pid=3292023: Fri Apr 5 07:36:11 2024
>>   write: IOPS=5, BW=5394KiB/s (5524kB/s)(317MiB/60174msec); 0 zone resets
>>     clat (msec): min=16, max=1063, avg=189.59, stdev=137.19
>>      lat (msec): min=16, max=1063, avg=189.81, stdev=137.19
>>     clat percentiles (msec):
>>      |  1.00th=[   18],  5.00th=[   22], 10.00th=[   41], 20.00th=[   62],
>>      | 30.00th=[  129], 40.00th=[  144], 50.00th=[  163], 60.00th=[  190],
>>      | 70.00th=[  228], 80.00th=[  268], 90.00th=[  363], 95.00th=[  489],
>>      | 99.00th=[  567], 99.50th=[  592], 99.90th=[ 1062], 99.95th=[ 1062],
>>      | 99.99th=[ 1062]
>>    bw (  KiB/s): min= 2048, max=41042, per=100.00%, avg=5534.14, stdev=5269.61, samples=117
>>    iops        : min=    2, max=   40, avg= 5.40, stdev= 5.14, samples=117
>>   lat (msec)   : 20=3.47%, 50=11.99%, 100=8.83%, 250=51.42%, 500=20.19%
>>   lat (msec)   : 750=3.79%, 2000=0.32%
>>   cpu          : usr=0.11%, sys=0.52%, ctx=2673, majf=0, minf=38
>>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      issued rwts: total=0,317,0,0 short=0,0,0,0 dropped=0,0,0,0
>>      latency   : target=0, window=0, percentile=100.00%, depth=1
>>
>> Run status group 0 (all jobs):
>>   WRITE: bw=5394KiB/s (5524kB/s), 5394KiB/s-5394KiB/s (5524kB/s-5524kB/s), io=317MiB (332MB), run=60174-60174msec
>>
>> Still at 6 MB/s. Should I move this discussion to the openzfs team,
>> or is there anything else still worth testing?
>>
>> Thank you very much!
>>
>> ---
>> Felix Rubio
>> "Don't believe what you're told. Double check."
>>
>> On 2024-04-04 21:07, Jeff Johnson wrote:
>> > Felix,
>> >
>> > Based on your previous emails about the drive, it would appear that
>> > the hardware (SSD, cables, port) is fine and the drive performs.
>> >
>> > Go back and run your original ZFS test on your mounted ZFS volume
>> > directory and remove the "--direct=1" from your command as ZFS does
>> > not yet support direct_io and disabling buffered io to the zfs
>> > directory will have very negative impacts. This is a ZFS thing, not
>> > your kernel or hardware.
>> >
>> > --Jeff
>> >
>> > On Thu, Apr 4, 2024 at 12:00 PM Felix Rubio <felix@kngnt.org> wrote:
>> >>
>> >> hey Jeff,
>> >>
>> >> Good catch! I have run the following command:
>> >>
>> >> fio --name=seqread --numjobs=1 --time_based --runtime=60s
>> >> --ramp_time=2s
>> >> --iodepth=8 --ioengine=libaio --direct=1 --verify=0
>> >> --group_reporting=1
>> >> --bs=1M --rw=read --size=1G --filename=/dev/sda
>> >>
>> >> (/dev/sda and /dev/sdd are the drives I have in one of the pools),
>> >> and this is what I get:
>> >>
>> >> seqread: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=8
>> >> fio-3.33
>> >> Starting 1 process
>> >> Jobs: 1 (f=1): [R(1)][100.0%][r=388MiB/s][r=388 IOPS][eta 00m:00s]
>> >> seqread: (groupid=0, jobs=1): err= 0: pid=2368687: Thu Apr 4 20:56:06 2024
>> >>   read: IOPS=382, BW=383MiB/s (401MB/s)(22.4GiB/60020msec)
>> >>     slat (usec): min=17, max=3098, avg=68.94, stdev=46.04
>> >>     clat (msec): min=14, max=367, avg=20.84, stdev= 6.61
>> >>      lat (msec): min=15, max=367, avg=20.91, stdev= 6.61
>> >>     clat percentiles (msec):
>> >>      |  1.00th=[   21],  5.00th=[   21], 10.00th=[   21], 20.00th=[   21],
>> >>      | 30.00th=[   21], 40.00th=[   21], 50.00th=[   21], 60.00th=[   21],
>> >>      | 70.00th=[   21], 80.00th=[   21], 90.00th=[   21], 95.00th=[   21],
>> >>      | 99.00th=[   25], 99.50th=[   31], 99.90th=[   48], 99.95th=[   50],
>> >>      | 99.99th=[  368]
>> >>    bw (  KiB/s): min=215040, max=399360, per=100.00%, avg=392047.06, stdev=19902.89, samples=120
>> >>    iops        : min=  210, max=  390, avg=382.55, stdev=19.43, samples=120
>> >>   lat (msec)   : 20=0.19%, 50=99.80%, 100=0.01%, 500=0.03%
>> >>   cpu          : usr=0.39%, sys=1.93%, ctx=45947, majf=0, minf=37
>> >>   IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>> >>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> >>      complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> >>      issued rwts: total=22954,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>> >>      latency   : target=0, window=0, percentile=100.00%, depth=8
>> >>
>> >> Run status group 0 (all jobs):
>> >>    READ: bw=383MiB/s (401MB/s), 383MiB/s-383MiB/s (401MB/s-401MB/s), io=22.4GiB (24.1GB), run=60020-60020msec
>> >>
>> >> Disk stats (read/write):
>> >>   sda: ios=23817/315, merge=0/0, ticks=549704/132687, in_queue=683613, util=99.93%
>> >>
>> >> 400 MB/s!!! That is a number I have never seen before. I take it
>> >> this means I need to go back to the openzfs chat/forum?
>> >>
>> >> ---
>> >> Felix Rubio
>> >> "Don't believe what you're told. Double check."
>> >>
>> >
>> >
>> > --
>> > ------------------------------
>> > Jeff Johnson
>> > Co-Founder
>> > Aeon Computing
>> >
>> > jeff.johnson@aeoncomputing.com
>> > www.aeoncomputing.com
>> > t: 858-412-3810 x1001 f: 858-412-3845
>> > m: 619-204-9061
>> >
>> > 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>> >
>> > High-Performance Computing / Lustre Filesystems / Scale-out Storage
>>
Thread overview: 13+ messages
2024-04-03 5:35 help understanding the output of fio Felix Rubio
2024-04-03 15:45 ` Patrick Goetz
2024-04-03 19:21 ` Felix Rubio
2024-04-04 4:28 ` Damien Le Moal
2024-04-04 16:18 ` Patrick Goetz
2024-04-04 23:37 ` Damien Le Moal
[not found] ` <4d12ea060363d934d84201f834b21a38@kngnt.org>
[not found] ` <e9efa7bb-f2b3-47f0-aa03-5191f85562e8@kernel.org>
2024-04-04 18:29 ` Felix Rubio
2024-04-04 18:43 ` Jeff Johnson
2024-04-04 18:59 ` Felix Rubio
2024-04-04 19:07 ` Jeff Johnson
2024-04-05 5:41 ` Felix Rubio
2024-04-05 18:39 ` Jeff Johnson
2024-04-06 7:03 ` Felix Rubio [this message]