From: Vincent Fu <vincentfu@gmail.com>
To: Ankit Kumar <ankit.kumar@samsung.com>, axboe@kernel.dk
Cc: fio@vger.kernel.org, joshi.k@samsung.com
Subject: Re: [PATCH 1/3] trim: add support for multiple ranges
Date: Wed, 14 Feb 2024 12:51:50 -0500 [thread overview]
Message-ID: <86754e2a-8083-44ea-8f02-e5966628763c@gmail.com> (raw)
In-Reply-To: <20240214120624.135416-2-ankit.kumar@samsung.com>
On 2/14/24 07:06, Ankit Kumar wrote:
> NVMe specification allow multiple ranges for the dataset management
> commands. Currently the block ioctl only allows a single range for
> trim, however multiple ranges can be specified using nvme character
> device.
>
> Add an option num_range to send multiple range per trim request, which
> only works if the data direction is solely trim i.e. trim or randtrim.
> Add FIO_MULTI_RANGE_TRIM as the ioengine flag, to restrict the usage of
> this new option.
> For multi range trim request this modifies the way IO buffers are used.
> The buffer length will depend on number of trim ranges and the actual
> buffer will contains start and length of each range entry.
>
> Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
> ---
> HOWTO.rst | 7 +++++
> backend.c | 20 ++++++++++--
> cconv.c | 2 ++
> fio.1 | 6 ++++
> fio.h | 18 +++++++++++
> init.c | 13 ++++++++
> io_u.c | 81 ++++++++++++++++++++++++++++++++++++++----------
> io_u.h | 4 +++
> ioengines.h | 2 ++
> options.c | 11 +++++++
> thread_options.h | 3 ++
> 11 files changed, 148 insertions(+), 19 deletions(-)
>
> diff --git a/HOWTO.rst b/HOWTO.rst
> index 5bc1713c..c798613b 100644
> --- a/HOWTO.rst
> +++ b/HOWTO.rst
> @@ -2534,6 +2534,13 @@ with the caveat that when used on the command line, they must come after the
> Specifies logical block application tag mask value, if namespace is
> formatted to use end to end protection information. Default: 0xffff.
>
> +.. option:: num_range=int : [io_uring_cmd]
> +
> + For trim command this will be the number of ranges to trim per I/O
> + request. The number of logical blocks per range is determined by the
> + :option:`bs` option which should be a multiple of logical block size.
> + This cannot be used with read or write. Default: 1.
> +
I would add a note here and in fio.1 that log_offset is not able to log
all the offsets for trim operations with num_range > 1
> .. option:: cpuload=int : [cpuio]
>
> Attempt to use the specified percentage of CPU cycles. This is a mandatory
> diff --git a/backend.c b/backend.c
> index 1fab467a..2f2221bf 100644
> --- a/backend.c
> +++ b/backend.c
> @@ -1333,7 +1333,7 @@ static int init_io_u(struct thread_data *td)
> int init_io_u_buffers(struct thread_data *td)
> {
> struct io_u *io_u;
> - unsigned long long max_bs, min_write;
> + unsigned long long max_bs, min_write, trim_bs = 0;
> int i, max_units;
> int data_xfer = 1;
> char *p;
> @@ -1344,7 +1344,18 @@ int init_io_u_buffers(struct thread_data *td)
> td->orig_buffer_size = (unsigned long long) max_bs
> * (unsigned long long) max_units;
>
> - if (td_ioengine_flagged(td, FIO_NOIO) || !(td_read(td) || td_write(td)))
> + if (td_trim(td) && td->o.num_range > 1) {
> + trim_bs = td->o.num_range * sizeof(struct trim_range);
> + td->orig_buffer_size = trim_bs
> + * (unsigned long long) max_units;
> + }
> +
> + /*
> + * For reads, writes, and multi-range trim operations we need a
> + * data buffer
> + */
> + if (td_ioengine_flagged(td, FIO_NOIO) ||
> + !(td_read(td) || td_write(td) || (td_trim(td) && td->o.num_range > 1)))
> data_xfer = 0;
>
> /*
> @@ -1396,7 +1407,10 @@ int init_io_u_buffers(struct thread_data *td)
> fill_verify_pattern(td, io_u->buf, max_bs, io_u, 0, 0);
> }
> }
> - p += max_bs;
> + if (td_trim(td) && td->o.num_range > 1)
> + p += trim_bs;
> + else
> + p += max_bs;
> }
>
> return 0;
> diff --git a/cconv.c b/cconv.c
> index c9298408..ead47248 100644
> --- a/cconv.c
> +++ b/cconv.c
> @@ -111,6 +111,7 @@ int convert_thread_options_to_cpu(struct thread_options *o,
> o->serialize_overlap = le32_to_cpu(top->serialize_overlap);
> o->size = le64_to_cpu(top->size);
> o->io_size = le64_to_cpu(top->io_size);
> + o->num_range = le32_to_cpu(top->num_range);
> o->size_percent = le32_to_cpu(top->size_percent);
> o->io_size_percent = le32_to_cpu(top->io_size_percent);
> o->fill_device = le32_to_cpu(top->fill_device);
> @@ -609,6 +610,7 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
>
> top->size = __cpu_to_le64(o->size);
> top->io_size = __cpu_to_le64(o->io_size);
> + top->num_range = __cpu_to_le32(o->num_range);
> top->verify_backlog = __cpu_to_le64(o->verify_backlog);
> top->start_delay = __cpu_to_le64(o->start_delay);
> top->start_delay_high = __cpu_to_le64(o->start_delay_high);
> diff --git a/fio.1 b/fio.1
> index 7ec5c745..cfa49423 100644
> --- a/fio.1
> +++ b/fio.1
> @@ -2293,6 +2293,12 @@ end to end protection information. Default: 0x1234.
> Specifies logical block application tag mask value, if namespace is formatted
> to use end to end protection information. Default: 0xffff.
> .TP
> +.BI (io_uring_cmd)num_range \fR=\fPint
> +For trim command this will be the number of ranges to trim per I/O request.
> +The number of logical blocks per range is determined by the \fBbs\fR option
> +which should be a multiple of logical block size. This cannot be used with
> +read or write. Default: 1.
> +.TP
> .BI (cpuio)cpuload \fR=\fPint
> Attempt to use the specified percentage of CPU cycles. This is a mandatory
> option when using cpuio I/O engine.
> diff --git a/fio.h b/fio.h
> index 1322656f..5d558e7b 100644
> --- a/fio.h
> +++ b/fio.h
> @@ -71,6 +71,16 @@
>
> struct fio_sem;
>
> +#define MAX_TRIM_RANGE 256
> +
> +/*
> + * Range for trim command
> + */
> +struct trim_range {
> + unsigned long long start;
> + unsigned long long len;
> +};
> +
> /*
> * offset generator types
> */
> @@ -609,6 +619,14 @@ static inline void fio_ro_check(const struct thread_data *td, struct io_u *io_u)
> !(io_u->ddir == DDIR_TRIM && !td_trim(td)));
> }
>
> +static inline bool multi_range_trim(struct thread_data *td, struct io_u *io_u)
> +{
> + if (io_u->ddir == DDIR_TRIM && td->o.num_range > 1)
> + return true;
> + else
> + return false;
> +}
> +
> static inline bool should_fsync(struct thread_data *td)
> {
> if (td->last_was_sync)
> diff --git a/init.c b/init.c
> index 105339fa..7a0b14a3 100644
> --- a/init.c
> +++ b/init.c
> @@ -618,6 +618,19 @@ static int fixup_options(struct thread_data *td)
> ret |= 1;
> }
>
> + if (td_trimwrite(td) && o->num_range > 1) {
> + log_err("fio: trimwrite cannot be used with multiple"
> + " ranges.\n");
> + ret |= 1;
> + }
> +
> + if (td_trim(td) && o->num_range > 1 &&
> + !td_ioengine_flagged(td, FIO_MULTI_RANGE_TRIM)) {
> + log_err("fio: can't use multiple ranges with IO engine %s\n",
> + td->io_ops->name);
> + ret |= 1;
> + }
> +
> #ifndef CONFIG_PSHARED
> if (!o->use_thread) {
> log_info("fio: this platform does not support process shared"
> diff --git a/io_u.c b/io_u.c
> index 13187882..50df518c 100644
> --- a/io_u.c
> +++ b/io_u.c
> @@ -943,8 +943,11 @@ static void setup_strided_zone_mode(struct thread_data *td, struct io_u *io_u)
> static int fill_io_u(struct thread_data *td, struct io_u *io_u)
> {
> bool is_random;
> - uint64_t offset;
> + uint64_t offset, buflen, i = 0;
> enum io_u_action ret;
> + struct fio_file *f = io_u->file;
> + struct trim_range *range;
> + uint8_t *buf;
>
> if (td_ioengine_flagged(td, FIO_NOIO))
> goto out;
> @@ -966,22 +969,68 @@ static int fill_io_u(struct thread_data *td, struct io_u *io_u)
> else if (td->o.zone_mode == ZONE_MODE_ZBD)
> setup_zbd_zone_mode(td, io_u);
>
> - /*
> - * No log, let the seq/rand engine retrieve the next buflen and
> - * position.
> - */
> - if (get_next_offset(td, io_u, &is_random)) {
> - dprint(FD_IO, "io_u %p, failed getting offset\n", io_u);
> - return 1;
> - }
> + if (multi_range_trim(td, io_u)) {
> + buf = io_u->buf;
> + buflen = 0;
>
> - io_u->buflen = get_next_buflen(td, io_u, is_random);
> - if (!io_u->buflen) {
> - dprint(FD_IO, "io_u %p, failed getting buflen\n", io_u);
> - return 1;
> - }
> + while (i < td->o.num_range) {
> + range = (struct trim_range *)buf;
> + if (get_next_offset(td, io_u, &is_random)) {
> + dprint(FD_IO, "io_u %p, failed getting offset\n",
> + io_u);
> + break;
> + }
> +
> + io_u->buflen = get_next_buflen(td, io_u, is_random);
> + if (!io_u->buflen) {
> + dprint(FD_IO, "io_u %p, failed getting buflen\n", io_u);
> + break;
> + }
> +
> + if (io_u->offset + io_u->buflen > io_u->file->real_file_size) {
> + dprint(FD_IO, "io_u %p, off=0x%llx + len=0x%llx exceeds file size=0x%llx\n",
> + io_u,
> + (unsigned long long) io_u->offset, io_u->buflen,
> + (unsigned long long) io_u->file->real_file_size);
> + break;
> + }
> +
> + range->start = io_u->offset;
> + range->len = io_u->buflen;
It would be helpful to include a debug message with the offset and
length here dprint(FD_IO, "...") for troubleshooting and testing
purposes since there is no other convenient way to see all the ranges.
Vincent
next prev parent reply other threads:[~2024-02-14 17:51 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20240214064924epcas5p12bee438c7815ca28e43a536459bcddc2@epcas5p1.samsung.com>
2024-02-14 12:06 ` [PATCH 0/3] Support trim with multiple ranges Ankit Kumar
[not found] ` <CGME20240214064925epcas5p4167aa60ecee87f3a15bb2a57a4409685@epcas5p4.samsung.com>
2024-02-14 12:06 ` [PATCH 1/3] trim: add support for " Ankit Kumar
2024-02-14 14:43 ` Jens Axboe
2024-02-14 17:51 ` Vincent Fu [this message]
[not found] ` <CGME20240214064926epcas5p29dcccdd0abe0e0ffb75c64835ec6a01a@epcas5p2.samsung.com>
2024-02-14 12:06 ` [PATCH 2/3] engines/nvme: pass offset and len instead of io_u Ankit Kumar
2024-02-14 14:44 ` Jens Axboe
[not found] ` <CGME20240214064927epcas5p468b8554a7555bb666a0147df11db9d28@epcas5p4.samsung.com>
2024-02-14 12:06 ` [PATCH 3/3] engines/io_uring: add multi range dsm support Ankit Kumar
2024-02-14 14:45 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86754e2a-8083-44ea-8f02-e5966628763c@gmail.com \
--to=vincentfu@gmail.com \
--cc=ankit.kumar@samsung.com \
--cc=axboe@kernel.dk \
--cc=fio@vger.kernel.org \
--cc=joshi.k@samsung.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).