Linux-Block Archive mirror
From: Kent Overstreet <kent.overstreet@linux.dev>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Kanchan Joshi <joshi.k@samsung.com>,
	lsf-pc@lists.linux-foundation.org,
	 "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	 "linux-nvme@lists.infradead.org"
	<linux-nvme@lists.infradead.org>,
	"kbusch@kernel.org" <kbusch@kernel.org>,
	 "axboe@kernel.dk" <axboe@kernel.dk>,
	josef@toxicpanda.com, Christoph Hellwig <hch@lst.de>
Subject: Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
Date: Fri, 5 Apr 2024 02:12:01 -0400
Message-ID: <2zvjinuyaxg22d4az76n3ad7injl7ru46xzh64tuqtx3qrtwzo@zhdq434oypwg>
In-Reply-To: <yq14jdu7t2u.fsf@ca-mkp.ca.oracle.com>

On Mon, Feb 26, 2024 at 06:15:19PM -0500, Martin K. Petersen wrote:
> 
> Kanchan,
> 
> > - Generic user interface that user-space can use to exchange meta. A
> > new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
> > direct IO.
> 
> Yep. I'm interested in this too. Reviving this effort is near the top of
> my todo list so I'm happy to collaborate.
> 
> > NVMe SSD can do the offload when the host sends the PRACT bit. But in
> > the driver, this is tied to global integrity disablement using
> > CONFIG_BLK_DEV_INTEGRITY.
> 
> > So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
> > that the filesystem can send. The block-integrity and NVMe driver do
> > the rest to make the offload work.
> 
> Whether to have a block device do this is currently controlled by the
> /sys/block/foo/integrity/{read_verify,write_generate} knobs. At least
> for SCSI, protected transfers are always enabled between HBA and target
> if both support it. If no integrity has been attached to an I/O by the
> application/filesystem, the block layer will do so controlled by the
> sysfs knobs above. IOW, if the hardware is capable, protected transfers
> should always be enabled, at least from the block layer down.
> 
> It's possible that things don't work quite that way with NVMe since, at
> least for PCIe, the drive is both initiator and target. And NVMe also
> missed quite a few DIX details in its PI implementation. It's been a
> while since I messed with PI on NVMe, I'll have a look.
> 
> But in any case the intent for the Linux code was for protected
> transfers to be enabled automatically when possible. If the block layer
> protection is explicitly disabled, a filesystem can still trigger
> protected transfers via the bip flags. So that capability should
> definitely be exposed via io_uring.
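
For reference, the read_verify and write_generate knobs mentioned above
can be inspected from user space. A minimal Python sketch, assuming the
usual /sys/block/<dev>/integrity/ layout; the helper name
integrity_knobs and its sysfs_root parameter are made up for
illustration and are not part of any existing tool:

```python
from pathlib import Path

def integrity_knobs(dev, sysfs_root="/sys/block"):
    """Return {knob: int or None} for read_verify and write_generate."""
    base = Path(sysfs_root) / dev / "integrity"
    knobs = {}
    for name in ("read_verify", "write_generate"):
        try:
            knobs[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            knobs[name] = None  # knob missing, unreadable, or not numeric
    return knobs

# Example call (device name is a placeholder):
#   integrity_knobs("sda")
```

A value of None here only means the file could not be read as an
integer; it says nothing about whether the hardware supports PI.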

I've little interest in checksum calculation offload - but protected
transfers are interesting.

bcachefs moves data around in the background (copygc, rebalance).
Whenever we move existing data, we're careful to carry the existing
checksum along and revalidate it at every step. When we have to
compute new checksums (when fragmenting an existing extent), we check
that they sum up to the old checksum.
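
To illustrate the fragment check described above, here is a minimal
Python sketch. It assumes a plain CRC-32 (zlib's) as the checksum; this
is a stand-in for illustration, not the bcachefs code, and the names
crc32_combine and check_split are hypothetical:

```python
import zlib

def crc32_combine(crc_a, crc_b, len_b):
    """CRC-32 of A||B from crc32(A), crc32(B) and len(B), without the data."""
    zeros = bytes(len_b)
    # Advance crc_a across len_b zero bytes, cancel the init/final-xor
    # contribution of that zero run, then fold in crc_b.
    return zlib.crc32(zeros, crc_a) ^ zlib.crc32(zeros) ^ crc_b

def check_split(extent, split_at, old_crc):
    """Checksum both fragments and verify they merge back to old_crc."""
    head, tail = extent[:split_at], extent[split_at:]
    crc_head, crc_tail = zlib.crc32(head), zlib.crc32(tail)
    if crc32_combine(crc_head, crc_tail, len(tail)) != old_crc:
        raise ValueError("fragment checksums do not merge to the old checksum")
    return crc_head, crc_tail
```

The merge here costs time linear in the fragment length because it
feeds a run of zero bytes through the CRC; zlib's C crc32_combine()
gets the same result with matrix arithmetic in logarithmic time.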

It'd be pretty cool to push this down into the storage device (and up
into the page cache as well).

