From: Keith Busch <kbusch@kernel.org>
To: Kanchan Joshi <joshi.k@samsung.com>
Cc: lsf-pc@lists.linux-foundation.org,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	josef@toxicpanda.com, Christoph Hellwig <hch@lst.de>
Subject: Re: [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
Date: Thu, 22 Feb 2024 13:08:54 -0700
Message-ID: <Zdep1jEY4kFFxxk8@kbusch-mbp>
In-Reply-To: <aca1e970-9785-5ff4-807b-9f892af71741@samsung.com>

On Fri, Feb 23, 2024 at 01:03:01AM +0530, Kanchan Joshi wrote:
> With respect to the current state of Meta/Block-integrity, there are
> some missing pieces.
> I can improve some of it. But not sure if I am up to speed on the
> history behind the status quo.
> 
> Hence, this proposal to discuss the pieces.
> 
> Maybe people would like to discuss other points too, but I have the 
> following:
> 
> - Generic user interface that user-space can use to exchange meta. A
> new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
> direct IO. Buffered IO seems non-trivial as a relatively smaller meta
> needs to be written into/read from the page cache. The related
> metadata must also be written during the writeback (of data).
> 
> 
> - Is there interest in filesystems leveraging the integrity capabilities
> that almost every enterprise SSD has?
> Filesystems lacking checksumming abilities can still ask the SSD to do
> it and be more robust.
> And for BTRFS - there may be value in offloading the checksum to SSD.
> Either to save the host CPU or to get more usable space (by not
> writing the checksum tree). The mount option 'nodatasum' can turn off
> the data checksumming, but more needs to be done to make the offload
> work.

As I understand it, btrfs computes its checksums over variable-sized
extents, whereas offloading to the SSD would checksum each block
individually, so the offload would force a new on-disk format. It would
be cool to use, though: you could atomically update data and checksums
without needing stable pages.
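
To make the per-block granularity concrete, here is a minimal userspace
sketch of what "one checksum per block" looks like with T10-style
protection information: each logical block gets its own 8-byte tuple
(guard, application, and reference tags). The names pi_tuple,
pi_generate and pi_verify are hypothetical and are not kernel or btrfs
APIs. The sketch assumes Type 1 style reference tags (low 32 bits of
the LBA) and keeps the fields in host byte order, whereas the on-wire
format is big-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte PI tuple, one per logical block (host-endian here). */
struct pi_tuple {
	uint16_t guard_tag;	/* CRC16 (T10-DIF) of the block's data */
	uint16_t app_tag;	/* opaque to the device; left as 0 in this sketch */
	uint32_t ref_tag;	/* assumed Type 1: low 32 bits of the LBA */
};

/* Bitwise CRC16 with the T10-DIF polynomial 0x8BB7, initial value 0. */
static uint16_t crc16_t10dif(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(buf[i] << 8);
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

/* Fill one tuple per block; this is the work a PRACT-style offload would do. */
static void pi_generate(const uint8_t *data, size_t nblocks, size_t block_size,
			uint32_t start_lba, struct pi_tuple *out)
{
	assert(block_size > 0);
	for (size_t i = 0; i < nblocks; i++) {
		out[i].guard_tag = crc16_t10dif(data + i * block_size, block_size);
		out[i].app_tag = 0;
		out[i].ref_tag = start_lba + (uint32_t)i;
	}
}

/* Return the index of the first block whose tuple mismatches, or -1. */
static long pi_verify(const uint8_t *data, size_t nblocks, size_t block_size,
		      uint32_t start_lba, const struct pi_tuple *pi)
{
	for (size_t i = 0; i < nblocks; i++) {
		if (pi[i].guard_tag != crc16_t10dif(data + i * block_size, block_size) ||
		    pi[i].ref_tag != start_lba + (uint32_t)i)
			return (long)i;
	}
	return -1;
}
```

Because every tuple covers exactly one block, a mismatch points at a
single block rather than at a whole extent, which is the granularity
difference an on-disk format would have to account for.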
 
> NVMe SSD can do the offload when the host sends the PRACT bit. But in
> the driver, this is tied to global integrity disablement using
> CONFIG_BLK_DEV_INTEGRITY.
> So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
> that the filesystem can send. The block-integrity and NVMe driver do
> the rest to make the offload work.
> 
> - Currently, block integrity uses guard and ref tags but not application 
> tags.
> As per Martin's paper [*]:
> 
> "Work is in progress to implement support for the data
> integrity extensions in btrfs, enabling the filesystem
> to use the application tag."


