* [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
       [not found] <CGME20240222193304epcas5p318426c5267ee520e6b5710164c533b7d@epcas5p3.samsung.com>
@ 2024-02-22 19:33 ` Kanchan Joshi
  2024-02-22 20:08   ` Keith Busch
                     ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Kanchan Joshi @ 2024-02-22 19:33 UTC (permalink / raw
  To: lsf-pc, linux-block@vger.kernel.org, Linux FS Devel,
	linux-nvme@lists.infradead.org
  Cc: Martin K. Petersen, kbusch@kernel.org, axboe@kernel.dk, josef,
	Christoph Hellwig

The current state of Meta/Block-integrity has some missing pieces.
I can improve some of them, but I am not sure I am up to speed on the
history behind the status quo.

Hence this proposal to discuss those pieces.

Maybe people would like to discuss other points too, but I have the 
following:

- A generic user interface that user-space can use to exchange metadata. A
new io_uring opcode, IORING_OP_READ/WRITE_META, seems feasible for
direct IO. Buffered IO seems non-trivial, as a relatively small amount of
metadata needs to be written into/read from the page cache, and that
metadata must also be written out during writeback of the data.


- Is there interest in filesystems leveraging the integrity capabilities
that almost every enterprise SSD has?
Filesystems lacking checksumming abilities can still ask the SSD to do
it and be more robust.
And for BTRFS there may be value in offloading the checksum to the SSD,
either to save host CPU or to get more usable space (by not
writing the checksum tree). The mount option 'nodatasum' can turn off
the data checksumming, but more needs to be done to make the offload
work.

An NVMe SSD can do the offload when the host sets the PRACT bit. But in
the driver, this is tied to integrity being globally disabled via
CONFIG_BLK_DEV_INTEGRITY.
So the idea is to introduce a bio flag, REQ_INTEGRITY_OFFLOAD,
that the filesystem can send; block-integrity and the NVMe driver do
the rest to make the offload work.
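
A rough sketch of the proposed plumbing (REQ_INTEGRITY_OFFLOAD is only the
proposed flag name and does not exist yet; the nvme hunk merely marks the
spot where PRACT is decided today, it is not a working patch):

/* filesystem side (sketch): opt in to device-side PI, per bio */
bio->bi_opf |= REQ_INTEGRITY_OFFLOAD;

/* nvme driver side (sketch, around nvme_setup_rw()): if the bio asked for
 * the offload and block-integrity attached no payload, let the controller
 * generate/verify PI by setting PRACT.
 */
if ((req->cmd_flags & REQ_INTEGRITY_OFFLOAD) && !blk_integrity_rq(req))
        control |= NVME_RW_PRINFO_PRACT;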

- Currently, block integrity uses guard and ref tags but not application 
tags.
As per Martin's paper [*]:

"Work is in progress to implement support for the data
integrity extensions in btrfs, enabling the filesystem
to use the application tag."

I could not figure out more about the above effort. It will be good to
understand the progress/problems.

I hope to have an RFC (on the first two items) ready for a better discussion.

[*] https://www.landley.net/kdocs/ols/2008/ols2008v2-pages-151-156.pdf


* Re: [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-22 19:33 ` [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements Kanchan Joshi
@ 2024-02-22 20:08   ` Keith Busch
  2024-02-23 12:41     ` Kanchan Joshi
  2024-02-23 14:38   ` David Sterba
  2024-02-26 23:15   ` [Lsf-pc] " Martin K. Petersen
  2 siblings, 1 reply; 16+ messages in thread
From: Keith Busch @ 2024-02-22 20:08 UTC (permalink / raw
  To: Kanchan Joshi
  Cc: lsf-pc, linux-block@vger.kernel.org, Linux FS Devel,
	linux-nvme@lists.infradead.org, Martin K. Petersen,
	axboe@kernel.dk, josef, Christoph Hellwig

On Fri, Feb 23, 2024 at 01:03:01AM +0530, Kanchan Joshi wrote:
> With respect to the current state of Meta/Block-integrity, there are
> some missing pieces.
> I can improve some of it. But not sure if I am up to speed on the
> history behind the status quo.
> 
> Hence, this proposal to discuss the pieces.
> 
> Maybe people would like to discuss other points too, but I have the 
> following:
> 
> - Generic user interface that user-space can use to exchange meta. A
> new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
> direct IO. Buffered IO seems non-trivial as a relatively smaller meta
> needs to be written into/read from the page cache. The related
> metadata must also be written during the writeback (of data).
> 
> 
> - Is there interest in filesystem leveraging the integrity capabilities 
> that almost every enterprise SSD has.
> Filesystems lacking checksumming abilities can still ask the SSD to do
> it and be more robust.
> And for BTRFS - there may be value in offloading the checksum to SSD.
> Either to save the host CPU or to get more usable space (by not
> writing the checksum tree). The mount option 'nodatasum' can turn off
> the data checksumming, but more needs to be done to make the offload
> work.

As I understand it, btrfs's checksums are on a variable extent size, but
offloading it to the SSD would do it per block, so it's forcing a new
on-disk format. It would be cool to use it, though: you could atomically
update data and checksums without stable pages.
 
> NVMe SSD can do the offload when the host sends the PRACT bit. But in
> the driver, this is tied to global integrity disablement using
> CONFIG_BLK_DEV_INTEGRITY.
> So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
> that the filesystem can send. The block-integrity and NVMe driver do
> the rest to make the offload work.
> 
> - Currently, block integrity uses guard and ref tags but not application 
> tags.
> As per Martin's paper [*]:
> 
> "Work is in progress to implement support for the data
> integrity extensions in btrfs, enabling the filesystem
> to use the application tag."


* Re: [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-22 20:08   ` Keith Busch
@ 2024-02-23 12:41     ` Kanchan Joshi
  0 siblings, 0 replies; 16+ messages in thread
From: Kanchan Joshi @ 2024-02-23 12:41 UTC (permalink / raw
  To: Keith Busch
  Cc: lsf-pc, linux-block@vger.kernel.org, Linux FS Devel,
	linux-nvme@lists.infradead.org, Martin K. Petersen,
	axboe@kernel.dk, josef, Christoph Hellwig

On 2/23/2024 1:38 AM, Keith Busch wrote:
> On Fri, Feb 23, 2024 at 01:03:01AM +0530, Kanchan Joshi wrote:
>> With respect to the current state of Meta/Block-integrity, there are
>> some missing pieces.
>> I can improve some of it. But not sure if I am up to speed on the
>> history behind the status quo.
>>
>> Hence, this proposal to discuss the pieces.
>>
>> Maybe people would like to discuss other points too, but I have the
>> following:
>>
>> - Generic user interface that user-space can use to exchange meta. A
>> new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
>> direct IO. Buffered IO seems non-trivial as a relatively smaller meta
>> needs to be written into/read from the page cache. The related
>> metadata must also be written during the writeback (of data).
>>
>>
>> - Is there interest in filesystem leveraging the integrity capabilities
>> that almost every enterprise SSD has.
>> Filesystems lacking checksumming abilities can still ask the SSD to do
>> it and be more robust.
>> And for BTRFS - there may be value in offloading the checksum to SSD.
>> Either to save the host CPU or to get more usable space (by not
>> writing the checksum tree). The mount option 'nodatasum' can turn off
>> the data checksumming, but more needs to be done to make the offload
>> work.
> 
> As I understand it, btrfs's checksums are on a variable extent size, but
> offloading it to the SSD would do it per block, so it's forcing a new
> on-disk format. It would be cool to use it, though: you could atomically
> update data and checksums without stable pages.
>   

Yes, the extents are variable, but btrfs computes a checksum for each FS
block (4K-64K, practically 4K) within each extent.
An on-disk format change will not be needed, because in this approach the FS
(and block-integrity) does not really deal with checksums at all; it only
asks the device to compute/verify them.

Am I missing your point?

>> NVMe SSD can do the offload when the host sends the PRACT bit. But in
>> the driver, this is tied to global integrity disablement using
>> CONFIG_BLK_DEV_INTEGRITY.
>> So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
>> that the filesystem can send. The block-integrity and NVMe driver do
>> the rest to make the offload work.
>>




* Re: [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-22 19:33 ` [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements Kanchan Joshi
  2024-02-22 20:08   ` Keith Busch
@ 2024-02-23 14:38   ` David Sterba
  2024-02-26 23:15   ` [Lsf-pc] " Martin K. Petersen
  2 siblings, 0 replies; 16+ messages in thread
From: David Sterba @ 2024-02-23 14:38 UTC (permalink / raw
  To: Kanchan Joshi
  Cc: lsf-pc, linux-block@vger.kernel.org, Linux FS Devel,
	linux-nvme@lists.infradead.org, Martin K. Petersen,
	kbusch@kernel.org, axboe@kernel.dk, josef, Christoph Hellwig

On Fri, Feb 23, 2024 at 01:03:01AM +0530, Kanchan Joshi wrote:
> - Is there interest in filesystem leveraging the integrity capabilities 
> that almost every enterprise SSD has.
> Filesystems lacking checksumming abilities can still ask the SSD to do
> it and be more robust.
> And for BTRFS - there may be value in offloading the checksum to SSD.
> Either to save the host CPU or to get more usable space (by not
> writing the checksum tree). The mount option 'nodatasum' can turn off
> the data checksumming, but more needs to be done to make the offload
> work.

What would the interface for offloading be? E.g. the SSD capability could
be exposed via the async hash (ahash) interface of the Linux crypto API.

As you say, using the nodatasum option for the whole filesystem would
achieve the offloading and avoid storing the checksums. But other
approaches would need an interface for communicating the checksum values
back to the filesystem.

Dealing with ahash as the interface is not straightforward; it may need
additional memory for requests and setting up pages to pass the data
in. All that, plus the latency of issuing the request and waiting, could
be slower than calculating the checksum on the CPU.
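
For illustration, roughly what a single crc32c over one block looks like
through ahash (standard crypto API calls; synchronous wait via crypto_wait_req,
error handling trimmed) - this is the request/scatterlist boilerplate
referred to above:

#include <crypto/hash.h>
#include <linux/scatterlist.h>

static int csum_block_via_ahash(void *buf, unsigned int len, u8 *out)
{
        struct crypto_ahash *tfm = crypto_alloc_ahash("crc32c", 0, 0);
        struct ahash_request *req;
        struct scatterlist sg;
        DECLARE_CRYPTO_WAIT(wait);
        int err;

        if (IS_ERR(tfm))
                return PTR_ERR(tfm);

        req = ahash_request_alloc(tfm, GFP_NOFS);
        if (!req) {
                crypto_free_ahash(tfm);
                return -ENOMEM;
        }

        sg_init_one(&sg, buf, len);
        ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
                                   crypto_req_done, &wait);
        ahash_request_set_crypt(req, &sg, out, len);

        /* the request may complete asynchronously; block until it does */
        err = crypto_wait_req(crypto_ahash_digest(req), &wait);

        ahash_request_free(req);
        crypto_free_ahash(tfm);
        return err;
}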

Also, the ahash interface is getting less popular; fsverity removed its
support not so long ago [1].

[1] https://lore.kernel.org/linux-crypto/20230406003714.94580-1-ebiggers@kernel.org/


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-22 19:33 ` [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements Kanchan Joshi
  2024-02-22 20:08   ` Keith Busch
  2024-02-23 14:38   ` David Sterba
@ 2024-02-26 23:15   ` Martin K. Petersen
  2024-03-27 13:45     ` Kanchan Joshi
                       ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Martin K. Petersen @ 2024-02-26 23:15 UTC (permalink / raw
  To: Kanchan Joshi
  Cc: lsf-pc, linux-block@vger.kernel.org, Linux FS Devel,
	linux-nvme@lists.infradead.org, kbusch@kernel.org,
	axboe@kernel.dk, josef, Martin K. Petersen, Christoph Hellwig


Kanchan,

> - Generic user interface that user-space can use to exchange meta. A
> new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
> direct IO.

Yep. I'm interested in this too. Reviving this effort is near the top of
my todo list so I'm happy to collaborate.

> NVMe SSD can do the offload when the host sends the PRACT bit. But in
> the driver, this is tied to global integrity disablement using
> CONFIG_BLK_DEV_INTEGRITY.

> So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
> that the filesystem can send. The block-integrity and NVMe driver do
> the rest to make the offload work.

Whether to have a block device do this is currently controlled by the
/sys/block/foo/integrity/{read_verify,write_generate} knobs. At least
for SCSI, protected transfers are always enabled between HBA and target
if both support it. If no integrity has been attached to an I/O by the
application/filesystem, the block layer will do so controlled by the
sysfs knobs above. IOW, if the hardware is capable, protected transfers
should always be enabled, at least from the block layer down.

It's possible that things don't work quite that way with NVMe since, at
least for PCIe, the drive is both initiator and target. And NVMe also
missed quite a few DIX details in its PI implementation. It's been a
while since I messed with PI on NVMe, I'll have a look.

But in any case the intent for the Linux code was for protected
transfers to be enabled automatically when possible. If the block layer
protection is explicitly disabled, a filesystem can still trigger
protected transfers via the bip flags. So that capability should
definitely be exposed via io_uring.

> "Work is in progress to implement support for the data integrity
> extensions in btrfs, enabling the filesystem to use the application
> tag."

This didn't go anywhere for a couple of reasons:

 - Individual disk drives supported ATO but every storage array we
   worked with used the app tag space internally. And thus there were
   very few real-life situations where it would be possible to store
   additional information in each block.

   Back in the mid-2000s, putting enterprise data on individual disk
   drives was not considered acceptable. So implementing filesystem
   support that would only be usable on individual disk drives didn't
   seem worth the investment. Especially when the PI-for-ATA efforts
   were abandoned.

   Wrt. the app tag ownership situation in SCSI, the storage tag in the
   NVMe spec is a remedy for this, allowing the application to own part of
   the extra tag space and the storage device to own another part.

 - Our proposed use case for the app tag was to provide filesystems with
   back pointers without having to change the on-disk format.

   The use of 0xFFFF as escape check in PI meant that the caller had to
   be very careful about what to store in the app tag. Our prototype
   attached structs of metadata to each filesystem block (8 512-byte
   sectors * 2 bytes of PI, so 16 bytes of metadata per filesystem
   block). But none of those 2-byte blobs could contain the value
   0xFFFF. It wasn't really a great interface for filesystems that wanted
   to attach whatever data structure was important to them.
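
To illustrate how awkward that was (a hypothetical layout, not the actual
prototype code): a 4KiB filesystem block spans eight 512-byte sectors, each
contributing a 2-byte app tag, and no 16-bit word of the resulting blob may
ever be 0xFFFF:

#include <stdbool.h>
#include <stdint.h>

/* 8 sectors per 4KiB block * 2 app-tag bytes = 16 bytes of opaque metadata */
struct fs_block_backref {
        uint16_t words[8];
};

/* 0xFFFF is the PI escape value and must never appear in any word */
static bool backref_is_storable(const struct fs_block_backref *b)
{
        for (int i = 0; i < 8; i++)
                if (b->words[i] == 0xFFFF)
                        return false;
        return true;
}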

So between a very limited selection of hardware actually providing the
app tag space and a clunky interface for filesystems, the app tag just
never really took off. We ended up modifying it to be an access control
instead, see the app tag control mode page in SCSI.

Databases and many filesystems have means to protect blocks or extents.
And these means are often better at identifying the nature of read-time
problems than a CRC over each 512-byte LBA would be. So what made PI
interesting was the ability to catch problems at write time in case of a
bad partition remap, wrong buffer pointer, misordered blocks, etc. Once
the data is on media, the drive ECC is superior. And again, at read time
the database or application is often better equipped to identify
corruption than PI.

And consequently our interest focused on treating PI as something more akin
to a network checksum than a facility to protect data at rest on media.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-26 23:15   ` [Lsf-pc] " Martin K. Petersen
@ 2024-03-27 13:45     ` Kanchan Joshi
  2024-03-28  0:30       ` Martin K. Petersen
  2024-04-02 10:45     ` Dongyang Li
  2024-04-05  6:12     ` Kent Overstreet
  2 siblings, 1 reply; 16+ messages in thread
From: Kanchan Joshi @ 2024-03-27 13:45 UTC (permalink / raw
  To: Martin K. Petersen
  Cc: lsf-pc, linux-block@vger.kernel.org, Linux FS Devel,
	linux-nvme@lists.infradead.org, kbusch@kernel.org,
	axboe@kernel.dk, josef, Christoph Hellwig

On 2/27/2024 4:45 AM, Martin K. Petersen wrote:
> 
> Kanchan,
> 
>> - Generic user interface that user-space can use to exchange meta. A
>> new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
>> direct IO.
> 
> Yep. I'm interested in this too. Reviving this effort is near the top of
> my todo list so I'm happy to collaborate.

The first cut is here:
https://lore.kernel.org/linux-block/20240322185023.131697-1-joshi.k@samsung.com/

Not sure how far it is from the requirements you may have. Feedback will 
help.
Perhaps the interface needs the ability to tell what kind of checks 
(guard, apptag, reftag) are desired.
Doable, but that will require the introduction of three new RWF_* flags.

>> NVMe SSD can do the offload when the host sends the PRACT bit. But in
>> the driver, this is tied to global integrity disablement using
>> CONFIG_BLK_DEV_INTEGRITY.
> 
>> So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
>> that the filesystem can send. The block-integrity and NVMe driver do
>> the rest to make the offload work.
> 
> Whether to have a block device do this is currently controlled by the
> /sys/block/foo/integrity/{read_verify,write_generate} knobs.

Right. This can work for the case where the host does not need to pass the
buffer (meta-size is equal to pi-size).
But when meta-size is greater than pi-size, the meta buffer needs to be
allocated. Some changes are required so that block-integrity does that
allocation without having to do read_verify/write_generate.

> At least
> for SCSI, protected transfers are always enabled between HBA and target
> if both support it. If no integrity has been attached to an I/O by the
> application/filesystem, the block layer will do so controlled by the
> sysfs knobs above. IOW, if the hardware is capable, protected transfers
> should always be enabled, at least from the block layer down.
> It's possible that things don't work quite that way with NVMe since, at
> least for PCIe, the drive is both initiator and target. And NVMe also
> missed quite a few DIX details in its PI implementation. It's been a
> while since I messed with PI on NVMe, I'll have a look.

The PRACT=1 case is covered by Figure 9 and Section 5.2.2 of the spec:
https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-1.0d-2023.12.28-Ratified.pdf

I am not sure whether SCSI has an equivalent of this bit.


> But in any case the intent for the Linux code was for protected
> transfers to be enabled automatically when possible. If the block layer
> protection is explicitly disabled, a filesystem can still trigger
> protected transfers via the bip flags. So that capability should
> definitely be exposed via io_uring.
> 
>> "Work is in progress to implement support for the data integrity
>> extensions in btrfs, enabling the filesystem to use the application
>> tag."
> 
> This didn't go anywhere for a couple of reasons:

Thanks, this was very helpful!


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-03-27 13:45     ` Kanchan Joshi
@ 2024-03-28  0:30       ` Martin K. Petersen
  2024-03-29 11:35         ` Kanchan Joshi
  0 siblings, 1 reply; 16+ messages in thread
From: Martin K. Petersen @ 2024-03-28  0:30 UTC (permalink / raw
  To: Kanchan Joshi
  Cc: Martin K. Petersen, axboe@kernel.dk, josef,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	kbusch@kernel.org, Linux FS Devel, lsf-pc, Christoph Hellwig


Hi Kanchan!

> Not sure how far it is from the requirements you may have. Feedback
> will help. Perhaps the interface needs the ability to tell what kind
> of checks (guard, apptag, reftag) are desired. Doable, but that will
> require the introduction of three new RWF_* flags.

I'm working on getting my test tooling working with your series. But
yes, I'll definitely need a way to set the bip flags.

> Right. This can work for the case when host does not need to pass the
> buffer (meta-size is equal to pi-size). But when meta-size is greater
> than pi-size, the meta-buffer needs to be allocated. Some changes are
> required so that Block-integrity does that allocation, without having
> to do read_verify/write_generate.

Not sure I follow. Do you want the non-PI metadata to be passed in from
userland but the kernel or controller to generate the PI?

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-03-28  0:30       ` Martin K. Petersen
@ 2024-03-29 11:35         ` Kanchan Joshi
  2024-04-03  2:10           ` Martin K. Petersen
  0 siblings, 1 reply; 16+ messages in thread
From: Kanchan Joshi @ 2024-03-29 11:35 UTC (permalink / raw
  To: Martin K. Petersen
  Cc: axboe@kernel.dk, josef, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, kbusch@kernel.org, Linux FS Devel,
	lsf-pc, Christoph Hellwig

On 3/28/2024 6:00 AM, Martin K. Petersen wrote:
> 
> Hi Kanchan!
> 
>> Not sure how far it is from the requirements you may have. Feedback
>> will help. Perhaps the interface needs the ability to tell what kind
>> of checks (guard, apptag, reftag) are desired. Doable, but that will
>> require the introduction of three new RWF_* flags.
> 
> I'm working on getting my test tooling working with your series.

If it helps, here is a simple application for the interface [*].
It does not exercise the guard/apptag/reftag checks; for that, fio is better.

> But
> yes, I'll definitely need a way to set the bip flags.

Just to be clear, I was thinking of three new flags that userspace can
pass: RWF_CHK_GUARD, RWF_CHK_APPTAG, RWF_CHK_REFTAG.
Corresponding bip flags will need to be introduced (I don't see
anything existing). The driver will see those and convert them to
protocol-specific flags.
Does this match what you have in mind?
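
Roughly what I have in mind (all of the RWF_CHK_* and BIP_CHECK_* names
below are hypothetical and do not exist yet; only the NVME_RW_PRINFO_PRCHK_*
bits are existing driver definitions):

/* userspace-visible flags (hypothetical names and values) */
#define RWF_CHK_GUARD   ((__kernel_rwf_t)0x00000100)
#define RWF_CHK_APPTAG  ((__kernel_rwf_t)0x00000200)
#define RWF_CHK_REFTAG  ((__kernel_rwf_t)0x00000400)

/* block layer (sketch): translate to per-bip flags while building the bip;
 * rw_flags stands for the RWF_* bits passed in by the caller */
if (rw_flags & RWF_CHK_GUARD)
        bip->bip_flags |= BIP_CHECK_GUARD;
if (rw_flags & RWF_CHK_APPTAG)
        bip->bip_flags |= BIP_CHECK_APPTAG;
if (rw_flags & RWF_CHK_REFTAG)
        bip->bip_flags |= BIP_CHECK_REFTAG;

/* nvme (sketch): convert to the PRCHK bits of the rw command's control field */
if (bip->bip_flags & BIP_CHECK_GUARD)
        control |= NVME_RW_PRINFO_PRCHK_GUARD;
if (bip->bip_flags & BIP_CHECK_APPTAG)
        control |= NVME_RW_PRINFO_PRCHK_APP;
if (bip->bip_flags & BIP_CHECK_REFTAG)
        control |= NVME_RW_PRINFO_PRCHK_REF;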


>> Right. This can work for the case when host does not need to pass the
>> buffer (meta-size is equal to pi-size). But when meta-size is greater
>> than pi-size, the meta-buffer needs to be allocated. Some changes are
>> required so that Block-integrity does that allocation, without having
>> to do read_verify/write_generate.
> 
> Not sure I follow. Do you want the non-PI metadata to be passed in from
> userland but the kernel or controller to generate the PI?
> 

No, this has no connection with userland. It seems discussing this
concurrently with the user-interface topic will cause confusion. Maybe this
is better discussed (after some time) along with its own RFC.

[*]

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <linux/io_uring.h>
#include "liburing.h"

/* write data/meta, read data/meta back, compare the data/meta buffers */
/* prerequisite: format the namespace with 4KB data + 8b metadata, pi_type = 0 */
/* note: IORING_OP_READ/WRITE_META and the meta_addr/meta_len SQE fields come
 * from the RFC series referenced above; this does not build against a stock
 * kernel/liburing */

#define DATA_LEN 4096
#define META_LEN 8

int main(int argc, char *argv[])
{
        struct io_uring ring;
        struct io_uring_sqe *sqe = NULL;
        struct io_uring_cqe *cqe = NULL;
        void *wdb, *rdb;
        char wmb[META_LEN], rmb[META_LEN];
        char *data_str = "data buffer";
        char *meta_str = "meta";
        int fd, ret, blksize;
        struct stat st;
        unsigned long long offset = 0;

        if (argc != 2) {
                fprintf(stderr, "Usage: %s <block-device>\n", argv[0]);
                return 1;
        }

        if (stat(argv[1], &st) == 0) {
                blksize = (int)st.st_blksize;
        } else {
                perror("stat");
                return 1;
        }

        if (posix_memalign(&wdb, blksize, DATA_LEN)) {
                perror("posix_memalign failed");
                return 1;
        }
        if (posix_memalign(&rdb, blksize, DATA_LEN)) {
                perror("posix_memalign failed");
                return 1;
        }

        strcpy(wdb, data_str);
        strcpy(wmb, meta_str);

        fd = open(argv[1], O_RDWR | O_DIRECT);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        ret = io_uring_queue_init(8, &ring, 0);
        if (ret) {
                fprintf(stderr, "ring setup failed: %d\n", ret);
                return 1;
        }

        /* write data + meta-buffer to the device */
        sqe = io_uring_get_sqe(&ring);
        if (!sqe) {
                fprintf(stderr, "get sqe failed\n");
                return 1;
        }

        io_uring_prep_write(sqe, fd, wdb, DATA_LEN, offset);
        sqe->opcode = IORING_OP_WRITE_META;
        sqe->meta_addr = (__u64)wmb;
        sqe->meta_len = META_LEN;
        /*
         * TBD: flags to ask for guard/apptag/reftag checks
         * sqe->rw_flags = RWF_CHK_GUARD | RWF_CHK_APPTAG | RWF_CHK_REFTAG;
         */
        ret = io_uring_submit(&ring);
        if (ret <= 0) {
                fprintf(stderr, "sqe submit failed: %d\n", ret);
                return 1;
        }

        ret = io_uring_wait_cqe(&ring, &cqe);
        if (ret < 0 || !cqe) {
                fprintf(stderr, "cqe is NULL: %d\n", ret);
                return 1;
        }
        if (cqe->res < 0) {
                fprintf(stderr, "write cqe failure: %d\n", cqe->res);
                return 1;
        }

        io_uring_cqe_seen(&ring, cqe);

        /* read data + meta-buffer back from the device */
        sqe = io_uring_get_sqe(&ring);
        if (!sqe) {
                fprintf(stderr, "get sqe failed\n");
                return 1;
        }

        io_uring_prep_read(sqe, fd, rdb, DATA_LEN, offset);
        sqe->opcode = IORING_OP_READ_META;
        sqe->meta_addr = (__u64)rmb;
        sqe->meta_len = META_LEN;

        ret = io_uring_submit(&ring);
        if (ret <= 0) {
                fprintf(stderr, "sqe submit failed: %d\n", ret);
                return 1;
        }

        ret = io_uring_wait_cqe(&ring, &cqe);
        if (ret < 0 || !cqe) {
                fprintf(stderr, "cqe is NULL: %d\n", ret);
                return 1;
        }
        if (cqe->res < 0) {
                fprintf(stderr, "read cqe failure: %d\n", cqe->res);
                return 1;
        }
        io_uring_cqe_seen(&ring, cqe);

        if (strncmp(wmb, rmb, META_LEN))
                printf("Failure: meta mismatch! wmb=%s, rmb=%s\n", wmb, rmb);

        if (strncmp(wdb, rdb, DATA_LEN))
                printf("Failure: data mismatch!\n");

        io_uring_queue_exit(&ring);
        close(fd);
        free(rdb);
        free(wdb);
        return 0;
}


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-26 23:15   ` [Lsf-pc] " Martin K. Petersen
  2024-03-27 13:45     ` Kanchan Joshi
@ 2024-04-02 10:45     ` Dongyang Li
  2024-04-02 11:37       ` Hannes Reinecke
  2024-04-02 16:52       ` Kanchan Joshi
  2024-04-05  6:12     ` Kent Overstreet
  2 siblings, 2 replies; 16+ messages in thread
From: Dongyang Li @ 2024-04-02 10:45 UTC (permalink / raw
  To: joshi.k@samsung.com, martin.petersen@oracle.com
  Cc: hch@lst.de, linux-fsdevel@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	lsf-pc@lists.linux-foundation.org, josef@toxicpanda.com,
	kbusch@kernel.org, linux-block@vger.kernel.org

Martin, Kanchan,
> 
> Kanchan,
> 
> > - Generic user interface that user-space can use to exchange meta.
> > A
> > new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
> > direct IO.
> 
> Yep. I'm interested in this too. Reviving this effort is near the top
> of
> my todo list so I'm happy to collaborate.
If we are going to have an interface to exchange meta/integrity with
user-space, could we also have an in-kernel interface to do the same?

It would be useful for some network filesystem/block device drivers
like nbd/drbd/NVMe-oF to use blk-integrity as a network checksum, with the
same checksum covering the I/O on the server as well.

The integrity can be generated on the client and sent over the network;
on the server, blk-integrity can just offload it to storage.
Verification follows the same principle: on the server, blk-integrity gets
the PI from storage using the interface and sends it over the network;
on the client we can do the usual verification.

In the past we tried to achieve this: there was a patch to add optional
generate/verify functions that take priority over the ones from the
integrity profile, with those optional functions doing the
meta/PI exchange, but it didn't get traction. It would be much better
if we could have a bio interface for this.

Cheers
Dongyang
> 
> > NVMe SSD can do the offload when the host sends the PRACT bit. But
> > in
> > the driver, this is tied to global integrity disablement using
> > CONFIG_BLK_DEV_INTEGRITY.
> 
> > So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
> > that the filesystem can send. The block-integrity and NVMe driver
> > do
> > the rest to make the offload work.
> 
> Whether to have a block device do this is currently controlled by the
> /sys/block/foo/integrity/{read_verify,write_generate} knobs. At least
> for SCSI, protected transfers are always enabled between HBA and
> target
> if both support it. If no integrity has been attached to an I/O by
> the
> application/filesystem, the block layer will do so controlled by the
> sysfs knobs above. IOW, if the hardware is capable, protected
> transfers
> should always be enabled, at least from the block layer down.
> 
> It's possible that things don't work quite that way with NVMe since,
> at
> least for PCIe, the drive is both initiator and target. And NVMe also
> missed quite a few DIX details in its PI implementation. It's been a
> while since I messed with PI on NVMe, I'll have a look.
> 
> But in any case the intent for the Linux code was for protected
> transfers to be enabled automatically when possible. If the block
> layer
> protection is explicitly disabled, a filesystem can still trigger
> protected transfers via the bip flags. So that capability should
> definitely be exposed via io_uring.
> 
> > "Work is in progress to implement support for the data integrity
> > extensions in btrfs, enabling the filesystem to use the application
> > tag."
> 
> This didn't go anywhere for a couple of reasons:
> 
>  - Individual disk drives supported ATO but every storage array we
>    worked with used the app tag space internally. And thus there were
>    very few real-life situations where it would be possible to store
>    additional information in each block.
> 
>    Back in the mid-2000s, putting enterprise data on individual disk
>    drives was not considered acceptable. So implementing filesystem
>    support that would only be usable on individual disk drives didn't
>    seem worth the investment. Especially when the PI-for-ATA efforts
>    were abandoned.
> 
>    Wrt. the app tag ownership situation in SCSI, the storage tag in
> NVMe
>    spec is a remedy for this, allowing the application to own part of
>    the extra tag space and the storage device itself another.
> 
>  - Our proposed use case for the app tag was to provide filesystems
> with
>    back pointers without having to change the on-disk format.
> 
>    The use of 0xFFFF as escape check in PI meant that the caller had
> to
>    be very careful about what to store in the app tag. Our prototype
>    attached structs of metadata to each filesystem block (8 512-byte
>    sectors * 2 bytes of PI, so 16 bytes of metadata per filesystem
>    block). But none of those 2-byte blobs could contain the value
>    0xFFFF. Wasn't really a great interface for filesystems that
> wanted
>    to be able to attach whatever data structure was important to
> them.
> 
> So between a very limited selection of hardware actually providing
> the
> app tag space and a clunky interface for filesystems, the app tag
> just
> never really took off. We ended up modifying it to be an access
> control
> instead, see the app tag control mode page in SCSI.
> 
> Databases and many filesystems have means to protect blocks or
> extents.
> And these means are often better at identifying the nature of read-
> time
> problems than a CRC over each 512-byte LBA would be. So what made PI
> interesting was the ability to catch problems at write time in case
> of a
> bad partition remap, wrong buffer pointer, misordered blocks, etc.
> Once
> the data is on media, the drive ECC is superior. And again, at read
> time
> the database or application is often better equipped to identify
> corruption than PI.
> 
> And consequently our interest focused on treating PI something more
> akin
> to a network checksum than a facility to protect data at rest on
> media.
> 



* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-04-02 10:45     ` Dongyang Li
@ 2024-04-02 11:37       ` Hannes Reinecke
  2024-04-02 16:52       ` Kanchan Joshi
  1 sibling, 0 replies; 16+ messages in thread
From: Hannes Reinecke @ 2024-04-02 11:37 UTC (permalink / raw
  To: Dongyang Li, joshi.k@samsung.com, martin.petersen@oracle.com
  Cc: hch@lst.de, linux-fsdevel@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	lsf-pc@lists.linux-foundation.org, josef@toxicpanda.com,
	kbusch@kernel.org, linux-block@vger.kernel.org

On 4/2/24 12:45, Dongyang Li wrote:
> Martin, Kanchan,
>>
>> Kanchan,
>>
>>> - Generic user interface that user-space can use to exchange meta.
>>> A new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible
>>> for direct IO.
>>
>> Yep. I'm interested in this too. Reviving this effort is near the top
>> of my todo list so I'm happy to collaborate.
> If we are going to have a interface to exchange meta/integrity to user-
> space, we could also have a interface in kernel to do the same?
> 
> It would be useful for some network filesystem/block device drivers
> like nbd/drbd/NVMe-oF to use blk-integrity as network checksum, and the
> same checksum covers the I/O on the server as well.
> 
> The integrity can be generated on the client and send over network,
> on server blk-integrity can just offload to storage.
> Verify follows the same principle: on server blk-integrity gets
> the PI from storage using the interface, and send over network,
> on client we can do the usual verify.
> 
> In the past we tried to achieve this, there's patch to add optional
> generate/verify functions and they take priority over the ones from the
> integrity profile, and the optional generate/verify functions does the
> meta/PI exchange, but that didn't get traction. It would be much better
> if we can have an bio interface for this.
> 
Not sure if I understand.
Key point of PI is that there _is_ hardware interaction on the disk 
side, and that you can store/offload PI to the hardware.
That PI data can be transferred via the transport up to the application,
and the application can validate it.
I do see the case for nbd (in the sense that nbd should be enabled to
hand down PI information if it receives it). NVMe-oF is trying to use
PI (which is what this topic is about).
But drbd?
What do you want to achieve? Sure drbd should be PI enabled, but I can't 
really see how it would forward PI information; essentially drbd is a
network-based RAID1, so what should happen with the PI information?
Should drbd try to combine PI information from both legs?
Is the PI information from both legs required to be the same?
Incidentally, the same question would apply to 'normal' RAID1.
In the end, I'm tempted to declare PI to be terminated at that
level to treat everything the same.
But I'd be open to discussion here.

Cheers,

Hannes



* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-04-02 10:45     ` Dongyang Li
  2024-04-02 11:37       ` Hannes Reinecke
@ 2024-04-02 16:52       ` Kanchan Joshi
  2024-04-03 12:40         ` Dongyang Li
  1 sibling, 1 reply; 16+ messages in thread
From: Kanchan Joshi @ 2024-04-02 16:52 UTC (permalink / raw
  To: Dongyang Li, martin.petersen@oracle.com
  Cc: hch@lst.de, linux-fsdevel@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	lsf-pc@lists.linux-foundation.org, josef@toxicpanda.com,
	kbusch@kernel.org, linux-block@vger.kernel.org

On 4/2/2024 4:15 PM, Dongyang Li wrote:
> Martin, Kanchan,
>>
>> Kanchan,
>>
>>> - Generic user interface that user-space can use to exchange meta.
>>> A
>>> new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
>>> direct IO.
>>
>> Yep. I'm interested in this too. Reviving this effort is near the top
>> of
>> my todo list so I'm happy to collaborate.
> If we are going to have a interface to exchange meta/integrity to user-
> space, we could also have a interface in kernel to do the same?

Not sure if I follow.
Currently when blk-integrity allocates/attaches the meta buffer, it 
decides what to put in it and how to go about integrity 
generation/verification.
When user-space sends the meta buffer, it decides what to
put in it and what to verify. The passed meta buffer will be used directly,
and blk-integrity will only facilitate that, without doing any in-kernel
generation/verification.

> It would be useful for some network filesystem/block device drivers
> like nbd/drbd/NVMe-oF to use blk-integrity as network checksum, and the
> same checksum covers the I/O on the server as well.
> 
> The integrity can be generated on the client and send over network,
> on server blk-integrity can just offload to storage.
> Verify follows the same principle: on server blk-integrity gets
> the PI from storage using the interface, and send over network,
> on client we can do the usual verify.
> 
> In the past we tried to achieve this, there's patch to add optional
> generate/verify functions and they take priority over the ones from the
> integrity profile, and the optional generate/verify functions does the
> meta/PI exchange, but that didn't get traction. It would be much better
> if we can have an bio interface for this.

Any link to the patches?
I am not sure what this bio interface is for. Does this mean
verify/generate functions would be specified for each bio?
Even now, in-kernel users can add a meta buffer to the bio; it is up to
the bio owner to implement any custom processing on that buffer.


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-03-29 11:35         ` Kanchan Joshi
@ 2024-04-03  2:10           ` Martin K. Petersen
  0 siblings, 0 replies; 16+ messages in thread
From: Martin K. Petersen @ 2024-04-03  2:10 UTC (permalink / raw
  To: Kanchan Joshi
  Cc: Martin K. Petersen, axboe@kernel.dk, josef,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	Linux FS Devel, kbusch@kernel.org, lsf-pc, Christoph Hellwig


Kanchan,

> Just to be clear, I was thinking of three new flags that userspace can
> pass: RWF_CHK_GUARD, RWF_CHK_APPTAG, RWF_CHK_REFTAG. And corresponding
> bip flags will need to be introduced (I don't see anything existing).
> Driver will see those and convert to protocol specific flags. Does
> this match with you what you have in mind.

See bip_flags in bio.h. We currently can't pick which tag to check or
not check because RDPROTECT/WRPROTECT in SCSI are a bit of a mess.
However, we do have separate flags for disabling checking at the HBA and disk
level. That distinction doesn't really apply for NVMe but we do need it
for SCSI.
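
For reference, the existing flags (abridged from enum bip_flags in
include/linux/bio.h; the two NOCHECK flags are the HBA/disk distinction
mentioned above):

enum bip_flags {
        BIP_BLOCK_INTEGRITY     = 1 << 0, /* block layer owns the integrity data */
        BIP_MAPPED_INTEGRITY    = 1 << 1, /* ref tag has been remapped */
        BIP_CTRL_NOCHECK        = 1 << 2, /* disable HBA integrity checking */
        BIP_DISK_NOCHECK        = 1 << 3, /* disable disk integrity checking */
        BIP_IP_CHECKSUM         = 1 << 4, /* IP checksum */
};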

>>> Right. This can work for the case when host does not need to pass
>>> the buffer (meta-size is equal to pi-size). But when meta-size is
>>> greater than pi-size, the meta-buffer needs to be allocated. Some
>>> changes are required so that Block-integrity does that allocation,
>>> without having to do read_verify/write_generate.

The block layer should not allocate or mess with the bip when the
metadata originates in userland.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-04-02 16:52       ` Kanchan Joshi
@ 2024-04-03 12:40         ` Dongyang Li
  2024-04-03 12:42           ` hch
  0 siblings, 1 reply; 16+ messages in thread
From: Dongyang Li @ 2024-04-03 12:40 UTC (permalink / raw
  To: joshi.k@samsung.com, martin.petersen@oracle.com
  Cc: hch@lst.de, linux-fsdevel@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	josef@toxicpanda.com, lsf-pc@lists.linux-foundation.org,
	kbusch@kernel.org, linux-block@vger.kernel.org

On Tue, 2024-04-02 at 22:22 +0530, Kanchan Joshi wrote:
> On 4/2/2024 4:15 PM, Dongyang Li wrote:
> > Martin, Kanchan,
> > > 
> > > Kanchan,
> > > 
> > > > - Generic user interface that user-space can use to exchange
> > > > meta.
> > > > A
> > > > new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible
> > > > for
> > > > direct IO.
> > > 
> > > Yep. I'm interested in this too. Reviving this effort is near the
> > > top
> > > of
> > > my todo list so I'm happy to collaborate.
> > If we are going to have a interface to exchange meta/integrity to
> > user-
> > space, we could also have a interface in kernel to do the same?
> 
> Not sure if I follow.
> Currently when blk-integrity allocates/attaches the meta buffer, it 
> decides what to put in it and how to go about integrity 
> generation/verification.
> When user-space is sending the meta buffer, it will decide what to 
> put/verify. Passed meta buffer will be used directly, and blk-
> integrity 
> will only facilitate that without doing any in-kernel 
> generation/verification.
This is what I was trying to get at, but for in-kernel users instead of
user-space. However...
> 
> > It would be useful for some network filesystem/block device drivers
> > like nbd/drbd/NVMe-oF to use blk-integrity as network checksum, and
> > the
> > same checksum covers the I/O on the server as well.
> > 
> > The integrity can be generated on the client and send over network,
> > on server blk-integrity can just offload to storage.
> > Verify follows the same principle: on server blk-integrity gets
> > the PI from storage using the interface, and send over network,
> > on client we can do the usual verify.
> > 
> > In the past we tried to achieve this, there's patch to add optional
> > generate/verify functions and they take priority over the ones from
> > the
> > integrity profile, and the optional generate/verify functions does
> > the
> > meta/PI exchange, but that didn't get traction. It would be much
> > better
> > if we can have an bio interface for this.
> 
> Any link to the patches?
> I am not sure what this bio interface is for. Does this mean 
> verify/generate functions to be specified for each bio?
Yes, and it's an awkward way to save the meta/PI before the PI buffer
gets freed right after verify.
> Now also in-kernel users can add the meta buffer to the bio. It is up
> to 
> the bio owner to implement any custom processing on this meta buffer.
This makes me realise that if an in-kernel user does its own meta/PI buffer
management without bio_integrity_prep(), the buffer won't be freed by
bio_integrity_free(), and we can get/put the meta/PI buffer and reuse
the PI data. I will give this a try. Thanks




* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-04-03 12:40         ` Dongyang Li
@ 2024-04-03 12:42           ` hch
  2024-04-04  9:53             ` Dongyang Li
  0 siblings, 1 reply; 16+ messages in thread
From: hch @ 2024-04-03 12:42 UTC (permalink / raw
  To: Dongyang Li
  Cc: joshi.k@samsung.com, martin.petersen@oracle.com, hch@lst.de,
	linux-fsdevel@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, josef@toxicpanda.com,
	lsf-pc@lists.linux-foundation.org, kbusch@kernel.org,
	linux-block@vger.kernel.org

In-kernel use is easy; we can do that as soon as the first in-kernel
user comes along.  Which one do you have in mind?



* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-04-03 12:42           ` hch
@ 2024-04-04  9:53             ` Dongyang Li
  0 siblings, 0 replies; 16+ messages in thread
From: Dongyang Li @ 2024-04-04  9:53 UTC (permalink / raw
  To: hch@lst.de
  Cc: linux-fsdevel@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, josef@toxicpanda.com,
	lsf-pc@lists.linux-foundation.org, joshi.k@samsung.com,
	martin.petersen@oracle.com, kbusch@kernel.org,
	linux-block@vger.kernel.org

On Wed, 2024-04-03 at 14:42 +0200, hch@lst.de wrote:
> In kernel use is easy, we can do that as soon as the first in-kernel
> user comes along.  Which one do you have in mind?
> 
We do have dm-crypt, the nvme target and target_core_iblock using
bio_integrity_alloc() and bio_integrity_add_page() to set up the PI buffer,
but I'm not sure what value this would bring; it looks like
bio_integrity_alloc/add_page() are just working fine for them.
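
The pattern those users follow looks roughly like this (a sketch modelled on
the nvmet/target code with the current in-kernel helpers; pi_buf is the
caller's own, kmalloc'ed PI buffer, and error paths are trimmed):

struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
struct bio_integrity_payload *bip;
unsigned int pi_len = bio_integrity_bytes(bi, bio_sectors(bio));

bip = bio_integrity_alloc(bio, GFP_NOIO, 1);
if (IS_ERR(bip))
        return PTR_ERR(bip);

/* the seed is in protection-interval units, not 512-byte sectors */
bip_set_seed(bip, bio->bi_iter.bi_sector >> (bi->interval_exp - SECTOR_SHIFT));

/* caller-managed buffer: block-integrity will not allocate or free it */
if (bio_integrity_add_page(bio, virt_to_page(pi_buf), pi_len,
                           offset_in_page(pi_buf)) != pi_len)
        return -ENOMEM;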

Cheers
Dongyang


* Re: [Lsf-pc] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] Meta/Integrity/PI improvements
  2024-02-26 23:15   ` [Lsf-pc] " Martin K. Petersen
  2024-03-27 13:45     ` Kanchan Joshi
  2024-04-02 10:45     ` Dongyang Li
@ 2024-04-05  6:12     ` Kent Overstreet
  2 siblings, 0 replies; 16+ messages in thread
From: Kent Overstreet @ 2024-04-05  6:12 UTC (permalink / raw
  To: Martin K. Petersen
  Cc: Kanchan Joshi, lsf-pc, linux-block@vger.kernel.org,
	Linux FS Devel, linux-nvme@lists.infradead.org, kbusch@kernel.org,
	axboe@kernel.dk, josef, Christoph Hellwig

On Mon, Feb 26, 2024 at 06:15:19PM -0500, Martin K. Petersen wrote:
> 
> Kanchan,
> 
> > - Generic user interface that user-space can use to exchange meta. A
> > new io_uring opcode IORING_OP_READ/WRITE_META - seems feasible for
> > direct IO.
> 
> Yep. I'm interested in this too. Reviving this effort is near the top of
> my todo list so I'm happy to collaborate.
> 
> > NVMe SSD can do the offload when the host sends the PRACT bit. But in
> > the driver, this is tied to global integrity disablement using
> > CONFIG_BLK_DEV_INTEGRITY.
> 
> > So, the idea is to introduce a bio flag REQ_INTEGRITY_OFFLOAD
> > that the filesystem can send. The block-integrity and NVMe driver do
> > the rest to make the offload work.
> 
> Whether to have a block device do this is currently controlled by the
> /sys/block/foo/integrity/{read_verify,write_generate} knobs. At least
> for SCSI, protected transfers are always enabled between HBA and target
> if both support it. If no integrity has been attached to an I/O by the
> application/filesystem, the block layer will do so controlled by the
> sysfs knobs above. IOW, if the hardware is capable, protected transfers
> should always be enabled, at least from the block layer down.
> 
> It's possible that things don't work quite that way with NVMe since, at
> least for PCIe, the drive is both initiator and target. And NVMe also
> missed quite a few DIX details in its PI implementation. It's been a
> while since I messed with PI on NVMe, I'll have a look.
> 
> But in any case the intent for the Linux code was for protected
> transfers to be enabled automatically when possible. If the block layer
> protection is explicitly disabled, a filesystem can still trigger
> protected transfers via the bip flags. So that capability should
> definitely be exposed via io_uring.

I've little interest in checksum calculation offload - but protected
transfers are interesting.

bcachefs moves data around in the background (copygc, rebalance), and
whenever we move existing data we're careful to carry around the
existing checksum and revalidate it at every step, and when we have to
compute a new checksum (fragmenting an existing extent) we compute new
checksums and check that they sum up to the old checksum.

It'd be pretty cool to push this down into the storage device (and up
into the page cache as well).
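
One way to picture the "sum up to the old checksum" check is checksum
combining. A toy userspace illustration with zlib's crc32()/crc32_combine()
(not the checksums bcachefs actually uses, purely to show the idea):

#include <assert.h>
#include <string.h>
#include <zlib.h>

/* build with: cc demo.c -lz */
int main(void)
{
        unsigned char extent[8192];
        memset(extent, 0xab, sizeof(extent));

        /* checksum stored for the whole extent before it gets fragmented */
        uLong whole = crc32(crc32(0L, Z_NULL, 0), extent, sizeof(extent));

        /* new checksums for the two fragments created by a split at 4096 */
        uLong a = crc32(crc32(0L, Z_NULL, 0), extent, 4096);
        uLong b = crc32(crc32(0L, Z_NULL, 0), extent + 4096, 4096);

        /* the combined fragment checksums must match the old one */
        assert(crc32_combine(a, b, 4096) == whole);
        return 0;
}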

