* [PATCH v3 00/10] Introduce Signature feature
@ 2013-11-07 15:53 Sagi Grimberg
       [not found] ` <1383839648-19000-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2013-11-07 15:53 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
	nab-PEzghdH756F8UrSeD/g0lQ, oren-VPRAkNaXOzVWk0Htik3J/w,
	tzahio-VPRAkNaXOzVWk0Htik3J/w

This patchset introduces verbs-level support for the signature handover
feature. Signature is intended to implement end-to-end data integrity
on a transactional basis in a completely offloaded manner.

There are several end-to-end data integrity methods used today in various
applications and/or upper layer protocols such as T10-DIF defined by SCSI
specifications (SBC), CRC32, XOR8 and more. This patchset adds verbs
support only for T10-DIF. The proposed framework allows adding more
signature methods in the future.

In T10-DIF, when a series of 512-byte data blocks is transferred, each
block is followed by an 8-byte guard (note that protection intervals
other than 512 bytes may be used). The guard consists of a CRC that
protects the integrity of the data in the block, a tag that protects
against misdirected IOs, and a free tag for application use.
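
To make the layout concrete, here is a minimal user-space sketch of one
protection tuple and its CRC. It assumes the standard T10-DIF split of
the 8 bytes into a 2-byte guard tag (CRC16, polynomial 0x8BB7), a 2-byte
application tag and a 4-byte reference tag. The names dif_tuple,
crc16_t10dif and dif_generate are illustrative only and are not taken
from this patchset.

```c
#include <stddef.h>
#include <stdint.h>

/* One 8-byte T10-DIF protection tuple (held in host byte order here). */
struct dif_tuple {
	uint16_t guard_tag; /* CRC16 of the data block */
	uint16_t app_tag;   /* free for application use */
	uint32_t ref_tag;   /* detects misdirected IO, e.g. low 32 bits of the LBA */
};

/* Bitwise CRC16: polynomial 0x8BB7, initial value 0, no bit reflection. */
static uint16_t crc16_t10dif(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(buf[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}

/* Hypothetical helper: build the tuple protecting one data block. */
static struct dif_tuple dif_generate(const uint8_t *block, size_t interval,
				     uint16_t app_tag, uint32_t ref_tag)
{
	struct dif_tuple t;

	t.guard_tag = crc16_t10dif(block, interval);
	t.app_tag = app_tag;
	t.ref_tag = ref_tag;
	return t;
}
```

Calling dif_generate(block, 512, app_tag, lba) once per block yields the
tuples that an offloading HCA would otherwise compute in hardware. A
table-driven CRC would be faster; the bitwise form is used only for
brevity.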

Data can be protected when transferred over the wire, but can also be
protected in the memory of the sender/receiver. This allows true end-
to-end protection against bits flipping either over the wire, through
gateways, in memory, over PCI, etc.

While T10-DIF clearly defines that over the wire the protection guards
are interleaved into the data stream (each 512-byte block followed by an
8-byte guard), in memory the protection guards may reside in a buffer
separate from the data. Depending on the application, it is usually
easier to handle the data when it is contiguous. In this case the data
buffer will be of size 512xN and the protection buffer will be of size
8xN (where N is the number of blocks in the transaction).
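
The two layouts can be illustrated with a small software sketch that
converts between the interleaved wire format and the separated memory
format. This is only a model of the data movement (the point of the
patchset is that the HCA does it); pi_deinterleave, pi_interleave and
PI_SIZE are made-up names for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PI_SIZE 8 /* bytes of protection information per block */

/*
 * Split nblocks interleaved (data, guard) pairs into a contiguous data
 * buffer of interval * nblocks bytes and a protection buffer of
 * PI_SIZE * nblocks bytes.
 */
static void pi_deinterleave(const uint8_t *wire, size_t nblocks,
			    size_t interval, uint8_t *data, uint8_t *prot)
{
	for (size_t i = 0; i < nblocks; i++) {
		const uint8_t *src = wire + i * (interval + PI_SIZE);

		memcpy(data + i * interval, src, interval);
		memcpy(prot + i * PI_SIZE, src + interval, PI_SIZE);
	}
}

/* Inverse operation: merge separate buffers into the wire layout. */
static void pi_interleave(const uint8_t *data, const uint8_t *prot,
			  size_t nblocks, size_t interval, uint8_t *wire)
{
	for (size_t i = 0; i < nblocks; i++) {
		uint8_t *dst = wire + i * (interval + PI_SIZE);

		memcpy(dst, data + i * interval, interval);
		memcpy(dst + interval, prot + i * PI_SIZE, PI_SIZE);
	}
}
```

For N blocks of 512 bytes the wire buffer is 520xN bytes, matching the
512xN data buffer plus the 8xN protection buffer described above.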

There are 3 kinds of signature handover operation:
1. Take unprotected data (from wire or memory) and ADD protection
   guards.
2. Take protected data (from wire or memory), validate the data
   integrity against the protection guards and STRIP the protection
   guards.
3. Take protected data (from wire or memory), validate the data
   integrity against the protection guards and PASS the data with
   the guards as-is.

This translates to defining to the HCA how/if data protection exists
in the memory domain, and how/if data protection exists in the wire
domain.
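
As a rough illustration of how the two domain descriptions imply the
operation, the sketch below derives ADD/STRIP/PASS from whether the
source and destination domains carry protection. Which of memory and
wire is the source depends on the transfer direction. The enum and
function names are hypothetical and are not the identifiers used by the
patches.

```c
#include <stdbool.h>

/* Hypothetical labels for the three handover operations. */
enum sig_op {
	SIG_OP_NONE,  /* neither domain is protected: plain transfer */
	SIG_OP_ADD,   /* unprotected source, protected destination */
	SIG_OP_STRIP, /* protected source, unprotected destination */
	SIG_OP_PASS   /* both protected: validate and keep the guards */
};

/*
 * Pick the handover operation from the protection state of the source
 * and destination domains.
 */
static enum sig_op sig_op_select(bool src_protected, bool dst_protected)
{
	if (!src_protected && dst_protected)
		return SIG_OP_ADD;
	if (src_protected && !dst_protected)
		return SIG_OP_STRIP;
	if (src_protected && dst_protected)
		return SIG_OP_PASS;
	return SIG_OP_NONE;
}
```

For example, sending from unprotected memory onto a protected wire gives
sig_op_select(false, true), i.e. ADD, while receiving protected wire
data into unprotected memory gives STRIP.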

The way that data integrity is performed is by using a new kind of
memory region: signature-enabled MR, and a new kind of work request:
REG_SIG_MR. The REG_SIG_MR WR operates on the signature-enabled MR,
and defines all the needed information for the signature handover
(data buffer, protection buffer if needed and signature attributes).
The result is an MR that can be used for data transfer as usual,
that will also add/validate/strip/pass protection guards.

Successful completion of the data transfer does not mean that there
were no integrity errors. The user must afterwards check the signature
status of the handover operation using a new lightweight verb.
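
The separation between "transfer completed" and "signature status" can
be modelled in software as below. This is a user-space sketch only, not
the verb added by this patchset: sig_status and sig_check are invented
names, and the guard tag is assumed to be a big-endian CRC16 (polynomial
0x8BB7) stored in the first two bytes of each 8-byte tuple.

```c
#include <stddef.h>
#include <stdint.h>

#define PI_SIZE 8 /* bytes of protection information per block */

/* Bitwise CRC16: polynomial 0x8BB7, initial value 0, no bit reflection. */
static uint16_t crc16_t10dif(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(buf[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}

/* Hypothetical result of a post-transfer signature check. */
struct sig_status {
	int sig_err;      /* non-zero if a guard mismatch was found */
	size_t bad_block; /* index of the first failing block */
};

/* Recompute each block's CRC and compare it with the stored guard tag. */
static struct sig_status sig_check(const uint8_t *data, const uint8_t *prot,
				   size_t nblocks, size_t interval)
{
	struct sig_status st = { 0, 0 };

	for (size_t i = 0; i < nblocks; i++) {
		const uint8_t *pi = prot + i * PI_SIZE;
		uint16_t stored = (uint16_t)((pi[0] << 8) | pi[1]);
		uint16_t actual = crc16_t10dif(data + i * interval, interval);

		if (stored != actual) {
			st.sig_err = 1;
			st.bad_block = i;
			break;
		}
	}
	return st;
}
```

A caller would treat the transfer completion and the value returned by
sig_check() as two independent results: the first says the bytes
arrived, the second says whether they matched their guards.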

This feature shall be used in storage upper layer protocols iSER/SRP
implementing end-to-end data integrity T10-DIF. Following this patchset,
we will soon submit krping patches which will demonstrate the usage of
these signature verbs.

Patchset summary:
- Introduce verbs for create/destroy memory regions supporting signature.
- Introduce IB core signature verbs API.
- Implement mr create/destroy verbs in mlx5 driver.
- Preparation patches for signature support in mlx5 driver.
- Implement signature handover work request in mlx5 driver.
- Implement signature error collection and handling in mlx5 driver.

Changes from v2 (mostly CR comments):
- IB/core: Added comment on IB_T10DIF_CRC/CSUM declarations.
- IB/core: Renamed block_size as pi_interval in ib_sig_attrs.
- IB/core: Took t10_dif domain out of sig union (ib_sig_domain).
- IB/mlx5: Fixed memory leak in create_mr
- IB/mlx5: Remove redundant assignment in WQE initialization.
- IB/mlx5: Fixed possible NULL dereference in check_sig_status
           and set_sig_wr.
- IB/mlx5: Added helper function to convert mkey to base key.
- IB/mlx5: Reduced fencing in compound REG_SIG_MR WR.
- Resolved checkpatch warnings.

Changes from v1:
- IB/core: Reduced sizeof ib_send_wr by using wr->sg_list for data
           and dedicated ib_sge for protection guards buffer.
           Currently sig_handover extension does not increase sizeof ib_send_wr.
- IB/core: Change enum to int for container variables.
- IB/mlx5: Validate wr->num_sge=1 for REG_SIG_MR work request.

Changes from v0:
- Commit messages: Added more detailed explanation for signature work request.
- IB/core: Remove indirect memory registration enablement from create_mr.
           Keep only signature enablement.
- IB/mlx5: Changed signature error processing via MR radix lookup.

Sagi Grimberg (10):
  IB/core: Introduce protected memory regions
  IB/core: Introduce Signature Verbs API
  IB/mlx5, mlx5_core: Support for create_mr and destroy_mr
  IB/mlx5: Initialize mlx5_ib_qp signature related
  IB/mlx5: Break wqe handling to begin & finish routines
  IB/mlx5: remove MTT access mode from umr flags helper function
  IB/mlx5: Keep mlx5 MRs in a radix tree under device
  IB/mlx5: Support IB_WR_REG_SIG_MR
  IB/mlx5: Collect signature error completion
  IB/mlx5: Publish support in signature feature

 drivers/infiniband/core/verbs.c                |   47 ++
 drivers/infiniband/hw/mlx5/cq.c                |   54 +++
 drivers/infiniband/hw/mlx5/main.c              |   12 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h           |   14 +
 drivers/infiniband/hw/mlx5/mr.c                |  142 +++++++
 drivers/infiniband/hw/mlx5/qp.c                |  539 ++++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c |    1 +
 drivers/net/ethernet/mellanox/mlx5/core/mr.c   |   85 ++++
 include/linux/mlx5/cq.h                        |    1 +
 include/linux/mlx5/device.h                    |   47 ++
 include/linux/mlx5/driver.h                    |   41 ++
 include/linux/mlx5/qp.h                        |   67 +++
 include/rdma/ib_verbs.h                        |  170 ++++++++-
 13 files changed, 1177 insertions(+), 43 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found] ` <1383839648-19000-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-11-07 21:12   ` Or Gerlitz
  0 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-11-07 21:12 UTC
  To: Roland Dreier
  Cc: linux-rdma, martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
	Nicholas Bellinger, oren-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
	Tzahi Oved, Bart Van Assche, Mike Christie, Sagi Grimberg

On Thu, Nov 7, 2013 at 5:53 PM, Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> This patchset introduces verbs-level support for the signature handover
> feature. Signature is intended to implement end-to-end data integrity
> on a transactional basis in a completely offloaded manner.

Hi Roland,

The patch series has been around for a couple of weeks already and went
through the review of Sean and Bart, with all their feedback being
applied. Also, Sagi and Co enhanced krping to fully cover (and test...)
the proposed API and driver implementation @
git://beany.openfabrics.org/~sgrimberg/krping.git

As this is a kernel-only API, if something comes up in the next phase, we
can fix it and need not maintain compatibility, e.g. to user space.

The patches provide the infrastructure for implementing accelerated
support for T10 DIF in the upstream SCSI target LIO RDMA (e.g. iser)
driver and the upstream iser initiator driver, as well as other
open-source/commercial iser target codes.

We don't want to make it a three/four-way (e.g.
LIO/iSCSI/iser_i/iser_t/verbs/driver) merge. As such, I think it's fair
to ask for inclusion in 3.13 so that the development of the upper layers
can run for 3.14 and later kernels.

Or.



* RE: [PATCH v3 00/10] Introduce Signature feature
@ 2013-11-14  0:19 Hefty, Sean
       [not found] ` <1828884A29C6694DAF28B7E6B8A8237388D031C4-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Hefty, Sean @ 2013-11-14  0:19 UTC
  To: Or Gerlitz, Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, oren-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
	Tzahi Oved, Bart Van Assche, Mike Christie, Sagi Grimberg

> The patch series is around for couple of weeks already and went
> through the review of Sean and Bart, with all their feedback being
> applied. Also Sagi and Co enhanced krping to fully cover (and test...)
> the proposed API and driver implementation @
> git://beany.openfabrics.org/~sgrimberg/krping.git

Somewhat separate from this specific patch, this is my concern.

There are continual requests to modify the kernel verbs interfaces.  These requests boil down to exposing proprietary capabilities to the latest version of some vendor's hardware.  In turn, these hardware specific knobs bleed into the kernel clients.

At the very least, it seems that there should be some sort of discussion if this is a desirable property of the kernel verbs interface, and if this is the architecture that the kernel should continue to pursue.  Or, is there an alternative way of providing the same ability of coding ULPs to specific HW features, versus plugging every new feature into 'post send'?

- Sean

* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found] ` <1828884A29C6694DAF28B7E6B8A8237388D031C4-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2013-11-14  7:30   ` Or Gerlitz
       [not found]     ` <52847BF8.60401-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-11-14 21:39   ` Tzahi Oved
  1 sibling, 1 reply; 10+ messages in thread
From: Or Gerlitz @ 2013-11-14  7:30 UTC
  To: Hefty, Sean, Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, Oren Duer, Tzahi Oved, Bart Van Assche,
	Mike Christie, Sagi Grimberg, Roland Dreier

On 14/11/2013 02:19, Hefty, Sean wrote:
>> The patch series is around for couple of weeks already and went through the review of Sean and Bart, with all their feedback being applied. Also Sagi and Co enhanced krping to fully cover (and test...) the proposed API and driver implementation
> Somewhat separate from this specific patch, this is my concern.
>
> There are continual requests to modify the kernel verbs interfaces.  These requests boil down to exposing proprietary capabilities to the latest version of some vendor's hardware.  In turn, these hardware specific knobs bleed into the kernel clients.
>
> At the very least, it seems that there should be some sort of discussion if this is a desirable property of the kernel verbs interface, and if this is the architecture that the kernel should continue to pursue.  Or, is there an alternative way of providing the same ability of coding ULPs to specific HW features, versus plugging every new feature into 'post send'?

Sean,

Being concrete + reiterating and expanding what I wrote you earlier on
the V1 thread @ http://marc.info/?l=linux-rdma&m=138314853203389&w=2
when you said

Sean > Maybe we should rethink the approach of exposing low-level
Sean > hardware constructs to every distinct feature of every vendor's
Sean > latest hardware directly to the kernel ULPs.

To begin with, T10 DIF **is** an industry standard, which is to be used
in production storage systems. The feature here is T10 DIF acceleration
for upstream kernel storage drivers such as iSER/SRP/FCoE
initiators/targets that use RDMA and are included in commercial
distributions which are used by customers. Note that this or a similar
feature is supported by some FC cards too, so we want RDMA to be
competitive.

This work is part of larger efforts which are being done nowadays in
other parts of the kernel, such as the block layer, the upstream kernel
target and more, to support T10; it's "just" the RDMA part.

Sagi and team made a great effort to expose an API which isn't tied to a
specific HW/firmware API. In that respect, the verbs API is coupled
with industry standards and by no means with specific HW features. Just
as a quick example, the specific driver/card (mlx5 / ConnectIB) for which
the new verbs are implemented uses three objects for its T10 support,
named BSF, KLM and PSV. You can be sure, and please check us, that
there is no sign of them in the verbs API; they only live within the
mlx5 driver.

If you see a vendor-specific feature/construct that appears in the
proposed verbs API changes, let us know.

 > [...] versus plugging every new feature into 'post send'?

It's a new feature indeed, but it's a feature which comes into play when
submitting RDMA work requests to the HCA and, for performance reasons,
must be subject to pipelining in the form of batched posting. Hence it
is a very good fit as a sub-operation of post-send.

Sean > There are continual requests to modify the kernel verbs
Sean > interfaces. These requests boil down to exposing proprietary
Sean > capabilities to the latest version of some vendor's hardware. In
Sean > turn, these hardware specific knobs bleed into the kernel clients.

non-T10 examples (please) ?!

Or.

* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found]     ` <52847BF8.60401-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-11-14  9:21       ` Sagi Grimberg
  2013-11-14 19:03       ` Hefty, Sean
  1 sibling, 0 replies; 10+ messages in thread
From: Sagi Grimberg @ 2013-11-14  9:21 UTC
  To: Or Gerlitz, Hefty, Sean, Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, Oren Duer, Tzahi Oved, Bart Van Assche,
	Mike Christie, Roland Dreier


Hey Sean,

Just to add to Or's input,
I really don't agree that this is some specific HW capability exposed to
ULPs. This feature allows offloading data-integrity handling over RDMA,
which is a wider concept than just T10-DIF (although we currently expose
T10-DIF alone).
The signature verbs API does not introduce anything specific to Mellanox;
we think the API is generic enough to allow each vendor to support
signature with some degree of freedom.
A vendor just needs to implement the three steps: create a
signature-enabled MR, bind the MR to signature attributes (work request)
and check the signature status at the end of the transaction.

Regarding plugging into post_send, the signature operation is a
fast-path operation and I agree with Or regarding the value of batching
work requests.
Moreover, I think this is a separate discussion. If we agree on another
API for posting on the send queue, it will also require work to migrate
the fastreg and bind_mw extensions.
So how about going with the current framework, and starting a discussion
on your concern, "taking non-SEND WR extensions out of post_send"?

Sagi.

* RE: [PATCH v3 00/10] Introduce Signature feature
       [not found]     ` <52847BF8.60401-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-11-14  9:21       ` Sagi Grimberg
@ 2013-11-14 19:03       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A8237388D0463F-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Hefty, Sean @ 2013-11-14 19:03 UTC
  To: Or Gerlitz, Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, Oren Duer, Tzahi Oved, Bart Van Assche,
	Mike Christie, Sagi Grimberg, Roland Dreier

> To begin with T10 DIF **is** industry standard, which is to be used in
> production storage systems, the feature here is T10 DIF acceleration for
> upstream kernel storage drivers such as iSER/SRP/FCoE initiator/targets
> that use RDMA and are included in commercial distributions which are
> used by customers. Note that this/similar feature is supported by some
> FC cards too, so we want RDMA to be competitive.

I wasn't talking about whether T10 DIF is a standard.  I was talking about *how* it is exposed through verbs.  That 'how' is what's vendor specific.  The same is true of flow steering, SRQ is standard but IB specific, the same with XRC - which was vendor specific first then standardized, the IP CSUM send flag is vendor specific, 'fast' registration is vendor specific...

I'm not suggesting that these features shouldn't exist, I'm just questioning if the goal of verbs should simply be to expose every hardware knob that a ULP can fiddle.  Maybe the answer is yes, but let's at least ask the question.

As a random, made up on the spot thought, what if IPoIB were architected so that there was a device specific component to it, instead of it pretending that some vendor feature exposed through verbs was a generic RDMA capability?  IPoIB acceleration features could still be used.

> This work is part of larger efforts which are done nowadays in other
> parts of the kernel such as the block layer, the upstream kernel target
> and more to support T10, its "just" the RDMA part.

How does the RDMA part tie into any of the other work being done in other parts of the kernel?

* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found]         ` <1828884A29C6694DAF28B7E6B8A8237388D0463F-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2013-11-14 20:11           ` Or Gerlitz
       [not found]             ` <CAJZOPZLpy8a-uUFhT+bvzd51T7jyV0Osdpz-Muc-1q0du7td3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Or Gerlitz @ 2013-11-14 20:11 UTC
  To: Hefty, Sean
  Cc: Or Gerlitz, Roland Dreier, linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, Oren Duer, Tzahi Oved, Bart Van Assche,
	Mike Christie, Sagi Grimberg, Roland Dreier

On Thu, Nov 14, 2013 at 9:03 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> To begin with T10 DIF **is** industry standard, which is to be used in
>> production storage systems, the feature here is T10 DIF acceleration for
>> upstream kernel storage drivers such as iSER/SRP/FCoE initiator/targets
>> that use RDMA and are included in commercial distributions which are
>> used by customers. Note that this/similar feature is supported by some
>> FC cards too, so we want RDMA to be competitive.

> I wasn't talking about whether T10 DIF is a standard.  I was talking about *how* it is exposed through verbs.  That 'how' is what's vendor specific.  The same is true of flow steering, SRQ is standard but IB specific, the same with XRC - which was vendor specific first then standardized, the IP CSUM send flag is vendor specific, 'fast' registration is vendor specific...

Again, for T10 we claim that nothing in how it's exposed through the
proposed verbs is specific to a vendor (e.g. Mellanox) or to a specific
hardware brand (e.g. ConnectIB) of a vendor, and if you think
otherwise, we'd love to hear that.

As for the other examples (and yes, it makes sense to chase them one by
one here even if we end up a little lengthy):

SRQ is for all IB cards, and IB/IBoE are central enough in the RDMA
stack that even if (say) 5-10 features (and there are fewer) are
IB/IBoE specific, it makes sense to expose them through the verbs API.
We have device capabilities for that end, so the ULP/application never
asks "hey, what vendor/card is that?" but rather "is the driver
supporting feature X?".

As you said, XRC was standardized through the IBTA, so there is nothing
to worry about there.

I don't see why you consider IP CSUM to be vendor specific. To compete
with any Ethernet card from the last 10 years, if an IB HW vendor
wants to support TCP/IP networking a la IPoIB, they pretty much need to
implement TCP/IP stateless offloads, namely TX/RX checksum, LSO and
RSS. The same goes for drivers that support a RAW_PACKET Ethernet QP for
user space applications that offload TCP/IP.

Fast registration is anything but vendor specific, being part of both
the iWARP and IB specs (see the BMME - Base Memory Management Extensions
section in the IBTA spec).


> I'm not suggesting that these features shouldn't exist, I'm just questioning if the goal of verbs should simply be to expose every hardware knob that a ULP can fiddle.  Maybe the answer is yes, but let's at least ask the question.

Indeed, some special HW knobs might not fit well into verbs, and in
that case the vendor might need to give them up or support them only
on proprietary stacks or whatever solution we can think of, but not T10...

> As a random, made up on the spot thought, what if IPoIB were architected so that there was a device specific component to it, instead of it pretending that some vendor feature exposed through verbs was a generic RDMA capability?  IPoIB acceleration features could still be used.

Could be. IPoIB has been there for almost ten years (since 2.6.12), and if we
don't manage to get the driver to provide 56/100/200 Gb/s in the
next 1-2 years, this could be a nice out-of-the-box direction to take.

But I don't think it applies to the T10 case, at least where it stands
now. It's a new feature whose related verbs are not tied to any
vendor, so let's get it in, and if after some time we see
performance drawbacks that originate in the embedding into verbs, we
can do the possible breakdown you suggest.


>> This work is part of larger efforts which are done nowadays in other
>> parts of the kernel such as the block layer, the upstream kernel target
>> and more to support T10, its "just" the RDMA part.

> How does the RDMA part tie into any of the other work being done in other parts of the kernel?

T10 touches a few areas in the storage stack; T10 && RDMA is what you get
when you look at SAN (Storage Area Network) drivers. E.g. in the same manner
that an FC driver needs to play well with back-end storage drivers when
the FC code submits the pages received from the network, which
contain data and signature blocks, into the block layer, RDMA has to do
that too. Since this area is under development and things need to play
well together, there is interaction between the RDMA T10 developer, the
upstream SCSI target maintainer, and folks who are involved with the
block layer and other storage drivers.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found] ` <1828884A29C6694DAF28B7E6B8A8237388D031C4-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2013-11-14  7:30   ` Or Gerlitz
@ 2013-11-14 21:39   ` Tzahi Oved
  1 sibling, 0 replies; 10+ messages in thread
From: Tzahi Oved @ 2013-11-14 21:39 UTC (permalink / raw
  To: Hefty, Sean, Or Gerlitz, Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, oren-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
	Tzahi Oved, Bart Van Assche, Mike Christie, Sagi Grimberg

On 14/11/2013 02:19, Hefty, Sean wrote:
>> The patch series is around for couple of weeks already and went
>> through the review of Sean and Bart, with all their feedback being
>> applied. Also Sagi and Co enhanced krping to fully cover (and test...)
>> the proposed API and driver implementation @
>> git://beany.openfabrics.org/~sgrimberg/krping.git
> Somewhat separate from this specific patch, this is my concern.
>
> There are continual requests to modify the kernel verbs interfaces.  These requests boil down to exposing proprietary capabilities to the latest version of some vendor's hardware.  In turn, these hardware specific knobs bleed into the kernel clients.
Disagree. The verbs changes proposed in the signature case were
submitted as an RFC for a few weeks so that all vendors could comment and ask
for changes. Isn't that what open source development is all about? We
can't stop progress; we open new functionality and features for
everyone to comment on and agree on a mutual interface. Is there another way
you had in mind for defining a non-vendor-specific API? We will be glad to
collaborate, but please come with the alternate process you think is best.
Such comments only hold back new functionality from being accepted and
hold back Verbs API progress.
> At the very least, it seems that there should be some sort of discussion if this is a desirable property of the kernel verbs interface, and if this is the architecture that the kernel should continue to pursue.  Or, is there an alternative way of providing the same ability of coding ULPs to specific HW features, versus plugging every new feature into 'post send'?
Current Verbs semantics define post send as the operation aggregator
that enables posting a list of WQEs in a single call, so users can serialize
multiple operation requests and post them in a single API call. Since signature
is mainly an enhancement of an existing RDMA operation, it seems to fit
best there. Defining more specific APIs per application type (storage,
cloud, HPC, ...) is indeed important and is in the process of being defined
as part of the Open Framework working group you are co-chairing. Thus, it
doesn't make sense to break the post send verbs semantics in this case.

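The aggregation semantics described above can be sketched roughly as
follows. This is a simplified standalone model, not the kernel's post send
implementation: demo_send_wr, demo_post_send, and the opcode names are
hypothetical, and the return convention (the count of accepted requests) is
an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical opcodes; DEMO_WR_INVALID marks a request the model rejects. */
enum demo_wr_opcode {
	DEMO_WR_SIGNATURE_SETUP,
	DEMO_WR_RDMA_WRITE,
	DEMO_WR_SEND,
	DEMO_WR_INVALID,
};

/* One work request; 'next' chains requests into a single ordered list. */
struct demo_send_wr {
	struct demo_send_wr *next;
	enum demo_wr_opcode opcode;
};

/*
 * Walk the chain in order and "post" each request. Returns the number of
 * requests accepted. On the first rejected request, stops and stores it in
 * *bad_wr so the caller knows where the chain broke.
 */
static int demo_post_send(struct demo_send_wr *wr, struct demo_send_wr **bad_wr)
{
	int posted = 0;

	*bad_wr = NULL;
	for (; wr != NULL; wr = wr->next) {
		if (wr->opcode >= DEMO_WR_INVALID) {
			*bad_wr = wr;
			break;
		}
		posted++;
	}
	return posted;
}
```

In this model a signature setup request is just one more link in the chain
ahead of the data transfer, so the caller still makes a single call.
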
Tzahi
>
> - Sean


* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found]             ` <CAJZOPZLpy8a-uUFhT+bvzd51T7jyV0Osdpz-Muc-1q0du7td3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-12-11 17:26               ` Or Gerlitz
       [not found]                 ` <CAJZOPZLctOXBJN596gr5sCWQ8F=QQPJ_9VpoZmMhJVc2XwS9qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Or Gerlitz @ 2013-12-11 17:26 UTC (permalink / raw
  To: Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, Oren Duer, Tzahi Oved, Bart Van Assche,
	Mike Christie, Sagi Grimberg, Roland Dreier, Hefty, Sean

On Thu, Nov 14, 2013 at 10:11 PM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Thu, Nov 14, 2013 at 9:03 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>>> To begin with T10 DIF **is** industry standard, which is to be used in
>>> production storage systems, the feature here is T10 DIF acceleration for
>>> upstream kernel storage drivers such as iSER/SRP/FCoE initiator/targets
>>> that use RDMA and are included in commercial distributions which are
>>> used by customers. Note that this/similar feature is supported by some
>>> FC cards too, so we want RDMA to be competitive.
>
>> I wasn't talking about whether T10 DIF is a standard.  I was talking about *how* it is exposed through verbs.  That 'how' is what's vendor specific.  The same is true of flow steering, SRQ is standard but IB specific, the same with XRC - which was vendor specific first then standardized, the IP CSUM send flag is vendor specific, 'fast' registration is vendor specific...
>
> Again, for T10 we claim that nothing in how it's exposed through the
> proposed verbs is specific to a vendor (e.g. Mellanox) or to a specific
> hardware brand (e.g. ConnectIB) of a vendor, and if you think
> otherwise, we'd love to hear that.

Hi Roland, these patches have been in the air for two months (V0 posted Oct
15th http://marc.info/?l=linux-rdma&m=138185152212490&w=2)
and not a word from you. Sean raised some concerns, to which we
replied in detail, and we'd like to get your say as maintainer. 3.14 is
coming closer; if something needs to be changed/fixed we need to know that
asap so there's enough time to do that for 3.14. If it helps,
this is the V3 thread with the latest discussion:
http://marc.info/?t=138383972100009&r=1&w=2 . We have some minor fixes
which can be posted in the form of V4, but I don't see the point in doing that
before we get any feedback from you on the current debate.

Or.



* Re: [PATCH v3 00/10] Introduce Signature feature
       [not found]                 ` <CAJZOPZLctOXBJN596gr5sCWQ8F=QQPJ_9VpoZmMhJVc2XwS9qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-12-15 13:59                   ` Or Gerlitz
  0 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-12-15 13:59 UTC (permalink / raw
  To: Roland Dreier
  Cc: linux-rdma,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	Nicholas Bellinger, Oren Duer, Tzahi Oved, Bart Van Assche,
	Mike Christie, Sagi Grimberg, Roland Dreier, Hefty, Sean

On 11/12/2013 19:26, Or Gerlitz wrote:
> Hi Roland, these patches have been in the air for two months (V0 posted Oct
> 15th http://marc.info/?l=linux-rdma&m=138185152212490&w=2) and not a
> word from you.

Sagi will be posting V4 soon with the latest cut of the code, rebased to
3.13-rc and containing some minor fixes and changes that came up
during deeper testing.

Or.



