Lustre-devel archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] Lustre interval tree problems
Date: Thu, 21 Feb 2019 13:25:42 +1100	[thread overview]
Message-ID: <87y369c1y1.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <alpine.LFD.2.21.1902201738170.6267@casper.infradead.org>

On Wed, Feb 20 2019, James Simmons wrote:

> With the port of the interval tree work to the OpenSFS branch several 
> problems have been pointed out in the implementation which impacts
> performance and the memory footprint in some cases.

Porting to OpenSFS will have to address one issue that I could skip
over.

hsm_update_work() in mtd_hsm_cdt_requests.c deliberately doesn't want
duplicates - it is the only place in all of lustre which really doesn't
want duplicates, and it is server side so I don't have to address it.

I wold probably change the code to do a lookup first, then insert only
if the lookup failed.

>
> The biggest issues is the ability to add duplicate locks. The reason
> the original implementation avoided adding duplicate locks was to
> avoid the extra compares being done with the same type of locks.
> Especially in the case of the range lock where you can end up with
> many duplicate read locks [0-EOF]. Comparing with 1000s of identical
> read locks will have an impact. Their is teh question of the memory
> foot print in this case as well. I believe this is the case with the
> ldlm locks as well since the ability to skip large lock amounts if
> not matching has been lost.

The interval tree stores entries in an rb-tree, which is a
nearly-balanced binary tree.  This provides O(ln n) lookup.
If there are 1000s of identical read locks, then a lookup will have 10s
of extra compares.  Even with 1,000,000 locks, that would be at most 40
compares.
The code simplification could well make up for some of these extra
compares by reducing i-cache pressure, though that would be hard to
measure.

Is there any empirical evidence of a slow-down?  Does 10s of compares
really seem like a big cost?

I cannot see the memory foot print issue that you mention.
All locks are added to the data structure even if they aren't in the
interval tree directly.  There is a linked-list of locks that all have
the same start/end, so keeping duplicates doesn't mean that more memory
is used, it just means that it is linked together differently.

So it isn't yet clear to me what the problem is here.
I'd be have to have it explained to me in more detail.

Thanks,
NeilBrown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20190221/454cd5af/attachment.sig>

      parent reply	other threads:[~2019-02-21  2:25 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-20 17:59 [lustre-devel] Lustre interval tree problems James Simmons
2019-02-20 22:37 ` NeilBrown
2019-02-20 22:54   ` Patrick Farrell
2019-02-21  0:33     ` NeilBrown
2019-02-21  1:21       ` NeilBrown
2019-02-21  1:00 ` NeilBrown
2019-02-21  2:25 ` NeilBrown [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y369c1y1.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=lustre-devel@lists.lustre.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).