From: "René Scharfe" <l.s.r@web.de>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: "Carlo Arenas" <carenas@gmail.com>,
	git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: improve performance of PCRE2 bug 2642 bug workaround
Date: Tue, 22 Mar 2022 21:26:10 +0100
Message-ID: <99b0adb6-26ba-293c-3a8f-679f59e7cb4d@web.de>
In-Reply-To: <bd751d5c-2f8b-4c52-72ec-f2b7268a30a8@cs.ucla.edu>

Am 22.03.22 um 17:38 schrieb Paul Eggert:
> Today, Carlo Arenas pointed out[1] that GNU grep didn't work around
> PCRE2 bug 2642, which Git grep has a workaround for. While installing
> a GNU grep patch to fix this[2] I noticed that Git's workaround
> appears to be too pessimistic: on older PCRE2 libraries Git grep sets
> PCRE2_NO_START_OPTIMIZE even when PCRE2_CASELESS is not set.
>
> Attached is a patch to Git that I just now cobbled up and have not
> even compiled, much less tested. Please feel free to ignore it, as it
> would merely improve performance on older, buggy PCRE2 libraries and
> that might not be worth your trouble. I'm sending this email as more
> of a thank-you for letting us know indirectly of the PCRE2 bug.
>
> [1]: https://lists.gnu.org/r/grep-devel/2022-03/msg00004.html
> [2]: https://lists.gnu.org/r/grep-devel/2022-03/msg00005.html

Interesting.  So you say bug 2642 [3] requires the flag PCRE2_CASELESS
(i.e. --ignore-case) to be triggered.  (That's probably documented in
Bugzilla, but I'm not authorized to access it.)

However, the looser check also works around another bug, if only by
accident.  I believe that one was fixed upstream by [4].  It was discussed
in the thread Carlo linked to, which started at [5].  You should be able
to reproduce it with something like this (search for leading whitespace
in a Unicode haystack):

  $ echo ' Halló' | grep -P '^\s'

An affected version of PCRE2 would loop forever.

That said, I can only test any of this with CI jobs, not locally, so
please take my findings with a heap of salt.

René


[3] https://bugs.exim.org/show_bug.cgi?id=2642
[4] https://github.com/PhilipHazel/pcre2/commit/e0c6029
[5] https://lore.kernel.org/git/20220129172542.GB2581@szeder.dev/

