Git Mailing List Archive mirror
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "René Scharfe" <l.s.r@web.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Git List" <git@vger.kernel.org>
Subject: Re: [PATCH v2] bisect--helper: plug strvec leak
Date: Tue, 11 Oct 2022 09:20:18 -0400
Message-ID: <Y0VtkmNwjKcXcemP@coredump.intra.peff.net>
In-Reply-To: <xmqq35buykz1.fsf@gitster.g>

On Mon, Oct 10, 2022 at 10:42:42PM -0700, Junio C Hamano wrote:

> >> -			struct strvec argv = STRVEC_INIT;
> >> +			const char *argv[] = { "checkout", start_head.buf,
> >> +					       "--", NULL };
> >> 
> >> -			strvec_pushl(&argv, "checkout", start_head.buf,
> >> -				     "--", NULL);
> >> -			if (run_command_v_opt(argv.v, RUN_GIT_CMD)) {
> >> +			if (run_command_v_opt(argv, RUN_GIT_CMD)) {
> >
> > This is OK with me, but note that one thing we lose is compiler
> > protection that we remembered the trailing NULL pointer in the argv
> > array (whereas strvec_pushl() has an attribute that makes sure the last
> > argument is a NULL).
> 
> The first parameter to run_command_v_opt() must be a NULL terminated
> array of strings.  argv.v[] after strvec_push*() is such a NULL
> terminated array, and is suitable to be passed to the function.
> 
> That much human programmers would know.
> 
> But does the compiler know that run_command_v_opt() requires a NULL
> terminated array of strings, and does it know to check that argv.v[]
> came from strvec_pushl() without any annotation in the first place?

No, but I don't think that's the interesting part. If you're using
strvec, it does the right thing, and it's hard to get it wrong. I'm more
concerned about places where we manually write a list of strings, and
it's easy to forget the trailing NULL.

In the existing code, that's done in the interface of strvec_pushl(),
which will remind you if you write:

  strvec_pushl(&argv, "checkout", start_head.buf, "--");

But after the patch, the list is written in an initializer, which has no
clue about the expected semantics. We only have to get strvec's
invariants right once, but every ad-hoc command argv has to remember the
trailing NULL on its own.
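
To make the difference concrete, here is a minimal sketch. It assumes a
GCC-compatible compiler; count_args() and count_argv() are hypothetical
stand-ins for strvec_pushl() and run_command_v_opt(), not git's actual
code. The variadic function carries the sentinel attribute, so the
compiler can check its call sites, while the array version has nothing
for the compiler to check:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for strvec_pushl(): walks the variadic list up
 * to the terminating NULL. The sentinel attribute asks the compiler to
 * verify that each call ends with an explicit NULL.
 */
#if defined(__GNUC__)
__attribute__((sentinel))
#endif
static size_t count_args(const char *first, ...)
{
	va_list ap;
	size_t n = 0;
	const char *arg;

	va_start(ap, first);
	for (arg = first; arg; arg = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}

/*
 * Hypothetical stand-in for run_command_v_opt(): trusts the caller to
 * pass a NULL-terminated array. Nothing here is visible to the compiler
 * as a requirement, so a missing NULL in the initializer goes unnoticed
 * and the loop reads past the end of the array.
 */
static size_t count_argv(const char **argv)
{
	size_t n = 0;

	assert(argv);
	while (argv[n])
		n++;
	return n;
}
```

Dropping the final NULL from a count_args() call should draw a "missing
sentinel" warning from GCC (with -Wformat, which -Wall includes) or from
Clang. Dropping it from an array initializer passed to count_argv()
compiles without complaint.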

> For such a check to happen, I think we need to tell the compiler
> with some annotation that the first parameter to run_command_v_opt()
> is supposed to be a NULL terminated char *[] array.

Right, but I would not expect the compiler to realize that strvec
maintains the ends-in-NULL invariant. It would have to be quite a clever
compiler.

In theory it could realize that argv is declared as an array locally,
and could make sure it ends in NULL as a compile-time check.

So it would have to be: "check this statically if you can, but otherwise
assume it's OK" kind of warning. But it's all kind of moot since I don't
think any such annotation exists. :)
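
Lacking such an annotation, the closest approximation I can think of is
a check done by hand at the call site, where the array's size is still
known. The sketch below is only an illustration: ARGV_ENDS_IN_NULL(),
CHECKED_ARGV() and argv_len() are made-up names, not anything in git,
and the check runs at run time rather than compile time, since reading
an element of a local array is not a constant expression in C:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if the last element of a real array (not a
 * pointer) is NULL. sizeof only gives the element count while the
 * array type is still visible, i.e. in the declaring scope.
 */
#define ARGV_ENDS_IN_NULL(a) \
	((a)[sizeof(a) / sizeof((a)[0]) - 1] == NULL)

/*
 * Hypothetical wrapper: assert the invariant, then yield the array
 * (which decays to a pointer) for passing on to the callee.
 */
#define CHECKED_ARGV(a) (assert(ARGV_ENDS_IN_NULL(a)), (a))

/* Hypothetical stand-in for a callee that needs a NULL-terminated argv. */
static size_t argv_len(const char **argv)
{
	size_t n = 0;

	while (argv[n])
		n++;
	return n;
}
```

A caller would write argv_len(CHECKED_ARGV(argv)). It only helps if the
human remembers to use the wrapper, which is the same problem moved one
step over.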

Possibly a linter like sparse could complain about declaring a variable
called argv that doesn't end in NULL. I don't think it's worth anybody
spending too much time on it, though. This hasn't historically been a
big source of bugs.

> > Probably not that big a deal in practice. It would be nice if there was
> > a way to annotate this for the compiler, but I don't think there's any
> > attribute for "this pointer-to-pointer parameter should have a NULL at
> > the end".
> 
> But the code before this patch is safe only for strvec_pushl() call,
> not run_command_v_opt() call, so we are not losing anything, I would
> think.

The bug I'm worried about is a human writing the list of strings and
forgetting the NULL, so there we are losing the (admittedly minor)
protection.

-Peff
