Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: "René Scharfe" <l.s.r@web.de>
Cc: Junio C Hamano <gitster@pobox.com>, Git List <git@vger.kernel.org>
Subject: Re: free and errno, was Re: [PATCH] apply: replace mksnpath() with a mkpathdup() call
Date: Tue, 23 Apr 2024 21:11:37 -0400	[thread overview]
Message-ID: <20240424011137.GA1180766@coredump.intra.peff.net> (raw)
In-Reply-To: <10b7fff3-7a2c-4555-9210-8000aae43583@web.de>

On Sun, Apr 14, 2024 at 05:17:23PM +0200, René Scharfe wrote:

> > In other words, it really depends on the contract of foo() with respect
> > to errno. And I don't think there is a way in C to specify that
> > contract in away that the compiler can understand.
> 
> The context attribute of sparse could be used, I guess.  We'd have to
> duplicate the declaration of all library functions, though, to tag them
> with a may_have_changed_errno attribute.  And we'd need to clear it on
> success, so we'd have to modify all callers.  On the flip side it would
> allow detecting consecutive errno changes, e.g. by two I/O functions
> with no error checking in between.  But the implementation effort seems
> way too high.
> 
> It would be easier if the error information was in one place instead of
> one bit in the return value (NULL or less than 0) and the rest in errno.

Yeah. In a world where every failing function returned the equivalent of
-errno, you could just do:

  ret = xread(fd, ...); /* actually returns -errno! */
  if (ret < 0) {
	close(fd);
	return ret; /* open's errno preserved via ret */
  }

We could live in that world if we wrapped all of the syscalls with
xread(), etc, that behaved that way. I don't know if it would run into
weird gotchas, though, or if the result would simply look foreign to
most POSIX/C programmers.

I also wonder if it might end up having the same 1% overhead that the
free() checks did, as it involves more juggling of stack values.

> >> An xfree() to go along with xmalloc()/xrealloc()/xcalloc()/xstrdup() would
> >> fit in nicely and might be easier to remember than free() after a while.
> >> Having to convert thousands of calls is unappealing, though.
> >
> > My biggest concern with it is that we'd have to remember to use it, and
> > there's not good compiler enforcement. But I guess coccinelle can help
> > us there.q
> 
> Yes, converting all free(3) calls is easy with Coccinelle, and the same
> semantic patch can be used to enforce the use of xfree().  Still the
> initial diff would be huge (2000+ changed lines in 300+ files).

True. It's one more weird thing to remember, but for the most part we
are there already with xmalloc().

> > My secondary concern is that it might make people think that xmalloc()
> > and xfree() are always paired, and thus you can do clever things in one
> > as long as the other matches it. But we sometimes free memory from
> > system routines like getline(). Maybe a scary comment would be enough?
> 
> Amazing foresight!  Currently we only use getline(3) (which can act like
> realloc(3)) in contrib/credential/, though.  Any others?

Heh, I actually meant to say getdelim(), which obviously is closely
related. There's only one call to it, but as it underlies
strbuf_getwholeline(), many strbufs will use it.

And as you might have guessed, my "amazing foresight" came from being
bitten by the same thing already. ;) Annoyed at how long git took to run
a large workload under massif, I once wrote a hacky peak-heap profiler
that wraps the system malloc(). Naturally it needs to match frees to
their original allocations so we know how big each one was. Imagine my
surprise when I saw many frees without a matching allocation.

> The "x" prefix doesn't promise exclusivity (hermetic seal?), but it
> certainly suggests there are some shared feature between the allocator
> functions and xfree(), while they only have in common that the are
> wrapped.  We could call the latter errno_preserving_free() instead or so.

That's a mouthful, for sure. I think it is OK to suggest they are
related as long as there is a comment in xfree() mentioning that we
might see pointers that didn't come from our x*() functions. That's
where somebody would have to add code that violates the assumption.

> I'm more worried about the overhead.  For a 0-1% slowdown (in the case of
> git log) git_free() or xfree() give us accurate error numbers on all
> platforms.  Non-misleading error codes are worth a lot (seen my share of
> cursed messages), but xfree() is only necessary in the error handling
> code.  The happy path can call free(3) directly without harm.
> 
> But how to classify call sites accordingly?  It's easy for programmers
> once they know it's necessary.  Is there a way in C, macro or Coccinelle
> to have our cake and only eat it if really needed?  I don't see any. :-|

I don't see a way either. I'd be willing to consider the 0-1% slowdown
if it buys us simplicity, though. I'm also not even sure it's real.
There seemed to be a fairly large variation in results, and switching my
powersaving functions around got me cases where the new code was
actually faster. If we think there's a real benefit, we should make sure
the costs aren't just illusory.

-Peff

  reply	other threads:[~2024-04-24  1:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-04 21:08 [PATCH] apply: replace mksnpath() with a mkpathdup() call René Scharfe
2024-04-04 21:29 ` Junio C Hamano
2024-04-04 22:53 ` free and errno, was " Jeff King
2024-04-04 23:08   ` Junio C Hamano
2024-04-05 10:52   ` René Scharfe
2024-04-05 17:35     ` Jeff King
2024-04-05 17:41       ` Jeff King
2024-04-06 17:45       ` René Scharfe
2024-04-07  1:18         ` Jeff King
2024-04-14 15:17           ` René Scharfe
2024-04-24  1:11             ` Jeff King [this message]
2024-04-05 10:53 ` [PATCH v2 1/2] apply: avoid fixed-size buffer in create_one_file() René Scharfe
2024-04-05 10:56   ` [PATCH v2 2/2] path: remove mksnpath() René Scharfe
2024-04-05 17:37     ` Jeff King
2024-04-05 16:51   ` [PATCH v2 1/2] apply: avoid fixed-size buffer in create_one_file() Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240424011137.GA1180766@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).