From: Jeff King <peff@peff.net>
To: "René Scharfe" <l.s.r@web.de>
Cc: Junio C Hamano <gitster@pobox.com>, Git List <git@vger.kernel.org>
Subject: Re: free and errno, was Re: [PATCH] apply: replace mksnpath() with a mkpathdup() call
Date: Tue, 23 Apr 2024 21:11:37 -0400 [thread overview]
Message-ID: <20240424011137.GA1180766@coredump.intra.peff.net> (raw)
In-Reply-To: <10b7fff3-7a2c-4555-9210-8000aae43583@web.de>
On Sun, Apr 14, 2024 at 05:17:23PM +0200, René Scharfe wrote:
> > In other words, it really depends on the contract of foo() with respect
> > to errno. And I don't think there is a way in C to specify that
> > contract in away that the compiler can understand.
>
> The context attribute of sparse could be used, I guess. We'd have to
> duplicate the declaration of all library functions, though, to tag them
> with a may_have_changed_errno attribute. And we'd need to clear it on
> success, so we'd have to modify all callers. On the flip side it would
> allow detecting consecutive errno changes, e.g. by two I/O functions
> with no error checking in between. But the implementation effort seems
> way too high.
>
> It would be easier if the error information was in one place instead of
> one bit in the return value (NULL or less than 0) and the rest in errno.
Yeah. In a world where every failing function returned the equivalent of
-errno, you could just do:
ret = xread(fd, ...); /* actually returns -errno! */
if (ret < 0) {
close(fd);
return ret; /* open's errno preserved via ret */
}
We could live in that world if we wrapped all of the syscalls with
xread(), etc, that behaved that way. I don't know if it would run into
weird gotchas, though, or if the result would simply look foreign to
most POSIX/C programmers.
I also wonder if it might end up having the same 1% overhead that the
free() checks did, as it involves more juggling of stack values.
> >> An xfree() to go along with xmalloc()/xrealloc()/xcalloc()/xstrdup() would
> >> fit in nicely and might be easier to remember than free() after a while.
> >> Having to convert thousands of calls is unappealing, though.
> >
> > My biggest concern with it is that we'd have to remember to use it, and
> > there's not good compiler enforcement. But I guess coccinelle can help
> > us there.q
>
> Yes, converting all free(3) calls is easy with Coccinelle, and the same
> semantic patch can be used to enforce the use of xfree(). Still the
> initial diff would be huge (2000+ changed lines in 300+ files).
True. It's one more weird thing to remember, but for the most part we
are there already with xmalloc().
> > My secondary concern is that it might make people think that xmalloc()
> > and xfree() are always paired, and thus you can do clever things in one
> > as long as the other matches it. But we sometimes free memory from
> > system routines like getline(). Maybe a scary comment would be enough?
>
> Amazing foresight! Currently we only use getline(3) (which can act like
> realloc(3)) in contrib/credential/, though. Any others?
Heh, I actually meant to say getdelim(), which obviously is closely
related. There's only one call to it, but as it underlies
strbuf_getwholeline(), many strbufs will use it.
And as you might have guessed, my "amazing foresight" came from being
bitten by the same thing already. ;) Annoyed at how long git took to run
a large workload under massif, I once wrote a hacky peak-heap profiler
that wraps the system malloc(). Naturally it needs to match frees to
their original allocations so we know how big each one was. Imagine my
surprise when I saw many frees without a matching allocation.
> The "x" prefix doesn't promise exclusivity (hermetic seal?), but it
> certainly suggests there are some shared feature between the allocator
> functions and xfree(), while they only have in common that the are
> wrapped. We could call the latter errno_preserving_free() instead or so.
That's a mouthful, for sure. I think it is OK to suggest they are
related as long as there is a comment in xfree() mentioning that we
might see pointers that didn't come from our x*() functions. That's
where somebody would have to add code that violates the assumption.
> I'm more worried about the overhead. For a 0-1% slowdown (in the case of
> git log) git_free() or xfree() give us accurate error numbers on all
> platforms. Non-misleading error codes are worth a lot (seen my share of
> cursed messages), but xfree() is only necessary in the error handling
> code. The happy path can call free(3) directly without harm.
>
> But how to classify call sites accordingly? It's easy for programmers
> once they know it's necessary. Is there a way in C, macro or Coccinelle
> to have our cake and only eat it if really needed? I don't see any. :-|
I don't see a way either. I'd be willing to consider the 0-1% slowdown
if it buys us simplicity, though. I'm also not even sure it's real.
There seemed to be a fairly large variation in results, and switching my
powersaving functions around got me cases where the new code was
actually faster. If we think there's a real benefit, we should make sure
the costs aren't just illusory.
-Peff
next prev parent reply other threads:[~2024-04-24 1:11 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-04 21:08 [PATCH] apply: replace mksnpath() with a mkpathdup() call René Scharfe
2024-04-04 21:29 ` Junio C Hamano
2024-04-04 22:53 ` free and errno, was " Jeff King
2024-04-04 23:08 ` Junio C Hamano
2024-04-05 10:52 ` René Scharfe
2024-04-05 17:35 ` Jeff King
2024-04-05 17:41 ` Jeff King
2024-04-06 17:45 ` René Scharfe
2024-04-07 1:18 ` Jeff King
2024-04-14 15:17 ` René Scharfe
2024-04-24 1:11 ` Jeff King [this message]
2024-04-05 10:53 ` [PATCH v2 1/2] apply: avoid fixed-size buffer in create_one_file() René Scharfe
2024-04-05 10:56 ` [PATCH v2 2/2] path: remove mksnpath() René Scharfe
2024-04-05 17:37 ` Jeff King
2024-04-05 16:51 ` [PATCH v2 1/2] apply: avoid fixed-size buffer in create_one_file() Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240424011137.GA1180766@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=l.s.r@web.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).