Git Mailing List Archive mirror
* Building with PGO: concurrency and test data
@ 2024-04-21  0:52 intelfx
  2024-04-21 15:45 ` Mike Castle
  2024-04-23 22:42 ` Jeff King
  0 siblings, 2 replies; 3+ messages in thread
From: intelfx @ 2024-04-21  0:52 UTC (permalink / raw)
  To: git

Hi!

I'm trying to build Git with PGO (for a private distribution) and I
have two questions about the specifics of the profiling process.


1. The INSTALL doc says that the profiling pass has to run the test
suite using a single CPU, and the Makefile `profile` target also
encodes this rule:

> As a caveat: a profile-optimized build takes a *lot* longer since the
> git tree must be built twice, and in order for the profiling
> measurements to work properly, ccache must be disabled and the test
> suite has to be run using only a single CPU. <...>
( https://github.com/git/git/blob/master/INSTALL#L54-L59 )

> profile:: profile-clean
> 	$(MAKE) PROFILE=GEN all
> 	$(MAKE) PROFILE=GEN -j1 test
> 	@if test -n "$$GIT_PERF_REPO" || test -d .git; then \
> 		$(MAKE) PROFILE=GEN -j1 perf; \
( https://github.com/git/git/blob/master/Makefile#L2350-L2352 )

However, some cursory searching tells me that gcc is equipped to handle
concurrent runs of an instrumented program:

> > It is unclear to me if one can safely run multiple processes
> > concurrently. Is there any risk of corruption or overwriting of the
> > various "gcda" files if different processes attempt to write on them?
>
> The gcda files are accessed by proper locks, so you should be sa[f]e.
( https://gcc-help.gcc.gnu.narkive.com/0NItmccw/is-it-safe-to-generate-profiles-from-multiple-concurrent-processes#post1 )

As far as I understand, the profiling data collected does not include
timing information or any performance counters. What am I missing? Why
is it not possible to run the test suite with parallelism on the
profiling pass?


2. The performance test suite (t/perf/) uses up to two git repositories
("normal" and "large") as test data to run git commands against. Does
the internal organization of these repositories matter? I.e., does it
matter if those are "real-world-used" repositories with overlapping
packs, cruft, loose objects, many refs etc., or can I simply use fresh
clones of git.git and linux.git without loss of profile quality?
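
For reference, this is how I am pointing the perf suite at those
repositories (the two variables are documented in t/perf/README and
both default to the git.git checkout itself; the paths below are
placeholders):

```shell
# GIT_PERF_REPO selects the "normal" repository, GIT_PERF_LARGE_REPO
# the "large" one; t/perf copies them before running each p*.sh script.
export GIT_PERF_REPO=/path/to/git.git
export GIT_PERF_LARGE_REPO=/path/to/linux.git
make PROFILE=GEN -j1 perf
```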

Thanks,

-- 
Ivan Shapovalov / intelfx /

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Building with PGO: concurrency and test data
  2024-04-21  0:52 Building with PGO: concurrency and test data intelfx
@ 2024-04-21 15:45 ` Mike Castle
  2024-04-23 22:42 ` Jeff King
  1 sibling, 0 replies; 3+ messages in thread
From: Mike Castle @ 2024-04-21 15:45 UTC (permalink / raw)
  To: intelfx; +Cc: git

On Sat, Apr 20, 2024 at 5:53 PM <intelfx@intelfx.name> wrote:
> I'm trying to build Git with PGO (for a private distribution) and I
> have two questions about the specifics of the profiling process.

Generally speaking, there does not need to be a lot of execution to
generate good profiles.

Execute the happy paths and collect data from those.  (Which implies
that unit tests are usually a poor source of profile data: they
deliberately exercise edge and error cases as much as the happy path.)

Many folks use performance tests to generate profiles because they are
already written, but often, they are overkill.  Depending on what is
going on in the real world, more resources are spent on collecting
data than would be saved by the resulting optimizations.

I'd say, don't worry about it, and just go with what is already provided.

For tools like git, each run is short enough that improvements are
unlikely to be noticed in day-to-day use, and each run is likely to be
I/O bound anyway.  Most perceived performance issues (in general, not
just in git) are more likely to be addressed by algorithmic
improvements than by feedback-directed optimization.

Now, for busy long-running servers, this can make a bigger difference,
particularly for computationally expensive operations like
authentication.  But again, I/O is likely to dominate.

mrc


* Re: Building with PGO: concurrency and test data
  2024-04-21  0:52 Building with PGO: concurrency and test data intelfx
  2024-04-21 15:45 ` Mike Castle
@ 2024-04-23 22:42 ` Jeff King
  1 sibling, 0 replies; 3+ messages in thread
From: Jeff King @ 2024-04-23 22:42 UTC (permalink / raw)
  To: intelfx; +Cc: Theodore Ts'o, git

On Sun, Apr 21, 2024 at 02:52:48AM +0200, intelfx@intelfx.name wrote:

> 1. The INSTALL doc says that the profiling pass has to run the test
> suite using a single CPU, and the Makefile `profile` target also
> encodes this rule:
> 
> > As a caveat: a profile-optimized build takes a *lot* longer since the
> > git tree must be built twice, and in order for the profiling
> > measurements to work properly, ccache must be disabled and the test
> > suite has to be run using only a single CPU. <...>
> ( https://github.com/git/git/blob/master/INSTALL#L54-L59 )
> [...]
> However, some cursory searching tells me that gcc is equipped to handle
> concurrent runs of an instrumented program:

That text was added quite a while ago, in f2d713fc3e (Fix build problems
related to profile-directed optimization, 2012-02-06). It may be that it
was a problem back then, but isn't anymore.

+cc the author of that commit; I don't know offhand how many people
use "make profile" (now or back then).

> 2. The performance test suite (t/perf/) uses up to two git repositories
> ("normal" and "large") as test data to run git commands against. Does
> the internal organization of these repositories matter? I.e., does it
> matter if those are "real-world-used" repositories with overlapping
> packs, cruft, loose objects, many refs etc., or can I simply use fresh
> clones of git.git and linux.git without loss of profile quality?

I'd be surprised if the choice of repository didn't have some impact.
After all, if there are no loose objects, then the routines that
interact with them are not going to get a lot of exercise. But how much
does it actually matter in practice? I think you'd have to do a bunch of
trial and error measurements to find out.
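
Such an experiment might look something like this (my sketch; the ./run
script and GIT_PERF_REPO are real t/perf machinery, but the paths are
placeholders):

```shell
# Run the same perf script against a fresh clone and a "lived-in"
# clone, then compare the timings and the resulting profile data.
cd t/perf
GIT_PERF_REPO=/path/to/fresh-clone ./run p0001-rev-list.sh
GIT_PERF_REPO=/path/to/lived-in-clone ./run p0001-rev-list.sh
```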

My gut is that "larger is better" to emphasize the hot loops, but even
that might not be true. The main reason we want "large" repos in some
perf scripts is that it makes it easier to measure the thing we are
speeding up versus the overhead of starting processes, etc. But PGO
might not be as sensitive to that, if it can get what it needs from a
smaller number of runs of the sensitive spots.

All of which is to say "no idea". I know that's not very satisfying, but
I don't recall anybody really discussing PGO much here in the last
decade, so I think you're largely on your own.

-Peff

