Re: [PATCH] ci: avoid unnecessary builds

Git Mailing List Archive mirror
 help / color / mirror / Atom feed

From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee <derrickstolee@github.com>,
	Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH] ci: avoid unnecessary builds
Date: Tue, 8 Nov 2022 10:51:45 +0100 (CET)	[thread overview]
Message-ID: <2p770o19-s3o3-o6r9-18r5-25n91r1r2210@tzk.qr> (raw)
In-Reply-To: <xmqqk046cmmv.fsf@gitster.g>

Hi Junio,

On Mon, 7 Nov 2022, Junio C Hamano wrote:

> Derrick Stolee <derrickstolee@github.com> writes:
>
> > On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
> >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >>
> >> Whenever a branch is pushed to a repository which has GitHub Actions
> >> enabled, a bunch of new workflow runs are started.
> >>
> >> We sometimes see contributors push multiple branch updates in rapid
> >> succession, which in conjunction with the impressive time swallowed by
> >> even just a single CI build frequently leads to many queued-up runs.
> >>
> >> This is particularly problematic in the case of Pull Requests where a
> >> single contributor can easily (inadvertently) prevent timely builds for
> >> other contributors.
> >
> > As someone who is both the cause and the victim of this, I
> > thank you for finding a way to reduce wasted CPU time. This
> > patch looks good to me, though I'll need to trust the docs
> > and your testing to be sure it will work. We will definitely
> > see it in place as it merges into 'next' and 'main'.
>
> When I see breakages of 'seen' only at the CI, perhaps because it
> manifests only on macOS, I manually "bisected" by pushing various
> combinations of topics merged on top of 'master' and pushing the
> result out as 'seen' only to the GitHub repository, and not having
> to wait one to finish before pushing out another was a really nice
> feature.  Of course, I could wait before pushing another out, but
> after seeing the last one close to successful completion in a few
> minutes and being able to push out the next one was a great
> timesaver, not only for the "few minutes", but for the countless
> minutes because I will have to concentrate on more than just a "few
> minutes" on another task if I have to switch to another task in
> order to wait for just a "few minutes" before pushing the trial
> merge out.

I often find myself with similar problems where I have to test a couple of
revisions in order to pinpoint regressions that do not reproduce locally.

One of my power tools to figure these things out is
https://github.com/mxschmitt/action-tmate, allowing me to SSH into the
build agent where the failure happens (note: do use with prudence lest your
account gets flagged for potential mining).

I find that much more surgical a tool than having multiple builds run
concurrently, not the least reason being that keeping track of which
builds correspond to which step of the bisection can be very, very
confusing.

Also, one of my core values is to use up only the resources that I need
to. And using up a lot of build minutes is contrary to that value.

So what I end up doing in similar situations is:

- first of all, switch to a temporary branch

- then, add a TO-DROP commit that rips out _all_ of the unnecessary
  builds. Typically the first to go are check-whitespace and git-l10n, and
  then I liberally delete jobs (including the `ci-config` job that's
  totally useless in this use case) and restrict the test suite to running
  just the failing script by editing the `T = [...]` line in `t/Makefile`,
  often even adding a `--run=[... only the minimal amount of test
  cases...]` (after figuring out which ones are needed which is
  unfortunately quite hard due to the abundance of side effects our test
  suite relies on) to the `GIT_TEST_OPTS` in `ci/`.

- then, if I need to test multiple revisions, I _create_ new branches for
  each bisection point (typically encoding information in the branch name
  that will help me with book-keeping), cherry-picking the just-created
  TO-DROP commit before pushing.

Since I want to minimize my footprint when it comes to using resources, I
typically am very judicious about what revisions I test, and how many I
run concurrently.

It is a bit sad that doing this is currently very much involved and that
it is so much easier to just go ahead and run the entire test suite on
every available platform even if the test failures one wishes to diagnose
happen on but one platform in but one test script. I.e. in the current
shape, our code base encourages wasting of resources.

That is a situation I would like to see improved. If I was not committed
to other work streams, I would work on it because I find it that
important.

> So, that is the only concern I have with this change, but in
> general, not running jobs whose results are clearly not needed is a
> good idea.  It just is "clearly" is hard to determine automatically.

Right. It is also very subjective what "clearly" should mean in this
context. For example, I find it clearly undesirable just how long our test
suite takes (and not for any good reason, really) and how inflexible it is
when it comes to things like Test Impact Analysis (i.e. running only tests
covering the code modified in a given PR, which would speed up everything
_tremendously_). At the same time I am fully aware that I find myself
pretty alone in this assessment, otherwise other contributors would
clearly be more interested in fixing these issues than they are.

Ciao,
Dscho

     prev parent reply	other threads:[~2022-11-08  9:52 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-03 13:34 [PATCH] ci: avoid unnecessary builds Johannes Schindelin via GitGitGadget
2022-11-04  1:46 ` Ævar Arnfjörð Bjarmason
2022-11-04  2:23   ` Taylor Blau
2022-11-04  3:20     ` Jeff King
2022-11-08  9:16       ` Johannes Schindelin
2022-11-09 14:00         ` Jeff King
2022-11-10  2:40           ` Taylor Blau
2022-11-04  2:09 ` Taylor Blau
2022-11-07 19:45 ` Derrick Stolee
2022-11-07 19:53   ` Taylor Blau
2022-11-07 20:08     ` Derrick Stolee
2022-11-07 21:03       ` Ævar Arnfjörð Bjarmason
2022-11-07 21:59         ` Derrick Stolee
2022-11-07 22:44           ` Taylor Blau
2022-11-08  8:18             ` Johannes Schindelin
2022-11-08 18:30               ` Taylor Blau
2022-11-07 22:56           ` Ævar Arnfjörð Bjarmason
2022-11-08  0:02             ` Derrick Stolee
2022-11-08  0:31   ` Junio C Hamano
2022-11-08  9:51     ` Johannes Schindelin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2p770o19-s3o3-o6r9-18r5-25n91r1r2210@tzk.qr \
    --to=johannes.schindelin@gmx.de \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).