Git Mailing List Archive mirror
From: Tao Klerks <tao@klerks.biz>
To: Junio C Hamano <gitster@pobox.com>
Cc: Tao Klerks via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH] cherry-pick: refuse cherry-pick sequence if index is dirty
Date: Wed, 24 May 2023 11:33:49 +0200
Message-ID: <CAPMMpoic_+RATwS46=Bd2K4+D_5yEw9RQFGR075Bs4aQJUjtsQ@mail.gmail.com>
In-Reply-To: <xmqqjzwyh9tp.fsf@gitster.g>

On Wed, May 24, 2023 at 2:06 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Tao Klerks <tao@klerks.biz> writes:
>
> > The current implementation of this patch is far too restrictive. It
> > doesn't break any tests (and maybe I should add one now that I know),
> > but it's doing the wrong thing.
>
> I am ambivalent.  What do we want to see in a multi-pick sequence
> that is different from rebase?

I would argue there are primarily three things that are different:
1. The checkout of the new base (and checkout of the original in an "--abort")
2. The support for, and/or more common expectation of, "messing" with
commits as you go, e.g. squash, edit
3. The (partial) support for rebasing/recreating merge commits

I'm not sure to what extent any of these justify having tighter
restrictions on when we allow a rebase to start, though.

> A single-step cherry-pick can fail
> safely before it touches the index or the working tree files, but if
> two-step cherry-pick, whose first step succeeds, finds that it
> cannot safely carry out its second step without clobbering the local
> changes made to the working tree files, what should happen?  Are we
> OK if we stopped in the state just after the first step has already
> been done?

This is the current behavior: it stops before the specific pick that
is going to affect local unstaged changes, or if there are *any*
staged changes (in which case it stops as it's about to do the first
pick - the first time this check runs). The reasoning for this
behavior, as I understand it, is that the "--abort" strategy,
intending to "undo whatever I started doing here, including a conflict
resolution", resets the index. So as long as there is nothing you want
to keep in the index, and as long as we know that any previous picks
haven't impacted any files with unstaged changes, we're good.

The bug that I want to fix is that we only end up checking whether
there are changes in the index *after* we've already committed to
resetting the index upon later "--abort". It's a kind of catch-22:
we've detected that aborting would destroy your work, so we leave you
in a state where the most obvious thing to do is abort, so we destroy
your work... Of course, if you understand what's going on you can
choose to "--quit" and *not* lose your work... but this is completely
antithetical to the general intent of "--abort".
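
To make the ordering problem concrete, here is a toy model in Python. It
is only a sketch of the behavior described above, not git's sequencer
code: the Repo class, its fields, and the function names are all
hypothetical, and "staged" stands in for "the index differs from HEAD".

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy stand-in for a repository; not git's real data structures."""
    staged: set = field(default_factory=set)  # paths with staged changes
    sequencer_active: bool = False            # "a sequence is in progress"


def start_sequence_current(repo: Repo) -> bool:
    # Order described in the message: the sequence is recorded as started,
    # and only then does the first pick notice the staged changes.
    repo.sequencer_active = True
    return not repo.staged


def start_sequence_proposed(repo: Repo) -> bool:
    # Proposed order: check the index before recording anything.
    if repo.staged:
        return False
    repo.sequencer_active = True
    return True


def abort(repo: Repo) -> None:
    # "--abort" resets the index, but only if there is a sequence to abort.
    if repo.sequencer_active:
        repo.staged.clear()
        repo.sequencer_active = False


before = Repo(staged={"notes.txt"})
start_sequence_current(before)
abort(before)
print(before.staged)  # the staged work has been reset away

after = Repo(staged={"notes.txt"})
start_sequence_proposed(after)
abort(after)
print(after.staged)   # nothing was started, so the staged work is kept
```

In the first run the refusal comes too late: the sequence is already
recorded, so the obvious "--abort" resets the index. In the second run
the refusal happens up front and there is nothing to abort.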

There's another, smaller flaw here I think, common to Merge,
Single-Cherry-Pick, and Sequence-Cherry-Pick, which is that *if* you
start with unstaged changes, and you end up in a conflict resolution
or "--no-commit" pause, and you then "git add" your unstaged changes
during that pause/resolution, and you *then* later "--abort"... then
your originally-unstaged changes are destroyed by the "--abort" - so
it has *not* taken you back to where you were before the operation
started. This is, to me as a user, non-obvious, and could potentially
lead to data loss. The only way I see to fix that is to have *all* of
these operations refuse to operate on dirty worktrees altogether -
like rebase already does.

I suspect this level of "strictness" would be welcome to newcomers,
and less welcome to existing experienced users.

>
> My (tentative) answer to that question is "yes", but the recovery
> options of "cherry-pick" may want to work differently from what we
> have seen them traditionally do.  Namely, the user accepts that the
> first step is already done, and stopping "cherry-pick", be it called
> "--abort" or something else, should just remove the sequencer state
> and behave as if the single-pick cherry-pick on the first step only
> has just finished and leave such a state in the index and the
> working tree.

This behavior exists, and is called "--quit", right?

The semantics, as I understand them, are:
--quit: I know what I'm doing, just remove any "ongoing operation"
metadata and let me work with the current index and worktree.
--abort: This was a bad idea, please take me back to where I was
before I started this operation (without losing any work I had
ongoing, pls!)
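
As a rough illustration of the two sets of semantics, the following
Python sketch models them on a toy state. Everything here (the State
class, its fields, the commit names) is hypothetical and only mirrors
the description above; it is not how git stores sequencer state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    """Toy model of HEAD, the index, and "ongoing operation" metadata."""
    head: str
    index: dict = field(default_factory=dict)  # path -> staged content
    in_progress: bool = False
    orig_head: Optional[str] = None            # where the sequence started


def start(state: State) -> None:
    # Remember the starting point so that an abort can return to it.
    state.orig_head = state.head
    state.in_progress = True


def pick(state: State, commit: str) -> None:
    # A successful pick simply advances HEAD in this model.
    state.head = commit


def quit_op(state: State) -> None:
    # "--quit": drop the metadata, keep HEAD and the index as they are.
    state.in_progress = False
    state.orig_head = None


def abort_op(state: State) -> None:
    # "--abort": go back to the starting point and reset the index.
    if state.in_progress:
        state.head = state.orig_head
        state.index.clear()
        state.in_progress = False
        state.orig_head = None


quit_state = State(head="base")
start(quit_state)
pick(quit_state, "pick-1")
quit_state.index["file.txt"] = "half-resolved"
quit_op(quit_state)

abort_state = State(head="base")
start(abort_state)
pick(abort_state, "pick-1")
abort_state.index["file.txt"] = "half-resolved"
abort_op(abort_state)

print(quit_state.head, quit_state.index)
print(abort_state.head, abort_state.index)
```

After "--quit" the first pick and the staged content are still there;
after "--abort" both are gone, which is why the dirty-index check has to
happen before the starting point is recorded.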

>  If that is what we are going to do, then it would
> make sense to adopt the same safety semantics we use for "git merge"
> and "git checkout" to ensure only that the index is clean, relying
> on the unpack-trees machinery that stops before clobbering a locally
> modified working tree files.

Yep

> But if we are to aim for "all-or-none"
> semantics people expect from aborting "git rebase", I suspect that
> it would be way too complicated to allow random changes in the
> working tree files that we may only discover to be problems after
> starting the sequence of replaying commits one-by-one, and "too
> restrictive" check may be justified.

I don't think I understand this argument. If we want to support both
sets of semantics, then that's exactly what "--quit" and "--abort"
achieve, right? (as long as we check for the dirty index *before*
committing to destroying the index in case of "--abort")

>  To put it differently, if it
> is too restrictive for multi-pick, then we would want to loosen it
> for "git rebase" as well, as the issues are likely to be the same.

My argument for only changing "sequence-cherry-pick" here, and having
it (continue to) use the index-safety-only semantics of
single-cherry-pick and merge, is that *this is not a change in
capability* - it is only a bugfix. Switching to the worktree-safety
semantics of rebase would be a substantial change in behavior beyond
the bugfix.

I, personally, would prefer to see the worktree-safety semantics of
rebase be used in *all* these operations, so I could no longer shoot
myself in the foot by starting a merge, accidentally staging some
previously-unstaged changes during conflict resolution, and then
losing those changes by "--abort"ing. But I expect that this kind of
change would need to be behind a config option of some sort, trading
off safety against low friction.

I could imagine a setting like "core.OperationWorktreeSafety", with
settings "default" (current behavior - rebase disallows dirty
worktrees, the others disallow dirty index), "strict" (all behave like
current rebase) and "lax" (all behave like merge).
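
The three values could be read as a small decision table. The Python
sketch below is only an illustration of that table under the semantics
proposed here; the setting does not exist in git, and the enum, the
function names, and the operation strings are all made up for the
example.

```python
from enum import Enum


class Safety(Enum):
    """Hypothetical values for the proposed setting."""
    DEFAULT = "default"  # rebase: clean worktree; others: clean index
    STRICT = "strict"    # everything behaves like current rebase
    LAX = "lax"          # everything behaves like merge


def requires_clean_worktree(setting: Safety, operation: str) -> bool:
    if setting is Safety.STRICT:
        return True
    if setting is Safety.LAX:
        return False
    # DEFAULT keeps today's split: only rebase insists on a clean worktree.
    return operation == "rebase"


def may_start(setting: Safety, operation: str,
              index_dirty: bool, worktree_dirty: bool) -> bool:
    # A dirty index is refused under every value of the setting.
    if index_dirty:
        return False
    if worktree_dirty and requires_clean_worktree(setting, operation):
        return False
    return True


for setting in Safety:
    for operation in ("rebase", "merge", "cherry-pick"):
        allowed = may_start(setting, operation,
                            index_dirty=False, worktree_dirty=True)
        print(f"{setting.value:8} {operation:12} dirty worktree ok: {allowed}")
```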

As discussed elsewhere, I would also like to (have an option to) treat
untracked files as "worktree dirtiness"/unstaged changes in exactly
the same way as changes to tracked files - but that's another topic :)

I'll prepare a v2 with index-safety-only for sequence-cherry-pick for
now, please let me know if a (better-named)
"core.OperationWorktreeSafety" option is something that you'd be
interested in / that would make sense to you.

Thanks!

Thread overview: 8+ messages
2023-05-23  8:32 [PATCH] cherry-pick: refuse cherry-pick sequence if index is dirty Tao Klerks via GitGitGadget
2023-05-23 16:01 ` Tao Klerks
2023-05-24  0:06   ` Junio C Hamano
2023-05-24  9:33     ` Tao Klerks [this message]
2023-05-30 13:01       ` Phillip Wood
2023-05-28  9:08 ` [PATCH v2] " Tao Klerks via GitGitGadget
2023-05-30 14:16   ` Phillip Wood
2023-09-06  5:02     ` Tao Klerks
