Git Mailing List Archive mirror
From: ZheNing Hu <adlternative@gmail.com>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Victoria Dye <vdye@github.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>,
	Matheus Tavares <matheus.bernardino@usp.br>,
	Elijah Newren <newren@gmail.com>, Glen Choo <chooglen@google.com>,
	Martin von Zweigbergk <martinvonz@google.com>
Subject: Re: [PATCH v4] sparse-checkout.txt: new document with sparse-checkout directions
Date: Wed, 16 Nov 2022 11:18:00 +0800
Message-ID: <CAOLTT8T39Q4q5W2BaFVkm81T7mRc1UvT2MN07XHGT5qpB7ZMHQ@mail.gmail.com>
In-Reply-To: <CAOLTT8TzpfoH7pz7gxgFvNWOaUZUcg1q_Tap+2anwHfAUgDV8Q@mail.gmail.com>

ZheNing Hu <adlternative@gmail.com> wrote on Tue, Nov 15, 2022 at 12:03:
>
> Hi,
>
> Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote on Sun, Nov 6, 2022 at 14:04:
> >
> > From: Elijah Newren <newren@gmail.com>
> >
> > Once upon a time, Matheus wrote some patches to make
> >    git grep [--cached | <REVISION>] ...
> > restrict its output to the sparsity specification when working in a
> > sparse checkout[1].  That effort got derailed by two things:
> >
> >   (1) The --sparse-index work just beginning which we wanted to avoid
> >       creating conflicts for
> >   (2) Never deciding on flag and config names and planned high level
> >       behavior for all commands.
> >
> > More recently, Shaoxuan implemented a more limited form of Matheus'
> > patches that only affected --cached, using a different flag name,
> > but also changing the default behavior in line with what Matheus did.
> > This again highlighted the fact that we never decided on command line
> > flag names, config option names, and the big picture path forward.
> >
> > The --sparse-index work has been mostly complete (or at least released
> > into production even if some small edges remain) for quite some time
> > now.  We have also had several discussions on flag and config names,
> > though we never came to solid conclusions.  Stolee once upon a time
> > suggested putting all these into some document in
> > Documentation/technical[3], which Victoria recently also requested[4].
> > I'm behind the times, but here's a patch attempting to finally do that.
> >
> > [1] https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/
> >     (See his second link in that email in particular)
> > [2] https://lore.kernel.org/git/20220908001854.206789-2-shaoxuan.yuan02@gmail.com/
> > [3] https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/
> >     (Scroll to the very end for the final few paragraphs)
> > [4] https://lore.kernel.org/git/cafcedba-96a2-cb85-d593-ef47c8c8397c@github.com/
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >     sparse-checkout.txt: new document with sparse-checkout directions
> >
> >     v2 and v3 didn't get any reviews (I know, I know, this document is
> >     really long), but it's been nearly a month and this patch is still
> >     marked as "Needs Review", so I'm hoping sending a v4 will encourage
> >     feedback. I think it's good enough to accept and start iterating, but
> >     want to be sure others agree.
> >
> >     As before, I think we're starting to converge on actual proposals;
> >     there's some areas we've agreed on, others we've compromised on, and
> >     some we've just figured out what the others were saying. The discussion
> >     has been very illuminating; thanks to everyone who has chimed in. I've
> >     tried to take my best stab at cleaning up and culling things that don't
> >     need to remain as open questions, but if I've mis-represented anyone or
> >     missed something, don't hesitate to speak up. Everything is still open
> >     for debate, even if not marked as a currently open question.
> >
> >     Changes since v3:
> >
> >      * A few minor wording cleanups here and there, and one paragraph moved
> >        to keep similar things together.
> >
> >     Changes since v2:
> >
> >      * Compromised with Stolee on log -- Behavior A only affects
> >        patch-related operations, not revision walking
> >      * Incorporated Junio's suggestions about untracked file handling
> >      * Added new usecases, one brought up by Martin, one by Stolee
> >      * Added new sections:
> >        * Usecases of primary concern
> >        * Oversimplified mental models ("Cliff Notes" for this document!)
> >      * Recategorization of a few commands based on discussion
> >      * Greater details on how index operations work under Behavior A, to
> >        avoid weird edge cases
> >      * Extended explanation of the sparse specification, particularly when
> >        index differs from HEAD
> >      * Switched proposed flag names to --scope={sparse,all} to avoid binary
> >        flags that are hard to extend
> >      * Switched proposed config option name (still need good values and
> >        descriptions for it, though)
> >      * Removed questions we seemed to have agreement on. Modified/extended
> >        some existing questions.
> >      * Added Stolee's sparse-backfill ideas to the plans
> >      * Additional Known bugs
> >      * Various wording improvements
> >      * Possibly other things I've missed.
> >
> >     Changes since v1:
> >
> >      * Added new sections:
> >        * "Terminology"
> >        * "Behavior classes"
> >        * "Sparse specification vs. sparsity patterns"
> >      * Tried to shuffle commands from unknown into appropriate sections
> >        based on feedback, but I got some conflicting feedback, so...who
> >        knows if things are in the right place
> >      * More consistency in using "sparse specification" over other terms
> >      * Extra comments about how add/rm/mv operate on moving files across the
> >        tracked/untracked boundary
> >      * --restrict-but-warn should have been "restrict or error", but
> >        reworded even more heavily as part of "Behavior classes" section
> >      * Added extra questions based on feedback (--no-expand, update-index
> >        stuff, apply --index)
> >      * More details on apply/am bugs
> >      * Documented read-tree issue
> >      * A few cases of fixing line wrapping at <=80 chars
> >      * Added more alternate name suggestions for options instead of
> >        --[no-]restrict
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1367%2Fnewren%2Fsparse-checkout-directions-v4
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1367/newren/sparse-checkout-directions-v4
> > Pull-Request: https://github.com/gitgitgadget/git/pull/1367
> >
> >  Documentation/technical/sparse-checkout.txt | 1103 +++++++++++++++++++
> >  1 file changed, 1103 insertions(+)
> >  create mode 100644 Documentation/technical/sparse-checkout.txt
> >
> > diff --git a/Documentation/technical/sparse-checkout.txt b/Documentation/technical/sparse-checkout.txt
> > new file mode 100644
> > +=== Terminology ===
> > +
> > +sparse directory: An entry in the index corresponding to a directory, which
> > +       appears in the index instead of all the files under that directory
> > +       that would normally appear.  See also sparse-index.  Something that
> > +       can cause confusion is that the "sparse directory" does NOT match
> > +       the sparse specification, i.e. the directory is NOT present in the
> > +       working tree.  May be renamed in the future (e.g. to "skipped
> > +       directory").
> > +
> > +sparse index: A special mode for sparse-checkout that also makes the
> > +       index sparse by recording a directory entry in lieu of all the
> > +       files underneath that directory (thus making that a "skipped
> > +       directory" which unfortunately has also been called a "sparse
> > +       directory"), and does this for potentially multiple
> > +       directories.  Controlled by the --[no-]sparse-index option to
> > +       init|set|reapply.
> > +
> > +sparsity patterns: patterns from $GIT_DIR/info/sparse-checkout used to
> > +       define the set of files of interest.  A warning: It is easy to
> > +       over-use this term (or the shortened "patterns" term), for two
> > +       reasons: (1) users in cone mode specify directories rather than
> > +       patterns (their directories are transformed into patterns, but
> > +       users may think you are talking about non-cone mode if you use the
> > +       word "patterns"), and (b) the sparse specification might
>
> nit: s/(b)/(2)/g
>
> > +       transiently differ in the working tree or index from the sparsity
> > +       patterns (see "Sparse specification vs. sparsity patterns").
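
As a rough illustration of "directories are transformed into
patterns", the sketch below asks for one directory in cone mode and
then prints what ends up in $GIT_DIR/info/sparse-checkout. The
repository name "demo-patterns" and the paths p1/ and p2/ are
hypothetical.

```shell
#!/bin/sh
# Sketch only: the repository and path names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo-patterns
cd demo-patterns
mkdir p1 p2
echo a >p1/a
echo b >p2/b
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# The user names a directory, not a pattern ...
git sparse-checkout init --cone
git sparse-checkout set p1

# ... "list" reports it back as a directory in cone mode ...
dirs=$(git sparse-checkout list)
printf 'directories: %s\n' "$dirs"

# ... while the file on disk holds the generated patterns.
patterns=$(cat "$(git rev-parse --git-path info/sparse-checkout)")
printf '%s\n' "$patterns"
```

The stored patterns are /*, !/*/ and /p1/ (one per line), which is
why saying "patterns" to a cone-mode user can be confusing: they only
ever typed "p1".
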
> > +
> > +sparse specification: The set of paths in the user's area of focus.  This
> > +       is typically just the tracked files that match the sparsity
> > +       patterns, but the sparse specification can temporarily differ and
> > +       include additional files.  (See also "Sparse specification
> > +       vs. sparsity patterns")
> > +
> > +       * When working with history, the sparse specification is exactly
> > +         the set of files matching the sparsity patterns.
> > +       * When interacting with the working tree, the sparse specification
> > +         is the set of tracked files with a clear SKIP_WORKTREE bit or
> > +         tracked files present in the working copy.
>

I found af6a518 (repo_read_index: clear SKIP_WORKTREE bit from files
present in worktree), which may be a good place to learn about the
"sparse specification", although its commit message is long.
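
For anyone who wants to look at the SKIP_WORKTREE bit directly, a
small sketch: "git ls-files -t" prefixes each entry with a status
letter, "H" for an ordinary cached entry and "S" for an entry with
SKIP_WORKTREE set. The repository name "demo-skip-worktree" and the
paths are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the repository and path names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo-skip-worktree
cd demo-skip-worktree
mkdir p1 p2
echo a >p1/a
echo b >p2/b
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

git sparse-checkout init --cone
git sparse-checkout set p1

# p1/a stays a normal entry (H); p2/b is still tracked but is marked
# SKIP_WORKTREE (S).
flags=$(git ls-files -t)
printf '%s\n' "$flags"

# Only p1 is left in the working tree.
present=$(ls)
printf 'working tree: %s\n' "$present"
```

This only shows the starting state. Whether a file later written
under p2/ loses its "S" flag depends on running a Git that contains
the commit mentioned above.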

> I'm guessing what you mean here is:
> Some files have the SKIP_WORKTREE bit cleared in their index entries.
> Files that are "vivified" (restored to the worktree), and new files
> added to the index (tracked files), also belong to the sparse
> specification.
>
> I think we can add some examples to describe these terms.
>
> #!/bin/sh
>
> set -x
>
> rm -rf mono-repo
> git init mono-repo -b main
> (
>   cd mono-repo &&
>   mkdir p1 p2 &&
>   echo a >p1/a &&
>   echo b >p1/b &&
>   echo a >p2/a &&
>   echo b >p2/b &&
>   git add . &&
>   git commit -m ok &&
>   git sparse-checkout set p1 &&
>   git ls-files -t &&
>   echo a >>p1/a &&
>   echo b >>p1/b &&
>   mkdir p2 p3 &&
>   echo next >>p2/a &&
>   echo next >>p3/c &&
>   git add --sparse p3/c &&   # --sparse: p3/ is outside the sparse-checkout cone
>   # p2/a and p3/c vivify
>   git ls-files -t &&
>   # compare worktree/commit
>   git --no-pager diff HEAD --name-only
> )
>
> > +       * When modifying or showing results from the index, the sparse
> > +         specification is the set of files with a clear SKIP_WORKTREE bit
> > +         or that differ in the index from HEAD.
>
> #!/bin/sh
>
> set -x
>
> rm -rf mono-repo
> git init mono-repo -b main
> (
>   cd mono-repo &&
>   mkdir p1 p2 &&
>   echo a >p1/a &&
>   echo b >p1/b &&
>   echo a >p2/a &&
>   echo b >p2/b &&
>   git add . &&
>   git commit -m ok &&
>   git sparse-checkout set p1 &&
>   git update-index --chmod=+x p2/a &&
>   # compare commit/index
>   git --no-pager diff --cached --name-only
> )
>
