From: Rudy Rigot <rudy.rigot@gmail.com>
To: Sean Allred <allred.sean@gmail.com>
Cc: "Mor, Gil (DXC Luxoft)" <gil.mor@dxc.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Git monorepo - recommendation regarding usage of sparse-checkout
Date: Thu, 27 Jul 2023 15:01:14 -0500
Message-ID: <CANaDLWK+UYLgVbqjDxq_euYeJh1CCVMm283GZdQFSwUsBfTKSA@mail.gmail.com>
In-Reply-To: <m0a5w5etlu.fsf@epic96565.epic.com>

To add a data point: we have been using sparse checkout in non-cone
mode on our very large repository at Salesforce. We use non-cone mode
because we are a monolith: 99% of our files are actually needed by our
single build, so we need to proceed by exclusion. Our repository has
been in production for a bit over a year, now with between 1k and 2k
active collaborators, and the only issue we've seen with sparse
checkout is when a user messes something up badly enough in our
scripted environment setup that they end up running a very old version
of Git for some reason. Recent versions have been seamless.
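
As a minimal sketch of what working by exclusion can look like in
non-cone mode: the throwaway repository and the src/ and docs/ paths
below are hypothetical, not the actual Salesforce setup, and the
--no-cone flag on `git sparse-checkout set` is assumed to be available
(it requires a reasonably recent Git).

```shell
#!/bin/sh
set -eu

# Throwaway repository with hypothetical paths, for illustration only.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo 'not needed by the build' > docs/guide.txt
git add .
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'initial import'

# Non-cone mode takes gitignore-style patterns: include everything,
# then carve out the part the build does not need.
git sparse-checkout set --no-cone '/*' '!/docs/'

# Show the active patterns and the resulting top-level worktree entries.
git sparse-checkout list
ls
```

The excluded path stays tracked in the index and in history; only the
working tree copy is left out.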

--


On Sat, Jul 8, 2023 at 9:20 PM Sean Allred <allred.sean@gmail.com> wrote:
>
>
> "Mor, Gil (DXC Luxoft)" <gil.mor@dxc.com> writes:
> > Hello, we are experimenting with migrating a large-ish code base from
> > SVN to a Git Monorepo and it would help us if we can get some input
> > regarding the usage of sparse-checkout.
>
> We're in the same boat. I haven't been able to keep up with the list as
> well as I would like, but I can share our experience so far. We're
> writing developer tooling for a team of ~2k devs.
>
> > From our timing experiments sparse-checkout is the only method so far
> > that reduces our times to good results.
>
> You should also look into sparse-index.
>
> > The only issue might be the Disclaimer that the sparse-checkout
> > feature is experimental, and that the behavior will change.
>
> It seems vanishingly unlikely that the feature will go away at this
> point (even if the CLI changes). We have automated integration tests set
> up for our automation and have near-term plans to start running those
> against `git.git:main` and `git.git:next`. This way, we'll get advance
> notice if something we're relying on starts breaking.
>
> > The command is already featured in GitHub and GitLab articles about
> > reducing Monorepos size but we are still not sure how un/stable the
> > feature is or how commonly used the feature is already.
>
> We haven't encountered many issues with stability. There was one issue
> a few months back where the pattern syntax changed, but as I recall that
> was more of a problem with one of our developers going off the beaten
> path and trying to write to GIT_DIR directly instead of using `git
> sparse-checkout set` or similar.
>
> > So, we thought we'll write an email to see if we can get a bit more
> > nuanced answer about the safety of real-world usage so that we can
> > make an informed decision whether or not to start using
> > sparse-checkout, despite it being experimental.
>
> One of the goals of our tooling is to teach people how to actually use
> Git (i.e., use our tooling to automate the boring stuff -- not to
> replace Git itself). To meet this goal, we're using the more 'ergonomic'
> git-switch command instead of git-checkout. In our case, as long as we
> can react to changes in git-switch syntax (which we haven't seen since
> the project started a few years ago) and as long as we can get the same
> side-effects, we'll be fine. This comfort is largely driven by the
> existence of integration tests.
>
> > We are not looking for 100% assurance, we know the responsibility is
> > eventually totally ours and there are no guarantees, but it seems like
> > a game changer so we are just looking for a bit more information so
> > that we can make a decision.
>
> Sparse checkout is not a silver bullet, but it does make a difference.
> We still see commits take several seconds on Windows (even with sparse
> index). This is *several orders of magnitude* better than SVN on our
> repository (where naive commits on top-level folders can take tens of
> minutes), but it's not what folks are going to be expecting from Git. In
> the long-term, we're looking at what would be involved in splitting up
> our monorepo and seeing whether the rewards really outweigh the costs
> (both of which reach far beyond source control).
>
> Best of luck!
>
> --
> Sean Allred

Thread overview:
2023-05-30 19:26 Git monorepo - recommendation regarding usage of sparse-checkout Mor, Gil (DXC Luxoft)
2023-07-09  1:44 ` Sean Allred
2023-07-27 20:01   ` Rudy Rigot [this message]