Git Mailing List Archive mirror
* Git monorepo - recommendation regarding usage of sparse-checkout
@ 2023-05-30 19:26 Mor, Gil (DXC Luxoft)
  2023-07-09  1:44 ` Sean Allred
  0 siblings, 1 reply; 3+ messages in thread
From: Mor, Gil (DXC Luxoft) @ 2023-05-30 19:26 UTC
  To: git@vger.kernel.org

Hello, we are experimenting with migrating a large-ish code base from SVN to a Git Monorepo and it would help us if we can get some input regarding the usage of sparse-checkout.

From our timing experiments sparse-checkout is the only method so far that reduces our times to good results.

The only issue might be the Disclaimer that the sparse-checkout feature is experimental, and that the behavior will change.

We have tried full, shallow, blobless, and treeless clones in all combinations, and each operation (checkout and branch switching) takes 25-40 minutes.
With sparse checkouts, these times drop to a few minutes for a checkout and a few seconds for a branch switch, for each sub-project/sparse set.
Our code base doesn't have binary blobs - only text.

Of the high-level use cases listed in the documentation, we would say B ("Users want a sparse working tree but are working in a larger whole") fits us.
https://git-scm.com/docs/sparse-checkout#_purpose_of_sparse_checkouts

The command is already featured in GitHub and GitLab articles about reducing Monorepos size but we are still not sure how un/stable the feature is or how commonly used the feature is already.

So, we thought we'll write an email to see if we can get a bit more nuanced answer about the safety of real-world usage so that we can make an informed decision whether or not to start using sparse-checkout, despite it being experimental.

We are not looking for 100% assurance, we know the responsibility is eventually totally ours and there are no guarantees, but it seems like a game changer so we are just looking for a bit more information so that we can make a decision.

Best Regards,

Gil Mor
SW Developer
Cross Industry Solutions

Luxoft
A DXC Company





* Re: Git monorepo - recommendation regarding usage of sparse-checkout
  2023-05-30 19:26 Git monorepo - recommendation regarding usage of sparse-checkout Mor, Gil (DXC Luxoft)
@ 2023-07-09  1:44 ` Sean Allred
  2023-07-27 20:01   ` Rudy Rigot
  0 siblings, 1 reply; 3+ messages in thread
From: Sean Allred @ 2023-07-09  1:44 UTC
  To: Mor, Gil (DXC Luxoft); +Cc: git@vger.kernel.org


"Mor, Gil (DXC Luxoft)" <gil.mor@dxc.com> writes:
> Hello, we are experimenting with migrating a large-ish code base from
> SVN to a Git Monorepo and it would help us if we can get some input
> regarding the usage of sparse-checkout.

We're in the same boat. I haven't been able to keep up with the list as
well as I would like, but I can share our experience so far. We're
writing developer tooling for a team of ~2k devs.

> From our timing experiments sparse-checkout is the only method so far
> that reduces our times to good results.

You should also look into sparse-index.
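
For readers who have not used it, the sparse index can be requested
through a flag on `git sparse-checkout` and only applies in cone mode.
Below is a minimal sketch, in Python, of how wrapper tooling might
assemble that command. The function name, repository path, and
directory names are hypothetical, and the flags assume a reasonably
recent Git.

```python
import shlex


def sparse_index_command(repo_path, directories, git_exe="git"):
    """Build the argv that narrows `repo_path` to `directories` in cone
    mode and asks Git to keep a sparse index as well."""
    if not directories:
        raise ValueError("at least one directory is required")
    return [
        git_exe, "-C", repo_path,
        "sparse-checkout", "set", "--cone", "--sparse-index",
        *directories,
    ]


# Hypothetical repository path and sub-project directories.
argv = sparse_index_command("/src/monorepo", ["services/billing", "libs/common"])
print(shlex.join(argv))
```

The list can be handed to `subprocess.run(argv, check=True)` to run it;
building the argv separately keeps the command easy to log and to test.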

> The only issue might be the Disclaimer that the sparse-checkout
> feature is experimental, and that the behavior will change.

It seems vanishingly unlikely that the feature will go away at this
point (even if the CLI changes). We have automated integration tests set
up for our automation and have near-term plans to start running those
against `git.git:master` and `git.git:next`. This way, we'll get advance
notice if something we're relying on starts breaking.
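
To illustrate the idea (this is not our actual test suite), here is a
rough sketch of a smoke test that takes the path to a Git binary, for
example one built from a git.git branch, and checks the one side effect
it cares about. The function name and the `dirA`/`dirB` layout are
made up for the example.

```python
import os
import shutil
import subprocess
import tempfile


def sparse_smoke_test(git_exe="git"):
    """Return True if `git sparse-checkout set` still produces the
    working tree we rely on: dirA present, dirB absent."""
    workdir = tempfile.mkdtemp(prefix="sparse-smoke-")
    repo = os.path.join(workdir, "repo")

    def git(*args):
        # Identity and signing are pinned so the test does not depend on
        # the machine's global Git configuration.
        subprocess.run(
            [git_exe, "-C", repo,
             "-c", "user.name=CI", "-c", "user.email=ci@example.invalid",
             "-c", "commit.gpgsign=false", *args],
            check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )

    try:
        # Two directories with one tracked file each.
        os.makedirs(os.path.join(repo, "dirA"))
        os.makedirs(os.path.join(repo, "dirB"))
        for parts in (("dirA", "a.txt"), ("dirB", "b.txt")):
            with open(os.path.join(repo, *parts), "w") as handle:
                handle.write("content\n")
        git("init", "-q")
        git("add", ".")
        git("commit", "-q", "-m", "initial")
        # Narrow the working tree to dirA only.
        git("sparse-checkout", "set", "dirA")
        kept = os.path.exists(os.path.join(repo, "dirA", "a.txt"))
        dropped = not os.path.exists(os.path.join(repo, "dirB", "b.txt"))
        return kept and dropped
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

A CI job could call `sparse_smoke_test("/path/to/built/git")` once per
branch under test; a failed Git command raises
`subprocess.CalledProcessError`, so breakage shows up either as an
exception or as a `False` result.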

> The command is already featured in GitHub and GitLab articles about
> reducing Monorepos size but we are still not sure how un/stable the
> feature is or how commonly used the feature is already.

We haven't encountered many issues with stability. There was one issue
a few months back where the pattern syntax changed, but as I recall that
was more of a problem with one of our developers going off the beaten
path and trying to write to GIT_DIR directly instead of using `git
sparse-checkout set` or similar.

> So, we thought we'll write an email to see if we can get a bit more
> nuanced answer about the safety of real-world usage so that we can
> make an informed decision whether or not to start using
> sparse-checkout, despite it being experimental.

One of the goals of our tooling is to teach people how to actually use
Git (i.e., use our tooling to automate the boring stuff -- not to
replace Git itself). To meet this goal, we're using the more 'ergonomic'
git-switch command instead of git-checkout. In our case, as long as we
can react to changes in git-switch syntax (which we haven't seen since
the project started a few years ago) and as long as we can get the same
side-effects, we'll be fine. This comfort is largely driven by the
existence of integration tests.

> We are not looking for 100% assurance, we know the responsibility is
> eventually totally ours and there are no guarantees, but it seems like
> a game changer so we are just looking for a bit more information so
> that we can make a decision.

Sparse checkout is not a silver bullet, but it does make a difference.
We still see commits take several seconds on Windows (even with sparse
index). This is *several orders of magnitude* better than SVN on our
repository (where naive commits on top-level folders can take tens of
minutes), but it's not what folks are going to be expecting from Git. In
the long term, we're looking at what would be involved in splitting up
our monorepo and seeing whether the rewards really outweigh the costs
(both of which reach far beyond source control).

Best of luck!

--
Sean Allred


* Re: Git monorepo - recommendation regarding usage of sparse-checkout
  2023-07-09  1:44 ` Sean Allred
@ 2023-07-27 20:01   ` Rudy Rigot
  0 siblings, 0 replies; 3+ messages in thread
From: Rudy Rigot @ 2023-07-27 20:01 UTC
  To: Sean Allred; +Cc: Mor, Gil (DXC Luxoft), git@vger.kernel.org

To add a data point: we have been using sparse checkout in non-cone
mode on our very large repository at Salesforce. We use non-cone mode
because we are a monolith: 99% of our files are needed by our single
build, so we have to work by exclusion. Our repository has been in
production for a bit over a year and now has between 1k and 2k active
collaborators. The only issue we've seen with sparse checkout is when
a user messes up our scripted environment setup badly enough that they
end up running a very old version of Git. Recent versions have been
seamless.
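
As an illustration of working by exclusion (not the actual patterns
used on the repository described above), here is a small Python sketch
that builds non-cone patterns: include everything with `/*`, then
negate the directories the build does not need. The function names and
directory names are hypothetical.

```python
def exclusion_patterns(excluded_dirs):
    """Build non-cone sparse-checkout patterns that keep everything
    except the listed directories (paths relative to the repo root)."""
    patterns = ["/*"]  # start by including the whole tree
    for directory in excluded_dirs:
        cleaned = directory.strip("/")
        if not cleaned:
            raise ValueError("empty directory name")
        # A leading "!" negates the pattern; the trailing "/" restricts
        # the match to directories.
        patterns.append(f"!/{cleaned}/")
    return patterns


def non_cone_command(repo_path, excluded_dirs, git_exe="git"):
    """Build the argv for `git sparse-checkout set` in non-cone mode."""
    return [
        git_exe, "-C", repo_path,
        "sparse-checkout", "set", "--no-cone",
        *exclusion_patterns(excluded_dirs),
    ]


# Hypothetical directories that the build does not need.
print("\n".join(exclusion_patterns(["docs/archive", "tools/legacy/"])))
```

Order matters in non-cone mode: the negated entries have to come after
`/*`, because later patterns override earlier ones.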


