Git Mailing List Archive mirror
* `git gc` says "unable to read" but `git fsck` happy
From: Stefan Monnier @ 2023-03-29 22:05 UTC
  To: git

Here's an example session:

    % LANG=C git fsck --strict; LANG=C git gc
    Checking object directories: 100% (256/256), done.
    error in tree 2699d230e3b592ae42506d7b5c969a7ac6a4593c: zeroPaddedFilemode: contains zero-padded file modes
    Checking objects: 100% (462555/462555), done.
    Verifying commits in commit graph: 100% (117904/117904), done.
    Enumerating objects: 462573, done.
    Counting objects: 100% (462573/462573), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (155363/155363), done.
    fatal: unable to read f5e44b38fc8f7e15e5e6718090d05b09912254fa
    fatal: failed to run repack
    %

How come it can't read `f5e44b38fc8f7e15e5e6718090d05b09912254fa` during
"repack" while `git fsck` says everything is fine?

More importantly: how do I diagnose this further and fix it?

Rumors on the net suggest that `git gc --aggressive` may circumvent this
problem occasionally, but those don't seem to know what they're talking
about, and in my case it didn't make any difference (except that it
takes more time :-).


        Stefan



* Re: `git gc` says "unable to read" but `git fsck` happy
From: Jeff King @ 2023-03-29 23:37 UTC
  To: Stefan Monnier; +Cc: git

On Wed, Mar 29, 2023 at 06:05:24PM -0400, Stefan Monnier wrote:

> Here's an example session:
> 
>     % LANG=C git fsck --strict; LANG=C git gc
>     Checking object directories: 100% (256/256), done.
>     error in tree 2699d230e3b592ae42506d7b5c969a7ac6a4593c: zeroPaddedFilemode: contains zero-padded file modes
>     Checking objects: 100% (462555/462555), done.
>     Verifying commits in commit graph: 100% (117904/117904), done.
>     Enumerating objects: 462573, done.
>     Counting objects: 100% (462573/462573), done.
>     Delta compression using up to 8 threads
>     Compressing objects: 100% (155363/155363), done.
>     fatal: unable to read f5e44b38fc8f7e15e5e6718090d05b09912254fa
>     fatal: failed to run repack
>     %
> 
> How come it can't read `f5e44b38fc8f7e15e5e6718090d05b09912254fa` during
> "repack" while `git fsck` says everything is fine?

Do you use separate worktrees? It sounds like it might be similar to
this case:

  https://lore.kernel.org/git/c6246ed5-bffc-7af9-1540-4e2071eff5dc@kdbg.org/

If so, there are patches in the current "master" (but not in a released
version yet) that fix fsck to detect this.

> More importantly: how do I diagnose this further and fix it?

If it is the same problem (which would be a blob or maybe cached tree
missing in one of the worktree's index files), then probably you'd
either:

  1. Accept the loss and blow away that worktree's index file (or
     perhaps even the whole worktree, and just recreate it).

  2. Try to "git add" the file in question to recreate the blob
     (assuming that the file itself is still hanging around).
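A minimal sketch of option 1, assuming a POSIX shell and that losing any staged-but-uncommitted changes recorded in that worktree's index is acceptable. The function name `rebuild_worktree_index` and the `.broken` suffix are illustrative only, not part of Git. For option 2, running `git add <path>` on the still-present file would write its blob again.

```shell
# Rebuild the index of one worktree from its HEAD commit.
# Files in the working directory are left untouched; the old index is
# kept next to the new one with a ".broken" suffix instead of being deleted.
rebuild_worktree_index() {
    wt=$1
    # --git-path resolves to the per-worktree index file, which for a
    # linked worktree lives under the main repository's worktrees/ directory.
    idx=$(git -C "$wt" rev-parse --git-path index) || return 1
    # The path may be printed relative to the worktree; make it usable
    # from the caller's current directory.
    case $idx in
        /*) ;;
        *) idx=$wt/$idx ;;
    esac
    if [ -f "$idx" ]; then
        mv "$idx" "$idx.broken" || return 1
    fi
    # A plain (mixed) reset repopulates the index from HEAD.
    git -C "$wt" reset --quiet
}
```

Usage would look like `rebuild_worktree_index /path/to/worktree`, followed by `git status` in that worktree to see what is no longer staged.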

The original corruption bug itself (gc not taking into account worktree
index files) has been fixed for a while, so the theory is that this can
be lingering corruption from a repack by an older version of Git. But if
you have evidence to the contrary, we'd like to hear that, too. ;)

> Rumors on the net suggest that `git gc --aggressive` may circumvent this
> problem occasionally, but those don't seem to know what they're talking
> about, and in my case it didn't make any difference (except that it
> takes more time :-).

I don't think --aggressive would help at all. In theory --prune=now
might, but I think even that won't help if the problem is that the
object is referenced in an index file.

It could also be a totally unrelated bug, perhaps where we are too eager
to complain about missing objects in unreachable history we're trying to
retain. In which case "git gc --prune=now" _would_ help (but it might be
nice to save a copy of the repository before trying, because that would
indicate a bug we still need to fix, and your repo is worth
investigating).
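A minimal sketch of the "save a copy, then prune" step, assuming a POSIX shell and that the first argument is the main repository directory (not a linked worktree). The function name `gc_prune_with_backup` and the backup location are illustrative. Note that `--prune=now` discards unreachable objects immediately, which is why the copy is made first.

```shell
# Copy the repository aside, then run gc with immediate pruning.
# The copy preserves the pre-gc state for later investigation.
gc_prune_with_backup() {
    repo=$1
    backup=$2
    # -R copies recursively, -p preserves modes and timestamps.
    cp -Rp "$repo" "$backup" || return 1
    git -C "$repo" gc --quiet --prune=now
}
```

If the gc succeeds only with `--prune=now`, the untouched copy in the backup directory is what would be worth examining further.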

-Peff


* Re: `git gc` says "unable to read" but `git fsck` happy
From: Stefan Monnier @ 2023-03-30 13:01 UTC
  To: Jeff King; +Cc: git

>> How come it can't read `f5e44b38fc8f7e15e5e6718090d05b09912254fa` during
>> "repack" while `git fsck` says everything is fine?
>
> Do you use separate worktrees?

Very much so, indeed!

> It sounds like it might be similar to this case:
>
>   https://lore.kernel.org/git/c6246ed5-bffc-7af9-1540-4e2071eff5dc@kdbg.org/

That sounds exactly right.  I was actually preparing to file
a separate bug report because of a similar problem I had identified
where a worktree's `index` caused a similar problem (`git fsck` happy
but `git gc` fails) except it was found much earlier in `git gc`,
causing a "bad object" error almost right away.

> If so, there are patches in the current "master" (but not in a released
> version yet) that fix fsck to detect this.

Good, thanks.

>> More importantly: how do I diagnose this further and fix it?
>
> If it is the same problem (which would be a blob or maybe cached tree
> missing in one of the worktree's index files), then probably you'd
> either:
>
>   1. Accept the loss and blow away that worktree's index file (or
>      perhaps even the whole worktree, and just recreate it).

Hmm... the problem is "that": I have about a hundred worktrees for
this repository.
But yes, I can just throw away all those `index` files, I guess.

>      (assuming that the file itself is still hanging around).
> The original corruption bug itself (gc not taking into account worktree
> index files) has been fixed for a while, so the theory is that this can
> be lingering corruption from a repack by an older version of Git. But if
> you have evidence to the contrary, we'd like to hear that, too. ;)

My suspicion is that the origin of the broken state is elsewhere (maybe
a power failure?) because the problem appeared "simultaneously" (a few
days apart, really) for two different repositories.

> I don't think --aggressive would help at all. In theory --prune=now
> might, but I think even that won't help if the problem is that the
> object is referenced in an index file.

Indeed, I had also tried `--prune=now` and it did not help.
Thanks,


        Stefan



* Re: `git gc` says "unable to read" but `git fsck` happy
From: Jeff King @ 2023-03-30 18:17 UTC
  To: Stefan Monnier; +Cc: git

On Thu, Mar 30, 2023 at 09:01:39AM -0400, Stefan Monnier wrote:

> > If it is the same problem (which would be a blob or maybe cached tree
> > missing in one of the worktree's index files), then probably you'd
> > either:
> >
> >   1. Accept the loss and blow away that worktree's index file (or
> >      perhaps even the whole worktree, and just recreate it).
> 
> Hmm... the problem is "that": I have about a hundred worktrees for
> this repository.
> But yes, I can just throw away all those `index` files, I guess.

If you try "git fsck" from the tip of master, it should identify the
worktree index that is the source of the problem, I think. You might
need to pass "--name-objects".
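With a Git that lacks the newer fsck checks, one rough way to narrow things down is to search each worktree's index for the object ID that gc complained about. This is only a sketch under assumptions: the function name `find_index_refs` is made up here, and it inspects only regular index entries via `git ls-files --stage`, so it would not catch an object referenced solely from the cached-tree or resolve-undo extensions.

```shell
# Print every worktree whose index has an entry pointing at the given
# object ID. Run from inside the repository or any of its worktrees.
find_index_refs() {
    oid=$1
    # "git worktree list --porcelain" emits one "worktree <path>" line
    # per worktree; keep just the paths.
    git worktree list --porcelain | sed -n 's/^worktree //p' |
    while IFS= read -r wt; do
        # --stage prints "<mode> <object> <stage><TAB><path>" per entry.
        if git -C "$wt" ls-files --stage 2>/dev/null | grep -q -- "$oid"; then
            printf '%s\n' "$wt"
        fi
    done
}
```

For the session at the top of the thread this would be invoked as `find_index_refs f5e44b38fc8f7e15e5e6718090d05b09912254fa`; an empty result would suggest the reference lives in an index extension rather than a regular entry.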

> >      (assuming that the file itself is still hanging around).
> > The original corruption bug itself (gc not taking into account worktree
> > index files) has been fixed for a while, so the theory is that this can
> > be lingering corruption from a repack by an older version of Git. But if
> > you have evidence to the contrary, we'd like to hear that, too. ;)
> 
> My suspicion is that the origin of the broken state is elsewhere (maybe
> a power failure?) because the problem appeared "simultaneously" (a few
> days apart, really) for two different repositories.

Hmm. I wouldn't expect that to happen specifically with this worktree
thing, but of course many bets are off with power failures.

-Peff


* Re: `git gc` says "unable to read" but `git fsck` happy
From: Andreas Schwab @ 2023-06-01 12:04 UTC
  To: Jeff King; +Cc: Stefan Monnier, git

On Mar 30 2023, Jeff King wrote:

> On Thu, Mar 30, 2023 at 09:01:39AM -0400, Stefan Monnier wrote:
>
>> > If it is the same problem (which would be a blob or maybe cached tree
>> > missing in one of the worktree's index files), then probably you'd
>> > either:
>> >
>> >   1. Accept the loss and blow away that worktree's index file (or
>> >      perhaps even the whole worktree, and just recreate it).
>> 
>> Hmm... the problem is "that": I have about a hundred worktrees for
>> this repository.
>> But yes, I can just throw away all those `index` files, I guess.
>
> If you try "git fsck" from the tip of master, it should identify the
> worktree index that is the source of the problem, I think. You might
> need to pass "--name-objects".

I had the same problem, and after Junio refreshed my memory by pointing
me to this thread, I updated to the brand new git 2.41 and re-ran git
fsck.  That duly identified problems in two of the worktree indexes
(invalid sha1 pointer in resolve-undo).  After recreating those indexes
there were no more complaints from git gc.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
