Git Mailing List Archive mirror
From: Adam Majer <adamm@zombino.com>
To: Felipe Contreras <felipe.contreras@gmail.com>,
	Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>, git@vger.kernel.org
Subject: Re: Is GIT_DEFAULT_HASH flawed?
Date: Wed, 3 May 2023 11:03:47 +0200
Message-ID: <70103746-6980-baed-13d9-afeae6cee464@zombino.com>
In-Reply-To: <6451a0ba5c3fb_200ae2945b@chronos.notmuch>

On 5/3/23 01:46, Felipe Contreras wrote:
> To be honest this whole approach seems to be completely flawed to me and
> against the whole design of git in the first place.

The discussion above is mostly moot now, since this has been fixed in
later patches in this thread, AFAIK. It's also moot for other reasons,
such as the hash function transition plan not really being implemented
yet.

Also, this was about a corner case, as it often is.


> In a recent email Linus Torvalds explained why object ids were
> calculated based {type, size, data} [1], and he explained very clearly
> that two objects with exactly the same data are not supposed to have the
> same id if the type is different.

This is different. But as an aside, hashing type + size + data is not
really much different from hashing just the data. There are plenty of
hash collisions where

     HASH(type + size + data) == HASH(type + size + data')

by definition of how these functions work. The problem is always in 
finding these collisions. But anyway...
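
For reference, here is a minimal sketch of how an object ID is derived
from {type, size, data}, assuming the usual "<type> <size>\0" header
in front of the raw data. The helper name `object_id` is made up for
illustration; it is not a git API.

```python
import hashlib


def object_id(obj_type: str, data: bytes, algo: str = "sha1") -> str:
    # Header is "<type> <size>" followed by a NUL byte, then the raw data.
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.new(algo, header + data).hexdigest()


# Same (empty) data, different type: the IDs differ.
empty_blob = object_id("blob", b"")
empty_tree = object_id("tree", b"")
print(empty_blob)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(empty_tree)  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The type and size only change which input gets hashed. They do not
change the collision properties of the hash function itself.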

> In my view one repository should be able to have part SHA-1 history,
> part SHA3-256 history, and part BLAKE2b history.

Yes, that would be great. Please provide a patch series for this :-)

> I have not been following the SHA-1 -> OID discussions, but I
> distinctively recall Linus Torvalds mentioning that the choice of using
> SHA-1 wasn't even for security purposes, it was to ensure integrity.

These are different sides of the same coin. Hashes are used to provide 
integrity. Hashes like MD4, MD5, SHA1, SHA256 are there for integrity. 
Some of these are no longer recommended and some are completely broken.

> Better the SHA-1 you know, than the SHA-256 you don't.

Wrong conclusion ;) Also, we know SHA-256.

The problem in git-core and virtually all clients and other 
implementations is/was that SHA1 was hardcoded and assumed to be THE ONE 
and ONLY hash. It will take quite a bit of work outside of git-core to 
remove this one assumption (remember two-digit years and 2000? - yes I'm
old). Once this hash assumption is removed, you can start talking about 
adding other hashes and interop.

Keep in mind -- hashes are there for object reference. They are the glue 
in git. But there is really nothing stopping us from recalculating them
"on the fly". If you have a SHA1 repo, you can calculate a SHA256 or
whatever hash for any type of object. That's not the problem, conceptually
speaking.
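
As a rough sketch of the "on the fly" idea, assuming blobs only: the
same content can be hashed under both algorithms to build a SHA1 to
SHA256 lookup table. The names `blob_id` and `sha1_to_sha256_map` are
hypothetical, not anything in git-core.

```python
import hashlib


def blob_id(data: bytes, algo: str) -> str:
    # Blob ID: hash of "blob <size>\0" followed by the raw content.
    header = f"blob {len(data)}\0".encode("ascii")
    return hashlib.new(algo, header + data).hexdigest()


def sha1_to_sha256_map(blobs):
    # Map each blob's SHA1 ID to the SHA256 ID of the same content.
    return {blob_id(d, "sha1"): blob_id(d, "sha256") for d in blobs}


mapping = sha1_to_sha256_map([b"", b"hello world\n"])
for old, new in mapping.items():
    print(old, "->", new)
```

Trees and commits embed the IDs of the objects they point to, so their
contents would have to be rewritten with the new IDs before hashing.
The sketch deliberately stops at blobs.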

Finally, let's not have a "bike shed" discussion about this.
GIT_DEFAULT_HASH is meant to be used by `git init` in lieu of the
--object-format parameter, so it's not flawed. When used in other
applications, it probably indicates a bug. But we can't fix all the bugs 
at once :-)
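
To make the intended precedence concrete, here is a small model of it,
not git's actual code. It assumes an explicit --object-format wins,
then GIT_DEFAULT_HASH, then the built-in "sha1" default. The function
`resolve_object_format` is hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_object_format(cli_option: Optional[str],
                          env: Mapping[str, str] = os.environ) -> str:
    # An explicit --object-format value takes priority.
    if cli_option is not None:
        return cli_option
    # Otherwise fall back to GIT_DEFAULT_HASH, then to the assumed default.
    return env.get("GIT_DEFAULT_HASH", "sha1")


print(resolve_object_format(None, {"GIT_DEFAULT_HASH": "sha256"}))
print(resolve_object_format("sha1", {"GIT_DEFAULT_HASH": "sha256"}))
print(resolve_object_format(None, {}))
```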

Cheers,
- Adam
