From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v2 00/12] Stop relying on SHA1 fallback for `the_hash_algo`
Date: Tue, 23 Apr 2024 07:07:20 +0200
Message-ID: <cover.1713848619.git.ps@pks.im>
In-Reply-To: <cover.1713519789.git.ps@pks.im>


Hi,

this is the second version of my patch series that causes us to stop
relying on the SHA1 default hash.

Changes compared to v1:

    - Various typo fixes in commit messages.

    - Added another patch that moves `validate_headref()` into "setup.c"
      to clarify that it is only used during repository discovery.

    - Indented a diff in a commit message so that git-am(1) is happy.
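
For context on the last item, here is a minimal sketch of why an unindented
diff inside a commit message can confuse patch application. It is a simplified
model, not Git's actual parser: the function name `split_at_patch_break` is
hypothetical, and it assumes only that a line beginning with `diff -` is taken
as the start of the patch. The real rules cover more cases.

```python
def split_at_patch_break(mail_body):
    """Split a mail body into (commit message, patch).

    Simplified assumption: the patch starts at the first line that
    begins with "diff -". Indented lines do not match.
    """
    lines = mail_body.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("diff -"):
            return "".join(lines[:i]), "".join(lines[i:])
    return mail_body, ""


# A commit message that quotes a diff at column zero.
unindented = (
    "We print the index line:\n"
    "\n"
    "diff --git a/a b/b\n"
    "index 7898192..6178079 100644\n"
    "\n"
    "We implicitly use SHA1 here.\n"
)

# The same message with the quoted diff indented by four spaces.
indented = unindented.replace("diff --git", "    diff --git").replace(
    "index 7898192", "    index 7898192"
)

message_a, patch_a = split_at_patch_break(unindented)
message_b, patch_b = split_at_patch_break(indented)

# In this model the unindented message is cut short at the quoted diff,
# while the indented one survives intact.
print(repr(message_a))
print(patch_b == "")
```

In this model the trailing explanation of the unindented message ends up in the
"patch" part, which is the kind of breakage the indentation avoids.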

Thanks!

Patrick

Patrick Steinhardt (12):
  path: harden validation of HEAD with non-standard hashes
  path: move `validate_headref()` to its only user
  parse-options-cb: only abbreviate hashes when hash algo is known
  attr: don't recompute default attribute source
  attr: fix BUG() when parsing attrs outside of repo
  remote-curl: fix parsing of detached SHA256 heads
  builtin/rev-parse: allow shortening to more than 40 hex characters
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/bundle: abort "verify" early when there is no repository
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/shortlog: don't set up revisions without repo
  repository: stop setting SHA1 as the default object hash

 attr.c                     | 31 +++++++++++++++-------
 builtin/blame.c            |  5 ++--
 builtin/bundle.c           |  5 ++++
 builtin/diff.c             |  9 +++++++
 builtin/rev-parse.c        |  5 ++--
 builtin/shortlog.c         |  2 +-
 parse-options-cb.c         |  3 ++-
 path.c                     | 53 --------------------------------------
 path.h                     |  1 -
 remote-curl.c              | 19 +++++++++++++-
 repository.c               |  2 --
 setup.c                    | 53 ++++++++++++++++++++++++++++++++++++++
 t/t0003-attributes.sh      | 15 +++++++++++
 t/t0040-parse-options.sh   | 17 ++++++++++++
 t/t1500-rev-parse.sh       |  6 +++++
 t/t5550-http-fetch-dumb.sh | 15 +++++++++++
 16 files changed, 167 insertions(+), 74 deletions(-)

Range-diff against v1:
 1:  aa4d6f508b !  1:  a986b464d3 path: harden validation of HEAD with non-standard hashes
    @@ Commit message
         current version of Git doesn't understand yet. We'd still want to detect
         the repository as proper Git repository in that case, and we will fail
         eventually with a proper error message that the hash isn't understood
    -    when trying to set up the repostiory format.
    +    when trying to set up the repository format.
     
         It follows that we could just leave the current code intact, as in
         practice the code change doesn't have any user visible impact. But it
         also prepares us for `the_hash_algo` being unset when there is no
    -    repositroy.
    +    repository.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
 -:  ---------- >  2:  a347c7e6ca path: move `validate_headref()` to its only user
 2:  5daaaed2b9 !  3:  c0a15b2fa6 parse-options-cb: only abbreviate hashes when hash algo is known
    @@ Commit message
         parse-options-cb: only abbreviate hashes when hash algo is known
     
         The `OPT__ABBREV()` option can be used to add an option that abbreviates
    -    object IDs. When given an length longer than `the_hash_algo->hexsz`,
    -    then it will instead set the length to that maximum length.
    +    object IDs. When given a length longer than `the_hash_algo->hexsz`, then
    +    it will instead set the length to that maximum length.
     
         It may not always be guaranteed that we have `the_hash_algo` initialized
    -    properly as the hash algortihm can only be set up after we have set up
    +    properly as the hash algorithm can only be set up after we have set up
         `the_repository`. In that case, the hash would always be truncated to
         the hex length of SHA1, which may not be what the user desires.
     
 3:  ae91a27ffc !  4:  1b5f904eed attr: don't recompute default attribute source
    @@ Commit message
         variable is the null object ID then we try to look up the attr source,
         otherwise we skip over it.
     
    -    This has approach is flawed though: the variable will never be set to
    +    This approach is flawed though: the variable will never be set to
         anything else but the null object ID in case there is no attr source.
         Consequently, we re-compute the information on every call. And in the
         worst case, when we silently ignore bad trees, this will cause us to try
 4:  53c8e1cd7c =  5:  26909daca4 attr: fix BUG() when parsing attrs outside of repo
 5:  32a429fb60 =  6:  0b99184f50 remote-curl: fix parsing of detached SHA256 heads
 6:  9cb7baa50c =  7:  ccfda3c2d2 builtin/rev-parse: allow shortening to more than 40 hex characters
 7:  e189a4ad15 =  8:  1813e7eb5c builtin/blame: don't access potentially unitialized `the_hash_algo`
 8:  bc4bda3508 =  9:  31182a1fc6 builtin/bundle: abort "verify" early when there is no repository
 9:  39e56dab62 ! 10:  78e19d0a1b builtin/diff: explicitly set hash algo when there is no repo
    @@ Commit message
         hashing the files that we are diffing so that we can print the "index"
         line:
     
    -    ```
    -    diff --git a/a b/b
    -    index 7898192..6178079 100644
    -    --- a/a
    -    +++ b/b
    -    @@ -1 +1 @@
    -    -a
    -    +b
    -    ```
    +        ```
    +        diff --git a/a b/b
    +        index 7898192..6178079 100644
    +        --- a/a
    +        +++ b/b
    +        @@ -1 +1 @@
    +        -a
    +        +b
    +        ```
     
         We implicitly use SHA1 to calculate the hash here, which is because
         `the_repository` gets initialized with SHA1 during the startup routine.
10:  508e28ed1e ! 11:  51bcddbc31 builtin/shortlog: don't set up revisions without repo
    @@ Commit message
         repository in that context, it is thus unsupported to pass any revisions
         as arguments.
     
    -    Reghardless of that we still end up calling `setup_revisions()`. While
    +    Regardless of that we still end up calling `setup_revisions()`. While
         that works alright, it is somewhat strange. Furthermore, this is about
         to cause problems when we unset the default object hash.
     
11:  f86a6ff3ba = 12:  e8126371e1 repository: stop setting SHA1 as the default object hash
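
The commit messages quoted in the range-diff above touch on two related
details: the "index" line of a diff is built from object hashes of the file
contents, and the maximum abbreviation length depends on the hash algorithm.
The following is a rough sketch of both, not Git's implementation. The helper
names `blob_id` and `clamp_abbrev_len` and the `HEXSZ` table are hypothetical;
the sketch assumes the usual blob encoding of a `blob <size>\0` header followed
by the content.

```python
import hashlib

# Hex digest lengths of the two hash algorithms discussed in this series.
HEXSZ = {"sha1": 40, "sha256": 64}


def blob_id(content, algo="sha1"):
    """Hash file content the way a blob object is hashed (assumed encoding)."""
    header = b"blob %d\0" % len(content)
    return hashlib.new(algo, header + content).hexdigest()


def clamp_abbrev_len(requested, algo=None):
    """Clamp an abbreviation length to the algorithm's hex size.

    Hypothetical model of the behaviour described for patch 3: when no
    algorithm is known yet, the requested length is left untouched
    instead of being truncated to the SHA1 length.
    """
    if algo is None:
        return requested
    return min(requested, HEXSZ[algo])


# Files "a" and "b" from the quoted diff contain "a\n" and "b\n".
old = blob_id(b"a\n")
new = blob_id(b"b\n")

# The abbreviated IDs should correspond to the quoted
# "index 7898192..6178079" line when SHA1 is used.
print("index %s..%s" % (old[:7], new[:7]))

# With SHA256 the same content hashes to a longer, different ID.
print(len(blob_id(b"a\n", "sha256")))

# Requesting more than the maximum only clamps when the algorithm is known.
print(clamp_abbrev_len(100, "sha1"), clamp_abbrev_len(100, "sha256"),
      clamp_abbrev_len(100))
```

The point of the sketch is that the "index" line changes with the hash
algorithm, so code running without a repository has to pick one explicitly
rather than inherit SHA1 by accident.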

base-commit: 21306a098c3f174ad4c2a5cddb9069ee27a548b0
-- 
2.45.0-rc0



Thread overview: 67+ messages
2024-04-19  9:51 [PATCH 00/11] Stop relying on SHA1 fallback for `the_hash_algo` Patrick Steinhardt
2024-04-19  9:51 ` [PATCH 01/11] path: harden validation of HEAD with non-standard hashes Patrick Steinhardt
2024-04-19 19:03   ` brian m. carlson
2024-04-22  4:56     ` Patrick Steinhardt
2024-04-22 16:15   ` Junio C Hamano
2024-04-23  4:50     ` Patrick Steinhardt
2024-04-23 16:54       ` Junio C Hamano
2024-04-19  9:51 ` [PATCH 02/11] parse-options-cb: only abbreviate hashes when hash algo is known Patrick Steinhardt
2024-04-23  0:30   ` Justin Tobler
2024-04-19  9:51 ` [PATCH 03/11] attr: don't recompute default attribute source Patrick Steinhardt
2024-04-23  0:32   ` Justin Tobler
2024-04-19  9:51 ` [PATCH 04/11] attr: fix BUG() when parsing attrs outside of repo Patrick Steinhardt
2024-04-19  9:51 ` [PATCH 05/11] remote-curl: fix parsing of detached SHA256 heads Patrick Steinhardt
2024-04-19  9:51 ` [PATCH 06/11] builtin/rev-parse: allow shortening to more than 40 hex characters Patrick Steinhardt
2024-04-19  9:51 ` [PATCH 07/11] builtin/blame: don't access potentially unitialized `the_hash_algo` Patrick Steinhardt
2024-04-19  9:51 ` [PATCH 08/11] builtin/bundle: abort "verify" early when there is no repository Patrick Steinhardt
2024-04-19  9:51 ` [PATCH 09/11] builtin/diff: explicitly set hash algo when there is no repo Patrick Steinhardt
2024-04-22 18:41   ` Junio C Hamano
2024-04-19  9:51 ` [PATCH 10/11] builtin/shortlog: don't set up revisions without repo Patrick Steinhardt
2024-04-23  0:35   ` Justin Tobler
2024-04-19  9:51 ` [PATCH 11/11] repository: stop setting SHA1 as the default object hash Patrick Steinhardt
2024-04-19 19:12 ` [PATCH 00/11] Stop relying on SHA1 fallback for `the_hash_algo` brian m. carlson
2024-04-19 19:16   ` Junio C Hamano
2024-04-22  4:56   ` Patrick Steinhardt
2024-04-23  5:07 ` Patrick Steinhardt [this message]
2024-04-23  5:07   ` [PATCH v2 01/12] path: harden validation of HEAD with non-standard hashes Patrick Steinhardt
2024-04-23  5:07   ` [PATCH v2 02/12] path: move `validate_headref()` to its only user Patrick Steinhardt
2024-04-23  5:07   ` [PATCH v2 03/12] parse-options-cb: only abbreviate hashes when hash algo is known Patrick Steinhardt
2024-04-23  5:07   ` [PATCH v2 04/12] attr: don't recompute default attribute source Patrick Steinhardt
2024-04-23  5:07   ` [PATCH v2 05/12] attr: fix BUG() when parsing attrs outside of repo Patrick Steinhardt
2024-04-23  5:07   ` [PATCH v2 06/12] remote-curl: fix parsing of detached SHA256 heads Patrick Steinhardt
2024-04-23  5:07   ` [PATCH v2 07/12] builtin/rev-parse: allow shortening to more than 40 hex characters Patrick Steinhardt
2024-04-23  5:08   ` [PATCH v2 08/12] builtin/blame: don't access potentially unitialized `the_hash_algo` Patrick Steinhardt
2024-04-23  5:08   ` [PATCH v2 09/12] builtin/bundle: abort "verify" early when there is no repository Patrick Steinhardt
2024-04-23  5:08   ` [PATCH v2 10/12] builtin/diff: explicitly set hash algo when there is no repo Patrick Steinhardt
2024-04-23  5:08   ` [PATCH v2 11/12] builtin/shortlog: don't set up revisions without repo Patrick Steinhardt
2024-04-23  5:08   ` [PATCH v2 12/12] repository: stop setting SHA1 as the default object hash Patrick Steinhardt
2024-04-27 22:09   ` [PATCH v2 00/12] Stop relying on SHA1 fallback for `the_hash_algo` Junio C Hamano
2024-04-29  6:05     ` Patrick Steinhardt
2024-04-29  6:34 ` [PATCH v3 00/13] " Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 01/13] path: harden validation of HEAD with non-standard hashes Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 02/13] path: move `validate_headref()` to its only user Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 03/13] parse-options-cb: only abbreviate hashes when hash algo is known Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 04/13] attr: don't recompute default attribute source Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 05/13] attr: fix BUG() when parsing attrs outside of repo Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 06/13] remote-curl: fix parsing of detached SHA256 heads Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 07/13] builtin/rev-parse: allow shortening to more than 40 hex characters Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 08/13] builtin/blame: don't access potentially unitialized `the_hash_algo` Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 09/13] builtin/bundle: abort "verify" early when there is no repository Patrick Steinhardt
2024-04-29  6:34   ` [PATCH v3 10/13] builtin/diff: explicitly set hash algo when there is no repo Patrick Steinhardt
2024-04-29  6:35   ` [PATCH v3 11/13] builtin/shortlog: don't set up revisions without repo Patrick Steinhardt
2024-04-29  6:35   ` [PATCH v3 12/13] oss-fuzz/commit-graph: set up hash algorithm Patrick Steinhardt
2024-04-29  6:35   ` [PATCH v3 13/13] repository: stop setting SHA1 as the default object hash Patrick Steinhardt
2024-05-07  4:52 ` [PATCH v4 00/13] Stop relying on SHA1 fallback for `the_hash_algo` Patrick Steinhardt
2024-05-07  4:52   ` [PATCH v4 01/13] path: harden validation of HEAD with non-standard hashes Patrick Steinhardt
2024-05-07  4:52   ` [PATCH v4 02/13] path: move `validate_headref()` to its only user Patrick Steinhardt
2024-05-07  4:52   ` [PATCH v4 03/13] parse-options-cb: only abbreviate hashes when hash algo is known Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 04/13] attr: don't recompute default attribute source Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 05/13] attr: fix BUG() when parsing attrs outside of repo Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 06/13] remote-curl: fix parsing of detached SHA256 heads Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 07/13] builtin/rev-parse: allow shortening to more than 40 hex characters Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 08/13] builtin/blame: don't access potentially unitialized `the_hash_algo` Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 09/13] builtin/bundle: abort "verify" early when there is no repository Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 10/13] builtin/diff: explicitly set hash algo when there is no repo Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 11/13] builtin/shortlog: don't set up revisions without repo Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 12/13] oss-fuzz/commit-graph: set up hash algorithm Patrick Steinhardt
2024-05-07  4:53   ` [PATCH v4 13/13] repository: stop setting SHA1 as the default object hash Patrick Steinhardt
