Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v5 1/4] gitformat-commit-graph: describe version 2 of BDAT
Date: Wed, 19 Jul 2023 13:25:18 -0400	[thread overview]
Message-ID: <ZLgcfvIrV3TapMAp@nand.local> (raw)
In-Reply-To: <52e281eef0e857de93e5bdc6e0ef50065adaab2c.1689283789.git.jonathantanmy@google.com>

On Thu, Jul 13, 2023 at 02:42:08PM -0700, Jonathan Tan wrote:
> The code change to Git to support version 2 will be done in subsequent
> commits.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/gitformat-commit-graph.txt | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/gitformat-commit-graph.txt b/Documentation/gitformat-commit-graph.txt
> index 31cad585e2..3e906e8030 100644
> --- a/Documentation/gitformat-commit-graph.txt
> +++ b/Documentation/gitformat-commit-graph.txt
> @@ -142,13 +142,16 @@ All multi-byte numbers are in network byte order.
>
>  ==== Bloom Filter Data (ID: {'B', 'D', 'A', 'T'}) [Optional]

This is a little beyond the scope of your series, but since we're
changing the on-disk format here a little bit, I think that it might be
worth it to consider whether there are any other changes that we'd like
to perform at the same time.

One that comes to mind is serializing the `max_changed_paths` value of
the Bloom filter settings, which is currently hard-coded as a constant,
c.f. 97ffa4fab50 (commit-graph.c: store maximum changed paths,
2020-09-17).

We always assume that the value there is 512, or the environment
variable GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS, if it is set. But it
might be nice to write it to disk, since it would allow us to do
something like:

--- 8< ---
diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index fa9d32facfb..a42b0b03cfb 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -178,11 +178,12 @@ test_expect_success 'persist filter settings' '
 	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
 		GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=9 \
 		GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY=15 \
+		GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS=513 \
 		git commit-graph write --reachable --changed-paths &&
-	grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15,\"max_changed_paths\":512" trace2.txt &&
+	grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15,\"max_changed_paths\":513" trace2.txt &&
 	GIT_TRACE2_EVENT="$(pwd)/trace2-auto.txt" \
 		git commit-graph write --reachable --changed-paths &&
-	grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15,\"max_changed_paths\":512" trace2-auto.txt
+	grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15,\"max_changed_paths\":513" trace2-auto.txt
 '

 test_max_changed_paths () {
--- >8 ---

Which is currently not possible (the second grep assertion will fail,
since Git has no way to remember what the value of max_changed_paths is
from the existing commit-graph).

> +	in Probabilistic Verification". Version 1 Bloom filters have a bug that appears
> +	when char is signed and the repository has path names that have characters >=
> +	0x80; Git supports reading and writing them, but this ability will be removed
> +	in a future version of Git.

Makes sense.

Thanks,
Taylor

  reply	other threads:[~2023-07-19 17:25 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-22 21:48 [PATCH 0/2] Changed path filter hash fix and version bump Jonathan Tan
2023-05-22 21:48 ` [PATCH 1/2] t4216: test wrong bloom filter version rejection Jonathan Tan
2023-05-22 21:48 ` [PATCH 2/2] commit-graph: fix murmur3, bump filter ver. to 2 Jonathan Tan
2023-05-23 13:00   ` Derrick Stolee
2023-05-23 23:00     ` Jonathan Tan
2023-05-23 23:51     ` Junio C Hamano
2023-05-24 21:26       ` Jonathan Tan
2023-05-26 13:19         ` Derrick Stolee
2023-05-30 17:26           ` Jonathan Tan
2023-05-23  4:42 ` [PATCH 0/2] Changed path filter hash fix and version bump Junio C Hamano
2023-05-31 23:12 ` [PATCH v2 0/3] " Jonathan Tan
2023-05-31 23:12   ` [PATCH v2 1/3] t4216: test changed path filters with high bit paths Jonathan Tan
2023-05-31 23:12   ` [PATCH v2 2/3] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-05-31 23:12   ` [PATCH v2 3/3] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-03  1:01   ` [PATCH v2 0/3] Changed path filter hash fix and version bump Junio C Hamano
2023-06-03  2:24     ` Junio C Hamano
2023-06-07 16:30       ` Jonathan Tan
2023-06-07 21:37         ` Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 0/4] " Jonathan Tan
2023-06-08 19:21   ` [PATCH v3 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-06-08 19:52     ` Ramsay Jones
2023-06-12 21:26       ` Junio C Hamano
2023-06-08 19:21   ` [PATCH v3 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-06-08 19:21   ` [PATCH v3 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-06-08 19:21   ` [PATCH v3 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-08 19:50   ` [PATCH v3 0/4] Changed path filter hash fix and version bump Ramsay Jones
2023-06-09  0:08     ` Jonathan Tan
2023-06-12 21:31     ` Junio C Hamano
2023-06-13 17:16       ` Jonathan Tan
2023-06-13 17:29         ` [PATCH] CodingGuidelines: use octal escapes, not hex Jonathan Tan
2023-06-13 18:16           ` Eric Sunshine
2023-06-13 18:43             ` Jonathan Tan
2023-06-13 19:15               ` Eric Sunshine
2023-06-13 19:29                 ` Junio C Hamano
2023-06-13 19:16         ` [PATCH v3 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-06-13 17:39 ` [PATCH v4 " Jonathan Tan
2023-06-13 17:39   ` [PATCH v4 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-06-13 21:58     ` Junio C Hamano
2023-06-20 13:22       ` Derrick Stolee
2023-06-21 12:08       ` Taylor Blau
2023-06-22 22:26         ` Jonathan Tan
2023-06-23 13:05           ` Derrick Stolee
2023-06-13 17:39   ` [PATCH v4 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-06-13 17:39   ` [PATCH v4 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-06-20 13:28     ` Derrick Stolee
2023-06-21 12:14     ` Taylor Blau
2023-06-13 17:39   ` [PATCH v4 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-20 13:39     ` Derrick Stolee
2023-06-20 18:37       ` Junio C Hamano
2023-06-13 19:21   ` [PATCH v4 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-06-20 13:43     ` Derrick Stolee
2023-06-20 21:56       ` Jonathan Tan
2023-06-21 12:19       ` Taylor Blau
2023-06-21 17:53         ` Derrick Stolee
2023-06-22 22:27           ` Jonathan Tan
2023-06-23 13:18             ` Derrick Stolee
2023-07-13 21:42 ` [PATCH v5 " Jonathan Tan
2023-07-13 21:42   ` [PATCH v5 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-07-19 17:25     ` Taylor Blau [this message]
2023-07-20 20:20       ` Jonathan Tan
2023-07-21  1:38         ` Taylor Blau
2023-07-13 21:42   ` [PATCH v5 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-07-13 22:50     ` Junio C Hamano
2023-07-19 17:27       ` Taylor Blau
2023-07-19 17:55         ` [PATCH 0/4] commit-graph: avoid looking at Bloom filter data directly Taylor Blau
2023-07-19 17:55           ` [PATCH 1/4] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2023-07-19 17:55           ` [PATCH 2/4] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2023-07-19 17:55           ` [PATCH 3/4] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2023-07-19 17:55           ` [PATCH 4/4] fixup! t4216: test changed path filters with high bit paths Taylor Blau
2023-07-19 19:24           ` [PATCH 0/4] commit-graph: avoid looking at Bloom filter data directly Junio C Hamano
2023-07-20 20:22           ` Jonathan Tan
2023-07-13 21:42   ` [PATCH v5 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-07-19 18:10     ` Taylor Blau
2023-07-20 20:42       ` Jonathan Tan
2023-07-20 21:02         ` Taylor Blau
2023-07-13 21:42   ` [PATCH v5 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-07-19 18:24     ` Taylor Blau
2023-07-20 21:27       ` Jonathan Tan
2023-07-26 23:32         ` Taylor Blau
2023-07-13 22:16   ` [PATCH v5 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-07-13 22:59     ` Junio C Hamano
2023-07-14 18:48     ` Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 0/7] " Jonathan Tan
2023-07-20 21:46   ` [PATCH v6 1/7] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-07-20 21:46   ` [PATCH v6 2/7] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-07-26 23:26     ` Taylor Blau
2023-07-20 21:46   ` [PATCH v6 3/7] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-07-20 21:46   ` [PATCH v6 4/7] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-07-20 21:46   ` [PATCH v6 5/7] t4216: test changed path filters with high bit paths Jonathan Tan
2023-07-26 23:28     ` Taylor Blau
2023-07-20 21:46   ` [PATCH v6 6/7] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-07-20 21:46   ` [PATCH v6 7/7] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-07-25 20:52   ` [PATCH v6 0/7] Changed path filter hash fix and version bump Junio C Hamano
2023-07-26 20:39   ` Junio C Hamano
2023-07-27  0:17     ` Taylor Blau
2023-07-27  0:49       ` Junio C Hamano
2023-07-27 17:39         ` Jonathan Tan
2023-07-27 17:56           ` Taylor Blau
2023-07-27 20:53             ` Jonathan Tan
2023-08-01 18:08               ` Taylor Blau
2023-08-01 18:52                 ` Jonathan Tan
2023-08-03  0:01                 ` Taylor Blau
2023-08-03 13:18                   ` Derrick Stolee
2023-08-03 18:45                     ` Taylor Blau
2023-07-27 18:44         ` Junio C Hamano
2023-08-01 18:41 ` [PATCH v7 " Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 1/7] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 2/7] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 3/7] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 4/7] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 5/7] t4216: test changed path filters with high bit paths Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 6/7] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-08-01 18:41   ` [PATCH v7 7/7] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-08-01 18:44   ` [PATCH v7 0/7] Changed path filter hash fix and version bump Junio C Hamano
2023-08-01 20:58     ` Taylor Blau
2023-08-01 21:07       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZLgcfvIrV3TapMAp@nand.local \
    --to=me@ttaylorr.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).