Git Mailing List Archive mirror
* [GSoC] Proposal Discussion: git-refs Project
@ 2025-03-23 13:36 Yuting Zheng
  2025-03-24 12:02 ` Patrick Steinhardt
  2025-03-29 15:02 ` [GSoC] git-refs proposal draft Zheng Yuting
  0 siblings, 2 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-03-23 13:36 UTC (permalink / raw)
  To: git

Dear Git Community,

I am very interested in applying for the GSoC 2025 project "Consolidate
ref-related functionality into git-refs". I have reviewed the relevant
code, documentation, and mailing lists, and as part of the application
prerequisites, I have submitted a microproject patch
(https://lore.kernel.org/git/20250323022111.20226-1-05ZYT30@gmail.com/).

My current idea is to extend the `git-refs` command—by calling into the
existing code—to add subcommands. This approach would replace the
functionalities of the mentioned commands while ensuring that I do not
modify the code underlying them. This guarantees that the new `git-refs`
subcommand meets the new requirements without affecting the usage of the
existing commands.

However, when searching the mailing lists with keywords
“nq:consolidate ref” and “s: refs”, I did not find any discussion about
merging these commands. If anyone has come across any previous discussions
or could kindly provide additional insights on this matter, I would greatly
appreciate your help.

Thank you for your guidance.

Best regards,
Zheng Yuting

* Re: [GSoC] Proposal Discussion: git-refs Project
  2025-03-23 13:36 [GSoC] Proposal Discussion: git-refs Project Yuting Zheng
@ 2025-03-24 12:02 ` Patrick Steinhardt
  2025-03-27  2:26   ` Yuting Zheng
  2025-03-29 15:02 ` [GSoC] git-refs proposal draft Zheng Yuting
  1 sibling, 1 reply; 17+ messages in thread
From: Patrick Steinhardt @ 2025-03-24 12:02 UTC (permalink / raw)
  To: Yuting Zheng; +Cc: git

Hi Yuting,

On Sun, Mar 23, 2025 at 09:36:51PM +0800, Yuting Zheng wrote:
> Dear Git Community,
> 
> I am very interested in applying for the GSoC 2025 project "Consolidate
> ref-related functionality into git-refs". I have reviewed the relevant
> code, documentation, and mailing lists, and as part of the application
> prerequisites, I have submitted a microproject patch
> (https://lore.kernel.org/git/20250323022111.20226-1-05ZYT30@gmail.com/).
> 
> My current idea is to extend the `git-refs` command—by calling into the
> existing code—to add subcommands. This approach would replace the
> functionalities of the mentioned commands while ensuring that I do not
> modify the code underlying them. This guarantees that the new `git-refs`
> subcommand meets the new requirements without affecting the usage of the
> existing commands.
> 
> However, when searching the mailing lists with keywords
> “nq:consolidate ref” and “s: refs”, I did not find any discussion about
> merging these commands. If anyone has come across any previous discussions
> or could kindly provide additional insights on this matter, I would greatly
> appreciate your help.
> 
> Thank you for your guidance.

I have been chatting with Peff about this topic quite a while ago, but
that was mostly an in-person chat that hasn't made it onto the mailing
list. I may also have mentioned on the mailing list on several occasions
that it would make sense to consolidate, but there wasn't ever a bigger
discussion around all of this. There's also [1] as a non-authoritative
source for this project that documents my intent to consolidate the
commands.

So ultimately there hasn't been a lot of discussion yet around this
whole thing. Driving consensus and designing the new interface would
thus be one of the biggest challenges in this project from my point of
view.

I'm happy to provide more feedback once an initial draft has been
created for what the project could look like!

Thanks.

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/issues/330

* Re: [GSoC] Proposal Discussion: git-refs Project
  2025-03-24 12:02 ` Patrick Steinhardt
@ 2025-03-27  2:26   ` Yuting Zheng
  2025-03-28 13:45     ` shejialuo
  0 siblings, 1 reply; 17+ messages in thread
From: Yuting Zheng @ 2025-03-27  2:26 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Thanks for your reply!

I have reviewed the changelog and noted that Git version 2.23
introduced similar work through the addition of the git-switch and
git-restore commands, which replace some legacy commands and incorporate
various functional modifications.

After examining the updates, I have summarized the proposed work as
follows and would appreciate confirmation on whether these tasks are to be
included in the current project:

1. Code Modifications for Command Implementation:

- Implementation of new commands.
- Necessary modifications to existing commands to support these changes.

2. Test Modifications:

- Addition of tests for the new features (including help tests, basic
functionality tests, and extended feature tests).
- Updating tests for old commands to execute tests on the new commands
(for example, changing the command in git-checkout tests to git-restore).

3. Documentation Updates:

- Creating documentation for the new commands.
- Updating and unifying existing documentation (including git.txt,
git-cli.txt, and git-commit.txt).

Additionally, I have a few points that require further discussion:

1. Command Migration:

Upon reviewing the commands slated for replacement (e.g., git-update-ref(1),
git-for-each-ref(1), git-show-ref(1), git-pack-refs(1), and
git-symbolic-ref), it seems that migrating their functionality into a
subcommand of git-refs could be sufficient. Could you please confirm if
this approach meets our project requirements without introducing
additional functionality?

2. Function Call Integration:

Regarding migration, is it acceptable to directly invoke the legacy command
functions by passing parameters from the new command functions?

3. Test Retention:

Lastly, should we retain the original tests for the legacy commands, or
should they be fully replaced with tests for the new implementations?

I appreciate your guidance and look forward to your feedback on these points.

Best regards,
Zheng Yuting

* Re: [GSoC] Proposal Discussion: git-refs Project
  2025-03-27  2:26   ` Yuting Zheng
@ 2025-03-28 13:45     ` shejialuo
  2025-03-29 14:54       ` Yuting Zheng
  0 siblings, 1 reply; 17+ messages in thread
From: shejialuo @ 2025-03-28 13:45 UTC (permalink / raw)
  To: Yuting Zheng; +Cc: Patrick Steinhardt, git

On Thu, Mar 27, 2025 at 10:26:49AM +0800, Yuting Zheng wrote:
> Thanks for your reply!
> 
> I have reviewed the changelog and noted that Git version 2.23
> introduced similar work through the addition of the git-switch and
> git-restore commands, which replace some legacy commands and incorporate
> various functional modifications.
> 
> After examining the updates, I have summarized the proposed work as
> follows and would appreciate confirmation on whether these tasks are to be
> included in the current project:
> 
> 1. Code Modifications for Command Implementation:
> 
> - Implementation of new commands.
> - Necessary modifications to existing commands to support these changes.

I think "modifications to existing commands" may not be accurate. What
we need to do is reuse the original logic as much as possible, which
requires:

1. Understanding the behavior of the existing commands.
2. Finding a good design that exposes common interfaces to both the new
commands and the existing commands.

> 
> 2. Test Modifications:
> 
> - Addition of tests for the new features (including help tests, basic
> functionality tests, and extended feature tests).
> - Updating tests for old commands to execute tests on the new commands
> (for example, changing the command in git-checkout tests to git-restore).

I don't think that we should update tests for old commands. We want to
make sure the original commands don't break, right? So we should keep
the original tests and use them to exercise your changed code to make
sure that everything is OK.

> 
> 3. Documentation Updates:
> 
> - Creating documentation for the new commands.
> - Updating and unifying existing documentation (including git.txt,
> git-cli.txt, and git-commit.txt).
> 
> Additionally, I have a few points that require further discussion:
> 
> 1. Command Migration:
> 
> Upon reviewing the commands slated for replacement (e.g., git-update-ref(1),
> git-for-each-ref(1), git-show-ref(1), git-pack-refs(1), and
> git-symbolic-ref), it seems that migrating their functionality into a
> subcommand of git-refs could be sufficient. Could you please confirm if
> this approach meets our project requirements without introducing
> additional functionality?
> 

From my own understanding, we just want to use "git-refs(1)" as an entry
point for all operations on refs. So, we don't need to add new
functionality in this project.

> 2. Function Call Integration:
> 
> Regarding migration, is it acceptable to directly invoke the legacy command
> functions by passing parameters from the new command functions?
> 

So, are you asking whether we could use a subprocess to just invoke the
legacy command? I don't think we should use a subprocess. If we could,
would this even qualify as a project?

I think you may want to first look at "git-pack-refs(1)" or something
similar that is not so complicated and think about a solution. And when
writing the proposal, you may need to talk about how many commands you
want to migrate and how you plan to migrate them.

> 3. Test Retention:
> 
> Lastly, should we retain the original tests for the legacy commands, or
> should they be fully replaced with tests for the new implementations?
> 

This is a good question. From my view, we should not change the original
tests. And this raises another question: if we add new tests for the new
command, we'd introduce repetition. I cannot give you an answer here
because I don't have experience with this.

> I appreciate your guidance and look forward to your feedback on these points.
> 
> Best regards,
> Zheng Yuting

Thanks,
Jialuo

* Re: [GSoC] Proposal Discussion: git-refs Project
  2025-03-28 13:45     ` shejialuo
@ 2025-03-29 14:54       ` Yuting Zheng
  0 siblings, 0 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-03-29 14:54 UTC (permalink / raw)
  To: shejialuo; +Cc: Patrick Steinhardt, git

Thank you for clarifying my misunderstandings—some phrasing issues might
stem from my non-native English. I’ve revised the proposal draft
accordingly and will share it directly in this mailing list thread for
your review.

* [GSoC] git-refs proposal draft
  2025-03-23 13:36 [GSoC] Proposal Discussion: git-refs Project Yuting Zheng
  2025-03-24 12:02 ` Patrick Steinhardt
@ 2025-03-29 15:02 ` Zheng Yuting
  2025-03-31  9:42   ` Patrick Steinhardt
                     ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Zheng Yuting @ 2025-03-29 15:02 UTC (permalink / raw)
  To: 05zyt30; +Cc: git

## Name and Contact Information

- Full Name: Zheng Yuting
- Email Address: 05ZYT30@gmail.com
- Time Zone: UTC +8:00

---

## Abstract

The current Git reference management functionality is fragmented across
multiple independent commands (git-show-ref, git-for-each-ref,
git-update-ref, git-pack-refs, git-check-ref-format, and
git-symbolic-ref), leading to code redundancy and increased maintenance
costs. Based on Patrick Steinhardt’s integration vision[1], this project
aims to introduce 8 new subcommands (list, exists, show, resolve, pack,
update, delete, check-format) under the existing git-refs command to
achieve the following objectives:

- Feature Integration: Consolidate existing reference management
  commands under git-refs, while maintaining backward compatibility.
- Feature Enhancement: Introduce recursion depth control for git-refs
  resolve.
- Testing & Documentation: Add test cases ensuring consistency and
  update relevant documentation.

---

## Implementation Plan

### Command Integration Strategy

#### Design Goals

The project will unify scattered reference management functionalities
under the git-refs subcommand framework, ensuring:

1. Complete Feature Coverage: Each subcommand fully replaces its
   corresponding legacy command.
2. Parameter Compatibility: Preserve the semantics and output behavior
   of legacy command options.
3. Code Reusability: Minimize redundancy by sharing underlying modules
   (e.g., refs/files-backend.c).

#### Subcommand Mapping

- git-refs list
  Replaces git-show-ref and git-for-each-ref, merging reference listing
  functionalities with support for formatting (--format), filtering
  (--heads, --tags), and sorting (--sort).
- git-refs exists
  Replaces git-show-ref --exists, providing reference existence checks
  with positive (<ref>) and exclusion-based (--exclude-existing)
  verification.
- git-refs show
  Replaces git-show-ref --verify, validating reference correctness with
  a strict mode (--strict).
- git-refs resolve
  Replaces git-symbolic-ref, resolving symbolic references with added
  recursion depth control (--max-depth), while retaining deletion (-d)
  and quiet mode (-q) options.
- git-refs pack
  Replaces git-pack-refs, packing loose references with support for
  filtering (--include, --exclude) and automatic cleanup (--prune).
- git-refs update
  Replaces git-update-ref, providing transactional reference updates
  with batch processing (--stdin) and atomic guarantees.
- git-refs delete
  Separates the delete functionality from git-update-ref, ensuring
  explicit handling of reference removals with safety checks and batch
  operations (--stdin).
- git-refs check-format
  Replaces git-check-ref-format, validating reference format with
  support for normalized output (--normalize).

#### Implementation Strategy

1. Option Parsing: Each subcommand will reuse the argument parsing
   logic from legacy commands (e.g., git-pack-refs --prune).
2. Shared Backend Logic: Calls to common functions in refs/ (e.g.,
   reference traversal, locking mechanisms).
3. Error Consistency: Maintain the same error codes and message
   formats as legacy commands.

---

### Example: Implementing git-refs pack

#### Functional Implementation

1. Modify builtin/refs.c:
   - Add cmd_refs_pack function implementing git-pack-refs logic.
   - Update cmd_refs to include pack with
     OPT_SUBCOMMAND("pack", &fn, cmd_refs_pack).
   - Define REFS_PACK_USAGE:
     git refs pack [--all] [--no-prune] [--auto] [--include <pattern>]
     [--exclude <pattern>].
2. Register New Subcommand in git.c:
   - Add { "refs-pack", cmd_refs_pack }, to the command array.
3. Reuse refs/files-backend.c Logic:
   - Ensure cmd_refs_pack calls pack_refs correctly, adjusting as
     necessary for new options.

#### Testing Plan

- Test Cases:
  Add t/txxx-refs-pack.sh, leveraging t/t0601-reffiles-pack-refs.sh
  scenarios to verify:
  - --prune removes obsolete references correctly.
  - --include and --exclude apply filtering as expected.
  - Packed references match legacy command outputs (diff .git/packed-refs).
- Performance Benchmarking (if needed):
  Add performance tests in t/perf to ensure no significant regression
  in execution time or memory usage.

#### Documentation Updates

- User Manual:
  Add a pack section to Documentation/git-refs.txt, mapping options to
  legacy command equivalents.
- Developer Notes:
  Comment code to highlight functional parity between git-refs pack
  and git-pack-refs.

---

### Timeline

- May 8 - May 11 (4 days): Initial Testing & Subcommand Framework Setup
- May 12 - May 28 (17 days): pack Subcommand Implementation
- May 29 - June 14 (17 days): check-format Subcommand Development
- June 15 - July 5 (21 days): update and delete Subcommands Development
- July 6 - July 26 (21 days): show and exists Subcommands Development
- July 27 - August 16 (21 days): resolve Subcommand Implementation
- August 17 - September 6 (21 days): list Subcommand Implementation
- September 7 - September 16 (10 days): Mid-term Review
- September 17 - September 23 (7 days): Mentor Review & Final Adjustments

---

## Background & Experience

I graduated in June 2024 from Wenzhou University with a degree in
Network Engineering. My experience includes C programming and
command-line tool development, along with proficiency in Shell
scripting. I am currently in a transitional phase and expect to finalize
my schedule by late April, after which I will update my weekly schedule
for GSoC. For now, I estimate 25-30 hours per week for this project.

### Project Experience

- One Student One Chip Project[2]
  Extending the open-source NEMU simulator by implementing CPU cycle
  functionalities in C.
- Web Development
  Developed a Django-based campus website, including user chat, news
  publishing, and teacher management modules.
- Custom Communication Protocols
  Built a UDP-based chatroom with peer-to-peer and group messaging.
- Stock Monitoring Tool
  Implemented real-time monitoring and historical data analysis, with
  email alerting and planned AI-driven strategy optimization.

I have also obtained CCNA certification and gained hands-on experience
as a network engineer. Additionally, I contributed a patch optimizing
send-email functionality in Git[3], giving me insights into the Git
codebase.

## Appendix

[1] https://gitlab.com/gitlab-org/git/-/issues/330
[2] https://ysyx.oscc.cc/en/project/intro.html
[3] https://lore.kernel.org/git/20250312064639.668875-1-05ZYT30@gmail.com/

* Re: [GSoC] git-refs proposal draft
  2025-03-29 15:02 ` [GSoC] git-refs proposal draft Zheng Yuting
@ 2025-03-31  9:42   ` Patrick Steinhardt
  2025-04-01 13:37     ` Yuting Zheng
  2025-04-03 15:44   ` Discussion on git-refs list Implementation and Possible Approaches Zheng Yuting
  2025-04-06  6:08   ` [GSoC] git-refs proposal v2 Yuting Zheng
  2 siblings, 1 reply; 17+ messages in thread
From: Patrick Steinhardt @ 2025-03-31  9:42 UTC (permalink / raw)
  To: Zheng Yuting; +Cc: git

On Sat, Mar 29, 2025 at 11:02:46PM +0800, Zheng Yuting wrote:
> ## Name and Contact Information
> 
> - Full Name: Zheng Yuting
> - Email Address: 05ZYT30@gmail.com
> - Time Zone: UTC +8:00
> 
> ---
> 
> ## Abstract
> 
> The current Git reference management functionality is fragmented across
> multiple independent commands (git-show-ref, git-for-each-ref,
> git-update-ref, git-pack-refs, git-check-ref-format, and
> git-symbolic-ref), leading to code redundancy and increased maintenance
> costs. Based on Patrick Steinhardt’s integration vision[1], this project
> aims to introduce 8 new subcommands (list, exists, show, resolve, pack,
> update, delete, check-format) under the existing git-refs command to
> achieve the following objectives:

I have a couple of opinions on the exact naming of the subcommands, more
on that below.

In any case, I don't think the naming and how exactly each of these
commands should look and work needs to be hashed out in this
document. It's nice to scope out _what_ we want to achieve and propose
what this could look like, but ultimately I think that most of the design
should happen during the project itself.

> - Feature Integration: Consolidate existing reference management
>   commands under git-refs, while maintaining backward compatibility.
> - Feature Enhancement: Introduce recursion depth control for git-refs
>   resolve.
> - Testing & Documentation: Add test cases ensuring consistency and
>   update relevant documentation.
> 
> ---
> 
> ## Implementation Plan
> 
> ### Command Integration Strategy
> 
> #### Design Goals
> 
> The project will unify scattered reference management functionalities
> under the git-refs subcommand framework, ensuring:
> 
> 1. Complete Feature Coverage: Each subcommand fully replaces its
>    corresponding legacy command.
> 2. Parameter Compatibility: Preserve the semantics and output behavior
>    of legacy command options.

This one is something that is up for debate. While I do expect that most
of the commands should retain their current semantics and options, we
could also use this as an opportunity to think whether there are any
issues with the current design and improve upon it.

> 3. Code Reusability: Minimize redundancy by sharing underlying modules
>    (e.g., refs/files-backend.c).
> 
> #### Subcommand Mapping
> 
> - git-refs list
>   Replaces git-show-ref and git-for-each-ref, merging reference listing
>   functionalities with support for formatting (--format), filtering
>   (--heads, --tags), and sorting (--sort).

Yup. One thing to note is that git-show-ref(1) and git-for-each-ref(1)
are very similar, but not quite the same. One should find good arguments
for which of the two semantics is preferable to us and why that is.

For example, git-show-ref(1) outperforms git-for-each-ref(1) due to the
default format:

    Benchmark 1: git show-ref
      Time (mean ± σ):      99.0 ms ±   0.5 ms    [User: 55.6 ms, System: 43.0 ms]
      Range (min … max):    98.0 ms … 100.8 ms    100 runs

    Benchmark 2: git for-each-ref
      Time (mean ± σ):     134.0 ms ±   0.6 ms    [User: 82.3 ms, System: 50.8 ms]
      Range (min … max):   132.7 ms … 135.8 ms    100 runs

    Summary
      git show-ref ran
        1.35 ± 0.01 times faster than git for-each-ref

> - git-refs exists
>   Replaces git-show-ref --exists, providing reference existence checks
>   with positive (<ref>) and exclusion-based (--exclude-existing)
>   verification.

I'm not quite clear on what an exclusion-based existence check is. How do
you check whether something exists when you exclude it? I don't think
that this option is relevant in the context of `git refs exists`.

> - git-refs show
>   Replaces git-show-ref --verify, validating reference correctness with
>   a strict mode (--strict).

Yup. In contrast to `git refs resolve` this command shouldn't resolve
the ref, but directly show what it's pointing to. And this should be
true for both symbolic and normal refs.

> - git-refs resolve
>   Replaces git-symbolic-ref, resolving symbolic references with added
>   recursion depth control (--max-depth), while retaining deletion (-d)
>   and quiet mode (-q) options.

Not quite. The difference to `git refs show` is that this command always
resolves the ref to an object. So it's rather more similar to `git
rev-parse --verify`, except that it only ever handles references.
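
To make the distinction concrete, here is a toy sketch in plain C. It
assumes a made-up in-memory ref table; `toy_refs_show`,
`toy_refs_resolve`, the `max_depth` parameter, and the placeholder object
ID are all hypothetical names for illustration and are not part of Git's
API:

```c
#include <stddef.h>
#include <string.h>

struct toy_ref {
	const char *name;
	int is_symbolic;	/* 1: value names another ref; 0: value is an object ID */
	const char *value;
};

/* Hypothetical in-memory table standing in for a real ref store. */
static const struct toy_ref toy_ref_table[] = {
	{ "HEAD", 1, "refs/heads/main" },
	{ "refs/heads/main", 0, "1111111111111111111111111111111111111111" },
	{ "refs/loop/a", 1, "refs/loop/b" },
	{ "refs/loop/b", 1, "refs/loop/a" },
};

static const struct toy_ref *toy_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_ref_table) / sizeof(toy_ref_table[0]); i++)
		if (!strcmp(toy_ref_table[i].name, name))
			return &toy_ref_table[i];
	return NULL;
}

/* "show": report what the ref itself points to, without following symrefs. */
const char *toy_refs_show(const char *name)
{
	const struct toy_ref *ref = toy_lookup(name);

	return ref ? ref->value : NULL;
}

/*
 * "resolve": follow symrefs until an object ID is reached. Returns NULL
 * for a missing ref or when more than max_depth symref hops are needed
 * (which also catches loops).
 */
const char *toy_refs_resolve(const char *name, int max_depth)
{
	int depth;

	for (depth = 0; depth <= max_depth; depth++) {
		const struct toy_ref *ref = toy_lookup(name);

		if (!ref)
			return NULL;
		if (!ref->is_symbolic)
			return ref->value;
		name = ref->value;
	}
	return NULL;
}
```

In this toy, showing "HEAD" yields the name "refs/heads/main", while
resolving "HEAD" yields the placeholder object ID that "refs/heads/main"
points to.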

> - git-refs pack
>   Replaces git-pack-refs, packing loose references with support for
>   filtering (--include, --exclude) and automatic cleanup (--prune).

I would probably call this `git refs optimize` or something like that.
git-pack-refs(1) is mostly called this way because it was introduced to
pack refs into the "packed-refs" file. But nowadays with the reftable
backend I think that the command name is somewhat inaccurate.

> - git-refs update
>   Replaces git-update-ref, providing transactional reference updates
>   with batch processing (--stdin) and atomic guarantees.
> - git-refs delete
>   Separates the delete functionality from git-update-ref, ensuring
>   explicit handling of reference removals with safety checks and batch
>   operations (--stdin).

It's up for debate whether we should even have something like `git refs
delete`. As you rightly note, `git refs update` already handles the
use case, so it feels like needless duplication.

> - git-refs check-format
>   Replaces git-check-ref-format, validating reference format with
>   support for normalized output (--normalize).

Ah, nice, this is a command I forgot about.

> #### Implementation Strategy
> 
> 1. Option Parsing: Each subcommand will reuse the argument parsing
>    logic from legacy commands (e.g., git-pack-refs --prune).

We cannot and do not want to do this for every case. As mentioned above,
we may want to iterate on some of the subcommands to address historic
warts. But overall I agree, we should of course aim to reduce
duplication as far as it is sensible to do.

> 2. Shared Backend Logic: Calls to common functions in refs/ (e.g.,
>    reference traversal, locking mechanisms).
> 3. Error Consistency: Maintain the same error codes and message
>    formats as legacy commands.

Same reasoning here, we may want to adapt some of them. The old commands
won't go away as they are used everywhere, and that makes it more
reasonable for us to change behaviour in their newer equivalents.

> ---
> 
> ### Example: Implementing git-refs pack
> 
> #### Functional Implementation
> 
> 1. Modify builtin/refs.c:
>    - Add cmd_refs_pack function implementing git-pack-refs logic.
>    - Update cmd_refs to include pack with
>      OPT_SUBCOMMAND("pack", &fn, cmd_refs_pack).
>    - Define REFS_PACK_USAGE:
>      git refs pack [--all] [--no-prune] [--auto] [--include <pattern>]
>      [--exclude <pattern>].
> 2. Register New Subcommand in git.c:
>    - Add { "refs-pack", cmd_refs_pack }, to the command array.

You don't actually have to change "git.c" to introduce new subcommands.
We don't want `git refs-pack`, but rather `git refs pack`, which is an
important distinction.
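
As a rough, self-contained illustration of that distinction (a toy
dispatcher, not Git's parse-options machinery; every `toy_` name and the
return codes are made up for this sketch), the top-level command stays
`refs` and the subcommand is simply its first argument:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler signature; Git's real builtins take more arguments. */
typedef int (*toy_subcommand_fn)(int argc, const char **argv);

struct toy_subcommand {
	const char *name;
	toy_subcommand_fn fn;
};

/* Toy handlers that only return a distinct code so dispatch can be observed. */
static int toy_refs_pack(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 10;
}

static int toy_refs_migrate(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 20;
}

static const struct toy_subcommand toy_refs_subcommands[] = {
	{ "pack", toy_refs_pack },
	{ "migrate", toy_refs_migrate },
};

/*
 * Dispatch "refs <subcommand> [args...]": argv[0] is "refs" and argv[1]
 * selects the handler. Returns -1 for a missing or unknown subcommand.
 */
int toy_cmd_refs(int argc, const char **argv)
{
	size_t i;
	size_t n = sizeof(toy_refs_subcommands) / sizeof(toy_refs_subcommands[0]);

	if (argc < 2)
		return -1;
	for (i = 0; i < n; i++)
		if (!strcmp(argv[1], toy_refs_subcommands[i].name))
			return toy_refs_subcommands[i].fn(argc - 1, argv + 1);
	return -1;
}
```

Only the table inside the `refs` command grows when a subcommand is
added; nothing new is registered at the top level.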

> 3. Reuse refs/files-backend.c Logic:
>    - Ensure cmd_refs_pack calls pack_refs correctly, adjusting as
>      necessary for new options.

We shouldn't have to touch any of the backends at all. You should rather
make sure to integrate with "refs.c", which wraps the backends and
provides a backend-agnostic interface to refs.
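
A minimal sketch of what "backend-agnostic" means here, assuming a
simplified function-pointer table. The `toy_` structs, the two toy
backends, and their return codes are hypothetical and much smaller than
the real interface:

```c
#include <stddef.h>

/* Hypothetical vtable: each backend supplies its own "optimize" routine. */
struct toy_ref_backend {
	const char *name;
	int (*optimize)(unsigned int flags);
};

struct toy_ref_store {
	const struct toy_ref_backend *be;
};

/* Toy backends; the return codes only identify which one ran. */
static int toy_files_optimize(unsigned int flags)
{
	(void)flags;
	return 1;
}

static int toy_reftable_optimize(unsigned int flags)
{
	(void)flags;
	return 2;
}

const struct toy_ref_backend toy_files_backend = {
	"files", toy_files_optimize
};
const struct toy_ref_backend toy_reftable_backend = {
	"reftable", toy_reftable_optimize
};

/* The generic layer: callers use only this and never name a backend. */
int toy_refs_optimize(const struct toy_ref_store *store, unsigned int flags)
{
	return store->be->optimize(flags);
}
```

A subcommand written against the generic function works unchanged for
either backend, which is why the builtin should not call into a specific
backend file.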

> #### Testing Plan
> 
> - Test Cases:
>   Add t/txxx-refs-pack.sh, leveraging t/t0601-reffiles-pack-refs.sh
>   scenarios to verify:
>   - --prune removes obsolete references correctly.
>   - --include and --exclude apply filtering as expected.
>   - Packed references match legacy command outputs (diff .git/packed-refs).
> - Performance Benchmarking (if needed):
>   Add performance tests in t/perf to ensure no significant regression
>   in execution time or memory usage.
> 
> #### Documentation Updates
> 
> - User Manual:
>   Add a pack section to Documentation/git-refs.txt, mapping options to
>   legacy command equivalents.
> - Developer Notes:
>   Comment code to highlight functional parity between git-refs pack
>   and git-pack-refs.
> 
> ---
> 
> ### Timeline
> 
> - May 8 - May 11 (4 days): Initial Testing & Subcommand Framework Setup
> - May 12 - May 28 (17 days): pack Subcommand Implementation
> - May 29 - June 14 (17 days): check-format Subcommand Development
> - June 15 - July 5 (21 days): update and delete Subcommands Development
> - July 6 - July 26 (21 days): show and exists Subcommands Development
> - July 27 - August 16 (21 days): resolve Subcommand Implementation
> - August 17 - September 6 (21 days): list Subcommand Implementation
> - September 7 - September 16 (10 days): Mid-term Review
> - September 17 - September 23 (7 days): Mentor Review & Final Adjustments

You probably underestimate the time to review and land a specific change
quite significantly. Landing new features in ~2 weeks is thus not quite
realistic and you should allocate a lot more time for each of the
specific subcommands.

That of course raises the question of how to squeeze all of the
subcommands into a single GSoC. And the answer is that you don't: it's
perfectly fine to implement only a subset of the new proposed
subcommands. I'd rather you spend more time thinking about how to
improve upon the status quo for each of the subcommands and thus spend
more time on it than trying to do everything in a hurry.

So: there isn't any expectation that you manage to implement all of
them. I'd recommend to pick a subset of commands that you want to
implement as a realistic goal. You may define other commands as a
stretch goal in case you manage to speed through the implementation way
faster than I anticipate.

Thanks!

Patrick

* Re: [GSoC] git-refs proposal draft
  2025-03-31  9:42   ` Patrick Steinhardt
@ 2025-04-01 13:37     ` Yuting Zheng
  2025-04-02  8:02       ` Patrick Steinhardt
  0 siblings, 1 reply; 17+ messages in thread
From: Yuting Zheng @ 2025-04-01 13:37 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Hi Patrick,

Thanks for your feedback! Here are some adjustments based on your
suggestions:

> In any case, I don't think the naming and how exactly each of these
> commands should look and work needs to be hashed out in this
> document. It's nice to scope out _what_ we want to achieve and propose
> what this could look like, but ultimately I think that most of the design
> should happen during the project itself.

OK! I may have misunderstood it. I will remove it.

> This one is something that is up for debate. While I do expect that most
> of the commands should retain their current semantics and options, we
> could also use this as an opportunity to think whether there are any
> issues with the current design and improve upon it.

So, discussing the specific implementation of the command should also
be included in the proposal, right?

>> - git-refs exists
>>   Replaces git-show-ref --exists, providing reference existence checks
>>   with positive (<ref>) and exclusion-based (--exclude-existing)
>>   verification.
>
> I'm not quite clear on what an exclusion-based existence check is. How do
> you check whether something exists when you exclude it? I don't think
> that this option is relevant in the context of `git refs exists`.

Sorry, I made a mistake. I meant to convey that the `--exclude-existing`
option should be included in `git-refs list` (replacing
`git-show-ref --exclude-existing`), which then lists refs within a certain
scope.

>> - git-refs resolve
>>   Replaces git-symbolic-ref, resolving symbolic references with added
>>   recursion depth control (--max-depth), while retaining deletion (-d)
>>   and quiet mode (-q) options.
>
> Not quite. The difference to `git refs show` is that this command always
> resolves the ref to an object. So it's rather more similar to `git
> rev-parse --verify`, except that it only ever handles references.

Thanks for pointing that out. I will correct it afterward.

>> - git-refs pack
>>   Replaces git-pack-refs, packing loose references with support for
>>   filtering (--include, --exclude) and automatic cleanup (--prune).
>
> I would probably call this `git refs optimize` or something like that.
> git-pack-refs(1) is mostly called this way because it was introduced to
> pack refs into the "packed-refs" file. But nowadays with the reftable
> backend I think that the command name is somewhat inaccurate.

Agreed.

>> - git-refs update
>>   Replaces git-update-ref, providing transactional reference updates
>>   with batch processing (--stdin) and atomic guarantees.
>> - git-refs delete
>>   Separates the delete functionality from git-update-ref, ensuring
>>   explicit handling of reference removals with safety checks and batch
>>   operations (--stdin).
>
> It's up for debate whether we should even have something like `git refs
> delete`. As you rightly note, `git refs update` already handles the
> use case, so it feels like needless duplication.
>

I think keeping `update` and `delete` separate may be more direct, since
separate commands make their usage clearer. That said, I'm open to
further discussion if the community prefers a unified command.

>> 1. Option Parsing: Each subcommand will reuse the argument parsing
>>    logic from legacy commands (e.g., git-pack-refs --prune).
>
> We cannot and do not want to do this for every case. As mentioned above,
> we may want to iterate on some of the subcommands to address historic
> warts. But overall I agree, we should of course aim to reduce
> duplication as far as it is sensible to do.

>
>> 2. Shared Backend Logic: Calls to common functions in refs/ (e.g.,
>>    reference traversal, locking mechanisms).
>> 3. Error Consistency: Maintain the same error codes and message
>>    formats as legacy commands.
>
> Same reasoning here, we may want to adapt some of them. The old commands
> won't go away as they are used everywhere, and that makes it more
> reasonable for us to change behaviour in their newer equivalents.
>

Got it. I will list my thoughts below.

> You don't actually have to change "git.c" to introduce new subcommands.
> We don't want `git refs-pack`, but rather `git refs pack`, which is an
> important distinction.

Sorry for my oversight. I will be more careful from now on.

>> 3. Reuse refs/files-backend.c Logic:
>>    - Ensure cmd_refs_pack calls pack_refs correctly, adjusting as
>>      necessary for new options.
>
> We shouldn't have to touch any of the backends at all. You should rather
> make sure to integrate with "refs.c", which wraps the backends and
> provides a backend-agnostic interface to refs.

Got it.

> You probably underestimate the time to review and land a specific change
> quite significantly. Landing new features in ~2 weeks is thus not quite
> realistic and you should allocate a lot more time for each of the
> specific subcommands.
>
> That of course raises the question of how to squeeze all of the
> subcommands into a single GSoC. And the answer is that you don't: it's
> perfectly fine to implement only a subset of the new proposed
> subcommands. I'd rather you spend more time thinking about how to
> improve upon the status quo for each of the subcommands and thus spend
> more time on it than trying to do everything in a hurry.
>

Thanks for your reminder! I plan to focus on implementing `git-refs list` and
`git-refs update` first. These will form the foundation of the new design, and
once stable, I will consider addressing `git-refs resolve` and additional
commands if time permits.

So, I need to update my proposal to reduce the number of subcommands so
that I can complete this project with high quality. I also need to
further discuss the implications of these commands. By reducing the
number of subcommands, I can dedicate more time to refining each one and
ensuring they integrate well with the existing system. I will also
detail the implications of each command in my updated proposal.

Thanks!
Zheng Yuting

* Re: [GSoC] git-refs proposal draft
  2025-04-01 13:37     ` Yuting Zheng
@ 2025-04-02  8:02       ` Patrick Steinhardt
  0 siblings, 0 replies; 17+ messages in thread
From: Patrick Steinhardt @ 2025-04-02  8:02 UTC (permalink / raw)
  To: Yuting Zheng; +Cc: git

On Tue, Apr 01, 2025 at 09:37:50PM +0800, Yuting Zheng wrote:
> Hi Patrick,
> 
> Thanks for your feedback! Here are some adjustments based on your
> suggestions:
> 
> > In any case, I don't think the naming and how exactly each of these
> > commands should look and work needs to be hashed out in this
> > document. It's nice to scope out _what_ we want to achieve and propose
> > what this could look like, but ultimately I think that most of the design
> > should happen during the project itself.
> 
> OK! I may have misunderstood it. I will remove it.
> 
> > This one is something that is up for debate. While I do expect that most
> > of the commands should retain their current semantics and options, we could
> > also use this as an opportunity to think whether there are any issues
> > with the current design and improve upon it.
> 
> So, discussing the specific implementation of the command should also
> be included in the proposal, right?

At least the general direction should become clear, yes. The intent is
that we want to double check that the candidate has indeed invested a
bit of time to understand the problem space and what is being asked of
them. So you don't have to provide all the nitty-gritty details of how
exactly you plan on doing the conversion, but provide a bit of an
overview of what the project would entail.

> >> - git-refs exists
> >>   Replaces git-show-ref --exists, providing reference existence checks
> >>   with positive (<ref>) and exclusion-based (--exclude-existing)
> >>   verification.
> >
> > I'm not quite clear what exclusion-based existence checks is. How do you
> > check whether something exists when you exclude it? I don't think that
> > this option is relevant in the context of `git refs exists`.
> 
> Sorry, I made a mistake. I meant to convey that the `--exclude-existing`
> option should be included in `git-refs list` (replacing
> `git-show-ref --exclude-existing`), which then lists refs within a certain
> scope.

No need to be sorry, we all make mistakes.

[snip]
> >> - git-refs update
> >>   Replaces git-update-ref, providing transactional reference updates
> >>   with batch processing (--stdin) and atomic guarantees.
> >> - git-refs delete
> >>   Separates the delete functionality from git-update-ref, ensuring
> >>   explicit handling of reference removals with safety checks and batch
> >>   operations (--stdin).
> >
> > It's up for debate whether we should even have something like `git refs
> > delete`. As you rightfully notice `git refs update` already handles the
> > usecase, so it feels like needless duplication.
> >
> 
> I think keeping `update` and `delete` separate may be more direct. Separate
> commands can make their usage clearer, although I'm open to
> further discussion if the community prefers a unified command.

`update` will have to support deletions regardless as you won't be able
to do atomic updates of many refs at once if that update would include a
deletion. So let's start with that, and then we can still figure out
whether `delete` would be desirable.
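
As a rough illustration of the point above, the existing `git update-ref
--stdin` already mixes creations and deletions in one all-or-nothing batch.
This is only a sketch: it assumes a throwaway repository, and the
`refs/demo/*` names and the blob used as a target are made up for the example.

```shell
# Throwaway repository; "refs/demo/*" and the blob target are illustrative only.
repo=$(mktemp -d)
git init --quiet "$repo"
oid=$(echo demo | git -C "$repo" hash-object -w --stdin)
git -C "$repo" update-ref refs/demo/old "$oid"

# One batch: create refs/demo/new and delete refs/demo/old.
# Either both changes are applied or neither is.
git -C "$repo" update-ref --stdin <<EOF
create refs/demo/new $oid
delete refs/demo/old $oid
EOF

# List what is left under refs/demo/.
git -C "$repo" for-each-ref --format='%(refname)' refs/demo/
```

If deletions lived only in a separate command, a batch like this could not be
expressed as a single atomic update.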

> > You probably underestimate the time to review and land a specific change
> > quite significantly. Landing new features in ~2 weeks is thus not quite
> > realistic and you should allocate a lot more time for each of the
> > specific subcommands.
> >
> > That of course raises the question of how to squeeze all of the
> > subcommands into a single GSoC. And the answer is that you don't: it's
> > perfectly fine to implement only a subset of the new proposed
> > subcommands. I'd rather you spend more time thinking about how to
> > improve upon the status quo for each of the subcommands and thus spend
> > more time on it than trying to do everything in a hurry.
> >
> 
> Thanks for your reminder! I plan to focus on implementing `git-refs list` and
> `git-refs update` first. These will form the foundation of the new design, and
> once stable, I will consider addressing `git-refs resolve` and additional
> commands if time permits.
> 
> So, I need to update my proposal to reduce the number of subcommands so
> that I can complete this project with high quality. I also need to
> further discuss the implications of these commands. By reducing the
> number of subcommands, I can dedicate more time to refining each one and
> ensuring they integrate well with the existing system. I will also
> detail the implications of each command in my updated proposal.

Great, thanks!

Patrick

* Discussion on git-refs list Implementation and Possible Approaches
  2025-03-29 15:02 ` [GSoC] git-refs proposal draft Zheng Yuting
  2025-03-31  9:42   ` Patrick Steinhardt
@ 2025-04-03 15:44   ` Zheng Yuting
  2025-04-04 11:08     ` Karthik Nayak
                       ` (2 more replies)
  2025-04-06  6:08   ` [GSoC] git-refs proposal v2 Yuting Zheng
  2 siblings, 3 replies; 17+ messages in thread
From: Zheng Yuting @ 2025-04-03 15:44 UTC (permalink / raw)
  To: 05zyt30; +Cc: git

After an initial review of the code and documentation for `git-show-ref`
and `git-for-each-ref`, I believe the functionality of the `git-refs list`
subcommand can be categorized into two major types:

1. **Filtering options**
   - In `git-for-each-ref`:
     - `--count`
     - `--sort=<key>`
     - `--points-at=<object>`
     - `--merged[=<object>]`
     - `--no-merged[=<object>]`
     - `--contains[=<object>]`
     - `--no-contains[=<object>]`
     - `--omit-empty`
     - `--exclude=<pattern>`
     - `--include-root-refs`
   - In `git-show-ref`:
     - `--head`
     - `--branches`
     - `--tags`
     - `--exclude-existing[=<pattern>]`

2. **Formatting options**
   - In `git-for-each-ref`:
     - `--format=<format>`
     - `--color[=<when>]`
     - `--tcl`
     - `--shell`
     - `--perl`
   - In `git-show-ref`:
     - `--dereference`
     - `--hash`

Additionally, for filtering functionality, the `--ignore-case` option
from `git-for-each-ref` should be supported across the board.

**Note**: The `--verify`, `--quiet` and `--exists` options in
`git-show-ref` are intended to be implemented as separate
`git-refs` subcommands and are not within the scope of this
discussion.

## Implementation Considerations

At this point, I haven't come up with a perfect implementation
plan, as each approach has some issues:

### Approach 1:
`git-refs list` would support both filtering and formatting options,
meaning it could provide:
- Filtered output
- Formatted output
- Combined filter + format output

However, I see two potential problems with this approach:
1. Would it make the `list` subcommand too complex?
2. The performance could be worse than `git-for-each-ref`.

### Approach 2:
Split the functionality into two separate subcommands:
- `git-refs filter`: Handles filtering and filter + format output
- `git-refs show`: Supports formatting options

For implementation, my initial thought is that `git-refs filter` could
reuse the formatting options from `git-refs show`. Perhaps this could
work similarly to how `git-add --patch` and `git-restore --patch`
share logic, though I haven’t thoroughly reviewed that part of the
code yet. Would this be a reasonable approach?

## Overall Plan

If Approach 2 is preferable, I could start with `git-refs show` since it
only deals with basic ref listing and formatting. I would then make
the formatting code more reusable to support `git-refs filter`, which
would focus solely on filtering.

If Approach 1 is chosen, the implementation plan would remain the
same, but everything would be handled within a single `git-refs list`
command.

I would appreciate any feedback or alternative suggestions on the
best way to structure this functionality.

Thanks!

* Re: Discussion on git-refs list Implementation and Possible Approaches
  2025-04-03 15:44   ` Discussion on git-refs list Implementation and Possible Approaches Zheng Yuting
@ 2025-04-04 11:08     ` Karthik Nayak
  2025-04-04 15:25       ` Yuting Zheng
  2025-04-04 11:15     ` Patrick Steinhardt
  2025-04-04 15:16     ` Yuting Zheng
  2 siblings, 1 reply; 17+ messages in thread
From: Karthik Nayak @ 2025-04-04 11:08 UTC (permalink / raw)
  To: Zheng Yuting; +Cc: git

Zheng Yuting <05zyt30@gmail.com> writes:

> After an initial review of the code and documentation for `git-show-ref`
> and `git-for-each-ref`, I believe the functionality of the `git-refs list`
> subcommand can be categorized into two major types:
>
> 1. **Filtering options**
>    - In `git-for-each-ref`:
>      - `--count`
>      - `--sort=<key>`

I would categorize '--sort' into a third subcategory. Filtering refers
to a possible change in the size of the sample set, while sorting is more
of a presentation utility.

>      - `--points-at=<object>`
>      - `--merged[=<object>]`
>      - `--no-merged[=<object>]`
>      - `--contains[=<object>]`
>      - `--no-contains[=<object>]`
>      - `--omit-empty`
>      - `--exclude=<pattern>`
>      - `--include-root-refs`
>    - In `git-show-ref`:
>      - `--head`
>      - `--branches`
>      - `--tags`
>      - `--exclude-existing[=<pattern>]`
>
> 2. **Formatting options**
>    - In `git-for-each-ref`:
>      - `--format=<format>`
>      - `--color[=<when>]`
>      - `--tcl`
>      - `--shell`
>      - `--perl`
>    - In `git-show-ref`:
>      - `--dereference`
>      - `--hash`
>
> Additionally, for filtering functionality, the `--ignore-case` option
> from `git-for-each-ref` should be supported across the board.
>

This is indeed a special case which applies to both sorting and
filtering.

> **Note**: The `--verify`, `--quiet` and `--exists` options in
> `git-show-ref` are intended to be implemented as separate
> `git-refs` subcommands and are not within the scope of this
> discussion.
>
>
> ## Implementation Considerations
>
> At this point, I haven't come up with a perfect implementation
> plan, as each approach has some issues:
>
> ### Approach 1:
> `git-refs list` would support both filtering and formatting options,
> meaning it could provide:
> - Filtered output
> - Formatted output
> - Combined filter + format output
>
> However, I see two potential problems with this approach:
> 1. Would it make the `list` subcommand too complex?

Do you mean complex from the user's perspective of having too many options,
or from the implementation perspective?

I think from the UX perspective, it is a good time to rethink the usage of
and need for the options you mentioned above. For example, with '--format',
do we need to have '--tcl', `--shell` and `--perl`?

> 2. The performance could be worse than `git-for-each-ref`.
>

Why would it be worse? The performance difference between
`git-for-each-ref(1)` and `git-show-ref(1)` stems from the formats they
use by default.

$ hyperfine --shell=none --warmup=3 "git for-each-ref" "git show-ref"
Benchmark 1: git for-each-ref
  Time (mean ± σ):       4.0 ms ±   0.6 ms    [User: 1.9 ms, System: 1.9 ms]
  Range (min … max):     3.0 ms …   5.7 ms    680 runs

Benchmark 2: git show-ref
  Time (mean ± σ):       2.9 ms ±   0.4 ms    [User: 1.2 ms, System: 1.5 ms]
  Range (min … max):     2.0 ms …   4.3 ms    909 runs

Summary
  git show-ref ran
    1.38 ± 0.28 times faster than git for-each-ref

What I found interesting was that changing the format for
'git-for-each-ref(1)' gives it a boost:

$ hyperfine --shell=none --warmup=3 'git for-each-ref --format="%(objectname) %(refname)"' "git show-ref"
Benchmark 1: git for-each-ref --format="%(objectname) %(refname)"
  Time (mean ± σ):       2.4 ms ±   0.3 ms    [User: 1.1 ms, System: 1.1 ms]
  Range (min … max):     1.7 ms …   3.6 ms    1070 runs

Benchmark 2: git show-ref
  Time (mean ± σ):       2.9 ms ±   0.4 ms    [User: 1.2 ms, System: 1.5 ms]
  Range (min … max):     2.0 ms …   4.5 ms    833 runs

Summary
  git for-each-ref --format="%(objectname) %(refname)" ran
    1.20 ± 0.23 times faster than git show-ref

> ### Approach 2:
> Split the functionality into two separate subcommands:
> - `git-refs filter`: Handles filtering and filter + format output
> - `git-refs show`: Supports formatting options
>
> For implementation, my initial thought is that `git-refs filter` could
> reuse the formatting options from `git-refs show`. Perhaps this could
> work similarly to how `git-add --patch` and `git-restore --patch`
> share logic, though I haven’t thoroughly reviewed that part of the
> code yet. Would this be a reasonable approach?
>

And what is the expectation when you want to do both filtering and
formatting? Would the user be expected to do `git refs filter | git refs
show`? Generally users want to combine both of these options.

Also, wasn't the idea to implement `git-refs show` as a
standalone command which simply shows what value a reference holds (without
dereferencing)?

> ## Overall Plan
>
> If Approach 2 is preferable, I could start with `git-refs show` since it
> only deals with basic ref listing and formatting. I would then make
> the formatting code more reusable to support `git-refs filter`, which
> would focus solely on filtering.
>
> If Approach 1 is chosen, the implementation plan would remain the
> same, but everything would be handled within a single `git-refs list`
> command.

While I would think Approach 1 is the better option here, I can also
see how it is complex. Perhaps a good way to get started would be
to implement a simpler subcommand as a first case, such as the
originally discussed `git refs show`?

>
> I would appreciate any feedback or alternative suggestions on the
> best way to structure this functionality.
>
> Thanks!

Thanks for the proposal!
Karthik

* Re: Discussion on git-refs list Implementation and Possible Approaches
  2025-04-03 15:44   ` Discussion on git-refs list Implementation and Possible Approaches Zheng Yuting
  2025-04-04 11:08     ` Karthik Nayak
@ 2025-04-04 11:15     ` Patrick Steinhardt
       [not found]       ` <CAMvj1+rMY2YR8_GGFeDoJ6HCiVDusZZk9fAguKh=kbctHO=2Qg@mail.gmail.com>
  2025-04-04 15:26       ` Yuting Zheng
  2025-04-04 15:16     ` Yuting Zheng
  2 siblings, 2 replies; 17+ messages in thread
From: Patrick Steinhardt @ 2025-04-04 11:15 UTC (permalink / raw)
  To: Zheng Yuting; +Cc: git

On Thu, Apr 03, 2025 at 11:44:04PM +0800, Zheng Yuting wrote:
> After an initial review of the code and documentation for `git-show-ref`
> and `git-for-each-ref`, I believe the functionality of the `git-refs list`
> subcommand can be categorized into two major types:
> 
> 1. **Filtering options**
>    - In `git-for-each-ref`:
>      - `--count`
>      - `--sort=<key>`
>      - `--points-at=<object>`
>      - `--merged[=<object>]`
>      - `--no-merged[=<object>]`
>      - `--contains[=<object>]`
>      - `--no-contains[=<object>]`
>      - `--omit-empty`
>      - `--exclude=<pattern>`
>      - `--include-root-refs`
>    - In `git-show-ref`:
>      - `--head`
>      - `--branches`
>      - `--tags`
>      - `--exclude-existing[=<pattern>]`
> 
> 2. **Formatting options**
>    - In `git-for-each-ref`:
>      - `--format=<format>`
>      - `--color[=<when>]`
>      - `--tcl`
>      - `--shell`
>      - `--perl`
>    - In `git-show-ref`:
>      - `--dereference`
>      - `--hash`
> 
> Additionally, for filtering functionality, the `--ignore-case` option
> from `git-for-each-ref` should be supported across the board.
> 
> **Note**: The `--verify`, `--quiet` and `--exists` options in
> `git-show-ref` are intended to be implemented as separate
> `git-refs` subcommands and are not within the scope of this
> discussion.

Yup, makes sense.

Another factor is the default format that these two commands use which
differs. I would heavily lean towards using the format exposed by `git
show-ref` because it doesn't require us to hit the ODB, and thus it is
way more efficient. This has bitten me quite often already.
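
To make the difference concrete, here is a small sketch, assuming a throwaway
repository with a single made-up ref, `refs/demo/a`, that points at a blob.
The default format of git-for-each-ref(1) includes `%(objecttype)`, which is
the part that needs the object to be looked up, whereas a format with only
`%(objectname)` and `%(refname)` yields the same lines as `git show-ref`.

```shell
# Throwaway repository; "refs/demo/a" and the blob target are illustrative only.
repo=$(mktemp -d)
git init --quiet "$repo"
oid=$(echo demo | git -C "$repo" hash-object -w --stdin)
git -C "$repo" update-ref refs/demo/a "$oid"

# Default for-each-ref format: "<oid> SP <type> TAB <ref>".
git -C "$repo" for-each-ref

# show-ref style: "<oid> SP <ref>", no object type involved.
git -C "$repo" show-ref
git -C "$repo" for-each-ref --format='%(objectname) %(refname)'
```

The last two commands print identical lines, so existing users of either
command could be served by the same default.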

> ## Implementation Considerations
> 
> At this point, I haven't come up with a perfect implementation
> plan, as each approach has some issues:
> 
> ### Approach 1:
> `git-refs list` would support both filtering and formatting options,
> meaning it could provide:
> - Filtered output
> - Formatted output
> - Combined filter + format output
> 
> However, I see two potential problems with this approach:
> 1. Would it make the `list` subcommand too complex?

I don't think it would, both are orthogonal to one another. I don't
think people _only_ want to format or _only_ want to filter. Quite
often, they'll want to do both at the same time.
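
For what it's worth, this is already how git-for-each-ref(1) tends to be
used. A minimal sketch, assuming a throwaway repository with made-up refs
under `refs/demo/` and `refs/other/`: a pattern and `--count` narrow the set,
`--sort` orders it, and `--format` shapes the output, all in one invocation.

```shell
# Throwaway repository; all ref names and the blob target are illustrative only.
repo=$(mktemp -d)
git init --quiet "$repo"
oid=$(echo demo | git -C "$repo" hash-object -w --stdin)
for ref in refs/demo/a refs/demo/b refs/other/c; do
	git -C "$repo" update-ref "$ref" "$oid"
done

# Filter (pattern, --count), sort (--sort) and format (--format) together:
# only refs under refs/demo/, in descending name order, first entry only.
git -C "$repo" for-each-ref --sort=-refname --count=1 \
	--format='%(refname)' refs/demo/
```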

> 2. The performance could be worse than `git-for-each-ref`.

Why is that? git-for-each-ref(1) already knows to filter and format, so
I'd expect the performance to be roughly the same. In fact, I think we
would be able to improve performance if we changed the default format as
mentioned above.

> ### Approach 2:
> Split the functionality into two separate subcommands:
> - `git-refs filter`: Handles filtering and filter + format output
> - `git-refs show`: Supports formatting options
> 
> For implementation, my initial thought is that `git-refs filter` could
> reuse the formatting options from `git-refs show`. Perhaps this could
> work similarly to how `git-add --patch` and `git-restore --patch`
> share logic, though I haven’t thoroughly reviewed that part of the
> code yet. Would this be a reasonable approach?

I don't think this plan would make sense as it would mean that current
users of git-for-each-ref(1) wouldn't be able to migrate.

Patrick

* Re: Discussion on git-refs list Implementation and Possible Approaches
  2025-04-03 15:44   ` Discussion on git-refs list Implementation and Possible Approaches Zheng Yuting
  2025-04-04 11:08     ` Karthik Nayak
  2025-04-04 11:15     ` Patrick Steinhardt
@ 2025-04-04 15:16     ` Yuting Zheng
  2 siblings, 0 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-04-04 15:16 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, karthik nayak

Hi everyone,

Following the initial discussion, I’ve updated the design for the
`git-refs list` subcommand. Below are the key changes and a
discussion about subcommand options.

### `git-refs list` implementation plan

1. Output Format:

The default output format now follows the `git-show-ref` style:
`<oid> SP <ref> LF`. This avoids dependency on ODB and aligns with
lightweight ref listing.

2. Option Categorization:

The functionality is now divided into three distinct types of options
(filter, sort, format) that can be combined:

2.1. **Filtering options**
   - In `git-for-each-ref`:
     - `--count`
     - `--points-at=<object>`
     - `--merged[=<object>]`
     - `--no-merged[=<object>]`
     - `--contains[=<object>]`
     - `--no-contains[=<object>]`
     - `--omit-empty`
     - `--exclude=<pattern>`
     - `--include-root-refs`
   - In `git-show-ref`:
     - `--head`
     - `--branches`
     - `--tags`
     - `--exclude-existing[=<pattern>]`

2.2. **Sorting options**
   - In `git-for-each-ref`:
     - `--sort=<key>`

2.3. **Formatting options**
   - In `git-for-each-ref`:
     - `--format=<format>`
     - `--color[=<when>]`
     - `--tcl`
     - `--shell`
     - `--perl`
   - In `git-show-ref`:
     - `--dereference`
     - `--hash`

Additionally, for filtering and sorting functionality, the
`--ignore-case` option from `git-for-each-ref` should be
supported across the board.

**Note**: The `--verify`, `--quiet` and `--exists` options in
`git-show-ref` are intended to be implemented as separate
`git-refs` subcommands and are not within the scope of this
discussion.

3. Implementation Approach:

> ### Approach 1:
> `git-refs list` would support both filtering and formatting options,
> meaning it could provide:
> - Filtered output
> - Formatted output
> - Combined filter + format output
>

I will proceed with Approach 1 by implementing `git-refs list` as a
single subcommand that combines filtering, sorting, and formatting
capabilities. To establish a foundation for this, I will first develop
`git-refs show` as a standalone subcommand to replace
`git-show-ref --verify`. The `git-refs list` functionality will then be built
on top of the `git-refs show` codebase.
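
For reference, here is a small sketch of the behaviour that `git-refs show`
would take over from `git show-ref --verify`. It assumes a throwaway
repository and a made-up ref, `refs/demo/a`: the full ref name is required,
and an abbreviated name is rejected rather than matched by suffix.

```shell
# Throwaway repository; "refs/demo/a" and the blob target are illustrative only.
repo=$(mktemp -d)
git init --quiet "$repo"
oid=$(echo demo | git -C "$repo" hash-object -w --stdin)
git -C "$repo" update-ref refs/demo/a "$oid"

# Exact, fully qualified name: prints "<oid> SP <ref>".
git -C "$repo" show-ref --verify refs/demo/a

# Abbreviated name: --verify does no pattern matching, so this fails.
if ! git -C "$repo" show-ref --verify --quiet demo/a; then
	echo "demo/a rejected"
fi
```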

## Discussion About Options

1. Legacy Formatting Options:

Should `--tcl`, `--shell`, `--perl` be retained?

2. New Options:

Have you used these legacy options or needed modern alternatives?
Any pain points?

I would appreciate any feedback or alternative suggestions on the
best way to structure this functionality.

Thanks!
Zheng Yuting

* Fwd: Discussion on git-refs list Implementation and Possible Approaches
       [not found]       ` <CAMvj1+rMY2YR8_GGFeDoJ6HCiVDusZZk9fAguKh=kbctHO=2Qg@mail.gmail.com>
@ 2025-04-04 15:20         ` Yuting Zheng
  0 siblings, 0 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-04-04 15:20 UTC (permalink / raw)
  To: git

On Fri, Apr 4, 2025 at 10:48 PM Yuting Zheng <05zyt30@gmail.com> wrote:
>
> Thanks for your review!
>
> > Another factor is the default format that these two commands use which
> > differs. I would heavily lean towards using the format exposed by `git
> > show-ref` because it doesn't require us to hit the ODB, and thus it is
> > way more efficient. This has bitten me quite often already.
>
> Thanks for your reminder! I will explain this output format in my next
> proposal, and I agree that we should adopt the `git show-ref` format for
> its superior efficiency.
>
> > I don't think it would, both are orthogonal to one another. I don't
> > think people _only_ want to format or _only_ want to filter. Quite
> > often, they'll want to do both at the same time.
> >
>
> On the topic of filtering and formatting, I plan to implement these as
> basic functions that work together seamlessly. In other words, the filter
> and format functionalities will be integrated (without being exposed as
> separate options) so that users can combine them as needed. I will
> submit another email for further discussion about options.
>
> > > 2. The performance could be worse than `git-for-each-ref`.
> >
> > Why is that? git-for-each-ref(1) already knows to filter and format, so
> > I'd expect the performance to be roughly the same. In fact, I think we
> > would be able to improve performance if we changed the default format as
> > mentioned above.
> >
>
> I am concerned that iterating over all available options might introduce
> additional overhead.
>
> >
> > I don't think this plan would make sense as it would mean that current
> > users of git-for-each-ref(1) wouldn't be able to migrate.
> >
>
> Finally, in light of your feedback and Karthik’s, I have decided that
> Approach 1 will be my final plan.
>
> Thanks!
> Zheng Yuting

* Re: Discussion on git-refs list Implementation and Possible Approaches
  2025-04-04 11:08     ` Karthik Nayak
@ 2025-04-04 15:25       ` Yuting Zheng
  0 siblings, 0 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-04-04 15:25 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

Thanks for your reply!

> I would categorize '--sort' into a third subcategory. Filtering refers
> to a possible change in the size of the sample set, while sorting is more
> of a presentation utility.
>

That’s a good idea; it makes my plan clearer. I will split the
filtering options into separate filtering and sorting categories so that
users can clearly distinguish them.

> This is indeed a special case which applies to both sorting and
> filtering.

Understood!

>
> Do you mean complex from the user's perspective of having too many options,
> or from the implementation perspective?
>
> I think from the UX perspective, it is a good time to rethink the usage of
> and need for the options you mentioned above. For example, with '--format',
> do we need to have '--tcl', `--shell` and `--perl`?
>

I think it’s important to discuss all available options, and I will
submit another email for further discussion.


> > 2. The performance could be worse than `git-for-each-ref`.
> >
>
> Why would it be worse? The performance difference between
> `git-for-each-ref(1)` and `git-show-ref(1)` stems from the formats they
> use by default.
>
> $ hyperfine --shell=none --warmup=3 "git for-each-ref" "git show-ref"
> Benchmark 1: git for-each-ref
>   Time (mean ± σ):       4.0 ms ±   0.6 ms    [User: 1.9 ms, System: 1.9 ms]
>   Range (min … max):     3.0 ms …   5.7 ms    680 runs
>
> Benchmark 2: git show-ref
>   Time (mean ± σ):       2.9 ms ±   0.4 ms    [User: 1.2 ms, System: 1.5 ms]
>   Range (min … max):     2.0 ms …   4.3 ms    909 runs
>
> Summary
>   git show-ref ran
>     1.38 ± 0.28 times faster than git for-each-ref
>
> What I found interesting was that changing the format for
> 'git-for-each-ref(1)' gives it a boost:
>
> $ hyperfine --shell=none --warmup=3 'git for-each-ref --format="%(objectname) %(refname)"' "git show-ref"
> Benchmark 1: git for-each-ref --format="%(objectname) %(refname)"
>   Time (mean ± σ):       2.4 ms ±   0.3 ms    [User: 1.1 ms, System: 1.1 ms]
>   Range (min … max):     1.7 ms …   3.6 ms    1070 runs
>
> Benchmark 2: git show-ref
>   Time (mean ± σ):       2.9 ms ±   0.4 ms    [User: 1.2 ms, System: 1.5 ms]
>   Range (min … max):     2.0 ms …   4.5 ms    833 runs
>
> Summary
>   git for-each-ref --format="%(objectname) %(refname)" ran
>     1.20 ± 0.23 times faster than git show-ref
>

Thank you for the reminder. Once each option is implemented, I will test its
performance to ensure that it maintains—or improves upon—the efficiency
of the previous version.

>
> And what is the expectation when you want to do both filtering and
> formatting? Would the user be expected to do `git refs filter | git refs
> show`? Generally users want to combine both of these options.
>
> Also, wasn't the idea to implement `git-refs show` as a
> standalone command which simply shows what value a reference holds (without
> dereferencing)?
>
> While I would think Approach 1 is the better option here, I can also
> see how it is complex. Perhaps a good way to get started would be
> to implement a simpler subcommand as a first case, such as the
> originally discussed `git refs show`?

I agree that implementing `git-refs show` first would provide a solid
foundation for other options. I will add these improvements in the next
version of the proposal.

Thanks!
Zheng Yuting

* Re: Discussion on git-refs list Implementation and Possible Approaches
  2025-04-04 11:15     ` Patrick Steinhardt
       [not found]       ` <CAMvj1+rMY2YR8_GGFeDoJ6HCiVDusZZk9fAguKh=kbctHO=2Qg@mail.gmail.com>
@ 2025-04-04 15:26       ` Yuting Zheng
  1 sibling, 0 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-04-04 15:26 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Thanks for your review!

> Another factor is the default format that these two commands use which
> differs. I would heavily lean towards using the format exposed by `git
> show-ref` because it doesn't require us to hit the ODB, and thus it is
> way more efficient. This has bitten me quite often already.

Thanks for your reminder! I will explain this output format in my next
proposal, and I agree that we should adopt the `git show-ref` format for
its superior efficiency.

> I don't think it would, both are orthogonal to one another. I don't
> think people _only_ want to format or _only_ want to filter. Quite
> often, they'll want to do both at the same time.
>

On the topic of filtering and formatting, I plan to implement these as
basic functions that work together seamlessly. In other words, the filter
and format functionalities will be integrated (without being exposed as
separate options) so that users can combine them as needed. I will
submit another email for further discussion about options.

> > 2. The performance could be worse than `git-for-each-ref`.
>
> Why is that? git-for-each-ref(1) already knows to filter and format, so
> I'd expect the performance to be roughly the same. In fact, I think we
> would be able to improve performance if we changed the default format as
> mentioned above.
>

I am concerned that iterating over all available options might introduce
additional overhead.

>
> I don't think this plan would make sense as it would mean that current
> users of git-for-each-ref(1) wouldn't be able to migrate.
>

Finally, in light of your feedback and Karthik’s, I have decided that
Approach 1 will be my final plan.

Thanks!
Zheng Yuting

* [GSoC] git-refs proposal v2
  2025-03-29 15:02 ` [GSoC] git-refs proposal draft Zheng Yuting
  2025-03-31  9:42   ` Patrick Steinhardt
  2025-04-03 15:44   ` Discussion on git-refs list Implementation and Possible Approaches Zheng Yuting
@ 2025-04-06  6:08   ` Yuting Zheng
  2 siblings, 0 replies; 17+ messages in thread
From: Yuting Zheng @ 2025-04-06  6:08 UTC (permalink / raw)
  To: git

## Name and Contact Information

- Full Name: Zheng Yuting
- Email Address: 05ZYT30@gmail.com
- Time Zone: UTC +8:00

---

## Abstract

The current Git reference management functionality is fragmented across
multiple independent commands (`git-show-ref`, `git-for-each-ref`,
`git-update-ref`, `git-pack-refs`, `git-check-ref-format`, and
`git-symbolic-ref`), leading to code redundancy and increased maintenance
costs.
Based on Patrick Steinhardt’s integration vision[1], this project aims to
consolidate functionality under the unified `git-refs` command by initially
implementing three core subcommands: **show**, **list**, and **update**.
These subcommands will cover the most essential reference management
operations while ensuring backward compatibility and laying the foundation
for further refinement.

If time permits, additional subcommands (such as `exists`, `resolve`, `pack`,
and `check-format`) will be gradually integrated to extend and enhance the
existing functionality. Comprehensive testing and updated documentation
will support this phased approach, ensuring a robust transition from the
legacy tools.

---

## Implementation Plan

### Command Integration Strategy

#### Implementation Sequence

The development will proceed in the following order:

1. `git-refs show`
   - **Purpose:** Replace `git-show-ref --verify` with strict
     reference validation.

2. `git-refs list`
   - **Purpose:** Merge `git-show-ref` and `git-for-each-ref` for
     listing references.
   - **Output Format:** `<oid> SP <ref> LF` (git-show-ref style).
   - **Options:**
     - **Filtering:**
       - From `git-for-each-ref`:
         - `--count`,
         - `--points-at=<object>`,
         - `--merged[=<object>]`,
         - `--no-merged[=<object>]`,
         - `--contains[=<object>]`,
         - `--no-contains[=<object>]`,
         - `--omit-empty`,
         - `--exclude=<pattern>`,
         - `--include-root-refs`.
       - From `git-show-ref`:
         - `--head`,
         - `--branches`,
         - `--tags`,
         - `--exclude-existing`.
     - **Sorting:**
       - From `git-for-each-ref`: `--sort=<key>`.
     - **Formatting:**
       - From `git-for-each-ref`:
         - `--format=<format>`,
         - `--color[=<when>]`,
         - `--tcl` (under discussion),
         - `--shell` (under discussion),
         - `--perl` (under discussion).
       - From `git-show-ref`:
         - `--dereference`,
         - `--hash`.
     - **Global:** `--ignore-case` (applies to all filtering/sorting).

3. `git-refs update`
   - **Purpose:** Replace `git-update-ref` with transactional updates and
     batch processing.
   - **Options (all from `git-update-ref`):**
     - `<ref>`: Target reference.
     - `<newvalue>`: New object identifier.
     - `[<oldvalue>]`: Expected old value (atomic check).
     - `--stdin`: Read batch updates from stdin.
     - `-d, --delete`: Delete the reference.
     - `-m <message>, --message <message>`: Custom reflog message.
     - `--no-reflog`: Skip reflog updates.
     - `--no-deref`: Update symbolic refs directly.
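
The `--stdin` batch mode listed above reads one instruction per line. As a
rough sketch, the helper below builds such input in the line-oriented form
documented for `git update-ref --stdin`
(`update SP <ref> SP <new-oid> [SP <old-oid>] LF` and
`delete SP <ref> [SP <old-oid>] LF`). The function names are hypothetical,
and whether `git refs update` would accept exactly the same input is an
assumption, since that subcommand does not exist yet.

```python
from typing import Iterable, Optional


def _check(token: str) -> str:
    """Reject tokens that would need quoting in the plain line-based format."""
    if not token or any(ch.isspace() for ch in token):
        raise ValueError(f"unsupported token: {token!r}")
    return token


def update_line(ref: str, new_oid: str, old_oid: Optional[str] = None) -> str:
    """Build 'update SP <ref> SP <new-oid> [SP <old-oid>] LF'."""
    parts = ["update", _check(ref), _check(new_oid)]
    if old_oid is not None:
        # The optional old value makes the update conditional on the ref's state.
        parts.append(_check(old_oid))
    return " ".join(parts) + "\n"


def delete_line(ref: str, old_oid: Optional[str] = None) -> str:
    """Build 'delete SP <ref> [SP <old-oid>] LF'."""
    parts = ["delete", _check(ref)]
    if old_oid is not None:
        parts.append(_check(old_oid))
    return " ".join(parts) + "\n"


def build_batch(lines: Iterable[str]) -> str:
    """Concatenate instruction lines into one stdin payload."""
    return "".join(lines)


# Made-up object IDs, for illustration only.
OLD = "1" * 40
NEW = "2" * 40

batch = build_batch([
    update_line("refs/heads/main", NEW, OLD),
    delete_line("refs/heads/old"),
])
```

The resulting text could be fed to `git update-ref --stdin`, which applies
the whole batch as a single transaction: if any check fails, no ref is
modified.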

---

#### Testing & Documentation Updates

- **Unified Testing:**
  - Develop comprehensive test cases for each subcommand to ensure
    that the new commands produce outputs consistent with the legacy ones.
  - Leverage existing test scenarios (e.g., those used for `git-show-ref`
    and `git-update-ref`) and add new tests specific to the new option
    categories and output formats.

- **Documentation:**
  - Update the user manual (e.g., Documentation/git-refs.txt) to include
    detailed sections for each subcommand, mapping the new options to
    their legacy equivalents.
  - Provide developer notes to explain changes, highlight areas of
    functional parity, and outline the phased implementation approach.

---

### Timeline

- **May 8 – May 17 (10 days):** Design Finalization & Alignment (publish
  proposals, resolve conflicts).

- **May 18 – June 7 (21 days):** Implement `git-refs show` (includes
  testing/docs).

- **June 8 – July 3 (26 days):** Implement `git-refs list` (includes
  testing/docs).

- **July 4 – August 4 (32 days):** Implement `git-refs update` (includes
  testing/docs).

- **August 5 – August 25 (21 days):** Cross-command validation &
  edge-case fixes.

- **August 26 – September 1 (7 days):** Final Review & Adjustments.
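
The stated durations count both the start and the end date of each phase.
A quick sanity check of that arithmetic is sketched below; the year 2025
is an assumption taken from the thread dates, since the timeline itself
does not state it.

```python
from datetime import date, timedelta

# (start, end, stated duration in days); the year 2025 is assumed.
phases = [
    (date(2025, 5, 8), date(2025, 5, 17), 10),
    (date(2025, 5, 18), date(2025, 6, 7), 21),
    (date(2025, 6, 8), date(2025, 7, 3), 26),
    (date(2025, 7, 4), date(2025, 8, 4), 32),
    (date(2025, 8, 5), date(2025, 8, 25), 21),
    (date(2025, 8, 26), date(2025, 9, 1), 7),
]


def inclusive_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1


# Phases whose stated duration disagrees with the calendar.
mismatches = [
    (start, end, stated)
    for start, end, stated in phases
    if inclusive_days(start, end) != stated
]

# Consecutive phases that do not begin the day after the previous one ends.
gaps = [
    (prev_end, next_start)
    for (_, prev_end, _), (next_start, _, _) in zip(phases, phases[1:])
    if next_start - prev_end != timedelta(days=1)
]

total_days = sum(stated for _, _, stated in phases)
```

Both lists come out empty, so the phases are contiguous and each stated
duration matches its date range.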

---

## Background & Experience

I graduated in June 2024 from Wenzhou University with a degree in
Network Engineering. My experience includes C programming and
command-line tool development, along with proficiency in Shell
scripting. I am currently in a transitional phase and expect to finalize
my schedule by late April, at which point I will update my weekly
schedule for GSoC. For now, I estimate that I can spend 25-30 hours per
week on this project.

### Project Experience

- **One Student One Chip Project[2]**
  Extending the open-source NEMU simulator by implementing CPU cycle
  functionalities in C.
- **Web Development**
  Developed a Django-based campus website, including user chat, news
  publishing, and teacher management modules.
- **Custom Communication Protocols**
  Built a UDP-based chatroom with peer-to-peer and group messaging.
- **Stock Monitoring Tool**
  Implemented real-time monitoring and historical data analysis, with
  email alerting and planned AI-driven strategy optimization.

I have also obtained CCNA certification and gained hands-on experience
as a network engineer. Additionally, I contributed a patch (currently
pending merge) optimizing send-email functionality in Git [3], which has
given me valuable insights into the Git codebase. For reference, the
discussion of my draft proposal [4] and the discussion of `git-refs list`
[5] are both available on the mailing list.

---

## Appendix

[1] https://gitlab.com/gitlab-org/git/-/issues/330
[2] https://ysyx.oscc.cc/en/project/intro.html
[3] https://lore.kernel.org/git/20250312064639.668875-1-05ZYT30@gmail.com/
[4] https://lore.kernel.org/git/CAMvj1+rbYKFNeWEvvN76MTpzfuWc4TN4ViXRE4nTfWy7ZMspWg@mail.gmail.com/
[5] https://lore.kernel.org/git/20250403154404.3459805-1-05ZYT30@gmail.com/


end of thread, other threads:[~2025-04-06  6:08 UTC | newest]

Thread overview: 17+ messages
2025-03-23 13:36 [GSoC] Proposal Discussion: git-refs Project Yuting Zheng
2025-03-24 12:02 ` Patrick Steinhardt
2025-03-27  2:26   ` Yuting Zheng
2025-03-28 13:45     ` shejialuo
2025-03-29 14:54       ` Yuting Zheng
2025-03-29 15:02 ` [GSoC] git-refs proposal draft Zheng Yuting
2025-03-31  9:42   ` Patrick Steinhardt
2025-04-01 13:37     ` Yuting Zheng
2025-04-02  8:02       ` Patrick Steinhardt
2025-04-03 15:44   ` Discussion on git-refs list Implementation and Possible Approaches Zheng Yuting
2025-04-04 11:08     ` Karthik Nayak
2025-04-04 15:25       ` Yuting Zheng
2025-04-04 11:15     ` Patrick Steinhardt
     [not found]       ` <CAMvj1+rMY2YR8_GGFeDoJ6HCiVDusZZk9fAguKh=kbctHO=2Qg@mail.gmail.com>
2025-04-04 15:20         ` Fwd: " Yuting Zheng
2025-04-04 15:26       ` Yuting Zheng
2025-04-04 15:16     ` Yuting Zheng
2025-04-06  6:08   ` [GSoC] git-refs proposal v2 Yuting Zheng
