Jeff King writes: > On Tue, Apr 23, 2024 at 11:28:10PM +0200, Karthik Nayak wrote: > >> Changes from v2: >> 1. We no longer have separate commands for symrefs, instead the regular >> commands learn to parse 'ref:' as symref targets. This reduces >> the code in this series. Thanks Patrick for the suggestion. > > Hmm. I can see how this makes things a lot simpler, but it introduces an > ambiguity, since you can pass full ref expressions to "update-ref" (like > "ref:foo" to find the "foo" entry in ref^{tree}). I see that you only > kick in the symref "ref:" handling if the regular oid lookup failed, so > there's no backwards-compatibility issue (anything that used to work > will still take precedence, and the new code only runs when the old code > would have reported an error). > > But I wonder if it would let somebody cause mischief in a repository > they can push to, but which may get updates from other sources. For > example, imagine a forge like GitLab runs the equivalent of: > > echo "create refs/whatever ref:refs/heads/main" | > git update-ref --stdin > > as part of some system process. Now if I push up "refs/heads/ref" that > contains the path "refs/heads/main" in its tree, that will take > precedence, causing the system process to do something it did not > expect. > > I think you'd have to pile on a lot of assumptions to get any kind of > security problem. Something like: > > 1. The system has a hidden ref namespace like refs/gitlab that normal > remote push/fetch users are not allowed to read/write to. > > 2. The system tries to make a symlink within that namespace. Say, > "refs/gitlab/metadata/HEAD" to point to > "refs/gitlab/metadata/branches/main" or something. > > 3. The user pushes up "refs/heads/ref" with a tree that contains > "refs/gitlab/metadata/branches/main". Now when (2) happens, the > hidden ref points to user-controlled data. > > That's pretty convoluted. But we can avoid it entirely if there's no > ambiguity in the protocol at all. > > -Peff Thanks Peff, that is indeed something I totally missed thinking about. This also brings light onto the previous versions we were considering: symref-update SP SP [SP ( | )] LF There is also some ambiguity here which we missed, especially when we support dangling refs. If we're updating a dangling ref , and we provide an old value. Then there is uncertainty around whether the provided value is actually a or if it's an . For non dangling ref symref, we first parse it as an and since the would exist, we can move on. So I see two ways to go about this, 1. In the symref-update function, we need to parse and see if is a regular ref or a symref, if it is symref, we simply set the provided old value as , if not, we set it as . This seems clunky because we'll be parsing the ref and trying to understand its type in 'update-ref.c', before the actual update. 2. We change the syntax to something like symref-update SP SP [SP (ref | oid )] LF this would remove any ambiguity since the user specifies the data type they're providing. Overall, I think it is best to discard this version and I will push v4 with the older schematics of introducing new commands. I'm currently considering going ahead with the [2], but will wait for a day or two to consider other opinions. Also on a sidenote, it's worth considering that with the direction of [2], we could also extrapolate to introduce {verify, update, create, delete} v2, which support both symrefs and regular refs. But require explicit types from the user: update-v2 SP NUL (oid | ref ) NUL [(oid | ref )] NUL create-v2 SP NUL (oid | ref ) NUL delete-v2 SP NUL [(oid | ref )] NUL verify-v2 SP NUL [(oid | ref )] NUL This is similar to the v3 patches I've currently sent out, in that it would also allow cross operations between regular refs and symrefs.