Git Mailing List Archive mirror
 help / color / mirror / Atom feed
* [PATCH] pkt-line: don't check string length in packet_length()
@ 2023-07-01  7:05 René Scharfe
  2023-07-05  5:56 ` Junio C Hamano
  2023-07-07 21:47 ` [PATCH v2] pkt-line: add size parameter to packet_length() René Scharfe
  0 siblings, 2 replies; 8+ messages in thread
From: René Scharfe @ 2023-07-01  7:05 UTC (permalink / raw)
  To: Git List

hex2chr() takes care not to run over the end of a short string.
101736a14c (pkt-line: extern packet_length(), 2020-05-19) turned the
input parameter of packet_length() from a string pointer into an array
of known length, making string length checks unnecessary.  Get rid of
them by using hexval() directly.

The resulting branchless code is simpler and it becomes easier to see
that the function mirrors set_packet_header().

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 pkt-line.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 62b4208b66..6e022029ca 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -375,8 +375,10 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size,

 int packet_length(const char lenbuf_hex[4])
 {
-	int val = hex2chr(lenbuf_hex);
-	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
+	return	hexval(lenbuf_hex[0]) << 12 |
+		hexval(lenbuf_hex[1]) <<  8 |
+		hexval(lenbuf_hex[2]) <<  4 |
+		hexval(lenbuf_hex[3]);
 }

 static char *find_packfile_uri_path(const char *buffer)
--
2.41.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] pkt-line: don't check string length in packet_length()
  2023-07-01  7:05 [PATCH] pkt-line: don't check string length in packet_length() René Scharfe
@ 2023-07-05  5:56 ` Junio C Hamano
  2023-07-05 16:15   ` René Scharfe
  2023-07-07 21:47 ` [PATCH v2] pkt-line: add size parameter to packet_length() René Scharfe
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2023-07-05  5:56 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

> hex2chr() takes care not to run over the end of a short string.
> 101736a14c (pkt-line: extern packet_length(), 2020-05-19) turned the
> input parameter of packet_length() from a string pointer into an array
> of known length, making string length checks unnecessary.  Get rid of
> them by using hexval() directly.

I am puzzled about the part of the above description on "making
string length checks unnecessary".  The two callers we currently
have both do pass char[4], but the compiler would not stop us from
passing a pointer to a memory region of an unknown size; if we
butcher one of the current callers

diff --git c/pkt-line.c w/pkt-line.c
index 6e022029ca..e1c49baefd 100644
--- c/pkt-line.c
+++ w/pkt-line.c
@@ -421,7 +421,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		return PACKET_READ_EOF;
 	}
 
-	len = packet_length(linelen);
+	len = packet_length(buffer);
 
 	if (len < 0) {
 		if (options & PACKET_READ_GENTLE_ON_READ_ERROR)

where "buffer" is just a random piece of memory passed to the caller
and there is no such guarantee like "it at least is 4 bytes long",
we would just slurp garbage and run past the end of the buffer.

> The resulting branchless code is simpler and it becomes easier to see
> that the function mirrors set_packet_header().

I do like the resulting code, but I feel a bit uneasy to sell this
change as "the code becomes more streamlined without losing safety".
It looks more like "this change is safe for our two callers; those
adding more callers in the future are better be very careful", no?

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] pkt-line: don't check string length in packet_length()
  2023-07-05  5:56 ` Junio C Hamano
@ 2023-07-05 16:15   ` René Scharfe
  2023-07-05 20:33     ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: René Scharfe @ 2023-07-05 16:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Am 05.07.23 um 07:56 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> hex2chr() takes care not to run over the end of a short string.
>> 101736a14c (pkt-line: extern packet_length(), 2020-05-19) turned the
>> input parameter of packet_length() from a string pointer into an array
>> of known length, making string length checks unnecessary.  Get rid of
>> them by using hexval() directly.
>
> I am puzzled about the part of the above description on "making
> string length checks unnecessary".  The two callers we currently
> have both do pass char[4], but the compiler would not stop us from
> passing a pointer to a memory region of an unknown size; if we
> butcher one of the current callers
>
> diff --git c/pkt-line.c w/pkt-line.c
> index 6e022029ca..e1c49baefd 100644
> --- c/pkt-line.c
> +++ w/pkt-line.c
> @@ -421,7 +421,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  		return PACKET_READ_EOF;
>  	}
>
> -	len = packet_length(linelen);
> +	len = packet_length(buffer);
>
>  	if (len < 0) {
>  		if (options & PACKET_READ_GENTLE_ON_READ_ERROR)
>
> where "buffer" is just a random piece of memory passed to the caller
> and there is no such guarantee like "it at least is 4 bytes long",
> we would just slurp garbage and run past the end of the buffer.

Indeed, GCC warns if you give it a short array, but not if you pass a
pointer: https://godbolt.org/z/T3xfE5W9n.  Clang doesn't care at all.

So that was wishful thinking on my part.  Sorry.  :-/

>> The resulting branchless code is simpler and it becomes easier to see
>> that the function mirrors set_packet_header().
>
> I do like the resulting code, but I feel a bit uneasy to sell this
> change as "the code becomes more streamlined without losing safety".
> It looks more like "this change is safe for our two callers; those
> adding more callers in the future are better be very careful", no?

With no way to enforce passing an array of a certain size to a function
the only safe options I see are keeping the length check, using a macro
or inlining the calculation.  Hmm.

René

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pkt-line: don't check string length in packet_length()
  2023-07-05 16:15   ` René Scharfe
@ 2023-07-05 20:33     ` Junio C Hamano
  2023-07-05 21:11       ` René Scharfe
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2023-07-05 20:33 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

>> I do like the resulting code, but I feel a bit uneasy to sell this
>> change as "the code becomes more streamlined without losing safety".
>> It looks more like "this change is safe for our two callers; those
>> adding more callers in the future are better be very careful", no?
>
> With no way to enforce passing an array of a certain size to a function
> the only safe options I see are keeping the length check, using a macro
> or inlining the calculation.  Hmm.

We keep repeating "length check" because that is what the comment in
the function says, but even if the caller has 4-byte, that 4-byte
substring at the beginning is what it read from the untrusted side
over the network.  We should be checking if we have 4 hexadecimal
length even if we are not running beyond the end of the buffer, no?

So it may be that the comment needs to be fixed more than the code.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pkt-line: don't check string length in packet_length()
  2023-07-05 20:33     ` Junio C Hamano
@ 2023-07-05 21:11       ` René Scharfe
  2023-07-05 22:27         ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: René Scharfe @ 2023-07-05 21:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Am 05.07.23 um 22:33 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>>> I do like the resulting code, but I feel a bit uneasy to sell this
>>> change as "the code becomes more streamlined without losing safety".
>>> It looks more like "this change is safe for our two callers; those
>>> adding more callers in the future are better be very careful", no?
>>
>> With no way to enforce passing an array of a certain size to a function
>> the only safe options I see are keeping the length check, using a macro
>> or inlining the calculation.  Hmm.

Three more ideas: Wrap the buffer in a custom struct, pass the four
bytes as a uint32_t or as four separate char parameters.  I better
stop now..

> We keep repeating "length check" because that is what the comment in
> the function says, but even if the caller has 4-byte, that 4-byte
> substring at the beginning is what it read from the untrusted side
> over the network.  We should be checking if we have 4 hexadecimal
> length even if we are not running beyond the end of the buffer, no?

Sure, that's done.  If any of the four characters is not a hexadecimal
digit then packet_length() returns a negative value, before and after
the change.

> So it may be that the comment needs to be fixed more than the code.

Which comment specifically?  The one in pkt-line.h doesn't mention
what happens if you pass in a short buffer -- leaving it undefined is
OK, I think.  And in and around pkt-line.c::packet_length() I don't
see any comment?

René

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pkt-line: don't check string length in packet_length()
  2023-07-05 21:11       ` René Scharfe
@ 2023-07-05 22:27         ` Junio C Hamano
  2023-07-06  5:07           ` René Scharfe
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2023-07-05 22:27 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

> Sure, that's done.  If any of the four characters is not a hexadecimal
> digit then packet_length() returns a negative value, before and after
> the change.

Ah, it is the beauty of hexval[] table ;-)

So as long as we are sure that we are not running beyond the end of
the buffer, we are OK.  And as I already said, I think "this change
is safe for our two callers; those adding more callers in the future
are better be very careful" is probably good enough for this one.

hex.h:hex2chr() says "don't run over the end of short strings", but
as far as I can see it does not check any such thing; find a page of
memory, whose next page is unmapped, and pointing *s at the last
byte of that page and calling it will happily run over the end and
would cause SIGBUS.  The function assumes that such a short string
is always NUL terminated, which is not a great way to guarantee that
we do not run over the end of strings.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pkt-line: don't check string length in packet_length()
  2023-07-05 22:27         ` Junio C Hamano
@ 2023-07-06  5:07           ` René Scharfe
  0 siblings, 0 replies; 8+ messages in thread
From: René Scharfe @ 2023-07-06  5:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Am 06.07.23 um 00:27 schrieb Junio C Hamano:
> hex.h:hex2chr() says "don't run over the end of short strings", but
> as far as I can see it does not check any such thing; find a page of
> memory, whose next page is unmapped, and pointing *s at the last
> byte of that page and calling it will happily run over the end and
> would cause SIGBUS.  The function assumes that such a short string
> is always NUL terminated, which is not a great way to guarantee that
> we do not run over the end of strings.

Yes, hex2chr() works with C strings, i.e. those that end with a NUL
character.  An empty string is just a NUL byte, a string of length 1
is a non-NUL byte and a NUL.  The function reads one byte from the
former and otherwise two bytes -- no overrun.

If a C string loses its NUL, how could you detect its end?

René

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] pkt-line: add size parameter to packet_length()
  2023-07-01  7:05 [PATCH] pkt-line: don't check string length in packet_length() René Scharfe
  2023-07-05  5:56 ` Junio C Hamano
@ 2023-07-07 21:47 ` René Scharfe
  1 sibling, 0 replies; 8+ messages in thread
From: René Scharfe @ 2023-07-07 21:47 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

hex2chr() takes care not to run over the end of a NUL-terminated string.
It's used in packet_length(), but both callers of that function pass a
four-byte buffer, making NUL-checks unnecessary.  packet_length() could
accidentally be used with a pointer to a buffer of unknown size at new
call-sites, though, and the compiler wouldn't complain.

Add a size parameter plus check, and remove the NUL-checks by calling
hexval() directly.  This trades three NUL checks against one size check
and the ability to report the use of a short buffer at runtime.

If any of the four bytes is NUL or -- more generally -- not a
hexadecimal digit, then packet_length() still returns a negative value.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 pkt-line.c    | 12 ++++++++----
 pkt-line.h    |  2 +-
 remote-curl.c |  3 ++-
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 62b4208b66..fcfa357ccd 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -373,10 +373,14 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 	return ret;
 }

-int packet_length(const char lenbuf_hex[4])
+int packet_length(const char lenbuf_hex[4], size_t size)
 {
-	int val = hex2chr(lenbuf_hex);
-	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
+	if (size < 4)
+		BUG("buffer too small");
+	return	hexval(lenbuf_hex[0]) << 12 |
+		hexval(lenbuf_hex[1]) <<  8 |
+		hexval(lenbuf_hex[2]) <<  4 |
+		hexval(lenbuf_hex[3]);
 }

 static char *find_packfile_uri_path(const char *buffer)
@@ -419,7 +423,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		return PACKET_READ_EOF;
 	}

-	len = packet_length(linelen);
+	len = packet_length(linelen, sizeof(linelen));

 	if (len < 0) {
 		if (options & PACKET_READ_GENTLE_ON_READ_ERROR)
diff --git a/pkt-line.h b/pkt-line.h
index 7c23a4bfaf..954eec8719 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -94,7 +94,7 @@ int packet_read(int fd, char *buffer, unsigned size, int options);
  * If lenbuf_hex contains non-hex characters, return -1. Otherwise, return the
  * numeric value of the length header.
  */
-int packet_length(const char lenbuf_hex[4]);
+int packet_length(const char lenbuf_hex[4], size_t size);

 /*
  * Read a packetized line into a buffer like the 'packet_read()' function but
diff --git a/remote-curl.c b/remote-curl.c
index acf7b2bb40..143318658e 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -763,7 +763,8 @@ static void check_pktline(struct check_pktline_state *state, const char *ptr, si
 			size -= digits_remaining;

 			if (state->len_filled == 4) {
-				state->remaining = packet_length(state->len_buf);
+				state->remaining = packet_length(state->len_buf,
+								 sizeof(state->len_buf));
 				if (state->remaining < 0) {
 					die(_("remote-curl: bad line length character: %.4s"), state->len_buf);
 				} else if (state->remaining == 2) {
--
2.41.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-07-07 21:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-01  7:05 [PATCH] pkt-line: don't check string length in packet_length() René Scharfe
2023-07-05  5:56 ` Junio C Hamano
2023-07-05 16:15   ` René Scharfe
2023-07-05 20:33     ` Junio C Hamano
2023-07-05 21:11       ` René Scharfe
2023-07-05 22:27         ` Junio C Hamano
2023-07-06  5:07           ` René Scharfe
2023-07-07 21:47 ` [PATCH v2] pkt-line: add size parameter to packet_length() René Scharfe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).