* [PATCH 0/6] cat-file: add remote-object-info to batch-command
@ 2024-06-28 19:04 Eric Ju
2024-06-28 19:04 ` [PATCH 1/6] fetch-pack: refactor packet writing Eric Ju
` (16 more replies)
0 siblings, 17 replies; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:04 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
This is a continuation of Calvin Wan's (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
Sometimes it is useful to get information about an object without having to download
it completely. The server logic for retrieving size has already been implemented and merged in
"a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
This patch series implements the client option for it.
This patch series adds the `remote-object-info` command to `cat-file --batch-command`. This command
allows the client to make an object-info command request to a server
that supports protocol v2. If the server speaks v2 but does not advertise
the object-info capability, the entire object is fetched and the
relevant object info is returned.
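As a rough illustration of the behavior described above, the choice the
client makes can be sketched as below. This is only a sketch: the enum
and the function choose_action() are hypothetical names that do not
exist in git, and the protocol version is passed as a plain integer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in git. */
enum remote_info_action {
	ACTION_OBJECT_INFO,	/* ask the server via the object-info command */
	ACTION_FETCH_FALLBACK,	/* fetch the objects, then inspect them locally */
	ACTION_DIE		/* protocol v0/v1: remote-object-info is unavailable */
};

static enum remote_info_action choose_action(int protocol_version,
					     bool server_has_object_info)
{
	/* This sketch only knows protocol versions 0, 1 and 2. */
	assert(protocol_version >= 0 && protocol_version <= 2);

	if (protocol_version != 2)
		return ACTION_DIE;
	return server_has_object_info ? ACTION_OBJECT_INFO
				      : ACTION_FETCH_FALLBACK;
}
```

Whether the v0/v1 case should die or only warn is exactly the first
open question in this cover letter; the sketch follows the current
implementation.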
A few questions are open for discussion:
1. In the current implementation, if a user issues `remote-object-info` over protocol v1,
`cat-file --batch-command` will die. Which do we prefer: "error and exit (i.e. die)"
or "warn and wait for the next command"?
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when remote-object-info is used.
Any suggestions on this approach?
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
Calvin Wan (5):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
cat-file: add remote-object-info to batch-command
Eric Ju (1):
cat-file: add declaration of variable i inside its for loop
Documentation/git-cat-file.txt | 22 +-
builtin/cat-file.c | 240 ++++++++++----
fetch-pack.c | 48 ++-
fetch-pack.h | 10 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/t1017-cat-file-remote-object-info.sh | 412 +++++++++++++++++++++++++
transport-helper.c | 8 +-
transport.c | 102 +++++-
transport.h | 11 +
11 files changed, 785 insertions(+), 86 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
--
2.45.2
* [PATCH 1/6] fetch-pack: refactor packet writing
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-06-28 19:04 ` Eric Ju
2024-07-04 16:59 ` Karthik Nayak
2024-06-28 19:04 ` [PATCH 2/6] fetch-pack: move fetch initialization Eric Ju
` (15 subsequent siblings)
16 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:04 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
From: Calvin Wan <calvinwan@google.com>
A subsequent patch needs to write capabilities for another command.
Rename write_fetch_command_and_capabilities() to
write_command_and_capabilities() and pass the command name as a
parameter, so that it can be used by both fetch and future commands.
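To illustrate why passing the command name is enough, here is a
minimal standalone sketch of writing a "command=<name>" pkt-line. It
assumes only the pkt-line framing (a 4-digit hex length that counts the
4 header bytes, followed by the payload). pkt_write() and
write_command() are hypothetical stand-ins, not git's
packet_buf_write() or the function touched by this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one pkt-line to buf: a 4-digit hex length (which counts the
 * 4 header bytes themselves) followed by the payload. Returns the
 * number of bytes written, or 0 if the line does not fit.
 */
static size_t pkt_write(char *buf, size_t cap, const char *payload)
{
	size_t len = strlen(payload) + 4;

	/* The length must fit in four hex digits and in the buffer. */
	if (len > 0xffff || len + 1 > cap)
		return 0;
	snprintf(buf, cap, "%04zx%s", len, payload);
	return len;
}

/* The command name is a parameter, so "fetch" and "object-info" share it. */
static size_t write_command(char *buf, size_t cap, const char *command)
{
	char line[64];
	int n;

	assert(command != NULL);
	n = snprintf(line, sizeof(line), "command=%s", command);
	if (n < 0 || (size_t)n >= sizeof(line))
		return 0;
	return pkt_write(buf, cap, line);
}
```

With this shape, a caller would write write_command(buf, cap, "fetch")
or write_command(buf, cap, "object-info") without duplicating the
framing logic.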
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index eba9e420ea..fc9fb66cd8 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1313,13 +1313,13 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
+static void write_command_and_capabilities(struct strbuf *req_buf,
+ const struct string_list *server_options, const char* command)
{
const char *hash_name;
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
if (server_supports_v2("agent"))
packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
if (advertise_sid && server_supports_v2("session-id"))
@@ -1355,7 +1355,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, args->server_options, "fetch");
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2163,7 +2163,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, server_options, "fetch");
packet_buf_write(&req_buf, "wait-for-done");
--
2.45.2
* [PATCH 2/6] fetch-pack: move fetch initialization
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-06-28 19:04 ` [PATCH 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-06-28 19:04 ` Eric Ju
2024-06-28 19:05 ` [PATCH 3/6] serve: advertise object-info feature Eric Ju
` (14 subsequent siblings)
16 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:04 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index fc9fb66cd8..da0de9c537 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1676,18 +1676,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.45.2
* [PATCH 3/6] serve: advertise object-info feature
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-06-28 19:04 ` [PATCH 1/6] fetch-pack: refactor packet writing Eric Ju
2024-06-28 19:04 ` [PATCH 2/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-06-28 19:05 ` Eric Ju
2024-06-28 19:05 ` [PATCH 4/6] transport: add client support for object-info Eric Ju
` (13 subsequent siblings)
16 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:05 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
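On the client side, deciding whether to query or to fall back comes
down to checking whether the advertised value lists a feature. A
minimal sketch follows, assuming the value is a space-separated list of
feature names such as "size". value_has_feature() is a hypothetical
helper for illustration, not git's server_supports_feature().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the space-separated list in value contains feature. */
static bool value_has_feature(const char *value, const char *feature)
{
	size_t flen;

	assert(feature != NULL);
	flen = strlen(feature);
	if (!value || !flen)
		return false;

	while (*value) {
		/* Length of the current token, up to the next space. */
		size_t tlen = strcspn(value, " ");

		if (tlen == flen && !strncmp(value, feature, flen))
			return true;
		value += tlen;
		value += strspn(value, " ");	/* skip separators */
	}
	return false;
}
```

Matching whole tokens rather than substrings keeps a future feature
name such as "sizes" from being mistaken for "size".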
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index aa651b73e9..fd42fecc15 100644
--- a/serve.c
+++ b/serve.c
@@ -68,7 +68,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -76,6 +76,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.45.2
* [PATCH 4/6] transport: add client support for object-info
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (2 preceding siblings ...)
2024-06-28 19:05 ` [PATCH 3/6] serve: advertise object-info feature Eric Ju
@ 2024-06-28 19:05 ` Eric Ju
2024-07-09 7:15 ` Toon claes
2024-07-10 10:13 ` Karthik Nayak
2024-06-28 19:05 ` [PATCH 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
` (12 subsequent siblings)
16 siblings, 2 replies; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:05 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
From: Calvin Wan <calvinwan@google.com>
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
as "a2ba162cda (object-info: support for retrieving object info,
2021-04-20)".
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
the 'size' feature from a v2 server. If the server does not
advertise a requested feature, then the client falls back
to making the request through 'fetch'.
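For illustration, parsing one line of the server's reply could look
like the sketch below. It assumes each reply line has the form
"<oid> <size>" and treats an empty or non-numeric size as an error.
parse_object_info_line() is a hypothetical helper, stricter than the
strtoul() call in this patch, and is not part of git.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<oid> <size>" into oid_out and size_out.
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_object_info_line(const char *line, char *oid_out,
				  size_t oid_cap, unsigned long *size_out)
{
	const char *sp;
	char *end;
	size_t oid_len;
	unsigned long size;

	assert(line && oid_out && size_out);

	sp = strchr(line, ' ');
	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;
	if (sp[1] < '0' || sp[1] > '9')
		return -1;	/* empty or non-numeric size */

	errno = 0;
	size = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1;	/* overflow or trailing garbage */

	memcpy(oid_out, line, oid_len);
	oid_out[oid_len] = '\0';
	*size_out = size;
	return 0;
}
```

The outputs are written only after the whole line has been validated,
so a caller never sees a half-parsed result.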
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
---
fetch-pack.c | 24 +++++++++++
fetch-pack.h | 10 +++++
transport-helper.c | 8 +++-
transport.c | 102 ++++++++++++++++++++++++++++++++++++++++++---
transport.h | 11 +++++
5 files changed, 148 insertions(+), 7 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index da0de9c537..d533cac1d8 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1345,6 +1345,27 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
+void send_object_info_request(int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, args->server_options, "object-info");
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1682,6 +1703,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index 6775d26517..16e4dc0824 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info **object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
@@ -68,6 +70,12 @@ struct fetch_pack_args {
unsigned connectivity_checked:1;
};
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
/*
* sought represents remote references that should be updated from.
* On return, the names that were found on the remote will have been
@@ -101,4 +109,6 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
*/
int report_unmatched_refs(struct ref **sought, int nr_sought);
+void send_object_info_request(int fd_out, struct object_info_args *args);
+
#endif
diff --git a/transport-helper.c b/transport-helper.c
index 9820947ab2..670d1e7068 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -697,13 +697,17 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
+ * require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
return -1;
}
+ if (transport->smart_options->object_info) {
+ // fail the command explicitly to avoid further commands input
+ die(_("remote-object-info requires protocol v2"));
+ }
if (!data->get_refs_list_called)
get_refs_list_using_list(transport, 0);
diff --git a/transport.c b/transport.c
index 83ddea8fbc..2847aa3f3c 100644
--- a/transport.c
+++ b/transport.c
@@ -363,6 +363,73 @@ static struct ref *handshake(struct transport *transport, int for_push,
return refs;
}
+static int fetch_object_info(struct transport *transport, struct object_info **object_info_data)
+{
+ int size_index = -1;
+ struct git_transport_data *data = transport->data;
+ struct object_info_args args;
+ struct packet_reader reader;
+
+ memset(&args, 0, sizeof(args));
+ args.server_options = transport->server_options;
+ args.object_info_options = transport->smart_options->object_info_options;
+ args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+ data->version = discover_version(&reader);
+
+ transport->hash_algo = reader.hash_algo;
+
+ switch (data->version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args.object_info_options, "size")
+ && !server_supports_feature("object-info", "size", 0)) {
+ return -1;
+ }
+ send_object_info_request(data->fd[1], &args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args.object_info_options->nr; i++) {
+ if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
+ if (!strcmp(reader.line, "size"))
+ size_index = i;
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader.line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+ *(*object_info_data)[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+ }
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+
+ return 0;
+}
+
static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
struct transport_ls_refs_options *options)
{
@@ -410,6 +477,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
+ struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -436,11 +504,27 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
-
- if (!data->finished_handshake) {
- int i;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options && transport->smart_options->object_info) {
+ struct ref *ref = object_info_refs;
+
+ if (!fetch_object_info(transport, data->options.object_info_data))
+ goto cleanup;
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
+ temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
+ temp_ref->exact_oid = 1;
+ ref->next = temp_ref;
+ ref = ref->next;
+ }
+ transport->remote_refs = object_info_refs->next;
+ } else if (!data->finished_handshake) {
int must_list_refs = 0;
- for (i = 0; i < nr_heads; i++) {
+ for (int i = 0; i < nr_heads; i++) {
if (!to_fetch[i]->exact_oid) {
must_list_refs = 1;
break;
@@ -478,11 +562,18 @@ static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs->next;
+ for (int i = 0; ref_cpy_reader; i++) {
+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &(*args.object_info_data)[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
- if (!refs)
+ if (!refs && !args.object_info)
ret = -1;
if (report_unmatched_refs(to_fetch, nr_heads))
ret = -1;
@@ -498,6 +589,7 @@ static int fetch_refs_via_pack(struct transport *transport,
free_refs(refs_tmp);
free_refs(refs);
list_objects_filter_release(&args.filter_options);
+ free_refs(object_info_refs);
return ret;
}
diff --git a/transport.h b/transport.h
index 6393cd9823..5a3cda1860 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to pull only object-info. Fallbacks
+ * to pulling entire object if object-info is not supported.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info **object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.45.2
* [PATCH 5/6] cat-file: add declaration of variable i inside its for loop
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (3 preceding siblings ...)
2024-06-28 19:05 ` [PATCH 4/6] transport: add client support for object-info Eric Ju
@ 2024-06-28 19:05 ` Eric Ju
2024-07-10 10:16 ` Karthik Nayak
2024-06-28 19:05 ` [PATCH 6/6] cat-file: add remote-object-info to batch-command Eric Ju
` (11 subsequent siblings)
16 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:05 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
Some code declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/cat-file.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 43a1d7ac49..72a78cdc8c 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -668,12 +668,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (int i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -681,9 +679,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -709,7 +705,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -719,7 +714,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (int i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
--
2.45.2
* [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (4 preceding siblings ...)
2024-06-28 19:05 ` [PATCH 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-06-28 19:05 ` Eric Ju
2024-07-09 1:50 ` Justin Tobler
` (2 more replies)
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
` (10 subsequent siblings)
16 siblings, 3 replies; 174+ messages in thread
From: Eric Ju @ 2024-06-28 19:05 UTC
To: git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
From: Calvin Wan <calvinwan@google.com>
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which would create overhead when
making requests to a server, so `remote-object-info` instead takes
multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Set batch mode state, get object info, print object info
In --buffer mode, this changes to:
 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue
    - Call respective function attached to command
    - Set batch mode state, get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
 - Receive and parse input from user
 If command is `remote-object-info`:
    - Get object info from remote
    - Loop through object info
        - Call respective function attached to `info`
        - Set batch mode state, use passed in object info, print object
          info
 Else:
    - Call respective function attached to command
    - Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue:
    If command is `remote-object-info`:
        - Get object info from remote
        - Loop through object info
            - Call respective function attached to `info`
            - Set batch mode state, use passed in object info, print
              object info
    Else:
        - Call respective function attached to command
        - Set batch mode state, get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then generates multiple `info` commands with the object info passed in.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
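As a sketch of how the single-line form can be consumed, the arguments
after the command name can be split in place, with the first token
taken as the remote and the rest as object ids. split_args() is a
hypothetical helper written for this illustration; it is not
necessarily how the patch itself splits the line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a space-separated argument string in place.
 * Stores up to max token pointers in argv and returns the token
 * count, or -1 if there are more than max tokens.
 */
static int split_args(char *args, char **argv, size_t max)
{
	size_t n = 0;
	char *p = args;

	assert(args && argv);

	while (*p) {
		p += strspn(p, " ");	/* skip leading separators */
		if (!*p)
			break;
		if (n == max)
			return -1;
		argv[n++] = p;
		p += strcspn(p, " ");	/* advance to the end of the token */
		if (*p)
			*p++ = '\0';
	}
	return (int)n;
}
```

Given "origin <oid> <oid>", argv[0] would be the remote and argv[1..]
the object ids, so one request can carry them all; a result smaller
than 2 would mean no object id was supplied.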
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/git-cat-file.txt | 22 +-
builtin/cat-file.c | 231 ++++++++++----
object-file.c | 11 +
object-store-ll.h | 3 +
t/t1017-cat-file-remote-object-info.sh | 412 +++++++++++++++++++++++++
5 files changed, 620 insertions(+), 59 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index bd95a6c10a..ab0647bb39 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,12 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified <remote> without
+ downloading object from remote.
+ Error when no object references is provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +296,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +305,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +322,9 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except remote-object-info command who uses
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+When "%(objecttype)" is supported, default format should be unified.
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +406,10 @@ scripting purposes.
CAVEATS
-------
+Note that since objecttype, objectsize:disk and deltabase are currently not supported by the
+remote-object-info, git will error and exit when they are in the format string.
+
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 72a78cdc8c..34958a1747 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,14 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
+#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
+
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -508,7 +516,6 @@ static void batch_object_write(const char *obj_name,
}
batch_write(opt, scratch->buf, scratch->len);
-
if (opt->batch_mode == BATCH_MODE_CONTENTS) {
print_object_or_die(opt, data);
batch_write(opt, &opt->output_delim, 1);
@@ -526,51 +533,118 @@ static void batch_one_object(const char *obj_name,
(opt->follow_symlinks ? GET_OID_FOLLOW_SYMLINKS : 0);
enum get_oid_result result;
- result = get_oid_with_context(the_repository, obj_name,
- flags, &data->oid, &ctx);
- if (result != FOUND) {
- switch (result) {
- case MISSING_OBJECT:
- printf("%s missing%c", obj_name, opt->output_delim);
- break;
- case SHORT_NAME_AMBIGUOUS:
- printf("%s ambiguous%c", obj_name, opt->output_delim);
- break;
- case DANGLING_SYMLINK:
- printf("dangling %"PRIuMAX"%c%s%c",
- (uintmax_t)strlen(obj_name),
- opt->output_delim, obj_name, opt->output_delim);
- break;
- case SYMLINK_LOOP:
- printf("loop %"PRIuMAX"%c%s%c",
- (uintmax_t)strlen(obj_name),
- opt->output_delim, obj_name, opt->output_delim);
- break;
- case NOT_DIR:
- printf("notdir %"PRIuMAX"%c%s%c",
- (uintmax_t)strlen(obj_name),
- opt->output_delim, obj_name, opt->output_delim);
- break;
- default:
- BUG("unknown get_sha1_with_context result %d\n",
- result);
- break;
+ if (!opt->use_remote_info) {
+ result = get_oid_with_context(the_repository, obj_name,
+ flags, &data->oid, &ctx);
+ if (result != FOUND) {
+ switch (result) {
+ case MISSING_OBJECT:
+ printf("%s missing%c", obj_name, opt->output_delim);
+ break;
+ case SHORT_NAME_AMBIGUOUS:
+ printf("%s ambiguous%c", obj_name, opt->output_delim);
+ break;
+ case DANGLING_SYMLINK:
+ printf("dangling %"PRIuMAX"%c%s%c",
+ (uintmax_t)strlen(obj_name),
+ opt->output_delim, obj_name, opt->output_delim);
+ break;
+ case SYMLINK_LOOP:
+ printf("loop %"PRIuMAX"%c%s%c",
+ (uintmax_t)strlen(obj_name),
+ opt->output_delim, obj_name, opt->output_delim);
+ break;
+ case NOT_DIR:
+ printf("notdir %"PRIuMAX"%c%s%c",
+ (uintmax_t)strlen(obj_name),
+ opt->output_delim, obj_name, opt->output_delim);
+ break;
+ default:
+ BUG("unknown get_sha1_with_context result %d\n",
+ result);
+ break;
+ }
+ fflush(stdout);
+ return;
}
- fflush(stdout);
- return;
- }
- if (ctx.mode == 0) {
- printf("symlink %"PRIuMAX"%c%s%c",
- (uintmax_t)ctx.symlink_path.len,
- opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
- fflush(stdout);
- return;
+ if (ctx.mode == 0) {
+ printf("symlink %"PRIuMAX"%c%s%c",
+ (uintmax_t)ctx.symlink_path.len,
+ opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
+ fflush(stdout);
+ return;
+ }
}
batch_object_write(obj_name, scratch, opt, data, NULL, 0);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT
+ */
+ if (!opt->format) {
+ opt->format = "%(objectname) %(objectsize)";
+ }
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ int include_size = 0;
+
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+ /*
+ * 'size' is the only option currently supported.
+ * Other options that are passed in the format will exit with error.
+ */
+ if (strstr(opt->format, "%(objectsize)")) {
+ string_list_append(&object_info_options, "size");
+ include_size = 1;
+ }
+ if (strstr(opt->format, "%(objecttype)")) {
+ die(_("objecttype is currently not supported with remote-object-info"));
+ }
+ if (strstr(opt->format, "%(objectsize:disk)"))
+ die(_("objectsize:disk is currently not supported with remote-object-info"));
+ if (strstr(opt->format, "%(deltabase)"))
+ die(_("deltabase is currently not supported with remote-object-info"));
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+ if (include_size)
+ remote_object_info[i].sizep = xcalloc(1, sizeof(long));
+ }
+ gtransport->smart_options->object_info_data = &remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -642,6 +716,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
struct queued_cmd {
parse_cmd_fn_t fn;
char *line;
+ const char *name;
};
static void parse_cmd_contents(struct batch_options *opt,
@@ -662,6 +737,55 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static const struct parse_cmd {
+ const char *name;
+ parse_cmd_fn_t fn;
+ unsigned takes_args;
+} commands[] = {
+ { "contents", parse_cmd_contents, 1 },
+ { "info", parse_cmd_info, 1 },
+ { "remote-object-info", parse_cmd_info, 1 },
+ { "flush", NULL, 0 },
+};
+
+static void parse_remote_info(struct batch_options *opt,
+ char *line,
+ struct strbuf *output,
+ struct expand_data *data,
+ const struct parse_cmd *p_cmd,
+ struct queued_cmd *q_cmd)
+{
+ int count;
+ const char **argv;
+
+ count = split_cmdline(line, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ data->mark_query = 0;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+ if (remote_object_info[i].sizep)
+ data->size = *remote_object_info[i].sizep;
+ if (remote_object_info[i].typep)
+ data->type = *remote_object_info[i].typep;
+
+ data->oid = object_info_oids.oid[i];
+ if (p_cmd)
+ p_cmd->fn(opt, argv[i+1], output, data);
+ else
+ q_cmd->fn(opt, argv[i+1], output, data);
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+ data->mark_query = 1;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -671,8 +795,12 @@ static void dispatch_calls(struct batch_options *opt,
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (int i = 0; i < nr; i++)
- cmd[i].fn(opt, cmd[i].line, output, data);
+ for (int i = 0; i < nr; i++) {
+ if (!strcmp(cmd[i].name, "remote-object-info"))
+ parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
+ else
+ cmd[i].fn(opt, cmd[i].line, output, data);
+ }
fflush(stdout);
}
@@ -685,17 +813,6 @@ static void free_cmds(struct queued_cmd *cmd, size_t *nr)
*nr = 0;
}
-
-static const struct parse_cmd {
- const char *name;
- parse_cmd_fn_t fn;
- unsigned takes_args;
-} commands[] = {
- { "contents", parse_cmd_contents, 1},
- { "info", parse_cmd_info, 1},
- { "flush", NULL, 0},
-};
-
static void batch_objects_command(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data)
@@ -740,11 +857,17 @@ static void batch_objects_command(struct batch_options *opt,
dispatch_calls(opt, output, data, queued_cmd, nr);
free_cmds(queued_cmd, &nr);
} else if (!opt->buffer_output) {
- cmd->fn(opt, p, output, data);
+ if (!strcmp(cmd->name, "remote-object-info")) {
+ char *line = xstrdup_or_null(p);
+ parse_remote_info(opt, line, output, data, cmd, NULL);
+ } else {
+ cmd->fn(opt, p, output, data);
+ }
} else {
ALLOC_GROW(queued_cmd, nr + 1, alloc);
call.fn = cmd->fn;
call.line = xstrdup_or_null(p);
+ call.name = cmd->name;
queued_cmd[nr++] = call;
}
}
@@ -761,8 +884,6 @@ static void batch_objects_command(struct batch_options *opt,
strbuf_release(&input);
}
-#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
-
static int batch_objects(struct batch_options *opt)
{
struct strbuf input = STRBUF_INIT;
diff --git a/object-file.c b/object-file.c
index d3cf4b8b2e..6aaa167942 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2988,3 +2988,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index c5f2bb2fc2..333e19cd1e 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -533,4 +533,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..7a7bdfeb91
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,412 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+# This section tests --batch-command with the remote-object-info command.
+# Since "%(objecttype)" is currently not supported by remote-object-info,
+# the filters are set to "%(objectname) %(objectsize)".
+# Tests with the default filter are used to test the fallback to the 'fetch' command.
+
+
+# Test --batch-command remote-object-info with 'git://' transport
+
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+'
+
+set_transport_variables () {
+ hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_sha1=$(git -C "$1" write-tree)
+ commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
+ tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'http://' transport
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello
+'
+
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "objectsize:disk is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "deltabase is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ (
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+# shellcheck disable=SC2016
+test_expect_success 'remote-object-info fails on missing OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd missing_oid_repo &&
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "fatal: object-info: not our ref $object_id" err
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' transport
+
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello
+'
+
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ cd server &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://$(pwd)" $hello_sha1
+ remote-object-info "file://$(pwd)" $tree_sha1
+ remote-object-info "file://$(pwd)" $commit_sha1
+ remote-object-info "file://$(pwd)" $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ cd server &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ cd server &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
+ remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ cd server &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
+ remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_done
--
2.45.2
* Re: [PATCH 1/6] fetch-pack: refactor packet writing
2024-06-28 19:04 ` [PATCH 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-07-04 16:59 ` Karthik Nayak
2024-07-08 15:17 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Karthik Nayak @ 2024-07-04 16:59 UTC (permalink / raw)
To: Eric Ju, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
Eric Ju <eric.peijian@gmail.com> writes:
> From: Calvin Wan <calvinwan@google.com>
>
> A subsequent patch need to write capabilities for another command.
s/need/needs
> Refactor write_fetch_command_and_capabilities() to be used by both
> fetch and future command.
>
Nit: mostly from my lack of understanding, but until I read the code, I
couldn't understand what 'command' meant in this para. Maybe some
preface would be nice here.
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Helped-by: Christian Couder <chriscool@tuxfamily.org>
> ---
> fetch-pack.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index eba9e420ea..fc9fb66cd8 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1313,13 +1313,13 @@ static int add_haves(struct fetch_negotiator *negotiator,
> return haves_added;
> }
>
> -static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
> - const struct string_list *server_options)
> +static void write_command_and_capabilities(struct strbuf *req_buf,
> + const struct string_list *server_options, const char* command)
> {
> const char *hash_name;
>
> - ensure_server_supports_v2("fetch");
> - packet_buf_write(req_buf, "command=fetch");
> + ensure_server_supports_v2(command);
> + packet_buf_write(req_buf, "command=%s", command);
> if (server_supports_v2("agent"))
> packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
> if (advertise_sid && server_supports_v2("session-id"))
> @@ -1355,7 +1355,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
> int done_sent = 0;
> struct strbuf req_buf = STRBUF_INIT;
>
> - write_fetch_command_and_capabilities(&req_buf, args->server_options);
> + write_command_and_capabilities(&req_buf, args->server_options, "fetch");
>
> if (args->use_thin_pack)
> packet_buf_write(&req_buf, "thin-pack");
> @@ -2163,7 +2163,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
> the_repository, "%d",
> negotiation_round);
> strbuf_reset(&req_buf);
> - write_fetch_command_and_capabilities(&req_buf, server_options);
> + write_command_and_capabilities(&req_buf, server_options, "fetch");
>
> packet_buf_write(&req_buf, "wait-for-done");
>
> --
> 2.45.2
Right, this commit in itself looks good. But I was curious why we need
this, so I did a sneak peek into the following commits.
To summarize, we want to call:
`write_command_and_capabilities(..., "object-info");`
in the upcoming patches to get the object-info details from the server.
But isn't this function too specific to the "fetch" command to be
generalized to be for "object-info" too?
Wouldn't it make sense to add a custom function for 'object-info' in
'connect.c'? Like how we currently have `get_remote_bundle_uri()` for
'bundle-uri' and `get_remote_refs` for 'ls-refs'?
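For reference, here is a minimal sketch of what an 'object-info' request
might look like on the wire, whichever function ends up writing it. It
assumes the layout described in the protocol-v2 documentation: a command
line, a delim-pkt, "size", one "oid <oid>" line per object, and a
flush-pkt. Capabilities such as agent= are left out for brevity. The
names struct req, req_raw(), req_pkt(), write_command() and
build_object_info_request() are made up for this illustration and are
not functions in git.git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical request buffer; git.git uses struct strbuf instead. */
struct req {
	char buf[1024];
	size_t len;
	int overflow;
};

/* Append a string verbatim (used for the special "0000"/"0001" packets). */
static void req_raw(struct req *r, const char *s)
{
	size_t n = strlen(s);

	if (r->overflow || r->len + n >= sizeof(r->buf)) {
		r->overflow = 1;
		return;
	}
	memcpy(r->buf + r->len, s, n + 1);
	r->len += n;
}

/* Append one pkt-line: 4 hex digits of length (header included), then payload. */
static void req_pkt(struct req *r, const char *payload)
{
	char hdr[5];
	size_t n = strlen(payload) + 4;

	assert(n <= 0xffff); /* the length must fit in 4 hex digits */
	snprintf(hdr, sizeof(hdr), "%04zx", n);
	req_raw(r, hdr);
	req_raw(r, payload);
}

/* The generalized part: only the command name varies between callers. */
static void write_command(struct req *r, const char *command)
{
	char line[128];

	r->len = 0;
	r->overflow = 0;
	r->buf[0] = '\0';
	snprintf(line, sizeof(line), "command=%s", command);
	req_pkt(r, line);
	req_raw(r, "0001"); /* delim-pkt: command arguments follow */
}

/* The object-info specific part: ask for the sizes of the listed objects. */
static void build_object_info_request(struct req *r, const char **oids, size_t nr)
{
	char line[128];
	size_t i;

	write_command(r, "object-info");
	req_pkt(r, "size");
	for (i = 0; i < nr; i++) {
		snprintf(line, sizeof(line), "oid %s", oids[i]);
		req_pkt(r, line);
	}
	req_raw(r, "0000"); /* flush-pkt: end of request */
}
```

In this sketch only write_command() is shared between commands; the
arguments after the delim-pkt stay specific to each command.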
* Re: [PATCH 1/6] fetch-pack: refactor packet writing
2024-07-04 16:59 ` Karthik Nayak
@ 2024-07-08 15:17 ` Peijian Ju
2024-07-10 9:39 ` Karthik Nayak
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2024-07-08 15:17 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Thu, Jul 4, 2024 at 1:00 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > From: Calvin Wan <calvinwan@google.com>
> >
> > A subsequent patch need to write capabilities for another command.
>
> s/need/needs
Thank you. Fixed in v2.
> > Refactor write_fetch_command_and_capabilities() to be used by both
> > fetch and future command.
> >
>
> Nit: mostly from my lack of understanding, but until I read the code, I
> couldn't understand what 'command' meant in this para. Maybe some
> preface would be nice here.
>
Thank you. I will add this to the commit message in v2.
Here "command" means an operation supported by Git's wire protocol v2
(https://git-scm.com/docs/protocol-v2). One example is "fetch", the
operation behind the git-fetch(1) subcommand; another is "object-info",
a server-side operation implemented in "a2ba162cda (object-info:
support for retrieving object info, 2021-04-20)".
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> > Helped-by: Jonathan Tan <jonathantanmy@google.com>
> > Helped-by: Christian Couder <chriscool@tuxfamily.org>
> > ---
> > fetch-pack.c | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index eba9e420ea..fc9fb66cd8 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1313,13 +1313,13 @@ static int add_haves(struct fetch_negotiator *negotiator,
> > return haves_added;
> > }
> >
> > -static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
> > - const struct string_list *server_options)
> > +static void write_command_and_capabilities(struct strbuf *req_buf,
> > + const struct string_list *server_options, const char* command)
> > {
> > const char *hash_name;
> >
> > - ensure_server_supports_v2("fetch");
> > - packet_buf_write(req_buf, "command=fetch");
> > + ensure_server_supports_v2(command);
> > + packet_buf_write(req_buf, "command=%s", command);
> > if (server_supports_v2("agent"))
> > packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
> > if (advertise_sid && server_supports_v2("session-id"))
> > @@ -1355,7 +1355,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
> > int done_sent = 0;
> > struct strbuf req_buf = STRBUF_INIT;
> >
> > - write_fetch_command_and_capabilities(&req_buf, args->server_options);
> > + write_command_and_capabilities(&req_buf, args->server_options, "fetch");
> >
> > if (args->use_thin_pack)
> > packet_buf_write(&req_buf, "thin-pack");
> > @@ -2163,7 +2163,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
> > the_repository, "%d",
> > negotiation_round);
> > strbuf_reset(&req_buf);
> > - write_fetch_command_and_capabilities(&req_buf, server_options);
> > + write_command_and_capabilities(&req_buf, server_options, "fetch");
> >
> > packet_buf_write(&req_buf, "wait-for-done");
> >
> > --
> > 2.45.2
>
> Right, this commit in itself looks good. But I was curious why we need
> this, so I did a sneak peek into the following commits.
>
> To summarize, we want to call:
> `write_command_and_capabilities(..., "object-info");`
> in the upcoming patches to get the object-info details from the server.
> But isn't this function too specific to the "fetch" command to be
> generalized to be for "object-info" too?
>
> Wouldn't it make sense to add a custom function for 'object-info' in
> 'connect.c'? Like how we currently have `get_remote_bundle_uri()` for
> 'bundle-uri' and `get_remote_refs` for 'ls-refs'?
Thank you. I have been reading through the earlier comments left by
Taylor at https://lore.kernel.org/git/YkOPyc9tUfe2Tozx@nand.local/:
"Makes obvious sense, and this was something that jumped out to me when I
looked at the first and second versions of this patch. I'm glad that
this is getting factored out."
So the refactoring into a more general function appears to be
intentional: callers are encouraged to use the general function to
write the command and its capabilities rather than add a custom one.
Taylor's comment is from 2 years ago, but I think refactoring this
into a more general function to enforce DRY still makes sense.
* Re: [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:05 ` [PATCH 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-07-09 1:50 ` Justin Tobler
2024-07-12 17:41 ` Peijian Ju
2024-07-09 7:16 ` Toon claes
2024-07-10 12:08 ` Karthik Nayak
2 siblings, 1 reply; 174+ messages in thread
From: Justin Tobler @ 2024-07-09 1:50 UTC (permalink / raw)
To: Eric Ju; +Cc: git, Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On 24/06/28 03:05PM, Eric Ju wrote:
> From: Calvin Wan <calvinwan@google.com>
>
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
> Add `remote-object-info` to cat-file --batch-command.
>
> While `info` takes object ids one at a time, this creates overhead when
> making requests to a server so `remote-object-info` instead can take
> multiple object ids at once.
>
> cat-file --batch-command is generally implemented in the following
> manner:
>
> - Receive and parse input from user
> - Call respective function attached to command
> - Set batch mode state, get object info, print object info
>
> In --buffer mode, this changes to:
>
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue
> - Call respective function attached to command
> - Set batch mode state, get object info, print object info
So the problem is that there is overhead associated with getting object
info from the remote. Therefore, remote-object-info also supports
batching objects together. This seems reasonable.
>
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server. Therefore, `remote-object-info` is implemented in
> the following manner:
>
> - Receive and parse input from user
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through object info
> - Call respective function attached to `info`
> - Set batch mode state, use passed in object info, print object
> info
> Else:
> - Call respective function attached to command
> - Parse input, get object info, print object info
>
> And finally for --buffer mode `remote-object-info`:
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue:
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through object info
> - Call respective function attached to `info`
> - Set batch mode state, use passed in object info, print
> object info
> Else:
> - Call respective function attached to command
> - Set batch mode state, get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then generates multiple `info` commands with the object info passed in.
>
> In order for remote-object-info to avoid remote communication overhead
> in the non-buffer mode, the objects are passed in as such:
Even in non-buffer mode, having separate remote-object-info commands
would result in additional overhead, correct? From my understanding each
command is executed sequentially, so multiple remote-object-info commands
would always result in additional overhead.
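To make the cost concrete, here is a minimal sketch (not code from the
series). It assumes every remote-object-info line costs exactly one
round trip, no matter how many oids it carries. The names
`fake_remote_request()`, `count_oids()` and `run_lines()` are
hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for one network round trip to the remote. */
static int round_trips;

static void fake_remote_request(size_t nr_oids)
{
	(void)nr_oids;	/* one request regardless of how many oids it carries */
	round_trips++;
}

/*
 * Count the space-separated tokens that follow the command name and
 * the remote, e.g. "remote-object-info origin <oid> <oid>" has two oids.
 */
static size_t count_oids(const char *line)
{
	size_t tokens = 0;
	int in_token = 0;

	for (; *line; line++) {
		if (*line == ' ') {
			in_token = 0;
		} else if (!in_token) {
			in_token = 1;
			tokens++;
		}
	}
	return tokens > 2 ? tokens - 2 : 0;
}

/* Run each line in order; every line costs one request. */
static int run_lines(const char **lines, size_t nr)
{
	round_trips = 0;
	for (size_t i = 0; i < nr; i++)
		fake_remote_request(count_oids(lines[i]));
	return round_trips;
}
```

Under that assumption, one line carrying three oids yields one request,
while three single-oid lines yield three, whether or not --buffer is in
use.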
>
> remote-object-info <remote> <oid> <oid> ... <oid>
>
> rather than
>
> remote-object-info <remote> <oid>
> remote-object-info <remote> <oid>
> ...
> remote-object-info <remote> <oid>
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Helped-by: Christian Couder <chriscool@tuxfamily.org>
I think the sign-offs are supposed to go at the bottom.
[snip]
> @@ -526,51 +533,118 @@ static void batch_one_object(const char *obj_name,
> (opt->follow_symlinks ? GET_OID_FOLLOW_SYMLINKS : 0);
> enum get_oid_result result;
>
> - result = get_oid_with_context(the_repository, obj_name,
> - flags, &data->oid, &ctx);
> - if (result != FOUND) {
> - switch (result) {
> - case MISSING_OBJECT:
> - printf("%s missing%c", obj_name, opt->output_delim);
> - break;
> - case SHORT_NAME_AMBIGUOUS:
> - printf("%s ambiguous%c", obj_name, opt->output_delim);
> - break;
> - case DANGLING_SYMLINK:
> - printf("dangling %"PRIuMAX"%c%s%c",
> - (uintmax_t)strlen(obj_name),
> - opt->output_delim, obj_name, opt->output_delim);
> - break;
> - case SYMLINK_LOOP:
> - printf("loop %"PRIuMAX"%c%s%c",
> - (uintmax_t)strlen(obj_name),
> - opt->output_delim, obj_name, opt->output_delim);
> - break;
> - case NOT_DIR:
> - printf("notdir %"PRIuMAX"%c%s%c",
> - (uintmax_t)strlen(obj_name),
> - opt->output_delim, obj_name, opt->output_delim);
> - break;
> - default:
> - BUG("unknown get_sha1_with_context result %d\n",
> - result);
> - break;
> + if (!opt->use_remote_info) {
When using the remote-object-info command, the object in question is
supposed to be on the remote and may not exist locally. Therefore we
skip over `get_oid_with_context()`.
> + result = get_oid_with_context(the_repository, obj_name,
> + flags, &data->oid, &ctx);
> + if (result != FOUND) {
> + switch (result) {
> + case MISSING_OBJECT:
> + printf("%s missing%c", obj_name, opt->output_delim);
> + break;
> + case SHORT_NAME_AMBIGUOUS:
> + printf("%s ambiguous%c", obj_name, opt->output_delim);
> + break;
> + case DANGLING_SYMLINK:
> + printf("dangling %"PRIuMAX"%c%s%c",
> + (uintmax_t)strlen(obj_name),
> + opt->output_delim, obj_name, opt->output_delim);
> + break;
> + case SYMLINK_LOOP:
> + printf("loop %"PRIuMAX"%c%s%c",
> + (uintmax_t)strlen(obj_name),
> + opt->output_delim, obj_name, opt->output_delim);
> + break;
> + case NOT_DIR:
> + printf("notdir %"PRIuMAX"%c%s%c",
> + (uintmax_t)strlen(obj_name),
> + opt->output_delim, obj_name, opt->output_delim);
> + break;
> + default:
> + BUG("unknown get_sha1_with_context result %d\n",
> + result);
> + break;
> + }
> + fflush(stdout);
> + return;
> }
> - fflush(stdout);
> - return;
> - }
>
> - if (ctx.mode == 0) {
> - printf("symlink %"PRIuMAX"%c%s%c",
> - (uintmax_t)ctx.symlink_path.len,
> - opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> - fflush(stdout);
> - return;
> + if (ctx.mode == 0) {
> + printf("symlink %"PRIuMAX"%c%s%c",
> + (uintmax_t)ctx.symlink_path.len,
> + opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> + fflush(stdout);
> + return;
> + }
> }
>
> batch_object_write(obj_name, scratch, opt, data, NULL, 0);
> }
>
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> + int retval = 0;
> + struct remote *remote = NULL;
> + struct object_id oid;
> + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> + static struct transport *gtransport;
> +
> + /*
> + * Change the format to "%(objectname) %(objectsize)" when
> + * remote-object-info command is used. Once we start supporting objecttype
> + * the default format should change to DEFAULT_FORMAT
> + */
> + if (!opt->format) {
> + opt->format = "%(objectname) %(objectsize)";
> + }
We should omit the braces for single-line if statements.
> +
> + remote = remote_get(argv[0]);
> + if (!remote)
> + die(_("must supply valid remote when using remote-object-info"));
> + oid_array_clear(&object_info_oids);
> + for (size_t i = 1; i < argc; i++) {
> + if (get_oid_hex(argv[i], &oid))
> + die(_("Not a valid object name %s"), argv[i]);
> + oid_array_append(&object_info_oids, &oid);
> + }
> +
> + gtransport = transport_get(remote, NULL);
> + if (gtransport->smart_options) {
> + int include_size = 0;
> +
> + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> + gtransport->smart_options->object_info = 1;
> + gtransport->smart_options->object_info_oids = &object_info_oids;
> + /*
> + * 'size' is the only option currently supported.
> + * Other options that are passed in the format will exit with error.
> + */
> + if (strstr(opt->format, "%(objectsize)")) {
> + string_list_append(&object_info_options, "size");
> + include_size = 1;
> + }
> + if (strstr(opt->format, "%(objecttype)")) {
> + die(_("objecttype is currently not supported with remote-object-info"));
> + }
Another single-line if statement above that should omit the braces.
> + if (strstr(opt->format, "%(objectsize:disk)"))
> + die(_("objectsize:disk is currently not supported with remote-object-info"));
> + if (strstr(opt->format, "%(deltabase)"))
> + die(_("deltabase is currently not supported with remote-object-info"));
> + if (object_info_options.nr > 0) {
> + gtransport->smart_options->object_info_options = &object_info_options;
> + for (size_t i = 0; i < object_info_oids.nr; i++) {
> + if (include_size)
> + remote_object_info[i].sizep = xcalloc(1, sizeof(long));
> + }
> + gtransport->smart_options->object_info_data = &remote_object_info;
> + retval = transport_fetch_refs(gtransport, NULL);
> + }
> + } else {
> + retval = -1;
> + }
> +
> + return retval;
> +}
> +
> struct object_cb_data {
> struct batch_options *opt;
> struct expand_data *expand;
> @@ -642,6 +716,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
> struct queued_cmd {
> parse_cmd_fn_t fn;
> char *line;
> + const char *name;
Since special handling is needed for the remote-object-info command, we
record the queued command names to check against later.
> };
>
> static void parse_cmd_contents(struct batch_options *opt,
> @@ -662,6 +737,55 @@ static void parse_cmd_info(struct batch_options *opt,
> batch_one_object(line, output, opt, data);
> }
>
> +static const struct parse_cmd {
> + const char *name;
> + parse_cmd_fn_t fn;
> + unsigned takes_args;
> +} commands[] = {
> + { "contents", parse_cmd_contents, 1 },
> + { "info", parse_cmd_info, 1 },
> + { "remote-object-info", parse_cmd_info, 1 },
> + { "flush", NULL, 0 },
> +};
> +
> +static void parse_remote_info(struct batch_options *opt,
> + char *line,
> + struct strbuf *output,
> + struct expand_data *data,
> + const struct parse_cmd *p_cmd,
> + struct queued_cmd *q_cmd)
It seems a little confusing to me that `parse_remote_info()` accepts
both a `parse_cmd` and `queued_cmd`, but only expects to use one or the
other. It looks like this is done because `dispatch_calls()` already
accepts `queued_cmd`, but now needs to call `parse_remote_info()`.
Since it is only the underlying command function that is needed by
`parse_remote_info()`, it could take that function pointer directly
instead.
> +{
> + int count;
> + const char **argv;
> +
> + count = split_cmdline(line, &argv);
> + if (get_remote_info(opt, count, argv))
> + goto cleanup;
> + opt->use_remote_info = 1;
> + data->skip_object_info = 1;
> + data->mark_query = 0;
> + for (size_t i = 0; i < object_info_oids.nr; i++) {
> + if (remote_object_info[i].sizep)
> + data->size = *remote_object_info[i].sizep;
> + if (remote_object_info[i].typep)
> + data->type = *remote_object_info[i].typep;
> +
> + data->oid = object_info_oids.oid[i];
> + if (p_cmd)
> + p_cmd->fn(opt, argv[i+1], output, data);
> + else
> + q_cmd->fn(opt, argv[i+1], output, data);
> + }
> + opt->use_remote_info = 0;
> + data->skip_object_info = 0;
> + data->mark_query = 1;
> +
> +cleanup:
> + for (size_t i = 0; i < object_info_oids.nr; i++)
> + free_object_info_contents(&remote_object_info[i]);
> + free(remote_object_info);
> +}
> +
> static void dispatch_calls(struct batch_options *opt,
> struct strbuf *output,
> struct expand_data *data,
> @@ -671,8 +795,12 @@ static void dispatch_calls(struct batch_options *opt,
> if (!opt->buffer_output)
> die(_("flush is only for --buffer mode"));
>
> - for (int i = 0; i < nr; i++)
> - cmd[i].fn(opt, cmd[i].line, output, data);
> + for (int i = 0; i < nr; i++) {
> + if (!strcmp(cmd[i].name, "remote-object-info"))
> + parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
If we adapt `parse_remote_info()` to accept the command function we
could pass cmd->fn here instead.
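A rough sketch of that shape, under heavy simplification: the structs
below are trimmed-down, hypothetical stand-ins for the cat-file types
(the real `parse_cmd_fn_t` also takes a `struct strbuf *`), and the
remote lookup is replaced by arrays passed in by the caller. The point
is only that a single `parse_cmd_fn_t` parameter serves both call sites.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the cat-file types. */
struct batch_options { int use_remote_info; };
struct expand_data { unsigned long size; };
typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
			       struct expand_data *);

struct parse_cmd { const char *name; parse_cmd_fn_t fn; };
struct queued_cmd { parse_cmd_fn_t fn; char *line; const char *name; };

static unsigned long total_size;

/* Stand-in for the `info` handler: just accumulates the reported size. */
static void parse_cmd_info(struct batch_options *opt, const char *line,
			   struct expand_data *data)
{
	(void)opt;
	(void)line;
	total_size += data->size;
}

/*
 * Takes the handler directly, so callers holding either a parse_cmd
 * or a queued_cmd pass `->fn` and no NULL placeholder is needed.
 */
static void parse_remote_info(struct batch_options *opt,
			      const char **oids, const unsigned long *sizes,
			      size_t nr, struct expand_data *data,
			      parse_cmd_fn_t fn)
{
	opt->use_remote_info = 1;
	for (size_t i = 0; i < nr; i++) {
		data->size = sizes[i];
		fn(opt, oids[i], data);
	}
	opt->use_remote_info = 0;
}
```

With this signature the buffered path would pass `cmd[i].fn` and the
unbuffered path `cmd->fn`.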
> + else
> + cmd[i].fn(opt, cmd[i].line, output, data);
> + }
>
> fflush(stdout);
> }
> @@ -685,17 +813,6 @@ static void free_cmds(struct queued_cmd *cmd, size_t *nr)
> *nr = 0;
> }
>
> -
> -static const struct parse_cmd {
> - const char *name;
> - parse_cmd_fn_t fn;
> - unsigned takes_args;
> -} commands[] = {
> - { "contents", parse_cmd_contents, 1},
> - { "info", parse_cmd_info, 1},
> - { "flush", NULL, 0},
> -};
> -
> static void batch_objects_command(struct batch_options *opt,
> struct strbuf *output,
> struct expand_data *data)
> @@ -740,11 +857,17 @@ static void batch_objects_command(struct batch_options *opt,
> dispatch_calls(opt, output, data, queued_cmd, nr);
> free_cmds(queued_cmd, &nr);
> } else if (!opt->buffer_output) {
> - cmd->fn(opt, p, output, data);
> + if (!strcmp(cmd->name, "remote-object-info")) {
> + char *line = xstrdup_or_null(p);
> + parse_remote_info(opt, line, output, data, cmd, NULL);
Same here, if we adapt `parse_remote_info()` to accept the command
function we could pass cmd->fn here instead.
> + } else {
> + cmd->fn(opt, p, output, data);
> + }
> } else {
> ALLOC_GROW(queued_cmd, nr + 1, alloc);
> call.fn = cmd->fn;
> call.line = xstrdup_or_null(p);
> + call.name = cmd->name;
> queued_cmd[nr++] = call;
> }
> }
> @@ -761,8 +884,6 @@ static void batch_objects_command(struct batch_options *opt,
> strbuf_release(&input);
> }
>
> -#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> -
> static int batch_objects(struct batch_options *opt)
> {
> struct strbuf input = STRBUF_INIT;
[snip]
* Re: [PATCH 4/6] transport: add client support for object-info
2024-06-28 19:05 ` [PATCH 4/6] transport: add client support for object-info Eric Ju
@ 2024-07-09 7:15 ` Toon claes
2024-07-09 16:37 ` Junio C Hamano
2024-07-13 2:30 ` Peijian Ju
2024-07-10 10:13 ` Karthik Nayak
1 sibling, 2 replies; 174+ messages in thread
From: Toon claes @ 2024-07-09 7:15 UTC (permalink / raw)
To: Eric Ju, git
Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
Eric Ju <eric.peijian@gmail.com> writes:
> diff --git a/transport.c b/transport.c
> index 83ddea8fbc..2847aa3f3c 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -436,11 +504,27 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.server_options = transport->server_options;
> args.negotiation_tips = data->options.negotiation_tips;
> args.reject_shallow_remote = transport->smart_options->reject_shallow;
> -
> - if (!data->finished_handshake) {
> - int i;
> + args.object_info = transport->smart_options->object_info;
> +
> + if (transport->smart_options && transport->smart_options->object_info) {
> + struct ref *ref = object_info_refs;
> +
> + if (!fetch_object_info(transport, data->options.object_info_data))
> + goto cleanup;
> + args.object_info_data = data->options.object_info_data;
> + args.quiet = 1;
> + args.no_progress = 1;
> + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> + struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
> + temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
Any reason why you're not using the subscript operator (square brackets)
like this:
+ temp_ref->old_oid = transport->smart_options->object_info_oids->oid[i];
> + temp_ref->exact_oid = 1;
> + ref->next = temp_ref;
> + ref = ref->next;
> + }
> + transport->remote_refs = object_info_refs->next;
I find it a bit weird you're allocating object_info_refs, only to use it
to point to the next. Can I suggest a little refactor:
----8<-----8<----
diff --git a/transport.c b/transport.c
index 662faa004e..56cb3a1693 100644
--- a/transport.c
+++ b/transport.c
@@ -479,7 +479,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
- struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -509,7 +509,7 @@ static int fetch_refs_via_pack(struct transport *transport,
args.object_info = transport->smart_options->object_info;
if (transport->smart_options && transport->smart_options->object_info) {
- struct ref *ref = object_info_refs;
+ struct ref *ref = object_info_refs = xcalloc(1, sizeof (struct ref));
if (!fetch_object_info(transport, data->options.object_info_data))
goto cleanup;
@@ -517,13 +517,12 @@ static int fetch_refs_via_pack(struct transport *transport,
args.quiet = 1;
args.no_progress = 1;
for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
- struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
- temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
- temp_ref->exact_oid = 1;
- ref->next = temp_ref;
+ ref->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref->exact_oid = 1;
+ ref->next = xcalloc(1, sizeof (struct ref));
ref = ref->next;
}
- transport->remote_refs = object_info_refs->next;
+ transport->remote_refs = object_info_refs;
} else if (!data->finished_handshake) {
int must_list_refs = 0;
for (int i = 0; i < nr_heads; i++) {
@@ -565,7 +564,7 @@ static int fetch_refs_via_pack(struct transport *transport,
data->finished_handshake = 0;
if (args.object_info) {
- struct ref *ref_cpy_reader = object_info_refs->next;
+ struct ref *ref_cpy_reader = object_info_refs;
for (int i = 0; ref_cpy_reader; i++) {
oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &(*args.object_info_data)[i], OBJECT_INFO_LOOKUP_REPLACE);
ref_cpy_reader = ref_cpy_reader->next;
----8<-----8<----
To be honest, I'm not sure it works, because fetch_object_info() always
seems to return a non-zero value. I'm not sure whether this is due to
missing code coverage or a bug. I guess it's worth looking into.
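One more observation on the list construction: as far as I can tell,
the loop in the refactor above allocates `ref->next` on every iteration,
so the list would end with one extra zeroed node. A tail pointer avoids
both the dummy head and the trailing node. The sketch below uses a
hypothetical, trimmed-down `struct ref` (an int stands in for
`struct object_id`) and plain `calloc()` instead of `xcalloc()`. It
illustrates the technique and is not a drop-in patch.

```c
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for git's struct ref. */
struct ref {
	struct ref *next;
	int old_oid;	/* an int stands in for struct object_id */
	unsigned exact_oid : 1;
};

/*
 * Build a list with exactly nr nodes. `tail` always points at the
 * pointer that should receive the next node, so no dummy head is
 * needed and no empty node is left at the end.
 */
static struct ref *refs_from_oids(const int *oids, size_t nr)
{
	struct ref *head = NULL;
	struct ref **tail = &head;

	for (size_t i = 0; i < nr; i++) {
		struct ref *r = calloc(1, sizeof(*r));

		if (!r)
			abort();
		r->old_oid = oids[i];
		r->exact_oid = 1;
		*tail = r;
		tail = &r->next;
	}
	return head;
}

/* Number of nodes reachable from r. */
static size_t count_refs(const struct ref *r)
{
	size_t n = 0;

	for (; r; r = r->next)
		n++;
	return n;
}

/* Free every node in the chain. */
static void free_ref_list(struct ref *r)
{
	while (r) {
		struct ref *next = r->next;

		free(r);
		r = next;
	}
}
```

An empty oid array then produces a NULL list rather than a single
zeroed entry.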
--
Toon
* Re: [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:05 ` [PATCH 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-07-09 1:50 ` Justin Tobler
@ 2024-07-09 7:16 ` Toon claes
2024-07-13 2:35 ` Peijian Ju
2024-07-10 12:08 ` Karthik Nayak
2 siblings, 1 reply; 174+ messages in thread
From: Toon claes @ 2024-07-09 7:16 UTC (permalink / raw)
To: Eric Ju, git
Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai, Eric Ju
Eric Ju <eric.peijian@gmail.com> writes:
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 72a78cdc8c..34958a1747 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> ...
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> + int retval = 0;
> + struct remote *remote = NULL;
> + struct object_id oid;
> + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> + static struct transport *gtransport;
> +
> + /*
> + * Change the format to "%(objectname) %(objectsize)" when
> + * remote-object-info command is used. Once we start supporting objecttype
> + * the default format should change to DEFAULT_FORMAT
> + */
I believe this comment has become outdated, or got moved around
incorrectly.
> diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> new file mode 100755
> index 0000000000..7a7bdfeb91
> --- /dev/null
> +++ b/t/t1017-cat-file-remote-object-info.sh
> ...
> +stop_git_daemon
> +
> +# Test --batch-command remote-object-info with 'http://' transport
> +
> +. "$TEST_DIRECTORY"/lib-httpd.sh
> +start_httpd
start_httpd skips the remainder of the tests if it fails to start the
httpd server. That's why I see various other tests which have this at
the end:
# DO NOT add non-httpd-specific tests here, because the last part of this
# test script is only executed when httpd is available and enabled.
So I would suggest adding this comment as well, and moving the file://
tests above start_httpd.
--
Toon
* Re: [PATCH 4/6] transport: add client support for object-info
2024-07-09 7:15 ` Toon claes
@ 2024-07-09 16:37 ` Junio C Hamano
2024-07-13 2:32 ` Peijian Ju
2024-07-13 2:30 ` Peijian Ju
1 sibling, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2024-07-09 16:37 UTC (permalink / raw)
To: Toon claes
Cc: Eric Ju, git, Christian Couder, Calvin Wan, Jonathan Tan,
John Cai
Toon claes <toon@iotcl.com> writes:
>> + temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
>
> Any reason why you're not using the subscript operator (square brackets)
> like this:
>
> + temp_ref->old_oid = transport->smart_options->object_info_oids->oid[i];
Much nicer, but fold such overly long lines, please,
temp_ref->old_oid = transport->smart_options->
object_info_oids->oid[i];
to make them readable.
> ...
> To be honest, I'm not sure it works, because fetch_object_info() always
> seems to return a non-zero value. I'm not sure this is due to missing
> code coverage, or a bug. I guess it's worth looking into.
* Re: [PATCH 1/6] fetch-pack: refactor packet writing
2024-07-08 15:17 ` Peijian Ju
@ 2024-07-10 9:39 ` Karthik Nayak
2024-07-15 16:40 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Karthik Nayak @ 2024-07-10 9:39 UTC (permalink / raw)
To: Peijian Ju; +Cc: git, Christian Couder, Calvin Wan, Jonathan Tan, John Cai
Peijian Ju <eric.peijian@gmail.com> writes:
[snip]
>> Right, this commit in itself looks good. But I was curious why we need
>> this, so I did a sneak peek into the following commits.
>>
>> To summarize, we want to call:
>> `write_command_and_capabilities(..., "object-info");`
>> in the upcoming patches to get the object-info details from the server.
>> But isn't this function too specific to the "fetch" command to be
>> generalized to be for "object-info" too?
>>
>> Wouldn't it make sense to add a custom function for 'object-info' in
>> 'connect.c'? Like how we currently have `get_remote_bundle_uri()` for
>> 'bundle-uri' and `get_remote_refs` for 'ls-refs'?
>
> Thank you. I am reading through the old comments left by Taylor
> at https://lore.kernel.org/git/YkOPyc9tUfe2Tozx@nand.local/
>
> " Makes obvious sense, and this was something that jumped out to me when I
> looked at the first and second versions of this patch. I'm glad that
> this is getting factored out."
>
>
> It seems refactoring this into a more general function is on purpose.
> It is encouraged to use this general function to request capability
> rather than adding a custom function.
> Taylor’s comment was 2 years ago, but I think refactoring this into a
> more general function to enforce DRY still makes sense.
It would make sense then to move the existing users to also use
`write_command_and_capabilities()` eventually. I guess this could be done
in a follow-up series.
Then I would say `write_command_and_capabilities()` should be moved to
`transport.c`, no?
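For reference, a minimal sketch of what a shared helper along these
lines ends up putting on the wire for object-info. It assumes the usual
pkt-line framing (four hex digits of total length, "0001" as delimiter,
"0000" as flush) and leaves out capability lines such as agent and
object-format. `pkt_append()`, `pkt_special()` and
`write_object_info_request()` are hypothetical names, not git APIs, and
the caller is assumed to supply a large enough buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append one pkt-line to buf: four lowercase hex digits giving the
 * total length (payload plus the 4-byte header), then the payload.
 * Returns the number of bytes appended, or 0 if it would not fit.
 */
static size_t pkt_append(char *buf, size_t cap, size_t len,
			 const char *payload)
{
	size_t n = strlen(payload) + 4;

	if (n > 65520 || len + n + 1 > cap)
		return 0;
	snprintf(buf + len, cap - len, "%04zx%s", n, payload);
	return n;
}

/* Special packets carry no payload: "0000" is flush, "0001" is delim. */
static size_t pkt_special(char *buf, size_t cap, size_t len,
			  const char *marker)
{
	if (len + 4 + 1 > cap)
		return 0;
	memcpy(buf + len, marker, 4);
	buf[len + 4] = '\0';
	return 4;
}

/* Command name, delimiter, then the command-specific arguments. */
static size_t write_object_info_request(char *buf, size_t cap,
					const char **oids, size_t nr)
{
	char line[128];
	size_t len = 0;

	len += pkt_append(buf, cap, len, "command=object-info");
	len += pkt_special(buf, cap, len, "0001");
	len += pkt_append(buf, cap, len, "size");
	for (size_t i = 0; i < nr; i++) {
		snprintf(line, sizeof(line), "oid %s", oids[i]);
		len += pkt_append(buf, cap, len, line);
	}
	len += pkt_special(buf, cap, len, "0000");
	return len;
}
```

Only the first line and the argument section differ between commands,
which is the part a generalized helper would parameterize.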
* Re: [PATCH 4/6] transport: add client support for object-info
2024-06-28 19:05 ` [PATCH 4/6] transport: add client support for object-info Eric Ju
2024-07-09 7:15 ` Toon claes
@ 2024-07-10 10:13 ` Karthik Nayak
2024-07-16 2:39 ` Peijian Ju
1 sibling, 1 reply; 174+ messages in thread
From: Karthik Nayak @ 2024-07-10 10:13 UTC (permalink / raw)
To: Eric Ju, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
Eric Ju <eric.peijian@gmail.com> writes:
> From: Calvin Wan <calvinwan@google.com>
>
> Sometimes it is useful to get information about an object without having
> to download it completely. The server logic has already been implemented
> as “a2ba162cda (object-info: support for retrieving object info,
Nit: s/as/in
> 2021-04-20)”.
>
> Add client functions to communicate with the server.
>
> The client currently supports requesting a list of object ids with
> features 'size' and 'type' from a v2 server. If a server does not
But do we support type? I thought we only added support for 'size'.
> advertise either of the requested features, then the client falls back
> to making the request through 'fetch'.
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Helped-by: Christian Couder <chriscool@tuxfamily.org>
> ---
> fetch-pack.c | 24 +++++++++++
> fetch-pack.h | 10 +++++
> transport-helper.c | 8 +++-
> transport.c | 102 ++++++++++++++++++++++++++++++++++++++++++---
> transport.h | 11 +++++
> 5 files changed, 148 insertions(+), 7 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index da0de9c537..d533cac1d8 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1345,6 +1345,27 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
> packet_buf_delim(req_buf);
> }
>
> +void send_object_info_request(int fd_out, struct object_info_args *args)
> +{
> + struct strbuf req_buf = STRBUF_INIT;
> +
> + write_command_and_capabilities(&req_buf, args->server_options, "object-info");
> +
> + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> + packet_buf_write(&req_buf, "size");
> +
> + if (args->oids) {
> + for (size_t i = 0; i < args->oids->nr; i++)
> + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> + }
> +
> + packet_buf_flush(&req_buf);
> + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> + die_errno(_("unable to write request to remote"));
> +
> + strbuf_release(&req_buf);
> +}
> +
> static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
> struct fetch_pack_args *args,
> const struct ref *wants, struct oidset *common,
> @@ -1682,6 +1703,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> if (args->depth > 0 || args->deepen_since || args->deepen_not)
> args->deepen = 1;
>
> + if (args->object_info)
> + state = FETCH_SEND_REQUEST;
> +
> while (state != FETCH_DONE) {
> switch (state) {
> case FETCH_CHECK_LOCAL:
> diff --git a/fetch-pack.h b/fetch-pack.h
> index 6775d26517..16e4dc0824 100644
> --- a/fetch-pack.h
> +++ b/fetch-pack.h
> @@ -16,6 +16,7 @@ struct fetch_pack_args {
> const struct string_list *deepen_not;
> struct list_objects_filter_options filter_options;
> const struct string_list *server_options;
> + struct object_info **object_info_data;
>
> /*
> * If not NULL, during packfile negotiation, fetch-pack will send "have"
> @@ -42,6 +43,7 @@ struct fetch_pack_args {
> unsigned reject_shallow_remote:1;
> unsigned deepen:1;
> unsigned refetch:1;
> + unsigned object_info:1;
>
> /*
> * Indicate that the remote of this request is a promisor remote. The
> @@ -68,6 +70,12 @@ struct fetch_pack_args {
> unsigned connectivity_checked:1;
> };
>
> +struct object_info_args {
> + struct string_list *object_info_options;
> + const struct string_list *server_options;
> + struct oid_array *oids;
> +};
> +
> /*
> * sought represents remote references that should be updated from.
> * On return, the names that were found on the remote will have been
> @@ -101,4 +109,6 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
> */
> int report_unmatched_refs(struct ref **sought, int nr_sought);
>
> +void send_object_info_request(int fd_out, struct object_info_args *args);
> +
> #endif
> diff --git a/transport-helper.c b/transport-helper.c
> index 9820947ab2..670d1e7068 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -697,13 +697,17 @@ static int fetch_refs(struct transport *transport,
>
> /*
> * If we reach here, then the server, the client, and/or the transport
> - * helper does not support protocol v2. --negotiate-only requires
> - * protocol v2.
> + * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
> + * require protocol v2.
> */
> if (data->transport_options.acked_commits) {
> warning(_("--negotiate-only requires protocol v2"));
> return -1;
> }
> + if (transport->smart_options->object_info) {
> + // fail the command explicitly to avoid further commands input
> + die(_("remote-object-info requires protocol v2"));
> + }
>
> if (!data->get_refs_list_called)
> get_refs_list_using_list(transport, 0);
> diff --git a/transport.c b/transport.c
> index 83ddea8fbc..2847aa3f3c 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -363,6 +363,73 @@ static struct ref *handshake(struct transport *transport, int for_push,
> return refs;
> }
>
> +static int fetch_object_info(struct transport *transport, struct object_info **object_info_data)
> +{
> + int size_index = -1;
> + struct git_transport_data *data = transport->data;
> + struct object_info_args args;
> + struct packet_reader reader;
> +
> + memset(&args, 0, sizeof(args));
Nit: we could use `struct object_info_args args = { 0 };` above instead.
> + args.server_options = transport->server_options;
> + args.object_info_options = transport->smart_options->object_info_options;
> + args.oids = transport->smart_options->object_info_oids;
> +
> + connect_setup(transport, 0);
> + packet_reader_init(&reader, data->fd[0], NULL, 0,
> + PACKET_READ_CHOMP_NEWLINE |
> + PACKET_READ_GENTLE_ON_EOF |
> + PACKET_READ_DIE_ON_ERR_PACKET);
> + data->version = discover_version(&reader);
> +
> + transport->hash_algo = reader.hash_algo;
> +
> + switch (data->version) {
> + case protocol_v2:
> + if (!server_supports_v2("object-info"))
> + return -1;
> + if (unsorted_string_list_has_string(args.object_info_options, "size")
> + && !server_supports_feature("object-info", "size", 0)) {
> + return -1;
> + }
> + send_object_info_request(data->fd[1], &args);
> + break;
> + case protocol_v1:
> + case protocol_v0:
> + die(_("wrong protocol version. expected v2"));
> + case protocol_unknown_version:
> + BUG("unknown protocol version");
> + }
> +
> + for (size_t i = 0; i < args.object_info_options->nr; i++) {
> + if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
> + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> + return -1;
> + }
> + if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
> + if (!strcmp(reader.line, "size"))
> + size_index = i;
> + continue;
> + }
> + return -1;
> + }
> +
> + for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
> + struct string_list object_info_values = STRING_LIST_INIT_DUP;
We need to also call `string_list_clear()` at the end of this block.
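The pattern in question, sketched without git's string_list: whatever
is allocated while splitting one response line has to be released
before the next iteration, on the error path too. This assumes a
response line of the form "<oid> <size>", with an empty size field for
an object the server does not have. `parse_size_line()` is a
hypothetical helper, not part of the series.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "<oid> <size>" response line. Returns 0 and stores the
 * size on success, -1 if the size field is missing or empty.
 */
static int parse_size_line(const char *line, unsigned long *size)
{
	char *copy = malloc(strlen(line) + 1);
	char *sep;
	int ret = -1;

	if (!copy)
		return -1;
	strcpy(copy, line);
	sep = strchr(copy, ' ');
	if (sep && sep[1]) {
		*size = strtoul(sep + 1, NULL, 10);
		ret = 0;
	}
	free(copy);	/* released on every path, including the error path */
	return ret;
}
```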
> +
> + string_list_split(&object_info_values, reader.line, ' ', -1);
> + if (0 <= size_index) {
> + if (!strcmp(object_info_values.items[1 + size_index].string, ""))
> + die("object-info: not our ref %s",
> + object_info_values.items[0].string);
> + *(*object_info_data)[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
Perhaps `*object_info_data[i]->sizep =
strtoul(object_info_values.items[1 + size_index].string, NULL, 10);`?
So, this is allocated in 'cat-file' and set here? Wouldn't it be nicer
to also do the alloc here?
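A note on the two spellings of the assignment: they do not appear to be
interchangeable. `*(*p)[i].sizep` dereferences `p` first and then
indexes the array it points to, whereas `*p[i]->sizep` parses as
`*((p[i])->sizep)` and indexes `p` itself, which for i > 0 would read
past the single pointer that the caller passes by address. A small
sketch with a hypothetical, trimmed-down `struct object_info`:

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for git's struct object_info. */
struct object_info {
	unsigned long *sizep;
};

/*
 * `data` points at a single pointer, which in turn points at the first
 * element of an array of structs. Dereference first, then index.
 */
static void set_size(struct object_info **data, size_t i,
		     unsigned long size)
{
	*(*data)[i].sizep = size;
}
```

A caller holding `struct object_info *array` would pass `&array`, which
is how the cat-file side hands over `&remote_object_info`.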
> + }
> + }
> + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> +
> + return 0;
> +}
> +
> static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> struct transport_ls_refs_options *options)
> {
> @@ -410,6 +477,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> struct ref *refs = NULL;
> struct fetch_pack_args args;
> struct ref *refs_tmp = NULL;
> + struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
>
> memset(&args, 0, sizeof(args));
> args.uploadpack = data->options.uploadpack;
> @@ -436,11 +504,27 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.server_options = transport->server_options;
> args.negotiation_tips = data->options.negotiation_tips;
> args.reject_shallow_remote = transport->smart_options->reject_shallow;
> -
> - if (!data->finished_handshake) {
> - int i;
> + args.object_info = transport->smart_options->object_info;
> +
> + if (transport->smart_options && transport->smart_options->object_info) {
> + struct ref *ref = object_info_refs;
> +
> + if (!fetch_object_info(transport, data->options.object_info_data))
> + goto cleanup;
> + args.object_info_data = data->options.object_info_data;
> + args.quiet = 1;
> + args.no_progress = 1;
> + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> + struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
> + temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
> + temp_ref->exact_oid = 1;
> + ref->next = temp_ref;
> + ref = ref->next;
> + }
> + transport->remote_refs = object_info_refs->next;
> + } else if (!data->finished_handshake) {
> int must_list_refs = 0;
> - for (i = 0; i < nr_heads; i++) {
> + for (int i = 0; i < nr_heads; i++) {
> if (!to_fetch[i]->exact_oid) {
> must_list_refs = 1;
> break;
> @@ -478,11 +562,18 @@ static int fetch_refs_via_pack(struct transport *transport,
> &transport->pack_lockfiles, data->version);
>
> data->finished_handshake = 0;
> + if (args.object_info) {
> + struct ref *ref_cpy_reader = object_info_refs->next;
> + for (int i = 0; ref_cpy_reader; i++) {
> + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &(*args.object_info_data)[i], OBJECT_INFO_LOOKUP_REPLACE);
> + ref_cpy_reader = ref_cpy_reader->next;
> + }
> + }
> data->options.self_contained_and_connected =
> args.self_contained_and_connected;
> data->options.connectivity_checked = args.connectivity_checked;
>
> - if (!refs)
> + if (!refs && !args.object_info)
> ret = -1;
> if (report_unmatched_refs(to_fetch, nr_heads))
> ret = -1;
> @@ -498,6 +589,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> free_refs(refs_tmp);
> free_refs(refs);
> list_objects_filter_release(&args.filter_options);
> + free_refs(object_info_refs);
Shouldn't we loop through `object_info_refs->next` and free all of them?
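For reference, here is a minimal standalone sketch of the kind of walk I mean. `struct node`, `list_append()` and `list_free()` are made-up stand-ins, not git's `struct ref` or `free_refs()`; the only point is that a list built behind a dummy head must be followed through `->next` so that every node, including the head, is released. As far as I can tell, `free_refs()` already performs this kind of walk, in which case the single call is enough.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a singly linked ref list; not git's struct ref. */
struct node {
	int value;
	struct node *next;
};

/* Append a new node after `tail` and return it (the new tail). */
static struct node *list_append(struct node *tail, int value)
{
	struct node *n = calloc(1, sizeof(*n));

	assert(n != NULL);
	n->value = value;
	tail->next = n;
	return n;
}

/* Free every node reachable from `head`; returns how many were freed. */
static size_t list_free(struct node *head)
{
	size_t freed = 0;

	while (head) {
		struct node *next = head->next;

		free(head);
		head = next;
		freed++;
	}
	return freed;
}
```

The next pointer is saved before each `free()`, so the walk never reads from memory it has already released.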
> return ret;
> }
>
> diff --git a/transport.h b/transport.h
> index 6393cd9823..5a3cda1860 100644
> --- a/transport.h
> +++ b/transport.h
> @@ -5,6 +5,7 @@
> #include "remote.h"
> #include "list-objects-filter-options.h"
> #include "string-list.h"
> +#include "object-store.h"
>
> struct git_transport_options {
> unsigned thin : 1;
> @@ -30,6 +31,12 @@ struct git_transport_options {
> */
> unsigned connectivity_checked:1;
>
> + /*
> + * Transport will attempt to pull only object-info. Fallbacks
> + * to pulling entire object if object-info is not supported.
> + */
> + unsigned object_info : 1;
> +
> int depth;
> const char *deepen_since;
> const struct string_list *deepen_not;
> @@ -53,6 +60,10 @@ struct git_transport_options {
> * common commits to this oidset instead of fetching any packfiles.
> */
> struct oidset *acked_commits;
> +
> + struct oid_array *object_info_oids;
> + struct object_info **object_info_data;
> + struct string_list *object_info_options;
> };
>
> enum transport_family {
> --
> 2.45.2
I'm wondering if we can add tests at this stage.
* Re: [PATCH 5/6] cat-file: add declaration of variable i inside its for loop
2024-06-28 19:05 ` [PATCH 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-07-10 10:16 ` Karthik Nayak
2024-07-16 2:59 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Karthik Nayak @ 2024-07-10 10:16 UTC (permalink / raw)
To: Eric Ju, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
Eric Ju <eric.peijian@gmail.com> writes:
> Some code declares variable i and only uses it
> in a for loop, not in any other logic outside the loop.
>
> Change the declaration of i to be inside the for loop for readability.
>
If we're doing this anyways, we could replace the 'int' with 'size_t'
too.
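To illustrate with a standalone sketch (`count_set()` and its arguments are made up for the example, not taken from the patch): when the bound is itself an unsigned count, declaring the index as `size_t` inside the `for` keeps both sides of the comparison the same type. If the bound is a signed `int`, its type would have to change as well, otherwise the comparison mixes signed and unsigned values.

```c
#include <assert.h>
#include <stddef.h>

/* Count the non-zero flags in an array of `nr` entries. */
static size_t count_set(const int *flags, size_t nr)
{
	size_t count = 0;

	assert(flags != NULL || nr == 0);
	/* The index lives only inside the loop and matches the type of `nr`. */
	for (size_t i = 0; i < nr; i++)
		if (flags[i])
			count++;
	return count;
}
```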
[snip]
* Re: [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:05 ` [PATCH 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-07-09 1:50 ` Justin Tobler
2024-07-09 7:16 ` Toon claes
@ 2024-07-10 12:08 ` Karthik Nayak
2024-07-17 2:38 ` Peijian Ju
2 siblings, 1 reply; 174+ messages in thread
From: Karthik Nayak @ 2024-07-10 12:08 UTC (permalink / raw)
To: Eric Ju, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
Eric Ju <eric.peijian@gmail.com> writes:
> From: Calvin Wan <calvinwan@google.com>
>
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
> Add `remote-object-info` to cat-file --batch-command.
>
> While `info` takes object ids one at a time, this creates overhead when
> making requests to a server so `remote-object-info` instead can take
> multiple object ids at once.
>
> cat-file --batch-command is generally implemented in the following
> manner:
>
> - Receive and parse input from user
So this refers to input delimited by newline or '\0'.
> - Call respective function attached to command
> - Set batch mode state, get object info, print object info
>
Doesn't the batch mode get set before the input parsing begins?
> In --buffer mode, this changes to:
>
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue
> - Call respective function attached to command
> - Set batch mode state, get object info, print object info
>
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server. Therefore, `remote-object-info` is implemented in
> the following manner:
>
> - Receive and parse input from user
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through object info
> - Call respective function attached to `info`
> - Set batch mode state, use passed in object info, print object
> info
> Else:
> - Call respective function attached to command
> - Parse input, get object info, print object info
>
So this is because we want 'remote-object-info' to also use
'parse_cmd_info', similar to 'info'. But I don't understand why,
especially since 'parse_cmd_info' calls 'batch_one_object', and we skip
most of that code for 'remote-object-info'.
Wouldn't it be cleaner to just define our own 'batch_remote_object' and
create 'parse_cmd_remote_info'?
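Roughly what I have in mind, as a standalone sketch rather than a patch against builtin/cat-file.c. Every name below other than the command strings is a simplified stand-in (the real `parse_cmd_fn_t` takes more arguments), and `parse_cmd_remote_info` is the hypothetical new handler. With its own entry in the table, neither the direct path nor the `--buffer` path needs a `strcmp()` on the command name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified handler signature; records which code path ran. */
typedef void (*parse_cmd_fn_t)(const char *line, const char **ran);

static void parse_cmd_info(const char *line, const char **ran)
{
	(void)line;
	*ran = "local";
}

/* Hypothetical dedicated handler for remote-object-info. */
static void parse_cmd_remote_info(const char *line, const char **ran)
{
	(void)line;
	*ran = "remote";
}

static const struct parse_cmd {
	const char *name;
	parse_cmd_fn_t fn;
} commands[] = {
	{ "info", parse_cmd_info },
	{ "remote-object-info", parse_cmd_remote_info },
};

/* Look up the handler for the first word of `line`; NULL if unknown. */
static parse_cmd_fn_t find_cmd(const char *line)
{
	assert(line != NULL);
	for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
		size_t len = strlen(commands[i].name);

		/* Match the whole first word, not just a prefix of it. */
		if (!strncmp(line, commands[i].name, len) &&
		    (line[len] == ' ' || line[len] == '\0'))
			return commands[i].fn;
	}
	return NULL;
}
```

The queue used in `--buffer` mode could then store the looked-up function pointer as it does today, without also carrying the command name.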
> And finally for --buffer mode `remote-object-info`:
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue:
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through object info
> - Call respective function attached to `info`
> - Set batch mode state, use passed in object info, print
> object info
> Else:
> - Call respective function attached to command
> - Set batch mode state, get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then generates multiple `info` commands with the object info passed in.
>
> In order for remote-object-info to avoid remote communication overhead
> in the non-buffer mode, the objects are passed in as such:
>
> remote-object-info <remote> <oid> <oid> ... <oid>
>
> rather than
>
> remote-object-info <remote> <oid>
> remote-object-info <remote> <oid>
> ...
> remote-object-info <remote> <oid>
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Helped-by: Christian Couder <chriscool@tuxfamily.org>
> ---
> Documentation/git-cat-file.txt | 22 +-
> builtin/cat-file.c | 231 ++++++++++----
> object-file.c | 11 +
> object-store-ll.h | 3 +
> t/t1017-cat-file-remote-object-info.sh | 412 +++++++++++++++++++++++++
> 5 files changed, 620 insertions(+), 59 deletions(-)
> create mode 100755 t/t1017-cat-file-remote-object-info.sh
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index bd95a6c10a..ab0647bb39 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -149,6 +149,12 @@ info <object>::
> Print object info for object reference `<object>`. This corresponds to the
> output of `--batch-check`.
>
> +remote-object-info <remote> <object>...::
> + Print object info for object references `<object>` at specified <remote> without
> + downloading object from remote.
> + Error when no object references is provided.
> + This command may be combined with `--buffer`.
> +
> flush::
> Used with `--buffer` to execute all preceding commands that were issued
> since the beginning or since the last flush was issued. When `--buffer`
> @@ -290,7 +296,8 @@ newline. The available atoms are:
> The full hex representation of the object name.
>
> `objecttype`::
> - The type of the object (the same as `cat-file -t` reports).
> + The type of the object (the same as `cat-file -t` reports). See
> + `CAVEATS` below. Not supported by `remote-object-info`.
>
> `objectsize`::
> The size, in bytes, of the object (the same as `cat-file -s`
> @@ -298,13 +305,14 @@ newline. The available atoms are:
>
> `objectsize:disk`::
> The size, in bytes, that the object takes up on disk. See the
> - note about on-disk sizes in the `CAVEATS` section below.
> + note about on-disk sizes in the `CAVEATS` section below. Not
> + supported by `remote-object-info`.
>
> `deltabase`::
> If the object is stored as a delta on-disk, this expands to the
> full hex representation of the delta base object name.
> Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> - below.
> + below. Not supported by `remote-object-info`.
>
> `rest`::
> If this atom is used in the output string, input lines are split
> @@ -314,7 +322,9 @@ newline. The available atoms are:
> line) are output in place of the `%(rest)` atom.
>
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except remote-object-info command who uses
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +When "%(objecttype)" is supported, default format should be unified.
>
> If `--batch` is specified, or if `--batch-command` is used with the `contents`
> command, the object information is followed by the object contents (consisting
> @@ -396,6 +406,10 @@ scripting purposes.
> CAVEATS
> -------
>
> +Note that since objecttype, objectsize:disk and deltabase are currently not supported by the
> +remote-object-info, git will error and exit when they are in the format string.
> +
> +
> Note that the sizes of objects on disk are reported accurately, but care
> should be taken in drawing conclusions about which refs or objects are
> responsible for disk usage. The size of a packed non-delta object may be
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 72a78cdc8c..34958a1747 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -24,6 +24,9 @@
> #include "promisor-remote.h"
> #include "mailmap.h"
> #include "write-or-die.h"
> +#include "alias.h"
> +#include "remote.h"
> +#include "transport.h"
>
> enum batch_mode {
> BATCH_MODE_CONTENTS,
> @@ -42,9 +45,14 @@ struct batch_options {
> char input_delim;
> char output_delim;
> const char *format;
> + int use_remote_info;
> };
>
> +#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> +
> static const char *force_path;
> +static struct object_info *remote_object_info;
> +static struct oid_array object_info_oids = OID_ARRAY_INIT;
>
> static struct string_list mailmap = STRING_LIST_INIT_NODUP;
> static int use_mailmap;
> @@ -508,7 +516,6 @@ static void batch_object_write(const char *obj_name,
> }
>
> batch_write(opt, scratch->buf, scratch->len);
> -
Nit: why remove this?
> if (opt->batch_mode == BATCH_MODE_CONTENTS) {
> print_object_or_die(opt, data);
> batch_write(opt, &opt->output_delim, 1);
> @@ -526,51 +533,118 @@ static void batch_one_object(const char *obj_name,
> (opt->follow_symlinks ? GET_OID_FOLLOW_SYMLINKS : 0);
> enum get_oid_result result;
>
> - result = get_oid_with_context(the_repository, obj_name,
> - flags, &data->oid, &ctx);
> - if (result != FOUND) {
> - switch (result) {
> - case MISSING_OBJECT:
> - printf("%s missing%c", obj_name, opt->output_delim);
> - break;
> - case SHORT_NAME_AMBIGUOUS:
> - printf("%s ambiguous%c", obj_name, opt->output_delim);
> - break;
> - case DANGLING_SYMLINK:
> - printf("dangling %"PRIuMAX"%c%s%c",
> - (uintmax_t)strlen(obj_name),
> - opt->output_delim, obj_name, opt->output_delim);
> - break;
> - case SYMLINK_LOOP:
> - printf("loop %"PRIuMAX"%c%s%c",
> - (uintmax_t)strlen(obj_name),
> - opt->output_delim, obj_name, opt->output_delim);
> - break;
> - case NOT_DIR:
> - printf("notdir %"PRIuMAX"%c%s%c",
> - (uintmax_t)strlen(obj_name),
> - opt->output_delim, obj_name, opt->output_delim);
> - break;
> - default:
> - BUG("unknown get_sha1_with_context result %d\n",
> - result);
> - break;
> + if (!opt->use_remote_info) {
> + result = get_oid_with_context(the_repository, obj_name,
> + flags, &data->oid, &ctx);
> + if (result != FOUND) {
> + switch (result) {
> + case MISSING_OBJECT:
> + printf("%s missing%c", obj_name, opt->output_delim);
> + break;
> + case SHORT_NAME_AMBIGUOUS:
> + printf("%s ambiguous%c", obj_name, opt->output_delim);
> + break;
> + case DANGLING_SYMLINK:
> + printf("dangling %"PRIuMAX"%c%s%c",
> + (uintmax_t)strlen(obj_name),
> + opt->output_delim, obj_name, opt->output_delim);
> + break;
> + case SYMLINK_LOOP:
> + printf("loop %"PRIuMAX"%c%s%c",
> + (uintmax_t)strlen(obj_name),
> + opt->output_delim, obj_name, opt->output_delim);
> + break;
> + case NOT_DIR:
> + printf("notdir %"PRIuMAX"%c%s%c",
> + (uintmax_t)strlen(obj_name),
> + opt->output_delim, obj_name, opt->output_delim);
> + break;
> + default:
> + BUG("unknown get_sha1_with_context result %d\n",
> + result);
> + break;
> + }
> + fflush(stdout);
> + return;
> }
> - fflush(stdout);
> - return;
> - }
>
> - if (ctx.mode == 0) {
> - printf("symlink %"PRIuMAX"%c%s%c",
> - (uintmax_t)ctx.symlink_path.len,
> - opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> - fflush(stdout);
> - return;
> + if (ctx.mode == 0) {
> + printf("symlink %"PRIuMAX"%c%s%c",
> + (uintmax_t)ctx.symlink_path.len,
> + opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> + fflush(stdout);
> + return;
> + }
> }
>
> batch_object_write(obj_name, scratch, opt, data, NULL, 0);
> }
>
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> + int retval = 0;
> + struct remote *remote = NULL;
We need to call `remote_clear()` on this at the end.
> + struct object_id oid;
> + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
This needs to be cleared.
> + static struct transport *gtransport;
Shouldn't we call `transport_disconnect(gtransport);`?
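Put together, the cleanup remarks above amount to giving the function a single exit path. A standalone sketch of that shape follows; `struct resource`, `acquire()` and `release()` are placeholders I made up, standing in for whatever the real release calls for the option list and the transport turn out to be.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder for something that must be released (a list, a transport). */
struct resource {
	int *live; /* counter of currently acquired resources */
};

static struct resource *acquire(int *live)
{
	struct resource *r = malloc(sizeof(*r));

	assert(r != NULL);
	r->live = live;
	(*live)++;
	return r;
}

/* Releasing NULL is a no-op, so cleanup can run unconditionally. */
static void release(struct resource *r)
{
	if (!r)
		return;
	(*r->live)--;
	free(r);
}

/* Returns 0 on success, -1 on failure; releases everything either way. */
static int do_request(int fail_early, int *live)
{
	int ret = -1;
	struct resource *options = NULL;
	struct resource *transport = NULL;

	options = acquire(live);
	if (fail_early)
		goto cleanup;
	transport = acquire(live);
	ret = 0;

cleanup:
	release(transport);
	release(options);
	return ret;
}
```

Initializing every pointer to NULL up front is what lets the early-failure path jump to the same label as the success path.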
> + /*
> + * Change the format to "%(objectname) %(objectsize)" when
> + * remote-object-info command is used. Once we start supporting objecttype
> + * the default format should change to DEFAULT_FORMAT
> + */
> + if (!opt->format) {
> + opt->format = "%(objectname) %(objectsize)";
> + }
> +
> + remote = remote_get(argv[0]);
> + if (!remote)
> + die(_("must supply valid remote when using remote-object-info"));
> + oid_array_clear(&object_info_oids);
> + for (size_t i = 1; i < argc; i++) {
> + if (get_oid_hex(argv[i], &oid))
> + die(_("Not a valid object name %s"), argv[i]);
> + oid_array_append(&object_info_oids, &oid);
> + }
> +
> + gtransport = transport_get(remote, NULL);
> + if (gtransport->smart_options) {
> + int include_size = 0;
> +
> + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> + gtransport->smart_options->object_info = 1;
> + gtransport->smart_options->object_info_oids = &object_info_oids;
> + /*
> + * 'size' is the only option currently supported.
> + * Other options that are passed in the format will exit with error.
> + */
> + if (strstr(opt->format, "%(objectsize)")) {
> + string_list_append(&object_info_options, "size");
> + include_size = 1;
> + }
> + if (strstr(opt->format, "%(objecttype)")) {
> + die(_("objecttype is currently not supported with remote-object-info"));
> + }
> + if (strstr(opt->format, "%(objectsize:disk)"))
> + die(_("objectsize:disk is currently not supported with remote-object-info"));
> + if (strstr(opt->format, "%(deltabase)"))
> + die(_("deltabase is currently not supported with remote-object-info"));
>
This whole block could be replaced by an else:
	if (strstr(opt->format, "%(objectsize)")) {
		string_list_append(&object_info_options, "size");
		include_size = 1;
	} else {
		die(_("%s is currently not supported with remote-object-info"), opt->format);
	}
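A slightly stricter variant, as a standalone sketch: instead of one `strstr()` per atom, walk the format once and reject the first atom that is not on an allow-list. `first_unsupported_atom()`, `is_supported()` and the `supported` table are hypothetical names for this illustration, and the allow-list is my assumption, mirroring what the commit message says works today (`objectname` and `objectsize`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Atoms assumed to be usable with remote-object-info for this sketch. */
static const char *const supported[] = { "objectname", "objectsize" };

static int is_supported(const char *atom, size_t len)
{
	for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (strlen(supported[i]) == len && !strncmp(atom, supported[i], len))
			return 1;
	return 0;
}

/*
 * Scan `format` for "%(atom)" placeholders. Return a pointer to the first
 * atom name that is not supported and store its length in `*len`, or
 * return NULL if every atom is supported.
 */
static const char *first_unsupported_atom(const char *format, size_t *len)
{
	const char *p = format;

	assert(format != NULL && len != NULL);
	while ((p = strstr(p, "%(")) != NULL) {
		const char *atom = p + 2;
		const char *end = strchr(atom, ')');

		if (!end)
			break; /* unterminated placeholder: nothing more to check */
		if (!is_supported(atom, (size_t)(end - atom))) {
			*len = (size_t)(end - atom);
			return atom;
		}
		p = end + 1;
	}
	return NULL;
}
```

The caller could then report just the offending atom, for example with a `%.*s` conversion, rather than echoing the whole format string.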
> + if (object_info_options.nr > 0) {
> + gtransport->smart_options->object_info_options = &object_info_options;
> + for (size_t i = 0; i < object_info_oids.nr; i++) {
> + if (include_size)
> + remote_object_info[i].sizep = xcalloc(1, sizeof(long));
> + }
> + gtransport->smart_options->object_info_data = &remote_object_info;
> + retval = transport_fetch_refs(gtransport, NULL);
> + }
> + } else {
> + retval = -1;
> + }
> +
> + return retval;
> +}
> +
> struct object_cb_data {
> struct batch_options *opt;
> struct expand_data *expand;
> @@ -642,6 +716,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
> struct queued_cmd {
> parse_cmd_fn_t fn;
> char *line;
> + const char *name;
> };
>
> static void parse_cmd_contents(struct batch_options *opt,
> @@ -662,6 +737,55 @@ static void parse_cmd_info(struct batch_options *opt,
> batch_one_object(line, output, opt, data);
> }
>
> +static const struct parse_cmd {
> + const char *name;
> + parse_cmd_fn_t fn;
> + unsigned takes_args;
> +} commands[] = {
> + { "contents", parse_cmd_contents, 1 },
> + { "info", parse_cmd_info, 1 },
> + { "remote-object-info", parse_cmd_info, 1 },
> + { "flush", NULL, 0 },
> +};
> +
> +static void parse_remote_info(struct batch_options *opt,
> + char *line,
> + struct strbuf *output,
> + struct expand_data *data,
> + const struct parse_cmd *p_cmd,
> + struct queued_cmd *q_cmd)
> +{
> + int count;
> + const char **argv;
> +
> + count = split_cmdline(line, &argv);
> + if (get_remote_info(opt, count, argv))
> + goto cleanup;
> + opt->use_remote_info = 1;
> + data->skip_object_info = 1;
> + data->mark_query = 0;
> + for (size_t i = 0; i < object_info_oids.nr; i++) {
> + if (remote_object_info[i].sizep)
> + data->size = *remote_object_info[i].sizep;
> + if (remote_object_info[i].typep)
> + data->type = *remote_object_info[i].typep;
> +
We don't even set the type, so this shouldn't ever be possible, right?
> + data->oid = object_info_oids.oid[i];
> + if (p_cmd)
> + p_cmd->fn(opt, argv[i+1], output, data);
> + else
> + q_cmd->fn(opt, argv[i+1], output, data);
> + }
> + opt->use_remote_info = 0;
> + data->skip_object_info = 0;
> + data->mark_query = 1;
> +
> +cleanup:
> + for (size_t i = 0; i < object_info_oids.nr; i++)
> + free_object_info_contents(&remote_object_info[i]);
> + free(remote_object_info);
argv needs to be free'd too.
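To spell out the ownership, here is a toy splitter (not git's `split_cmdline()`; `split_words()` is a made-up, much simpler stand-in) that follows the same convention as far as I understand it: the words point into the caller's buffer, while the pointer array itself is a separate allocation that the caller has to `free()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split `line` in place on spaces. The returned words point into `line`;
 * only the NULL-terminated array stored in `*argv` is newly allocated.
 * Returns the number of words.
 */
static int split_words(char *line, char ***argv)
{
	int count = 0;
	size_t alloc = 4;
	char **out = malloc(alloc * sizeof(*out));
	char *p = line;

	assert(line != NULL && argv != NULL && out != NULL);
	while (*p) {
		/* Overwrite separators so each word becomes its own string. */
		while (*p == ' ')
			*p++ = '\0';
		if (!*p)
			break;
		/* Keep room for this word plus the terminating NULL. */
		if ((size_t)count + 1 >= alloc) {
			alloc *= 2;
			out = realloc(out, alloc * sizeof(*out));
			assert(out != NULL);
		}
		out[count++] = p;
		while (*p && *p != ' ')
			p++;
	}
	out[count] = NULL;
	*argv = out;
	return count;
}
```

Under this convention the caller releases the array once it is done with the words, and releases the line buffer separately if that was heap-allocated too.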
> +}
> +
> static void dispatch_calls(struct batch_options *opt,
> struct strbuf *output,
> struct expand_data *data,
> @@ -671,8 +795,12 @@ static void dispatch_calls(struct batch_options *opt,
> if (!opt->buffer_output)
> die(_("flush is only for --buffer mode"));
>
> - for (int i = 0; i < nr; i++)
> - cmd[i].fn(opt, cmd[i].line, output, data);
> + for (int i = 0; i < nr; i++) {
> + if (!strcmp(cmd[i].name, "remote-object-info"))
> + parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
> + else
> + cmd[i].fn(opt, cmd[i].line, output, data);
> + }
>
> fflush(stdout);
> }
> @@ -685,17 +813,6 @@ static void free_cmds(struct queued_cmd *cmd, size_t *nr)
> *nr = 0;
> }
>
> -
> -static const struct parse_cmd {
> - const char *name;
> - parse_cmd_fn_t fn;
> - unsigned takes_args;
> -} commands[] = {
> - { "contents", parse_cmd_contents, 1},
> - { "info", parse_cmd_info, 1},
> - { "flush", NULL, 0},
> -};
> -
> static void batch_objects_command(struct batch_options *opt,
> struct strbuf *output,
> struct expand_data *data)
> @@ -740,11 +857,17 @@ static void batch_objects_command(struct batch_options *opt,
> dispatch_calls(opt, output, data, queued_cmd, nr);
> free_cmds(queued_cmd, &nr);
> } else if (!opt->buffer_output) {
> - cmd->fn(opt, p, output, data);
> + if (!strcmp(cmd->name, "remote-object-info")) {
> + char *line = xstrdup_or_null(p);
This needs to be free'd.
> + parse_remote_info(opt, line, output, data, cmd, NULL);
> + } else {
> + cmd->fn(opt, p, output, data);
> + }
> } else {
> ALLOC_GROW(queued_cmd, nr + 1, alloc);
> call.fn = cmd->fn;
> call.line = xstrdup_or_null(p);
> + call.name = cmd->name;
> queued_cmd[nr++] = call;
> }
> }
> @@ -761,8 +884,6 @@ static void batch_objects_command(struct batch_options *opt,
> strbuf_release(&input);
> }
>
> -#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> -
> static int batch_objects(struct batch_options *opt)
> {
> struct strbuf input = STRBUF_INIT;
> diff --git a/object-file.c b/object-file.c
> index d3cf4b8b2e..6aaa167942 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -2988,3 +2988,14 @@ int read_loose_object(const char *path,
> munmap(map, mapsize);
> return ret;
> }
> +
> +void free_object_info_contents(struct object_info *object_info)
> +{
> + if (!object_info)
> + return;
> + free(object_info->typep);
> + free(object_info->sizep);
> + free(object_info->disk_sizep);
> + free(object_info->delta_base_oid);
> + free(object_info->type_name);
> +}
> diff --git a/object-store-ll.h b/object-store-ll.h
> index c5f2bb2fc2..333e19cd1e 100644
> --- a/object-store-ll.h
> +++ b/object-store-ll.h
> @@ -533,4 +533,7 @@ int for_each_object_in_pack(struct packed_git *p,
> int for_each_packed_object(each_packed_object_fn, void *,
> enum for_each_object_flags flags);
>
> +/* Free pointers inside of object_info, but not object_info itself */
> +void free_object_info_contents(struct object_info *object_info);
> +
> #endif /* OBJECT_STORE_LL_H */
> diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> new file mode 100755
> index 0000000000..7a7bdfeb91
> --- /dev/null
> +++ b/t/t1017-cat-file-remote-object-info.sh
> @@ -0,0 +1,412 @@
> +#!/bin/sh
> +
> +test_description='git cat-file --batch-command with remote-object-info command'
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +
> +echo_without_newline () {
> + printf '%s' "$*"
> +}
> +
> +strlen () {
> + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> +}
> +
> +hello_content="Hello World"
> +hello_size=$(strlen "$hello_content")
> +hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> +
> +tree_size=$(($(test_oid rawsz) + 13))
> +
> +commit_message="Initial commit"
> +commit_size=$(($(test_oid hexsz) + 137))
>
Why 13 and 137?
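If I count correctly, the constants are the fixed parts of the tree and commit payloads, which a comment in the test could spell out. The sketch below redoes the arithmetic under assumptions that are mine, not stated in the patch: the tree has a single `100644 hello` entry, the identities are the test suite's default author and committer, and the timestamp has ten digits with a `+0000` zone (the literal timestamp is only a placeholder of the right width).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tree entry: "<mode> <name>" plus a NUL, followed by the raw hash. */
static size_t tree_overhead(void)
{
	return strlen("100644 hello") + 1;
}

/* Commit payload minus the hex hash on the "tree" line. */
static size_t commit_overhead(void)
{
	static const char *const parts[] = {
		"tree \n", /* the hex hash goes between "tree " and "\n" */
		"author A U Thor <author@example.com> 1234567890 +0000\n",
		"committer C O Mitter <committer@example.com> 1234567890 +0000\n",
		"\n",
		"Initial commit",
	};
	size_t total = 0;

	for (size_t i = 0; i < sizeof(parts) / sizeof(parts[0]); i++)
		total += strlen(parts[i]);
	return total;
}

/* Check the sums against the constants used in the quoted test. */
static void check_constants(void)
{
	assert(tree_overhead() == 13);
	assert(commit_overhead() == 137);
}
```

If that accounting is right, `rawsz + 13` and `hexsz + 137` follow directly, and a short comment next to the two assignments would save the next reader the counting.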
> +
> +tag_header_without_oid="type blob
> +tag hellotag
> +tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
> +tag_header_without_timestamp="object $hello_oid
> +$tag_header_without_oid"
> +tag_description="This is a tag"
> +tag_content="$tag_header_without_timestamp 0 +0000
> +
> +$tag_description"
> +
> +tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
> +tag_size=$(strlen "$tag_content")
> +
> +# This section tests --batch-command with remote-object-info command
> +# Since "%(objecttype)" is currently not supported by the command remote-object-info ,
> +# the filters are set to "%(objectname) %(objectsize)".
> +# Tests with the default filter are used to test the fallback to 'fetch' command
> +
> +
> +# Test --batch-command remote-object-info with 'git://' transport
> +
> +. "$TEST_DIRECTORY"/lib-git-daemon.sh
> +start_git_daemon --export-all --enable=receive-pack
> +daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
> +
> +test_expect_success 'create repo to be served by git-daemon' '
> + git init "$daemon_parent" &&
> +
> + echo_without_newline "$hello_content" > $daemon_parent/hello &&
> + git -C "$daemon_parent" update-index --add hello &&
> + git -C "$daemon_parent" config transfer.advertiseobjectinfo true
> +'
> +
> +set_transport_variables () {
> + hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> + tree_sha1=$(git -C "$1" write-tree)
> + commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
> + tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
> + tag_size=$(strlen "$tag_content")
> +}
> +
> +
extra newline here
> +test_expect_success 'batch-command remote-object-info git://' '
> + (
> + set_transport_variables "$daemon_parent" &&
> + cd "$daemon_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1
> + remote-object-info "$GIT_DAEMON_URL/parent" $tree_sha1
> + remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1
> + remote-object-info "$GIT_DAEMON_URL/parent" $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
> + (
> + set_transport_variables "$daemon_parent" &&
> + cd "$daemon_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command remote-object-info http:// default filter' '
> + (
> + set_transport_variables "$daemon_parent" &&
> + cd "$daemon_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
> + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
> + remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command --buffer remote-object-info git://' '
> + (
> + set_transport_variables "$daemon_parent" &&
> + cd "$daemon_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
> + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
> + remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
> + flush
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +stop_git_daemon
> +
> +# Test --batch-command remote-object-info with 'http://' transport
> +
> +. "$TEST_DIRECTORY"/lib-httpd.sh
> +start_httpd
> +
> +test_expect_success 'create repo to be served by http:// transport' '
> + git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
> + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
> + echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
> + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello
> +'
> +
> +
> +test_expect_success 'batch-command remote-object-info http://' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> + remote-object-info "$HTTPD_URL/smart/http_parent" $tree_sha1
> + remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1
> + remote-object-info "$HTTPD_URL/smart/http_parent" $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command remote-object-info http:// one line' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command --buffer remote-object-info http://' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> +
> + git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
> + remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
> + flush
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command remote-object-info http:// default filter' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> +
> + git cat-file --batch-command >actual <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
> + remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on unspported filter option (objectsize:disk)' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> + EOF
> + test_grep "objectsize:disk is currently not supported with remote-object-info" err
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on unspported filter option (deltabase)' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> + EOF
> + test_grep "deltabase is currently not supported with remote-object-info" err
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on server with legacy protocol' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> + EOF
> + test_grep "remote-object-info requires protocol v2" err
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> + test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> + EOF
> + test_grep "remote-object-info requires protocol v2" err
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on malformed OID' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + malformed_object_id="this_id_is_not_valid" &&
> +
> + test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
> + EOF
> + test_grep "Not a valid object name '$malformed_object_id'" err
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on malformed OID fallback' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + malformed_object_id="this_id_is_not_valid" &&
> +
> + test_must_fail git cat-file --batch-command 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
> + EOF
> + test_grep "Not a valid object name '$malformed_object_id'" err
> + )
> +'
> +
> +test_expect_success 'remote-object-info fails on missing OID' '
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
> + test_commit -C missing_oid_repo message1 c.txt &&
> + (
> + cd missing_oid_repo &&
> +
> + object_id=$(git rev-parse message1:c.txt) &&
> + test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
> + EOF
> + test_grep "object-info: not our ref $object_id" err
> + )
> +'
> +
> +# shellcheck disable=SC2016
> +test_expect_success 'remote-object-info fails on missing OID fallback' '
> + (
> + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> + cd missing_oid_repo &&
> + object_id=$(git rev-parse message1:c.txt) &&
> + test_must_fail git cat-file --batch-command 2>err <<-EOF &&
> + remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
> + EOF
> + test_grep "fatal: object-info: not our ref $object_id" err
> + )
> +'
> +
> +# Test --batch-command remote-object-info with 'file://' transport
> +
> +# shellcheck disable=SC2016
> +test_expect_success 'create repo to be served by file:// transport' '
> + git init server &&
> + git -C server config protocol.version 2 &&
> + git -C server config transfer.advertiseobjectinfo true &&
> + echo_without_newline "$hello_content" > server/hello &&
> + git -C server update-index --add hello
> +'
> +
> +
> +test_expect_success 'batch-command remote-object-info file://' '
> + (
> + set_transport_variables "server" &&
> + cd server &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> + remote-object-info "file://$(pwd)" $hello_sha1
> + remote-object-info "file://$(pwd)" $tree_sha1
> + remote-object-info "file://$(pwd)" $commit_sha1
> + remote-object-info "file://$(pwd)" $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
> + (
> + set_transport_variables "server" &&
> + cd server &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> + remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command --buffer remote-object-info file://' '
> + (
> + set_transport_variables "server" &&
> + cd server &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> + git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
> + remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
> + remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
> + flush
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_expect_success 'batch-command remote-object-info file:// default filter' '
> + (
> + set_transport_variables "server" &&
> + cd server &&
> +
> + echo "$hello_sha1 $hello_size" >expect &&
> + echo "$tree_sha1 $tree_size" >>expect &&
> + echo "$commit_sha1 $commit_size" >>expect &&
> + echo "$tag_sha1 $tag_size" >>expect &&
> +
> + git cat-file --batch-command >actual <<-EOF &&
> + remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
> + remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
> + EOF
> + test_cmp expect actual
> + )
> +'
> +
> +test_done
Some more tests I'd like to see:
- Testing against the '-Z' option.
- Testing the fallback to fetching the whole object when the server doesn't
support the 'object-info' capability.
Thanks
* Re: [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-07-09 1:50 ` Justin Tobler
@ 2024-07-12 17:41 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-12 17:41 UTC (permalink / raw)
To: Justin Tobler, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Mon, Jul 8, 2024 at 9:51 PM Justin Tobler <jltobler@gmail.com> wrote:
>
> On 24/06/28 03:05PM, Eric Ju wrote:
> > From: Calvin Wan <calvinwan@google.com>
> >
> > Since the `info` command in cat-file --batch-command prints object info
> > for a given object, it is natural to add another command in cat-file
> > --batch-command to print object info for a given object from a remote.
> > Add `remote-object-info` to cat-file --batch-command.
> >
> > While `info` takes object ids one at a time, this creates overhead when
> > making requests to a server so `remote-object-info` instead can take
> > multiple object ids at once.
> >
> > cat-file --batch-command is generally implemented in the following
> > manner:
> >
> > - Receive and parse input from user
> > - Call respective function attached to command
> > - Set batch mode state, get object info, print object info
> >
> > In --buffer mode, this changes to:
> >
> > - Receive and parse input from user
> > - Store respective function attached to command in a queue
> > - After flush, loop through commands in queue
> > - Call respective function attached to command
> > - Set batch mode state, get object info, print object info
>
> So the problem is that there is overhead associated with getting object
> info from the remote. Therefore, remote-object-info also supports
> batching objects together. This seems reasonable.
>
Thank you, Justin. Yes, you are right: every remote-object-info call
carries some overhead. I explain where this overhead comes from in my
next reply below.
> >
> > Notice how the getting and printing of object info is accomplished one
> > at a time. As described above, this creates a problem for making
> > requests to a server. Therefore, `remote-object-info` is implemented in
> > the following manner:
> >
> > - Receive and parse input from user
> > If command is `remote-object-info`:
> > - Get object info from remote
> > - Loop through object info
> > - Call respective function attached to `info`
> > - Set batch mode state, use passed in object info, print object
> > info
> > Else:
> > - Call respective function attached to command
> > - Parse input, get object info, print object info
> >
> > And finally for --buffer mode `remote-object-info`:
> > - Receive and parse input from user
> > - Store respective function attached to command in a queue
> > - After flush, loop through commands in queue:
> > If command is `remote-object-info`:
> > - Get object info from remote
> > - Loop through object info
> > - Call respective function attached to `info`
> > - Set batch mode state, use passed in object info, print
> > object info
> > Else:
> > - Call respective function attached to command
> > - Set batch mode state, get object info, print object info
> >
> > To summarize, `remote-object-info` gets object info from the remote and
> > then generates multiple `info` commands with the object info passed in.
> >
> > In order for remote-object-info to avoid remote communication overhead
> > in the non-buffer mode, the objects are passed in as such:
>
> Even in non-buffer mode, having separate remote-object-info commands
> would result in additional overhead, correct? From my understanding, each
> command is executed sequentially, so multiple remote-object-info commands
> would always result in additional overhead.
>
Thank you. Yes, regardless of the mode (buffer or non-buffer), each
remote-object-info command incurs overhead. To my understanding,
the overhead has two parts:
1. Setting up a connection. This happens in `connect_setup()`, called
from the `fetch_object_info()` function.
2. Sending the request buffer. This includes initializing the packet
reader in `packet_reader_init()` and putting the OIDs into the request
buffer in `send_object_info_request()`. Both are called from the
`fetch_object_info()` function.
It is therefore more efficient to send multiple OIDs in one request
over one connection, in the form of remote-object-info <remote>
<oid> <oid> ... <oid>
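The overhead described above can be illustrated with a small standalone
model. This is only a sketch: `struct overhead` and the `cost_*` helpers
are hypothetical names made up for illustration, and the model assumes
that each remote-object-info command sets up exactly one connection and
sends exactly one request, however many OIDs it carries.

```c
#include <stddef.h>

/* Hypothetical tally of the two overhead sources described above. */
struct overhead {
	size_t connections; /* connection setups */
	size_t requests;    /* request buffers sent */
};

/*
 * Assumption: every remote-object-info command sets up one connection
 * and sends one request, however many OIDs it carries.
 */
static struct overhead cost_of_commands(size_t nr_commands)
{
	struct overhead o = { nr_commands, nr_commands };
	return o;
}

/* remote-object-info <remote> <oid>, repeated once per OID. */
struct overhead cost_one_oid_per_command(size_t nr_oids)
{
	return cost_of_commands(nr_oids);
}

/* remote-object-info <remote> <oid> <oid> ... <oid>, issued once. */
struct overhead cost_batched(size_t nr_oids)
{
	return cost_of_commands(nr_oids ? 1 : 0);
}
```

Under this model, four OIDs cost four connection setups and four
requests when sent one per command, but only one of each when sent on a
single command line.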
> >
> > remote-object-info <remote> <oid> <oid> ... <oid>
> >
> > rather than
> >
> > remote-object-info <remote> <oid>
> > remote-object-info <remote> <oid>
> > ...
> > remote-object-info <remote> <oid>
> >
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> > Helped-by: Jonathan Tan <jonathantanmy@google.com>
> > Helped-by: Christian Couder <chriscool@tuxfamily.org>
>
> I think the sign-offs are supposed to go at the bottom.
>
Thank you. I am fixing it in v2.
> [snip]
> > @@ -526,51 +533,118 @@ static void batch_one_object(const char *obj_name,
> > (opt->follow_symlinks ? GET_OID_FOLLOW_SYMLINKS : 0);
> > enum get_oid_result result;
> >
> > - result = get_oid_with_context(the_repository, obj_name,
> > - flags, &data->oid, &ctx);
> > - if (result != FOUND) {
> > - switch (result) {
> > - case MISSING_OBJECT:
> > - printf("%s missing%c", obj_name, opt->output_delim);
> > - break;
> > - case SHORT_NAME_AMBIGUOUS:
> > - printf("%s ambiguous%c", obj_name, opt->output_delim);
> > - break;
> > - case DANGLING_SYMLINK:
> > - printf("dangling %"PRIuMAX"%c%s%c",
> > - (uintmax_t)strlen(obj_name),
> > - opt->output_delim, obj_name, opt->output_delim);
> > - break;
> > - case SYMLINK_LOOP:
> > - printf("loop %"PRIuMAX"%c%s%c",
> > - (uintmax_t)strlen(obj_name),
> > - opt->output_delim, obj_name, opt->output_delim);
> > - break;
> > - case NOT_DIR:
> > - printf("notdir %"PRIuMAX"%c%s%c",
> > - (uintmax_t)strlen(obj_name),
> > - opt->output_delim, obj_name, opt->output_delim);
> > - break;
> > - default:
> > - BUG("unknown get_sha1_with_context result %d\n",
> > - result);
> > - break;
> > + if (!opt->use_remote_info) {
>
> When using the remote-object-info command, the object in question is
> supposed to be on the remote and may not exist locally. Therefore we
> skip over `get_oid_with_context()`.
>
Thank you. Yes, that is the reason. I reworded your comment and added
it to the code in v2 to make it easier to follow.
> > + result = get_oid_with_context(the_repository, obj_name,
> > + flags, &data->oid, &ctx);
> > + if (result != FOUND) {
> > + switch (result) {
> > + case MISSING_OBJECT:
> > + printf("%s missing%c", obj_name, opt->output_delim);
> > + break;
> > + case SHORT_NAME_AMBIGUOUS:
> > + printf("%s ambiguous%c", obj_name, opt->output_delim);
> > + break;
> > + case DANGLING_SYMLINK:
> > + printf("dangling %"PRIuMAX"%c%s%c",
> > + (uintmax_t)strlen(obj_name),
> > + opt->output_delim, obj_name, opt->output_delim);
> > + break;
> > + case SYMLINK_LOOP:
> > + printf("loop %"PRIuMAX"%c%s%c",
> > + (uintmax_t)strlen(obj_name),
> > + opt->output_delim, obj_name, opt->output_delim);
> > + break;
> > + case NOT_DIR:
> > + printf("notdir %"PRIuMAX"%c%s%c",
> > + (uintmax_t)strlen(obj_name),
> > + opt->output_delim, obj_name, opt->output_delim);
> > + break;
> > + default:
> > + BUG("unknown get_sha1_with_context result %d\n",
> > + result);
> > + break;
> > + }
> > + fflush(stdout);
> > + return;
> > }
> > - fflush(stdout);
> > - return;
> > - }
> >
> > - if (ctx.mode == 0) {
> > - printf("symlink %"PRIuMAX"%c%s%c",
> > - (uintmax_t)ctx.symlink_path.len,
> > - opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> > - fflush(stdout);
> > - return;
> > + if (ctx.mode == 0) {
> > + printf("symlink %"PRIuMAX"%c%s%c",
> > + (uintmax_t)ctx.symlink_path.len,
> > + opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> > + fflush(stdout);
> > + return;
> > + }
> > }
> >
> > batch_object_write(obj_name, scratch, opt, data, NULL, 0);
> > }
> >
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > +{
> > + int retval = 0;
> > + struct remote *remote = NULL;
> > + struct object_id oid;
> > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> > + static struct transport *gtransport;
> > +
> > + /*
> > + * Change the format to "%(objectname) %(objectsize)" when
> > + * remote-object-info command is used. Once we start supporting objecttype
> > + * the default format should change to DEFAULT_FORMAT
> > + */
> > + if (!opt->format) {
> > + opt->format = "%(objectname) %(objectsize)";
> > + }
>
> We should omit the braces for single-line if statements.
>
Thank you. Fixed in V2.
> > +
> > + remote = remote_get(argv[0]);
> > + if (!remote)
> > + die(_("must supply valid remote when using remote-object-info"));
> > + oid_array_clear(&object_info_oids);
> > + for (size_t i = 1; i < argc; i++) {
> > + if (get_oid_hex(argv[i], &oid))
> > + die(_("Not a valid object name %s"), argv[i]);
> > + oid_array_append(&object_info_oids, &oid);
> > + }
> > +
> > + gtransport = transport_get(remote, NULL);
> > + if (gtransport->smart_options) {
> > + int include_size = 0;
> > +
> > + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> > + gtransport->smart_options->object_info = 1;
> > + gtransport->smart_options->object_info_oids = &object_info_oids;
> > + /*
> > + * 'size' is the only option currently supported.
> > + * Other options that are passed in the format will exit with error.
> > + */
> > + if (strstr(opt->format, "%(objectsize)")) {
> > + string_list_append(&object_info_options, "size");
> > + include_size = 1;
> > + }
> > + if (strstr(opt->format, "%(objecttype)")) {
> > + die(_("objecttype is currently not supported with remote-object-info"));
> > + }
>
> Another single-line if statement above that should omit the braces.
>
Thank you. Fixed in V2.
> > + if (strstr(opt->format, "%(objectsize:disk)"))
> > + die(_("objectsize:disk is currently not supported with remote-object-info"));
> > + if (strstr(opt->format, "%(deltabase)"))
> > + die(_("deltabase is currently not supported with remote-object-info"));
> > + if (object_info_options.nr > 0) {
> > + gtransport->smart_options->object_info_options = &object_info_options;
> > + for (size_t i = 0; i < object_info_oids.nr; i++) {
> > + if (include_size)
> > + remote_object_info[i].sizep = xcalloc(1, sizeof(long));
> > + }
> > + gtransport->smart_options->object_info_data = &remote_object_info;
> > + retval = transport_fetch_refs(gtransport, NULL);
> > + }
> > + } else {
> > + retval = -1;
> > + }
> > +
> > + return retval;
> > +}
> > +
> > struct object_cb_data {
> > struct batch_options *opt;
> > struct expand_data *expand;
> > @@ -642,6 +716,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
> > struct queued_cmd {
> > parse_cmd_fn_t fn;
> > char *line;
> > + const char *name;
>
> Since special handling is needed for the remote-object-info command, we
> record the queued command names to check against later.
>
Yes. We need to compare the command name to do the special handling
later. But I think there is a better solution than a name comparison.
Please see my reply below.
> > };
> >
> > static void parse_cmd_contents(struct batch_options *opt,
> > @@ -662,6 +737,55 @@ static void parse_cmd_info(struct batch_options *opt,
> > batch_one_object(line, output, opt, data);
> > }
> >
> > +static const struct parse_cmd {
> > + const char *name;
> > + parse_cmd_fn_t fn;
> > + unsigned takes_args;
> > +} commands[] = {
> > + { "contents", parse_cmd_contents, 1 },
> > + { "info", parse_cmd_info, 1 },
> > + { "remote-object-info", parse_cmd_info, 1 },
> > + { "flush", NULL, 0 },
> > +};
> > +
> > +static void parse_remote_info(struct batch_options *opt,
> > + char *line,
> > + struct strbuf *output,
> > + struct expand_data *data,
> > + const struct parse_cmd *p_cmd,
> > + struct queued_cmd *q_cmd)
>
> It seems a little confusing to me that `parse_remote_info()` accepts
> both a `parse_cmd` and `queued_cmd`, but only expects to use one or the
> other. It looks like this is done because `dispatch_calls()` already
> accepts `queued_cmd`, but now needs to call `parse_remote_info()`.
>
> Since it is only the underlying command function that is needed by
> `parse_remote_info()`
>
Thank you. I agree. I did some refactoring. Please see my reply below.
> > +{
> > + int count;
> > + const char **argv;
> > +
> > + count = split_cmdline(line, &argv);
> > + if (get_remote_info(opt, count, argv))
> > + goto cleanup;
> > + opt->use_remote_info = 1;
> > + data->skip_object_info = 1;
> > + data->mark_query = 0;
> > + for (size_t i = 0; i < object_info_oids.nr; i++) {
> > + if (remote_object_info[i].sizep)
> > + data->size = *remote_object_info[i].sizep;
> > + if (remote_object_info[i].typep)
> > + data->type = *remote_object_info[i].typep;
> > +
> > + data->oid = object_info_oids.oid[i];
> > + if (p_cmd)
> > + p_cmd->fn(opt, argv[i+1], output, data);
> > + else
> > + q_cmd->fn(opt, argv[i+1], output, data);
> > + }
> > + opt->use_remote_info = 0;
> > + data->skip_object_info = 0;
> > + data->mark_query = 1;
> > +
> > +cleanup:
> > + for (size_t i = 0; i < object_info_oids.nr; i++)
> > + free_object_info_contents(&remote_object_info[i]);
> > + free(remote_object_info);
> > +}
> > +
> > static void dispatch_calls(struct batch_options *opt,
> > struct strbuf *output,
> > struct expand_data *data,
> > @@ -671,8 +795,12 @@ static void dispatch_calls(struct batch_options *opt,
> > if (!opt->buffer_output)
> > die(_("flush is only for --buffer mode"));
> >
> > - for (int i = 0; i < nr; i++)
> > - cmd[i].fn(opt, cmd[i].line, output, data);
> > + for (int i = 0; i < nr; i++) {
> > + if (!strcmp(cmd[i].name, "remote-object-info"))
> > + parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
>
> If we adapt `parse_remote_info()` to accept the command function we
> could pass cmd->fn here instead.
>
Thank you. I think I can push it a bit further.
Under the hood, parse_remote_info() uses parse_cmd_info() to print the
retrieved information to the client. That is why the command table
originally had this line:
...
{ "remote-object-info", parse_cmd_info, 1 },
...
Inspired by your comment, I am thinking of changing the signature of
parse_remote_info() to match that of parse_cmd_info(). It would make
the code cleaner. Specifically, I can:
1. get rid of the name comparison in
...
if (!strcmp(cmd[i].name, "remote-object-info"))
parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
else
cmd[i].fn(opt, cmd[i].line, output, data);
...
and just use `cmd[i].fn(opt, cmd[i].line, output, data)`.
2. get rid of
...
if (p_cmd)
p_cmd->fn(opt, argv[i+1], output, data);
else
q_cmd->fn(opt, argv[i+1], output, data);
...
I will make this change in v2.
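A minimal standalone sketch of this idea follows. It uses simplified
stand-in types rather than the real cat-file structures: the handler
signature, the `append_call()` and `find_cmd()` helpers, and the name
`parse_cmd_remote_info()` are all assumptions made for illustration, not
the code that v2 will contain.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for parse_cmd_fn_t: every handler shares one signature. */
typedef void (*parse_cmd_fn_t)(const char *line, char *out, size_t out_size);

/* Append "<tag>(<line>)" to the NUL-terminated string in `out`. */
static void append_call(char *out, size_t out_size, const char *tag, const char *line)
{
	size_t len = strlen(out);

	snprintf(out + len, out_size - len, "%s(%s)", tag, line);
}

static void parse_cmd_info(const char *line, char *out, size_t out_size)
{
	append_call(out, out_size, "info", line);
}

/* Hypothetical handler with the same signature as parse_cmd_info(). */
static void parse_cmd_remote_info(const char *line, char *out, size_t out_size)
{
	append_call(out, out_size, "remote", line);
}

static const struct parse_cmd {
	const char *name;
	parse_cmd_fn_t fn;
} commands[] = {
	{ "info", parse_cmd_info },
	{ "remote-object-info", parse_cmd_remote_info },
};

/* The name is matched once, while parsing the input line. */
parse_cmd_fn_t find_cmd(const char *name)
{
	for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++)
		if (!strcmp(commands[i].name, name))
			return commands[i].fn;
	return NULL;
}

struct queued_cmd {
	parse_cmd_fn_t fn;
	const char *line;
};

/* Dispatch needs no strcmp(): each queued entry already holds its handler. */
void dispatch_calls(const struct queued_cmd *cmd, size_t nr, char *out, size_t out_size)
{
	for (size_t i = 0; i < nr; i++)
		cmd[i].fn(cmd[i].line, out, out_size);
}
```

Because the remote handler sits in the command table like any other
entry, both the buffered and the non-buffered paths can call the stored
function pointer directly, and the `name` field in `queued_cmd` is no
longer needed.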
> > + else
> > + cmd[i].fn(opt, cmd[i].line, output, data);
> > + }
> >
> > fflush(stdout);
> > }
> > @@ -685,17 +813,6 @@ static void free_cmds(struct queued_cmd *cmd, size_t *nr)
> > *nr = 0;
> > }
> >
> > -
> > -static const struct parse_cmd {
> > - const char *name;
> > - parse_cmd_fn_t fn;
> > - unsigned takes_args;
> > -} commands[] = {
> > - { "contents", parse_cmd_contents, 1},
> > - { "info", parse_cmd_info, 1},
> > - { "flush", NULL, 0},
> > -};
> > -
> > static void batch_objects_command(struct batch_options *opt,
> > struct strbuf *output,
> > struct expand_data *data)
> > @@ -740,11 +857,17 @@ static void batch_objects_command(struct batch_options *opt,
> > dispatch_calls(opt, output, data, queued_cmd, nr);
> > free_cmds(queued_cmd, &nr);
> > } else if (!opt->buffer_output) {
> > - cmd->fn(opt, p, output, data);
> > + if (!strcmp(cmd->name, "remote-object-info")) {
> > + char *line = xstrdup_or_null(p);
> > + parse_remote_info(opt, line, output, data, cmd, NULL);
>
> Same here, if we adapt `parse_remote_info()` to accept the command
> function we could pass cmd->fn here instead.
Thank you. Please see my reply above.
> > + } else {
> > + cmd->fn(opt, p, output, data);
> > + }
> > } else {
> > ALLOC_GROW(queued_cmd, nr + 1, alloc);
> > call.fn = cmd->fn;
> > call.line = xstrdup_or_null(p);
> > + call.name = cmd->name;
> > queued_cmd[nr++] = call;
> > }
> > }
> > @@ -761,8 +884,6 @@ static void batch_objects_command(struct batch_options *opt,
> > strbuf_release(&input);
> > }
> >
> > -#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> > -
> > static int batch_objects(struct batch_options *opt)
> > {
> > struct strbuf input = STRBUF_INIT;
> [snip]
>
> > + } else {
> > + cmd->fn(opt, p, output, data);
> > + }
> > } else {
> > ALLOC_GROW(queued_cmd, nr + 1, alloc);
> > call.fn = cmd->fn;
> > call.line = xstrdup_or_null(p);
> > + call.name = cmd->name;
> > queued_cmd[nr++] = call;
> > }
> > }
> > @@ -761,8 +884,6 @@ static void batch_objects_command(struct batch_options *opt,
> > strbuf_release(&input);
> > }
> >
> > -#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> > -
> > static int batch_objects(struct batch_options *opt)
> > {
> > struct strbuf input = STRBUF_INIT;
> [snip]
* Re: [PATCH 4/6] transport: add client support for object-info
2024-07-09 7:15 ` Toon claes
2024-07-09 16:37 ` Junio C Hamano
@ 2024-07-13 2:30 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-13 2:30 UTC (permalink / raw)
To: Toon claes, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Tue, Jul 9, 2024 at 3:16 AM Toon claes <toon@iotcl.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > diff --git a/transport.c b/transport.c
> > index 83ddea8fbc..2847aa3f3c 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -436,11 +504,27 @@ static int fetch_refs_via_pack(struct transport *transport,
> > args.server_options = transport->server_options;
> > args.negotiation_tips = data->options.negotiation_tips;
> > args.reject_shallow_remote = transport->smart_options->reject_shallow;
> > -
> > - if (!data->finished_handshake) {
> > - int i;
> > + args.object_info = transport->smart_options->object_info;
> > +
> > + if (transport->smart_options && transport->smart_options->object_info) {
> > + struct ref *ref = object_info_refs;
> > +
> > + if (!fetch_object_info(transport, data->options.object_info_data))
> > + goto cleanup;
> > + args.object_info_data = data->options.object_info_data;
> > + args.quiet = 1;
> > + args.no_progress = 1;
> > + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> > + struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
> > + temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
>
> Any reason why you're not using the subscript operator (square brackets)
> like this:
>
> + temp_ref->old_oid = transport->smart_options->object_info_oids->oid[i];
>
Thank you. Fixed in V2.
> > + temp_ref->exact_oid = 1;
> > + ref->next = temp_ref;
> > + ref = ref->next;
> > + }
> > + transport->remote_refs = object_info_refs->next;
>
> I find it a bit weird you're allocating object_info_refs, only to use it
> to point to the next. Can I suggest a little refactor:
>
Thank you. I have to agree that the old implementation of iterating on
the object_info_refs linked list is a bit obscure.
Your suggestion is easier to follow. I am replacing the old logic in V2.
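For reference, here is a minimal, self-contained sketch of building such a list without a placeholder head node. It uses a pointer to the tail link, which is a different idiom from the suggested refactor, and it is not the code from either version of the patch. `struct node`, `build_list()` and `list_length()` are hypothetical names chosen for illustration; they are not part of git's `struct ref` API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's struct ref; only the fields needed here. */
struct node {
	int id;
	struct node *next;
};

/*
 * Build a list with one node per id, in order. `tail` always points at the
 * link that should receive the next node, so no dummy head is needed and no
 * empty node is left at the end of the list.
 */
static struct node *build_list(const int *ids, size_t nr)
{
	struct node *head = NULL;
	struct node **tail = &head;

	assert(ids || !nr);
	for (size_t i = 0; i < nr; i++) {
		struct node *n = calloc(1, sizeof(*n));
		if (!n)
			abort();
		n->id = ids[i];
		*tail = n;		/* link the new node in */
		tail = &n->next;	/* the next node goes after it */
	}
	return head;
}

/* Count the nodes reachable from `n`. */
static size_t list_length(const struct node *n)
{
	size_t len = 0;

	for (; n; n = n->next)
		len++;
	return len;
}
```

With this shape an empty input yields a NULL list, and the list length always equals the number of ids passed in. The caller is responsible for freeing the nodes.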
> ----8<-----8<----
> diff --git a/transport.c b/transport.c
> index 662faa004e..56cb3a1693 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -479,7 +479,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> struct ref *refs = NULL;
> struct fetch_pack_args args;
> struct ref *refs_tmp = NULL;
> - struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
> + struct ref *object_info_refs = NULL;
>
> memset(&args, 0, sizeof(args));
> args.uploadpack = data->options.uploadpack;
> @@ -509,7 +509,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.object_info = transport->smart_options->object_info;
>
> if (transport->smart_options && transport->smart_options->object_info) {
> - struct ref *ref = object_info_refs;
> + struct ref *ref = object_info_refs = xcalloc(1, sizeof (struct ref));
>
> if (!fetch_object_info(transport, data->options.object_info_data))
> goto cleanup;
> @@ -517,13 +517,12 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.quiet = 1;
> args.no_progress = 1;
> for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> - struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
> - temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
> - temp_ref->exact_oid = 1;
> - ref->next = temp_ref;
> + ref->old_oid = transport->smart_options->object_info_oids->oid[i];
> + ref->exact_oid = 1;
> + ref->next = xcalloc(1, sizeof (struct ref));
> ref = ref->next;
> }
> - transport->remote_refs = object_info_refs->next;
> + transport->remote_refs = object_info_refs;
> } else if (!data->finished_handshake) {
> int must_list_refs = 0;
> for (int i = 0; i < nr_heads; i++) {
> @@ -565,7 +564,7 @@ static int fetch_refs_via_pack(struct transport *transport,
>
> data->finished_handshake = 0;
> if (args.object_info) {
> - struct ref *ref_cpy_reader = object_info_refs->next;
> + struct ref *ref_cpy_reader = object_info_refs;
> for (int i = 0; ref_cpy_reader; i++) {
> oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &(*args.object_info_data)[i], OBJECT_INFO_LOOKUP_REPLACE);
> ref_cpy_reader = ref_cpy_reader->next;
> ----8<-----8<----
>
> To be honest, I'm not sure it works, because fetch_object_info() always
> seem to return a non-zero value. I'm not sure this is due to missing
> code coverage, or a bug. I guess it's worth looking into.
>
Thank you. I tested your suggestion and it works. I confirmed this by
stepping through the following in my debugger:
1. Pause on a test case of t/t1017-cat-file-remote-object-info.sh.
2. Run git cat-file "--batch-command=%(objectname) %(objectsize)".
3. Enter remote-object-info http://127.0.0.1:11017/smart/http_parent
5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
With breakpoints set along the way, I saw that fetch_object_info() returned zero.
Would you mind sharing your test steps with me? I would love to dig deeper.
> --
> Toon
* Re: [PATCH 4/6] transport: add client support for object-info
2024-07-09 16:37 ` Junio C Hamano
@ 2024-07-13 2:32 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-13 2:32 UTC (permalink / raw)
To: Junio C Hamano, git
Cc: Toon claes, Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Tue, Jul 9, 2024 at 12:37 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Toon claes <toon@iotcl.com> writes:
>
> >> + temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
> >
> > Any reason why you're not using the subscript operator (square brackets)
> > like this:
> >
> > + temp_ref->old_oid = transport->smart_options->object_info_oids->oid[i];
>
> Much nicer, but fold such overly long lines, please,
>
> temp_ref->old_oid = transport->smart_options->
> object_info_oids->oid[i];
>
> to make them readable.
>
>
Thank you, sir. I will follow the folding format in V2.
>
> > ...
> > To be honest, I'm not sure it works, because fetch_object_info() always
> > seem to return a non-zero value. I'm not sure this is due to missing
> > code coverage, or a bug. I guess it's worth looking into.
* Re: [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-07-09 7:16 ` Toon claes
@ 2024-07-13 2:35 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-13 2:35 UTC (permalink / raw)
To: Toon claes, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Tue, Jul 9, 2024 at 3:16 AM Toon claes <toon@iotcl.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 72a78cdc8c..34958a1747 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > ...
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > +{
> > + int retval = 0;
> > + struct remote *remote = NULL;
> > + struct object_id oid;
> > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> > + static struct transport *gtransport;
> > +
> > + /*
> > + * Change the format to "%(objectname) %(objectsize)" when
> > + * remote-object-info command is used. Once we start supporting objecttype
> > + * the default format should change to DEFAULT_FORMAT
> > + */
>
> I believe this comment has become outdated, or got moved around
> incorrectly.
>
Thank you, Toon. Sorry, I don't quite follow. I don't think this comment
is outdated. It sits right before this code:
if (!opt->format) {
opt->format = "%(objectname) %(objectsize)";
}
It is related to the 2nd open question in my cover letter:
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when
remote-object-info is used.
Any suggestions on this approach?
> > diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> > new file mode 100755
> > index 0000000000..7a7bdfeb91
> > --- /dev/null
> > +++ b/t/t1017-cat-file-remote-object-info.sh
> > ...
> > +stop_git_daemon
> > +
> > +# Test --batch-command remote-object-info with 'http://' transport
> > +
> > +. "$TEST_DIRECTORY"/lib-httpd.sh
> > +start_httpd
>
> start_httpd skips the remainder of the tests if it fails to start the
> httpd server. That's why I see various other tests which have this at
> the end:
>
> # DO NOT add non-httpd-specific tests here, because the last part of this
> # test script is only executed when httpd is available and enabled.
>
> So I would suggest to add this comment as well, and move the file://
> tests above start_httpd.
>
Thank you. Fixing it in V2
> --
> Toon
* Re: [PATCH 1/6] fetch-pack: refactor packet writing
2024-07-10 9:39 ` Karthik Nayak
@ 2024-07-15 16:40 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-15 16:40 UTC (permalink / raw)
To: Karthik Nayak, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Wed, Jul 10, 2024 at 5:39 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Peijian Ju <eric.peijian@gmail.com> writes:
> [snip]
>
> >> Right, this commit in itself looks good. But I was curious why we need
> >> this, so I took a sneak peek at the following commits.
> >>
> >> To summarize, we want to call:
> >> `write_command_and_capabilities(..., "object-info");`
> >> in the upcoming patches to get the object-info details from the server.
> >> But isn't this function too specific to the "fetch" command to be
> >> generalized to be for "object-info" too?
> >>
> >> Wouldn't it make sense to add a custom function for 'object-info' in
> >> 'connect.c'? Like how we currently have `get_remote_bundle_uri()` for
> >> 'bundle-uri' and `get_remote_refs` for 'ls-refs'?
> >
> > Thank you. I am reading through the old comments left by Taylor
> > at https://lore.kernel.org/git/YkOPyc9tUfe2Tozx@nand.local/
> >
> > " Makes obvious sense, and this was something that jumped out to me when I
> > looked at the first and second versions of this patch. I'm glad that
> > this is getting factored out."
> >
> >
> > It seems that refactoring this into a more general function was
> > intentional. Using this general function to request a capability is
> > encouraged over adding a custom function.
> > Taylor’s comment was made 2 years ago, but I think refactoring this
> > into a more general function to enforce DRY still makes sense.
>
> It would make sense then to move the existing users to also use
> `write_command_and_capabilities` eventually. I guess this could be done
> in a follow up series.
>
> Then I would say `write_command_and_capabilities()` should be moved to
> `transport.c`, no?
Thank you. I am not sure about this. Currently, the file dependency
looks like this:
`transport.c` -> `fetch-pack.c` -> `connect.c`, where "->" means "depends on".
Moving `write_command_and_capabilities()` to `transport.c` would create
a circular dependency.
If we want `write_command_and_capabilities()` to be a more general
utility function, it seems to make more sense to move it to `connect.c`.
I see a number of such general utility functions in `connect.c`, such as
`send_capabilities()`. Some custom functions such as
`get_remote_bundle_uri()` and `get_remote_refs()` also live there.
Please let me know what you think. Thanks.
* Re: [PATCH 4/6] transport: add client support for object-info
2024-07-10 10:13 ` Karthik Nayak
@ 2024-07-16 2:39 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-16 2:39 UTC (permalink / raw)
To: Karthik Nayak, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Wed, Jul 10, 2024 at 6:13 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > From: Calvin Wan <calvinwan@google.com>
> >
> > Sometimes it is useful to get information about an object without having
> > to download it completely. The server logic has already been implemented
> > as “a2ba162cda (object-info: support for retrieving object info,
>
> Nit: s/as/in
>
Thank you. Fixed in V2.
> > 2021-04-20)”.
> >
> > Add client functions to communicate with the server.
> >
> > The client currently supports requesting a list of object ids with
> > features 'size' and 'type' from a v2 server. If a server does not
>
> But do we support type? I thought we only added support for 'size'.
>
Thank you. Yes, only size is supported, I will revise it.
> > advertise either of the requested features, then the client falls back
> > to making the request through 'fetch'.
> >
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> > Helped-by: Jonathan Tan <jonathantanmy@google.com>
> > Helped-by: Christian Couder <chriscool@tuxfamily.org>
> > ---
> > fetch-pack.c | 24 +++++++++++
> > fetch-pack.h | 10 +++++
> > transport-helper.c | 8 +++-
> > transport.c | 102 ++++++++++++++++++++++++++++++++++++++++++---
> > transport.h | 11 +++++
> > 5 files changed, 148 insertions(+), 7 deletions(-)
> >
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index da0de9c537..d533cac1d8 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1345,6 +1345,27 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
> > packet_buf_delim(req_buf);
> > }
> >
> > +void send_object_info_request(int fd_out, struct object_info_args *args)
> > +{
> > + struct strbuf req_buf = STRBUF_INIT;
> > +
> > + write_command_and_capabilities(&req_buf, args->server_options, "object-info");
> > +
> > + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> > + packet_buf_write(&req_buf, "size");
> > +
> > + if (args->oids) {
> > + for (size_t i = 0; i < args->oids->nr; i++)
> > + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> > + }
> > +
> > + packet_buf_flush(&req_buf);
> > + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> > + die_errno(_("unable to write request to remote"));
> > +
> > + strbuf_release(&req_buf);
> > +}
> > +
> > static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
> > struct fetch_pack_args *args,
> > const struct ref *wants, struct oidset *common,
> > @@ -1682,6 +1703,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> > if (args->depth > 0 || args->deepen_since || args->deepen_not)
> > args->deepen = 1;
> >
> > + if (args->object_info)
> > + state = FETCH_SEND_REQUEST;
> > +
> > while (state != FETCH_DONE) {
> > switch (state) {
> > case FETCH_CHECK_LOCAL:
> > diff --git a/fetch-pack.h b/fetch-pack.h
> > index 6775d26517..16e4dc0824 100644
> > --- a/fetch-pack.h
> > +++ b/fetch-pack.h
> > @@ -16,6 +16,7 @@ struct fetch_pack_args {
> > const struct string_list *deepen_not;
> > struct list_objects_filter_options filter_options;
> > const struct string_list *server_options;
> > + struct object_info **object_info_data;
> >
> > /*
> > * If not NULL, during packfile negotiation, fetch-pack will send "have"
> > @@ -42,6 +43,7 @@ struct fetch_pack_args {
> > unsigned reject_shallow_remote:1;
> > unsigned deepen:1;
> > unsigned refetch:1;
> > + unsigned object_info:1;
> >
> > /*
> > * Indicate that the remote of this request is a promisor remote. The
> > @@ -68,6 +70,12 @@ struct fetch_pack_args {
> > unsigned connectivity_checked:1;
> > };
> >
> > +struct object_info_args {
> > + struct string_list *object_info_options;
> > + const struct string_list *server_options;
> > + struct oid_array *oids;
> > +};
> > +
> > /*
> > * sought represents remote references that should be updated from.
> > * On return, the names that were found on the remote will have been
> > @@ -101,4 +109,6 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
> > */
> > int report_unmatched_refs(struct ref **sought, int nr_sought);
> >
> > +void send_object_info_request(int fd_out, struct object_info_args *args);
> > +
> > #endif
> > diff --git a/transport-helper.c b/transport-helper.c
> > index 9820947ab2..670d1e7068 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -697,13 +697,17 @@ static int fetch_refs(struct transport *transport,
> >
> > /*
> > * If we reach here, then the server, the client, and/or the transport
> > - * helper does not support protocol v2. --negotiate-only requires
> > - * protocol v2.
> > + * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
> > + * require protocol v2.
> > */
> > if (data->transport_options.acked_commits) {
> > warning(_("--negotiate-only requires protocol v2"));
> > return -1;
> > }
> > + if (transport->smart_options->object_info) {
> > + // fail the command explicitly to avoid further commands input
> > + die(_("remote-object-info requires protocol v2"));
> > + }
> >
> > if (!data->get_refs_list_called)
> > get_refs_list_using_list(transport, 0);
> > diff --git a/transport.c b/transport.c
> > index 83ddea8fbc..2847aa3f3c 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -363,6 +363,73 @@ static struct ref *handshake(struct transport *transport, int for_push,
> > return refs;
> > }
> >
> > +static int fetch_object_info(struct transport *transport, struct object_info **object_info_data)
> > +{
> > + int size_index = -1;
> > + struct git_transport_data *data = transport->data;
> > + struct object_info_args args;
> > + struct packet_reader reader;
> > +
> > + memset(&args, 0, sizeof(args));
>
> Nit: we could `struct object_info_args args = { 0 };` above instead.
Thank you. Your suggestion has better readability and maintainability.
I am adopting it in V2.
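As a small illustration of the two spellings being compared, here is a self-contained sketch. `struct example_args` and both helper functions are hypothetical and only mimic the shape of a struct holding three pointers. On the platforms git targets, both forms leave every member as a null pointer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct holding three pointers, for illustration only. */
struct example_args {
	const void *server_options;
	const void *options;
	const void *oids;
};

/* Zero the struct after declaring it, as the patch under review does. */
static struct example_args init_with_memset(void)
{
	struct example_args args;

	memset(&args, 0, sizeof(args));
	return args;
}

/* Zero the struct at its declaration, as the review suggests. */
static struct example_args init_with_initializer(void)
{
	struct example_args args = { 0 };

	return args;
}
```

The initializer form keeps the declaration and the zeroing in one place, so there is no window in which `args` is declared but not yet initialized.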
>
> > + args.server_options = transport->server_options;
> > + args.object_info_options = transport->smart_options->object_info_options;
> > + args.oids = transport->smart_options->object_info_oids;
> > +
> > + connect_setup(transport, 0);
> > + packet_reader_init(&reader, data->fd[0], NULL, 0,
> > + PACKET_READ_CHOMP_NEWLINE |
> > + PACKET_READ_GENTLE_ON_EOF |
> > + PACKET_READ_DIE_ON_ERR_PACKET);
> > + data->version = discover_version(&reader);
> > +
> > + transport->hash_algo = reader.hash_algo;
> > +
> > + switch (data->version) {
> > + case protocol_v2:
> > + if (!server_supports_v2("object-info"))
> > + return -1;
> > + if (unsorted_string_list_has_string(args.object_info_options, "size")
> > + && !server_supports_feature("object-info", "size", 0)) {
> > + return -1;
> > + }
> > + send_object_info_request(data->fd[1], &args);
> > + break;
> > + case protocol_v1:
> > + case protocol_v0:
> > + die(_("wrong protocol version. expected v2"));
> > + case protocol_unknown_version:
> > + BUG("unknown protocol version");
> > + }
> > +
> > + for (size_t i = 0; i < args.object_info_options->nr; i++) {
> > + if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
> > + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> > + return -1;
> > + }
> > + if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
> > + if (!strcmp(reader.line, "size"))
> > + size_index = i;
> > + continue;
> > + }
> > + return -1;
> > + }
> > +
> > + for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
> > + struct string_list object_info_values = STRING_LIST_INIT_DUP;
>
> We need to also call `string_list_clear()` at the end of this block.
>
> > +
> > + string_list_split(&object_info_values, reader.line, ' ', -1);
> > + if (0 <= size_index) {
> > + if (!strcmp(object_info_values.items[1 + size_index].string, ""))
> > + die("object-info: not our ref %s",
> > + object_info_values.items[0].string);
> > + *(*object_info_data)[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
>
> Perhaps `*object_info_data[i]->sizep =
> strtoul(object_info_values.items[1 + size_index].string, NULL, 10);`?
>
> So, this is allocated in 'cat-file' and set here? Wouldn't it be nicer
> to also do the alloc here?
>
> > Perhaps `*object_info_data[i]->sizep =
> > strtoul(object_info_values.items[1 + size_index].string, NULL, 10);`?
Thank you.
It seems that `*(*object_info_data)[i].sizep` and
`*object_info_data[i]->sizep` are not the same.
Given that object_info_data is a pointer to a pointer to struct
object_info, `*(*object_info_data)[i].sizep` does the following:
1. *object_info_data dereferences object_info_data, yielding a pointer
to the first element of the array of struct object_info.
2. (*object_info_data)[i] accesses the i-th element in the array of
struct object_info that *object_info_data points to.
3. (*object_info_data)[i].sizep accesses the sizep member of the i-th
struct object_info.
4. *(*object_info_data)[i].sizep dereferences the sizep pointer,
yielding the value it points to.
So we are interested in the array of struct object_info whose first
element is at *object_info_data. A more intuitive way to think about it:
if we treat object_info_data as a 2-D array,
(*object_info_data)[i] accesses object_info_data[0][i].
For `*object_info_data[i]->sizep`:
1. object_info_data[i] accesses the i-th element in the array of
pointers to struct object_info.
2. object_info_data[i]->sizep accesses the sizep member of the
struct object_info that object_info_data[i] points to.
3. *object_info_data[i]->sizep dereferences the sizep pointer,
yielding the value it points to.
*object_info_data[i]->sizep treats object_info_data as an array of
pointers. In the 2-D array mental model, *object_info_data[i] is
like object_info_data[i][0].
Nevertheless, I do think using a pointer to a pointer is tricky and
error-prone. In V2, I am refactoring the code to use just a pointer
instead of a pointer to a pointer. For example, in transport.h:
struct git_transport_options {
...
struct object_info *object_info_data;
...
};
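The difference between the two expressions can be shown with a small, self-contained sketch. `struct info` and the two accessor functions are hypothetical stand-ins, not git code; only the pointer arithmetic mirrors the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct object_info; only sizep matters here. */
struct info {
	unsigned long *sizep;
};

/*
 * `data` points at a single pointer, which in turn points at the first
 * element of an array of struct info. This is the layout the patch uses.
 */
static unsigned long size_via_array(struct info **data, size_t i)
{
	return *(*data)[i].sizep;	/* same as *data[0][i].sizep */
}

/*
 * `data` points at the first element of an array of pointers, each of
 * which points at one struct info. This is the layout that
 * `*data[i]->sizep` expects.
 */
static unsigned long size_via_pointer_array(struct info **data, size_t i)
{
	return *data[i]->sizep;		/* same as *data[i][0].sizep */
}
```

Both functions take a `struct info **`, so the compiler cannot tell the layouts apart. Calling the second accessor on the first layout with i > 0 would index past the single pointer, which is why the two spellings are not interchangeable.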
> > So, this is allocated in 'cat-file' and set here? Wouldn't it be nicer
> > to also do the alloc here?
Thank you.
Yes, this makes sense, V2 is refactoring the allocation into
`fetch_object_info()` in transport.c
> > + }
> > + }
> > + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> > +
> > + return 0;
> > +}
> > +
> > static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> > struct transport_ls_refs_options *options)
> > {
> > @@ -410,6 +477,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> > struct ref *refs = NULL;
> > struct fetch_pack_args args;
> > struct ref *refs_tmp = NULL;
> > + struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
> >
> > memset(&args, 0, sizeof(args));
> > args.uploadpack = data->options.uploadpack;
> > @@ -436,11 +504,27 @@ static int fetch_refs_via_pack(struct transport *transport,
> > args.server_options = transport->server_options;
> > args.negotiation_tips = data->options.negotiation_tips;
> > args.reject_shallow_remote = transport->smart_options->reject_shallow;
> > -
> > - if (!data->finished_handshake) {
> > - int i;
> > + args.object_info = transport->smart_options->object_info;
> > +
> > + if (transport->smart_options && transport->smart_options->object_info) {
> > + struct ref *ref = object_info_refs;
> > +
> > + if (!fetch_object_info(transport, data->options.object_info_data))
> > + goto cleanup;
> > + args.object_info_data = data->options.object_info_data;
> > + args.quiet = 1;
> > + args.no_progress = 1;
> > + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> > + struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
> > + temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
> > + temp_ref->exact_oid = 1;
> > + ref->next = temp_ref;
> > + ref = ref->next;
> > + }
> > + transport->remote_refs = object_info_refs->next;
> > + } else if (!data->finished_handshake) {
> > int must_list_refs = 0;
> > - for (i = 0; i < nr_heads; i++) {
> > + for (int i = 0; i < nr_heads; i++) {
> > if (!to_fetch[i]->exact_oid) {
> > must_list_refs = 1;
> > break;
> > @@ -478,11 +562,18 @@ static int fetch_refs_via_pack(struct transport *transport,
> > &transport->pack_lockfiles, data->version);
> >
> > data->finished_handshake = 0;
> > + if (args.object_info) {
> > + struct ref *ref_cpy_reader = object_info_refs->next;
> > + for (int i = 0; ref_cpy_reader; i++) {
> > + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &(*args.object_info_data)[i], OBJECT_INFO_LOOKUP_REPLACE);
> > + ref_cpy_reader = ref_cpy_reader->next;
> > + }
> > + }
> > data->options.self_contained_and_connected =
> > args.self_contained_and_connected;
> > data->options.connectivity_checked = args.connectivity_checked;
> >
> > - if (!refs)
> > + if (!refs && !args.object_info)
> > ret = -1;
> > if (report_unmatched_refs(to_fetch, nr_heads))
> > ret = -1;
> > @@ -498,6 +589,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> > free_refs(refs_tmp);
> > free_refs(refs);
> > list_objects_filter_release(&args.filter_options);
> > + free_refs(object_info_refs);
>
> Shouldn't we loop through `object_info_refs->next` and free all of them ?
>
Thank you. I think free_refs() already has the logic to loop through
object_info_refs->next and free the whole linked list.
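The general pattern of such a list-freeing helper can be sketched as follows. This is an illustration under stated assumptions, not a copy of git's implementation: `struct node`, `push_node()` and `free_list()` are hypothetical names, and the return value exists only to make the behavior observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; stands in for git's struct ref. */
struct node {
	struct node *next;
};

/* Prepend a zeroed node to the list and return the new head. */
static struct node *push_node(struct node *head)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		abort();
	n->next = head;
	return n;
}

/*
 * Free every node reachable from `n`. The next pointer is read before the
 * current node is freed. Returns the number of nodes released.
 */
static size_t free_list(struct node *n)
{
	size_t freed = 0;

	while (n) {
		struct node *next = n->next;

		free(n);
		n = next;
		freed++;
	}
	return freed;
}
```

Passing the head of the list is enough; the loop walks the `next` pointers itself, and a NULL head is a no-op.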
> > return ret;
> > }
> >
> > diff --git a/transport.h b/transport.h
> > index 6393cd9823..5a3cda1860 100644
> > --- a/transport.h
> > +++ b/transport.h
> > @@ -5,6 +5,7 @@
> > #include "remote.h"
> > #include "list-objects-filter-options.h"
> > #include "string-list.h"
> > +#include "object-store.h"
> >
> > struct git_transport_options {
> > unsigned thin : 1;
> > @@ -30,6 +31,12 @@ struct git_transport_options {
> > */
> > unsigned connectivity_checked:1;
> >
> > + /*
> > + * Transport will attempt to pull only object-info. Fallbacks
> > + * to pulling entire object if object-info is not supported.
> > + */
> > + unsigned object_info : 1;
> > +
> > int depth;
> > const char *deepen_since;
> > const struct string_list *deepen_not;
> > @@ -53,6 +60,10 @@ struct git_transport_options {
> > * common commits to this oidset instead of fetching any packfiles.
> > */
> > struct oidset *acked_commits;
> > +
> > + struct oid_array *object_info_oids;
> > + struct object_info **object_info_data;
> > + struct string_list *object_info_options;
> > };
> >
> > enum transport_family {
> > --
> > 2.45.2
>
> I'm wondering if we can add tests at this stage.
Thank you. V2 is adding more tests to cover this.
* Re: [PATCH 5/6] cat-file: add declaration of variable i inside its for loop
2024-07-10 10:16 ` Karthik Nayak
@ 2024-07-16 2:59 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-16 2:59 UTC (permalink / raw)
To: Karthik Nayak, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Wed, Jul 10, 2024 at 6:16 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > Some code declares variable i and only uses it
> > in a for loop, not in any other logic outside the loop.
> >
> > Change the declaration of i to be inside the for loop for readability.
> >
>
> If we're doing this anyways, we could replace the 'int' with 'size_t'
> too.
>
Thank you. Fixed in V2
> [snip]
* Re: [PATCH 6/6] cat-file: add remote-object-info to batch-command
2024-07-10 12:08 ` Karthik Nayak
@ 2024-07-17 2:38 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-07-17 2:38 UTC (permalink / raw)
To: Karthik Nayak, git; +Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai
On Wed, Jul 10, 2024 at 8:08 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > From: Calvin Wan <calvinwan@google.com>
> >
> > Since the `info` command in cat-file --batch-command prints object info
> > for a given object, it is natural to add another command in cat-file
> > --batch-command to print object info for a given object from a remote.
> > Add `remote-object-info` to cat-file --batch-command.
> >
> > While `info` takes object ids one at a time, this creates overhead when
> > making requests to a server so `remote-object-info` instead can take
> > multiple object ids at once.
> >
> > cat-file --batch-command is generally implemented in the following
> > manner:
> >
> > - Receive and parse input from user
>
> So this refers to input delimited by newline or '\0'.
>
Thank you. The input handling should take both newline and '\0' into
consideration. We are missing some test coverage for '\0'-delimited
input, though. I am adding it in V2.
> > - Call respective function attached to command
> > - Set batch mode state, get object info, print object info
> >
>
> Doesn't the batch mode get set before the input parsing begins?
>
Thank you. Yes, I am also unsure what Calvin means by "Set batch mode
state" here. The batch mode is determined when the cat-file command is
invoked. But I do see `opt->batch_mode = BATCH_MODE_INFO;` in
`parse_cmd_info()` and
`opt->batch_mode = BATCH_MODE_CONTENTS;` in `parse_cmd_contents()`. I
guess that is what Calvin refers to.
Anyway, I am removing "Set batch mode state" in V2 to avoid confusion;
it seems too detailed.
> > In --buffer mode, this changes to:
> >
> > - Receive and parse input from user
> > - Store respective function attached to command in a queue
> > - After flush, loop through commands in queue
> > - Call respective function attached to command
> > - Set batch mode state, get object info, print object info
> >
> > Notice how the getting and printing of object info is accomplished one
> > at a time. As described above, this creates a problem for making
> > requests to a server. Therefore, `remote-object-info` is implemented in
> > the following manner:
> >
> > - Receive and parse input from user
> > If command is `remote-object-info`:
> > - Get object info from remote
> > - Loop through object info
> > - Call respective function attached to `info`
> > - Set batch mode state, use passed in object info, print object
> > info
> > Else:
> > - Call respective function attached to command
> > - Parse input, get object info, print object info
> >
>
> So this is because we want 'remote-object-info' to also use
> 'parse_cmd_info' similar to 'info'. But I'm not understanding why,
> especially since 'parse_cmd_info' calls 'batch_one_object', and we skip
> most of that code for 'remote-object-info'.
>
> Wouldn't it be cleaner to just define our own 'batch_remote_object' and
> create 'parse_cmd_remote_info' ?
>
Thank you. That makes sense. Actually, I am pushing it a bit further in V2:
1. The signature of parse_remote_info() is changed to match parse_cmd_fn_t,
and the function is renamed to `parse_cmd_remote_info()`.
2. In `static const struct parse_cmd{...} commands[]`,
"remote-object-info" is attached to parse_cmd_remote_info() directly.
3. Inside parse_cmd_remote_info(), we don't need
`batch_remote_object()`; all we need is `batch_object_write()` to
print the object info. That will simplify the code a lot.
We don't need to call parse_cmd_info() either, and we can get rid of the
name comparison logic, i.e. `if (!strcmp(cmd[i].name, "remote-object-info"))
...`
> > And finally for --buffer mode `remote-object-info`:
> > - Receive and parse input from user
> > - Store respective function attached to command in a queue
> > - After flush, loop through commands in queue:
> > If command is `remote-object-info`:
> > - Get object info from remote
> > - Loop through object info
> > - Call respective function attached to `info`
> > - Set batch mode state, use passed in object info, print
> > object info
> > Else:
> > - Call respective function attached to command
> > - Set batch mode state, get object info, print object info
> >
> > To summarize, `remote-object-info` gets object info from the remote and
> > then generates multiple `info` commands with the object info passed in.
> >
> > In order for remote-object-info to avoid remote communication overhead
> > in the non-buffer mode, the objects are passed in as such:
> >
> > remote-object-info <remote> <oid> <oid> ... <oid>
> >
> > rather than
> >
> > remote-object-info <remote> <oid>
> > remote-object-info <remote> <oid>
> > ...
> > remote-object-info <remote> <oid>
> >
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> > Helped-by: Jonathan Tan <jonathantanmy@google.com>
> > Helped-by: Christian Couder <chriscool@tuxfamily.org>
> > ---
> > Documentation/git-cat-file.txt | 22 +-
> > builtin/cat-file.c | 231 ++++++++++----
> > object-file.c | 11 +
> > object-store-ll.h | 3 +
> > t/t1017-cat-file-remote-object-info.sh | 412 +++++++++++++++++++++++++
> > 5 files changed, 620 insertions(+), 59 deletions(-)
> > create mode 100755 t/t1017-cat-file-remote-object-info.sh
> >
> > diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> > index bd95a6c10a..ab0647bb39 100644
> > --- a/Documentation/git-cat-file.txt
> > +++ b/Documentation/git-cat-file.txt
> > @@ -149,6 +149,12 @@ info <object>::
> > Print object info for object reference `<object>`. This corresponds to the
> > output of `--batch-check`.
> >
> > +remote-object-info <remote> <object>...::
> > + Print object info for object references `<object>` at the specified `<remote>` without
> > + downloading the objects from the remote.
> > + Errors out when no object reference is provided.
> > + This command may be combined with `--buffer`.
> > +
> > flush::
> > Used with `--buffer` to execute all preceding commands that were issued
> > since the beginning or since the last flush was issued. When `--buffer`
> > @@ -290,7 +296,8 @@ newline. The available atoms are:
> > The full hex representation of the object name.
> >
> > `objecttype`::
> > - The type of the object (the same as `cat-file -t` reports).
> > + The type of the object (the same as `cat-file -t` reports). See
> > + `CAVEATS` below. Not supported by `remote-object-info`.
> >
> > `objectsize`::
> > The size, in bytes, of the object (the same as `cat-file -s`
> > @@ -298,13 +305,14 @@ newline. The available atoms are:
> >
> > `objectsize:disk`::
> > The size, in bytes, that the object takes up on disk. See the
> > - note about on-disk sizes in the `CAVEATS` section below.
> > + note about on-disk sizes in the `CAVEATS` section below. Not
> > + supported by `remote-object-info`.
> >
> > `deltabase`::
> > If the object is stored as a delta on-disk, this expands to the
> > full hex representation of the delta base object name.
> > Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> > - below.
> > + below. Not supported by `remote-object-info`.
> >
> > `rest`::
> > If this atom is used in the output string, input lines are split
> > @@ -314,7 +322,9 @@ newline. The available atoms are:
> > line) are output in place of the `%(rest)` atom.
> >
> > If no format is specified, the default format is `%(objectname)
> > -%(objecttype) %(objectsize)`.
> > +%(objecttype) %(objectsize)`, except for the `remote-object-info` command, which uses
> > +`%(objectname) %(objectsize)` for now because `%(objecttype)` is not supported yet.
> > +When `%(objecttype)` is supported, the default format should be unified.
> >
> > If `--batch` is specified, or if `--batch-command` is used with the `contents`
> > command, the object information is followed by the object contents (consisting
> > @@ -396,6 +406,10 @@ scripting purposes.
> > CAVEATS
> > -------
> >
> > +Note that since `objecttype`, `objectsize:disk` and `deltabase` are currently not supported by
> > +`remote-object-info`, git will error out and exit when they are in the format string.
> > +
> > +
> > Note that the sizes of objects on disk are reported accurately, but care
> > should be taken in drawing conclusions about which refs or objects are
> > responsible for disk usage. The size of a packed non-delta object may be
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 72a78cdc8c..34958a1747 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -24,6 +24,9 @@
> > #include "promisor-remote.h"
> > #include "mailmap.h"
> > #include "write-or-die.h"
> > +#include "alias.h"
> > +#include "remote.h"
> > +#include "transport.h"
> >
> > enum batch_mode {
> > BATCH_MODE_CONTENTS,
> > @@ -42,9 +45,14 @@ struct batch_options {
> > char input_delim;
> > char output_delim;
> > const char *format;
> > + int use_remote_info;
> > };
> >
> > +#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> > +
> > static const char *force_path;
> > +static struct object_info *remote_object_info;
> > +static struct oid_array object_info_oids = OID_ARRAY_INIT;
> >
> > static struct string_list mailmap = STRING_LIST_INIT_NODUP;
> > static int use_mailmap;
> > @@ -508,7 +516,6 @@ static void batch_object_write(const char *obj_name,
> > }
> >
> > batch_write(opt, scratch->buf, scratch->len);
> > -
>
> Nit: why remove this?
>
Adding it back. It was probably dropped while resolving conflicts.
> > if (opt->batch_mode == BATCH_MODE_CONTENTS) {
> > print_object_or_die(opt, data);
> > batch_write(opt, &opt->output_delim, 1);
> > @@ -526,51 +533,118 @@ static void batch_one_object(const char *obj_name,
> > (opt->follow_symlinks ? GET_OID_FOLLOW_SYMLINKS : 0);
> > enum get_oid_result result;
> >
> > - result = get_oid_with_context(the_repository, obj_name,
> > - flags, &data->oid, &ctx);
> > - if (result != FOUND) {
> > - switch (result) {
> > - case MISSING_OBJECT:
> > - printf("%s missing%c", obj_name, opt->output_delim);
> > - break;
> > - case SHORT_NAME_AMBIGUOUS:
> > - printf("%s ambiguous%c", obj_name, opt->output_delim);
> > - break;
> > - case DANGLING_SYMLINK:
> > - printf("dangling %"PRIuMAX"%c%s%c",
> > - (uintmax_t)strlen(obj_name),
> > - opt->output_delim, obj_name, opt->output_delim);
> > - break;
> > - case SYMLINK_LOOP:
> > - printf("loop %"PRIuMAX"%c%s%c",
> > - (uintmax_t)strlen(obj_name),
> > - opt->output_delim, obj_name, opt->output_delim);
> > - break;
> > - case NOT_DIR:
> > - printf("notdir %"PRIuMAX"%c%s%c",
> > - (uintmax_t)strlen(obj_name),
> > - opt->output_delim, obj_name, opt->output_delim);
> > - break;
> > - default:
> > - BUG("unknown get_sha1_with_context result %d\n",
> > - result);
> > - break;
> > + if (!opt->use_remote_info) {
> > + result = get_oid_with_context(the_repository, obj_name,
> > + flags, &data->oid, &ctx);
> > + if (result != FOUND) {
> > + switch (result) {
> > + case MISSING_OBJECT:
> > + printf("%s missing%c", obj_name, opt->output_delim);
> > + break;
> > + case SHORT_NAME_AMBIGUOUS:
> > + printf("%s ambiguous%c", obj_name, opt->output_delim);
> > + break;
> > + case DANGLING_SYMLINK:
> > + printf("dangling %"PRIuMAX"%c%s%c",
> > + (uintmax_t)strlen(obj_name),
> > + opt->output_delim, obj_name, opt->output_delim);
> > + break;
> > + case SYMLINK_LOOP:
> > + printf("loop %"PRIuMAX"%c%s%c",
> > + (uintmax_t)strlen(obj_name),
> > + opt->output_delim, obj_name, opt->output_delim);
> > + break;
> > + case NOT_DIR:
> > + printf("notdir %"PRIuMAX"%c%s%c",
> > + (uintmax_t)strlen(obj_name),
> > + opt->output_delim, obj_name, opt->output_delim);
> > + break;
> > + default:
> > + BUG("unknown get_sha1_with_context result %d\n",
> > + result);
> > + break;
> > + }
> > + fflush(stdout);
> > + return;
> > }
> > - fflush(stdout);
> > - return;
> > - }
> >
> > - if (ctx.mode == 0) {
> > - printf("symlink %"PRIuMAX"%c%s%c",
> > - (uintmax_t)ctx.symlink_path.len,
> > - opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> > - fflush(stdout);
> > - return;
> > + if (ctx.mode == 0) {
> > + printf("symlink %"PRIuMAX"%c%s%c",
> > + (uintmax_t)ctx.symlink_path.len,
> > + opt->output_delim, ctx.symlink_path.buf, opt->output_delim);
> > + fflush(stdout);
> > + return;
> > + }
> > }
> >
> > batch_object_write(obj_name, scratch, opt, data, NULL, 0);
> > }
> >
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > +{
> > + int retval = 0;
> > + struct remote *remote = NULL;
>
> We need to call `remote_clear()` on this at the end.
>
Thank you. I am not sure about this.
It seems the remote is cached in a hashmap; see
https://git.kernel.org/pub/scm/git/git.git/tree/remote.c#n136.
When multiple commands are sent, the remote can be reused from the
hashmap cache.
The life cycle of this hashmap cache seems to be managed by
"the_repository"; see
https://git.kernel.org/pub/scm/git/git.git/tree/remote.c#n720 and
https://git.kernel.org/pub/scm/git/git.git/tree/repository.c#n364.
> > + struct object_id oid;
> > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
>
> This needs to be cleared.
>
Thank you. Fixed in V2.
> > + static struct transport *gtransport;
>
> Shouldn't we call `transport_disconnect(transport);`?
>
Thank you. transport_disconnect(transport) is added at the end of
get_remote_info() in V2.
> > + /*
> > + * Change the format to "%(objectname) %(objectsize)" when
> > + * remote-object-info command is used. Once we start supporting objecttype
> > + * the default format should change to DEFAULT_FORMAT
> > + */
> > + if (!opt->format) {
> > + opt->format = "%(objectname) %(objectsize)";
> > + }
> > +
> > + remote = remote_get(argv[0]);
> > + if (!remote)
> > + die(_("must supply valid remote when using remote-object-info"));
> > + oid_array_clear(&object_info_oids);
> > + for (size_t i = 1; i < argc; i++) {
> > + if (get_oid_hex(argv[i], &oid))
> > + die(_("Not a valid object name %s"), argv[i]);
> > + oid_array_append(&object_info_oids, &oid);
> > + }
> > +
> > + gtransport = transport_get(remote, NULL);
> > + if (gtransport->smart_options) {
> > + int include_size = 0;
> > +
> > + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> > + gtransport->smart_options->object_info = 1;
> > + gtransport->smart_options->object_info_oids = &object_info_oids;
> > + /*
> > + * 'size' is the only option currently supported.
> > + * Other options that are passed in the format will exit with error.
> > + */
> > + if (strstr(opt->format, "%(objectsize)")) {
> > + string_list_append(&object_info_options, "size");
> > + include_size = 1;
> > + }
> > + if (strstr(opt->format, "%(objecttype)")) {
> > + die(_("objecttype is currently not supported with remote-object-info"));
> > + }
> > + if (strstr(opt->format, "%(objectsize:disk)"))
> > + die(_("objectsize:disk is currently not supported with remote-object-info"));
> > + if (strstr(opt->format, "%(deltabase)"))
> > + die(_("deltabase is currently not supported with remote-object-info"));
> >
>
> This whole block could be replaced by an else..
>
> if (strstr(opt->format, "%(objectsize)")) {
> string_list_append(&object_info_options, "size");
> include_size = 1;
> } else {
> die(_("%s is currently not supported with remote-object-info"), opt->format);
> }
>
Thank you. Revised in V2.
> > + if (object_info_options.nr > 0) {
> > + gtransport->smart_options->object_info_options = &object_info_options;
> > + for (size_t i = 0; i < object_info_oids.nr; i++) {
> > + if (include_size)
> > + remote_object_info[i].sizep = xcalloc(1, sizeof(long));
> > + }
> > + gtransport->smart_options->object_info_data = &remote_object_info;
> > + retval = transport_fetch_refs(gtransport, NULL);
> > + }
> > + } else {
> > + retval = -1;
> > + }
> > +
> > + return retval;
> > +}
> > +
> > struct object_cb_data {
> > struct batch_options *opt;
> > struct expand_data *expand;
> > @@ -642,6 +716,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
> > struct queued_cmd {
> > parse_cmd_fn_t fn;
> > char *line;
> > + const char *name;
> > };
> >
> > static void parse_cmd_contents(struct batch_options *opt,
> > @@ -662,6 +737,55 @@ static void parse_cmd_info(struct batch_options *opt,
> > batch_one_object(line, output, opt, data);
> > }
> >
> > +static const struct parse_cmd {
> > + const char *name;
> > + parse_cmd_fn_t fn;
> > + unsigned takes_args;
> > +} commands[] = {
> > + { "contents", parse_cmd_contents, 1 },
> > + { "info", parse_cmd_info, 1 },
> > + { "remote-object-info", parse_cmd_info, 1 },
> > + { "flush", NULL, 0 },
> > +};
> > +
> > +static void parse_remote_info(struct batch_options *opt,
> > + char *line,
> > + struct strbuf *output,
> > + struct expand_data *data,
> > + const struct parse_cmd *p_cmd,
> > + struct queued_cmd *q_cmd)
> > +{
> > + int count;
> > + const char **argv;
> > +
> > + count = split_cmdline(line, &argv);
> > + if (get_remote_info(opt, count, argv))
> > + goto cleanup;
> > + opt->use_remote_info = 1;
> > + data->skip_object_info = 1;
> > + data->mark_query = 0;
> > + for (size_t i = 0; i < object_info_oids.nr; i++) {
> > + if (remote_object_info[i].sizep)
> > + data->size = *remote_object_info[i].sizep;
> > + if (remote_object_info[i].typep)
> > + data->type = *remote_object_info[i].typep;
> > +
>
> We don't even set the type, so this shouldn't ever be possible right?
>
Thank you. Yes, that is right. Removed in V2.
> > + data->oid = object_info_oids.oid[i];
> > + if (p_cmd)
> > + p_cmd->fn(opt, argv[i+1], output, data);
> > + else
> > + q_cmd->fn(opt, argv[i+1], output, data);
> > + }
> > + opt->use_remote_info = 0;
> > + data->skip_object_info = 0;
> > + data->mark_query = 1;
> > +
> > +cleanup:
> > + for (size_t i = 0; i < object_info_oids.nr; i++)
> > + free_object_info_contents(&remote_object_info[i]);
> > + free(remote_object_info);
>
> argv needs to be free'd too
>
Thank you. Added in V2.
> > +}
> > +
> > static void dispatch_calls(struct batch_options *opt,
> > struct strbuf *output,
> > struct expand_data *data,
> > @@ -671,8 +795,12 @@ static void dispatch_calls(struct batch_options *opt,
> > if (!opt->buffer_output)
> > die(_("flush is only for --buffer mode"));
> >
> > - for (int i = 0; i < nr; i++)
> > - cmd[i].fn(opt, cmd[i].line, output, data);
> > + for (int i = 0; i < nr; i++) {
> > + if (!strcmp(cmd[i].name, "remote-object-info"))
> > + parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
> > + else
> > + cmd[i].fn(opt, cmd[i].line, output, data);
> > + }
> >
> > fflush(stdout);
> > }
> > @@ -685,17 +813,6 @@ static void free_cmds(struct queued_cmd *cmd, size_t *nr)
> > *nr = 0;
> > }
> >
> > -
> > -static const struct parse_cmd {
> > - const char *name;
> > - parse_cmd_fn_t fn;
> > - unsigned takes_args;
> > -} commands[] = {
> > - { "contents", parse_cmd_contents, 1},
> > - { "info", parse_cmd_info, 1},
> > - { "flush", NULL, 0},
> > -};
> > -
> > static void batch_objects_command(struct batch_options *opt,
> > struct strbuf *output,
> > struct expand_data *data)
> > @@ -740,11 +857,17 @@ static void batch_objects_command(struct batch_options *opt,
> > dispatch_calls(opt, output, data, queued_cmd, nr);
> > free_cmds(queued_cmd, &nr);
> > } else if (!opt->buffer_output) {
> > - cmd->fn(opt, p, output, data);
> > + if (!strcmp(cmd->name, "remote-object-info")) {
> > + char *line = xstrdup_or_null(p);
>
> This needs to be free'd.
>
Thank you. This line is removed in V2, but free() is added in the new
code in `parse_cmd_remote_info()`.
> > + parse_remote_info(opt, line, output, data, cmd, NULL);
>
>
>
> > + } else {
> > + cmd->fn(opt, p, output, data);
> > + }
> > } else {
> > ALLOC_GROW(queued_cmd, nr + 1, alloc);
> > call.fn = cmd->fn;
> > call.line = xstrdup_or_null(p);
> > + call.name = cmd->name;
> > queued_cmd[nr++] = call;
> > }
> > }
> > @@ -761,8 +884,6 @@ static void batch_objects_command(struct batch_options *opt,
> > strbuf_release(&input);
> > }
> >
> > -#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> > -
> > static int batch_objects(struct batch_options *opt)
> > {
> > struct strbuf input = STRBUF_INIT;
> > diff --git a/object-file.c b/object-file.c
> > index d3cf4b8b2e..6aaa167942 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -2988,3 +2988,14 @@ int read_loose_object(const char *path,
> > munmap(map, mapsize);
> > return ret;
> > }
> > +
> > +void free_object_info_contents(struct object_info *object_info)
> > +{
> > + if (!object_info)
> > + return;
> > + free(object_info->typep);
> > + free(object_info->sizep);
> > + free(object_info->disk_sizep);
> > + free(object_info->delta_base_oid);
> > + free(object_info->type_name);
> > +}
> > diff --git a/object-store-ll.h b/object-store-ll.h
> > index c5f2bb2fc2..333e19cd1e 100644
> > --- a/object-store-ll.h
> > +++ b/object-store-ll.h
> > @@ -533,4 +533,7 @@ int for_each_object_in_pack(struct packed_git *p,
> > int for_each_packed_object(each_packed_object_fn, void *,
> > enum for_each_object_flags flags);
> >
> > +/* Free pointers inside of object_info, but not object_info itself */
> > +void free_object_info_contents(struct object_info *object_info);
> > +
> > #endif /* OBJECT_STORE_LL_H */
> > diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> > new file mode 100755
> > index 0000000000..7a7bdfeb91
> > --- /dev/null
> > +++ b/t/t1017-cat-file-remote-object-info.sh
> > @@ -0,0 +1,412 @@
> > +#!/bin/sh
> > +
> > +test_description='git cat-file --batch-command with remote-object-info command'
> > +
> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> > +
> > +. ./test-lib.sh
> > +
> > +echo_without_newline () {
> > + printf '%s' "$*"
> > +}
> > +
> > +strlen () {
> > + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> > +}
> > +
> > +hello_content="Hello World"
> > +hello_size=$(strlen "$hello_content")
> > +hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> > +
> > +tree_size=$(($(test_oid rawsz) + 13))
> > +
> > +commit_message="Initial commit"
> > +commit_size=$(($(test_oid hexsz) + 137))
> >
>
> Why 13 and 137?
>
Thank you. That is tricky. Originally I took them from
t/t1006-cat-file.sh, but I did some research.
13 = <file mode> + <a space> + <file name> + <a null>, where
the file mode is 100644, which is 6 characters;
the file name is hello, which is 5 characters;
a space is 1 character and a null is 1 character.
For the commit, here is the raw content:
tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
author A U Thor <author@example.com> 1112354055 +0200
committer C O Mitter <committer@example.com> 1112354055 +0200
Initial commit
137 = <"tree" keyword> + <a space> + <a newline> +
<author line> + <a newline> +
<committer line> + <a newline> +
<a blank line> +
<commit message length>
= 4 + 1 + 1 + 53 + 1 + 61 + 1 + 1 + 14, which excludes the 40 hex
characters of the tree ID.
An easier way is to run `git cat-file commit <commit hash> | wc -c`,
which gives 177; subtracting the 40 hex characters leaves 137.
I have put this explanation in the comments to avoid confusion.
> > +
> > +tag_header_without_oid="type blob
> > +tag hellotag
> > +tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
> > +tag_header_without_timestamp="object $hello_oid
> > +$tag_header_without_oid"
> > +tag_description="This is a tag"
> > +tag_content="$tag_header_without_timestamp 0 +0000
> > +
> > +$tag_description"
> > +
> > +tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
> > +tag_size=$(strlen "$tag_content")
> > +
> > +# This section tests --batch-command with remote-object-info command
> > +# Since "%(objecttype)" is currently not supported by the command remote-object-info ,
> > +# the filters are set to "%(objectname) %(objectsize)".
> > +# Tests with the default filter are used to test the fallback to 'fetch' command
> > +
> > +
> > +# Test --batch-command remote-object-info with 'git://' transport
> > +
> > +. "$TEST_DIRECTORY"/lib-git-daemon.sh
> > +start_git_daemon --export-all --enable=receive-pack
> > +daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
> > +
> > +test_expect_success 'create repo to be served by git-daemon' '
> > + git init "$daemon_parent" &&
> > +
> > + echo_without_newline "$hello_content" > $daemon_parent/hello &&
> > + git -C "$daemon_parent" update-index --add hello &&
> > + git -C "$daemon_parent" config transfer.advertiseobjectinfo true
> > +'
> > +
> > +set_transport_variables () {
> > + hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> > + tree_sha1=$(git -C "$1" write-tree)
> > + commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
> > + tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
> > + tag_size=$(strlen "$tag_content")
> > +}
> > +
> > +
>
> extra newline here
>
Thank you. Fixed in V2.
> > +test_expect_success 'batch-command remote-object-info git://' '
> > + (
> > + set_transport_variables "$daemon_parent" &&
> > + cd "$daemon_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1
> > + remote-object-info "$GIT_DAEMON_URL/parent" $tree_sha1
> > + remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1
> > + remote-object-info "$GIT_DAEMON_URL/parent" $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
> > + (
> > + set_transport_variables "$daemon_parent" &&
> > + cd "$daemon_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command remote-object-info git:// default filter' '
> > + (
> > + set_transport_variables "$daemon_parent" &&
> > + cd "$daemon_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
> > + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
> > + remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command --buffer remote-object-info git://' '
> > + (
> > + set_transport_variables "$daemon_parent" &&
> > + cd "$daemon_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
> > + remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
> > + remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
> > + flush
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +stop_git_daemon
> > +
> > +# Test --batch-command remote-object-info with 'http://' transport
> > +
> > +. "$TEST_DIRECTORY"/lib-httpd.sh
> > +start_httpd
> > +
> > +test_expect_success 'create repo to be served by http:// transport' '
> > + git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
> > + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
> > + echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
> > + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello
> > +'
> > +
> > +
> > +test_expect_success 'batch-command remote-object-info http://' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $tree_sha1
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command remote-object-info http:// one line' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command --buffer remote-object-info http://' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > +
> > + git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
> > + flush
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command remote-object-info http:// default filter' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > +
> > + git cat-file --batch-command >actual <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> > + EOF
> > + test_grep "objectsize:disk is currently not supported with remote-object-info" err
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> > + EOF
> > + test_grep "deltabase is currently not supported with remote-object-info" err
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on server with legacy protocol' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> > + EOF
> > + test_grep "remote-object-info requires protocol v2" err
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > + test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> > + EOF
> > + test_grep "remote-object-info requires protocol v2" err
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on malformed OID' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + malformed_object_id="this_id_is_not_valid" &&
> > +
> > + test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
> > + EOF
> > + test_grep "Not a valid object name '$malformed_object_id'" err
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on malformed OID fallback' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + malformed_object_id="this_id_is_not_valid" &&
> > +
> > + test_must_fail git cat-file --batch-command 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
> > + EOF
> > + test_grep "Not a valid object name '$malformed_object_id'" err
> > + )
> > +'
> > +
> > +test_expect_success 'remote-object-info fails on missing OID' '
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
> > + test_commit -C missing_oid_repo message1 c.txt &&
> > + (
> > + cd missing_oid_repo &&
> > +
> > + object_id=$(git rev-parse message1:c.txt) &&
> > + test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
> > + EOF
> > + test_grep "object-info: not our ref $object_id" err
> > + )
> > +'
> > +
> > +# shellcheck disable=SC2016
> > +test_expect_success 'remote-object-info fails on missing OID fallback' '
> > + (
> > + set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > + cd missing_oid_repo &&
> > + object_id=$(git rev-parse message1:c.txt) &&
> > + test_must_fail git cat-file --batch-command 2>err <<-EOF &&
> > + remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
> > + EOF
> > + test_grep "fatal: object-info: not our ref $object_id" err
> > + )
> > +'
> > +
> > +# Test --batch-command remote-object-info with 'file://' transport
> > +
> > +# shellcheck disable=SC2016
> > +test_expect_success 'create repo to be served by file:// transport' '
> > + git init server &&
> > + git -C server config protocol.version 2 &&
> > + git -C server config transfer.advertiseobjectinfo true &&
> > + echo_without_newline "$hello_content" > server/hello &&
> > + git -C server update-index --add hello
> > +'
> > +
> > +
> > +test_expect_success 'batch-command remote-object-info file://' '
> > + (
> > + set_transport_variables "server" &&
> > + cd server &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > + remote-object-info "file://$(pwd)" $hello_sha1
> > + remote-object-info "file://$(pwd)" $tree_sha1
> > + remote-object-info "file://$(pwd)" $commit_sha1
> > + remote-object-info "file://$(pwd)" $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
> > + (
> > + set_transport_variables "server" &&
> > + cd server &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > + remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command --buffer remote-object-info file://' '
> > + (
> > + set_transport_variables "server" &&
> > + cd server &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > + git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
> > + remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
> > + remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
> > + flush
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_expect_success 'batch-command remote-object-info file:// default filter' '
> > + (
> > + set_transport_variables "server" &&
> > + cd server &&
> > +
> > + echo "$hello_sha1 $hello_size" >expect &&
> > + echo "$tree_sha1 $tree_size" >>expect &&
> > + echo "$commit_sha1 $commit_size" >>expect &&
> > + echo "$tag_sha1 $tag_size" >>expect &&
> > +
> > + git cat-file --batch-command >actual <<-EOF &&
> > + remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
> > + remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
> > + EOF
> > + test_cmp expect actual
> > + )
> > +'
> > +
> > +test_done
>
> Some more tests I'd like to see
> - Testing against the '-Z' option.
> - Testing the fallback to fetch whole object when the server doesn't
> support 'remote-object-info'.
>
Thank you. More tests have been added in V2 to cover those scenarios.
> Thanks
* [PATCH v2 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (5 preceding siblings ...)
2024-06-28 19:05 ` [PATCH 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-07-20 3:43 ` [PATCH v2 1/6] fetch-pack: refactor packet writing Eric Ju
` (5 more replies)
2024-08-22 21:24 ` [PATCH 0/6] " Peijian Ju
` (9 subsequent siblings)
16 siblings, 6 replies; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This is a continuation of Calvin Wan's (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
Sometimes it is useful to get information about an object without having to download
it completely. The server logic for retrieving size has already been implemented and merged in
"a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
This patch series implements the client option for it.
This patch series adds the `remote-object-info` command to `cat-file --batch-command`.
This command allows the client to make an object-info command request to a server
that supports protocol v2. If the server is v2, but does not have
object-info capability, the entire object is fetched and the
relevant object info is returned.
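As a rough illustration of the behavior described above, the choice between the three outcomes can be sketched as a small decision function. This is only a sketch under assumptions: the enum and the function name below are hypothetical and do not appear in the series, where the actual checks are spread over transport.c and transport-helper.c.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of the patch series. */
enum remote_info_plan {
	PLAN_DIE_NEED_V2,     /* protocol v0/v1: remote-object-info is refused */
	PLAN_FETCH_FALLBACK,  /* v2 without object-info: fetch, then inspect locally */
	PLAN_OBJECT_INFO      /* v2 with object-info: ask the server for the size */
};

static enum remote_info_plan plan_remote_object_info(int protocol_version,
						      bool server_has_object_info)
{
	if (protocol_version < 2)
		return PLAN_DIE_NEED_V2;
	if (!server_has_object_info)
		return PLAN_FETCH_FALLBACK;
	return PLAN_OBJECT_INFO;
}
```

Only the last outcome avoids transferring the object contents; the fallback still downloads the whole object before its size is reported.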
A few questions are open for discussion:
1. In the current implementation, if a user issues `remote-object-info` over protocol v1,
`cat-file --batch-command` will die. Which behavior do we prefer: "error and exit (i.e. die)"
or "warn and wait for a new command"?
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when remote-object-info is used.
Any suggestions on this approach?
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
V1 of the patch series can be found here:
https://lore.kernel.org/git/20240628190503.67389-1-eric.peijian@gmail.com/
Changes since V1
================
- The function parse_remote_info() has been renamed to
parse_cmd_remote_object_info() and its signature has been modified to comply
with parse_cmd_fn_t. This function now serves as the mapped function for the
remote-object-info command.
This change simplifies the code by avoiding command name comparisons and
reusing logic that fits parse_cmd_fn_t.
- Added more tests to cover fallbacks. When the server does not support the
object-info capability, remote-object-info will fetch the objects locally and
print out the information.
- Fixed a logic issue that could lead to a potential heap-buffer-overflow error.
The alloc_ref function is now used to initialize a ref struct instead of xcalloc.
- Refactored some logic for improved readability, such as how to initialize the
transport->remote_refs linked list.
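For readers wondering why the allocator matters, here is a minimal sketch. It assumes that the overflow came from the ref struct ending in a flexible `name` array, so that a bare xcalloc(1, sizeof(struct ref)) leaves no room for even an empty name. The `demo_*` names and the simplified struct layout are hypothetical, and the list is built with a tail pointer rather than the last-element check used in the series.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for git's struct ref; the field layout is an assumption. */
struct demo_ref {
	struct demo_ref *next;
	unsigned exact_oid : 1;
	char name[];  /* flexible array member: needs extra room when allocating */
};

/* Like alloc_ref(): reserve space for the struct plus the NUL-terminated name. */
static struct demo_ref *demo_alloc_ref(const char *name)
{
	size_t len = strlen(name);
	struct demo_ref *ref = calloc(1, sizeof(*ref) + len + 1);

	if (!ref)
		abort();
	memcpy(ref->name, name, len + 1);
	return ref;
}

/* Build a list of nr nodes without a dummy head; returns NULL when nr is 0. */
static struct demo_ref *demo_build_list(size_t nr)
{
	struct demo_ref *head = NULL, **tail = &head;

	for (size_t i = 0; i < nr; i++) {
		*tail = demo_alloc_ref("");
		(*tail)->exact_oid = 1;
		tail = &(*tail)->next;
	}
	return head;
}

/* Count the nodes reachable from ref. */
static size_t demo_list_length(const struct demo_ref *ref)
{
	size_t n = 0;

	for (; ref; ref = ref->next)
		n++;
	return n;
}

/* Release every node, mirroring what free_refs() does for a ref list. */
static void demo_free_list(struct demo_ref *ref)
{
	while (ref) {
		struct demo_ref *next = ref->next;

		free(ref);
		ref = next;
	}
}
```

With the tail pointer there is no placeholder head node to skip, so the list can be handed over and freed as-is.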
Thank you.
Eric Ju
Calvin Wan (5):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
cat-file: add remote-object-info to batch-command
Eric Ju (1):
cat-file: add declaration of variable i inside its for loop
Documentation/git-cat-file.txt | 23 +-
builtin/cat-file.c | 127 ++++-
fetch-pack.c | 48 +-
fetch-pack.h | 10 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/t1017-cat-file-remote-object-info.sh | 748 +++++++++++++++++++++++++
transport-helper.c | 8 +-
transport.c | 118 +++-
transport.h | 11 +
11 files changed, 1075 insertions(+), 36 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v1:
1: fdd44b16b5 ! 1: 97871cb75e fetch-pack: refactor packet writing
@@ Metadata
## Commit message ##
fetch-pack: refactor packet writing
- A subsequent patch need to write capabilities for another command.
- Refactor write_fetch_command_and_capabilities() to be used by both
- fetch and future command.
+ A subsequent patch needs to write capabilities for another command.
+ Refactor write_fetch_command_and_capabilities() to be a more general
+ purpose function write_command_and_capabilities(), so that it can be
+ used by both fetch and future commands.
+
+ Here "command" means an operation supported by Git’s wire protocol
+ https://git-scm.com/docs/protocol-v2. An example would be a
+ Git subcommand, such as git-fetch(1), or an operation supported by
+ the server side, such as "object-info" implemented in "a2ba162cda
+ (object-info: support for retrieving object info, 2021-04-20)".
- Signed-off-by: Calvin Wan <calvinwan@google.com>
- Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
+ Signed-off-by: Calvin Wan <calvinwan@google.com>
+ Signed-off-by: Eric Ju <eric.peijian@gmail.com>
## fetch-pack.c ##
@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
2: 890219ce6a ! 2: 301047c574 fetch-pack: move fetch initialization
@@ Commit message
from the beginning of the first state to just before the execution of
the state machine.
- Signed-off-by: Calvin Wan <calvinwan@google.com>
- Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
+ Signed-off-by: Calvin Wan <calvinwan@google.com>
+ Signed-off-by: Eric Ju <eric.peijian@gmail.com>
## fetch-pack.c ##
@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
3: 6844095b26 ! 3: 5d83c4f5b2 serve: advertise object-info feature
@@ Commit message
client to decide whether to query the server for object-info or fetch
as a fallback.
- Signed-off-by: Calvin Wan <calvinwan@google.com>
- Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
+ Signed-off-by: Calvin Wan <calvinwan@google.com>
+ Signed-off-by: Eric Ju <eric.peijian@gmail.com>
## serve.c ##
@@ serve.c: static void session_id_receive(struct repository *r UNUSED,
4: c940cb1657 ! 4: a7210b7169 transport: add client support for object-info
@@ Commit message
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
- as “a2ba162cda (object-info: support for retrieving object info,
+ in “a2ba162cda (object-info: support for retrieving object info,
2021-04-20)”.
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
- features 'size' and 'type' from a v2 server. If a server does not
- advertise either of the requested features, then the client falls back
+ feature 'size' from a v2 server. If a server does not
+ advertise the feature, then the client falls back
to making the request through 'fetch'.
- Signed-off-by: Calvin Wan <calvinwan@google.com>
- Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
+ Signed-off-by: Calvin Wan <calvinwan@google.com>
+ Signed-off-by: Eric Ju <eric.peijian@gmail.com>
## fetch-pack.c ##
@@ fetch-pack.c: static void write_command_and_capabilities(struct strbuf *req_buf,
@@ fetch-pack.h: struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
-+ struct object_info **object_info_data;
++ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
return refs;
}
-+static int fetch_object_info(struct transport *transport, struct object_info **object_info_data)
++static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
+{
+ int size_index = -1;
+ struct git_transport_data *data = transport->data;
-+ struct object_info_args args;
++ struct object_info_args args = { 0 };
+ struct packet_reader reader;
+
-+ memset(&args, 0, sizeof(args));
+ args.server_options = transport->server_options;
+ args.object_info_options = transport->smart_options->object_info_options;
+ args.oids = transport->smart_options->object_info_oids;
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
-+ if (!strcmp(reader.line, "size"))
++ if (!strcmp(reader.line, "size")) {
+ size_index = i;
++ for (size_t j = 0; j < args.oids->nr; j++) {
++ object_info_data[j].sizep = xcalloc(1, sizeof(long));
++ }
++ }
+ continue;
+ }
+ return -1;
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
-+ *(*object_info_data)[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
++
++ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
++
+ }
++
++ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
-+ struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
++ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
--
-- if (!data->finished_handshake) {
-- int i;
+ args.object_info = transport->smart_options->object_info;
+
-+ if (transport->smart_options && transport->smart_options->object_info) {
-+ struct ref *ref = object_info_refs;
++ if (transport->smart_options
++ && transport->smart_options->object_info
++ && transport->smart_options->object_info_oids->nr > 0) {
++ struct ref *ref_itr = object_info_refs = alloc_ref("");
+
+ if (!fetch_object_info(transport, data->options.object_info_data))
+ goto cleanup;
++
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
-+ struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
-+ temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
-+ temp_ref->exact_oid = 1;
-+ ref->next = temp_ref;
-+ ref = ref->next;
++ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
++ ref_itr->exact_oid = 1;
++ if (i == transport->smart_options->object_info_oids->nr - 1)
++ /* last element, no need to allocat to next */
++ ref_itr -> next = NULL;
++ else
++ ref_itr->next = alloc_ref("");
+
+- if (!data->finished_handshake) {
+- int i;
++ ref_itr = ref_itr->next;
+ }
-+ transport->remote_refs = object_info_refs->next;
++
++ transport->remote_refs = object_info_refs;
++
+ } else if (!data->finished_handshake) {
int must_list_refs = 0;
- for (i = 0; i < nr_heads; i++) {
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
data->finished_handshake = 0;
+ if (args.object_info) {
-+ struct ref *ref_cpy_reader = object_info_refs->next;
++ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
-+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &(*args.object_info_data)[i], OBJECT_INFO_LOOKUP_REPLACE);
++ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
++
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
ret = -1;
if (report_unmatched_refs(to_fetch, nr_heads))
ret = -1;
-@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+
+ cleanup:
++ free_refs(object_info_refs);
+ close(data->fd[0]);
+ if (data->fd[1] >= 0)
+ close(data->fd[1]);
+ if (finish_connect(data->conn))
+ ret = -1;
+ data->conn = NULL;
+-
free_refs(refs_tmp);
free_refs(refs);
list_objects_filter_release(&args.filter_options);
-+ free_refs(object_info_refs);
- return ret;
- }
-
## transport.h ##
@@
@@ transport.h: struct git_transport_options {
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
-+ struct object_info **object_info_data;
++ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
5: 6526e24aa4 ! 5: 2787327782 cat-file: add declaration of variable i inside its for loop
@@ Commit message
Change the declaration of i to be inside the for loop for readability.
- Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
+ Signed-off-by: Eric Ju <eric.peijian@gmail.com>
## builtin/cat-file.c ##
@@ builtin/cat-file.c: static void dispatch_calls(struct batch_options *opt,
@@ builtin/cat-file.c: static void dispatch_calls(struct batch_options *opt,
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
-+ for (int i = 0; i < nr; i++)
++ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ builtin/cat-file.c: static void batch_objects_command(struct batch_options *opt,
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
-+ for (int i = 0; i < ARRAY_SIZE(commands); i++) {
++ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
6: 5cd1a1dbd2 < -: ---------- cat-file: add remote-object-info to batch-command
-: ---------- > 6: cb114765cf cat-file: add remote-object-info to batch-command
--
2.45.2
* [PATCH v2 1/6] fetch-pack: refactor packet writing
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-09-24 11:45 ` Christian Couder
2024-07-20 3:43 ` [PATCH v2 2/6] fetch-pack: move fetch initialization Eric Ju
` (4 subsequent siblings)
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
A subsequent patch needs to write capabilities for another command.
Refactor write_fetch_command_and_capabilities() to be a more general
purpose function write_command_and_capabilities(), so that it can be
used by both fetch and future commands.
Here "command" means an operation supported by Git’s wire protocol
https://git-scm.com/docs/protocol-v2. An example would be a
Git subcommand, such as git-fetch(1), or an operation supported by
the server side, such as "object-info" implemented in "a2ba162cda
(object-info: support for retrieving object info, 2021-04-20)".
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
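As a rough sketch of what parameterizing the command name means on the wire: each request line is a pkt-line, four hex digits giving the total length (header included) followed by the payload. The helpers below are hypothetical stand-ins, not git's packet_buf_write(), and they leave out the capability lines and the optional trailing newline.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write one pkt-line into buf: four hex digits with the total length
 * (payload plus the 4-byte header), then the payload itself.
 * Returns the number of bytes written, or 0 if it does not fit.
 */
static size_t demo_pkt_line(char *buf, size_t bufsize, const char *payload)
{
	size_t len = strlen(payload) + 4;
	int n;

	if (len > 65520 || len + 1 > bufsize)
		return 0;
	n = snprintf(buf, bufsize, "%04zx%s", len, payload);
	return n < 0 ? 0 : (size_t)n;
}

/* Write "command=<name>" for any protocol v2 command, not just fetch. */
static size_t demo_write_command(char *buf, size_t bufsize, const char *command)
{
	char payload[64];
	int n = snprintf(payload, sizeof(payload), "command=%s", command);

	if (n < 0 || (size_t)n >= sizeof(payload))
		return 0;
	return demo_pkt_line(buf, bufsize, payload);
}
```

Calling demo_write_command() with "fetch" and with "object-info" differs only in the payload and its length prefix, which is why one generalized writer can serve both callers.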
diff --git a/fetch-pack.c b/fetch-pack.c
index 732511604b..9c8cda0f9e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1312,13 +1312,13 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
+static void write_command_and_capabilities(struct strbuf *req_buf,
+ const struct string_list *server_options, const char* command)
{
const char *hash_name;
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
if (server_supports_v2("agent"))
packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
if (advertise_sid && server_supports_v2("session-id"))
@@ -1354,7 +1354,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, args->server_options, "fetch");
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2172,7 +2172,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, server_options, "fetch");
packet_buf_write(&req_buf, "wait-for-done");
--
2.45.2
* [PATCH v2 2/6] fetch-pack: move fetch initialization
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
2024-07-20 3:43 ` [PATCH v2 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-07-20 3:43 ` [PATCH v2 3/6] serve: advertise object-info feature Eric Ju
` (3 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
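To illustrate the reasoning in the commit message, here is a toy state machine, not the real do_fetch_pack_v2(); all `demo_*` / `DEMO_*` names are hypothetical. Setup that every entry state relies on is done before the loop, so starting in a later state still sees it.

```c
enum demo_state { DEMO_CHECK_LOCAL, DEMO_SEND_REQUEST, DEMO_DONE };

struct demo_ctx {
	int use_sideband;   /* must be set whatever the entry state is */
	int checked_local;  /* only set by the DEMO_CHECK_LOCAL state */
};

static void demo_run(struct demo_ctx *ctx, enum demo_state state)
{
	/* Shared setup hoisted out of the first state. */
	ctx->use_sideband = 2;

	while (state != DEMO_DONE) {
		switch (state) {
		case DEMO_CHECK_LOCAL:
			ctx->checked_local = 1;
			state = DEMO_SEND_REQUEST;
			break;
		case DEMO_SEND_REQUEST:
			state = DEMO_DONE;
			break;
		case DEMO_DONE:
			break;
		}
	}
}
```

Entering at DEMO_SEND_REQUEST skips the local check but still gets the shared setup. Had the assignment stayed inside DEMO_CHECK_LOCAL, it would have been skipped as well.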
diff --git a/fetch-pack.c b/fetch-pack.c
index 9c8cda0f9e..a605b9a499 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1675,18 +1675,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.45.2
* [PATCH v2 3/6] serve: advertise object-info feature
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
2024-07-20 3:43 ` [PATCH v2 1/6] fetch-pack: refactor packet writing Eric Ju
2024-07-20 3:43 ` [PATCH v2 2/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-07-20 3:43 ` [PATCH v2 4/6] transport: add client support for object-info Eric Ju
` (2 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
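On the client side, deciding between object-info and the fetch fallback comes down to looking for a feature name in the advertised value. The helper below is a minimal sketch with a hypothetical name; it assumes the value is a space-separated list of features, although this patch only ever advertises "size".

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `feature` is one of the space-separated words in `value`. */
static bool demo_has_feature(const char *value, const char *feature)
{
	size_t flen = strlen(feature);

	if (!flen)
		return false;
	while (*value) {
		size_t wlen = strcspn(value, " ");

		/* Compare whole words only, so "sizes" does not match "size". */
		if (wlen == flen && !strncmp(value, feature, flen))
			return true;
		value += wlen;
		value += strspn(value, " ");
	}
	return false;
}
```

A client following this sketch would request sizes via object-info only when the check succeeds, and otherwise fall back to fetching.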
diff --git a/serve.c b/serve.c
index 884cd84ca8..3aae03405b 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.45.2
* [PATCH v2 4/6] transport: add client support for object-info
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
` (2 preceding siblings ...)
2024-07-20 3:43 ` [PATCH v2 3/6] serve: advertise object-info feature Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-09-24 11:45 ` Christian Couder
2024-07-20 3:43 ` [PATCH v2 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-07-20 3:43 ` [PATCH v2 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
in “a2ba162cda (object-info: support for retrieving object info,
2021-04-20)”.
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
feature 'size' from a v2 server. If a server does not
advertise the feature, then the client falls back
to making the request through 'fetch'.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 24 +++++++++
fetch-pack.h | 10 ++++
transport-helper.c | 8 ++-
transport.c | 118 +++++++++++++++++++++++++++++++++++++++++++--
transport.h | 11 +++++
5 files changed, 164 insertions(+), 7 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index a605b9a499..419450c8dd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1344,6 +1344,27 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
+void send_object_info_request(int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, args->server_options, "object-info");
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1681,6 +1702,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..5a5211e355 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
@@ -68,6 +70,12 @@ struct fetch_pack_args {
unsigned connectivity_checked:1;
};
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
/*
* sought represents remote references that should be updated from.
* On return, the names that were found on the remote will have been
@@ -106,4 +114,6 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
*/
int fetch_pack_fsck_objects(void);
+void send_object_info_request(int fd_out, struct object_info_args *args);
+
#endif
diff --git a/transport-helper.c b/transport-helper.c
index 09b3560ffd..841a32e80a 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -699,13 +699,17 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
+ * require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
return -1;
}
+ if (transport->smart_options->object_info) {
+ // fail the command explicitly to avoid further commands input
+ die(_("remote-object-info requires protocol v2"));
+ }
if (!data->get_refs_list_called)
get_refs_list_using_list(transport, 0);
diff --git a/transport.c b/transport.c
index 12cc5b4d96..8f990fcba6 100644
--- a/transport.c
+++ b/transport.c
@@ -366,6 +366,80 @@ static struct ref *handshake(struct transport *transport, int for_push,
return refs;
}
+static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
+{
+ int size_index = -1;
+ struct git_transport_data *data = transport->data;
+ struct object_info_args args = { 0 };
+ struct packet_reader reader;
+
+ args.server_options = transport->server_options;
+ args.object_info_options = transport->smart_options->object_info_options;
+ args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+ data->version = discover_version(&reader);
+
+ transport->hash_algo = reader.hash_algo;
+
+ switch (data->version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args.object_info_options, "size")
+ && !server_supports_feature("object-info", "size", 0)) {
+ return -1;
+ }
+ send_object_info_request(data->fd[1], &args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args.object_info_options->nr; i++) {
+ if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
+ if (!strcmp(reader.line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args.oids->nr; j++) {
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++) {
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader.line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+
+ return 0;
+}
+
static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
struct transport_ls_refs_options *options)
{
@@ -413,6 +487,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -439,11 +514,36 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct ref *ref_itr = object_info_refs = alloc_ref("");
+
+ if (!fetch_object_info(transport, data->options.object_info_data))
+ goto cleanup;
+
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
+ /* last element, no need to allocate the next one */
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
- if (!data->finished_handshake) {
- int i;
+ ref_itr = ref_itr->next;
+ }
+
+ transport->remote_refs = object_info_refs;
+
+ } else if (!data->finished_handshake) {
int must_list_refs = 0;
- for (i = 0; i < nr_heads; i++) {
+ for (int i = 0; i < nr_heads; i++) {
if (!to_fetch[i]->exact_oid) {
must_list_refs = 1;
break;
@@ -481,23 +581,31 @@ static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
+
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
- if (!refs)
+ if (!refs && !args.object_info)
ret = -1;
if (report_unmatched_refs(to_fetch, nr_heads))
ret = -1;
cleanup:
+ free_refs(object_info_refs);
close(data->fd[0]);
if (data->fd[1] >= 0)
close(data->fd[1]);
if (finish_connect(data->conn))
ret = -1;
data->conn = NULL;
-
free_refs(refs_tmp);
free_refs(refs);
list_objects_filter_release(&args.filter_options);
diff --git a/transport.h b/transport.h
index 6393cd9823..50ea2b05cf 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to pull only object-info. Falls back
+ * to pulling the entire object if object-info is not supported.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.45.2
* [PATCH v2 5/6] cat-file: add declaration of variable i inside its for loop
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
` (3 preceding siblings ...)
2024-07-20 3:43 ` [PATCH v2 4/6] transport: add client support for object-info Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-07-20 3:43 ` [PATCH v2 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code declares the variable i at function or block scope but uses
it only as a for loop counter, not in any logic outside the loop.
Move the declaration of i into the for loop for readability.
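
As a minimal illustration of the style this patch moves towards (not code from the series; `sum_values` is a hypothetical helper), the loop counter can be declared in the for statement itself, with a type that matches the loop bound:

```c
#include <stddef.h>

/*
 * Hypothetical helper: sum the first nr entries of values.
 * The index i is declared in the for statement (C99 and later),
 * so it is not visible outside the loop. Its type matches the
 * type of the bound nr, which avoids a signed/unsigned comparison.
 */
static long sum_values(const int *values, size_t nr)
{
	long total = 0;

	for (size_t i = 0; i < nr; i++)
		total += values[i];
	return total;
}
```

Calling `sum_values()` with an array holding 1, 2 and 3 and `nr` set to 3 returns 6.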
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 18fe58d6b8..a5724667b1 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -673,12 +673,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -686,9 +684,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -714,7 +710,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -724,7 +719,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
--
2.45.2
* [PATCH v2 6/6] cat-file: add remote-object-info to batch-command
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
` (4 preceding siblings ...)
2024-07-20 3:43 ` [PATCH v2 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-07-20 3:43 ` Eric Ju
2024-09-11 13:11 ` Toon Claes
2024-09-24 12:13 ` Christian Couder
5 siblings, 2 replies; 174+ messages in thread
From: Eric Ju @ 2024-07-20 3:43 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which creates overhead when each
one requires a request to a server, so `remote-object-info` can instead
take multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` first gets the object info from the
remote, then loops through the object ids passed in and prints the info
for each.
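
The "get everything first, then loop and print" flow summarized above can be sketched roughly as follows. This is not the code from this patch: `fetch_remote_info`, `remote_object_info`, `struct remote_info`, `MAX_OIDS` and the `round_trips` counter are hypothetical names, and the "size" is a placeholder (the length of the oid string) standing in for what a real server response would carry.

```c
#include <stdio.h>
#include <string.h>

#define MAX_OIDS 16 /* arbitrary limit for this sketch */

/* Counts simulated server requests, to show that oids are batched. */
static int round_trips;

struct remote_info {
	const char *oid;
	unsigned long size;
};

/*
 * Hypothetical stand-in for the single request to the remote.
 * A real client would send all oids at once and parse the response;
 * here the "size" is just the length of the oid string.
 */
static int fetch_remote_info(const char **oids, size_t nr,
			     struct remote_info *out)
{
	round_trips++;
	for (size_t i = 0; i < nr; i++) {
		out[i].oid = oids[i];
		out[i].size = strlen(oids[i]);
	}
	return 0;
}

/*
 * Get info for all oids in one request, then loop and print.
 * Returns the number of bytes written to buf, or -1 on error.
 */
static int remote_object_info(const char **oids, size_t nr,
			      char *buf, size_t buflen)
{
	struct remote_info info[MAX_OIDS];
	size_t used = 0;

	if (nr > MAX_OIDS || fetch_remote_info(oids, nr, info))
		return -1;
	for (size_t i = 0; i < nr; i++) {
		int n = snprintf(buf + used, buflen - used, "%s %lu\n",
				 info[i].oid, info[i].size);
		if (n < 0 || (size_t)n >= buflen - used)
			return -1; /* output did not fit */
		used += n;
	}
	return (int)used;
}
```

With two oids passed in, `round_trips` is incremented once rather than twice, which is the overhead the batching is meant to avoid.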
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
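
A rough sketch of how the argument part of such a line (everything after the command name) could be split into a remote and a list of oids is shown below. It is only an illustration: the patch itself uses split_cmdline(), which also handles quoting, while the hypothetical `split_remote_args` here splits on spaces only.

```c
#include <string.h>

/*
 * Hypothetical helper: split "<remote> <oid> <oid> ..." in place on
 * spaces. On success, *remote points at the first token, oids[] holds
 * the remaining tokens, and the number of oids is returned.
 * Returns -1 if the remote is missing, if no oid is given, or if there
 * are more than max_oids oids.
 */
static int split_remote_args(char *line, const char **remote,
			     const char **oids, int max_oids)
{
	int nr = 0;
	char *tok = strtok(line, " ");

	if (!tok)
		return -1;
	*remote = tok;
	while ((tok = strtok(NULL, " "))) {
		if (nr == max_oids)
			return -1;
		oids[nr++] = tok;
	}
	return nr ? nr : -1;
}
```

All oids from one input line end up in one array, so they can be sent to the remote in a single request instead of one request per oid.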
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 23 +-
builtin/cat-file.c | 116 +++-
object-file.c | 11 +
object-store-ll.h | 3 +
t/t1017-cat-file-remote-object-info.sh | 748 +++++++++++++++++++++++++
5 files changed, 893 insertions(+), 8 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index bd95a6c10a..98375dab46 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at the specified `<remote>`
+ without downloading the objects from the remote. If the object-info capability
+ is not supported by the server, the objects will be downloaded instead.
+ It is an error if no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,9 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for the `remote-object-info` command, which uses
+`%(objectname) %(objectsize)` for now because `%(objecttype)` is not supported yet.
+Once `%(objecttype)` is supported, the default format should be unified.
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +407,10 @@ scripting purposes.
CAVEATS
-------
+Note that since `objecttype`, `objectsize:disk` and `deltabase` are currently not
+supported by `remote-object-info`, git will error out and exit when they appear in
+the format string.
+
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index a5724667b1..ca6e05e769 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -528,7 +534,7 @@ static void batch_one_object(const char *obj_name,
enum get_oid_result result;
result = get_oid_with_context(the_repository, obj_name,
- flags, &data->oid, &ctx);
+ flags, &data->oid, &ctx);
if (result != FOUND) {
switch (result) {
case MISSING_OBJECT:
@@ -576,6 +582,61 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+ /*
+ * 'size' is the only option currently supported.
+ * Other options that are passed in the format will exit with error.
+ */
+ if (strstr(opt->format, "%(objectsize)")) {
+ string_list_append(&object_info_options, "size");
+ } else {
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+ }
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -667,6 +728,52 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line,
+ struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ data->size = *remote_object_info[i].sizep;
+ } else {
+ /*
+ * Reaching here means remote-object-info could not retrieve the
+ * information from the server without downloading the objects, and
+ * the objects have already been fetched to the client.
+ * Print the information using the logic for local objects.
+ */
+ data->skip_object_info = 0;
+ }
+
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -696,9 +803,10 @@ static const struct parse_cmd {
parse_cmd_fn_t fn;
unsigned takes_args;
} commands[] = {
- { "contents", parse_cmd_contents, 1},
- { "info", parse_cmd_info, 1},
- { "flush", NULL, 0},
+ { "contents", parse_cmd_contents, 1 },
+ { "info", parse_cmd_info, 1 },
+ { "remote-object-info", parse_cmd_remote_object_info, 1 },
+ { "flush", NULL, 0 },
};
static void batch_objects_command(struct batch_options *opt,
diff --git a/object-file.c b/object-file.c
index 065103be3e..34c702ece5 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2987,3 +2987,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index c5f2bb2fc2..333e19cd1e 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -533,4 +533,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..64eb55bd9e
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,748 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_sha1=$(git -C "$1" write-tree)
+ commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
+ tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_sha1 $hello_size" >expect &&
+ printf "%s\0" "$tree_sha1 $tree_size" >>expect &&
+ printf "%s\0" "$commit_sha1 $commit_size" >>expect &&
+ printf "%s\0" "$tag_sha1 $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_sha1 missing" >>expect &&
+ printf "%s\0" "$tree_sha1 missing" >>expect &&
+ printf "%s\0" "$commit_sha1 missing" >>expect &&
+ printf "%s\0" "$tag_sha1 missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_sha1 $tree_sha1
+remote-object-info $GIT_DAEMON_URL/parent $commit_sha1 $tag_sha1
+info $hello_sha1
+info $tree_sha1
+info $commit_sha1
+info $tag_sha1
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback git://: fetch objects to client' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_sha1 missing" >expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_sha1 $hello_size" >>expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_sha1 $hello_size" >>expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ remote-object-info $GIT_DAEMON_URL/parent $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_sha1
+ remote-object-info "file://${server_path}" $tree_sha1
+ remote-object-info "file://${server_path}" $commit_sha1
+ remote-object-info "file://${server_path}" $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1
+ remote-object-info "file://${server_path}" $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1
+ remote-object-info "file://${server_path}" $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_sha1 $hello_size" >expect &&
+ printf "%s\0" "$tree_sha1 $tree_size" >>expect &&
+ printf "%s\0" "$commit_sha1 $commit_size" >>expect &&
+ printf "%s\0" "$tag_sha1 $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_sha1 missing" >>expect &&
+ printf "%s\0" "$tree_sha1 missing" >>expect &&
+ printf "%s\0" "$commit_sha1 missing" >>expect &&
+ printf "%s\0" "$tag_sha1 missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_sha1 $tree_sha1
+remote-object-info \"file://${server_path}\" $commit_sha1 $tag_sha1
+info $hello_sha1
+info $tree_sha1
+info $commit_sha1
+info $tag_sha1
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z <commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback file://: fetch objects to client' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+ cd file_client_empty &&
+
+ # Prove object is not on the client
+ echo "$hello_sha1 missing" >expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_sha1 $hello_size" >>expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_sha1 $hello_size" >>expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" >"$HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_sha1 missing" >>expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_sha1 $hello_size" >expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_sha1 $hello_size" >expect &&
+ printf "%s\0" "$tree_sha1 $tree_size" >>expect &&
+ printf "%s\0" "$commit_sha1 $commit_size" >>expect &&
+ printf "%s\0" "$tag_sha1 $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_sha1 $tree_sha1
+remote-object-info $HTTPD_URL/smart/http_parent $commit_sha1 $tag_sha1
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z <commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback http://: fetch objects to client' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_sha1 missing" >expect &&
+ echo "$tree_sha1 missing" >>expect &&
+ echo "$commit_sha1 missing" >>expect &&
+ echo "$tag_sha1 missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_sha1 $hello_size" >>expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_sha1 $hello_size" >>expect &&
+ echo "$tree_sha1 $tree_size" >>expect &&
+ echo "$commit_sha1 $commit_size" >>expect &&
+ echo "$tag_sha1 $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+ info $hello_sha1
+ info $tree_sha1
+ info $commit_sha1
+ info $tag_sha1
+ EOF
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.45.2
* Re: [PATCH 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (6 preceding siblings ...)
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
@ 2024-08-22 21:24 ` Peijian Ju
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
` (8 subsequent siblings)
16 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-08-22 21:24 UTC (permalink / raw)
To: git
Cc: Christian Couder, Calvin Wan, Jonathan Tan, John Cai,
Karthik Nayak, Justin Tobler, Toon Claes
Dear Reviewers,
Thank you for your thorough review of v1. I have addressed all the
issues identified in that version and have now prepared v2.
Could you please take another look and provide your acknowledgment?
Thank you very much for your time and effort.
Best regards,
Peijian
On Fri, Jun 28, 2024 at 3:05 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> This is a continuation of Calvin Wan's (calvinwan@google.com)
> patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
>
> Sometimes it is useful to get information about an object without having to download
> it completely. The server logic for retrieving size has already been implemented and merged in
> "a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
> This patch series implement the client option for it.
>
> This patch series add the `remote-object-info` command to `cat-file --batch-command`. This command
> allows the client to make an object-info command request to a server
> that supports protocol v2. If the server is v2, but does not have
> object-info capability, the entire object is fetched and the
> relevant object info is returned.
>
> A few questions open for discussions please:
>
> 1. In the current implementation, if a user puts `remote-object-info` in protocol v1,
> `cat-file --batch-command` will die. Which way do we prefer? "error and exit (i.e. die)"
> or "warn and wait for new command".
>
> 2. Right now, only the size is supported. If the batch command format
> contains objectsize:disk or deltabase, it will die. The question
> is about objecttype. In the current implementation, it will die too.
> But dying on objecttype breaks the default format. We have changed the
> default format to %(objectname) %(objectsize) when remote-object-info is used.
> Any suggestions on this approach?
>
>
> [1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
> [2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
>
>
> Calvin Wan (5):
> fetch-pack: refactor packet writing
> fetch-pack: move fetch initialization
> serve: advertise object-info feature
> transport: add client support for object-info
> cat-file: add remote-object-info to batch-command
>
> Eric Ju (1):
> cat-file: add declaration of variable i inside its for loop
>
> Documentation/git-cat-file.txt | 22 +-
> builtin/cat-file.c | 240 ++++++++++----
> fetch-pack.c | 48 ++-
> fetch-pack.h | 10 +
> object-file.c | 11 +
> object-store-ll.h | 3 +
> serve.c | 4 +-
> t/t1017-cat-file-remote-object-info.sh | 412 +++++++++++++++++++++++++
> transport-helper.c | 8 +-
> transport.c | 102 +++++-
> transport.h | 11 +
> 11 files changed, 785 insertions(+), 86 deletions(-)
> create mode 100755 t/t1017-cat-file-remote-object-info.sh
>
> --
> 2.45.2
>
* Re: [PATCH v2 6/6] cat-file: add remote-object-info to batch-command
2024-07-20 3:43 ` [PATCH v2 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-09-11 13:11 ` Toon Claes
2024-09-25 18:18 ` Peijian Ju
2024-09-24 12:13 ` Christian Couder
1 sibling, 1 reply; 174+ messages in thread
From: Toon Claes @ 2024-09-11 13:11 UTC (permalink / raw)
To: Eric Ju, git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Eric Ju <eric.peijian@gmail.com> writes:
[snip]
> diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> new file mode 100755
> index 0000000000..64eb55bd9e
> --- /dev/null
> +++ b/t/t1017-cat-file-remote-object-info.sh
> @@ -0,0 +1,748 @@
> +#!/bin/sh
> +
> +test_description='git cat-file --batch-command with remote-object-info command'
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +
> +echo_without_newline () {
> + printf '%s' "$*"
> +}
> +
> +echo_without_newline_nul () {
> + echo_without_newline "$@" | tr '\n' '\0'
> +}
> +
> +strlen () {
> + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> +}
> +
> +hello_content="Hello World"
> +hello_size=$(strlen "$hello_content")
> +hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> +
> +# This is how we get 13:
> +# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
> +# file mode is 100644, which is 6 characters;
> +# file name is hello, which is 5 characters
> +# a space is 1 character and a null is 1 character
> +tree_size=$(($(test_oid rawsz) + 13))
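The constant 13 above can be double-checked with shell arithmetic. This is
only a minimal sketch: the variable names are illustrative and not taken from
the patch, and the raw hash sizes of 20 bytes (SHA-1) and 32 bytes (SHA-256)
are what `test_oid rawsz` is assumed to return for each algorithm.

```shell
mode=100644
name=hello

# <file mode> + <a space> + <file name> + <a NUL>
entry_overhead=$(( ${#mode} + 1 + ${#name} + 1 ))
echo "$entry_overhead"

# Adding the raw (binary) object ID gives the size of the one-entry tree
# for SHA-1 and SHA-256 repositories respectively.
echo "$(( 20 + entry_overhead )) $(( 32 + entry_overhead ))"
```

This prints 13, followed by 33 and 45 for the two hash algorithms.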
> +
> +commit_message="Initial commit"
> +
> +# This is how we get 137:
> +# 137 = <tree header> + <a_space> + <a newline> +
> +# <Author line> + <a newline> +
> +# <Committer line> + <a newline> +
> +# <a newline> +
> +# <commit message length>
> +# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
> +# to get 177, 2. then deduct 40 hex characters to get 137
> +commit_size=$(($(test_oid hexsz) + 137))
> +
> +tag_header_without_oid="type blob
> +tag hellotag
> +tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
> +tag_header_without_timestamp="object $hello_oid
> +$tag_header_without_oid"
> +tag_description="This is a tag"
> +tag_content="$tag_header_without_timestamp 0 +0000
> +
> +$tag_description"
> +
> +tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
> +tag_size=$(strlen "$tag_content")
> +
> +set_transport_variables () {
> + hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> + tree_sha1=$(git -C "$1" write-tree)
> + commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
> + tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
Here and in various other places in this file I see variable names
ending in "_sha1". I think it makes more sense to name them "_oid",
because they also work fine with GIT_TEST_DEFAULT_HASH=sha256.
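To illustrate why an "_oid" suffix is the more accurate name, here is a
minimal sketch that is not part of the patch. It assumes the GNU coreutils
tools sha1sum and sha256sum are available, and blob_oid is a hypothetical
helper that hashes content the way Git hashes a blob, i.e. over the header
"blob <size>\0" followed by the content.

```shell
# Hypothetical helper, not part of the patch: compute a blob object ID by
# hashing "blob <size>\0<content>" with the given algorithm (sha1 or sha256).
# Assumes ASCII content, so that ${#content} equals the size in bytes.
blob_oid () {
	algo=$1
	content=$2
	printf 'blob %d\0%s' "${#content}" "$content" |
	"${algo}sum" |
	cut -d' ' -f1
}

sha1_oid=$(blob_oid sha1 "Hello World")
sha256_oid=$(blob_oid sha256 "Hello World")

# The two IDs differ in length: 40 hex characters versus 64.
echo "${#sha1_oid} ${#sha256_oid}"
```

The same variable would hold a 40-character ID in a SHA-1 repository and a
64-character ID under GIT_TEST_DEFAULT_HASH=sha256, so a "_sha1" suffix is
misleading in the latter case.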
Other than that I don't have any comments about this patch series.
--
Toon
* Re: [PATCH v2 1/6] fetch-pack: refactor packet writing
2024-07-20 3:43 ` [PATCH v2 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-09-24 11:45 ` Christian Couder
2024-09-25 20:42 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Christian Couder @ 2024-09-24 11:45 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Sat, Jul 20, 2024 at 5:43 AM Eric Ju <eric.peijian@gmail.com> wrote:
>
> From: Calvin Wan <calvinwan@google.com>
>
> A subsequent patch needs to write capabilities for another command.
> Refactor write_fetch_command_and_capabilities() to be a more general
> purpose function write_command_and_capabilities(), so that it can be
> used by both fetch and future command.
>
> Here "command" means the "operations" supported by Git’s wire protocol
> https://git-scm.com/docs/protocol-v2. An example would be a
> git's subcommand, such as git-fetch(1); or an operation supported by
> the server side such as "object-info" implemented in "a2ba162cda
> (object-info: support for retrieving object info, 2021-04-20)".
I agree that reusing or refactoring the new
write_command_and_capabilities() function for more commands can be
done in a separate series that could perhaps also move the new
function to connect.c. Maybe this could be added to the commit message
though.
[...]
> -static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
> - const struct string_list *server_options)
> +static void write_command_and_capabilities(struct strbuf *req_buf,
> + const struct string_list *server_options, const char* command)
In https://lore.kernel.org/git/xmqqfsn0qsi4.fsf@gitster.g/ Junio
suggested swapping the "command" and "server_options" arguments as well
as sticking the "*" to "command" instead of "char", so:
  static void write_command_and_capabilities(struct strbuf *req_buf,
                                             const char *command,
                                             const struct string_list *server_options)
The rest of the patch looks good.
* Re: [PATCH v2 4/6] transport: add client support for object-info
2024-07-20 3:43 ` [PATCH v2 4/6] transport: add client support for object-info Eric Ju
@ 2024-09-24 11:45 ` Christian Couder
2024-09-24 17:29 ` Junio C Hamano
2024-09-25 18:29 ` Peijian Ju
0 siblings, 2 replies; 174+ messages in thread
From: Christian Couder @ 2024-09-24 11:45 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Sat, Jul 20, 2024 at 5:43 AM Eric Ju <eric.peijian@gmail.com> wrote:
[...]
> fetch-pack.c | 24 +++++++++
> fetch-pack.h | 10 ++++
> transport-helper.c | 8 ++-
> transport.c | 118 +++++++++++++++++++++++++++++++++++++++++++--
> transport.h | 11 +++++
> 5 files changed, 164 insertions(+), 7 deletions(-)
Karthik suggested adding tests at this stage, but I see no tests here.
Maybe the tests are added later, but I agree with Karthik that it
would be nice to add them early if possible.
> diff --git a/transport-helper.c b/transport-helper.c
> index 09b3560ffd..841a32e80a 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -699,13 +699,17 @@ static int fetch_refs(struct transport *transport,
>
> /*
> * If we reach here, then the server, the client, and/or the transport
> - * helper does not support protocol v2. --negotiate-only requires
> - * protocol v2.
> + * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
> + * require protocol v2.
> */
> if (data->transport_options.acked_commits) {
> warning(_("--negotiate-only requires protocol v2"));
> return -1;
> }
> + if (transport->smart_options->object_info) {
> + // fail the command explicitly to avoid further commands input
We use "/* stuff */" for one line comments instead of "// stuff". Also
the comment could go before the if (...) above and the "{" and "}"
could be dropped.
> + die(_("remote-object-info requires protocol v2"));
> + }
[...]
> +static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
> +{
> + int size_index = -1;
> + struct git_transport_data *data = transport->data;
> + struct object_info_args args = { 0 };
> + struct packet_reader reader;
> +
> + args.server_options = transport->server_options;
> + args.object_info_options = transport->smart_options->object_info_options;
> + args.oids = transport->smart_options->object_info_oids;
> +
> + connect_setup(transport, 0);
> + packet_reader_init(&reader, data->fd[0], NULL, 0,
> + PACKET_READ_CHOMP_NEWLINE |
> + PACKET_READ_GENTLE_ON_EOF |
> + PACKET_READ_DIE_ON_ERR_PACKET);
> + data->version = discover_version(&reader);
> +
> + transport->hash_algo = reader.hash_algo;
> +
> + switch (data->version) {
> + case protocol_v2:
> + if (!server_supports_v2("object-info"))
> + return -1;
> + if (unsorted_string_list_has_string(args.object_info_options, "size")
> + && !server_supports_feature("object-info", "size", 0)) {
> + return -1;
> + }
The "{" and "}" can be dropped here too.
> + send_object_info_request(data->fd[1], &args);
> + break;
> + case protocol_v1:
> + case protocol_v0:
> + die(_("wrong protocol version. expected v2"));
> + case protocol_unknown_version:
> + BUG("unknown protocol version");
> + }
> +
> + for (size_t i = 0; i < args.object_info_options->nr; i++) {
> + if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
> + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> + return -1;
> + }
> + if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
> + if (!strcmp(reader.line, "size")) {
> + size_index = i;
> + for (size_t j = 0; j < args.oids->nr; j++) {
> + object_info_data[j].sizep = xcalloc(1, sizeof(long));
> + }
The "{" and "}" can be dropped here too.
> + }
> + continue;
> + }
> + return -1;
> + }
> +
> + for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
> + struct string_list object_info_values = STRING_LIST_INIT_DUP;
> +
> + string_list_split(&object_info_values, reader.line, ' ', -1);
> + if (0 <= size_index) {
> + if (!strcmp(object_info_values.items[1 + size_index].string, ""))
> + die("object-info: not our ref %s",
> + object_info_values.items[0].string);
> +
> + *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
> +
This blank line can be removed.
> + }
> +
> + string_list_clear(&object_info_values, 0);
> + }
> + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> +
> + return 0;
> +}
> +
> static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> struct transport_ls_refs_options *options)
> {
> @@ -413,6 +487,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> struct ref *refs = NULL;
> struct fetch_pack_args args;
> struct ref *refs_tmp = NULL;
> + struct ref *object_info_refs = NULL;
>
> memset(&args, 0, sizeof(args));
> args.uploadpack = data->options.uploadpack;
> @@ -439,11 +514,36 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.server_options = transport->server_options;
> args.negotiation_tips = data->options.negotiation_tips;
> args.reject_shallow_remote = transport->smart_options->reject_shallow;
> + args.object_info = transport->smart_options->object_info;
> +
> + if (transport->smart_options
> + && transport->smart_options->object_info
> + && transport->smart_options->object_info_oids->nr > 0) {
> + struct ref *ref_itr = object_info_refs = alloc_ref("");
> +
> + if (!fetch_object_info(transport, data->options.object_info_data))
> + goto cleanup;
> +
> + args.object_info_data = data->options.object_info_data;
> + args.quiet = 1;
> + args.no_progress = 1;
> + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> + ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
> + ref_itr->exact_oid = 1;
> + if (i == transport->smart_options->object_info_oids->nr - 1)
> + /* last element, no need to allocat to next */
s/allocat/allocate/
> + ref_itr -> next = NULL;
> + else
> + ref_itr->next = alloc_ref("");
>
> - if (!data->finished_handshake) {
> - int i;
> + ref_itr = ref_itr->next;
> + }
> +
> + transport->remote_refs = object_info_refs;
> +
> + } else if (!data->finished_handshake) {
> int must_list_refs = 0;
> - for (i = 0; i < nr_heads; i++) {
> + for (int i = 0; i < nr_heads; i++) {
> if (!to_fetch[i]->exact_oid) {
> must_list_refs = 1;
> break;
> @@ -481,23 +581,31 @@ static int fetch_refs_via_pack(struct transport *transport,
> &transport->pack_lockfiles, data->version);
>
> data->finished_handshake = 0;
> + if (args.object_info) {
> + struct ref *ref_cpy_reader = object_info_refs;
> + for (int i = 0; ref_cpy_reader; i++) {
> + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
This line might want to be folded.
> + ref_cpy_reader = ref_cpy_reader->next;
> + }
> + }
> +
* Re: [PATCH v2 6/6] cat-file: add remote-object-info to batch-command
2024-07-20 3:43 ` [PATCH v2 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-09-11 13:11 ` Toon Claes
@ 2024-09-24 12:13 ` Christian Couder
2024-09-25 18:12 ` Peijian Ju
1 sibling, 1 reply; 174+ messages in thread
From: Christian Couder @ 2024-09-24 12:13 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Sat, Jul 20, 2024 at 5:44 AM Eric Ju <eric.peijian@gmail.com> wrote:
> +remote-object-info <remote> <object>...::
> + Print object info for object references `<object>` at specified <remote> without
> + downloading objects from remote. If the object-info capability is not
> + supported by the server, the objects will be downloaded instead.
> + Error when no object references is provided.
Maybe s/is provided/are provided/
> + This command may be combined with `--buffer`.
> `deltabase`::
> If the object is stored as a delta on-disk, this expands to the
> full hex representation of the delta base object name.
> Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> - below.
> + below. Not supported by `remote-object-info`.
>
> `rest`::
> If this atom is used in the output string, input lines are split
> @@ -314,7 +323,9 @@ newline. The available atoms are:
> line) are output in place of the `%(rest)` atom.
>
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except remote-object-info command who uses
s/except remote-object-info command who uses/except for
`remote-object-info` commands which use/
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +When "%(objecttype)" is supported, default format should be unified.
>
> If `--batch` is specified, or if `--batch-command` is used with the `contents`
> command, the object information is followed by the object contents (consisting
> @@ -396,6 +407,10 @@ scripting purposes.
> CAVEATS
> -------
>
> +Note that since objecttype, objectsize:disk and deltabase are currently not supported by the
s/objecttype, objectsize:disk and deltabase/%(objecttype),
%(objectsize:disk) and %(deltabase)/
> +remote-object-info, git will error and exit when they are in the format string.
s/remote-object-info, git/`remote-object-info` command, we/
> +
> +
Maybe a single blank line is enough.
> Note that the sizes of objects on disk are reported accurately, but care
> should be taken in drawing conclusions about which refs or objects are
> responsible for disk usage. The size of a packed non-delta object may be
[...]
> + gtransport = transport_get(remote, NULL);
> + if (gtransport->smart_options) {
> + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> + gtransport->smart_options->object_info = 1;
> + gtransport->smart_options->object_info_oids = &object_info_oids;
> + /*
> + * 'size' is the only option currently supported.
> + * Other options that are passed in the format will exit with error.
> + */
> + if (strstr(opt->format, "%(objectsize)")) {
> + string_list_append(&object_info_options, "size");
> + } else {
> + die(_("%s is currently not supported with remote-object-info"), opt->format);
> + }
Something like the following might be a bit shorter and simpler:
/* 'objectsize' is the only option currently supported */
if (!strstr(opt->format, "%(objectsize)"))
die(_("%s is currently not supported with remote-object-info"), opt->format);
string_list_append(&object_info_options, "size");
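
As an aside, the strstr() check above only confirms that %(objectsize) is
present; it does not by itself reject a format that also contains an
unsupported atom. A stricter variant could walk every %(atom) in the format.
The following is only a sketch under stated assumptions: the helper name
remote_format_is_supported() is hypothetical and not part of git, and the
allowed set (objectname, objectsize) is taken from the documentation hunk
quoted above.

```c
#include <string.h>

/*
 * Hypothetical helper (not a git function): return 1 if every %(atom)
 * in `format` is in the allowed set, 0 otherwise. The allowed set is an
 * assumption based on the documentation quoted in this thread.
 */
int remote_format_is_supported(const char *format)
{
	static const char *const supported[] = { "objectname", "objectsize" };
	const size_t nr = sizeof(supported) / sizeof(supported[0]);
	const char *p = format;

	while ((p = strstr(p, "%(")) != NULL) {
		const char *name = p + 2;
		const char *end = strchr(name, ')');
		size_t len, i;

		if (!end)
			return 0; /* unterminated atom */
		len = (size_t)(end - name);

		/* exact-length match, so "objectsize:disk" is rejected */
		for (i = 0; i < nr; i++)
			if (strlen(supported[i]) == len &&
			    !strncmp(supported[i], name, len))
				break;
		if (i == nr)
			return 0; /* unknown or unsupported atom */

		p = end + 1;
	}
	return 1;
}
```

With such a helper, a caller could die() when it returns 0. Whether the
stricter check is wanted here is a design question for the series, not
something this sketch settles.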
> + if (object_info_options.nr > 0) {
> + gtransport->smart_options->object_info_options = &object_info_options;
> + gtransport->smart_options->object_info_data = remote_object_info;
> + retval = transport_fetch_refs(gtransport, NULL);
> + }
> + } else {
> + retval = -1;
> + }
[...]
> + opt->use_remote_info = 1;
> + data->skip_object_info = 1;
> + for (size_t i = 0; i < object_info_oids.nr; i++) {
> +
> + data->oid = object_info_oids.oid[i];
> +
> + if (remote_object_info[i].sizep) {
> + data->size = *remote_object_info[i].sizep;
> + } else {
> + /*
> + * When reaching here, it means remote-object-info can't retrive
s/retrive/retrieve/
> + * infomation from server withoug downloading them, and the objects
s/infomation from server withoug/information from server without/
> + * have been fetched to client already.
> + * Print the infomation using the logic for local objects.
s/infomation/information/
> + */
> + data->skip_object_info = 0;
> + }
> +
> + opt->batch_mode = BATCH_MODE_INFO;
> + batch_object_write(argv[i+1], output, opt, data, NULL, 0);
> +
> + }
> + opt->use_remote_info = 0;
> + data->skip_object_info = 0;
> +
> +cleanup:
> + for (size_t i = 0; i < object_info_oids.nr; i++)
> + free_object_info_contents(&remote_object_info[i]);
> + free(line_to_split);
> + free(argv);
> + free(remote_object_info);
> +}
> +
> static void dispatch_calls(struct batch_options *opt,
> struct strbuf *output,
> struct expand_data *data,
> @@ -696,9 +803,10 @@ static const struct parse_cmd {
> parse_cmd_fn_t fn;
> unsigned takes_args;
> } commands[] = {
> - { "contents", parse_cmd_contents, 1},
> - { "info", parse_cmd_info, 1},
> - { "flush", NULL, 0},
> + { "contents", parse_cmd_contents, 1 },
> + { "info", parse_cmd_info, 1 },
> + { "remote-object-info", parse_cmd_remote_object_info, 1 },
> + { "flush", NULL, 0 },
I am not sure it's a good thing to add a space before "}".
> };
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v2 4/6] transport: add client support for object-info
2024-09-24 11:45 ` Christian Couder
@ 2024-09-24 17:29 ` Junio C Hamano
2024-09-25 18:29 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Junio C Hamano @ 2024-09-24 17:29 UTC (permalink / raw)
To: Christian Couder
Cc: Eric Ju, git, calvinwan, jonathantanmy, chriscool, karthik.188,
toon, jltobler
Christian Couder <christian.couder@gmail.com> writes:
> On Sat, Jul 20, 2024 at 5:43 AM Eric Ju <eric.peijian@gmail.com> wrote:
>
> [...]
>
>> fetch-pack.c | 24 +++++++++
>> fetch-pack.h | 10 ++++
>> transport-helper.c | 8 ++-
>> transport.c | 118 +++++++++++++++++++++++++++++++++++++++++++--
>> transport.h | 11 +++++
>> 5 files changed, 164 insertions(+), 7 deletions(-)
>
> Karthik suggested adding tests at this stage, but I see no tests here.
> Maybe the tests are added later, but I agree with Karthik that it
> would be nice to add them early if possible.
> ...
>> + if (args.object_info) {
>> + struct ref *ref_cpy_reader = object_info_refs;
>> + for (int i = 0; ref_cpy_reader; i++) {
>> + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
>
> This line might want to be folded.
Thanks for a review on this long patch.
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v2 6/6] cat-file: add remote-object-info to batch-command
2024-09-24 12:13 ` Christian Couder
@ 2024-09-25 18:12 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-09-25 18:12 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Tue, Sep 24, 2024 at 8:13 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Sat, Jul 20, 2024 at 5:44 AM Eric Ju <eric.peijian@gmail.com> wrote:
>
> > +remote-object-info <remote> <object>...::
> > + Print object info for object references `<object>` at specified <remote> without
> > + downloading objects from remote. If the object-info capability is not
> > + supported by the server, the objects will be downloaded instead.
> > + Error when no object references is provided.
>
> Maybe s/is provided/are provided/
Thank you. Fixed in V3.
>
>
> > + This command may be combined with `--buffer`.
>
> > `deltabase`::
> > If the object is stored as a delta on-disk, this expands to the
> > full hex representation of the delta base object name.
> > Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> > - below.
> > + below. Not supported by `remote-object-info`.
> >
> > `rest`::
> > If this atom is used in the output string, input lines are split
> > @@ -314,7 +323,9 @@ newline. The available atoms are:
> > line) are output in place of the `%(rest)` atom.
> >
> > If no format is specified, the default format is `%(objectname)
> > -%(objecttype) %(objectsize)`.
> > +%(objecttype) %(objectsize)`, except remote-object-info command who uses
>
> s/except remote-object-info command who uses/except for
> `remote-object-info` commands which use/
Thank you. Fixed in V3.
>
>
> > +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> > +When "%(objecttype)" is supported, default format should be unified.
> >
> > If `--batch` is specified, or if `--batch-command` is used with the `contents`
> > command, the object information is followed by the object contents (consisting
> > @@ -396,6 +407,10 @@ scripting purposes.
> > CAVEATS
> > -------
> >
> > +Note that since objecttype, objectsize:disk and deltabase are currently not supported by the
>
> s/objecttype, objectsize:disk and deltabase/%(objecttype),
> %(objectsize:disk) and %(deltabase)/
>
Thank you. Fixed in V3.
>
> > +remote-object-info, git will error and exit when they are in the format string.
>
> s/remote-object-info, git/`remote-object-info` command, we/
>
Thank you. Fixed in V3.
>
> > +
> > +
>
> Maybe a single blank line is enough.
Thank you. Fixed in V3.
>
>
> > Note that the sizes of objects on disk are reported accurately, but care
> > should be taken in drawing conclusions about which refs or objects are
> > responsible for disk usage. The size of a packed non-delta object may be
>
> [...]
>
> > + gtransport = transport_get(remote, NULL);
> > + if (gtransport->smart_options) {
> > + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> > + gtransport->smart_options->object_info = 1;
> > + gtransport->smart_options->object_info_oids = &object_info_oids;
> > + /*
> > + * 'size' is the only option currently supported.
> > + * Other options that are passed in the format will exit with error.
> > + */
> > + if (strstr(opt->format, "%(objectsize)")) {
> > + string_list_append(&object_info_options, "size");
> > + } else {
> > + die(_("%s is currently not supported with remote-object-info"), opt->format);
> > + }
>
> Something like the following might be a bit shorter and simpler:
>
> /* 'objectsize' is the only option currently supported */
> if (!strstr(opt->format, "%(objectsize)"))
> die(_("%s is currently not supported with remote-object-info"), opt->format);
>
> string_list_append(&object_info_options, "size");
Thank you. Revised in V3.
>
>
> > + if (object_info_options.nr > 0) {
> > + gtransport->smart_options->object_info_options = &object_info_options;
> > + gtransport->smart_options->object_info_data = remote_object_info;
> > + retval = transport_fetch_refs(gtransport, NULL);
> > + }
> > + } else {
> > + retval = -1;
> > + }
>
> [...]
>
> > + opt->use_remote_info = 1;
> > + data->skip_object_info = 1;
> > + for (size_t i = 0; i < object_info_oids.nr; i++) {
> > +
> > + data->oid = object_info_oids.oid[i];
> > +
> > + if (remote_object_info[i].sizep) {
> > + data->size = *remote_object_info[i].sizep;
> > + } else {
> > + /*
> > + * When reaching here, it means remote-object-info can't retrive
>
> s/retrive/retrieve/
>
Thank you. Fixed in V3.
>
> > + * infomation from server withoug downloading them, and the objects
>
> s/infomation from server withoug/information from server without/
>
Thank you. Fixed in V3.
> > + * have been fetched to client already.
> > + * Print the infomation using the logic for local objects.
>
> s/infomation/information/
Thank you. Fixed in V3.
>
>
> > + */
> > + data->skip_object_info = 0;
> > + }
> > +
> > + opt->batch_mode = BATCH_MODE_INFO;
> > + batch_object_write(argv[i+1], output, opt, data, NULL, 0);
> > +
> > + }
> > + opt->use_remote_info = 0;
> > + data->skip_object_info = 0;
> > +
> > +cleanup:
> > + for (size_t i = 0; i < object_info_oids.nr; i++)
> > + free_object_info_contents(&remote_object_info[i]);
> > + free(line_to_split);
> > + free(argv);
> > + free(remote_object_info);
> > +}
> > +
> > static void dispatch_calls(struct batch_options *opt,
> > struct strbuf *output,
> > struct expand_data *data,
> > @@ -696,9 +803,10 @@ static const struct parse_cmd {
> > parse_cmd_fn_t fn;
> > unsigned takes_args;
> > } commands[] = {
> > - { "contents", parse_cmd_contents, 1},
> > - { "info", parse_cmd_info, 1},
> > - { "flush", NULL, 0},
> > + { "contents", parse_cmd_contents, 1 },
> > + { "info", parse_cmd_info, 1 },
> > + { "remote-object-info", parse_cmd_remote_object_info, 1 },
> > + { "flush", NULL, 0 },
>
> I am not sure it's a good thing to add a space before "}".
>
Thank you. Fixed in V3.
>
> > };
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v2 6/6] cat-file: add remote-object-info to batch-command
2024-09-11 13:11 ` Toon Claes
@ 2024-09-25 18:18 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-09-25 18:18 UTC (permalink / raw)
To: Toon Claes, git
Cc: calvinwan, jonathantanmy, chriscool, karthik.188, jltobler
On Wed, Sep 11, 2024 at 9:12 AM Toon Claes <toon@iotcl.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> [snip]
>
> > diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> > new file mode 100755
> > index 0000000000..64eb55bd9e
> > --- /dev/null
> > +++ b/t/t1017-cat-file-remote-object-info.sh
> > @@ -0,0 +1,748 @@
> > +#!/bin/sh
> > +
> > +test_description='git cat-file --batch-command with remote-object-info command'
> > +
> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> > +
> > +. ./test-lib.sh
> > +
> > +echo_without_newline () {
> > + printf '%s' "$*"
> > +}
> > +
> > +echo_without_newline_nul () {
> > + echo_without_newline "$@" | tr '\n' '\0'
> > +}
> > +
> > +strlen () {
> > + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> > +}
> > +
> > +hello_content="Hello World"
> > +hello_size=$(strlen "$hello_content")
> > +hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> > +
> > +# This is how we get 13:
> > +# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
> > +# file mode is 100644, which is 6 characters;
> > +# file name is hello, which is 5 characters
> > +# a space is 1 character and a null is 1 character
> > +tree_size=$(($(test_oid rawsz) + 13))
> > +
> > +commit_message="Initial commit"
> > +
> > +# This is how we get 137:
> > +# 137 = <tree header> + <a_space> + <a newline> +
> > +# <Author line> + <a newline> +
> > +# <Committer line> + <a newline> +
> > +# <a newline> +
> > +# <commit message length>
> > +# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
> > +# to get 177, 2. then deduct 40 hex characters to get 137
> > +commit_size=$(($(test_oid hexsz) + 137))
> > +
> > +tag_header_without_oid="type blob
> > +tag hellotag
> > +tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
> > +tag_header_without_timestamp="object $hello_oid
> > +$tag_header_without_oid"
> > +tag_description="This is a tag"
> > +tag_content="$tag_header_without_timestamp 0 +0000
> > +
> > +$tag_description"
> > +
> > +tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
> > +tag_size=$(strlen "$tag_content")
> > +
> > +set_transport_variables () {
> > + hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
> > + tree_sha1=$(git -C "$1" write-tree)
> > + commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
> > + tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
>
> I see here and various other places in this file names with "_sha1". I
> think it makes more sense to name them "_oid" because these works also
> fine with GIT_TEST_DEFAULT_HASH=sha256.
>
> Other than that I don't have any comments about this patch series.
>
> --
> Toon
Thank you. In V3, all the variables ending with "_sha1" have been renamed to
end with "_oid".
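
For reference, the size arithmetic in the quoted test comments can be checked
for both hash algorithms. This is a minimal sketch, assuming the constants 13
and 137 from the test file and the usual raw/hex OID lengths (20/40 for SHA-1,
32/64 for SHA-256). The function names are illustrative only and do not exist
in git.

```c
#include <stddef.h>

/*
 * Tree with a single entry "100644 hello\0<raw oid>":
 * 6 (mode) + 1 (space) + 5 (name) + 1 (NUL) = 13, plus the raw OID size.
 */
size_t expected_tree_size(size_t rawsz)
{
	return rawsz + 13;
}

/*
 * Commit from the quoted test: 137 bytes of headers and message that do
 * not depend on the hash, plus the hex OID of the tree it points at.
 */
size_t expected_commit_size(size_t hexsz)
{
	return hexsz + 137;
}
```

For SHA-1 this gives a tree of 33 bytes and a commit of 177 bytes, matching
the `wc -c` figure in the test comment. For SHA-256 the same formulas give 45
and 201.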
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v2 4/6] transport: add client support for object-info
2024-09-24 11:45 ` Christian Couder
2024-09-24 17:29 ` Junio C Hamano
@ 2024-09-25 18:29 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-09-25 18:29 UTC (permalink / raw)
To: Christian Couder, git
Cc: calvinwan, jonathantanmy, chriscool, karthik.188, toon, jltobler
On Tue, Sep 24, 2024 at 7:45 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Sat, Jul 20, 2024 at 5:43 AM Eric Ju <eric.peijian@gmail.com> wrote:
>
> [...]
>
> > fetch-pack.c | 24 +++++++++
> > fetch-pack.h | 10 ++++
> > transport-helper.c | 8 ++-
> > transport.c | 118 +++++++++++++++++++++++++++++++++++++++++++--
> > transport.h | 11 +++++
> > 5 files changed, 164 insertions(+), 7 deletions(-)
>
> Karthik suggested adding tests at this stage, but I see no tests here.
> Maybe the tests are added later, but I agree with Karthik that it
> would be nice to add them early if possible.
>
Thank you. I’m not sure if there’s an easy way to directly add unit
tests for the changes in fetch-pack.c, transport-helper.c, and
transport.c, as the relevant functions are deeply nested within the
client’s call stack. Therefore, I’m attempting to test them indirectly
through git cat-file in t/t1017-cat-file-remote-object-info.sh in the
next commit.
Specifically, in t/t1017-cat-file-remote-object-info.sh:
- The code in transport-helper.c is tested in the cases
“remote-object-info fails on server with legacy protocol” and
“remote-object-info fails on server with legacy protocol fallback”.
- The code in transport.c and fetch-pack.c is tested in the cases
where transfer.advertiseobjectinfo is set to true, such as in
“batch-command remote-object-info http://”, “batch-command
remote-object-info file://”, and “batch-command remote-object-info
git://”. In these tests, we verify that remote-object-info
successfully retrieves the size from the remote without downloading
the objects locally.
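
The size retrieval those tests exercise comes down to parsing response lines
of the form "<oid> <size>", as the fetch_object_info() loop quoted further
down does with string_list_split() and strtoul(). A standalone sketch of that
parsing step follows, assuming that line layout. The function
parse_object_info_line() is hypothetical and is stricter than the quoted
patch, since it also rejects trailing garbage after the number.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for one "<oid> <size>" response line.
 * Returns 0 on success and fills oid_out/size_out; returns -1 when the
 * size field is missing, empty (the "not our ref" case in the quoted
 * patch), or not a plain decimal number.
 */
int parse_object_info_line(const char *line, char *oid_out, size_t oid_cap,
			   unsigned long *size_out)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;
	unsigned long value;

	if (!sp)
		return -1; /* no size field at all */
	oid_len = (size_t)(sp - line);
	if (oid_len == 0 || oid_len >= oid_cap)
		return -1;
	if (sp[1] < '0' || sp[1] > '9')
		return -1; /* empty or non-numeric size */

	errno = 0;
	value = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1; /* overflow or trailing garbage */

	memcpy(oid_out, line, oid_len);
	oid_out[oid_len] = '\0';
	*size_out = value;
	return 0;
}
```

Checking the end pointer from strtoul() is a small addition over the quoted
code, which passes NULL and so would silently accept a value like "12x".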
> > diff --git a/transport-helper.c b/transport-helper.c
> > index 09b3560ffd..841a32e80a 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -699,13 +699,17 @@ static int fetch_refs(struct transport *transport,
> >
> > /*
> > * If we reach here, then the server, the client, and/or the transport
> > - * helper does not support protocol v2. --negotiate-only requires
> > - * protocol v2.
> > + * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
> > + * require protocol v2.
> > */
> > if (data->transport_options.acked_commits) {
> > warning(_("--negotiate-only requires protocol v2"));
> > return -1;
> > }
> > + if (transport->smart_options->object_info) {
> > + // fail the command explicitly to avoid further commands input
>
> We use "/* stuff */" for one line comments instead of "// stuff". Also
> the comment could go before the if (...) above and the "{" and "}"
> could be dropped.
>
Thank you. Revised in V3.
> > + die(_("remote-object-info requires protocol v2"));
> > + }
>
> [...]
>
> > +static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
> > +{
> > + int size_index = -1;
> > + struct git_transport_data *data = transport->data;
> > + struct object_info_args args = { 0 };
> > + struct packet_reader reader;
> > +
> > + args.server_options = transport->server_options;
> > + args.object_info_options = transport->smart_options->object_info_options;
> > + args.oids = transport->smart_options->object_info_oids;
> > +
> > + connect_setup(transport, 0);
> > + packet_reader_init(&reader, data->fd[0], NULL, 0,
> > + PACKET_READ_CHOMP_NEWLINE |
> > + PACKET_READ_GENTLE_ON_EOF |
> > + PACKET_READ_DIE_ON_ERR_PACKET);
> > + data->version = discover_version(&reader);
> > +
> > + transport->hash_algo = reader.hash_algo;
> > +
> > + switch (data->version) {
> > + case protocol_v2:
> > + if (!server_supports_v2("object-info"))
> > + return -1;
> > + if (unsorted_string_list_has_string(args.object_info_options, "size")
> > + && !server_supports_feature("object-info", "size", 0)) {
> > + return -1;
> > + }
>
> The "{" and "}" can be dropped here too.
>
Thank you. Fixed in V3.
> > + send_object_info_request(data->fd[1], &args);
> > + break;
> > + case protocol_v1:
> > + case protocol_v0:
> > + die(_("wrong protocol version. expected v2"));
> > + case protocol_unknown_version:
> > + BUG("unknown protocol version");
> > + }
> > +
> > + for (size_t i = 0; i < args.object_info_options->nr; i++) {
> > + if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
> > + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> > + return -1;
> > + }
> > + if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
> > + if (!strcmp(reader.line, "size")) {
> > + size_index = i;
> > + for (size_t j = 0; j < args.oids->nr; j++) {
> > + object_info_data[j].sizep = xcalloc(1, sizeof(long));
> > + }
>
> The "{" and "}" can be dropped here too.
>
Thank you. Fixed in V3.
> > + }
> > + continue;
> > + }
> > + return -1;
> > + }
> > +
> > + for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
> > + struct string_list object_info_values = STRING_LIST_INIT_DUP;
> > +
> > + string_list_split(&object_info_values, reader.line, ' ', -1);
> > + if (0 <= size_index) {
> > + if (!strcmp(object_info_values.items[1 + size_index].string, ""))
> > + die("object-info: not our ref %s",
> > + object_info_values.items[0].string);
> > +
> > + *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
> > +
>
> This blank line can be removed.
>
Thank you. Fixed in V3.
> > + }
> > +
> > + string_list_clear(&object_info_values, 0);
> > + }
> > + check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
> > +
> > + return 0;
> > +}
> > +
> > static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> > struct transport_ls_refs_options *options)
> > {
> > @@ -413,6 +487,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> > struct ref *refs = NULL;
> > struct fetch_pack_args args;
> > struct ref *refs_tmp = NULL;
> > + struct ref *object_info_refs = NULL;
> >
> > memset(&args, 0, sizeof(args));
> > args.uploadpack = data->options.uploadpack;
> > @@ -439,11 +514,36 @@ static int fetch_refs_via_pack(struct transport *transport,
> > args.server_options = transport->server_options;
> > args.negotiation_tips = data->options.negotiation_tips;
> > args.reject_shallow_remote = transport->smart_options->reject_shallow;
> > + args.object_info = transport->smart_options->object_info;
> > +
> > + if (transport->smart_options
> > + && transport->smart_options->object_info
> > + && transport->smart_options->object_info_oids->nr > 0) {
> > + struct ref *ref_itr = object_info_refs = alloc_ref("");
> > +
> > + if (!fetch_object_info(transport, data->options.object_info_data))
> > + goto cleanup;
> > +
> > + args.object_info_data = data->options.object_info_data;
> > + args.quiet = 1;
> > + args.no_progress = 1;
> > + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> > + ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
> > + ref_itr->exact_oid = 1;
> > + if (i == transport->smart_options->object_info_oids->nr - 1)
> > + /* last element, no need to allocat to next */
>
> s/allocat/allocate/
>
Thank you. Fixed in V3.
> > + ref_itr -> next = NULL;
> > + else
> > + ref_itr->next = alloc_ref("");
> >
> > - if (!data->finished_handshake) {
> > - int i;
> > + ref_itr = ref_itr->next;
> > + }
> > +
> > + transport->remote_refs = object_info_refs;
> > +
> > + } else if (!data->finished_handshake) {
> > int must_list_refs = 0;
> > - for (i = 0; i < nr_heads; i++) {
> > + for (int i = 0; i < nr_heads; i++) {
> > if (!to_fetch[i]->exact_oid) {
> > must_list_refs = 1;
> > break;
> > @@ -481,23 +581,31 @@ static int fetch_refs_via_pack(struct transport *transport,
> > &transport->pack_lockfiles, data->version);
> >
> > data->finished_handshake = 0;
> > + if (args.object_info) {
> > + struct ref *ref_cpy_reader = object_info_refs;
> > + for (int i = 0; ref_cpy_reader; i++) {
> > + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
>
> This line might want to be folded.
>
Thank you. Fixed in V3.
>
> > + ref_cpy_reader = ref_cpy_reader->next;
> > + }
> > + }
> > +
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v2 1/6] fetch-pack: refactor packet writing
2024-09-24 11:45 ` Christian Couder
@ 2024-09-25 20:42 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-09-25 20:42 UTC (permalink / raw)
To: Christian Couder, git
Cc: calvinwan, jonathantanmy, chriscool, karthik.188, toon, jltobler
On Tue, Sep 24, 2024 at 7:45 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Sat, Jul 20, 2024 at 5:43 AM Eric Ju <eric.peijian@gmail.com> wrote:
> >
> > From: Calvin Wan <calvinwan@google.com>
> >
> > A subsequent patch needs to write capabilities for another command.
> > Refactor write_fetch_command_and_capabilities() to be a more general
> > purpose function write_command_and_capabilities(), so that it can be
> > used by both fetch and future command.
> >
> > Here "command" means the "operations" supported by Git’s wire protocol
> > https://git-scm.com/docs/protocol-v2. An example would be a
> > git's subcommand, such as git-fetch(1); or an operation supported by
> > the server side such as "object-info" implemented in "a2ba162cda
> > (object-info: support for retrieving object info, 2021-04-20)".
>
> I agree that reusing or refactoring the new
> write_command_and_capabilities() function for more commands can be
> done in a separate series that could perhaps also move the new
> function to Maybe this could be added to the commit message
> though.
>
Thank you, I am adding this to the commit message,
"In a future separate series, we can move
write_command_and_capabilities() to a higher-level file, such as
connect.c, so that it becomes accessible to other commands."
> [...]
>
> > -static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
> > - const struct string_list *server_options)
> > +static void write_command_and_capabilities(struct strbuf *req_buf,
> > + const struct string_list *server_options, const char* command)
>
> In https://lore.kernel.org/git/xmqqfsn0qsi4.fsf@gitster.g/ Junio
> suggested swapping the "command" and "server_options" arguments as well
> as sticking the "*" to "command" instead of "char", so:
>
> static void write_command_and_capabilities(struct strbuf *req_buf,
>                                            const char *command,
>                                            const struct string_list *server_options)
>
> The rest of the patch looks good.
Thank you. The format is changed in V3.
^ permalink raw reply [flat|nested] 174+ messages in thread
* [PATCH v3 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (7 preceding siblings ...)
2024-08-22 21:24 ` [PATCH 0/6] " Peijian Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-09-26 1:38 ` [PATCH v3 1/6] fetch-pack: refactor packet writing Eric Ju
` (5 more replies)
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
` (7 subsequent siblings)
16 siblings, 6 replies; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This is a continuation of Calvin Wan's (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
Sometimes it is useful to get information about an object without having to download
it completely. The server logic for retrieving size has already been implemented and merged in
"a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
This patch series implements the client option for it.
This patch series adds the `remote-object-info` command to `cat-file --batch-command`.
This command allows the client to make an object-info command request to a server
that supports protocol v2. If the server is v2, but does not have
object-info capability, the entire object is fetched and the
relevant object info is returned.
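
To make the request concrete, here is a rough sketch of how an object-info
request could be framed as pkt-lines. It relies on the general pkt-line rules
(a 4-hex-digit length that counts itself, "0001" as delimiter, "0000" as
flush). The helper names are hypothetical, payloads are written without a
trailing newline as in the packet_buf_write() calls in this series, and the
capability lines (agent, object-format, and so on) that
write_command_and_capabilities() emits before the delimiter are left out for
brevity.

```c
#include <stdio.h>
#include <string.h>

#define PKT_MAX_LEN 65520 /* maximum pkt-line length, header included */

/* Append a NUL-terminated string verbatim; -1 if it does not fit. */
static int append_raw(char *buf, size_t cap, size_t *len, const char *s)
{
	size_t n = strlen(s);

	if (*len + n + 1 > cap)
		return -1;
	memcpy(buf + *len, s, n + 1); /* copies the terminating NUL too */
	*len += n;
	return 0;
}

/* Append one pkt-line: 4 hex digits of total length, then the payload. */
static int append_pkt(char *buf, size_t cap, size_t *len, const char *payload)
{
	char hdr[5];
	size_t total = strlen(payload) + 4;

	if (total > PKT_MAX_LEN)
		return -1;
	snprintf(hdr, sizeof(hdr), "%04zx", total);
	if (append_raw(buf, cap, len, hdr) < 0)
		return -1;
	return append_raw(buf, cap, len, payload);
}

/*
 * Hypothetical builder: writes a simplified object-info request for
 * `nr` hex OIDs into buf as a NUL-terminated string. Returns 0 or -1.
 */
int build_object_info_request(char *buf, size_t cap,
			      const char *const *oids, size_t nr)
{
	size_t len = 0, i;
	char line[128];

	if (append_pkt(buf, cap, &len, "command=object-info") < 0 ||
	    append_raw(buf, cap, &len, "0001") < 0 || /* delim packet */
	    append_pkt(buf, cap, &len, "size") < 0)
		return -1;

	for (i = 0; i < nr; i++) {
		int n = snprintf(line, sizeof(line), "oid %s", oids[i]);

		if (n < 0 || (size_t)n >= sizeof(line))
			return -1;
		if (append_pkt(buf, cap, &len, line) < 0)
			return -1;
	}
	return append_raw(buf, cap, &len, "0000"); /* flush packet */
}
```

For a single 40-character SHA-1 OID, the "oid" line is framed with the prefix
"0030", because 4 + 4 + 40 = 48 = 0x30.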
A few questions are open for discussion:
1. In the current implementation, if a user runs `remote-object-info` over protocol v1,
`cat-file --batch-command` will die. Which do we prefer: "error and exit (i.e. die)"
or "warn and wait for a new command"?
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when remote-object-info is used.
Any suggestions on this approach?
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
V1 of the patch series can be found here:
https://lore.kernel.org/git/20240628190503.67389-1-eric.peijian@gmail.com/
v2 of the patch series can be found here:
https://lore.kernel.org/git/20240720034337.57125-1-eric.peijian@gmail.com/
Changes since V2
================
- Fix typos and formatting errors
- Add more information in commit messages
- Confirm that the new logic in transport.c, fetch-pack.c and transport-helper.c
is covered by the new test file t1017-cat-file-remote-object-info.sh
Thank you.
Eric Ju
Calvin Wan (5):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
cat-file: add remote-object-info to batch-command
Eric Ju (1):
cat-file: add declaration of variable i inside its for loop
Documentation/git-cat-file.txt | 22 +-
builtin/cat-file.c | 119 +++-
fetch-pack.c | 49 +-
fetch-pack.h | 10 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/t1017-cat-file-remote-object-info.sh | 750 +++++++++++++++++++++++++
transport-helper.c | 8 +-
transport.c | 116 +++-
transport.h | 11 +
11 files changed, 1070 insertions(+), 33 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v2:
1: f90a74cbb2 ! 1: b570dee186 fetch-pack: refactor packet writing
@@ Metadata
## Commit message ##
fetch-pack: refactor packet writing
- A subsequent patch needs to write capabilities for another command.
Refactor write_fetch_command_and_capabilities() to be a more general
purpose function write_command_and_capabilities(), so that it can be
used by both fetch and future command.
@@ Commit message
the server side such as "object-info" implemented in "a2ba162cda
(object-info: support for retrieving object info, 2021-04-20)".
+ In a future separate series, we can move
+ write_command_and_capabilities() to a higher-level file, such as
+ connect.c, so that it becomes accessible to other commands.
+
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
+static void write_command_and_capabilities(struct strbuf *req_buf,
-+ const struct string_list *server_options, const char* command)
++ const char *command,
++ const struct string_list *server_options)
{
const char *hash_name;
@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
if (server_supports_v2("agent"))
packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
if (advertise_sid && server_supports_v2("session-id"))
+@@ fetch-pack.c: static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
+ packet_buf_delim(req_buf);
+ }
+
++
++void send_object_info_request(int fd_out, struct object_info_args *args)
++{
++ struct strbuf req_buf = STRBUF_INIT;
++
++ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
++
++ if (unsorted_string_list_has_string(args->object_info_options, "size"))
++ packet_buf_write(&req_buf, "size");
++
++ if (args->oids) {
++ for (size_t i = 0; i < args->oids->nr; i++)
++ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
++ }
++
++ packet_buf_flush(&req_buf);
++ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
++ die_errno(_("unable to write request to remote"));
++
++ strbuf_release(&req_buf);
++}
++
+ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
+ struct fetch_pack_args *args,
+ const struct ref *wants, struct oidset *common,
@@ fetch-pack.c: static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
-+ write_command_and_capabilities(&req_buf, args->server_options, "fetch");
++ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ fetch-pack.c: void negotiate_using_fetch(const struct oid_array *negotiation_tip
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
-+ write_command_and_capabilities(&req_buf, server_options, "fetch");
++ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
2: 64ec1ab9f9 = 2: e8777e8776 fetch-pack: move fetch initialization
3: 68c35ab6c1 = 3: d00d19cf2c serve: advertise object-info feature
4: 3e5a65ab46 ! 4: 3e1773910c transport: add client support for object-info
@@ fetch-pack.c: static void write_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
-+void send_object_info_request(int fd_out, struct object_info_args *args)
-+{
-+ struct strbuf req_buf = STRBUF_INIT;
-+
-+ write_command_and_capabilities(&req_buf, args->server_options, "object-info");
-+
-+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
-+ packet_buf_write(&req_buf, "size");
-+
-+ if (args->oids) {
-+ for (size_t i = 0; i < args->oids->nr; i++)
-+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
-+ }
-+
-+ packet_buf_flush(&req_buf);
-+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
-+ die_errno(_("unable to write request to remote"));
-+
-+ strbuf_release(&req_buf);
-+}
-+
- static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
- struct fetch_pack_args *args,
- const struct ref *wants, struct oidset *common,
+-
+ void send_object_info_request(int fd_out, struct object_info_args *args)
+ {
+ struct strbuf req_buf = STRBUF_INIT;
@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
@@ transport-helper.c: static int fetch_refs(struct transport *transport,
warning(_("--negotiate-only requires protocol v2"));
return -1;
}
-+ if (transport->smart_options->object_info) {
-+ // fail the command explicitly to avoid further commands input
-+ die(_("remote-object-info requires protocol v2"));
-+ }
++ /* Fail the command explicitly to avoid accepting further command input. */
++ if (transport->smart_options->object_info)
++ die(_("remote-object-info requires protocol v2"));
++
if (!data->get_refs_list_called)
get_refs_list_using_list(transport, 0);
+
## transport.c ##
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_push,
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args.object_info_options, "size")
-+ && !server_supports_feature("object-info", "size", 0)) {
++ && !server_supports_feature("object-info", "size", 0))
+ return -1;
-+ }
+ send_object_info_request(data->fd[1], &args);
+ break;
+ case protocol_v1:
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
+ if (!strcmp(reader.line, "size")) {
+ size_index = i;
-+ for (size_t j = 0; j < args.oids->nr; j++) {
++ for (size_t j = 0; j < args.oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
-+ }
+ }
+ continue;
+ }
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
-+
+ }
+
+ string_list_clear(&object_info_values, 0);
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
-+ /* last element, no need to allocat to next */
++ /* last element, no need to allocate to next */
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
-+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid, &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
++ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
++ &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
5: eb1c87f3fd = 5: bb110fbc93 cat-file: add declaration of variable i inside its for loop
6: e33c1f93bc ! 6: 6dd143c164 cat-file: add remote-object-info to batch-command
@@ Documentation/git-cat-file.txt: info <object>::
+ Print object info for object references `<object>` at the specified `<remote>` without
+ downloading objects from remote. If the object-info capability is not
+ supported by the server, the objects will be downloaded instead.
-+ Error when no object references is provided.
++ Error when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
@@ Documentation/git-cat-file.txt: newline. The available atoms are:
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
-+%(objecttype) %(objectsize)`, except remote-object-info command who uses
++%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+When "%(objecttype)" is supported, the default format should be unified.
@@ Documentation/git-cat-file.txt: scripting purposes.
CAVEATS
-------
-+Note that since objecttype, objectsize:disk and deltabase are currently not supported by the
-+remote-object-info, git will error and exit when they are in the format string.
-+
++Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are currently not supported by the
++`remote-object-info` command, we will error and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
@@ builtin/cat-file.c: static void batch_one_object(const char *obj_name,
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
-+ /*
-+ * 'size' is the only option currently supported.
-+ * Other options that are passed in the format will exit with error.
-+ */
-+ if (strstr(opt->format, "%(objectsize)")) {
-+ string_list_append(&object_info_options, "size");
-+ } else {
++
++ /* 'objectsize' is the only option currently supported */
++ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
-+ }
++
++ string_list_append(&object_info_options, "size");
++
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
+ data->size = *remote_object_info[i].sizep;
+ } else {
+ /*
-+ * When reaching here, it means remote-object-info can't retrive
-+ * infomation from server withoug downloading them, and the objects
++ * When reaching here, it means remote-object-info can't retrieve
++ * information from the server without downloading the objects, and the
+ * objects have already been fetched to the client.
-+ * Print the infomation using the logic for local objects.
++ * Print the information using the logic for local objects.
+ */
+ data->skip_object_info = 0;
+ }
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ builtin/cat-file.c: static const struct parse_cmd {
- parse_cmd_fn_t fn;
- unsigned takes_args;
} commands[] = {
-- { "contents", parse_cmd_contents, 1},
-- { "info", parse_cmd_info, 1},
-- { "flush", NULL, 0},
-+ { "contents", parse_cmd_contents, 1 },
-+ { "info", parse_cmd_info, 1 },
-+ { "remote-object-info", parse_cmd_remote_object_info, 1 },
-+ { "flush", NULL, 0 },
+ { "contents", parse_cmd_contents, 1},
+ { "info", parse_cmd_info, 1},
++ { "remote-object-info", parse_cmd_remote_object_info, 1},
+ { "flush", NULL, 0},
};
- static void batch_objects_command(struct batch_options *opt,
## object-file.c ##
@@ object-file.c: int read_loose_object(const char *path,
@@ t/t1017-cat-file-remote-object-info.sh (new)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
-+ hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
-+ tree_sha1=$(git -C "$1" write-tree)
-+ commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
-+ tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
++ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
++ tree_oid=$(git -C "$1" write-tree)
++ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
++ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
@@ t/t1017-cat-file-remote-object-info.sh (new)
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
-+# Test --batch-command remote-object-info with 'git://' transport
++# Test --batch-command remote-object-info with 'git://' transport with
++# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1
-+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_sha1
-+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1
-+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
++ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
++ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
++ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
-+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
-+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
++ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
++ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
-+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
-+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
++ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
-+ printf "%s\0" "$hello_sha1 $hello_size" >expect &&
-+ printf "%s\0" "$tree_sha1 $tree_size" >>expect &&
-+ printf "%s\0" "$commit_sha1 $commit_size" >>expect &&
-+ printf "%s\0" "$tag_sha1 $tag_size" >>expect &&
-+
-+ printf "%s\0" "$hello_sha1 missing" >>expect &&
-+ printf "%s\0" "$tree_sha1 missing" >>expect &&
-+ printf "%s\0" "$commit_sha1 missing" >>expect &&
-+ printf "%s\0" "$tag_sha1 missing" >>expect &&
-+
-+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_sha1 $tree_sha1
-+remote-object-info $GIT_DAEMON_URL/parent $commit_sha1 $tag_sha1
-+info $hello_sha1
-+info $tree_sha1
-+info $commit_sha1
-+info $tag_sha1
++ printf "%s\0" "$hello_oid $hello_size" >expect &&
++ printf "%s\0" "$tree_oid $tree_size" >>expect &&
++ printf "%s\0" "$commit_oid $commit_size" >>expect &&
++ printf "%s\0" "$tag_oid $tag_size" >>expect &&
++
++ printf "%s\0" "$hello_oid missing" >>expect &&
++ printf "%s\0" "$tree_oid missing" >>expect &&
++ printf "%s\0" "$commit_oid missing" >>expect &&
++ printf "%s\0" "$tag_oid missing" >>expect &&
++
++ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
++remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
++info $hello_oid
++info $tree_oid
++info $commit_oid
++info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # Prove object is not on the client
-+ echo "$hello_sha1 missing" >expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
-+ echo "$hello_sha1 $hello_size" >>expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >>expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
-+ echo "$hello_sha1 $hello_size" >>expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >>expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
-+ remote-object-info $GIT_DAEMON_URL/parent $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
++ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+
+ # revert server state back
@@ t/t1017-cat-file-remote-object-info.sh (new)
+
+stop_git_daemon
+
-+# Test --batch-command remote-object-info with 'file://' transport
++# Test --batch-command remote-object-info with 'file://' transport with
++# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ remote-object-info "file://${server_path}" $hello_sha1
-+ remote-object-info "file://${server_path}" $tree_sha1
-+ remote-object-info "file://${server_path}" $commit_sha1
-+ remote-object-info "file://${server_path}" $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "file://${server_path}" $hello_oid
++ remote-object-info "file://${server_path}" $tree_oid
++ remote-object-info "file://${server_path}" $commit_oid
++ remote-object-info "file://${server_path}" $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
-+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1
-+ remote-object-info "file://${server_path}" $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "file://${server_path}" $hello_oid $tree_oid
++ remote-object-info "file://${server_path}" $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
-+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1
-+ remote-object-info "file://${server_path}" $commit_sha1 $tag_sha1
++ remote-object-info "file://${server_path}" $hello_oid $tree_oid
++ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
-+ printf "%s\0" "$hello_sha1 $hello_size" >expect &&
-+ printf "%s\0" "$tree_sha1 $tree_size" >>expect &&
-+ printf "%s\0" "$commit_sha1 $commit_size" >>expect &&
-+ printf "%s\0" "$tag_sha1 $tag_size" >>expect &&
-+
-+ printf "%s\0" "$hello_sha1 missing" >>expect &&
-+ printf "%s\0" "$tree_sha1 missing" >>expect &&
-+ printf "%s\0" "$commit_sha1 missing" >>expect &&
-+ printf "%s\0" "$tag_sha1 missing" >>expect &&
-+
-+ batch_input="remote-object-info \"file://${server_path}\" $hello_sha1 $tree_sha1
-+remote-object-info \"file://${server_path}\" $commit_sha1 $tag_sha1
-+info $hello_sha1
-+info $tree_sha1
-+info $commit_sha1
-+info $tag_sha1
++ printf "%s\0" "$hello_oid $hello_size" >expect &&
++ printf "%s\0" "$tree_oid $tree_size" >>expect &&
++ printf "%s\0" "$commit_oid $commit_size" >>expect &&
++ printf "%s\0" "$tag_oid $tag_size" >>expect &&
++
++ printf "%s\0" "$hello_oid missing" >>expect &&
++ printf "%s\0" "$tree_oid missing" >>expect &&
++ printf "%s\0" "$commit_oid missing" >>expect &&
++ printf "%s\0" "$tag_oid missing" >>expect &&
++
++ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
++remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
++info $hello_oid
++info $tree_oid
++info $commit_oid
++info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd file_client_empty &&
+
+ # Prove object is not on the client
-+ echo "$hello_sha1 missing" >expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
-+ echo "$hello_sha1 $hello_size" >>expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >>expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
-+ echo "$hello_sha1 $hello_size" >>expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >>expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
-+ remote-object-info "file://${server_path}" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
++ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+
+ # revert server state back
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_sha1
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
++ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
++ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
++ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
-+ echo "$hello_sha1 missing" >>expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >>expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
++ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
-+ echo "$hello_sha1 $hello_size" >expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
++ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
-+ printf "%s\0" "$hello_sha1 $hello_size" >expect &&
-+ printf "%s\0" "$tree_sha1 $tree_size" >>expect &&
-+ printf "%s\0" "$commit_sha1 $commit_size" >>expect &&
-+ printf "%s\0" "$tag_sha1 $tag_size" >>expect &&
++ printf "%s\0" "$hello_oid $hello_size" >expect &&
++ printf "%s\0" "$tree_oid $tree_size" >>expect &&
++ printf "%s\0" "$commit_oid $commit_size" >>expect &&
++ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
-+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_sha1 $tree_sha1
-+remote-object-info $HTTPD_URL/smart/http_parent $commit_sha1 $tag_sha1
++ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
++remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # Prove object is not on the client
-+ echo "$hello_sha1 missing" >expect &&
-+ echo "$tree_sha1 missing" >>expect &&
-+ echo "$commit_sha1 missing" >>expect &&
-+ echo "$tag_sha1 missing" >>expect &&
++ echo "$hello_oid missing" >expect &&
++ echo "$tree_oid missing" >>expect &&
++ echo "$commit_oid missing" >>expect &&
++ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
-+ echo "$hello_sha1 $hello_size" >>expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >>expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
-+ echo "$hello_sha1 $hello_size" >>expect &&
-+ echo "$tree_sha1 $tree_size" >>expect &&
-+ echo "$commit_sha1 $commit_size" >>expect &&
-+ echo "$tag_sha1 $tag_size" >>expect &&
++ echo "$hello_oid $hello_size" >>expect &&
++ echo "$tree_oid $tree_size" >>expect &&
++ echo "$commit_oid $commit_size" >>expect &&
++ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
-+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
-+ info $hello_sha1
-+ info $tree_sha1
-+ info $commit_sha1
-+ info $tag_sha1
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
++ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
++ info $hello_oid
++ info $tree_oid
++ info $commit_oid
++ info $tag_oid
+ EOF
+
+ # revert server state back
--
2.46.0
* [PATCH v3 1/6] fetch-pack: refactor packet writing
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-09-26 1:38 ` [PATCH v3 2/6] fetch-pack: move fetch initialization Eric Ju
` (4 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() into a more
general-purpose function, write_command_and_capabilities(), so that it
can be used by both fetch and future commands.
Here "command" means an operation supported by Git's wire protocol
(https://git-scm.com/docs/protocol-v2). Examples are a Git subcommand,
such as git-fetch(1), or a server-side operation, such as
"object-info", implemented in "a2ba162cda
(object-info: support for retrieving object info, 2021-04-20)".
In a future separate series, we can move
write_command_and_capabilities() to a higher-level file, such as
connect.c, so that it becomes accessible to other commands.
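As an illustration of what the generalized function puts on the wire, here is a minimal standalone sketch of pkt-line framing for a "command=<name>" line followed by a delimiter packet. It is not the code in fetch-pack.c: pkt_line_append() and write_command() are hypothetical helpers, and the sketch leaves out the agent, session-id, object-format and server-option lines that the real function also writes.

```c
#include <stdio.h>
#include <string.h>

#define PKT_MAX 65520 /* largest pkt-line, length prefix included */

/*
 * Hypothetical helper, not part of Git: write one pkt-line into buf.
 * A pkt-line is a 4-digit hex length (which counts the 4 length
 * bytes themselves) followed by the payload.
 * Returns the number of bytes written, or 0 if it does not fit.
 */
static size_t pkt_line_append(char *buf, size_t cap, const char *payload)
{
	size_t len = strlen(payload) + 4;

	if (len > PKT_MAX || len + 1 > cap)
		return 0;
	snprintf(buf, cap, "%04zx%s", len, payload);
	return len;
}

/* Write "command=<name>" and then a delimiter packet ("0001"). */
static size_t write_command(char *buf, size_t cap, const char *command)
{
	char payload[64];
	int w = snprintf(payload, sizeof(payload), "command=%s", command);
	size_t n;

	if (w < 0 || (size_t)w >= sizeof(payload))
		return 0; /* command name too long for this sketch */
	n = pkt_line_append(buf, cap, payload);
	if (!n || n + 5 > cap)
		return 0;
	memcpy(buf + n, "0001", 5); /* copies the trailing NUL too */
	return n + 4;
}
```

With a 128-byte buffer, write_command(buf, sizeof(buf), "fetch") leaves "0011command=fetch0001" in buf, and passing "object-info" instead leaves "0017command=object-info0001". Only the command name differs between the two requests, which is why a single parameterized function is enough.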
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index f752da93a8..756fb83f89 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1314,13 +1314,14 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
+static void write_command_and_capabilities(struct strbuf *req_buf,
+ const char *command,
+ const struct string_list *server_options)
{
const char *hash_name;
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
if (server_supports_v2("agent"))
packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
if (advertise_sid && server_supports_v2("session-id"))
@@ -1346,6 +1347,28 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
+
+void send_object_info_request(int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1356,7 +1379,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2174,7 +2197,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.46.0
* [PATCH v3 2/6] fetch-pack: move fetch initialization
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
2024-09-26 1:38 ` [PATCH v3 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-09-26 1:38 ` [PATCH v3 3/6] serve: advertise object-info feature Eric Ju
` (3 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
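The effect of the move can be sketched with a toy state machine. This is only an illustration, assuming made-up names (fetch_state, fetch_ctx, run_state_machine); it is not do_fetch_pack_v2(). The shared setting is assigned before the loop, so it is in place whichever state the machine starts in.

```c
/* Hypothetical states standing in for the FETCH_* states. */
enum fetch_state { STATE_CHECK_LOCAL, STATE_SEND_REQUEST, STATE_DONE };

struct fetch_ctx {
	int use_sideband;  /* shared setting every state relies on */
	int requests_sent; /* how many times SEND_REQUEST ran */
};

static struct fetch_ctx run_state_machine(enum fetch_state start)
{
	struct fetch_ctx ctx = { 0, 0 };
	enum fetch_state state = start;

	/* Initialized before the loop, not inside the first state. */
	ctx.use_sideband = 2;

	while (state != STATE_DONE) {
		switch (state) {
		case STATE_CHECK_LOCAL:
			state = STATE_SEND_REQUEST;
			break;
		case STATE_SEND_REQUEST:
			ctx.requests_sent++;
			state = STATE_DONE;
			break;
		case STATE_DONE:
			break;
		}
	}
	return ctx;
}
```

Starting from STATE_CHECK_LOCAL or directly from STATE_SEND_REQUEST yields the same use_sideband value. Had the assignment stayed inside the STATE_CHECK_LOCAL case, a run starting at STATE_SEND_REQUEST would have skipped it.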
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 756fb83f89..800505f25f 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1700,18 +1700,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.46.0
* [PATCH v3 3/6] serve: advertise object-info feature
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
2024-09-26 1:38 ` [PATCH v3 1/6] fetch-pack: refactor packet writing Eric Ju
2024-09-26 1:38 ` [PATCH v3 2/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-09-26 1:38 ` [PATCH v3 4/6] transport: add client support for object-info Eric Ju
` (2 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
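On the client side, the decision described above amounts to looking for a feature name in the value of an advertised capability line such as "object-info=size". The following is a minimal sketch of that check, assuming a space-separated feature list after the "="; capability_has_feature() is a hypothetical helper written for this illustration, not the function Git uses.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if the capability line has the form
 * "<cap>=<feature list>" and the space-separated list contains
 * <feature> as a whole word, 0 otherwise.
 */
static int capability_has_feature(const char *line, const char *cap,
				  const char *feature)
{
	size_t cap_len = strlen(cap);
	size_t feat_len = strlen(feature);
	const char *p;

	if (strncmp(line, cap, cap_len) || line[cap_len] != '=')
		return 0; /* other capability, or no value advertised */

	p = line + cap_len + 1;
	while (*p) {
		size_t n = strcspn(p, " "); /* length of the next word */

		if (n == feat_len && !memcmp(p, feature, n))
			return 1;
		p += n;
		while (*p == ' ')
			p++;
	}
	return 0;
}
```

A bare "object-info" line with no value yields 0 for "size", so a client written this way would take the fetch fallback for a server that advertises the command without the feature.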
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index d674764a25..c3d8098642 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.46.0
* [PATCH v3 4/6] transport: add client support for object-info
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
` (2 preceding siblings ...)
2024-09-26 1:38 ` [PATCH v3 3/6] serve: advertise object-info feature Eric Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-10-23 9:48 ` Christian Couder
2024-09-26 1:38 ` [PATCH v3 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-09-26 1:38 ` [PATCH v3 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
in “a2ba162cda (object-info: support for retrieving object info,
2021-04-20)”.
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
feature 'size' from a v2 server. If a server does not
advertise the feature, then the client falls back
to making the request through 'fetch'.
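For a 'size' request, each per-object response line carries the object id and its size separated by a space. Below is a minimal standalone sketch of parsing one such line. parse_obj_info() is a hypothetical helper and not the code in transport.c; like the patch, it treats an empty size field as an object the server could not describe.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "<oid> <size>" into oid_hex and *size.
 * Returns 0 on success, -1 on a malformed line or an empty size.
 */
static int parse_obj_info(const char *line, char *oid_hex, size_t oid_cap,
			  unsigned long *size)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;

	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;
	/* Rejects an empty size as well as signs and leading blanks. */
	if (!isdigit((unsigned char)sp[1]))
		return -1;

	errno = 0;
	*size = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1;

	memcpy(oid_hex, line, oid_len);
	oid_hex[oid_len] = '\0';
	return 0;
}
```

Checking the end pointer and errno rejects trailing garbage and overflow, which a bare strtoul() call would accept silently.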
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 4 +-
fetch-pack.h | 10 ++++
transport-helper.c | 8 +++-
transport.c | 116 +++++++++++++++++++++++++++++++++++++++++++--
transport.h | 11 +++++
5 files changed, 141 insertions(+), 8 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 800505f25f..1a9facc1c0 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1347,7 +1347,6 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
-
void send_object_info_request(int fd_out, struct object_info_args *args)
{
struct strbuf req_buf = STRBUF_INIT;
@@ -1706,6 +1705,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..5a5211e355 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
@@ -68,6 +70,12 @@ struct fetch_pack_args {
unsigned connectivity_checked:1;
};
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
/*
* sought represents remote references that should be updated from.
* On return, the names that were found on the remote will have been
@@ -106,4 +114,6 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
*/
int fetch_pack_fsck_objects(void);
+void send_object_info_request(int fd_out, struct object_info_args *args);
+
#endif
diff --git a/transport-helper.c b/transport-helper.c
index c688967b8c..7db74b4ad7 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -709,14 +709,18 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
+ * require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
return -1;
}
+ /* Fail explicitly so that no further commands are read from input. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
if (!data->get_refs_list_called)
get_refs_list_using_list(transport, 0);
diff --git a/transport.c b/transport.c
index 3c4714581f..8266e347b3 100644
--- a/transport.c
+++ b/transport.c
@@ -368,6 +368,77 @@ static struct ref *handshake(struct transport *transport, int for_push,
return refs;
}
+static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
+{
+ int size_index = -1;
+ struct git_transport_data *data = transport->data;
+ struct object_info_args args = { 0 };
+ struct packet_reader reader;
+
+ args.server_options = transport->server_options;
+ args.object_info_options = transport->smart_options->object_info_options;
+ args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+ data->version = discover_version(&reader);
+
+ transport->hash_algo = reader.hash_algo;
+
+ switch (data->version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args.object_info_options, "size")
+ && !server_supports_feature("object-info", "size", 0))
+ return -1;
+ send_object_info_request(data->fd[1], &args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args.object_info_options->nr; i++) {
+ if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
+ if (!strcmp(reader.line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args.oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++) {
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader.line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+
+ return 0;
+}
+
static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
struct transport_ls_refs_options *options)
{
@@ -415,6 +486,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -441,11 +513,36 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct ref *ref_itr = object_info_refs = alloc_ref("");
+
+ if (!fetch_object_info(transport, data->options.object_info_data))
+ goto cleanup;
+
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
+ /* last element, no need to allocate to next */
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
- if (!data->finished_handshake) {
- int i;
+ ref_itr = ref_itr->next;
+ }
+
+ transport->remote_refs = object_info_refs;
+
+ } else if (!data->finished_handshake) {
int must_list_refs = 0;
- for (i = 0; i < nr_heads; i++) {
+ for (int i = 0; i < nr_heads; i++) {
if (!to_fetch[i]->exact_oid) {
must_list_refs = 1;
break;
@@ -483,23 +580,32 @@ static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
+ &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
+
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
- if (!refs)
+ if (!refs && !args.object_info)
ret = -1;
if (report_unmatched_refs(to_fetch, nr_heads))
ret = -1;
cleanup:
+ free_refs(object_info_refs);
close(data->fd[0]);
if (data->fd[1] >= 0)
close(data->fd[1]);
if (finish_connect(data->conn))
ret = -1;
data->conn = NULL;
-
free_refs(refs_tmp);
free_refs(refs);
list_objects_filter_release(&args.filter_options);
diff --git a/transport.h b/transport.h
index 6393cd9823..50ea2b05cf 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to pull only object-info. Falls back
+ * to pulling the entire object if object-info is not supported.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.46.0
* [PATCH v3 5/6] cat-file: add declaration of variable i inside its for loop
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
` (3 preceding siblings ...)
2024-09-26 1:38 ` [PATCH v3 4/6] transport: add client support for object-info Eric Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-09-26 1:38 ` [PATCH v3 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index bfdfb51c7c..5db55fabc4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -673,12 +673,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -686,9 +684,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -714,7 +710,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -724,7 +719,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
--
2.46.0
* [PATCH v3 6/6] cat-file: add remote-object-info to batch-command
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
` (4 preceding siblings ...)
2024-09-26 1:38 ` [PATCH v3 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-09-26 1:38 ` Eric Ju
2024-10-23 9:49 ` Christian Couder
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-09-26 1:38 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which would create overhead when
making requests to a server, so `remote-object-info` can instead take
multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing each entry.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
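Taking several object ids on one line means the command's arguments have to be split into one remote and a list of oids before a single request is made. The following is a minimal sketch of that split, assuming space-separated arguments; split_remote_object_info() is a hypothetical helper and not the parser in builtin/cat-file.c.

```c
#include <string.h>

/*
 * Hypothetical helper: split "<remote> <oid> <oid> ..." in place.
 * Stores the remote in *remote and up to max_oids pointers in oids.
 * Returns the number of oids, or -1 if there is no remote, no oid,
 * or more oids than max_oids.
 */
static int split_remote_object_info(char *args, const char **remote,
				    const char **oids, int max_oids)
{
	int nr = 0;
	char *tok = strtok(args, " ");

	if (!tok)
		return -1;
	*remote = tok;

	while ((tok = strtok(NULL, " "))) {
		if (nr == max_oids)
			return -1;
		oids[nr++] = tok;
	}
	return nr ? nr : -1; /* at least one oid is required */
}
```

Given "origin aaa bbb ccc", the helper returns 3 with the remote set to "origin", so all three oids can go into one request. The one-oid-per-line form would need three calls and, in non-buffer mode, three round trips to the server.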
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 22 +-
builtin/cat-file.c | 108 +++-
object-file.c | 11 +
object-store-ll.h | 3 +
t/t1017-cat-file-remote-object-info.sh | 750 +++++++++++++++++++++++++
5 files changed, 889 insertions(+), 5 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..5c81af80c9 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at the specified `<remote>` without
+ downloading objects from the remote. If the object-info capability is not
+ supported by the server, the objects will be downloaded instead.
+ It is an error if no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,9 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands, which use
+`%(objectname) %(objectsize)` for now because `%(objecttype)` is not supported yet.
+Once `%(objecttype)` is supported, the two default formats should be unified.
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +407,9 @@ scripting purposes.
CAVEATS
-------
+Note that since `%(objecttype)`, `%(objectsize:disk)` and `%(deltabase)` are currently not supported by the
+`remote-object-info` command, `cat-file` errors out and exits when any of them is in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5db55fabc4..714c182f39 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -528,7 +534,7 @@ static void batch_one_object(const char *obj_name,
enum get_oid_result result;
result = get_oid_with_context(the_repository, obj_name,
- flags, &data->oid, &ctx);
+ flags, &data->oid, &ctx);
if (result != FOUND) {
switch (result) {
case MISSING_OBJECT:
@@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when the
+ * remote-object-info command is used. Once objecttype is supported,
+ * the default format should change to DEFAULT_FORMAT.
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (int i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -667,6 +726,52 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line,
+ struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ data->size = *remote_object_info[i].sizep;
+ } else {
+ /*
+ * Reaching here means the server could not provide the
+ * object info without sending the objects themselves, so the
+ * objects have already been fetched to the client.
+ * Print the information using the logic for local objects.
+ */
+ data->skip_object_info = 0;
+ }
+
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -698,6 +803,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index 7ac9533ab1..485aa51dc6 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3020,3 +3020,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index 53b8e693b1..611e2ca708 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -548,4 +548,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..6826ff7a59
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,750 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate it: 1. run `git cat-file commit <commit hash> | wc -c`
+# to get 177, 2. then subtract the 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with the remote-object-info command.
+# Since "%(objecttype)" is currently not supported by remote-object-info,
+# the format is set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback git://: fetch objects to client' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # restore the server state
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback file://: fetch objects to client' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+ cd file_client_empty &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # restore the server state
+ git -C "${server_path}" config transfer.advertiseobjectinfo true &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback http://: fetch objects to client' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.46.0
* Re: [PATCH v3 4/6] transport: add client support for object-info
2024-09-26 1:38 ` [PATCH v3 4/6] transport: add client support for object-info Eric Ju
@ 2024-10-23 9:48 ` Christian Couder
2024-10-24 20:23 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Christian Couder @ 2024-10-23 9:48 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Thu, Sep 26, 2024 at 3:39 AM Eric Ju <eric.peijian@gmail.com> wrote:
> + if (transport->smart_options
> + && transport->smart_options->object_info
> + && transport->smart_options->object_info_oids->nr > 0) {
> + struct ref *ref_itr = object_info_refs = alloc_ref("");
> +
> + if (!fetch_object_info(transport, data->options.object_info_data))
> + goto cleanup;
> +
> + args.object_info_data = data->options.object_info_data;
> + args.quiet = 1;
> + args.no_progress = 1;
> + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> + ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
> + ref_itr->exact_oid = 1;
> + if (i == transport->smart_options->object_info_oids->nr - 1)
> + /* last element, no need to allocate to next */
> + ref_itr -> next = NULL;
Space characters should be removed around "->".
> + else
> + ref_itr->next = alloc_ref("");
Here for example there are no spaces around "->".
* Re: [PATCH v3 6/6] cat-file: add remote-object-info to batch-command
2024-09-26 1:38 ` [PATCH v3 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-10-23 9:49 ` Christian Couder
2024-10-23 20:25 ` Taylor Blau
2024-10-24 20:28 ` Peijian Ju
0 siblings, 2 replies; 174+ messages in thread
From: Christian Couder @ 2024-10-23 9:49 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Thu, Sep 26, 2024 at 3:39 AM Eric Ju <eric.peijian@gmail.com> wrote:
> And finally for --buffer mode `remote-object-info`:
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue:
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through and print each object info
> Else:
> - Call respective function attached to command
> - Get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then loop through the object info passed in, print the info.
Maybe: s/print the info/printing the info/
> In order for remote-object-info to avoid remote communication overhead
> in the non-buffer mode, the objects are passed in as such:
>
> remote-object-info <remote> <oid> <oid> ... <oid>
>
> rather than
>
> remote-object-info <remote> <oid>
> remote-object-info <remote> <oid>
> ...
> remote-object-info <remote> <oid>
[...]
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +When "%(objecttype)" is supported, default format should be unified.
I think we should warn more clearly and strongly that the default
format will change, so users should not rely on the current format
in their code.
Maybe something like:
`%(objectname) %(objectsize)` for now because "%(objecttype)" is not
supported yet.
WARNING: When "%(objecttype)" is supported, default format WILL be unified, so
DO NOT RELY on the current default format to stay the same!!!
> If `--batch` is specified, or if `--batch-command` is used with the `contents`
> command, the object information is followed by the object contents (consisting
[...]
> diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> new file mode 100755
> index 0000000000..6826ff7a59
> --- /dev/null
> +++ b/t/t1017-cat-file-remote-object-info.sh
> @@ -0,0 +1,750 @@
> +#!/bin/sh
> +
> +test_description='git cat-file --batch-command with remote-object-info command'
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +
> +echo_without_newline () {
> + printf '%s' "$*"
> +}
> +
> +echo_without_newline_nul () {
> + echo_without_newline "$@" | tr '\n' '\0'
> +}
> +
> +strlen () {
> + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> +}
The above functions have been copied verbatim from t1006-cat-file.sh.
I think this is worth a comment or a TODO before these functions
saying that common code might want to be unified in the future.
Maybe something like:
# TODO: refactor these functions which were copied from
t1006-cat-file.sh into a new common file, maybe "lib-cat-file.sh"
Except the above nits and another one I found in patch 4/6, the rest
of this patch series looks good to me.
Thanks!
* Re: [PATCH v3 6/6] cat-file: add remote-object-info to batch-command
2024-10-23 9:49 ` Christian Couder
@ 2024-10-23 20:25 ` Taylor Blau
2024-10-24 20:28 ` Peijian Ju
2024-10-24 20:28 ` Peijian Ju
1 sibling, 1 reply; 174+ messages in thread
From: Taylor Blau @ 2024-10-23 20:25 UTC (permalink / raw)
To: Christian Couder
Cc: Eric Ju, git, calvinwan, jonathantanmy, chriscool, karthik.188,
toon, jltobler
On Wed, Oct 23, 2024 at 11:49:44AM +0200, Christian Couder wrote:
> Except the above nits and another one I found in patch 4/6, the rest
> of this patch series looks good to me.
Thanks for reviewing. Sounds like we are expecting another round here. In
the meantime, do other reviewers have any feedback on this series?
Thanks,
Taylor
* Re: [PATCH v3 4/6] transport: add client support for object-info
2024-10-23 9:48 ` Christian Couder
@ 2024-10-24 20:23 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-10-24 20:23 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Wed, Oct 23, 2024 at 5:48 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Thu, Sep 26, 2024 at 3:39 AM Eric Ju <eric.peijian@gmail.com> wrote:
>
> > + if (transport->smart_options
> > + && transport->smart_options->object_info
> > + && transport->smart_options->object_info_oids->nr > 0) {
> > + struct ref *ref_itr = object_info_refs = alloc_ref("");
> > +
> > + if (!fetch_object_info(transport, data->options.object_info_data))
> > + goto cleanup;
> > +
> > + args.object_info_data = data->options.object_info_data;
> > + args.quiet = 1;
> > + args.no_progress = 1;
> > + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> > + ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
> > + ref_itr->exact_oid = 1;
> > + if (i == transport->smart_options->object_info_oids->nr - 1)
> > + /* last element, no need to allocate to next */
> > + ref_itr -> next = NULL;
>
> Space characters should be removed around "->".
Thank you. It is fixed in v4.
>
> > + else
> > + ref_itr->next = alloc_ref("");
>
> Here for example there are no spaces around "->".
* Re: [PATCH v3 6/6] cat-file: add remote-object-info to batch-command
2024-10-23 9:49 ` Christian Couder
2024-10-23 20:25 ` Taylor Blau
@ 2024-10-24 20:28 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-10-24 20:28 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Wed, Oct 23, 2024 at 5:49 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Thu, Sep 26, 2024 at 3:39 AM Eric Ju <eric.peijian@gmail.com> wrote:
>
> > And finally for --buffer mode `remote-object-info`:
> > - Receive and parse input from user
> > - Store respective function attached to command in a queue
> > - After flush, loop through commands in queue:
> > If command is `remote-object-info`:
> > - Get object info from remote
> > - Loop through and print each object info
> > Else:
> > - Call respective function attached to command
> > - Get object info, print object info
> >
> > To summarize, `remote-object-info` gets object info from the remote and
> > then loop through the object info passed in, print the info.
>
> Maybe: s/print the info/printing the info/
Thank you. Fixed in V4.
>
> > In order for remote-object-info to avoid remote communication overhead
> > in the non-buffer mode, the objects are passed in as such:
> >
> > remote-object-info <remote> <oid> <oid> ... <oid>
> >
> > rather than
> >
> > remote-object-info <remote> <oid>
> > remote-object-info <remote> <oid>
> > ...
> > remote-object-info <remote> <oid>
>
> [...]
>
> > If no format is specified, the default format is `%(objectname)
> > -%(objecttype) %(objectsize)`.
> > +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> > +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> > +When "%(objecttype)" is supported, default format should be unified.
>
> I think we should warn more clearly and strongly that the default
> format will change, so users should not rely on the current format
> in their code.
>
> Maybe something like:
>
> `%(objectname) %(objectsize)` for now because "%(objecttype)" is not
> supported yet.
> WARNING: When "%(objecttype)" is supported, default format WILL be unified, so
> DO NOT RELY on the current default format to stay the same!!!
>
Thank you. The warning is added to v4.
> > If `--batch` is specified, or if `--batch-command` is used with the `contents`
> > command, the object information is followed by the object contents (consisting
>
> [...]
>
> > diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
> > new file mode 100755
> > index 0000000000..6826ff7a59
> > --- /dev/null
> > +++ b/t/t1017-cat-file-remote-object-info.sh
> > @@ -0,0 +1,750 @@
> > +#!/bin/sh
> > +
> > +test_description='git cat-file --batch-command with remote-object-info command'
> > +
> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> > +
> > +. ./test-lib.sh
> > +
> > +echo_without_newline () {
> > + printf '%s' "$*"
> > +}
> > +
> > +echo_without_newline_nul () {
> > + echo_without_newline "$@" | tr '\n' '\0'
> > +}
> > +
> > +strlen () {
> > + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> > +}
>
> The above functions have been copied verbatim from t1006-cat-file.sh.
> I think this is worth a comment or a TODO before these functions
> saying that common code might want to be unified in the future.
>
> Maybe something like:
>
> # TODO: refactor these functions which were copied from
> t1006-cat-file.sh into a new common file, maybe "lib-cat-file.sh"
>
Thank you. I added "lib-cat-file.sh" in v4, and both t1006-cat-file.sh
and t1017-cat-file-remote-object-info.sh now source it.
> Except the above nits and another one I found in patch 4/6, the rest
> of this patch series looks good to me.
>
> Thanks!
* Re: [PATCH v3 6/6] cat-file: add remote-object-info to batch-command
2024-10-23 20:25 ` Taylor Blau
@ 2024-10-24 20:28 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-10-24 20:28 UTC (permalink / raw)
To: Taylor Blau
Cc: Christian Couder, git, calvinwan, jonathantanmy, chriscool,
karthik.188, toon, jltobler
On Wed, Oct 23, 2024 at 4:25 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Wed, Oct 23, 2024 at 11:49:44AM +0200, Christian Couder wrote:
> > Except the above nits and another one I found in patch 4/6, the rest
> > of this patch series looks good to me.
>
> Thanks for reviewing. Sounds like we are expecting another round here. In
> the meantime, do other reviewers have any feedback on this series?
>
> Thanks,
> Taylor
Thank you, Taylor. No feedback from other reviewers yet.
* [PATCH v4 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (8 preceding siblings ...)
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-24 20:53 ` [PATCH v4 1/6] fetch-pack: refactor packet writing Eric Ju
` (6 more replies)
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
` (6 subsequent siblings)
16 siblings, 7 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This is a continuation of Calvin Wan's (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
Sometimes it is useful to get information about an object without having to download
it completely. The server logic for retrieving size has already been implemented and merged in
"a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
This patch series implements the client option for it.
This patch series adds the `remote-object-info` command to `cat-file --batch-command`.
This command allows the client to make an object-info command request to a server
that supports protocol v2. If the server is v2, but does not have
object-info capability, the entire object is fetched and the
relevant object info is returned.
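As an illustration only (not part of the series), the batched request
line described above could be assembled like this. The helper name, the
example URL, and the placeholder OIDs are hypothetical; the quoting of
the remote mirrors what the tests in t1017 feed to the command.

```shell
#!/bin/sh
# Sketch only: build one batched `remote-object-info` request line.
# remote_object_info_line is a hypothetical helper, not part of Git.

# usage: remote_object_info_line <remote> <oid>...
remote_object_info_line () {
	remote=$1
	shift
	# Quote the remote as the t1017 tests do, then put every OID
	# on the same line so that a single request can cover all of them.
	printf 'remote-object-info "%s"' "$remote"
	for oid in "$@"
	do
		printf ' %s' "$oid"
	done
	printf '\n'
}

# Example: one request line covering three (placeholder) objects.
remote_object_info_line "https://example.com/repo.git" \
	1111111111111111111111111111111111111111 \
	2222222222222222222222222222222222222222 \
	3333333333333333333333333333333333333333
```

With a Git built from this series, such a line would be piped into
`git cat-file --batch-command="%(objectname) %(objectsize)"`; that step
is left out here because released Git versions may not have the command.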
A few questions remain open for discussion:
1. In the current implementation, if a user issues `remote-object-info` over protocol v1,
`cat-file --batch-command` will die. Which behavior do we prefer: "error and exit (i.e. die)"
or "warn and wait for a new command"?
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when remote-object-info is used.
Any suggestions on this approach?
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
v1 of the patch series can be found here:
https://lore.kernel.org/git/20240628190503.67389-1-eric.peijian@gmail.com/
v2 of the patch series can be found here:
https://lore.kernel.org/git/20240720034337.57125-1-eric.peijian@gmail.com/
Changes since v3
================
- Fix typos and formatting errors
- Add warning in the git-cat-file doc about the default format
- Add a new test library file, lib-cat-file.sh, holding the code shared by
t1017-cat-file-remote-object-info.sh and t1006-cat-file.sh.
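For readers unfamiliar with the shared helpers, here is a small
standalone sketch of how two of them behave. The function bodies are
copied from the range-diff below; the surrounding script is only an
illustration and does not depend on test-lib.sh.

```shell
#!/bin/sh
# Helpers as they appear in lib-cat-file.sh in this series.

# Print a string without a trailing newline.
echo_without_newline () {
	printf '%s' "$*"
}

# Print the byte length of a string; the sed call strips the leading
# spaces that some wc implementations emit.
strlen () {
	echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
}

# The tests compute expected object sizes this way.
hello_content="Hello World"
hello_size=$(strlen "$hello_content")
echo "$hello_size"	# prints 11
```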
Thank you.
Eric Ju
Calvin Wan (5):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
cat-file: add remote-object-info to batch-command
Eric Ju (1):
cat-file: add declaration of variable i inside its for loop
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 119 +++-
fetch-pack.c | 49 +-
fetch-pack.h | 10 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 115 +++-
transport.h | 11 +
13 files changed, 1081 insertions(+), 44 deletions(-)
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v3:
1: b570dee186 = 1: 41898fe23e fetch-pack: refactor packet writing
2: e8777e8776 = 2: b3a1bee551 fetch-pack: move fetch initialization
3: d00d19cf2c = 3: d363b0f768 serve: advertise object-info feature
4: 3e1773910c ! 4: 3118061b21 transport: add client support for object-info
@@ transport-helper.c: static int fetch_refs(struct transport *transport,
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
- return -1;
+@@ transport-helper.c: static int fetch_refs(struct transport *transport,
+ free_refs(dummy);
}
+ /* fail the command explicitly to avoid further commands input. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
- if (!data->get_refs_list_called)
- get_refs_list_using_list(transport, 0);
-
++ if (!data->get_refs_list_called)
++ get_refs_list_using_list(transport, 0);
++
+ count = 0;
+ for (i = 0; i < nr_heads; i++)
+ if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
## transport.c ##
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_push,
@@ transport.c: static struct ref *handshake(struct transport *transport, int for_p
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
- struct ref *refs_tmp = NULL;
+ struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
+ /* last element, no need to allocate to next */
-+ ref_itr -> next = NULL;
++ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
close(data->fd[0]);
if (data->fd[1] >= 0)
close(data->fd[1]);
- if (finish_connect(data->conn))
- ret = -1;
- data->conn = NULL;
--
- free_refs(refs_tmp);
- free_refs(refs);
- list_objects_filter_release(&args.filter_options);
## transport.h ##
@@
5: bb110fbc93 = 5: 2ae81acf2a cat-file: add declaration of variable i inside its for loop
6: 6dd143c164 ! 6: b5aa6c1888 cat-file: add remote-object-info to batch-command
@@ Commit message
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
- then loop through the object info passed in, print the info.
+ then loop through the object info passed in, printing the info.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
@@ Documentation/git-cat-file.txt: newline. The available atoms are:
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
-+When "%(objecttype)" is supported, default format should be unified.
++WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
++DO NOT RELY on the current the default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ Documentation/git-cat-file.txt: scripting purposes.
CAVEATS
-------
-+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are currently not supported by the
-+`remote-object-info` command, we will error and exit when they are in the format string.
++Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
++currently not supported by the `remote-object-info` command, we will error
++and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
@@ object-store-ll.h: int for_each_object_in_pack(struct packed_git *p,
+
#endif /* OBJECT_STORE_LL_H */
- ## t/t1017-cat-file-remote-object-info.sh (new) ##
+ ## t/lib-cat-file.sh (new) ##
@@
-+#!/bin/sh
-+
-+test_description='git cat-file --batch-command with remote-object-info command'
-+
-+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
-+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
-+
-+. ./test-lib.sh
++# Library of git-cat-file related functions.
+
++# Print a string without a trailing newline
+echo_without_newline () {
-+ printf '%s' "$*"
++ printf '%s' "$*"
+}
+
++# Print a string without newlines and replaces them with a NULL character (\0).
+echo_without_newline_nul () {
-+ echo_without_newline "$@" | tr '\n' '\0'
++ echo_without_newline "$@" | tr '\n' '\0'
+}
+
++# Calculate the length of a string removing any leading spaces.
+strlen () {
-+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
++ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
+
+ ## t/t1006-cat-file.sh ##
+@@ t/t1006-cat-file.sh: test_description='git cat-file'
+
+ TEST_PASSES_SANITIZE_LEAK=true
+ . ./test-lib.sh
++. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+ test_cmdmode_usage () {
+ test_expect_code 129 "$@" 2>err &&
+@@ t/t1006-cat-file.sh: do
+ '
+ done
+
+-echo_without_newline () {
+- printf '%s' "$*"
+-}
+-
+-echo_without_newline_nul () {
+- echo_without_newline "$@" | tr '\n' '\0'
+-}
+-
+-strlen () {
+- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+-}
+-
+ run_tests () {
+ type=$1
+ oid=$2
+
+ ## t/t1017-cat-file-remote-object-info.sh (new) ##
+@@
++#!/bin/sh
++
++test_description='git cat-file --batch-command with remote-object-info command'
++
++GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
++export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
++
++. ./test-lib.sh
++. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
--
2.47.0
* [PATCH v4 1/6] fetch-pack: refactor packet writing
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-25 9:52 ` karthik nayak
2024-10-24 20:53 ` [PATCH v4 2/6] fetch-pack: move fetch initialization Eric Ju
` (5 subsequent siblings)
6 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to be a more general
purpose function write_command_and_capabilities(), so that it can be
used by both fetch and future commands.
Here "command" means an operation supported by Git’s wire protocol
(https://git-scm.com/docs/protocol-v2). Examples include a Git
subcommand, such as git-fetch(1), or an operation supported by
the server side such as "object-info" implemented in "a2ba162cda
(object-info: support for retrieving object info, 2021-04-20)".
In a future separate series, we can move
write_command_and_capabilities() to a higher-level file, such as
connect.c, so that it becomes accessible to other commands.
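To make the wire format concrete, the request that
send_object_info_request() writes can be sketched as pkt-lines in plain
shell. This is an illustration under stated assumptions, not the
implementation: pkt_line and object_info_request are hypothetical helper
names, and the capability lines (agent=, object-format=, and so on) that
write_command_and_capabilities() may add before the delimiter are
omitted.

```shell
#!/bin/sh
# Sketch: frame an object-info request as protocol v2 pkt-lines.
# Hypothetical helpers; capability lines are left out for brevity.

# Emit one pkt-line: four hex digits giving the total length
# (payload plus the 4-byte header), followed by the payload.
# ${#1} counts characters, which equals bytes for the ASCII payloads here.
pkt_line () {
	printf '%04x%s' $((${#1} + 4)) "$1"
}

# usage: object_info_request <oid>...
object_info_request () {
	pkt_line "command=object-info"
	printf '0001'		# delim-pkt: end of command and capabilities
	pkt_line "size"		# the only attribute requested in this series
	for oid in "$@"
	do
		pkt_line "oid $oid"
	done
	printf '0000'		# flush-pkt: end of request
}

# Example with a placeholder object id.
object_info_request 1111111111111111111111111111111111111111
echo
```

The ordering (command, delimiter, "size", one "oid" line per object,
flush) follows the calls in the patch below.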
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index f752da93a8..756fb83f89 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1314,13 +1314,14 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
+static void write_command_and_capabilities(struct strbuf *req_buf,
+ const char *command,
+ const struct string_list *server_options)
{
const char *hash_name;
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
if (server_supports_v2("agent"))
packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
if (advertise_sid && server_supports_v2("session-id"))
@@ -1346,6 +1347,28 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
+
+void send_object_info_request(int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1356,7 +1379,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2174,7 +2197,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.0
* [PATCH v4 2/6] fetch-pack: move fetch initialization
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
2024-10-24 20:53 ` [PATCH v4 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-24 20:53 ` [PATCH v4 3/6] serve: advertise object-info feature Eric Ju
` (4 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 756fb83f89..800505f25f 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1700,18 +1700,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.0
* [PATCH v4 3/6] serve: advertise object-info feature
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
2024-10-24 20:53 ` [PATCH v4 1/6] fetch-pack: refactor packet writing Eric Ju
2024-10-24 20:53 ` [PATCH v4 2/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-24 20:53 ` [PATCH v4 4/6] transport: add client support for object-info Eric Ju
` (3 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
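The client-side decision described above can be pictured with a small
shell sketch. It assumes the advertised capabilities are available one
per line and that this capability appears as "object-info=size"; both
the input layout and the function name pick_method are assumptions made
for illustration, not the code path used in transport.c.

```shell
#!/bin/sh
# Sketch: choose between object-info and the fetch fallback.
# pick_method and the one-capability-per-line input are hypothetical.

# usage: pick_method <attribute> <capabilities on stdin
# Prints "object-info" if the server advertises the attribute,
# otherwise "fetch".
pick_method () {
	attr=$1
	# Keep only the value of an "object-info=<values>" line, if any.
	values=$(sed -n 's/^object-info=//p')
	case " $values " in
	*" $attr "*)
		echo object-info ;;
	*)
		echo fetch ;;
	esac
}

# Example: a server that advertises object-info with "size".
printf 'agent=git/2.47.0\nobject-info=size\n' | pick_method size
```

A server with transfer.advertiseobjectinfo set to false would not
advertise the capability, so the sketch would print "fetch" for it.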
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index d674764a25..c3d8098642 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.0
* [PATCH v4 4/6] transport: add client support for object-info
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
` (2 preceding siblings ...)
2024-10-24 20:53 ` [PATCH v4 3/6] serve: advertise object-info feature Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-25 10:12 ` karthik nayak
2024-10-24 20:53 ` [PATCH v4 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
` (2 subsequent siblings)
6 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
in "a2ba162cda (object-info: support for retrieving object info,
2021-04-20)".
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
feature 'size' from a v2 server. If a server does not
advertise the feature, then the client falls back
to making the request through 'fetch'.
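
As an illustration of the response handling, here is a minimal sketch that parses one response line of the form "<oid> <size>", which is the layout the client code in this patch splits on a space. The function name `parse_object_info_line` and its signature are hypothetical; unlike the patch, the sketch rejects a missing or malformed size instead of passing it to strtoul() unchecked.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git API): split "<oid> <size>" into its two
 * fields. Returns 0 on success and -1 if the line is malformed.
 */
int parse_object_info_line(const char *line, char *oid_out, size_t oid_cap,
			   unsigned long *size_out)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;
	unsigned long size;

	if (!sp)
		return -1;	/* no size field at all */
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;	/* empty id, or id does not fit the buffer */
	if (sp[1] < '0' || sp[1] > '9')
		return -1;	/* size missing or not a plain decimal number */

	errno = 0;
	size = strtoul(sp + 1, &end, 10);
	if (errno || *end)
		return -1;	/* overflow or trailing garbage */

	memcpy(oid_out, line, oid_len);
	oid_out[oid_len] = '\0';
	*size_out = size;
	return 0;
}
```

An empty size field, which the patch treats as "not our ref", is reported here as -1 so that the caller can decide how to fail.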
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 4 +-
fetch-pack.h | 10 ++++
transport-helper.c | 11 ++++-
transport.c | 115 +++++++++++++++++++++++++++++++++++++++++++--
transport.h | 11 +++++
5 files changed, 144 insertions(+), 7 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 800505f25f..1a9facc1c0 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1347,7 +1347,6 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
packet_buf_delim(req_buf);
}
-
void send_object_info_request(int fd_out, struct object_info_args *args)
{
struct strbuf req_buf = STRBUF_INIT;
@@ -1706,6 +1705,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..5a5211e355 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
@@ -68,6 +70,12 @@ struct fetch_pack_args {
unsigned connectivity_checked:1;
};
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
/*
* sought represents remote references that should be updated from.
* On return, the names that were found on the remote will have been
@@ -106,4 +114,6 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
*/
int fetch_pack_fsck_objects(void);
+void send_object_info_request(int fd_out, struct object_info_args *args);
+
#endif
diff --git a/transport-helper.c b/transport-helper.c
index 013ec79dc9..2ff9675984 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -709,8 +709,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
+ * require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -726,6 +726,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* Fail explicitly so that no further commands are read. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 47fda6a773..64b49c083b 100644
--- a/transport.c
+++ b/transport.c
@@ -371,6 +371,77 @@ static struct ref *handshake(struct transport *transport, int for_push,
return refs;
}
+static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
+{
+ int size_index = -1;
+ struct git_transport_data *data = transport->data;
+ struct object_info_args args = { 0 };
+ struct packet_reader reader;
+
+ args.server_options = transport->server_options;
+ args.object_info_options = transport->smart_options->object_info_options;
+ args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+ data->version = discover_version(&reader);
+
+ transport->hash_algo = reader.hash_algo;
+
+ switch (data->version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args.object_info_options, "size")
+ && !server_supports_feature("object-info", "size", 0))
+ return -1;
+ send_object_info_request(data->fd[1], &args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args.object_info_options->nr; i++) {
+ if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
+ if (!strcmp(reader.line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args.oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader.line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
+
+ return 0;
+}
+
static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
struct transport_ls_refs_options *options)
{
@@ -418,6 +489,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -444,11 +516,36 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct ref *ref_itr = object_info_refs = alloc_ref("");
+
+ if (!fetch_object_info(transport, data->options.object_info_data))
+ goto cleanup;
+
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
+ /* last element, no need to allocate to next */
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
- if (!data->finished_handshake) {
- int i;
+ ref_itr = ref_itr->next;
+ }
+
+ transport->remote_refs = object_info_refs;
+
+ } else if (!data->finished_handshake) {
int must_list_refs = 0;
- for (i = 0; i < nr_heads; i++) {
+ for (int i = 0; i < nr_heads; i++) {
if (!to_fetch[i]->exact_oid) {
must_list_refs = 1;
break;
@@ -494,16 +591,26 @@ static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
+ &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
+
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
- if (!refs)
+ if (!refs && !args.object_info)
ret = -1;
if (report_unmatched_refs(to_fetch, nr_heads))
ret = -1;
cleanup:
+ free_refs(object_info_refs);
close(data->fd[0]);
if (data->fd[1] >= 0)
close(data->fd[1]);
diff --git a/transport.h b/transport.h
index 44100fa9b7..42b8ee1251 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to pull only object-info. Falls back
+ * to pulling the entire object if object-info is not supported.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.0
* [PATCH v4 5/6] cat-file: add declaration of variable i inside its for loop
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
` (3 preceding siblings ...)
2024-10-24 20:53 ` [PATCH v4 4/6] transport: add client support for object-info Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-24 20:53 ` [PATCH v4 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-10-25 20:56 ` [PATCH v4 0/6] " Taylor Blau
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code declares the variable i and uses it only
in a for loop, never in any logic outside the loop.
Move the declaration of i into the for loop for readability.
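
For illustration only, a minimal sketch of the pattern outside of git; the function `total_length` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: total length of `nr` strings. */
size_t total_length(const char *const *lines, size_t nr)
{
	size_t total = 0;

	/* `i` exists only inside the loop and has the same type as `nr`. */
	for (size_t i = 0; i < nr; i++)
		total += strlen(lines[i]);

	return total;
}
```

One point to keep in mind with this pattern: when the bound is an `int`, as `nr` is in dispatch_calls(), declaring the index as `int` as well avoids a signed/unsigned comparison.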
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index bfdfb51c7c..5db55fabc4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -673,12 +673,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -686,9 +684,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -714,7 +710,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -724,7 +719,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
--
2.47.0
* [PATCH v4 6/6] cat-file: add remote-object-info to batch-command
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
` (4 preceding siblings ...)
2024-10-24 20:53 ` [PATCH v4 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-10-24 20:53 ` Eric Ju
2024-10-25 10:53 ` karthik nayak
2024-10-25 20:56 ` [PATCH v4 0/6] " Taylor Blau
6 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-10-24 20:53 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which would create overhead when
making requests to a server, so `remote-object-info` instead takes
multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
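
A minimal sketch of how the arguments of one such line could be split into the remote and the object ids. This is an assumption-laden illustration: the helper `split_on_spaces` is hypothetical, splits only on spaces, and does not handle quoting, whereas the patch itself relies on git's split_cmdline().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git API): split `line` in place on spaces.
 * Stores up to `max_args` token pointers in `argv` and returns how many
 * were stored. Tokens beyond `max_args` are ignored.
 */
size_t split_on_spaces(char *line, char **argv, size_t max_args)
{
	size_t argc = 0;

	line += strspn(line, " ");		/* skip leading spaces */
	while (*line && argc < max_args) {
		argv[argc++] = line;
		line += strcspn(line, " ");	/* move to the end of this token */
		if (!*line)
			break;
		*line++ = '\0';			/* terminate the token in place */
		line += strspn(line, " ");	/* skip the run of separators */
	}
	return argc;
}
```

With the text that follows the command name as input, `argv[0]` would be the remote and `argv[1]` onward the object ids, so a single request can carry every id on the line.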
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 108 +++-
object-file.c | 11 +
object-store-ll.h | 3 +
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++
7 files changed, 897 insertions(+), 17 deletions(-)
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..f2be00b599 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at the specified `<remote>`
+ without downloading the objects from the remote. If the server does not
+ support the object-info capability, the objects are downloaded instead.
+ It is an error if no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands, which use
+`%(objectname) %(objectsize)` for now because `%(objecttype)` is not supported yet.
+WARNING: Once `%(objecttype)` is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format staying the same!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since `%(objecttype)`, `%(objectsize:disk)` and `%(deltabase)` are
+currently not supported by the `remote-object-info` command, the command
+errors out and exits when any of them is in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5db55fabc4..714c182f39 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -528,7 +534,7 @@ static void batch_one_object(const char *obj_name,
enum get_oid_result result;
result = get_oid_with_context(the_repository, obj_name,
- flags, &data->oid, &ctx);
+ flags, &data->oid, &ctx);
if (result != FOUND) {
switch (result) {
case MISSING_OBJECT:
@@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -667,6 +726,52 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line,
+ struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ data->size = *remote_object_info[i].sizep;
+ } else {
+ /*
+ * Reaching here means remote-object-info could not retrieve the
+ * information from the server without downloading the objects, so
+ * the objects have already been fetched to the client.
+ * Print the information using the logic for local objects.
+ */
+ data->skip_object_info = 0;
+ }
+
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -698,6 +803,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index b1a3463852..181cde98e1 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3132,3 +3132,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index 53b8e693b1..611e2ca708 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -548,4 +548,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..929d32da76
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related functions.
+
+# Print a string without a trailing newline
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Calculate the length of a string, stripping the leading spaces that wc may print.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index d36cd7c086..d8a851c427 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -4,6 +4,7 @@ test_description='git cat-file'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -99,18 +100,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..f4bff07311
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,739 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback git://: fetch objects to client' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback file://: fetch objects to client' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+ cd file_client_empty &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" >"$HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback http://: fetch objects to client' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.0
* Re: [PATCH v4 1/6] fetch-pack: refactor packet writing
2024-10-24 20:53 ` [PATCH v4 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-10-25 9:52 ` karthik nayak
2024-10-25 16:06 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: karthik nayak @ 2024-10-25 9:52 UTC (permalink / raw)
To: Eric Ju, git; +Cc: calvinwan, jonathantanmy, chriscool, toon, jltobler
Eric Ju <eric.peijian@gmail.com> writes:
[snip]
> +
> +void send_object_info_request(int fd_out, struct object_info_args *args)
> +{
> + struct strbuf req_buf = STRBUF_INIT;
> +
> + write_command_and_capabilities(&req_buf, "object-info", args->server_options);
> +
> + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> + packet_buf_write(&req_buf, "size");
> +
> + if (args->oids) {
> + for (size_t i = 0; i < args->oids->nr; i++)
> + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> + }
> +
> + packet_buf_flush(&req_buf);
> + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> + die_errno(_("unable to write request to remote"));
> +
> + strbuf_release(&req_buf);
> +}
> +
Was this function meant to be added here? I mean, there is no reference
to it in the commit message or anywhere else.
[snip]
* Re: [PATCH v4 4/6] transport: add client support for object-info
2024-10-24 20:53 ` [PATCH v4 4/6] transport: add client support for object-info Eric Ju
@ 2024-10-25 10:12 ` karthik nayak
2024-10-28 5:39 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: karthik nayak @ 2024-10-25 10:12 UTC (permalink / raw)
To: Eric Ju, git; +Cc: calvinwan, jonathantanmy, chriscool, toon, jltobler
Eric Ju <eric.peijian@gmail.com> writes:
[snip]
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 800505f25f..1a9facc1c0 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1347,7 +1347,6 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
> packet_buf_delim(req_buf);
> }
>
> -
Seems like this was introduced in Patch 1/6, including the function
below which is not used in that patch.
> void send_object_info_request(int fd_out, struct object_info_args *args)
> {
> struct strbuf req_buf = STRBUF_INIT;
> @@ -1706,6 +1705,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> if (args->depth > 0 || args->deepen_since || args->deepen_not)
> args->deepen = 1;
>
> + if (args->object_info)
> + state = FETCH_SEND_REQUEST;
> +
> while (state != FETCH_DONE) {
> switch (state) {
> case FETCH_CHECK_LOCAL:
[snip]
> /*
> * sought represents remote references that should be updated from.
> * On return, the names that were found on the remote will have been
> @@ -106,4 +114,6 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
> */
> int fetch_pack_fsck_objects(void);
>
> +void send_object_info_request(int fd_out, struct object_info_args *args);
> +
>
Nit: Would be nice to have a comment here explaining what the function does.
> #endif
> diff --git a/transport-helper.c b/transport-helper.c
> index 013ec79dc9..2ff9675984 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -709,8 +709,8 @@ static int fetch_refs(struct transport *transport,
>
> /*
> * If we reach here, then the server, the client, and/or the transport
> - * helper does not support protocol v2. --negotiate-only requires
> - * protocol v2.
> + * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
Nit: could we wrap this comment?
> + * require protocol v2.
> */
> if (data->transport_options.acked_commits) {
> warning(_("--negotiate-only requires protocol v2"));
[snip]
> static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> struct transport_ls_refs_options *options)
> {
> @@ -418,6 +489,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> struct ref *refs = NULL;
> struct fetch_pack_args args;
> struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
> + struct ref *object_info_refs = NULL;
>
> memset(&args, 0, sizeof(args));
> args.uploadpack = data->options.uploadpack;
> @@ -444,11 +516,36 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.server_options = transport->server_options;
> args.negotiation_tips = data->options.negotiation_tips;
> args.reject_shallow_remote = transport->smart_options->reject_shallow;
> + args.object_info = transport->smart_options->object_info;
> +
> + if (transport->smart_options
> + && transport->smart_options->object_info
> + && transport->smart_options->object_info_oids->nr > 0) {
> + struct ref *ref_itr = object_info_refs = alloc_ref("");
> +
> + if (!fetch_object_info(transport, data->options.object_info_data))
> + goto cleanup;
So if we were successful, we skip to the cleanup. Okay.
> + args.object_info_data = data->options.object_info_data;
> + args.quiet = 1;
> + args.no_progress = 1;
Not sure why we set quiet and no_progress here.
> + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> + ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
> + ref_itr->exact_oid = 1;
> + if (i == transport->smart_options->object_info_oids->nr - 1)
> + /* last element, no need to allocate to next */
> + ref_itr->next = NULL;
> + else
> + ref_itr->next = alloc_ref("");
>
> - if (!data->finished_handshake) {
> - int i;
> + ref_itr = ref_itr->next;
> + }
> +
> + transport->remote_refs = object_info_refs;
> +
> + } else if (!data->finished_handshake) {
> int must_list_refs = 0;
> - for (i = 0; i < nr_heads; i++) {
> + for (int i = 0; i < nr_heads; i++) {
> if (!to_fetch[i]->exact_oid) {
> must_list_refs = 1;
> break;
> @@ -494,16 +591,26 @@ static int fetch_refs_via_pack(struct transport *transport,
> &transport->pack_lockfiles, data->version);
>
> data->finished_handshake = 0;
> + if (args.object_info) {
> + struct ref *ref_cpy_reader = object_info_refs;
> + for (int i = 0; ref_cpy_reader; i++) {
> + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
> + &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
> + ref_cpy_reader = ref_cpy_reader->next;
> + }
> + }
> +
> data->options.self_contained_and_connected =
> args.self_contained_and_connected;
> data->options.connectivity_checked = args.connectivity_checked;
>
> - if (!refs)
> + if (!refs && !args.object_info)
> ret = -1;
This is because now we don't necessarily always fetch the refs, since
sometimes we're just happy fetching the object info. Would be nice to
have a comment here.
> if (report_unmatched_refs(to_fetch, nr_heads))
> ret = -1;
>
> cleanup:
> + free_refs(object_info_refs);
> close(data->fd[0]);
> if (data->fd[1] >= 0)
> close(data->fd[1]);
[snip]
* Re: [PATCH v4 6/6] cat-file: add remote-object-info to batch-command
2024-10-24 20:53 ` [PATCH v4 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-10-25 10:53 ` karthik nayak
2024-10-25 13:55 ` Christian Couder
0 siblings, 1 reply; 174+ messages in thread
From: karthik nayak @ 2024-10-25 10:53 UTC (permalink / raw)
To: Eric Ju, git; +Cc: calvinwan, jonathantanmy, chriscool, toon, jltobler
Eric Ju <eric.peijian@gmail.com> writes:
[snip]
> @@ -314,7 +323,10 @@ newline. The available atoms are:
> line) are output in place of the `%(rest)` atom.
>
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> +DO NOT RELY on the current default format to stay the same!!!
>
This seems like a planned breakage, wouldn't it make more sense to
implement %(objecttype) first?
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 5db55fabc4..714c182f39 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -24,6 +24,9 @@
> #include "promisor-remote.h"
> #include "mailmap.h"
> #include "write-or-die.h"
> +#include "alias.h"
> +#include "remote.h"
> +#include "transport.h"
>
> enum batch_mode {
> BATCH_MODE_CONTENTS,
> @@ -42,9 +45,12 @@ struct batch_options {
> char input_delim;
> char output_delim;
> const char *format;
> + int use_remote_info;
> };
>
> static const char *force_path;
> +static struct object_info *remote_object_info;
> +static struct oid_array object_info_oids = OID_ARRAY_INIT;
>
> static struct string_list mailmap = STRING_LIST_INIT_NODUP;
> static int use_mailmap;
> @@ -528,7 +534,7 @@ static void batch_one_object(const char *obj_name,
> enum get_oid_result result;
>
> result = get_oid_with_context(the_repository, obj_name,
> - flags, &data->oid, &ctx);
> + flags, &data->oid, &ctx);
Nit: we usually don't touch parts of the code that we're not explicitly
modifying.
[snip]
* Re: [PATCH v4 6/6] cat-file: add remote-object-info to batch-command
2024-10-25 10:53 ` karthik nayak
@ 2024-10-25 13:55 ` Christian Couder
0 siblings, 0 replies; 174+ messages in thread
From: Christian Couder @ 2024-10-25 13:55 UTC (permalink / raw)
To: karthik nayak
Cc: Eric Ju, git, calvinwan, jonathantanmy, chriscool, toon, jltobler
Hi Karthik,
On Fri, Oct 25, 2024 at 12:53 PM karthik nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> [snip]
>
> > @@ -314,7 +323,10 @@ newline. The available atoms are:
> > line) are output in place of the `%(rest)` atom.
> >
> > If no format is specified, the default format is `%(objectname)
> > -%(objecttype) %(objectsize)`.
> > +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> > +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> > +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> > +DO NOT RELY on the current default format to stay the same!!!
>
> This seems like a planned breakage, wouldn't it make more sense to
> implement %(objecttype) first?
I don't think it's fair to say it's a planned breakage. For example if
a default specifies what is displayed on the command line and if
what's displayed has an informational purpose there, then changing the
default to add more information is not really a breakage. Here we make
it clear that a possible breakage (in case the feature wasn't used for
informational purposes only for example) could easily be avoided by
not relying on the default, but instead specifying exactly the desired
output format.
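As a rough sketch of that last point, a script could pin the format
explicitly instead of depending on the default. The helper names
`batch_input` and `remote_sizes` below are hypothetical, and
`remote-object-info` is the command proposed in this series, so
`remote_sizes` only works with a git built from these patches:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and remote-object-info is
# the command proposed in this series, not part of a released git.

# batch_input URL OID...: print one remote-object-info command line.
batch_input () {
	printf 'remote-object-info %s\n' "$*"
}

# remote_sizes URL OID...: query sizes with an explicit format, so the
# output stays "<oid> <size>" even if the default format changes later.
remote_sizes () {
	batch_input "$@" |
	git cat-file --batch-command='%(objectname) %(objectsize)'
}
```

A caller would then run something like `remote_sizes "$url" $oid1 $oid2`
and parse two space-separated fields per line, whatever the default
format becomes.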
And yeah ideally both %(objecttype) and %(objectsize) should be
implemented first, but on the other hand sending short patch series
and growing some features step by step is a nicer approach in some
ways than sending big patch series that are hard to review. So I think
it's fair here to start with just %(objectsize) and leave
%(objecttype) for a future patch series.
Except for this I agree with the comments in your review of the v4.
Thanks!
* Re: [PATCH v4 1/6] fetch-pack: refactor packet writing
2024-10-25 9:52 ` karthik nayak
@ 2024-10-25 16:06 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-10-25 16:06 UTC (permalink / raw)
To: karthik nayak; +Cc: git, calvinwan, jonathantanmy, chriscool, toon, jltobler
On Fri, Oct 25, 2024 at 5:52 AM karthik nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> [snip]
>
> > +
> > +void send_object_info_request(int fd_out, struct object_info_args *args)
> > +{
> > + struct strbuf req_buf = STRBUF_INIT;
> > +
> > + write_command_and_capabilities(&req_buf, "object-info", args->server_options);
> > +
> > + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> > + packet_buf_write(&req_buf, "size");
> > +
> > + if (args->oids) {
> > + for (size_t i = 0; i < args->oids->nr; i++)
> > + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> > + }
> > +
> > + packet_buf_flush(&req_buf);
> > + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> > + die_errno(_("unable to write request to remote"));
> > +
> > + strbuf_release(&req_buf);
> > +}
> > +
>
> Was this function meant to be added here? I mean, there is no reference
> to it in the commit message or anywhere else.
>
Thank you.
The `send_object_info_request` function is used in `transport.c`
`fetch_object_info()` in patch 4/6. Its functionality is similar to
`send_fetch_request()`: sending the object-info command along with
sub-command (e.g. size) and arguments (e.g. oids) to the remote.
I guess Calvin put it here because:
1. it has functionality similar to `send_fetch_request()`
2. `write_command_and_capabilities()` is only visible within `fetch-pack.c`.
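To make the request layout concrete, here is a minimal standalone sketch of the bytes such a request puts on the wire. It is not Git's pkt-line API: `struct req`, `req_raw()`, `req_pkt()` and `build_object_info_request()` are hypothetical names used only for illustration, and the capability lines that `write_command_and_capabilities()` also emits (agent, object-format, and so on) are left out for brevity. It assumes the usual pkt-line framing: a 4-digit hex length that counts itself, with `0001` as the delimiter and `0000` as the flush packet.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size request buffer, for illustration only. */
struct req {
	char buf[1024];
	size_t len;
	int overflow;
};

/* Append a string verbatim (used for the special "0001"/"0000" packets). */
static void req_raw(struct req *r, const char *s)
{
	size_t n = strlen(s);

	if (r->overflow || r->len + n >= sizeof(r->buf)) {
		r->overflow = 1;
		return;
	}
	memcpy(r->buf + r->len, s, n + 1);
	r->len += n;
}

/* Append one pkt-line: 4 hex digits of length (including themselves) + payload. */
static void req_pkt(struct req *r, const char *payload)
{
	char hdr[5];
	size_t n = strlen(payload) + 4;

	if (n > 65520) {
		r->overflow = 1;
		return;
	}
	snprintf(hdr, sizeof(hdr), "%04zx", n);
	req_raw(r, hdr);
	req_raw(r, payload);
}

/*
 * Lay out an object-info request: command, delimiter, optional "size"
 * attribute, one "oid <hex>" line per object, then a flush packet.
 * Returns 0 on success, -1 if the request did not fit.
 */
static int build_object_info_request(struct req *r, int want_size,
				     const char *const *oids, size_t nr)
{
	char line[128];

	r->len = 0;
	r->overflow = 0;
	r->buf[0] = '\0';

	req_pkt(r, "command=object-info");
	req_raw(r, "0001"); /* delimiter between capabilities and arguments */
	if (want_size)
		req_pkt(r, "size");
	for (size_t i = 0; i < nr; i++) {
		int n = snprintf(line, sizeof(line), "oid %s", oids[i]);

		if (n < 0 || (size_t)n >= sizeof(line)) {
			r->overflow = 1;
			break;
		}
		req_pkt(r, line);
	}
	req_raw(r, "0000"); /* flush packet ends the request */
	return r->overflow ? -1 : 0;
}
```

Calling `build_object_info_request(&r, 1, oids, nr)` fills `r.buf` with the framed request, which could then be written to the remote in one go, much as the quoted function does with `write_in_full()`.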
However, I believe your comment is valid. Adding everything to
`fetch-pack.c` makes the file overly bloated with functionality
unrelated to fetch-pack. For v5, I plan to address this as follows:
1. move `write_command_and_capabilities()` a level up to `connect.c`
2. add a new file `fetch-object-info.c` at the same level as
`fetch-pack.c`. This new file contains the logic related to the
object-info command, i.e. `send_object_info_request()` and
`fetch_object_info()`
3. move `fetch_object_info()` out of `transport.c`
The dependency WAS like this, as I pointed out in a previous reply at
https://lore.kernel.org/git/CAN2LT1AM5rYpwjZ+rhYerxDkL6mbxr7iDc=wvuhvNKS8VVXQ8w@mail.gmail.com/#t
`transport.c` -> `fetch-pack.c` -> `connect.c`, where "->" means
"depends on".
In v5, it would be like this:
`transport.c` -> `fetch-pack.c` -> `connect.c`
`transport.c` -> `fetch-object-info.c` -> `connect.c`
Let me know if that makes sense.
> [snip]
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v4 0/6] cat-file: add remote-object-info to batch-command
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
` (5 preceding siblings ...)
2024-10-24 20:53 ` [PATCH v4 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-10-25 20:56 ` Taylor Blau
2024-10-27 3:54 ` Peijian Ju
6 siblings, 1 reply; 174+ messages in thread
From: Taylor Blau @ 2024-10-25 20:56 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Thu, Oct 24, 2024 at 04:53:53PM -0400, Eric Ju wrote:
> Calvin Wan (5):
> fetch-pack: refactor packet writing
> fetch-pack: move fetch initialization
> serve: advertise object-info feature
> transport: add client support for object-info
> cat-file: add remote-object-info to batch-command
>
> Eric Ju (1):
> cat-file: add declaration of variable i inside its for loop
Thanks. I just want to make sure that I have the right base here... this
was previously based on 3857aae53f (Git 2.47-rc0, 2024-09-25), but
applying the new round did not cleanly apply on top of that commit as
its merge base.
I applied the new round on top of the current tip of 'master', which is
6a11438f43 (The fifth batch, 2024-10-25) at the time of writing.
Please let me know if that was the right choice to make ;-).
Thanks,
Taylor
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v4 0/6] cat-file: add remote-object-info to batch-command
2024-10-25 20:56 ` [PATCH v4 0/6] " Taylor Blau
@ 2024-10-27 3:54 ` Peijian Ju
2024-10-28 0:01 ` Taylor Blau
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2024-10-27 3:54 UTC (permalink / raw)
To: Taylor Blau
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Oct 25, 2024 at 4:56 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Thu, Oct 24, 2024 at 04:53:53PM -0400, Eric Ju wrote:
> > Calvin Wan (5):
> > fetch-pack: refactor packet writing
> > fetch-pack: move fetch initialization
> > serve: advertise object-info feature
> > transport: add client support for object-info
> > cat-file: add remote-object-info to batch-command
> >
> > Eric Ju (1):
> > cat-file: add declaration of variable i inside its for loop
>
> Thanks. I just want to make sure that I have the right base here... this
> was previously based on 3857aae53f (Git 2.47-rc0, 2024-09-25), but
> applying the new round did not cleanly apply on top of that commit as
> its merge base.
>
> I applied the new round on top of the current tip of 'master', which is
> 6a11438f43 (The fifth batch, 2024-10-25) at the time of writing.
>
> Please let me know if that was the right choice to make ;-).
>
> Thanks,
> Taylor
Hi Taylor,
I probably rebased onto the wrong tip of master. I am working on a new v5 now.
Would you like me to resend v4, or can we skip v4 and use v5?
Thank you.
Eric
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v4 0/6] cat-file: add remote-object-info to batch-command
2024-10-27 3:54 ` Peijian Ju
@ 2024-10-28 0:01 ` Taylor Blau
0 siblings, 0 replies; 174+ messages in thread
From: Taylor Blau @ 2024-10-28 0:01 UTC (permalink / raw)
To: Peijian Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Sat, Oct 26, 2024 at 11:54:03PM -0400, Peijian Ju wrote:
> I probably rebased onto the wrong tip of master. I am working on a new v5 now.
> Would you like me to resend v4, or can we skip v4 and use v5?
Let's skip resending this round since I found a suitable base for it.
Once you send v5, I can apply it onto any base that you specify as
appropriate.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v4 4/6] transport: add client support for object-info
2024-10-25 10:12 ` karthik nayak
@ 2024-10-28 5:39 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-10-28 5:39 UTC (permalink / raw)
To: karthik nayak; +Cc: git, calvinwan, jonathantanmy, chriscool, toon, jltobler
On Fri, Oct 25, 2024 at 6:13 AM karthik nayak <karthik.188@gmail.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> [snip]
>
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index 800505f25f..1a9facc1c0 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1347,7 +1347,6 @@ static void write_command_and_capabilities(struct strbuf *req_buf,
> > packet_buf_delim(req_buf);
> > }
> >
> > -
>
> Seems like this was introduced in Patch 1/6, including the function
> below which is not used in that patch.
>
Thank you. As explained at
https://lore.kernel.org/git/CAN2LT1CEPdTAxCEpKtd+8-5zKYSnh0PMqEXgAZ++TTMPPKrD1g@mail.gmail.com/,
in patch 1/6 I am moving `write_command_and_capabilities()` to `connect.c`,
and in patch 4/6 I am moving `send_object_info_request()` to a new file,
`fetch-object-info.c`, where it is used.
> > void send_object_info_request(int fd_out, struct object_info_args *args)
> > {
> > struct strbuf req_buf = STRBUF_INIT;
> > @@ -1706,6 +1705,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> > if (args->depth > 0 || args->deepen_since || args->deepen_not)
> > args->deepen = 1;
> >
> > + if (args->object_info)
> > + state = FETCH_SEND_REQUEST;
> > +
> > while (state != FETCH_DONE) {
> > switch (state) {
> > case FETCH_CHECK_LOCAL:
>
> [snip]
>
> > /*
> > * sought represents remote references that should be updated from.
> > * On return, the names that were found on the remote will have been
> > @@ -106,4 +114,6 @@ int report_unmatched_refs(struct ref **sought, int nr_sought);
> > */
> > int fetch_pack_fsck_objects(void);
> >
> > +void send_object_info_request(int fd_out, struct object_info_args *args);
> > +
> >
>
> Nit: Would be nice to have a comment here explaining what the function does.
>
Thank you. Added in v5.
> > #endif
> > diff --git a/transport-helper.c b/transport-helper.c
> > index 013ec79dc9..2ff9675984 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -709,8 +709,8 @@ static int fetch_refs(struct transport *transport,
> >
> > /*
> > * If we reach here, then the server, the client, and/or the transport
> > - * helper does not support protocol v2. --negotiate-only requires
> > - * protocol v2.
> > + * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
>
> Nit: could we wrap this comment?
>
Thank you. Fixed in v5.
> > + * require protocol v2.
> > */
> > if (data->transport_options.acked_commits) {
> > warning(_("--negotiate-only requires protocol v2"));
>
> [snip]
>
> > static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> > struct transport_ls_refs_options *options)
> > {
> > @@ -418,6 +489,7 @@ static int fetch_refs_via_pack(struct transport *transport,
> > struct ref *refs = NULL;
> > struct fetch_pack_args args;
> > struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
> > + struct ref *object_info_refs = NULL;
> >
> > memset(&args, 0, sizeof(args));
> > args.uploadpack = data->options.uploadpack;
> > @@ -444,11 +516,36 @@ static int fetch_refs_via_pack(struct transport *transport,
> > args.server_options = transport->server_options;
> > args.negotiation_tips = data->options.negotiation_tips;
> > args.reject_shallow_remote = transport->smart_options->reject_shallow;
> > + args.object_info = transport->smart_options->object_info;
> > +
> > + if (transport->smart_options
> > + && transport->smart_options->object_info
> > + && transport->smart_options->object_info_oids->nr > 0) {
> > + struct ref *ref_itr = object_info_refs = alloc_ref("");
> > +
> > + if (!fetch_object_info(transport, data->options.object_info_data))
> > + goto cleanup;
>
> So if we were successful, we skip to the cleanup. Okay.
>
Yes, that is right.
> > + args.object_info_data = data->options.object_info_data;
> > + args.quiet = 1;
> > + args.no_progress = 1;
>
> Not sure why we set quiet and no_progress here.
>
Thank you. If the code reaches here, it means we fall back to
downloading the pack file with fetch_pack().
Setting quiet and no_progress makes fetch_pack() less verbose, so that
it does its job quietly in the background.
It is like calling `git fetch-pack -q ...`. I think setting quiet and
no_progress is necessary here because:
1. If git fetch-pack is called from an internal command, we had
better keep the call lean and efficient.
2. If the user wants a verbose call, they can still
call git fetch-pack directly from the client.
I added a comment in v5 to explain it, like this:
"We can't retrieve object info from packets, so we fall back to
downloading the pack files. We set quiet and no_progress to true, so that
the internal call to fetch-pack is less verbose."
> > + for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
> > + ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
> > + ref_itr->exact_oid = 1;
> > + if (i == transport->smart_options->object_info_oids->nr - 1)
> > + /* last element, no need to allocate to next */
> > + ref_itr->next = NULL;
> > + else
> > + ref_itr->next = alloc_ref("");
> >
> > - if (!data->finished_handshake) {
> > - int i;
> > + ref_itr = ref_itr->next;
> > + }
> > +
> > + transport->remote_refs = object_info_refs;
> > +
> > + } else if (!data->finished_handshake) {
> > int must_list_refs = 0;
> > - for (i = 0; i < nr_heads; i++) {
> > + for (int i = 0; i < nr_heads; i++) {
> > if (!to_fetch[i]->exact_oid) {
> > must_list_refs = 1;
> > break;
> > @@ -494,16 +591,26 @@ static int fetch_refs_via_pack(struct transport *transport,
> > &transport->pack_lockfiles, data->version);
> >
> > data->finished_handshake = 0;
> > + if (args.object_info) {
> > + struct ref *ref_cpy_reader = object_info_refs;
> > + for (int i = 0; ref_cpy_reader; i++) {
> > + oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
> > + &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
> > + ref_cpy_reader = ref_cpy_reader->next;
> > + }
> > + }
> > +
> > data->options.self_contained_and_connected =
> > args.self_contained_and_connected;
> > data->options.connectivity_checked = args.connectivity_checked;
> >
> > - if (!refs)
> > + if (!refs && !args.object_info)
> > ret = -1;
>
> This is because, now we don't necessary always fetch the refs, since
> sometimes we're just happy fetching the object info. Would be nice to
> have a comment here.
>
Thank you. Actually, if the code reaches here, it means we fell back
to downloading the pack file.
I would expect no difference from the old logic, so
`!args.object_info` might not be needed here.
I am removing `!args.object_info` in v5.
> > if (report_unmatched_refs(to_fetch, nr_heads))
> > ret = -1;
> >
> > cleanup:
> > + free_refs(object_info_refs);
> > close(data->fd[0]);
> > if (data->fd[1] >= 0)
> > close(data->fd[1]);
>
>
> [snip]
^ permalink raw reply [flat|nested] 174+ messages in thread
* [PATCH v5 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (9 preceding siblings ...)
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
2024-10-28 20:34 ` [PATCH v5 1/6] fetch-pack: refactor packet writing Eric Ju
` (5 more replies)
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
` (5 subsequent siblings)
16 siblings, 6 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This is a continuation of Calvin Wan's (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
Sometimes it is useful to get information about an object without having to download
it completely. The server logic for retrieving size has already been implemented and merged in
"a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
This patch series implements the client option for it.
This patch series adds the `remote-object-info` command to `cat-file --batch-command`.
This command allows the client to make an object-info command request to a server
that supports protocol v2. If the server is v2, but does not have
object-info capability, the entire object is fetched and the
relevant object info is returned.
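The behaviour described above can be summarized as a small decision function. This is only a sketch under stated assumptions: the enum and function names are hypothetical and do not appear in the series, and the v0/v1 case is merely reported as unsupported here, whereas the current implementation dies (see question 1 below).

```c
/* Hypothetical names, for illustration only. */
enum proto_version {
	PROTO_V0,
	PROTO_V1,
	PROTO_V2
};

enum info_strategy {
	STRATEGY_OBJECT_INFO,    /* ask the server via the object-info command */
	STRATEGY_FETCH_FALLBACK, /* fetch the objects, then inspect them locally */
	STRATEGY_UNSUPPORTED     /* protocol older than v2 */
};

/*
 * Pick how to obtain remote object info. The object-info command is used
 * only when the server speaks v2, advertises object-info, and supports
 * every requested attribute (currently just "size"). A v2 server that
 * lacks any of these leads to the fetch fallback.
 */
static enum info_strategy choose_strategy(enum proto_version version,
					  int server_has_object_info,
					  int server_has_size)
{
	if (version != PROTO_V2)
		return STRATEGY_UNSUPPORTED;
	if (server_has_object_info && server_has_size)
		return STRATEGY_OBJECT_INFO;
	return STRATEGY_FETCH_FALLBACK;
}
```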
A few questions are open for discussion:
1. In the current implementation, if a user issues `remote-object-info` over protocol v1,
`cat-file --batch-command` will die. Which do we prefer: "error and exit (i.e. die)"
or "warn and wait for a new command"?
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when remote-object-info is used.
Any suggestions on this approach?
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
Changes since V4
================
- Take write_command_and_capabilities() out of fetch-pack.c and put it in the higher-level connect.c.
- Move remote-object-info related logic out of fetch-pack.c and into new files, fetch-object-info.c and fetch-object-info.h.
- Add more comments, especially where we fall back to downloading pack files and on the functions that send arguments and receive data.
- Fix typos and formatting errors.
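
As an illustration of the receiving side, here is a standalone sketch of parsing one response line of the form "<oid> <size>". It is not the code from the series, which splits the line with string_list_split() and calls strtoul() directly; the function name and return codes below are hypothetical. It assumes that a line with an empty size field (the oid followed by a single space) means the remote does not have the object, which is the case the series reports as "not our ref".

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<oid> <size>" into oid (NUL-terminated) and *size.
 * Returns 0 on success, -2 if the size field is empty (assumed to mean
 * the remote lacks the object), and -1 on any malformed input.
 */
static int parse_object_info_line(const char *line, char *oid, size_t oid_cap,
				  unsigned long *size)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;

	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;
	memcpy(oid, line, oid_len);
	oid[oid_len] = '\0';

	if (sp[1] == '\0')
		return -2;
	/* strtoul() accepts signs and leading whitespace; reject them here. */
	if (sp[1] < '0' || sp[1] > '9')
		return -1;

	errno = 0;
	*size = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1;
	return 0;
}
```

Unlike a bare strtoul() call, this variant rejects trailing garbage and out-of-range values, which may be worth considering for the real parser as well.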
Thank you.
Eric Ju
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (2):
cat-file: add declaration of variable i inside its for loop
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.txt | 24 +-
Makefile | 1 +
builtin/cat-file.c | 119 +++-
connect.c | 34 ++
connect.h | 4 +
fetch-object-info.c | 95 ++++
fetch-object-info.h | 18 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 80 ++-
transport.h | 11 +
18 files changed, 1165 insertions(+), 71 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v4:
1: 20f32be592 ! 1: f678c6b76f fetch-pack: refactor packet writing
@@ Commit message
Refactor write_fetch_command_and_capabilities() to be a more general
purpose function write_command_and_capabilities(), so that it can be
- used by both fetch and future command.
+ used by both fetch and future commands.
Here "command" means the "operations" supported by Git’s wire protocol
https://git-scm.com/docs/protocol-v2. An example would be a
@@ Commit message
the server side such as "object-info" implemented in "a2ba162cda
(object-info: support for retrieving object info, 2021-04-20)".
- In a future separate series, we can move
- write_command_and_capabilities() to a higher-level file, such as
+ The new write_command_and_capabilities() function is also moved to
connect.c, so that it becomes accessible to other commands.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
@@ Commit message
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
- ## fetch-pack.c ##
-@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
- return haves_added;
+ ## connect.c ##
+@@ connect.c: int server_supports(const char *feature)
+ return !!server_feature_value(feature, NULL);
}
--static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
-- const struct string_list *server_options)
-+static void write_command_and_capabilities(struct strbuf *req_buf,
-+ const char *command,
-+ const struct string_list *server_options)
- {
- const char *hash_name;
-
-- ensure_server_supports_v2("fetch");
-- packet_buf_write(req_buf, "command=fetch");
-+ ensure_server_supports_v2(command);
-+ packet_buf_write(req_buf, "command=%s", command);
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
-@@ fetch-pack.c: static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- packet_buf_delim(req_buf);
- }
-
-+
-+void send_object_info_request(int fd_out, struct object_info_args *args)
++void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
++ const struct string_list *server_options)
+{
-+ struct strbuf req_buf = STRBUF_INIT;
++ const char *hash_name;
++ int advertise_sid;
+
-+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
++ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
-+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
-+ packet_buf_write(&req_buf, "size");
-+
-+ if (args->oids) {
-+ for (size_t i = 0; i < args->oids->nr; i++)
-+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
++ ensure_server_supports_v2(command);
++ packet_buf_write(req_buf, "command=%s", command);
++ if (server_supports_v2("agent"))
++ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
++ if (advertise_sid && server_supports_v2("session-id"))
++ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
++ if (server_options && server_options->nr) {
++ ensure_server_supports_v2("server-option");
++ for (int i = 0; i < server_options->nr; i++)
++ packet_buf_write(req_buf, "server-option=%s",
++ server_options->items[i].string);
+ }
+
-+ packet_buf_flush(&req_buf);
-+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
-+ die_errno(_("unable to write request to remote"));
-+
-+ strbuf_release(&req_buf);
++ if (server_feature_v2("object-format", &hash_name)) {
++ const int hash_algo = hash_algo_by_name(hash_name);
++ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
++ die(_("mismatched algorithms: client %s; server %s"),
++ the_hash_algo->name, hash_name);
++ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
++ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
++ die(_("the server does not support algorithm '%s'"),
++ the_hash_algo->name);
++ }
++ packet_buf_delim(req_buf);
+}
+
+ enum protocol {
+ PROTO_LOCAL = 1,
+ PROTO_FILE,
+
+ ## connect.h ##
+@@
+ #ifndef CONNECT_H
+ #define CONNECT_H
+
++#include "string-list.h"
+ #include "protocol.h"
+
+ #define CONNECT_VERBOSE (1u << 0)
+@@ connect.h: void check_stateless_delimiter(int stateless_rpc,
+ struct packet_reader *reader,
+ const char *error);
+
++void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
++ const struct string_list *server_options);
++
+ #endif
+
+ ## fetch-pack.c ##
+@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
+ return haves_added;
+ }
+
+-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
+- const struct string_list *server_options)
+-{
+- const char *hash_name;
+-
+- ensure_server_supports_v2("fetch");
+- packet_buf_write(req_buf, "command=fetch");
+- if (server_supports_v2("agent"))
+- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+- if (advertise_sid && server_supports_v2("session-id"))
+- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+- if (server_options && server_options->nr) {
+- int i;
+- ensure_server_supports_v2("server-option");
+- for (i = 0; i < server_options->nr; i++)
+- packet_buf_write(req_buf, "server-option=%s",
+- server_options->items[i].string);
+- }
+-
+- if (server_feature_v2("object-format", &hash_name)) {
+- int hash_algo = hash_algo_by_name(hash_name);
+- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+- die(_("mismatched algorithms: client %s; server %s"),
+- the_hash_algo->name, hash_name);
+- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+- die(_("the server does not support algorithm '%s'"),
+- the_hash_algo->name);
+- }
+- packet_buf_delim(req_buf);
+-}
+-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
2: 75184a49f5 = 2: e91f01ec4d fetch-pack: move fetch initialization
3: 7d1f341589 = 3: dfc685c90b serve: advertise object-info feature
4: e59964f6c9 ! 4: 7da1a1c904 transport: add client support for object-info
@@ Commit message
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
+ ## Makefile ##
+@@ Makefile: LIB_OBJS += ewah/ewah_rlw.o
+ LIB_OBJS += exec-cmd.o
+ LIB_OBJS += fetch-negotiator.o
+ LIB_OBJS += fetch-pack.o
++LIB_OBJS += fetch-object-info.o
+ LIB_OBJS += fmt-merge-msg.o
+ LIB_OBJS += fsck.o
+ LIB_OBJS += fsmonitor.o
+
+ ## fetch-object-info.c (new) ##
+@@
++#include "git-compat-util.h"
++#include "gettext.h"
++#include "hex.h"
++#include "pkt-line.h"
++#include "connect.h"
++#include "oid-array.h"
++#include "object-store-ll.h"
++#include "fetch-object-info.h"
++#include "string-list.h"
++
++/**
++ * send_object_info_request sends git-cat-file object-info command and its
++ * arguments into the request buffer.
++ */
++static void send_object_info_request(const int fd_out, struct object_info_args *args)
++{
++ struct strbuf req_buf = STRBUF_INIT;
++
++ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
++
++ if (unsorted_string_list_has_string(args->object_info_options, "size"))
++ packet_buf_write(&req_buf, "size");
++
++ if (args->oids) {
++ for (size_t i = 0; i < args->oids->nr; i++)
++ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
++ }
++
++ packet_buf_flush(&req_buf);
++ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
++ die_errno(_("unable to write request to remote"));
++
++ strbuf_release(&req_buf);
++}
++
++/**
++ * fetch_object_info sends git-cat-file object-info command into the request buf
++ * and read the results from packets.
++ */
++int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
++ struct packet_reader *reader, struct object_info *object_info_data,
++ const int stateless_rpc, const int fd_out)
++{
++ int size_index = -1;
++
++ switch (version) {
++ case protocol_v2:
++ if (!server_supports_v2("object-info"))
++ return -1;
++ if (unsorted_string_list_has_string(args->object_info_options, "size")
++ && !server_supports_feature("object-info", "size", 0))
++ return -1;
++ send_object_info_request(fd_out, args);
++ break;
++ case protocol_v1:
++ case protocol_v0:
++ die(_("wrong protocol version. expected v2"));
++ case protocol_unknown_version:
++ BUG("unknown protocol version");
++ }
++
++ for (size_t i = 0; i < args->object_info_options->nr; i++) {
++ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
++ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
++ return -1;
++ }
++ if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
++ if (!strcmp(reader->line, "size")) {
++ size_index = i;
++ for (size_t j = 0; j < args->oids->nr; j++)
++ object_info_data[j].sizep = xcalloc(1, sizeof(long));
++ }
++ continue;
++ }
++ return -1;
++ }
++
++ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
++ struct string_list object_info_values = STRING_LIST_INIT_DUP;
++
++ string_list_split(&object_info_values, reader->line, ' ', -1);
++ if (0 <= size_index) {
++ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
++ die("object-info: not our ref %s",
++ object_info_values.items[0].string);
++
++ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
++ }
++
++ string_list_clear(&object_info_values, 0);
++ }
++ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
++
++ return 0;
++}
+
+ ## fetch-object-info.h (new) ##
+@@
++#ifndef FETCH_OBJECT_INFO_H
++#define FETCH_OBJECT_INFO_H
++
++#include "pkt-line.h"
++#include "protocol.h"
++#include "object-store-ll.h"
++
++struct object_info_args {
++ struct string_list *object_info_options;
++ const struct string_list *server_options;
++ struct oid_array *oids;
++};
++
++int fetch_object_info(enum protocol_version version, struct object_info_args *args,
++ struct packet_reader *reader, struct object_info *object_info_data,
++ int stateless_rpc, int fd_out);
++
++#endif /* FETCH_OBJECT_INFO_H */
+
## fetch-pack.c ##
-@@ fetch-pack.c: static void write_command_and_capabilities(struct strbuf *req_buf,
- packet_buf_delim(req_buf);
- }
-
--
- void send_object_info_request(int fd_out, struct object_info_args *args)
- {
- struct strbuf req_buf = STRBUF_INIT;
@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
@@ fetch-pack.h: struct fetch_pack_args {
/*
* Indicate that the remote of this request is a promisor remote. The
-@@ fetch-pack.h: struct fetch_pack_args {
- unsigned connectivity_checked:1;
- };
-
-+struct object_info_args {
-+ struct string_list *object_info_options;
-+ const struct string_list *server_options;
-+ struct oid_array *oids;
-+};
-+
- /*
- * sought represents remote references that should be updated from.
- * On return, the names that were found on the remote will have been
-@@ fetch-pack.h: int report_unmatched_refs(struct ref **sought, int nr_sought);
- */
- int fetch_pack_fsck_objects(void);
-
-+void send_object_info_request(int fd_out, struct object_info_args *args);
-+
- #endif
## transport-helper.c ##
@@ transport-helper.c: static int fetch_refs(struct transport *transport,
@@ transport-helper.c: static int fetch_refs(struct transport *transport,
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
-+ * helper does not support protocol v2. --negotiate-only and cat-file remote-object-info
-+ * require protocol v2.
++ * helper does not support protocol v2. --negotiate-only and cat-file
++ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ transport-helper.c: static int fetch_refs(struct transport *transport,
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
## transport.c ##
-@@ transport.c: static struct ref *handshake(struct transport *transport, int for_push,
- return refs;
- }
-
-+static int fetch_object_info(struct transport *transport, struct object_info *object_info_data)
-+{
-+ int size_index = -1;
-+ struct git_transport_data *data = transport->data;
-+ struct object_info_args args = { 0 };
-+ struct packet_reader reader;
-+
-+ args.server_options = transport->server_options;
-+ args.object_info_options = transport->smart_options->object_info_options;
-+ args.oids = transport->smart_options->object_info_oids;
-+
-+ connect_setup(transport, 0);
-+ packet_reader_init(&reader, data->fd[0], NULL, 0,
-+ PACKET_READ_CHOMP_NEWLINE |
-+ PACKET_READ_GENTLE_ON_EOF |
-+ PACKET_READ_DIE_ON_ERR_PACKET);
-+ data->version = discover_version(&reader);
-+
-+ transport->hash_algo = reader.hash_algo;
-+
-+ switch (data->version) {
-+ case protocol_v2:
-+ if (!server_supports_v2("object-info"))
-+ return -1;
-+ if (unsorted_string_list_has_string(args.object_info_options, "size")
-+ && !server_supports_feature("object-info", "size", 0))
-+ return -1;
-+ send_object_info_request(data->fd[1], &args);
-+ break;
-+ case protocol_v1:
-+ case protocol_v0:
-+ die(_("wrong protocol version. expected v2"));
-+ case protocol_unknown_version:
-+ BUG("unknown protocol version");
-+ }
-+
-+ for (size_t i = 0; i < args.object_info_options->nr; i++) {
-+ if (packet_reader_read(&reader) != PACKET_READ_NORMAL) {
-+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
-+ return -1;
-+ }
-+ if (unsorted_string_list_has_string(args.object_info_options, reader.line)) {
-+ if (!strcmp(reader.line, "size")) {
-+ size_index = i;
-+ for (size_t j = 0; j < args.oids->nr; j++)
-+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
-+ }
-+ continue;
-+ }
-+ return -1;
-+ }
-+
-+ for (size_t i = 0; packet_reader_read(&reader) == PACKET_READ_NORMAL && i < args.oids->nr; i++){
-+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
-+
-+ string_list_split(&object_info_values, reader.line, ' ', -1);
-+ if (0 <= size_index) {
-+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
-+ die("object-info: not our ref %s",
-+ object_info_values.items[0].string);
-+
-+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
-+ }
-+
-+ string_list_clear(&object_info_values, 0);
-+ }
-+ check_stateless_delimiter(transport->stateless_rpc, &reader, "stateless delimiter expected");
-+
-+ return 0;
-+}
-+
- static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
- struct transport_ls_refs_options *options)
- {
+@@
+ #include "hook.h"
+ #include "pkt-line.h"
+ #include "fetch-pack.h"
++#include "fetch-object-info.h"
+ #include "remote.h"
+ #include "connect.h"
+ #include "send-pack.h"
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct ref *ref_itr = object_info_refs = alloc_ref("");
++ struct packet_reader reader;
++ struct object_info_args obj_info_args = { 0 };
+
-+ if (!fetch_object_info(transport, data->options.object_info_data))
-+ goto cleanup;
++ obj_info_args.server_options = transport->server_options;
++ obj_info_args.object_info_options = transport->smart_options->object_info_options;
++ obj_info_args.oids = transport->smart_options->object_info_oids;
++
++ connect_setup(transport, 0);
++ packet_reader_init(&reader, data->fd[0], NULL, 0,
++ PACKET_READ_CHOMP_NEWLINE |
++ PACKET_READ_GENTLE_ON_EOF |
++ PACKET_READ_DIE_ON_ERR_PACKET);
++
++ data->version = discover_version(&reader);
++ transport->hash_algo = reader.hash_algo;
+
++ if (!fetch_object_info(data->version, &obj_info_args, &reader,
++ data->options.object_info_data, transport->stateless_rpc,
++ data->fd[1])) {
++ /*
++ * If the code reaches here, fetch_object_info is successful and
++ * remote object info is retrieved from packets (i.e. without
++ * downloading the objects).
++ */
++ goto cleanup;
++ }
+
+- if (!data->finished_handshake) {
+- int i;
++ /*
++ * If the code reaches here, it means we can't retrieve object info from
++ * packets, and we will fall back to downloading the pack files.
++ * We set quiet and no_progress to be true, so that the internal call to
++ * fetch-pack is less verbose.
++ */
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
++
++ /*
++ * Allocate memory for object info data according to oids.
++ * The actual results will be retrieved later from the downloaded
++ * pack files.
++ */
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
-
-- if (!data->finished_handshake) {
-- int i;
++
+ ref_itr = ref_itr->next;
+ }
+
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
++
++ /* Retrieve object info data from the downloaded pack files */
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
-
-- if (!refs)
-+ if (!refs && !args.object_info)
- ret = -1;
- if (report_unmatched_refs(to_fetch, nr_heads))
+@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
ret = -1;
cleanup:
5: 4443e5e408 = 5: 2107bfc7ca cat-file: add declaration of variable i inside its for loop
6: c777a4dd84 ! 6: 740d629626 cat-file: add remote-object-info to batch-command
@@
## Metadata ##
-Author: Calvin Wan <calvinwan@google.com>
+Author: Eric Ju <eric.peijian@gmail.com>
## Commit message ##
cat-file: add remote-object-info to batch-command
@@ object-store-ll.h: int for_each_object_in_pack(struct packed_git *p,
## t/lib-cat-file.sh (new) ##
@@
-+# Library of git-cat-file related functions.
++# Library of git-cat-file related tests.
+
+# Print a string without a trailing newline
+echo_without_newline () {
--
2.47.0
* [PATCH v5 1/6] fetch-pack: refactor packet writing
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
2024-11-05 17:44 ` Christian Couder
2024-10-28 20:34 ` [PATCH v5 2/6] fetch-pack: move fetch initialization Eric Ju
` (4 subsequent siblings)
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to be a more general
purpose function write_command_and_capabilities(), so that it can be
used by both fetch and future commands.
Here "command" means an operation supported by Git’s wire protocol
(https://git-scm.com/docs/protocol-v2). Examples are the operation behind
a Git subcommand, such as git-fetch(1), or an operation supported by
the server side, such as "object-info" implemented in "a2ba162cda
(object-info: support for retrieving object info, 2021-04-20)".
The new write_command_and_capabilities() function is also moved to
connect.c, so that it becomes accessible to other commands.
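To illustrate the kind of request such a general-purpose writer produces, here is a minimal standalone sketch of pkt-line framing for an "object-info" request. It does not use Git's internal API: pkt_append(), pkt_append_special() and build_object_info_request() are hypothetical helpers written for this example, and the capability lines (agent, session-id, object-format) that the real function emits are omitted.

```c
#include <stdio.h>
#include <string.h>

#define PKT_MAX_LEN 65520 /* largest pkt-line, including the 4-byte prefix */
#define PKT_FLUSH "0000"  /* flush-pkt: ends the request */
#define PKT_DELIM "0001"  /* delim-pkt: separates capabilities from arguments */

/*
 * Append one pkt-line to out: a 4-digit hex length that counts the
 * prefix itself plus the payload, followed by the payload.
 * Returns 0 on success, -1 if the line does not fit.
 */
static int pkt_append(char *out, size_t cap, size_t *len, const char *payload)
{
	size_t n = strlen(payload) + 4;

	if (*len >= cap || n > PKT_MAX_LEN || n + 1 > cap - *len)
		return -1;
	snprintf(out + *len, cap - *len, "%04zx%s", n, payload);
	*len += n;
	return 0;
}

/* Append a special 4-byte packet (flush or delim) verbatim. */
static int pkt_append_special(char *out, size_t cap, size_t *len, const char *pkt)
{
	if (*len >= cap || 5 > cap - *len)
		return -1;
	memcpy(out + *len, pkt, 4);
	*len += 4;
	out[*len] = '\0';
	return 0;
}

/*
 * Build "command=object-info", a delimiter, the "size" attribute,
 * one "oid <hex>" argument and a flush packet.
 * Returns the request length, or -1 on error.
 */
static int build_object_info_request(char *out, size_t cap, const char *hex_oid)
{
	size_t len = 0;
	char arg[128];
	int r = snprintf(arg, sizeof(arg), "oid %s", hex_oid);

	if (r < 0 || (size_t)r >= sizeof(arg))
		return -1;
	if (pkt_append(out, cap, &len, "command=object-info") ||
	    pkt_append_special(out, cap, &len, PKT_DELIM) ||
	    pkt_append(out, cap, &len, "size") ||
	    pkt_append(out, cap, &len, arg) ||
	    pkt_append_special(out, cap, &len, PKT_FLUSH))
		return -1;
	return (int)len;
}
```

Only the command name differs between a "fetch" request and an "object-info" request in the shared prefix, which is why the command can be passed in as a parameter.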
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 4 ++++
fetch-pack.c | 36 ++----------------------------------
3 files changed, 40 insertions(+), 34 deletions(-)
diff --git a/connect.c b/connect.c
index 58f53d8dcb..bb4e4eec44 100644
--- a/connect.c
+++ b/connect.c
@@ -688,6 +688,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (int i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..2ed009066e 100644
--- a/connect.h
+++ b/connect.h
@@ -1,6 +1,7 @@
#ifndef CONNECT_H
#define CONNECT_H
+#include "string-list.h"
#include "protocol.h"
#define CONNECT_VERBOSE (1u << 0)
@@ -30,4 +31,7 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index f752da93a8..533fb76f95 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1314,38 +1314,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- int i;
- ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1356,7 +1324,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2174,7 +2142,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.0
* [PATCH v5 2/6] fetch-pack: move fetch initialization
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
2024-10-28 20:34 ` [PATCH v5 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
2024-10-28 20:34 ` [PATCH v5 3/6] serve: advertise object-info feature Eric Ju
` (3 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
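The reasoning can be sketched with a toy state machine. This is not the code in fetch-pack.c: struct fetch_ctx and run_fetch() are hypothetical names for this example, and only the state names are taken from the patch. Because the shared setup runs before the loop, it takes effect no matter which state the caller starts in.

```c
#include <stddef.h>

enum fetch_state {
	FETCH_CHECK_LOCAL,
	FETCH_SEND_REQUEST,
	FETCH_DONE
};

/* Hypothetical context standing in for the variables being initialized. */
struct fetch_ctx {
	int use_sideband;
	int steps;
};

/* Run the machine from the given initial state; returns use_sideband. */
static int run_fetch(enum fetch_state state, struct fetch_ctx *ctx)
{
	/* Shared setup: done once, before any state is entered. */
	ctx->use_sideband = 2;
	ctx->steps = 0;

	while (state != FETCH_DONE) {
		switch (state) {
		case FETCH_CHECK_LOCAL:
			ctx->steps++;
			state = FETCH_SEND_REQUEST;
			break;
		case FETCH_SEND_REQUEST:
			ctx->steps++;
			state = FETCH_DONE;
			break;
		case FETCH_DONE:
			break;
		}
	}
	return ctx->use_sideband;
}
```

Starting at FETCH_SEND_REQUEST skips the first state but still sees the initialized value, which is the property the later patch relies on.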
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 533fb76f95..afffbcaafc 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1645,18 +1645,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.0
* [PATCH v5 3/6] serve: advertise object-info feature
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
2024-10-28 20:34 ` [PATCH v5 1/6] fetch-pack: refactor packet writing Eric Ju
2024-10-28 20:34 ` [PATCH v5 2/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
2024-10-28 20:34 ` [PATCH v5 4/6] transport: add client support for object-info Eric Ju
` (2 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
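As a rough illustration of the client-side decision, the sketch below checks whether an advertised capability value lists a given feature. It assumes that multiple features would be separated by single spaces (the patch itself only advertises "size"), and feature_list_has() is a hypothetical helper, not a function in Git.

```c
#include <string.h>

/*
 * Return 1 if the space-separated list `values` contains exactly
 * the token `feature`, 0 otherwise.
 */
static int feature_list_has(const char *values, const char *feature)
{
	size_t flen = strlen(feature);

	if (!flen)
		return 0;
	while (*values) {
		size_t tok = strcspn(values, " ");

		if (tok == flen && !strncmp(values, feature, flen))
			return 1;
		values += tok;
		values += strspn(values, " ");
	}
	return 0;
}
```

A client could query the server when the check succeeds for every requested attribute, and fall back to fetch otherwise.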
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index d674764a25..c3d8098642 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.0
* [PATCH v5 4/6] transport: add client support for object-info
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
` (2 preceding siblings ...)
2024-10-28 20:34 ` [PATCH v5 3/6] serve: advertise object-info feature Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
2024-10-28 20:34 ` [PATCH v5 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-10-28 20:34 ` [PATCH v5 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
in “a2ba162cda (object-info: support for retrieving object info,
2021-04-20)”.
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
feature 'size' from a v2 server. If a server does not
advertise the feature, then the client falls back
to making the request through 'fetch'.
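For illustration only, a single response line of the form "<oid> <size>" could be parsed defensively as sketched below. The line layout is inferred from how the patch splits each line on spaces; parse_object_info_line() is a hypothetical helper and not part of this series. Unlike a bare strtoul() call, it rejects a missing or non-numeric size instead of silently yielding 0.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<oid> <size>" into oid (a buffer of oid_cap bytes) and *size.
 * Returns 0 on success, -1 if the line is malformed or the size is
 * missing, non-numeric or out of range. Outputs are untouched on error.
 */
static int parse_object_info_line(const char *line, char *oid, size_t oid_cap,
				  unsigned long *size)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;
	unsigned long value;

	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;
	/* strtoul() would accept signs and leading blanks; require a digit. */
	if (sp[1] < '0' || sp[1] > '9')
		return -1;

	errno = 0;
	value = strtoul(sp + 1, &end, 10);
	if (errno == ERANGE || *end != '\0')
		return -1;

	memcpy(oid, line, oid_len);
	oid[oid_len] = '\0';
	*size = value;
	return 0;
}
```

A caller would treat a -1 return as a reason to report the object as unavailable or to fall back to 'fetch'.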
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Makefile | 1 +
fetch-object-info.c | 95 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 18 +++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 +
transport-helper.c | 11 +++++-
transport.c | 80 ++++++++++++++++++++++++++++++++++++--
transport.h | 11 ++++++
8 files changed, 216 insertions(+), 5 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index 8d8cc6ab90..3969ddcaa8 100644
--- a/Makefile
+++ b/Makefile
@@ -1024,6 +1024,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..4d7c2d265f
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,95 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/**
+ * send_object_info_request sends git-cat-file object-info command and its
+ * arguments into the request buffer.
+ */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+/**
+ * fetch_object_info sends git-cat-file object-info command into the request buf
+ * and reads the results from packets.
+ */
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args->object_info_options, "size")
+ && !server_supports_feature("object-info", "size", 0))
+ return -1;
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..b1e545532f
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,18 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index afffbcaafc..8b4143d752 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1651,6 +1651,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..cf7cedf161 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index 013ec79dc9..334b35174e 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -709,8 +709,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -726,6 +726,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* Fail the command explicitly so that no further commands are read. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 47fda6a773..41e157d73d 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -418,6 +419,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -444,11 +446,71 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct ref *ref_itr = object_info_refs = alloc_ref("");
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ if (!fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1])) {
+ /*
+ * If the code reaches here, fetch_object_info is successful and
+ * remote object info is retrieved from packets (i.e. without
+ * downloading the objects).
+ */
+ goto cleanup;
+ }
- if (!data->finished_handshake) {
- int i;
+ /*
+ * If the code reaches here, it means we can't retrieve object info from
+ * packets, and we will fall back to downloading the pack files.
+ * We set quiet and no_progress to be true, so that the internal call to
+ * fetch-pack is less verbose.
+ */
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+
+ /*
+ * Allocate memory for object info data according to oids.
+ * The actual results will be retrieved later from the downloaded
+ * pack files.
+ */
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
+ /* last element, no need to allocate to next */
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
+
+ ref_itr = ref_itr->next;
+ }
+
+ transport->remote_refs = object_info_refs;
+
+ } else if (!data->finished_handshake) {
int must_list_refs = 0;
- for (i = 0; i < nr_heads; i++) {
+ for (int i = 0; i < nr_heads; i++) {
if (!to_fetch[i]->exact_oid) {
must_list_refs = 1;
break;
@@ -494,6 +556,17 @@ static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
+
+ /* Retrieve object info data from the downloaded pack files */
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
+ &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
+
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
@@ -504,6 +577,7 @@ static int fetch_refs_via_pack(struct transport *transport,
ret = -1;
cleanup:
+ free_refs(object_info_refs);
close(data->fd[0]);
if (data->fd[1] >= 0)
close(data->fd[1]);
diff --git a/transport.h b/transport.h
index 44100fa9b7..42b8ee1251 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to pull only object-info. Falls back
+ * to pulling the entire object if object-info is not supported.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.0
* [PATCH v5 5/6] cat-file: add declaration of variable i inside its for loop
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
` (3 preceding siblings ...)
2024-10-28 20:34 ` [PATCH v5 4/6] transport: add client support for object-info Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
2024-10-28 20:34 ` [PATCH v5 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index bfdfb51c7c..5db55fabc4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -673,12 +673,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -686,9 +684,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -714,7 +710,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -724,7 +719,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
--
2.47.0
* [PATCH v5 6/6] cat-file: add remote-object-info to batch-command
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
` (4 preceding siblings ...)
2024-10-28 20:34 ` [PATCH v5 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-10-28 20:34 ` Eric Ju
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-10-28 20:34 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which creates overhead when
making requests to a server, so `remote-object-info` instead takes
multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
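As a sketch of how the single-line form can be consumed, the helper below splits the argument part of the command (everything after "remote-object-info ") into a remote name and a list of object ids. split_remote_object_info_args() is a hypothetical name for this example and is not the parser in builtin/cat-file.c. It modifies the input line in place.

```c
#include <string.h>

/*
 * Split "<remote> <oid> <oid> ..." in place.
 * On success, *remote points at the remote name, oids[0..n-1] point at
 * the object ids, and n is returned. Returns -1 if the remote or all
 * object ids are missing, or if more than max_oids ids are given.
 */
static int split_remote_object_info_args(char *line, const char **remote,
					 const char **oids, size_t max_oids)
{
	size_t n = 0;
	char *p = line;

	*remote = NULL;
	while (*p) {
		char *tok;

		p += strspn(p, " ");	/* skip separators */
		if (!*p)
			break;
		tok = p;
		p += strcspn(p, " ");	/* find end of token */
		if (*p)
			*p++ = '\0';

		if (!*remote)
			*remote = tok;
		else if (n < max_oids)
			oids[n++] = tok;
		else
			return -1;
	}
	if (!*remote || !n)
		return -1;
	return (int)n;
}
```

With all ids collected from one line, a single request can carry every object id, instead of one round trip per `remote-object-info` line.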
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 108 +++-
object-file.c | 11 +
object-store-ll.h | 3 +
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++
7 files changed, 897 insertions(+), 17 deletions(-)
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..f2be00b599 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at the specified
+ `<remote>` without downloading the objects. If the server does not
+ support the object-info capability, the objects are downloaded instead.
+ It is an error if no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
+currently not supported by the `remote-object-info` command, we will error
+and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5db55fabc4..714c182f39 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -528,7 +534,7 @@ static void batch_one_object(const char *obj_name,
enum get_oid_result result;
result = get_oid_with_context(the_repository, obj_name,
- flags, &data->oid, &ctx);
+ flags, &data->oid, &ctx);
if (result != FOUND) {
switch (result) {
case MISSING_OBJECT:
@@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Use "%(objectname) %(objectsize)" as the default format when the
+ * remote-object-info command is used. Once objecttype is supported,
+ * the default should go back to DEFAULT_FORMAT.
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (int i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -667,6 +726,52 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line,
+ struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ data->size = *remote_object_info[i].sizep;
+ } else {
+ /*
+ * Reaching here means the server could not provide the
+ * information without sending the objects, so they have
+ * already been fetched to the client.
+ * Print the information using the logic for local objects.
+ */
+ data->skip_object_info = 0;
+ }
+
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -698,6 +803,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index b1a3463852..181cde98e1 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3132,3 +3132,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index 53b8e693b1..611e2ca708 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -548,4 +548,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..9fb20be308
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related tests.
+
+# Print a string without a trailing newline
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Calculate the length of a string, stripping any leading spaces from the `wc` output.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index d36cd7c086..d8a851c427 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -4,6 +4,7 @@ test_description='git cat-file'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -99,18 +100,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..f4bff07311
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,739 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" >"$daemon_parent/hello" &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback git://: fetch objects to client' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback file://: fetch objects to client' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+ cd file_client_empty &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" >"$HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback http://: fetch objects to client' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.0
* Re: [PATCH v5 1/6] fetch-pack: refactor packet writing
2024-10-28 20:34 ` [PATCH v5 1/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-11-05 17:44 ` Christian Couder
2024-11-06 1:06 ` Junio C Hamano
2024-11-06 19:50 ` Peijian Ju
0 siblings, 2 replies; 174+ messages in thread
From: Christian Couder @ 2024-11-05 17:44 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Oct 28, 2024 at 9:35 PM Eric Ju <eric.peijian@gmail.com> wrote:
> connect.c | 34 ++++++++++++++++++++++++++++++++++
> connect.h | 4 ++++
> fetch-pack.c | 36 ++----------------------------------
> 3 files changed, 40 insertions(+), 34 deletions(-)
>
> diff --git a/connect.c b/connect.c
> index 58f53d8dcb..bb4e4eec44 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -688,6 +688,40 @@ int server_supports(const char *feature)
> return !!server_feature_value(feature, NULL);
> }
>
> +void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
> + const struct string_list *server_options)
When I apply your patches this line doesn't seem well indented.
> +{
> + const char *hash_name;
> + int advertise_sid;
> +
> + git_config_get_bool("transfer.advertisesid", &advertise_sid);
It looks like moving the function to connect.c required adding the
above line into it. There are a few other small changes, including
probably spurious indentation changes, in the moved function which
make it a bit more difficult than necessary to check that the moved
code is the same as the original one.
This makes me wonder if it was actually a good idea to move the
function, or if moving the function should have been done in a
separate step than the step making the small changes. Perhaps patch
5/6 "cat-file: add declaration of variable i inside its for loop"
could have been moved before this patch and could have included some
of the small changes related to the i variable that are made in this
patch.
It might have been nice to mention the changes in the commit message anyway.
> + ensure_server_supports_v2(command);
> + packet_buf_write(req_buf, "command=%s", command);
> + if (server_supports_v2("agent"))
> + packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
> + if (advertise_sid && server_supports_v2("session-id"))
> + packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
> + if (server_options && server_options->nr) {
> + ensure_server_supports_v2("server-option");
> + for (int i = 0; i < server_options->nr; i++)
> + packet_buf_write(req_buf, "server-option=%s",
> + server_options->items[i].string);
> + }
> +
> + if (server_feature_v2("object-format", &hash_name)) {
> + const int hash_algo = hash_algo_by_name(hash_name);
> + if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
> + die(_("mismatched algorithms: client %s; server %s"),
> + the_hash_algo->name, hash_name);
> + packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
> + } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
> + die(_("the server does not support algorithm '%s'"),
> + the_hash_algo->name);
> + }
> + packet_buf_delim(req_buf);
> +}
> +
> enum protocol {
> PROTO_LOCAL = 1,
> PROTO_FILE,
> diff --git a/connect.h b/connect.h
> index 1645126c17..2ed009066e 100644
> --- a/connect.h
> +++ b/connect.h
> @@ -1,6 +1,7 @@
> #ifndef CONNECT_H
> #define CONNECT_H
>
> +#include "string-list.h"
> #include "protocol.h"
>
> #define CONNECT_VERBOSE (1u << 0)
> @@ -30,4 +31,7 @@ void check_stateless_delimiter(int stateless_rpc,
> struct packet_reader *reader,
> const char *error);
>
> +void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
> + const struct string_list *server_options);
When I apply your patches the above line doesn't seem well indented either.
You might want to make sure that your editor uses 8 spaces for each
tab, see Documentation/CodingGuidelines, or just that your editor
properly follows our .editorconfig file.
It looks like other patches in the series, like patch 4/6, have
similar issues. Otherwise the other patches in the series look good to
me.
Thanks.
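For readers following the review: the request that
write_command_and_capabilities() assembles is a sequence of pkt-lines
ended by a delimiter packet. The sketch below is not Git's pkt-line.c;
pkt_append() and pkt_delim() are hypothetical helpers written only to
show the framing (four hex digits of total length, then the payload,
with "0001" as the delimiter packet).

```c
#include <stdio.h>
#include <string.h>

#define PKT_MAX_LEN 65520 /* largest total pkt-line length allowed on the wire */

/*
 * Append one pkt-line to buf: four hex digits giving the total length
 * (payload plus the four length bytes), followed by the payload itself.
 * Returns the number of bytes appended, or -1 if it does not fit.
 */
static int pkt_append(char *buf, size_t cap, size_t *used, const char *payload)
{
	size_t len = strlen(payload) + 4;

	if (len > PKT_MAX_LEN || *used + len + 1 > cap)
		return -1;
	snprintf(buf + *used, cap - *used, "%04zx%s", len, payload);
	*used += len;
	return (int)len;
}

/* Append the protocol v2 delimiter packet ("0001"). */
static int pkt_delim(char *buf, size_t cap, size_t *used)
{
	if (*used + 4 + 1 > cap)
		return -1;
	memcpy(buf + *used, "0001", 5); /* copies the terminating NUL too */
	*used += 4;
	return 4;
}
```

Appending "command=object-info" and then the delimiter to an empty
buffer would give "0017command=object-info0001", since the payload is
19 bytes and 19 + 4 = 0x17.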
* Re: [PATCH v5 1/6] fetch-pack: refactor packet writing
2024-11-05 17:44 ` Christian Couder
@ 2024-11-06 1:06 ` Junio C Hamano
2024-11-06 18:00 ` Peijian Ju
2024-11-06 19:50 ` Peijian Ju
1 sibling, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2024-11-06 1:06 UTC (permalink / raw)
To: Christian Couder
Cc: Eric Ju, git, calvinwan, jonathantanmy, chriscool, karthik.188,
toon, jltobler
Christian Couder <christian.couder@gmail.com> writes:
> It looks like other patches in the series, like patch 4/6, have
> similar issues. Otherwise the other patches in the series look good to
> me.
Thanks for a review.
* Re: [PATCH v5 1/6] fetch-pack: refactor packet writing
2024-11-06 1:06 ` Junio C Hamano
@ 2024-11-06 18:00 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-11-06 18:00 UTC (permalink / raw)
To: Junio C Hamano
Cc: Christian Couder, git, calvinwan, jonathantanmy, chriscool,
karthik.188, toon, jltobler
On Tue, Nov 5, 2024 at 8:06 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > It looks like other patches in the series, like patch 4/6, have
> > similar issues. Otherwise the other patches in the series look good to
> > me.
>
> Thanks for a review.
Thank you, sir. Yes, I noticed that. I will make sure my IDE respects
the .editorconfig and revise them in v6.
* Re: [PATCH v5 1/6] fetch-pack: refactor packet writing
2024-11-05 17:44 ` Christian Couder
2024-11-06 1:06 ` Junio C Hamano
@ 2024-11-06 19:50 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-11-06 19:50 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Tue, Nov 5, 2024 at 12:44 PM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Mon, Oct 28, 2024 at 9:35 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> > connect.c | 34 ++++++++++++++++++++++++++++++++++
> > connect.h | 4 ++++
> > fetch-pack.c | 36 ++----------------------------------
> > 3 files changed, 40 insertions(+), 34 deletions(-)
> >
> > diff --git a/connect.c b/connect.c
> > index 58f53d8dcb..bb4e4eec44 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -688,6 +688,40 @@ int server_supports(const char *feature)
> > return !!server_feature_value(feature, NULL);
> > }
> >
> > +void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
> > + const struct string_list *server_options)
>
> When I apply your patches this line doesn't seem well indented.
>
Thank you. I will make sure my IDE respects
the .editorconfig and revise them in v6.
> > +{
> > + const char *hash_name;
> > + int advertise_sid;
> > +
> > + git_config_get_bool("transfer.advertisesid", &advertise_sid);
>
> It looks like moving the function to connect.c required adding the
> above line into it. There are a few other small changes, including
> probably spurious indentation changes, in the moved function which
> make it a bit more difficult than necessary to check that the moved
> code is the same as the original one.
>
> This makes me wonder if it was actually a good idea to move the
> function, or if moving the function should have been done in a
> separate step than the step making the small changes. Perhaps patch
> 5/6 "cat-file: add declaration of variable i inside its for loop"
> could have been moved before this patch and could have included some
> of the small changes related to the i variable that are made in this
> patch.
>
> It might have been nice to mention the changes in the commit message anyway.
>
Thank you. In v6, I will move the small-changes commit "cat-file: add
declaration of variable i inside its for loop" to the very beginning of
the series, and fold into it the small changes related to the i
variable that are made in this patch.
About the extra changes related to `advertise_sid`: I did a bit of
analysis; please feel free to correct me.
In the original fetch-pack.c code, there are only two places that
write `advertise_sid`:
1. line 1221 (function do_fetch_pack):
if (!server_supports("session-id"))
advertise_sid = 0;
2. line 1895 (function fetch_pack_config)
git_config_get_bool("transfer.advertisesid", &advertise_sid);
In 1, do_fetch_pack() is called when the protocol is NOT v2, while
write_fetch_command_and_capabilities() (or the new
write_command_and_capabilities()) is only used in protocol v2, so I
think it is safe to ignore 1 and only consider 2.
In 2, git_config_get_bool() is declared in config.h, which is already a
dependency of connect.c, so I just use it directly.
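A minimal sketch of point 2, assuming (as I read config.h) that the
getter returns non-zero and leaves its destination untouched when the
key is unset. fake_config and fake_config_get_bool() are hypothetical
stand-ins, not Git's config API. Under that assumption a function-local
`advertise_sid` wants an explicit default, which the file-scope static
in fetch-pack.c got for free from zero-initialization:

```c
#include <string.h>

/* Hypothetical one-entry config store; not Git's config machinery. */
struct fake_config {
	const char *key;
	int value;
	int present; /* 0 means the key is not set anywhere */
};

/*
 * Assumed contract: return 0 and write *dest when the key is set,
 * return 1 and leave *dest untouched otherwise.
 */
static int fake_config_get_bool(const struct fake_config *cfg,
				const char *key, int *dest)
{
	if (cfg->present && !strcmp(cfg->key, key)) {
		*dest = cfg->value;
		return 0;
	}
	return 1;
}

/* Decide whether a session-id line would be written into the request. */
static int should_send_session_id(const struct fake_config *cfg,
				  int server_has_session_id)
{
	int advertise_sid = 0; /* explicit default for the "unset" case */

	fake_config_get_bool(cfg, "transfer.advertisesid", &advertise_sid);
	return advertise_sid && server_has_session_id;
}
```

With the default in place, an unset key behaves like the old static
variable did: no session-id is sent unless the option is turned on and
the server advertises the capability.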
> > + ensure_server_supports_v2(command);
> > + packet_buf_write(req_buf, "command=%s", command);
> > + if (server_supports_v2("agent"))
> > + packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
> > + if (advertise_sid && server_supports_v2("session-id"))
> > + packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
> > + if (server_options && server_options->nr) {
> > + ensure_server_supports_v2("server-option");
> > + for (int i = 0; i < server_options->nr; i++)
> > + packet_buf_write(req_buf, "server-option=%s",
> > + server_options->items[i].string);
> > + }
> > +
> > + if (server_feature_v2("object-format", &hash_name)) {
> > + const int hash_algo = hash_algo_by_name(hash_name);
> > + if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
> > + die(_("mismatched algorithms: client %s; server %s"),
> > + the_hash_algo->name, hash_name);
> > + packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
> > + } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
> > + die(_("the server does not support algorithm '%s'"),
> > + the_hash_algo->name);
> > + }
> > + packet_buf_delim(req_buf);
> > +}
> > +
> > enum protocol {
> > PROTO_LOCAL = 1,
> > PROTO_FILE,
> > diff --git a/connect.h b/connect.h
> > index 1645126c17..2ed009066e 100644
> > --- a/connect.h
> > +++ b/connect.h
> > @@ -1,6 +1,7 @@
> > #ifndef CONNECT_H
> > #define CONNECT_H
> >
> > +#include "string-list.h"
> > #include "protocol.h"
> >
> > #define CONNECT_VERBOSE (1u << 0)
> > @@ -30,4 +31,7 @@ void check_stateless_delimiter(int stateless_rpc,
> > struct packet_reader *reader,
> > const char *error);
> >
> > +void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
> > + const struct string_list *server_options);
>
> When I apply your patches the above line doesn't seem well indented either.
>
> You might want to make sure that your editor uses 8 spaces for each
> tab, see Documentation/CodingGuidelines, or just that your editor
> properly follows our .editorconfig file.
>
> It looks like other patches in the series, like patch 4/6, have
> similar issues. Otherwise the other patches in the series look good to
> me.
>
> Thanks.
Thank you. I will make sure my IDE respects
the .editorconfig and revise them in v6.
* [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (10 preceding siblings ...)
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-08 16:24 ` [PATCH v6 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
` (6 more replies)
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
` (4 subsequent siblings)
16 siblings, 7 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This is a continuation of Calvin Wan's (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
Sometimes it is useful to get information about an object without having to download
it completely. The server logic for retrieving size has already been implemented and merged in
"a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
This patch series implements the client option for it.
This patch series adds the `remote-object-info` command to `cat-file --batch-command`.
This command allows the client to make an object-info command request to a server
that supports protocol v2. If the server is v2, but does not have
object-info capability, the entire object is fetched and the
relevant object info is returned.
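The behaviour described above boils down to a three-way decision. The
sketch below only restates that decision; the enum and pick_strategy()
are hypothetical names for illustration and do not appear in the
series.

```c
/* Possible outcomes for a remote-object-info request (illustrative names). */
enum remote_info_strategy {
	REMOTE_INFO_DIE,            /* server does not speak protocol v2 */
	REMOTE_INFO_REQUEST,        /* send an object-info command */
	REMOTE_INFO_FETCH_FALLBACK  /* fetch the objects, then read the info locally */
};

/*
 * protocol_version is the negotiated wire protocol version;
 * server_has_object_info is non-zero when the server advertises
 * the object-info capability.
 */
static enum remote_info_strategy pick_strategy(int protocol_version,
					       int server_has_object_info)
{
	if (protocol_version != 2)
		return REMOTE_INFO_DIE;
	if (server_has_object_info)
		return REMOTE_INFO_REQUEST;
	return REMOTE_INFO_FETCH_FALLBACK;
}
```

The first branch is the one question 1 below asks about: whether it
should stay a hard failure or become a warning.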
A few questions are open for discussion:
1. In the current implementation, if a user issues `remote-object-info` under protocol v1,
`cat-file --batch-command` will die. Which do we prefer: "error and exit (i.e. die)"
or "warn and wait for the next command"?
2. Right now, only the size is supported. If the batch command format
contains objectsize:disk or deltabase, it will die. The question
is about objecttype. In the current implementation, it will die too.
But dying on objecttype breaks the default format. We have changed the
default format to %(objectname) %(objectsize) when remote-object-info is used.
Any suggestions on this approach?
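To make question 2 concrete, here is a rough sketch of the check it
implies: scan the format for %(...) placeholders and stop at the first
one that cannot be answered remotely. This is not the code in
builtin/cat-file.c; remote_atom_supported() and
first_unsupported_atom() are hypothetical helpers, and the supported
set is taken from the description above (objectname and objectsize
only).

```c
#include <string.h>

/* Return 1 if the placeholder name (without "%(" and ")") is answerable remotely. */
static int remote_atom_supported(const char *atom, size_t len)
{
	static const char *const supported[] = { "objectname", "objectsize" };

	for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (strlen(supported[i]) == len && !strncmp(atom, supported[i], len))
			return 1;
	return 0;
}

/*
 * Return a pointer to the first "%(" whose placeholder is unsupported
 * (or unterminated), or NULL when the whole format is acceptable.
 */
static const char *first_unsupported_atom(const char *format)
{
	const char *p = format;

	while ((p = strstr(p, "%(")) != NULL) {
		const char *start = p + 2;
		const char *end = strchr(start, ')');

		if (!end || !remote_atom_supported(start, (size_t)(end - start)))
			return p;
		p = end + 1;
	}
	return NULL;
}
```

With this shape, "%(objectname) %(objectsize)" passes, while
"%(objectsize:disk)", "%(deltabase)" and "%(objecttype)" are each
reported, which is why the default format had to change.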
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
Changes since V5
================
- Move the small-changes commit ("cat-file: add declaration of variable i inside its for loop") to the front of the series
- Describe in more detail what changes when moving write_fetch_command_and_capabilities() to connect.c
- Fix indentation problems
Thank you.
Eric Ju
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (2):
cat-file: add declaration of variable i inside its for loop
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.txt | 24 +-
Makefile | 1 +
builtin/cat-file.c | 116 +++-
connect.c | 34 ++
connect.h | 4 +
fetch-object-info.c | 95 ++++
fetch-object-info.h | 18 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 77 ++-
transport.h | 11 +
18 files changed, 1162 insertions(+), 68 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v5:
5: 0e533d6d0a ! 1: 858a864651 cat-file: add declaration of variable i inside its for loop
@@ Metadata
## Commit message ##
cat-file: add declaration of variable i inside its for loop
- Some code declares variable i and only uses it
+ Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
@@ builtin/cat-file.c: static void batch_objects_command(struct batch_options *opt,
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
+
+ ## fetch-pack.c ##
+@@ fetch-pack.c: static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+- int i;
+ ensure_server_supports_v2("server-option");
+- for (i = 0; i < server_options->nr; i++)
++ for (int i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
1: d5c93792d3 ! 2: 51707f08d1 fetch-pack: refactor packet writing
@@ Metadata
## Commit message ##
fetch-pack: refactor packet writing
- Refactor write_fetch_command_and_capabilities() to be a more general
- purpose function write_command_and_capabilities(), so that it can be
- used by both fetch and future commands.
+ Refactor write_fetch_command_and_capabilities() to a more
+ general-purpose function, write_command_and_capabilities(), enabling it
+ to serve both fetch and additional commands.
- Here "command" means the "operations" supported by Git’s wire protocol
- https://git-scm.com/docs/protocol-v2. An example would be a
- git's subcommand, such as git-fetch(1); or an operation supported by
- the server side such as "object-info" implemented in "a2ba162cda
- (object-info: support for retrieving object info, 2021-04-20)".
+ In this context, "command" refers to the "operations" supported by
+ Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
+ subcommand (e.g., git-fetch(1)) or a server-side operation like
+ "object-info" as implemented in commit a2ba162c
+ (object-info: support for retrieving object info, 2021-04-20).
- The new write_command_and_capabilities() function is also moved to
- connect.c, so that it becomes accessible to other commands.
+ Furthermore, write_command_and_capabilities() is moved to connect.c,
+ making it accessible to additional commands in the future.
+
+ To move write_command_and_capabilities() to connect.c, we need to
+ adjust how `advertise_sid` is managed. Previously,
+ in fetch-pack.c, `advertise_sid` was a static variable, modified using
+ git_config_get_bool().
+
+ In connect.c, we now initialize `advertise_sid` at the beginning by
+ directly using git_config_get_bool(). This change is safe because:
+
+ In the original fetch-pack.c code, there are only two places that
+ write `advertise_sid` :
+ 1. In function do_fetch_pack:
+ if (!server_supports("session-id"))
+ advertise_sid = 0;
+ 2. In function fetch_pack_config():
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ About 1, since do_fetch_pack() is only relevant for protocol v1, this
+ assignment can be ignored in our refactor, as
+ write_command_and_capabilities() is only used in protocol v2.
+
+ About 2, git_config_get_bool() is declared in config.h, which is
+ already a dependency of connect.c, so we can reuse it directly.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
@@ connect.c: int server_supports(const char *feature)
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
-+ const struct string_list *server_options)
++ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid;
@@ connect.h: void check_stateless_delimiter(int stateless_rpc,
const char *error);
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
-+ const struct string_list *server_options);
++ const struct string_list *server_options);
+
#endif
@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
-- int i;
- ensure_server_supports_v2("server-option");
-- for (i = 0; i < server_options->nr; i++)
+- for (int i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
2: 4bf51150c0 = 3: f02338e90e fetch-pack: move fetch initialization
3: 15b4095e28 = 4: 934bafb1db serve: advertise object-info feature
4: 68794ae57a ! 5: e2e94d1a32 transport: add client support for object-info
@@ fetch-object-info.c (new)
+ * and read the results from packets.
+ */
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
-+ struct packet_reader *reader, struct object_info *object_info_data,
-+ const int stateless_rpc, const int fd_out)
++ struct packet_reader *reader, struct object_info *object_info_data,
++ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
@@ fetch-object-info.h (new)
+};
+
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
-+ struct packet_reader *reader, struct object_info *object_info_data,
-+ int stateless_rpc, int fd_out);
++ struct packet_reader *reader, struct object_info *object_info_data,
++ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ }
- if (!data->finished_handshake) {
-- int i;
+ /*
+ * If the code reaches here, it means we can't retrieve object info from
+ * packets, and we will fall back to downloading the pack files.
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ transport->remote_refs = object_info_refs;
+
+ } else if (!data->finished_handshake) {
+ int i;
int must_list_refs = 0;
-- for (i = 0; i < nr_heads; i++) {
-+ for (int i = 0; i < nr_heads; i++) {
- if (!to_fetch[i]->exact_oid) {
- must_list_refs = 1;
- break;
+ for (i = 0; i < nr_heads; i++) {
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
6: 98f8248203 ! 6: 69eb091a6b cat-file: add remote-object-info to batch-command
@@ Documentation/git-cat-file.txt: info <object>::
output of `--batch-check`.
+remote-object-info <remote> <object>...::
-+ Print object info for object references `<object>` at specified <remote> without
-+ downloading objects from remote. If the object-info capability is not
-+ supported by the server, the objects will be downloaded instead.
++ Print object info for object references `<object>` at specified <remote>
++ without downloading objects from remote. If the object-info capability
++ is not supported by the server, the objects will be downloaded instead.
+ Error when no object references are provided.
+ This command may be combined with `--buffer`.
+
@@ builtin/cat-file.c: struct batch_options {
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
-@@ builtin/cat-file.c: static void batch_one_object(const char *obj_name,
- enum get_oid_result result;
-
- result = get_oid_with_context(the_repository, obj_name,
-- flags, &data->oid, &ctx);
-+ flags, &data->oid, &ctx);
- if (result != FOUND) {
- switch (result) {
- case MISSING_OBJECT:
@@ builtin/cat-file.c: static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
-+ const char *line,
-+ struct strbuf *output,
-+ struct expand_data *data)
++ const char *line, struct strbuf *output,
++ struct expand_data *data)
+{
+ int count;
+ const char **argv;
--
2.47.0
base-commit: 8f8d6eee531b3fa1a8ef14f169b0cb5035f7a772
Merge Request: https://gitlab.com/gitlab-org/git/-/merge_requests/168
* [PATCH v6 1/6] cat-file: add declaration of variable i inside its for loop
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-08 16:24 ` [PATCH v6 2/6] fetch-pack: refactor packet writing Eric Ju
` (5 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
fetch-pack.c | 3 +--
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index bfdfb51c7c..5db55fabc4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -673,12 +673,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -686,9 +684,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -714,7 +710,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -724,7 +719,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index f752da93a8..3699cf9945 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1326,9 +1326,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
if (advertise_sid && server_supports_v2("session-id"))
packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
if (server_options && server_options->nr) {
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
+ for (int i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
--
2.47.0
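As a standalone illustration of the style this patch adopts (not code
from the series; count_nonempty() is a made-up helper), a loop index
that is only used inside the loop can be declared in the for statement
itself:

```c
#include <stddef.h>

/* Count the entries that are neither NULL nor the empty string. */
static size_t count_nonempty(const char *const *items, size_t nr)
{
	size_t count = 0;

	/* i exists only for the duration of the loop */
	for (size_t i = 0; i < nr; i++)
		if (items[i] && items[i][0])
			count++;
	return count;
}
```

Here the index and the bound are both size_t. When the bound is an
int, as `nr` is in dispatch_calls(), a size_t index makes the
comparison mixed-sign, which some compilers flag under -Wsign-compare,
so it may be worth matching the index type to the bound.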
* [PATCH v6 2/6] fetch-pack: refactor packet writing
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
2024-11-08 16:24 ` [PATCH v6 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-08 16:24 ` [PATCH v6 3/6] fetch-pack: move fetch initialization Eric Ju
` (4 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to a more
general-purpose function, write_command_and_capabilities(), enabling it
to serve both fetch and additional commands.
In this context, "command" refers to the "operations" supported by
Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
subcommand (e.g., git-fetch(1)) or a server-side operation like
"object-info" as implemented in commit a2ba162c
(object-info: support for retrieving object info, 2021-04-20).
Furthermore, write_command_and_capabilities() is moved to connect.c,
making it accessible to additional commands in the future.
To move write_command_and_capabilities() to connect.c, we need to
adjust how `advertise_sid` is managed. Previously, in fetch-pack.c,
`advertise_sid` was a static variable set via git_config_get_bool().
In connect.c, we now initialize `advertise_sid` at the beginning of the
function by calling git_config_get_bool() directly. This change is safe
because, in the original fetch-pack.c code, there are only two places
that write `advertise_sid`:
1. In function do_fetch_pack():
if (!server_supports("session-id"))
advertise_sid = 0;
2. In function fetch_pack_config():
git_config_get_bool("transfer.advertisesid", &advertise_sid);
About 1, since do_fetch_pack() is only relevant for protocol v1, this
assignment can be ignored in our refactor, as
write_command_and_capabilities() is only used in protocol v2.
About 2, git_config_get_bool() is declared in config.h, which is
already a dependency of connect.c, so we can reuse it directly.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 4 ++++
fetch-pack.c | 35 ++---------------------------------
3 files changed, 40 insertions(+), 33 deletions(-)
diff --git a/connect.c b/connect.c
index 58f53d8dcb..5dd544335c 100644
--- a/connect.c
+++ b/connect.c
@@ -688,6 +688,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (int i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..04043cd66d 100644
--- a/connect.h
+++ b/connect.h
@@ -1,6 +1,7 @@
#ifndef CONNECT_H
#define CONNECT_H
+#include "string-list.h"
#include "protocol.h"
#define CONNECT_VERBOSE (1u << 0)
@@ -30,4 +31,7 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index 3699cf9945..533fb76f95 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1314,37 +1314,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
- for (int i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1355,7 +1324,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2173,7 +2142,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.0
* [PATCH v6 3/6] fetch-pack: move fetch initialization
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
2024-11-08 16:24 ` [PATCH v6 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-11-08 16:24 ` [PATCH v6 2/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-08 16:24 ` [PATCH v6 4/6] serve: advertise object-info feature Eric Ju
` (3 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
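The reasoning can be illustrated with a stripped-down state machine. This
is a sketch under stated assumptions rather than the real
do_fetch_pack_v2(): the state names mirror fetch-pack.c, but
`struct fetch_ctx` and run_fetch() are hypothetical, and each state does
nothing except record that it ran.

```c
enum fetch_state {
	FETCH_CHECK_LOCAL,
	FETCH_SEND_REQUEST,
	FETCH_DONE
};

/* Hypothetical context; the real function uses file-scope variables. */
struct fetch_ctx {
	int use_sideband;   /* needed no matter which state runs first */
	int checked_local;
	int requests_sent;
};

static void run_fetch(struct fetch_ctx *ctx, enum fetch_state state)
{
	/* Shared setup happens once, before the state machine runs. */
	ctx->use_sideband = 2;
	ctx->checked_local = 0;
	ctx->requests_sent = 0;

	while (state != FETCH_DONE) {
		switch (state) {
		case FETCH_CHECK_LOCAL:
			ctx->checked_local = 1;
			state = FETCH_SEND_REQUEST;
			break;
		case FETCH_SEND_REQUEST:
			ctx->requests_sent++;
			state = FETCH_DONE;
			break;
		case FETCH_DONE:
			break;
		}
	}
}
```

Starting from FETCH_SEND_REQUEST skips the local check, yet the shared
setup is still in place, which would not hold if the setup lived inside
the FETCH_CHECK_LOCAL case.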
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 533fb76f95..afffbcaafc 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1645,18 +1645,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.0
* [PATCH v6 4/6] serve: advertise object-info feature
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
` (2 preceding siblings ...)
2024-11-08 16:24 ` [PATCH v6 3/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-08 16:24 ` [PATCH v6 5/6] transport: add client support for object-info Eric Ju
` (2 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
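On the client side, the decision could look roughly like the sketch below.
It assumes the advertised value is a space-separated list of feature
names; feature_list_has(), plan_size_query() and `enum query_plan` are
hypothetical names used for illustration only, and a NULL value stands for
a server that did not advertise object-info at all.

```c
#include <string.h>

/*
 * Return 1 if the space-separated list in `value` contains `feature`
 * as a whole token, 0 otherwise (including when `value` is NULL).
 */
static int feature_list_has(const char *value, const char *feature)
{
	size_t flen = strlen(feature);

	if (!value || !flen)
		return 0;
	while (*value) {
		size_t len = strcspn(value, " ");

		if (len == flen && !strncmp(value, feature, flen))
			return 1;
		value += len;
		while (*value == ' ')
			value++;
	}
	return 0;
}

enum query_plan {
	QUERY_OBJECT_INFO,
	FALLBACK_TO_FETCH
};

/* Pick the cheap object-info query only when "size" is advertised. */
static enum query_plan plan_size_query(const char *advertised_value)
{
	return feature_list_has(advertised_value, "size") ?
		QUERY_OBJECT_INFO : FALLBACK_TO_FETCH;
}
```

Matching whole tokens keeps a hypothetical future feature such as "sizes"
from being mistaken for "size".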
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index d674764a25..c3d8098642 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.0
* [PATCH v6 5/6] transport: add client support for object-info
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
` (3 preceding siblings ...)
2024-11-08 16:24 ` [PATCH v6 4/6] serve: advertise object-info feature Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-08 16:24 ` [PATCH v6 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-11-11 4:38 ` [PATCH v6 0/6] " Junio C Hamano
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
in “a2ba162cda (object-info: support for retrieving object info,
2021-04-20)”.
Add client functions to communicate with the server.
The client currently supports requesting a list of object ids with
feature 'size' from a v2 server. If a server does not
advertise the feature, then the client falls back
to making the request through 'fetch'.
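As an illustration of the response handling, each object line of an
object-info reply that requested 'size' is expected to look like
"<oid> <size>". The sketch below parses one such line; it is not the code
in this patch. parse_object_info_line() is a hypothetical helper, and
treating an empty or non-numeric size as an error is an assumption made
for the example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<oid> <size>". On success, copy the oid into oid_out, store
 * the size in *size_out and return 0. Return -1 if the line has no
 * space, the oid does not fit, or the size is empty or malformed.
 */
static int parse_object_info_line(const char *line, char *oid_out,
				  size_t oid_cap, unsigned long *size_out)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;
	unsigned long size;

	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;
	if (sp[1] < '0' || sp[1] > '9')
		return -1; /* empty or non-numeric size */

	errno = 0;
	size = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1;

	memcpy(oid_out, line, oid_len);
	oid_out[oid_len] = '\0';
	*size_out = size;
	return 0;
}
```

A caller that gets -1 back for any object could then take the 'fetch'
fallback described above instead of trusting a partial answer.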
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Makefile | 1 +
fetch-object-info.c | 95 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 18 +++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 +
transport-helper.c | 11 +++++-
transport.c | 77 +++++++++++++++++++++++++++++++++++-
transport.h | 11 ++++++
8 files changed, 215 insertions(+), 3 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index d06c9a8ffa..beca828963 100644
--- a/Makefile
+++ b/Makefile
@@ -1024,6 +1024,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..c6abc69332
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,95 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/**
+ * send_object_info_request sends git-cat-file object-info command and its
+ * arguments into the request buffer.
+ */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+/**
+ * fetch_object_info sends git-cat-file object-info command into the request buf
+ * and reads the results from packets.
+ */
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ return -1;
+ if (unsorted_string_list_has_string(args->object_info_options, "size")
+ && !server_supports_feature("object-info", "size", 0))
+ return -1;
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..ce1a05dc96
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,18 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index afffbcaafc..8b4143d752 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1651,6 +1651,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..cf7cedf161 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index 013ec79dc9..334b35174e 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -709,8 +709,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -726,6 +726,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* fail the command explicitly to avoid further commands input. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 47fda6a773..7702b1926e 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -418,6 +419,7 @@ static int fetch_refs_via_pack(struct transport *transport,
struct ref *refs = NULL;
struct fetch_pack_args args;
struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
+ struct ref *object_info_refs = NULL;
memset(&args, 0, sizeof(args));
args.uploadpack = data->options.uploadpack;
@@ -444,8 +446,69 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct ref *ref_itr = object_info_refs = alloc_ref("");
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ if (!fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1])) {
+ /*
+ * If the code reaches here, fetch_object_info is successful and
+ * remote object info are retrieved from packets (i.e. without
+ * downloading the objects).
+ */
+ goto cleanup;
+ }
- if (!data->finished_handshake) {
+ /*
+ * If the code reaches here, it means we can't retrieve object info from
+ * packets, and we will fall back to downloading the pack files.
+ * We set quiet and no_progress to be true, so that the internal call to
+ * fetch-pack is less verbose.
+ */
+ args.object_info_data = data->options.object_info_data;
+ args.quiet = 1;
+ args.no_progress = 1;
+
+ /*
+ * Allocate memory for object info data according to oids.
+ * The actual results will be retrieved later from the downloaded
+ * pack files.
+ */
+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
+ ref_itr->exact_oid = 1;
+ if (i == transport->smart_options->object_info_oids->nr - 1)
+ /* last element, no need to allocate to next */
+ ref_itr->next = NULL;
+ else
+ ref_itr->next = alloc_ref("");
+
+ ref_itr = ref_itr->next;
+ }
+
+ transport->remote_refs = object_info_refs;
+
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
@@ -494,6 +557,17 @@ static int fetch_refs_via_pack(struct transport *transport,
&transport->pack_lockfiles, data->version);
data->finished_handshake = 0;
+
+ /* Retrieve object info data from the downloaded pack files */
+ if (args.object_info) {
+ struct ref *ref_cpy_reader = object_info_refs;
+ for (int i = 0; ref_cpy_reader; i++) {
+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
+ &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
+ ref_cpy_reader = ref_cpy_reader->next;
+ }
+ }
+
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
@@ -504,6 +578,7 @@ static int fetch_refs_via_pack(struct transport *transport,
ret = -1;
cleanup:
+ free_refs(object_info_refs);
close(data->fd[0]);
if (data->fd[1] >= 0)
close(data->fd[1]);
diff --git a/transport.h b/transport.h
index 44100fa9b7..42b8ee1251 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to pull only object-info. Falls back
+ * to pulling the entire object if object-info is not supported.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.0
* [PATCH v6 6/6] cat-file: add remote-object-info to batch-command
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
` (4 preceding siblings ...)
2024-11-08 16:24 ` [PATCH v6 5/6] transport: add client support for object-info Eric Ju
@ 2024-11-08 16:24 ` Eric Ju
2024-11-11 4:38 ` [PATCH v6 0/6] " Junio C Hamano
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-08 16:24 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which would create overhead when
making requests to a server, so `remote-object-info` instead takes
multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
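To make the batched form concrete, the argument string that follows the
command name can be split into one remote and a list of object ids, so
that a single request covers all of them. The sketch below is a
simplification: split_remote_object_info() is a hypothetical helper that
splits on spaces only, whereas the patch itself relies on split_cmdline().

```c
#include <string.h>

/*
 * Split "<remote> <oid> <oid> ..." in place. Stores the remote in
 * *remote and up to max_oids object ids in oids[]. Returns the number
 * of object ids found, or -1 if the remote is missing or there are
 * more object ids than max_oids.
 */
static int split_remote_object_info(char *args, const char **remote,
				    const char **oids, int max_oids)
{
	int n = 0;
	char *tok = strtok(args, " ");

	if (!tok)
		return -1;
	*remote = tok;
	while ((tok = strtok(NULL, " "))) {
		if (n == max_oids)
			return -1;
		oids[n++] = tok;
	}
	return n;
}
```

With three object ids on one line, the caller learns the remote once and
can issue one request for all three, instead of three separate round
trips.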
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 105 ++++
object-file.c | 11 +
object-store-ll.h | 3 +
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++
7 files changed, 895 insertions(+), 16 deletions(-)
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..666422201c 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified <remote>
+ without downloading objects from remote. If the object-info capability
+ is not supported by the server, the objects will be downloaded instead.
+ Error when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
+currently not supported by the `remote-object-info` command, we will error
+and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5db55fabc4..078e4517b8 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -667,6 +726,51 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line, struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ data->size = *remote_object_info[i].sizep;
+ } else {
+ /*
+ * When reaching here, it means remote-object-info can't retrieve
+ * information from server without downloading them, and the objects
+ * have been fetched to client already.
+ * Print the information using the logic for local objects.
+ */
+ data->skip_object_info = 0;
+ }
+
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -698,6 +802,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index b1a3463852..181cde98e1 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3132,3 +3132,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index 53b8e693b1..611e2ca708 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -548,4 +548,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..9fb20be308
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related tests.
+
+# Print a string without a trailing newline
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Print the length of a string in bytes, stripping the leading spaces that some wc implementations emit.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index d36cd7c086..d8a851c427 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -4,6 +4,7 @@ test_description='git cat-file'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -99,18 +100,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..f4bff07311
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,739 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate it: run `git cat-file commit <commit hash> | wc -c`
+# to get 177 (with SHA-1), then subtract the 40 hex characters of the tree ID to get 137.
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the format is set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback git://: fetch objects to client' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # restore the server state
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback file://: fetch objects to client' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+ cd file_client_empty &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # restore the server state
+ git -C "${server_path}" config transfer.advertiseobjectinfo true &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+
+test_expect_success 'remote-object-info fallback http://: fetch objects to client' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # Prove object is not on the client
+ echo "$hello_oid missing" >expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ # These results prove remote-object-info can retrieve object info
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results are for the info command
+ # They prove objects are downloaded
+ echo "$hello_oid $hello_size" >>expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+
+ # restore the server state
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+
+ test_cmp expect actual
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
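The tree_size arithmetic in the t1017 test comments (mode + space + file
name + NUL = 13, plus the raw object ID) can be checked without a
repository. The following is only a sketch: tree_entry_size is a
hypothetical helper that is not part of the patch, and the 20-byte and
32-byte raw sizes are the values that `test_oid rawsz` is assumed to give
for SHA-1 and SHA-256 respectively.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): size in bytes of one tree entry.
tree_entry_size () {
	mode=$1 name=$2 rawsz=$3
	# Bytes in "<mode> <name>"; sed strips the padding some wc versions print.
	header=$(printf '%s %s' "$mode" "$name" | wc -c | sed -e 's/^ *//')
	# Add one NUL terminator and the raw (binary) object ID.
	echo $((header + 1 + rawsz))
}

tree_entry_size 100644 hello 20	# SHA-1: prints 33 (13 + 20)
tree_entry_size 100644 hello 32	# SHA-256: prints 45 (13 + 32)
```

With a single entry named "hello", the whole tree object is one such
entry, which is why the test can express the size as rawsz + 13.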
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
` (5 preceding siblings ...)
2024-11-08 16:24 ` [PATCH v6 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-11-11 4:38 ` Junio C Hamano
2024-11-18 16:28 ` Peijian Ju
6 siblings, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2024-11-11 4:38 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Eric Ju <eric.peijian@gmail.com> writes:
> This is a continuation of Calvin Wan's (calvinwan@google.com)
> patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
>
> Sometimes it is useful to get information about an object without having to download
> it completely. The server logic for retrieving size has already been implemented and merged in
> "a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
> This patch series implement the client option for it.
>
> This patch series add the `remote-object-info` command to `cat-file --batch-command`.
> This command allows the client to make an object-info command request to a server
> that supports protocol v2. If the server is v2, but does not have
> object-info capability, the entire object is fetched and the
> relevant object info is returned.
>
> A few questions open for discussions please:
>
> 1. In the current implementation, if a user puts `remote-object-info` in protocol v1,
> `cat-file --batch-command` will die. Which way do we prefer? "error and exit (i.e. die)"
> or "warn and wait for new command".
In the primary use case envisioned, would it be a program that is
driving the "cat-file --batch-command" process? Can it sensibly
react to "warn and wait" and throw different commands to achieve
what it wanted to do with the remote-object-info command?
If the answer is "no", die would be more appropriate.
> 2. Right now, only the size is supported. If the batch command format
> contains objectsize:disk or deltabase, it will die. The question
> is about objecttype. In the current implementation, it will die too.
> But dying on objecttype breaks the default format. We have changed the
> default format to %(objectname) %(objectsize) when remote-object-info is used.
> Any suggestions on this approach?
Why bend the default format to the shortcoming of the new feature?
What makes it impossible to learn what type of object it is? If the
limitation that makes it impossible cannot be avoided, would it make
more sense to fall back to the "fetch and locally inspect" just like
"the other side does not know how to do object-info" case?
Another thing you did not list, which is related, is where the
"fetch and locally inspect" fallback fetches the object into. Would
we use a quarantine mechanism, so that a mere request for remote
object info for an object will not contaminate our local object
store until the next gc realizes that such an object is dangling?
Thanks.
^ permalink raw reply [flat|nested] 174+ messages in thread
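The "die" versus "warn and wait" question above can be pictured from the
point of view of a driving program. The sketch below does not run git at
all: fake_batch is a hypothetical stand-in for a batch process that dies
on a command it cannot serve, and the fixed size 11 it reports is made up
for illustration. It only shows that, with "die", the driver can detect
the failure from the exit status alone.

```shell
#!/bin/sh
# Hypothetical stand-in for a batch process; not a real git command.
# It answers "info" lines and dies on anything else.
fake_batch () {
	while read -r cmd arg
	do
		case "$cmd" in
		info)
			echo "$arg 11"
			;;
		*)
			echo "fatal: $cmd is not supported" >&2
			return 128
			;;
		esac
	done
}

# The driver checks the exit status of the batch process; it does not
# need to parse warnings from stderr to learn that the session ended.
drive () {
	if printf '%s\n' "$@" | fake_batch 2>/dev/null
	then
		echo "driver: all commands served"
	else
		echo "driver: batch process died, falling back"
	fi
}

drive "info aaa" "info bbb"
drive "info aaa" "remote-object-info url bbb" "info ccc"
```

In the second call the stand-in stops at the unsupported command, so the
trailing "info ccc" is never answered. A "warn and wait" design would
instead require the driver to notice the warning and decide what to send
next.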
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-11 4:38 ` [PATCH v6 0/6] " Junio C Hamano
@ 2024-11-18 16:28 ` Peijian Ju
2024-11-19 0:16 ` Junio C Hamano
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2024-11-18 16:28 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Sun, Nov 10, 2024 at 11:39 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > This is a continuation of Calvin Wan's (calvinwan@google.com)
> > patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
> >
> > Sometimes it is useful to get information about an object without having to download
> > it completely. The server logic for retrieving size has already been implemented and merged in
> > "a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
> > This patch series implement the client option for it.
> >
> > This patch series add the `remote-object-info` command to `cat-file --batch-command`.
> > This command allows the client to make an object-info command request to a server
> > that supports protocol v2. If the server is v2, but does not have
> > object-info capability, the entire object is fetched and the
> > relevant object info is returned.
> >
> > A few questions open for discussions please:
> >
> > 1. In the current implementation, if a user puts `remote-object-info` in protocol v1,
> > `cat-file --batch-command` will die. Which way do we prefer? "error and exit (i.e. die)"
> > or "warn and wait for new command".
>
> In the primary use case envisioned, would it be a program that is
> driving the "cat-file --batch-command" process? Can it sensibly
> react to "warn and wait" and throw different commands to achieve
> what it wanted to do with the remote-object-info command?
>
> If the answer is "no", die would be more appropriate.
>
Thank you, sir.
I’m inclined to answer "no."
Our primary use case is to use git cat-file remote-object-info in a
promisor remote setup to retrieve metadata about an object stored in
the promisor remote, without fetching it back to the local repository.
This approach helps conserve disk space. I don’t believe other
commands can achieve this functionality, particularly without
requiring the object to be downloaded.
In the context of GitLab, we can mandate a specific version of Git to
be used alongside GitLab. Therefore, it is acceptable to error out if
the required Git version is not available, as we can ensure
compatibility by enforcing the version requirement.
Also, Mr. Christian Couder provided me with another more concrete example:
For example, consider a partial clone user initially interested in
only the foo/ and bar/ directories. They might execute git clone
--filter=blob:none --no-checkout <url> followed by git sparse-checkout
set foo bar. Later, they decide to estimate how much space would be
required to fetch everything related to the baz/ directory.
The challenge arises because remote-object-info might need to operate
recursively to calculate the total size for all objects related to
baz/. A driver program that drives these recursive operations would
struggle to handle a “warn and wait” mechanism effectively, as it
would need to issue additional commands dynamically based on the
warning. If the only alternative is to fetch the objects directly, it
would defeat the purpose of using remote-object-info—which is intended
to provide size information without actually downloading the objects.
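To make the driver scenario above a bit more concrete, here is a minimal
sketch of the summing half of such a driver. It assumes each output line
follows the "%(objectname) %(objectsize)" format discussed in this thread.
The function name sum_object_sizes is hypothetical and not part of Git.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): sum the sizes reported on stdin.
# Each input line is assumed to look like "<oid> <size>", i.e. the
# "%(objectname) %(objectsize)" format discussed in this thread.
# Lines that do not have exactly two fields are ignored.
sum_object_sizes () {
	awk 'NF == 2 { total += $2 } END { print total + 0 }'
}
```

A driver would pipe the output of the batch process into this function
to obtain the estimated total for the objects it asked about.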
> > 2. Right now, only the size is supported. If the batch command format
> > contains objectsize:disk or deltabase, it will die. The question
> > is about objecttype. In the current implementation, it will die too.
> > But dying on objecttype breaks the default format. We have changed the
> > default format to %(objectname) %(objectsize) when remote-object-info is used.
> > Any suggestions on this approach?
>
> Why bend the default format to the shortcoming of the new feature?
> What makes it impossible to learn what type of object it is? If the
> limitation that makes it impossible cannot be avoided, would it make
> more sense to fall back to the "fetch and locally inspect" just like
> "the other side does not know how to do object-info" case?
>
Thank you.
It is indeed possible to determine the type of an object, and the plan
was to include this functionality in a follow-up patch series to
ensure an iterative development approach. We expect that developers
using this feature will have some experience with Git and will notice
the warnings in the documentation, which caution against relying on
the default format remaining unchanged.
While the “fetch and locally inspect” approach is an option, it would
undermine the purpose of the feature, as highlighted by Christian’s
partial clone and sparse checkout example. This feature is
specifically designed to provide information without requiring the
objects to be fetched, making such an alternative counterproductive.
> Another thing you did not list, which is related, is where the
> "fetch and locally inspect" fallback fetch the object into. Would
> we use a quarantine mechanism, so that a mere request for remote
> object info for an object will not contaminate our local object
> store until the next gc realizes that such an object is dangling?
>
> Thanks.
Thank you.
Currently, the fetched object becomes a loose object in the local
object store. Several test cases in
t1017-cat-file-remote-object-info.sh cover this. For example:
'remote-object-info fallback git://: fetch objects to client'
'remote-object-info fallback file://: fetch objects to client'
'remote-object-info fallback http://: fetch objects to client'
These tests run with transfer.advertiseobjectinfo set to false.
I am not sure about adding a quarantine mechanism at this stage:
Pros:
- The quarantine area can be garbage collected, preventing
contamination of the local storage.
- The quarantine area could be utilized to calculate metadata
information, such as in the partial clone and sparse checkout example
mentioned above. Objects in the quarantine can be selectively included
or excluded from such calculations.
Cons:
- Implementing a quarantine mechanism seems like a separate feature.
This patch series already introduces a number of changes, and
including the quarantine mechanism might make it too extensive.
- Based on Mr. Patrick Steinhardt’s comment at [1], since
remote-object-info operates only on protocol v2, adding a quarantine
mechanism may lead to differing client-side behavior depending on the
protocol, which could complicate the feature’s consistency.
In my opinion, the quarantine mechanism appears to have a broader
scope that extends beyond just remote-object-info. If deemed
necessary, it would be more appropriate to address it in its own
dedicated patch series.
[1] https://gitlab.com/gitlab-org/git/-/merge_requests/168#note_2212333586
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-18 16:28 ` Peijian Ju
@ 2024-11-19 0:16 ` Junio C Hamano
2024-11-19 6:31 ` Patrick Steinhardt
0 siblings, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2024-11-19 0:16 UTC (permalink / raw)
To: Peijian Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Peijian Ju <eric.peijian@gmail.com> writes:
> While the “fetch and locally inspect” approach is an option, it would
> undermines the purpose of the feature, as highlighted by Christian’s
> partial clone and sparse checkout example. This feature is
> specifically designed to provide information without requiring the
> objects to be fetched, making such an alternative counterproductive.
Thanks, then wouldn't it make more sense to say, because support for
new protocol capabilities on the server side would have to happen at
a lot fewer places than the clients, we only work when the necessary
protocol extension support is available, without any "fetch and
locally inspect" fallback?
The above is after reading your "cons" here of the fallback.
> Cons:
> - Implementing a quarantine mechanism seems like a separate feature.
> This patch series already introduces a number of changes, and
> including the quarantine mechanism might make it too extensive.
Not an excuse to introduce incomplete changes that are not
sufficient to be useful, though.
> - Based on Mr. Patrick Steinhardt’s comment at [1], since
> remote-object-info operates only on protocol v2, adding a quarantine
> mechanism may lead to differing client-side behavior depending on the
> protocol, which could complicate the feature’s consistency.
Not doing quarantine would give even _more_ different client-side
behaviour, though. When talking with a server with v2, you'll not
see a cruft object left locally, but with older servers, you'll see
crufts left behind. After a failed remote-object-info call, you can
do an object-info to figure out what you needed to learn about the
object, but only after the failed remote-object-info was against an
older server.
So, I do not see it as a reason against putting temporary objects
into quarantine.
Not that I consider it important to give the same client-side
behaviour when talking with older and newer servers, though. It is
natural for a new feature to be available only with versions of Git
that supports the feature, after all.
And if we throw that away as a goal, it starts to make more sense
not to add "fetch and locally inspect" anywhere.
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-19 0:16 ` Junio C Hamano
@ 2024-11-19 6:31 ` Patrick Steinhardt
2024-11-19 6:48 ` Junio C Hamano
0 siblings, 1 reply; 174+ messages in thread
From: Patrick Steinhardt @ 2024-11-19 6:31 UTC (permalink / raw)
To: Junio C Hamano
Cc: Peijian Ju, git, calvinwan, jonathantanmy, chriscool, karthik.188,
toon, jltobler
On Tue, Nov 19, 2024 at 09:16:50AM +0900, Junio C Hamano wrote:
> Peijian Ju <eric.peijian@gmail.com> writes:
> > - Based on Mr. Patrick Steinhardt’s comment at [1], since
> > remote-object-info operates only on protocol v2, adding a quarantine
> > mechanism may lead to differing client-side behavior depending on the
> > protocol, which could complicate the feature’s consistency.
>
> Not doing quarantine would give even _more_ different client-side
> behaviour, though. When talking with a server with v2, you'll not
> see a cruft object left locally, but with older servers, you'll see
> crufts left behind. After a failed remote-object-info call, you can
> do an object-info to figure out what you needed to learn about the
> object, but only after the failed remote-object-info was against an
> older server.
>
> So, I do not see it as a reason against putting temporary objects
> into quarantine.
I agree, and that's also what I wanted to say in the linked comment.
> Not that I consider it important to give the same client-side
> behaviour when talking with older and newer servers, though. It is
> natural for a new feature to be available only with versions of Git
> that supports the feature, after all.
I think having subtly different behaviour like this is a recipe for
confusion. As an end user (or rather end developer in this context) it
is quite likely that you start to rely on the object either being
fetched or not fetched as they probably won't end up testing against
servers with both protocol versions. So you'd have to be aware of the
fallback behaviour, and given that it is rather subtle and thus easy to
miss, the end result would likely be confusion when it works differently
in some repos than in others.
> And if we throw that away as a goal, it starts to make more sense
> not to add "fetch and locally inspect" anywhere.
While having a quarantine directory would help with the case where you
have differing end user behaviour depending on the protocol, it of
course wouldn't help with the implied performance hit when using the
fallback logic.
So maybe not having the fallback is the best solution after all, and
when users have a good use case for why they need it we could implement
it in a future iteration.
Patrick
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-19 6:31 ` Patrick Steinhardt
@ 2024-11-19 6:48 ` Junio C Hamano
2024-11-19 16:35 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2024-11-19 6:48 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Peijian Ju, git, calvinwan, jonathantanmy, chriscool, karthik.188,
toon, jltobler
Patrick Steinhardt <ps@pks.im> writes:
> While having a quarantine directory would help with the case where you
> have differing end user behaviour depending on the protocol, it of
> course wouldn't help with the implied performance hit when using the
> fallback logic.
>
> So maybe not having the fallback is the best solution after all, and
> when users have a good use case for why they need it we could implement
> it in a future iteration.
And I would strongly suspect that we won't have to implement any
fallback---hopefully folks upgrade their server side to be capable
of whatever capability is needed fast enough ;-).
Thanks.
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-19 6:48 ` Junio C Hamano
@ 2024-11-19 16:35 ` Peijian Ju
2024-11-20 1:19 ` Junio C Hamano
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2024-11-19 16:35 UTC (permalink / raw)
To: Junio C Hamano
Cc: Patrick Steinhardt, git, calvinwan, jonathantanmy, chriscool,
karthik.188, toon, jltobler
On Tue, Nov 19, 2024 at 1:48 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Patrick Steinhardt <ps@pks.im> writes:
>
> > While having a quarantine directory would help with the case where you
> > have differing end user behaviour depending on the protocol, it of
> > course wouldn't help with the implied performance hit when using the
> > fallback logic.
> >
> > So maybe not having the fallback is the best solution after all, and
> > when users have a good use case for why they need it we could implement
> > it in a future iteration.
>
> And I would strongly suspect that we won't have to implement any
> fallback---hopefully folks upgrade their server side to be capable
> of whatever capability is needed fast enough ;-).
>
> Thanks.
Thank you, sir, and also Patrick and Christian, for helping clarify
this for me. That makes sense. I will remove the "fetch and inspect"
approach in the next series.
* Re: [PATCH v6 0/6] cat-file: add remote-object-info to batch-command
2024-11-19 16:35 ` Peijian Ju
@ 2024-11-20 1:19 ` Junio C Hamano
0 siblings, 0 replies; 174+ messages in thread
From: Junio C Hamano @ 2024-11-20 1:19 UTC (permalink / raw)
To: Peijian Ju
Cc: Patrick Steinhardt, git, calvinwan, jonathantanmy, chriscool,
karthik.188, toon, jltobler
Peijian Ju <eric.peijian@gmail.com> writes:
> On Tue, Nov 19, 2024 at 1:48 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> > While having a quarantine directory would help with the case where you
>> > have differing end user behaviour depending on the protocol, it of
>> > course wouldn't help with the implied performance hit when using the
>> > fallback logic.
>> >
>> > So maybe not having the fallback is the best solution after all, and
>> > when users have a good use case for why they need it we could implement
>> > it in a future iteration.
>>
>> And I would strongly suspect that we won't have to implement any
>> fallback---hopefully folks upgrade their server side to be capable
>> of whatever capability is needed fast enough ;-).
>>
>> Thanks.
>
> Thank you, sir, and also Patrick and Christian, for helping clarify
> this for me. That makes sense. I will remove the "fetch and inspect"
> approach in the next series.
Thanks, everybody. And thank you for working on this topic.
* [PATCH v7 0/6] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (11 preceding siblings ...)
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 5:36 ` [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
` (5 more replies)
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
` (3 subsequent siblings)
16 siblings, 6 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This patch series is a continuation of Calvin Wan’s (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info
command at [1].
Sometimes it is beneficial to retrieve information about an object without
having to download it completely. The server logic for retrieving size has
already been implemented and merged in "a2ba162cda (object-info: support for
retrieving object info, 2021-04-20)"[2]. This patch series implements the client
option for it.
This patch series adds the `remote-object-info` command to
`cat-file --batch-command`. This command allows the client to make an
object-info command request to a server that supports protocol v2.
If the server uses protocol v2 but does not support the object-info capability,
`cat-file --batch-command` will die.
If a user attempts to use `remote-object-info` with protocol v1,
`cat-file --batch-command` will die.
Currently, only the size is supported in this implementation. If the batch
command format contains objecttype, objectsize:disk, or deltabase, it will die.
The default format is set to %(objectname) %(objectsize) when remote-object-info
is used. When "%(objecttype)" is supported, the default format will be unified.
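As a hedged illustration of the command described above, the sketch below
only builds the request line in the `remote-object-info <remote> <object>...`
form documented by this series. The helper name and the example URL are
hypothetical, and the commented-out invocation would need a Git built with
these patches.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print one batch-command request
# line of the form
#   remote-object-info <remote> <object>...
# It refuses to build a request without at least one object, mirroring
# the documented "Error when no object references are provided".
remote_object_info_request () {
	if test "$#" -lt 2
	then
		echo "usage: remote_object_info_request <remote> <object>..." >&2
		return 1
	fi
	# "$*" joins the remote and all object ids with single spaces.
	printf 'remote-object-info %s\n' "$*"
}

# Example (requires a Git that includes this series; the URL is a placeholder):
# remote_object_info_request "https://example.com/repo.git" "$oid1" "$oid2" |
#	git cat-file --batch-command="%(objectname) %(objectsize)"
```

Passing several object ids in one request line matches the intent stated in
the commit message, namely to avoid one server round trip per object.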
[1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@google.com/#t
[2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7
Changes since V6
================
- Removal of “Fetch and Inspect” Fallback. In the previous patch series, if the
object-info capability was not supported (i.e., transfer.advertiseobjectinfo
was set to false), the remote-object-info command would fall back to fetching
the object locally and inspecting it. This “fetch and inspect” behavior has
been removed. Instead, the client will now error and exit if the object-info
capability is not supported by the server. For more details, refer to the
  discussion in this thread:
https://lore.kernel.org/git/CAN2LT1Dm17-mmoMQr457fb5ta-TxG6Fj3Ma-gPh4YRJV9rRDrw@mail.gmail.com/.
- Test Updates. Adjusted tests to cover cases where the object-info capability
is not supported by the server.
- Documentation Updates. Removed references to the “fetch and inspect” fallback
mechanism.
- Typos and Formatting Fixes.
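Since the client now errors and exits instead of falling back, a driver has
to treat a non-zero exit status as final. A minimal sketch, assuming the
driver wraps an arbitrary query command; `run_query` is a hypothetical name
and not part of this series:

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Git): run a query command, pass its
# stdout through on success, and stop with a clear message on failure
# rather than retrying with a fetch.
run_query () {
	if output=$("$@")
	then
		printf '%s\n' "$output"
	else
		echo "query failed; not falling back to fetching objects" >&2
		return 1
	fi
}
```

The wrapped command's own error output is left untouched, so a message such
as "object-info capability is not enabled on the server" would still reach
the caller's stderr.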
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (2):
cat-file: add declaration of variable i inside its for loop
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.txt | 24 +-
Makefile | 1 +
builtin/cat-file.c | 110 ++++-
connect.c | 34 ++
connect.h | 8 +
fetch-object-info.c | 92 ++++
fetch-object-info.h | 18 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 652 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 28 +-
transport.h | 11 +
18 files changed, 1021 insertions(+), 68 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v6:
1: 0998382d5e = 1: 92feca4218 cat-file: add declaration of variable i inside its for loop
2: 26b861f416 ! 2: 58a24a4b92 fetch-pack: refactor packet writing
@@ connect.h: void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
++/**
++ * write_command_and_capabilities writes a command along with the requested
++ * server capabilities/features into a request buffer.
++ */
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
3: 044d0ec46c = 3: 6bd0f945c2 fetch-pack: move fetch initialization
4: 09b4a2c081 = 4: 000c82b681 serve: advertise object-info feature
5: 4782211b31 ! 5: 97c03a9c2c transport: add client support for object-info
@@ Metadata
## Commit message ##
transport: add client support for object-info
- Sometimes it is useful to get information about an object without having
- to download it completely. The server logic has already been implemented
- in “a2ba162cda (object-info: support for retrieving object info,
- 2021-04-20)”.
+ Sometimes, it is beneficial to retrieve information about an object
+ without downloading it entirely. The server-side logic for this
+ functionality was implemented in commit "a2ba162cda (object-info:
+ support for retrieving object info, 2021-04-20)."
- Add client functions to communicate with the server.
+ This commit introduces client functions to interact with the server.
- The client currently supports requesting a list of object ids with
- feature 'size' from a v2 server. If a server does not
- advertise the feature, then the client falls back
- to making the request through 'fetch'.
+ Currently, the client supports requesting a list of object IDs with
+ the ‘size’ feature from a v2 server. If the server does not advertise
+ this feature (i.e., transfer.advertiseobjectinfo is set to false),
+ the client will return an error and exit.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
@@ fetch-object-info.c (new)
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
-+ return -1;
-+ if (unsorted_string_list_has_string(args->object_info_options, "size")
-+ && !server_supports_feature("object-info", "size", 0))
-+ return -1;
++ die(_("object-info capability is not enabled on the server"));
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
@@ transport.c
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
-@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
- struct ref *refs = NULL;
- struct fetch_pack_args args;
- struct ref *refs_tmp = NULL, **to_fetch_dup = NULL;
-+ struct ref *object_info_refs = NULL;
-
- memset(&args, 0, sizeof(args));
- args.uploadpack = data->options.uploadpack;
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
-+ struct ref *ref_itr = object_info_refs = alloc_ref("");
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
-+ if (!fetch_object_info(data->version, &obj_info_args, &reader,
-+ data->options.object_info_data, transport->stateless_rpc,
-+ data->fd[1])) {
-+ /*
-+ * If the code reaches here, fetch_object_info is successful and
-+ * remote object info are retrieved from packets (i.e. without
-+ * downloading the objects).
-+ */
-+ goto cleanup;
-+ }
++ ret = fetch_object_info(data->version, &obj_info_args, &reader,
++ data->options.object_info_data, transport->stateless_rpc,
++ data->fd[1]);
++ goto cleanup;
- if (!data->finished_handshake) {
-+ /*
-+ * If the code reaches here, it means we can't retrieve object info from
-+ * packets, and we will fallback to downland the pack files.
-+ * We set quiet and no_progress to be true, so that the internal call to
-+ * fetch-pack is less verbose.
-+ */
-+ args.object_info_data = data->options.object_info_data;
-+ args.quiet = 1;
-+ args.no_progress = 1;
-+
-+ /*
-+ * Allocate memory for object info data according to oids.
-+ * The actual results will be retrieved later from the downloaded
-+ * pack files.
-+ */
-+ for (size_t i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
-+ ref_itr->old_oid = transport->smart_options->object_info_oids->oid[i];
-+ ref_itr->exact_oid = 1;
-+ if (i == transport->smart_options->object_info_oids->nr - 1)
-+ /* last element, no need to allocate to next */
-+ ref_itr->next = NULL;
-+ else
-+ ref_itr->next = alloc_ref("");
-+
-+ ref_itr = ref_itr->next;
-+ }
-+
-+ transport->remote_refs = object_info_refs;
-+
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
-@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
- &transport->pack_lockfiles, data->version);
-
- data->finished_handshake = 0;
-+
-+ /* Retrieve object info data from the downloaded pack files */
-+ if (args.object_info) {
-+ struct ref *ref_cpy_reader = object_info_refs;
-+ for (int i = 0; ref_cpy_reader; i++) {
-+ oid_object_info_extended(the_repository, &ref_cpy_reader->old_oid,
-+ &args.object_info_data[i], OBJECT_INFO_LOOKUP_REPLACE);
-+ ref_cpy_reader = ref_cpy_reader->next;
-+ }
-+ }
-+
- data->options.self_contained_and_connected =
- args.self_contained_and_connected;
- data->options.connectivity_checked = args.connectivity_checked;
-@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
- ret = -1;
-
- cleanup:
-+ free_refs(object_info_refs);
- close(data->fd[0]);
- if (data->fd[1] >= 0)
- close(data->fd[1]);
## transport.h ##
@@
@@ transport.h: struct git_transport_options {
unsigned connectivity_checked:1;
+ /*
-+ * Transport will attempt to pull only object-info. Fallbacks
-+ * to pulling entire object if object-info is not supported.
++ * Transport will attempt to retrieve only object-info.
++ * If object-info is not supported, the operation will error and exit.
+ */
+ unsigned object_info : 1;
+
6: 91689c0d0d ! 6: 7d1936591d cat-file: add remote-object-info to batch-command
@@ Commit message
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
+
Add `remote-object-info` to cat-file --batch-command.
- While `info` takes object ids one at a time, this creates overhead when
- making requests to a server so `remote-object-info` instead can take
- multiple object ids at once.
+ While `info` takes object ids one at a time, this creates
+ overhead when making requests to a server.So `remote-object-info`
+ instead can take multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
@@ Documentation/git-cat-file.txt: info <object>::
output of `--batch-check`.
+remote-object-info <remote> <object>...::
-+ Print object info for object references `<object>` at specified <remote>
-+ without downloading objects from remote. If the object-info capability
-+ is not supported by the server, the objects will be downloaded instead.
++ Print object info for object references `<object>` at specified
++ `<remote>` without downloading objects from the remote.
++ Error when the `object-info` capability is not supported by the server.
+ Error when no object references are provided.
+ This command may be combined with `--buffer`.
+
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
-+ data->size = *remote_object_info[i].sizep;
-+ } else {
+ /*
-+ * When reaching here, it means remote-object-info can't retrieve
-+ * information from server without downloading them, and the objects
-+ * have been fetched to client already.
-+ * Print the information using the logic for local objects.
++ * When reaching here, it means remote-object-info can retrieve
++ * information from server without downloading them.
+ */
-+ data->skip_object_info = 0;
++ data->size = *remote_object_info[i].sizep;
++ opt->batch_mode = BATCH_MODE_INFO;
++ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
-+
-+ opt->batch_mode = BATCH_MODE_INFO;
-+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
-+
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
@@ t/t1017-cat-file-remote-object-info.sh (new)
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
-+
-+test_expect_success 'remote-object-info fallback git://: fetch objects to client' '
++test_expect_success 'batch-command remote-object-info git:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
-+ cd "$daemon_parent/daemon_client_empty" &&
-+
-+ # Prove object is not on the client
-+ echo "$hello_oid missing" >expect &&
-+ echo "$tree_oid missing" >>expect &&
-+ echo "$commit_oid missing" >>expect &&
-+ echo "$tag_oid missing" >>expect &&
+
-+ # These results prove remote-object-info can retrieve object info
-+ echo "$hello_oid $hello_size" >>expect &&
-+ echo "$tree_oid $tree_size" >>expect &&
-+ echo "$commit_oid $commit_size" >>expect &&
-+ echo "$tag_oid $tag_size" >>expect &&
-+
-+ # These results are for the info command
-+ # They prove objects are downloaded
-+ echo "$hello_oid $hello_size" >>expect &&
-+ echo "$tree_oid $tree_size" >>expect &&
-+ echo "$commit_oid $commit_size" >>expect &&
-+ echo "$tag_oid $tag_size" >>expect &&
-+
-+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ info $hello_oid
-+ info $tree_oid
-+ info $commit_oid
-+ info $tag_oid
++ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
-+ info $hello_oid
-+ info $tree_oid
-+ info $commit_oid
-+ info $tag_oid
+ EOF
++ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
-+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
++ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+
-+ test_cmp expect actual
+ )
+'
+
@@ t/t1017-cat-file-remote-object-info.sh (new)
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
-+
-+test_expect_success 'remote-object-info fallback file://: fetch objects to client' '
++test_expect_success 'batch-command remote-object-info file:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
-+ cd file_client_empty &&
+
-+ # Prove object is not on the client
-+ echo "$hello_oid missing" >expect &&
-+ echo "$tree_oid missing" >>expect &&
-+ echo "$commit_oid missing" >>expect &&
-+ echo "$tag_oid missing" >>expect &&
-+
-+ # These results prove remote-object-info can retrieve object info
-+ echo "$hello_oid $hello_size" >>expect &&
-+ echo "$tree_oid $tree_size" >>expect &&
-+ echo "$commit_oid $commit_size" >>expect &&
-+ echo "$tag_oid $tag_size" >>expect &&
-+
-+ # These results are for the info command
-+ # They prove objects are downloaded
-+ echo "$hello_oid $hello_size" >>expect &&
-+ echo "$tree_oid $tree_size" >>expect &&
-+ echo "$commit_oid $commit_size" >>expect &&
-+ echo "$tag_oid $tag_size" >>expect &&
-+
-+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
-+ info $hello_oid
-+ info $tree_oid
-+ info $commit_oid
-+ info $tag_oid
++ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
-+ info $hello_oid
-+ info $tree_oid
-+ info $commit_oid
-+ info $tag_oid
+ EOF
++ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
-+ git -C "${server_path}" config transfer.advertiseobjectinfo true &&
-+ test_cmp expect actual
++ git -C "${server_path}" config transfer.advertiseobjectinfo true
+ )
+'
+
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ )
+'
+
-+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
++test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ )
+'
+
-+test_expect_success 'remote-object-info fails on malformed OID fallback' '
++test_expect_success 'remote-object-info fails on malformed OID with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ )
+'
+
++
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
-+
-+test_expect_success 'remote-object-info fallback http://: fetch objects to client' '
++test_expect_success 'batch-command remote-object-info http:// fails when transfer.advertiseobjectinfo=false ' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
-+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
-+
-+ # Prove object is not on the client
-+ echo "$hello_oid missing" >expect &&
-+ echo "$tree_oid missing" >>expect &&
-+ echo "$commit_oid missing" >>expect &&
-+ echo "$tag_oid missing" >>expect &&
+
-+ # These results prove remote-object-info can retrieve object info
-+ echo "$hello_oid $hello_size" >>expect &&
-+ echo "$tree_oid $tree_size" >>expect &&
-+ echo "$commit_oid $commit_size" >>expect &&
-+ echo "$tag_oid $tag_size" >>expect &&
-+
-+ # These results are for the info command
-+ # They prove objects are downloaded
-+ echo "$hello_oid $hello_size" >>expect &&
-+ echo "$tree_oid $tree_size" >>expect &&
-+ echo "$commit_oid $commit_size" >>expect &&
-+ echo "$tag_oid $tag_size" >>expect &&
-+
-+ git cat-file --batch-command >actual <<-EOF &&
-+ info $hello_oid
-+ info $tree_oid
-+ info $commit_oid
-+ info $tag_oid
++ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
-+ info $hello_oid
-+ info $tree_oid
-+ info $commit_oid
-+ info $tag_oid
+ EOF
++ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
-+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
-+
-+ test_cmp expect actual
++ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true
+ )
+'
+
--
2.47.0
Information Footer:
base-commit: 8f8d6eee531b3fa1a8ef14f169b0cb5035f7a772
Merge Request: https://gitlab.com/gitlab-org/git/-/merge_requests/168
* [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-11-25 5:36 ` [PATCH v7 2/6] fetch-pack: refactor packet writing Eric Ju
` (4 subsequent siblings)
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
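For readers skimming the diff below, the pattern this patch applies can be
sketched in isolation. This is a minimal standalone illustration, not code
from git: the helper name count_nonempty() is hypothetical. It shows an index
that is declared in the for statement itself and has the same type as the
loop bound.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of git: count the non-empty strings in an
 * array. The index exists only inside the loop and has the same type as
 * the bound, so the comparison never mixes signed and unsigned values.
 */
size_t count_nonempty(const char *const *items, size_t nr)
{
	size_t count = 0;

	assert(items || !nr);
	for (size_t i = 0; i < nr; i++)
		if (items[i] && *items[i])
			count++;
	return count;
}
```

Matching the index type to the bound is a choice made for this sketch; the
hunks below keep whatever types the surrounding code already uses.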
builtin/cat-file.c | 11 +++--------
fetch-pack.c | 3 +--
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index bfdfb51c7c..5db55fabc4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -673,12 +673,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -686,9 +684,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -714,7 +710,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -724,7 +719,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index fe1fb3c1b7..bb7ec96963 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1328,9 +1328,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
if (advertise_sid && server_supports_v2("session-id"))
packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
if (server_options && server_options->nr) {
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
+ for (int i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
--
2.47.0
* [PATCH v7 2/6] fetch-pack: refactor packet writing
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
2024-11-25 5:36 ` [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-11-25 5:36 ` [PATCH v7 3/6] fetch-pack: move fetch initialization Eric Ju
` (3 subsequent siblings)
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to a more
general-purpose function, write_command_and_capabilities(), enabling it
to serve both fetch and additional commands.
In this context, "command" refers to the "operations" supported by
Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
subcommand (e.g., git-fetch(1)) or a server-side operation like
"object-info" as implemented in commit a2ba162c
(object-info: support for retrieving object info, 2021-04-20).
Furthermore, write_command_and_capabilities() is moved to connect.c,
making it accessible to additional commands in the future.
To move write_command_and_capabilities() to connect.c, we need to
adjust how `advertise_sid` is managed. Previously, in fetch-pack.c,
`advertise_sid` was a static variable, modified using
git_config_get_bool().
In connect.c, we now initialize `advertise_sid` at the beginning by
directly using git_config_get_bool(). This change is safe because
the original fetch-pack.c code writes `advertise_sid` in only two
places:
1. In function do_fetch_pack:
if (!server_supports("session-id"))
advertise_sid = 0;
2. In function fetch_pack_config():
git_config_get_bool("transfer.advertisesid", &advertise_sid);
Regarding 1, do_fetch_pack() is only relevant for protocol v1, so this
assignment can be ignored in our refactor:
write_command_and_capabilities() is only used in protocol v2.
Regarding 2, git_config_get_bool() comes from config.h, which is
already a dependency of connect.c, so we can reuse it directly.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
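To make the shape of the request concrete, here is a minimal standalone
sketch of how a protocol v2 command request is framed as pkt-lines. It does
not use git's strbuf or packet_buf_write(); the helper names raw_append(),
pkt_append() and build_v2_request() are hypothetical, and capability lines
such as "agent=" or "object-format=" are left out for brevity. Each pkt-line
is four hex digits giving the payload length plus four, followed by the
payload; "0001" is the delimiter packet and "0000" the flush packet.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Largest payload one pkt-line may carry (65520 minus the 4-byte length). */
#define PKT_MAX_PAYLOAD 65516

/* Append raw bytes, e.g. the special packets "0001" (delim) or "0000" (flush). */
int raw_append(char *buf, size_t cap, size_t *len, const char *s)
{
	size_t n = strlen(s);

	assert(buf && len && *len < cap);
	if (n >= cap - *len)
		return -1; /* no room for the bytes plus the trailing NUL */
	memcpy(buf + *len, s, n + 1);
	*len += n;
	return 0;
}

/* Append one pkt-line: four hex digits (payload length + 4), then the payload. */
int pkt_append(char *buf, size_t cap, size_t *len, const char *payload)
{
	char hdr[5];
	size_t n = strlen(payload);

	assert(buf && len && *len < cap);
	if (n > PKT_MAX_PAYLOAD || n + 4 >= cap - *len)
		return -1; /* check up front so nothing is half-written */
	snprintf(hdr, sizeof(hdr), "%04zx", n + 4);
	if (raw_append(buf, cap, len, hdr) < 0)
		return -1;
	return raw_append(buf, cap, len, payload);
}

/*
 * Hypothetical request builder: "command=<name>", a delim packet, the
 * command arguments, and a final flush packet.
 */
int build_v2_request(char *buf, size_t cap, const char *command,
		     const char *const *args, size_t nr_args)
{
	char line[128];
	size_t len = 0;
	int n;

	assert(buf && cap && command);
	buf[0] = '\0';
	n = snprintf(line, sizeof(line), "command=%s", command);
	if (n < 0 || (size_t)n >= sizeof(line))
		return -1;
	if (pkt_append(buf, cap, &len, line) < 0 ||
	    raw_append(buf, cap, &len, "0001") < 0)
		return -1;
	for (size_t i = 0; i < nr_args; i++)
		if (pkt_append(buf, cap, &len, args[i]) < 0)
			return -1;
	return raw_append(buf, cap, &len, "0000");
}
```

Passing the command name as a parameter is what lets one writer serve both
"fetch" and "object-info"; everything after the delimiter packet is specific
to the command.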
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 8 ++++++++
fetch-pack.c | 35 ++---------------------------------
3 files changed, 44 insertions(+), 33 deletions(-)
diff --git a/connect.c b/connect.c
index 58f53d8dcb..5dd544335c 100644
--- a/connect.c
+++ b/connect.c
@@ -688,6 +688,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid = 0;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (int i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..8b56a68b62 100644
--- a/connect.h
+++ b/connect.h
@@ -1,6 +1,7 @@
#ifndef CONNECT_H
#define CONNECT_H
+#include "string-list.h"
#include "protocol.h"
#define CONNECT_VERBOSE (1u << 0)
@@ -30,4 +31,11 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+/**
+ * write_command_and_capabilities writes a command along with the requested
+ * server capabilities/features into a request buffer.
+ */
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index bb7ec96963..bcee4004a1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1316,37 +1316,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
- for (int i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1357,7 +1326,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2175,7 +2144,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.0
* [PATCH v7 3/6] fetch-pack: move fetch initialization
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
2024-11-25 5:36 ` [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-11-25 5:36 ` [PATCH v7 2/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 5:36 ` [PATCH v7 4/6] serve: advertise object-info feature Eric Ju
` (2 subsequent siblings)
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
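The reasoning above can be shown with a small standalone sketch. It is not
the fetch-pack state machine: the enum, struct and function names are
hypothetical, and the states only record that they ran. It shows that setup
done before the loop is visible no matter which state the machine starts in.

```c
#include <assert.h>

enum sketch_state {
	SKETCH_CHECK_LOCAL,
	SKETCH_SEND_REQUEST,
	SKETCH_DONE
};

struct sketch_ctx {
	int use_sideband;  /* shared setup that every state relies on */
	int local_checked; /* work that belongs to SKETCH_CHECK_LOCAL only */
	int requests_sent;
};

void run_machine(struct sketch_ctx *ctx, enum sketch_state state)
{
	assert(ctx);

	/* Shared setup happens before the loop, so any entry state sees it. */
	ctx->use_sideband = 2;

	while (state != SKETCH_DONE) {
		switch (state) {
		case SKETCH_CHECK_LOCAL:
			ctx->local_checked = 1;
			state = SKETCH_SEND_REQUEST;
			break;
		case SKETCH_SEND_REQUEST:
			/* Would fail if the setup still lived in CHECK_LOCAL. */
			assert(ctx->use_sideband == 2);
			ctx->requests_sent++;
			state = SKETCH_DONE;
			break;
		case SKETCH_DONE:
			break;
		}
	}
}
```

Starting at SKETCH_SEND_REQUEST skips the local check but still finds the
shared setup in place, which is the property a second initial state needs.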
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index bcee4004a1..eb4aface36 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1647,18 +1647,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.0
* [PATCH v7 4/6] serve: advertise object-info feature
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
` (2 preceding siblings ...)
2024-11-25 5:36 ` [PATCH v7 3/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 5:36 ` [PATCH v7 5/6] transport: add client support for object-info Eric Ju
2024-11-25 5:36 ` [PATCH v7 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
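As an illustration of what the advertisement looks like on the wire, here is
a minimal standalone sketch. It does not use serve.c or strbuf; the function
name format_capability() and its parameters are hypothetical. It assumes the
capability is written as "name=value" when the advertise callback supplies a
value and as a bare "name" otherwise.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: returns 1 and fills `out` when the capability is
 * advertised, 0 when it is disabled, and -1 when `out` is too small.
 */
int format_capability(char *out, size_t cap, const char *name,
		      int enabled, const char *value)
{
	int n;

	assert(out && cap && name);
	out[0] = '\0';
	if (!enabled)
		return 0; /* capability is not advertised at all */
	if (value && *value)
		n = snprintf(out, cap, "%s=%s", name, value);
	else
		n = snprintf(out, cap, "%s", name);
	if (n < 0 || (size_t)n >= cap) {
		out[0] = '\0';
		return -1;
	}
	return 1;
}
```

With transfer.advertiseobjectinfo enabled, this sketch yields
"object-info=size", so a client can tell which attributes it may request.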
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index d674764a25..c3d8098642 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.0
* [PATCH v7 5/6] transport: add client support for object-info
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
` (3 preceding siblings ...)
2024-11-25 5:36 ` [PATCH v7 4/6] serve: advertise object-info feature Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-11-25 5:36 ` [PATCH v7 6/6] cat-file: add remote-object-info to batch-command Eric Ju
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes, it is beneficial to retrieve information about an object
without downloading it entirely. The server-side logic for this
functionality was implemented in commit "a2ba162cda (object-info:
support for retrieving object info, 2021-04-20)."
This commit introduces client functions to interact with the server.
Currently, the client supports requesting a list of object IDs with
the ‘size’ feature from a v2 server. If the server does not advertise
this feature (i.e., transfer.advertiseobjectinfo is set to false),
the client will return an error and exit.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
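To illustrate the response handling, here is a minimal standalone sketch of
parsing one "<oid> <size>" line. It is not the code in fetch-object-info.c,
which uses string_list_split(); the function name parse_object_info_line()
is hypothetical. It assumes each response line carries the object id, one
space, and a decimal size, and it treats an empty or malformed size as an
error instead of indexing into a missing field.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<oid> <size>". Returns 0 on success and -1 when the size field is
 * missing, empty, or not a plain decimal number, or when the oid does not
 * fit into `oid`. Outputs are only written on success.
 */
int parse_object_info_line(const char *line, char *oid, size_t oid_cap,
			   unsigned long *size)
{
	const char *sp;
	char *end;
	size_t oid_len;
	unsigned long value;

	assert(line && oid && size);
	sp = strchr(line, ' ');
	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len >= oid_cap)
		return -1;
	if (!isdigit((unsigned char)sp[1]))
		return -1; /* empty size, sign, or stray whitespace */
	errno = 0;
	value = strtoul(sp + 1, &end, 10);
	if (errno || *end)
		return -1; /* overflow or trailing garbage */
	memcpy(oid, line, oid_len);
	oid[oid_len] = '\0';
	*size = value;
	return 0;
}
```

Rejecting the line before touching the outputs keeps a short or malformed
response from turning into an out-of-bounds read or a silent zero size.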
Makefile | 1 +
fetch-object-info.c | 92 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 18 +++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 +
transport-helper.c | 11 +++++-
transport.c | 28 +++++++++++++-
transport.h | 11 ++++++
8 files changed, 163 insertions(+), 3 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index d06c9a8ffa..beca828963 100644
--- a/Makefile
+++ b/Makefile
@@ -1024,6 +1024,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..2aa9f2b70d
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,92 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/**
+ * send_object_info_request sends git-cat-file object-info command and its
+ * arguments into the request buffer.
+ */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+/**
+ * fetch_object_info sends git-cat-file object-info command into the request buf
+ * and reads the results from packets.
+ */
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ die(_("object-info capability is not enabled on the server"));
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..ce1a05dc96
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,18 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index eb4aface36..6e5d7df2dc 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1653,6 +1653,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index b5c579cdae..cf7cedf161 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index bc27653cde..bf0a1877c7 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -711,8 +711,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -728,6 +728,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* fail the command explicitly to avoid further commands input. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 47fda6a773..746ec19ddc 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -444,8 +445,33 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ ret = fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1]);
+ goto cleanup;
- if (!data->finished_handshake) {
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
diff --git a/transport.h b/transport.h
index 44100fa9b7..e61e931863 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to retrieve only object-info.
+ * If object-info is not supported, the operation will error and exit.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.0
* [PATCH v7 6/6] cat-file: add remote-object-info to batch-command
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
` (4 preceding siblings ...)
2024-11-25 5:36 ` [PATCH v7 5/6] transport: add client support for object-info Eric Ju
@ 2024-11-25 5:36 ` Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
5 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-11-25 5:36 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
While `info` takes object ids one at a time, doing the same for a
remote would create overhead, because every object id would need its
own request to the server. So `remote-object-info` instead takes
multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
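The input form described above, one remote followed by several object ids,
can be sketched as a small standalone parser. This is not the code added to
builtin/cat-file.c: the struct, the MAX_OIDS limit and the function name
parse_remote_object_info() are hypothetical. It only shows how one line
yields a single batch that can be sent to the remote in one request.

```c
#include <assert.h>
#include <string.h>

#define MAX_OIDS 16 /* arbitrary limit for this sketch */

struct remote_request {
	const char *remote;
	const char *oids[MAX_OIDS];
	size_t nr;
};

/*
 * Split "<remote> <oid> <oid> ..." in place. Returns 0 on success and -1
 * when the remote or the object list is missing, or the list is too long.
 */
int parse_remote_object_info(char *args, struct remote_request *req)
{
	char *tok;

	assert(args && req);
	memset(req, 0, sizeof(*req));
	req->remote = strtok(args, " ");
	if (!req->remote)
		return -1;
	while ((tok = strtok(NULL, " "))) {
		if (req->nr == MAX_OIDS)
			return -1;
		req->oids[req->nr++] = tok;
	}
	return req->nr ? 0 : -1; /* at least one object id is required */
}
```

Collecting all ids before contacting the remote is what avoids one round
trip per object, in both the buffered and the unbuffered mode.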
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 99 ++++
object-file.c | 11 +
object-store-ll.h | 3 +
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 652 +++++++++++++++++++++++++
7 files changed, 802 insertions(+), 16 deletions(-)
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..6a2f9fd752 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at the specified
+ `<remote>` without downloading objects from the remote.
+ Errors out when the server does not support the `object-info` capability.
+ Errors out when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands, which use
+`%(objectname) %(objectsize)` for now because `%(objecttype)` is not supported yet.
+WARNING: Once `%(objecttype)` is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format staying the same!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
+currently not supported by the `remote-object-info` command, we will error
+and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5db55fabc4..ad17be69b0 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -24,6 +24,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -42,9 +45,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -667,6 +726,45 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line, struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ /*
+ * Reaching here means remote-object-info retrieved the object's
+ * information from the server without downloading the object.
+ */
+ data->size = *remote_object_info[i].sizep;
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -698,6 +796,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index b1a3463852..181cde98e1 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3132,3 +3132,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index 53b8e693b1..611e2ca708 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -548,4 +548,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(each_packed_object_fn, void *,
enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..9fb20be308
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related tests.
+
+# Print a string without a trailing newline
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Print the length of a string in bytes, stripping any leading spaces from the wc output.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index d36cd7c086..d8a851c427 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -4,6 +4,7 @@ test_description='git cat-file'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -99,18 +100,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..0e3044d8d6
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,652 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info git:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info file:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info http:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* Re: [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop
2024-11-25 5:36 ` [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 19:26 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Patrick Steinhardt @ 2024-11-25 9:51 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Nov 25, 2024 at 12:36:11AM -0500, Eric Ju wrote:
> diff --git a/fetch-pack.c b/fetch-pack.c
> index fe1fb3c1b7..bb7ec96963 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1328,9 +1328,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
> if (advertise_sid && server_supports_v2("session-id"))
> packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
> if (server_options && server_options->nr) {
> - int i;
> ensure_server_supports_v2("server-option");
> - for (i = 0; i < server_options->nr; i++)
> + for (int i = 0; i < server_options->nr; i++)
> packet_buf_write(req_buf, "server-option=%s",
> server_options->items[i].string);
> }
It's somewhat curious that you change the type to `size_t` while at it
in other spots, but not here. Doubly so because `server_options` is a
`struct string_list`, and `string_list::nr` is of type `size_t`.
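To illustrate the point about the index type, here is a minimal standalone sketch (not taken from the patch). `struct demo_list` and `demo_count_nonempty()` are hypothetical stand-ins; the only assumption carried over from the review is that the `nr` member is a `size_t`, so the loop index uses the same type.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list whose count member is a size_t. */
struct demo_list {
	const char **items;
	size_t nr;
};

/* Count the non-empty entries; the index type matches the type of `nr`. */
size_t demo_count_nonempty(const struct demo_list *list)
{
	size_t count = 0;

	for (size_t i = 0; i < list->nr; i++)
		if (list->items[i] && list->items[i][0])
			count++;
	return count;
}
```

Using `size_t` here avoids a signed/unsigned comparison between the index and `nr`.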
Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 2/6] fetch-pack: refactor packet writing
2024-11-25 5:36 ` [PATCH v7 2/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 19:09 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Patrick Steinhardt @ 2024-11-25 9:51 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Nov 25, 2024 at 12:36:12AM -0500, Eric Ju wrote:
> diff --git a/connect.h b/connect.h
> index 1645126c17..8b56a68b62 100644
> --- a/connect.h
> +++ b/connect.h
> @@ -1,6 +1,7 @@
> #ifndef CONNECT_H
> #define CONNECT_H
>
> +#include "string-list.h"
> #include "protocol.h"
>
> #define CONNECT_VERBOSE (1u << 0)
Instead of including this header, you can add a forward declaration of
`struct string_list`. This is mostly done to keep compilation times at
bay by not including too many headers.
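A minimal sketch of the forward-declaration approach, assuming the header only ever refers to the type through a pointer. `demo_option_count()` is a hypothetical function, and the struct layout in the second half is a simplified stand-in for what "string-list.h" provides, included only so the sketch compiles on its own.

```c
#include <stddef.h>

/* What the header would contain: no #include of the list header needed. */
struct string_list; /* forward declaration, enough for pointer parameters */
size_t demo_option_count(const struct string_list *options);

/*
 * What a .c file that needs the members would see after including the
 * real header. This layout is a simplified stand-in for illustration.
 */
struct string_list {
	const char **items;
	size_t nr;
};

size_t demo_option_count(const struct string_list *options)
{
	return options ? options->nr : 0;
}
```

Only translation units that dereference the pointer need the full definition.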
> @@ -30,4 +31,11 @@ void check_stateless_delimiter(int stateless_rpc,
> struct packet_reader *reader,
> const char *error);
>
> +/**
> + * write_command_and_capabilities writes a command along with the requested
Nit: we don't typically use Go-style comments where the comment starts
with the name of what's being documented.
Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 5/6] transport: add client support for object-info
2024-11-25 5:36 ` [PATCH v7 5/6] transport: add client support for object-info Eric Ju
@ 2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 3:15 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Patrick Steinhardt @ 2024-11-25 9:51 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Nov 25, 2024 at 12:36:15AM -0500, Eric Ju wrote:
> diff --git a/fetch-object-info.c b/fetch-object-info.c
> new file mode 100644
> index 0000000000..2aa9f2b70d
> --- /dev/null
> +++ b/fetch-object-info.c
> @@ -0,0 +1,92 @@
> +#include "git-compat-util.h"
> +#include "gettext.h"
> +#include "hex.h"
> +#include "pkt-line.h"
> +#include "connect.h"
> +#include "oid-array.h"
> +#include "object-store-ll.h"
> +#include "fetch-object-info.h"
> +#include "string-list.h"
> +
> +/**
> + * send_object_info_request sends git-cat-file object-info command and its
> + * arguments into the request buffer.
> + */
> +static void send_object_info_request(const int fd_out, struct object_info_args *args)
> +{
> + struct strbuf req_buf = STRBUF_INIT;
> +
> + write_command_and_capabilities(&req_buf, "object-info", args->server_options);
> +
> + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> + packet_buf_write(&req_buf, "size");
Do we have a document somewhere that spells out the wire format that
client- and server-side talk with each other? If so it would be nice to
point it out in the commit message so that I know where to look, and
otherwise we should document it. Without such a doc it's hard to figure
out whether this is correct.
> + if (args->oids) {
> + for (size_t i = 0; i < args->oids->nr; i++)
> + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> + }
Nit: needless curly braces.
> + packet_buf_flush(&req_buf);
> + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> + die_errno(_("unable to write request to remote"));
So we write the whole request into `req_buf` first before sending it to
the remote. Isn't that quite inefficient memory-wise? In other words,
couldn't we instead stream the request line by line or at least in
batches to the file descriptor?
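One way the batching idea could look, as a rough standalone sketch rather than anything from the patch or from git's pkt-line API. `struct demo_batch` and the `demo_batch_*()` functions are hypothetical, the 4096-byte batch size is arbitrary, and the only wire-format assumption is the usual pkt-line framing (four hex digits of total length, then the payload, with "0000" as the flush packet).

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BATCH_SIZE 4096 /* arbitrary batch size chosen for this sketch */

/* Hypothetical batching writer; not part of git's pkt-line API. */
struct demo_batch {
	int fd;
	size_t len;
	char buf[DEMO_BATCH_SIZE];
};

/* Write all pending bytes to the descriptor. Returns 0 on success, -1 on error. */
int demo_batch_flush(struct demo_batch *b)
{
	size_t off = 0;

	while (off < b->len) {
		ssize_t n = write(b->fd, b->buf + off, b->len - off);
		if (n < 0)
			return -1;
		off += (size_t)n;
	}
	b->len = 0;
	return 0;
}

/* Queue one pkt-line: four hex digits giving the total length, then the payload. */
int demo_batch_add_line(struct demo_batch *b, const char *payload)
{
	char hdr[5];
	size_t payload_len = strlen(payload);
	size_t total = payload_len + 4;

	if (total > sizeof(b->buf))
		return -1; /* a line larger than one batch is out of scope here */
	if (b->len + total > sizeof(b->buf) && demo_batch_flush(b) < 0)
		return -1;
	snprintf(hdr, sizeof(hdr), "%04zx", total);
	memcpy(b->buf + b->len, hdr, 4);
	memcpy(b->buf + b->len + 4, payload, payload_len);
	b->len += total;
	return 0;
}

/* Queue the flush packet ("0000") and send whatever is still pending. */
int demo_batch_end(struct demo_batch *b)
{
	if (b->len + 4 > sizeof(b->buf) && demo_batch_flush(b) < 0)
		return -1;
	memcpy(b->buf + b->len, "0000", 4);
	b->len += 4;
	return demo_batch_flush(b);
}
```

With this shape the memory used is bounded by the batch size instead of growing with the number of object IDs in the request.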
> + strbuf_release(&req_buf);
> +}
> +
> +/**
Nit: s|/**|/*|
> + * fetch_object_info sends git-cat-file object-info command into the request buf
> + * and read the results from packets.
> + */
> +int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
> + struct packet_reader *reader, struct object_info *object_info_data,
> + const int stateless_rpc, const int fd_out)
> +{
> + int size_index = -1;
> +
> + switch (version) {
> + case protocol_v2:
> + if (!server_supports_v2("object-info"))
> + die(_("object-info capability is not enabled on the server"));
> + send_object_info_request(fd_out, args);
> + break;
> + case protocol_v1:
> + case protocol_v0:
> + die(_("wrong protocol version. expected v2"));
> + case protocol_unknown_version:
> + BUG("unknown protocol version");
> + }
> +
> + for (size_t i = 0; i < args->object_info_options->nr; i++) {
> + if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
> + check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
> + return -1;
> + }
> + if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
Hum. Does this result in quadratic runtime behaviour?
> + if (!strcmp(reader->line, "size")) {
> + size_index = i;
> + for (size_t j = 0; j < args->oids->nr; j++)
> + object_info_data[j].sizep = xcalloc(1, sizeof(long));
This might be a bit more future proof in case the `sizep` type were ever
to change:
object_info_data[j].sizep = xcalloc(1, sizeof(*object_info_data[j].sizep));
It also allows you to skip double-checking whether you picked the
correct type. In fact, the type is actually an `unsigned long`, which
is confusing but ultimately does not make much of a difference because
it should have the same size.
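A small standalone sketch of the `sizeof(*ptr)` idiom suggested above. `struct demo_object_info` and `demo_alloc_sizes()` are hypothetical, and plain `calloc()` stands in for `xcalloc()` so the example does not depend on git's headers.

```c
#include <stdlib.h>

/* Hypothetical struct; only the pointer member matters for the idiom. */
struct demo_object_info {
	unsigned long *sizep;
};

/*
 * Allocate one zeroed size slot per entry. The element size is derived
 * from the pointer itself, so it stays correct if the type ever changes.
 * Returns 0 on success, -1 if an allocation fails.
 */
int demo_alloc_sizes(struct demo_object_info *info, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		info[i].sizep = calloc(1, sizeof(*info[i].sizep));
		if (!info[i].sizep)
			return -1;
	}
	return 0;
}
```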
> + }
> + continue;
> + }
> + return -1;
> + }
> +
> + for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
> + struct string_list object_info_values = STRING_LIST_INIT_DUP;
> +
> + string_list_split(&object_info_values, reader->line, ' ', -1);
> + if (0 <= size_index) {
> + if (!strcmp(object_info_values.items[1 + size_index].string, ""))
> + die("object-info: not our ref %s",
> + object_info_values.items[0].string);
> +
> + *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
We're completely missing error handling for strtoul(3p) here. That
function is also discouraged nowadays because error handling is hard to
do correctly. We have `strtoul_ui()` and friends, but don't have a variant
yet that knows how to return an `unsigned long`. We might backfill that
omission and then use it instead.
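A possible shape for such a variant, as a sketch only. `demo_strtoul_ul()` is a hypothetical name, and the checks (reject a minus sign, empty input, trailing garbage, and range errors) are an assumption about what a strict wrapper around strtoul(3) should cover rather than a copy of any existing helper.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical strict wrapper around strtoul(3).
 * Returns 0 and stores the value on success, -1 on any parse error.
 */
int demo_strtoul_ul(const char *s, int base, unsigned long *result)
{
	char *end;
	unsigned long value;

	/* strtoul() would accept a negative number and wrap it around. */
	if (strchr(s, '-'))
		return -1;

	errno = 0;
	value = strtoul(s, &end, base);
	/* errno: out of range; end == s: no digits; *end: trailing garbage. */
	if (errno || end == s || *end)
		return -1;

	*result = value;
	return 0;
}
```

A caller can then turn a failure into a proper error message instead of silently storing 0 or ULONG_MAX.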
> diff --git a/transport-helper.c b/transport-helper.c
> index bc27653cde..bf0a1877c7 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -728,6 +728,13 @@ static int fetch_refs(struct transport *transport,
> free_refs(dummy);
> }
>
> + /* fail the command explicitly to avoid further commands input. */
> + if (transport->smart_options->object_info)
> + die(_("remote-object-info requires protocol v2"));
The code path that checks for protocol v2 with "--negotiate-only"
knows to warn and return an error. Should we do the same here?
> diff --git a/transport.c b/transport.c
> index 47fda6a773..746ec19ddc 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -9,6 +9,7 @@
> #include "hook.h"
> #include "pkt-line.h"
> #include "fetch-pack.h"
> +#include "fetch-object-info.h"
> #include "remote.h"
> #include "connect.h"
> #include "send-pack.h"
> @@ -444,8 +445,33 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.server_options = transport->server_options;
> args.negotiation_tips = data->options.negotiation_tips;
> args.reject_shallow_remote = transport->smart_options->reject_shallow;
> + args.object_info = transport->smart_options->object_info;
> +
> + if (transport->smart_options
> + && transport->smart_options->object_info
> + && transport->smart_options->object_info_oids->nr > 0) {
Formatting is wrong here:
if (transport->smart_options &&
transport->smart_options->object_info &&
transport->smart_options->object_info_oids->nr > 0) {
Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 6/6] cat-file: add remote-object-info to batch-command
2024-11-25 5:36 ` [PATCH v7 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 19:23 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Patrick Steinhardt @ 2024-11-25 9:51 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Nov 25, 2024 at 12:36:16AM -0500, Eric Ju wrote:
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index d5890ae368..6a2f9fd752 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -314,7 +323,10 @@ newline. The available atoms are:
> line) are output in place of the `%(rest)` atom.
>
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> +DO NOT RELY on the current the default format to stay the same!!!
Is this stale or do we still not support `%(objecttype)`? I thought we
wanted to support that, as well, so that we don't have to change the
default format.
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 5db55fabc4..ad17be69b0 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
> object_context_release(&ctx);
> }
>
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> + int retval = 0;
> + struct remote *remote = NULL;
> + struct object_id oid;
> + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> + static struct transport *gtransport;
> +
> + /*
> + * Change the format to "%(objectname) %(objectsize)" when
> + * remote-object-info command is used. Once we start supporting objecttype
> + * the default format should change to DEFAULT_FORMAT
> + */
> + if (!opt->format)
> + opt->format = "%(objectname) %(objectsize)";
Seems like it isn't stale. Hum.
> + remote = remote_get(argv[0]);
> + if (!remote)
> + die(_("must supply valid remote when using remote-object-info"));
> +
> + oid_array_clear(&object_info_oids);
> + for (size_t i = 1; i < argc; i++) {
> + if (get_oid_hex(argv[i], &oid))
> + die(_("Not a valid object name %s"), argv[i]);
> + oid_array_append(&object_info_oids, &oid);
> + }
Should we return an error when the user didn't pass any object IDs?
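A sketch of what such a check could look like, written as a standalone function. `demo_check_remote_args()` and its error messages are hypothetical, and it returns an error code rather than dying so that the caller can decide how to react.

```c
#include <stdio.h>

/*
 * Hypothetical validation of "<remote> <oid>..." arguments.
 * Returns 0 when a remote and at least one object ID are present,
 * -1 otherwise (after printing a message to stderr).
 */
int demo_check_remote_args(int argc, const char **argv)
{
	if (argc < 1 || !argv[0] || !*argv[0]) {
		fprintf(stderr, "error: remote-object-info requires a remote\n");
		return -1;
	}
	if (argc < 2) {
		fprintf(stderr, "error: remote-object-info requires at least one object ID\n");
		return -1;
	}
	return 0;
}
```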
> @@ -667,6 +726,45 @@ static void parse_cmd_info(struct batch_options *opt,
> batch_one_object(line, output, opt, data);
> }
>
> +static void parse_cmd_remote_object_info(struct batch_options *opt,
> + const char *line, struct strbuf *output,
> + struct expand_data *data)
> +{
> + int count;
> + const char **argv;
> +
> + char *line_to_split = xstrdup_or_null(line);
> + count = split_cmdline(line_to_split, &argv);
> + if (get_remote_info(opt, count, argv))
> + goto cleanup;
> +
> + opt->use_remote_info = 1;
> + data->skip_object_info = 1;
> + for (size_t i = 0; i < object_info_oids.nr; i++) {
> +
Nit: empty newline at the start of a block.
> diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
> new file mode 100644
> index 0000000000..9fb20be308
> --- /dev/null
> +++ b/t/lib-cat-file.sh
I think it would make sense to split the introduction of
"lib-cat-file.sh" into a separate commit.
Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 5/6] transport: add client support for object-info
2024-11-25 9:51 ` Patrick Steinhardt
@ 2024-12-03 3:15 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-12-03 3:15 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Hi Patrick,
Thank you for your feedback. I have a few questions. I agree with the
comments I didn’t specifically respond to and will address them in v8.
Eric.
On Mon, Nov 25, 2024 at 4:51 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Mon, Nov 25, 2024 at 12:36:15AM -0500, Eric Ju wrote:
> > diff --git a/fetch-object-info.c b/fetch-object-info.c
> > new file mode 100644
> > index 0000000000..2aa9f2b70d
> > --- /dev/null
> > +++ b/fetch-object-info.c
> > @@ -0,0 +1,92 @@
> > +#include "git-compat-util.h"
> > +#include "gettext.h"
> > +#include "hex.h"
> > +#include "pkt-line.h"
> > +#include "connect.h"
> > +#include "oid-array.h"
> > +#include "object-store-ll.h"
> > +#include "fetch-object-info.h"
> > +#include "string-list.h"
> > +
> > +/**
> > + * send_object_info_request sends git-cat-file object-info command and its
> > + * arguments into the request buffer.
> > + */
> > +static void send_object_info_request(const int fd_out, struct object_info_args *args)
> > +{
> > + struct strbuf req_buf = STRBUF_INIT;
> > +
> > + write_command_and_capabilities(&req_buf, "object-info", args->server_options);
> > +
> > + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> > + packet_buf_write(&req_buf, "size");
>
> Do we have a document somewhere that spells out the wire format that
> client- and server-side talk with each other? If so it would be nice to
> point it out in the commit message so that I know where to look, and
> otherwise we should document it. Without such a doc it's hard to figure
> out whether this is correct.
>
Thank you. Is this what you are looking for?
https://git-scm.com/docs/protocol-v2#_object_info
If so, I will put it in the commit message in v8.
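For readers following along, a rough sketch of parsing one response line, assuming the line has the shape "<oid> SP <size>" as that section of the protocol documentation describes when only `size` is requested. `demo_parse_obj_info()` is a hypothetical helper, not code from this series.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for a line of the form "<oid> <size>", optionally
 * followed by a newline. Copies the object ID into oid_out (oid_cap bytes)
 * and stores the size. Returns 0 on success, -1 on a malformed line or a
 * missing size.
 */
int demo_parse_obj_info(const char *line, char *oid_out, size_t oid_cap,
			unsigned long *size_out)
{
	const char *sp = strchr(line, ' ');
	char *end;
	size_t oid_len;
	unsigned long value;

	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (oid_len == 0 || oid_len >= oid_cap)
		return -1;
	if (sp[1] < '0' || sp[1] > '9')
		return -1; /* empty or non-numeric size field */

	errno = 0;
	value = strtoul(sp + 1, &end, 10);
	if (errno || (*end != '\0' && strcmp(end, "\n") != 0))
		return -1;

	memcpy(oid_out, line, oid_len);
	oid_out[oid_len] = '\0';
	*size_out = value;
	return 0;
}
```

An empty size field is reported as an error here, which roughly corresponds to the case the patch reports as "not our ref".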
> > + if (args->oids) {
> > + for (size_t i = 0; i < args->oids->nr; i++)
> > + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> > + }
>
> Nit: needless curly braces.
>
> > + packet_buf_flush(&req_buf);
> > + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> > + die_errno(_("unable to write request to remote"));
>
> So we write the whole request into `req_buf` first before sending it to
> the remote. Isn't that quite inefficient memory-wise? In other words,
> couldn't we instead stream the request line by line or at least in
> batches to the file descriptor?
>
Thank you.
I followed the `send_fetch_request()` logic in `fetch-pack.c`. I’m
not entirely clear on how to “stream the request line by line or in
batches.” Could you point me to an example or reference that
demonstrates this approach?
> > + strbuf_release(&req_buf);
> > +}
> > +
> > +/**
>
> Nit: s|/**|/*|
>
> > + * fetch_object_info sends git-cat-file object-info command into the request buf
> > + * and read the results from packets.
> > + */
> > +int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
> > + struct packet_reader *reader, struct object_info *object_info_data,
> > + const int stateless_rpc, const int fd_out)
> > +{
> > + int size_index = -1;
> > +
> > + switch (version) {
> > + case protocol_v2:
> > + if (!server_supports_v2("object-info"))
> > + die(_("object-info capability is not enabled on the server"));
> > + send_object_info_request(fd_out, args);
> > + break;
> > + case protocol_v1:
> > + case protocol_v0:
> > + die(_("wrong protocol version. expected v2"));
> > + case protocol_unknown_version:
> > + BUG("unknown protocol version");
> > + }
> > +
> > + for (size_t i = 0; i < args->object_info_options->nr; i++) {
> > + if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
> > + check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
> > + return -1;
> > + }
> > + if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
>
> Hum. Does this result in quadratic runtime behaviour?
>
> > + if (!strcmp(reader->line, "size")) {
> > + size_index = i;
> > + for (size_t j = 0; j < args->oids->nr; j++)
> > + object_info_data[j].sizep = xcalloc(1, sizeof(long));
>
> This might be a bit more future proof in case the `sizep` type were ever
> to change:
>
> object_info_data[j].sizep = xcalloc(1, sizeof(*object_info_data[j].sizep));
>
> It also allows you to skip double-checking whether you picked the
> correct type. In fact, the type is actually an `unsigned long`, which
> is confusing but ultimately does not make much of a difference because
> it should have the same size.
>
> > + }
> > + continue;
> > + }
> > + return -1;
> > + }
> > +
> > + for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
> > + struct string_list object_info_values = STRING_LIST_INIT_DUP;
> > +
> > + string_list_split(&object_info_values, reader->line, ' ', -1);
> > + if (0 <= size_index) {
> > + if (!strcmp(object_info_values.items[1 + size_index].string, ""))
> > + die("object-info: not our ref %s",
> > + object_info_values.items[0].string);
> > +
> > + *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
>
> We're completely missing error handling for strtoul(3p) here. That
> function is also discouraged nowadays because error handling is hard to
> do correctly. We have `strtoul_ui()` and friends, but don't have a variant
> yet that knows how to return an `unsigned long`. We might backfill that
> omission and then use it instead.
>
> > diff --git a/transport-helper.c b/transport-helper.c
> > index bc27653cde..bf0a1877c7 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -728,6 +728,13 @@ static int fetch_refs(struct transport *transport,
> > free_refs(dummy);
> > }
> >
> > + /* fail the command explicitly to avoid further commands input. */
> > + if (transport->smart_options->object_info)
> > + die(_("remote-object-info requires protocol v2"));
>
> The code path that checks for protocol v2 with "--negotiate-only"
> knows to warn and return an error. Should we do the same here?
>
Thank you.
If we follow the "warn and return an error" approach of "--negotiate-only",
we will end up with "warn and wait". This was the question I was asking in
the previous patches:
In the current implementation, if a user puts
`remote-object-info` in protocol v1,
`cat-file --batch-command` will die. Which way do we prefer?
"error and exit (i.e. die)"
or "warn and wait for new command".
We decided to take the path of "error and exit (i.e. die)"; the
reason was explained at
https://lore.kernel.org/git/CAN2LT1Cmsw3RB1kbRBvoeLs8WaQeZWqrG96EQfMkMe_jdKaO4g@mail.gmail.com/:
Our primary use case is to use git cat-file remote-object-info in
a promisor remote setup to retrieve metadata
about an object stored in the promisor remote, without fetching it
back to the local repository.
This approach helps conserve disk space. I don’t believe other
commands can achieve this functionality,
particularly without requiring the object to be downloaded.
> > diff --git a/transport.c b/transport.c
> > index 47fda6a773..746ec19ddc 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -9,6 +9,7 @@
> > #include "hook.h"
> > #include "pkt-line.h"
> > #include "fetch-pack.h"
> > +#include "fetch-object-info.h"
> > #include "remote.h"
> > #include "connect.h"
> > #include "send-pack.h"
> > @@ -444,8 +445,33 @@ static int fetch_refs_via_pack(struct transport *transport,
> > args.server_options = transport->server_options;
> > args.negotiation_tips = data->options.negotiation_tips;
> > args.reject_shallow_remote = transport->smart_options->reject_shallow;
> > + args.object_info = transport->smart_options->object_info;
> > +
> > + if (transport->smart_options
> > + && transport->smart_options->object_info
> > + && transport->smart_options->object_info_oids->nr > 0) {
>
> Formatting is wrong here:
>
> if (transport->smart_options &&
> transport->smart_options->object_info &&
> transport->smart_options->object_info_oids->nr > 0) {
>
> Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 2/6] fetch-pack: refactor packet writing
2024-11-25 9:51 ` Patrick Steinhardt
@ 2024-12-03 19:09 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-12-03 19:09 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Thank you, Patrick.
Your comments make perfect sense to me, and I will revise the patch
accordingly in v8.
On Mon, Nov 25, 2024 at 4:51 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Mon, Nov 25, 2024 at 12:36:12AM -0500, Eric Ju wrote:
> > diff --git a/connect.h b/connect.h
> > index 1645126c17..8b56a68b62 100644
> > --- a/connect.h
> > +++ b/connect.h
> > @@ -1,6 +1,7 @@
> > #ifndef CONNECT_H
> > #define CONNECT_H
> >
> > +#include "string-list.h"
> > #include "protocol.h"
> >
> > #define CONNECT_VERBOSE (1u << 0)
>
> Instead of including this header, you can add a forward declaration of
> `struct string_list`. This is mostly done to keep compilation times at
> bay by not including too many headers.
>
> > @@ -30,4 +31,11 @@ void check_stateless_delimiter(int stateless_rpc,
> > struct packet_reader *reader,
> > const char *error);
> >
> > +/**
> > + * write_command_and_capabilities writes a command along with the requested
>
> Nit: we don't typically use Go-style comments where the comment starts
> with the name of what's being documented.
>
> Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 6/6] cat-file: add remote-object-info to batch-command
2024-11-25 9:51 ` Patrick Steinhardt
@ 2024-12-03 19:23 ` Peijian Ju
2024-12-05 9:50 ` Patrick Steinhardt
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2024-12-03 19:23 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Nov 25, 2024 at 4:51 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Mon, Nov 25, 2024 at 12:36:16AM -0500, Eric Ju wrote:
> > diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> > index d5890ae368..6a2f9fd752 100644
> > --- a/Documentation/git-cat-file.txt
> > +++ b/Documentation/git-cat-file.txt
> > @@ -314,7 +323,10 @@ newline. The available atoms are:
> > line) are output in place of the `%(rest)` atom.
> >
> > If no format is specified, the default format is `%(objectname)
> > -%(objecttype) %(objectsize)`.
> > +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> > +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> > +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> > +DO NOT RELY on the current the default format to stay the same!!!
>
> Is this stale or do we still not support `%(objecttype)`? I thought we
> wanted to support that, as well, so that we don't have to change the
> default format.
>
Please see my next reply.
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 5db55fabc4..ad17be69b0 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
> > object_context_release(&ctx);
> > }
> >
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > +{
> > + int retval = 0;
> > + struct remote *remote = NULL;
> > + struct object_id oid;
> > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> > + static struct transport *gtransport;
> > +
> > + /*
> > + * Change the format to "%(objectname) %(objectsize)" when
> > + * remote-object-info command is used. Once we start supporting objecttype
> > + * the default format should change to DEFAULT_FORMAT
> > + */
> > + if (!opt->format)
> > + opt->format = "%(objectname) %(objectsize)";
>
> Seems like it isn't stale. Hum.
>
No, this isn’t stale. As I mentioned in my response to Junio in
https://lore.kernel.org/git/CAN2LT1Cmsw3RB1kbRBvoeLs8WaQeZWqrG96EQfMkMe_jdKaO4g@mail.gmail.com/,
adding type support is planned for the next patch series. Based on the
documentation at https://git-scm.com/docs/protocol-v2#_object_info, it
seems type isn’t yet supported on the server side either. My plan is
to implement the logic for both server and client in the next series.
Unless the reviewers feel strongly that this must be included now, I’d
prefer to stick to the original plan.
> > + remote = remote_get(argv[0]);
> > + if (!remote)
> > + die(_("must supply valid remote when using remote-object-info"));
> > +
> > + oid_array_clear(&object_info_oids);
> > + for (size_t i = 1; i < argc; i++) {
> > + if (get_oid_hex(argv[i], &oid))
> > + die(_("Not a valid object name %s"), argv[i]);
> > + oid_array_append(&object_info_oids, &oid);
> > + }
>
> Should we return an error when the user didn't pass any object IDs?
>
Thank you. Revising in v8 and also adding a new test case to cover it.
> > @@ -667,6 +726,45 @@ static void parse_cmd_info(struct batch_options *opt,
> > batch_one_object(line, output, opt, data);
> > }
> >
> > +static void parse_cmd_remote_object_info(struct batch_options *opt,
> > + const char *line, struct strbuf *output,
> > + struct expand_data *data)
> > +{
> > + int count;
> > + const char **argv;
> > +
> > + char *line_to_split = xstrdup_or_null(line);
> > + count = split_cmdline(line_to_split, &argv);
> > + if (get_remote_info(opt, count, argv))
> > + goto cleanup;
> > +
> > + opt->use_remote_info = 1;
> > + data->skip_object_info = 1;
> > + for (size_t i = 0; i < object_info_oids.nr; i++) {
> > +
>
> Nit: empty newline at the start of a block.
>
Thank you. Fixing in v8.
> > diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
> > new file mode 100644
> > index 0000000000..9fb20be308
> > --- /dev/null
> > +++ b/t/lib-cat-file.sh
>
> I think it would make sense to split the introduction of
> "lib-cat-file.sh" into a separate commit.
>
Thank you. Will split it into a separate commit in v8.
> Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop
2024-11-25 9:51 ` Patrick Steinhardt
@ 2024-12-03 19:26 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-12-03 19:26 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Nov 25, 2024 at 4:51 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Mon, Nov 25, 2024 at 12:36:11AM -0500, Eric Ju wrote:
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index fe1fb3c1b7..bb7ec96963 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1328,9 +1328,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
> > if (advertise_sid && server_supports_v2("session-id"))
> > packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
> > if (server_options && server_options->nr) {
> > - int i;
> > ensure_server_supports_v2("server-option");
> > - for (i = 0; i < server_options->nr; i++)
> > + for (int i = 0; i < server_options->nr; i++)
> > packet_buf_write(req_buf, "server-option=%s",
> > server_options->items[i].string);
> > }
>
> It's somewhat curious that you change the type to `size_t` while at it
> in other spots, but not here. Doubly so because `server_options` is a
> `struct string_list`, and `string_list::nr` is of type `size_t`.
>
> Patrick
Thank you.
Yes, I missed that. Will correct it in v8.
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 6/6] cat-file: add remote-object-info to batch-command
2024-12-03 19:23 ` Peijian Ju
@ 2024-12-05 9:50 ` Patrick Steinhardt
2024-12-05 10:34 ` Christian Couder
0 siblings, 1 reply; 174+ messages in thread
From: Patrick Steinhardt @ 2024-12-05 9:50 UTC (permalink / raw)
To: Peijian Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Tue, Dec 03, 2024 at 02:23:01PM -0500, Peijian Ju wrote:
> On Mon, Nov 25, 2024 at 4:51 AM Patrick Steinhardt <ps@pks.im> wrote:
> > On Mon, Nov 25, 2024 at 12:36:16AM -0500, Eric Ju wrote:
> > > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > > index 5db55fabc4..ad17be69b0 100644
> > > --- a/builtin/cat-file.c
> > > +++ b/builtin/cat-file.c
> > > @@ -576,6 +582,59 @@ static void batch_one_object(const char *obj_name,
> > > object_context_release(&ctx);
> > > }
> > >
> > > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > > +{
> > > + int retval = 0;
> > > + struct remote *remote = NULL;
> > > + struct object_id oid;
> > > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> > > + static struct transport *gtransport;
> > > +
> > > + /*
> > > + * Change the format to "%(objectname) %(objectsize)" when
> > > + * remote-object-info command is used. Once we start supporting objecttype
> > > + * the default format should change to DEFAULT_FORMAT
> > > + */
> > > + if (!opt->format)
> > > + opt->format = "%(objectname) %(objectsize)";
> >
> > Seems like it isn't stale. Hum.
> >
>
> No, this isn’t stale. As I mentioned in my response to Junio in
> https://lore.kernel.org/git/CAN2LT1Cmsw3RB1kbRBvoeLs8WaQeZWqrG96EQfMkMe_jdKaO4g@mail.gmail.com/,
> adding type support is planned for the next patch series. Based on the
> documentation at https://git-scm.com/docs/protocol-v2#_object_info, it
> seems type isn’t yet supported on the server side either. My plan is
> to implement the logic for both server and client in the next series.
>
> Unless the reviewers feel strongly that this must be included now, I’d
> prefer to stick to the original plan.
The problem is that you cannot introduce a different format first and
then change it in a subsequent patch series because that would be a
backwards-incompatible change. So once this series lands, the follow-up
series that implements type support cannot switch back to the default
format, and consequently the behaviour would remain inconsistent with the
non-remote case without a good reason.
Patrick
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v7 6/6] cat-file: add remote-object-info to batch-command
2024-12-05 9:50 ` Patrick Steinhardt
@ 2024-12-05 10:34 ` Christian Couder
0 siblings, 0 replies; 174+ messages in thread
From: Christian Couder @ 2024-12-05 10:34 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Peijian Ju, git, calvinwan, jonathantanmy, chriscool, karthik.188,
toon, jltobler
On Thu, Dec 5, 2024 at 10:52 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Dec 03, 2024 at 02:23:01PM -0500, Peijian Ju wrote:
> > No, this isn’t stale. As I mentioned in my response to Junio in
> > https://lore.kernel.org/git/CAN2LT1Cmsw3RB1kbRBvoeLs8WaQeZWqrG96EQfMkMe_jdKaO4g@mail.gmail.com/,
> > adding type support is planned for the next patch series. Based on the
> > documentation at https://git-scm.com/docs/protocol-v2#_object_info, it
> > seems type isn’t yet supported on the server side either. My plan is
> > to implement the logic for both server and client in the next series.
> >
> > Unless the reviewers feel strongly that this must be included now, I’d
> > prefer to stick to the original plan.
>
> The problem is that you cannot introduce a different format first and
> then change it in a subsequent patch series because that would be a
> backwards-incompatible change. So if the follow-up patch series would
> implement that you cannot revert back to the default format, and
> consequently the behaviour would now be inconsistent with the non-remote
> case without a good reason.
The doc added with this patch clearly says that the default format is
very likely to change in the future and that users should not rely on
it. Also there are very simple ways for users (who are likely to be
very advanced users) to use a custom format instead of the default
format.
If we always reject any backward-incompatible change, even on features
we have clearly marked as experimental or temporary, then it means
there is no point in marking features as experimental or temporary in
the docs, and it will make developing new features more difficult as
we will likely have to spend a lot of time to get it right the first
time instead of developing them organically. As we will likely fail in
some cases to get it right the first time, it will mean more things we
will have to redo in other ways and more things we will have to
deprecate and eventually remove, which will be bad for backward
compatibility anyway.
^ permalink raw reply [flat|nested] 174+ messages in thread
* [PATCH v8 0/6] add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (12 preceding siblings ...)
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2024-12-23 23:25 ` [PATCH v8 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
` (6 more replies)
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
` (2 subsequent siblings)
16 siblings, 7 replies; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This patch series is a continuation of Calvin Wan’s (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info
command at [1].
Sometimes it is beneficial to retrieve information about an object without
having to download it completely. The server logic for retrieving size has
already been implemented and merged in "a2ba162cda (object-info: support for
retrieving object info, 2021-04-20)"[2]. This patch series implements the client
option for it.
This patch series adds the `remote-object-info` command to
`cat-file --batch-command`. This command allows the client to make an
object-info command request to a server that supports protocol v2.
If the server uses protocol v2 but does not support the object-info capability,
`cat-file --batch-command` will die.
If a user attempts to use `remote-object-info` with protocol v1,
`cat-file --batch-command` will die.
Currently, only the size (%(objectsize)) is supported in this implementation.
The type (%(objecttype)) is not included in this patch series, as it is not yet
supported on the server side either. The plan is to implement the necessary
logic for both the server and client in a subsequent series.
The default format for remote-object-info is set to %(objectname) %(objectsize).
Once %(objecttype) is supported, the default format will be unified accordingly.
If the batch command format includes unsupported fields such as %(objecttype),
%(objectsize:disk), or %(deltabase), the command will terminate with an error.
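To make the restriction above concrete, here is a minimal standalone
sketch of such a check. It is not the code from builtin/cat-file.c: the
function names and the allow-list are hypothetical, and it ignores
escapes such as "%%". It only shows the idea of scanning a format string
for %(...) atoms and rejecting anything remote-object-info cannot answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow-list: atoms that can be answered from the remote. */
static const char *const supported_atoms[] = { "objectname", "objectsize" };

static int atom_is_supported(const char *atom, size_t len)
{
	size_t n = sizeof(supported_atoms) / sizeof(*supported_atoms);

	for (size_t i = 0; i < n; i++)
		if (strlen(supported_atoms[i]) == len &&
		    !strncmp(supported_atoms[i], atom, len))
			return 1;
	return 0;
}

/*
 * Return a pointer to the first "%(" that introduces an unsupported or
 * unterminated atom, or NULL when every atom in the format is supported.
 */
static const char *first_unsupported_atom(const char *format)
{
	const char *p = format;

	assert(format);
	while ((p = strstr(p, "%("))) {
		const char *start = p + 2;
		const char *end = strchr(start, ')');

		if (!end || !atom_is_supported(start, (size_t)(end - start)))
			return p;
		p = end + 1;
	}
	return NULL;
}
```

A caller would die with an error when the return value is non-NULL, and
otherwise go on to expand the format for each object.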
Changes since V7
================
- Introduced strtoul_ul() in git-compat-util.h to ensure proper error handling
using strtoul from the standard library.
- Separated the test library into its own commit for better clarity
and organization.
- Used string_list_has_string() instead of unsorted_string_list_has_string() to
avoid quadratic runtime behaviour.
- Added a documentation link to the wire format in the commit message to
provide additional context.
- Added a new test case, "remote-object-info fails on not providing OID".
- Fixed typos and formatting issues for improved readability.
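For readers wondering why a wrapper around strtoul() is useful: plain
strtoul() accepts empty input, trailing garbage and a leading minus sign
without complaint. The sketch below shows the kind of strict parsing
meant here. The name parse_ulong_strict() and its signature are
assumptions made for illustration; this is not the strtoul_ul() helper
added by the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse an unsigned long strictly. Returns 0 and stores the value in
 * *result on success; returns -1 on empty input, trailing garbage,
 * a minus sign, or overflow. *result is left untouched on failure.
 */
static int parse_ulong_strict(const char *s, int base, unsigned long *result)
{
	char *end;
	unsigned long value;

	assert(s && result);
	/* strtoul() would silently negate the value; reject '-' up front. */
	if (strchr(s, '-'))
		return -1;
	errno = 0;
	value = strtoul(s, &end, base);
	if (errno || end == s || *end)
		return -1;
	*result = value;
	return 0;
}
```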
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (2):
cat-file: add declaration of variable i inside its for loop
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.txt | 24 +-
Makefile | 1 +
builtin/cat-file.c | 110 ++++-
connect.c | 34 ++
connect.h | 8 +
fetch-object-info.c | 92 ++++
fetch-object-info.h | 18 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 652 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 28 +-
transport.h | 11 +
18 files changed, 1021 insertions(+), 68 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v7:
-: ---------- > 1: c09e21a9d6 cat-file: add declaration of variable i inside its for loop
-: ---------- > 2: ed04a4a7c4 fetch-pack: refactor packet writing
-: ---------- > 3: bc52c4f80c fetch-pack: move fetch initialization
-: ---------- > 4: 4c1b989c41 serve: advertise object-info feature
-: ---------- > 5: dbc95a9ae5 transport: add client support for object-info
-: ---------- > 6: f244ec8a2f cat-file: add remote-object-info to batch-command
--
2.47.0
Information Footer:
base-commit: 8f8d6eee531b3fa1a8ef14f169b0cb5035f7a772
Merge Request: https://gitlab.com/gitlab-org/git/-/merge_requests/168
^ permalink raw reply [flat|nested] 174+ messages in thread
* [PATCH v8 1/6] cat-file: add declaration of variable i inside its for loop
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2024-12-23 23:25 ` [PATCH v8 2/6] fetch-pack: refactor packet writing Eric Ju
` (5 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
fetch-pack.c | 3 +--
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b13561cf73..69ea642dc6 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -676,12 +676,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -689,9 +687,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -717,7 +713,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -727,7 +722,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index 3a227721ed..72c6a254c9 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1329,9 +1329,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
if (advertise_sid && server_supports_v2("session-id"))
packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
if (server_options && server_options->nr) {
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
+ for (int i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v8 2/6] fetch-pack: refactor packet writing
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
2024-12-23 23:25 ` [PATCH v8 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2024-12-23 23:25 ` [PATCH v8 3/6] fetch-pack: move fetch initialization Eric Ju
` (4 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() into a more
general-purpose function, write_command_and_capabilities(), enabling it
to serve both fetch and additional commands.
In this context, "command" refers to the "operations" supported by
Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
subcommand (e.g., git-fetch(1)) or a server-side operation like
"object-info" as implemented in commit a2ba162c
(object-info: support for retrieving object info, 2021-04-20).
Furthermore, write_command_and_capabilities() is moved to connect.c,
making it accessible to additional commands in the future.
To move write_command_and_capabilities() to connect.c, we need to
adjust how `advertise_sid` is managed. Previously,
in fetch-pack.c, `advertise_sid` was a static variable, modified using
git_config_get_bool().
In connect.c, we now initialize `advertise_sid` at the beginning by
directly using git_config_get_bool(). This change is safe because:
In the original fetch-pack.c code, there are only two places that
write `advertise_sid`:
1. In function do_fetch_pack:
if (!server_supports("session-id"))
advertise_sid = 0;
2. In function fetch_pack_config():
git_config_get_bool("transfer.advertisesid", &advertise_sid);
Regarding 1, since do_fetch_pack() is only relevant for protocol v1, this
assignment can be ignored in our refactor, as
write_command_and_capabilities() is only used in protocol v2.
Regarding 2, git_config_get_bool() comes from config.h, which is already a
dependency of connect.c, so we can reuse it directly.
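As an illustration of what a request produced this way looks like on the
wire, the standalone sketch below frames an object-info request in
pkt-line format: each pkt-line starts with four hex digits giving its
total length (header included), "0001" is the delimiter packet between
capabilities and arguments, and "0000" is the flush packet. The struct
and function names are hypothetical, and the object-format capability
and server options are left out for brevity; this is not the code in
connect.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct req_buf {
	char data[1024];
	size_t len;
};

/* Append one pkt-line: 4 hex digits of total length, then the payload. */
static int buf_write_pkt(struct req_buf *b, const char *fmt, ...)
{
	char payload[512];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(payload, sizeof(payload), fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= sizeof(payload) ||
	    b->len + 4 + (size_t)n >= sizeof(b->data))
		return -1;
	b->len += (size_t)snprintf(b->data + b->len, sizeof(b->data) - b->len,
				   "%04x%s", (unsigned)n + 4, payload);
	return 0;
}

/* Append a special packet such as "0001" (delim) or "0000" (flush). */
static int buf_write_special(struct req_buf *b, const char *marker)
{
	if (b->len + 4 >= sizeof(b->data))
		return -1;
	memcpy(b->data + b->len, marker, 4);
	b->len += 4;
	b->data[b->len] = '\0';
	return 0;
}

/* Build: command, optional agent capability, delim, "size", oids, flush. */
static int build_object_info_request(struct req_buf *b, const char *agent,
				     const char *const *oids, size_t nr)
{
	assert(b && (oids || !nr));
	b->len = 0;
	b->data[0] = '\0';
	if (buf_write_pkt(b, "command=object-info") ||
	    (agent && buf_write_pkt(b, "agent=%s", agent)) ||
	    buf_write_special(b, "0001") ||
	    buf_write_pkt(b, "size"))
		return -1;
	for (size_t i = 0; i < nr; i++)
		if (buf_write_pkt(b, "oid %s", oids[i]))
			return -1;
	return buf_write_special(b, "0000");
}
```

Passing the command name as a parameter is what lets the same
capability-writing code serve both "fetch" and "object-info".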
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 8 ++++++++
fetch-pack.c | 35 ++---------------------------------
3 files changed, 44 insertions(+), 33 deletions(-)
diff --git a/connect.c b/connect.c
index 10fad43e98..2b51cf09bf 100644
--- a/connect.c
+++ b/connect.c
@@ -689,6 +689,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (int i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..8b56a68b62 100644
--- a/connect.h
+++ b/connect.h
@@ -1,6 +1,7 @@
#ifndef CONNECT_H
#define CONNECT_H
+#include "string-list.h"
#include "protocol.h"
#define CONNECT_VERBOSE (1u << 0)
@@ -30,4 +31,11 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+/**
+ * write_command_and_capabilities writes a command along with the requested
+ * server capabilities/features into a request buffer.
+ */
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index 72c6a254c9..78e7d38c47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1317,37 +1317,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
- for (int i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1358,7 +1327,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2186,7 +2155,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v8 3/6] fetch-pack: move fetch initialization
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
2024-12-23 23:25 ` [PATCH v8 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-12-23 23:25 ` [PATCH v8 2/6] fetch-pack: refactor packet writing Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2024-12-23 23:25 ` [PATCH v8 4/6] serve: advertise object-info feature Eric Ju
` (3 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
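The reasoning can be seen in a toy state machine. This is a hedged
sketch with made-up names (run_fetch, struct fetch_ctx), not the
do_fetch_pack_v2() code: because the shared setup runs before the loop,
it takes effect whichever state the machine starts in.

```c
#include <assert.h>

enum fetch_state { CHECK_LOCAL, SEND_REQUEST, DONE };

struct fetch_ctx {
	int use_sideband;
	int steps;
};

static int run_fetch(enum fetch_state initial, struct fetch_ctx *ctx)
{
	enum fetch_state state = initial;

	assert(ctx);
	/* Shared setup lives before the loop, so every entry state sees it. */
	ctx->use_sideband = 2;
	ctx->steps = 0;

	while (state != DONE) {
		switch (state) {
		case CHECK_LOCAL:
			ctx->steps++;
			state = SEND_REQUEST;
			break;
		case SEND_REQUEST:
			/* Relies on the setup above having run already. */
			if (ctx->use_sideband != 2)
				return -1;
			ctx->steps++;
			state = DONE;
			break;
		case DONE:
			break;
		}
	}
	return 0;
}
```

Had the setup stayed inside the CHECK_LOCAL case, starting at
SEND_REQUEST would have skipped it.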
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 78e7d38c47..51de82e414 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1648,18 +1648,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v8 4/6] serve: advertise object-info feature
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
` (2 preceding siblings ...)
2024-12-23 23:25 ` [PATCH v8 3/6] fetch-pack: move fetch initialization Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2024-12-23 23:25 ` [PATCH v8 5/6] transport: add client support for object-info Eric Ju
` (2 subsequent siblings)
6 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index c8694e3751..7a388d26d9 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v8 5/6] transport: add client support for object-info
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
` (3 preceding siblings ...)
2024-12-23 23:25 ` [PATCH v8 4/6] serve: advertise object-info feature Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2025-01-07 18:31 ` Calvin Wan
2024-12-23 23:25 ` [PATCH v8 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-12-26 21:56 ` [PATCH v8 0/6] " Junio C Hamano
6 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes, it is beneficial to retrieve information about an object
without downloading it entirely. The server-side logic for this
functionality was implemented in commit "a2ba162cda (object-info:
support for retrieving object info, 2021-04-20)."
This commit introduces client functions to interact with the server.
Currently, the client supports requesting a list of object IDs with
the ‘size’ feature from a v2 server. If the server does not advertise
this feature (i.e., transfer.advertiseobjectinfo is set to false),
the client will return an error and exit.
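To sketch the client side of the exchange: after the attribute header,
each response line carries an object ID followed by a space and the
size. The standalone example below parses one such line. The struct and
function names are hypothetical, and returning -1 for an empty size
field is a simplification of how the patch handles that case (it dies
with a "not our ref" message); this is not the code in
fetch-object-info.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define OID_HEX_MAX 64 /* long enough for a SHA-256 hex object ID */

struct remote_object_size {
	char oid[OID_HEX_MAX + 1];
	unsigned long size;
};

/* Parse "<oid> <size>". Returns 0 on success, -1 on a malformed line. */
static int parse_object_info_line(const char *line,
				  struct remote_object_size *out)
{
	const char *sp;
	char *end;
	size_t oid_len;
	unsigned long value;

	assert(line && out);
	sp = strchr(line, ' ');
	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len > OID_HEX_MAX)
		return -1;
	/* An empty or non-numeric size field means no usable size. */
	if (sp[1] < '0' || sp[1] > '9')
		return -1;
	errno = 0;
	value = strtoul(sp + 1, &end, 10);
	if (errno || *end)
		return -1;
	memcpy(out->oid, line, oid_len);
	out->oid[oid_len] = '\0';
	out->size = value;
	return 0;
}
```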
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Makefile | 1 +
fetch-object-info.c | 92 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 18 +++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 +
transport-helper.c | 11 +++++-
transport.c | 28 +++++++++++++-
transport.h | 11 ++++++
8 files changed, 163 insertions(+), 3 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index 3fa4bf0d06..70e9ec0464 100644
--- a/Makefile
+++ b/Makefile
@@ -1020,6 +1020,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..2aa9f2b70d
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,92 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/**
+ * send_object_info_request sends git-cat-file object-info command and its
+ * arguments into the request buffer.
+ */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids) {
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+/**
+ * fetch_object_info sends git-cat-file object-info command into the request buf
+ * and read the results from packets.
+ */
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ die(_("object-info capability is not enabled on the server"));
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("wrong protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
+ }
+ continue;
+ }
+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..ce1a05dc96
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,18 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index 51de82e414..704bc21b47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1654,6 +1654,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index 9d3470366f..119d3369f1 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index d457b42550..9da1547b2c 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -710,8 +710,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -727,6 +727,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* fail the command explicitly to avoid further commands input. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 10d820c333..5a2629de52 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -464,8 +465,33 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options
+ && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ ret = fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1]);
+ goto cleanup;
- if (!data->finished_handshake) {
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
diff --git a/transport.h b/transport.h
index 44100fa9b7..e61e931863 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to retrieve only object-info.
+ * If object-info is not supported, the operation will error and exit.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v8 6/6] cat-file: add remote-object-info to batch-command
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
` (4 preceding siblings ...)
2024-12-23 23:25 ` [PATCH v8 5/6] transport: add client support for object-info Eric Ju
@ 2024-12-23 23:25 ` Eric Ju
2025-01-07 21:29 ` Calvin Wan
2024-12-26 21:56 ` [PATCH v8 0/6] " Junio C Hamano
6 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2024-12-23 23:25 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
`info` takes object ids one at a time, which creates overhead
when making requests to a server, so `remote-object-info`
can instead take multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 99 ++++
object-file.c | 11 +
object-store-ll.h | 3 +
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 652 +++++++++++++++++++++++++
7 files changed, 802 insertions(+), 16 deletions(-)
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..6a2f9fd752 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified
+ `<remote>` without downloading objects from the remote.
+ Exits with an error if the server does not support the `object-info`
+ capability or if no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
+currently not supported by the `remote-object-info` command, we will error
+and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 69ea642dc6..998addf28c 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -27,6 +27,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -45,9 +48,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -579,6 +585,59 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when the
+ * remote-object-info command is used. Once we start supporting objecttype,
+ * the default format should change to DEFAULT_FORMAT.
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (int i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -670,6 +729,45 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line, struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+
+ data->oid = object_info_oids.oid[i];
+
+ if (remote_object_info[i].sizep) {
+ /*
+ * Reaching here means remote-object-info retrieved the
+ * information from the server without downloading the objects.
+ */
+ data->size = *remote_object_info[i].sizep;
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -701,6 +799,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index 5b792b3dd4..96f204c93a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3128,3 +3128,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index cd3bd5bd99..20208e1d4f 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -553,4 +553,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
void *data, enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..9fb20be308
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related tests.
+
+# Print a string without a trailing newline
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Calculate the length of a string, stripping leading spaces from the wc output.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..5c7d581ea2 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -3,6 +3,7 @@
test_description='git cat-file'
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -98,18 +99,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..0e3044d8d6
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,652 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the command remote-object-info,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" >"$daemon_parent/hello" &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z <commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info git:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" >server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z <commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info file:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" >"$HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z <commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info http:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.0
* Re: [PATCH v8 0/6] add remote-object-info to batch-command
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
` (5 preceding siblings ...)
2024-12-23 23:25 ` [PATCH v8 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2024-12-26 21:56 ` Junio C Hamano
2024-12-30 23:25 ` Peijian Ju
6 siblings, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2024-12-26 21:56 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Eric Ju <eric.peijian@gmail.com> writes:
> Range-diff against v7:
> -: ---------- > 1: c09e21a9d6 cat-file: add declaration of variable i inside its for loop
> -: ---------- > 2: ed04a4a7c4 fetch-pack: refactor packet writing
> -: ---------- > 3: bc52c4f80c fetch-pack: move fetch initialization
> -: ---------- > 4: 4c1b989c41 serve: advertise object-info feature
> -: ---------- > 5: dbc95a9ae5 transport: add client support for object-info
> -: ---------- > 6: f244ec8a2f cat-file: add remote-object-info to batch-command
This is curious. Did you compare the right things?
--
2.47.0
Information Footer:
base-commit: 8f8d6eee531b3fa1a8ef14f169b0cb5035f7a772
Merge Request: https://gitlab.com/gitlab-org/git/-/merge_requests/168
If the base-commit information is relevant, please do not write it
below the "signature" line (i.e. a line that consists only of
dash-dash-space near the end of the message), as some e-mail programs
consider what follows it irrelevant and omit it from quoting.
I tried to apply them on top of 8f8d6eee (The seventh batch,
2024-11-01) but the last step [6/6] fails to apply (the first five
applied cleanly, and matched what I already had).
Could you help to figure out what is going wrong on your end?
Thanks.
* Re: [PATCH v8 0/6] add remote-object-info to batch-command
2024-12-26 21:56 ` [PATCH v8 0/6] " Junio C Hamano
@ 2024-12-30 23:25 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2024-12-30 23:25 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Sorry for the noise. I forgot to CC others, so I am resending it.
On Thu, Dec 26, 2024 at 5:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > Range-diff against v7:
> > -: ---------- > 1: c09e21a9d6 cat-file: add declaration of variable i inside its for loop
> > -: ---------- > 2: ed04a4a7c4 fetch-pack: refactor packet writing
> > -: ---------- > 3: bc52c4f80c fetch-pack: move fetch initialization
> > -: ---------- > 4: 4c1b989c41 serve: advertise object-info feature
> > -: ---------- > 5: dbc95a9ae5 transport: add client support for object-info
> > -: ---------- > 6: f244ec8a2f cat-file: add remote-object-info to batch-command
>
> This is curious. Did you compare the right things?
>
Thank you.
I think I may have compared the wrong things.
> --
> 2.47.0
>
> Information Footer:
> base-commit: 8f8d6eee531b3fa1a8ef14f169b0cb5035f7a772
> Merge Request: https://gitlab.com/gitlab-org/git/-/merge_requests/168
>
> If the base-commit information is relevant, please do not write it
> below the "signature" line (i.e. a line that consists only of
> dash-dash-space near the end of the message), as some e-mail programs
> consider what follows it irrelevant and omit it from quoting.
>
Roger that.
> I tried to apply them on top of 8f8d6eee (The seventh batch,
> 2024-11-01) but the last step [6/6] fails to apply (the first five
> applied cleanly, and matched what I already had).
>
> Could you help to figure out what is going wrong on your end?
>
Should I resend v8 or send a v9 instead?
> Thanks.
* Re: [PATCH v8 5/6] transport: add client support for object-info
2024-12-23 23:25 ` [PATCH v8 5/6] transport: add client support for object-info Eric Ju
@ 2025-01-07 18:31 ` Calvin Wan
2025-01-07 18:53 ` Junio C Hamano
2025-01-08 15:55 ` Peijian Ju
0 siblings, 2 replies; 174+ messages in thread
From: Calvin Wan @ 2025-01-07 18:31 UTC (permalink / raw)
To: Eric Ju; +Cc: git, jonathantanmy, chriscool, karthik.188, toon, jltobler
Thanks for picking up this series btw!
On Mon, Dec 23, 2024 at 3:25 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> From: Calvin Wan <calvinwan@google.com>
>
> Sometimes, it is beneficial to retrieve information about an object
> without downloading it entirely. The server-side logic for this
> functionality was implemented in commit "a2ba162cda (object-info:
> support for retrieving object info, 2021-04-20)."
>
> This commit introduces client functions to interact with the server.
>
> Currently, the client supports requesting a list of object IDs with
> the ‘size’ feature from a v2 server. If the server does not advertise
> this feature (i.e., transfer.advertiseobjectinfo is set to false),
> the client will return an error and exit.
>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Helped-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> ---
> Makefile | 1 +
> fetch-object-info.c | 92 +++++++++++++++++++++++++++++++++++++++++++++
> fetch-object-info.h | 18 +++++++++
> fetch-pack.c | 3 ++
> fetch-pack.h | 2 +
> transport-helper.c | 11 +++++-
> transport.c | 28 +++++++++++++-
> transport.h | 11 ++++++
> 8 files changed, 163 insertions(+), 3 deletions(-)
> create mode 100644 fetch-object-info.c
> create mode 100644 fetch-object-info.h
>
> diff --git a/Makefile b/Makefile
> index 3fa4bf0d06..70e9ec0464 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1020,6 +1020,7 @@ LIB_OBJS += ewah/ewah_rlw.o
> LIB_OBJS += exec-cmd.o
> LIB_OBJS += fetch-negotiator.o
> LIB_OBJS += fetch-pack.o
> +LIB_OBJS += fetch-object-info.o
> LIB_OBJS += fmt-merge-msg.o
> LIB_OBJS += fsck.o
> LIB_OBJS += fsmonitor.o
> diff --git a/fetch-object-info.c b/fetch-object-info.c
> new file mode 100644
> index 0000000000..2aa9f2b70d
> --- /dev/null
> +++ b/fetch-object-info.c
> @@ -0,0 +1,92 @@
> +#include "git-compat-util.h"
> +#include "gettext.h"
> +#include "hex.h"
> +#include "pkt-line.h"
> +#include "connect.h"
> +#include "oid-array.h"
> +#include "object-store-ll.h"
> +#include "fetch-object-info.h"
> +#include "string-list.h"
> +
> +/**
> + * send_object_info_request sends git-cat-file object-info command and its
> + * arguments into the request buffer.
> + */
> +static void send_object_info_request(const int fd_out, struct object_info_args *args)
> +{
> + struct strbuf req_buf = STRBUF_INIT;
> +
> + write_command_and_capabilities(&req_buf, "object-info", args->server_options);
> +
> + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> + packet_buf_write(&req_buf, "size");
> +
> + if (args->oids) {
> + for (size_t i = 0; i < args->oids->nr; i++)
> + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> + }
> +
> + packet_buf_flush(&req_buf);
> + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> + die_errno(_("unable to write request to remote"));
> +
> + strbuf_release(&req_buf);
> +}
> +
> +/**
> + * fetch_object_info sends git-cat-file object-info command into the request buf
> + * and read the results from packets.
> + */
> +int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
> + struct packet_reader *reader, struct object_info *object_info_data,
> + const int stateless_rpc, const int fd_out)
> +{
> + int size_index = -1;
> +
> + switch (version) {
> + case protocol_v2:
> + if (!server_supports_v2("object-info"))
> + die(_("object-info capability is not enabled on the server"));
> + send_object_info_request(fd_out, args);
> + break;
> + case protocol_v1:
> + case protocol_v0:
> + die(_("wrong protocol version. expected v2"));
s/wrong/unsupported
> + case protocol_unknown_version:
> + BUG("unknown protocol version");
> + }
> +
> + for (size_t i = 0; i < args->object_info_options->nr; i++) {
> + if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
> + check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
> + return -1;
> + }
> + if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
> + if (!strcmp(reader->line, "size")) {
> + size_index = i;
> + for (size_t j = 0; j < args->oids->nr; j++)
> + object_info_data[j].sizep = xcalloc(1, sizeof(long));
> + }
> + continue;
> + }
> + return -1;
> + }
I think we can flatten this logic a bit more here to make it more intuitive.
if (!unsorted_string_list_has_string(args->object_info_options, reader->line))
return -1;
if (!strcmp(reader->line, "size")) {
size_index = i;
for (size_t j = 0; j < args->oids->nr; j++)
object_info_data[j].sizep = xcalloc(1, sizeof(long));
}
* Re: [PATCH v8 5/6] transport: add client support for object-info
2025-01-07 18:31 ` Calvin Wan
@ 2025-01-07 18:53 ` Junio C Hamano
2025-01-08 15:55 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Junio C Hamano @ 2025-01-07 18:53 UTC (permalink / raw)
To: Calvin Wan
Cc: Eric Ju, git, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Calvin Wan <calvinwan@google.com> writes:
> Thanks for picking up this series btw!
> ...
> I think we can flatten this logic a bit more here to make it more intuitive.
>
> if (!unsorted_string_list_has_string(args->object_info_options, reader->line))
> return -1;
> if (!strcmp(reader->line, "size")) {
> size_index = i;
> for (size_t j = 0; j < args->oids->nr; j++)
> object_info_data[j].sizep = xcalloc(1, sizeof(long));
> }
Indeed the updated code structure gets easier to follow.
Thanks, both of you.
* Re: [PATCH v8 6/6] cat-file: add remote-object-info to batch-command
2024-12-23 23:25 ` [PATCH v8 6/6] cat-file: add remote-object-info to batch-command Eric Ju
@ 2025-01-07 21:29 ` Calvin Wan
0 siblings, 0 replies; 174+ messages in thread
From: Calvin Wan @ 2025-01-07 21:29 UTC (permalink / raw)
To: Eric Ju; +Cc: git, jonathantanmy, chriscool, karthik.188, toon, jltobler
On Mon, Dec 23, 2024 at 3:26 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
>
> Add `remote-object-info` to cat-file --batch-command.
>
> `info` takes object ids one at a time, which creates overhead
> when making requests to a server. So `remote-object-info`
> can instead take multiple object ids at once.
>
> cat-file --batch-command is generally implemented in the following
> manner:
>
> - Receive and parse input from user
> - Call respective function attached to command
> - Get object info, print object info
>
> In --buffer mode, this changes to:
>
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue
> - Call respective function attached to command
> - Get object info, print object info
>
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server. Therefore, `remote-object-info` is implemented in
> the following manner:
>
> - Receive and parse input from user
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through and print each object info
> Else:
> - Call respective function attached to command
> - Parse input, get object info, print object info
>
> And finally for --buffer mode `remote-object-info`:
> - Receive and parse input from user
> - Store respective function attached to command in a queue
> - After flush, loop through commands in queue:
> If command is `remote-object-info`:
> - Get object info from remote
> - Loop through and print each object info
> Else:
> - Call respective function attached to command
> - Get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then loops through the object info passed in, printing the info.
>
> In order for remote-object-info to avoid remote communication overhead
> in the non-buffer mode, the objects are passed in as such:
>
> remote-object-info <remote> <oid> <oid> ... <oid>
>
> rather than
>
> remote-object-info <remote> <oid>
> remote-object-info <remote> <oid>
> ...
> remote-object-info <remote> <oid>
>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Helped-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> ---
> Documentation/git-cat-file.txt | 24 +-
> builtin/cat-file.c | 99 ++++
> object-file.c | 11 +
> object-store-ll.h | 3 +
> t/lib-cat-file.sh | 16 +
> t/t1006-cat-file.sh | 13 +-
> t/t1017-cat-file-remote-object-info.sh | 652 +++++++++++++++++++++++++
> 7 files changed, 802 insertions(+), 16 deletions(-)
> create mode 100644 t/lib-cat-file.sh
> create mode 100755 t/t1017-cat-file-remote-object-info.sh
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index d5890ae368..6a2f9fd752 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -149,6 +149,13 @@ info <object>::
> Print object info for object reference `<object>`. This corresponds to the
> output of `--batch-check`.
>
> +remote-object-info <remote> <object>...::
> + Print object info for object references `<object>` at specified
> + `<remote>` without downloading objects from the remote.
> + Error when the `object-info` capability is not supported by the server.
> + Error when no object references are provided.
> + This command may be combined with `--buffer`.
> +
> flush::
> Used with `--buffer` to execute all preceding commands that were issued
> since the beginning or since the last flush was issued. When `--buffer`
> @@ -290,7 +297,8 @@ newline. The available atoms are:
> The full hex representation of the object name.
>
> `objecttype`::
> - The type of the object (the same as `cat-file -t` reports).
> + The type of the object (the same as `cat-file -t` reports). See
> + `CAVEATS` below. Not supported by `remote-object-info`.
>
> `objectsize`::
> The size, in bytes, of the object (the same as `cat-file -s`
> @@ -298,13 +306,14 @@ newline. The available atoms are:
>
> `objectsize:disk`::
> The size, in bytes, that the object takes up on disk. See the
> - note about on-disk sizes in the `CAVEATS` section below.
> + note about on-disk sizes in the `CAVEATS` section below. Not
> + supported by `remote-object-info`.
>
> `deltabase`::
> If the object is stored as a delta on-disk, this expands to the
> full hex representation of the delta base object name.
> Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> - below.
> + below. Not supported by `remote-object-info`.
>
> `rest`::
> If this atom is used in the output string, input lines are split
> @@ -314,7 +323,10 @@ newline. The available atoms are:
> line) are output in place of the `%(rest)` atom.
>
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> +DO NOT RELY on the current default format to stay the same!!!
I remember this was one of my initial concerns when I first worked on
this series -- without a use case for other fields, it's definitely
hard to say how a default format for such would look and obviously
when implemented, would cause the default format of %(objectsize) to
change as well. I'm glad to see this outcome is well documented so we
can have this feature working with a backdoor to change it if
necessary for the future. Thanks again for your work on this series.
* Re: [PATCH v8 5/6] transport: add client support for object-info
2025-01-07 18:31 ` Calvin Wan
2025-01-07 18:53 ` Junio C Hamano
@ 2025-01-08 15:55 ` Peijian Ju
1 sibling, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-01-08 15:55 UTC (permalink / raw)
To: Calvin Wan; +Cc: git, jonathantanmy, chriscool, karthik.188, toon, jltobler
On Tue, Jan 7, 2025 at 1:31 PM Calvin Wan <calvinwan@google.com> wrote:
>
> Thanks for picking up this series btw!
>
> On Mon, Dec 23, 2024 at 3:25 PM Eric Ju <eric.peijian@gmail.com> wrote:
> >
> > From: Calvin Wan <calvinwan@google.com>
> >
> > Sometimes, it is beneficial to retrieve information about an object
> > without downloading it entirely. The server-side logic for this
> > functionality was implemented in commit "a2ba162cda (object-info:
> > support for retrieving object info, 2021-04-20)."
> >
> > This commit introduces client functions to interact with the server.
> >
> > Currently, the client supports requesting a list of object IDs with
> > the ‘size’ feature from a v2 server. If the server does not advertise
> > this feature (i.e., transfer.advertiseobjectinfo is set to false),
> > the client will return an error and exit.
> >
> > Helped-by: Jonathan Tan <jonathantanmy@google.com>
> > Helped-by: Christian Couder <chriscool@tuxfamily.org>
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> > ---
> > Makefile | 1 +
> > fetch-object-info.c | 92 +++++++++++++++++++++++++++++++++++++++++++++
> > fetch-object-info.h | 18 +++++++++
> > fetch-pack.c | 3 ++
> > fetch-pack.h | 2 +
> > transport-helper.c | 11 +++++-
> > transport.c | 28 +++++++++++++-
> > transport.h | 11 ++++++
> > 8 files changed, 163 insertions(+), 3 deletions(-)
> > create mode 100644 fetch-object-info.c
> > create mode 100644 fetch-object-info.h
> >
> > diff --git a/Makefile b/Makefile
> > index 3fa4bf0d06..70e9ec0464 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -1020,6 +1020,7 @@ LIB_OBJS += ewah/ewah_rlw.o
> > LIB_OBJS += exec-cmd.o
> > LIB_OBJS += fetch-negotiator.o
> > LIB_OBJS += fetch-pack.o
> > +LIB_OBJS += fetch-object-info.o
> > LIB_OBJS += fmt-merge-msg.o
> > LIB_OBJS += fsck.o
> > LIB_OBJS += fsmonitor.o
> > diff --git a/fetch-object-info.c b/fetch-object-info.c
> > new file mode 100644
> > index 0000000000..2aa9f2b70d
> > --- /dev/null
> > +++ b/fetch-object-info.c
> > @@ -0,0 +1,92 @@
> > +#include "git-compat-util.h"
> > +#include "gettext.h"
> > +#include "hex.h"
> > +#include "pkt-line.h"
> > +#include "connect.h"
> > +#include "oid-array.h"
> > +#include "object-store-ll.h"
> > +#include "fetch-object-info.h"
> > +#include "string-list.h"
> > +
> > +/**
> > + * send_object_info_request sends git-cat-file object-info command and its
> > + * arguments into the request buffer.
> > + */
> > +static void send_object_info_request(const int fd_out, struct object_info_args *args)
> > +{
> > + struct strbuf req_buf = STRBUF_INIT;
> > +
> > + write_command_and_capabilities(&req_buf, "object-info", args->server_options);
> > +
> > + if (unsorted_string_list_has_string(args->object_info_options, "size"))
> > + packet_buf_write(&req_buf, "size");
> > +
> > + if (args->oids) {
> > + for (size_t i = 0; i < args->oids->nr; i++)
> > + packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
> > + }
> > +
> > + packet_buf_flush(&req_buf);
> > + if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
> > + die_errno(_("unable to write request to remote"));
> > +
> > + strbuf_release(&req_buf);
> > +}
> > +
> > +/**
> > + * fetch_object_info sends git-cat-file object-info command into the request buf
> > + * and read the results from packets.
> > + */
> > +int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
> > + struct packet_reader *reader, struct object_info *object_info_data,
> > + const int stateless_rpc, const int fd_out)
> > +{
> > + int size_index = -1;
> > +
> > + switch (version) {
> > + case protocol_v2:
> > + if (!server_supports_v2("object-info"))
> > + die(_("object-info capability is not enabled on the server"));
> > + send_object_info_request(fd_out, args);
> > + break;
> > + case protocol_v1:
> > + case protocol_v0:
> > + die(_("wrong protocol version. expected v2"));
>
> s/wrong/unsupported
>
Thank you. Fixing it in v9.
> > + case protocol_unknown_version:
> > + BUG("unknown protocol version");
> > + }
> > +
> > + for (size_t i = 0; i < args->object_info_options->nr; i++) {
> > + if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
> > + check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
> > + return -1;
> > + }
> > + if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
> > + if (!strcmp(reader->line, "size")) {
> > + size_index = i;
> > + for (size_t j = 0; j < args->oids->nr; j++)
> > + object_info_data[j].sizep = xcalloc(1, sizeof(long));
> > + }
> > + continue;
> > + }
> > + return -1;
> > + }
>
> I think we can flatten this logic a bit more here to make it more intuitive.
>
> if (!unsorted_string_list_has_string(args->object_info_options, reader->line))
> return -1;
> if (!strcmp(reader->line, "size")) {
> size_index = i;
> for (size_t j = 0; j < args->oids->nr; j++)
> object_info_data[j].sizep = xcalloc(1, sizeof(long));
> }
Thank you. Revising it in v9.
* [PATCH v9 0/8] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (13 preceding siblings ...)
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-08 18:37 ` [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
` (7 more replies)
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
16 siblings, 8 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Because I mistakenly sent a wrong range-diff in v8, please consider this v9 as
both an update addressing new comments from Calvin Wan at
https://lore.kernel.org/git/CAFySSZAqh6J14+r9JLM3LmRmV02ZvPRf5dB3rWVnUZS_5XaHcQ@mail.gmail.com/
and a resend of the corrected range-diff for v8.
This patch series is a continuation of Calvin Wan’s (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info
command at [1].
Sometimes it is beneficial to retrieve information about an object without
having to download it completely. The server logic for retrieving size has
already been implemented and merged in "a2ba162cda (object-info: support for
retrieving object info, 2021-04-20)"[2]. This patch series implements the client
option for it.
This patch series adds the `remote-object-info` command to
`cat-file --batch-command`. This command allows the client to make an
object-info command request to a server that supports protocol v2.
If the server uses protocol v2 but does not support the object-info capability,
`cat-file --batch-command` will die.
If a user attempts to use `remote-object-info` with protocol v1,
`cat-file --batch-command` will die.
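As a rough illustration (not code from this series), the
`remote-object-info <remote> <object>...` syntax documented in the final patch
lets a script ask for several objects on one input line. The helper name
`remote_object_info_line` is made up for this sketch; the guard mirrors the
documented "Error when no object references are provided" behaviour.

```shell
# Hypothetical helper (not part of git): emit one batch-command input
# line that asks the remote about several objects in a single request.
remote_object_info_line () {
	remote=$1
	shift
	# The documentation says the command errors out when no object
	# references are given, so refuse to build such a line.
	if test $# -eq 0
	then
		echo "remote-object-info needs at least one object" >&2
		return 1
	fi
	printf 'remote-object-info %s' "$remote"
	# printf reuses the format for each remaining argument.
	printf ' %s' "$@"
	printf '\n'
}
```

Assuming a git built with this series, the resulting line could be piped into
`git cat-file --batch-command`, so that one request covers all listed objects
instead of one request per object.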
Currently, only the size (%(objectsize)) is supported in this implementation.
The type (%(objecttype)) is not included in this patch series, as it is not yet
supported on the server side either. The plan is to implement the necessary
logic for both the server and client in a subsequent series.
The default format for remote-object-info is set to %(objectname) %(objectsize).
Once %(objecttype) is supported, the default format will be unified accordingly.
If the batch command format includes unsupported fields such as %(objecttype),
%(objectsize:disk), or %(deltabase), the command will terminate with an error.
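To make the format rule above concrete, here is a minimal shell sketch that
accepts only the atoms described as supported, %(objectname) and %(objectsize).
It is an illustration only: the function name `remote_format_supported` is
hypothetical, and the real check is done in C by this series and may differ in
detail. The error text is modelled on the messages the tests grep for.

```shell
# Hypothetical helper (not part of git): succeed only if every %(atom)
# in the format is one that remote-object-info is described as supporting.
remote_format_supported () {
	fmt=$1
	# Extract each %(...) atom, one per line; "|| :" keeps an
	# atom-free format from being treated as a failure.
	atoms=$(printf '%s\n' "$fmt" | grep -o '%([^)]*)' || :)
	for atom in $atoms
	do
		case "$atom" in
		'%(objectname)'|'%(objectsize)')
			;;
		*)
			echo "$atom is currently not supported with remote-object-info" >&2
			return 1
			;;
		esac
	done
	return 0
}
```

A script that passes an explicit format such as "%(objectname) %(objectsize)"
does not depend on the default format, which is expected to change once
%(objecttype) is supported.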
Changes since V7 (v8 had an incorrect range-diff)
================
- Introduced strtoul_ul() in git-compat-util.h to ensure proper error handling
using strtoul from the standard library.
- Separated the test library into its own commit for better clarity
and organization.
- Used string_list_has_string() instead of unsorted_string_list_has_string() to
avoid quadratic runtime behaviour.
- Added a documentation link to the wire format in the commit message to
provide additional context.
- Added a new test case, "remote-object-info fails on not providing OID".
- Fixed typos and formatting issues for improved readability.
- Flattened the memory allocation logic of sizep in object_info_data for better
intuitiveness and readability.
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (4):
git-compat-util: add strtoul_ul() with error handling
cat-file: add declaration of variable i inside its for loop
cat-file: split test utility functions into a separate library file
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.txt | 24 +-
Makefile | 1 +
builtin/cat-file.c | 110 +++-
connect.c | 34 ++
connect.h | 8 +
fetch-object-info.c | 85 ++++
fetch-object-info.h | 22 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
git-compat-util.h | 18 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 28 +-
transport.h | 11 +
19 files changed, 1048 insertions(+), 68 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v7:
-: ---------- > 1: 63997081d1 git-compat-util: add strtoul_ul() with error handling
1: 5181e849eb ! 2: f188962f05 cat-file: add declaration of variable i inside its for loop
@@ fetch-pack.c: static void write_fetch_command_and_capabilities(struct strbuf *re
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
-+ for (int i = 0; i < server_options->nr; i++)
++ for (size_t i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
-: ---------- > 3: 71250a03d2 cat-file: split test utility functions into a separate library file
2: 0c6acf58c2 ! 4: 0ab26e6cd5 fetch-pack: refactor packet writing
@@ connect.c: int server_supports(const char *feature)
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
-+ for (int i = 0; i < server_options->nr; i++)
++ for (size_t i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
@@ connect.c: int server_supports(const char *feature)
PROTO_FILE,
## connect.h ##
-@@
- #ifndef CONNECT_H
- #define CONNECT_H
-
-+#include "string-list.h"
- #include "protocol.h"
-
- #define CONNECT_VERBOSE (1u << 0)
@@ connect.h: void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
-+/**
-+ * write_command_and_capabilities writes a command along with the requested
++/*
++ * Writes a command along with the requested
+ * server capabilities/features into a request buffer.
+ */
++struct string_list;
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
@@ fetch-pack.c: static int add_haves(struct fetch_negotiator *negotiator,
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
-- for (int i = 0; i < server_options->nr; i++)
+- for (size_t i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
3: 28ef74980c = 5: 8b381b4bdc fetch-pack: move fetch initialization
4: cb5bf65b88 = 6: a0a15e1e4f serve: advertise object-info feature
5: 79eab87dd2 ! 7: e1aad1ec30 transport: add client support for object-info
@@ Commit message
Sometimes, it is beneficial to retrieve information about an object
without downloading it entirely. The server-side logic for this
functionality was implemented in commit "a2ba162cda (object-info:
- support for retrieving object info, 2021-04-20)."
+ support for retrieving object info, 2021-04-20)." And the wire
+ format is documented at
+ https://git-scm.com/docs/protocol-v2#_object_info.
This commit introduces client functions to interact with the server.
Currently, the client supports requesting a list of object IDs with
- the ‘size’ feature from a v2 server. If the server does not advertise
+ the 'size' feature from a v2 server. If the server does not advertise
this feature (i.e., transfer.advertiseobjectinfo is set to false),
the client will return an error and exit.
+ Notice that the entire request is written into req_buf before being
+ sent to the remote. This approach follows the pattern used in the
+ `send_fetch_request()` logic within fetch-pack.c.
+ Streaming the request is not addressed in this patch.
+
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
@@ fetch-object-info.c (new)
+#include "fetch-object-info.h"
+#include "string-list.h"
+
-+/**
-+ * send_object_info_request sends git-cat-file object-info command and its
-+ * arguments into the request buffer.
-+ */
++/* Sends git-cat-file object-info command and its arguments into the request buffer. */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
@@ fetch-object-info.c (new)
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
-+ if (args->oids) {
++ if (args->oids)
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
-+ }
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
@@ fetch-object-info.c (new)
+ strbuf_release(&req_buf);
+}
+
-+/**
-+ * fetch_object_info sends git-cat-file object-info command into the request buf
-+ * and read the results from packets.
-+ */
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
@@ fetch-object-info.c (new)
+ break;
+ case protocol_v1:
+ case protocol_v0:
-+ die(_("wrong protocol version. expected v2"));
++ die(_("unsupported protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
@@ fetch-object-info.c (new)
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
-+ if (unsorted_string_list_has_string(args->object_info_options, reader->line)) {
-+ if (!strcmp(reader->line, "size")) {
-+ size_index = i;
-+ for (size_t j = 0; j < args->oids->nr; j++)
-+ object_info_data[j].sizep = xcalloc(1, sizeof(long));
-+ }
-+ continue;
++ if (!string_list_has_string(args->object_info_options, reader->line))
++ return -1;
++ if (!strcmp(reader->line, "size")) {
++ size_index = i;
++ for (size_t j = 0; j < args->oids->nr; j++)
++ object_info_data[j].sizep = xcalloc(1, sizeof(*object_info_data[j].sizep));
+ }
-+ return -1;
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
@@ fetch-object-info.c (new)
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
-+ *object_info_data[i].sizep = strtoul(object_info_values.items[1 + size_index].string, NULL, 10);
++ if (strtoul_ul(object_info_values.items[1 + size_index].string, 10, object_info_data[i].sizep))
++ die("object-info: ref %s has invalid size %s",
++ object_info_values.items[0].string,
++ object_info_values.items[1 + size_index].string);
+ }
+
+ string_list_clear(&object_info_values, 0);
@@ fetch-object-info.h (new)
+ struct oid_array *oids;
+};
+
++/*
++ * Sends git-cat-file object-info command into the request buf and read the
++ * results from packets.
++ */
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
-+ if (transport->smart_options
-+ && transport->smart_options->object_info
-+ && transport->smart_options->object_info_oids->nr > 0) {
++ if (transport->smart_options && transport->smart_options->object_info
++ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
-+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
++ obj_info_args.object_info_options = transport->smart_options->object_info_options;
++ string_list_sort(obj_info_args.object_info_options);
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
6: b60863aa5b ! 8: 0795ad53fe cat-file: add remote-object-info to batch-command
@@ builtin/cat-file.c: static void batch_one_object(const char *obj_name,
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
-+
++ if (object_info_oids.nr == 0) {
++ die(_("remote-object-info requires objects"));
++ }
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
-+
+ data->oid = object_info_oids.oid[i];
-+
+ if (remote_object_info[i].sizep) {
+ /*
+ * When reaching here, it means remote-object-info can retrieve
@@ object-store-ll.h: int for_each_object_in_pack(struct packed_git *p,
+
#endif /* OBJECT_STORE_LL_H */
- ## t/lib-cat-file.sh (new) ##
-@@
-+# Library of git-cat-file related tests.
-+
-+# Print a string without a trailing newline
-+echo_without_newline () {
-+ printf '%s' "$*"
-+}
-+
-+# Print a string without newlines and replaces them with a NULL character (\0).
-+echo_without_newline_nul () {
-+ echo_without_newline "$@" | tr '\n' '\0'
-+}
-+
-+# Calculate the length of a string removing any leading spaces.
-+strlen () {
-+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-+}
-
- ## t/t1006-cat-file.sh ##
-@@
- test_description='git cat-file'
-
- . ./test-lib.sh
-+. "$TEST_DIRECTORY"/lib-cat-file.sh
-
- test_cmdmode_usage () {
- test_expect_code 129 "$@" 2>err &&
-@@ t/t1006-cat-file.sh: do
- '
- done
-
--echo_without_newline () {
-- printf '%s' "$*"
--}
--
--echo_without_newline_nul () {
-- echo_without_newline "$@" | tr '\n' '\0'
--}
--
--strlen () {
-- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
--}
--
- run_tests () {
- type=$1
- oid=$2
-
## t/t1017-cat-file-remote-object-info.sh (new) ##
@@
+#!/bin/sh
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ )
+'
+
-+test_expect_success 'remote-object-info fails on server with legacy protocol' '
++test_expect_success 'remote-object-info fails on server with legacy protocol with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
@@ t/t1017-cat-file-remote-object-info.sh (new)
+ )
+'
+
++test_expect_success 'remote-object-info fails on not providing OID' '
++ (
++ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
++ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
++
++ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
++ remote-object-info "$HTTPD_URL/smart/http_parent"
++ EOF
++ test_grep "remote-object-info requires objects" err
++ )
++'
++
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
--
2.47.0
^ permalink raw reply [flat|nested] 174+ messages in thread
* [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-10 11:33 ` Christian Couder
2025-01-08 18:37 ` [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
` (6 subsequent siblings)
7 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
We already have strtoul_ui() and similar functions that wrap strtoul()
from the standard library with proper error handling. However, there
is currently no variant that returns an unsigned long.
Introduce strtoul_ul() to fill this gap: it stores the parsed value in
an unsigned long and reports invalid input through its return value.
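As a stand-alone illustration of the checks such a wrapper needs, here
is a minimal sketch. parse_ul() is a hypothetical name chosen for this
example; it mirrors the checks described above but is not the code
added by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Checked strtoul() wrapper (illustrative only).
 * Returns 0 and stores the value in *result on success, -1 otherwise.
 * *result is left untouched on failure.
 */
int parse_ul(const char *s, int base, unsigned long *result)
{
	unsigned long ul;
	char *end;

	assert(s && result);
	/* strtoul() accepts "-1" and wraps it around, so reject '-' early */
	if (strchr(s, '-'))
		return -1;
	errno = 0;
	ul = strtoul(s, &end, base);
	/* errno: out of range; *end: trailing garbage; end == s: no digits */
	if (errno || *end || end == s)
		return -1;
	*result = ul;
	return 0;
}
```

With this shape, callers can tell "0" apart from unparsable input,
which a bare strtoul(s, NULL, 10) call cannot do.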
---
git-compat-util.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/git-compat-util.h b/git-compat-util.h
index e283c46c6f..3bdb085624 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1351,6 +1351,24 @@ static inline int strtoul_ui(char const *s, int base, unsigned int *result)
return 0;
}
+// Converts a string to an unsigned long using the standard library's strtoul,
+// with additional error handling to ensure robustness.
+static inline int strtoul_ul(char const *s, int base, unsigned long *result)
+{
+ unsigned long ul;
+ char *p;
+
+ errno = 0;
+ /* negative values would be accepted by strtoul */
+ if (strchr(s, '-'))
+ return -1;
+ ul = strtoul(s, &p, base);
+ if (errno || *p || p == s)
+ return -1;
+ *result = ul;
+ return 0;
+}
+
static inline int strtol_i(char const *s, int base, int *result)
{
long ul;
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
2025-01-08 18:37 ` [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-10 11:39 ` Christian Couder
2025-01-08 18:37 ` [PATCH v9 3/8] cat-file: split test utility functions into a separate library file Eric Ju
` (5 subsequent siblings)
7 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
fetch-pack.c | 3 +--
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b13561cf73..69ea642dc6 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -676,12 +676,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -689,9 +687,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -717,7 +713,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -727,7 +722,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index 3a227721ed..f5a63f12cd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1329,9 +1329,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
if (advertise_sid && server_supports_v2("session-id"))
packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
if (server_options && server_options->nr) {
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
+ for (size_t i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v9 3/8] cat-file: split test utility functions into a separate library file
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
2025-01-08 18:37 ` [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
2025-01-08 18:37 ` [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-10 14:26 ` Christian Couder
2025-01-08 18:37 ` [PATCH v9 4/8] fetch-pack: refactor packet writing Eric Ju
` (4 subsequent siblings)
7 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This refactor extracts utility functions from the cat-file's test
t1006-cat-file.sh into a dedicated library file. The goal is to improve
code reuse and readability, enabling future tests to leverage these
utilities without duplicating code.
---
t/lib-cat-file.sh | 16 ++++++++++++++++
t/t1006-cat-file.sh | 13 +------------
2 files changed, 17 insertions(+), 12 deletions(-)
create mode 100644 t/lib-cat-file.sh
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..9fb20be308
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related tests.
+
+# Print a string without a trailing newline
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Print the length of a string in bytes, stripping the leading spaces that some wc implementations emit.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..5c7d581ea2 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -3,6 +3,7 @@
test_description='git cat-file'
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -98,18 +99,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v9 4/8] fetch-pack: refactor packet writing
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
` (2 preceding siblings ...)
2025-01-08 18:37 ` [PATCH v9 3/8] cat-file: split test utility functions into a separate library file Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-08 18:37 ` [PATCH v9 5/8] fetch-pack: move fetch initialization Eric Ju
` (3 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to a more
general-purpose function, write_command_and_capabilities(), enabling it
to serve both fetch and additional commands.
In this context, "command" refers to the "operations" supported by
Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
subcommand (e.g., git-fetch(1)) or a server-side operation like
"object-info" as implemented in commit a2ba162c
(object-info: support for retrieving object info, 2021-04-20).
Furthermore, write_command_and_capabilities() is moved to connect.c,
making it accessible to additional commands in the future.
To move write_command_and_capabilities() to connect.c, we need to
adjust how `advertise_sid` is managed. Previously, in fetch-pack.c,
`advertise_sid` was a static variable set via git_config_get_bool().
In connect.c, `advertise_sid` is now a local variable that is read with
git_config_get_bool() at the start of the function. This change is safe
because the original fetch-pack.c code writes `advertise_sid` in only
two places:
1. In function do_fetch_pack:
if (!server_supports("session-id"))
advertise_sid = 0;
2. In function fetch_pack_config():
git_config_get_bool("transfer.advertisesid", &advertise_sid);
Regarding 1: do_fetch_pack() is only relevant for protocol v1, so this
assignment can be ignored in our refactor, as
write_command_and_capabilities() is only used in protocol v2.
Regarding 2: git_config_get_bool() is declared in config.h, which
connect.c already depends on, so we can call it directly.
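One point worth checking when a file-scope static becomes a local: the
static was implicitly zero, a local is not. The sketch below assumes
that the lookup follows the convention of returning 0 and filling the
output when the key is set, and returning non-zero and leaving the
output untouched otherwise. lookup_bool(), fake_config and
should_advertise() are hypothetical names for illustration, not Git
APIs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "config", standing in for a real config store. */
struct fake_entry {
	const char *key;
	int value;
};

static const struct fake_entry fake_config[] = {
	{ "example.enabled", 1 },
};

/* Returns 0 and fills *dest if key is set; returns 1 and leaves *dest alone. */
int lookup_bool(const char *key, int *dest)
{
	for (size_t i = 0; i < sizeof(fake_config) / sizeof(fake_config[0]); i++) {
		if (!strcmp(fake_config[i].key, key)) {
			*dest = fake_config[i].value;
			return 0;
		}
	}
	return 1;
}

/* Seed the local with the default so that an unset key reads as "off". */
int should_advertise(const char *key)
{
	int advertise = 0;

	lookup_bool(key, &advertise);
	return advertise;
}
```

Without the explicit "= 0", the unset-key path would read an
indeterminate value.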
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 8 ++++++++
fetch-pack.c | 35 ++---------------------------------
3 files changed, 44 insertions(+), 33 deletions(-)
diff --git a/connect.c b/connect.c
index 10fad43e98..d89591f043 100644
--- a/connect.c
+++ b/connect.c
@@ -689,6 +689,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid = 0;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (size_t i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..d904c73a85 100644
--- a/connect.h
+++ b/connect.h
@@ -30,4 +30,12 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+/*
+ * Writes a command along with the requested
+ * server capabilities/features into a request buffer.
+ */
+struct string_list;
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index f5a63f12cd..78e7d38c47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1317,37 +1317,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
- for (size_t i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1358,7 +1327,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2186,7 +2155,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v9 5/8] fetch-pack: move fetch initialization
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
` (3 preceding siblings ...)
2025-01-08 18:37 ` [PATCH v9 4/8] fetch-pack: refactor packet writing Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-08 18:37 ` [PATCH v9 6/8] serve: advertise object-info feature Eric Ju
` (2 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
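The idea can be sketched with a toy state machine. All names below
(fetch_ctx, run_fetch, the reduced state list) are hypothetical and
only model the shape of the change, not the real do_fetch_pack_v2().

```c
enum fetch_state { CHECK_LOCAL, SEND_REQUEST, DONE };

struct fetch_ctx {
	int use_sideband;        /* set once before the loop */
	int visited_check_local; /* records whether the first state ran */
	int request_sideband;    /* value seen when the request is sent */
};

void run_fetch(struct fetch_ctx *ctx, enum fetch_state state)
{
	/* Hoisted out of CHECK_LOCAL: runs whatever the initial state is. */
	ctx->use_sideband = 2;

	while (state != DONE) {
		switch (state) {
		case CHECK_LOCAL:
			ctx->visited_check_local = 1;
			state = SEND_REQUEST;
			break;
		case SEND_REQUEST:
			ctx->request_sideband = ctx->use_sideband;
			state = DONE;
			break;
		case DONE:
			break;
		}
	}
}
```

Starting in either CHECK_LOCAL or SEND_REQUEST, the request step sees
the same initialized value, which is the property a second initial
state relies on.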
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 78e7d38c47..51de82e414 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1648,18 +1648,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v9 6/8] serve: advertise object-info feature
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
` (4 preceding siblings ...)
2025-01-08 18:37 ` [PATCH v9 5/8] fetch-pack: move fetch initialization Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-08 18:37 ` [PATCH v9 7/8] transport: add client support for object-info Eric Ju
2025-01-08 18:37 ` [PATCH v9 8/8] cat-file: add remote-object-info to batch-command Eric Ju
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index c8694e3751..7a388d26d9 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.0
^ permalink raw reply related [flat|nested] 174+ messages in thread
* [PATCH v9 7/8] transport: add client support for object-info
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
` (5 preceding siblings ...)
2025-01-08 18:37 ` [PATCH v9 6/8] serve: advertise object-info feature Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-08 18:37 ` [PATCH v9 8/8] cat-file: add remote-object-info to batch-command Eric Ju
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes, it is beneficial to retrieve information about an object
without downloading it entirely. The server-side logic for this
functionality was implemented in commit a2ba162cda (object-info:
support for retrieving object info, 2021-04-20). The wire format is
documented at https://git-scm.com/docs/protocol-v2#_object_info.
This commit introduces client functions to interact with the server.
Currently, the client supports requesting a list of object IDs with
the 'size' feature from a v2 server. If the server does not advertise
this feature (i.e., transfer.advertiseobjectinfo is set to false),
the client will return an error and exit.
Notice that the entire request is written into req_buf before being
sent to the remote. This approach follows the pattern used in the
`send_fetch_request()` logic within fetch-pack.c.
Streaming the request is not addressed in this patch.
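To make the buffered request concrete, here is a rough stand-alone
sketch of how such a request could be assembled in pkt-line framing
(4 hex digits of length, which count themselves, followed by the
payload; "0001" as delimiter and "0000" as flush). struct req,
pkt_write(), pkt_special() and build_object_info_request() are
hypothetical helpers, not Git functions. Capability lines such as
agent or object-format are left out, and payloads carry no trailing
newline, matching the packet_buf_write() calls in this patch.

```c
#include <stdio.h>
#include <string.h>

#define PKT_MAX 65520 /* largest pkt-line, length prefix included */

struct req {
	char buf[512];
	size_t len;
};

/* Append one data pkt-line: "%04x" length prefix plus payload. */
int pkt_write(struct req *r, const char *payload)
{
	size_t n = strlen(payload) + 4;

	if (n > PKT_MAX || r->len + n >= sizeof(r->buf))
		return -1;
	snprintf(r->buf + r->len, sizeof(r->buf) - r->len, "%04zx%s", n, payload);
	r->len += n;
	return 0;
}

/* Append a special packet: "0001" (delimiter) or "0000" (flush). */
int pkt_special(struct req *r, const char *marker)
{
	if (r->len + 4 >= sizeof(r->buf))
		return -1;
	memcpy(r->buf + r->len, marker, 4);
	r->len += 4;
	r->buf[r->len] = '\0';
	return 0;
}

/* Build the whole request in memory; the caller writes it out in one go. */
int build_object_info_request(struct req *r, const char *const *oids, size_t nr)
{
	char line[128];

	r->len = 0;
	r->buf[0] = '\0';
	if (pkt_write(r, "command=object-info") ||
	    pkt_special(r, "0001") ||
	    pkt_write(r, "size"))
		return -1;
	for (size_t i = 0; i < nr; i++) {
		if (snprintf(line, sizeof(line), "oid %s", oids[i]) >= (int)sizeof(line))
			return -1;
		if (pkt_write(r, line))
			return -1;
	}
	return pkt_special(r, "0000");
}
```

Because the buffer is only handed to the transport after the flush
packet is appended, memory use grows with the number of object IDs;
that is the cost streaming would avoid.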
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Makefile | 1 +
fetch-object-info.c | 85 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 22 ++++++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 ++
transport-helper.c | 11 ++++--
transport.c | 28 ++++++++++++++-
transport.h | 11 ++++++
8 files changed, 160 insertions(+), 3 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index 97e8385b66..e8c6702b32 100644
--- a/Makefile
+++ b/Makefile
@@ -1020,6 +1020,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..b279e06dc8
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,85 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/* Sends git-cat-file object-info command and its arguments into the request buffer. */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids)
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ die(_("object-info capability is not enabled on the server"));
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("unsupported protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (!string_list_has_string(args->object_info_options, reader->line))
+ return -1;
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(*object_info_data[j].sizep));
+ }
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++) {
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ if (strtoul_ul(object_info_values.items[1 + size_index].string, 10, object_info_data[i].sizep))
+ die("object-info: ref %s has invalid size %s",
+ object_info_values.items[0].string,
+ object_info_values.items[1 + size_index].string);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..6184d04d72
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,22 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+/*
+ * Sends the object-info command used by git-cat-file to the remote and
+ * reads the results from the response packets.
+ */
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index 51de82e414..704bc21b47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1654,6 +1654,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index 9d3470366f..119d3369f1 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index d457b42550..9da1547b2c 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -710,8 +710,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -727,6 +727,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* Fail the command explicitly so that no further commands are read. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 10d820c333..b6a2052908 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -464,8 +465,33 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ string_list_sort(obj_info_args.object_info_options);
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ ret = fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1]);
+ goto cleanup;
- if (!data->finished_handshake) {
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
diff --git a/transport.h b/transport.h
index 44100fa9b7..e61e931863 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to retrieve only object-info.
+ * If object-info is not supported, the operation will error and exit.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.0
* [PATCH v9 8/8] cat-file: add remote-object-info to batch-command
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
` (6 preceding siblings ...)
2025-01-08 18:37 ` [PATCH v9 7/8] transport: add client support for object-info Eric Ju
@ 2025-01-08 18:37 ` Eric Ju
2025-01-10 11:20 ` Christian Couder
7 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2025-01-08 18:37 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.
While `info` takes object ids one at a time, doing the same against a
server would cost one request per object. So `remote-object-info`
instead can take multiple object ids at once.
cat-file --batch-command is generally implemented in the following
manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
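The "get everything from the remote, then loop and print" flow can be
sketched as follows. This is only an illustration and not code from
the patch: `fake_fetch_sizes()`, `round_trips` and
`remote_object_info_sketch()` are hypothetical names, and the "remote"
is a stand-in that reports the length of each oid string as its size
so that the sketch needs no network.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the network layer: counts requests made. */
static int round_trips;

/*
 * Pretend to ask the remote for the sizes of `nr` objects in ONE request.
 * The "size" is just the length of the oid string, so the sketch has
 * deterministic data without touching a network.
 */
static int fake_fetch_sizes(const char **oids, size_t nr, unsigned long *sizes)
{
	round_trips++;
	for (size_t i = 0; i < nr; i++)
		sizes[i] = (unsigned long)strlen(oids[i]);
	return 0;
}

/*
 * Batched flow: fetch all object info first, then loop and print.
 * Writes "<oid> <size>\n" per object into `out`. Returns the number of
 * bytes written, or -1 if the fetch fails or `out` is too small.
 */
static int remote_object_info_sketch(const char **oids, size_t nr,
				     char *out, size_t outlen)
{
	unsigned long sizes[16];
	size_t used = 0;

	assert(nr <= 16); /* fixed-size scratch array, for brevity only */
	if (fake_fetch_sizes(oids, nr, sizes))
		return -1;
	for (size_t i = 0; i < nr; i++) {
		int n = snprintf(out + used, outlen - used, "%s %lu\n",
				 oids[i], sizes[i]);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Calling `remote_object_info_sketch()` once with three oids bumps
`round_trips` by one, whereas an `info`-style loop that fetched per
object would bump it three times.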
In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
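As a rough illustration of the single-line form, the arguments after
the command name can be split into one remote followed by any number
of object ids. The helper below is a hypothetical sketch: the patch
itself uses split_cmdline(), which also handles quoting, while this
version only splits on plain spaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split the argument part of a `remote-object-info` line in place.
 * `args` is "<remote> <oid> <oid> ..."; on return argv[0] is the remote
 * and argv[1..] are the object ids. Returns the number of tokens, or -1
 * if there are more than `max` tokens. Only plain spaces separate
 * tokens; quoting is NOT handled in this sketch.
 */
static int split_remote_object_info_args(char *args, const char **argv, int max)
{
	int argc = 0;
	char *p = args;

	assert(args && argv && max > 0);
	while (*p) {
		while (*p == ' ')	/* skip runs of separators */
			p++;
		if (!*p)
			break;
		if (argc == max)
			return -1;
		argv[argc++] = p;	/* token starts here */
		while (*p && *p != ' ')
			p++;
		if (*p)
			*p++ = '\0';	/* terminate token in place */
	}
	return argc;
}
```

A result of 1 means only a remote was given, which corresponds to the
"remote-object-info requires objects" error case in the patch.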
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 99 ++++
object-file.c | 11 +
object-store-ll.h | 3 +
t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
5 files changed, 797 insertions(+), 4 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..6a2f9fd752 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified
+ `<remote>` without downloading objects from the remote.
+ Error when the `object-info` capability is not supported by the server.
+ Error when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
+currently not supported by the `remote-object-info` command, we will error
+and exit when they are in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 69ea642dc6..5fb71aaf91 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -27,6 +27,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -45,9 +48,12 @@ struct batch_options {
char input_delim;
char output_delim;
const char *format;
+ int use_remote_info;
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -579,6 +585,61 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+ if (object_info_oids.nr == 0) {
+ die(_("remote-object-info requires objects"));
+ }
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -670,6 +731,43 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line, struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+ data->oid = object_info_oids.oid[i];
+ if (remote_object_info[i].sizep) {
+ /*
+ * Reaching here means remote-object-info retrieved the object
+ * information from the server without downloading the objects.
+ */
+ data->size = *remote_object_info[i].sizep;
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
+ }
+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -701,6 +799,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index 5b792b3dd4..96f204c93a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3128,3 +3128,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index cd3bd5bd99..20208e1d4f 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -553,4 +553,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
void *data, enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..fd6c63cdb9
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,664 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info git:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info file:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" >"$HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on not providing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent"
+ EOF
+ test_grep "remote-object-info requires objects" err
+ )
+'
+
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info http:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.0
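As a small illustration of how the -Z tests above build their input, here is a
sketch of the NUL-delimited command stream. It inlines an equivalent of the
`echo_without_newline_nul` helper from t/lib-cat-file.sh; the object IDs are
made-up placeholders and `nul_count` is a name chosen only for this example.

```shell
# Equivalent of the helper from t/lib-cat-file.sh: print the arguments
# and turn every newline into a NUL byte.
echo_without_newline_nul () {
	printf '%s' "$*" | tr '\n' '\0'
}

# Two placeholder commands, one per line; the OIDs are not real objects.
batch_input="info 1111111111111111111111111111111111111111
info 2222222222222222222222222222222222222222
"

# Count the NUL terminators that `cat-file --batch-command -Z` would see.
nul_count=$(echo_without_newline_nul "$batch_input" |
	tr -cd '\000' | wc -c | sed -e 's/^ *//')
echo "$nul_count"
```

Each command line, including the last one, ends up terminated by exactly one
NUL, which is the input shape the -Z tests feed to `git cat-file`.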
^ permalink raw reply related [flat|nested] 174+ messages in thread
* Re: [PATCH v9 8/8] cat-file: add remote-object-info to batch-command
2025-01-08 18:37 ` [PATCH v9 8/8] cat-file: add remote-object-info to batch-command Eric Ju
@ 2025-01-10 11:20 ` Christian Couder
2025-01-14 1:24 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Christian Couder @ 2025-01-10 11:20 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Wed, Jan 8, 2025 at 7:39 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> Since the `info` command in cat-file --batch-command prints object info
Nit: Everywhere in this commit message, it would be a bit clearer and
easier to read with:
s/cat-file --batch-command/`cat-file --batch-command`/
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
>
> Add `remote-object-info` to cat-file --batch-command.
s/`remote-object-info`/a new `remote-object-info` command/
[...]
> To summarize, `remote-object-info` gets object info from the remote and
> then loop through the object info passed in, printing the info.
s/loop/loops/
> +remote-object-info <remote> <object>...::
> + Print object info for object references `<object>` at specified
> + `<remote>` without downloading objects from the remote.
> + Error when the `object-info` capability is not supported by the server.
I think it's more grammatically correct to use "Error out when..." or
"Raise an error when..." than just "Error when..."
Also maybe: s/server/remote/
> + Error when no object references are provided.
Here also "Error out when..." or "Raise an error when..."
> + This command may be combined with `--buffer`.
[...]
> If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> +DO NOT RELY on the current the default format to stay the same!!!
s/current the default/current default/
> CAVEATS
> -------
>
> +Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
> +currently not supported by the `remote-object-info` command, we will error
s/error/raise an error/
or maybe:
s/error and exit/error out/
> +and exit when they are in the format string.
s/are/appear/
> @@ -45,9 +48,12 @@ struct batch_options {
> char input_delim;
> char output_delim;
> const char *format;
> + int use_remote_info;
"unsigned int" might be a bit better for bool fields like this.
Actually it seems to me that this field is set to 0 and 1 in some
places but we never read it, so I wonder if it's actually useful.
> };
> @@ -579,6 +585,61 @@ static void batch_one_object(const char *obj_name,
> object_context_release(&ctx);
> }
>
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> + int retval = 0;
> + struct remote *remote = NULL;
> + struct object_id oid;
> + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> + static struct transport *gtransport;
> +
> + /*
> + * Change the format to "%(objectname) %(objectsize)" when
> + * remote-object-info command is used. Once we start supporting objecttype
> + * the default format should change to DEFAULT_FORMAT
s/DEFAULT_FORMAT/DEFAULT_FORMAT./
> + */
> + if (!opt->format)
> + opt->format = "%(objectname) %(objectsize)";
> +
> + remote = remote_get(argv[0]);
> + if (!remote)
> + die(_("must supply valid remote when using remote-object-info"));
> +
> + oid_array_clear(&object_info_oids);
> + for (size_t i = 1; i < argc; i++) {
> + if (get_oid_hex(argv[i], &oid))
> + die(_("Not a valid object name %s"), argv[i]);
> + oid_array_append(&object_info_oids, &oid);
> + }
> + if (object_info_oids.nr == 0) {
> + die(_("remote-object-info requires objects"));
> + }
We prefer to drop '{' and '}' and use "!X" instead of "X == 0" when
possible, so:
if (!object_info_oids.nr)
die(_("remote-object-info requires objects"));
Thanks.
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling
2025-01-08 18:37 ` [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
@ 2025-01-10 11:33 ` Christian Couder
2025-01-14 1:39 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Christian Couder @ 2025-01-10 11:33 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Wed, Jan 8, 2025 at 7:38 PM Eric Ju <eric.peijian@gmail.com> wrote:
> +// Converts a string to an unsigned long using the standard library's strtoul,
> +// with additional error handling to ensure robustness.
We use comments like this:
/*
* Converts a string to an unsigned long using the standard library's strtoul,
* with additional error handling to ensure robustness.
*/
Also we use the imperative mood in comments before a function, so:
s/Converts/Convert/
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop
2025-01-08 18:37 ` [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2025-01-10 11:39 ` Christian Couder
2025-01-14 1:36 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Christian Couder @ 2025-01-10 11:39 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Wed, Jan 8, 2025 at 7:38 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> Some code used in this series declares variable i and only uses it
> in a for loop, not in any other logic outside the loop.
>
> Change the declaration of i to be inside the for loop for readability.
It might be nice to say that, while at it, we also change the type
from "int" to "size_t" where the latter makes more sense.
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 3/8] cat-file: split test utility functions into a separate library file
2025-01-08 18:37 ` [PATCH v9 3/8] cat-file: split test utility functions into a separate library file Eric Ju
@ 2025-01-10 14:26 ` Christian Couder
2025-01-14 1:33 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Christian Couder @ 2025-01-10 14:26 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
About the commit subject, maybe something like the following would be
a bit shorter:
t1006: split test utility functions into new "lib-cat-file.sh"
On Wed, Jan 8, 2025 at 7:38 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> This refactor extracts utility functions from the cat-file's test
s/test/test script/
> t1006-cat-file.sh into a dedicated library file. The goal is to improve
s/a dedicated library file/a new "lib-cat-file.sh" dedicated library file/
> code reuse and readability, enabling future tests to leverage these
> utilities without duplicating code
s/code/code./
> diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
> new file mode 100644
> index 0000000000..9fb20be308
> --- /dev/null
> +++ b/t/lib-cat-file.sh
> @@ -0,0 +1,16 @@
> +# Library of git-cat-file related tests.
s/tests/test functions/
> +
> +# Print a string without a trailing newline
s/newline/newline./
> +echo_without_newline () {
> + printf '%s' "$*"
> +}
> +
> +# Print a string without newlines and replaces them with a NULL character (\0).
s/replaces/replace/
> +echo_without_newline_nul () {
> + echo_without_newline "$@" | tr '\n' '\0'
> +}
> +
> +# Calculate the length of a string removing any leading spaces.
This might be a bit misleading as leading spaces are removed from the
output from `wc -c`, not from the string.
> +strlen () {
> + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> +}
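To illustrate the point about the comment, here is a minimal sketch using the
two helpers exactly as quoted above. The variable names `padded_len` and
`plain_len` are only for this example.

```shell
echo_without_newline () {
	printf '%s' "$*"
}

strlen () {
	# sed strips the padding that some `wc -c` implementations put in
	# front of the count; it never touches the string being measured.
	echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
}

# Leading spaces in the argument are still counted as part of the string.
padded_len=$(strlen "  hi")
plain_len=$(strlen "hi")
echo "$padded_len $plain_len"   # prints "4 2"
```

So the function returns the full byte length of its argument, and only the
formatting of the `wc -c` output is normalized.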
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 8/8] cat-file: add remote-object-info to batch-command
2025-01-10 11:20 ` Christian Couder
@ 2025-01-14 1:24 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-01-14 1:24 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Jan 10, 2025 at 6:21 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Wed, Jan 8, 2025 at 7:39 PM Eric Ju <eric.peijian@gmail.com> wrote:
> >
> > Since the `info` command in cat-file --batch-command prints object info
>
> Nit: Everywhere in this commit message, it would be a bit clearer and
> easier to read with:
>
> s/cat-file --batch-command/`cat-file --batch-command`/
>
Thank you. Fixed in v10.
> > for a given object, it is natural to add another command in cat-file
> > --batch-command to print object info for a given object from a remote.
> >
> > Add `remote-object-info` to cat-file --batch-command.
>
> s/`remote-object-info`/a new `remote-object-info` command/
>
Thank you. Fixed in v10.
> [...]
>
> > To summarize, `remote-object-info` gets object info from the remote and
> > then loop through the object info passed in, printing the info.
>
> s/loop/loops/
>
Thank you. Fixed in v10.
> > +remote-object-info <remote> <object>...::
> > + Print object info for object references `<object>` at specified
> > + `<remote>` without downloading objects from the remote.
> > + Error when the `object-info` capability is not supported by the server.
>
> I think it's more grammatically correct to use "Error out when..." or
> "Raise an error when..." than just "Error when..."
>
> Also maybe: s/server/remote/
>
Thank you. Fixed in v10. I will use "Raise an error when..."
> > + Error when no object references are provided.
>
> Here also "Error out when..." or "Raise an error when..."
>
Thank you. Fixed in v10.
> > + This command may be combined with `--buffer`.
>
> [...]
>
> > If no format is specified, the default format is `%(objectname)
> > -%(objecttype) %(objectsize)`.
> > +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> > +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> > +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> > +DO NOT RELY on the current the default format to stay the same!!!
>
> s/current the default/current default/
>
Thank you. Fixed in v10.
> > CAVEATS
> > -------
> >
> > +Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
> > +currently not supported by the `remote-object-info` command, we will error
>
> s/error/raise an error/
>
> or maybe:
>
> s/error and exit/error out/
>
Thank you. Replaced with "raise an error".
> > +and exit when they are in the format string.
>
> s/are/appear/
>
Thank you. Fixed in v10.
> > @@ -45,9 +48,12 @@ struct batch_options {
> > char input_delim;
> > char output_delim;
> > const char *format;
> > + int use_remote_info;
>
> "unsigned int" might be a bit better for bool fields like this.
>
> Actually it seems to me that this field is set to 0 and 1 in some
> places but we never read it, so I wonder if it's actually useful.
>
Thank you. It is not used at all and has been removed in v10.
> > };
>
> > @@ -579,6 +585,61 @@ static void batch_one_object(const char *obj_name,
> > object_context_release(&ctx);
> > }
> >
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > +{
> > + int retval = 0;
> > + struct remote *remote = NULL;
> > + struct object_id oid;
> > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> > + static struct transport *gtransport;
> > +
> > + /*
> > + * Change the format to "%(objectname) %(objectsize)" when
> > + * remote-object-info command is used. Once we start supporting objecttype
> > + * the default format should change to DEFAULT_FORMAT
>
> s/DEFAULT_FORMAT/DEFAULT_FORMAT./
>
Thank you. Fixed in v10.
> > + */
> > + if (!opt->format)
> > + opt->format = "%(objectname) %(objectsize)";
> > +
> > + remote = remote_get(argv[0]);
> > + if (!remote)
> > + die(_("must supply valid remote when using remote-object-info"));
> > +
> > + oid_array_clear(&object_info_oids);
> > + for (size_t i = 1; i < argc; i++) {
> > + if (get_oid_hex(argv[i], &oid))
> > + die(_("Not a valid object name %s"), argv[i]);
> > + oid_array_append(&object_info_oids, &oid);
> > + }
> > + if (object_info_oids.nr == 0) {
> > + die(_("remote-object-info requires objects"));
> > + }
>
> We prefer to drop '{' and '}' and use "!X" instead of "X == 0" when
> possible, so:
>
> if (!object_info_oids.nr)
> die(_("remote-object-info requires objects"));
>
> Thanks.
Thank you. Fixed in v10.
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 3/8] cat-file: split test utility functions into a separate library file
2025-01-10 14:26 ` Christian Couder
@ 2025-01-14 1:33 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-01-14 1:33 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Thank you, Christian. They are all fixed in v10.
On Fri, Jan 10, 2025 at 9:26 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> About the commit subject, maybe something like the following would be
> a bit shorter:
>
> t1006: split test utility functions into new "lib-cat-file.sh"
>
> On Wed, Jan 8, 2025 at 7:38 PM Eric Ju <eric.peijian@gmail.com> wrote:
> >
> > This refactor extracts utility functions from the cat-file's test
>
> s/test/test script/
>
> > t1006-cat-file.sh into a dedicated library file. The goal is to improve
>
> s/a dedicated library file/a new "lib-cat-file.sh" dedicated library file/
>
> > code reuse and readability, enabling future tests to leverage these
> > utilities without duplicating code
>
> s/code/code./
>
> > diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
> > new file mode 100644
> > index 0000000000..9fb20be308
> > --- /dev/null
> > +++ b/t/lib-cat-file.sh
> > @@ -0,0 +1,16 @@
> > +# Library of git-cat-file related tests.
>
> s/tests/test functions/
>
> > +
> > +# Print a string without a trailing newline
>
> s/newline/newline./
>
> > +echo_without_newline () {
> > + printf '%s' "$*"
> > +}
> > +
> > +# Print a string without newlines and replaces them with a NULL character (\0).
>
> s/replaces/replace/
>
> > +echo_without_newline_nul () {
> > + echo_without_newline "$@" | tr '\n' '\0'
> > +}
> > +
> > +# Calculate the length of a string removing any leading spaces.
>
> This might be a bit misleading as leading spaces are removed from the
> output from `wc -c`, not from the string.
>
Yes, I will just change it to "Calculate the length of a string. "
> > +strlen () {
> > + echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
> > +}
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop
2025-01-10 11:39 ` Christian Couder
@ 2025-01-14 1:36 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-01-14 1:36 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Thank you. Added that to the commit message.
On Fri, Jan 10, 2025 at 6:39 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Wed, Jan 8, 2025 at 7:38 PM Eric Ju <eric.peijian@gmail.com> wrote:
> >
> > Some code used in this series declares variable i and only uses it
> > in a for loop, not in any other logic outside the loop.
> >
> > Change the declaration of i to be inside the for loop for readability.
>
> It might be nice to say that, while at it, we also change the type
> from "int" to "size_t" where the latter makes more sense.
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling
2025-01-10 11:33 ` Christian Couder
@ 2025-01-14 1:39 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-01-14 1:39 UTC (permalink / raw)
To: Christian Couder
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Thank you. Fixed in v10.
On Fri, Jan 10, 2025 at 6:33 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Wed, Jan 8, 2025 at 7:38 PM Eric Ju <eric.peijian@gmail.com> wrote:
>
> > +// Converts a string to an unsigned long using the standard library's strtoul,
> > +// with additional error handling to ensure robustness.
>
> We use comments like this:
>
> /*
> * Converts a string to an unsigned long using the standard library's strtoul,
> * with additional error handling to ensure robustness.
> */
>
> Also we use the imperative mood in comments before a function, so:
>
> s/Converts/Convert/
^ permalink raw reply [flat|nested] 174+ messages in thread
* [PATCH v10 0/8] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (14 preceding siblings ...)
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
` (7 more replies)
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
16 siblings, 8 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This patch series is a continuation of Calvin Wan’s (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info
command at [1].
Sometimes it is beneficial to retrieve information about an object without
having to download it completely. The server logic for retrieving size has
already been implemented and merged in "a2ba162cda (object-info: support for
retrieving object info, 2021-04-20)"[2]. This patch series implements the client
option for it.
This patch series adds the `remote-object-info` command to
`cat-file --batch-command`. This command allows the client to make an
object-info command request to a server that supports protocol v2.
If the server uses protocol v2 but does not support the object-info capability,
`cat-file --batch-command` will die.
If a user attempts to use `remote-object-info` with protocol v1,
`cat-file --batch-command` will die.
Currently, only the size (%(objectsize)) is supported in this implementation.
The type (%(objecttype)) is not included in this patch series, as it is not yet
supported on the server side either. The plan is to implement the necessary
logic for both the server and client in a subsequent series.
The default format for remote-object-info is set to %(objectname) %(objectsize).
Once %(objecttype) is supported, the default format will be unified accordingly.
If the batch command format includes unsupported fields such as %(objecttype),
%(objectsize:disk), or %(deltabase), the command will terminate with an error.
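As a rough sketch of the input this command expects, the snippet below builds
one `remote-object-info <remote> <object>...` line. The helper name
`remote_object_info_line`, the example URL and the object IDs are all
hypothetical and only serve the illustration; running the resulting line
requires a git built with this series.

```shell
# Hypothetical helper: print one remote-object-info command line for
# the given remote followed by any number of object IDs.
remote_object_info_line () {
	remote=$1
	shift
	printf 'remote-object-info %s %s\n' "$remote" "$*"
}

# Placeholder remote and OIDs, not real objects.
line=$(remote_object_info_line "https://example.com/repo.git" \
	1111111111111111111111111111111111111111 \
	2222222222222222222222222222222222222222)
echo "$line"

# With a git that includes this series, the line could then be fed to
# cat-file, for example:
#   echo "$line" | git cat-file --batch-command="%(objectname) %(objectsize)"
```

Passing several object IDs on one line keeps the request to the remote to a
single round trip, which is the motivation given for accepting multiple
objects per command.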
Changes since V9
================
- Refactored documentation for improved clarity.
- Refactored commit messages to provide more detailed and relevant information.
- Revised comments to align with best practices and reduce potential confusion or misinterpretation.
- Fixed grammatical errors and typos throughout the code and documentation.
- Removed unused variables to improve code cleanliness and maintainability.
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (4):
git-compat-util: add strtoul_ul() with error handling
cat-file: add declaration of variable i inside its for loop
t1006: split test utility functions into new "lib-cat-file.sh"
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.txt | 24 +-
Makefile | 1 +
builtin/cat-file.c | 107 +++-
connect.c | 34 ++
connect.h | 8 +
fetch-object-info.c | 85 ++++
fetch-object-info.h | 22 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
git-compat-util.h | 20 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 28 +-
transport.h | 11 +
19 files changed, 1047 insertions(+), 68 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v9:
1: 0a77ace719 ! 1: a567de3dc6 git-compat-util: add strtoul_ul() with error handling
@@ git-compat-util.h: static inline int strtoul_ui(char const *s, int base, unsigne
return 0;
}
-+// Converts a string to an unsigned long using the standard library's strtoul,
-+// with additional error handling to ensure robustness.
++/*
++ * Convert a string to an unsigned long using the standard library's strtoul,
++ * with additional error handling to ensure robustness.
++ */
+static inline int strtoul_ul(char const *s, int base, unsigned long *result)
+{
+ unsigned long ul;
2: 51a0a48d7b ! 2: 8d5140b111 cat-file: add declaration of variable i inside its for loop
@@ Commit message
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
+ While at it, we also change its type from "int" to "size_t" where the latter makes more sense.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
3: fa1d6678a0 ! 3: 42d0539e9b cat-file: split test utility functions into a separate library file
@@ Metadata
Author: Eric Ju <eric.peijian@gmail.com>
## Commit message ##
- cat-file: split test utility functions into a separate library file
+ t1006: split test utility functions into new "lib-cat-file.sh"
This refactor extracts utility functions from the cat-file's test
- t1006-cat-file.sh into a dedicated library file. The goal is to improve
- code reuse and readability, enabling future tests to leverage these
- utilities without duplicating code
+ script "t1006-cat-file.sh" into a new "lib-cat-file.sh" dedicated
+ library file. The goal is to improve code reuse and readability,
+ enabling future tests to leverage these utilities without duplicating
+ code.
## t/lib-cat-file.sh (new) ##
@@
-+# Library of git-cat-file related tests.
++# Library of git-cat-file related test functions.
+
-+# Print a string without a trailing newline
++# Print a string without a trailing newline.
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
-+# Print a string without newlines and replaces them with a NULL character (\0).
++# Print a string without newlines and replace them with a NULL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
-+# Calculate the length of a string removing any leading spaces.
++# Calculate the length of a string.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
4: 61dd598576 = 4: 8fdb2c6f81 fetch-pack: refactor packet writing
5: 106e776dda = 5: 7d5f53de6e fetch-pack: move fetch initialization
6: fe9366b59d = 6: 17583ebcd9 serve: advertise object-info feature
7: 09646f6517 = 7: 22d5eb26d1 transport: add client support for object-info
8: 99c450fc7e ! 8: 0698aa3606 cat-file: add remote-object-info to batch-command
@@ Metadata
## Commit message ##
cat-file: add remote-object-info to batch-command
- Since the `info` command in cat-file --batch-command prints object info
- for a given object, it is natural to add another command in cat-file
- --batch-command to print object info for a given object from a remote.
+ Since the `info` command in `cat-file --batch-command` prints object
+ info for a given object, it is natural to add another command in
+ `cat-file --batch-command` to print object info for a given object
+ from a remote.
- Add `remote-object-info` to cat-file --batch-command.
+ Add `remote-object-info` to `cat-file --batch-command`.
While `info` takes object ids one at a time, this creates
- overhead when making requests to a server.So `remote-object-info`
+ overhead when making requests to a server. So `remote-object-info`
instead can take multiple object ids at once.
- cat-file --batch-command is generally implemented in the following
- manner:
+ The `cat-file --batch-command` command is generally implemented in
+ the following manner:
- Receive and parse input from user
- Call respective function attached to command
@@ Commit message
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
- then loop through the object info passed in, printing the info.
+ then loops through the object info passed in, printing the info.
- In order for remote-object-info to avoid remote communication overhead
- in the non-buffer mode, the objects are passed in as such:
+ In order for `remote-object-info` to avoid remote communication
+ overhead in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
@@ Documentation/git-cat-file.txt: info <object>::
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified
+ `<remote>` without downloading objects from the remote.
-+ Error when the `object-info` capability is not supported by the server.
-+ Error when no object references are provided.
++ Raise an error when the `object-info` capability is not supported by the remote.
++ Raise an error when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
@@ Documentation/git-cat-file.txt: newline. The available atoms are:
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
-+DO NOT RELY on the current the default format to stay the same!!!
++DO NOT RELY on the current default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ Documentation/git-cat-file.txt: scripting purposes.
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
-+currently not supported by the `remote-object-info` command, we will error
-+and exit when they are in the format string.
++currently not supported by the `remote-object-info` command, we will raise
++an error and exit when they appear in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
@@ builtin/cat-file.c
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ builtin/cat-file.c: struct batch_options {
- char input_delim;
- char output_delim;
- const char *format;
-+ int use_remote_info;
};
static const char *force_path;
@@ builtin/cat-file.c: static void batch_one_object(const char *obj_name,
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
-+ * the default format should change to DEFAULT_FORMAT
++ * the default format should change to DEFAULT_FORMAT.
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
@@ builtin/cat-file.c: static void batch_one_object(const char *obj_name,
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
-+ if (object_info_oids.nr == 0) {
++ if (!object_info_oids.nr)
+ die(_("remote-object-info requires objects"));
-+ }
++
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
-+ opt->use_remote_info = 1;
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+ data->oid = object_info_oids.oid[i];
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
+ }
-+ opt->use_remote_info = 0;
+ data->skip_object_info = 0;
+
+cleanup:
--
2.47.1
* [PATCH v10 1/8] git-compat-util: add strtoul_ul() with error handling
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
` (6 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
We already have strtoul_ui() and similar functions that provide proper
error handling using strtoul from the standard library. However,
there isn't currently a variant that returns an unsigned long.
This commit introduces strtoul_ul() to address this gap, enabling the
return of an unsigned long with proper error handling.
---
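A usage sketch for reviewers, not part of the commit. It repeats the helper
so that it builds on its own; parse_size_or() is a hypothetical caller
invented for illustration and does not exist in the series.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Copy of the helper added by this patch, so the sketch is self-contained. */
static inline int strtoul_ul(char const *s, int base, unsigned long *result)
{
	unsigned long ul;
	char *p;

	errno = 0;
	/* negative values would be accepted by strtoul */
	if (strchr(s, '-'))
		return -1;
	ul = strtoul(s, &p, base);
	if (errno || *p || p == s)
		return -1;
	*result = ul;
	return 0;
}

/*
 * Hypothetical caller: parse a decimal size field and fall back to a
 * caller-supplied default when the field is empty, negative, out of
 * range or followed by trailing garbage.
 */
static unsigned long parse_size_or(const char *field, unsigned long fallback)
{
	unsigned long size;

	if (strtoul_ul(field, 10, &size))
		return fallback;
	return size;
}
```

With this, parse_size_or("1234", 0) yields 1234, while "-1", "" and "12x"
all yield the fallback, because strtoul_ul() leaves *result untouched on
failure.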
git-compat-util.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/git-compat-util.h b/git-compat-util.h
index e283c46c6f..f2935750bf 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1351,6 +1351,26 @@ static inline int strtoul_ui(char const *s, int base, unsigned int *result)
return 0;
}
+/*
+ * Convert a string to an unsigned long using the standard library's strtoul,
+ * with additional error handling to ensure robustness.
+ */
+static inline int strtoul_ul(char const *s, int base, unsigned long *result)
+{
+ unsigned long ul;
+ char *p;
+
+ errno = 0;
+ /* negative values would be accepted by strtoul */
+ if (strchr(s, '-'))
+ return -1;
+ ul = strtoul(s, &p, base);
+ if (errno || *p || p == s)
+ return -1;
+ *result = ul;
+ return 0;
+}
+
static inline int strtol_i(char const *s, int base, int *result)
{
long ul;
--
2.47.1
* [PATCH v10 2/8] cat-file: add declaration of variable i inside its for loop
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
2025-01-14 2:14 ` [PATCH v10 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 3/8] t1006: split test utility functions into new "lib-cat-file.sh" Eric Ju
` (5 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
While at it, we also change its type from "int" to "size_t" where the latter makes more sense.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
fetch-pack.c | 3 +--
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b13561cf73..69ea642dc6 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -676,12 +676,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -689,9 +687,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -717,7 +713,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -727,7 +722,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index 3a227721ed..f5a63f12cd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1329,9 +1329,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
if (advertise_sid && server_supports_v2("session-id"))
packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
if (server_options && server_options->nr) {
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
+ for (size_t i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
--
2.47.1
* [PATCH v10 3/8] t1006: split test utility functions into new "lib-cat-file.sh"
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
2025-01-14 2:14 ` [PATCH v10 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
2025-01-14 2:14 ` [PATCH v10 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 4/8] fetch-pack: refactor packet writing Eric Ju
` (4 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This refactor extracts utility functions from the cat-file's test
script "t1006-cat-file.sh" into a new "lib-cat-file.sh" dedicated
library file. The goal is to improve code reuse and readability,
enabling future tests to leverage these utilities without duplicating
code.
---
t/lib-cat-file.sh | 16 ++++++++++++++++
t/t1006-cat-file.sh | 13 +------------
2 files changed, 17 insertions(+), 12 deletions(-)
create mode 100644 t/lib-cat-file.sh
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..44af232d74
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related test functions.
+
+# Print a string without a trailing newline.
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string without newlines and replace them with a NULL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Calculate the length of a string.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..5c7d581ea2 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -3,6 +3,7 @@
test_description='git cat-file'
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -98,18 +99,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
--
2.47.1
* [PATCH v10 4/8] fetch-pack: refactor packet writing
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
` (2 preceding siblings ...)
2025-01-14 2:14 ` [PATCH v10 3/8] t1006: split test utility functions into new "lib-cat-file.sh" Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 5/8] fetch-pack: move fetch initialization Eric Ju
` (3 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to a more
general-purpose function, write_command_and_capabilities(), enabling it
to serve both fetch and additional commands.
In this context, "command" refers to the "operations" supported by
Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
subcommand (e.g., git-fetch(1)) or a server-side operation like
"object-info" as implemented in commit a2ba162c
(object-info: support for retrieving object info, 2021-04-20).
Furthermore, write_command_and_capabilities() is moved to connect.c,
making it accessible to additional commands in the future.
To move write_command_and_capabilities() to connect.c, we need to
adjust how `advertise_sid` is managed. Previously, in fetch-pack.c,
`advertise_sid` was a static variable, modified using
git_config_get_bool().
In connect.c, we now initialize `advertise_sid` at the beginning of
the function by calling git_config_get_bool() directly. This change is
safe because, in the original fetch-pack.c code, there are only two
places that write `advertise_sid`:
1. In function do_fetch_pack:
if (!server_supports("session-id"))
advertise_sid = 0;
2. In function fetch_pack_config():
git_config_get_bool("transfer.advertisesid", &advertise_sid);
Regarding 1, since do_fetch_pack() is only relevant for protocol v1,
this assignment can be ignored in our refactor, as
write_command_and_capabilities() is only used in protocol v2.
Regarding 2, git_config_get_bool() is declared in config.h, which
connect.c already includes, so we can reuse it directly.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
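A simplified sketch for reviewers, not part of the commit, of the byte layout
a v2 request has on the wire: a "command=<name>" pkt-line, a delimiter packet,
the command arguments, and a flush packet. append_raw(), append_pkt() and
build_request() are hypothetical helpers, not Git's pkt-line API, and the
capability lines that write_command_and_capabilities() emits (agent,
session-id, server-option, object-format) are left out for brevity.

```c
#include <stdio.h>
#include <string.h>

/* Append a NUL-terminated string to buf; fail if it does not fit. */
static int append_raw(char *buf, size_t cap, size_t *len, const char *s)
{
	size_t n = strlen(s);

	if (*len + n + 1 > cap)
		return -1;
	memcpy(buf + *len, s, n + 1);
	*len += n;
	return 0;
}

/*
 * Append one pkt-line: a 4-digit hex length that counts itself plus
 * the payload, followed by the payload.
 */
static int append_pkt(char *buf, size_t cap, size_t *len, const char *payload)
{
	char hdr[5];
	size_t n = strlen(payload) + 4;

	if (n > 65520) /* largest pkt-line the protocol allows */
		return -1;
	snprintf(hdr, sizeof(hdr), "%04zx", n);
	if (append_raw(buf, cap, len, hdr))
		return -1;
	return append_raw(buf, cap, len, payload);
}

/*
 * Build "command=<command>", delim, one argument, flush.
 * Returns the request length, or -1 if buf is too small.
 */
static int build_request(char *buf, size_t cap, const char *command,
			 const char *arg)
{
	size_t len = 0;
	char line[128];

	snprintf(line, sizeof(line), "command=%s", command);
	if (append_pkt(buf, cap, &len, line) ||
	    append_raw(buf, cap, &len, "0001") || /* delimiter packet */
	    append_pkt(buf, cap, &len, arg) ||
	    append_raw(buf, cap, &len, "0000"))   /* flush packet */
		return -1;
	return (int)len;
}
```

For example, build_request(buf, sizeof(buf), "object-info", "size") fills
buf with "0017command=object-info00010008size0000".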
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 8 ++++++++
fetch-pack.c | 35 ++---------------------------------
3 files changed, 44 insertions(+), 33 deletions(-)
diff --git a/connect.c b/connect.c
index 10fad43e98..d89591f043 100644
--- a/connect.c
+++ b/connect.c
@@ -689,6 +689,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid = 0;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (size_t i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..d904c73a85 100644
--- a/connect.h
+++ b/connect.h
@@ -30,4 +30,12 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+/*
+ * Writes a command along with the requested
+ * server capabilities/features into a request buffer.
+ */
+struct string_list;
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index f5a63f12cd..78e7d38c47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1317,37 +1317,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
- for (size_t i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1358,7 +1327,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2186,7 +2155,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.47.1
* [PATCH v10 5/8] fetch-pack: move fetch initialization
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
` (3 preceding siblings ...)
2025-01-14 2:14 ` [PATCH v10 4/8] fetch-pack: refactor packet writing Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 6/8] serve: advertise object-info feature Eric Ju
` (2 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
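A toy sketch for reviewers, not part of the commit, of why the setup is
hoisted: once a state machine can be entered at more than one state, setup
that lives inside the first state is skipped for the other entry points.
The enum, struct ctx and run_machine() are hypothetical names and only mirror
the shape of do_fetch_pack_v2().

```c
enum state { CHECK_LOCAL, SEND_REQUEST, DONE };

struct ctx {
	int use_sideband;        /* stands in for the shared setup */
	int visited_check_local; /* records whether the first state ran */
};

static void run_machine(enum state state, struct ctx *c)
{
	/* Shared setup runs before the loop, so every entry state sees it. */
	c->use_sideband = 2;

	while (state != DONE) {
		switch (state) {
		case CHECK_LOCAL:
			c->visited_check_local = 1;
			state = SEND_REQUEST;
			break;
		case SEND_REQUEST:
			state = DONE;
			break;
		case DONE:
			break;
		}
	}
}
```

Entering at SEND_REQUEST skips CHECK_LOCAL entirely, yet use_sideband is
still set, which is the property the following patch relies on.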
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 78e7d38c47..51de82e414 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1648,18 +1648,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.47.1
* [PATCH v10 6/8] serve: advertise object-info feature
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
` (4 preceding siblings ...)
2025-01-14 2:14 ` [PATCH v10 5/8] fetch-pack: move fetch initialization Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-01-14 2:14 ` [PATCH v10 7/8] transport: add client support for object-info Eric Ju
2025-01-14 2:15 ` [PATCH v10 8/8] cat-file: add remote-object-info to batch-command Eric Ju
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index c8694e3751..7a388d26d9 100644
--- a/serve.c
+++ b/serve.c
@@ -70,7 +70,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -78,6 +78,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.47.1
* [PATCH v10 7/8] transport: add client support for object-info
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
` (5 preceding siblings ...)
2025-01-14 2:14 ` [PATCH v10 6/8] serve: advertise object-info feature Eric Ju
@ 2025-01-14 2:14 ` Eric Ju
2025-02-01 2:08 ` Jeff King
2025-01-14 2:15 ` [PATCH v10 8/8] cat-file: add remote-object-info to batch-command Eric Ju
7 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:14 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes, it is beneficial to retrieve information about an object
without downloading it entirely. The server-side logic for this
functionality was implemented in commit "a2ba162cda (object-info:
support for retrieving object info, 2021-04-20)", and the wire
format is documented at
https://git-scm.com/docs/protocol-v2#_object_info.
This commit introduces client functions to interact with the server.
Currently, the client supports requesting a list of object IDs with
the 'size' feature from a v2 server. If the server does not advertise
this feature (i.e., transfer.advertiseobjectinfo is set to false),
the client will return an error and exit.
Notice that the entire request is written into req_buf before being
sent to the remote. This approach follows the pattern used in the
`send_fetch_request()` logic within fetch-pack.c.
Streaming the request is not addressed in this patch.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
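A simplified sketch for reviewers, not part of the commit, of parsing one
"<oid> <size>" response line as described in the protocol-v2 documentation.
parse_obj_info_line() is a hypothetical helper; the patch itself splits the
line with string_list_split() and calls strtoul_ul(). The sketch does not
validate the object ID and only handles the 'size' attribute.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "<oid> <size>" line.
 * Returns 0 and stores the size on success, -1 on malformed input.
 */
static int parse_obj_info_line(const char *line, unsigned long *size)
{
	const char *sp = strchr(line, ' ');
	char *end;
	unsigned long v;

	if (!sp || sp == line)
		return -1; /* missing size field or missing oid */
	sp++;
	if (!*sp || strchr(sp, '-'))
		return -1; /* empty or negative size */
	errno = 0;
	v = strtoul(sp, &end, 10);
	if (errno || *end || end == sp)
		return -1; /* overflow, trailing garbage or no digits */
	*size = v;
	return 0;
}
```

A line such as "abc 42" stores 42, whereas "abc", "abc " and "abc 4 2" are
rejected without touching *size.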
Makefile | 1 +
fetch-object-info.c | 85 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 22 ++++++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 ++
transport-helper.c | 11 ++++--
transport.c | 28 ++++++++++++++-
transport.h | 11 ++++++
8 files changed, 160 insertions(+), 3 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index 97e8385b66..e8c6702b32 100644
--- a/Makefile
+++ b/Makefile
@@ -1020,6 +1020,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..b279e06dc8
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,85 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/* Write the object-info command and its arguments into a request buffer and send it. */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids)
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ die(_("object-info capability is not enabled on the server"));
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("unsupported protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (!string_list_has_string(args->object_info_options, reader->line))
+ return -1;
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(*object_info_data[j].sizep));
+ }
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++) {
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ if (strtoul_ul(object_info_values.items[1 + size_index].string, 10, object_info_data[i].sizep))
+ die("object-info: ref %s has invalid size %s",
+ object_info_values.items[0].string,
+ object_info_values.items[1 + size_index].string);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..6184d04d72
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,22 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+/*
+ * Send the object-info command to the remote and read the
+ * results from the response packets.
+ */
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index 51de82e414..704bc21b47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1654,6 +1654,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index 9d3470366f..119d3369f1 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index d457b42550..9da1547b2c 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -710,8 +710,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -727,6 +727,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* Fail the command explicitly so that no further commands are read. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 10d820c333..b6a2052908 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -464,8 +465,33 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options && transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ string_list_sort(obj_info_args.object_info_options);
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ ret = fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1]);
+ goto cleanup;
- if (!data->finished_handshake) {
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
diff --git a/transport.h b/transport.h
index 44100fa9b7..e61e931863 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to retrieve only object-info.
+ * If object-info is not supported, the operation will error and exit.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.47.1
* [PATCH v10 8/8] cat-file: add remote-object-info to batch-command
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
` (6 preceding siblings ...)
2025-01-14 2:14 ` [PATCH v10 7/8] transport: add client support for object-info Eric Ju
@ 2025-01-14 2:15 ` Eric Ju
2025-02-01 2:03 ` Jeff King
7 siblings, 1 reply; 174+ messages in thread
From: Eric Ju @ 2025-01-14 2:15 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in `cat-file --batch-command` prints object
info for a given object, it is natural to add another command in
`cat-file --batch-command` to print object info for a given object
from a remote.
Add `remote-object-info` to `cat-file --batch-command`.
The `info` command takes object ids one at a time, which creates
overhead when making requests to a server. `remote-object-info`
therefore accepts multiple object ids at once.
The `cat-file --batch-command` command is generally implemented in
the following manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
    - Call respective function attached to command
    - Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
  If command is `remote-object-info`:
    - Get object info from remote
    - Loop through and print each object info
  Else:
    - Call respective function attached to command
    - Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
  If command is `remote-object-info`:
    - Get object info from remote
    - Loop through and print each object info
  Else:
    - Call respective function attached to command
    - Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
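The non-buffered flow above can be sketched roughly as follows. This is only an illustration, not the code in this patch: `run_command`, `fetch_remote_info`, `struct obj_info`, `MAX_OIDS` and the `remote_requests` counter are hypothetical names, and the "size" is a placeholder rather than a real object size. The point is that all object ids on one line share a single simulated round trip.

```c
#include <stdio.h>
#include <string.h>

#define MAX_OIDS 16 /* arbitrary limit for this sketch */

struct obj_info {
	const char *oid;
	unsigned long size;
};

static int remote_requests; /* number of simulated round trips */

/* Hypothetical stand-in for one request that covers all given oids. */
static int fetch_remote_info(const char *remote, const char **oids, int nr,
			     struct obj_info *out)
{
	(void)remote; /* unused in this sketch */
	if (nr <= 0 || nr > MAX_OIDS)
		return -1;
	remote_requests++;
	for (int i = 0; i < nr; i++) {
		out[i].oid = oids[i];
		out[i].size = (unsigned long)strlen(oids[i]); /* placeholder */
	}
	return 0;
}

/*
 * Handle one parsed command line; argv[0] is the command name.
 * Returns the number of lines printed, or -1 on error.
 */
static int run_command(int argc, const char **argv)
{
	struct obj_info info[MAX_OIDS];
	int nr;

	if (argc < 1)
		return -1;
	if (strcmp(argv[0], "remote-object-info")) {
		/* Any other command: handle its single object locally. */
		printf("local: %s\n", argc > 1 ? argv[1] : "");
		return 1;
	}
	if (argc < 3)
		return -1; /* need a remote and at least one object id */
	nr = argc - 2;
	/* One request for the whole batch, then print each result. */
	if (fetch_remote_info(argv[1], argv + 2, nr, info))
		return -1;
	for (int i = 0; i < nr; i++)
		printf("%s %lu\n", info[i].oid, info[i].size);
	return nr;
}
```

Calling `run_command` with `remote-object-info`, a remote and three ids prints three lines while incrementing `remote_requests` only once.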
In order for `remote-object-info` to avoid remote communication
overhead in the non-buffer mode, the objects are passed in as such:
    remote-object-info <remote> <oid> <oid> ... <oid>
rather than
    remote-object-info <remote> <oid>
    remote-object-info <remote> <oid>
    ...
    remote-object-info <remote> <oid>
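As a rough illustration of the input shape, the arguments of one such line could be split and validated like this. It is a simplified sketch under stated assumptions: it splits on spaces and tabs only (the patch itself uses split_cmdline(), which also handles quoting), and `split_words`, `is_full_hex_oid`, `parse_remote_object_info` and `MAX_ARGS` are hypothetical names, not Git functions.

```c
#include <ctype.h>
#include <string.h>

#define MAX_ARGS 32 /* arbitrary limit for this sketch */

/* Split `line` in place on spaces and tabs; return token count or -1. */
static int split_words(char *line, const char **argv, int max)
{
	int argc = 0;
	char *p = line;

	while (*p) {
		while (*p == ' ' || *p == '\t')
			p++;
		if (!*p)
			break;
		if (argc == max)
			return -1; /* too many tokens */
		argv[argc++] = p;
		while (*p && *p != ' ' && *p != '\t')
			p++;
		if (*p)
			*p++ = '\0';
	}
	return argc;
}

/* True if `s` consists of exactly `hexsz` hexadecimal digits. */
static int is_full_hex_oid(const char *s, size_t hexsz)
{
	if (strlen(s) != hexsz)
		return 0;
	for (; *s; s++)
		if (!isxdigit((unsigned char)*s))
			return 0;
	return 1;
}

/*
 * Parse "<remote> <oid>..." (the text after the command name).
 * `oids` must have room for MAX_ARGS - 1 entries.
 * Returns the number of object ids, or -1 on malformed input.
 */
static int parse_remote_object_info(char *args, const char **remote,
				    const char **oids, size_t hexsz)
{
	const char *argv[MAX_ARGS];
	int argc = split_words(args, argv, MAX_ARGS);

	if (argc < 2)
		return -1; /* need a remote and at least one object id */
	*remote = argv[0];
	for (int i = 1; i < argc; i++) {
		if (!is_full_hex_oid(argv[i], hexsz))
			return -1;
		oids[i - 1] = argv[i];
	}
	return argc - 1;
}
```

With `hexsz` set to 40 (SHA-1), a line carrying a remote and two full object ids yields 2, while a line with no object ids or with an abbreviated id is rejected.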
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.txt | 24 +-
builtin/cat-file.c | 96 ++++
object-file.c | 11 +
object-store-ll.h | 3 +
t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
5 files changed, 794 insertions(+), 4 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5890ae368..4fbb3a077b 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified
+ `<remote>` without downloading objects from the remote.
+ Raise an error when the `object-info` capability is not supported by the remote.
+ Raise an error when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands, which use
+`%(objectname) %(objectsize)` for now because `%(objecttype)` is not supported yet.
+WARNING: Once `%(objecttype)` is supported, the default format will be unified, so
+do not rely on the current default format staying the same.
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that `%(objecttype)`, `%(objectsize:disk)` and `%(deltabase)` are
+currently not supported by the `remote-object-info` command; `cat-file` raises
+an error and exits when they appear in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 69ea642dc6..78b16e10bf 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -27,6 +27,9 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -48,6 +51,8 @@ struct batch_options {
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -579,6 +584,61 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT.
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+ if (!object_info_oids.nr)
+ die(_("remote-object-info requires objects"));
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -670,6 +730,41 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line, struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+
+ char *line_to_split = xstrdup_or_null(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+ data->oid = object_info_oids.oid[i];
+ if (remote_object_info[i].sizep) {
+ /*
+ * Reaching here means remote-object-info retrieved the information
+ * from the server without downloading the objects.
+ */
+ data->size = *remote_object_info[i].sizep;
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
+ }
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -701,6 +796,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index 5b792b3dd4..96f204c93a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3128,3 +3128,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index cd3bd5bd99..20208e1d4f 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -553,4 +553,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
void *data, enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..fd6c63cdb9
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,664 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the remote-object-info command,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info git:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # restore the server state
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info file:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # restore the server state
+ git -C "${server_path}" config transfer.advertiseobjectinfo true
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on not providing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent"
+ EOF
+ test_grep "remote-object-info requires objects" err
+ )
+'
+
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info http:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.47.1
* Re: [PATCH v10 8/8] cat-file: add remote-object-info to batch-command
2025-01-14 2:15 ` [PATCH v10 8/8] cat-file: add remote-object-info to batch-command Eric Ju
@ 2025-02-01 2:03 ` Jeff King
2025-02-21 15:34 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Jeff King @ 2025-02-01 2:03 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Jan 13, 2025 at 09:15:00PM -0500, Eric Ju wrote:
> +static void parse_cmd_remote_object_info(struct batch_options *opt,
> + const char *line, struct strbuf *output,
> + struct expand_data *data)
> +{
> + int count;
> + const char **argv;
> +
> + char *line_to_split = xstrdup_or_null(line);
> + count = split_cmdline(line_to_split, &argv);
> + if (get_remote_info(opt, count, argv))
> + goto cleanup;
Coverity complains that split_cmdline() can return a negative value when
the input is malformed, which we then feed to get_remote_info(). If I
understand correctly (from my very brief glance at the series), that
string would be under the control of the untrusted client?
I _think_ an attacker can't do anything too bad here, since
get_remote_info() also takes a signed int, and so iterating from 0 will
just find no entries. But probably we should explicitly check for error
and bail.
While just looking at this code from a security perspective, two other
things occur to me:
1. Calling xstrdup_or_null() implies that "line" may be NULL, which
would make "line_to_split" also NULL. But I think split_cmdline()
would segfault in that case. Should it just be xstrdup()?
2. Are there any bounds on the size of "line"? E.g., is it coming in
as a single pkt, or can it be arbitrarily large if an attacker
wants (it looks like maybe the latter, since it comes from a strbuf
in batch_objects_command(), but I didn't look at how network data
gets passed in to that). At any rate, I think we ran into problems
before with split_cmdline() and integer overflow, since it returns
an int (CVE-2022-39260). I thought we fixed it by rejecting long
lines in git-shell, but it looks like we also hardened
split_cmdline() in 0ca6ead81e (alias.c: reject too-long cmdline
strings in split_cmdline(), 2022-09-28).
So we are maybe OK, but I wonder if we should punt on absurd lines.
Related, can an attacker just flood input into that strbuf, making
it grow forever and waste memory? That's just a simple resource
attack, but we have tried to avoid those elsewhere in upload-pack,
etc.
-Peff
* Re: [PATCH v10 7/8] transport: add client support for object-info
2025-01-14 2:14 ` [PATCH v10 7/8] transport: add client support for object-info Eric Ju
@ 2025-02-01 2:08 ` Jeff King
2025-02-20 22:52 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Jeff King @ 2025-02-01 2:08 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Jan 13, 2025 at 09:14:59PM -0500, Eric Ju wrote:
> @@ -464,8 +465,33 @@ static int fetch_refs_via_pack(struct transport *transport,
> args.server_options = transport->server_options;
> args.negotiation_tips = data->options.negotiation_tips;
> args.reject_shallow_remote = transport->smart_options->reject_shallow;
> + args.object_info = transport->smart_options->object_info;
> +
> + if (transport->smart_options && transport->smart_options->object_info
Coverity complains about the check for a NULL transport->smart_options
here. If it's NULL we'd already have segfaulted a few lines above when
we look at the reject_shallow flag.
Not sure if that's an existing bug in the earlier code or not. ;) Your
extra check can't hurt anything, in the sense that it's just being
overly defensive, but it does make puzzling out the expected value of
smart_options harder.
-Peff
* Re: [PATCH v10 7/8] transport: add client support for object-info
2025-02-01 2:08 ` Jeff King
@ 2025-02-20 22:52 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-02-20 22:52 UTC (permalink / raw)
To: Jeff King
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Jan 31, 2025 at 9:08 PM Jeff King <peff@peff.net> wrote:
>
> On Mon, Jan 13, 2025 at 09:14:59PM -0500, Eric Ju wrote:
>
> > @@ -464,8 +465,33 @@ static int fetch_refs_via_pack(struct transport *transport,
> > args.server_options = transport->server_options;
> > args.negotiation_tips = data->options.negotiation_tips;
> > args.reject_shallow_remote = transport->smart_options->reject_shallow;
> > + args.object_info = transport->smart_options->object_info;
> > +
> > + if (transport->smart_options && transport->smart_options->object_info
>
> Coverity complains about the check for a NULL transport->smart_options
> here. If it's NULL we'd already have segfaulted a few lines above when
> we look at the reject_shallow flag.
>
> Not sure if that's an existing bug in the earlier code or not. ;) Your
> extra check can't hurt anything, in the sense that it's just being
> overly defensive, but it does make puzzling out the expected value of
> smart_options harder.
>
> -Peff
Thank you Jeff. Sorry for the late response.
I will remove the extra check. transport->smart_options will not be
NULL when it reaches
`args.reject_shallow_remote = transport->smart_options->reject_shallow;`
The call sequence is:
get_remote_info() in cat-file.c ==> transport_fetch_refs() ==>
transport->vtable->fetch_refs ==> fetch_refs_via_pack()
In get_remote_info(), we already have a check for NULL:
if (gtransport->smart_options) {
...
} else {
retval = -1;
}
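To illustrate the "check once in the caller, dereference freely in the callee" arrangement described above, here is a minimal standalone sketch. All struct and function names in it (smart_opts_sketch, transport_sketch, fetch_args_sketch, fill_args, get_info) are hypothetical stand-ins, not the real definitions from transport.h or builtin/cat-file.c:

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the real structs. */
struct smart_opts_sketch {
    int reject_shallow;
    int object_info;
};

struct transport_sketch {
    struct smart_opts_sketch *smart_options;
};

struct fetch_args_sketch {
    int reject_shallow_remote;
    int object_info;
};

/*
 * Callee: relies on the caller's guarantee that smart_options is set,
 * so it dereferences the pointer without a second NULL check.
 */
static void fill_args(const struct transport_sketch *t,
                      struct fetch_args_sketch *args)
{
    args->reject_shallow_remote = t->smart_options->reject_shallow;
    args->object_info = t->smart_options->object_info;
}

/* Caller: the single place where a NULL smart_options is rejected. */
static int get_info(const struct transport_sketch *t,
                    struct fetch_args_sketch *args)
{
    if (!t->smart_options)
        return -1;
    fill_args(t, args);
    return 0;
}
```

With one guard at the entry point, a later reader of the callee does not have to wonder whether smart_options may legitimately be NULL there.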
* Re: [PATCH v10 8/8] cat-file: add remote-object-info to batch-command
2025-02-01 2:03 ` Jeff King
@ 2025-02-21 15:34 ` Peijian Ju
2025-02-24 23:45 ` Jeff King
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2025-02-21 15:34 UTC (permalink / raw)
To: Jeff King
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Jan 31, 2025 at 9:03 PM Jeff King <peff@peff.net> wrote:
>
> On Mon, Jan 13, 2025 at 09:15:00PM -0500, Eric Ju wrote:
>
> > +static void parse_cmd_remote_object_info(struct batch_options *opt,
> > + const char *line, struct strbuf *output,
> > + struct expand_data *data)
> > +{
> > + int count;
> > + const char **argv;
> > +
> > + char *line_to_split = xstrdup_or_null(line);
> > + count = split_cmdline(line_to_split, &argv);
> > + if (get_remote_info(opt, count, argv))
> > + goto cleanup;
>
> Coverity complains that split_cmdline() can return a negative value when
> the input is malformed, which we then feed to get_remote_info(). If I
> understand correctly (from my very brief glance at the series), that
> string would be under the control of the untrusted client?
>
> I _think_ an attacker can't do anything too bad here, since
> get_remote_info() also takes a signed int, and so iterating from 0 will
> just find no entries. But probably we should explicitly check for error
> and bail.
>
An explicit check is added in v11: if split_cmdline() returns a negative
value, we will error and bail.
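A rough standalone sketch of that pattern follows. It does not use Git's split_cmdline(); split_words() and handle_line() are hypothetical stand-ins that only mimic the "negative return on malformed input" contract, and the fixed buffer sizes are arbitrary choices for the sketch:

```c
#include <ctype.h>
#include <string.h>

#define MAX_WORDS 16

/*
 * Hypothetical stand-in for a command-line splitter: splits `line` in
 * place on whitespace, honoring double quotes. Returns the word count,
 * or -1 on malformed input (unterminated quote or too many words).
 */
static int split_words(char *line, const char **argv, int max)
{
    int count = 0;
    char *p = line;

    while (*p) {
        while (isspace((unsigned char)*p))
            p++;
        if (!*p)
            break;
        if (count == max)
            return -1;
        if (*p == '"') {
            char *end = strchr(p + 1, '"');
            if (!end)
                return -1; /* unterminated quote */
            argv[count++] = p + 1;
            *end = '\0';
            p = end + 1;
        } else {
            argv[count++] = p;
            while (*p && !isspace((unsigned char)*p))
                p++;
            if (*p)
                *p++ = '\0';
        }
    }
    return count;
}

/* Returns 0 on success, -1 if the line is rejected. */
static int handle_line(const char *input)
{
    const char *argv[MAX_WORDS];
    char buf[256];
    int count;

    if (strlen(input) >= sizeof(buf))
        return -1;
    strcpy(buf, input);

    count = split_words(buf, argv, MAX_WORDS);
    if (count < 0)
        return -1; /* bail instead of passing a negative count on */
    return count < 2 ? -1 : 0; /* need a URL plus at least one object */
}
```

In this sketch a line with an unterminated quote is rejected right after the split, before argv is read at all.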
> While just looking at this code from a security perspective, two other
> things occur to me:
>
> 1. Calling xstrdup_or_null() implies that "line" may be NULL, which
> would make "line_to_split" also NULL. But I think split_cmdline()
> would segfault in that case. Should it just be xstrdup()?
>
Thank you. Revised to use xstrdup() in v11.
> 2. Are there any bounds on the size of "line"? E.g., is it coming in
> as a single pkt, or can it be arbitrarily large if an attacker
> wants (it looks like maybe the latter, since it comes from a strbuf
> in batch_objects_command(), but I didn't look at how network data
> gets passed in to that). At any rate, I think we ran into problems
> before with split_cmdline() and integer overflow, since it returns
> an int (CVE-2022-39260). I thought we fixed it by rejecting long
> lines in git-shell, but it looks like we also hardened
> split_cmdline() in 0ca6ead81e (alias.c: reject too-long cmdline
> strings in split_cmdline(), 2022-09-28).
>
> So we are maybe OK, but I wonder if we should punt on absurd lines.
> Related, can an attacker just flood input into that strbuf, making
> it grow forever and waste memory? That's just a simple resource
> attack, but we have tried to avoid those elsewhere in upload-pack,
> etc.
>
Thank you. I am adding a check in v11 for the length of `line`. Please let
me know if something like this makes sense:
if (strlen(line) >= INT_MAX) {
die(_("remote-object-info command input overflow"));
}
> -Peff
* [PATCH v11 0/8] cat-file: add remote-object-info to batch-command
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
` (15 preceding siblings ...)
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
` (7 more replies)
16 siblings, 8 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This patch series is a continuation of Calvin Wan’s (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info
command at [1].
Sometimes it is beneficial to retrieve information about an object without
having to download it completely. The server logic for retrieving size has
already been implemented and merged in "a2ba162cda (object-info: support for
retrieving object info, 2021-04-20)"[2]. This patch series implements the client
option for it.
This patch series adds the `remote-object-info` command to
`cat-file --batch-command`. This command allows the client to make an
object-info command request to a server that supports protocol v2.
If the server uses protocol v2 but does not support the object-info capability,
`cat-file --batch-command` will die.
If a user attempts to use `remote-object-info` with protocol v1,
`cat-file --batch-command` will die.
Currently, only the size (%(objectsize)) is supported in this implementation.
The type (%(objecttype)) is not included in this patch series, as it is not yet
supported on the server side either. The plan is to implement the necessary
logic for both the server and client in a subsequent series.
The default format for remote-object-info is set to %(objectname) %(objectsize).
Once %(objecttype) is supported, the default format will be unified accordingly.
If the batch command format includes unsupported fields such as %(objecttype),
%(objectsize:disk), or %(deltabase), the command will terminate with an error.
Changes since V10
================
- Add a check on command input to prevent overflow.
- Add other checks to prevent potential abuse.
Calvin Wan (4):
fetch-pack: refactor packet writing
fetch-pack: move fetch initialization
serve: advertise object-info feature
transport: add client support for object-info
Eric Ju (4):
git-compat-util: add strtoul_ul() with error handling
cat-file: add declaration of variable i inside its for loop
t1006: split test utility functions into new "lib-cat-file.sh"
cat-file: add remote-object-info to batch-command
Documentation/git-cat-file.adoc | 24 +-
Makefile | 1 +
builtin/cat-file.c | 125 ++++-
connect.c | 34 ++
connect.h | 8 +
fetch-object-info.c | 85 ++++
fetch-object-info.h | 22 +
fetch-pack.c | 51 +-
fetch-pack.h | 2 +
git-compat-util.h | 20 +
object-file.c | 11 +
object-store-ll.h | 3 +
serve.c | 4 +-
t/lib-cat-file.sh | 16 +
t/t1006-cat-file.sh | 13 +-
t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
transport-helper.c | 11 +-
transport.c | 28 +-
transport.h | 11 +
19 files changed, 1065 insertions(+), 68 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
create mode 100644 t/lib-cat-file.sh
create mode 100755 t/t1017-cat-file-remote-object-info.sh
Range-diff against v10:
1: a4a5aefa3e = 1: 814c53b402 git-compat-util: add strtoul_ul() with error handling
2: c67e79804e = 2: 04f41100c4 cat-file: add declaration of variable i inside its for loop
3: 7f0b824714 = 3: 3af67e6648 t1006: split test utility functions into new "lib-cat-file.sh"
4: 0d22d6af6e = 4: cb1088e436 fetch-pack: refactor packet writing
5: 34c34c7464 = 5: 614daac4bb fetch-pack: move fetch initialization
6: 54dd237c45 = 6: 4bc403fa2c serve: advertise object-info feature
7: 90a3d987d5 ! 7: adae08d5a8 transport: add client support for object-info
@@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
-+ if (transport->smart_options && transport->smart_options->object_info
++ if (transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
8: 9d932c2cb2 ! 8: 975d39cb6a cat-file: add remote-object-info to batch-command
@@ builtin/cat-file.c
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
++
++/* Maximum length for a remote URL. While no universal standard exists,
++ * 8K is assumed to be a reasonable limit.
++ */
++#define MAX_REMOTE_URL_LEN (8*1024)
++/* Maximum number of objects allowed in a single remote-object-info request. */
++#define MAX_ALLOWED_OBJ_LIMIT 10000
++/* Maximum input size permitted for the remote-object-info command. */
++#define MAX_REMOTE_OBJ_INFO_LINE (MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
+{
+ int count;
+ const char **argv;
++ char *line_to_split;
++
++ if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
++ die(_("remote-object-info command input overflow "
++ "(no more than %d objects are allowed)"),
++ MAX_ALLOWED_OBJ_LIMIT);
+
-+ char *line_to_split = xstrdup_or_null(line);
++ line_to_split = xstrdup(line);
+ count = split_cmdline(line_to_split, &argv);
++ if (count < 0)
++ die(_("split remote-object-info command"));
++
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
--
2.48.1
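For reference, the line-length bound introduced in the range-diff above can be worked out in isolation. The sketch below repeats the three macros from the patch; SKETCH_MAX_HEXSZ is an assumption standing in for GIT_MAX_HEXSZ (taken here to be 64, the hex length of a SHA-256 object ID), and line_within_limit() is a hypothetical helper, not a function from the series:

```c
#include <string.h>

/* Assumption: stands in for GIT_MAX_HEXSZ (hex length of a SHA-256 OID). */
#define SKETCH_MAX_HEXSZ 64

#define MAX_REMOTE_URL_LEN (8 * 1024)
#define MAX_ALLOWED_OBJ_LIMIT 10000
/* One URL, plus one "<hex-oid><separator>" slot per allowed object. */
#define MAX_REMOTE_OBJ_INFO_LINE \
    (MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (SKETCH_MAX_HEXSZ + 1))

/* Hypothetical helper: nonzero if the line is short enough to be split. */
static int line_within_limit(const char *line)
{
    return strlen(line) < (size_t)MAX_REMOTE_OBJ_INFO_LINE;
}
```

Under that assumption the limit is 8192 + 10000 * 65 = 658192 bytes, so any line of that length or longer would be rejected before it is duplicated and split.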
* [PATCH v11 1/8] git-compat-util: add strtoul_ul() with error handling
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
` (6 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
We already have strtoul_ui() and similar functions that provide proper
error handling using strtoul from the standard library. However,
there isn't currently a variant that returns an unsigned long.
This commit introduces strtoul_ul() to address this gap, enabling the
return of an unsigned long with proper error handling.
---
git-compat-util.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/git-compat-util.h b/git-compat-util.h
index e123288e8f..0e9a43351a 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1353,6 +1353,26 @@ static inline int strtoul_ui(char const *s, int base, unsigned int *result)
return 0;
}
+/*
+ * Convert a string to an unsigned long using the standard library's strtoul,
+ * with additional error handling to ensure robustness.
+ */
+static inline int strtoul_ul(char const *s, int base, unsigned long *result)
+{
+ unsigned long ul;
+ char *p;
+
+ errno = 0;
+ /* negative values would be accepted by strtoul */
+ if (strchr(s, '-'))
+ return -1;
+ ul = strtoul(s, &p, base);
+ if (errno || *p || p == s)
+ return -1;
+ *result = ul;
+ return 0;
+}
+
static inline int strtol_i(char const *s, int base, int *result)
{
long ul;
--
2.48.1
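The helper from this patch can be tried outside of Git. The block below is a standalone copy of the function body shown in the diff, with only the includes added; it is meant as a quick way to see which inputs are rejected, not as a replacement for the version in git-compat-util.h:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Standalone copy of the helper from the patch: parse `s` as an
 * unsigned long in the given base. Returns 0 on success and stores the
 * value in *result; returns -1 and leaves *result untouched on error.
 */
static int strtoul_ul(char const *s, int base, unsigned long *result)
{
    unsigned long ul;
    char *p;

    errno = 0;
    /* negative values would be accepted by strtoul */
    if (strchr(s, '-'))
        return -1;
    ul = strtoul(s, &p, base);
    /* reject overflow, trailing garbage, and empty input */
    if (errno || *p || p == s)
        return -1;
    *result = ul;
    return 0;
}
```

For example, "1234" parses successfully, while "-1", "12abc", the empty string, and a value too large for unsigned long all return -1.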
* [PATCH v11 2/8] cat-file: add declaration of variable i inside its for loop
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
2025-02-21 19:04 ` [PATCH v11 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 3/8] t1006: split test utility functions into new "lib-cat-file.sh" Eric Ju
` (5 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.
Change the declaration of i to be inside the for loop for readability.
While at it, we also change its type from "int" to "size_t" where the latter makes more sense.
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
builtin/cat-file.c | 11 +++--------
fetch-pack.c | 3 +--
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b13561cf73..69ea642dc6 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -676,12 +676,10 @@ static void dispatch_calls(struct batch_options *opt,
struct queued_cmd *cmd,
int nr)
{
- int i;
-
if (!opt->buffer_output)
die(_("flush is only for --buffer mode"));
- for (i = 0; i < nr; i++)
+ for (size_t i = 0; i < nr; i++)
cmd[i].fn(opt, cmd[i].line, output, data);
fflush(stdout);
@@ -689,9 +687,7 @@ static void dispatch_calls(struct batch_options *opt,
static void free_cmds(struct queued_cmd *cmd, size_t *nr)
{
- size_t i;
-
- for (i = 0; i < *nr; i++)
+ for (size_t i = 0; i < *nr; i++)
FREE_AND_NULL(cmd[i].line);
*nr = 0;
@@ -717,7 +713,6 @@ static void batch_objects_command(struct batch_options *opt,
size_t alloc = 0, nr = 0;
while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
- int i;
const struct parse_cmd *cmd = NULL;
const char *p = NULL, *cmd_end;
struct queued_cmd call = {0};
@@ -727,7 +722,7 @@ static void batch_objects_command(struct batch_options *opt,
if (isspace(*input.buf))
die(_("whitespace before command: '%s'"), input.buf);
- for (i = 0; i < ARRAY_SIZE(commands); i++) {
+ for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index 1ed5e11dd5..71fb2ca054 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1331,9 +1331,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
if (advertise_sid && server_supports_v2("session-id"))
packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
if (server_options && server_options->nr) {
- int i;
ensure_server_supports_v2("server-option");
- for (i = 0; i < server_options->nr; i++)
+ for (size_t i = 0; i < server_options->nr; i++)
packet_buf_write(req_buf, "server-option=%s",
server_options->items[i].string);
}
--
2.48.1
* [PATCH v11 3/8] t1006: split test utility functions into new "lib-cat-file.sh"
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
2025-02-21 19:04 ` [PATCH v11 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
2025-02-21 19:04 ` [PATCH v11 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 4/8] fetch-pack: refactor packet writing Eric Ju
` (4 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
This refactor extracts utility functions from cat-file's test
script "t1006-cat-file.sh" into a new "lib-cat-file.sh" dedicated
library file. The goal is to improve code reuse and readability,
enabling future tests to leverage these utilities without duplicating
code.
---
t/lib-cat-file.sh | 16 ++++++++++++++++
t/t1006-cat-file.sh | 13 +------------
2 files changed, 17 insertions(+), 12 deletions(-)
create mode 100644 t/lib-cat-file.sh
diff --git a/t/lib-cat-file.sh b/t/lib-cat-file.sh
new file mode 100644
index 0000000000..44af232d74
--- /dev/null
+++ b/t/lib-cat-file.sh
@@ -0,0 +1,16 @@
+# Library of git-cat-file related test functions.
+
+# Print a string without a trailing newline.
+echo_without_newline () {
+ printf '%s' "$*"
+}
+
+# Print a string, replacing each newline with a NUL character (\0).
+echo_without_newline_nul () {
+ echo_without_newline "$@" | tr '\n' '\0'
+}
+
+# Calculate the length of a string.
+strlen () {
+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 398865d6eb..1c27c10c6f 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -3,6 +3,7 @@
test_description='git cat-file'
. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
test_cmdmode_usage () {
test_expect_code 129 "$@" 2>err &&
@@ -98,18 +99,6 @@ do
'
done
-echo_without_newline () {
- printf '%s' "$*"
-}
-
-echo_without_newline_nul () {
- echo_without_newline "$@" | tr '\n' '\0'
-}
-
-strlen () {
- echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
-}
-
run_tests () {
type=$1
oid=$2
--
2.48.1
* [PATCH v11 4/8] fetch-pack: refactor packet writing
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
` (2 preceding siblings ...)
2025-02-21 19:04 ` [PATCH v11 3/8] t1006: split test utility functions into new "lib-cat-file.sh" Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 5/8] fetch-pack: move fetch initialization Eric Ju
` (3 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Refactor write_fetch_command_and_capabilities() to a more
general-purpose function, write_command_and_capabilities(), enabling it
to serve both fetch and additional commands.
In this context, "command" refers to the "operations" supported by
Git's wire protocol https://git-scm.com/docs/protocol-v2, such as a Git
subcommand (e.g., git-fetch(1)) or a server-side operation like
"object-info" as implemented in commit a2ba162c
(object-info: support for retrieving object info, 2021-04-20).
Furthermore, write_command_and_capabilities() is moved to connect.c,
making it accessible to additional commands in the future.
To move write_command_and_capabilities() to connect.c, we need to
adjust how `advertise_sid` is managed. Previously,
in fetch-pack.c, `advertise_sid` was a static variable, modified using
git_config_get_bool().
In connect.c, we now initialize `advertise_sid` at the beginning by
directly using git_config_get_bool(). This change is safe because:
In the original fetch-pack.c code, there are only two places that
write `advertise_sid`:
1. In function do_fetch_pack:
if (!server_supports("session-id"))
advertise_sid = 0;
2. In function fetch_pack_config():
git_config_get_bool("transfer.advertisesid", &advertise_sid);
Regarding 1, since do_fetch_pack() is only relevant for protocol v1, this
assignment can be ignored in our refactor, as
write_command_and_capabilities() is only used in protocol v2.
Regarding 2, git_config_get_bool() is declared in config.h, which is already
a dependency of connect.c, so we can reuse it directly.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
connect.c | 34 ++++++++++++++++++++++++++++++++++
connect.h | 8 ++++++++
fetch-pack.c | 35 ++---------------------------------
3 files changed, 44 insertions(+), 33 deletions(-)
diff --git a/connect.c b/connect.c
index 91f3990014..6647b4a4b6 100644
--- a/connect.c
+++ b/connect.c
@@ -688,6 +688,40 @@ int server_supports(const char *feature)
return !!server_feature_value(feature, NULL);
}
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options)
+{
+ const char *hash_name;
+ int advertise_sid = 0;
+
+ git_config_get_bool("transfer.advertisesid", &advertise_sid);
+
+ ensure_server_supports_v2(command);
+ packet_buf_write(req_buf, "command=%s", command);
+ if (server_supports_v2("agent"))
+ packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
+ if (advertise_sid && server_supports_v2("session-id"))
+ packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
+ if (server_options && server_options->nr) {
+ ensure_server_supports_v2("server-option");
+ for (size_t i = 0; i < server_options->nr; i++)
+ packet_buf_write(req_buf, "server-option=%s",
+ server_options->items[i].string);
+ }
+
+ if (server_feature_v2("object-format", &hash_name)) {
+ const int hash_algo = hash_algo_by_name(hash_name);
+ if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
+ die(_("mismatched algorithms: client %s; server %s"),
+ the_hash_algo->name, hash_name);
+ packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
+ } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
+ die(_("the server does not support algorithm '%s'"),
+ the_hash_algo->name);
+ }
+ packet_buf_delim(req_buf);
+}
+
enum protocol {
PROTO_LOCAL = 1,
PROTO_FILE,
diff --git a/connect.h b/connect.h
index 1645126c17..d904c73a85 100644
--- a/connect.h
+++ b/connect.h
@@ -30,4 +30,12 @@ void check_stateless_delimiter(int stateless_rpc,
struct packet_reader *reader,
const char *error);
+/*
+ * Writes a command along with the requested
+ * server capabilities/features into a request buffer.
+ */
+struct string_list;
+void write_command_and_capabilities(struct strbuf *req_buf, const char *command,
+ const struct string_list *server_options);
+
#endif
diff --git a/fetch-pack.c b/fetch-pack.c
index 71fb2ca054..19b4a092ea 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1319,37 +1319,6 @@ static int add_haves(struct fetch_negotiator *negotiator,
return haves_added;
}
-static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
- const struct string_list *server_options)
-{
- const char *hash_name;
-
- ensure_server_supports_v2("fetch");
- packet_buf_write(req_buf, "command=fetch");
- if (server_supports_v2("agent"))
- packet_buf_write(req_buf, "agent=%s", git_user_agent_sanitized());
- if (advertise_sid && server_supports_v2("session-id"))
- packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
- if (server_options && server_options->nr) {
- ensure_server_supports_v2("server-option");
- for (size_t i = 0; i < server_options->nr; i++)
- packet_buf_write(req_buf, "server-option=%s",
- server_options->items[i].string);
- }
-
- if (server_feature_v2("object-format", &hash_name)) {
- int hash_algo = hash_algo_by_name(hash_name);
- if (hash_algo_by_ptr(the_hash_algo) != hash_algo)
- die(_("mismatched algorithms: client %s; server %s"),
- the_hash_algo->name, hash_name);
- packet_buf_write(req_buf, "object-format=%s", the_hash_algo->name);
- } else if (hash_algo_by_ptr(the_hash_algo) != GIT_HASH_SHA1) {
- die(_("the server does not support algorithm '%s'"),
- the_hash_algo->name);
- }
- packet_buf_delim(req_buf);
-}
-
static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
struct fetch_pack_args *args,
const struct ref *wants, struct oidset *common,
@@ -1360,7 +1329,7 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
int done_sent = 0;
struct strbuf req_buf = STRBUF_INIT;
- write_fetch_command_and_capabilities(&req_buf, args->server_options);
+ write_command_and_capabilities(&req_buf, "fetch", args->server_options);
if (args->use_thin_pack)
packet_buf_write(&req_buf, "thin-pack");
@@ -2188,7 +2157,7 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips,
the_repository, "%d",
negotiation_round);
strbuf_reset(&req_buf);
- write_fetch_command_and_capabilities(&req_buf, server_options);
+ write_command_and_capabilities(&req_buf, "fetch", server_options);
packet_buf_write(&req_buf, "wait-for-done");
--
2.48.1
* [PATCH v11 5/8] fetch-pack: move fetch initialization
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
` (3 preceding siblings ...)
2025-02-21 19:04 ` [PATCH v11 4/8] fetch-pack: refactor packet writing Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 6/8] serve: advertise object-info feature Eric Ju
` (2 subsequent siblings)
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
There are some variables initialized at the start of the
do_fetch_pack_v2() state machine. Currently, they are initialized
in FETCH_CHECK_LOCAL, which is the initial state set at the beginning
of the function.
However, a subsequent patch will allow for another initial state,
while still requiring these initialized variables.
Move the initialization to be before the state machine,
so that they are set regardless of the initial state.
Note that there is no change in behavior, because we're moving code
from the beginning of the first state to just before the execution of
the state machine.
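The reasoning can be shown with a toy state machine. This is only a
sketch: `demo_state`, `demo_ctx` and demo_run() are made-up names that
stand in for the real fetch states and variables. Because the shared
setup runs before the loop, it takes effect whichever state the caller
starts in.

```c
enum demo_state {
	DEMO_CHECK_LOCAL,
	DEMO_SEND_REQUEST,
	DEMO_DONE
};

struct demo_ctx {
	int use_sideband; /* stands in for the variables being moved */
	int steps;        /* number of states visited */
};

/* Run the toy machine from `state`; returns the shared setting. */
int demo_run(enum demo_state state, struct demo_ctx *ctx)
{
	/* Shared setup: done once, independent of the initial state. */
	ctx->use_sideband = 2;
	ctx->steps = 0;

	while (state != DEMO_DONE) {
		ctx->steps++;
		switch (state) {
		case DEMO_CHECK_LOCAL:
			state = DEMO_SEND_REQUEST;
			break;
		case DEMO_SEND_REQUEST:
			state = DEMO_DONE;
			break;
		case DEMO_DONE:
			break;
		}
	}
	return ctx->use_sideband;
}
```

Starting from DEMO_CHECK_LOCAL visits two states and starting from
DEMO_SEND_REQUEST visits one, yet both runs see the same initialized
value. Had the setup stayed inside the first case, the second entry
point would have skipped it.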
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
fetch-pack.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index 19b4a092ea..35dccea073 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1650,18 +1650,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
reader.me = "fetch-pack";
}
+ /* v2 supports these by default */
+ allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+ use_sideband = 2;
+ if (args->depth > 0 || args->deepen_since || args->deepen_not)
+ args->deepen = 1;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
sort_ref_list(&ref, ref_compare_name);
QSORT(sought, nr_sought, cmp_ref_by_name);
- /* v2 supports these by default */
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- use_sideband = 2;
- if (args->depth > 0 || args->deepen_since || args->deepen_not)
- args->deepen = 1;
-
/* Filter 'ref' by 'sought' and those that aren't local */
mark_complete_and_common_ref(negotiator, args, &ref);
filter_refs(args, &ref, sought, nr_sought);
--
2.48.1
* [PATCH v11 6/8] serve: advertise object-info feature
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
` (4 preceding siblings ...)
2025-02-21 19:04 ` [PATCH v11 5/8] fetch-pack: move fetch initialization Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 7/8] transport: add client support for object-info Eric Ju
2025-02-21 19:04 ` [PATCH v11 8/8] cat-file: add remote-object-info to batch-command Eric Ju
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
In order for a client to know what object-info components a server can
provide, advertise supported object-info features. This will allow a
client to decide whether to query the server for object-info or fetch
as a fallback.
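A client-side check for an advertised attribute could look roughly like
the sketch below. It assumes the capability arrives as a line of the
form `object-info=size` and, if more attributes are advertised later,
that they are separated by spaces; that separator is an assumption of
this example, and capability_has_feature() is a hypothetical name, not
an existing Git function.

```c
#include <string.h>

/*
 * Return 1 if `line` is "<capability>=<values>" and `feature` appears
 * as a whole space-separated word in <values>, 0 otherwise.
 */
int capability_has_feature(const char *line, const char *capability,
			   const char *feature)
{
	size_t cap_len = strlen(capability);
	size_t feat_len = strlen(feature);
	const char *p;

	if (!feat_len)
		return 0;
	if (strncmp(line, capability, cap_len) || line[cap_len] != '=')
		return 0;

	p = line + cap_len + 1;
	while (*p) {
		size_t tok_len = strcspn(p, " ");

		/* compare whole words so "sizes" does not match "size" */
		if (tok_len == feat_len && !strncmp(p, feature, feat_len))
			return 1;
		p += tok_len;
		while (*p == ' ')
			p++;
	}
	return 0;
}
```

A caller would query the server only when
capability_has_feature(line, "object-info", "size") returns 1, and
otherwise report an error or fall back to another strategy.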
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
serve.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/serve.c b/serve.c
index f6dfe34a2b..92fd26fd0a 100644
--- a/serve.c
+++ b/serve.c
@@ -68,7 +68,7 @@ static void session_id_receive(struct repository *r UNUSED,
trace2_data_string("transfer", NULL, "client-sid", client_sid);
}
-static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+static int object_info_advertise(struct repository *r, struct strbuf *value)
{
if (advertise_object_info == -1 &&
repo_config_get_bool(r, "transfer.advertiseobjectinfo",
@@ -76,6 +76,8 @@ static int object_info_advertise(struct repository *r, struct strbuf *value UNUS
/* disabled by default */
advertise_object_info = 0;
}
+ if (value && advertise_object_info)
+ strbuf_addstr(value, "size");
return advertise_object_info;
}
--
2.48.1
* [PATCH v11 7/8] transport: add client support for object-info
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
` (5 preceding siblings ...)
2025-02-21 19:04 ` [PATCH v11 6/8] serve: advertise object-info feature Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-21 19:04 ` [PATCH v11 8/8] cat-file: add remote-object-info to batch-command Eric Ju
7 siblings, 0 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
From: Calvin Wan <calvinwan@google.com>
Sometimes, it is beneficial to retrieve information about an object
without downloading it entirely. The server-side logic for this
functionality was implemented in commit "a2ba162cda (object-info:
support for retrieving object info, 2021-04-20)", and the wire
format is documented at
https://git-scm.com/docs/protocol-v2#_object_info.
This commit introduces client functions to interact with the server.
Currently, the client supports requesting a list of object IDs with
the 'size' feature from a v2 server. If the server does not advertise
this feature (i.e., transfer.advertiseobjectinfo is set to false),
the client will return an error and exit.
Notice that the entire request is written into req_buf before being
sent to the remote. This approach follows the pattern used in the
`send_fetch_request()` logic within fetch-pack.c.
Streaming the request is not addressed in this patch.
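To make the response handling concrete, here is a minimal standalone
sketch of parsing one response line of the form "<oid> <size>". It does
not use Git's string_list or strtoul_ul() helpers;
parse_object_info_line() is a hypothetical name, and treating a missing
size as an error mirrors the "not our ref" case in this patch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<oid> <size>". On success, copy the oid into `oid` (at most
 * oid_cap bytes including the NUL), store the size in `*size` and
 * return 0. Return -1 for a missing or malformed size.
 */
int parse_object_info_line(const char *line, char *oid, size_t oid_cap,
			   unsigned long *size)
{
	const char *sp = strchr(line, ' ');
	unsigned long value;
	size_t oid_len;
	char *end;

	if (!sp)
		return -1;
	oid_len = (size_t)(sp - line);
	if (!oid_len || oid_len + 1 > oid_cap)
		return -1;

	/* strtoul() accepts leading blanks and signs; reject them here */
	if (!isdigit((unsigned char)sp[1]))
		return -1;

	errno = 0;
	value = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1;

	memcpy(oid, line, oid_len);
	oid[oid_len] = '\0';
	*size = value;
	return 0;
}
```

The output arguments are written only after the whole line has been
validated, so a caller never sees a half-filled result.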
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Makefile | 1 +
fetch-object-info.c | 85 +++++++++++++++++++++++++++++++++++++++++++++
fetch-object-info.h | 22 ++++++++++++
fetch-pack.c | 3 ++
fetch-pack.h | 2 ++
transport-helper.c | 11 ++++--
transport.c | 28 ++++++++++++++-
transport.h | 11 ++++++
8 files changed, 160 insertions(+), 3 deletions(-)
create mode 100644 fetch-object-info.c
create mode 100644 fetch-object-info.h
diff --git a/Makefile b/Makefile
index bcf5ed3f85..bd6786a3d9 100644
--- a/Makefile
+++ b/Makefile
@@ -1030,6 +1030,7 @@ LIB_OBJS += ewah/ewah_rlw.o
LIB_OBJS += exec-cmd.o
LIB_OBJS += fetch-negotiator.o
LIB_OBJS += fetch-pack.o
+LIB_OBJS += fetch-object-info.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
diff --git a/fetch-object-info.c b/fetch-object-info.c
new file mode 100644
index 0000000000..b279e06dc8
--- /dev/null
+++ b/fetch-object-info.c
@@ -0,0 +1,85 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "hex.h"
+#include "pkt-line.h"
+#include "connect.h"
+#include "oid-array.h"
+#include "object-store-ll.h"
+#include "fetch-object-info.h"
+#include "string-list.h"
+
+/* Sends git-cat-file object-info command and its arguments into the request buffer. */
+static void send_object_info_request(const int fd_out, struct object_info_args *args)
+{
+ struct strbuf req_buf = STRBUF_INIT;
+
+ write_command_and_capabilities(&req_buf, "object-info", args->server_options);
+
+ if (unsorted_string_list_has_string(args->object_info_options, "size"))
+ packet_buf_write(&req_buf, "size");
+
+ if (args->oids)
+ for (size_t i = 0; i < args->oids->nr; i++)
+ packet_buf_write(&req_buf, "oid %s", oid_to_hex(&args->oids->oid[i]));
+
+ packet_buf_flush(&req_buf);
+ if (write_in_full(fd_out, req_buf.buf, req_buf.len) < 0)
+ die_errno(_("unable to write request to remote"));
+
+ strbuf_release(&req_buf);
+}
+
+int fetch_object_info(const enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ const int stateless_rpc, const int fd_out)
+{
+ int size_index = -1;
+
+ switch (version) {
+ case protocol_v2:
+ if (!server_supports_v2("object-info"))
+ die(_("object-info capability is not enabled on the server"));
+ send_object_info_request(fd_out, args);
+ break;
+ case protocol_v1:
+ case protocol_v0:
+ die(_("unsupported protocol version. expected v2"));
+ case protocol_unknown_version:
+ BUG("unknown protocol version");
+ }
+
+ for (size_t i = 0; i < args->object_info_options->nr; i++) {
+ if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+ return -1;
+ }
+ if (!string_list_has_string(args->object_info_options, reader->line))
+ return -1;
+ if (!strcmp(reader->line, "size")) {
+ size_index = i;
+ for (size_t j = 0; j < args->oids->nr; j++)
+ object_info_data[j].sizep = xcalloc(1, sizeof(*object_info_data[j].sizep));
+ }
+ }
+
+ for (size_t i = 0; packet_reader_read(reader) == PACKET_READ_NORMAL && i < args->oids->nr; i++){
+ struct string_list object_info_values = STRING_LIST_INIT_DUP;
+
+ string_list_split(&object_info_values, reader->line, ' ', -1);
+ if (0 <= size_index) {
+ if (!strcmp(object_info_values.items[1 + size_index].string, ""))
+ die("object-info: not our ref %s",
+ object_info_values.items[0].string);
+
+ if (strtoul_ul(object_info_values.items[1 + size_index].string, 10, object_info_data[i].sizep))
+ die("object-info: ref %s has invalid size %s",
+ object_info_values.items[0].string,
+ object_info_values.items[1 + size_index].string);
+ }
+
+ string_list_clear(&object_info_values, 0);
+ }
+ check_stateless_delimiter(stateless_rpc, reader, "stateless delimiter expected");
+
+ return 0;
+}
diff --git a/fetch-object-info.h b/fetch-object-info.h
new file mode 100644
index 0000000000..6184d04d72
--- /dev/null
+++ b/fetch-object-info.h
@@ -0,0 +1,22 @@
+#ifndef FETCH_OBJECT_INFO_H
+#define FETCH_OBJECT_INFO_H
+
+#include "pkt-line.h"
+#include "protocol.h"
+#include "object-store-ll.h"
+
+struct object_info_args {
+ struct string_list *object_info_options;
+ const struct string_list *server_options;
+ struct oid_array *oids;
+};
+
+/*
+ * Sends git-cat-file object-info command into the request buf and read the
+ * results from packets.
+ */
+int fetch_object_info(enum protocol_version version, struct object_info_args *args,
+ struct packet_reader *reader, struct object_info *object_info_data,
+ int stateless_rpc, int fd_out);
+
+#endif /* FETCH_OBJECT_INFO_H */
diff --git a/fetch-pack.c b/fetch-pack.c
index 35dccea073..92e8a7291c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1656,6 +1656,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
if (args->depth > 0 || args->deepen_since || args->deepen_not)
args->deepen = 1;
+ if (args->object_info)
+ state = FETCH_SEND_REQUEST;
+
while (state != FETCH_DONE) {
switch (state) {
case FETCH_CHECK_LOCAL:
diff --git a/fetch-pack.h b/fetch-pack.h
index 9d3470366f..119d3369f1 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -16,6 +16,7 @@ struct fetch_pack_args {
const struct string_list *deepen_not;
struct list_objects_filter_options filter_options;
const struct string_list *server_options;
+ struct object_info *object_info_data;
/*
* If not NULL, during packfile negotiation, fetch-pack will send "have"
@@ -42,6 +43,7 @@ struct fetch_pack_args {
unsigned reject_shallow_remote:1;
unsigned deepen:1;
unsigned refetch:1;
+ unsigned object_info:1;
/*
* Indicate that the remote of this request is a promisor remote. The
diff --git a/transport-helper.c b/transport-helper.c
index d457b42550..9da1547b2c 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -710,8 +710,8 @@ static int fetch_refs(struct transport *transport,
/*
* If we reach here, then the server, the client, and/or the transport
- * helper does not support protocol v2. --negotiate-only requires
- * protocol v2.
+ * helper does not support protocol v2. --negotiate-only and cat-file
+ * remote-object-info require protocol v2.
*/
if (data->transport_options.acked_commits) {
warning(_("--negotiate-only requires protocol v2"));
@@ -727,6 +727,13 @@ static int fetch_refs(struct transport *transport,
free_refs(dummy);
}
+ /* fail the command explicitly to avoid further commands input. */
+ if (transport->smart_options->object_info)
+ die(_("remote-object-info requires protocol v2"));
+
+ if (!data->get_refs_list_called)
+ get_refs_list_using_list(transport, 0);
+
count = 0;
for (i = 0; i < nr_heads; i++)
if (!(to_fetch[i]->status & REF_STATUS_UPTODATE))
diff --git a/transport.c b/transport.c
index 6c2801bcbd..95be3771a6 100644
--- a/transport.c
+++ b/transport.c
@@ -9,6 +9,7 @@
#include "hook.h"
#include "pkt-line.h"
#include "fetch-pack.h"
+#include "fetch-object-info.h"
#include "remote.h"
#include "connect.h"
#include "send-pack.h"
@@ -465,8 +466,33 @@ static int fetch_refs_via_pack(struct transport *transport,
args.server_options = transport->server_options;
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
+ args.object_info = transport->smart_options->object_info;
+
+ if (transport->smart_options->object_info
+ && transport->smart_options->object_info_oids->nr > 0) {
+ struct packet_reader reader;
+ struct object_info_args obj_info_args = { 0 };
+
+ obj_info_args.server_options = transport->server_options;
+ obj_info_args.oids = transport->smart_options->object_info_oids;
+ obj_info_args.object_info_options = transport->smart_options->object_info_options;
+ string_list_sort(obj_info_args.object_info_options);
+
+ connect_setup(transport, 0);
+ packet_reader_init(&reader, data->fd[0], NULL, 0,
+ PACKET_READ_CHOMP_NEWLINE |
+ PACKET_READ_GENTLE_ON_EOF |
+ PACKET_READ_DIE_ON_ERR_PACKET);
+
+ data->version = discover_version(&reader);
+ transport->hash_algo = reader.hash_algo;
+
+ ret = fetch_object_info(data->version, &obj_info_args, &reader,
+ data->options.object_info_data, transport->stateless_rpc,
+ data->fd[1]);
+ goto cleanup;
- if (!data->finished_handshake) {
+ } else if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
diff --git a/transport.h b/transport.h
index 44100fa9b7..e61e931863 100644
--- a/transport.h
+++ b/transport.h
@@ -5,6 +5,7 @@
#include "remote.h"
#include "list-objects-filter-options.h"
#include "string-list.h"
+#include "object-store.h"
struct git_transport_options {
unsigned thin : 1;
@@ -30,6 +31,12 @@ struct git_transport_options {
*/
unsigned connectivity_checked:1;
+ /*
+ * Transport will attempt to retrieve only object-info.
+ * If object-info is not supported, the operation will error and exit.
+ */
+ unsigned object_info : 1;
+
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
@@ -53,6 +60,10 @@ struct git_transport_options {
* common commits to this oidset instead of fetching any packfiles.
*/
struct oidset *acked_commits;
+
+ struct oid_array *object_info_oids;
+ struct object_info *object_info_data;
+ struct string_list *object_info_options;
};
enum transport_family {
--
2.48.1
* [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
` (6 preceding siblings ...)
2025-02-21 19:04 ` [PATCH v11 7/8] transport: add client support for object-info Eric Ju
@ 2025-02-21 19:04 ` Eric Ju
2025-02-24 20:46 ` Junio C Hamano
2025-02-24 23:47 ` Jeff King
7 siblings, 2 replies; 174+ messages in thread
From: Eric Ju @ 2025-02-21 19:04 UTC (permalink / raw)
To: git
Cc: calvinwan, jonathantanmy, chriscool, eric.peijian, karthik.188,
toon, jltobler
Since the `info` command in `cat-file --batch-command` prints object
info for a given object, it is natural to add another command in
`cat-file --batch-command` to print object info for a given object
from a remote.
Add `remote-object-info` to `cat-file --batch-command`.
`info` takes object ids one at a time, which creates overhead
when making requests to a server, so `remote-object-info`
instead takes multiple object ids at once.
The `cat-file --batch-command` command is generally implemented in
the following manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice that object info is fetched and printed one object at a
time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for `remote-object-info` to avoid remote communication
overhead in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
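The single-line form can be split into a remote name and a list of
object ids with a small tokenizer. The sketch below is illustrative
only: split_remote_args() is a hypothetical name, and it splits on
blanks and tabs without any quoting support, unlike the split_cmdline()
call used in the patch.

```c
#include <string.h>

/*
 * Split "<remote> <oid> <oid> ..." in place. The first word is stored
 * in `*remote`, the rest in `oids`. Returns the number of oids, or -1
 * if the line is empty or holds more than `max_oids` oids.
 */
int split_remote_args(char *line, const char **remote,
		      const char **oids, size_t max_oids)
{
	size_t n = 0;
	char *p = line + strspn(line, " \t");

	*remote = NULL;
	while (*p) {
		char *tok = p;

		p += strcspn(p, " \t");
		if (*p) {
			*p++ = '\0';            /* terminate this word */
			p += strspn(p, " \t"); /* skip to the next one */
		}
		if (!*remote)
			*remote = tok;
		else if (n < max_oids)
			oids[n++] = tok;
		else
			return -1;
	}
	return *remote ? (int)n : -1;
}
```

A return value of 0 means a remote was given without any object ids,
which the command in this patch treats as an error. All ids gathered
from one line can then go to the server in a single request.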
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
---
Documentation/git-cat-file.adoc | 24 +-
builtin/cat-file.c | 114 +++++
object-file.c | 11 +
object-store-ll.h | 3 +
t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
5 files changed, 812 insertions(+), 4 deletions(-)
create mode 100755 t/t1017-cat-file-remote-object-info.sh
diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
index d5890ae368..4fbb3a077b 100644
--- a/Documentation/git-cat-file.adoc
+++ b/Documentation/git-cat-file.adoc
@@ -149,6 +149,13 @@ info <object>::
Print object info for object reference `<object>`. This corresponds to the
output of `--batch-check`.
+remote-object-info <remote> <object>...::
+ Print object info for object references `<object>` at specified
+ `<remote>` without downloading objects from the remote.
+ Raise an error when the `object-info` capability is not supported by the remote.
+ Raise an error when no object references are provided.
+ This command may be combined with `--buffer`.
+
flush::
Used with `--buffer` to execute all preceding commands that were issued
since the beginning or since the last flush was issued. When `--buffer`
@@ -290,7 +297,8 @@ newline. The available atoms are:
The full hex representation of the object name.
`objecttype`::
- The type of the object (the same as `cat-file -t` reports).
+ The type of the object (the same as `cat-file -t` reports). See
+ `CAVEATS` below. Not supported by `remote-object-info`.
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
@@ -298,13 +306,14 @@ newline. The available atoms are:
`objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the
- note about on-disk sizes in the `CAVEATS` section below.
+ note about on-disk sizes in the `CAVEATS` section below. Not
+ supported by `remote-object-info`.
`deltabase`::
If the object is stored as a delta on-disk, this expands to the
full hex representation of the delta base object name.
Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
- below.
+ below. Not supported by `remote-object-info`.
`rest`::
If this atom is used in the output string, input lines are split
@@ -314,7 +323,10 @@ newline. The available atoms are:
line) are output in place of the `%(rest)` atom.
If no format is specified, the default format is `%(objectname)
-%(objecttype) %(objectsize)`.
+%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
+`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
+WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
+DO NOT RELY on the current default format to stay the same!!!
If `--batch` is specified, or if `--batch-command` is used with the `contents`
command, the object information is followed by the object contents (consisting
@@ -396,6 +408,10 @@ scripting purposes.
CAVEATS
-------
+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are
+currently not supported by the `remote-object-info` command, we will raise
+an error and exit when they appear in the format string.
+
Note that the sizes of objects on disk are reported accurately, but care
should be taken in drawing conclusions about which refs or objects are
responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 69ea642dc6..47fd2a777b 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -27,6 +27,18 @@
#include "promisor-remote.h"
#include "mailmap.h"
#include "write-or-die.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
+
+/* Maximum length for a remote URL. While no universal standard exists,
+ * 8K is assumed to be a reasonable limit.
+ */
+#define MAX_REMOTE_URL_LEN (8*1024)
+/* Maximum number of objects allowed in a single remote-object-info request. */
+#define MAX_ALLOWED_OBJ_LIMIT 10000
+/* Maximum input size permitted for the remote-object-info command. */
+#define MAX_REMOTE_OBJ_INFO_LINE (MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))
enum batch_mode {
BATCH_MODE_CONTENTS,
@@ -48,6 +60,8 @@ struct batch_options {
};
static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
static struct string_list mailmap = STRING_LIST_INIT_NODUP;
static int use_mailmap;
@@ -579,6 +593,61 @@ static void batch_one_object(const char *obj_name,
object_context_release(&ctx);
}
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+ int retval = 0;
+ struct remote *remote = NULL;
+ struct object_id oid;
+ struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+ static struct transport *gtransport;
+
+ /*
+ * Change the format to "%(objectname) %(objectsize)" when
+ * remote-object-info command is used. Once we start supporting objecttype
+ * the default format should change to DEFAULT_FORMAT.
+ */
+ if (!opt->format)
+ opt->format = "%(objectname) %(objectsize)";
+
+ remote = remote_get(argv[0]);
+ if (!remote)
+ die(_("must supply valid remote when using remote-object-info"));
+
+ oid_array_clear(&object_info_oids);
+ for (size_t i = 1; i < argc; i++) {
+ if (get_oid_hex(argv[i], &oid))
+ die(_("Not a valid object name %s"), argv[i]);
+ oid_array_append(&object_info_oids, &oid);
+ }
+ if (!object_info_oids.nr)
+ die(_("remote-object-info requires objects"));
+
+ gtransport = transport_get(remote, NULL);
+ if (gtransport->smart_options) {
+ CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
+ gtransport->smart_options->object_info = 1;
+ gtransport->smart_options->object_info_oids = &object_info_oids;
+
+ /* 'objectsize' is the only option currently supported */
+ if (!strstr(opt->format, "%(objectsize)"))
+ die(_("%s is currently not supported with remote-object-info"), opt->format);
+
+ string_list_append(&object_info_options, "size");
+
+ if (object_info_options.nr > 0) {
+ gtransport->smart_options->object_info_options = &object_info_options;
+ gtransport->smart_options->object_info_data = remote_object_info;
+ retval = transport_fetch_refs(gtransport, NULL);
+ }
+ } else {
+ retval = -1;
+ }
+
+ string_list_clear(&object_info_options, 0);
+ transport_disconnect(gtransport);
+ return retval;
+}
+
struct object_cb_data {
struct batch_options *opt;
struct expand_data *expand;
@@ -670,6 +739,50 @@ static void parse_cmd_info(struct batch_options *opt,
batch_one_object(line, output, opt, data);
}
+static void parse_cmd_remote_object_info(struct batch_options *opt,
+ const char *line, struct strbuf *output,
+ struct expand_data *data)
+{
+ int count;
+ const char **argv;
+ char *line_to_split;
+
+ if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
+ die(_("remote-object-info command input overflow "
+ "(no more than %d objects are allowed)"),
+ MAX_ALLOWED_OBJ_LIMIT);
+
+ line_to_split = xstrdup(line);
+ count = split_cmdline(line_to_split, &argv);
+ if (count < 0)
+ die(_("split remote-object-info command"));
+
+ if (get_remote_info(opt, count, argv))
+ goto cleanup;
+
+ data->skip_object_info = 1;
+ for (size_t i = 0; i < object_info_oids.nr; i++) {
+ data->oid = object_info_oids.oid[i];
+ if (remote_object_info[i].sizep) {
+ /*
+ * When reaching here, it means remote-object-info can retrieve
+ * information from server without downloading them.
+ */
+ data->size = *remote_object_info[i].sizep;
+ opt->batch_mode = BATCH_MODE_INFO;
+ batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+ }
+ }
+ data->skip_object_info = 0;
+
+cleanup:
+ for (size_t i = 0; i < object_info_oids.nr; i++)
+ free_object_info_contents(&remote_object_info[i]);
+ free(line_to_split);
+ free(argv);
+ free(remote_object_info);
+}
+
static void dispatch_calls(struct batch_options *opt,
struct strbuf *output,
struct expand_data *data,
@@ -701,6 +814,7 @@ static const struct parse_cmd {
} commands[] = {
{ "contents", parse_cmd_contents, 1},
{ "info", parse_cmd_info, 1},
+ { "remote-object-info", parse_cmd_remote_object_info, 1},
{ "flush", NULL, 0},
};
diff --git a/object-file.c b/object-file.c
index 00c3a4b910..836554437c 100644
--- a/object-file.c
+++ b/object-file.c
@@ -3161,3 +3161,14 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+void free_object_info_contents(struct object_info *object_info)
+{
+ if (!object_info)
+ return;
+ free(object_info->typep);
+ free(object_info->sizep);
+ free(object_info->disk_sizep);
+ free(object_info->delta_base_oid);
+ free(object_info->type_name);
+}
diff --git a/object-store-ll.h b/object-store-ll.h
index cd3bd5bd99..20208e1d4f 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -553,4 +553,7 @@ int for_each_object_in_pack(struct packed_git *p,
int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
void *data, enum for_each_object_flags flags);
+/* Free pointers inside of object_info, but not object_info itself */
+void free_object_info_contents(struct object_info *object_info);
+
#endif /* OBJECT_STORE_LL_H */
diff --git a/t/t1017-cat-file-remote-object-info.sh b/t/t1017-cat-file-remote-object-info.sh
new file mode 100755
index 0000000000..fd6c63cdb9
--- /dev/null
+++ b/t/t1017-cat-file-remote-object-info.sh
@@ -0,0 +1,664 @@
+#!/bin/sh
+
+test_description='git cat-file --batch-command with remote-object-info command'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-cat-file.sh
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+# This is how we get 13:
+# 13 = <file mode> + <a_space> + <file name> + <a_null>, where
+# file mode is 100644, which is 6 characters;
+# file name is hello, which is 5 characters
+# a space is 1 character and a null is 1 character
+tree_size=$(($(test_oid rawsz) + 13))
+
+commit_message="Initial commit"
+
+# This is how we get 137:
+# 137 = <tree header> + <a_space> + <a newline> +
+# <Author line> + <a newline> +
+# <Committer line> + <a newline> +
+# <a newline> +
+# <commit message length>
+# An easier way to calculate is: 1. use `git cat-file commit <commit hash> | wc -c`,
+# to get 177, 2. then deduct 40 hex characters to get 137
+commit_size=$(($(test_oid hexsz) + 137))
+
+tag_header_without_oid="type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_header_without_timestamp="object $hello_oid
+$tag_header_without_oid"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0 +0000
+
+$tag_description"
+
+tag_oid=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
+tag_size=$(strlen "$tag_content")
+
+set_transport_variables () {
+ hello_oid=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+ tree_oid=$(git -C "$1" write-tree)
+ commit_oid=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_oid)
+ tag_oid=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+ tag_size=$(strlen "$tag_content")
+}
+
+# This section tests --batch-command with remote-object-info command
+# Since "%(objecttype)" is currently not supported by the command remote-object-info,
+# the filters are set to "%(objectname) %(objectsize)" in some test cases.
+
+# Test --batch-command remote-object-info with 'git://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+ git init "$daemon_parent" &&
+ echo_without_newline "$hello_content" > $daemon_parent/hello &&
+ git -C "$daemon_parent" update-index --add hello &&
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true &&
+ git clone "$GIT_DAEMON_URL/parent" -n "$daemon_parent/daemon_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+ GIT_TRACE_PACKET=1 git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$GIT_DAEMON_URL/parent" $hello_oid $tree_oid
+ remote-object-info "$GIT_DAEMON_URL/parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info git:// default filter' '
+ (
+ set_transport_variables "$daemon_parent" &&
+ cd "$daemon_parent/daemon_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid
+remote-object-info $GIT_DAEMON_URL/parent $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'git://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info git:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo false &&
+ set_transport_variables "$daemon_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info $GIT_DAEMON_URL/parent $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$daemon_parent" config transfer.advertiseobjectinfo true
+
+ )
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'file://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+# shellcheck disable=SC2016
+test_expect_success 'create repo to be served by file:// transport' '
+ git init server &&
+ git -C server config protocol.version 2 &&
+ git -C server config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > server/hello &&
+ git -C server update-index --add hello &&
+ git clone -n "file://$(pwd)/server" file_client_empty
+'
+
+test_expect_success 'batch-command remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid
+ remote-object-info "file://${server_path}" $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid
+ remote-object-info "file://${server_path}" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid
+ remote-object-info "file://${server_path}" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info file:// default filter' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ cd file_client_empty &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ printf "%s\0" "$hello_oid missing" >>expect &&
+ printf "%s\0" "$tree_oid missing" >>expect &&
+ printf "%s\0" "$commit_oid missing" >>expect &&
+ printf "%s\0" "$tag_oid missing" >>expect &&
+
+ batch_input="remote-object-info \"file://${server_path}\" $hello_oid $tree_oid
+remote-object-info \"file://${server_path}\" $commit_oid $tag_oid
+info $hello_oid
+info $tree_oid
+info $commit_oid
+info $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+# Test --batch-command remote-object-info with 'file://' and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info file:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "server" &&
+ server_path="$(pwd)/server" &&
+ git -C "${server_path}" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "file://${server_path}" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "${server_path}" config transfer.advertiseobjectinfo true
+ )
+'
+
+# Test --batch-command remote-object-info with 'http://' transport with
+# transfer.advertiseobjectinfo set to true, i.e. server has object-info capability
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+ git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true &&
+ echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello &&
+ git clone "$HTTPD_URL/smart/http_parent" -n "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty"
+'
+
+test_expect_success 'batch-command remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ # These results prove remote-object-info can get object info from the remote
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ # These results prove remote-object-info did not download objects from the remote
+ echo "$hello_oid missing" >>expect &&
+ echo "$tree_oid missing" >>expect &&
+ echo "$commit_oid missing" >>expect &&
+ echo "$tag_oid missing" >>expect &&
+
+ git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ info $hello_oid
+ info $tree_oid
+ info $commit_oid
+ info $tag_oid
+ flush
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ echo "$hello_oid $hello_size" >expect &&
+ echo "$tree_oid $tree_size" >>expect &&
+ echo "$commit_oid $commit_size" >>expect &&
+ echo "$tag_oid $tag_size" >>expect &&
+
+ git cat-file --batch-command >actual <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid
+ remote-object-info "$HTTPD_URL/smart/http_parent" $commit_oid $tag_oid
+ EOF
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'batch-command -Z remote-object-info http:// default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_client_empty" &&
+
+ printf "%s\0" "$hello_oid $hello_size" >expect &&
+ printf "%s\0" "$tree_oid $tree_size" >>expect &&
+ printf "%s\0" "$commit_oid $commit_size" >>expect &&
+ printf "%s\0" "$tag_oid $tag_size" >>expect &&
+
+ batch_input="remote-object-info $HTTPD_URL/smart/http_parent $hello_oid $tree_oid
+remote-object-info $HTTPD_URL/smart/http_parent $commit_oid $tag_oid
+" &&
+ echo_without_newline_nul "$batch_input" >commands_null_delimited &&
+
+ git cat-file --batch-command -Z < commands_null_delimited >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (objectsize:disk)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(objectsize:disk) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on unsupported filter option (deltabase)' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "%(deltabase) is currently not supported with remote-object-info" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid
+ EOF
+ test_grep "remote-object-info requires protocol v2" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on malformed OID with default filter' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ malformed_object_id="this_id_is_not_valid" &&
+
+ test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+ EOF
+ test_grep "Not a valid object name '$malformed_object_id'" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+ test_commit -C missing_oid_repo message1 c.txt &&
+ cd missing_oid_repo &&
+
+ object_id=$(git rev-parse message1:c.txt) &&
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+ EOF
+ test_grep "object-info: not our ref $object_id" err
+ )
+'
+
+test_expect_success 'remote-object-info fails on not providing OID' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent"
+ EOF
+ test_grep "remote-object-info requires objects" err
+ )
+'
+
+
+# Test --batch-command remote-object-info with 'http://' transport and
+# transfer.advertiseobjectinfo set to false, i.e. server does not have object-info capability
+test_expect_success 'batch-command remote-object-info http:// fails when transfer.advertiseobjectinfo=false' '
+ (
+ set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo false &&
+
+ test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+ remote-object-info "$HTTPD_URL/smart/http_parent" $hello_oid $tree_oid $commit_oid $tag_oid
+ EOF
+ test_grep "object-info capability is not enabled on the server" err &&
+
+ # revert server state back
+ git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config transfer.advertiseobjectinfo true
+ )
+'
+
+# DO NOT add non-httpd-specific tests here, because the last part of this
+# test script is only executed when httpd is available and enabled.
+
+test_done
--
2.48.1
^ permalink raw reply related [flat|nested] 174+ messages in thread
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-02-21 19:04 ` [PATCH v11 8/8] cat-file: add remote-object-info to batch-command Eric Ju
@ 2025-02-24 20:46 ` Junio C Hamano
2025-03-11 23:10 ` Peijian Ju
2025-02-24 23:47 ` Jeff King
1 sibling, 1 reply; 174+ messages in thread
From: Junio C Hamano @ 2025-02-24 20:46 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
Eric Ju <eric.peijian@gmail.com> writes:
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 69ea642dc6..47fd2a777b 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -27,6 +27,18 @@
> #include "promisor-remote.h"
> #include "mailmap.h"
> #include "write-or-die.h"
> +#include "alias.h"
> +#include "remote.h"
> +#include "transport.h"
> +
> +/* Maximum length for a remote URL. While no universal standard exists,
> + * 8K is assumed to be a reasonable limit.
> + */
Style. Our multi-line comment begins with slash-asterisk and ends
with asterisk-slash both on their own line without anything else.
> +#define MAX_REMOTE_URL_LEN (8*1024)
Here and ...
> +/* Maximum number of objects allowed in a single remote-object-info request. */
> +#define MAX_ALLOWED_OBJ_LIMIT 10000
... here, please have a blank line.
> +/* Maximum input size permitted for the remote-object-info command. */
> +#define MAX_REMOTE_OBJ_INFO_LINE (MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))
This is an overly long line.
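Taken together, the three remarks above might be applied roughly as follows. This is only a sketch of the layout: `SKETCH_MAX_HEXSZ` is a stand-in (an assumption, set to 64 here) for Git's `GIT_MAX_HEXSZ` so that the fragment compiles on its own.

```c
/* Stand-in for Git's GIT_MAX_HEXSZ; an assumption made for this sketch. */
#define SKETCH_MAX_HEXSZ 64

/*
 * Maximum length for a remote URL. While no universal standard exists,
 * 8K is assumed to be a reasonable limit.
 */
#define MAX_REMOTE_URL_LEN (8*1024)

/* Maximum number of objects allowed in a single remote-object-info request. */
#define MAX_ALLOWED_OBJ_LIMIT 10000

/* Maximum input size permitted for the remote-object-info command. */
#define MAX_REMOTE_OBJ_INFO_LINE \
	(MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (SKETCH_MAX_HEXSZ + 1))
```

The values are unchanged from the patch; only the comment delimiters, the blank lines between definitions, and the line wrapping differ.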
> @@ -579,6 +593,61 @@ static void batch_one_object(const char *obj_name,
> object_context_release(&ctx);
> }
>
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> + int retval = 0;
> + struct remote *remote = NULL;
> + struct object_id oid;
> + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> + static struct transport *gtransport;
> +
> + /*
> + * Change the format to "%(objectname) %(objectsize)" when
> + * remote-object-info command is used. Once we start supporting objecttype
> + * the default format should change to DEFAULT_FORMAT.
> + */
Style. Closing asterisk-slash aligns with the asterisk on the
previous line.
> + if (!opt->format)
> + opt->format = "%(objectname) %(objectsize)";
> +
> + remote = remote_get(argv[0]);
> + if (!remote)
> + die(_("must supply valid remote when using remote-object-info"));
> +
> + oid_array_clear(&object_info_oids);
> + for (size_t i = 1; i < argc; i++) {
Pointless mixing of "size_t" and "int". We have declared "int
argc", which is a perfectly sensible type, since we know that the
value of it would not exceed MAX_ALLOWED_OBJ_LIMIT, which is 10000.
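As a side illustration of why keeping one type is simpler: when a "size_t" index is compared against an "int", C converts the "int" to unsigned first. In the patch a negative count dies before the loop is reached, so this is not a live bug there. The two helper names below are hypothetical and exist only for this sketch.

```c
#include <stddef.h>

/* Mixed types: argc is converted to size_t before the comparison. */
static int first_iteration_runs_mixed(int argc)
{
	size_t i = 1;

	/* The cast spells out the conversion C would apply implicitly. */
	return i < (size_t)argc;
}

/* Matching types: a plain signed comparison. */
static int first_iteration_runs_int(int argc)
{
	int i = 1;

	return i < argc;
}
```

With a negative argc the mixed version reports that the loop body would run, because the converted value is very large, while the all-"int" version does not.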
> + if (get_oid_hex(argv[i], &oid))
> + die(_("Not a valid object name %s"), argv[i]);
> + oid_array_append(&object_info_oids, &oid);
> + }
> + if (!object_info_oids.nr)
> + die(_("remote-object-info requires objects"));
> +
> + gtransport = transport_get(remote, NULL);
> + if (gtransport->smart_options) {
> + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> + gtransport->smart_options->object_info = 1;
> + gtransport->smart_options->object_info_oids = &object_info_oids;
> +
> + /* 'objectsize' is the only option currently supported */
> + if (!strstr(opt->format, "%(objectsize)"))
> + die(_("%s is currently not supported with remote-object-info"), opt->format);
> +
> + string_list_append(&object_info_options, "size");
> +
> + if (object_info_options.nr > 0) {
> + gtransport->smart_options->object_info_options = &object_info_options;
> + gtransport->smart_options->object_info_data = remote_object_info;
> + retval = transport_fetch_refs(gtransport, NULL);
> + }
> + } else {
> + retval = -1;
> + }
Minor style nit, but when everything else is equal, writing the side
of smaller body first would make it easier to follow if/else, i.e.
gtransport = transport_get(remote, NULL);
if (!gtransport->smart_options) {
/* error */
retval = -1;
} else {
... a lot of real code here ...
}
> +static void parse_cmd_remote_object_info(struct batch_options *opt,
> + const char *line, struct strbuf *output,
> + struct expand_data *data)
> +{
> + int count;
> + const char **argv;
> + char *line_to_split;
> +
> + if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
> + die(_("remote-object-info command input overflow "
> + "(no more than %d objects are allowed)"),
> + MAX_ALLOWED_OBJ_LIMIT);
Nobody guarantees this user gave a request for more than 10000
objects; after all it may have been an overly long URL that busted
the line length limit, no?
> + line_to_split = xstrdup(line);
> + count = split_cmdline(line_to_split, &argv);
> + if (count < 0)
> + die(_("split remote-object-info command"));
Here, the code could check if count busts MAX_ALLOWED_OBJ_LIMIT, but
it doesn't.
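One way to keep the two conditions apart, so that each failure can get its own message, is sketched below. The limits are deliberately tiny and every name (`sketch_check_request`, the `SKETCH_*` constants) is hypothetical; in the real code the count would come from split_cmdline() and the failures would presumably be reported with die().

```c
#include <stddef.h>
#include <string.h>

/* Tiny, made-up limits so the sketch is easy to exercise. */
#define SKETCH_MAX_LINE_LEN 1024
#define SKETCH_MAX_OBJECTS 4

enum sketch_check {
	SKETCH_OK = 0,
	SKETCH_LINE_TOO_LONG,
	SKETCH_TOO_MANY_OBJECTS
};

/*
 * "count" is the number of words on the line, with the remote as the
 * first word and the object names after it.
 */
static enum sketch_check sketch_check_request(const char *line, int count)
{
	/* An overly long URL can trip this on its own. */
	if (strlen(line) >= SKETCH_MAX_LINE_LEN)
		return SKETCH_LINE_TOO_LONG;

	/* Only this check says anything about the number of objects. */
	if (count - 1 > SKETCH_MAX_OBJECTS)
		return SKETCH_TOO_MANY_OBJECTS;

	return SKETCH_OK;
}
```

A caller can then word the "line too long" error without mentioning the object limit, and mention the limit only when the object count is what was exceeded.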
^ permalink raw reply [flat|nested] 174+ messages in thread
* Re: [PATCH v10 8/8] cat-file: add remote-object-info to batch-command
2025-02-21 15:34 ` Peijian Ju
@ 2025-02-24 23:45 ` Jeff King
2025-03-12 19:53 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Jeff King @ 2025-02-24 23:45 UTC (permalink / raw)
To: Peijian Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Feb 21, 2025 at 10:34:44AM -0500, Peijian Ju wrote:
> Thank you. Revised to use xstrdup() in v11.
>
> > 2. Are there any bounds on the size of "line"? E.g., is it coming in
> > as a single pkt, or can it be arbitrarily large if an attacker
> > wants (it looks like maybe the latter, since it comes from a strbuf
> > in batch_objects_command(), but I didn't look at how network data
> > gets passed in to that). At any rate, I think we ran into problems
> > before with split_cmdline() and integer overflow, since it returns
> > an int (CVE-2022-39260). I thought we fixed it by rejecting long
> > lines in git-shell, but it looks like we also hardened
> > split_cmdline() in 0ca6ead81e (alias.c: reject too-long cmdline
> > strings in split_cmdline(), 2022-09-28).
> >
> > So we are maybe OK, but I wonder if we should punt on absurd lines.
> > Related, can an attacker just flood input into that strbuf, making
> > it grow forever and waste memory? That's just a simple resource
> > attack, but we have tried to avoid those elsewhere in upload-pack,
> > etc.
> >
>
> Thank you. Adding a check in v11 for the length of `line`. Please let
> me know if something like this makes sense:
>
> if (strlen(line) >= INT_MAX) {
> die(_("remote-object-info command input overflow"));
> }
I took a look at what you ended up with in v11, and...I think I totally
misunderstood what was going on in your series, or when this code would
be run.
I had thought the cat-file here was running on the server side, and that
we needed to protect ourselves against malicious clients. But your new
parse_cmd_remote_object_info() is purely a client-side function that
will then access the server behind the scenes. And its input will be
coming from the stdin of cat-file locally.
So I'm not sure that we need to protect it unless we think there's some
way that an attacker can automatically trigger arbitrary
remote-object-info requests.
That said, I'm not sure why you need split_cmdline() at all. The format
seems to be:
remote-object-info <url> <oid>...
The only thing that _might_ need quoting is the url, but is shell
quoting a reasonable thing there? I'd think that it would be
URL-encoded, and thus contain no spaces. The <oid> has to be a real full
oid, I think, because the object-info on the server side insists on
that.
So why not just split on space? Something like this:
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 9de1016acd..aedbcba347 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -597,7 +597,7 @@ static void batch_one_object(const char *obj_name,
 	object_context_release(&ctx);
 }
 
-static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+static int get_remote_info(struct batch_options *opt, const char *url, const char *oid_list)
 {
 	int retval = 0;
 	struct remote *remote = NULL;
@@ -613,16 +613,19 @@ static int get_remote_info(struct batch_options *opt, int argc, const char **arg
 	if (!opt->format)
 		opt->format = "%(objectname) %(objectsize)";
 
-	remote = remote_get(argv[0]);
+	remote = remote_get(url);
 	if (!remote)
 		die(_("must supply valid remote when using remote-object-info"));
 
 	oid_array_clear(&object_info_oids);
-	for (size_t i = 1; i < argc; i++) {
-		if (get_oid_hex(argv[i], &oid))
-			die(_("Not a valid object name %s"), argv[i]);
+	while (*oid_list) {
+		if (parse_oid_hex(oid_list, &oid, &oid_list))
+			die(_("Not a valid object name %s"), oid_list);
 		oid_array_append(&object_info_oids, &oid);
+		while (*oid_list == ' ')
+			oid_list++;
 	}
+
 	if (!object_info_oids.nr)
 		die(_("remote-object-info requires objects"));
 
@@ -747,21 +750,15 @@ static void parse_cmd_remote_object_info(struct batch_options *opt,
 					 const char *line, struct strbuf *output,
 					 struct expand_data *data)
 {
-	int count;
-	const char **argv;
-	char *line_to_split;
-
-	if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
-		die(_("remote-object-info command input overflow "
-		      "(no more than %d objects are allowed)"),
-		    MAX_ALLOWED_OBJ_LIMIT);
+	char *url;
+	const char *space;
 
-	line_to_split = xstrdup(line);
-	count = split_cmdline(line_to_split, &argv);
-	if (count < 0)
-		die(_("split remote-object-info command"));
+	space = strchr(line, ' ');
+	if (!space)
+		return; /* report error somehow? */
+	url = xmemdupz(line, space - line);
 
-	if (get_remote_info(opt, count, argv))
+	if (get_remote_info(opt, url, space + 1))
 		goto cleanup;
 
 	data->skip_object_info = 1;
@@ -774,16 +771,15 @@ static void parse_cmd_remote_object_info(struct batch_options *opt,
 			 */
 			data->size = *remote_object_info[i].sizep;
 			opt->batch_mode = BATCH_MODE_INFO;
-			batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+			batch_object_write(oid_to_hex(&data->oid), output, opt, data, NULL, 0);
 		}
 	}
 	data->skip_object_info = 0;
 
 cleanup:
 	for (size_t i = 0; i < object_info_oids.nr; i++)
 		free_object_info_contents(&remote_object_info[i]);
-	free(line_to_split);
-	free(argv);
+	free(url);
 	free(remote_object_info);
 }
You'd need to adjust t1017 to remove the quotes from the inputs, and I
think you'd have to correctly url-encode the file:// one to avoid
spaces (but that is technically true already! If the filesystem path has
a "%" in it, it would be misinterpreted).
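To make the encoding point concrete, here is a minimal standalone sketch of percent-encoding a filesystem path so that it contains neither spaces nor a bare "%". The helper names `is_unreserved_or_slash()` and `percent_encode_path()` are hypothetical and are not part of git; the set of characters left untouched is an assumption (RFC 3986 unreserved characters plus "/").

```c
#include <stddef.h>

/* Hypothetical helper: bytes that are copied through unchanged. */
int is_unreserved_or_slash(unsigned char c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	       (c >= '0' && c <= '9') ||
	       c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
}

/*
 * Hypothetical helper (not a git API): percent-encode `path` into `out`.
 * Returns the encoded length, or -1 if `out` (of size `outsz`) is too small.
 */
int percent_encode_path(const char *path, char *out, size_t outsz)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t n = 0;

	if (!outsz)
		return -1;
	for (; *path; path++) {
		unsigned char c = (unsigned char)*path;
		size_t need = is_unreserved_or_slash(c) ? 1 : 3;

		/* keep one byte free for the trailing NUL */
		if (n + need >= outsz)
			return -1;
		if (need == 1) {
			out[n++] = (char)c;
		} else {
			out[n++] = '%';
			out[n++] = hex[c >> 4];
			out[n++] = hex[c & 15];
		}
	}
	out[n] = '\0';
	return (int)n;
}
```

With something like this, a path containing a space or a literal "%" becomes a single space-free token, which is what splitting the command line on spaces relies on.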
-Peff
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-02-21 19:04 ` [PATCH v11 8/8] cat-file: add remote-object-info to batch-command Eric Ju
2025-02-24 20:46 ` Junio C Hamano
@ 2025-02-24 23:47 ` Jeff King
2025-03-12 2:19 ` Peijian Ju
1 sibling, 1 reply; 174+ messages in thread
From: Jeff King @ 2025-02-24 23:47 UTC (permalink / raw)
To: Eric Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Feb 21, 2025 at 02:04:49PM -0500, Eric Ju wrote:
> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> [...]
> + if (gtransport->smart_options) {
> + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> + gtransport->smart_options->object_info = 1;
> + gtransport->smart_options->object_info_oids = &object_info_oids;
> +
> + /* 'objectsize' is the only option currently supported */
> + if (!strstr(opt->format, "%(objectsize)"))
> + die(_("%s is currently not supported with remote-object-info"), opt->format);
BTW, this strstr() isn't quite sufficient to prevent problems, as it
would not find placeholders which _do_ exist but which aren't handled.
One of the first things I tried was:
git cat-file --batch-command='%(objecttype) %(objectsize)'
and feeding it "remote-object-info /path/to/repo some-oid". And it
segfaulted.
-Peff
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-02-24 20:46 ` Junio C Hamano
@ 2025-03-11 23:10 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-03-11 23:10 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Feb 24, 2025 at 3:46 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Eric Ju <eric.peijian@gmail.com> writes:
>
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 69ea642dc6..47fd2a777b 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -27,6 +27,18 @@
> > #include "promisor-remote.h"
> > #include "mailmap.h"
> > #include "write-or-die.h"
> > +#include "alias.h"
> > +#include "remote.h"
> > +#include "transport.h"
> > +
> > +/* Maximum length for a remote URL. While no universal standard exists,
> > + * 8K is assumed to be a reasonable limit.
> > + */
>
> Style. Our multi-line comment begins with slash-asterisk and ends
> with asterisk-slash both on their own line without anything else.
>
Thank you. All the style-related comments will be fixed in the next patch.
> > +#define MAX_REMOTE_URL_LEN (8*1024)
>
> Here and ...
>
> > +/* Maximum number of objects allowed in a single remote-object-info request. */
> > +#define MAX_ALLOWED_OBJ_LIMIT 10000
>
> ... here, please have a blank line.
>
> > +/* Maximum input size permitted for the remote-object-info command. */
> > +#define MAX_REMOTE_OBJ_INFO_LINE (MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))
>
> This is an overly long line.
>
> > @@ -579,6 +593,61 @@ static void batch_one_object(const char *obj_name,
> > object_context_release(&ctx);
> > }
> >
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > +{
> > + int retval = 0;
> > + struct remote *remote = NULL;
> > + struct object_id oid;
> > + struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> > + static struct transport *gtransport;
> > +
> > + /*
> > + * Change the format to "%(objectname) %(objectsize)" when
> > + * remote-object-info command is used. Once we start supporting objecttype
> > + * the default format should change to DEFAULT_FORMAT.
> > + */
>
> Style. Closing asterisk-slash aligns with the asterisk on the
> previous line.
>
> > + if (!opt->format)
> > + opt->format = "%(objectname) %(objectsize)";
> > +
> > + remote = remote_get(argv[0]);
> > + if (!remote)
> > + die(_("must supply valid remote when using remote-object-info"));
> > +
> > + oid_array_clear(&object_info_oids);
> > + for (size_t i = 1; i < argc; i++) {
>
> Pointless mixing of "size_t" and "int". We have declared "int
> argc", which is perfectly a sensible type, since we know that the
> value of it would not exceed MAX_ALLOWED_OBJ_LIMIT, which is 10000.
>
Thank you. Will change to "int" instead.
> > + if (get_oid_hex(argv[i], &oid))
> > + die(_("Not a valid object name %s"), argv[i]);
> > + oid_array_append(&object_info_oids, &oid);
> > + }
> > + if (!object_info_oids.nr)
> > + die(_("remote-object-info requires objects"));
> > +
> > + gtransport = transport_get(remote, NULL);
> > + if (gtransport->smart_options) {
> > + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> > + gtransport->smart_options->object_info = 1;
> > + gtransport->smart_options->object_info_oids = &object_info_oids;
> > +
> > + /* 'objectsize' is the only option currently supported */
> > + if (!strstr(opt->format, "%(objectsize)"))
> > + die(_("%s is currently not supported with remote-object-info"), opt->format);
> > +
> > + string_list_append(&object_info_options, "size");
> > +
> > + if (object_info_options.nr > 0) {
> > + gtransport->smart_options->object_info_options = &object_info_options;
> > + gtransport->smart_options->object_info_data = remote_object_info;
> > + retval = transport_fetch_refs(gtransport, NULL);
> > + }
> > + } else {
> > + retval = -1;
> > + }
>
> Minor style nit, but when everything else is equal, writing the side
> of smaller body first would make it easier to follow if/else, i.e.
>
>
> gtransport = transport_get(remote, NULL);
> if (!gtransport->smart_options) {
> /* error */
> retval = -1;
> } else {
> ... a lot of real code here ...
> }
>
> > +static void parse_cmd_remote_object_info(struct batch_options *opt,
> > + const char *line, struct strbuf *output,
> > + struct expand_data *data)
> > +{
> > + int count;
> > + const char **argv;
> > + char *line_to_split;
> > +
> > + if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
> > + die(_("remote-object-info command input overflow "
> > + "(no more than %d objects are allowed)"),
> > + MAX_ALLOWED_OBJ_LIMIT);
>
> Nobody guarantees this user gave a request for more than 10000
> objects; after all it may have been an overly long URL that busted
> the line length limit, no?
Yes, the error message is indeed misleading here.
In the next patch, the behavior will be updated as follows:
1. If strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE, we will die with a clear error:
"remote-object-info input too long".
2. After step 1, when calling get_remote_info(opt, count, argv), we
will check object_info_oids.nr.
If `object_info_oids.nr > MAX_ALLOWED_OBJ_LIMIT`, we will die with:
"no more than %d objects are allowed", MAX_ALLOWED_OBJ_LIMIT.
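A rough standalone sketch of the two checks described above, so that each failure gets its own message. The enum and the function `check_remote_object_info_limits()` are hypothetical names used only for illustration; the three `MAX_*` macros are the ones from the patch, and `GIT_MAX_HEXSZ` is restated here (as 64) only so the sketch compiles on its own.

```c
#include <stddef.h>

#define GIT_MAX_HEXSZ 64 /* restated for the sketch; git defines this itself */
#define MAX_REMOTE_URL_LEN (8 * 1024)
#define MAX_ALLOWED_OBJ_LIMIT 10000
#define MAX_REMOTE_OBJ_INFO_LINE \
	(MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))

/* Hypothetical result codes, one per distinct error message. */
enum limit_result {
	LIMIT_OK = 0,
	LIMIT_LINE_TOO_LONG,    /* "remote-object-info input too long" */
	LIMIT_TOO_MANY_OBJECTS  /* "no more than %d objects are allowed" */
};

/*
 * Step 1 looks only at the raw line length; step 2 looks only at the
 * number of parsed object ids, so a long URL is never reported as
 * "too many objects".
 */
enum limit_result check_remote_object_info_limits(size_t line_len, size_t nr_oids)
{
	if (line_len >= (size_t)MAX_REMOTE_OBJ_INFO_LINE)
		return LIMIT_LINE_TOO_LONG;
	if (nr_oids > (size_t)MAX_ALLOWED_OBJ_LIMIT)
		return LIMIT_TOO_MANY_OBJECTS;
	return LIMIT_OK;
}
```

In the real code the two conditions would sit in parse_cmd_remote_object_info() and get_remote_info() respectively, each calling die() with its own message.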
>
> > + line_to_split = xstrdup(line);
> > + count = split_cmdline(line_to_split, &argv);
> > + if (count < 0)
> > + die(_("split remote-object-info command"));
>
> Here, the code could check if count busts MAX_ALLOWED_OBJ_LIMIT, but
> it doesn't.
Yes, a check is needed here. Please see the previous reply.
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-02-24 23:47 ` Jeff King
@ 2025-03-12 2:19 ` Peijian Ju
2025-03-13 6:02 ` Jeff King
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2025-03-12 2:19 UTC (permalink / raw)
To: Jeff King
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Feb 24, 2025 at 6:47 PM Jeff King <peff@peff.net> wrote:
>
> On Fri, Feb 21, 2025 at 02:04:49PM -0500, Eric Ju wrote:
>
> > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> > [...]
> > + if (gtransport->smart_options) {
> > + CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
> > + gtransport->smart_options->object_info = 1;
> > + gtransport->smart_options->object_info_oids = &object_info_oids;
> > +
> > + /* 'objectsize' is the only option currently supported */
> > + if (!strstr(opt->format, "%(objectsize)"))
> > + die(_("%s is currently not supported with remote-object-info"), opt->format);
>
> BTW, this strstr() isn't quite sufficient to prevent problems, as it
> would not find placeholders which _do_ exist but which aren't handled.
> One of the first things I tried was:
>
> git cat-file --batch-command='%(objecttype) %(objectsize)'
>
> and feeding it "remote-object-info /path/to/repo some-oid". And it
> segfaulted.
>
> -Peff
Thank you, Peff. Yes, you are right. It is a bug. I am adding new
logic in v12:
1. Iterating over `opt->format` to see if there are any unsupported
placeholders. If there are, error out and report the unsupported placeholders.
2. Adding more test cases to cover different formats, e.g., just
`%(objectsize)`, just `%(objectname)`, mixed usage of supported and
unsupported placeholders.
* Re: [PATCH v10 8/8] cat-file: add remote-object-info to batch-command
2025-02-24 23:45 ` Jeff King
@ 2025-03-12 19:53 ` Peijian Ju
0 siblings, 0 replies; 174+ messages in thread
From: Peijian Ju @ 2025-03-12 19:53 UTC (permalink / raw)
To: Jeff King
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Mon, Feb 24, 2025 at 6:45 PM Jeff King <peff@peff.net> wrote:
>
> On Fri, Feb 21, 2025 at 10:34:44AM -0500, Peijian Ju wrote:
>
> > Thank you. Revised to use xstrdup() in v11.
> >
> > > 2. Are there any bounds on the size of "line"? E.g., is it coming in
> > > as a single pkt, or can it be arbitrarily large if an attacker
> > > wants (it looks like maybe the latter, since it comes from a strbuf
> > > in batch_objects_command(), but I didn't look at how network data
> > > gets passed in to that). At any rate, I think we ran into problems
> > > before with split_cmdline() and integer overflow, since it returns
> > > an int (CVE-2022-39260). I thought we fixed it by rejecting long
> > > lines in git-shell, but it looks like we also hardened
> > > split_cmdline() in 0ca6ead81e (alias.c: reject too-long cmdline
> > > strings in split_cmdline(), 2022-09-28).
> > >
> > > So we are maybe OK, but I wonder if we should punt on absurd lines.
> > > Related, can an attacker just flood input into that strbuf, making
> > > it grow forever and waste memory? That's just a simple resource
> > > attack, but we have tried to avoid those elsewhere in upload-pack,
> > > etc.
> > >
> >
> > Thank you. Adding a check in v11 for the length of `line`. Please let
> > me know if something like this makes sense:
> >
> > if (strlen(line) >= INT_MAX) {
> > die(_("remote-object-info command input overflow"));
> > }
>
> I took a look at what you ended up with in v11, and...I think I totally
> misunderstood what was going on in your series, or when this code would
> be run.
>
> I had thought the cat-file here was running on the server side, and that
> we needed to protect ourselves against malicious clients. But your new
> parse_cmd_remote_object_info() is purely a client-side function that
> will then access the server behind the scenes. And its input will be
> coming from the stdin of cat-file locally.
>
> So I'm not sure that we need to protect it unless we think there's some
> way that an attacker can automatically trigger arbitrary
> remote-object-info requests.
>
Thank you. Yes, remote-object-info is purely a client-side command. If
an attacker is able to automatically
trigger arbitrary remote-object-info requests, it likely means they
already have control over that system.
From my understanding, Git generally trusts its clients. So unless
there are strong objections,
I will revert those input length checks.
> That said, I'm not sure why you need split_cmdline() at all. The format
> seems to be:
>
> remote-object-info <url> <oid>...
>
> The only thing that _might_ need quoting is the url, but is shell
> quoting a reasonable thing there? I'd think that it would be
> URL-encoded, and thus contain no spaces. The <oid> has to be a real full
> oid, I think, because the object-info on the server side insists on
> that.
>
Thank you! We hadn’t given much thought to the URL format earlier,
but I agree that it’s reasonable to require the URL in
remote-object-info to be properly URL-encoded.
With that assumption, splitting on spaces makes sense. I’ll update
this in the next patch and
also revise the documentation to clarify that URL parameters must be
URL-encoded.
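As a rough illustration of what "split on spaces" means for this input, here is a standalone sketch that does not use any git internals. The function name `count_remote_object_info_args()` is hypothetical, and the sketch assumes 40-hex-digit (SHA-1) object names only; the real code would use parse_oid_hex() as in the diff quoted below.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_HEXSZ 40 /* assumption: SHA-1 object names only */

/*
 * Hypothetical parser for "<url> <oid>...".
 * On success, stores the length of the URL token in *url_len and returns
 * the number of object ids found. Returns -1 on malformed input.
 */
int count_remote_object_info_args(const char *line, size_t *url_len)
{
	const char *p = strchr(line, ' ');
	int nr = 0;

	if (!p || p == line)
		return -1; /* empty URL, or nothing after the URL */
	*url_len = (size_t)(p - line);

	while (*p == ' ')
		p++;
	while (*p) {
		size_t i;

		/* stops at the first non-hex byte, including the NUL */
		for (i = 0; i < SKETCH_HEXSZ; i++)
			if (!isxdigit((unsigned char)p[i]))
				return -1;
		p += SKETCH_HEXSZ;
		if (*p && *p != ' ')
			return -1; /* object id followed by junk */
		nr++;
		while (*p == ' ')
			p++;
	}
	return nr ? nr : -1;
}
```

Because the URL is required to be URL-encoded, the first space always ends the URL, and no shell-style quoting rules are needed.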
> So why not just split on space? Something like this:
>
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 9de1016acd..aedbcba347 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -597,7 +597,7 @@ static void batch_one_object(const char *obj_name,
> object_context_release(&ctx);
> }
>
> -static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +static int get_remote_info(struct batch_options *opt, const char *url, const char *oid_list)
> {
> int retval = 0;
> struct remote *remote = NULL;
> @@ -613,16 +613,19 @@ static int get_remote_info(struct batch_options *opt, int argc, const char **arg
> if (!opt->format)
> opt->format = "%(objectname) %(objectsize)";
>
> - remote = remote_get(argv[0]);
> + remote = remote_get(url);
> if (!remote)
> die(_("must supply valid remote when using remote-object-info"));
>
> oid_array_clear(&object_info_oids);
> - for (size_t i = 1; i < argc; i++) {
> - if (get_oid_hex(argv[i], &oid))
> - die(_("Not a valid object name %s"), argv[i]);
> + while (*oid_list) {
> + if (parse_oid_hex(oid_list, &oid, &oid_list))
> + die(_("Not a valid object name %s"), oid_list);
> oid_array_append(&object_info_oids, &oid);
> + while (*oid_list == ' ')
> + oid_list++;
> }
> +
> if (!object_info_oids.nr)
> die(_("remote-object-info requires objects"));
>
> @@ -747,21 +750,15 @@ static void parse_cmd_remote_object_info(struct batch_options *opt,
> const char *line, struct strbuf *output,
> struct expand_data *data)
> {
> - int count;
> - const char **argv;
> - char *line_to_split;
> -
> - if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
> - die(_("remote-object-info command input overflow "
> - "(no more than %d objects are allowed)"),
> - MAX_ALLOWED_OBJ_LIMIT);
> + char *url;
> + const char *space;
>
> - line_to_split = xstrdup(line);
> - count = split_cmdline(line_to_split, &argv);
> - if (count < 0)
> - die(_("split remote-object-info command"));
> + space = strchr(line, ' ');
> + if (!space)
> + return; /* report error somehow? */
> + url = xmemdupz(line, space - line);
>
> - if (get_remote_info(opt, count, argv))
> + if (get_remote_info(opt, url, space + 1))
> goto cleanup;
>
> data->skip_object_info = 1;
> @@ -774,16 +771,15 @@ static void parse_cmd_remote_object_info(struct batch_options *opt,
> */
> data->size = *remote_object_info[i].sizep;
> opt->batch_mode = BATCH_MODE_INFO;
> - batch_object_write(argv[i+1], output, opt, data, NULL, 0);
> + batch_object_write(oid_to_hex(&data->oid), output, opt, data, NULL, 0);
> }
> }
> data->skip_object_info = 0;
>
> cleanup:
> for (size_t i = 0; i < object_info_oids.nr; i++)
> free_object_info_contents(&remote_object_info[i]);
> - free(line_to_split);
> - free(argv);
> + free(url);
> free(remote_object_info);
> }
>
>
> You'd need to adjust t1017 to remove the quotes from the inputs, and I
> think you'd have to correctly url-encode the file:// one to avoid
> spaces (but that is technically true already! If the filesystem path has
> a "%" in it, it would be misinterpreted).
>
Thank you. Tests will be adjusted.
> -Peff
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-03-12 2:19 ` Peijian Ju
@ 2025-03-13 6:02 ` Jeff King
2025-03-21 18:24 ` Peijian Ju
0 siblings, 1 reply; 174+ messages in thread
From: Jeff King @ 2025-03-13 6:02 UTC (permalink / raw)
To: Peijian Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Tue, Mar 11, 2025 at 10:19:55PM -0400, Peijian Ju wrote:
> > BTW, this strstr() isn't quite sufficient to prevent problems, as it
> > would not find placeholders which _do_ exist but which aren't handled.
> > One of the first things I tried was:
> >
> > git cat-file --batch-command='%(objecttype) %(objectsize)'
> >
> > and feeding it "remote-object-info /path/to/repo some-oid". And it
> > segfaulted.
> >
> > -Peff
>
> Thank you, Peff. Yes, you are right. It is a bug. I am adding new
> logic in v12:
> 1. Iterating over `opt->format` to see if there are any unsupported
> placeholders. If there are, error out and report the unsupported placeholders.
> 2. Adding more test cases to cover different formats, e.g., just
> `%(objectsize)`, just `%(objectname)`, mixed usage of supported and
> unsupported placeholders.
Yes, though it would be nice for step 1 to avoid re-parsing the string.
I think you could either:
1. After the mark_query pass in batch_objects(), check for unsupported
pointers in expand_data. The downside here is that you'd have to
match each one that you _don't_ allow (so if somebody adds a new
one and forgets to update your list, it wouldn't be caught).
2. In expand_atom() or expand_format(), check an allow-list using
is_atom(), when remote-mode is in use. The downside here is that I
think we'd eventually want to move that parsing and formatting to
the shared ref-filter API. But maybe that API could provide some
kind of "check that this atom is allowed" function pointer.
I do wonder if there might be a way to also just notice that we don't
have the requested information and handle it gracefully. I didn't
reproduce it again just now, but I'd guess the segfault is due to
feeding garbage to type_name() in expand_atom().
So maybe if we initialized expand_data fully (so that data->type is
always OBJ_BAD or something) and then checked for a NULL return from
type_name(), we could do something sensible in expand_atom(), like
insert a blank string or similar. And then it is not an error to ask for
%(objecttype), but you will just not get useful data for those entries.
From the description of the protocol, it sounds like you could actually
intermix remote and local object requests?
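To sketch the "handle it gracefully" idea in isolation: the stand-ins below mimic git's object type numbering and the NULL-on-unknown behavior of type_name(), but every `sketch_*` name is hypothetical and the code is only an illustration, not the real expand_atom().

```c
#include <stddef.h>

/* Stand-in for git's enum object_type; values restated for the sketch. */
enum sketch_object_type {
	SKETCH_OBJ_BAD = -1,
	SKETCH_OBJ_NONE = 0,
	SKETCH_OBJ_COMMIT = 1,
	SKETCH_OBJ_TREE = 2,
	SKETCH_OBJ_BLOB = 3,
	SKETCH_OBJ_TAG = 4
};

/* Stand-in for type_name(): NULL for anything that is not a real type. */
const char *sketch_type_name(int type)
{
	static const char *const names[] = { NULL, "commit", "tree", "blob", "tag" };

	if (type < SKETCH_OBJ_COMMIT || type > SKETCH_OBJ_TAG)
		return NULL;
	return names[type];
}

/*
 * What the %(objecttype) branch could emit: the type name if we have
 * one, and an empty string when the type was never filled in.
 */
const char *sketch_expand_objecttype(int type)
{
	const char *name = sketch_type_name(type);

	return name ? name : "";
}
```

The important half is the initialization: this only works if data->type reliably starts out as OBJ_BAD (or similar) rather than as whatever was left over from a previous request.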
-Peff
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-03-13 6:02 ` Jeff King
@ 2025-03-21 18:24 ` Peijian Ju
2025-03-24 3:39 ` Jeff King
0 siblings, 1 reply; 174+ messages in thread
From: Peijian Ju @ 2025-03-21 18:24 UTC (permalink / raw)
To: Jeff King
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Thu, Mar 13, 2025 at 2:02 AM Jeff King <peff@peff.net> wrote:
>
> On Tue, Mar 11, 2025 at 10:19:55PM -0400, Peijian Ju wrote:
>
> > > BTW, this strstr() isn't quite sufficient to prevent problems, as it
> > > would not find placeholders which _do_ exist but which aren't handled.
> > > One of the first things I tried was:
> > >
> > > git cat-file --batch-command='%(objecttype) %(objectsize)'
> > >
> > > and feeding it "remote-object-info /path/to/repo some-oid". And it
> > > segfaulted.
> > >
> > > -Peff
> >
> > Thank you, Peff. Yes, you are right. It is a bug. I am adding new
> > logic in v12:
> > 1. Iterating over `opt->format` to see if there are any unsupported
> > placeholders. If there are, error out and report the unsupported placeholders.
> > 2. Adding more test cases to cover different formats, e.g., just
> > `%(objectsize)`, just `%(objectname)`, mixed usage of supported and
> > unsupported placeholders.
>
> Yes, though it would be nice for step 1 to avoid re-parsing the string.
> I think you could either:
>
> 1. After the mark_query pass in batch_objects(), check for unsupported
> pointers in expand_data. The downside here is that you'd have to
> match each one that you _don't_ allow (so if somebody adds a new
> one and forgets to update your list, it wouldn't be caught).
>
> 2. In expand_atom() or expand_format(), check an allow-list using
> is_atom(), when remote-mode is in use. The downside here is that I
> think we'd eventually want to move that parsing and formatting to
> the shared ref-filter API. But maybe that API could provide some
> kind of "check that this atom is allowed" function pointer.
>
Thank you, Peff. I prefer option 2. Maintaining an allow-list of
supported placeholders seems more practical than tracking unsupported
ones with a disallow-list. This approach has the added benefit that
any newly added placeholders would automatically be treated as
unsupported until explicitly added to the allow-list, reducing the
chance of oversights.
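A minimal sketch of the allow-list idea, written as standalone C rather than against cat-file.c. The names `remote_atom_allowed()` and `format_ok_for_remote()` are hypothetical; in the real code the per-atom check would be called from the existing expand_atom()/expand_format() path instead of the small scanning loop used here to drive it.

```c
#include <stddef.h>
#include <string.h>

/* Atoms that remote-object-info can currently fill in. */
const char *const remote_allowed_atoms[] = { "objectname", "objectsize" };

/* Per-atom check: `atom` is the text between "%(" and ")". */
int remote_atom_allowed(const char *atom, size_t len)
{
	size_t i;
	size_t nr = sizeof(remote_allowed_atoms) / sizeof(remote_allowed_atoms[0]);

	for (i = 0; i < nr; i++)
		if (strlen(remote_allowed_atoms[i]) == len &&
		    !memcmp(remote_allowed_atoms[i], atom, len))
			return 1;
	return 0;
}

/*
 * Stand-in driver for the sketch only: walk every "%(...)" in `format`
 * and return 1 if all atoms are allowed, 0 otherwise.
 */
int format_ok_for_remote(const char *format)
{
	const char *p = format;

	while ((p = strstr(p, "%(")) != NULL) {
		const char *start = p + 2;
		const char *end = strchr(start, ')');

		if (!end)
			return 0; /* unterminated placeholder */
		if (!remote_atom_allowed(start, (size_t)(end - start)))
			return 0;
		p = end + 1;
	}
	return 1;
}
```

Unlike the strstr() check, this rejects "%(objecttype) %(objectsize)" and "%(objectsize:disk)", and any atom added to cat-file later is rejected until it is listed explicitly.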
> I do wonder if there might be a way to also just notice that we don't
> have the requested information and handle it gracefully. I didn't
> reproduce it again just now, but I'd guess the segfault is due to
> feeding garbage to type_name() in expand_atom().
>
> So maybe if we initialized expand_data fully (so that data->type is
> always OBJ_BAD or something) and then checked for a NULL return from
> type_name(), we could do something sensible in expand_atom(), like
> insert a blank string or similar. And then it is not an error to ask for
> %(objecttype), but you will just not get useful data for those entries.
> From the description of the protocol, it sounds like you could actually
> intermix remote and local object requests?
>
> -Peff
Thank you Peff. I like the idea "... it is not an error to ask for
%(objecttype), but you will just not get useful data for those
entries."
So if we do remote-object-info with format "%(objectname)
%(objectsize) %(objecttype) %(objectsize:disk)", the response can be:
4346b22767c07e31d0f9b524fcb377972d957313 199 ??? ???
Where ??? means the placeholder is not yet supported. In this way we
don't have to change the default format, and as new support for the
placeholders is added, ??? will be replaced by meaningful data.
About intermixing remote and local object requests, do you mean what
happens when remote-object-info is passed oids of objects that are
available locally instead of on a remote? If so, I have these
scenarios:
1. An object is on remote but not on local. This is what
`remote-object-info` primarily focuses on: we retrieve info from
remote without downloading the object.
2. An object is on remote as well as on local. I think
`remote-object-info` should still retrieve info from remote instead of
checking local data. After all, if the user knows the object is on
local, they can use the `info` command. If remote-object-info is used,
it means we are interested in the information stored on the remote.
3. An object is not on remote, but only on local. I think
remote-object-info should fail in this case, since the remote doesn't
have the object. The info command should be used in this case.
* Re: [PATCH v11 8/8] cat-file: add remote-object-info to batch-command
2025-03-21 18:24 ` Peijian Ju
@ 2025-03-24 3:39 ` Jeff King
0 siblings, 0 replies; 174+ messages in thread
From: Jeff King @ 2025-03-24 3:39 UTC (permalink / raw)
To: Peijian Ju
Cc: git, calvinwan, jonathantanmy, chriscool, karthik.188, toon,
jltobler
On Fri, Mar 21, 2025 at 02:24:05PM -0400, Peijian Ju wrote:
>
> Thank you Peff. I like the idea "... it is not an error to ask for
> %(objecttype), but you will just not get useful data for those
> entries."
>
> So if we do remote-object-info with format "%(objectname)
> %(objectsize) %(objecttype) %(objectsize:disk)", the response can be:
>
> 4346b22767c07e31d0f9b524fcb377972d957313 199 ??? ???
>
>
> Where ??? means the placeholder is not yet supported. In this way we
> don't have to change the default format, and as new support for the
> placeholders is added, ??? will be replaced by meaningful data.
Yes, something like that. I don't know what the placeholder should be.
In similar situations for the ref-filter printer, I think we use the
empty string for unsupported cases. E.g.:
git for-each-ref --format='%(refname) %(tagger)'
will show the empty string for %(tagger) of non-tags. That lets you use
conditionals like %(if) to switch behavior. The cat-file formatter
doesn't use ref-filter now, but I think in the long run we'd want to
unify them. So it probably makes sense to match its behavior.
> About intermixing remote and local object requests, do you mean what
> happens when remote-object-info is passed oids of objects that are
> available locally instead of on a remote? If so, I have these
> scenarios:
No, I meant that --batch-command takes a single format string, but you
can issue both local and remote requests to it. So for example:
git cat-file --batch-command='%(objectname) %(objecttype) %(objectsize)' <<\EOF
info 683c54c999c301c2cd6f715c411407c413b1d84e
remote-object-info /path/to/repo c9d3534de317f31915f37e9d9c0d52d4cf901482
EOF
would show the local info for the first object, and remote info for the
other. If you're only issuing remote-object-info commands, obviously
it's dumb to include %(objecttype) which cannot be filled. But in the
example above, it is possibly useful to get more data on the local
objects, and a reduced set of data for the remote ones.
> 1. An object is on remote but not on local. This is what
> `remote-object-info` primarily focuses on: we retrieve info from
> remote without downloading the object.
> 2. An object is on remote as well as on local. I think
> `remote-object-info` should still retrieve info from remote instead of
> checking local data. After all, if the user knows the object is on
> local, they can use the `info` command. If remote-object-info is used,
> it means we are interested in the information stored on the remote.
> 3. An object is not on remote, but only on local. I think
> remote-object-info should fail in this case, since the remote doesn't
> have the object. The info command should be used in this case.
Yeah, agreed. remote-object-info should always predictably ask the
remote.
-Peff
Thread overview: 174+ messages
2024-06-28 19:04 [PATCH 0/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-06-28 19:04 ` [PATCH 1/6] fetch-pack: refactor packet writing Eric Ju
2024-07-04 16:59 ` Karthik Nayak
2024-07-08 15:17 ` Peijian Ju
2024-07-10 9:39 ` Karthik Nayak
2024-07-15 16:40 ` Peijian Ju
2024-06-28 19:04 ` [PATCH 2/6] fetch-pack: move fetch initialization Eric Ju
2024-06-28 19:05 ` [PATCH 3/6] serve: advertise object-info feature Eric Ju
2024-06-28 19:05 ` [PATCH 4/6] transport: add client support for object-info Eric Ju
2024-07-09 7:15 ` Toon claes
2024-07-09 16:37 ` Junio C Hamano
2024-07-13 2:32 ` Peijian Ju
2024-07-13 2:30 ` Peijian Ju
2024-07-10 10:13 ` Karthik Nayak
2024-07-16 2:39 ` Peijian Ju
2024-06-28 19:05 ` [PATCH 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-07-10 10:16 ` Karthik Nayak
2024-07-16 2:59 ` Peijian Ju
2024-06-28 19:05 ` [PATCH 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-07-09 1:50 ` Justin Tobler
2024-07-12 17:41 ` Peijian Ju
2024-07-09 7:16 ` Toon claes
2024-07-13 2:35 ` Peijian Ju
2024-07-10 12:08 ` Karthik Nayak
2024-07-17 2:38 ` Peijian Ju
2024-07-20 3:43 ` [PATCH v2 0/6] " Eric Ju
2024-07-20 3:43 ` [PATCH v2 1/6] fetch-pack: refactor packet writing Eric Ju
2024-09-24 11:45 ` Christian Couder
2024-09-25 20:42 ` Peijian Ju
2024-07-20 3:43 ` [PATCH v2 2/6] fetch-pack: move fetch initialization Eric Ju
2024-07-20 3:43 ` [PATCH v2 3/6] serve: advertise object-info feature Eric Ju
2024-07-20 3:43 ` [PATCH v2 4/6] transport: add client support for object-info Eric Ju
2024-09-24 11:45 ` Christian Couder
2024-09-24 17:29 ` Junio C Hamano
2024-09-25 18:29 ` Peijian Ju
2024-07-20 3:43 ` [PATCH v2 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-07-20 3:43 ` [PATCH v2 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-09-11 13:11 ` Toon Claes
2024-09-25 18:18 ` Peijian Ju
2024-09-24 12:13 ` Christian Couder
2024-09-25 18:12 ` Peijian Ju
2024-08-22 21:24 ` [PATCH 0/6] " Peijian Ju
2024-09-26 1:38 ` [PATCH v3 " Eric Ju
2024-09-26 1:38 ` [PATCH v3 1/6] fetch-pack: refactor packet writing Eric Ju
2024-09-26 1:38 ` [PATCH v3 2/6] fetch-pack: move fetch initialization Eric Ju
2024-09-26 1:38 ` [PATCH v3 3/6] serve: advertise object-info feature Eric Ju
2024-09-26 1:38 ` [PATCH v3 4/6] transport: add client support for object-info Eric Ju
2024-10-23 9:48 ` Christian Couder
2024-10-24 20:23 ` Peijian Ju
2024-09-26 1:38 ` [PATCH v3 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-09-26 1:38 ` [PATCH v3 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-10-23 9:49 ` Christian Couder
2024-10-23 20:25 ` Taylor Blau
2024-10-24 20:28 ` Peijian Ju
2024-10-24 20:28 ` Peijian Ju
2024-10-24 20:53 ` [PATCH v4 0/6] " Eric Ju
2024-10-24 20:53 ` [PATCH v4 1/6] fetch-pack: refactor packet writing Eric Ju
2024-10-25 9:52 ` karthik nayak
2024-10-25 16:06 ` Peijian Ju
2024-10-24 20:53 ` [PATCH v4 2/6] fetch-pack: move fetch initialization Eric Ju
2024-10-24 20:53 ` [PATCH v4 3/6] serve: advertise object-info feature Eric Ju
2024-10-24 20:53 ` [PATCH v4 4/6] transport: add client support for object-info Eric Ju
2024-10-25 10:12 ` karthik nayak
2024-10-28 5:39 ` Peijian Ju
2024-10-24 20:53 ` [PATCH v4 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-10-24 20:53 ` [PATCH v4 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-10-25 10:53 ` karthik nayak
2024-10-25 13:55 ` Christian Couder
2024-10-25 20:56 ` [PATCH v4 0/6] " Taylor Blau
2024-10-27 3:54 ` Peijian Ju
2024-10-28 0:01 ` Taylor Blau
2024-10-28 20:34 ` [PATCH v5 " Eric Ju
2024-10-28 20:34 ` [PATCH v5 1/6] fetch-pack: refactor packet writing Eric Ju
2024-11-05 17:44 ` Christian Couder
2024-11-06 1:06 ` Junio C Hamano
2024-11-06 18:00 ` Peijian Ju
2024-11-06 19:50 ` Peijian Ju
2024-10-28 20:34 ` [PATCH v5 2/6] fetch-pack: move fetch initialization Eric Ju
2024-10-28 20:34 ` [PATCH v5 3/6] serve: advertise object-info feature Eric Ju
2024-10-28 20:34 ` [PATCH v5 4/6] transport: add client support for object-info Eric Ju
2024-10-28 20:34 ` [PATCH v5 5/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-10-28 20:34 ` [PATCH v5 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-11-08 16:24 ` [PATCH v6 0/6] " Eric Ju
2024-11-08 16:24 ` [PATCH v6 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-11-08 16:24 ` [PATCH v6 2/6] fetch-pack: refactor packet writing Eric Ju
2024-11-08 16:24 ` [PATCH v6 3/6] fetch-pack: move fetch initialization Eric Ju
2024-11-08 16:24 ` [PATCH v6 4/6] serve: advertise object-info feature Eric Ju
2024-11-08 16:24 ` [PATCH v6 5/6] transport: add client support for object-info Eric Ju
2024-11-08 16:24 ` [PATCH v6 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-11-11 4:38 ` [PATCH v6 0/6] " Junio C Hamano
2024-11-18 16:28 ` Peijian Ju
2024-11-19 0:16 ` Junio C Hamano
2024-11-19 6:31 ` Patrick Steinhardt
2024-11-19 6:48 ` Junio C Hamano
2024-11-19 16:35 ` Peijian Ju
2024-11-20 1:19 ` Junio C Hamano
2024-11-25 5:36 ` [PATCH v7 " Eric Ju
2024-11-25 5:36 ` [PATCH v7 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 19:26 ` Peijian Ju
2024-11-25 5:36 ` [PATCH v7 2/6] fetch-pack: refactor packet writing Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 19:09 ` Peijian Ju
2024-11-25 5:36 ` [PATCH v7 3/6] fetch-pack: move fetch initialization Eric Ju
2024-11-25 5:36 ` [PATCH v7 4/6] serve: advertise object-info feature Eric Ju
2024-11-25 5:36 ` [PATCH v7 5/6] transport: add client support for object-info Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 3:15 ` Peijian Ju
2024-11-25 5:36 ` [PATCH v7 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2024-11-25 9:51 ` Patrick Steinhardt
2024-12-03 19:23 ` Peijian Ju
2024-12-05 9:50 ` Patrick Steinhardt
2024-12-05 10:34 ` Christian Couder
2024-12-23 23:25 ` [PATCH v8 0/6] " Eric Ju
2024-12-23 23:25 ` [PATCH v8 1/6] cat-file: add declaration of variable i inside its for loop Eric Ju
2024-12-23 23:25 ` [PATCH v8 2/6] fetch-pack: refactor packet writing Eric Ju
2024-12-23 23:25 ` [PATCH v8 3/6] fetch-pack: move fetch initialization Eric Ju
2024-12-23 23:25 ` [PATCH v8 4/6] serve: advertise object-info feature Eric Ju
2024-12-23 23:25 ` [PATCH v8 5/6] transport: add client support for object-info Eric Ju
2025-01-07 18:31 ` Calvin Wan
2025-01-07 18:53 ` Junio C Hamano
2025-01-08 15:55 ` Peijian Ju
2024-12-23 23:25 ` [PATCH v8 6/6] cat-file: add remote-object-info to batch-command Eric Ju
2025-01-07 21:29 ` Calvin Wan
2024-12-26 21:56 ` [PATCH v8 0/6] " Junio C Hamano
2024-12-30 23:25 ` Peijian Ju
2025-01-08 18:37 ` [PATCH v9 0/8] cat-file: " Eric Ju
2025-01-08 18:37 ` [PATCH v9 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
2025-01-10 11:33 ` Christian Couder
2025-01-14 1:39 ` Peijian Ju
2025-01-08 18:37 ` [PATCH v9 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
2025-01-10 11:39 ` Christian Couder
2025-01-14 1:36 ` Peijian Ju
2025-01-08 18:37 ` [PATCH v9 3/8] cat-file: split test utility functions into a separate library file Eric Ju
2025-01-10 14:26 ` Christian Couder
2025-01-14 1:33 ` Peijian Ju
2025-01-08 18:37 ` [PATCH v9 4/8] fetch-pack: refactor packet writing Eric Ju
2025-01-08 18:37 ` [PATCH v9 5/8] fetch-pack: move fetch initialization Eric Ju
2025-01-08 18:37 ` [PATCH v9 6/8] serve: advertise object-info feature Eric Ju
2025-01-08 18:37 ` [PATCH v9 7/8] transport: add client support for object-info Eric Ju
2025-01-08 18:37 ` [PATCH v9 8/8] cat-file: add remote-object-info to batch-command Eric Ju
2025-01-10 11:20 ` Christian Couder
2025-01-14 1:24 ` Peijian Ju
2025-01-14 2:14 ` [PATCH v10 0/8] " Eric Ju
2025-01-14 2:14 ` [PATCH v10 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
2025-01-14 2:14 ` [PATCH v10 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
2025-01-14 2:14 ` [PATCH v10 3/8] t1006: split test utility functions into new "lib-cat-file.sh" Eric Ju
2025-01-14 2:14 ` [PATCH v10 4/8] fetch-pack: refactor packet writing Eric Ju
2025-01-14 2:14 ` [PATCH v10 5/8] fetch-pack: move fetch initialization Eric Ju
2025-01-14 2:14 ` [PATCH v10 6/8] serve: advertise object-info feature Eric Ju
2025-01-14 2:14 ` [PATCH v10 7/8] transport: add client support for object-info Eric Ju
2025-02-01 2:08 ` Jeff King
2025-02-20 22:52 ` Peijian Ju
2025-01-14 2:15 ` [PATCH v10 8/8] cat-file: add remote-object-info to batch-command Eric Ju
2025-02-01 2:03 ` Jeff King
2025-02-21 15:34 ` Peijian Ju
2025-02-24 23:45 ` Jeff King
2025-03-12 19:53 ` Peijian Ju
2025-02-21 19:04 ` [PATCH v11 0/8] " Eric Ju
2025-02-21 19:04 ` [PATCH v11 1/8] git-compat-util: add strtoul_ul() with error handling Eric Ju
2025-02-21 19:04 ` [PATCH v11 2/8] cat-file: add declaration of variable i inside its for loop Eric Ju
2025-02-21 19:04 ` [PATCH v11 3/8] t1006: split test utility functions into new "lib-cat-file.sh" Eric Ju
2025-02-21 19:04 ` [PATCH v11 4/8] fetch-pack: refactor packet writing Eric Ju
2025-02-21 19:04 ` [PATCH v11 5/8] fetch-pack: move fetch initialization Eric Ju
2025-02-21 19:04 ` [PATCH v11 6/8] serve: advertise object-info feature Eric Ju
2025-02-21 19:04 ` [PATCH v11 7/8] transport: add client support for object-info Eric Ju
2025-02-21 19:04 ` [PATCH v11 8/8] cat-file: add remote-object-info to batch-command Eric Ju
2025-02-24 20:46 ` Junio C Hamano
2025-03-11 23:10 ` Peijian Ju
2025-02-24 23:47 ` Jeff King
2025-03-12 2:19 ` Peijian Ju
2025-03-13 6:02 ` Jeff King
2025-03-21 18:24 ` Peijian Ju
2025-03-24 3:39 ` Jeff King