Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Glen Choo <chooglen@google.com>
To: Siddharth Singh <siddhartth@google.com>
Cc: git@vger.kernel.org, nasamuffin@google.com
Subject: Re: [RFC PATCH v2] khash_test.c : add unit tests
Date: Fri, 07 Jul 2023 14:12:47 -0700	[thread overview]
Message-ID: <kl6lv8evpgs0.fsf@chooglen-macbookpro.roam.corp.google.com> (raw)
In-Reply-To: <ZKgpgnU7WrMCTp/C@google.com>

Siddharth Singh <siddhartth@google.com> writes:

>> > +int initialized_hashmap_pointer_not_null() {
>> > +  kh_str_t *hashmap = kh_init_str();
>> > +
>> > +  if(hashmap != NULL){
>> > +    free(hashmap);
>> > +    return 1;
>> > +  }
>> > +  return 0;
>> > +}
>> 
>> I don't think this is a useful test. It would be extremely weird for the
>> return value to be NULL in regular usage.
>
> While it is indeed uncommon for the return value of kh_init_str() to be null in
> regular usage, there could be certain cases where the hashmap pointer might be
> null. One possible scenario is if there is an error during memory allocation for
> the hashmap. This can happen if the system runs out of memory or if there are
> other memory-related issues. In such cases, the kh_init_str() function might
> return a null pointer to indicate the failure to allocate memory for the hashmap.

Ah, yes, that is what I expected. By "regular usage", I meant the case
where the allocation succeeds. Nevertheless, I don't think this test
demonstrates or asserts anything that isn't easily observed in other
tests, though I am open to be proven wrong.

>> > +int put_value_into_hashmap_once_succeeds() {
>> > +  int ret, value = 14;
>> > +  khint_t pos;
>> > +  kh_str_t *hashmap = kh_init_str();
>> > +  pos = kh_put_str(hashmap, "test_key", &ret);
>> > +  if (!ret)
>> > +    return 0;
>> > +  kh_key(hashmap, pos) = xstrdup("test_key");
>> > +  kh_value(hashmap, pos) = &value;
>> > +  return *kh_value(hashmap, pos) == value;
>> > +}
>> Also, do we have to kh_put_str("some_key") and then immediately set it
>> again with kh_key(pos)? That seems odd, and I don't see other callers
>> doing that all the time.
>
> Regarding the usage of kh_value(), I saw this part of code[1] This suggests
> that it may be the recommended approach in this context.

What 'context' were you thinking of when you say "recommended approach
in this context"? (Did you mean kh_put_str() vs kh_put_oid_*()?) I
didn't see such an approach in most kh_put_*() call sites. I couldn't
find the link you're referencing, but I assume it is the following lines
in delta-islands.c:

	int hash_ret;
	khiter_t pos = kh_put_str(remote_islands, island_name, &hash_ret);

	if (hash_ret) {
		kh_key(remote_islands, pos) = xstrdup(island_name);
		kh_value(remote_islands, pos) = xcalloc(1, sizeof(struct remote_island));
	}

In which case, the use case (as I understand it at least) seems quite
different:

- When kh_put_str() 'returns' hash_ret = 0, it means there's already a
  matching entry and we should do nothing.

- Otherwise, kh_put_str() actually creates a new matching entry, and now
  we want to allocate memory to store the entry, so we overwrite the key
  with "kh_key() = xstrdup()". We could have avoided this by doing
  something like:

    char *key = xstrdup(island_name)
    khiter_t pos = kh_put_str(remote_islands, key, &hash_ret);

    if (hash_ret) {
      kh_value(remote_islands, pos) = xcalloc(1, sizeof(struct remote_island));
    } else {
      free(key);
    }

  but allocations are expensive, so we should avoid allocating before we
  know it's needed.

>> > +int put_same_value_into_hashmap_twice_fails() {
>> > +  int first_return_value, second_return_value, value = 14;
>> > +  khint_t pos;
>> > +  kh_str_t *hashmap = kh_init_str();
>> > +  pos = kh_put_str(hashmap, "test_key", &first_return_value);
>> > +  if (!first_return_value)
>> > +    return 0;
>> > +  kh_key(hashmap, pos) = xstrdup("test_key");
>> > +  kh_value(hashmap, pos) = &value;
>> > +  kh_put_str(hashmap, "test_key", &second_return_value);
>> > +  return !second_return_value;
>> > +}
>> 
>> I don't see how this is different from the previous test that tested for
>> collisions.
>
> I also attempt to assign a value to the hashmap here, but this will not be
> successful due to a collision. The intent of this test was different, however

Hm, what was the different intent? Is it to test this slightly different
case where the value is assigned?

My reasoning is:

- Would assigning a value result in different behavior?
- Would a reasonable user expect that it would result in different
  behavior?

If the answer to both of those is 'no' (which I believe it is), then it
is not worth testing.

  reply	other threads:[~2023-07-07 21:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-31 15:51 [RFC PATCH 0/1] Unit tests for khash.h Siddharth Singh
2023-05-31 15:51 ` [RFC PATCH 1/1] khash_test.c: add unit tests Siddharth Singh
2023-05-31 22:35 ` [RFC PATCH 0/1] Unit tests for khash.h Taylor Blau
2023-06-01  6:26   ` Junio C Hamano
2023-06-01 13:02   ` Phillip Wood
2023-05-31 23:35 ` Eric Wong
2023-06-01 13:23 ` Phillip Wood
2023-06-27 13:20 ` [RFC PATCH v2] khash_test.c : add unit tests Siddharth Singh
2023-06-30  0:53   ` Glen Choo
2023-07-07 15:04     ` Siddharth Singh
2023-07-07 21:12       ` Glen Choo [this message]
2023-06-30  1:05   ` Glen Choo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=kl6lv8evpgs0.fsf@chooglen-macbookpro.roam.corp.google.com \
    --to=chooglen@google.com \
    --cc=git@vger.kernel.org \
    --cc=nasamuffin@google.com \
    --cc=siddhartth@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).