From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A728DC77B71 for ; Tue, 18 Apr 2023 10:24:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230499AbjDRKYI (ORCPT ); Tue, 18 Apr 2023 06:24:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46468 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230311AbjDRKX1 (ORCPT ); Tue, 18 Apr 2023 06:23:27 -0400 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B50054EC3 for ; Tue, 18 Apr 2023 03:23:21 -0700 (PDT) Received: (qmail 4866 invoked by uid 109); 18 Apr 2023 10:23:21 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 18 Apr 2023 10:23:21 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 746 invoked by uid 111); 18 Apr 2023 10:23:20 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 18 Apr 2023 06:23:20 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 18 Apr 2023 06:23:20 -0400 From: Jeff King To: Taylor Blau Cc: git@vger.kernel.org, Junio C Hamano Subject: Re: [PATCH 2/5] t/helper/test-hashmap.c: avoid using `strtok()` Message-ID: <20230418102320.GB508219@coredump.intra.peff.net> References: <0f199468a3b8375dbec0f56fdc831c3ac298eb4e.1681428696.git.me@ttaylorr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <0f199468a3b8375dbec0f56fdc831c3ac298eb4e.1681428696.git.me@ttaylorr.com> Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Thu, Apr 13, 2023 at 07:31:46PM -0400, Taylor Blau wrote: > Avoid using the non-reentrant `strtok()` to separate the parts of each > incoming command. Instead of replacing it with `strtok_r()`, let's > instead use the more friendly `string_list_split_in_place_multi()`. Junio mentioned this offhand in his response, but I wanted to highlight one difference in before/after behavior here. [before] $ printf 'add foo bar\niterate\n' | t/helper/test-tool hashmap foo bar [after] $ printf 'add foo bar\niterate\n' | t/helper/test-tool hashmap foo bar I think that's fine for this test script, but it may be an indication that we want string-list's split to support different semantics. > @@ -159,21 +161,34 @@ int cmd__hashmap(int argc, const char **argv) > > /* process commands from stdin */ > while (strbuf_getline(&line, stdin) != EOF) { > - char *cmd, *p1 = NULL, *p2 = NULL; > + char *cmd, *p1, *p2; > unsigned int hash = 0; > struct test_entry *entry; > > + /* > + * Because we memdup() the arguments out of the > + * string_list before inserting them into the hashmap, > + * it's OK to set its length back to zero to avoid > + * re-allocating the items array once per line. > + * > + * By doing so, we'll instead overwrite the existing > + * entries and avoid re-allocating. > + */ > + parts.nr = 0; I think this is OK, but I wonder if we should wrap it in a function that makes sure the strings aren't owned by the strdup (and thus aren't being leaked). Something like: void string_list_setlen(struct string_list *sl, size_t nr) { /* alternatively, I guess we could actually free nr..sl->nr */ if (sl->strdup_strings) BUG("you can't setlen on a string-list which owns its strings"); if (nr > sl->nr) BUG("you can't grow a string-list with setlen"); sl->nr = nr; } That is probably overkill for these two test helpers, but I wonder if the pattern might escape into "real" code (especially if we suggest using string-list instead of strtok). -Peff