From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS63949 64.71.152.0/24 X-Spam-Status: No, score=-2.2 required=3.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE, RDNS_NONE,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=no autolearn_force=no version=3.4.0 Received: from 80x24.org (unknown [64.71.152.64]) by dcvr.yhbt.net (Postfix) with ESMTP id B85261F404 for ; Sat, 30 Dec 2017 02:51:09 +0000 (UTC) From: Eric Wong To: spew@80x24.org Subject: [PATCH] hash: do not taint auto-frozen string keys Date: Sat, 30 Dec 2017 02:51:09 +0000 Message-Id: <20171230025109.1946-1-e@80x24.org> List-Id: If somebody gives us an already-frozen, tainted string, then we'll use that key as-is. Otherwise, do not taint the frozen string by default to allow reusing fstrings. The following code shows a reduction from 587 to 415 strings allocated from calling Psych.load: ``` require 'psych' before = {} after = {} h = {} Object.constants.each do |c| # Psych loads UTF-8 keys by default, ensure we have a deduped # version for hash keys c = c.to_s.encode!(Encoding::UTF_8) h[-c] = nil end # n.b. this allocates the first time it is called, so we # do it here before doing it again: ObjectSpace.count_objects(before) data = Psych.dump(h) data.taint GC.disable ObjectSpace.count_objects(before) t = Psych.load(data) ObjectSpace.count_objects(after) p(after[:T_STRING] - before[:T_STRING]) ``` * hash.c (rb_hash_key_str): ensure frozen key is not tainted * test/ruby/test_hash.rb (test_tainted_string_key): update test [Feature #14225] --- hash.c | 7 +++++-- test/ruby/test_hash.rb | 18 +++++++++++++++++- 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/hash.c b/hash.c index b4e86bdfb1..fc7ca8c40e 100644 --- a/hash.c +++ b/hash.c @@ -1587,11 +1587,14 @@ rb_hash_key_str(VALUE key) { VALUE k; - if (!RB_OBJ_TAINTED(key) && - (k = fstring_existing_str(key)) != Qnil) { + if ((k = fstring_existing_str(key)) != Qnil) { return k; } else { + if (RB_OBJ_TAINTED(key)) { + key = rb_str_dup(key); + FL_UNSET_RAW(key, FL_TAINT); + } return rb_str_new_frozen(key); } } diff --git a/test/ruby/test_hash.rb b/test/ruby/test_hash.rb index 313c2fd9de..a739fa1e80 100644 --- a/test/ruby/test_hash.rb +++ b/test/ruby/test_hash.rb @@ -310,8 +310,24 @@ def test_tainted_string_key key = h.keys.first assert_predicate str, :tainted? assert_not_predicate str, :frozen? - assert_predicate key, :tainted? + assert_not_predicate key, :tainted? assert_predicate key, :frozen? + + # test random string to ensure no fstring reuse + str = rand.to_s.taint + h.clear + h[str] = nil + key = h.keys.first + assert_predicate str, :tainted? + assert_not_predicate str, :frozen? + assert_not_predicate key, :tainted? + assert_predicate key, :frozen? + assert_not_same str, key + + str.freeze + h.clear + h[str] = nil + assert_same str, h.keys[0], 'frozen tainted string used as-is' end def test_EQUAL # '==' -- EW