From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS5577 94.242.192.0/18
X-Spam-Status: No, score=-0.7 required=3.0 tests=BAYES_00,RCVD_IN_XBL,
	RDNS_NONE shortcircuit=no autolearn=no version=3.3.2
X-Original-To: spew@80x24.org
Received: from 80x24.org (unknown [94.242.228.108])
	by dcvr.yhbt.net (Postfix) with ESMTP id 34DB41F434
	for ; Wed, 20 Jan 2016 03:53:51 +0000 (UTC)
From: Eric Wong
To: spew@80x24.org
Subject: [PATCH] psych: pre-freeze string keys for hashes
Date: Wed, 20 Jan 2016 03:53:42 +0000
Message-Id: <20160120035342.20168-1-e@80x24.org>
List-Id:

With the following example, this reduces allocations from 346 to 324
strings when calling Psych.load on a 26-entry hash:

-------------------------------8<--------------------------------
require 'psych'
require 'objspace'
before = {}
after = {}
str = [ '---', *(('a'..'z').map { |k| "#{k * 11}: 1" }), '' ].join("\n")
GC.disable
ObjectSpace.count_objects(before)
h = Psych.load(str)
ObjectSpace.count_objects(after)
p(after[:T_STRING] - before[:T_STRING])
-------------------------------8<--------------------------------

Allocating 324 strings for 26 hash keys is still expensive.  More
work will be needed to reduce allocations further...

Tested on x86-64.
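
For illustration only, the key-caching idea can be sketched outside of
Psych roughly as follows.  KeyCache and its dedup method are
hypothetical names used for this sketch and are not part of Psych or
of this patch; the patch itself keeps the cache in an instance
variable of the ToRuby visitor.

```ruby
# Standalone sketch of the key-caching idea.  KeyCache is a
# hypothetical name; it is not part of Psych.
class KeyCache
  def initialize
    @cache = {}
  end

  # Return the first String seen with this content (frozen), so equal
  # keys loaded later share one object.  Non-String keys pass through.
  def dedup(key)
    return key unless key.instance_of?(String)
    # Freezing before the store avoids the copy Hash#[]= would
    # otherwise make of an unfrozen String key.
    @cache[key] ||= key.freeze
  end
end

cache = KeyCache.new
first = cache.dedup(String.new('name'))
second = cache.dedup(String.new('name'))
p first.equal?(second) # same object for both keys
```

The cache here lives as long as the KeyCache object, much as the
patch's cache lives as long as one visitor.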
---
 ext/psych/lib/psych/visitors/to_ruby.rb |  8 ++++++++
 test/psych/test_psych.rb                | 24 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/ext/psych/lib/psych/visitors/to_ruby.rb b/ext/psych/lib/psych/visitors/to_ruby.rb
index c061da2..d109086 100644
--- a/ext/psych/lib/psych/visitors/to_ruby.rb
+++ b/ext/psych/lib/psych/visitors/to_ruby.rb
@@ -26,6 +26,7 @@ def initialize ss, class_loader
         @ss = ss
         @domain_types = Psych.domain_types
         @class_loader = class_loader
+        @key_cache = {}
       end
 
       def accept target
@@ -336,6 +337,13 @@ def revive_hash hash, o
         o.children.each_slice(2) { |k,v|
           key = accept(k)
           val = accept(v)
+          if key.instance_of?(String)
+            if cached = @key_cache[key]
+              key = cached
+            else
+              @key_cache[key.freeze] = key
+            end
+          end
 
           if key == SHOVEL && k.tag != "tag:yaml.org,2002:str"
             case v
diff --git a/test/psych/test_psych.rb b/test/psych/test_psych.rb
index 7de9e07..ceb6cb3 100644
--- a/test/psych/test_psych.rb
+++ b/test/psych/test_psych.rb
@@ -176,4 +176,28 @@ def test_callbacks
       ["tag:example.com,2002:foo", "bar"]
     ], types
   end
+
+  def test_string_key_dedup_optimization
+    new_hash = lambda { { 'a' => 1, 'b' => 2 } }
+    ary = []
+    10.times { ary << new_hash.call }
+    ary << []
+    10.times { ary.last << new_hash.call }
+
+    ids = Hash.new { |h,k| h[k] = 0 }
+
+    ary = Psych.load(Psych.dump(ary))
+    ary.each do |ent|
+      case ent
+      when Hash
+        ent.each_key { |k| ids[k.object_id] += 1 }
+      when Array
+        ent.each do |h|
+          h.each_key { |k| ids[k.object_id] += 1 }
+        end
+      end
+    end
+    assert_equal 2, ids.size
+    ids.each_value { |v| assert_equal 20, v }
+  end
 end
-- 
EW