dumping ground for random patches and texts
* [PATCH] psych: pre-freeze string keys for hashes
From: Eric Wong @ 2016-01-20  1:03 UTC
  To: spew

With the following example, this reduces allocations from 346 to 324
strings when calling Psych.load on a 26-entry hash:

-------------------------------8<--------------------------------
require 'psych'
require 'objspace'
before = {}
after = {}
str = [ '---', *(('a'..'z').map { |k| "#{k * 11}: 1" }), '' ].join("\n")
GC.disable
ObjectSpace.count_objects(before)
h = Psych.load(str)
ObjectSpace.count_objects(after)
p(after[:T_STRING] - before[:T_STRING])
-------------------------------8<--------------------------------

Allocating 324 strings for 26 hash keys is still expensive.  More
work will be needed to reduce allocations further...

Tested on x86-64.
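
For context, a minimal sketch of why pre-freezing can save an
allocation per key: Hash#[]= stores a frozen copy of an unfrozen
String key, while an already-frozen String key is stored as-is.  The
variable names below are illustrative only and are not part of psych:

```ruby
# Unfrozen String key: the hash keeps its own frozen copy, so the
# stored key is a different object from the one passed in.
unfrozen = String.new('a' * 11)
h1 = {}
h1[unfrozen] = 1
stored1 = h1.keys.first
copied = !stored1.equal?(unfrozen)

# Frozen String key: the hash stores the very same object, so no
# extra String is allocated for the key.
frozen = String.new('b' * 11).freeze
h2 = {}
h2[frozen] = 1
stored2 = h2.keys.first
reused = stored2.equal?(frozen)

p copied: copied, reused: reused, stored1_frozen: stored1.frozen?
```

Note that freezing the key in place also freezes the object returned
by accept(k), so this assumes nothing else mutates that string later.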
---
 ext/psych/lib/psych/visitors/to_ruby.rb | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ext/psych/lib/psych/visitors/to_ruby.rb b/ext/psych/lib/psych/visitors/to_ruby.rb
index c061da2..2f9408c 100644
--- a/ext/psych/lib/psych/visitors/to_ruby.rb
+++ b/ext/psych/lib/psych/visitors/to_ruby.rb
@@ -336,6 +336,7 @@ def revive_hash hash, o
         o.children.each_slice(2) { |k,v|
           key = accept(k)
           val = accept(v)
+          key.freeze if key.instance_of?(String)
 
           if key == SHOVEL && k.tag != "tag:yaml.org,2002:str"
             case v
-- 
EW



* [PATCH] psych: pre-freeze string keys for hashes
From: Eric Wong @ 2016-01-20  3:53 UTC
  To: spew

With the following example, this reduces allocations from 346 to 324
strings when calling Psych.load on a 26-entry hash:

-------------------------------8<--------------------------------
require 'psych'
require 'objspace'
before = {}
after = {}
str = [ '---', *(('a'..'z').map { |k| "#{k * 11}: 1" }), '' ].join("\n")
GC.disable
ObjectSpace.count_objects(before)
h = Psych.load(str)
ObjectSpace.count_objects(after)
p(after[:T_STRING] - before[:T_STRING])
-------------------------------8<--------------------------------

Allocating 324 strings for 26 hash keys is still expensive.  More
work will be needed to reduce allocations further...

Tested on x86-64.
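
The diff below also adds a per-visitor @key_cache so that equal string
keys across many hashes share one frozen object.  A standalone sketch
of that lookup-or-store step follows; the KeyCache class and its
intern method are hypothetical names for illustration and are not
part of psych:

```ruby
# Hypothetical stand-in for the @key_cache logic in revive_hash.
class KeyCache
  def initialize
    @cache = {}
  end

  # Returns one shared frozen String per distinct key content.
  # Non-String keys (and String subclasses) pass through untouched.
  def intern(key)
    return key unless key.instance_of?(String)
    cached = @cache[key]
    return cached if cached
    # First sighting: freeze in place and remember it.  The
    # assignment evaluates to key, which is now frozen.
    @cache[key.freeze] = key
  end
end

cache = KeyCache.new
# Three hashes built from three distinct but equal String objects.
hashes = Array.new(3) { { cache.intern(String.new('name')) => 1 } }
ids = hashes.map { |h| h.keys.first.object_id }.uniq
p ids.size
```

Because the cached key is already frozen, storing it in each new hash
does not copy it, so all three hashes end up holding the same String.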
---
 ext/psych/lib/psych/visitors/to_ruby.rb |  8 ++++++++
 test/psych/test_psych.rb                | 24 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/ext/psych/lib/psych/visitors/to_ruby.rb b/ext/psych/lib/psych/visitors/to_ruby.rb
index c061da2..d109086 100644
--- a/ext/psych/lib/psych/visitors/to_ruby.rb
+++ b/ext/psych/lib/psych/visitors/to_ruby.rb
@@ -26,6 +26,7 @@ def initialize ss, class_loader
         @ss = ss
         @domain_types = Psych.domain_types
         @class_loader = class_loader
+        @key_cache = {}
       end
 
       def accept target
@@ -336,6 +337,13 @@ def revive_hash hash, o
         o.children.each_slice(2) { |k,v|
           key = accept(k)
           val = accept(v)
+          if key.instance_of?(String)
+            if cached = @key_cache[key]
+              key = cached
+            else
+              @key_cache[key.freeze] = key
+            end
+          end
 
           if key == SHOVEL && k.tag != "tag:yaml.org,2002:str"
             case v
diff --git a/test/psych/test_psych.rb b/test/psych/test_psych.rb
index 7de9e07..ceb6cb3 100644
--- a/test/psych/test_psych.rb
+++ b/test/psych/test_psych.rb
@@ -176,4 +176,28 @@ def test_callbacks
       ["tag:example.com,2002:foo", "bar"]
     ], types
   end
+
+  def test_string_key_dedup_optimization
+    new_hash = lambda { { 'a' => 1, 'b' => 2 } }
+    ary = []
+    10.times { ary << new_hash.call }
+    ary << []
+    10.times { ary.last << new_hash.call }
+
+    ids = Hash.new { |h,k| h[k] = 0 }
+
+    ary = Psych.load(Psych.dump(ary))
+    ary.each do |ent|
+      case ent
+      when Hash
+        ent.each_key { |k| ids[k.object_id] += 1 }
+      when Array
+        ent.each do |h|
+          h.each_key { |k| ids[k.object_id] += 1 }
+        end
+      end
+    end
+    assert_equal 2, ids.size
+    ids.each_value { |v| assert_equal 20, v }
+  end
 end
-- 
EW

