From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS16276 137.74.0.0/16 X-Spam-Status: No, score=-2.8 required=3.0 tests=AWL,BAYES_00, RCVD_IN_MSPIKE_BL,RCVD_IN_MSPIKE_ZBI,RCVD_IN_XBL,RDNS_DYNAMIC,SPF_FAIL, SPF_HELO_FAIL,TO_EQ_FM_DOM_SPF_FAIL shortcircuit=no autolearn=no autolearn_force=no version=3.4.0 Received: from 80x24.org (22.ip-137-74-162.eu [137.74.162.22]) by dcvr.yhbt.net (Postfix) with ESMTP id AA7FC205C9 for ; Thu, 5 Jan 2017 23:05:37 +0000 (UTC) From: Eric Wong To: spew@80x24.org Subject: [PATCH] io.c (io_fwrite): copy to hidden buffer when writing Date: Thu, 5 Jan 2017 23:05:33 +0000 Message-Id: <20170105230533.15023-1-e@80x24.org> List-Id: This avoids garbage from IO#write for [Bug #13085] when called in a read-write loop while protecting the VM from race conditions forced by the user. Memory usage from benchmark/bm_io_copy_stream_write.rb is reduced greatly: target 0: a (ruby 2.5.0dev (2017-01-05 trunk 57270) [x86_64-linux]) target 1: b (ruby 2.5.0dev (2017-01-05) [x86_64-linux]) Memory usage (last size) (B) name a b io_copy_stream_write 81899520.000 6561792.000 Memory consuming ratio (size) with the result of `a' (greater is better) name b io_copy_stream_write 12.481 Despite the extra deep data copy, there is a small speedup in execution time due to GC avoidance: Execution time (sec) name a b io_copy_stream_write 0.393 0.296 Speedup ratio: compare with the result of `a' (greater is better) name b io_copy_stream_write 1.328 This patch increases memory bandwidth use by pessimistically copying the data into a temporary hidden buffer. Using a lightweight frozen copy (as before this patch) is ineffective in read-write loops, since the read operation will trigger a heavy copy behind our back due to the CoW operation. It is also impossible to safely release memory from the lightweight CoW string, because we have no idea how many lightweight duplicates exist by the time we reacquire GVL. So, we now make a heavy copy up front which we recycle immediately upon completion. Ideally, Ruby should have a way of detecting Strings which are not visible to other threads and be able to optimize away the internal copy. Or, we give up on the idea of implicit data sharing between threads since its dangerous anyways. --- io.c | 28 +++++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/io.c b/io.c index 1074689..b7a85b9 100644 --- a/io.c +++ b/io.c @@ -1423,20 +1423,34 @@ static long io_fwrite(VALUE str, rb_io_t *fptr, int nosync) { int converted = 0; + VALUE tmp = str; + const char *ptr; + long len; + #ifdef _WIN32 if (fptr->mode & FMODE_TTY) { long len = rb_w32_write_console(str, fptr->fd); if (len > 0) return len; } #endif - str = do_writeconv(str, fptr, &converted); - if (converted) - OBJ_FREEZE(str); - else - str = rb_str_new_frozen(str); + tmp = do_writeconv(str, fptr, &converted); + if (converted) { + rb_obj_hide(tmp); + } + else if (!OBJ_FROZEN_RAW(str)) { + tmp = rb_str_tmp_new(0); + rb_str_buf_append(tmp, str); + } + + RSTRING_GETMEM(tmp, ptr, len); + len = io_binwrite(tmp, ptr, len, fptr, nosync); - return io_binwrite(str, RSTRING_PTR(str), RSTRING_LEN(str), - fptr, nosync); + if (tmp != str) { + rb_str_resize(tmp, 0); + rb_gc_force_recycle(tmp); + } + + return len; } ssize_t -- EW