Date: Tue, 3 Jul 2018 07:48:24 +0000
From: Eric Wong
To: ruby-talk@ruby-lang.org
Cc: mwrap-public@80x24.org
Subject: Re: [ANN] mwrap - LD_PRELOAD malloc wrapper + line stats for Ruby
Message-ID: <20180703074824.GA22835@dcvr>
References: <20180702120050.GA24029@dcvr>

Sam Saffron wrote:
> Awesome Eric!
>
> I just ran this on a simple "Discourse boot" and got:
>
> https://gist.githubusercontent.com/SamSaffron/220910a8fb7abd1226c3e9eb0c447ef4/raw/53af7bb83c4ebe58e51b0cee862048b2e47995ad/sorted.txt
>
> It is fascinating to see absolute complete accounting with lines like:
>
> 301056 147 /usr/lib/x86_64-linux-gnu/libpq.so.5(+0x1324a) [0x7efd84d3824a]
>
> This is information that is hidden from
> `ObjectSpace#trace_object_allocations`. Complete accounting opens up a
> whole bunch of possibilities, we could for example compare with Ruby
> accounting and find memory that is not being reported to Ruby
> properly.

Glad you've found that useful! It took me a while to figure that
part out, and I'm hoping to make it more informative (by stealing
from addr2line.c in ruby).

> Is there any chance we can have a mode with "free" tracking as well?
> That way we can enable this on a long running process to detect leaks?

Maybe, but it might get a lot more expensive to be useful...
Merely counting frees and associating them with source lines,
as mwrap currently does with *allocs, would not be useful,
since GC can call free from just about anywhere.
So we'd have to try per-allocation accounting to know for sure,
and it could get time consuming.

Right now, allocations from the same location just increment
counters, and there's no per-allocation overhead after a callsite
is hit the first time. So it's not excessive overhead at the
moment... Noticeable overhead, but probably usable in production.

If it ends up being as expensive as the leak checking done by
valgrind, I'm afraid it would lose usefulness in production
environments.

Neglecting "free" also simplifies the code a lot, as I cheat a
little by ignoring most allocations made by liburcu and mwrap
itself :)  Wrapping free/cfree means we'd have to add extra
accounting info for all those allocations, too, in addition to
having to wrap all the memalign functions and ensure proper
alignment.

> It also makes c level leaks way easier to debug, cause we can listen
> on a signal and dump a log on it. Then comparing 2 logs can give us a
> leak very very easily. (we could add mwrap --compare dump1 dump2 to
> ease this usage)

Yes, if we add per-allocation tracking, storing the rb_gc_count()
result for each malloc call could be useful to track the relative
age of each malloc-ed region.

For now, calling Mwrap.dump periodically is probably prudent (it
releases GVL, so you can probably dedicate a thread to it), and
sorting on the location column + "diff -u" seems to be enough.

> Also per:
>
> 344809221 1070570
> /home/sam/.rbenv/versions/master/lib/ruby/gems/2.6.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:283
> 39139224 26553
> /home/sam/.rbenv/versions/master/lib/ruby/gems/2.6.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:277
>
>
> I wonder if there is any way we can get slightly better reporting
> here, I am guessing this is just around loading of code, but I wonder
> if we could break that down in a cleaner way here?

Yes, I noticed a problem with rubygems/core_ext/kernel_require.rb, too.
It seems to be a limitation of rb_source_location_cstr and the
result of VM optimizations: the usual trade-off between diagnostic
detail and performance.

I tried using the "caller_locations" method via rb_funcall, but that
creates new objects, and even calling rb_gc_disable() inside the
malloc() wrapper caused GC failures.
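
The "dump periodically, sort on the location column, then diff" approach mentioned earlier in this message can be roughed out in plain Ruby. This is only a sketch, not part of mwrap: parse_dump and compare_dumps are hypothetical helper names, and the input format (total bytes, call count, then a location that may contain spaces) is assumed from the sample lines quoted above. Since frees are not tracked, the deltas show allocation growth per call site between two dumps, not memory that is still live.

```ruby
# Hypothetical helpers, not part of mwrap.
# Assumed line format (from the samples quoted in this thread):
#   TOTAL_BYTES CALL_COUNT LOCATION
# LOCATION may contain spaces, so split into at most three fields.
def parse_dump(text)
  text.each_line.with_object({}) do |line, sites|
    bytes, calls, location = line.strip.split(' ', 3)
    # Skip blank or malformed lines instead of guessing at their meaning.
    next unless location && bytes =~ /\A\d+\z/ && calls =~ /\A\d+\z/
    sites[location] = [bytes.to_i, calls.to_i]
  end
end

# Returns [location, byte_delta, call_delta] for every call site whose
# totals differ between the two dumps, largest byte growth first.
def compare_dumps(old_text, new_text)
  before = parse_dump(old_text)
  after = parse_dump(new_text)
  deltas = (before.keys | after.keys).map do |loc|
    old_bytes, old_calls = before.fetch(loc, [0, 0])
    new_bytes, new_calls = after.fetch(loc, [0, 0])
    [loc, new_bytes - old_bytes, new_calls - old_calls]
  end
  changed = deltas.reject { |_, db, dc| db.zero? && dc.zero? }
  changed.sort_by { |loc, db, _| [-db, loc] }
end

if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  # Usage (file names are placeholders): ruby compare.rb dump1 dump2
  old_text, new_text = ARGV.map { |path| File.read(path) }
  compare_dumps(old_text, new_text).each do |loc, db, dc|
    puts format('%+d bytes %+d calls %s', db, dc, loc)
  end
end
```

Keying on the location string means native frames such as the libpq line above are compared the same way as Ruby file:line locations, as long as the library is mapped at the same address in both dumps.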