From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <e@80x24.org>
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on dcvr.yhbt.net
X-Spam-Level: 
X-Spam-ASN:  
X-Spam-Status: No, score=-3.7 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 7173E1F516;
	Tue,  3 Jul 2018 07:48:24 +0000 (UTC)
Date: Tue, 3 Jul 2018 07:48:24 +0000
From: Eric Wong <e@80x24.org>
To: ruby-talk@ruby-lang.org
Cc: mwrap-public@80x24.org
Subject: Re: [ANN] mwrap - LD_PRELOAD malloc wrapper + line stats for Ruby
Message-ID: <20180703074824.GA22835@dcvr>
References: <20180702120050.GA24029@dcvr>
 <CAAtdryNexZtjU=Fr1QjFVJWJ8eN5ASLs8xox54dD1ogqEzf8ew@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAAtdryNexZtjU=Fr1QjFVJWJ8eN5ASLs8xox54dD1ogqEzf8ew@mail.gmail.com>
List-Id: <mwrap-public.80x24.org>

Sam Saffron <sam.saffron@gmail.com> wrote:
> Awesome Eric!
> 
> I just ran this on a simple "Discourse boot" and got:
> 
> https://gist.githubusercontent.com/SamSaffron/220910a8fb7abd1226c3e9eb0c447ef4/raw/53af7bb83c4ebe58e51b0cee862048b2e47995ad/sorted.txt
> 
> It is fascinating to see absolute complete accounting with lines like:
> 
> 301056 147 /usr/lib/x86_64-linux-gnu/libpq.so.5(+0x1324a) [0x7efd84d3824a]
> 
> This is information that is hidden from
> `ObjectSpace#trace_object_allocations`. Complete accounting opens up a
> whole bunch of possibilities, we could for example compare with Ruby
> accounting and find memory that is not being reported to Ruby
> properly.

Glad you've found that useful!  Took me a while to figure that
part out, and I'm hoping to make it more informative
(by stealing from addr2line.c in ruby).

> Is there any chance we can have a mode with "free" tracking as well?
> That way we can enable this on a long running process to detect leaks?

Maybe, but it might get a lot more expensive to be useful...
Merely counting frees and associating them with source lines
like mwrap currently does with *allocs would not be useful,
since GC can call free from just about anywhere.

So we'd have to try per-allocation accounting to know for sure
and it could get time consuming.

Right now, allocations from the same location just increment
counters and there's no per-allocation overhead after a callsite
is hit the first time.  So it's not excessive overhead at the
moment...  Noticeable overhead, but probably usable in
production.
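
To make the bookkeeping above concrete, here is a rough Ruby model
of per-callsite counters.  It is only a sketch: mwrap does this in C
inside the malloc wrapper, and the names below (CallsiteStats,
record, rows) are made up for illustration.

```ruby
# Hypothetical model of per-callsite accounting; not mwrap's actual code.
class CallsiteStats
  Entry = Struct.new(:total_bytes, :calls)

  def initialize
    # One Entry per location, created the first time a callsite is seen.
    @table = Hash.new { |h, loc| h[loc] = Entry.new(0, 0) }
  end

  # After the first hit, recording an allocation only bumps two counters.
  def record(location, size)
    e = @table[location]
    e.total_bytes += size
    e.calls += 1
    e
  end

  # Rows as [total_bytes, calls, location], largest total first.
  def rows
    @table.map { |loc, e| [e.total_bytes, e.calls, loc] }
          .sort_by { |bytes, _calls, _loc| -bytes }
  end
end

stats = CallsiteStats.new
stats.record("a.rb:1", 64)
stats.record("a.rb:1", 32)
stats.record("b.rb:9", 16)
stats.rows.each { |bytes, calls, loc| puts "#{bytes} #{calls} #{loc}" }
```

The table grows with the number of distinct callsites, not with the
number of allocations, which is why the cost stays bounded.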

If it ends up being as expensive as the leak checking done by
valgrind, I'm afraid it would lose usefulness in production
environments.

Neglecting "free" also simplifies the code a lot, as I cheat
a little by ignoring most allocations made by liburcu and mwrap
itself :)  Wrapping free/cfree means we'd have to add extra
accounting info for all those allocations, too; in addition
to having to wrap all the memalign functions and ensure proper
alignment.

> It also makes c level leaks way easier to debug, cause we can listen
> on a signal and dump a log on it. Then comparing 2 logs can give us a
> leak very very easily. (we could add mwrap --compare dump1 dump2 to
> ease this usage)

Yes, if we add per-allocation tracking, storing the
rb_gc_count() result for each malloc call could be useful to
track the relative age of each malloc-ed region.
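
A rough Ruby model of what that per-allocation tracking might look
like follows.  Nothing like this exists in mwrap today; LiveAllocs,
on_malloc, on_free and older_than are hypothetical names, and
GC.count stands in for rb_gc_count().

```ruby
# Hypothetical model of per-allocation tracking; not part of mwrap.
class LiveAllocs
  Info = Struct.new(:location, :size, :gc_count)

  def initialize
    @live = {} # address => Info, one entry per live allocation
  end

  # Called where a malloc wrapper would run.
  def on_malloc(addr, size, location, gc_count = GC.count)
    @live[addr] = Info.new(location, size, gc_count)
  end

  # free() can be called from anywhere (e.g. GC), so match on the
  # address rather than on the callsite of the free.
  def on_free(addr)
    @live.delete(addr)
  end

  # Regions still live that were allocated at least min_age GC runs ago.
  def older_than(min_age, now = GC.count)
    @live.select { |_addr, info| now - info.gc_count >= min_age }
  end
end

live = LiveAllocs.new
live.on_malloc(0x1000, 64, "a.rb:1", 10)
live.on_malloc(0x2000, 32, "b.rb:9", 18)
live.on_free(0x2000)
live.on_malloc(0x3000, 16, "c.rb:3", 19)
suspects = live.older_than(5, 20)
suspects.each { |addr, i| puts format("%#x %d %s", addr, i.size, i.location) }
```

Unlike the per-callsite counters, this needs one entry per live
allocation, which is where the extra cost would come from.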

For now, calling Mwrap.dump periodically is probably prudent
(it releases the GVL, so you can probably dedicate a thread to it),
and sorting on the location column + "diff -u" seems to be enough.
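
If "diff -u" gets too noisy, the same comparison can be sketched in
Ruby.  This assumes each dump line looks like
"total_bytes call_count location", as in the line quoted earlier;
parse_dump and growth are hypothetical helpers, not part of mwrap.

```ruby
# Parse "total_bytes call_count location" lines into
# { location => [bytes, calls] }.  The location may itself contain
# spaces, hence the split limit of 3.
def parse_dump(text)
  text.each_line.with_object({}) do |line, acc|
    bytes, calls, loc = line.strip.split(/\s+/, 3)
    next unless loc # skip blank or short lines
    acc[loc] = [Integer(bytes), Integer(calls)]
  end
end

# Locations whose total bytes grew between two dumps, biggest first.
def growth(old_text, new_text)
  old = parse_dump(old_text)
  deltas = parse_dump(new_text).map do |loc, (bytes, _calls)|
    [bytes - old.fetch(loc, [0, 0])[0], loc]
  end
  deltas.select { |delta, _loc| delta > 0 }
        .sort_by { |delta, _loc| -delta }
end

old_dump = "100 2 a.rb:1\n50 1 libfoo.so(+0x10) [0x7f00]\n"
new_dump = "400 8 a.rb:1\n50 1 libfoo.so(+0x10) [0x7f00]\n16 1 b.rb:9\n"
growth(old_dump, new_dump).each { |delta, loc| puts "+#{delta} #{loc}" }
```

Since frees are not tracked, growth here only shows which callsites
kept allocating between dumps, not that the memory is still held.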

> Also per:
> 
>        344809221      1070570
> /home/sam/.rbenv/versions/master/lib/ruby/gems/2.6.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:283
>         39139224        26553
> /home/sam/.rbenv/versions/master/lib/ruby/gems/2.6.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:277
> 
> 
> I wonder if there is any way we can get slightly better reporting
> here, I am guessing this is just around loading of code, but I wonder
> if we could break that down in a cleaner way here?

Yes, I noticed problems with rubygems/core_ext/kernel_require.rb,
too.  It seems to be a limitation of rb_source_location_cstr and
the result of VM optimizations.  The usual trade-off between
diagnostic details and performance.

I tried using the "caller_locations" method via rb_funcall, but
that creates new objects and even calling rb_gc_disable() inside
the malloc() wrapper caused GC failures.
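
The object creation is easy to see from Ruby itself.  A small
sketch, assuming MRI (it relies on the :total_allocated_objects
counter in GC.stat); objects_allocated is a hypothetical helper:

```ruby
# Count Ruby objects allocated while a block runs, using MRI's
# GC.stat counter.
def objects_allocated
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Even asking for a single frame allocates at least the result Array.
n = objects_allocated { caller_locations(0, 1) }
puts "caller_locations allocated #{n} object(s)"
```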