* [PATCH] mwrap 2.0.0 - LD_PRELOAD malloc wrapper for Ruby
From: Eric Wong @ 2018-07-20 9:25 UTC
To: ruby-talk, mwrap-public
mwrap is designed to answer the question:
Which lines of Ruby are hitting malloc the most?
mwrap wraps all malloc-family calls to trace the Ruby source
location of such calls and bytes allocated at each callsite.
As of mwrap 2.0.0, it can also function as a leak detector
and show live allocations at every call site. Depending on
your application and workload, the overhead is roughly a 50%
increase in memory and runtime.
It works best for allocations made under the GVL, but it also
tries to track numeric caller addresses for allocations made
without the GVL, so you can get an idea of how much memory
certain extensions and native libraries use.
It requires the concurrent lock-free hash table from the
Userspace RCU project: https://liburcu.org/
It does not require recompiling or rebuilding Ruby, but only
supports Ruby trunk (2.6.0dev+) on a few platforms:
* GNU/Linux
* FreeBSD (tested 11.1)
It may work on NetBSD, OpenBSD and DragonFly BSD.
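The per-location report written by Mwrap.dump is plain text and easy to
post-process: each line carries total bytes, allocation count, and the
source location (field order per the dump_to_file hunk quoted in the
patch below). A small parsing sketch, using invented sample data:

```ruby
# Sort a Mwrap.dump-style report by total bytes allocated.
# Each line: <total_bytes> <allocations> <location>
# (layout inferred from mwrap's dump_to_file; the sample data is made up)
sample = <<~DUMP
        4096           12 lib/app.rb:42
      131072            3 lib/cache.rb:7
         512          100 [0xdeadbeef]
DUMP

rows = sample.each_line.map do |line|
  total, allocations, location = line.split(' ', 3)
  [Integer(total), Integer(allocations), location.chomp]
end

top = rows.sort_by { |total,| -total }
top.first # => [131072, 3, "lib/cache.rb:7"]
```

Bracketed hex locations like `[0xdeadbeef]` are the numeric caller
addresses recorded for allocations made without the GVL.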
Changes in 2.0.0:
This release includes significant changes to track live
allocations and frees. It can find memory leaks from malloc
with less overhead than valgrind's leakchecker and there is a
new Rack endpoint (MwrapRack) which can display live allocation
stats.
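With live tracking, leak hunting reduces to arithmetic on the
per-location counters: any call site with more allocations than frees
still holds live memory. A pure-Ruby sketch of that bookkeeping, using
the six-value shape Mwrap.each yields in 2.0.0 (the data is invented):

```ruby
# Each row mimics what Mwrap.each yields per call site:
# [location, total_bytes, allocations, frees, age_total, max_lifespan]
entries = [
  ['lib/app.rb:42',     4096,  12, 12, 240, 31],
  ['lib/cache.rb:7',  131072,   3,  0,   0,  0], # never freed: leak suspect
  ['[0xdeadbeef]',       512, 100, 97, 500, 12], # no-GVL allocation site
]

# A location with more allocations than frees still holds live memory.
leaks = entries.filter_map do |loc, _total, allocs, frees, _age, _max|
  [loc, allocs - frees] if allocs > frees
end
leaks # => [["lib/cache.rb:7", 3], ["[0xdeadbeef]", 3]]
```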
API additions:
* Mwrap#[] - https://80x24.org/mwrap/Mwrap.html#method-c-5B-5D
* Mwrap::SourceLocation - https://80x24.org/mwrap/Mwrap/SourceLocation.html
* MwrapRack - https://80x24.org/mwrap/MwrapRack.html
Incompatible changes:
* Mwrap.clear is now an alias for Mwrap.reset, as it's unsafe
to implement the new Mwrap#[] API otherwise:
https://80x24.org/mwrap-public/20180716211933.5835-12-e@80x24.org/
26 changes since v1.0.0:
README: improve usage example
MANIFEST: add .document
add benchmark
use __attribute__((weak)) instead of dlsym
Mwrap.dump: do not segfault on invalid IO arg
bin/mwrap: support LISTEN_FDS env from systemd
support per-allocation headers for per-alloc tracking
mwrap: use malloc to do our own memalign
hold RCU read lock to insert each allocation
realloc: do not copy if allocation failed
internal_memalign: do not assume real_malloc succeeds
ensure ENOMEM is preserved in errno when appropriate
memalign: check alignment on all public functions
reduce stack usage from file names
resolve real_malloc earlier for C++ programs
allow analyzing live allocations via Mwrap[location]
alias Mwrap.clear to Mwrap.reset
implement accessors for SourceLocation
mwrap_aref: quiet -Wshorten-64-to-32 warning
fixes for FreeBSD 11.1...
use memrchr to extract address under glibc
do not track allocations for constructor and Init_
disable memalign tracking by default
support Mwrap.quiet to temporarily disable allocation tracking
mwrap_rack: Rack app to track live allocations
documentation updates for 2.0.0 release
* [PATCH 12/19] implement accessors for SourceLocation
From: Eric Wong @ 2018-07-16 21:19 UTC
To: mwrap-public
Knowing the average/max lifespan (in terms of GC.count) of
allocations can be useful.
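The arithmetic this patch adds is simple: each free records the
allocation's age (current GC generation minus the generation at
allocation time) into age_total, and the mean lifespan is
age_total / frees, with infinity when nothing has been freed yet
(the HUGE_VAL case in src_loc_mean_lifespan). A Ruby sketch of the
same bookkeeping:

```ruby
# Mirrors the counters this patch adds to struct src_loc.
Loc = Struct.new(:frees, :age_total, :max_lifespan) do
  # Called on each free with the allocation's age in GC generations.
  def record_free(age)
    self.frees += 1
    self.age_total += age
    self.max_lifespan = age if age > max_lifespan
  end

  # age_total / frees; Infinity if nothing freed (HUGE_VAL in the C code)
  def mean_lifespan
    frees.zero? ? Float::INFINITY : age_total.fdiv(frees)
  end
end

l = Loc.new(0, 0, 0)
l.mean_lifespan               # => Infinity
[3, 7, 2].each { |age| l.record_free(age) }
l.mean_lifespan               # => 4.0
l.max_lifespan                # => 7
```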
---
ext/mwrap/mwrap.c | 146 ++++++++++++++++++++++++++++++++++++---------
test/test_mwrap.rb | 48 +++++++++++++++
2 files changed, 165 insertions(+), 29 deletions(-)
diff --git a/ext/mwrap/mwrap.c b/ext/mwrap/mwrap.c
index 729e3a8..7455c54 100644
--- a/ext/mwrap/mwrap.c
+++ b/ext/mwrap/mwrap.c
@@ -186,8 +186,11 @@ static int has_ec_p(void)
/* allocated via real_malloc/real_free */
struct src_loc {
pthread_mutex_t *mtx;
- size_t calls;
size_t total;
+ size_t allocations;
+ size_t frees;
+ size_t age_total; /* (age_total / frees) => mean age at free */
+ size_t max_lifespan;
struct cds_lfht_node hnode;
struct cds_list_head allocs; /* <=> alloc_hdr.node */
uint32_t hval;
@@ -258,14 +261,17 @@ again:
if (cur) {
l = caa_container_of(cur, struct src_loc, hnode);
uatomic_add(&l->total, k->total);
- uatomic_add(&l->calls, 1);
+ uatomic_add(&l->allocations, 1);
} else {
size_t n = loc_size(k);
l = real_malloc(sizeof(*l) + n);
if (!l) goto out_unlock;
memcpy(l, k, sizeof(*l) + n);
l->mtx = mutex_assign();
- l->calls = 1;
+ l->age_total = 0;
+ l->max_lifespan = 0;
+ l->frees = 0;
+ l->allocations = 1;
CDS_INIT_LIST_HEAD(&l->allocs);
cur = cds_lfht_add_unique(t, k->hval, loc_eq, l, &l->hnode);
if (cur != &l->hnode) { /* lost race */
@@ -345,13 +351,22 @@ void free(void *p)
{
if (p) {
struct alloc_hdr *h = ptr2hdr(p);
+ struct src_loc *l = h->as.live.loc;
if (!real_free) return; /* oh well, leak a little */
- if (h->as.live.loc) {
+ if (l) {
+ size_t age = generation - h->as.live.gen;
+
uatomic_set(&h->size, 0);
- mutex_lock(h->as.live.loc->mtx);
+ uatomic_add(&l->frees, 1);
+ uatomic_add(&l->age_total, age);
+
+ mutex_lock(l->mtx);
cds_list_del_rcu(&h->anode);
- mutex_unlock(h->as.live.loc->mtx);
+ if (age > l->max_lifespan)
+ l->max_lifespan = age;
+ mutex_unlock(l->mtx);
+
call_rcu(&h->as.dead, free_hdr_rcu);
}
else {
@@ -621,7 +636,7 @@ static void *dump_to_file(void *x)
p = s[0];
}
fprintf(a->fp, "%16zu %12zu %s\n",
- l->total, l->calls, (const char *)p);
+ l->total, l->allocations, (const char *)p);
if (s) free(s);
}
out_unlock:
@@ -676,7 +691,10 @@ static void *totals_reset(void *ign)
t = rcu_dereference(totals);
cds_lfht_for_each_entry(t, &iter, l, hnode) {
uatomic_set(&l->total, 0);
- uatomic_set(&l->calls, 0);
+ uatomic_set(&l->allocations, 0);
+ uatomic_set(&l->frees, 0);
+ uatomic_set(&l->age_total, 0);
+ uatomic_set(&l->max_lifespan, 0);
}
rcu_read_unlock();
return 0;
@@ -710,6 +728,27 @@ static VALUE rcu_unlock_ensure(VALUE ignored)
return Qfalse;
}
+static VALUE location_string(struct src_loc *l)
+{
+ VALUE ret, tmp;
+
+ if (loc_is_addr(l)) {
+ char **s = backtrace_symbols((void *)l->k, 1);
+ tmp = rb_str_new_cstr(s[0]);
+ free(s);
+ }
+ else {
+ tmp = rb_str_new(l->k, l->capa - 1);
+ }
+
+ /* deduplicate and try to free up some memory */
+ ret = rb_funcall(tmp, id_uminus, 0);
+ if (!OBJ_FROZEN_RAW(tmp))
+ rb_str_resize(tmp, 0);
+
+ return ret;
+}
+
static VALUE dump_each_rcu(VALUE x)
{
struct dump_arg *a = (struct dump_arg *)x;
@@ -719,27 +758,17 @@ static VALUE dump_each_rcu(VALUE x)
t = rcu_dereference(totals);
cds_lfht_for_each_entry(t, &iter, l, hnode) {
- VALUE v[3];
+ VALUE v[6];
if (l->total <= a->min) continue;
- if (loc_is_addr(l)) {
- char **s = backtrace_symbols((void *)l->k, 1);
- v[1] = rb_str_new_cstr(s[0]);
- free(s);
- }
- else {
- v[1] = rb_str_new(l->k, l->capa - 1);
- }
-
- /* deduplicate and try to free up some memory */
- v[0] = rb_funcall(v[1], id_uminus, 0);
- if (!OBJ_FROZEN_RAW(v[1]))
- rb_str_resize(v[1], 0);
-
+ v[0] = location_string(l);
v[1] = SIZET2NUM(l->total);
- v[2] = SIZET2NUM(l->calls);
+ v[2] = SIZET2NUM(l->allocations);
+ v[3] = SIZET2NUM(l->frees);
+ v[4] = SIZET2NUM(l->age_total);
+ v[5] = SIZET2NUM(l->max_lifespan);
- rb_yield_values2(3, v);
+ rb_yield_values2(6, v);
assert(rcu_read_ongoing());
}
return Qnil;
@@ -748,10 +777,12 @@ static VALUE dump_each_rcu(VALUE x)
/*
* call-seq:
*
- * Mwrap.each([min]) { |location,total_bytes,call_count| ... }
+ * Mwrap.each([min]) do |location,total,allocations,frees,age_total,max_lifespan|
+ * ...
+ * end
*
* Yields each entry of the of the table to a caller-supplied block.
- * +min+ may be specified to filter out lines with +total_bytes+
+ * +min+ may be specified to filter out lines with +total+ bytes
* equal-to-or-smaller-than the supplied minimum.
*/
static VALUE mwrap_each(int argc, VALUE * argv, VALUE mod)
@@ -855,6 +886,14 @@ static VALUE src_loc_each_i(VALUE p)
return Qfalse;
}
+static struct src_loc *src_loc_get(VALUE self)
+{
+ struct src_loc *l;
+ TypedData_Get_Struct(self, struct src_loc, &src_loc_type, l);
+ assert(l);
+ return l;
+}
+
/*
* call-seq:
* loc = Mwrap[location]
@@ -867,8 +906,7 @@ static VALUE src_loc_each_i(VALUE p)
*/
static VALUE src_loc_each(VALUE self)
{
- struct src_loc *l;
- TypedData_Get_Struct(self, struct src_loc, &src_loc_type, l);
+ struct src_loc *l = src_loc_get(self);
assert(locating == 0 && "forgot to clear locating");
++locating;
@@ -877,6 +915,50 @@ static VALUE src_loc_each(VALUE self)
return self;
}
+static VALUE src_loc_mean_lifespan(VALUE self)
+{
+ struct src_loc *l = src_loc_get(self);
+ size_t tot, frees;
+
+ frees = uatomic_read(&l->frees);
+ tot = uatomic_read(&l->age_total);
+ return DBL2NUM(frees ? ((double)tot/(double)frees) : HUGE_VAL);
+}
+
+static VALUE src_loc_frees(VALUE self)
+{
+ return SIZET2NUM(uatomic_read(&src_loc_get(self)->frees));
+}
+
+static VALUE src_loc_allocations(VALUE self)
+{
+ return SIZET2NUM(uatomic_read(&src_loc_get(self)->allocations));
+}
+
+static VALUE src_loc_total(VALUE self)
+{
+ return SIZET2NUM(uatomic_read(&src_loc_get(self)->total));
+}
+
+static VALUE src_loc_max_lifespan(VALUE self)
+{
+ return SIZET2NUM(uatomic_read(&src_loc_get(self)->max_lifespan));
+}
+
+/*
+ * Returns a frozen String location of the given SourceLocation object.
+ */
+static VALUE src_loc_name(VALUE self)
+{
+ struct src_loc *l = src_loc_get(self);
+ VALUE ret;
+
+ ++locating;
+ ret = location_string(l);
+ --locating;
+ return ret;
+}
+
/*
* Document-module: Mwrap
*
@@ -911,6 +993,12 @@ void Init_mwrap(void)
rb_define_singleton_method(mod, "each", mwrap_each, -1);
rb_define_singleton_method(mod, "[]", mwrap_aref, 1);
rb_define_method(cSrcLoc, "each", src_loc_each, 0);
+ rb_define_method(cSrcLoc, "frees", src_loc_frees, 0);
+ rb_define_method(cSrcLoc, "allocations", src_loc_allocations, 0);
+ rb_define_method(cSrcLoc, "total", src_loc_total, 0);
+ rb_define_method(cSrcLoc, "mean_lifespan", src_loc_mean_lifespan, 0);
+ rb_define_method(cSrcLoc, "max_lifespan", src_loc_max_lifespan, 0);
+ rb_define_method(cSrcLoc, "name", src_loc_name, 0);
}
/* rb_cloexec_open isn't usable by non-Ruby processes */
diff --git a/test/test_mwrap.rb b/test/test_mwrap.rb
index 686d87d..2234f7d 100644
--- a/test/test_mwrap.rb
+++ b/test/test_mwrap.rb
@@ -203,4 +203,52 @@ class TestMwrap < Test::Unit::TestCase
def test_mwrap_dump_check
assert_raise(TypeError) { Mwrap.dump(:bogus) }
end
+
+ def assert_separately(src, *opts)
+ Tempfile.create(%w(mwrap .rb)) do |tmp|
+ tmp.write(src.lstrip!)
+ tmp.flush
+ assert(system(@@env, *@@cmd, tmp.path, *opts))
+ end
+ end
+
+ def test_source_location
+ assert_separately(+"#{<<~"begin;"}\n#{<<~'end;'}")
+ begin;
+ require 'mwrap'
+ foo = '0' * 10000
+ k = -"#{__FILE__}:2"
+ loc = Mwrap[k]
+ loc.name == k or abort 'SourceLocation#name broken'
+ loc.total >= 10000 or abort 'SourceLocation#total broken'
+ loc.frees == 0 or abort 'SourceLocation#frees broken'
+ loc.allocations == 1 or abort 'SourceLocation#allocations broken'
+ seen = false
+ loc.each do |*x| seen = x end
seen[1] == loc.total or abort 'SourceLocation#each broken'
+ foo.clear
+
+ # wait for call_rcu to perform real_free
+ freed = false
+ until freed
+ freed = true
+ loc.each do freed = false end
+ end
+ loc.frees == 1 or abort 'SourceLocation#frees broken (after free)'
+ Float === loc.mean_lifespan or abort 'mean_lifespan broken'
+ Integer === loc.max_lifespan or abort 'max_lifespan broken'
+
+ addr = false
+ Mwrap.each do |a,|
+ if a =~ /\[0x[a-f0-9]+\]/
+ addr = a
+ break
+ end
+ end
+ addr.frozen? or abort 'Mwrap.each returned unfrozen address'
+ loc = Mwrap[addr] or abort "Mwrap[#{addr}] broken"
addr == loc.name or abort 'SourceLocation#name broken on address'
+ loc.name.frozen? or abort 'SourceLocation#name not frozen'
+ end;
+ end
end
--
EW
* [PATCH 0/19] the heavy version of mwrap
From: Eric Wong @ 2018-07-16 21:19 UTC
To: mwrap-public
TL;DR: live demo of the new features running inside a Rack app:
https://80x24.org/MWRAP/each/2000
The following changes since commit 834de3bc0da4af53535d5c9d4975e546df9fb186:
bin/mwrap: support LISTEN_FDS env from systemd (2018-07-16 19:33:12 +0000)
are available in the Git repository at:
https://80x24.org/mwrap.git heavy
for you to fetch changes up to c432e3ad30aa247dbac8575af87b0c594365d3fd:
mwrap_rack: Rack app to track live allocations (2018-07-16 21:14:13 +0000)
----------------------------------------------------------------
Eric Wong (19):
support per-allocation headers for per-alloc tracking
mwrap: use malloc to do our own memalign
hold RCU read lock to insert each allocation
realloc: do not copy if allocation failed
internal_memalign: do not assume real_malloc succeeds
ensure ENOMEM is preserved in errno when appropriate
memalign: check alignment on all public functions
reduce stack usage from file names
resolve real_malloc earlier for C++ programs
allow analyzing live allocations via Mwrap[location]
alias Mwrap.clear to Mwrap.reset
implement accessors for SourceLocation
mwrap_aref: quiet -Wshorten-64-to-32 warning
fixes for FreeBSD 11.1...
use memrchr to extract address under glibc
do not track allocations for constructor and Init_
disable memalign tracking by default
support Mwrap.quiet to temporarily disable allocation tracking
mwrap_rack: Rack app to track live allocations
ext/mwrap/extconf.rb | 15 +
ext/mwrap/mwrap.c | 792 +++++++++++++++++++++++++++++++++++++++++++--------
lib/mwrap_rack.rb | 105 +++++++
test/test_mwrap.rb | 113 ++++++++
4 files changed, 901 insertions(+), 124 deletions(-)
create mode 100644 lib/mwrap_rack.rb