mwrap user+dev discussion/patches/pulls/bugs/help
* [PATCH] mwrap 2.0.0 mwrap - LD_PRELOAD malloc wrapper for Ruby
@ 2018-07-20  9:25  7% Eric Wong
  0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2018-07-20  9:25 UTC (permalink / raw)
  To: ruby-talk, mwrap-public

mwrap is designed to answer the question:

   Which lines of Ruby are hitting malloc the most?

mwrap wraps all malloc-family calls to trace the Ruby source
location of such calls and bytes allocated at each callsite.
As of mwrap 2.0.0, it can also function as a leak detector
and show live allocations at every call site.  Depending on
your application and workload, the overhead is roughly a 50%
increase in memory use and runtime.

It works best for allocations made under the GVL, but it tries
to track numeric caller addresses for allocations made without
the GVL, so you can get an idea of how much memory certain
extensions and native libraries use.

It requires the concurrent lock-free hash table from the
Userspace RCU project: https://liburcu.org/

It does not require recompiling or rebuilding Ruby, but only
supports Ruby trunk (2.6.0dev+) on a few platforms:

* GNU/Linux
* FreeBSD (tested 11.1)

It may work on NetBSD, OpenBSD and DragonFly BSD.


Changes in 2.0.0:

This release includes significant changes to track live
allocations and frees.  It can find memory leaks from malloc
with less overhead than valgrind's leak checker, and there is a
new Rack endpoint (MwrapRack) which can display live allocation
stats.

API additions:

* Mwrap#[] - https://80x24.org/mwrap/Mwrap.html#method-c-5B-5D
* Mwrap::SourceLocation - https://80x24.org/mwrap/Mwrap/SourceLocation.html
* MwrapRack - https://80x24.org/mwrap/MwrapRack.html
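Putting the first two additions together, a hedged sketch (the
"file:line" key and line number are illustrative, and the lookup
only does anything when the process runs under mwrap):

```ruby
# Mwrap[] returns a Mwrap::SourceLocation for an exact "file:line"
# key, or nil if that location never allocated.
stats = nil
if defined?(Mwrap)
  loc = Mwrap["#{__FILE__}:42"] # illustrative key, not a known hotspot
  # accessors added in this release: name, total, allocations, frees
  stats = [loc.name, loc.total, loc.allocations, loc.frees] if loc
end
# stats stays nil outside mwrap, or when the key never allocated
```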

Incompatible changes:

* Mwrap.clear is now an alias for Mwrap.reset, as it's unsafe
  to implement the new Mwrap#[] API otherwise:
  https://80x24.org/mwrap-public/20180716211933.5835-12-e@80x24.org/

26 changes since v1.0.0:

      README: improve usage example
      MANIFEST: add .document
      add benchmark
      use __attribute__((weak)) instead of dlsym
      Mwrap.dump: do not segfault on invalid IO arg
      bin/mwrap: support LISTEN_FDS env from systemd
      support per-allocation headers for per-alloc tracking
      mwrap: use malloc to do our own memalign
      hold RCU read lock to insert each allocation
      realloc: do not copy if allocation failed
      internal_memalign: do not assume real_malloc succeeds
      ensure ENOMEM is preserved in errno when appropriate
      memalign: check alignment on all public functions
      reduce stack usage from file names
      resolve real_malloc earlier for C++ programs
      allow analyzing live allocations via Mwrap[location]
      alias Mwrap.clear to Mwrap.reset
      implement accessors for SourceLocation
      mwrap_aref: quiet -Wshorten-64-to-32 warning
      fixes for FreeBSD 11.1...
      use memrchr to extract address under glibc
      do not track allocations for constructor and Init_
      disable memalign tracking by default
      support Mwrap.quiet to temporarily disable allocation tracking
      mwrap_rack: Rack app to track live allocations
      documentation updates for 2.0.0 release


* [PATCH 12/19] implement accessors for SourceLocation
  2018-07-16 21:19  6% [PATCH 0/19] the heavy version of mwrap Eric Wong
@ 2018-07-16 21:19  4% ` Eric Wong
  0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2018-07-16 21:19 UTC (permalink / raw)
  To: mwrap-public

Knowing the average/max lifespan (in terms of GC.count) of
allocations can be useful.
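The accounting added below can be mirrored in plain Ruby to show
the intended math (illustrative names only; ages are measured in
GC.count ticks, matching the generation counter in the C code):

```ruby
# Each free records age = generation_at_free - generation_at_alloc.
# age_total accumulates those ages; the mean lifespan is
# age_total / frees, reported as Infinity while nothing was freed.
def mean_lifespan(age_total, frees)
  frees.zero? ? Float::INFINITY : age_total.to_f / frees
end

ages = [3, 5, 10]          # per-allocation lifespans at one callsite
age_total    = ages.sum    # 18
max_lifespan = ages.max    # 10
mean_lifespan(age_total, ages.size) # => 6.0
mean_lifespan(0, 0)                 # => Infinity
```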
---
 ext/mwrap/mwrap.c  | 146 ++++++++++++++++++++++++++++++++++++---------
 test/test_mwrap.rb |  48 +++++++++++++++
 2 files changed, 165 insertions(+), 29 deletions(-)

diff --git a/ext/mwrap/mwrap.c b/ext/mwrap/mwrap.c
index 729e3a8..7455c54 100644
--- a/ext/mwrap/mwrap.c
+++ b/ext/mwrap/mwrap.c
@@ -186,8 +186,11 @@ static int has_ec_p(void)
 /* allocated via real_malloc/real_free */
 struct src_loc {
 	pthread_mutex_t *mtx;
-	size_t calls;
 	size_t total;
+	size_t allocations;
+	size_t frees;
+	size_t age_total; /* (age_total / frees) => mean age at free */
+	size_t max_lifespan;
 	struct cds_lfht_node hnode;
 	struct cds_list_head allocs; /* <=> alloc_hdr.node */
 	uint32_t hval;
@@ -258,14 +261,17 @@ again:
 	if (cur) {
 		l = caa_container_of(cur, struct src_loc, hnode);
 		uatomic_add(&l->total, k->total);
-		uatomic_add(&l->calls, 1);
+		uatomic_add(&l->allocations, 1);
 	} else {
 		size_t n = loc_size(k);
 		l = real_malloc(sizeof(*l) + n);
 		if (!l) goto out_unlock;
 		memcpy(l, k, sizeof(*l) + n);
 		l->mtx = mutex_assign();
-		l->calls = 1;
+		l->age_total = 0;
+		l->max_lifespan = 0;
+		l->frees = 0;
+		l->allocations = 1;
 		CDS_INIT_LIST_HEAD(&l->allocs);
 		cur = cds_lfht_add_unique(t, k->hval, loc_eq, l, &l->hnode);
 		if (cur != &l->hnode) { /* lost race */
@@ -345,13 +351,22 @@ void free(void *p)
 {
 	if (p) {
 		struct alloc_hdr *h = ptr2hdr(p);
+		struct src_loc *l = h->as.live.loc;
 
 		if (!real_free) return; /* oh well, leak a little */
-		if (h->as.live.loc) {
+		if (l) {
+			size_t age = generation - h->as.live.gen;
+
 			uatomic_set(&h->size, 0);
-			mutex_lock(h->as.live.loc->mtx);
+			uatomic_add(&l->frees, 1);
+			uatomic_add(&l->age_total, age);
+
+			mutex_lock(l->mtx);
 			cds_list_del_rcu(&h->anode);
-			mutex_unlock(h->as.live.loc->mtx);
+			if (age > l->max_lifespan)
+				l->max_lifespan = age;
+			mutex_unlock(l->mtx);
+
 			call_rcu(&h->as.dead, free_hdr_rcu);
 		}
 		else {
@@ -621,7 +636,7 @@ static void *dump_to_file(void *x)
 			p = s[0];
 		}
 		fprintf(a->fp, "%16zu %12zu %s\n",
-			l->total, l->calls, (const char *)p);
+			l->total, l->allocations, (const char *)p);
 		if (s) free(s);
 	}
 out_unlock:
@@ -676,7 +691,10 @@ static void *totals_reset(void *ign)
 	t = rcu_dereference(totals);
 	cds_lfht_for_each_entry(t, &iter, l, hnode) {
 		uatomic_set(&l->total, 0);
-		uatomic_set(&l->calls, 0);
+		uatomic_set(&l->allocations, 0);
+		uatomic_set(&l->frees, 0);
+		uatomic_set(&l->age_total, 0);
+		uatomic_set(&l->max_lifespan, 0);
 	}
 	rcu_read_unlock();
 	return 0;
@@ -710,6 +728,27 @@ static VALUE rcu_unlock_ensure(VALUE ignored)
 	return Qfalse;
 }
 
+static VALUE location_string(struct src_loc *l)
+{
+	VALUE ret, tmp;
+
+	if (loc_is_addr(l)) {
+		char **s = backtrace_symbols((void *)l->k, 1);
+		tmp = rb_str_new_cstr(s[0]);
+		free(s);
+	}
+	else {
+		tmp = rb_str_new(l->k, l->capa - 1);
+	}
+
+	/* deduplicate and try to free up some memory */
+	ret = rb_funcall(tmp, id_uminus, 0);
+	if (!OBJ_FROZEN_RAW(tmp))
+		rb_str_resize(tmp, 0);
+
+	return ret;
+}
+
 static VALUE dump_each_rcu(VALUE x)
 {
 	struct dump_arg *a = (struct dump_arg *)x;
@@ -719,27 +758,17 @@ static VALUE dump_each_rcu(VALUE x)
 
 	t = rcu_dereference(totals);
 	cds_lfht_for_each_entry(t, &iter, l, hnode) {
-		VALUE v[3];
+		VALUE v[6];
 		if (l->total <= a->min) continue;
 
-		if (loc_is_addr(l)) {
-			char **s = backtrace_symbols((void *)l->k, 1);
-			v[1] = rb_str_new_cstr(s[0]);
-			free(s);
-		}
-		else {
-			v[1] = rb_str_new(l->k, l->capa - 1);
-		}
-
-		/* deduplicate and try to free up some memory */
-		v[0] = rb_funcall(v[1], id_uminus, 0);
-		if (!OBJ_FROZEN_RAW(v[1]))
-			rb_str_resize(v[1], 0);
-
+		v[0] = location_string(l);
 		v[1] = SIZET2NUM(l->total);
-		v[2] = SIZET2NUM(l->calls);
+		v[2] = SIZET2NUM(l->allocations);
+		v[3] = SIZET2NUM(l->frees);
+		v[4] = SIZET2NUM(l->age_total);
+		v[5] = SIZET2NUM(l->max_lifespan);
 
-		rb_yield_values2(3, v);
+		rb_yield_values2(6, v);
 		assert(rcu_read_ongoing());
 	}
 	return Qnil;
@@ -748,10 +777,12 @@ static VALUE dump_each_rcu(VALUE x)
 /*
  * call-seq:
  *
- * 	Mwrap.each([min]) { |location,total_bytes,call_count| ... }
+ *	Mwrap.each([min]) do |location,total,allocations,frees,age_total,max_lifespan|
+ *	  ...
+ *	end
  *
 * Yields each entry of the table to a caller-supplied block.
- * +min+ may be specified to filter out lines with +total_bytes+
+ * +min+ may be specified to filter out lines with +total+ bytes
  * equal-to-or-smaller-than the supplied minimum.
  */
 static VALUE mwrap_each(int argc, VALUE * argv, VALUE mod)
@@ -855,6 +886,14 @@ static VALUE src_loc_each_i(VALUE p)
 	return Qfalse;
 }
 
+static struct src_loc *src_loc_get(VALUE self)
+{
+	struct src_loc *l;
+	TypedData_Get_Struct(self, struct src_loc, &src_loc_type, l);
+	assert(l);
+	return l;
+}
+
 /*
  * call-seq:
  *	loc = Mwrap[location]
@@ -867,8 +906,7 @@ static VALUE src_loc_each_i(VALUE p)
  */
 static VALUE src_loc_each(VALUE self)
 {
-	struct src_loc *l;
-	TypedData_Get_Struct(self, struct src_loc, &src_loc_type, l);
+	struct src_loc *l = src_loc_get(self);
 
 	assert(locating == 0 && "forgot to clear locating");
 	++locating;
@@ -877,6 +915,50 @@ static VALUE src_loc_each(VALUE self)
 	return self;
 }
 
+static VALUE src_loc_mean_lifespan(VALUE self)
+{
+	struct src_loc *l = src_loc_get(self);
+	size_t tot, frees;
+
+	frees = uatomic_read(&l->frees);
+	tot = uatomic_read(&l->age_total);
+	return DBL2NUM(frees ? ((double)tot/(double)frees) : HUGE_VAL);
+}
+
+static VALUE src_loc_frees(VALUE self)
+{
+	return SIZET2NUM(uatomic_read(&src_loc_get(self)->frees));
+}
+
+static VALUE src_loc_allocations(VALUE self)
+{
+	return SIZET2NUM(uatomic_read(&src_loc_get(self)->allocations));
+}
+
+static VALUE src_loc_total(VALUE self)
+{
+	return SIZET2NUM(uatomic_read(&src_loc_get(self)->total));
+}
+
+static VALUE src_loc_max_lifespan(VALUE self)
+{
+	return SIZET2NUM(uatomic_read(&src_loc_get(self)->max_lifespan));
+}
+
+/*
+ * Returns a frozen String location of the given SourceLocation object.
+ */
+static VALUE src_loc_name(VALUE self)
+{
+	struct src_loc *l = src_loc_get(self);
+	VALUE ret;
+
+	++locating;
+	ret = location_string(l);
+	--locating;
+	return ret;
+}
+
 /*
  * Document-module: Mwrap
  *
@@ -911,6 +993,12 @@ void Init_mwrap(void)
 	rb_define_singleton_method(mod, "each", mwrap_each, -1);
 	rb_define_singleton_method(mod, "[]", mwrap_aref, 1);
 	rb_define_method(cSrcLoc, "each", src_loc_each, 0);
+	rb_define_method(cSrcLoc, "frees", src_loc_frees, 0);
+	rb_define_method(cSrcLoc, "allocations", src_loc_allocations, 0);
+	rb_define_method(cSrcLoc, "total", src_loc_total, 0);
+	rb_define_method(cSrcLoc, "mean_lifespan", src_loc_mean_lifespan, 0);
+	rb_define_method(cSrcLoc, "max_lifespan", src_loc_max_lifespan, 0);
+	rb_define_method(cSrcLoc, "name", src_loc_name, 0);
 }
 
 /* rb_cloexec_open isn't usable by non-Ruby processes */
diff --git a/test/test_mwrap.rb b/test/test_mwrap.rb
index 686d87d..2234f7d 100644
--- a/test/test_mwrap.rb
+++ b/test/test_mwrap.rb
@@ -203,4 +203,52 @@ class TestMwrap < Test::Unit::TestCase
   def test_mwrap_dump_check
     assert_raise(TypeError) { Mwrap.dump(:bogus) }
   end
+
+  def assert_separately(src, *opts)
+    Tempfile.create(%w(mwrap .rb)) do |tmp|
+      tmp.write(src.lstrip!)
+      tmp.flush
+      assert(system(@@env, *@@cmd, tmp.path, *opts))
+    end
+  end
+
+  def test_source_location
+    assert_separately(+"#{<<~"begin;"}\n#{<<~'end;'}")
+    begin;
+      require 'mwrap'
+      foo = '0' * 10000
+      k = -"#{__FILE__}:2"
+      loc = Mwrap[k]
+      loc.name == k or abort 'SourceLocation#name broken'
+      loc.total >= 10000 or abort 'SourceLocation#total broken'
+      loc.frees == 0 or abort 'SourceLocation#frees broken'
+      loc.allocations == 1 or abort 'SourceLocation#allocations broken'
+      seen = false
+      loc.each do |*x| seen = x end
+      seen[1] == loc.total or abort 'SourceLocation#each broken'
+      foo.clear
+
+      # wait for call_rcu to perform real_free
+      freed = false
+      until freed
+        freed = true
+        loc.each do freed = false end
+      end
+      loc.frees == 1 or abort 'SourceLocation#frees broken (after free)'
+      Float === loc.mean_lifespan or abort 'mean_lifespan broken'
+      Integer === loc.max_lifespan or abort 'max_lifespan broken'
+
+      addr = false
+      Mwrap.each do |a,|
+        if a =~ /\[0x[a-f0-9]+\]/
+          addr = a
+          break
+        end
+      end
+      addr.frozen? or abort 'Mwrap.each returned unfrozen address'
+      loc = Mwrap[addr] or abort "Mwrap[#{addr}] broken"
+      addr == loc.name or abort 'SourceLocation#name broken for address'
+      loc.name.frozen? or abort 'SourceLocation#name not frozen'
+    end;
+  end
 end
-- 
EW



* [PATCH 0/19] the heavy version of mwrap
@ 2018-07-16 21:19  6% Eric Wong
  2018-07-16 21:19  4% ` [PATCH 12/19] implement accessors for SourceLocation Eric Wong
  0 siblings, 1 reply; 3+ results
From: Eric Wong @ 2018-07-16 21:19 UTC (permalink / raw)
  To: mwrap-public

TL;DR: live demo of the new features running inside a Rack app:

  https://80x24.org/MWRAP/each/2000
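
A config.ru sketch of how such a demo might be wired up (a guess
at the intended usage, not a verbatim copy: it assumes the server
is started under mwrap, e.g. `mwrap rackup config.ru`, and that
MwrapRack is the Rack app class added in lib/mwrap_rack.rb):

```ruby
# config.ru -- run as: mwrap rackup config.ru
require 'mwrap_rack'

# mount the live-allocation stats endpoint beside the real app
map('/MWRAP') { run MwrapRack.new }

# placeholder application
run ->(env) { [200, { 'Content-Type' => 'text/plain' }, ["hi\n"]] }
```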

The following changes since commit 834de3bc0da4af53535d5c9d4975e546df9fb186:

  bin/mwrap: support LISTEN_FDS env from systemd (2018-07-16 19:33:12 +0000)

are available in the Git repository at:

  https://80x24.org/mwrap.git heavy

for you to fetch changes up to c432e3ad30aa247dbac8575af87b0c594365d3fd:

  mwrap_rack: Rack app to track live allocations (2018-07-16 21:14:13 +0000)

----------------------------------------------------------------
Eric Wong (19):
      support per-allocation headers for per-alloc tracking
      mwrap: use malloc to do our own memalign
      hold RCU read lock to insert each allocation
      realloc: do not copy if allocation failed
      internal_memalign: do not assume real_malloc succeeds
      ensure ENOMEM is preserved in errno when appropriate
      memalign: check alignment on all public functions
      reduce stack usage from file names
      resolve real_malloc earlier for C++ programs
      allow analyzing live allocations via Mwrap[location]
      alias Mwrap.clear to Mwrap.reset
      implement accessors for SourceLocation
      mwrap_aref: quiet -Wshorten-64-to-32 warning
      fixes for FreeBSD 11.1...
      use memrchr to extract address under glibc
      do not track allocations for constructor and Init_
      disable memalign tracking by default
      support Mwrap.quiet to temporarily disable allocation tracking
      mwrap_rack: Rack app to track live allocations

 ext/mwrap/extconf.rb |  15 +
 ext/mwrap/mwrap.c    | 792 +++++++++++++++++++++++++++++++++++++++++++--------
 lib/mwrap_rack.rb    | 105 +++++++
 test/test_mwrap.rb   | 113 ++++++++
 4 files changed, 901 insertions(+), 124 deletions(-)
 create mode 100644 lib/mwrap_rack.rb





Code repositories for project(s) associated with this public inbox

	https://80x24.org/mwrap.git/
