That's where interesting developments happen, first.
---
 README | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README b/README
index f969c14..5770240 100644
--- a/README
+++ b/README
@@ -55,7 +55,7 @@ For long running processes, you can see the AF_UNIX HTTP interface:
 
 And connect via `curl --unix-socket /some/dir/$PID.sock' or
 `mwrap-rproxy(1p)<https://80x24.org/mwrap-perl.git/tree/script/mwrap-rproxy#n44>
-for more info.
+distributed with the Perl version of mwrap for more info.
 
 You may also `require "mwrap"' in your Ruby code and use
 Mwrap.dump, Mwrap.reset, Mwrap.each, etc.
@@ -113,7 +113,7 @@ top-posting costs everybody memory and bandwidth.
 
 Send all patches ("git format-patch" + "git send-email") and
 pull requests (use "git request-pull" to format) via email
-to mwrap-perl@80x24.org. We do not and will not use
+to mwrap-public@80x24.org. We do not and will not use
 proprietary messaging systems.
 
 == License
libdl is no longer needed since we no longer use dlsym(3).
libc is assumed, not sure what I was smoking when I added
an explicit check for that...
---
 ext/mwrap/extconf.rb | 2 --
 1 file changed, 2 deletions(-)

diff --git a/ext/mwrap/extconf.rb b/ext/mwrap/extconf.rb
index 3336548..f184b76 100644
--- a/ext/mwrap/extconf.rb
+++ b/ext/mwrap/extconf.rb
@@ -7,8 +7,6 @@ have_func 'mempcpy'
 have_library 'urcu-cds' or abort 'userspace RCU not installed'
 have_header 'urcu/rculfhash.h' or abort 'rculfhash.h not found'
 have_library 'urcu-bp' or abort 'liburcu-bp not found'
-have_library 'dl'
-have_library 'c'
 have_library 'execinfo' # FreeBSD
 $defs << '-DHAVE_XXHASH' if have_header('xxhash.h')
.dump_csv was added to struct dump_arg for the destructor,
but not initialized properly for the Mwrap.dump API call.
---
 ext/mwrap/mwrap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ext/mwrap/mwrap.c b/ext/mwrap/mwrap.c
index a45bb38..826ca92 100644
--- a/ext/mwrap/mwrap.c
+++ b/ext/mwrap/mwrap.c
@@ -27,7 +27,7 @@ extern VALUE __attribute__((weak)) rb_yield(VALUE);
 static VALUE mwrap_dump(int argc, VALUE *argv, VALUE mod)
 {
 	VALUE io, min;
-	struct dump_arg a;
+	struct dump_arg a = { .dump_csv = false };
 	rb_io_t *fptr;
 
 	rb_scan_args(argc, argv, "02", &io, &min);
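
For readers unfamiliar with why the one-line change above is enough: a C
designated initializer that names one member zero-initializes every
member it leaves out, whereas a bare `struct dump_arg a;' on the stack
leaves all members indeterminate.  A minimal sketch, assuming the three
fields shown in the mwrap_core.h patch; dump_arg_default() is a
hypothetical helper for illustration and is not part of mwrap:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* mirrors the fields of struct dump_arg shown in the mwrap_core.h patch */
struct dump_arg {
	FILE *fp;
	size_t min;
	bool dump_csv;
};

/*
 * Hypothetical helper (not in mwrap): builds a dump_arg the same way
 * the fixed mwrap_dump() declares its local variable.
 */
static struct dump_arg dump_arg_default(void)
{
	/* naming one member zero-initializes every member left out */
	struct dump_arg a = { .dump_csv = false };

	return a;
}
```

With this form, adding another member to the struct later does not
reintroduce an uninitialized read in callers that forget about it.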
I somehow forgot about the existence of the perror(3) function :x --- Documentation/mwrap.pod | 21 +++- README | 31 ++++-- ext/mwrap/httpd.h | 241 +++++++++++++++++++++------------------- ext/mwrap/mwrap_core.h | 72 +++++++++--- 4 files changed, 222 insertions(+), 143 deletions(-) diff --git a/Documentation/mwrap.pod b/Documentation/mwrap.pod index 6832430..a31bc1f 100644 --- a/Documentation/mwrap.pod +++ b/Documentation/mwrap.pod @@ -7,8 +7,8 @@ mwrap - run any command under mwrap # to trace a long-running program and access it via $DIR/$PID.sock: MWRAP=socket_dir:$DIR mwrap COMMAND - # to trace a short-lived command and dump its output to a log: - MWRAP=dump_path:$FILENAME mwrap COMMAND + # to trace a short-lived command and dump its output to a CSV: + MWRAP=dump_csv:$FILENAME mwrap COMMAND =head1 DESCRIPTION @@ -46,13 +46,28 @@ This may be changed via POST request (see below). Default: 0 +=item dump_csv:$FILENAME + +Dump CSV to the given filename. + +This output matches the HTTP server output and includes column headers, +but is subject to change in future releases. + +C<dump_csv> without the C<:> may also be used in conjunction with +C<dump_fd>, such as C<MWRAP=dump_fd:2,dump_csv>. + +As of mwrap 3.0, +C<$FILENAME> may contain C<%p> where C<%p> is a placeholder for +the PID being dumped. No other use of C<%> is accepted, and +multiple C<%> means all C<%> (including C<%p>) are handled as-is. + =item dump_path:$FILENAME Dumps the output at exit to a given filename: total_bytes call_count location -In the future, dumping to a self-describing CSV will be supported. +Expands C<%p> to the PID in C<$FILENAME> as described for C<dump_csv> =item dump_fd:$DESCRIPTOR diff --git a/README b/README index 761f87e..f969c14 100644 --- a/README +++ b/README @@ -16,20 +16,24 @@ numeric caller addresses for allocations made without GVL so you can get an idea of how much memory usage certain extensions and native libraries use. 
+As of 3.0, it also gives configurable C backtraces of all +dynamically-linked malloc callsites for any program where backtrace(3) +works, including programs not linked to Ruby. + It requires the concurrent lock-free hash table from the Userspace RCU project: https://liburcu.org/ It does not require recompiling or rebuilding Ruby, but only supports Ruby 2.7.0 or later on a few platforms: -* GNU/Linux (only tested --without-jemalloc, mwrap 3.x provides its own) +* GNU/Linux (only tested --without-jemalloc, mwrap 3.x provides its own malloc) It may work on FreeBSD, NetBSD, OpenBSD and DragonFly BSD if given appropriate build options. == Install - # Debian-based systems: apt-get liburcu-dev + # Debian-based systems: apt-get install liburcu-dev # Install mwrap via RubyGems.org gem install mwrap @@ -37,13 +41,21 @@ appropriate build options. == Usage mwrap works as an LD_PRELOAD and supplies a mwrap RubyGem executable to -improve ease-of-use. You can set dump_path: in the MWRAP environment -variable to append the results to a log file: +improve ease-of-use. You can set `dump_csv:' in the MWRAP environment +variable to append the results to a CSV file: + + MWRAP=dump_csv:/path/to/log mwrap RUBY_COMMAND + +(`dump_csv:' is new in mwrap 3.x, `dump_file:' from earlier versions is +still supported). - MWRAP=dump_path:/path/to/log mwrap RUBY_COMMAND +For long running processes, you can see the AF_UNIX HTTP interface: - # And to display the locations with the most allocations: - sort -k1,1rn </path/to/log | $PAGER + MWRAP=socket_dir:/some/dir mwrap COMMAND + +And connect via `curl --unix-socket /some/dir/$PID.sock' or +`mwrap-rproxy(1p)<https://80x24.org/mwrap-perl.git/tree/script/mwrap-rproxy#n44> +for more info. You may also `require "mwrap"' in your Ruby code and use Mwrap.dump, Mwrap.reset, Mwrap.each, etc. @@ -53,7 +65,10 @@ effect in tracking malloc use. 
However, it is safe to keep "require 'mwrap'" in performance-critical deployments, as overhead is only incurred when used as an LD_PRELOAD. -The output of the mwrap dump is a text file with 3 columns: +The output of `dump_csv:' is has self-describing columns and is +subject to change. SQLite 3.32+ can load it with: `.import --csv'. + +The output of the `dump_file:' output is a text file with 3 columns: total_bytes call_count location diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h index ef4d83c..8a105aa 100644 --- a/ext/mwrap/httpd.h +++ b/ext/mwrap/httpd.h @@ -221,7 +221,7 @@ static FILE *fbuf_init(struct mw_fbuf *fb) { fb->ptr = NULL; fb->fp = open_memstream(&fb->ptr, &fb->len); - if (!fb->fp) fprintf(stderr, "open_memstream: %m\n"); + if (!fb->fp) perror("open_memstream"); return fb->fp; } @@ -237,7 +237,7 @@ static int fbuf_close(struct mw_fbuf *fb) { int e = ferror(fb->fp) | fclose(fb->fp); fb->fp = NULL; - if (e) fprintf(stderr, "ferror|fclose: %m\n"); + if (e) perror("ferror|fclose"); return e; } @@ -279,7 +279,7 @@ static enum mw_qev h1_200(struct mw_h1 *h1, struct mw_fbuf *fb, const char *ct) */ off_t clen = ftello(fb->fp); if (clen < 0) { - fprintf(stderr, "ftello: %m\n"); + perror("ftello"); fbuf_close(fb); return h1_close(h1); } @@ -468,7 +468,7 @@ static off_t write_loc_name(FILE *fp, const struct src_loc *l) off_t beg = ftello(fp); if (beg < 0) { - fprintf(stderr, "ftello: %m\n"); + perror("ftello"); return beg; } if (l->f) { @@ -498,15 +498,17 @@ static off_t write_loc_name(FILE *fp, const struct src_loc *l) } off_t end = ftello(fp); if (end < 0) { - fprintf(stderr, "ftello: %m\n"); + perror("ftello"); return end; } return end - beg; } -static struct h1_src_loc *accumulate(unsigned long min, size_t *hslc, FILE *lp) +static struct h1_src_loc * +accumulate(struct mw_fbuf *lb, unsigned long min, size_t *hslc) { struct mw_fbuf fb; + if (!fbuf_init(lb)) return NULL; if (!fbuf_init(&fb)) return NULL; rcu_read_lock(); struct cds_lfht *t = 
CMM_LOAD_SHARED(totals); @@ -528,18 +530,23 @@ static struct h1_src_loc *accumulate(unsigned long min, size_t *hslc, FILE *lp) HUGE_VAL; hsl.max_life = uatomic_read(&l->max_lifespan); hsl.sl = l; - hsl.lname_len = write_loc_name(lp, l); + hsl.lname_len = write_loc_name(lb->fp, l); fwrite(&hsl, sizeof(hsl), 1, fb.fp); } rcu_read_unlock(); - struct h1_src_loc *hslv; - if (fbuf_close(&fb)) { - hslv = NULL; - } else { - *hslc = fb.len / sizeof(*hslv); - mwrap_assert((fb.len % sizeof(*hslv)) == 0); - hslv = (struct h1_src_loc *)fb.ptr; + if (fbuf_close(&fb) || fbuf_close(lb)) + return NULL; + + struct h1_src_loc *hslv = (struct h1_src_loc *)fb.ptr; + *hslc = fb.len / sizeof(*hslv); + mwrap_assert((fb.len % sizeof(*hslv)) == 0); + char *n = lb->ptr; + for (size_t i = 0; i < *hslc; ++i) { + hslv[i].loc_name = n; + n += hslv[i].lname_len; + if (hslv[i].lname_len < 0) + return NULL; } return hslv; } @@ -609,124 +616,128 @@ static enum mw_qev each_at(struct mw_h1 *h1, struct mw_h1req *h1r) return h1_200(h1, &html, TYPE_HTML); } -/* /$PID/each/$MIN endpoint */ -static enum mw_qev each_gt(struct mw_h1 *h1, struct mw_h1req *h1r, - unsigned long min, bool csv) -{ - static const char default_sort[] = "bytes"; - const char *sort; - size_t sort_len = 0; +typedef int (*cmp_fn)(const void *, const void *); - if (!csv) { - sort = default_sort; - sort_len = sizeof(default_sort) - 1; +static cmp_fn write_csv_header(FILE *fp, const char *sort, size_t sort_len) +{ + cmp_fn cmp = NULL; + for (size_t i = 0; i < CAA_ARRAY_SIZE(fields); i++) { + const char *fn = fields[i].fname; + if (i) + fputc(',', fp); + fputs(fn, fp); + if (fields[i].flen == sort_len && !memcmp(fn, sort, sort_len)) + cmp = fields[i].cmp; } + fputc('\n', fp); + return cmp; +} - if (h1r->qstr && h1r->qlen > 5 && !memcmp(h1r->qstr, "sort=", 5)) { - sort = h1r->qstr + 5; - sort_len = h1r->qlen - 5; +static void write_csv_data(FILE *fp, struct h1_src_loc *hslv, size_t hslc) +{ + for (size_t i = 0; i < hslc; i++) { + struct 
h1_src_loc *hsl = &hslv[i]; + + fprintf(fp, "%zu,%zu,%zu,%zu,%0.3f,%zu,", + hsl->bytes, hsl->allocations, hsl->frees, + hsl->live, hsl->mean_life, hsl->max_life); + write_q_csv(fp, hsl->loc_name, hsl->lname_len); + fputc('\n', fp); } +} - size_t hslc; +static void *write_csv(FILE *fp, size_t min, const char *sort, size_t sort_len) +{ AUTO_CLOFREE struct mw_fbuf lb; - if (!fbuf_init(&lb)) return h1_close(h1); - AUTO_FREE struct h1_src_loc *hslv = accumulate(min, &hslc, lb.fp); - if (!hslv) - return h1_close(h1); + size_t hslc; + AUTO_FREE struct h1_src_loc *hslv = accumulate(&lb, min, &hslc); + if (!hslv) return NULL; - if (fbuf_close(&lb)) - return h1_close(h1); + cmp_fn cmp = write_csv_header(fp, sort, sort_len); + if (cmp) + qsort(hslv, hslc, sizeof(*hslv), cmp); + write_csv_data(fp, hslv, hslc); + return fp; +} - char *n = lb.ptr; - for (size_t i = 0; i < hslc; ++i) { - hslv[i].loc_name = n; - n += hslv[i].lname_len; - if (hslv[i].lname_len < 0) - return h1_close(h1); +/* /$PID/each/$MIN endpoint */ +static enum mw_qev each_gt(struct mw_h1 *h1, struct mw_h1req *h1r, + size_t min, bool csv) +{ + static const char default_sort[] = "bytes"; + const char *sort = csv ? NULL : default_sort; + size_t sort_len = csv ? 
0 : (sizeof(default_sort) - 1); + + if (h1r->qstr && h1r->qlen > 5 && !memcmp(h1r->qstr, "sort=", 5)) { + sort = h1r->qstr + 5; + sort_len = h1r->qlen - 5; } struct mw_fbuf bdy; FILE *fp = wbuf_init(&bdy); if (!fp) return h1_close(h1); - - if (!csv) { - unsigned depth = (unsigned)CMM_LOAD_SHARED(bt_req_depth); - fprintf(fp, "<html><head><title>mwrap each >%lu" - "</title></head><body><p>mwrap each >%lu " - "(change `%lu' in URL to adjust filtering) - " - "MWRAP=bt:%u <a href=\"%lu.csv\">.csv</a>", - min, min, min, depth, min); - show_stats(fp); - /* need borders to distinguish multi-level traces */ - if (depth) - FPUTS("<table\nborder=1><tr>", fp); - else /* save screen space if only tracing one line */ - FPUTS("<table><tr>", fp); + if (csv) { + if (write_csv(fp, min, sort, sort_len)) + return h1_200(h1, &bdy, TYPE_CSV); + return h1_close(h1); } - int (*cmp)(const void *, const void *) = NULL; - if (csv) { - for (size_t i = 0; i < CAA_ARRAY_SIZE(fields); i++) { - const char *fn = fields[i].fname; - if (i) - fputc(',', fp); - fputs(fn, fp); - if (fields[i].flen == sort_len && - !memcmp(fn, sort, sort_len)) - cmp = fields[i].cmp; - } - fputc('\n', fp); - } else { - for (size_t i = 0; i < CAA_ARRAY_SIZE(fields); i++) { - const char *fn = fields[i].fname; - FPUTS("<th>", fp); - if (fields[i].flen == sort_len && - !memcmp(fn, sort, sort_len)) { - cmp = fields[i].cmp; - fprintf(fp, "<b>%s</b>", fields[i].fname); - } else { - fprintf(fp, "<a\nhref=\"./%lu?sort=%s\">%s</a>", - min, fn, fn); - } - FPUTS("</th>", fp); + size_t hslc; + AUTO_CLOFREE struct mw_fbuf lb; + AUTO_FREE struct h1_src_loc *hslv = accumulate(&lb, min, &hslc); + if (!hslv) + return h1_close(h1); + + unsigned depth = (unsigned)CMM_LOAD_SHARED(bt_req_depth); + fprintf(fp, "<html><head><title>mwrap each >%lu" + "</title></head><body><p>mwrap each >%lu " + "(change `%lu' in URL to adjust filtering) - " + "MWRAP=bt:%u <a href=\"%lu.csv\">.csv</a>", + min, min, min, depth, min); + show_stats(fp); + /* need 
borders to distinguish multi-level traces */ + if (depth) + FPUTS("<table\nborder=1><tr>", fp); + else /* save screen space if only tracing one line */ + FPUTS("<table><tr>", fp); + cmp_fn cmp = NULL; + for (size_t i = 0; i < CAA_ARRAY_SIZE(fields); i++) { + const char *fn = fields[i].fname; + FPUTS("<th>", fp); + if (fields[i].flen == sort_len && + !memcmp(fn, sort, sort_len)) { + cmp = fields[i].cmp; + fprintf(fp, "<b>%s</b>", fields[i].fname); + } else { + fprintf(fp, "<a\nhref=\"./%lu?sort=%s\">%s</a>", + min, fn, fn); } + FPUTS("</th>", fp); } - if (!csv) - FPUTS("</tr>", fp); + FPUTS("</tr>", fp); if (cmp) qsort(hslv, hslc, sizeof(*hslv), cmp); - else if (!csv) + else FPUTS("<tr><td>sort= not understood</td></tr>", fp); - if (csv) { - for (size_t i = 0; i < hslc; i++) { - struct h1_src_loc *hsl = &hslv[i]; - fprintf(fp, "%zu,%zu,%zu,%zu,%0.3f,%zu,", - hsl->bytes, hsl->allocations, hsl->frees, - hsl->live, hsl->mean_life, hsl->max_life); - write_q_csv(fp, hsl->loc_name, hsl->lname_len); - fputc('\n', fp); - } - } else { - for (size_t i = 0; i < hslc; i++) { - struct h1_src_loc *hsl = &hslv[i]; + for (size_t i = 0; i < hslc; i++) { + struct h1_src_loc *hsl = &hslv[i]; - fprintf(fp, "<tr><td>%zu</td><td>%zu</td><td>%zu</td>" - "<td>%zu</td><td>%0.3f</td><td>%zu</td>", - hsl->bytes, hsl->allocations, hsl->frees, - hsl->live, hsl->mean_life, hsl->max_life); - FPUTS("<td><a\nhref=\"../at/", fp); + fprintf(fp, "<tr><td>%zu</td><td>%zu</td><td>%zu</td>" + "<td>%zu</td><td>%0.3f</td><td>%zu</td>", + hsl->bytes, hsl->allocations, hsl->frees, + hsl->live, hsl->mean_life, hsl->max_life); + FPUTS("<td><a\nhref=\"../at/", fp); - write_b64_url(fp, src_loc_hash_tip(hsl->sl), - src_loc_hash_len(hsl->sl)); + write_b64_url(fp, src_loc_hash_tip(hsl->sl), + src_loc_hash_len(hsl->sl)); - FPUTS("\">", fp); - write_html(fp, hsl->loc_name, hsl->lname_len); - FPUTS("</a></td></tr>", fp); - } - FPUTS("</table></body></html>", fp); + FPUTS("\">", fp); + write_html(fp, hsl->loc_name, 
hsl->lname_len); + FPUTS("</a></td></tr>", fp); } - return h1_200(h1, &bdy, csv ? TYPE_CSV : TYPE_HTML); + FPUTS("</table></body></html>", fp); + return h1_200(h1, &bdy, TYPE_HTML); } /* /$PID/ root endpoint */ @@ -781,7 +792,7 @@ static enum mw_qev h1_dispatch(struct mw_h1 *h1, struct mw_h1req *h1r) if ((c = PATH_SKIP(h1r, "/each/"))) { errno = 0; char *e; - unsigned long min = strtoul(c, &e, 10); + size_t min = (size_t)strtoul(c, &e, 10); if (!errno) { if (*e == ' ' || *e == '?') return each_gt(h1, h1r, min, false); @@ -857,7 +868,7 @@ static enum mw_qev h1_drain_input(struct mw_h1 *h1, struct mw_h1req *h1r, return h1_close(h1); default: /* ENOMEM, ENOBUFS, ... */ assert(errno != EBADF); - fprintf(stderr, "read: %m\n"); + perror("read"); return h1_close(h1); } } @@ -990,7 +1001,7 @@ static enum mw_qev h1_event_step(struct mw_h1 *h1, struct mw_h1d *h1d) if (!h1r) { h1r = h1d->shared_h1r = malloc(sizeof(*h1r)); if (!h1r) { - fprintf(stderr, "h1r malloc: %m\n"); + perror("h1r malloc"); return h1_close(h1); } } @@ -1034,7 +1045,7 @@ static enum mw_qev h1_event_step(struct mw_h1 *h1, struct mw_h1d *h1d) return h1_close(h1); default: /* ENOMEM, ENOBUFS, ... 
*/ assert(errno != EBADF); - fprintf(stderr, "read: %m\n"); + perror("read"); return h1_close(h1); } } @@ -1142,7 +1153,7 @@ static void h1d_unlink(struct mw_h1d *h1d, bool do_close) if (h1d->lfd < 0 || !h1d->pid_len) return; if (getsockname(h1d->lfd, &sa.any, &len) < 0) { - fprintf(stderr, "getsockname: %m\n"); + perror("getsockname"); return; } if (do_close) { /* only safe to close if thread isn't running */ @@ -1208,13 +1219,13 @@ static int h1d_init(struct mw_h1d *h1d, const char *menv) return fprintf(stderr, "unlink(%s): %m\n", sa.un.sun_path); h1d->lfd = socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0); if (h1d->lfd < 0) - return fprintf(stderr, "socket: %m\n"); + return perror("socket"), 1; if (bind(h1d->lfd, &sa.any, (socklen_t)sizeof(sa)) < 0) { - fprintf(stderr, "bind: %m\n"); + perror("bind"); goto close_fail; } if (listen(h1d->lfd, 1024) < 0) { - fprintf(stderr, "listen: %m\n"); + perror("listen"); goto close_fail; } h1d->alive = 1; /* runs in parent, before pthread_create */ diff --git a/ext/mwrap/mwrap_core.h b/ext/mwrap/mwrap_core.h index 827ee7b..a84cd6d 100644 --- a/ext/mwrap/mwrap_core.h +++ b/ext/mwrap/mwrap_core.h @@ -4,7 +4,9 @@ * Disclaimer: I don't really know my way around XS or Perl internals well */ #define _LGPL_SOURCE /* allows URCU to inline some stuff */ -#define _GNU_SOURCE +#ifndef _GNU_SOURCE +# define _GNU_SOURCE +#endif #include "mymalloc.h" /* includes dlmalloc_c.h */ #ifndef MWRAP_PERL # define MWRAP_PERL 0 @@ -19,9 +21,6 @@ # define MWRAP_BT_MAX 32 #endif -#ifndef _GNU_SOURCE -# define _GNU_SOURCE -#endif #include <execinfo.h> #include <stdio.h> #include <stdlib.h> @@ -389,11 +388,8 @@ static const COP *mwp_curcop(void) static const char *mw_perl_src_file_cstr(unsigned *lineno) { const COP *cop = mwp_curcop(); - if (!cop) return NULL; - const char *fn = CopFILE(cop); - if (!fn) return NULL; - *lineno = CopLINE(cop); - return fn; + *lineno = cop ? CopLINE(cop) : 0; + return cop ? 
CopFILE(cop) : NULL; } # define SRC_FILE_CSTR(lineno) mw_perl_src_file_cstr(lineno) #endif /* MWRAP_PERL */ @@ -735,6 +731,7 @@ enomem: struct dump_arg { FILE *fp; size_t min; + bool dump_csv; }; char **bt_syms(void * const *addrlist, uint32_t size) @@ -745,7 +742,7 @@ char **bt_syms(void * const *addrlist, uint32_t size) #else /* make FreeBSD look like glibc output: */ char **s = backtrace_symbols_fmt(addrlist, size, "%f(%n%D) [%a]"); #endif - if (!s) fprintf(stderr, "backtrace_symbols: %m\n"); + if (!s) perror("backtrace_symbols"); return s; } @@ -757,12 +754,16 @@ static void cleanup_free(void *any) free(*p); } +static void *write_csv(FILE *, size_t min, const char *sort, size_t sort_len); static void *dump_to_file(struct dump_arg *a) { struct cds_lfht_iter iter; struct src_loc *l; struct cds_lfht *t; + if (a->dump_csv) + return write_csv(a->fp, a->min, NULL, 0); + ++locating; rcu_read_lock(); t = CMM_LOAD_SHARED(totals); @@ -860,7 +861,7 @@ __attribute__ ((destructor)) static void mwrap_dtor(void) { const char *opt = getenv("MWRAP"); const char *modes[] = { "a", "a+", "w", "w+", "r+" }; - struct dump_arg a = { .min = 0 }; + struct dump_arg a = { .min = 0, .dump_csv = false }; size_t i; int dump_fd; char *dump_path; @@ -873,27 +874,64 @@ __attribute__ ((destructor)) static void mwrap_dtor(void) return; ++locating; - if ((dump_path = strstr(opt, "dump_path:")) && - (dump_path += sizeof("dump_path")) && - *dump_path) { + + /* parse dump_csv:$PATHNAME */ + if ((dump_path = strstr(opt, "dump_csv:"))) { + dump_path += sizeof("dump_csv"); + if (!*dump_path) + dump_path = NULL; + else + a.dump_csv = true; + } + if (!dump_path) { + /* parse dump_path:$PATHNAME */ + if ((dump_path = strstr(opt, "dump_path:"))) { + dump_path += sizeof("dump_path"); + if (!*dump_path) + dump_path = NULL; + } + } + if (dump_path) { char *end = strchr(dump_path, ','); char buf[PATH_MAX]; + AUTO_FREE char *pid_path = NULL; if (end) { mwrap_assert((end - dump_path) < (intptr_t)sizeof(buf)); 
end = mempcpy(buf, dump_path, end - dump_path); *end = 0; dump_path = buf; } + + /* %p => PID expansion (Linux core_pattern uses %p, too) */ + if ((s = strchr(dump_path, '%')) && s[1] == 'p' && + /* don't allow injecting extra formats: */ + !strchr(s + 2, '%')) { + s[1] = 'd'; /* s/%p/%d/ to make asprintf happy */ + int n = asprintf(&pid_path, dump_path, (int)getpid()); + if (n < 0) + fprintf(stderr, + "asprintf failed: %m, dumping to %s\n", + dump_path); + else + dump_path = pid_path; + } dump_fd = open(dump_path, O_CLOEXEC|O_WRONLY|O_APPEND|O_CREAT, 0666); if (dump_fd < 0) { fprintf(stderr, "open %s failed: %m\n", dump_path); goto out; } + } else { + s = strstr(opt, "dump_fd:"); + if (!s) + goto out; + if (!sscanf(s, "dump_fd:%d", &dump_fd)) + goto out; } - else if (!sscanf(opt, "dump_fd:%d", &dump_fd)) - goto out; + /* allow dump_csv standalone for dump_fd */ + if (!a.dump_csv && strstr(opt, "dump_csv")) + a.dump_csv = true; if ((s = strstr(opt, "dump_min:"))) sscanf(s, "dump_min:%zu", &a.min); @@ -1019,7 +1057,7 @@ __attribute__((constructor)) static void mwrap_ctor(void) h->real = h; call_rcu(&h->as.dead, free_hdr_rcu); } else - fprintf(stderr, "malloc: %m\n"); + perror("malloc"); h1d_start(); CHECK(int, 0, pthread_sigmask(SIG_SETMASK, &old, NULL));
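
The %p => PID rule from the patch above (expand only when exactly one
`%' is present and it is followed by `p', otherwise use the name as-is)
can be sketched in isolation.  This is an illustration and not mwrap's
code: expand_pid() is a hypothetical name, and it builds the result
with snprintf(3) and a fixed format instead of rewriting "%p" to "%d"
and calling asprintf(3) as the patch does.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mwrap: expand a single "%p" in
 * `tmpl' to `pid'.  Returns a malloc-ed string the caller must free,
 * or NULL on allocation failure.  Templates without "%p", or with
 * any additional '%', are copied as-is.
 */
static char *expand_pid(const char *tmpl, int pid)
{
	const char *s = strchr(tmpl, '%');
	size_t len = strlen(tmpl);
	char *out;

	if (s && s[1] == 'p' && !strchr(s + 2, '%')) {
		int pre = (int)(s - tmpl); /* bytes before the '%' */
		/* first pass only measures the result */
		int n = snprintf(NULL, 0, "%.*s%d%s", pre, tmpl, pid, s + 2);

		if (n < 0)
			return NULL;
		out = malloc((size_t)n + 1);
		if (out)
			snprintf(out, (size_t)n + 1, "%.*s%d%s",
				pre, tmpl, pid, s + 2);
		return out;
	}
	out = malloc(len + 1);
	if (out)
		memcpy(out, tmpl, len + 1); /* unchanged copy */
	return out;
}
```

Calling expand_pid("/tmp/mwrap-%p.csv", (int)getpid()) would give each
traced process its own output file; keeping the format string fixed
means the user-supplied path is never interpreted as a printf format.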
  $ gem install --pre mwrap

mwrap(1): https://80x24.org/mwrap.git/tree/Documentation/mwrap.pod
mwrap-rproxy(1p): https://80x24.org/mwrap-perl.git/tree/script/mwrap-rproxy#n44
(everything said in mwrap-rproxy(1p) for the `mwrap-perl' command
also applies to the Ruby `mwrap')

This contains many changes from the Perl port @
https://80x24.org/mwrap-perl.git
I'll probably make proper announcements for the Perl version on
other lists once I finish replacing cgit with something
mail-archives-aware.

* Built-in RCU-friendly version of dlmalloc, no more fragile
  dlsym(3m) resolution of malloc-family functions in the constructor

* Allocations are now backed by O_TMPFILE on $TMPDIR on modern Linux,
  since mwrap increases memory usage greatly and I needed to use it
  on a system where I needed more VM space but lacked the ability
  to add swap.

* Configurable C backtrace level via MWRAP=bt:$DEPTH where $DEPTH
  is a non-negative integer.  Be careful about increasing it, even a
  depth of 3-4 can be orders-of-magnitude more expensive in time and
  space.  This can be changed dynamically at runtime via local HTTP
  (see below).

* Embedded per-process local-socket-only HTTP server obsoletes
  MwrapRack when combined with mwrap-rproxy from the Perl dist
  (set `MWRAP=socket_dir:/dir/of/sockets')

  See https://80x24.org/mwrap-perl/20221210015518.272576-4-e@80x24.org/
  and the new mwrap(1) man page for more info

  It now supports downloading CSV (suitable for importing into
  sqlite 3.32.0+)

* License switched to GPL-3+ to be compatible with GNU binutils
  since we may take code from addr2line in the future.

* libxxhash supported if XXH3_64bits is available (minor speedup).

- Mwrap::HeapPageBody no longer supported since Ruby 3.1+ uses mmap(2)

- Ruby files longer than 16.7 million lines are no longer supported :P
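
To make the MWRAP=bt:$DEPTH option from the list above concrete, here
is a rough sketch of pulling the depth out of a comma-delimited MWRAP
value.  It is not mwrap's actual parser: parse_bt_depth() and BT_MAX
are hypothetical names, and clamping to 32 (the maximum given in
mwrap(1)) instead of rejecting larger values is an assumption.

```c
#include <stdio.h>
#include <string.h>

#define BT_MAX 32 /* maximum depth documented in mwrap(1) */

/*
 * Hypothetical parser, not mwrap's actual code: find "bt:$DEPTH" in a
 * comma-delimited MWRAP value and return the depth, or 0 (the
 * documented default) if it is absent or malformed.  Clamping to
 * BT_MAX is an assumption of this sketch.
 */
static unsigned parse_bt_depth(const char *opt)
{
	unsigned depth = 0;
	const char *s = opt;

	while (s && *s) {
		/* only match at the start of an option, not inside a path */
		if (!strncmp(s, "bt:", 3)) {
			if (sscanf(s, "bt:%u", &depth) != 1)
				depth = 0;
			break;
		}
		s = strchr(s, ','); /* skip to the next option, if any */
		if (s)
			s++;
	}
	return depth > BT_MAX ? BT_MAX : depth;
}
```

For example, parse_bt_depth("socket_dir:/tmp,bt:3") yields 3, and a
value with no bt: option yields the default of 0.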
`p' will deadlock even with `STDOUT.sync=true', apparently :<
---
 test/test_mwrap.rb | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/test/test_mwrap.rb b/test/test_mwrap.rb
index e04fab5..58c9743 100644
--- a/test/test_mwrap.rb
+++ b/test/test_mwrap.rb
@@ -126,33 +126,41 @@ class TestMwrap < Test::Unit::TestCase
 
   # some URCU flavors use USR1, ensure the one we choose does not
   def test_sigusr1_works
+    err = Tempfile.new('dump')
     cmd = @@cmd + %w(
       -e STDOUT.sync=true
-      -e trap(:USR1){p("HELLO_WORLD")}
+      -e trap(:USR1){STDOUT.syswrite("HELLO_WORLD\n")}
       -e END{Mwrap.dump}
-      -e puts -e STDIN.read)
+      -e puts("HI")
+      -e STDIN.read)
     IO.pipe do |r, w|
       IO.pipe do |r2, w2|
-        pid = spawn(@@env, *cmd, in: r2, out: w, err: '/dev/null')
+        pid = spawn(@@env, *cmd, in: r2, out: w, err: err)
         r2.close
         w.close
-        assert_equal "\n", r.gets
+        assert_equal "HI\n", r.gets, '#puts HI fired'
         buf = +''
         10.times { Process.kill(:USR1, pid) }
-        while r.wait_readable(0.1)
+        Thread.pass # sched_yield
+        while r.wait_readable(0.5)
           case tmp = r.read_nonblock(1000, exception: false)
-          when String
-            buf << tmp
-          when nil
-            break
+          when String; buf << tmp; break
+          when nil; break
+          else
+            warn "Unexpected read_nonblock result: #{tmp.inspect}"
           end
         end
-        w2.close
-        Process.wait(pid)
-        assert_predicate $?, :success?, $?.inspect
-        assert_equal(["\"HELLO_WORLD\"\n"], buf.split(/^/).uniq)
+        w2.close # break from STDERR.read
+        _, st = Process.wait2(pid)
+        warn "# buf=#{buf.inspect}" if $DEBUG
+        assert_predicate(st, :success?,
+          "#{st.inspect} is success buf=#{buf.inspect} "\
+          "err=#{err.rewind;err.read.inspect}")
+        assert_equal(["HELLO_WORLD\n"], buf.split(/^/).uniq)
       end
     end
+  ensure
+    err.close! if err
   end
 
   def test_reset
This makes the .CSV download discoverable so I don't
have to document it in the manpage \o/
---
 ext/mwrap/httpd.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h
index fe4fe2f..ef4d83c 100644
--- a/ext/mwrap/httpd.h
+++ b/ext/mwrap/httpd.h
@@ -654,7 +654,8 @@ static enum mw_qev each_gt(struct mw_h1 *h1, struct mw_h1req *h1r,
 		fprintf(fp, "<html><head><title>mwrap each >%lu"
 			"</title></head><body><p>mwrap each >%lu "
 			"(change `%lu' in URL to adjust filtering) - "
-			"MWRAP=bt:%u", min, min, min, depth);
+			"MWRAP=bt:%u <a href=\"%lu.csv\">.csv</a>",
+			min, min, min, depth, min);
 		show_stats(fp);
 		/* need borders to distinguish multi-level traces */
 		if (depth)
We don't want users being confused if an innocuous-looking line of
code allocates unexpectedly large values.
---
 ext/mwrap/httpd.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h
index 17fb187..fe4fe2f 100644
--- a/ext/mwrap/httpd.h
+++ b/ext/mwrap/httpd.h
@@ -600,7 +600,12 @@ static enum mw_qev each_at(struct mw_h1 *h1, struct mw_h1req *h1r)
 			size, h->as.live.gen, h->real);
 	}
 	rcu_read_unlock();
-	FPUTS("</table></body></html>", fp);
+	FPUTS("</table><pre>\nNotes:\n"
+"* 16344-byte (64-bit) or 16344-byte (32-bit) allocations in\n"
+"  Ruby <= 3.0 aligned to 0x4000 are likely for object heap slots.\n"
+"* 4080-byte allocations in Perl 5 are likely for arenas\n"
+"  (set via the PERL_ARENA_SIZE compile-time macro)"
+"</pre></body></html>", fp);
 	return h1_200(h1, &html, TYPE_HTML);
 }
Being edumacational :P

Eric Wong (2):
  httpd: add notes about arenas and object heaps
  httpd: add CSV download link to /each/$MIN HTML

 ext/mwrap/httpd.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
CSV output is intended to be loaded by something else (e.g.
SQLite, spreadsheet program, etc), so sorting it is likely a
waste of time.
---
 ext/mwrap/httpd.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h
index 5c3b83f..17fb187 100644
--- a/ext/mwrap/httpd.h
+++ b/ext/mwrap/httpd.h
@@ -609,8 +609,13 @@ static enum mw_qev each_gt(struct mw_h1 *h1, struct mw_h1req *h1r,
 				unsigned long min, bool csv)
 {
 	static const char default_sort[] = "bytes";
-	const char *sort = default_sort;
-	size_t sort_len = sizeof(default_sort) - 1;
+	const char *sort;
+	size_t sort_len = 0;
+
+	if (!csv) {
+		sort = default_sort;
+		sort_len = sizeof(default_sort) - 1;
+	}
 
 	if (h1r->qstr && h1r->qlen > 5 && !memcmp(h1r->qstr, "sort=", 5)) {
 		sort = h1r->qstr + 5;
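
The effect of leaving sort_len at 0 for CSV can be shown with a small
standalone sketch: a zero-length key never matches any field name, so
no comparator is found and qsort(3) is skipped.  The table layout
(fname, flen, cmp) follows the fields[] usage visible in httpd.h, but
struct row, the two comparators, find_cmp() and maybe_sort() are
hypothetical, and the descending order is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical stand-in for struct h1_src_loc */
struct row {
	size_t bytes;
	size_t allocations;
};

typedef int (*cmp_fn)(const void *, const void *);

/* descending by bytes (order is an assumption of this sketch) */
static int cmp_bytes(const void *a, const void *b)
{
	size_t x = ((const struct row *)a)->bytes;
	size_t y = ((const struct row *)b)->bytes;

	return (x < y) - (x > y);
}

/* descending by allocation count */
static int cmp_allocations(const void *a, const void *b)
{
	size_t x = ((const struct row *)a)->allocations;
	size_t y = ((const struct row *)b)->allocations;

	return (x < y) - (x > y);
}

static const struct {
	const char *fname;
	size_t flen;
	cmp_fn cmp;
} fields[] = {
	{ "bytes", 5, cmp_bytes },
	{ "allocations", 11, cmp_allocations },
};

/* returns NULL for unknown names and for sort_len == 0 */
static cmp_fn find_cmp(const char *sort, size_t sort_len)
{
	for (size_t i = 0; i < sizeof(fields) / sizeof(fields[0]); i++) {
		/* the length check runs first, so `sort' is never
		 * dereferenced when sort_len is 0 */
		if (fields[i].flen == sort_len &&
		    !memcmp(fields[i].fname, sort, sort_len))
			return fields[i].cmp;
	}
	return NULL;
}

/* sort only when a known field was requested */
static void maybe_sort(struct row *v, size_t n,
			const char *sort, size_t sort_len)
{
	cmp_fn cmp = find_cmp(sort, sort_len);

	if (cmp)
		qsort(v, n, sizeof(*v), cmp);
}
```

A CSV request without `?sort=' corresponds to maybe_sort(v, n, NULL, 0)
here and leaves the rows in hash table iteration order, while
`?sort=bytes' still sorts on demand.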
FreeBSD 12.x doesn't seem to work with the `pkg install'-ed ruby --- Documentation/.gitignore | 2 + Documentation/GNUmakefile | 63 +++++++++++++++++++ Documentation/mwrap.pod | 123 ++++++++++++++++++++++++++++++++++++++ MANIFEST | 4 +- README | 38 +++++++----- mwrap.gemspec | 11 +++- 6 files changed, 223 insertions(+), 18 deletions(-) create mode 100644 Documentation/.gitignore create mode 100644 Documentation/GNUmakefile create mode 100644 Documentation/mwrap.pod diff --git a/Documentation/.gitignore b/Documentation/.gitignore new file mode 100644 index 0000000..1b0e502 --- /dev/null +++ b/Documentation/.gitignore @@ -0,0 +1,2 @@ +mwrap.txt +mwrap.1 diff --git a/Documentation/GNUmakefile b/Documentation/GNUmakefile new file mode 100644 index 0000000..14480da --- /dev/null +++ b/Documentation/GNUmakefile @@ -0,0 +1,63 @@ +# Copyright (C) all contributors <mwrap-public@80x24.org> +# License: GPL-3.0+ <https://www.gnu.org/licenses/gpl-3.0.txt> +all:: + +INSTALL = install +POD2MAN = pod2man +VERSION := $(shell cd .. 
&& ./VERSION-GEN) +release := mwrap $(VERSION) +POD2MAN_OPTS = -v -r '$(release)' --stderr -d 1993-10-02 -c 'mwrap user manual' +pod2man = $(POD2MAN) $(POD2MAN_OPTS) +POD2TEXT = pod2text +POD2TEXT_OPTS = --stderr +pod2text = $(POD2TEXT) $(POD2TEXT_OPTS) + +m1 = mwrap +m5 = +m7 = + +man1 := $(addsuffix .1, $(m1)) +man5 := $(addsuffix .5, $(m5)) +man7 := $(addsuffix .7, $(m7)) + +all:: man + +man: $(man1) $(man5) $(man7) + +prefix ?= $(HOME) +mandir ?= $(prefix)/share/man +man1dir = $(mandir)/man1 +man5dir = $(mandir)/man5 +man7dir = $(mandir)/man7 + +gem-man: man + $(INSTALL) -d -m 755 ../man + test -z "$(man1)" || $(INSTALL) -m 644 $(man1) ../man + test -z "$(man5)" || $(INSTALL) -m 644 $(man5) ../man + test -z "$(man7)" || $(INSTALL) -m 644 $(man7) ../man + +install-man: man + $(INSTALL) -d -m 755 $(DESTDIR)$(man1dir) + $(INSTALL) -d -m 755 $(DESTDIR)$(man5dir) + test -z "$(man7)" || $(INSTALL) -d -m 755 $(DESTDIR)$(man7dir) + $(INSTALL) -m 644 $(man1) $(DESTDIR)$(man1dir) + $(INSTALL) -m 644 $(man5) $(DESTDIR)$(man5dir) + test -z "$(man7)" || $(INSTALL) -m 644 $(man7) $(DESTDIR)$(man7dir) + +%.1 %.5 %.7 : %.pod + $(pod2man) -s $(subst .,,$(suffix $@)) $< $@+ && mv $@+ $@ + +mantxt = $(addsuffix .txt, $(m1) $(m5) $(m7)) + +txt :: $(mantxt) + +all :: txt + +%.txt : %.pod + $(pod2text) $< $@+ + touch -r $< $@+ + mv $@+ $@ + +clean:: + $(RM) $(man1) $(man5) $(man7) + $(RM) $(addsuffix .txt.gz, $(m1) $(m5) $(m7)) diff --git a/Documentation/mwrap.pod b/Documentation/mwrap.pod new file mode 100644 index 0000000..6832430 --- /dev/null +++ b/Documentation/mwrap.pod @@ -0,0 +1,123 @@ +=head1 NAME + +mwrap - run any command under mwrap + +=head1 SYNOPSIS + + # to trace a long-running program and access it via $DIR/$PID.sock: + MWRAP=socket_dir:$DIR mwrap COMMAND + + # to trace a short-lived command and dump its output to a log: + MWRAP=dump_path:$FILENAME mwrap COMMAND + +=head1 DESCRIPTION + +mwrap is a command-line to automatically add mwrap.so as an +LD_PRELOAD for any 
command. It will resolve malloc-family calls +to a Ruby file and line number, and it can also provide a backtrace +of native (C/C++) functions for non-Ruby programs. + +=head1 ENVIRONMENT + +C<MWRAP> is the only environment variable read. It contains multiple +options delimited by C<,> with names and values delimited by C<:> + +=over 4 + +=item socket_dir:$DIR + +This launches an embedded HTTP server in each process and binds it +to C<$DIR/$PID.sock>. C<curl --unix-socket $DIR/$PID.sock> +or L<mwrap-rproxy(1p)> (from L<https://80x24.org/mwrap-perl.git>) may +be used to access various endpoints in the HTTP server. + +=item bt:$DEPTH + +The backtrace depth for L<backtrace(3)> in addition to the Perl +file and line number where C<$DEPTH> is a non-negative number. + +The maximum allowed value is 32, though values of 5 or less are +typically useful. Increasing this to even 2 or 3 can significantly +increase the amount of memory mwrap (and liburcu) itself uses. + +This is only useful in conjunction with C<socket_dir> + +This may be changed via POST request (see below). + +Default: 0 + +=item dump_path:$FILENAME + +Dumps the output at exit to a given filename: + + total_bytes call_count location + +In the future, dumping to a self-describing CSV will be supported. + +=item dump_fd:$DESCRIPTOR + +As with dump_path, but dumps the output to a given file descriptor. + +=back + +=head1 HTTP POST API + +In addition to the various GET endpoints linked via C<http://0/$PID/>, +there are some POST endpoints which are typically accessed via +C<curl --unix-socket $DIR/$PID.sock> + +=over 4 + +=item POST http://0/$PID/reset + +C<curl --unix-socket $DIR/$PID.sock -XPOST http://0/$PID/reset> + +Reset all internal counters. This is not done atomically and does +not release any memory. + +=item POST http://0/$PID/trim + +C<curl --unix-socket $DIR/$PID.sock -XPOST http://0/$PID/trim> + +Runs L<malloc_trim(3)> with a 0 pad value to release unused memory +back to the kernel. 
In our malloc implementation, this is done +lazily to avoid contention and does not happen unless sleeping threads. + +=item POST http://0/$PID/ctl + +Set various internal knobs. Currently, C<X-Mwrap-BT> is the +only knob supported: + +C<curl --unix-socket $DIR/$PID.sock -XPOST -HX-Mwrap-BT:1 http://0/$PID/ctl> + +Using the C<X-Mwrap-BT> header allows changing the aforementioned +C<bt:> value to a specified depth level. As with C<bt:>, only make small +adjustments as the memory cost can increase exponentially with each step. + +It is typically a good idea to reset (C<http://0/$PID/reset>) after changing +the depth on a running process. + +Headers other than C<X-Mwrap-BT> may be accepted in the future to +tweak other settings. + +=back + +=head1 CONTACT + +Feedback welcome via plain-text mail to L<mailto:mwrap-public@80x24.org> + +Mail archives are hosted at L<https://80x24.org/mwrap-public/> + +=head1 COPYRIGHT + +Copyright all contributors L<mailto:mwrap-public@80x24.org> + +License: GPL-3.0+ L<https://www.gnu.org/licenses/gpl-3.0.txt> + +Source code is at L<https://80x24.org/mwrap.git/> + +=head1 SEE ALSO + +L<mwrap-rproxy(1)>, L<Devel::Mwrap(3pm)>, L<https://80x24.org/mwrap-perl.git> + +=cut diff --git a/MANIFEST b/MANIFEST index 4c5be8b..8d4cdd6 100644 --- a/MANIFEST +++ b/MANIFEST @@ -2,6 +2,9 @@ .gitignore .olddoc.yml COPYING +Documentation/.gitignore +Documentation/GNUmakefile +Documentation/mwrap.pod MANIFEST README Rakefile @@ -24,4 +27,3 @@ mwrap.gemspec t/httpd.t t/test_common.perl test/test_mwrap.rb -lib/mwrap/version.rb diff --git a/README b/README index 382c5a0..761f87e 100644 --- a/README +++ b/README @@ -8,8 +8,8 @@ mwrap wraps all malloc-family calls to trace the Ruby source location of such calls and bytes allocated at each callsite. As of mwrap 2.0.0, it can also function as a leak detector and show live allocations at every call site. Depending on -your application and workload, the overhead is roughly a 50% -increase memory and runtime. 
+your application and workload, the overhead is roughly a 50-100% +increase memory and runtime with default settings. It works best for allocations under GVL, but tries to track numeric caller addresses for allocations made without GVL so you @@ -22,15 +22,13 @@ Userspace RCU project: https://liburcu.org/ It does not require recompiling or rebuilding Ruby, but only supports Ruby 2.7.0 or later on a few platforms: -* GNU/Linux -* FreeBSD +* GNU/Linux (only tested --without-jemalloc, mwrap 3.x provides its own) -It may work on NetBSD, OpenBSD and DragonFly BSD. +It may work on FreeBSD, NetBSD, OpenBSD and DragonFly BSD if given +appropriate build options. == Install - # FreeBSD: pkg install liburcu - # Debian-based systems: apt-get liburcu-dev # Install mwrap via RubyGems.org @@ -64,24 +62,33 @@ or an address retrieved by backtrace_symbols(3). It is recommended to use the sort(1) command on either of the first two columns to find the hottest malloc locations. -mwrap 2.0.0+ also supports a Rack application endpoint, +mwrap 3.0.0+ also supports an embedded HTTP server it is documented at: -https://80x24.org/mwrap/MwrapRack.html +https://80x24.org/mwrap.git/tree/Documentation/mwrap.pod == Known problems * 32-bit machines are prone to overflow (WONTFIX) -== Public mail archives and contact info: +* signalfd(2)-reliant code will need latest URCU with commit + ea3a28a3f71dd02f (Disable signals in URCU background threads, 2022-09-23) + This doesn't affect C Ruby itself, and signalfd(2) use is rare + 3rd-party processes. 
+ +* Ruby source files over 16.7 million lines long are not supported :P + +== Public mail archives (HTTP, Atom feeds, IMAP mailbox, NNTP group, POP3): * https://80x24.org/mwrap-public/ * nntps://80x24.org/inbox.comp.lang.ruby.mwrap * imaps://;AUTH=ANONYMOUS@80x24.org/inbox.comp.lang.ruby.mwrap.0 * https://80x24.org/mwrap-public/_/text/help/#pop3 -No subscription will ever be required to post, but HTML mail -will be rejected: +No subscription nor real identities will ever be required to obtain support, +but HTML mail is rejected. Memory usage reductions start with you; +only send plain-text mail to us and do not top-post. HTML mail and +top-posting costs everybody memory and bandwidth. mwrap-public@80x24.org @@ -89,9 +96,10 @@ will be rejected: git clone https://80x24.org/mwrap.git -Send all patches and pull requests (use "git request-pull" to format) to -mwrap-public@80x24.org. We do not use centralized or proprietary messaging -systems. +Send all patches ("git format-patch" + "git send-email") and +pull requests (use "git request-pull" to format) via email +to mwrap-perl@80x24.org. We do not and will not use +proprietary messaging systems. == License diff --git a/mwrap.gemspec b/mwrap.gemspec index dc99924..b6f9e71 100644 --- a/mwrap.gemspec +++ b/mwrap.gemspec @@ -1,6 +1,5 @@ git_manifest = `git ls-files 2>/dev/null`.split("\n") git_ok = $?.success? -git_manifest << 'lib/mwrap/version.rb'.freeze # generated by ./VERSION-GEN manifest = File.exist?('MANIFEST') ? File.readlines('MANIFEST').map!(&:chomp).delete_if(&:empty?) : git_manifest if git_ok && manifest != git_manifest @@ -12,11 +11,19 @@ end version = `./VERSION-GEN`.chomp $?.success? 
or abort './VERSION-GEN failed' +manifest << 'lib/mwrap/version.rb'.freeze + +if system(*%w(make -C Documentation man)) || + system(*%w(gmake -C Documentation man)) + manifest.concat(%w(Documentation/mwrap.1)) +else + warn 'failed to build man-page(s), proceeding without them' +end Gem::Specification.new do |s| s.name = 'mwrap' s.version = version - s.homepage = 'https://80x24.org/mwrap/' + s.homepage = 'https://80x24.org/mwrap.git/' s.authors = ["mwrap hackers"] s.summary = 'LD_PRELOAD malloc wrapper for Ruby' s.executables = %w(mwrap)
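The option syntax that mwrap.pod documents for the MWRAP environment variable (options delimited by `,`, names and values delimited by `:`) can be illustrated with a minimal Ruby sketch. `parse_mwrap_opts` is a hypothetical helper written for this note and is not part of mwrap, which does its own parsing in C. The sketch assumes values contain no commas.

```ruby
# Hypothetical helper (not part of mwrap): split an MWRAP-style option
# string into a Hash, following the syntax documented in mwrap.pod.
def parse_mwrap_opts(str)
  str.to_s.split(',').each_with_object({}) do |opt, h|
    # split on the first ':' only, so values such as paths may contain ':'
    name, value = opt.split(':', 2)
    next if name.nil? || name.empty?
    h[name] = value
  end
end

opts = parse_mwrap_opts('socket_dir:/tmp/mwrap,bt:3')
# mwrap.pod documents bt: as defaulting to 0 with a maximum of 32
depth = Integer(opts.fetch('bt', '0'))
raise ArgumentError, "bt out of range: #{depth}" unless (0..32).cover?(depth)
```

Splitting on the first `:` only is a design choice of this sketch, made so that a `dump_path` or `socket_dir` value containing a colon survives intact.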
This is more consistent with the `MWRAP=bt:' use, since adding `-Depth' seems unnecessary and makes curl commands too long. --- ext/mwrap/httpd.h | 2 +- t/httpd.t | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h index cea79f7..5c3b83f 100644 --- a/ext/mwrap/httpd.h +++ b/ext/mwrap/httpd.h @@ -899,7 +899,7 @@ static enum mw_qev h1_parse_harder(struct mw_h1 *h1, struct mw_h1req *h1r, * request bodies, so let pico handle parameters in * HTTP request headers, instead. */ - if (NAME_EQ(hdr, "X-Mwrap-BT-Depth")) { + if (NAME_EQ(hdr, "X-Mwrap-BT")) { errno = 0; depth = strtol(hdr->value, &end, 10); if (errno || !valid_end(end)) diff --git a/t/httpd.t b/t/httpd.t index 9a0fae6..76fe7d1 100644 --- a/t/httpd.t +++ b/t/httpd.t @@ -174,12 +174,12 @@ SKIP: { $rc = system(@curl, qw(-d x=y), "http://0/$pid/reset"); is($rc, 0, 'curl /reset'); - $rc = system(@curl, qw(-HX-Mwrap-BT-Depth:10 -XPOST), + $rc = system(@curl, qw(-HX-Mwrap-BT:10 -XPOST), "http://0/$pid/ctl"); - is($rc, 0, 'curl /ctl (X-Mwrap-BT-Depth)'); + is($rc, 0, 'curl /ctl (X-Mwrap-BT)'); like(slurp($cout), qr/\bMWRAP=bt:10\b/, 'changed bt depth'); - $rc = system(@curl, qw(-HX-Mwrap-BT-Depth:10 -d blah http://0/ctl)); + $rc = system(@curl, qw(-HX-Mwrap-BT:10 -d blah http://0/ctl)); is($rc >> 8, 22, '404 w/o PID prefix'); };
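With the header renamed to `X-Mwrap-BT`, the request exercised in t/httpd.t can be assembled as below. This is a minimal sketch: `mwrap_ctl_cmd` is a hypothetical helper, not part of mwrap. The `$DIR/$PID.sock` socket layout and the 0..32 depth range are taken from mwrap.pod.

```ruby
# Hypothetical helper (not part of mwrap): build the curl(1) argv for
# changing the backtrace depth of a running process via POST /$PID/ctl.
def mwrap_ctl_cmd(socket_dir, pid, depth)
  depth = Integer(depth)
  # mwrap.pod documents bt: as a non-negative number with a maximum of 32
  raise ArgumentError, "depth out of range: #{depth}" unless (0..32).cover?(depth)
  [ 'curl', '--unix-socket', "#{socket_dir}/#{pid}.sock",
    '-XPOST', "-HX-Mwrap-BT:#{depth}", "http://0/#{pid}/ctl" ]
end

cmd = mwrap_ctl_cmd('/tmp/mwrap', 1234, 2)
puts cmd.join(' ')
# system(*cmd) would send the request; that needs a live mwrap process
```

Returning an argv array rather than a shell string avoids quoting problems when the socket directory contains spaces.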
They're different projects, still, I guess... --- ext/mwrap/httpd.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h index 0ef6cd9..cea79f7 100644 --- a/ext/mwrap/httpd.h +++ b/ext/mwrap/httpd.h @@ -38,7 +38,11 @@ #include "picohttpparser_c.h" #include <pthread.h> #include <stdbool.h> -#define URL "https://80x24.org/mwrap-perl.git/about" +#if MWRAP_PERL +# define URL "https://80x24.org/mwrap-perl.git/" +#else +# define URL "https://80x24.org/mwrap.git/" +#endif #define TYPE_HTML "text/html; charset=UTF-8" #define TYPE_CSV "text/csv" #define TYPE_PLAIN "text/plain"
We can't call rb_gc_count() safely outside of Ruby threads (especially during startup/teardown), but we can share its last-known value safely. --- ext/mwrap/httpd.h | 3 +++ ext/mwrap/mwrap_core.h | 5 ++++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h index 03aef9f..0ef6cd9 100644 --- a/ext/mwrap/httpd.h +++ b/ext/mwrap/httpd.h @@ -548,6 +548,9 @@ static void show_stats(FILE *fp) "/ files: %zu / locations: %zu", inc , inc - dec, uatomic_read(&nr_file), uatomic_read(&nr_src_loc)); +#if MWRAP_RUBY + fprintf(fp, " / GC: %zu", uatomic_read(&last_gc_count)); +#endif } /* /$PID/at/$LOCATION endpoint */ diff --git a/ext/mwrap/mwrap_core.h b/ext/mwrap/mwrap_core.h index 48669d5..827ee7b 100644 --- a/ext/mwrap/mwrap_core.h +++ b/ext/mwrap/mwrap_core.h @@ -85,6 +85,7 @@ static size_t *root_locating; /* determines if PL_curcop is our thread */ #if MWRAP_RUBY static void mw_ruby_set_generation(size_t *, size_t); # define SET_GENERATION(gen, size) mw_ruby_set_generation(gen, size) +static size_t last_gc_count; /* for httpd which runs in a non-GVL thread */ #endif /* MWRAP_RUBY */ #ifndef SET_GENERATION /* C-only builds w/o Perl|Ruby */ @@ -1074,8 +1075,10 @@ static void mw_ruby_set_generation(size_t *gen, size_t size) { if (rb_gc_count) { uatomic_add_return(&total_bytes_inc, size); - if (has_ec_p()) + if (has_ec_p()) { *gen = rb_gc_count(); + uatomic_set(&last_gc_count, *gen); + } } else { *gen = uatomic_add_return(&total_bytes_inc, size); }
By including them at the bottom. This will be done for Perl headers in the future, too, since they break assert(). --- ext/mwrap/httpd.h | 8 ---- ext/mwrap/mwrap_core.h | 104 +++++++++++++++++++++-------------------- 2 files changed, 53 insertions(+), 59 deletions(-) diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h index da7ff6d..03aef9f 100644 --- a/ext/mwrap/httpd.h +++ b/ext/mwrap/httpd.h @@ -43,14 +43,6 @@ #define TYPE_CSV "text/csv" #define TYPE_PLAIN "text/plain" -/* - * C ruby defines snprintf to ruby_snprintf, we can't have that in - * non-ruby processes spawned by C ruby - */ -#if MWRAP_RUBY && defined(snprintf) -# undef snprintf -#endif - enum mw_qev { MW_QEV_IGNORE = 0, MW_QEV_RD = POLLIN, diff --git a/ext/mwrap/mwrap_core.h b/ext/mwrap/mwrap_core.h index 721e5d3..48669d5 100644 --- a/ext/mwrap/mwrap_core.h +++ b/ext/mwrap/mwrap_core.h @@ -46,13 +46,6 @@ # include "ppport.h" #endif -#if MWRAP_RUBY -# undef _GNU_SOURCE /* ruby.h redefines it */ -# include <ruby.h> /* defines HAVE_RUBY_RACTOR_H on 3.0+ */ -# include <ruby/thread.h> -# include <ruby/io.h> -#endif - /* * XXH3 (truncated to 32-bits) seems to provide a ~2% speedup. 
* XXH32 doesn't show improvements over jhash despite rculfhash @@ -90,44 +83,11 @@ static size_t *root_locating; /* determines if PL_curcop is our thread */ #endif /* MWRAP_PERL */ #if MWRAP_RUBY -const char *rb_source_location_cstr(int *line); /* requires 2.6.0dev or later */ - -# ifdef HAVE_RUBY_RACTOR_H /* Ruby 3.0+ */ -extern MWRAP_TSD void * __attribute__((weak)) ruby_current_ec; -# else /* Ruby 2.6-2.7 */ -extern void * __attribute__((weak)) ruby_current_execution_context_ptr; -# define ruby_current_ec ruby_current_execution_context_ptr -# endif /* HAVE_RUBY_RACTOR_H */ - -extern void * __attribute__((weak)) ruby_current_vm_ptr; /* for rb_gc_count */ -extern size_t __attribute__((weak)) rb_gc_count(void); -int __attribute__((weak)) ruby_thread_has_gvl_p(void); - -/* - * rb_source_location_cstr relies on GET_EC(), and it's possible - * to have a native thread but no EC during the early and late - * (teardown) phases of the Ruby process - */ -static int has_ec_p(void) -{ - return ruby_thread_has_gvl_p && ruby_thread_has_gvl_p() && - ruby_current_vm_ptr && ruby_current_ec; -} - -static void set_generation(size_t *gen, size_t size) -{ - if (rb_gc_count) { - uatomic_add_return(&total_bytes_inc, size); - if (has_ec_p()) - *gen = rb_gc_count(); - } else { - *gen = uatomic_add_return(&total_bytes_inc, size); - } -} -# define SET_GENERATION(gen, size) set_generation(gen, size) +static void mw_ruby_set_generation(size_t *, size_t); +# define SET_GENERATION(gen, size) mw_ruby_set_generation(gen, size) #endif /* MWRAP_RUBY */ -#ifndef SET_GENERATION +#ifndef SET_GENERATION /* C-only builds w/o Perl|Ruby */ # define SET_GENERATION(gen, size) \ *gen = uatomic_add_return(&total_bytes_inc, size) #endif /* !SET_GENERATION */ @@ -438,14 +398,7 @@ static const char *mw_perl_src_file_cstr(unsigned *lineno) #endif /* MWRAP_PERL */ #if MWRAP_RUBY -static const char *mw_ruby_src_file_cstr(unsigned *lineno) -{ - if (!has_ec_p()) return NULL; - int line; - const char *fn = 
rb_source_location_cstr(&line); - *lineno = line < 0 ? UINT_MAX : (unsigned)line; - return fn; -} +static const char *mw_ruby_src_file_cstr(unsigned *lineno); # define SRC_FILE_CSTR(lineno) mw_ruby_src_file_cstr(lineno) #endif /* MWRAP_RUBY */ @@ -1088,3 +1041,52 @@ __attribute__((constructor)) static void mwrap_ctor(void) } --locating; } + +#if MWRAP_RUBY +# undef _GNU_SOURCE /* ruby.h redefines it */ +# include <ruby.h> /* defines HAVE_RUBY_RACTOR_H on 3.0+ */ +# include <ruby/thread.h> +# include <ruby/io.h> +# ifdef HAVE_RUBY_RACTOR_H /* Ruby 3.0+ */ +extern MWRAP_TSD void * __attribute__((weak)) ruby_current_ec; +# else /* Ruby 2.6-2.7 */ +extern void * __attribute__((weak)) ruby_current_execution_context_ptr; +# define ruby_current_ec ruby_current_execution_context_ptr +# endif /* HAVE_RUBY_RACTOR_H */ + +extern void * __attribute__((weak)) ruby_current_vm_ptr; /* for rb_gc_count */ +extern size_t __attribute__((weak)) rb_gc_count(void); +int __attribute__((weak)) ruby_thread_has_gvl_p(void); + +const char *rb_source_location_cstr(int *line); /* requires 2.6.0dev or later */ +/* + * rb_source_location_cstr relies on GET_EC(), and it's possible + * to have a native thread but no EC during the early and late + * (teardown) phases of the Ruby process + */ +static int has_ec_p(void) +{ + return ruby_thread_has_gvl_p && ruby_thread_has_gvl_p() && + ruby_current_vm_ptr && ruby_current_ec; +} + +static void mw_ruby_set_generation(size_t *gen, size_t size) +{ + if (rb_gc_count) { + uatomic_add_return(&total_bytes_inc, size); + if (has_ec_p()) + *gen = rb_gc_count(); + } else { + *gen = uatomic_add_return(&total_bytes_inc, size); + } +} + +static const char *mw_ruby_src_file_cstr(unsigned *lineno) +{ + if (!has_ec_p()) return NULL; + int line; + const char *fn = rb_source_location_cstr(&line); + *lineno = line < 0 ? UINT_MAX : (unsigned)line; + return fn; +} +#endif /* !MWRAP_RUBY */
This tests commit 649a0d3e3578 (httpd: undefine ruby_snprintf alias for non-Ruby processes, 2023-01-08) --- test/test_mwrap.rb | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/test/test_mwrap.rb b/test/test_mwrap.rb index 176dca6..e04fab5 100644 --- a/test/test_mwrap.rb +++ b/test/test_mwrap.rb @@ -6,6 +6,7 @@ require 'mwrap' require 'rbconfig' require 'tempfile' require 'io/wait' +require 'tmpdir' class TestMwrap < Test::Unit::TestCase RB = "#{RbConfig::CONFIG['bindir']}/#{RbConfig::CONFIG['RUBY_INSTALL_NAME']}" @@ -79,6 +80,15 @@ class TestMwrap < Test::Unit::TestCase assert_match(/\b0x[a-f0-9]+\b/s, dump, 'dump output has addresses') end + def test_spawn_non_ruby + Dir.mktmpdir do |dir| + sockdir = "#{dir}/sockdir" + env = @@env.merge('MWRAP' => "socket_dir:#{sockdir}") + out = IO.popen(env, %w(ls -alR), { chdir: dir }, &:read) + assert_match(/\b\d+\.sock\b/, out) + end + end + def test_clear cmd = @@cmd + %w( -e ("0"*10000).clear
--- Rakefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Rakefile b/Rakefile index cf4311e..efb62b1 100644 --- a/Rakefile +++ b/Rakefile @@ -8,7 +8,7 @@ rescue LoadError warn 'rake-compiler not available, cross compiling disabled' end -Rake::TestTask.new(:test) +Rake::TestTask.new('test-ruby') task 'test-ruby' => :compile task :default => :compile
Not sure what drugs I was on when I wrote this :x --- test/test_mwrap.rb | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/test/test_mwrap.rb b/test/test_mwrap.rb index 29bbdd2..176dca6 100644 --- a/test/test_mwrap.rb +++ b/test/test_mwrap.rb @@ -5,6 +5,7 @@ require 'test/unit' require 'mwrap' require 'rbconfig' require 'tempfile' +require 'io/wait' class TestMwrap < Test::Unit::TestCase RB = "#{RbConfig::CONFIG['bindir']}/#{RbConfig::CONFIG['RUBY_INSTALL_NAME']}" @@ -128,10 +129,12 @@ class TestMwrap < Test::Unit::TestCase assert_equal "\n", r.gets buf = +'' 10.times { Process.kill(:USR1, pid) } - while IO.select([r], nil, nil, 0.1) + while r.wait_readable(0.1) case tmp = r.read_nonblock(1000, exception: false) when String buf << tmp + when nil + break end end w2.close
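The read loop in this patch switches from IO.select to IO#wait_readable and breaks out when read_nonblock returns nil. A minimal sketch of that pattern follows, using only Ruby's standard library; `drain` is a hypothetical name chosen for illustration. With `exception: false`, read_nonblock returns a String, `:wait_readable`, or nil at EOF. The likely reason for the nil branch (an assumption, as the commit message does not say) is that a descriptor at EOF keeps reporting readable, so the loop could otherwise spin.

```ruby
require 'io/wait'

# Hypothetical helper: read whatever arrives on +io+ until it goes quiet
# for +timeout+ seconds or reaches EOF.
def drain(io, timeout = 0.1)
  buf = +''
  while io.wait_readable(timeout)
    case tmp = io.read_nonblock(1000, exception: false)
    when String then buf << tmp
    when nil then break # EOF: the descriptor would stay "readable" forever
    end # on :wait_readable, fall through and wait again
  end
  buf
end

r, w = IO.pipe
w.write('hello')
w.close
p drain(r)
```

The same helper also returns when the writer stays open and simply stops sending, since wait_readable then times out and ends the loop.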
ruby/subst.h (included by ruby.h) replaces `snprintf' with `ruby_snprintf'. This only works in processes linked to Ruby, but won't work in subprocesses spawned by Ruby. --- ext/mwrap/httpd.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/ext/mwrap/httpd.h b/ext/mwrap/httpd.h index 03aef9f..da7ff6d 100644 --- a/ext/mwrap/httpd.h +++ b/ext/mwrap/httpd.h @@ -43,6 +43,14 @@ #define TYPE_CSV "text/csv" #define TYPE_PLAIN "text/plain" +/* + * C ruby defines snprintf to ruby_snprintf, we can't have that in + * non-ruby processes spawned by C ruby + */ +#if MWRAP_RUBY && defined(snprintf) +# undef snprintf +#endif + enum mw_qev { MW_QEV_IGNORE = 0, MW_QEV_RD = POLLIN,
A weak symbol works fine, here. --- ext/mwrap/mwrap.c | 3 ++- ext/mwrap/mwrap_core.h | 1 - 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ext/mwrap/mwrap.c b/ext/mwrap/mwrap.c index d88fee6..a45bb38 100644 --- a/ext/mwrap/mwrap.c +++ b/ext/mwrap/mwrap.c @@ -6,6 +6,7 @@ #include "mwrap_core.h" static ID id_uminus; +extern VALUE __attribute__((weak)) rb_stderr; extern VALUE __attribute__((weak)) rb_cObject; extern VALUE __attribute__((weak)) rb_eTypeError; extern VALUE __attribute__((weak)) rb_yield(VALUE); @@ -33,7 +34,7 @@ static VALUE mwrap_dump(int argc, VALUE *argv, VALUE mod) if (NIL_P(io)) /* library may be linked w/o Ruby */ - io = *((VALUE *)dlsym(RTLD_DEFAULT, "rb_stderr")); + io = rb_stderr; a.min = NIL_P(min) ? 0 : NUM2SIZET(min); io = rb_io_get_io(io); diff --git a/ext/mwrap/mwrap_core.h b/ext/mwrap/mwrap_core.h index c0eea2f..721e5d3 100644 --- a/ext/mwrap/mwrap_core.h +++ b/ext/mwrap/mwrap_core.h @@ -26,7 +26,6 @@ #include <stdio.h> #include <stdlib.h> #include <string.h> -#include <dlfcn.h> #include <assert.h> #include <errno.h> #include <sys/types.h>
pushed to https://80x24.org/mwrap.git

commit b5ab9be6686aa778a4cfd7622c598736b9c42321
parent 4356beb8237a92b3 (picohttpparser: fix __SSE_4_2__ CPP check, 2023-01-07)
parent 2c25edb01139365f (undefine Mwrap::SourceLocation.allocate, 2023-01-07)
Date: Sun Jan 8 05:03:26 2023 +0000

    Merge changes from the Perl side

    This contains many changes from https://80x24.org/mwrap-perl.git

    * Built-in RCU-friendly version of dlmalloc, no more fragile
      dlsym(3m) resolution of malloc-family functions in the constructor

    * Allocations are now backed by O_TMPFILE on $TMPDIR on modern Linux.
      This was added because mwrap increases memory usage greatly, and
      I needed to use it on a system where I needed more VM space but
      lacked the ability to add swap.

    * Configurable C backtrace level via MWRAP=bt:$DEPTH where $DEPTH
      is a non-negative integer. Be careful about increasing it, as even
      a depth of 3-4 can be orders-of-magnitude more expensive in
      time and space. This can be changed dynamically at runtime
      via local HTTP (see below).

    * Embedded per-process local-socket-only HTTP server obsoletes
      MwrapRack when combined with mwrap-rproxy from the Perl dist
      (set `MWRAP=socket_dir:/dir/of/sockets')
      See https://80x24.org/mwrap-perl/20221210015518.272576-4-e@80x24.org/
      for more info. It now supports downloading CSV (suitable for
      importing into sqlite 3.32.0+)

    * License switched to GPL-3+ to be compatible with GNU binutils
      since we may take code from addr2line in the future.

    * libxxhash supported if XXH3_64bits is available.
Ruby 3.1 uses mmap, nowadays, and I don't think it's worth the effort to support it since mmap and munmap don't require the symmetry *memalign + free do. --- Keeping this separate from the upcoming mwrap-perl merge which features major changes including more common code. ext/mwrap/extconf.rb | 7 - ext/mwrap/mwrap.c | 327 ++----------------------------------------- lib/mwrap_rack.rb | 51 ------- test/test_mwrap.rb | 38 ----- 4 files changed, 12 insertions(+), 411 deletions(-) diff --git a/ext/mwrap/extconf.rb b/ext/mwrap/extconf.rb index 1828407..e8d3cc6 100644 --- a/ext/mwrap/extconf.rb +++ b/ext/mwrap/extconf.rb @@ -25,11 +25,4 @@ else abort 'missing __builtin_add_overflow' end -begin - if n = GC::INTERNAL_CONSTANTS[:HEAP_PAGE_SIZE] - $defs << "-DHEAP_PAGE_SIZE=#{n}" - end -rescue NameError -end - create_makefile 'mwrap' diff --git a/ext/mwrap/mwrap.c b/ext/mwrap/mwrap.c index 08761d6..6875486 100644 --- a/ext/mwrap/mwrap.c +++ b/ext/mwrap/mwrap.c @@ -51,19 +51,6 @@ static size_t total_bytes_inc, total_bytes_dec; /* true for glibc/dlmalloc/ptmalloc, not sure about jemalloc */ #define ASSUMED_MALLOC_ALIGNMENT (sizeof(void *) * 2) -/* match values in Ruby gc.c */ -#define HEAP_PAGE_ALIGN_LOG 14 -enum { - HEAP_PAGE_ALIGN = (1UL << HEAP_PAGE_ALIGN_LOG) -#ifndef HEAP_PAGE_SIZE /* Ruby 2.6-2.7 only */ - , - REQUIRED_SIZE_BY_MALLOC = (sizeof(size_t) * 5), - HEAP_PAGE_SIZE = (HEAP_PAGE_ALIGN - REQUIRED_SIZE_BY_MALLOC) -#endif -}; - -#define IS_HEAP_PAGE_BODY ((struct src_loc *)-1) - #ifdef __FreeBSD__ void *__malloc(size_t); void __free(void *); @@ -111,33 +98,6 @@ static union padded_mutex mutexes[MUTEX_NR] = { #endif }; -#define ACC_INIT(name) { .nr=0, .min=INT64_MAX, .max=-1, .m2=0, .mean=0 } -struct acc { - uint64_t nr; - int64_t min; - int64_t max; - double m2; - double mean; -}; - -/* for tracking 16K-aligned heap page bodies (protected by GVL) */ -struct { - pthread_mutex_t lock; - struct cds_list_head bodies; - struct cds_list_head freed; - - struct acc alive; - 
struct acc reborn; -} hpb_stats = { -#if STATIC_MTX_INIT_OK - .lock = PTHREAD_MUTEX_INITIALIZER, -#endif - .bodies = CDS_LIST_HEAD_INIT(hpb_stats.bodies), - .freed = CDS_LIST_HEAD_INIT(hpb_stats.freed), - .alive = ACC_INIT(hpb_stats.alive), - .reborn = ACC_INIT(hpb_stats.reborn) -}; - static pthread_mutex_t *mutex_assign(void) { return &mutexes[uatomic_add_return(&mutex_i, 1) & MUTEX_MASK].mtx; @@ -168,11 +128,6 @@ __attribute__((constructor)) static void resolve_malloc(void) _exit(1); } } - err = pthread_mutex_init(&hpb_stats.lock, 0); - if (err) { - fprintf(stderr, "error: %s\n", strerror(err)); - _exit(1); - } /* initialize mutexes used by urcu-bp */ rcu_read_lock(); rcu_read_unlock(); @@ -300,9 +255,6 @@ struct alloc_hdr { struct src_loc *loc; } live; struct rcu_head dead; - struct { - size_t at; /* rb_gc_count() */ - } hpb_freed; } as; void *real; /* what to call real_free on */ size_t size; @@ -344,64 +296,6 @@ static int loc_eq(struct cds_lfht_node *node, const void *key) memcmp(k->k, existing->k, loc_size(k)) == 0); } -/* note: not atomic */ -static void -acc_add(struct acc *acc, size_t val) -{ - double delta = val - acc->mean; - uint64_t nr = ++acc->nr; - - /* just don't divide-by-zero if we ever hit this (unlikely :P) */ - if (nr) - acc->mean += delta / nr; - - acc->m2 += delta * (val - acc->mean); - if ((int64_t)val < acc->min) - acc->min = (int64_t)val; - if ((int64_t)val > acc->max) - acc->max = (int64_t)val; -} - -#if SIZEOF_LONG == 8 -# define INT64toNUM(x) LONG2NUM((long)x) -#elif defined(HAVE_LONG_LONG) && SIZEOF_LONG_LONG == 8 -# define INT64toNUM(x) LL2NUM((LONG_LONG)x) -#endif - -static VALUE -acc_max(const struct acc *acc) -{ - return INT64toNUM(acc->max); -} - -static VALUE -acc_min(const struct acc *acc) -{ - return acc->min == INT64_MAX ? INT2FIX(-1) : INT64toNUM(acc->min); -} - -static VALUE -acc_mean(const struct acc *acc) -{ - return DBL2NUM(acc->nr ? 
acc->mean : HUGE_VAL); -} - -static double -acc_stddev_dbl(const struct acc *acc) -{ - if (acc->nr > 1) { - double variance = acc->m2 / (acc->nr - 1); - return sqrt(variance); - } - return 0.0; -} - -static VALUE -acc_stddev(const struct acc *acc) -{ - return DBL2NUM(acc_stddev_dbl(acc)); -} - static struct src_loc *totals_add_rcu(const struct src_loc *k) { struct cds_lfht_iter iter; @@ -519,7 +413,7 @@ void free(void *p) struct src_loc *l = h->as.live.loc; if (!real_free) return; /* oh well, leak a little */ - if (l && l != IS_HEAP_PAGE_BODY) { + if (l) { size_t age = generation - h->as.live.gen; uatomic_add(&total_bytes_dec, h->size); @@ -534,19 +428,6 @@ void free(void *p) mutex_unlock(l->mtx); call_rcu(&h->as.dead, free_hdr_rcu); - } else if (l == IS_HEAP_PAGE_BODY) { - size_t gen = generation; - size_t age = gen - h->as.live.gen; - - h->as.hpb_freed.at = gen; - - mutex_lock(&hpb_stats.lock); - acc_add(&hpb_stats.alive, age); - - /* hpb_stats.bodies => hpb_stats.freed */ - cds_list_move(&h->anode, &hpb_stats.freed); - - mutex_unlock(&hpb_stats.lock); } else { real_free(h->real); } @@ -614,65 +495,18 @@ internal_memalign(void **pp, size_t alignment, size_t size, uintptr_t caller) return ENOMEM; - if (alignment == HEAP_PAGE_ALIGN && size == HEAP_PAGE_SIZE) { - if (has_ec_p()) generation = rb_gc_count(); - l = IS_HEAP_PAGE_BODY; - } else { - l = update_stats_rcu_lock(size, caller); - } + l = update_stats_rcu_lock(size, caller); - if (l == IS_HEAP_PAGE_BODY) { - void *p; - size_t gen = generation; - - mutex_lock(&hpb_stats.lock); - - /* reuse existing entry */ - if (!cds_list_empty(&hpb_stats.freed)) { - size_t deathspan; - - h = cds_list_first_entry(&hpb_stats.freed, - struct alloc_hdr, anode); - /* hpb_stats.freed => hpb_stats.bodies */ - cds_list_move(&h->anode, &hpb_stats.bodies); - assert(h->size == size); - assert(h->real); - real = h->real; - p = hdr2ptr(h); - assert(ptr_is_aligned(p, alignment)); - - deathspan = gen - h->as.hpb_freed.at; - 
acc_add(&hpb_stats.reborn, deathspan); - } - else { - real = real_malloc(asize); - if (!real) return ENOMEM; - - p = hdr2ptr(real); - if (!ptr_is_aligned(p, alignment)) - p = ptr_align(p, alignment); - h = ptr2hdr(p); - h->size = size; - h->real = real; - cds_list_add(&h->anode, &hpb_stats.bodies); - } - mutex_unlock(&hpb_stats.lock); - h->as.live.loc = l; - h->as.live.gen = gen; + real = real_malloc(asize); + if (real) { + void *p = hdr2ptr(real); + if (!ptr_is_aligned(p, alignment)) + p = ptr_align(p, alignment); + h = ptr2hdr(p); + alloc_insert_rcu(l, h, size, real); *pp = p; } - else { - real = real_malloc(asize); - if (real) { - void *p = hdr2ptr(real); - if (!ptr_is_aligned(p, alignment)) - p = ptr_align(p, alignment); - h = ptr2hdr(p); - alloc_insert_rcu(l, h, size, real); - *pp = p; - } - update_stats_rcu_unlock(l); - } + update_stats_rcu_unlock(l); return real ? 0 : ENOMEM; } @@ -1243,73 +1077,6 @@ static VALUE total_dec(VALUE mod) return SIZET2NUM(total_bytes_dec); } -static VALUE hpb_each_yield(VALUE ignore) -{ - struct alloc_hdr *h, *next; - - cds_list_for_each_entry_safe(h, next, &hpb_stats.bodies, anode) { - VALUE v[2]; /* [ generation, address ] */ - void *addr = hdr2ptr(h); - assert(ptr_is_aligned(addr, HEAP_PAGE_ALIGN)); - v[0] = LONG2NUM((long)addr); - v[1] = SIZET2NUM(h->as.live.gen); - rb_yield_values2(2, v); - } - return Qnil; -} - -/* - * call-seq: - * - * Mwrap::HeapPageBody.each { |gen, addr| } -> Integer - * - * Yields the generation (GC.count) the heap page body was created - * and address of the heap page body as an Integer. Returns the - * number of allocated pages as an Integer. 
This return value should - * match the result of GC.stat(:heap_allocated_pages) - */ -static VALUE hpb_each(VALUE mod) -{ - ++locating; - return rb_ensure(hpb_each_yield, Qfalse, reset_locating, 0); -} - -/* - * call-seq: - * - * Mwrap::HeapPageBody.stat -> Hash - * Mwrap::HeapPageBody.stat(hash) -> hash - * - * The maximum lifespan of a heap page body in the Ruby VM. - * This may be Infinity if no heap page bodies were ever freed. - */ -static VALUE hpb_stat(int argc, VALUE *argv, VALUE hpb) -{ - VALUE h; - - rb_scan_args(argc, argv, "01", &h); - if (NIL_P(h)) - h = rb_hash_new(); - else if (!RB_TYPE_P(h, T_HASH)) - rb_raise(rb_eTypeError, "not a hash %+"PRIsVALUE, h); - - ++locating; -#define S(x) ID2SYM(rb_intern(#x)) - rb_hash_aset(h, S(lifespan_max), acc_max(&hpb_stats.alive)); - rb_hash_aset(h, S(lifespan_min), acc_min(&hpb_stats.alive)); - rb_hash_aset(h, S(lifespan_mean), acc_mean(&hpb_stats.alive)); - rb_hash_aset(h, S(lifespan_stddev), acc_stddev(&hpb_stats.alive)); - rb_hash_aset(h, S(deathspan_max), acc_max(&hpb_stats.reborn)); - rb_hash_aset(h, S(deathspan_min), acc_min(&hpb_stats.reborn)); - rb_hash_aset(h, S(deathspan_mean), acc_mean(&hpb_stats.reborn)); - rb_hash_aset(h, S(deathspan_stddev), acc_stddev(&hpb_stats.reborn)); - rb_hash_aset(h, S(resurrects), SIZET2NUM(hpb_stats.reborn.nr)); -#undef S - --locating; - - return h; -} - /* * Document-module: Mwrap * @@ -1328,19 +1095,13 @@ static VALUE hpb_stat(int argc, VALUE *argv, VALUE hpb) * * dump_fd: a writable FD to dump to * * dump_path: a path to dump to, the file is opened in O_APPEND mode * * dump_min: the minimum allocation size (total) to dump - * * dump_heap: mask of heap_page_body statistics to dump * * If both `dump_fd' and `dump_path' are specified, dump_path takes * precedence. 
- *
- * dump_heap bitmask
- * * 0x01 - summary stats (same info as HeapPageBody.stat)
- * * 0x02 - all live heaps (similar to HeapPageBody.each)
- * * 0x04 - skip non-heap_page_body-related output
  */
 void Init_mwrap(void)
 {
-	VALUE mod, hpb;
+	VALUE mod;

 	++locating;
 	mod = rb_define_module("Mwrap");
@@ -1372,67 +1133,9 @@ void Init_mwrap(void)
 	rb_define_method(cSrcLoc, "max_lifespan", src_loc_max_lifespan, 0);
 	rb_define_method(cSrcLoc, "name", src_loc_name, 0);

-	/*
-	 * Information about "struct heap_page_body" allocations from
-	 * Ruby gc.c.  This can be useful for tracking fragmentation
-	 * from posix_memalign(3) use in mainline Ruby:
-	 *
-	 *   https://sourceware.org/bugzilla/show_bug.cgi?id=14581
-	 *
-	 * These statistics are never reset by Mwrap.reset or
-	 * any other method.  They only make sense in the context
-	 * of an entire program lifetime.
-	 */
-	hpb = rb_define_class_under(mod, "HeapPageBody", rb_cObject);
-	rb_define_singleton_method(hpb, "stat", hpb_stat, -1);
-	rb_define_singleton_method(hpb, "each", hpb_each, 0);
-
 	--locating;
 }

-enum {
-	DUMP_HPB_STATS = 0x1,
-	DUMP_HPB_EACH = 0x2,
-	DUMP_HPB_EXCL = 0x4,
-};
-
-static void dump_hpb(FILE *fp, unsigned flags)
-{
-	if (flags & DUMP_HPB_STATS) {
-		fprintf(fp,
-			"lifespan_max: %"PRId64"\n"
-			"lifespan_min:%s%"PRId64"\n"
-			"lifespan_mean: %0.3f\n"
-			"lifespan_stddev: %0.3f\n"
-			"deathspan_max: %"PRId64"\n"
-			"deathspan_min:%s%"PRId64"\n"
-			"deathspan_mean: %0.3f\n"
-			"deathspan_stddev: %0.3f\n"
-			"gc_count: %zu\n",
-			hpb_stats.alive.max,
-			hpb_stats.alive.min == INT64_MAX ? " -" : " ",
-			hpb_stats.alive.min,
-			hpb_stats.alive.mean,
-			acc_stddev_dbl(&hpb_stats.alive),
-			hpb_stats.reborn.max,
-			hpb_stats.reborn.min == INT64_MAX ? " -" : " ",
-			hpb_stats.reborn.min,
-			hpb_stats.reborn.mean,
-			acc_stddev_dbl(&hpb_stats.reborn),
-			/* n.b.: unsafe to call rb_gc_count() in destructor */
-			generation);
-	}
-	if (flags & DUMP_HPB_EACH) {
-		struct alloc_hdr *h;
-
-		cds_list_for_each_entry(h, &hpb_stats.bodies, anode) {
-			void *addr = hdr2ptr(h);
-
-			fprintf(fp, "%p\t%zu\n", addr, h->as.live.gen);
-		}
-	}
-}
-
 /* rb_cloexec_open isn't usable by non-Ruby processes */
 #ifndef O_CLOEXEC
 # define O_CLOEXEC 0
@@ -1446,7 +1149,6 @@ static void mwrap_dump_destructor(void)
 	struct dump_arg a = { .min = 0 };
 	size_t i;
 	int dump_fd;
-	unsigned dump_heap = 0;
 	char *dump_path;
 	char *s;

@@ -1478,9 +1180,6 @@ static void mwrap_dump_destructor(void)
 	if ((s = strstr(opt, "dump_min:")))
 		sscanf(s, "dump_min:%zu", &a.min);

-	if ((s = strstr(opt, "dump_heap:")))
-		sscanf(s, "dump_heap:%u", &dump_heap);
-
 	switch (dump_fd) {
 	case 0: goto out;
 	case 1: a.fp = stdout; break;
@@ -1500,9 +1199,7 @@ static void mwrap_dump_destructor(void)
 		}
 		/* we'll leak some memory here, but this is a destructor */
 	}
-	if ((dump_heap & DUMP_HPB_EXCL) == 0)
-		dump_to_file(&a);
-	dump_hpb(a.fp, dump_heap);
+	dump_to_file(&a);
 out:
 	--locating;
 }
diff --git a/lib/mwrap_rack.rb b/lib/mwrap_rack.rb
index 53380b9..c777a78 100644
--- a/lib/mwrap_rack.rb
+++ b/lib/mwrap_rack.rb
@@ -89,54 +89,6 @@ class MwrapRack
     end
   end

-  class HeapPages # :nodoc:
-    include HtmlResponse
-    HEADER = '<tr><th>address</th><th>generation</th></tr>'
-
-    def hpb_rows
-      Mwrap::HeapPageBody.stat(stat = Thread.current[:mwrap_hpb_stat] ||= {})
-      %i(lifespan_max lifespan_min lifespan_mean lifespan_stddev
-         deathspan_max deathspan_min deathspan_mean deathspan_stddev
-         resurrects
-        ).map! do |k|
-         "<tr><td>#{k}</td><td>#{stat[k]}</td></tr>\n"
-      end.join
-    end
-
-    def gc_stat_rows
-      GC.stat(stat = Thread.current[:mwrap_gc_stat] ||= {})
-      %i(count heap_allocated_pages heap_eden_pages heap_tomb_pages
-          total_allocated_pages total_freed_pages).map do |k|
-         "<tr><td>GC.stat(:#{k})</td><td>#{stat[k]}</td></tr>\n"
-      end.join
-    end
-
-    GC_STAT_URL = 'https://docs.ruby-lang.org/en/trunk/GC.html#method-c-stat'
-    GC_STAT_HELP = <<~EOM
-      <p>Non-Infinity lifespans can indicate fragmentation.
-      <p>See <a
-      href="#{GC_STAT_URL}">#{GC_STAT_URL}</a> for info on GC.stat values.
-    EOM
-
-    def each
-      Mwrap.quiet do
-        yield("<html><head><title>heap pages</title></head>" \
-              "<body><h1>heap pages</h1>" \
-              "<table><tr><th>stat</th><th>value</th></tr>\n" \
-              "#{hpb_rows}" \
-              "#{gc_stat_rows}" \
-              "</table>\n" \
-              "#{GC_STAT_HELP}" \
-              "<table>#{HEADER}")
-        Mwrap::HeapPageBody.each do |addr, generation|
-          addr = -sprintf('0x%x', addr)
-          yield(-"<tr><td>#{addr}</td><td>#{generation}</td></tr>\n")
-        end
-        yield "</table></body></html>\n"
-      end
-    end
-  end
-
   def r404 # :nodoc:
     [404,{'Content-Type'=>'text/plain'},["Not found\n"]]
   end
@@ -152,15 +104,12 @@ class MwrapRack
       loc = -CGI.unescape($1)
       loc = Mwrap[loc] or return r404
       EachAt.new(loc).response
-    when '/heap_pages'
-      HeapPages.new.response
     when '/'
       n = 2000
       u = 'https://80x24.org/mwrap/README.html'
       b = -('<html><head><title>Mwrap demo</title></head>' \
         "<body><p><a href=\"each/#{n}\">allocations >#{n} bytes</a>" \
         "<p><a href=\"#{u}\">#{u}</a>" \
-        "<p><a href=\"heap_pages\">heap pages</a>" \
         "</body></html>\n")
       [ 200, {'Content-Type'=>'text/html','Content-Length'=>-b.size.to_s},[b]]
     else
diff --git a/test/test_mwrap.rb b/test/test_mwrap.rb
index eaa65cb..6522167 100644
--- a/test/test_mwrap.rb
+++ b/test/test_mwrap.rb
@@ -59,13 +59,6 @@ class TestMwrap < Test::Unit::TestCase
       res = system(env, *cmd)
       assert res, $?.inspect
       assert_match(/\b1\d{4}\s+[1-9]\d*\s+-e:1$/, tmp.read)
-
-      tmp.rewind
-      tmp.truncate(0)
-      env['MWRAP'] = "dump_path:#{tmp.path},dump_heap:5"
-      res = system(env, *cmd)
-      assert res, $?.inspect
-      assert_match %r{lifespan_stddev}, tmp.read
     end
   end

@@ -295,35 +288,4 @@ class TestMwrap < Test::Unit::TestCase
       abort 'freed more than allocated'
     end;
   end
-
-  def test_heap_page_body
-    assert_separately(+"#{<<~"begin;"}\n#{<<~'end;'}")
-    begin;
-      require 'mwrap'
-      require 'rubygems' # use up some memory
-      ap = GC.stat(:heap_allocated_pages)
-      h = {}
-      nr = 0
-      Mwrap::HeapPageBody.each do |addr, gen|
-        nr += 1
-        gen <= GC.count && gen >= 0 or abort "bad generation: #{gen}"
-        (0 == (addr & 16383)) or abort "addr not aligned: #{'%x' % addr}"
-      end
-      if RUBY_VERSION.to_f < 3.1 # 3.1+ uses mmap on platforms we care about
-        nr == ap or abort "HeapPageBody.each missed page #{nr} != #{ap}"
-      end
-      10.times { (1..20000).to_a.map(&:to_s) }
-      3.times { GC.start }
-      Mwrap::HeapPageBody.stat(h)
-      Integer === h[:lifespan_max] or abort 'lifespan_max not recorded'
-      Integer === h[:lifespan_min] or abort 'lifespan_min not recorded'
-      Float === h[:lifespan_mean] or abort 'lifespan_mean not recorded'
-      3.times { GC.start }
-      10.times { (1..20000).to_a.map(&:to_s) }
-      Mwrap::HeapPageBody.stat(h)
-      h[:deathspan_min] <= h[:deathspan_max] or
-        abort 'wrong min/max deathtime'
-      Float === h[:deathspan_mean] or abort 'deathspan_mean not recorded'
-    end;
-  end
 end
Rack 3 requires lowercase headers, and they work with any
Rack <=2.x version.
---
 lib/mwrap_rack.rb | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/mwrap_rack.rb b/lib/mwrap_rack.rb
index c777a78..1bd00ac 100644
--- a/lib/mwrap_rack.rb
+++ b/lib/mwrap_rack.rb
@@ -22,10 +22,10 @@ class MwrapRack
   module HtmlResponse # :nodoc:
     def response
       [ 200, {
-          'Expires' => 'Fri, 01 Jan 1980 00:00:00 GMT',
-          'Pragma' => 'no-cache',
-          'Cache-Control' => 'no-cache, max-age=0, must-revalidate',
-          'Content-Type' => 'text/html; charset=UTF-8',
+          'expires' => 'Fri, 01 Jan 1980 00:00:00 GMT',
+          'pragma' => 'no-cache',
+          'cache-control' => 'no-cache, max-age=0, must-revalidate',
+          'content-type' => 'text/html; charset=UTF-8',
         }, self ]
     end
   end
@@ -90,7 +90,7 @@ class MwrapRack
   end

   def r404 # :nodoc:
-    [404,{'Content-Type'=>'text/plain'},["Not found\n"]]
+    [404,{'content-type'=>'text/plain'},["Not found\n"]]
   end

   # The standard Rack application endpoint for MwrapRack
@@ -111,7 +111,7 @@ class MwrapRack
         "<body><p><a href=\"each/#{n}\">allocations >#{n} bytes</a>" \
         "<p><a href=\"#{u}\">#{u}</a>" \
         "</body></html>\n")
-      [ 200, {'Content-Type'=>'text/html','Content-Length'=>-b.size.to_s},[b]]
+      [ 200, {'content-type'=>'text/html','content-length'=>-b.size.to_s},[b]]
     else
       r404
     end
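For illustration only, here is a minimal sketch of the lowercase-header
convention from the Rack 3 commit message. It is not part of mwrap:
HELLO_APP and lowercase_header_keys? are hypothetical names, and the
app is a bare lambda, so no rack gem is needed to run it.

```ruby
# Hypothetical minimal Rack-style app; not part of mwrap.
# A Rack application is any object responding to #call(env) and
# returning [status, headers, body].
HELLO_APP = lambda do |env|
  body = "hello\n"
  headers = {
    # lowercase keys, as Rack 3 requires; Rack <=2.x accepts them too
    'content-type' => 'text/plain',
    'content-length' => body.bytesize.to_s,
  }
  [200, headers, [body]]
end

# Hypothetical helper: true if every header key is already lowercase.
def lowercase_header_keys?(headers)
  headers.keys.all? { |k| k == k.downcase }
end

status, headers, _body = HELLO_APP.call({})
warn "mixed-case header keys found" unless lowercase_header_keys?(headers)
```

A check like lowercase_header_keys? could sit in a test suite to catch
mixed-case keys before upgrading to Rack 3.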
This quiets `undefining the allocator of T_DATA class
Mwrap::SourceLocation' warnings.
---
 ext/mwrap/mwrap.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ext/mwrap/mwrap.c b/ext/mwrap/mwrap.c
index 6875486..160007f 100644
--- a/ext/mwrap/mwrap.c
+++ b/ext/mwrap/mwrap.c
@@ -1115,6 +1115,7 @@ void Init_mwrap(void)
 	 * This class is only available since mwrap 2.0.0+.
 	 */
 	cSrcLoc = rb_define_class_under(mod, "SourceLocation", rb_cObject);
+	rb_undef_alloc_func(cSrcLoc);
 	rb_define_singleton_method(mod, "dump", mwrap_dump, -1);
 	rb_define_singleton_method(mod, "reset", mwrap_reset, 0);
 	rb_define_singleton_method(mod, "clear", mwrap_clear, 0);
Ruby 3.1 switched to mmap for HPB (heap_page_body) in 2021, and I
doubt it'll be going back to *memalign on Linux or FreeBSD.  HPB
stats don't exist at all for 3.1+ right now.

Tracking mmap allocations safely would be significantly more
difficult and expensive since munmap can cross mmap-ed regions.
With *memalign + free, there's a simple 1:1 relationship, but not
with mmap + munmap.

munmap can work on any subset (or even superset if multiple mmap
calls return sequential pages) of addresses within any mmap-ed
region(s).  In other words, each 4k page would need a
separately-allocated tracking struct in a process-wide tree or hash
table.  I don't think Ruby currently does asymmetric mmap/munmap,
but extensions and any spawned processes may, and per-page tracking
is the only safe way to account for it.

So the tracking is definitely doable, but I'm not sure it's worth
the time and effort.  These are GC-internal allocations and any
instrumentation for the GC itself is probably better off being
added to ruby/gc.c.

There's something similar on the Perl 5 side, too.  It allocates
small strings out of 4080-byte malloc-ed arenas and I was confused
by the 4080-byte allocations until I cranked up C backtraces via
MWRAP=bt:$N.

I think a better long-term feature would be to be able to
interactively crank up C backtrace levels on a per-callsite basis.
Right now, the C backtrace level is global, and increasing that
interactively gets expensive fast.
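A toy model of the per-page bookkeeping described in the message above
may help show why a 1:1 allocation record is not enough.  This is only
a sketch under stated assumptions: PageTracker and its methods are
hypothetical names, a 4096-byte page size is assumed, and nothing here
hooks the real mmap(2) or munmap(2).

```ruby
# Toy model of per-page mmap accounting; all names are hypothetical
# and nothing is hooked into real mmap(2)/munmap(2).
PAGE_SIZE = 4096 # assumed page size ("4k" in the message above)

class PageTracker
  def initialize
    @pages = {} # page number => label of the mapping that created it
  end

  # Record every page covered by [addr, addr + len)
  def mmap(addr, len, label)
    each_page(addr, len) { |pg| @pages[pg] = label }
  end

  # Forget every page covered by [addr, addr + len); the range need
  # not match any single earlier mmap call
  def munmap(addr, len)
    each_page(addr, len) { |pg| @pages.delete(pg) }
  end

  def live_pages
    @pages.size
  end

  # Labels of mappings which still have at least one live page
  def labels
    @pages.values.uniq
  end

  private

  def each_page(addr, len)
    return if len <= 0
    first = addr / PAGE_SIZE
    last = (addr + len - 1) / PAGE_SIZE
    (first..last).each { |pg| yield pg }
  end
end

t = PageTracker.new
base = 0x10000
t.mmap(base, 4 * PAGE_SIZE, :a)                 # four pages for region a
t.mmap(base + 4 * PAGE_SIZE, 4 * PAGE_SIZE, :b) # adjacent region b
# One munmap spanning the last two pages of a and the first two of b:
t.munmap(base + 2 * PAGE_SIZE, 4 * PAGE_SIZE)
```

After the single munmap, both regions are still partially mapped, so
an entry per mmap call could not describe the result.  One table entry
per page can, at the memory cost the message warns about.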