public-inbox.git  about / heads / tags
an "archives first" approach to mailing lists
blob 039694c3344118970b2f0d364c7d2506cf7dcd11 2175 bytes (raw)
$ git show HEAD:Documentation/technical/memory.txt	# shows this blob on the CLI

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
 
semi-automatic memory management in public-inbox
------------------------------------------------

The majority of public-inbox is implemented in Perl 5, a
language and interpreter not particularly known for being
memory-efficient.

We strive to keep processes small to improve locality, allow
the kernel to cache more files, and to be a good neighbor to
other processes running on the machine.  Taking advantage of
automatic reference counting (ARC) in Perl allows us to
deterministically release memory back to the heap.

We start with a simple data model with few circular
references.  This both eases human understanding and reduces
the likelihood of bugs.

Knowing the relative sizes and quantities of our data
structures, we limit the scope of allocations as much as
possible and keep large allocations shortest-lived.  This
minimizes both the cognitive overhead on humans in addition
to reducing memory pressure on the machine.

Short-lived non-immortal closures (aka "anonymous subs") are
avoided in long-running daemons unless required for
compatibility with PSGI.  Closures are memory-intensive and
may make allocation lifetimes less obvious to humans.  They
are also the source of memory leaks in older versions of
Perl, including 5.16.3 found in enterprise distros.

We also use Perl's `delete' and `undef' built-ins to drop
reference counts sooner than scope allows.  These functions
are required to break the few reference cycles we have that
would otherwise lead to leaks.

Of note, `undef' may be used in two ways:

1. to free(3) the underlying buffer:

	undef $scalar;

2. to reset a buffer but reduce realloc(3) on subsequent growth:

	$scalar = "";		# useful when repeated appending
	$scalar = undef;	# usually not needed

In the future, our internal data model will be further
flattened and simplified to reduce the overhead imposed by
small objects.  Large allocations may also be avoided by
optionally using Inline::C.

Finally, the mwrap-perl LD_PRELOAD wrapper was ported to Perl 5
and enhanced to provide live memory usage tracking on 64-bit systems
with minimal performance impact on production traffic:

	git clone https://80x24.org/mwrap-perl.git

git clone https://public-inbox.org/public-inbox.git
git clone http://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/public-inbox.git