The 131072-byte (128K) lower bound was the default mmap threshold
before the sliding mmap window (the dynamic mmap threshold) was
introduced in modern glibc malloc. While the sliding window was
intended to be faster by reducing syscalls, zeroing and kernel
overhead, it is also prone to fragmentation from the allocation
patterns seen in evented Perl servers.
Individual allocations over 128K are rare in our codebase since
there aren't many messages this large, so any performance cost
from the fixed threshold is tiny. Furthermore, the reduction in
fragmentation and memory use should be a speedup on
memory-constrained systems, since they can avoid swap and have
more memory left over for the page cache.
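
As a rough illustration of the setting described above, here is a minimal C sketch that pins the threshold from inside a process. It assumes glibc. The helper name `pin_mmap_threshold` and the macro `MMAP_THRESHOLD_BYTES` are hypothetical and not taken from this codebase, and the message above does not say how the project itself applies the value. Per the mallopt(3) manual, explicitly setting `M_MMAP_THRESHOLD` also disables the dynamic adjustment.

```c
#include <malloc.h> /* glibc-specific: mallopt(), M_MMAP_THRESHOLD */

/* 131072 bytes, the fixed lower bound discussed above */
#define MMAP_THRESHOLD_BYTES (128 * 1024)

/*
 * Hypothetical helper (name is an assumption): pin the mmap threshold
 * so glibc stops sliding it upward at runtime.
 * Returns 1 on success and 0 on error, mirroring mallopt().
 */
int pin_mmap_threshold(void)
{
    return mallopt(M_MMAP_THRESHOLD, MMAP_THRESHOLD_BYTES);
}
```

For a Perl daemon, where calling mallopt() directly is awkward, glibc also reads the `MALLOC_MMAP_THRESHOLD_` environment variable (note the trailing underscore) at startup, so exporting `MALLOC_MMAP_THRESHOLD_=131072` before launching the process should have the same effect.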
|
Large string processing + concurrency + caching/memoization
really brings out the worst in glibc malloc :<
|
I may be mistaken, but I suspect the reason jemalloc handles
long-lived processes better than glibc is that its allocation
granularity gets coarser as size classes get larger. This can
waste up to roughly 20% of an individual allocation, but
increases the likelihood of reuse (without splitting or
consolidating into other sizes).
In other words, glibc seems to try too hard to make the best fit
for initial allocations. This ends up being suboptimal over
time as those allocations are freed and similar (but not
identical) allocations come in. jemalloc sacrifices the best
initial fit for better fits over a long process lifetime.
|
|
It's important to show that a single systemd service and socket
file can replace all other read-only daemons for ease of
management.
|