Diffstat (limited to 'Documentation/public-inbox-tuning.pod')
-rw-r--r-- Documentation/public-inbox-tuning.pod | 42
1 file changed, 36 insertions, 6 deletions
diff --git a/Documentation/public-inbox-tuning.pod b/Documentation/public-inbox-tuning.pod
index 53668ecc..892ee0f2 100644
--- a/Documentation/public-inbox-tuning.pod
+++ b/Documentation/public-inbox-tuning.pod
@@ -42,6 +42,14 @@ Other OS tuning knobs
 
 Scalability to many inboxes
 
+=item 9
+
+public-inbox-cindex --join performance
+
+=item 10
+
+public-inbox-clone with shared object stores
+
 =back
 
 =head2 New inboxes: public-inbox-init -V2
@@ -79,8 +87,8 @@ RAM.  Attempts to parallelize random I/O on HDDs leads to pathological
 slowdowns as inboxes grow.
 
 While C<-V2> introduced Xapian shards as a parallelization
-mechanism for SSDs; enabling C<publicInbox.indexSequentialShard>
-repurposes sharding as mechanism to reduce the kernel page cache
+mechanism for SSDs, enabling C<publicInbox.indexSequentialShard>
+repurposes sharding as a mechanism to reduce the kernel page cache
 footprint when indexing on HDDs.
 
 Initializing a mirror with a high C<--jobs> count to create more
@@ -108,7 +116,7 @@ indices on btrfs to achieve acceptable performance (even on SSD).
 Disabling copy-on-write also disables checksumming, thus C<raid1>
 (or higher) configurations may be corrupt after unsafe shutdowns.
 
-Fortunately, these SQLite and Xapian indices are designed to
+Fortunately, these SQLite and Xapian indices are designed to be
 recoverable from git if missing.
 
 Disabling CoW does not prevent all fragmentation.  Large values
@@ -125,7 +133,7 @@ C<btrfs filesystem defragment -fr $INBOX_DIR> may be necessary.
 Large filesystems benefit significantly from the C<space_cache=v2>
 mount option documented in L<btrfs(5)>.
 
-Older, non-CoW filesystems are generally work well out-of-the-box
+Older, non-CoW filesystems generally work well out of the box
 for our Xapian and SQLite indices.
 
 =head2 Performance on solid state drives
@@ -152,9 +160,16 @@ C<LimitNOFILE=> in L<systemd.exec(5)>) may need to be raised to
 accommodate many concurrent clients.
 
 Transport Layer Security (IMAPS, NNTPS, or via STARTTLS) significantly
-increases memory use of client sockets, sure to account for that in
+increases memory use of client sockets; be sure to account for that in
 capacity planning.
 
+Bursts of small object allocations late in process life contribute to
+fragmentation of the heap due to arenas (slabs) used internally by Perl.
+glibc malloc users should use C<MALLOC_MMAP_THRESHOLD_=131072> to reduce
+fragmentation from the sliding mmap window.  jemalloc (tested as an
+C<LD_PRELOAD> on GNU/Linux) also reduces fragmentation compared to an
+unconfigured glibc malloc in long-lived processes.
+
 =head2 Other OS tuning knobs
 
 Linux users: the C<sys.vm.max_map_count> sysctl may need to be increased if
@@ -168,13 +183,28 @@ Other OSes may have similar tuning knobs (patches appreciated).
 L<public-inbox-extindex(1)> allows any number of public-inboxes
 to share the same Xapian indices.
 
-git 2.33+ startup time is orders-of-magnitude faster and uses
+git 2.33+ startup time is orders of magnitude faster and uses
 less memory when dealing with thousands of alternates required
 for thousands of inboxes with L<public-inbox-extindex(1)>.
 
 Frequent packing (via L<git-gc(1)>) both improves performance
 and reduces the need to increase C<sys.vm.max_map_count>.
 
+=head2 public-inbox-cindex --join performance
+
+A C++ compiler and the Xapian development files make C<--join> or
+C<--join=aggressive> orders of magnitude faster in L<public-inbox-cindex(1)>.
+On Debian-based systems this is C<libxapian-dev>.  RPM-based distros have
+these in C<xapian-core-devel> or C<xapian14-core-libs>.  *BSDs typically
+package development files together with runtime libraries, so the C<xapian>
+or C<xapian-core> package will already have the development files.
+
+=head2 public-inbox-clone with shared object stores
+
+When mirroring manifests with many forks using the same objstore,
+git 2.41+ is highly recommended for performance as we automatically
+use the C<fetch.hideRefs> feature to speed up negotiation.
+
 =head1 CONTACT
 
 Feedback encouraged via plain-text mail to L<mailto:meta@public-inbox.org>
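
Reviewer note: the patch names two git version thresholds (2.33+ for many alternates, 2.41+ for C<fetch.hideRefs> during clone). A minimal sketch of how an administrator might check the installed git against them follows. The helper name version_ge is hypothetical and not part of public-inbox, and the comparison treats non-numeric version fields as 0, which is an assumption that may not suit every vendor's version string.

```shell
#!/bin/sh
# Hypothetical helper, not part of public-inbox: succeed if dotted
# version $1 >= $2.  Non-numeric fields (e.g. "windows") count as 0.
version_ge() {
	awk -v a="$1" -v b="$2" 'BEGIN {
		na = split(a, x, "."); nb = split(b, y, ".")
		n = (na > nb) ? na : nb
		for (i = 1; i <= n; i++) {
			xi = x[i] + 0; yi = y[i] + 0
			if (xi != yi) exit((xi > yi) ? 0 : 1)
		}
		exit(0)
	}'
}

# Report whether the installed git meets the thresholds from the patch.
if command -v git >/dev/null 2>&1; then
	v=$(git --version | awk '{ print $3 }')
	for want in 2.33 2.41; do
		if version_ge "$v" "$want"; then
			echo "git $v >= $want"
		else
			echo "git $v is older than $want"
		fi
	done
fi
```

The comparison is done in awk rather than with C<sort -V> so that it does not depend on a non-POSIX sort extension.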