Diffstat (limited to 'Documentation/public-inbox-extindex.pod')
-rw-r--r-- Documentation/public-inbox-extindex.pod | 51
1 files changed, 40 insertions, 11 deletions
diff --git a/Documentation/public-inbox-extindex.pod b/Documentation/public-inbox-extindex.pod
index f71a90e5..b53e45ed 100644
--- a/Documentation/public-inbox-extindex.pod
+++ b/Documentation/public-inbox-extindex.pod
@@ -13,7 +13,7 @@ public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all
 public-inbox-extindex creates and updates an external search and
 overview database used by the read-only public-inbox PSGI (HTTP),
 NNTP, and IMAP interfaces.  This requires either the
-L<Search::Xapian> XS bindings OR the L<Xapian> SWIG bindings,
+L<Xapian> SWIG bindings OR the L<Search::Xapian> XS bindings,
 along with L<DBD::SQLite> and L<DBI> Perl modules.
 
 =head1 OPTIONS
@@ -47,11 +47,26 @@ C<indexlevel> set to C<basic> and their respective Xapian
 public-inboxes where cross-posting is common, this allows
 significant space savings on Xapian indices.
 
+=item --dedupe=MSGID
+
+=item --dedupe
+
+Rerun deduplication on messages with the given Message-ID, or on
+all messages if no Message-ID is specified.  Deduplication rules may
+change and evolve over time, especially if filters are involved.
+
+C<--dedupe=MSGID> may be specified multiple times to deduplicate
+multiple Message-IDs.
+
+Use this if you see C<W: BUG? $MSGID not deduplicated properly>
+warnings from WWW logs.
+
 =item --gc
 
 Perform garbage collection instead of indexing.  Use this if
-inboxes are removed from the extindex, or if messages are
-purged or removed from some inboxes.
+inboxes are removed from the extindex, a newsgroup name is
+set or changed, or if messages are purged or removed from
+some inboxes.
 
 =item --reindex
 
@@ -60,10 +75,6 @@ used for in-place upgrades and bugfixes while read-only server
 processes are utilizing the index.  Keep in mind this roughly
 doubles the size of the already-large Xapian database.
 
-The extindex locks will be released roughly every 10s to
-allow L<public-inbox-mda(1)> and L<public-inbox-watch(1)>
-processes to write to the extindex.
-
 =item --fast
 
 Used with C<--reindex>, it will only look for new and stale
@@ -77,9 +88,9 @@ L<public-inbox-extindex-format(5)>
 
 =head1 CONFIGURATION
 
-public-inbox-extindex does not currently write to the
-L<public-inbox-config(5)> file, configuration may be entered
-manually.  The extindex name of C<all> is a special case which
+public-inbox-extindex does not write to the L<public-inbox-config(5)>
+file; configuration must be entered manually.
+The extindex name of C<all> is a special case which
 corresponds to indexing C<--all> inboxes.  An example for
 C<--all> is as follows:
 
@@ -89,6 +100,16 @@ C<--all> is as follows:
                 coderepo = foo
                 coderepo = bar
 
+Putting an C<extindex> entry in the config allows L<PublicInbox::WWW> to serve it.
+You can have any number of C<extindex.$NAME> sections where C<$NAME>
+is something other than C<all> to display a union of several inboxes.
+
+It is strongly recommended any public inboxes indexed by this command
+have a stable C<publicinbox.$NAME.newsgroup> entry (regardless of
+the presence of an NNTP or IMAP server).  Otherwise, public-inbox-extindex
+will use C<publicinbox.$NAME.inboxdir> as an internal key which can
+cause needless reindexing and require C<--gc> if inboxes are relocated.
+
 See L<public-inbox-config(5)> for more details.
 
 =head1 ENVIRONMENT
@@ -117,9 +138,17 @@ Default: none, uses C<publicinbox.indexBatchSize>
 
 =head1 UPGRADING
 
-Occasionally, public-inbox will update it's schema version and
+Occasionally, public-inbox will update its schema version and
 require a full index by running this command.
 
+=head1 LOCKING
+
+It is safe to use C<--dedupe>, C<--gc> and C<--reindex> while
+other processes are writing to covered inboxes or the extindex.
+The extindex locks will be released roughly every 10s to
+allow L<public-inbox-mda(1)> and L<public-inbox-watch(1)>
+processes to write to the extindex.
+
 =head1 CONTACT
 
 Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
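
The options documented in this patch could be exercised roughly as follows. This is a minimal sketch, not taken from the patch: the `extindex_cmd` wrapper, the `RUN` variable, and the Message-ID value are hypothetical, and whether the Message-ID should be given with or without angle brackets is an assumption. The flag order follows the synopsis (`[OPTIONS] [EXTINDEX_DIR] --all`). By default the wrapper only prints the command, so nothing touches a real extindex.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the command unless RUN=1 is set.
extindex_cmd() {
	if [ "${RUN:-0}" = 1 ]; then
		public-inbox-extindex "$@"
	else
		printf 'would run: public-inbox-extindex'
		printf ' %s' "$@"
		printf '\n'
	fi
}

# Placeholder Message-ID (assumption: passed without angle brackets).
MSGID='20200101000000.1234@example.com'

# Rerun deduplication for one Message-ID, e.g. after seeing a
# "W: BUG? $MSGID not deduplicated properly" warning in WWW logs.
extindex_cmd --dedupe="$MSGID" --all

# Garbage-collect after removing an inbox from the extindex or
# setting/changing a newsgroup name.
extindex_cmd --gc --all
```

Set `RUN=1` in the environment to execute the commands instead of printing them.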