1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
| | =head1 NAME
public-inbox-extindex - create and update external search indices
=head1 SYNOPSIS
public-inbox-extindex [OPTIONS] EXTINDEX_DIR INBOX_DIR...
public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all
=head1 DESCRIPTION
public-inbox-extindex creates and updates an external search and
overview database used by the read-only public-inbox PSGI (HTTP),
NNTP, and IMAP interfaces. This requires either the
L<Search::Xapian> XS bindings OR the L<Xapian> SWIG bindings,
along with L<DBD::SQLite> and L<DBI> Perl modules.
=head1 OPTIONS
=over
=item -j JOBS
=item --jobs=JOBS
=item --no-fsync
=item --dangerous
=item --rethread
=item --max-size SIZE
=item --batch-size SIZE
These switches behave as they do for L<public-inbox-index(1)>
=item --all
Index all C<publicinbox> entries in C<PI_CONFIG>.
C<publicinbox> entries indexed by C<public-inbox-extindex> can
have full Xapian searching abilities with the per-C<publicinbox>
C<indexlevel> set to C<basic> and their respective Xapian
(C<xap15> or C<xapian15>) directories removed. For multiple
public-inboxes where cross-posting is common, this allows
significant space savings on Xapian indices.
=item --gc
Perform garbage collection instead of indexing. Use this if
inboxes are removed from the extindex, or if messages are
purged or removed from some inboxes.
=item --reindex
Forces a re-index of all messages in the extindex. This can be
used for in-place upgrades and bugfixes while read-only server
processes are utilizing the index. Keep in mind this roughly
doubles the size of the already-large Xapian database.
The extindex locks will be released roughly every 10s to
allow L<public-inbox-mda(1)> and L<public-inbox-watch(1)>
processes to write to the extindex.
=item --fast
Used with C<--reindex>, it will only look for new and stale
entries and not touch already-indexed messages.
=back
=head1 FILES
L<public-inbox-extindex-format(5)>
=head1 CONFIGURATION
public-inbox-extindex does not currently write to the
L<public-inbox-config(5)> file, configuration may be entered
manually. The extindex name of C<all> is a special case which
corresponds to indexing C<--all> inboxes. An example for
C<--all> is as follows:
[extindex "all"]
topdir = /path/to/extindex_dir
url = all
coderepo = foo
coderepo = bar
See L<public-inbox-config(5)> for more details.
=head1 ENVIRONMENT
=over 8
=item PI_CONFIG
Used to override the default "~/.public-inbox/config" value.
=item XAPIAN_FLUSH_THRESHOLD
The number of documents to update before committing changes to
disk. This environment is handled directly by Xapian, refer to
Xapian API documentation for more details.
Setting C<XAPIAN_FLUSH_THRESHOLD> or
C<publicinbox.indexBatchSize> for a large C<--reindex> may cause
L<public-inbox-mda(1)>, L<public-inbox-learn(1)> and
L<public-inbox-watch(1)> tasks to wait long and unpredictable
periods of time during C<--reindex>.
Default: none, uses C<publicinbox.indexBatchSize>
=back
=head1 UPGRADING
Occasionally, public-inbox will update it's schema version and
require a full index by running this command.
=head1 CONTACT
Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
The mail archives are hosted at L<https://public-inbox.org/meta/> and
L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
=head1 COPYRIGHT
Copyright all contributors L<mailto:meta@public-inbox.org>
License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
=head1 SEE ALSO
L<Search::Xapian>, L<DBD::SQLite>
|