about summary refs log tree commit homepage
path: root/lib/PublicInbox/SearchIdx.pm
DateCommit message (Collapse)
2015-09-30nntp: implement OVER/XOVER summary in search document
The document data of a search message already contains a good chunk of the information needed to respond to OVER/XOVER commands quickly. Expand on that and use the document data to implement OVER/XOVER quickly. This adds a dependency on Xapian being available for nntpd usage, but is probably alright since nntpd is esoteric enough that anybody willing to run nntpd will also want search functionality offered by Xapian. This also speeds up XHDR/HDR with the To: and Cc: headers and :bytes/:lines article metadata used by some clients for header displays and marking messages as read/unread.
2015-09-25searchidx: remove unused sub: next_doc_id
It seems like it was never used
2015-09-15searchidx: sync Msgmap database along with Xapian
We can avoid duplicating work of extracting messages from git if we tie this to Xapian. Of course, this ties the two features together, but it's probably reasonable to expect that anybody who wants to use public-inbox to serve messages to front-end users will have both.
2015-09-15searchidx: hoist out rlog code
We'll be reusing this for loading msgmap.
2015-09-06update copyright headers and email addresses
In the future, it should be possible to use this: git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \ UPDATE_COPYRIGHT_USE_INTERVALS=2 \ xargs /path/to/gnulib/build-aux/update-copyright
2015-09-03search: disable Message-ID compression in Xapian
We'll continue to compress long Message-IDs in URLs (which we know about), but we will store entire Message-IDs in the Xapian database to facilitate ease-of-lookups in external databases.
2015-09-01search: reduce redundant doc data
Redundant document data increases our database size, pull the smsg->mid off the unique term, the smsg->ts off the value, and only generate the formatted display date off smsg->ts.
2015-08-30search: do not index references and inreplyto terms
We no longer need them, as we can rely on index-time thread resolution and thread merging. This allows us to index less data and hopefully increase efficiency.
2015-08-29avoid length in boolean context
Perl does not currently optimize for this. ref (from p5p): http://mid.gmane.org/D5C27970-9176-4C7A-8B99-7D78360E67A2@pobox.com
2015-08-25mid: mid_compressed => mid_compress
Consistently name mid_* functions as verbs.
2015-08-23cleanup calls to header_obj
Dereference header_obj only once when performance may be critical, or simplify our code by calling "header" directly on the Email::{Simple,MIME} object if not.
2015-08-23hopefully fix broken permissions for search
We must preserve the umask for the entirety of the indexing operation, as Xapian transactions replace entire files atomically instead of writing them in place.
2015-08-23search: respect core.sharedRepository in for Xapian DB
Extend the purpose of core.sharedRepository to apply to the $GIT_DIR/public-inbox/xapian* directory.
2015-08-22search: split search indexing to a separate file
This makes organization easier and reduces the amount of code loaded for a PSGI, mod_perl or CGI instance.