From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS12876 163.172.0.0/16
X-Spam-Status: No, score=-2.6 required=3.0 tests=AWL,BAYES_00,
	RCVD_IN_MSPIKE_BL,RCVD_IN_MSPIKE_ZBI,RCVD_IN_XBL,SPF_FAIL,SPF_HELO_FAIL
	shortcircuit=no autolearn=no autolearn_force=no version=3.4.0
Received: from 80x24.org (torrelay5.tomhek.net [163.172.38.173])
	by dcvr.yhbt.net (Postfix) with ESMTP id 9D1A22070F
	for ; Thu, 8 Sep 2016 10:24:05 +0000 (UTC)
From: Eric Wong
To: spew@80x24.org
Subject: [PATCH 7/8] search: increase term positions for each quoted hunk
Date: Thu, 8 Sep 2016 10:23:40 +0000
Message-Id: <20160908102341.20534-8-e@80x24.org>
In-Reply-To: <20160908102341.20534-1-e@80x24.org>
References: <20160908102341.20534-1-e@80x24.org>
List-Id:

We pay a storage cost for storing positional information in
Xapian, so make good use of it by attempting to preserve it
for (hopefully) better search results.
---
 lib/PublicInbox/SearchIdx.pm | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 25452da..0e499ad 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -135,6 +135,13 @@ sub index_users ($$) {
 	$tg->increase_termpos;
 }
 
+sub index_body ($$$) {
+	my ($tg, $lines, $inc) = @_;
+	$tg->index_text(join("\n", @$lines), $inc, $inc ? 'XNQ' : 'XQUOT');
+	@$lines = ();
+	$tg->increase_termpos;
+}
+
 sub add_message {
 	my ($self, $mime, $bytes, $num, $blob) = @_; # mime = Email::MIME object
 	my $db = $self->{xdb};
@@ -185,23 +192,15 @@ sub add_message {
 			my @lines = split(/\n/, $body);
 			while (defined(my $l = shift @lines)) {
 				if ($l =~ /^>/) {
+					index_body($tg, \@orig, 1) if @orig;
 					push @quot, $l;
 				} else {
+					index_body($tg, \@quot, 0) if @quot;
 					push @orig, $l;
 				}
 			}
-			if (@quot) {
-				my $s = join("\n", @quot);
-				@quot = ();
-				$tg->index_text($s, 0, 'XQUOT');
-				$tg->increase_termpos;
-			}
-			if (@orig) {
-				my $s = join("\n", @orig);
-				@orig = ();
-				$tg->index_text($s, 1, 'XNQ');
-				$tg->increase_termpos;
-			}
+			index_body($tg, \@quot, 0) if @quot;
+			index_body($tg, \@orig, 1) if @orig;
 		});
 
 		link_message($self, $smsg, $old_tid);
-- 
EW
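
Illustration only, not part of the patch: a minimal sketch of the call
order the new loop should produce for an interleaved reply. MockTG is a
hypothetical stand-in for the Xapian term generator that merely records
calls, and index_hunks is a hypothetical wrapper around the loop from
add_message; neither name exists in public-inbox.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the Xapian term generator: records calls only.
package MockTG;

sub new {
	my ($class) = @_;
	return bless { calls => [] }, $class;
}

sub index_text {
	my ($self, $text, $wdf_inc, $prefix) = @_;
	my $flat = join(' | ', split(/\n/, $text));
	push @{$self->{calls}}, "$prefix(wdf_inc=$wdf_inc): $flat";
}

sub increase_termpos {
	my ($self) = @_;
	push @{$self->{calls}}, 'increase_termpos';
}

package main;

# Same shape as index_body in the patch: index one hunk, empty the
# buffer, then leave a positional gap before the next hunk.
sub index_body {
	my ($tg, $lines, $inc) = @_;
	$tg->index_text(join("\n", @$lines), $inc, $inc ? 'XNQ' : 'XQUOT');
	@$lines = ();
	$tg->increase_termpos;
}

# Hypothetical wrapper: flush the pending hunk whenever the body
# switches between quoted ('>') and non-quoted lines.
sub index_hunks {
	my ($tg, $body) = @_;
	my (@orig, @quot);
	for my $l (split(/\n/, $body)) {
		if ($l =~ /^>/) {
			index_body($tg, \@orig, 1) if @orig;
			push @quot, $l;
		} else {
			index_body($tg, \@quot, 0) if @quot;
			push @orig, $l;
		}
	}
	index_body($tg, \@quot, 0) if @quot;
	index_body($tg, \@orig, 1) if @orig;
	return $tg->{calls};
}

my $body = join("\n",
	'> first question', 'first answer',
	'> second question', 'second answer');
my $calls = index_hunks(MockTG->new, $body);
print "$_\n" for @$calls;
```

With this sample body, every quoted or non-quoted hunk gets its own
index_text call followed by increase_termpos, in document order. Before
the patch, all quoted lines were joined into one call and all
non-quoted lines into another, so terms from separate hunks ended up
adjacent in position.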