about summary refs log tree commit homepage
path: root/t
DateCommit message (Collapse)
2017-02-16repobrowse: switch to new URL format to avoid query strings
Query strings make endpoint caching more difficult since they're order-independent. They are also more likely lost or truncated inadvertantly when copy+pasting, so try to avoid them for default endpoints. There's still some things which are broken and followup commits will be needed to fix them.
2017-02-15config: avoid circular loading dependency
We must lazilly load one of them, so load Inbox later since we need to parse the config, first.
2017-02-14Merge remote-tracking branch 'origin/master' into repobrowse
* origin/master: www: do not unescape PATH_INFO twice t/mime: quiet warnings for old versions of Email::Simple handle repeated References and In-Reply-To headers
2017-02-14www: do not unescape PATH_INFO twice
PSGI specs already require PATH_INFO to be unescaped; so our tests were wrong, too.
2017-02-12t/mime: quiet warnings for old versions of Email::Simple
This is fixed in the newest versions of Email::Simple, but not the version in Debian jessie (2.203)
2017-02-10search: remove unnecessary abstractions and functionality
This simplifies the code a bit and reduces the translation overhead for looking directly at data from tools shipped with Xapian. While we're at it, fix thread-all.t :)
2017-02-09repobrowse: shorten internal names
We'll still be keeping "repobrowse" for the public API for use with .psgi files, but shortening the name means less typing and we may have command-line tools, too.
2017-02-09Merge remote-tracking branch 'origin/master' into repobrowse
* origin/master: config: do not slurp lines into memory TODO: several updates search: schema version bump for empty References/In-Reply-To Revert "searchidx: reindex clobbers old thread IDs" searchidx: reindex clobbers old thread IDs searchidx: deal with empty In-Reply-To and References headers searchview: increase limit for displaying search results searchview: clarify numeric summary at bottom add filter for Subject: tags watchmaildir: allow arguments for filters watchmaildir: limit live importer processes learn: implement "rm" only functionality mime: avoid SUPER usage in Email::MIME subclass inbox: reinstate periodic cleanup of Xapian and SQLite objects introduce PublicInbox::MIME wrapper class
2017-02-08repobrowse: start wiring up git search
Much more work on this will be needed, but at least explicit flush points prevents OOMs on my system.
2017-02-07search: hoist out git directory search index helper
We will be reusing this for indexing normal (code) repositories using git and Xapian, too.
2017-01-26add filter for Subject: tags
Some mailing lists add annoying tags into the Subject line which discourages readers from doing proper mail organization on the client side. They also waste precious screen space and attention span. Remove them from our archives to reduce clutter.
2017-01-22t/httpd-unix: better diagnostics and comments for test
I've hit random test failures on this, so attempt to improve diagnostics and improve documentation for this test.
2017-01-21repobrowse: preserve newlines in Atom feed
Commit messages are assumed to be displayed in a terminal with a fixed width font, so we must preserve newlines and all whitespace as-is so ASCII art may be displayed properly.
2017-01-21repobrowse: simplify git log parsing implementation
Based on what was done for the Atom feed, this will allow us to simplify state management through metaprogramming and avoid placeholder characters ('D' for decoration) for empty fields.
2017-01-21repobrowse: fix full URL generation in Atom feed
We must not drop the leading slash in the URI. This regression was introduced when we dropped Plack::Request dependency.
2017-01-18mime: avoid SUPER usage in Email::MIME subclass
We must call Email::Simple methods directly in our monkey patch for Email::MIME to call the intended method. Using SUPER in our subclass would instead hit a different, unintended method in Email::MIME. Reported-by: Junio C Hamano <gitster@pobox.com> <xmqq4m0wb43w.fsf@gitster.mtv.corp.google.com>
2017-01-11repobrowse: qspawn + streaming for git commit display
This prevents "git show" processes from monopolizing the system and allows us to better handle backpressure from gigantic commits.
2017-01-10introduce PublicInbox::MIME wrapper class
This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head In the future, we may still introduce as streaming interface to reduce memory usage on large emails.
2017-01-08Merge remote-tracking branch 'origin/master' into repobrowse
* origin/master: inbox: properly register cleanup timer for git processes search: remove subject_summary searchmsg: favor direct hash access over accessor methods remove incorrect comment about strftime + locales config: allow per-inbox nntpserver inbox: eliminate weaken usage entirely inbox: describe the full key name config: remove unused get() method config: always use namespaced "publicinboxlimiter" qspawn: prepare to support runtime reloading of Limiter http: remove weaken usage, reduce anonsub capture scope httpd/async: remove weaken usage http: fix spelling error watch: watchspam affects all configured inboxes doc: minor updates to design notes
2017-01-07initial git async work
This will allow us to handle network operations while waiting on "git cat-file" to seek and unpack things.
2017-01-07search: remove subject_summary
Apparently it never actually got used, and the world seems fine without it, so we can drop it. While we're at it, consider removing our subject_path usage from existence, too. We are not using fancy subject-line based URLs, here.
2017-01-07config: allow per-inbox nntpserver
This allows certain inboxes to override the global nntpserver (perhaps under a different domain).
2017-01-07inbox: eliminate weaken usage entirely
We can do a better job initializing the data structure so we no longer need to rely on weak references to cleanup when we ditch the config on reload.
2017-01-07config: always use namespaced "publicinboxlimiter"
I'm not sure if we'll ever support sharing a config file with other tools, but maybe we will, and "limiter" is too generic.
2016-12-26spawn: remove non-blocking support, here
It is never used, and inappropriate to support in generic code. HTTPD::Async already sets non-blocking, and it's better to do it in -httpd-specific code since we know our -httpd can handle it.
2016-12-26Merge remote-tracking branch 'origin/master' into repobrowse
* origin/master: (25 commits) evcleanup: ensure deferred close from timers are handled ASAP httpd/async: improve variable naming githttpbackend: minor cleanups to improve readability githttpbackend: simplify compatibility code githttpbackend: minor readability improvement http: fix clobbering of $null_io linkify: modify argument in place view: do not modify array during iteration view: stop chomping off whitespace at ends of messages view: remove unused parameter search: lookup_mail handles modified DBs doc: various comments on async handling searchthread: simplify API and remove needless OO searchthread: update comment about loop prevention searchmsg: remove ensure_metadata tests: add thread-all testing for benchmarking searchmsg: do not memoize {date} field searchmsg: remove locale-dependency for ->date t/config.t: fix feedmax default wwwtext: link to RFC4685 (Atom Threading) ...
2016-12-26t/repobrowse_git_httpd: remove XS parser dependency
Relying on the XS parser has been optional since March 2016: commit 7dd78012da81d48e5e73e56c3255895dfa9de1f5 ("http: use Plack::HTTPParser for HTTP parsing")
2016-12-22repobrowse: remove Plack::Request dependency
This does not make installation easier, but lightens runtime a bit. Plack::Request is unnecessary bloat and indirection which does things behind our back. $env has all the stuff we need.
2016-12-21searchthread: simplify API and remove needless OO
This simplifies callers to prevent errors and avoids needless object-orientation in favor of a single procedure call to handle threading and ordering.
2016-12-20searchmsg: remove ensure_metadata
Instead, only preload the ->mid field for threading, as we only need ->thread and ->path once in Search->get_thread (but we will need the ->mid field repeatedly). This more than doubles View->load_results performance on according to thread-all on an inbox with over 300K messages.
2016-12-20tests: add thread-all testing for benchmarking
I'll be using this to improve message threading performance.
2016-12-17t/config.t: fix feedmax default
Oops :x
2016-12-17feed: support publicinbox.<name>.feedmax
This allows users to customize by using smaller or larger Atom feeds than the default value of 25 entries.
2016-12-14t/thread-cycle: no need for Xapian to run this test
We don't actually use anything from SearchMsg, just the class name.
2016-12-14Merge remote-tracking branch 'origin/repobrowse' into repobrowse
* origin/repobrowse: (98 commits) t/repobrowse_git_httpd.t: ensure signature exists for split t/repobrowse_git_tree.t: fix test for lack of bold repobrowse: fix alignment of gitlink entries repobrowse: show invalid type for tree views repobrowse: do not bold directory names in tree view repobrowse: reduce checks for response fh repobrowse: larger, short-lived buffer for reading patches repobrowse: reduce risk of callback reference cycles repobrowse: snapshot support for cgit compatibility test: disable warning for Plack::Test::Impl repobrowse: avoid confusing linkification for "diff" repobrowse: git commit view uses pi-httpd.async repobrowse: more consistent variable naming for /commit/ repobrowse: show roughly equivalent "diff-tree" invocation repobrowse: reduce local variables for state management repobrowse: summary handles multiple README types repobrowse: remove bold decorations from diff view repobrowse: common git diff parsing code repobrowse: implement diff view for compatibility examples/repobrowse.psgi: disable Chunked response by default ...
2016-12-13nntp: add test case for the "DATE" command
We may not always use strftime and may implement caching. But for now, just add a test.
2016-12-12init: preserve permissions of existing config file
This matches git-config(1) behavior, and implied user intent when it comes to programatically editing files.
2016-12-10thread: last Reference always wins
Since we use SearchMsg from Xapian data, we can be assured we do not get self-referential {references} field. However, we may need to be more careful when checking has_descendent for loops, as blindly calling add_child could open us up to that possibility...
2016-12-06linkify: implement Markdown link compatibility (again)
Although unescaped parentheses in URLs are technically allowed, they are uncommon. However, Markdown-like syntaxes are unfortunately common for URLs, so we might as well support them. This fixes parentheses detection at sentence endings, as seen in practice on emails.
2016-12-06Revert "linkify: implement Markdown link compatibility"
This reverts commit 130d0c4e33c5c73dc69e270fc698735d49e0f159.
2016-12-06linkify: implement Markdown link compatibility
Although unescaped parentheses in URLs are technically allowed, they are uncommon. However, Markdown-like syntaxes are unfortunately common for URLs, so we might as well support them.
2016-12-03atom: switch to getline/close for response bodies
This will let us stream larger Atom documents bodies without wasting too much memory and reduce the amount of round-trip requests needed to get necessary information. Hopefully clients are using streaming (SAX) parsers, too. This is the final transition in the core public-inbox code to allow migrating to a "pull"-based body streaming scheme which allows a HTTP server to respond appropriately to backpressure from slow clients.
2016-11-26avoid IO::File for anonymous temporary files
We do not need to import IO::File into the main programs since Perl 5.8+ supports literal "undef" for generating anonymous temporary file handles.
2016-10-05t/thread-cycle: test self-referential messages
Some broken (or malicious) mailers may include a generated Message-ID in its References header, so be prepared for it.
2016-10-05thread: use hash + array instead of hand-rolled linked list
This starts to show noticeable performance improvements when attempting to thread over 400 messages; but the improvement may not be measurable with less. However, the resulting code is much shorter and (IMHO) much easier to understand.
2016-10-05thread: remove Email::Abstract wrapping
This roughly doubles performance due to the reduction in object creation and abstraction layers.
2016-10-05thread: remove Mail::Thread dependency
Introduce our own SearchThread class for threading messages. This should allow us to specialize and optimize away objects in future commits.
2016-09-09t/httpd-unix: warn about connection failure
Output $! for diagnostic purposes since I've noticed this on two slow machines, today (and seemingly, never prior).
2016-09-09search: index attachment filenames
And while we're at it, ensure searching inside displayable attachment bodies works.
2016-09-09search: more granular message body searching
"bs:" and "b:" are adapted from mairix(1) We will also support searching explicitly for quoted vs non-quoted text via "q:" and "nq:" prefixes since sometimes readers will not care for quoted text. In the future, we will support parsing diffs (perhaps when repobrowse integration is complete). Note: this roughly doubles the size of the Xapian database due to the additional information; so this change may not be worth it.