about summary refs log tree commit homepage
path: root/lib/PublicInbox/Watch.pm
DateCommit message (Collapse)
2024-04-03treewide: avoid getpid() for OnDestroy checks
getpid() isn't cached by glibc nowadays and system calls are more expensive due to CPU vulnerability mitigations. To ensure we switch to the new semantics properly, introduce a new `on_destroy' function to simplify callers. Furthermore, most OnDestroy correctness is often tied to the process which creates it, so make the new API default to guarded against running in subprocesses. For cases which require running in all children, a new PublicInbox::OnDestroy::all call is provided.
2024-01-30watch: support incremental updates from MH
The good news (compared to lei) is we only have to worry about imports and don't care about the filename nor keywords, so it's immune to .mh_sequences writing inconsistencies across MH implementations and sequence number packing. We still assume the writer will write the mail file with one of: * rename(2) to create the final sequence number filename * a single write(2) if not relying on rename(2) mlmmj and mutt satisfy these requirements. Python's Lib/mailbox.py may, I'm not sure...
2023-11-22watch: support `watch=false' to negate watchspam
For users hosting read-only mirrors (via clone|fetch) and feeding inboxes via -watch
2023-11-11mda|learn|watch: support dropUniqueUnsubscribe config
List-Unsubscribe headers with unique identifiers (such as those generated by our examples/unsubscribe.milter) should not end up in public archives. Add a new config knob to strip List-Unsubscribe headers if they have the `List-Unsubscribe-Post: List-Unsubscribe=One-Click' header. Unfortunately, this breaks DKIM signatures if the signature covers either of these List-Unsubscribe* headers. However, breaking DKIM is the lesser evil compared to any archive reader being able to stop archival by an independent archivist. As much as I would like this to be the default, it probably affects few users at the moment since very few mailing lists use unique identifiers in List-Unsubscribe (but that number has grown, recently).
2023-11-09net: retry on EINTR and check for {quit} flag
This should allow us to detect shutdown signals in -watch more quickly and not unnecessarily fail on inconsequential signals such as SIGWINCH.
2023-11-01watch: simplify DirIdle object cleanup
There's no need to waste time nor reach into DS internals to map FDs to Perl objects, here. LEI.pm has never had to deal with integer FDs for DirIdle, either.
2023-10-18ds: introduce and use do_fork helper
This ensures we handle RNG reseeding and resetting the event loop properly in child processes after forking.
2023-10-04move all non-test @post_loop_do into named subs
Compared to Danga::Socket, our @post_loop_do API is designed to make it easier to avoid anonymous subs (and their potential for leaks in buggy old versions of Perl).
2023-09-08watch: reset HUP + USR1 signal handlers in children
Child processes handling IMAP/NNTP aren't going to want to handle config reloads nor forced rescans, those are exclusively for the parent. We'll leave a note that QUIT/TERM/INT can safely use the same callback for both parent and children, as I nearly made the mistake of resetting those to their default values in the child.
2023-09-05watch: ensure children can use signal handlers
Blindly using the signal set inherited from the parent process is wrong, since the parent (or grandparent) could've blocked all signals. Ensure children can process signals in the event loop when sig handlers have to use standard Perl facilities.
2023-08-28watch: remove unused variable
2023-04-06watch: close inotify FD on ->quit
For simplicity, we quit and recreate an entire watch instance on SIGHUP. However, inotify (and signalfd) FDs are tied to the DS event loop and stay pinned to existence that way. Thus we explicitly close the FD in Watch->quit to prevent leakage on SIGHUP.
2023-04-06watch: use detect_indexlevel for unconfigured inboxes
I favor leaving the publicinbox.<name>.indexlevel parameter out of config files to make it easier to alter and reduce sources of truth. It worked well in most cases, but public-inbox-watch also needs to detect the indexlevel. Moving the sub to InboxWritable (from Admin) probably makes sense since it's a per-inbox attribute and allows -watch to reuse it.
2023-03-26watch: do not recreate signalfd on SIGHUP
The normal method by which PublicInbox::DS::event_loop sets up signals once needs some coercing to work with -watch. Otherwise, we'll end up wasting FDs every time somebody reloads -watch via SIGHUP.
2023-03-26watch: avoid Mail::IMAPClient errors when disconnected
No point in issuing LOGOUT commands and causing Mail::IMAPClient to spew a giant backtrace when we're unconnected.
2023-03-25ds: @post_loop_do replaces SetPostLoopCallback
This allows us to avoid repeatedly using memory-intensive anonymous subs in CodeSearchIdx where the callback is assigned frequently. Anonymous subs are known to leak memory in old Perls (e.g. 5.16.3 in enterprise distros) and still expensive in newer Perls. So favor the (\&subroutine, @args) form which allows us to eliminate anonymous subs going forward. Only CodeSearchIdx takes advantage of the new API at the moment, since it's the biggest repeat user of post-loop callback changes. Getting rid of the subroutine and relying on a global `our' variable also has two advantages: 1) Perl warnings can detect typos at compile-time, whereas the (now gone) method could only detect errors at run-time. 2) `our' variable assignment can be `local'-ized to a scope
2023-03-14watch: add space before "UID" or "ARTICLE" in warnings
In other words, it now shows `imap://example.com/INBOX.foo UID:123' instead of: `imap://example.com/INBOX.foo UID:123'
2023-01-18watch: IMAP and NNTP polling can use the same interval
An obvious error :x
2023-01-18watch: simplify internal data structures
We can flatten arrays and avoid distinguishing between PID types now that more of that logic and argument passing logic is offloaded to awaitpid.
2023-01-18watch: switch to awaitpid
-watch relies on our event_loop anyways, and awaitpid lets us avoid the extra overhead of EOFpipe. Add an extra {quit} check in imap_idle_fork while we're at it.
2022-10-24treewide: replace /^I: / prefix with /^# /
This is like more familiar to readers of TAP (Test Anywhere Protocol) output, as well as shell and Perl scripters which also use `#' for comments. AFAIK, nobody is parsing our stderr, and I'm not sure how standardized the `I:' prefix is (nor `W:' and `E:' are). It's already the prevailing style in Lei* code, too, so things have been moving in that direction for a bit.
2021-10-22watch: remove redundant signal mask manipulation
The top-level daemon process already blocks all signals, so there's no reason to block them around fork() calls.
2021-10-22watch: check for {quit} before IDLE
This may make it less likely for watch-dependent tests to get stuck. Unfortunately, due to the synchronous API of Mail::IMAPClient, ->idle is still susceptible to missing signals.
2021-10-16dir_idle: do not add watches in ->new
There's no savings in having two ways to add watches to an inotify nor kqueue descriptor.
2021-10-01ds: simplify signalfd use
Since signalfd is often combined with our event loop, give it a convenient API and reduce the code duplication required to use it. EventLoop is replaced with ::event_loop to allow consistent parameter passing and avoid needlessly passing the package name on stack. We also avoid exporting SFD_NONBLOCK since it's the only flag we support. There's no sense in having the memory overhead of a constant function when it's in cold code.
2021-09-22treewide: fix %SIG localization, harder
This fixes the occasional t/lei-sigpipe.t infinite loop under "make check-run". Link: http://nntp.perl.org/group/perl.perl5.porters/258784 <CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com> Followup-to: b552bb9150775fe4 ("daemon+watch: fix localization of %SIG for non-signalfd users")
2021-09-19watch: use net_reader->mic_new wrapper for SOCKS+TLS
This brings -watch up to feature parity with lei with SOCKS support.
2021-09-09net_reader: combine Net::NNTP and IMAPClient args
Since these are keyed by IMAP and NNTP URIs which can never conflict, it simplifies our internals to keep them in one big hash since we'll add POP3 and JMAP client support.
2021-09-09net_reader: imap_opt => cfg_opt
Since this our internal IMAP options are keyed by URI section, there's no need to have separate hashes for NNTP and IMAP options since they URI already distinguishes them. This will make future changes to support POP3 and JMAP and arg caching with lei/store easier.
2021-09-09net_reader: nntp_opt => cfg_opt
Since this our internal NNTP options are keyed by URI section, there's no need to have separate hashes for NNTP and IMAP options since they URI already distinguishes them. This will make future changes to support POP3 and JMAP and arg caching with lei/store easier.
2021-09-09net_reader: set IMAPClient Keepalive flag late
Since we always enable SO_KEEPALIVE unconditionally, having it in {mic_arg} leads to unnecessary IPC overhead and memory use.
2021-09-03lei: fix read/write IMAP access
xt/net_writer-imap.t was completely broken in recent months and I completely forgot this test. net->add_url still only accepts bare scalars (and not scalar refs), so we must set that up properly. Furthermore, our changes to do FLAGS-only synchronization in lei of old messages was causing us to not handle FLAGS properly for the test.
2021-04-20config: favor ->get_all when possible
It's slightly less code.
2021-03-31doc: add lei-mail-formats(5) manpage
While plenty of online documentation exists, it's good to have a locally-available summary for users to look at offline. Fix a URL in Watch.pm while we're at it, too.
2021-03-10watch: IMAP: ignore \Deleted and \Draft messages
This matches existing Maildir behavior, as trash and draft messages have little reason to be exposed publicly.
2021-02-24net_reader: trim exports and remove unused uri_new
More network things for -watch are isolated in NetReader, now, so fewer exports are necessary.
2021-02-24watch: switch IMAP and NNTP fetch loops to NetReader
NetReader::<imap|nntp>_each were based on the -watch code they now replace. v2: do not warn on EINTR if user quit to fix occasional test failure in t/imapd.t
2021-02-24lei <import|convert>: support NNTP sources
We can read NNTP in -watch and Net::NNTP is shipped with Perl5, so lei import and convert have no excuse not to support NNTP as a client. Authentication is not tested, yet; but should be close to what IMAP is like...
2021-02-21net_reader: use and accept URIimap objects in more places
This flexibility should save us some code down-the-line.
2021-02-18watch: connect to NNTP and IMAP in config order
This is hopefully less surprising to users when they're prompted for credentials.
2021-02-18watch: move imap_common_init to NetReader
We'll use this in LeiImport and likely other places.
2021-02-10net_reader: new package split from -watch
We'll be using some of this for IMAP and NNTP support in lei, too. More will need to be done to improve code sharing and reusability, soon, but this is a start.
2021-02-10use MdirReader in -watch and InboxWritable
MdirReader now handles files in "$MAILDIR/new" properly and is stricter about what it accepts. eml_from_path is also made robust against FIFOs while eliminating TOCTOU races with between stat(2) and open(2) calls.
2021-02-08ds: improve add_timer usability
Packing args into an arrayref is awkward and we may be using this API more in lei.
2021-02-05eml: handle warning ignores for lei
There's nothing we can do about bad emails in our search results, so quiet things down and don't fight the MUA for the terminal.
2021-01-26use defined-or in a few more places
Mainly around fork() calls, but some nearby places as well.
2021-01-24treewide: reseed RNG in child processes
This prevents name conflicts leading to retries and slowdowns in temporary file name generation. No actual data corruption resulted because all temporary files are opened with O_EXCL anyways. This may increase security for IMAP, NNTP, and HTTPS sessions using TLS, but it's all public data anyways.
2021-01-12ds: block signals when reaping
This lets us call dwaitpid long before a process exits and not have to wait around for it. This is advantageous for lei where we can run dwaitpid on the pager as soon as we spawn it, instead of waiting for a client socket to go away on DESTROY.
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-26default to CORE::warn in $SIG{__WARN__} handlers
As with CORE::die and $SIG{__DIE__}, it turns out CORE::warn is safe to use inside $SIG{__WARN__} handlers without triggering infinite recursion. So fall back to reusing CORE::warn instead of creating a new sub.