about summary refs log tree commit homepage
path: root/lib/PublicInbox/DS.pm
DateCommit message (Collapse)
2024-04-03treewide: avoid getpid() for OnDestroy checks
getpid() isn't cached by glibc nowadays and system calls are more expensive due to CPU vulnerability mitigations. To ensure we switch to the new semantics properly, introduce a new `on_destroy' function to simplify callers. Furthermore, most OnDestroy correctness is often tied to the process which creates it, so make the new API default to guarded against running in subprocesses. For cases which require running in all children, a new PublicInbox::OnDestroy::all call is provided.
2023-11-26ds: long_step: eliminate redundant fileno call
We already stash the associated FD for reporting at startup and don't need to call `fileno' again. Found via manual code inspection while considering the effort to make async {forward} from PublicInbox::HTTP more like the generic long_response API and {long_cb} field used by IMAP/NNTP/POP3.
2023-11-15ds: run @post_loop_do if any user-queued events run
This ensures we can notice shutdown events in one-shot scripts like -cindex (and eventually -clone/-fetch/-compact) without forcing another real event to fire.
2023-11-03ds: don't try ->close after ->accept_SSL failure
Eric Wong <e@80x24.org> wrote: > --- a/lib/PublicInbox/DS.pm > +++ b/lib/PublicInbox/DS.pm > @@ -341,8 +341,8 @@ sub greet { > my $ev = EPOLLIN; > my $wbuf; > if ($sock->can('accept_SSL') && !$sock->accept_SSL) { > - return CORE::close($sock) if $! != EAGAIN; > - $ev = PublicInbox::TLS::epollbit() or return CORE::close($sock); > + return $sock->close if $! != EAGAIN; > + $ev = PublicInbox::TLS::epollbit() or return $sock->close; > $wbuf = [ \&accept_tls_step, $self->can('do_greet')]; > } > new($self, $sock, $ev | EPOLLONESHOT); Noticed this on deploy: -----8<----- Subject: [PATCH] ds: don't try ->close after ->accept_SSL failure ->accept_SSL failures leaves the socket ref as a GLOB (not IO::Handle) and unable to respond to the ->close method. Calling close in any form isn't actually necessary at all, so just let refcounting destroy the socket.
2023-11-02treewide: use ->close method rather than CORE::close
It's easier-to-read and should open the door for us to get rid of `tie' for ProcessIO without performance penalties for more frequently-used perlop calls and ability to do `stat' directly on the object instead of the awkward `tied' thing.
2023-11-02ds: replace FD map hash table with array
FDs are array indices into the kernel, anyways, so we can take advantage of space savings and speedups because the majority of FDs a big process has is going to end up in the array, anyways.
2023-11-01ds: make ->close behave like CORE::close
Matching existing Perl IO semantics seems like a good idea to reduce confusion in the future. We'll also fix some outdated comments and update indentation to match the rest of our code base since we're far detached from Danga::Socket at this point.
2023-11-01ds: move maxevents further down the stack
The epoll implementation is the only one which respects the limit (kevent would, but IO::KQueue does not). In any case, I'm not a fan of the maxevents=1000 historical default since it leads to fairness problems with shared non-blocking listeners across multiple daemon workers.
2023-11-01ds: do not defer close
We can map all integer FDs to Perl objects once ->ep_wait returns, so there's no need to play tricks elsewhere to ensure FDs can be mapped to objects within the same event loop iteration.
2023-11-01ds: next_tick: shorten object lifetimes
Drop reference counts ASAP in case it saves us some memory sooner rather than later. This ought to give us more predictable resource use and ensure OnDestroy callbacks fire sooner. There's no need to use `local' to clobber the arrayref anymore, either. AFAIK, this doesn't fix any known bug, but more predictability will make it easier to debug things going forward.
2023-10-18ds: get rid of SetLoopTimeout
It's not worth the code and memory to have a setter method we never use outside of tests.
2023-10-18ds: introduce and use do_fork helper
This ensures we handle RNG reseeding and resetting the event loop properly in child processes after forking.
2023-10-10ds: use a dummy poller during Reset
commit 1897c3be1ed644a05f96ed06cde4a9cc2ad0e5a4 (ds: Reset: replace Poller object early, 2023-10-04) was not effective at eliminating the following message at daemon shutdown: Can't call method "FILENO" on an undefined value at .../PublicInbox/Select.pm line 34 during global destruction. This seems down to some tied objects having unpredictable destruction order. So use a dummy class to ensure its ep_* methods never call the tied `FILENO' method at all since dropping the Poller object will release any resources it holds.
2023-10-04ds: make %AWAIT_PIDS a hash, not hashref
This is more persistent than some of the others and we don't swap it on use (unlike $nextq or $ToClose). In other words, it's helpful for communicating its lifetime expectancy is close to %DescriptorMap and not like to queue-type things such as $ToClose.
2023-10-04ds: cleanup fork + Reset support
We used to have many entries for %Stack, but nowadays it's just the one used by next_tick, so just replace it a $cur_runq variable. I'm reducing reliance on hash keys for things with global scope to ensure typos can be detected (strict||v5.12 forces us to fix uses of undeclared variables, but they can't detect typos in hash keys.
2023-10-04ds: Reset: replace Poller object early
Process shutdown can be chaotic and unpredictable. Try to make it more predictable by ensuring any PublicInbox::Select object can't hold references to any objects. This should fix the following error I saw in syslog during a deploy: Can't call method "FILENO" on an undefined value at .../PublicInbox/Select.pm line 34 during global destruction. Replacing $Poller with PublicInbox::Select (instead of undef-ing it) means we can avoid adding branches to ->epwait and ->close before calls to ->ep_mod and ->ep_del, respectively.
2023-10-04ds: don't pass FD map to post_loop_do callback
It's not used by any post_loop_do callbacks anymore, and the underlying FD map is a global `our' variable accessible from anywhere, anyways.
2023-10-04ds: hoist out close_non_busy
It's shared by both by lei and public-facing daemons in using the ->busy callback.
2023-10-04lei: drop stores explicitly at daemon shutdown
This will allow us to avoid unblocking signals during shutdown to simplify our code.
2023-09-26ds: awaitpid: Perl waitpid retries on EINTR automatically
perlipc(1) man page states both wait + waitpid will retry on EINTR. Thus there's no need to retry it ourselves.
2023-09-25ds: force event_loop wakeup on final child death
Reaping children needs to keep the event_loop spinning another round when the @post_loop_do callback may be used to check on process exit during shutdown. This allows us to get rid of the hacky SetLoopTimeout calls in lei-daemon and XapHelper.pm during process shutdown if we're trying to wait for all PIDs to exit before leaving the event loop.
2023-09-14ds: Reset at END
This seems to avoid some DBI teardown segfaults on OpenBSD 7.3; but there may be other places where something similar is necessary.
2023-09-12provide select(2) backend for PublicInbox::DS
This is safer than relying on an internal API of IO::Poll and doesn't create extra references to IO globs like the public one.
2023-09-11ds: use constants for @UNBLOCKABLE list
There's no need for for a complicated map {} block here. All these unblockable signals are POSIX since 2001 at the latest, so there's no reason any platform would lack them.
2023-09-11favor poll(2) for most daemons
public-inbox-watch, lei-daemon, the master process of public-inbox-(netd|httpd|imapd|nntpd|pop3d), and the (mostly) Perl implementation of XapHelper do not have many FDs to watch so epoll|kqueue end up being overkill. Of course, *BSDs already have separate kqueue FDs emulating signalfd and/or inotify, even. In other words, only the worker processes of public-inbox-(netd|httpd|imapd|nntpd|pop3d) are expected to see C10K (or C100K) types of traffic where epoll|kqueue shine. Perhaps lei could benefit from epoll/kqueue on some virtual users IMAP/JMAP system one day; as could -watch with many IMAP IDLE folders; but we'll probably add a knob if/when it comes to that.
2023-09-11ds: use object-oriented API for epoll
This allows us to cut down on imports and reduce code. This also makes it easier (in the next commit) to provide an option to disable epoll/kqueue when saving an FD is valued over scalability.
2023-09-11daemon: depend on DS event_loop in master process, too
The awaitpid API turns out to be quite handy for managing long-lived worker processes. This allows us to ensure all our uses of signalfd (and kevent emulation) are non-blocking.
2023-09-08ds: fix signals unblock for non-signalfd/kqueue
Using the sigset result of allowset() isn't appropriate for SIG_UNBLOCK. We must generate a new signal set off of the $sig dispatch map for use with SIG_UNBLOCK to actually unblock the signals. This is the first part in getting t/imapd.t to pass the reload-after-setting--imap.pollInterval-test when neither signalfd nor kqueue are usable.
2023-09-05daemon: workaround pre-EVFILT_SIGNAL signals
FreeBSD and OpenBSD kqueue EVFILT_SIGNAL isn't able to handle blocked signals which were sent before the filter is created. This behavior differs from Linux signalfd, which can process blocked signals that were sent before the signalfd existed.
2023-09-05ds: don't block important signals we don't use
Don't block SIGABRT, SIGBUS, SIGFPE, SIGILL nor SIGSEGV since blocking them can hide real bugs in our code or 3rd-party libraries and executables. We'll also leave SIGXCPU and SIGXFSZ unblocked since users may've setup RLIMIT_CPU and RLIMIT_FSIZE, respectively.
2023-08-28Fix some typos/grammar/errors in docs and comments
2023-03-29cindex: leave SIGTSTP and SIGCONT unblocked
This makes it easier to pause and restart long-running indexing jobs which use our event loop.
2023-03-25ds: @post_loop_do replaces SetPostLoopCallback
This allows us to avoid repeatedly using memory-intensive anonymous subs in CodeSearchIdx where the callback is assigned frequently. Anonymous subs are known to leak memory in old Perls (e.g. 5.16.3 in enterprise distros) and still expensive in newer Perls. So favor the (\&subroutine, @args) form which allows us to eliminate anonymous subs going forward. Only CodeSearchIdx takes advantage of the new API at the moment, since it's the biggest repeat user of post-loop callback changes. Getting rid of the subroutine and relying on a global `our' variable also has two advantages: 1) Perl warnings can detect typos at compile-time, whereas the (now gone) method could only detect errors at run-time. 2) `our' variable assignment can be `local'-ized to a scope
2023-03-16ds: reap_pids: remove redundant signal blocking
Blocking signals when reaping was done when the lei pager was spawned by the daemon in b90e8d6e02. Shortly afterwards in 7b79c918a5, the client script took over spawning of the pager and made b90e8d6e02 redundant. cf. b90e8d6e02 (ds: block signals when reaping, 2021-01-10) 7b79c918a5 (lei: run pager in client script, 2021-01-10)
2023-02-26ds: write: do not assume final wbuf entry is tmpio
The final entry of {wbuf} may be a CODE ref and not a tmpio ARRAY ref, so we must ensure it's an ARRAY before attempting to use `->[INDEX]' to access it. This fixes: forward ->close error: Not an ARRAY reference at PublicInbox/DS.pm line 544.
2023-01-21ds: awaitpid: do not clobber entries for reaped processes
We must only write to $AWAIT_PIDS on the initial reap attempt. While we're at it, avoid triggering an extra wakeup if we're doing synchronous awaitpid. This seems to eliminate most reliance on Qspawn->DESTROY to call Qspawn->finalize.
2023-01-19ds: improve error handling of synchronous awaitpid
EINTR needs to be retried for non-kqueue|signalfd users, and ECHILD indicates a bug in our code.
2023-01-18ds: drop dwaitpid, switch to waitpid(-1)
With no remaining users, we can drop dwaitpid and switch awaitpid to rely on waitpid(-1) to save syscalls.
2023-01-18ds: introduce awaitpid, switch ProcessPipe users
awaitpid is the new API which will eventually replace dwaitpid. It enables early registration of callback handlers. Eventually (once dwaitpid is gone) it'll be able to use fewer waitpid calls. The avoidance of waitpid(-1) in our earlier days was driven by the belief that threads may eventually become relevant for Perl 5, but that's extremely unlikely at this stage. I will still introduce optional threads via C, but they definitely won't be spawning/reaping processes. Argument order to callbacks is swapped (PID first) to allow flattened multiple arguments more natrually. The previous API (allowing only a single argument, as influenced by pthread_create(3)) was more tedious as it involved packing multiple arguments into yet another array.
2023-01-03daemon: don't bother checking for existing FD flags
FD_CLOEXEC is the only currently defined FD flag, and has been the case for decades at this point. I highly doubt any default FD flag will ever be forced on us by the kernel, init system, or Perl. So save ourselves a syscall and just call F_SETFD with the assumption FD_CLOEXEC is the only FD flag that we'd ever care for.
2022-12-23syscall: get rid of epoll_defined() sub
We can just check defined() on the `our' var itself and save the process several kilobytes of memory.
2022-08-10daemon: rely on $SIG{__WARN__} for error output
warn/carp usage is unavoidable given Perl itself and standard libraries, so just rely on localized $SIG{__WARN__} from 60d262483a4d6ddf (daemon: use per-listener SIG{__WARN__} callbacks, 2022-08-08) for all error reporting. While we're in the area, make some of the error handling more consistent between IMAP/NNTP/POP3.
2022-08-09imap: prioritize AUTH=ANONYMOUS clients
...by deprioritizing clients using a username + password. As IMAP provides AUTH=ANONYMOUS for designating anonymous access, we'll rely on it as a heuristic for favoring "good" clients. Clients using a username + password seem to (more often than not) be malicious and looking for info which doesn't belong in public inboxes. This copies the technique used by WWW + -httpd to deprioritize expensive mbox.gz downloads.
2022-08-03ds: use ->dflush to distinguish from ->zflush
->zflush is already for GzipFilter in PublicInbox::WWW, while we use DEFLATE for NNTP and IMAP. This ought to make the code easier-to-follow.
2022-07-23imap+nntp: share COMPRESS implementation
Their code was nearly identical to begin with, so save some memory in -netd and disk space for all of our tarball/distro users, at least. And I seem to have used multiple inheritance successfully, here, maybe...
2022-07-23ds: share long_step between NNTP and IMAP
It's not actually used by our POP3 code at the moment, but it may be soon to reduce memory usage when loading 50K smsg objects into memory.
2022-07-23ds: move requeue_once
It's the same subroutine everywhere.
2022-07-23ds: move no-op ->zflush to common base class
More deduplication, and POP3 never needed it.
2022-07-23ds: support greeting protocols
We can share some common code between IMAP, NNTP, and POP3 without too much trouble, so cut down our LoC.
2021-10-16imapd+nntpd: drop timer-based expiration
It's needlessly complex and O(n), so it doesn't scale well to a high number of clients nor is it easy-to-scale with the data structures available to us in pure Perl. In any case, I see no evidence of either -imapd nor -nntpd experiencing high connection loads on public-facing sites. -httpd has never had its own timer-based expiration, either. Fwiw, public-inbox.org itself has been running a public-facing HTTP/HTTPS server with no userspace idle client expiration for the past 8 years or with no ill effect. Clients can come and go as they wish, and SO_KEEPALIVE takes care of truly broken connections if they're gone for ~2 hours. Internet connections drop all time, so it should be harmless to drop connections w/o warning since both NNTP and IMAP protocols have well-defined semantics for determining if a message was truncated (as does HTTP/1.1+).