Date | Commit message (Collapse) |
|
getpid() isn't cached by glibc nowadays and system calls are
more expensive due to CPU vulnerability mitigations. To
ensure we switch to the new semantics properly, introduce
a new `on_destroy' function to simplify callers.
Furthermore, most OnDestroy correctness is often tied to the
process which creates it, so make the new API default to
guarded against running in subprocesses.
For cases which require running in all children, a new
PublicInbox::OnDestroy::all call is provided.
|
|
We already stash the associated FD for reporting at startup and
don't need to call `fileno' again. Found via manual code
inspection while considering the effort to make async {forward}
from PublicInbox::HTTP more like the generic long_response API
and {long_cb} field used by IMAP/NNTP/POP3.
|
|
This ensures we can notice shutdown events in one-shot scripts
like -cindex (and eventually -clone/-fetch/-compact) without
forcing another real event to fire.
|
|
Eric Wong <e@80x24.org> wrote:
> --- a/lib/PublicInbox/DS.pm
> +++ b/lib/PublicInbox/DS.pm
> @@ -341,8 +341,8 @@ sub greet {
> my $ev = EPOLLIN;
> my $wbuf;
> if ($sock->can('accept_SSL') && !$sock->accept_SSL) {
> - return CORE::close($sock) if $! != EAGAIN;
> - $ev = PublicInbox::TLS::epollbit() or return CORE::close($sock);
> + return $sock->close if $! != EAGAIN;
> + $ev = PublicInbox::TLS::epollbit() or return $sock->close;
> $wbuf = [ \&accept_tls_step, $self->can('do_greet')];
> }
> new($self, $sock, $ev | EPOLLONESHOT);
Noticed this on deploy:
-----8<-----
Subject: [PATCH] ds: don't try ->close after ->accept_SSL failure
->accept_SSL failures leaves the socket ref as a GLOB (not
IO::Handle) and unable to respond to the ->close method.
Calling close in any form isn't actually necessary at all,
so just let refcounting destroy the socket.
|
|
It's easier-to-read and should open the door for us to get rid
of `tie' for ProcessIO without performance penalties for
more frequently-used perlop calls and ability to do `stat' directly
on the object instead of the awkward `tied' thing.
|
|
FDs are array indices into the kernel, anyways, so we can
take advantage of space savings and speedups because the
majority of FDs a big process has is going to end up in
the array, anyways.
|
|
Matching existing Perl IO semantics seems like a good idea to
reduce confusion in the future.
We'll also fix some outdated comments and update indentation
to match the rest of our code base since we're far detached from
Danga::Socket at this point.
|
|
The epoll implementation is the only one which respects the
limit (kevent would, but IO::KQueue does not). In any case,
I'm not a fan of the maxevents=1000 historical default since
it leads to fairness problems with shared non-blocking listeners
across multiple daemon workers.
|
|
We can map all integer FDs to Perl objects once ->ep_wait returns,
so there's no need to play tricks elsewhere to ensure FDs can
be mapped to objects within the same event loop iteration.
|
|
Drop reference counts ASAP in case it saves us some memory
sooner rather than later. This ought to give us more predictable
resource use and ensure OnDestroy callbacks fire sooner.
There's no need to use `local' to clobber the arrayref anymore,
either.
AFAIK, this doesn't fix any known bug, but more predictability
will make it easier to debug things going forward.
|
|
It's not worth the code and memory to have a setter method we
never use outside of tests.
|
|
This ensures we handle RNG reseeding and resetting the event
loop properly in child processes after forking.
|
|
commit 1897c3be1ed644a05f96ed06cde4a9cc2ad0e5a4
(ds: Reset: replace Poller object early, 2023-10-04)
was not effective at eliminating the following message
at daemon shutdown:
Can't call method "FILENO" on an undefined value at
.../PublicInbox/Select.pm line 34 during global destruction.
This seems down to some tied objects having unpredictable
destruction order. So use a dummy class to ensure its ep_*
methods never call the tied `FILENO' method at all since
dropping the Poller object will release any resources it holds.
|
|
This is more persistent than some of the others and we don't
swap it on use (unlike $nextq or $ToClose). In other words,
it's helpful for communicating its lifetime expectancy is
close to %DescriptorMap and not like to queue-type things
such as $ToClose.
|
|
We used to have many entries for %Stack, but nowadays it's just
the one used by next_tick, so just replace it a $cur_runq variable.
I'm reducing reliance on hash keys for things with global scope
to ensure typos can be detected (strict||v5.12 forces us to fix
uses of undeclared variables, but they can't detect typos in
hash keys.
|
|
Process shutdown can be chaotic and unpredictable. Try to make
it more predictable by ensuring any PublicInbox::Select object
can't hold references to any objects.
This should fix the following error I saw in syslog during a deploy:
Can't call method "FILENO" on an undefined value at
.../PublicInbox/Select.pm line 34 during global destruction.
Replacing $Poller with PublicInbox::Select (instead of undef-ing
it) means we can avoid adding branches to ->epwait and ->close
before calls to ->ep_mod and ->ep_del, respectively.
|
|
It's not used by any post_loop_do callbacks anymore, and the
underlying FD map is a global `our' variable accessible from
anywhere, anyways.
|
|
It's shared by both by lei and public-facing daemons in using
the ->busy callback.
|
|
This will allow us to avoid unblocking signals during
shutdown to simplify our code.
|
|
perlipc(1) man page states both wait + waitpid will retry
on EINTR. Thus there's no need to retry it ourselves.
|
|
Reaping children needs to keep the event_loop spinning another
round when the @post_loop_do callback may be used to check
on process exit during shutdown.
This allows us to get rid of the hacky SetLoopTimeout calls in
lei-daemon and XapHelper.pm during process shutdown if we're
trying to wait for all PIDs to exit before leaving the event
loop.
|
|
This seems to avoid some DBI teardown segfaults on OpenBSD 7.3;
but there may be other places where something similar is necessary.
|
|
This is safer than relying on an internal API of IO::Poll
and doesn't create extra references to IO globs like the
public one.
|
|
There's no need for for a complicated map {} block here. All
these unblockable signals are POSIX since 2001 at the latest, so
there's no reason any platform would lack them.
|
|
public-inbox-watch, lei-daemon, the master process of
public-inbox-(netd|httpd|imapd|nntpd|pop3d),
and the (mostly) Perl implementation of XapHelper do not
have many FDs to watch so epoll|kqueue end up being overkill.
Of course, *BSDs already have separate kqueue FDs emulating
signalfd and/or inotify, even.
In other words, only the worker processes of
public-inbox-(netd|httpd|imapd|nntpd|pop3d) are expected
to see C10K (or C100K) types of traffic where epoll|kqueue
shine.
Perhaps lei could benefit from epoll/kqueue on some virtual users
IMAP/JMAP system one day; as could -watch with many IMAP IDLE
folders; but we'll probably add a knob if/when it comes to that.
|
|
This allows us to cut down on imports and reduce code.
This also makes it easier (in the next commit) to provide an option
to disable epoll/kqueue when saving an FD is valued over scalability.
|
|
The awaitpid API turns out to be quite handy for managing
long-lived worker processes. This allows us to ensure all our
uses of signalfd (and kevent emulation) are non-blocking.
|
|
Using the sigset result of allowset() isn't appropriate for
SIG_UNBLOCK. We must generate a new signal set off of the $sig
dispatch map for use with SIG_UNBLOCK to actually unblock the
signals.
This is the first part in getting t/imapd.t to pass the
reload-after-setting--imap.pollInterval-test when neither
signalfd nor kqueue are usable.
|
|
FreeBSD and OpenBSD kqueue EVFILT_SIGNAL isn't able to handle
blocked signals which were sent before the filter is created.
This behavior differs from Linux signalfd, which can process
blocked signals that were sent before the signalfd existed.
|
|
Don't block SIGABRT, SIGBUS, SIGFPE, SIGILL nor SIGSEGV since
blocking them can hide real bugs in our code or 3rd-party
libraries and executables.
We'll also leave SIGXCPU and SIGXFSZ unblocked since users
may've setup RLIMIT_CPU and RLIMIT_FSIZE, respectively.
|
|
|
|
This makes it easier to pause and restart long-running indexing
jobs which use our event loop.
|
|
This allows us to avoid repeatedly using memory-intensive
anonymous subs in CodeSearchIdx where the callback is assigned
frequently. Anonymous subs are known to leak memory in old
Perls (e.g. 5.16.3 in enterprise distros) and still expensive in
newer Perls. So favor the (\&subroutine, @args) form which
allows us to eliminate anonymous subs going forward.
Only CodeSearchIdx takes advantage of the new API at the moment,
since it's the biggest repeat user of post-loop callback
changes.
Getting rid of the subroutine and relying on a global `our'
variable also has two advantages:
1) Perl warnings can detect typos at compile-time, whereas the
(now gone) method could only detect errors at run-time.
2) `our' variable assignment can be `local'-ized to a scope
|
|
Blocking signals when reaping was done when the lei pager was
spawned by the daemon in b90e8d6e02. Shortly afterwards in
7b79c918a5, the client script took over spawning of the pager
and made b90e8d6e02 redundant.
cf. b90e8d6e02 (ds: block signals when reaping, 2021-01-10)
7b79c918a5 (lei: run pager in client script, 2021-01-10)
|
|
The final entry of {wbuf} may be a CODE ref and not a
tmpio ARRAY ref, so we must ensure it's an ARRAY before
attempting to use `->[INDEX]' to access it.
This fixes:
forward ->close error: Not an ARRAY reference at PublicInbox/DS.pm line 544.
|
|
We must only write to $AWAIT_PIDS on the initial reap attempt.
While we're at it, avoid triggering an extra wakeup if we're
doing synchronous awaitpid. This seems to eliminate most
reliance on Qspawn->DESTROY to call Qspawn->finalize.
|
|
EINTR needs to be retried for non-kqueue|signalfd users,
and ECHILD indicates a bug in our code.
|
|
With no remaining users, we can drop dwaitpid and switch
awaitpid to rely on waitpid(-1) to save syscalls.
|
|
awaitpid is the new API which will eventually replace dwaitpid.
It enables early registration of callback handlers. Eventually
(once dwaitpid is gone) it'll be able to use fewer waitpid
calls.
The avoidance of waitpid(-1) in our earlier days was driven by
the belief that threads may eventually become relevant for Perl 5,
but that's extremely unlikely at this stage. I will still
introduce optional threads via C, but they definitely won't be
spawning/reaping processes.
Argument order to callbacks is swapped (PID first) to allow
flattened multiple arguments more natrually. The previous API
(allowing only a single argument, as influenced by
pthread_create(3)) was more tedious as it involved packing
multiple arguments into yet another array.
|
|
FD_CLOEXEC is the only currently defined FD flag, and has been
the case for decades at this point. I highly doubt any default
FD flag will ever be forced on us by the kernel, init system, or
Perl. So save ourselves a syscall and just call F_SETFD with
the assumption FD_CLOEXEC is the only FD flag that we'd ever
care for.
|
|
We can just check defined() on the `our' var itself and
save the process several kilobytes of memory.
|
|
warn/carp usage is unavoidable given Perl itself and standard
libraries, so just rely on localized $SIG{__WARN__} from
60d262483a4d6ddf (daemon: use per-listener SIG{__WARN__} callbacks, 2022-08-08)
for all error reporting.
While we're in the area, make some of the error handling more
consistent between IMAP/NNTP/POP3.
|
|
...by deprioritizing clients using a username + password.
As IMAP provides AUTH=ANONYMOUS for designating anonymous
access, we'll rely on it as a heuristic for favoring "good"
clients. Clients using a username + password seem to (more
often than not) be malicious and looking for info which doesn't
belong in public inboxes.
This copies the technique used by WWW + -httpd to deprioritize
expensive mbox.gz downloads.
|
|
->zflush is already for GzipFilter in PublicInbox::WWW,
while we use DEFLATE for NNTP and IMAP. This ought to
make the code easier-to-follow.
|
|
Their code was nearly identical to begin with, so save some
memory in -netd and disk space for all of our tarball/distro
users, at least.
And I seem to have used multiple inheritance successfully, here,
maybe...
|
|
It's not actually used by our POP3 code at the moment,
but it may be soon to reduce memory usage when loading
50K smsg objects into memory.
|
|
It's the same subroutine everywhere.
|
|
More deduplication, and POP3 never needed it.
|
|
We can share some common code between IMAP, NNTP, and POP3
without too much trouble, so cut down our LoC.
|
|
It's needlessly complex and O(n), so it doesn't scale well to a
high number of clients nor is it easy-to-scale with the data
structures available to us in pure Perl.
In any case, I see no evidence of either -imapd nor -nntpd
experiencing high connection loads on public-facing sites.
-httpd has never had its own timer-based expiration, either.
Fwiw, public-inbox.org itself has been running a public-facing
HTTP/HTTPS server with no userspace idle client expiration for
the past 8 years or with no ill effect. Clients can come and go
as they wish, and SO_KEEPALIVE takes care of truly broken
connections if they're gone for ~2 hours.
Internet connections drop all time, so it should be harmless to
drop connections w/o warning since both NNTP and IMAP protocols
have well-defined semantics for determining if a message was
truncated (as does HTTP/1.1+).
|