about summary refs log tree commit homepage
path: root/lib/PublicInbox/Qspawn.pm
DateCommit message (Collapse)
2019-09-17qspawn: improve variable naming and commenting
Naming $start_cb consistently helps avoid confusing new readers, and some comments will help with understanding flow
2019-09-17qspawn: shorten lifetime of circular references
All of these circular references are designed to clear themselves, but these will make actual errors from Devel::Cycle easier-to-spot. The circular reference in the limiter {run_queue} is not a real problem, but we can avoid storing the circular reference until we actually need to spawn the child, reducing the size of the Qspawn object while it's in the queue, slightly. We also do not need to have redundant checks to spawn new processes, we should only spawn new processes when they're ->start-ed or after waitpid reaps them.
2019-09-17qspawn: log errors for generic PSGI server users
Generic PSGI servers have $env->{'psgi.errors'}, too, so ensure they can log errors.
2019-09-17qspawn: remove return value from ->finish
We don't use the return value in real code since we do waitpid asynchronously, now. So simplify our runtime code at the cost of making our test slighly more complex.
2019-09-15qspawn: shorten lifetime of environ and opts args
We don't need to hold onto the subprocess environ and redirects/options for popen_rd after spawning the child process. I do not expect this to fix problem of leaking unlinked regular file descriptors (which I still can't reproduce), and it definitely does not fix the problem of leaking pipe descriptors (which I also can't reproduce). This will save an FD sooner on non-public-inbox-httpd servers which give a non-FD $env->{'psgi.input'}, however Regardless, it's good to free up memory resources in our own process ASAP we're done using them.
2019-09-15qspawn: clarify and improve error handling
EINTR should not happen when using non-blocking sockets like we do in our daemons, but maybe some OSes allow it to happen and edge-triggered notifications won't notify us again. So always retry immediately on EINTR without relying on kqueue or epoll to notify us, and log any other unrecoverable errors which may happen while we're at it.
2019-09-14qspawn: remove unused WNOHANG import
We rely on DS to do waitpid with WNOHANG, now, and the non-DS code path won't use WNOHANG.
2019-09-14httpd/async: improve naming and comments
Rename the {cleanup} field to {end}, since it's similar to END {} and is consistent with the variable in Qspawn.pm And document how certain subs get called, since we have many subs named "new" and "close".
2019-09-14qspawn: simplify by using PerlIO::scalar
I didn't know PerlIO::scalar existed until a few months ago, but it's been distributed with Perl since 5.8 and doesn't seem to be split out into it's own package on any distro.
2019-07-08ds: use WNOHANG with waitpid if inside event loop
While we're usually not stuck waiting on waitpid after seeing a pipe EOF or even triggering SIGPIPE in the process (e.g. git-http-backend) we're reading from, it MAY happen and we should be careful to never hang the daemon process on waitpid calls. v2: use "eq" for string comparison against 'DEFAULT'
2019-07-04qspawn: retry sysread when parsing headers, too
We need to ensure the BIN_DETECT (8000 byte) check in ViewVCS can be handled properly when sending giant files. Otherwise, EPOLLET won't notify us, again, and responses can get stuck. While we're at it, bump up the read-size up to 4096 bytes so we make fewer trips to the kernel.
2019-06-29http: use bigger, but shorter-lived buffers for pipes
Linux pipes default to 65536 bytes in size, and we want to read external processes as fast as possible now that we don't use Danga::Socket or buffer to heap. However, drop the buffer ASAP if we need to wait on anything; since idle buffers can be idle for eons. This lets other execution contexts can reuse that memory right away.
2019-06-24http|nntp: favor "$! == EFOO" over $!{EFOO} checks
Integer comparisions of "$!" are faster than hash lookups. See commit 6fa2b29fcd0477d126ebb7db7f97b334f74bbcbc ("ds: cleanup Errno imports and favor constant comparisons") for benchmarks.
2019-06-24qspawn: describe where `$rpipe' come from
It wasn't immediately obvious to me after several months of not looking at this code.
2019-05-04bundle Danga::Socket and Sys::Syscall
These modules are unmaintained upstream at the moment, but I'll be able to help with the intended maintainer once/if CPAN ownership is transferred. OTOH, we've been waiting for that transfer for several years, now... Changes I intend to make: * EPOLLEXCLUSIVE for Linux * remove unused fields wasting memory * kqueue bugfixes e.g. https://rt.cpan.org/Ticket/Display.html?id=116615 * accept4 support And some lower priority experiments: * switch to EV_ONESHOT / EPOLLONESHOT (incompatible changes) * nginx-style buffering to tmpfile instead of string array * sendfile off tmpfile buffers * io_uring maybe?
2019-04-04qspawn: wire up RLIMIT_* handling to limiters
This allows users to configure RLIMIT_{CORE,CPU,DATA} using our "limiter" config directive when spawning external processes.
2019-01-31qspawn: documentation updates
This will become critical for future changes to display git commits, diffs, and trees. Use "qspawn.wcb" instead of "qspawn.response" to enhance readability.
2019-01-27qspawn: decode $? for user-friendliness
The raw value of $? isn't very useful, generally.
2019-01-22qspawn: implement psgi_qx
This new asynchronous API, will allow us to take advantage of non-blocking I/O from even small commands; as those may still need to wait for slow operations.
2019-01-22qspawn|httpd/async: improve and fix out-of-date comments
2019-01-22qspawn|getlinebody: support streaming filters
This is intended for wrapping "git show" and "git diff" processes in the future and to prevent it from monopolizing callers. This will us to better handle backpressure from gigantic commits.
2019-01-22qspawn: implement psgi_return and use it for githttpbackend
Was: ("repobrowse: port patch generation over to qspawn") We'll be using it for githttpbackend and maybe other things.
2018-02-07update copyrights for 2018
Using update-copyrights from gnulib While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-01-07qspawn: prepare to support runtime reloading of Limiter
We may allow the {max} value of a limiter to be changed in the future, so lets start accounting for it before we spawn followup processes.
2016-07-09www: add configurable limiters
Currently only for git-http-backend use, this allows limiting the number of spawned processes per-inbox or by group, if there are multiple large inboxes amidst a sea of small ones. For example, a "big" repo limiter could be used for big inboxes: which would be shared between multiple repos: [limiter "big"] max = 4 [publicinbox "git"] address = git@vger.kernel.org mainrepo = /path/to/git.git ; shared limiter with giant: httpbackendmax = big [publicinbox "giant"] address = giant@project.org mainrepo = /path/to/giant.git ; shared limiter with git: httpbackendmax = big ; This is a tiny inbox, use the default limiter with 32 slots: [publicinbox "meta"] address = meta@public-inbox.org mainrepo = /path/to/meta.git
2016-07-09qspawn: allow configurable limiters
And bump the default limit to 32 so we match git-daemon behavior. This shall allow us to configure different levels of concurrency for different repositories and prevent clones of giant repos from stalling service to small repos.
2016-06-21spawn: improve error checking for fork failures
fork failures are unfortunately common when Xapian has gigabytes and gigabytes mmapped.
2016-05-24git-http-backend: use qspawn to limit running processes
Having an excessive amount of git-pack-objects processes is dangerous to the health of the server. Queue up process spawning for long-running responses and serve them sequentially, instead.