path: root/lib/PublicInbox/LEI.pm
Date Commit message
2023-09-25 ds: force event_loop wakeup on final child death
Reaping children needs to keep the event_loop spinning another round when the @post_loop_do callback may be used to check on process exit during shutdown. This allows us to get rid of the hacky SetLoopTimeout calls in lei-daemon and XapHelper.pm during process shutdown if we're trying to wait for all PIDs to exit before leaving the event loop.
2023-09-24 lei: use scalar %SIG assignment
Perl v5.16.3 (and possibly some later versions) complain about this, but newer (v5.32.1) are fine with it. Fixes: e281363ba937 ("lei: ensure we run DESTROY|END at daemon exit w/ kqueue")
2023-09-24 ipc: recv_cmd4 clobbers destination buffer on errors
Handling this should be done at the lowest levels possible; so away from higher-level lei code.
2023-09-24 lei: fix `-c NAME=VALUE' config support
We can pass `-c NAME=VALUE' args directly to git-config without needing a temporary directory nor file. Furthermore, this opens the door to us being able to correctly handle `-c NAME=VALUE' after `delete $lei->{cfg}' if we need to reload the config during a command. This tightens up error-checking for `lei config' and ensures we can make config settings changes while using `-c NAME=VALUE' instead of editing the temporary file. The non-obvious part was avoiding the use of the -f/--file arg for `git config' for read-only operations and include relying on `-c include.path=$ABS_PATH'. This is done by parsing the switches to be passed to `git config' to determine if it's a read-only operation or not.
2023-09-24 lei config: send `git config' errors to pager
Our previous use of lei->cfg_dump was wrong as the extra arg was never supported. Instead, we need to capture the output of `git config' and send it to the pager if ->cfg_dump fails. We'll also add a note to the user to quit the pager to continue.
2023-09-24 lei: check git-config(1) failures
2020-2021 were bad times and I somehow got deluded into believing git-config(1) would always succeed :x
2023-09-22 lei: improve ->fail internal API
Allow the exit code to be the first argument instead of the last to match our ->child_error, as well as the BSD err(3) API. We'll also avoid shifting user-passed exit codes so $? can be passed as-is without losing signal information.
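As a rough illustration of why avoiding the shift matters: Perl's $? keeps the exit code in the high byte and the terminating signal in the low 7 bits, so a raw $? carries both. The decode_status helper below is hypothetical and is not part of lei's ->fail or ->child_error.

```perl
use strict;
use warnings;

# hypothetical helper (not lei's API): split a raw wait status as found in $?
sub decode_status {
    my ($status) = @_;
    return {
        exit_code => $status >> 8,   # high byte: value passed to exit
        signal    => $status & 127,  # low 7 bits: terminating signal, if any
    };
}

# a child which exits normally with code 3
system($^X, '-e', 'exit 3');
my $exited = decode_status($?);

# a child which kills itself with SIGTERM (list-form system, so no shell)
system($^X, '-e', 'kill "TERM", $$; sleep 5');
my $killed = decode_status($?);
```

A caller which only received `$? >> 8` would see 0 for the second child and could not tell it apart from a clean exit.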
2023-09-15 lei: ensure we run DESTROY|END at daemon exit w/ kqueue
The fundamental difference which I originally missed when implementing kqueue EVFILT_SIGNAL support is that it does not consume signals like signalfd(2) does. In other words, with EVFILT_SIGNAL, it's possible for a single signal to be delivered twice if we unblock signals upon leaving the event loop as we do in lei. Note: Our DS->event_loop and Sigfd APIs can/should probably be changed to better accommodate EVFILT_SIGNAL differences from signalfd without sacrificing usability of either. This fixes the problem of leftover lei-ovv.dst*, lei_cfg-* and skv.* files in $TMPDIR at the end of test suite runs on *BSD when IO::KQueue is installed.
2023-08-30 treewide: drop MSG_EOR with AF_UNIX+SOCK_SEQPACKET
It's apparently not needed for AF_UNIX + SOCK_SEQPACKET as our receivers never check for MSG_EOR in "struct msghdr".msg_flags anyways. I don't believe POSIX is clear on the exact semantics of MSG_EOR on this socket type. This works around truncation problems on OpenBSD recvmsg when MSG_EOR is used by the sender. Link: https://marc.info/?i=20230826020759.M335788@dcvr
2023-03-26 Merge branch 'cindex'
* cindex: (29 commits)
  cindex: --prune checkpoints to avoid OOM
  cindex: ignore SIGPIPE
  cindex: respect existing permissions
  cindex: squelch incompatible options
  cindex: implement reindex
  cindex: add support for --prune
  cindex: filter out non-existent git directories
  spawn: show failing directory for chdir failures
  cindex: improve granularity of quit checks
  cindex: attempt to give oldest commits lowest docids
  cindex: truncate or drop body for over-sized commits
  cindex: check for checkpoint before giant messages
  cindex: implement --max-size=SIZE
  sigfd: pass signal name rather than number to callback
  cindex: handle graceful shutdown by default
  cindex: drop `unchanged' progress message
  cindex: show shard number in progress message
  cindex: implement --exclude= like -clone
  ds: @post_loop_do replaces SetPostLoopCallback
  cindex: use DS and workqueues for parallelism
  ...
2023-03-25 ds: @post_loop_do replaces SetPostLoopCallback
This allows us to avoid repeatedly using memory-intensive anonymous subs in CodeSearchIdx where the callback is assigned frequently. Anonymous subs are known to leak memory in old Perls (e.g. 5.16.3 in enterprise distros) and are still expensive in newer Perls. So favor the (\&subroutine, @args) form which allows us to eliminate anonymous subs going forward. Only CodeSearchIdx takes advantage of the new API at the moment, since it's the biggest repeat user of post-loop callback changes. Getting rid of the subroutine and relying on a global `our' variable also has two advantages:
1) Perl warnings can detect typos at compile-time, whereas the (now gone) method could only detect errors at run-time.
2) `our' variable assignment can be `local'-ized to a scope
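The (\&subroutine, @args) form and the `local'-ized assignment can be sketched with a toy loop. Demo::Loop, keep_running and pending_left are hypothetical names for illustration; this is not PublicInbox::DS, and it assumes a true return value from the callback means "keep looping".

```perl
use strict;
use warnings;

package Demo::Loop;            # hypothetical stand-in, not PublicInbox::DS
our @post_loop_do;             # holds (\&subroutine, @args)

# assumption: a true return from the callback keeps the loop running
sub keep_running {
    my ($cb, @args) = @post_loop_do;
    return 1 unless $cb;       # no callback registered
    return $cb->(@args);
}

package main;

# a named sub taking explicit args, so no anonymous sub is needed
sub pending_left { my ($queue) = @_; scalar(@$queue) }

my @queue = (1, 2);
my @seen;
{
    # `local' restores the previous (empty) value when the scope ends
    local @Demo::Loop::post_loop_do = (\&pending_left, \@queue);
    while (Demo::Loop::keep_running()) {
        push @seen, shift @queue;
    }
}
```

Misspelling @Demo::Loop::post_loop_do would trip a "used only once" warning at compile-time, which is the first advantage listed above.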
2023-03-25 ipc: retry sendmsg + recvmsg calls on EINTR
I'm not sure how this went undetected for so long, but EINTR must be checked for when working with blocking sockets. EINTR shouldn't happen for non-blocking sockets, though, but it's easier to just use the new wrapper in most of those places. I don't know what I was smoking when I left out EINTR checks :x
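A minimal sketch of the retry-on-EINTR pattern follows. retry_eintr and flaky_send are hypothetical names; the wrapper actually added to public-inbox may look different, and flaky_send merely simulates an interrupted syscall.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# hypothetical wrapper: call $op->(@args) until it stops failing with EINTR
sub retry_eintr {
    my ($op, @args) = @_;
    while (1) {
        my $ret = $op->(@args);
        return $ret if defined $ret;  # success
        return if $! != EINTR;        # real error, $! is left for the caller
        # interrupted by a signal: try again
    }
}

# stand-in for a syscall which gets interrupted twice before succeeding
my $calls = 0;
sub flaky_send {
    my ($buf) = @_;
    if (++$calls < 3) {
        $! = EINTR;
        return undef;
    }
    return length($buf);
}

my $sent = retry_eintr(\&flaky_send, 'hello');
```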
2023-02-22 treewide: simplify File::Path mkpath/make_path callers
File::Path already accounts for the existence of directories, handles races from redundant mkdir(2), and croaks on unrecoverable errors. So there's no point in doing any of that on our end. Furthermore, avoiding the overhead of loading File::Path doesn't seem worth it to save 20-60ms given the overhead of loading our other code. Instead, try to reduce optree overhead in our code, since File::Path gets used in a bunch of places. We'll also favor the newer make_path for multi-directory invocations to avoid bloating our own optree to create an arrayref, but mkpath is one fewer subroutine call within File::Path itself, right now.
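A small sketch of the simplified calling style, using a temporary directory. The directory names are made up for illustration.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

my $tmp = tempdir(CLEANUP => 1);
my @want = ("$tmp/a/b", "$tmp/c");

# no -d checks nor arrayref needed on our end
my @first  = make_path(@want);  # creates the missing directories
my @second = make_path(@want);  # everything exists already, not an error
```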
2023-01-31 lei: drop -watches and -lei_note_event from workers
I noticed these while tracking down circular refs for commit 7b654d175cf2e31b (ipc: drop awaitpid_init to avoid circular refs, 2023-01-30). While they're not the cause of circular refs, they're still a waste of memory in worker processes.
2023-01-30 ipc: drop awaitpid_init to avoid circular refs
This brings t/lei-index.t back down from ~8 to ~3s. The reason I didn't notice this before is that the LeiNoteEvent timer was firing every 5s and clearing circular refs, and parallel testing meant the delay got hidden. Fixes: 4a2a95bbc78f99c8 (ipc+lei: switch to awaitpid, 2023-01-17)
2023-01-18 ipc+lei: switch to awaitpid
This avoids awkwardly stuffing an arrayref into callbacks which expect multiple arguments. IPC->awaitpid_init now allows pre-registering callbacks before spawning workers.
2022-11-28 lei_mirror: remove janky mirror.done stamp file
This makes a fundamental (and overdue) change to the core of lei in how it handles child errors. Every process which generates or receives a child error will remember it before passing it on. This ensures _wq_done_wait callbacks will know of prior errors aside from $? when it runs.
2022-11-28 lei_mirror: avoid convoluted lazy_cb usage
lazy_cb should only be used for lei command dispatch and completion callbacks when the method isn't known at startup. There's zero reason to use it when the method is known ahead-of-time, especially when there's a comment pointing reviewers towards the only possible method it can dispatch.
2022-09-10 lei: fix --help for --jobs with `up' and `q'
The help needs to match on the short option, too, and that `lei q' option is (like most options) shared with `lei up'.
2022-08-19 lei reindex: new command to reindex lei/store
2022-08-16 lei: do not wait for sto->done on disconnected EOF
lei-daemon (the top-level daemon process) should not have synchronous waits, and this was causing a deadlock with interrupted commands. There may still be a bug lurking in lei/store despite this fix, though. I originally thought commit fd261b9e65674505 (lei_store_err: use level-trigger for error pipe, 2022-08-15) was sufficient, but at least this change is needed, as well.
2022-05-02 lei: improve diagnosis of errors from children
Not 100% sure what's going on, but maybe this helps.
2022-04-26 lei: move to v5.12 to avoid "use strict"
Socket.pm still loads strict.pm, unfortunately, which hurts startup time; but we'll save some LoC this way.
2022-04-22 lei: commit store on interrupted partial imports
This change prevents lingering shard and git-fast-import processes from remaining after interrupted "lei import" (and similar). It also reduces the likelihood of data-loss in case of subsequent abnormal termination of the daemon. I think this is the least surprising way to handle users prematurely aborting imports or other similar operations which write to lei/store and will result in reduced bandwidth waste for users with intermittent connections. This is because the lei/store processes may be shared by parallel "lei import" callers, and commits done by any "lei import" caller will inevitably trigger writes for all of them.
2022-04-18 lei: wire up pure Perl sendmsg/recvmsg for Linux users
This enables lei-daemon to work without Inline::C nor Socket::MsgHdr installed. Prior to this, only the `lei' client was using the pure Perl implementation. Either C implementation is still marginally faster, however.
2022-04-18 lei: clobber recvmsg buffer on errors
It will be necessary when we drop the Inline::C requirement since the pure Perl Linux syscall recvmsg implementation. This likely would've caused errors for Socket::MsgHdr users without Inline::C, but I haven't tested it since it's a rare configuration.
2022-04-05 lei: always open mail_sync.sqlite3 R/W
This will make transparently upgrading from 1.7.0 -> 1.8.x easier. Only a single user has access to mail_sync.sqlite3, and R/W at the kernel-level is required for WAL, anyways.
2021-11-22 lei: always use 3-arg open perlop
Future-proofing in case future versions of Perl warn on this, since 2-arg forms of open may be subject to injection vulnerabilities with non-literal args.
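To make the injection concern concrete, here is a sketch with a file name ending in `|', which 2-arg open would interpret as a request to run a command. The first_line helper and the file name are hypothetical, and the example assumes a Unix-like filesystem where `|' is a legal file name character.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# hypothetical helper: read the first line of an arbitrary path
sub first_line {
    my ($path) = @_;
    # 3-arg form: the mode is explicit and $path is only ever a file name
    open(my $fh, '<', $path) or die "open($path): $!";
    my $line = <$fh>;
    close($fh);
    return $line;
}

my $dir = tempdir(CLEANUP => 1);
# with 2-arg open, the trailing `|' would mean "read from this command"
my $path = "$dir/report|";
open(my $out, '>', $path) or die "open($path): $!";
print $out "hello\n";
close($out) or die "close($path): $!";

my $line = first_line($path);
```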
2021-11-15 lei forget-search: add help for --prune
This enables tab-completion, since I'm using --prune quite a bit and my fingers are about to fall off :<
2021-11-02 lei: simplify common LeiInput users with ->wq1_start
This method replaces a common pattern of starting workers, preparing internal auth ops, and asynchronous waiting of command completion. It also adds missing LeiAuth support to rediff and rm which rarely need auth.
2021-11-01 treewide: kill problematic "$h->{k} //= do {" assignments
As stated in the previous change, conditional hash assignments which trigger other hash assignments seem problematic, at times. So replace:
    $h->{k} //= do { $h->{x} = ...; $val };
with:
    $h->{k} // do { $h->{x} = ...; $h->{k} = $val };
"||=" is affected the same way, and some instances of "||=" are replaced with "//=" or "// do {", now.
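The replacement form can be tried in isolation as below. This only shows that the `// do {' form caches the value and runs its block once; it does not attempt to reproduce whatever problem the `//= do {' form caused. get_k and the hash keys are hypothetical.

```perl
use strict;
use warnings;

my $computed = 0;

# hypothetical accessor using the `// do {' form
sub get_k {
    my ($h) = @_;
    $h->{k} // do {
        $h->{x} = 'side effect';  # the other hash assignment
        $computed++;
        $h->{k} = 'value';        # explicit assignment, also the result
    };
}

my $h = {};
my $first  = get_k($h);   # runs the block
my $second = get_k($h);   # {k} is already set, block is skipped
```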
2021-10-30 lei: do not access {sock} after SIGPIPE
It's possible for this to break out of the event loop if note_sigpipe fires via PktOp in the same iteration.
2021-10-27 lei mail-diff: support more inputs, split newlines
Support --in-format like the rest of LeiInput users, and don't default to .eml if a per-input format was specified. In any case, I saved a bunch of messages from mutt which uses mboxcl2. We'll also split newlines for diff, since it's a pain to read diffs with escaped "\n" characters in them.
2021-10-26 lei p2q: use LeiInput for multi-patch series
The LeiInput backend now allows p2q to work like any other command which reads .eml, .patch, mbox*, Maildir, IMAP, and NNTP input. Running "git format-patch --stdout -1 $COMMIT" remains supported. This is intended to allow lower memory use while parsing "git log --pretty=mboxrd -p" output. Previously, the entire output of "git log" would be slurped into memory at once. The intended use is to allow easy(-ish :P) searching for unapplied patches as documented in the new example in the manpage.
2021-10-26 lei: add net getopt spec to various commands
All of these commands should support --proxy, at least, if not other curl options.
2021-10-26 lei p2q: document --uri, add examples
This is useful for users lacking in local storage. Also, referencing lei-add-external(1) seems to make less sense than referencing lei-q(1). We'll also start dropping years from the copyright statement to reduce future churn.
2021-10-22 lei forget-search: support --prune=<local|remote>
Instead of:
    lei forget-search $OUTPUT && rm -r $OUTPUT
we'll also allow a user to do:
    rm -r $OUTPUT && lei forget-search --prune
This gives users flexibility to choose whatever flow is most natural to them.
2021-10-22 lei: no Perl FileHandle for `undef' w/ ECONNRESET
Error reporting for recv_cmd4 methods is a bit wonky.
2021-10-22 dir_idle: treat IN_MOVED_FROM as a gone event
Whether an MUA uses rename(2) or link(2)+unlink(2) combination should not matter to us. We should be able to handle both cases.
2021-10-19 lei: remove unused ->busy time arg
Our graceful shutdown doesn't time out clients.
2021-10-19 lei up: support --exclude=, --no-(external|remote|local)
These can be used to temporarily disable using certain externals in case of temporary network failure or mount point unavailability.
2021-10-19 lei: conditionally add "\n" to error messages
Some error messages already include "\n" (w/ file+line info), so don't add another one. (`warn' will automatically add its caller location unless there's a final "\n").
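The conditional newline can be sketched in a few lines. err_line is a hypothetical name and is not lei's actual error-reporting code.

```perl
use strict;
use warnings;

# hypothetical formatter: end the message with exactly one "\n"
sub err_line {
    my ($msg) = @_;
    $msg .= "\n" unless $msg =~ /\n\z/;
    return $msg;
}

my $plain   = err_line('no such file');
my $located = err_line("bad input at foo.pl line 3.\n");
```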
2021-10-16 lei sockets: favor level-triggered epoll for fairness
Sigfd->event_step needs priority over script/lei clients, LeiSelfSocket, and everything else.
2021-10-16 lei_overview: die rather than lei->fail
This will make our code more flexible in case it gets used in non-lei things.
2021-10-16 lei: more eval guards for die on failure
Relying on $lei->fail is unsustainable since there'll always be parts of our code and dependencies which can trigger die() and break the event loop.
2021-10-16 lei: always keep cwd fd {3} for ->fchdir
The extra FD shouldn't cause noticeable overhead in short-lived workers, and it lets us simplify lei->rel2abs. Get rid of a 2-argument form of open() while we're at it, since it's been considered for warning+deprecation by Perl for safety reasons.
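The general idea of holding on to the working directory and returning to it via a handle can be sketched as follows. This is an illustration under assumptions, not lei's ->fchdir: it uses opendir for portability, and it relies on Perl's chdir accepting a handle, which needs fchdir(2) support on the platform.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $orig = getcwd();
# keep a handle on the starting directory for later
opendir(my $cwd_dh, '.') or die "opendir(.): $!";

my $tmp = tempdir(CLEANUP => 1);
chdir($tmp) or die "chdir($tmp): $!";
my $elsewhere = getcwd();

# chdir on a handle uses fchdir(2); no path lookup is involved
chdir($cwd_dh) or die "fchdir: $!";
my $back = getcwd();
closedir($cwd_dh);
```

Since the handle keeps pointing at the same directory, the path stored in $orig never has to be resolved again to get back.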
2021-10-16 lei: golf PATH2CFG cleanup
More code means more bugs.
2021-10-16 dir_idle: do not add watches in ->new
There's no savings in having two ways to add watches to an inotify nor kqueue descriptor.
2021-10-15 lei forget-search: support multiple args
I've been testing a lot of searches which I don't want to keep around, so make it easy to remove a bunch at once. We'll behave like rm(1) and keep going in the face of failure.
2021-10-15 lei + ipc: simplify process reaping
Simplify our APIs and force dwaitpid() to work in async mode for all lei workers. This avoids having lingering zombies for parallel searches if one worker finishes soon before another. The old distinction between "old" and "new" workers was needlessly complex, error-prone, and embarrassingly bad. We also never handled v2:// writers properly before on Ctrl-C/Ctrl-Z (SIGINT/SIGTSTP), so add them to @WQ_KEYS to ensure they get handled by $lei when appropriate.