about summary refs log tree commit homepage
path: root/lib/PublicInbox/LeiToMail.pm
DateCommit message (Collapse)
2021-09-08lei q|up: fix write counter for v2
It's a bit confusing to see "0 written to ..." when we actually wrote something.
2021-09-07lei up: support --all for IMAP folders
Since "lei up" is expected to be a heavily-used command, better support for IMAP seems like a reasonable idea. This is inefficient since we waste an IMAP(S) TCP connection since it dies when an auth-only LeiUp worker process dies, but it's better than not working at all, right now.
2021-09-06lei_auth: simplify users
There's no need to alias net_merge_all in each WQ class which uses LeiAuth, `$obj->$sub' works even when `$sub' is a fully-qualified subroutine name with `::' in it. perlobj(1) documents it under "Method Call Variations".
2021-09-04lei_to_mail+mbox_reader: fix handling of empty/bogus emails
We may be handling invalid mboxes, so just return no objects in that case. While "lei q" on HTTP(S) externals expects a gzipped mboxrd, there's always a chance something else gzipped can be sent to us. There's also changes to lei_to_mail to better handle emails which lack a body and/or headers (e.g. t/solve/bare.patch) Link: https://public-inbox.org/meta/20210903151500.h72mzcpqixgtytjs@meerkat.local/
2021-09-03lei: fix read/write IMAP access
xt/net_writer-imap.t was completely broken in recent months and I completely forgot this test. net->add_url still only accepts bare scalars (and not scalar refs), so we must set that up properly. Furthermore, our changes to do FLAGS-only synchronization in lei of old messages was causing us to not handle FLAGS properly for the test.
2021-08-19lei q: make --save the default
Since "lei up" is more often useful than not and incurs neglible overhead; enable --save by default and allow --no-save to work. This also fixes a long-standing when overwriting --output destinations with saved searches: dedupe data from previous searches are reset and no longer influences the new (changed) search, so results no longer go missing if two sequential invocations of "lei q --save" point to the same --output.
2021-07-25lei: avoid SQLite COUNT() for dedupe
SQLite COUNT() is a slow operation that does a full table scan with no conditions. There's no need for it, since lei dedupe only needs to know if it's empty or not to decide between new/ and cur/ for Maildir outputs.
2021-06-04pkt_op: make pkt_do an OO method
This will make it easier to use for internal use such as managing Maildir and IMAP IDLE watches.
2021-05-30lei q: --sort and --save|v2 are incompatible
Saved searches rely on (reverse) docid ordering for efficient incremental results, and sorting any other way prevents that. Update comment description in LeiQuery while we're at it: "ls-query" and "rm-query" are "ls-search" and "forget-search", respectively, and "mv-query" is implicit with "edit-search"
2021-05-30lei import|lcat: improve+fix single message IMAP support
lcat can now dump the memoized contents of entire IMAP folders, not just a single UID. It's now parallelized and pipelined for multiple lei2mail workers. Furthemore, various forms of JSON output work consistently with blob-only output, now. While working on this, I noticed NetReader was passing UID URLs to imap_each callbacks, which was causing mail_sync.sqlite3 to store UIDs in `folders' and clearly wrong so it's now fixed.
2021-05-29lei_to_mail: use abs_path for Maildir in mail_sync.sqlite3
lei->rel2abs doesn't resolve symlinks, which could cause synchronization problems with export-kw or other commands.
2021-05-28lei q|up: support v2:/path/to/inboxdir destination
This allows "lei-managed pseudo mailing lists" as described by Konstantin. Alternates use is optional and can be enables via --shared. This doesn't manage or edit ~/.public-inbox/config; presumably there'll need to be some tweaking of search parameters before finalizing and making the inbox publicly accessible via HTTP/NNTP. Link: https://public-inbox.org/meta/20210426164454.5zd5kgugfhfwfkpo@nitro.local/T/
2021-05-28lei: handle a single IMAP message in most places
"lei import" can now import a single IMAP message via <imaps://example.com/MAILBOX/;UID=$UID> Likewise, "lei inspect" can show the blob information for UID URLs and "lei lcat" can display the blob without network access if imported. "lei lcat" also gets rid of some unused code and supports "blob:$OIDHEX" syntax as described in the comments (and used by our "text" output format). v2: enforce UID in URL, fail without v3: fix error reporting (s/fail/child_error/)
2021-05-23lei <q|up>: set \Recent on non-empty mbox and Maildir
Despite JMAP not supporting the equivalent of the IMAP \Recent flag, it is useful for "lei q --augment", and "lei up" users to be able to distinguish new results from old-but-unread messages in an mbox or Maildir. For mbox family messages, we'll drop the "O" status flag when appending to mboxes, and we'll write to the "new" subdirectory of Maildirs. Behavior when writing to initially empty Maildirs and mboxes remains unchanged since there's no need to distinguish between new and old results in the initial case. Having users wait for a rename(2) storm or complete mbox rewrite hurts UX. With IMAP mailboxes, \Recent is already enforced by the IMAP server and IMAP clients have no way of changing it(*) (*) mutt uses the "Old" IMAP flag which isn't part of RFC 3501, other MUAs may do similar things.
2021-05-23lei export-kw: support exporting keywords to IMAP
We support writing to IMAP stores in other places (just like Maildir), and it's actually less complex for us to write to IMAP. Neither usability nor performance is ideal, but usability will be addressed in the next commit to relax CLI argument checking. Performance is poor due to the synchronous Mail::IMAPClient API and will need to be addressed with pipelining sometime further in the future.
2021-05-23net_reader|net_writer: pass URI refs deeper into callbacks
This will give us more flexibility in the future w.r.t. dealing with UIDVALIDITY and AUTH= info with IMAP. The LoC reduction is welcome, too.
2021-05-23lei export-kw: new command to export keywords to Maildirs
IMAP will eventually be supported.
2021-05-23lei tag: support tagging index-only messages
This will make some of our tests faster and allow users to try more features of lei without high storage requirements.
2021-05-04lei index: new command to index mail w/o git storage
Since completely purging blobs from git is slow, users may wish to index messages in Maildirs (and eventually other local storage) without storing data in git. Much code from LeiImport and LeiInput is reused, and a new dummy FakeImport class supplies a non-storing $im->add and minimize changes to LeiStore. The tricky part of this command is to support "lei import" after a message has gone through "lei index". Relying on $smsg->{bytes} == 0 (as we do for external-only vmd storage) does not work here, since it would break searching for "z:" byte-ranges when not using externals. This eventually required PublicInbox::Import::add to use a SharedKV to keep track of imported blobs and prevent duplication.
2021-05-04lei up: fix dedupe with remote externals on Maildir + IMAP
LeiToMail Maildir and IMAP write callbacks need to account for the caller-supplied smsg. We'll also make better use of the user-supplied smsg object by ensuring blob deduplication happens ASAP. Fixes: e76683309ca4f254 ("lei <q|up>: distinguish between mset and l2m counts")
2021-05-03lei <q|up>: writes to Maildirs and IMAP use mail-sync
This will allow keyword updates from other folders to propagate to folders where search results may be duplicated.
2021-05-01lei_auth: s/net_merge_complete/net_merge_all_done/
We use the "done" term elsewhere for similar things, and my easily-confused mind equates "complete" with shell completion.
2021-05-01lei <q|up>: distinguish between mset and l2m counts
The number of messages we write to --output is usually different than the mset count due to deduplication from combining multiple sources. This change makes the stderr output of "lei up --all=local" way more useful IMHO.
2021-04-30lei: IMAP .onion support via --proxy=s switch
Mail::IMAPClient provides the ability to pass a pre-connected Socket to it. We can rely on this functionality to use IO::Socket::Socks in place whatever socket class Mail::IMAPClient chooses to use. The --proxy=s is shared with curl(1), though we only support socks5h:// at the moment. Is there any need for SOCKS4 or SOCKS5 without name resolution? Tor .onions require socks5h:// for name resolution and to prevent data leakage.
2021-04-27lei q + lcat: support --format=text output
This is mainly for "lei lcat" where it's the default, but I find it useful anyways compared to the JSON view. Colors are loaded from ~/.config/lei/config, and fall back to using diff colors from a normal git config (e.g. ~/.gitconfig).
2021-04-23lei_to_mail: cwd-agnostic Maildir wakeup
Since we don't have *at() syscalls readily available to us, lei-daemon may call ->poke_dst in the wrong relative directory. Despite not having *at() syscalls, we can still capture the "$MAILDIR/cur" directory handle at pre_augment time so we can reliably call futimes(2) on it using the `utime' perlop.
2021-04-22lei: flesh out `forwarded' kw support for Maildir and IMAP
Maildir and IMAP can both handle `forwarded'. Ensure we don't lose `forwarded' when reading from stores which do not support it, but ensure we can set it when reading from IMAP and Maildir stores.
2021-04-19lei q: implement import-before default for --save
This makes "lei q --save" as safe as "lei q" to prevent against accidental data loss when clobbering an existing output,
2021-04-16lei_to_mail: cast to URIimap object early
NetReader->add_url supports URI-like objects, now. We'll be relying on the canonicalization for LeiSavedSearch.
2021-04-13lei: add "lei up" to complement "lei q --save"
The command isn't finalized, yet, but it's intended to update an existing saved search.
2021-04-13lei q: start wiring up saved search
This will have a over.sqlite3 for content-based deduplication. It may exhibit ibxish methods, so serving a read-only (or even R/W) IMAP or instance or displaying HTML isn't outside the realm of possibility.
2021-04-13lei_dedupe: adjust to prepare for saved searches
LeiSavedSearch will use a LeiDedupe-like internal API, so we won't have to make as many changes to callsites between saved and unsaved searches.
2021-04-05lei q: fix auth IMAP --output with remote mboxrd
IMAP authentication info is only shared amongst lei2mail workers, so we must ensure all IMAP writes go through lei2mail workers even if we don't have to access the mail through git. This allows us to decouple the latency of the remote mboxrd from the latency of the IMAP --output at the expense of extra IPC overhead within our own processes.
2021-04-05lei_to_mail: improve comments and reduce LoC
We don't need to waste LoC on corner cases, single-use internal subs, or restoring SIG{__WARN__} when a process exits. All that extra code contributes to memory use and startup time, especially for users who can't use FD passing.
2021-04-05lei: maildir: move shard support to MdirReader
We'll eventually want lei_input users like "lei import" and "lei tag" to support parallel reads.
2021-04-05lei_to_mail: trim down imports
We don't need to import so many things. None of the Errno constants are in common paths so unlikely to benefit from constant folding.
2021-04-01lei: maildir: handle "forwarded" keyword as "P"
mbox and IMAP seem to have no way of describing this keyword. but Maildir does with the "P" flagged (for "passed").
2021-04-01lei q: reduce lei/store work for kw changes to stored mail
We can tweak lse->kw_changed to return docids and reduce IPC traffic and reduce work the lei/store worker needs to do.
2021-03-30lei_to_mail: update some comments and style
Note that update_kw_maybe is critical in preventing accidental data loss with default "lei q --output" behavior. Also avoid treating (proposed) MH support as lock-free, since appears to lack specifications for locking and be even worse than mbox* in that regard...
2021-03-29lei_input: support compressed mboxes
Since "lei q" and "lei convert" already support writing these compressed inboxes, it makes sense that all mbox readers support them, as well. Using compression is one reliable way to know an mboxrd or mboxo hasn't been unexpectedly truncated.
2021-03-26lei: support /dev/fd/[0-2] inputs and outputs in daemon
Since lei-daemon won't have the same FDs as the client, we need to special-case thse mappings and won't be able to open arbitrary, non-standard FDs. We also won't attempt to support /proc/self/fd/[0-2] since that's a Linux-ism. /dev/fd/[0-2] and /dev/std{in,out,err} are portable to FreeBSD, at least. mawk(1) also supports /dev/std{out,err}, as does gawk(1) (which supports everything we can support, and arbitrary /dev/fd/$FD).
2021-03-26lei q: skip lei/store->write_prepare for JSON outputs
JSON outputs won't write to lei/store at all, so there's no point in forking the store worker if it's not already running. LeiSearch object ($lse) is also fork-safe until it opens a persistent FD for Xapian/SQLite so we can unconditionally carry it across fork.
2021-03-24lei: hide *_atfork_child from command-line
Otherwise we could get non-sensical results if somebody tries running "lei atfork_child" from the command-line.
2021-03-21lei_to_mail: match mutt order of status headers
These changes may make it easier to do byte-for-byte comparisons with mail copied out of mutt, a popular MUA for our target audience. mutt currently outputs the 'R' (seen) flag before the 'O' character in the Status: header. We'll assume that stays the case (it has been for a while). Status now comes before X-Status, also matching mutt behavior.
2021-03-21lei q: support vmd for external-only messages
"lei q" now preserves changes per-message keywords across invocations when it's --output (Maildir or mbox) is reused (with or without --augment). In the future, these changes will be monitored via inotify, EVFILT_VNODE or IMAP IDLE, too. Unfortunately, this currently prevents "lei import" from ever importing a message that's in an external. That will be fixed in a future change.
2021-03-21lei: All Local Externals: bare git dir for alternates
This will be used for keyword (and label) storage for externals. We'll be using this to ensure we don't redundantly auto-import messages into lei/store if they're already in a local external (they can still be imported explicitly via "lei import").
2021-03-20maildir: avoid redundant slashes
Redundant slashes look ugly in strace(1) output.
2021-03-17lei_store: keywords => vmd (volatile metadata), prepare for labels
Since keywords and mailboxes (AKA labels) are separate things in JMAP; and only keywords can map reliably to Maildir and mbox; we'll keep them separate in our internal data representations, too. I initially wanted to call this just "meta" for "metadata", but that might be confused with our mailing list name. "metadata" is already used in Xapian's own API, to add another layer of confusion. "tags" was also considered, but probably confusing to notmuch users since our "labels" are analogous to "tags" in notmuch, and notmuch doesn't seem to cover "keywords" separately... So "vmd" it is, since we haven't used this particular three-letter-abbreviation anywhere before; and "volatile" seems like a good description of this metadata since everything else up to this point has been mostly WORM (write-once, read-many).
2021-03-16mbox: move mbox_keywords to MboxReader
MboxReader is a more appropriate place for it than LeiStore.
2021-03-15lei q: do not import unnecessarily from externals
We only want to auto import messages that are exclusively in remote externals. Messages in local externals are not auto-imported to save space and reduce wear on storage device.