Date | Commit message |
|
The MH format is widely supported and used by various MUAs such
as mutt and sylpheed, and an MH-like format is used by mlmmj for
archives, as well. Locking implementations for writes are
inconsistent, so this commit doesn't support writes, yet.
inotify|EVFILT_VNODE watches aren't supported, yet, but that'll
have to come since MH allows packing unused integers and
renaming files.
|
|
Calling vmd_mod_extract after optparse causes the implicit
stdin-as-input functionality to fail, as the implicit stdin
requires a lack of inputs remaining in argv after option
parsing (along with a regular file or pipe as stdin).
This allows commands such as `lei import -F eml +kw:seen'
to work without `--stdin', `-' or any path names when
importing a single message. This also ensures commands like
`lei import +kw:seen' without any inputs/locations will fail
reliably, as the extra +kw: arg won't be a false-positive.
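
A rough sketch of the ordering described above. The helper names
(`extract_vmd', `implicit_stdin_ok') are hypothetical stand-ins, not
the actual lei subroutines, and a temporary file stands in for a
redirected stdin:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: strip "+kw:" and "+L:" args out of @$argv
# and return them, leaving only real inputs/locations behind.
sub extract_vmd {
    my ($argv) = @_;
    my (@vmd, @rest);
    for my $arg (@$argv) {
        if ($arg =~ /\A\+(kw|L):(.+)\z/) {
            push @vmd, [ $1, $2 ];
        } else {
            push @rest, $arg;
        }
    }
    @$argv = @rest;
    \@vmd;
}

# Hypothetical check: implicit stdin only applies when no inputs
# remain and the handle is a regular file or a pipe.
sub implicit_stdin_ok {
    my ($argv, $fh) = @_;
    !@$argv && (-f $fh || -p $fh) ? 1 : 0;
}

my ($fh, $path) = tempfile(UNLINK => 1); # stand-in for stdin
my @argv = ('+kw:seen');                 # no path names given
my $vmd = extract_vmd(\@argv);           # runs before the stdin check
my $ok = implicit_stdin_ok(\@argv, $fh);
print "vmd=$vmd->[0][0]:$vmd->[0][1] implicit_stdin=$ok\n";
```

If the extraction ran after the stdin check instead, `+kw:seen' would
still be in @argv and be mistaken for an input.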
|
|
This will make it easier to switch in the far future while
making callers easier-to-read (and more callers will be added).
Anyways, Perl 5.26 is a long time away for enterprise users;
but isolating compatibility code away can improve readability
of code we actually care about in the meantime.
|
|
This fixes completions of labels (`+L:' for `lei import' and
`L:' for `lei q') so they can appear anywhere in the
command-line.
I mainly wanted this for `lei import $URL +L:label', but
this also fixes `lei forget-external' completions for URLs
(which involve colons).
|
|
This can probably be added for "lei q", too, but we typically
import first. Labels can probably be made persistent on a
per-folder basis in the future.
|
|
This will make transparently upgrading from 1.7.0 -> 1.8.x
easier. Only a single user has access to mail_sync.sqlite3,
and R/W at the kernel-level is required for WAL, anyways.
|
|
This method replaces a common pattern of starting workers,
preparing internal auth ops, and asynchronous waiting of
command completion.
It also adds missing LeiAuth support to rediff and rm
which rarely need auth.
|
|
This will make future developments easier.
|
|
warn() is easier to augment with context information, and
frankly unavoidable in the presence of 3rd-party libraries
we don't control.
|
|
While `$argv[-1]' is `undef' on an empty @argv, using `$argv[-1]'
as a subroutine argument would fail incorrectly with:
    Modification of non-creatable array value attempted, subscript -1 at ...
...even though we'd never attempt to modify @_ itself in the
subroutines being called. Work around the bug (tested on
5.16.3) by passing `undef' explicitly when `$argv[-1]' is
already `undef'.
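
One way to sidestep the problem, sketched with a hypothetical callee
(`describe_last' is not a lei subroutine): read `$argv[-1]' into a
lexical first, so the callee gets a plain scalar instead of an alias
to an element of the (possibly empty) array.

```perl
use strict;
use warnings;

# Hypothetical callee; it never modifies @_.
sub describe_last {
    my ($last) = @_;
    defined($last) ? "last=$last" : 'no args';
}

my @argv = ();
# Reading $argv[-1] in rvalue context is safe and yields undef here.
my $last = $argv[-1];
my $empty = describe_last($last);
print "$empty\n"; # prints "no args"

@argv = qw(a b c);
$last = $argv[-1];
my $full = describe_last($last);
print "$full\n"; # prints "last=c"
```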
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210927124056.kj5okiefvs4ztk27@meerkat.local/
|
|
"lei export-kw" no longer completes for anonymous sources.
More commands use "lei refresh-mail-sync" as a basis for their
completion work, as well.
";AUTH=ANONYMOUS@" is stripped from completions since it was
preventing bash completion from working on AUTH=ANONYMOUS IMAP
URLs. I'm not sure if there's a better way, but all of our code
works fine without specifying AUTH=ANONYMOUS as a command-line
arg.
Finally, we fall back to using more candidates if none can
be found, allowing multiple URLs to be completed.
|
|
NNTP servers, IMAP servers, and various MUAs may recycle
"unique" identifiers due to software bugs or careless BOFHs.
Warn about them, but always be prepared to account for them.
|
|
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us
  simplify lei->sto_done_request and pass newly-created
  sockets to lei/store directly.
disadvantages:
- an extra pipe is required for rare messages over several
  hundred KB, this is probably a non-issue, though
The performance delta is unknown, but I expect shards
(which remain pipes) to be the primary bottleneck IPC-wise
for lei/store.
|
|
Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"),
relying on lei/store to serialize access was a pointless endeavor.
Rely on flock(2) to serialize multiple writers since (in my
experience) it's the easiest way to deal with parallel writers
when using SQLite. This allows us to simplify existing callers
while speeding up 'lei refresh-mail-sync --all=local' by 5% or
so.
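
A minimal sketch of serializing parallel writers with flock(2). This
is not the lei implementation: a plain counter file stands in for the
SQLite database, and the paths and the `with_flock' helper are
hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempdir);
use POSIX ();

my $dir = tempdir(CLEANUP => 1);
my $lock_path = "$dir/example.lock"; # hypothetical lock file
my $data_path = "$dir/counter";      # stands in for the shared DB

# Run $cb while holding an exclusive lock on $path.
sub with_flock {
    my ($path, $cb) = @_;
    open(my $lk, '>>', $path) or die "open $path: $!";
    flock($lk, LOCK_EX) or die "flock $path: $!";
    my $ok = eval { $cb->(); 1 };
    my $err = $@;
    close($lk); # closing the handle releases the lock
    die $err unless $ok;
}

# Read-modify-write which is only safe when callers are serialized.
sub bump {
    open(my $fh, '+<', $data_path) or die "open $data_path: $!";
    chomp(my $n = <$fh>);
    seek($fh, 0, SEEK_SET) or die "seek: $!";
    truncate($fh, 0) or die "truncate: $!";
    my $next = $n + 1;
    print $fh "$next\n" or die "print: $!";
    close($fh) or die "close: $!";
}

open(my $init, '>', $data_path) or die "open $data_path: $!";
print $init "0\n";
close($init) or die "close: $!";

my @pids;
for (1..4) { # four parallel writers, 25 updates each
    my $pid = fork;
    defined($pid) or die "fork: $!";
    if ($pid == 0) {
        my $ok = eval { with_flock($lock_path, \&bump) for 1..25; 1 };
        POSIX::_exit($ok ? 0 : 1);
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;

open(my $in, '<', $data_path) or die "open $data_path: $!";
chomp(my $total = <$in>);
close($in);
print "total=$total\n"; # prints "total=100"
```

Each writer opens the lock file itself, so no coordinating process is
needed to hand out turns.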
|
|
It doesn't seem worthwhile to change worker counts dynamically
on a per-command-basis with lei, and I don't know how such an
interface would even work...
|
|
There's no need to alias net_merge_all in each WQ class
which uses LeiAuth, `$obj->$sub' works even when `$sub'
is a fully-qualified subroutine name with `::' in it.
perlobj(1) documents it under "Method Call Variations".
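
A small illustration of that method call variation. The package and
subroutine names below are hypothetical, not taken from public-inbox:

```perl
use strict;
use warnings;

package My::Helpers; # hypothetical package holding the shared sub
sub greet {
    my ($self, $who) = @_;
    ref($self) . " greets $who";
}

package My::Worker; # hypothetical class with no greet() or alias
sub new { bless {}, shift }

package main;

my $obj = My::Worker->new;
my $sub = 'My::Helpers::greet'; # fully-qualified name, as a string
my $res = $obj->$sub('world');  # $obj is still passed as $_[0]
print "$res\n"; # prints "My::Worker greets world"
```

No `*greet = \&My::Helpers::greet' alias is needed in My::Worker for
this call to resolve.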
|
|
And fix "lei index" completion while we're at it.
|
|
Taking ~40s to synchronize a ~75K message IMAP folder is
still a lot of time, so support an option to only touch
new messages.
This is similar to "offlineimap -q" (quick) or "mbsync --new"
switches, but lei already accepts "-q" as a shortcut for
--quiet. "--new" could work, but "--new-only" might be more
descriptive (or "--only-new"?), since the default
also fetches new messages.
v2: warn for non-IMAP sources, I'm not sure it's worth it for
Maildir or other sources, yet. It will also make sense
for MH and JMAP once we support them.
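
To illustrate the option naming, here is a sketch using Getopt::Long.
The option spec, the example sources and the warning text are
assumptions made for this example and are not lei's actual ones:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical command line: one IMAP source and one Maildir.
my @argv = qw(-q --new-only imaps://example.com/INBOX /home/user/Maildir);
my %opt;
# "-q" stays a shortcut for --quiet, so the new switch gets a long name.
GetOptionsFromArray(\@argv, \%opt, 'quiet|q', 'new-only')
    or die "bad options\n";

my @warnings;
for my $src (@argv) { # only non-option args remain in @argv
    next if !$opt{'new-only'} || $src =~ m!\Aimaps?://!i;
    push @warnings, "--new-only has no effect on non-IMAP source: $src";
}
print "quiet=$opt{quiet} new-only=$opt{'new-only'}\n";
print "warning: $_\n" for @warnings;
```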
|
|
Since Maildir isn't guaranteed to have any sort of order, we
can parallelize inputs, here. On a 4-core system, this reduced
one of my tag invocations from 5.5 to 1.4s.
|
|
This is a slight behavior change for "lei q": Trashed
(but not-yet-expunged) messages no longer get unlinked
when --output is used without --augment.
|
|
On a 4-core CPU, this speeds up "lei import" on a largish
Maildir inbox with 75K messages from ~8 minutes down to ~40s.
Parallelizing alone did not bring any improvement and may
even hurt performance slightly, depending on CPU availability.
However, creating the index on the "fid" and "name" columns in
blob2name yields us the same speedup we got.
Parallelizing IMAP makes more sense due to the fact most IMAP
stores are non-local and subject to network latency.
Followup-to: bdecd7ed8e0dcf0b45491b947cd737ba8cfe38a3 ("lei import: speed up kw updates for old IMAP messages")
|
|
op_wait_event is now more lei-specific since we no longer have
to care about oneshot and use a synchronous loop.
{ikw} (import-keywords) started a trend, but LeiPmdir (parallel
Maildir) is an upcoming WQ class that will follow this idea.
Eventually, {l2m} usage may be updated to follow this, too.
|
|
On a 4-core CPU, this speeds up "lei import" on a largish IMAP
inbox with 75K messages from ~21 minutes down to 40s.
Parallelizing with the new LeiImportKw WQ worker class gives a
near-linear speedup and brought the runtime down to ~5:40.
The new idx_fid_uid index on the "fid" and "uid" columns of
blob2num in mail_sync.sqlite3 brought us the final speedup.
An additional index on over.sqlite3#xref3(oidbin) did not help,
since idx_nntp already exists and speeds up the new ->oidbin_exists
internal API.
I initially experimented with a separate "lei import-kw" command
but decided against it since it's useless outside of IMAP+JMAP
and would require extra cognitive overhead for both users and
hackers. So LeiImportKw is just a WQ worker used by "lei import"
and not its own user-visible command.
v2: fix ikw_done_wait arg handling (ugh, confusing API :x)
|
|
We don't need to write VMD changes to lei/store if local
keywords are unchanged.
|
|
This makes "lei import" behavior with IMAP folders more
consistent with that with Maildir.
Opening IMAP folders read-write with "SELECT" (instead of
read-only with "EXAMINE") was necessary, since it lets an IMAP
server communicate to us as to whether or not it's worth
refetching IMAP flags of previously imported messages.
Fetching UID+FLAGS only is one of the fastest IMAP operations
with dovecot, our -imapd and presumably other common IMAP servers.
It is issued by common MUAs such as mutt after every SELECT.
Users may now rely on "lei import" exclusively to merge mail and
keywords into lei/store, and "lei export-kw" to propagate
keyword changes back to IMAP servers.
A sticks-and-stones workflow for personal mailboxes is currently:
    lei import imaps://$MY_PERSONAL_INBOX
    lei q --mua=$MUA -o /tmp/results SEARCH TERMS...
    # do stuff from within $MUA to /tmp/results
    lei import /tmp/results # read keyword changes from MUA
    lei export-kw imaps://$MY_PERSONAL_INBOX
    # repeat when new stuff shows up in personal inbox
The next goal is to automate repeated imports + export-kw
commands with inotify and IMAP IDLE.
|
|
This will give us more flexibility in the future w.r.t.
dealing with UIDVALIDITY and AUTH= info with IMAP. The LoC
reduction is welcome, too.
|
|
Since completely purging blobs from git is slow, users may wish
to index messages in Maildirs (and eventually other local
storage) without storing data in git.
Much code from LeiImport and LeiInput is reused, and a new dummy
FakeImport class supplies a non-storing $im->add and minimizes
changes to LeiStore.
The tricky part of this command is to support "lei import"
after a message has gone through "lei index". Relying on
$smsg->{bytes} == 0 (as we do for external-only vmd storage)
does not work here, since it would break searching for "z:"
byte-ranges when not using externals.
This eventually required PublicInbox::Import::add to use a
SharedKV to keep track of imported blobs and prevent
duplication.
|
|
In most cases, we just name the worker process based
on the command. The only change is for LeiMirror
vs "lei add-external --mirror", but I doubt it matters.
|
|
I suspect there'll be more lei_input-only things in the future.
|
|
We use the "done" term elsewhere for similar things, and
my easily-confused mind equates "complete" with shell
completion.
|
|
This also fixes completion of "lei up" for IMAP folders.
|
|
IMAPTracker has a UNIQUE constraint on the `url' column,
which may cause compatibility and/or rollback problems
in attempting to deal with UIDVALIDITY changes.
Having multiple sources of truth leads to confusion and bugs,
so relying on LeiMailSync exclusively ought to simplify things.
Furthermore, since LeiMailSync is only written to by LeiStore,
it is safer in that it won't mark a UID or article as imported
until git-fast-import has seen it, and the SQLite commit always
happens after "done\n" is sent to fast-import.
This mostly reverts recent commits to IMAPTracker to support
lei, those are:
1) commit 7632d8f7590daf70c65d4270e750c36552fa9389
("net_reader: restart on first UID when UIDVALIDITY changes")
2) commit 311a5d37ad275cd75b1e64d87827c4d13fe4bfab
("imap_tracker: prepare for use with lei").
This means public-inbox-watch will not change between 1.6 and
1.7: -watch stops synching a folder when UIDVALIDITY changes.
|
|
This lets us share more code and reduces cognitive overhead when
it comes to picking names (because {lsss} was ridiculous).
We'll need to ensure the first error set in lei is the actual
error we exit with, otherwise things can get confusing and
errors may get lost.
|
|
"lei import" is probably the only place where it users
might care about warnings.
|
|
Simplify our internals a little bit.
|
|
We aren't using it, yet, but the plan is to be able to use
this information to propagate keyword changes back to IMAP
and Maildir folders using some to-be-implemented command.
"lei inspect" is a half-baked new command to make testing this
change easier. It will be updated to support more SQLite+Xapian
introspection duties in the future, including public-inbox
things independent of lei.
|
|
This saves some work and makes it easier to set volatile
metadata on a message at import time.
|
|
No point in burning through bandwidth to import stuff we already
saw. All this logic is shared with -watch but uses a different
pathname for lei since it's tied to lei/store (and not a
public-inbox).
|
|
Code is the enemy, and there's no need to duplicate things, here.
There may be further opportunities along these lines to further
deduplicate things...
|
|
No point in sending a command for every input when a
single one will do. We'll also trigger LeiStore->done
sooner in the worker rather than later.
|
|
We must use the $ops hashref returned by lei->workers_start,
since it's modified to include extra handlers for auth failures
and whatnot.
Fixes: 954581b8e575966a ("lei: simplify PktOp callers")
|
|
We can consistently open /dev/stdin correctly nowadays, so
drop the input_stdin and just use the normal ->path_to_fd
code path.
|
|
Provide a consistent ->op_wait_event method instead of
forcing callers to loop (or not) at each callsite.
This also avoids a possible leak by avoiding circular
references.
|
|
"lei import" should never be without a {sto}, and *_done should
not be called multiple times, so ensure we can fail if it's
missing.
Update some existing tests to complain loudly by introducing a
handy "xbail" function which wraps "explain" and BAIL_OUT.
BAIL_OUT was painful to type and concatenating the result of
"explain" doesn't work as I thought it would since "explain"
always returns an array, and BAIL_OUT only accepts a single
scalar arg (unlike "die").
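
A sketch of how such a wrapper can be put together. It may differ
from the real "xbail", and `bail_msg' and `xbail_sketch' are
hypothetical names. The message-building step is split out so it can
be exercised without actually bailing out:

```perl
use strict;
use warnings;
use Test::More;

# Build one scalar from mixed args: references are dumped via
# explain(), plain scalars pass through, then everything is joined.
sub bail_msg {
    join(' ', map { ref($_) ? explain($_) : $_ } @_);
}

# BAIL_OUT only takes a single scalar, so hand it the joined string.
sub xbail_sketch { BAIL_OUT(bail_msg(@_)) }

my $msg = bail_msg('unexpected state:', { sto => undef });
like($msg, qr/\Aunexpected state: \{/, 'refs are dumped and joined');
done_testing;
```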
|
|
Instead of creating a short-lived circular reference,
ensure they don't exist in the first place.
Note the following changes to hold an extra ref to $sto:
    - $self->_lei_store(1)->write_prepare($self);
    + my $sto = $self->_lei_store(1);
    + $sto->write_prepare($self);
I'm not a perlguts expert, but I actually wanted to switch
to the one-line version for LeiImport, but xt/lei-auth-fail.t
was getting stuck for some reason. It seems the extra ref
to the LeiStore ($sto) object is necessary.
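
For background, a generic illustration of why circular references
matter under Perl's reference counting. `Node' is a hypothetical
class and this does not reproduce the LeiStore case above:

```perl
use strict;
use warnings;

my $destroyed = 0;

package Node; # hypothetical class that counts destructor calls
sub new { bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

{
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x; # cycle: neither refcount can reach zero
}
my $after_cycle = $destroyed;

{
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y; # one-way reference only
}
my $after_plain = $destroyed - $after_cycle;
print "cycle: $after_cycle destroyed, no cycle: $after_plain destroyed\n";
# prints "cycle: 0 destroyed, no cycle: 2 destroyed"
```

Objects stuck in a cycle are only cleaned up at interpreter exit,
which is too late for a long-lived daemon.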
|
|
"lei convert" is actually a bit of the odd one, since
it uses lei2mail for auth, unlike the others.
|
|
Only tested for keywords and labels with file inputs, so far;
but it seems to do what it needs to do. There's a bit more
redundant code than I'd like, and more opportunities for code
sharing in the future.
"lei import" will be expanded to support +kw:$KEYWORD and
+L:$LABEL in the future.
|
|
Those headers only have meaning for mboxes. Don't surprise
users by trying to make sense of a header that is only defined for mboxes.
It's possible to send email with (Status|X-Status) headers and
have those headers show up in a recipient's IMAP mailbox.
This was bad because an IMAP user may want to import a single
message through their MUA and pipe its contents to "lei import"
without noticing a mischievous sender stuck "X-Status: F"
(flagged/important) in there.
|
|
This improves code regularity, and will let us deal with
the "RFC822" messages with "From " line that mutt pipes
to.
|
|
Relying on UNIVERSAL::can may cause internal helper methods
to be used, which can lead to failures or nonsensical results.
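
A sketch of the hazard and one possible alternative, an explicit
dispatch table. The class, method names and table are hypothetical,
and this is not necessarily what the commit itself does:

```perl
use strict;
use warnings;

package My::Cmd; # hypothetical command class
sub new { bless {}, shift }
sub import_mail { 'imported' }            # intended entry point
sub _drop_cache { 'internal helper ran' } # not meant for callers

package main;

# Explicit allowlist of user-visible commands.
my %CMD = ('import' => \&My::Cmd::import_mail);

sub run_cmd {
    my ($obj, $name) = @_;
    my $cb = $CMD{$name} or return "unknown command: $name";
    $obj->$cb;
}

my $obj = My::Cmd->new;
# can() happily resolves the internal helper, too:
my $can_helper = $obj->can('_drop_cache') ? 1 : 0;
print "can() finds helper: $can_helper\n";
print run_cmd($obj, 'import'), "\n";      # prints "imported"
print run_cmd($obj, '_drop_cache'), "\n"; # prints "unknown command: _drop_cache"
```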
|