dumping ground for random patches and texts
* [PATCH 1/9] lei: use sleep(1) loop for infinite sleep
@ 2021-02-03  1:40 Eric Wong
  2021-02-03  1:40 ` [PATCH 2/9] lei: reduce FD pressure from lei2mail worker Eric Wong
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:40 UTC (permalink / raw)
  To: spew

Perl may internally race and miss signals due to a lack of
self-pipe / eventfd / signalfd / EVFILT_SIGNAL usage.  While our
event loop paths avoid these problems by using signalfd or
EVFILT_SIGNAL, these sleep() calls are not within the event loop.
---
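A rough sketch of the pattern this patch touches, for illustration
only.  split_status and die_like_child are hypothetical names, not
lei functions; the idea is to re-raise the signal that ended a child
and then nap in one-second slices rather than rely on a single bare
sleep().

```perl
use strict;
use warnings;

# Hypothetical helper (not part of lei): split a wait status such as $?
# into the terminating signal number and the exit code.
sub split_status {
	my ($status) = @_;
	($status & 127, $status >> 8);
}

# Hypothetical helper: end this process the way a child ended.  If the
# child was killed by a signal, re-raise that signal against ourselves
# and wait for it to land; otherwise exit with the child's exit code.
sub die_like_child {
	my ($status) = @_;
	my ($signum, $code) = split_status($status);
	if ($signum) {
		$SIG{PIPE} = 'DEFAULT'; # usually SIGPIPE, as in x_it
		kill($signum, $$);
		sleep(1) while 1; # short naps instead of one endless sleep
	}
	exit($code);
}
```

With a status of 3 << 8, die_like_child exits with code 3; with a
status of 13 it re-raises SIGPIPE and the loop only runs until the
signal is delivered.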
 lib/PublicInbox/LEI.pm | 2 +-
 script/lei             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9afc90cf..9b9aed64 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -298,7 +298,7 @@ sub x_it ($$) {
 		if (my $signum = ($code & 127)) { # usually SIGPIPE (13)
 			$SIG{PIPE} = 'DEFAULT'; # $SIG{$signum} doesn't work
 			kill $signum, $$;
-			sleep; # wait for signal
+			sleep(1) while 1; # wait for signal
 		} else {
 			$quit->($code >> 8);
 		}
diff --git a/script/lei b/script/lei
index 58f0dbe9..40c21ad8 100755
--- a/script/lei
+++ b/script/lei
@@ -116,7 +116,7 @@ Falling back to (slow) one-shot mode
 	sigchld();
 	if (my $sig = ($x_it_code & 127)) {
 		kill $sig, $$;
-		sleep;
+		sleep(1) while 1;
 	}
 	exit($x_it_code >> 8);
 } else { # for systems lacking Socket::MsgHdr or Inline::C


* [PATCH 2/9] lei: reduce FD pressure from lei2mail worker
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
@ 2021-02-03  1:40 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 3/9] lei: further reduce lei2mail FD pressure Eric Wong
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:40 UTC (permalink / raw)
  To: spew

lei2mail doesn't need stdin anymore, so we can reuse the [0] slot
for the $not_done keepalive pipe.
---
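For readers unfamiliar with the keepalive trick: a minimal sketch,
assuming plain fork() workers instead of lei's wq workers.
wait_for_workers and its variable names are made up for this
example.  The reader only sees EOF once every copy of the write end
has been closed, so handing the write end to each worker gives a
cheap "all done" notification.

```perl
use strict;
use warnings;

# Hypothetical demo: returns 0 (EOF) once all workers have dropped
# their inherited copy of the write end of the pipe.
sub wait_for_workers {
	my ($nr_workers) = @_;
	pipe(my $done_r, my $not_done) or die "pipe: $!";
	my @pids;
	for my $i (1..$nr_workers) {
		my $pid = fork();
		defined($pid) or die "fork: $!";
		if ($pid == 0) { # worker: inherits a copy of $not_done
			close $done_r;
			# ... real work would happen here ...
			exit(0); # exiting closes this worker's copy
		}
		push @pids, $pid;
	}
	close $not_done; # parent drops its own copy
	# blocks until every writer has closed its copy
	my $n = sysread($done_r, my $buf, 1);
	defined($n) or die "sysread: $!";
	waitpid($_, 0) for @pids;
	$n;
}
```

Nothing is ever written to the pipe; only the EOF matters, which is
presumably why the patch shrinks the pipe buffer with F_SETPIPE_SZ on
Linux.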
 lib/PublicInbox/LeiOverview.pm | 8 ++++----
 lib/PublicInbox/LeiToMail.pm   | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 52da225d..88034ada 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -217,13 +217,13 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
-		# $io[-1] becomes a notification pipe that triggers EOF
+		# $io[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete
-		pipe($l2m->{each_smsg_done}, $io[$#io + 1]) or die "pipe: $!";
-		fcntl($io[-1], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
+		$io[0] = undef;
+		pipe($l2m->{each_smsg_done}, $io[0]) or die "pipe: $!";
+		fcntl($io[0], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
 		delete @$lei_ipc{qw(l2m opt mset_opt cmd)};
-		$lei_ipc->{each_smsg_not_done} = $#io;
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git;
 		my $git_dir = $git->{git_dir};
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index c6c5f84b..c704dc2a 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -464,7 +464,7 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $smsg, $lei) = @_;
-	my $not_done = delete $self->{$lei->{each_smsg_not_done}};
+	my $not_done = delete $self->{0} // die 'BUG: $not_done missing';
 	my $wcb = $self->{wcb} //= do { # first message
 		$lei->atfork_child_wq($self);
 		$self->write_cb($lei);


* [PATCH 3/9] lei: further reduce lei2mail FD pressure
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
  2021-02-03  1:40 ` [PATCH 2/9] lei: reduce FD pressure from lei2mail worker Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 4/9] pkt_op: rely on DS::in_loop global Eric Wong
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

lei2mail workers don't need to send errors directly to the client;
errors can instead go through lei-daemon or the top-level one-shot
process.
---
 lib/PublicInbox/LeiOverview.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 88034ada..366af8b2 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -216,7 +216,9 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
+		my $sock = delete $lei->{sock}; # lei2mail doesn't need it
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
+		$lei->{sock} = $sock if $sock;
 		# $io[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete


* [PATCH 4/9] pkt_op: rely on DS::in_loop global
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
  2021-02-03  1:40 ` [PATCH 2/9] lei: reduce FD pressure from lei2mail worker Eric Wong
  2021-02-03  1:41 ` [PATCH 3/9] lei: further reduce lei2mail FD pressure Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 5/9] lei: err: avoid uninitialized variable warnings Eric Wong
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

No reason to check for $lei->{oneshot} here.
---
 lib/PublicInbox/LeiXSearch.pm |  2 +-
 lib/PublicInbox/PktOp.pm      | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 37bd233e..23a9c020 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -421,7 +421,7 @@ sub do_query {
 		'' => [ \&query_done, $lei ],
 		'mset_progress' => [ \&mset_progress, $lei ],
 	};
-	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops, !$lei->{oneshot});
+	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops);
 	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
 	delete($lei->{pkt_op});
 
diff --git a/lib/PublicInbox/PktOp.pm b/lib/PublicInbox/PktOp.pm
index 59b37ff8..40c7262a 100644
--- a/lib/PublicInbox/PktOp.pm
+++ b/lib/PublicInbox/PktOp.pm
@@ -17,9 +17,9 @@ use PublicInbox::IPC qw(ipc_freeze ipc_thaw);
 our @EXPORT_OK = qw(pkt_do);
 
 sub new {
-	my ($cls, $r, $ops, $in_loop) = @_;
-	my $self = bless { sock => $r, ops => $ops, re => [] }, $cls;
-	if ($in_loop) { # iff using DS->EventLoop
+	my ($cls, $r, $ops) = @_;
+	my $self = bless { sock => $r, ops => $ops }, $cls;
+	if ($PublicInbox::DS::in_loop) { # iff using DS->EventLoop
 		$r->blocking(0);
 		$self->SUPER::new($r, EPOLLIN|EPOLLET);
 	}
@@ -28,10 +28,10 @@ sub new {
 
 # returns a blessed object as the consumer, and a GLOB/IO for the producer
 sub pair {
-	my ($cls, $ops, $in_loop) = @_;
+	my ($cls, $ops) = @_;
 	my ($c, $p);
 	socketpair($c, $p, AF_UNIX, SOCK_SEQPACKET, 0) or die "socketpair: $!";
-	(new($cls, $c, $ops, $in_loop), $p);
+	(new($cls, $c, $ops), $p);
 }
 
 sub pkt_do { # for the producer to trigger event_step in consumer


* [PATCH 5/9] lei: err: avoid uninitialized variable warnings
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-03  1:41 ` [PATCH 4/9] pkt_op: rely on DS::in_loop global Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 6/9] lei: propagate curl errors, improve internal consistency Eric Wong
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

---
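A stand-alone sketch of the newline handling changed below, for
illustration only (with_eol is a hypothetical name, not a lei
function).  Two things can be undef in the old code: $_[-1] when the
last argument is undef or missing, and $eor itself, since `()` in
scalar context yields undef.  Using `// ''` and a list avoids both.

```perl
use strict;
use warnings;

# Hypothetical helper: return the arguments, plus a trailing "\n"
# only when the last argument does not already end with one.
sub with_eol {
	my @msg = @_;
	# `// ''` keeps substr() from seeing undef when the last
	# element is undef or the list is empty
	my @eor = (substr($msg[-1] // '', -1, 1) eq "\n" ? () : ("\n"));
	# an empty @eor contributes nothing to the list, so there is
	# no undef element left for print to warn about
	(@msg, @eor);
}
```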
 lib/PublicInbox/LEI.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9b9aed64..2fe4646e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -308,12 +308,12 @@ sub x_it ($$) {
 sub err ($;@) {
 	my $self = shift;
 	my $err = $self->{2} // ($self->{pgr} // [])->[2] // *STDERR{GLOB};
-	my $eor = (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
-	print $err @_, $eor and return;
+	my @eor = (substr($_[-1]//'', -1, 1) eq "\n" ? () : ("\n"));
+	print $err @_, @eor and return;
 	my $old_err = delete $self->{2};
-	close($old_err) if $! == EPIPE && $old_err;;
+	close($old_err) if $! == EPIPE && $old_err;
 	$err = $self->{2} = ($self->{pgr} // [])->[2] // *STDERR{GLOB};
-	print $err @_, $eor or print STDERR @_, $eor;
+	print $err @_, @eor or print STDERR @_, @eor;
 }
 
 sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }


* [PATCH 6/9] lei: propagate curl errors, improve internal consistency
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-03  1:41 ` [PATCH 5/9] lei: err: avoid uninitialized variable warnings Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 7/9] lei q: --include/--exclude/--only support globs and basenames Eric Wong
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

IO::Uncompress::Gunzip seems to be losing $? when closing
PublicInbox::ProcessPipe.  To work around this, do a synchronous
waitpid ourselves to force proper $? reporting, and update tests
to use the new --only feature for testing invalid URLs.

This improves internal code consistency by having {pkt_op}
parse the same ASCII-only protocol script/lei understands.

We no longer pass {sock} to worker processes at all,
further reducing FD pressure on per-user limits.
---
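A minimal sketch of the two message shapes {pkt_op} accepts after
this patch.  parse_pkt is a hypothetical name, and Storable is only
an assumed stand-in here for whatever serializer ipc_freeze/ipc_thaw
wrap.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical parser: returns ($cmd, @args) for either message shape.
sub parse_pkt {
	my ($msg) = @_;
	if (index($msg, "\0") > 0) { # "command\0<frozen array ref>"
		my ($cmd, $frozen) = split(/\0/, $msg, 2);
		return ($cmd, @{thaw($frozen)});
	}
	# plain ASCII form, e.g. "x_it 768" or "child_error 768"
	split(/ /, $msg);
}
```

For example, parse_pkt("x_it 768") yields ('x_it', '768'), and
768 >> 8 is the exit code 3 that the new test expects from curl.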
 lib/PublicInbox/LEI.pm         | 15 ++++++++-------
 lib/PublicInbox/LeiOverview.pm |  2 --
 lib/PublicInbox/LeiXSearch.pm  | 16 +++++++---------
 lib/PublicInbox/PktOp.pm       | 15 +++++++++++----
 t/lei.t                        | 29 ++++++++++++++++-------------
 5 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2fe4646e..ca81678a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -285,8 +285,8 @@ sub x_it ($$) {
 	# make sure client sees stdout before exit
 	$self->{1}->autoflush(1) if $self->{1};
 	dump_and_clear_log();
-	if (my $sock = $self->{sock}) {
-		send($sock, "x_it $code", MSG_EOR);
+	if (my $s = $self->{pkt_op} // $self->{sock}) {
+		send($s, "x_it $code", MSG_EOR);
 	} elsif ($self->{oneshot}) {
 		# don't want to end up using $? from child processes
 		for my $f (qw(lxs l2m)) {
@@ -339,9 +339,10 @@ sub puts ($;@) { out(shift, map { "$_\n" } @_) }
 
 sub child_error { # passes non-fatal curl exit codes to user
 	my ($self, $child_error) = @_; # child_error is $?
-	if (my $sock = $self->{sock}) { # send to lei(1) client
-		send($sock, "child_error $child_error", MSG_EOR);
-	} elsif ($self->{oneshot}) {
+	if (my $s = $self->{pkt_op} // $self->{sock}) {
+		# send to the parent lei-daemon or to lei(1) client
+		send($s, "child_error $child_error", MSG_EOR);
+	} elsif (!$PublicInbox::DS::in_loop) {
 		$self->{child_error} = $child_error;
 	} # else noop if client disconnected
 }
@@ -420,9 +421,9 @@ sub atfork_parent_wq {
 		$lei->{$f} = $wq->deep_clone($tmp);
 	}
 	$self->{env} = $env;
-	delete @$lei{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
+	delete @$lei{qw(sock 3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
 	my @io = (delete(@$lei{qw(0 1 2)}),
-			io_extract($lei, qw(sock pkt_op startq)));
+			io_extract($lei, qw(pkt_op startq)));
 	my $l2m = $lei->{l2m};
 	if ($l2m && $l2m != $wq) { # $wq == lxs
 		if (my $wq_s1 = $l2m->{-wq_s1}) {
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 366af8b2..88034ada 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -216,9 +216,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
-		my $sock = delete $lei->{sock}; # lei2mail doesn't need it
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
-		$lei->{sock} = $sock if $sock;
 		# $io[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 23a9c020..d33064bb 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -113,8 +113,7 @@ sub mset_progress {
 	if ($lei->{pkt_op}) { # called via pkt_op/pkt_do from workers
 		pkt_do($lei->{pkt_op}, 'mset_progress', @_);
 	} else { # single lei-daemon consumer
-		my @args = ref($_[-1]) eq 'ARRAY' ? @{$_[-1]} : @_;
-		my ($desc, $mset_size, $mset_total_est) = @args;
+		my ($desc, $mset_size, $mset_total_est) = @_;
 		$lei->{-mset_total} += $mset_size;
 		$lei->err("# $desc $mset_size/$mset_total_est");
 	}
@@ -264,14 +263,11 @@ sub query_remote_mboxrd {
 		shift(@$cmd) if !$cmd->[0];
 
 		$lei->err("# @$cmd") if $verbose;
-		$? = 0;
-		my $fh = popen_rd($cmd, $env, $rdr);
+		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
 		$fh = IO::Uncompress::Gunzip->new($fh);
-		eval {
-			PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
-							$lei, $each_smsg);
-		};
-		return $lei->fail("E: @$cmd: $@") if $@;
+		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
+						$lei, $each_smsg);
+		waitpid($pid, 0) == $pid or die "BUG: waitpid (curl): $!";
 		if ($? == 0) {
 			my $nr = $lei->{-nr_remote_eml};
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
@@ -420,6 +416,8 @@ sub do_query {
 		'.' => [ \&do_post_augment, $lei, $zpipe, $au_done ],
 		'' => [ \&query_done, $lei ],
 		'mset_progress' => [ \&mset_progress, $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
 	};
 	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops);
 	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
diff --git a/lib/PublicInbox/PktOp.pm b/lib/PublicInbox/PktOp.pm
index 40c7262a..10d76da0 100644
--- a/lib/PublicInbox/PktOp.pm
+++ b/lib/PublicInbox/PktOp.pm
@@ -4,8 +4,7 @@
 # op dispatch socket, reads a message, runs a sub
 # There may be multiple producers, but (for now) only one consumer
 # Used for lei_xsearch and maybe other things
-# "literal" => [ sub, @operands ]
-# /regexp/ => [ sub, @operands ]
+# "command" => [ $sub, @fixed_operands ]
 package PublicInbox::PktOp;
 use strict;
 use v5.10.1;
@@ -57,11 +56,19 @@ sub event_step {
 			$self->close;
 			die "recv: $!";
 		}
-		my ($cmd, $pargs) = split(/\0/, $msg, 2);
+		my ($cmd, @pargs);
+		if (index($msg, "\0") > 0) {
+			($cmd, my $pargs) = split(/\0/, $msg, 2);
+			@pargs = @{ipc_thaw($pargs)};
+		} else {
+			# for compatibility with the script/lei in client mode,
+			# it doesn't load Sereal||Storable for startup speed
+			($cmd, @pargs) = split(/ /, $msg);
+		}
 		my $op = $self->{ops}->{$cmd //= $msg};
 		die "BUG: unknown message: `$cmd'" unless $op;
 		my ($sub, @args) = @$op;
-		$sub->(@args, $pargs ? ipc_thaw($pargs) : ());
+		$sub->(@args, @pargs);
 		return $self->close if $msg eq ''; # close on EOF
 	} while (1);
 }
diff --git a/t/lei.t b/t/lei.t
index 33f47ae4..461669a8 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -14,6 +14,7 @@ require_mods(qw(json DBD::SQLite Search::Xapian));
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my ($home, $for_destroy) = tmpdir();
 my $err_filter;
+my $curl = which('curl');
 my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
 	http://czquwvybam4bgbro.onion/meta/
 	http://ou63pmih66umazou.onion/meta/);
@@ -39,7 +40,7 @@ local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 local $ENV{HOME} = $home;
 local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
-my $home_trash = [ "$home/.local", "$home/.config" ];
+my $home_trash = [ "$home/.local", "$home/.config", "$home/junk" ];
 my $cleanup = sub { rmtree([@$home_trash, @_]) };
 my $config_file = "$home/.config/lei/config";
 my $store_dir = "$home/.local/share/lei";
@@ -162,26 +163,19 @@ my $setup_publicinboxes = sub {
 my $test_external_remote = sub {
 	my ($url, $k) = @_;
 SKIP: {
-	my $nr = 4;
+	my $nr = 5;
 	skip "$k unset", $nr if !$url;
-	which('curl') or skip 'no curl', $nr;
+	$curl or skip 'no curl', $nr;
 	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
-	$lei->('ls-external');
-	for my $e (split(/^/ms, $out)) {
-		$e =~ s/\s+boost.*//s;
-		$lei->('forget-external', '-q', $e) or
-			fail "error forgetting $e: $err"
-	}
-	$lei->('add-external', $url);
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
-	ok($lei->('q', '-q', "m:$mid"), "query $url");
+	my @cmd = ('q', '--only', $url, '-q', "m:$mid");
+	ok($lei->(@cmd), "query $url");
 	is($err, '', "no errors on $url");
 	my $res = $json->decode($out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
-	ok($lei->('q', '-q', "m:$mid", 'd:..20101002'), 'no results, no error');
+	ok($lei->(@cmd, 'd:..20101002'), 'no results, no error');
 	is($err, '', 'no output on 404, matching local FS behavior');
 	is($out, "[null]\n", 'got null results');
-	$lei->('forget-external', $url);
 } # /SKIP
 }; # /sub
 
@@ -355,12 +349,21 @@ my $test_completion = sub {
 	}
 };
 
+my $test_fail = sub {
+	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
+	is($? >> 8, 3, 'got curl exit for bogus URL');
+	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
+	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir');
+	is($out, '', 'no output');
+};
+
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
 	$test_init->();
 	$test_external->();
 	$test_completion->();
+	$test_fail->();
 };
 
 if ($ENV{TEST_LEI_ONESHOT}) {


* [PATCH 7/9] lei q: --include/--exclude/--only support globs and basenames
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-03  1:41 ` [PATCH 6/9] lei: propagate curl errors, improve internal consistency Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 8/9] lei: complete basenames for include|exclude|only Eric Wong
  2021-02-03  1:41 ` [PATCH 9/9] lei: help starts pager Eric Wong
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

We can do basename matching when it's unambiguous.  Since '*?[]'
characters are rare in URLs and pathnames, we'll do glob
matching by default and support a (curl-inspired) --globoff/-g
option to disable globbing.

And fix --exclude while we're at it.
---
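To show what the glob translation produces, here is a self-contained
sketch.  glob2pat is copied from the diff below; match_glob and the
example.com URLs are hypothetical.  Note the resulting pattern is
unanchored and wildcards never cross a '/'.

```perl
use strict;
use warnings;

# Same mapping as glob2pat in this patch
my %patmap = ('*' => '[^/]*?', '?' => '[^/]', '[' => '[', ']' => ']');
sub glob2pat {
	my ($glob) = @_;
	# translate glob characters, regexp-quote everything else
	$glob =~ s!(.)!$patmap{$1} || "\Q$1"!ge;
	$glob;
}

# Hypothetical helper: return the entries of @$known matching $glob
sub match_glob {
	my ($glob, $known) = @_;
	my $re = glob2pat($glob);
	grep(m!$re!, @$known);
}
```

Given https://example.com/foo/ and https://example.com/bar/ as known
externals, match_glob('*/ba[rz]', ...) selects only the second one.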
 lib/PublicInbox/LEI.pm         |  3 ++-
 lib/PublicInbox/LeiExternal.pm | 38 +++++++++++++++++++++++++++++++++-
 lib/PublicInbox/LeiQuery.pm    | 14 ++++++++-----
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ca81678a..95ce33ea 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -104,7 +104,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
-	include|I=s@ exclude=s@ only=s@ jobs|j=s
+	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g
 	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
@@ -201,6 +201,7 @@ my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 my %OPTDESC = (
 'help|h' => 'show this built-in help',
 'quiet|q' => 'be quiet',
+'globoff|g' => "do not match locations using '*?' wildcards and '[]' ranges",
 'verbose|v' => 'be more verbose',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
 'torsocks=s' => ['auto|no|yes',
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 3853cfc1..6b4c7fb0 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -39,7 +39,7 @@ sub lei_ls_external {
 }
 
 sub ext_canonicalize {
-	my ($location) = $_[-1];
+	my ($location) = @_;
 	if ($location !~ m!\Ahttps?://!) {
 		PublicInbox::Config::rel2abs_collapsed($location);
 	} else {
@@ -52,6 +52,42 @@ sub ext_canonicalize {
 	}
 }
 
+my %patmap = ('*' => '[^/]*?', '?' => '[^/]', '[' => '[', ']' => ']');
+sub glob2pat {
+	my ($glob) = @_;
+        $glob =~ s!(.)!$patmap{$1} || "\Q$1"!ge;
+        $glob;
+}
+
+sub get_externals {
+	my ($self, $loc, $exclude) = @_;
+	return (ext_canonicalize($loc)) if -e $loc;
+
+	my @m;
+	my @cur = externals_each($self);
+	my $do_glob = !$self->{opt}->{globoff}; # glob by default
+	if ($do_glob && ($loc =~ /[\*\?]/s || $loc =~ /\[.*\]/s)) {
+		my $re = glob2pat($loc);
+		@m = grep(m!$re!, @cur);
+		return @m if scalar(@m);
+	} elsif (index($loc, '/') < 0) { # exact basename match:
+		@m = grep(m!/\Q$loc\E/?\z!, @cur);
+		return @m if scalar(@m) == 1;
+	} elsif ($exclude) { # URL, maybe:
+		my $canon = ext_canonicalize($loc);
+		@m = grep(m!\A\Q$canon\E\z!, @cur);
+		return @m if scalar(@m) == 1;
+	} else { # URL:
+		return (ext_canonicalize($loc));
+	}
+	if (scalar(@m) == 0) {
+		$self->fail("`$loc' is unknown");
+	} else {
+		$self->fail("`$loc' is ambiguous:\n", map { "\t$_\n" } @m);
+	}
+	();
+}
+
 sub lei_add_external {
 	my ($self, $location) = @_;
 	my $cfg = $self->_lei_cfg(1);
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 72a67c24..10b8d6fa 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -31,17 +31,21 @@ sub lei_q {
 	}
 	if (@only) {
 		for my $loc (@only) {
-			$lxs->prepare_external($self->ext_canonicalize($loc));
+			my @loc = $self->get_externals($loc) or return;
+			$lxs->prepare_external($_) for @loc;
 		}
 	} else {
 		for my $loc (@{$opt->{include} // []}) {
-			$lxs->prepare_external($self->ext_canonicalize($loc));
+			my @loc = $self->get_externals($loc) or return;
+			$lxs->prepare_external($_) for @loc;
 		}
 		# --external is enabled by default, but allow --no-external
 		if ($opt->{external} //= 1) {
-			my %x = map {;
-				($self->ext_canonicalize($_), 1)
-			} @{$self->{exclude} // []};
+			my %x;
+			for my $loc (@{$opt->{exclude} // []}) {
+				my @l = $self->get_externals($loc, 1) or return;
+				$x{$_} = 1 for @l;
+			}
 			my $ne = $self->externals_each(\&prep_ext, $lxs, \%x);
 			$opt->{remote} //= !($lxs->locals - $opt->{'local'});
 			if ($opt->{'local'}) {


* [PATCH 8/9] lei: complete basenames for include|exclude|only
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-03  1:41 ` [PATCH 7/9] lei q: --include/--exclude/--only support globs and basenames Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  2021-02-03  1:41 ` [PATCH 9/9] lei: help starts pager Eric Wong
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

This will make it even easier for RSI-afflicted users to use,
since many externals may share a common prefix.
---
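The basename logic in isolation, as a rough sketch.
complete_basename is a hypothetical name, and it takes a plain list
of locations rather than going through externals_each.  A basename
is only offered when exactly one known location ends with it.

```perl
use strict;
use warnings;

# Hypothetical helper: complete $prefix against unambiguous basenames
sub complete_basename {
	my ($prefix, @locations) = @_;
	my %bn; # basename => number of locations sharing it
	for my $loc (@locations) {
		my $bn = (split(m!/!, $loc))[-1];
		++$bn{$bn};
	}
	my @c = sort { $a cmp $b } grep {
		$bn{$_} == 1 && /\A\Q$prefix/
	} keys %bn;
	@c;
}
```

With two externals ending in "meta" and one ending in "mail-archive",
the prefix "m" completes to "mail-archive" only, since "meta" is
ambiguous.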
 lib/PublicInbox/LeiQuery.pm | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 10b8d6fa..8015ecec 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -112,11 +112,22 @@ sub lei_q {
 sub _complete_q {
 	my ($self, @argv) = @_;
 	my $ext = qr/\A(?:-I|(?:--(?:include|exclude|only)))\z/;
-	# $argv[-1] =~ $ext and return $self->_complete_forget_external;
 	my @cur;
 	while (@argv) {
 		if ($argv[-1] =~ $ext) {
 			my @c = $self->_complete_forget_external(@cur);
+			# try basename match:
+			if (scalar(@cur) == 1 && index($cur[0], '/') < 0) {
+				my $all = $self->externals_each;
+				my %bn;
+				for my $loc (keys %$all) {
+					my $bn = (split(m!/!, $loc))[-1];
+					++$bn{$bn};
+				}
+				push @c, grep {
+					$bn{$_} == 1 && /\A\Q$cur[0]/
+				} keys %bn;
+			}
 			return @c if @c;
 		}
 		unshift(@cur, pop @argv);


* [PATCH 9/9] lei: help starts pager
  2021-02-03  1:40 [PATCH 1/9] lei: use sleep(1) loop for infinite sleep Eric Wong
                   ` (6 preceding siblings ...)
  2021-02-03  1:41 ` [PATCH 8/9] lei: complete basenames for include|exclude|only Eric Wong
@ 2021-02-03  1:41 ` Eric Wong
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2021-02-03  1:41 UTC (permalink / raw)
  To: spew

Some commands have many options, and their help output takes up
multiple screens.
---
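A generic sketch of the "only page when writing to a terminal" check,
not lei's start_pager (whose internals aren't shown in this series).
pager_or_plain is a hypothetical name, and falling back to "less"
when $PAGER is unset is an assumption of this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return a handle to print to.  Pipes through a
# pager only when $out is a terminal; otherwise returns $out as-is.
sub pager_or_plain {
	my ($out) = @_;
	return $out unless -t $out;
	my $pager = $ENV{PAGER} // 'less';
	# fall back to the original handle if the pager cannot be spawned
	open(my $pipe, '|-', $pager) or return $out;
	$pipe;
}
```

Redirected or piped output (e.g. `lei help | grep foo`) fails the -t
test and is printed directly.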
 lib/PublicInbox/LEI.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 95ce33ea..28dce0c5 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -507,7 +507,9 @@ EOF
 		$msg .= $rhs;
 		$msg .= "\n";
 	}
-	print { $self->{$errmsg ? 2 : 1} } $msg;
+	my $out = $self->{$errmsg ? 2 : 1};
+	start_pager($self) if -t $out;
+	print $out $msg;
 	x_it($self, $errmsg ? 1 << 8 : 0); # stderr => failure
 	undef;
 }

