path: root/lib/PublicInbox/SharedKV.pm
Date Commit message
2023-02-22 treewide: simplify File::Path mkpath/make_path callers
File::Path already accounts for the existence of directories, handles races from redundant mkdir(2), and croaks on unrecoverable errors. So there's no point in doing any of that on our end. Furthermore, avoiding the overhead of loading File::Path doesn't seem worth it to save 20-60ms given the overhead of loading our other code. Instead, try to reduce optree overhead in our own code, since File::Path gets used in a bunch of places. We'll also favor the newer make_path for multi-directory invocations to avoid bloating our own optree to create an arrayref, but mkpath is one fewer subroutine call within File::Path itself, right now.
2022-02-14 sharedkv: avoid ambiguity for numeric-like string keys
While we only store URLs and binary SHA-1/SHA-256 values in skv at the moment, we may store potentially ambiguous keys/values in the future. It's possible to store "02" and have it treated as `2' unless explicitly binding parameters as SQL_BLOB. This behavior was independent of the sqlite_unicode parameter as evidenced by the new tests. I only noticed this bug while hacking on another project using DBD::SQLite, and not while hacking on public-inbox itself.
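The following is a rough illustration of a related SQLite pitfall, not the DBD::SQLite code path the commit fixes. It uses Python's sqlite3 module and two hypothetical tables (`numeric_kv`, `blob_kv`) to show how a numeric-like string such as "02" can lose its leading zero when SQLite treats it as a number, while a value bound as a blob is stored byte-for-byte.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables for illustration only:
# one column with INTEGER affinity, one with BLOB affinity.
conn.execute("CREATE TABLE numeric_kv (k INTEGER)")
conn.execute("CREATE TABLE blob_kv (k BLOB)")

# A str is bound as TEXT; INTEGER affinity then converts "02" to the integer 2.
conn.execute("INSERT INTO numeric_kv (k) VALUES (?)", ("02",))

# A bytes object is bound as a BLOB and is stored unchanged.
conn.execute("INSERT INTO blob_kv (k) VALUES (?)", (b"02",))

numeric_row = conn.execute("SELECT typeof(k), k FROM numeric_kv").fetchone()
blob_row = conn.execute("SELECT typeof(k), k FROM blob_kv").fetchone()

print(numeric_row)  # storage class and value after numeric conversion
print(blob_row)     # storage class and value when bound as a blob
```

In the Perl module the ambiguity comes from how parameters are bound rather than from column affinity, which is why the commit binds explicitly as SQL_BLOB.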
2022-02-14 sharedkv: remove unused subs
Some features didn't get used, and they're just getting in the way of upcoming bugfixes.
2022-01-31 rewrite Linux nodatacow use in pure Perl w/o system
btrfs is Linux-only at the moment (and likely to remain that way for practical purposes). So rely on Linux ABI stability and use the `syscall' and `ioctl' perlops rather than relying on Inline::C. Inline::C (and gcc||clang) are monstrous dependencies which we can't expect users to have. This makes supporting new architectures more difficult, but new architectures come along rarely and this reduces the burden for the majority of Linux users on popular architectures (while still avoiding the distribution of pre-built binaries).
Link: https://public-inbox.org/meta/YbCPWGaJEkV6eWfo@codewreck.org/
2021-11-01 treewide: kill problematic "$h->{k} //= do {" assignments
As stated in the previous change, conditional hash assignments which trigger other hash assignments seem problematic, at times. So replace:
    $h->{k} //= do { $h->{x} = ...; $val };
with:
    $h->{k} // do { $h->{x} = ...; $h->{k} = $val };
"||=" is affected the same way, and some instances of "||=" are replaced with "//=" or "// do {", now.
2021-10-24 shared_kv: remove cache_size attribute support
We're not using it, anywhere.
2021-10-10 set nodatacow on more SQLite files
We'll set nodatacow when detecting existing but empty files, and also their directories in more cases (for auxiliary -wal, -journal, -shm files). Hopefully this keeps performance reasonable on CoW FSes.
2021-09-21 lei: various completion improvements
"lei export-kw" no longer completes for anonymous sources. More commands use "lei refresh-mail-sync" as a basis for their completion work, as well. ";AUTH=ANONYMOUS@" is stripped from completions since it was preventing bash completion from working on AUTH=ANONYMOUS IMAP URLs. I'm not sure if there's a better way, but all of our code works fine without specifying AUTH=ANONYMOUS as a command-line arg. Finally, we fall back to using more candidates if none can be found, allowing multiple URLs to be completed.
2021-07-25 lei: avoid SQLite COUNT() for dedupe
SQLite COUNT() is a slow operation that does a full table scan with no conditions. There's no need for it, since lei dedupe only needs to know if it's empty or not to decide between new/ and cur/ for Maildir outputs.
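As a minimal sketch of the idea (in Python's sqlite3 rather than the module's Perl, and with a hypothetical `kv` table), an emptiness check can ask for at most one row instead of counting all of them:

```python
import sqlite3


def is_empty(conn: sqlite3.Connection, table: str) -> bool:
    """Return True if `table` has no rows, without counting every row."""
    # LIMIT 1 lets SQLite stop as soon as it finds a single row.
    # `table` is interpolated directly, so it must be a trusted identifier.
    row = conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone()
    return row is None


conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB)")

empty_before = is_empty(conn, "kv")
conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (b"key", b"value"))
empty_after = is_empty(conn, "kv")
```

The caller only needs a yes/no answer here, so the single-row probe gives the same decision as comparing a COUNT() result against zero.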
2021-06-13 lei ls-mail-source: write through to URL folder cache
We'll be able to use this for shell completion for lei import, lcat, tag, etc. This also adds --url support for scripting purposes.
2021-02-01 sharedkv: do not set cache_size by default
These DBs will probably be too small to be worth increasing the cache size of.
2021-02-01 sharedkv: use lock_for_scope_fast
This allows us to avoid repeated open() and close() syscalls and speeds up the new xt/stress-sharedkv.t maintainer test by roughly 7%.
2021-02-01 sharedkv: lock and explicitly disconnect {dbh}
It may be possible for updates or changes to be uncommitted until disconnect, so we'll use flock() as we do elsewhere to avoid the polling retry behavior of SQLite. We also need to clear CachedKids before disconnecting to avoid warnings like:
    ->disconnect invalidates 1 active statement handle (either destroy statement handles or call finish on them before disconnecting)
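The pattern described above can be sketched roughly as follows. This is a Python analogue under stated assumptions, not the module's Perl implementation: the `locked` and `set_key` helpers, the separate lock file, and the `kv` table are all hypothetical, and `fcntl.flock` is Unix-only.

```python
import fcntl
import os
import sqlite3
from contextlib import contextmanager


@contextmanager
def locked(lock_path):
    """Hold an exclusive flock() on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks in the kernel until the lock is free, so writers queue up
        # instead of relying on SQLite's busy/retry polling.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def set_key(db_path, lock_path, key, val):
    """Store key/val, committing and closing the connection under the lock."""
    with locked(lock_path):
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, val)
            )
            conn.commit()
        finally:
            # Close explicitly before the lock is released, so no
            # uncommitted state outlives the critical section.
            conn.close()
```

Closing the connection inside the locked region mirrors the commit's point that changes may stay uncommitted until disconnect.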
2021-02-01 sharedkv: release {dbh} before rmtree
This may be needed to avoid warnings/errors when operating in single process mode in the future.
2021-01-30 shared_kv: simplify PID+object guard for cleanup
We don't need another hash slot when we can encode the object ID and PID owner into the field name itself.
2021-01-14 lei_dedupe+shared_kv: ensure round-tripping serialization
We'll be passing these objects via PublicInbox::IPC which uses Storable (or Sereal), so ensure they're safe to use after serialization.
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2021-01-01 sharedkv: split out index_values
In most cases, we won't need to index by value, so don't waste cycles or space on it.
2021-01-01 sharedkv: fork()-friendly key-value store
This is intended for maintaining Maildir states and mbox message deduplication, but may be useful for other purposes...