path: root/examples/grok-pull.post_update_hook.sh
Date  Commit message
2022-10-24 treewide: replace /^I: / prefix with /^# /
This is likely more familiar to readers of TAP (Test Anything Protocol) output, as well as shell and Perl scripters, who also use `#' for comments. AFAIK, nobody is parsing our stderr, and I'm not sure how standardized the `I:' prefix is (nor the `W:' and `E:' prefixes). It's already the prevailing style in Lei* code, too, so things have been moving in that direction for a bit.
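A minimal sketch of the convention described above, assuming a hypothetical `info` helper (the name is not taken from the script itself): informational messages go to stderr with a `# ' prefix, leaving stdout untouched.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: print an informational
# message to stderr using the "# " prefix instead of "I: ".
info() {
	printf '# %s\n' "$*" >&2
}

# Example call (commented out so that sourcing this file stays silent):
# info "updating mirror"
```

Since the message lands on stderr, anything reading the hook's stdout is unaffected by the change of prefix.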
2020-08-26 grok-pull.post_update_hook: flock(2) before SQLite check
Unlike DBD::SQLite, the sqlite3(1) CLI does not have a default busy timeout enabled, so it easily times out while acquiring a SHARED lock for read-only queries. We can avoid battery-wasting polling from the SQLite timeout handler by relying on flock(2) as we do in our Perl code. Furthermore, this avoids triggering some locking problems[1] from a long "SELECT COUNT(*) ..." query and reindex. While there may be other SQLite-related parallelism issues[1], this works around one of them by relying on flock(2). [1] https://public-inbox.org/meta/20200825001204.GA840@dcvr/
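The idea of taking the lock before querying can be sketched roughly as follows. This is not the hook's actual code: it assumes the flock(1) utility from util-linux is installed (it is not part of a stock *BSD base system), and `with_lock`, `$lock_path`, `$db_path` and the table name in the usage comment are hypothetical placeholders.

```shell
#!/bin/sh
# with_lock LOCKFILE CMD [ARGS...]
# Assumption: flock(1) is available. It takes a flock(2) lock on
# LOCKFILE (creating the file if needed), blocking in the kernel
# rather than polling, then runs CMD and returns its exit status.
with_lock() {
	lock=$1
	shift
	flock "$lock" "$@"
}

# Hypothetical usage; both paths and the table name are placeholders:
# with_lock "$lock_path" sqlite3 "$db_path" 'SELECT COUNT(*) FROM some_table'
```

Because the query only runs once the lock is held, the sqlite3(1) CLI should not have to contend with a writer that takes the same lock.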
2020-08-14 grok-pull.post_update_hook: favor --sequential-shard for HDD
--sequential-shard offers better performance on HDD than -j0 since the on-disk active set can be kept small (with -j $HIGH_NUM). --batch-size can also be helpful for systems with much RAM.
2020-07-29 examples/grok-pull.post_update_hook: fix description URL
I finally noticed descriptions weren't showing up in my mirrors :x
2020-07-17 doc: add some recommendations around slow HDDs
grok-pull is still painful with serialization on an old USB 2.0 HDD, but at least it can finish with flock(1) and disabling parallelization. While parallel "git fetch" doesn't seem so bad, slow seeks are exacerbated by parallel reads in Xapian. That means some updates can take days instead of hours. The same updates take only seconds or minutes on an SSD.
2020-04-06 examples/grok-pull.post_update_hook: move url_base to the top
Users are encouraged to edit this script, anyways, so make it easy for them to swap out and use whatever URL they need.
2020-04-06 examples/grok-pull.post_update_hook: capture infourl
The values of infourl parameters are shared in the config, so include them in the mirror.
2020-04-06 examples/grok-pull.post_update_hook: fetch mirror description
The $INBOX_URL/description endpoint has been available since v1.3.0, so use it in mirrors.
2019-10-18 examples/grok-pull.post_update_hook: fix config detection
We need to account for both the old ("mainrepo") and new ("inboxdir") names. But "dir" was just a search+replace error and we don't use that outside of "coderepo.dir".
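One way to accept both names is to ask for the new key first and fall back to the old one. The sketch below is not the hook's actual code: `inbox_dir` is a hypothetical helper, and it assumes git(1) is installed and that the config file uses `[publicinbox "NAME"]` sections.

```shell
#!/bin/sh
# inbox_dir CONFIG_FILE INBOX_NAME
# Hypothetical helper: print the directory configured for INBOX_NAME,
# preferring the new "inboxdir" key and falling back to the old
# "mainrepo" key. Exits non-zero if neither key is set.
inbox_dir() {
	cfg=$1
	name=$2
	git config -f "$cfg" "publicinbox.$name.inboxdir" ||
	git config -f "$cfg" "publicinbox.$name.mainrepo"
}
```

`git config` exits non-zero when a key is absent, so the `||` only reaches the old name when the new one is missing.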
2019-10-16 examples/grok-pull.post_update_hook: use "inbox_dir"
Move away from using "mainrepo" since it's confusing to new users, especially with v2.
2019-10-07 examples: add grok-pull post_update_hook example
This requires the latest (to be in 1.2) -init changes for synchronization and has no dependencies on GNU or bash-isms, so it should run on *BSD systems without GNU tools. It does attempt to use curl on <$INBOX_URL/_/text/config/raw>, but curl is fairly standard nowadays, and the script falls back to using an invalid address to initialize.