path: root/lib/PublicInbox/WwwCoderepo.pm
Date Commit message
2024-06-07 treewide: use cached git executable lookup
Repeated stat(2) syscalls are more expensive nowadays due to CPU vulnerability mitigations and this change also allows bypassing some heap allocations done by Perl.
2024-06-07 treewide: use \*STD(IN|OUT|ERR) consistently
The {IO} slot may not always be populated, and referencing it may not work (e.g. with the `-t' filetest) if there's no IO handle. Merely using `\*' is shorter than typing out `{GLOB}', so just use the shortest form consistently. This may fix occasional and difficult-to-reproduce failures from redirecting STDERR in t/imap_searchqp.t
2024-05-25 www_coderepo: improve handling of broken repos
The {readme} arrayref may be completely unset if a repo goes missing, so account for that when the summary page runs.
2024-04-03 treewide: avoid getpid() for OnDestroy checks
getpid() isn't cached by glibc nowadays and system calls are more expensive due to CPU vulnerability mitigations. To ensure we switch to the new semantics properly, introduce a new `on_destroy' function to simplify callers. Furthermore, OnDestroy correctness is often tied to the process which creates it, so make the new API default to being guarded against running in subprocesses. For cases which require running in all children, a new PublicInbox::OnDestroy::all call is provided.
2024-01-17 www: repolist: support globbing in URL
This can make it easier to find deeply-nested repositories on my mirror of git.kernel.org. It's not perfect, since projects like Linux use several completely different basenames (e.g. linux.git vs vfs.git vs net.git), but it can still help find significant matches further up a tree. I don't expect glob characters to conflict with actual git repositories used by reasonable people, but direct (non-glob) hits are still tried first.
2024-01-10 www: use autodie in more coderepo places
This cuts down on code somewhat (before I add more :x)
2023-12-13 www_coderepo: fix read buffering
Our read buffering only worked well with the stdout buffering on glibc and *BSD libc, but not musl. When reading the stdout of git(1), we are likely to get smaller buffers and require more reads on musl-based systems (tested Alpine Linux 3.19.0). Thus we must prevent ->translate from being called with an empty argument list (denoting EOF). We'll also avoid some local variable assignments while at it and favor the non-OO ->zflush dispatch inside RepoAtom and WwwCoderepo subclasses.
2023-11-29 www: start working on a repo listing
The HTML is still extremely rough, but links seem to be mostly working...
2023-11-29 www: load and use cindex join data
This is a major step in solving the problem of having to manually associate hundreds/thousands of coderepos with hundreds/thousands of public-inboxes to power solver (and more).
2023-10-25 drop psgi_return, httpd/async and GetlineBody
Now that psgi_yield is used everywhere, the more complex psgi_return and its helper bits can be removed. We'll also fix some outdated comments now that everything on psgi_return has switched to psgi_yield. GetlineResponse replaces GetlineBody and does a better job of isolating generic PSGI-only code.
2023-10-25 www_coderepo: use psgi_yield
Yet another drop-in replacement for psgi_return.
2023-10-25 www_coderepo: capture uses a flattened list
We no longer need a multi-dimensional list to pass multiple arguments to the psgi_qx callback. This simplifies usage and reduces allocations.
2023-10-09 www_coderepo: fix handling of non-UTF-8 git data
We can't assume git output is UTF-8, and we'll always have legacy data in git coderepos. So attempt to display some garbled text rather than nothing at all if Perl croaks on it. sox commit c38987e8d20505621b8d872863afa7d233ed1096 (Added raw inverse-bit u-law and A-law support. Updated *.txt files., 2001-12-13) is an example of a commit which caused problems for me.
2023-09-16 www_coderepo: use space for snapshot_fmt prefix
The tab character causes inconsistent spacing on display, and a single space seems fine, here.
2023-04-18 www_coderepo: rescan cgit project-list for new coderepos
Coderepo changes are probably more common than inbox changes, so it probably makes sense to rescan and look for new coderepos on 404s, especially since we serve mirrored manifest.js.gz as-is. I noticed my git.kernel.org mirror was serving manifest.js.gz pointing to irretrievable repositories. This should stop that. We'll also drop the underscore ('_') and use `coderepo' everywhere to be consistent with our documentation. We may serve new inboxes in a similar way down the line, too; but this change only affects coderepos for now since we can guarantee the inbox manifest.js.gz never contains irretrievable inboxes as it's dynamically generated.
2023-04-12 www_coderepo: drop unused $EACH_REF variable
It's unused since commit cbe2548c91859dfb (www_coderepo: use OnDestroy to render summary view, 2023-04-09)
2023-04-10 www_coderepo: use OnDestroy to render summary view
This lets us get rid of a /bin/sh process and allows us to rely on Qspawn to parallelize git commands. Special treatment of the OnDestroy object is necessary to keep its scope limited for MockHTTP. Neither the generic `plackup' HTTP server nor our -httpd/-netd needed this scope limitation. As a result, summary() is now called inside an anonymous sub to keep the memory overhead of the anonymous sub itself as small as possible. Avoiding anonymous subs entirely would be preferable for memory savings, but it's necessary for PSGI.
2023-02-15 www_coderepo: handle unborn/dead branches in summary
We need to account for `git log' showing nothing for invalid branches and continue to render properly. We'll also quiet down `git log' stderr to avoid clutter, too.
2023-02-15 www_coderepo: quiet 404s on Atom feeds for dead branches
No need to clutter up logs when a request hits a dead branch.
2023-01-28 www_coderepo: summary: fix mis-linkification of `...'
We need to use the ternary operator in assignments to clobber previous values of `$last'.
2023-01-28 www_coderepo: support $REPO/refs/{heads,tags}/ endpoints
These are also in cgit, but we'll include CLI hints to show viewers how our data is generated. We don't have "$REPO/refs/" without (heads|tags) yet, though...
2023-01-28 www_coderepo: reduce utf8::decode calls
It's safe to call utf8::decode on data where "\0" exists.
2023-01-28 www_coderepo: fix snapshot link generation
Do not assume ".git" exists as a suffix in the repo nickname, and filter out all trailing slashes in case it didn't get filtered from Config.
2023-01-28 www_coderepo: support /$REPO/tags.atom endpoint
Providing an Atom feed for tags can be a nice way for users to subscribe to new releases without excessive noise.
2023-01-24 www_coderepo: remove some needless return statements
Maybe it makes control flow a little easier to rely on implicit return (IIRC, it's slightly faster, too).
2023-01-24 www_coderepo: eliminate debug log footer
WwwCoderepo is for viewing blobs already in code repositories, so there's no place for a debug log showing which mails were used to arrive at a given blob. The debug footer remains for /$INBOX/$OID/s/ URLs, of course.
2023-01-13 viewvcs: use git(1) for coderepo access
libgit2 development has fallen behind git.git and I've been using objectformat=sha256 somewhere else for over 18 months. Hoist out do_cat_async() into its own sub to hide generic PSGI vs -httpd differences while we're at it to save us some code.
2023-01-13 www_coderepo: tree: do not break #n$LINENO
We can't use 302 redirects at the /tree/ endpoint as originally intended since "#n$LINENO" fragment links aren't preserved across redirects (since clients don't typically send that part of the URL in requests). So we'll have to make sure we handle prefixes properly and show trees directly. Oh well :< At least the history-aware 404 handling remains :>
2023-01-13 www_coderepo: /tree/ redirects to /$OID/s/
This is for compatibility with cgit to ease migration.
2023-01-13 www_stream: coderepo-specific top bar
It gets nasty when multiple, non-ALL lists point to the same coderepo, but I guess ALL exists for that. Only lightly-tested with various PSGI prefix mounts, but it seems to be working...
2023-01-11 config: use inbox names to map inboxes <-> coderepos
We can avoid having to deal with weakening references and then later creating strong references in WwwCoderepo.
2023-01-11 www_coderepo: handle "?h=$tip" in summary view
This makes sense at least as far as the README and `git log' output goes. We'll also add the `b=' query parameter to the $OID/s/ href for the README blob.
2023-01-08 www_coderepo: do not copy {-code_repos} from config
Avoiding 2 extra hash lookups per-request when we do plenty more isn't worth the static memory overhead. This shaves another chunk off our memory use:

  $ perl -MDevel::Size=total_size -I lib -MPublicInbox::WwwCoderepo -E \
    'say total_size(PublicInbox::WwwCoderepo->new(PublicInbox::Config->new))'

  before: 1184385
   after: 1020878
2023-01-04 www_coderepo: implement /$CODE_REPO/atom/ endpoint
This should be similar or identical to what's in cgit; and tie into the rest of the www_coderepo stuff.
2023-01-01 www: load cgitrc for coderepos for solver
Loading cgitrc (and associated projects.list) can save users from defining as many individual coderepos. xt/solver.t needs a use of `$_' replaced since that gets clobbered while parsing cgitrc.
2022-10-09 www_coderepo: allow searching one extindex|inbox
I'm not sure how to best make a UI for one coderepo to many inboxes/extindices, yet; but at least allow a simple 1:1 mapping, for now. This ensures /$CODEREPO/$OID/s/ can work as effectively as /$INBOX/$OID/s/ when looking for emails associated with a git commit.
2022-10-09 www_coderepo: update blurb on the goal/purpose of this
I think putting too much functionality in web services leads to ignorance of local/offline tools, so this web UI will give hints here and there for web users. Things like diff options can get expensive and become cache-unfriendly on the web server, so promoting local tools can reduce overall network traffic and server load.
2022-10-09 www_coderepo: wire up snapshots from summary
This also ensures we won't waste CPU cycles on snapshots which aren't configured if somebody attempts them by guessing URLs.
2022-10-09 config: remove {-cgitrc_unparsed} field
This field has been unneeded since commit 6890430df808 (cgit: fix fallout from lazy coderepo loading, 2021-03-18)
2022-10-05 www_coderepo: start a top nav bar in summary view
This needs to be expanded, but quick links to heads/tags/README shouldn't hurt...
2022-10-05 www_coderepo: wire up snapshot support
These should be compatible with cgit results.
2022-10-05 www_coderepo: wire up /$CODEREPO/$OID/s/ endpoint
Just reusing ViewVCS::show, since encoding refname and pathnames into things just makes things slower.
2022-10-05 www_coderepo: an alternative to cgit
This will allow it to easily map a single coderepo to multiple inboxes (or multiple coderepos to any number of inboxes). For now, this is just a summary, but $REPO/$OID/s/ support will be added, along with archive downloads. Indexing of coderepos will probably be supported via -extindex, only.