From: Eric Wong <email@example.com>
To: Dimid Duchovny <firstname.lastname@example.org>
Cc: email@example.com
Subject: Re: Feature Request: thread grouping
Date: Sun, 21 Jan 2018 23:49:11 +0000
Message-ID: <20180121234911.GA29238@whir>
In-Reply-To: <CANKvuDf7esPfy3eQ0B8aQjg4sTYTcxR_LNNWeDBcENFwmyC_3g@mail.gmail.com>

Dimid Duchovny <firstname.lastname@example.org> wrote:
> However, I realized that the last step (walking) is redundant,
> since that could be done by the library itself in the threading or
> ordering stages.

I think what you want is best done in the storage/indexing stage,
whereas msgthr is intended for displaying/rendering results that
were retrieved from some sort of search engine.

At least that's how notmuch does it, and I stole the logic for
public-inbox(*) as they both use Xapian. I think mairix does
something similar, too; but it's been a while...

> E.g. keeping track of each container's thread,
> and when adding a message A as a child of message B, to point A's
> thread to B's one.
> We could use an array with a single element,
> or some other solution to have pass-by-reference semantics.
> Finally, all top-level containers should have their own msg_id as the thread,
> and all their descendants will point to it as well.

One advantage of doing this in the storage phase is that the info
is persistent and you don't need to calculate it every time.
This is great when you're dealing with more message skeletons
than can fit in memory. git@vger has over 300k messages, LKML
will have several million messages, and they both use String
Message-IDs (being email), so it'll be many hundreds of MB just
in containers and Message-IDs.

Another huge advantage of doing this in the indexing phase is
that you can search for something in a single message and then
easily pull every message from the thread it belongs to with a
boolean thread_id search. I also find the "-t" switch of mairix
useful for my private mail.
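A minimal sketch of the index-time idea described above, in Ruby.
The ThreadIndex class and its methods are hypothetical names made
up for illustration; they are not part of msgthr, notmuch, or
public-inbox, and the two Hashes merely stand in for a persistent
index such as Xapian. It also assumes a parent is always indexed
before its replies; a real indexer would have to merge threads
when messages arrive out of order.

```ruby
# Hypothetical in-memory stand-in for a persistent index.
# Nothing here is msgthr, notmuch, or public-inbox API.
class ThreadIndex
  def initialize
    @thread_of = {}                           # Message-ID => thread_id
    @members = Hash.new { |h, k| h[k] = [] }  # thread_id => [Message-ID, ...]
  end

  # Index one message. parent_id is the Message-ID taken from
  # In-Reply-To, or nil for a top-level message.
  # Assumption: parents are indexed before their replies.
  def add(msg_id, parent_id = nil)
    # Inherit the parent's thread_id if the parent is known;
    # otherwise the message starts a thread named after itself.
    tid = @thread_of.fetch(parent_id, msg_id)
    @thread_of[msg_id] = tid
    @members[tid] << msg_id
    tid
  end

  def thread_id(msg_id)
    @thread_of[msg_id]
  end

  # Every message sharing a thread with msg_id, in insertion order.
  def thread_for(msg_id)
    tid = @thread_of[msg_id]
    tid ? @members[tid].dup : []
  end
end

idx = ThreadIndex.new
idx.add('a@example')
idx.add('b@example', 'a@example')
idx.add('c@example', 'b@example')
idx.add('d@example')

# A hit on a single message is enough to pull its whole thread.
p idx.thread_for('c@example') # => ["a@example", "b@example", "c@example"]
```

Because the thread_id is stored with each message when it is
indexed, looking up a thread does not require rebuilding the
container tree in memory.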

I can help you understand how public-inbox does this in
SearchIdx.pm (indexer) and Search.pm (read-only queries) if
you're not familiar with Perl5, but for now you can grab the
code and try understanding it on your own:

  git clone https://public-inbox.org/public-inbox

  http://repo.or.cz/public-inbox.git/blob/4f2f0eb94739edf:/lib/PublicInbox/SearchIdx.pm
  http://repo.or.cz/public-inbox.git/blob/4f2f0eb94739edf:/lib/PublicInbox/Search.pm

I'll be happy to answer questions on email@example.com about it :)

> Would you consider adding such a feature? If so, I'll be happy to work
> out the details and submit a patch.

I'm not sure if it makes sense to add this without a stable
storage backend (Xapian or some other search indexer/DB).

Another potential problem with adding this to msgthr is that
msgthr is GPL-2+ (since it's a port of Mail::Thread from CPAN),
but the notmuch algorithm is GPL-3+, so I'm not allowed to put
it into a GPL-2+ project (AGPL-3+ is OK). Maybe you can cite
prior art from mairix (GPL-2+), but I haven't looked at that
code in many years and don't remember it.

Thread overview: 12+ messages
2018-01-21  9:40 Dimid Duchovny
2018-01-21 23:49 ` Eric Wong [this message]
2018-01-23 21:04 ` Dimid Duchovny
2018-01-23 21:12 ` Dimid Duchovny
2018-01-23 22:03 ` Eric Wong
2018-01-24 10:28 ` Dimid Duchovny
2018-01-24 19:18 ` Eric Wong
2018-01-24 21:14 ` Dimid Duchovny
2018-01-24 22:49 ` Eric Wong
2018-01-25  8:16 ` Dimid Duchovny
2018-01-25  8:38 ` Eric Wong
2018-02-08 13:06 ` Dimid Duchovny