msgthr user+dev discussion/patches/pulls/bugs/help
From: Dimid Duchovny <dimidd@gmail.com>
To: Eric Wong <e@80x24.org>
Cc: msgthr-public <msgthr-public@80x24.org>
Subject: Re: Feature Request: thread grouping
Date: Wed, 24 Jan 2018 12:28:34 +0200
Message-ID: <CANKvuDeBhLXMwJeP1gD=7myXH=7_WNJmmXTAg0HMR1=qzFruGw@mail.gmail.com>
In-Reply-To: <20180123220303.GA7222@80x24.org>

2018-01-24 0:03 GMT+02:00 Eric Wong <e@80x24.org>:
> Dimid Duchovny <dimidd@gmail.com> wrote:
>> > You're right. In my case the flow was: read emails from storage ->
>> > group to threads -> add thread field to storage.
>> > However, I guess it's an edge-case.
>> > On second thought, maybe it'd be better to have a more general solution.
>> > E.g. let the client run an arbitrary callback after adding a child.
>
> OK, I guess you managed to fit skeletons of all your messages in memory?
>
>> > Here's a quick POC:
>> > https://github.com/dimidd/msgthr/commit/1c701717d10879d492d8b55fb8ca2f1c53d7e13f
>
> (truncated output of "git show 1c701717d10879d492d8b55fb8ca2f1c53d7e13f")
>
>>     add callback to Msgthr#add
>>
>>     The motivation is to allow the client to have a custom code executed,
>>         whenever a child is added.
>>
>> --- a/lib/msgthr.rb
>> +++ b/lib/msgthr.rb
>> @@ -166,12 +166,16 @@ class Msgthr
>>        # but do not change existing links or loop
>>        if prev && !cont.parent && !cont.has_descendent(prev)
>>          prev.add_child(cont)
>> +        yield(prev, cont) if block_given?
>>        end
>>        prev = cont
>>      end
>>
>>      # set parent of this message to be the last element in refs
>> -    prev.add_child(cur) if prev
>> +    if prev
>> +      prev.add_child(cur)
>> +      yield(prev, cur) if block_given?
>> +    end
>>    end
>>  end
>
> OK, that seems generic enough and we can probably support it
> long-term, so I'm somewhat inclined to accept it...
>
> However, APIs encouraging/supporting folks to load their entire
> collection(*) of messages (even skeletons) into memory feels
> wrong to me.
>
> Can you come up with a use case where this is useful for
> a subset of messages?
>

Well, in my specific case there weren't many messages, so memory
wasn't an issue.
In general, I think the question of adding the add_child callback is
orthogonal to the question of loading the entire collection or only
part of it.
I.e. one could use Msgthr as it is, with millions of emails, and one
could use the callback with only a few messages.
Consider this flow:
1. query the storage backend according to some criteria (e.g. a
   time range, a particular sender, etc.)
2. group the messages in the response into threads
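
A minimal sketch of the two-step flow above, under stated assumptions.
TinyThreader and Node are hypothetical stand-ins written for this
example; they are not the Msgthr API and only imitate the "yield
(parent, child) whenever a link is made" behaviour of the proposed
patch. The subset array stands in for whatever the storage query
returns.

```ruby
# Hypothetical stand-in for a threader; NOT the Msgthr API.
Node = Struct.new(:mid, :parent, :children)

class TinyThreader
  def initialize
    @nodes = {}
  end

  # Walk the references followed by the message itself, linking each
  # node under the previous one and yielding (parent, child) per link.
  # Existing links are never changed and loops are never created.
  def add(mid, refs)
    prev = nil
    (refs + [mid]).each do |id|
      node = (@nodes[id] ||= Node.new(id, nil, []))
      if prev && node.parent.nil? && !descendant?(node, prev)
        node.parent = prev
        prev.children << node
        yield(prev, node) if block_given?
      end
      prev = node
    end
  end

  private

  # true if +other+ is +node+ itself or sits anywhere below +node+
  def descendant?(node, other)
    node.equal?(other) || node.children.any? { |c| descendant?(c, other) }
  end
end

# Step 1: pretend result of a storage query (e.g. one sender, one week).
subset = [
  { mid: 'a@x', refs: [] },
  { mid: 'b@x', refs: ['a@x'] },
  { mid: 'c@x', refs: ['a@x', 'b@x'] },
  { mid: 'd@x', refs: [] },
]

# Step 2: group the subset into threads; the block sees every new link.
links = []
threader = TinyThreader.new
subset.each do |m|
  threader.add(m[:mid], m[:refs]) do |parent, child|
    links << [parent.mid, child.mid]
  end
end
# links is now [["a@x", "b@x"], ["b@x", "c@x"]]
```

Only the four queried skeletons are ever held in memory here, so the
callback itself does not require loading the whole collection.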

I'd rather show than tell, so here's a more elaborate example:
https://github.com/dimidd/msgthr/commit/3e38a4910e7a3c17c07f47c4f1b9d556a4a951fd.patch

BTW, note how we only needed one pointer per message and one string
*per thread*, by using a single-element array and saving the actual
message only at the top level (the rootset).
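
A rough sketch of the single-element-array idea, not a copy of the
linked patch. Skel and share_thread are hypothetical names made up for
this example. It assumes a parent is linked before its own children;
links arriving in another order would need extra handling.

```ruby
# Hypothetical message skeleton: :thread points at a one-element Array
# ("box") whose single slot holds the thread's identifying string.
Skel = Struct.new(:mid, :thread)

# What a per-link callback could do: the child drops its own box and
# shares the parent's, so each message costs one pointer.
def share_thread(parent, child)
  child.thread = parent.thread
end

a = Skel.new('a@x', ['a@x']) # root of the thread
b = Skel.new('b@x', ['b@x'])
c = Skel.new('c@x', ['c@x'])

share_thread(a, b) # a is the parent of b
share_thread(b, c) # b is the parent of c

# All three messages now reference the same box, so there is one
# string per thread, and relabelling the thread is a single write.
a.thread[0] = 'thread-1'
boxes = [a, b, c].map { |m| m.thread.object_id }.uniq.size
labels = [a, b, c].map { |m| m.thread[0] }
```

The array is needed because reassigning a plain string on the root
would not be visible through the children, while writing into a shared
box is.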


>
> (*) I work with millions of emails
>
>> > P.S. I hope you don't mind I uploaded my fork to github.
>
> That's fine, I just add a new remote(*) to my .git/config, fetch
> and show.
>
> What I won't accept about GitHub is having it as a centralized
> and proprietary messaging system which forces participants to
> accept their ToS.  I can't accept that; no single entity
> controls email, so that's what I stick with.
>
>
> (*) added this to my .git/config
> ==> .git/config <==
> [remote "dimidd"]
>         url = https://github.com/dimidd/msgthr
>         fetch = refs/heads/*:refs/remotes/dimidd/*



