msgthr user+dev discussion/patches/pulls/bugs/help
* [ANN] msgthr 1.2.0 - container-agnostic, non-recursive message threading
From: Eric Wong @ 2018-01-25 23:08 UTC (permalink / raw)
  To: ruby-talk, msgthr-public; +Cc: misc, Dimid Duchovny

Pure Ruby message threading based on the algorithm described by
JWZ in <https://www.jwz.org/doc/threading.html> and used in
countless mail and news readers; but with some features removed
and improved flexibility for non-mail/news usage.

* https://80x24.org/msgthr/README
* API: https://80x24.org/msgthr/rdoc/Msgthr.html
* public list: msgthr-public@80x24.org
* mail archives: https://80x24.org/msgthr-public/
* git clone https://80x24.org/msgthr.git
* follow releases: https://80x24.org/msgthr/NEWS.atom.xml
* follow all: https://80x24.org/msgthr-public/new.atom
* nntp://news.public-inbox.org/inbox.comp.lang.ruby.msgthr

Changes: Msgthr#add callback support

    This release adds callback support to the Msgthr#add method,
    allowing callers to track progress and potentially group
    messages.  Thanks to Dimid Duchovny for this feature.
    Discussion about it begins here:

      https://80x24.org/msgthr-public/CANKvuDf7esPfy3eQ0B8aQjg4sTYTcxR_LNNWeDBcENFwmyC_3g@mail.gmail.com/t/

    4 changes from Dimid Duchovny:
          add callback to Msgthr#add
          test: add a more complex test for add_child callback
          test: fix add_child callback test
          doc: document block parameter of Msgthr#add
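
A minimal sketch of how a caller might use such a block to record
parent/child links as they are made.  To stay self-contained it does not
load msgthr; MiniThreader and Node are hypothetical stand-ins that only
mimic the two yield points shown in the patch, and they simplify
everything else.

```ruby
# Stand-in for illustration only: Node and MiniThreader are hypothetical
# names, not msgthr classes.  Only the yield points follow the patch.
Node = Struct.new(:mid, :msg, :parent, :children) do
  # attach child under self, detaching it from any previous parent
  def add_child(child)
    old = child.parent
    old.children.delete_if { |c| c.equal?(child) } if old
    child.parent = self
    children << child
  end

  # true if self is +other+ or one of its ancestors
  def ancestor_of?(other)
    while other
      return true if equal?(other)
      other = other.parent
    end
    false
  end
end

class MiniThreader
  def initialize
    @table = {} # message id => Node
  end

  # Yields (parent, child) each time a link is made, if a block is given.
  def add(mid, refs, msg)
    cur = (@table[mid] ||= Node.new(mid, nil, nil, []))
    cur.msg ||= msg
    prev = nil
    (refs || []).each do |ref|
      cont = (@table[ref] ||= Node.new(ref, nil, nil, []))
      # link refs in order, without changing existing links or looping
      if prev && !cont.parent && !cont.ancestor_of?(prev)
        prev.add_child(cont)
        yield(prev, cont) if block_given?
      end
      prev = cont
    end
    return unless prev

    # the last reference becomes the parent of this message
    prev.add_child(cur)
    yield(prev, cur) if block_given?
  end
end

edges = []
record = ->(parent, child) { edges << [parent.mid, child.mid] }
thr = MiniThreader.new
thr.add('c@x', %w[a@x b@x], 'third', &record) # a reply arrives first
thr.add('a@x', nil, 'first', &record)         # a root makes no link
```

The block fires once per link, so a caller could use it for progress
counters or to build its own grouping without walking the tree afterwards.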
-- 
https://80x24.org/msgthr/README


* Re: Feature Request: thread grouping
From: Dimid Duchovny @ 2018-01-24 10:28 UTC (permalink / raw)
  To: Eric Wong; +Cc: msgthr-public

2018-01-24 0:03 GMT+02:00 Eric Wong <e@80x24.org>:
> Dimid Duchovny <dimidd@gmail.com> wrote:
>> > You're right. In my case the flow was: read emails from storage ->
>> > group to threads -> add thread field to storage.
>> > However, I guess it's an edge-case.
>> > On second thought, maybe it'd be better to have a more general solution.
>> > E.g. let the client run an arbitrary callback after adding a child.
>
> OK, I guess you managed to fit skeletons of all your messages in memory?
>
>> > Here's a quick POC:
>> > https://github.com/dimidd/msgthr/commit/1c701717d10879d492d8b55fb8ca2f1c53d7e13f
>
> (truncated output of "git show 1c701717d10879d492d8b55fb8ca2f1c53d7e13f")
>
>>     add callback to Msgthr#add
>>
>>     The motivation is to allow the client to have a custom code executed,
>>         whenever a child is added.
>>
>> --- a/lib/msgthr.rb
>> +++ b/lib/msgthr.rb
>> @@ -166,12 +166,16 @@ class Msgthr
>>        # but do not change existing links or loop
>>        if prev && !cont.parent && !cont.has_descendent(prev)
>>          prev.add_child(cont)
>> +        yield(prev, cont) if block_given?
>>        end
>>        prev = cont
>>      end
>>
>>      # set parent of this message to be the last element in refs
>> -    prev.add_child(cur) if prev
>> +    if prev
>> +      prev.add_child(cur)
>> +      yield(prev, cur) if block_given?
>> +    end
>>    end
>>  end
>
> OK, that seems generic enough and we can probably support it
> long-term, so I'm somewhat inclined to accept it...
>
> However, APIs encouraging/supporting folks to load their entire
> collection(*) of messages (even skeletons) into memory feels
> wrong to me.
>
> Can you come up with a use case where this is useful for
> a subset of messages?
>

Well, in my specific case there weren't many messages, so memory
wasn't an issue.
In general, I think the question of adding the add_child callback is
orthogonal to the question of using the entire collection or only
part of it.  I.e. one could use Msgthr as it is with millions of
emails, and one could use the callback with only a few messages.
Consider this flow:
1. query the storage backend according to some criteria (e.g. a
   time range, a particular sender, etc.)
2. group the messages in the response into threads

I'd rather show than tell, so here's a more elaborate example:
https://github.com/dimidd/msgthr/commit/3e38a4910e7a3c17c07f47c4f1b9d556a4a951fd.patch

BTW, note how we only needed one pointer per message and one string
*per thread*, by using an array with a single element and saving the
actual message only in the top level (the rootset).
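
The single-element array idea can be sketched roughly as follows.  This
is not the code from the linked patch: ThreadTags, link and thread_of are
hypothetical names, and the handler is fed plain message ids, where with
msgthr they would presumably come from the (parent, child) pair passed to
the Msgthr#add block.  It assumes each child has no parent yet when it is
linked.  Letting one slot point at another when two existing groups are
joined is an addition made here to keep labels correct.

```ruby
# Sketch only: every message id maps to a shared one-element Array
# ("slot"), so relabelling a thread is a single in-place write.
# ThreadTags is a hypothetical helper, not part of msgthr.
class ThreadTags
  def initialize
    @slots = {} # message id => slot; a slot is [label] or [another slot]
  end

  # Call with the ids of each (parent, child) pair as it is linked.
  # Assumes the child had no parent before this link.
  def link(parent, child)
    pslot = @slots[parent]
    cslot = @slots[child]
    if pslot.nil? && cslot.nil?
      # first link for both: start a new thread labelled by the parent
      @slots[parent] = @slots[child] = [parent]
    elsif cslot.nil?
      # child joins the parent's thread: one more pointer, no new string
      @slots[child] = pslot
    elsif pslot.nil?
      # the child's thread gains a new root: relabel it in place
      root = resolve(cslot)
      root[0] = parent
      @slots[parent] = root
    else
      # both already grouped: point the child's slot at the parent's
      proot = resolve(pslot)
      croot = resolve(cslot)
      croot[0] = proot unless croot.equal?(proot)
    end
  end

  # Thread label for a message id; an unlinked id is its own thread.
  def thread_of(mid)
    slot = @slots[mid]
    slot ? resolve(slot)[0] : mid
  end

  private

  # follow slot-to-slot pointers until a slot holding a label is found
  def resolve(slot)
    slot = slot[0] while slot[0].is_a?(Array)
    slot
  end
end

tags = ThreadTags.new
# Links arrive out of order, as they might when replies are added first.
[%w[b c], %w[a b], %w[x y], %w[c x]].each do |parent, child|
  tags.link(parent, child)
end
tags.thread_of('y') # resolves to the same label as 'a', 'b', 'c' and 'x'
```

After step 2 of the flow above, thread_of could supply the value written
back to a thread field in storage, for whatever subset of messages the
query returned.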


>
> (*) I work with millions of emails
>
>> > P.S. I hope you don't mind I uploaded my fork to github.
>
> That's fine, I just add a new remote(*) to my .git/config, fetch
> and show.
>
> What I won't accept about GitHub is having it as a centralized
> and proprietary messaging system which forces participants to
> accept their ToS.  I can't accept that; no single entity
> controls email, so that's what I stick with.
>
>
> (*) added this to my .git/config
> ==> .git/config <==
> [remote "dimidd"]
>         url = https://github.com/dimidd/msgthr
>         fetch = refs/heads/*:refs/remotes/dimidd/*


* Re: Feature Request: thread grouping
From: Eric Wong @ 2018-01-23 22:03 UTC (permalink / raw)
  To: Dimid Duchovny; +Cc: msgthr-public

Dimid Duchovny <dimidd@gmail.com> wrote:
> > You're right. In my case the flow was: read emails from storage ->
> > group to threads -> add thread field to storage.
> > However, I guess it's an edge-case.
> > On second thought, maybe it'd be better to have a more general solution.
> > E.g. let the client run an arbitrary callback after adding a child.

OK, I guess you managed to fit skeletons of all your messages in memory?

> > Here's a quick POC:
> > https://github.com/dimidd/msgthr/commit/1c701717d10879d492d8b55fb8ca2f1c53d7e13f

(truncated output of "git show 1c701717d10879d492d8b55fb8ca2f1c53d7e13f")

>     add callback to Msgthr#add
>     
>     The motivation is to allow the client to have a custom code executed,
>         whenever a child is added.
> 
> --- a/lib/msgthr.rb
> +++ b/lib/msgthr.rb
> @@ -166,12 +166,16 @@ class Msgthr
>        # but do not change existing links or loop
>        if prev && !cont.parent && !cont.has_descendent(prev)
>          prev.add_child(cont)
> +        yield(prev, cont) if block_given?
>        end
>        prev = cont
>      end
>  
>      # set parent of this message to be the last element in refs
> -    prev.add_child(cur) if prev
> +    if prev
> +      prev.add_child(cur)
> +      yield(prev, cur) if block_given?
> +    end
>    end
>  end

OK, that seems generic enough and we can probably support it
long-term, so I'm somewhat inclined to accept it...

However, APIs encouraging/supporting folks to load their entire
collection(*) of messages (even skeletons) into memory feels
wrong to me.

Can you come up with a use case where this is useful for
a subset of messages?


(*) I work with millions of emails

> > P.S. I hope you don't mind I uploaded my fork to github.

That's fine, I just add a new remote(*) to my .git/config, fetch
and show.

What I won't accept about GitHub is having it as a centralized
and proprietary messaging system which forces participants to
accept their ToS.  I can't accept that; no single entity
controls email, so that's what I stick with.


(*) added this to my .git/config
==> .git/config <==
[remote "dimidd"]
	url = https://github.com/dimidd/msgthr
	fetch = refs/heads/*:refs/remotes/dimidd/*


2018-01-21  9:40     Feature Request: thread grouping Dimid Duchovny
2018-01-21 23:49     ` Eric Wong
2018-01-23 21:04       ` Dimid Duchovny
2018-01-23 21:12         ` Dimid Duchovny
2018-01-23 22:03  0%       ` Eric Wong
2018-01-24 10:28  0%         ` Dimid Duchovny
2018-01-25 23:08  7% [ANN] msgthr 1.2.0 - container-agnostic, non-recursive message threading Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/msgthr.git/
