* [PATCH 0/2] Better id: link buttonization @ 2012-10-31 2:29 Austin Clements 2012-10-31 2:29 ` [PATCH 1/2] test: Test buttonization of id: links Austin Clements 2012-10-31 2:29 ` [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages Austin Clements 0 siblings, 2 replies; 6+ messages in thread From: Austin Clements @ 2012-10-31 2:29 UTC (permalink / raw) To: notmuch Jani pointed out to me that the new id stashing with less quoting is more likely to result in emails containing id: links that get buttonized wrong. Since id: links without quotes are hip, this series fixes the buttonization regexp to match Xapian query syntax, rather than being ad hoc. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] test: Test buttonization of id: links 2012-10-31 2:29 [PATCH 0/2] Better id: link buttonization Austin Clements @ 2012-10-31 2:29 ` Austin Clements 2012-10-31 2:29 ` [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages Austin Clements 1 sibling, 0 replies; 6+ messages in thread From: Austin Clements @ 2012-10-31 2:29 UTC (permalink / raw) To: notmuch This matches the current behavior of the buttonizer, so it passes, but many of these cases are not what you'd want (and some of them aren't even valid Xapian queries). The next patch will fix the handling of these cases and update the test. --- test/emacs-show | 40 ++++++++++++++++++++++++++++++++++++++++ test/test-lib.el | 14 ++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/test/emacs-show b/test/emacs-show index 64c38d3..c4f3cf4 100755 --- a/test/emacs-show +++ b/test/emacs-show @@ -106,5 +106,45 @@ test_emacs '(notmuch-search "from:lars@seas.harvard.edu and subject:\"Maildir st (test-visible-output)' test_expect_equal_file OUTPUT $EXPECTED/notmuch-show-indent-thread-content-off +test_begin_subtest "id buttonization" +add_message '[body]=" +id:abc +id:abc.def. id:abc,def, id:abc;def; +id:abc)def +id:ab\"c def +id:\"abc\"def +id:\"ab\"\"c\"def +id:\"ab c\"def +id:\"abc\".def +id:\"abc +\" +id:) +id: +cid:xxx"' +test_emacs '(notmuch-show "id:'$gen_msg_id'") + (notmuch-test-mark-links) + (test-visible-output)' +cat <<EOF >EXPECTED +Notmuch Test Suite <test_suite@notmuchmail.org> (2001-01-05) (inbox) +Subject: id buttonization +To: Notmuch Test Suite <test_suite@notmuchmail.org> +Date: Fri, 05 Jan 2001 15:43:57 +0000 + +<<id:abc>> +<<id:abc.def.>> <<id:abc,def,>> <<id:abc;def;>> +<<id:abc)def>> +<<id:ab>>"c def +<<id:"abc">>def +<<id:"ab">>"c"def +id:"ab c"def +<<id:"abc">>.def +id:"abc +" +<<id:)>> +id: +c<<id:xxx>> +EOF +test_expect_equal_file OUTPUT EXPECTED + test_done diff --git a/test/test-lib.el b/test/test-lib.el index fa3380c..dece811 100644 --- a/test/test-lib.el +++ b/test/test-lib.el @@ -107,6 +107,20 @@ nothing." (ad-set-arg 1 (char-to-string char)) ad-do-it)))) +(defun notmuch-test-mark-links () + "Enclose links in the current buffer with << and >>." + ;; Links are often created by jit-lock functions + (jit-lock-fontify-now) + (save-excursion + (let ((inhibit-read-only t)) + (goto-char (point-min)) + (let ((button)) + (while (setq button (next-button (point))) + (goto-char (button-start button)) + (insert "<<") + (goto-char (button-end button)) + (insert ">>")))))) + (defmacro notmuch-test-run (&rest body) "Evaluate a BODY of test expressions and output the result." `(with-temp-buffer -- 1.7.10 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages 2012-10-31 2:29 [PATCH 0/2] Better id: link buttonization Austin Clements 2012-10-31 2:29 ` [PATCH 1/2] test: Test buttonization of id: links Austin Clements @ 2012-10-31 2:29 ` Austin Clements 2012-10-31 10:00 ` Jani Nikula 2012-11-01 14:01 ` Sascha Silbe 1 sibling, 2 replies; 6+ messages in thread From: Austin Clements @ 2012-10-31 2:29 UTC (permalink / raw) To: notmuch This regexp agrees with Xapian query syntax much more closely, though we specifically disallow various cases that would be confusing in the context of an email body (e.g., punctuation at the end of an id: link is not considered part of the id: link because it's probably part of the surrounding text). In particular, this handles id: links that are not surrounded by quotes much better, which stash is much more likely to generate now that we don't quote id's that don't need to be quoted. It also handles quoted id: links better. We update the buttonization test to reflect the new pattern. --- emacs/notmuch-show.el | 20 +++++++++++++++++++- test/emacs-show | 14 +++++++------- 2 files changed, 26 insertions(+), 8 deletions(-) diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el index f273eb4..e96e099 100644 --- a/emacs/notmuch-show.el +++ b/emacs/notmuch-show.el @@ -991,6 +991,24 @@ message at DEPTH in the current thread." "Insert the forest of threads FOREST." (mapc (lambda (thread) (notmuch-show-insert-thread thread 0)) forest)) +(defvar notmuch-id-regexp + (concat + ;; Match the id: prefix only if it begins a word (to disallow, for + ;; example, matching cid:). + "\\<id:\\(" + ;; If the term starts with a ", then parse Xapian's quoted boolean + ;; term syntax, which allows for anything as long as embedded + ;; double quotes escaped by doubling them. We also disallow + ;; newlines (which Xapian allows) to prevent runaway terms. + "\"\\([^\"\n]\\|\"\"\\)*\"" + ;; Otherwise, parse Xapian's unquoted syntax, which goes up to the + ;; next space or ). We disallow [.,;] as the last character + ;; because these are probably part of the surrounding text, and not + ;; part of the id. This doesn't match single character ids; meh. + "\\|[^\"[:space:])][^[:space:])]*[^[:space:]).,;]" + "\\)") + "The regexp used to match id: links in messages.") + (defun notmuch-show-buttonise-links (start end) "Buttonise URLs and mail addresses between START and END. @@ -999,7 +1017,7 @@ a corresponding notmuch search." (goto-address-fontify-region start end) (save-excursion (goto-char start) - (while (re-search-forward "id:\\(\"?\\)[^[:space:]\"]+\\1" end t) + (while (re-search-forward notmuch-id-regexp end t) ;; remove the overlay created by goto-address-mode (remove-overlays (match-beginning 0) (match-end 0) 'goto-address t) (make-text-button (match-beginning 0) (match-end 0) diff --git a/test/emacs-show b/test/emacs-show index c4f3cf4..e16483c 100755 --- a/test/emacs-show +++ b/test/emacs-show @@ -131,18 +131,18 @@ To: Notmuch Test Suite <test_suite@notmuchmail.org> Date: Fri, 05 Jan 2001 15:43:57 +0000 <<id:abc>> -<<id:abc.def.>> <<id:abc,def,>> <<id:abc;def;>> -<<id:abc)def>> -<<id:ab>>"c def +<<id:abc.def>>. <<id:abc,def>>, <<id:abc;def>>; +<<id:abc>>)def +<<id:ab"c>> def <<id:"abc">>def -<<id:"ab">>"c"def -id:"ab c"def +<<id:"ab""c">>def +<<id:"ab c">>def <<id:"abc">>.def id:"abc " -<<id:)>> +id:) id: -c<<id:xxx>> +cid:xxx EOF test_expect_equal_file OUTPUT EXPECTED -- 1.7.10 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages 2012-10-31 2:29 ` [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages Austin Clements @ 2012-10-31 10:00 ` Jani Nikula 2012-11-01 14:01 ` Sascha Silbe 1 sibling, 0 replies; 6+ messages in thread From: Jani Nikula @ 2012-10-31 10:00 UTC (permalink / raw) To: Austin Clements, notmuch On Wed, 31 Oct 2012, Austin Clements <amdragon@MIT.EDU> wrote: > This regexp agrees with Xapian query syntax much more closely, though > we specifically disallow various cases that would be confusing in the > context of an email body (e.g., punctuation at the end of an id: link > is not considered part of the id: link because it's probably part of > the surrounding text). > > In particular, this handles id: links that are not surrounded by > quotes much better, which stash is much more likely to generate now > that we don't quote id's that don't need to be quoted. It also > handles quoted id: links better. > > We update the buttonization test to reflect the new pattern. Hi Austin, all of this looks good as-is, but I propose the changes below on top. (With the relevant comment changed too.) BR, Jani. diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el index e96e099..117eb0e 100644 --- a/emacs/notmuch-show.el +++ b/emacs/notmuch-show.el @@ -1005,7 +1005,7 @@ message at DEPTH in the current thread." ;; next space or ). We disallow [.,;] as the last character ;; because these are probably part of the surrounding text, and not ;; part of the id. This doesn't match single character ids; meh. - "\\|[^\"[:space:])][^[:space:])]*[^[:space:]).,;]" + "\\|[^\"[:space:])][^[:space:])]*[^])[:space:].,:;?!]" "\\)") "The regexp used to match id: links in messages.") diff --git a/test/emacs-show b/test/emacs-show index e16483c..e2d7c70 100755 --- a/test/emacs-show +++ b/test/emacs-show @@ -109,7 +109,12 @@ test_expect_equal_file OUTPUT $EXPECTED/notmuch-show-indent-thread-content-off test_begin_subtest "id buttonization" add_message '[body]=" id:abc -id:abc.def. id:abc,def, id:abc;def; +id:abc.def. id:abc,def, id:abc;def; id:abc:def: +id:foo@bar.?baz? id:foo@bar!.baz! +(id:foo@bar.baz) [id:foo@bar.baz] +id:foo@bar.baz... +id:2+2=5 +id:=_-:/.[]@$%+ id:abc)def id:ab\"c def id:\"abc\"def @@ -131,7 +136,12 @@ To: Notmuch Test Suite <test_suite@notmuchmail.org> Date: Fri, 05 Jan 2001 15:43:57 +0000 <<id:abc>> -<<id:abc.def>>. <<id:abc,def>>, <<id:abc;def>>; +<<id:abc.def>>. <<id:abc,def>>, <<id:abc;def>>; <<id:abc:def>>: +<<id:foo@bar.?baz>>? <<id:foo@bar!.baz>>! +(<<id:foo@bar.baz>>) [<<id:foo@bar.baz>>] +<<id:foo@bar.baz>>... +<<id:2+2=5>> +<<id:=_-:/.[]@$%+>> <<id:abc>>)def <<id:ab"c>> def <<id:"abc">>def ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages 2012-10-31 2:29 ` [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages Austin Clements 2012-10-31 10:00 ` Jani Nikula @ 2012-11-01 14:01 ` Sascha Silbe 2012-11-02 19:27 ` Jameson Graef Rollins 1 sibling, 1 reply; 6+ messages in thread From: Sascha Silbe @ 2012-11-01 14:01 UTC (permalink / raw) To: Austin Clements, notmuch [-- Attachment #1: Type: text/plain, Size: 559 bytes --] Austin Clements <amdragon@MIT.EDU> writes: > +(defvar notmuch-id-regexp > + (concat > + ;; Match the id: prefix only if it begins a word (to disallow, for > + ;; example, matching cid:). > + "\\<id:\\(" [...] Since you're already at it, please extend it to accept RFC2392 [1] syntax for message URLs: mid-url = "mid" ":" message-id [ "/" content-id ] The id: syntax is non-standard (i.e. specific to notmuch). Sascha [1] http://www.ietf.org/rfc/rfc2392.txt -- http://sascha.silbe.org/ http://www.infra-silbe.de/ [-- Attachment #2: Type: application/pgp-signature, Size: 489 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages 2012-11-01 14:01 ` Sascha Silbe @ 2012-11-02 19:27 ` Jameson Graef Rollins 0 siblings, 0 replies; 6+ messages in thread From: Jameson Graef Rollins @ 2012-11-02 19:27 UTC (permalink / raw) To: Sascha Silbe, Austin Clements, notmuch [-- Attachment #1: Type: text/plain, Size: 625 bytes --] On Thu, Nov 01 2012, Sascha Silbe <sascha-ml-reply-to-2012-4@silbe.org> wrote: > Since you're already at it, please extend it to accept RFC2392 [1] > syntax for message URLs: > > mid-url = "mid" ":" message-id [ "/" content-id ] > > The id: syntax is non-standard (i.e. specific to notmuch). > > Sascha > > [1] http://www.ietf.org/rfc/rfc2392.txt Holy crap! There's an RFC that specifies a URL scheme for message-ids?? From 1998?? Very cool. We definitely need to support that. Does anything else support this? I don't see any reason we couldn't support the "cid" schemes as well. jamie. [-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-11-02 19:27 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-10-31 2:29 [PATCH 0/2] Better id: link buttonization Austin Clements 2012-10-31 2:29 ` [PATCH 1/2] test: Test buttonization of id: links Austin Clements 2012-10-31 2:29 ` [PATCH 2/2] emacs: Improve the regexp used to match id:'s in messages Austin Clements 2012-10-31 10:00 ` Jani Nikula 2012-11-01 14:01 ` Sascha Silbe 2012-11-02 19:27 ` Jameson Graef Rollins
Code repositories for project(s) associated with this public inbox https://yhetil.org/notmuch.git/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).