unofficial mirror of notmuch@notmuchmail.org
* [PATCH] emacs: Avoid regexp overflow when tidying citations.
@ 2010-11-12 12:50 David Edmondson
  2010-11-16 19:57 ` Carl Worth
  0 siblings, 1 reply; 4+ messages in thread
From: David Edmondson @ 2010-11-12 12:50 UTC (permalink / raw)
  To: notmuch

Declare `notmuch-wash-tidy-citations-max', which is the largest region
that `notmuch-wash-tidy-citations' will attempt to improve.
---
 emacs/notmuch-wash.el |   61 ++++++++++++++++++++++++++++---------------------
 1 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/emacs/notmuch-wash.el b/emacs/notmuch-wash.el
index cfcfb21..a7ea5e9 100644
--- a/emacs/notmuch-wash.el
+++ b/emacs/notmuch-wash.el
@@ -187,6 +187,11 @@ is what to put on the button."
 
 ;;
 
+(defcustom notmuch-wash-tidy-citations-max (* 10 1024)
+  "Maximum size, in characters, of a message body to tidy."
+  :group 'notmuch
+  :type 'integer)
+
 (defun notmuch-wash-tidy-citations (depth)
   "Improve the display of cited regions of a message.
 
@@ -199,32 +204,36 @@ Perform four transformations on the message body:
 - Remove citation trailers standing alone after a block of cited
   text."
 
-  ;; Remove lines of repeated citation leaders with no other content.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(^>[> ]*\n\\)\\{2,\\}" nil t)
-    (replace-match "\\1"))
-
-  ;; Remove citation leaders standing alone before a block of cited
-  ;; text.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(\n\\|^[^>].*\\)\n\\(^>[> ]*\n\\)" nil t)
-    (replace-match "\\1\n"))
-
-  ;; Remove citation trailers standing alone after a block of cited
-  ;; text.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(^>[> ]*\n\\)\\(^$\\|^[^>].*\\)" nil t)
-    (replace-match "\\2"))
-
-  ;; Insert a blank line before a citation if there isn't one.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(^[^>]+\\)\n>" nil t)
-    (replace-match "\\1\n\n>"))
-
-  ;; Insert a blank line after a citation if there isn't one.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(^>.+\\)\n\\([^>]\\)" nil t)
-    (replace-match "\\1\n\n\\2")))
+  ;; If the message is long, don't bother.
+  (unless (> (- (point-max) (point-min))
+	     notmuch-wash-tidy-citations-max)
+
+    ;; Remove lines of repeated citation leaders with no other content.
+    (goto-char (point-min))
+    (while (re-search-forward "\\(^>[> ]*\n\\)\\{2,\\}" nil t)
+      (replace-match "\\1"))
+
+    ;; Remove citation leaders standing alone before a block of cited
+    ;; text.
+    (goto-char (point-min))
+    (while (re-search-forward "\\(\n\\|^[^>].*\\)\n\\(^>[> ]*\n\\)" nil t)
+      (replace-match "\\1\n"))
+
+    ;; Remove citation trailers standing alone after a block of cited
+    ;; text.
+    (goto-char (point-min))
+    (while (re-search-forward "\\(^>[> ]*\n\\)\\(^$\\|^[^>].*\\)" nil t)
+      (replace-match "\\2"))
+
+    ;; Insert a blank line before a citation if there isn't one.
+    (goto-char (point-min))
+    (while (re-search-forward "\\(^[^>]+\\)\n>" nil t)
+      (replace-match "\\1\n\n>"))
+
+    ;; Insert a blank line after a citation if there isn't one.
+    (goto-char (point-min))
+    (while (re-search-forward "\\(^>.+\\)\n\\([^>]\\)" nil t)
+      (replace-match "\\1\n\n\\2"))))
 
 ;;
 
-- 
1.7.2.3

* Re: [PATCH] emacs: Avoid regexp overflow when tidying citations.
  2010-11-12 12:50 [PATCH] emacs: Avoid regexp overflow when tidying citations David Edmondson
@ 2010-11-16 19:57 ` Carl Worth
  2010-11-17 13:32   ` [PATCH] emacs: Remove over-eager regular expressions from notmuch-wash-tidy-citations David Edmondson
  0 siblings, 1 reply; 4+ messages in thread
From: Carl Worth @ 2010-11-16 19:57 UTC (permalink / raw)
  To: David Edmondson, notmuch

On Fri, 12 Nov 2010 12:50:02 +0000, David Edmondson <dme@dme.org> wrote:
> Declare `notmuch-wash-tidy-citations-max', which is the largest region
> that `notmuch-wash-tidy-citations' will attempt to improve.

Hi David,

Could you add a test case for whatever bug is being fixed here?

I'm a little concerned about giving the user a tuning knob which appears
to have a somewhat arbitrary default, and also without providing much
guidance to the user on how to set that particular knob.

I'd feel better if the code could somehow "know" when it needs to stop
tidying well enough that we would feel fine just not offering the knob
at all.

Thanks,

-Carl
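
A reproduction of the kind requested above could start from a helper like the following sketch. It is not part of notmuch: the name `my-regexp-survives-p' and the generated buffer contents are assumptions made for illustration. It builds a buffer of uncited lines followed by one cited line, runs a single search, and reports whether the search completed without signalling an error. Whether a given line count actually overflows depends on the Emacs version and its regexp stack limit, so no particular result is claimed here.

```elisp
(defun my-regexp-survives-p (regexp line-count)
  "Return non-nil if searching for REGEXP over LINE-COUNT lines completes.
Hypothetical test helper, not part of notmuch."
  (with-temp-buffer
    ;; Build LINE-COUNT uncited lines followed by a single cited line.
    (dotimes (i line-count)
      (insert (format "uncited line %d\n" i)))
    (insert "> cited line\n")
    (goto-char (point-min))
    ;; A regexp stack overflow is signalled as an ordinary error,
    ;; so `condition-case' can turn it into a nil return value.
    (condition-case nil
        (progn (re-search-forward regexp nil t) t)
      (error nil))))
```

Calling it as (my-regexp-survives-p "\\(^[^>]+\\)\n>" 5000) exercises one of the expressions from the patch on a message of a few thousand lines; a test case could assert on the result for the sizes of interest.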

* [PATCH] emacs: Remove over-eager regular expressions from notmuch-wash-tidy-citations.
  2010-11-16 19:57 ` Carl Worth
@ 2010-11-17 13:32   ` David Edmondson
  2010-12-07 22:11     ` Carl Worth
  0 siblings, 1 reply; 4+ messages in thread
From: David Edmondson @ 2010-11-17 13:32 UTC (permalink / raw)
  To: notmuch

The removed expressions, which were used to ensure that citations were
both preceded and followed by a blank line, were poorly implemented
and caused a regexp stack overflow on messages more than a few
thousand lines long.
---

Carl, I was not able to find a version of the regular expressions that
didn't easily overflow. For now, this patch removes the problematic
expressions and I'll look for a better solution.

 emacs/notmuch-wash.el |   14 ++------------
 1 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/emacs/notmuch-wash.el b/emacs/notmuch-wash.el
index cfcfb21..c4a7a41 100644
--- a/emacs/notmuch-wash.el
+++ b/emacs/notmuch-wash.el
@@ -190,7 +190,7 @@ is what to put on the button."
 (defun notmuch-wash-tidy-citations (depth)
   "Improve the display of cited regions of a message.
 
-Perform four transformations on the message body:
+Perform several transformations on the message body:
 
 - Remove lines of repeated citation leaders with no other
   content,
@@ -214,17 +214,7 @@ Perform four transformations on the message body:
   ;; text.
   (goto-char (point-min))
   (while (re-search-forward "\\(^>[> ]*\n\\)\\(^$\\|^[^>].*\\)" nil t)
-    (replace-match "\\2"))
-
-  ;; Insert a blank line before a citation if there isn't one.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(^[^>]+\\)\n>" nil t)
-    (replace-match "\\1\n\n>"))
-
-  ;; Insert a blank line after a citation if there isn't one.
-  (goto-char (point-min))
-  (while (re-search-forward "\\(^>.+\\)\n\\([^>]\\)" nil t)
-    (replace-match "\\1\n\n\\2")))
+    (replace-match "\\2")))
 
 ;;
 
-- 
1.7.2.3
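
For illustration only, one way to get the blank-line behaviour back without multi-line regular expressions is to walk the buffer a line at a time. A plausible reason the removed expressions overflowed is that a complemented set such as [^>] also matches newlines in Emacs, so a single match of "\\(^[^>]+\\)\n>" can span every line up to the next ">". The sketch below only ever inspects the start of the current line. The function name `my-tidy-citation-boundaries' is hypothetical, this is not the solution notmuch adopted, and it does not reproduce the removed expressions exactly.

```elisp
(defun my-tidy-citation-boundaries ()
  "Ensure cited blocks are preceded and followed by a blank line.
Hypothetical helper, not part of notmuch.  Walks the buffer one
line at a time instead of matching multi-line regular expressions."
  (save-excursion
    (goto-char (point-min))
    ;; Treat the start of the buffer as if a blank line preceded it.
    (let ((prev 'blank))
      (while (not (eobp))
        (let ((kind (cond ((looking-at "^$") 'blank)
                          ((looking-at "^>") 'cited)
                          (t 'text))))
          ;; Text directly followed by a citation, or a citation
          ;; directly followed by text: insert a blank line between.
          (when (or (and (eq prev 'text) (eq kind 'cited))
                    (and (eq prev 'cited) (eq kind 'text)))
            (insert "\n"))
          (setq prev kind))
        (forward-line 1)))))
```

Each line is classified with an anchored `looking-at', so the work per line is constant and the matcher never has to backtrack across the message body.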

* Re: [PATCH] emacs: Remove over-eager regular expressions from notmuch-wash-tidy-citations.
  2010-11-17 13:32   ` [PATCH] emacs: Remove over-eager regular expressions from notmuch-wash-tidy-citations David Edmondson
@ 2010-12-07 22:11     ` Carl Worth
  0 siblings, 0 replies; 4+ messages in thread
From: Carl Worth @ 2010-12-07 22:11 UTC (permalink / raw)
  To: David Edmondson, notmuch

On Wed, 17 Nov 2010 13:32:33 +0000, David Edmondson <dme@dme.org> wrote:
> The removed expressions, which were used to ensure that citations were
> both preceded and followed by a blank line, were poorly implemented
> and caused a regexp stack overflow on messages more than a few
> thousand lines long.
> ---
> 
> Carl, I was not able to find a version of the regular expressions that
> didn't easily overflow. For now, this patch removes the problematic
> expressions and I'll look for a better solution.

Thanks. Applied.

-Carl

-- 
carl.d.worth@intel.com