unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
@ 2014-07-25 17:33 Filipp Gunbin
  2014-07-26  7:22 ` Glenn Morris
  2020-12-04 19:22 ` Mattias Engdegård
  0 siblings, 2 replies; 18+ messages in thread
From: Filipp Gunbin @ 2014-07-25 17:33 UTC (permalink / raw)
  To: 18109

Below is the corrected (I hope) version of the regexp.

=== modified file 'lisp/progmodes/compile.el'
--- lisp/progmodes/compile.el	2014-05-29 03:45:29 +0000
+++ lisp/progmodes/compile.el	2014-07-22 18:33:53 +0000
@@ -211,12 +211,9 @@
     (jikes-file
      "^\\(?:Found\\|Issued\\) .* compiling \"\\(.+\\)\":$" 1 nil nil 0)
 
-
-    ;; This used to be pathologically slow on long lines (Bug#3441),
-    ;; due to matching filenames via \\(.*?\\).  This might be faster.
     (maven
      ;; Maven is a popular free software build tool for Java.
-     "\\([^ \n]\\(?:[^\n :]\\| [^-/\n]\\|:[^ \n]\\)*?\\):\\[\\([0-9]+\\),\\([0-9]+\\)\\] " 1 2 3)
+      "\\(?:\\[ERROR\\]\\s-+\\)?\\([^[\n]+\\):\\[\\([[:digit:]]+\\),\\([[:digit:]]+\\)\\]" 1 2 3)
 
     (jikes-line
      "^ *\\([0-9]+\\)\\.[ \t]+.*\n +\\(<-*>\n\\*\\*\\* \\(?:Error\\|Warnin\\(g\\)\\)\\)"


-- 
    Filipp






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2014-07-25 17:33 bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven Filipp Gunbin
@ 2014-07-26  7:22 ` Glenn Morris
  2014-07-28 12:30   ` Filipp Gunbin
  2020-12-04 19:22 ` Mattias Engdegård
  1 sibling, 1 reply; 18+ messages in thread
From: Glenn Morris @ 2014-07-26  7:22 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: 18109


Please explain why it is wrong, and show an example of what it is
supposed to match. The current one in etc/compilation.txt is:

* maven 2.0.9

symbol: maven

FooBar.java:[111,53] no interface expected here







* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2014-07-26  7:22 ` Glenn Morris
@ 2014-07-28 12:30   ` Filipp Gunbin
  2014-08-03 15:12     ` Daniel Colascione
  2020-09-09 11:16     ` Lars Ingebrigtsen
  0 siblings, 2 replies; 18+ messages in thread
From: Filipp Gunbin @ 2014-07-28 12:30 UTC (permalink / raw)
  To: Glenn Morris; +Cc: Filipp Gunbin, 18109

Glenn,

On 26/07/2014 03:22 -0400, Glenn Morris wrote:

> Please explain why it is wrong, and show an example of what it is
> supposed to match. The current one in etc/compilation.txt is:
>
> * maven 2.0.9
>
> symbol: maven
>
> FooBar.java:[111,53] no interface expected here

Oh yes, sorry for the bad report.

While the original regexp catches these errors:

D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[27,12] error: cannot find symbol
[ERROR]   symbol:   class SomeService
  location: class MainController

it does not catch these, which Maven emits sometimes:

[ERROR] D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[21,1] error: cannot find symbol
[ERROR]   symbol: class Controller

My version catches both.

Also, the original regexp seems to be more complicated than it really
should be.

Tested on Maven 2.2.1 and 3.0.4.

-- 
    Filipp
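
For readers following along, the two regexps from the patch can be
compared directly on sample lines.  This is a minimal sketch, not code
from the thread: the `my-' names are hypothetical, the path is shortened
from the examples above, and it assumes that plain `string-match' is a
fair stand-in for the search that compile.el performs.

```elisp
;; Hypothetical helper names; only the two regexp strings come from the patch.
(defconst my-maven-old-re
  "\\([^ \n]\\(?:[^\n :]\\| [^-/\n]\\|:[^ \n]\\)*?\\):\\[\\([0-9]+\\),\\([0-9]+\\)\\] "
  "Maven regexp removed by the patch.")

(defconst my-maven-new-re
  "\\(?:\\[ERROR\\]\\s-+\\)?\\([^[\n]+\\):\\[\\([[:digit:]]+\\),\\([[:digit:]]+\\)\\]"
  "Maven regexp proposed by the patch.")

;; Sample lines modelled on the report, with a shortened path.
(defconst my-plain-line
  "D:\\my-project\\src\\MainController.java:[27,12] error: cannot find symbol")

(defconst my-tagged-line
  "[ERROR] D:\\my-project\\src\\MainController.java:[21,1] error: cannot find symbol")

(defun my-maven-file (regexp line)
  "Return the file name (group 1) that REGEXP extracts from LINE, or nil."
  (when (string-match regexp line)
    (match-string 1 line)))

;; The proposed regexp yields the bare path for both lines:
;; (my-maven-file my-maven-new-re my-plain-line)
;;   => "D:\\my-project\\src\\MainController.java"
;; (my-maven-file my-maven-new-re my-tagged-line)
;;   => "D:\\my-project\\src\\MainController.java"
;;
;; The old regexp folds the tag into the file name on the tagged line:
;; (my-maven-file my-maven-old-re my-tagged-line)
;;   => "[ERROR] D:\\my-project\\src\\MainController.java"
```

In this sketch the old regexp does match the tagged line, but with a
file name that starts with "[ERROR] ".  That would make the match
useless as an error location, although the thread does not state this
as the cause.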






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2014-07-28 12:30   ` Filipp Gunbin
@ 2014-08-03 15:12     ` Daniel Colascione
  2020-09-09 11:16     ` Lars Ingebrigtsen
  1 sibling, 0 replies; 18+ messages in thread
From: Daniel Colascione @ 2014-08-03 15:12 UTC (permalink / raw)
  To: Filipp Gunbin, Glenn Morris; +Cc: 18109


On 07/28/2014 05:30 AM, Filipp Gunbin wrote:
> Glenn,
> 
> On 26/07/2014 03:22 -0400, Glenn Morris wrote:
> 
>> Please explain why it is wrong, and show an example of what it is
>> supposed to match. The current one in etc/compilation.txt is:
>>
>> * maven 2.0.9
>>
>> symbol: maven
>>
>> FooBar.java:[111,53] no interface expected here
> 
> Oh yes, sorry for the bad report.
> 
> While the original regexp catches these errors:
> 
> D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[27,12] error: cannot find symbol
> [ERROR]   symbol:   class SomeService
>   location: class MainController
> 
> it does not catch these, which Maven emits sometimes:
> 
> [ERROR] D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[21,1] error: cannot find symbol
> [ERROR]   symbol: class Controller
> 
> My version catches both.
> 
> Also, the original regexp seems to be more complicated than it really
> should be.
> 
> Tested on Maven 2.2.1 and 3.0.4.
> 

Would you please consider converting this regexp to rx form?




* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2014-07-28 12:30   ` Filipp Gunbin
  2014-08-03 15:12     ` Daniel Colascione
@ 2020-09-09 11:16     ` Lars Ingebrigtsen
  2020-12-03 14:59       ` Lars Ingebrigtsen
  1 sibling, 1 reply; 18+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-09 11:16 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Glenn Morris, 18109

Filipp Gunbin <fgunbin@fastmail.fm> writes:

> it does not catch these, which Maven emits sometimes:
>
> [ERROR] D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[21,1] error: cannot find symbol
> [ERROR]   symbol: class Controller
>
> My version catches both.

There wasn't much of a follow-up on this afterwards (six years ago), but
this regexp was rewritten to use rx this year:

    (maven
     ;; Maven is a popular free software build tool for Java.
     ,(rx bol
          ;; It is unclear whether the initial [type] tag is always present.
          (? "["
             (or "ERROR" (group-n 1 "WARNING") (group-n 2 "INFO"))
             "] ")
          (group-n 3                    ; File
                   (not (any "\n ["))
                   (* (or (not (any "\n :"))
                          (: " " (not (any "\n/-")))
                          (: ":" (not (any "\n ["))))))
          ":["
          (group-n 4 (+ digit))         ; Line
          ","
          (group-n 5 (+ digit))         ; Column
          "] ")
     3 4 5 (1 . 2))

Looking at the new version, it does seem more similar to the proposed
patch than it was before the rewrite.

So does this work satisfactorily now (i.e., in Emacs 28)?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
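
The way the rule's trailing `3 4 5 (1 . 2)' relates to the numbered
groups can be tried out in isolation.  The following is a rough sketch
under stated assumptions: `my-maven-rx-re' and `my-maven-parse' are
hypothetical names, the WARNING sample line is invented for
illustration, and the type is derived by hand from whichever of groups
1 and 2 matched, which is how the (WARNING . INFO) pair is documented
to be read.

```elisp
(require 'rx)

;; Same rx form as quoted above, bound to a hypothetical constant.
(defconst my-maven-rx-re
  (rx bol
      (? "["
         (or "ERROR" (group-n 1 "WARNING") (group-n 2 "INFO"))
         "] ")
      (group-n 3                        ; File
               (not (any "\n ["))
               (* (or (not (any "\n :"))
                      (: " " (not (any "\n/-")))
                      (: ":" (not (any "\n ["))))))
      ":["
      (group-n 4 (+ digit))             ; Line
      ","
      (group-n 5 (+ digit))             ; Column
      "] "))

(defun my-maven-parse (line)
  "Return (TYPE FILE LINE COLUMN) for LINE, or nil if it does not match.
TYPE is `warning' if group 1 matched, `info' if group 2 matched,
and `error' otherwise."
  (when (string-match my-maven-rx-re line)
    (list (cond ((match-beginning 1) 'warning)
                ((match-beginning 2) 'info)
                (t 'error))
          (match-string 3 line)
          (string-to-number (match-string 4 line))
          (string-to-number (match-string 5 line)))))

;; (my-maven-parse "FooBar.java:[111,53] no interface expected here")
;;   => (error "FooBar.java" 111 53)
;; (my-maven-parse "[WARNING] /src/Foo.java:[3,7] deprecated")
;;   => (warning "/src/Foo.java" 3 7)
```

Both the tagged and the untagged form are accepted because the leading
`(? ...)' group is optional; the file group cannot swallow the tag
since its first character must not be "[".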






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-09-09 11:16     ` Lars Ingebrigtsen
@ 2020-12-03 14:59       ` Lars Ingebrigtsen
  2020-12-04 18:11         ` Filipp Gunbin
  0 siblings, 1 reply; 18+ messages in thread
From: Lars Ingebrigtsen @ 2020-12-03 14:59 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Glenn Morris, 18109

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Looking at the new version, it does seem more similar to the proposed
> patch than it was before the rewrite.
>
> So does this work satisfactorily now (i.e., in Emacs 28)?

More information was requested, but no response was given within a few
months, so I'm closing this bug report.  If the problem still exists,
please respond to this email and we'll reopen the bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-03 14:59       ` Lars Ingebrigtsen
@ 2020-12-04 18:11         ` Filipp Gunbin
  0 siblings, 0 replies; 18+ messages in thread
From: Filipp Gunbin @ 2020-12-04 18:11 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Glenn Morris, 18109

On 03/12/2020 15:59 +0100, Lars Ingebrigtsen wrote:

> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>> Looking at the new version, it does seem more similar to the proposed
>> patch than it was before the rewrite.
>>
>> So does this work satisfactorily now (i.e., in Emacs 28)?
>
> More information was requested, but no response was given within a few
> months, so I'm closing this bug report.  If the problem still exists,
> please respond to this email and we'll reopen the bug report.

Thanks for looking at this, the regexp now seems to catch both my
examples from years ago.

Filipp






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2014-07-25 17:33 bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven Filipp Gunbin
  2014-07-26  7:22 ` Glenn Morris
@ 2020-12-04 19:22 ` Mattias Engdegård
  2020-12-05 22:21   ` Filipp Gunbin
  1 sibling, 1 reply; 18+ messages in thread
From: Mattias Engdegård @ 2020-12-04 19:22 UTC (permalink / raw)
  To: Filipp Gunbin, Lars Ingebrigtsen; +Cc: 18109

> Thanks for looking at this, the regexp now seems to catch both my examples from years ago. 

We weren't sure whether messages always were prefixed by [ERROR] etc or could occur without such tags. The Maven documentation and source tree didn't help much, but perhaps I was looking in the wrong places. Could you help resolve the issue?







* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-04 19:22 ` Mattias Engdegård
@ 2020-12-05 22:21   ` Filipp Gunbin
  2020-12-06  9:32     ` Mattias Engdegård
  0 siblings, 1 reply; 18+ messages in thread
From: Filipp Gunbin @ 2020-12-05 22:21 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 18109

On 04/12/2020 20:22 +0100, Mattias Engdegård wrote:

>> Thanks for looking at this, the regexp now seems to catch both my examples from years ago. 
>
> We weren't sure whether messages always were prefixed by [ERROR] etc
> or could occur without such tags. The Maven documentation and source
> tree didn't help much, but perhaps I was looking in the wrong
> places. Could you help resolve the issue?

Now the regexp seems to catch both prefixed and non-prefixed messages,
what else should be resolved here?

Filipp






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-05 22:21   ` Filipp Gunbin
@ 2020-12-06  9:32     ` Mattias Engdegård
  2020-12-06 14:22       ` Filipp Gunbin
  0 siblings, 1 reply; 18+ messages in thread
From: Mattias Engdegård @ 2020-12-06  9:32 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Lars Ingebrigtsen, 18109

5 dec. 2020 kl. 23.21 skrev Filipp Gunbin <fgunbin@fastmail.fm>:

> Now the regexp seems to catch both prefixed and non-prefixed messages,
> what else should be resolved here?

Ah, yes. Apart from the examples in compilation.txt, I could not find any evidence for non-prefixed messages ever being emitted. Since I'm not a Maven user myself, it would be useful to know if the non-prefixed example was just an oversight or an actual occurrence.

It is not a major problem, but I like to do the job properly. Having patterns that match more than their strict minimum can be troublesome for two reasons: a regexp may accidentally catch a message intended for another pattern, and it may slow down message matching. Both have been issues several times in the past, which is why I'm wary of having too-loose regexps in the list.







* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-06  9:32     ` Mattias Engdegård
@ 2020-12-06 14:22       ` Filipp Gunbin
  2020-12-06 15:05         ` Mattias Engdegård
  0 siblings, 1 reply; 18+ messages in thread
From: Filipp Gunbin @ 2020-12-06 14:22 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 18109

On 06/12/2020 10:32 +0100, Mattias Engdegård wrote:

> 5 dec. 2020 kl. 23.21 skrev Filipp Gunbin <fgunbin@fastmail.fm>:
>
>> Now the regexp seems to catch both prefixed and non-prefixed messages,
>> what else should be resolved here?
>
> Ah, yes. Apart from the examples in compilation.txt, I could not find
> any evidence for non-prefixed messages ever being emitted. Since I'm
> not a Maven user myself, it would be useful to know if the
> non-prefixed example was just an oversight or an actual occurrence.
>
> It is not a major problem, but I like to do the job properly. Having
> patterns that match more than their strict minimum can be troublesome
> for two reasons: a regexp may accidentally catch a message intended
> for another pattern, and it may slow down message matching. Both have
> been issues several times in the past, which is why I'm wary of having
> too-loose regexps in the list.

Hm, I rarely use Maven these days (many projects switched to Gradle),
and I'm not on Windows any more, so I cannot reproduce the original
problem now.  If you think a non-prefixed message is very improbable,
just make the regexp stricter, and let's see whether
someone reports it again.

Filipp






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-06 14:22       ` Filipp Gunbin
@ 2020-12-06 15:05         ` Mattias Engdegård
  2020-12-06 15:25           ` Mattias Engdegård
  2020-12-07 10:41           ` Filipp Gunbin
  0 siblings, 2 replies; 18+ messages in thread
From: Mattias Engdegård @ 2020-12-06 15:05 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Lars Ingebrigtsen, 18109

6 dec. 2020 kl. 15.22 skrev Filipp Gunbin <fgunbin@fastmail.fm>:

> Hm, I rarely use Maven these days (many projects switched to Gradle),
> and I'm not on Windows any more, so I cannot reproduce the original
> problem now.  If you think a non-prefixed message is very improbable,
> just make the regexp stricter, and let's see whether
> someone reports it again.

Thank you, maybe we should indeed do that.

It is good to have someone who knows Gradle! That pattern could use some work as well. It is currently (in rx form):

(rx bol
   (| (group "w") nonl)
   ":"
   (* " ")     ; ??
   (group
    (? (in "A-Za-z") ":")
    (+ (not (in "\n:"))))
   ":"
   (* " ")     ; ??
   "("
   (group (+ (in "0-9")))
   ","
   (* " ")     ; ??
   (group (+ (in "0-9")))
   ")")

but the examples (from compilation.txt) look like:

e: /src/Test.kt: (34, 15): foo: bar
w: /src/Test.kt: (34, 15): foo: bar

Thus it looks like we can expect exactly one space each after the first and second colon and after the comma, instead of zero-or-more spaces (the '??' comments above). As a Gradle user, can you confirm this?

The way the pattern is written makes it prone to matching other messages entirely or partly, with potential negative consequences for correctness, performance or both.







* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-06 15:05         ` Mattias Engdegård
@ 2020-12-06 15:25           ` Mattias Engdegård
  2020-12-07 10:41           ` Filipp Gunbin
  1 sibling, 0 replies; 18+ messages in thread
From: Mattias Engdegård @ 2020-12-06 15:25 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Lars Ingebrigtsen, 18109


> Thus it looks like we can expect exactly one space each after the first and second colon and after the comma, instead of zero-or-more spaces

Looking at https://github.com/JetBrains/kotlin/commit/ffe8ae3840d7b9bdc82170c8181031f05ced68bd, it looks likely; here is a proposed patch.


[-- Attachment #2: 0001-Stricter-gradle-kotlin-message-pattern.patch --]
[-- Type: application/octet-stream, Size: 1940 bytes --]

From 125b6d1c9c4d852fa638a86652f0fde9a89c9d0d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Sun, 6 Dec 2020 16:22:09 +0100
Subject: [PATCH] Stricter gradle-kotlin message pattern

* lisp/progmodes/compile.el (compilation-error-regexp-alist-alist):
Rule 'gradle-kotlin': don't be more forgiving than necessary; we know
exactly what the output looks like (see
https://github.com/JetBrains/kotlin/commit/\
ffe8ae3840d7b9bdc82170c8181031f05ced68bd) and there is no reason to
risk mismatches or expensive backtracking (bug#18109).  Recognise
'info' level messages.  Convert to rx.
---
 lisp/progmodes/compile.el | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/lisp/progmodes/compile.el b/lisp/progmodes/compile.el
index 787f5d5ef3..bc14407155 100644
--- a/lisp/progmodes/compile.el
+++ b/lisp/progmodes/compile.el
@@ -241,11 +241,20 @@ compilation-error-regexp-alist-alist
     ;; GradleStyleMessagerRenderer.kt in kotlin sources, see
     ;; https://youtrack.jetbrains.com/issue/KT-34683).
     (gradle-kotlin
-     ,(concat
-       "^\\(?:\\(w\\)\\|.\\): *"            ;type
-       "\\(\\(?:[A-Za-z]:\\)?[^:\n]+\\): *" ;file
-       "(\\([0-9]+\\), *\\([0-9]+\\))")     ;line, column
-     2 3 4 (1))
+     ,(rx bol
+          (| (group "w")                ; 1: warning
+             (group (in "iv"))          ; 2: info
+             "e")                       ; error
+          ": "
+          (group                        ; 3: file
+           (? (in "A-Za-z") ":")
+           (+ (not (in "\n:"))))
+          ": ("
+          (group (+ digit))             ; 4: line
+          ", "
+          (group (+ digit))             ; 5: column
+          "): ")
+     3 4 5 (1 . 2))
 
     (iar
      "^\"\\(.*\\)\",\\([0-9]+\\)\\s-+\\(?:Error\\|Warnin\\(g\\)\\)\\[[0-9]+\\]:"
-- 
2.21.1 (Apple Git-122.3)
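
The effect of the stricter rule can be checked against the two sample
lines from compilation.txt.  This is only a sketch: the `my-' names
are hypothetical, the rx form is copied from the patch above, and the
last sample line (with the spaces removed) is made up to show what the
tightened pattern no longer accepts.

```elisp
(require 'rx)

;; rx form taken from the patch; the constant name is hypothetical.
(defconst my-gradle-kotlin-re
  (rx bol
      (| (group "w")                ; 1: warning
         (group (in "iv"))          ; 2: info
         "e")                       ; error
      ": "
      (group                        ; 3: file
       (? (in "A-Za-z") ":")
       (+ (not (in "\n:"))))
      ": ("
      (group (+ digit))             ; 4: line
      ", "
      (group (+ digit))             ; 5: column
      "): "))

(defun my-gradle-kotlin-parse (line)
  "Return (TYPE FILE LINE COLUMN) for LINE, or nil if it does not match."
  (when (string-match my-gradle-kotlin-re line)
    (list (cond ((match-beginning 1) 'warning)
                ((match-beginning 2) 'info)
                (t 'error))
          (match-string 3 line)
          (string-to-number (match-string 4 line))
          (string-to-number (match-string 5 line)))))

;; (my-gradle-kotlin-parse "e: /src/Test.kt: (34, 15): foo: bar")
;;   => (error "/src/Test.kt" 34 15)
;; (my-gradle-kotlin-parse "w: /src/Test.kt: (34, 15): foo: bar")
;;   => (warning "/src/Test.kt" 34 15)
;; (my-gradle-kotlin-parse "e:/src/Test.kt:(34,15): foo: bar")
;;   => nil
```

The third call returns nil because the new rule requires exactly one
space after each colon and after the comma, where the old rule allowed
zero or more.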



* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-06 15:05         ` Mattias Engdegård
  2020-12-06 15:25           ` Mattias Engdegård
@ 2020-12-07 10:41           ` Filipp Gunbin
  2020-12-07 13:49             ` Mattias Engdegård
  1 sibling, 1 reply; 18+ messages in thread
From: Filipp Gunbin @ 2020-12-07 10:41 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 18109

On 06/12/2020 16:05 +0100, Mattias Engdegård wrote:

> Thus it looks like we can expect exactly one space each after the first and second colon and after the comma, instead of zero-or-more spaces (the '??' comments above). As a Gradle user, can you confirm this?
>
> The way the pattern is written makes it prone to matching other messages entirely or partly, with potential negative consequences for correctness, performance or both.

It was me who put those quantifiers there, and I don't object to making
the regexps stricter.

But, we just need to be aware that Java tools usually don't expect the
output to be parsed.  Like, an IDE uses Gradle's API to run it, and
Gradle uses compiler API to compile - this way none of them have to
parse anything.  So they output something that can be parsed, yes, but
the format could change at any time.  That is why I'm more inclined to
make regexps more _lax_, not the other way around (and fix the
problems with them once they appear).

Filipp






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-07 10:41           ` Filipp Gunbin
@ 2020-12-07 13:49             ` Mattias Engdegård
  2020-12-07 20:07               ` Filipp Gunbin
  0 siblings, 1 reply; 18+ messages in thread
From: Mattias Engdegård @ 2020-12-07 13:49 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Lars Ingebrigtsen, 18109

7 dec. 2020 kl. 11.41 skrev Filipp Gunbin <fgunbin@fastmail.fm>:

> It was me who put those quantifiers there, and I don't object to making
> the regexps stricter.

It would be unfair to blame you for that! After all, that's how most of the other patterns were written, and for logical reasons: it seems intuitive and sensible to make the rules as loose as possible in case the format changes or there is otherwise a variation in the output. If the observed messages contain a single space in one place then standard practice has been to tolerate any number of spaces there, maybe even zero.

However, experience tells us that this intuition is wrong. Output formats do in fact tend to remain unchanged: Emacs and other editors, IDEs and other code are parsing them, and they are not all equally tolerant or in the same way. There is thus a self-reinforcing effect: the tool keeps output stable because we expect it to. (When output formats do change, it tends to be for good reasons and regexp tolerance is then rarely useful.)

> But, we just need to be aware that Java tools usually don't expect the
> output to be parsed.

Yes they do! The very composition of something like the gradle-kotlin output

e: FILENAME: (LINE, COL): MESSAGE

is so strict and formalised that it was definitely made with machine-readability in mind.

>  That is why I'm more inclined to
> make regexps more _lax_, not the other way around (and fix the
> problems with them once they appear).

As we have found out the hard way, the cost of lax patterns is insidious and diffuse until the mess really has to be sorted out -- and by then it's hard to get hold of the various people involved, who have long since disappeared or forgotten all about what they wrote years ago. Patterns are added independently of one another but interact in unexpected ways.

Thus, better to keep patterns strict, and only alter them when and if tool output changes; it is then clear exactly what needs to be done and why. For most rules this never becomes necessary.







* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-07 13:49             ` Mattias Engdegård
@ 2020-12-07 20:07               ` Filipp Gunbin
  2020-12-09 18:41                 ` Mattias Engdegård
  0 siblings, 1 reply; 18+ messages in thread
From: Filipp Gunbin @ 2020-12-07 20:07 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 18109

On 07/12/2020 14:49 +0100, Mattias Engdegård wrote:

> However, experience tells us that this intuition is wrong. Output
> formats do in fact tend to remain unchanged: Emacs and other editors,
> IDEs and other code are parsing them, and they are not all equally
> tolerant or in the same way. There is thus a self-reinforcing effect:
> the tool keeps output stable because we expect it to. (When output
> formats do change, it tends to be for good reasons and regexp
> tolerance is then rarely useful.)

I would be very happy if this were true (I'm not saying the opposite
is the case, but I have a feeling that few in the Java world care about
how errors parse in Emacs).

>> But, we just need to be aware that Java tools usually don't expect the
>> output to be parsed.
>
> Yes they do! The very composition of something like the gradle-kotlin output
>
> e: FILENAME: (LINE, COL): MESSAGE
>
> is so strict and formalised that it was definitely made with
> machine-readability in mind.

I doubt that any modern-or-so Java IDE will parse any error messages,
given that build tools and compilers have APIs.  At the level of build
tools, I can speak only for Gradle, and (to the best of my knowledge) it
doesn't parse output when invoking either compilers or other tools, like
checkstyle plugins.

>>  That is why I'm more inclined to
>> make regexps more _lax_, not the other way around (and fix the
>> problems with them once they appear).
>
> As we have found out the hard way, the cost of lax patterns is
> insidious and diffuse until the mess really has to be sorted out --
> and by then it's hard to get hold of the various people involved who
> have long since disappeared or forgotten all about what they wrote years
> ago. Patterns are added independently of one another but interact in
> unexpected ways.
>
> Thus, better to keep patterns strict, and only alter them when and if
> tool output changes; it is then clear exactly what needs to be done
> and why. For most rules this never becomes necessary.

Just wondering - did we really have that many problems caused by bad
performance of compilation regexps?  Because if we did, then maybe we
should look at other approaches, like trying to detect the compiler
used, and narrow the set of regexps based on it.  It's natural to expect
that many different people would edit these regexps when something
doesn't work for them, and expecting that you will always come and fix
the things up would not be very fair to you :-)

Filipp






* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-07 20:07               ` Filipp Gunbin
@ 2020-12-09 18:41                 ` Mattias Engdegård
  2020-12-10 13:12                   ` Filipp Gunbin
  0 siblings, 1 reply; 18+ messages in thread
From: Mattias Engdegård @ 2020-12-09 18:41 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: Lars Ingebrigtsen, 18109-done

7 dec. 2020 kl. 21.07 skrev Filipp Gunbin <fgunbin@fastmail.fm>:

> [...] I have a feeling that few in the Java world care about how
> errors parse in Emacs.

Most likely. On the other hand, lack of interest in the output format can also imply that it's unlikely to change.

> I doubt that any modern-or-so Java IDE will parse any error messages,
> given that build tools and compilers have APIs.

Quite possible, but the very emission of formalised messages to stdout/stderr means that this mode of usage is still acknowledged as somewhat common and useful.

> - did we really have that many problems caused by bad
> performance of compilation regexps?  Because if we did, then maybe we
> should look at other approaches, like trying to detect the compiler
> used, and narrow the set of regexps based on it.

This is hard to do in any practical way, not least because a single message buffer may consist of the combined output of dozens of different tools -- compilers, linters, build tools, spell checkers, testing, stack traces, packaging, and so on. Not to mention the practical difficulty of going from the string 'make' to 'GCC version 11.2'.

That things work reasonably anyway is very much thanks to the prevalence of a few fairly common formats, such as GNU (file:line: message).

>  It's natural to expect
> that many different people would edit these regexps when something
> doesn't work for them, and expecting that you will always come and fix
> the things up would not be very fair to you :-)

Very considerate, thank you! There seems to be a fairly good flow of reports when something doesn't work. (A more modern and inviting bug-reporting system would probably help but that is a completely different matter.)

I'm pushing the proposed tightening of gradle-kotlin because the principle is right, and even if the Java world internally prefers APIs for composing tools, a tighter regexp in Emacs helps performance and accuracy for other patterns. Loose regexps form a sort of tragedy of the commons.

It seems that we also have forgotten to close the bug; doing that now. Thank you again for the insightful comments!







* bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven
  2020-12-09 18:41                 ` Mattias Engdegård
@ 2020-12-10 13:12                   ` Filipp Gunbin
  0 siblings, 0 replies; 18+ messages in thread
From: Filipp Gunbin @ 2020-12-10 13:12 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 18109-done

On 09/12/2020 19:41 +0100, Mattias Engdegård wrote:

> Quite possible, but the very emission of formalised messages to
> stdout/stderr means that this mode of usage is still acknowledged as
> somewhat common and useful.

Yes, sure.

>> - did we really have that many problems caused by bad
>> performance of compilation regexps?  Because if we did, then maybe we
>> should look at other approaches, like trying to detect the compiler
>> used, and narrow the set of regexps based on it.
>
> This is hard to do in any practical way, not least because a
> single message buffer may consist of the combined output of dozens of
> different tools -- compilers, linters, build tools, spell checkers,
> testing, stack traces, packaging, and so on. Not to mention the
> practical difficulty of going from the string 'make' to 'GCC version
> 11.2'.
>
> That things work reasonably anyway is very much thanks to the
> prevalence of a few fairly common formats, such as GNU (file:line:
> message).

Yes, btw I see that "gnu" regexp sometimes captures messages which I
expect to be captured by "javac" regexp.  This is not that unexpected,
given the occasional similarity between formats...  I'll look into that
later.

>>  It's natural to expect
>> that many different people would edit these regexps when something
>> doesn't work for them, and expecting that you will always come and fix
>> the things up would not be very fair to you :-)
>
> Very considerate, thank you! There seems to be a fairly good flow of
> reports when something doesn't work. (A more modern and inviting
> bug-reporting system would probably help but that is a completely
> different matter.)
>
> I'm pushing the proposed tightening of gradle-kotlin because the
> principle is right, and even if the Java world internally prefers APIs
> for composing tools, a tighter regexp in Emacs helps performance and
> accuracy for other patterns. Loose regexps form a sort of tragedy of
> the commons.
>
> It seems that we also have forgotten to close the bug; doing that
> now. Thank you again for the insightful comments!

Thank you for the careful work.

Filipp






Thread overview: 18 messages
2014-07-25 17:33 bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven Filipp Gunbin
2014-07-26  7:22 ` Glenn Morris
2014-07-28 12:30   ` Filipp Gunbin
2014-08-03 15:12     ` Daniel Colascione
2020-09-09 11:16     ` Lars Ingebrigtsen
2020-12-03 14:59       ` Lars Ingebrigtsen
2020-12-04 18:11         ` Filipp Gunbin
2020-12-04 19:22 ` Mattias Engdegård
2020-12-05 22:21   ` Filipp Gunbin
2020-12-06  9:32     ` Mattias Engdegård
2020-12-06 14:22       ` Filipp Gunbin
2020-12-06 15:05         ` Mattias Engdegård
2020-12-06 15:25           ` Mattias Engdegård
2020-12-07 10:41           ` Filipp Gunbin
2020-12-07 13:49             ` Mattias Engdegård
2020-12-07 20:07               ` Filipp Gunbin
2020-12-09 18:41                 ` Mattias Engdegård
2020-12-10 13:12                   ` Filipp Gunbin
