unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#51941: Detect duplication of ERT tests
@ 2021-11-18 10:16 Mattias Engdegård
  2021-11-18 10:35 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-18 10:16 UTC (permalink / raw)
  To: 51941

[-- Attachment #1: Type: text/plain, Size: 1403 bytes --]

Duplicated `ert-deftest` forms (ie, two or more definitions with the same test name) is a serious problem: the older test is silently ignored, which has the consequence of the test covering less than intended.

A quick experiment in the Emacs tree immediately reveals 51 instances. There's no telling how many lie waiting in external code.

Having explored various options I've come to the conclusion that it needs to be a hard error, at run-time, when running non-interactively (see attached diff). Rationale:

* We can't just warn at run time because tests tend to spam a lot when run and users only care about whether the tests passed and won't even look at the logs otherwise.
* We can't just warn at compile time because many tests aren't compiled at all, and where they are (as in the Emacs tree) nobody cares much about the warnings, because they are "just tests".
* We can't issue a complaint when running interactively because it's natural to keep redefining tests during development.

Since `ert-deftest` forms are often generated as a result of macro-expansion, passive static textual linting won't do.

The effect of this change will be a visible and easily remedied failure in broken test code. I volunteer to fix the first-order errors in Emacs (by renaming the clashes); domain specialists may need to help fixing secondary failures (failures in previously ignored tests).


[-- Attachment #2: ert-deftest-redefine-error.diff --]
[-- Type: application/octet-stream, Size: 661 bytes --]

diff --git a/lisp/emacs-lisp/ert.el b/lisp/emacs-lisp/ert.el
index 36b4408dc8..92a1315f13 100644
--- a/lisp/emacs-lisp/ert.el
+++ b/lisp/emacs-lisp/ert.el
@@ -151,6 +151,10 @@ ert-set-test
     ;; Note that nil is still a valid value for the `name' slot in
     ;; ert-test objects.  It designates an anonymous test.
     (error "Attempt to define a test named nil"))
+  (when (and noninteractive (get symbol 'ert--test))
+    ;; If a test is redefined when running noninteractively, treat it
+    ;; as a hard error to make sure it is discovered.
+    (error "Test `%s' redefined" symbol))
   (define-symbol-prop symbol 'ert--test definition)
   definition)
 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 10:16 bug#51941: Detect duplication of ERT tests Mattias Engdegård
@ 2021-11-18 10:35 ` Lars Ingebrigtsen
  2021-11-18 11:54   ` Mattias Engdegård
  2021-11-24  7:30   ` Ihor Radchenko
  0 siblings, 2 replies; 19+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-18 10:35 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941

Mattias Engdegård <mattiase@acm.org> writes:

> Having explored various options I've come to the conclusion that it
> needs to be a hard error, at run-time, when running non-interactively
> (see attached diff). Rationale:

Makes sense to me.

> The effect of this change will be a visible and easily remedied
> failure in broken test code. I volunteer to fix the first-order errors
> in Emacs (by renaming the clashes); domain specialists may need to
> help fixing secondary failures (failures in previously ignored tests).

Sounds good.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 10:35 ` Lars Ingebrigtsen
@ 2021-11-18 11:54   ` Mattias Engdegård
  2021-11-18 11:58     ` Lars Ingebrigtsen
  2021-11-18 17:44     ` Lars Ingebrigtsen
  2021-11-24  7:30   ` Ihor Radchenko
  1 sibling, 2 replies; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-18 11:54 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 51941-done

18 nov. 2021 kl. 11.35 skrev Lars Ingebrigtsen <larsi@gnus.org>:

> Sounds good.

Thank you, all now done. The tests seem to pass after clashes were resolved but do tell if I missed something.






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 11:54   ` Mattias Engdegård
@ 2021-11-18 11:58     ` Lars Ingebrigtsen
  2021-11-18 17:44     ` Lars Ingebrigtsen
  1 sibling, 0 replies; 19+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-18 11:58 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941-done

Mattias Engdegård <mattiase@acm.org> writes:

> Thank you, all now done. The tests seem to pass after clashes were
> resolved but do tell if I missed something.

All the tests pass for me, too.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 11:54   ` Mattias Engdegård
  2021-11-18 11:58     ` Lars Ingebrigtsen
@ 2021-11-18 17:44     ` Lars Ingebrigtsen
  2021-11-18 17:53       ` Philipp Stephani
  2021-11-18 19:13       ` Mattias Engdegård
  1 sibling, 2 replies; 19+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-18 17:44 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941

By the way, this reminds me of a different error I often make in ert
files -- I type "defun" instead of "ert-deftest".  Would it be
productive to check whether a -tests.el file contains a defun that's
never called?  In which case it's probably supposed to be en
ert-deftest.

But perhaps it's such a marginal problem that it's not worth checking
for.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 17:44     ` Lars Ingebrigtsen
@ 2021-11-18 17:53       ` Philipp Stephani
  2021-11-18 19:13       ` Mattias Engdegård
  1 sibling, 0 replies; 19+ messages in thread
From: Philipp Stephani @ 2021-11-18 17:53 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Mattias Engdegård, 51941

Am Do., 18. Nov. 2021 um 18:45 Uhr schrieb Lars Ingebrigtsen <larsi@gnus.org>:
>
> By the way, this reminds me of a different error I often make in ert
> files -- I type "defun" instead of "ert-deftest".  Would it be
> productive to check whether a -tests.el file contains a defun that's
> never called?  In which case it's probably supposed to be en
> ert-deftest.
>
> But perhaps it's such a marginal problem that it's not worth checking
> for.

This also happens to me rather frequently, so I'd welcome such an
addition. The byte compiler can already generate call graphs (see
byte-compile-generate-call-tree), so maybe that functionality could be
reused.





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 17:44     ` Lars Ingebrigtsen
  2021-11-18 17:53       ` Philipp Stephani
@ 2021-11-18 19:13       ` Mattias Engdegård
  2021-11-18 19:34         ` Lars Ingebrigtsen
  2021-11-18 22:51         ` Glenn Morris
  1 sibling, 2 replies; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-18 19:13 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 51941, Philipp

18 nov. 2021 kl. 18.44 skrev Lars Ingebrigtsen <larsi@gnus.org>:

> By the way, this reminds me of a different error I often make in ert
> files -- I type "defun" instead of "ert-deftest".  Would it be
> productive to check whether a -tests.el file contains a defun that's
> never called?

It would certainly be good to detect such mistakes, but I'm not sure how to go about finding them. I wrote a quick-and-dirty scanner that found about 400 uncalled functions in the test tree of which 1 or 2, maybe, should have been `ert-deftest` (will fix those right away).

Clearly something with much better signal/noise ratio is called for. One would be to only consider functions using `should` etc.

> But perhaps it's such a marginal problem that it's not worth checking
> for.

It's unclear how frequent this mistake is but at least now we have concrete evidence of its existence. On the other hand, it will be found by a test author who has adopted the standard "test first without then with the fix" ritual as habit.

(I'm not accusing you of bad discipline; I'm probably no better myself, and in any case berating people for being careless is not how to solve problems caused by human mistakes.)

I may or may not have the time to try to find something better so feel free to take a shot if you like.






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 19:13       ` Mattias Engdegård
@ 2021-11-18 19:34         ` Lars Ingebrigtsen
  2021-11-18 19:43           ` Mattias Engdegård
  2021-11-18 22:51         ` Glenn Morris
  1 sibling, 1 reply; 19+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-18 19:34 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941, Philipp

Mattias Engdegård <mattiase@acm.org> writes:

> It's unclear how frequent this mistake is but at least now we have
> concrete evidence of its existence. On the other hand, it will be
> found by a test author who has adopted the standard "test first
> without then with the fix" ritual as habit.
>
> (I'm not accusing you of bad discipline; I'm probably no better
> myself, and in any case berating people for being careless is not how
> to solve problems caused by human mistakes.)

I usually eval the `should' statements by hand in the running Emacs when
developing stuff, so I frequently don't notice that they don't actually
run when saying "make check".

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 19:34         ` Lars Ingebrigtsen
@ 2021-11-18 19:43           ` Mattias Engdegård
  0 siblings, 0 replies; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-18 19:43 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 51941, Philipp

18 nov. 2021 kl. 20.34 skrev Lars Ingebrigtsen <larsi@gnus.org>:

> I usually eval the `should' statements by hand in the running Emacs when
> developing stuff, so I frequently don't notice that they don't actually
> run when saying "make check".

Yes, that's a little dangerous because there are so many things that can go wrong. On the other hand, I often just run the added or changed test interactively with `ert` instead of the whole file in batch mode. (And I have many times fallen into the .elc vs .el trap of TEST_LOAD_EL!)

We should have a modern test runner in Emacs that runs tests in batch mode but displays the results interactively, points out exact failure points etc.






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 19:13       ` Mattias Engdegård
  2021-11-18 19:34         ` Lars Ingebrigtsen
@ 2021-11-18 22:51         ` Glenn Morris
  2021-11-19  5:28           ` Lars Ingebrigtsen
  1 sibling, 1 reply; 19+ messages in thread
From: Glenn Morris @ 2021-11-18 22:51 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941, Lars Ingebrigtsen, Philipp

Mattias Engdegård wrote:

> It would certainly be good to detect such mistakes, but I'm not sure
> how to go about finding them. I wrote a quick-and-dirty scanner that
> found about 400 uncalled functions in the test tree of which 1 or 2,
> maybe, should have been `ert-deftest` (will fix those right away).
>
> Clearly something with much better signal/noise ratio is called for.
> One would be to only consider functions using `should` etc.

Seems to me that making "should" only be defined within tests would have
flagged these (https://debbugs.gnu.org/28385).





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 22:51         ` Glenn Morris
@ 2021-11-19  5:28           ` Lars Ingebrigtsen
  2021-11-19  9:33             ` Mattias Engdegård
  0 siblings, 1 reply; 19+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-19  5:28 UTC (permalink / raw)
  To: Glenn Morris; +Cc: Mattias Engdegård, Philipp, 51941

Glenn Morris <rgm@gnu.org> writes:

> Seems to me that making "should" only be defined within tests would have
> flagged these (https://debbugs.gnu.org/28385).

But like I said there, using `should' outside tests is really convenient
when developing.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-19  5:28           ` Lars Ingebrigtsen
@ 2021-11-19  9:33             ` Mattias Engdegård
  2021-11-20  8:38               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-19  9:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Glenn Morris, Philipp, 51941

19 nov. 2021 kl. 06.28 skrev Lars Ingebrigtsen <larsi@gnus.org>:

>> Seems to me that making "should" only be defined within tests would have
>> flagged these (https://debbugs.gnu.org/28385).
> 
> But like I said there, using `should' outside tests is really convenient
> when developing.

You are both right -- apart from the development convenience, functions form an important abstraction mechanism in Lisp and they can be valuable for decomposing test code (although I don't think I've ever done so myself). On the other hand, disallowing them outside `ert-deftest` might help eliminate the mistakes that we are talking about.

A quick scan finds 365 occurrences of `should`, `should-not` and `should-error` inside defuns in the test/ subtree.






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-19  9:33             ` Mattias Engdegård
@ 2021-11-20  8:38               ` Lars Ingebrigtsen
  2021-11-20  8:49                 ` Mattias Engdegård
  0 siblings, 1 reply; 19+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-20  8:38 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Glenn Morris, Philipp, 51941

Mattias Engdegård <mattiase@acm.org> writes:

> A quick scan finds 365 occurrences of `should`, `should-not` and
> `should-error` inside defuns in the test/ subtree.

Most of those are probably called from an ert-deftest, I hope.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-20  8:38               ` Lars Ingebrigtsen
@ 2021-11-20  8:49                 ` Mattias Engdegård
  0 siblings, 0 replies; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-20  8:49 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Glenn Morris, Philipp, 51941

20 nov. 2021 kl. 09.38 skrev Lars Ingebrigtsen <larsi@gnus.org>:

>> A quick scan finds 365 occurrences of `should`, `should-not` and
>> `should-error` inside defuns in the test/ subtree.
> 
> Most of those are probably called from an ert-deftest, I hope.

I didn't do any call-tree analysis but some tests use ERT in rather unorthodox ways.






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-18 10:35 ` Lars Ingebrigtsen
  2021-11-18 11:54   ` Mattias Engdegård
@ 2021-11-24  7:30   ` Ihor Radchenko
  2021-11-24  8:52     ` Mattias Engdegård
  1 sibling, 1 reply; 19+ messages in thread
From: Ihor Radchenko @ 2021-11-24  7:30 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Mattias Engdegård, 51941

Lars Ingebrigtsen <larsi@gnus.org> writes:

>> The effect of this change will be a visible and easily remedied
>> failure in broken test code. I volunteer to fix the first-order errors
>> in Emacs (by renaming the clashes); domain specialists may need to
>> help fixing secondary failures (failures in previously ignored tests).
>
> Sounds good.

This change breaks test code when a test file is "required" multiple
times. In particular, Org mode tests cannot run with latest Emacs.

Best,
Ihor





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-24  7:30   ` Ihor Radchenko
@ 2021-11-24  8:52     ` Mattias Engdegård
  2021-11-24  9:29       ` Ihor Radchenko
  0 siblings, 1 reply; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-24  8:52 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: 51941, Lars Ingebrigtsen

24 nov. 2021 kl. 08.30 skrev Ihor Radchenko <yantar92@gmail.com>:

> This change breaks test code when a test file is "required" multiple
> times. In particular, Org mode tests cannot run with latest Emacs.

We'll have to do something about that then. Would you tell me more precisely what those Org tests do and why?
('require' is supposed to be idempotent; maybe you are missing a 'provide' somewhere?)






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-24  8:52     ` Mattias Engdegård
@ 2021-11-24  9:29       ` Ihor Radchenko
  2021-11-24 10:00         ` Mattias Engdegård
  0 siblings, 1 reply; 19+ messages in thread
From: Ihor Radchenko @ 2021-11-24  9:29 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941, Lars Ingebrigtsen

Mattias Engdegård <mattiase@acm.org> writes:

> 24 nov. 2021 kl. 08.30 skrev Ihor Radchenko <yantar92@gmail.com>:
>
>> This change breaks test code when a test file is "required" multiple
>> times. In particular, Org mode tests cannot run with latest Emacs.
>
> We'll have to do something about that then. Would you tell me more precisely what those Org tests do and why?
> ('require' is supposed to be idempotent; maybe you are missing a 'provide' somewhere?)

In Org 9.5, we have test-oc.el. That file requires a macro defined in
test-ox.el:

In test-oc.el:
;; We need `org-test-with-parsed-data' macro.
(require 'test-ox "../testing/lisp/test-ox.el")

The load sequence is:
1. load-file test-oc
   1.1. test-oc requires test-ox
2. load file test-ox triggers error:
   Debugger entered--Lisp error: (error "Test ‘test-org-export/bind-keyword’ redefined")

Hope it helps.

Best,
Ihor





^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-24  9:29       ` Ihor Radchenko
@ 2021-11-24 10:00         ` Mattias Engdegård
  2021-11-24 12:16           ` Ihor Radchenko
  0 siblings, 1 reply; 19+ messages in thread
From: Mattias Engdegård @ 2021-11-24 10:00 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: 51941, Lars Ingebrigtsen

24 nov. 2021 kl. 10.29 skrev Ihor Radchenko <yantar92@gmail.com>:

> 1. load-file test-oc
>   1.1. test-oc requires test-ox
> 2. load file test-ox triggers error:
>   Debugger entered--Lisp error: (error "Test ‘test-org-export/bind-keyword’ redefined")

Thank you, that was very clear.

Best practice is to run each ERT test file in a separate Emacs process. This ensures isolation between the tests, which can be quite important, but also permits them to be run in parallel. Easiest is to do this from a Makefile but you could of course do it from a controlling Emacs process if you prefer that.

Now if you for some reason want to stick with running them all in a single process, which I would advice against, then at least avoid the load-file if the file has already been loaded. Maybe use `require` with a file argument instead of load-file.

As a last resort you could insert something like (ert-delete-all-tests) between the test runs, but then you would no longer get the benefit of the global ERT test duplication check that is the one advantage of running them all in the same process.






^ permalink raw reply	[flat|nested] 19+ messages in thread

* bug#51941: Detect duplication of ERT tests
  2021-11-24 10:00         ` Mattias Engdegård
@ 2021-11-24 12:16           ` Ihor Radchenko
  0 siblings, 0 replies; 19+ messages in thread
From: Ihor Radchenko @ 2021-11-24 12:16 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 51941, Lars Ingebrigtsen

Mattias Engdegård <mattiase@acm.org> writes:

> Best practice is to run each ERT test file in a separate Emacs process. This ensures isolation between the tests, which can be quite important, but also permits them to be run in parallel. Easiest is to do this from a Makefile but you could of course do it from a controlling Emacs process if you prefer that.

Makes sense, though it is not trivial to change the existing code.

> Now if you for some reason want to stick with running them all in a single process, which I would advice against, then at least avoid the load-file if the file has already been loaded. Maybe use `require` with a file argument instead of load-file.

Using require indeed solved the issue.

Best,
Ihor





^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-11-24 12:16 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-18 10:16 bug#51941: Detect duplication of ERT tests Mattias Engdegård
2021-11-18 10:35 ` Lars Ingebrigtsen
2021-11-18 11:54   ` Mattias Engdegård
2021-11-18 11:58     ` Lars Ingebrigtsen
2021-11-18 17:44     ` Lars Ingebrigtsen
2021-11-18 17:53       ` Philipp Stephani
2021-11-18 19:13       ` Mattias Engdegård
2021-11-18 19:34         ` Lars Ingebrigtsen
2021-11-18 19:43           ` Mattias Engdegård
2021-11-18 22:51         ` Glenn Morris
2021-11-19  5:28           ` Lars Ingebrigtsen
2021-11-19  9:33             ` Mattias Engdegård
2021-11-20  8:38               ` Lars Ingebrigtsen
2021-11-20  8:49                 ` Mattias Engdegård
2021-11-24  7:30   ` Ihor Radchenko
2021-11-24  8:52     ` Mattias Engdegård
2021-11-24  9:29       ` Ihor Radchenko
2021-11-24 10:00         ` Mattias Engdegård
2021-11-24 12:16           ` Ihor Radchenko

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).