* bug#30219: 27.0.50; (should (equal ...)) bug for string equality @ 2018-01-22 22:55 Philipp 2019-10-09 21:29 ` Lars Ingebrigtsen 2019-10-10 14:34 ` Mattias Engdegård 0 siblings, 2 replies; 8+ messages in thread From: Philipp @ 2018-01-22 22:55 UTC (permalink / raw) To: 30219 Define the following ERT test: (ert-deftest foo () (should (equal "a\xFF" "a\u00FF"))) Then run M-x ert. The test triggers an assertion. This is because ERT incorrectly assumes that two arrays are equal if their lengths and elements are equal, but that's not the case when comparing unibyte and multibyte strings. In GNU Emacs 27.0.50 (build 10, x86_64-apple-darwin17.3.0, NS appkit-1561.20 Version 10.13.2 (Build 17C205)) of 2018-01-22 built on p Repository revision: 3558d96b60393893a346f4382b813ca0738f9d9b Windowing system distributor 'Apple', version 10.3.1561 System Description: Mac OS X 10.13.2 Recent messages: For information about GNU Emacs and the GNU system, type C-h C-a. Configured using: 'configure --without-threads --with-modules --without-pop --with-mailutils --enable-gcc-warnings=yes --enable-checking --enable-check-lisp-object-type 'CFLAGS=-ggdb3 -O0'' Configured features: NOTIFY ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES JSON Important settings: value of $LANG: de_DE.UTF-8 locale-coding-system: utf-8-unix Major mode: Lisp Interaction Minor modes in effect: tooltip-mode: t global-eldoc-mode: t eldoc-mode: t electric-indent-mode: t mouse-wheel-mode: t tool-bar-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t line-number-mode: t transient-mark-mode: t Load-path shadows: None found. Features: (shadow sort mail-extr emacsbug message rmc puny seq byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib dired dired-loaddefs format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils elec-pair time-date tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/ns-win ns-win ucs-normalize mule-util term/common-win tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote kqueue cocoa ns multi-tty make-network-process emacs) Memory information: ((conses 16 204871 8960) (symbols 48 20154 1) (miscs 40 57 145) (strings 32 28911 1996) (string-bytes 1 771735) (vectors 16 35241) (vector-slots 8 721940 14772) (floats 8 52 64) (intervals 56 208 0) (buffers 992 11)) ^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2018-01-22 22:55 bug#30219: 27.0.50; (should (equal ...)) bug for string equality Philipp @ 2019-10-09 21:29 ` Lars Ingebrigtsen 2019-10-10 14:34 ` Mattias Engdegård 1 sibling, 0 replies; 8+ messages in thread From: Lars Ingebrigtsen @ 2019-10-09 21:29 UTC (permalink / raw) To: Philipp; +Cc: 30219 Philipp <p.stephani2@gmail.com> writes: > Define the following ERT test: > > (ert-deftest foo () > (should (equal "a\xFF" "a\u00FF"))) > > Then run M-x ert. The test triggers an assertion. This is because ERT > incorrectly assumes that two arrays are equal if their lengths and > elements are equal, but that's not the case when comparing unibyte and > multibyte strings. I tried putting (ert-deftest foo () (should (equal "a\xFF" "a\u00FF"))) into one of the test files and I got 1 unexpected results: FAILED foo So I'm not able to reproduce this bug. But the code in question hasn't changed since this was reported, so I'm not sure why I can't reproduce it. Does anybody else see this? ((pred arrayp) (if (/= (length a) (length b)) `(arrays-of-different-length ,(length a) ,(length b) ,a ,b ,@(unless (char-table-p a) `(first-mismatch-at ,(cl-mismatch a b :test 'equal)))) (cl-loop for i from 0 for ai across a for bi across b for xi = (ert--explain-equal-rec ai bi) do (when xi (cl-return `(array-elt ,i ,xi))) finally (cl-assert (equal a b) t)))) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2018-01-22 22:55 bug#30219: 27.0.50; (should (equal ...)) bug for string equality Philipp 2019-10-09 21:29 ` Lars Ingebrigtsen @ 2019-10-10 14:34 ` Mattias Engdegård 2019-10-10 19:29 ` Mattias Engdegård 1 sibling, 1 reply; 8+ messages in thread From: Mattias Engdegård @ 2019-10-10 14:34 UTC (permalink / raw) To: 30219; +Cc: Lars Ingebrigtsen, Philipp The bug is directly reproducible, and not very surprising given the subtleties in mixed uni/multibyte string comparisons. ERT treats strings as arrays to be compared element-wise via `aref', which isn't how `equal' works. There are several solutions, none perfect. I'll be back with a patch later today or tomorrow. ^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2019-10-10 14:34 ` Mattias Engdegård @ 2019-10-10 19:29 ` Mattias Engdegård 2019-10-11 5:43 ` Lars Ingebrigtsen 0 siblings, 1 reply; 8+ messages in thread From: Mattias Engdegård @ 2019-10-10 19:29 UTC (permalink / raw) To: 30219; +Cc: Lars Ingebrigtsen, Philipp [-- Attachment #1: Type: text/plain, Size: 125 bytes --] > There are several solutions, none perfect. I'll be back with a patch later today or tomorrow. Please try this patch. [-- Attachment #2: 0001-Correctly-explain-test-failures-with-mixed-uni-multi.patch --] [-- Type: application/octet-stream, Size: 3827 bytes --] From dc8570aee0fd01ae55c2214160fb609626fbcf4e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Thu, 10 Oct 2019 21:20:20 +0200 Subject: [PATCH] Correctly explain test failures with mixed uni/multibyte strings * lisp/emacs-lisp/ert.el (ert--explain-equal-rec) (ert--explain-equal-including-properties): * test/lisp/emacs-lisp/ert-tests.el (ert-test-explain-equal): When explaining a difference between a unibyte and a multibyte string, first convert both to multibyte. Otherwise, we might end up comparing unibyte char C to multibyte char C, 127<C<256, and not detect the difference (bug#30219). --- lisp/emacs-lisp/ert.el | 9 +++++++++ test/lisp/emacs-lisp/ert-tests.el | 23 +++++++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/lisp/emacs-lisp/ert.el b/lisp/emacs-lisp/ert.el index da241e6304..880a40dafd 100644 --- a/lisp/emacs-lisp/ert.el +++ b/lisp/emacs-lisp/ert.el @@ -512,6 +512,11 @@ ert--explain-equal-rec (cl-assert (equal a b) t) nil)))))))) ((pred arrayp) + ;; For mixed unibyte/multibyte string comparisons, make both multibyte. + (when (and (stringp a) + (xor (multibyte-string-p a) (multibyte-string-p b))) + (setq a (string-to-multibyte a)) + (setq b (string-to-multibyte b))) (if (/= (length a) (length b)) `(arrays-of-different-length ,(length a) ,(length b) ,a ,b @@ -606,6 +611,10 @@ ert--explain-equal-including-properties (cl-assert (stringp a) t) (cl-assert (stringp b) t) (cl-assert (eql (length a) (length b)) t) + ;; For mixed unibyte/multibyte comparisons, make both multibyte. + (when (xor (multibyte-string-p a) (multibyte-string-p b)) + (setq a (string-to-multibyte a)) + (setq b (string-to-multibyte b))) (cl-loop for i from 0 to (length a) for props-a = (text-properties-at i a) for props-b = (text-properties-at i b) diff --git a/test/lisp/emacs-lisp/ert-tests.el b/test/lisp/emacs-lisp/ert-tests.el index 36db1eeb42..3a9e81595b 100644 --- a/test/lisp/emacs-lisp/ert-tests.el +++ b/test/lisp/emacs-lisp/ert-tests.el @@ -627,6 +627,29 @@ ert-test-explain-equal (should (equal (ert--explain-equal 'a sym) `(different-symbols-with-the-same-name a ,sym))))) +(ert-deftest ert-test-explain-equal-strings () + (should (equal (ert--explain-equal "abc" "axc") + '(array-elt 1 (different-atoms + (?b "#x62" "?b") + (?x "#x78" "?x"))))) + (should (equal (ert--explain-equal "abc" "abxc") + '(arrays-of-different-length + 3 4 "abc" "abxc" first-mismatch-at 2))) + (should (equal (ert--explain-equal "xyA" "xyÅ") + '(array-elt 2 (different-atoms + (?A "#x41" "?A") + (?Å "#xc5" "?Å"))))) + (should (equal (ert--explain-equal "m\xff" "m\u00ff") + `(array-elt + 1 (different-atoms + (#x3fffff "#x3fffff" ,(string-to-multibyte "?\xff")) + (#xff "#xff" "?ÿ"))))) + (should (equal (ert--explain-equal (string-to-multibyte "m\xff") "m\u00ff") + `(array-elt + 1 (different-atoms + (#x3fffff "#x3fffff" ,(string-to-multibyte "?\xff")) + (#xff "#xff" "?ÿ")))))) + (ert-deftest ert-test-explain-equal-improper-list () (should (equal (ert--explain-equal '(a . b) '(a . c)) '(cdr (different-atoms b c))))) -- 2.21.0 (Apple Git-122) ^ permalink raw reply related [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2019-10-10 19:29 ` Mattias Engdegård @ 2019-10-11 5:43 ` Lars Ingebrigtsen 2019-10-11 9:21 ` Mattias Engdegård 0 siblings, 1 reply; 8+ messages in thread From: Lars Ingebrigtsen @ 2019-10-11 5:43 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 30219, Philipp Mattias Engdegård <mattiase@acm.org> writes: > ((pred arrayp) > + ;; For mixed unibyte/multibyte string comparisons, make both multibyte. > + (when (and (stringp a) > + (xor (multibyte-string-p a) (multibyte-string-p b))) > + (setq a (string-to-multibyte a)) > + (setq b (string-to-multibyte b))) I guess that makes sense. We (possibly) lose text properties here, but we don't care about those anyway in this context, I think? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2019-10-11 5:43 ` Lars Ingebrigtsen @ 2019-10-11 9:21 ` Mattias Engdegård 2019-10-13 18:10 ` Lars Ingebrigtsen 0 siblings, 1 reply; 8+ messages in thread From: Mattias Engdegård @ 2019-10-11 9:21 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 30219, Philipp [-- Attachment #1: Type: text/plain, Size: 245 bytes --] 11 okt. 2019 kl. 07.43 skrev Lars Ingebrigtsen <larsi@gnus.org>: > > I guess that makes sense. We (possibly) lose text properties here, but > we don't care about those anyway in this context, I think? Oops, that wasn't intended. Second try. [-- Attachment #2: 0001-Correctly-explain-test-failures-with-mixed-uni-multi.patch --] [-- Type: application/octet-stream, Size: 3238 bytes --] From 48514a003a591ca6ef9d5b2cc9678727b3d93b47 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Thu, 10 Oct 2019 21:20:20 +0200 Subject: [PATCH] Correctly explain test failures with mixed uni/multibyte strings * lisp/emacs-lisp/ert.el (ert--explain-equal-rec): * test/lisp/emacs-lisp/ert-tests.el (ert-test-explain-equal): When explaining a difference between a unibyte and a multibyte string, first convert both to multibyte. Otherwise, we might end up comparing unibyte char C to multibyte char C, 127<C<256, and not detect the difference (bug#30219). --- lisp/emacs-lisp/ert.el | 5 +++++ test/lisp/emacs-lisp/ert-tests.el | 23 +++++++++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/lisp/emacs-lisp/ert.el b/lisp/emacs-lisp/ert.el index da241e6304..16a6d753f3 100644 --- a/lisp/emacs-lisp/ert.el +++ b/lisp/emacs-lisp/ert.el @@ -512,6 +512,11 @@ ert--explain-equal-rec (cl-assert (equal a b) t) nil)))))))) ((pred arrayp) + ;; For mixed unibyte/multibyte string comparisons, make both multibyte. + (when (and (stringp a) + (xor (multibyte-string-p a) (multibyte-string-p b))) + (setq a (string-to-multibyte a)) + (setq b (string-to-multibyte b))) (if (/= (length a) (length b)) `(arrays-of-different-length ,(length a) ,(length b) ,a ,b diff --git a/test/lisp/emacs-lisp/ert-tests.el b/test/lisp/emacs-lisp/ert-tests.el index 36db1eeb42..3a9e81595b 100644 --- a/test/lisp/emacs-lisp/ert-tests.el +++ b/test/lisp/emacs-lisp/ert-tests.el @@ -627,6 +627,29 @@ ert-test-explain-equal (should (equal (ert--explain-equal 'a sym) `(different-symbols-with-the-same-name a ,sym))))) +(ert-deftest ert-test-explain-equal-strings () + (should (equal (ert--explain-equal "abc" "axc") + '(array-elt 1 (different-atoms + (?b "#x62" "?b") + (?x "#x78" "?x"))))) + (should (equal (ert--explain-equal "abc" "abxc") + '(arrays-of-different-length + 3 4 "abc" "abxc" first-mismatch-at 2))) + (should (equal (ert--explain-equal "xyA" "xyÅ") + '(array-elt 2 (different-atoms + (?A "#x41" "?A") + (?Å "#xc5" "?Å"))))) + (should (equal (ert--explain-equal "m\xff" "m\u00ff") + `(array-elt + 1 (different-atoms + (#x3fffff "#x3fffff" ,(string-to-multibyte "?\xff")) + (#xff "#xff" "?ÿ"))))) + (should (equal (ert--explain-equal (string-to-multibyte "m\xff") "m\u00ff") + `(array-elt + 1 (different-atoms + (#x3fffff "#x3fffff" ,(string-to-multibyte "?\xff")) + (#xff "#xff" "?ÿ")))))) + (ert-deftest ert-test-explain-equal-improper-list () (should (equal (ert--explain-equal '(a . b) '(a . c)) '(cdr (different-atoms b c))))) -- 2.21.0 (Apple Git-122) ^ permalink raw reply related [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2019-10-11 9:21 ` Mattias Engdegård @ 2019-10-13 18:10 ` Lars Ingebrigtsen 2019-10-13 18:30 ` Mattias Engdegård 0 siblings, 1 reply; 8+ messages in thread From: Lars Ingebrigtsen @ 2019-10-13 18:10 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 30219, Philipp Mattias Engdegård <mattiase@acm.org> writes: > ((pred arrayp) > + ;; For mixed unibyte/multibyte string comparisons, make both multibyte. > + (when (and (stringp a) > + (xor (multibyte-string-p a) (multibyte-string-p b))) > + (setq a (string-to-multibyte a)) > + (setq b (string-to-multibyte b))) Looks good to me, I think. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#30219: 27.0.50; (should (equal ...)) bug for string equality 2019-10-13 18:10 ` Lars Ingebrigtsen @ 2019-10-13 18:30 ` Mattias Engdegård 0 siblings, 0 replies; 8+ messages in thread From: Mattias Engdegård @ 2019-10-13 18:30 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 30219-done, Philipp 13 okt. 2019 kl. 20.10 skrev Lars Ingebrigtsen <larsi@gnus.org>: > > Looks good to me, I think. Thank you, pushed to master and bug closed. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2019-10-13 18:30 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-01-22 22:55 bug#30219: 27.0.50; (should (equal ...)) bug for string equality Philipp 2019-10-09 21:29 ` Lars Ingebrigtsen 2019-10-10 14:34 ` Mattias Engdegård 2019-10-10 19:29 ` Mattias Engdegård 2019-10-11 5:43 ` Lars Ingebrigtsen 2019-10-11 9:21 ` Mattias Engdegård 2019-10-13 18:10 ` Lars Ingebrigtsen 2019-10-13 18:30 ` Mattias Engdegård
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).