unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#73312: 31.0.50; textsec test failure because of UTS #46 changes
From: Robert Pluim @ 2024-09-17 10:11 UTC (permalink / raw)
  To: 73312

Following the update to Unicode 16, the textsec tests now fail:

  GEN      lisp/international/textsec-tests.log
Running 12 tests (2024-09-17 11:51:51+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
   passed   1/12  test-confusable (0.001562 sec)
   passed   2/12  test-minimal-scripts (0.000155 sec)
   passed   3/12  test-mixed-numbers (0.000846 sec)
   passed   4/12  test-resolved (0.000120 sec)
   passed   5/12  test-restriction-level (0.000259 sec)
   passed   6/12  test-scripts (0.000354 sec)
   passed   7/12  test-suspicious-email (0.001587 sec)
   passed   8/12  test-suspicious-link (0.015283 sec)
   passed   9/12  test-suspicious-local (0.000522 sec)
   passed  10/12  test-suspicious-name (0.000420 sec)
   passed  11/12  test-suspicious-url (0.000498 sec)
Test test-suspiction-domain backtrace:
  signal(ert-test-failed (((should (textsec-domain-suspicious-p "foo/b
  ert-fail(((should (textsec-domain-suspicious-p "foo/bar.org")) :form
  (if (unwind-protect (setq value-222 (apply fn-220 args-221)) (setq f
  (let (form-description-224) (if (unwind-protect (setq value-222 (app
  (let ((value-222 'ert-form-evaluation-aborted-223)) (let (form-descr
  (let* ((fn-220 #'textsec-domain-suspicious-p) (args-221 (condition-c
  #f(lambda () [t] (let* ((fn-220 #'textsec-domain-suspicious-p) (args
  #f(compiled-function () #<bytecode -0x167cc0e1752f76aa>)()
  handler-bind-1(#f(compiled-function () #<bytecode -0x167cc0e1752f76a
  ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
  ert-run-test(#s(ert-test :name test-suspiction-domain :documentation
  ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
  ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil
  ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp))))
  ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco
  eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n
  command-line-1(("-L" ":." "-l" "ert" "--eval" "(setq treesit-extra-l
  command-line()
  normal-top-level()
Test test-suspiction-domain condition:
    (ert-test-failed
     ((should (textsec-domain-suspicious-p "foo/bar.org")) :form
      (textsec-domain-suspicious-p "foo/bar.org") :value nil))
   FAILED  12/12  test-suspiction-domain (0.000228 sec) at lisp/international/textsec-tests.el:114

Ran 12 tests, 11 results as expected, 1 unexpected (2024-09-17 11:51:51+0200, 0.102143 sec)

1 unexpected results:
   FAILED  test-suspiction-domain

This is because UTS #46, in its infinite wisdom, has changed the rules
for deciding which characters are allowed in a domain name.
Previously, IdnaMappingTable.txt contained e.g.:

002F          ; disallowed_STD3_valid                  # 1.1  SOLIDUS

but now it contains:

002F          ; valid      ;      ; NV8    # 1.1  SOLIDUS

with a change to section 4.1 of UTS #46 saying that only [a-z0-9-] are
allowed for ASCII. Note that theyʼve helpfully marked
valid-but-invalid-in-IDNA characters with either NV8 or XV8, but then
have unhelpfully said that those markings are not normative. <sigh>
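
To make the ASCII rule concrete, here is a minimal sketch of that check
in isolation. The function name is hypothetical and this is not how
textsec.el does it (textsec consults the generated IDNA mapping table).
Downcasing first and skipping the dot as a label separator are
assumptions of the sketch.

```elisp
;; Hypothetical helper, not part of textsec.el: collect the ASCII
;; characters in DOMAIN that fall outside [a-z0-9-].  The dot is skipped
;; because it separates labels (an assumption of this sketch), and
;; non-ASCII characters are ignored here.
(defun my-idna-bad-ascii-chars (domain)
  "Return a list of ASCII characters in DOMAIN outside [a-z0-9-]."
  (let (bad)
    (dolist (ch (string-to-list (downcase domain)))
      (when (and (< ch 128)
                 (/= ch ?.)
                 (not (or (<= ?a ch ?z)
                          (<= ?0 ch ?9)
                          (= ch ?-))))
        (push ch bad)))
    (nreverse bad)))

(my-idna-bad-ascii-chars "foo/bar.org")  ; => (47), i.e. SOLIDUS
(my-idna-bad-ascii-chars "foo-bar.org")  ; => nil
```

With such a check, "foo/bar.org" from the failing test is flagged
because of the SOLIDUS, which is the result the test expects.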

Anyway, willfully ignoring their verbiage about normative markings,
the following fixes it for me, at least until the next version of UTS
#46, I guess.

diff --git a/admin/unidata/unidata-gen.el b/admin/unidata/unidata-gen.el
index 7be03fe63af..adbe9c83670 100644
--- a/admin/unidata/unidata-gen.el
+++ b/admin/unidata/unidata-gen.el
@@ -1598,15 +1598,21 @@ unidata-gen-idna-mapping
   (let ((map (make-char-table nil)))
     (with-temp-buffer
       (unidata-gen--insert-file "IdnaMappingTable.txt")
-      (while (re-search-forward "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?"
+      (while (re-search-forward "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?\\(?:; \\(NV8\\|XV8\\)\\)?"
                                 nil t)
         (let ((start (match-string 1))
               (end (match-string 2))
               (status (match-string 3))
-              (mapped (match-string 4)))
+              (mapped (match-string 4))
+              (idna-status (match-string 5)))
           ;; Make reading the file slightly faster by using `t'
           ;; instead of `disallowed' all over the place.
-          (when (string-match-p "\\`disallowed" status)
+          (when (or (string-match-p "\\`disallowed" status)
+                    ;; UTS#46 messed us about with "status = valid" for
+                    ;; invalid characters, so we need to check for "NV8" or
+                    ;; "XV8".
+                    (string= idna-status "NV8")
+                    (string= idna-status "XV8"))
             (setq status "t"))
           (unless (or (equal status "valid")
                       (equal status "deviation"))
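
For anyone who wants to try the new regexp without regenerating the
tables, the sketch below runs it against single lines. The regexp is
copied from the patch; the constant and function names are
hypothetical, and the third sample line is made up for illustration
rather than quoted from IdnaMappingTable.txt.

```elisp
(defconst my-idna-line-regexp
  "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?\\(?:; \\(NV8\\|XV8\\)\\)?"
  "Regexp copied from the patched `unidata-gen-idna-mapping'.")

;; Hypothetical helper: apply the same test as the patch to one LINE.
(defun my-idna-line-disallowed-p (line)
  "Return t if LINE has a disallowed status or an NV8/XV8 marking."
  (with-temp-buffer
    (insert line "\n")
    (goto-char (point-min))
    (when (re-search-forward my-idna-line-regexp nil t)
      (let ((status (match-string 3))
            (idna-status (match-string 5)))
        (and (or (string-match-p "\\`disallowed" status)
                 (member idna-status '("NV8" "XV8")))
             t)))))

;; Old format: status begins with "disallowed".
(my-idna-line-disallowed-p
 "002F          ; disallowed_STD3_valid                  # 1.1  SOLIDUS")
;; New format: status is "valid", but the line carries NV8.
(my-idna-line-disallowed-p
 "002F          ; valid      ;      ; NV8    # 1.1  SOLIDUS")
;; Made-up line with plain "valid" and no marking.
(my-idna-line-disallowed-p
 "0061..007A    ; valid      # illustrative line")
```

The first two calls should return t and the third nil, so SOLIDUS is
treated as disallowed under both the old and the new table format.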



Robert

* bug#73312: 31.0.50; textsec test failure because of UTS #46 changes
From: Eli Zaretskii @ 2024-09-17 13:10 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 73312

> From: Robert Pluim <rpluim@gmail.com>
> Date: Tue, 17 Sep 2024 12:11:50 +0200
> 
> [...]
> 
> Anyway, willfully ignoring their verbiage about normative markings,
> the following fixes it for me, at least until the next version of UTS
> #46, I guess.

Please install, and thanks.






* bug#73312: 31.0.50; textsec test failure because of UTS #46 changes
From: Robert Pluim @ 2024-09-17 13:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 73312

tags 73312 fixed
close 73312 31.1
quit

>>>>> On Tue, 17 Sep 2024 16:10:38 +0300, Eli Zaretskii <eliz@gnu.org> said:


    Eli> Please install, and thanks.

Closing.
Committed as 7d365a2d72d

Robert
