* bug#73312: 31.0.50; textsec test failure because of UTS #46 changes
From: Robert Pluim @ 2024-09-17 10:11 UTC
To: 73312
Following the update to Unicode 16, the textsec tests now fail:
GEN lisp/international/textsec-tests.log
Running 12 tests (2024-09-17 11:51:51+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
passed 1/12 test-confusable (0.001562 sec)
passed 2/12 test-minimal-scripts (0.000155 sec)
passed 3/12 test-mixed-numbers (0.000846 sec)
passed 4/12 test-resolved (0.000120 sec)
passed 5/12 test-restriction-level (0.000259 sec)
passed 6/12 test-scripts (0.000354 sec)
passed 7/12 test-suspicious-email (0.001587 sec)
passed 8/12 test-suspicious-link (0.015283 sec)
passed 9/12 test-suspicious-local (0.000522 sec)
passed 10/12 test-suspicious-name (0.000420 sec)
passed 11/12 test-suspicious-url (0.000498 sec)
Test test-suspiction-domain backtrace:
signal(ert-test-failed (((should (textsec-domain-suspicious-p "foo/b
ert-fail(((should (textsec-domain-suspicious-p "foo/bar.org")) :form
(if (unwind-protect (setq value-222 (apply fn-220 args-221)) (setq f
(let (form-description-224) (if (unwind-protect (setq value-222 (app
(let ((value-222 'ert-form-evaluation-aborted-223)) (let (form-descr
(let* ((fn-220 #'textsec-domain-suspicious-p) (args-221 (condition-c
#f(lambda () [t] (let* ((fn-220 #'textsec-domain-suspicious-p) (args
#f(compiled-function () #<bytecode -0x167cc0e1752f76aa>)()
handler-bind-1(#f(compiled-function () #<bytecode -0x167cc0e1752f76a
ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
ert-run-test(#s(ert-test :name test-suspiction-domain :documentation
ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil
ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp))))
ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco
eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n
command-line-1(("-L" ":." "-l" "ert" "--eval" "(setq treesit-extra-l
command-line()
normal-top-level()
Test test-suspiction-domain condition:
(ert-test-failed
((should (textsec-domain-suspicious-p "foo/bar.org")) :form
(textsec-domain-suspicious-p "foo/bar.org") :value nil))
FAILED 12/12 test-suspiction-domain (0.000228 sec) at lisp/international/textsec-tests.el:114
Ran 12 tests, 11 results as expected, 1 unexpected (2024-09-17 11:51:51+0200, 0.102143 sec)
1 unexpected results:
FAILED test-suspiction-domain
This is because the UTS #46 authors, in their infinite wisdom, have
decided to change the rules for checking which characters are allowed
in a domain name. Previously, IdnaMappingTable.txt contained, e.g.:
002F ; disallowed_STD3_valid # 1.1 SOLIDUS
but now it contains
002F ; valid ; ; NV8 # 1.1 SOLIDUS
together with a change to section 4.1 of UTS #46 saying that only
[a-z0-9-] are allowed for ASCII. Note that they've helpfully marked
the valid-but-invalid-in-IDNA characters with either NV8 or XV8, but
then unhelpfully said that those markings are not normative. <sigh>
Anyway, willfully ignoring their verbiage about normative markings,
the following fixes it for me, at least until the next version of UTS
#46, I guess.
diff --git a/admin/unidata/unidata-gen.el b/admin/unidata/unidata-gen.el
index 7be03fe63af..adbe9c83670 100644
--- a/admin/unidata/unidata-gen.el
+++ b/admin/unidata/unidata-gen.el
@@ -1598,15 +1598,21 @@ unidata-gen-idna-mapping
(let ((map (make-char-table nil)))
(with-temp-buffer
(unidata-gen--insert-file "IdnaMappingTable.txt")
- (while (re-search-forward "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?"
+ (while (re-search-forward "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?\\(?:; \\(NV8\\|XV8\\)\\)?"
nil t)
(let ((start (match-string 1))
(end (match-string 2))
(status (match-string 3))
- (mapped (match-string 4)))
+ (mapped (match-string 4))
+ (idna-status (match-string 5)))
;; Make reading the file slightly faster by using `t'
;; instead of `disallowed' all over the place.
- (when (string-match-p "\\`disallowed" status)
+ (when (or (string-match-p "\\`disallowed" status)
+ ;; UTS#46 messed us about with "status = valid" for
+ ;; invalid characters, so we need to check for "NV8" or
+ ;; "XV8".
+ (string= idna-status "NV8")
+ (string= idna-status "XV8"))
(setq status "t"))
(unless (or (equal status "valid")
(equal status "deviation"))
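
As a rough illustration of what the extended regexp in the patch
captures, the sketch below applies it to single table lines. It is not
part of the patch: the names `my/idna-line-regexp' and
`my/idna-disallowed-p' are hypothetical, and the sample lines in the
usage note assume the column padding used in the real
IdnaMappingTable.txt (several spaces between fields), which the
single-spaced excerpts in this message do not show. That padding seems
to matter: with only one space between the two semicolons, the empty
mapping field would not match and group 5 would stay nil.

```elisp
;; Illustrative sketch only; the names below are hypothetical and do
;; not exist in unidata-gen.el.
(defconst my/idna-line-regexp
  (concat "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +"
          "\\(?:; +\\([ 0-9A-F]+\\)\\)?\\(?:; \\(NV8\\|XV8\\)\\)?")
  "The regexp from the patch, split in two pieces for readability.")

(defun my/idna-disallowed-p (line)
  "Return non-nil if LINE would be treated as disallowed.
This mirrors the patched condition: the status starts with
\"disallowed\", or the IDNA2008 status field is NV8 or XV8.
Return nil if LINE does not match the regexp at all."
  (when (string-match my/idna-line-regexp line)
    (let ((status (match-string 3 line))
          (idna-status (match-string 5 line)))
      (or (string-match-p "\\`disallowed" status)
          (member idna-status '("NV8" "XV8"))))))
```

Under those assumptions, both the old-style line
"002F          ; disallowed_STD3_valid                  # 1.1  SOLIDUS"
and the new-style line
"002F          ; valid                  ;      ; NV8    # 1.1  SOLIDUS"
should yield a non-nil result, while a plain "valid" line with no
NV8/XV8 field should yield nil.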
Robert
From: Eli Zaretskii @ 2024-09-17 13:10 UTC
To: Robert Pluim; +Cc: 73312
> From: Robert Pluim <rpluim@gmail.com>
> Date: Tue, 17 Sep 2024 12:11:50 +0200
>
> Anyway, willfully ignoring their verbiage about normative markings,
> the following fixes it for me, at least until the next version of UTS
> #46, I guess.
Please install, and thanks.