Eli Zaretskii writes:

> However, I'm worried that we have no test for ucs-normalize, so it's
> hard to be sure the non-trivial functionality is unchanged, even
> though your changes are pretty straightforward.
>
> How about adding a test that uses the data in this file:
>
>   http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt
>
> ucs-normalize claims to have passed an old version of this, but I see
> no existing way of re-running that test, did I miss something?

I don't see any evidence of an existing test. I started writing a new
one, and it's failing with the original ucs-normalize.el (or I'm
misunderstanding the requirements).

The first invariant to test is

    c2 == toNFC(c1) == toNFC(c2) == toNFC(c3)

(cX is column X, columns numbered from 1). Line 15131 of
NormalizationTest.txt has

    # c1 c2 c3
    1112E;1112E;11131 11127;1112E;11131 11127; # (◌𑄮; ◌𑄮; ◌𑄱◌𑄧; ◌𑄮; ◌𑄱◌𑄧; ) CHAKMA VOWEL SIGN O

So I think toNFC(c3) == c2 is equivalent to

    (equal (ucs-normalize-NFC-string (string #x11131 #x11127))
           (string #x1112E))

which gives nil.

Lines 15131 to 15139 and 16149 to 16289 are failing.

To check the invariants for a single line, load the attached
ucs-normalize-tests.el, put point at the beginning of the line, and
evaluate

    (ucs-normalize-tests--invariants-hold-p
     (ucs-normalize-tests--parse-column)
     (ucs-normalize-tests--parse-column)
     (ucs-normalize-tests--parse-column)
     (ucs-normalize-tests--parse-column)
     (ucs-normalize-tests--parse-column))
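
For anyone who wants to try the first invariant without the attachment,
here is a minimal standalone sketch. It is not the code in
ucs-normalize-tests.el: the names my-normtest-parse-line and
my-normtest-nfc-invariant-p are hypothetical, and it assumes the line is
passed in as a string with the five semicolon-separated columns first.
Only ucs-normalize-NFC-string comes from ucs-normalize.el.

```elisp
(require 'seq)
(require 'ucs-normalize)

;; Hypothetical helper, not part of ucs-normalize-tests.el.
(defun my-normtest-parse-line (line)
  "Return the five columns of a NormalizationTest.txt LINE as strings.
Each column is a space-separated list of hexadecimal code points."
  (mapcar (lambda (column)
            (apply #'string
                   (mapcar (lambda (hex) (string-to-number hex 16))
                           (split-string column " " t))))
          ;; Only the first five semicolon-separated fields are data;
          ;; whatever follows is the trailing comment.
          (seq-take (split-string line ";") 5)))

;; Hypothetical helper, not part of ucs-normalize-tests.el.
(defun my-normtest-nfc-invariant-p (line)
  "Return non-nil if c2 == toNFC(c1) == toNFC(c2) == toNFC(c3) for LINE."
  (let* ((columns (my-normtest-parse-line line))
         (c1 (nth 0 columns))
         (c2 (nth 1 columns))
         (c3 (nth 2 columns)))
    (and (equal c2 (ucs-normalize-NFC-string c1))
         (equal c2 (ucs-normalize-NFC-string c2))
         (equal c2 (ucs-normalize-NFC-string c3)))))
```

Evaluating

    (my-normtest-nfc-invariant-p
     "1112E;1112E;11131 11127;1112E;11131 11127;")

applies the same check to line 15131 as the (equal ...) form above.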