bug#25646: [PATCH 0/3] Minor casing improvements

* bug#25646: [PATCH 0/3] Minor casing improvements
@ 2017-02-07 18:04 Michal Nazarewicz
  2017-02-07 18:05 ` bug#25646: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2017-02-07 18:04 UTC (permalink / raw)
  To: 25646

If there are no objections, I’ll submit it in a week or so.

This is split from bug#24424, which contains many more changes.
Originally I hoped to get all the patches in bug#24424 to a state
where they could be upstreamed quickly, but for various reasons it is
taking a lot longer.  Because of that, I’ll try to submit smaller,
self-contained chunks of it separately so that new features and fixes
show up in Emacs faster.

Michal Nazarewicz (3):
  Add tests for casefiddle.c
  Generate upcase and downcase tables from Unicode data
  Don’t assume character can be either upper- or lower-case when casing

 etc/NEWS                         |   8 +-
 lisp/international/characters.el | 345 ++++++++-------------------------------
 src/buffer.h                     |  18 +-
 src/casefiddle.c                 |  20 +--
 src/keyboard.c                   |  25 +--
 test/src/casefiddle-tests.el     | 246 ++++++++++++++++++++++++++++
 6 files changed, 354 insertions(+), 308 deletions(-)
 create mode 100644 test/src/casefiddle-tests.el

-- 
2.11.0.483.g087da7b7c-goog

* bug#25646: [PATCH 1/3] Add tests for casefiddle.c
  2017-02-07 18:04 bug#25646: [PATCH 0/3] Minor casing improvements Michal Nazarewicz
@ 2017-02-07 18:05 ` Michal Nazarewicz
  2017-02-07 18:05   ` bug#25646: [PATCH 2/3] Generate upcase and downcase tables from Unicode data Michal Nazarewicz
  2017-02-07 18:05   ` bug#25646: [PATCH 3/3] Don’t assume character can be either upper- or lower-case when casing Michal Nazarewicz
  2017-02-10  8:47 ` bug#25646: [PATCH 0/3] Minor casing improvements Eli Zaretskii
  2017-02-15 16:13 ` Michal Nazarewicz
  2 siblings, 2 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2017-02-07 18:05 UTC (permalink / raw)
  To: 25646

* test/src/casefiddle-tests.el (casefiddle-tests-char-properties,
casefiddle-tests-case-table, casefiddle-tests-casing-character,
casefiddle-tests-casing-word, casefiddle-tests-casing,
casefiddle-tests-casing-byte8,
casefiddle-tests-casing-byte8-with-changes): New tests.
(casefiddle-tests--test-casing): New helper function for running
some of the tests.
---
 test/src/casefiddle-tests.el | 247 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 247 insertions(+)
 create mode 100644 test/src/casefiddle-tests.el

diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
new file mode 100644
index 00000000000..b1abe50fa4e
--- /dev/null
+++ b/test/src/casefiddle-tests.el
@@ -0,0 +1,247 @@
+;;; casefiddle-tests.el --- tests for casefiddle.c functions -*- lexical-binding: t -*-
+
+;; Copyright (C) 2015-2016 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'case-table)
+(require 'ert)
+
+(ert-deftest casefiddle-tests-char-properties ()
+  "Sanity check of character Unicode properties."
+  (should-not
+   (let (errors)
+     ;;            character  uppercase  lowercase  titlecase
+     (dolist (test '((?A nil ?a nil)
+                     (?a ?A nil ?A)
+                     (?Ł nil ?ł nil)
+                     (?ł ?Ł nil ?Ł)
+
+                     (?Ǆ nil ?ǆ ?ǅ)
+                     (?ǅ ?Ǆ ?ǆ ?ǅ)
+                     (?ǆ ?Ǆ nil ?ǅ)
+
+                     (?Σ nil ?σ nil)
+                     (?σ ?Σ nil ?Σ)
+                     (?ς ?Σ nil ?Σ)
+
+                     (?ⅷ ?Ⅷ nil ?Ⅷ)
+                     (?Ⅷ nil ?ⅷ nil)))
+       (let ((ch (car test))
+             (expected (cdr test))
+             (props '(uppercase lowercase titlecase)))
+         (while props
+           (let ((got (get-char-code-property ch (car props))))
+             (unless (equal (car expected) got)
+               (push (format "\n%c %s; expected: %s but got: %s"
+                             ch (car props) (car expected) got)
+                     errors)))
+           (setq props (cdr props) expected (cdr expected)))))
+     (when errors
+       (mapconcat (lambda (line) line) (nreverse errors) "")))))
+
+
+(defconst casefiddle-tests--characters
+  ;; character  uppercase  lowercase  titlecase
+  '((?A ?A ?a ?A)
+    (?a ?A ?a ?A)
+    (?Ł ?Ł ?ł ?Ł)
+    (?ł ?Ł ?ł ?Ł)
+
+    ;; FIXME: We should have:
+    ;;(?Ǆ ?Ǆ ?ǆ ?ǅ)
+    ;; but instead we have:
+    (?Ǆ ?Ǆ ?ǆ ?Ǆ)
+    ;; FIXME: Those two are broken at the moment:
+    ;;(?ǅ ?Ǆ ?ǆ ?ǅ)
+    ;;(?ǆ ?Ǆ ?ǆ ?ǅ)
+
+    (?Σ ?Σ ?σ ?Σ)
+    (?σ ?Σ ?σ ?Σ)
+    ;; FIXME: Another broken one:
+    ;;(?ς ?Σ ?ς ?Σ)
+
+    (?Ⅷ ?Ⅷ ?ⅷ ?Ⅷ)
+    (?ⅷ ?Ⅷ ?ⅷ ?Ⅷ)))
+
+
+(ert-deftest casefiddle-tests-case-table ()
+  "Sanity check of down and up case tables."
+  (should-not
+   (let (errors
+         (up (case-table-get-table (current-case-table) 'up))
+         (down (case-table-get-table (current-case-table) 'down)))
+     (dolist (test casefiddle-tests--characters)
+       (let ((ch (car test))
+             (expected (cdr test))
+             (props '(uppercase lowercase))
+             (tabs (list up down)))
+         (while props
+           (let ((got (aref (car tabs) ch)))
+             (unless (equal (car expected) got)
+               (push (format "\n%c %s; expected: %s but got: %s"
+                             ch (car props) (car expected) got)
+                     errors)))
+           (setq props (cdr props) tabs (cdr tabs) expected (cdr expected)))))
+     (when errors
+       (mapconcat (lambda (line) line) (nreverse errors) "")))))
+
+
+(ert-deftest casefiddle-tests-casing-character ()
+  (should-not
+   (let (errors)
+     (dolist (test casefiddle-tests--characters)
+       (let ((ch (car test))
+             (expected (cdr test))
+             (funcs '(upcase downcase capitalize)))
+         (while funcs
+           (let ((got (funcall (car funcs) ch)))
+             (unless (equal (car expected) got)
+               (push (format "\n%c %s; expected: %s but got: %s"
+                             ch (car funcs) (car expected) got)
+                     errors)))
+           (setq funcs (cdr funcs) expected (cdr expected)))))
+     (when errors
+       (mapconcat (lambda (line) line) (nreverse errors) "")))))
+
+
+(ert-deftest casefiddle-tests-casing-word ()
+  (with-temp-buffer
+    (dolist (test '((upcase-word     . "FOO Bar")
+                    (downcase-word   . "foo Bar")
+                    (capitalize-word . "Foo Bar")))
+      (dolist (back '(nil t))
+        (delete-region (point-min) (point-max))
+        (insert "foO Bar")
+        (goto-char (+ (if back 4 0) (point-min)))
+        (funcall (car test) (if back -1 1))
+        (should (string-equal (cdr test) (buffer-string)))
+        (should (equal (+ (if back 4 3) (point-min)) (point)))))))
+
+
+(defun casefiddle-tests--test-casing (tests)
+  (nreverse
+   (cl-reduce
+    (lambda (errors test)
+      (let* ((input (car test))
+             (expected (cdr test))
+             (func-pairs '((upcase upcase-region)
+                           (downcase downcase-region)
+                           (capitalize capitalize-region)
+                           (upcase-initials upcase-initials-region)))
+             (get-string (lambda (func) (funcall func input)))
+             (get-region (lambda (func)
+                           (delete-region (point-min) (point-max))
+                           (unwind-protect
+                               (progn
+                                 (unless (multibyte-string-p input)
+                                   (toggle-enable-multibyte-characters))
+                                 (insert input)
+                                 (funcall func (point-min) (point-max))
+                                 (buffer-string))
+                             (unless (multibyte-string-p input)
+                               (toggle-enable-multibyte-characters)))))
+             (fmt-str (lambda (str)
+                        (format "%s  (%sbyte; %d chars; %d bytes)"
+                                str
+                                (if (multibyte-string-p str) "multi" "uni")
+                                (length str) (string-bytes str))))
+             funcs getters)
+        (while (and func-pairs expected)
+          (setq funcs (car func-pairs)
+                getters (list get-string get-region))
+          (while (and funcs getters)
+            (let ((got (funcall (car getters) (car funcs))))
+              (unless (string-equal got (car expected))
+                (let ((fmt (length (symbol-name (car funcs)))))
+                  (setq fmt (format "\n%%%ds: %%s" (max fmt 8)))
+                  (push (format (concat fmt fmt fmt)
+                                (car funcs) (funcall fmt-str input)
+                                "expected" (funcall fmt-str (car expected))
+                                "but got" (funcall fmt-str got))
+                        errors))))
+            (setq funcs (cdr funcs) getters (cdr getters)))
+          (setq func-pairs (cdr func-pairs) expected (cdr expected))))
+      errors)
+    (cons () tests))))
+
+(ert-deftest casefiddle-tests-casing ()
+  (should-not
+   (with-temp-buffer
+     (casefiddle-tests--test-casing
+      ;; input     upper     lower    capitalize up-initials
+      '(("Foo baR" "FOO BAR" "foo bar" "Foo Bar" "Foo BaR")
+        ("Ⅷ ⅷ" "Ⅷ Ⅷ" "ⅷ ⅷ" "Ⅷ Ⅷ" "Ⅷ Ⅷ")
+        ;; FIXME: Everything below is broken at the moment.  Here’s what
+        ;; should happen:
+        ;;("ǄUNGLA" "ǄUNGLA" "ǆungla" "ǅungla" "ǅUNGLA")
+        ;;("ǅungla" "ǄUNGLA" "ǆungla" "ǅungla" "ǅungla")
+        ;;("ǆungla" "ǄUNGLA" "ǆungla" "ǅungla" "ǅungla")
+        ;;("deﬁne" "DEFINE" "deﬁne" "Deﬁne" "Deﬁne")
+        ;;("ﬁsh" "FIsh" "ﬁsh" "Fish" "Fish")
+        ;;("Straße" "STRASSE" "straße" "Straße" "Straße")
+        ;;("ΌΣΟΣ" "ΌΣΟΣ" "όσος" "Όσος" "Όσος")
+        ;;("όσος" "ΌΣΟΣ" "όσος" "Όσος" "Όσος")
+        ;; And here’s what is actually happening:
+        ("ǄUNGLA" "ǄUNGLA" "ǆungla" "Ǆungla" "ǄUNGLA")
+        ("ǅungla" "ǅUNGLA" "ǆungla" "ǅungla" "ǅungla")
+        ("ǆungla" "ǄUNGLA" "ǆungla" "Ǆungla" "Ǆungla")
+        ("deﬁne" "DEﬁNE" "deﬁne" "Deﬁne" "Deﬁne")
+        ("ﬁsh" "ﬁSH" "ﬁsh" "ﬁsh" "ﬁsh")
+        ("Straße" "STRAßE" "straße" "Straße" "Straße")
+        ("ΌΣΟΣ" "ΌΣΟΣ" "όσοσ" "Όσοσ" "ΌΣΟΣ")
+        ("όσος" "ΌΣΟς" "όσος" "Όσος" "Όσος"))))))
+
+(ert-deftest casefiddle-tests-casing-byte8 ()
+  (should-not
+   (with-temp-buffer
+     (casefiddle-tests--test-casing
+      '(("\xff Foo baR \xff"
+         "\xff FOO BAR \xff"
+         "\xff foo bar \xff"
+         "\xff Foo Bar \xff"
+         "\xff Foo BaR \xff")
+        ("\xff Zażółć gĘŚlą \xff"
+         "\xff ZAŻÓŁĆ GĘŚLĄ \xff"
+         "\xff zażółć gęślą \xff"
+         "\xff Zażółć Gęślą \xff"
+         "\xff Zażółć GĘŚlą \xff"))))))
+
+(ert-deftest casefiddle-tests-casing-byte8-with-changes ()
+  (let ((tab (copy-case-table (standard-case-table)))
+        (test '("\xff\xff\xef Foo baR \xcf\xcf"
+                "\xef\xef\xef FOO BAR \xcf\xcf"
+                "\xff\xff\xff foo bar \xcf\xcf"
+                "\xef\xff\xff Foo Bar \xcf\xcf"
+                "\xef\xff\xef Foo BaR \xcf\xcf"))
+        (byte8 #x3FFF00))
+    (should-not
+     (with-temp-buffer
+       (set-case-table tab)
+       (set-case-syntax-pair (+ byte8 #xef) (+ byte8 #xff) tab)
+       (casefiddle-tests--test-casing
+        (list test
+              (mapcar (lambda (str) (decode-coding-string str 'binary)) test)
+              '("\xff\xff\xef Zażółć gĘŚlą \xcf\xcf"
+                "\xef\xef\xef ZAŻÓŁĆ GĘŚLĄ \xcf\xcf"
+                "\xff\xff\xff zażółć gęślą \xcf\xcf"
+                "\xef\xff\xff Zażółć Gęślą \xcf\xcf"
+                "\xef\xff\xef Zażółć GĘŚlą \xcf\xcf")))))))
+
+
+;;; casefiddle-tests.el ends here
-- 
2.11.0.483.g087da7b7c-goog

* bug#25646: [PATCH 2/3] Generate upcase and downcase tables from Unicode data
  2017-02-07 18:05 ` bug#25646: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
@ 2017-02-07 18:05   ` Michal Nazarewicz
  2017-02-07 18:05   ` bug#25646: [PATCH 3/3] Don’t assume character can be either upper- or lower-case when casing Michal Nazarewicz
  1 sibling, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2017-02-07 18:05 UTC (permalink / raw)
  To: 25646

Use Unicode data to generate case tables instead of mostly repeating
them in Lisp code.  Do that in a way which maps the ‘Dz’ digraph (and
similar ones) to ‘dz’ when down-casing and to ‘DZ’ when up-casing.

https://debbugs.gnu.org/cgi/bugreport.cgi?msg=89;bug=24603 lists all
changes to the syntax table and case tables introduced by this commit.

* lisp/international/characters.el: Remove case-pairs defined with
explicit Lisp code and instead use Unicode character properties.

* test/src/casefiddle-tests.el (casefiddle-tests--characters,
casefiddle-tests-casing): Update test cases which are now working
as they should.
---
 lisp/international/characters.el | 345 ++++++++-------------------------------
 test/src/casefiddle-tests.el     |   7 +-
 2 files changed, 73 insertions(+), 279 deletions(-)
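
The way this patch fills the tables (identity entries first, real
mappings second) can be tried out in isolation.  What follows is only
a rough sketch in C under stated assumptions, not the code from
characters.el: toy_data, build_case_tables, toy_upcase and
toy_downcase are hypothetical names, the tables are plain arrays
rather than char-tables, and the data covers just the DŽ/Dž/dž family.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy data: the three code points of the DŽ digraph family.
   0 stands for "property not set" (nil in the Lisp code).  */
struct toy_props { int ch, upper, lower; };

static const struct toy_props toy_data[] = {
  { 0x01C4, 0,      0x01C6 },   /* DŽ, upper-case form */
  { 0x01C5, 0x01C4, 0x01C6 },   /* Dž, title-case form */
  { 0x01C6, 0x01C4, 0      },   /* dž, lower-case form */
};
enum { TOY_BASE = 0x01C4, TOY_COUNT = 3 };

/* Toy case tables indexed by (code point - TOY_BASE); 0 means no entry.  */
static int toy_up[TOY_COUNT];
static int toy_down[TOY_COUNT];

static void
build_case_tables (void)
{
  size_t n = sizeof toy_data / sizeof toy_data[0];
  size_t i;

  /* Pass 1: record every character involved in a mapping as mapping to
     itself.  For the title-case form this is wrong, but pass 2 fixes it.  */
  for (i = 0; i < n; i++)
    {
      const struct toy_props *p = &toy_data[i];
      if (p->upper)
        {
          toy_down[p->ch - TOY_BASE] = p->ch;
          toy_up[p->upper - TOY_BASE] = p->upper;
        }
      if (p->lower)
        {
          toy_down[p->lower - TOY_BASE] = p->lower;
          toy_up[p->ch - TOY_BASE] = p->ch;
        }
    }

  /* Pass 2: store the real mappings, overwriting the identity entries.  */
  for (i = 0; i < n; i++)
    if (toy_data[i].upper)
      {
        toy_up[toy_data[i].ch - TOY_BASE] = toy_data[i].upper;
        toy_up[toy_data[i].upper - TOY_BASE] = toy_data[i].upper;
      }
  for (i = 0; i < n; i++)
    if (toy_data[i].lower)
      {
        toy_down[toy_data[i].ch - TOY_BASE] = toy_data[i].lower;
        toy_down[toy_data[i].lower - TOY_BASE] = toy_data[i].lower;
      }
}

/* Look up C, returning it unchanged when it is outside the toy range
   or has no entry.  */
static int
toy_lookup (const int *table, int c)
{
  if (c < TOY_BASE || c >= TOY_BASE + TOY_COUNT || table[c - TOY_BASE] == 0)
    return c;
  return table[c - TOY_BASE];
}

static int toy_upcase (int c) { return toy_lookup (toy_up, c); }
static int toy_downcase (int c) { return toy_lookup (toy_down, c); }
```

After build_case_tables () has run, the title-case form U+01C5 no
longer maps to itself in either table, which is the behaviour the
commit message asks for.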

diff --git a/lisp/international/characters.el b/lisp/international/characters.el
index 2b9711aec6b..b2c0e39741a 100644
--- a/lisp/international/characters.el
+++ b/lisp/international/characters.el
@@ -543,10 +543,6 @@ ?L
   (set-case-syntax ?½ "_" tbl)
   (set-case-syntax ?¾ "_" tbl)
   (set-case-syntax ?¿ "." tbl)
-  (let ((c 192))
-    (while (<= c 222)
-      (set-case-syntax-pair c (+ c 32) tbl)
-      (setq c (1+ c))))
   (set-case-syntax ?× "_" tbl)
   (set-case-syntax ?ß "w" tbl)
   (set-case-syntax ?÷ "_" tbl)
@@ -558,101 +554,8 @@ ?L
     (modify-category-entry c ?l)
     (setq c (1+ c)))
 
-  (let ((pair-ranges '((#x0100 . #x012F)
-		       (#x0132 . #x0137)
-		       (#x0139 . #x0148)
-		       (#x014a . #x0177)
-		       (#x0179 . #x017E)
-		       (#x0182 . #x0185)
-		       (#x0187 . #x0188)
-		       (#x018B . #x018C)
-		       (#x0191 . #x0192)
-		       (#x0198 . #x0199)
-		       (#x01A0 . #x01A5)
-		       (#x01A7 . #x01A8)
-		       (#x01AC . #x01AD)
-		       (#x01AF . #x01B0)
-		       (#x01B3 . #x01B6)
-		       (#x01B8 . #x01B9)
-		       (#x01BC . #x01BD)
-		       (#x01CD . #x01DC)
-		       (#x01DE . #x01EF)
-		       (#x01F4 . #x01F5)
-		       (#x01F8 . #x021F)
-		       (#x0222 . #x0233)
-		       (#x023B . #x023C)
-		       (#x0241 . #x0242)
-		       (#x0246 . #x024F))))
-    (dolist (elt pair-ranges)
-      (let ((from (car elt)) (to (cdr elt)))
-	(while (< from to)
-	  (set-case-syntax-pair from (1+ from) tbl)
-	  (setq from (+ from 2))))))
-
-  (set-case-syntax-pair ?Ÿ ?ÿ tbl)
-
-  ;; In some languages, such as Turkish, U+0049 LATIN CAPITAL LETTER I
-  ;; and U+0131 LATIN SMALL LETTER DOTLESS I make a case pair, and so
-  ;; do U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN
-  ;; SMALL LETTER I.
-
-  ;; We used to set up half of those correspondence unconditionally,
-  ;; but that makes searches slow.  So now we don't set up either half
-  ;; of these correspondences by default.
-
-  ;; (set-downcase-syntax  ?İ ?i tbl)
-  ;; (set-upcase-syntax    ?I ?ı tbl)
-
-  (set-case-syntax-pair ?Ɓ ?ɓ tbl)
-  (set-case-syntax-pair ?Ɔ ?ɔ tbl)
-  (set-case-syntax-pair ?Ɖ ?ɖ tbl)
-  (set-case-syntax-pair ?Ɗ ?ɗ tbl)
-  (set-case-syntax-pair ?Ǝ ?ǝ tbl)
-  (set-case-syntax-pair ?Ə ?ə tbl)
-  (set-case-syntax-pair ?Ɛ ?ɛ tbl)
-  (set-case-syntax-pair ?Ɠ ?ɠ tbl)
-  (set-case-syntax-pair ?Ɣ ?ɣ tbl)
-  (set-case-syntax-pair ?Ɩ ?ɩ tbl)
-  (set-case-syntax-pair ?Ɨ ?ɨ tbl)
-  (set-case-syntax-pair ?Ɯ ?ɯ tbl)
-  (set-case-syntax-pair ?Ɲ ?ɲ tbl)
-  (set-case-syntax-pair ?Ɵ ?ɵ tbl)
-  (set-case-syntax-pair ?Ʀ ?ʀ tbl)
-  (set-case-syntax-pair ?Ʃ ?ʃ tbl)
-  (set-case-syntax-pair ?Ʈ ?ʈ tbl)
-  (set-case-syntax-pair ?Ʊ ?ʊ tbl)
-  (set-case-syntax-pair ?Ʋ ?ʋ tbl)
-  (set-case-syntax-pair ?Ʒ ?ʒ tbl)
-  ;; We use set-downcase-syntax below, since we want upcase of ǆ
-  ;; return Ǆ, not ǅ, and the same for the rest.
-  (set-case-syntax-pair ?Ǆ ?ǆ tbl)
-  (set-downcase-syntax ?ǅ ?ǆ tbl)
-  (set-case-syntax-pair ?Ǉ ?ǉ tbl)
-  (set-downcase-syntax ?ǈ ?ǉ tbl)
-  (set-case-syntax-pair ?Ǌ ?ǌ tbl)
-  (set-downcase-syntax ?ǋ ?ǌ tbl)
-
-  ;; 01F0; F; 006A 030C; # LATIN SMALL LETTER J WITH CARON
-
-  (set-case-syntax-pair ?Ǳ ?ǳ tbl)
-  (set-downcase-syntax ?ǲ ?ǳ tbl)
-  (set-case-syntax-pair ?Ƕ ?ƕ tbl)
-  (set-case-syntax-pair ?Ƿ ?ƿ tbl)
-  (set-case-syntax-pair ?Ⱥ ?ⱥ tbl)
-  (set-case-syntax-pair ?Ƚ ?ƚ tbl)
-  (set-case-syntax-pair ?Ⱦ ?ⱦ tbl)
-  (set-case-syntax-pair ?Ƀ ?ƀ tbl)
-  (set-case-syntax-pair ?Ʉ ?ʉ tbl)
-  (set-case-syntax-pair ?Ʌ ?ʌ tbl)
-
   ;; Latin Extended Additional
   (modify-category-entry '(#x1e00 . #x1ef9) ?l)
-  (setq c #x1e00)
-  (while (<= c #x1ef9)
-    (and (zerop (% c 2))
-	 (or (<= c #x1e94) (>= c #x1ea0))
-	 (set-case-syntax-pair c (1+ c) tbl))
-    (setq c (1+ c)))
 
   ;; Latin Extended-C
   (setq c #x2C60)
@@ -660,57 +563,12 @@ ?L
     (modify-category-entry c ?l)
     (setq c (1+ c)))
 
-  (let ((pair-ranges '((#x2C60 . #x2C61)
-                       (#x2C67 . #x2C6C)
-                       (#x2C72 . #x2C73)
-                       (#x2C75 . #x2C76))))
-    (dolist (elt pair-ranges)
-      (let ((from (car elt)) (to (cdr elt)))
-        (while (< from to)
-          (set-case-syntax-pair from (1+ from) tbl)
-          (setq from (+ from 2))))))
-
-  (set-case-syntax-pair ?Ɫ ?ɫ tbl)
-  (set-case-syntax-pair ?Ᵽ ?ᵽ tbl)
-  (set-case-syntax-pair ?Ɽ ?ɽ tbl)
-  (set-case-syntax-pair ?Ɑ ?ɑ tbl)
-  (set-case-syntax-pair ?Ɱ ?ɱ tbl)
-  (set-case-syntax-pair ?Ɐ ?ɐ tbl)
-  (set-case-syntax-pair ?Ɒ ?ɒ tbl)
-  (set-case-syntax-pair ?Ȿ ?ȿ tbl)
-  (set-case-syntax-pair ?Ɀ ?ɀ tbl)
-
   ;; Latin Extended-D
   (setq c #xA720)
   (while (<= c #xA7FF)
     (modify-category-entry c ?l)
     (setq c (1+ c)))
 
-  (let ((pair-ranges '((#xA722 . #xA72F)
-                       (#xA732 . #xA76F)
-                       (#xA779 . #xA77C)
-                       (#xA77E . #xA787)
-                       (#xA78B . #xA78E)
-                       (#xA790 . #xA793)
-                       (#xA796 . #xA7A9)
-                       (#xA7B4 . #xA7B7))))
-    (dolist (elt pair-ranges)
-      (let ((from (car elt)) (to (cdr elt)))
-        (while (< from to)
-          (set-case-syntax-pair from (1+ from) tbl)
-          (setq from (+ from 2))))))
-
-  (set-case-syntax-pair ?Ᵹ ?ᵹ tbl)
-  (set-case-syntax-pair ?Ɦ ?ɦ tbl)
-  (set-case-syntax-pair ?Ɜ ?ɜ tbl)
-  (set-case-syntax-pair ?Ɡ ?ɡ tbl)
-  (set-case-syntax-pair ?Ɬ ?ɬ tbl)
-  (set-case-syntax-pair ?Ɪ ?ɪ tbl)
-  (set-case-syntax-pair ?Ʞ ?ʞ tbl)
-  (set-case-syntax-pair ?Ʇ ?ʇ tbl)
-  (set-case-syntax-pair ?Ʝ ?ʝ tbl)
-  (set-case-syntax-pair ?Ꭓ ?ꭓ tbl)
-
   ;; Latin Extended-E
   (setq c #xAB30)
   (while (<= c #xAB64)
@@ -719,102 +577,19 @@ ?L
 
   ;; Greek
   (modify-category-entry '(#x0370 . #x03ff) ?g)
-  (setq c #x0370)
-  (while (<= c #x03ff)
-    (if (or (and (>= c #x0391) (<= c #x03a1))
-	    (and (>= c #x03a3) (<= c #x03ab)))
-	(set-case-syntax-pair c (+ c 32) tbl))
-    (and (>= c #x03da)
-	 (<= c #x03ee)
-	 (zerop (% c 2))
-	 (set-case-syntax-pair c (1+ c) tbl))
-    (setq c (1+ c)))
-  (set-case-syntax-pair ?Ά ?ά tbl)
-  (set-case-syntax-pair ?Έ ?έ tbl)
-  (set-case-syntax-pair ?Ή ?ή tbl)
-  (set-case-syntax-pair ?Ί ?ί tbl)
-  (set-case-syntax-pair ?Ό ?ό tbl)
-  (set-case-syntax-pair ?Ύ ?ύ tbl)
-  (set-case-syntax-pair ?Ώ ?ώ tbl)
 
   ;; Armenian
   (setq c #x531)
-  (while (<= c #x556)
-    (set-case-syntax-pair c (+ c #x30) tbl)
-    (setq c (1+ c)))
 
   ;; Greek Extended
   (modify-category-entry '(#x1f00 . #x1fff) ?g)
-  (setq c #x1f00)
-  (while (<= c #x1fff)
-    (and (<= (logand c #x000f) 7)
-	 (<= c #x1fa7)
-	 (not (memq c '(#x1f16 #x1f17 #x1f56 #x1f57
-			       #x1f50 #x1f52 #x1f54 #x1f56)))
-	 (/= (logand c #x00f0) #x70)
-	 (set-case-syntax-pair (+ c 8) c tbl))
-    (setq c (1+ c)))
-  (set-case-syntax-pair ?Ᾰ ?ᾰ tbl)
-  (set-case-syntax-pair ?Ᾱ ?ᾱ tbl)
-  (set-case-syntax-pair ?Ὰ ?ὰ tbl)
-  (set-case-syntax-pair ?Ά ?ά tbl)
-  (set-case-syntax-pair ?ᾼ ?ᾳ tbl)
-  (set-case-syntax-pair ?Ὲ ?ὲ tbl)
-  (set-case-syntax-pair ?Έ ?έ tbl)
-  (set-case-syntax-pair ?Ὴ ?ὴ tbl)
-  (set-case-syntax-pair ?Ή ?ή tbl)
-  (set-case-syntax-pair ?ῌ ?ῃ tbl)
-  (set-case-syntax-pair ?Ῐ ?ῐ tbl)
-  (set-case-syntax-pair ?Ῑ ?ῑ tbl)
-  (set-case-syntax-pair ?Ὶ ?ὶ tbl)
-  (set-case-syntax-pair ?Ί ?ί tbl)
-  (set-case-syntax-pair ?Ῠ ?ῠ tbl)
-  (set-case-syntax-pair ?Ῡ ?ῡ tbl)
-  (set-case-syntax-pair ?Ὺ ?ὺ tbl)
-  (set-case-syntax-pair ?Ύ ?ύ tbl)
-  (set-case-syntax-pair ?Ῥ ?ῥ tbl)
-  (set-case-syntax-pair ?Ὸ ?ὸ tbl)
-  (set-case-syntax-pair ?Ό ?ό tbl)
-  (set-case-syntax-pair ?Ὼ ?ὼ tbl)
-  (set-case-syntax-pair ?Ώ ?ώ tbl)
-  (set-case-syntax-pair ?ῼ ?ῳ tbl)
 
   ;; cyrillic
   (modify-category-entry '(#x0400 . #x04FF) ?y)
-  (setq c #x0400)
-  (while (<= c #x04ff)
-    (and (>= c #x0400)
-	 (<= c #x040f)
-	 (set-case-syntax-pair c (+ c 80) tbl))
-    (and (>= c #x0410)
-	 (<= c #x042f)
-	 (set-case-syntax-pair c (+ c 32) tbl))
-    (and (zerop (% c 2))
-	 (or (and (>= c #x0460) (<= c #x0480))
-	     (and (>= c #x048c) (<= c #x04be))
-	     (and (>= c #x04d0) (<= c #x052e)))
-	 (set-case-syntax-pair c (1+ c) tbl))
-    (setq c (1+ c)))
-  (set-case-syntax-pair ?Ӂ ?ӂ tbl)
-  (set-case-syntax-pair ?Ӄ ?ӄ tbl)
-  (set-case-syntax-pair ?Ӈ ?ӈ tbl)
-  (set-case-syntax-pair ?Ӌ ?ӌ tbl)
-
   (modify-category-entry '(#xA640 . #xA69F) ?y)
-  (setq c #xA640)
-  (while (<= c #xA66C)
-    (set-case-syntax-pair c (+ c 1) tbl)
-    (setq c (+ c 2)))
-  (setq c #xA680)
-  (while (<= c #xA69A)
-    (set-case-syntax-pair c (+ c 1) tbl)
-    (setq c (+ c 2)))
 
   ;; Georgian
   (setq c #x10A0)
-  (while (<= c #x10CD)
-    (set-case-syntax-pair c (+ c #x1C60) tbl)
-    (setq c (1+ c)))
 
   ;; Cyrillic Extended-C
   (modify-category-entry '(#x1C80 . #x1C8F) ?y)
@@ -844,12 +619,6 @@ ?L
     (set-case-syntax c "." tbl)
     (setq c (1+ c)))
 
-  ;; Roman numerals
-  (setq c #x2160)
-  (while (<= c #x216f)
-    (set-case-syntax-pair c (+ c #x10) tbl)
-    (setq c (1+ c)))
-
   ;; Fixme: The following blocks might be better as symbol rather than
   ;; punctuation.
   ;; Arrows
@@ -873,25 +642,11 @@ ?L
   ;; Circled Latin
   (setq c #x24b6)
   (while (<= c #x24cf)
-    (set-case-syntax-pair c (+ c 26) tbl)
     (modify-category-entry c ?l)
     (modify-category-entry (+ c 26) ?l)
     (setq c (1+ c)))
 
-  ;; Glagolitic
-  (setq c #x2C00)
-  (while (<= c #x2C2E)
-    (set-case-syntax-pair c (+ c 48) tbl)
-    (setq c (1+ c)))
-
   ;; Coptic
-  (let ((pair-ranges '((#x2C80 . #x2CE2)
-		       (#x2CEB . #x2CF2))))
-    (dolist (elt pair-ranges)
-      (let ((from (car elt)) (to (cdr elt)))
-	(while (< from to)
-	  (set-case-syntax-pair from (1+ from) tbl)
-	  (setq from (+ from 2))))))
   ;; There's no Coptic category.  However, Coptic letters that are
   ;; part of the Greek block above get the Greek category, and those
   ;; in this block are derived from Greek letters, so let's be
@@ -901,45 +656,85 @@ ?L
   ;; Fullwidth Latin
   (setq c #xff21)
   (while (<= c #xff3a)
-    (set-case-syntax-pair c (+ c #x20) tbl)
     (modify-category-entry c ?l)
     (modify-category-entry (+ c #x20) ?l)
     (setq c (1+ c)))
 
-  ;; Deseret
-  (setq c #x10400)
-  (while (<= c #x10427)
-    (set-case-syntax-pair c (+ c 28) tbl)
-    (setq c (1+ c)))
+  ;; Combining diacritics
+  (modify-category-entry '(#x300 . #x362) ?^)
+  ;; Combining marks
+  (modify-category-entry '(#x20d0 . #x20ff) ?^)
 
-  ;; Osage
-  (setq c #x104B0)
-  (while (<= c #x104D3)
-    (set-case-syntax-pair c (+ c 40) tbl)
-    (setq c (1+ c)))
+  ;; Set all Letter, uppercase; Letter, lowercase and Letter, titlecase syntax
+  ;; to word.
+  (let ((syn-tab (standard-syntax-table)))
+    (map-char-table
+     (lambda (ch cat)
+       (when (memq cat '(Lu Ll Lt))
+         (modify-syntax-entry ch "w   " syn-tab)))
+     (unicode-property-table-internal 'general-category))
 
-  ;; Old Hungarian
-  (setq c #x10c80)
-  (while (<= c #x10cb2)
-    (set-case-syntax-pair c (+ c #x40) tbl)
-    (setq c (1+ c)))
+    ;; Ⅰ through Ⅻ had word syntax in the past so set it here as well.
+    ;; General category of those characters is Number, Letter.
+    (modify-syntax-entry '(#x2160 . #x216b) "w   " syn-tab)
 
-  ;; Warang Citi
-  (setq c #x118a0)
-  (while (<= c #x118bf)
-    (set-case-syntax-pair c (+ c #x20) tbl)
-    (setq c (1+ c)))
+    ;; ⓐ through ⓩ are Symbol, other according to Unicode, but Emacs set
+    ;; their syntax to word in the past so keep backwards compatibility.
+    (modify-syntax-entry '(#x24D0 . #x24E9) "w   " syn-tab))
 
-  ;; Adlam
-  (setq c #x1e900)
-  (while (<= c #x1e921)
-    (set-case-syntax-pair c (+ c #x22) tbl)
-    (setq c (1+ c)))
+  ;; Set downcase and upcase from Unicode properties
 
-  ;; Combining diacritics
-  (modify-category-entry '(#x300 . #x362) ?^)
-  ;; Combining marks
-  (modify-category-entry '(#x20d0 . #x20ff) ?^)
+  ;; In some languages, such as Turkish, U+0049 LATIN CAPITAL LETTER I and
+  ;; U+0131 LATIN SMALL LETTER DOTLESS I make a case pair, and so do U+0130
+  ;; LATIN CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN SMALL LETTER I.
+
+  ;; We used to set up half of those correspondences unconditionally, but that
+  ;; makes searches slow.  So now we don't set up either half of these
+  ;; correspondences by default.
+
+  ;; (set-downcase-syntax  ?İ ?i tbl)
+  ;; (set-upcase-syntax    ?I ?ı tbl)
+
+  (let ((map-unicode-property
+         (lambda (property func)
+           (map-char-table
+            (lambda (ch cased)
+              ;; ASCII characters skipped due to reasons outlined above.  As of
+              ;; Unicode 9.0, this exception affects the following:
+              ;;   lc(U+0130 İ) = i
+              ;;   uc(U+0131 ı) = I
+              ;;   uc(U+017F ſ) = S
+              ;;   lc(U+212A K) = k
+              (when (> cased 127)
+                (let ((end (if (consp ch) (cdr ch) ch)))
+                  (setq ch (max 128 (if (consp ch) (car ch) ch)))
+                  (while (<= ch end)
+                    (funcall func ch cased)
+                    (setq ch (1+ ch))))))
+            (unicode-property-table-internal property))))
+        (down tbl)
+        (up (case-table-get-table tbl 'up)))
+
+    ;; This works on an assumption that if toUpper(x) != x then toLower(x) ==
+    ;; x (and the opposite for toLower/toUpper).  This doesn’t hold for title
+    ;; case characters but those incorrect mappings will be overwritten later.
+    (funcall map-unicode-property 'uppercase
+             (lambda (lc uc) (aset down lc lc) (aset up uc uc)))
+    (funcall map-unicode-property 'lowercase
+             (lambda (uc lc) (aset down lc lc) (aset up uc uc)))
+
+    ;; Now deal with the actual mapping.  This will correctly assign casing for
+    ;; title-case characters.
+    (funcall map-unicode-property 'uppercase
+             (lambda (lc uc) (aset up lc uc) (aset up uc uc)))
+    (funcall map-unicode-property 'lowercase
+             (lambda (uc lc) (aset down uc lc) (aset down lc lc))))
+
+  ;; Clear out the extra slots so that they will be recomputed from the main
+  ;; (downcase) table and upcase table.  Since we’re side-stepping the usual
+  ;; set-case-syntax-* functions, we need to do it explicitly.
+  (set-char-table-extra-slot tbl 1 nil)
+  (set-char-table-extra-slot tbl 2 nil)
 
   ;; Fixme: syntax for symbols &c
   )
diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
index b1abe50fa4e..e2399664f47 100644
--- a/test/src/casefiddle-tests.el
+++ b/test/src/casefiddle-tests.el
@@ -73,8 +73,7 @@ casefiddle-tests--characters
 
     (?Σ ?Σ ?σ ?Σ)
     (?σ ?Σ ?σ ?Σ)
-    ;; FIXME: Another broken one:
-    ;;(?ς ?Σ ?ς ?Σ)
+    (?ς ?Σ ?ς ?Σ)
 
     (?Ⅷ ?Ⅷ ?ⅷ ?Ⅷ)
     (?ⅷ ?Ⅷ ?ⅷ ?Ⅷ)))
@@ -196,7 +195,6 @@ casefiddle-tests--test-casing
         ;;("ﬁsh" "FIsh" "ﬁsh" "Fish" "Fish")
         ;;("Straße" "STRASSE" "straße" "Straße" "Straße")
         ;;("ΌΣΟΣ" "ΌΣΟΣ" "όσος" "Όσος" "Όσος")
-        ;;("όσος" "ΌΣΟΣ" "όσος" "Όσος" "Όσος")
         ;; And here’s what is actually happening:
         ("ǄUNGLA" "ǄUNGLA" "ǆungla" "Ǆungla" "ǄUNGLA")
         ("ǅungla" "ǅUNGLA" "ǆungla" "ǅungla" "ǅungla")
@@ -205,7 +203,8 @@ casefiddle-tests--test-casing
         ("ﬁsh" "ﬁSH" "ﬁsh" "ﬁsh" "ﬁsh")
         ("Straße" "STRAßE" "straße" "Straße" "Straße")
         ("ΌΣΟΣ" "ΌΣΟΣ" "όσοσ" "Όσοσ" "ΌΣΟΣ")
-        ("όσος" "ΌΣΟς" "όσος" "Όσος" "Όσος"))))))
+
+        ("όσος" "ΌΣΟΣ" "όσος" "Όσος" "Όσος"))))))
 
 (ert-deftest casefiddle-tests-casing-byte8 ()
   (should-not
-- 
2.11.0.483.g087da7b7c-goog

* bug#25646: [PATCH 3/3] Don’t assume character can be either upper- or lower-case when casing
  2017-02-07 18:05 ` bug#25646: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
  2017-02-07 18:05   ` bug#25646: [PATCH 2/3] Generate upcase and downcase tables from Unicode data Michal Nazarewicz
@ 2017-02-07 18:05   ` Michal Nazarewicz
  1 sibling, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2017-02-07 18:05 UTC (permalink / raw)
  To: 25646

Compatibility digraph characters, such as ǅ, are neither upper- nor
lower-case.  At the moment, however, they are reported as upper-case¹
despite the fact that they change when upper-cased.

Stop checking if a character is upper-case before trying to up-case it
so that title-case characters are handled correctly.

¹ Because they change when converted to lower-case.  Notice an asymmetry
  in that for a character to be considered lower-case it must not be
  upper-case (plus the usual condition of changing when upper-cased).

* src/buffer.h (upcase1): Delete.
(upcase): Change to upcase character unconditionally just like downcase
does it.  This is what upcase1 was.

* src/casefiddle.c (casify_object, casify_region): Use upcase instead
of upcase1 and don’t check !uppercasep(x) before calling upcase.

* src/keyboard.c (read_key_sequence): Don’t check uppercasep(x); just
call downcase(x) and see whether the result changed.

* test/src/casefiddle-tests.el (casefiddle-tests--characters,
casefiddle-tests-casing): Update test cases which are now passing.
---
 etc/NEWS                     |  8 +++++++-
 src/buffer.h                 | 18 +++++++++---------
 src/casefiddle.c             | 20 +++++++-------------
 src/keyboard.c               | 25 +++++++++++++++----------
 test/src/casefiddle-tests.el |  8 ++++----
 5 files changed, 42 insertions(+), 37 deletions(-)
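
The asymmetry from the footnote, and the effect of dropping the
uppercasep check, can be seen on a toy model.  This is a rough sketch
under stated assumptions rather than the real buffer.h code: the
toy_* functions are hypothetical stand-ins for the buffer's case
tables, hard-coded for the DŽ/Dž/dž family only.

```c
#include <assert.h>
#include <stdbool.h>

enum { DZ_UPPER = 0x01C4, DZ_TITLE = 0x01C5, DZ_LOWER = 0x01C6 };

/* Toy stand-ins for the down and up case tables: all three forms
   downcase to dž and upcase to DŽ; anything else is left alone.  */
static int
toy_downcase (int c)
{
  return (c >= DZ_UPPER && c <= DZ_LOWER) ? DZ_LOWER : c;
}

static int
toy_upcase_table (int c)
{
  return (c >= DZ_UPPER && c <= DZ_LOWER) ? DZ_UPPER : c;
}

/* Same definitions as the predicates in buffer.h.  */
static bool
toy_uppercasep (int c)
{
  return toy_downcase (c) != c;
}

static bool
toy_lowercasep (int c)
{
  return !toy_uppercasep (c) && toy_upcase_table (c) != c;
}

/* Behaviour before this patch: skip characters reported as upper case.  */
static int
toy_upcase_old (int c)
{
  return toy_uppercasep (c) ? c : toy_upcase_table (c);
}

/* Behaviour after this patch: always consult the up table.  */
static int
toy_upcase_new (int c)
{
  return toy_upcase_table (c);
}
```

In this model the title-case form Dž counts as upper case because
down-casing changes it, so toy_upcase_old leaves it alone while
toy_upcase_new turns it into DŽ.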

diff --git a/etc/NEWS b/etc/NEWS
index da0b5388837..16e1ddd495e 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -338,6 +338,12 @@ same as in modes where the character is not whitespace.
 Instead of only checking the modification time, Emacs now also checks
 the file's actual content before prompting the user.
 
+** Title case characters are properly converted to upper case.
+'upcase', 'upcase-region' et al. convert title case characters (such
+as ǲ) into their upper case form (such as Ǳ).  As a downside,
+'capitalize' and 'upcase-initials' produce awkward words where the first
+two letters are upper case, e.g. Ǆungla (instead of ǅungla).
+
 \f
 * Changes in Specialized Modes and Packages in Emacs 26.1
 
@@ -1017,7 +1023,7 @@ along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
 
 \f
 Local variables:
-coding: us-ascii
+coding: utf-8
 mode: outline
 paragraph-separate: "[ 	\f]*$"
 end:
diff --git a/src/buffer.h b/src/buffer.h
index 4a23e4fdd2e..f53212e3120 100644
--- a/src/buffer.h
+++ b/src/buffer.h
@@ -1365,28 +1365,28 @@ downcase (int c)
   return NATNUMP (down) ? XFASTINT (down) : c;
 }
 
-/* True if C is upper case.  */
-INLINE bool uppercasep (int c) { return downcase (c) != c; }
-
-/* Upcase a character C known to be not upper case.  */
+/* Upcase a character C, or make no change if that cannot be done. */
 INLINE int
-upcase1 (int c)
+upcase (int c)
 {
   Lisp_Object upcase_table = BVAR (current_buffer, upcase_table);
   Lisp_Object up = CHAR_TABLE_REF (upcase_table, c);
   return NATNUMP (up) ? XFASTINT (up) : c;
 }
 
+/* True if C is upper case.  */
+INLINE bool uppercasep (int c)
+{
+  return downcase (c) != c;
+}
+
 /* True if C is lower case.  */
 INLINE bool
 lowercasep (int c)
 {
-  return !uppercasep (c) && upcase1 (c) != c;
+  return !uppercasep (c) && upcase (c) != c;
 }
 
-/* Upcase a character C, or make no change if that cannot be done.  */
-INLINE int upcase (int c) { return uppercasep (c) ? c : upcase1 (c); }
-
 INLINE_HEADER_END
 
 #endif /* EMACS_BUFFER_H */
diff --git a/src/casefiddle.c b/src/casefiddle.c
index 28ffcb298ff..b2b87e7a858 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -64,13 +64,9 @@ casify_object (enum case_action flag, Lisp_Object obj)
 	multibyte = 1;
       if (! multibyte)
 	MAKE_CHAR_MULTIBYTE (c1);
-      c = downcase (c1);
-      if (inword)
-	XSETFASTINT (obj, c | flags);
-      else if (c == (XFASTINT (obj) & ~flagbits))
+      c = flag == CASE_DOWN ? downcase (c1) : upcase (c1);
+      if (c != c1)
 	{
-	  if (! inword)
-	    c = upcase1 (c1);
 	  if (! multibyte)
 	    MAKE_CHAR_UNIBYTE (c);
 	  XSETFASTINT (obj, c | flags);
@@ -95,7 +91,7 @@ casify_object (enum case_action flag, Lisp_Object obj)
 	    c = downcase (c);
 	  else if (!uppercasep (c)
 		   && (!inword || flag != CASE_CAPITALIZE_UP))
-	    c = upcase1 (c1);
+	    c = upcase (c1);
 	  if ((int) flag >= (int) CASE_CAPITALIZE)
 	    inword = (SYNTAX (c) == Sword);
 	  if (c != c1)
@@ -127,9 +123,8 @@ casify_object (enum case_action flag, Lisp_Object obj)
 	  c = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
 	  if (inword && flag != CASE_CAPITALIZE_UP)
 	    c = downcase (c);
-	  else if (!uppercasep (c)
-		   && (!inword || flag != CASE_CAPITALIZE_UP))
-	    c = upcase1 (c);
+	  else if (!inword || flag != CASE_CAPITALIZE_UP)
+	    c = upcase (c);
 	  if ((int) flag >= (int) CASE_CAPITALIZE)
 	    inword = (SYNTAX (c) == Sword);
 	  o += CHAR_STRING (c, o);
@@ -236,9 +231,8 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
       c2 = c;
       if (inword && flag != CASE_CAPITALIZE_UP)
 	c = downcase (c);
-      else if (!uppercasep (c)
-	       && (!inword || flag != CASE_CAPITALIZE_UP))
-	c = upcase1 (c);
+      else if (!inword || flag != CASE_CAPITALIZE_UP)
+	c = upcase (c);
       if ((int) flag >= (int) CASE_CAPITALIZE)
 	inword = ((SYNTAX (c) == Sword)
 		  && (inword || !syntax_prefix_flag_p (c)));
diff --git a/src/keyboard.c b/src/keyboard.c
index a86e7c5f8e4..3f6298f4362 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -9642,22 +9642,26 @@ read_key_sequence (Lisp_Object *keybuf, int bufsize, Lisp_Object prompt,
 	 use the corresponding lower-case letter instead.  */
       if (NILP (current_binding)
 	  && /* indec.start >= t && fkey.start >= t && */ keytran.start >= t
-	  && INTEGERP (key)
-	  && ((CHARACTERP (make_number (XINT (key) & ~CHAR_MODIFIER_MASK))
-	       && uppercasep (XINT (key) & ~CHAR_MODIFIER_MASK))
-	      || (XINT (key) & shift_modifier)))
+	  && INTEGERP (key))
 	{
 	  Lisp_Object new_key;
+	  int k = XINT (key);
+
+	  if (k & shift_modifier)
+	    XSETINT (new_key, k & ~shift_modifier);
+	  else if (CHARACTERP (make_number (k & ~CHAR_MODIFIER_MASK)))
+	    {
+	      int dc = downcase(k & ~CHAR_MODIFIER_MASK);
+	      if (dc == (k & ~CHAR_MODIFIER_MASK))
+		goto not_upcase;
+	      XSETINT (new_key, dc | (k & CHAR_MODIFIER_MASK));
+	    }
+	  else
+	    goto not_upcase;
 
 	  original_uppercase = key;
 	  original_uppercase_position = t - 1;
 
-	  if (XINT (key) & shift_modifier)
-	    XSETINT (new_key, XINT (key) & ~shift_modifier);
-	  else
-	    XSETINT (new_key, (downcase (XINT (key) & ~CHAR_MODIFIER_MASK)
-			       | (XINT (key) & CHAR_MODIFIER_MASK)));
-
 	  /* We have to do this unconditionally, regardless of whether
 	     the lower-case char is defined in the keymaps, because they
 	     might get translated through function-key-map.  */
@@ -9668,6 +9672,7 @@ read_key_sequence (Lisp_Object *keybuf, int bufsize, Lisp_Object prompt,
 	  goto replay_sequence;
 	}
 
+    not_upcase:
       if (NILP (current_binding)
 	  && help_char_p (EVENT_HEAD (key)) && t > 1)
 	    {
diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
index e2399664f47..1588cdbbd2b 100644
--- a/test/src/casefiddle-tests.el
+++ b/test/src/casefiddle-tests.el
@@ -63,13 +63,13 @@ casefiddle-tests--characters
     (?Ł ?Ł ?ł ?Ł)
     (?ł ?Ł ?ł ?Ł)
 
-    ;; FIXME: We should have:
+    ;; FIXME: Commented one is what we want.
     ;;(?Ǆ ?Ǆ ?ǆ ?ǅ)
-    ;; but instead we have:
     (?Ǆ ?Ǆ ?ǆ ?Ǆ)
-    ;; FIXME: Those two are broken at the moment:
     ;;(?ǅ ?Ǆ ?ǆ ?ǅ)
+    (?ǅ ?Ǆ ?ǆ ?Ǆ)
     ;;(?ǆ ?Ǆ ?ǆ ?ǅ)
+    (?ǆ ?Ǆ ?ǆ ?Ǆ)
 
     (?Σ ?Σ ?σ ?Σ)
     (?σ ?Σ ?σ ?Σ)
@@ -197,7 +197,7 @@ casefiddle-tests--test-casing
         ;;("ΌΣΟΣ" "ΌΣΟΣ" "όσος" "Όσος" "Όσος")
         ;; And here’s what is actually happening:
         ("ǄUNGLA" "ǄUNGLA" "ǆungla" "Ǆungla" "ǄUNGLA")
-        ("ǅungla" "ǅUNGLA" "ǆungla" "ǅungla" "ǅungla")
+        ("ǅungla" "ǄUNGLA" "ǆungla" "Ǆungla" "Ǆungla")
         ("ǆungla" "ǄUNGLA" "ǆungla" "Ǆungla" "Ǆungla")
         ("deﬁne" "DEﬁNE" "deﬁne" "Deﬁne" "Deﬁne")
         ("ﬁsh" "ﬁSH" "ﬁsh" "ﬁsh" "ﬁsh")
-- 
2.11.0.483.g087da7b7c-goog

* bug#25646: [PATCH 0/3] Minor casing impromevents
  2017-02-07 18:04 bug#25646: [PATCH 0/3] Minor casing impromevents Michal Nazarewicz
  2017-02-07 18:05 ` bug#25646: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
@ 2017-02-10  8:47 ` Eli Zaretskii
  2017-02-15 16:13 ` Michal Nazarewicz
  2 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2017-02-10  8:47 UTC (permalink / raw)
  To: Michal Nazarewicz; +Cc: 25646

> From: Michal Nazarewicz <mina86@mina86.com>
> Cc: Eli Zaretskii <eliz@gnu.org>
> Date: Tue,  7 Feb 2017 19:04:02 +0100
> 
> If there will be no objections I’ll submit it in a week or so.

LGTM, thanks.

* bug#25646: [PATCH 0/3] Minor casing impromevents
  2017-02-07 18:04 bug#25646: [PATCH 0/3] Minor casing impromevents Michal Nazarewicz
  2017-02-07 18:05 ` bug#25646: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
  2017-02-10  8:47 ` bug#25646: [PATCH 0/3] Minor casing impromevents Eli Zaretskii
@ 2017-02-15 16:13 ` Michal Nazarewicz
  2 siblings, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2017-02-15 16:13 UTC (permalink / raw)
  To: 25646-close

On Tue, Feb 07 2017, Michal Nazarewicz wrote:
> If there will be no objections I’ll submit it in a week or so.
>
> This is split from bug#24424 which contains many more changes.
> Originally I hoped that I would be able to get all the paches in
> bug#24424 to state where they can be upstreamed quickly but due to
> various reasons it is taking a lot longer.  Because of that I’ll try
> to submit a smaller, self-contained chunks of it separately so that
> new features and fixes show up faster in Emacs.

Pushed.

> Michal Nazarewicz (3):
>   Add tests for casefiddle.c
>   Generate upcase and downcase tables from Unicode data
>   Don’t assume character can be either upper- or lower-case when casing
>
>  etc/NEWS                         |   8 +-
>  lisp/international/characters.el | 345 ++++++++-------------------------------
>  src/buffer.h                     |  18 +-
>  src/casefiddle.c                 |  20 +--
>  src/keyboard.c                   |  25 +--
>  test/src/casefiddle-tests.el     | 246 ++++++++++++++++++++++++++++
>  6 files changed, 354 insertions(+), 308 deletions(-)
>  create mode 100644 test/src/casefiddle-tests.el
>
> -- 
> 2.11.0.483.g087da7b7c-goog
>

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»





^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-02-15 16:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-02-07 18:04 bug#25646: [PATCH 0/3] Minor casing impromevents Michal Nazarewicz
2017-02-07 18:05 ` bug#25646: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
2017-02-07 18:05   ` bug#25646: [PATCH 2/3] Generate upcase and downcase tables from Unicode data Michal Nazarewicz
2017-02-07 18:05   ` bug#25646: [PATCH 3/3] Don’t assume character can be either upper- or lower-case when casing Michal Nazarewicz
2017-02-10  8:47 ` bug#25646: [PATCH 0/3] Minor casing impromevents Eli Zaretskii
2017-02-15 16:13 ` Michal Nazarewicz

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.