unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Michal Nazarewicz <mina86@mina86.com>
To: 24603@debbugs.gnu.org
Subject: bug#24603: [RFC 16/18] Refactor character class checking; optimise ASCII case
Date: Tue,  4 Oct 2016 03:10:39 +0200	[thread overview]
Message-ID: <1475543441-10493-16-git-send-email-mina86@mina86.com> (raw)
In-Reply-To: <1475543441-10493-1-git-send-email-mina86@mina86.com>

Use a lookup table to map Unicode general categories to character
categories.  This generalises lowercasep, uppercasep et al. functions.

Furthermore, provide another lookup table for ASCII characters such that
the common case can be optimised and Unicode general category lookup
avoided.

Using lookup table in place of conditions may theoretically improve
performance even though I have not measure it.  Moreover though, having
the lookup table will allow regex engine to be optimised in an upcoming
patch.  Stay tuned. ;)

* src/character.c (alphanumericp, alphabeticp, uppercasep, lowercasep,
graphicp, printablep): Replaced with static inline functions define in
the header file.
(category_char_bits): New lookup table mapping Unicode
general category to character classes.
(ascii_char_bits): New lookup table mapping ASCII characters to
character classes.

* src/character.h (unicode_alphanumericp, unicode_alphabeticp,
unicode_uppercasep, unicode_lowercasep, unicode_graphicp,
unicode_printablep, _ascii_alphanumericp, _ascii_alphabeticp,
_ascii_uppercasep, _ascii_lowercasep, _ascii_graphicp,
_ascii_printablep): New static inline functions which are special cases
of respective unprefixed functions.

* src/regex.c (ISALNUM, ISALPHA): Remove special cases for ASCII
characters since alphanumericp and uppercasep already handle those.
---
 src/character.c | 168 +++++++++++++++++++++++++-------------------------------
 src/character.h |  77 +++++++++++++++++++++++---
 src/regex.c     |  20 ++-----
 3 files changed, 151 insertions(+), 114 deletions(-)

diff --git a/src/character.c b/src/character.c
index 707ae10..63f89d3 100644
--- a/src/character.c
+++ b/src/character.c
@@ -960,104 +960,88 @@ character is not ASCII nor 8-bit character, an error is signaled.  */)
   return make_number (c);
 }
 
-static unicode_category_t
+/* Return C’s Unicode general category (or UNICODE_CATEGORY_UNKNOWN). */
+unicode_category_t
 char_unicode_category (int c)
 {
   Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c);
   return INTEGERP (category) ? XINT (category) : UNICODE_CATEGORY_UNKNOWN;
 }
 
-/* Return true if C is a upper case character.  This does not imply mean it
-   has a lower case form. */
-bool
-uppercasep (int c)
-{
-  unicode_category_t gen_cat = char_unicode_category (c);
-
-  /* See UTS #18.  There are additional characters that should be
-     here, those designated as Other_uppercase; FIXME.  */
-  return gen_cat == UNICODE_CATEGORY_Lu;
-}
-
-/* Return true if C is a lower case character.  This does not imply mean it
-   has an upper case form. */
-bool
-lowercasep (int c)
-{
-  unicode_category_t gen_cat = char_unicode_category (c);
-
-  /* See UTS #18.  There are additional characters that should be
-     here, those designated as Other_lowercase; FIXME.  */
-  return gen_cat == UNICODE_CATEGORY_Ll;
-}
-
-/* Return true if C is an alphabetic character.  */
-bool
-alphabeticp (int c)
-{
-  unicode_category_t gen_cat = char_unicode_category (c);
-
-  /* See UTS #18.  There are additional characters that should be
-     here, those designated as Other_uppercase, Other_lowercase,
-     and Other_alphabetic; FIXME.  */
-  return (gen_cat == UNICODE_CATEGORY_Lu
-	  || gen_cat == UNICODE_CATEGORY_Ll
-	  || gen_cat == UNICODE_CATEGORY_Lt
-	  || gen_cat == UNICODE_CATEGORY_Lm
-	  || gen_cat == UNICODE_CATEGORY_Lo
-	  || gen_cat == UNICODE_CATEGORY_Mn
-	  || gen_cat == UNICODE_CATEGORY_Mc
-	  || gen_cat == UNICODE_CATEGORY_Me
-	  || gen_cat == UNICODE_CATEGORY_Nl);
-}
-
-/* Return true if C is a alphabetic or decimal-number character.  */
-bool
-alphanumericp (int c)
-{
-  unicode_category_t gen_cat = char_unicode_category (c);
-
-  /* See UTS #18.  Same comment as for alphabeticp applies.  FIXME. */
-  return (gen_cat == UNICODE_CATEGORY_Lu
-	  || gen_cat == UNICODE_CATEGORY_Ll
-	  || gen_cat == UNICODE_CATEGORY_Lt
-	  || gen_cat == UNICODE_CATEGORY_Lm
-	  || gen_cat == UNICODE_CATEGORY_Lo
-	  || gen_cat == UNICODE_CATEGORY_Mn
-	  || gen_cat == UNICODE_CATEGORY_Mc
-	  || gen_cat == UNICODE_CATEGORY_Me
-	  || gen_cat == UNICODE_CATEGORY_Nl
-	  || gen_cat == UNICODE_CATEGORY_Nd);
-}
-
-/* Return true if C is a graphic character.  */
-bool
-graphicp (int c)
-{
-  unicode_category_t gen_cat = char_unicode_category (c);
-
-  /* See UTS #18.  */
-  return (!(gen_cat == UNICODE_CATEGORY_UNKNOWN
-	    || gen_cat == UNICODE_CATEGORY_Zs /* space separator */
-	    || gen_cat == UNICODE_CATEGORY_Zl /* line separator */
-	    || gen_cat == UNICODE_CATEGORY_Zp /* paragraph separator */
-	    || gen_cat == UNICODE_CATEGORY_Cc /* control */
-	    || gen_cat == UNICODE_CATEGORY_Cs /* surrogate */
-	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
-}
-
-/* Return true if C is a printable character.  */
-bool
-printablep (int c)
-{
-  unicode_category_t gen_cat = char_unicode_category (c);
-
-  /* See UTS #18.  */
-  return (!(gen_cat == UNICODE_CATEGORY_UNKNOWN
-	    || gen_cat == UNICODE_CATEGORY_Cc /* control */
-	    || gen_cat == UNICODE_CATEGORY_Cs /* surrogate */
-	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
-}
+#define CHAR_BIT_ALNUM_ CHAR_BIT_ALNUM | CHAR_BIT_GRAPH | CHAR_BIT_PRINT
+#define CHAR_BIT_ALPHA_ CHAR_BIT_ALPHA | CHAR_BIT_ALNUM_
+
+/* See UTS #18 and DerivedCoreProperties.txt.  alpha, alnum, upper and
+   lower are missing some characters, namely those designated as
+   Other_uppercase, Other_lowercase and Other_alphabetic; FIXME.  */
+
+const unsigned char category_char_bits[] = {
+  [UNICODE_CATEGORY_UNKNOWN] = 0,
+  [UNICODE_CATEGORY_Lu] = CHAR_BIT_ALPHA_ | CHAR_BIT_UPPER,
+  [UNICODE_CATEGORY_Ll] = CHAR_BIT_ALPHA_ | CHAR_BIT_LOWER,
+  [UNICODE_CATEGORY_Lt] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_Lm] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_Lo] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_Mn] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_Mc] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_Me] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_Nd] = CHAR_BIT_ALNUM_,
+  [UNICODE_CATEGORY_Nl] = CHAR_BIT_ALPHA_,
+  [UNICODE_CATEGORY_No] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Pc] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Pd] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Ps] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Pe] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Pi] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Pf] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Po] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Sm] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Sc] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Sk] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_So] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Zs] = CHAR_BIT_PRINT,
+  [UNICODE_CATEGORY_Zl] = CHAR_BIT_PRINT,
+  [UNICODE_CATEGORY_Zp] = CHAR_BIT_PRINT,
+  [UNICODE_CATEGORY_Cc] = 0,
+  [UNICODE_CATEGORY_Cf] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Cs] = 0,
+  [UNICODE_CATEGORY_Co] = CHAR_BIT_PRINT | CHAR_BIT_GRAPH,
+  [UNICODE_CATEGORY_Cn] = 0,
+};
+
+#undef CHAR_BIT_ALNUM_
+#undef CHAR_BIT_ALPHA_
+
+#define P_ CHAR_BIT_PRINT
+#define G_ CHAR_BIT_GRAPH | P_
+#define N_ CHAR_BIT_ALNUM | G_
+#define U_ CHAR_BIT_UPPER | CHAR_BIT_ALPHA | N_
+#define L_ CHAR_BIT_LOWER | CHAR_BIT_ALPHA | N_
+
+const unsigned char ascii_char_bits[] = {
+/*\0  ...                                                     \17 */
+   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+/*\20  ...                                                    \37 */
+   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+/*' ' '!' '"' '#' '$' '%' '&' '´' '(' ')' '*' '+' ',' '-' '.' '/' */
+  P_, G_, G_, G_, G_, G_, G_, G_, G_, G_, G_, G_, G_, G_, G_, G_,
+/*'0' '1' '2' '3' '4' '5' '6' '7' '8' '9' ':' ';' '<' '=' '>' '?' */
+  N_, N_, N_, N_, N_, N_, N_, N_, N_, N_, G_, G_, G_, G_, G_, G_,
+/*'@' 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M' 'N' 'O' */
+  G_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_,
+/*'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z' '[' '\' ']' '^' '_' */
+  U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, U_, G_, G_, G_, G_, G_,
+/*'`' 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' */
+  G_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_,
+/*'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z' '{' '|' '}' '~' \177 */
+  L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, L_, G_, G_, G_, G_,  0,
+};
+
+#undef P_
+#undef G_
+#undef N_
+#undef U_
+#undef L_
 
 void
 syms_of_character (void)
diff --git a/src/character.h b/src/character.h
index 5931c5c..6dc95ad 100644
--- a/src/character.h
+++ b/src/character.h
@@ -652,8 +652,78 @@ typedef enum {
   UNICODE_CATEGORY_Cs,
   UNICODE_CATEGORY_Co,
   UNICODE_CATEGORY_Cn
+  /* Don’t forget to extend category_char_bits in character.c when new entries
+     are added here. */
 } unicode_category_t;
 
+extern unicode_category_t char_unicode_category (int);
+
+/* Limited set of character categories which syntax-independent.  Testing of
+ * those characters do not require any run-time data, e.g. do not depend on
+ * syntax table. */
+#define CHAR_BIT_ALNUM        (1 << 0)
+#define CHAR_BIT_ALPHA        (1 << 1)
+#define CHAR_BIT_UPPER        (1 << 2)
+#define CHAR_BIT_LOWER        (1 << 3)
+#define CHAR_BIT_GRAPH        (1 << 4)
+#define CHAR_BIT_PRINT        (1 << 5)
+
+/* Map from Unicode general category to character classes the character is in.
+ *
+ * Only character classes defined by CHAR_BIT_* above are present.
+ *
+ * This is an array of bit fields so for example ‘category_char_bits[gc] &
+ * CHAR_BIT_ALPHA’ tells you whether characters in general category GC are
+ * alphabetic or not. */
+extern const unsigned char category_char_bits[];
+
+/* Map from ASCII character to character classes the character is in.
+ *
+ * Only character classes defined by CHAR_BIT_* above are present.
+ *
+ * This is an array of bit fields so for example ascii_char_bits[ch] &
+ * CHAR_BIT_ALPHA’ tells you whether character CH is alphabetic or not. */
+extern const unsigned char ascii_char_bits[128];
+
+#define DEFINE_CHAR_TEST(name, bit)					\
+  static inline bool unicode_ ## name (int c) {			\
+    return category_char_bits[char_unicode_category(c)] & bit;		\
+  }									\
+  static inline bool _ascii_ ## name (int c) {				\
+    return ascii_char_bits[c] & bit;					\
+  }									\
+  static inline bool name (int c) {					\
+    return (unsigned)c < (sizeof ascii_char_bits / sizeof *ascii_char_bits) ? \
+      _ascii_ ## name (c) : unicode_ ## name (c);			\
+  }
+
+/* For TEST being one of:
+     alphanumericp
+     alphabeticp
+     uppercasep
+     lowercasep
+     graphicp
+     printablep
+   define
+     bool TEST (int c);
+     bool unicode_TEST (int c);
+     bool _ascii_TEST (int c);
+   which test whether C has given character property.  TEST works for any
+   character, Unicode or not.  unicode_TEST works for any character as well but
+   is potentially slower for ASCII characters (since it requires Unicode
+   category lookup).  _ascii_TEST works for ASCII characters only and creates
+   naked singularity if non-ASCII character is passed to it. */
+
+DEFINE_CHAR_TEST (alphanumericp, CHAR_BIT_ALNUM)
+DEFINE_CHAR_TEST (alphabeticp, CHAR_BIT_ALPHA)
+DEFINE_CHAR_TEST (uppercasep, CHAR_BIT_UPPER)
+DEFINE_CHAR_TEST (lowercasep, CHAR_BIT_LOWER)
+DEFINE_CHAR_TEST (graphicp, CHAR_BIT_GRAPH)
+DEFINE_CHAR_TEST (printablep, CHAR_BIT_PRINT)
+
+#undef DEFINE_CHAR_TEST
+
+
 extern EMACS_INT char_resolve_modifier_mask (EMACS_INT) ATTRIBUTE_CONST;
 extern int char_string (unsigned, unsigned char *);
 extern int string_char (const unsigned char *,
@@ -676,13 +746,6 @@ extern ptrdiff_t lisp_string_width (Lisp_Object, ptrdiff_t,
 extern Lisp_Object Vchar_unify_table;
 extern Lisp_Object string_escape_byte8 (Lisp_Object);
 
-extern bool uppercasep (int);
-extern bool lowercasep (int);
-extern bool alphabeticp (int);
-extern bool alphanumericp (int);
-extern bool graphicp (int);
-extern bool printablep (int);
-
 /* Return a translation table of id number ID.  */
 #define GET_TRANSLATION_TABLE(id) \
   (XCDR (XVECTOR (Vtranslation_table_vector)->contents[(id)]))
diff --git a/src/regex.c b/src/regex.c
index 1917a84..02da1fb 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -313,6 +313,11 @@ enum syntaxcode { Swhitespace = 0, Sword = 1, Ssymbol = 2 };
 
 /* The rest must handle multibyte characters.  */
 
+# define ISALNUM(c) alphanumericp (c)
+# define ISALPHA(c) alphabeticp (c)
+# define ISUPPER(c) uppercasep (c)
+# define ISLOWER(c) lowercasep (c)
+
 # define ISGRAPH(c) (SINGLE_BYTE_CHAR_P (c)				\
 		     ? (c) > ' ' && !((c) >= 0177 && (c) <= 0240)	\
 		     : graphicp (c))
@@ -321,19 +326,6 @@ enum syntaxcode { Swhitespace = 0, Sword = 1, Ssymbol = 2 };
 		    ? (c) >= ' ' && !((c) >= 0177 && (c) <= 0237)	\
 		     : printablep (c))
 
-# define ISALNUM(c) (IS_REAL_ASCII (c)			\
-		    ? (((c) >= 'a' && (c) <= 'z')	\
-		       || ((c) >= 'A' && (c) <= 'Z')	\
-		       || ((c) >= '0' && (c) <= '9'))	\
-		    : alphanumericp (c))
-
-# define ISALPHA(c) (IS_REAL_ASCII (c)			\
-		    ? (((c) >= 'a' && (c) <= 'z')	\
-		       || ((c) >= 'A' && (c) <= 'Z'))	\
-		    : alphabeticp (c))
-
-# define ISLOWER(c) lowercasep (c)
-
 # define ISPUNCT(c) (IS_REAL_ASCII (c)				\
 		    ? ((c) > ' ' && (c) < 0177			\
 		       && !(((c) >= 'a' && (c) <= 'z')		\
@@ -343,8 +335,6 @@ enum syntaxcode { Swhitespace = 0, Sword = 1, Ssymbol = 2 };
 
 # define ISSPACE(c) (SYNTAX (c) == Swhitespace)
 
-# define ISUPPER(c) uppercasep (c)
-
 # define ISWORD(c) (SYNTAX (c) == Sword)
 
 #else /* not emacs */
-- 
2.8.0.rc3.226.g39d4020






  parent reply	other threads:[~2016-10-04  1:10 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-04  1:05 bug#24603: [RFC 00/18] Improvement to casing Michal Nazarewicz
2016-10-04  1:10 ` bug#24603: [RFC 01/18] Add tests for casefiddle.c Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 02/18] Generate upcase and downcase tables from Unicode data Michal Nazarewicz
2016-10-04  7:27     ` Eli Zaretskii
2016-10-04 14:54       ` Michal Nazarewicz
2016-10-04 15:06         ` Eli Zaretskii
2016-10-04 16:57           ` Michal Nazarewicz
2016-10-04 17:27             ` Eli Zaretskii
2016-10-04 17:44               ` Eli Zaretskii
2016-10-06 20:29                 ` Michal Nazarewicz
2016-10-07  6:52                   ` Eli Zaretskii
2016-10-04  1:10   ` bug#24603: [RFC 03/18] Don’t assume character can be either upper- or lower-case when casing Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 04/18] Split casify_object into multiple functions Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 05/18] Introduce case_character function Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 06/18] Add support for title-casing letters Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 07/18] Split up casify_region function Michal Nazarewicz
2016-10-04  7:17     ` Eli Zaretskii
2016-10-18  2:27       ` Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 08/18] Support casing characters which map into multiple code points Michal Nazarewicz
2016-10-04  7:38     ` Eli Zaretskii
2016-10-06 21:40       ` Michal Nazarewicz
2016-10-07  7:46         ` Eli Zaretskii
2017-01-28 23:48           ` Michal Nazarewicz
2017-02-10  9:12             ` Eli Zaretskii
2016-10-04  1:10   ` bug#24603: [RFC 09/18] Implement special sigma casing rule Michal Nazarewicz
2016-10-04  7:22     ` Eli Zaretskii
2016-10-04  1:10   ` bug#24603: [RFC 10/18] Implement Turkic dotless and dotted i handling when casing strings Michal Nazarewicz
2016-10-04  7:12     ` Eli Zaretskii
2016-10-04  1:10   ` bug#24603: [RFC 11/18] Implement casing rules for Lithuanian Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 12/18] Implement rules for title-casing Dutch ij ‘letter’ Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 13/18] Add some tricky Unicode characters to regex test Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 14/18] Factor out character category lookup to separate function Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 15/18] Base lower- and upper-case tests on Unicode properties Michal Nazarewicz
2016-10-04  6:54     ` Eli Zaretskii
2016-10-04  1:10   ` Michal Nazarewicz [this message]
2016-10-04  7:48     ` bug#24603: [RFC 16/18] Refactor character class checking; optimise ASCII case Eli Zaretskii
2016-10-17 13:22       ` Michal Nazarewicz
2016-11-06 19:26       ` Michal Nazarewicz
2016-11-06 19:44         ` Eli Zaretskii
2016-12-20 14:32           ` Michal Nazarewicz
2016-12-20 16:39             ` Eli Zaretskii
2016-12-22 14:02               ` Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 17/18] Optimise character class matching in regexes Michal Nazarewicz
2016-10-04  1:10   ` bug#24603: [RFC 18/18] Fix case-fold-search character class matching Michal Nazarewicz
2016-10-17 22:03 ` bug#24603: [PATCH 0/3] Case table updates Michal Nazarewicz
2016-10-17 22:03   ` bug#24603: [PATCH 1/3] Add tests for casefiddle.c Michal Nazarewicz
2016-10-17 22:03   ` bug#24603: [PATCH 2/3] Generate upcase and downcase tables from Unicode data Michal Nazarewicz
2016-10-17 22:03   ` bug#24603: [PATCH 3/3] Don’t generate ‘X maps to X’ entries in case tables Michal Nazarewicz
2016-10-18  6:36   ` bug#24603: [PATCH 0/3] Case table updates Eli Zaretskii
2016-10-24 15:11     ` Michal Nazarewicz
2016-10-24 15:33       ` Eli Zaretskii
2017-03-09 21:51 ` bug#24603: [PATCHv5 00/11] Casing improvements Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 01/11] Split casify_object into multiple functions Michal Nazarewicz
2017-03-10  9:00     ` Andreas Schwab
2017-03-09 21:51   ` bug#24603: [PATCHv5 02/11] Introduce case_character function Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 03/11] Add support for title-casing letters (bug#24603) Michal Nazarewicz
2017-03-11  9:03     ` Eli Zaretskii
2017-03-09 21:51   ` bug#24603: [PATCHv5 04/11] Split up casify_region function (bug#24603) Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 05/11] Support casing characters which map into multiple code points (bug#24603) Michal Nazarewicz
2017-03-11  9:14     ` Eli Zaretskii
2017-03-21  2:09       ` Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 06/11] Implement special sigma casing rule (bug#24603) Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 07/11] Introduce ‘buffer-language’ buffer-locar variable Michal Nazarewicz
2017-03-11  9:29     ` Eli Zaretskii
2017-03-09 21:51   ` bug#24603: [PATCHv5 08/11] Implement rules for title-casing Dutch ij ‘letter’ (bug#24603) Michal Nazarewicz
2017-03-11  9:40     ` Eli Zaretskii
2017-03-16 21:30       ` Michal Nazarewicz
2017-03-17 13:43         ` Eli Zaretskii
2017-03-09 21:51   ` bug#24603: [PATCHv5 09/11] Implement Turkic dotless and dotted i casing rules (bug#24603) Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 10/11] Implement casing rules for Lithuanian (bug#24603) Michal Nazarewicz
2017-03-09 21:51   ` bug#24603: [PATCHv5 11/11] Implement Irish casing rules (bug#24603) Michal Nazarewicz
2017-03-11  9:44     ` Eli Zaretskii
2017-03-16 22:16       ` Michal Nazarewicz
2017-03-17  8:20         ` Eli Zaretskii
2017-03-11 10:00   ` bug#24603: [PATCHv5 00/11] Casing improvements Eli Zaretskii
2017-03-21  1:27   ` bug#24603: [PATCHv6 0/6] Casing improvements, language-independent part Michal Nazarewicz
2017-03-21  1:27     ` bug#24603: [PATCHv6 1/6] Split casify_object into multiple functions Michal Nazarewicz
2017-03-21  1:27     ` bug#24603: [PATCHv6 2/6] Introduce case_character function Michal Nazarewicz
2017-03-21  1:27     ` bug#24603: [PATCHv6 3/6] Add support for title-casing letters (bug#24603) Michal Nazarewicz
2017-03-21  1:27     ` bug#24603: [PATCHv6 4/6] Split up casify_region function (bug#24603) Michal Nazarewicz
2017-03-21  1:27     ` bug#24603: [PATCHv6 5/6] Support casing characters which map into multiple code points (bug#24603) Michal Nazarewicz
2017-03-22 16:06       ` Eli Zaretskii
2017-04-03  9:01         ` Michal Nazarewicz
2017-04-03 14:52           ` Eli Zaretskii
2019-06-25  0:09           ` Lars Ingebrigtsen
2019-06-25  0:29             ` Michał Nazarewicz
2020-08-11 13:46               ` Lars Ingebrigtsen
2021-05-10 11:51                 ` bug#24603: [RFC 00/18] Improvement to casing Lars Ingebrigtsen
2017-03-21  1:27     ` bug#24603: [PATCHv6 6/6] Implement special sigma casing rule (bug#24603) Michal Nazarewicz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1475543441-10493-16-git-send-email-mina86@mina86.com \
    --to=mina86@mina86.com \
    --cc=24603@debbugs.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).