unofficial mirror of emacs-devel@gnu.org 
* One more string functions change
@ 2014-06-27 15:27 Dmitry Antipov
  2014-06-27 15:35 ` Andreas Schwab
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-27 15:27 UTC (permalink / raw)
  To: Emacs development discussions

[-- Attachment #1: Type: text/plain, Size: 862 bytes --]

I would like to convert string-equal and string-lessp to
(defun string-equal (s1 s2 &optional ignore-case) ...) and
(defun string-lessp (s1 s2 &optional ignore-case) ...), respectively.
The goals are 1) to provide a more consistent interface, similar
to compare-strings, and 2) to avoid endless and annoying Elisp
up/downcasing like:

(defun gnus-string< (s1 s2)
   "Return t if first arg string is less than second in lexicographic order.
Case is significant if and only if `case-fold-search' is nil.
Symbols are also allowed; their print names are used instead."
   (if case-fold-search
       (string-lessp (downcase (if (symbolp s1) (symbol-name s1) s1))
                     (downcase (if (symbolp s2) (symbol-name s2) s2)))
     (string-lessp s1 s2)))

Note that unlike the previous compare-strings change, this shouldn't break
backward compatibility.

Objections?

Dmitry

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: string_comparison.patch --]
[-- Type: text/x-patch; name="string_comparison.patch", Size: 24062 bytes --]

=== modified file 'doc/lispintro/emacs-lisp-intro.texi'
--- doc/lispintro/emacs-lisp-intro.texi	2014-06-10 02:20:31 +0000
+++ doc/lispintro/emacs-lisp-intro.texi	2014-06-27 14:49:59 +0000
@@ -4457,7 +4457,9 @@
 @itemx string-equal
 The @code{string-lessp} function tests whether its first argument is
 smaller than the second argument.  A shorter, alternative name for the
-same function (a @code{defalias}) is @code{string<}.
+same function (a @code{defalias}) is @code{string<}.  If the optional
+third argument is non-nil, strings are compared ignoring case
+differences.
 
 The arguments to @code{string-lessp} must be strings or symbols; the
 ordering is lexicographic, so case is significant.  The print names of
@@ -4469,7 +4471,9 @@
 
 @code{string-equal} provides the corresponding test for equality.  Its
 shorter, alternative name is @code{string=}.  There are no string test
-functions that correspond to @var{>}, @code{>=}, or @code{<=}.
+functions that correspond to @var{>}, @code{>=}, or @code{<=}.  This
+function accepts an optional third argument with the same meaning as
+in @code{stirng-lessp}.
 
 @item message
 Print a message in the echo area. The first argument is a string that

=== modified file 'doc/lispref/strings.texi'
--- doc/lispref/strings.texi	2014-04-24 15:11:04 +0000
+++ doc/lispref/strings.texi	2014-06-27 14:43:36 +0000
@@ -415,25 +415,27 @@
 @end example
 @end defun
 
-@defun string= string1 string2
+@defun string-equal string1 string2 &optional ignore-case
 This function returns @code{t} if the characters of the two strings
 match exactly.  Symbols are also allowed as arguments, in which case
-the symbol names are used.  Case is always significant, regardless of
-@code{case-fold-search}.
+the symbol names are used.  If the optional argument @var{ignore-case}
+is non-@code{nil}, characters are matched ignoring case differences.
 
 This function is equivalent to @code{equal} for comparing two strings
 (@pxref{Equality Predicates}).  In particular, the text properties of
 the two strings are ignored; use @code{equal-including-properties} if
 you need to distinguish between strings that differ only in their text
 properties.  However, unlike @code{equal}, if either argument is not a
-string or symbol, @code{string=} signals an error.
+string or symbol, @code{string-equal} signals an error.
 
 @example
-(string= "abc" "abc")
+(string-equal "abc" "abc")
      @result{} t
-(string= "abc" "ABC")
+(string-equal "abc" "ABC")
      @result{} nil
-(string= "ab" "ABC")
+(string-equal "abc" "ABC" t)
+     @result{} t
+(string-equal "ab" "ABC")
      @result{} nil
 @end example
 
@@ -454,13 +456,12 @@
 Representations}.
 @end defun
 
-@defun string-equal string1 string2
-@code{string-equal} is another name for @code{string=}.
+@defun string= string1 string2
+@code{string=} is another name for @code{string-equal}.
 @end defun
 
 @cindex lexical comparison
-@defun string< string1 string2
-@c (findex string< causes problems for permuted index!!)
+@defun string-lessp string1 string2 &optional ignore-case
 This function compares two strings a character at a time.  It
 scans both the strings at the same time to find the first pair of corresponding
 characters that do not match.  If the lesser character of these two is
@@ -468,6 +469,8 @@
 function returns @code{t}.  If the lesser character is the one from
 @var{string2}, then @var{string1} is greater, and this function returns
 @code{nil}.  If the two strings match entirely, the value is @code{nil}.
+If the optional argument @var{ignore-case} is non-@code{nil}, characters
+are compared ignoring case differences.
 
 Pairs of characters are compared according to their character codes.
 Keep in mind that lower case letters have higher numeric values in the
@@ -479,11 +482,11 @@
 
 @example
 @group
-(string< "abc" "abd")
+(string-lessp "abc" "abd")
      @result{} t
-(string< "abd" "abc")
+(string-lessp "abd" "abc")
      @result{} nil
-(string< "123" "abc")
+(string-lessp "123" "abc")
      @result{} t
 @end group
 @end example
@@ -495,15 +498,15 @@
 
 @example
 @group
-(string< "" "abc")
-     @result{} t
-(string< "ab" "abc")
-     @result{} t
-(string< "abc" "")
-     @result{} nil
-(string< "abc" "ab")
-     @result{} nil
-(string< "" "")
+(string-lessp "" "abc")
+     @result{} t
+(string-lessp "ab" "abc")
+     @result{} t
+(string-lessp "abc" "")
+     @result{} nil
+(string-lessp "abc" "ab")
+     @result{} nil
+(string-lessp "" "")
      @result{} nil
 @end group
 @end example
@@ -512,8 +515,8 @@
 are used.
 @end defun
 
-@defun string-lessp string1 string2
-@code{string-lessp} is another name for @code{string<}.
+@defun string< string1 string2
+@code{string<} is another name for @code{string-lessp}.
 @end defun
 
 @defun string-prefix-p string1 string2 &optional ignore-case

=== modified file 'src/buffer.c'
--- src/buffer.c	2014-06-23 04:11:29 +0000
+++ src/buffer.c	2014-06-27 14:33:07 +0000
@@ -435,7 +435,7 @@
     return general;
 }
 
-/* Like Fassoc, but use Fstring_equal to compare
+/* Like Fassoc, but use string_equal to compare
    (which ignores text properties),
    and don't ever QUIT.  */
 
@@ -447,7 +447,7 @@
     {
       register Lisp_Object elt, tem;
       elt = XCAR (tail);
-      tem = Fstring_equal (Fcar (elt), key);
+      tem = string_equal (Fcar (elt), key);
       if (!NILP (tem))
 	return elt;
     }
@@ -493,7 +493,7 @@
   FOR_EACH_LIVE_BUFFER (tail, buf)
     {
       if (!STRINGP (BVAR (XBUFFER (buf), filename))) continue;
-      if (!NILP (Fstring_equal (BVAR (XBUFFER (buf), filename), filename)))
+      if (!NILP (string_equal (BVAR (XBUFFER (buf), filename), filename)))
 	return buf;
     }
   return Qnil;
@@ -507,7 +507,7 @@
   FOR_EACH_LIVE_BUFFER (tail, buf)
     {
       if (!STRINGP (BVAR (XBUFFER (buf), file_truename))) continue;
-      if (!NILP (Fstring_equal (BVAR (XBUFFER (buf), file_truename), filename)))
+      if (!NILP (string_equal (BVAR (XBUFFER (buf), file_truename), filename)))
 	return buf;
     }
   return Qnil;
@@ -1076,7 +1076,7 @@
 
   CHECK_STRING (name);
 
-  tem = Fstring_equal (name, ignore);
+  tem = string_equal (name, ignore);
   if (!NILP (tem))
     return name;
   tem = Fget_buffer (name);
@@ -1101,7 +1101,7 @@
     {
       gentemp = concat2 (tem2, make_formatted_string
 			 (number, "<%"pD"d>", ++count));
-      tem = Fstring_equal (gentemp, ignore);
+      tem = string_equal (gentemp, ignore);
       if (!NILP (tem))
 	return gentemp;
       tem = Fget_buffer (gentemp);

=== modified file 'src/bytecode.c'
--- src/bytecode.c	2014-05-27 23:48:35 +0000
+++ src/bytecode.c	2014-06-27 14:33:07 +0000
@@ -1787,7 +1787,7 @@
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
-	    TOP = Fstring_equal (TOP, v1);
+	    TOP = string_equal (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
 	    NEXT;
 	  }
@@ -1797,7 +1797,7 @@
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
-	    TOP = Fstring_lessp (TOP, v1);
+	    TOP = string_lessp (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
 	    NEXT;
 	  }

=== modified file 'src/dbusbind.c'
--- src/dbusbind.c	2014-05-20 08:25:18 +0000
+++ src/dbusbind.c	2014-06-27 14:33:07 +0000
@@ -283,7 +283,7 @@
 	dbus_address_entries_free (entries);				\
 	/* Canonicalize session bus address.  */			\
 	if ((session_bus_address != NULL)				\
-	    && (!NILP (Fstring_equal					\
+	    && (!NILP (string_equal					\
 		       (bus, build_string (session_bus_address)))))	\
 	  bus = QCdbus_session_bus;					\
       }									\

=== modified file 'src/dired.c'
--- src/dired.c	2014-04-16 19:43:46 +0000
+++ src/dired.c	2014-06-27 14:33:07 +0000
@@ -996,9 +996,8 @@
 Comparison is in lexicographic order and case is significant.  */)
   (Lisp_Object f1, Lisp_Object f2)
 {
-  return Fstring_lessp (Fcar (f1), Fcar (f2));
+  return string_lessp (Fcar (f1), Fcar (f2));
 }
-\f
 
 DEFUN ("system-users", Fsystem_users, Ssystem_users, 0, 0, 0,
        doc: /* Return a list of user names currently registered in the system.

=== modified file 'src/editfns.c'
--- src/editfns.c	2014-06-23 04:11:29 +0000
+++ src/editfns.c	2014-06-27 14:33:07 +0000
@@ -137,7 +137,7 @@
 
   /* If the user name claimed in the environment vars differs from
      the real uid, use the claimed name to find the full name.  */
-  tem = Fstring_equal (Vuser_login_name, Vuser_real_login_name);
+  tem = string_equal (Vuser_login_name, Vuser_real_login_name);
   if (! NILP (tem))
     tem = Vuser_login_name;
   else

=== modified file 'src/fileio.c'
--- src/fileio.c	2014-06-23 04:11:29 +0000
+++ src/fileio.c	2014-06-27 14:33:07 +0000
@@ -2301,7 +2301,7 @@
 #ifdef DOS_NT
       /* If the file names are identical but for the case,
 	 don't attempt to move directory to itself. */
-      && (NILP (Fstring_equal (Fdowncase (file), Fdowncase (newname))))
+      && (NILP (Fstring_equal (file, newname, Qt)))
 #endif
       )
     {
@@ -2328,7 +2328,7 @@
   /* If the file names are identical but for the case, don't ask for
      confirmation: they simply want to change the letter-case of the
      file name.  */
-  if (NILP (Fstring_equal (Fdowncase (file), Fdowncase (newname))))
+  if (NILP (Fstring_equal (file, newname, Qt)))
 #endif
   if (NILP (ok_if_already_exists)
       || INTEGERP (ok_if_already_exists))
@@ -4544,8 +4544,8 @@
     }
 
   if (auto_saving
-      && NILP (Fstring_equal (BVAR (current_buffer, filename),
-			      BVAR (current_buffer, auto_save_file_name))))
+      && NILP (string_equal (BVAR (current_buffer, filename),
+			     BVAR (current_buffer, auto_save_file_name))))
     {
       val = Qutf_8_emacs;
       eol_parent = Qunix;
@@ -5023,8 +5023,8 @@
   else if (quietly)
     {
       if (auto_saving
-	  && ! NILP (Fstring_equal (BVAR (current_buffer, filename),
-				    BVAR (current_buffer, auto_save_file_name))))
+	  && ! NILP (string_equal (BVAR (current_buffer, filename),
+				   BVAR (current_buffer, auto_save_file_name))))
 	SAVE_MODIFF = MODIFF;
 
       return Qnil;

=== modified file 'src/fns.c'
--- src/fns.c	2014-06-26 07:13:13 +0000
+++ src/fns.c	2014-06-27 14:33:07 +0000
@@ -204,11 +204,43 @@
   return make_number (SBYTES (string));
 }
 
-DEFUN ("string-equal", Fstring_equal, Sstring_equal, 2, 2, 0,
+/* Similar to strcasecmp but for Lisp strings.  */
+
+static int
+string_compare (Lisp_Object s1, Lisp_Object s2, Lisp_Object ignore_case)
+{
+  ptrdiff_t i1 = 0, i1_byte = 0, i2 = 0, i2_byte = 0;
+
+  while (i1 < SCHARS (s1) && i2 < SCHARS (s2))
+    {
+      int c1, c2;
+
+      FETCH_STRING_CHAR_ADVANCE (c1, s1, i1, i1_byte);
+      FETCH_STRING_CHAR_ADVANCE (c2, s2, i2, i2_byte);
+
+      if (! NILP (ignore_case))
+       {
+         c1 = XINT (Fdowncase (make_number (c1)));
+         c2 = XINT (Fdowncase (make_number (c2)));
+       }
+
+      if (c1 != c2)
+       return c1 < c2 ? -1 : 1;
+    }
+
+  if (i1 < SCHARS (s2))
+    return -1;
+  else if (i2 < SCHARS (s1))
+    return 1;
+  return 0;
+}
+
+DEFUN ("string-equal", Fstring_equal, Sstring_equal, 2, 3, 0,
        doc: /* Return t if two strings have identical contents.
-Case is significant, but text properties are ignored.
-Symbols are also allowed; their print names are used instead.  */)
-  (register Lisp_Object s1, Lisp_Object s2)
+If IGNORE-CASE is non-nil, characters are converted to lower-case
+before comparing them.  Text properties are ignored.  Symbols are
+also allowed; their print names are used instead.  */)
+  (Lisp_Object s1, Lisp_Object s2, Lisp_Object ignore_case)
 {
   if (SYMBOLP (s1))
     s1 = SYMBOL_NAME (s1);
@@ -217,11 +249,11 @@
   CHECK_STRING (s1);
   CHECK_STRING (s2);
 
-  if (SCHARS (s1) != SCHARS (s2)
-      || SBYTES (s1) != SBYTES (s2)
-      || memcmp (SDATA (s1), SDATA (s2), SBYTES (s1)))
-    return Qnil;
-  return Qt;
+  if (NILP (ignore_case))
+    return (SCHARS (s1) != SCHARS (s2)
+	    || SBYTES (s1) != SBYTES (s2)
+	    || memcmp (SDATA (s1), SDATA (s2), SBYTES (s1))) ? Qnil : Qt;
+  return string_compare (s1, s2, ignore_case) == 0 ? Qt : Qnil;
 }
 
 DEFUN ("compare-strings", Fcompare_strings, Scompare_strings, 6, 7, 0,
@@ -300,15 +332,12 @@
   return Qt;
 }
 
-DEFUN ("string-lessp", Fstring_lessp, Sstring_lessp, 2, 2, 0,
+DEFUN ("string-lessp", Fstring_lessp, Sstring_lessp, 2, 3, 0,
        doc: /* Return t if first arg string is less than second in lexicographic order.
-Case is significant.
-Symbols are also allowed; their print names are used instead.  */)
-  (register Lisp_Object s1, Lisp_Object s2)
+If IGNORE-CASE is non-nil, characters are converted to lower-case before
+comparing them.  Symbols are also allowed; their print names are used instead.  */)
+  (Lisp_Object s1, Lisp_Object s2, Lisp_Object ignore_case)
 {
-  register ptrdiff_t end;
-  register ptrdiff_t i1, i1_byte, i2, i2_byte;
-
   if (SYMBOLP (s1))
     s1 = SYMBOL_NAME (s1);
   if (SYMBOLP (s2))
@@ -316,27 +345,9 @@
   CHECK_STRING (s1);
   CHECK_STRING (s2);
 
-  i1 = i1_byte = i2 = i2_byte = 0;
-
-  end = SCHARS (s1);
-  if (end > SCHARS (s2))
-    end = SCHARS (s2);
-
-  while (i1 < end)
-    {
-      /* When we find a mismatch, we must compare the
-	 characters, not just the bytes.  */
-      int c1, c2;
-
-      FETCH_STRING_CHAR_ADVANCE (c1, s1, i1, i1_byte);
-      FETCH_STRING_CHAR_ADVANCE (c2, s2, i2, i2_byte);
-
-      if (c1 != c2)
-	return c1 < c2 ? Qt : Qnil;
-    }
-  return i1 < SCHARS (s2) ? Qt : Qnil;
+  return string_compare (s1, s2, ignore_case) < 0 ? Qt : Qnil;
 }
-\f
+
 static Lisp_Object concat (ptrdiff_t nargs, Lisp_Object *args,
 			   enum Lisp_Type target_type, bool last_special);
 

=== modified file 'src/font.c'
--- src/font.c	2014-06-21 19:45:59 +0000
+++ src/font.c	2014-06-27 14:33:07 +0000
@@ -717,7 +717,7 @@
       Lisp_Object prev = Qnil;
 
       while (CONSP (extra)
-	     && NILP (Fstring_lessp (prop, XCAR (XCAR (extra)))))
+	     && NILP (string_lessp (prop, XCAR (XCAR (extra)))))
 	prev = extra, extra = XCDR (extra);
 
       if (NILP (prev))

=== modified file 'src/frame.c'
--- src/frame.c	2014-06-17 16:09:19 +0000
+++ src/frame.c	2014-06-27 14:33:07 +0000
@@ -2050,7 +2050,7 @@
       CHECK_STRING (name);
 
       /* Don't change the name if it's already NAME.  */
-      if (! NILP (Fstring_equal (name, f->name)))
+      if (! NILP (string_equal (name, f->name)))
 	return;
 
       /* Don't allow the user to set the frame name to F<num>, so it

=== modified file 'src/ftfont.c'
--- src/ftfont.c	2014-06-17 16:09:19 +0000
+++ src/ftfont.c	2014-06-27 14:33:07 +0000
@@ -1119,8 +1119,8 @@
 	  if (! NILP (AREF (spec, FONT_FAMILY_INDEX))
 	      && NILP (assq_no_quit (AREF (spec, FONT_FAMILY_INDEX),
 				     ftfont_generic_family_list))
-	      && NILP (Fstring_equal (AREF (spec, FONT_FAMILY_INDEX),
-				      AREF (entity, FONT_FAMILY_INDEX))))
+	      && NILP (string_equal (AREF (spec, FONT_FAMILY_INDEX),
+				     AREF (entity, FONT_FAMILY_INDEX))))
 	    entity = Qnil;
 	}
     }

=== modified file 'src/keymap.c'
--- src/keymap.c	2014-06-12 14:55:48 +0000
+++ src/keymap.c	2014-06-27 14:33:07 +0000
@@ -3199,8 +3199,8 @@
   if (INTEGERP (a->event) && !INTEGERP (b->event))
     return -1;
   if (SYMBOLP (a->event) && SYMBOLP (b->event))
-    return (!NILP (Fstring_lessp (a->event, b->event)) ? -1
-	    : !NILP (Fstring_lessp (b->event, a->event)) ? 1
+    return (!NILP (string_lessp (a->event, b->event)) ? -1
+	    : !NILP (string_lessp (b->event, a->event)) ? 1
 	    : 0);
   return 0;
 }

=== modified file 'src/lisp.h'
--- src/lisp.h	2014-06-25 12:11:08 +0000
+++ src/lisp.h	2014-06-27 14:33:07 +0000
@@ -3482,6 +3482,18 @@
 extern Lisp_Object string_make_unibyte (Lisp_Object);
 extern void syms_of_fns (void);
 
+INLINE Lisp_Object
+string_equal (Lisp_Object s1, Lisp_Object s2)
+{
+  return Fstring_equal (s1, s2, Qnil);
+}
+
+INLINE Lisp_Object
+string_lessp (Lisp_Object s1, Lisp_Object s2)
+{
+  return Fstring_lessp (s1, s2, Qnil);
+}
+
 /* Defined in floatfns.c.  */
 extern void syms_of_floatfns (void);
 extern Lisp_Object fmod_float (Lisp_Object x, Lisp_Object y);

=== modified file 'src/nsfns.m'
--- src/nsfns.m	2014-06-01 08:23:18 +0000
+++ src/nsfns.m	2014-06-27 14:33:07 +0000
@@ -180,7 +180,7 @@
   CHECK_STRING (name);
 
   for (dpyinfo = x_display_list; dpyinfo; dpyinfo = dpyinfo->next)
-    if (!NILP (Fstring_equal (XCAR (dpyinfo->name_list_element), name)))
+    if (!NILP (string_equal (XCAR (dpyinfo->name_list_element), name)))
       return dpyinfo;
 
   error ("Emacs for Nextstep does not yet support multi-display");
@@ -390,7 +390,7 @@
   /* see if it's changed */
   if (STRINGP (arg))
     {
-      if (STRINGP (oldval) && EQ (Fstring_equal (oldval, arg), Qt))
+      if (STRINGP (oldval) && EQ (string_equal (oldval, arg), Qt))
         return;
     }
   else if (!STRINGP (oldval) && EQ (oldval, Qnil) == EQ (arg, Qnil))
@@ -482,7 +482,7 @@
     CHECK_STRING (name);
 
   /* Don't change the name if it's already NAME.  */
-  if (! NILP (Fstring_equal (name, f->name)))
+  if (! NILP (string_equal (name, f->name)))
     return;
 
   fset_name (f, name);

=== modified file 'src/search.c'
--- src/search.c	2014-06-23 04:11:29 +0000
+++ src/search.c	2014-06-27 14:33:07 +0000
@@ -227,7 +227,7 @@
 	goto compile_it;
       if (SCHARS (cp->regexp) == SCHARS (pattern)
 	  && STRING_MULTIBYTE (cp->regexp) == STRING_MULTIBYTE (pattern)
-	  && !NILP (Fstring_equal (cp->regexp, pattern))
+	  && !NILP (string_equal (cp->regexp, pattern))
 	  && EQ (cp->buf.translate, (! NILP (translate) ? translate : make_number (0)))
 	  && cp->posix == posix
 	  && (EQ (cp->syntax_table, Qt)

=== modified file 'src/w32fns.c'
--- src/w32fns.c	2014-06-22 23:12:17 +0000
+++ src/w32fns.c	2014-06-27 14:33:07 +0000
@@ -1543,7 +1543,7 @@
     return;
 
   if (STRINGP (arg) && STRINGP (oldval)
-      && EQ (Fstring_equal (oldval, arg), Qt))
+      && EQ (string_equal (oldval, arg), Qt))
     return;
 
   if (SYMBOLP (arg) && SYMBOLP (oldval) && EQ (arg, oldval))
@@ -1566,7 +1566,7 @@
 {
   if (STRINGP (arg))
     {
-      if (STRINGP (oldval) && EQ (Fstring_equal (oldval, arg), Qt))
+      if (STRINGP (oldval) && EQ (string_equal (oldval, arg), Qt))
 	return;
     }
   else if (!NILP (arg) || NILP (oldval))
@@ -1774,7 +1774,7 @@
     CHECK_STRING (name);
 
   /* Don't change the name if it's already NAME.  */
-  if (! NILP (Fstring_equal (name, f->name)))
+  if (! NILP (string_equal (name, f->name)))
     return;
 
   fset_name (f, name);
@@ -5200,7 +5200,7 @@
   CHECK_STRING (name);
 
   for (dpyinfo = &one_w32_display_info; dpyinfo; dpyinfo = dpyinfo->next)
-    if (!NILP (Fstring_equal (XCAR (dpyinfo->name_list_element), name)))
+    if (!NILP (string_equal (XCAR (dpyinfo->name_list_element), name)))
       return dpyinfo;
 
   /* Use this general default value to start with.  */

=== modified file 'src/w32menu.c'
--- src/w32menu.c	2014-06-04 04:58:31 +0000
+++ src/w32menu.c	2014-06-27 14:33:07 +0000
@@ -1060,9 +1060,9 @@
     return 0;
   name = XCAR (name);
 
-  if (!NILP (Fstring_equal (name, yes)))
+  if (!NILP (string_equal (name, yes)))
     other = no;
-  else if (!NILP (Fstring_equal (name, no)))
+  else if (!NILP (string_equal (name, no)))
     other = yes;
   else
     return 0;
@@ -1075,7 +1075,7 @@
   if (!CONSP (name))
     return 0;
   name = XCAR (name);
-  if (NILP (Fstring_equal (name, other)))
+  if (NILP (string_equal (name, other)))
     return 0;
 
   /* Check there are no more options.  */
@@ -1181,7 +1181,7 @@
 	  value = Qnil;
 	}
 
-      if (!NILP (Fstring_equal (name, lispy_answer)))
+      if (!NILP (string_equal (name, lispy_answer)))
 	{
 	  return value;
 	}

=== modified file 'src/xfaces.c'
--- src/xfaces.c	2014-06-10 03:32:36 +0000
+++ src/xfaces.c	2014-06-27 14:33:07 +0000
@@ -976,7 +976,7 @@
 	     lookup STD_COLOR separately.  If it's impossible to lookup
 	     a standard color, we just give up and use TTY_COLOR.  */
 	  if ((!STRINGP (XCAR (color_desc))
-	       || NILP (Fstring_equal (color, XCAR (color_desc))))
+	       || NILP (string_equal (color, XCAR (color_desc))))
 	      && !NILP (Ffboundp (Qtty_color_standard_values)))
 	    {
 	      /* Look up STD_COLOR separately.  */

=== modified file 'src/xfns.c'
--- src/xfns.c	2014-06-22 05:00:14 +0000
+++ src/xfns.c	2014-06-27 14:33:07 +0000
@@ -895,7 +895,7 @@
 
   if (STRINGP (arg))
     {
-      if (STRINGP (oldval) && EQ (Fstring_equal (oldval, arg), Qt))
+      if (STRINGP (oldval) && EQ (string_equal (oldval, arg), Qt))
 	return;
     }
   else if (!STRINGP (oldval) && EQ (oldval, Qnil) == EQ (arg, Qnil))
@@ -927,7 +927,7 @@
 
   if (STRINGP (arg))
     {
-      if (STRINGP (oldval) && EQ (Fstring_equal (oldval, arg), Qt))
+      if (STRINGP (oldval) && EQ (string_equal (oldval, arg), Qt))
 	return;
     }
   else if (!NILP (arg) || NILP (oldval))
@@ -1442,7 +1442,7 @@
     CHECK_STRING (name);
 
   /* Don't change the name if it's already NAME.  */
-  if (! NILP (Fstring_equal (name, f->name)))
+  if (! NILP (string_equal (name, f->name)))
     return;
 
   fset_name (f, name);
@@ -4343,7 +4343,7 @@
   CHECK_STRING (name);
 
   for (dpyinfo = x_display_list; dpyinfo; dpyinfo = dpyinfo->next)
-    if (!NILP (Fstring_equal (XCAR (dpyinfo->name_list_element), name)))
+    if (!NILP (string_equal (XCAR (dpyinfo->name_list_element), name)))
       return dpyinfo;
 
   /* Use this general default value to start with.  */

=== modified file 'test/automated/fns-tests.el'
--- test/automated/fns-tests.el	2014-06-25 10:36:51 +0000
+++ test/automated/fns-tests.el	2014-06-27 15:15:23 +0000
@@ -100,3 +100,38 @@
   (should (compare-strings "こんにちはコンニチハ" nil nil "こんにちはコンニチハ" nil nil))
   (should (= (compare-strings "んにちはコンニチハこ" nil nil "こんにちはコンニチハ" nil nil) 1))
   (should (= (compare-strings "こんにちはコンニチハ" nil nil "んにちはコンニチハこ" nil nil) -1)))
+
+(ert-deftest fns-test-string-equal ()
+  (should-error (string-equal))
+  (should-error (string-equal 1 2))
+  (should-error (string-equal '[1 2 3 4] "1 2 3 4"))
+  (should-error (string-equal "aaa" "bbb" "ccc" "ddd"))
+  (should (string-equal "foo" 'foo))
+  (should (string-equal "BAR" 'bar t))
+  (should (string-equal "aaa" "aaa"))
+  (should-not (string-equal "aaa" "aaaa"))
+  (should-not (string-equal "aaaa" "aaa"))
+  (should-not (string-equal "aaa" "aab"))
+  (should-not (string-equal "aab" "aaa"))
+  (should (string-equal "AAA" "aaa" t))
+  (should (string-equal "bbb" "BBB" t))
+  (should-not (string-equal (make-string 10 1234) (make-string 11 1234)))
+  (should (string-equal "ӒӒӒ" "ӒӒӒ"))
+  (should-not (string-equal "ӓӓӓ" "ӒӒӒ"))
+  (should (string-equal "ӓӓӓ" "ӒӒӒ" t)))
+
+(ert-deftest fns-test-string-lessp ()
+  (should-error (string-lessp))
+  (should-error (string-lessp 1 2))
+  (should-error (string-lessp '[1 2 3 4] "1 2 3 4"))
+  (should-error (string-lessp "aaa" "bbb" "ccc" "ddd"))
+  (should (string-lessp "" "a"))
+  (should-not (string-lessp "" ""))
+  (should (string-lessp "aaa" "bbb"))
+  (should (string-lessp "aaa" "aab"))
+  (should (string-lessp "aaa" "aaaa"))
+  (should-not (string-lessp "ddd" "ddd"))
+  (should-not (string-lessp "ddd" "ccc"))
+  (should (string-lessp (make-string 4 1111) (make-string 4 1112)))
+  (should (string-lessp "ӒӒӒ" "ӓӓӓ"))
+  (should-not (string-lessp "ӒӒӒ" "ӓӓӓ" t)))


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: One more string functions change
  2014-06-27 15:27 One more string functions change Dmitry Antipov
@ 2014-06-27 15:35 ` Andreas Schwab
  2014-06-27 16:46 ` Paul Eggert
  2014-06-27 22:46 ` Drew Adams
  2 siblings, 0 replies; 19+ messages in thread
From: Andreas Schwab @ 2014-06-27 15:35 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: Emacs development discussions

Dmitry Antipov <dmantipov@yandex.ru> writes:

> @@ -4469,7 +4471,9 @@
>  
>  @code{string-equal} provides the corresponding test for equality.  Its
>  shorter, alternative name is @code{string=}.  There are no string test
> -functions that correspond to @var{>}, @code{>=}, or @code{<=}.
> +functions that correspond to @var{>}, @code{>=}, or @code{<=}.  This
> +function accepts an optional third argument with the same meaning as
> +in @code{stirng-lessp}.

s/stirng/string/

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: One more string functions change
  2014-06-27 15:27 One more string functions change Dmitry Antipov
  2014-06-27 15:35 ` Andreas Schwab
@ 2014-06-27 16:46 ` Paul Eggert
  2014-06-27 19:46   ` Eli Zaretskii
  2014-06-27 22:46 ` Drew Adams
  2 siblings, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2014-06-27 16:46 UTC (permalink / raw)
  To: Dmitry Antipov, Emacs development discussions

Dmitry Antipov wrote:
> If the optional
> +third argument is non-nil, strings are compared ignoring case
> +differences.

I suggest that this sort of thing be reworded to resemble the 
documentation of compare-strings, e.g., "If the optional third argument 
is non-nil, characters are converted to lower-case before comparing them."

Perhaps some day Emacs will support better case conversion, e.g., we 
could extend the third argument so that if it's a case table, that case 
table is used rather than the default one.
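
To make the case-table idea a little more concrete, here is a minimal sketch in standalone C. Everything in it is an assumption for illustration: `fold_table_ascii` and `compare_with_table` are hypothetical names, not Emacs APIs, and the 256-entry byte table is a simplification (real Emacs case tables are char-tables over characters, not bytes). It only shows how a comparison could take the folding rule as a parameter instead of hard-wiring one:

```c
#include <stddef.h>

/* Fill TABLE with an ASCII-only lower-casing map: 'A'..'Z' map to
   'a'..'z', and every other byte maps to itself.  */
static void
fold_table_ascii (unsigned char table[256])
{
  for (int i = 0; i < 256; i++)
    table[i] = (unsigned char) i;
  for (int c = 'A'; c <= 'Z'; c++)
    table[c] = (unsigned char) (c - 'A' + 'a');
}

/* Three-way compare of S1 and S2 after mapping each byte through
   TABLE.  A null TABLE means "compare exactly".  Returns -1, 0 or 1.  */
static int
compare_with_table (const char *s1, const char *s2,
                    const unsigned char *table)
{
  for (size_t i = 0;; i++)
    {
      unsigned char c1 = (unsigned char) s1[i];
      unsigned char c2 = (unsigned char) s2[i];

      /* End of either string: the shorter one sorts first.  */
      if (c1 == 0 || c2 == 0)
        return c1 == c2 ? 0 : (c1 == 0 ? -1 : 1);

      if (table)
        {
          c1 = table[c1];
          c2 = table[c2];
        }
      if (c1 != c2)
        return c1 < c2 ? -1 : 1;
    }
}
```

A caller that supplies a different table would get different folding without any change to the comparison loop itself.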




* Re: One more string functions change
  2014-06-27 16:46 ` Paul Eggert
@ 2014-06-27 19:46   ` Eli Zaretskii
  2014-06-27 20:46     ` Paul Eggert
  2014-06-28 16:21     ` Dmitry Antipov
  0 siblings, 2 replies; 19+ messages in thread
From: Eli Zaretskii @ 2014-06-27 19:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Fri, 27 Jun 2014 09:46:19 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> Perhaps some day Emacs will support better case conversion, e.g., we 
> could extend the third argument so that if it's a case table, that case 
> table is used rather than the default one.

That's not enough.  Currently, Emacs down-cases using the current
buffer's settings.  This is TRT in some cases, but very wrong in
others.  It is especially wrong when down-casing strings (as opposed
to portions of a buffer), because there's no reason to believe that a
particular string being processed has any relevance to the current
buffer and its defaults.

IOW, we don't have any good way of specifying language- or
locale-specific case-folding.  E.g., try writing a function that
compares file names case-insensitively for a given locale.  FWIW, I
think _that_ is where we should concentrate our energy, not on
nano-improvements such as the one proposed here.  But hey! 90% of
Emacs development energy goes to such changes, while important missing
features are being left unimplemented for years.  So who am I to
complain?




* Re: One more string functions change
  2014-06-27 19:46   ` Eli Zaretskii
@ 2014-06-27 20:46     ` Paul Eggert
  2014-06-27 20:53       ` Eli Zaretskii
  2014-06-28 16:21     ` Dmitry Antipov
  1 sibling, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2014-06-27 20:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmantipov, emacs-devel

Eli Zaretskii wrote:
> That's not enough.  Currently, Emacs down-cases using the current
> buffer's settings.

OK.  If the change is making it easier to do the wrong thing, perhaps we 
should leave things alone, and save our energies for making better changes.




* Re: One more string functions change
  2014-06-27 20:46     ` Paul Eggert
@ 2014-06-27 20:53       ` Eli Zaretskii
  0 siblings, 0 replies; 19+ messages in thread
From: Eli Zaretskii @ 2014-06-27 20:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Fri, 27 Jun 2014 13:46:05 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: dmantipov@yandex.ru, emacs-devel@gnu.org
> 
> If the change is making it easier to do the wrong thing, perhaps we
> should leave things alone, and save our energies for making better
> changes.

I would agree with you wholeheartedly, but I somehow doubt that this
energy will get redirected towards making better changes (although I
still hope it will).  So I don't want to object to this kind of
change on those grounds, since the contributors might find that
unreasonable.




* RE: One more string functions change
  2014-06-27 15:27 One more string functions change Dmitry Antipov
  2014-06-27 15:35 ` Andreas Schwab
  2014-06-27 16:46 ` Paul Eggert
@ 2014-06-27 22:46 ` Drew Adams
  2014-06-28  3:48   ` Dmitry Antipov
  2 siblings, 1 reply; 19+ messages in thread
From: Drew Adams @ 2014-06-27 22:46 UTC (permalink / raw)
  To: Dmitry Antipov, Emacs development discussions

> (defun gnus-string< (s1 s2)
>    "..."
>    (if case-fold-search
>        (string-lessp (downcase (if (symbolp s1) (symbol-name s1) s1))
>                      (downcase (if (symbolp s2) (symbol-name s2) s2)))
>      (string-lessp s1 s2)))

Why?  Is (string-lessp s1 s2 t) really that much handier than being
explicit?

(let ((case-fold-search  t)) (string-lessp s1 s2))

or

(string-lessp (upcase s1) (upcase s2))

We already have a global variable for this.  Why add an argument for it?

And if you want to accept symbols too as args, then "string-lessp" is
anyway a poor name for what it does.

This is no different from lots of other uses of a function that binds
a global var, or converts/casts its args, to change its behavior.

How many occurrences of such a programming cliche for `string-lessp'
do you find in the Emacs sources, for example?  One?  Zero?

YAGNI.




* Re: One more string functions change
  2014-06-27 22:46 ` Drew Adams
@ 2014-06-28  3:48   ` Dmitry Antipov
  2014-06-28 13:48     ` Drew Adams
  2014-06-30 13:18     ` Stefan Monnier
  0 siblings, 2 replies; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-28  3:48 UTC (permalink / raw)
  To: Drew Adams; +Cc: Emacs development discussions

On 06/28/2014 02:46 AM, Drew Adams wrote:

> Why?  Is (string-lessp s1 s2 t) really that much handier than being
> explicit?
>
> (let ((case-fold-search  t)) (string-lessp s1 s2))
>
> or
>
> (string-lessp (upcase s1) (upcase s2))
>
> We already have a global variable for this.  Why add an argument for it?

Value of case-fold-search doesn't affect string-lessp and string-equal.

> How many occurrences of such a programming cliche for `string-lessp'
> do you find in the Emacs sources, for example?  One?  Zero?

Just ask grep:

lisp/textmodes/flyspell.el-1016-      (while (and (not r) (setq p (search-backward word bound t)))
lisp/textmodes/flyspell.el-1017-        (let ((lw (flyspell-get-word)))
lisp/textmodes/flyspell.el-1018-          (if (and (consp lw)
lisp/textmodes/flyspell.el-1019-                   (if ignore-case
lisp/textmodes/flyspell.el:1020:                       (string-equal (downcase (car lw)) (downcase word))
lisp/textmodes/flyspell.el:1021:                     (string-equal (car lw) word)))

lisp/gnus/gnus-util.el-1437-  "Like `string-equal', except it compares case-insensitively."
lisp/gnus/gnus-util.el-1438-  (and (= (length x) (length y))
lisp/gnus/gnus-util.el:1439:       (or (string-equal x y)
lisp/gnus/gnus-util.el:1440:       (string-equal (downcase x) (downcase y)))))

lisp/gnus/gnus-util.el-1972-       (if ignore-case
lisp/gnus/gnus-util.el:1973:           (string-equal (downcase str1) (downcase prefix))
lisp/gnus/gnus-util.el:1974:         (string-equal str1 prefix))))))

lisp/info.el-2882-                         ;; Use string-equal, not equal,
lisp/info.el-2883-                         ;; to ignore text properties.
lisp/info.el:2884:                         (string-equal (downcase prevnode)
lisp/info.el-2885-                                       (downcase upnode))))

lisp/recentf.el-312-  (if recentf-case-fold-search
lisp/recentf.el:313:      (string-equal (downcase s1) (downcase s2))
lisp/recentf.el:314:    (string-equal s1 s2)))

lisp/emacs-lisp/cl-extra.el:72:       (or (string-equal x y)
lisp/emacs-lisp/cl-extra.el:73:           (string-equal (downcase x) (downcase y))))) ;Lazy but simple!

lisp/ibuf-ext.el:1128:  (string-lessp (downcase
lisp/ibuf-ext.el-1129-           (symbol-name (buffer-local-value 'major-mode (car a))))
lisp/ibuf-ext.el-1130-          (downcase
lisp/ibuf-ext.el-1131-           (symbol-name (buffer-local-value 'major-mode (car b))))))

lisp/ibuf-ext.el:1138:  (string-lessp (downcase
lisp/ibuf-ext.el-1139-           (with-current-buffer
lisp/ibuf-ext.el-1140-               (car a)
lisp/ibuf-ext.el-1141-             (format-mode-line mode-name)))
lisp/ibuf-ext.el-1142-          (downcase
lisp/ibuf-ext.el-1143-           (with-current-buffer

etc.

Dmitry





* RE: One more string functions change
  2014-06-28  3:48   ` Dmitry Antipov
@ 2014-06-28 13:48     ` Drew Adams
  2014-06-28 16:32       ` Dmitry Antipov
  2014-06-30 13:18     ` Stefan Monnier
  1 sibling, 1 reply; 19+ messages in thread
From: Drew Adams @ 2014-06-28 13:48 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: Emacs development discussions

> > (let ((case-fold-search  t)) (string-lessp s1 s2))
> > We already have a global variable for this. Why add an argument for it?
> 
> Value of case-fold-search doesn't affect string-lessp and string-equal.

Oh, right.  What was the reason for that?  Anyone know?

> > How many occurrences of such a programming cliche for `string-lessp'
> > do you find in the Emacs sources, for example?  One?  Zero?
> 
> Just ask grep:
>... lisp/gnus/gnus-util.el:1440: (string-equal (downcase x) (downcase y))
>...

> > or: (string-lessp (upcase s1) (upcase s2))

To me, that cliche seems just as easy & clear as (string-lessp s1 s2 t).




* Re: One more string functions change
  2014-06-27 19:46   ` Eli Zaretskii
  2014-06-27 20:46     ` Paul Eggert
@ 2014-06-28 16:21     ` Dmitry Antipov
  2014-06-28 17:19       ` Eli Zaretskii
  2014-06-28 17:26       ` One more string functions change Yuri Khan
  1 sibling, 2 replies; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-28 16:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

On 06/27/2014 11:46 PM, Eli Zaretskii wrote:

> That's not enough.  Currently, Emacs down-cases using the current
> buffer's settings.  This is TRT in some cases, but very wrong in
> others.  It is especially wrong when down-casing strings (as opposed
> to portions of a buffer), because there's no reason to believe that a
> particular string being processed has any relevance to the current
> buffer and its defaults.

What makes you think that the system locale is more relevant?  At the
very least, a string may be the result of a search/match operation in
the buffer.

> IOW, we don't have any good way of specifying language- or
> locale-specific case-folding.

What's wrong with case tables? If we're talking about Unicode only,
is it enough/possible/desirable to have just one (huge) case table
for all supported characters?

> FWIW, I think _that_ is where we should concentrate our energy, not on
> nano-improvements such as the one proposed here.  But hey! 90% of
> Emacs development energy goes to such changes, while important missing
> features are being left unimplemented for years.  So who am I to
> complain?

"Why are you being so harsh?  We are not the enemy" (C).

If you have a personal TODO/wishlist/roadmap/whatever, please share.
(Yes, I know about etc/TODO).

Dmitry




* Re: One more string functions change
  2014-06-28 13:48     ` Drew Adams
@ 2014-06-28 16:32       ` Dmitry Antipov
  0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-28 16:32 UTC (permalink / raw)
  To: Drew Adams; +Cc: Emacs development discussions

On 06/28/2014 05:48 PM, Drew Adams wrote:

>> Value of case-fold-search doesn't affect string-lessp and string-equal.
>
> Oh, right.  What was the reason for that?  Anyone know?

Someone was too lazy or just didn't consider this important
enough - who knows?

>>> or: (string-lessp (upcase s1) (upcase s2))
>
> To me, that cliche seems just as easy & clear as (string-lessp s1 s2 t).

Sure, but if we treat strings as immutable objects, both `upcase' calls
must create copies; the latter form can avoid memory allocation
altogether and so disprove the well-known "LISP programmers know the
value of everything and the cost of nothing" principle.
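
For comparison, the existing `compare-strings' already accepts an
IGNORE-CASE argument, so a case-insensitive test can be sketched without
calling `upcase' or `downcase' on whole strings.  The wrapper names below
are hypothetical; they are not part of Emacs or of the proposed patch.

```elisp
;; Hypothetical helpers built on `compare-strings', which returns t
;; when the strings match and a signed integer otherwise.

(defun my-string-equal-ignore-case (s1 s2)
  "Return t if S1 and S2 are equal, ignoring case."
  (eq t (compare-strings s1 nil nil s2 nil nil t)))

(defun my-string-lessp-ignore-case (s1 s2)
  "Return t if S1 sorts before S2, ignoring case."
  (let ((result (compare-strings s1 nil nil s2 nil nil t)))
    ;; A negative integer means S1 is less; t means the strings match.
    (and (integerp result) (< result 0))))
```

Like `downcase', the IGNORE-CASE comparison relies on the current case
table, so the locale concerns raised elsewhere in this thread apply here
too.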

Dmitry






* Re: One more string functions change
  2014-06-28 16:21     ` Dmitry Antipov
@ 2014-06-28 17:19       ` Eli Zaretskii
  2014-06-29  2:53         ` Dmitry Antipov
  2014-06-28 17:26       ` One more string functions change Yuri Khan
  1 sibling, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2014-06-28 17:19 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: eggert, emacs-devel

> Date: Sat, 28 Jun 2014 20:21:55 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
> 
> On 06/27/2014 11:46 PM, Eli Zaretskii wrote:
> 
> > That's not enough.  Currently, Emacs down-cases using the current
> > buffer's settings.  This is TRT in some cases, but very wrong in
> > others.  It is especially wrong when down-casing strings (as opposed
> > to portions of a buffer), because there's no reason to believe that a
> > particular string being processed has any relevance to the current
> > buffer and its defaults.
> 
> What makes you think that the system locale is more relevant?

I didn't say it was.  I said that we currently have no way of telling
Emacs to down-case in a locale-specific manner:

> > IOW, we don't have any good way of specifying language- or
> > locale-specific case-folding.
> 
> What's wrong with case tables?

They are not locale- and/or language-specific.  For example,
down-casing 'I' to 'i' is wrong for Turkish.

> If we're talking about Unicode only, is it enough/possible/desirable
> to have just one (huge) case table for all supported characters?

You can't, because language-specific rules interfere.  See section
5.18 in the Unicode Standard, and the SpecialCasing.txt file in the
Unicode Character Database.

> > FWIW, I think _that_ is where we should concentrate our energy, not on
> > nano-improvements such as the one proposed here.  But hey! 90% of
> > Emacs development energy goes to such changes, while important missing
> > features are being left unimplemented for years.  So who am I to
> > complain?
> 
> "Why are you being so harsh?  We are not the enemy" (C).

Sorry about that.

> If you have a personal TODO/wishlist/roadmap/whatever, please share.

In the department we are talking about, look at the links on this
page:

  http://www.unicode.org/reports/

UAX#14, UAX#15, UTS#10, and UTS#18 should all be supported by Emacs.
(And yes, I should complete my work on bringing the bidirectional
editing support in line with the additions to UAX#9 in Unicode 6.3.)

Elsewhere, the recent IDE and WYSIWYG editing discussions suggest
major improvements in functionality that at least some users sorely
miss.  The FFI stuff (see on-going discussions here) is yet another.

And that's just the result of ten seconds' thought.




* Re: One more string functions change
  2014-06-28 16:21     ` Dmitry Antipov
  2014-06-28 17:19       ` Eli Zaretskii
@ 2014-06-28 17:26       ` Yuri Khan
  1 sibling, 0 replies; 19+ messages in thread
From: Yuri Khan @ 2014-06-28 17:26 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: Eli Zaretskii, Paul Eggert, Emacs developers

On Sat, Jun 28, 2014 at 11:21 PM, Dmitry Antipov <dmantipov@yandex.ru> wrote:

> What's wrong with case tables? If we're talking about Unicode only,
> is it enough/possible/desirable to have just one (huge) case table
> for all supported characters?

It’s not generally possible, because in Turkic locales there is this
funny couple of letters, i and dotless ı. They uppercase into dotted İ
and I, respectively. This makes uppercase a function dependent on the
locale.

Further, comparing strings case-insensitively by downcasing is wrong,
because of this funny German letter ß (sharp s, eszett), and these
funny Greek letters σ (sigma) and ς (final sigma). Straße is
case-insensitively equivalent to STRASSE, but they downcase to straße
and strasse, respectively. Both sigma σ and final sigma ς are
case-insensitively equivalent to Capital Sigma Σ, but small letters
downcase to themselves and Capital Sigma downcases to σ.

The right, Unicode-compliant way to compare strings case-insensitively
involves a mapping called case folding, which is similar to
downcasing, but subtly different. For example, it expands ß into ss,
and normalizes final sigma to normal sigma, and does many other
expansions. Case-folded strings are largely not usable for human
consumption but only for case-insensitive comparison. Details can be
found in the Unicode Standard, section 5.18.
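
The ß case can be reproduced directly.  This is only a sketch of the
pitfall, assuming the default case table; it does not implement Unicode
case folding.

```elisp
;; Comparing by downcasing misses pairs that Unicode case folding
;; treats as equivalent.
(downcase "Straße")     ; "straße"  (ß is already lowercase)
(downcase "STRASSE")    ; "strasse"
(string-equal (downcase "Straße") (downcase "STRASSE")) ; nil
```

A folding-based comparison would map both strings to "strasse" before
comparing them.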




* Re: One more string functions change
  2014-06-28 17:19       ` Eli Zaretskii
@ 2014-06-29  2:53         ` Dmitry Antipov
  2014-06-29 15:13           ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-29  2:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Kenichi Handa, emacs-devel

On 06/28/2014 09:19 PM, Eli Zaretskii wrote:

> UAX#14, UAX#15, UTS#10, and UTS#18 should all be supported by Emacs.
> (And yes, I should complete my work on bringing the bidirectional
> editing support in line with the additions to UAX#9 in Unicode 6.3.)

BTW, why not use ICU plus our own special handling for 0x110000..0x3FFFFF?

Dmitry





* Re: One more string functions change
  2014-06-29  2:53         ` Dmitry Antipov
@ 2014-06-29 15:13           ` Eli Zaretskii
  2014-06-29 16:38             ` Dmitry Antipov
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2014-06-29 15:13 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: handa, emacs-devel

> Date: Sun, 29 Jun 2014 06:53:45 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Kenichi Handa <handa@gnu.org>, emacs-devel@gnu.org
> 
> On 06/28/2014 09:19 PM, Eli Zaretskii wrote:
> 
> > UAX#14, UAX#15, UTS#10, and UTS#18 should all be supported by Emacs.
> > (And yes, I should complete my work on bringing the bidirectional
> > editing support in line with the additions to UAX#9 in Unicode 6.3.)
> 
> BTW, why not use ICU plus our own special handling for 0x110000..0x3FFFFF?

I don't think this was ever considered.

It's possible that we should consider this now, but the answer to your
question is not a trivial one in any case.  Emacs traditionally
exposed to Lisp all the Unicode character properties, as char-tables.
If we decide to use ICU, we'd need to think what to do with those
char-tables: remove them, populate them using ICU, something else?
(Having these databases twice would be an unnecessary bloat, IMO.)
Some of these properties need to support very fast access (e.g., for
bidi display), and the question is how fast is ICU in this regard.
Also, many Unicode features are already implemented, so they should be
reworked or refactored, or maybe the corresponding ICU features left
unused.  And features that depend on Unicode, like font selection,
will have to be adapted.

IOW, just coming up with a list of pros and cons will probably require
some research, IMO.




* Re: One more string functions change
  2014-06-29 15:13           ` Eli Zaretskii
@ 2014-06-29 16:38             ` Dmitry Antipov
  2014-06-29 16:48               ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-29 16:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: handa, emacs-devel

On 06/29/2014 07:13 PM, Eli Zaretskii wrote:

> It's possible that we should consider this now, but the answer to your
> question is not a trivial one in any case.  Emacs traditionally
> exposed to Lisp all the Unicode character properties, as char-tables.

Are these exposed properties really used from Lisp in a high-level,
user-defined manner? For example, is it desirable/possible to customize
related things via .emacs? Or is there a major/minor mode that relies
on the Lisp-visible character properties?

> If we decide to use ICU, we'd need to think what to do with those
> char-tables: remove them, populate them using ICU, something else?
> (Having these databases twice would be an unnecessary bloat, IMO.)

Yes, ICU itself is bloated enough. On my system, the shared library
with compiled-in Unicode data is larger than 20 MB. Nevertheless, it's
commonly considered "not too bloated" even for relatively small systems
like modern Android-based gadgets.

> Some of these properties need to support very fast access (e.g., for
> bidi display), and the question is how fast is ICU in this regard.
> Also, many Unicode features are already implemented, so they should be
> reworked or refactored, or maybe the corresponding ICU features left
> unused.  And features that depend on Unicode, like font selection,
> will have to be adapted.

IIUC things are even worse because ICU uses 16- and 32-bit quantities
to represent Unicode characters; this doesn't look very compatible
with our internal variable-width encoding of 1 to 5 bytes per character.

Dmitry




* Re: One more string functions change
  2014-06-29 16:38             ` Dmitry Antipov
@ 2014-06-29 16:48               ` Eli Zaretskii
  2014-06-30  6:21                 ` Internationalize Emacs's messages [Was: Re: One more string functions change] Dmitry Antipov
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2014-06-29 16:48 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: handa, emacs-devel

> Date: Sun, 29 Jun 2014 20:38:26 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: handa@gnu.org, emacs-devel@gnu.org
> 
> On 06/29/2014 07:13 PM, Eli Zaretskii wrote:
> 
> > It's possible that we should consider this now, but the answer to your
> > question is not a trivial one in any case.  Emacs traditionally
> > exposed to Lisp all the Unicode character properties, as char-tables.
> 
> Are these exposed properties really used from Lisp in a high-level,
> user-defined manner? For example, is it desirable/possible to customize
> related things via .emacs? Or is there a major/minor mode that relies
> on the Lisp-visible character properties?

These are exactly the questions we should ask ourselves.  I don't know
the answers off-hand, and I don't think these issues were discussed in
the past.




* Internationalize Emacs's messages [Was: Re: One more string functions change]
  2014-06-29 16:48               ` Eli Zaretskii
@ 2014-06-30  6:21                 ` Dmitry Antipov
  0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Antipov @ 2014-06-30  6:21 UTC (permalink / raw)
  To: emacs-devel; +Cc: Eli Zaretskii

BTW, if we're talking about natural language-related improvements,
is anyone working on this long-standing etc/TODO item?

Dmitry





* Re: One more string functions change
  2014-06-28  3:48   ` Dmitry Antipov
  2014-06-28 13:48     ` Drew Adams
@ 2014-06-30 13:18     ` Stefan Monnier
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2014-06-30 13:18 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: Drew Adams, Emacs development discussions

> Value of case-fold-search doesn't affect string-lessp and string-equal.

And rightly so (which part of "search" isn't clear?).



        Stefan


