From: Michael Albinus <michael.albinus@gmx.de>
To: Eli Zaretskii <eliz@gnu.org>
Cc: michael_heerdegen@web.de, 18051@debbugs.gnu.org
Subject: bug#18051: 24.3.92; ls-lisp: Sorting; make ls-lisp-string-lessp a normal function?
Date: Sat, 23 Aug 2014 18:42:44 +0200
Message-ID: <87a96vuph7.fsf@gmx.de>
In-Reply-To: <83a96vmv80.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 23 Aug 2014 12:05:51 +0300")

[-- Attachment #1: Type: text/plain, Size: 1858 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

> I think everything in str_collate starting with the "Convert byte
> stream to code pointers." comment (btw, I guess you meant "code
> points" here) should be in a separate function, and the best place for
> that function is sysdep.c.  At least on MS-Windows, both the part that
> converts a Lisp string into wchar_t array, and the part that performs
> a locale-sensitive string comparison, will be implemented differently.

Well, I've moved (most of) str_collate to sysdep.c.

> Thanks.  (You didn't attach the new patch.)

Oops. Appended this time.

> Btw, I wonder whether we should have a way to pass the locale string
> explicitly, instead of relying on $LC_COLLATE.

We could add an optional argument to string-collate-*. But this would
break signature equivalence with string-lessp and string-equal,
respectively.

Or we could introduce a global variable, which callers would let-bind
to the locale string.
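
To make the "explicit locale" idea concrete, here is a minimal
standalone sketch in plain C.  It is not part of the attached patch:
the helper name collate_in_locale and its third parameter are
hypothetical, and it assumes a POSIX.1-2008 system that provides
newlocale, uselocale and freelocale.  If the locale name is NULL or
cannot be loaded, it simply compares in the thread's current locale.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper, not part of the patch: compare S1 and S2 under
   the collation rules of LOCALE_NAME.  If LOCALE_NAME is NULL or
   cannot be loaded, the thread's current locale is used instead.  */
int
collate_in_locale (const wchar_t *s1, const wchar_t *s2,
                   const char *locale_name)
{
  locale_t loc = (locale_t) 0, oldloc = (locale_t) 0;
  int res;

  /* Create a locale object that overrides only LC_COLLATE and make it
     the current locale of this thread.  */
  if (locale_name
      && (loc = newlocale (LC_COLLATE_MASK, locale_name, (locale_t) 0)))
    oldloc = uselocale (loc);

  res = wcscoll (s1, s2);

  /* Switch back first, then free: a locale object should not be freed
     while it is still the thread's current locale.  */
  if (oldloc)
    uselocale (oldloc);
  if (loc)
    freelocale (loc);

  return res;
}
```

A caller would pass the locale name directly, for instance
collate_in_locale (L"abc", L"abd", "C"), rather than setting
LC_COLLATE in the environment first.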

>> I have added also configure checks HAVE_NEWLOCALE, HAVE_USELOCALE and
>> HAVE_FREELOCALE for the respective glibc functions. I don't know whether
>> it is overengineering, and whether I could simply apply the existing
>> HAVE_SETLOCALE check. I believe all these functions do exist in parallel
>> in locale.h, don't they?
>
> I'll defer to glibc experts on that.  My knowledge of 'newlocale'
> facilities is limited to what I saw in Guile's i18n.c module.

According to the man pages, setlocale conforms to "C89, C99,
POSIX.1-2001", while {new,use,free}locale conform to "POSIX.1-2008".
So we must indeed check for HAVE_USELOCALE.  Checks for HAVE_NEWLOCALE
and HAVE_FREELOCALE are not necessary; those functions are available
wherever uselocale is (introduced in glibc 2.3).

This raises the question of whether we should also fall back to my
first setlocale approach when uselocale is absent.
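
For illustration only, a setlocale-based fallback might look roughly
like the sketch below.  It is not the code from the earlier patch
revision: the helper name collate_with_setlocale is hypothetical, and
it assumes that temporarily changing the process-wide LC_COLLATE
setting is acceptable.  Since setlocale affects the whole process, not
just the calling thread, the sketch saves and restores the previous
setting.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical fallback, not part of the patch: compare S1 and S2
   under LOCALE_NAME using only ISO C facilities.  The process-wide
   LC_COLLATE setting is changed for the duration of the comparison
   and restored afterwards.  */
int
collate_with_setlocale (const wchar_t *s1, const wchar_t *s2,
                        const char *locale_name)
{
  const char *cur = setlocale (LC_COLLATE, NULL);
  char *saved = NULL;
  int res;

  /* The string returned by setlocale may be overwritten by a later
     call, so keep a private copy of the current setting.  */
  if (cur)
    {
      saved = malloc (strlen (cur) + 1);
      if (saved)
        strcpy (saved, cur);
    }

  /* If the requested locale is unknown, setlocale returns NULL and
     leaves the current setting untouched.  */
  if (locale_name)
    setlocale (LC_COLLATE, locale_name);

  res = wcscoll (s1, s2);

  /* Restore the previous collation setting.  */
  if (saved)
    {
      setlocale (LC_COLLATE, saved);
      free (saved);
    }

  return res;
}
```

The obvious drawback compared with uselocale is that other threads see
the temporary locale change as well.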

Best regards, Michael.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: collate-patch --]
[-- Type: text/x-patch, Size: 5348 bytes --]

=== modified file 'src/fns.c'
--- src/fns.c	2014-08-02 15:56:18 +0000
+++ src/fns.c	2014-08-23 15:57:06 +0000
@@ -40,7 +40,7 @@
 #include "xterm.h"
 #endif
 
-Lisp_Object Qstring_lessp;
+Lisp_Object Qstring_lessp, Qstring_collate_lessp, Qstring_collate_equalp;
 static Lisp_Object Qprovide, Qrequire;
 static Lisp_Object Qyes_or_no_p_history;
 Lisp_Object Qcursor_in_echo_area;
@@ -343,6 +343,84 @@
     }
   return i1 < SCHARS (s2) ? Qt : Qnil;
 }
+
+#ifdef __STDC_ISO_10646__
+/* Defined in sysdep.c.  */
+extern ptrdiff_t str_collate (Lisp_Object, Lisp_Object);
+#endif /* __STDC_ISO_10646__ */
+
+DEFUN ("string-collate-lessp", Fstring_collate_lessp, Sstring_collate_lessp, 2, 2, 0,
+       doc: /* Return t if first arg string is less than second in collation order.
+
+Case is significant.  Symbols are also allowed; their print names are
+used instead.
+
+This function obeys the conventions for collation order in your
+locale settings.  For example, punctuation and whitespace characters
+are considered less significant for sorting.
+
+\(sort '\("11" "12" "1 1" "1 2" "1.1" "1.2") 'string-collate-lessp)
+  => \("11" "1 1" "1.1" "12" "1 2" "1.2")
+
+If your system does not support a locale environment, this function
+behaves like `string-lessp'.
+
+If the environment variable \"LC_COLLATE\" is set in `process-environment',
+it overrides the setting of your current locale.  */)
+  (Lisp_Object s1, Lisp_Object s2)
+{
+#ifdef __STDC_ISO_10646__
+  /* Check parameters.  */
+  if (SYMBOLP (s1))
+    s1 = SYMBOL_NAME (s1);
+  if (SYMBOLP (s2))
+    s2 = SYMBOL_NAME (s2);
+  CHECK_STRING (s1);
+  CHECK_STRING (s2);
+
+  return (str_collate (s1, s2) < 0) ? Qt : Qnil;
+
+#else
+  return Fstring_lessp (s1, s2);
+#endif /* __STDC_ISO_10646__ */
+}
+
+DEFUN ("string-collate-equalp", Fstring_collate_equalp, Sstring_collate_equalp, 2, 2, 0,
+       doc: /* Return t if two strings have identical contents.
+
+Case is significant.  Symbols are also allowed; their print names are
+used instead.
+
+This function obeys the conventions for collation order in your locale
+settings.  For example, characters with different code points but
+the same meaning are considered equal, like the different grave accent
+Unicode characters.
+
+\(string-collate-equalp \(string ?\\uFF40) \(string ?\\u1FEF))
+  => t
+
+If your system does not support a locale environment, this function
+behaves like `string-equal'.
+
+If the environment variable \"LC_COLLATE\" is set in `process-environment',
+it overrides the setting of your current locale.  */)
+  (Lisp_Object s1, Lisp_Object s2)
+{
+#ifdef __STDC_ISO_10646__
+  /* Check parameters.  */
+  if (SYMBOLP (s1))
+    s1 = SYMBOL_NAME (s1);
+  if (SYMBOLP (s2))
+    s2 = SYMBOL_NAME (s2);
+  CHECK_STRING (s1);
+  CHECK_STRING (s2);
+
+  return (str_collate (s1, s2) == 0) ? Qt : Qnil;
+
+#else
+  return Fstring_equal (s1, s2);
+#endif /* __STDC_ISO_10646__ */
+}
 \f
 static Lisp_Object concat (ptrdiff_t nargs, Lisp_Object *args,
 			   enum Lisp_Type target_type, bool last_special);
@@ -4919,6 +4997,8 @@
   defsubr (&Sdefine_hash_table_test);
 
   DEFSYM (Qstring_lessp, "string-lessp");
+  DEFSYM (Qstring_collate_lessp, "string-collate-lessp");
+  DEFSYM (Qstring_collate_equalp, "string-collate-equalp");
   DEFSYM (Qprovide, "provide");
   DEFSYM (Qrequire, "require");
   DEFSYM (Qyes_or_no_p_history, "yes-or-no-p-history");
@@ -4972,6 +5052,8 @@
   defsubr (&Sstring_equal);
   defsubr (&Scompare_strings);
   defsubr (&Sstring_lessp);
+  defsubr (&Sstring_collate_lessp);
+  defsubr (&Sstring_collate_equalp);
   defsubr (&Sappend);
   defsubr (&Sconcat);
   defsubr (&Svconcat);

=== modified file 'src/sysdep.c'
--- src/sysdep.c	2014-07-14 19:23:18 +0000
+++ src/sysdep.c	2014-08-23 16:36:39 +0000
@@ -3513,3 +3513,63 @@
 }
 
 #endif	/* !defined (WINDOWSNT) */
+\f
+/* Wide character string collation.  */
+
+#ifdef __STDC_ISO_10646__
+#include <wchar.h>
+
+#ifdef HAVE_USELOCALE
+#include <locale.h>
+#endif /* HAVE_USELOCALE */
+
+ptrdiff_t
+str_collate (Lisp_Object s1, Lisp_Object s2)
+{
+  register ptrdiff_t res, len, i, i_byte;
+  wchar_t *p1, *p2;
+#ifdef HAVE_USELOCALE
+  Lisp_Object lc_collate;
+  locale_t loc = (locale_t) 0, oldloc = (locale_t) 0;
+#endif /* HAVE_USELOCALE */
+
+  USE_SAFE_ALLOCA;
+
+  /* Convert byte stream to code points.  */
+  len = SCHARS (s1); i = i_byte = 0;
+  p1 = (wchar_t *) SAFE_ALLOCA ((len+1) * (sizeof *p1));
+  while (i < len)
+    FETCH_STRING_CHAR_ADVANCE (*(p1+i-1), s1, i, i_byte);
+  *(p1+len) = 0;
+
+  len = SCHARS (s2); i = i_byte = 0;
+  p2 = (wchar_t *) SAFE_ALLOCA ((len+1) * (sizeof *p2));
+  while (i < len)
+    FETCH_STRING_CHAR_ADVANCE (*(p2+i-1), s2, i, i_byte);
+  *(p2+len) = 0;
+
+#ifdef HAVE_USELOCALE
+  /* Create a new locale object, and set it.  */
+  lc_collate =
+    Fgetenv_internal (build_string ("LC_COLLATE"), Vprocess_environment);
+
+  if (STRINGP (lc_collate)
+      && (loc = newlocale (LC_COLLATE_MASK, SSDATA (lc_collate), (locale_t) 0)))
+    oldloc = uselocale (loc);
+#endif /* HAVE_USELOCALE */
+
+  res = wcscoll (p1, p2);
+
+#ifdef HAVE_USELOCALE
+  /* Reset the locale first, then free the locale object.  */
+  if (oldloc)
+    uselocale (oldloc);
+  if (loc)
+    freelocale (loc);
+#endif /* HAVE_USELOCALE */
+
+  /* Return result.  */
+  SAFE_FREE ();
+  return res;
+}
+#endif /* __STDC_ISO_10646__ */

