From: Michael Albinus
Newsgroups: gmane.emacs.bugs
Subject: bug#18051: 24.3.92; ls-lisp: Sorting; make ls-lisp-string-lessp a normal function?
Date: Sun, 20 Jul 2014 17:26:04 +0200
Message-ID: <87egxggigj.fsf@gmx.de>
In-Reply-To: <83tx6c44x7.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 20 Jul 2014 14:59:16 +0300")
To: Eli Zaretskii
Cc: michael_heerdegen@web.de, Paul Eggert, 18051@debbugs.gnu.org

Eli Zaretskii writes:

>> Maybe we should expose glib's g_utf8_collate() on Lisp level.
>
> Are you sure this does the job? Glib docs are minimal, and don't seem
> to mention UTS#10. E.g., if g_utf8_collate relies on the underlying
> libc's strcoll, we are back at square one.

Well, I've checked the code of g_utf8_collate in glib 2.36. In short, it
does

--8<---------------cut here---------------start------------->8---
#ifdef HAVE_CARBON
  UCCompareTextDefault (kUCCollateStandardOptions,
                        str1_utf16, len1, str2_utf16, len2,
                        NULL, &retval);
#elif defined(__STDC_ISO_10646__)
  result = wcscoll ((wchar_t *)str1_norm, (wchar_t *)str2_norm);
#else /* !__STDC_ISO_10646__ */
  result = strcoll (str1_norm, str2_norm);
#endif
--8<---------------cut here---------------end--------------->8---

Likely, wcscoll implements only ISO 14651 (a subset of UCA these days),
and likely wcscoll supports single byte characters only. I will run some
tests in the next few days.

An alternative would be libicu, which seems to implement UCA completely.
I have no idea whether there are licensing issues when linking with
Emacs, though.

Maybe Paul knows better which library to use? I've seen in GNU grep's
ChangeLogs that wcscoll was used, but removed last year. I haven't
checked (yet) what the replacement is.

>> On systems without glib, we might emulate it partially. Packages
>> like ls-lisp could use it then for sorting.
>
> I think we need our own implementation in any case.
> If nothing else, that would solve the issue of encoding strings into
> UTF-8 before calling external C functions.

Yep. But given the complexity of UCA, we will start slowly, with only a
subset of the algorithm. This and performance considerations will still
call for a native C library, if available.

Best regards, Michael.
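
For anyone who wants to try the two libc branches of the snippet above on
their own system, here is a minimal sketch. The helper name collate_mb is
hypothetical (it is neither glib nor Emacs code), and unlike
g_utf8_collate it does no Unicode normalization. It assumes the POSIX
behaviour of mbstowcs with a NULL destination, and it simply uses wcscoll
when the strings convert to wide characters and strcoll otherwise.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of glib or Emacs: compare two
   multibyte strings according to the current locale.  No Unicode
   normalization is performed, unlike g_utf8_collate.  */
int
collate_mb (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);

  /* With a NULL destination, POSIX mbstowcs returns the number of
     wide characters needed, or (size_t) -1 on an invalid sequence.  */
  size_t la = mbstowcs (NULL, a, 0);
  size_t lb = mbstowcs (NULL, b, 0);

  if (la != (size_t) -1 && lb != (size_t) -1)
    {
      wchar_t *wa = malloc ((la + 1) * sizeof *wa);
      wchar_t *wb = malloc ((lb + 1) * sizeof *wb);
      if (wa != NULL && wb != NULL)
        {
          /* Room for la (resp. lb) characters plus the terminator.  */
          mbstowcs (wa, a, la + 1);
          mbstowcs (wb, b, lb + 1);
          int result = wcscoll (wa, wb);
          free (wa);
          free (wb);
          return result;
        }
      /* Allocation failed: free whatever we got and fall through.  */
      free (wa);
      free (wb);
    }

  /* Fallback: let libc collate the multibyte strings directly.  */
  return strcoll (a, b);
}
```

In the default "C" locale both branches reduce to plain code point
comparison. To see locale-specific ordering, a caller would first run
setlocale (LC_ALL, "") from <locale.h>; what comes out then depends on
the collation tables shipped with the libc, which is the open question
in this thread.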