* Collation tests in fns-tests.el
From: Ken Brown @ 2015-10-30 17:51 UTC (permalink / raw)
To: Michael Albinus; +Cc: Emacs
Hi Michael,
I'm curious why you put the following test in fns-tests.el:
;; Punctuation and whitespace characters are not taken into account
;; for collation in other locales.
(should
 (equal
  (sort '("11" "12" "1 1" "1 2" "1.1" "1.2")
        (lambda (a b)
          (let ((w32-collate-ignore-punctuation t))
            (string-collate-lessp
             a b (if (eq system-type 'windows-nt) "enu_USA" "en_US.UTF-8")))))
  '("11" "1 1" "1.1" "12" "1 2" "1.2")))
This suggests that punctuation and whitespace should definitely not be
taken into account in non-POSIX locales. But the docstring of
'string-collate-lessp' is much less definitive:
"This function obeys the conventions for collation order in your locale
settings. For example, punctuation and whitespace characters *might* be
considered less significant for sorting." [My emphasis.]
Is there some place where emacs relies on punctuation and whitespace
being ignored? That certainly isn't the case on all supported systems,
nor is it mandated by POSIX.
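A minimal sketch for checking what a given system actually does: sort
the same list once under the "POSIX" locale and once under a UTF-8
locale, then compare the two results. The helper name
`my-sort-in-locale' is hypothetical, and the locale name "en_US.UTF-8"
is an assumption that may not be installed everywhere.

```elisp
;; Hypothetical helper, not part of Emacs: sort a copy of STRINGS
;; using the collation rules of LOCALE.
(defun my-sort-in-locale (strings locale)
  "Return a sorted copy of STRINGS, collated according to LOCALE."
  (sort (copy-sequence strings)
        (lambda (a b) (string-collate-lessp a b locale))))

;; Compare the two orderings on the current system.  The second
;; result depends on the platform's collation implementation.
(let ((strings '("11" "12" "1 1" "1 2" "1.1" "1.2")))
  (list (my-sort-in-locale strings "POSIX")
        (my-sort-in-locale strings "en_US.UTF-8")))
```

If the two lists come out identical, the system is not treating
punctuation and whitespace as less significant in the UTF-8 locale.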
Ken
P.S. My question is motivated by the fact that punctuation and
whitespace are not ignored on Cygwin in non-POSIX locales, and it does
not seem to be easy to make this happen. If you're interested in the
gory details, start here:
https://www.cygwin.com/ml/cygwin/2015-10/msg00516.html
* Re: Collation tests in fns-tests.el
From: Eli Zaretskii @ 2015-10-30 20:28 UTC (permalink / raw)
To: Ken Brown; +Cc: michael.albinus, emacs-devel
> From: Ken Brown <kbrown@cornell.edu>
> Date: Fri, 30 Oct 2015 13:51:45 -0400
> Cc: Emacs <emacs-devel@gnu.org>
>
> I'm curious why you put the following test in fns-tests.el:
>
> ;; Punctuation and whitespace characters are not taken into account
> ;; for collation in other locales.
> (should
>  (equal
>   (sort '("11" "12" "1 1" "1 2" "1.1" "1.2")
>         (lambda (a b)
>           (let ((w32-collate-ignore-punctuation t))
>             (string-collate-lessp
>              a b (if (eq system-type 'windows-nt) "enu_USA" "en_US.UTF-8")))))
>   '("11" "1 1" "1.1" "12" "1 2" "1.2")))
>
> This suggests that punctuation and whitespace should definitely not be
> taken into account in non-POSIX locales.
They were found to be ignored in all the cases we tested until now.
> But the docstring of 'string-collate-lessp' is much less definitive:
>
> "This function obeys the conventions for collation order in your locale
> settings. For example, punctuation and whitespace characters *might* be
> considered less significant for sorting." [My emphasis.]
>
> Is there some place where emacs relies on punctuation and whitespace
> being ignored?
Listing of files generally ignores them, as one example. ls-lisp.el
relies on that to emulate what 'ls' the program does on Posix hosts.
> P.S. My question is motivated by the fact that punctuation and
> whitespace are not ignored on Cygwin in non-POSIX locales, and it does
> not seem to be easy to make this happen. If you're interested in the
> gory details, start here:
>
> https://www.cygwin.com/ml/cygwin/2015-10/msg00516.html
You already said in that discussion what I'd suggest ;-)
Since Cygwin tries to be compatible with GNU/Linux (i.e. glibc), it
should indeed use some non-zero flags in its implementation of
collation-dependent string comparison. IMO, it makes no sense not to
do that, since users expect that to happen. Then the above test will
work for it, and moreover, ls-lisp.el will, too.
* Re: Collation tests in fns-tests.el
From: Eli Zaretskii @ 2015-10-30 20:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: michael.albinus, kbrown, emacs-devel
> Date: Fri, 30 Oct 2015 22:28:09 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: michael.albinus@gmx.de, emacs-devel@gnu.org
>
> > From: Ken Brown <kbrown@cornell.edu>
> > Date: Fri, 30 Oct 2015 13:51:45 -0400
> > Cc: Emacs <emacs-devel@gnu.org>
> >
> > I'm curious why you put the following test in fns-tests.el:
> >
> > ;; Punctuation and whitespace characters are not taken into account
> > ;; for collation in other locales.
> > (should
> >  (equal
> >   (sort '("11" "12" "1 1" "1 2" "1.1" "1.2")
> >         (lambda (a b)
> >           (let ((w32-collate-ignore-punctuation t))
> >             (string-collate-lessp
> >              a b (if (eq system-type 'windows-nt) "enu_USA" "en_US.UTF-8")))))
> >   '("11" "1 1" "1.1" "12" "1 2" "1.2")))
> >
> > This suggests that punctuation and whitespace should definitely not be
> > taken into account in non-POSIX locales.
>
> They were found to be ignored in all the cases we tested until now.
I should have added "in UTF-8 locales" here, sorry.
But since Cygwin nowadays behaves like a UTF-8 locale (AFAIK), this
doesn't change the conclusions regarding Cygwin behavior, I think.
* Re: Collation tests in fns-tests.el
From: Ken Brown @ 2015-10-30 21:10 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: michael.albinus, emacs-devel
On 10/30/2015 4:28 PM, Eli Zaretskii wrote:
> You already said in that discussion what I'd suggest ;-)
>
> Since Cygwin tries to be compatible to GNU/Linux (i.e. glibc), it
> should indeed use some non-zero flags in its implementation of string
> collation-dependent comparison. IMO, it makes no sense not to do
> that, since users expect that to happen.
Yes, I agree completely. The issue is implementation. Simply using the
NORM_IGNORESYMBOLS flag yields comparison functions that can return 0 on
unequal strings. Eric pointed out the problem with that; moreover, it
seriously violates users' expectations and compatibility with glibc. I
thought I had a way around that, but Corinna pointed out in
https://www.cygwin.com/ml/cygwin/2015-10/msg00559.html why my suggestion
doesn't work. At this point I'm out of ideas.
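To illustrate the problem in Lisp terms, here is a rough sketch that
only imitates ignoring symbols by stripping every non-alphanumeric
character before comparing. It is not how Cygwin or Windows implement
collation; both function names are hypothetical.

```elisp
;; Hypothetical helper: drop everything that is not a letter or digit.
(defun my-strip-symbols (s)
  "Return S with all non-alphanumeric characters removed."
  (replace-regexp-in-string "[^[:alnum:]]+" "" s))

;; Hypothetical predicate imitating a comparison that ignores symbols.
(defun my-lessp-ignoring-symbols (a b)
  "Return non-nil if A sorts before B once symbols are ignored."
  (string-lessp (my-strip-symbols a) (my-strip-symbols b)))

;; Neither string sorts before the other, yet they are not equal:
;; the comparison reports "equal" for two different strings.
(list (my-lessp-ignoring-symbols "11" "1 1")
      (my-lessp-ignoring-symbols "1 1" "11")
      (string= "11" "1 1"))
;; => (nil nil nil)
```

A comparison function with this property cannot be used to tell
distinct strings apart, which is the issue with using the flag alone.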
Ken
* Re: Collation tests in fns-tests.el
From: Eli Zaretskii @ 2015-10-30 21:35 UTC (permalink / raw)
To: Ken Brown; +Cc: michael.albinus, emacs-devel
> Cc: michael.albinus@gmx.de, emacs-devel@gnu.org
> From: Ken Brown <kbrown@cornell.edu>
> Date: Fri, 30 Oct 2015 17:10:48 -0400
>
> On 10/30/2015 4:28 PM, Eli Zaretskii wrote:
> > You already said in that discussion what I'd suggest ;-)
> >
> > Since Cygwin tries to be compatible to GNU/Linux (i.e. glibc), it
> > should indeed use some non-zero flags in its implementation of string
> > collation-dependent comparison. IMO, it makes no sense not to do
> > that, since users expect that to happen.
>
> Yes, I agree completely. The issue is implementation. Simply using the
> NORM_IGNORESYMBOLS flag yields comparison functions that can return 0 on
> unequal strings. Eric pointed out the problem with that; moreover, it
> seriously violates users' expectations and compatibility with glibc. I
> thought I had a way around that, but Corinna pointed out in
> https://www.cygwin.com/ml/cygwin/2015-10/msg00559.html why my suggestion
> doesn't work. At this point I'm out of ideas.
I don't see why that conclusion is the only reasonable one (the
"seriously violates users' expectation" part surprises me), but I
don't really consider myself an expert on this, certainly not in
Cygwin.
If Cygwin's implementation of strcoll cannot be fixed, then we should
treat this test on Cygwin as an expected failure.
* Re: Collation tests in fns-tests.el
From: Ken Brown @ 2015-10-30 22:16 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: michael.albinus, emacs-devel
On 10/30/2015 5:35 PM, Eli Zaretskii wrote:
>> Cc: michael.albinus@gmx.de, emacs-devel@gnu.org
>> From: Ken Brown <kbrown@cornell.edu>
>> Date: Fri, 30 Oct 2015 17:10:48 -0400
>>
>> On 10/30/2015 4:28 PM, Eli Zaretskii wrote:
>>> You already said in that discussion what I'd suggest ;-)
>>>
>>> Since Cygwin tries to be compatible to GNU/Linux (i.e. glibc), it
>>> should indeed use some non-zero flags in its implementation of string
>>> collation-dependent comparison. IMO, it makes no sense not to do
>>> that, since users expect that to happen.
>>
>> Yes, I agree completely. The issue is implementation. Simply using the
>> NORM_IGNORESYMBOLS flag yields comparison functions that can return 0 on
>> unequal strings. Eric pointed out the problem with that; moreover, it
>> seriously violates users' expectations and compatibility with glibc. I
>> thought I had a way around that, but Corinna pointed out in
>> https://www.cygwin.com/ml/cygwin/2015-10/msg00559.html why my suggestion
>> doesn't work. At this point I'm out of ideas.
>
> I don't see why that conclusion is the only reasonable one (the
> "seriously violates users' expectation" part surprises me), but I
> don't really consider myself an expert on this, certainly not in
> Cygwin.
>
> If Cygwin's implementation of strcoll cannot be fixed, then we should
> treat this test on Cygwin as expected failure.
I'll probably do that, but I'll wait to see if Michael has anything to add.
Ken
* Re: Collation tests in fns-tests.el
From: Michael Albinus @ 2015-10-31 8:49 UTC (permalink / raw)
To: Ken Brown; +Cc: Eli Zaretskii, emacs-devel
Ken Brown <kbrown@cornell.edu> writes:
>> If Cygwin's implementation of strcoll cannot be fixed, then we should
>> treat this test on Cygwin as expected failure.
>
> I'll probably do that, but I'll wait to see if Michael has anything to add.
I have no other idea, sorry. Let's mark the test case as expected to
fail for Cygwin. And maybe a note about this behaviour could be added in
doc/lispref/strings.texi.
> Ken
Best regards, Michael.
* Re: Collation tests in fns-tests.el
From: Eli Zaretskii @ 2015-10-31 9:07 UTC (permalink / raw)
To: Michael Albinus; +Cc: kbrown, emacs-devel
> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> Date: Sat, 31 Oct 2015 09:49:30 +0100
>
> Ken Brown <kbrown@cornell.edu> writes:
>
> >> If Cygwin's implementation of strcoll cannot be fixed, then we should
> >> treat this test on Cygwin as expected failure.
> >
> > I'll probably do that, but I'll wait to see if Michael has anything to add.
>
> I have no other idea, sorry. Let's mark the test case as expected to
> fail for Cygwin. And maybe a note about this behaviour could be added in
> doc/lispref/strings.texi.
Well, one other idea is for Cygwin's Emacs build to have its private
implementation of strcoll, similar to what w32_compare_strings does.
* Re: Collation tests in fns-tests.el
From: Ken Brown @ 2015-11-02 16:25 UTC (permalink / raw)
To: Eli Zaretskii, Michael Albinus; +Cc: emacs-devel
On 10/31/2015 5:07 AM, Eli Zaretskii wrote:
>> From: Michael Albinus <michael.albinus@gmx.de>
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
>> Date: Sat, 31 Oct 2015 09:49:30 +0100
>>
>> Ken Brown <kbrown@cornell.edu> writes:
>>
>>>> If Cygwin's implementation of strcoll cannot be fixed, then we should
>>>> treat this test on Cygwin as expected failure.
>>>
>>> I'll probably do that, but I'll wait to see if Michael has anything to add.
>>
>> I have no other idea, sorry. Let's mark the test case as expected to
>> fail for Cygwin. And maybe a note about this behaviour could be added in
>> doc/lispref/strings.texi.
>
> Well, one other idea is for Cygwin's Emacs build to have its private
> implementation of strcoll, similar to what w32_compare_strings does.
I could do this in the future if necessary. For now, I've just
followed Michael's suggestion.
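For illustration, marking an ERT test as an expected failure on Cygwin
could look roughly like the sketch below. The test name is
hypothetical and this is not necessarily the exact change that was
committed; it also assumes the "en_US.UTF-8" locale is available on
the non-Windows systems where the test runs.

```elisp
(require 'ert)

;; Hypothetical stand-in for the collation test discussed in this
;; thread.  The `:expected-result' form makes ERT treat a failure on
;; Cygwin as expected, while still requiring a pass elsewhere.
(ert-deftest my-collate-ignore-punctuation-test ()
  "Check that punctuation and whitespace are ignored for collation."
  :expected-result (if (eq system-type 'cygwin) :failed :passed)
  (should
   (equal
    (sort (list "11" "12" "1 1" "1 2" "1.1" "1.2")
          (lambda (a b)
            (let ((w32-collate-ignore-punctuation t))
              (string-collate-lessp
               a b (if (eq system-type 'windows-nt)
                       "enu_USA"
                     "en_US.UTF-8")))))
    '("11" "1 1" "1.1" "12" "1 2" "1.2"))))
```

With this form, ERT reports an unexpected pass on Cygwin if the
underlying behavior ever changes, which makes the workaround easy to
revisit.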
Ken