From: Anders Lindgren
Newsgroups: gmane.emacs.bugs
Subject: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X
Date: Fri, 18 Dec 2015 07:29:17 +0100
References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org>
To: 22169@debbugs.gnu.org
Cc: random832@fastmail.com
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:110103

Hi!

I just realized that I missed parts of the ongoing discussion -- I was
under the impression that I, as OP, would be CC:ed, but apparently I wasn't.

After reading through Random832's comments, I also see the problem with
"åäö" and "aao" not being handled correctly. Typing "a TAB" makes Emacs
delete the "a", which seems very confusing. Typing "å TAB" or "aa TAB"
works, though. (Here `(file-name-all-completions "a" ".")' returns
`("åäöfirst.txt" "aaosecond.txt")'.)

In other words, Emacs is in better shape with my patch than it was before,
but there is still some work to be done.

When it comes to "lax" matching -- I really don't think we should use it
for file names. I don't want to match "å" when I type "a" etc.

HFS+ file systems are case sensitive (it's possible this can be disabled,
but if so it's very rarely used). However, many OS X desktop applications
work hard to make this invisible to users. I think that we should keep
`read-file-name-completion-ignore-case' as it is, as this corresponds to
how files really are stored.

After giving this some thought, I think the file name matching should be
done on decoded strings (so that an "a" doesn't match the "a" in a
decomposed "å"). However, this is a major change and needs to be discussed
further.

    -- Anders

On Thu, Dec 17, 2015 at 11:01 PM, Anders Lindgren <andlind@gmail.com> wrote:
>
> Hi!
>
> I think I have solved this.
>
> The current coding system defined in ns-win.el didn't work because it
> only provided a decode function but no encode function.
>
> After revisiting the "hfs" encoder, I managed to get it to work this time.
>
> Below is a patch where I have dropped the old encoder and use the new one
> instead. The only noteworthy thing is that `ucs-normalize' is loaded by
> loadup (when ns is used) and thus included in the dumped Emacs (if I
> understand correctly). Unless anybody objects, I'll push it in a couple
> of days.
>
>     -- Anders
>
> On Tue, Dec 15, 2015 at 9:05 PM, Anders Lindgren <andlind@gmail.com> wrote:
>>
>> Hi,
>>
>>> Can you write a patch to that effect, for emacs-25 branch?
>>
>> We have to find the cause of the problem first. But once we do that,
>> this should be straightforward.
>>
>>> >     What does this return:
>>> >
>>> >     M-: (file-name-all-completions "åäö" "/that/empty/directory/") RET
>>> >
>>> > It returns nil.
>>>
>>> So this is the heart of the problem.  I assume that if you do the same
>>> with an ASCII first argument, the result is non-nil, yes?
>>
>> Yes.
>>
>>> Then the next step is to step with a debugger through
>>> file_name_completion, and see why this returns nil instead of a list
>>> of files that begin.
>>
>> Auhm, I'll see what I can do. I'm a family man and have very, very
>> limited time, but I'll see if I can find a time slot for it.
>>
>>     -- Anders
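
P.S. As a rough illustration of the point about matching on decoded strings, here is a small sketch in Python (not Emacs code). It assumes that plain Unicode NFD is a close enough stand-in for the decomposed form in which the file system hands back names, and that the decoded form is NFC. The helper names `to_fs`, `from_fs`, `complete_raw` and `complete_decoded` are made up for this example and do not correspond to anything in Emacs.

```python
import unicodedata


def to_fs(name):
    """Encode a name the way the file system is assumed to store it (NFD)."""
    return unicodedata.normalize("NFD", name)


def from_fs(raw):
    """Decode a stored name back to the composed form (NFC)."""
    return unicodedata.normalize("NFC", raw)


def complete_raw(prefix, raw_names):
    """Prefix-match directly on the stored (decomposed) names."""
    return [n for n in raw_names if n.startswith(prefix)]


def complete_decoded(prefix, raw_names):
    """Prefix-match after decoding both the prefix and the stored names."""
    wanted = from_fs(prefix)
    return [d for d in map(from_fs, raw_names) if d.startswith(wanted)]


# "\u00e5\u00e4\u00f6" is the composed spelling of the three Swedish letters.
stored = [to_fs("\u00e5\u00e4\u00f6first.txt"), "aaosecond.txt"]

# On raw names, "a" also matches the base letter of the decomposed first name.
print(len(complete_raw("a", stored)))

# On decoded names, "a" matches only the plain ASCII file name.
print(complete_decoded("a", stored))

# The composed letter still finds the non-ASCII file.
print(complete_decoded("\u00e5", stored))
```

The round trip `from_fs(to_fs(name))` gives back the composed name, which is the property a coding system with both an encode and a decode function would provide. Whether the comparison should happen on the decoded side, as in `complete_decoded`, is exactly the part that still needs discussion.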