From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Anders Lindgren
Newsgroups: gmane.emacs.bugs
Subject: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X
Date: Fri, 18 Dec 2015 09:38:08 +0100
Message-ID:
References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83h9jgxloz.fsf@gnu.org>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=001a114405c619e3360527280ff7
X-Trace: ger.gmane.org 1450427960 5529 80.91.229.3 (18 Dec 2015 08:39:20 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Fri, 18 Dec 2015 08:39:20 +0000 (UTC)
Cc: 22169@debbugs.gnu.org
To: Eli Zaretskii
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Fri Dec 18 09:39:12 2015
Return-path:
Envelope-to: geb-bug-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1a9qZD-0003A5-SR for geb-bug-gnu-emacs@m.gmane.org; Fri, 18 Dec 2015 09:39:12 +0100
Original-Received: from localhost ([::1]:58976 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9qZD-0001CO-6s for geb-bug-gnu-emacs@m.gmane.org; Fri, 18 Dec 2015 03:39:11 -0500
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:46271) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9qZ7-0001C6-Ss for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 03:39:07 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9qZ4-0002BW-F8 for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 03:39:05 -0500
Original-Received: from debbugs.gnu.org ([208.118.235.43]:47506) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9qZ4-0002BR-Cq for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 03:39:02 -0500
Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84) (envelope-from ) id 1a9qZ3-00025q-VJ for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 03:39:01 -0500
X-Loop: help-debbugs@gnu.org
Resent-From: Anders Lindgren
Original-Sender: "Debbugs-submit"
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Fri, 18 Dec 2015 08:39:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 22169
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords:
Original-Received: via spool by 22169-submit@debbugs.gnu.org id=B22169.14504278967994 (code B ref 22169); Fri, 18 Dec 2015 08:39:01 +0000
Original-Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 08:38:16 +0000
Original-Received: from localhost ([127.0.0.1]:55108 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9qYK-00024s-EP for submit@debbugs.gnu.org; Fri, 18 Dec 2015 03:38:16 -0500
Original-Received: from mail-vk0-f52.google.com ([209.85.213.52]:36315) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9qYI-00024g-MW for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 03:38:15 -0500
Original-Received: by mail-vk0-f52.google.com with SMTP id f2so22819892vkb.3 for <22169@debbugs.gnu.org>; Fri, 18 Dec 2015 00:38:14 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=eDmTxLW8aPtDn+NeU4DZPk4jBfk7bkQ6ZdwCK4Fwp6A=; b=FRqAfSpZLMx+i9cDIjPanX4vdFK5U0uqqDp4wDzp/yYeV0dEgks2mB6DYH5z7X9Cfc /dcvAb/S8esjUmQtIwD5aAmFqwWamcsJ+spiDSbGSf7YK3NSb4Ym4QptvAtiylpQe+fU UTfMglQUUAO9fDR2/96UKxQF0MQyZRMAXhBztz22pZaTBc46HctnO9Qxj2yiPCjZk7R6 WsPpdEpmQFtIHGcrI0+zEZH3+5x/dQPHjTlB6VxlQSgwIXQqvgRmkfm+mnyVrGWjM7rd TuorkhVbZfHVO+WvATQEeVGvG9WPBAUofTRLCZ9IBV1z3GwMXegtp+gsgOaaVyl77I8j /F/g==
X-Received: by 10.31.58.74 with SMTP id h71mr1469408vka.149.1450427889081; Fri, 18 Dec 2015 00:38:09 -0800 (PST)
Original-Received: by 10.31.210.133 with HTTP; Fri, 18 Dec 2015 00:38:08 -0800 (PST)
In-Reply-To: <83h9jgxloz.fsf@gnu.org>
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 208.118.235.43
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.bugs:110109
Archived-At:

--001a114405c619e3360527280ff7
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

> > Below is a patch where I have dropped the old encoder and use the new instead.
> > The only thing noteworthy is that `ucs-normalize' is loaded by loadup (when ns
> > is used) and thus included in the dumped Emacs (if I understand correctly).
> > Unless anybody objects, I'll push it in a couple of days.
>
> Looks good to me, with one comment:
>
> > diff --git a/lisp/loadup.el b/lisp/loadup.el
> > index f0caa8b..dda433e 100644
> > --- a/lisp/loadup.el
> > +++ b/lisp/loadup.el
> > @@ -276,6 +276,7 @@
> >  (if (featurep 'ns)
> >      (progn
> >        (load "term/common-win")
> > +      (load "international/ucs-normalize")
> >        (load "term/ns-win")))
> >  (if (fboundp 'x-create-frame)
> >      ;; Do it after loading term/foo-win.el since the value of the
> > diff --git a/lisp/term/ns-win.el b/lisp/term/ns-win.el
> > index 0b3e3bd..9bd59fc 100644
> > --- a/lisp/term/ns-win.el
> > +++ b/lisp/term/ns-win.el
> > @@ -51,6 +51,7 @@
> >  (require 'menu-bar)
> >  (require 'fontset)
> >  (require 'dnd)
> > +(require 'ucs-normalize)
>
> Why do you need the 'require' if loadup will unconditionally load
> ucs-normalize?

I was just trying to follow the pattern in ns-win.el: there are a number of requires at the beginning, after a comment saying ";; Documentation-purposes only: actually loaded in loadup.el."
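For illustration only (this is not part of the patch): a minimal sketch, assuming a stock Emacs in which "international/ucs-normalize" can be found on `load-path', of why such a `require' is redundant but harmless once loadup has loaded the file. The `demo-' names are made up for this example.

```elisp
;; Load the library explicitly, the way loadup.el does in the patch.
(load "international/ucs-normalize" nil t)

;; The file ends with (provide 'ucs-normalize), so the feature is now known.
(featurep 'ucs-normalize)        ; => t

;; Hypothetical helper: record every file that gets loaded from here on.
(defvar demo-loaded-files nil)
(defun demo-record-load (file)
  (push file demo-loaded-files))
(add-hook 'after-load-functions #'demo-record-load)

;; `require' sees that the feature is already provided and loads nothing.
(require 'ucs-normalize)
demo-loaded-files                ; => nil

(remove-hook 'after-load-functions #'demo-record-load)
```

In other words, the `require' lines at the top of ns-win.el would cost nothing at run time; whether to keep them is purely a documentation question.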
I can easily drop the line, if you think it's better.

> > After giving this some thought, it feels like the file name matching should be
> > done on decoded strings (so that an "a" doesn't match the "a" in a decomposed
> > "å"). However, this is a major change and needs to be discussed further.
>
> I rather think it's a non-starter, at least for Emacs 25.1.  It
> probably means users of all systems will be punished by slower
> directory searches, on behalf of one peculiar filesystem.  Unless
> there's some clever idea that avoids decoding each file name returned
> by readdir, that is.

The eternal question of correctness versus speed...

My gut feeling is that the time it takes to decode the file names is dwarfed by the time it takes to read the file list from the hard disk (this needs to be verified, of course). In addition, on systems like Linux, encoding and decoding are no-ops (as both the source and the destination are UTF-8), so there won't be a penalty there.

I agree that this is not a project for Emacs 25.1. However, I think that we should at least explore this for future versions. I suggest that we push the current patch (after dropping the `require' line), close the current issue, and post a new bug report suggesting that completion be performed on decoded strings.

    -- Anders

--001a114405c619e3360527280ff7
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a114405c619e3360527280ff7--
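
The false match mentioned in the message (an "a" matching the "a" in a decomposed "å") can be sketched as follows. This is only an illustration on Lisp strings, assuming plain NFD as a stand-in for the decomposed form that OS X file systems return; it does not show how Emacs's file name completion is actually implemented. The `demo-' names are made up for this example.

```elisp
(require 'ucs-normalize)

;; Precomposed "å": the single character U+00E5.
(defconst demo-precomposed (string #xE5))

;; NFD splits it into "a" followed by U+030A COMBINING RING ABOVE.
(defconst demo-decomposed (ucs-normalize-NFD-string demo-precomposed))

(length demo-precomposed)                    ; => 1
(length demo-decomposed)                     ; => 2

;; Prefix matching on the decomposed form treats "a" as a match ...
(string-prefix-p "a" demo-decomposed)        ; => t

;; ... while the same test on the composed (NFC) form does not.
(string-prefix-p "a" (ucs-normalize-NFC-string demo-decomposed)) ; => nil
```

Matching on composed strings removes the false hit, at the price of normalizing every candidate name, which is the correctness-versus-speed trade-off discussed in the message.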