From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: kzhr@d1.dion.ne.jp, michael.albinus@gmx.de, emacs-devel@gnu.org
Subject: Re: Multibyte and unibyte file names
Date: Fri, 25 Jan 2013 22:31:19 +0200
Message-ID: <83vcalj97s.fsf@gnu.org>
In-Reply-To: <jwvip6lts4p.fsf-monnier+emacs@gnu.org>

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org,  kzhr@d1.dion.ne.jp,  michael.albinus@gmx.de
> Date: Fri, 25 Jan 2013 06:36:39 -0500
> 
> >> That the callers get to see meaningful (decoded) names?
> >> That file-name manipulation functions don't have the side effect of
> >> encoding/decoding file names?
> > If we decode unibyte file names at entry to each primitive, before
> > doing anything else, and thereafter manipulate decoded multibyte
> > strings, this will happen anyway.
> 
> I get the impression that we're not talking about the same thing.

It looks that way.

> If you only decode on entry, then Elisp code will first see encoded file
> names returned by directory-files and will then see them converted to
> decoded form after passing the result to a file-name
> manipulation function.

No.  Elisp code will see _decoded_ file names from directory-files,
because we already decode them.  I didn't mean to change that.

What I meant was to return decoded file names from all file-name
primitives, such as file-name-nondirectory, even if their input was
encoded.
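
A minimal sketch of that idea, written in Python rather than Emacs
Lisp or C.  The function `nondirectory` and its `coding` parameter are
hypothetical stand-ins for a file-name primitive and the file-name
coding system; they are not Emacs APIs.  The sketch decodes an encoded
(bytes) argument on entry and always returns a decoded string.

```python
def nondirectory(name, coding="utf-8"):
    """Return the last component of NAME as a decoded string.

    Hypothetical analogue of a file-name primitive: NAME may be
    encoded (bytes) or already decoded (str).
    """
    # Decode on entry, before any manipulation, so that the rest of
    # the function only ever handles characters, never raw bytes.
    if isinstance(name, bytes):
        name = name.decode(coding)
    # Split on the directory separator in the decoded text.
    return name.rsplit("/", 1)[-1]


decoded_result = nondirectory("dir/file.txt")
encoded_result = nondirectory("dir/表".encode("shift_jis"), "shift_jis")
```

Both calls return a decoded string, regardless of which representation
the caller passed in.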

> Which is why I suggest to decode right away in the functions that return
> file names (e.g. directory-files).

We already do that, so there's no issue in that department.

The issue is in the file-name primitives that want to support both
encoded and decoded file names, and as I understand from this
discussion, this feature should stay.

> > But since everybody (at least those who spoke) seem to think this is a
> > w32 only problem, I will solve it for w32 only.
> 
> I think the specific problems you mentioned are mostly non-issues under
> POSIX, but the general problem of deciding which representation to use
> is more general.

I thought this was already decided in favor of decoded file names,
a.k.a. "multibyte strings".  The few calls that pass encoded file
names are rare exceptions, but since we want to keep support for
encoded file names, fixing those few places is not going to buy us
anything except code reshuffling.

The problem with encoded file names is that we have little support for
them.  E.g., we cannot up-/down-case them (unless we know the encoding
is supported by the current locale).  For multibyte encodings other
than UTF-8, we also cannot scan them by characters, only by bytes, so
e.g. strchr will not generally work reliably.  We are crippled.
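
To illustrate both limitations, here is a small sketch in Python.  It
assumes Shift_JIS as the example of a multibyte encoding that is not
UTF-8, and the file names are made up.  A byte-wise search (the
analogue of strchr) finds a backslash inside a two-byte character, and
byte-wise case conversion alters a trail byte that merely happens to
look like an ASCII letter.

```python
# Assumption: Shift_JIS stands in for "a multibyte encoding that is
# not UTF-8"; the file names are invented for the example.
name = "表.txt"
encoded = name.encode("shift_jis")  # the trail byte of 表 is 0x5C

# Byte-wise scan: reports a backslash that is really half a character.
byte_hit = encoded.find(b"\\")
# Character-wise scan of the decoded name: no backslash present.
char_hit = name.find("\\")

kana = "ア"
kana_bytes = kana.encode("shift_jis")  # the trail byte is 0x41, ASCII 'A'

# Byte-wise down-casing rewrites the trail byte and so changes the
# character; down-casing the decoded text leaves it alone.
bytes_changed = kana_bytes.lower() != kana_bytes
text_changed = kana.lower() != kana
```

Here `byte_hit` is 1 while `char_hit` is -1, and `bytes_changed` is
true while `text_changed` is false.  UTF-8 avoids both effects because
its non-ASCII characters never contain ASCII-range bytes.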

So some things will never work with encoded file names, but I guess no
one cares, because most of those problems go away if the encoding is
UTF-8.  Fine; if no one cares, neither do I.


