unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Drew Adams" <drew.adams@oracle.com>
To: "'Eli Zaretskii'" <eliz@gnu.org>
Cc: 7617@debbugs.gnu.org
Subject: bug#7617: 24.0.50; `expand-file-name': removal of slashes
Date: Sun, 12 Dec 2010 10:03:08 -0800
Message-ID: <022D0B344DB64B2A9E1B771B0695237E@us.oracle.com>
In-Reply-To: <E1PRoJe-00024G-P9@fencepost.gnu.org>

> Please explain why you consider this a bug.  foo//bar is at best
> equivalent to foo/bar,

In what way is that _ever_ equivalent?  Please show the equivalence for any of
the OS's that Emacs supports.

In interactive use, `...//abc...' is treated as just `/abc...' by Emacs.  The
prefix `.../' is ignored until you hit `RET'.  And then the prefix is dropped
when you enter the file name (`RET') - but only then.  And that's appropriate.
Other than when you actually hit `RET', the prefix should not be removed but
simply ignored.

Ignoring the prefix is a user convenience.  Dropping it when you hit `RET'
guarantees that the caller (e.g. a file-system utility) never sees it.  So much
for treating the `...//abc...' as `/abc...'.

As to collapsing multiple consecutive slashes, which is something completely
different - what is the rationale for that?  You claim that `foo//bar' is
equivalent to `foo/bar'.  Where do you get that?  Unix?  GNU/Linux?  Windows?
VMS?  I don't think so.

If they were equivalent, then Emacs would transform `...//abc...' to just
`.../abc...' when you hit `RET', not transform it to `/abc...'.  Try changing
that behavior and see how many Emacs users give you their opinion about how
equivalent `//' and `/' are. ;-)

> and at worst simply fails various system calls
> and naive Lisp code that doesn't expect more than one slash in a row.

Red herring.  When such a name is actually _entered_ (`RET') the prefix `.../'
is removed.  So there is no way that system calls see such a name.

And arguing on the basis of some supposed "naive Lisp code" is pretty weak.
Which Lisp code currently depends on this collapsing?

And you know, if you introduce _any_ change in behavior, no matter how bad, some
Lisp code will eventually adapt to it and thus expect it.  That is no argument
at all.

> > But I see that comments in
> > the current C code indicate that it is intentional (but not _why_).
> 
> I didn't find any related discussions that would explain the immediate
> reasons for the above change.  So I don't know (and certainly don't
> remember ;-) why it was made.  But I do think it's the right behavior
> for expand-file-name, because other primitives and Lisp code normally
> expects a canonicalized file name, so it would make sense to have a
> single primitive that produces such canonicalized file names.

Then please create a separate primitive that does just that.

Currently, there is no way to maintain the integrity of the user's input if that
input is passed through `expand-file-name' to do what the doc says `expand*'
does.

Currently, using `expand*' on input can easily change perfectly valid input
(valid for Emacs, since it ignores and ultimately removes the prefix `.../')
into input that is invalid (represents no file at all).

IOW, please separate this slash-collapsing from what `expand-file-name' is
actually _documented_ to do - there is nothing about slash collapsing in
`expand*'s contract with its users.

Make a separate function `collapse-slashes' which does (only) that, if you like.
Then the cleaned-up `expand-file-name' can be used alone on user input without
changing valid references to actual files into non-matches.

Or if you prefer, leave `expand*' the way it is (but document the
slash-collapsing), and create another primitive that does only what `expand*' is
documented as doing: everything except the slash-collapsing.  

That new primitive would ignore a `...//' prefix up to the second slash - the
same way Emacs ignores it during input.  That primitive would simply expand the
file name wrt the dir argument, removing `.' etc. as documented now - in sum,
exactly what we _say_ `expand*' does, and no more.

I don't really care which is done: clean up `expand*' or provide a new function
that does what we _say_ `expand*' does (and what it actually did prior to Emacs
21).

Either way I will have a function I can call to get the currently _documented_
`expand*' behavior (and only that).  I will be able to apply it to user input
and know that I haven't suddenly moved the target of the input from one file to
another (probably nonexistent) one.

> > But shouldn't `expand-file-name' do the right thing if the 
> > second arg is in fact `file-directory-p'?
> > 
> > For a user on GNU/Linux with $HOME = "/home/toto":
> > (file-directory-p "~/today//usr") -> nil, but
> > (file-directory-p "~//usr/") -> t, and we have the same problem:
> > (expand-file-name "foo" "~//usr/") -> "/home/toto/usr/foo"
> >  
> > Surely the behavior here is buggy, no?
> 
> Sorry, I don't see a bug here.  Please explain more.
> 
> > The result should be "/home/toto//usr/foo", I would think.
> 
> Not clear why.  As I said, "/home/toto//usr/foo" is at best equivalent
> to "/home/toto/usr/foo",

It is not equivalent - neither at best nor at worst - not at all.

In Emacs, file-name input `/home/toto//usr/foo' is interpreted as `/usr/foo'
when you enter it.  That _is_ an equivalence (for Emacs file-name input).  The
former is not valid for the file system, of course, but in Emacs the two names
are _equivalent as input_.

Assuming the file `/usr/foo' exists, `/home/toto//usr/foo' targets it validly
(in Emacs).  But `/home/toto/usr/foo' is not valid: most likely no such file
exists, and even if it did exist it would not be the same file as `/usr/foo'.

That's the point.  Valid file-name input (for Emacs) is in this case changed by
`expand-file-name' into invalid input (invalid in the sense of not representing
the same file).  Not good; a bug.
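
To make the distinction concrete, here is a minimal Python sketch of the two
transformations being contrasted.  It is only a model for illustration, not
Emacs's C implementation, and the names `drop_ignored_prefix' and
`collapse_slashes' are hypothetical.  It assumes POSIX-style names and models
only the `//' rule described above.

```python
import re


def drop_ignored_prefix(name: str) -> str:
    """Model of Emacs file-name *input*: everything before the second
    slash of the last `//` is discarded (hypothetical helper)."""
    idx = name.rfind("//")
    return name if idx == -1 else name[idx + 1:]


def collapse_slashes(name: str) -> str:
    """Model of slash-collapsing: each run of slashes becomes a single
    slash (hypothetical helper)."""
    return re.sub(r"/{2,}", "/", name)


user_input = "/home/toto//usr/foo"
print(drop_ignored_prefix(user_input))  # /usr/foo
print(collapse_slashes(user_input))     # /home/toto/usr/foo
```

The two results name different files, which is the whole complaint: collapsing
does not preserve the target that the same string has as Emacs input.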

> and at worst will simply fail in another
> place.  So what does the former give you that the latter doesn't?

See above.  It maintains the integrity of the user's input:
`/home/toto//usr/foo' is, as Emacs file-name input, equivalent to `/usr/foo'.
`/home/toto/usr/foo' is _not_ equivalent.

> > _Why should_ `expand-file-name' collapse multiple consecutive
> > slashes into a single slash?
> 
> In order to produce a _canonicalized_ file name.

"Canonicalization" should not change which file is targeted.  Just because you
call the behavior "canonicalization" does not mean that the transformation is a
good one.  That's a word game.

If the user's input targets file `/usr/foo', true canonicalization should not
change the target to `/home/toto/usr/foo', which is something completely
different.

> > Finally, I need the Emacs 20 behavior for this for some of my code.
> 
> Can you describe the use-case where the old behavior is needed?

No, and it is irrelevant.  I need the behavior of `expand-file-name' minus the
slash-collapsing part.  It does not matter why I need such a function.

If `expand-file-name' were defined in Lisp I would just gather the part of it
that I need into a new function.  But I cannot do that because it is in C.

But why is it in C, BTW?  Is that really necessary?  If you argue that some
parts of it deal with platform issues and so are better handled in C, then I ask
why not leave only those parts in C and move the actual logic of the function -
what it is documented as _doing_ - to Lisp?  Expansion of the file name, removal
of `.' etc. could be done in Lisp, with sub-calls to C primitives where necessary
to perform any platform-dependent stuff that cannot be done in Lisp.

When Emacs code is in C it is much more difficult for users to (re)use it.

> > Oh, and another thing.  This behavior of `expand-file-name' 
> > is not documented.
> 
> Well, one could argue that "canonicalized" does mean removing
> consecutive slashes, 

No, one cannot argue that honestly.  You could define any transformation you
like, call it "canonicalization", and then claim that all of its behavior is
documented as soon as you have simply said that it "canonicalizes".  That's
completely disingenuous.

> but I guess it's easy enough to say that explicitly in the doc string.

Please add this missing information to the doc string, if you refuse to treat
the misbehavior as a bug.

And I hope I can get some information about how, in Lisp, to get the `expand*'
behavior minus the target-changing slash-collapsing.





