unofficial mirror of emacs-devel@gnu.org 
* Add a predicate for canonical file name
@ 2016-09-12  8:23 Tino Calancha
  2016-09-12 17:13 ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Tino Calancha @ 2016-09-12  8:23 UTC
  To: Emacs developers; +Cc: tino.calancha


Hi,

I don't see a predicate in Emacs for a file name being canonical.
We have a predicate for absolute file names, `file-name-absolute-p'.
In some cases we might want to check if the file name is canonical.
For instance, the following may fail to move point to the simple.el line,
because the argument passed to `dired-goto-file' is not canonical:

(let* ((dir (expand-file-name "lisp" source-directory))
        (file (expand-file-name "simple.el" dir)))
   (when (file-name-absolute-p file)
     (dired-other-window dir)
     (goto-char (point-min))
     (dired-goto-file (abbreviate-file-name file))))
     ;; (dired-goto-file file))) ; This works.


Do you think it makes sense to add a predicate like the following?
If so, where should it be defined?

(defsubst myfile-name-canonical-p (filename)
   "Return non-nil if FILENAME specifies an absolute canonical file name."
   (string= filename (expand-file-name filename)))

The following is a simple comparison of `file-name-absolute-p' with
`myfile-name-canonical-p':

(let ((dirs '("./foo" "../foo" "/foo//bar" "/foo/./bar" "/foo/../bar" "~/bar"
              "//foo/bar" "/foo/bar" "/foo/bar/" "/sudo:baz@-pc:/foo/bar/")))
  (mapcar 'file-name-absolute-p dirs))
=> (nil nil t t t t t t t t)

(let ((dirs '("./foo" "../foo" "/foo//bar" "/foo/./bar" "/foo/../bar" "~/bar"
              "//foo/bar" "/foo/bar" "/foo/bar/" "/sudo:baz@-pc:/foo/bar/")))
  (mapcar 'myfile-name-canonical-p dirs))
=> (nil nil nil nil nil nil t t t t)
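
A minimal sketch of a workaround for the Dired example above, assuming the
only problem is the abbreviated name: expand the name again right before
handing it to `dired-goto-file'.  The helper name
`my-dired-goto-file-expanded' is hypothetical and not part of Emacs.

```elisp
(require 'dired)

(defun my-dired-goto-file-expanded (file)
  "Call `dired-goto-file' on FILE after expanding it.
Hypothetical helper, not part of Emacs."
  (dired-goto-file (expand-file-name file)))

;; Round trip: with no `directory-abbrev-alist' entries in play,
;; expanding an abbreviated name is expected to give back the
;; expanded form we started from.
(let* ((directory-abbrev-alist nil)
       (file (expand-file-name "~/foo/bar.el")))
  (string= file (expand-file-name (abbreviate-file-name file))))
```

With a wrapper like this the caller never needs to ask whether the name is
canonical; it simply makes it so before use.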





* Re: Add a predicate for canonical file name
  2016-09-12  8:23 Add a predicate for canonical file name Tino Calancha
@ 2016-09-12 17:13 ` Eli Zaretskii
  2016-09-12 20:01   ` Stefan Monnier
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2016-09-12 17:13 UTC
  To: Tino Calancha; +Cc: emacs-devel

> From: Tino Calancha <tino.calancha@gmail.com>
> Date: Mon, 12 Sep 2016 17:23:03 +0900 (JST)
> Cc: tino.calancha@gmail.com
> 
> I don't see a predicate in Emacs for a file name being canonical.

Keeping in mind that just running a file name through expand-file-name
will guarantee an absolute (and "canonical") file name as result, what
problem would such a predicate solve that invoking expand-file-name
doesn't?  Especially given that your implementation calls
expand-file-name anyway?

> We have a predicate for absolute file names, `file-name-absolute-p'.

I personally never had a problem with the notion of "absolute file
name".  Yes, file names like "~/foo/bar" and "~user/foo" cause
file-name-absolute-p to return non-nil, but this is such an obscure and
marginal feature that I doubt many Lisp programmers even remember
that.

> (defsubst myfile-name-canonical-p (filename)
>    "Return non-nil if FILENAME specifies an absolute canonical file name."
>    (string= filename (expand-file-name filename)))

Using string= here will cause false negatives, e.g. with Windows file
names that use backslashes vs forward slashes, or due to letter-case
differences on case-insensitive file systems.  Did you really mean
that?
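
A rough sketch of how the two false negatives mentioned above might be
avoided, assuming that backslashes only act as directory separators on
MS-Windows and MS-DOS and that `file-name-case-insensitive-p' (absent from
older Emacs versions) is an acceptable way to detect case-insensitive file
systems.  The name `my-file-name-expanded-p' is hypothetical.

```elisp
(defun my-file-name-expanded-p (filename)
  "Return non-nil if FILENAME already looks like its own expansion.
Hypothetical variant of the proposed predicate, not part of Emacs."
  (let* ((expanded (expand-file-name filename))
         ;; Assumption: on MS-Windows and MS-DOS a backslash is only
         ;; ever a directory separator, so treat it like a slash.
         (name (if (memq system-type '(windows-nt ms-dos))
                   (subst-char-in-string ?\\ ?/ filename)
                 filename))
         ;; `file-name-case-insensitive-p' is missing from older Emacs
         ;; versions, hence the `fboundp' guard.
         (fold (and (fboundp 'file-name-case-insensitive-p)
                    (file-name-case-insensitive-p expanded))))
    ;; `compare-strings' returns t only when the strings match.
    (eq t (compare-strings name nil nil expanded nil nil fold))))
```

This is still a purely textual test; it says nothing about symlinks or about
whether the file exists.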

But anyway, the need for this is not clear to me.

Thanks.




* Re: Add a predicate for canonical file name
  2016-09-12 17:13 ` Eli Zaretskii
@ 2016-09-12 20:01   ` Stefan Monnier
  2016-09-13  6:54     ` Philipp Stephani
  2016-10-14 21:39     ` John Yates
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Monnier @ 2016-09-12 20:01 UTC
  To: emacs-devel

> Using string= here will cause false negatives, e.g. with Windows file
> names that use backslashes vs forward slashes, or due to letter-case
> differences on case-insensitive file systems.  Did you really mean
> that?

Indeed, such notions have already been requested and discussed here, and
it's not clear exactly what is needed and when.

I can see several different meanings of "canonical", i.e. representative
member of an equivalence class (and I'd be inclined to prefer a function
that checks for "equivalence" between two file names rather than
a function that tries to find one canonical name).  The equivalence
classes could be:

- equivalent regardless of the actual on-disk data.  I.e. this can't
  take symlinks or hard links into account.  Questions remain about
  whether it could presume the "normal semantics of the most common
  file-system".  E.g. should it assume case-insensitive names in
  MacOS/Windows and case-sensitive in GNU/Linux?

- equivalent in practice for the current state of the file-system.
  You can test this equivalence by comparing the output of
  `file-attributes', except when the name corresponds to a file that
  doesn't exist (yet?).

Which flavor do you want?
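
The second flavor can be sketched as follows, assuming both names refer to
files that exist right now.  `file-attributes' describes a symlink itself
rather than its target, so the sketch resolves links with `file-truename'
first.  The name `my-same-file-p' is hypothetical; Emacs also ships
`file-equal-p', which answers a similar question.

```elisp
(defun my-same-file-p (name1 name2)
  "Return non-nil if NAME1 and NAME2 currently name the same existing file.
Hypothetical helper.  Compares inode and device numbers after
resolving symlinks; returns nil if either file does not exist."
  (let ((attrs1 (file-attributes (file-truename name1)))
        (attrs2 (file-attributes (file-truename name2))))
    (and attrs1 attrs2
         (equal (nth 10 attrs1) (nth 10 attrs2))     ; inode number
         (equal (nth 11 attrs1) (nth 11 attrs2)))))  ; device number
```

For names that do not exist yet the sketch simply answers nil, which is the
gap in this flavor noted above.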


        Stefan





* Re: Add a predicate for canonical file name
  2016-09-12 20:01   ` Stefan Monnier
@ 2016-09-13  6:54     ` Philipp Stephani
  2016-09-13 12:14       ` Stefan Monnier
  2016-10-14 21:39     ` John Yates
  1 sibling, 1 reply; 11+ messages in thread
From: Philipp Stephani @ 2016-09-13  6:54 UTC
  To: Stefan Monnier, emacs-devel


Stefan Monnier <monnier@iro.umontreal.ca> wrote on Mon, 12 Sep 2016 at 22:09:

> > Using string= here will cause false negatives, e.g. with Windows file
> > names that use backslashes vs forward slashes, or due to letter-case
> > differences on case-insensitive file systems.  Did you really mean
> > that?
>
> Indeed, such notions have already been requested and discussed here, and
> it's not clear exactly what is needed and when.
>
> I can see several different meanings of "canonical", i.e. representative
> member of an equivalence class (and I'd be inclined to prefer a function
> that checks for "equivalence" between two file names rather than
> a function that tries to find one canonical name).  The equivalence
> classes could be:
>
> - equivalent regardless of the actual on-disk data.  I.e. this can't
>   take symlinks or hard links into account.  Questions remain about
>   whether it could presume the "normal semantics of the most common
>   file-system".  E.g. should it assume case-insensitive names in
>   MacOS/Windows and case-sensitive in GNU/Linux?
>

You couldn't even do that because case-folding on Windows depends on the
file system. The only possibilities I see here are converting backslashes
into slashes (or vice versa), and collapsing "//" and "/./". You couldn't
even collapse "/../" because of symlinks. (Or you could ignore directory
symlinks, as in https://golang.org/pkg/path/#Clean).
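
A minimal sketch of the conservative cleanup described above, assuming
POSIX-style names without any special meaning for a leading "//" (so it is
not suitable as-is for UNC names on MS-Windows).  It collapses repeated
slashes and "/./" and deliberately leaves "/../" alone.  The name
`my-clean-file-name-lexically' is hypothetical.

```elisp
(defun my-clean-file-name-lexically (name)
  "Collapse repeated slashes and \"/./\" components in NAME.
Hypothetical helper.  Leaves \"..\" components untouched, because
collapsing them is only safe when no directory symlinks are involved."
  ;; First squeeze every run of slashes down to a single slash.
  (let ((result (replace-regexp-in-string "/+" "/" name)))
    ;; Then drop \"/./\" repeatedly, so that \"/././\" is handled too.
    (while (string-match "/\\./" result)
      (setq result (replace-match "/" t t result)))
    result))
```

Names such as "/foo//bar" and "/foo/./bar" both come out as "/foo/bar",
while "/foo/../bar" is returned unchanged.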


>
> - equivalent in practice for the current state of the file-system.
>   You can test this equivalence by comparing the output of
>   `file-attributes', except when the name corresponds to a file that
>   doesn't exist (yet?).
>

Like all filesystem operations, this easily introduces race conditions if
decisions are made based on it: What might be an equivalent file name now
(pointing to the same inode), might not be a nanosecond later.
Overall, such an operation sounds more useful than it actually is.



* Re: Add a predicate for canonical file name
  2016-09-13  6:54     ` Philipp Stephani
@ 2016-09-13 12:14       ` Stefan Monnier
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Monnier @ 2016-09-13 12:14 UTC
  To: Philipp Stephani; +Cc: emacs-devel

>> - equivalent regardless of the actual on-disk data.  I.e. this can't
>> take symlinks or hard links into account.  Questions remain about
>> whether it could presume the "normal semantics of the most common
>> file-system".  E.g. should it assume case-insensitive names in
>> MacOS/Windows and case-sensitive in GNU/Linux?
> You couldn't even do that because case-folding on Windows depends on the
> file system.

That's what I said: « Questions remain about whether it could presume the
"normal semantics of the most common file-system". »

Doesn't mean you can't do it.  Just that you may end up doing it in
a way which isn't what the caller expects/needs.

> You couldn't even collapse "/../" because of symlinks.

Here again, that's actually a choice.  `expand-file-name` (and by
extension, Emacs in general) already made the choice to disregard
this case.


        Stefan




* Re: Add a predicate for canonical file name
  2016-09-12 20:01   ` Stefan Monnier
  2016-09-13  6:54     ` Philipp Stephani
@ 2016-10-14 21:39     ` John Yates
  2016-10-15 22:05       ` Richard Stallman
  1 sibling, 1 reply; 11+ messages in thread
From: John Yates @ 2016-10-14 21:39 UTC
  To: Stefan Monnier; +Cc: Emacs developers


On Mon, Sep 12, 2016 at 4:01 PM, Stefan Monnier <monnier@iro.umontreal.ca>
wrote:

> I can see several different meanings of "canonical", i.e. representative
> member of an equivalence class (and I'd be inclined to prefer a function
> that checks for "equivalence" between two file names rather than
> a function that tries to find one canonical name).


A further wrinkle arises from Unicode normalization.  IIRC Linux uses NFC,
MacOS uses NFD.

/john



* Re: Add a predicate for canonical file name
  2016-10-14 21:39     ` John Yates
@ 2016-10-15 22:05       ` Richard Stallman
  2016-10-15 23:07         ` Kalle Olavi Niemitalo
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Stallman @ 2016-10-15 22:05 UTC
  To: John Yates; +Cc: monnier, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > A further wrinkle arises from Unicode normalization.  IIRC Linux uses NFC,
  > MacOS uses NFD.

Do you mean Linux, the kernel, or GNU/Linux, the operating system?
I would suppose that Linux has no reason to concern itself with this
question.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: Add a predicate for canonical file name
  2016-10-15 22:05       ` Richard Stallman
@ 2016-10-15 23:07         ` Kalle Olavi Niemitalo
  2016-10-16  0:02           ` John Yates
  2016-10-16 10:42           ` Richard Stallman
  0 siblings, 2 replies; 11+ messages in thread
From: Kalle Olavi Niemitalo @ 2016-10-15 23:07 UTC
  To: rms; +Cc: emacs-devel, monnier, John Yates

Richard Stallman <rms@gnu.org> writes:

>   > A further wrinkle arises from Unicode normalization.  IIRC Linux uses NFC,
>   > MacOS uses NFD.
>
> Do you mean Linux, the kernel, or GNU/Linux, the operating system?
> I would suppose that Linux has no reason to concern itself with this
> question.

Linux cares about it when reading file names from Apple's HFS+
file system.  I think the file names are stored as NFD on disk
and the hfsplus_readdir function converts them to NFC, except I'm
not sure the conversion entirely matches Unicode specifications.
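
To illustrate why the two normalization forms matter when comparing file
names, here is a small sketch using the `ucs-normalize' library that ships
with Emacs.  It only compares strings in Lisp and assumes nothing about what
any particular file system or driver does.

```elisp
(require 'ucs-normalize)

(let ((nfc (string #xE9))        ; "é" as one precomposed character
      (nfd (string ?e #x301)))   ; "e" followed by COMBINING ACUTE ACCENT
  (list
   ;; The raw strings differ although they display identically.
   (string= nfc nfd)
   ;; Composing the decomposed form is expected to match the NFC string.
   (string= (ucs-normalize-NFC-string nfd) nfc)
   ;; Decomposing the precomposed form is expected to match the NFD string.
   (string= (ucs-normalize-NFD-string nfc) nfd)))
```

Any predicate built on `string=' would therefore need to pick one form and
normalize both names to it before comparing.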




* Re: Add a predicate for canonical file name
  2016-10-15 23:07         ` Kalle Olavi Niemitalo
@ 2016-10-16  0:02           ` John Yates
  2016-10-16  1:05             ` Kalle Olavi Niemitalo
  2016-10-16 10:42           ` Richard Stallman
  1 sibling, 1 reply; 11+ messages in thread
From: John Yates @ 2016-10-16  0:02 UTC
  To: Kalle Olavi Niemitalo; +Cc: Emacs developers, Richard Stallman, Stefan Monnier


On Sat, Oct 15, 2016 at 7:07 PM, Kalle Olavi Niemitalo <kon@iki.fi> wrote:

> Richard Stallman <rms@gnu.org> writes:
>
> Linux cares about it when reading file names from Apple's HFS+
> file system.  I think the file names are stored as NFD on disk
> and the hfsplus_readdir function converts them to NFC, except I'm
> not sure the conversion entirely matches Unicode specifications.

I know that on Linux a readdir returns NFC strings.  Are you saying
that even though HFS+ stores NFD on MacOS it returns NFC?  Or are you
saying that the Linux HFS+ driver performs a conversion?  If only the
latter, then my point that the definition of a canonical filename is
host-OS specific remains.

/john



* Re: Add a predicate for canonical file name
  2016-10-16  0:02           ` John Yates
@ 2016-10-16  1:05             ` Kalle Olavi Niemitalo
  0 siblings, 0 replies; 11+ messages in thread
From: Kalle Olavi Niemitalo @ 2016-10-16  1:05 UTC
  To: John Yates; +Cc: Richard Stallman, Stefan Monnier, Emacs developers

John Yates <john@yates-sheets.org> writes:

> Or are you saying that the Linux HFS+ driver performs a conversion?

Yes: in Linux 3.6.0, fs/hfsplus/dir.c (hfsplus_readdir) calls
fs/hfsplus/unicode.c (hfsplus_uni2asc), which first converts
decomposed UTF-16 to precomposed UTF-16 and then converts that to
the charset specified with the "nls" mount option.




* Re: Add a predicate for canonical file name
  2016-10-15 23:07         ` Kalle Olavi Niemitalo
  2016-10-16  0:02           ` John Yates
@ 2016-10-16 10:42           ` Richard Stallman
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Stallman @ 2016-10-16 10:42 UTC
  To: Kalle Olavi Niemitalo; +Cc: emacs-devel, monnier, john

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Linux cares about it when reading file names from Apple's HFS+
  > file system.

Oh, I see.  Thanks.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.



