* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
@ 2006-12-27 21:16 Richard Stallman
0 siblings, 0 replies; 4+ messages in thread
From: Richard Stallman @ 2006-12-27 21:16 UTC
Would someone please DTRT and ack?
------- Start of forwarded message -------
From: James Cloos <cloos@jhcloos.com>
To: emacs-devel@gnu.org
In-Reply-To: <E1GzG0T-0001eu-7N@fencepost.gnu.org> (Richard Stallman's message
of "Tue\, 26 Dec 2006 12\:22\:29 -0500")
Copyright: Copyright 2006 James Cloos
Date: Tue, 26 Dec 2006 13:58:12 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: rms@gnu.org
Subject: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]
|> ffap is not UTF-8 ready, I put the cursor on ./???? or whatever and it
|> acts like the file doesn't exist.
I tried it on several filenames (all of which exist on my filesystem).
On every file where the first character was ASCII and was not modified
by a combining character, ffap worked as expected.
But on every file where the second character was a combining character
(such as the file named C̶.utf8 -- that is a C followed by U+0336, which
is called COMBINING LONG STROKE OVERLAY) ffap failed to recognize the
string as being a filename. It also failed when the string started
with a non-ASCII character, such as a kanji or a Greek character.
Some testing shows that (ffap-string-at-point) skips strings such as
those described above. I guess this is because of the default value
of ffap-string-at-point-mode-alist. For finding files it looks for
strings of "--:$+<>@-Z_a-z~*?", dropping "<@" from the beginning and
dropping "@>;.,!:" from the end. That first string needs to be
expanded to support non-ASCII characters which might be used for
filenames.
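
The diagnosis above can be illustrated with a minimal sketch. This is not code from ffap.el: `my-span-at-start`, `my-ascii-file-chars`, `my-wide-file-chars` and `my-greek-path` are hypothetical names, and replacing `a-z` with `[:alpha:]` is only one assumed way of widening the set. It scans a string with `skip-chars-forward`, which accepts the same character-set syntax as the strings quoted in the message.

```elisp
;; Sketch only: hypothetical helper, not part of ffap.el.
(defun my-span-at-start (chars text)
  "Return the prefix of TEXT made up only of characters in CHARS.
CHARS uses the `skip-chars-forward' syntax."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (skip-chars-forward chars)
    (buffer-substring (point-min) (point))))

;; The default file character set quoted in the message.
(defconst my-ascii-file-chars "--:$+<>@-Z_a-z~*?")

;; Assumed variant: the same set with `a-z' swapped for `[:alpha:]'.
(defconst my-wide-file-chars "--:$+<>@-Z_[:alpha:]~*?")

;; "./" followed by Greek alpha, beta, gamma and ".txt"
;; (escapes keep this sketch independent of the file encoding).
(defconst my-greek-path "./\u03b1\u03b2\u03b3.txt")

(my-span-at-start my-ascii-file-chars "./readme.txt") ; => "./readme.txt"
(my-span-at-start my-ascii-file-chars my-greek-path)  ; => "./"
(my-span-at-start my-wide-file-chars my-greek-path)   ; => the whole path
```

With the ASCII-only set the scan stops at the first Greek letter, which matches the behavior described: the string is not picked up as a filename.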
-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
------- End of forwarded message -------
* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
@ 2007-01-04 2:32 Richard Stallman
0 siblings, 0 replies; 4+ messages in thread
From: Richard Stallman @ 2007-01-04 2:32 UTC
[I sent this message a week ago but did not get a response.]
Would someone please DTRT and ack?
* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
@ 2007-01-10 23:05 Richard Stallman
2007-01-10 23:48 ` Stefan Monnier
0 siblings, 1 reply; 4+ messages in thread
From: Richard Stallman @ 2007-01-10 23:05 UTC
[I sent this message twice but did not get a response.]
Would someone please DTRT and ack?
* Re: [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
2007-01-10 23:05 Richard Stallman
@ 2007-01-10 23:48 ` Stefan Monnier
0 siblings, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 2007-01-10 23:48 UTC
Cc: emacs-devel
> [I sent this message twice but did not get a response.]
> Would someone please DTRT and ack?
I've committed the patch below.
Stefan
--- ffap.el 07 déc 2006 21:44:40 -0500 1.60
+++ ffap.el 10 jan 2007 18:44:40 -0500
@@ -1,7 +1,7 @@
;;; ffap.el --- find file (or url) at point
;; Copyright (C) 1995, 1996, 1997, 2000, 2001, 2002, 2003, 2004,
-;; 2005, 2006 Free Software Foundation, Inc.
+;; 2005, 2006, 2007 Free Software Foundation, Inc.
;; Author: Michelangelo Grigni <mic@mathcs.emory.edu>
;; Maintainer: FSF
@@ -310,7 +310,7 @@
;;
;; It pays to put a big fancy regexp here, since ffap-guesser is
;; much more time-consuming than regexp searching:
- "[/:.~a-zA-Z]/\\|@[a-zA-Z][-a-zA-Z0-9]*\\."
+ "[/:.~[:alpha:]]/\\|@[[:alpha:]][-[:alnum:]]*\\."
"*Regular expression governing movements of `ffap-next'."
:type 'regexp
:group 'ffap)
@@ -426,7 +426,7 @@
;; (ffap-machine-p "mathcs" 5678 nil 'ping)
;; (ffap-machine-p "foo.bonk" nil nil 'ping)
;; (ffap-machine-p "foo.bonk.com" nil nil 'ping)
- (if (or (string-match "[^-a-zA-Z0-9.]" host) ; Illegal chars (?)
+ (if (or (string-match "[^-[:alnum:].]" host) ; Illegal chars (?)
(not (string-match "[^0-9]" host))) ; 1: a number? 2: quick reject
nil
(let* ((domain
@@ -575,7 +575,7 @@
(ffap-ftp-regexp (ffap-host-to-filename mach))
))
-(defvar ffap-newsgroup-regexp "^[a-z]+\\.[-+a-z_0-9.]+$"
+(defvar ffap-newsgroup-regexp "^[[:lower:]]+\\.[-+[:lower:]_0-9.]+$"
"Strings not matching this fail `ffap-newsgroup-p'.")
(defvar ffap-newsgroup-heads ; entirely inadequate
'("alt" "comp" "gnu" "misc" "news" "sci" "soc" "talk")
@@ -601,7 +601,7 @@
(setq heads nil))
(error nil)))
(or ret (not heads)
- (let ((head (string-match "\\`\\([a-z]+\\)\\." string)))
+ (let ((head (string-match "\\`\\([[:lower:]]+\\)\\." string)))
(and head (setq head (substring string 0 (match-end 1)))
(member head heads)
(setq ret string))))
@@ -780,7 +780,7 @@
("" . ffap-completable) ; completion, slow on some systems
("\\.info\\'" . ffap-info) ; gzip.info
("\\`info/" . ffap-info-2) ; info/emacs
- ("\\`[-a-z]+\\'" . ffap-info-3) ; (emacs)Top [only in the parentheses]
+ ("\\`[-[:lower:]]+\\'" . ffap-info-3) ; (emacs)Top [only in the parentheses]
("\\.elc?\\'" . ffap-el) ; simple.el, simple.elc
(emacs-lisp-mode . ffap-el-mode) ; rmail, gnus, simple, custom
;; (lisp-interaction-mode . ffap-el-mode) ; maybe
@@ -969,15 +969,15 @@
;; Slightly controversial decisions:
;; * strip trailing "@" and ":"
;; * no commas (good for latex)
- (file "--:$+<>@-Z_a-z~*?" "<@" "@>;.,!:")
+ (file "--:$+<>@-Z_[:lower:]~*?" "<@" "@>;.,!:")
;; An url, or maybe a email/news message-id:
- (url "--:=&?$+@-Z_a-z~#,%;*" "^A-Za-z0-9" ":;.,!?")
+ (url "--:=&?$+@-Z_[:lower:]~#,%;*" "^[:alnum:]" ":;.,!?")
;; Find a string that does *not* contain a colon:
- (nocolon "--9$+<>@-Z_a-z~" "<@" "@>;.,!?")
+ (nocolon "--9$+<>@-Z_[:lower:]~" "<@" "@>;.,!?")
;; A machine:
- (machine "-a-zA-Z0-9." "" ".")
+ (machine "-[:alnum:]." "" ".")
;; Mathematica paths: allow backquotes
- (math-mode ",-:$+<>@-Z_a-z~`" "<" "@>;.,!?`:")
+ (math-mode ",-:$+<>@-Z_[:lower:]~`" "<" "@>;.,!?`:")
)
"Alist of \(MODE CHARS BEG END\), where MODE is a symbol,
possibly a major-mode name, or one of the symbol
@@ -1062,7 +1062,7 @@
(let ((name (ffap-string-at-point 'url)))
(cond
((string-match "^url:" name) (setq name (substring name 4)))
- ((and (string-match "\\`[^:</>@]+@[^:</>@]+[a-zA-Z0-9]\\'" name)
+ ((and (string-match "\\`[^:</>@]+@[^:</>@]+[[:alnum:]]\\'" name)
;; "foo@bar": could be "mailto" or "news" (a Message-ID).
;; Without "<>" it must be "mailto". Otherwise could be
;; either, so consult `ffap-foo-at-bar-prefix'.
@@ -1074,7 +1074,7 @@
"mailto")))
(and prefix (setq name (concat prefix ":" name))))))
((ffap-newsgroup-p name) (setq name (concat "news:" name)))
- ((and (string-match "\\`[a-z0-9]+\\'" name) ; <mic> <root> <nobody>
+ ((and (string-match "\\`[[:alnum:]]+\\'" name) ; <mic> <root> <nobody>
(equal (ffap-string-around) "<>")
;; (ffap-user-p name):
(not (string-match "~" (expand-file-name (concat "~" name))))
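
The effect of swapping ASCII ranges for character classes in the regexps can be checked with a small sketch. `my-whole-match-p` and `my-greek-name` are hypothetical names introduced here for illustration; only `string-match` and the bracket-expression syntax come from Emacs itself. `case-fold-search` is bound to nil so that the comparison does not depend on case folding, which `string-match` otherwise applies by default.

```elisp
;; Sketch only: hypothetical helper, not part of ffap.el.
(defun my-whole-match-p (class string)
  "Return t if STRING consists only of characters matching CLASS.
CLASS is the inside of a bracket expression, e.g. \"a-z0-9\"."
  (let ((case-fold-search nil))
    (and (string-match (concat "\\`[" class "]+\\'") string) t)))

;; Greek alpha, beta, gamma, written with escapes so the sketch
;; does not depend on the file encoding.
(defconst my-greek-name "\u03b1\u03b2\u03b3")

(my-whole-match-p "a-z0-9" "root")           ; => t
(my-whole-match-p "a-z0-9" my-greek-name)    ; => nil
(my-whole-match-p "[:alnum:]" my-greek-name) ; => t
(my-whole-match-p "a-z0-9" "Root")           ; => nil
(my-whole-match-p "[:alnum:]" "Root")        ; => t
```

The last two calls show that `[:alnum:]` is broader than `a-z0-9` even within ASCII, since it also admits uppercase letters when case folding is off.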