* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
From: Richard Stallman @ 2007-01-10 23:05 UTC

[I sent this message twice but did not get a response.]

Would someone please DTRT and ack?

------- Start of forwarded message -------
From: James Cloos <cloos@jhcloos.com>
To: emacs-devel@gnu.org
In-Reply-To: <E1GzG0T-0001eu-7N@fencepost.gnu.org> (Richard Stallman's message of "Tue\, 26 Dec 2006 12\:22\:29 -0500")
Copyright: Copyright 2006 James Cloos
Date: Tue, 26 Dec 2006 13:58:12 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: rms@gnu.org
Subject: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]
X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO autolearn=failed version=3.0.4

|> ffap is not UTF-8 ready, I put the cursor on ./???? or whatever and it
|> acts like the file Doesn't exist.

I tried it on several filenames (all of which exist on my filesystem).

On every file where the first character was ASCII and was not modified
by a combining character, ffap worked as expected.

But on every file where the second character was a combining character
(such as the file named C?.utf8 -- that is a C followed by U+0336,
which is called COMBINING LONG STROKE OVERLAY) ffap failed to recognize
the string as being a filename.

It also failed when the string started with a non-ASCII character,
such as a kanji or a greek character.

Some testing shows that (ffap-string-at-point) skips strings such as
those described above.

I guess this is because of the default value of
ffap-string-at-point-mode-alist. For finding files it looks for
strings of "--:$+<>@-Z_a-z~*?", dropping "<@" from the beginning and
dropping "@>;.,!:" from the end.

That first string needs to be expanded to support non-ASCII characters
which might be used for filenames.

- -JimC
- --
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
------- End of forwarded message -------
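
The scanning behaviour described in the report can be tried outside ffap. The following is only a minimal sketch: it assumes that the CHARS string from ffap-string-at-point-mode-alist is handed to `skip-chars-backward' and `skip-chars-forward', and the helper name `my-chars-around-point' is hypothetical, not part of ffap.el. It compares the ASCII-only set quoted in the report with a variant that uses the `[:lower:]' character class.

```elisp
;; Hypothetical helper (not part of ffap.el): return the run of
;; characters around point that belong to the skip-chars set CHARS.
(defun my-chars-around-point (chars)
  (buffer-substring-no-properties
   (save-excursion (skip-chars-backward chars) (point))
   (save-excursion (skip-chars-forward chars) (point))))

(with-temp-buffer
  (insert "./αβγ.txt")
  (goto-char 4)                         ; point sits between α and β
  (list
   ;; ASCII-only set, as quoted in the report:
   (my-chars-around-point "--:$+<>@-Z_a-z~*?")
   ;; Same set with a-z replaced by a character class (an assumption
   ;; about one possible way to widen it):
   (my-chars-around-point "--:$+<>@-Z_[:lower:]~*?")))
```

With the ASCII-only set the scan cannot move past the Greek letters on either side of point, so nothing is picked up; with `[:lower:]' the whole name should be collected, provided the running Emacs treats those letters as lower case. Characters that have no case, such as kanji or combining marks, would presumably still need a wider set.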
* Re: [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
From: Stefan Monnier @ 2007-01-10 23:48 UTC
Cc: emacs-devel

> [I sent this message twice but did not get a response.]
> Would someone please DTRT and ack?

I've committed the patch below.


        Stefan


--- ffap.el 07 déc 2006 21:44:40 -0500 1.60
+++ ffap.el 10 jan 2007 18:44:40 -0500
@@ -1,7 +1,7 @@
 ;;; ffap.el --- find file (or url) at point
 
 ;; Copyright (C) 1995, 1996, 1997, 2000, 2001, 2002, 2003, 2004,
-;;   2005, 2006 Free Software Foundation, Inc.
+;;   2005, 2006, 2007 Free Software Foundation, Inc.
 
 ;; Author: Michelangelo Grigni <mic@mathcs.emory.edu>
 ;; Maintainer: FSF
@@ -310,7 +310,7 @@
   ;;
   ;; It pays to put a big fancy regexp here, since ffap-guesser is
   ;; much more time-consuming than regexp searching:
-  "[/:.~a-zA-Z]/\\|@[a-zA-Z][-a-zA-Z0-9]*\\."
+  "[/:.~[:alpha:]]/\\|@[[:alpha:]][-[:alnum:]]*\\."
   "*Regular expression governing movements of `ffap-next'."
   :type 'regexp
   :group 'ffap)
@@ -426,7 +426,7 @@
   ;; (ffap-machine-p "mathcs" 5678 nil 'ping)
   ;; (ffap-machine-p "foo.bonk" nil nil 'ping)
   ;; (ffap-machine-p "foo.bonk.com" nil nil 'ping)
-  (if (or (string-match "[^-a-zA-Z0-9.]" host) ; Illegal chars (?)
+  (if (or (string-match "[^-[:alnum:].]" host) ; Illegal chars (?)
           (not (string-match "[^0-9]" host))) ; 1: a number? 2: quick reject
       nil
     (let* ((domain
@@ -575,7 +575,7 @@
            (ffap-ftp-regexp (ffap-host-to-filename mach))
            ))
 
-(defvar ffap-newsgroup-regexp "^[a-z]+\\.[-+a-z_0-9.]+$"
+(defvar ffap-newsgroup-regexp "^[[:lower:]]+\\.[-+[:lower:]_0-9.]+$"
   "Strings not matching this fail `ffap-newsgroup-p'.")
 (defvar ffap-newsgroup-heads ; entirely inadequate
   '("alt" "comp" "gnu" "misc" "news" "sci" "soc" "talk")
@@ -601,7 +601,7 @@
                 (setq heads nil))
             (error nil)))
       (or ret (not heads)
-          (let ((head (string-match "\\`\\([a-z]+\\)\\." string)))
+          (let ((head (string-match "\\`\\([[:lower:]]+\\)\\." string)))
             (and head (setq head (substring string 0 (match-end 1)))
                  (member head heads)
                  (setq ret string))))
@@ -780,7 +780,7 @@
     ("" . ffap-completable) ; completion, slow on some systems
     ("\\.info\\'" . ffap-info) ; gzip.info
     ("\\`info/" . ffap-info-2) ; info/emacs
-    ("\\`[-a-z]+\\'" . ffap-info-3) ; (emacs)Top [only in the parentheses]
+    ("\\`[-[:lower:]]+\\'" . ffap-info-3) ; (emacs)Top [only in the parentheses]
     ("\\.elc?\\'" . ffap-el) ; simple.el, simple.elc
     (emacs-lisp-mode . ffap-el-mode) ; rmail, gnus, simple, custom
     ;; (lisp-interaction-mode . ffap-el-mode) ; maybe
@@ -969,15 +969,15 @@
     ;; Slightly controversial decisions:
     ;; * strip trailing "@" and ":"
     ;; * no commas (good for latex)
-    (file "--:$+<>@-Z_a-z~*?" "<@" "@>;.,!:")
+    (file "--:$+<>@-Z_[:lower:]~*?" "<@" "@>;.,!:")
     ;; An url, or maybe a email/news message-id:
-    (url "--:=&?$+@-Z_a-z~#,%;*" "^A-Za-z0-9" ":;.,!?")
+    (url "--:=&?$+@-Z_[:lower:]~#,%;*" "^[:alnum:]" ":;.,!?")
     ;; Find a string that does *not* contain a colon:
-    (nocolon "--9$+<>@-Z_a-z~" "<@" "@>;.,!?")
+    (nocolon "--9$+<>@-Z_[:lower:]~" "<@" "@>;.,!?")
     ;; A machine:
-    (machine "-a-zA-Z0-9." "" ".")
+    (machine "-[:alnum:]." "" ".")
     ;; Mathematica paths: allow backquotes
-    (math-mode ",-:$+<>@-Z_a-z~`" "<" "@>;.,!?`:")
+    (math-mode ",-:$+<>@-Z_[:lower:]~`" "<" "@>;.,!?`:")
     )
   "Alist of \(MODE CHARS BEG END\), where MODE is a symbol,
 possibly a major-mode name, or one of the symbol
@@ -1062,7 +1062,7 @@
     (let ((name (ffap-string-at-point 'url)))
       (cond
        ((string-match "^url:" name) (setq name (substring name 4)))
-       ((and (string-match "\\`[^:</>@]+@[^:</>@]+[a-zA-Z0-9]\\'" name)
+       ((and (string-match "\\`[^:</>@]+@[^:</>@]+[[:alnum:]]\\'" name)
              ;; "foo@bar": could be "mailto" or "news" (a Message-ID).
              ;; Without "<>" it must be "mailto". Otherwise could be
              ;; either, so consult `ffap-foo-at-bar-prefix'.
@@ -1074,7 +1074,7 @@
                             "mailto")))
                (and prefix (setq name (concat prefix ":" name))))))
        ((ffap-newsgroup-p name) (setq name (concat "news:" name)))
-       ((and (string-match "\\`[a-z0-9]+\\'" name) ; <mic> <root> <nobody>
+       ((and (string-match "\\`[[:alnum:]]+\\'" name) ; <mic> <root> <nobody>
              (equal (ffap-string-around) "<>")
              ;; (ffap-user-p name):
              (not (string-match "~" (expand-file-name (concat "~" name))))

* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
From: Richard Stallman @ 2007-01-04 2:32 UTC

[I sent this message a week ago but did not get a response.]

Would someone please DTRT and ack?

* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
From: Richard Stallman @ 2006-12-27 21:16 UTC

Would someone please DTRT and ack?