unofficial mirror of emacs-devel@gnu.org 
* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
@ 2006-12-27 21:16 Richard Stallman
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Stallman @ 2006-12-27 21:16 UTC


Would someone please DTRT and ack?

------- Start of forwarded message -------
From: James Cloos <cloos@jhcloos.com>
To: emacs-devel@gnu.org
In-Reply-To: <E1GzG0T-0001eu-7N@fencepost.gnu.org> (Richard Stallman's message
	of "Tue\, 26 Dec 2006 12\:22\:29 -0500")
Copyright: Copyright 2006 James Cloos
Date: Tue, 26 Dec 2006 13:58:12 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: rms@gnu.org
Subject: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]
X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO 
	autolearn=failed version=3.0.4

|> ffap is not UTF-8 ready: I put the cursor on ./???? or whatever and it
|> acts like the file doesn't exist.

I tried it on several filenames (all of which exist on my filesystem).

On every file where the first character was ASCII and was not modified
by a combining character, ffap worked as expected.

But on every file where the second character was a combining character
(such as the file named C̶.utf8 -- that is a C followed by U+0336, which
is called COMBINING LONG STROKE OVERLAY) ffap failed to recognize the
string as being a filename.  It also failed when the string started
with a non-ASCII character, such as a kanji or a Greek character.

Some testing shows that (ffap-string-at-point) skips strings such as
those described above.  I guess this is because of the default value
of ffap-string-at-point-mode-alist.  For finding files it looks for
strings of "--:$+<>@-Z_a-z~*?", dropping "<@" from the beginning and
dropping "@>;.,!:" from the end.  That first string needs to be
expanded to support non-ASCII characters which might be used for
filenames.
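
The character-set behaviour described above can be reproduced outside
ffap.  What follows is a minimal sketch, not code from ffap.el:
`demo-ffap-span' is a hypothetical helper, and the sketch assumes the
CHARS string is consumed by `skip-chars-forward', which accepts ranges
as well as classes such as [:lower:].  The first set is the one quoted
above; the second is one possible expansion that replaces "a-z" with
"[:lower:]".  The sample file names are made up.

```elisp
;; Hypothetical helper (not part of ffap.el): return the leading part of
;; TEXT that consists only of characters in CHARS, where CHARS uses the
;; `skip-chars-forward' syntax.
(defun demo-ffap-span (chars text)
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (skip-chars-forward chars)
    (buffer-substring (point-min) (point))))

;; The ASCII-only set quoted in the message.
(defvar demo-old-chars "--:$+<>@-Z_a-z~*?")
;; Assumed expansion: "a-z" replaced by the [:lower:] class.
(defvar demo-new-chars "--:$+<>@-Z_[:lower:]~*?")

;; A plain ASCII name is covered completely by the old set.
(demo-ffap-span demo-old-chars "./notes.txt")                ; => "./notes.txt"
;; "\u03b1\u03b2\u03b3" is Greek alpha beta gamma: the old set stops
;; right after "./", so no usable file name is found.
(demo-ffap-span demo-old-chars "./\u03b1\u03b2\u03b3.txt")   ; => "./"
;; With [:lower:] the scan continues through the Greek lowercase letters.
(demo-ffap-span demo-new-chars "./\u03b1\u03b2\u03b3.txt")
```

A class such as [:lower:] only helps for characters that have case;
names starting with caseless characters or containing combining marks
may need a wider class, which this sketch does not try to settle.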

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6



------- End of forwarded message -------


* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
@ 2007-01-04  2:32 Richard Stallman
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Stallman @ 2007-01-04  2:32 UTC


[I sent this message a week ago but did not get a response.]

Would someone please DTRT and ack?



* [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
@ 2007-01-10 23:05 Richard Stallman
  2007-01-10 23:48 ` Stefan Monnier
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Stallman @ 2007-01-10 23:05 UTC


[I sent this message twice but did not get a response.]

Would someone please DTRT and ack?



* Re: [cloos@jhcloos.com: Re: [jidanni@jidanni.org: ffap not UTF-8 ready]]
  2007-01-10 23:05 Richard Stallman
@ 2007-01-10 23:48 ` Stefan Monnier
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 2007-01-10 23:48 UTC
  Cc: emacs-devel

> [I sent this message twice but did not get a response.]
> Would someone please DTRT and ack?

I've committed the patch below.


        Stefan


--- ffap.el	07 déc 2006 21:44:40 -0500	1.60
+++ ffap.el	10 jan 2007 18:44:40 -0500	
@@ -1,7 +1,7 @@
 ;;; ffap.el --- find file (or url) at point
 
 ;; Copyright (C) 1995, 1996, 1997, 2000, 2001, 2002, 2003, 2004,
-;;   2005, 2006 Free Software Foundation, Inc.
+;;   2005, 2006, 2007 Free Software Foundation, Inc.
 
 ;; Author: Michelangelo Grigni <mic@mathcs.emory.edu>
 ;; Maintainer: FSF
@@ -310,7 +310,7 @@
   ;;
   ;; It pays to put a big fancy regexp here, since ffap-guesser is
   ;; much more time-consuming than regexp searching:
-  "[/:.~a-zA-Z]/\\|@[a-zA-Z][-a-zA-Z0-9]*\\."
+  "[/:.~[:alpha:]]/\\|@[[:alpha:]][-[:alnum:]]*\\."
   "*Regular expression governing movements of `ffap-next'."
   :type 'regexp
   :group 'ffap)
@@ -426,7 +426,7 @@
   ;; (ffap-machine-p "mathcs" 5678 nil 'ping)
   ;; (ffap-machine-p "foo.bonk" nil nil 'ping)
   ;; (ffap-machine-p "foo.bonk.com" nil nil 'ping)
-  (if (or (string-match "[^-a-zA-Z0-9.]" host) ; Illegal chars (?)
+  (if (or (string-match "[^-[:alnum:].]" host) ; Illegal chars (?)
 	  (not (string-match "[^0-9]" host))) ; 1: a number? 2: quick reject
       nil
     (let* ((domain
@@ -575,7 +575,7 @@
    (ffap-ftp-regexp (ffap-host-to-filename mach))
    ))
 
-(defvar ffap-newsgroup-regexp "^[a-z]+\\.[-+a-z_0-9.]+$"
+(defvar ffap-newsgroup-regexp "^[[:lower:]]+\\.[-+[:lower:]_0-9.]+$"
   "Strings not matching this fail `ffap-newsgroup-p'.")
 (defvar ffap-newsgroup-heads		; entirely inadequate
   '("alt" "comp" "gnu" "misc" "news" "sci" "soc" "talk")
@@ -601,7 +601,7 @@
 	     (setq heads nil))
 	 (error nil)))
      (or ret (not heads)
-	 (let ((head (string-match "\\`\\([a-z]+\\)\\." string)))
+	 (let ((head (string-match "\\`\\([[:lower:]]+\\)\\." string)))
 	   (and head (setq head (substring string 0 (match-end 1)))
 		(member head heads)
 		(setq ret string))))
@@ -780,7 +780,7 @@
     ("" . ffap-completable)		; completion, slow on some systems
     ("\\.info\\'" . ffap-info)		; gzip.info
     ("\\`info/" . ffap-info-2)		; info/emacs
-    ("\\`[-a-z]+\\'" . ffap-info-3)	; (emacs)Top [only in the parentheses]
+    ("\\`[-[:lower:]]+\\'" . ffap-info-3) ; (emacs)Top [only in the parentheses]
     ("\\.elc?\\'" . ffap-el)		; simple.el, simple.elc
     (emacs-lisp-mode . ffap-el-mode)	; rmail, gnus, simple, custom
     ;; (lisp-interaction-mode . ffap-el-mode) ; maybe
@@ -969,15 +969,15 @@
     ;; Slightly controversial decisions:
     ;; * strip trailing "@" and ":"
     ;; * no commas (good for latex)
-    (file "--:$+<>@-Z_a-z~*?" "<@" "@>;.,!:")
+    (file "--:$+<>@-Z_[:lower:]~*?" "<@" "@>;.,!:")
     ;; An url, or maybe a email/news message-id:
-    (url "--:=&?$+@-Z_a-z~#,%;*" "^A-Za-z0-9" ":;.,!?")
+    (url "--:=&?$+@-Z_[:lower:]~#,%;*" "^[:alnum:]" ":;.,!?")
     ;; Find a string that does *not* contain a colon:
-    (nocolon "--9$+<>@-Z_a-z~" "<@" "@>;.,!?")
+    (nocolon "--9$+<>@-Z_[:lower:]~" "<@" "@>;.,!?")
     ;; A machine:
-    (machine "-a-zA-Z0-9." "" ".")
+    (machine "-[:alnum:]." "" ".")
     ;; Mathematica paths: allow backquotes
-    (math-mode ",-:$+<>@-Z_a-z~`" "<" "@>;.,!?`:")
+    (math-mode ",-:$+<>@-Z_[:lower:]~`" "<" "@>;.,!?`:")
     )
   "Alist of \(MODE CHARS BEG END\), where MODE is a symbol,
 possibly a major-mode name, or one of the symbol
@@ -1062,7 +1062,7 @@
     (let ((name (ffap-string-at-point 'url)))
       (cond
        ((string-match "^url:" name) (setq name (substring name 4)))
-       ((and (string-match "\\`[^:</>@]+@[^:</>@]+[a-zA-Z0-9]\\'" name)
+       ((and (string-match "\\`[^:</>@]+@[^:</>@]+[[:alnum:]]\\'" name)
 	     ;; "foo@bar": could be "mailto" or "news" (a Message-ID).
 	     ;; Without "<>" it must be "mailto".  Otherwise could be
 	     ;; either, so consult `ffap-foo-at-bar-prefix'.
@@ -1074,7 +1074,7 @@
 			     "mailto")))
 	       (and prefix (setq name (concat prefix ":" name))))))
        ((ffap-newsgroup-p name) (setq name (concat "news:" name)))
-       ((and (string-match "\\`[a-z0-9]+\\'" name) ; <mic> <root> <nobody>
+       ((and (string-match "\\`[[:alnum:]]+\\'" name) ; <mic> <root> <nobody>
 	     (equal (ffap-string-around) "<>")
 	     ;;	(ffap-user-p name):
 	     (not (string-match "~" (expand-file-name (concat "~" name))))
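
The effect of replacing explicit ASCII ranges with character classes in
the regexps can be checked in isolation.  This is a small sketch, not
part of the patch: `demo-matches-p' is a hypothetical helper, it relies
only on the built-in `string-match-p' (available since Emacs 23), and
the sample strings are made up for illustration.

```elisp
;; Hypothetical helper: return t if REGEXP matches STRING, with case
;; folding turned off so that [a-z] and [:lower:] are compared fairly.
(defun demo-matches-p (regexp string)
  (let ((case-fold-search nil))
    (and (string-match-p regexp string) t)))

;; An ASCII-only word matches the old range.
(demo-matches-p "\\`[a-z]+\\'" "emacs")               ; => t
;; "\u00e9" is e with acute accent: outside a-z, so the old range rejects it.
(demo-matches-p "\\`[a-z]+\\'" "\u00e9macs")          ; => nil
;; The [:lower:] class accepts non-ASCII lowercase letters as well.
(demo-matches-p "\\`[[:lower:]]+\\'" "\u00e9macs")    ; => t
```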

