* Re: ffap not UTF-8 ready
From: Kenichi Handa @ 2006-10-02 7:22 UTC
Cc: emacs-pretest-bug, emacs-devel

In article <E1GSrVS-0000rW-Qr@jidanni.org>, Dan Jacobson <jidanni@jidanni.org> writes:

> Gentlemen, do
> $ touch aaa bbb 中文檔名
> $ emacs -Q -f ffap-bindings -f ffap-list-directory
> RET C-x o
> Now place the cursor on each filename and do C-x C-f and see what is
> shown in the minibuffer.
> Well, ffap knows about the ASCII filenames, but is unwilling to help
> with the Chinese UTF-8 filename.

It seems that this is because the variable
ffap-string-at-point-mode-alist doesn't contain a multibyte
character in CHARS.  Unfortunately, we don't have a handy
notation that represents all multibyte characters.  One way
I can think of is to use negation as this:

Change
  (file "--:$+<>@-Z_a-z~*?" ...)
to
  (file "^\0-#%-),;=[-^`{-}\^?" ...)

Another way is to build a special syntax table (or a
category table) and use re-search-forward/backward instead
of skip-chars-forward/backward.

---
Kenichi Handa
handa@m17n.org
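[Editorial illustration, not part of the original message.] The negation approach in the message above can be tried in isolation. The following is a minimal sketch, assuming that ffap ultimately feeds CHARS to skip-chars-backward and skip-chars-forward as the message implies; the helper `demo-string-around' and the two `demo-' constants are hypothetical names introduced only for this example and do not exist in ffap.el.

```elisp
;; Hypothetical helper for illustration only; not part of ffap.el.
(defun demo-string-around (text pos chars)
  "Return the run of characters around POS in TEXT that CHARS allows.
CHARS uses the same syntax as `skip-chars-forward'."
  (with-temp-buffer
    (insert text)
    (goto-char pos)
    ;; Walk left over allowed characters to find the start of the run.
    (skip-chars-backward chars)
    (let ((beg (point)))
      ;; Walk right from there to find the end of the run.
      (skip-chars-forward chars)
      (buffer-substring-no-properties beg (point)))))

;; The ASCII-only set quoted in the message.
(defconst demo-ascii-chars "--:$+<>@-Z_a-z~*?")

;; The negated set proposed in the message: everything except the
;; listed ASCII characters, so non-ASCII characters are accepted.
(defconst demo-negated-chars "^\0-#%-),;=[-^`{-}\^?")

;; Position 6 lies between 中 and 文 in this sample text.
(demo-string-around "aaa 中文檔名 bbb" 6 demo-ascii-chars)
(demo-string-around "aaa 中文檔名 bbb" 6 demo-negated-chars)
```

With the ASCII-only set the helper finds nothing at the Chinese name, while the negated set picks up the whole name. Both sets behave the same on the plain ASCII names.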

* Re: ffap not UTF-8 ready
From: Stefan Monnier @ 2006-10-02 13:49 UTC
Cc: emacs-pretest-bug, emacs-devel, Dan Jacobson

> (file "^\0-#%-),;=[-^`{-}\^?" ...)

For what it's worth, GNU Arch uses file names with `{', `}', and `='
(and `,' as well, but these are probably less important).

        Stefan

* Re: ffap not UTF-8 ready
From: Kevin Ryde @ 2006-10-02 21:05 UTC

Kenichi Handa <handa@m17n.org> writes:
>
> It seems that this is because the variable
> ffap-string-at-point-mode-alist doesn't contain a multibyte
> character in CHARS.

Perhaps "(thing-at-point 'filename)", in thing-at-point-file-name-chars,
has the same problem.

(I was pondering the slight duplication between ffap guessing and
thing-at-point the other day.  It might be cute if you could somehow
have ffap handlers based on a test for a thing-at-point thing, to get
a consistent notion of what might be "at point".)

* Re: ffap not UTF-8 ready
From: Kenichi Handa @ 2006-10-03 1:20 UTC
Cc: rv, emacs-devel

In article <87y7ryjuml.fsf@zip.com.au>, Kevin Ryde <user42@zip.com.au> writes:

> Kenichi Handa <handa@m17n.org> writes:
> >
> > It seems that this is because the variable
> > ffap-string-at-point-mode-alist doesn't contain a multibyte
> > character in CHARS.

> Perhaps "(thing-at-point 'filename)", in thing-at-point-file-name-chars,
> has the same problem.

!!  The variable thing-at-point-file-name-chars is defined
as "-~/[:alnum:]_.${}#%,:".  I've forgotten about [:XXX:]
notation.  I've just read src/regex.c and found that
[:alnum:] also works for multibyte characters (it matches
with a multibyte character whose syntax is "word"), and
[:multibyte:] is available too.

So, the current definition of thing-at-point-file-name-chars
works in most cases.  But, considering that a non-word
multibyte character can also be used in a file name,
perhaps, defining that as
"-~/[:alnum:][:multibyte:]_.${}#%,:" is better.

And, I think ffap.el should also use that kind of pattern
instead of something like this: "0-9A-Za-z".

---
Kenichi Handa
handa@m17n.org
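[Editorial illustration, not part of the original message.] The difference between the two character sets discussed in the message above can be checked with a small regexp test. This is only a sketch: the `demo-' names are hypothetical, the sets are wrapped in a bracket expression the way a regexp-based caller might use them, and the sample name uses the ideographic comma "、" as an assumed example of a non-word multibyte character.

```elisp
;; Hypothetical names for illustration only.
;; The current set quoted in the message, as a bracket expression.
(defconst demo-current-chars "-~/[:alnum:]_.${}#%,:")

;; The extended set suggested in the message.
(defconst demo-extended-chars "-~/[:alnum:][:multibyte:]_.${}#%,:")

(defun demo-whole-name-p (chars name)
  "Return non-nil if every character of NAME is allowed by CHARS.
CHARS is the inside of a regexp bracket expression."
  (string-match-p (concat "\\`[" chars "]+\\'") name))

;; A name made of word-like multibyte characters passes with both sets.
(demo-whole-name-p demo-current-chars "中文檔名")
(demo-whole-name-p demo-extended-chars "中文檔名")

;; A name containing a multibyte punctuation character is expected to
;; pass only with the extended set.
(demo-whole-name-p demo-current-chars "a、b")
(demo-whole-name-p demo-extended-chars "a、b")
```

This mirrors the point made in the message: [:alnum:] already covers the common case, and [:multibyte:] is what picks up the remaining non-word characters.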

* Re: ffap not UTF-8 ready
From: Kenichi Handa @ 2006-10-03 23:26 UTC
Cc: emacs-devel

In article <E1GUliX-0003XK-Lx@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

> Since we already have regexp constructs for multibyte
> characters, I guess we should make ffap use them now.

I agree.  In ffap.el, similar regexps are used not only in
ffap-string-at-point-mode-alist but also in the other
places.  So, I'd like to ask the maintainer of that file to
check throughout the file.

---
Kenichi Handa
handa@m17n.org

* Re: ffap not UTF-8 ready
From: Richard Stallman @ 2006-10-04 16:22 UTC
Cc: emacs-devel

    > Since we already have regexp constructs for multibyte
    > characters, I guess we should make ffap use them now.

    I agree.  In ffap.el, similar regexps are used not only in
    ffap-string-at-point-mode-alist but also in the other
    places.  So, I'd like to ask the maintainer of that file to
    check throughout the file.

Rajesh Vaidheeswarran <rv@gnu.org>, do you read me?

* Re: ffap not UTF-8 ready
From: Richard Stallman @ 2006-10-11 20:29 UTC
Cc: emacs-devel

[I sent this message a week ago but did not get a response.]

    > Since we already have regexp constructs for multibyte
    > characters, I guess we should make ffap use them now.

    I agree.  In ffap.el, similar regexps are used not only in
    ffap-string-at-point-mode-alist but also in the other
    places.  So, I'd like to ask the maintainer of that file to
    check throughout the file.

Rajesh Vaidheeswarran <rv@gnu.org>, do you read me?

* Re: ffap not UTF-8 ready
From: Rajesh Vaidheeswarran @ 2006-10-12 15:48 UTC
Cc: emacs-devel, Kenichi Handa

I don't regularly read emacs-devel mails. I will, however, check this out
when I return to the US in a couple of weeks.

rv

On 10/12/06, Richard Stallman <rms@gnu.org> wrote:
>
> [I sent this message a week ago but did not get a response.]
>
> > Since we already have regexp constructs for multibyte
> > characters, I guess we should make ffap use them now.
>
> I agree.  In ffap.el, similar regexps are used not only in
> ffap-string-at-point-mode-alist but also in the other
> places.  So, I'd like to ask the maintainer of that file to
> check throughout the file.
>
> Rajesh Vaidheeswarran <rv@gnu.org>, do you read me?