unofficial mirror of emacs-devel@gnu.org 
* Re: ffap not UTF-8 ready
       [not found] <E1GSrVS-0000rW-Qr@jidanni.org>
@ 2006-10-02  7:22 ` Kenichi Handa
  2006-10-02 13:49   ` Stefan Monnier
  2006-10-02 21:05   ` Kevin Ryde
  0 siblings, 2 replies; 8+ messages in thread
From: Kenichi Handa @ 2006-10-02  7:22 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

In article <E1GSrVS-0000rW-Qr@jidanni.org>, Dan Jacobson <jidanni@jidanni.org> writes:

> Gentlemen, do
> $ touch aaa bbb 中文檔名
> $ emacs -Q -f ffap-bindings -f ffap-list-directory
> RET C-x o
> Now place the cursor on each filename and do C-x C-f and see what is
> shown in the minibuffer.

> Well, ffap knows about the ASCII filenames, but is unwilling to help
> with the Chinese UTF-8 filename.

It seems that this is because the variable
ffap-string-at-point-mode-alist doesn't contain a multibyte
character in CHARS.  Unfortunately, we don't have a handy
notation that represents all multibyte characters.

One way I can think of is to use negation, like this:

Change
   (file "--:$+<>@-Z_a-z~*?" ...)
to
   (file "^\0-#%-),;=[-^`{-}\^?" ...)

Another way is to build a special syntax table (or a
category table) and use re-search-forward/backward instead
of skip-chars-forward/backward.

---
Kenichi Handa
handa@m17n.org
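
A minimal sketch of the negation idea above, assuming the CHARS string is consumed by skip-chars-backward/forward as the message describes. The helper name my-run-at-point is hypothetical and is not part of ffap.el; the two CHARS strings are the ones quoted in the message.

```elisp
;; Hypothetical helper (not from ffap.el): return the run of characters
;; around point that match CHARS, where CHARS is in `skip-chars-forward'
;; syntax.
(defun my-run-at-point (chars)
  (save-excursion
    (skip-chars-backward chars)
    (let ((beg (point)))
      (skip-chars-forward chars)
      (buffer-substring-no-properties beg (point)))))

(with-temp-buffer
  (insert "aaa 中文檔名 bbb")
  (goto-char 6)                         ; point inside the Chinese name
  (list
   ;; Positive ASCII-only set: no character next to point matches.
   (my-run-at-point "--:$+<>@-Z_a-z~*?")
   ;; Negated set: anything except the listed ASCII characters matches,
   ;; so multibyte characters are accepted.
   (my-run-at-point "^\0-#%-),;=[-^`{-}\^?")))
;; → ("" "中文檔名")
```

The negated string excludes exactly the ASCII characters that the original string leaves out, so the behavior on ASCII file names should be unchanged.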

* Re: ffap not UTF-8 ready
From: Stefan Monnier @ 2006-10-02 13:49 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel, Dan Jacobson

>    (file "^\0-#%-),;=[-^`{-}\^?" ...)

For what it's worth, GNU Arch uses file names with `{', `}', and `=' (and
`,' as well, but these are probably less important).


        Stefan

* Re: ffap not UTF-8 ready
From: Kevin Ryde @ 2006-10-02 21:05 UTC (permalink / raw)


Kenichi Handa <handa@m17n.org> writes:
>
> It seems that this is because the variable
> ffap-string-at-point-mode-alist doesn't contain a multibyte
> character in CHARS.

Perhaps "(thing-at-point 'filename)", in thing-at-point-file-name-chars,
has the same problem.


(I was pondering the slight duplication between ffap guessing and
thing-at-point the other day.  It might be cute if you could somehow
have ffap handlers based on a test for a thing-at-point thing, to get
a consistent notion of what might be "at point".)

* Re: ffap not UTF-8 ready
From: Kenichi Handa @ 2006-10-03  1:20 UTC (permalink / raw)
  Cc: rv, emacs-devel

In article <87y7ryjuml.fsf@zip.com.au>, Kevin Ryde <user42@zip.com.au> writes:

> Kenichi Handa <handa@m17n.org> writes:
> >
> > It seems that this is because the variable
> > ffap-string-at-point-mode-alist doesn't contain a multibyte
> > character in CHARS.

> Perhaps "(thing-at-point 'filename)", in thing-at-point-file-name-chars,
> has the same problem.

!! The variable thing-at-point-file-name-chars is defined as
    "-~/[:alnum:]_.${}#%,:".
I had forgotten about the [:XXX:] notation.  I've just read
src/regex.c and found that [:alnum:] also works for
multibyte characters (it matches a multibyte character
whose syntax is "word"), and that [:multibyte:] is available too.
So, the current definition of thing-at-point-file-name-chars
works in most cases.  But since a non-word multibyte character
can also appear in a file name, it is perhaps better to define
it as follows:
   "-~/[:alnum:][:multibyte:]_.${}#%,:"

And, I think ffap.el should also use that kind of pattern
instead of something like this: "0-9A-Za-z".

---
Kenichi Handa
handa@m17n.org
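
A minimal sketch comparing the two definitions discussed above, assuming they are used as CHARS strings for skip-chars-backward/forward. The helper my-run-at-point and the sample text are hypothetical; only the two strings come from the message.

```elisp
;; Hypothetical helper (not from thingatpt.el or ffap.el): return the
;; run of characters around point that match CHARS, where CHARS is in
;; `skip-chars-forward' syntax (character classes are allowed).
(defun my-run-at-point (chars)
  (save-excursion
    (skip-chars-backward chars)
    (let ((beg (point)))
      (skip-chars-forward chars)
      (buffer-substring-no-properties beg (point)))))

(with-temp-buffer
  ;; The name contains "、", a multibyte punctuation character.
  (insert "see 中文、檔名.txt here")
  (goto-char 6)                         ; point inside the file name
  (list
   ;; Current definition: multibyte characters match only via [:alnum:].
   (my-run-at-point "-~/[:alnum:]_.${}#%,:")
   ;; Proposed definition: [:multibyte:] accepts any multibyte character.
   (my-run-at-point "-~/[:alnum:][:multibyte:]_.${}#%,:")))
```

With the proposed string the second result should span the whole name up to the following space, whereas the first may stop at the punctuation character, depending on how [:alnum:] classifies it.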

* Re: ffap not UTF-8 ready
From: Kenichi Handa @ 2006-10-03 23:26 UTC (permalink / raw)
  Cc: emacs-devel

In article <E1GUliX-0003XK-Lx@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

> Since we already have regexp constructs for multibyte
> characters, I guess we should make ffap use them now.

I agree.  In ffap.el, similar regexps are used not only in
ffap-string-at-point-mode-alist but also in other
places.  So, I'd like to ask the maintainer of that file to
check throughout the file.

---
Kenichi Handa
handa@m17n.org

* Re: ffap not UTF-8 ready
From: Richard Stallman @ 2006-10-04 16:22 UTC (permalink / raw)
  Cc: emacs-devel

    > Since we already have regexp constructs for multibyte
    > characters, I guess we should make ffap use them now.

    I agree.  In ffap.el, similar regexps are used not only in
    ffap-string-at-point-mode-alist but also in other
    places.  So, I'd like to ask the maintainer of that file to
    check throughout the file.

Rajesh Vaidheeswarran <rv@gnu.org>, do you read me?

* Re: ffap not UTF-8 ready
From: Richard Stallman @ 2006-10-11 20:29 UTC (permalink / raw)
  Cc: emacs-devel

[I sent this message a week ago but did not get a response.]

    > Since we already have regexp constructs for multibyte
    > characters, I guess we should make ffap use them now.

    I agree.  In ffap.el, similar regexps are used not only in
    ffap-string-at-point-mode-alist but also in other
    places.  So, I'd like to ask the maintainer of that file to
    check throughout the file.

Rajesh Vaidheeswarran <rv@gnu.org>, do you read me?

* Re: ffap not UTF-8 ready
From: Rajesh Vaidheeswarran @ 2006-10-12 15:48 UTC (permalink / raw)
  Cc: emacs-devel, Kenichi Handa


I don't regularly read emacs-devel mails. I will, however, check this out
when I return to the US in a couple of weeks.

rv

On 10/12/06, Richard Stallman <rms@gnu.org> wrote:
>
> [I sent this message a week ago but did not get a response.]
>
>     > Since we already have regexp constructs for multibyte
>     > characters, I guess we should make ffap use them now.
>
>     I agree.  In ffap.el, similar regexps are used not only in
>     ffap-string-at-point-mode-alist but also in other
>     places.  So, I'd like to ask the maintainer of that file to
>     check throughout the file.
>
> Rajesh Vaidheeswarran <rv@gnu.org>, do you read me?
>
>
