From: Kenichi Handa <handa@m17n.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: juri@jurta.org, emacs-devel@gnu.org
Subject: Re: find-composition still depends on the composition property
Date: Thu, 23 Oct 2008 10:18:22 +0900
Message-ID: <E1KsoqE-0006VL-Ty@etlken.m17n.org>
In-Reply-To: <uwsg0e837.fsf@gnu.org> (message from Eli Zaretskii on Wed, 22 Oct 2008 21:35:40 +0200)

In article <uwsg0e837.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> Thanks, but Emacs still does not get this quite right.  For example,
> in the following line:

>   אבגדה12345

> Which mixes Hebrew letters with digits, M-f stops at the first digit,
> whereas in this line:

>   abcde12345

> it does not.  The latter behavior is correct, the former is not.  (I'm
> ashamed to admit that even MS Word gets it right.)

> I understand that the way for fixing this would be to install more
> entries in word-combining-categories, but more infrastructure seems to
> be missing, since right now no characters have the "Hebrew" category,
> for example (at least judging by the output of describe-categories).

Then what we should do is:

(1-1) assign the category "6" (digit) to "0123456789".
(1-2) define a category, say "D", and assign it to all
characters that should have no word boundary against digits.
(1-3) add (?D . ?6) and (?6 . ?D) to word-combining-categories.
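
As a rough illustration of steps (1-1) to (1-3), here is a toy Python
model of the word-boundary rule. It is not the code of word_boundary_p.
It assumes the documented rule that characters of different scripts are
separated unless a pair in word-combining-categories matches, that ASCII
letters and digits both count as the `latin' script, and that the
hypothetical category "D" is given to the Hebrew letters only. The
function names are invented for this sketch.

```python
# Toy model of the word-boundary rule; not Emacs's actual C code.

def script_of(ch):
    # Assumption: ASCII letters and digits both belong to `latin`.
    return "hebrew" if "\u05d0" <= ch <= "\u05ea" else "latin"

def categories_of(ch):
    cats = set()
    if ch in "0123456789":
        cats.add("6")   # (1-1): the digit category
    if script_of(ch) == "hebrew":
        cats.add("D")   # (1-2): hypothetical category "D"
    return cats

def word_boundary(c1, c2, combining):
    """Return True if a word boundary falls between adjacent c1 and c2."""
    if script_of(c1) == script_of(c2):
        return False    # same script: treated as one word
    for cat1, cat2 in combining:
        if cat1 in categories_of(c1) and cat2 in categories_of(c2):
            return False  # a combining pair suppresses the boundary
    return True

def forward_word_end(text, combining):
    """Index where a forward word motion from position 0 would stop."""
    for i in range(1, len(text)):
        if word_boundary(text[i - 1], text[i], combining):
            return i
    return len(text)

COMBINING = [("D", "6"), ("6", "D")]             # (1-3)
HEBREW = "\u05d0\u05d1\u05d2\u05d3\u05d412345"   # five Hebrew letters, then 12345

print(forward_word_end(HEBREW, []))         # without the new pairs
print(forward_word_end(HEBREW, COMBINING))  # with the new pairs
```

With an empty list the model stops after the fifth Hebrew letter, which
is the behavior Eli reported. With the two pairs it runs to the end of
the line, as it already does for "abcde12345".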

Another way is:

(2-1) modify word_boundary_p to handle a negative category
mnemonic in word-*-categories, so that an element can match a
character that doesn't have the specified category.
(2-2) assign the category "6" (digit) to "0123456789".
(2-3) define a category, say "X", and assign it to all
characters that should have a word boundary against digits.
(2-4) add ((- ?X) . ?6) and (?6 . (- ?X)) to
word-combining-categories.
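
The negative mnemonic of (2-1) could be modeled as below. This is only
a sketch in Python, not a patch for word_boundary_p. The tuple
("-", "X") stands for the proposed (- ?X) form, and giving the
hypothetical category "X" to Han characters is purely an assumption
made to have something to test against.

```python
# Toy model of proposal 2; not Emacs's actual C code.

def script_of(ch):
    if "\u05d0" <= ch <= "\u05ea":
        return "hebrew"
    if "\u4e00" <= ch <= "\u9fff":
        return "han"
    return "latin"      # assumption: ASCII letters and digits are `latin`

def categories_of(ch):
    cats = set()
    if ch in "0123456789":
        cats.add("6")   # (2-2): the digit category
    if script_of(ch) == "han":
        cats.add("X")   # (2-3): hypothetical; Han chosen only for illustration
    return cats

def matches(spec, cats):
    """A plain mnemonic must be present; ("-", c) requires c to be absent."""
    if isinstance(spec, tuple):
        return spec[1] not in cats   # (2-1): negative mnemonic
    return spec in cats

def word_boundary(c1, c2, combining):
    if script_of(c1) == script_of(c2):
        return False
    cats1, cats2 = categories_of(c1), categories_of(c2)
    return not any(matches(s1, cats1) and matches(s2, cats2)
                   for s1, s2 in combining)

# (2-4): ((- ?X) . ?6) and (?6 . (- ?X))
COMBINING = [(("-", "X"), "6"), ("6", ("-", "X"))]

print(word_boundary("\u05d4", "1", COMBINING))  # Hebrew letter, then digit
print(word_boundary("\u6f22", "1", COMBINING))  # character with "X", then digit
```

The difference from the first approach is that only the exceptions need
a category. Every character without "X" combines with digits by default.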

Or,

(3-1) make a `common' script and classify digits, etc. into it.
(3-2) modify word_boundary_p not to distinguish `common' from
any other script.
(3-3) define a category, say "X", and assign it to all
characters that should have a word boundary against digits.
(3-4) add (?X . ?6) and (?6 . ?X) to
word-separating-categories.
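
The third approach can be sketched the same way. In this toy Python
model (again, not Emacs's code) the digits are the only members of the
`common' script, the hypothetical category "X" is given to Han
characters for illustration only, and the word-combining-categories
lookup is left out to keep the sketch short.

```python
# Toy model of proposal 3; not Emacs's actual C code.

def script_of(ch):
    if ch in "0123456789":
        return "common"  # (3-1): digits are classified as `common`
    if "\u05d0" <= ch <= "\u05ea":
        return "hebrew"
    if "\u4e00" <= ch <= "\u9fff":
        return "han"
    return "latin"

def categories_of(ch):
    cats = set()
    if ch in "0123456789":
        cats.add("6")
    if script_of(ch) == "han":
        cats.add("X")    # (3-3): hypothetical; Han chosen only for illustration
    return cats

def word_boundary(c1, c2, separating):
    s1, s2 = script_of(c1), script_of(c2)
    # (3-2): `common` is not distinguished from any other script.
    same_script = s1 == s2 or "common" in (s1, s2)
    if not same_script:
        return True
    cats1, cats2 = categories_of(c1), categories_of(c2)
    # Within one script, only a separating pair creates a boundary.
    return any(a in cats1 and b in cats2 for a, b in separating)

SEPARATING = [("X", "6"), ("6", "X")]   # (3-4)

print(word_boundary("\u05d4", "1", SEPARATING))  # Hebrew letter, then digit
print(word_boundary("\u6f22", "1", SEPARATING))  # character with "X", then digit
```

Here the exceptions move from word-combining-categories to
word-separating-categories, because digits no longer differ in script
from their neighbors.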

> By the way, I'd suggest to move the legend generated by
> describe-categories to the beginning of the buffer, because the buffer
> is huge and it does not say anywhere at the beginning that there's a
> legend at the end.  Without the legend, the buffer looks like a large
> pile of gibberish.

The legend is longer than 40 lines.  If we put that at the
head, it will occupy the whole first page, which I think is
not that good.  It would be better to put something like
"See the end of the buffer for the legend." on the first
line, with "legend" clickable.  What do you think?

> And another wish: can we have word-combining-categories and
> word-separating-categories display their elements with human-readable
> letters, not as their ASCII codes?  (Quick: what letter is code 94?)

How about modifying word_boundary_p to accept a mnemonic
string (instead of a mnemonic character) in those variables?
Then we can specify multiple categories in the string to
catch a character that has one of them.
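
A small Python sketch of how such string elements might be matched and
displayed. The helper names are invented for this illustration, and the
category "L" in the example is hypothetical.

```python
# Toy model of mnemonic-string elements; not Emacs's actual C code.

def matches(spec, cats):
    """spec is a string of mnemonics; match if the character has any of them."""
    return any(mnemonic in cats for mnemonic in spec)

def pair_matches(pair, cats1, cats2):
    spec1, spec2 = pair
    return matches(spec1, cats1) and matches(spec2, cats2)

def show(pairs):
    """Render the list with readable mnemonics rather than character codes."""
    return " ".join('("%s" . "%s")' % pair for pair in pairs)

PAIRS = [("DL", "6"), ("6", "DL")]

print(show(PAIRS))
print(pair_matches(PAIRS[0], {"L"}, {"6"}))  # first char has "L", second is a digit
print(chr(94))                               # the character behind code 94
```

A one-character string behaves like the current single mnemonic in this
model, so existing entries would only need to be rewritten as strings.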

---
Kenichi Handa
handa@ni.aist.go.jp
 



