From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: emacs-devel@gnu.org
Subject: Re: chinese word mode
Date: Thu, 07 Nov 2013 15:13:55 +0800
Message-ID: <87iow4fzv0.fsf@ericabrahamsen.net>
In-Reply-To: m2fvr9ee3d.fsf@gmail.com
William Xu <william.xwl@gmail.com> writes:
> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>
>> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>>
>> [...]
>>
>>> (define-minor-mode thai-word-mode
>>>   :global t :group 'mule
>>>   (cond (thai-word-mode
>>>          ;; This enables linebreak between Thai characters.
>>>          (modify-category-entry (make-char 'thai-tis620) ?|)
>>>          ;; This enables linebreak at a Thai word boundary.
>>>          (put-charset-property 'thai-tis620 'fill-find-break-point-function
>>>                                'thai-fill-find-break-point))
>>>         (t
>>>          (modify-category-entry (make-char 'thai-tis620) ?| nil t)
>>>          (put-charset-property 'thai-tis620 'fill-find-break-point-function
>>>                                nil))))
>>>
>>
>> [...]
>>
>>> My buffers are utf-8 encoded, and describe-char on a Chinese character
>>> shows "preferred charset: unicode-bmp". So what do I put for the charset
>>> in order to make these functions target the right characters? Chinese
>>> characters all seem to have the "|" line-breakable category by default,
>>> but (I think) I can only add the custom fill break point function one
>>> charset at a time.
>>
>> I've tried slapping the 'fill-find-break-point-function onto the
>> 'unicode charset for now, and it works fine because the function only
>> does anything if point is in the midst of Chinese. It presumably gets
>> applied to all characters, though, and that can't be a real solution.
>
> modify-category-entry also accepts a range cons, where you can select
> Chinese characters by range. For example,
>
> (#x3400 . #x4DBF) ; CJK Unified Ideographs Extension A
> (#x4E00 . #x9FFF) ; CJK Unified Ideographs
> (#xF900 . #xFAFF) ; CJK Compatibility Ideographs
>
> put-charset-property seems to accept only a charset.
>
>> I'm guessing I'll need to separate simplified and traditional word sets
>> and make two versions of the mode. Both modes will loop through their
>> applicable charsets and apply/remove the custom break point function.
>>
>> Assuming I fix this problem and other inevitable bugs, would this
>> library be of general interest to Emacs?
>
> It can make those word movement functions useful. :)
That's certainly the idea! I'll admit I was motivated to do this by
using LibreOffice, which I usually can't stand, and noticing it DTRT
with Chinese words. A bit of Emacs chauvinism kicked in...
Thanks for the tips on categories and all. I don't think I need the
modify-category-entry section at all, since Chinese characters have the
"|" category by default. So it's just looping on applicable charsets.
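
A minimal sketch of what that loop might look like. Everything named
my-* here is hypothetical, and the charset list is only an assumption
about which charsets are "applicable"; whether characters in a utf-8
buffer actually resolve to one of these charsets depends on the
charset priority, so this shows the mechanics only:

```elisp
;; Hypothetical names throughout; nothing here is from an existing library.
(defvar my-chinese-word-charsets
  '(chinese-gb2312 chinese-gbk big5)
  "Charsets that receive the custom break point function (assumed list).")

(defun my-chinese-word-set-break-function (function)
  "Set `fill-find-break-point-function' to FUNCTION on each listed charset.
Pass nil as FUNCTION to remove the property again."
  (dolist (charset my-chinese-word-charsets)
    ;; Skip any name that is not a defined charset in this Emacs.
    (when (charsetp charset)
      (put-charset-property charset 'fill-find-break-point-function
                            function))))
```

The mode body would then call this with the break point function's
symbol when turned on, and with nil when turned off, mirroring the two
branches of thai-word-mode quoted above.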
E