unofficial mirror of emacs-devel@gnu.org 
* chinese word mode
From: Eric Abrahamsen @ 2013-11-05  9:11 UTC
  To: emacs-devel

So the follow-up to my earlier message is that I'm trying to create a
chinese-word-mode, which will behave (almost exactly) like the existing
thai-word-mode defined in lisp/language/thai-util.el and friends.

The idea is that an entire dictionary of words is provided in a nested
char table, and a minor mode then both remaps most word-related commands
to use that dictionary and rewires fill-find-break-point-function to do
the same. The Thai version looks like this:

(define-minor-mode thai-word-mode
  "Minor mode to make word-oriented commands aware of Thai words."
  :global t :group 'mule
  (cond (thai-word-mode
         ;; This enables linebreak between Thai characters.
         (modify-category-entry (make-char 'thai-tis620) ?|)
         ;; This enables linebreak at a Thai word boundary.
         (put-charset-property 'thai-tis620 'fill-find-break-point-function
                               'thai-fill-find-break-point))
        (t
         ;; Undo both changes when the mode is turned off.
         (modify-category-entry (make-char 'thai-tis620) ?| nil t)
         (put-charset-property 'thai-tis620 'fill-find-break-point-function
                               nil))))

I have shamelessly copied most of the code, and begun reworking it for
Chinese. But I'm confused about the charset specifications above.

Thai has only two charsets (one of which is thai-tis620), while Chinese
has more than a dozen (though I'm only messing with simplified Chinese
for now, so call it six or so).

My buffers are utf-8 encoded, and describe-char on a Chinese character
shows "preferred charset: unicode-bmp". So what do I put for the charset
in order to make these functions target the right characters? Chinese
characters all seem to have the "|" line-breakable category by default,
but (I think) I can only add the custom fill break point function one
charset at a time.
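
To make the "one charset at a time" concern concrete, here is a rough
sketch of looping over several charsets. The variable and function
names are hypothetical, and the charset list is only a guess at what
simplified Chinese would need. It also does not settle the utf-8
question above: I don't know whether a property set on these charsets
is ever consulted for characters whose preferred charset is
unicode-bmp.

```elisp
;; Hypothetical list of charsets to cover; an assumption, not a
;; vetted set for simplified Chinese.
(defvar my-chinese-word-charsets '(chinese-gb2312 chinese-gbk gb18030)
  "Charsets that should get the custom fill break-point function.")

(defun my-chinese-word-set-fill-function (function)
  "Install FUNCTION as the fill break-point finder for each charset
in `my-chinese-word-charsets'.  Pass nil to remove it again."
  (dolist (charset my-chinese-word-charsets)
    ;; Skip any name that is not a defined charset in this Emacs.
    (when (charsetp charset)
      (put-charset-property charset 'fill-find-break-point-function
                            function))))
```

The mode body would then call this with the (not yet written) Chinese
break-point function when enabled, and with nil when disabled.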

Thanks!
Eric



