unofficial mirror of emacs-devel@gnu.org 
* Building intermediate Chinese language romanization alists
From: Eric Abrahamsen @ 2019-01-15 21:19 UTC
  To: emacs-devel

Hi,

I often want access to the correspondences between romanized Chinese
and Chinese characters. E.g., in the pinyin romanization method, the
string "zhong" can map to any of the characters
"中种重众终钟忠衷肿仲锺踵盅冢忪舯螽". This is useful for creating
language utilities, and other people have put together their own
correspondences for their own purposes[1].

The Emacs source tree contains several of these mappings (though I
understand they are not included in the distribution), which are used to
build the relevant input methods. In the case of pinyin, the text
file ./leim/MISC-DIC/pinyin.map is converted with `titdic-convert' into
the file ./lisp/leim/quail/PY.el.

PY.el is automatically generated (by the function `py-converter' in
titdic-cnv.el): the mapping in pinyin.map is directly inserted into the
generated file, then wrapped in quotes and parens, to construct a call
to `quail-define-rules'.
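
To make the conversion concrete, here is a minimal sketch of turning one
line of the map into one rule. It is not the code in titdic-cnv.el: the
helper name `my-pinyin-line-to-rule' is made up for illustration, and I
am assuming each line of pinyin.map is a lowercase syllable, a single
space or tab, and then the candidate characters.

```elisp
;; Hypothetical helper, not part of titdic-cnv.el.
;; Assumed line format: lowercase syllable, one space or tab, candidates.
(defun my-pinyin-line-to-rule (line)
  "Convert LINE into a (SYLLABLE CANDIDATES) list, or nil if it doesn't match."
  (let ((case-fold-search nil))         ; match lowercase syllables only
    (when (string-match "\\`\\([a-z]+\\)[ \t]\\(.+\\)\\'" line)
      (list (match-string 1 line)
            (match-string 2 line)))))

(my-pinyin-line-to-rule "zhong\t中种重")
;; => ("zhong" "中种重")
```

A list of such elements has the same shape as the arguments currently
written out inside the generated `quail-define-rules' form.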

I might be able to get the map back out of quail somehow, but since this
seems to be something that more than a few people would like access to,
I wonder if it would be acceptable to add an intermediary step, creating
(for instance) a defconst called `pinyin-map-alist' that holds the
contents of pinyin.map, and then changing the `quail-define-rules' call
to:

(apply #'quail-define-rules pinyin-map-alist)

The input method wouldn't be affected, but we'd have access to the
mapping via the constant, which would be very useful.
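
As a rough sketch of what that access could look like: the constant and
function names below are hypothetical, and the alist is a three-entry
sample rather than the real contents of pinyin.map.

```elisp
;; Hypothetical sample; the real constant would hold all of pinyin.map.
(defconst my-pinyin-map-alist
  '(("zhong" "中种重众终钟忠衷肿仲锺踵盅冢忪舯螽")
    ("chong" "重冲虫")
    ("guo" "国过果"))
  "Sample alist of pinyin syllables and their candidate characters.")

(defun my-pinyin-chars (syllable)
  "Return the candidate characters for SYLLABLE as a string, or nil."
  (cadr (assoc syllable my-pinyin-map-alist)))

(defun my-pinyin-syllables (char)
  "Return the syllables in the sample alist whose candidates include CHAR."
  (let (result)
    (dolist (entry my-pinyin-map-alist (nreverse result))
      (when (memq char (string-to-list (cadr entry)))
        (push (car entry) result)))))

(my-pinyin-chars "guo")      ; => "国过果"
(my-pinyin-syllables ?重)    ; => ("zhong" "chong")

;; As far as I can tell `quail-define-rules' is a macro, so `apply' may
;; not work on it directly.  One assumed alternative, to be evaluated
;; after the `quail-define-package' call, would be:
;;   (eval `(quail-define-rules ,@my-pinyin-map-alist))
```

The reverse lookup is a linear scan, which is fine for a sketch; a real
utility would probably build a hash table once.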

Pinyin would be the most useful romanization method to do this for, but
it looks like the CTLau and possibly ziranma methods might benefit from
similar treatment.

(Another issue is that if the constant is written into PY.el, which
isn't a library, it might be a bit difficult to get out again, but
perhaps the defconst could be appended to one of
./lisp/language/{chinese.el,china-util.el}. Or PY.el could be made a
library.)

I'm not entirely familiar with the language-related build process, but I
hope there might be an appropriate stage at which to hang the alist on a
variable name.

Thanks,
Eric

[1]: https://github.com/tumashu/pyim/blob/master/pyim-pymap.el
