From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: ja-dic.el and SKK-JYSYO.L
Date: Tue, 16 Feb 2010 16:51:49 +0900
To: Ivan Kanis
Cc: emacs-devel@gnu.org
In-Reply-To: (message from Ivan Kanis on Sat, 13 Feb 2010 11:42:38 +0100)
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:121139

Ivan Kanis writes:

> I put the new SKK-JYSHO.L in my tmp directory and run the command
> skkdic-convert on the dictionary. It created ja-dic.el in my home
> directory.

> However when I eval ja-dic.el I get an error:

> Debugger entered--Lisp error: (args-out-of-range ">\x308fk \x5206" 6)
>   string-match("[^ ]+" ">\x308fk \x5206" 6)
>   (while (string-match "[^ ]+" entry i) (setq candidates (cons ... candidates)) (setq i (match-end 0)))
>   (let ((kana ...) (i ...) candidates) (while (string-match "[^ ]+" entry i) (setq candidates ...) (setq i ...)) (cons (skkdic-get-kana-compact-codes kana) candidates))
>   skkdic-extract-conversion-data(">\x308fk \x5206")

I found that the latest SKK-JISYO.L contains entries (those
beginning with `>') that the current ja-dic-cnv.el can't handle.
So I installed a fix along with the new SKK-JISYO.L and regenerated
ja-dic.el.  Please try the latest one, or try the attached patch.

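As a rough illustration of what the fix does (this is not the installed code; the real change is the Emacs Lisp patch to skkdic-convert-okuri-ari at the end of this message), here is a minimal Python sketch. The function name `convert_okuri_ari_line` and the two sample lines are hypothetical; the samples are only modeled on the string shown in the error above.

```python
def convert_okuri_ari_line(line):
    """Convert one okuri-ari line to the quoted form, or return None to skip it."""
    # Assumption: entries starting with '>' are the ones the old converter
    # could not handle, so they are skipped, like the
    # `(/= (following-char) ?>)` guard in the patch.
    if line.startswith(">"):
        return None
    reading, _, rest = line.partition(" ")
    rest = rest.strip()
    # Drop the first and the last '/', then separate candidates with spaces.
    if rest.startswith("/"):
        rest = rest[1:]
    if rest.endswith("/"):
        rest = rest[:-1]
    return '"{} {}"'.format(reading, rest.replace("/", " "))


# Hypothetical sample input: one ordinary entry and one '>' entry.
lines = ["わk /分/別/", ">わk /分/"]
converted = [s for s in map(convert_okuri_ari_line, lines) if s is not None]
print(converted)
```

Skipping the `>' lines at conversion time keeps them out of the generated ja-dic.el, so skkdic-extract-conversion-data never sees them.
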
> Another issue with the new SKK-JYSHO.L is that it has some comments, for
> example :

> "あおt 煽;火を煽る 呷;酒を呷る"

> My guess is that the block after ; should be ignored. I think 呷 should
> be added as a candidate. I think I can fix that.

Instead, I downloaded SKK-JISYO.L.unannotated and renamed it to
SKK-JISYO.L.

>>> I don't think SKK-JYSYO.L needs to be included in the source code, it's
>>> 2.7M and doesn't need to be there if ja-dic.el exists. It would make the
>>> tar ball a bit smaller.
> >
> > new SKK-JISYO.L much bigger than the current one?

> The new SKK-JISYO.L is 4.3M.

But the compressed one is just 400k bigger than the old one.
It is certainly big, but not disastrously so.  So, ...

> > Perhaps we should consider moving SKK-JISYO.L (and the other
> > big files) to `admin' directory which is not included in the
> > tarball.

> It sounds like a good idea, smaller tar ball saves bandwith.

I'll do that after 23.2.

---
Kenichi Handa
handa@m17n.org

=== modified file 'lisp/international/ja-dic-cnv.el'
--- lisp/international/ja-dic-cnv.el	2010-01-13 08:35:10 +0000
+++ lisp/international/ja-dic-cnv.el	2010-02-16 06:47:31 +0000
@@ -45,15 +45,6 @@
 ;; Name of a file to generate from SKK dictionary.
 (defvar ja-dic-filename "ja-dic.el")
 
-;; To make a generated ja-dic.el smaller.
-(define-coding-system 'iso-2022-7bit-short
-  "Like `iso-2022-7bit' but no ASCII designation before SPC."
-  :coding-type 'iso-2022
-  :mnemonic ?J
-  :charset-list 'iso-2022
-  :designation [(ascii t) nil nil nil]
-  :flags '(short 7-bit designation))
-
 (defun skkdic-convert-okuri-ari (skkbuf buf)
   (message "Processing OKURI-ARI entries ...")
   (goto-char (point-min))
@@ -61,24 +52,22 @@
     (insert ";; Setting okuri-ari entries.\n"
             "(skkdic-set-okuri-ari\n"))
   (while (not (eobp))
-    (let ((from (point))
-          to)
-      (end-of-line)
-      (setq to (point))
-
-      (with-current-buffer buf
-        (insert-buffer-substring skkbuf from to)
-        (beginning-of-line)
-        (insert "\"")
-        (search-forward " ")
-        (delete-char 1)                 ; delete the first '/'
-        (let ((p (point)))
-          (end-of-line)
-          (delete-char -1)              ; delete the last '/'
-          (subst-char-in-region p (point) ?/ ?  'noundo))
-        (insert "\"\n"))
+    (if (/= (following-char) ?>)
+        (let ((from (point))
+              (to (line-end-position)))
+          (with-current-buffer buf
+            (insert-buffer-substring skkbuf from to)
+            (beginning-of-line)
+            (insert "\"")
+            (search-forward " ")
+            (delete-char 1)             ; delete the first '/'
+            (let ((p (point)))
+              (end-of-line)
+              (delete-char -1)          ; delete the last '/'
+              (subst-char-in-region p (point) ?/ ?  'noundo))
+            (insert "\"\n"))))
 
-      (forward-line 1)))
+    (forward-line 1))
   (with-current-buffer buf
     (insert ")\n\n")))
 
@@ -348,7 +337,7 @@
       (erase-buffer)
       (buffer-disable-undo)
       (insert ";;; ja-dic.el --- dictionary for Japanese input method"
-              " -*-coding: iso-2022-jp; byte-compile-disable-print-circle:t; -*-\n"
+              " -*-coding: euc-japan; byte-compile-disable-print-circle:t; -*-\n"
               ";;\tGenerated by the command `skkdic-convert'\n"
               ";;\tDate: " (current-time-string) "\n"
               ";;\tOriginal SKK dictionary file: "
@@ -410,7 +399,7 @@
       ;; Save the working buffer.
       (set-buffer buf)
       (set-visited-file-name (expand-file-name ja-dic-filename dirname) t)
-      (set-buffer-file-coding-system 'iso-2022-7bit-short)
+      (set-buffer-file-coding-system 'euc-japan)
       (save-buffer 0))
     (kill-buffer skkbuf)
     (switch-to-buffer buf)))
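
On the annotation question quoted earlier in this message: the patch above does not strip annotations, because the unannotated dictionary was used instead. If one wanted to follow the guess that "the block after ; should be ignored", a minimal Python sketch might look like this. The function name `strip_annotations` is hypothetical, and the sketch assumes the usual SKK line layout of a reading, a space, and then candidates delimited by `/`, with an optional `;annotation` after each candidate.

```python
def strip_annotations(line):
    """Split an SKK entry into (reading, candidates), dropping ';...' annotations."""
    reading, _, rest = line.partition(" ")
    candidates = []
    for field in rest.strip().split("/"):
        if not field:
            # Empty strings come from the leading and trailing '/'.
            continue
        # Keep only the text before the first ';' (the candidate itself).
        candidates.append(field.split(";", 1)[0])
    return reading, candidates


# Sample line modeled on the annotated entry discussed in this thread.
reading, candidates = strip_annotations("あおt /煽;火を煽る/呷;酒を呷る/")
print(reading, candidates)
```

With this approach both candidates survive and only the annotations are discarded, which matches the suggestion that the second candidate should still be added.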