From: Kenichi Handa <handa@m17n.org>
Cc: lionel@mamane.lu, emacs-devel@gnu.org, 130397@bugs.debian.org
Subject: Re: Bug 130397 (Was: Emacs - Ispell problem with i[no]german dictionary)
Date: Tue, 4 Jan 2005 21:50:33 +0900 (JST)
Message-ID: <200501041250.VAA10883@etlken.m17n.org>
In-Reply-To: <20041222171306.GA4462@agmartin.aq.upm.es> (message from Agustin Martin on Wed, 22 Dec 2004 18:13:06 +0100)
In article <20041222171306.GA4462@agmartin.aq.upm.es>, Agustin Martin <agustin.martin@hispalinux.es> writes:
> I was aware of this, but thanks anyway for the reminder. The code is
> probably too ad hoc, but the latin{0,1} thing is also a somewhat ad hoc
> scenario, where latin0 should really have been named something like
> iso-8859-1v2, that is, a revision. I cannot imagine somebody using an
> iso-8859-2 dict and trying to write in an iso-8859-1 buffer, but with
> iso-8859-1 and iso-8859-15 that happens too frequently.
> So we have a lot of people who blindly select the locale's @euro variant
> without realizing its implications, or that iso-8859-1 and iso-8859-15
> are different but very close encodings (from a practical point of view,
> they are fully equivalent for most languages except, IIRC, French (oe, "Y)
> and Finnish ({sSzZ}^, where ^ stands for caron); the euro symbol does not
> seem significant for spellchecking).
> Furthermore (this is probably fixed by the CVS code you mentioned above),
> in current sid emacs, utf-8 files can be checked with a latin1 dict
> (provided, of course, that they do not use chars outside latin1) using
> the ispell.el internal reencodings, but this fails for a dict declared
> as iso-8859-15.
No, this is not yet fixed.
> The current state of ispell dicts in Debian is that ifrench is iso-8859-15
> by default (although it also has a real latin1 entry), while finnish does
> not set the {s,z}-caron chars at all, so it is a fully latin1 entry.
> aspell-fr and aspell-fi are set to plain latin1.
> So the only language that might currently require extra work is French, and
> for it I find it reasonable to use the iso-8859-15 entry as the default for
> emacs (tagged as iso-8859-1 for the above system to work). On this I would
> like to hear Lionel's point of view, since he has put a lot of effort into
> making iso-8859-15 available for spellchecking (Hi, Lionel).
> I personally do not like having separate iso-8859-15 entries unless they are
> really required. For the above dicts, that would be for french, and I am not
> at all sure that it is really required.
Hmmm, then how about the attached patch to the latest CVS
emacs? With that, all equivalent characters (e.g. a-grave in
all latin-X) should be handled well. This patch should also
be applicable to Emacs 21.3, but it has not yet been tested
in that version.
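
As a side illustration of why only a handful of characters matter in
this discussion, the sketch below lists the byte values that iso-8859-1
and iso-8859-15 decode to different characters. It is not part of the
patch. It assumes a Unicode-based Emacs (23 or later), where
`unibyte-string' is available, and the function name
`my-latin1-latin9-differences' is made up for this example.

```elisp
;; Illustrative sketch only; assumes Emacs 23 or later.
;; `my-latin1-latin9-differences' is a hypothetical helper name.
(defun my-latin1-latin9-differences ()
  "Return a list of (BYTE LATIN1-STRING LATIN9-STRING) entries.
Only bytes in the range #xA0..#xFF that decode differently in
iso-8859-1 and iso-8859-15 are included, in ascending order."
  (let (diffs)
    (dotimes (i 96)
      (let* ((byte (+ #xA0 i))
             (raw (unibyte-string byte))
             (latin1 (decode-coding-string raw 'iso-8859-1))
             (latin9 (decode-coding-string raw 'iso-8859-15)))
        ;; Keep only the bytes where the two decodings disagree.
        (unless (string= latin1 latin9)
          (push (list byte latin1 latin9) diffs))))
    (nreverse diffs)))
```

Evaluating (mapcar #'car (my-latin1-latin9-differences)) should give the
eight bytes #xA4, #xA6, #xA8, #xB4, #xB8, #xBC, #xBD and #xBE; a-grave
(#xE0) is not among them, since it decodes to the same character in both.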
---
Ken'ichi HANDA
handa@m17n.org
*** ispell.el 25 Dec 2004 11:43:11 +0900 1.151
--- ispell.el 03 Jan 2005 16:05:48 +0900
***************
*** 1074,1088 ****
        (decode-coding-string str (ispell-get-coding-system))
      str))
  
  (defun ispell-get-casechars ()
!   (ispell-decode-string
!    (nth 1 (assoc ispell-dictionary ispell-dictionary-alist))))
  (defun ispell-get-not-casechars ()
!   (ispell-decode-string
!    (nth 2 (assoc ispell-dictionary ispell-dictionary-alist))))
  (defun ispell-get-otherchars ()
!   (ispell-decode-string
!    (nth 3 (assoc ispell-dictionary ispell-dictionary-alist))))
  (defun ispell-get-many-otherchars-p ()
    (nth 4 (assoc ispell-dictionary ispell-dictionary-alist)))
  (defun ispell-get-ispell-args ()
--- 1074,1127 ----
        (decode-coding-string str (ispell-get-coding-system))
      str))
  
+ (put 'ispell-unified-chars-table 'char-table-extra-slots 0)
+ 
+ ;; Char-table that maps an Unicode character (charset:
+ ;; latin-iso8859-1, mule-unicode-0100-24ff) to
+ ;; a string in which all equivalent characters are listed.
+ 
+ (defconst ispell-unified-chars-table
+   (let ((table (make-char-table 'ispell-unified-chars-table)))
+     (map-char-table
+      #'(lambda (c v)
+          (if (and v (/= c v))
+              (let ((unified (or (aref table v) (string v))))
+                (aset table v (concat unified (string c))))))
+      ucs-mule-8859-to-mule-unicode)
+     table))
+ 
+ ;; Return a string decoded from Nth element of the current dictionary
+ ;; while splicing equivalent characters into the string. This splicing
+ ;; is done only if the string is a regular expression of the form
+ ;; "[...]" because, otherwise, splicing will result in incorrect
+ ;; regular expression matching.
+ 
+ (defun ispell-get-decoded-string (n)
+   (let* ((slot (assoc ispell-dictionary ispell-dictionary-alist))
+          (str (nth n slot)))
+     (when (and (> (length str) 0)
+                (not (multibyte-string-p str)))
+       (setq str (ispell-decode-string str))
+       (if (and (= (aref str 0) ?\[)
+                (eq (string-match "\\]" str) (1- (length str))))
+           (setq str
+                 (string-as-multibyte
+                  (mapconcat
+                   #'(lambda (c)
+                       (let ((unichar (aref ucs-mule-8859-to-mule-unicode c)))
+                         (if unichar
+                             (aref ispell-unified-chars-table unichar)
+                           (string c))))
+                   str ""))))
+       (setcar (nthcdr n slot) str))
+     str))
+ 
  (defun ispell-get-casechars ()
!   (ispell-get-decoded-string 1))
  (defun ispell-get-not-casechars ()
!   (ispell-get-decoded-string 2))
  (defun ispell-get-otherchars ()
!   (ispell-get-decoded-string 3))
  (defun ispell-get-many-otherchars-p ()
    (nth 4 (assoc ispell-dictionary ispell-dictionary-alist)))
  (defun ispell-get-ispell-args ()