From: Dave Love
Newsgroups: gmane.emacs.devel
Subject: Re: iso-8859-1 and non-latin-1 chars
Date: 28 Nov 2002 17:01:59 +0000
Original-To: emacs-devel@gnu.org
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
Xref: main.gmane.org gmane.emacs.devel:9732

On querying a change, handa referred me to a thread which led to it.
The archive loses threading info, sorry.

> Ok, I've just installed this change.
>
> 2002-11-18  Kenichi Handa  <handa@m17n.org>
>
>         * language/cyrillic.el (cyrillic-iso-8bit): Make it safe.
>
>         * language/european.el (iso-latin-1): Make it safe.
>         (iso-latin-2, iso-latin-3, iso-latin-4, iso-latin-5, iso-latin-8)
>         (iso-latin-9): Likewise.
>
>         * language/greek.el (greek-iso-8bit): Make it safe.
>
>         * language/hebrew.el (hebrew-iso-8bit): Make it safe.
>
>         * language/lao.el (lao): Make it safe.
>
>         * language/thai.el (thai-tis620): Make it safe.

I think this is the wrong thing to do.  As rms said, it's a
user-visible change that affects more than Ispell.  It breaks the
principle that Emacs tries to avoid losing information in coding
conversions (whereas XEmacs readily trashes data).  If something uses
one of these encodings incorrectly now you can't recover the data as
you used to be able to.

Anyhow, if the issue is just Ispell, it's definitely the wrong fix.
ispell.el and similar stuff shouldn't send un-encodable text in the
first place.  (If `?' happens to be one of the extra word characters
for Ispell, you'll lose anyway.)

With the development source, you can easily fix what Stefan complained
about as follows.  There's more to it than that, though -- see the
comment in the diff.  I haven't had time to sort that out, though I
did make changes to Flyspell along similar lines.  That's easier,
since Flyspell already works word-wise (roughly), but of course you
likely run into problems displaying the choices without multilingual
menus.

[If you really wanted to fix such a thing just with a coding system
change, you could set up a scratch coding system for the job or
temporarily set a coding system property around the process setup.]

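To make the "don't send un-encodable text" point concrete, here is a
minimal sketch of the check on its own.  The helper name
`my-word-encodable-p' is made up for illustration and is not part of
ispell.el; it assumes an Emacs that provides
`unencodable-char-position'.

```elisp
;; Illustrative only: `my-word-encodable-p' is a hypothetical helper,
;; not part of ispell.el.
(defun my-word-encodable-p (word coding-system)
  "Return non-nil if CODING-SYSTEM can encode every character of WORD."
  ;; With a STRING argument, START and END are indices into the string.
  ;; The function returns nil when nothing is unencodable.
  (not (unencodable-char-position 0 (length word) coding-system nil word)))

;; A Latin-1 word ("cafe" with e-acute), then a Cyrillic word.
(my-word-encodable-p "caf\u00e9" 'iso-latin-1)
(my-word-encodable-p "\u043a\u043e\u0442" 'iso-latin-1)
```

For a live process you would presumably pass the encoding half of the
pair, i.e. (cdr (process-coding-system proc)), since
`process-coding-system' returns a (DECODING . ENCODING) cons.
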
By the way, ispell.el in CVS isn't up-to-date with Stevens' version.

*** ispell.el.~1.133.~	Tue Nov 19 14:49:21 2002
--- ispell.el	Mon Nov 25 15:41:02 2002
***************
*** 1347,1364 ****
        (or quietly
            (message "Checking spelling of %s..."
                     (funcall ispell-format-word word)))
!       (ispell-send-string "%\n")        ; put in verbose mode
!       (ispell-send-string (concat "^" word "\n"))
!       ;; wait until ispell has processed word
!       (while (progn
!                (ispell-accept-output)
!                (not (string= "" (car ispell-filter)))))
!       ;;(ispell-send-string "!\n") ;back to terse mode.
!       (setq ispell-filter (cdr ispell-filter)) ; remove extra \n
!       (if (and ispell-filter (listp ispell-filter))
!           (if (> (length ispell-filter) 1)
!               (error "Ispell and its process have different character maps")
!             (setq poss (ispell-parse-output (car ispell-filter)))))
        (cond ((eq poss t)
               (or quietly
                   (message "%s is correct"
--- 1347,1369 ----
        (or quietly
            (message "Checking spelling of %s..."
                     (funcall ispell-format-word word)))
!       (if (and enable-multibyte-characters
!                (unencodable-char-position
!                 0 (length word) (process-coding-system ispell-process)
!                 nil word))
!           (setq poss (list word 1 nil nil))
!         (ispell-send-string "%\n")      ; put in verbose mode
!         (ispell-send-string (concat "^" word "\n"))
!         ;; wait until ispell has processed word
!         (while (progn
!                  (ispell-accept-output)
!                  (not (string= "" (car ispell-filter)))))
!         ;;(ispell-send-string "!\n") ;back to terse mode.
!         (setq ispell-filter (cdr ispell-filter)) ; remove extra \n
!         (if (and ispell-filter (listp ispell-filter))
!             (if (> (length ispell-filter) 1)
!                 (error "Ispell and its process have different character maps")
!               (setq poss (ispell-parse-output (car ispell-filter))))))
        (cond ((eq poss t)
               (or quietly
                   (message "%s is correct"
***************
*** 2604,2609 ****
--- 2609,2628 ----
    (let (poss accept-list)
      (if (not (numberp shift))
          (setq shift 0))
+     (if (and enable-multibyte-characters
+              (fboundp 'unencodable-char-position))
+         ;; Avoid sending un-encodable input to the process, which can
+         ;; specifically confuse the current implementation.  Fixme: Do
+         ;; it for 21.2 too.  Fixme: The implementation here needs
+         ;; changing to check word-by-word (according to syntax tables,
+         ;; not a fixed list of characters) from known positions in the
+         ;; buffer, not not looking for matches of ispell output (which
+         ;; may be inappropriately encoded, for instance) in the
+         ;; original buffer.
+         (dolist (i (unencodable-char-position
+                     0 (length string) (process-coding-system ispell-process)
+                     (length string) string))
+           (aset string i ?\ )))
      ;; send string to spell process and get input.
      (ispell-send-string string)
      (while (progn