unofficial mirror of emacs-devel@gnu.org 
From: Dave Love <d.love@dl.ac.uk>
Subject: Re: iso-8859-1 and non-latin-1 chars
Date: 28 Nov 2002 17:01:59 +0000
Message-ID: <rzq65uhy6ew.fsf@albion.dl.ac.uk>

On querying a change, Handa referred me to a thread which led to it.
The archive loses threading info, sorry.

> Ok, I've just installed this change.
> 
> 2002-11-18  Kenichi Handa  <handa@m17n.org>
> 
>         * language/cyrillic.el (cyrillic-iso-8bit): Make it safe.
> 
>         * language/european.el (iso-latin-1): Make it safe.
>         (iso-latin-2, iso-latin-3, iso-latin-4, iso-latin-5, iso-latin-8)
>         (iso-latin-9): Likewise.
> 
>         * language/greek.el (greek-iso-8bit): Make it safe.
> 
>         * language/hebrew.el (hebrew-iso-8bit): Make it safe.
> 
>         * language/lao.el (lao): Make it safe.
> 
>         * language/thai.el (thai-tis620): Make it safe.       

I think this is the wrong thing to do.  As rms said, it's a
user-visible change that affects more than Ispell.  It breaks the
principle that Emacs tries to avoid losing information in coding
conversions (whereas XEmacs readily trashes data).  If something uses
one of these encodings incorrectly now you can't recover the data as
you used to be able to.

Anyhow, if the issue is just Ispell, it's definitely the wrong fix.
ispell.el and similar stuff shouldn't send un-encodable text in the
first place.  (If `?' happens to be one of the extra word characters
for Ispell, you'll lose anyway.)
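
To make the point concrete, here is a minimal sketch of the kind of
check I mean, outside of ispell.el.  The function names are made up
for illustration, and the coding system is passed in explicitly; a
real caller would get it from the spell-checker process (in current
Emacs `process-coding-system' returns a cons of decoding and encoding
systems, so the encoding side would be its cdr).

```elisp
;; Hypothetical helpers for illustration only; not part of ispell.el.

(defun my-word-encodable-p (word coding-system)
  "Return non-nil if every character of WORD is encodable by CODING-SYSTEM."
  ;; With COUNT nil, `unencodable-char-position' returns the index of
  ;; the first character that cannot be encoded, or nil if there is none.
  (not (unencodable-char-position 0 (length word) coding-system nil word)))

(defun my-blank-unencodable (string coding-system)
  "Return a copy of STRING with unencodable characters replaced by spaces.
Character positions are preserved, so offsets reported by the
process still line up with the original text."
  (let ((copy (copy-sequence string)))
    ;; With an integer COUNT, the function returns a list of up to
    ;; COUNT offending indices.
    (dolist (pos (unencodable-char-position
                  0 (length copy) coding-system (length copy) copy))
      (aset copy pos ?\s))
    copy))
```

For instance, (my-word-encodable-p "日本" 'iso-latin-1) should return
nil, so such a word would never be sent to a Latin-1 process, and
my-blank-unencodable would blank those characters in a whole line
rather than letting the encoder substitute something for them.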

With the development source, you can easily fix what Stefan complained
about as follows.  There's more to it than that, though -- see the
comment in the diff.  I haven't had time to sort that out, though I
did make changes to Flyspell along similar lines.  That's easier,
since Flyspell already works word-wise (roughly), but of course you
likely run into problems displaying the choices without multilingual
menus.

[If you really wanted to fix such a thing just with a coding system
change, you could set up a scratch coding system for the job or
temporarily set a coding system property around the process setup.]

By the way, ispell.el in CVS isn't up-to-date with Stevens' version.

*** ispell.el.~1.133.~	Tue Nov 19 14:49:21 2002
--- ispell.el	Mon Nov 25 15:41:02 2002
***************
*** 1347,1364 ****
  	(or quietly
  	    (message "Checking spelling of %s..."
  		     (funcall ispell-format-word word)))
! 	(ispell-send-string "%\n")	; put in verbose mode
! 	(ispell-send-string (concat "^" word "\n"))
! 	;; wait until ispell has processed word
! 	(while (progn
! 		 (ispell-accept-output)
! 		 (not (string= "" (car ispell-filter)))))
! 	;;(ispell-send-string "!\n") ;back to terse mode.
! 	(setq ispell-filter (cdr ispell-filter)) ; remove extra \n
! 	(if (and ispell-filter (listp ispell-filter))
! 	    (if (> (length ispell-filter) 1)
! 		(error "Ispell and its process have different character maps")
! 	      (setq poss (ispell-parse-output (car ispell-filter)))))
  	(cond ((eq poss t)
  	       (or quietly
  		   (message "%s is correct"
--- 1347,1369 ----
  	(or quietly
  	    (message "Checking spelling of %s..."
  		     (funcall ispell-format-word word)))
! 	(if (and enable-multibyte-characters
! 		 (unencodable-char-position
! 		  0 (length word) (process-coding-system ispell-process)
! 		  nil word))
! 	    (setq poss (list word 1 nil nil))
! 	  (ispell-send-string "%\n")	; put in verbose mode
! 	  (ispell-send-string (concat "^" word "\n"))
! 	  ;; wait until ispell has processed word
! 	  (while (progn
! 		   (ispell-accept-output)
! 		   (not (string= "" (car ispell-filter)))))
! 	  ;;(ispell-send-string "!\n") ;back to terse mode.
! 	  (setq ispell-filter (cdr ispell-filter)) ; remove extra \n
! 	  (if (and ispell-filter (listp ispell-filter))
! 	      (if (> (length ispell-filter) 1)
! 		  (error "Ispell and its process have different character maps")
! 		(setq poss (ispell-parse-output (car ispell-filter))))))
  	(cond ((eq poss t)
  	       (or quietly
  		   (message "%s is correct"
***************
*** 2604,2609 ****
--- 2609,2628 ----
    (let (poss accept-list)
      (if (not (numberp shift))
  	(setq shift 0))
+     (if (and enable-multibyte-characters
+ 	     (fboundp 'unencodable-char-position))
+ 	;; Avoid sending un-encodable input to the process, which can
+ 	;; specifically confuse the current implementation.  Fixme: Do
+ 	;; it for 21.2 too.  Fixme: The implementation here needs
+ 	;; changing to check word-by-word (according to syntax tables,
+ 	;; not a fixed list of characters) from known positions in the
+ 	;; buffer, not looking for matches of ispell output (which
+ 	;; may be inappropriately encoded, for instance) in the
+ 	;; original buffer.
+ 	(dolist (i (unencodable-char-position
+ 		    0 (length string) (process-coding-system ispell-process)
+ 		    (length string) string))
+ 	  (aset string i ?\ )))
      ;; send string to spell process and get input.
      (ispell-send-string string)
      (while (progn
