unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Re: iso-8859-1 and non-latin-1 chars
@ 2002-11-28 17:01 Dave Love
  2002-12-02 15:47 ` Richard Stallman
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Love @ 2002-11-28 17:01 UTC (permalink / raw)


On querying a change, handa referred me to a thread which lead to it.
The archive loses threading info, sorry.

> Ok, I've just installed this change.
> 
> 2002-11-18  Kenichi Handa  <[13]handa@m17n.org>
> 
>         * language/cyrillic.el (cyrillic-iso-8bit): Make it safe.
> 
>         * language/european.el (iso-latin-1): Make it safe.
>         (iso-latin-2, iso-latin-3, iso-latin-4, iso-latin-5, iso-latin-8)
>         (iso-latin-9): Likewise.
> 
>         * language/greek.el (greek-iso-8bit): Make it safe.
> 
>         * language/hebrew.el (hebrew-iso-8bit): Make it safe.
> 
>         * language/lao.el (lao): Make it safe.
> 
>         * language/thai.el (thai-tis620): Make it safe.       

I think this is the wrong thing to do.  As rms said, it's a
user-visible change that affects more than Ispell.  It breaks the
principle that Emacs tries to avoid losing information in coding
conversions (whereas XEmacs readily trashes data).  If something uses
one of these encodings incorrectly now you can't recover the data as
you used to be able to.

Anyhow, if the issue is just Ispell, it's definitely the wrong fix.
ispell.el and similar stuff shouldn't send un-encodable text in the
first place.  (If `?' happens to be one of the extra word characters
for Ispell, you'll lose anyway.)

With the development source, you can easily fix what Stefan complained
about as follows.  There's more to it than that, though -- see the
comment in the diff.  I haven't had time to sort that out, though I
did make changes to Flyspell along similar lines.  That's easier,
since Flyspell already works word-wise (roughly), but of course you
likely run into problems displaying the choices without multilingual
menus.

[If you really wanted to fix such a thing just with a coding system
change, you could set up a scratch coding system for the job or
temporarily set a coding system property around the process setup.]

By the way, ispell.el in CVS isn't up-to-date with Stevens' version.

*** ispell.el.~1.133.~	Tue Nov 19 14:49:21 2002
--- ispell.el	Mon Nov 25 15:41:02 2002
***************
*** 1347,1364 ****
  	(or quietly
  	    (message "Checking spelling of %s..."
  		     (funcall ispell-format-word word)))
! 	(ispell-send-string "%\n")	; put in verbose mode
! 	(ispell-send-string (concat "^" word "\n"))
! 	;; wait until ispell has processed word
! 	(while (progn
! 		 (ispell-accept-output)
! 		 (not (string= "" (car ispell-filter)))))
! 	;;(ispell-send-string "!\n") ;back to terse mode.
! 	(setq ispell-filter (cdr ispell-filter)) ; remove extra \n
! 	(if (and ispell-filter (listp ispell-filter))
! 	    (if (> (length ispell-filter) 1)
! 		(error "Ispell and its process have different character maps")
! 	      (setq poss (ispell-parse-output (car ispell-filter)))))
  	(cond ((eq poss t)
  	       (or quietly
  		   (message "%s is correct"
--- 1347,1369 ----
  	(or quietly
  	    (message "Checking spelling of %s..."
  		     (funcall ispell-format-word word)))
! 	(if (and enable-multibyte-characters
! 		 (unencodable-char-position
! 		  0 (length word) (process-coding-system ispell-process)
! 		  nil word))
! 	    (setq poss (list word 1 nil nil))
! 	  (ispell-send-string "%\n")	; put in verbose mode
! 	  (ispell-send-string (concat "^" word "\n"))
! 	  ;; wait until ispell has processed word
! 	  (while (progn
! 		   (ispell-accept-output)
! 		   (not (string= "" (car ispell-filter)))))
! 	  ;;(ispell-send-string "!\n") ;back to terse mode.
! 	  (setq ispell-filter (cdr ispell-filter)) ; remove extra \n
! 	  (if (and ispell-filter (listp ispell-filter))
! 	      (if (> (length ispell-filter) 1)
! 		  (error "Ispell and its process have different character maps")
! 		(setq poss (ispell-parse-output (car ispell-filter))))))
  	(cond ((eq poss t)
  	       (or quietly
  		   (message "%s is correct"
***************
*** 2604,2609 ****
--- 2609,2628 ----
    (let (poss accept-list)
      (if (not (numberp shift))
  	(setq shift 0))
+     (if (and enable-multibyte-characters
+ 	     (fboundp 'unencodable-char-position))
+ 	;; Avoid sending un-encodable input to the process, which can
+ 	;; specifically confuse the current implementation.  Fixme: Do
+ 	;; it for 21.2 too.  Fixme: The implementation here needs
+ 	;; changing to check word-by-word (according to syntax tables,
+ 	;; not a fixed list of characters) from known positions in the
+ 	;; buffer, not not looking for matches of ispell output (which
+ 	;; may be inappropriately encoded, for instance) in the
+ 	;; original buffer.
+ 	(dolist (i (unencodable-char-position
+ 		    0 (length string) (process-coding-system ispell-process)
+ 		    (length string) string))
+ 	  (aset string i ?\ )))
      ;; send string to spell process and get input.
      (ispell-send-string string)
      (while (progn

^ permalink raw reply	[flat|nested] 38+ messages in thread
* iso-8859-1 and non-latin-1 chars
@ 2002-11-07 14:57 Stefan Monnier
  2002-11-07 15:25 ` Eli Zaretskii
  0 siblings, 1 reply; 38+ messages in thread
From: Stefan Monnier @ 2002-11-07 14:57 UTC (permalink / raw)



When encoding text containing non-latin-1 chars with the latin-1
coding-system, they get output as some kind of escape sequence.

Would it be possible to output a `?' instead ?
If so, how ?


	Stefan


PS: This manifests itself in the "ispell misalignment" problem because
    ispell doesn't like these escape sequences.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2003-01-10 10:59 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-11-28 17:01 iso-8859-1 and non-latin-1 chars Dave Love
2002-12-02 15:47 ` Richard Stallman
2002-12-06 16:38   ` Dave Love
2002-12-09  6:08     ` Kenichi Handa
2002-12-15 16:24       ` Dave Love
2002-12-16  0:42         ` Kenichi Handa
2002-12-19 22:35           ` Dave Love
2002-12-23  6:40             ` Kenichi Handa
2002-12-23 12:27               ` Dave Love
2002-12-25 13:05                 ` Kenichi Handa
2002-12-31 17:14                   ` Ken Stevens
2003-01-06 19:28                     ` Dave Love
2003-01-06 19:18                   ` Dave Love
2003-01-07 13:01                     ` Kenichi Handa
2003-01-10 10:59                       ` Dave Love
2003-01-06 19:19                   ` Dave Love
2002-12-16 14:06         ` Stefan Monnier
2002-12-19 22:33           ` Dave Love
2002-12-16 16:42         ` Richard Stallman
     [not found]       ` <E18LZqb-0007si-00@fencepost.gnu.org>
2002-12-15 16:25         ` Dave Love
2002-12-16 16:42           ` Richard Stallman
     [not found]     ` <E18LCz8-0004It-00@fencepost.gnu.org>
2002-12-10 23:47       ` Dave Love
2002-12-11 20:39         ` Richard Stallman
2002-12-13  2:58           ` Kenichi Handa
2002-12-14 18:31             ` Richard Stallman
2002-12-17 11:41               ` None Kenichi Handa
  -- strict thread matches above, loose matches on Subject: below --
2002-11-07 14:57 iso-8859-1 and non-latin-1 chars Stefan Monnier
2002-11-07 15:25 ` Eli Zaretskii
2002-11-07 17:06   ` Stefan Monnier
2002-11-07 23:42     ` Kenichi Handa
2002-11-07 23:58       ` Stefan Monnier
2002-11-09 11:54       ` Richard Stallman
2002-11-09 20:32         ` Stefan Monnier
2002-11-11 10:19           ` Richard Stallman
2002-11-11  4:00         ` Kenichi Handa
2002-11-12  5:47           ` Richard Stallman
2002-11-18  0:08             ` Kenichi Handa
2002-11-18 19:09               ` Richard Stallman

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).