From: Agustin Martin
Newsgroups: gmane.emacs.devel
Subject: Bad ispell.el <-> aspell-0.60 interactions in utf8
Date: Wed, 6 Apr 2005 17:25:58 +0200
Message-ID: <20050406152558.GA1975@agmartin.aq.upm.es>
Reply-To: Agustin Martin, emacs-devel@gnu.org
To: emacs-devel@gnu.org

(Please cc me on replies; I am not subscribed to emacs-devel.)

Hi,

Just to let you know about a problem that has been reported to Debian and
that seems to be caused by an undesired interaction between ispell.el and
aspell-0.60 when the environment (really LC_CTYPE) is utf8,

http://bugs.debian.org/299725

In summary, when run in a UTF-8 environment, aspell 0.60 expects UTF-8 text
and returns UTF-8 text, so if latin1 text is piped to it, words containing
non-ASCII characters are mishandled.
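
To make the mismatch concrete at the byte level, here is a minimal Python sketch. It is not taken from ispell.el or aspell; the helper names try_utf8 and ascii_runs are made up for illustration. It also assumes that a reader expecting UTF-8 treats a byte it cannot decode as a word separator, which is only a guess that happens to be consistent with the aspell output quoted further down.

```python
# The word "rôle", written with an escape so the source file encoding is irrelevant.
WORD = "r\u00f4le"

# What a latin1 dictionary entry makes the caller send: one byte per character.
latin1_bytes = WORD.encode("latin-1")   # b'r\xf4le'
# What a UTF-8 reader expects instead: the accented letter takes two bytes.
utf8_bytes = WORD.encode("utf-8")       # b'r\xc3\xb4le'


def try_utf8(data):
    """Return the decoded text, or None if data is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def ascii_runs(data):
    """Hypothetical model: keep only runs of ASCII letters, with byte offsets.

    Any other byte (here the stray 0xF4) is assumed to act as a separator.
    """
    runs = []
    start = None
    for i, b in enumerate(data):
        is_letter = (65 <= b <= 90) or (97 <= b <= 122)
        if is_letter and start is None:
            start = i
        elif not is_letter and start is not None:
            runs.append((start, data[start:i].decode("ascii")))
            start = None
    if start is not None:
        runs.append((start, data[start:].decode("ascii")))
    return runs


if __name__ == "__main__":
    print(try_utf8(utf8_bytes))      # decodes cleanly
    print(try_utf8(latin1_bytes))    # None: 0xF4 is not valid UTF-8 here
    print(ascii_runs(latin1_bytes))  # the pieces left on either side of 0xF4
```

Under that assumption, the latin1 bytes fall apart into "r" at offset 0 and "le" at offset 2, while the UTF-8 bytes decode back to the original word.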

Piping latin1 text to an aspell that expects UTF-8 sounds crazy when done
from the command line, but it seems to be what happens when ispell.el pipes
text to aspell as latin1 (because the corresponding ispell-dictionary-alist
entry says the dict is latin1) while aspell runs in a utf8 environment.
E.g., piping the word rôle (as is, in latin1 encoding) to aspell (as
aspell -a -d british-w_accents) in a latin1 environment gives

@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3-20050121)
& rôle 35 0: role, Roley, rile, Rolfe, roles, tole, roe, Ole, olé, roll, rule, prole, Rolf, roué, Cole, Dole, Pole, Rome, Rose, Rowe, Roze, bole, dole, hole, mole, pole, robe, rode, rope, rose, rote, rove, sole, vole, role's

but doing the same in a utf8 environment (with the word still in latin1)
returns

@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3-20050121)
*
& le 73 2: Le, Lea, Lee, Leo, Lew, Lie, lea, lee, lei, lie, El, L, l, LED, Lek, Lem, Len, Les, Lev, Lr, led, leg, let, lye, E, e, LA, LL, La, Li, Lu, Ly, la, ll, lo, Ole, ale, olÃ©, LC, LP, Ln, Lt, lb, lg, ls, Be, Ce, DE, De, Fe, GE, Ge, He, IE, ME, Me, NE, Ne, OE, PE, Re, SE, Se, Te, Xe, be, he, me, re, we, ye, Le's, L's

This last case seems to be what Emacs does through ispell.el, resulting in
an 'Ispell and its process have different charsets' error on ispell-word.

The fix I am considering is to modify ispell.el so that
--encoding=ispell_dict_encoding is added to the aspell call (and only to the
aspell call), assuming C.J.
Madsen's patch for aspell-learn-from-user-misspelings is applied and
ispell-really-aspell is available (as it is in Emacs CVS ispell.el):

diff -urNad dictionaries-common/support/emacsen/ispell.el /tmp/dpep.3PGMwL/dictionaries-common/support/emacsen/ispell.el
--- dictionaries-common/support/emacsen/ispell.el	Sun Apr  3 23:27:46 2005
+++ /tmp/dpep.3PGMwL/dictionaries-common/support/emacsen/ispell.el	Sun Apr  3 23:29:55 2005
@@ -2250,8 +2250,16 @@
                    (append args
                            (list "-p"
                                  (expand-file-name ispell-personal-dictionary)))))
+    ;; ----- Debian changes
+    (if ispell-really-aspell
+        (setq args
+              (append args
+                      (list
+                       (concat "--encoding="
+                               (symbol-name (ispell-get-coding-system)))))))
+    ;; ----- End of Debian changes
     (setq args (append args ispell-extra-args))
-
+    
     (if ispell-async-processp
         (let ((process-connection-type ispell-use-ptys-p))
           (apply 'start-process

so we make sure that both ispell.el and aspell use the same encoding.

Note that this will work neither with aspell-0.33 (no --encoding option
available) nor with aspell-0.50 (accepts only the iso8859-1 spelling of
encoding names, while aspell-0.60 also accepts iso-8859-1). I think all
dicts in ispell.el are supported by the current aspell-0.60 syntax.

This problem does not seem to appear with aspell-0.{33,50} (no utf8 support
available).

For a more general ispell.el, some additional checking of the aspell version
might be desirable.

Cheers,

-- 
Agustin