From: Tor Kringeland
Newsgroups: gmane.emacs.bugs
Subject: bug#52179: Highlighting a word in `ispell' using `enchant'
Date: Mon, 29 Nov 2021 21:46:05 +0100
To: Eli Zaretskii
Cc: 52179@debbugs.gnu.org
In-Reply-To: <83k0grui9h.fsf@gnu.org>
References: <83k0grui9h.fsf@gnu.org>

Eli Zaretskii writes:

>> From: Tor Kringeland
>> Date: Mon, 29 Nov 2021 15:44:39 +0100
>>
>> Using `ispell' with `enchant' on macOS yields the following problem.  If
>> a word contains some non-ASCII character, said character will not be
>> considered part of the word and will split it (like a digit would).  For
>> example in "naïve" both "na" and "ve" are considered two words.  This
>> does not happen if I use `aspell' instead of `enchant', and if I run
>>
>>   echo -n "naïve" | enchant-2 -a
>>
>> it registers that this is one word, and that it is valid (using an
>> English dictionary).
>>
>> I'm using Enchant version 2.3.1 and an Emacs 29 build from 24 November
>> on macOS Catalina.
>
> Which dictionary do you use, and what encoding does that dictionary
> require?

In Emacs, the relevant entry in `ispell-dictionary-alist' is

  ("en" "[[:alpha:]]" "[^[:alpha:]]" "" t nil nil utf-8)

I installed `aspell' and `enchant' from Homebrew.  The installation of
`aspell' included a bunch of dictionaries downloaded from gnu.org.  In
particular, the "en" dictionary is downloaded from [1].  It is in some
kind of binary format after installation (see [2] for details).
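To make the symptom concrete, here is a minimal sketch in Python of what
the splitting looks like.  It is only an illustration, not how ispell.el
or enchant work internally: the two regular expressions are hypothetical
stand-ins for the CASECHARS field of the entry above, one limited to
ASCII letters and one that also accepts non-ASCII letters.

```python
import re

# Hypothetical stand-ins for the CASECHARS field of the dictionary entry.
# These are assumptions for illustration, not what Emacs actually runs.
ASCII_WORD = re.compile(r"[A-Za-z]+")    # ASCII letters only
UNICODE_WORD = re.compile(r"[^\W\d_]+")  # roughly: any Unicode letter


def split_words(text, pattern):
    """Return the runs of characters that PATTERN treats as words."""
    return pattern.findall(text)


# "na\u00efve" is "naive" spelled with a precomposed i-diaeresis (U+00EF).
ascii_split = split_words("na\u00efve", ASCII_WORD)      # the symptom I see
unicode_split = split_words("na\u00efve", UNICODE_WORD)  # what I expect
digit_split = split_words("one0two", UNICODE_WORD)       # digits do split

print(ascii_split, unicode_split, digit_split)
```

With the ASCII-only pattern the word comes out as "na" and "ve", which
matches what I see with `enchant'; with the Unicode-aware pattern it
stays one word, while "one0two" is still split at the digit.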
The weird part is that it works fine on the command line, and switching
`ispell-program-name' to `aspell' fixes the issue, so the problem seems
to lie in how Emacs interacts with the `enchant-2' binary.  Non-ASCII
characters are being handled the way one would expect digits to be: the
string "one0two" is valid, as "one" and "two" are treated as separate
words and "0" is ignored.

- [1] https://ftp.gnu.org/gnu/aspell/dict/en/aspell6-en-2018.04.16-0.tar.bz2
- [2] https://github.com/Homebrew/homebrew-core/blob/master/Formula/aspell.rb