From: Dmitry Alexandrov <dag@gnui.org>
To: Eric Abrahamsen <eric@ericabrahamsen.net>
Cc: help-gnu-emacs@gnu.org
Subject: Re: Hunspell and contractions with apostrophes
Date: Wed, 27 May 2020 03:23:16 +0300
Message-ID: <o8qa5iq3.dag@gnui.org>
In-Reply-To: <87y2pelh8t.fsf@ericabrahamsen.net> (Eric Abrahamsen's message of "Tue, 26 May 2020 10:48:34 -0700")

Eric Abrahamsen <eric@ericabrahamsen.net> wrote:
> I've battled with this for years now: Hunspell marks any contraction with an apostrophe (eg the "I've" that starts this sentence) as misspelled.

Iʼd say this is an obvious bug in the dictionary that should be reported.

FWIW, it is present in Debian 10 as well:

	$ HOME=/tmp DICPATH='' hunspell -d en_US
	Hunspell 1.7.0
	I've
	*
	& ve 15 2: be, v, e, eve, vie, ave, vet, veg, Eve, Ave, vs, vi, re, me, he

and I never noticed it only because I use the en_GB dictionary, which is fine:

	$ HOME=/tmp DICPATH='' hunspell -d en_GB
	Hunspell 1.7.0
	I've
	*

> It used to be that I could edit /usr/share/hunspell/en_US.aff and add the apostrophe to WORDCHARS (and also "ICONV ’ '")

First and foremost, your ’ is *not* an apostrophe; itʼs a right single quote (U+2019).  The apostrophe is ʼ (U+02BC).

This does matter: just check how word-motion commands act on the weird “I’ve” versus the proper “Iʼve” and the ASCII (but no less proper) “I've”.
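
Since the three characters look nearly identical in many fonts, a quick way to tell them apart is to dump their UTF-8 bytes.  This is only a sketch; the helper name utf8_hex and the variable names are made up for illustration:

```shell
# Hypothetical helper: print the UTF-8 bytes of a string as lowercase hex.
utf8_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# The characters are built from octal escapes so that this snippet
# does not depend on the encoding of the file it is saved in.
ascii_apostrophe="'"                          # U+0027
modifier_apostrophe=$(printf '\312\274')      # U+02BC
right_single_quote=$(printf '\342\200\231')   # U+2019

for c in "$ascii_apostrophe" "$modifier_apostrophe" "$right_single_quote"; do
    printf '%s  %s\n' "$c" "$(utf8_hex "$c")"
done
```

The hex column should read 27, cabc and e28099 respectively.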

> and that would do it. Until the next time the hunspell package updated, and over-wrote its config files (I'm running Arch linux), and I would have to do it again.

Sure.  You are not supposed to tamper with files under package management.  Put your customized dictionaries somewhere else (in /etc or in your home directory).  I do not remember whether hunspell(1) has any non-/usr paths hardcoded, but these lines in my ~/.profile suggest that it does not:

	if [ -z "$DICPATH" ]; then
	    if [ -d '/usr/share/hunspell' ]; then
	        DICPATH='/usr/share/hunspell'
	    fi
	fi
	
	if [ -d "$HOME/.share/hunspell" ]; then
	    DICPATH="$HOME/.share/hunspell:$DICPATH"
	fi
	
	if [ -d "$HOME/.local/share/hunspell" ]; then
	    DICPATH="$HOME/.local/share/hunspell:$DICPATH"
	fi
	
	export DICPATH
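
With such a DICPATH in place, one possible way to get an editable dictionary is to copy the packaged pair of files into the per-user directory and change only the copy.  A minimal sketch; the function name copy_dict is hypothetical, and the paths in the usage comment assume the layout from the snippet above:

```shell
# Hypothetical helper: copy NAME.aff and NAME.dic from directory SRC
# into directory DEST, creating DEST if needed.
copy_dict() {
    name=$1 src=$2 dest=$3
    mkdir -p "$dest" &&
    cp "$src/$name.aff" "$src/$name.dic" "$dest/"
}

# Example (not run here):
#   copy_dict en_US /usr/share/hunspell "$HOME/.local/share/hunspell"
```

Afterwards, hunspell -D should list the search path and show which copy of the dictionary actually gets loaded.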

> As of six months or a year or so ago, that trick no longer works.

It seems that the question has changed in the meantime.  Whether a right single quote is recognized as an apostrophe is orthogonal to whether “I've” is recognized as a correct English word.

Things like “I've” or “I'm” are normally listed explicitly in the dictionary (since something like “pointʼve” is not entirely okay, AFAIU).

If they are there, then double-check that the affix file in use does have the apostrophe among WORDCHARS:

	WORDCHARS 0123456789'

(Thatʼs what is wrong with en_US.aff in Debian.)
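
The WORDCHARS check can be scripted, for instance to re-run it after a package update.  A sketch only; the function name is made up, and it tests for the ASCII apostrophe alone:

```shell
# Hypothetical helper: succeed if FILE has a WORDCHARS line that
# contains the ASCII apostrophe.  LC_ALL=C makes grep work on raw
# bytes, so affix files in legacy encodings do not confuse it.
wordchars_has_apostrophe() {
    LC_ALL=C grep -q "^WORDCHARS.*'" "$1"
}

# Example (not run here):
#   wordchars_has_apostrophe /usr/share/hunspell/en_US.aff || echo 'no apostrophe in WORDCHARS'
```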

If “I've” were still flagged as a mistake after that, that would be pretty odd.


Now to Unicode.

Make sure that the affix file correctly declares its encoding:

	SET UTF-8

Make the Unicode apostrophe recognized as an apostrophe:

	ICONV 1
	ICONV ʼ '

Optionally, make the Unicode apostrophe preferred over the ASCII one in suggestions:

	OCONV 1
	OCONV ' ʼ
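
As far as I understand the hunspell(5) format, the number on the first ICONV (or OCONV) line has to match the number of mapping lines that follow it, which is easy to get wrong when adding an entry to an existing table.  A sketch of a consistency check; check_conv_table is a hypothetical name, and it assumes one table per keyword:

```shell
# Hypothetical helper: succeed if the KEY table (ICONV or OCONV) in FILE
# declares as many entries as it actually contains.  Fails if the table
# is missing altogether.
check_conv_table() {
    awk -v key="$1" '
        $1 == key && !seen { declared = $2; seen = 1; next }  # header line
        $1 == key          { actual++ }                       # mapping line
        END { exit !(seen && declared == actual + 0) }
    ' "$2"
}

# Example (not run here):
#   check_conv_table ICONV "$HOME/.local/share/hunspell/en_US.aff" && echo 'ICONV table looks consistent'
```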


Now to Emacs.

If all of the above works with hunspell(1) itself, no configuration besides (setq ispell-program-name "hunspell") should be required.

If you insist on using the right single quote as an apostrophe, though, I have no idea how to make ispell.el pass it to hunspell(1) as part of a word.  Nor why one would ever do that.