unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#19558: ispell in 24.4: hunspell cannot check German text
From: Heinz Rommerskirchen @ 2015-01-10 14:02 UTC
  To: 19558

When using hunspell, ispell cannot check German texts.

To reproduce the problem, create a file containing only this one line:
 > zwanzigjährigen Arbeitszeit bei der Motorenfabrik Jank gebracht.
(This is a valid fragment of German, but 'Jank' is a proper name unknown
to hunspell). I have used both latin-0 and utf8 encoding with no visible
difference.
$ emacs -Q
In the *scratch* buffer, evaluate (setq ispell-program-name "hunspell"),
then open the file and type M-x ispell.
I got the error message
ispell-process-line: Ispell misalignment: word `Jank' point 52; probably
incompatible versions

If you delete the first word in the file, ispell works fine and flags
the unknown 'Jank'.

hunspell on the command line has no problems with this file.
I was also able to use hunspell by adding the following to my startup files:
(add-to-list 'ispell-local-dictionary-alist
             '("deutsch" "[[:alpha:]]" "[^[:alpha:]]" "[']" t
               ("-d" "de_DE") nil iso-8859-1))

'hunspell --version' gives
 > $ hunspell --version
 > @(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
 >
 >
 > Copyright (C) 2002-2008 László Németh. License: MPL/GPL/LGPL.
 >
 > Based on OpenOffice.org's Myspell library.
 > Myspell's copyright (C) Kevin Hendricks, 2001-2002, License: BSD.
 >
 > This is free software; see the source for copying conditions.  There is NO
 > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
 > to the extent permitted by law.
and 'hunspell -D' gives
 > SEARCH PATH:
 > .::/usr/share/hunspell:/usr/share/myspell:/usr/share/myspell/dicts:/Library/Spelling:/home/hz/.openoffice.org/3/user/wordbook:.openoffice.org2/user/wordbook:.openoffice.org2.0/user/wordbook:Library/Spelling:/opt/openoffice.org/basis3.0/share/dict/ooo:/usr/lib/openoffice.org/basis3.0/share/dict/ooo:/opt/openoffice.org2.4/share/dict/ooo:/usr/lib/openoffice.org2.4/share/dict/ooo:/opt/openoffice.org2.3/share/dict/ooo:/usr/lib/openoffice.org2.3/share/dict/ooo:/opt/openoffice.org2.2/share/dict/ooo:/usr/lib/openoffice.org2.2/share/dict/ooo:/opt/openoffice.org2.1/share/dict/ooo:/usr/lib/openoffice.org2.1/share/dict/ooo:/opt/openoffice.org2.0/share/dict/ooo:/usr/lib/openoffice.org2.0/share/dict/ooo
 > AVAILABLE DICTIONARIES (path is not mandatory for -d option):
 > /usr/share/myspell/en_US
 > /usr/share/myspell/de_DE
 > LOADED DICTIONARY:
 > /usr/share/myspell/de_DE.aff
 > /usr/share/myspell/de_DE.dic
 > Hunspell 1.3.2




In GNU Emacs 24.4.1 (x86_64-suse-linux-gnu, GTK+ Version 3.6.4)
  of 2014-10-29 on cloud103
Windowing system distributor `The X.Org Foundation', version 11.0.11302000
System Description:	openSUSE 12.3 (x86_64)

Configured using:
  `configure --with-pop --without-hesiod --with-kerberos --with-kerberos5
  --with-xim --with-wide-int --with-file-notification=inotify
  --enable-autodepend
  --enable-locallisppath=/usr/share/emacs/24.4/site-lisp:/usr/share/emacs/site-lisp
  --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
  --datadir=/usr/share --localstatedir=/var --sharedstatedir=/var/lib
  --libexecdir=/usr/lib --with-x --with-sound --with-xpm --with-jpeg
  --with-tiff --with-gif --with-png --with-rsvg --with-dbus --without-gpm
  --with-x-toolkit=gtk3 --x-includes=/usr/include
  --x-libraries=/usr/lib64 --with-xft --with-libotf --with-m17n-flt
  --build=x86_64-suse-linux 'CFLAGS=-fmessage-length=0 -O2 -Wall
  -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables
  -fasynchronous-unwind-tables -g -D_GNU_SOURCE -pipe -Wno-pointer-sign
  -Wno-unused-variable -Wno-unused-label -Wno-unprototyped-calls
  -fno-optimize-sibling-calls -DSYSTEM_PURESIZE_EXTRA=55000
  -DSITELOAD_PURESIZE_EXTRA=10000 ' 'LDFLAGS=-Wl,-O2
  -Wl,--hash-size=65521''

Important settings:
   value of $LC_COLLATE: C
   value of $LC_NUMERIC: POSIX
   value of $LANG: de_DE.UTF-8
   value of $XMODIFIERS: @im=local
   locale-coding-system: utf-8-unix

Major mode: Text

Minor modes in effect:
   tooltip-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   line-number-mode: t
   transient-mark-mode: t

Recent input:
<help-echo> <down-mouse-2> <mouse-2> C-S-j C-x C-f
t m p . t x t <return> M-x i s p e l l <return> <help-echo>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<help-echo> <help-echo> <menu-bar> <help-menu> <send-emacs-bug-report>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Mark set
Starting new Ispell process hunspell with default dictionary...
Spell-checking tmp.txt using hunspell with default dictionary...done
ispell-process-line: Ispell misalignment: word `Jank' point 52; probably 
incompatible versions

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util help-fns mail-prsvr mail-utils ispell time-date delsel lpr
tooltip electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
x-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list
newcomment lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai
tai-viet lao korean japanese hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help
simple abbrev minibuffer nadvice loaddefs button faces cus-face macroexp
files text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind inotify dynamic-setting system-font-setting font-render-setting
move-toolbar gtk x-toolkit x multi-tty emacs)

Memory information:
((conses 16 74299 8240)
  (symbols 48 18008 0)
  (miscs 40 43 163)
  (strings 32 10328 4490)
  (string-bytes 1 283385)
  (vectors 16 9133)
  (vector-slots 8 387069 15398)
  (floats 8 63 290)
  (intervals 56 250 0)
  (buffers 960 12)
  (heap 1024 18487 970))

-- 
Dr. Heinrich Rommerskirchen
Prof.-Schmid-Str. 41
82140 Olching

Tel. 08142 28787

Email heinz@h-rommerskirchen.de






* bug#19558: ispell in 24.4: hunspell cannot check German text
From: Eli Zaretskii @ 2015-01-10 18:27 UTC
  To: Heinz Rommerskirchen; +Cc: 19558

> Date: Sat, 10 Jan 2015 15:02:53 +0100
> From: Heinz Rommerskirchen <heinz@h-rommerskirchen.de>
> 
> When using hunspell, ispell cannot check German texts.
> 
> To reproduce the problem, create a file containing only this one line:
>  > zwanzigjährigen Arbeitszeit bei der Motorenfabrik Jank gebracht.
> (This is a valid fragment of German, but 'Jank' is a proper name unknown
> to hunspell). I have used both latin-0 and utf8 encoding with no visible
> difference.
> $ emacs -Q
> In the *scratch* buffer, evaluate (setq ispell-program-name "hunspell"),
> then open the file and type M-x ispell.
> I got the error message
> ispell-process-line: Ispell misalignment: word `Jank' point 52; probably
> incompatible versions
> 
> If you delete the first word in the file, ispell works fine and flags
> the unknown 'Jank'.
> 
> hunspell on the command line has no problems with this file.
> I was also able to use hunspell by adding the following to my startup files:
> (add-to-list 'ispell-local-dictionary-alist
>              '("deutsch" "[[:alpha:]]" "[^[:alpha:]]" "[']" t
>                ("-d" "de_DE") nil iso-8859-1))
> 
> 'hunspell --version' gives
>  > $ hunspell --version
>  > @(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)

If this is an unpatched version 1.3.2 of Hunspell, then that's
probably a known problem with Hunspell: it reports byte offsets of
misspelled words rather than character offsets, something that Emacs
doesn't expect.
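
To illustrate the mismatch, here is a minimal Python sketch (not Emacs or
Hunspell code) that locates 'Jank' in the test line from the report, once
counting characters and once counting bytes. The helper names are made up
for this illustration, and the offsets are 0-based, so they are not meant
to reproduce the exact point value in the error message.

```python
# Test line from the report; "\u00e4" is the a-umlaut in "zwanzigjaehrigen".
LINE = "zwanzigj\u00e4hrigen Arbeitszeit bei der Motorenfabrik Jank gebracht."
WORD = "Jank"


def char_offset(text: str, word: str) -> int:
    """0-based offset of `word`, counted in characters."""
    return text.index(word)


def byte_offset(text: str, word: str, encoding: str = "utf-8") -> int:
    """0-based offset of `word`, counted in bytes of the encoded text."""
    return text.encode(encoding).index(word.encode(encoding))


chars = char_offset(LINE, WORD)
utf8_bytes = byte_offset(LINE, WORD, "utf-8")
latin1_bytes = byte_offset(LINE, WORD, "iso-8859-1")

# The a-umlaut takes two bytes in UTF-8 but one byte in ISO-8859-1.
print(chars, utf8_bytes, latin1_bytes)  # prints: 50 51 50

# With the first word removed, only ASCII precedes "Jank", so both agree.
rest = LINE.split(" ", 1)[1]
print(char_offset(rest, WORD), byte_offset(rest, WORD))  # prints: 34 34
```

In UTF-8 the two counts drift apart by one for each two-byte character
that precedes the word, which fits the observation that deleting the
first word makes the misalignment go away.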

There are patches in the Hunspell bug tracker to fix this problem.

FWIW, your test case works flawlessly for me in Emacs 24.4 with
Hunspell 1.3.2 patched to fix that problem (and a few others).
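
For illustration only, the translation from a byte offset to a character
offset can be sketched in Python as below. This is not the actual
Hunspell patch, whose contents are not shown in this thread; the function
name byte_to_char_offset is hypothetical.

```python
# Test line from the report; "\u00e4" is the a-umlaut.
LINE = "zwanzigj\u00e4hrigen Arbeitszeit bei der Motorenfabrik Jank gebracht."


def byte_to_char_offset(text: str, byte_off: int, encoding: str = "utf-8") -> int:
    """Translate a 0-based byte offset into the encoded form of `text`
    to the corresponding 0-based character offset."""
    prefix = text.encode(encoding)[:byte_off]
    # Strict decoding raises UnicodeDecodeError if byte_off falls in the
    # middle of a multi-byte character.
    return len(prefix.decode(encoding))


# Byte offset 51 in the UTF-8 form of the line is character offset 50.
print(byte_to_char_offset(LINE, 51))  # prints: 50
```

Decoding only the prefix keeps the sketch short; it assumes the caller
knows which encoding the byte offset refers to.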






* bug#19558: ispell in 24.4: hunspell cannot check German text
From: Heinz Rommerskirchen @ 2015-01-12  9:51 UTC
  To: Eli Zaretskii; +Cc: 19558



On 10.01.2015 at 19:27, Eli Zaretskii wrote:
>> Date: Sat, 10 Jan 2015 15:02:53 +0100
>> From: Heinz Rommerskirchen <heinz@h-rommerskirchen.de>
>>
>> When using hunspell, ispell cannot check German texts.
>>
>>   .....
>> 'hunspell --version' gives
>>   > $ hunspell --version
>>   > @(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
>
> If this is an unpatched version 1.3.2 of Hunspell, then that's
> probably a known problem with Hunspell: it reports byte offsets of
> misspelled words rather than character offsets, something that Emacs
> doesn't expect.
>
> There are patches in the Hunspell bug tracker to fix this problem.
>
> FWIW, your test case works flawlessly for me in Emacs 24.4 with
> Hunspell 1.3.2 patched to fix that problem (and a few others).
>

Thank you, Eli. I think you are right. Too lazy to search for the
patches, I installed the newest version (1.3.3) from the source code
at its homepage, and it works as it should for both German and English.

-- 
Dr. Heinrich Rommerskirchen
Prof.-Schmid-Str. 41
82140 Olching

Tel. 08142 28787

Email heinz@h-rommerskirchen.de





