* bug#7781: 23.2.91; ispell problem with hunspell and UTF-8 file
@ 2011-01-03 23:14 Reuben Thomas
From: Reuben Thomas @ 2011-01-03 23:14 UTC
  To: 7781

Using emacs -Q and hunspell to spell-check a UTF-8 buffer containing
the following text, which has some non-ASCII characters in it, I get
the error shown in the messages log below.
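
For illustration only: the short Python sketch below shows how a word's
character offset and its UTF-8 byte offset drift apart once multi-byte
characters such as ’ or “ precede it. Whether such a mismatch is what
triggers the misalignment is an assumption; this report does not
establish the cause. The sample string and the helper name `offsets`
are hypothetical.

```python
def offsets(text, word):
    """Return (character offset, UTF-8 byte offset) of the first match of word."""
    char_pos = text.index(word)
    # Bytes consumed by everything that precedes the word when encoded as UTF-8.
    byte_pos = len(text[:char_pos].encode("utf-8"))
    return char_pos, byte_pos


# Hypothetical sample: three curly quotes, each 3 bytes in UTF-8.
sample = "I’ve read “all” of Feedbooks"
print(offsets(sample, "Feedbooks"))
```

In pure ASCII text the two offsets coincide, so a discrepancy of this
kind would only show up in buffers with extended characters.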

I did test this with emacs -Q, but the current session, in which I
reproduced the problem and am now composing this bug report, was not
started with -Q (this is so submitting the bug report works properly!).

I am running a freshly bzr-pulled build of the emacs-23 branch.

Text follows

----cut here----
---
title: Kindle 3 is a good first attempt
tags: computing, books
format: markdown
date: Mon, 03 Jan 2011 20:53:13 +0000
post-id: 2585181001
---

Giving my girlfriend a Kindle for Christmas was the carrot in a multi-pronged strategy to avoid needing more bookshelves (the stick being “I will start giving away your books” and my contribution being to archive books I’ve read (or return the many that aren’t even mine). This therefore required that I stocked it with books before she got her hands on it, which in turn was all the excuse I needed to play with the thing.

My lazy solution was simply to download all of [Feedbooks](http://www.feedbooks.com); I [wrote some scripts](http://rrt.sc3d.org/Software/Kindle/) to make this actually lazy, rather than brain-numbingly dull. In the process I found that while the Kindle is nice to hold and great to read, it struggles to cope with a large collection of books (even though the nearly 3,000 volumes of Feedbooks only half-filled its 4Gb memory), and is woeful as a research tool. And, of course, Amazon’s first-mover-evil surfaced early.

Here are the problems I had:

1. Amazon’s own store doesn’t seem to contain free books. I think it’s poor form not to give people a straightforward choice of free editions of out-of-copyright works. The Kindle may be a loss leader, but at £109 it’s still not cheap. Feedbooks, rather than integrating easily into the Kindle, like, say, a 3rd-party software provider into Ubuntu’s Software Center, provide a catalogue which itself is in the form of a book, doesn’t automatically update, and offers a list ordered only by title. In other words, it’s useless; one is better off using the built-in web browser to search the online catalogue…

2. …or better, another browser, since the Kindle’s is woefully slow (and I don’t just mean the screen update). It’s just about usable, and hence useful in an emergency, but is no good as, for example, an online research tool to use in parallel with the books you have downloaded, although…

3. …offline search is awful too. With just the few ebooks that come loaded on the device, it was slow; with the thousands of books I loaded, it simply locked up the device, even when trying to search in the manual, presumably already indexed. The Kindle seems to index its contents in the background, but even now, over a week later, search doesn’t work. The only effective navigation is by a book’s table of contents, and, to choose which books to read, the user-definable collections, though…

4. …collections are a pain to set up for many books, as you have to select each book manually; there is no way I have found to select a range. (Fortunately, I was able to define collections programmatically, but this will be beyond most users.)

In summary, it’s a lovely device, but the software is rather toytown. Amazon could improve it (and indeed, the 3.0.3 firmware update, at the experimental stage when I checked, claims, vaguely, “performance improvements”), but given that their main interest is in selling books and Kindles, I’m not hopeful that it will happen before the next hardware iteration; whether it happens at all depends on competition, and there should be plenty of that, to go by the number of other ebook readers.

----cut here----


In GNU Emacs 23.2.91.3 (i686-pc-linux-gnu, GTK+ Version 2.22.0)
 of 2011-01-03 on mord
Windowing system distributor `The X.Org Foundation', version 11.0.10900000
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_GB.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Text

Minor modes in effect:
  longlines-mode: t
  buffer-face-mode: t
  flyspell-mode: t
  show-paren-mode: t
  savehist-mode: t
  minibuffer-electric-default-mode: t
  iswitchb-mode: t
  icomplete-mode: t
  global-auto-revert-mode: t
  desktop-save-mode: t
  smart-quotes-mode: t
  mouse-wheel-mode: t
  use-hard-newlines: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
M-x r e p o r t - e m <tab> <return> h u n s p e l 
l SPC <M-backspace> i s p e l l SPC w i t h SPC h u 
n s l e <backspace> <backspace> s p e <backspace> <backspace> 
p e <backspace> <backspace> <backspace> p e l l SPC 
f a i l s C-g <down> <down> <down> <down> <down> <down> 
<down> <up> <up> <up> <up> <up> <up> <up> <up> <up> 
<up> <up> <up> <up> <up> <up> <up> M-x i s p e l l 
<return> SPC SPC SPC M-x i s p e <backspace> <backspace> 
<backspace> <backspace> <up> <up> <return>

Recent messages:
Scanning for "hard" Perl constructions... done
Applying style hooks... done
Scanning for "hard" Perl constructions... done
Scanning for "hard" Perl constructions... done
Scanning for "hard" Perl constructions... done
Scanning for "hard" Perl constructions... done
Lazy desktop load complete
Quit
Spell-checking Kindle 3 is a good first attempt using hunspell with british+accs dictionary...
Spell-checking region using hunspell with british+accs dictionary...done
ispell-process-line: Ispell misalignment: word `Feedbooks' point 1363; probably incompatible versions

Load-path shadows:
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-style hides /usr/share/emacs/site-lisp/auctex/tex-style
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-buf hides /usr/share/emacs/site-lisp/auctex/tex-buf
/usr/local/share/emacs/23.2.91/site-lisp/auctex/context hides /usr/share/emacs/site-lisp/auctex/context
/usr/local/share/emacs/23.2.91/site-lisp/auctex/bib-cite hides /usr/share/emacs/site-lisp/auctex/bib-cite
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-fold hides /usr/share/emacs/site-lisp/auctex/tex-fold
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-jp hides /usr/share/emacs/site-lisp/auctex/tex-jp
/usr/local/share/emacs/23.2.91/site-lisp/auctex/context-nl hides /usr/share/emacs/site-lisp/auctex/context-nl
/usr/local/share/emacs/23.2.91/site-lisp/auctex/toolbar-x hides /usr/share/emacs/site-lisp/auctex/toolbar-x
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-mik hides /usr/share/emacs/site-lisp/auctex/tex-mik
/usr/local/share/emacs/23.2.91/site-lisp/auctex/context-en hides /usr/share/emacs/site-lisp/auctex/context-en
/usr/local/share/emacs/23.2.91/site-lisp/auctex/texmathp hides /usr/share/emacs/site-lisp/auctex/texmathp
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-info hides /usr/share/emacs/site-lisp/auctex/tex-info
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-fptex hides /usr/share/emacs/site-lisp/auctex/tex-fptex
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-font hides /usr/share/emacs/site-lisp/auctex/tex-font
/usr/local/share/emacs/23.2.91/site-lisp/auctex/latex hides /usr/share/emacs/site-lisp/auctex/latex
/usr/local/share/emacs/23.2.91/site-lisp/auctex/font-latex hides /usr/share/emacs/site-lisp/auctex/font-latex
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-bar hides /usr/share/emacs/site-lisp/auctex/tex-bar
/usr/local/share/emacs/23.2.91/site-lisp/auctex/multi-prompt hides /usr/share/emacs/site-lisp/auctex/multi-prompt
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex hides /usr/share/emacs/site-lisp/auctex/tex

Features:
(shadow sort mail-extr message sendmail ecomplete rfc822 mml mml-sec
password-cache mm-decode mm-bodies mm-encode mailcap mail-parse rfc2231
rfc2047 rfc2045 qp ietf-drums mailabbrev nnheader gnus-util netrc
time-date mm-util mail-prsvr gmm-utils wid-edit mailheader canlock sha1
hex-util hashcash mail-utils emacsbug preview prv-emacs byte-opt
warnings tex-buf noutline outline font-latex bytecomp byte-compile latex
tex-style tex nxml-uchnm rng-xsd xsd-regexp rng-cmpct rng-nxml rng-valid
rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt rng-util rng-pttrn
nxml-ns nxml-mode nxml-outln nxml-rap nxml-util nxml-glyph nxml-enc
xmltok sgml-mode conf-mode newcomment make-mode vc-git cperl-mode
longlines face-remap filladapt flyspell auto-dictionary-autoloads
dictionary-autoloads js2-mode-autoloads package reporter completing-help
ff-paths uniquify paren savehist minibuf-eldef iswitchb icomplete
autorevert time cus-start cus-load desktop server change-mode advice
help-fns advice-preload php-mode derived etags cc-langs cl cl-19 cc-mode
cc-fonts cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
speedbar sb-image ezimage dframe easymenu assoc lua-mode regexp-opt
comint ring whitespace etags-update smart-quotes edmacro kmacro ispell
ffap muse-autoloads emacs-goodies-el emacs-goodies-custom
emacs-goodies-loaddefs easy-mmode devhelp preview-latex tex-site
auto-loads tooltip ediff-hook vc-hooks lisp-float-type mwheel x-win
x-dnd font-setting tool-bar dnd fontset image fringe lisp-mode register
page menu-bar rfn-eshadow timer select scroll-bar mldrag mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev loaddefs button
minibuffer faces cus-face files text-properties overlay md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process dbusbind system-font-setting
font-render-setting gtk x-toolkit x multi-tty emacs)

-- 
http://rrt.sc3d.org/






Thread overview: 35+ messages
2011-01-03 23:14 bug#7781: 23.2.91; ispell problem with hunspell and UTF-8 file Reuben Thomas
2011-01-07 13:14 ` Agustin Martin
2011-01-07 14:30   ` Reuben Thomas
2011-02-11 17:00   ` Agustin Martin
2014-10-16 13:37     ` Agustin Martin
2014-10-16 13:54       ` Eli Zaretskii
2014-10-16 14:08         ` Agustin Martin
2012-01-01 21:42 ` bug#7781: ispell problem with hunspell and UTF-8 file (and other, related hunspell problems) Richard Wordingham
2013-04-13 19:12 ` bug#7781: [PATCH] Fix ispell problem with hunspell and UTF-8 file Николай Сущенко
2013-04-14  5:42   ` Eli Zaretskii
2013-04-14  6:33     ` Николай Сущенко
2013-04-14  7:08       ` Eli Zaretskii
2013-04-20 18:43         ` Николай Сущенко
2014-04-27 21:30 ` bug#7781: hunspell and latex-mode Peter Münster
2014-04-28 15:37   ` Eli Zaretskii
2014-04-28 16:18     ` Peter Münster
2014-04-28 16:48       ` Eli Zaretskii
2014-04-28 17:17         ` Peter Münster
2014-04-28 17:32           ` Eli Zaretskii
2014-04-28 18:27             ` Peter Münster
2014-04-29 10:03       ` Agustin Martin
2014-04-29 10:13         ` Peter Münster
2014-04-29 10:21           ` Agustin Martin
2014-04-29 10:20         ` Peter Münster
2014-04-29 10:39           ` Agustin Martin
2014-04-29 11:54             ` Peter Münster
2014-04-29 12:48               ` Peter Münster
2014-04-29 13:57                 ` Eli Zaretskii
2014-04-29 14:30                   ` Peter Münster
2014-04-29 15:25                     ` Eli Zaretskii
2014-04-29 16:34                       ` Peter Münster
2014-09-25  9:54 ` bug#7781: Bug still present in hunspell 1.3.3; Eli's patch still works Reuben Thomas
2020-08-28 12:00 ` bug#7781: 23.2.91; ispell problem with hunspell and UTF-8 file Stefan Kangas
2020-08-28 12:36   ` Eli Zaretskii
2020-08-28 12:56     ` Stefan Kangas
