* Intermittent problem with unencodable-char-position
From: Harald Hanche-Olsen @ 2010-04-14 4:19 UTC (permalink / raw)
To: emacs-devel
Evaluating the form
  (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω")
normally returns the list (4), since capital Omega is not encodable in
latin-1. However, after I have run emacs for a while, it happens that
this form begins to return nil [*]. I have no idea what triggers this
behaviour, and the only cure seems to be to quit and restart emacs.
I suspect some internal memory corruption, but if anyone here can
suggest another possible reason, I'd like to hear about it. Or if you
can think of a debugging technique that might shed some light on this,
I'll be happy to try it when it happens again. (I warn you that I run
on OS X, though, so debugging is, um, different.)
[*] I notice because attempts to save a buffer containing non-latin-1
characters with the latin-1 charset fail without the usual offer to
select a different character set. I have narrowed the problem down to
the above behaviour inside select-safe-coding-system-interactively.
(That code doesn't use the string argument, but it happens whether you
look at a string or the current buffer.)
- Harald
* Re: Intermittent problem with unencodable-char-position
From: Harald Hanche-Olsen @ 2010-04-14 4:38 UTC (permalink / raw)
To: emacs-devel
+ Harald Hanche-Olsen <hanche@math.ntnu.no>:
> Evaluating the form
>
> (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω")
>
> normally returns the list (4), since capital Omega is not encodable in
> latin-1. However, after I have run emacs for a while, it happens that
> this form begins to return nil [*]. I have no idea what triggers this
> behaviour, [...]
Well, lo and behold, after sending the above mail I immediately
discovered how to trigger the problem: Sending mail does it.
I use mew and send through a TLS-encrypted server; mew uses stunnel to
handle the encryption. I suppose it tweaks some global setting in the
process of doing the communication, but surely, that should not affect
the behaviour of unencodable-char-position? I have asked about this on
the mew mailing list too, but maybe this narrows it down enough to
give someone here an idea.
- Harald
* Re: Intermittent problem with unencodable-char-position
From: Harald Hanche-Olsen @ 2010-04-14 15:42 UTC (permalink / raw)
To: emacs-devel
+ Harald Hanche-Olsen <hanche@math.ntnu.no>:
> + Harald Hanche-Olsen <hanche@math.ntnu.no>:
>
> > Evaluating the form
> >
> > (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω")
> >
> > normally returns the list (4), since capital Omega is not encodable in
> > latin-1. However, after I have run emacs for a while, it happens that
> > this form begins to return nil [*]. I have no idea what triggers this
> > behaviour, [...]
>
> Well, lo and behold, after sending the above mail I immediately
> discovered how to trigger the problem: Sending mail does it.
After a couple hours of debugging effort I managed to drill down to
the code in mew that triggers the problem: It is this little snippet
  (apply 'set-charset-priority charset-list)
in which charset-list is a humongous list of charset names. (Included
below my signature in order to not interrupt your train of thought.)
I can undo the damage by running set-charset-priority on a much
shorter list, snipped from the head of the big one.
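The undo step described above can be sketched as follows. This is only an illustration: the function name `my-set-charset-priority-head` and its argument N are hypothetical, and the thread does not say how short the list has to be, so the cutoff is left to the caller.

```elisp
;; Hypothetical helper for illustration; not part of Emacs or mew.
(defun my-set-charset-priority-head (charsets n)
  "Call `set-charset-priority' with only the first N elements of CHARSETS.
N should be a positive integer."
  ;; `butlast' returns a copy of the list without its last K elements;
  ;; with K <= 0 it returns the whole list, so a large N is harmless.
  (apply #'set-charset-priority
         (butlast charsets (- (length charsets) n))))

;; Example: pass only the first three names from the head of the long list.
(my-set-charset-priority-head
 '(unicode-bmp unicode iso-8859-1 ascii latin-iso8859-1 control-1)
 3)
```

Whether a given cutoff actually restores the expected result of unencodable-char-position would have to be checked in the affected session.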
I have no idea why the author of mew thinks he needs to do this, but
in any case, having it influence the behaviour of
unencodable-char-position must surely be a bug? I'll submit a bug
report to that effect unless someone here jumps up and explains why it
is not a bug.
- Harald
PS. Damaging value of charset-list:
(unicode-bmp unicode iso-8859-1 ascii latin-iso8859-1 control-1
iso-8859-2 latin-iso8859-2 iso-8859-3 latin-iso8859-3 iso-8859-4
latin-iso8859-4 iso-8859-5 cyrillic-iso8859-5 iso-8859-6
arabic-iso8859-6 iso-8859-7 greek-iso8859-7 iso-8859-8
hebrew-iso8859-8 iso-8859-9 latin-iso8859-9 iso-8859-10
latin-iso8859-10 iso-8859-11 thai-iso8859-11 iso-8859-13
latin-iso8859-13 iso-8859-14 latin-iso8859-14 iso-8859-15
latin-iso8859-15 iso-8859-16 latin-iso8859-16 thai-tis620 tis620-2533
jisx0201 chinese-gb2312 chinese-gbk chinese-cns11643-1
chinese-cns11643-2 chinese-cns11643-3 chinese-cns11643-4
chinese-cns11643-5 chinese-cns11643-6 chinese-cns11643-7 big5
japanese-jisx0208 japanese-jisx0208-1978 japanese-jisx0212
japanese-jisx0213-1 japanese-jisx0213-2 japanese-jisx0213.2004-1
cp932 korean-ksc5601 big5-hkscs cp949 viscii vscii vscii-2 koi8-r
alternativnyj cp866 koi8-u koi8-t georgian-ps georgian-academy
windows-1250 windows-1251 windows-1252 windows-1253 windows-1254
windows-1255 windows-1256 windows-1257 windows-1258 next cp1125 cp437
cp720 cp737 cp775 cp851 cp852 cp855 cp857 cp858 cp860 cp861 cp862
cp863 cp864 cp865 cp869 cp874 unicode-smp unicode-sip unicode-ssp
mac-roman ebcdic-us ebcdic-uk ibm1047 hp-roman8
adobe-standard-encoding symbol ibm850 mik ptcp154 gb18030
chinese-cns11643-15 emacs eight-bit eight-bit-control
eight-bit-graphic latin-jisx0201 katakana-jisx0201 chinese-big5-1
chinese-big5-2 japanese-jisx0213-a katakana-sjis cp932-2-byte
cp949-2-byte chinese-sisheng ipa vietnamese-viscii-lower
vietnamese-viscii-upper arabic-digit arabic-1-column arabic-2-column
lao mule-lao indian-is13194 devanagari-cdac sanskrit-cdac
bengali-cdac tamil-cdac telugu-cdac assamese-cdac oriya-cdac
kannada-cdac malayalam-cdac gujarati-cdac punjabi-cdac
devanagari-akruti bengali-akruti punjabi-akruti gujarati-akruti
oriya-akruti tamil-akruti telugu-akruti kannada-akruti
malayalam-akruti indian-glyph indian-1-column indian-2-column tibetan
tibetan-1-column mule-unicode-2500-33ff mule-unicode-e000-ffff
mule-unicode-0100-24ff ethiopic gb18030-2-byte gb18030-4-byte-bmp
gb18030-4-byte-smp gb18030-4-byte-ext-1 gb18030-4-byte-ext-2)
* Re: Intermittent problem with unencodable-char-position
From: Harald Hanche-Olsen @ 2010-04-14 16:11 UTC (permalink / raw)
To: emacs-devel
The simplest way I have found so far to show the bug:
  (list
   (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω")
   (progn (apply 'set-charset-priority (charset-priority-list))
          (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω"))
   (progn (apply 'set-charset-priority (list (charset-priority-list t)))
          (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω")))

  => ((4) nil (4)) ; the middle nil is wrong
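To check whether a running session is currently affected, the first form above can be wrapped in a small predicate. This is a minimal sketch; the name `my-latin-1-omega-check-broken-p` is hypothetical and not part of Emacs or mew.

```elisp
;; Hypothetical diagnostic for illustration only.
(defun my-latin-1-omega-check-broken-p ()
  "Return t if `unencodable-char-position' fails to flag Ω for latin-1.
The inner form is expected to return (4); nil indicates the problem."
  (null (unencodable-char-position 0 5 'iso-latin-1-unix 1 "100 Ω")))
```

Evaluating (my-latin-1-omega-check-broken-p) before and after sending mail should show whether the session has switched from the expected result to the faulty one.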
I am submitting a bug report.
- Harald