From: Eli Zaretskii <eliz@gnu.org>
To: handa@gnu.org (K. Handa)
Cc: dmantipov@yandex.ru, maden.ldm@gmail.com, 18610@debbugs.gnu.org
Subject: bug#18610: 24.4.50; Specific file causing emacs to segfault upon opening
Date: Sun, 05 Oct 2014 19:09:26 +0300
Message-ID: <83zjdamrbd.fsf@gnu.org>
In-Reply-To: <871tqmevsu.fsf@gnu.org>
> From: handa@gnu.org (K. Handa)
> Date: Sun, 05 Oct 2014 17:59:45 +0900
> Cc: dmantipov@yandex.ru, maden.ldm@gmail.com, 18610@debbugs.gnu.org
>
> > However, detect_coding_iso_2022 returns with the 'found' member of its
> > second argument having zero value, which I interpret as meaning that
> > it didn't really find any ISO-2022 sequences. So the simple patch
> > below fixes this for me. Kenichi, is this patch OK?
>
> No. Even if there's no special ISO-2022 escape sequence, we
> should not reject iso-2022 as a detected coding system.
Can you explain why? AFAICT, all the other detectors are required to
set some flag in the 'found' field, so why is ISO-2022 special in this
regard?
> And, even if that detection was incorrect, the decoder
> should not produce an invalid byte sequence in a
> buffer/string which leads to Emacs crash.
No argument here.
> The bug is in detect_coding_iso_2022 which doesn't set
> CATEGORY_MASK_ISO_7_ELSE in coding->rejected in this case.
Btw, it would be nice if these masks could be documented so that their
meaning was clear. I considered the possibility that the flags are
not set correctly, but couldn't test that hypothesis given my
insufficient knowledge of ISO-2022 details and variants.
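
For illustration only, here is a minimal C sketch of how a detector could record its verdict in two bitmasks of this kind. Everything in it is an assumption: the mask names and values, the struct, and the detection rule are simplified stand-ins, not the definitions or the logic in coding.c.

```c
#include <stdbool.h>

/* Hypothetical category bits; the real CATEGORY_MASK_* values are
   defined in coding.c and are not reproduced here.  */
enum {
    MASK_ISO_7      = 1 << 0,
    MASK_ISO_7_ELSE = 1 << 1,
    MASK_UTF_8      = 1 << 2
};

/* Simplified stand-in for the detector's second argument.  */
struct detection_info {
    int found;     /* categories positively identified */
    int rejected;  /* categories ruled out */
};

/* Toy detector: a byte >= 0x80 rules out the 7-bit ISO categories,
   while an ESC byte in otherwise 7-bit data counts as evidence for them.
   Pure ASCII without ESC sets neither mask.  */
static void detect_iso_7bit(const unsigned char *src, int len,
                            struct detection_info *info)
{
    bool saw_escape = false;

    for (int i = 0; i < len; i++) {
        if (src[i] >= 0x80) {
            info->rejected |= MASK_ISO_7 | MASK_ISO_7_ELSE;
            return;
        }
        if (src[i] == 0x1B)
            saw_escape = true;
    }
    if (saw_escape)
        info->found |= MASK_ISO_7 | MASK_ISO_7_ELSE;
}

/* A category stays a candidate unless it was explicitly rejected,
   even when nothing was "found" for it.  */
static bool still_candidate(const struct detection_info *info, int mask)
{
    return (info->rejected & mask) == 0;
}
```

In this sketch a zero 'found' value alone does not disqualify a category; only a bit in 'rejected' does, which is the distinction the quoted diagnosis turns on.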
> I've just installed a fix to trunk. Could you please try
> the latest version?
It fixes the crash, but I'm not sure the results are what we want.
Emacs 24.3, which also did not crash, would set the
buffer-file-coding-system of the buffer visiting the file to
'undecided', and regarded the \226 characters as 8-bit raw bytes:
  character: \226 (displayed as \226) (codepoint 4194198, #o17777626, #x3fff96)
  ...
  general-category: Cn (Other, Not Assigned)
By contrast, the current trunk sets buffer-file-coding-system to
'latin-1' and thinks this character is a Latin-1 character:
  character: \226 (displayed as \226) (codepoint 150, #o226, #x96)
  preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
  ...
  old-name: START OF GUARDED AREA
  general-category: Cc (Other, Control)
That doesn't sound right to me.
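
To make the difference concrete, here is a minimal C sketch of the two interpretations of byte \226. The helper names are hypothetical, and the 0x3FFF00 offset is inferred from the describe-char output above (0x3FFF96 minus 0x96) rather than quoted from the Emacs sources.

```c
#include <stdint.h>

/* Offset between a raw 8-bit byte and the codepoint Emacs reports
   for it, inferred from the output above: 0x3FFF96 - 0x96.  */
#define RAW_BYTE_OFFSET 0x3FFF00

/* Hypothetical helper: codepoint of BYTE when it is kept as a raw
   8-bit byte.  Bytes below 0x80 are plain ASCII and stay unchanged.  */
static uint32_t raw_byte_to_char(uint8_t byte)
{
    return byte < 0x80 ? byte : (uint32_t) byte + RAW_BYTE_OFFSET;
}

/* Hypothetical helper: Latin-1 maps every byte to the code point
   with the same numeric value.  */
static uint32_t latin1_byte_to_char(uint8_t byte)
{
    return byte;
}
```

With these definitions, raw_byte_to_char (0226) gives 4194198 (#x3fff96), the value reported by Emacs 24.3, while latin1_byte_to_char (0226) gives 150 (#x96), the C1 control character the current trunk reports.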
If I force some specific coding system, e.g.
  C-x RET c utf-8 RET C-x C-f FILE RET
then the \226 characters are correctly recognized as 8-bit bytes by
the current trunk (as was the case before your changes).
Could it be that the current trunk fails to recognize the 8-bit bytes
in the file?