all messages for Emacs-related lists mirrored at yhetil.org
From: taylanbayirli@gmail.com (Taylan Ulrich Bayırlı/Kammer)
To: Andreas Schwab <schwab@linux-m68k.org>
Cc: 23701@debbugs.gnu.org
Subject: bug#23701: Decoding broken by sequence ESC comma
Date: Mon, 06 Jun 2016 01:35:26 +0300
Message-ID: <874m979scx.fsf@T420.taylan>
In-Reply-To: <87a8iz5rvv.fsf@linux-m68k.org> (Andreas Schwab's message of "Sun, 05 Jun 2016 21:59:16 +0200")

Andreas Schwab <schwab@linux-m68k.org> writes:

> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
>
>> The occurrence of the sequence of the bytes 1B 2C (ASCII ESC and comma)
>> messes up Emacs's decoding of an ASCII file from that point on.
>
> This is one of the ISO 2022 escape sequences.
>
>> This doesn't happen in any other text-displaying application I tested,
>> including a terminal emulator (given it's an escape sequence and all).
>
> None of them know about ISO 2022, apparently.
>
> Andreas.

Hmm, OK.  I figure it's an obscure use-case, but perhaps so is its
accidental(?) occurrence in a text file.

In the meantime I found out that C-x RET r us-ascii RET fixes my issue.

The file in which I encountered this (mailing list archives of R6RS)
actually contains the sequence escape, comma, capital-A, in places where
it seems intentionally positioned, such as between sentences.  I wonder
what this is about.  Whatever it means, if this is more common than
genuine uses of that ISO 2022 sequence, that would be a problem, I
suppose.  Here's the relevant snippet from the file, with literal ESC
characters changed to ^[:

>  | On Fri, Sep 11, 2009 at 10:46 PM, Aubrey Jaffer<agj at alum.mit.edu> wrote:
>  | > ^[,A | Date: Wed, 9 Sep 2009 00:30:18 -0400
>  | > ^[,A | From: Lynn Winebarger <owinebar at gmail.com>
>  | > ^[,A |
>  | > ^[,A | ...
>  | > ^[,A | The advent of hygeinic macros marked the end of the era in which
>  | > ^[,A | symbols could be equated with identifiers. ^[,A Identifiers have a lot
>  | > ^[,A | more information in them.
>  | >
>  | > The SLIB implementations of syntactic-closures, syntax-case,

I just grepped all the files and the archives seem to contain a few more
files in which the ESC , sequence appears, such as:

    G^[,Avdel vs Godel vs Goedel

    ^[,Hylem vs ^[,Hylen vs the same with proper vowel symbols

    ... I know that there is a single bit sequence that specifies
    strings, and it's not ^[,A+;^[(Bs; I know that there's another
    single sequence that specifies ellipsis, and it's not ^[$,1s&^[(B
    ...

These aren't ISO-8859-1 either.  I don't know what encoding they're
supposed to be in.  Could also be a mail server breaking things.
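
One possible reading, offered only as an assumption and not confirmed
anywhere in this thread: if ESC , <final> is taken to designate a
96-character set into G0, with final byte A meaning the upper half of
ISO-8859-1 and H the upper half of ISO-8859-8, then each 7-bit byte that
follows stands for the same byte with the high bit set.  The Python
sketch below only does that arithmetic on bytes taken from the examples
above; the helper name to_high_half and the CHARSETS_96 table are
hypothetical, and how long a designation stays in effect is deliberately
left out.

```python
# Assumed mapping from the final byte of "ESC , <final>" to a Python codec.
# This table is an illustration, not something stated in the thread.
CHARSETS_96 = {"A": "iso-8859-1", "H": "iso-8859-8"}


def to_high_half(seven_bit: bytes, final: str) -> str:
    """Set the high bit of each byte and decode it in the assumed charset."""
    shifted = bytes(b | 0x80 for b in seven_bit)
    return shifted.decode(CHARSETS_96[final])


# The byte after ESC , A in "G^[,Avdel" is "v" (0x76); 0x76 | 0x80 = 0xF6.
print(repr(to_high_half(b"v", "A")))   # 'ö'
# A space (0x20) after ESC , A becomes 0xA0, a no-break space.
print(repr(to_high_half(b" ", "A")))
# The "+;" between ESC , A and ESC ( B in the third example.
print(repr(to_high_half(b"+;", "A")))
```

Under that assumption the first example would read "Gödel", and the
ESC , A followed by a space would be a no-break space, which might
explain why it shows up between sentences.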

All in all, I'm just throwing this out there; I have no idea how
commonly used ISO 2022 is, but handling it by default certainly breaks
some files that contain ESC , either by accident or for some other
purpose.  Maybe it should not be handled by default.
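
To check how often the sequence turns up in a set of files, something
like the following Python sketch could stand in for the grep mentioned
above.  It is a minimal illustration, not the command actually used;
the function name find_esc_comma is hypothetical.  It looks for the
bytes 1B 2C and renders ESC as ^[, as in the snippets quoted earlier.

```python
ESC_COMMA = b"\x1b,"  # the bytes 1B 2C: ASCII ESC followed by a comma


def find_esc_comma(data: bytes) -> list:
    """Return (line number, printable line) pairs for lines containing ESC ,"""
    hits = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        if ESC_COMMA in line:
            # Decode leniently and show ESC as ^[ so the result is printable.
            text = line.decode("ascii", errors="replace").replace("\x1b", "^[")
            hits.append((lineno, text))
    return hits


sample = b"G\x1b,Avdel vs Godel vs Goedel\nplain ASCII line\n"
for lineno, text in find_esc_comma(sample):
    print(lineno, text)
```

For real files, the bytes would come from reading each file in binary
mode, for example with pathlib.Path(name).read_bytes().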

Taylan




