From: taylanbayirli@gmail.com (Taylan Ulrich Bayırlı/Kammer)
Newsgroups: gmane.emacs.bugs
Subject: bug#23701: Decoding broken by sequence ESC comma
Date: Mon, 06 Jun 2016 01:35:26 +0300
Message-ID: <874m979scx.fsf@T420.taylan>
References: <878tyja1q3.fsf@T420.taylan> <87a8iz5rvv.fsf@linux-m68k.org>
To: Andreas Schwab
Cc: 23701@debbugs.gnu.org

Andreas Schwab writes:

> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
>
>> The occurrence of the sequence of the bytes 1B 2C (ASCII ESC and comma)
>> messes up Emacs's decoding of an ASCII file from that point on.
>
> This is one of the ISO 2022 escape sequences.
>
>> This doesn't happen in any other text-displaying application I tested,
>> including a terminal emulator (given it's an escape sequence and all).
>
> None of them know about ISO 2022, apparently.
>
> Andreas.

Hmm, OK.  I figure it's an obscure use-case, but perhaps so is its
accidental(?) occurrence in a text file.  In the meantime I found out
that C-x RET r us-ascii RET fixes my issue.

The file in which I encountered this (mailing list archives of R6RS)
actually contains the sequence ESC, comma, capital A, and in places
where it seems intentionally positioned, such as between sentences.  I
wonder what this is about.  Whatever it means, if this is more common
than uses of that ISO 2022 sequence, that would be a problem, I suppose.

Here's the relevant snippet from the file, with literal ESC characters
changed to ^[:

> | On Fri, Sep 11, 2009 at 10:46 PM, Aubrey Jaffer wrote:
> | > ^[,A | Date: Wed, 9 Sep 2009 00:30:18 -0400
> | > ^[,A | From: Lynn Winebarger
> | > ^[,A |
> | > ^[,A | ...
> | > ^[,A | The advent of hygeinic macros marked the end of the era in which
> | > ^[,A | symbols could be equated with identifiers. ^[,A Identifiers have a lot
> | > ^[,A | more information in them.
> | >
> | > The SLIB implementations of syntactic-closures, syntax-case,

I just grepped all the files, and the archives seem to contain a few
more files in which the ESC , sequence appears, such as:

    G^[,Avdel vs Godel vs Goedel

    ^[,Hylem vs ^[,Hylen vs the same with proper vowel symbols

    ... I know that there is a single bit sequence that specifies
    strings, and it's not ^[,A+;^[(Bs; I know that there's another
    single sequence that specifies ellipsis, and it's not ^[$,1s&^[(B ...

These aren't ISO-8859-1 either.  I don't know what encoding they're
supposed to be in.  Could also be a mail server breaking things.
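
A search like the one described above can also be sketched in Python
instead of grep.  This is only a minimal illustration, not the command
that was actually run: the function name find_esc_comma is made up for
this example, and the script assumes the files are small enough to read
into memory.  It reports each file containing the bytes 1B 2C, together
with the bytes that directly follow the comma.

```python
import sys
from pathlib import Path

ESC_COMMA = b"\x1b,"  # the two bytes 1B 2C (ASCII ESC and comma)


def find_esc_comma(data: bytes):
    """Yield (offset, next_byte) for every ESC , found in data.

    next_byte is the single byte after the comma, or b"" if the
    sequence sits at the very end of data.  (Hypothetical helper,
    named only for this sketch.)
    """
    start = 0
    while True:
        pos = data.find(ESC_COMMA, start)
        if pos == -1:
            return
        yield pos, data[pos + 2:pos + 3]
        start = pos + 2


if __name__ == "__main__":
    # Usage: python find_esc_comma.py FILE...
    for name in sys.argv[1:]:
        hits = list(find_esc_comma(Path(name).read_bytes()))
        if hits:
            followers = sorted({nxt for _, nxt in hits})
            print(f"{name}: {len(hits)} occurrence(s), followed by {followers}")
```

Listing the bytes that follow the comma makes it easier to see whether
a file only ever uses one sequence (such as ESC , A) or several.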
All in all, I'm just throwing this out there; I have no idea how
commonly used ISO 2022 is, but handling it by default certainly breaks
some files that contain ESC , either by accident or with some other
purpose.  Maybe it should not be handled by default.

Taylan