From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#5235: 23.1; Unibyte keyboard input problem
Date: Tue, 29 Dec 2009 13:43:01 -0200
To: Jason Rumney
Cc: Tomasz Zbrożek, 5235@debbugs.gnu.org
References: <200912162217.14991.scianagoryczy@wp.pl> <4B2A60A1.2050804@gnu.org> <200912172025.58502.scianagoryczy@wp.pl> <4B338705.5090508@gnu.org>
In-Reply-To: <4B338705.5090508@gnu.org> (Jason Rumney's message of "Thu, 24 Dec 2009 23:21:41 +0800")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (darwin)

>>> I'll try to explain why I need unibyte mode. I'm the maintainer of C/C++
>>> source code which has comments coded in cp1250 (Polish language) but
>>> strings in code are coded in cp852. So I have two different code
>>> pages in one source code file. This is old source code and it was
>>> developed in Windows (that's why comments are in cp1250) but is
>>> compiled to work on MS-DOS (that's why strings are coded in cp852).

>> So what happens if you read those files as binary (i.e. C-x RET
>> r binary RET)?

> At best, he'd end up silently screwing up his files even further, with
> cp1250, cp852 and now utf-8 encoded characters in them. More likely he
> would still get prompted when saving, just as if he'd used cp1250 or cp852
> to read them.

That would be a bug: a file visited as `binary' (or as `raw-text') should be
placed in a unibyte buffer, so it should not screw anything up more than
was already the case to start with.

> The problem here is the files, not Emacs. Basically the reason for using
> unibyte is that it allows the user to bury their head in the sand and
> pretend the problem does not exist.

Of course, but if you start with such files and can't (or don't want to)
recode the parts consistently, we can't do much better.

> I work on similar files in my day job, with Japanese comments in ShiftJIS
> and Chinese comments in GB2312.
> An easy method of fixing such files would be
> nice, but the best I can think of would be to provide a recode-region
> function, which would still be too much manual work to be worth it to me
> given that I can barely make sense of the Japanese comments and can't make
> any sense of the Chinese ones. The original poster might be more motivated
> to make use of such a function if it existed though.

I'm not sure what would be the best approach in general or in particular
cases, but we could certainly provide a command that recodes comments.
Or another one that looks for invalid byte sequences (i.e. decoded as
eight-bit-bytes) and tries to re-decode them with a secondary
coding system.


        Stefan
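
To make the last idea concrete, here is a minimal sketch of "re-decode the invalid byte sequences with a secondary coding system". It is written in Python rather than Emacs Lisp and is not an existing Emacs command; the function name `decode_with_fallback` and the default codecs (UTF-8 as primary, cp1250 as secondary) are assumptions chosen for illustration. Python's `surrogateescape` error handler plays the role of Emacs's eight-bit-bytes here: each undecodable byte is kept as a lone surrogate, so it can be found and decoded again later.

```python
import re

# Lone surrogates U+DC80..U+DCFF are how the "surrogateescape" error
# handler represents bytes 0x80..0xFF that the primary codec rejected.
_ESCAPED_RUN = re.compile("[\udc80-\udcff]+")


def decode_with_fallback(data: bytes, primary: str = "utf-8",
                         secondary: str = "cp1250") -> str:
    """Decode with `primary`; re-decode undecodable bytes with `secondary`.

    Hypothetical helper for illustration only.
    """
    # First pass: bytes invalid in `primary` survive as lone surrogates.
    text = data.decode(primary, errors="surrogateescape")

    def redecode(match: re.Match) -> str:
        # Recover the original raw bytes of this invalid run ...
        raw = match.group(0).encode(primary, errors="surrogateescape")
        # ... and decode them with the secondary codec. Bytes undefined
        # in that codec become U+FFFD instead of raising an error.
        return raw.decode(secondary, errors="replace")

    # Second pass: replace every run of escaped bytes.
    return _ESCAPED_RUN.sub(redecode, text)


if __name__ == "__main__":
    mixed = "zażółć ".encode("utf-8") + "gęślą".encode("cp1250")
    print(decode_with_fallback(mixed))
```

This only catches bytes that are actually invalid in the primary coding system. A secondary-codec byte sequence that happens to be valid in the primary one is decoded silently, and possibly wrongly, so the approach fits a UTF-8 primary much better than two single-byte code pages such as cp1250 and cp852, where nearly every byte is valid in both.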