From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Stefan Monnier <monnier@iro.umontreal.ca>
Newsgroups: gmane.emacs.devel
Subject: Re: decode-coding-string gone awry?
Date: Mon, 14 Feb 2005 14:30:32 -0500
Message-ID: <jwvd5v3gdaq.fsf-monnier+emacs@gnu.org>
References: <x5d5v52k4m.fsf@lola.goethe.zz>
	<874qgf1dkv.fsf-monnier+emacs@gnu.org> <x5hdkf5jzi.fsf@lola.goethe.zz>
	<jwvbranhykt.fsf-monnier+emacs@gnu.org>
	<x5fyzz3vh4.fsf@lola.goethe.zz>
	<jwvu0ofggsu.fsf-monnier+emacs@gnu.org>
	<x53bvz3rxs.fsf@lola.goethe.zz>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1108410080 6882 80.91.229.6 (14 Feb 2005 19:41:20 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Mon, 14 Feb 2005 19:41:20 +0000 (UTC)
Cc: emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Feb 14 20:41:12 2005
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Original-Received: from lists.gnu.org ([199.232.76.165])
	by deer.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 1D0m5o-0001Z4-00
	for <ged-emacs-devel@m.gmane.org>; Mon, 14 Feb 2005 20:41:12 +0100
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1D0mLP-0006tl-De
	for ged-emacs-devel@m.gmane.org; Mon, 14 Feb 2005 14:57:19 -0500
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1D0mIU-00063k-Iv
	for emacs-devel@gnu.org; Mon, 14 Feb 2005 14:54:19 -0500
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1D0mIN-00060T-Te
	for emacs-devel@gnu.org; Mon, 14 Feb 2005 14:54:11 -0500
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1D0mIM-0005q2-Jl
	for emacs-devel@gnu.org; Mon, 14 Feb 2005 14:54:10 -0500
Original-Received: from [132.204.24.67] (helo=mercure.iro.umontreal.ca)
	by monty-python.gnu.org with esmtp (Exim 4.34)
	id 1D0lvh-00042a-UV; Mon, 14 Feb 2005 14:30:46 -0500
Original-Received: from hidalgo.iro.umontreal.ca (hidalgo.iro.umontreal.ca
	[132.204.27.50]) by mercure.iro.umontreal.ca (Postfix) with ESMTP
	id A073E8282C5; Mon, 14 Feb 2005 14:30:45 -0500 (EST)
Original-Received: from asado.iro.umontreal.ca (asado.iro.umontreal.ca [132.204.24.84])
	by hidalgo.iro.umontreal.ca (Postfix) with ESMTP
	id F33584AC196; Mon, 14 Feb 2005 14:30:32 -0500 (EST)
Original-Received: by asado.iro.umontreal.ca (Postfix, from userid 20848)
	id CC1AC4BCF2; Mon, 14 Feb 2005 14:30:32 -0500 (EST)
Original-To: David Kastrup <dak@gnu.org>
In-Reply-To: <x53bvz3rxs.fsf@lola.goethe.zz> (David Kastrup's message of
	"Mon, 14 Feb 2005 19:41:19 +0100")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)
X-DIRO-MailScanner-Information: Please contact the ISP for more information
X-DIRO-MailScanner: Found to be clean
X-DIRO-MailScanner-SpamCheck: n'est pas un polluriel,
	SpamAssassin (score=-4.796, requis 5, AWL 0.10, BAYES_00 -4.90)
X-MailScanner-From: monnier@iro.umontreal.ca
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: main.gmane.org gmane.emacs.devel:33414
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:33414

> Give me a clue: what happens if a process inserts stuff with 'raw-text
> encoding into a multibyte buffer?  'raw-text is a reconstructible
> encoding, isn't it, so the stuff will get converted into some prefix
> byte indicating "isolated single-byte entity instead of utf-8 char"
> and the byte itself or something, right?  And decode-encoding-string
> does not want to work on something like that?

If you want accented chars to appear as accented chars in the (process)
buffer (i.e. you don't want to change the AUCTeX part), then raw-text is
not an option anyway.  If you don't mind accented chars appearing as
\NNN, then you can make the buffer unibyte and use `raw-text' as the
process's output coding-system.  That's the more robust approach.
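
A minimal sketch of that unibyte setup, assuming the process already has
a live buffer.  The helper name `my-process-use-raw-bytes' is made up for
illustration and is not part of AUCTeX:

```elisp
;; Hypothetical helper; the name is an assumption, not AUCTeX code.
(defun my-process-use-raw-bytes (process)
  "Make PROCESS's buffer unibyte and read its output as `raw-text'.
Assumes PROCESS has a live buffer."
  (with-current-buffer (process-buffer process)
    ;; Switch the buffer to unibyte so each byte is kept as-is.
    (set-buffer-multibyte nil))
  ;; Change only the decoding (output) side; keep whatever encoding is
  ;; already used for input sent to the process.
  (set-process-coding-system process
                             'raw-text
                             (cdr (process-coding-system process))))
```

It would be called right after the process is created, before any output
arrives, so that nothing gets decoded with the previous coding-system.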

If that option is out (i.e. you have to use a multibyte buffer), you'll
basically have to recover the original byte sequence by replacing the

   (regexp-quote (substring string 0 (match-beginning 1)))

with

   (regexp-quote (encode-coding-string
                  (substring string 0 (match-beginning 1))
                  buffer-file-coding-system))

[assuming buffer-file-coding-system is the process's output coding-system] or

   (regexp-quote (string-make-unibyte
                  (substring string 0 (match-beginning 1))))

which is basically equivalent except that you lose control over which
coding-system is used.
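
To illustrate why the encoding step recovers the bytes, here is a small
sketch.  It assumes utf-8 as the process's output coding-system, and the
helper name `my-prefix-byte-regexp' is hypothetical:

```elisp
;; Hypothetical helper; the name and the utf-8 choice are assumptions.
(defun my-prefix-byte-regexp (string end coding-system)
  "Return a regexp matching the raw bytes of STRING's first END chars.
The chars are encoded with CODING-SYSTEM before being quoted."
  (regexp-quote
   (encode-coding-string (substring string 0 end) coding-system)))

;; Round trip: decoding yields a multibyte string, and encoding it with
;; the same coding-system gives back the original unibyte bytes.
(let* ((bytes "caf\303\251")                        ; unibyte, utf-8 bytes
       (chars (decode-coding-string bytes 'utf-8))) ; multibyte
  (list (multibyte-string-p chars)
        (equal (encode-coding-string chars 'utf-8) bytes)))
```

The regexp built this way is unibyte, so it is meant to be matched
against text that still holds the raw bytes, not against decoded text.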


        Stefan