From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: decode-coding-string gone awry?
Date: Mon, 14 Feb 2005 14:30:32 -0500
Original-To: David Kastrup
Cc: emacs-devel@gnu.org

> Give me a clue: what happens if a process inserts stuff with 'raw-text
> encoding into a multibyte buffer?  'raw-text is a reconstructible
> encoding, isn't it, so the stuff will get converted into some prefix
> byte indicating "isolated single-byte entity instead of utf-8 char"
> and the byte itself or something, right?  And decode-coding-string
> does not want to work on something like that?

If you want accented chars to appear as accented chars in the (process)
buffer (i.e. you don't want to change the AUCTeX part), then raw-text is
not an option anyway.

If you don't mind accented chars appearing as \NNN, then you can make the
buffer unibyte and use `raw-text' as the process's output coding-system.
That's the more robust approach.

If that option is out (i.e.
you have to use a multibyte buffer), you'll have to basically recover the
original byte-sequence by replacing the

    (regexp-quote (substring string 0 (match-beginning 1)))

with

    (regexp-quote (encode-coding-string
                   (substring string 0 (match-beginning 1))
                   buffer-file-coding-system))

[assuming buffer-file-coding-system is the process's output coding-system]
or

    (regexp-quote (string-make-unibyte
                   (substring string 0 (match-beginning 1))))

which is basically equivalent except that you lose control over which
coding-system is used.


        Stefan
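
For illustration, the encode-coding-string variant above could be wrapped
up roughly as follows.  This is only a sketch: the helper name
`my-byte-prefix-regexp', the variable `my-sample-regexp', the sample
string and the choice of utf-8 are all made up for the example and appear
nowhere in AUCTeX or Emacs.

```elisp
;; Hypothetical helper, not part of AUCTeX or Emacs.
(defun my-byte-prefix-regexp (string end coding-system)
  "Return a regexp matching the encoded bytes of STRING up to index END.
CODING-SYSTEM is assumed to be the process's output coding system."
  (regexp-quote
   ;; `encode-coding-string' turns the multibyte text back into a
   ;; unibyte string holding the raw bytes.
   (encode-coding-string (substring string 0 end) coding-system)))

;; Made-up input: the first 8 characters are "caf\u00e9.tex".
(defvar my-sample-regexp
  (my-byte-prefix-regexp "caf\u00e9.tex:12" 8 'utf-8))

(multibyte-string-p my-sample-regexp)   ; nil, the regexp is unibyte
```

Passing the coding system explicitly keeps the choice visible at the call
site, which is the control the string-make-unibyte variant gives up.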