From: Robert Pluim
Newsgroups: gmane.emacs.devel
Subject: Re: emacs-26 8f18d12: Improve documentation of decoding into a unibyte buffer
Date: Mon, 27 May 2019 15:49:50 +0200
To: Stefan Monnier
Cc: emacs-devel@gnu.org

>>>>> On Mon, 27 May 2019 09:32:11 -0400, Stefan Monnier said:

>> If I take a string of say "β", and replace string-as-unibyte with
>> (encode-coding-string 'emacs-internal), `encoded-string-description'
>> prints "#xCE #xB2", which is the correct UTF-8 encoded
>> value. 'raw-text works too. Iʼm certain that there are subtle
>> differences between the two that I donʼt understand.

Stefan> But "β" is not a "STR that is encoded by CODING-SYSTEM", so this output
Stefan> is neither correct nor incorrect in any case.

It matches the current output of encoded-string-description, though.

Stefan> I think the right thing to do here is one of:
Stefan> - signal an error if `str` is multibyte.
Stefan> - signal an error if `str` is multibyte and contains non-byte chars.
Stefan> - if multibyte, encode `str` with `coding-system`.
Stefan> - just don't bother looking at whether `str` is unibyte or not, just
Stefan>   pass it as is to `mapconcat`.
Stefan> - just don't bother looking at whether `str` is unibyte or not, just
Stefan>   pass it as is to `mapconcat` but in the lambda, do catch the case
Stefan>   where `x` is an "eight bit raw-byte char" and if so pass it to
Stefan>   multibyte-char-to-unibyte.
Stefan> - ...

Since this is the underlying code that displays the 'buffer code'
section of 'C-u C-x =', I donʼt think barfing on multibyte is the
right thing to do. Nor is passing it on as is.

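For concreteness, the last option in Stefan's list might look roughly
like the sketch below. `my-describe-bytes' is a hypothetical name, not
the real `encoded-string-description', and the ISO 2022 handling of
the real function is left out. What to do with characters that are not
bytes is an assumption made purely for illustration; the thread has
not settled that point.

```elisp
;; Hypothetical helper for illustration only; it is NOT the actual
;; `encoded-string-description' from mule-cmds.el.
(defun my-describe-bytes (str)
  "Return STR described as space-separated hex bytes.
STR is passed to `mapconcat' as is, whether unibyte or multibyte.
Raw-byte characters are mapped back to their byte value with
`multibyte-char-to-unibyte'; other characters are only flagged."
  (mapconcat
   (lambda (x)
     ;; `multibyte-char-to-unibyte' returns -1 when X does not
     ;; represent a byte.
     (let ((byte (multibyte-char-to-unibyte x)))
       (if (>= byte 0)
           (format "#x%02X" byte)
         ;; Assumption: mark the character rather than guess at an
         ;; encoding for it.
         (format "<char #x%X>" x))))
   str " "))

;; Unibyte input, i.e. text that really has been encoded:
(my-describe-bytes (encode-coding-string "β" 'utf-8))
;; => "#xCE #xB2"

;; Multibyte input that holds raw bytes:
(my-describe-bytes (string-to-multibyte "\316\262"))
;; => "#xCE #xB2"

;; Multibyte input that holds a real character:
(my-describe-bytes "β")
;; => "<char #x3B2>"
```

One caveat, as far as I can tell: `multibyte-char-to-unibyte' returns
character codes below 256 unchanged, so a Latin-1 character such as
"é" in a multibyte string would be reported as if it were the byte
#xE9.
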
Stefan> But encoding `str` with any coding system like raw-text or
Stefan> emacs-internal doesn't seem to make much sense.

Then what is the correct way to say 'give me the raw byte version of
this character'? (or maybe we should just let sleeping encodings lie
:-) )

Robert