From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] emacs-26 8f18d12: Improve documentation of decoding into a unibyte buffer
Date: Sat, 25 May 2019 17:11:00 -0400
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <83sgt22tmh.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 25 May 2019 22:59:02 +0300")

> The internal representation of the decoded text could include both.
> If some of the bytes in the original byte stream couldn't be decoded
> using the specified coding-system, they will be represented as raw
> bytes, using 2-byte sequences. OTOH, Latin characters successfully
> decoded into codepoints less than 256 will take 1 byte.
>
> Again, this is just the internal representation of what was decoded.

Great, thanks.
But now I wonder: what can we do with this representation?
I guess set-buffer-multibyte will convert it to the intended chars, but
that begs the question "why bother decoding into the unibyte buffer and
calling set-buffer-multibyte afterwards rather than doing the reverse?"
Anything else we can do with it?


        Stefan