From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: emacs-26 8f18d12: Improve documentation of decoding into a unibyte buffer
Date: Wed, 29 May 2019 12:28:30 -0400
To: emacs-devel@gnu.org

>> In any case, I think we should strive to avoid using "encoded" multibyte
>> strings.
> I don't think it's possible,

"strive to avoid" is always possible. I didn't say we should completely
disallow it (which might be possible, but it's too far from where we are
to be able to tell).

> because buffers are by default multibyte.

And those contain chars 99.99% of the time. And buffers that contain
bytes are unibyte in most cases. This is the sane way to work. It makes
it easy to know what is what.

Also, not only is it possible, but it's pretty much the case already.
Whether we'll be able to eliminate all cases, I don't know. But I think
we should try to make the cases of "decoded text in unibyte" and
"encoded text in multibyte" as rare as possible.

[ Similarly, set-buffer-multibyte should only ever be called in an empty
  buffer. ]


        Stefan


PS: I added checks in encoding/decoding functions to signal errors when
decoding from multibyte and encoding from unibyte (in my local Emacs),
and that's been tremendously useful to track down and fix encoding bugs
in Gnus.
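
A minimal sketch of the kind of check described in the PS, written as wrappers rather than as changes to the coding functions themselves. The names `my-strict-decode-string` and `my-strict-encode-string` are hypothetical, and this is not the actual local patch; it only assumes the standard `multibyte-string-p`, `decode-coding-string` and `encode-coding-string`. Letting pure-ASCII unibyte strings through on the encoding side is also an assumption of this sketch, made because ASCII-only string literals are unibyte.

```elisp
;; Hypothetical wrappers, not the actual local patch mentioned in the PS.
(require 'seq)

(defun my-strict-decode-string (string coding-system)
  "Decode STRING with CODING-SYSTEM, refusing multibyte input.
Decoding turns bytes into characters, so the input should be unibyte."
  (when (multibyte-string-p string)
    (error "Decoding a multibyte string: %S" string))
  (decode-coding-string string coding-system))

(defun my-strict-encode-string (string coding-system)
  "Encode STRING with CODING-SYSTEM, refusing unibyte non-ASCII input.
Pure-ASCII unibyte strings are let through (an assumption of this
sketch), since ASCII-only literals are unibyte even when meant as text."
  (when (and (not (multibyte-string-p string))
             ;; In a unibyte string each element is a byte in 0..255.
             (seq-some (lambda (byte) (> byte 127)) string))
    (error "Encoding a unibyte string with non-ASCII bytes: %S" string))
  (encode-coding-string string coding-system))
```

Calling the decoder on already-decoded text, or the encoder on already-encoded bytes, then fails loudly at the call site instead of silently producing "decoded text in unibyte" or "encoded text in multibyte".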