From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
Date: Thu, 16 May 2024 21:47:52 +0300
Message-ID: <86seyhh9uv.fsf@gnu.org>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@gmail.com>
In-Reply-To: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@gmail.com> (message from Mattias Engdegård on Thu, 16 May 2024 20:13:18 +0200)
To: Mattias Engdegård
Cc: 70988@debbugs.gnu.org, monnier@iro.umontreal.ca

> Cc: Stefan Monnier
> From: Mattias Engdegård
> Date: Thu, 16 May 2024 20:13:18 +0200
>
> When `read` is called with a function as stream argument, the return values of that function are often interpreted as Latin-1 characters with only the 8 low bits used. Example:
>
>   (let* ((next '(?A #x12a nil))
>          (f (lambda (&rest args)
>               (if args
>                   (push (car args) next)
>                 (pop next)))))
>     (read f))
>   => A*    ; expected: AĪ
>
> This is a result of `readchar` setting *multibyte to 0 on this code path.

When is this situation relevant?  How many uses of function-as-a-stream are there out there?

In general, I wouldn't touch these rare cases with a 3-mile pole.  The gain is generally very small (satisfaction from some abstract sense of correctness aside), while the risk to break some code is usually high.  It is better to document this behavior and move on.

> The fix is straightforward (attached).
>
> diff --git a/src/lread.c b/src/lread.c
> index c92b2ede932..2626272c4e2 100644
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -422,6 +422,8 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>        goto read_multibyte;
>      }
>
> +  if (multibyte)
> +    *multibyte = 1;
>    tem = call0 (readcharfun);

Is it an accident that the code does the same only _after_ the call to readbyte?
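
For reference, the `A*` result in the quoted example is consistent with only the low 8 bits of each returned character code being kept, as the report describes.  The following is a minimal standalone sketch of that arithmetic only; `low_8_bits` is a hypothetical helper written for illustration, not a function from src/lread.c, and it does not reproduce what `readchar` actually does internally.

```c
/* Hypothetical illustration, not Emacs code: keep only the low 8 bits
   of a character code, which is the effect the bug report attributes
   to the function-as-stream code path.  */
static int
low_8_bits (int c)
{
  return c & 0xFF;
}
```

With this helper, `low_8_bits ('A')` stays `'A'` (#x41), while `low_8_bits (0x12A)` gives #x2a, the code for `*`.  That matches the `A*` shown in the report where `AĪ` was expected.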