unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: "Mattias Engdegård" <mattias.engdegard@gmail.com>
Cc: 70988@debbugs.gnu.org, monnier@iro.umontreal.ca
Subject: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
Date: Thu, 16 May 2024 21:47:52 +0300
Message-ID: <86seyhh9uv.fsf@gnu.org>
In-Reply-To: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@gmail.com> (message from Mattias Engdegård on Thu, 16 May 2024 20:13:18 +0200)

> Cc: Stefan Monnier <monnier@iro.umontreal.ca>
> From: Mattias Engdegård <mattias.engdegard@gmail.com>
> Date: Thu, 16 May 2024 20:13:18 +0200
> 
> When `read` is called with a function as stream argument, the return values of that function are often interpreted as Latin-1 characters with only the 8 low bits used. Example:
> 
> (let* ((next '(?A #x12a nil))
>        (f (lambda (&rest args)
>             (if args
>                 (push (car args) next)
>               (pop next)))))
>   (read f))
> => A*   ; expected: AĪ
> 
> This is a result of `readchar` setting *multibyte to 0 on this code path.

When is this situation relevant?  How many uses of a function as a
stream are there out there?

In general, I wouldn't touch these rare cases with a 3-mile pole.  The
gain is generally very small (satisfaction from some abstract sense of
correctness aside), while the risk of breaking some code is usually
high.  It is better to document this behavior and move on.

> The fix is straightforward (attached).
> 
> diff --git a/src/lread.c b/src/lread.c
> index c92b2ede932..2626272c4e2 100644
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -422,6 +422,8 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>        goto read_multibyte;
>      }
>  
> +  if (multibyte)
> +    *multibyte = 1;
>    tem = call0 (readcharfun);

Is it an accident that the code does the same only _after_ the call to
readbyte?
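
For reference, the "A*" result in the quoted example is consistent with
keeping only the low 8 bits of each value the function returns: #x12a
masked to 8 bits is #x2a, which is '*'.  Below is a minimal sketch of
just that arithmetic.  It is not code from lread.c, and the helper name
low_eight_bits is hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from lread.c: keep only the low
   8 bits of a character code, as a Latin-1 interpretation would.  */
static int
low_eight_bits (int c)
{
  return c & 0xFF;
}

/* Check and print the two characters from the quoted example after
   truncation to 8 bits.  */
static void
show_truncation (void)
{
  assert (low_eight_bits ('A') == 'A');    /* ASCII is unaffected.  */
  assert (low_eight_bits (0x12a) == '*');  /* U+012A loses its high bits.  */
  printf ("%c%c\n", low_eight_bits ('A'), low_eight_bits (0x12a));
}
```

Calling show_truncation from a main function prints the same two
characters as the reported result, which is why the expected U+012A
never reaches the reader on that code path.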






