unofficial mirror of bug-gnu-emacs@gnu.org 
From: Evgeny Kurnevsky <kurnevsky@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 74922@debbugs.gnu.org
Subject: bug#74922: Fwd: bug#74922: 29.4; copy_string_contents doesn't always produce a valid utf-8
Date: Tue, 17 Dec 2024 14:46:28 +0000
Message-ID: <CAOEHfogetPusuLUajxMxuEyWs=yQuMoLo9wk+Xd7eFwJp_6-nA@mail.gmail.com>
In-Reply-To: <8634imo0aa.fsf@gnu.org>

The module can definitely do that check, but I guess emacs-module-rs doesn't
do it by default because of the performance implications: validating every
string might be quite costly in some cases, and it wasn't really clear
whether Emacs can pass an invalid string at all. So currently this case
causes undefined behavior there, which results in an Emacs crash.
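
To illustrate the trade-off, here is a minimal sketch in plain Rust using only
the standard library. It is not the emacs-module-rs code; the helper names
bytes_to_string_checked and bytes_to_string_lossy are hypothetical, and the
byte buffer stands in for whatever copy_string_contents wrote. The checked
path costs one pass over the bytes, whereas skipping the check (for example
via String::from_utf8_unchecked) is undefined behavior on invalid input.

```rust
use std::string::FromUtf8Error;

/// Hypothetical helper: validate bytes copied out of Emacs before treating
/// them as a Rust String. Costs one linear scan of the buffer.
pub fn bytes_to_string_checked(buf: Vec<u8>) -> Result<String, FromUtf8Error> {
    // String::from_utf8 returns Err instead of invoking undefined behavior
    // when the buffer is not valid UTF-8 (e.g. contents of a binary file).
    String::from_utf8(buf)
}

/// Hypothetical alternative: never fail, but replace each invalid sequence
/// with U+FFFD REPLACEMENT CHARACTER.
pub fn bytes_to_string_lossy(buf: &[u8]) -> String {
    String::from_utf8_lossy(buf).into_owned()
}
```

With these, bytes_to_string_checked(b"hello".to_vec()) succeeds, while a
buffer such as [0x66, 0x6f, 0xff, 0x6f] is rejected by the checked helper and
becomes "fo\u{FFFD}o" with the lossy one.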

On Tue, Dec 17, 2024 at 2:24 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Evgeny Kurnevsky <kurnevsky@gmail.com>
> > Date: Tue, 17 Dec 2024 13:31:57 +0000
> >
> > Yes, that's a binary file that is not a UTF-8 string. From the comment
> > in the module_copy_string_contents implementation I guessed that in
> > such cases Emacs should signal an error, but instead it just passes
> > this invalid string to the dynamic library, which caused this bug in
> > emacs-module-rs (see
> > https://ubolonton.github.io/emacs-module-rs/latest/type-conversions.html#strings
> > ). So if that's expected, then maybe it should be stated explicitly in
> > the docs of copy_string_contents here
> > https://www.gnu.org/software/emacs/manual/html_node/elisp/Module-Values.html
> > ? It just says that it stores the UTF-8 encoded text, which gives the
> > impression that it's always a valid UTF-8 string.
>
> I could look into the internals, but I actually wonder why the module
> doesn't check the text before relying on such subtle behaviors.  We
> didn't document the fact that it signals an error for a reason.
>
> So: why cannot the module code or the application which uses it test
> up front that the string it copies is human-readable text, not some
> binary junk?
>


-- 
Best regards, Evgeny Kurnevsky.

Thread overview: 6+ messages
2024-12-17  6:08 bug#74922: 29.4; copy_string_contents doesn't always produce a valid utf-8 Evgeny Kurnevsky
2024-12-17 13:18 ` Eli Zaretskii
     [not found]   ` <CAOEHfojGKXoUKbf1-5N=973OURs==BQTXejLFd8cLhsR1DWh+g@mail.gmail.com>
2024-12-17 13:31     ` bug#74922: Fwd: " Evgeny Kurnevsky
2024-12-17 14:24       ` Eli Zaretskii
2024-12-17 14:46         ` Evgeny Kurnevsky [this message]
2024-12-17 15:10           ` Eli Zaretskii
