all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: "Mattias Engdegård" <mattiase@acm.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 42904@debbugs.gnu.org, alan@idiocy.org
Subject: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Fri, 21 Aug 2020 11:39:30 +0200	[thread overview]
Message-ID: <F577D88E-4D40-4DA9-B2F5-1CCE2A56CD87@acm.org> (raw)
In-Reply-To: <834koxcere.fsf@gnu.org>

20 aug. 2020 kl. 21.13 skrev Eli Zaretskii <eliz@gnu.org>:

> I don't think I understand.  mode_line_noprop_buf gets the bytes, and
> then we call make_string on it, so the result is the same as the one
> you'd like to avoid.  Or am I missing something?
> 
> By "settling on multibyte representation", do you mean that we should
> convert raw bytes to their multibyte form?  Or do you mean something
> else?

No, I think we are talking about the same thing. Basically, it's about how the bytes end up in mode_line_noprop_buf in the first place, since currently the information of whether it should be interpreted as unibyte or multibyte gets lost as soon as data from the strings it is composed of (like the buffer name for %b, file name for %f etc) is added to it. Then make_string tries to restore that information by looking at the bytes, and it is not always accurate.

One way of doing this is to always make sure that the input strings (buffer name, file name, frame-title-format etc) are always in multibyte form. Another would be to convert to multibyte as those strings are used, presumably in decode_mode_spec. You know this code a lot better than I do, but the former may be slightly more workable (and efficient).

> Again, what would you like to have instead?  Would calling
> str_as_multibyte do what you want?

No, I don't think so -- once the unibyte/multibyte bit is lost, it can only be restored imperfectly if all we have is the sequence of bytes. In mathematical terms, the function that maps an arbitrary string object to its bytes has no inverse. (Consider the unibyte string "\xc3\xa5" -- should the bytes {c3, a5} be recreated as that unibyte string, or as the multibyte string "å"?)

Again we are talking about trivialities here, but perhaps the same syndrome will arise in other contexts where it matters more. If we wrote Emacs from scratch we likely wouldn't have unibyte strings at all: they are only there for compatibility and various niche uses and performance hacks. I don't think it's unreasonable to start normalising strings to multibyte where it matters.

Thanks for your patience!






  reply	other threads:[~2020-08-21  9:39 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-17 14:11 bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Mattias Engdegård
2020-08-17 14:54 ` Andrii Kolomoiets
2020-08-17 15:55   ` Mattias Engdegård
2020-08-17 15:55 ` Eli Zaretskii
2020-08-17 16:11   ` Mattias Engdegård
2020-08-17 17:05     ` Eli Zaretskii
2020-08-17 18:48       ` Mattias Engdegård
2020-08-17 19:56         ` Alan Third
2020-08-18  8:07           ` Mattias Engdegård
2020-08-18  8:43             ` Alan Third
2020-08-18 11:48               ` Mattias Engdegård
2020-08-18 12:22                 ` Eli Zaretskii
2020-08-18 17:28                 ` Alan Third
2020-08-20  9:27                   ` Mattias Engdegård
2020-08-20 13:24                     ` Eli Zaretskii
2020-08-20 18:46                       ` Mattias Engdegård
2020-08-20 19:13                         ` Eli Zaretskii
2020-08-21  9:39                           ` Mattias Engdegård [this message]
2020-08-21 13:26                             ` Eli Zaretskii
2020-08-21 14:53                               ` Mattias Engdegård
2020-08-21 15:27                                 ` Eli Zaretskii
2020-08-21 15:50                                   ` Mattias Engdegård
2020-08-23 17:23                                     ` Mattias Engdegård
2020-08-20 13:24                     ` Alan Third
2020-08-20 17:44                       ` Mattias Engdegård
2020-08-18 12:24               ` Eli Zaretskii
2020-08-18 14:11                 ` Mattias Engdegård
2020-08-18 14:40                   ` Eli Zaretskii
2020-08-18 15:21                     ` Mattias Engdegård

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=F577D88E-4D40-4DA9-B2F5-1CCE2A56CD87@acm.org \
    --to=mattiase@acm.org \
    --cc=42904@debbugs.gnu.org \
    --cc=alan@idiocy.org \
    --cc=eliz@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.