From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 56347@debbugs.gnu.org
Subject: bug#56347: Optimize/simplify STRING_SET_MULTIBYTE
Date: Sat, 02 Jul 2022 12:12:06 -0400
Message-ID: <jwvtu7zqz8c.fsf-monnier+emacs@gnu.org>
In-Reply-To: <83pmioca3h.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 02 Jul 2022 09:17:06 +0300")

>> The patch below simplifies code around STRING_SET_MULTIBYTE.
>> Any objection?
> Rationale?

STRING_SET_MULTIBYTE is fundamentally evil because it changes the nature
of an existing object.  Its current definition (like that of
STRING_SET_UNIBYTE) is rather scary: sometimes it changes the nature of
the arg passed to it in place, and sometimes it replaces the arg with
a different object (empty_multibyte_string).

>> --- a/src/composite.c
>> +++ b/src/composite.c
>> @@ -1879,11 +1879,7 @@ Otherwise (for terminal display), FONT-OBJECT must be a terminal ID, a
>>  	  for (i = SBYTES (string) - 1; i >= 0; i--)
>>  	    if (!ASCII_CHAR_P (SREF (string, i)))
>>  	      error ("Attempt to shape unibyte text");
>> -	  /* STRING is a pure-ASCII string, so we can convert it (or,
>> -	     rather, its copy) to multibyte and use that thereafter.  */
>> -	  Lisp_Object string_copy = Fconcat (1, &string);
>> -	  STRING_SET_MULTIBYTE (string_copy);
>> -	  string = string_copy;
>> +	  /* STRING is a pure-ASCII string, so we can treat it as multibyte.  */
>
> Did you actually try your change in the situations where this problem
> pops up?

I don't even know how to go about doing that, no.

> AFAIR, the code makes a copy of the string for good reasons:
> the rest of handling of the string down the line barfs if we keep a
> multibyte string here.

[ I assume you meant "barfs if we keep a *uni*byte string here".  ]

Where?  AFAICT the subsequent code only uses `string` by passing it to
`fill_gstring_header`, which only passes that arg on to
`fetch_string_char_advance_no_check`, which in turn only looks at the
string's SDATA.  So as long as the sequence of bytes is consistent with
a multibyte string (which we just checked with the ASCII_CHAR_P loop),
I don't see any problem.
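
To illustrate the argument, here is a minimal standalone sketch (not Emacs
code): a toy byte-at-a-time reader and a toy multibyte reader that assumes
UTF-8-style lead bytes and handles only 1- and 2-byte forms.  All names
(`all_ascii_p`, `fetch_char_unibyte`, `fetch_char_multibyte`,
`same_under_both_readers`) are hypothetical stand-ins, not the functions
named above.  It shows that once the ASCII check has passed, both readers
return the same characters from the same bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if every byte in BUF[0..NBYTES) is ASCII.  This mirrors
   the shape of the ASCII_CHAR_P loop in the quoted hunk.  */
static bool
all_ascii_p (const unsigned char *buf, ptrdiff_t nbytes)
{
  for (ptrdiff_t i = nbytes - 1; i >= 0; i--)
    if (buf[i] >= 0x80)
      return false;
  return true;
}

/* Toy unibyte reader: one byte is one character.  */
static int
fetch_char_unibyte (const unsigned char *buf, ptrdiff_t *pos)
{
  return buf[(*pos)++];
}

/* Toy multibyte reader (assumption: UTF-8-style encoding, 1- and 2-byte
   forms only).  Returns -1 if a sequence is truncated.  */
static int
fetch_char_multibyte (const unsigned char *buf, ptrdiff_t nbytes,
                      ptrdiff_t *pos)
{
  unsigned char lead = buf[(*pos)++];
  if (lead < 0x80)
    return lead;                /* ASCII: same value as the unibyte read */
  if (*pos >= nbytes)
    return -1;
  unsigned char trail = buf[(*pos)++];
  return ((lead & 0x1F) << 6) | (trail & 0x3F);
}

/* Return true if both readers yield the same character sequence.  */
static bool
same_under_both_readers (const unsigned char *buf, ptrdiff_t nbytes)
{
  ptrdiff_t u = 0, m = 0;
  while (u < nbytes && m < nbytes)
    if (fetch_char_unibyte (buf, &u)
        != fetch_char_multibyte (buf, nbytes, &m))
      return false;
  return u == nbytes && m == nbytes;
}
```

In this toy model, any buffer for which `all_ascii_p` holds also satisfies
`same_under_both_readers`, whereas a buffer with a byte >= 0x80 does not.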

>> --- a/src/lisp.h
>> +++ b/src/lisp.h
>> @@ -1637,12 +1637,10 @@ #define STRING_SET_UNIBYTE(STR)				\
>>  
>>  /* Mark STR as a multibyte string.  Assure that STR contains only
>>     ASCII characters in advance.  */
>> -#define STRING_SET_MULTIBYTE(STR)			\
>> -  do {							\
>> -    if (XSTRING (STR)->u.s.size == 0)			\
>> -      (STR) = empty_multibyte_string;			\
>> -    else						\
>> -      XSTRING (STR)->u.s.size_byte = XSTRING (STR)->u.s.size; \
>> +#define STRING_SET_MULTIBYTE(STR)			    \
>> +  do {							    \
>> +    eassert (XSTRING (STR)->u.s.size > 0);		    \
>> +    XSTRING (STR)->u.s.size_byte = XSTRING (STR)->u.s.size; \
>>    } while (false)
>>  
>>  /* Convenience functions for dealing with Lisp strings.  */
>
> You want to disallow uses of empty_multibyte_string? why?

No, I want to narrow the semantics of the macro, e.g. so that it can
be implemented as a function rather than a macro, and so that it doesn't
magically substitute empty_multibyte_string into a variable that held
something else.
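
A minimal sketch of that difference, using a toy struct rather than the
real Lisp_String: the names `toy_string`, `toy_empty_multibyte`,
`TOY_SET_MULTIBYTE_OLD` and `toy_set_multibyte` are hypothetical, the use of
a negative `size_byte` to mean "unibyte" is an assumption of the toy model,
and plain `assert` stands in for `eassert`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy string object; field names loosely follow the quoted hunk.  */
struct toy_string
{
  ptrdiff_t size;       /* number of characters */
  ptrdiff_t size_byte;  /* number of bytes; negative means unibyte here */
};

/* Toy counterpart of a shared empty multibyte string.  */
static struct toy_string toy_empty_multibyte = { 0, 0 };

/* Old shape: for an empty string it reassigns the caller's variable,
   which only a macro can do; otherwise it mutates the object.  */
#define TOY_SET_MULTIBYTE_OLD(STR)              \
  do {                                          \
    if ((STR)->size == 0)                       \
      (STR) = &toy_empty_multibyte;             \
    else                                        \
      (STR)->size_byte = (STR)->size;           \
  } while (0)

/* New shape: never touches the caller's variable, so it can be an
   ordinary function.  Empty strings are excluded by the assertion.  */
static void
toy_set_multibyte (struct toy_string *str)
{
  assert (str->size > 0);
  str->size_byte = str->size;
}
```

With the old shape, a variable pointing at an empty string ends up pointing
at `toy_empty_multibyte` while the original object stays untouched; with the
new shape the variable always keeps pointing at the same object.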


        Stefan





