unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Re: master 8c81818673a 6/7: Tune volatile in read_char
       [not found] ` <20240817041648.A6687C2BC66@vcs2.savannah.gnu.org>
@ 2024-08-17 15:04   ` Andrea Corallo
  2024-08-17 17:03     ` Pip Cet
  2024-08-17 18:17     ` Paul Eggert
  0 siblings, 2 replies; 22+ messages in thread
From: Andrea Corallo @ 2024-08-17 15:04 UTC (permalink / raw)
  To: emacs-devel; +Cc: Paul Eggert

Paul Eggert <eggert@cs.ucla.edu> writes:

> branch: master
> commit 8c81818673ae9ff788c6e65fb90984f327b27964
> Author: Paul Eggert <eggert@cs.ucla.edu>
> Commit: Paul Eggert <eggert@cs.ucla.edu>
>
>     Tune volatile in read_char
>     
>     * src/keyboard.c (read_char): Optimize access to a local volatile.

Hi Paul,

this change building with "--enable-checking=all
--enable-check-lisp-object-type" on my system (GCC 14
x86_64-pc-linux-gnu) is introducing:

"keyboard.c:2520:15: warning: variable ‘c’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]"

Also wanted to ask, is this a performance optimization?

  Andrea



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-17 15:04   ` master 8c81818673a 6/7: Tune volatile in read_char Andrea Corallo
@ 2024-08-17 17:03     ` Pip Cet
  2024-08-17 18:16       ` Paul Eggert
  2024-08-17 18:17     ` Paul Eggert
  1 sibling, 1 reply; 22+ messages in thread
From: Pip Cet @ 2024-08-17 17:03 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: emacs-devel, Paul Eggert

"Andrea Corallo" <acorallo@gnu.org> writes:

> Paul Eggert <eggert@cs.ucla.edu> writes:
>
>> branch: master
>> commit 8c81818673ae9ff788c6e65fb90984f327b27964
>> Author: Paul Eggert <eggert@cs.ucla.edu>
>> Commit: Paul Eggert <eggert@cs.ucla.edu>
>>
>>     Tune volatile in read_char
>>
>>     * src/keyboard.c (read_char): Optimize access to a local volatile.
>
> Hi Paul,
>
> this change building with "--enable-checking=all
> --enable-check-lisp-object-type" on my system (GCC 14
> x86_64-pc-linux-gnu) is introducing:
>
> "keyboard.c:2520:15: warning: variable ‘c’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]"

Well, GCC is late to the party, we already found that bug ;-)

I think GCC, by its own logic, is right about this one. It doesn't know
that local_getcjmp is local (and hasn't escaped), so it would be, in
theory, possible for a signal handler to call 'longjmp' before
c_volatile is initialized.  And since GCC does not distinguish between
the two branches following a 'setjmp', it might end up using the value
of 'c' which might hypothetically have been clobbered.

Of course there are many reasons that can't actually happen, but the GCC
limitations are well known and we can work around them:

diff --git a/src/keyboard.c b/src/keyboard.c
index 0d3506bc59b..0992fab653b 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -2752,7 +2752,7 @@ read_char (int commandflag, Lisp_Object map,
      it *must not* be in effect when we call redisplay.  */
 
   specpdl_ref jmpcount = SPECPDL_INDEX ();
-  Lisp_Object volatile c_volatile;
+  Lisp_Object volatile c_volatile = c;
   if (sys_setjmp (local_getcjmp))
     {
       c = c_volatile;
@@ -2800,7 +2800,6 @@ read_char (int commandflag, Lisp_Object map,
       goto non_reread;
     }
 
-  c_volatile = c;
 #if GCC_LINT && __GNUC__ && !__clang__
   /* This useless assignment pacifies GCC 14.2.1 x86-64
      <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21161>.  */

Does that help? It does here, but these warnings are highly dependent on
the precise GCC version and optimization flags used, which is why I
still think we should default to -Werror=clobbered to catch distributors
performing LTO builds and such.

Pip




^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-17 17:03     ` Pip Cet
@ 2024-08-17 18:16       ` Paul Eggert
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Eggert @ 2024-08-17 18:16 UTC (permalink / raw)
  To: Pip Cet, Andrea Corallo; +Cc: emacs-devel

On 2024-08-17 10:03, Pip Cet wrote:

> I think GCC, by its own logic, is right about this one. It doesn't know
> that local_getcjmp is local (and hasn't escaped), so it would be, in
> theory, possible for a signal handler to call 'longjmp' before
> c_volatile is initialized.

For what it's worth, gcc (GCC) 14.2.1 20240801 (Red Hat 14.2.1-1) does 
not issue the warning on x86-64, when Emacs is built with 
--enable-gcc-warnings. Either it's smarter about escaping, or it figures 
it's not worth issuing warnings about what random signal handlers might 
do, or (most likely) it's just the luck of that particular configuation.

Thanks for the patch; I installed it.


> I still think we should default to -Werror=clobbered to catch distributors
> performing LTO builds and such.

Not sure that'd be wise given the false positive that your patch fixes, 
More false positives are likely lurking in other GCC setups, and 
defaulting to -Werror=clobbered would likely break builds due to false 
alarms.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-17 15:04   ` master 8c81818673a 6/7: Tune volatile in read_char Andrea Corallo
  2024-08-17 17:03     ` Pip Cet
@ 2024-08-17 18:17     ` Paul Eggert
  2024-08-18  7:39       ` Andrea Corallo
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2024-08-17 18:17 UTC (permalink / raw)
  To: Andrea Corallo, emacs-devel

On 2024-08-17 08:04, Andrea Corallo wrote:
> Also wanted to ask, is this a performance optimization?

Yes, the idea is to keep the recently-read c in a register rather than 
always referring to RAM. Admittedly a minor optimization these days.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-17 18:17     ` Paul Eggert
@ 2024-08-18  7:39       ` Andrea Corallo
  2024-08-18 21:39         ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Andrea Corallo @ 2024-08-18  7:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> On 2024-08-17 08:04, Andrea Corallo wrote:
>> Also wanted to ask, is this a performance optimization?
>
> Yes, the idea is to keep the recently-read c in a register rather than
> always referring to RAM. Admittedly a minor optimization these days.

Has this optimization any measurable effect?



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-18  7:39       ` Andrea Corallo
@ 2024-08-18 21:39         ` Paul Eggert
  2024-08-18 21:57           ` Sam James
  2024-08-19 14:43           ` Andrea Corallo
  0 siblings, 2 replies; 22+ messages in thread
From: Paul Eggert @ 2024-08-18 21:39 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: emacs-devel

On 2024-08-18 00:39, Andrea Corallo wrote:
> Has this optimization any measurable effect?

On my platform (Ubuntu 24.04 LTS, x86-64, gcc -O2) it shrinks the size 
of read_char's machine code by 5%.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-18 21:39         ` Paul Eggert
@ 2024-08-18 21:57           ` Sam James
  2024-08-18 22:03             ` Paul Eggert
  2024-08-19 14:43           ` Andrea Corallo
  1 sibling, 1 reply; 22+ messages in thread
From: Sam James @ 2024-08-18 21:57 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Andrea Corallo, emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> On 2024-08-18 00:39, Andrea Corallo wrote:
>> Has this optimization any measurable effect?
>
> On my platform (Ubuntu 24.04 LTS, x86-64, gcc -O2) it shrinks the size
> of read_char's machine code by 5%.

Could you include that information in the commit message in future?



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-18 21:57           ` Sam James
@ 2024-08-18 22:03             ` Paul Eggert
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Eggert @ 2024-08-18 22:03 UTC (permalink / raw)
  To: Sam James; +Cc: Andrea Corallo, emacs-devel

On 2024-08-18 14:57, Sam James wrote:
> Paul Eggert <eggert@cs.ucla.edu> writes:
> 
>> On 2024-08-18 00:39, Andrea Corallo wrote:
>>> Has this optimization any measurable effect?
>>
>> On my platform (Ubuntu 24.04 LTS, x86-64, gcc -O2) it shrinks the size
>> of read_char's machine code by 5%.
> 
> Could you include that information in the commit message in future?

Although I'll try to remember, to be honest I hadn't bothered to compute 
that number until you asked, as it was so obvious that the patch was a 
performance win (admittedly small). I don't always have the time to 
measure performance of such patches.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-18 21:39         ` Paul Eggert
  2024-08-18 21:57           ` Sam James
@ 2024-08-19 14:43           ` Andrea Corallo
  2024-08-19 15:01             ` Eli Zaretskii
  1 sibling, 1 reply; 22+ messages in thread
From: Andrea Corallo @ 2024-08-19 14:43 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, eliz, Stefan Kangas

Paul Eggert <eggert@cs.ucla.edu> writes:

> On 2024-08-18 00:39, Andrea Corallo wrote:
>> Has this optimization any measurable effect?
>
> On my platform (Ubuntu 24.04 LTS, x86-64, gcc -O2) it shrinks the size
> of read_char's machine code by 5%.

Don't know what the other maintainers think about this but FWIW I don't
like this change (and similar ones).  Our codebase is already
sufficiently tricky and convoluted, complexifying code for no observable
improvements should be IMO out of our goals.

  Andrea



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 14:43           ` Andrea Corallo
@ 2024-08-19 15:01             ` Eli Zaretskii
  2024-08-19 15:32               ` Pip Cet
  0 siblings, 1 reply; 22+ messages in thread
From: Eli Zaretskii @ 2024-08-19 15:01 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: eggert, emacs-devel, stefankangas

> From: Andrea Corallo <acorallo@gnu.org>
> Cc: emacs-devel@gnu.org, eliz@gnu.org, Stefan Kangas <stefankangas@gmail.com>
> Date: Mon, 19 Aug 2024 10:43:00 -0400
> 
> Paul Eggert <eggert@cs.ucla.edu> writes:
> 
> > On 2024-08-18 00:39, Andrea Corallo wrote:
> >> Has this optimization any measurable effect?
> >
> > On my platform (Ubuntu 24.04 LTS, x86-64, gcc -O2) it shrinks the size
> > of read_char's machine code by 5%.
> 
> Don't know what the other maintainers think about this but FWIW I don't
> like this change (and similar ones).  Our codebase is already
> sufficiently tricky and convoluted, complexifying code for no observable
> improvements should be IMO out of our goals.

I tend to agree.  Especially given that the rationale for this
juggling and why exactly we assign and re-assign these two variables
in those particular places is nowhere to be found, neither in the
source code nor in the commit log message.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 15:01             ` Eli Zaretskii
@ 2024-08-19 15:32               ` Pip Cet
  2024-08-19 15:44                 ` Eli Zaretskii
  0 siblings, 1 reply; 22+ messages in thread
From: Pip Cet @ 2024-08-19 15:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andrea Corallo, eggert, emacs-devel, stefankangas

"Eli Zaretskii" <eliz@gnu.org> writes:

>> From: Andrea Corallo <acorallo@gnu.org>
>> Cc: emacs-devel@gnu.org, eliz@gnu.org, Stefan Kangas <stefankangas@gmail.com>
>> Date: Mon, 19 Aug 2024 10:43:00 -0400
>>
>> Paul Eggert <eggert@cs.ucla.edu> writes:
>>
>> > On 2024-08-18 00:39, Andrea Corallo wrote:
>> >> Has this optimization any measurable effect?
>> >
>> > On my platform (Ubuntu 24.04 LTS, x86-64, gcc -O2) it shrinks the size
>> > of read_char's machine code by 5%.
>>
>> Don't know what the other maintainers think about this but FWIW I don't
>> like this change (and similar ones).  Our codebase is already
>> sufficiently tricky and convoluted, complexifying code for no observable
>> improvements should be IMO out of our goals.
>
> I tend to agree.  Especially given that the rationale for this
> juggling and why exactly we assign and re-assign these two variables
> in those particular places is nowhere to be found, neither in the
> source code nor in the commit log message.

Could we measure the performance impact of making all stack variables in
functions that call setjmp() volatile?

As I pointed out in the bug thread, gcc's -Wclobbered warning doesn't
actually complain about all violations of the C standard, it only
complains about registers that will definitely be clobbered in a
specific build.  As it's unpredictable which build options distributors
will use (in the case of this bug, the combination was "-flto=auto 
-fexceptions".  I don't even know whether -flto=auto results in
reproducible builds!), I think we should do one of these things:

1. make all automatic variables in setjmp()-calling functions volatile,
perhaps with one or two well-documented exceptions
2. default to -Werror=clobbered, which will break distributors' builds
rather than letting them ship broken Emacs versions (as Arch is doing).
3. adapt the GCC analyzer (-fanalyzer, find a machine with enough RAM to
run it on Emacs...) to report ALL instances of potentially clobbered
variables, as it has enough information to do so.

(3) is an "interesting" project, meaning that I looked at the analyzer
code today and have no clue where to even start doing it. (2) is
problematic because distributors will simply remove the -Werror.  So (1)
is our best choice, I'm afraid.

We can add a comment deploring how C pretends setjmp() is an ordinary
function; GCC and clang on POSIX systems don't (and can't) treat it that
way, and on Windows systems things are even more complicated.

Maybe we can even add -Wsetjmp-with-any-nonvolatiles to GCC...

Pip




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 15:32               ` Pip Cet
@ 2024-08-19 15:44                 ` Eli Zaretskii
  2024-08-19 16:01                   ` Pip Cet
  0 siblings, 1 reply; 22+ messages in thread
From: Eli Zaretskii @ 2024-08-19 15:44 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, eggert, emacs-devel, stefankangas

> Date: Mon, 19 Aug 2024 15:32:40 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Andrea Corallo <acorallo@gnu.org>, eggert@cs.ucla.edu, emacs-devel@gnu.org, stefankangas@gmail.com
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> Don't know what the other maintainers think about this but FWIW I don't
> >> like this change (and similar ones).  Our codebase is already
> >> sufficiently tricky and convoluted, complexifying code for no observable
> >> improvements should be IMO out of our goals.
> >
> > I tend to agree.  Especially given that the rationale for this
> > juggling and why exactly we assign and re-assign these two variables
> > in those particular places is nowhere to be found, neither in the
> > source code nor in the commit log message.
> 
> Could we measure the performance impact of making all stack variables in
> functions that call setjmp() volatile?

Why would that be useful?  (Many/most of them already are volatile.)

In any case, this is not what this thread is about, so let's not mix
together unrelated issues.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 15:44                 ` Eli Zaretskii
@ 2024-08-19 16:01                   ` Pip Cet
  2024-08-19 16:15                     ` Eli Zaretskii
  0 siblings, 1 reply; 22+ messages in thread
From: Pip Cet @ 2024-08-19 16:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, eggert, emacs-devel, stefankangas

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 19 Aug 2024 15:32:40 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Andrea Corallo <acorallo@gnu.org>, eggert@cs.ucla.edu, emacs-devel@gnu.org, stefankangas@gmail.com
>>
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>>
>> >> Don't know what the other maintainers think about this but FWIW I don't
>> >> like this change (and similar ones).  Our codebase is already
>> >> sufficiently tricky and convoluted, complexifying code for no observable
>> >> improvements should be IMO out of our goals.
>> >
>> > I tend to agree.  Especially given that the rationale for this
>> > juggling and why exactly we assign and re-assign these two variables
>> > in those particular places is nowhere to be found, neither in the
>> > source code nor in the commit log message.
>>
>> Could we measure the performance impact of making all stack variables in
>> functions that call setjmp() volatile?
>
> Why would that be useful?  (Many/most of them already are volatile.)

Because we will run into further setjmp()-related bugs if we don't do
it, and because we could then stop putting effort into deciding under
which delicate circumstances we can make them non-volatile (which is
fun, I understand that).

> In any case, this is not what this thread is about, so let's not mix
> together unrelated issues.

This thread is about how to deal with automatic variables in
setjmp()-calling functions, either duplicating them into a 'c_volatile'
/ 'c' pair, as Paul's patch did, or keeping a single volatile
variable. I think "let's make all of them volatile" is an obvious
simplification of that idea.

Pip




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 16:01                   ` Pip Cet
@ 2024-08-19 16:15                     ` Eli Zaretskii
  2024-08-19 18:59                       ` Paul Eggert
  2024-08-19 19:05                       ` Pip Cet
  0 siblings, 2 replies; 22+ messages in thread
From: Eli Zaretskii @ 2024-08-19 16:15 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, eggert, emacs-devel, stefankangas

> Date: Mon, 19 Aug 2024 16:01:31 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, stefankangas@gmail.com
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> > In any case, this is not what this thread is about, so let's not mix
> > together unrelated issues.
> 
> This thread is about how to deal with automatic variables in
> setjmp()-calling functions

No, it's about that particular part of what Paul installed.

> either duplicating them into a 'c_volatile'
> / 'c' pair, as Paul's patch did, or keeping a single volatile
> variable. I think "let's make all of them volatile" is an obvious
> simplification of that idea.

Why not simply undo that single patch, and be done?



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 16:15                     ` Eli Zaretskii
@ 2024-08-19 18:59                       ` Paul Eggert
  2024-08-19 19:27                         ` Eli Zaretskii
  2024-08-19 19:05                       ` Pip Cet
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2024-08-19 18:59 UTC (permalink / raw)
  To: Eli Zaretskii, Pip Cet; +Cc: acorallo, emacs-devel, stefankangas

On 2024-08-19 09:15, Eli Zaretskii wrote:
> Why not simply undo that single patch, and be done?

Let's not. The issue has long been present and is not limited to 
read_char; see, for example, the variable "this_op" in exec_byte_code, 
which is merely a volatile copy of "op". The method Emacs uses to 
address this issue works and there's no reason to use a different method 
in read_char than everywhere else.

A better way to deal with 'volatile' is to not use it, the way Stefan 
did in 2013; see commit adf2aa61404305e58e71cde0193bb650aff2c4b3.

If we wanted to do further surgery in read_char etc., I suggest 
splitting each setjmp-calling function into multiple functions: the part 
that really needs setjmp and the (hopefully) larger part that does not. 
We could then focus on minimizing the number of local variables affected 
in the functions that really need setjmp. When we're lucky the number of 
affected variables would be zero. Most likely there would still be a few 
volatile variables, which would still cause trouble (good luck taking 
their addresses, for example) but that's life and at least we'd have 
less 'volatile' than we do now.

I suspect, though, it's better to quit while we're ahead.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 16:15                     ` Eli Zaretskii
  2024-08-19 18:59                       ` Paul Eggert
@ 2024-08-19 19:05                       ` Pip Cet
  2024-08-19 19:29                         ` Eli Zaretskii
  1 sibling, 1 reply; 22+ messages in thread
From: Pip Cet @ 2024-08-19 19:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, eggert, emacs-devel, stefankangas

"Eli Zaretskii" <eliz@gnu.org> writes:

>> either duplicating them into a 'c_volatile'
>> / 'c' pair, as Paul's patch did, or keeping a single volatile
>> variable. I think "let's make all of them volatile" is an obvious
>> simplification of that idea.
>
> Why not simply undo that single patch, and be done?

Because we will run into further setjmp()-related bugs. For example, I
believe Paul's recent patches to bytecode.c subtly changed quit behavior
when exception handlers are involved, at least from the intended mode of
operation: now we reset the quit counter to the value it had when we
registered the handler, which might reduce it, and thus lead to missed
quits (though I haven't found a bytecode object that causes that bug
yet).

IOW, I think we need this patch:

diff --git a/src/bytecode.c b/src/bytecode.c
index 48a29c22d55..f261617783d 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -857,6 +857,7 @@ #define DEFINE(name, value) [name] = &&insn_ ## name,
              maybe_gc ();
              maybe_quit ();
            }
+         saved_quitcounter = quitcounter;
          pc += op;
          NEXT;
 

... or simply make quitcounter a volatile variable and be done with it.

My apologies if I'm mistaken about this.

Pip




^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 18:59                       ` Paul Eggert
@ 2024-08-19 19:27                         ` Eli Zaretskii
  0 siblings, 0 replies; 22+ messages in thread
From: Eli Zaretskii @ 2024-08-19 19:27 UTC (permalink / raw)
  To: Paul Eggert; +Cc: pipcet, acorallo, emacs-devel, stefankangas

> Date: Mon, 19 Aug 2024 11:59:28 -0700
> Cc: acorallo@gnu.org, emacs-devel@gnu.org, stefankangas@gmail.com
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> On 2024-08-19 09:15, Eli Zaretskii wrote:
> > Why not simply undo that single patch, and be done?
> 
> Let's not. The issue has long been present and is not limited to 
> read_char; see, for example, the variable "this_op" in exec_byte_code, 
> which is merely a volatile copy of "op". The method Emacs uses to 
> address this issue works and there's no reason to use a different method 
> in read_char than everywhere else.

I'm talking only about that particular change, whose motivation is
described as "Optimize access to a local volatile."  If this is just
an optimization, and its effect on performance is as you describe it,
but it has a downside of making the code harder to read and maintain,
why not pay that small penalty in performance?

> A better way to deal with 'volatile' is to not use it, the way Stefan 
> did in 2013; see commit adf2aa61404305e58e71cde0193bb650aff2c4b3.

I have nothing against 'volatile' in general.  Once again, this is
only about the "optimization" part of your changes.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 19:05                       ` Pip Cet
@ 2024-08-19 19:29                         ` Eli Zaretskii
  2024-08-19 19:43                           ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Eli Zaretskii @ 2024-08-19 19:29 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, eggert, emacs-devel, stefankangas

> Date: Mon, 19 Aug 2024 19:05:12 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, stefankangas@gmail.com
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> either duplicating them into a 'c_volatile'
> >> / 'c' pair, as Paul's patch did, or keeping a single volatile
> >> variable. I think "let's make all of them volatile" is an obvious
> >> simplification of that idea.
> >
> > Why not simply undo that single patch, and be done?
> 
> Because we will run into further setjmp()-related bugs.

What bugs? that change is described as a minor optimization, so no
bugs should be caused by reverting it.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 19:29                         ` Eli Zaretskii
@ 2024-08-19 19:43                           ` Paul Eggert
  2024-08-19 20:08                             ` Pip Cet
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2024-08-19 19:43 UTC (permalink / raw)
  To: Eli Zaretskii, Pip Cet; +Cc: acorallo, emacs-devel, stefankangas

On 2024-08-19 12:29, Pip Cet wrote:
> now we reset the quit counter to the value it had when we
> registered the handler, which might reduce it, and thus lead to missed
> quits (though I haven't found a bytecode object that causes that bug
> yet)

I don't see how that could lead to missed quits. Can you explain? (Are 
you thinking of dealing with corrupted bytecode objects? But I don't see 
it even then.)

quit_counter is merely a heuristic: it merely needs to increase if the 
current bytecode object loops forever. If I understand things correctly 
this increase should happen (albeit perhaps more slowly) regardless of 
whether your proposed patch is applied, which means the patch shouldn't 
be needed.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 19:43                           ` Paul Eggert
@ 2024-08-19 20:08                             ` Pip Cet
  2024-08-19 22:20                               ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Pip Cet @ 2024-08-19 20:08 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, acorallo, emacs-devel, stefankangas

"Paul Eggert" <eggert@cs.ucla.edu> writes:

> On 2024-08-19 12:29, Pip Cet wrote:
>> now we reset the quit counter to the value it had when we
>> registered the handler, which might reduce it, and thus lead to missed
>> quits (though I haven't found a bytecode object that causes that bug
>> yet)
>
> I don't see how that could lead to missed quits. Can you explain? (Are
> you thinking of dealing with corrupted bytecode objects? But I don't see
> it even then.)

Well, before the patch, the logic was:

quit_counter always increases, once per backward jump, except when it
becomes 0, and then we quit.

Now, I don't understand the logic, but it's complicated:
* quit_counter usually increases
* but sometimes, it's reset to a lower value, when we hit a condition
case handler
* there's no guarantee, or not a strong as there once was, that we will
call maybe_quit
* there certainly is no guarantee that we will call it once per 255
backward jumps

> quit_counter is merely a heuristic: it merely needs to increase if the
> current bytecode object loops forever. If I understand things correctly
> this increase should happen (albeit perhaps more slowly) regardless of
> whether your proposed patch is applied, which means the patch shouldn't
> be needed.

I believe you're right and the "once per 255 backward jumps" simply
becomes "once per 254 * 255 backward jumps", for compiler-generated
bytecode.  In my defense, I did say it was subtle :-)

(The rest of this email may not be interesting, feel free to skip it:

A highly contrived example here, which results in maybe_quit() being
called from exec_bytecode() with the old code, but no such call with the
new code.

Consider:

(disassemble #[65535 "\210\3011\0\0\202\013\0\302\211\241\202\010\000"
[argument wrong-type-argument 0 t] 65536])

which results in:

0:1	discard
1	constant  wrong-type-argument
2	pushconditioncase 1
5	goto	  3
8:2	constant  0
9	dup
10	setcdr
11:3	goto	  2

There's a backward jump, so this code should call quit, but now it
doesn't.  Of course, one can come up with similar examples without a
backward jump, which loop unquittably even in emacs-30, so all this is
academic at best.  I'm pretty sure behavior also changed for
self-modifying bytecode objects, but that would be cheating.)

Pip




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 20:08                             ` Pip Cet
@ 2024-08-19 22:20                               ` Paul Eggert
  2024-08-19 23:40                                 ` Pip Cet
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2024-08-19 22:20 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, acorallo, emacs-devel, stefankangas

[-- Attachment #1: Type: text/plain, Size: 609 bytes --]

On 2024-08-19 13:08, Pip Cet wrote:
> one can come up with similar examples without a
> backward jump, which loop unquittably even in emacs-30

Ouch, in that case this issue happens regardless of the recent 
'volatile'-related changes, and we should address the issue regardless 
of what we do with 'volatile'.

Would the attached patch fix the issue in master? It can be tuned 
further (and obviously needs comments); I'm just trying to see whether I 
understand the point you're making.

This patch affects behavior only if BYTE_CODE_SAFE, because as I 
understand things we trust bytecode anyway otherwise.

[-- Attachment #2: quitcounter.diff --]
[-- Type: text/x-patch, Size: 332 bytes --]

diff --git a/src/bytecode.c b/src/bytecode.c
index 48a29c22d55..c023f450d1f 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -995,6 +995,8 @@ #define DEFINE(name, value) [name] = &&insn_ ## name,
 		  }
 		pc = bytestr_data;
 		PUSH (c->val);
+		if (BYTE_CODE_SAFE)
+		  quitcounter += ! (op < 0);
 		goto op_branch;
 	      }
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: master 8c81818673a 6/7: Tune volatile in read_char
  2024-08-19 22:20                               ` Paul Eggert
@ 2024-08-19 23:40                                 ` Pip Cet
  0 siblings, 0 replies; 22+ messages in thread
From: Pip Cet @ 2024-08-19 23:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, acorallo, emacs-devel, stefankangas

"Paul Eggert" <eggert@cs.ucla.edu> writes:

> On 2024-08-19 13:08, Pip Cet wrote:
>> one can come up with similar examples without a
>> backward jump, which loop unquittably even in emacs-30
>
> Ouch, in that case this issue happens regardless of the recent
> 'volatile'-related changes, and we should address the issue regardless
> of what we do with 'volatile'.

Agreed.

> Would the attached patch fix the issue in master? It can be tuned
> further (and obviously needs comments); I'm just trying to see whether I
> understand the point you're making.

 		pc = bytestr_data;
 		PUSH (c->val);
+		if (BYTE_CODE_SAFE)
+		  quitcounter += ! (op < 0);
 		goto op_branch;
 	      }

Yes, that does work, though I had to fix a bug in my code and it now
reads:

(disassemble #[65535 "\210\3011\0\0\202\013\0\302\211\241\202\010\000"
[argument (wrong-type-argument) 0 t] 65536])

But, as you say, it could be simplified:

 		pc = bytestr_data;
 		PUSH (c->val);
+		if (BYTE_CODE_SAFE)
+		  quitcounter++;
 		goto op_branch;
 	      }

'op' is relative to the beginning of the byte string, so it should never
be less than 0.  If it is, and BYTE_CODE_SAFE is true, we will call
emacs_abort in op_branch.

> This patch affects behavior only if BYTE_CODE_SAFE, because as I
> understand things we trust bytecode anyway otherwise.

(That reminds me of a bug that I hunted down in which it turned out gcc
helpfully generated different code depending on whether -g (not -Og) was
specified or not.  I would prefer to err on the side of making things
reproducible and unconditionally increment quitcounter.)

By the way, I was very surprised that 'signal_or_quit', by way of
'Fmemq', calls 'maybe_quit'.  I hope that cannot result in infinite
recursion.

Pip




^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-08-19 23:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <172386820621.30556.15409337288904485218@vcs2.savannah.gnu.org>
     [not found] ` <20240817041648.A6687C2BC66@vcs2.savannah.gnu.org>
2024-08-17 15:04   ` master 8c81818673a 6/7: Tune volatile in read_char Andrea Corallo
2024-08-17 17:03     ` Pip Cet
2024-08-17 18:16       ` Paul Eggert
2024-08-17 18:17     ` Paul Eggert
2024-08-18  7:39       ` Andrea Corallo
2024-08-18 21:39         ` Paul Eggert
2024-08-18 21:57           ` Sam James
2024-08-18 22:03             ` Paul Eggert
2024-08-19 14:43           ` Andrea Corallo
2024-08-19 15:01             ` Eli Zaretskii
2024-08-19 15:32               ` Pip Cet
2024-08-19 15:44                 ` Eli Zaretskii
2024-08-19 16:01                   ` Pip Cet
2024-08-19 16:15                     ` Eli Zaretskii
2024-08-19 18:59                       ` Paul Eggert
2024-08-19 19:27                         ` Eli Zaretskii
2024-08-19 19:05                       ` Pip Cet
2024-08-19 19:29                         ` Eli Zaretskii
2024-08-19 19:43                           ` Paul Eggert
2024-08-19 20:08                             ` Pip Cet
2024-08-19 22:20                               ` Paul Eggert
2024-08-19 23:40                                 ` Pip Cet

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).