unofficial mirror of emacs-devel@gnu.org 
* Docstring hack
@ 2022-07-30 12:14 Lynn Winebarger
  2022-07-30 12:25 ` Po Lu
  2022-07-30 13:28 ` Eli Zaretskii
  0 siblings, 2 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-30 12:14 UTC (permalink / raw)
  To: emacs-devel

The core emacs lisp libraries are riddled with strings that are erroneously
treated as docstrings in dump mode, which causes problems in the build
when, say, format gets a 0 as its template string in a macro expansion.
There are a few possible fixes I see, I'm not sure which is most likely to
be accepted.
1) Change every non-docstring that starts with an explicit escaped newline
to start with \n instead (there are a lot of them)
2) Change read_literal_string in lread.c to respect the setting of the
dynamic-docstring setting the way the byte compiler does, and change all
the lisp files not in loadup.el to set it to nil
3) Like 2, but make the default setting of dynamic-docstring nil and either
set it as a local variable in the files in loadup, or set it in loadup when
dump-mode is set
4) Make a special read syntax for literal docstrings, e.g. #", and do away
with the weird context-sensitive semantics of ordinary string literals
altogether.

Also, the test in read_literal_string should probably be for "will_dump_p"
rather than the purify flag, since it's the dumping that prompts the
deferral of docstring loading, not the identification of constants.

Any preferences?

Lynn


* Re: Docstring hack
  2022-07-30 12:14 Docstring hack Lynn Winebarger
@ 2022-07-30 12:25 ` Po Lu
  2022-07-30 12:50   ` Lynn Winebarger
  2022-07-30 13:28 ` Eli Zaretskii
  1 sibling, 1 reply; 33+ messages in thread
From: Po Lu @ 2022-07-30 12:25 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: emacs-devel

Lynn Winebarger <owinebar@gmail.com> writes:

> The core emacs lisp libraries are riddled with strings that are erroneously treated as docstrings in dump mode, which causes problems in the build when, say, format gets
> a 0 as its template string in a macro expansion.  
> There are a few possible fixes I see, I'm not sure which is most likely to be accepted.
> 1) Change every non-docstring that starts with an explicit escaped newline to start with \n instead (there are a lot of them)
> 2) Change read_literal_string in lread.c to respect the setting of the dynamic-docstring setting the way the byte compiler does, and change all the lisp files not in
> loadup.el to set it to nil
> 3) Like 2, but make the default setting of dynamic-docstring nil and either set it as a local variable in the files in loadup, or set it in loadup when dump-mode is set
> 4) Make a special read syntax for literal docstrings, e.g. #", and do away with the weird context-sensitive semantics of ordinary string literals altogether.
>
> Also, the test in read_literal_string should probably be for "will_dump_p" rather than the purify flag, since it's the dumping that prompts the deferral of docstring
> loading, not the identification of constants.
>
> Any preferences?

None in particular, except that option 4 is unacceptable as it is not
compatible with older code, and is completely different from all other
Lisp implementations.

Thanks.




* Re: Docstring hack
  2022-07-30 12:25 ` Po Lu
@ 2022-07-30 12:50   ` Lynn Winebarger
  2022-07-30 13:04     ` Lynn Winebarger
  2022-07-30 13:32     ` Po Lu
  0 siblings, 2 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-30 12:50 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

On Sat, Jul 30, 2022, 8:25 AM Po Lu <luangruo@yahoo.com> wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
> > 4) Make a special read syntax for literal docstrings, e.g. #", and do
> > away with the weird context-sensitive semantics of ordinary string literals
> > altogether.
> >
> > Also, the test in read_literal_string should probably be for
> > "will_dump_p" rather than the purify flag, since it's the dumping that
> > prompts the deferral of docstring
> > loading, not the identification of constants.
> >
> > Any preferences?
>
> None in particular, except that option 4 is unacceptable as it is not
> compatible with older code, and is completely different from all other
> Lisp implementations.
>

Not compatible in what sense?
I'm not that familiar with lisp implementations - isn't Emacs's treatment
of a leading escaped literal newline already completely different?  Is
there a typical use of #" as a reader macro?  It's "undefined" according to
https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node191.html

Thanks,
Lynn


* Re: Docstring hack
  2022-07-30 12:50   ` Lynn Winebarger
@ 2022-07-30 13:04     ` Lynn Winebarger
  2022-07-30 13:32     ` Po Lu
  1 sibling, 0 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-30 13:04 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

On Sat, Jul 30, 2022, 8:50 AM Lynn Winebarger <owinebar@gmail.com> wrote:

> On Sat, Jul 30, 2022, 8:25 AM Po Lu <luangruo@yahoo.com> wrote:
>
>> Lynn Winebarger <owinebar@gmail.com> writes:
>> > 4) Make a special read syntax for literal docstrings, e.g. #", and do
>> > away with the weird context-sensitive semantics of ordinary string literals
>> > altogether.
>> >
>> > Also, the test in read_literal_string should probably be for
>> > "will_dump_p" rather than the purify flag, since it's the dumping that
>> > prompts the deferral of docstring
>> > loading, not the identification of constants.
>> >
>> > Any preferences?
>>
>> None in particular, except that option 4 is unacceptable as it is not
>> compatible with older code, and is completely different from all other
>> Lisp implementations.
>>
>
> Not compatible in what sense?
> I'm not that familiar with lisp implementations - isn't Emacs's treatment
> of a leading escaped literal newline already completely different?  Is
> there a typical use of #" as a reader macro?  It's "undefined" according to
> https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node191.html
>


Also, I'm only talking about the treatment of docstrings by the reader.
The byte compiler has much more accurate identification of docstrings that
can be relied on when using the bootstrap emacs to byte compile files to be
pre-loaded.
Another approach would be to eliminate the special treatment by the reader
altogether to the extent it only saves space in the bootstrap dump, then
rely on lazy loading from byte-compiled files for docstrings in lisp
files.  That might require evicting those docstrings during the first
post-bootstrap dump, or just eliminating them from the function symbol
property during the bootstrap dump, since who is using the help system in
the bootstrap emacs?

Lynn


* Re: Docstring hack
  2022-07-30 12:14 Docstring hack Lynn Winebarger
  2022-07-30 12:25 ` Po Lu
@ 2022-07-30 13:28 ` Eli Zaretskii
  2022-07-30 13:36   ` Po Lu
  1 sibling, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-30 13:28 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Sat, 30 Jul 2022 08:14:10 -0400
> 
> The core emacs lisp libraries are riddled with strings that are erroneously treated as docstrings in dump
> mode, which causes problems in the build when, say, format gets a 0 as its template string in a macro
> expansion.  

Please show several examples, as I don't think I understand the issue
well enough to have an opinion.

Thanks.




* Re: Docstring hack
  2022-07-30 12:50   ` Lynn Winebarger
  2022-07-30 13:04     ` Lynn Winebarger
@ 2022-07-30 13:32     ` Po Lu
  1 sibling, 0 replies; 33+ messages in thread
From: Po Lu @ 2022-07-30 13:32 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: emacs-devel

Lynn Winebarger <owinebar@gmail.com> writes:

> I'm not that familiar with lisp implementations - isn't Emacs's
> treatment of a leading escaped literal newline already completely
> different?  Is there a typical use of #" as a reader macro?

All other Lisps I know of use a regular string as a doc string, without
any special read syntax.

There is no special treatment of an escaped newline either -- AFAIK we
only apply that upon dumping, which is not a feature that is supposed to
be visible to user-level Lisp code.




* Re: Docstring hack
  2022-07-30 13:28 ` Eli Zaretskii
@ 2022-07-30 13:36   ` Po Lu
  2022-07-30 15:11     ` Eli Zaretskii
  2022-07-31  7:52     ` Stefan Monnier
  0 siblings, 2 replies; 33+ messages in thread
From: Po Lu @ 2022-07-30 13:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Lynn Winebarger, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Lynn Winebarger <owinebar@gmail.com>
>> Date: Sat, 30 Jul 2022 08:14:10 -0400
>> 
>> The core emacs lisp libraries are riddled with strings that are erroneously treated as docstrings in dump
>> mode, which causes problems in the build when, say, format gets a 0 as its template string in a macro
>> expansion.  
>
> Please show several examples, as I don't think I understand the issue
> well enough to have an opinion.
>
> Thanks.

I think the problem is this:

  (format <control string starting with escaped newline> args...)

will result in (format 0 args...) during dumping.

To be honest, I don't see the importance of the issue.  We only have to
make sure such code never exists in code loaded prior to dumping.
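
A minimal C sketch of the failure mode described above.  Everything in it
is hypothetical (toy_value, toy_read_string and toy_format_accepts are
made-up names, not Emacs internals); it only models the idea that the
reader hands back the integer 0 where the caller expected a string.  Unlike
the real reader, the toy keeps the leading backslash-newline in the string
it returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy tagged value: either a string or a fixnum.  This is a stand-in
   for illustration, not Emacs's Lisp_Object.  */
typedef enum { TOY_STRING, TOY_FIXNUM } toy_tag;

typedef struct
{
  toy_tag tag;
  const char *str;  /* meaningful when tag == TOY_STRING */
  long num;         /* meaningful when tag == TOY_FIXNUM */
} toy_value;

/* "Read" a string literal whose text between the quotes is BODY.
   When PURIFYING is true and BODY starts with backslash-newline,
   return the fixnum 0 instead of the string.  */
static toy_value
toy_read_string (const char *body, bool purifying)
{
  assert (body != NULL);
  toy_value v;
  if (purifying && body[0] == '\\' && body[1] == '\n')
    {
      v.tag = TOY_FIXNUM;
      v.str = NULL;
      v.num = 0;
      return v;
    }
  v.tag = TOY_STRING;
  v.str = body;
  v.num = 0;
  return v;
}

/* Stand-in for the type check a format-like function applies to its
   control argument: only strings are accepted.  */
static bool
toy_format_accepts (toy_value control)
{
  return control.tag == TOY_STRING;
}
```

With purifying set, a control string that begins with backslash-newline
comes back as 0 and the format-like check rejects it; the same literal
read with purifying unset is accepted.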




* Re: Docstring hack
  2022-07-30 13:36   ` Po Lu
@ 2022-07-30 15:11     ` Eli Zaretskii
  2022-07-30 15:38       ` Lynn Winebarger
  2022-07-31  0:52       ` Po Lu
  2022-07-31  7:52     ` Stefan Monnier
  1 sibling, 2 replies; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-30 15:11 UTC (permalink / raw)
  To: Po Lu; +Cc: owinebar, emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: Lynn Winebarger <owinebar@gmail.com>,  emacs-devel@gnu.org
> Date: Sat, 30 Jul 2022 21:36:46 +0800
> 
> I think the problem is this:
> 
>   (format <control string starting with escaped newline> args...)
> 
> will result in (format 0 args...) during dumping.

"Result" in what sense?




* Re: Docstring hack
  2022-07-30 15:11     ` Eli Zaretskii
@ 2022-07-30 15:38       ` Lynn Winebarger
  2022-07-30 15:44         ` Eli Zaretskii
  2022-07-31  0:52       ` Po Lu
  1 sibling, 1 reply; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-30 15:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Po Lu, emacs-devel

On Sat, Jul 30, 2022, 11:11 AM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Po Lu <luangruo@yahoo.com>
> > Cc: Lynn Winebarger <owinebar@gmail.com>,  emacs-devel@gnu.org
> > Date: Sat, 30 Jul 2022 21:36:46 +0800
> >
> > I think the problem is this:
> >
> >   (format <control string starting with escaped newline> args...)
> >
> > will result in (format 0 args...) during dumping.
>
> "Result" in what sense?
>

As in, if you load emacs-lisp/eieio-core in site-load.el with dump-mode
pdump without having byte-compiled eieio-core first, then load
cedet/semantic/loaddefs.el, you will get a cryptic error message stating
stringp: 0 is not a string.  And upon investigation, the closure for
def-eieio-autoload mysteriously has a "(format 0 cname)" in the code, even
though it's not 0 in the eieio-core source code.

Lynn


* Re: Docstring hack
  2022-07-30 15:38       ` Lynn Winebarger
@ 2022-07-30 15:44         ` Eli Zaretskii
  2022-07-30 16:32           ` Lynn Winebarger
  0 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-30 15:44 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: luangruo, emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Sat, 30 Jul 2022 11:38:13 -0400
> Cc: Po Lu <luangruo@yahoo.com>, emacs-devel <emacs-devel@gnu.org>
> 
>  >   (format <control string starting with escaped newline> args...)
>  > 
>  > will result in (format 0 args...) during dumping.
> 
>  "Result" in what sense?
> 
> As in, if you load emacs-lisp/eieio-core in site-load.el with dump-mode pdump without having byte-compiled
> eieio-core first, then load cedet/semantic/loaddefs.el, you will get a cryptic error message stating stringp: 0 is
> not a string.  And upon investigation, the closure for def-eieio-autoload mysteriously has a "(format 0
> cname)" in the code, even though it's not 0 in the eieio-core source code.

Does this happen with any package we actually preload via loadup.el?




* Re: Docstring hack
  2022-07-30 15:44         ` Eli Zaretskii
@ 2022-07-30 16:32           ` Lynn Winebarger
  2022-07-30 16:43             ` Eli Zaretskii
  0 siblings, 1 reply; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-30 16:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, emacs-devel

On Sat, Jul 30, 2022, 11:44 AM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Lynn Winebarger <owinebar@gmail.com>
> > Date: Sat, 30 Jul 2022 11:38:13 -0400
> > Cc: Po Lu <luangruo@yahoo.com>, emacs-devel <emacs-devel@gnu.org>
> >
> >  >   (format <control string starting with escaped newline> args...)
> >  >
> >  > will result in (format 0 args...) during dumping.
> >
> >  "Result" in what sense?
> >
> > As in, if you load emacs-lisp/eieio-core in site-load.el with dump-mode
> > pdump without having byte-compiled
> > eieio-core first, then load cedet/semantic/loaddefs.el, you will get a
> > cryptic error message stating stringp: 0 is
> > not a string.  And upon investigation, the closure for
> > def-eieio-autoload mysteriously has a "(format 0
> > cname)" in the code, even though it's not 0 in the eieio-core source
> > code.
>
> Does this happen with any package we actually preload via loadup.el?
>

If you're not going to support using site-load except in the most trivial
of ways, then you should say that in the documentation.  I don't think it's
unreasonable for a user to expect to be able to load a core emacs library
in a file provided for loading additional libraries without it blowing up
in their face.  Being cryptic in your documentation is a poor substitute
for just explicitly stating that it won't work for most of the core
libraries (unless they are "arranged to be" pre-compiled), and no attempt
to facilitate such customization will be supported.  Why even include the
option in the official distribution?  OS vendors are perfectly capable of
maintaining patches that modify the build process to customize it to their
system.

I'm asking because I'm trying to use this supposed feature on a close to
stock source distribution.  I've only been changing the lisp libraries
because I did not want to debug C code just to preload some stock
libraries.  To the extent a small set of bugs in the C run-time and my
ignorance of the process were responsible, I'll gladly dump the changes to
the lisp code and make the few small changes to the C and the construction
of my site-load file.  I'd prefer to make those changes in a way that would
be most likely to be accepted to the core (assuming I can get cleared by my
employer), so that I don't have to maintain these bug fixes locally
indefinitely.  And it doesn't hurt that other users might benefit from a
usable site configuration in the build process.
But if you're not going to be willing to take up such bug fixes on some
general principle (that I don't understand in a free software project),
that would be useful to know.
And yes, I consider segfaults and runaway allocation during the execution
of lisp code (not due to the semantics of the lisp code) to be bugs even if
it is during the build process.  Especially when that behaviour is from
loading stock libraries from the distribution itself, and site-init and
site-load are documented features.
Your inference that a user who doesn't utilize these documented features
will not be impacted by bugs in these features is presumably correct.
Personally, I am going to proceed with the last option (the fifth) of
having the interpreter do the replacement of a documentation string by 0 at
the time it's attempted to be stored as documentation during a bootstrap
dump, rather than at read time.  That seems a lot cleaner and robust to me
than either the current approach or my other ideas.  It would have the
additional benefit of removing the dependency of etc/DOC on any source lisp
files.  If that functions well, and I'm cleared to assign the rights, I
hope it would be considered for inclusion.
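
A rough C sketch of that fifth option, under loud assumptions: the names
toy_function and toy_define are invented for illustration and do not
correspond to anything in the Emacs sources.  The point is only that the
reader always yields the string, and the placeholder is substituted at
the moment a string is stored in a documentation slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy function object.  A NULL doc pointer models the integer 0
   placeholder that would later be resolved against etc/DOC.  */
typedef struct
{
  const char *name;
  const char *doc;             /* documentation slot */
  const char *format_control;  /* an ordinary string used in the body */
} toy_function;

/* Define a function.  Only the documentation slot is subject to the
   replacement, and only when BOOTSTRAP_DUMP is true; ordinary strings
   in the body are stored untouched whatever their first characters.  */
static toy_function
toy_define (const char *name, const char *doc,
            const char *control, bool bootstrap_dump)
{
  assert (name != NULL);
  toy_function f;
  f.name = name;
  f.doc = bootstrap_dump ? NULL : doc;
  f.format_control = control;
  return f;
}
```

Because the decision is made where the string is used rather than where
it is read, a format template beginning with backslash-newline survives
a bootstrap dump in this model.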

Lynn


* Re: Docstring hack
  2022-07-30 16:32           ` Lynn Winebarger
@ 2022-07-30 16:43             ` Eli Zaretskii
  2022-07-31  2:17               ` Po Lu
  2022-07-31 11:57               ` Lynn Winebarger
  0 siblings, 2 replies; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-30 16:43 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: luangruo, emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Sat, 30 Jul 2022 12:32:27 -0400
> Cc: luangruo@yahoo.com, emacs-devel <emacs-devel@gnu.org>
> 
>  Does this happen with any package we actually preload via loadup.el?
> 
> If you're not going to support using site-load except in the most trivial of ways, then you should say that in the
> documentation.

Did I say that we are not going to support this?

I asked an informative question, with the purpose of figuring out how
urgent it is for us to fix this issue.  Do I really deserve a cold
shower for asking such questions?

> But if you're not going to be willing to take up such bug fixes on some general principle (that I don't
> understand in a free software project), that would be useful to know.

I didn't say that, either.  And I couldn't really give any specific
answer to such a general question, because it depends on what kinds of
bugs and in which parts of the code you will report.  I could only
answer such question on a case by case basis.

> Personally, I am going to proceed with the last option (the fifth)

TBH I don't really understand your 5 alternatives, because you are
alluding to symbols I cannot find in the sources: I see neither
read_literal_string nor dynamic-docstring.  Are those typos or am I
blind?




* Re: Docstring hack
  2022-07-30 15:11     ` Eli Zaretskii
  2022-07-30 15:38       ` Lynn Winebarger
@ 2022-07-31  0:52       ` Po Lu
  1 sibling, 0 replies; 33+ messages in thread
From: Po Lu @ 2022-07-31  0:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: owinebar, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Po Lu <luangruo@yahoo.com>
>> Cc: Lynn Winebarger <owinebar@gmail.com>,  emacs-devel@gnu.org
>> Date: Sat, 30 Jul 2022 21:36:46 +0800
>> 
>> I think the problem is this:
>> 
>>   (format <control string starting with escaped newline> args...)
>> 
>> will result in (format 0 args...) during dumping.
>
> "Result" in what sense?

In that form being read instead of the form with the string.




* Re: Docstring hack
  2022-07-30 16:43             ` Eli Zaretskii
@ 2022-07-31  2:17               ` Po Lu
  2022-07-31  6:27                 ` Eli Zaretskii
  2022-07-31 11:57               ` Lynn Winebarger
  1 sibling, 1 reply; 33+ messages in thread
From: Po Lu @ 2022-07-31  2:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Lynn Winebarger, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> TBH I don't really understand your 5 alternatives, because you are
> alluding to symbols I cannot find in the sources: I see neither
> read_literal_string nor dynamic-docstring.  Are those typos or am I
> blind?

I think he means read_string_literal and
byte-compile-dynamic-docstrings.




* Re: Docstring hack
  2022-07-31  2:17               ` Po Lu
@ 2022-07-31  6:27                 ` Eli Zaretskii
  2022-07-31  7:24                   ` Po Lu
  2022-07-31  8:03                   ` Stefan Monnier
  0 siblings, 2 replies; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-31  6:27 UTC (permalink / raw)
  To: Po Lu; +Cc: owinebar, emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: Lynn Winebarger <owinebar@gmail.com>,  emacs-devel@gnu.org
> Date: Sun, 31 Jul 2022 10:17:54 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > TBH I don't really understand your 5 alternatives, because you are
> > alluding to symbols I cannot find in the sources: I see neither
> > read_literal_string nor dynamic-docstring.  Are those typos or am I
> > blind?
> 
> I think he means read_string_literal and
> byte-compile-dynamic-docstrings.

OK, but I still lack some glue to understand the issue.  Specifically:

  . the OP said "strings that are erroneously treated as docstrings in
    dump mode" -- where's the code which makes that mistake, and how
    is read_literal_string related to that mistake?
  . why isn't there an alternative to fix read_literal_string not to
    generate zero instead of the format template? the other
    alternatives all look like partial kludges to me




* Re: Docstring hack
  2022-07-31  6:27                 ` Eli Zaretskii
@ 2022-07-31  7:24                   ` Po Lu
  2022-07-31  7:56                     ` Eli Zaretskii
  2022-07-31  8:03                   ` Stefan Monnier
  1 sibling, 1 reply; 33+ messages in thread
From: Po Lu @ 2022-07-31  7:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: owinebar, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> OK, but I still lack some glue to understand the issue.  Specifically:
>
>   . the OP said "strings that are erroneously treated as docstrings in
>     dump mode" -- where's the code which makes that mistake, and how
>     is read_literal_string related to that mistake?

He's referring to this part of lread.c:

  /* If purifying, and string starts with \ newline,
     return zero instead.  This is for doc strings
     that we are really going to find in etc/DOC.nn.nn.  */
  if (!NILP (Vpurify_flag) && NILP (Vdoc_file_name) && cancel)
    {
      unbind_to (count, Qnil);
      return make_fixnum (0);
    }

>   . why isn't there an alternative to fix read_literal_string not to
>     generate zero instead of the format template? the other
>     alternatives all look like partial kludges to me

I can't answer this question, sorry.  You'll have to ask the OP.
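
To make the condition in that excerpt easier to reason about, here is a
small stand-alone C model of it.  The struct and function names are
hypothetical, and the meaning of `cancel' is taken from the comment in
the excerpt (the string starts with backslash-newline), not from reading
the rest of lread.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for two of the inputs of the quoted test.  */
typedef struct
{
  bool purify_flag_set;    /* models !NILP (Vpurify_flag)   */
  bool doc_file_name_set;  /* models !NILP (Vdoc_file_name) */
} toy_reader_state;

/* Models `cancel': true when the text between the quotes begins
   with a backslash immediately followed by a newline.  */
static bool
toy_cancel (const char *literal_body)
{
  assert (literal_body != NULL);
  return literal_body[0] == '\\' && literal_body[1] == '\n';
}

/* True when the modelled reader would return 0 instead of the string.
   Note that nothing here asks whether the literal is a docstring.  */
static bool
toy_returns_zero (toy_reader_state st, const char *literal_body)
{
  return st.purify_flag_set
         && !st.doc_file_name_set
         && toy_cancel (literal_body);
}
```

The model shows why clearing either the purify flag or the "no DOC file
yet" condition is enough to keep the string.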




* Re: Docstring hack
  2022-07-30 13:36   ` Po Lu
  2022-07-30 15:11     ` Eli Zaretskii
@ 2022-07-31  7:52     ` Stefan Monnier
  1 sibling, 0 replies; 33+ messages in thread
From: Stefan Monnier @ 2022-07-31  7:52 UTC (permalink / raw)
  To: Po Lu; +Cc: Eli Zaretskii, Lynn Winebarger, emacs-devel

> I think the problem is this:
>
>   (format <control string starting with escaped newline> args...)
>
> will result in (format 0 args...) during dumping.

Ah, I see.  This is a hack I'd like to get rid of, indeed.

AFAIK nowadays it's only useful for those docstrings found in
lisp/loaddefs.el since it's the only Lisp file we still scrape for
docstrings (to put them in etc/DOC).

[ E.g. If we were to byte-compile `lisp/loaddefs.el`, then it wouldn't be used
  at all any more, I believe.  We have loose plans to do that, but
  someone needs to dig into it to see what that breaks and how to fix it.  ]

IOW we can tighten the test in `read_string_literal` to replace strings
with 0 only in this one specific case.
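
One way such a tightened test might look, as a stand-alone C sketch.
This is an assumption about the shape of the fix, not a patch: the
function names are invented, and matching on the path suffix
"lisp/loaddefs.el" is just one possible way to identify the file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if PATH names lisp/loaddefs.el, i.e. it ends with that suffix
   and the suffix starts at a directory boundary.  */
static bool
toy_is_lisp_loaddefs (const char *path)
{
  assert (path != NULL);
  static const char suffix[] = "lisp/loaddefs.el";
  size_t n = strlen (path);
  size_t m = sizeof suffix - 1;
  if (n < m)
    return false;
  if (strcmp (path + (n - m), suffix) != 0)
    return false;
  return n == m || path[n - m - 1] == '/';
}

/* The existing three conditions, plus the extra file-name check.  */
static bool
toy_should_drop_string (bool purifying, bool doc_file_known,
                        bool cancel, const char *path)
{
  return purifying && !doc_file_known && cancel
         && toy_is_lisp_loaddefs (path);
}
```

Under this sketch a backslash-newline string in eieio-core.el or in
cedet/semantic/loaddefs.el is read as a normal string even while
purifying.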


        Stefan





* Re: Docstring hack
  2022-07-31  7:24                   ` Po Lu
@ 2022-07-31  7:56                     ` Eli Zaretskii
  2022-07-31  8:48                       ` Po Lu
  2022-07-31 12:53                       ` Lynn Winebarger
  0 siblings, 2 replies; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-31  7:56 UTC (permalink / raw)
  To: Po Lu; +Cc: owinebar, emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: owinebar@gmail.com,  emacs-devel@gnu.org
> Date: Sun, 31 Jul 2022 15:24:13 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > OK, but I still lack some glue to understand the issue.  Specifically:
> >
> >   . the OP said "strings that are erroneously treated as docstrings in
> >     dump mode" -- where's the code which makes that mistake, and how
> >     is read_literal_string related to that mistake?
> 
> He's referring to this part of lread.c:
> 
>   /* If purifying, and string starts with \ newline,
>      return zero instead.  This is for doc strings
>      that we are really going to find in etc/DOC.nn.nn.  */
>   if (!NILP (Vpurify_flag) && NILP (Vdoc_file_name) && cancel)
>     {
>       unbind_to (count, Qnil);
>       return make_fixnum (0);
>     }

Does this mean that just resetting purify-flag is enough to avoid the
problem?  If so, I think purify-flag is only meant for preloaded
packages, and dumping Emacs with additional packages isn't supposed to
set that flag.  Or maybe loadup.el should load an additional file
(beyond site-load and site-init), after it resets purify-flag?

An alternative is, of course, to make that code in lread.c smarter in
detecting doc strings and applying that handling only to doc strings.

> >   . why isn't there an alternative to fix read_literal_string not to
> >     generate zero instead of the format template? the other
> >     alternatives all look like partial kludges to me
> 
> I can't answer this question, sorry.  You'll have to ask the OP.

I did: the OP was on the CC list.

Thanks.




* Re: Docstring hack
  2022-07-31  6:27                 ` Eli Zaretskii
  2022-07-31  7:24                   ` Po Lu
@ 2022-07-31  8:03                   ` Stefan Monnier
  2022-07-31 12:43                     ` Lynn Winebarger
  1 sibling, 1 reply; 33+ messages in thread
From: Stefan Monnier @ 2022-07-31  8:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Po Lu, owinebar, emacs-devel

> OK, but I still lack some glue to understand the issue.  Specifically:
>
>   . the OP said "strings that are erroneously treated as docstrings in
>     dump mode" -- where's the code which makes that mistake, and how
>     is read_literal_string related to that mistake?
>   . why isn't there an alternative to fix read_literal_string not to
>     generate zero instead of the format template? the other
>     alternatives all look like partial kludges to me

In `read_string_literal` there is a hack that dates back to Emacs's
early life where we drop the string we just read and return the
0 literal instead.  We do that for those strings we think are docstrings
that will be found in etc/DOC and will be re-provided later when we call
`Snarf-documentation` (which should then replace those 0 literals with
appropriate integers pointing into etc/DOC).

The reason for this hack is to avoid allocating the string in the heap
(or worse, the purespace) since it's to be found lazily in
etc/DOC instead.

But we don't have a sure-fire way to recognize those strings, so we use
a convention that they start with "double-quote backslash newline" (this
same convention is then used in `make-docfile` in order to find those
strings).  But some non-preloaded files also use "double-quote backslash
newline" for other reasons, such as in `eieio-defclass-autoload`.

Not sure why it's a problem for Lynn, tho: he should not try to preload
`eieio-core.el` but only `eieio-core.elc` where the problem should not
appear any more.  But as noted elsewhere in Lynn's saga, the way we
currently handle `site-load.el`, those site-loaded files are also
preloaded in the `bootstrap-emacs.pdmp` (hence in their non-compiled
form), which is a bad idea.  We should fix our handling of
`site-load.el` so it's only loaded in the "final" dump after the
site-loaded files have been byte-compiled.
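
To illustrate why the convention alone cannot tell the two kinds of
string apart, here is a deliberately naive C scanner.  It is not how
`make-docfile` is written; toy_count_marked_strings is a made-up name,
and the scan does not even track whether a double quote opens or closes
a string.

```c
#include <assert.h>
#include <stddef.h>

/* Count occurrences of the three-character sequence
   double-quote, backslash, newline in SRC.  */
static size_t
toy_count_marked_strings (const char *src)
{
  assert (src != NULL);
  size_t count = 0;
  for (const char *p = src; *p != '\0'; p++)
    {
      /* Short-circuit evaluation keeps every read inside the
         NUL-terminated buffer.  */
      if (p[0] == '"' && p[1] == '\\' && p[2] == '\n')
        count++;
    }
  return count;
}
```

Given a definition whose docstring and whose format template both start
with backslash-newline, the scanner reports two matches and has no way
to say which one is the docstring.  Writing the template with a `\n`
escape instead (two characters, backslash and n) avoids the match.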


        Stefan





* Re: Docstring hack
  2022-07-31  7:56                     ` Eli Zaretskii
@ 2022-07-31  8:48                       ` Po Lu
  2022-07-31  9:14                         ` Lars Ingebrigtsen
  2022-08-04  4:12                         ` Lynn Winebarger
  2022-07-31 12:53                       ` Lynn Winebarger
  1 sibling, 2 replies; 33+ messages in thread
From: Po Lu @ 2022-07-31  8:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: owinebar, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Does this mean that just resetting purify-flag is enough to avoid the
> problem?  If so, I think purify-flag is only meant for preloaded
> packages, and dumping Emacs with additional packages isn't supposed to
> set that flag.  Or maybe loadup.el should load an additional file
> (beyond site-load and site-init), after it resets purify-flag?

Maybe this would work? (This question is also partly intended for Lynn)

diff --git a/lisp/loadup.el b/lisp/loadup.el
index 21a87dbd77..e81eccb58e 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -387,7 +387,8 @@
 ;; you may load them with a "site-load.el" file.
 ;; But you must also cause them to be scanned when the DOC file
 ;; is generated.
-(let ((lp load-path))
+(let ((lp load-path)
+      (purify-flag nil))
   (load "site-load" t)
   ;; We reset load-path after dumping.
   ;; For a permanent change in load-path, use configure's


> An alternative is, of course, to make that code in lread.c smarter in
> detecting doc strings and applying that handling only to doc strings.

Or perhaps what Stefan said about applying the kludge only to
loaddefs.el.




* Re: Docstring hack
  2022-07-31  8:48                       ` Po Lu
@ 2022-07-31  9:14                         ` Lars Ingebrigtsen
  2022-07-31 22:31                           ` Stefan Monnier
  2022-08-04  4:12                         ` Lynn Winebarger
  1 sibling, 1 reply; 33+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-31  9:14 UTC (permalink / raw)
  To: Po Lu; +Cc: Eli Zaretskii, owinebar, emacs-devel

Po Lu <luangruo@yahoo.com> writes:

> Or perhaps what Stefan said about applying the kludge only to
> loaddefs.el.

I've had another look at byte-compiling loaddefs.el (which would allow
us to get rid of the kludge altogether, along with ditching make-doc for
this stuff (see bug#53024)), and I think we're closer to allowing that
to happen than I originally thought.

There's a handful of warnings due to things being referred to before
being defined, but that's pretty easy to fix.  The only practical
problem is really that #$ is byte-compiled into nil, which we have to
find a solution for.

I'll go ahead and fix some of the warnings, but does anybody see an
obvious fix for the #$ problem?





* Re: Docstring hack
  2022-07-30 16:43             ` Eli Zaretskii
  2022-07-31  2:17               ` Po Lu
@ 2022-07-31 11:57               ` Lynn Winebarger
  1 sibling, 0 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-31 11:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: luangruo, emacs-devel

On Sat, Jul 30, 2022, 12:43 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Lynn Winebarger <owinebar@gmail.com>
> > Date: Sat, 30 Jul 2022 12:32:27 -0400
> > Cc: luangruo@yahoo.com, emacs-devel <emacs-devel@gnu.org>
> >
> >  Does this happen with any package we actually preload via loadup.el?
> >
> > If you're not going to support using site-load except in the most
> > trivial of ways, then you should say that in the documentation.
>
> Did I say that we are not going to support this?
>
> I asked an informative question, with the purpose of figuring out how
> urgent it is for us to fix this issue.  Do I really deserve a cold
> shower for asking such questions?
>

I'm sorry, I misread the exchange with Po and the response to my other
question as being dismissive, and I see I was incorrect.
I just spent way too many hours stepping through code looking for some
weird memory corruption issue before discovering the location of the hack
in the source.  I had actually looked for it before in lread.c, but for
some reason did not find it, since I had seen references to this kind of
issue in the documentation.
Thanks for the attention to the matter - I believe others have answered
these questions, I'll respond to those.

Lynn


* Re: Docstring hack
  2022-07-31  8:03                   ` Stefan Monnier
@ 2022-07-31 12:43                     ` Lynn Winebarger
  2022-07-31 21:32                       ` Stefan Monnier
  0 siblings, 1 reply; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-31 12:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Po Lu, emacs-devel

On Sun, Jul 31, 2022, 4:03 AM Stefan Monnier <monnier@iro.umontreal.ca>
wrote:

> > OK, but I still lack some glue to understand the issue.  Specifically:
> >
> >   . the OP said "strings that are erroneously treated as docstrings in
> >     dump mode" -- where's the code which makes that mistake, and how
> >     is read_literal_string related to that mistake?
> >   . why isn't there an alternative to fix read_literal_string not to
> >     generate zero instead of the format template? the other
> >     alternatives all look like partial kludges to me

The first 4 yes, but the fifth one was to remove that and just modify the
evaluator to not set those properties during dump mode.
But if the only need for this feature is in loading loaddefs.el in dump
mode, then that is overkill.  I don't understand why the docstrings are
even being extracted for the byte compiled files at all, since they would
be lazy loaded anyway.  Then you could remove the files from lisp.mk from
the dependencies of DOC in the Makefile and just leave loaddefs and the
docstrings from C source files.

> In `read_literal_string` there is a hack that dates back to Emacs's
> early life where we drop the string we just read and return the
> 0 literal instead.  We do that for those strings we think are docstrings
> that will be found in etc/DOC and will be re-provided later when we call
> `Snarf-documentation` (which should then replace those 0 literals with
> appropriate integers pointing into etc/DOC).
>
> The reason for this hack is to avoid allocating the string in the heap
> (or worse, the purespace) since it's to be found lazily in
> etc/DOC instead.
>
> But we don't have a sure-fire way to recognize those strings, so we use
> a convention that they start with "double-quote backslash newline" (this
> same convention is then used in `make-docfile` in order to find those
> strings).  But some non-preloaded files also use "double-quote backslash
> newline" for other reasons, such as in `eieio-defclass-autoload`.
>

They are everywhere in fact, not just as arguments to format.
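
To make the convention concrete, here is a minimal standalone sketch in C.
It is not the code in lread.c; the function name is made up for
illustration, and it only models the "double-quote backslash newline" test
described above, applied to raw source text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the reader's heuristic.  LITERAL points at
   the opening double quote of a string literal in Lisp source text.
   Returns true when the literal begins with double-quote, backslash,
   newline, which is all the convention looks at.  */
static bool
starts_with_escaped_newline (const char *literal)
{
  /* Short-circuit evaluation keeps every index inside the string:
     each test only runs if the previous character was not NUL.  */
  return literal != NULL
         && literal[0] == '"'
         && literal[1] == '\\'
         && literal[2] == '\n';
}
```

The check cannot tell a docstring from a format template that happens to
start the same way, so both are flagged.  A literal that starts with the
two characters \n (backslash, letter n) is not flagged, which is why
rewriting such strings sidesteps the problem.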


> Not sure why it's a problem for Lynn, tho: he should not try to preload
> `eieio-core.el` but only `eieio-core.elc` where the problem should not
> appear any more.


This is the conundrum of trying to do anything significant in site-load
without Makefile support.  If you're bootstrapping, then none of those
files are compiled.  So I put in a check to only load after the bootstrap
during the dump. But nothing is byte compiled at this point other than the
files in loadup.  Trying to do the byte compile from within site-load after
the bootstrap but during the dump requires preloading the source of all
dependencies, because require and autoload hit the panic button in dump
mode.

I finally just wrote a shell script yesterday that I could invoke from
site-load to add the compiled versions of the libraries loaded in site-load
to lisp.mk (so they will get added to the list of preloaded libraries in
the final dump), then invoke make on them.  This can cause some issues
since a significant number of them require loaddefs to be loaded.  I might
be better off just adding the extra libraries directly to the environment
variable the compiler references (to determine preloaded libraries).  I
don't know if that will make a difference in determining coherence for
dumping during native-compilation.  In any case, when the shell script
calls make, it needs to arrange for the byte compiler to load loaddefs.

Lynn


* Re: Docstring hack
  2022-07-31  7:56                     ` Eli Zaretskii
  2022-07-31  8:48                       ` Po Lu
@ 2022-07-31 12:53                       ` Lynn Winebarger
  2022-07-31 13:05                         ` Eli Zaretskii
  1 sibling, 1 reply; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-31 12:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Po Lu, emacs-devel

On Sun, Jul 31, 2022, 3:56 AM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Po Lu <luangruo@yahoo.com>
> > Cc: owinebar@gmail.com,  emacs-devel@gnu.org
> > Date: Sun, 31 Jul 2022 15:24:13 +0800
> >
> > Eli Zaretskii <eliz@gnu.org> writes:
> >
> > > OK, but I still lack some glue to understand the issue.  Specifically:
> > >
> > >   . the OP said "strings that are erroneously treated as docstrings in
> > >     dump mode" -- where's the code which makes that mistake, and how
> > >     is read_literal_string related to that mistake?
> >
> > He's referring to this part of lread.c:
> >
> >   /* If purifying, and string starts with \ newline,
> >      return zero instead.  This is for doc strings
> >      that we are really going to find in etc/DOC.nn.nn.  */
> >   if (!NILP (Vpurify_flag) && NILP (Vdoc_file_name) && cancel)
> >     {
> >       unbind_to (count, Qnil);
> >       return make_fixnum (0);
> >     }
>
> Does this mean that just resetting purify-flag is enough to avoid the
> problem?  If so, I think purify-flag is only meant for preloaded
> packages, and dumping Emacs with additional packages isn't supposed to
> set that flag.  Or maybe loadup.el should load an additional file
> (beyond site-load and site-init), after it resets purify-flag?
>

I'm not sure why you'd not want to use the purify flag, since there are a
lot of explicit calls to purecopy that appear intended to take advantage of
hash consing.  I don't know why that benefit should not apply to libraries
being preloaded by site-load and site-init.


> An alternative is, of course, to make that code in lread.c smarter in
> detecting doc strings and applying that handling only to doc strings.
>
> > >   . why isn't there an alternative to fix read_literal_string not to
> > >     generate zero instead of the format template? the other
> > >     alternatives all look like partial kludges to me
> >
> > I can't answer this question, sorry.  You'll have to ask the OP.
>
> I did: the OP was on the CC list.
>

My fifth alternative was to implement the substitution of 0 directly in the
evaluator semantics, so that it would only record 0 for actual docstrings
identified while evaluating source code forms during dump mode.
That seems like overkill if this is only required for loaddefs.  But the
manual should probably state that files loaded in site-load and site-init
need to be byte compiled for their docstrings to be dynamically loaded.

Lynn


* Re: Docstring hack
  2022-07-31 12:53                       ` Lynn Winebarger
@ 2022-07-31 13:05                         ` Eli Zaretskii
  2022-07-31 20:29                           ` Lynn Winebarger
  0 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2022-07-31 13:05 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: luangruo, emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Sun, 31 Jul 2022 08:53:57 -0400
> Cc: Po Lu <luangruo@yahoo.com>, emacs-devel <emacs-devel@gnu.org>
> 
>  Does this mean that just resetting purify-flag is enough to avoid the
>  problem?  If so, I think purify-flag is only meant for preloaded
>  packages, and dumping Emacs with additional packages isn't supposed to
>  set that flag.  Or maybe loadup.el should load an additional file
>  (beyond site-load and site-init), after it resets purify-flag?
> 
> I'm not sure why you'd not want to use the purify flag, since there are a lot of explicit calls to purecopy that
> appear intended to take advantage of hash consing.  I don't know why that benefit should not apply to
> libraries being preloaded by site-load and site-init.

We will remove pure space at some not-too-distant future, which is a
clear sign that it is not very important in the pdumper builds.  So
investing time in something that works when purify-flag is off doesn't
sound like a good investment to me.




* Re: Docstring hack
  2022-07-31 13:05                         ` Eli Zaretskii
@ 2022-07-31 20:29                           ` Lynn Winebarger
  2022-08-01  1:05                             ` Po Lu
  2022-08-01 11:07                             ` Eli Zaretskii
  0 siblings, 2 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-07-31 20:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Po Lu, emacs-devel

On Sun, Jul 31, 2022, 9:06 AM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Lynn Winebarger <owinebar@gmail.com>
> > Date: Sun, 31 Jul 2022 08:53:57 -0400
> > Cc: Po Lu <luangruo@yahoo.com>, emacs-devel <emacs-devel@gnu.org>
> >
> >  Does this mean that just resetting purify-flag is enough to avoid the
> >  problem?  If so, I think purify-flag is only meant for preloaded
> >  packages, and dumping Emacs with additional packages isn't supposed to
> >  set that flag.  Or maybe loadup.el should load an additional file
> >  (beyond site-load and site-init), after it resets purify-flag?
> >
> > I'm not sure why you'd not want to use the purify flag, since there are
> > a lot of explicit calls to purecopy that appear intended to take
> > advantage of hash consing.  I don't know why that benefit should not
> > apply to libraries being preloaded by site-load and site-init.
>
> We will remove pure space at some not-too-distant future, which is a
> clear sign that it is not very important in the pdumper builds.  So
> investing time in something that works when purify-flag is off doesn't
> sound like a good investment to me.
>

I see your point.  I'm not that familiar with the policy on doing
maintenance releases on older versions (27 and 28 in particular).  Is pure
space elimination going to be incorporated in those as well?
Asking because AFAIK, 28.1 is only released as an rpm for "rolling release"
type distributions at this point, while (admittedly near EOL) installations
of RHEL 7.x still provide 24.3 as the "latest release" (for that antiquated
version).

Lynn


* Re: Docstring hack
  2022-07-31 12:43                     ` Lynn Winebarger
@ 2022-07-31 21:32                       ` Stefan Monnier
  2022-08-02 16:55                         ` Lynn Winebarger
  0 siblings, 1 reply; 33+ messages in thread
From: Stefan Monnier @ 2022-07-31 21:32 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Eli Zaretskii, Po Lu, emacs-devel

> mode, then that is overkill.  I don't understand why the docstrings are
> even being extracted for the byte compiled files at all, since they would
> be lazy loaded anyway.  Then you could remove the files from lisp.mk from
> the dependencies of DOC in the Makefile and just leave loaddefs and the
> docstrings from C source files.

That's indeed what we do in Emacs-29.

> This is the conundrum of trying to do anything significant in site-load
> without Makefile support.  If you're bootstrapping, then none of those
> files are compiled.  So I put in a check to only load after the bootstrap
> during the dump. But nothing is byte compiled at this point other than the
> files in loadup.  Trying to do the byte compile from within site-load after
> the bootstrap but during the dump requires pre loading the source of all
> dependencies, because require and autoload hit the panic button in dump
> mode.

Loading `site-load.el` in the first dump is a bad idea because files
haven't been compiled yet.

Loading `site-load.el` in the second dump is a bad idea because:
- as currently written, the site-loaded files aren't compiled, so it's
  too early to dump them.
- if you change the build to byte-compile them before the second dump,
  you'll be byte-compiling them with the bootstrap-emacs which might
  work but will lead to a slower compilation.

So, I suggest something like:

    mv lisp/site-load.el lisp/my-site-load.el
    make
    rm src/emacs
    mv lisp/my-site-load.el  lisp/site-load.el
    mv lisp/my-site-load.elc lisp/site-load.elc
    make

So the first 2 dumps are "normal" without any site-loaded files, and
that's followed by a 3rd dump, where all the ELisp files are already
byte-compiled.


        Stefan





* Re: Docstring hack
  2022-07-31  9:14                         ` Lars Ingebrigtsen
@ 2022-07-31 22:31                           ` Stefan Monnier
  2022-08-01 10:38                             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 33+ messages in thread
From: Stefan Monnier @ 2022-07-31 22:31 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Po Lu, Eli Zaretskii, owinebar, emacs-devel

> There's a handful of warnings due to things being referred to before
> being defined, but that's pretty easy to fix.  The only practical
> problem is really that #$ is byte-compiled into nil, which we have to
> find a solution for.

`#$` doesn't seem to be used in `lisp/loaddefs.el`, so that shouldn't be
a problem.


        Stefan





* Re: Docstring hack
  2022-07-31 20:29                           ` Lynn Winebarger
@ 2022-08-01  1:05                             ` Po Lu
  2022-08-01 11:07                             ` Eli Zaretskii
  1 sibling, 0 replies; 33+ messages in thread
From: Po Lu @ 2022-08-01  1:05 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Eli Zaretskii, emacs-devel

Lynn Winebarger <owinebar@gmail.com> writes:

> I see your point.  I'm not that familiar with the policy on doing
> maintenance releases on older versions (27 and 28 in particular).  Is
> pure space elimination going to be incorporated in those as well?

No.  We avoid any changes to the release branch that are not fixing
regressions or build failures, and even in that case many changes are
considered too "risky".




* Re: Docstring hack
  2022-07-31 22:31                           ` Stefan Monnier
@ 2022-08-01 10:38                             ` Lars Ingebrigtsen
  0 siblings, 0 replies; 33+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-01 10:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Po Lu, Eli Zaretskii, owinebar, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> There's a handful of warnings due to things being referred to before
>> being defined, but that's pretty easy to fix.  The only practical
>> problem is really that #$ is byte-compiled into nil, which we have to
>> find a solution for.
>
> `#$` doesn't seem to be used in `lisp/loaddefs.el`, so that shouldn't be
> a problem.

Yes, I was thinking of the package loaddefs, which all use #$.  But of
course, we don't need to byte-compile those even if we're byte-compiling
the in-tree loaddefs files, so it's not really relevant.





* Re: Docstring hack
  2022-07-31 20:29                           ` Lynn Winebarger
  2022-08-01  1:05                             ` Po Lu
@ 2022-08-01 11:07                             ` Eli Zaretskii
  1 sibling, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2022-08-01 11:07 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: luangruo, emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Sun, 31 Jul 2022 16:29:43 -0400
> Cc: Po Lu <luangruo@yahoo.com>, emacs-devel <emacs-devel@gnu.org>
> 
>  We will remove pure space at some not-too-distant future, which is a
>  clear sign that it is not very important in the pdumper builds.  So
>  investing time in something that works when purify-flag is off doesn't
>  sound like a good investment to me.
> 
> I see your point.  I'm not that familiar with the policy on doing maintenance releases on older versions (27
> and 28 in particular).  Is pure space elimination going to be incorporated in those as well?

No, we are not going to backport such significant changes.




* Re: Docstring hack
  2022-07-31 21:32                       ` Stefan Monnier
@ 2022-08-02 16:55                         ` Lynn Winebarger
  0 siblings, 0 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-08-02 16:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Po Lu, emacs-devel

On Sun, Jul 31, 2022 at 5:32 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> Loading `site-load.el` in the first dump is a bad idea because files
> haven't been compiled yet.
>
> Loading `site-load.el` in the second dump is a bad idea because:
> - as currently written, the site-loaded files aren't compiled, so it's
>   too early to dump them.
> - if you change the build to byte-compile them before the second dump,
>   you'll be byte-compiling them with the bootstrap-emacs which might
>   work but will lead to a slower compilation.
>
> So, I suggest something like:
>
>     mv lisp/site-load.el lisp/my-site-load.el
>     make
>     rm src/emacs
>     mv lisp/my-site-load.el  lisp/site-load.el
>     mv lisp/my-site-load.elc lisp/site-load.elc
>     make
>
> So the first 2 dumps are "normal" without any site-loaded files, and
> that's followed by a 3rd dump, where all the ELisp files are already
> byte-compiled.

I have now successfully dumped (byte-compiled only) an additional 511
core emacs files by calling a shell script during the current
second-stage dump to make the actual second stage dump and build the
required elc files, as well as finder-inf and cus-load.
The bad news is, when I just turn off Vpurify_flag at points in the
site-load to avoid a segfault or other problem (and then re-enable it
after the problem libraries), site-load will finish without error, but
the actual dumping process will fail for some reason I haven't
debugged.
The good news is that I was able to resolve the remaining problems
with a handful of changes to the C code in alloc and lread, and
changes to a couple of elisp files.
* alloc - put in a hash table of objects that have been purified
  during a call to Fpurecopy, so cycles are not followed.  Also changed
  the "small_amount" in pure_alloc to 20000 and printed a message on
  every allocation going over, since I can't rely on the process
  actually finishing if many megabytes of additional pure space are
  required.
* lread - put in a docstring-hack flag used as an additional AND
  condition for the hack, so I could turn that off in site-load
  without changing any variables used by the existing code.
* easy-mmode - easy-mmode-defmap produces a defconst for a keymap
  variable, which I changed to defvar.
* tramp-sh - changed the defconsts for the
  tramp-completion-function-alist-* variables to defvar, since for some
  reason the values are getting modified, probably during
  tramp-startup-hook.
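
For readers unfamiliar with the cycle problem in the alloc item, a
minimal sketch of the idea in C follows.  It is not the patch itself and
does not use Emacs's Lisp_Object or Fpurecopy: the `cell` type, the
`seen` table and `copy_cell` are all hypothetical, and a fixed-size
linear table stands in for the hash table mentioned above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy cons cell; NULL plays the role of nil.  Not the Emacs Lisp_Object.  */
struct cell
{
  int value;
  struct cell *car;   /* nested structure, may be NULL */
  struct cell *cdr;   /* rest of the list, may be NULL */
};

/* Table of originals already copied, mapping each one to its copy.
   A linear array keeps the sketch short; the real thing would hash.  */
struct seen
{
  const struct cell *from[64];
  struct cell *to[64];
  size_t count;
};

static struct cell *
copy_cell (const struct cell *c, struct seen *seen)
{
  if (c == NULL)
    return NULL;
  for (size_t i = 0; i < seen->count; i++)
    if (seen->from[i] == c)
      return seen->to[i];   /* already visited: reuse, do not recurse */
  if (seen->count == 64)
    abort ();               /* sketch only: fixed-capacity table */
  struct cell *copy = malloc (sizeof *copy);
  if (copy == NULL)
    abort ();
  /* Record the mapping before descending, so that a cycle leading back
     to C finds the entry and terminates.  */
  seen->from[seen->count] = c;
  seen->to[seen->count] = copy;
  seen->count++;
  copy->value = c->value;
  copy->car = copy_cell (c->car, seen);
  copy->cdr = copy_cell (c->cdr, seen);
  return copy;
}
```

The important detail is the order: the original is entered in the table
before its car and cdr are copied.  Without that, a structure that points
back at itself would recurse until the stack runs out.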

I am going to try native-compilation again.  I can't see why I was
getting an "incoherent" library message, now that I understand the
loadup code that sets the compilation unit file name before the dump.
So I've made the table of compilation units registered by the
load_compilation_unit visible in lisp.  That way I can compare the
method used by loadup, which depends on subrs in compilation units
being accessible through function symbol values, in the site-load file
directly.  Or just do the conversion directly based on that table,
which should guarantee any NCU encountered by pdumper has the file
field set "coherently".

Given the elimination of pure space will not be back ported to 27 and
28, and the problem with pdump when the purify flag is turned off
while loading files, I think some of these changes (or similar)
should be included in maintenance releases of those two major versions,
so there will be some way for users stuck with those versions to
effectively dump significant chunks of core emacs beyond what's in
loadup.

Lynn




* Re: Docstring hack
  2022-07-31  8:48                       ` Po Lu
  2022-07-31  9:14                         ` Lars Ingebrigtsen
@ 2022-08-04  4:12                         ` Lynn Winebarger
  1 sibling, 0 replies; 33+ messages in thread
From: Lynn Winebarger @ 2022-08-04  4:12 UTC (permalink / raw)
  To: Po Lu; +Cc: Eli Zaretskii, emacs-devel

On Sun, Jul 31, 2022, 4:48 AM Po Lu <luangruo@yahoo.com> wrote:

> Eli Zaretskii <eliz@gnu.org> writes:
> Maybe this would work? (This question is also partly intended for Lynn)
>
> diff --git a/lisp/loadup.el b/lisp/loadup.el
> index 21a87dbd77..e81eccb58e 100644
> --- a/lisp/loadup.el
> +++ b/lisp/loadup.el
> @@ -387,7 +387,8 @@
>  ;; you may load them with a "site-load.el" file.
>  ;; But you must also cause them to be scanned when the DOC file
>  ;; is generated.
> -(let ((lp load-path))
> +(let ((lp load-path)
> +      (purify-flag nil))
>    (load "site-load" t)
>    ;; We reset load-path after dumping.
>    ;; For a permanent change in load-path, use configure's


When I tried switching the purify-flag on and off selectively in site-load
itself, it resulted in pdumper never completing.
It was easier to just add a variable dedicated solely to turning the hack
off that has no other dependencies.  Keep in mind I'm only worried about
28.x for this exercise.
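
Roughly, the dedicated variable just becomes one more term in the
condition from the lread.c fragment quoted earlier in the thread
(`!NILP (Vpurify_flag) && NILP (Vdoc_file_name) && cancel`).  A
standalone sketch in C, with plain booleans standing in for the Lisp
variables; the struct, its field names and the function name are
assumptions for illustration, not the actual change:

```c
#include <stdbool.h>

/* Hypothetical model of the state the reader consults.  */
struct reader_state
{
  bool purify_flag;      /* stands in for !NILP (Vpurify_flag) */
  bool doc_file_known;   /* stands in for !NILP (Vdoc_file_name) */
  bool docstring_hack;   /* the extra, dedicated switch (name assumed) */
};

/* CANCEL is true when the literal started with backslash-newline.
   Returns true when the reader would hand back 0 instead of the string.  */
static bool
should_return_zero (const struct reader_state *st, bool cancel)
{
  return st->purify_flag
         && !st->doc_file_known
         && cancel
         && st->docstring_hack;
}
```

With this shape, site-load can clear the one switch around the problem
libraries while purify-flag keeps its value, so nothing else that keys off
purify-flag changes behaviour.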

Lynn


end of thread, other threads:[~2022-08-04  4:12 UTC | newest]

Thread overview: 33+ messages
2022-07-30 12:14 Docstring hack Lynn Winebarger
2022-07-30 12:25 ` Po Lu
2022-07-30 12:50   ` Lynn Winebarger
2022-07-30 13:04     ` Lynn Winebarger
2022-07-30 13:32     ` Po Lu
2022-07-30 13:28 ` Eli Zaretskii
2022-07-30 13:36   ` Po Lu
2022-07-30 15:11     ` Eli Zaretskii
2022-07-30 15:38       ` Lynn Winebarger
2022-07-30 15:44         ` Eli Zaretskii
2022-07-30 16:32           ` Lynn Winebarger
2022-07-30 16:43             ` Eli Zaretskii
2022-07-31  2:17               ` Po Lu
2022-07-31  6:27                 ` Eli Zaretskii
2022-07-31  7:24                   ` Po Lu
2022-07-31  7:56                     ` Eli Zaretskii
2022-07-31  8:48                       ` Po Lu
2022-07-31  9:14                         ` Lars Ingebrigtsen
2022-07-31 22:31                           ` Stefan Monnier
2022-08-01 10:38                             ` Lars Ingebrigtsen
2022-08-04  4:12                         ` Lynn Winebarger
2022-07-31 12:53                       ` Lynn Winebarger
2022-07-31 13:05                         ` Eli Zaretskii
2022-07-31 20:29                           ` Lynn Winebarger
2022-08-01  1:05                             ` Po Lu
2022-08-01 11:07                             ` Eli Zaretskii
2022-07-31  8:03                   ` Stefan Monnier
2022-07-31 12:43                     ` Lynn Winebarger
2022-07-31 21:32                       ` Stefan Monnier
2022-08-02 16:55                         ` Lynn Winebarger
2022-07-31 11:57               ` Lynn Winebarger
2022-07-31  0:52       ` Po Lu
2022-07-31  7:52     ` Stefan Monnier

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs.git
