unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#27748: 26.0.50; doc strings should be in DOC file
@ 2017-07-18  6:47 Ken Raeburn
  2017-08-06  0:09 ` npostavs
  2017-08-20 22:05 ` npostavs
  0 siblings, 2 replies; 15+ messages in thread
From: Ken Raeburn @ 2017-07-18  6:47 UTC
  To: 27748

There are a bunch of doc strings from the preloaded Lisp files that do not wind up in the DOC file.  Presumably this means they wind up in the emacs executable image itself, saved away as extra string objects that GC needs to track.  (In the big-elc-file work Stefan started and I'm experimenting with, these doc strings wind up in the dumped Lisp environment file, and need to be parsed and saved away at load time.)

1. defcustom doc strings from files compiled with lexical binding.

   For example, files.el (lexical bindings) defines
   delete-auto-save-files but it doesn't show up in the DOC file;
   files.elc starts with an initial byte-code blob which includes the
   symbol delete-auto-save-files and its doc string in the constants
   array.

   On the other hand, custom.el (dynamic bindings) declares
   custom-theme-directory, the .elc file dumps out the doc string in a
   #@... block before a separate top-level call to
   custom-declare-variable, and since this is what make-docfile looks
   for, the doc string winds up in the DOC file.

2. In isearch, CL macro expansion results in symbols like
   isearch--state-forward--cmacro having function definitions that
   include the two doc strings "Access slot \"forward\" of
   `(isearch--state (:constructor nil) (:copier nil) (:constructor
   isearch--get-state (&aux (string isearch-string) (message
   isearch-message) (point (point)) (success isearch-success) (forward
   isearch-forward) (other-end isearch-other-end) (word
   isearch-regexp-function) (error isearch-error) (wrapped
   isearch-wrapped) (barrier isearch-barrier) (case-fold-search
   isearch-case-fold-search) (pop-fun (if isearch-push-state-function
   (funcall isearch-push-state-function))))))' struct CL-X." and
   "\n\n(fn CL-WHOLE-ARG CL-X)".  The former string is also in DOC (with
   "\n\n(fn CL-X)" appended) as the documentation for the function
   isearch--state-forward.

   It appears that, as far as the Emacs help system is concerned,
   isearch--state-*--cmacro functions are undocumented.

   The --cmacro functions generate cl-block forms that include the
   original doc string, since it's treated as part of the body, but it's
   not clear to me whether it has any use there at all, or if it could
   just be discarded, or perhaps looked up via the non --cmacro symbols
   at run time if it is of use.

3. Undocumented functions, strange as it sounds... in files.el, function
   file-name-non-special is defined with no documentation.  In
   files.elc, the bytecode object constructed has a doc string "\n\n(fn
   OPERATION &rest ARGUMENTS)" instead of a (#$ . NNN) reference to a
   separate string that make-docfile can pick up.
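
For reference, a minimal sketch in C of how a "#@N text^_" block of the kind make-docfile looks for could be read.  It assumes N counts the leading space and the trailing ^_ (\037) as well as the text, so the doc string itself is N-2 bytes long.  parse_dynamic_doc is a hypothetical helper for illustration, not a function in make-docfile.c, and it works on an in-memory string rather than a FILE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from make-docfile.c): parse one "#@N text^_"
   block starting at SRC and copy the doc string into OUT.
   Returns the doc string length, or -1 if the block is malformed
   or does not fit in OUT.  */
static long
parse_dynamic_doc (const char *src, char *out, size_t outsz)
{
  assert (src && out);
  size_t avail = strlen (src);
  if (avail < 2 || src[0] != '#' || src[1] != '@')
    return -1;

  /* Read the decimal length N, with an arbitrary sanity bound.  */
  size_t pos = 2;
  long n = 0;
  while (pos < avail && src[pos] >= '0' && src[pos] <= '9')
    {
      n = n * 10 + (src[pos++] - '0');
      if (n > 100000000L)
        return -1;
    }
  if (n <= 1 || pos + (size_t) n > avail)
    return -1;

  /* Assumed layout: one space, then the text, then a closing ^_.  */
  if (src[pos] != ' ' || src[pos + (size_t) n - 1] != '\037')
    return -1;

  size_t len = (size_t) n - 2;
  if (len >= outsz)
    return -1;
  memcpy (out, src + pos + 1, len);
  out[len] = '\0';
  return (long) len;
}
```

A caller would pass the text starting at the "#" and a buffer for the result.  A doc string that lives only in a byte-code constants vector, as in case 1 above, never reaches a scanner like this.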

To “reproduce”…

Delete or “chmod 0” the DOC file before starting Emacs.  Use ielm to look at the symbol property lists of delete-auto-save-files and custom-theme-directory; the former has a string for its variable-documentation property, and the latter has a number.  Look at the function definitions of isearch--state-forward--cmacro and file-name-non-special and see the doc strings in them.  Examine the .elc files and DOC.






In GNU Emacs 26.0.50 (build 2, x86_64-apple-darwin15.6.0, NS appkit-1404.47 Version 10.11.6 (Build 15G1217))
 of 2017-07-18 built on bang.local
Repository revision: 0083123499cc29e301c197218d3809b225675e57
Windowing system distributor 'Apple', version 10.3.1404
Recent messages:
Loading ~/lib/elisp/sue.elc...done
Loading desktop...done
Warning: desktop file appears to be in use by PID 83193.
Using it may cause conflicts.  Use it anyway? (y or n) n
Desktop file in use; not loaded.
For information about GNU Emacs and the GNU system, type C-h C-a.

Configured features:
RSVG IMAGEMAGICK DBUS NOTIFY ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS
NS

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  desktop-save-mode: t
  global-hi-lock-mode: t
  hi-lock-mode: t
  which-function-mode: t
  icomplete-mode: t
  shell-dirtrack-mode: t
  display-time-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr warnings emacsbug message subr-x puny dired
dired-loaddefs rfc822 mml mml-sec epa derived epg gnus-util rmail
rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader add-log desktop frameset cus-start
cus-load kr-defs hi-lock which-func imenu icomplete iso-transl
smart-quotes easy-mmode tramp tramp-compat tramp-loaddefs trampver shell
pcomplete comint ansi-color ring parse-time format-spec advice cc-mode
cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars
cc-defs server time sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils finder-inf package easymenu epg-config
url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache url-vars seq byte-opt gv bytecomp
byte-compile cconv cl-loaddefs cl-lib time-date tooltip eldoc electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel term/ns-win ns-win
ucs-normalize mule-util term/common-win tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode
lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote dbusbind kqueue cocoa ns multi-tty make-network-process emacs)

Memory information:
((conses 16 261240 12410)
 (symbols 48 26239 2)
 (miscs 40 57 285)
 (strings 32 47062 2197)
 (string-bytes 1 1448035)
 (vectors 16 42658)
 (vector-slots 8 784892 12266)
 (floats 8 70 90)
 (intervals 56 241 0)
 (buffers 992 12))







* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-07-18  6:47 bug#27748: 26.0.50; doc strings should be in DOC file Ken Raeburn
@ 2017-08-06  0:09 ` npostavs
  2017-08-08  1:03   ` npostavs
  2017-08-20 22:05 ` npostavs
  1 sibling, 1 reply; 15+ messages in thread
From: npostavs @ 2017-08-06  0:09 UTC
  To: Ken Raeburn; +Cc: 27748

[-- Attachment #1: Type: text/plain, Size: 4671 bytes --]

Ken Raeburn <raeburn@raeburn.org> writes:

>
> 1. defcustom doc strings from files compiled with lexical binding.
>
>    For example, files.el (lexical bindings) defines
>    delete-auto-save-files but it doesn't show up in the DOC file;
>    files.elc starts with an initial byte-code blob which includes the
>    symbol delete-auto-save-files and its doc string in the constants
>    array.

Actually, it's not only about lexical binding.  The following file:

    ;;; -*- lexical-binding: nil -*-

    (defcustom custom-foo nil
      "a custom variable"
      :type 'boolean
      :group 'foo-group)

    ;; (defun foo ()
    ;;   t)

    (defcustom custom-bar nil
      "another custom variable"
      :type 'boolean
      :group 'foo-group)

produces (with prologue removed, and reformatted for readability)

    (byte-code "\300\301\302\303\304\305\306\307&\a\210\300\310\302\311\304\305\306\307&\a\207"
               [custom-declare-variable custom-foo nil "a custom variable" :type boolean :group foo-group
                                        custom-bar "another custom variable"]
               8)

Uncommenting the (defun foo...) produces:

    #@19 a custom variable\x1f
    (custom-declare-variable 'custom-foo nil '(#$ . 411) :type 'boolean :group 'foo-group)
    (defalias 'foo #[nil "\300\207" [t] 1])
    #@25 another custom variable\x1f
    (custom-declare-variable 'custom-bar nil '(#$ . 562) :type 'boolean :group 'foo-group)

Then changing to lexical binding produces:

    (byte-code "\300\301\302\303\304DD\305\306\307\310\311&\a\207"
               [custom-declare-variable
                custom-foo funcall function
                #[0 "\300\207" [nil] 1] "a custom variable"
                :type boolean :group foo-group]
               8)
    (defalias 'foo #[0 "\300\207" [t] 1])
    (byte-code "\300\301\302\303\304DD\305\306\307\310\311&\a\207"
               [custom-declare-variable
                custom-bar funcall function
                #[0 "\300\207" [nil] 1]
                "another custom variable" :type boolean :group foo-group]
               8)

As far as I can tell, the problem is that the
byte-compile-dynamic-docstrings feature (that's the #@19 thing) relies
on `byte-compile-out-toplevel' to decompile "trivial" functions back
into source code.  Having lexical binding set, or two defcustoms in a
row, produces "non-trivial" code which is not decompiled, and is
therefore not recognized in byte-compile-output-file-form as something
which should be passed to byte-compile-output-docform.

    (defun byte-compile-out-toplevel (&optional for-effect output-type)
      ...
      ;; Decompile trivial functions:
      ...
        (cond
         ;; #### This should be split out into byte-compile-nontrivial-function-p.
         ((or (eq output-type 'lambda)
          (nthcdr (if (eq output-type 'file) 50 8) byte-compile-output)
          ...
            (while
                    (cond
                     ((memq (car (car rest)) '(byte-varref byte-constant))
                     ...
                     ((and maycall
                           ;; Allow a funcall if at most one atom follows it.
                      ...
                      (setq maycall nil)	; Only allow one real function call.
                      ...
                      (or (eq output-type 'file)
                          (not (delq nil (mapcar 'consp (cdr (car body))))))))
            ...
        (list 'byte-code (byte-compile-lapcode byte-compile-output)
              byte-compile-vector byte-compile-maxdepth)))
         ;; it's a trivial function
         ((cdr body) (cons 'progn (nreverse body)))
         ((car body))))

    (defun byte-compile-output-file-form (form)
      ;; Write the given form to the output buffer, being careful of docstrings
      ;; in defvar, defvaralias, defconst, autoload and
      ;; custom-declare-variable because make-docfile is so amazingly stupid.
      ...
        (if (and (memq (car-safe form) '(defvar defvaralias defconst
                                          autoload custom-declare-variable))
                 (stringp (nth 3 form)))
            (byte-compile-output-docform nil nil '("\n(" 3 ")") form nil
                                         (memq (car form)
                                               '(defvaralias autoload
                                                  custom-declare-variable)))
          (princ "\n" byte-compile--outbuffer)
          (prin1 form byte-compile--outbuffer)
          nil)))

The following patch prevents custom-declare-variable from being compiled
and lets the docstrings get printed properly.  Probably needs a bit more
refinement though.


[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 981 bytes --]

From 4cb45936966de76f91b95971c886599a24361c5b Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 5 Aug 2017 20:02:19 -0400
Subject: [PATCH] * lisp/custom.el (custom-declare-variable): Don't compile
 (Bug#27748).

---
 lisp/custom.el | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lisp/custom.el b/lisp/custom.el
index ecfa34db5b..5876d3fd56 100644
--- a/lisp/custom.el
+++ b/lisp/custom.el
@@ -144,6 +144,9 @@ (defun custom-declare-variable (symbol default doc &rest args)
 DEFAULT is stored as SYMBOL's standard value, in SYMBOL's property
 `standard-value'.  At the same time, SYMBOL's property `force-value' is
 set to nil, as the value is no longer rogue."
+  (declare (compiler-macro
+            (lambda (form)
+              `(eval ',form lexical-binding))))
   (put symbol 'standard-value (purecopy (list default)))
   ;; Maybe this option was rogue in an earlier version.  It no longer is.
   (when (get symbol 'force-value)
-- 
2.11.1



* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-08-06  0:09 ` npostavs
@ 2017-08-08  1:03   ` npostavs
  2017-08-13 18:04     ` npostavs
  0 siblings, 1 reply; 15+ messages in thread
From: npostavs @ 2017-08-08  1:03 UTC
  To: Ken Raeburn; +Cc: 27748

npostavs@users.sourceforge.net writes:

> The following patch prevents custom-declare-variable from being compiled
> and lets the docstrings get printed properly.  Probably needs a bit more
> refinement though.

Hmm, this approach might not work at all, it causes a bazillion "free
variable reference" warnings, one for each reference to a defcustom
variable.






* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-08-08  1:03   ` npostavs
@ 2017-08-13 18:04     ` npostavs
  2019-06-24 22:38       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 15+ messages in thread
From: npostavs @ 2017-08-13 18:04 UTC
  To: Ken Raeburn; +Cc: 27748

[-- Attachment #1: Type: text/plain, Size: 660 bytes --]

npostavs@users.sourceforge.net writes:

> npostavs@users.sourceforge.net writes:
>
>> The following patch prevents custom-declare-variable from being compiled
>> and lets the docstrings get printed properly.  Probably needs a bit more
>> refinement though.
>
> Hmm, this approach might not work at all, it causes a bazillion "free
> variable reference" warnings, one for each reference to a defcustom
> variable.

Okay, here is an alternate approach which decouples docstring
production from decompilation.  Note that this isn't finished yet: I
only implemented it for defcustom, so applying this patch currently
prevents make-docfile from finishing successfully.
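
The patch below also factors the quoting of special characters out of byte-compile-output-as-comment; per the comment it removes, get_doc_string in doc.c does the unquoting.  Here is a minimal C sketch of that quoting step, assuming a single-pass mapping of ^A to ^A^A, NUL to ^A0 and ^_ to ^A_ (the Lisp version makes three passes over the buffer, which should give the same result).  escape_docstring here is a hypothetical name for illustration, not a function in Emacs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: copy INLEN bytes from IN to OUT, quoting the
   three characters that are special inside a dynamic doc string:
     ^A (\001) -> ^A ^A
     NUL       -> ^A 0
     ^_ (\037) -> ^A _
   Returns the number of bytes written, or -1 if OUT is too small.
   OUT is not NUL-terminated, since the input may contain NUL bytes.  */
static long
escape_docstring (const char *in, size_t inlen, char *out, size_t outsz)
{
  assert (in && out);
  size_t o = 0;
  for (size_t i = 0; i < inlen; i++)
    {
      char c = in[i];
      int special = (c == '\001' || c == '\0' || c == '\037');
      if (o + (special ? 2 : 1) > outsz)
        return -1;
      if (special)
        {
          out[o++] = '\001';
          out[o++] = c == '\001' ? '\001' : c == '\0' ? '0' : '_';
        }
      else
        out[o++] = c;
    }
  return (long) o;
}
```

Because every literal ^_ inside a doc string ends up as ^A_, a reader can treat a bare ^_ as the end of the string.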


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 13194 bytes --]

From 0fbe96c0add052338d68453de5fb3486201e61d0 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 13 Aug 2017 13:15:10 -0400
Subject: [PATCH] [WIP] Produce dynamic docstrings for bytecode (Bug#27748)

Instead of relying on decompilation to create source forms that we can
easily extract docstrings from, record the docstrings as we compile
and then print out all the docstrings with their corresponding symbol
names in a single comment at the top.

NOTE: Currently only updates recording docstrings for defcustom
leaving other forms to produce the old format, but make-docfile is
updated for the new format, so this prevents successful building of
emacs.

* lisp/emacs-lisp/bytecomp.el (byte-compile-docstring-handler): New
function, push a special kind of constant onto
`byte-compile-constants' for the given docstring.
(byte-compile-file-form-defvar-function): Use it on the docstring.
(byte-compile--docstring-constants): New variable.
(byte-compile-constants-vector): Use it to remember the indices of
the special docstring constants.
(byte-compile-top-level, byte-compile-flush-pending): Let-bind
byte-compile--docstring-constants to nil.

(byte-compile--docstring-marker): New variable.
(byte-compile-from-buffer): Let-bind it to nil.
(byte-compile-insert-header): Set it to a pair of markers pointing to the
end of the header.
(byte-compile-output-file-form): Write the docstrings collected into
byte-compile--docstring-constants to the second marker in
byte-compile--docstring-marker.  When writing out the constants
vector, use the (#$ . %d) format instead of the string itself.
(byte-compile-escape-docstring): New function, extracted from
`byte-compile-output-as-comment'.
(byte-compile-fix-header-docstring-comment): New function, comment out
the docstrings at the top of the file with a #@N kind of comment.
Delete semicolons from the header as needed to preserve offsets.
(byte-compile-fix-header-multibyte): Rename from
byte-compile-fix-header.
(byte-compile-fix-header): Call both
`byte-compile-fix-header-multibyte' and
`byte-compile-fix-header-docstring-comment'.

* lib-src/make-docfile.c (scan_lisp_file): Update for new format,
collect all of symbol type, name, and docstring directly from #@N
comments.
---
 lib-src/make-docfile.c      |  43 +++++++++++-------
 lisp/emacs-lisp/bytecomp.el | 104 +++++++++++++++++++++++++++++++++++++-------
 2 files changed, 115 insertions(+), 32 deletions(-)

diff --git a/lib-src/make-docfile.c b/lib-src/make-docfile.c
index ecd6447ab7..06377c5fda 100644
--- a/lib-src/make-docfile.c
+++ b/lib-src/make-docfile.c
@@ -1258,7 +1258,8 @@ read_lisp_symbol (FILE *infile, char *buffer)
       c = getc (infile);
       if (c == '\\')
 	*(++fillp) = getc (infile);
-      else if (c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '(' || c == ')')
+      else if (c == ' ' || c == '\t' || c == '\n' || c == '\r' ||
+               c == '(' || c == ')' || c == '\037')
 	{
 	  ungetc (c, infile);
 	  *fillp = 0;
@@ -1374,7 +1375,6 @@ scan_lisp_file (const char *filename, const char *mode)
 	  if (c == '@')
 	    {
 	      ptrdiff_t length = 0;
-	      ptrdiff_t i;
 
 	      /* Read the length.  */
 	      while ((c = getc (infile),
@@ -1398,20 +1398,31 @@ scan_lisp_file (const char *filename, const char *mode)
 	      length--;
 
 	      /* Read in the contents.  */
-	      free (saved_string);
-	      saved_string = xmalloc (length);
-	      for (i = 0; i < length; i++)
-		saved_string[i] = getc (infile);
-	      /* The last character is a ^_.
-		 That is needed in the .elc file
-		 but it is redundant in DOC.  So get rid of it here.  */
-	      saved_string[length - 1] = 0;
-	      /* Skip the line break.  */
-	      while (c == '\n' || c == '\r')
-		c = getc (infile);
-	      /* Skip the following line.  */
-	      while (c != '\n' && c != '\r')
-		c = getc (infile);
+              for (;;)
+                {
+                  c = getc (infile);
+                  if (c != '\037') break;
+                  type = getc (infile);
+                  if (type != 'V' && type != 'F')
+                    fatal ("'V' or 'F' not found before symbol name (%c)\n", type);
+                  read_lisp_symbol (infile, buffer);
+                  c = getc (infile);
+                  if (c != '\037')
+                    fatal ("\\037 not found after symbol name");
+
+                  printf ("\037%c%s\n", type, buffer);
+                  for (;;)
+                    {
+                      c = getc (infile);
+                      if (c == '\037')
+                        {
+                          if ('\n' != getc (infile))
+                            fatal ("newline not found after dynamic doc string\n");
+                          break;
+                        }
+                      putc (c, stdout);
+                    }
+                }
 	    }
 	  continue;
 	}
diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index d82b0385b1..28fd2b9cdd 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -1976,6 +1976,8 @@ byte-compile-from-buffer
 	;; Simulate entry to byte-compile-top-level
         (byte-compile-jump-tables nil)
         (byte-compile-constants nil)
+        (byte-compile--docstring-constants nil)
+        (byte-compile--docstring-marker nil)
 	(byte-compile-variables nil)
 	(byte-compile-tag-number 0)
 	(byte-compile-depth 0)
@@ -2050,6 +2052,21 @@ byte-compile-from-buffer
      byte-compile--outbuffer)))
 
 (defun byte-compile-fix-header (_filename)
+  (byte-compile-fix-header-multibyte)
+  (byte-compile-fix-header-docstring-comment))
+
+(defun byte-compile-fix-header-docstring-comment ()
+  (pcase byte-compile--docstring-marker
+    (`(,beg . ,end)
+     (let ((comment-beg (format "#@%d " (- (position-bytes end)
+                                           (position-bytes beg)))))
+       (goto-char (point-min))
+       (search-forward ";;;;;;;;;;" beg)
+       (beginning-of-line)
+       (delete-char (length comment-beg))
+       (princ comment-beg beg)))))
+
+(defun byte-compile-fix-header-multibyte ()
   "If the current buffer has any multibyte characters, insert a version test."
   (when (< (point-max) (position-bytes (point-max)))
     (goto-char (point-min))
@@ -2127,7 +2144,9 @@ byte-compile-insert-header
        ;; can delete them so as to keep the buffer positions
        ;; constant for the actual compiled code.
        ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n"
-       ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n"))))
+       ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n")
+      (setq byte-compile--docstring-marker
+            (cons (point-marker) (point-marker))))))
 
 (defun byte-compile-output-file-form (form)
   ;; Write the given form to the output buffer, being careful of docstrings
@@ -2151,7 +2170,39 @@ byte-compile-output-file-form
                                            '(defvaralias autoload
                                               custom-declare-variable)))
       (princ "\n" byte-compile--outbuffer)
-      (prin1 form byte-compile--outbuffer)
+      (pcase form
+        ((and (guard byte-compile--docstring-constants)
+              (guard byte-compile--docstring-marker)
+              `(byte-code ,bytestr ,constants ,maxdepth))
+         (princ "(byte-code " byte-compile--outbuffer)
+         (prin1 bytestr byte-compile--outbuffer)
+         (princ " [" byte-compile--outbuffer)
+         (cl-callf cl-sort byte-compile--docstring-constants #'< :key #'car)
+         (let ((docs-head byte-compile--docstring-constants))
+           (dotimes (i (length constants))
+             (if (/= i (caar docs-head))
+                 (prin1 (aref constants i) byte-compile--outbuffer)
+               (pcase-let* ((doc-marker (cdr byte-compile--docstring-marker))
+                            (start (marker-position doc-marker))
+                            (`(,_i ,symtype ,sym) (car docs-head)))
+                 (princ (format "(#$ . %d)"
+                                (with-current-buffer byte-compile--outbuffer
+                                  (- (position-bytes start) (point-min))))
+                        byte-compile--outbuffer)
+                 (write-char ?\037 doc-marker)
+                 (write-char symtype doc-marker)
+                 (princ sym doc-marker)
+                 (write-char ?\037 doc-marker)
+                 (princ (aref constants i) doc-marker)
+                 (princ "\037\n" doc-marker)
+                 (byte-compile-escape-docstring
+                  start doc-marker))
+               (pop docs-head))
+             (princ " " byte-compile--outbuffer)))
+         (princ "] " byte-compile--outbuffer)
+         (prin1 maxdepth byte-compile--outbuffer)
+         (princ ")" byte-compile--outbuffer))
+        (_ (prin1 form byte-compile--outbuffer)))
       nil)))
 
 (defvar byte-compile--for-effect)
@@ -2275,6 +2326,7 @@ byte-compile-flush-pending
 	      (form
 	       (byte-compile-output-file-form form)))
 	(setq byte-compile-constants nil
+              byte-compile--docstring-constants nil
 	      byte-compile-variables nil
 	      byte-compile-depth 0
 	      byte-compile-maxdepth 0
@@ -2389,10 +2441,18 @@ byte-compile-file-form-defvar
 (put 'defvaralias 'byte-hunk-handler 'byte-compile-file-form-defvar-function)
 
 (defun byte-compile-file-form-defvar-function (form)
-  (pcase-let (((or `',name (let name nil)) (nth 1 form)))
-    (if name (byte-compile--declare-var name)))
+  (pcase-let (((or `',name (let name nil)) (nth 1 form))
+              (docstr (nth 3 form)))
+    (if name (byte-compile--declare-var name))
+    (when (stringp docstr)
+      (setf (nth 3 form) `(byte-compile-docstring ,docstr ?V ,name))))
   (byte-compile-keep-pending form))
 
+(defun byte-compile-docstring-handler (form)
+  (byte-compile-out 'byte-constant
+                    (car (push (cl-list* (cadr form) 'docstring (cddr form))
+                               byte-compile-constants))))
+
 (put 'custom-declare-variable 'byte-hunk-handler
      'byte-compile-file-form-custom-declare-variable)
 (defun byte-compile-file-form-custom-declare-variable (form)
@@ -2578,6 +2638,18 @@ byte-compile-file-form-defmumble
           (princ ")" byte-compile--outbuffer)
           t)))))
 
+(defun byte-compile-escape-docstring (beg &optional end)
+  "Quote characters in the range BEG to END for `get_doc_string'."
+  (goto-char beg)
+  (while (search-forward "\^A" end t)
+    (replace-match "\^A\^A" t t))
+  (goto-char beg)
+  (while (search-forward "\000" end t)
+    (replace-match "\^A0" t t))
+  (goto-char beg)
+  (while (search-forward "\037" end t)
+    (replace-match "\^A_" t t)))
+
 (defun byte-compile-output-as-comment (exp quoted)
   "Print Lisp object EXP in the output file, inside a comment,
 and return the file (byte) position it will have.
@@ -2590,17 +2662,7 @@ byte-compile-output-as-comment
       (if quoted
           (prin1 exp byte-compile--outbuffer)
         (princ exp byte-compile--outbuffer))
-      (goto-char position)
-      ;; Quote certain special characters as needed.
-      ;; get_doc_string in doc.c does the unquoting.
-      (while (search-forward "\^A" nil t)
-        (replace-match "\^A\^A" t t))
-      (goto-char position)
-      (while (search-forward "\000" nil t)
-        (replace-match "\^A0" t t))
-      (goto-char position)
-      (while (search-forward "\037" nil t)
-        (replace-match "\^A_" t t))
+      (byte-compile-escape-docstring position)
       (goto-char (point-max))
       (insert "\037")
       (goto-char position)
@@ -2838,6 +2900,8 @@ byte-compile-lambda
                   (list (nth 1 int))))))))
 
 (defvar byte-compile-reserved-constants 0)
+(defvar byte-compile--docstring-constants nil)
+(defvar byte-compile--docstring-marker nil)
 
 (defun byte-compile-constants-vector ()
   ;; Builds the constants-vector from the current variables and constants.
@@ -2866,7 +2930,11 @@ byte-compile-constants-vector
 	 ((setq tmp (assq (car (car rest)) ret))
 	  (setcdr (car rest) (cdr tmp)))
 	 (t
-	  (setcdr (car rest) (setq i (1+ i)))
+          (setq i (1+ i))
+          (pcase (car rest)
+            (`(,_docstr docstring ,symtype ,sym)
+             (push (list i symtype sym) byte-compile--docstring-constants)))
+	  (setcdr (car rest) i)
 	  (setq ret (cons (car rest) ret))))
 	(setq rest (cdr rest)))
       (setq limits (cdr limits)         ;Step
@@ -2885,6 +2953,7 @@ byte-compile-top-level
   ;;	'file		-> used at file-level.
   (let ((byte-compile--for-effect for-effect)
         (byte-compile-constants nil)
+        (byte-compile--docstring-constants nil)
 	(byte-compile-variables nil)
 	(byte-compile-tag-number 0)
 	(byte-compile-depth 0)
@@ -3493,6 +3562,9 @@ byte-defop-compiler-1
 ;;####(byte-defop-compiler move-to-column	1)
 (byte-defop-compiler-1 interactive byte-compile-noop)
 
+(byte-defop-compiler (byte-compile-docstring nil)
+                     byte-compile-docstring-handler)
+
 \f
 (defun byte-compile-subr-wrong-args (form n)
   (byte-compile-set-symbol-position (car form))
-- 
2.14.1



* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-07-18  6:47 bug#27748: 26.0.50; doc strings should be in DOC file Ken Raeburn
  2017-08-06  0:09 ` npostavs
@ 2017-08-20 22:05 ` npostavs
  2017-08-29 10:09   ` Ken Raeburn
  1 sibling, 1 reply; 15+ messages in thread
From: npostavs @ 2017-08-20 22:05 UTC
  To: Ken Raeburn; +Cc: 27748

[-- Attachment #1: Type: text/plain, Size: 1018 bytes --]

tags 27748 + patch
quit

Ken Raeburn <raeburn@raeburn.org> writes:

> 1. defcustom doc strings from files compiled with lexical binding.
>
>    For example, files.el (lexical bindings) defines
>    delete-auto-save-files but it doesn't show up in the DOC file;
>    files.elc starts with an initial byte-code blob which includes the
>    symbol delete-auto-save-files and its doc string in the constants
>    array.
>
>    On the other hand, custom.el (dynamic bindings) declares
>    custom-theme-directory, the .elc file dumps out the doc string in a
>    #@... block before a separate top-level call to
>    custom-declare-variable, and since this is what make-docfile looks
>    for, the doc string winds up in the DOC file.

With patch 0001, defcustoms which are compiled to bytecode now produce
dynamic docstrings which make-docfile can digest.  Note that I had to
change make-docfile a bit for this, but the .elc format remains the same
as far as the Emacs loading it is concerned; see the commit message for
details.
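
The record layout described in the commit message (^_, a type character V or F, the symbol name, ^_, the doc string, ^_, newline) can be walked roughly as in the C sketch below.  It is only an illustration under those assumptions: struct doc_record and next_doc_record are hypothetical names, the input is an in-memory buffer positioned just after the "#@N " header, and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define US '\037'  /* Unit Separator, written "^_" in the commit message.  */

/* Hypothetical record type for this sketch only.  */
struct doc_record
{
  char type;         /* 'V' for a variable, 'F' for a function.  */
  const char *name;  /* Points into the input; not NUL-terminated.  */
  size_t name_len;
  const char *doc;   /* Likewise.  */
  size_t doc_len;
};

/* Parse one record at *P, where END marks the end of the block.
   On success fill REC, advance *P past the trailing newline, and
   return 1.  Return 0 if *P does not start a record, and -1 if the
   record is malformed.  */
static int
next_doc_record (const char **p, const char *end, struct doc_record *rec)
{
  assert (p && *p && end && rec);
  const char *s = *p;
  if (s >= end || *s != US)
    return 0;
  s++;
  if (s >= end || (*s != 'V' && *s != 'F'))
    return -1;
  rec->type = *s++;

  /* The symbol name runs up to the next ^_ and must not be empty.  */
  const char *sep = memchr (s, US, (size_t) (end - s));
  if (!sep || sep == s)
    return -1;
  rec->name = s;
  rec->name_len = (size_t) (sep - s);
  s = sep + 1;

  /* The doc string runs up to the next ^_, which a newline must follow.
     Any ^_ inside the doc string is assumed to be quoted already.  */
  sep = memchr (s, US, (size_t) (end - s));
  if (!sep || sep + 1 >= end || sep[1] != '\n')
    return -1;
  rec->doc = s;
  rec->doc_len = (size_t) (sep - s);
  *p = sep + 2;
  return 1;
}
```

Calling next_doc_record in a loop until it returns 0 visits each doc string together with its symbol, without having to look at the Lisp forms that follow the block.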


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 18166 bytes --]

From 73c753f07c21ad2fe32fac124b7287bd8b6ab01b Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 13 Aug 2017 13:15:10 -0400
Subject: [PATCH 1/3] Produce dynamic docstrings for bytecode (Bug#27748)

Instead of relying on decompilation to create source forms that we can
easily extract docstrings from, record the docstrings as we compile
and then print out all the docstrings with their corresponding symbol
names in a single comment at the top.

Old format:

    #@ nnn Docstring of var1^_
    (defvar var1 (init-expression) (#$ . nn))
    #@ nnn Docstring of var2^_
    (defvar var2 (init-expression) (#$ . nn))

New format:

    #@ nnnnnn ^_Vvar1^_Docstring of var1^_
    ^_Vvar2^_Docstring of var2^_
    (defvar var1 (init-expression) (#$ . nn))
    (defvar var2 (init-expression) (#$ . nn))

(Where "^_" represents the character \037, aka "Unit Separator".)

The new format can still be loaded by older Emacs versions since the
bytecode loader only requires that dynamic docstrings be at the right
file offset and preceded with "^_" or "#@ ".  It cannot be used by
older make-docfile versions.

* lisp/emacs-lisp/bytecomp.el (byte-compile-docstring-handler): New
function, push a special kind of constant onto
`byte-compile-constants' for the given docstring.
(byte-compile-file-form-defvar-function): Use it on the docstring.
(byte-compile--docstring-constants): New variable.
(byte-compile-constants-vector): Use it to remember the indices of
the special docstring constants.
(byte-compile-top-level, byte-compile-flush-pending): Let-bind
byte-compile--docstring-constants to nil.

(byte-compile--docstring-marker): New variable.
(byte-compile-from-buffer): Let-bind it to nil.
(byte-compile-insert-header): Set it to a pair of markers pointing to the
end of the header.
(byte-compile-output-as-comment): Write the docstrings collected into
byte-compile--docstring-constants to the second marker in
byte-compile--docstring-marker.  When writing docstrings (as opposed
to lazy loaded bytecode), also print V<symbol> or F<symbol> prior to the
docstring.
(byte-compile-output-file-form): When writing out the constants
vector, use the (#$ . %d) format instead of the string itself.
(byte-compile-escape-docstring): New function, extracted from
`byte-compile-output-as-comment'.
(byte-compile-fix-header-docstring-comment): New function, comment out
the docstrings at the top of the file with a #@N kind of comment.
Delete semicolons from the header as needed to preserve offsets.
(byte-compile-fix-header-multibyte): Rename from
byte-compile-fix-header.
(byte-compile-fix-header): Call both
`byte-compile-fix-header-multibyte' and
`byte-compile-fix-header-docstring-comment'.

* lib-src/make-docfile.c (scan_lisp_file): Update for new format,
collect all of symbol type, name, and docstring directly from #@N
comments.
---
 lib-src/make-docfile.c      |  61 ++++++++++-------
 lisp/emacs-lisp/bytecomp.el | 158 ++++++++++++++++++++++++++++++--------------
 2 files changed, 147 insertions(+), 72 deletions(-)

diff --git a/lib-src/make-docfile.c b/lib-src/make-docfile.c
index ecd6447ab7..8daca9aba2 100644
--- a/lib-src/make-docfile.c
+++ b/lib-src/make-docfile.c
@@ -1258,7 +1258,8 @@ read_lisp_symbol (FILE *infile, char *buffer)
       c = getc (infile);
       if (c == '\\')
 	*(++fillp) = getc (infile);
-      else if (c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '(' || c == ')')
+      else if (c == ' ' || c == '\t' || c == '\n' || c == '\r' ||
+               c == '(' || c == ')' || c == '\037')
 	{
 	  ungetc (c, infile);
 	  *fillp = 0;
@@ -1367,14 +1368,13 @@ scan_lisp_file (const char *filename, const char *mode)
       /* Skip the line break.  */
       while (c == '\n' || c == '\r')
 	c = getc (infile);
-      /* Detect a dynamic doc string and save it for the next expression.  */
+      /* Detect the dynamic string block.  */
       if (c == '#')
 	{
 	  c = getc (infile);
 	  if (c == '@')
 	    {
 	      ptrdiff_t length = 0;
-	      ptrdiff_t i;
 
 	      /* Read the length.  */
 	      while ((c = getc (infile),
@@ -1387,31 +1387,46 @@ scan_lisp_file (const char *filename, const char *mode)
 		}
 
 	      if (length <= 1)
-		fatal ("invalid dynamic doc string length");
+		fatal ("%s: invalid dynamic doc string length", filename);
+
+              /* We expect one newline character following the
+                 comment.  */
+              ptrdiff_t end_offset = ftell (infile) + length + 1;
 
 	      if (c != ' ')
 		fatal ("space not found after dynamic doc string length");
 
-	      /* The next character is a space that is counted in the length
-		 but not part of the doc string.
-		 We already read it, so just ignore it.  */
-	      length--;
-
 	      /* Read in the contents.  */
-	      free (saved_string);
-	      saved_string = xmalloc (length);
-	      for (i = 0; i < length; i++)
-		saved_string[i] = getc (infile);
-	      /* The last character is a ^_.
-		 That is needed in the .elc file
-		 but it is redundant in DOC.  So get rid of it here.  */
-	      saved_string[length - 1] = 0;
-	      /* Skip the line break.  */
-	      while (c == '\n' || c == '\r')
-		c = getc (infile);
-	      /* Skip the following line.  */
-	      while (c != '\n' && c != '\r')
-		c = getc (infile);
+              for (;;)
+                {
+                  c = getc (infile);
+                  if (c != '\037') break;
+                  type = getc (infile);
+                  if (type != 'V' && type != 'F')
+                    fatal ("%s: 'V' or 'F' not found before symbol name (%c)\n", filename, type);
+                  read_lisp_symbol (infile, buffer);
+                  c = getc (infile);
+                  if (c != '\037')
+                    fatal ("\\037 not found after symbol name");
+
+                  printf ("\037%c%s\n", type, buffer);
+                  for (;;)
+                    {
+                      c = getc (infile);
+                      if (c == '\037')
+                        {
+                          if ('\n' != getc (infile))
+                            fatal ("newline not found after dynamic doc string\n");
+                          break;
+                        }
+                      putc (c, stdout);
+                    }
+                }
+              /* All dynamic strings should be in that block.  */
+              if (ftell (infile) != end_offset)
+                fatal ("%s: wrong dynamic doc string length (%ld != %ld)",
+                       filename, ftell (infile), end_offset);
+              break;
 	    }
 	  continue;
 	}
diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index cf06c0c8ef..d2768a159b 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -1976,6 +1976,8 @@ byte-compile-from-buffer
 	;; Simulate entry to byte-compile-top-level
         (byte-compile-jump-tables nil)
         (byte-compile-constants nil)
+        (byte-compile--docstring-constants nil)
+        (byte-compile--docstring-marker nil)
 	(byte-compile-variables nil)
 	(byte-compile-tag-number 0)
 	(byte-compile-depth 0)
@@ -2050,6 +2052,22 @@ byte-compile-from-buffer
      byte-compile--outbuffer)))
 
 (defun byte-compile-fix-header (_filename)
+  (byte-compile-fix-header-multibyte)
+  (byte-compile-fix-header-docstring-comment))
+
+(defun byte-compile-fix-header-docstring-comment ()
+  (pcase byte-compile--docstring-marker
+    (`(,beg . ,end)
+     (let* ((bytes (- (position-bytes end) (position-bytes beg)))
+            (comment-beg (format "#@%d " bytes)))
+       (when (> bytes 0)
+         (goto-char (point-min))
+         (search-forward ";;;;;;;;;;" beg)
+         (beginning-of-line)
+         (delete-char (length comment-beg))
+         (princ comment-beg beg))))))
+
+(defun byte-compile-fix-header-multibyte ()
   "If the current buffer has any multibyte characters, insert a version test."
   (when (< (point-max) (position-bytes (point-max)))
     (goto-char (point-min))
@@ -2127,7 +2145,9 @@ byte-compile-insert-header
        ;; can delete them so as to keep the buffer positions
        ;; constant for the actual compiled code.
        ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n"
-       ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n"))))
+       ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n")
+      (setq byte-compile--docstring-marker
+            (cons (point-marker) (point-marker))))))
 
 (defun byte-compile-output-file-form (form)
   ;; Write the given form to the output buffer, being careful of docstrings
@@ -2151,7 +2171,29 @@ byte-compile-output-file-form
                                            '(defvaralias autoload
                                               custom-declare-variable)))
       (princ "\n" byte-compile--outbuffer)
-      (prin1 form byte-compile--outbuffer)
+      (pcase form
+        ((and (guard byte-compile--docstring-constants)
+              (guard byte-compile--docstring-marker)
+              `(byte-code ,bytestr ,constants ,maxdepth))
+         (princ "(byte-code " byte-compile--outbuffer)
+         (prin1 bytestr byte-compile--outbuffer)
+         (princ " [" byte-compile--outbuffer)
+         (cl-callf cl-sort byte-compile--docstring-constants #'< :key #'car)
+         (let ((docs-head byte-compile--docstring-constants))
+           (dotimes (i (length constants))
+             (if (or (null docs-head) (/= i (caar docs-head)))
+                 (prin1 (aref constants i) byte-compile--outbuffer)
+               (pcase-let* ((`(,_i ,symtype ,sym) (car docs-head)))
+                 (princ (format "(#$ . %d)"
+                                (byte-compile-output-as-comment
+                                 (aref constants i) nil symtype sym))
+                        byte-compile--outbuffer))
+               (pop docs-head))
+             (princ " " byte-compile--outbuffer)))
+         (princ "] " byte-compile--outbuffer)
+         (prin1 maxdepth byte-compile--outbuffer)
+         (princ ")" byte-compile--outbuffer))
+        (_ (prin1 form byte-compile--outbuffer)))
       nil)))
 
 (defvar byte-compile--for-effect)
@@ -2172,17 +2214,15 @@ byte-compile-output-docform
   (let ((dynamic-docstrings byte-compile-dynamic-docstrings))
     (with-current-buffer byte-compile--outbuffer
       (let (position)
-
         ;; Insert the doc string, and make it a comment with #@LENGTH.
         (and (>= (nth 1 info) 0)
              dynamic-docstrings
-             (progn
-               ;; Make the doc string start at beginning of line
-               ;; for make-docfile's sake.
-               (insert "\n")
-               (setq position
-                     (byte-compile-output-as-comment
-                      (nth (nth 1 info) form) nil))
+             (pcase-let* (((or `',sym sym) (or name (nth 1 form)))
+                          (symtype (if (if preface (string-match-p "defalias" preface)
+                                         (memq (car form) '(autoload defalias)))
+                                       ?F ?V)))
+               (setq position (byte-compile-output-as-comment
+                               (nth (nth 1 info) form) nil symtype sym))
                ;; If the doc string starts with * (a user variable),
                ;; negate POSITION.
                (if (and (stringp (nth (nth 1 info) form))
@@ -2227,10 +2267,9 @@ byte-compile-output-docform
                           (not non-nil)))
                    ;; Output the byte code and constants specially
                    ;; for lazy dynamic loading.
-                   (let ((position
-                          (byte-compile-output-as-comment
-                           (cons (car form) (nth 1 form))
-                           t)))
+                   (let* ((position (byte-compile-output-as-comment
+                                     (cons (car form) (nth 1 form))
+                                     t nil nil)))
                      (princ (format "(#$ . %d) nil" position)
                             byte-compile--outbuffer)
                      (setq form (cdr form))
@@ -2275,6 +2314,7 @@ byte-compile-flush-pending
 	      (form
 	       (byte-compile-output-file-form form)))
 	(setq byte-compile-constants nil
+              byte-compile--docstring-constants nil
 	      byte-compile-variables nil
 	      byte-compile-depth 0
 	      byte-compile-maxdepth 0
@@ -2389,8 +2429,11 @@ byte-compile-file-form-defvar
 (put 'defvaralias 'byte-hunk-handler 'byte-compile-file-form-defvar-function)
 
 (defun byte-compile-file-form-defvar-function (form)
-  (pcase-let (((or `',name (let name nil)) (nth 1 form)))
-    (if name (byte-compile--declare-var name)))
+  (pcase-let (((or `',name (let name nil)) (nth 1 form))
+              (docstr (nth 3 form)))
+    (if name (byte-compile--declare-var name))
+    (when (stringp docstr)
+      (setf (nth 3 form) `(byte-compile-docstring ,docstr ?V ,name))))
   (byte-compile-keep-pending form))
 
 (put 'custom-declare-variable 'byte-hunk-handler
@@ -2578,42 +2621,43 @@ byte-compile-file-form-defmumble
           (princ ")" byte-compile--outbuffer)
           t)))))
 
-(defun byte-compile-output-as-comment (exp quoted)
-  "Print Lisp object EXP in the output file, inside a comment,
+(defun byte-compile-escape-docstring (beg &optional end)
+  "Quote characters in the range BEG to END for `get_doc_string'."
+  (save-excursion
+    (goto-char beg)
+    (while (search-forward "\^A" end t)
+      (replace-match "\^A\^A" t t))
+    (goto-char beg)
+    (while (search-forward "\000" end t)
+      (replace-match "\^A0" t t))
+    (goto-char beg)
+    (while (search-forward "\037" end t)
+      (replace-match "\^A_" t t))))
+
+(defun byte-compile-output-as-comment (exp quoted symtype sym)
+  "Print Lisp object EXP to the output file's header comment,
 and return the file (byte) position it will have.
-If QUOTED is non-nil, print with quoting; otherwise, print without quoting."
+The header lies between the markers in
+`byte-compile--docstring-marker'.
+If QUOTED is non-nil, print with quoting; otherwise, print without quoting.
+If SYMTYPE is a character, print it and SYM before EXP."
   (with-current-buffer byte-compile--outbuffer
-    (let ((position (point)))
-
-      ;; Insert EXP, and make it a comment with #@LENGTH.
-      (insert " ")
+    (let* ((doc-marker (cdr byte-compile--docstring-marker))
+           (position (progn (when (characterp symtype)
+                              (write-char ?\037 doc-marker)
+                              (write-char symtype doc-marker)
+                              (princ sym doc-marker))
+                            (write-char ?\037 doc-marker)
+                            (marker-position doc-marker))))
       (if quoted
-          (prin1 exp byte-compile--outbuffer)
-        (princ exp byte-compile--outbuffer))
-      (goto-char position)
-      ;; Quote certain special characters as needed.
-      ;; get_doc_string in doc.c does the unquoting.
-      (while (search-forward "\^A" nil t)
-        (replace-match "\^A\^A" t t))
-      (goto-char position)
-      (while (search-forward "\000" nil t)
-        (replace-match "\^A0" t t))
-      (goto-char position)
-      (while (search-forward "\037" nil t)
-        (replace-match "\^A_" t t))
-      (goto-char (point-max))
-      (insert "\037")
-      (goto-char position)
-      (insert "#@" (format "%d" (- (position-bytes (point-max))
-                                   (position-bytes position))))
-
+          (prin1 exp doc-marker)
+        (princ exp doc-marker))
+      (byte-compile-escape-docstring position doc-marker)
+      (princ "\037\n" doc-marker)
       ;; Save the file position of the object.
-      ;; Note we add 1 to skip the space that we inserted before the actual doc
-      ;; string, and subtract point-min to convert from an 1-origin Emacs
-      ;; position to a file position.
-      (prog1
-          (- (position-bytes (point)) (point-min) -1)
-        (goto-char (point-max))))))
+      ;; Note we subtract point-min to convert from an 1-origin Emacs
+      ;; position to a 0-origin file offset.
+      (- (position-bytes position) (point-min)))))
 
 (defun byte-compile--reify-function (fun)
   "Return an expression which will evaluate to a function value FUN.
@@ -2838,6 +2882,8 @@ byte-compile-lambda
                   (list (nth 1 int))))))))
 
 (defvar byte-compile-reserved-constants 0)
+(defvar byte-compile--docstring-constants nil)
+(defvar byte-compile--docstring-marker nil)
 
 (defun byte-compile-constants-vector ()
   ;; Builds the constants-vector from the current variables and constants.
@@ -2866,7 +2912,11 @@ byte-compile-constants-vector
 	 ((setq tmp (assq (car (car rest)) ret))
 	  (setcdr (car rest) (cdr tmp)))
 	 (t
-	  (setcdr (car rest) (setq i (1+ i)))
+          (setq i (1+ i))
+          (pcase (car rest)
+            (`(,_docstr docstring ,symtype ,sym)
+             (push (list i symtype sym) byte-compile--docstring-constants)))
+          (setcdr (car rest) i)
 	  (setq ret (cons (car rest) ret))))
 	(setq rest (cdr rest)))
       (setq limits (cdr limits)         ;Step
@@ -2885,6 +2935,7 @@ byte-compile-top-level
   ;;	'file		-> used at file-level.
   (let ((byte-compile--for-effect for-effect)
         (byte-compile-constants nil)
+        (byte-compile--docstring-constants nil)
 	(byte-compile-variables nil)
 	(byte-compile-tag-number 0)
 	(byte-compile-depth 0)
@@ -4578,6 +4629,15 @@ byte-compile-make-obsolete-variable
     (push (nth 1 (nth 1 form)) byte-compile-global-not-obsolete-vars))
   (byte-compile-normal-call form))
 
+
+(byte-defop-compiler
+ (byte-compile-docstring nil) byte-compile-docstring-handler)
+(defun byte-compile-docstring-handler (form)
+  ;; FORM = (byte-compile-docstring DOCSTR ?V NAME)
+  (byte-compile-out 'byte-constant
+                    (car (push (cl-list* (cadr form) 'docstring (cddr form))
+                               byte-compile-constants))))
+
 (defconst byte-compile-tmp-var (make-symbol "def-tmp-var"))
 
 (defun byte-compile-defvar (form)
-- 
2.14.1


[-- Attachment #3: Type: text/plain, Size: 1122 bytes --]


> 2. In isearch, CL macro expansion results in symbols like
>    isearch--state-forward--cmacro having function definitions that
>    include the two doc strings "Access slot \"forward\" of
>    `(isearch--state (:constructor nil) [...]" and
>    "\n\n(fn CL-WHOLE-ARG CL-X)".  The former string is also in DOC (with
>    "\n\n(fn CL-X)" appended) as the documentation for the function
>    isearch--state-forward.
>
>    It appears that, as far as the Emacs help system is concerned,
>    isearch--state-*--cmacro functions are undocumented.
>
>    The --cmacro functions generate cl-block forms that include the
>    original doc string, since it's treated as part of the body, but it's
>    not clear to me whether it has any use there at all, or if it could
>    just be discarded, or perhaps looked up via the non --cmacro symbols
>    at run time if it is of use.

I think it is just an oversight, since the string was put inside the
cl-block it is not recognized as a docstring at all.  Patch 0002 drops
the function's docstring from compiler-macro and adds a simple
"compiler-macro for inlining `NAME'" instead.


[-- Attachment #4: patch --]
[-- Type: text/plain, Size: 1220 bytes --]

From 3e9846d2fb9f4857f9bde152227ebc28f6faccc7 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Fri, 18 Aug 2017 08:15:25 -0400
Subject: [PATCH 2/3] Drop docstrings from cl-defsubst produced inline bodies
 (Bug#27748)

* lisp/emacs-lisp/cl-macs.el (cl-defsubst): Use macroexp-parse-body
to drop the docstring.  Add a simple docstring to the compiler-macro.
---
 lisp/emacs-lisp/cl-macs.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/emacs-lisp/cl-macs.el b/lisp/emacs-lisp/cl-macs.el
index 451e6490d7..6c69f16a20 100644
--- a/lisp/emacs-lisp/cl-macs.el
+++ b/lisp/emacs-lisp/cl-macs.el
@@ -2468,8 +2468,9 @@ cl-defsubst
              ,(if (memq '&key args)
                   `(&whole cl-whole &cl-quote ,@args)
                 (cons '&cl-quote args))
+             ,(format "compiler-macro for inlining `%s'." name)
              (cl--defsubst-expand
-              ',argns '(cl-block ,name ,@body)
+              ',argns '(cl-block ,name ,@(cdr (macroexp-parse-body body)))
               ;; We used to pass `simple' as
               ;; (not (or unsafe (cl-expr-access-order pbody argns)))
               ;; But this is much too simplistic since it
-- 
2.14.1


[-- Attachment #5: Type: text/plain, Size: 386 bytes --]


> 3. Undocumented functions, strange as it sounds... in files.el, function
>    file-name-non-special is defined with no documentation.  In
>    files.elc, the bytecode object constructed has a doc string "\n\n(fn
>    OPERATION &rest ARGUMENTS)" instead of a (#$ . NNN) reference to a
>    separate string that make-docfile can pick up.

Looks fairly trivial to fix, see patch 0003.


[-- Attachment #6: patch --]
[-- Type: text/plain, Size: 1268 bytes --]

From 50113c164038f63bec1ba65ac21f9c6f41bd1f16 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 19 Aug 2017 10:29:05 -0400
Subject: [PATCH 3/3] Support lazy loading for autogenerated usage docstrings
 too (Bug#27748)

* lisp/emacs-lisp/bytecomp.el (byte-compile-file-form-defmumble):
Consider any documentation that ended up in code as a docstring (e.g.,
autogenerated (fn ARG1 ARG2) type things), not just what the user
passed.
---
 lisp/emacs-lisp/bytecomp.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index d2768a159b..b11c49f230 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -2607,7 +2607,7 @@ byte-compile-file-form-defmumble
           (let ((index
                  ;; If there's no doc string, provide -1 as the "doc string
                  ;; index" so that no element will be treated as a doc string.
-                 (if (not (stringp (car body))) -1 4)))
+                 (if (not (stringp (documentation code t))) -1 4)))
             ;; Output the form by hand, that's much simpler than having
             ;; b-c-output-file-form analyze the defalias.
             (byte-compile-output-docform
-- 
2.14.1



* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-08-20 22:05 ` npostavs
@ 2017-08-29 10:09   ` Ken Raeburn
  2017-08-31  0:50     ` npostavs
  0 siblings, 1 reply; 15+ messages in thread
From: Ken Raeburn @ 2017-08-29 10:09 UTC (permalink / raw)
  To: npostavs; +Cc: 27748

Sorry it’s taken me a while to get to testing these out…

On Aug 20, 2017, at 18:05, npostavs@users.sourceforge.net wrote:
> 
>> 1. defcustom doc strings from files compiled with lexical binding.

> With patch 0001 defcustoms which are compiled to bytecode now produce
> dynamic docstrings which make-doc can digest (note that I had to change
> make-doc a bit for this, but the .elc format remains the same as far as
> the Emacs loading it is concerned.  See the commit message for details).

I think I like the new format.  It’s a little bit bigger, but it may load faster, as we can do one big fseek at the beginning of the file (thus not even loading a lot of those pages) rather than lots of small ones as we go along.
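
A rough sketch of the single-seek idea, for illustration only:
`skip_dynamic_block' is a hypothetical helper, not code from lread.c
or make-docfile.c.  It assumes the stream is positioned at "#@N " and
that N counts the separating space, as the comment removed from
make-docfile.c in patch 0001 describes.

```c
#include <stdio.h>

/* Skip one "#@N " comment with a single fseek.
   Return 0 on success, nonzero if the prefix is malformed or the
   seek fails.  Assumption: N includes the space after the digits.  */
int
skip_dynamic_block (FILE *f)
{
  long n = 0;
  int digits = 0;
  int c;

  if (getc (f) != '#' || getc (f) != '@')
    return -1;
  while ((c = getc (f)) >= '0' && c <= '9')
    {
      n = n * 10 + (c - '0');
      digits++;
    }
  if (digits == 0 || c != ' ')
    return -1;
  /* The space has already been consumed, so N - 1 bytes remain.  */
  return fseek (f, n - 1, SEEK_CUR);
}
```

With all the docstrings in one block at the top of the file, a loader
would need only one such call rather than one per definition.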

Will this new make-docfile play nicely with files compiled with byte-compile-dynamic, where byte code is mixed in with the usual doc strings?  Or if we decide to make lambdas (which have “(fn…)” doc strings by default but have no names to associate with them in DOC) load their doc strings dynamically from the .elc file?

>> 2. In isearch, CL macro expansion results in symbols like
>>   isearch--state-forward--cmacro having function definitions that
>>   include the two doc strings "Access slot \"forward\" of
>>   `(isearch--state (:constructor nil) [...]" and
>>   "\n\n(fn CL-WHOLE-ARG CL-X)".  The former string is also in DOC (with
>>   "\n\n(fn CL-X)" appended) as the documentation for the function
>>   isearch--state-forward.

> I think it is just an oversight, since the string was put inside the
> cl-block it is not recognized as a docstring at all.  Patch 0002 drops
> the function's docstring from compiler-macro and adds a simple
> "compiler-macro for inlining `NAME'" instead.

I’m seeing the shorter string, *and* it’s stored in the DOC file.

>> 3. Undocumented functions, strange as it sounds... in files.el, function
>>   file-name-non-special is defined with no documentation.  In
>>   files.elc, the bytecode object constructed has a doc string "\n\n(fn
>>   OPERATION &rest ARGUMENTS)" instead of a (#$ . NNN) reference to a
>>   separate string that make-docfile can pick up.
> 
> Looks fairly trivial to fix, see patch 0003.

This one seems to be working well too.

I did a few spot checks looking at what wound up in the DOC file, and checking the ability to load the documentation in Emacs, and things look good.

Between these three changes, the byte count for lisp/*.elc grew by 2-3%.  The DOC file is about 6% bigger, due to the strings that weren’t being picked up before.  Curiously, the generated emacs executable was just a little bit bigger, less than 1%, but from the numbers returned by (garbage-collect) I think that’s probably freed space.

I tested the changes out against the scratch/raeburn-startup branch as well.  In that case, the saved Lisp environment (with the more costly parsing process than etc/DOC) shrinks by about 5%, which will probably speed up the startup load time a little bit. There are still some doc strings left, but I’ll look into those later.

Thanks for digging into this!

Ken





* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-08-29 10:09   ` Ken Raeburn
@ 2017-08-31  0:50     ` npostavs
  0 siblings, 0 replies; 15+ messages in thread
From: npostavs @ 2017-08-31  0:50 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: 27748

Ken Raeburn <raeburn@raeburn.org> writes:

> Sorry it’s taken me a while to get to testing these out…

Hah, no problem.  I confess it's been on my todo list to test out your
scratch/raeburn-startup branch for an even longer while...

> On Aug 20, 2017, at 18:05, npostavs@users.sourceforge.net wrote:
>> 
>>> 1. defcustom doc strings from files compiled with lexical binding.
>
>> With patch 0001 defcustoms which are compiled to bytecode now produce
>> dynamic docstrings which make-doc can digest (note that I had to change
>> make-doc a bit for this, but the .elc format remains the same as far as
>> the Emacs loading it is concerned.  See the commit message for details).
>
> I think I like the new format.  It’s a little bit bigger, but it may
> load faster, as we can do one big fseek at the beginning of the file
> (thus not even loading a lot of those pages) rather than lots of small
> ones as we go along.

Indeed, that was my thought too.  I haven't measured anything though.

> Will this new make-docfile play nicely with files compiled with
> byte-compile-dynamic, where byte code is mixed in with the usual doc
> strings?  Or if we decide to make lambdas (which have “(fn…)” doc
> strings by default but have no names to associate with them in DOC)
> load their doc strings dynamically from the .elc file?

Hmm, it will not.  We would have to add a "nameless" type I guess,
something like ^_A^_anonymous docstring here...^_.

I pushed patches [2: bc5d96a0b2] and [3: 160295867d] to master, since
they are pretty straightforward bugfixes.

[2: bc5d96a0b2]: 2017-08-30 20:07:39 -0400
  Drop docstrings from cl-defsubst produced inline bodies (Bug#27748)
  http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=bc5d96a0b2a1dccf7eeeec459e40d21b54c977f4

[3: 160295867d]: 2017-08-30 20:07:39 -0400
  Support lazy loading for autogenerated usage docstrings too (Bug#27748)
  http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=160295867de98241a16f2ede93da7e825ed4406b







* bug#27748: 26.0.50; doc strings should be in DOC file
  2017-08-13 18:04     ` npostavs
@ 2019-06-24 22:38       ` Lars Ingebrigtsen
  2019-06-24 23:00         ` Noam Postavsky
  0 siblings, 1 reply; 15+ messages in thread
From: Lars Ingebrigtsen @ 2019-06-24 22:38 UTC (permalink / raw)
  To: npostavs; +Cc: Ken Raeburn, 27748

npostavs@users.sourceforge.net writes:

> Okay, here is an alternate approach which decouples the docstring
> production from decompilation (note this isn't finished yet, I only
> implemented it for defcustom, so applying this patch currently prevents
> make-doc from finishing successfully).

These strings still aren't in the DOC file, but I'm not sure whether
that's a problem or not -- what are the practical effects of the doc
string of `delete-auto-save-files' not being in that file?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#27748: 26.0.50; doc strings should be in DOC file
  2019-06-24 22:38       ` Lars Ingebrigtsen
@ 2019-06-24 23:00         ` Noam Postavsky
  2020-08-10 15:00           ` Lars Ingebrigtsen
  2021-05-10 12:09           ` Lars Ingebrigtsen
  0 siblings, 2 replies; 15+ messages in thread
From: Noam Postavsky @ 2019-06-24 23:00 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Ken Raeburn, 27748

Lars Ingebrigtsen <larsi@gnus.org> writes:

> npostavs@users.sourceforge.net writes:
>
>> Okay, here is an alternate approach which decouples the docstring
>> production from decompilation (note this isn't finished yet, I only
>> implemented it for defcustom, so applying this patch currently prevents
>> make-doc from finishing successfully).
>
> These strings still aren't in the DOC file, but I'm not sure whether
> that's a problem or not -- what are the practical effects of the doc
> string of `delete-auto-save-files' not being in that file?

It's mostly just an optimization thing, so not hugely important (at the
time, Ken was working on a pure Lisp dumping strategy, so shrinking the
preloaded code was important for that, but we've gone with the pdumper
instead).  However, I seem to recall that applying something like this
will make it possible to solve Bug#4845, because the docstring loading
mechanism will no longer be reliant on finding "(defun foo" in the .elc
file.  So that might be nifty.







* bug#27748: 26.0.50; doc strings should be in DOC file
  2019-06-24 23:00         ` Noam Postavsky
@ 2020-08-10 15:00           ` Lars Ingebrigtsen
  2021-05-10 12:09           ` Lars Ingebrigtsen
  1 sibling, 0 replies; 15+ messages in thread
From: Lars Ingebrigtsen @ 2020-08-10 15:00 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Ken Raeburn, 27748

Noam Postavsky <npostavs@gmail.com> writes:

> It's mostly just an optimization thing, so not hugely important (at the
> time, Ken was working on a pure Lisp dumping strategy, so shrinking the
> preloaded code was important for that, but we've gone with the pdumper
> instead).  However, I seem to recall that applying something like this
> will make it possible to solve Bug#4845, because the docstring loading
> mechanism will no longer be reliant on finding "(defun foo" in the .elc
> file.  So that might be nifty.

Hm...  4845 is about leaking uninterned symbols, so that seems
unrelated?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#27748: 26.0.50; doc strings should be in DOC file
  2019-06-24 23:00         ` Noam Postavsky
  2020-08-10 15:00           ` Lars Ingebrigtsen
@ 2021-05-10 12:09           ` Lars Ingebrigtsen
  2021-09-25 15:41             ` Stefan Kangas
  1 sibling, 1 reply; 15+ messages in thread
From: Lars Ingebrigtsen @ 2021-05-10 12:09 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Ken Raeburn, Stefan Monnier, 27748

Noam Postavsky <npostavs@gmail.com> writes:

>> These strings still aren't in the DOC file, but I'm not sure whether
>> that's a problem or not -- what are the practical effects of the doc
>> string of `delete-auto-save-files' not being in that file?
>
> It's mostly just an optimization thing, so not hugely important (at the
> time, Ken was working on a pure Lisp dumping strategy, so shrinking the
> preloaded code was important for that, but we've gone with the pdumper
> instead).  However, I seem to recall that applying something like this
> will make it possible to solve Bug#4845, because the docstring loading
> mechanism will no longer be reliant on finding "(defun foo" in the .elc
> file.  So that might be nifty.

Stefan M recently suggested getting rid of the DOC file entirely, so I
wonder whether he has any comments here.  (Added to the CCs.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#27748: 26.0.50; doc strings should be in DOC file
  2021-05-10 12:09           ` Lars Ingebrigtsen
@ 2021-09-25 15:41             ` Stefan Kangas
  2021-09-26  5:32               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Kangas @ 2021-09-25 15:41 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Ken Raeburn, Noam Postavsky, Stefan Monnier, 27748

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Noam Postavsky <npostavs@gmail.com> writes:
>
>>> These strings still aren't in the DOC file, but I'm not sure whether
>>> that's a problem or not -- what are the practical effects of the doc
>>> string of `delete-auto-save-files' not being in that file?
>>
>> It's mostly just an optimization thing, so not hugely important (at the
>> time, Ken was working on a pure Lisp dumping strategy, so shrinking the
>> preloaded code was important for that, but we've gone with the pdumper
>> instead).  However, I seem to recall that applying something like this
>> will make it possible to solve Bug#4845, because the docstring loading
>> mechanism will no longer be reliant on finding "(defun foo" in the .elc
>> file.  So that might be nifty.
>
> Stefan M recently suggested getting rid of the DOC file entirely, so I
> wonder whether he has any comments here.  (Added to the CCs.)

FWIW, I think we should get rid of it.  Stefan M did an analysis of this,
and the amount of memory saved was very small.  From bytecomp.el:

  ;; For the compilation itself, we could largely get rid of this hunk-handler,
  ;; if it weren't for the fact that we need to figure out when a defalias
  ;; defines a macro, so as to add it to byte-compile-macro-environment.
  ;;
  ;; FIXME: we also use this hunk-handler to implement the function's dynamic
  ;; docstring feature.  We could actually implement it more elegantly in
  ;; byte-compile-lambda so it applies to all lambdas, but the problem is that
  ;; the resulting .elc format will not be recognized by make-docfile, so
  ;; either we stop using DOC for the docstrings of preloaded elc files (at the
  ;; cost of around 24KB on 32bit hosts, double on 64bit hosts) or we need to
  ;; build DOC in a more clever way (e.g. handle anonymous elements).

* bug#27748: 26.0.50; doc strings should be in DOC file
  2021-09-25 15:41             ` Stefan Kangas
@ 2021-09-26  5:32               ` Lars Ingebrigtsen
  2021-10-23 17:32                 ` Stefan Kangas
  0 siblings, 1 reply; 15+ messages in thread
From: Lars Ingebrigtsen @ 2021-09-26  5:32 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: Ken Raeburn, Noam Postavsky, Stefan Monnier, 27748

Stefan Kangas <stefan@marxist.se> writes:

> FWIW, I think we should get rid of it.  Stefan M did an analysis of this,
> and the amount of memory saved was very small.  From bytecomp.el:

[...]

>   ;; (at the cost of around 24KB on 32bit hosts, double on 64bit
>   ;; hosts)

Yes, that seems pretty insignificant.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* bug#27748: 26.0.50; doc strings should be in DOC file
  2021-09-26  5:32               ` Lars Ingebrigtsen
@ 2021-10-23 17:32                 ` Stefan Kangas
  2021-10-24 12:59                   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Kangas @ 2021-10-23 17:32 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 27748, Ken Raeburn, Noam Postavsky, Stefan Monnier

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Stefan Kangas <stefan@marxist.se> writes:
>
>> FWIW, I think we should get rid of it.  Stefan M did an analysis of this,
>> and the amount of memory saved was very small.  From bytecomp.el:
>
> [...]
>
>>   ;; (at the cost of around 24KB on 32bit hosts, double on 64bit
>>   ;; hosts)
>
> Yes, that seems pretty insignificant.

If getting rid of DOC is what we want to eventually do, we should
probably just close this as wontfix.  It hardly seems worth fixing such
problems in an area that we basically want to get rid of anyway.

* bug#27748: 26.0.50; doc strings should be in DOC file
  2021-10-23 17:32                 ` Stefan Kangas
@ 2021-10-24 12:59                   ` Lars Ingebrigtsen
  0 siblings, 0 replies; 15+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-24 12:59 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: 27748, Ken Raeburn, Noam Postavsky, Stefan Monnier

Stefan Kangas <stefan@marxist.se> writes:

> If getting rid of DOC is what we want to eventually do, we should
> probably just close this as wontfix.  It hardly seems worth fixing such
> problems in an area that we basically want to get rid of anyway.

Yup.  So I'm closing this bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Thread overview: 15+ messages
2017-07-18  6:47 bug#27748: 26.0.50; doc strings should be in DOC file Ken Raeburn
2017-08-06  0:09 ` npostavs
2017-08-08  1:03   ` npostavs
2017-08-13 18:04     ` npostavs
2019-06-24 22:38       ` Lars Ingebrigtsen
2019-06-24 23:00         ` Noam Postavsky
2020-08-10 15:00           ` Lars Ingebrigtsen
2021-05-10 12:09           ` Lars Ingebrigtsen
2021-09-25 15:41             ` Stefan Kangas
2021-09-26  5:32               ` Lars Ingebrigtsen
2021-10-23 17:32                 ` Stefan Kangas
2021-10-24 12:59                   ` Lars Ingebrigtsen
2017-08-20 22:05 ` npostavs
2017-08-29 10:09   ` Ken Raeburn
2017-08-31  0:50     ` npostavs
