unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* [ELPA] Update proposal: psgml: asl-like parsing in PI
@ 2017-11-20  2:25 Lucien Pullen
  2017-11-22 22:15 ` Stefan Monnier
  0 siblings, 1 reply; 4+ messages in thread
From: Lucien Pullen @ 2017-11-20  2:25 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1046 bytes --]

I was wanting to do some processing of processing instructions 
similar to how <?PSGML ELEMENT> works without having to write my 
own SGML parser, since there's already one loaded.  I wrote a 
slight adjustment that separates the SGML parsing from the 
handling of the PI text and made it available as a standalone 
function (sgml-pi-asl-parser).

It works like attribute specification list parsing but also allows 
literals without a name= and when the format is name=value you can 
specify a function to call to parse the value instead of 
sgml-parse-literal or sgml-parse-name.

Instead of operating immediately on the values it returns a list 
of strings and (name . value) pairs that the handler function can 
operate on instead, with sgml--pi-element-handler as an example.

See the attached patch file for specifics.  I'd appreciate some 
feedback on the docstring and the possibility of turning on and 
off floating literals as an optional argument so it could be used 
as the parser for sgml-parse-attribute-specification-list as well.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-psgml-parse.el-sgml-pi-asl-parser-a-procedure-for-PI.patch --]
[-- Type: text/x-patch, Size: 3216 bytes --]

From daa7b2c0919ca39b374cd2cf25176f089b89bf38 Mon Sep 17 00:00:00 2001
From: Lucien Pullen <drurowin@gmail.com>
Date: Sun, 19 Nov 2017 18:27:03 -0700
Subject: [PATCH] * psgml-parse.el (sgml-pi-asl-parser): a procedure for PI
 handlers

A convenience function for parsing the text of a PI like an attribute
specification list with a couple syntax enhancements.  May be called to
parse the PI instead of using the string argument to sgml-pi-function.

The parsing work of sgml--pi-element-handler made available for other
processing.
---
 psgml-parse.el | 52 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/psgml-parse.el b/psgml-parse.el
index 5609767..0710287 100644
--- a/psgml-parse.el
+++ b/psgml-parse.el
@@ -1620,27 +1620,47 @@ in any of them."
                (sgml-log-warning "Unknown processing instruction for PSGML: %s"
                                  command)))))))
 
-
-(defun sgml--pi-element-handler ()
+(defun sgml-parse-name-or-literal ()
   (sgml-parse-s)
-  (let ((eltype (sgml-lookup-eltype (sgml-parse-name)))
-        name value)
+  (if (looking-at "['\"]")
+      (sgml-parse-literal)
+    (sgml-parse-name)))
+
+(defun sgml-pi-asl-parser (&optional value-fn)
+  "Parse the processing instruction like an attribute specification list.
+
+Returns a list of strings and (name-string . value-string) pairs
+in the order they were written in; atom strings when there is no
+value indicator and conses when there is.
+
+As an extension, the atom string can appear as a literal instead
+of only a token.
+
+As an extension, the value of a name-value pair can be parsed by
+calling value-fn with no arguments instead of as a token or
+literal."
+  (let ((acc '())
+        name val)
     (sgml-parse-s)
-    (while (setq name (sgml-parse-name))
-      ;; FIXME: check name not reserved
+    (while (setq name (sgml-parse-name-or-literal))
       (sgml-parse-s)
       (cond ((sgml-parse-delim "VI")
-             (sgml-parse-s)
-             (setq value
-                   (if (looking-at "['\"]")
-                       (sgml-parse-literal)
-                     (read (current-buffer)))))
-            (t
-             (setq value t)))
-      (message "%s = %S" name value)
-      (setf (sgml-eltype-appdata eltype (intern (downcase name))) value)
-      (sgml-parse-s))))
+             (setq val (funcall (if value-fn value-fn 'sgml-parse-name-or-literal))))
+            (t (setq val nil)))
+      (push (if (null val) name (cons name val)) acc)
+      (sgml-parse-s))
+    (nreverse acc)))
 
+(defun sgml--pi-element-handler ()
+  (destructuring-bind (elname &rest options)
+      (sgml-pi-asl-parser (lambda () (read (current-buffer))))
+    (let ((eltype (sgml-lookup-eltype elname)))
+      (dolist (option options)
+        ;; FIXME: check name not reserved
+        (let ((name (if (consp option) (car option) option))
+              (value (if (consp option) (cdr option) t)))
+          (message "%s = %S" name value)
+          (setf (sgml-eltype-appdata eltype (intern (downcase name))) value))))))
 
 ;;[lenst/1998-03-09 19:52:08]  Perhaps not the right place
 (defun sgml-general-insert-case (text)
-- 
2.3.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [ELPA] Update proposal: psgml: asl-like parsing in PI
  2017-11-20  2:25 [ELPA] Update proposal: psgml: asl-like parsing in PI Lucien Pullen
@ 2017-11-22 22:15 ` Stefan Monnier
  2017-11-27  5:11   ` Lucien Pullen
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Monnier @ 2017-11-22 22:15 UTC (permalink / raw)
  To: emacs-devel; +Cc: Lennart Staflin

Hi,

[ I Cc'd Lennart, but I don't think he reads this email address any more,
  and AFAIK he's not interested in maintaining PSGML any more.
  If you're interested, you could be the new maintainer.  ]

> Instead of operating immediately on the values it returns a list of strings
> and (name . value) pairs that the handler function can operate on instead,
> with sgml--pi-element-handler as an example.

Sounds good.

> See the attached patch file for specifics.  I'd appreciate some feedback on
> the docstring and the possibility of turning on and off floating literals as
> an optional argument so it could be used as the parser for
> sgml-parse-attribute-specification-list as well.

Not sure what you mean by "turning on and off floating literals".
[ I've taken charge of including psgml into GNU ELPA and cleanup the
  code for use in Emacs-24, but I'm generally not familiar with the code
  and I've never used PSGML myself.  ]

> -(defun sgml--pi-element-handler ()
> +(defun sgml-parse-name-or-literal ()
>    (sgml-parse-s)
> -  (let ((eltype (sgml-lookup-eltype (sgml-parse-name)))
> -        name value)
> +  (if (looking-at "['\"]")
> +      (sgml-parse-literal)
> +    (sgml-parse-name)))

AFAICT the old code used (read (current-buffer)) instead of
(sgml-parse-name) here.  What's the implication of this change?

> +(defun sgml-pi-asl-parser (&optional value-fn)

I'd personally just make `value-fn` into a mandatory argument.
[ Which also justifies using the name `sgml-parse-name-or-literal`
  without double dash, since that function might be useful for external
  callers to provide as value-fn argument.
  But sgml-parse-name-or-literal probably needs a docstring then.  ]

> +  "Parse the processing instruction like an attribute specification list.
> +
> +Returns a list of strings and (name-string . value-string) pairs
> +in the order they were written in; atom strings when there is no
> +value indicator and conses when there is.

We use all-caps names as metavariables in docstrings.  So I'd write it
something like:

    Return a list of elements of the form (NAME . VALUE) or just NAME
    when there was no value indicator.

> +As an extension, the atom string can appear as a literal instead
> +of only a token.

I don't know what that means.

> +As an extension, the value of a name-value pair can be parsed by
> +calling value-fn with no arguments instead of as a token or
> +literal."

Usually we use the following wording:

   Optional argument VALUE-FN when provided is called with no argument
   to parse the value part of a name-value pair, instead of parsing it
   as a token or a literal.  It should return the VALUE represented as
   a string.

>        (cond ((sgml-parse-delim "VI")
> +             (setq val (funcall (if value-fn value-fn 'sgml-parse-name-or-literal))))
> +            (t (setq val nil)))

(if A A B) is better written (or A B):

          (cond ((sgml-parse-delim "VI")
                 (setq val (funcall (or value-fn #'sgml-parse-name-or-literal))))
                (t (setq val nil)))

And the `setq val` can be hoisted outside the conditional:

          (setq val (cond ((sgml-parse-delim "VI")
                           (funcall (or value-fn #'sgml-parse-name-or-literal)))
                          (t nil)))

And the trailing (t nil) can be dropped:

          (setq val (if (sgml-parse-delim "VI")
                        (funcall (or value-fn #'sgml-parse-name-or-literal))))

Some people may prefer `when` over `if` here.

Also, rather than let-bind `val` outside the loop and `setq` it at each
turn, we can do:

        (let ((val (if (sgml-parse-delim "VI")
                       (funcall (or value-fn #'sgml-parse-name-or-literal)))))
          (push (if (null value) name (cons name value)) acc))

With the use of `lexical-binding` this form is just as efficient.

> +      (push (if (null val) name (cons name val)) acc)

AFAICT the old code used `t` rather than `nil` to mark the "no
value" case.  Is there a risk that sgml-parse-name-or-literal can
return nil?  If so, is the resulting change in behavior on purpose?

If it can't return nil, then the previous code can be simplified further
to have a single `if` test.
        
> +(defun sgml--pi-element-handler ()
> +  (destructuring-bind (elname &rest options)

the new psgml-parse.el code uses `cl-lib` rather than `cl` so the above
needs to be changed to `cl-destructuring-bind`.


        Stefan




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [ELPA] Update proposal: psgml: asl-like parsing in PI
  2017-11-22 22:15 ` Stefan Monnier
@ 2017-11-27  5:11   ` Lucien Pullen
  2017-11-27 21:20     ` Stefan Monnier
  0 siblings, 1 reply; 4+ messages in thread
From: Lucien Pullen @ 2017-11-27  5:11 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 6947 bytes --]

Also sprach Stefan Monnier on 2017-11-22:
> [ I Cc'd Lennart, but I don't think he reads this email address 
> any more,
>   and AFAIK he's not interested in maintaining PSGML any more.
>   If you're interested, you could be the new maintainer.  ]
I certainly use it enough and have gone digging into it.  Let me 
consider the proposal.

>> See the attached patch file for specifics.  I'd appreciate some 
>> feedback on
>> the docstring and the possibility of turning on and off 
>> floating literals as
>> an optional argument so it could be used as the parser for
>> sgml-parse-attribute-specification-list as well.
>
> Not sure what you mean by "turning on and off floating 
> literals".

>> +As an extension, the atom string can appear as a literal 
>> instead
>> +of only a token.
>
> I don't know what that means.

It's formal names of SGML syntax, except the 'floating' part.  I 
don't have a copy of ISO 8859 in front of me to give the correct 
name for the "attribute specification without name and value 
indicator" syntax so I called it 'floating'.

There was talk of amending SGML to allow

    <!ATTLIST my-el data CDATA #IMPLIED>

to be minimized as

    <my-el "My data= value">

when there is only one CDATA attribute, like with enumerated 
tokens, but nothing appears to have come of it.

This also lets you write '<?my-pi "data"="My data= value">' if you 
wanted, which is illegal if a start tag.  Perhaps it is better to 
forbid this syntax in a PI parser for SGML attribute specification 
list syntax since it contradicts SGML's attribute specification 
list syntax.

>> -(defun sgml--pi-element-handler ()
>> +(defun sgml-parse-name-or-literal ()
>>    (sgml-parse-s)
>> -  (let ((eltype (sgml-lookup-eltype (sgml-parse-name)))
>> -        name value)
>> +  (if (looking-at "['\"]")
>> +      (sgml-parse-literal)
>> +    (sgml-parse-name)))

(
This new function has nothing to do with the existing function 
except by git's insistence on making a small patch file that works 
by location of character strings and not lisp syntax.
)

In SGML you can't write '<el "value">' as an alternative to '<el 
value>' because one is a literal and the other is a name token. 
You can write '<el name="value">', though, with or without the 
quotes, since "value" appears in the value position of the 
attribute specification, and psgml calls this frequently enough 
but doesn't have a function for it.

> AFAICT the old code used (read (current-buffer)) instead of
> (sgml-parse-name) here.  What's the implication of this change?
>
>> +(defun sgml-pi-asl-parser (&optional value-fn)
>
> I'd personally just make `value-fn` into a mandatory argument.

The old code uses the following form:

<?PSGML ELEMENT MY-EL face=(elisp face forms)>

instead of

<?PSGML ELEMENT MY-EL face="(elisp face forms)">

so it mixes SGML parsing and elisp parsing.  I made `value-fn' a 
parameter so the same procedure could be used to read regular SGML 
and SGML+Elisp.  Pretty much the only thing I can think of it 
being used for is this one case.

> [ Which also justifies using the name 
> `sgml-parse-name-or-literal`
>   without double dash, since that function might be useful for 
>   external
>   callers to provide as value-fn argument.
>   But sgml-parse-name-or-literal probably needs a docstring 
>   then.  ]

Almost none of the other parse functions have docstrings.  For 
some reason `sgml-parse-minimum-literal' does.  Added one in its 
style.

>>        (cond ((sgml-parse-delim "VI")
>> +             (setq val (funcall (if value-fn value-fn 
>> 'sgml-parse-name-or-literal))))
>> +            (t (setq val nil)))
>
> (if A A B) is better written (or A B):

Can facepalms work by homeopathy?  Surely my screen has the 
negative imprint of one to prevent future forgetfulness of logic 
macros.

>           (cond ((sgml-parse-delim "VI")
>                  (setq val (funcall (or value-fn 
>                  #'sgml-parse-name-or-literal))))
>                 (t (setq val nil)))
>
> And the `setq val` can be hoisted outside the conditional:
>
>           (setq val (cond ((sgml-parse-delim "VI")
>                            (funcall (or value-fn 
>                            #'sgml-parse-name-or-literal)))
>                           (t nil)))
>
> And the trailing (t nil) can be dropped:
>
>           (setq val (if (sgml-parse-delim "VI")
>                         (funcall (or value-fn 
>                         #'sgml-parse-name-or-literal))))
>
> Some people may prefer `when` over `if` here.
>
> Also, rather than let-bind `val` outside the loop and `setq` it 
> at each
> turn, we can do:
>
>         (let ((val (if (sgml-parse-delim "VI")
>                        (funcall (or value-fn 
>                        #'sgml-parse-name-or-literal)))))
>           (push (if (null value) name (cons name value)) acc))
>
> With the use of `lexical-binding` this form is just as 
> efficient.

This is mostly from the original.  I didn't do much style 
rewriting, and kinda copied the structure without much thinking. 
I think I wrote the patch before going to work; it needs cleanup.

>> +      (push (if (null val) name (cons name val)) acc)
>
> AFAICT the old code used `t` rather than `nil` to mark the "no
> value" case.  Is there a risk that sgml-parse-name-or-literal 
> can
> return nil?  If so, is the resulting change in behavior on 
> purpose?
>
> If it can't return nil, then the previous code can be simplified 
> further
> to have a single `if` test.

sgml-parse-name-or-literal is convenience for "if looking at a LIT 
or LITA use parse literal code otherwise use parse token code", 
which always returns a string (I think).  The function in question 
is `sgml-pi-asl-parser' that returns the list of strings and 
conses.

I'm reading over the code again and saving "token" and "token=nil" 
as "token" (but ONLY when using a parser other than 
`sgml-parse-name-or-literal', since that will return "nil" and not 
'nil) in the return is incorrect.

    <?PSGML ELEMENT my-el face=nil>

I was probably thinking in terms of "bool-option (bool-option) 
#IMPLIED", where, when omitted, the attribute is false and true 
otherwise.  Changed it to "(let ((default (cons nil nil))) ...)" 
style to catch that.

>> +(defun sgml--pi-element-handler ()
>> +  (destructuring-bind (elname &rest options)
>
> the new psgml-parse.el code uses `cl-lib` rather than `cl` so 
> the above
> needs to be changed to `cl-destructuring-bind`.

Whoops.  Normally I program CL and this doesn't show up as a 
warning while compiling.



Here is a patch showing the new changes.  I haven't made note that 
you can use `sgml-pi-asl-parser' in an `sgml-pi-function' 
procedure, and `sgml-do-processing-instruction' likely needs 
modified to make this work:

Instead of calling `narrow-to-region' in `sgml--pi-psgml-handler' 
call it like this instead?

(let ((next (point)))
  (save-restriction ... (cond (psgml-pi ...) ...))
  (goto-char next))




[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/patch, Size: 3063 bytes --]

From a3670a3ebb64496b868bc10ec997e9cc31bc900a Mon Sep 17 00:00:00 2001
From: Lucien Pullen <drurowin@gmail.com>
Date: Sun, 26 Nov 2017 21:49:32 -0700
Subject: [PATCH 2/2] * psgml-parse.el: clean up PI parser code

(sgml-parse-name-or-literal): Add a docstring.

(sgml-pi-asl-parser): Follow Elisp conventions for the docstring.  Write
about literals in terms of SGML syntax.

Use functional style in loop body instead of declarative.

Differentiate the value parser returning 'nil and the name=value form
not being used.

(sgml--pi-element-handler): `cl-lib' library prefixes names with "cl-".
---
 psgml-parse.el | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/psgml-parse.el b/psgml-parse.el
index 0710287..3cd762b 100644
--- a/psgml-parse.el
+++ b/psgml-parse.el
@@ -1621,6 +1621,7 @@ in any of them."
                                  command)))))))
 
 (defun sgml-parse-name-or-literal ()
+  "Parse an SGML token or literal and return it as a string."
   (sgml-parse-s)
   (if (looking-at "['\"]")
       (sgml-parse-literal)
@@ -1629,30 +1630,32 @@ in any of them."
 (defun sgml-pi-asl-parser (&optional value-fn)
   "Parse the processing instruction like an attribute specification list.
 
-Returns a list of strings and (name-string . value-string) pairs
-in the order they were written in; atom strings when there is no
-value indicator and conses when there is.
+Returns a list of VALUE and (NAME . VALUE) pairs in the order
+they were written in; atom strings when there is no value
+indicator and conses when there is.
 
-As an extension, the atom string can appear as a literal instead
-of only a token.
+  The NAME and VALUE can be an SGML literal.  This is an
+  extension to SGML syntax that requires NAME and a VALUE without
+  a 'name=' to be a name token.
 
-As an extension, the value of a name-value pair can be parsed by
-calling value-fn with no arguments instead of as a token or
-literal."
+The optional argument VALUE-FN (default
+`sgml-parse-name-or-literal') is called with zero arguments to
+parse the value part of a name=value pair."
   (let ((acc '())
-        name val)
+        name
+        (implied (cons nil nil)))
     (sgml-parse-s)
     (while (setq name (sgml-parse-name-or-literal))
       (sgml-parse-s)
-      (cond ((sgml-parse-delim "VI")
-             (setq val (funcall (if value-fn value-fn 'sgml-parse-name-or-literal))))
-            (t (setq val nil)))
-      (push (if (null val) name (cons name val)) acc)
+      (let ((val (if (sgml-parse-delim "VI")
+                     (funcall (or value-fn 'sgml-parse-name-or-literal))
+                   implied)))
+        (push (if (eq val implied) name (cons name val)) acc))
       (sgml-parse-s))
     (nreverse acc)))
 
 (defun sgml--pi-element-handler ()
-  (destructuring-bind (elname &rest options)
+  (cl-destructuring-bind (elname &rest options)
       (sgml-pi-asl-parser (lambda () (read (current-buffer))))
     (let ((eltype (sgml-lookup-eltype elname)))
       (dolist (option options)
-- 
2.3.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [ELPA] Update proposal: psgml: asl-like parsing in PI
  2017-11-27  5:11   ` Lucien Pullen
@ 2017-11-27 21:20     ` Stefan Monnier
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 2017-11-27 21:20 UTC (permalink / raw)
  To: emacs-devel

>> [ I Cc'd Lennart, but I don't think he reads this email address any more,
>> and AFAIK he's not interested in maintaining PSGML any more.
>> If you're interested, you could be the new maintainer.  ]
> I certainly use it enough and have gone digging into it.  Let me consider
> the proposal.

I installed your patch(es), thank you very much.

Note that they're pushing the limit of what we consider acceptable
without paperwork.  If you intend to contribute further changes (and
a fortiori if you intend to become maintainer), you'll need to fill the
form below and send it as instructed to the FSF so they can send you the
appropriate paperwork to sign.


        Stefan


Please email the following information to assign@gnu.org, and we
will send you the assignment form for your past and future changes.

Please use your full legal name (in ASCII characters) as the subject
line of the message.
----------------------------------------------------------------------
REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES

[What is the name of the program or package you're contributing to?]
Emacs

[Did you copy any files or text written by someone else in these changes?
Even if that material is free software, we need to know about it.]


[Do you have an employer who might have a basis to claim to own
your changes?  Do you attend a school which might make such a claim?]


[For the copyright registration, what country are you a citizen of?]


[What year were you born?]


[Please write your email address here.]


[Please write your postal address here.]





[Which files have you changed so far, and which new files have you written
so far?]




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-11-27 21:20 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-20  2:25 [ELPA] Update proposal: psgml: asl-like parsing in PI Lucien Pullen
2017-11-22 22:15 ` Stefan Monnier
2017-11-27  5:11   ` Lucien Pullen
2017-11-27 21:20     ` Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).