unofficial mirror of emacs-devel@gnu.org 
* nxml-mode parser and multi major modes
@ 2008-05-27 22:11 Lennart Borgman (gmail)
  2008-05-27 22:21 ` Jason Rumney
  0 siblings, 1 reply; 5+ messages in thread
From: Lennart Borgman (gmail) @ 2008-05-27 22:11 UTC (permalink / raw)
  To: Emacs Devel, Daniel Colascione

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

Since Daniel started to do some work on nxml-mode, I decided to take a 
new look at the possibility of taming the nxml-mode parser so that it 
only looks at those pieces of the buffer that it should care about.

To my surprise it was much easier now :-)

I have attached the relevant pieces from rng-valid.el. I think this is 
much easier to understand as one piece than as a diff.

I believe it might need to be combined with Daniel's code, but I am not 
sure. So I am sending it here in case someone has a comment or can make 
it better/faster.

[-- Attachment #2: rng-valid-for-mumamo.el --]
[-- Type: text/plain, Size: 7571 bytes --]


(defvar rng-get-major-mode-chunk-function nil
  "Function to use to get major mode chunk.
It should take one argument, the point where to get the major mode chunk.

This is to be set by multiple major mode frameworks, like
mumamo.

See also `rng-valid-nxml-major-mode-chunk-function' and
`rng-end-major-mode-chunk-function'. Note that all three
variables must be set.")
(make-variable-buffer-local 'rng-get-major-mode-chunk-function)
(put 'rng-get-major-mode-chunk-function 'permanent-local t)

(defvar rng-valid-nxml-major-mode-chunk-function nil
  "Function to use to check if nxml can parse major mode chunk.
It should take one argument, the chunk.

For more info see also `rng-get-major-mode-chunk-function'.")
(make-variable-buffer-local 'rng-valid-nxml-major-mode-chunk-function)
(put 'rng-valid-nxml-major-mode-chunk-function 'permanent-local t)

(defvar rng-end-major-mode-chunk-function nil
  "Function to use to get the end of a major mode chunk.
It should take one argument, the chunk.

For more info see also `rng-get-major-mode-chunk-function'.")
(make-variable-buffer-local 'rng-end-major-mode-chunk-function)
(put 'rng-end-major-mode-chunk-function 'permanent-local t)

(defun rng-do-some-validation-1 (&optional continue-p-function)
  (let (major-mode-chunk
        end-major-mode-chunk
        (limit (+ rng-validate-up-to-date-end
		  rng-validate-chunk-size))
	(remove-start rng-validate-up-to-date-end)
	(next-cache-point (+ (point) rng-state-cache-distance))
	(continue t)
	(xmltok-dtd rng-dtd)
	have-remaining-chars
	xmltok-type
	xmltok-start
	xmltok-name-colon
	xmltok-name-end
	xmltok-replacement
	xmltok-attributes
	xmltok-namespace-attributes
	xmltok-dependent-regions
	xmltok-errors
        )
    ;;(message ">>>>>>>>> here -1, p=%s" (point)) ;;(sit-for 4)
    (when (and continue (= (point) 1))
      (let ((regions (xmltok-forward-prolog)))
	(rng-clear-overlays 1 (point))
	(while regions
	  (when (eq (aref (car regions) 0) 'encoding-name)
	    (rng-process-encoding-name (aref (car regions) 1)
				       (aref (car regions) 2)))
	  (setq regions (cdr regions))))
      (unless (equal rng-dtd xmltok-dtd)
	(rng-clear-conditional-region))
      (setq rng-dtd xmltok-dtd))
    (while continue
      ;; If mumamo (or something similar) is used then jump over parts
      ;; that can not be parsed by nxml-mode.
      (when (and rng-get-major-mode-chunk-function
                 rng-valid-nxml-major-mode-chunk-function
                 rng-end-major-mode-chunk-function)
        (let ((here (point))
              next-non-space-pos)
          (skip-chars-forward " \t\r\n")
          (setq next-non-space-pos (point))
          (goto-char here)
          ;;(message "here when, p=%s emmc=%s non-space=%s" (point) end-major-mode-chunk next-non-space-pos) ;;(sit-for 4)
          (unless (and end-major-mode-chunk
                       ;; Remaining chars in this chunk?
                       (< next-non-space-pos end-major-mode-chunk))
            (setq end-major-mode-chunk nil)
            (setq major-mode-chunk (funcall rng-get-major-mode-chunk-function next-non-space-pos))
            (while (and major-mode-chunk
                        (not (funcall rng-valid-nxml-major-mode-chunk-function major-mode-chunk))
                        (< next-non-space-pos (point-max)))
              (let ((end-pos (funcall rng-end-major-mode-chunk-function major-mode-chunk)))
                (goto-char (+ end-pos 0))
                (setq major-mode-chunk (funcall rng-get-major-mode-chunk-function (point)))
                ;;(message "---> here 3, point=%s, ep=%s, mm-chunk=%s" (point) end-pos major-mode-chunk)
                )
              (setq next-non-space-pos (point))))
          ;; Stop parsing if we do not have a chunk here yet.
          (setq continue (and major-mode-chunk
                              (funcall rng-valid-nxml-major-mode-chunk-function major-mode-chunk)))
          (when continue
            ;;(message "  continue=t")
            (setq end-major-mode-chunk (funcall rng-end-major-mode-chunk-function major-mode-chunk)))))

      (when continue
        ;;(message "*** here remain, p=%s" (point))
        (setq have-remaining-chars (rng-forward end-major-mode-chunk))
        ;;(message "*** here remain b, p=%s" (point))
        (let ((pos (point)))
          (when end-major-mode-chunk
            ;; Fix-me: Seems like we need a new initialization (or why
            ;; do we otherwise hang without this?)
            (and (> limit end-major-mode-chunk) (setq limit end-major-mode-chunk)))
          (setq continue
                (and have-remaining-chars
                     continue
                     (or (< pos limit)
                         (and continue-p-function
                              (funcall continue-p-function)
                              (setq limit (+ limit rng-validate-chunk-size))
                              t))))
          (cond ((and rng-conditional-up-to-date-start
                      ;; > because we are getting the state from (1- pos)
                      (> pos rng-conditional-up-to-date-start)
                      (< pos rng-conditional-up-to-date-end)
                      (rng-state-matches-current (get-text-property (1- pos)
                                                                    'rng-state)))
                 (when (< remove-start (1- pos))
                   (rng-clear-cached-state remove-start (1- pos)))
                 ;; sync up with cached validation state
                 (setq continue nil)
                 ;; do this before setting rng-validate-up-to-date-end
                 ;; in case we get a quit
                 (rng-mark-xmltok-errors)
                 (rng-mark-xmltok-dependent-regions)
                 (setq rng-validate-up-to-date-end
                       (marker-position rng-conditional-up-to-date-end))
                 (rng-clear-conditional-region)
                 (setq have-remaining-chars
                       (< rng-validate-up-to-date-end (point-max))))
                ((or (>= pos next-cache-point)
                     (not continue))
                 (setq next-cache-point (+ pos rng-state-cache-distance))
                 (rng-clear-cached-state remove-start pos)
                 (when have-remaining-chars
                   ;;(message "rng-cach-state (1- %s)" pos)
                   (rng-cache-state (1- pos)))
                 (setq remove-start pos)
                 (unless continue
                   ;; if we have just blank chars skip to the end
                   (when have-remaining-chars
                     (skip-chars-forward " \t\r\n")
                     (when (= (point) (point-max))
                       (rng-clear-overlays pos (point))
                       (rng-clear-cached-state pos (point))
                       (setq have-remaining-chars nil)
                       (setq pos (point))))
                   (when (not have-remaining-chars)
                     (rng-process-end-document))
                   (rng-mark-xmltok-errors)
                   (rng-mark-xmltok-dependent-regions)
                   (setq rng-validate-up-to-date-end pos)
                   (when rng-conditional-up-to-date-end
                     (cond ((<= rng-conditional-up-to-date-end pos)
                            (rng-clear-conditional-region))
                           ((< rng-conditional-up-to-date-start pos)
                            (set-marker rng-conditional-up-to-date-start
                                        pos))))))))))
    have-remaining-chars))
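
A framework such as mumamo would be expected to set all three variables in each buffer it manages. The following is only a minimal sketch of that wiring, not mumamo's actual code: the chunk representation (a list of START END MAJOR-MODE) and every `demo-` name are assumptions made for illustration.

```elisp
(require 'cl-lib)

;; The three hook variables come from the patch above.  These defvars
;; are no-ops when a patched rng-valid.el is already loaded.
(defvar rng-get-major-mode-chunk-function nil)
(defvar rng-valid-nxml-major-mode-chunk-function nil)
(defvar rng-end-major-mode-chunk-function nil)

;; Hypothetical chunk table: a list of (START END MAJOR-MODE) sorted by
;; START, with END exclusive.  A real framework computes this itself.
(defvar-local demo-chunks nil)

(defun demo-get-chunk (pos)
  "Return the chunk of `demo-chunks' that contains POS, or nil."
  (cl-find-if (lambda (chunk)
                (and (<= (nth 0 chunk) pos) (< pos (nth 1 chunk))))
              demo-chunks))

(defun demo-chunk-nxml-p (chunk)
  "Return non-nil if CHUNK may be handed to the nxml-mode parser."
  (eq (nth 2 chunk) 'nxml-mode))

(defun demo-chunk-end (chunk)
  "Return the (exclusive) end position of CHUNK."
  (nth 1 chunk))

(defun demo-install-chunk-functions ()
  "Set all three hook variables in the current buffer."
  (setq-local rng-get-major-mode-chunk-function #'demo-get-chunk)
  (setq-local rng-valid-nxml-major-mode-chunk-function #'demo-chunk-nxml-p)
  (setq-local rng-end-major-mode-chunk-function #'demo-chunk-end))
```

Since the validation loop only runs the chunk logic when all three variables are non-nil, installing them together in one function avoids a half-configured buffer.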


* Re: nxml-mode parser and multi major modes
  2008-05-27 22:11 nxml-mode parser and multi major modes Lennart Borgman (gmail)
@ 2008-05-27 22:21 ` Jason Rumney
  2008-05-27 23:38   ` Lennart Borgman (gmail)
  0 siblings, 1 reply; 5+ messages in thread
From: Jason Rumney @ 2008-05-27 22:21 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Daniel Colascione, Emacs Devel

Lennart Borgman (gmail) wrote:
> I have attached the relevant pieces from rng-valid.el. I think this is 
> much easier to understand as one piece than as a diff.

It may be so for you, who wrote the changes, but for anyone else who 
does not know what you've changed, a diff is much easier.






* Re: nxml-mode parser and multi major modes
  2008-05-27 22:21 ` Jason Rumney
@ 2008-05-27 23:38   ` Lennart Borgman (gmail)
  2008-05-28  0:06     ` Jason Rumney
  0 siblings, 1 reply; 5+ messages in thread
From: Lennart Borgman (gmail) @ 2008-05-27 23:38 UTC (permalink / raw)
  To: Jason Rumney; +Cc: Daniel Colascione, Emacs Devel

[-- Attachment #1: Type: text/plain, Size: 1825 bytes --]

Jason Rumney wrote:
> Lennart Borgman (gmail) wrote:
>> I have attached the relevant pieces from rng-valid.el. I think this is 
>> much easier to understand as one piece than as a diff.
> 
> It may be so for you, who wrote the changes, but for anyone else who 
> does not know what you've changed, a diff is much easier.

Ok, here comes the diff.

It is noteworthy that I actually use nxml-mode just to parse the buffer 
when I tie it together with mumamo.

You cannot see that in this diff, since turning off the nxml-mode 
fontification is done in mumamo.

After "finishing" the change I sent here to rng-valid.el I realize I can 
now use the fontification from nxml-mode together with mumamo. That was 
not possible before because the nxml-mode fontification would then 
override fontification in chunks of the buffer with other major modes.

Daniel, I am starting to wonder if that is possible with your changes. I 
actually wonder if I can use the parser from nxml-mode at all. I have 
not thought much about it but it is not clear to me how it can be done yet.

My earlier thinking on this matter was that perhaps a two-step 
approach to fontification would be good:

- In the first step, font-lock keyword fontification would be done.
- In the second step, a parser like nxml-mode or js2-mode could refine 
the result.

This would mean that the parser has to cooperate with font-lock in some 
way. I do not know how, but the font-lock/jit-lock framework could 
possibly be extended to handle the cooperation.

However, I think there is perhaps no need for the parser to know 
about fontify-region. Rather, it would need to know things like how to 
tell font-lock what it has done and where major mode chunks are. (Mumamo 
does the major mode chunk dividing in font-lock-fontify-region-function.)

[-- Attachment #2: rng-valid.el.diff --]
[-- Type: text/plain, Size: 7028 bytes --]

Index: rng-valid.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/nxml/rng-valid.el,v
retrieving revision 1.6
diff -c -b -r1.6 rng-valid.el
*** rng-valid.el	6 May 2008 04:25:58 -0000	1.6
--- rng-valid.el	27 May 2008 23:23:24 -0000
***************
*** 520,527 ****
  			    (t (rng-set-initial-state))))))))))
  
  
  (defun rng-do-some-validation-1 (&optional continue-p-function)
!   (let ((limit (+ rng-validate-up-to-date-end
  		  rng-validate-chunk-size))
  	(remove-start rng-validate-up-to-date-end)
  	(next-cache-point (+ (point) rng-state-cache-distance))
--- 520,558 ----
  			    (t (rng-set-initial-state))))))))))
  
  
+ (defvar rng-get-major-mode-chunk-function nil
+   "Function to use to get major mode chunk.
+ It should take one argument, the point where to get the major mode chunk.
+ 
+ This is to be set by multiple major mode frameworks, like
+ mumamo.
+ 
+ See also `rng-valid-nxml-major-mode-chunk-function' and
+ `rng-end-major-mode-chunk-function'. Note that all three
+ variables must be set.")
+ (make-variable-buffer-local 'rng-get-major-mode-chunk-function)
+ (put 'rng-get-major-mode-chunk-function 'permanent-local t)
+ 
+ (defvar rng-valid-nxml-major-mode-chunk-function nil
+   "Function to use to check if nxml can parse major mode chunk.
+ It should take one argument, the chunk.
+ 
+ For more info see also `rng-get-major-mode-chunk-function'.")
+ (make-variable-buffer-local 'rng-valid-nxml-major-mode-chunk-function)
+ (put 'rng-valid-nxml-major-mode-chunk-function 'permanent-local t)
+ 
+ (defvar rng-end-major-mode-chunk-function nil
+   "Function to use to get the end of a major mode chunk.
+ It should take one argument, the chunk.
+ 
+ For more info see also `rng-get-major-mode-chunk-function'.")
+ (make-variable-buffer-local 'rng-end-major-mode-chunk-function)
+ (put 'rng-end-major-mode-chunk-function 'permanent-local t)
+ 
  (defun rng-do-some-validation-1 (&optional continue-p-function)
!   (let (major-mode-chunk
!         end-major-mode-chunk
!         (limit (+ rng-validate-up-to-date-end
  		  rng-validate-chunk-size))
  	(remove-start rng-validate-up-to-date-end)
  	(next-cache-point (+ (point) rng-state-cache-distance))
***************
*** 536,543 ****
  	xmltok-attributes
  	xmltok-namespace-attributes
  	xmltok-dependent-regions
! 	xmltok-errors)
!     (when (= (point) 1)
        (let ((regions (xmltok-forward-prolog)))
  	(rng-clear-overlays 1 (point))
  	(while regions
--- 567,576 ----
  	xmltok-attributes
  	xmltok-namespace-attributes
  	xmltok-dependent-regions
! 	xmltok-errors
!         )
!     ;;(message ">>>>>>>>> here -1, p=%s" (point)) ;;(sit-for 4)
!     (when (and continue (= (point) 1))
        (let ((regions (xmltok-forward-prolog)))
  	(rng-clear-overlays 1 (point))
  	(while regions
***************
*** 549,558 ****
  	(rng-clear-conditional-region))
        (setq rng-dtd xmltok-dtd))
      (while continue
!       (setq have-remaining-chars (rng-forward))
        (let ((pos (point)))
  	(setq continue
  	      (and have-remaining-chars
  		   (or (< pos limit)
  		       (and continue-p-function
  			    (funcall continue-p-function)
--- 582,631 ----
  	(rng-clear-conditional-region))
        (setq rng-dtd xmltok-dtd))
      (while continue
!       ;; If mumamo (or something similar) is used then jump over parts
!       ;; that can not be parsed by nxml-mode.
!       (when (and rng-get-major-mode-chunk-function
!                  rng-valid-nxml-major-mode-chunk-function
!                  rng-end-major-mode-chunk-function)
!         (let ((here (point))
!               next-non-space-pos)
!           (skip-chars-forward " \t\r\n")
!           (setq next-non-space-pos (point))
!           (goto-char here)
!           ;;(message "here when, p=%s emmc=%s non-space=%s" (point) end-major-mode-chunk next-non-space-pos) ;;(sit-for 4)
!           (unless (and end-major-mode-chunk
!                        ;; Remaining chars in this chunk?
!                        (< next-non-space-pos end-major-mode-chunk))
!             (setq end-major-mode-chunk nil)
!             (setq major-mode-chunk (funcall rng-get-major-mode-chunk-function next-non-space-pos))
!             (while (and major-mode-chunk
!                         (not (funcall rng-valid-nxml-major-mode-chunk-function major-mode-chunk))
!                         (< next-non-space-pos (point-max)))
!               (let ((end-pos (funcall rng-end-major-mode-chunk-function major-mode-chunk)))
!                 (goto-char (+ end-pos 0))
!                 (setq major-mode-chunk (funcall rng-get-major-mode-chunk-function (point)))
!                 ;;(message "---> here 3, point=%s, ep=%s, mm-chunk=%s" (point) end-pos major-mode-chunk)
!                 )
!               (setq next-non-space-pos (point))))
!           ;; Stop parsing if we do not have a chunk here yet.
!           (setq continue (and major-mode-chunk
!                               (funcall rng-valid-nxml-major-mode-chunk-function major-mode-chunk)))
!           (when continue
!             ;;(message "  continue=t")
!             (setq end-major-mode-chunk (funcall rng-end-major-mode-chunk-function major-mode-chunk)))))
! 
!       (when continue
!         ;;(message "*** here remain, p=%s" (point))
!         (setq have-remaining-chars (rng-forward end-major-mode-chunk))
!         ;;(message "*** here remain b, p=%s" (point))
          (let ((pos (point)))
+           (when end-major-mode-chunk
+             ;; Fix-me: Seems like we need a new initialization (or why
+             ;; do we otherwise hang without this?)
+             (and (> limit end-major-mode-chunk) (setq limit end-major-mode-chunk)))
            (setq continue
                  (and have-remaining-chars
+                      continue
                       (or (< pos limit)
                           (and continue-p-function
                                (funcall continue-p-function)
***************
*** 582,587 ****
--- 655,661 ----
                   (setq next-cache-point (+ pos rng-state-cache-distance))
                   (rng-clear-cached-state remove-start pos)
                   (when have-remaining-chars
+                    ;;(message "rng-cach-state (1- %s)" pos)
                     (rng-cache-state (1- pos)))
                   (setq remove-start pos)
                   (unless continue
***************
*** 603,609 ****
  			  (rng-clear-conditional-region))
  			 ((< rng-conditional-up-to-date-start pos)
  			  (set-marker rng-conditional-up-to-date-start
! 				      pos)))))))))
      have-remaining-chars))
      
  (defun rng-clear-conditional-region ()
--- 677,683 ----
                              (rng-clear-conditional-region))
                             ((< rng-conditional-up-to-date-start pos)
                              (set-marker rng-conditional-up-to-date-start
!                                         pos))))))))))
      have-remaining-chars))
  
  (defun rng-clear-conditional-region ()
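
The chunk-skipping loop that this diff adds to rng-do-some-validation-1 can be read in isolation roughly as follows. This is a simplified sketch under stated assumptions, not the patch itself: `demo-next-parsable-pos` and `demo-chunk-at` are hypothetical names, the three functions are passed as arguments instead of being read from the rng- variables, point is not moved, and chunks are again assumed to be (START END MAJOR-MODE) lists whose END is greater than START.

```elisp
(require 'cl-lib)

(defun demo-chunk-at (chunks pos)
  "Return the (START END MODE) entry of CHUNKS containing POS, or nil."
  (cl-find-if (lambda (c) (and (<= (nth 0 c) pos) (< pos (nth 1 c))))
              chunks))

(defun demo-next-parsable-pos (pos get-chunk valid-p chunk-end limit)
  "Return the first position at or after POS that nxml may parse.
GET-CHUNK maps a position to a chunk, VALID-P says whether nxml can
parse a chunk and CHUNK-END returns a chunk's end.  Skipping stops at
LIMIT.  Return nil if no parsable chunk is found."
  (let ((chunk (funcall get-chunk pos)))
    ;; Jump from chunk end to chunk end while the chunk is not for nxml.
    (while (and chunk
                (not (funcall valid-p chunk))
                (< pos limit))
      (setq pos (funcall chunk-end chunk))
      (setq chunk (funcall get-chunk pos)))
    ;; Mirror the patch: stop if there is no usable chunk here.
    (and chunk (funcall valid-p chunk) pos)))

;; Usage with a hypothetical chunk table: start inside the PHP chunk.
(let ((chunks '((1 4 nxml-mode) (4 20 php-mode) (20 24 nxml-mode))))
  (demo-next-parsable-pos
   4
   (lambda (p) (demo-chunk-at chunks p))
   (lambda (c) (eq (nth 2 c) 'nxml-mode))
   (lambda (c) (nth 1 c))
   24))
```

In the usage form the search starts at position 4, inside the PHP chunk, and ends up at position 20, the start of the next chunk that nxml can parse. As in the patch, the loop relies on each chunk end lying past the current position; a chunk function that returns an end at or before the position would make it spin.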


* Re: nxml-mode parser and multi major modes
  2008-05-27 23:38   ` Lennart Borgman (gmail)
@ 2008-05-28  0:06     ` Jason Rumney
  2008-05-28  6:21       ` Lennart Borgman (gmail)
  0 siblings, 1 reply; 5+ messages in thread
From: Jason Rumney @ 2008-05-28  0:06 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Daniel Colascione, Emacs Devel


> Ok, here comes the diff.

It seems that the point of this diff is to make the validation code of 
nxml-mode ignore certain chunks of invalid XML within the buffer, is 
that right?
If so, then XML already has a mechanism for telling parsers not to parse 
a block of text: <![CDATA[ ... ]]>, though you may need to customize the 
face that is used to display the contents to avoid conflict with the 
other mode that you intend to use on its contents.






* Re: nxml-mode parser and multi major modes
  2008-05-28  0:06     ` Jason Rumney
@ 2008-05-28  6:21       ` Lennart Borgman (gmail)
  0 siblings, 0 replies; 5+ messages in thread
From: Lennart Borgman (gmail) @ 2008-05-28  6:21 UTC (permalink / raw)
  To: Jason Rumney; +Cc: Daniel Colascione, Emacs Devel

Jason Rumney wrote:
> 
>> Ok, here comes the diff.
> 
> It seems that the point of this diff is to make the validation code of 
> nxml-mode ignore certain chunks of invalid XML within the buffer, is 
> that right?

Yes.

> If so, then XML already has a mechanism for telling parsers not to parse 
> a block of text: <![CDATA[ ... ]]>, though you may need to customize the 
> face that is used to display the contents to avoid conflict with the 
> other mode that you intend to use on its contents.

It is not supposed to be used in that context.

Think of PHP, JSP, Smarty, EmbPerl or any other XHTML template language. 
Such files are simply not valid XML, but parts of them can still be 
parsed by the nxml-mode parser. This allows both error checking and 
XHTML completion.
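
To make the template-language case concrete, here is a rough sketch of how a buffer mixing XHTML and PHP might be divided into chunks, so that only the XHTML parts are handed to the nxml-mode parser. This is not how mumamo does its chunk dividing: `demo-template-chunks` and the (START END MODE) representation are assumptions for illustration, and only the `<?php ... ?>` delimiters are handled.

```elisp
(require 'cl-lib)

(defun demo-template-chunks ()
  "Split the current buffer into a list of (START END MODE) chunks.
Text from \"<?php\" through the next \"?>\" becomes a `php-mode' chunk;
everything else becomes an `nxml-mode' chunk.  END is exclusive.  An
unterminated PHP block runs to the end of the buffer."
  (save-excursion
    (goto-char (point-min))
    (let ((chunks nil)
          (start (point-min)))
      (while (search-forward "<?php" nil t)
        (let ((php-start (match-beginning 0))
              (php-end (or (search-forward "?>" nil t) (point-max))))
          ;; XHTML text before the PHP block, if any.
          (when (< start php-start)
            (push (list start php-start 'nxml-mode) chunks))
          (push (list php-start php-end 'php-mode) chunks)
          (setq start php-end)
          (goto-char php-end)))
      ;; Trailing XHTML text after the last PHP block.
      (when (< start (point-max))
        (push (list start (point-max) 'nxml-mode) chunks))
      (nreverse chunks))))

;; Usage: chunk a small hypothetical template.
(with-temp-buffer
  (insert "<p><?php echo 1; ?></p>")
  (demo-template-chunks))
```

For the buffer text in the usage form this yields ((1 4 nxml-mode) (4 20 php-mode) (20 24 nxml-mode)), so the parser would see `<p>` and `</p>` and skip the PHP block in between.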



