* nested comments in sgml-mode are not properly quoted.
From: Martin Schwamberger @ 2003-01-29 22:19 UTC


Hi,

I frequently use comment-region, and I was really unhappy
when I found that I couldn't use it safely in sgml/xml mode
due to an already known quoting problem.

Since I couldn't find any way to avoid the problem without changing the code,
I decided to fix the bug in newcomment.el as shipped with Emacs 21.2.1.

The original quoting algorithm inserts one or more backslashes
between the first and second characters of the comment markers.
This leads to <\!-- .....  -\-> for SGML/XML comments.
Unfortunately, the resulting -- sequence is not allowed within SGML comments
(see http://www.w3.org/TR/REC-xml#sec-comments).

My algorithm inserts a backslash after every character of the marker
except the last (for markers longer than one character).
This leads to <\!\-\- .....  -\-\>, which is allowed within comments.
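
To illustrate the rule on a bare marker string, here is a minimal sketch.
The helper name ms-demo-quote-marker is hypothetical and is not part of
newcomment.el; it also simplifies the single-character case by returning
the marker unchanged.

```elisp
;; Hypothetical helper for illustration only, not part of newcomment.el.
(defun ms-demo-quote-marker (marker)
  "Return MARKER with a backslash after each character but the last.
A one-character MARKER is returned unchanged in this sketch."
  (if (< (length marker) 2)
      marker
    (concat
     ;; Join all characters except the last with a backslash between them.
     (mapconcat #'char-to-string (substring marker 0 -1) "\\")
     ;; One more backslash, then the final character.
     "\\"
     (substring marker -1))))

(ms-demo-quote-marker "<!--")   ; the string <\!\-\-
(ms-demo-quote-marker "-->")    ; the string -\-\>
```

No two adjacent hyphens remain in either result, which is the point of
the change.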

I've tested it for SGML and C style comments.
I've also played with Pascal comments in order to see
what happens with single-character comment-end markers.
Everything seems to work well.

Since unquoting only requires the backslash(es) after the first character,
it is also able to unquote comment markers
quoted by prior versions.
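
The compatibility claim can be checked with a hand-written regexp for the
end marker "-->". This is a sketch: the constant name ms-demo-unquote-re is
hypothetical, and the pattern is what I would expect the unquoting regexp
for "-->" to look like, with only the first backslash run mandatory.

```elisp
;; Hypothetical constant for illustration only.
;; "+" makes the backslash run after the first character mandatory;
;; "*" makes the run after the second character optional.
(defconst ms-demo-unquote-re "-\\(\\\\+\\)-\\(\\\\*\\)>")

(list (string-match-p ms-demo-unquote-re "-\\-\\>")  ; new style -\-\>
      (string-match-p ms-demo-unquote-re "-\\->")    ; old style -\->
      (string-match-p ms-demo-unquote-re "-->"))     ; not quoted at all
;; => (0 0 nil)
```

Both quoting styles match, while an unquoted marker does not.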

Here are my new versions of comment-quote-re and comment-quote-nested.
I left the original lines as comments.
Immediately after these comments, my code starts with
;; MS:
and ends with
;; --------------------------------------------------------------------


(defun comment-quote-re (str unp)
;; --------------------------------------------------------------------
;;   (concat (regexp-quote (substring str 0 1))
;; 	  "\\\\" (if unp "+" "*")
;; 	  (regexp-quote (substring str 1))))
;; --------------------------------------------------------------------
;; MS:
  (let ((i 1)
        (len (length str))
        ;; Each backslash sequence is defined as a subexpression
        ;; in order to add or remove backslashes easily (see comment-quote-nested).
        (qre (concat (regexp-quote (substring str 0 1)) "\\(\\\\" (if unp "+" "*") "\\)")))
    (while (< i len)
      (setq qre
        (concat qre
          (regexp-quote (substring str i (1+ i)))
          ;; No backslash group after the last char of a multi-char string.
          ;; Even when UNP is non-nil, these backslashes are optional, so that
          ;; markers quoted by the old algorithm can still be unquoted.
          (if (< (1+ i) len) "\\(\\\\*\\)")))
      (setq i (1+ i)))
    qre))
;; --------------------------------------------------------------------

(defun comment-quote-nested (cs ce unp)
  "Quote or unquote nested comments.
If UNP is non-nil, unquote nested comment markers."
  (setq cs (comment-string-strip cs t t))
  (setq ce (comment-string-strip ce t t))
  (when (and comment-quote-nested (> (length ce) 0))
    (let ((re (concat (comment-quote-re ce unp)
                "\\|" (comment-quote-re cs unp))))
      (goto-char (point-min))
      (while (re-search-forward re nil t)
;; --------------------------------------------------------------------
;;      (goto-char (match-beginning 0))
;;	(forward-char 1)
;;	(if unp (delete-char 1) (insert "\\"))
;; --------------------------------------------------------------------
;; MS:
        (let ((i (regexp-opt-depth re)))
          ;; For each subexpression (sequence of backslashes) 
          (while (> i 0)
            (when (match-beginning i)
              (goto-char (match-beginning i))
              (if unp
                ;; quoted?
                (if (> (match-end i) (match-beginning i))
                  (delete-char 1))
                (insert "\\")))
            (setq i (1- i))))
;; --------------------------------------------------------------------
	(when (= (length ce) 1)
	  ;; If the comment-end is a single char, adding a \ after that
	  ;; "first" char won't deactivate it, so we turn such a CE
	  ;; into !CS.  I.e. for pascal, we turn } into !{
	  (if (not unp)
	      (when (string= (match-string 0) ce)
		(replace-match (concat "!" cs) t t))
	    (when (and (< (point-min) (match-beginning 0))
		       (string= (buffer-substring (1- (match-beginning 0))
						  (1- (match-end 0)))
				(concat "!" cs)))
;; --------------------------------------------------------------------
;;	      (backward-char 2)
;; --------------------------------------------------------------------
;; MS:
              (goto-char (1- (match-beginning 0)))
;; --------------------------------------------------------------------
	      (delete-char (- (match-end 0) (match-beginning 0)))
	      (insert ce))))))))
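
The reason for walking the subexpressions from the highest number down can
be shown in isolation. The following is a simplified sketch, not the patch
itself: the function name ms-demo-quote-in-string is hypothetical, it only
handles the end marker "-->", and it only quotes (no UNP case).

```elisp
;; Hypothetical helper for illustration only.
(defun ms-demo-quote-in-string (text)
  "Return TEXT with a backslash added after each \"-\" of every \"-->\"."
  (let ((re "-\\(\\\\*\\)-\\(\\\\*\\)>"))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (while (re-search-forward re nil t)
        ;; Highest group first: an insertion only shifts text after it,
        ;; so the recorded positions of lower-numbered groups stay valid.
        (let ((i 2))
          (while (> i 0)
            (goto-char (match-beginning i))
            (insert "\\")
            (setq i (1- i)))))
      (buffer-string))))

(ms-demo-quote-in-string "a --> b")   ; the string a -\-\> b
```

Walking the groups in ascending order would instead require adjusting
every later position by the number of characters already inserted.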


I hope this gives you at least a few useful ideas,

Martin
