unofficial mirror of help-gnu-emacs@gnu.org
From: "D. D. Brierton" <darren@dzr-web.com>
Subject: Help needed with regexps
Date: Fri, 13 Feb 2004 19:17:44 +0000
Message-ID: <pan.2004.02.13.19.17.40.790445@dzr-web.com>

Hi,

Could a regexp guru look over these regexps and tell me if they're correct
and if they could be improved/simplified?

I'm tweaking my multiple-major-mode setup of psgml / php-mode / css-mode /
javascript-generic-mode for (X)HTML editing. My previous regexps worked
only 75% of the time, and I was trying to improve them and have ended up
breaking things altogether. The current attempt seems to send emacs into
some kind of loop -- CPU hits 100% and I have to kill emacs:

;; Set up an mmm group for fancy html editing
(mmm-add-group
 'fancy-html
 '((html-php-embedded
    :submode php-mode
    :face mmm-code-submode-face
    :front "<[?]php"
    :back "[?]>")
   (html-css-embedded
    :submode css-mode
    :face mmm-code-submode-face
    :front "<style\\s-+\\(\\s-*.*\\s-+\\)*.*css\"?\\(\\s-*.*\\s-*\\)*\\s-*>"
    :back "</style>")
   (html-css-attribute
    :submode css-mode
    :face mmm-code-submode-face
    :front "\\bstyle=\"?"
    :back "\"")
   (html-javascript-embedded
    :submode javascript-generic-mode
    :face mmm-code-submode-face
    :front "<script\\s-+\\(\\s-*.*\\s-+\\)*.*javascript.*\\(\\s-*.*\\s-+\\)*\\s-*>"
    :back "</script>")
   (html-javascript-attribute
    :submode javascript-generic-mode
    :face mmm-code-submode-face
    :front "\\bon\\w+=\"?"
    :back "\"")))

I have to edit a lot of other people's HTML, and it is very often invalid.
Element and attribute names may be in a mix of upper and lower case,
attribute values may or may not be quoted, required attributes may be
omitted and nonexistent attributes included!

In particular, the regexps for html-css-embedded and
html-javascript-embedded are the ones I need someone to look over for me.

So, for CSS

"<style\\s-+\\(\\s-*.*\\s-+\\)*.*css\"?\\(\\s-*.*\\s-*\\)*\\s-*>"

should match a "style" element, regardless of how it's spaced out, that at
least contains the string "css" somewhere (and "style" and "css" may be
upper or lower case). For example,

<style
   attr1="val1"
   attr2="val2"
   type="text/css"
   attr3="val3"
   attr4="val4"
>

and

<style type="text/css">
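
One possible simplification, offered only as a sketch: the nested groups
such as \\(\\s-*.*\\s-+\\)* let several quantifiers compete for the same
text, which is a plausible (unconfirmed) cause of the runaway matching.
In Emacs regexps a negated character class like [^>] also matches
newlines, so a single class can stand in for everything between "<style"
and ">". The names my-style-open-regexp and my-style-open-p are
hypothetical, and the pattern assumes no attribute value contains a
literal ">".

```elisp
;; Hypothetical candidate, not the pattern from the group above.
;; Assumes no attribute value contains a literal ">".
(defvar my-style-open-regexp "<style\\b[^>]*css[^>]*>"
  "Sketch of a regexp for an opening style tag that mentions \"css\".")

(defun my-style-open-p (text)
  "Return non-nil if TEXT contains a match, ignoring case."
  (let ((case-fold-search t))
    (string-match my-style-open-regexp text)))

;; The two sample tags from this message, plus an upper-case variant.
(my-style-open-p "<style type=\"text/css\">")
(my-style-open-p
 "<style\n   attr1=\"val1\"\n   attr2=\"val2\"\n   type=\"text/css\"\n   attr3=\"val3\"\n   attr4=\"val4\"\n>")
(my-style-open-p "<STYLE TYPE=\"TEXT/CSS\">")
```

Evaluating the three calls in *scratch* is a cheap way to check the
pattern before handing it to mmm-mode, since a bad regexp then fails on a
short string instead of on a whole buffer.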

For javascript

"<script\\s-+\\(\\s-*.*\\s-+\\)*.*javascript.*\\(\\s-*.*\\s-+\\)*\\s-*>"

should match a "script" element that contains the string "javascript" and
which may again be variably spaced and either upper case or lower case.
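
The same requirement can be sketched as a small helper that builds the
pattern from a tag name and a keyword. This is an assumption-laden
illustration, not a tested fix: my-open-tag-regexp is a hypothetical
name, the sample tag is made up, and the approach relies on [^>] matching
newlines in Emacs regexps and on no attribute value containing ">".

```elisp
(defun my-open-tag-regexp (tag keyword)
  "Return a regexp for an opening TAG whose attributes mention KEYWORD.
Hypothetical helper; assumes no attribute value contains a literal \">\"."
  (concat "<" (regexp-quote tag) "\\b[^>]*"
          (regexp-quote keyword) "[^>]*>"))

;; Made-up sample: upper-case tag, unquoted value, attributes on
;; separate lines.
(let ((case-fold-search t)
      (re (my-open-tag-regexp "script" "javascript")))
  (string-match re "<SCRIPT\n   language=JavaScript\n   src=\"menu.js\"\n>"))
```

Because the class definitions above sit inside a quoted list, the string
the helper returns would have to be pasted in literally as the :front
value rather than computed in place.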

It's mainly the variable whitespace, and the fact that it's so hard to
know what might come between "<style/<script", "css/javascript" and ">",
that is throwing me, and my attempts at just experimenting and seeing what
got highlighted correctly have been dampened somewhat by emacs being sent
into a tailspin by my last "experiment". I'd really appreciate some help.
Thanks in advance.

Best, Darren

-- 
======================================================================
D. D. Brierton            darren@dzr-web.com           www.dzr-web.com
       Trying is the first step towards failure (Homer Simpson)
======================================================================
