unofficial mirror of help-gnu-emacs@gnu.org
From: Xah Lee <xahlee@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: Re: how to use parsing expressing grammar
Date: Sat, 20 Dec 2008 13:41:59 -0800 (PST)
Message-ID: <91fa65a0-a7e7-4374-8f96-5786c8908615@w24g2000prd.googlegroups.com>
In-Reply-To: m2bpv7ur4a.fsf@gmail.com

Xah Lee wrote:
> let's say i want to change tags of the form
> “<img src="archimedesspiral.png">”
> into
> “<img src="★">”
> I tried the following:
> (defun doMyReplace ()
> (interactive)
>   (peg-parse
>    (start imgTag)
>   (imgTag "<img" whitespace "src=" "\"" (replace filePath "★") "\"" ">")
>   (whitespace [" "])
>   (filePath [a-z "."])
>    )
> )

Helmut Eller wrote:
> The filePath rule only matches the first character.  You probably
> want to write (+ [a-z "."]).  Same issue for whitespace.

Thanks a lot! It worked out great!

I have another question, hopefully this one is not a dumb one.

In summary, if i have

   (imgTag "<img" whitespace (+ attributes whitespace) ">")

how do i tell PEG that when an attribute is the last item, the
whitespace following it is optional?

For example, the above will match
<A B C >
but won't match
<A B C>

Here's my code:

(defun doMyReplace ()
  (interactive)
  (peg-parse
   (imgTag "<img" _ (+ attributes _) ">")
   (attributes (or src alt width height))
   (src "src" _* "=" _* "\"" filePath "\"")
   (filePath (+ [A-Z a-z "./_-"]))
   (alt "alt" _* "=" _* "\"" altStr "\"")
   (altStr (* [A-Z a-z "./ '_"]))
   (width "width" _* "=" _* "\"" digits "\"")
   (height "height" _* "=" _* "\"" digits "\"")
   (_* (* ["\n \t"]))   ; 0 or more white space
   (_ (+ ["\n \t"]))    ; 1 or more white space
   (digits (+ [0-9]))))

here's a sample text to be matched:
<img src="archimedes_spiral_k.png" alt="archimedean spiral"
width="288" height="115">

if i add a space before the ending “>”, it matches.
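
One possible way to express this, as a sketch and not something
confirmed in this thread: put the separator in front of every attribute
after the first, i.e. attributes (* _ attributes), and let an optional
_* absorb any whitespace left before “>”. The name doMyReplace2 is made
up for this illustration, and the code assumes peg.el can be loaded
with (require 'peg).

```elisp
;; Assumption: peg.el is installed and provides the feature `peg'.
(require 'peg)

(defun doMyReplace2 ()
  "Match an <img ...> tag at point; whitespace before \">\" is optional.
Hypothetical variant of doMyReplace, for illustration only."
  (interactive)
  (peg-parse
   ;; First attribute, then zero or more (whitespace + attribute) pairs,
   ;; then optional whitespace before the closing bracket.
   (imgTag "<img" _ attributes (* _ attributes) _* ">")
   (attributes (or src alt width height))
   (src "src" _* "=" _* "\"" filePath "\"")
   (filePath (+ [A-Z a-z "./_-"]))
   (alt "alt" _* "=" _* "\"" altStr "\"")
   (altStr (* [A-Z a-z "./ '_"]))
   (width "width" _* "=" _* "\"" digits "\"")
   (height "height" _* "=" _* "\"" digits "\"")
   (_* (* ["\n \t"]))   ; 0 or more white space
   (_ (+ ["\n \t"]))    ; 1 or more white space
   (digits (+ [0-9]))))
```

The idea is that when the (* _ attributes) loop reads whitespace but
finds no attribute after it, that iteration is abandoned and the
trailing _* picks up the whitespace instead, so both <A B C > and
<A B C> should be accepted.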

Thanks again.

> > Btw, would you be interested in starting a mailing list on PEG in
> > emacs? e.g. yasnippet has one thru google, nxml has one in yahoo
> > group, ljupdate has one in livejournal. I think it'd be helpful.
>
> So far only 2 people asked questions.  If there are some more we can set
> up a mailing list.

I'm pretty sure if you create it, more and more people will join it.
I'm very interested in PEG and think it is of critical importance. If,
say, emacs 24 had it built in as C code, with all the regex functions
such as search-forward-regexp, query-replace-regexp etc having a PEG
version, it would make emacs a killer app.

From Wikipedia, it appears that people have already written PEG libs for
most major langs. There is already a C lib for PEG. The problem with
them is that most come from a background of computer lang parsing, as
opposed to practical use for text processing like regex. (note: regex
itself came from a computer science background as a way to describe
languages with “regular” grammar, but today it is far removed from
theoretical parsing. The process of this transmutation took i think 10
years, and it took another 10 or so years until it became a widely
popular tool in langs, starting with Perl in the 1990s.) I don't foresee
that in the next 10 years practicing programmers will all know
about the computer science of parsing or that major langs will all have
a formal grammar spec. I'm pretty certain people are already seeing the
potential of PEG as a regex replacement and working towards that
practical goal.

  Xah
∑ http://xahlee.org/
