From mboxrd@z Thu Jan 1 00:00:00 1970
From: Krister Svanlund
Newsgroups: gmane.lisp.guile.devel
Subject: Re: GNU Guile PEG-parser
Date: Wed, 8 Feb 2012 19:29:46 +0100
To: Noah Lavine
Cc: guile-devel
List-Id: "Developers list for Guile, the GNU extensibility library"
Xref: news.gmane.org gmane.lisp.guile.devel:13832
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=f46d0442885683b93504b8781458

--f46d0442885683b93504b8781458
Content-Type: text/plain; charset=UTF-8

Hi, thanks for the quick response!

I actually haven't found any PEG library that has a string syntax for
the equivalent of ignore. I'm guessing most people are satisfied with
just specifying another nonterminal and matching that one, probably
because that is seen as less ugly than extending the formal definition
of PEG. Still, I really think we could get a cleaner PEG definition of
our parser if we were able to ignore text that isn't needed, or that
gets in the way, while using string-patterns.

It's exactly Python I'm thinking about. We are currently writing a
preprocessor that puts #{ and #} before and after each block, but I was
hoping there is a cleaner solution that uses the power of PEG instead
of basic string manipulation. If you could help in any way, shape, or
form, that would be greatly appreciated; even just suggesting which
parts of the PEG internals to look at would be really useful.

I hope you or the guile-devel list can be of help.

Yours,
Krister Svanlund

On Wed, Feb 8, 2012 at 1:47 AM, Noah Lavine wrote:
> Hello,
>
> Thanks for emailing! I suppose I am the one to talk to, since I was
> the last one to work on it.
>
> I didn't make the PEG parsing syntax, but I would guess the reason
> there isn't a string syntax for ignore is that there's no conventional
> way to write it, but there is for the other PEG elements. It would be
> easy to add one if it was useful, but we'd want to make sure our
> syntax agreed with other PEG libraries, so people wouldn't be confused
> later.
>
> For blank-space indented blocks, do you mean you want to group
> together lines with the same indentation, like Python syntax? If you
> know what the indentation will be at the beginning of each line, you
> can do something like this:
>
> (* (and "\t" <match-line> "\n")),
>
> where you replace "\t" with whatever indentation you want.
>
> However, what you probably want to do is look at the indentation in
> the first line and then group it with every following line that has
> the same indentation. I'm not sure if it's possible, but it would
> probably be ugly. If you tell me what you're trying to do, though, I
> can help you write your own parser to handle it. You can even write
> some of your parser yourself and use PEGs for the rest, if you're
> willing to use PEG internals.
>
> Can you tell me more about what you're trying to do? I am happy to
> help now, but I will be more helpful if I know more.
>
> I'm going to CC the guile-devel mailing list because of the issue with
> the string syntax.
>
> Noah
>
> On Tue, Feb 7, 2012 at 10:03 AM, Krister Svanlund wrote:
> > Hi,
> > I'm currently involved in a project that plans on using the PEG
> > module for Guile for parsing and I've understood that you are the
> > one to talk to about it. I'm mostly just curious how come there
> > isn't an equivalent to ignore in string-patterns and if this would
> > be complex to add?
> >
> > I'm also curious if there is any way to deal with blank-space
> > indented blocks in PEG.
> >
> > Yours
> > Krister Svanlund

--f46d0442885683b93504b8781458--
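
The preprocessing step Krister describes (putting #{ and #} before and
after each block) could look roughly like the following. This is a
minimal Python sketch, not the project's code: the function name
add_block_markers, the choice to put each marker on its own line, and
the assumption that indentation uses spaces only are all illustrative.

```python
def add_block_markers(source: str,
                      open_marker: str = "#{",
                      close_marker: str = "#}") -> str:
    """Wrap every indented block in explicit open/close marker lines.

    Assumes indentation is made of spaces only (hypothetical simplification).
    """
    indents = [0]   # stack of indentation widths for the blocks currently open
    out = []
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            # Blank line: pass it through without opening or closing a block.
            out.append(line)
            continue
        width = len(line) - len(stripped)
        if width > indents[-1]:
            # Deeper than the enclosing block: a new block opens here.
            indents.append(width)
            out.append(open_marker)
        while width < indents[-1]:
            # Shallower: close blocks until we are back at a known level.
            indents.pop()
            out.append(close_marker)
        if width != indents[-1]:
            raise ValueError(f"inconsistent dedent: {line!r}")
        out.append(line)
    while len(indents) > 1:
        # Close any blocks still open at the end of the input.
        indents.pop()
        out.append(close_marker)
    return "\n".join(out)
```

For example, add_block_markers("if x:\n    a\nb") returns
"if x:\n#{\n    a\n#}\nb". After this pass the grammar only has to match
the two literal markers, which is why the approach needs no
indentation tracking inside the PEG itself.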