From: Eric Abrahamsen
Newsgroups: gmane.emacs.devel
Subject: Re: Make peg.el a built-in library?
Date: Thu, 30 Sep 2021 20:27:36 -0700
Message-ID: <871r55o253.fsf@ericabrahamsen.net>
References: <875yvtbbn3.fsf@ericabrahamsen.net> <87bl5k87hq.fsf@alphapapa.net> <87fsuvpod4.fsf@ericabrahamsen.net> <874ka7wqko.fsf@gmail.com> <87sfxp4ard.fsf@ericabrahamsen.net>
To: emacs-devel@gnu.org

Richard Stallman writes:

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > Basically a way of composing a parser out of smaller regexp-like
>   > expressions. They can be very useful in a wide variety of situations.
>
> It does sound useful. Can you post a description of a specific simple
> example where this approach is advantageous?

I feel like I've ended up advocating for this thing when I know less
about it than anyone here, but...

My sense is that really powerful PEG systems are the sort of thing you
use to parse source code into ASTs, or do syntax highlighting, etc. We
don't need that, and the use-cases I have in mind, anyway, are simpler
situations where I want to parse a stream of
well-defined-but-still-pretty-complicated text. The sort of thing where
a regexp solution turns into a rat's nest very quickly.

One theoretical example is parsing IMAP server responses. The response
text is fully defined, but could vary enormously depending on the
capabilities of the server. Writing naive regexps is a headache.

Another, non-theoretical example is the homemade token-parser in
lisp/gnus/gnus-search.el:390-680, which turns a string like:

from:bob (subject:lunch or subject:dinner)

into the sexp

((from . "bob") (or (subject . "lunch") (subject . "dinner")))

There are many, many libraries that need to do something similar. With
peg.el I can parse the above (including arbitrarily-nested
sub-expressions) with twenty lines of peg definition, which is
comprehensible to look at (once you've got the basics), easier to reason
about, and easier to modify. I guess it's sort of equivalent to a BNF.

PEGs and their implementation are the subject of academic research,
obviously, but for my modest uses, anyway, almost anything will do.
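
To make the search-query example above a bit more concrete, here is a
rough sketch of the same idea written PEG-style: one small function per
rule, ordered choice, and backtracking when a rule fails. It is in
Python rather than Emacs Lisp, it does not use peg.el or reproduce its
API, and it is not what gnus-search.el actually does. The grammar in the
comments and the names parse_term, parse_expr, parse_query and parse are
all assumptions made up for illustration. Cons cells become tuples and
the (or ...) form becomes a list starting with "or".

```python
import re

# Assumed toy grammar (not the real gnus-search syntax):
#   query <- expr*
#   expr  <- term ("or" term)*
#   term  <- "(" query ")" / key ":" value
WS = re.compile(r"\s*")
PAIR = re.compile(r"(\w+):(\w+)")
OR = re.compile(r"or\b")


def skip_ws(text, pos):
    """Return the position after any whitespace starting at pos."""
    return WS.match(text, pos).end()


def parse_term(text, pos):
    """term <- "(" query ")" / key ":" value.  Returns (value, pos) or None."""
    pos = skip_ws(text, pos)
    if text.startswith("(", pos):
        items, pos = parse_query(text, pos + 1)
        pos = skip_ws(text, pos)
        if not text.startswith(")", pos):
            return None  # unbalanced group: this alternative fails
        # A group holding a single expression collapses to that expression.
        value = items[0] if len(items) == 1 else items
        return value, pos + 1
    m = PAIR.match(text, pos)
    if m:
        return (m.group(1), m.group(2)), m.end()
    return None


def parse_expr(text, pos):
    """expr <- term ("or" term)*.  Returns (value, pos) or None."""
    result = parse_term(text, pos)
    if result is None:
        return None
    first, pos = result
    alternatives = [first]
    while True:
        m = OR.match(text, skip_ws(text, pos))
        if not m:
            break
        result = parse_term(text, m.end())
        if result is None:
            break  # backtrack: leave the dangling "or" unconsumed
        term, pos = result
        alternatives.append(term)
    if len(alternatives) == 1:
        return first, pos
    return ["or", *alternatives], pos


def parse_query(text, pos=0):
    """query <- expr*.  Always succeeds; returns (list_of_values, pos)."""
    items = []
    while True:
        result = parse_expr(text, pos)
        if result is None:
            return items, pos
        value, pos = result
        items.append(value)


def parse(text):
    """Parse a whole query string, rejecting trailing unparsed input."""
    items, pos = parse_query(text)
    if skip_ws(text, pos) != len(text):
        raise ValueError(f"unparsed input at position {pos}")
    return items


print(parse("from:bob (subject:lunch or subject:dinner)"))
```

Each function corresponds to one grammar rule, so supporting something
new (an "and" operator, quoted values) would mean adding or changing a
rule rather than untangling a regexp. Nested groups fall out of the
recursion between parse_term and parse_query.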