From: Alan Mackenzie
Newsgroups: gmane.emacs.help
Subject: Re: How to grok a complicated regex?
Date: Wed, 18 Mar 2015 16:40:35 +0000 (UTC)
Organization: muc.de e.V.
To: help-gnu-emacs@gnu.org

Hi, Marcin.

Sorry if I'm a bit late to this discussion.

Marcin Borkowski wrote:
> Hi all,

> so I have this monstrosity [note: I know, there are much worse ones,
> too!]:

> "\\`\\(?:\\\\[([]\\|\\$+\\)?\\(.*?\\)\\(?:\\\\[])]\\|\\$+\\)?\\'"

> (it's in the org-latex--script-size function in ox-latex.el, if you're
> curious).

> I'm not asking "what does this match" - I can read it myself.  But it
> comes with a considerable effort.  Are you aware of any tools that might
> help to understand such regexen?

> I know about re-builder, but it's well suited for constructing a regex
> matching a given string, not the other way round.

> For instance, show-paren-mode does not really help here, since it seems
> to pair "\\(" with unescaped ")".

> Any ideas?

I wrote myself the following tool.  It's not production quality, but you
might find it useful nonetheless.  To use it, type

    M-: (pp-regexp re-horror)

where re-horror evaluates to the regexp string.  It prints the regexp at
the end of the *scratch* buffer on several lines: the contents of each
\(..\) construct are dropped one line below the construct's delimiters,
so nesting depth shows up as vertical position.  I find it useful.  So
might you.  Feel free to adapt it, or pass it on to other people.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun pp-regexp (regexp)
  "Pretty print REGEXP.  This means, contents of \\\\\(s are lowered a line."
  (or (stringp regexp) (error "parameter is not a string."))
  (let ((depth 0)
        ;; Replace tab, newline, CR and formfeed by printable stand-ins.
        ;; Work on a copy, so that the text properties added below never
        ;; end up on the caller's string.
        (re (copy-sequence
             (replace-regexp-in-string
              "[\t\n\r\f]"
              (lambda (s)
                (or (cdr (assoc s '(("\t" . "??")
                                    ("\n" . "??")
                                    ("\r" . "??"))))
                    "??"))
              regexp)))
        (start 0)     ; earliest position still without an acm-depth property.
        (pos 0)       ; current analysis position.
        (max-depth 0) ; How many lines do we need to print?
        (min-depth 0) ; Pick up "negative depth" errors.
        pr-line       ; output line being constructed
        line-no       ; line number of pr-line, varies between min-depth and max-depth.
        ch)
    ;(translate-rnt re)
    ;; apply acm-depth properties to the whole string.
    (while (< start (length re))
      (setq pos (string-match
                 ;; "\\\\\\((\\(\\?:\\)?\\||\\|)\\)"
                 ;; Match \\, \( (optionally \(?:), \| or \).  Group 1 is
                 ;; the part after the leading backslash.
                 "\\\\\\(\\\\\\|(\\(\\?:\\)?\\||\\|)\\)"
                 re start))
      ;; Everything up to the next construct sits at the current depth.
      (put-text-property start (or pos (length re)) 'acm-depth depth re)
      (when pos
        (setq ch (aref (match-string 1 re) 0))
        (cond
         ((eq ch ?\\)                   ; escaped backslash: ordinary text
          (put-text-property pos (match-end 1) 'acm-depth depth re))
         ((eq ch ?\()                   ; opener stays on the outer line
          (put-text-property pos (match-end 1) 'acm-depth depth re)
          (setq depth (1+ depth))
          (if (> depth max-depth) (setq max-depth depth)))
         ((eq ch ?\|)                   ; \| goes on its group's delimiter line
          (put-text-property pos (match-end 1) 'acm-depth (1- depth) re)
          (if (< (1- depth) min-depth) (setq min-depth (1- depth))))
         (t                             ; (eq ch ?\))
          (setq depth (1- depth))
          (if (< depth min-depth) (setq min-depth depth))
          (put-text-property pos (match-end 1) 'acm-depth depth re))))
      (setq start (if pos (match-end 1) (length re))))

    ;; print out the strings, one line per depth
    (setq line-no min-depth)
    (while (<= line-no max-depth)
      ;; get-buffer-create, so that a killed *scratch* is not an error.
      (with-current-buffer (get-buffer-create "*scratch*")
        (goto-char (point-max))
        (insert ?\n)
        (setq pr-line "")
        (setq start 0)
        (while (< start (length re))
          (setq pos (next-single-property-change start 'acm-depth re (length re)))
          (setq depth (get-text-property start 'acm-depth re))
          ;; Text at other depths is blanked out, which keeps the columns
          ;; aligned between lines.
          (setq pr-line
                (concat pr-line
                        (if (= depth line-no)
                            (substring re start pos)
                          (make-string (- pos start) ?\s))))
          (setq start pos))
        (insert pr-line)
        (setq line-no (1+ line-no))))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

> (Note: if there are no such tools, I might be tempted to craft one.  Two
> things that come to my mind are proper highlighting of matching parens
> of various kinds and eldoc-like hints for all the regex constructs -
> I never seem to remember what does "\\`" do, for instance.  Also,
> displaying the string with single backslashes and not in the way it is
> actually typed in in Elisp, with all the backslash escaping, might be
> helpful.  Would there be a demand for such a tool larger than one
> person?)
> Best,

> --
> Marcin Borkowski
> http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
> Faculty of Mathematics and Computer Science
> Adam Mickiewicz University

--
Alan Mackenzie (Nuremberg, Germany).
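
For anyone who would rather get the lowered lines back as a value than
have them written into *scratch*, here is a minimal sketch of the same
depth-lowering idea.  It is a simplified re-implementation under stated
assumptions, not the tool above: the name my-regexp-depth-lines is
hypothetical, it walks the string token by token instead of using text
properties, and, like pp-regexp, it does not treat bracket expressions
such as [\(] specially.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-regexp-depth-lines (regexp)
  "Return REGEXP as a list of strings, one per group-nesting depth.
Group delimiters and \\| stay on the enclosing line, group contents go
one line lower; text belonging to other depths is replaced by spaces."
  (let ((depth 0) (min-depth 0) (max-depth 0)
        (pos 0) (len (length regexp))
        (tokens nil))                   ; reversed list of (DEPTH . TEXT)
    (while (< pos len)
      (let* (;; Character after a backslash, or nil for ordinary text.
             (next (and (eq (aref regexp pos) ?\\)
                        (< (1+ pos) len)
                        (aref regexp (1+ pos))))
             ;; End of this token: 1 char, 2 chars, or 4 for \(?:
             (end (cond ((null next) (1+ pos))
                        ((and (eq next ?\()
                              (eq (string-match-p "\\?:" regexp (+ pos 2))
                                  (+ pos 2)))
                         (+ pos 4))
                        (t (+ pos 2))))
             (tok-depth depth))
        (cond ((eq next ?\() (setq depth (1+ depth)))
              ((eq next ?\)) (setq depth (1- depth)
                                   tok-depth depth))
              ((eq next ?|) (setq tok-depth (1- depth))))
        (setq min-depth (min min-depth tok-depth)
              max-depth (max max-depth depth))
        (push (cons tok-depth (substring regexp pos end)) tokens)
        (setq pos end)))
    (setq tokens (nreverse tokens))
    ;; One output string per depth, columns kept aligned with spaces.
    (mapcar (lambda (level)
              (mapconcat (lambda (tok)
                           (if (= (car tok) level)
                               (cdr tok)
                             (make-string (length (cdr tok)) ?\s)))
                         tokens ""))
            (number-sequence min-depth max-depth))))

;; Usage: show the lines in the echo area and *Messages*.
(dolist (line (my-regexp-depth-lines "a\\(b\\|c\\)d"))
  (message "%s" line))
```

On Marcin's regexp this should give two lines: the \(?: and \)
delimiters and the \| on the first, the group contents on the second.
Because the lines are shown with message, they appear with single
backslashes rather than in their doubled Lisp read syntax.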