From: Emanuel Berg <embe8573@student.uu.se>
To: help-gnu-emacs@gnu.org
Subject: Re: How to grok a complicated regex?
Date: Fri, 13 Mar 2015 23:46:48 +0100
Message-ID: <87twxo1pnr.fsf@debian.uxu>
In-Reply-To: mailman.1979.1426282552.31049.help-gnu-emacs@gnu.org

Marcin Borkowski <mbork@wmi.amu.edu.pl> writes:

> so I have this monstrosity [note: I know, there are
> much worse ones, too!]:
>
> "\\`\\(?:\\\\[([]\\|\\$+\\)?\\(.*?\\)\\(?:\\\\[])]\\|\\$+\\)?\\'"
>
> (it's in the org-latex--script-size function in
> ox-latex.el, if you're curious).
>
> I'm not asking “what does this match” – I can read
> it myself. But it comes with a considerable effort.

I dare say most people (even programmers) cannot read
that, so if you can, that's great. As a math
professional you are of course aware of the discipline
called automata theory, which deals with such things.
Perhaps relational algebra might help too, if the data
in the sets are strings. But automata theory should
fit even better.

Also, remember you don't have to understand those
expressions. Often they are set up incrementally. They
only need to be correct. The computer understands them
- the programmer only understands the purpose, and the
latest edition. Kind of risky, and perhaps not what a
math person would find appealing, but I've constructed
many that way so I know that method works.
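
A minimal sketch of that incremental approach, applied
to the regex quoted above. The `my-` names are
hypothetical and are not taken from ox-latex.el; the
pieces are simply the quoted string cut at its group
boundaries, so concatenating them should give back the
same string.

```elisp
;; Hypothetical names, for illustration only.

;; Opening math delimiter: backslash + ( or [, or one or more $.
(defconst my-math-open "\\(?:\\\\[([]\\|\\$+\\)")

;; Closing math delimiter: backslash + ] or ), or one or more $.
(defconst my-math-close "\\(?:\\\\[])]\\|\\$+\\)")

;; Group 1: the body, matched non-greedily (no newlines).
(defconst my-math-body "\\(.*?\\)")

;; Whole string: optional opener, body, optional closer.
(defconst my-script-size-regexp
  (concat "\\`" my-math-open "?" my-math-body my-math-close "?" "\\'"))

(defun my-strip-math-delimiters (string)
  "Return STRING without surrounding math delimiters, or nil if no match."
  (when (string-match my-script-size-regexp string)
    (match-string 1 string)))
```

Each piece can be tried on its own with `string-match`
before it is glued to the next one, which is the whole
point of setting the expression up step by step.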

> Are you aware of any tools that might help to
> understand such regexen?

I have seen tools with which you can construct such
expressions and they output figures, states,
transitions, and so on. I wonder how advanced an
expression they can deal with. But if you get the
basics right, it should be just basic building blocks
that stick together, and from there on the sky is the
limit.
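
Short of a graphical tool, the "building blocks that
stick together" idea can be tried out with the `rx`
macro that ships with Emacs. What follows is only a
sketch: it is my assumed reading of the quoted regex,
the `my-` names are hypothetical, and the string `rx`
generates may differ in spelling from the original
even where it matches the same input.

```elisp
(require 'rx)

;; Assumed rx rendering of the quoted regex; hypothetical names.
(defconst my-script-size-rx
  (rx string-start
      ;; Optional opener: backslash + ( or [, or a run of $.
      (optional (or (seq "\\" (any "([")) (+ "$")))
      ;; Group 1: the body, non-greedy, no newlines.
      (group (*? nonl))
      ;; Optional closer: backslash + ] or ), or a run of $.
      (optional (or (seq "\\" (any "])")) (+ "$")))
      string-end))

(defun my-script-size-body (string)
  "Return the part of STRING between optional math delimiters, or nil."
  (when (string-match my-script-size-rx string)
    (match-string 1 string)))
```

Evaluating `my-script-size-rx` shows the string form
that `rx` produced, so the two notations can be
compared side by side.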

The real problem, as I see it, is this: will those
figures, balls and arrows, tagged with preconditions,
postconditions, and everything else you can think of,
actually be *clearer*?

If I were to do it (which I am not, thank God) my
answer would be *no*. The only way I could do it would
instead be the opposite: train the brain with such
expressions - exactly as they are - day in, day out,
until they are second nature.

Example: a C++ OO project with classes and everything.
Silly inheritance and interfaces. Some people would
consider those pretty darn difficult to understand.
But to the seasoned C++ programmer (no exaggeration
here, a few years of focused training is enough) those
programs are clear. For those guys, giving up writing
C++ code and instead using some other representation
(be it graphical or not) would cripple their skills in
one stroke.

So no, I think that representation is the best there
is. Translating it back and forth would be very
difficult. And even where it is possible (which of
course it is, since any representation is just one of
I don't know how many possible ones), I don't see the
end result being any clearer: on the contrary, most
likely.

What I would do is try to make it more readable by
using character classes, string classes (do they
exist?), and even more advanced constructs if
necessary, as in this simple example:

    (defconst stop-char-default "\\([[:punct:]]\\|[[:space:]][[:alnum:]]\\)")

How do you define those? Can you identify any which
aren't there, but could/should be?

Example: say there is a class called "delimiters"
which contains [, (, {, <, >, }, ), and ]. Can you
split that up into "opening-delimiters" and
"closing-delimiters"?
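
A rough sketch of such a split, using plain bracket
expressions. Emacs has no built-in classes by these
names as far as I know, so everything prefixed `my-`
here is hypothetical.

```elisp
;; Hypothetical names; not part of Emacs or any library.

(defconst my-opening-delimiters "[[({<]"
  "Bracket expression matching one opening delimiter.")

(defconst my-closing-delimiters "[])}>]"
  "Bracket expression matching one closing delimiter.")

(defconst my-delimiters
  (concat "\\(?:" my-opening-delimiters
          "\\|" my-closing-delimiters "\\)")
  "Regexp matching any single delimiter, opening or closing.")

(defun my-delimiter-kind (char)
  "Return `opening', `closing', or nil for the character CHAR."
  (let ((s (string char)))
    (cond ((string-match-p my-opening-delimiters s) 'opening)
          ((string-match-p my-closing-delimiters s) 'closing))))
```

Note the ordering inside the brackets: a `]` has to
come first in a bracket expression to be taken
literally.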

Second, exactly what you mentioned - the font lock
issue - work on that.

You do know, of course, of

    font-lock-regexp-grouping-construct
    font-lock-regexp-grouping-backslash

Are there more of those, that you can identify, and
add?
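
As a starting point, the two names above are faces, so
they can be restyled to make the grouping constructs
stand out in Emacs Lisp strings. This is only a sketch;
the weight and colour are arbitrary choices.

```elisp
;; Restyle the two regexp grouping faces named above.
;; The attribute values are arbitrary, illustrative choices.
(dolist (face '(font-lock-regexp-grouping-construct
                font-lock-regexp-grouping-backslash))
  (when (facep face)                    ; skip if the face is not defined
    (set-face-attribute face nil :weight 'bold :foreground "red")))
```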

-- 
underground experts united

