From: Tobias Geerinckx-Rice <me@tobias.gr>
To: Mortimer Cladwell <mbcladwell@gmail.com>
Cc: help-guix@gnu.org
Subject: Re: Help needed with substitute* command
Date: Thu, 06 Jan 2022 14:50:15 +0100
Message-ID: <87k0fchqjz.fsf@nckx>
In-Reply-To: <CAOcxjM5hvMypudEOzUL8T+468HupS2wPW=KUthv_Oyi3wb9TDg@mail.gmail.com>
Hullo Mortimer,
I hope this answer isn't too basic for you. This input:
Mortimer Cladwell wrote:
> ---input.txt(2)-------
> foo(abc)bar(def)
does not match the extended regular expression:
> "foo([a-z]+)bar(.*)$"
This would:
---input.txt(3)-------
foolishbarista
result: bazlishista
I'm not the one to either write or recommend a tutorial on
extended regular expressions, but you'll find plenty on the 'net.
There's also ‘info (grep)Regular Expressions’ which might be good.
These things aren't specific to Guile, although a few dialects
exist, and I think Guile uses the POSIX one. The differences are
quite small.
In this specific example
> ("foo([a-z]+)bar(.*)$" all letters end)
the first string is an extended regular expression.
It will match a literal ‘foo’ anywhere on a line, followed by 1 or
more lowercase letters, followed by a literal ‘bar’, followed by
anything until the end of the line.
It will NOT match anything with ‘()’ brackets in it, like your
original input.txt(2). The brackets are regexp syntax used for
grouping and capturing.
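You can check this yourself at a Guile REPL. This is only a sketch, assuming plain Guile with the (ice-9 regex) module rather than substitute* itself; ‘rx’ and ‘matches?’ are names I made up for illustration:

```scheme
(use-modules (ice-9 regex))

;; The extended regular expression from the manual example.
(define rx "foo([a-z]+)bar(.*)$")

;; string-match returns a match object on success and #f otherwise;
;; this hypothetical helper turns that into a plain boolean.
(define (matches? line)
  (if (string-match rx line) #t #f))

(display (matches? "foolishbarista"))   ; the regexp matches this line
(newline)
(display (matches? "foo(abc)bar(def)")) ; no match: '(' is not in [a-z]
(newline)
```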
If an optional variable name follows the regexp, it will be set to
the complete match. Here, that is ‘all’, which in our example
will contain "foolishbarista". It's not used here.
In practice, this variable would be named ‘_’ to indicate that
it's unimportant:
(("foo([a-z]+)bar(.*)$" _ letters end)
(string-append "baz" letters end))
but the author of the manual example thought that ‘all’ would be
more clear.
Each subsequent optional variable will be set to the content
matched by () groups. Here, ‘letters’ will be set to whatever
matched ‘[a-z]+’, and ‘end’ to whatever matched ‘.*’.
In our example ‘letters’ is "lish" and ‘end’ is "ista".
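The same values can be inspected outside substitute*. A minimal sketch, assuming (ice-9 regex) from plain Guile; substitute* does the binding for you, so the explicit match:substring calls are only there to show where each variable comes from:

```scheme
(use-modules (ice-9 regex))

(define m (string-match "foo([a-z]+)bar(.*)$" "foolishbarista"))

;; Index 0 is the complete match; 1 and 2 are the () groups in order.
(define all     (match:substring m 0))
(define letters (match:substring m 1))
(define end     (match:substring m 2))

(for-each (lambda (s) (display s) (newline))
          (list all letters end))
```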
This is powerful, because we can construct arbitrary strings at
run time that can differ significantly for each line that
matches the same regexp:
> (string-append "baz" letter end)
is just Scheme code that uses the captured variables above,
without hard-coding assumptions about what was matched.
footbarnacles → baztnacles
foodiebarmaid → bazdiemaid
…
Minutes of fun.
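If you want to try more lines without touching any files, here is a rough stand-in for what the clause does to a single line. It is a sketch under assumptions: ‘rewrite-line’ is a hypothetical helper of mine, not part of Guix, and the real substitute* works on whole files:

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper: replace the matched part of LINE with a string
;; built from the captured groups, or return LINE unchanged if the
;; regexp does not match.
(define (rewrite-line line)
  (let ((m (string-match "foo([a-z]+)bar(.*)$" line)))
    (if m
        (let ((letters (match:substring m 1))
              (end     (match:substring m 2)))
          ;; Keep whatever preceded the match, then the replacement.
          (string-append (match:prefix m) "baz" letters end))
        line)))

(for-each (lambda (line)
            (display (rewrite-line line))
            (newline))
          '("foolishbarista" "footbarnacles" "foodiebarmaid"
            "foo(abc)bar(def)"))
```

The last line comes back unchanged, because the regexp does not match it.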
This special meaning of ‘()’ in extended regexps means that if you
wanted to match:
---input.txt(4)-------
fo(bizzle)
you'd write:
"fo\\(bizzle\\)"
Because "\" in a string *also* has special meaning to Guile
itself, we have to write "\\(" if we want the regexp engine to see
"\(".
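The two layers of escaping are easy to see at the REPL. Again just a sketch assuming (ice-9 regex) in plain Guile; the variable names are mine:

```scheme
(use-modules (ice-9 regex))

;; What we type in Scheme source...
(define escaped "fo\\(bizzle\\)")

;; ...and what the regexp engine actually receives: fo\(bizzle\)
(display escaped)
(newline)

;; With the brackets escaped, they match literal '(' and ')'.
(define literal-match (string-match escaped "fo(bizzle)"))

;; Unescaped, the brackets only group, so the regexp expects "fobizzle".
(define group-on-brackets (string-match "fo(bizzle)" "fo(bizzle)"))
(define group-on-plain    (string-match "fo(bizzle)" "fobizzle"))

(display (list (and literal-match #t)
               (and group-on-brackets #t)
               (and group-on-plain #t)))
(newline)
```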
> Is the letters/letter in the manual a typo? If I use letter I
> get
> "...unbound variable..."
Yes, that was a typo, both names should match. I've fixed it.
Thanks for apparently being the first to test this snippet!
Kind regards,
T G-R