From: Mortimer Cladwell <mbcladwell@gmail.com>
To: Tobias Geerinckx-Rice <me@tobias.gr>
Cc: help-guix <help-guix@gnu.org>
Subject: Re: Help needed with substitute* command
Date: Thu, 6 Jan 2022 09:53:26 -0500
Message-ID: <CAOcxjM6YbLZb7V601WkP_k9EgMTmj2M=q+wpUVGYyivnF_2LQg@mail.gmail.com>
In-Reply-To: <87k0fchqjz.fsf@nckx>

Thanks for the detailed explanation Tobias.  Guess I need to brush up on
regular expressions.
Mortimer

On Thu, Jan 6, 2022 at 9:43 AM Tobias Geerinckx-Rice <me@tobias.gr> wrote:

> Hullo Mortimer,
>
> I hope this answer isn't too basic for you.  This input:
>
> Mortimer Cladwell wrote:
> >   ---input.txt(2)-------
> >   foo(abc)bar(def)
>
> does not match the extended regular expression:
>
> > "foo([a-z]+)bar(.*)$"
>
> This would:
>
>   ---input.txt(3)-------
>   foolishbarista
>
> result: bazlishista
>
> I'm not the one to either write or recommend a tutorial on
> extended regular expressions, but you'll find plenty on the 'net.
> There's also ‘info (grep)Regular Expressions’ which might be good.
> These things aren't specific to Guile, although a few dialects
> exist, and I think Guile uses the POSIX one.  The differences are
> quite small.
>
> In this specific example
>
> > ("foo([a-z]+)bar(.*)$" all letters end)
>
> the first string is an extended regular expression.
>
> It will match a literal ‘foo’ anywhere on a line, followed by 1 or
> more lowercase letters, followed by a literal ‘bar’, followed by
> anything until the end of the line.
>
> It will NOT match anything with ‘()’ brackets in it, like your
> original input.txt(2).  The brackets are regexp syntax used for
> grouping and capturing.
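>
> A minimal sketch to check this at a Guile REPL, assuming plain Guile
> with the (ice-9 regex) module rather than substitute* itself
> (‘matches?’ is just a helper name made up for this example):
>
> ```scheme
> (use-modules (ice-9 regex))
>
> (define pattern "foo([a-z]+)bar(.*)$")
>
> ;; string-match returns a match object on success and #f on failure.
> (define (matches? line)
>   (if (string-match pattern line) #t #f))
>
> ;; Lowercase letters follow "foo", so this line matches.
> (display (matches? "foolishbarista"))
> (newline)
>
> ;; "(" is not in [a-z], so this line does not match.
> (display (matches? "foo(abc)bar(def)"))
> (newline)
> ```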
>
> If an optional variable name follows the regexp, it will be set to
> the complete match.  Here, that is ‘all’, which in our example
> will contain "foolishbarista".  It's not used here.
>
> In practice, this variable would be named ‘_’ to indicate that
> it's unimportant:
>
>   (("foo([a-z]+)bar(.*)$" _ letters end)
>    (string-append "baz" letters end))
>
> but the author of the manual example thought that ‘all’ would be
> more clear.
>
> Each subsequent optional variable will be set to the content
> matched by () groups.  Here, ‘letters’ will be set to whatever
> matched ‘[a-z]+’, and ‘end’ to whatever matched ‘.*’.
>
> In our example ‘letters’ is "lish" and ‘end’ is "ista".
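>
> The same bindings can be reproduced by hand, as a rough sketch that
> assumes plain Guile with (ice-9 regex) and only approximates what
> substitute* does for a single line (‘rewrite’ is a hypothetical
> helper, not part of Guix):
>
> ```scheme
> (use-modules (ice-9 regex))
>
> (define (rewrite line)
>   (let ((m (string-match "foo([a-z]+)bar(.*)$" line)))
>     (if m
>         (let ((all     (match:substring m 0))   ; complete match, unused
>               (letters (match:substring m 1))   ; first () group
>               (end     (match:substring m 2)))  ; second () group
>           ;; Keep whatever preceded the match, replace the match itself.
>           (string-append (match:prefix m) "baz" letters end))
>         line)))                                 ; no match: line unchanged
>
> (display (rewrite "foolishbarista"))
> (newline)
> ```
>
> For "foolishbarista" this yields "bazlishista", as above.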
>
> This is powerful, because we can construct arbitrary strings at
> run time that can differ significantly for each line that
> matches the same regexp:
>
> > (string-append "baz" letter end)
>
> is just Scheme code that uses the captured variables above,
> without hard-coding assumptions about what was matched.
>
>   footbarnacles → baztnacles
>   foodiebarmaid → bazdiemaid
>   …
>
> Minutes of fun.
>
> This special meaning of ‘()’ in extended regexps means that if you
> wanted to match:
>
>   ---input.txt(4)-------
>   fo(bizzle)
>
> you'd write:
>
>   "fo\\(bizzle\\)"
>
> Because "\" in a string *also* has special meaning to Guile
> itself, we have to write "\\(" if we want the regexp engine to see
> "\(".
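>
> A small sketch of the difference, again assuming plain Guile with
> (ice-9 regex); the variable names are made up for illustration:
>
> ```scheme
> (use-modules (ice-9 regex))
>
> ;; In Guile source, "\\(" is a two-character string: a backslash and
> ;; a bracket.  The regexp engine reads that as a literal "(".
> (define literal-brackets  "fo\\(bizzle\\)")
> ;; Unescaped brackets only group, so this pattern means "fobizzle".
> (define grouping-brackets "fo(bizzle)")
>
> (define (matches? pattern line)
>   (if (string-match pattern line) #t #f))
>
> (display (string-length "\\("))                        ; 2
> (newline)
> (display (matches? literal-brackets  "fo(bizzle)"))    ; #t
> (newline)
> (display (matches? grouping-brackets "fo(bizzle)"))    ; #f
> (newline)
> (display (matches? grouping-brackets "fobizzle"))      ; #t
> (newline)
> ```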
>
> >   Is the letters/letter in the manual a typo?  If I use letter I
> >   get
> > "...unbound variable..."
>
> Yes, that was a typo, both names should match.  I've fixed it.
> Thanks for apparently being the first to test this snippet!
>
> Kind regards,
>
> T G-R
>
