unofficial mirror of help-guix@gnu.org 
* Help needed with substitute* command
@ 2022-01-06 12:53 Mortimer Cladwell
  2022-01-06 13:50 ` Tobias Geerinckx-Rice
From: Mortimer Cladwell @ 2022-01-06 12:53 UTC (permalink / raw)
  To: help-guix

Hi,
I do not understand the second substitute* example in the Guix Manual.
I have the executable mysubs.scm:

--mysubs.scm---------------------------------------------------------
(add-to-load-path
"/gnu/store/rjzj1z89jqcb60nhg5gknkibcl84b3jb-guix-29745d23b-modules/share/guile/site/3.0")
(add-to-load-path ".")

(use-modules
    (guix build utils)
    (ice-9 rdelim)
    (ice-9 popen)
    (ice-9 regex) ;;list-matches
    (ice-9 pretty-print))

(define (main args)
  (begin
    (substitute* "./input.txt"
      (("hello")
       "good morning\n")
      (("foo([a-z]+)bar(.*)$" all letters end)
       (string-append "baz" letters end))))
  (pretty-print "done"))

  ---------end------------------------------------------------------------

  I run this on 2 different inputs representing the two examples in the
manual:


  ---input.txt(1)------
  hello

  ---------------------

  result: good morning


  ---input.txt(2)-------
  foo(abc)bar(def)

  -------------------

  result:  foo(abc)bar(def)

  Why no substitution with the second example?
  Is the letters/letter in the manual a typo?  If I use letter I get
"...unbound variable..."

  Thanks
  Mortimer


* Re: Help needed with substitute* command
  2022-01-06 12:53 Help needed with substitute* command Mortimer Cladwell
@ 2022-01-06 13:50 ` Tobias Geerinckx-Rice
  2022-01-06 14:53   ` Mortimer Cladwell
From: Tobias Geerinckx-Rice @ 2022-01-06 13:50 UTC (permalink / raw)
  To: Mortimer Cladwell; +Cc: help-guix

Hullo Mortimer,

I hope this answer isn't too basic for you.  This input:

Mortimer Cladwell wrote:
>   ---input.txt(2)-------
>   foo(abc)bar(def)

does not match the extended regular expression:

> "foo([a-z]+)bar(.*)$"

This would:

  ---input.txt(3)-------
  foolishbarista

result: bazlishista

I'm not the one to either write or recommend a tutorial on 
extended regular expressions, but you'll find plenty on the 'net. 
There's also ‘info (grep)Regular Expressions’ which might be good. 
These things aren't specific to Guile, although a few dialects 
exist, and I think Guile uses the POSIX one.  The differences are 
quite small.

In this specific example

> ("foo([a-z]+)bar(.*)$" all letters end)

the first string is an extended regular expression.

It will match a literal ‘foo’ anywhere on a line, followed by 1 or 
more lowercase letters, followed by a literal ‘bar’, followed by 
anything until the end of the line.
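A quick way to check this at a Guile REPL, without touching any file, is to compile the same pattern and try it on both inputs.  This is only a sketch: `matches?` is a helper name made up for the illustration, and it assumes that `substitute*` compiles its string patterns as extended regexps, as `make-regexp` does by default.

```scheme
(use-modules (ice-9 regex))

;; Same pattern as in the manual example.  make-regexp compiles it as a
;; POSIX extended regexp unless told otherwise.
(define rx (make-regexp "foo([a-z]+)bar(.*)$"))

;; Hypothetical helper: #t if LINE matches, #f otherwise.
;; regexp-exec returns a match object on success and #f on failure.
(define (matches? line)
  (and (regexp-exec rx line) #t))

(display (matches? "foolishbarista"))    ; prints #t
(newline)
(display (matches? "foo(abc)bar(def)"))  ; prints #f: "(" is not in [a-z]
(newline)
```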

It will NOT match your original input.txt(2): there, ‘foo’ is 
followed by a literal ‘(’, which is not a lowercase letter.  In a 
regexp, unescaped ‘()’ brackets are syntax used for grouping and 
capturing; they do not stand for themselves.

If an optional variable name follows the regexp, it will be set to 
the complete match.  Here, that is ‘all’, which in our example 
will contain "foolishbarista".  It's not used here.

In practice, this variable would be named ‘_’ to indicate that 
it's unimportant:

  (("foo([a-z]+)bar(.*)$" _ letters end)
   (string-append "baz" letters end))

but the author of the manual example thought that ‘all’ would be 
more clear.

Each subsequent optional variable will be set to the content 
matched by () groups.  Here, ‘letters’ will be set to whatever 
matched ‘[a-z]+’, and ‘end’ to whatever matched ‘.*’.

In our example ‘letters’ is "lish" and ‘end’ is "ista".
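The same bindings can be reproduced by hand with Guile's (ice-9 regex) module.  The sketch below only mirrors what the variables in a `substitute*` clause would hold for this one string; it is not how `substitute*` itself is implemented.  Note that the input here has no trailing newline.

```scheme
(use-modules (ice-9 regex))

;; string-match returns a match object, or #f when there is no match.
(define m (string-match "foo([a-z]+)bar(.*)$" "foolishbarista"))

(define all     (match:substring m 0))  ; the complete match
(define letters (match:substring m 1))  ; first () group: [a-z]+
(define end     (match:substring m 2))  ; second () group: .*

(display (list all letters end))             ; prints (foolishbarista lish ista)
(newline)
(display (string-append "baz" letters end))  ; prints bazlishista
(newline)
```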

This is powerful, because we can construct arbitrary strings at 
run time that can differ significantly for each line that matches 
the same regexp:

> (string-append "baz" letters end)

is just Scheme code that uses the captured variables above, 
without hard-coding assumptions about what was matched.

  footbarnacles → baztnacles
  foodiebarmaid → bazdiemaid
  …

Minutes of fun.
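To play with more inputs, the per-line rewrite can be approximated by a small procedure.  `rewrite` is a name invented for this sketch; it replaces only the matched part of a line and returns non-matching lines unchanged, which is roughly what one would expect from a single `substitute*` clause.

```scheme
(use-modules (ice-9 regex))

(define rx (make-regexp "foo([a-z]+)bar(.*)$"))

;; Hypothetical helper: rewrite LINE if it matches, else return it as is.
(define (rewrite line)
  (let ((m (regexp-exec rx line)))
    (if m
        (string-append (match:prefix m)        ; text before the match
                       "baz"
                       (match:substring m 1)   ; letters
                       (match:substring m 2)   ; end
                       (match:suffix m))       ; text after the match
        line)))

(for-each (lambda (line)
            (display (rewrite line))
            (newline))
          '("footbarnacles" "foodiebarmaid" "hello"))
;; prints baztnacles, bazdiemaid and hello, one per line
```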

This special meaning of ‘()’ in extended regexps means that if you 
wanted to match:

  ---input.txt(4)-------
  fo(bizzle)

you'd write:

  "fo\\(bizzle\\)"

Because "\" in a string *also* has special meaning to Guile 
itself, we have to write "\\(" if we want the regexp engine to see 
"\(".
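The two levels of escaping can be seen directly at the REPL.  The sketch below uses `string-match` and `regexp-quote` from (ice-9 regex); `pattern` is just a name chosen for the example.

```scheme
(use-modules (ice-9 regex))

;; The Scheme reader turns each "\\" into a single backslash, so the
;; regexp engine sees the 12 characters fo\(bizzle\).
(define pattern "fo\\(bizzle\\)")

(display (string-length pattern))                    ; prints 12
(newline)

;; Escaped brackets match literal brackets in the input.
(display (and (string-match pattern "fo(bizzle)") #t))       ; prints #t
(newline)

;; Unescaped brackets only group: the pattern means "fobizzle".
(display (and (string-match "fo(bizzle)" "fo(bizzle)") #t))  ; prints #f
(newline)

;; regexp-quote adds the escapes for us when the whole string is
;; meant literally.
(display (and (string-match (regexp-quote "fo(bizzle)") "fo(bizzle)") #t))
(newline)
```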

>   Is the letters/letter in the manual a typo?  If I use letter I 
>   get
> "...unbound variable..."

Yes, that was a typo, both names should match.  I've fixed it. 
Thanks for apparently being the first to test this snippet!

Kind regards,

T G-R


* Re: Help needed with substitute* command
  2022-01-06 13:50 ` Tobias Geerinckx-Rice
@ 2022-01-06 14:53   ` Mortimer Cladwell
From: Mortimer Cladwell @ 2022-01-06 14:53 UTC (permalink / raw)
  To: Tobias Geerinckx-Rice; +Cc: help-guix

Thanks for the detailed explanation Tobias.  Guess I need to brush up on
regular expressions.
Mortimer

