* Tricky Regexp - How to insert a marker every 3rd number in a sequence that begins with a certain delimiter
From: gnuist006 @ 2015-06-06 17:44 UTC
  To: help-gnu-emacs

tricky regexp

How can I insert a marker before every group of three numbers in a sequence that begins with one delimiter, ends with another, and whose length is a multiple of three?

I want to isolate sequences like this in a text and to work on them only.

Given:-

text
text
BEGIN N N N END
text BEGIN N N N N N N END
some text BEGIN N N N N N N N N N END
text
N N N N N N N
text

The sequences I want to work on start with BEGIN and end with END, with an exact multiple of 3 N's in between, separated by single spaces only. I want to place a newline before every group of 3 N's. So the above text would transform to

text
text
BEGIN 
N N N 
Z
text BEGIN 
N N N 
N N N 
Z
some text BEGIN 
N N N 
N N N 
N N N 
Z
text
N N N N N N N
text


More precisely, in a "BEGIN N N N N N N N N N END" type sequence, I want to insert a \n before every N whose zero-based position is 3n, where n = 0, 1, 2, ... (the first N has position 0). I also want to insert a \n right before such an END.


My efforts:

(replace-regexp "BEGIN \\([0-9]\\) \\([0-9]\\) \\([0-9]\\) END" "BEGIN \n\\1 \\2 \\3 \nEND")



* Re: Tricky Regexp - How to insert a marker every 3rd number in a sequence that begins with a certain delimiter
From: Emanuel Berg @ 2015-06-06 18:15 UTC
  To: help-gnu-emacs

gnuist006@gmail.com writes:

> tricky regexp
>
> How to insert a marker every 3rd number in a sequence
> that begins with a certain delimiter, and ends with
> a certain delimiter and its length is a multiple
> of three?
>
> I want to isolate sequences like this in a text and to
> work on them only.
>
> Given:-
>
> text
> text
> BEGIN N N N END
> text BEGIN N N N N N N END
> some text BEGIN N N N N N N N N N END
> text
> N N N N N N N
> text
>
> The sequences I want to work on start with BEGIN and
> end with END with exact multiple of 3 N's in between
> with only single space. I want to place a newline
> before every 3 N's. So the above text would transform
> to
>
> text
> text
> BEGIN
> N N N
> Z
> text BEGIN
> N N N
> N N N
> Z
> some text BEGIN
> N N N
> N N N
> N N N
> Z
> text
> N N N N N N N
> text

I don't have a lot of confidence this can be done with
regexps. Even if it can (?), say you want to change
some detail or do something else (but similar) next
time? Oh, no! More regexps!

Instead, you would be helped by a parser to crunch and
tokenize the input string, then some (small) program
to act on those tokens and produce the output string
accordingly. In particular, one such token would be
"3-SEQ" (or something like that), representing the
sequences you describe (they are equivalent no matter
how many Ns, as long as the count is a multiple of 3).
Then it would just be a matter of checking "is the
current token a 3-SEQ?" - "if so, insert Z!"

I don't know if parser and compiler-compiler tools
are easier for this or if it is easier to just do it
in Lisp (Elisp). If you already know those tools, they
would probably be easier. If you don't, Lisp would be
just as good a bet.
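
A rough sketch of the tokenizing idea in Elisp, working on one line held in a string. The names `my-group-in-threes' and `my-transform-line' are made up for illustration, and the literal tokens "BEGIN", "N", "END" and the replacement "Z" are taken from the example in the question. It assumes tokens are separated by spaces, so runs of spaces in the input collapse to a single space in the output.

```elisp
(require 'cl-lib)

(defun my-group-in-threes (items)
  "Split the list ITEMS into consecutive sublists of three elements.
Assumes the length of ITEMS is a multiple of three."
  (let (groups)
    (while items
      (push (cl-subseq items 0 3) groups)
      (setq items (nthcdr 3 items)))
    (nreverse groups)))

(defun my-transform-line (line)
  "Return LINE with each BEGIN N ... END run split into rows of three.
A run is rewritten only when it holds a positive multiple of three N
tokens; any other text is passed through unchanged."
  (let ((tokens (split-string line " " t))
        (pieces '()))
    (while tokens
      (let ((tok (pop tokens)))
        (if (not (string= tok "BEGIN"))
            (push tok pieces)
          ;; Collect the N tokens that follow BEGIN.
          (let ((ns '())
                (tail tokens))
            (while (and tail (string= (car tail) "N"))
              (push (car tail) ns)
              (setq tail (cdr tail)))
            (setq ns (nreverse ns))
            (if (and ns
                     (zerop (% (length ns) 3))
                     (equal (car tail) "END"))
                ;; This is the "3-SEQ" case: emit rows and skip past END.
                (progn
                  (push (concat "BEGIN\n"
                                (mapconcat (lambda (group)
                                             (mapconcat #'identity group " "))
                                           (my-group-in-threes ns)
                                           "\n")
                                "\nZ")
                        pieces)
                  (setq tokens (cdr tail)))
              ;; Not a valid run: keep BEGIN and let the loop copy the rest.
              (push tok pieces))))))
    (mapconcat #'identity (nreverse pieces) " ")))
```

Calling (my-transform-line "text BEGIN N N N N N N END") should give the BEGIN line, two rows of "N N N", and a final "Z", while a run with four N's is returned unchanged.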

-- 
underground experts united
http://user.it.uu.se/~embe8573


* Re: Tricky Regexp - How to insert a marker every 3rd number in a sequence that begins with a certain delimiter
From: John Mastro @ 2015-06-06 19:08 UTC
  To: gnuist006, help-gnu-emacs@gnu.org

<gnuist006@gmail.com> wrote:
> tricky regexp
>
> How to insert a marker every 3rd number in a sequence that begins with
> a certain delimiter, and ends with a certain delimiter and its length
> is a multiple of three?
>
> I want to isolate sequences like this in a text and to work on them
> only.
>
> Given:-
>
> text
> text
> BEGIN N N N END
> text BEGIN N N N N N N END
> some text BEGIN N N N N N N N N N END
> text
> N N N N N N N
> text
>
> The sequences I want to work on start with BEGIN and end with END with
> exact multiple of 3 N's in between with only single space. I want to
> place a newline before every 3 N's. So the above text would transform
> to
>
> text
> text
> BEGIN
> N N N
> Z
> text BEGIN
> N N N
> N N N
> Z
> some text BEGIN
> N N N
> N N N
> N N N
> Z
> text
> N N N N N N N
> text

Is there a particular reason you want/need to use a (single) regular
expression? It doesn't seem like a good fit to me. Unless you're
somehow restricted to a single `replace-regexp', you may as well
use more of Emacs's toolbox.

Here's some quick-and-dirty Lisp (which, of course, uses regular
expressions) that works on your example but would need more work and
refinement to serve your general purpose.

    (defun something ()
      (interactive)
      (save-excursion
        (goto-char (point-min))
        (while (re-search-forward "BEGIN" nil t)
          (insert "\n")
          (delete-horizontal-space)
          (while (looking-at "\\(^\\| \\)N N N\\( \\|$\\)")
            (goto-char (match-end 0))
            (delete-horizontal-space t)
            (insert "\n"))
          (when (looking-at "END")
            (replace-match "Z" nil nil nil 0)))))
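
One possible refinement, sketched under the assumption that a run should be left completely untouched unless it holds a whole number of "N N N" groups between BEGIN and END. The function name `my-split-valid-triplets' is hypothetical. It validates the whole run with a single regexp first and only then rewrites it, so a run such as "BEGIN N N N N END" is not partially modified.

```elisp
(defun my-split-valid-triplets ()
  "Reformat BEGIN N N N ... END runs in the current buffer.
Only runs holding a whole number of \" N N N\" groups are touched."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search nil))       ; match BEGIN, N and END literally
      (while (re-search-forward "BEGIN\\(\\(?: N N N\\)+\\) END" nil t)
        (let ((rows (save-match-data
                      ;; Turn each " N N N" group into its own line.
                      (replace-regexp-in-string
                       " \\(N N N\\)" "\n\\1" (match-string 1) t))))
          ;; The outer match data are intact, so this replaces the whole run.
          (replace-match (concat "BEGIN" rows "\nZ") t t))))))
```

The `save-match-data' wrapper is there so the inner string replacement cannot disturb the match that `replace-match' then operates on.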

-- 
john



* Re: Tricky Regexp - How to insert a marker every 3rd number in a sequence that begins with a certain delimiter
From: Emanuel Berg @ 2015-06-06 22:14 UTC
  To: help-gnu-emacs

John Mastro <john.b.mastro@gmail.com> writes:

> Is there a particular reason you want/need to use
> a (single) regular expression? It doesn't seem like
> a good fit to me. Unless you're somehow restricted
> to a single `replace-regexp', you may as well use
> more of Emacs's toolbox.
>
> Here's some quick-and-dirty Lisp (which, of course,
> uses regular expressions)

That's what I meant as well but you put it better.
You (the OP) should use BOTH Lisp and regexps!

> that works on your example but would need more work
> and refinement to serve your general purpose.

Are you sure there is one? :)

The only thing that might not be immediately obvious
from the code is the use of `re-search-forward' and
then `match-end'. re-search-forward with NOERROR (3rd
argument) supplied as t will return nil if there is no
hit. That will cause the loop to terminate.

While there are hits, match-end will return the
"position of end of text matched by last search", but
this isn't from the re-search-forward search but from
the `looking-at' "search".

Then the same mechanisms are at work with
`replace-match'.

And this general method is a good one! Search for
a regexp, then examine it (an easy fetch as the search
hit is stored), then insert or operate on it to
produce the desired output. Repeat until done.
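
A minimal illustration of that search, examine, replace loop, using a deliberately unrelated task (doubling every integer) so the three steps stand out. The function name `my-double-numbers' is made up for this sketch.

```elisp
(defun my-double-numbers (text)
  "Return TEXT with every run of digits replaced by twice its value."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; 1. Search: returns nil when nothing is left, which ends the loop.
    (while (re-search-forward "[0-9]+" nil t)
      ;; 2. Examine: the hit is stored in the match data.
      (let ((n (string-to-number (match-string 0))))
        ;; 3. Replace: operates on that same stored match.
        (replace-match (number-to-string (* 2 n)) t t)))
    (buffer-string)))
```

After each `replace-match' point sits at the end of the replacement, so the next search continues past the text just written.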

Just add water!

-- 
underground experts united
http://user.it.uu.se/~embe8573


* Re: Tricky Regexp - How to insert a marker every 3rd number in a sequence that begins with a certain delimiter
From: gnu ist @ 2015-06-07  0:57 UTC
  To: John Mastro, help-gnu-emacs@gnu.org, Emanuel Berg


Ok, your replies are quite helpful.

Now, let's focus on this part.

          (while (looking-at "\\(^\\| \\)N N N\\( \\|$\\)")
            (goto-char (match-end 0))
            (delete-horizontal-space t)
            (insert "\n"))


This successfully adds a newline after every three N's. I get a result
equivalent to John Mastro's if I use replace-regexp-in-string, which lets me
refer to the three numbers as \\1 \\2 \\3, while the text found by the outer
re-search-forward is available as (match-string 0).

A further requirement is that, within each triplet, I need to replace the
first N with the average of the first two N's and the last N with the average
of the last two N's, leaving the middle N as it is.

Unfortunately, this is where I get stuck. After re-search-forward,
replace-match can refer to tagged groups as (match-string 2) etc., and I can
build the replacement for the indefinite string "N N N ..." by evaluating
replace-regexp-in-string, which again is equivalent to John Mastro's loop.
But as soon as I try to do arithmetic on a group, I run into a problem. The
snippet is

(replace-match   (replace-regexp-in-string "\\([0-9.]+\\) \\([0-9.]+\\) \\([0-9.]+\\)"
(concat "\n" (format "%s " (string-to-number "\\1")) "\\2" .....))

The \\2 is correctly replaced by its string value, but the \\1, which goes
through two inverse transformations (string-to-number, then format), is not.
\\2 sits directly inside one function (concat), while \\1 is nested a level
deeper.

The symptom is that the value of \\1 is always zero. If I replace \\1 with,
say, 6, then I get the correct answer.


* Re: Tricky Regexp - How to insert a marker every 3rd number in a sequence that begins with a certain delimiter
From: John Mastro @ 2015-06-07  2:46 UTC
  To: gnu ist, help-gnu-emacs@gnu.org

> Unfortunately, the problem is that within re-search-forward, replace-match
> can reference tagged-groups as (match-string 2) etc but to do an eval for
> replace-match and get the indefinite string "N N N ..." which is equivalent
> to John Mastro's if I use replace-regexp-in-string.
> but if you want to an arithmetic on this I am getting a problem. The snippet
> is
>
> (replace-match   (replace-regexp-in-string "\\([0-9.]+\\) \\([0-9.]+\\)
> \\([0-9.]+\\)"
> (concat "\n" (format "%s " (string-to-number "\\1")) "\\2" .....))
>
> The second \\2 is correctly replaced by its string value, but the first by
> two inverse transformations is not. \\2 is just inside a function (concat)
> while \\1 is a second level.
>
> The error is that the value of \\1 is just zero. If I replace \\1 by lets
> say 6 then I get the correct answer.

I don't think the "\\1"-style references are going to help you, since
you need to do math with the numbers first.

So what you need is `match-string', like I used in my answer to your
previous question. That will let you get the numbers and do what you
like with them.

Also, `replace-regexp-in-string' probably isn't what you want, since
you're operating on buffer text (as opposed to a string). For this
purpose, stick with search functions (like `re-search-forward' and
`looking-at') followed by `replace-match'.
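
To illustrate why the "\\1" reference came out as zero, and what the string-based route could look like if the text really were in a string rather than a buffer, here is a small sketch. The function name `my-average-triplets' is hypothetical. It relies on the documented behaviour that `replace-regexp-in-string' accepts a function as its REP argument and calls it with the matched text.

```elisp
;; The literal two-character string "\1" is not a number, so this is 0.
;; It is evaluated before `replace-regexp-in-string' ever sees a match.
(string-to-number "\\1")

(defun my-average-triplets (string)
  "Return STRING with each \"A B C\" triplet turned into \"avg(A,B) B avg(B,C)\"."
  (replace-regexp-in-string
   "\\([0-9.]+\\) \\([0-9.]+\\) \\([0-9.]+\\)"
   (lambda (m)
     ;; M is the matched text; inside this function the match data refer to M.
     (let ((i (string-to-number (match-string 1 m)))
           (j (string-to-number (match-string 2 m)))
           (k (string-to-number (match-string 3 m))))
       (format "%s %s %s" (/ (+ i j) 2.0) (match-string 2 m) (/ (+ j k) 2.0))))
   string))
```

Because the arithmetic happens inside the function, it runs once per match with the real group contents, instead of once up front on the literal replacement template.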

This is what it would look like to essentially combine my answers from
yesterday and today:

(defvar number-triplet-regexp
  "\\(^\\| \\)\\([0-9.]+\\) \\([0-9.]+\\) \\([0-9.]+\\)\\( \\|$\\)")

(defun something ()
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "BEGIN" nil t)
      (insert "\n")
      (delete-horizontal-space)
      (while (looking-at number-triplet-regexp)
        (let ((i (string-to-number (match-string 2)))
              (j (string-to-number (match-string 3)))
              (k (string-to-number (match-string 4))))
          ;; Keep the middle number's text as-is between the two averages.
          (replace-match (format "%s %s %s\n"
                                 (/ (+ i j) 2.0) (match-string 3) (/ (+ j k) 2.0)))))
      (when (looking-at "END")
        (replace-match "Z")))))

(I pulled the regexp out to a defvar purely to combat line length; it
doesn't change anything about how it works).

-- 
john


