unofficial mirror of help-gnu-emacs@gnu.org
* extract lines with regexp
@ 2009-04-30  8:09 Sebastien LE MAGUER
  0 siblings, 0 replies; 9+ messages in thread
From: Sebastien LE MAGUER @ 2009-04-30  8:09 UTC (permalink / raw)
  To: help-gnu-emacs


Hi,

I would like to know how to extract lines using a regexp. My file
contains something like this:

<useless lines>
<line X>
theq() :
<useless lines>

and I want to extract all lines before theq (here, line X).

Does anyone have an idea?

Thanks

Sebastien


* Re: extract lines with regexp
       [not found] <mailman.6302.1241079206.31690.help-gnu-emacs@gnu.org>
@ 2009-04-30 18:57 ` Xah Lee
  2009-05-01 12:02   ` Sebastien Le Maguer
  2009-05-01 15:54   ` Ted Zlatanov
  0 siblings, 2 replies; 9+ messages in thread
From: Xah Lee @ 2009-04-30 18:57 UTC (permalink / raw)
  To: help-gnu-emacs

On Apr 30, 1:09 am, Sebastien LE MAGUER <Sebastien.Le_mag...@irisa.fr>
wrote:
> Hi,
>
> I wonder how to extract lines using a regexp. My file contains something
> like that :
>
> <useless lines>
> <line X>
> theq() :
> <useless lines>
>
> and I want to extract all lines before theq (here line X)
>
> does anyone have an idea ?

Regex alone is limited when it comes to extracting text that spans
multiple lines.

What you want can be done in Emacs, but we need a bit more detail. For
example, what pattern do the lines you want start with? “All lines
before theq” doesn't specify where the text begins.

A better solution is to use search-forward-regexp to find the begin
pattern and save that position, then search-forward-regexp again to
find the ending pattern “theq() :”, then search-backward-regexp to move
point to the beginning of “theq”. Then grab the region in between.

Here's some code (untested):

(let (p1 p2)
  (save-excursion
    (goto-char (point-min))
    (search-forward-regexp "^A.+$") ; begin pattern
    (setq p1 (point)) ; save cursor pos
    (search-forward-regexp "theq() :") ; ending pattern
    (backward-char 8)
    (setq p2 (point)) ; save cursor pos
    (setq mytext (buffer-substring p1 p2))
    )
  )

  Xah
∑ http://xahlee.org/

* Re: extract lines with regexp
  2009-04-30 18:57 ` extract lines with regexp Xah Lee
@ 2009-05-01 12:02   ` Sebastien Le Maguer
  2009-05-01 16:38     ` Peter Dyballa
  2009-05-01 15:54   ` Ted Zlatanov
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastien Le Maguer @ 2009-05-01 12:02 UTC (permalink / raw)
  To: help-gnu-emacs

In fact, I just need the line before theq().

All lines, except those which begin with "theq", follow the same
pattern:
../rep1/rep2/nom_ficXXX_refXX_sentX.wav

I can use your code to build what I want. I will send my solution
when it is finished.

Thanks a lot


* Re: extract lines with regexp
  2009-04-30 18:57 ` extract lines with regexp Xah Lee
  2009-05-01 12:02   ` Sebastien Le Maguer
@ 2009-05-01 15:54   ` Ted Zlatanov
  2009-05-01 21:53     ` Sebastien Le Maguer
  2009-05-04 17:44     ` Raymond Wiker
  1 sibling, 2 replies; 9+ messages in thread
From: Ted Zlatanov @ 2009-05-01 15:54 UTC (permalink / raw)
  To: help-gnu-emacs

On Thu, 30 Apr 2009 11:57:06 -0700 (PDT) Xah Lee <xahlee@gmail.com> wrote: 

XL> (let (p1 p2)
XL>   (save-excursion
XL>     (goto-char (point-min))
XL>     (search-forward-regexp "^A.+$") ; begin pattern
XL>     (setq p1 (point)) ; save cursor pos
XL>     (search-forward-regexp "theq() :") ; ending pattern
XL>     (backward-char 8)
XL>     (setq p2 (point)) ; save cursor pos
XL>     (setq mytext (buffer-substring p1 p2))
XL>     )
XL>   )

I don't think your first pattern is exactly what the OP needed.

You can use (forward-line -1) to move point to the beginning of the
previous line; (beginning-of-line 0) does the same.  Also, you don't
need search-forward-regexp the second time, just search-forward will
work.  Plus, of course, (backward-char 8) is just asking for trouble.
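
Put together, those suggestions might look like the sketch below. The
function name my-line-before-theq is made up for illustration, and it
only looks at the first occurrence of the marker.

```elisp
;; Hypothetical helper; the name is not from the thread.
(defun my-line-before-theq ()
  "Return the line preceding the first \"theq() :\" in the buffer, or nil."
  (save-excursion
    (goto-char (point-min))
    ;; Plain search-forward is enough: the end marker is literal text.
    (when (search-forward "theq() :" nil t)
      ;; Jump to the start of the match instead of counting characters.
      (goto-char (match-beginning 0))
      ;; forward-line returns 0 only if it really moved up one line.
      (when (zerop (forward-line -1))
        (buffer-substring-no-properties (line-beginning-position)
                                        (line-end-position))))))
```

Using (match-beginning 0) keeps the code correct even if the marker
text changes length later.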

Anyhow, regular expressions can handle multiple lines just fine:

A
theq() :
non
B
theq() :

(save-excursion
  (goto-char (point-min))
  (while (re-search-forward "\\(.*\\)\ntheq() :" nil t)
    (message "%s" (match-string 1))))

will produce "A" and "B"
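
If the lines are needed as data rather than as messages, the same loop
can collect them into a list. This is only a sketch: the function name
my-lines-before is made up, and it assumes the marker sits at the start
of its line.

```elisp
;; Hypothetical helper; the name is not from the thread.
(defun my-lines-before (marker)
  "Return a list of the lines directly followed by a line starting with MARKER.
MARKER is literal text, not a regexp."
  (let ((re (concat "^\\(.*\\)\n" (regexp-quote marker)))
        (found '()))
    (save-excursion
      (goto-char (point-min))
      ;; Group 1 is the whole line just above each marker line.
      (while (re-search-forward re nil t)
        (push (match-string-no-properties 1) found)))
    ;; push builds the list backwards, so restore buffer order.
    (nreverse found)))
```

On the five-line sample above, (my-lines-before "theq() :") returns
("A" "B"). A marker line that is immediately followed by another marker
line is not reported, because each search resumes after the previous
match.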

HTH
Ted



* Re: extract lines with regexp
  2009-05-01 12:02   ` Sebastien Le Maguer
@ 2009-05-01 16:38     ` Peter Dyballa
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Dyballa @ 2009-05-01 16:38 UTC (permalink / raw)
  To: Sebastien Le Maguer; +Cc: help-gnu-emacs


On 01.05.2009 at 14:02, Sebastien Le Maguer wrote:

> In fact I need just the line before theq ().
>
> All lines, except those which begin with "theq", follow the same
> pattern:
> ../rep1/rep2/nom_ficXXX_refXX_sentX.wav


Can't you decimate? I.e., remove all those lines you don't want and  
then save the remainder in a new file? The original file will not be  
changed.

--
Greetings

   Pete

Upgraded, adj.:
	Didn't work the first time.








* Re: extract lines with regexp
  2009-05-01 15:54   ` Ted Zlatanov
@ 2009-05-01 21:53     ` Sebastien Le Maguer
  2009-05-04 17:44     ` Raymond Wiker
  1 sibling, 0 replies; 9+ messages in thread
From: Sebastien Le Maguer @ 2009-05-01 21:53 UTC (permalink / raw)
  To: help-gnu-emacs

I was thinking about (forward-line -1) but your solution is simpler.


Peter: all the lines I want to remove follow the same pattern as the
ones I want to keep :)


Thanks all

Sébastien



* Re: extract lines with regexp
  2009-05-01 15:54   ` Ted Zlatanov
  2009-05-01 21:53     ` Sebastien Le Maguer
@ 2009-05-04 17:44     ` Raymond Wiker
  2009-05-06 11:37       ` no-toppost
  1 sibling, 1 reply; 9+ messages in thread
From: Raymond Wiker @ 2009-05-04 17:44 UTC (permalink / raw)
  To: help-gnu-emacs

Ted Zlatanov <tzz@lifelogs.com> writes:

> Anyhow, regular expressions can handle multiple lines just fine:
>
> A
> theq() :
> non
> B
> theq() :
>
> (save-excursion
>   (goto-char (point-min))
>   (while (re-search-forward "\\(.*\\)\ntheq() :" nil t)
>     (message "%s" (match-string 1))))
>
> will produce "A" and "B"

	A slightly more elaborate version (not necessarily more
correct, but with a few more functions to ponder :-)

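;;; -------------------------------------------------
(defun collect-all-before (pattern)
  "Show, in buffer *tmp*, each line that precedes a match for PATTERN.
The search runs from point to the end of the buffer."
  (interactive "sPattern: ")
  (let (ret)
    (while (re-search-forward pattern nil t)
      (save-excursion
        ;; forward-line returns 0 when it could move back a full line.
        (when (zerop (forward-line -1))
          (push (buffer-substring (point)
                                  (progn
                                    (forward-line 1)
                                    (point)))
                ret))))
    (with-output-to-temp-buffer "*tmp*"
      ;; princ writes to standard-output, which is bound to *tmp* here.
      (dolist (elt (nreverse ret))
        (princ elt)))))
;;; -------------------------------------------------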

        (push ...) and (dolist ...) come from Common Lisp; current Emacs
versions provide them as standard macros, and they can in any case be
trivially replaced with more basic Emacs-Lisp constructs.
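
For instance, the two macros could be written out with setq, cons and a
while loop. The sketch below uses a made-up function name and shows
only the list handling, not the buffer search.

```elisp
;; Hypothetical helper; the name is not from the thread.
(defun my-concat-in-order (items)
  "Return the strings in ITEMS concatenated in their original order.
Accumulates and walks the list without `push' or `dolist'."
  (let ((ret nil)
        (out ""))
    ;; (push x ret) written out by hand:
    (while items
      (setq ret (cons (car items) ret))
      (setq items (cdr items)))
    ;; (dolist (elt (nreverse ret)) ...) written out by hand:
    (let ((tail (nreverse ret)))
      (while tail
        (setq out (concat out (car tail)))
        (setq tail (cdr tail))))
    out))
```

The reverse step matters for the same reason as in the function above:
consing onto the front of a list stores the items last-in-first.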



* Re: extract lines with regexp
  2009-05-04 17:44     ` Raymond Wiker
@ 2009-05-06 11:37       ` no-toppost
  2009-05-06 13:33         ` Pascal J. Bourguignon
  0 siblings, 1 reply; 9+ messages in thread
From: no-toppost @ 2009-05-06 11:37 UTC (permalink / raw)
  To: help-gnu-emacs

someone wrote:
> Anyhow, regular expressions can handle multiple lines just fine:
>
> A
> theq() :
> non
> B
> theq() :
>
> (save-excursion
>   (goto-char (point-min))
>   (while (re-search-forward "\\(.*\\)\ntheq() :" nil t)
>     (message "%s" (match-string 1))))
>
> will produce "A" and "B"

I've had much difficulty deleting a multi-line pattern that repeats
in one or more files, which would need:
 delete all lines starting with the line containing Str1
 up to and including the line containing Str2,
 provided that   N < number-of-lines-to-delete <  M.
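
One possible reading of that requirement, as a sketch: the function
name my-delete-blocks and its argument order are made up, Str1 and Str2
are taken as literal text rather than regexps, and a block with no
closing Str2 line is left alone.

```elisp
;; Hypothetical helper; the name and argument order are not from the thread.
(defun my-delete-blocks (str1 str2 n m)
  "Delete blocks of lines that start at STR1 and end at STR2.
A block runs from a line containing STR1 through the next line
containing STR2, and is deleted only if its line count is strictly
between N and M.  Return the number of blocks deleted."
  (let ((deleted 0))
    (save-excursion
      (goto-char (point-min))
      (while (search-forward str1 nil t)
        (let ((start (line-beginning-position)))
          (if (not (search-forward str2 nil t))
              (goto-char (point-max))   ; no closing line: stop searching
            (forward-line 1)            ; include the STR2 line itself
            (let ((count (count-lines start (point))))
              (when (and (< n count) (< count m))
                (delete-region start (point))
                (setq deleted (1+ deleted))))))))
    deleted))
```

For example, (my-delete-blocks "Str1" "Str2" 2 5) removes only blocks
that are 3 or 4 lines long. Note that search-forward follows
case-fold-search, so all-lowercase strings may match case-insensitively.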

PS
 Since I've suddenly found out how to chroot /<emacs'Partition>
 I'm keen to try my [once looked at]  GNU Emacs 21.3.1 !! 
 
 I've found the powerful: ^c h i leading to a massive verbose info,
 but can't find how to enter the: edit-eval-print cycle from emacs.
 
 Please advise,
 
 == Chris Glur.
 
 
 
Although this NewsGroup still functions well, 
there are already many other previously good
NewsGroups which have been crowded out by
the twittering-idiot-masses. To avoid further
displacement of the NNTP protocol by the 
dumbed-down inefficient click-blogs, we need
to take a stand.




* Re: extract lines with regexp
  2009-05-06 11:37       ` no-toppost
@ 2009-05-06 13:33         ` Pascal J. Bourguignon
  0 siblings, 0 replies; 9+ messages in thread
From: Pascal J. Bourguignon @ 2009-05-06 13:33 UTC (permalink / raw)
  To: help-gnu-emacs

no-toppost@motz.invalid writes:
>  I've found the powerful: ^c h i leading to a massive verbose info,
>  but can't find how to enter the: edit-eval-print cycle from emacs.

Basically, in all modes (except some where it's redefined),
eval-last-sexp is bound to C-x C-e.

You may use the *scratch* buffer, put it in emacs-lisp-mode (M-x
emacs-lisp-mode RET) and edit some Emacs Lisp forms, then move the
cursor after the form you want to eval and type C-u C-x C-e to get
the result inserted at point. (Plain C-x C-e displays the result in
the echo area.)

Now, if you want a REPL for Emacs Lisp, you may launch it with M-x ielm RET
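
As a small illustration, here are a few forms one might type into
*scratch* or at the ielm prompt. The variable name my-greeting is made
up for this example.

```elisp
;; In *scratch*: put point right after a closing paren and press C-x C-e.
;; In ielm: type the form at the prompt and press RET.
(+ 1 2)                        ; evaluates to 3

;; my-greeting is a made-up variable name for this example.
(setq my-greeting "hello")     ; evaluates to "hello"

(concat my-greeting ", world") ; evaluates to "hello, world"
```

Evaluating the forms one at a time, top to bottom, gives the
edit-eval-print cycle asked about: edit a form, evaluate it, read the
result, and repeat.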


-- 
__Pascal Bourguignon__

