all messages for Emacs-related lists mirrored at yhetil.org
* Re: Counting words
       [not found] <mailman.1903.1176202249.7795.help-gnu-emacs@gnu.org>
@ 2007-04-10 11:05 ` Robert D. Crawford
  2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 9+ messages in thread
From: Robert D. Crawford @ 2007-04-10 11:05 UTC (permalink / raw)
  To: help-gnu-emacs

"Wilfred Zegwaard (privé)" <wilfred.zegwaard@home.nl> writes:

> Can someone point me to some good documentation about counting words,
> tags, etc. in Emacs?

You might have to be a little more specific.  Do you mean counting total
words in a buffer?  If so, here is how to do that:

http://www.emacswiki.org/cgi-bin/wiki/WordCount

If you are talking about counting the number of times a specific word
occurs in a buffer or the like, I am sure someone else has done that
before and will post a solution soon.

Concerning tags, can you be more specific here as well?  HTML tags?

rdc
-- 
Robert D. Crawford                                      rdc1x@comcast.net

Ginger snap.


* Re: Counting words
       [not found] <mailman.1903.1176202249.7795.help-gnu-emacs@gnu.org>
  2007-04-10 11:05 ` Counting words Robert D. Crawford
@ 2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
  2007-04-10 15:13   ` Peter Dyballa
  2007-04-10 17:37   ` "Wilfred Zegwaard (privé)"
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
  2 siblings, 2 replies; 9+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-10 11:46 UTC (permalink / raw)
  To: help-gnu-emacs

I found the WordCount page.
What I mean is counting specific instances of words, e.g. the number of
times that "the" occurs in a text. I also need to search for specific
combinations, like "the exact word", and to run a fuzzy search on such
combinations.

Not HTML tags, but specific strings that the package I use calls TAGS,
which are easily identified by a string or a combination of strings.

Wilfred



_______________________________________________
help-gnu-emacs mailing list
help-gnu-emacs@gnu.org
http://lists.gnu.org/mailman/listinfo/help-gnu-emacs



* Re: Counting words
  2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
@ 2007-04-10 15:13   ` Peter Dyballa
  2007-04-10 17:37   ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Dyballa @ 2007-04-10 15:13 UTC (permalink / raw)
  To:  Wilfred Zegwaard (privé) ; +Cc: help-gnu-emacs


On 10.04.2007 at 13:46, Wilfred Zegwaard (privé) wrote:

> I mean specific instances of words. The number of times eg that  
> "the" occurs in a text. But I need to search on specific  
> combinations, like "the exact word", but also a fuzzy search on  
> specific combinations.

You could temporarily join the whole text or region into a single line
and then split it at "the exact word", so that you end up with as many
lines as there are instances.

To count a particular word, you can convert each run of white space
into a newline, grep for exactly that word, and then count the matching
lines (for example, a pipe of tr, grep and wc run as a shell-command).
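
The pipe described above might look like the following minimal sketch.
The sample text and the variable names are invented for illustration.
Note that a word with punctuation attached (such as "the,") would not
be counted, because grep -x requires the whole line to be exactly the
word.

```shell
# Hypothetical sample text; in practice this would be the buffer or region.
text='The cat saw the dog.
Then the dog saw the cat, the end.'

# 1. tr -s: turn every run of white space into a single newline,
#    which leaves one word per line.
# 2. grep -x: keep only lines that are exactly "the" (case-sensitive,
#    so "The" and "Then" are skipped).
# 3. wc -l: count the remaining lines; the final tr strips the padding
#    that some wc implementations add.
count=$(printf '%s\n' "$text" | tr -s '[:space:]' '\n' | grep -x 'the' | wc -l | tr -d ' ')

echo "$count"
```

From inside Emacs, the same pipe can be applied to the region with
M-| (shell-command-on-region).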

--
Greetings

   Pete

A common mistake that people make when trying to design something  
completely foolproof is to underestimate the ingenuity of complete  
fools.


* Re: Counting words
  2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
  2007-04-10 15:13   ` Peter Dyballa
@ 2007-04-10 17:37   ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 9+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-10 17:37 UTC (permalink / raw)
  To: "Wilfred Zegwaard (privé)"; +Cc: help-gnu-emacs

Yep. This is what I need. Can you point me to documentation where the
appropriate functions and key bindings can be found? An entry or a link
is fine. (Not for grep. I've got that.)

Wilfred

PS: There seems to be a function in Emacs that lets me attach a specific
signature, a sort of approximate CRC, to a fuzzy search and bind it to
"the exact word". Is that available?



* Re: Counting words
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
@ 2007-04-11 19:42   ` Colin S. Miller
  2007-04-12 13:14     ` thorne
  2007-04-12 21:06   ` Fuzzy search (was: Counting words) "Wilfred Zegwaard (privé)"
  1 sibling, 1 reply; 9+ messages in thread
From: Colin S. Miller @ 2007-04-11 19:42 UTC (permalink / raw)
  To: help-gnu-emacs

Wilfred Zegwaard (privé) wrote:
> I found the wordcount thing.
> I mean specific instances of words. The number of times eg that "the" 
> occurs in a text. But I need to search on specific combinations, like 
> "the exact word", but also a fuzzy search on specific combinations.
> 
> Not HTML tags, but specific strings that this package that I use calls 
> TAGS and who are easily identifiable with a string or string combination.
> 
> Wilfred
> 
>

Wilfred,

You can use replace-regexp to do this.

Try:
M-x replace-regexp
\bthe exact phrase\b
\&


\b matches a word boundary;
\& means "replace with whatever was matched".

Start from the top of the buffer, because the replacement only runs
from point onward. This is a bit of a hack, but once replace-regexp
has finished, it should echo "Replaced xx occurrences"
in the minibuffer.
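
Outside Emacs, the same whole-phrase count can be sketched with grep.
This is only an illustration, not part of the replace-regexp recipe: it
assumes a grep that supports the non-POSIX -o option (GNU grep does),
and the sample text is invented. grep works line by line, so a phrase
that is split across a line break would be missed.

```shell
# Hypothetical sample text with two whole-phrase matches and one near miss.
text='This is the exact phrase. Not the exact phrases, but the exact phrase again.'

# -o prints each match on its own line; -w only accepts matches that
# begin and end at word boundaries, so "the exact phrases" is rejected.
phrase_count=$(printf '%s\n' "$text" | grep -o -w 'the exact phrase' | wc -l | tr -d ' ')

echo "$phrase_count"
```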

HTH,
Colin S. Miller


-- 
Replace the obvious in my email address with the first three letters of the hostname to reply.


* Re: Counting words
  2007-04-11 19:42   ` Colin S. Miller
@ 2007-04-12 13:14     ` thorne
  0 siblings, 0 replies; 9+ messages in thread
From: thorne @ 2007-04-12 13:14 UTC (permalink / raw)
  To: help-gnu-emacs

> Wilfred Zegwaard (privé) wrote:
>> I found the wordcount thing.
>> I mean specific instances of words. The number of times eg that
>> "the" occurs in a text. But I need to search on specific
>> combinations, like "the exact word", but also a fuzzy search on
>> specific combinations.

The versions of Emacs I use (22 and 23) both have an interactive
function called `how-many' (also aliased to `count-matches') that
counts the number of matches for a regexp following point in a buffer.
Is that what you are looking for?

I was using it just last night while editing a large work of fiction,
to look for possibly overused words.

,----[ C-h f how-many RET ]
| how-many is an interactive compiled Lisp function in `replace.el'.
| (how-many regexp &optional rstart rend interactive)
| 
| Print and return number of matches for regexp following point.
| When called from Lisp and interactive is omitted or nil, just return
| the number, do not print it; if interactive is t, the function behaves
| in all respects as if it had been called interactively.
| 
| If regexp contains upper case characters (excluding those preceded by `\'),
| the matching is case-sensitive.
| 
| Second and third arg rstart and rend specify the region to operate on.
| 
| Interactively, in Transient Mark mode when the mark is active, operate
| on the contents of the region.  Otherwise, operate from point to the
| end of (the accessible portion of) the buffer.
| 
| This function starts looking for the next match from the end of
| the previous match.  Hence, it ignores matches that overlap
| a previously found match.
`----
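
The last paragraph of that docstring (overlapping matches are ignored)
can be illustrated outside Emacs with grep -o, which also resumes its
search after the end of the previous match. This is an analogy under
that assumption, not a description of how `how-many' is implemented;
the sample word is invented.

```shell
# "banana" contains "ana" at two overlapping positions (b-ANA-na and ban-ANA),
# but a scan that resumes after the end of each match only finds the first.
non_overlapping=$(printf '%s\n' 'banana' | grep -o 'ana' | wc -l | tr -d ' ')

echo "$non_overlapping"
```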


-- 
þ    theron tlåx    þ
(compose-mail (concat "thorne@" (rot13 "gvzoeny") ".net"))


* Fuzzy search (was: Counting words)
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
  2007-04-11 19:42   ` Colin S. Miller
@ 2007-04-12 21:06   ` "Wilfred Zegwaard (privé)"
  2007-04-12 21:43     ` Peter Dyballa
  2007-04-12 22:20     ` "Wilfred Zegwaard (privé)"
  1 sibling, 2 replies; 9+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-12 21:06 UTC (permalink / raw)
  To: help-gnu-emacs

Both methods, replace-regexp and count-matches, are exact. That is not
quite what I want in the end. I'm looking for a kind of fuzzy search
that finds text which is nearly, but not exactly, the same (like the
fuzzy search in the R statistical software).
E.g., searching for: the hero was here
A fuzzy search of the document would also find: the hero was there

It almost matches.
That is what I'm looking for. Are there any functions in Emacs that do
the trick?

Wilfred





* Re: Fuzzy search (was: Counting words)
  2007-04-12 21:06   ` Fuzzy search (was: Counting words) "Wilfred Zegwaard (privé)"
@ 2007-04-12 21:43     ` Peter Dyballa
  2007-04-12 22:20     ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Dyballa @ 2007-04-12 21:43 UTC (permalink / raw)
  To:  Wilfred Zegwaard (privé) ; +Cc: help-gnu-emacs


On 12.04.2007 at 23:06, Wilfred Zegwaard (privé) wrote:

> Both methods, replace-regexp and count-matches, are exact. That is
> not quite what I want in the end. I'm looking for a kind of fuzzy
> search that finds text which is nearly, but not exactly, the same.

Then use a shell-command with agrep: "search a file for a string or
regular expression, with approximate matching capabilities,"
ftp://ftp.cs.arizona.edu/agrep/, http://webglimpse.net/.
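
If agrep is not at hand, the idea of approximate matching can be
sketched with plain awk. The function below is a hypothetical helper,
not agrep and not an Emacs feature: it compares each whole input line
with the pattern using Levenshtein edit distance and prints the lines
that are within a given number of single-character edits. agrep is
more capable (it can find approximate matches inside a line), so treat
this only as an illustration of the principle.

```shell
# fuzzy_lines PATTERN MAX: print stdin lines whose edit distance to PATTERN
# is at most MAX. Hypothetical helper name; whole-line comparison only.
# Note: awk -v interprets backslash escapes in PATTERN.
fuzzy_lines() {
  awk -v pat="$1" -v max="$2" '
    function min3(a, b, c) { if (b < a) a = b; if (c < a) a = c; return a }
    # Classic dynamic-programming Levenshtein distance between s and t.
    function lev(s, t,    i, j, m, n, cost, d) {
      m = length(s); n = length(t)
      for (i = 0; i <= m; i++) d[i, 0] = i
      for (j = 0; j <= n; j++) d[0, j] = j
      for (i = 1; i <= m; i++)
        for (j = 1; j <= n; j++) {
          cost = (substr(s, i, 1) == substr(t, j, 1)) ? 0 : 1
          d[i, j] = min3(d[i-1, j] + 1, d[i, j-1] + 1, d[i-1, j-1] + cost)
        }
      return d[m, n]
    }
    lev(pat, $0) <= max { print }
  '
}

# Invented sample lines, using the example from this thread.
matches=$(printf '%s\n' 'the hero was there' 'a villain appeared' 'the hero was here' \
  | fuzzy_lines 'the hero was here' 1)

printf '%s\n' "$matches"
```

With a maximum distance of 1, both "the hero was there" (one inserted
letter) and the exact line are printed, while the unrelated line is
not.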

--
Greetings

   Pete

Globalisation -- communism from above.


* Re: Fuzzy search (was: Counting words)
  2007-04-12 21:06   ` Fuzzy search (was: Counting words) "Wilfred Zegwaard (privé)"
  2007-04-12 21:43     ` Peter Dyballa
@ 2007-04-12 22:20     ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 9+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-12 22:20 UTC (permalink / raw)
  To: "Wilfred Zegwaard (privé)"; +Cc: help-gnu-emacs

Nice.
Does anyone know of a link to approximate-matching capabilities based on
sound, i.e. on the way a word is spoken?

Wilfred


