all messages for Emacs-related lists mirrored at yhetil.org
* Counting words
@ 2003-11-30 18:54 David Sumbler
  2003-11-30 19:03 ` Jesper Harder
  0 siblings, 1 reply; 16+ messages in thread
From: David Sumbler @ 2003-11-30 18:54 UTC (permalink / raw)


How do I do a word count in Emacs?

I'm sure there must be a simple way, but using the command apropos and
looking in reference indexes hasn't found me an answer.

David

-- 

David Sumbler

Please reply to the newsgroup.

However, if you _really_ want to send me an e-mail,
replace "nospam" in my address with "aeolia".


* Re: Counting words
  2003-11-30 18:54 David Sumbler
@ 2003-11-30 19:03 ` Jesper Harder
  2003-11-30 20:03   ` David Sumbler
  2003-12-01  1:55   ` Roodwriter
  0 siblings, 2 replies; 16+ messages in thread
From: Jesper Harder @ 2003-11-30 19:03 UTC (permalink / raw)


David Sumbler <david@nospam.co.uk> writes:

> How do I do a word count in Emacs?

`M-| wc -w'
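
For readers new to it: M-| runs shell-command-on-region, which sends the text of the region to the command's standard input; mark the whole buffer first with C-x h to count everything. A minimal sketch of what the command then sees, run outside Emacs (the sample sentence and variable names are made up for illustration):

```shell
# Stand-in for the text Emacs would pipe in from the region (made-up sample).
sample='How do I do a word count in Emacs?'
# wc -w counts whitespace-separated words read from standard input;
# tr strips the padding that some wc implementations print around the number.
count=$(printf '%s\n' "$sample" | wc -w | tr -d '[:space:]')
echo "$count"
```

Note that wc splits on whitespace only, so "Emacs?" counts as one word and so does "two-story".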


* Re: Counting words
  2003-11-30 19:03 ` Jesper Harder
@ 2003-11-30 20:03   ` David Sumbler
  2003-11-30 23:51     ` Matthias Mees
  2003-12-01  1:55   ` Roodwriter
  1 sibling, 1 reply; 16+ messages in thread
From: David Sumbler @ 2003-11-30 20:03 UTC (permalink / raw)


On Sun, 30 Nov 2003, Jesper Harder wrote:

> David Sumbler <david@nospam.co.uk> writes:
>
> > How do I do a word count in Emacs?
>
> `M-| wc -w'

Brilliant!

I hadn't thought of using a shell command, and I wasn't really
familiar with M-! and M-| anyway, so it has been a very useful lesson.

Thanks for your help.

David

-- 

David Sumbler

Please reply to the newsgroup.

However, if you _really_ want to send me an e-mail,
replace "nospam" in my address with "aeolia".


* Re: Counting words
       [not found] <E1AQYTg-0008GI-SU@monty-python.gnu.org>
@ 2003-11-30 20:55 ` Masashi Ito
  0 siblings, 0 replies; 16+ messages in thread
From: Masashi Ito @ 2003-11-30 20:55 UTC (permalink / raw)


Hi,

In case you would like to count words in the specified region rather than
the whole buffer, the following lisp works. I am using it, putting it in my
.emacs file, and binding the function to C-c c. I found the lisp definition
at:

http://olympus.het.brown.edu/cgi-bin/info2www?(emacs-lisp-intro)Counting+Words

Also, to have, say, "two-story" counted as one word rather than two words,
see:

http://olympus.het.brown.edu/cgi-bin/info2www?(emacs-lisp-intro)Syntax

Best,

Masashi

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Final version: `while'
(defun count-words-region (beginning end)
  "Print number of words in the region."
  (interactive "r")
  (message "Counting words in region ... ")

;;; 1. Set up appropriate conditions.
  (save-excursion
    (let ((count 0))
      (goto-char beginning)

;;; 2. Run the while loop.
      (while (and (< (point) end)
;;; original
;;                  (re-search-forward "\\w+\\W*" end t))
;;; but, to count words joined by a hyphen (or hyphens) as one word
        (re-search-forward "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" end t))
        (setq count (1+ count)))

;;; 3. Send a message to the user.
      (cond ((zerop count)
             (message
              "The region does NOT have any words."))
            ((= 1 count)
             (message
              "The region has 1 word."))
            (t
             (message
              "The region has %d words." count))))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


* Re: Counting words
  2003-11-30 20:03   ` David Sumbler
@ 2003-11-30 23:51     ` Matthias Mees
  0 siblings, 0 replies; 16+ messages in thread
From: Matthias Mees @ 2003-11-30 23:51 UTC (permalink / raw)


David Sumbler <david@nospam.co.uk> wrote:

> On Sun, 30 Nov 2003, Jesper Harder wrote:
>
>> David Sumbler <david@nospam.co.uk> writes:
>>
>>> How do I do a word count in Emacs?
>>
>> `M-| wc -w'
>
> Brilliant!

Just in case this is too much typing for you (and JFTR):
<URL:http://members.a1.net/t.link/textstats.el.gz>

Matthias


* Re: Counting words
  2003-11-30 19:03 ` Jesper Harder
  2003-11-30 20:03   ` David Sumbler
@ 2003-12-01  1:55   ` Roodwriter
  1 sibling, 0 replies; 16+ messages in thread
From: Roodwriter @ 2003-12-01  1:55 UTC (permalink / raw)


Jesper Harder wrote:

> David Sumbler <david@nospam.co.uk> writes:
> 
>> How do I do a word count in Emacs?
> 
> `M-| wc -w'


That is an elegant solution. This after I slipped some Lisp code I found on 
the web into my .emacs file.

It's coming out now.

I need to think a little more about the M-| command.

Thanks.

--Rod

__________

Author of "Linux for Non-Geeks--Clear-eyed Answered for Practical Consumers" 
and "Boring Stories from Uncle Rod." Both are available at 
http://www.rodwriterpublishing.com/index.html

To reply by e-mail, take the extra "o" out of the name.


* Counting words
@ 2007-04-10 10:46 "Wilfred Zegwaard (privé)"
  0 siblings, 0 replies; 16+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-10 10:46 UTC (permalink / raw)
  To: help-gnu-emacs

Hello,

Can someone point me to some good documentation about counting words, 
tags, etc. in Emacs?

Thanks,

Wilfred


* Re: Counting words
       [not found] <mailman.1903.1176202249.7795.help-gnu-emacs@gnu.org>
@ 2007-04-10 11:05 ` Robert D. Crawford
  2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 16+ messages in thread
From: Robert D. Crawford @ 2007-04-10 11:05 UTC (permalink / raw)
  To: help-gnu-emacs

"Wilfred Zegwaard (privé)" <wilfred.zegwaard@home.nl> writes:

> Can someone point me out some good documentation about counting words,
> tags, etc in EMacs?

You might have to be a little more specific.  Do you mean counting total
words in a buffer?  If so, here is how to do that:

http://www.emacswiki.org/cgi-bin/wiki/WordCount

If you are talking about counting the number of times a specific word
occurs in a buffer or the like, I am sure someone else has done that
before and will post a solution soon.

Concerning tags, can you be more specific here as well?  HTML tags?

rdc
-- 
Robert D. Crawford                                      rdc1x@comcast.net

Ginger snap.


* Re: Counting words
       [not found] <mailman.1903.1176202249.7795.help-gnu-emacs@gnu.org>
  2007-04-10 11:05 ` Counting words Robert D. Crawford
@ 2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
  2007-04-10 15:13   ` Peter Dyballa
  2007-04-10 17:37   ` "Wilfred Zegwaard (privé)"
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
  2 siblings, 2 replies; 16+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-10 11:46 UTC (permalink / raw)
  To: help-gnu-emacs

I found the word-count page.
What I mean is counting specific instances of words, e.g. the number of 
times that "the" occurs in a text. But I need to search for specific 
combinations, like "the exact word", and also do a fuzzy search for 
specific combinations.

Not HTML tags, but specific strings that the package I use calls 
TAGS and which are easily identifiable by a string or string combination.

Wilfred

* Re: Counting words
  2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
@ 2007-04-10 15:13   ` Peter Dyballa
  2007-04-10 17:37   ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 16+ messages in thread
From: Peter Dyballa @ 2007-04-10 15:13 UTC (permalink / raw)
  To:  Wilfred Zegwaard (privé) ; +Cc: help-gnu-emacs


Am 10.04.2007 um 13:46 schrieb Wilfred Zegwaard (privé):

> I mean specific instances of words. The number of times eg that  
> "the" occurs in a text. But I need to search on specific  
> combinations, like "the exact word", but also a fuzzy search on  
> specific combinations.

You might temporarily join the whole text or region into one line and  
then split it at "the exact word", so that you get as many lines as  
there are instances.

For counting a particular word you can convert each instance of white  
space into a newline, grep for exactly this particular word, and then  
count (a pipe of tr, grep, wc as shell-command for example).
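
A minimal sketch of such a pipe, assuming a POSIX shell; the sample sentence and the variable names are invented for illustration. From Emacs the same pipeline could be given to M-| with the region as input.

```shell
# Sample text standing in for the region or buffer (made up for illustration).
text='The hero saw the villain; the end of the story.'
word='the'
# 1. tr -s turns each run of whitespace into a single newline: one token per line.
# 2. grep -x keeps only lines that consist of exactly the word.
# 3. wc -l counts the lines that are left.
hits=$(printf '%s\n' "$text" | tr -s '[:space:]' '\n' | grep -x -- "$word" | wc -l | tr -d '[:space:]')
echo "$hits"
```

The match is case-sensitive, so the leading "The" is not counted; grep -i would include it. A token with punctuation attached, such as "the," would also be missed unless the punctuation is stripped first.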

--
Greetings

   Pete

A common mistake that people make when trying to design something  
completely foolproof is to underestimate the ingenuity of complete  
fools.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Counting words
  2007-04-10 11:46 ` "Wilfred Zegwaard (privé)"
  2007-04-10 15:13   ` Peter Dyballa
@ 2007-04-10 17:37   ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 16+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-10 17:37 UTC (permalink / raw)
  To: "Wilfred Zegwaard (privé)"; +Cc: help-gnu-emacs

Yep. This is what I need. Can you point me to documentation where the 
appropriate functions and key bindings can be found? An entry or link is 
OK. (Not grep. I've got that.)

Wilfred

PS: There seems to be a function in Emacs where I can attach a specific 
signature, a sort of approximate CRC, to a fuzzy search, bound to "the 
exact word". Is that available?



* Re: Counting words
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
@ 2007-04-11 19:42   ` Colin S. Miller
  2007-04-12 13:14     ` thorne
  2007-04-12 21:06   ` Fuzzy search (was: Counting words) "Wilfred Zegwaard (privé)"
  1 sibling, 1 reply; 16+ messages in thread
From: Colin S. Miller @ 2007-04-11 19:42 UTC (permalink / raw)
  To: help-gnu-emacs

Wilfred Zegwaard (privé) wrote:
> I found the wordcount thing.
> I mean specific instances of words. The number of times eg that "the" 
> occurs in a text. But I need to search on specific combinations, like 
> "the exact word", but also a fuzzy search on specific combinations.
> 
> Not HTML tags, but specific strings that this package that I use calls 
> TAGS and who are easily identifiable with a string or string combination.
> 
> Wilfred
> 
>

Wilfred,

You can use replace-regexp to do this.

Try
M-x replace-regexp
\bthe exact phrase\b
\&


\b means word-boundary,
\& means replace with what was found.

This is a bit nasty, but after replace-regexp has
finished, it should echo "Replaced xx occurrences"
in the echo area.

HTH,
Colin S. Miller


-- 
Replace the obvious in my email address with the first three letters of the hostname to reply.


* Re: Counting words
  2007-04-11 19:42   ` Colin S. Miller
@ 2007-04-12 13:14     ` thorne
  0 siblings, 0 replies; 16+ messages in thread
From: thorne @ 2007-04-12 13:14 UTC (permalink / raw)
  To: help-gnu-emacs

> Wilfred Zegwaard (privé) wrote:
>> I found the wordcount thing.
>> I mean specific instances of words. The number of times eg that
>> "the" occurs in a text. But I need to search on specific
>> combinations, like "the exact word", but also a fuzzy search on
>> specific combinations.

The versions of Emacs I use (22 and 23) both have an interactive
function called `how-many' (also aliased to `count-matches') that
counts the number of matches for a regexp in a buffer.  Is that what
you are looking for?

I was just using it last night editing a large fiction work to look
for possibly overused words.

,----[ C-h f how-many RET ]
| how-many is an interactive compiled Lisp function in `replace.el'.
| (how-many regexp &optional rstart rend interactive)
| 
| Print and return number of matches for regexp following point.
| When called from Lisp and interactive is omitted or nil, just return
| the number, do not print it; if interactive is t, the function behaves
| in all respects as if it had been called interactively.
| 
| If regexp contains upper case characters (excluding those preceded by `\'),
| the matching is case-sensitive.
| 
| Second and third arg rstart and rend specify the region to operate on.
| 
| Interactively, in Transient Mark mode when the mark is active, operate
| on the contents of the region.  Otherwise, operate from point to the
| end of (the accessible portion of) the buffer.
| 
| This function starts looking for the next match from the end of
| the previous match.  Hence, it ignores matches that overlap
| a previously found match.
`----
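
The last paragraph of that docstring, about ignoring overlapping matches, is easy to see with a tiny experiment. The sketch below is a shell analogue using grep, not Emacs itself; the sample string and variable names are made up, and grep -o is a common GNU/BSD extension rather than POSIX.

```shell
# Shell analogue only: grep -o prints each match on its own line,
# so wc -l gives the number of matches.
text='aaaa'
# 'aa' could match at offsets 0, 1 and 2, but scanning resumes after the
# end of each match, so only the non-overlapping matches at 0 and 2 count.
matches=$(printf '%s\n' "$text" | grep -o 'aa' | wc -l | tr -d '[:space:]')
echo "$matches"
```

So a count of matches for a phrase is a count of non-overlapping occurrences, which is normally what you want for words.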


-- 
þ    theron tlåx    þ
(compose-mail (concat "thorne@" (rot13 "gvzoeny") ".net"))


* Fuzzy search (was: Counting words)
       [not found] ` <mailman.1904.1176205848.7795.help-gnu-emacs@gnu.org>
  2007-04-11 19:42   ` Colin S. Miller
@ 2007-04-12 21:06   ` "Wilfred Zegwaard (privé)"
  2007-04-12 21:43     ` Peter Dyballa
  2007-04-12 22:20     ` "Wilfred Zegwaard (privé)"
  1 sibling, 2 replies; 16+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-12 21:06 UTC (permalink / raw)
  To: help-gnu-emacs

Both methods, replace-regexp and count-matches, are exact. That is not 
exactly what I want in the end. I'm looking for a kind of fuzzy search 
that finds words which are nearly exact (like the fuzzy search in the R 
statistical package).
E.g.: the hero was here
Fuzzy search in the document, and it finds: the hero was there

It almost matches.
This is what I'm looking for. Are there any functions in Emacs which do the trick?

Wilfred





* Re: Fuzzy search (was: Counting words)
  2007-04-12 21:06   ` Fuzzy search (was: Counting words) "Wilfred Zegwaard (privé)"
@ 2007-04-12 21:43     ` Peter Dyballa
  2007-04-12 22:20     ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 16+ messages in thread
From: Peter Dyballa @ 2007-04-12 21:43 UTC (permalink / raw)
  To:  Wilfred Zegwaard (privé) ; +Cc: help-gnu-emacs


Am 12.04.2007 um 23:06 schrieb Wilfred Zegwaard (privé):

> Both methodes replace-regexp en count-matches are exact. That is  
> not exactly what I want in the end. I'm looking for a type of fuzzy  
> search with words which nearly exact.

Then use a shell-command with agrep: "search a file for a string or  
regular expression, with approximate matching capabilities,"  
ftp://ftp.cs.arizona.edu/agrep/, http://webglimpse.net/.

--
Greetings

   Pete

Globalisation -- communism from above.


* Re: Fuzzy search (was: Counting words)
  2007-04-12 21:06   ` Fuzzy search (was: Counting words) "Wilfred Zegwaard (privé)"
  2007-04-12 21:43     ` Peter Dyballa
@ 2007-04-12 22:20     ` "Wilfred Zegwaard (privé)"
  1 sibling, 0 replies; 16+ messages in thread
From: "Wilfred Zegwaard (privé)" @ 2007-04-12 22:20 UTC (permalink / raw)
  To: "Wilfred Zegwaard (privé)"; +Cc: help-gnu-emacs

Nice.
Does anyone know of a link to approximate-matching capabilities based on 
sound (the way the word is spoken)?

Wilfred


