unofficial mirror of help-gnu-emacs@gnu.org
* Way to screen out non-ASCII characters?
@ 2003-08-14  0:14 Edward Dodge
  2003-08-14 16:14 ` roodwriter
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Edward Dodge @ 2003-08-14  0:14 UTC (permalink / raw)



Does anyone know if there is a mode or something that you can use in
EMACS to screen out everything that *isn't* ASCII text?  I am
currently trying to get some old documents converted to a plain-old
ASCII text file,  and I don't have access to the original program to
"save as text,"  nor do I want to write a script for this.

I would assume that because EMACS can distinguish between different
character-sets, maybe there is a way for EMACS to strain out the
other garbage for me.  

-- 
Edward Dodge

/GNU Emacs 21.3.50.1 (powerpc-apple-darwin5.5) of 2002-10-11 on G3/


* Re: Way to screen out non-ASCII characters?
From: roodwriter @ 2003-08-14 16:14 UTC (permalink / raw)


Edward Dodge wrote:

> 
> Does anyone know if there is a mode or something that you can use in
> EMACS to screen out everything that *isn't* ASCII text?  I am
> currently trying to get some old documents converted to a plain-old
> ASCII text file,  and I don't have access to the original program to
> "save as text,"  nor do I want to write a script for this.
> 
> I would assume that because EMACS can distinguish between different
> character-sets, maybe there is a way for EMACS to strain out the
> other garbage for me.
> 

It's not Emacs, but if you're using Linux, you can use the strings command,
which prints the runs of printable characters it finds in a file. You can
launch it from the Emacs shell if you prefer. See "man strings" for details.

If those are old Word files, there's antiword and a few others that will
extract the text.

--Rod

-- 
Author of "Linux for Non-Geeks--Clear-eyed Answered for Practical Consumers" 
and "Boring Stories from Uncle Rod." Both are available at 
http://www.rodwriterpublishing.com/index.html

To reply by e-mail, take the extra "o" out of the name.


* Re: Way to screen out non-ASCII characters?
From: Joshua D. Guttman @ 2003-08-14 23:32 UTC (permalink / raw)
  Cc: Joshua D. Guttman

[-- Attachment #1: Type: text/plain, Size: 498 bytes --]

Edward Dodge <someone@g3.com> writes:

> Does anyone know if there is a mode or something that you can use in
> EMACS to screen out everything that *isn't* ASCII text?  I am
> currently trying to get some old documents converted to a plain-old
> ASCII text file,  and I don't have access to the original program to
> "save as text,"  nor do I want to write a script for this.
> 

If the documents are in MS Word format or something like that, then
you can use the undoc program, attached below.  


[-- Attachment #2: undoc.el -- Strip MS word formatting --]
[-- Type: application/emacs-lisp, Size: 7986 bytes --]

[-- Attachment #3: Type: text/plain, Size: 186 bytes --]



-- 
	Joshua D. Guttman		<guttman@mitre.org>
	MITRE, Mail Stop S119		Office:	+1 781 271 2654
	202 Burlington Rd.		Fax:	+1 781 271 8953
	Bedford, MA 01730-1420 USA	Cell:	+1 781 526 5713


* Re: Way to screen out non-ASCII characters?
From: Kai Großjohann @ 2003-08-17 17:00 UTC (permalink / raw)


Edward Dodge <someone@g3.com> writes:

> Does anyone know if there is a mode or something that you can use in
> EMACS to screen out everything that *isn't* ASCII text?  I am
> currently trying to get some old documents converted to a plain-old
> ASCII text file,  and I don't have access to the original program to
> "save as text,"  nor do I want to write a script for this.

I guess that MacOS X has the "strings" command.

strings -a old-document.doc > plain-old.txt
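
If strings turns out to drop too much (by default it only prints runs of
at least four printable characters), a rougher alternative is to delete
every byte that is not printable ASCII, tab, newline, or carriage return.
The following is only a sketch, not something anyone in this thread
tested: the function name strip_non_ascii is made up for illustration,
and it assumes a POSIX-style tr run in the C locale so that bytes are
handled one at a time.

```shell
# Hypothetical helper: copy stdin to stdout, keeping only tab (\11),
# newline (\12), carriage return (\15), and printable ASCII
# (space \40 through tilde \176). Every other byte is deleted.
strip_non_ascii() {
    LC_ALL=C tr -cd '\11\12\15\40-\176'
}
```

Usage would look like "strip_non_ascii < old-document.doc > plain-old.txt".
Unlike strings, this keeps short accidental runs of printable bytes and
does not insert line breaks, so the result may still need cleaning up by
hand.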
-- 
Two cafe au lait please, but without milk.
