all messages for Emacs-related lists mirrored at yhetil.org
* MS Word mode?
From: Tiarnan @ 2002-11-07 18:01 UTC


Hi--

Does anyone know if there is an Emacs mode to read MS Word documents
(sent by colleagues) as ASCII? I'm thinking of something along the
lines of antiword, which produces text from MS Word (and even keeps
tables and so on).

Cheers

Tiarnan


-- 
Tiarnán Ó Corráin		CMG-WDSC
Sysadmin 			Cork.
tiarnan.o'corrain@cmg.com	+353-21-4933200

* Re: MS Word mode?
From: Roger Mason @ 2002-11-07 18:48 UTC
  Cc: help-gnu-emacs

Hello,

There was a question about this recently on this forum.  Look for
undoc.el; I got it from the wiki (I think).  It has worked very well for
me to date, although I have not attempted to read complex documents.

Roger Mason

On 7 Nov 2002, Tiarnan wrote:

> Hi--
> 
> Does anyone know if there is an Emacs mode to read MS Word documents
> (sent by colleagues) as ASCII? I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).
> 
> Cheers
> 
> Tiarnan

* Re: MS Word mode?
From: Bruce Mobarry @ 2002-11-07 18:59 UTC


Tiarnan <tiarnan.ocorrain@cmg.com> writes:

> Does anyone know if there is an Emacs mode to read MS Word documents
> (sent by colleagues) as ASCII? I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).

There's catdoc.el (an Emacs interface to catdoc), which I've used in the
past. It doesn't support later versions of MS Word (97/2000) well,
though. Now I use wv to translate MS Word and Excel files. There is a
helper script (wvMime), included with the program, that you can call
through Emacs (and Gnus, VM, etc.) to present the formatted documents as
PostScript, or you can use wvText to strip the formatting to get
ASCII. If I knew how to write Lisp, I'd like to write an elisp
interface for wv.
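
A minimal sketch of what such an elisp interface might look like. It
assumes wvText is on the PATH and is called as "wvText INPUT OUTPUT";
the name my-wv-text is hypothetical, and the scratch-directory dance
simply mirrors running the tool from inside a temporary directory.

```elisp
;; Sketch only: `my-wv-text' is a made-up name, and the calling
;; convention "wvText INPUT OUTPUT" is an assumption about the tool.
(defun my-wv-text ()
  "Replace the visited Word file's buffer contents with wvText output."
  (interactive)
  (unless buffer-file-name
    (error "Buffer is not visiting a file"))
  (unless (executable-find "wvText")
    (error "wvText not found on PATH"))
  (let* ((source (expand-file-name buffer-file-name))
         ;; Work inside a private scratch directory with short names.
         (dir (make-temp-file "wvtext" t))
         (default-directory (file-name-as-directory dir)))
    (unwind-protect
        (progn
          (copy-file source (expand-file-name "in.doc" dir))
          (if (not (eq 0 (call-process "wvText" nil nil nil
                                       "in.doc" "out.txt")))
              (message "wvText failed; buffer left unchanged")
            (let ((inhibit-read-only t))
              (erase-buffer)
              (insert-file-contents (expand-file-name "out.txt" dir)))
            ;; The text no longer matches the file on disk.
            (set-buffer-modified-p nil)
            (setq buffer-read-only t)))
      ;; Always clean up the scratch directory.
      (delete-directory dir t))))
```

It could be tried with M-x my-wv-text in a buffer visiting a .doc
file.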

-- 
Bruce Mobarry

* Re: MS Word mode?
From: Alex Schroeder @ 2002-11-07 23:40 UTC


Roger Mason <rmason@sparky2.esd.mun.ca> writes:

> There was a question about this recently on this forum.  Look for
> undoc.el; I got it from the wiki (I think).  It has worked very well for
> me to date, although I have not attempted to read complex documents.

Well, it makes things readable, but it is far from perfect -- it seems
to just delete any non-ASCII characters, so that sometimes you will
see words such as "Alex8" where "8" is some garbage that just happened
to look like part of a real word...  In other words, interfacing to
something like catdoc, antiword, or wvText (included with AbiWord)
might be cool.  Actually all you need is this:

(add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))

(defun no-word ()
  "Run antiword on the entire buffer."
  (shell-command-on-region (point-min) (point-max) "antiword - " t t))
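
A slightly more defensive variant, as a sketch only: it skips the
conversion when antiword is not installed and marks the converted
buffer read-only so the text is not saved over the original file by
accident. The name my-no-word is hypothetical.

```elisp
;; Sketch only: `my-no-word' is a made-up name for illustration.
(defun my-no-word ()
  "Show the visited Word file as text via antiword, if it is installed."
  (if (not (executable-find "antiword"))
      (message "antiword not found; showing the raw file")
    ;; The buffer may already be read-only if the file is.
    (let ((inhibit-read-only t))
      (shell-command-on-region (point-min) (point-max) "antiword - " t t))
    ;; The buffer no longer matches the file on disk, so guard against
    ;; accidentally saving the converted text over the original.
    (set-buffer-modified-p nil)
    (setq buffer-read-only t)))
```

It would be registered the same way, with my-no-word in place of
no-word in the auto-mode-alist entry.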

Alex.

* Re: MS Word mode?
From: Julien Avarre @ 2002-11-08  6:18 UTC


Tiarnan <tiarnan.ocorrain@cmg.com> on 11/07/02 has said :

[...]

> Does anyone know if there is an Emacs mode to read MS Word documents
> (sent by colleagues) as ASCII? I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).

A friend of mine does:

                       C-u M-! strings toto.doc

That's a bit ugly ;-), but it works when the Word document is simple...
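
The same trick wrapped in a command, as a rough sketch; it assumes the
"strings" program is installed, and my-insert-strings is a
hypothetical name.

```elisp
;; Sketch only: `my-insert-strings' is a made-up name.
(defun my-insert-strings (file)
  "Insert the output of the `strings' program run on FILE at point."
  (interactive "fFile: ")
  ;; A destination of t sends the program's output to the current buffer.
  (call-process "strings" nil t nil (expand-file-name file)))
```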

-- 
Julien ``Eole'' Avarre
julien@avarre.com

* Re: MS Word mode?
From: Thomas Link @ 2002-11-08  9:15 UTC


> Does anyone know if there is an Emacs mode to read MS Word documents
> (sent by colleagues) as ASCII? I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).

Just a small note and self-advertisement: filesets.el uses antiword to
display (nothing more) "*.doc" files in an Emacs buffer. With filesets
properly configured and antiword or a similar program installed,
the command "filesets-find-or-display-file" does the job.

Cheers,
Thomas.

* Re: MS Word mode?
From: Christian Lemburg @ 2002-11-08 10:48 UTC


Alex Schroeder <alex@emacswiki.org> writes:

> Roger Mason <rmason@sparky2.esd.mun.ca> writes:
> 
> > There was a question about this recently on this forum.  Look for
> > undoc.el; I got it from the wiki (I think).  It has worked very well for
> > me to date, although I have not attempted to read complex documents.
> 
> Well, it makes things readable, but it is far from perfect -- it seems
> to just delete any non-ASCII characters, so that sometimes you will
> see words such as "Alex8" where "8" is some garbage that just happened
> to look like part of a real word...  In other words, interfacing to
> something like catdoc, antiword, or wvText (included with AbiWord)
> might be cool.  Actually all you need is this:
> 
> (add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))
> 
> (defun no-word ()
>   "Run antiword on the entire buffer."
>   (shell-command-on-region (point-min) (point-max) "antiword - " t t))
> 
> Alex.

Yup, works for me:

- installed wvWare
- found some emacs code for using wvText within Gnus at
http://216.239.37.100/search?q=cache:RW5fo8yVQSgC:www.rhodesmill.org/brandon/notes/emacs.txt+using+wvText+emacs&hl=en&ie=UTF-8
- used code above to modify auto-mode-alist

Works smoothly in dired-mode and Gnus ...

Of course, one could play the same trick with wvHtml and use an Emacs
browser to view the resulting HTML ... hm ... I think I have to get
this wmf2png business working ...

For your convenience, here are the assorted code snippets (Disclaimer:
all just stolen and pieced together, none of this is mine ...):


----------------
bin/tempfile
----------------

perl -MPOSIX -e 'print tmpnam()'


----------------
bin/wvTextStdin:
----------------

#!/bin/bash
# Allow wvText to read from the standard input.
# thanks to brandon from rhodesmill.org
# Use only the base name of a fresh temporary file name under /tmp.
f=$(basename "$(tempfile)")
cat "$@" > "/tmp/$f.doc"
cd /tmp || exit 1
wvText "$f.doc" "$f.txt"
cat "$f.txt"
rm -f "$f.doc" "$f.txt"


----------------
emacs/my-mime-types.el
----------------

;; thanks to brandon from rhodesmill.org

(defun mm-inline-msword (handle)
  "Insert the MS Word part HANDLE inline as text converted by wvTextStdin."
  (let (text)
    (with-temp-buffer
      (mm-insert-part handle)
      (call-process-region (point-min) (point-max) "wvTextStdin" t t nil)
      (setq text (buffer-string)))
    (mm-insert-inline handle text))) 

(setq mm-automatic-display
      (append mm-automatic-display
              '("application/msword"))) 

(setq mm-inlined-types
      (append mm-inlined-types
              '("application/msword" "application/octet-stream"))) 

(setq mm-inline-media-tests
      (append mm-inline-media-tests
              '(("application/msword" mm-inline-msword identity))
              '(("application/octet-stream" mm-inline-msword
                 (lambda (handle)
                   ;; Only treat generic attachments as Word files when
                   ;; the attachment name ends in ".doc".
                   (let* ((type (mm-handle-type handle))
                          (name-pair (assq 'name type))
                          (name (cdr name-pair)))
                     (and name (string-suffix-p ".doc" name))))))))


----------------
emacs/my-automodes.el
----------------

;;; automodes

(setq auto-mode-alist
      (append 
       '(
	 ("\\.\\([pP][Llm]\\|al\\)$" . cperl-mode) 
	 ("\\.\\([xX][sS][dD]\\)$" . xml-mode) 
	 ("\\.\\([xX][mM][lL]\\)$" . xml-mode) 
	 ("\\.[jJ][sS]$" . javascript-mode)
	 ("\\.[pP][hH][pP]$" . php-mode)
         ("\\.doc\\'" . my-word-converter)
	 )
       auto-mode-alist))

;; thanks to Alex Schroeder

(defun my-word-converter ()
  "Run wvTextStdin on the entire buffer."
  (shell-command-on-region (point-min) (point-max) "wvTextStdin" t t))


-- 
Christian Lemburg, <lemburg@aixonix.de>, http://www.clemburg.com/
 43rd Law of Computing:
 	Anything that can go wr
 fortune: Segmentation violation -- Core dumped

* Re: MS Word mode?
From: Tiarnan @ 2002-11-08 11:24 UTC


>>>>> "AS" == Alex Schroeder <alex@emacswiki.org> writes:

    AS> (add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))

    AS> (defun no-word () "Run antiword on the entire buffer."
    AS> (shell-command-on-region (point-min) (point-max) "antiword - "
    AS> t t))

Perfect. Just what I was looking for, since antiword makes a
reasonable stab at doing tables.

Many thanks...

Tiarnan

* Re: MS Word mode?
From: Arnaldo Mandel @ 2002-11-08 13:52 UTC


Alex Schroeder wrote (on Nov 8, 2002):

 >                 Actually all you need is this:
 > 
 > (add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))
 > 
 > (defun no-word ()
 >   "Run antiword on the entire buffer."
 >   (shell-command-on-region (point-min) (point-max) "antiword - " t t))

On my system there are lots of filenames ending in .doc whose files
are not Word files.  So I modified your function thusly

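(defun no-word ()
  "Run antiword on the buffer if `file' reports a Microsoft document."
  (if (string-match "Microsoft "
		    (shell-command-to-string
		     ;; Quote the name so spaces in it do not break the command.
		     (concat "file " (shell-quote-argument buffer-file-name))))
      (shell-command-on-region (point-min) (point-max) "antiword - " t t)))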

Works in Solaris and Linux, and should work on other unixes as well.
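
The same check can be sketched without calling "file" at all, under
the assumption that the documents are OLE2 compound files (as Word
97/2000 files are), which begin with the bytes D0 CF 11 E0 A1 B1 1A E1.
Other OLE2 files such as Excel sheets share that signature, so this is
a coarser test than the "Microsoft " match. All the my- names are
hypothetical.

```elisp
;; Sketch only: the `my-' names are made up for illustration.
(defconst my-ole2-magic
  (unibyte-string #xD0 #xCF #x11 #xE0 #xA1 #xB1 #x1A #xE1)
  "First eight bytes of an OLE2 compound file.")

(defun my-ole2-file-p (file)
  "Return non-nil if FILE starts with the OLE2 compound-file signature."
  (with-temp-buffer
    ;; Compare raw bytes, so keep the scratch buffer unibyte.
    (set-buffer-multibyte nil)
    (insert-file-contents-literally file nil 0 8)
    (string= (buffer-string) my-ole2-magic)))

(defun my-no-word-checked ()
  "Run antiword on the buffer only if the visited file looks like OLE2."
  (when (and buffer-file-name (my-ole2-file-p buffer-file-name))
    (shell-command-on-region (point-min) (point-max) "antiword - " t t)))
```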

am

* Re: MS Word mode?
From: Alex Schroeder @ 2002-11-08 17:24 UTC


Arnaldo Mandel <am@ime.usp.br> writes:

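> (defun no-word ()
>   "Run antiword on the buffer if `file' reports a Microsoft document."
>   (if (string-match "Microsoft "
> 		    (shell-command-to-string
> 		     ;; Quote the name so spaces in it do not break the command.
> 		     (concat "file " (shell-quote-argument buffer-file-name))))
>       (shell-command-on-region (point-min) (point-max) "antiword - " t t)))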

Cool.  I did not know about "file"...  :)

My stuff is on the wiki, btw:

* http://www.emacswiki.org/cgi-bin/wiki.pl?AntiWord

Alex.

* Re: MS Word mode?
From: Kin Cho @ 2002-11-09 18:40 UTC


I recommend running "!antiword -p letter *" on the doc file in
dired; antiword's PostScript output is a bit better than its text output.
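
Outside dired, the same conversion could be sketched as a small
command that writes the PostScript next to the original. It assumes
antiword is on the PATH; my-antiword-to-ps is a hypothetical name and
the "letter" paper size is taken from the command above.

```elisp
;; Sketch only: `my-antiword-to-ps' is a made-up name.
(defun my-antiword-to-ps (file)
  "Convert Word FILE to PostScript next to it and return the new file name."
  (interactive "fWord file: ")
  (let* ((file (expand-file-name file))
         (out (concat (file-name-sans-extension file) ".ps")))
    ;; antiword prints PostScript on stdout; send it straight to OUT.
    (unless (eq 0 (call-process "antiword" nil (list :file out) nil
                                "-p" "letter" file))
      (error "antiword failed on %s" file))
    (message "Wrote %s" out)
    out))
```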

-kin
