unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#20891: emacs: Back off if .doc is not an Office document
@ 2015-06-24 11:19 era+emacs
  2019-08-01 20:53 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 24+ messages in thread
From: era+emacs @ 2015-06-24 11:19 UTC (permalink / raw)
  To: 20891; +Cc: era+emacs

Package: emacs
Severity: normal
Version: 24.4+1-4ubuntu5
X-Debbugs-Cc: era+emacs@iki.fi

(I am forwarding the following bug from the Ubuntu Launchpad bug
tracking system.  The original report contains some upset language; the
boiled-down summary at the top is mine.)

https://bugs.launchpad.net/ubuntu/+source/emacs24/+bug/1466139

It is not uncommon for *.doc files to contain plain ASCII text. In this
case, the default behavior of Emacs is less than ideal, as described in
more detail in the problem report below. Perhaps the .doc file name
mapping should contain some additional heuristics, and fall back to
plain text if the file is not an Office document.

Original problem description follows.

-----

Today I downloaded the sources of secure delete. Inspected some files
with vi and some with Emacs 24. Did what I wanted to do, started to
listen to my favourite internet radio station, wanted to cite on
Facebook a citation from the secure delete docs.

I wanted to open the file "secure_delete.doc" (a pure ASCII text file)
in Emacs 24 and: "Whenever you see this buffer I'm going to make a
picture of it and you won't be able to edit anything." Haha, no this
really reminds me of the monkey face during the Ubuntu installation. But
don't make a monkey out of me because Emacs 24 is going to be replaced
with svi an extensible text base line editor yet to be written.

Emacs' open file is broken:
 - whenever it sees a file with the extension or postfix ".doc" it
 treats it like an Office document.
 - it takes an image of it
 - and shows you the image - which for a pure text file shows you the
 contents of the file as an image in that gone editor

They should use the /file/ utility to check for the file type - but
showing an immutable picture of pure text is like making a monkey out of
the user.

-- 
If this were a real .signature, it would suck less.  Well, maybe not.






* bug#20891: emacs: Back off if .doc is not an Office document
  2015-06-24 11:19 bug#20891: emacs: Back off if .doc is not an Office document era+emacs
@ 2019-08-01 20:53 ` Lars Ingebrigtsen
  2019-11-06  1:53   ` Stefan Kangas
  0 siblings, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-08-01 20:53 UTC (permalink / raw)
  To: era+emacs; +Cc: 20891

era+emacs@iki.fi writes:

> It is not uncommon for *.doc files to contain plain ASCII text. In this
> case, the default behavior of Emacs is less than ideal, as described in
> more detail in the problem report below. Perhaps the .doc file name
> mapping should contain some additional heuristics, and fall back to
> plain text if the file is not an Office document.

(I'm going through old bug reports that have unfortunately not gotten
any responses.)

I think this makes sense.  A fix in Emacs would mean moving the .doc
recognition from `auto-mode-alist' to...  `magic-fallback-mode-alist', I
guess.

According to the interwebs, the magic sequence for Word .doc files is:

D0 CF 11 E0 A1 B1 1A E1

Does anybody have an opinion here?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-08-01 20:53 ` Lars Ingebrigtsen
@ 2019-11-06  1:53   ` Stefan Kangas
  2019-11-06 13:08     ` era
  2019-11-08 20:59     ` Lars Ingebrigtsen
  0 siblings, 2 replies; 24+ messages in thread
From: Stefan Kangas @ 2019-11-06  1:53 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: era+emacs, 20891

Lars Ingebrigtsen <larsi@gnus.org> writes:

> era+emacs@iki.fi writes:
>
>> It is not uncommon for *.doc files to contain plain ASCII text. In this
>> case, the default behavior of Emacs is less than ideal, as described in
>> more detail in the problem report below. Perhaps the .doc file name
>> mapping should contain some additional heuristics, and fall back to
>> plain text if the file is not an Office document.
>
> (I'm going through old bug reports that have unfortunately not gotten
> any responses.)
>
> I think this makes sense.  A fix in Emacs would mean moving the .doc
> recognition from `auto-mode-alist' to...  `magic-fallback-mode-alist', I
> guess.
>
> According to the interwebs, the magic sequence for Word .doc files is:
>
> D0 CF 11 E0 A1 B1 1A E1
>
> Does anybody have an opinion here?

I wasn't aware of the practice to name plain text files *.doc; I can't
remember having encountered any file like that.  Perhaps this practice
is rare.

Would implementing this risk making opening *.doc files slower for most
users?  Perhaps that could make the trade-off not worth it.  Other
than that, I see no problem with the proposal.

Best regards,
Stefan Kangas






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-06  1:53   ` Stefan Kangas
@ 2019-11-06 13:08     ` era
  2019-11-06 23:19       ` Stefan Kangas
  2019-11-08 20:59     ` Lars Ingebrigtsen
  1 sibling, 1 reply; 24+ messages in thread
From: era @ 2019-11-06 13:08 UTC (permalink / raw)
  To: 20891; +Cc: Lars Ingebrigtsen, Stefan Kangas

On Wed, Nov 6, 2019, at 03:53, Stefan Kangas wrote:
> Lars Ingebrigtsen <larsi@gnus.org> writes:
> > era+emacs@iki.fi writes:
> >> It is not uncommon for *.doc files to contain plain ASCII text. In this
> >> case, the default behavior of Emacs is less than ideal
> > I think this makes sense.  A fix in Emacs would mean moving the .doc
> > recognition from `auto-mode-alist' to...  `magic-fallback-mode-alist', I
> > guess.
> I wasn't aware of the practice to name plain text files *.doc; I can't
> remember having encountered any file like that.  Perhaps this practice
> is rare.
> Would implementing this risk making opening *.doc files slower for most
> users?  Perhaps that could make the trade-off not worth it.  Other
> than that, I see no problem with the proposal.

I'd agree that this is probably increasingly rare, but it used to be a practice which wasn't entirely uncommon back when Microsoft was not yet a household brand name and Word wasn't taught in schools.

On the other hand, if the behavior described in the original bug report is still current, that's quirky and unexpected. Really, how many people *expect* Emacs to be able to open a Word document, and are any of them happy when they get a static image to look at in Emacs?

-- 
If this were a real .signature, it would suck less.  Well, maybe not.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-06 13:08     ` era
@ 2019-11-06 23:19       ` Stefan Kangas
  2019-11-07  4:45         ` Richard Stallman
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Kangas @ 2019-11-06 23:19 UTC (permalink / raw)
  To: era; +Cc: Lars Ingebrigtsen, 20891

era <era+emacs@iki.fi> writes:

> On the other hand, if the behavior described in the original bug report is still
> current, that's quirky and unexpected. Really, how many people *expect* Emacs to
> be able to open a Word document, and are any of them happy when they get a
> static image to look at in Emacs?

AFAIU, the problem is that we do not have a mode to edit Microsoft
Word documents.  It would obviously be fantastic if someone would be
willing to write such a package, but it's a potentially big task.

So, as long as we lack editing capabilities, showing an image of the
document in Emacs is actually pretty useful.  More useful than getting
garbled text, at any rate.

Best regards,
Stefan Kangas






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-06 23:19       ` Stefan Kangas
@ 2019-11-07  4:45         ` Richard Stallman
  2019-11-07  8:29           ` era
  0 siblings, 1 reply; 24+ messages in thread
From: Richard Stallman @ 2019-11-07  4:45 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: era+emacs, larsi, 20891

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > So, as long as we lack editing capabilities, showing an image of the
  > document in Emacs is actually pretty useful.

How would Emacs do that?

-- 
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)








* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-07  4:45         ` Richard Stallman
@ 2019-11-07  8:29           ` era
  0 siblings, 0 replies; 24+ messages in thread
From: era @ 2019-11-07  8:29 UTC (permalink / raw)
  To: Richard Stallman, 20891

On Thu, Nov 7, 2019, at 06:45, Richard Stallman wrote:
>   > So, as long as we lack editing capabilities, showing an image of the
>   > document in Emacs is actually pretty useful.
> How would Emacs do that?

The Emacs-side entry point seems to be doc-view-mode-maybe, which is hooked in auto-mode-alist for a number of file name extensions.

As described in https://www.emacswiki.org/emacs/DocViewMode it relies on external utilities to provide the actual image.

I was unable to quickly repro in a fresh Debian or Ubuntu image, but that might be because I didn't have the external utility installed.

Tangentially, googling for doc-view-mode-maybe suggests that lots of people are annoyed by it and want to turn it off, probably often for related but distinct reasons.

-- 
If this were a real .signature, it would suck less.  Well, maybe not.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-06  1:53   ` Stefan Kangas
  2019-11-06 13:08     ` era
@ 2019-11-08 20:59     ` Lars Ingebrigtsen
  2019-11-09  6:25       ` Eli Zaretskii
  1 sibling, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-08 20:59 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: era+emacs, 20891

Stefan Kangas <stefan@marxist.se> writes:

> Would implementing this risk making opening *.doc files slower for most
> users?  Perhaps that could make the trade-off not worth it.  Other
> than that, I see no problem with the proposal.

I don't think it'd be any performance problem -- we'd just have to read
the first 8 bytes of the file to see whether the magic sequence is there.
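
A minimal sketch of such a check, for illustration only: `my-ole2-signature-p' is a hypothetical helper name, not a function that exists in Emacs, and the eight bytes are the ones quoted earlier in this thread.

```elisp
(defun my-ole2-signature-p (file)
  "Return non-nil if FILE begins with the bytes D0 CF 11 E0 A1 B1 1A E1.
Hypothetical helper for illustration; it is not part of Emacs."
  (with-temp-buffer
    ;; Work on raw bytes so that no coding-system decoding interferes.
    (set-buffer-multibyte nil)
    ;; Read only the first eight bytes of FILE, however large it is.
    (insert-file-contents-literally file nil 0 8)
    (string= (buffer-string)
             "\320\317\021\340\241\261\032\341")))
```

Because only eight bytes are read, the cost should not depend on the size of the document.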

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-08 20:59     ` Lars Ingebrigtsen
@ 2019-11-09  6:25       ` Eli Zaretskii
  2019-11-09 20:14         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-09  6:25 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: era+emacs, 20891, stefan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Fri, 08 Nov 2019 21:59:54 +0100
> Cc: era+emacs@iki.fi, 20891@debbugs.gnu.org
> 
> Stefan Kangas <stefan@marxist.se> writes:
> 
> > Would implementing this risk making opening *.doc files slower for most
> > users?  Perhaps that could make the trade-off not worth it.  Other
> > than that, I see no problem with the proposal.
> 
> I don't think it'd be any performance problem -- we'd just have to read
> the first 8 bytes of the file to see whether the magic sequence is there.

*.doc files are rare nowadays.  Do the *.docx files have the same
signature?  I doubt that, since they are actually *.zip files in
disguise.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-09  6:25       ` Eli Zaretskii
@ 2019-11-09 20:14         ` Lars Ingebrigtsen
  2019-11-14  8:54           ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-09 20:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, 20891, stefan

Eli Zaretskii <eliz@gnu.org> writes:

>> I don't think it'd be any performance problem -- we'd just have to read
>> the first 8 bytes of the file to see whether the magic sequence is there.
>
> *.doc files are rare nowadays.  Do the *.docx files have the same
> signature?  I doubt that, since they are actually *.zip files in
> disguise.

Yeah, *.docx have a different signature, so this would be for *.doc files
only (and since the Windows *.doc files are becoming rarer, perhaps that
means that doing doc-view only on files that have the magic bytes is
more important than it used to be).

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-09 20:14         ` Lars Ingebrigtsen
@ 2019-11-14  8:54           ` Eli Zaretskii
  2019-11-14  9:55             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-14  8:54 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: era+emacs, 20891, stefan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: stefan@marxist.se,  era+emacs@iki.fi,  20891@debbugs.gnu.org
> Date: Sat, 09 Nov 2019 21:14:52 +0100
> 
> since the Windows *.doc files are becoming rarer, perhaps that
> means that doing doc-view only on files that have the magic bytes is
> more important than it used to be

Sorry, I don't follow that logic.  I'd expect that *.doc MS Word files
becoming rarer would mean plain-text *.doc files become relatively
more important, i.e. the opposite conclusion.  What did I miss?






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14  8:54           ` Eli Zaretskii
@ 2019-11-14  9:55             ` Lars Ingebrigtsen
  2019-11-14 14:12               ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-14  9:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, 20891, stefan

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Lars Ingebrigtsen <larsi@gnus.org>
>> Cc: stefan@marxist.se,  era+emacs@iki.fi,  20891@debbugs.gnu.org
>> Date: Sat, 09 Nov 2019 21:14:52 +0100
>> 
>> since the Windows *.doc files are becoming rarer, perhaps that
>> means that doing doc-view only on files that have the magic bytes is
>> more important than it used to be
>
> Sorry, I don't follow that logic.  I'd expect that *.doc MS Word files
> becoming rarer would mean plain-text *.doc files become relatively
> more important, i.e. the opposite conclusion.  What did I miss?

That's what I'm saying.  :-) Or at least I tried to.  It's more
important to add magic byte recognition to doc-view-mode for .doc files now
than before.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14  9:55             ` Lars Ingebrigtsen
@ 2019-11-14 14:12               ` Eli Zaretskii
  2019-11-14 15:06                 ` Robert Pluim
  2019-11-15  7:50                 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-14 14:12 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: era+emacs, 20891, stefan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: stefan@marxist.se,  era+emacs@iki.fi,  20891@debbugs.gnu.org
> Date: Thu, 14 Nov 2019 10:55:35 +0100
> 
> > Sorry, I don't follow that logic.  I'd expect that *.doc MS Word files
> > becoming rarer would mean plain-text *.doc files become relatively
> > more important, i.e. the opposite conclusion.  What did I miss?
> 
> That's what I'm saying.  :-) Or at least I tried to.  It's more
> important to add magic byte recognition to doc-view-mode for .doc files now
> than before.

How would the magic signature recognition help with plain-text files?
They don't have any such signatures?  I'm still missing something,
sorry.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 14:12               ` Eli Zaretskii
@ 2019-11-14 15:06                 ` Robert Pluim
  2019-11-14 16:19                   ` Eli Zaretskii
  2019-11-15  7:50                 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 24+ messages in thread
From: Robert Pluim @ 2019-11-14 15:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, Lars Ingebrigtsen, 20891, stefan

>>>>> On Thu, 14 Nov 2019 16:12:22 +0200, Eli Zaretskii <eliz@gnu.org> said:

    Eli> How would the magic signature recognition help with plain-text files?
    Eli> They don't have any such signatures?  I'm still missing something,
    Eli> sorry.

Today we go: ".doc extension -> show an image of the contents of the
file" which is manifestly the wrong thing to do for a non-doc file. If
we do the signature recognition, those files which are not recognized
end up in (probably) fundamental-mode

Robert






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 15:06                 ` Robert Pluim
@ 2019-11-14 16:19                   ` Eli Zaretskii
  2019-11-14 16:33                     ` Andreas Schwab
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-14 16:19 UTC (permalink / raw)
  To: Robert Pluim; +Cc: era+emacs, larsi, 20891, stefan

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Lars Ingebrigtsen <larsi@gnus.org>,  era+emacs@iki.fi,
>   20891@debbugs.gnu.org,  stefan@marxist.se
> Date: Thu, 14 Nov 2019 16:06:36 +0100
> 
> Today we go: ".doc extension -> show an image of the contents of the
> file"

Where do we have the code or data which does that?

> which is manifestly the wrong thing to do for a non-doc file. If
> we do the signature recognition, those files which are not recognized
> end up in (probably) fundamental-mode

That's OK, but I'm still missing the code which makes this happen.
E.g., I just did "C-x C-f foo.doc RET" and got a buffer in Fundamental
mode.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 16:19                   ` Eli Zaretskii
@ 2019-11-14 16:33                     ` Andreas Schwab
  2019-11-14 16:42                       ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Schwab @ 2019-11-14 16:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, Robert Pluim, 20891, stefan, larsi

On Nov 14 2019, Eli Zaretskii wrote:

> Where do we have the code or data which does that?

See auto-mode-alist, and doc-view-mode-maybe.

> That's OK, but I'm still missing the code which makes this happen.
> E.g., I just did "C-x C-f foo.doc RET" and got a buffer in Fundamental
> mode.

It only works if you have a doc-view-odf->pdf-converter-program.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 16:33                     ` Andreas Schwab
@ 2019-11-14 16:42                       ` Eli Zaretskii
  2019-11-15  7:51                         ` Lars Ingebrigtsen
  2019-11-15  9:14                         ` Robert Pluim
  0 siblings, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-14 16:42 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: era+emacs, rpluim, 20891, stefan, larsi

> From: Andreas Schwab <schwab@suse.de>
> Cc: Robert Pluim <rpluim@gmail.com>,  era+emacs@iki.fi,  larsi@gnus.org,  20891@debbugs.gnu.org,  stefan@marxist.se
> Date: Thu, 14 Nov 2019 17:33:28 +0100
> 
> On Nov 14 2019, Eli Zaretskii wrote:
> 
> > Where do we have the code or data which does that?
> 
> See auto-mode-alist, and doc-view-mode-maybe.
> 
> > That's OK, but I'm still missing the code which makes this happen.
> > E.g., I just did "C-x C-f foo.doc RET" and got a buffer in Fundamental
> > mode.
> 
> It only works if you have a doc-view-odf->pdf-converter-program.

Thanks, I was blind.

So we want to remove docx? from auto-mode-alist and instead to add the
magic signature to magic-mode-alist?  But then AFAIK MS Word documents
had different signatures for different versions, so we should have
several.  And a literal docx should be left in auto-mode-alist, right?






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 14:12               ` Eli Zaretskii
  2019-11-14 15:06                 ` Robert Pluim
@ 2019-11-15  7:50                 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-15  7:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, 20891, stefan

Eli Zaretskii <eliz@gnu.org> writes:

>> That's what I'm saying.  :-) Or at least I tried to.  It's more
>> important to add magic byte recognition to doc-view-mode for .doc files now
>> than before.
>
> How would the magic signature recognition help with plain-text files?
> They don't have any such signatures?  I'm still missing something,
> sorry.

The magic signature recognition is for the MS .doc files, not the text
files.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 16:42                       ` Eli Zaretskii
@ 2019-11-15  7:51                         ` Lars Ingebrigtsen
  2019-11-15  8:48                           ` Eli Zaretskii
  2019-11-15  9:14                         ` Robert Pluim
  1 sibling, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-15  7:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, Andreas Schwab, rpluim, 20891, stefan

Eli Zaretskii <eliz@gnu.org> writes:

> But then AFAIK MS Word documents had different signatures for
> different versions, so we should have several.

All .doc files allegedly start with the same eight bytes.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-15  7:51                         ` Lars Ingebrigtsen
@ 2019-11-15  8:48                           ` Eli Zaretskii
  2019-11-15  8:56                             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-15  8:48 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: era+emacs, schwab, rpluim, 20891, stefan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Andreas Schwab <schwab@suse.de>,  rpluim@gmail.com,  era+emacs@iki.fi,
>   20891@debbugs.gnu.org,  stefan@marxist.se
> Date: Fri, 15 Nov 2019 08:51:40 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > But then AFAIK MS Word documents had different signatures for
> > different versions, so we should have several.
> 
> All .doc files allegedly start with the same eight bytes.

Maybe my reading of the 'magic' file is wrong, but it seems to say
otherwise.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-15  8:48                           ` Eli Zaretskii
@ 2019-11-15  8:56                             ` Lars Ingebrigtsen
  2019-11-15  9:51                               ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-15  8:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, schwab, rpluim, 20891, stefan

Eli Zaretskii <eliz@gnu.org> writes:

> Maybe my reading of the 'magic' file is wrong, but it seems to say
> otherwise.

I just consulted https://en.wikipedia.org/wiki/List_of_file_signatures,
but I have little practical experience with .doc files myself.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-14 16:42                       ` Eli Zaretskii
  2019-11-15  7:51                         ` Lars Ingebrigtsen
@ 2019-11-15  9:14                         ` Robert Pluim
  1 sibling, 0 replies; 24+ messages in thread
From: Robert Pluim @ 2019-11-15  9:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, Andreas Schwab, larsi, 20891, stefan

>>>>> On Thu, 14 Nov 2019 18:42:40 +0200, Eli Zaretskii <eliz@gnu.org> said:

    Eli> So we want to remove docx? from auto-mode-alist and instead to add the
    Eli> magic signature to magic-mode-alist?  But then AFAIK MS Word documents
    Eli> had different signatures for different versions, so we should have
    Eli> several.  And a literal docx should be left in auto-mode-alist, right?

Yes. The following detects a word 97 file for me, and a text .doc file
opens in fundamental-mode.

diff --git i/lisp/files.el w/lisp/files.el
index 053583b4cb..ea3d3deb34 100644
--- i/lisp/files.el
+++ w/lisp/files.el
@@ -2798,7 +2798,7 @@ auto-mode-alist
      ("\\.\\(diffs?\\|patch\\|rej\\)\\'" . diff-mode)
      ("\\.\\(dif\\|pat\\)\\'" . diff-mode) ; for MS-DOS
      ("\\.[eE]?[pP][sS]\\'" . ps-mode)
-     ("\\.\\(?:PDF\\|DVI\\|OD[FGPST]\\|DOCX?\\|XLSX?\\|PPTX?\\|pdf\\|djvu\\|dvi\\|od[fgpst]\\|docx?\\|xlsx?\\|pptx?\\)\\'" . doc-view-mode-maybe)
+     ("\\.\\(?:PDF\\|DVI\\|OD[FGPST]\\|DOCX\\|XLSX?\\|PPTX?\\|pdf\\|djvu\\|dvi\\|od[fgpst]\\|docx\\|xlsx?\\|pptx?\\)\\'" . doc-view-mode-maybe)
      ("configure\\.\\(ac\\|in\\)\\'" . autoconf-mode)
      ("\\.s\\(v\\|iv\\|ieve\\)\\'" . sieve-mode)
      ("BROWSE\\'" . ebrowse-tree-mode)
@@ -3062,6 +3062,7 @@ magic-fallback-mode-alist
             (comment-re (concat "\\(?:!--" incomment-re "*-->[ \t\r\n]*<\\)")))
        (concat "[ \t\r\n]*<" comment-re "*!DOCTYPE "))
      . sgml-mode)
+    ("\320\317\021\340\241\261\032\341" . doc-view-mode-maybe)
     ("%!PS" . ps-mode)
     ("# xmcd " . conf-unix-mode)))
   "Like `magic-mode-alist' but has lower priority than `auto-mode-alist'.

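The string added to `magic-fallback-mode-alist' in the patch above is written with octal escapes, while the signature was given earlier in the thread in hexadecimal. As a small sanity check (a sketch, not part of the patch), the following form compares the two byte by byte; evaluating it in `M-:' or *scratch* is a quick way to confirm they denote the same sequence.

```elisp
;; Turn the octal-escaped (unibyte) string into a list of byte values
;; and compare it with the hexadecimal form D0 CF 11 E0 A1 B1 1A E1.
(equal (append "\320\317\021\340\241\261\032\341" nil)
       '(#xD0 #xCF #x11 #xE0 #xA1 #xB1 #x1A #xE1))
```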





* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-15  8:56                             ` Lars Ingebrigtsen
@ 2019-11-15  9:51                               ` Eli Zaretskii
  2019-11-15 13:20                                 ` Robert Pluim
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-15  9:51 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: era+emacs, schwab, rpluim, 20891, stefan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: schwab@suse.de,  rpluim@gmail.com,  era+emacs@iki.fi,
>   20891@debbugs.gnu.org,  stefan@marxist.se
> Date: Fri, 15 Nov 2019 09:56:07 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Maybe my reading of the 'magic' file is wrong, but it seems to say
> > otherwise.
> 
> I just consulted https://en.wikipedia.org/wiki/List_of_file_signatures,
> but I have little practical experience with .doc files myself.

Thanks, I'm good with using just one signature for now.  I don't think
the other signatures, if they exist, are important enough to postpone
fixing this issue.






* bug#20891: emacs: Back off if .doc is not an Office document
  2019-11-15  9:51                               ` Eli Zaretskii
@ 2019-11-15 13:20                                 ` Robert Pluim
  0 siblings, 0 replies; 24+ messages in thread
From: Robert Pluim @ 2019-11-15 13:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: era+emacs, schwab, Lars Ingebrigtsen, 20891, stefan

tags 20891 fixed
close 20891 27.1
quit

>>>>> On Fri, 15 Nov 2019 11:51:23 +0200, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Thanks, I'm good with using just one signature for now.  I don't think
    Eli> the other signatures, if they exist, are important enough to postpone
    Eli> fixing this issue.

Closing.
Committed as 904146cf79

Robert







Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
