Problem with auto-coding-functions

From: Dmitriyi Paduchikh @ 2007-06-16 12:53 UTC (permalink / raw)
  To: bug-gnu-emacs


Dear Emacs developers,

I am using Romain Francoise's Debian package emacs-snapshot 20070608-1,
which is a snapshot of the Emacs 22 branch.

I wrote a function for the auto-coding-functions hook that runs a program
called Enca[1] as a subprocess to determine the encoding of the text. The
function runs enca by calling call-process-region; see the attachment for
the full source. This worked fine when opening a file, but when I tried to
save a file with C-x C-m c utf-8, Emacs asked:

   Selected encoding no-conversion disagrees with iso-8859-5-unix specified
   by file contents. Really save (else edit coding cookies and try again)?
   (yes or no)

I think that iso-8859-5-unix comes from Enca, which incorrectly identifies
Cyrillic text in emacs-mule as ISO-8859-5. The no-conversion part, however,
was very confusing. As the backtrace in the attachment suggests, it is due
to select-safe-coding-system being called recursively via
call-process-region. I solved this by binding auto-coding-functions to nil
inside my function. However, some problems remain.
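
For illustration, a minimal sketch of such a function might look like the
following. It is not the attached auto-enca.el: the function name
my-enca-auto-coding is hypothetical, and the use of enca's -m option (print
the MIME name of the detected charset) is an assumption about how enca is
invoked. The point of the sketch is the let-binding of
auto-coding-functions to nil around call-process-region.

```emacs-lisp
;; Hypothetical sketch, not the attached auto-enca.el.
(defun my-enca-auto-coding (size)
  "Guess the coding system of the SIZE characters after point with enca.
Return a coding system symbol, or nil if enca gives no usable answer."
  (when (executable-find "enca")
    (let ((auto-coding-functions nil)   ; keep this hook from being re-entered
          (out (generate-new-buffer " *enca*")))
      (unwind-protect
          ;; Send the region to enca on stdin; collect its stdout in OUT.
          ;; Assumption: "-m" makes enca print a MIME charset name.
          (when (eq 0 (call-process-region (point) (+ (point) size)
                                           "enca" nil out nil "-m"))
            (let* ((name (with-current-buffer out
                           (goto-char (point-min))
                           (buffer-substring (point) (line-end-position))))
                   (cs (intern (downcase name))))
              ;; Only return names Emacs knows as coding systems.
              (and (coding-system-p cs) cs)))
        (kill-buffer out)))))

(add-to-list 'auto-coding-functions 'my-enca-auto-coding)
```

With auto-coding-functions bound to nil, any coding-system selection
triggered inside call-process-region no longer calls back into this
function.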

First, auto-coding-functions are executed via select-safe-coding-system on
decoded text, whereas the documentation of auto-coding-functions promises
that the functions will be called on undecoded text. Clearly there is a bug
here. If the functions need to be called on both decoded and undecoded text,
this should be mentioned in the documentation, and some way of
distinguishing whether the text is decoded should be provided.

Second, when saving with C-x C-m c utf-8, Emacs still asks:

   Selected encoding utf-8-unix disagrees with iso-8859-5-unix specified by
   file contents. Really save (else edit coding cookies and try again)? (yes
   or no)

which is somewhat annoying. I would prefer that Emacs just silently save the
file with the coding system given via C-x C-m c (utf-8 in my case),
regardless of which coding system auto-coding-functions suggest.

Third, if I set the coding system with C-x C-m f, Emacs silently saves the
file with the coding system returned by auto-coding-functions (iso-8859-5 in
my case), no matter what coding system was specified with C-x C-m f. This
looks wrong to me. IMHO it would be better if Emacs respected the explicit
user choice, or at least asked the user.

Note: my language environment is Russian with koi8-r, in case this matters.

Footnotes: 
[1] Extremely Naive Charset Analyser. Available as a Debian package;
    otherwise look for it at http://packages.debian.org/unstable/source/enca

[-- Attachment #2: auto-enca.el --]
[-- Type: application/emacs-lisp, Size: 688 bytes --]

[-- Attachment #3: The backtrace --]
[-- Type: application/octet-stream, Size: 14711 bytes --]
