* Re: auto-detection
From: Kenichi Handa @ 2003-11-21 2:06 UTC
Cc: epameinondas, emacs-unicode, emacs-devel
In article <jwvhe11d1s7.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>, Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>>> I think it would be good when saving a file to automatically verify that
>>> the coding-system chosen will be correctly auto-detected if read by
>>> a similarly-configured Emacs. This is already done w.r.t the
>>> coding-cookie but not with the auto-detection.
>> The easy but slow way to implement it is to insert the file
>> again in a temporary buffer with (let
>> ((coding-system-for-read 'undecided)) ..), and check which
>> coding system is detected. And I think any other methods
>> are quite difficult to implement.
> That's indeed the problem: there doesn't seem to be any easy way to make
> the test robust and lightweight.
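For reference, the easy but slow way quoted above could be sketched
roughly as follows. This is only an illustration: the function name
my-file-detected-as-p is hypothetical and not part of Emacs, and
comparing via coding-system-base (to ignore end-of-line variants) is an
assumption about what should count as "the same" coding system.

```elisp
;; Hypothetical helper, not part of Emacs.  It re-reads FILE with
;; content-based auto-detection and compares the detected coding
;; system with CODING-SYSTEM.
(defun my-file-detected-as-p (file coding-system)
  "Return non-nil if FILE is auto-detected as CODING-SYSTEM when read."
  (with-temp-buffer
    ;; Binding `coding-system-for-read' to `undecided' forces
    ;; detection from the file contents.
    (let ((coding-system-for-read 'undecided))
      (insert-file-contents file))
    ;; `insert-file-contents' leaves the coding system it used in
    ;; `last-coding-system-used'.  Compare base coding systems so
    ;; that e.g. utf-8-unix and utf-8 are treated as equal
    ;; (an assumption of this sketch).
    (eq (coding-system-base last-coding-system-used)
        (coding-system-base coding-system))))
```

Reading the whole file again on every save is what makes this
approach slow.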
Something like the function below is mostly accurate and
lightweight. It would be better if it also accepted a FILE
argument so that it could check auto-coding-alist and
file-coding-system-alist. But, for the moment, I don't have
time to work on it further.
(defun coding-system-round-trip-safe-p (coding-system from to &optional string)
  "Check if CODING-SYSTEM is round-trip safe for the region between FROM and TO.
The value is non-nil if and only if we can recover the same text
by encoding the text in the region between FROM and TO with
CODING-SYSTEM and decoding the result back with auto-detection.
If the value is nil, you can check how the text was actually
detected by the value of `last-coding-system-used'.
If the optional 4th argument STRING is a string, FROM and TO are
indices to STRING defaulting to 0 and length of STRING
respectively.
The check is done only for the first 10 non-ASCII characters."
  (let ((str "")
        (count 10))
    (if (stringp string)
        (progn
          (or from (setq from 0))
          (or to (setq to (length string)))
          ;; Collect up to COUNT non-ASCII characters from STRING.
          (while (and (> count 0)
                      (setq from (string-match "[^\000-\177]" string from))
                      (< from to))
            (setq str (concat str (string (aref string from)))
                  from (1+ from)
                  count (1- count))))
      ;; Otherwise collect them from the current buffer.
      (save-excursion
        (goto-char from)
        (while (and (> count 0)
                    (re-search-forward "[^\000-\177]" to t))
          (setq str (concat str (string (preceding-char)))
                count (1- count)))))
    ;; If no non-ASCII character was found, treat the text as safe.
    ;; Otherwise encode the sample and check that auto-detection
    ;; decodes it back to the same text.
    (or (= (length str) 0)
        (string= (decode-coding-string
                  (encode-coding-string str coding-system) 'undecided)
                 str))))
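
As an illustration of how the function above might be used for the
check-on-save idea discussed in this thread, here is a minimal sketch.
The name my-check-round-trip-before-save is hypothetical, and using
before-save-hook together with the buffer's buffer-file-coding-system
is an assumption of this sketch, not something decided in the thread.
It only warns and never blocks the save.

```elisp
;; Hypothetical hook function, not part of Emacs.  It relies on
;; `coding-system-round-trip-safe-p' from this message and does
;; nothing if that function has not been defined.
(defun my-check-round-trip-before-save ()
  "Warn if the buffer's coding system may be mis-detected on re-read.
Always return nil."
  (when (and (fboundp 'coding-system-round-trip-safe-p)
             buffer-file-coding-system)
    (unless (coding-system-round-trip-safe-p
             buffer-file-coding-system (point-min) (point-max))
      ;; After a failed check, `last-coding-system-used' holds the
      ;; coding system that auto-detection chose for the sample.
      (message "Warning: text saved as %s would be auto-detected as %s"
               buffer-file-coding-system last-coding-system-used)))
  nil)

(add-hook 'before-save-hook #'my-check-round-trip-before-save)
```

Because only the first 10 non-ASCII characters are sampled, a hook
like this stays cheap even for large buffers, at the cost of being
only mostly accurate.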
---
Ken'ichi HANDA
handa@m17n.org