* Detecting if a file is binary
@ 2009-11-24 15:23 Nordlöw
2009-11-24 17:42 ` tomas
From: Nordlöw @ 2009-11-24 15:23 UTC
To: help-gnu-emacs
Is there a way in emacs-lisp code to detect if a file is binary, that is,
that it does *not* contain a correct multi-character coding?
Or can every possible combination of bytes always be correctly decoded
by some character coding?
/Nordlöw
* Re: Detecting if a file is binary
From: tomas @ 2009-11-24 17:42 UTC
To: Nordlöw; +Cc: help-gnu-emacs
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tue, Nov 24, 2009 at 07:23:34AM -0800, Nordlöw wrote:
> Is there a way in emacs-lisp code to detect if a file is binary, that is,
> that it does *not* contain a correct multi-character coding?
> Or can every possible combination of bytes always be correctly decoded
> by some character coding?
Yes, it can. In the one-byte encodings of the iso-8859-x family (most
clearly iso-8859-1), each byte represents a valid code point, so any byte
sequence decodes without error. In utf-8, by contrast, there are byte
sequences which can't (shouldn't) happen in well-formed text.
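
To make the contrast concrete, here is a small sketch. It is written in
Python rather than emacs-lisp, purely as an illustration; the helper name
`decodes_as` is made up for this example and is not part of any library.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes without error in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# All 256 possible byte values, in order.
all_bytes = bytes(range(256))

# iso-8859-1 maps every byte value to a code point, so this never fails.
print(decodes_as(all_bytes, "iso-8859-1"))

# A well-formed two-byte utf-8 sequence (the letter "ö").
print(decodes_as(b"\xc3\xb6", "utf-8"))

# A utf-8 lead byte followed by a byte that is not a continuation byte.
print(decodes_as(b"\xc3\x28", "utf-8"))
```

The consequence is that "decodes without error" says very little for a
one-byte encoding, while a failed utf-8 decode only shows that the data is
not utf-8, not that it is binary.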
I think the only way to gain some confidence is by statistical analysis
of the text.
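
One possible form of such an analysis, sketched in Python rather than
emacs-lisp: treat a NUL byte as a strong hint, and otherwise measure the
share of control bytes that rarely occur in text. Everything here is an
assumption made for illustration: the names `looks_binary` and
`file_looks_binary`, the set of "harmless" control bytes, the 30%
threshold and the 8192-byte sample size.

```python
# Control bytes that do show up in ordinary text files (assumed set):
# BEL, BS, TAB, LF, FF, CR, ESC.
TEXT_CONTROL = {7, 8, 9, 10, 12, 13, 27}


def looks_binary(sample: bytes, threshold: float = 0.30) -> bool:
    """Guess whether `sample` is binary data. This is a heuristic, not a proof."""
    if not sample:
        return False  # an empty file gives no evidence either way
    if b"\x00" in sample:
        return True  # NUL bytes are rare in most single- or multi-byte text
    # Count control bytes (and DEL) that are unusual in text.
    suspicious = sum(
        1 for b in sample if (b < 32 and b not in TEXT_CONTROL) or b == 127
    )
    return suspicious / len(sample) > threshold


def file_looks_binary(path: str, sample_size: int = 8192) -> bool:
    """Apply the heuristic to the first `sample_size` bytes of a file."""
    with open(path, "rb") as f:
        return looks_binary(f.read(sample_size))
```

A check like this will misjudge some files. Text in utf-16, for instance,
is full of NUL bytes and would be reported as binary, so the result is
best read as a hint.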
Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFLDBrsBcgs9XrR2kYRAvtOAJ9wJZ1Q9oTHX7rJUCb/0G3IhbzzKwCfaqBt
2ZZsjoR0Skn0QwptSPQVH1A=
=/HfN
-----END PGP SIGNATURE-----