unofficial mirror of help-gnu-emacs@gnu.org
* Re: problem with editing/decoding utf-8 text
From: Fery @ 2003-05-27  8:06 UTC



> > Attached is a small text file which opens as latin-1 for me and
> > refuses to save.
> 
> The attachment was missing (we really need our mailers to check the
> presence of an attachment when the main text mentions the word
> "attachment").

Yes, the choice is whether the program should be smarter or its user.
:))  Next try.

Circum

PS: Sorry Stefan for the two copies... :(

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: test --]
[-- Type: text/plain; charset=us-ascii; name="test", Size: 76 bytes --]

# it is a hungarian word, coded in utf-8...
   comment = Õrzõfejlesztés



* Re: problem with editing/decoding utf-8 text
From: Fery @ 2003-05-26  9:47 UTC


Stefan Monnier wrote:
> 
> > Now, no matter what I choose (raw-text, no-conversion, utf-8), it
> > modifies all of the utf-8 chars that do not fit into the ascii charset.
> > It seems that it inserts a \201 before every char that is not in the
> > ascii charset. I.e. if I just load and save a file, emacs does not
> > behave transparently.
> 
> Do you also get the \201 if you choose `utf-8' ?
> If so, it's definitely a bug.

Yes.

> Of course the fact that Emacs happily visited the file in latin-1 but then
> refused to save it in latin-1 is a bug.  I vaguely seem to remember that
> such a bug has been fixed in Emacs-CVS, but it would be great if you could
> either check it or report a precise test case.

Attached is a small text file which opens as latin-1 for me and refuses to
save.

> > 2. What is the difference between raw-text, no-conversion, binary? In
> > some places I can choose any of them, in other places not... This whole
> > coding system is a nightmare... :(((
> 
> Yes it is but it's not all Emacs fault.  The only alternative would be for

I know, I just have to look at my 'other OS' :(((
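
On the raw-text / no-conversion / binary question quoted above: as far as I know, `binary` is only another name for `no-conversion` (no character decoding, no end-of-line conversion), while `raw-text` also skips character decoding but still handles end-of-line formats. A minimal sketch for checking this from Lisp, assuming a reasonably recent Emacs; whether 21.3 answers identically is an assumption:

```elisp
;; `binary' should show up among the aliases of `no-conversion'.
(coding-system-aliases 'no-conversion)

;; An integer means the end-of-line format is fixed
;; (0 stands for LF, i.e. lines are left as they are).
(coding-system-eol-type 'no-conversion)

;; A vector of subsidiary coding systems means the end-of-line
;; format is detected when reading, so CRLF is still converted.
(coding-system-eol-type 'raw-text)
```

Evaluate each form with C-x C-e in the *scratch* buffer and compare the results.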

Circum

PS: What about the other bug (losing the file completely)?

* Re: problem with editing/decoding utf-8 text
From: Fery @ 2003-05-26  9:47 UTC


Oliver Scholz wrote:
> 
> kai.grossjohann@gmx.net (Kai Großjohann) writes:
> > (setq coding-category-list
> >       (cons 'coding-category-utf-8
> >             (delq 'coding-category-utf-8
> >                   coding-category-list)))
> >
> 
> Not 100% sure if this really makes a difference --
> 
> (set-coding-priority (list 'coding-category-utf-8))
> 
> maybe?

It helps. Thanks!
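
For readers on later Emacs versions: `set-coding-priority` is marked obsolete there, and the usual way to get the same effect is `prefer-coding-system`. A minimal init-file sketch, assuming an Emacs with built-in UTF-8 support:

```elisp
;; Put UTF-8 first when Emacs guesses the encoding of a file it visits.
;; Note that this also makes UTF-8 the default for newly created files.
(prefer-coding-system 'utf-8)
```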

> >> 1. Can't I tell a buffer (after loading a file) to treat its contents
> >> as binary, and save exactly the same bytes that were read into the
> >> buffer (i.e. a transparent buffer)?
> >
> > It's not a good idea.  The buffer contents might already be munged at
> > that point.

I know, I know, but I am the user, I should know if it is safe... :)

> Maybe the OP wants to visit files with `M-x find-file-literally'?

Yes, this is what I wanted originally. :) But without a 'literal
keyboard' it doesn't help _so_ much (although I can edit the
non-utf-8 part of the file)...
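
One possible stand-in for a 'literal keyboard' in a buffer visited with `find-file-literally` is a small command that encodes typed text to UTF-8 bytes before inserting it. This is only a sketch: the command name is made up for illustration, and it assumes Emacs 23 or later, where text is held as Unicode internally.

```elisp
(defun my-insert-as-utf-8-bytes (text)
  "Insert TEXT at point as raw UTF-8 bytes.
Intended for unibyte buffers, e.g. files visited literally."
  (interactive "sText to insert as UTF-8 bytes: ")
  ;; `encode-coding-string' returns a unibyte string holding the
  ;; encoded bytes; in a unibyte buffer they are inserted unchanged.
  (insert (encode-coding-string text 'utf-8)))
```

With the file open via M-x find-file-literally, M-x my-insert-as-utf-8-bytes prompts for text in the minibuffer and inserts its byte sequence at point.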

Circum

* problem with editing/decoding utf-8 text
From: Fery @ 2003-05-23 12:08 UTC


Hello there,

I have a UTF-8 text file containing latin-1 characters. When I try to edit
it with emacs, it does not detect that it is utf-8;
describe-coding-system reports 'iso-latin-1-unix'. (And I see the
two-byte representation of the latin-1 chars, which is fine with me.)
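
The two-byte representation mentioned here can be reproduced from Lisp: encode a character as UTF-8, then decode those bytes as Latin-1. A minimal sketch, assuming Emacs 23 or later; the function name is hypothetical.

```elisp
(defun my-utf-8-seen-as-latin-1 (text)
  "Return TEXT as it appears when its UTF-8 bytes are decoded as Latin-1."
  (decode-coding-string (encode-coding-string text 'utf-8) 'latin-1))

;; U+00E9 (e with acute) is the two bytes C3 A9 in UTF-8; read as
;; Latin-1 they show up as two separate characters.
(my-utf-8-seen-as-latin-1 (string #xe9))   ; => "Ã©"
```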

When I save the buffer, it displays an error message:

These default coding systems were tried:
  iso-latin-1-unix
However, none of them safely encodes the target text.

Now, no matter what I choose (raw-text, no-conversion, utf-8), it
modifies all of the utf-8 chars that do not fit into the ascii charset.
It seems that it inserts a \201 before every char that is not in the
ascii charset. I.e. if I just load and save a file, emacs does not
behave transparently.
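
A note on the \201: it is octal for byte 0x81 (decimal 129). In the internal text representation of Emacs 21, which later versions still offer as the `emacs-mule` coding system, 0x81 is the leading byte that marks a Latin-1 character. Seeing it in the saved file therefore suggests the internal representation was written out unconverted; that reading is an assumption, not something confirmed in this thread. A small sketch for looking at those bytes, assuming Emacs 23 or later, with a made-up function name:

```elisp
(defun my-emacs-mule-bytes (text)
  "Return the emacs-mule encoding of TEXT as a list of byte values."
  ;; `append' with a trailing nil turns the unibyte result into a list.
  (append (encode-coding-string text 'emacs-mule) nil))

;; Evaluate this and look for 129 (octal 201) in the result.
(my-emacs-mule-bytes (string #xe9))
```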

Moreover, there is a BUG: if I press ^G at the error message above and
quit without saving the file, it _deletes_ the file, although it leaves an
auto-save file (in which the latin-1 chars are broken).

I have found one workaround: opening the file with
universal-coding-system-argument, using either UTF-8 (then I see the
chars correctly, although that is not always important) or e.g. no-conversion.
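
Interactively this workaround is C-x RET c utf-8 RET followed by C-x C-f. The same thing can be done from Lisp by binding `coding-system-for-read` around the visit. A minimal sketch; the function name is hypothetical:

```elisp
(defun my-find-file-as (file coding-system)
  "Visit FILE, decoding it with CODING-SYSTEM instead of letting Emacs guess."
  ;; A non-nil `coding-system-for-read' overrides automatic detection
  ;; for file reads made while the binding is in effect.
  (let ((coding-system-for-read coding-system))
    (find-file file)))

;; Examples, using the attachment name from this thread:
;; (my-find-file-as "test" 'utf-8)
;; (my-find-file-as "test" 'no-conversion)
```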

My questions:

0. What is this \201 byte?

1. Can't I tell a buffer (after loading a file) to treat its contents
as binary, and save exactly the same bytes that were read into the
buffer (i.e. a transparent buffer)?

2. What is the difference between raw-text, no-conversion, binary? In
some places I can choose any of them, in other places not... This whole
coding system is a nightmare... :(((

3. Can't I tell emacs to interpret the keyboard input as "raw"? I
have set input-meta to On and convert-meta to Off in .inputrc, and if I
could tell emacs to "just take the bytes from the terminal input
as they are", then I could copy/paste utf-8 data (in raw format) from
another application. (I run emacs on linux, through the 'putty' terminal
on windows.)
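
On question 3, the usual approach is not raw input but telling Emacs which encoding the terminal uses. A minimal init-file sketch, assuming PuTTY itself is set to UTF-8 (the post does not say) and an Emacs whose UTF-8 support covers keyboard input:

```elisp
;; Decode the bytes arriving from the terminal as UTF-8 ...
(set-keyboard-coding-system 'utf-8)
;; ... and encode what Emacs sends to the terminal the same way.
(set-terminal-coding-system 'utf-8)
```

The interactive equivalents are C-x RET k and C-x RET t.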

GNU Emacs 21.3.2 on debian unstable linux.

Thanks:
Circum


