* Emacs text bug
From: drain @ 2013-01-26 20:23 UTC
To: Help-gnu-emacs

Before I report this as a bug, I want to make sure it doesn't already have a solution:

All of the "-" characters have been replaced with "\342\200\224" (which has a different face and cannot be replaced with replace-string).

--
View this message in context: http://emacs.1067599.n5.nabble.com/Emacs-text-bug-tp276577.html
Sent from the Emacs - Help mailing list archive at Nabble.com.
* Re: Emacs text bug
From: Peter Dyballa @ 2013-01-26 22:26 UTC
To: drain; +Cc: Help-gnu-emacs

On 26.01.2013 at 21:23, drain wrote:

> All of the "-" characters have been replaced with "\342\200\224" (which
> has a different face and cannot be replaced with replace-string).

Because the encoding of the buffer has changed?

I can see similar things in one specific user's GNU Emacs. In *compilation* buffers the curly quotes are turned into their byte triplets; in dired buffers the "ä" in März, the German name for March, is also sometimes lost. But why and when does this happen? Without this knowledge it's kind of senseless to report…

--
Greetings

Pete

The best way to accelerate a PC is 9.8 m/s²
* Re: Emacs text bug
From: drain @ 2013-01-26 22:43 UTC
To: Help-gnu-emacs

Perhaps the encoding did change. I recall copying and pasting a bunch of text from an online book into the buffer, and somewhere along the way I might have blindly changed the setting.

Which encoding system supports the "—" character?
* Re: Emacs text bug
From: Peter Dyballa @ 2013-01-26 22:59 UTC
To: drain; +Cc: Help-gnu-emacs

On 26.01.2013 at 23:43, drain wrote:

> Which encoding system supports the "—" character?

You showed before that three bytes were used to encode the EM DASH, so it was done in UTF-8. (This character can also be encoded in CP125[0-2], but then as 1 byte only. ISO 8859-1 has no em dash.)

--
Greetings

Pete

Chicago, n.: Where the dead still vote … early and often!
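The byte counts mentioned in this reply can be checked outside Emacs. The following is a minimal Python sketch, not part of the original thread; the helper name `octal_escapes` is made up for illustration. It renders the bytes of U+2014 EM DASH as backslash-octal escapes, the same notation that showed up in drain's buffer, and tries the single-byte encodings for comparison.

```python
EM_DASH = "\u2014"  # U+2014 EM DASH


def octal_escapes(data: bytes) -> str:
    """Render each byte as a backslash-octal escape, e.g. \\342."""
    return "".join(f"\\{b:03o}" for b in data)


utf8_bytes = EM_DASH.encode("utf-8")      # three bytes in UTF-8
cp1252_bytes = EM_DASH.encode("cp1252")   # one byte in Windows-1252

print(octal_escapes(utf8_bytes))    # \342\200\224
print(octal_escapes(cp1252_bytes))  # \227

# "latin-1" is Python's name for ISO 8859-1, which has no em dash,
# so encoding fails there.
try:
    EM_DASH.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)  # False
```

The three-byte result matches the "\342\200\224" sequence from the first message, which is why it points to UTF-8 text being displayed without decoding.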
* Re: Emacs text bug
From: drain @ 2013-01-26 23:23 UTC
To: Help-gnu-emacs

That was a bit tricky. The local buffer setting was "raw text", and I had to change it to UTF-8. But the strings of codes were not automatically converted (which would have been nice); I had to copy and paste the text into the buffer again.

Is there a way to reload these characters once the encoding is changed? I might have a few buffers like this, and it would save me from copying and pasting the texts again. Even a replace-string style approach would work for me.
* Re: Emacs text bug
From: Peter Dyballa @ 2013-01-26 23:29 UTC
To: drain; +Cc: Help-gnu-emacs

On 27.01.2013 at 00:23, drain wrote:

> Is there a way to reload these characters once the encoding is changed?

Yes: revert-buffer-with-coding-system or C-x RET r <encoding> RET

--
Greetings

Pete

Work is the curse of the drinking class.
– Oscar Wilde
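Conceptually, reverting with a coding system re-reads the same bytes from disk and decodes them differently; the file itself does not change. A rough Python analogy, using made-up sample text and a hypothetical helper `decode_as` (this is an illustration of the idea, not how Emacs implements the command):

```python
# Bytes as they might sit on disk: UTF-8 text with an "a umlaut" and an em dash.
raw = "M\u00e4rz \u2014 March".encode("utf-8")


def decode_as(data: bytes, codec: str) -> str:
    """Decode the same bytes with the named codec."""
    return data.decode(codec)


# Wrong guess: every byte becomes its own character, so multi-byte
# sequences show up as several odd characters each.
as_latin1 = decode_as(raw, "latin-1")

# Right guess: multi-byte sequences collapse back into single characters.
as_utf8 = decode_as(raw, "utf-8")

print(len(raw), len(as_latin1), len(as_utf8))
```

The bytes are identical in both cases; only the interpretation differs, which is why re-pasting the text should not be necessary once the right coding system is chosen.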
* Re: Emacs text bug
From: drain @ 2013-01-31 17:55 UTC
To: Help-gnu-emacs

Still problems.

(1) revert-buffer-with-coding-system RET
(2) utf-8 RET
(3) "Revert buffer from file[...]" y RET
(4) [characters appear as they should now]
(5) [make change so I can save]
(6) save-buffer
(7) "Select coding system (default raw-text)" utf-8
(8) "wrote buffer [...]"
(9) kill-buffer RET foo.org RET
(10) find-file foo.org RET, sees it's back to raw-text, not utf-8, with characters mangled.
* RE: Emacs text bug
From: Doug Lewan @ 2013-01-31 18:36 UTC
To: drain, Help-gnu-emacs@gnu.org

> (9) kill-buffer RET foo.org RET
> (10) find-file foo.org RET, sees it's back to raw-text, not utf-8, with
> characters mangled.

I think that's what you should expect. Once you kill the buffer, Emacs forgets all about the file that it had held. Apparently Emacs can't figure out that the file is UTF-8.

You'll need to provide a hint. `-*- coding: utf-8 -*-' in the first line is one way. You'll find more in the Emacs info page, node `Coding Systems'.

I hope this helps.

,Douglas

Douglas Lewan
Shubert Ticketing
(201) 489-8600 ext 224

When I do good, I feel good. When I do bad, I feel bad and that's my religion. - Abraham Lincoln
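To show what such a first-line hint looks like to a program, here is a small Python sketch that pulls the coding name out of a `-*- ... -*-` line. The function name `find_coding_cookie` and the regular expression are assumptions made for this illustration; Emacs's own handling of file-local variables is more involved than this.

```python
import re
from typing import Optional

# Matches "coding: NAME" anywhere between the -*- markers.
_COOKIE = re.compile(r"-\*-.*?coding:\s*([-\w.]+).*?-\*-")


def find_coding_cookie(first_line: str) -> Optional[str]:
    """Return the coding name from a -*- coding: NAME -*- line, or None."""
    match = _COOKIE.search(first_line)
    return match.group(1) if match else None


print(find_coding_cookie("# -*- coding: utf-8 -*-"))           # utf-8
print(find_coding_cookie("-*- mode: org; coding: utf-8 -*-"))  # utf-8
print(find_coding_cookie("#+TITLE: notes"))                    # None
```

With a cookie present, no guessing is needed: the file states its own encoding.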
* RE: Emacs text bug
From: drain @ 2013-01-31 18:45 UTC
To: Help-gnu-emacs

Doug Lewan wrote
> You'll need to provide a hint. `-*- coding: utf-8 -*-' in the first line
> is one way.

That appears to have worked. A bit ugly having that instruction at the top, but better than manually reverting the buffer every single time.
* Re: Emacs text bug
From: Eli Zaretskii @ 2013-01-31 19:08 UTC
To: Help-gnu-emacs

> Date: Thu, 31 Jan 2013 10:45:31 -0800 (PST)
> From: drain <aeuster@gmail.com>
>
> Doug Lewan wrote
> > You'll need to provide a hint. `-*- coding: utf-8 -*-' in the first line
> > is one way.
>
> That appears to have worked. A bit ugly having that instruction at the top,
> but better than manually reverting the buffer every single time.

You shouldn't need that. You need to clean up your file instead.
* Re: Emacs text bug
From: Eli Zaretskii @ 2013-01-31 18:52 UTC
To: Help-gnu-emacs

> Date: Thu, 31 Jan 2013 09:55:52 -0800 (PST)
> From: drain <aeuster@gmail.com>
>
> Still problems.
>
> (1) revert-buffer-with-coding-system RET
> (2) utf-8 RET
> (3) "Revert buffer from file[...]" y RET
> (4) [characters appear as they should now]
> (5) [make change so I can save]
> (6) save-buffer
> (7) "Select coding system (default raw-text)" utf-8
> (8) "wrote buffer [...]"
> (9) kill-buffer RET foo.org RET
> (10) find-file foo.org RET, sees it's back to raw-text, not utf-8, with
> characters mangled.

Evidently, you have in that file bytes that are not valid UTF-8 sequences. You need to fix them (the "Select coding system ..." prompt tells you which characters cannot be encoded in UTF-8 -- those are the ones you need to fix).
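One way to locate such bytes outside Emacs is to let a strict UTF-8 decoder report where it fails. The sketch below is an illustration only: the function name `invalid_utf8_offsets` and the sample bytes are made up, and it says nothing about what drain's file actually contained.

```python
from typing import List, Tuple


def invalid_utf8_offsets(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (offset, offending bytes) for each run that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")  # strict decoding raises on bad bytes
            break                       # the remainder is clean
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end                   # resume after the bad run
    return problems


# Hypothetical sample: valid UTF-8 text with two stray single bytes mixed in.
sample = "caf\u00e9 ".encode("utf-8") + b"\xe2 \xa3 ok"
for offset, bad in invalid_utf8_offsets(sample):
    print(offset, bad)
```

Each reported offset is a place where the file would need cleaning up before it can be read back as UTF-8 without a hint.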
* Re: Emacs text bug
From: drain @ 2013-01-31 19:28 UTC
To: Help-gnu-emacs

Now I see. This problem must have started when I copied an early 19th century letter into the buffer, and the characters did not transliterate properly into modern English. Whatever those characters were, they turned into circumflexed /a/ (â), the pound sign (£), and a (special) right double quotation mark (”). utf-8 apparently cannot handle these.

But why would this prevent utf-8 from encoding the rest of the buffer? Why not just leave those three characters mangled, and display the rest properly? It reverted fine; it just would not stay in utf-8 unless I (1) put the instruction at the top of the buffer or (2) deleted those special characters. So the functionality appears to be there: Emacs just would not accept it as a saved state (absent instruction at the top).

Somehow that buffer got stuck with a limited encoding system. I'm composing this message right now in a "scratch.org" buffer which is using utf-8-unix -- and apparently handles those three characters fine (consequently I'm switching the problem file from utf-8 to utf-8-unix).

Anyway, glad to get that sorted.
* Re: Emacs text bug
From: Eli Zaretskii @ 2013-01-31 20:04 UTC
To: Help-gnu-emacs

> Date: Thu, 31 Jan 2013 11:28:47 -0800 (PST)
> From: drain <aeuster@gmail.com>
>
> Now I see. This problem must have started when I copied an early 19th
> century letter into the buffer, and the characters did not transliterate
> properly into modern English. Whatever those characters were, they turned
> into circumflexed /a/ (â), the pound sign (£), and a (special) right double
> quotation mark (”). utf-8 apparently cannot handle these.

UTF-8 certainly _can_ handle them. I suspect that these characters got copied as raw bytes instead.

> But why would this prevent utf-8 from encoding the rest of the buffer? Why
> not just leave those three characters mangled, and display the rest
> properly? It reverted fine; it just would not stay in utf-8 unless I (1)
> put the instruction at the top of the buffer or (2) deleted those special
> characters. So the functionality appears to be there: Emacs just would not
> accept it as a saved state (absent instruction at the top).

Emacs auto-detects the encoding each time you visit a file, unless either the file (by the 'coding:' cookie) or you (by using "C-x RET c") tell it exactly how to decode the file.
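The difference between guessing and being told can be modelled in a few lines of Python. This is a deliberately simplified sketch, not Emacs's detection algorithm: the function `decode_file_bytes`, its two-step guess, and the use of Latin-1 as a stand-in for "show the bytes as they are" are all assumptions made for illustration. It shows why a few stray bytes can change the verdict for a whole file, while a forced decoding keeps the valid text readable.

```python
from typing import Optional, Tuple


def decode_file_bytes(data: bytes, forced: Optional[str] = None) -> Tuple[str, str]:
    """Return (coding name, text). Guess when no coding is forced."""
    if forced is not None:
        # The caller named the codec; undecodable bytes are kept as
        # escape code points so they survive a round trip.
        return forced, data.decode(forced, errors="surrogateescape")
    try:
        # The guess is all-or-nothing: one bad byte rejects UTF-8 entirely.
        return "utf-8", data.decode("utf-8")
    except UnicodeDecodeError:
        return "raw-text", data.decode("latin-1")


clean = "\u2014 ok".encode("utf-8")
dirty = clean + b"\xa3"  # one stray byte appended

print(decode_file_bytes(clean)[0])                  # utf-8
print(decode_file_bytes(dirty)[0])                  # raw-text
print(decode_file_bytes(dirty, forced="utf-8")[0])  # utf-8
```

In this model, removing the stray byte or forcing the codec both make the file come back as UTF-8, which mirrors the two options discussed in the thread: clean up the file, or state the coding explicitly.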
* RE: Emacs text bug
From: Drew Adams @ 2013-01-26 22:48 UTC
To: 'Peter Dyballa', 'drain'; +Cc: Help-gnu-emacs

> But why and when does this happen? Without this knowledge
> it's kind of senseless to report.

I disagree with that claim.

While it is always better to base a bug report on more information, even just reporting a problem can sometimes help. At the very least it gives Emacs core developers and other users a heads-up to look further wrt the problem and its details (e.g. "why and when").

That's already happening, because the OP posted here, thanks to your reply and his followup wrt encoding.

Staying in one's corner because one does not have all the info or understanding is too often a brake on progress. Not every user has the motivation or the means, including time, to dig deeper and investigate a problem encountered, to determine the why & when. Just communicating that there seems to be a problem, even if one is not sure, is a good start.

There is no way that Emacs developers can completely test every change they make. Users reporting questions and perceived problems are indispensable to getting it right.

IMHO, it is better for users, especially new users or those who feel unsure, to err on the side of reporting too much than too little. It is definitely _not_ the case, IMO, that "it's kind of senseless to report" without knowledge of the why & when.

The OP brought up the question here first, before reporting, in order to ask whether he was missing something. That's a good thing. If the replies here ultimately suggest that "it doesn't already have a solution", then I, for one, encourage a bug report.
* Re: Emacs text bug
From: Peter Dyballa @ 2013-01-26 23:26 UTC
To: Drew Adams; +Cc: 'drain', Help-gnu-emacs

On 26.01.2013 at 23:48, Drew Adams wrote:

> While it is always better to base a bug report on more information, even just
> reporting a problem can sometimes help. At the very least it gives Emacs core
> developers and other users a heads-up to look further wrt the problem and its
> details (e.g. "why and when").

As far as I can see, this happens rarely. Just some days ago it happened again and I was on the spot very soon. C-h l did not show anything. While the compilation was still going on and the mode line showed UTF-8 encoding, I tried to fix the way the buffer contents were presented by invoking revert-buffer-with-coding-system, C-x RET r, but it did not change anything. All other buffers (that I visited) containing non-US-ASCII characters showed the same fault: the UTF-8 encoding bytes were displayed.

This could be a Mac OS X problem. Here I can see that 'find … -ls' inserts ASCII NULs, ^@, into the *shell* buffer at the transition from the file size column to the next one, the date column. Or it happens between the date column and the file name column – I am not completely sure about it. Extra characters or bytes like these could be inserted into the *compilation* buffer as well, and then the byte sequence gets out of order. But why does it hit all buffers and not only the faulty one with the extraneous bytes?

There seems to be one more indication: the hardware is PowerPC, 32-bit. The Mac OS X version is also close to ancient: Mac OS X 10.4 or 10.5 (Tiger or Leopard). On Intel hardware it has not occurred yet…

--
Greetings

Pete

A blizzard is when it snows sideways.