* 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Van L @ 2019-03-04 1:46 UTC
To: help-gnu-emacs

Hello,

From the *scratch* buffer, I look up the keybinding possibilities by

  C-h b

Under the Global Bindings section, the two lines under SPC look to be
encoded in Latin-1. I guess Emacs assumes UTF-8.

The problem is I see \200 \377 and a two-row box having inside of it
3FF F7F, as follows:

-- quote - unknown encoding characters replaced with lookalike sequence
SPC .. ~          self-insert-command
\200 .. 3FF_F7F   self-insert-command
\200 .. \377      self-insert-command
-- quote ends

I know what to do for this kind of situation in EWW: type "E latin-1 RET".

What goes here?

--
© 2019 Van L
gpg using EEF2 37E9 3840 0D5D 9183 251E 9830 384E 9683 B835
"What's so strange when you know that you're a Wizard at 3?" -Joni Mitchell

* Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Eli Zaretskii @ 2019-03-05 16:07 UTC
To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Mon, 04 Mar 2019 12:46:02 +1100
>
> From the *scratch* buffer, I look up the keybinding possibilities by
>
> C-h b
>
> Under the Global Bindings section, the two lines under SPC look to be
> encoded in Latin-1. I guess Emacs assumes UTF-8.

No, this has nothing to do with encoding. This text is produced by
Emacs itself (unlike the previous problem with EWW, where the text came
from an external source), so decoding text is not necessary, because
text generated by Emacs itself and inserted into its buffers is always
in the correct "encoding" (we prefer to call that "representation", to
distinguish between the internal representation of characters in Emacs
buffers and strings, and encoded text outside Emacs).

> The problem is I see \200 \377 and a two row box having inside of it
> 3FF F7F as follows
>
> -- quote - unknown encoding characters replaced with lookalike sequence
> SPC .. ~ self-insert-command
> \200 .. 3FF_F7F self-insert-command
> \200 .. \377 self-insert-command

Yes. This is admittedly confusing, although 100% correct.

To start digging into what happens here, go to each of the 2 \200's and
type "C-u C-x =". You will see that these two look identical on
display, but are actually two very different beasts: the former is a
Unicode character whose codepoint happens to be 200 octal (0x80 in
hex), the latter is a raw byte of the same value. Emacs distinguishes
between them. The confusing bit here is that they are by default both
displayed identically, for dull historical reasons (once upon a time,
Emacs didn't distinguish between them). (Perhaps there's no longer a
reason to use this confusing display nowadays.)

So the first of the above 2 lines stands for all the non-ASCII Unicode
characters, all of which are bound to self-insert-command by default.
The funny display of both ends of that character code range is because
none of the shown codes corresponds to a printable character. In
particular, the \200 codepoint has no printable glyph: U+0080 is a C1
control character, not a graphic character.

By contrast, the second row shows all the raw bytes, which are also
bound to self-insert-command by default.

IOW, unlike the case with EWW showing incorrectly decoded text, here
the issue is with how characters are _displayed_, not how they are
decoded. To change how they look you need to fiddle with display
features, not with decoding features.

And now to your question:

> I know what to do for this kind of situation in EWW, type "E latin-1 RET".
>
> What goes here?

Type

  M-x customize-variable RET glyphless-char-display-control RET

In the buffer this displays, check the box to the left of the
"c1-control" group. This enables the button to the right of the
checkbox; click on it and select the method you want, e.g. "Display
acronym" or "Display hex code in a box". Then click "Apply". This
will change how all the characters in the range [0x80..0x9f] are
displayed.
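
The Customize steps described above can also be approximated from an init file. What follows is only a minimal sketch and is not taken from the thread: it assumes the stock `glyphless-char-display-control` option and goes through `customize-set-variable` so that the option's setter runs. The `format-control` and `no-font` entries restate what is believed to be the default of that era and should be treated as an assumption.

```elisp
;; Sketch: display C1 control characters (U+0080..U+009F) as hex codes
;; in a box instead of octal escapes such as \200.
;; Assumption: the last two entries mirror the option's default value;
;; check `C-h v glyphless-char-display-control' on your own build.
(customize-set-variable
 'glyphless-char-display-control
 '((c1-control     . hex-code)     ; alternatives include `acronym', `empty-box'
   (format-control . thin-space)
   (no-font        . hex-code)))
```

With something like this in effect, the Unicode character U+0080 should no longer be drawn as \200, while the raw byte is expected to keep its octal escape, which makes the two rows of `C-h b` easier to tell apart.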

* Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Van L @ 2019-03-06 0:47 UTC
To: help-gnu-emacs

Eli writes:

>> From the *scratch* buffer, I look up the keybinding possibilities by
>>
>> C-h b
>>
>> Under the Global Bindings section, the two lines under SPC look to be
>> encoded in Latin-1. I guess Emacs assumes UTF-8.
>
> No, this has nothing to do with encoding. This text is produced by
> Emacs itself … the internal representation of characters in Emacs
> buffers and strings

>> \200 .. 3FF_F7F self-insert-command
>> \200 .. \377 self-insert-command
>
> Yes. This is admittedly confusing, although 100% correct.

But. But. But. Less than 100% beautiful. The out-of-ASCII-range row,
terminated by unprintables shown as visually balanced hex values in a
box, would look and feel nicer.

> To start
> digging into what happens here, go to each of the 2 \200's and type
> "C-u C-x =". You will see that these two look identical on display,
> but are actually two very different beasts: the former is a Unicode
> character whose codepoint happens to be 200 octal (0x80 in hex), the
> latter is a raw byte of the same value.

They are born digital homonyms.

> Emacs distinguishes between
> them. The confusing bit here is that they are by default both
> displayed identically,

"C-u C-x =" or M-x describe-char RET puts them in

  category: l:Latin
  category: L:Left-to-right (strong)

> for dull historical reasons (once upon a time,
> Emacs didn't distinguish between them). (Perhaps there's no longer a
> reason to use this confusing display nowadays.)

Wouldn't it be funny to pull on that string? Tied all the way at the
bottom is a boat anchor in the shape of a first-of-its-kind 1950s
Chinese electric computer keyboard, invented and made in the U.S.A.,
which was being considered as a gift to China by the Ike Admin.

> So the first of the above 2 lines stands for all the non-ASCII Unicode
> characters, all of which are bound to self-insert-command by default.

> By contrast, the second row shows all the raw bytes, which are also
> bound to self-insert-command by default.

> IOW, unlike the case with EWW showing incorrectly decoded text, here
> the issue is with how characters are _displayed_,

> And now to your question:
>
>> I know what to do for this kind of situation in EWW, type "E latin-1 RET".
>>
>> What goes here?
>
> Type
>
> M-x customize-variable RET glyphless-char-display-control RET
>

Thank you.

Should I file a bug report for a copy-and-paste inconsistency when
trying to collect the `M-x describe-char' output for the above two in
one buffer?

Highlighting the region and then typing M-w C-y fails, whereas the
middle-mouse-button paste works.

Having done that, attempting to save the buffer presents the following
about problematic characters, which makes sense given the above
explanation:

-- quote
These default coding systems were tried to encode text
in the buffer ‘x’:
  (utf-8 (845 . 4194176) (861 . 4194176) (1376 . 4194176))
However, each of them encountered characters it couldn’t encode:
  utf-8 cannot encode these: \200 \200 \200

Click on a character (or switch to this window by ‘C-x o’
and select the characters by RET) to jump to the place it appears,
where ‘C-u C-x =’ will give information about it.

Select one of the safe coding systems listed below,
or cancel the writing with C-g and edit the buffer
   to remove or modify the problematic characters,
or specify any other coding system (and risk losing
   the problematic characters).

  raw-text no-conversion
-- quote ends

* Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Eli Zaretskii @ 2019-03-06 16:13 UTC
To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Wed, 06 Mar 2019 11:47:29 +1100
>
> >> \200 .. 3FF_F7F self-insert-command
> >> \200 .. \377 self-insert-command
> >
> > Yes. This is admittedly confusing, although 100% correct.
>
> But. But. But. Less than 100% beautiful. The out of ASCII range row
> terminated by unprintables as visually balanced hex values in a box
> would look and feel nicer.

This just uses the default Emacs display of these characters.
Producing some fancy alternatives might be a source of a different
kind of confusion ("why does 'C-h b' show the characters differently
than what they look like in my buffers?").

> "C-u C-x =" or M-x describe-char RET puts them in
>
> category: l:Latin
> category: L:Left-to-right (strong)

You are looking at the wrong parts. Look at the "charset" part.

> Should I file a bug report for copy and paste inconsistency when trying
> to collect in one buffer the `M-x describe-char' output? for the above two.

What inconsistency is that?

> Highlight region then M-w C-y fails

Fails how? It didn't fail for me.

> Select one of the safe coding systems listed below,
> or cancel the writing with C-g and edit the buffer
> to remove or modify the problematic characters,
> or specify any other coding system (and risk losing
> the problematic characters).
>
> raw-text no-conversion

That's because you have raw bytes in the buffer.
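
The distinction between the Unicode character and the raw byte can be checked directly from Lisp. The snippet below is a small sketch rather than anything posted in the thread; `my/char-and-charset` is a made-up helper name. It relies on `unibyte-char-to-multibyte`, which maps a byte to Emacs's internal raw-byte character, and on `char-charset`.

```elisp
;; Sketch: the two "\200" characters have different internal codes.
(defun my/char-and-charset (char)
  "Return a list (CHAR CHARSET) for CHAR.
`my/char-and-charset' is a hypothetical helper, not part of Emacs."
  (list char (char-charset char)))

(let ((unicode-80 #x80)                             ; Unicode U+0080, a C1 control
      (raw-80 (unibyte-char-to-multibyte #x80)))    ; raw byte 0x80
  ;; Which charset is reported for U+0080 can depend on charset priority.
  (message "U+0080:   %S" (my/char-and-charset unicode-80))
  ;; The raw byte is reported as belonging to the `eight-bit' charset.
  (message "raw byte: %S" (my/char-and-charset raw-80))
  ;; Internally the raw byte lives above the Unicode range.
  (message "raw byte code: %d (#x%X)" raw-80 raw-80))
```

The integer printed for the raw byte is 4194176 (#x3FFF80), the same number that appears in the `(845 . 4194176)` entries of the save-time message quoted earlier in the thread, which is consistent with the explanation that the buffer contained raw bytes that utf-8 cannot encode.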

* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-21 12:13 UTC
To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> >> \200 .. 3FF_F7F self-insert-command
>> >> \200 .. \377 self-insert-command

-- [snip]

>> category: l:Latin
>> category: L:Left-to-right (strong)
>
> You are looking at the wrong parts. Look at the "charset" part.

  charset: unicode (Unicode (ISO10646))
  charset: eight-bit (Raw bytes 128-255)

I see. Thanks.

>> Should I file a bug report for copy and paste inconsistency when trying
>> to collect in one buffer the `M-x describe-char' output? for the above two.
>
> What inconsistency is that?

I can copy-and-paste the entire *Help* buffer for \200 unicode. Then
when I try to do the same for \200 eight-bit I experience unexpected
behavior.

> Fails how? It didn't fail for me.

-- A

1. go to the *scratch* buffer
2. C-h b
3. go to the *Help* buffer

-- B

1. search for 'self' in the *Help* buffer
2. C-u C-x =   ,apply to unicode \200 under SPC
3. C-x h       ,highlight all
4. M-w         ,copy highlighted region
5. C-y         ,paste to *scratch* buffer is OK
6. do some random copy-and-paste in the *scratch* buffer and elsewhere

-- C

1. go to the last *Help* buffer (= C-h b)
2. C-u C-x =   ,apply to \200 eight-bit
3. C-x h       ,highlight all
4. M-w         ,copy highlighted region
5. C-y         ,paste to *scratch* buffer FAILS
   (it seems to be an off-by-one issue: the penultimate copy is what gets pasted)

I experience the same copy-and-paste failure on 26.2 RC1, 26.1.92, and
26.1-mac-7.4.

* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-21 14:44 UTC
To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Thu, 21 Mar 2019 23:13:19 +1100
>
> -- C
>
> 1. go to the last *Help* buffer (= C-h b)
> 2. C-u C-x =   ,apply to \200 eight-bit
> 3. C-x h       ,highlight all
> 4. M-w         ,copy highlighted region
> 5. C-y         ,paste to *scratch* buffer FAILS
>    (it seems to be an off-by-one issue: the penultimate copy is what gets pasted)

What exactly does "FAILS" mean here? I may be blind, but this last
C-y does work for me, it pastes a second copy of the \200 description
into *scratch*.

* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-21 22:33 UTC
To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> 5. C-y ,paste to *scratch* buffer is FAILS
>> (it seems to be a one-off issue, the penultimate copy-and-paste operation occurs)
>
> What exactly does "FAILS" mean here? I may be blind, but this last
> C-y does work for me, it pastes a second copy of the \200 description
> into *scratch*.

When I paste I expect the second description of \200 eight-bit to land on
*scratch* buffer. What I get is anything but that. For example, in the
following quote block after //[paste 3] there is no way I can copy and
paste the details of \200 eight-bit to there. What is pasted is an
earlier copy of anything else in the kill ring. If it isn't an Emacs
problem then maybe the clipboard mechanism on XQuartz/darwin is bung.

-- quote
;; This buffer is for text that is not saved, and for Lisp evaluation.
;; To create a file, visit it with C-x C-f and enter text in its buffer.

;; C-u C-x =
;; charset: unicode (Unicode (ISO10646)) //[paste 0]
;; charset: eight-bit (Raw bytes 128-255) //[paste 1]
;; -----
;; position: 10941 of 38231 (29%), column: 0
;; character: (displayed as ) (codepoint 128, #o200, #x80)
;; charset: unicode (Unicode (ISO10646))
;; code point in charset: 0x80
;; syntax: w which means: word
;; category: l:Latin
;; to input: type "C-x 8 RET 80"
;; buffer code: #xC2 #x80
;; file code: #xC2 #x80 (encoded by coding system utf-8-unix)
;; display: by this font (glyph code)
;; xft:-MS -Wingdings-normal-normal-normal-*-15-*-*-*-*-0-iso10646-1 (#x62)
;; Character code properties: customize what to show
;; general-category: Cc (Other, Control)
;; decomposition: (128) ('') //[paste 2]
;; [back]
;; ----- FAIL
;; charset: eight-bit (Raw bytes 128-255) //[paste 3]
-- quote ends

* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-22 7:15 UTC
To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Fri, 22 Mar 2019 09:33:51 +1100
>
> > What exactly does "FAILS" mean here? I may be blind, but this last
> > C-y does work for me, it pastes a second copy of the \200 description
> > into *scratch*.
>
> When I paste I expect the second description of \200 eight-bit to land on
> *scratch* buffer. What I get is anything but that. For example, in the
> following quote block after //[paste 3] there is no way I can copy and
> paste the details of \200 eight-bit to there. What is pasted is an
> earlier copy of anything else in the kill ring. If it isn't an Emacs
> problem then maybe the clipboard mechanism on XQuartz/darwin is bung.

Yes, that could be it.

Is this in "emacs -Q"? If so, do you have some clipboard-handling
application running on your system, which could be causing this?

Failing all of the above, please submit a bug report.
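
One way to narrow down whether the kill ring or the window-system clipboard supplies the unexpected text is to compare the newest kill-ring entry with what a plain yank would use. This is a hedged diagnostic sketch, not something suggested in the thread; both function names are hypothetical. It relies on `current-kill`, which with an argument of 0 first consults `interprogram-paste-function`.

```elisp
;; Sketch: does the next plain C-y use the newest Emacs kill?
(defun my/next-yank-is-last-kill-p ()
  "Return non-nil if the next plain `C-y' would insert the newest Emacs kill.
Hypothetical helper.  Returns nil when the kill ring is empty."
  (when kill-ring
    (let ((head (car kill-ring))
          ;; With N = 0, `current-kill' first asks
          ;; `interprogram-paste-function' (the system clipboard).  If the
          ;; clipboard holds different text, that text is returned and is
          ;; also pushed onto the kill ring as a side effect.
          (next (current-kill 0 t)))
      (equal head next))))

(defun my/report-next-yank ()
  "Show in the echo area whether the next yank matches the newest kill."
  (interactive)
  (message "Next C-y %s the newest Emacs kill"
           (if (my/next-yank-is-last-kill-p) "matches" "does NOT match")))
```

Running `M-x my/report-next-yank` right after `M-w` in step C would be one way to use it: a mismatch hints that the clipboard side is handing back older text, while a match with a wrong paste would point elsewhere. Note that the check assumes the yank pointer has not been rotated with `M-y` in between.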

* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-22 8:35 UTC
To: help-gnu-emacs

>> If it isn't an Emacs
>> problem then maybe the clipboard mechanism on XQuartz/darwin is bung.
>
> Is this in "emacs -Q"?

`emacs -Q` doesn't have the problem.

`git tags/emacs-26.2-rc1-mac-7.5` build doesn't have the problem
without needing the `emacs -Q` start.

AFAIK I've not done anything special to the clipboard.

* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-22 9:10 UTC
To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Fri, 22 Mar 2019 19:35:16 +1100
>
> > Is this in "emacs -Q"?
>
> `emacs -Q` doesn't have the problem.
>
> `git tags/emacs-26.2-rc1-mac-7.5` build doesn't have the problem
> without needing the `emacs -Q` start.
>
> AFAIK I've not done anything special to the clipboard.

Do you have any customizations related to encoding selections?

If nothing else gives a hint, bisect your customizations to find the
culprit.
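
As a starting point for checking selection-related customizations, the current values of a few relevant variables can be listed in both the normal session and an `emacs -Q` session and then compared. The sketch below is illustrative only: `my/selection-settings` is a made-up name, and the variable list is a guess at what may matter here, not a complete or authoritative set.

```elisp
;; Sketch: collect the values of some selection/clipboard variables.
(defun my/selection-settings ()
  "Return an alist of (SYMBOL . VALUE) for some selection-related variables.
Hypothetical helper; unbound variables are skipped."
  (let (result)
    (dolist (sym '(selection-coding-system
                   x-select-request-type
                   select-enable-clipboard
                   select-enable-primary
                   interprogram-cut-function
                   interprogram-paste-function))
      (when (boundp sym)
        (push (cons sym (symbol-value sym)) result)))
    (nreverse result)))

;; Pretty-print the result, e.g. when evaluated in *scratch* or with --batch.
(pp (my/selection-settings))
```

Any entry that differs between the two sessions is a candidate to look at first when bisecting the init file.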

* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-22 14:01 UTC
To: help-gnu-emacs

> Do you have any customizations related to encoding selections?

  LANG=en_AU.UTF-8

> If nothing else gives a hint, bisect your customizations to find the
> culprit.

I use the same .emacs file for parallel running instances of
GNU/Emacs version 26.1, 26.1.92, 26.2-rc1, 27.0.50.

That complicates it. I will give bisecting a try. Thanks.

* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-22 14:43 UTC
To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Sat, 23 Mar 2019 01:01:21 +1100
>
> > Do you have any customizations related to encoding selections?
>
> LANG=en_AU.UTF-8

I don't think this could be the culprit. If it were, we'd have
complaints like yours long ago.

> > If nothing else gives a hint, bisect your customizations to find the
> > culprit.
>
> I use the same .emacs file for parallel running instances of
> GNU/Emacs version 26.1, 26.1.92, 26.2-rc1, 27.0.50.

And the problem happens in only some of those?

> That complicates it. I will give bisecting a try. Thanks.

Thanks.

* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-24 4:34 UTC
To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> I use the same .emacs file for parallel running instances of
>> GNU/Emacs version 26.1, 26.1.92, 26.2-rc1, 27.0.50.
>
> And the problem happens in only some of those?
>

After reboot, using the .emacs file on a single-instance run:

GNU Emacs 26.2 [x86_64-apple-darwin15.6.0]

- first run is OK
- second run fails
- third run fails
- fourth run fails despite `emacs -Q` invocation
  (I sent a bug-report there)

GNU Emacs 26.1.92 [emacs-26.2-rc-rc1-mac-7.5, x86_64-apple-darwin15.6.0]

- first run is OK
- second run is OK

GNU Emacs 26.1 [x86_64--netbsd]

- first run fails

end of thread (newest message: 2019-03-24 4:34 UTC)

Thread overview: 13+ messages

2019-03-04  1:46 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help* Van L
2019-03-05 16:07 ` Eli Zaretskii
2019-03-06  0:47   ` Van L
2019-03-06 16:13     ` Eli Zaretskii
2019-03-21 12:13       ` 26.2 RC1 copy-and-paste fail Van L
2019-03-21 14:44         ` Eli Zaretskii
2019-03-21 22:33           ` Van L
2019-03-22  7:15             ` Eli Zaretskii
2019-03-22  8:35               ` Van L
2019-03-22  9:10                 ` Eli Zaretskii
2019-03-22 14:01             ` Van L
2019-03-22 14:43               ` Eli Zaretskii
2019-03-24  4:34                 ` Van L