unofficial mirror of help-gnu-emacs@gnu.org
* 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Van L @ 2019-03-04  1:46 UTC
  To: help-gnu-emacs


Hello,

From the *scratch* buffer, I look up the keybinding possibilities by

  C-h b

Under the Global Bindings section, the two lines under SPC look to be
encoded in Latin-1. I guess Emacs assumes UTF-8. The problem is I see
\200 \377 and a two row box having inside of it 3FF F7F as follows

-- quote - unknown encoding characters replaced with lookalike sequence
SPC .. ~	self-insert-command
\200 .. 3FF_F7F	self-insert-command
\200 .. \377	self-insert-command

-- quote ends

I know what to do for this kind of situation in EWW, type "E latin-1 RET".

What goes here?

-- 
© 2019 Van L
gpg using EEF2 37E9 3840 0D5D 9183  251E 9830 384E 9683 B835
"What's so strange when you know that you're a Wizard at 3?" -Joni Mitchell




* Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Eli Zaretskii @ 2019-03-05 16:07 UTC
  To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Mon, 04 Mar 2019 12:46:02 +1100
> 
> From the *scratch* buffer, I look up the keybinding possibilities by
> 
>   C-h b
> 
> Under the Global Bindings section, the two lines under SPC look to be
> encoded in Latin-1. I guess Emacs assumes UTF-8.

No, this has nothing to do with encoding.  This text is produced by
Emacs itself (unlike the previous problem with EWW, where the text
came from an external source), so decoding text is not necessary,
because text generated by Emacs itself and inserted into its buffers
is always in the correct "encoding" (we prefer to call that
"representation", to distinguish between the internal representation
of characters in Emacs buffers and strings, and encoded text outside
Emacs).

> The problem is I see \200 \377 and a two row box having inside of it
> 3FF F7F as follows
> 
> -- quote - unknown encoding characters replaced with lookalike sequence
> SPC .. ~	self-insert-command
> \200 .. 3FF_F7F	self-insert-command
> \200 .. \377	self-insert-command

Yes.  This is admittedly confusing, although 100% correct.  To start
digging into what happens here, go to each of the 2 \200's and type
"C-u C-x =".  You will see that these two look identical on display,
but are actually two very different beasts: the former is a Unicode
character whose codepoint happens to be 200 octal (0x80 in hex), the
latter is a raw byte of the same value.  Emacs distinguishes between
them.  The confusing bit here is that they are by default both
displayed identically, for dull historical reasons (once upon a time,
Emacs didn't distinguish between them).  (Perhaps there's no longer a
reason to use this confusing display nowadays.)
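
For anyone who would rather check this from Lisp than with "C-u C-x =",
here is a minimal sketch to evaluate in *scratch*.  It assumes a
multibyte context (the default for ordinary buffers); the variable names
are only illustrative.

```elisp
;; Two different characters that both display as \200 by default.
(let ((unicode-200 #x80)                           ; the Unicode character U+0080
      (raw-200 (unibyte-char-to-multibyte #x80)))  ; the raw byte 0x80
  (list (format "#x%X" unicode-200)   ; "#x80"
        (format "#x%X" raw-200)       ; "#x3FFF80"
        (= unicode-200 raw-200)       ; nil: Emacs keeps them distinct
        ;; Charsets of the two, i.e. what describe-char lists as "charset:".
        (char-charset unicode-200)
        (char-charset raw-200)))
```

The raw byte lives at a character code far above the Unicode range,
which is why the two never compare equal even though they look the same.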

So the first of the above 2 lines stands for all the non-ASCII Unicode
characters, all of which are bound to self-insert-command by default.
The funny display of both ends of that character code range is because
none of the shown codes corresponds to a printable character.  In
particular, the \200 codepoint is a C1 control character, i.e. there's
no printable character whose Unicode codepoint is 0x80.

By contrast, the second row shows all the raw bytes, which are also
bound to self-insert-command by default.
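
As a rough illustration of where the endpoints of those two rows come
from, the following sketch can be evaluated in *scratch*.  It relies
only on the character codes Emacs uses for raw bytes (#x3FFF80 through
#x3FFFFF); treat it as a reading aid, not as a description of how
"C-h b" builds its listing.

```elisp
;; Endpoints of the two rows under SPC, expressed as character codes.
(list (max-char)                             ; 4194303, i.e. #x3FFFFF
      (format "#x%X" (- (max-char) 128))     ; "#x3FFF7F", the boxed 3FF F7F
      (multibyte-char-to-unibyte #x3FFF80)   ; 128, displayed as \200
      (multibyte-char-to-unibyte #x3FFFFF))  ; 255, displayed as \377
```

So the first row runs from U+0080 up to the last code below the raw
bytes, and the second row covers the 128 raw-byte codes themselves.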

IOW, unlike the case with EWW showing incorrectly decoded text, here
the issue is with how characters are _displayed_, not how they are
decoded.  To change how they look you need to fiddle with display
features, not with decoding features.

And now to your question:

> I know what to do for this kind of situation in EWW, type "E latin-1 RET".
> 
> What goes here?

Type

  M-x customize-variable RET glyphless-char-display-control RET

In the buffer this displays, check the box to the left of the
"c1-control" group.  This enables the button to the right of the
checkbox; click on it and select the method you want, e.g. "Display
acronym" or "Display hex code in a box".  Then click "Apply".  This
will change how all the characters in the range [0x80..0x9f] are
displayed.
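
If you prefer to do the same thing from your init file, something like
the following should be roughly equivalent.  This is a sketch, not a
tested recipe: it assumes you want the `hex-code' method ("Display hex
code in a box"; use `acronym' for "Display acronym") and that the other
entries of the variable should be left as they are.

```elisp
;; Set the display method for the c1-control group (U+0080..U+009F),
;; keeping whatever is configured for the other groups.
;; customize-set-variable is used instead of setq so that the
;; variable's :set function runs and the change takes effect.
(customize-set-variable
 'glyphless-char-display-control
 (cons '(c1-control . hex-code)
       (assq-delete-all 'c1-control
                        (copy-alist glyphless-char-display-control))))
```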



* Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Van L @ 2019-03-06  0:47 UTC
  To: help-gnu-emacs

Eli writes:

>> From the *scratch* buffer, I look up the keybinding possibilities by
>> 
>>   C-h b
>> 
>> Under the Global Bindings section, the two lines under SPC look to be
>> encoded in Latin-1. I guess Emacs assumes UTF-8.
>
> No, this has nothing to do with encoding.  This text is produced by
> Emacs itself … the internal representation of characters in Emacs
> buffers and strings

>> \200 .. 3FF_F7F	self-insert-command
>> \200 .. \377	self-insert-command
>
> Yes.  This is admittedly confusing, although 100% correct.

But. But. But. Less than 100% beautiful. The non-ASCII row would look
and feel nicer if its unprintable endpoints were both shown as visually
balanced hex values in a box.

> To start
> digging into what happens here, go to each of the 2 \200's and type
> "C-u C-x =".  You will see that these two look identical on display,
> but are actually two very different beasts: the former is a Unicode
> character whose codepoint happens to be 200 octal (0x80 in hex), the
> latter is a raw byte of the same value.

They are born digital homonyms.

> Emacs distinguishes between
> them.  The confusing bit here is that they are by default both
> displayed identically, 

"C-u C-x =" or M-x describe-char RET puts them in

    category: l:Latin
    category: L:Left-to-right (strong)

> for dull historical reasons (once upon a time,
> Emacs didn't distinguish between them).  (Perhaps there's no longer a
> reason to use this confusing display nowadays.)

Wouldn't it be funny to pull on that string? all the way to the bottom
is tied a boat anchor in the shape of a first of its kind 1950s Chinese
electric computer keyboard invented and made in the U.S.A. which was
being considered a gift to China by the Ike Admin.

> So the first of the above 2 lines stands for all the non-ASCII Unicode
> characters, all of which are bound to self-insert-command by default.

> By contrast, the second row shows all the raw bytes, which are also
> bound to self-insert-command by default.

> IOW, unlike the case with EWW showing incorrectly decoded text, here
> the issue is with how characters are _displayed_, 

> And now to your question:
>
>> I know what to do for this kind of situation in EWW, type "E latin-1 RET".
>> 
>> What goes here?
>
> Type
>
>   M-x customize-variable RET glyphless-char-display-control RET
>

Thank you.

Should I file a bug report for a copy-and-paste inconsistency I hit when
trying to collect the `M-x describe-char' output for the above two in one buffer?

Highlight region then M-w C-y fails
whereas the middle-mouse button
paste works.

Having done that, attempting to save the buffer presents the
following message about the problematic characters, which makes sense
given the above explanation:

-- quote
These default coding systems were tried to encode text
in the buffer ‘x’:
  (utf-8 (845 . 4194176) (861 . 4194176) (1376 . 4194176))
However, each of them encountered characters it couldn’t encode:
  utf-8 cannot encode these: \200 \200 \200

Click on a character (or switch to this window by ‘C-x o’
and select the characters by RET) to jump to the place it appears,
where ‘C-u C-x =’ will give information about it.

Select one of the safe coding systems listed below,
or cancel the writing with C-g and edit the buffer
   to remove or modify the problematic characters,
or specify any other coding system (and risk losing
   the problematic characters).

  raw-text no-conversion

-- quote ends

-- 
© 2019 Van L
gpg using EEF2 37E9 3840 0D5D 9183  251E 9830 384E 9683 B835
"What's so strange when you know that you're a Wizard at 3?" -Joni Mitchell




* Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
From: Eli Zaretskii @ 2019-03-06 16:13 UTC
  To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Wed, 06 Mar 2019 11:47:29 +1100
> 
> >> \200 .. 3FF_F7F	self-insert-command
> >> \200 .. \377	self-insert-command
> >
> > Yes.  This is admittedly confusing, although 100% correct.
> 
> But. But. But. Less than 100% beautiful. The non-ASCII row would look
> and feel nicer if its unprintable endpoints were both shown as visually
> balanced hex values in a box.

This just uses the default Emacs display of these characters.
Producing some fancy alternatives might be a source of a different kind
of confusion ("why does 'C-h b' show the characters differently than
what they look like in my buffers?").

> "C-u C-x =" or M-x describe-char RET puts them in
> 
>     category: l:Latin
>     category: L:Left-to-right (strong)

You are looking at the wrong parts.  Look at the "charset" part.

> Should I file a bug report for a copy-and-paste inconsistency I hit when
> trying to collect the `M-x describe-char' output for the above two in one buffer?

What inconsistency is that?

> Highlight region then M-w C-y fails

Fails how?  It didn't fail for me.

> Select one of the safe coding systems listed below,
> or cancel the writing with C-g and edit the buffer
>    to remove or modify the problematic characters,
> or specify any other coding system (and risk losing
>    the problematic characters).
> 
>   raw-text no-conversion

That's because you have raw bytes in the buffer.
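
To locate them, the positions in the warning can be visited by hand, or
a small helper along these lines can be used.  It is a hypothetical
sketch (the name `my/next-raw-byte' is made up) and it assumes the
buffer is multibyte, where raw bytes have character codes of #x3FFF80
and above, matching the 4194176 shown in the message.

```elisp
(defun my/next-raw-byte ()
  "Move point to the next raw byte at or after point, if there is one.
Return its position, or nil when none is found.
Hypothetical helper; assumes the current buffer is multibyte."
  (interactive)
  (let ((pos (point))
        (found nil))
    ;; Scan forward one character at a time until a raw byte is seen.
    (while (and (not found) (< pos (point-max)))
      (if (>= (char-after pos) #x3FFF80)
          (setq found pos)
        (setq pos (1+ pos))))
    (if found
        (goto-char found)
      (message "No raw bytes at or after point"))
    found))
```

After `M-x my/next-raw-byte', "C-u C-x =" on the character at point
should report the eight-bit charset; deleting or replacing those
characters lets the buffer be saved as utf-8.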



* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-21 12:13 UTC
  To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> >> \200 .. 3FF_F7F	self-insert-command
>> >> \200 .. \377	self-insert-command

-- [snip]

>>     category: l:Latin
>>     category: L:Left-to-right (strong)
>
> You are looking at the wrong parts.  Look at the "charset" part.

               charset: unicode (Unicode (ISO10646))
               charset: eight-bit (Raw bytes 128-255)

I see. Thanks.

>
>> Should I file a bug report for a copy-and-paste inconsistency I hit when
>> trying to collect the `M-x describe-char' output for the above two in one buffer?
>
> What inconsistency is that?

I can copy-and-paste the entire *Help* buffer for \200 unicode.
Then when I try to do the same to \200 eight-bit I experience unexpected behavior.

> Fails how?  It didn't fail for me.

-- A

1. goto *scratch* buffer
2. C-h b
3. goto *Help* buffer

-- B

1. search for 'self' in *Help* buffer
2. C-u C-x = ,apply to unicode \200 under SPC 
3. C-x h ,highlight all
4. M-w ,copy highlight region
5. C-y ,paste to *scratch* buffer is OK
6. do some random copy-and-paste in *scratch* buffer and elsewhere

-- C

1. go to the last *Help* buffer (= C-h b)
2. C-u C-x = ,apply to \200 eight-bit
3. C-x h ,highlight all
4. M-w ,copy highlight region
5. C-y ,paste to *scratch* buffer FAILS
   (it seems to be an off-by-one issue: the penultimate copy is what gets pasted)

I experience the same copy-and-paste failure on
26.2 RC1, 26.1.92, 26.1-mac-7.4.

-- 
© 2019 Van L
gpg using EEF2 37E9 3840 0D5D 9183  251E 9830 384E 9683 B835
"What's so strange when you know that you're a Wizard at 3?" -Joni Mitchell




* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-21 14:44 UTC
  To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Thu, 21 Mar 2019 23:13:19 +1100
> 
> -- C
> 
> 1. go to the last *Help* buffer (= C-h b)
> 2. C-u C-x = ,apply to \200 eight-bit
> 3. C-x h ,highlight all
> 4. M-w ,copy highlight region
> 5. C-y ,paste to *scratch* buffer FAILS
>    (it seems to be an off-by-one issue: the penultimate copy is what gets pasted)

What exactly does "FAILS" mean here?  I may be blind, but this last
C-y does work for me, it pastes a second copy of the \200 description
into *scratch*.



* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-21 22:33 UTC
  To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> 5. C-y ,paste to *scratch* buffer FAILS
>>    (it seems to be an off-by-one issue: the penultimate copy is what gets pasted)
>
> What exactly does "FAILS" mean here?  I may be blind, but this last
> C-y does work for me, it pastes a second copy of the \200 description
> into *scratch*.

When I paste I expect the second description of \200 eight-bit to land on
*scratch* buffer. What I get is anything but that. For example, in the
following quote block after //[paste 3] there is no way I can copy and
paste the details of \200 eight-bit to there. What is pasted is an
earlier copy of anything else in the kill ring. If it isn't an Emacs
problem then maybe the clipboard mechanism on XQuartz/darwin is bung.

-- quote
;; This buffer is for text that is not saved, and for Lisp evaluation.
;; To create a file, visit it with C-x C-f and enter text in its buffer.

;; C-u C-x =

;;               charset: unicode (Unicode (ISO10646))  //[paste 0]
;;               charset: eight-bit (Raw bytes 128-255) //[paste 1]

;; -----

;;              position: 10941 of 38231 (29%), column: 0
;;             character: € (displayed as €) (codepoint 128, #o200, #x80)
;;               charset: unicode (Unicode (ISO10646))
;; code point in charset: 0x80
;;                syntax: w 	which means: word
;;              category: l:Latin
;;              to input: type "C-x 8 RET 80"
;;           buffer code: #xC2 #x80
;;             file code: #xC2 #x80 (encoded by coding system utf-8-unix)
;;               display: by this font (glyph code)
;;     xft:-MS  -Wingdings-normal-normal-normal-*-15-*-*-*-*-0-iso10646-1 (#x62)

;; Character code properties: customize what to show
;;   general-category: Cc (Other, Control)
;;   decomposition: (128) ('€')                      //[paste 2]

;; [back]

;; ----- FAIL

;; charset: eight-bit (Raw bytes 128-255)             //[paste 3]

-- quote ends





* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-22  7:15 UTC
  To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Fri, 22 Mar 2019 09:33:51 +1100
> 
> > What exactly does "FAILS" mean here?  I may be blind, but this last
> > C-y does work for me, it pastes a second copy of the \200 description
> > into *scratch*.
> 
> When I paste I expect the second description of \200 eight-bit to land on
> *scratch* buffer. What I get is anything but that. For example, in the
> following quote block after //[paste 3] there is no way I can copy and
> paste the details of \200 eight-bit to there. What is pasted is an
> earlier copy of anything else in the kill ring. If it isn't an Emacs
> problem then maybe the clipboard mechanism on XQuartz/darwin is bung.

Yes, that could be it.  Is this in "emacs -Q"?  If so, do you have
some clipboard-handling application running on your system, which
could be causing this?

Failing all of the above, please submit a bug report.



* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-22  8:35 UTC
  To: help-gnu-emacs


>>                                               . If it isn't an Emacs
>> problem then maybe the clipboard mechanism on XQuartz/darwin is bung.
>
> Is this in "emacs -Q"?

`emacs -Q` doesn't have the problem.

`git tags/emacs-26.2-rc1-mac-7.5` build doesn't have the problem
without needing the `emacs -Q` start.

AFAIK I've not done anything special to the clipboard.




* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-22  9:10 UTC
  To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Fri, 22 Mar 2019 19:35:16 +1100
> 
> > Is this in "emacs -Q"?
> 
> `emacs -Q` doesn't have the problem.
> 
> `git tags/emacs-26.2-rc1-mac-7.5` build doesn't have the problem
> without needing the `emacs -Q` start.
> 
> AFAIK I've not done anything special to the clipboard.

Do you have any customizations related to encoding selections?

If nothing else gives a hint, bisect your customizations to find the
culprit.
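
As a starting point before bisecting, it may help to see what the
selection-related variables are set to in the misbehaving session.  The
list below is only a guess at the usual suspects, not an exhaustive or
authoritative set; evaluate it in *scratch* and compare the *Messages*
output with what "emacs -Q" shows.

```elisp
;; Print the current values of some selection/clipboard variables.
;; The choice of variables is an assumption about likely culprits.
(dolist (var '(select-enable-clipboard
               select-enable-primary
               selection-coding-system
               x-select-request-type
               save-interprogram-paste-before-kill
               interprogram-cut-function
               interprogram-paste-function))
  (when (boundp var)
    (message "%s = %S" var (symbol-value var))))
```

Any value that differs between the two sessions is a reasonable first
candidate to look for in the init file.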



* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-22 14:01 UTC
  To: help-gnu-emacs


> Do you have any customizations related to encoding selections?

LANG=en_AU.UTF-8

> If nothing else gives a hint, bisect your customizations to find the
> culprit.

I use the same .emacs file for parallel running instances of
GNU/Emacs version 26.1, 26.1.92, 26.2-rc1, 27.0.50. That complicates
it. I will give bisecting a try. Thanks.




* Re: 26.2 RC1 copy-and-paste fail
From: Eli Zaretskii @ 2019-03-22 14:43 UTC
  To: help-gnu-emacs

> From: Van L <van@scratch.space>
> Date: Sat, 23 Mar 2019 01:01:21 +1100
> 
> > Do you have any customizations related to encoding selections?
> 
> LANG=en_AU.UTF-8

I don't think this could be the culprit.  If it were, we'd have
complaints like yours long ago.

> > If nothing else gives a hint, bisect your customizations to find the
> > culprit.
> 
> I use the same .emacs file for parallel running instances of
> GNU/Emacs version 26.1, 26.1.92, 26.2-rc1, 27.0.50.

And the problem happens in only some of those?

> That complicates it. I will give bisecting a try. Thanks.

Thanks.



* Re: 26.2 RC1 copy-and-paste fail
From: Van L @ 2019-03-24  4:34 UTC
  To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> I use the same .emacs file for parallel running instances of
>> GNU/Emacs version 26.1, 26.1.92, 26.2-rc1, 27.0.50.
>
> And the problem happens in only some of those?
>

After reboot, using .emacs file on single instance run:

GNU Emacs 26.2 [x86_64-apple-darwin15.6.0]
- first  run is OK
- second run fails
- third  run fails
- fourth run fails despite `emacs -Q` invocation (I sent a bug-report there)

GNU Emacs 26.1.92 [emacs-26.2-rc-rc1-mac-7.5, x86_64-apple-darwin15.6.0]
- first  run is OK
- second run is OK

GNU Emacs 26.1 [x86_64--netbsd]
- first run fails



