From: Eli Zaretskii <eliz@gnu.org>
To: help-gnu-emacs@gnu.org
Subject: Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
Date: Tue, 05 Mar 2019 18:07:09 +0200
Message-ID: <831s3le24y.fsf@gnu.org>
In-Reply-To: <m2r2bnpg2t.fsf@scratch.space> (message from Van L on Mon, 04 Mar 2019 12:46:02 +1100)

> From: Van L <van@scratch.space>
> Date: Mon, 04 Mar 2019 12:46:02 +1100
> 
> From the *scratch* buffer, I look up the keybinding possibilities by
> 
>   C-h b
> 
> Under the Global Bindings section, the two lines under SPC look to be
> encoded in Latin-1. I guess Emacs assumes UTF-8.

No, this has nothing to do with encoding.  This text is produced by
Emacs itself (unlike the previous problem with EWW, where the text
came from an external source), so no decoding is necessary: text
generated by Emacs itself and inserted into its buffers is always in
the correct "encoding".  (We prefer to call that "representation", to
distinguish between the internal representation of characters in
Emacs buffers and strings, and encoded text outside Emacs.)

> The problem is I see \200 \377 and a two row box having inside of it
> 3FF F7F as follows
> 
> -- quote - unknown encoding characters replaced with lookalike sequence
> SPC .. ~	self-insert-command
> \200 .. 3FF_F7F	self-insert-command
> \200 .. \377	self-insert-command

Yes.  This is admittedly confusing, although 100% correct.  To start
digging into what happens here, go to each of the two \200's and type
"C-u C-x =".  You will see that the two look identical on display,
but are actually very different beasts: the former is a Unicode
character whose codepoint happens to be 200 octal (0x80 in hex), the
latter is a raw byte of the same value.  Emacs distinguishes between
them.  The confusing bit here is that by default both are displayed
identically, for dull historical reasons (once upon a time, Emacs
didn't distinguish between them).  (Perhaps there's no longer a
reason to use this confusing display nowadays.)
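
The same distinction can be seen from Lisp, e.g. by evaluating
something like the following in *scratch*.  This is only a rough
sketch; the variable names are made up for illustration:

```elisp
;; Compare the Unicode character with code 0x80 and the raw byte 0x80.
(let* ((uni #x80)                              ; the character U+0080
       (raw (unibyte-char-to-multibyte #x80))) ; the raw byte 0x80 as a character
  (list (= uni raw)                         ; are the two character codes equal?
        (format "#x%X" uni)                 ; code of the Unicode character
        (format "#x%X" raw)                 ; internal code Emacs uses for the raw byte
        (multibyte-char-to-unibyte raw)))   ; convert back to the byte value
```

The raw byte gets an internal code well above the Unicode range,
which is why Emacs can tell the two apart even though they are
displayed the same way.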

So the first of the above 2 lines stands for all the non-ASCII Unicode
characters, all of which are bound to self-insert-command by default.
The funny display of both ends of that character code range is because
neither of the shown codes corresponds to a printable character.  In
particular, the \200 codepoint (U+0080) is a C1 control character,
which has no graphic form of its own.

By contrast, the second row shows all the raw bytes, which are also
bound to self-insert-command by default.

IOW, unlike the case with EWW showing incorrectly decoded text, here
the issue is with how characters are _displayed_, not how they are
decoded.  To change how they look you need to fiddle with display
features, not with decoding features.

And now to your question:

> I know what to do for this kind of situation in EWW, type "E latin-1 RET".
> 
> What goes here?

Type

  M-x customize-variable RET glyphless-char-display-control RET

In the buffer this displays, check the box to the left of the
"c1-control" group.  This enables the button to the right of the
checkbox; click on it and select the method you want, e.g. "Display
acronym" or "Display hex code in a box".  Then click "Apply".  This
will change how all the characters in the range [0x80..0x9f] are
displayed.
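
If you prefer to do this from your init file instead of the Customize
buffer, something along these lines should have the same effect.
Treat it as a sketch, not as the recommended way: it assumes you want
the hex-code method, and that the other groups in the list should
stay as they are:

```elisp
;; Show C1 control characters (0x80..0x9f) as a hex code in a box,
;; keeping whatever is configured for the other groups.
;; customize-set-variable is used so that the variable's setter runs
;; and the display tables are actually updated.
(customize-set-variable
 'glyphless-char-display-control
 (cons '(c1-control . hex-code)
       ;; assq-delete-all is destructive, so work on a copy.
       (assq-delete-all 'c1-control
                        (copy-alist glyphless-char-display-control))))
```

Replace hex-code with acronym if you prefer the "Display acronym"
method.  Note that this affects only the Unicode characters in that
range, not the raw bytes.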


