From: Van L <van@scratch.space>
To: help-gnu-emacs@gnu.org
Subject: Re: 26.1.92, 26.1-mac-7.4; unrecognised escaped chars in *Help*
Date: Wed, 06 Mar 2019 11:47:29 +1100
Message-ID: <m28sxs3k2m.fsf@scratch.space>
In-Reply-To: 831s3le24y.fsf@gnu.org

Eli writes:

>> From the *scratch* buffer, I look up the keybinding possibilities by
>> 
>>   C-h b
>> 
>> Under the Global Bindings section, the two lines under SPC look to be
>> encoded in Latin-1. I guess Emacs assumes UTF-8.
>
> No, this has nothing to do with encoding.  This text is produced by
> Emacs itself … the internal representation of characters in Emacs
> buffers and strings

>> \200 .. 3FF_F7F	self-insert-command
>> \200 .. \377	self-insert-command
>
> Yes.  This is admittedly confusing, although 100% correct.

But. But. But. Less than 100% beautiful. The row for the range beyond
ASCII, with its unprintable endpoints shown as visually balanced hex
values in a box, would look and feel nicer.

> To start
> digging into what happens here, go to each of the 2 \200's and type
> "C-u C-x =".  You will see that these two look identically on display,
> but are actually two very different beasts: the former is a Unicode
> character whose codepoint happens to be 200 octal (0x80 in hex), the
> latter is a raw byte of the same value.

They are born digital homonyms.
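
For the record, a minimal Emacs Lisp sketch of the distinction, to be
evaluated one form at a time with C-x C-e in *scratch*. It assumes that
unibyte-char-to-multibyte reports the internal code Emacs uses for a raw
byte; the results in the comments are what I expect on a 26.x build, not
something checked on every version.

```elisp
;; The Unicode character U+0080 is simply the integer #x80 (200 octal).
(= ?\u0080 #x80)                      ; => t

;; The raw byte #x80, once it lives in a multibyte buffer or string,
;; is represented by a different and much larger character code.
(unibyte-char-to-multibyte #x80)      ; => 4194176 (#x3fff80)

;; Converting back recovers the original byte value.
(multibyte-char-to-unibyte 4194176)   ; => 128 (#x80)
```

The second result, 4194176, is the same number that shows up in the
coding-system complaint quoted further down, which suggests those
positions hold raw bytes rather than U+0080.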

> Emacs distinguishes between
> them.  The confusing bit here is that they are by default both
> displayed identically, 

"C-u C-x =" or M-x describe-char RET puts them in

    category: l:Latin
    category: L:Left-to-right (strong)

> for dull historical reasons (once upon a time,
> Emacs didn't distinguish between them).  (Perhaps there's no longer a
> reason to use this confusing display nowadays.)

Wouldn't it be funny to pull on that string? Tied to the far end is a
boat anchor in the shape of a first-of-its-kind 1950s Chinese electric
computer keyboard, invented and made in the U.S.A., which the Ike
administration was considering as a gift to China.

> So the first of the above 2 lines stands for all the non-ASCII Unicode
> characters, all of which are bound to self-insert-command by default.

> By contrast, the second row shows all the raw bytes, which are also
> bound to self-insert-command by default.

> IOW, unlike the case with EWW showing incorrectly decoded text, here
> the issue is with how characters are _displayed_, 

> And now to your question:
>
>> I know what to do for this kind of situation in EWW, type "E latin-1 RET".
>> 
>> What goes here?
>
> Type
>
>   M-x customize-variable RET glyphless-char-display-control RET
>

Thank you.
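
For an init file, a hedged non-interactive equivalent might look like
the sketch below. It assumes the `c1-control' group covers U+0080
through U+009F and that `hex-code' is an accepted display method, as I
read the variable's docstring; the first two entries are what I believe
the defaults to be, so check them against your own value first.

```elisp
;; Keep the (assumed) default entries and add one for the C1 controls,
;; so that U+0080 is drawn as a boxed hex code instead of the octal
;; escape \200 that raw bytes also use.
(customize-set-variable
 'glyphless-char-display-control
 '((format-control . thin-space)
   (no-font . hex-code)
   (c1-control . hex-code)))
```

Going through customize-set-variable rather than a plain setq is
deliberate: the option has a setter that refreshes the display table,
and a bare setq would bypass it.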

Should I file a bug report about a copy-and-paste inconsistency? I was
trying to collect the `M-x describe-char' output for the above two
characters in one buffer.

Highlighting the region and then typing M-w C-y fails, whereas pasting
with the middle mouse button works.

Having done that, attempting to save the buffer produces the following
complaint about the problematic characters, which makes sense given the
above explanation:

-- quote
These default coding systems were tried to encode text
in the buffer ‘x’:
  (utf-8 (845 . 4194176) (861 . 4194176) (1376 . 4194176))
However, each of them encountered characters it couldn’t encode:
  utf-8 cannot encode these: \200 \200 \200

Click on a character (or switch to this window by ‘C-x o’
and select the characters by RET) to jump to the place it appears,
where ‘C-u C-x =’ will give information about it.

Select one of the safe coding systems listed below,
or cancel the writing with C-g and edit the buffer
   to remove or modify the problematic characters,
or specify any other coding system (and risk losing
   the problematic characters).

  raw-text no-conversion

-- quote ends

-- 
© 2019 Van L
gpg using EEF2 37E9 3840 0D5D 9183  251E 9830 384E 9683 B835
"What's so strange when you know that you're a Wizard at 3?" -Joni Mitchell



