Terminal or keyboard decoding system?

unofficial mirror of emacs-devel@gnu.org 

* Terminal or keyboard decoding system?
@ 2021-09-19  3:12 Ravi R
  2021-09-19 23:23 ` Stefan Monnier
  0 siblings, 1 reply; 4+ messages in thread
From: Ravi R @ 2021-09-19  3:12 UTC
  To: emacs-devel

My attempt to use all the standard emacs modifiers (shift, control,
meta, alt, super, hyper) on *terminal* emacs has been fairly successful,
but it's not clear to me that my current approach of modifying
input-decode-map and key-translation-map is the correct one, as it seems
to have some pitfalls I do not understand. Please help me identify the
correct way to decode these terminal sequences.

My terminal emulator of choice, kitty, now provides an unambiguous
mechanism [1] for sending all the modifier keys listed above for any key
press. The protocol is straightforward; every key (without exception) is
reported to the application in one of the following forms:
  CSI number ; modifier u
  CSI 1 ; modifier suffix
where spaces have been added above for clarity, but are not part of the
protocol.
  CSI: "\e[" (as an emacs string)
  number: decimal number in ASCII, e.g., two byte string "97" for ?a
  modifier: decimal number in ASCII representing a bit-field (see below)
  suffix: one character for some specific keys
The modifier decimal number minus one is a bit field with one bit per
modifier, e.g., shift is bit 0 (the LSB), alt is bit 1, etc.
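
As a rough illustration of the modifier field (a sketch only, not the
code at [2]): the helper below turns the decimal value from the wire
into a list of modifier names. The names my-kitty-modifier-bits and
my-kitty-decode-modifiers are made up for this example, and the bit
values for ctrl (4) and super (8) are taken from the kitty protocol
document [1] rather than from the description above.

```elisp
;; Hypothetical table: (KITTY-BIT . NAME), lowest bit first.
(defconst my-kitty-modifier-bits
  '((1 . shift) (2 . alt) (4 . control) (8 . super) (16 . hyper) (32 . meta))
  "Modifier bits of the kitty keyboard protocol, as assumed in this sketch.")

(defun my-kitty-decode-modifiers (modifier-field)
  "Return the modifier names encoded in MODIFIER-FIELD.
MODIFIER-FIELD is the decimal number as sent by the terminal,
i.e. the bit field plus one."
  (let ((bits (1- modifier-field))
        (result nil))
    ;; Collect the name of every bit that is set, lowest bit first.
    (dolist (pair my-kitty-modifier-bits (nreverse result))
      (unless (zerop (logand bits (car pair)))
        (push (cdr pair) result)))))
```

With this, (my-kitty-decode-modifiers 51) evaluates to (alt hyper meta),
and a field of 1 (no modifiers) gives nil.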

My current approach [2] is to bind the CSI sequence to the following
function in input-decode-map:

(fset 'xterm-kitty--original-read-char-exclusive (symbol-function 'read-char-exclusive))
(defun xterm-kitty--handle-escape-code (prompt)
  "Handle keycode using integer math; PROMPT is ignored."
  (let ((keycode 0)
        (modifiers 0)
        (suffix nil)
        (current-num 0)
        (e))
    ;; Read characters until the terminating (non-digit, non-`;') suffix.
    (while (not suffix)
      (setq e (xterm-kitty--original-read-char-exclusive))
      (if (<= ?0 e ?9)
          ;; Accumulate the decimal number currently being read.
          (setq current-num (+ (* current-num 10) (- e ?0)))
        (if (eql e ?\;)
            ;; First field complete: it is the key code.
            (setq keycode current-num
                  current-num 0)
          (setq suffix e)
          (if (> keycode 0)
              ;; A key code was already seen, so this number is the
              ;; modifier field, which is sent as bit field plus one.
              (setq modifiers (1- current-num))
            ;; No `;' was seen: the only number is the key code.
            (setq keycode current-num)))))
    ;; (message "Code: %d modifiers %d suffix: %s" keycode modifiers suffix)
    (xterm-kitty-decode-key-stroke keycode modifiers suffix)))

where xterm-kitty-decode-key-stroke creates the appropriate emacs key
sequence, e.g., "\e[97;51u" reaches it as (97 50 ?u), which is then
mapped to (kbd "H-M-A-a") since 51-1 = 50 = 32(meta)+16(hyper)+2(alt).
This approach has three major pitfalls:
  1. read-char-exclusive can be advised by packages
  2. C-DEL has some issues
  3. Standard C-, M-, C-M- bindings cannot distinguish between shifted
     and non-shifted variants for characters: abcdefghjklnopqrstuvwxyz\
I don't understand the "C-DEL" issue at all, and work around the
shifted/non-shifted bindings via the following hacks:
  a. Mapping shifted to non-shifted versions in key-translation-map
  b. Add any non-shifted personal bindings (e.g., to "C-A" rather than
     "C-a") in local-function-key-map (really ugly)
The first issue, read-char-exclusive being advised by packages, cannot
be solved at all, as far as I can tell.
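
The modifier arithmetic for "\e[97;51u" can be sketched as follows.
This is not xterm-kitty-decode-key-stroke from [2], only an illustration
with made-up names (my-kitty-emacs-modifier-bits, my-kitty-make-event).
It assumes each kitty modifier maps onto the Emacs modifier bit of the
same name, and it ignores the suffix and any special treatment of
control or shift with ASCII characters.

```elisp
;; Hypothetical table: (KITTY-BIT . EMACS-EVENT-BIT).
(defconst my-kitty-emacs-modifier-bits
  `((1  . ,(ash 1 25))    ; shift
    (2  . ,(ash 1 22))    ; alt
    (4  . ,(ash 1 26))    ; control
    (8  . ,(ash 1 23))    ; super
    (16 . ,(ash 1 24))    ; hyper
    (32 . ,(ash 1 27)))   ; meta
  "Assumed mapping from kitty modifier bits to Emacs event modifier bits.")

(defun my-kitty-make-event (keycode modifier-field)
  "Return an integer event for KEYCODE with the modifiers in MODIFIER-FIELD.
MODIFIER-FIELD is the value from the wire (bit field plus one)."
  (let ((bits (1- modifier-field))
        (event keycode))
    ;; OR in the Emacs bit for every kitty bit that is set.
    (dolist (pair my-kitty-emacs-modifier-bits event)
      (unless (zerop (logand bits (car pair)))
        (setq event (logior event (cdr pair)))))))
```

Here (my-kitty-make-event 97 51) is 97 plus the alt, hyper and meta
bits, i.e. (+ 97 (ash 1 22) (ash 1 24) (ash 1 27)).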

Questions:

1. Is modifying input-decode-map and key-translation-map the right
   approach? Or should this be done using a coding system, e.g.,
   keyboard-coding-system or terminal-coding-system? If so, can CCL or
   equivalent be used to translate from the keyboard protocol to emacs
   key representations? As far as I can tell, CCL programs transform a
   byte sequence to a different byte sequence; how can they be used to
   produce emacs key representations? I'm happy to write C code to
   implement this new coding system, if that's the best approach. Or is
   there a function at a lower level than read-char-exclusive that
   should be used?

2. Emacs key representations seem to come in two forms: a 28-bit integer
   (e.g., "H-M-A-a" is (1<<24)+(1<<27)+(1<<22)+97) and a symbol list
   (e.g., "H-<f1>" is 'H-f1), both of which are then stored in a vector.
   To be able to disambiguate between shifted versions and non-shifted
   versions, I ended up computing [3] the values directly for the
   integer representation, since event-convert-list (used internally by
   define-key) strips off shift modifiers for alphabetic characters in C
   code, and provides no way to work around it. However, "C-a" and
   "C-S-a" seem to work perfectly fine if the integer value is
   computed explicitly; is this expected, or are there other minefields
   that I haven't encountered?

3. When allowing mouse focus-in/out and/or bracketed paste events,
   occasionally, they seem to interleave with keyboard input read using
   read-char-exclusive. This problem has been very hard to reproduce
   deterministically, but happens once or twice daily. How should one
   debug such issues?
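
To make question 2 concrete, here is a minimal sketch of binding
explicitly computed integer events in a keymap. It uses hyper instead of
control to stay clear of the control-character special cases, and the
function, map and command names are placeholders, not taken from [3].

```elisp
(defconst my-demo-hyper-bit (ash 1 24) "Event bit for the hyper modifier.")
(defconst my-demo-shift-bit (ash 1 25) "Event bit for the shift modifier.")

(defun my-demo-shift-aware-map ()
  "Return a sparse keymap binding H-a and H-S-a to different commands.
The events are built as integers instead of going through `kbd', so
the shift bit is kept as an explicit modifier on the lowercase letter."
  (let ((map (make-sparse-keymap))
        (h-a (logior ?a my-demo-hyper-bit))
        (h-s-a (logior ?a my-demo-hyper-bit my-demo-shift-bit)))
    ;; Placeholder command symbols; they need not be defined for the
    ;; lookup below to work.
    (define-key map (vector h-a) 'my-demo-plain-command)
    (define-key map (vector h-s-a) 'my-demo-shifted-command)
    map))
```

Looking up the two vectors with lookup-key in the returned map gives
back the two distinct command symbols.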

Even with all of these issues, the ability to use all 6 modifier keys on
a 24-bit color terminal with multiple real frames (rather than just one
frame) over chained SSH connections has been fantastic! For my use
cases, font sizes/shapes and image display seem to be the only missing
features compared to graphical emacs. Implementing meta/hyper support in
kitty (after giving up on a more famous terminal emulator), determining
heuristics for detecting modifiers in XKB on Wayland (which does not
natively provide a way to query them), and modifying XKB system
configuration on Wayland to support all these modifiers without breaking
other applications, has been quite a journey toward a terminal emacs
experience on par with graphical emacs.

Regards,
Ravi

[1] https://sw.kovidgoyal.net/kitty/keyboard-protocol/#report-all-keys-as-escape-codes
[2] http://cgit.lexarcana.com/cgit.cgi/dotemacs/tree/lisp/term/xterm-kitty.el#n390
[3] http://cgit.lexarcana.com/cgit.cgi/dotemacs/tree/lisp/term/xterm-kitty.el#n351


* Re: Terminal or keyboard decoding system?
  2021-09-19  3:12 Terminal or keyboard decoding system? Ravi R
@ 2021-09-19 23:23 ` Stefan Monnier
  2021-09-24  3:05   ` Ravi R
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Monnier @ 2021-09-19 23:23 UTC
  To: Ravi R; +Cc: emacs-devel

> My current approach [2] is to bind the CSI sequence to the following

AFAIK the CSI prefix is used for "all" key sequences as well as for
other "side communication" (such as the bracketed paste functionality or
the mouse events), so binding your function directly to CSI seems
a bit drastic.

IOW, maybe a better option is to bind it to all the `CSI
<digit>` sequences.  But I guess it depends what other sequences can
occur in your terminal (in xterm, there are many standard keys that use
sequences that start with CSI <digit>, and bracketed paste actually also
matches this pattern).
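
A rough sketch of what binding the `CSI <digit>` prefixes could look
like. The function name and the handler's calling convention (digit
first, then the prompt) are made up for illustration. Note that
rebinding such a prefix replaces any longer bindings already stored
under it in the same map, so the handler would have to cope with those
sequences itself.

```elisp
(defun my-kitty-bind-csi-digits (map handler)
  "Bind HANDLER under each \"CSI <digit>\" prefix in MAP and return MAP.
HANDLER is called with two arguments: the digit that was already
consumed (an integer from 0 to 9) and the PROMPT argument that a
key-translation function receives."
  (dotimes (digit 10)
    ;; "\e[" is CSI; `apply-partially' remembers which digit matched.
    (define-key map (format "\e[%d" digit)
      (apply-partially handler digit)))
  map)
```

It would be applied with something like
(my-kitty-bind-csi-digits input-decode-map #'my-handler), where
my-handler is a hypothetical function returning the decoded key vector.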

>   1. read-char-exclusive can be advised by packages

It can, but it doesn't seem common (I can't find any such occurrence in
Emacs nor (Non)GNU ELPA other than inside a test inside Org) and it's
not necessarily a problem.
What problems have you encountered?

BTW, why use `read-char-exclusive` rather than, say, `read-event`?

>   3. Standard C-, M-, C-M- bindings cannot distinguish between shifted
>      and non-shifted variants for characters: abcdefghjklnopqrstuvwxyz\
> I don't understand the "C-DEL" issue at all, and work around the
> shifted/non-shifted bindings via the following hacks:
>   a. Mapping shifted to non-shifted versions in key-tranlation-map
>   b. Add any non-shifted personal bindings (e.g., to "C-A" rather than
>      "C-a") in local-function-key-map (really ugly)

We have some long standing issues here, I think.
Maybe a bug report is in order.

> 1. Is modifying input-decode-map and key-translation-map the right
>    approach?

Yes (tho `key-translation-map` is better avoided, but sometimes it's
the only option).

>    Or should this be done using a coding system, e.g.,
>    keyboard-coding-system or terminal-coding-system?

Definitely not (these deal with characters like those that can occur
inside a string, i.e. not ones with modifiers).

> 2. Emacs key representations seem to come in two forms: a 28-bit integer
>    (e.g., "H-M-A-a" is (1<<24)+(1<<27)+(1<<22)+97) and a symbol list
>    (e.g., "H-<f1>" is 'H-f1), both of which are then stored in a vector.
>    To be able to disambiguate between shifted versions and non-shifted
>    versions, I ended up computing [3] the values directly for the
>    integer representation, since event-convert-list (used internally by
>    define-key) strips off shift modifiers for alphabetic characters in C
>    code, and provides no way to work around it. However, "C-a" and
>    "C-S-a" seem to be work perfectly fine if the integer value is
>    computed explicitly; is this expected, or are there other minefields
>    that I haven't encountered?

As mentioned above, there are indeed weirdnesses here around the
equivalence between `A` and `S-a` as well as between `C-A` and `C-S-a`
but I think the better place to address them is in a bug report.

It is a fairly messy part, since the shift and control modifiers break
the clean separation between a character and a modifier (since the `A`
character is also an `a` with a shift modifier and since `C-a` is
itself a character as well).
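
The overlap can be seen with the standard event accessors. A small
sketch (my-describe-event is just a name for this example):

```elisp
(defun my-describe-event (event)
  "Return (BASIC-TYPE . MODIFIERS) for the integer EVENT.
Uses `event-basic-type' and `event-modifiers', which report the
implicit shift of an uppercase letter and the implicit control of a
control character as modifiers."
  (cons (event-basic-type event) (event-modifiers event)))
```

(my-describe-event ?A) gives (97 shift) and (my-describe-event ?\C-a)
gives (97 control), even though ?A is the plain character 65 and ?\C-a
is the plain character 1.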

        Stefan


* Re: Terminal or keyboard decoding system?
  2021-09-19 23:23 ` Stefan Monnier
@ 2021-09-24  3:05   ` Ravi R
  2021-09-24 13:07     ` Stefan Monnier
  0 siblings, 1 reply; 4+ messages in thread
From: Ravi R @ 2021-09-24  3:05 UTC
  To: Stefan Monnier; +Cc: emacs-devel

On Sun 2021-09-19  7:23:09PM -04, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> My current approach [2] is to bind the CSI sequence to the following
>
> AFAIK the CSI prefix is used for "all" key sequences as well as for
> other "side communication" (such as the bracketed paste functionality or
> the mouse events), so binding your function directly to CSI seems
> a bit drastic.
>
> IOW, Maybe a better option is to bind it to all the `CSI
> <digit>` sequences.  But I guess it depends what other sequences can
> occur in your terminal (in xterm, there are many standard keys that use
> sequences that start with CSI <digit>, and backeted paste actually also
> matches this pattern).

I do handle all sequences that can occur in my terminal, which are just
key sequences, focus in/out, and bracketed paste.

>>   1. read-char-exclusive can be advised by packages
>
> It can, but it doesn't seem common (I can't find any such occurrence in
> Emacs nor (Non)GNU ELPA other than inside a test inside Org) and it's
> not necessarily a problem.
> What problems have you encountered?

The multiple cursors package advises `read-key` which I originally used.
I haven't run into any packages that advise `read-char-exclusive`.

> BTW, why use `read-char-exclusive` rather than, say, `read-event`?

Fantastic catch! I did not know that `read-event` could be used here;
after replacing `read-char-exclusive` with `read-event`, two days of
emacs use has not resulted in the keystroke interleaving problem. Why
does this work better?
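
For reference, a sketch of the same parsing loop with the reader made
replaceable, defaulting to `read-event`. This is not the code in [2];
my-kitty-parse-csi and its READ-FN argument are hypothetical, and
instead of calling the decoder it returns the parsed fields so that it
can be tried without a terminal.

```elisp
(defun my-kitty-parse-csi (&optional read-fn)
  "Parse the rest of a key report after CSI.
Return a list (KEYCODE MODIFIERS SUFFIX).  READ-FN is called with no
arguments to get the next event and defaults to `read-event'."
  (let ((reader (or read-fn #'read-event))
        (keycode 0)
        (modifiers 0)
        (current 0)
        (suffix nil))
    (while (not suffix)
      (let ((e (funcall reader)))
        (cond
         ;; `read-event' may return non-character events, so check the type.
         ((and (integerp e) (<= ?0 e ?9))
          (setq current (+ (* current 10) (- e ?0))))
         ((eql e ?\;)
          (setq keycode current
                current 0))
         (t
          (setq suffix e)
          (if (> keycode 0)
              ;; The modifier field is sent as bit field plus one.
              (setq modifiers (1- current))
            (setq keycode current))))))
    (list keycode modifiers suffix)))
```

Fed the characters of "97;51u" it returns (97 50 ?u), and for "97u" it
returns (97 0 ?u).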

>>   3. Standard C-, M-, C-M- bindings cannot distinguish between shifted
>>      and non-shifted variants for characters: abcdefghjklnopqrstuvwxyz\
>> I don't understand the "C-DEL" issue at all, and work around the
>> shifted/non-shifted bindings via the following hacks:
>>   a. Mapping shifted to non-shifted versions in key-tranlation-map
>>   b. Add any non-shifted personal bindings (e.g., to "C-A" rather than
>>      "C-a") in local-function-key-map (really ugly)
>
> We have some long standing issues here, I think.
> Maybe a bug report is in order.

Ok, will do.

>> 1. Is modifying input-decode-map and key-translation-map the right
>>    approach?
>
> Yes (tho `key-translation-map` is better avoided, but sometimes it's
> the only option).

Why should it be avoided? I haven't run into any issues, but it'd be
good to know.

Regards,
Ravi




* Re: Terminal or keyboard decoding system?
  2021-09-24  3:05   ` Ravi R
@ 2021-09-24 13:07     ` Stefan Monnier
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 2021-09-24 13:07 UTC
  To: Ravi R; +Cc: emacs-devel

>> BTW, why use `read-char-exclusive` rather than, say, `read-event`?
> Fantastic catch! I did not know that `read-event` could be used here;
> after replacing `read-char-exclusive` with `read-event`, two days of
> emacs use has not resulted in the keystroke interleaving problem.
> Why does this work better?

I have no idea and I suspect no one else does either :-(

>>> 1. Is modifying input-decode-map and key-translation-map the right
>>>    approach?
>> Yes (tho `key-translation-map` is better avoided, but sometimes it's
>> the only option).
> Why should it be avoided? I haven't run into any issues, but it'd be
> good to know.

Because it applies after `function-key-map`, and the interaction between
the two can be ... disappointing in corner cases.


        Stefan



