From: Max Mikhanosha
Newsgroups: gmane.emacs.devel
Subject: Re: Bugfix for utf-8 XTerm/MinTTY and (set-input-meta-mode t)
Date: Wed, 02 Jun 2021 10:21:26 +0000
Reply-To: Max Mikhanosha
To: Stefan Monnier
Cc: "emacs-devel@gnu.org"

On Tuesday, June 1st, 2021 at 4:06 PM, Stefan Monnier wrote:

> > Both XTerm and MinTTY, when configured to send meta modifier as 8th
> > bit while in utf-8 mode, will first add 8th bit, and then encode
> > resulting character with utf-8. For example Meta-X is encoded
> > as ?x+128 = #248 codepoint, encoded as 0xc3,0xb8
>
> How did they end up with that weird design?

It seems to be a logical extension that preserves backward compatibility
as much as possible. Without UTF-8, in meta-as-8th-bit mode, Meta-x
generates #248, and merely putting the terminal in UTF-8 mode should not
change that. But in UTF-8, a byte with the 8th bit set is part of a
multi-byte sequence, so sending 128-255 as is would be mistaken for a
UTF-8 sequence by the receiver. The obvious solution is to UTF-8 encode
the output that a non-UTF-8 terminal would be sending.

> I mean they could have made meta toggle the 24th bit, for example, so it
> doesn't collide with other existing characters.

XTerm has a solution for this: the modifyOtherKeys resource, which has a
new enum value that can be set so that any modifiers, even on ordinary
keys like M-x, generate properly structured ESC [ sequences describing
the modifiers. I agree that in a perfect world we would have come up
with some binary bitmask solution, rather than the current situation
where the terminal can send you a 20-byte sequence for
Ctrl-Alt-Shift-PageDown, but it is what it is.

> This design is quite weird since it breaks all the latin-1 chars of
> unicode plus all the uses of meta with non-ASCII chars.
>
> How do they encode M-λ ?
> Is it also sent as the same byte-sequence as `?λ + 128 = ?л` ?

Unfortunately yes. At least mintty in meta-as-8th-bit mode (which is my
terminal on Cygwin) just dumbly shoves the 8th bit into even wide
characters (for example when I press Alt+letter in a Cyrillic layout),
but my patch does not change that. TBH XTerm may be smarter and just
generate the English M-x when you press M-x on a different keyboard
layout, or you can probably make it behave like this with some xkb
config magic. The proper way to support multi-modifier key sequences is
modifyOtherKeys:2, but it would need wider adoption than just xterm;
hopefully it will percolate to other terminals just as the xterm-direct
truecolor mode did.

For now I'm pretty happy with meta-as-8th-bit mode, as it allows me to
use stuff like M-C-v, which is sent as a single char (or 2 bytes in
UTF-8 mode), and with a bit of magic M-S-v and such work too, so all my
keybindings are exactly the same regardless of whether I'm in a terminal
or a GUI frame.
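
To illustrate the encoding discussed in this thread, here is a minimal
Python sketch. It is not taken from XTerm, MinTTY, or the Emacs patch;
the function names encode_meta_8bit and decode_meta_8bit are
hypothetical, and the model (add 128 to the code point, then UTF-8
encode) simply follows the description quoted above.

```python
from typing import Optional


def encode_meta_8bit(ch: str) -> bytes:
    """Bytes a UTF-8 terminal would send for Meta+ch, assuming it adds
    128 to the code point and then UTF-8 encodes the result."""
    return chr(ord(ch) + 128).encode("utf-8")


def decode_meta_8bit(data: bytes) -> Optional[str]:
    """Return the base ASCII character if data decodes to a single code
    point in 128..255, else None. Only unambiguous for ASCII input."""
    code = ord(data.decode("utf-8"))
    if 128 <= code <= 255:
        return chr(code - 128)
    return None


# Meta-x: ?x is 120, 120 + 128 = 248, which UTF-8 encodes as c3 b8.
print(encode_meta_8bit("x").hex())

# The collision raised in the thread: M-λ yields the same bytes as a
# plain Cyrillic л, so the receiver cannot tell them apart.
print(encode_meta_8bit("λ") == "л".encode("utf-8"))
```

Under this model a receiver can only recover the meta modifier when the
decoded code point lands in 128..255, which is why Meta combined with
non-ASCII characters stays ambiguous.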