From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Android input methods
Date: Mon, 13 Feb 2023 17:17:14 +0200
Message-ID: <835yc5fvd1.fsf@gnu.org>
References: <83r0uvghw7.fsf@gnu.org> <87k00nyo60.fsf@yahoo.com> <83ilg7gdjj.fsf@gnu.org> <87bklyzyyj.fsf_-_@yahoo.com> <83a61iho6r.fsf@gnu.org> <87ttzqxq3f.fsf@yahoo.com> <83cz6dfzet.fsf@gnu.org> <87edqtws0w.fsf@yahoo.com>
In-Reply-To: <87edqtws0w.fsf@yahoo.com> (message from Po Lu on Mon, 13 Feb 2023 22:37:19 +0800)
To: Po Lu
Cc: emacs-devel@gnu.org

> From: Po Lu
> Cc: emacs-devel@gnu.org
> Date: Mon, 13 Feb 2023 22:37:19 +0800
>
> Eli Zaretskii writes:
>
> >> My problem is that modes such as `electric-indent-mode' expect, for
> >> example, newline characters to be inserted by the return key, and do not
> >> indent if the change is made directly by the input method.
> >
> > Is that the only problem with such modes?  Or are there other issues?
>
> There are many others: consider the vc-dir buffer: when you type ``mmmm
> i'', it expects individual key presses, each one of which marks a file
> or registers it with the VC system.

These commands don't insert text, so I'm unsure how they are relevant.
The "mmm" is not inserted into a buffer, it is a series of 3 commands.

> > If input methods are actually modifying relatively small portions of
> > text (even if they request much larger regions to do that), producing
> > single-key events from that should not be too hard: all you need is
> > to compare two almost identical stretches of text.  Am I missing
> > something?
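For illustration only (this sketch is not part of the original exchange): comparing two almost identical stretches of text can be done by stripping their common prefix and suffix, which leaves one contiguous edit. The Python below is a rough sketch under stated assumptions. The names `Edit`, `diff_edit` and `edit_to_keys` are hypothetical, the key names "DEL" and "RET" are placeholders, and the translation assumes the cursor sits just after the deleted text. It does not address replacements spanning whole sentences, which is the objection raised in the reply that follows.

```python
from typing import NamedTuple


class Edit(NamedTuple):
    start: int      # character offset where the two texts first differ
    deleted: str    # text present only in the old version
    inserted: str   # text present only in the new version


def diff_edit(old: str, new: str) -> Edit:
    """Return the single contiguous edit that turns OLD into NEW."""
    limit = min(len(old), len(new))

    # Length of the common prefix.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Length of the common suffix, not overlapping the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1

    return Edit(prefix,
                old[prefix:len(old) - suffix],
                new[prefix:len(new) - suffix])


def edit_to_keys(edit: Edit) -> list[str]:
    """Map an edit to placeholder key names (hypothetical convention):
    one "DEL" per deleted character, then one key per inserted
    character, with newline reported as "RET"."""
    keys = ["DEL"] * len(edit.deleted)
    keys += ["RET" if ch == "\n" else ch for ch in edit.inserted]
    return keys
```

With this sketch, `edit_to_keys(diff_edit("if x:", "if x:\n"))` yields a single "RET", which is the kind of event a mode such as `electric-indent-mode' is said above to expect.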
>
> If you insert text, the input method might choose to suggest
> replacements for the text, typically of the surrounding word, but also
> sometimes up to entire sentences in length.

This feature should be turned off.  It is incompatible with Emacs.  We
ask users to turn off bidi reordering in terminal emulators for
similar reasons.  There's no way we can or should allow external
features to do stuff like that, because they will never be as flexible
as Emacs features.

At the very least we should disable them now.  Maybe later we will
find less drastic solutions (or maybe the input methods will grow up
and become friendlier to Emacs).

> In addition, there is a mode where the input method displays extracted
> text in a window of its own, and only sends the resulting changes back
> to Emacs after it finishes.  Such changes can be almost arbitrary in
> many cases.

Turn this off.

> However, I think I've found an easier solution that doesn't involve any
> text comparison: we can enable input methods only for editing modes that
> derive from `text-mode', and perhaps prog-mode as well, whilst utilizing
> the ASCII keyboard fallback on modes which derive from special-mode.

I don't believe this is so easy.  We'd need more flexible control over
when the input method is enabled and disabled.  Just the major mode is
not fine-grained enough.

> > No problems here, except the usual issue with our superset of UTF-8.
> > We'd need to encode the text.  However, for relatively short stretches
> > of text this should be fast enough.  And the other direction already
> > exists: decode_coding_gap.
>
> This should be easy: we provide character positions and Unicode
> characters to the input method, but use the NULL byte for characters
> that are not representable in UTF-16 (including those which need to be
> represented by surrogate pairs.)
We already have the machinery to replace un-encodable characters with a
fixed character while encoding, but my point is that we will need to
encode; we cannot just memcpy.  So this will be slower than just
copying, but not terribly so.

Btw, are you saying that the text should be encoded in UTF-16?  Is that
because it's Java?

> > Strange design.  Any idea why non-ASCII characters get such complex
> > treatment?
>
> I don't know.  It seems to be an initial oversight that had to be kept
> for backwards compatibility reasons, because applications do not expect
> key events with (a definite misnomer) the `unicode_char' field set to
> some value greater than 127.  I can't find this written down anywhere,
> however, except there are simply no keymaps that map keys to larger
> characters.

Even MS-Windows is capable of accepting and processing UTF-16 encoded
characters in its character input routines.  So I'm still puzzled.
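
For illustration only (this sketch is not part of the original exchange): the encoding step discussed above, in which every character that does not fit in a single UTF-16 code unit is replaced by NUL, might look roughly like the Python below. It is a sketch under stated assumptions, not the Emacs implementation. Text is modeled as a list of integer code points so that values beyond the Unicode range can appear, and the names `to_utf16_units` and `encode_utf16le` are hypothetical.

```python
import struct

MAX_SINGLE_UNIT = 0xFFFF                        # largest value one UTF-16 unit can hold
SURROGATE_FIRST, SURROGATE_LAST = 0xD800, 0xDFFF
REPLACEMENT = 0x0000                            # NUL stands in for anything else


def to_utf16_units(chars: list[int]) -> list[int]:
    """Map each character to exactly one UTF-16 code unit.

    Characters that would need a surrogate pair, lone surrogates, and
    code points outside Unicode are all replaced by NUL (assumption:
    this is how "not representable" is interpreted here)."""
    units = []
    for c in chars:
        if c > MAX_SINGLE_UNIT or SURROGATE_FIRST <= c <= SURROGATE_LAST:
            units.append(REPLACEMENT)
        else:
            units.append(c)
    return units


def encode_utf16le(chars: list[int]) -> bytes:
    """Serialize the code units as little-endian 16-bit values."""
    units = to_utf16_units(chars)
    return struct.pack("<%dH" % len(units), *units)
```

One property of this sketch is that each character produces exactly one code unit, so a character position and a UTF-16 unit index are the same number and no offset translation is needed. The cost is the one pointed out above: the text has to be walked and converted rather than copied with memcpy.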