From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#58070: [PATCH] Add tamil99 input method
Date: Tue, 27 Sep 2022 09:23:20 +0300
Message-ID: <83sfkdjpyf.fsf@gnu.org>
References: <20220925100020.13229-1-arunisaac@systemreboot.net>
 <20220925100244.13482-1-arunisaac@systemreboot.net>
 <87h70vsmyd.fsf@gmail.com> <87ill9ony7.fsf@systemreboot.net>
In-Reply-To: <87ill9ony7.fsf@systemreboot.net> (message from Arun Isaac
 on Tue, 27 Sep 2022 02:25:28 +0530)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: Arun Isaac
Cc: 58070@debbugs.gnu.org, visuweshm@gmail.com
X-GNU-PR-Message: followup 58070
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:243694

> Cc: 58070@debbugs.gnu.org
> From: Arun Isaac
> Date: Tue, 27 Sep 2022 02:25:28 +0530
>
> > This has the advantage that you can insert the vowel sign for any
> > consonant out-of-sequence i.e., you can say h j BACKSPACE s
> > to insert கி (and so do other rules).
>
> I agree. Your imperative approach does have this advantage. But, it
> comes at the price of having to inspect the buffer at (point). The
> declarative approach does not need to inspect the buffer at all since it
> merely composes sequential keystrokes and doesn't know anything about
> what's already on the buffer. I personally think buffer inspection is a
> lot of code complexity for a simple input method like tamil99, but
> perhaps Eli should take a call on this.

I don't think I understand what you are talking about (I'm not an
expert on Quail).  Does this complexity slow down the input
noticeably?  Does it make the code much harder to understand, even if
you put enough comments there to explain what's going on?  If not,
then I don't think the added complexity should be a problem, and you
should decide based on other aspects.

And as I said earlier, we could have two input methods for Tamil, so
we don't necessarily have to decide which of the two is better.

> Also, while the out-of-sequence vowel insertion is a very clever
> feature, it shouldn't be required at all if we handled grapheme cluster
> boundaries correctly.
> See https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries

Well, we do, that's why cursor motion moves by grapheme clusters,
right?  Also, see below.

> Let me explain with a latin example for the benefit of non-Tamil
> readers. Suppose we had:
>
> g̀|
>
> where | is the position of the cursor. Now, if we press backspace, the
> entire g+grave accent grapheme cluster should be deleted. But, what
> actually happens is that the grave accent alone is deleted and we are
> left with a 'g' like so:
>
> g|
>
> A similar thing happens in Tamil. Now, based on user expectation, this
> may be acceptable in some languages. But, in Tamil, it is quite contrary
> to user expectation. If I have
>
> கி|
>
> and press backspace, I get:
>
> க|
>
> But, I want the whole "user-perceived character" (கி) deleted like so:
>
> |

There's a problem with the above: in some situations you want deletion
by codepoints, in others you want deletion by grapheme clusters.  (It
is possible that with Tamil the former is rarely the case, but it is
definitely a frequent case with other scripts, in particular with
those that have diacriticals.)

Emacs 29 solves this by having delete-forward-char, which is usually
bound to the <Delete> key, delete by grapheme clusters, while DEL
(which deletes backward) and C-d delete individual codepoints.  The
primary motivation for DEL to delete by codepoints is that it allows
you to make sub-grapheme corrections to stuff you just typed, for
example if you typed an incorrect accent.

Emacs 29 also has the composition-break-at-point variable, which you
could set non-nil, in which case <Delete> will also work by
codepoints.  So perhaps the out-of-sequence vowel insertion would be
possible without further complications if composition-break-at-point
is non-nil?
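
To make the codepoint-versus-cluster distinction above concrete, here
is a rough sketch in Python.  It is not Emacs code and not how Emacs
implements either command: the two function names are made up for
illustration, and treating every character whose general category
starts with "M" as part of the preceding cluster is only an
approximation of the UAX #29 rules.

```python
import unicodedata


def delete_codepoint_backward(text: str) -> str:
    """Remove only the last codepoint (hypothetical helper)."""
    return text[:-1]


def delete_cluster_backward(text: str) -> str:
    """Remove the last base character plus its trailing marks.

    Assumption: a character in general category M* (Mn, Mc, Me)
    extends the cluster of the character before it.  The real
    grapheme cluster rules in UAX #29 are more involved.
    """
    end = len(text)
    # Skip backward over trailing combining marks.
    while end > 0 and unicodedata.category(text[end - 1]).startswith("M"):
        end -= 1
    # Drop the base character as well (if there is one).
    return text[:max(end - 1, 0)]


latin = "g\u0300"        # g + COMBINING GRAVE ACCENT
tamil = "\u0b95\u0bbf"   # TAMIL LETTER KA + TAMIL VOWEL SIGN I

for sample in (latin, tamil):
    print(repr(delete_codepoint_backward(sample)),
          repr(delete_cluster_backward(sample)))
```

With the first function the base letter survives (the behavior
described for DEL); with the second the whole user-perceived character
goes away (the behavior Arun asks for).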