From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Proposed change to greek-ibycus4 input-method
Date: Tue, 11 Jul 2006 23:29:14 +0200
Message-ID: <85irm3bzkl.fsf@lola.goethe.zz>
References: <87fyh7lukl.fsf@heslin.eclipse.co.uk>
In-Reply-To: <87fyh7lukl.fsf@heslin.eclipse.co.uk> (Peter Heslin's message of "Tue, 11 Jul 2006 22:07:06 +0100")
To: Peter Heslin
Cc: emacs-devel@gnu.org

Peter Heslin writes:

> I would like to propose a change in the way the greek-ibycus4
> input-method for ancient Greek handles capital letters with iota
> subscript (ypogegrammeni), as a result of having been bitten by its (to
> me) very surprising behavior.
>
> Currently, when you type a capital letter followed by a normal iota, the
> input method arbitrarily decides that this must actually be a subscript
> iota. But in the vast majority of cases, this is not what the user
> intends at all -- he or she wants a normal iota after the capital
> letter.
>
> The reason for this ambiguity is that the ibycus4 encoding for LaTeX
> does not actually support subscript iotas under capital letters (it
> expects you to write them adscript, as if they were normal iotas). So
> there is no pre-existing standard to appeal to, but it seems logical to
> use the | character after the vowel, just as for lower case vowels.
>
> In other words, with the current code:
>
> )Ai => ᾈ
>
> which has two problems: (1) it is very surprising, and (2) there is no
> straight-forward way to type the common sequence of Greek characters
> Ἀι

I think there is a bit of a code point vs. representation problem here,
too.
I think that ᾈ is actually the same `character' as Ἀι or even Αι
(though Αι would also be short for Ἁι, like in ἐν Ἁιδõυ, and so needs a
different code point just in case someone wants to use a font with
spirited capitals), only with a different writing convention. I would
think it likely that you'll never encounter both ᾈ and Ἀι in the same
text, but you might find consistently either one or the other (or the
third) everywhere, depending on the printer's taste.

This is in contrast to Αἰ or Ἀϊ, which are actually different character
combinations.

> -- you have to separate the vowels with a space and then go back and
> delete the space between them.
>
> With my proposal:
>
> )A| => ᾈ
> )Ai => Ἀι
>
> Now there is an easy way to type both sequences and the default behavior
> is much less surprising. It's also consistent with the behavior of the
> greek-babel input-method. Patch attached.

I think your idea sounds reasonable. I just have no clue whether the
Ibycus rules would actually suggest one or the other. For convenient
typing in Emacs, your suggestion is likely the best and most intuitive
way, anyhow. But if we want to convert "input just like a quail method"
into Unicode, as recently discussed on this list, it might start making
a difference.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
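
To make the two behaviours discussed above concrete, here is a toy
longest-match table in Python. It is only a sketch and not the quail
implementation: the names CURRENT_RULES, PROPOSED_RULES, transliterate
and code_points are made up for this example, and the tables cover just
the three key sequences that appear in the thread. The last line uses
the standard unicodedata module to show that the single code point with
the iota decomposes canonically into alpha, a breathing mark and a
combining ypogegrammeni, which is one way of looking at the code point
vs. representation question.

```python
import unicodedata

# Hypothetical rule tables, limited to the keys mentioned in the thread.
# U+1F88 = capital alpha with psili and prosgegrammeni
# U+1F08 = capital alpha with psili
# U+03B9 = small iota
CURRENT_RULES = {
    ")Ai": "\u1f88",
    ")A": "\u1f08",
    "i": "\u03b9",
}

PROPOSED_RULES = {
    ")A|": "\u1f88",
    ")A": "\u1f08",
    "i": "\u03b9",
}


def transliterate(keys, rules):
    """Translate a key sequence by greedy longest match against rules."""
    longest = max(len(k) for k in rules)
    out = []
    pos = 0
    while pos < len(keys):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(longest, len(keys) - pos), 0, -1):
            chunk = keys[pos:pos + size]
            if chunk in rules:
                out.append(rules[chunk])
                pos += size
                break
        else:
            # No rule matched: pass the key through unchanged.
            out.append(keys[pos])
            pos += 1
    return "".join(out)


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in text]


print(code_points(transliterate(")Ai", CURRENT_RULES)))   # ['U+1F88']
print(code_points(transliterate(")Ai", PROPOSED_RULES)))  # ['U+1F08', 'U+03B9']
print(code_points(transliterate(")A|", PROPOSED_RULES)))  # ['U+1F88']

# Canonical decomposition of the precomposed character.
print(code_points(unicodedata.normalize("NFD", "\u1f88")))
```

With the proposed table, ")Ai" no longer collapses into one code point,
so the space-then-delete workaround described in the quoted message is
not needed in this toy model.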