From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: bidi-display-reordering is now non-nil by default
Date: Wed, 03 Aug 2011 04:56:15 -0400
To: Mohsen BANAN
Cc: emacs-devel@gnu.org

> From: Mohsen BANAN
> Cc: Mohsen BANAN, emacs-devel@gnu.org
> Date: Tue, 02 Aug 2011 19:39:08 -0700
>
> >>>
> >>> This is paragraph 1 all in english.
> >>>
> >>> اين پاراگراف ۲ تماما به فارسى است.
> >>>
> >>> This is paragraph 3 which is mixed with فارسى and english.
> >>>
> >>> اين پاراگراف ۴ مخلوت English و فارسى است.
> >>>
>
> ---
>
> Note all 4 paragraphs' directionality are detected
> fine based on UAX#9 prior to citation.
>
> Once cited/prefixed they all become 1 paragraph.
>
> The communication ramifications on paragraph 2 are
> minor. The reader would get it.
>
> But communication ramifications on paragraph 4 are
> very problematic. The sentence becomes
> incomprehensible/mangled.

Yes, I agree.

> So, here is what I am suggesting:
>
>  - We add an additional value of 'fancy to
>    bidi-paragraph-direction making it:
>      nil, left-to-right, right-to-left, fancy
>
>  - If bidi-paragraph-direction is 'fancy,
>
>     - do prefix guessing
>     - ignore prefix
>     - do UAX#9
>     - re-insert prefix
>
> This will likely help with simple tabularization
> in addition to mail citations.
>
> What do you think?

This is too complicated.  Paragraph detection is at the lowest level
of the reordering engine, so it must be simple and fast.
We saw just yesterday a case of severe slowdown of redisplay because
paragraph detection was working too hard.

I have an idea that would be much easier to implement: ignore neutral
and weak characters when deciding where a paragraph ends.  This can be
done by modifying the regexp used to detect the paragraph separator.
All the cited paragraphs would then remain separate paragraphs,
despite the citation prefix.  It would look like this (but without the
added empty lines, of course):

>>>

>>> This is paragraph 1 all in english.

>>>

>>> اين پاراگراف ۲ تماما به فارسى است.

>>>

>>> This is paragraph 3 which is mixed with فارسى and english.

>>>

>>> اين پاراگراف ۴ مخلوت English و فارسى است.

>>>

(I assume you are looking at this in Emacs 24, which will reorder
correctly.)

Is this okay?

Anyway, this feature will probably have to wait until Emacs 24.2,
because there are more serious issues on the table, and we want to
start the pretest for Emacs 24.1 ASAP.  Modifying the paragraph
separator is something that needs to be thoroughly tested before it
hits the end users' machines, because it can potentially destabilize
the display engine to a great extent.

For now, users will have to learn to insert empty lines ;-)
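
To make the proposed rule concrete, here is a minimal sketch in Python
(not Emacs code, and not how the display engine implements anything).
It assumes that a line with no strong directional character (bidi
class L, R, or AL) counts as a paragraph separator, and that each
paragraph takes its base direction from its first strong character, as
in UAX#9.  The names is_separator, split_paragraphs, and
base_direction are made up for this illustration.

```python
import unicodedata

# Bidi classes that UAX#9 treats as strong. Everything else (neutral or
# weak, e.g. '>', spaces, digits) is ignored when looking for separators.
STRONG = {"L", "R", "AL"}

# The cited example from this thread, with its citation prefix.
CITED = "\n".join([
    ">>>",
    ">>> This is paragraph 1 all in english.",
    ">>>",
    ">>> اين پاراگراف ۲ تماما به فارسى است.",
    ">>>",
    ">>> This is paragraph 3 which is mixed with فارسى and english.",
    ">>>",
    ">>> اين پاراگراف ۴ مخلوت English و فارسى است.",
    ">>>",
])


def is_separator(line):
    """Assumed rule: a line with no strong character separates paragraphs."""
    return not any(unicodedata.bidirectional(ch) in STRONG for ch in line)


def split_paragraphs(text):
    """Group consecutive non-separator lines into paragraphs."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if is_separator(line):
            if current:
                paragraphs.append(current)
                current = []
        else:
            current.append(line)
    if current:
        paragraphs.append(current)
    return paragraphs


def base_direction(paragraph):
    """First strong character decides; default to left-to-right."""
    for line in paragraph:
        for ch in line:
            cls = unicodedata.bidirectional(ch)
            if cls == "L":
                return "left-to-right"
            if cls in ("R", "AL"):
                return "right-to-left"
    return "left-to-right"


if __name__ == "__main__":
    for para in split_paragraphs(CITED):
        print(base_direction(para), para)
```

Under this assumption the ">>>" lines act as separators, because ">"
and spaces are neutral, so the four cited paragraphs keep the
directions they had before citation.  One side effect of the rule as
sketched: a line holding only digits or punctuation would also count
as a separator.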