From: Itai Berli
Newsgroups: gmane.emacs.bugs
Subject: bug#27526: 25.1; Nonconformance to Unicode bidirectionality algorithm due to paragraph separator
Date: Thu, 29 Jun 2017 21:36:39 +0300
To: 27526@debbugs.gnu.org

> The UBA allows applications to employ "higher-level protocols" when
> deciding on base paragraph direction.  See section 4.3 in UAX#9 and
> specifically clause HL1 there.
>
> This is what Emacs does: it applies its own heuristics for this
> decision.  The reason for that is that Emacs's implementation of the
> UBA must work reasonably well in plain-text buffers, where typically
> long paragraphs are broken into lines by newline characters (which are
> paragraph separators according to the UBA), and many times the
> partition into lines is done by auto-fill or similar features, thus
> making the first character of the next line fairly arbitrary.  Using
> the UBA paragraph-direction determination would then produce
> unacceptable results, whereby the direction of a part of a paragraph
> could change in unpredictable ways when text is refilled.

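To make the two conventions concrete, here is a minimal Python sketch
(not Emacs code) of the first-strong-character rule from UAX#9 (P2 and
P3), applied once with newline as the paragraph separator and once with
a blank line as the separator. The names `base_direction`,
`split_on_newline` and `split_on_blank_line` are made up for this
illustration, directional isolates are ignored, and the blank-line
split is only an approximation of the behavior under discussion, not
the actual Emacs heuristics.

```python
import re
import unicodedata


def base_direction(paragraph: str) -> str:
    """Simplified UAX#9 P2/P3: the first strong character decides.

    Assumption: directional isolates are not skipped, which the full
    rule P2 would require.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return "LTR"  # P3: no strong character, fall back to left-to-right


def split_on_newline(text: str) -> list[str]:
    # Every newline ends a paragraph.
    return text.split("\n")


def split_on_blank_line(text: str) -> list[str]:
    # Only an empty (or whitespace-only) line ends a paragraph.
    return re.split(r"\n[ \t]*\n", text)


# A LaTeX command followed by Hebrew text, continued on a second line.
sample = "\\noindent \u05e9\u05dc\u05d5\u05dd\n\u05e2\u05d5\u05dc\u05dd"

per_line = [base_direction(p) for p in split_on_newline(sample)]
per_block = [base_direction(p) for p in split_on_blank_line(sample)]

print(per_line)   # one direction per line
print(per_block)  # one direction for the whole block
```

With newline as the separator, the second line gets its own direction
from its first Hebrew letter; with a blank line as the separator, the
Latin "n" of \noindent decides the direction of both lines.
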
As I understand it, the "higher-level protocols" provision is intended
to allow for such things as table cells, elements of structured markup
languages, and word processors that use an idiosyncratic implementation
of a paragraph separator *under the hood*. It is not intended for plain
running text; for this the standard specifies explicitly what the
paragraph separators for every operating system are.

> typically long paragraphs are broken into lines by newline characters

I see no evidence of the validity of this statement on my system (Emacs
25.1.1). But even if this were so, it would still not merit
*hard-coding* the paragraph separator as a blank line, as there are
situations (such as the one I presented in my bug report) that require
a different configuration.

> You can alleviate this to some extent by ...(in your case) starting
> the paragraph with an RLM control character before \noindent,
> optionally followed by an LRM or enclosing \noindent in LRE..PDF (so
> that the backslash displays to the left of "noindent").  This is
> admittedly a bit awkward, but I think the results are still acceptable.

As you mentioned, the solution is cumbersome. It might have been
acceptable if this were the sole issue, but this example illustrates
just one of several problems that arise due to the current paragraph
separator convention.

In conclusion, and on a personal note, I implore you to change this
behavior, and to do so as soon as possible, and not only for
specialized markup documents, but for every document. I am currently
working on my thesis. Emacs is useless to me as a text editor of Hebrew
texts without this feature. This is no exaggeration. The original
reason I chose Emacs over other editors was the combination of AUCTeX
and the promise of full Unicode compatibility.
AUCTeX has delivered on its promise, but in the area of Unicode, as far
as my needs are concerned, it is as if there were no Unicode support at
all, and I will sadly be forced to look for a different editor.