From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#61269: 28.2; Sequence of spaces preceding tab in bidirectional line
Date: Sat, 04 Feb 2023 13:38:20 +0200
Message-ID: <83sfflu0cz.fsf@gnu.org>
To: Halim
Cc: 61269@debbugs.gnu.org
In-Reply-To: (message from Halim on Sat, 04 Feb 2023 02:41:35 +0700)

> From: Halim
> Date: Sat, 04 Feb 2023 02:41:35 +0700
>
>
> In a left-to-right line emacs display a sequence of one or more
> spaces (U+0020), where the spaces precede a tab (U+0009) and they
> both appear between two right-to-left alphabet, to the left of the
> first (in typing order) rtl alphabet.
>
> The bug does not present when the rtl text is inside an rtl
> isolate.
>
> Let s represent space, t represet tab, l represent itself, r and
> m represent arabic alphabet. The following example have this format
> in typing order from left to right.
>
> Format:
> lsrssstm
>
> Example text:
> l ح م
>
> The expected display is 'lsrssstm', the actual is 'lssssrtm'.
> The spaces following 'r' in the format is displayed to the left
> of 'r' in the actual display. Using 'C-f' from 'r' moves the
> cursor to the left until it hits 't' where the cursor move to
> the right of 'r'.
>
> I have tried to view the file containing the buggy text in
> focuswriter and fribidi. They both display the same expected
> way.
>
> Extra Info
>
> The bug also present to ltr text on rtl line. I believe
> this is generic and is caused by this line
> '&& level != bidi_it->level_stack[0].level' (see below).
>
> The bug also present in emacs built from commit
> 'ac7ec87a7a0db887e4ae7fe9005aea517958b778' with
> --without-all. In this commit I make the following
> modification.
>
> ---------------
> $ git diff ac7ec87a7a0db887e4ae7fe9005aea517958b778
> diff --git a/src/bidi.c b/src/bidi.c
> index e012512..fe6e4d6 100644
> --- a/src/bidi.c
> +++ b/src/bidi.c
> @@ -3302,10 +3302,7 @@ bidi_level_of_next_char (struct bidi_it *bidi_it)
>    if ((bidi_it->orig_type == NEUTRAL_WS
>         || bidi_it->orig_type == WEAK_BN
>         || bidi_isolate_fmt_char (bidi_it->orig_type))
> -      && bidi_it->next_for_ws.charpos < bidi_it->charpos
> -      /* If this character is already at base level, we don't need to
> -         reset it, so avoid the potentially costly loop below.  */
> -      && level != bidi_it->level_stack[0].level)
> +      && bidi_it->next_for_ws.charpos < bidi_it->charpos)
>      {
>        int ch;
>        ptrdiff_t clen = bidi_it->ch_len;
> ---------------
>
> It fixes the bug.

Thanks.  You are right that the logic there was flawed.  However, just
removing the base-level test is sub-optimal: that test was added to
speed up redisplay when the buffer has a lot of control characters
(e.g., binary null bytes) that don't need to be reordered; see
bug#22739.  So I have installed a slightly different change,
reproduced below; please see that it solves the problem, including
(presumably) some real-life problems you had in displaying RTL text
with embedded TABs.

diff --git a/src/bidi.c b/src/bidi.c
index e012512..93875d2 100644
--- a/src/bidi.c
+++ b/src/bidi.c
@@ -3300,12 +3300,15 @@ bidi_level_of_next_char (struct bidi_it *bidi_it)
      it belongs to a sequence of WS characters preceding a newline
      or a TAB or a paragraph separator.  */
   if ((bidi_it->orig_type == NEUTRAL_WS
-       || bidi_it->orig_type == WEAK_BN
+       || (bidi_it->orig_type == WEAK_BN
+           /* If this BN character is already at base level, we don't
+              need to consider resetting it, since I1 and I2 below
+              will not change the level, so avoid the potentially
+              costly loop below.  */
+           && level != bidi_it->level_stack[0].level)
        || bidi_isolate_fmt_char (bidi_it->orig_type))
-      && bidi_it->next_for_ws.charpos < bidi_it->charpos
-      /* If this character is already at base level, we don't need to
-         reset it, so avoid the potentially costly loop below.  */
-      && level != bidi_it->level_stack[0].level)
+      /* This means the information about WS resolution is not valid.  */
+      && bidi_it->next_for_ws.charpos < bidi_it->charpos)
     {
       int ch;
       ptrdiff_t clen = bidi_it->ch_len;
@@ -3340,7 +3343,7 @@ bidi_level_of_next_char (struct bidi_it *bidi_it)
       || bidi_it->orig_type == NEUTRAL_S
       || bidi_it->ch == '\n' || bidi_it->ch == BIDI_EOB
       || ((bidi_it->orig_type == NEUTRAL_WS
-           || bidi_it->orig_type == WEAK_BN
+           || bidi_it->orig_type == WEAK_BN /* L1/Retaining */
            || bidi_isolate_fmt_char (bidi_it->orig_type)
            || bidi_explicit_dir_char (bidi_it->ch))
           && (bidi_it->next_for_ws.type == NEUTRAL_B
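
To show what the change to the first condition amounts to, here is a
standalone sketch of just that predicate.  It is not the code in
bidi.c: the enum, the function names, and the boolean `ws_info_stale'
(standing in for `bidi_it->next_for_ws.charpos < bidi_it->charpos')
are simplifications made up for illustration, and it models only
which characters trigger the lookahead loop, not what the loop does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified character classes; bidi.c has many more.  */
enum sketch_type { SK_STRONG, SK_NEUTRAL_WS, SK_WEAK_BN, SK_ISOLATE_FMT };

/* Condition before the change: the base-level test guards every
   candidate type, including plain whitespace.  */
static bool
lookahead_old (enum sketch_type type, int level, int base_level,
               bool ws_info_stale)
{
  return ((type == SK_NEUTRAL_WS
           || type == SK_WEAK_BN
           || type == SK_ISOLATE_FMT)
          && ws_info_stale
          && level != base_level);
}

/* Condition after the change: only BN characters take the
   base-level shortcut.  */
static bool
lookahead_new (enum sketch_type type, int level, int base_level,
               bool ws_info_stale)
{
  return ((type == SK_NEUTRAL_WS
           || (type == SK_WEAK_BN && level != base_level)
           || type == SK_ISOLATE_FMT)
          && ws_info_stale);
}

/* A few cases where the two versions agree or differ.  */
static void
check_examples (void)
{
  /* A space at base level with stale WS info: skipped before,
     examined now.  */
  assert (!lookahead_old (SK_NEUTRAL_WS, 0, 0, true));
  assert (lookahead_new (SK_NEUTRAL_WS, 0, 0, true));

  /* A BN character at base level still avoids the loop.  */
  assert (!lookahead_new (SK_WEAK_BN, 0, 0, true));

  /* Above base level, both versions agree.  */
  assert (lookahead_old (SK_NEUTRAL_WS, 1, 0, true)
          == lookahead_new (SK_NEUTRAL_WS, 1, 0, true));
}
```

In this model the two versions differ only for whitespace and isolate
formatting characters that are already at base level; BN characters
at base level keep the shortcut in both, which is the case the
original test was meant to speed up.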