From mboxrd@z Thu Jan  1 00:00:00 1970
From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Line wrap reconsidered
Date: Tue, 26 May 2020 13:34:01 -0400
References: <92FF4412-04FB-4521-B6CE-52B08526E4E5@gmail.com> <878shfsq35.fsf@gnus.org> <83imgivjak.fsf@gnu.org>
In-Reply-To: <83imgivjak.fsf@gnu.org>
To: Eli Zaretskii
Cc: larsi@gnus.org, emacs-devel@gnu.org
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\))
Content-Type: multipart/mixed; boundary="Apple-Mail=_3A4B0BE6-09BD-44F1-830A-C8B48E391D03"
Precedence: list
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:251467

--Apple-Mail=_3A4B0BE6-09BD-44F1-830A-C8B48E391D03
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=utf-8

Hi Eli,

I fixed the problems and it now works. If you apply the patch below,
load kinsoku.el, open test.txt and run M-x toggle-word-wrap, you should
see the text wrapped properly: lines break between CJK characters and
at whitespace, but not between ASCII characters. Also, following the
kinsoku rules, a CJK comma will not be placed at the beginning of a
line, a CJK “《” will not be placed at the end of a line, etc.

The patch decides whether we can wrap before/after a character by
looking at the “<”, “>” and “|” categories, which roughly correspond to
“don’t wrap after”, “don’t wrap before” and “wrap before and after”.

Yuan

--Apple-Mail=_3A4B0BE6-09BD-44F1-830A-C8B48E391D03
Content-Disposition: attachment; filename=wrap.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="wrap.patch"
Content-Transfer-Encoding: 7bit

diff --git a/src/xdisp.c b/src/xdisp.c
index 01f272033e..3dd1450847 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -366,6 +366,7 @@ Copyright (C) 1985-1988, 1993-1995, 1997-2020 Free Software Foundation,
 #include "termchar.h"
 #include "dispextern.h"
 #include "character.h"
+#include "category.h"
 #include "buffer.h"
 #include "charset.h"
 #include "indent.h"
@@ -427,6 +428,84 @@ #define IT_DISPLAYING_WHITESPACE(it) \
            && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' ' \
                || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t'))))
 
+// TODO make Lisp-visible?
+/* Calculate the wrapping rule for character CH and add it as a text
+   property to current buffer at CHARPOS.  Return the text property
+   value.  */
+static Lisp_Object apply_wrap_property (Lisp_Object charpos, int ch)
+{
+  /* These are category sets we use.  */
+  int not_at_eol = 60; /* < */
+  int not_at_bol = 62; /* > */
+  int line_breakable = 124; /* | */
+  if (CHAR_HAS_CATEGORY (ch, line_breakable))
+    {
+      if (CHAR_HAS_CATEGORY (ch, not_at_bol))
+        {
+          Fput_text_property (charpos, Fadd1 (charpos),
+                              Qword_wrap, Qonly_after, Qnil);
+          return Qonly_after;
+        }
+
+      else if (CHAR_HAS_CATEGORY (ch, not_at_eol))
+        {
+          Fput_text_property (charpos, Fadd1 (charpos),
+                              Qword_wrap, Qonly_before, Qnil);
+          return Qonly_before;
+        }
+      else
+        {
+          Fput_text_property (charpos, Fadd1 (charpos),
+                              Qword_wrap, Qt, Qnil);
+          return Qt;
+        }
+    }
+  else
+    {
+      /* For normal characters, since they _can_ appear at the
+         beginning of a line, we make their rule only_before.  */
+      Fput_text_property (charpos, Fadd1 (charpos),
+                          Qword_wrap, Qonly_before, Qnil);
+      return Qonly_before;
+    }
+}
+
+/* Return true if the current character allows wrapping before it.  */
+static bool char_can_wrap_before (struct it *it)
+{
+  Lisp_Object charpos = make_fixnum (IT_CHARPOS (*it));
+  Lisp_Object prop = Fget_text_property (charpos, Qword_wrap, Qnil);
+  // TODO handle other types of it->what?
+  if (EQ (prop, Qnil) && it->what == IT_CHARACTER)
+    prop = apply_wrap_property(charpos, it->c);
+  if (EQ (Qt, prop) || EQ (Qonly_before, prop))
+    return true;
+  else
+    return false;
+}
+
+/* Return true if the current character allows wrapping after it.  */
+static bool char_can_wrap_after (struct it *it)
+{
+  Lisp_Object charpos = make_fixnum (IT_CHARPOS (*it));
+  Lisp_Object prop = Fget_text_property (charpos, Qword_wrap, Qnil);
+  if (EQ (prop, Qnil) && it->what == IT_CHARACTER)
+    prop = apply_wrap_property(charpos, it->c);
+  if (EQ (Qt, prop) || EQ (Qonly_after, prop))
+    return true;
+  else
+    return false;
+}
+
+/* True if we can wrap before the current character.  */
+#define IT_CAN_WRAP_BEFORE(it) \
+  (!IT_DISPLAYING_WHITESPACE (it) && char_can_wrap_before (it))
+
+/* True if we can wrap after the current character.  */
+#define IT_CAN_WRAP_AFTER(it) \
+  (IT_DISPLAYING_WHITESPACE (it) || char_can_wrap_after (it))
+
+
 /* If all the conditions needed to print the fill column indicator are
    met, return the (nonnegative) column number, else return a negative
    value.  */
@@ -9098,13 +9177,13 @@ #define IT_RESET_X_ASCENT_DESCENT(IT) \
         {
           if (it->line_wrap == WORD_WRAP && it->area == TEXT_AREA)
             {
-              if (IT_DISPLAYING_WHITESPACE (it))
-                may_wrap = true;
-              else if (may_wrap)
+              /* Can we wrap here? */
+              if (may_wrap && IT_CAN_WRAP_BEFORE(it))
                 {
                   /* We have reached a glyph that follows one or more
-                     whitespace characters.  If the position is
-                     already found, we are done.  */
+                     whitespace characters (or a character that allows
+                     wrapping after it).  If the position is already
+                     found, we are done.  */
                   if (atpos_it.sp >= 0)
                     {
                       RESTORE_IT (it, &atpos_it, atpos_data);
@@ -9119,8 +9198,14 @@ #define IT_RESET_X_ASCENT_DESCENT(IT) \
                     }
                   /* Otherwise, we can wrap here.  */
                   SAVE_IT (wrap_it, *it, wrap_data);
-                  may_wrap = false;
                 }
+              /* This has to run after the previous block.  */
+              if (IT_CAN_WRAP_AFTER (it))
+                /* may_wrap basically means "previous char allows
+                   wrapping after it".  */
+                may_wrap = true;
+              else
+                may_wrap = false;
             }
         }
 
@@ -9248,10 +9333,10 @@ #define IT_RESET_X_ASCENT_DESCENT(IT) \
                             {
                               bool can_wrap = true;
 
-                              /* If we are at a whitespace character
-                                 that barely fits on this screen line,
-                                 but the next character is also
-                                 whitespace, we cannot wrap here.  */
+                              /* If the previous character says we can
+                                 wrap after it, but the current
+                                 character says we can't wrap before
+                                 it, then we can't wrap here.  */
                               if (it->line_wrap == WORD_WRAP
                                   && wrap_it.sp >= 0
                                   && may_wrap
@@ -9263,7 +9348,7 @@ #define IT_RESET_X_ASCENT_DESCENT(IT) \
                                   SAVE_IT (tem_it, *it, tem_data);
                                   set_iterator_to_next (it, true);
                                   if (get_next_display_element (it)
-                                      && IT_DISPLAYING_WHITESPACE (it))
+                                      && !IT_CAN_WRAP_BEFORE(it))
                                     can_wrap = false;
                                   RESTORE_IT (it, &tem_it, tem_data);
                                 }
@@ -9342,19 +9427,18 @@ #define IT_RESET_X_ASCENT_DESCENT(IT) \
                   else
                     IT_RESET_X_ASCENT_DESCENT (it);
 
-                  /* If the screen line ends with whitespace, and we
-                     are under word-wrap, don't use wrap_it: it is no
-                     longer relevant, but we won't have an opportunity
-                     to update it, since we are done with this screen
-                     line.  */
+                  /* If the screen line ends with whitespace (or
+                     wrap-able character), and we are under word-wrap,
+                     don't use wrap_it: it is no longer relevant, but
+                     we won't have an opportunity to update it, since
+                     we are done with this screen line.  */
                   if (may_wrap && IT_OVERFLOW_NEWLINE_INTO_FRINGE (it)
                       /* If the character after the one which set the
-                         may_wrap flag is also whitespace, we can't
-                         wrap here, since the screen line cannot be
-                         wrapped in the middle of whitespace.
-                         Therefore, wrap_it _is_ relevant in that
-                         case.  */
-                      && !(moved_forward && IT_DISPLAYING_WHITESPACE (it)))
+                         may_wrap flag says we can't wrap before it,
+                         we can't wrap here.  Therefore, wrap_it
+                         (previously found wrap-point) _is_ relevant
+                         in that case.  */
+                      && !(moved_forward && !IT_CAN_WRAP_BEFORE(it)))
                     {
                       /* If we've found TO_X, go back there, as we now
                          know the last word fits on this screen line.  */
@@ -23180,9 +23264,8 @@ #define RECORD_MAX_MIN_POS(IT) \
 
           if (it->line_wrap == WORD_WRAP && it->area == TEXT_AREA)
             {
-              if (IT_DISPLAYING_WHITESPACE (it))
-                may_wrap = true;
-              else if (may_wrap)
+              /* Can we wrap here? */
+              if (may_wrap && IT_CAN_WRAP_BEFORE(it))
                 {
                   SAVE_IT (wrap_it, *it, wrap_data);
                   wrap_x = x;
@@ -23196,9 +23279,13 @@ #define RECORD_MAX_MIN_POS(IT) \
                   wrap_row_min_bpos = min_bpos;
                   wrap_row_max_pos = max_pos;
                   wrap_row_max_bpos = max_bpos;
-                  may_wrap = false;
                 }
-	    }
+              /* This has to run after the previous block.  */
+              if (IT_CAN_WRAP_AFTER (it))
+                may_wrap = true;
+              else
+                may_wrap = false;
+            }
         }
 
       PRODUCE_GLYPHS (it);
@@ -23321,14 +23408,18 @@ #define RECORD_MAX_MIN_POS(IT) \
                           /* If line-wrap is on, check if a previous
                              wrap point was found.  */
                           if (!IT_OVERFLOW_NEWLINE_INTO_FRINGE (it)
-                              && wrap_row_used > 0
+                              && wrap_row_used > 0 /* Found.  */
                               /* Even if there is a previous wrap
                                  point, continue the line here as
                                  usual, if (i) the previous character
-                                 was a space or tab AND (ii) the
-                                 current character is not.  */
-                              && (!may_wrap
-                                  || IT_DISPLAYING_WHITESPACE (it)))
+                                 allows wrapping after it, AND (ii)
+                                 the current character allows wrapping
+                                 before it.  Because this is a valid
+                                 break point, we can just continue to
+                                 the next line at here, there is no
+                                 need to wrap early at the previous
+                                 wrap point.  */
+                              && (!may_wrap || !IT_CAN_WRAP_BEFORE(it)))
                             goto back_to_wrap;
 
                           /* Record the maximum and minimum buffer
@@ -23356,13 +23447,16 @@ #define RECORD_MAX_MIN_POS(IT) \
                       /* If line-wrap is on, check if a previous
                          wrap point was found.  */
                       else if (wrap_row_used > 0
-                               /* Even if there is a previous wrap
-                                  point, continue the line here as
-                                  usual, if (i) the previous character
-                                  was a space or tab AND (ii) the
-                                  current character is not.  */
-                               && (!may_wrap
-                                   || IT_DISPLAYING_WHITESPACE (it)))
+                               /* Even if there is a previous
+                                  wrap point, continue the
+                                  line here as usual, if (i)
+                                  the previous character was a
+                                  space or tab AND (ii) the
+                                  current character is not,
+                                  AND (iii) the current
+                                  character allows wrapping
+                                  before it.  */
+                               && (!may_wrap || !IT_CAN_WRAP_BEFORE(it)))
                         goto back_to_wrap;
 
                     }
@@ -34231,6 +34325,10 @@ syms_of_xdisp (void)
   DEFSYM (QCfile, ":file");
   DEFSYM (Qfontified, "fontified");
   DEFSYM (Qfontification_functions, "fontification-functions");
+  DEFSYM (Qword_wrap, "word-wrap");
+  DEFSYM (Qonly_before, "only-before");
+  DEFSYM (Qonly_after, "only-after");
+  DEFSYM (Qno_wrap, "no-wrap");
 
   /* Name of the symbol which disables Lisp evaluation in 'display'
      properties.  This is used by enriched.el.  */

--Apple-Mail=_3A4B0BE6-09BD-44F1-830A-C8B48E391D03
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii


--Apple-Mail=_3A4B0BE6-09BD-44F1-830A-C8B48E391D03
Content-Disposition: attachment; filename=test.txt
Content-Type: text/plain; x-unix-mode=0644; name="test.txt"
Content-Transfer-Encoding: base64

5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re3
5o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH
5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux
5paH5re35o6SIEVuZ2xpc2ggRW5nbGlzaCDkuK3oi7Hmlofmt7fmjpLkuK3vvIzoi7Hmlofmt7fm
jpLkuK3oi7Hmlofmt7fmjpLkuK3oi7Hmlofmt7fmjpLkuK3oi7Hmlofmt7fmjpLkuK3oi7Hmlofm
t7fmjpLkuK3oi7Hmlofmt7fmjpLkuK3oi7HmlofjgIrmt7fmjpLkuK3oi7Hmlofmt7cKCuiLseaW
h+a3t+aOkuS4reiLseaWh+a3t+aOkuS4reiLseaWh+a3t+aOkuS4reiLseaWh+a3t+aOkuS4reiL
seaWh+a3t+a3t2VuZ2xpc2jmjpLkuK3oi7Hmlofmt7fkuK3oi7Hmlofmt7fmjpLkuK3vvIzkuK3o
i7Hmlofmt7fmjpLkuK3oi7Hmlofmt7fmt7fmjpLkuK3oi7Hmlofmt7fkuK3oi7Hmlofmt7fmjpLk
uK0KCuiLseaWh+a3t+aOkuS4reiLseaWh+a3t+aOkuS4reiLseaWh+a3t+aOkuS4reiLseaWh+a3
t+aOkuS4reiLseaWh+a3t2VuZ2xpc2jmjpLkuK3oi7Hmlofmt7fmjpLkuK3oi7HmlofjgIrmt7fm
jpLkuK3oi7Hmlofmt7cKCgoKCuS4reiLseaWh+a3t+iLseaWh+a3t+aOkuS4reiLseaWh+a3t+aO
kuS4reiLseaWh+a3t+aOkuS4reiLseaWh+a3t+aOkuOAneOAnQoK5Lit6Iux5paH5re36Iux5paH
5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S5Lit6Iux5paH5re35o6S44Cd6Iux
CgoK

--Apple-Mail=_3A4B0BE6-09BD-44F1-830A-C8B48E391D03--
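
For anyone who wants to try the rule outside of xdisp.c, here is a
minimal standalone sketch of the decision the message describes. It is
not code from the patch. The CAT_* flags, struct toy_char, the toy_*
sample characters and can_break_between are all hypothetical names made
up for illustration, and the category assignments for the sample
characters are assumptions. The real patch queries CHAR_HAS_CATEGORY
and caches the result in a word-wrap text property. The sketch only
mirrors the mapping from categories to a wrap rule, and the check that
a break needs "wrap after" on the previous character and "wrap before"
on the current one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy bit flags standing in for the three category mnemonics.
   Hypothetical names; the patch uses CHAR_HAS_CATEGORY instead.  */
enum
{
  CAT_NOT_AT_EOL = 1 << 0,  /* '<': may not end a line */
  CAT_NOT_AT_BOL = 1 << 1,  /* '>': may not begin a line */
  CAT_BREAKABLE = 1 << 2    /* '|': line breakable */
};

enum wrap_rule { WRAP_ONLY_BEFORE, WRAP_ONLY_AFTER, WRAP_BOTH };

struct toy_char
{
  const char *label;
  unsigned cats;
  bool whitespace;
};

/* Assumed category assignments, for illustration only.  */
static const struct toy_char toy_ascii = { "ASCII letter", 0, false };
static const struct toy_char toy_space = { "space", 0, true };
static const struct toy_char toy_han = { "CJK ideograph", CAT_BREAKABLE, false };
static const struct toy_char toy_comma
  = { "CJK comma", CAT_BREAKABLE | CAT_NOT_AT_BOL, false };
static const struct toy_char toy_open
  = { "CJK open bracket", CAT_BREAKABLE | CAT_NOT_AT_EOL, false };

/* Same branch order as apply_wrap_property in the patch.  */
static enum wrap_rule
wrap_rule_for (unsigned cats)
{
  if (cats & CAT_BREAKABLE)
    {
      if (cats & CAT_NOT_AT_BOL)
        return WRAP_ONLY_AFTER;
      if (cats & CAT_NOT_AT_EOL)
        return WRAP_ONLY_BEFORE;
      return WRAP_BOTH;
    }
  /* Characters without '|' may begin a line but not end one.  */
  return WRAP_ONLY_BEFORE;
}

/* Whitespace never allows a break before it.  */
static bool
can_wrap_before (const struct toy_char *c)
{
  enum wrap_rule r = wrap_rule_for (c->cats);
  return !c->whitespace && (r == WRAP_BOTH || r == WRAP_ONLY_BEFORE);
}

/* Whitespace always allows a break after it.  */
static bool
can_wrap_after (const struct toy_char *c)
{
  enum wrap_rule r = wrap_rule_for (c->cats);
  return c->whitespace || r == WRAP_BOTH || r == WRAP_ONLY_AFTER;
}

/* A break between PREV and CUR needs both sides to agree.  */
static bool
can_break_between (const struct toy_char *prev, const struct toy_char *cur)
{
  return can_wrap_after (prev) && can_wrap_before (cur);
}

#ifdef WRAP_SKETCH_DEMO
int
main (void)
{
  const struct toy_char *chars[]
    = { &toy_ascii, &toy_space, &toy_han, &toy_comma, &toy_open };
  size_t n = sizeof chars / sizeof chars[0];

  /* Print the break decision for every ordered pair.  */
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      printf ("%-18s then %-18s: %s\n", chars[i]->label, chars[j]->label,
              can_break_between (chars[i], chars[j]) ? "break" : "no break");
  return 0;
}
#endif
```

Compiling with -DWRAP_SKETCH_DEMO builds a small program that prints
the decision for every ordered pair of sample characters. Keeping
"before" and "after" as separate predicates follows the patch, where
may_wrap records whether the previous character allowed wrapping after
it.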