From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: Line wrap reconsidered Date: Sat, 18 Jul 2020 11:15:33 +0300 Message-ID: <838sfhi6hm.fsf@gnu.org> References: <92FF4412-04FB-4521-B6CE-52B08526E4E5@gmail.com> <878shfsq35.fsf@gnus.org> <83imgivjak.fsf@gnu.org> <83lfletr03.fsf@gnu.org> <4895C6EE-5E1F-44BF-93C1-CC5F7C096F73@gmail.com> <9766BA3D-B8F9-456B-9F59-60D21B86E390@gmail.com> <83sgfls2ul.fsf@gnu.org> <83v9kgq6jy.fsf@gnu.org> <5E75D1E2-8BFF-45DA-A643-40DBD5784508@gmail.com> <83r1v3qlel.fsf@gnu.org> <83blm6lzj3.fsf@gnu.org> <83pnakj8fs.fsf@gnu.org> <83k10sj60l.fsf@gnu.org> <0B30F8C8-9B8F-4FCB-B9FB-1B5A0E993CDB@gmail.com> <838sgjzij2.fsf@gnu.org> <83sgerxmbs.fsf@gnu.org> <83bllfx80g.fsf@gnu.org> <2F9680C4-11D8-4092-A485-2590AAF62CC9@gmail.com> <837dw2ykeb.fsf@gnu.org> <43C485E7-A9E7-448A-B1EC-9085F83670E9@gmail.com> <83365woafv.fsf@gnu.org> <214EBF50-8BEE-4935-9DE9-526E82F7D85C@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="36600"; mail-complaints-to="usenet@ciao.gmane.io" Cc: larsi@gnus.org, emacs-devel@gnu.org To: Yuan Fu Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sat Jul 18 10:16:22 2020 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jwi18-0009Qe-5l for ged-emacs-devel@m.gmane-mx.org; Sat, 18 Jul 2020 10:16:22 +0200 Original-Received: from localhost ([::1]:53652 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jwi17-00033l-4g for ged-emacs-devel@m.gmane-mx.org; Sat, 18 Jul 2020 04:16:21 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:35298) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jwi0Y-0002bj-F7 for emacs-devel@gnu.org; Sat, 18 Jul 2020 04:15:46 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:47482) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jwi0Y-0004ve-2g; Sat, 18 Jul 2020 04:15:46 -0400 Original-Received: from [176.228.60.248] (port=1683 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jwi0W-0006EF-S8; Sat, 18 Jul 2020 04:15:45 -0400 In-Reply-To: <214EBF50-8BEE-4935-9DE9-526E82F7D85C@gmail.com> (message from Yuan Fu on Mon, 13 Jul 2020 15:46:16 -0400) X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:253067 Archived-At: > From: Yuan Fu > Date: Mon, 13 Jul 2020 15:46:16 -0400 > Cc: Lars Ingebrigtsen , > emacs-devel@gnu.org > > Please have a look at the patch and see if it’s ok. If you think it’s good I can then update NEWS and the manual and submit a bug report. wrap.txt is the file I used to test word wrapping. To enable the full feature, set cjk-word-wrap to t and load kinsoku.el. Yes, we need to update NEWS and the manual. Also, we may need to rename cjk-word-wrap to something more accurate, as result of your answers to my questions below. A few minor comments below. > * src/xdisp.c (it_char_has_category, char_can_wrap_before, > char_can_wrap_after): New function. ^^^^^^^^^^^^ "New functions", in plural. > (move_it_in_display_line_to, display_line): Replace > IT_DISPLAYING_WHITESPACE with char_can_wrap_before and > char_can_wrap_after. Please quote all references in commit log messages to functions and variables 'like this'. > +/* These are the category sets we use. */ > +#define NOT_AT_EOL 60 /* < */ > +#define NOT_AT_BOL 62 /* > */ > +#define LINE_BREAKABLE 124 /* | */ Why not just use the characters themselves, as in '<' and '|' ? Also, if these characters are from kinsoku.el, please says so in comments, because if kinsoku.el changes, we may need to update those. > +static bool it_char_has_category(struct it *it, int cat) > +{ > + if (it->what == IT_CHARACTER) > + return CHAR_HAS_CATEGORY (it->c, cat); > + else if (STRINGP (it->string)) > + return CHAR_HAS_CATEGORY (SREF (it->string, > + IT_STRING_BYTEPOS (*it)), cat); > + else if (it->s) > + return CHAR_HAS_CATEGORY (it->s[IT_BYTEPOS (*it)], cat); > + else if (IT_BYTEPOS (*it) < ZV_BYTE) > + return CHAR_HAS_CATEGORY (*BYTE_POS_ADDR (IT_BYTEPOS (*it)), cat); > + else > + return false; > +} A minor stylistic nit: I'd prefer the if - elseif clauses to yield the relevant character, and then apply CHAR_HAS_CATEGORY only once to that character at the end. (It is generally better to have only one return point from a function, especially when the function is short. If nothing else, it makes debugging easier.) > + return (!IT_DISPLAYING_WHITESPACE (it) > + // Can be at BOL. Please don't use //-style C++ comments, we use the C /* style */ comments instead. > + return (IT_DISPLAYING_WHITESPACE (it) > + // Can break after && can be at EOL. > + || (it_char_has_category (it, LINE_BREAKABLE) > + && !it_char_has_category (it, not_at_eol))); Same here. > if (it->line_wrap == WORD_WRAP && it->area == TEXT_AREA) > { > - if (IT_DISPLAYING_WHITESPACE (it)) > - may_wrap = true; > - else if (may_wrap) > + /* Can we wrap here? */ > + if (may_wrap && char_can_wrap_before (it)) I'm worried about a potential change in logic here, when cjk-word-wrap is off. Previously, we just tested IT_DISPLAYING_WHITESPACE, but now we also test may_wrap. Is it guaranteed that may_wrap is always true in that case? > @@ -23292,9 +23365,8 @@ #define RECORD_MAX_MIN_POS(IT) \ > > if (it->line_wrap == WORD_WRAP && it->area == TEXT_AREA) > { > - if (IT_DISPLAYING_WHITESPACE (it)) > - may_wrap = true; > - else if (may_wrap) > + /* Can we wrap here? */ > + if (may_wrap && char_can_wrap_before (it)) Likewise here. > { > SAVE_IT (wrap_it, *it, wrap_data); > wrap_x = x; > @@ -23308,9 +23380,13 @@ #define RECORD_MAX_MIN_POS(IT) \ > wrap_row_min_bpos = min_bpos; > wrap_row_max_pos = max_pos; > wrap_row_max_bpos = max_bpos; > - may_wrap = false; > } > - } > + /* This has to run after the previous block. */ > + if (char_can_wrap_after (it)) > + may_wrap = true; > + else > + may_wrap = false; Please use TABs and spaces to indent code in C source files. The last 2 lines use only spaces. > + DEFVAR_BOOL("cjk-word-wrap", Vcjk_word_wrap, > + doc: /* Non-nil means wrap after CJK chracters. This is unclear. Does it mean after _any_ CJK character, or just after some? And if the latter, which ones? Thanks.