> On Jul 18, 2020, at 4:15 AM, Eli Zaretskii wrote:
> 
>> From: Yuan Fu
>> Date: Mon, 13 Jul 2020 15:46:16 -0400
>> Cc: Lars Ingebrigtsen ,
>> emacs-devel@gnu.org
>> 
>> Please have a look at the patch and see if it’s ok. If you think it’s good I can then update NEWS and the manual and submit a bug report. wrap.txt is the file I used to test word wrapping. To enable the full feature, set cjk-word-wrap to t and load kinsoku.el.
> 
> Yes, we need to update NEWS and the manual.
> 
> Also, we may need to rename cjk-word-wrap to something more accurate,
> as result of your answers to my questions below.

Cool, I’ll start on NEWS and manual once we are settled on the name of the new variable. I agree cjk-word-wrap isn’t a good name. I just used it as a placeholder.

> 
> A few minor comments below.
> 
>> * src/xdisp.c (it_char_has_category, char_can_wrap_before,
>> char_can_wrap_after): New function.
>                        ^^^^^^^^^^^^
> "New functions", in plural.
> 
>> (move_it_in_display_line_to, display_line): Replace
>> IT_DISPLAYING_WHITESPACE with char_can_wrap_before and
>> char_can_wrap_after.
> 
> Please quote all references in commit log messages to functions and
> variables 'like this'.
> 
>> +/* These are the category sets we use. */
>> +#define NOT_AT_EOL 60 /* < */
>> +#define NOT_AT_BOL 62 /* > */
>> +#define LINE_BREAKABLE 124 /* | */
> 
> Why not just use the characters themselves, as in '<' and '|' ?
> 
> Also, if these characters are from kinsoku.el, please says so in
> comments, because if kinsoku.el changes, we may need to update those.
> 

Fixed.

>> +static bool it_char_has_category(struct it *it, int cat)
>> +{
>> +  if (it->what == IT_CHARACTER)
>> +    return CHAR_HAS_CATEGORY (it->c, cat);
>> +  else if (STRINGP (it->string))
>> +    return CHAR_HAS_CATEGORY (SREF (it->string,
>> +                                    IT_STRING_BYTEPOS (*it)), cat);
>> +  else if (it->s)
>> +    return CHAR_HAS_CATEGORY (it->s[IT_BYTEPOS (*it)], cat);
>> +  else if (IT_BYTEPOS (*it) < ZV_BYTE)
>> +    return CHAR_HAS_CATEGORY (*BYTE_POS_ADDR (IT_BYTEPOS (*it)), cat);
>> +  else
>> +    return false;
>> +}
> 
> A minor stylistic nit: I'd prefer the if - elseif clauses to yield the
> relevant character, and then apply CHAR_HAS_CATEGORY only once to that
> character at the end. (It is generally better to have only one return
> point from a function, especially when the function is short. If
> nothing else, it makes debugging easier.)

I changed it; do you think the code below is OK?

  if (ch == 0)
    return false;
  else
    return CHAR_HAS_CATEGORY(ch, cat);

> 
>> +  return (!IT_DISPLAYING_WHITESPACE (it)
>> +          // Can be at BOL.
> 
> Please don't use //-style C++ comments, we use the C /* style */
> comments instead.
> 
>> +  return (IT_DISPLAYING_WHITESPACE (it)
>> +          // Can break after && can be at EOL.
>> +          || (it_char_has_category (it, LINE_BREAKABLE)
>> +              && !it_char_has_category (it, not_at_eol)));
> 
> Same here.

Fixed.

> 
>>    if (it->line_wrap == WORD_WRAP && it->area == TEXT_AREA)
>>      {
>> -      if (IT_DISPLAYING_WHITESPACE (it))
>> -        may_wrap = true;
>> -      else if (may_wrap)
>> +      /* Can we wrap here? */
>> +      if (may_wrap && char_can_wrap_before (it))
> 
> I'm worried about a potential change in logic here, when cjk-word-wrap
> is off. Previously, we just tested IT_DISPLAYING_WHITESPACE, but now
> we also test may_wrap. Is it guaranteed that may_wrap is always true
> in that case?
> 
>> @@ -23292,9 +23365,8 @@ #define RECORD_MAX_MIN_POS(IT) \
>> 
>>    if (it->line_wrap == WORD_WRAP && it->area == TEXT_AREA)
>>      {
>> -      if (IT_DISPLAYING_WHITESPACE (it))
>> -        may_wrap = true;
>> -      else if (may_wrap)
>> +      /* Can we wrap here? */
>> +      if (may_wrap && char_can_wrap_before (it))
> 
> Likewise here.

In both char_can_wrap_before and char_can_wrap_after, I have a short circuit for the case when cjk-word-wrap is nil:

  if (!Vcjk_word_wrap)
    return IT_DISPLAYING_WHITESPACE (it);

That should guarantee the old behavior when cjk-word-wrap is nil, if that’s what you are asking about.

> 
>>          {
>>            SAVE_IT (wrap_it, *it, wrap_data);
>>            wrap_x = x;
>> @@ -23308,9 +23380,13 @@ #define RECORD_MAX_MIN_POS(IT) \
>>            wrap_row_min_bpos = min_bpos;
>>            wrap_row_max_pos = max_pos;
>>            wrap_row_max_bpos = max_bpos;
>> -          may_wrap = false;
>>          }
>> -    }
>> +      /* This has to run after the previous block. */
>> +      if (char_can_wrap_after (it))
>> +        may_wrap = true;
>> +      else
>> +        may_wrap = false;
> 
> Please use TABs and spaces to indent code in C source files. The last
> 2 lines use only spaces.

Sorry, fixed.

> 
>> +  DEFVAR_BOOL("cjk-word-wrap", Vcjk_word_wrap,
>> +    doc: /* Non-nil means wrap after CJK chracters.
> 
> This is unclear. Does it mean after _any_ CJK character, or just
> after some? And if the latter, which ones?

I added more detail; hopefully it is clearer now.

> 
> Thanks.

Thanks!

Yuan
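
For reference, the question about may_wrap can be checked outside Emacs with a small standalone model. What follows is only a sketch under stated assumptions, not the xdisp.c code: the iterator is reduced to an array of code points, the category tests are toy stand-ins for the kinsoku.el categories, and every name here other than the ones quoted in the thread is hypothetical. It also assumes that, with the option off, the "before" predicate falls back to the negated whitespace test, since that is the fallback that reproduces the old if / else-if logic step for step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the user option from the patch.  */
static bool cjk_word_wrap = false;

static bool
is_whitespace (int c)
{
  return c == ' ' || c == '\t';
}

/* Toy stand-ins for the kinsoku.el categories.  The code-point ranges
   are illustrative assumptions, not the real category tables.  */
static bool
line_breakable (int c)          /* category '|' */
{
  return (c >= 0x3000 && c <= 0x303F) || (c >= 0x4E00 && c <= 0x9FFF);
}

static bool
not_at_bol (int c)              /* category '>' */
{
  return c == 0x3002;           /* IDEOGRAPHIC FULL STOP */
}

static bool
not_at_eol (int c)              /* category '<' */
{
  return c == 0x300C;           /* LEFT CORNER BRACKET */
}

static bool
can_wrap_before (int c)
{
  if (!cjk_word_wrap)
    return !is_whitespace (c);  /* assumed fallback, see lead-in */
  return !is_whitespace (c) && !not_at_bol (c);
}

static bool
can_wrap_after (int c)
{
  if (!cjk_word_wrap)
    return is_whitespace (c);
  return is_whitespace (c) || (line_breakable (c) && !not_at_eol (c));
}

/* Old loop body: store in OUT the indices saved as wrap points and
   return how many there are.  */
static size_t
wrap_points_old (const int *text, size_t n, size_t *out)
{
  size_t count = 0;
  bool may_wrap = false;

  assert (text != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (is_whitespace (text[i]))
        may_wrap = true;
      else if (may_wrap)
        {
          out[count++] = i;
          may_wrap = false;
        }
    }
  return count;
}

/* New loop body: same interface, using the two predicates.  */
static size_t
wrap_points_new (const int *text, size_t n, size_t *out)
{
  size_t count = 0;
  bool may_wrap = false;

  assert (text != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (may_wrap && can_wrap_before (text[i]))
        out[count++] = i;
      /* This has to run after the previous block.  */
      may_wrap = can_wrap_after (text[i]);
    }
  return count;
}
```

With cjk_word_wrap off, both functions should report the same wrap points for any input, because may_wrap ends each iteration equal to "the current character is whitespace" in either version. With it on, the model additionally allows a break between two ideographs but not before a character that may not start a line, nor after one that may not end a line.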