From: Adam Tack <adam.tack.513@gmail.com>
To: 13399@debbugs.gnu.org
Subject: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 8 Dec 2017 01:02:08 +0000
Message-ID: <CAA+VxxHdj3795qbgTJV-EE_G+nC9-yLGvjs5KmQJMN4RE-RMAA@mail.gmail.com>
In-Reply-To: <50EE7BE5.2060806@gmx.at>
[-- Attachment #1: Type: text/plain, Size: 2760 bytes --]
I have a patch for the original issue of word-wrap not wrapping at a
zero-width space. The implementation uses a character table, and is
closely based on that written by Martin Rudalics
(https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli
Zaretskii's suggestions regarding Unicode.
The patch applies cleanly to the latest master, compiles on GNU+Linux
(Ubuntu Xenial) and appears to work: both of the following tests
wrap as expected at the zero-width space character. The first is
taken from this bug thread; the second, adapted from the first,
checks that Bug#11341 has not regressed:

(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (i 1000)
    (insert "1234\u200b")) ; each chunk ends with U+200B ZERO WIDTH SPACE
  (setq word-wrap t)
  (display-buffer "*foo*"))

(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (i 1000)
    (insert "1234\u200b")) ; each chunk ends with U+200B ZERO WIDTH SPACE
  (setq word-wrap t)
  (setq whitespace-display-mappings
        '((space-mark 32 [183] [46])
          (space-mark 160 [164] [95])
          (space-mark 8203 [164] [95])
          (newline-mark 10 [36 10])
          (tab-mark 9 [187 9] [92 9])))
  (whitespace-mode)
  (display-buffer "*bar*"))

Setting other word-wrap characters from Lisp with set-char-table-range
also works as expected in the simple situations that I tested.
However, this is my first foray into modifying a serious C codebase,
so I am not sure if I have done the right thing. In particular, I
have serious doubts about the second and third cases of
IT_DISPLAYING_WHITESPACE, especially since I don't really know when
they would be applicable:

   || ((STRINGP (it->string) \
        && !NILP (CHAR_TABLE_REF \
                  (Vword_wrap_chars, STRING_CHAR \
                   (SDATA (it->string) + IT_STRING_BYTEPOS (*it))))) \
       || (it->s && !NILP (CHAR_TABLE_REF \
                           (Vword_wrap_chars, \
                            STRING_CHAR(it->s + IT_BYTEPOS (*it))))) \

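
To illustrate why these cases now decode a whole character instead of comparing a single byte, here is a minimal standalone sketch in plain C. It is not Emacs code: decode_utf8 and wrap_char_p are hypothetical stand-ins for STRING_CHAR and for the CHAR_TABLE_REF lookup in Vword_wrap_chars, and the sketch assumes well-formed UTF-8 of at most three bytes per character. Unibyte strings, which the real macro may also encounter, are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for STRING_CHAR: decode the UTF-8 sequence
   starting at P.  Assumes well-formed input and handles sequences of
   one to three bytes only (enough for U+0000..U+FFFF).  */
static uint32_t
decode_utf8 (const unsigned char *p)
{
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((uint32_t) (p[0] & 0x1F) << 6) | (uint32_t) (p[1] & 0x3F);
  return ((uint32_t) (p[0] & 0x0F) << 12)
         | ((uint32_t) (p[1] & 0x3F) << 6)
         | (uint32_t) (p[2] & 0x3F);
}

/* Hypothetical stand-in for the char-table lookup, hard-coding the
   defaults used in the patch: TAB, space, and U+2000..U+200B.  */
static bool
wrap_char_p (uint32_t c)
{
  return c == '\t' || c == ' ' || (c >= 0x2000 && c <= 0x200B);
}

/* Old-style test: look at the single byte at BYTEPOS.  */
static bool
byte_is_wrap_point (const unsigned char *s, size_t bytepos)
{
  return s[bytepos] == ' ' || s[bytepos] == '\t';
}

/* New-style test: decode the character at BYTEPOS, then look it up.  */
static bool
char_is_wrap_point (const unsigned char *s, size_t bytepos)
{
  return wrap_char_p (decode_utf8 (s + bytepos));
}
```

For the bytes of "1234" followed by U+200B (encoded as E2 80 8B), byte_is_wrap_point at offset 4 is false because the byte there is 0xE2, whereas char_is_wrap_point at offset 4 is true.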
Additionally, I'm not certain whether syms_of_character in character.c
is the right location for the definition of the char-table, or whether
the default table should contain the range of characters U+2000 to
U+200B or just space and tab.
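
To make the two candidate defaults concrete, here is a small standalone sketch in plain C. It is only an illustration under stated assumptions: struct wrap_table, wrap_table_add and wrap_table_ref are hypothetical miniatures of a char-table, of Fset_char_table_range and of CHAR_TABLE_REF, not the Emacs implementations. The decimal values 9, 32, 8192 and 8203 are the ones used in the attached patch (8192 is U+2000 and 8203 is U+200B).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a char-table: a short list of inclusive
   code point ranges whose members count as wrap characters.  */
struct wrap_range { uint32_t from, to; };

struct wrap_table
{
  struct wrap_range ranges[8];
  size_t count;
};

/* Rough analogue of setting a range to t.  Returns false when the
   fixed-size table is already full.  */
static bool
wrap_table_add (struct wrap_table *t, uint32_t from, uint32_t to)
{
  if (t->count == sizeof t->ranges / sizeof t->ranges[0])
    return false;
  t->ranges[t->count].from = from;
  t->ranges[t->count].to = to;
  t->count++;
  return true;
}

/* Rough analogue of a non-nil lookup: true if C lies in any range.  */
static bool
wrap_table_ref (const struct wrap_table *t, uint32_t c)
{
  for (size_t i = 0; i < t->count; i++)
    if (c >= t->ranges[i].from && c <= t->ranges[i].to)
      return true;
  return false;
}

/* Defaults as in the attached patch: TAB, space, U+2000..U+200B.  */
static struct wrap_table
patch_defaults (void)
{
  struct wrap_table t = { .count = 0 };
  wrap_table_add (&t, 9, 9);
  wrap_table_add (&t, 32, 32);
  wrap_table_add (&t, 8192, 8203);
  return t;
}

/* Conservative alternative: only TAB and space, as before the patch.  */
static struct wrap_table
minimal_defaults (void)
{
  struct wrap_table t = { .count = 0 };
  wrap_table_add (&t, 9, 9);
  wrap_table_add (&t, 32, 32);
  return t;
}
```

With patch_defaults, U+200B is a wrap character out of the box; with minimal_defaults it is not, and a user would have to add it from Lisp. In both variants U+00A0 (no-break space) stays outside the table.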
I am aware that if this were to be accepted, I would also need to
update etc/NEWS, and probably the docstring of `word-wrap' and the
relevant section of the Texinfo manual.
I have not yet filled out a copyright assignment form, though I will
do so if this patch (modulo changes) is considered acceptable.
Thanks!
[-- Attachment #2: word_wrap_char_table.diff --]
[-- Type: text/plain, Size: 2525 bytes --]
diff --git a/src/character.c b/src/character.c
index c8ffa2b..6e7f55a 100644
--- a/src/character.c
+++ b/src/character.c
@@ -1145,4 +1145,15 @@ All Unicode characters have one of the following values (symbol):
See The Unicode Standard for the meaning of those values. */);
/* The correct char-table is setup in characters.el. */
Vunicode_category_table = Qnil;
+
+ DEFVAR_LISP ("word-wrap-chars", Vword_wrap_chars,
+ doc: /* A char-table for characters at which word-wrap occurs.
+Such characters have value t in this table.
+By default these are the whitespace characters. */);
+ Vword_wrap_chars = Fmake_char_table (Qnil, Qnil);
+ Fset_char_table_range (Vword_wrap_chars, make_number (9), Qt);
+ Fset_char_table_range (Vword_wrap_chars, make_number (32), Qt);
+ Fset_char_table_range (Vword_wrap_chars,
+ Fcons (make_number (8192),
+ make_number (8203)), Qt);
}
diff --git a/src/xdisp.c b/src/xdisp.c
index 7e47c06..7152220 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -348,20 +348,23 @@ static Lisp_Object list_of_error;
#endif /* HAVE_WINDOW_SYSTEM */
/* Test if the display element loaded in IT, or the underlying buffer
- or string character, is a space or a TAB character. This is used
- to determine where word wrapping can occur. */
+ or string character, belongs to the word-wrap-chars char-table.
+ This is used to determine where word wrapping can occur. */
#define IT_DISPLAYING_WHITESPACE(it) \
- ((it->what == IT_CHARACTER && (it->c == ' ' || it->c == '\t')) \
+ ((it->what == IT_CHARACTER \
+ && !NILP (CHAR_TABLE_REF (Vword_wrap_chars, it->c))) \
|| ((STRINGP (it->string) \
- && (SREF (it->string, IT_STRING_BYTEPOS (*it)) == ' ' \
- || SREF (it->string, IT_STRING_BYTEPOS (*it)) == '\t')) \
- || (it->s \
- && (it->s[IT_BYTEPOS (*it)] == ' ' \
- || it->s[IT_BYTEPOS (*it)] == '\t')) \
+ && !NILP (CHAR_TABLE_REF \
+ (Vword_wrap_chars, STRING_CHAR \
+ (SDATA (it->string) + IT_STRING_BYTEPOS (*it))))) \
+ || (it->s && !NILP (CHAR_TABLE_REF \
+ (Vword_wrap_chars, \
+ STRING_CHAR(it->s + IT_BYTEPOS (*it))))) \
|| (IT_BYTEPOS (*it) < ZV_BYTE \
- && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' ' \
- || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t')))) \
+ && !NILP (CHAR_TABLE_REF \
+ (Vword_wrap_chars, \
+ (FETCH_CHAR(IT_BYTEPOS (*it)))))))) \
/* True means print newline to stdout before next mini-buffer message. */