From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Adam Tack Newsgroups: gmane.emacs.bugs Subject: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B Date: Fri, 8 Dec 2017 01:02:08 +0000 Message-ID: References: <50EE7BE5.2060806@gmx.at> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="001a11370fe8ea2677055fc9bbd2" X-Trace: blaine.gmane.org 1512700700 32462 195.159.176.226 (8 Dec 2017 02:38:20 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 8 Dec 2017 02:38:20 +0000 (UTC) To: 13399@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Fri Dec 08 03:38:11 2017 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eN8YE-00085y-Ve for geb-bug-gnu-emacs@m.gmane.org; Fri, 08 Dec 2017 03:38:11 +0100 Original-Received: from localhost ([::1]:35305 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eN8YF-0007WI-Ue for geb-bug-gnu-emacs@m.gmane.org; Thu, 07 Dec 2017 21:38:11 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:39181) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eN8Y9-0007WA-K2 for bug-gnu-emacs@gnu.org; Thu, 07 Dec 2017 21:38:06 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eN8Y6-0003VQ-FY for bug-gnu-emacs@gnu.org; Thu, 07 Dec 2017 21:38:05 -0500 Original-Received: from debbugs.gnu.org ([208.118.235.43]:42541) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eN8Y6-0003V9-9k for bug-gnu-emacs@gnu.org; Thu, 07 Dec 2017 21:38:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1eN8Y5-00054M-UY for bug-gnu-emacs@gnu.org; Thu, 07 Dec 2017 21:38:01 -0500 X-Loop: 
help-debbugs@gnu.org In-Reply-To: <50EE7BE5.2060806@gmx.at> Resent-From: Adam Tack Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 08 Dec 2017 02:38:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 13399 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 13399-submit@debbugs.gnu.org id=B13399.151270066719466 (code B ref 13399); Fri, 08 Dec 2017 02:38:01 +0000 Original-Received: (at 13399) by debbugs.gnu.org; 8 Dec 2017 02:37:47 +0000 Original-Received: from localhost ([127.0.0.1]:51222 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eN8Xr-00053t-34 for submit@debbugs.gnu.org; Thu, 07 Dec 2017 21:37:47 -0500 Original-Received: from mail-ot0-f179.google.com ([74.125.82.179]:44243) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eN73P-0005Ty-6b for 13399@debbugs.gnu.org; Thu, 07 Dec 2017 20:02:16 -0500 Original-Received: by mail-ot0-f179.google.com with SMTP id d27so7975759ote.11 for <13399@debbugs.gnu.org>; Thu, 07 Dec 2017 17:02:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=+b0iCIbG3itir6CUgI0WACppNWZFwloXR18Olax6+FQ=; b=RjVY10isL+MT1nnnJTHI/FoD2dsOLdWKBPdP84axWHiq5VXPBewuaMbq33Sdn9T5HG f7Ih/CWLNvDvkICn9OLE9JdIRLv3hnZ2N4lcmpEuyNWGoCps2Icn3p1Q1e3GnDYgFOEA LlEIbnWNyiqXaAkqmzS+0hUu55nFs+wHOkMySCc4+1Ma4OJTp9JBAMTLp6gaWiRUyN1T 1OXqj7BdKVCD6HgpXckmhF+N/MldxCPEpFhSHRQfsvT6urQf/KQBNAW3TwcoeyXIMeAR u/eCZRWW1AXpgWogInVQAhQBTetYuqKRFmqx/3zz6vBr+VhY89LL2JdyXl/X4Myv9FBL LKZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=+b0iCIbG3itir6CUgI0WACppNWZFwloXR18Olax6+FQ=; b=LB2ZNk0dWQCx1URdbPTKK2K3eTqcmP1LsvIVXlerntEjk4Xe0Ae+zyvRRvl50WOhCV tiEN6fEZPh/lu0fE9QV+ZgjjmGOCBSznTZHE18GPPsU32XblvlRgrqxfqRdZhloMKFkX 
c/9cIF7bRAChWI3JuMq3AHAStGXnS7HS0XiZtn3H9oj6S3cKfP2ZiJbPcHJZswWP6OqD ZbJGsKvRGWP3Om+t9jxsT+pK7r2bjA+uTrTW13L9EFjZNQokA9LAndlC1RSpkw1DRffT iGcwXuHze+5ndGkzj4CLjXhwxA1PuCqwhKN1EeB7t2BDJ9qTsdrUr8D6cTjAysO0tKfz u2wg== X-Gm-Message-State: AKGB3mL1Oq2TZwT1L18gNaQAZbHc7F4aO6mdOMvaDBv8jFTGC09ptW5q OUcTBMOKFw4N57Ng58EHj1Wvusb4eKgU1rnhDEUyyA== X-Google-Smtp-Source: AGs4zMZfkO+G66v+Pyf1//Veb9R6801aYWtq0Qith2KtWmZD9B/cVSyVAGwYLRzn6+dNo/OSxL+aJjky1rjJd6lueHI= X-Received: by 10.157.34.37 with SMTP id o34mr6099606ota.237.1512694929287; Thu, 07 Dec 2017 17:02:09 -0800 (PST) Original-Received: by 10.74.140.100 with HTTP; Thu, 7 Dec 2017 17:02:08 -0800 (PST) X-Mailman-Approved-At: Thu, 07 Dec 2017 21:37:46 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:140799 Archived-At: --001a11370fe8ea2677055fc9bbd2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

I have a patch for the original issue of word-wrap not wrapping at a zero-width space. The implementation uses a character table, and is closely based on the one written by Martin Rudalics (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli Zaretskii's suggestions regarding Unicode.
The patch applies cleanly to the latest master, compiles on GNU+Linux (Ubuntu Xenial), and appears to work: both of the following tests result in the expected wrapping at the zero-width space character. The first is taken from this bug thread; the second, adapted from the first, checks that there is no regression of Bug#11341:

    (with-current-buffer (get-buffer-create "*foo*")
      (dotimes (i 1000)
        (insert "1234\u200b")) ; U-200B
      (setq word-wrap t)
      (display-buffer "*foo*"))

    (with-current-buffer (get-buffer-create "*bar*")
      (dotimes (i 1000)
        (insert "1234\u200b")) ; U-200B
      (setq word-wrap t)
      (setq whitespace-display-mappings
            '((space-mark 32 [183] [46])
              (space-mark 160 [164] [95])
              (space-mark 8203 [164] [95])
              (newline-mark 10 [36 10])
              (tab-mark 9 [187 9] [92 9])))
      (whitespace-mode)
      (display-buffer "*bar*"))

Setting other word-wrap characters from Lisp with set-char-table-range also works as expected in the simple situations that I tested.

However, this is my first foray into modifying a serious C codebase, so I am not sure if I have done the right thing. In particular, I have serious doubts about the second and third cases of IT_DISPLAYING_WHITESPACE, especially since I don't really know when they would be applicable:

       || ((STRINGP (it->string) \
            && !NILP (CHAR_TABLE_REF \
                      (Vword_wrap_chars, STRING_CHAR \
                       (SDATA (it->string) + IT_STRING_BYTEPOS (*it))))) \
           || (it->s && !NILP (CHAR_TABLE_REF \
                               (Vword_wrap_chars, \
                                STRING_CHAR(it->s + IT_BYTEPOS (*it))))) \

Additionally, I'm not certain whether syms_of_character in character.c is the right location for the definition of the char-table, or whether the table should contain the range of characters U+2000 to U+200B by default, or just space and tab.

I am aware that if this were to be accepted, I would also need to make a change to etc/NEWS, probably the docstring of `word-wrap', and somewhere in the Texinfo manual.
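For readers following the IT_DISPLAYING_WHITESPACE discussion above, here is a minimal standalone sketch of the idea. It is not code from the patch and does not use any Emacs internals. It assumes plain UTF-8 input, and every name in it (cp_range, wrap_ranges, wrap_char_p, decode_utf8, old_wrap_test, new_wrap_test) is hypothetical. The range list only mirrors the defaults the patch installs in word-wrap-chars (TAB, space, and U+2000 to U+200B). The sketch contrasts a byte-wise comparison against space and TAB with a test that first decodes the whole character and then consults a table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the word-wrap-chars char-table: a list of
   inclusive code point ranges.  */
struct cp_range { uint32_t first, last; };

static const struct cp_range wrap_ranges[] = {
  { 0x0009, 0x0009 },   /* TAB */
  { 0x0020, 0x0020 },   /* space */
  { 0x2000, 0x200B },   /* EN QUAD .. ZERO WIDTH SPACE */
};

/* Return true if code point C falls inside one of the ranges.  */
bool
wrap_char_p (uint32_t c)
{
  for (size_t i = 0; i < sizeof wrap_ranges / sizeof wrap_ranges[0]; i++)
    if (c >= wrap_ranges[i].first && c <= wrap_ranges[i].last)
      return true;
  return false;
}

/* Decode the UTF-8 sequence starting at P.  Assumes valid input of
   one to three bytes, which is enough for U+0000..U+FFFF.  */
uint32_t
decode_utf8 (const unsigned char *p)
{
  assert (p != NULL);
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((uint32_t) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
  assert ((p[0] & 0xF0) == 0xE0);
  return ((uint32_t) (p[0] & 0x0F) << 12)
         | ((uint32_t) (p[1] & 0x3F) << 6)
         | (p[2] & 0x3F);
}

/* Byte-wise test: looks at a single byte, so it can only ever match
   ASCII space and TAB.  */
bool
old_wrap_test (const unsigned char *p)
{
  return *p == ' ' || *p == '\t';
}

/* Table-based test: decodes the full character, then looks it up.  */
bool
new_wrap_test (const unsigned char *p)
{
  return wrap_char_p (decode_utf8 (p));
}
```

In this sketch, the bytes E2 80 8B (the UTF-8 encoding of U+200B) are rejected by old_wrap_test, because the first byte is neither a space nor a TAB, and accepted by new_wrap_test, because the decoded code point lies in the last range.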
I have not yet filled out a copyright assignment form, though I will do so if this patch (modulo changes) is considered acceptable. Thanks! --001a11370fe8ea2677055fc9bbd2 Content-Type: text/plain; charset="US-ASCII"; name="word_wrap_char_table.diff" Content-Disposition: attachment; filename="word_wrap_char_table.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_jax7e7ne0 ZGlmZiAtLWdpdCBhL3NyYy9jaGFyYWN0ZXIuYyBiL3NyYy9jaGFyYWN0ZXIuYwppbmRleCBjOGZm YTJiLi42ZTdmNTVhIDEwMDY0NAotLS0gYS9zcmMvY2hhcmFjdGVyLmMKKysrIGIvc3JjL2NoYXJh Y3Rlci5jCkBAIC0xMTQ1LDQgKzExNDUsMTUgQEAgQWxsIFVuaWNvZGUgY2hhcmFjdGVycyBoYXZl IG9uZSBvZiB0aGUgZm9sbG93aW5nIHZhbHVlcyAoc3ltYm9sKToKIFNlZSBUaGUgVW5pY29kZSBT dGFuZGFyZCBmb3IgdGhlIG1lYW5pbmcgb2YgdGhvc2UgdmFsdWVzLiAgKi8pOwogICAvKiBUaGUg Y29ycmVjdCBjaGFyLXRhYmxlIGlzIHNldHVwIGluIGNoYXJhY3RlcnMuZWwuICAqLwogICBWdW5p Y29kZV9jYXRlZ29yeV90YWJsZSA9IFFuaWw7CisKKyAgREVGVkFSX0xJU1AgKCJ3b3JkLXdyYXAt Y2hhcnMiLCBWd29yZF93cmFwX2NoYXJzLAorCSAgICAgICBkb2M6IC8qIEEgY2hhci10YWJsZSBm b3IgY2hhcmFjdGVycyBhdCB3aGljaCB3b3JkLXdyYXAgb2NjdXJzLgorU3VjaCBjaGFyYWN0ZXJz IGhhdmUgdmFsdWUgdCBpbiB0aGlzIHRhYmxlLgorQnkgZGVmYXVsdCB0aGVzZSBhcmUgdGhlIHdo aXRlc3BhY2UgY2hhcmFjdGVycy4gKi8pOworICBWd29yZF93cmFwX2NoYXJzID0gRm1ha2VfY2hh cl90YWJsZSAoUW5pbCwgUW5pbCk7CisgIEZzZXRfY2hhcl90YWJsZV9yYW5nZSAoVndvcmRfd3Jh cF9jaGFycywgbWFrZV9udW1iZXIgKDkpLCBRdCk7CisgIEZzZXRfY2hhcl90YWJsZV9yYW5nZSAo VndvcmRfd3JhcF9jaGFycywgbWFrZV9udW1iZXIgKDMyKSwgUXQpOworICBGc2V0X2NoYXJfdGFi bGVfcmFuZ2UgKFZ3b3JkX3dyYXBfY2hhcnMsCisJCQkgRmNvbnMgKG1ha2VfbnVtYmVyICg4MTky KSwKKwkJCQltYWtlX251bWJlciAoODIwMykpLCBRdCk7CiB9CmRpZmYgLS1naXQgYS9zcmMveGRp c3AuYyBiL3NyYy94ZGlzcC5jCmluZGV4IDdlNDdjMDYuLjcxNTIyMjAgMTAwNjQ0Ci0tLSBhL3Ny Yy94ZGlzcC5jCisrKyBiL3NyYy94ZGlzcC5jCkBAIC0zNDgsMjAgKzM0OCwyMyBAQCBzdGF0aWMg TGlzcF9PYmplY3QgbGlzdF9vZl9lcnJvcjsKICNlbmRpZiAvKiBIQVZFX1dJTkRPV19TWVNURU0g Ki8KIAogLyogVGVzdCBpZiB0aGUgZGlzcGxheSBlbGVtZW50IGxvYWRlZCBpbiBJVCwgb3IgdGhl IHVuZGVybHlpbmcgYnVmZmVyCi0gICBvciBzdHJpbmcgY2hhcmFjdGVyLCBpcyBhIHNwYWNlIG9y 
IGEgVEFCIGNoYXJhY3Rlci4gIFRoaXMgaXMgdXNlZAotICAgdG8gZGV0ZXJtaW5lIHdoZXJlIHdv cmQgd3JhcHBpbmcgY2FuIG9jY3VyLiAgKi8KKyAgIG9yIHN0cmluZyBjaGFyYWN0ZXIsIGJlbG9u Z3MgdG8gdGhlIHdvcmQtd3JhcC1jaGFycyBjaGFyLXRhYmxlLgorICAgVGhpcyBpcyB1c2VkIHRv IGRldGVybWluZSB3aGVyZSB3b3JkIHdyYXBwaW5nIGNhbiBvY2N1ci4gICovCiAKICNkZWZpbmUg SVRfRElTUExBWUlOR19XSElURVNQQUNFKGl0KQkJCQkJXAotICAoKGl0LT53aGF0ID09IElUX0NI QVJBQ1RFUiAmJiAoaXQtPmMgPT0gJyAnIHx8IGl0LT5jID09ICdcdCcpKQlcCisgICgoaXQtPndo YXQgPT0gSVRfQ0hBUkFDVEVSCQkJCQkJXAorICAgICYmICFOSUxQIChDSEFSX1RBQkxFX1JFRiAo VndvcmRfd3JhcF9jaGFycywgaXQtPmMpKSkJCVwKICAgIHx8ICgoU1RSSU5HUCAoaXQtPnN0cmlu ZykJCQkJCQlcCi0JJiYgKFNSRUYgKGl0LT5zdHJpbmcsIElUX1NUUklOR19CWVRFUE9TICgqaXQp KSA9PSAnICcJCVwKLQkgICAgfHwgU1JFRiAoaXQtPnN0cmluZywgSVRfU1RSSU5HX0JZVEVQT1Mg KCppdCkpID09ICdcdCcpKQlcCi0gICAgICAgfHwgKGl0LT5zCQkJCQkJCVwKLQkgICAmJiAoaXQt PnNbSVRfQllURVBPUyAoKml0KV0gPT0gJyAnCQkJCVwKLQkgICAgICAgfHwgaXQtPnNbSVRfQllU RVBPUyAoKml0KV0gPT0gJ1x0JykpCQkJXAorCSYmICFOSUxQIChDSEFSX1RBQkxFX1JFRgkJCQkJ XAorCQkgIChWd29yZF93cmFwX2NoYXJzLCBTVFJJTkdfQ0hBUgkJCVwKKwkJICAgKFNEQVRBIChp dC0+c3RyaW5nKSArIElUX1NUUklOR19CWVRFUE9TICgqaXQpKSkpKQlcCisgICAgICAgfHwgKGl0 LT5zICYmICFOSUxQIChDSEFSX1RBQkxFX1JFRgkJCQlcCisJCQkgICAoVndvcmRfd3JhcF9jaGFy cywJCQkJXAorCQkJICAgIFNUUklOR19DSEFSKGl0LT5zICsgSVRfQllURVBPUyAoKml0KSkpKSkJ XAogICAgICAgIHx8IChJVF9CWVRFUE9TICgqaXQpIDwgWlZfQllURQkJCQkJXAotCSAgICYmICgq QllURV9QT1NfQUREUiAoSVRfQllURVBPUyAoKml0KSkgPT0gJyAnCQkJXAotCSAgICAgICB8fCAq QllURV9QT1NfQUREUiAoSVRfQllURVBPUyAoKml0KSkgPT0gJ1x0JykpKSkJCVwKKwkgICAmJiAh TklMUCAoQ0hBUl9UQUJMRV9SRUYJCQkJCVwKKwkJICAgICAoVndvcmRfd3JhcF9jaGFycywJCQkJ CVwKKwkJICAgICAgKEZFVENIX0NIQVIoSVRfQllURVBPUyAoKml0KSkpKSkpKSkJCVwKIAogLyog VHJ1ZSBtZWFucyBwcmludCBuZXdsaW5lIHRvIHN0ZG91dCBiZWZvcmUgbmV4dCBtaW5pLWJ1ZmZl ciBtZXNzYWdlLiAgKi8KIAo= --001a11370fe8ea2677055fc9bbd2--