From: Teemu Likonen
To: Tomi Ollila, notmuch@notmuchmail.org
Cc: David Edmondson
Subject: Re: [PATCH 1/2] Emacs: Add a new function for balancing bidi control chars
References: <20200815093036.5930-1-tlikonen@iki.fi> <20200815093036.5930-2-tlikonen@iki.fi>
Date: Sun, 16 Aug 2020 20:41:55 +0300
Message-ID: <87v9hi8p3w.fsf@iki.fi>

* 2020-08-16 19:28:51+03, Tomi Ollila wrote:

> Good stuff -- implementation looks like port of the php code in
>
>   https://www.iamcal.com/understanding-bidirectional-text
>
> to emacs lisp... anyway nice implementation took be a bit of
> time for me to understand it...

I don't read PHP and didn't try to read that code at all but the idea is
simple enough.

> thoughts
>
> - is it slow to execute it always, pure lisp implementation;
>   (string-match "[\u202a-\u202e]") could be done before that.
>   (if it were executed often could loop with `looking-at`
>   (and then moving point based on match-end) be faster...

I don't see any speed issues but if we want to optimize I would create a
new sanitize function which walks just once across the characters
without using regular expressions. But currently I think it's
unnecessary micro optimization.
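
To make the "walk just once" idea concrete, here is a rough sketch of
what such a single-pass function might look like. It is only an
illustration, not the code from the patch: the name
`my-balance-bidi-ctrl-chars-once` is made up, and the policy (drop a
U+202C that has no opener, append one U+202C per opener still open at
the end) is an assumption about what "balancing" should mean.

```elisp
;; Hypothetical helper, not part of notmuch: a single-pass sketch of
;; the balancing idea, written without regular expressions.
(defun my-balance-bidi-ctrl-chars-once (string)
  "Return STRING with bidi embedding/override characters balanced.
Walk STRING once.  Drop any U+202C (PDF) that has no opener and
append one U+202C for every opener still unclosed at the end."
  (let ((depth 0)
        (result '()))
    (dolist (char (string-to-list string))
      (cond
       ;; U+202A LRE, U+202B RLE, U+202D LRO, U+202E RLO open a level.
       ((memq char '(#x202a #x202b #x202d #x202e))
        (setq depth (1+ depth))
        (push char result))
       ;; U+202C PDF closes a level; skip it when nothing is open.
       ((= char #x202c)
        (when (> depth 0)
          (setq depth (1- depth))
          (push char result)))
       ;; Everything else is copied through unchanged.
       (t (push char result))))
    ;; Close whatever is still open.
    (dotimes (_ depth)
      (push #x202c result))
    (concat (nreverse result))))
```

Calling it on a string with one unclosed U+202B should return the same
string with a single U+202C appended, and a stray leading U+202C
should simply disappear.
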

> - *but* adding U+202C's in `notmuch-sanitize` is doing it too early, as
>   some functions truncate the strings afterwards if those are too long
>   (e.g. `notmuch-search-insert-authors`) so those get lost..

Good point. This would mean that we shouldn't do "bidi ctrl char
balancing" in notmuch-sanitize. Instead we should call the new
notmuch-balance-bidi-ctrl-chars function in various places: before
inserting arbitrary strings into a buffer and before combining such
strings with other strings.

> (what I noticed when looking `notmuch-search-insert-authors` that it uses
> `length` to check the length of a string -- but that also counts these bidi
> mode changing "characters" (as one char). `string-width` would be better
> there -- and probably in many other places.)

Yes, string-width is definitely the right function when truncation is
based on width and when laying out tabular columns in buffers. With
that function zero-width characters really have no width.

-- 
/// Teemu Likonen - .-.. http://www.iki.fi/tlikonen/
// OpenPGP: 4E1055DC84E9DFF613D78557719D69D324539450
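
As a footnote to the ordering point above (truncate by width first,
balance last), a minimal sketch could look like the following. The
function name `my-truncate-then-balance` is hypothetical and the
counting loop is a simplified stand-in for
notmuch-balance-bidi-ctrl-chars, whose actual behaviour is not shown
in this message. `truncate-string-to-width` and `string-width` are
standard Emacs functions.

```elisp
;; Hypothetical sketch: truncate by display width first, then close
;; any bidi levels that the truncation left open.
(defun my-truncate-then-balance (string max-width)
  "Truncate STRING to MAX-WIDTH columns, then append missing U+202C."
  (let ((short (truncate-string-to-width string max-width))
        (open 0))
    ;; Count embedding/override openers not yet closed by U+202C.
    (dolist (char (string-to-list short))
      (cond ((memq char '(#x202a #x202b #x202d #x202e))
             (setq open (1+ open)))
            ((and (= char #x202c) (> open 0))
             (setq open (1- open)))))
    (concat short (make-string open #x202c))))

;; `length' counts the control character as one element, whereas
;; `string-width' measures display columns.
(let ((s (concat "abc" (string #x202b) "def")))
  (list (length s) (string-width s)))
```

Doing the steps in the opposite order would risk cutting off the
closing U+202C again, which is the problem described for
`notmuch-search-insert-authors`.
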