* 2020-08-16 19:28:51+03, Tomi Ollila wrote:

> Good stuff -- implementation looks like port of the php code in
>
>     https://www.iamcal.com/understanding-bidirectional-text
>
> to emacs lisp... anyway nice implementation, took me a bit of time to
> understand it...

I don't read PHP and didn't try to read that code at all, but the idea
is simple enough.

> thoughts
>
> - is it slow to execute it always, pure lisp implementation;
>   (string-match "[\u202a-\u202e]") could be done before that.
>   (if it were executed often could loop with `looking-at`
>   (and then moving point based on match-end) be faster...

I don't see any speed issues, but if we wanted to optimize I would
create a new sanitize function which walks across the characters just
once, without using regular expressions. Currently, though, I think
that would be unnecessary micro-optimization.

> - *but* adding U+202C's in `notmuch-sanitize` is doing it too early, as
>   some functions truncate the strings afterwards if those are too long
>   (e.g. `notmuch-search-insert-authors`) so those get lost..

Good point. This means that we shouldn't do the bidi control character
balancing in `notmuch-sanitize`. Instead, we should call the new
`notmuch-balance-bidi-ctrl-chars` function in various places: before
inserting arbitrary strings into a buffer and before combining such
strings with other strings.

> (what I noticed when looking at `notmuch-search-insert-authors` is
> that it uses `length` to check the length of a string -- but that
> also counts these bidi mode changing "characters" (as one char each).
> `string-width` would be better there -- and probably in many other
> places.)

Yes, definitely `string-width` when truncation is based on display
width and when using a tabular format in buffers. With that function
zero-width characters really have no width.

-- 
/// Teemu Likonen - .-.. http://www.iki.fi/tlikonen/
// OpenPGP: 4E1055DC84E9DFF613D78557719D69D324539450
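P.S. The balancing idea discussed above could be sketched roughly like
this. This is a hypothetical illustration only, not the actual notmuch
code, and the function name `my-balance-bidi-ctrl-chars` is made up
(the proposed notmuch function is `notmuch-balance-bidi-ctrl-chars`):

```elisp
;; Sketch: walk the string once, count bidi embedding/override openers
;; (U+202A LRE, U+202B RLE, U+202D LRO, U+202E RLO) that are not yet
;; closed by U+202C (PDF), then append the missing U+202C characters.
(defun my-balance-bidi-ctrl-chars (string)
  "Return STRING with unclosed bidi control characters closed.
U+202A, U+202B, U+202D and U+202E each open an embedding or
override level; U+202C pops one level.  Append enough U+202C
characters so that every opener is closed."
  (let ((depth 0))
    (dolist (char (string-to-list string))
      (cond ((memq char '(?\u202a ?\u202b ?\u202d ?\u202e))
             (setq depth (1+ depth)))
            ((and (eq char ?\u202c) (> depth 0))
             (setq depth (1- depth)))))
    (concat string (make-string depth ?\u202c))))
```

Because this walks the characters directly, it also avoids the
regular-expression pass mentioned earlier; calling it just before
inserting or truncating a string keeps the appended U+202C's from
being cut off.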