From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Maragakis via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#52067: possible fix for string-glyph-split halts on certain emoji strings.
Date: Tue, 23 Nov 2021 23:58:17 -0500
Message-ID: <33CD01AE-0B26-42EA-83F0-A1FFEBE6E11B@icloud.com>
Reply-To: Paul Maragakis
To: 52067@debbugs.gnu.org
X-GNU-PR-Message: followup 52067
X-GNU-PR-Package: emacs

The following code fixes this bug, though there might be better ways to
fix it for someone who understands the domain.

I don't know much about glyph/grapheme representations, so although this
code passes my limited tests, it may break other things.

(defun pm-string-glyph-split (string)
  "Split STRING into a list of strings representing separate glyphs.
This takes into account combining characters and grapheme clusters."
  (let ((result nil)
        (start 0)
        ;; The last start of a character with the composition property.
        (laststart -1)
        comp)
    (while (< start (length string))
      (setq comp (find-composition-internal start nil string nil))
      ;; Check that we don't return to the same start.
      (if (and comp (/= laststart (car comp)))
          (progn
            (push (substring string (car comp) (cadr comp)) result)
            ;; Keep the start of the last successful search.
            (setq laststart start)
            (setq start (cadr comp)))
        (push (substring string start (1+ start)) result)
        (setq start (1+ start))))
    (nreverse result)))

Compare to the original:

(defun string-glyph-split (string)
  "Split STRING into a list of strings representing separate glyphs.
This takes into account combining characters and grapheme clusters."
  (let ((result nil)
        (start 0)
        comp)
    (while (< start (length string))
      (if (setq comp (find-composition-internal start nil string nil))
          (progn
            (push (substring string (car comp) (cadr comp)) result)
            (setq start (cadr comp)))
        (push (substring string start (1+ start)) result)
        (setq start (1+ start))))
    (nreverse result)))