From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Entering emojis
Date: Sat, 30 Oct 2021 09:36:40 +0300
Message-ID: <83k0hvow7b.fsf@gnu.org>
In-Reply-To: <83mtmrox0k.fsf@gnu.org> (message from Eli Zaretskii on Sat, 30 Oct 2021 09:19:07 +0300)
To: larsi@gnus.org
Cc: mardani29@yahoo.es, stefankangas@gmail.com, emacs-devel@gnu.org

> Date: Sat, 30 Oct 2021 09:19:07 +0300
> From: Eli Zaretskii
> Cc: emacs-devel@gnu.org, stefankangas@gmail.com, mardani29@yahoo.es
>
> > (truncate-string-to-width "👨🏽‍❤️‍💋‍👨🏾" 2 nil t)
> > => "👨"
>
> Nothing, they should "just work", barring bugs.
>
> What does string-width return for this string on your system?
>
> > which is... uhm... In a way, this grapheme cluster thing is slightly
> > like it was during the shift to utf-8, when not all string primitives
> > worked on characters, but bytes instead. Less dramatic, of course, but.
> >
> > I think we'll be seeing many amusing display glitches in this area. 🥲
>
> We shouldn't, because string-width already supports composed text.
> There's always one more bug, of course, but there are no design
> problems here, AFAICT.

And I see that there is, indeed, a bug (or a missing feature) in
truncate-string-to-width: its algorithm assumes that string-width
returns the sum of the char-width values of the string's constituent
characters, which is not necessarily true when character composition
is involved. It needs instead to consider the string-width values of
successively longer substrings. It also cannot assume that
string-width is monotonically increasing in the number of characters
in the substring, as that, too, could be false when character
composition is involved.

Again, this is nothing specific to Emoji: it can happen with any
composed text, for example any ligature that produces a single glyph
from 2 or more characters.
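
To make the point about substrings concrete, here is a rough sketch of truncation that measures whole prefixes with string-width instead of adding up char-width values. This is not the actual truncate-string-to-width code nor a proposed patch; the function name my-truncate-by-prefix-width is made up for illustration. It looks at every prefix rather than stopping at the first one that is too wide, because the width is not assumed to grow monotonically. It is quadratic in the length of the string, and it does nothing to avoid cutting in the middle of a grapheme cluster when only part of the cluster fits.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, for illustration only: not part of Emacs.
(defun my-truncate-by-prefix-width (str max-width)
  "Return the longest prefix of STR whose `string-width' is <= MAX-WIDTH.
Each prefix is measured as a whole, so composed text is accounted for
whenever `string-width' itself accounts for it.  Every prefix is
examined, since a longer prefix may be narrower than a shorter one."
  (let ((best 0)                ; length of the longest prefix that fits
        (len (length str))
        (i 1))
    (while (<= i len)
      ;; Measure the prefix of I characters as a unit.
      (when (<= (string-width (substring str 0 i)) max-width)
        (setq best i))
      (setq i (1+ i)))
    (substring str 0 best)))

;; Usage, with text where no composition is involved:
;; (my-truncate-by-prefix-width "hello" 3)
```

What this returns for composed text depends on what string-width reports, which in turn depends on the display and the fonts in use, so no result is shown here for the Emoji sequence quoted above.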