From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: =?UTF-8?Q?Jo=C3=A3o_?= =?UTF-8?Q?T=C3=A1vora?= Newsgroups: gmane.emacs.bugs Subject: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator Date: Sun, 26 Nov 2023 13:52:30 +0000 Message-ID: References: <87a5r5ph3p.fsf@bernoul.li> <87msv2vmzf.fsf@bernoul.li> <878r6mzezo.fsf@ushin.org> <87sf4tg6ts.fsf@bernoul.li> <875y1po3nk.fsf@ushin.org> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="23786"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 67390@debbugs.gnu.org, Adam Porter , Jonas Bernoulli To: Joseph Turner Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sun Nov 26 14:53:13 2023 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1r7FZQ-00060M-GF for geb-bug-gnu-emacs@m.gmane-mx.org; Sun, 26 Nov 2023 14:53:12 +0100 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r7FZD-0001Kb-11; Sun, 26 Nov 2023 08:52:59 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r7FZB-0001KA-52 for bug-gnu-emacs@gnu.org; Sun, 26 Nov 2023 08:52:57 -0500 Original-Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r7FZA-0003nB-Tb for bug-gnu-emacs@gnu.org; Sun, 26 Nov 2023 08:52:56 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1r7FZG-0007MF-II for bug-gnu-emacs@gnu.org; Sun, 26 Nov 2023 08:53:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: =?UTF-8?Q?Jo=C3=A3o_?= =?UTF-8?Q?T=C3=A1vora?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 26 Nov 2023 13:53:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 67390 X-GNU-PR-Package: emacs Original-Received: via spool by 67390-submit@debbugs.gnu.org id=B67390.170100677828274 (code B ref 67390); Sun, 26 Nov 2023 13:53:02 +0000 Original-Received: (at 67390) by debbugs.gnu.org; 26 Nov 2023 13:52:58 +0000 Original-Received: from localhost ([127.0.0.1]:40937 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1r7FZC-0007Lw-3x for submit@debbugs.gnu.org; Sun, 26 Nov 2023 08:52:58 -0500 Original-Received: from mail-lf1-x135.google.com ([2a00:1450:4864:20::135]:42299) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1r7FZA-0007Lh-3G for 67390@debbugs.gnu.org; Sun, 26 Nov 2023 08:52:57 -0500 Original-Received: by mail-lf1-x135.google.com with SMTP id 2adb3069b0e04-507a5f2193bso3439518e87.1 for <67390@debbugs.gnu.org>; Sun, 26 Nov 2023 05:52:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701006764; x=1701611564; darn=debbugs.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=LzAPH7lmC9sgsd8vvJWAGbbmzRtbFFkh3EOfE2aEupM=; b=MAq2AVFAUqRg55dSNtq5+QNvk2dJGCdDDMXbIOxIDsfJojee+6yW6A6oh9s0astjuf Ih9A8659vaejvNUaYAQ5oY6QFdvL1ayBEKLwPCL581CdUzkqpHajlo1Bjlxwq0ISU6gY 8rTLe2XMv1oNpI6r9AVcbSwZagH0etPdYj1sSmWaFc9ifagMoTPPYXZu+MWiz+BugIPD P1BK0yZKVjdTKQm1l1Ro7A2KrKyc/LI1vuaddiT19m77CfDTVwJKQzI6QrS4ibPho6Ng ZR7cX7KpHdiq5+o1Nm0W5IA1Dvq3TmktovASCd2nljPcdWCvWeqie0qATX6ibh8O9c2T 15oA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701006764; x=1701611564; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LzAPH7lmC9sgsd8vvJWAGbbmzRtbFFkh3EOfE2aEupM=; b=uIds5049UTUma/bZHpv2Wz3xokJjzpFeyQywmjh936Gg5FhepSbgfoLaftceXe8sFO iHeuPT8XkQQFLsZM0Ooy5S9kWBvXSvQtKnwNEv4wTMtDeGAOxrANy76Kxk1DtJh7s8If B2fznZIfY76Cq8OUFC3Y331+tReIL6NHIKEHwBbZH82PP4oRf/Ca5WCXaq5guZc4Vuzg 3uetCWf4TKOZDQD3OZZF6KQKNbkpjVDEr8EhfDyFiFNx4UFtNp4LrkHssSMOSz+6wmv9 4VxBNCHWXvwAyQdXvc5wGFZJNpwg0XQJkA+lkvs2iJQmuygPMrtT3riYB+e0dwF2eTTW 1tHQ== X-Gm-Message-State: AOJu0YyzVnYVV3/YUsz0RFgQ0UzvR4VOe5pMYwSn3W1bPysA9Iix0eu8 ObKsVVUQ2EUGrK14xGWcKwYSIznKtBa3/GWn97OOdnJSsl4= X-Google-Smtp-Source: AGHT+IHTXMPuf8zLKonxTPBeoy3w/QnYnVM+P6qeZmWvvtq/qpctIhyVTjrQ/HypI8dKQP+0H0X2gNTm0etNTN4F7Uc= X-Received: by 2002:ac2:532d:0:b0:509:366b:a01c with SMTP id f13-20020ac2532d000000b00509366ba01cmr2289204lfh.14.1701006763571; Sun, 26 Nov 2023 05:52:43 -0800 (PST) In-Reply-To: <875y1po3nk.fsf@ushin.org> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:275055 Archived-At: On Sat, Nov 25, 2023 at 10:43=E2=80=AFPM Joseph Turner w= rote: > > Jonas Bernoulli writes: > > > Joseph Turner writes: > > > >> + (car (shorthands--find-if > >> + (lambda (short) > >> + (string-prefix-p short (match-strin= g 1))) > >> + read-symbol-shorthands #'car))))) > > > > Or simply: > > (car (assoc (match-string 1) > > read-symbol-shorthands > > #'string-prefix-p)) > > Much nicer - see patch. Thanks, Jonas! So, I had a look at this patch and I think we should compare it with the patch after my sig, which keeps 'shorthands--mismatch-from-end' and also fixes this bug. The main difference I see is that my patch keeps doing one string comparison, via the mismatch function (which btw is now perfectly analogous to CL mismatch and thus correctly named). In the worst case, Josheph's patch does 1 + N where N is the number of shorthands. So this is a fundamental complexity change. Normally, that would be the end of the story, but here, it isn't. For two reasons. My version keeps a behaviour that can be considered buggy. If a shorthand prefix has a common suffix with the longhand prefix then that suffix will not be highlighted. Like: ;; Local Variables: ;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-") ;; End: Here only "b" would be highlighted, effectively showing the user how much typing was saved. Is this wrong? Does it makes sense to use shorthands like this? The other reason why this isn't the end of the story is that even if we take that bug for granted, the string comparison functions in Joshep's patch delegate to built-in C comparison operators, which are often much, much faster than Elisp. At least before the advent of native compilation, it used to be like this. Of course for a large enough N number of shorthands, my version wins, but that is probably not very common either (or is it? Not very hard to imagine a file making use of many libraries and shorthanding each of them?) So, benchmarking it will have to be, I'm afraid, because AFAIK font-locking is a very performance sensitive area of Emacs. In the meantime I will push my patch, but keep the bug open to see if it is worth pushing Joseph's version. Jo=C3=A3o diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el index 82200ab88e9..b0665a55695 100644 --- a/lisp/emacs-lisp/shorthands.el +++ b/lisp/emacs-lisp/shorthands.el @@ -53,11 +53,16 @@ elisp-shorthand-font-lock-face :group 'font-lock-faces) (defun shorthands--mismatch-from-end (str1 str2) + "Tell index of first mismatch in STR1 and STR2, from end. +The index is a valid 0-based index on STR1. Returns nil if STR1 +equals STR2. Return 0 if STR1 is a suffix of STR2." (cl-loop with l1 =3D (length str1) with l2 =3D (length str2) for i from 1 for i1 =3D (- l1 i) for i2 =3D (- l2 i) - while (and (>=3D i1 0) (>=3D i2 0) (eq (aref str1 i1) (aref str= 2 i2))) - finally (return (1- i)))) + while (eq (aref str1 i1) (aref str2 i2)) + if (zerop i2) return (if (zerop i1) nil i1) + if (zerop i1) return 0 + finally (return i1))) (defun shorthands-font-lock-shorthands (limit) (when read-symbol-shorthands @@ -69,10 +74,16 @@ shorthands-font-lock-shorthands font-lock-string-face))) (intern-soft (match-string 1)))) (sname (and probe (symbol-name probe))) - (mm (and sname (shorthands--mismatch-from-end - (match-string 1) sname)))) - (unless (or (null mm) (=3D mm (length sname))) - (add-face-text-property (match-beginning 1) (1+ (- (match-end 1)= mm)) + (mismatch (and sname (shorthands--mismatch-from-end + (match-string 1) sname))) + (guess (and mismatch (1+ mismatch)))) + (when guess + (when (and (< guess (1- (length (match-string 1)))) + ;; In bug#67390 we allow other separators + (eq (char-syntax (aref (match-string 1) guess)) ?_)) + (setq guess (1+ guess))) + (add-face-text-property (match-beginning 1) + (+ (match-beginning 1) guess) 'elisp-shorthand-font-lock-face)))))) (font-lock-add-keywords 'emacs-lisp-mode '((shorthands-font-lock-shorthands)) t)