From: Nicolas Ouellet-payeur
Newsgroups: gmane.emacs.devel
Subject: Re: Supporting stylistic sets
Date: Fri, 23 Sep 2022 15:20:54 -0400
References: <83wn9up0es.fsf@gnu.org> <83h70yotd4.fsf@gnu.org>
In-Reply-To: <83h70yotd4.fsf@gnu.org>
To: Eli Zaretskii
Cc: समीर सिंह Sameer Singh, emacs-devel@gnu.org

> From: Nicolas Ouellet-payeur
> Date: Fri, 23 Sep 2022 13:31:30 -0400
> Cc: Eli Zaretskii, emacs-devel@gnu.org
>
> This is not a font property, this is a face property.  It will be in
> effect for every font used for the default face, including fonts used
> for non-ASCII characters, like CJK and Emoji.  Are you sure this is
> what you want?

Thanks for clearing up the confusion RE: font vs. face properties.

> I'm not against adding a face property, I just think it isn't enough,
> even if that's what people will want.
>
> Also, AFAIU this feature is meant for special styling of select text
> segments, not for the entire buffer.

Ah, that's a good point. I hadn't considered what happens to different
scripts if, say, the 'ss03' variant means a different thing in each
font.

There's no reason it *can't* be for the entire buffer though. See below.

> > This uses my chosen stylistic sets everywhere with a single line of
> > Lisp. If I don't like having it *everywhere* (e.g. minibuffer,
> > mode-line), I can still set it separately for each face.
>
> That'd be very tedious, since Emacs uses so many different faces.

Well yes, but what's the alternative? And in the specific use-case of

> > With a text property, I'd have to hook into everything that displays
> > text, somehow, and add the text property there.
>
> Yes.
>
> Maybe we should talk about higher-level use cases: when and why would
> one want to use stylistic-sets in Emacs?

In my specific case: because the 'cv31' variant of my font changes the
shape of parentheses, and I like that shape better. I would rather set
it everywhere, even the minibuffer, mode-line, header-line. This is all
just ASCII though. And ligatures, which might be weirder.

IIUC, Sameer's suggestion with set-fontset-font would address this
use-case.

  (set-fontset-font t 'ascii
                    (font-spec :family "Fira Code"
                               :stylistic-set '("cv31")))

> > Also, while we're on this topic: I'm working on a patch to pass *all*
> > strings/buffer contents to hbfont_shape() during redisplay, and making
> > all text a composition. That way we could have stylistic sets for Latin
> > scripts as well (and most other scripts), not just "composed" scripts
> > like Bengali and Arabic. It'd also achieve "Support ligatures out of the
> > box" from etc/TODO, by giving HarfBuzz the means to shape text properly.
>
> I think you will find out that this makes Emacs redisplay unbearably
> slow.
>
> IMO, it is impractical to shape everything via a shaping engine
> without completely redesigning how we handle character compositions in
> the display engine, because what we have now was not designed to be
> used for all the text we display.

That's one of my big worries, as well. That's why I'm anxious to share a
proof-of-concept ASAP, and see if it's worth pursuing.

Initially it *was* super janky, because I did the naïve thing and passed
strings through hbfont_shape() without caching the result. Each
redisplay would pass the strings to HarfBuzz again, and create a bunch
of new Lisp vectors... Then the GC would go crazy, and cause a *lot* of
jank.

I then tried to put everything into `gstring_hash_table' in composite.c.
That made things better, but the hash table would keep growing in size
all the time. Basically leaking memory on purpose...

My latest attempt is to tweak `region_cache' so it can store arbitrary
Lisp_Objects, rather than just 1s or 0s for known-ness of regions.
Then I stick the lgstrings into that cache. Different frames have
different face properties, so there's one region_cache per
(frame, buffer) pair. This works, but right now it's incomplete and
crashes all the time like I said.

It's not *noticeably* jankier so far, and I even used the
`gstring_hash_table' version as my main editor for a while. But maybe my
computer is just fast enough, or my use-cases are too simple.

It's *definitely* way too slow for very long lines. For example, a JSON
file that's a single line and multiple megabytes. We'd need to fall back
to the existing code in cases like that.

I haven't tried running it through `gprof' yet, and this is all
subjective. So I'm still worried about performance.

> You can have ligatures today without passing everything through the
> shaping engine.

Yes, but you need to configure those manually. Here, the advantage of
passing them to HarfBuzz is that it can read those ligatures from the
font file without any extra configuration. i.e., "Support ligatures out
of the box" as described in etc/TODO.

That's not going to change the world, but if we can make it work it'll
be neat.

> And the main difficulty with such a redesign is to do it in a way that
> still allows easy customization of composition rules from Lisp.

A quick test shows that `prettify-symbols-mode' still works, but I'm
sure there are subtle bugs hiding underneath.
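
For context on "configure those manually": today that is normally done
through `composition-function-table', one rule per leading character.
The fragment below is only a minimal sketch for an init file. It assumes
Emacs 27 or later built with a text-shaping engine such as HarfBuzz, and
a font (Fira Code is used here purely as an example) that actually ships
these ligatures. The two patterns are illustrative choices, not a
complete or recommended set.

```elisp
;; Sketch only: the regexps below are illustrative assumptions, and the
;; glyphs that result depend entirely on the font in use.

;; Runs that start with "=": ==, ===, =>, ==> ...
;; Each rule has the form [PATTERN PREV-CHARS FUNC]; 0 means the match
;; starts at the character itself, with no preceding context.
(set-char-table-range composition-function-table ?=
                      '(["=[=>]+" 0 font-shape-gstring]))

;; Runs that start with "-" and end in ">": ->, -->
(set-char-table-range composition-function-table ?-
                      '(["-+>" 0 font-shape-gstring]))
```

Every leading character needs its own rule like this, which is the
manual step that shaping all text through HarfBuzz would make
unnecessary.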