From: Tim Cross
To: Ihor Radchenko
Cc: K K, Max Nikulin, emacs-orgmode@gnu.org
Subject: Re: [PATCH] Add new entity \-- serving as markup separator/escape symbol
Date: Fri, 29 Jul 2022 08:20:03 +1000
In-reply-to: <87mtct9y1f.fsf@localhost>
References: <87r128d5pp.fsf@localhost> <80f0990042a564556cc6b047a94f7e9dddf5a280.camel@outlook.com> <87v8rkav2x.fsf@localhost> <87mtct9y1f.fsf@localhost>
Message-ID: <86sfmk50io.fsf@gmail.com>

Ihor Radchenko writes:

> Ihor Radchenko writes:
>
>> I am attaching a tentative patch that will make Org export remove
>> zero-width spaces when those spaces actually separate the object
>> boundaries.
>>
>> Any objections?
>
> Given the raised objections, zero-width space does not appear to be a
> useful escape symbol because it has its valid uses as a standalone space
> symbol.
>
> The raised objections can be solved using some kind of intricate
> heuristics, but I do not feel like it is a good direction to go. The
> code will be too complex and fragile.
>

Ihor, thanks for articulating this as it was something I was becoming
increasingly concerned about.

> Therefore, I am proposing a different approach for shielding
> fontification: introducing a special entity.
>
> The new entity is \--, which is a valid boundary between emphasis
> markup. It will be removed during export (replaced by "").
>
> "\--" specifically is somewhat arbitrary choice. The actual requirements
> for the entity name are: (1) No clash with LaTeX (which is why simpler
> \- would not cut it); (2) Being a valid markup boundary: entity must end
> with (any space ?- ?\( ?' ?\" ?\{).
>
> I am attaching a tentative patch introducing the new entity. Note that
> some minor tweaks to the parser were needed. I do not see it as a big
> deal - the current entity regexp has much more cumbersome exceptions.
>
> Also, the patch will not work correctly on org → org export, similar to
> pointed in one of the replies to the previous abandoned approach.
> I do not want to address it here because a much more appropriate
> solution for this issue is changing org-element-interpret-data.
>
> Consider (org-element-interpret-data '("asd" (bold () "bold") "bsd"))
> This will return "asd*bold*bsd", which is not correct even though the
> given Org datum is not wrong by itself - such things can easily appear
> when user filters are applied to parse tree during org→org export.
>
> Otherwise, the patch should be good enough to play around and kick-start
> the discussion.
>
> WDYT?
>

I think this is definitely preferable to the zero-width space, as it is
clearer and 'intentional'. While I'm still 'on the fence' regarding the
tension between the need for this new functionality and the additional
complexity it introduces, this approach seems potentially cleaner and
more manageable. Given the important work you are doing to integrate
parsing of elements and fontification, I feel you are in the best
position to judge whether this addition can be justified wrt complexity
vs functionality, and I am confident you're on the right track here.