To: emacs-orgmode@gnu.org
From: Max Nikulin
Subject: Re: Org-syntax: Intra-word markup
Date: Thu, 2 Dec 2021 19:48:42 +0700

On 02/12/2021 19:10, Ihor Radchenko wrote:
> Denis Maier writes:
>
>> Just a further remark: while zero-width spaces can be used as a
>> workaround, they may create problems in some export formats. E.g., they
>> will mess up hyphenation in LaTeX. I think I read somewhere that those
>> can be removed with hooks or filters, but I think that shouldn't be
>> necessary.
>
> Probably, we just need to strip all zero-width spaces at the basic ox.el
> level.

I think there may be legitimate cases where zero-width spaces should be
preserved in a document, so stripping them unconditionally is not a
perfect solution.

I am afraid that regexps detecting the start and end of emphasis are
like a blanket that is too short.
They will always fail in some cases, especially since verbatim text,
URLs, and similar contexts (which differ significantly from prose with
respect to punctuation) do not have higher priority for the parser.

An extensive test set is required for tuning the heuristics. Failures
should be reported in a way that makes it possible to estimate overall
quality before and after a change. Ideally, the format of the file with
such tests should allow the *same* input data to be used by other tools
like ruby-org.
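
A minimal sketch of how such a shared test set might look, assuming the
cases live in plain JSON so that tools written in other languages can
read the same data. The field names ("input", "bold"), the expected
values, and the regexp are all hypothetical illustrations; the regexp is
a deliberately naive stand-in and not the one Org uses.

```python
import json
import re

# Tool-agnostic test cases: plain JSON, readable from Emacs Lisp, Ruby, etc.
# Field names and expected values are illustrative assumptions only.
CASES_JSON = """
[
  {"input": "a *bold* word",     "bold": ["bold"]},
  {"input": "2 * 3 * 4",         "bold": []},
  {"input": "intra*word*markup", "bold": []},
  {"input": "(*bold*)",          "bold": ["bold"]},
  {"input": "=a *b* c=",         "bold": []}
]
"""

# Naive heuristic: a star not preceded by a word character or star, a body
# that starts and ends with non-space, and a closing star not followed by a
# word character or star. It knows nothing about verbatim or URLs.
NAIVE_BOLD = re.compile(r"(?<![\w*])\*(\S(?:.*?\S)?)\*(?![\w*])")


def find_bold(text):
    """Return the list of spans the naive heuristic treats as bold."""
    return [m.group(1) for m in NAIVE_BOLD.finditer(text)]


def run_cases(cases, detector):
    """Return (number passed, list of failing cases) for a detector."""
    failures = [c for c in cases if detector(c["input"]) != c["bold"]]
    return len(cases) - len(failures), failures


cases = json.loads(CASES_JSON)
passed, failures = run_cases(cases, find_bold)

# A single summary line makes it easy to compare runs before and after
# a change to the heuristic.
print(f"{passed}/{len(cases)} passed")
for case in failures:
    print("FAIL:", json.dumps(case["input"]),
          "expected", case["bold"], "got", find_bold(case["input"]))
```

The verbatim case fails with this heuristic because the regexp has no
notion of context, which is the kind of failure the summary is meant to
make visible.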