From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ihor Radchenko
To: Max Nikulin
Cc: emacs-orgmode@gnu.org
Subject: Re: Trailing whitespace after export snippets without a transcoder
References: <5210ac1c-ed73-4b82-a296-41cf90b9f0a7@gmail.com>
 <87jzvmwnmw.fsf@localhost> <87ilauagvy.fsf@localhost>
 <87y1hqqgiu.fsf@localhost> <87ttk26crq.fsf@localhost>
 <87h6fwmgkm.fsf@localhost>
Date: Sun, 21 Apr 2024 13:00:10 +0000
Message-ID: <87wmoqq3ad.fsf@localhost>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
List-Id: "General discussions about Org-mode."

--=-=-=
Content-Type: text/plain

Max Nikulin writes:

>> I have no clue about the rationale of this special behaviour - it dates
>> back to the days when Org export was merged. It is also not documented
>> anywhere, AFAIK.
>
> I would not expect that the space after the following export snippet is
> ignored in the case of ox-html or ox-latex backend:
>
> A space@@ascii:*@@ character.
>
> The space may be put inside the export snippet if the intention is to
> omit it for output formats other than plain text. So current behavior is
> perfectly reasonable and flexible enough.

Hmm. We actually have a similar scenario in `org-export--prune-tree'
with slightly different logic - only keep spaces when the previous
object does not have any.

I am attaching a tentative patch that duplicates this logic for export
snippets as well, and for any other object whose transcoder returns nil.

WDYT?

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=0001-org-export-data-Handle-trailing-spaces-when-transcod.patch

>From 54939c4044fb407b068c0666c258ccd01e59c2af Mon Sep 17 00:00:00 2001
Message-ID: <54939c4044fb407b068c0666c258ccd01e59c2af.1713703523.git.yantar92@posteo.net>
From: Ihor Radchenko
Date: Sun, 21 Apr 2024 15:37:18 +0300
Subject: [PATCH 1/2] org-export-data: Handle trailing spaces when transcoder returns nil

* lisp/ox.el (org-export-data): When transcoder returns nil, handle
trailing spaces after an object the same way `org-export--prune-tree'
does. Remove special handling of export snippets that unconditionally
keep their trailing spaces.

Link: https://orgmode.org/list/87h6fwmgkm.fsf@localhost
---
 lisp/ox.el | 43 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index fc746950d..ccc4c94ce 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1930,15 +1930,11 @@ (defun org-export-data (data info)
                           (eq (plist-get info :with-archived-trees) 'headline)
                           (org-element-property :archivedp data)))
                  (let ((transcoder (org-export-transcoder data info)))
-                   (or (and (functionp transcoder)
-                            (if (eq type 'link)
-                                (broken-link-handler
-                                 (funcall transcoder data nil info))
-                              (funcall transcoder data nil info)))
-                       ;; Export snippets never return a nil value so
-                       ;; that white spaces following them are never
-                       ;; ignored.
-                       (and (eq type 'export-snippet) ""))))
+                   (and (functionp transcoder)
+                        (if (eq type 'link)
+                            (broken-link-handler
+                             (funcall transcoder data nil info))
+                          (funcall transcoder data nil info)))))
                 ;; Element/Object with contents.
                 (t
                  (let ((transcoder (org-export-transcoder data info)))
@@ -1979,8 +1975,33 @@ (defun org-export-data (data info)
        (puthash
         data
         (cond
-         ((not results) "")
-         ((memq type '(nil org-data plain-text raw)) results)
+         ((not results)
+          ;; TRANSCODER returned nil. When DATA is an object,
+          ;; interpret this as if DATA should be ignored (see
+          ;; `org-export--prune-tree'). Keep spaces in place of
+          ;; removed element, if necessary.
+          ;; Example: "Foo.[10%] Bar" would become
+          ;; "Foo.Bar" if we do not keep spaces.
+          ;; Another example: "A space@@ascii:*@@ character."
+          ;; should become "A space character" in non-ASCII export.
+          (let ((post-blank (org-element-post-blank data)))
+            (or
+             (unless (or (not post-blank)
+                         (zerop post-blank)
+                         (eq 'element (org-element-class data)))
+               (let ((previous (org-export-get-previous-element data info)))
+                 (unless (or (not previous)
+                             (pcase (org-element-type previous)
+                               (`plain-text
+                                (string-match-p
+                                 (rx whitespace eos) previous))
+                               (_ (org-element-post-blank previous))))
+                   ;; When previous element does not have
+                   ;; trailing spaces, keep the trailing
+                   ;; spaces from DATA.
+                   (make-string post-blank ?\s))))
+             "")))
+         ((memq type '(nil org-data plain-text raw)) results)
          ;; Append the same white space between elements or objects
          ;; as in the original buffer, and call appropriate filters.
          (t
-- 
2.44.0

--=-=-=
Content-Type: text/plain

> The issue is empty lines that serve as paragraph separators and that may
> appear due to expansion of a macro or due to skipped export snippets.
> Perhaps transcoders of other elements, e.g. links, may return empty
> strings as well.

Right. This is a special case in MD, where blank lines separate
paragraphs. It is just like ox-latex, where we fixed exactly the same
thing, reported in https://orgmode.org/list/tufdb6$11h2$1@ciao.gmane.io

It is also a side effect of the fact that newlines are not considered a
part of the Org markup objects.

I do not think that we need to handle this Org mode-wide (it will be
difficult and will likely cause breaking changes). Instead, we can
adjust only the export backends that are sensitive to blank lines.

See the attached tentative fix.

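As a rough illustration of the blank-line stripping that the fix relies
on, the sketch below copies the regexp already used in
`org-latex-paragraph' into a standalone helper. The name
`my/remove-blank-lines' is hypothetical; it only stands in for the
proposed `org-remove-blank-lines' so that the snippet can be evaluated
without applying the patch.

```elisp
;; Hypothetical stand-in for the proposed `org-remove-blank-lines'.
;; The regexp is the one currently used in `org-latex-paragraph'.
(defun my/remove-blank-lines (s)
  "Return S with each run of blank lines collapsed into one newline."
  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))

;; An object dropped from export can leave an empty line inside a
;; paragraph, which blank-line-sensitive formats read as a paragraph
;; break.
(my/remove-blank-lines "First line\n\nsecond line")
;; => "First line\nsecond line"

;; Lines that contain only whitespace are removed as well.
(my/remove-blank-lines "First line\n  \n\nsecond line")
;; => "First line\nsecond line"

;; Single newlines are left untouched.
(my/remove-blank-lines "First line\nsecond line")
;; => "First line\nsecond line"
```

The helper only touches the transcoded paragraph string, so it does not
change how the parser treats newlines elsewhere.
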
--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=0001-ox-md-ox-ascii-ox-texinfo-Strip-blank-lines-from-par.patch

>From 08c584b90ca6950e4186093bf5742d7443448254 Mon Sep 17 00:00:00 2001
Message-ID: <08c584b90ca6950e4186093bf5742d7443448254.1713704215.git.yantar92@posteo.net>
From: Ihor Radchenko
Date: Sun, 21 Apr 2024 15:54:48 +0300
Subject: [PATCH] ox-md, ox-ascii, ox-texinfo: Strip blank lines from paragraphs

* lisp/org-macs.el (org-remove-blank-lines): New helper function to
strip blank lines from string.
* lisp/ox-ascii.el (org-ascii-paragraph):
* lisp/ox-latex.el (org-latex-paragraph):
* lisp/ox-md.el (org-md-paragraph):
* lisp/ox-texinfo.el (org-texinfo-paragraph): Strip blank lines from
paragraphs - these exporters are using blank lines as paragraph
separators.

Reported-by: Max Nikulin
Link: https://orgmode.org/list/v00le7$frp$1@ciao.gmane.io
---
 lisp/org-macs.el   | 4 ++++
 lisp/ox-ascii.el   | 6 ++++++
 lisp/ox-latex.el   | 4 +---
 lisp/ox-md.el      | 6 ++++++
 lisp/ox-texinfo.el | 7 ++++++-
 5 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index 0046f3493..85bf6e4fa 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -1554,6 +1554,10 @@ (defun org-remove-tabs (s &optional width)
                         t t s)))
   s)
 
+(defun org-remove-blank-lines (s)
+  "Remove blank lines in S."
+  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))
+
 (defun org-wrap (string &optional width lines)
   "Wrap string to either a number of lines, or a width in characters.
 If WIDTH is non-nil, the string is wrapped to that width, however many lines
diff --git a/lisp/ox-ascii.el b/lisp/ox-ascii.el
index db4356ec6..e767f66cf 100644
--- a/lisp/ox-ascii.el
+++ b/lisp/ox-ascii.el
@@ -1651,6 +1651,12 @@ (defun org-ascii-paragraph (paragraph contents info)
   "Transcode a PARAGRAPH element from Org to ASCII.
 CONTENTS is the contents of the paragraph, as a string. INFO is
 the plist used as a communication channel."
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (setq contents (org-remove-blank-lines contents))
   (org-ascii--justify-element
    (let ((indented-line-width (plist-get info :ascii-indented-line-width)))
      (if (not (wholenump indented-line-width)) contents
diff --git a/lisp/ox-latex.el b/lisp/ox-latex.el
index 8a10f9390..cae7bb3b2 100644
--- a/lisp/ox-latex.el
+++ b/lisp/ox-latex.el
@@ -3040,9 +3040,7 @@ (defun org-latex-paragraph (_paragraph contents _info)
   ;; Multiple newlines may appear in CONTENTS, for example, when
   ;; certain objects are stripped from export, leaving single newlines
   ;; before and after.
-  (replace-regexp-in-string
-   (rx "\n" (1+ (0+ space) "\n")) "\n"
-   contents))
+  (org-remove-blank-lines contents))
 
 
 ;;;; Plain List
diff --git a/lisp/ox-md.el b/lisp/ox-md.el
index fa2beeb95..28f0a4cf6 100644
--- a/lisp/ox-md.el
+++ b/lisp/ox-md.el
@@ -628,6 +628,12 @@ (defun org-md-paragraph (paragraph contents _info)
   "Transcode PARAGRAPH element into Markdown format.
 CONTENTS is the paragraph contents. INFO is a plist used as
 a communication channel."
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (setq contents (org-remove-blank-lines contents))
   (let ((first-object (car (org-element-contents paragraph))))
     ;; If paragraph starts with a #, protect it.
     (if (and (stringp first-object) (string-prefix-p "#" first-object))
diff --git a/lisp/ox-texinfo.el b/lisp/ox-texinfo.el
index 4aef9c41c..fc9ec9209 100644
--- a/lisp/ox-texinfo.el
+++ b/lisp/ox-texinfo.el
@@ -1517,7 +1517,12 @@ (defun org-texinfo-paragraph (_paragraph contents _info)
   "Transcode a PARAGRAPH element from Org to Texinfo.
 CONTENTS is the contents of the paragraph, as a string. INFO is
 the plist used as a communication channel."
-  contents)
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (org-remove-blank-lines contents))
 
 
 ;;;; Plain List
-- 
2.44.0

--=-=-=
Content-Type: text/plain

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at
--=-=-=--