From: Lars Ingebrigtsen <larsi@gnus.org>
To: emacs-devel@gnu.org
Subject: Improve `replace-regexp-in-string' ergonomics?
Date: Wed, 22 Sep 2021 06:36:27 +0200
Message-ID: <878rzpw7jo.fsf@gnus.org>

`replace-regexp-in-string' often leads to pretty awkward code.  I wonder
whether we could improve it somehow.

Here's a real life example:

(defun org-babel-js-read (results)
[...]
       (org-babel-read
        (concat "'"
                (replace-regexp-in-string
                 "\\[" "(" (replace-regexp-in-string
                            "\\]" ")" (replace-regexp-in-string
                                       ",[[:space:]]" " "
                                       (replace-regexp-in-string
                                        "'" "\"" results))))))

That's kinda hard to read, but variations on this are pretty common.
When you have one `replace-regexp-in-string', you often have another.

We introduced `thread-last' in 2014, and there seems to be one (1) place
in the Emacs code base that uses it, so I guess that didn't take off, but
rewriting with that, we get:

       (org-babel-read
        (concat "'"
                (thread-last
                  results
                  (replace-regexp-in-string "'" "\"")
                  (replace-regexp-in-string ",[[:space:]]" " ")
                  (replace-regexp-in-string "\\]" ")")
                  (replace-regexp-in-string "\\[" "("))))

Which is somewhat more readable (but note that this totally breaks down
if you want to mix in LITERAL etc., since `thread-last' can only supply
the final argument, and STRING is followed by those optional ones).  But
I wonder whether we should consider renaming the function to something
more palatable, and since we have `string-replace', why not
`regexp-replace'?  The length of the name of this common function is
itself off-putting.

       (org-babel-read
        (concat "'"
                (thread-last
                  results
                  (regexp-replace "'" "\"")
                  (regexp-replace ",[[:space:]]" " ")
                  (regexp-replace "\\]" ")")
                  (regexp-replace "\\[" "("))))
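
To make the above concrete, here is a minimal sketch.  The names
`my-regexp-replace' and `my-js-results-to-lisp' are hypothetical and
exist only for illustration; the wrapper simply forwards to
`replace-regexp-in-string' and is not an existing Emacs function.  The
last form also illustrates the LITERAL caveat: that argument comes
after STRING, so a threaded call has no way to pass it.

```elisp
;; `thread-last' is provided by subr-x.
(require 'subr-x)

;; Hypothetical short name; just forwards to the existing function.
(defun my-regexp-replace (regexp rep string)
  "Replace all matches for REGEXP with REP in STRING."
  (replace-regexp-in-string regexp rep string))

;; Hypothetical helper wrapping the pipeline from the example.
(defun my-js-results-to-lisp (results)
  "Turn a JS-style array literal in RESULTS into Lisp list syntax."
  (thread-last
    results
    (my-regexp-replace "'" "\"")
    (my-regexp-replace ",[[:space:]]" " ")
    (my-regexp-replace "\\]" ")")
    (my-regexp-replace "\\[" "(")))

(my-js-results-to-lisp "['a', 'b']")
;; => "(\"a\" \"b\")"

;; LITERAL is the fifth argument, after STRING, so it needs an
;; explicit, unthreaded call:
(replace-regexp-in-string "x" "\\&" "axb" nil t)
;; => "a\\&b"
```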

We could also consider making `regexp-replace' take a series of pairs,
since this is so common.  Like:

       (org-babel-read
        (concat "'"
                (regexp-replace "'" "\""
                                ",[[:space:]]" " "
                                "\\]" ")"
                                "\\[" "("
                                results)))

Or some variation thereupon with some more ()s to group pairs.
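
One possible shape for the pairs variant, as a rough sketch only: the
name `my-regexp-replace-pairs' is hypothetical, and the calling
convention (flat REGEXP/REPLACEMENT pairs followed by the string) is
just one of the variations that could be chosen.

```elisp
;; Hypothetical function; not part of Emacs.
(defun my-regexp-replace-pairs (&rest args)
  "Apply REGEXP/REPLACEMENT pairs, in order, to the final argument.
ARGS has the form REGEXP1 REP1 REGEXP2 REP2 ... STRING."
  (let ((string (car (last args)))
        (pairs (butlast args)))
    (when (= (% (length pairs) 2) 1)
      (error "Odd number of REGEXP/REPLACEMENT arguments"))
    ;; Each step feeds the result of the previous replacement forward.
    (while pairs
      (setq string (replace-regexp-in-string (car pairs) (cadr pairs) string)
            pairs (cddr pairs)))
    string))

(my-regexp-replace-pairs "'" "\""
                         ",[[:space:]]" " "
                         "\\]" ")"
                         "\\[" "("
                         "['a', 'b']")
;; => "(\"a\" \"b\")"
```

Putting the string last keeps the pairs visually aligned, at the cost
of leaving no room for optional arguments such as LITERAL.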

The most popular way to deal with the awkwardness is to just give up and
go all imperative:

(defun authors-canonical-author-name (author file pos)
[...]
  (when author
    (setq author (replace-regexp-in-string "[ \t]*[(<].*$" "" author))
    (setq author (replace-regexp-in-string "\\`[ \t]+" "" author))
    (setq author (replace-regexp-in-string "[ \t]+$" "" author))
    (setq author (replace-regexp-in-string "[ \t]+" " " author))

Which leads me to my other point -- about a quarter of the usages of the
function in Emacs core have "" as the replacement, so perhaps that should
have its own function?  `regexp-remove'?

Then that could be:

  (when author
    (setq author (regexp-remove "[ \t]*[(<].*$" author))
    (setq author (regexp-remove "\\`[ \t]+" author))
    (setq author (regexp-remove "[ \t]+$" author))
    (setq author (regexp-replace "[ \t]+" " " author))

or

  (when author
    (setq author
          (regexp-replace
           "[ \t]+" " " (regexp-remove
                         "[ \t]*[(<].*$" (regexp-remove
                                          "\\`[ \t]+" (regexp-remove
                                                       "[ \t]+$" author))))))

or

  (when author
    (setq author
          (thread-last author
                       (regexp-remove "[ \t]*[(<].*$")
                       (regexp-remove "\\`[ \t]+")
                       (regexp-remove "[ \t]+$")
                       (regexp-replace "[ \t]+" " "))))
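
For illustration, a `regexp-remove'-style helper could be as small as
the sketch below.  `my-regexp-remove' and `my-clean-author' are
hypothetical names, and the sample author string is made up; the final
step uses the existing `replace-regexp-in-string' so that the example
runs as-is.

```elisp
;; `thread-last' is provided by subr-x.
(require 'subr-x)

;; Hypothetical helper; not part of Emacs.
(defun my-regexp-remove (regexp string)
  "Remove all matches for REGEXP from STRING."
  (replace-regexp-in-string regexp "" string))

;; Hypothetical wrapper around the clean-up steps from the example.
(defun my-clean-author (author)
  "Strip the address and surplus whitespace from AUTHOR."
  (thread-last author
    (my-regexp-remove "[ \t]*[(<].*$")   ; trailing "(...)" or "<...>" part
    (my-regexp-remove "\\`[ \t]+")       ; leading whitespace
    (my-regexp-remove "[ \t]+$")         ; trailing whitespace
    (replace-regexp-in-string "[ \t]+" " ")))  ; collapse inner runs

(my-clean-author "  Jane   Doe  <jane@example.org>")
;; => "Jane Doe"
```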


Or...  something else.  I'm sure nobody else has thought about this
issue before.  

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no