From: Lars Ingebrigtsen <larsi@gnus.org>
To: emacs-devel@gnu.org
Subject: Improve `replace-regexp-in-string' ergonomics?
Date: Wed, 22 Sep 2021 06:36:27 +0200 [thread overview]
Message-ID: <878rzpw7jo.fsf@gnus.org> (raw)
`replace-regexp-in-string' often leads to pretty awkward code. I wonder
whether we could improve it somehow.
Here's a real life example:
(defun org-babel-js-read (results)
[...]
(org-babel-read
(concat "'"
(replace-regexp-in-string
"\\[" "(" (replace-regexp-in-string
"\\]" ")" (replace-regexp-in-string
",[[:space:]]" " "
(replace-regexp-in-string
"'" "\"" results))))))
That's kinda hard to read, but variations on this is pretty common.
When you have one `replace-regexp-in-string', you often have another.
We introduced `thread-last' in 2014, and there seems to be one (1) place
in the Emacs code base, so I guess that didn't take off, but rewriting
with that, we get:
(org-babel-read
(concat "'"
(thread-last
results
(replace-regexp-in-string "'" "\"")
(replace-regexp-in-string ",[[:space:]]" " ")
(replace-regexp-in-string "\\]" ")")
(replace-regexp-in-string "\\[" "("))))
Which is somewhat more readable (but note that this totally breaks down
if you want to mix in LITERAL etc). But I wonder whether we should
consider renaming the function to something more palatable, and since we
have `string-replace', why not `regexp-replace'? The length of the name
of this common function is itself offputting.
(org-babel-read
(concat "'"
(thread-last
results
(regexp-replace "'" "\"")
(regexp-replace ",[[:space:]]" " ")
(regexp-replace "\\]" ")")
(regexp-replace "\\[" "("))))
We could also consider making `regexp-replace' take a series of pairs,
since this is so common. Like:
(org-babel-read
(concat "'"
(regexp-replace "'" "\""
",[[:space:]]" " "
"\\]" ")"
"\\[" "("
results)))
Or some variation thereupon with some more ()s to group pairs.
The most popular way to deal with the awkwardness is to just give up and
go all imperative:
(defun authors-canonical-author-name (author file pos)
[...]
(when author
(setq author (replace-regexp-in-string "[ \t]*[(<].*$" "" author))
(setq author (replace-regexp-in-string "\\`[ \t]+" "" author))
(setq author (replace-regexp-in-string "[ \t]+$" "" author))
(setq author (replace-regexp-in-string "[ \t]+" " " author))
Which leads me to my other point -- about a quarter of the usages of the
function in Emacs core has "" as the replacement, so perhaps that should
have its own function? `regexp-remove'?
Then that could be:
(when author
(setq author (regexp-remove "[ \t]*[(<].*$" author))
(setq author (regexp-remove "\\`[ \t]+" author))
(setq author (regexp-remove "[ \t]+$" author))
(setq author (regexp-replace "[ \t]+" " " author))
or
(when author
(setq author
(regexp-replace
"[ \t]+" " " (regexp-remove
"[ \t]*[(<].*$" (regexp-remove
"\\`[ \t]+" (regexp-remove
"[ \t]+$" author)))))))
or
(when author
(setq author
(thread-last author
(regexp-remove "[ \t]*[(<].*$")
(regexp-remove "\\`[ \t]+")
(regexp-remove "[ \t]+$")
(regexp-replace "[ \t]+" " ")))))
Or... something else. I'm sure nobody else has thought about this
issue before.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
next reply other threads:[~2021-09-22 4:36 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-22 4:36 Lars Ingebrigtsen [this message]
2021-09-22 5:22 ` Improve `replace-regexp-in-string' ergonomics? Yuri Khan
2021-09-22 6:36 ` Lars Ingebrigtsen
2021-09-22 7:47 ` Thierry Volpiatto
2021-09-22 5:24 ` Po Lu
2021-09-22 6:37 ` Lars Ingebrigtsen
2021-09-22 10:56 ` Po Lu
2021-09-22 20:08 ` Lars Ingebrigtsen
2021-09-23 0:11 ` Po Lu
2021-09-22 7:33 ` Adam Porter
2021-09-22 8:09 ` Lars Ingebrigtsen
2021-09-22 7:51 ` Andreas Schwab
2021-09-22 8:14 ` Augusto Stoffel
2021-09-22 8:21 ` Adam Porter
2021-09-22 18:01 ` Stefan Monnier
2021-09-22 18:24 ` Basil L. Contovounesios
2021-09-22 22:56 ` Adam Porter
2021-09-22 23:53 ` Eric Abrahamsen
2021-09-22 20:06 ` Lars Ingebrigtsen
2021-09-22 10:59 ` Dmitry Gutov
2021-09-22 20:18 ` Lars Ingebrigtsen
2021-09-22 22:23 ` Dmitry Gutov
2021-09-22 23:24 ` [External] : " Drew Adams
2021-09-22 18:14 ` Stefan Monnier
2021-09-22 19:30 ` Mattias Engdegård
2021-09-22 20:22 ` Lars Ingebrigtsen
2021-09-22 20:29 ` Lars Ingebrigtsen
2021-09-23 2:15 ` Stefan Monnier
2021-10-05 16:18 ` Juri Linkov
2021-10-12 6:53 ` Juri Linkov
2021-10-12 12:10 ` Lars Ingebrigtsen
2021-10-12 12:34 ` Stefan Monnier
2021-10-12 12:41 ` Lars Ingebrigtsen
2021-10-12 13:18 ` Lars Ingebrigtsen
2021-10-12 13:32 ` Mattias Engdegård
2021-10-12 15:48 ` Stefan Monnier
2021-10-12 13:33 ` Thierry Volpiatto
2021-10-12 19:16 ` Juri Linkov
2021-10-12 20:44 ` Thierry Volpiatto
2021-10-13 7:57 ` Juri Linkov
2021-10-13 8:41 ` Thierry Volpiatto
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878rzpw7jo.fsf@gnus.org \
--to=larsi@gnus.org \
--cc=emacs-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).