unofficial mirror of emacs-devel@gnu.org 
From: Dmitry Gutov <dgutov@yandex.ru>
To: Lars Ingebrigtsen <larsi@gnus.org>
Cc: emacs-devel@gnu.org
Subject: Re: Improve `replace-regexp-in-string' ergonomics?
Date: Thu, 23 Sep 2021 01:23:07 +0300
Message-ID: <2675d5ef-946d-0474-ef9d-a8989289c3d0@yandex.ru>
In-Reply-To: <87r1dgtlcy.fsf@gnus.org>

On 22.09.2021 23:18, Lars Ingebrigtsen wrote:

>> But if we target thread-first instead and make the new function accept
>> STRING in the first position, all optional arguments would be still
>> available.
> 
> Yes, I've always found it weird that these functions have the object to
> be worked upon as the last non-optional parameter.  I had to look it up
> for years when using `replace-regexp-in-string'.  And it didn't help
> that Emacs took this function from XEmacs, which had the string in a
> different position...  But I don't remember where...
> 
> *Lars says "apt install xemacs21"*
> 
> I misremembered:
> 
> `replace-in-string' is a compiled Lisp function
>    -- loaded from "/build/xemacs21-rcHAYB/xemacs21-21.4.24/lisp/subr.elc"
> (replace-in-string STR REGEXP NEWTEXT &optional LITERAL)
> 
> So it has the placement of STRING that seems logical, I think.
> 
> On the other hand, changing the placement in a new function like this
> will probably be even more confusing.

Adding a new function is the only time we *can* change the argument
order. If we subsequently obsolete the current function, it could fly.

It's not the wildest of the alternatives anyway -- the idea of the
argument being a list takes first place there, I think. Either could
work, ultimately.

If we want to be able to use threading macros more consistently, it 
seems functions should expect the "main" argument in either the first or 
the last position, across the standard library. Or at least portions of it.

For example, in Clojure:

   By convention, core functions that operate on sequences expect the
   sequence as their last argument. Accordingly, pipelines containing
   map, filter, remove, reduce, into, etc usually call for the ->> macro.

   Core functions that operate on data structures, on the other hand,
   expect the value they work on as their first argument. These include
   assoc, update, dissoc, get and their -in variants. Pipelines that
   transform maps using these functions often require the -> macro.

(https://clojure.org/guides/threading_macros)
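
For comparison, Emacs Lisp already has both shapes in its standard
library. The following is only a small sketch with made-up sample data:
`seq-filter' and `mapcar' take the sequence last, so they fit
`thread-last', while `plist-put' and `plist-get' take the plist first,
so they fit `thread-first'.

```elisp
(require 'subr-x) ; provides `thread-first' and `thread-last'
(require 'seq)    ; provides `seq-filter'

;; Sequence functions take the sequence as the last argument,
;; so `thread-last' (the counterpart of Clojure's ->>) applies.
(thread-last '(1 2 3 4 5 6)
  (seq-filter (lambda (n) (zerop (% n 2))))
  (mapcar (lambda (n) (* n n))))
;; => (4 16 36)

;; Plist functions take the plist as the first argument,
;; so `thread-first' (the counterpart of Clojure's ->) applies.
(thread-first (list :a 1)
  (plist-put :b 2)
  (plist-get :b))
;; => 2
```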

It seems to me that, with our penchant for optional arguments, it's
generally harder to put the "main" argument in the last position in our
case. I could be wrong, though. But STRING being in neither the first
nor the last position makes threading macros decidedly less useful.
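
To make that concrete, here is a minimal sketch. The name
`my-string-replace-regexp' is hypothetical (no such function exists in
Emacs); it is just `replace-regexp-in-string' with STRING moved to the
first position, so `thread-first' can be used and every optional
argument stays reachable.

```elisp
(require 'subr-x) ; provides `thread-first' and `thread-last'

;; Hypothetical wrapper, not an existing Emacs function:
;; same behavior as `replace-regexp-in-string', but STRING comes first.
(defun my-string-replace-regexp (string regexp rep
                                 &optional fixedcase literal subexp start)
  "Replace all matches for REGEXP in STRING with REP.
The optional arguments are passed on to `replace-regexp-in-string'."
  (replace-regexp-in-string regexp rep string
                            fixedcase literal subexp start))

;; With STRING first, each step can still take its own optional args.
(thread-first "['a', 'b']"
  (my-string-replace-regexp "'" "\"")
  (my-string-replace-regexp ",[[:space:]]" " ")
  (my-string-replace-regexp "\\]" ")" t t)
  (my-string-replace-regexp "\\[" "(" t t))
;; => "(\"a\" \"b\")"

;; The existing function only threads (via `thread-last') as long as
;; no optional argument is needed, because STRING is then the last one.
(thread-last "a \t b"
  (replace-regexp-in-string "[ \t]+" " "))
;; => "a b"
```

As soon as one step needs FIXEDCASE or LITERAL, the `thread-last' form
stops working with the current signature, which is the limitation
described above.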

>>> 		(regexp-replace "'" "\""
>>> 				",[[:space:]]" " "
>>> 				"\\]" ")"
>>> 				"\\[" "("
>>> 				results)))
>>> Or some variation thereupon with some more ()s to group pairs.
>>
>> I'm not sure how to also make it accept "normal" convention, and we
>> probably don't want to always have to wrap the args in an alist, even
>> when only one replacement is needed.
> 
> No, that's the problem.  We could hack it up by doing a &rest in
> reality, and then checking if the first parameter is a list, but yuck.

We'd probably also need to check that the number of &rest arguments is
divisible by two. Or three, or four? FIXEDCASE, LITERAL and SUBEXP could
apply to a single replacement. At best, that creates an ambiguity (do
those args apply to all steps, or do I need to repeat them?); at worst,
it limits the applicability of the approach (when steps need different
values of these). Threading solves that.
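
A rough sketch of what the &rest variant might look like, assuming the
simplest reading of the proposal: REGEXP/REP pairs followed by the
string. The name `my-regexp-replace-pairs' is hypothetical. Note that
this shape leaves no room for per-step FIXEDCASE, LITERAL or SUBEXP.

```elisp
;; Hypothetical function, not an existing Emacs API.
(defun my-regexp-replace-pairs (&rest args)
  "Apply REGEXP/REP pairs in order to a string.
ARGS has the form REGEXP1 REP1 REGEXP2 REP2 ... STRING."
  (let ((string (car (last args)))
        (pairs (butlast args)))
    ;; The arguments before STRING must come in pairs.
    (unless (and (stringp string)
                 (zerop (% (length pairs) 2)))
      (signal 'wrong-number-of-arguments
              (list 'my-regexp-replace-pairs (length args))))
    (while pairs
      ;; Arguments are evaluated left to right, so the first `pop'
      ;; yields the regexp and the second one its replacement.
      (setq string (replace-regexp-in-string (pop pairs) (pop pairs)
                                             string)))
    string))

(my-regexp-replace-pairs "'" "\""
                         ",[[:space:]]" " "
                         "\\]" ")"
                         "\\[" "("
                         "['a', 'b']")
;; => "(\"a\" \"b\")"
```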

>>>       (setq author (regexp-remove "[ \t]*[(<].*$" author))
>>>       (setq author (regexp-remove "\\`[ \t]+" author))
>>>       (setq author (regexp-remove "[ \t]+$" author))
>>>       (setq author (regexp-replace "[ \t]+" " " author))
>>
>> IDK, if that leads to no increase in efficiency, then probably not?
>> Replacing with "" is an established pattern by now.
> 
> It helps with readability -- the function says what the intention is.

True. I'm not sold, though.
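
For reference, the helper under discussion would be a thin wrapper. A
minimal sketch, assuming the REGEXP STRING argument order from the
quoted snippet; `my-regexp-remove' and `my-clean-author' are
hypothetical names, and the sample address is made up:

```elisp
;; Hypothetical helper, not an existing Emacs function.
(defun my-regexp-remove (regexp string)
  "Return STRING with every match for REGEXP deleted."
  ;; FIXEDCASE and LITERAL are t: an empty replacement needs neither
  ;; case conversion nor backslash processing.
  (replace-regexp-in-string regexp "" string t t))

(defun my-clean-author (author)
  "Strip the address part and surplus whitespace from AUTHOR."
  (setq author (my-regexp-remove "[ \t]*[(<].*$" author))
  (setq author (my-regexp-remove "\\`[ \t]+" author))
  (setq author (my-regexp-remove "[ \t]+$" author))
  (replace-regexp-in-string "[ \t]+" " " author))

(my-clean-author "  Jane   Doe <jane@example.org>")
;; => "Jane Doe"
```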


