From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Improve `replace-regexp-in-string' ergonomics?
Date: Thu, 23 Sep 2021 01:23:07 +0300
Message-ID: <2675d5ef-946d-0474-ef9d-a8989289c3d0@yandex.ru>
References: <878rzpw7jo.fsf@gnus.org> <87r1dgtlcy.fsf@gnus.org>
In-Reply-To: <87r1dgtlcy.fsf@gnus.org>
To: Lars Ingebrigtsen
Cc: emacs-devel@gnu.org

On 22.09.2021 23:18, Lars Ingebrigtsen wrote:

>> But if we target thread-first instead and make the new function accept
>> STRING in the first position, all optional arguments would be still
>> available.
>
> Yes, I've always found it weird that these functions have the object to
> be worked upon as the last non-optional parameter.  I had to look it up
> for years when using `replace-regexp-in-string'.  And it didn't help
> that Emacs took this function from XEmacs, which had the string in a
> different position...  But I don't remember where...
>
> *Lars says "apt install xemacs21"*
>
> I misremembered:
>
> `replace-in-string' is a compiled Lisp function
>   -- loaded from "/build/xemacs21-rcHAYB/xemacs21-21.4.24/lisp/subr.elc"
> (replace-in-string STR REGEXP NEWTEXT &optional LITERAL)
>
> So it has the placement of STRING that seems logical, I think.
>
> On the other hand, changing the placement in a new function like this
> will probably be even more confusing.

Adding a new function is the only time we *can* change the argument
order. If we subsequently obsolete the current function, it could fly.
It's not the wildest among the alternatives anyway -- the idea about the
argument being a list takes first place, I think. And either could
work, ultimately.

If we want to be able to use threading macros more consistently, it
seems functions should expect the "main" argument in either the first or
the last position, across the standard library. Or at least portions of
it. For example, in Clojure:

    By convention, core functions that operate on sequences expect the
    sequence as their last argument. Accordingly, pipelines containing
    map, filter, remove, reduce, into, etc usually call for the ->>
    macro.

    Core functions that operate on data structures, on the other hand,
    expect the value they work on as their first argument. These include
    assoc, update, dissoc, get and their -in variants. Pipelines that
    transform maps using these functions often require the -> macro.

(https://clojure.org/guides/threading_macros)

It seems to me that, with our penchant for optional arguments, it's
generally harder to put the "main" argument into the last position in
our case. I could be wrong, though.

But STRING being in neither the first nor the last position makes
threading macros decidedly less useful.

>>> (regexp-replace "'" "\""
>>>                 ",[[:space:]]" " "
>>>                 "\\]" ")"
>>>                 "\\[" "("
>>>                 results)))
>>> Or some variation thereupon with some more ()s to group pairs.
>>
>> I'm not sure how to also make it accept "normal" convention, and we
>> probably don't want to always have to wrap the args in an alist, even
>> when only one replacement is needed.
>
> No, that's the problem.  We could hack it up by doing a &rest in
> reality, and then checking if the first parameter is a list, but yuck.

Probably check that the number of &rest arguments divides by two as
well.

Or three, or four? FIXEDCASE, LITERAL and SUBEXP could apply to a single
replacement.
At best, it will create an ambiguity (do those args apply to all steps,
or do I need to repeat them?), but at worst it can limit the
applicability of the approach (when steps need different values of
these).

Threading solves it.

>>> (setq author (regexp-remove "[ \t]*[(<].*$" author))
>>> (setq author (regexp-remove "\\`[ \t]+" author))
>>> (setq author (regexp-remove "[ \t]+$" author))
>>> (setq author (regexp-replace "[ \t]+" " " author))
>>
>> IDK, if that leads to no increase in efficiency, then probably not?
>> Replacing with "" is an established pattern by now.
>
> It helps with readability -- the function says what the intention is.

True. I'm not sold, though.
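
To make the threading argument concrete, here is a rough sketch of the author cleanup quoted above, written with `thread-first' and a string-first replacement function. The names `my-regexp-replace' and `my-clean-author' are made up for illustration and do not exist in Emacs; the wrapper only reorders the arguments of `replace-regexp-in-string'. This is one possible shape, not a proposal for the final signature.

```elisp
(require 'subr-x)  ; provides `thread-first'

;; Hypothetical helper, not an existing Emacs function: it behaves like
;; `replace-regexp-in-string', but takes STRING as the first argument.
(defun my-regexp-replace (string regexp rep
                                 &optional fixedcase literal subexp start)
  "Replace matches for REGEXP in STRING with REP.
The optional arguments are passed through to `replace-regexp-in-string'."
  (replace-regexp-in-string regexp rep string fixedcase literal subexp start))

;; Hypothetical example function: the four steps from the quoted snippet,
;; expressed as one pipeline instead of repeated `setq' forms.
(defun my-clean-author (author)
  "Drop a trailing address or comment from AUTHOR and squeeze whitespace."
  (thread-first author
    (my-regexp-replace "[ \t]*[(<].*$" "")
    (my-regexp-replace "\\`[ \t]+" "")
    (my-regexp-replace "[ \t]+$" "")
    ;; Optional arguments stay available per step: here FIXEDCASE and
    ;; LITERAL are set for this replacement only.
    (my-regexp-replace "[ \t]+" " " t t)))
```

Since each step is its own call, there is no question of whether FIXEDCASE, LITERAL or SUBEXP apply to one replacement or to all of them, which is the ambiguity the &rest variant would have.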