From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Improve `replace-regexp-in-string' ergonomics?
Date: Wed, 22 Sep 2021 13:59:54 +0300
References: <878rzpw7jo.fsf@gnus.org>
In-Reply-To: <878rzpw7jo.fsf@gnus.org>
To: Lars Ingebrigtsen, emacs-devel@gnu.org

On 22.09.2021 07:36, Lars Ingebrigtsen wrote:

> But I wonder whether we should
> consider renaming the function to something more palatable, and since we
> have `string-replace', why not `regexp-replace'?  The length of the name
> of this common function is itself offputting.
>
> (org-babel-read
>  (concat "'"
>          (thread-last
>            results
>            (regexp-replace "'" "\"")
>            (regexp-replace ",[[:space:]]" " ")
>            (regexp-replace "\\]" ")")
>            (regexp-replace "\\[" "("))))

This way makes it impossible to use any optional arguments, right?

But if we target thread-first instead and make the new function accept
STRING in the first position, all optional arguments would still be
available.

> We could also consider making `regexp-replace' take a series of pairs,
> since this is so common.  Like:
>
> (org-babel-read
>  (concat "'"
>          (regexp-replace "'" "\""
>                          ",[[:space:]]" " "
>                          "\\]" ")"
>                          "\\[" "("
>                          results)))
>
> Or some variation thereupon with some more ()s to group pairs.

I'm not sure how to also make it accept the "normal" convention, and we
probably don't want to always have to wrap the args in an alist, even
when only one replacement is needed.

> The most popular way to deal with the awkwardness is to just give up and
> go all imperative:
>
> (defun authors-canonical-author-name (author file pos)
>   [...]
>   (when author
>     (setq author (replace-regexp-in-string "[ \t]*[(<].*$" "" author))
>     (setq author (replace-regexp-in-string "\\`[ \t]+" "" author))
>     (setq author (replace-regexp-in-string "[ \t]+$" "" author))
>     (setq author (replace-regexp-in-string "[ \t]+" " " author))
>
> Which leads me to my other point -- about a quarter of the usages of the
> function in Emacs core has "" as the replacement, so perhaps that should
> have its own function?  `regexp-remove'?
>
> Then that could be:
>
>   (when author
>     (setq author (regexp-remove "[ \t]*[(<].*$" author))
>     (setq author (regexp-remove "\\`[ \t]+" author))
>     (setq author (regexp-remove "[ \t]+$" author))
>     (setq author (regexp-replace "[ \t]+" " " author))

IDK, if that leads to no increase in efficiency, then probably not?
Replacing with "" is an established pattern by now.
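
To make the thread-first idea from earlier in this message concrete,
here is a rough sketch.  The name `my-regexp-replace' and its exact
argument list are assumptions made up for illustration, not an existing
or agreed-upon function; it only wraps `replace-regexp-in-string' with
STRING moved to the first position.

```elisp
;; Hypothetical sketch: `my-regexp-replace' does not exist in Emacs.
;; STRING comes first so the function composes with `thread-first',
;; and every optional argument is forwarded unchanged.
(require 'subr-x)                       ; provides `thread-first'

(defun my-regexp-replace (string regexp rep
                                 &optional fixedcase literal subexp start)
  "Replace all matches for REGEXP in STRING with REP.
FIXEDCASE, LITERAL, SUBEXP and START are passed on to
`replace-regexp-in-string'."
  (replace-regexp-in-string regexp rep string
                            fixedcase literal subexp start))

;; The chain from the quoted example, rewritten for `thread-first'.
(thread-first "[1, 2, 'a']"
              (my-regexp-replace "'" "\"")
              (my-regexp-replace ",[[:space:]]" " ")
              (my-regexp-replace "\\]" ")")
              (my-regexp-replace "\\[" "("))
;; => "(1 2 \"a\")"

;; Optional arguments stay usable inside the chain; here LITERAL is
;; non-nil, so the backslash in REP is inserted verbatim.
(thread-first "a.b"
              (my-regexp-replace "\\." "\\" nil t))
```

With `thread-last' the threaded value always lands in the final
argument slot, which is why the optional arguments cannot be supplied
there.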