From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: "Colin Woodbury" Newsgroups: gmane.emacs.devel Subject: Re: [PATCH] lisp/files.el: Add `file-name-set-extension` Date: Tue, 25 May 2021 14:21:07 -0700 Message-ID: <3780a7f9-19f4-4216-baa9-ce00b3dbace9@www.fastmail.com> References: <6ff4b7d7-03ac-48d1-8d49-de66431d4e5b@www.fastmail.com> <87lf82y9pv.fsf@tcd.ie> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=8564ca5883344b0883af151232898444 Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="40262"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Cyrus-JMAP/3.5.0-alpha0-448-gae190416c7-fm-20210505.004-gae190416 Cc: "Basil L. Contovounesios" , emacs-devel@gnu.org To: "Stefan Monnier" Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Tue May 25 23:22:38 2021 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lleVY-000AGs-PJ for ged-emacs-devel@m.gmane-mx.org; Tue, 25 May 2021 23:22:37 +0200 Original-Received: from localhost ([::1]:41874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lleVX-00037U-1B for ged-emacs-devel@m.gmane-mx.org; Tue, 25 May 2021 17:22:35 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:34106) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lleUh-0002S6-0S for emacs-devel@gnu.org; Tue, 25 May 2021 17:21:43 -0400 Original-Received: from out4-smtp.messagingengine.com ([66.111.4.28]:37427) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lleUe-0004uj-Tm for emacs-devel@gnu.org; Tue, 25 May 2021 17:21:42 -0400 Original-Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id 60C255C01D5; Tue, 25 May 2021 17:21:40 -0400 (EDT) Original-Received: from imap1 ([10.202.2.51]) by compute1.internal (MEProxy); Tue, 25 May 2021 17:21:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fosskers.ca; h= mime-version:message-id:in-reply-to:references:date:from:to:cc :subject:content-type; s=fm1; bh=N6E+fjGIvE5bNgVMtwC2Jnv4unFS+hn GLxPFwaTHoXw=; b=hqc+z6PU2dPKTTeHBHsM/RvVyBn7Z3xJgWNjOKQOKgrw43N qbRZwYbH+beBmkx876joh/CSxHA+sAQS5IJ+zh17O9G4KPUPuH90fAr74gmRhYCj uLdn+PK57lXmriJypcMU0DB20AFWwVcZawKu7qCX0BxNpMuxLg1zQmPR/9/F28yU EJq/SilNbATP/wbPuiNS0FybvnR8qiWt8b17ZHKL6u53yEpm8SmhBVJxysa9kRD9 2PrlBUDllLVEYwJpMV5PL+L6QHysLXRvPWuwzGAF3otfefdf5cBCbrxI6QUuIzi9 zm9lRWuSVfpYll6Fap9ZzJfrvoEnMWWF5xBn/OQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; bh=N6E+fj GIvE5bNgVMtwC2Jnv4unFS+hnGLxPFwaTHoXw=; b=GFUQ7Whtl8O751N9nEtZQV K6VlKJOZfRVDBPmu0igr4581QiRc22/X/JICjaEe9frfGWMDYrbAR/VYFeisEINA I4ght57vhPfs2N/Ouu7smE7xZa/jaC+KYGYRzTegTIYWlGMBLKHj3NR+hozYQtm/ RyG5SbHKIl8s/gPm8PULUBflSTXPnwezVhYxdj/CIyNpuuRURoRtOYVKpNwnXcpi aK6tOvbvw1E0lkvMyXsrs8jpuDEO0OfMej/WR8XLJQ89AZ3OzFZho6tt75TxO3Kf PvX7sRfIfQr9wwUUvMqxBPqP7B5BarwA0B0tkIbtLccmeeDEPaKf4y6gYr4A5vAA == X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdekuddgudeiudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpefofgggkfgjfhffhffvufgtsegrtderreerredtnecuhfhrohhmpedfveho lhhinhcuhghoohgusghurhihfdcuoegtohhlihhnsehfohhsshhkvghrshdrtggrqeenuc ggtffrrghtthgvrhhnpeefhfdvgedtueejffdtiefgheeugedvhfegueevuedtleevgeel geevfeffudejffenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfh hrohhmpegtohhlihhnsehfohhsshhkvghrshdrtggr X-ME-Proxy: Original-Received: by mailuser.nyi.internal (Postfix, from userid 501) id E2FF0130005F; Tue, 25 May 2021 17:21:39 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface In-Reply-To: Received-SPF: none client-ip=66.111.4.28; envelope-from=colin@fosskers.ca; helo=out4-smtp.messagingengine.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_PASS=-0.001, SPF_NONE=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:269898 Archived-At: --8564ca5883344b0883af151232898444 Content-Type: text/plain That regex was borrowed directly from `string-trim`, to which I added a `.`. You're right that the main motivation was to allow the caller to pass either `.foo` or `foo` as the extension and have either case work (many languages do this). Leaving the other characters in the regex seemed like a sanitary thing to do, just in case they pass something bogus. Although as the code is, there's nothing preventing them from still giving bogus characters before the file, or after the extension, making the resulting filepath still meaningless. So we could consider: 1. Simplifying the regex to just include dots (and perhaps whitespace) (i.e. trust the caller), or; 2. Expanding the logic to sanitize both ends of both arguments, (i.e. don't trust the caller), or; 3. Adding error-throwing logic if malformed files or extensions are given (i.e. warn the user). I'm happy to alter the patch for any of those scenarios, whichever is preferable. Thanks! On Tue, 25 May 2021, at 13:25, Stefan Monnier wrote: > > + (when (and filename extension) > > + (let* ((patt "[ \t\n\r.]+") ; Inspired by `string-trim'. > > + (filename (string-trim-right filename patt)) > > + (extension (string-trim-left extension patt))) > > + (unless (or (string-empty-p filename) > > + (string-empty-p extension)) > > + (concat (file-name-sans-extension filename) "." extension))))) > > Why do you trim [ \t\n\r.]+ from the end of the filename and the > beginning of the extension? > > I can see why you'd want to remove a single "." at the beginning of > `extension`, so as to allow extension to come with or without a leading > ".", but the rest seems rather surprising because such characters in my > experience almost never show up in such a circumstance (which means > that if they do, it's either on purpose and we should preserve it, or > it's an error "upstream" and we should try and help expose the error > rather than silently try to cover it). > > > Stefan > > --8564ca5883344b0883af151232898444 Content-Type: text/html Content-Transfer-Encoding: quoted-printable
That regex was = borrowed directly from `string-trim`, to which I added a `.`. You're rig= ht that the main motivation was to allow the caller to pass either `.foo= ` or `foo` as the extension and have either case work (many languages do= this). Leaving the other characters in the regex seemed like a sanitary= thing to do, just in case they pass something bogus. Although as the co= de is, there's nothing preventing them from still giving bogus character= s before the file, or after the extension, making the resulting filepath= still meaningless.

So we could consider:<= br>

1. Simplifying the regex to just include do= ts (and perhaps whitespace) (i.e. trust the caller), or;
2= . Expanding the logic to sanitize both ends of both arguments, (i.e. don= 't trust the caller), or;
3. Adding error-throwing logic i= f malformed files or extensions are given (i.e. warn the user).

I'm happy to alter the patch for any of those scena= rios, whichever is preferable. Thanks!

On T= ue, 25 May 2021, at 13:25, Stefan Monnier wrote:
> +  (when (and filename e= xtension)
> +    (let* ((patt "[ \t\n\r.= ]+") ; Inspired by `string-trim'.
> +   =         (filename (string-trim-right = filename patt))
> +      =      (extension (string-trim-left extension patt)))<= br>
> +      (unless (or (string-e= mpty-p filename)
> +      = ;            (str= ing-empty-p extension))
> +    &nbs= p;   (concat (file-name-sans-extension filename) "." extension= )))))

Why do you trim [ \t\n\r.]+ from the = end of the filename and the
beginning of the extension?

I can see why you'd want to remove a single "= ." at the beginning of
`extension`, so as to allow extensi= on to come with or without a leading
".", but the rest see= ms rather surprising because such characters in my
experie= nce almost never show up in such a circumstance (which means
that if they do, it's either on purpose and we should preserve it, or=
it's an error "upstream" and we should try and help expos= e the error
rather than silently try to cover it).


      = ;  Stefan


=
--8564ca5883344b0883af151232898444--