From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.bugs
Subject: bug#43598: replace-in-string: finishing touches
Date: Fri, 25 Sep 2020 13:11:15 +0200
Message-ID: <87lfgyqffw.fsf@gnus.org>
References: <77CD28C7-77C0-4DED-ACD0-21418489BADE@acm.org> <87blhuu3w9.fsf@gnus.org> <2CFAAACA-2FD3-44C2-B12E-E49DAA968115@acm.org>
To: Mattias Engdegård
Cc: 43598@debbugs.gnu.org

Mattias Engdegård writes:

> 1. Check the range of the START-POS argument so that we don't crash.
>    The permitted range is [0..N] where N is (length HAYSTACK), thus we
>    permit a start right after the last character but no further.
>    We could also return nil in these cases but I think an error is more useful.

Good point.  :-)

> 2. Make the docs more precise about various things.
>
> 3. Slight simplification of the implementation logic to avoid testing
>    the same conditions multiple times.
>
> 4. More tests, especially for edge cases. Can't have too many!

It all looks good to me; please apply.

> One test still fails:
>
>   (string-search "ø" "\303\270")
>
> which should return nil but currently matches.
> I think it's wrong to convert the needle to unibyte (using
> Fstring_as_unibyte) in this case, but I haven't decided what the best
> solution would be.

Yeah, that's the bit I was most unsure about, because it just didn't
look quite correct to me, but I couldn't come up with the correct test
case last night; thanks.

> We should also consider the optimisations:
> - If SCHARS(needle)>SCHARS(haystack) then no match is possible.

Yup.

> - If either needle or haystack is all-ASCII (all bytes in 0..127),
>   then we can use memmem without conversion.

Right, so if the multibyteness differs, then do another check to see
whether both strings are all-ASCII anyway, and do the comparison without
conversion...  Yes, makes sense to me.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
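
For illustration only, here is a rough standalone C sketch of the checks
discussed above: the START-POS range check ([0..N]), the "needle longer
than haystack" shortcut, and the all-ASCII fast path.  It is not the
Emacs implementation.  Every name in it (sketch_string_search,
sketch_all_ascii, the SKETCH_* codes) is made up for this example, it
works on plain byte buffers rather than Lisp strings, it follows the
"both strings are all-ASCII" reading, and it uses a plain memcmp loop as
a portable stand-in for memmem.  Anything that is not all-ASCII is only
reported as "needs conversion", since the right handling of that case is
still undecided in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch only.  Real code would
   signal an error for a bad start and return nil for no match.  */
enum
{
  SKETCH_NO_MATCH = -1,
  SKETCH_BAD_START = -2,
  SKETCH_NEEDS_CONVERSION = -3
};

/* Return true if every byte of S is in the range 0..127.  */
bool
sketch_all_ascii (const unsigned char *s, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (s[i] > 127)
      return false;
  return true;
}

/* Search for NEEDLE in HAYSTACK, starting at offset START.
   Return the offset of the first match, or one of the SKETCH_* codes.
   For all-ASCII data, byte offsets and character positions coincide,
   so no conversion between representations is needed.  */
ptrdiff_t
sketch_string_search (const unsigned char *needle, size_t needle_len,
                      const unsigned char *haystack, size_t haystack_len,
                      ptrdiff_t start)
{
  /* Permitted range is [0..N]: a start right after the last
     character is allowed, anything further is not.  */
  if (start < 0 || (size_t) start > haystack_len)
    return SKETCH_BAD_START;

  /* A needle longer than the haystack can never match.  */
  if (needle_len > haystack_len)
    return SKETCH_NO_MATCH;

  /* Fast path applies only when both strings are all-ASCII.  */
  if (!(sketch_all_ascii (needle, needle_len)
        && sketch_all_ascii (haystack, haystack_len)))
    return SKETCH_NEEDS_CONVERSION;

  /* Plain byte comparison; stand-in for memmem.  */
  for (size_t i = (size_t) start; i + needle_len <= haystack_len; i++)
    if (memcmp (haystack + i, needle, needle_len) == 0)
      return (ptrdiff_t) i;

  return SKETCH_NO_MATCH;
}
```

With this sketch, searching for "lo" in "hello" from offset 0 gives 3, an
empty needle at offset 5 (right after the last character) gives 5, and
offset 6 is rejected as out of range.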