From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Artur Malabarba Newsgroups: gmane.emacs.devel Subject: Re: What to do for faster `remove-duplicates'? Date: Wed, 6 May 2015 15:49:11 +0100 Message-ID: References: <87383atb2p.fsf@gmail.com> <873839ltoa.fsf@gmail.com> <87ioc5kd54.fsf@gmail.com> Reply-To: bruce.connor.am@gmail.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Trace: ger.gmane.org 1430923777 10324 80.91.229.3 (6 May 2015 14:49:37 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 6 May 2015 14:49:37 +0000 (UTC) Cc: Stefan Monnier , emacs-devel To: Oleh Krehel Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed May 06 16:49:36 2015 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Yq0dj-0000Az-Rm for ged-emacs-devel@m.gmane.org; Wed, 06 May 2015 16:49:35 +0200 Original-Received: from localhost ([::1]:45562 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yq0dj-0007PV-9w for ged-emacs-devel@m.gmane.org; Wed, 06 May 2015 10:49:35 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:59253) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yq0dN-0007PA-CG for emacs-devel@gnu.org; Wed, 06 May 2015 10:49:14 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Yq0dM-00023b-K9 for emacs-devel@gnu.org; Wed, 06 May 2015 10:49:13 -0400 Original-Received: from mail-lb0-x235.google.com ([2a00:1450:4010:c04::235]:33161) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yq0dM-00023U-CY for emacs-devel@gnu.org; Wed, 06 May 2015 10:49:12 -0400 Original-Received: by lbbzk7 with SMTP id zk7so9462412lbb.0 for ; Wed, 06 May 2015 07:49:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:sender:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=iPPOBYXRtIxfe9A9YhhYizsiAovrDx4yJeaKhiAhoss=; b=WV81+IS0oi40Kd1odkoF4i3SpkodN/UHQKZp7NSiYFe1qCRH2z+vW1MBgaS/Idv9DL gv+2fNLjB0SMaMHcXq5GpGMD+yMBJJIrMwdM/FtqfIz5AVBNhE7qJz2P9GHkZDc/k9h5 TuyEEzSZuolgoIhFTuFK9mW0gL+xbqAFROaGlUtHDoZHpmhZQONtXRsu7ekqVSnVljt1 MjH56EANy5XByYbtC8KC+LbGW34xfTN51MNkavspLo3oVIkL62BQLQ/JSk4ij8rPx9vw NgIWtM4qtOVDN5GKUHkYyIxJ+C3QU2SGY43gi9aqf/6r4pf9tVCBY52sZ9H2DM7kUVcO psLw== X-Received: by 10.112.97.202 with SMTP id ec10mr26683841lbb.4.1430923751503; Wed, 06 May 2015 07:49:11 -0700 (PDT) Original-Received: by 10.25.150.1 with HTTP; Wed, 6 May 2015 07:49:11 -0700 (PDT) In-Reply-To: <87ioc5kd54.fsf@gmail.com> X-Google-Sender-Auth: 6iOnCl6LndwgEeeYyu_Bw1S8vtM X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2a00:1450:4010:c04::235 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:186285 Archived-At: >>> I attach the patch. I did a bunch of `benchmark-run' and it seems that >>> 100 elements is the breaking point. >> >> Small question. How much slower is this patch compared to the current >> version on a list of 99 elements? (Due to having to calculate the >> length) > > For 99 unique candidates, the call to `length' takes 5% time compared to > the call to `delete-dups': > > (/ 3.774e-06 6.3028e-05) > 0.05987814939392017 > > It becomes worse for a small amount of unique candidates, going to 30% > for 10 candidates, but a lot of that is the standard cost of calling a > function. > > I don't know if it's worth optimizing further. Nah, I think it's fine, I was just curious. Doing (nthcdr 100 list) instead of (> (length list) 100) might improve that a little, but either way is fine.