From: Paul Eggert
Subject: Change Emacs 'sort' API to use three-way comparison
Date: Fri, 29 Aug 2014 14:19:44 -0700
Organization: UCLA Computer Science Department
Newsgroups: gmane.emacs.devel
To: Dmitry Antipov
Cc: emacs-devel@gnu.org
Message-ID: <5400EE70.8050207@cs.ucla.edu>
In-Reply-To: <83fvgfinea.fsf@gnu.org>

One infelicity I noticed in the recent change to 'sort' in the trunk is that the new implementation calls its predicate twice for each comparison. This is because the Lisp API says the comparison function returns a boolean (nil or non-nil), whereas qsort_r wants the comparison function to return a three-way value (negative, zero, or positive). If the predicate is expensive, the new Fsort can be twice as slow as the old. We could tune it, but I don't see how to get it below about 1.5x the old running time, assuming random input and an expensive comparison function.

To fix this I propose changing the API for 'sort' so that its function argument is no longer a predicate, but instead returns a negative integer, 0, or a positive integer. For compatibility with old code, it would treat nil as if it were nonpositive (thus requiring a reverse comparison) and non-integer non-nil values as if they were positive.
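The compatibility mapping just described can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from the Emacs sources: `struct value`, `enum kind`, `enum outcome`, and `classify` are hypothetical names that stand in for a Lisp return value and for how the proposed 'sort' might interpret it.

```c
#include <assert.h>

/* Hypothetical stand-in for the value returned by the user's function.
   Real Emacs uses Lisp_Object; this toy model has just three cases.  */
enum kind { KIND_NIL, KIND_INT, KIND_OTHER };

struct value
{
  enum kind kind;
  long i;                       /* meaningful only when kind == KIND_INT */
};

/* How the proposed API would read a return value (names are assumptions).  */
enum outcome { NEGATIVE, ZERO, POSITIVE, NEEDS_REVERSE };

static enum outcome
classify (struct value v)
{
  switch (v.kind)
    {
    case KIND_INT:
      /* New-style function: the sign of the integer is the answer.  */
      return v.i < 0 ? NEGATIVE : v.i > 0 ? POSITIVE : ZERO;
    case KIND_NIL:
      /* Old-style predicate said "false": known to be nonpositive only,
         so a reverse comparison is needed to tell negative from zero.  */
      return NEEDS_REVERSE;
    default:
      /* Old-style predicate said "true" with a non-integer value.  */
      return POSITIVE;
    }
}
```

In this sketch only the nil case costs a second call; a function written for the new API always returns an integer and is therefore called once per comparison.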
This wouldn't be 100% upward-compatible, because if an existing predicate returns a nonpositive integer to stand for 'true' there will be a silent change to behavior, but I expect such usage is so rare that we don't need to worry about it.
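For reference, the extra predicate calls that motivate the proposal can be sketched in C as follows. This is a minimal model, not the Fsort code: it uses standard `qsort` on an `int` array rather than `qsort_r` on Lisp objects, and `calls`, `less_p`, `compare3`, `cmp_via_predicate`, `cmp_direct`, and `count_calls` are hypothetical names introduced for the illustration. The adapter shown is the tuned variant, which skips the second call when the first one already answers the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts calls to the user-level function; a stand-in for its cost.  */
static long calls;

/* Boolean "less than" predicate, in the style of the current API.  */
static int
less_p (int a, int b)
{
  calls++;
  return a < b;
}

/* Three-way comparison, in the style of the proposed API.  */
static int
compare3 (int a, int b)
{
  calls++;
  return (a > b) - (a < b);
}

/* Adapter from a boolean predicate to the ternary result qsort wants.
   When the first call returns false, a second (reversed) call is
   needed to distinguish "equal" from "greater".  */
static int
cmp_via_predicate (const void *x, const void *y)
{
  int a = *(const int *) x, b = *(const int *) y;
  if (less_p (a, b))
    return -1;
  return less_p (b, a) ? 1 : 0;
}

/* With a three-way function, one call per comparison suffices.  */
static int
cmp_direct (const void *x, const void *y)
{
  return compare3 (*(const int *) x, *(const int *) y);
}

/* Sort a copy of IN (at most 64 elements) with CMP and return how many
   times the user-level function was called.  */
static long
count_calls (const int *in, size_t n,
             int (*cmp) (const void *, const void *))
{
  int buf[64];
  assert (n <= 64);
  memcpy (buf, in, n * sizeof *buf);
  calls = 0;
  qsort (buf, n, sizeof *buf, cmp);
  for (size_t i = 1; i < n; i++)
    assert (buf[i - 1] <= buf[i]);
  return calls;
}
```

Calling `count_calls` on the same input with `cmp_direct` and then with `cmp_via_predicate` gives a call count for the adapter that lies between one and two times the direct count. The exact ratio depends on the input and on the `qsort` implementation.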