From: Andreas Röhler
Newsgroups: gmane.emacs.help
Subject: Re: Sorting on compound keys?
Date: Sun, 29 May 2011 22:17:59 +0200
Message-ID: <4DE2A9F7.4040605@easy-emacs.de>
References: <4DDC9A94.3080903@easy-emacs.de> <4DDDF2FA.2050201@easy-emacs.de>
To: help-gnu-emacs@gnu.org
Cc: Tim Landscheidt

On 27.05.2011 00:49, Tim Landscheidt wrote:
> Andreas Röhler wrote:
>
>>>>> sometimes I want to sort unified diffs of CSV files (separated
>>>>> by tabs (here: \t)):
>
>>>>> | +A 1\t1\tx
>>>>> | +A 1\t2\ty
>>>>> | +B 2\t3\tz
>>>>> | -A 1\t1\tx
>>>>> | -B 2\t2\ty
>>>>> | -B 2\t3\tz
>
>>>>> by the second column, then the first column, then "+" vs.
>>>>> "-". Unfortunately, it seems that sort-regexp-fields doesn't
>>>>> allow more than one match field as a key. sort-fields
>>>>> doesn't work either as it requires the fields to be surrounded
>>>>> by white space (no "+" vs. "-") and doesn't allow
>>>>> white space inside the fields.
>
>>>>> Is there any function in vanilla Emacs (23.1.1) that I
>>>>> missed? I looked at pimping sort-regexp-fields, but it seems
>>>>> to me that sort-subr would have to be rewritten from scratch
>>>>> to achieve sorting on compound keys.
>
>>>> last time I looked into that feature was missing indeed.
>>>> However, didn't look for a need of re-write from the
>>>> scratch, just to extend to existing routine - ie. introduce
>>>> one or more levels of sorting.
>
>>> I remember our discussion in de.comp.editoren :-), but as I
>>> read sort-subr it is hard-coded that the sort key is one
>>> literal, continuous part of the buffer as sort-lists is a
>>> list of buffer positions.
>
>> sort-subr takes functions to determine the fields to sort.
>
> No, it accepts functions to determine the *boundaries* of
> the fields that have to be part of the buffer as I have
> written above.
>
>> As for the functions as arguments, maybe have a look at
>> `ar-th-sort' in thingatpt-utils-base.el
>
>> https://code.launchpad.net/s-x-emacs-werkstatt/
>
> How is this useful in this case?
>
> Tim
>

Hi Tim,

you are right. It does not have to be done inside sort-subr, but
on top of it.

BTW, as sort-fields (which the code below relies on) takes
whitespace as the field delimiter, there is no way to get "+A"
treated as two fields.

Apart from this limitation, the code below should provide sorting
on multiple fields.

;;; sort-multiple-keys.el --- sort multiple fields

;; Author: Andreas Roehler
;; Keywords: data

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see .

;;; Commentary:

;; Sort lines in region lexicographically by the fields
;; given in a list.  Lines that compare equal on an
;; earlier field are sorted by the next remaining one.
;; Any number of fields may be given in the list.
;; Fields are separated by whitespace and numbered from
;; 1 up.  With a negative field number, sort by that field
;; counted from the right.  Called from a program, there
;; are three arguments: BEG, END and FIELDS.  BEG
;; and END specify the region to sort.  The variable
;; `sort-fold-case' determines whether alphabetic case
;; affects the sort order.

;; Example - assume the lines below, uncommented, at the
;; beginning of a buffer:

;; +C 2 1 x
;; +A 2 2 y
;; +A 1 2 y
;; +A 1 2 z
;; +C 1 1 x
;; +A 4 2 z
;; +A 3 2 y
;; +B 3 3 x
;; +C 2 1 x
;; +B 2 3 z
;; -A 6 1 x
;; -B 1 2 y
;; -A 2 1 x
;; -B 1 3 z

;; sort region hierarchically by first, fourth and second field

;; (sort-multiple-fields 1 126 '(1 4 2))

;; ==>

;; +A 1 2 y
;; +A 2 2 y
;; +A 3 2 y
;; +A 1 2 z
;; +A 4 2 z
;; +B 3 3 x
;; +B 2 3 z
;; +C 1 1 x
;; +C 2 1 x
;; +C 2 1 x
;; -A 2 1 x
;; -A 6 1 x
;; -B 1 2 y
;; -B 1 3 z

;;; Code:

;; `sort-fields' is autoloaded, `sort-skip-fields' is not.
(require 'sort)

(defun sort-multiple-fields (beg end fields)
  "Sort lines between BEG and END hierarchically by FIELDS.
FIELDS is a list of field numbers as understood by `sort-fields';
the first one is the most significant key.  Interactively, keep
asking for field numbers until no further field is wanted."
  (interactive
   (let (fields)
     (barf-if-buffer-read-only)
     (setq fields (list (read-number "Sort for field: ")))
     (while (yes-or-no-p "Sort another field? ")
       (push (read-number "Sort for field: ") fields))
     ;; `push' prepends, so restore the order of entry.
     (setq fields (nreverse fields))
     (message "Sorting for fields %S" fields)
     (list (region-beginning) (region-end) fields)))
  ;; Accept a single field number as well as a list.
  (unless (listp fields)
    (setq fields (list fields)))
  (save-excursion
    (sort-multiple-fields-base beg end fields)))

(defun sort-multiple-fields-base (beg end fields)
  "Sort lines between BEG and END by the first of FIELDS.
Runs of lines with equal keys are then sorted by the rest of FIELDS."
  (save-restriction
    (narrow-to-region beg end)
    (sort-fields (car fields) (point-min) (point-max))
    (when (cdr fields)
      (sort-multiple-fields-intern (point-min) (point-max)
                                   (car fields) (cdr fields)))))

(defun sort-multiple-fields-key (field)
  "Return the text of FIELD on the current line."
  (save-excursion
    (beginning-of-line)
    (sort-skip-fields field)
    (buffer-substring-no-properties
     (point)
     (progn (skip-chars-forward "^ \t\n") (point)))))

(defun sort-multiple-fields-intern (beg end last this-fields)
  "Sort runs of lines between BEG and END that agree in field LAST.
Each run of two or more such lines is sorted by THIS-FIELDS.
Sorting only reorders lines, so the integer positions stay valid."
  (let (run-beg run-end key)
    (goto-char beg)
    (while (< (point) end)
      (setq run-beg (point)
            key (sort-multiple-fields-key last))
      (forward-line 1)
      ;; Extend the run while the following lines carry the same key.
      (while (and (< (point) end)
                  (eq t (compare-strings
                         key nil nil
                         (sort-multiple-fields-key last) nil nil
                         sort-fold-case)))
        (forward-line 1))
      (setq run-end (point))
      (when (> (count-lines run-beg run-end) 1)
        ;; There is another region to sort.
        (sort-multiple-fields-base run-beg run-end this-fields))
      (goto-char run-end))))

(provide 'sort-multiple-keys)
;;; sort-multiple-keys.el ends here
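
A side note on the limitation mentioned above ("+A" cannot be split
into two fields): sort-subr also accepts a STARTKEYFUN that returns
the key itself as a value instead of only moving point, so a compound
key can be assembled as a string. The following is only a sketch under
assumptions, not a tested solution for the original problem: the names
`my-compound-key' and `my-sort-diff-lines-by-compound-key' are
hypothetical, every line is assumed to start with "+" or "-" followed
by tab-separated columns, and keys are compared with plain `string<'
(so numbers sort as text and `sort-fold-case' is ignored).

```elisp
(require 'sort)

(defun my-compound-key ()
  "Return a compound sort key for the line at point.
Hypothetical helper: second column, then first column, then the
leading \"+\" or \"-\", joined by tabs."
  (let ((line (buffer-substring-no-properties
               (line-beginning-position) (line-end-position))))
    (if (string= line "")
        ;; An empty line still needs a non-nil key.
        ""
      (let ((sign (substring line 0 1))
            (cols (split-string (substring line 1) "\t")))
        ;; Most significant part first.  A tab cannot occur inside
        ;; a column, so it is a safe separator.
        (concat (or (nth 1 cols) "") "\t"
                (or (nth 0 cols) "") "\t"
                sign)))))

(defun my-sort-diff-lines-by-compound-key (beg end)
  "Sort lines between BEG and END by the key of `my-compound-key'."
  (interactive "*r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      ;; Records are lines, as in `sort-lines'; the key is the
      ;; string returned by the key function, not a buffer range.
      (sort-subr nil #'forward-line #'end-of-line
                 #'my-compound-key))))
```

Since the key is computed rather than cut out of the buffer, it does
not have to be one continuous part of the line. For numeric columns
the key function would have to pad or convert the numbers, or a
PREDICATE would have to be passed to sort-subr as its sixth argument.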