From: "Andreas Röhler" <andreas.roehler@easy-emacs.de>
To: help-gnu-emacs@gnu.org
Cc: Tim Landscheidt <tim@tim-landscheidt.de>
Subject: Re: Sorting on compound keys?
Date: Sun, 29 May 2011 22:17:59 +0200
Message-ID: <4DE2A9F7.4040605@easy-emacs.de>
In-Reply-To: <m3boyprsbj.fsf@passepartout.tim-landscheidt.de>
On 27.05.2011 00:49, Tim Landscheidt wrote:
> Andreas Röhler<andreas.roehler@easy-emacs.de> wrote:
>
>>>>> sometimes I want to sort unified diffs of CSV files
>>>>> (separated by tabs (here: \t)):
>
>>>>> | +A 1\t1\tx
>>>>> | +A 1\t2\ty
>>>>> | +B 2\t3\tz
>>>>> | -A 1\t1\tx
>>>>> | -B 2\t2\ty
>>>>> | -B 2\t3\tz
>
>>>>> by the second column, then the first column, then "+" vs.
>>>>> "-". Unfortunately, it seems that sort-regexp-fields doesn't
>>>>> allow more than one match field as a key. sort-fields
>>>>> doesn't work either as it requires the fields to be
>>>>> surrounded by white space (no "+" vs. "-") and doesn't allow
>>>>> white space inside the fields.
>
>>>>> Is there any function in vanilla Emacs (23.1.1) that I
>>>>> missed? I looked at pimping sort-regexp-fields, but it seems
>>>>> to me that sort-subr would have to be rewritten from scratch
>>>>> to achieve sorting on compound keys.
>
>>>> Last time I looked, that feature was indeed missing.
>>>> However, I didn't see a need for a rewrite from
>>>> scratch, just to extend the existing routine, i.e. introduce
>>>> one or more levels of sorting.
>
>>> I remember our discussion in de.comp.editoren :-), but as I
>>> read sort-subr it is hard-coded that the sort key is one
>>> literal, continuous part of the buffer as sort-lists is a
>>> list of buffer positions.
>
>> sort-subr takes functions to determine the fields to sort.
>
> No, it accepts functions to determine the *boundaries* of
> the fields that have to be part of the buffer as I have
> written above.
>
>> As for the functions as arguments, maybe have a look at
>> `ar-th-sort' in thingatpt-utils-base.el
>
>> https://code.launchpad.net/s-x-emacs-werkstatt/
>
> How is this useful in this case?
>
> Tim
>
>
>
Hi Tim,
you are right. It should be done not inside sort-subr, but on top of it.
BTW, as sort-fields takes whitespace as the field delimiter, there is no way to
get +A treated as two fields. Apart from this limitation, the code below
should provide sorting on multiple fields.
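As a side note before the file itself, the whitespace limitation can be checked with a minimal sketch like the following. It only assumes the stock `sort-fields' from sort.el; the two sample lines are made up for illustration.

```elisp
;; Illustration only: `sort-fields' treats "+B" and "-A" as single
;; fields, so the leading sign dominates the comparison.
(require 'sort)

(with-temp-buffer
  (insert "-A 1\n+B 2\n")
  (sort-fields 1 (point-min) (point-max))
  (buffer-string))
;; => "+B 2\n-A 1\n"
```

Because "+" sorts before "-", the line with B ends up before the line with A; the letter cannot be used as a key on its own.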
;;; sort-multiple-keys.el --- sort multiple fields
;; Author: Andreas Roehler <andreas.roehler@online.de>
;; Keywords: data
;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with this program. If not, see <http://www.gnu.org/licenses/>.
;;; Commentary:
;; Sort lines in region lexicographically by the fields
;; listed in FIELDS. Lines that compare equal in one
;; field are sorted by the next field in the list; any
;; number of fields may be given. Fields are separated
;; by whitespace and numbered from 1 up. A negative
;; field number counts from the right. Called from a
;; program, there are three arguments: BEG, END and
;; FIELDS. BEG and END specify the region to sort. The
;; variable `sort-fold-case' determines whether
;; alphabetic case affects the sort order.
;; Example - assume the lines below, uncommented, at the
;; beginning of a buffer:
;; +C 2 1 x
;; +A 2 2 y
;; +A 1 2 y
;; +A 1 2 z
;; +C 1 1 x
;; +A 4 2 z
;; +A 3 2 y
;; +B 3 3 x
;; +C 2 1 x
;; +B 2 3 z
;; -A 6 1 x
;; -B 1 2 y
;; -A 2 1 x
;; -B 1 3 z
;; sort region hierarchically with first, fourth and second field
;; (sort-multiple-fields 1 126 '(1 4 2))
;; ==>
;; +A 1 2 y
;; +A 2 2 y
;; +A 3 2 y
;; +A 1 2 z
;; +A 4 2 z
;; +B 3 3 x
;; +B 2 3 z
;; +C 1 1 x
;; +C 2 1 x
;; +C 2 1 x
;; -A 2 1 x
;; -A 6 1 x
;; -B 1 2 y
;; -B 1 3 z
;;; Code:
(require 'sort)

(defun sort-multiple-fields (beg end fields)
  "Sort lines between BEG and END by each field in FIELDS in turn.
FIELDS is a list of field numbers as understood by `sort-fields'.
Lines equal in one field are ordered by the next field in the list."
  (interactive "*r\nnSort for field: ")
  (save-excursion
    ;; The interactive spec yields a single number: turn it into a
    ;; list and ask for further fields, kept in the order entered.
    (unless (listp fields)
      (setq fields (list fields))
      (while (yes-or-no-p "Sort another field? ")
        (setq fields (append fields
                             (list (read-number "Sort for field: ")))))
      (message "Sorting for fields %s" (prin1-to-string fields)))
    (sort-multiple-fields-base beg end fields)))

(defun sort-multiple-fields-base (beg end fields)
  "Sort lines between BEG and END by the first of FIELDS.
Runs of lines equal in that field are sorted by the remaining FIELDS."
  (let ((key (car fields)))
    (when key
      (save-restriction
        (narrow-to-region beg end)
        (sort-fields key (point-min) (point-max))
        (when (cdr fields)
          (sort-multiple-fields-intern
           (point-min) (point-max) key (cdr fields)))))))

(defun sort-multiple-fields-field (n)
  "Return field N of the current line as a string, or \"\" if missing.
Fields are separated by spaces or tabs; a negative N counts from
the end of the line, as in `sort-fields'."
  (let ((parts (split-string
                (buffer-substring-no-properties
                 (line-beginning-position) (line-end-position))
                "[ \t]+" t)))
    (or (nth (if (> n 0) (1- n) (+ (length parts) n)) parts) "")))

(defun sort-multiple-fields-intern (beg end &optional last this-fields
                                        fields)
  "Sort each run of lines equal in field LAST by THIS-FIELDS.
BEG and END bound the lines, which must already be sorted by LAST.
FIELDS is unused and only kept for compatibility."
  (let (run-beg erg)
    (goto-char beg)
    (while (< (point) end)
      (setq run-beg (point)
            erg (sort-multiple-fields-field last))
      (forward-line 1)
      ;; Extend the run over all following lines with the same key.
      (while (and (< (point) end)
                  (eq t (compare-strings
                         erg nil nil
                         (sort-multiple-fields-field last) nil nil
                         sort-fold-case)))
        (forward-line 1))
      ;; Sorting reorders lines but keeps the length of the run, so
      ;; END and the end of the run stay valid positions.
      (let ((run-end (point)))
        (when (> (count-lines run-beg run-end) 1)
          (sort-multiple-fields-base run-beg run-end this-fields))
        (goto-char run-end)))))
(provide 'sort-multiple-keys)
;;; sort-multiple-keys.el ends here
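For the tab-separated diff lines from the original question, where the sign has to be a key of its own, a different route may be easier than stacking `sort-fields' calls: compute a compound key per line and sort the lines once. The following is only a sketch under stated assumptions, not part of the file above: all `my-' names are hypothetical, keys are compared as plain strings with `string<' (so no numeric ordering and no `sort-fold-case'), and each line is assumed to start with a one-character sign.

```elisp
;; Sketch only: all `my-' names are hypothetical, not part of Emacs.

(defun my-key-less-p (a b)
  "Return non-nil if key list A sorts before key list B.
Keys are lists of strings, compared element by element."
  (let (result done)
    (while (and a b (not done))
      (cond ((string< (car a) (car b)) (setq result t done t))
            ((string< (car b) (car a)) (setq done t))
            (t (setq a (cdr a) b (cdr b)))))
    result))

(defun my-diff-line-key (line)
  "Return the key (COLUMN-2 COLUMN-1 SIGN) for LINE.
LINE is assumed to be a sign character followed by tab-separated
columns."
  (let ((cols (split-string (substring line 1) "\t")))
    (list (or (nth 1 cols) "") (car cols) (substring line 0 1))))

(defun my-sort-lines-by-key (beg end key-function)
  "Sort the lines between BEG and END by the key KEY-FUNCTION returns.
KEY-FUNCTION is called with a line and returns a list of strings.
Empty lines are dropped and the result always ends in a newline."
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (let* ((lines (split-string (buffer-string) "\n" t))
             (keyed (mapcar (lambda (line)
                              (cons (funcall key-function line) line))
                            lines))
             (sorted (sort keyed
                           (lambda (x y)
                             (my-key-less-p (car x) (car y))))))
        (delete-region (point-min) (point-max))
        (goto-char (point-min))
        (insert (mapconcat #'cdr sorted "\n") "\n")))))
```

With the region around the diff lines, `(my-sort-lines-by-key beg end #'my-diff-line-key)' would order them by the second column, then the first column, then "+" before "-". Since the key is built in Lisp rather than taken from buffer positions, it does not have to be one continuous part of the line.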