From: Tim Landscheidt <tim@tim-landscheidt.de>
To: help-gnu-emacs@gnu.org
Subject: Re: Sorting on compound keys?
Date: Fri, 10 Jun 2011 00:27:37 +0000
Message-ID: <m3mxhqsf7q.fsf@passepartout.tim-landscheidt.de>
In-Reply-To: BANLkTinDQn=m0=8pCJiXMrwu3PbKHf3UnQ@mail.gmail.com
Mark Tilford <ralphmerridew@gmail.com> wrote:
>> sometimes I want to sort unified diffs of CSV files
>> (separated by tabs (here: \t)):
>> | +A 1\t1\tx
>> | +A 1\t2\ty
>> | +B 2\t3\tz
>> | -A 1\t1\tx
>> | -B 2\t2\ty
>> | -B 2\t3\tz
>> by the second column, then the first column, then "+" vs.
>> "-". Unfortunately, it seems that sort-regexp-fields doesn't
>> allow more than one match field as a key. sort-fields
>> doesn't work either as it requires the fields to be
>> surrounded by white space (no "+" vs. "-") and doesn't allow
>> white space inside the fields.
>> Is there any function in vanilla Emacs (23.1.1) that I
>> missed? I looked at pimping sort-regexp-fields, but it seems
>> to me that sort-subr would have to be rewritten from scratch
>> to achieve sorting on compound keys.
> Is there an option to do a stable sort, such as mergesort?
Eureka! Of course! All Emacs sort functions are stable, so
99% of my use cases can be dealt with by multiple calls to
sort-regexp-fields, sorting by the least significant key
first (the only exception being numeric sorting and the like).
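For illustration, a minimal sketch of such a multi-pass sort applied to the
example from the original question. The helper name my-demo-sort-diff, the
record regexp and its group numbering (\1 = sign, \2 = first column, \3 =
second column) are assumptions made for this sketch, not something taken
from the thread:

```elisp
(require 'sort)

;; Hypothetical helper for illustration only.
(defun my-demo-sort-diff (text)
  "Return TEXT sorted by second column, then first column, then sign."
  (with-temp-buffer
    (insert text)
    ;; Assumed record layout: sign, first column, TAB, second column, TAB, rest.
    (let ((record "^\\([-+]\\)\\([^\t\n]*\\)\t\\([^\t\n]*\\)\t.*$"))
      ;; Least significant key first: sign, then first column, then
      ;; second column.  Each later pass leaves ties in their prior order.
      (dolist (key '("\\1" "\\2" "\\3"))
        (sort-regexp-fields nil record key (point-min) (point-max))))
    (buffer-string)))
```

Called on the six sample lines, this should leave the "+" and "-" variants
of each row next to each other, ordered by the second column.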
Unfortunately, those multiple calls can be tedious when
done interactively, so voilà:
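| (defun tl-sort-regexp-fields (reverse record-regexp key-regexp beg end)
|   "Like `sort-regexp-fields', but KEY-REGEXP may list several keys.
| A KEY-REGEXP such as \"\\2-\\3\\1\" sorts by each listed match field in
| turn; a \"-\" before a field reverses the order for that field."
|   (interactive "P\nsRegexp specifying records to sort: \nsRegexp specifying key within record: \nr")
|   (if (string-match "\\`\\(?:-\\\\[1-9]\\|\\(?:-?\\\\[1-9]\\)\\{2,\\}\\)\\'" key-regexp)
|       ;; Compound key: sort once per field, least significant (last)
|       ;; field first, relying on the sort being stable.
|       (let ((i (length key-regexp)))
|         (while (> i 0)
|           (let ((key-reverse (and (> i 2) (= (aref key-regexp (- i 3)) ?-)))
|                 (key (substring key-regexp (- i 2) i)))
|             ;; Poor man's xor of REVERSE and KEY-REVERSE.
|             (sort-regexp-fields (if reverse (not key-reverse) key-reverse)
|                                 record-regexp key beg end)
|             ;; Step back over the "\N" just handled and its "-", if any.
|             (if key-reverse
|                 (setq i (- i 1)))
|             (setq i (- i 2)))))
|     ;; Anything else is passed through unchanged.
|     (sort-regexp-fields reverse record-regexp key-regexp beg end)))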
A key-regexp of "\2\3\1" will yield the region sorted by the
second field, then the third, then the first. A field can
be prefixed with "-" to reverse the sort order for that
field, e.g. "\2-\3\1" will sort by the second field in
ascending order, then the third in descending order, then
the first in ascending order.
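To make the right-to-left walk over the key string concrete, here is a
small sketch that only computes the passes instead of running them. The
helper my-compound-key-passes is hypothetical and not part of the function
above; it assumes the key string has already been validated against the
same "-?\N" pattern:

```elisp
;; Hypothetical helper for illustration only.
(defun my-compound-key-passes (key-spec)
  "Return (KEY . DESCENDING) pairs for KEY-SPEC in the order they are run.
The last key in KEY-SPEC comes first, because it is the least significant."
  (let ((i (length key-spec))
        (passes '()))
    (while (> i 0)
      ;; A "-" directly before the current "\N" marks a descending key.
      (let ((descending (and (> i 2) (= (aref key-spec (- i 3)) ?-))))
        (push (cons (substring key-spec (- i 2) i) descending) passes)
        ;; Skip the two characters of "\N", plus one for the "-" if present.
        (setq i (- i (if descending 3 2)))))
    (nreverse passes)))

(my-compound-key-passes "\\2-\\3\\1")
;; => (("\\1") ("\\3" . t) ("\\2"))
```

So "\2-\3\1" means three passes: first by \1 ascending, then by \3
descending, and finally by \2 ascending, which ends up as the primary key.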
With regard to performance, the region is sorted once for
every key, so it may not be suitable for larger datasets,
but up to a few thousand lines it's fast enough for me. If
someone wants to integrate this into Emacs, please go ahead.
Thanks, also to Andreas,
Tim
P. S.: Is there really no xor in elisp?