all messages for Emacs-related lists mirrored at yhetil.org
* Sorting buffer with string-collate-lessp
@ 2015-05-26 11:53 Rasmus
  2015-05-26 14:49 ` Tassilo Horn
  0 siblings, 1 reply; 15+ messages in thread
From: Rasmus @ 2015-05-26 11:53 UTC (permalink / raw)
  To: help-gnu-emacs

Hi,

How can I easily sort a buffer using string-collate-lessp?

Info: I would like to sort a buffer using string-collate-lessp (line by
line).  Sort-lines is the obvious candidate but it uses string<.  I tried
to write my own sort-lines using sort-subr, as it has a predicate
argument.  However, for buffers, it needs something like
compare-buffer-substrings, which takes no predicate and is in the C-level
and pretty long.

I could write a wrapper that converts each buffer chunk into its
buffer-substring first, and then compares them with string-collate-lessp, I
guess, but that seems like a lot of boilerplate.  So maybe a better
solution exists?

Bonus points if one can specify something as simple and intuitive as
string-collate-lessp in a defcustom.

Thanks,
Rasmus

-- 
History is what should never happen again





* Re: Sorting buffer with string-collate-lessp
  2015-05-26 11:53 Rasmus
@ 2015-05-26 14:49 ` Tassilo Horn
  2015-05-26 14:53   ` Rasmus
  0 siblings, 1 reply; 15+ messages in thread
From: Tassilo Horn @ 2015-05-26 14:49 UTC (permalink / raw)
  To: Rasmus; +Cc: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

Hi Rasmus,

> How can I easily sort a buffer using string-collate-lessp?
>
> Info: I would like to sort a buffer using string-collate-lessp (line
> by line).  Sort-lines is the obvious candidate but it uses string<.  I
> tried to write my own sort-lines using sort-subr, as it has a
> predicate argument.  However, for buffers, it needs something like
> compare-buffer-substrings, which takes no predicate and is in the
> C-level and pretty long.
>
> I could write a wrapper that converts each buffer chunk into its
> buffer-substring first, and then compares them with
> string-collate-lessp, I guess, but that seems like a lot of
> boilerplate.  So maybe a better solution exists?

I think you can use `cl-letf' to temporarily change the definition of
`string<' to `string-collate-lessp', so this should work in theory.

--8<---------------cut here---------------start------------->8---
(cl-letf (((symbol-function 'string<) #'string-collate-lessp))
  (sort-lines nil (point-min) (point-max)))
--8<---------------cut here---------------end--------------->8---

However, I've tried it with (lambda (a b) (string-collate-lessp b a))
which should use `string-collate-lessp' and sort in reverse (note the
switched arguments) and that didn't change anything.

So either my `cl-letf' usage is wrong or `sort-lines' doesn't really use
`string<'.  (Actually, `string<' is an alias to `string-lessp', so I
also tried changing that accordingly, but still no effect...)

Bye,
Tassilo




* Re: Sorting buffer with string-collate-lessp
  2015-05-26 14:49 ` Tassilo Horn
@ 2015-05-26 14:53   ` Rasmus
  2015-05-26 17:57     ` Tassilo Horn
  0 siblings, 1 reply; 15+ messages in thread
From: Rasmus @ 2015-05-26 14:53 UTC (permalink / raw)
  To: help-gnu-emacs

Tassilo Horn <tsdh@gnu.org> writes:

> So either my `cl-letf' usage is wrong or `sort-lines' doesn't really use
> `string<'.  (Actually, `string<' is an alias to `string-lessp', so I
> also tried changing that accordingly, but still no effect...)

I thought sort-lines used compare-buffer-substrings, but this is just based
on the docstring.

It seems the nesting of parentheses is wrong in your example, but I guess
that is just the example.  I did not try using letf to temporarily rebind
string<.

Thanks,
Rasmus

-- 
Send from my Emacs




* Re: Sorting buffer with string-collate-lessp
       [not found] <mailman.3682.1432641217.904.help-gnu-emacs@gnu.org>
@ 2015-05-26 15:37 ` Emanuel Berg
  2015-05-26 15:43   ` Rasmus
  0 siblings, 1 reply; 15+ messages in thread
From: Emanuel Berg @ 2015-05-26 15:37 UTC (permalink / raw)
  To: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

> How can I easily sort a buffer using
> string-collate-lessp?

I don't have string-collate-lessp - where did you get it?

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: Sorting buffer with string-collate-lessp
  2015-05-26 15:37 ` Sorting buffer with string-collate-lessp Emanuel Berg
@ 2015-05-26 15:43   ` Rasmus
  2015-05-26 15:52     ` Rasmus
       [not found]     ` <mailman.3702.1432655548.904.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 15+ messages in thread
From: Rasmus @ 2015-05-26 15:43 UTC (permalink / raw)
  To: help-gnu-emacs

Emanuel Berg <embe8573@student.uu.se> writes:

> I don't have string-collate-lessp - where did you get it?

Emacs-25.

-- 
You people at the NSA are becoming my new best friends!





* Re: Sorting buffer with string-collate-lessp
  2015-05-26 15:43   ` Rasmus
@ 2015-05-26 15:52     ` Rasmus
       [not found]     ` <mailman.3702.1432655548.904.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 15+ messages in thread
From: Rasmus @ 2015-05-26 15:52 UTC (permalink / raw)
  To: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

> Emanuel Berg <embe8573@student.uu.se> writes:
>
>> I don't have string-collate-lessp - where did you get it?
>
> Emacs-25.

That was a bit short.  From the NEWS file of 25.1:

    ** The new functions `string-collate-lessp' and `string-collate-equalp'
    preserve the collation order as defined by the system's locale(1)
    environment.  For the time being this is implemented for modern POSIX
    systems and for MS-Windows, for other systems they fall back to their
    counterparts `string-lessp' and `string-equal'.
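
To make the difference described in that NEWS entry concrete, here is a
small sketch.  The locale name "en_US.UTF-8" is only an assumption for
illustration; what the collating call returns depends on which locales
are installed on the system, so no result is claimed for it.

```elisp
;; Code-point order: ?z is 122 and ?á is 225, so "z" sorts before "á".
(string-lessp "z" "á")                  ; => t

;; Locale order.  The optional third argument names the locale to use;
;; without it the current locale's collation rules apply.
(when (fboundp 'string-collate-lessp)
  (string-collate-lessp "á" "z" "en_US.UTF-8"))
```

The `fboundp' guard keeps the snippet harmless on an Emacs older than 25.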


-- 
It was you, Jezebel, it was you





* Re: Sorting buffer with string-collate-lessp
       [not found]     ` <mailman.3702.1432655548.904.help-gnu-emacs@gnu.org>
@ 2015-05-26 16:06       ` Emanuel Berg
  2015-05-26 16:36         ` Rasmus
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Emanuel Berg @ 2015-05-26 16:06 UTC (permalink / raw)
  To: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

>>> I don't have string-collate-lessp - where did you
>>> get it?
>>
>> Emacs-25.
>
> That was a bit short. From the news file of 25.1.
>
>     ** The new functions `string-collate-lessp' and
> `string-collate-equalp' preserve the collation order
> as defined by the system's locale(1) environment.
> For the time being this is implemented for modern
> POSIX systems and for MS-Windows, for other systems
> they fall back to their counterparts `string-lessp'
> and `string-equal'.

Cool - I didn't know there was a new version.
That will be interesting to see if that breaks any of
my code. But it isn't in the Debian repos so probably
you compiled it yourself, ay?

Anyhow, that function name ends with a "p" so it is
a unary function which returns a boolean, I take it.
So why doesn't it work to do just as you suggested
with `sort-subr' with that as the third optional
argument, namely PREDICATE?

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: Sorting buffer with string-collate-lessp
  2015-05-26 16:06       ` Emanuel Berg
@ 2015-05-26 16:36         ` Rasmus
  2015-05-26 16:40           ` Rasmus
  2015-05-26 18:01         ` Tassilo Horn
       [not found]         ` <mailman.3706.1432658186.904.help-gnu-emacs@gnu.org>
  2 siblings, 1 reply; 15+ messages in thread
From: Rasmus @ 2015-05-26 16:36 UTC (permalink / raw)
  To: help-gnu-emacs

Emanuel Berg <embe8573@student.uu.se> writes:

> Cool - I didn't know there was a new version.
> That will be interesting to see if that breaks any of
> my code. But it isn't in the Debian repos so probably
> you compiled it yourself, ay?

Indeed.  The Emacs wiki suggests that there is a Debian repo with Emacs
snapshots.

> Anyhow, that function name ends with a "p" so it is
> a unary function which returns a boolean, I take it.
> So why doesn't it work to do just as you suggested
> with `sort-subr' with that as the third optional
> argument, namely PREDICATE?

Yeah, but you get buffer positions.  As I hinted in my first post, you can
write some boilerplate around it:

    (with-temp-buffer
      (save-excursion (insert (mapconcat 'symbol-name '(c b a á) "\n")))
      (sort-subr nil 'forward-line 'end-of-line nil nil
                 (lambda (a b) (funcall
                           (if (fboundp 'string-collate-lessp)
                               'string-collate-lessp
                             'string-lessp)
                           (buffer-substring (car a) (cdr a))
                           (buffer-substring (car b) (cdr b)))))
      (buffer-string))

But this seems like a lot of trouble just to get locale-aware sorting...

Rasmus

-- 
Bang bang





* Re: Sorting buffer with string-collate-lessp
  2015-05-26 16:36         ` Rasmus
@ 2015-05-26 16:40           ` Rasmus
  0 siblings, 0 replies; 15+ messages in thread
From: Rasmus @ 2015-05-26 16:40 UTC (permalink / raw)
  To: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

>> Anyhow, that function name ends with a "p" so it is
>> a unary function which returns a boolean, I take it.
>> So why doesn't it work to do just as you suggested
>> with `sort-subr' with that as the third optional
>> argument, namely PREDICATE?
>
> Yeah, but you get buffer positions.  As I hinted in my first post, you can
> write some boilerplate around it:
>
>     (with-temp-buffer
>       (save-excursion (insert (mapconcat 'symbol-name '(c b a á) "\n")))
>       (sort-subr nil 'forward-line 'end-of-line nil nil
>                  (lambda (a b) (funcall
>                            (if (fboundp 'string-collate-lessp)
>                                'string-collate-lessp
>                              'string-lessp)
>                            (buffer-substring (car a) (cdr a))
>                            (buffer-substring (car b) (cdr b)))))
>       (buffer-string))

I should have given this function as well:

     (with-temp-buffer
       (save-excursion (insert (mapconcat 'symbol-name '(c b a á) "\n")))
       (sort-subr nil 'forward-line 'end-of-line nil nil
                  (if (fboundp 'string-collate-lessp)
                      'string-collate-lessp
                    'string-lessp))
       (buffer-string))

It results in an error (on my system).
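
A likely reason for the error, sketched under the assumption that
`sort-subr' behaves as its docstring describes: when STARTKEYFUN is nil,
each key is a cons (START . END) of buffer positions rather than a
string, and the predicate is called on those conses.  The helper name
below is hypothetical, and `string-lessp' stands in for
`string-collate-lessp' so that it also runs on older Emacs versions.

```elisp
(defun my-demo-sort-subr-key-type ()
  "Return the error signaled when a string predicate gets cons keys.
Hypothetical demo helper; returns nil if no error is signaled."
  (with-temp-buffer
    ;; `save-excursion' leaves point at the start of the inserted text.
    (save-excursion (insert "b\na"))
    (condition-case err
        (progn
          (sort-subr nil #'forward-line #'end-of-line nil nil
                     #'string-lessp)
          nil)
      ;; ERR looks like (wrong-type-argument stringp (START . END)).
      (wrong-type-argument err))))
```

Calling `(my-demo-sort-subr-key-type)' should return an error whose
offending value is one of the (START . END) conses.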

Rasmus

-- 
May contains speling mistake





* Re: Sorting buffer with string-collate-lessp
  2015-05-26 14:53   ` Rasmus
@ 2015-05-26 17:57     ` Tassilo Horn
  2015-05-27 11:23       ` Michael Heerdegen
  0 siblings, 1 reply; 15+ messages in thread
From: Tassilo Horn @ 2015-05-26 17:57 UTC (permalink / raw)
  To: Rasmus; +Cc: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

>> So either my `cl-letf' usage is wrong or `sort-lines' doesn't really use
>> `string<'.  (Actually, `string<' is an alias to `string-lessp', so I
>> also tried changing that accordingly, but still no effect...)
>
> I thought sort-lines used compare-buffer-substrings, but this is just based
> on the docstring.

Ah, that's the reason it had no effect. :-)

> It seems the nesting of parentheses is wrong in your example, but I
> guess that is just the example.  I did not try using letf to
> temporarily rebind string<.

No, indeed you have to replace `compare-buffer-substrings'.  This seems
to do the trick although my replacement lambda doesn't really satisfy
the contract of the return value of `compare-buffer-substrings' but it
seems to be good enough for usage in `sort-lines'.

--8<---------------cut here---------------start------------->8---
(cl-letf (((symbol-function 'compare-buffer-substrings)
           (lambda (b1 s1 e1 b2 s2 e2)
             (if (string-collate-lessp (buffer-substring s1 e1)
                                       (buffer-substring s2 e2))
                 -1
               1))))
  (sort-lines nil (point-min) 191))
--8<---------------cut here---------------end--------------->8---

With (point-min) and 191 as the region, evaluating this in *scratch* sorts
the comment lines of the initial scratch message.
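
A replacement that comes somewhat closer to the return-value contract
could look like the sketch below.  The function name is hypothetical, and
it is still not a full stand-in: the magnitude of the result does not
encode the position of the first difference, and nil START/END arguments
and `case-fold-search' are not handled.  Rebinding a C primitive with
`cl-letf' is also fragile and may behave differently in natively
compiled builds.

```elisp
(defun my-collate-compare-buffer-substrings (b1 s1 e1 b2 s2 e2)
  "Compare two buffer substrings by locale collation.
B1 and B2 are buffers, or nil for the current buffer.
Return -1, 0 or 1.  Hypothetical helper, not part of Emacs."
  (let ((a (with-current-buffer (or b1 (current-buffer))
             (buffer-substring-no-properties s1 e1)))
        (b (with-current-buffer (or b2 (current-buffer))
             (buffer-substring-no-properties s2 e2))))
    (cond ((string-collate-lessp a b) -1)   ; first sorts earlier
          ((string-collate-lessp b a) 1)    ; first sorts later
          (t 0))))                          ; neither precedes the other

;; Intended use, mirroring the snippet above (needs cl-lib):
;; (cl-letf (((symbol-function 'compare-buffer-substrings)
;;            #'my-collate-compare-buffer-substrings))
;;   (sort-lines nil (point-min) (point-max)))
```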

Bye,
Tassilo




* Re: Sorting buffer with string-collate-lessp
  2015-05-26 16:06       ` Emanuel Berg
  2015-05-26 16:36         ` Rasmus
@ 2015-05-26 18:01         ` Tassilo Horn
       [not found]         ` <mailman.3706.1432658186.904.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 15+ messages in thread
From: Tassilo Horn @ 2015-05-26 18:01 UTC (permalink / raw)
  To: help-gnu-emacs

Emanuel Berg <embe8573@student.uu.se> writes:

> Anyhow, that function name ends with a "p" so it is
> a unary function which returns a boolean, I take it.

No, the "p" just stands for predicate which only implies that it returns
a generalized boolean.  It still might have arbitrary arity, e.g.,
`buffer-narrowed-p' has zero arguments, `consp' has one argument,
`string-lessp' has two arguments, etc.
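
The arities can be checked directly.  This sketch assumes an Emacs new
enough to provide `func-arity' (Emacs 26 or later, as far as I know),
which returns a cons (MIN . MAX) of argument counts.

```elisp
;; Predicates with different arities, as reported by `func-arity'.
(func-arity 'buffer-narrowed-p)
(func-arity 'consp)                     ; => (1 . 1)
(func-arity 'string-lessp)              ; => (2 . 2)

;; Two required strings plus optional LOCALE and IGNORE-CASE.
(when (fboundp 'string-collate-lessp)
  (func-arity 'string-collate-lessp))   ; => (2 . 4)
```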

Bye,
Tassilo




* Re: Sorting buffer with string-collate-lessp
       [not found]         ` <mailman.3706.1432658186.904.help-gnu-emacs@gnu.org>
@ 2015-05-26 22:44           ` Emanuel Berg
  2015-05-27  0:12             ` Rasmus
       [not found]             ` <mailman.3732.1432685593.904.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 15+ messages in thread
From: Emanuel Berg @ 2015-05-26 22:44 UTC (permalink / raw)
  To: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

>> Anyhow, that function name ends with a "p" so it is
>> a unary function which returns a boolean, I take
>> it. So why doesn't it work to do just as you
>> suggested with `sort-subr' with that as the third
>> optional argument, namely PREDICATE?
>
> Yeah, but you get buffer positions.  As I hinted in my first post, you can
> write some boilerplate around it:
>
>     (with-temp-buffer
>       (save-excursion (insert (mapconcat 'symbol-name '(c b a á) "\n")))
>       (sort-subr nil 'forward-line 'end-of-line nil nil
>                  (lambda (a b) (funcall
>                            (if (fboundp 'string-collate-lessp)
>                                'string-collate-lessp
>                              'string-lessp)
>                            (buffer-substring (car a) (cdr a))
>                            (buffer-substring (car b) (cdr b)))))
>       (buffer-string))
>
> But this seems like a lot of trouble just to get
> locale-aware sorting...

"Locale aware" stuff is always trouble which is why
I don't care for it but eat, sleep and dream an
assimilated Anglo-American when it comes to computing,
and with no shame or blame attached - that said,
I don't think ten lines of Elisp is "a lot of
trouble".

That code looks a bit complicated - perhaps some
things can be extracted, and in particular the
predicate?

But I don't see anything reoccurring so if it works
why not just put it in some init file and move on?

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: Sorting buffer with string-collate-lessp
  2015-05-26 22:44           ` Emanuel Berg
@ 2015-05-27  0:12             ` Rasmus
       [not found]             ` <mailman.3732.1432685593.904.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 15+ messages in thread
From: Rasmus @ 2015-05-27  0:12 UTC (permalink / raw)
  To: help-gnu-emacs

Emanuel Berg <embe8573@student.uu.se> writes:

> "Locale aware" stuff is always trouble which is why
> I don't care for it but eat, sleep and dream an
> assimilated Anglo-American when it comes to computing,
> and with no shame or blame attached

Maybe you'd care if your name was Émanuel given the fact that
(string< "É" "å") is non-nil.

Diacritics are used in English as well.

> But I don't see anything reoccurring so if it works
> why not just put it in some init file and move on?

It's for a package.  It seems bewildering that it's so "hard" to sort a
buffer in this way.

—Rasmus

-- 
Need more coffee. . .





* Re: Sorting buffer with string-collate-lessp
       [not found]             ` <mailman.3732.1432685593.904.help-gnu-emacs@gnu.org>
@ 2015-05-27  0:25               ` Emanuel Berg
  0 siblings, 0 replies; 15+ messages in thread
From: Emanuel Berg @ 2015-05-27  0:25 UTC (permalink / raw)
  To: help-gnu-emacs

Rasmus <rasmus@gmx.us> writes:

>> "Locale aware" stuff is always trouble which is why
>> I don't care for it but eat, sleep and dream an
>> assimilated Anglo-American when it comes to
>> computing, and with no shame or blame attached
>
> Maybe you'd care if your name was Émanuel given the
> fact that (string< "É" "å") is non-nil.

It is only impractical to bother with all that.
You can call me Manny if that feels better.

> Diacritics are used in English as well.

Not in Computer English.

>> But I don't see anything reoccurring so if it works
>> why not just put it in some init file and move on?
>
> It's for a package. It seems bewildering that it's so
> "hard" to sort a buffer in this way.

OK, put it in a package then. That's beside the point.
I don't understand the problem. Did you solve it but
you are unhappy with the solution? Why? What's wrong
with it?

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: Sorting buffer with string-collate-lessp
  2015-05-26 17:57     ` Tassilo Horn
@ 2015-05-27 11:23       ` Michael Heerdegen
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Heerdegen @ 2015-05-27 11:23 UTC (permalink / raw)
  To: help-gnu-emacs

Tassilo Horn <tsdh@gnu.org> writes:

> (cl-letf (((symbol-function 'compare-buffer-substrings)
>            (lambda (b1 s1 e1 b2 s2 e2)
>              (if (string-collate-lessp (buffer-substring s1 e1)
>                                        (buffer-substring s2 e2))
>                  -1
>                1))))
>   (sort-lines nil (point-min) 191))


Alternatively one could call `sort-subr' directly to avoid `cl-letf':

--8<---------------cut here---------------start------------->8---
(sort-subr
 nil
 #'forward-line #'end-of-line
 (lambda () (buffer-substring-no-properties
        (point) (line-end-position)))
 nil
 #'string-collate-lessp)
--8<---------------cut here---------------end--------------->8---
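
Building on the `sort-subr' call above, the defcustom asked for in the
original post might be sketched as follows.  The names
`my-sort-lines-predicate' and `my-sort-lines-collate' are hypothetical,
and the narrowing and `inhibit-field-text-motion' binding simply imitate
what `sort-lines' does; treat it as a starting point, not a finished
command.

```elisp
(defcustom my-sort-lines-predicate
  (if (fboundp 'string-collate-lessp)
      #'string-collate-lessp
    #'string-lessp)
  "Function of two strings used to order lines.
Hypothetical option; falls back to `string-lessp' before Emacs 25."
  :type 'function
  :group 'convenience)

(defun my-sort-lines-collate (beg end)
  "Sort lines between BEG and END with `my-sort-lines-predicate'."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (let ((inhibit-field-text-motion t))
        (sort-subr nil #'forward-line #'end-of-line
                   ;; Return each line as a string so that the
                   ;; predicate receives strings, not position pairs.
                   (lambda ()
                     (buffer-substring-no-properties
                      (point) (line-end-position)))
                   nil
                   my-sort-lines-predicate)))))
```

With a region active, M-x my-sort-lines-collate sorts it; a package can
let-bind the option around the call to choose another predicate.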


Michael.



