unofficial mirror of bug-gnu-emacs@gnu.org 
From: Aris Spathis <agspathis@gmail.com>
To: mattias.engdegard@gmail.com
Cc: 69709@debbugs.gnu.org
Subject: bug#69709: `sort` interface improvement and universal ordering predicate
Date: Sun, 14 Apr 2024 17:03:11 +0300
Message-ID: <CAPnC22aT2+uwxByLnr4g78+wWzZ7mWgpDPLRns2rHNseU3EcRg@mail.gmail.com>
In-Reply-To: <D3DD6F8C-55EA-4602-8A8C-3CDD6252CBCB@gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 1170 bytes --]

Thank you for your excellent work on `sort` and related functionality!

Unfortunately, the new `sort` implementation occasionally crashes with a
segfault. The following code reproduces that in current master:

(dotimes (i 500)
  (sort (make-list 128 42)
        :key (lambda (n) (make-list i n))))

It happens for inputs of length >= `MERGESTATE_TEMP_SIZE / 2` (= 128
currently) combined with a non-nil `:key` function. In such cases, a
`Lisp_Object` array is explicitly heap-allocated to store the keys, and it
is never marked during GC. This would not be a problem if not for the fact
that the `:key` function call may itself trigger a GC, at which point the
keys computed so far are not protected.

I'm attaching a patch with a proposed solution, consisting of three changes:

1. Allocate with `xzalloc` to have the new array initialized to `Qnil`. This
   ensures its objects can be marked properly.

2. Mark `ms->allocated_keys` in `merge_markmem`. We don't need to check that
   `ms->allocated_keys != NULL`, as `merge_register_cleanup` is only called
   under this exact condition.

3. Move the computation of keys (which may trigger a GC) after `merge_init`,
   which registers `merge_markmem`.
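
To make the interaction of the three changes concrete, here is a minimal
sketch using a toy mark-and-sweep collector. It is not Emacs code: every
name in it (`toy_obj`, `toy_gc`, `toy_alloc`, `compute_keys`, `HEAP_SIZE`)
is hypothetical, and the collector runs on every allocation purely to force
the worst case. `root_array` stands in for the array that `merge_markmem`
would mark.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only; none of these names exist in Emacs.  */
enum { HEAP_SIZE = 8 };

typedef struct { bool in_use, marked; int value; } toy_obj;

static toy_obj heap[HEAP_SIZE];
static toy_obj **root_array;	/* array registered for marking, or NULL */
static size_t root_len;

/* Mark every object reachable from the registered array, then free
   the rest.  A NULL slot plays the role of Qnil: valid, and skipped.  */
static void
toy_gc (void)
{
  for (size_t i = 0; i < root_len; i++)
    if (root_array[i] != NULL)
      root_array[i]->marked = true;
  for (size_t i = 0; i < HEAP_SIZE; i++)
    {
      if (!heap[i].marked)
	heap[i].in_use = false;
      heap[i].marked = false;
    }
}

/* Allocate an object.  Like a Lisp key function, this may trigger a
   GC; here it always does.  */
static toy_obj *
toy_alloc (int value)
{
  toy_gc ();
  for (size_t i = 0; i < HEAP_SIZE; i++)
    if (!heap[i].in_use)
      {
	heap[i].in_use = true;
	heap[i].value = value;
	return &heap[i];
      }
  abort ();			/* toy heap exhausted */
}

/* Compute N keys (N <= HEAP_SIZE) and return how many of them still
   refer to their own live object afterwards.  */
static size_t
compute_keys (size_t n, bool register_first)
{
  memset (heap, 0, sizeof heap);

  /* Change 1: zero-initialise, so that every slot is markable before
     it has been filled.  This assumes NULL is all-bits-zero, much as
     the patch verifies NIL_IS_ZERO.  */
  toy_obj **keys = calloc (n, sizeof *keys);
  if (keys == NULL)
    abort ();

  /* Changes 2 and 3: make the array visible to the collector before
     the first key is computed.  */
  if (register_first)
    {
      root_array = keys;
      root_len = n;
    }

  for (size_t i = 0; i < n; i++)
    keys[i] = toy_alloc ((int) i);

  size_t live = 0;
  for (size_t i = 0; i < n; i++)
    if (keys[i]->in_use && keys[i]->value == (int) i)
      live++;

  root_array = NULL;
  root_len = 0;
  free (keys);
  return live;
}
```

In this toy model, `compute_keys (4, true)` keeps all four keys alive,
whereas `compute_keys (4, false)` lets the collector reclaim each key as
soon as the next one is allocated, so earlier slots end up pointing at a
reused object. In real Emacs the corresponding failure is a dangling
`Lisp_Object`, which is what leads to the segfault.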

I hope this is useful.

Cheers,
Aris


[-- Attachment #2: sort-segfault-fix.patch --]
[-- Type: text/x-patch, Size: 1664 bytes --]

diff --git a/src/sort.c b/src/sort.c
index 527d5550342..562f88ede3c 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -532,6 +532,8 @@ merge_markmem (void *arg)
   merge_state *ms = arg;
   eassume (ms != NULL);
 
+  mark_objects (ms->allocated_keys, ms->listlen);
+
   if (ms->reloc.size != NULL && *ms->reloc.size > 0)
     {
       Lisp_Object *src = (ms->reloc.src->values
@@ -1107,21 +1109,29 @@ tim_sort (Lisp_Object predicate, Lisp_Object keyfunc,
       if (length < MERGESTATE_TEMP_SIZE / 2)
 	keys = &ms.temparray[length + 1];
       else
-	keys = allocated_keys = xmalloc (length * word_size);
-
-      for (ptrdiff_t i = 0; i < length; i++)
-	keys[i] = call1 (keyfunc, seq[i]);
+	{
+	  /* Allocate with xzalloc to obtain an array of valid
+	     Lisp_Objects (Qnils), so that they can be marked. */
+	  verify (NIL_IS_ZERO);
+	  keys = allocated_keys = xzalloc (length * word_size);
+	}
 
       lo.keys = keys;
       lo.values = seq;
     }
 
+  merge_init (&ms, length, allocated_keys, &lo, predicate);
+
+  /* Compute keys after merge_markmem has been registered by merge_init
+     (any call to keyfunc might trigger a GC). */
+  if (!NILP (keyfunc))
+    for (ptrdiff_t i = 0; i < length; i++)
+      keys[i] = call1 (keyfunc, seq[i]);
+
   /* FIXME: This is where we would check the keys for interesting
      properties for more optimised comparison (such as all being fixnums
      etc).  */
 
-  merge_init (&ms, length, allocated_keys, &lo, predicate);
-
   /* March over the array once, left to right, finding natural runs,
      and extending short natural runs to minrun elements.  */
   const ptrdiff_t minrun = merge_compute_minrun (length);
