Tags: patch

Hi,

When I use 'equal' on large, cyclic objects, emacs freezes.

Reproduce:

    ;; emacs -Q
    (require 'org-element)

    ;; get two identical `org-element' trees of this file
    ;; https://github.com/emacs-mirror/emacs/blob/master/doc/misc/modus-themes.org
    (let ((file (expand-file-name "doc/misc/modus-themes.org"
                                  source-directory)))
      (with-current-buffer (find-file-noselect file)
        (setq org-o1 (org-element-parse-buffer))
        (org-element-cache-reset)
        (setq org-o2 (org-element-parse-buffer))))

    ;; the test
    (equal org-o1 org-o2)

This equal call freezes emacs. And the memory usage of emacs grows
quickly, until I quit. Reproduced with emacs 29.4 and the master branch.


The cause:

Current code on the master branch:

    static bool
    internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
                    int depth, Lisp_Object ht)
    {
     tail_recurse:
      if (depth > 10)
        {
          // ...
          if (NILP (ht))
            ht = CALLN (Fmake_hash_table, QCtest, Qeq);
          // ...
        }
      // ...
            if (! internal_equal (XCAR (o1), XCAR (o2),
                                  equal_kind, depth + 1, ht))
              return false;
            // ...
    }

When internal_equal calls itself, it passes the hash table argument 'ht'
by value. As a result, each internal_equal(depth=11) call initializes
its own hash table, separately recording the objects it encounters in
further recursive calls. This leads to excessive 'hash_put' and
significant memory allocation.


Fix:

Make internal_equal pass the hash table by pointer (see the patch).

With this patch, the test above returns quickly:

    (benchmark-run 100 (equal org-o1 org-o2))

(2.562444 20 0.30933900000000014)

This will also improve the performance for smaller cyclic objects:

    ;; emacs -Q
    (defun make-cyclic (n m)
      (let* ((l1 (make-list n 1))
             (l2 (make-list m 2)))
        (setf (nth (1- m) l2) l1)
        (setf (nth (1- n) l1) l2)
        l1))

    (let ((a (make-cyclic 100 7))
          (b (make-cyclic 100 7)))
      (cl-assert (equal a b))
      (benchmark-run 10000 (equal a b)))

Current:
(0.530081 32 0.49294199999999977)

With the patch:
(0.036401 1 0.0173350000000001)

And it passes all the tests in 'make fns-tests'.


P.S. Could I get the copyright assignment paperwork please?

Best,
Ethan Kong


In GNU Emacs 29.4 (build 2, x86_64-w64-mingw32) of 2024-07-05 built on
 AVALON
Windowing system distributor 'Microsoft Corp.', version 10.0.22631
System Description: Microsoft Windows 10 Pro (v10.0.2009.22631.4317)

Configured using:
 'configure --with-modules --without-dbus --with-native-compilation=aot
 --without-compress-install --with-sqlite3 --with-tree-sitter
 CFLAGS=-O2'