From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 19:54:54 +0300
Message-ID: <83tu6cdt7l.fsf@gnu.org>
References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru>
To: Dmitry Gutov, Stefan Monnier
Cc: 57245@debbugs.gnu.org

> Date: Tue, 16 Aug 2022 17:33:58 +0300
> From: Dmitry Gutov
>
> Prerequisite: Have an XML file that is 20 MB in size, and doesn't have
> long lines.
>
> Or follow steps 1-3 to create one.
>
> 1. wget -O large-file.xml
>    https://updates.drupal.org/release-history/drupal/current
> 2. M-% /> RET ^J/> RET (to break up the long line into smaller pieces)
> 3. Select the contents of the file and copy them over and over for 99
>    times. Alternatively, copy them 9 times, then select the result, and
>    copy it 9 times as well. Save the buffer.
>
>    (To try to keep XML valid -- not sure if necessary -- you can only
>    perform the copying operation on the contents of the tag. But
>    that's probably not important. I did that, though.)
>
> 4. Kill the buffer and re-visit it again. Press M->.
> 5. Note the delay.
>
> Here's the profiler output:
>
>         1397  95% - command-execute
>         1397  95%  - call-interactively
>         1338  91%   - funcall-interactively
>         1331  90%    - end-of-buffer
>         1327  90%     - recenter
>         1327  90%      - jit-lock-function
>         1327  90%       - jit-lock-fontify-now
>         1327  90%        - jit-lock--run-functions
>         1327  90%         - run-hook-wrapped
>         1327  90%          - # >
>         1327  90%           - font-lock-fontify-region
>         1327  90%            - font-lock-default-fontify-region
>         1327  90%             - nxml-extend-region
>          845  57%              - skip-syntax-forward
>          845  57%               - internal--syntax-propertize
>          845  57%                - syntax-propertize
>          845  57%                 - nxml-syntax-propertize
>          845  57%                  - sgml-syntax-propertize
>          842  57%                   - # >
>          479  32%                      sgml--syntax-propertize-ppss
>            3   0%                     syntax-ppss
>          482  32%              - nxml-move-outside-backwards
>          482  32%               - nxml-inside-start
>          482  32%                  syntax-ppss
>            7   0%    + execute-extended-command
>           59   4%   + byte-code
>           59   4% + ...
>           10   0% + timer-event-handler

Thanks.  It looks like some problem in nXML mode or in syntax.c or in
how nXML uses the syntax stuff.  Maybe the code there is simply not
scalable.

Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
here?

What the above profile doesn't show is that this code creates tons of
garbage, so GC is called a lot, and adds its share of slowdown.
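
For anyone who wants to try the quoted recipe without downloading the Drupal feed, here is a minimal Python sketch that writes a synthetic XML file of about the same size with short lines only. The function name `write_large_xml` and the element names are made up for illustration and do not come from the Drupal data, so whether such a file shows the same slowdown is an assumption that would need checking.

```python
def write_large_xml(path, target_bytes=20 * 1024 * 1024):
    """Write a well-formed XML file of at least target_bytes, short lines only.

    The element names are invented for illustration; they are not taken
    from the Drupal release-history feed mentioned in the report.
    Returns the number of bytes written.
    """
    written = 0
    index = 0
    # newline="\n" disables newline translation, and all content is ASCII,
    # so the character counts returned by write() equal byte counts.
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        written += out.write('<?xml version="1.0" encoding="utf-8"?>\n')
        written += out.write("<project>\n<releases>\n")
        while written < target_bytes:
            record = (
                "<release>\n"
                f"<name>example {index}</name>\n"
                f"<version>1.{index}</version>\n"
                "<status>published</status>\n"
                "</release>\n"
            )
            written += out.write(record)
            index += 1
        written += out.write("</releases>\n</project>\n")
    return written
```

Calling `write_large_xml("large-file.xml")` should produce a file slightly over 20 MB; visiting it in Emacs and pressing M-> then corresponds to steps 4 and 5 of the recipe.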