From: Joost Kremers
Newsgroups: gmane.emacs.devel
Subject: Re: `format` slows down my function even though it shouldn't be called at all...
Date: Sun, 15 Dec 2024 23:31:20 +0100
Message-ID: <864j34wpc7.fsf@fastmail.fm>
References: <86r068wrqk.fsf@fastmail.fm>
In-Reply-To: <86r068wrqk.fsf@fastmail.fm> (Joost Kremers's message of "Sun, 15 Dec 2024 22:39:31 +0100")
User-Agent: mu4e 1.12.7; emacs 29.4
To: Emacs development discussions.

I did it again... Right after posting my message, I found the cause of the
problem. Sigh... Please ignore.

(For anyone wondering: the parser actually signals many errors, but most of
them are caught using `condition-case`, so `format` really is called each
time, even though no error is ever reported...)


On Sun, Dec 15 2024, Joost Kremers wrote:
> Hi,
>
> I have a library for parsing `.bib` files[1] and today a user reported an
> issue[2] that led me to a weird discovery. Basically, a call to `format`
> that is never even executed slows down execution tremendously.
>
> The reader part of the parser consists of a couple of functions that all
> have the following structure:
>
> ```
> (defun parsebib--chars (chars &optional noerror)
>   "Read the character at point.
> CHARS is a list of characters.  If the character at point matches
> a character in CHARS, return it and move point, otherwise signal
> an error, unless NOERROR is non-nil, in which case return nil."
>   (parsebib--skip-whitespace)
>   (if (memq (char-after) chars)
>       (prog1
>           (char-after)
>         (forward-char 1))
>     (unless noerror
>       (signal 'parsebib-error (list (format "Expected one of %S, got %c at position %d,%d"
>                                             chars
>                                             (following-char)
>                                             (line-number-at-pos) (current-column)))))))
> ```
>
> In short, after skipping whitespace, the reader functions try to read some
> element (character, keyword, etc. depending on the function), return it if
> it's found and signal an error if it's not found.
>
> The call to `signal` contains a call to `format` to provide a useful error
> message. It's this `format` that slows down parsing, even though no errors
> are ever signalled.
>
> This is an excerpt from a profiler report:
>
> ============================================================
>   17758  81%   - parsebib--chars
>   17755  80%    - if
>   17749  80%     - if
>   17737  80%      - signal
>   17737  80%       - list
>   17515  79%          format
> ============================================================
>
> This is from parsing a 28MB .bib file: The 79% of processing time spent in
> `format` seems very weird to me, given that the file contains no errors and
> no error is ever signalled.
>
> After removing the `format` calls, replacing them with a simple string, the
> parser runs much, much, much faster. To give an idea of the speed increase:
> without `format`, the 28MB .bib file I mentioned above is parsed in 1-2
> seconds (on my machine); with `format`, I don't even have enough patience
> to wait for parsing to finish... (I let it run for at least 20-30 seconds
> before interrupting it.)
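
The effect described above can be reproduced in isolation. What follows is
only a sketch: the `demo-` names are hypothetical and this is not parsebib's
actual code. It assumes a reader that signals on every failed match and a
caller that swallows the error with `condition-case`, and it counts how often
the error message is built even though no error is ever reported.

```elisp
;; Hypothetical names throughout (`demo-...'); not parsebib's code.

(define-error 'demo-parse-error "Demo parse error")

(defvar demo-format-calls 0
  "Number of times the error message was actually built.")

(defun demo-expect-char (chars)
  "Return the char at point if it is in CHARS and move past it, else signal."
  (if (memq (char-after) chars)
      (prog1 (char-after)
        (forward-char 1))
    ;; The arguments of `signal' are evaluated before the error is
    ;; raised, so `format' runs on every failed match, caught or not.
    (setq demo-format-calls (1+ demo-format-calls))
    (signal 'demo-parse-error
            (list (format "Expected one of %S at position %d,%d"
                          chars (line-number-at-pos) (current-column))))))

(defun demo-try-char (chars)
  "Like `demo-expect-char', but return nil instead of signalling."
  (condition-case nil
      (demo-expect-char chars)
    (demo-parse-error nil)))

(defun demo-run ()
  "Probe 1000 non-matching characters; return the number of `format' calls."
  (setq demo-format-calls 0)
  (with-temp-buffer
    (insert (make-string 1000 ?x))
    (goto-char (point-min))
    (while (not (eobp))
      ;; No error ever escapes `demo-try-char'; on a failed match we
      ;; simply step over the character and try again.
      (unless (demo-try-char '(?a ?b))
        (forward-char 1))))
  demo-format-calls)
```

Evaluating `(demo-run)` counts one `format` call for each of the 1000
non-matching characters, although no error ever reaches the caller.
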
>
> Anyone know what's going on here? Am I missing something, or could this be
> a bug in Emacs? (I'm running Emacs 29.4, BTW).
>
> TIA
>
> Joost
>
>
>
> Footnotes:
> [1]  At https://github.com/joostkremers/parsebib
>
> [2]  https://github.com/joostkremers/parsebib/issues/34

-- 
Joost Kremers
Life has its moments