From: Joost Kremers
Newsgroups: gmane.emacs.devel
Subject: `format` slows down my function even though it shouldn't be called at all...
Date: Sun, 15 Dec 2024 22:39:31 +0100
Message-ID: <86r068wrqk.fsf@fastmail.fm>
User-Agent: mu4e 1.12.7; emacs 29.4
To: Emacs development discussions.
Xref: news.gmane.io gmane.emacs.devel:326536

Hi,

I have a library for parsing `.bib` files[1] and today a user reported an
issue[2] that led me to a weird discovery. Basically, a call to `format`
that is never even executed slows down execution tremendously.

The reader part of the parser consists of a couple of functions that all
have the following structure:

```
(defun parsebib--chars (chars &optional noerror)
  "Read the character at point.
CHARS is a list of characters.  If the character at point matches
a character in CHARS, return it and move point, otherwise signal
an error, unless NOERROR is non-nil, in which case return nil."
  (parsebib--skip-whitespace)
  (if (memq (char-after) chars)
      (prog1
          (char-after)
        (forward-char 1))
    (unless noerror
      (signal 'parsebib-error
              (list (format "Expected one of %S, got %c at position %d,%d"
                            chars
                            (following-char)
                            (line-number-at-pos)
                            (current-column)))))))
```

In short, after skipping whitespace, the reader functions try to read some
element (character, keyword, etc. depending on the function), return it if
it's found and signal an error if it's not found.

The call to `signal` contains a call to `format` to provide a useful error
message. It's this `format` that slows down parsing, even though no errors
are ever signalled. This is an excerpt from a profiler report:

============================================================
       17758  81%  - parsebib--chars
       17755  80%   - if
       17749  80%    - if
       17737  80%     - signal
       17737  80%      - list
       17515  79%         format
============================================================

This is from parsing a 28MB .bib file. The 79% of processing time spent in
`format` seems very weird to me, given that the file contains no errors and
no error is ever signalled.

After removing the `format` calls, replacing them with a simple string, the
parser runs much, much, much faster. To give an idea of the speed increase:
without `format`, the 28MB .bib file I mentioned above is parsed in 1-2
seconds (on my machine); with `format`, I don't even have enough patience
to wait for parsing to finish... (I let it run for at least 20-30 seconds
before interrupting it.)

Anyone know what's going on here? Am I missing something, or could this be
a bug in Emacs? (I'm running Emacs 29.4, BTW).

TIA

Joost

Footnotes:
[1] At https://github.com/joostkremers/parsebib
[2] https://github.com/joostkremers/parsebib/issues/34

-- 
Joost Kremers
Life has its moments