From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Politz
Newsgroups: gmane.emacs.bugs
Subject: bug#14776: 24.3.50; [PATCH] parse-time-string performance
Date: Thu, 04 Jul 2013 23:08:33 +0200
Message-ID: <87ppuyhw32.fsf@hochschule-trier.de>
References: <87sizwxwu2.fsf@hochschule-trier.de> <87li5m1geu.fsf@hochschule-trier.de> <87k3l6jedt.fsf@hochschule-trier.de>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: 14776@debbugs.gnu.org
To: Lars Magne Ingebrigtsen
In-Reply-To: (Lars Magne Ingebrigtsen's message of "Thu, 04 Jul 2013 22:31:35 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
X-GNU-PR-Message: followup 14776
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:75917

--=-=-=
Content-Type: text/plain

Lars Magne Ingebrigtsen writes:

> It's not much of a bottleneck, no:
>
> (benchmark-elapse (dotimes (i 10000) (parse-time-string "Thu, 04 Jul 2013 20:06:00 +0200")))
> => 1.120856647

=> 0.215108395 ;-O

But only with this:

--=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: inline; filename=parse-time-rfc2822.el
Content-Transfer-Encoding: quoted-printable

;; `parse-time-weekdays' and `parse-time-months' are defined here.
(require 'parse-time)

(defconst parse-time-rfc2822-regex
  (let* ((fws "[ \t\r\n]+")
         (ofws "[ \t\r\n]*")
         (day-of-week (regexp-opt (mapcar 'car parse-time-weekdays)))
         (day "[0-9]\\{1,2\\}")
         (month (regexp-opt (mapcar 'car parse-time-months)))
         (year "[0-9]\\{4,\\}")
         (hour "[0-9]\\{2\\}")
         (minute hour)
         (second hour)
         (zone-hour (concat "[+-]" hour))
         (zone-min hour)
         ;; A rather non strict comment.
         (cfws "(\\(?:.\\|\n\\)*)"))
    (concat
     (format "\\`%s\\(?:\\(%s\\),%s\\)?" ofws day-of-week ofws)
     (format "\\(%s\\)%s\\(%s\\)%s\\(%s\\)%s" day fws month fws year fws)
     (format "\\(%s\\):\\(%s\\)\\(?::\\(%s\\)\\)?%s" hour minute second fws)
     (format "\\(%s\\)\\(%s\\)" zone-hour zone-min)
     (format "%s\\(?:%s\\)?%s\\'" ofws cfws ofws)))
  "A regex matching a strict rfc2822 date.")

(defun parse-time-string-rfc2822 (string)
  "Parse the strict rfc2822 time-string STRING.

Return either a list (SEC MIN HOUR DAY MON YEAR DOW DST TZ) or
nil, if STRING is not valid."
  ;; The keys of `parse-time-weekdays' and `parse-time-months' are
  ;; lower-case, so downcase STRING before matching and looking up.
  (let ((string (downcase string)))
    (when (string-match parse-time-rfc2822-regex string)
      (let ((dow (cdr (assoc (match-string 1 string) parse-time-weekdays)))
            (day (string-to-number (match-string 2 string)))
            (month (cdr (assoc (match-string 3 string) parse-time-months)))
            (year (string-to-number (match-string 4 string)))
            (hour (string-to-number (match-string 5 string)))
            (min (string-to-number (match-string 6 string)))
            ;; Seconds are optional and default to zero.
            (sec (string-to-number (or (match-string 7 string) "0")))
            ;; Group 8 is the signed zone hour.  Apply its sign to the
            ;; minutes as well, so that e.g. "-0530" comes out right.
            (zsign (if (eq (aref (match-string 8 string) 0) ?-) -1 1))
            (zhour (abs (string-to-number (match-string 8 string))))
            (zmin (string-to-number (match-string 9 string))))
        (list sec min hour day month year dow nil
              (* zsign (+ (* 3600 zhour) (* 60 zmin))))))))

--=-=-=
Content-Type: text/plain

> But sorting a summary buffer of 5K messages on Date (which some people
> do) might get a performance boost.  But I was thinking that it might be
> more likely that parse-datetime parses more date strings correctly than
> the version in parse-time.el.

It looks that way, i.e. parse-time-string is pretty simple compared to
that.  But most Date headers I've seen popping up in my mail seem to
adhere to a strict rfc2822 format anyway, except for the occasional
non-strict timezone.

-ap

--=-=-=--