From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#2790: emacs 22.1.1 cannot open 5GB file on 64GB 64-bit GNU/Linux box
Date: Sun, 29 Mar 2009 16:10:26 -0400
Reply-To: Stefan Monnier, 2790@emacsbugs.donarmstrong.com
To: Eli Zaretskii
Cc: 2790@emacsbugs.donarmstrong.com, tutufan@gmail.com
In-Reply-To: (Eli Zaretskii's message of "Sat, 28 Mar 2009 11:45:35 +0300")

> The patch below does this:
>> -	  || st.st_size > INT_MAX / 4)
>> +	  /* Actually, it should test either INT_MAX or LONG_MAX
>> +	     depending on which one is used for EMACS_INT.  But in
>> +	     any case, in practice, this test is redundant with the
>> +	     one above.
>> +	     || st.st_size > INT_MAX / 4 */)
>>  	error ("Maximum buffer size exceeded");

> But what about the commentary immediately preceding the modified code:

>     The calculations below double the file size twice, so check that it
>     can be multiplied by 4 safely.

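For illustration, the concern in the quoted comment (that the file size
can be multiplied by 4 safely) can be sketched as a standalone C check.
This is only a sketch under assumptions: emacs_int_sketch and
EMACS_INT_SKETCH_MAX are hypothetical stand-ins for EMACS_INT and its
maximum, which depend on the build, and fits_when_quadrupled is not a
function in Emacs.  It compares against MAX / 4 rather than multiplying,
since signed overflow is undefined behavior in C.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for EMACS_INT; the real type and its maximum
   depend on how Emacs was built.  */
typedef long emacs_int_sketch;
#define EMACS_INT_SKETCH_MAX LONG_MAX

/* Return true if SIZE is non-negative and SIZE * 4 would still fit in
   emacs_int_sketch.  The comparison against MAX / 4 never performs the
   multiplication, so it cannot overflow.  */
bool
fits_when_quadrupled (long long size)
{
  return size >= 0 && size <= EMACS_INT_SKETCH_MAX / 4;
}
```

With a 64-bit long, a 5GB size passes this check; with a 32-bit long it
does not, which is the INT_MAX versus LONG_MAX distinction under
discussion.
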
The patch also adds a comment explaining that this test is actually
redundant in practice (and it will stay redundant as long as our Lisp
integers have at least 2 bits of tag).

> I'm not sure to which calculations it alludes, but if you think they
> are no longer relevant, please remove that part of the comment,
> otherwise we will wonder in a couple of years why the code does not do
> what the comment says it should.

Since I'm not sure either, I kept the comment and added another one
explaining why I removed the check anyway.

> Personally, I would change INT_MAX/4 to LONG_MAX/4, because that does
> TRT on all supported platforms, 32-bit and 64-bit alike (long and int
> are both 32-bit wide on 32-bit machines).  That would avoid too
> radical changes during a pretest, which is a Good Thing, IMO.

In that case I'd rather do the check more directly, e.g.:

   (((EMACS_INT)st.st_size)*4)/4 == st.st_size

But as explained, I'm not convinced the check is needed/useful.

>> Note also that when you open large files, it's worthwhile to use
>> find-file-literally to be sure it's opened in unibyte mode;
>> otherwise it gets decoded which takes ages.

> Perhaps the prompt we pop for large files should suggest visiting
> literally as an option.

Yes, that's also what I was thinking.  Together with having different
"large-threshold" values for unibyte and multibyte.

>> Also if the file has many lines (my
>> 800MB file was made up by copying a C file many times, so it had
>> millions of lines), turning off line-number-mode is needed to recover
>> responsiveness when navigating near the end of the buffer.

> Perhaps we should make the default value of line-number-display-limit
> non-nil, at least in 64-bit builds.

Agreed.  We could even do something better:
- do it more efficiently (once computed for a page, it should be able to
  update the count instantly when paging up/down, whereas it seems not
  to always be able to do that).
- when computing really would take a lot of time (e.g.
  we're far from the closest known line position), display ??? and
  postpone the actual computation to some future idle time.

In any case, large files introduce lots of problems.


        Stefan
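
The two ideas listed above for the line count (count from a known line
position instead of from the start, and give up when the target is too
far away) can be sketched in C roughly as follows.  This is a
hypothetical illustration, not how Emacs implements it: struct
line_cache, line_at and the budget parameter are all made-up names, and
the text is treated as a flat byte array.

```c
#include <stddef.h>

/* Hypothetical cache of one known (byte offset, line number) pair.
   Start it at { 0, 1 }: offset 0 is on line 1.  */
struct line_cache
{
  size_t pos;   /* Byte offset whose line number is known.  */
  long line;    /* 1-based line number at POS.  */
};

/* Return the 1-based line number of byte offset TARGET in TEXT (LEN
   bytes), counting newlines only between the cached position and
   TARGET.  If TARGET is more than BUDGET bytes away from the cached
   position, return -1 and leave the cache untouched, so the caller can
   display "???" and retry later.  On success, move the cache to TARGET.  */
long
line_at (const char *text, size_t len, struct line_cache *cache,
         size_t target, size_t budget)
{
  if (target > len)
    target = len;

  size_t from = cache->pos;
  size_t dist = target > from ? target - from : from - target;
  if (dist > budget)
    return -1;

  long line = cache->line;
  if (target >= from)
    {
      /* Moving forward: each newline passed starts a new line.  */
      for (size_t i = from; i < target; i++)
        if (text[i] == '\n')
          line++;
    }
  else
    {
      /* Moving backward: undo each newline between TARGET and FROM.  */
      for (size_t i = target; i < from; i++)
        if (text[i] == '\n')
          line--;
    }

  cache->pos = target;
  cache->line = line;
  return line;
}
```

Paging up or down by one screen then costs only a scan of the bytes
between the old and new positions; a caller that gets -1 would show ???
and redo the call with a larger budget at idle time.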