unofficial mirror of help-gnu-emacs@gnu.org
* Emacs opens only first 16384 bytes of file?!
@ 2023-02-11 21:16 Christoph Groth
  2023-02-12  3:55 ` Ruijie Yu via Users list for the GNU Emacs text editor
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Christoph Groth @ 2023-02-11 21:16 UTC
  To: help-gnu-emacs

Hello,

I just noticed something that made me doubt my own sanity:  With Emacs
27.1 (from Debian stable running Debian’s Linux kernel
6.0.0-0.deb11.6-amd64) the command

  emacs -q -nw /proc/cpuinfo

reproducibly opens cpuinfo only up to a portion of line 272, while the
entire file is 560 lines long on that machine.  I checked on a different
machine (with same Emacs), and I see the same behavior.

C-u C-x = reports that the file was read only up to position 16384 (= 2^14).

Running M-x revert-buffer loads the whole file...
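
The observation above can be cross-checked from the shell.  This is only a sketch: `lines_in_prefix` is a made-up helper name, and `head -c` is assumed to be available (it is in GNU coreutils).  It counts the newlines that fall within the first N bytes of a file, so the result for 16384 bytes can be compared with the line at which the buffer stops.

```shell
# Hypothetical helper: number of newlines within the first N bytes of a file.
# $1 = file, $2 = byte count
lines_in_prefix() {
    head -c "$2" "$1" | wc -l | tr -d ' '
}

# Example run; only meaningful where /proc/cpuinfo exists (e.g. on Linux).
if [ -r /proc/cpuinfo ]; then
    echo "complete lines in first 16384 bytes: $(lines_in_prefix /proc/cpuinfo 16384)"
    echo "lines in the whole file: $(cat /proc/cpuinfo | wc -l | tr -d ' ')"
fi
```

If the first number is one less than the line at which Emacs stops, the truncation sits exactly at byte 16384.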

Is/was this a known issue?  I could not find anything on the web.

----------------------------------------------------------------

I’m willing to investigate this further, but it would seem very strange
if Emacs indeed had a bug that prevented it from loading some files
entirely.  Perhaps someone here knows the answer.

Thanks
Christoph




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-11 21:16 Christoph Groth
@ 2023-02-12  3:55 ` Ruijie Yu via Users list for the GNU Emacs text editor
  2023-02-12  4:55 ` Óscar Fuentes
  2023-02-12  6:02 ` Eli Zaretskii
  2 siblings, 0 replies; 11+ messages in thread
From: Ruijie Yu via Users list for the GNU Emacs text editor @ 2023-02-12  3:55 UTC
  To: Christoph Groth; +Cc: help-gnu-emacs


Christoph Groth <christoph@grothesque.org> writes:

> Hello,
>
> I just noticed something that made me doubt my own sanity:  With Emacs
> 27.1 (from Debian stable running Debian’s Linux kernel
> 6.0.0-0.deb11.6-amd64) the command
>
>   emacs -q -nw /proc/cpuinfo
>
> reproducibly opens cpuinfo only up to a portion of line 272, while the
> entire file is 560 lines long on that machine.  I checked on a different
> machine (with same Emacs), and I see the same behavior.
>
> C-u C-x = tells that the file is read up to position 16384 (= 2^14).

I can't reproduce this on 28.1 or on 29 (8a18369afdc3).  Can you compile
28.1 and see if the problem goes away?

Since my /proc/cpuinfo is smaller than 16384 bytes, I used the following
steps to try to reproduce the issue:

$ zcat /proc/config.gz > ~/c
$ /path/to/emacs-28.1/src/emacs -Q -nw ~/c
`C-u C-x =' says: "position: 1 of 252720 (0%), column: 0"

> Running M-x revert-buffer loads the whole file...
>
> Is/was this a known issue?  I could not find anything on the web.

In the future you might get more responses if you mail to
bug-gnu-emacs@gnu.org.

> ----------------------------------------------------------------
>
> I’m willing to investigate this further, but it would seem very strange
> if Emacs indeed had a bug that prevented it from loading some files
> entirely.  Perhaps someone here knows the answer.
>
> Thanks
> Christoph

--
Best,


RY




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-11 21:16 Christoph Groth
  2023-02-12  3:55 ` Ruijie Yu via Users list for the GNU Emacs text editor
@ 2023-02-12  4:55 ` Óscar Fuentes
  2023-02-12  5:56   ` Ruijie Yu via Users list for the GNU Emacs text editor
  2023-02-12  6:02 ` Eli Zaretskii
  2 siblings, 1 reply; 11+ messages in thread
From: Óscar Fuentes @ 2023-02-12  4:55 UTC
  To: Christoph Groth; +Cc: help-gnu-emacs

Christoph Groth <christoph@grothesque.org> writes:

> Hello,
>
> I just noticed something that made me doubt my own sanity:  With Emacs
> 27.1 (from Debian stable running Debian’s Linux kernel
> 6.0.0-0.deb11.6-amd64) the command
>
>   emacs -q -nw /proc/cpuinfo
>
> reproducibly opens cpuinfo only up to a portion of line 272, while the
> entire file is 560 lines long on that machine.  I checked on a different
> machine (with same Emacs), and I see the same behavior.
>
> C-u C-x = tells that the file is read up to position 16384 (= 2^14).
>
> Running M-x revert-buffer loads the whole file...
>
> Is/was this a known issue?  I could not find anything on the web.
>
> ----------------------------------------------------------------
>
> I’m willing to investigate this further, but it would seem very strange
> if Emacs indeed had a bug that prevented it from loading some files
> entirely.  Perhaps someone here knows the answer.

As of today (Emacs 30, current development branch) the same is true.

/proc/cpuinfo is not a regular file, in the sense that its content is
not stored on a device. AFAIK it is generated on the fly when it is
read.

You can make some simple observations:

$ ls -l /proc/cpuinfo 
-r--r--r-- 1 root root 0 feb  7 03:34 /proc/cpuinfo
$ du /proc/cpuinfo 
0       /proc/cpuinfo

Here those tools are saying that the file's size is 0.
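
The mismatch can be made explicit by comparing the size the file system reports with the number of bytes a reader actually receives.  A minimal sketch, assuming a POSIX-like `ls` and `awk`; the helper names `reported_size` and `actual_size` are made up for illustration.  The data is piped through `cat` so that `wc` has to count the bytes it receives rather than consult the file's metadata.

```shell
# Size as reported by the file system (5th column of `ls -ln`).
reported_size() {
    ls -ln "$1" | awk '{ print $5 }'
}

# Number of bytes actually delivered when the file is read to EOF.
actual_size() {
    cat "$1" | wc -c | tr -d ' '
}

# Example run; only meaningful where /proc/cpuinfo exists (e.g. on Linux).
if [ -r /proc/cpuinfo ]; then
    echo "reported: $(reported_size /proc/cpuinfo)  actual: $(actual_size /proc/cpuinfo)"
fi
```

For an ordinary file on disk the two numbers agree; for a file whose contents are generated when it is read, the reported size can be 0 while the actual size is not.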

I guess that Emacs detects that the file is special and reads its
contents following some heuristics. What surprises me is that M-x
revert-buffer actually reads all the content.

Of course, looking at the sources would be enlightening, but why make
the effort of actually clearing up the matter when it is so cheap to
throw out speculation? ;-)

(I looked at insert-file-contents, but bailed out after the fifth
screenful of code. A function of 1000+ lines, no kidding.)




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-12  4:55 ` Óscar Fuentes
@ 2023-02-12  5:56   ` Ruijie Yu via Users list for the GNU Emacs text editor
  2023-02-12  9:58     ` Christoph Groth
  0 siblings, 1 reply; 11+ messages in thread
From: Ruijie Yu via Users list for the GNU Emacs text editor @ 2023-02-12  5:56 UTC
  To: Óscar Fuentes; +Cc: Christoph Groth, help-gnu-emacs


Óscar Fuentes <ofv@wanadoo.es> writes:

> [...]
> As of today (Emacs 30, current development branch) the same is true.
>
> /proc/cpuinfo is not a regular file, in the sense that its content is
> not stored in a device. AFAIK it is generated on-the-fly when it is
> read.
>
> You can do some simple observations:
>
> $ ls -l /proc/cpuinfo
> -r--r--r-- 1 root root 0 feb  7 03:34 /proc/cpuinfo
> $ du /proc/cpuinfo
> 0       /proc/cpuinfo
>
> Here those tools are saying that the file's size is 0.

Thanks for elaborating on the part that I missed.  So, to reproduce the
issue, we need a file that reports a size of 0 and whose contents are
generated on the fly when read.  In the original report that is
/proc/cpuinfo, and I have also found /proc/kallsyms, which contains
around 12M of data.  With that file, I have reproduced the issue on my
Emacs 29 build.

> I guess that Emacs detects that the file is special and reads its
> contents following some heuristics. What surprises me is that M-x
> revert-buffer actually reads all the content.
>
> Of course, looking at the sources would be enlightening, but why do the
> effort of actually clearing the matter when it is so cheap to throw
> speculation? ;-)

I think the limit of 16384 is probably caused by fileio.c:3919
(907fd1f7ff4 somewhere on master, inside DEFUN "insert-file-contents")
which declares a read buffer.  The constant READ_BUF_SIZE is indirectly
set as 16 * 1024 = 16384.  Didn't read further there, nor inside
`revert-buffer'.

> (I looked at insert-file-contents, but bailed out after the fifth
> screenful of code. A 1000+ lines function, no kidding.)

Yeah, that's exactly why I didn't read further there.

--
Best,


RY




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-11 21:16 Christoph Groth
  2023-02-12  3:55 ` Ruijie Yu via Users list for the GNU Emacs text editor
  2023-02-12  4:55 ` Óscar Fuentes
@ 2023-02-12  6:02 ` Eli Zaretskii
  2023-02-12  7:11   ` tomas
  2 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2023-02-12  6:02 UTC
  To: help-gnu-emacs

> From: Christoph Groth <christoph@grothesque.org>
> Date: Sat, 11 Feb 2023 22:16:52 +0100
> 
> I just noticed something that made me doubt my own sanity:  With Emacs
> 27.1 (from Debian stable running Debian’s Linux kernel
> 6.0.0-0.deb11.6-amd64) the command
> 
>   emacs -q -nw /proc/cpuinfo
> 
> reproducibly opens cpuinfo only up to a portion of line 272, while the
> entire file is 560 lines long on that machine.  I checked on a different
> machine (with same Emacs), and I see the same behavior.
> 
> C-u C-x = tells that the file is read up to position 16384 (= 2^14).
> 
> Running M-x revert-buffer loads the whole file...
> 
> Is/was this a known issue?  I could not find anything on the web.

Please report this as a bug.




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-12  6:02 ` Eli Zaretskii
@ 2023-02-12  7:11   ` tomas
  2023-02-12  7:33     ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: tomas @ 2023-02-12  7:11 UTC
  To: help-gnu-emacs

On Sun, Feb 12, 2023 at 08:02:37AM +0200, Eli Zaretskii wrote:
> > From: Christoph Groth <christoph@grothesque.org>
> > Date: Sat, 11 Feb 2023 22:16:52 +0100
> > 
> > I just noticed something that made me doubt my own sanity:  With Emacs
> > 27.1 (from Debian stable running Debian’s Linux kernel
> > 6.0.0-0.deb11.6-amd64) the command
> > 
> >   emacs -q -nw /proc/cpuinfo
> > 
> > reproducibly opens cpuinfo only up to a portion of line 272, while the
> > entire file is 560 lines long on that machine.  I checked on a different
> > machine (with same Emacs), and I see the same behavior.
> > 
> > C-u C-x = tells that the file is read up to position 16384 (= 2^14).
> > 
> > Running M-x revert-buffer loads the whole file...
> > 
> > Is/was this a known issue?  I could not find anything on the web.
> 
> Please report this as a bug.

I think the bug is already there (#9800) [1]

[1] https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-02/msg00459.html

Cheers
-- 
t


* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-12  7:11   ` tomas
@ 2023-02-12  7:33     ` Eli Zaretskii
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2023-02-12  7:33 UTC
  To: help-gnu-emacs

> Date: Sun, 12 Feb 2023 08:11:04 +0100
> From: <tomas@tuxteam.de>
> 
> On Sun, Feb 12, 2023 at 08:02:37AM +0200, Eli Zaretskii wrote:
> > Please report this as a bug.
> 
> I think the bug is already there (#9800) [1]

Right.




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-12  5:56   ` Ruijie Yu via Users list for the GNU Emacs text editor
@ 2023-02-12  9:58     ` Christoph Groth
  2023-02-12 11:45       ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Groth @ 2023-02-12  9:58 UTC
  To: Ruijie Yu; +Cc: Óscar Fuentes, help-gnu-emacs

Ruijie Yu wrote:
> Óscar Fuentes <ofv@wanadoo.es> writes:
>
> > [...]
> > As of today (Emacs 30, current development branch) the same is true.
> >
> > /proc/cpuinfo is not a regular file, in the sense that its content
> > is not stored in a device. AFAIK it is generated on-the-fly when it
> > is read.
> >
> > You can do some simple observations:
> >
> > $ ls -l /proc/cpuinfo
> > -r--r--r-- 1 root root 0 feb  7 03:34 /proc/cpuinfo
> > $ du /proc/cpuinfo
> > 0       /proc/cpuinfo
> >
> > Here those tools are saying that the file's size is 0.
>
> Thanks for elaborating on the part that I missed.  So, to reproduce
> the issue, we need to find a file which does not show size, and whose
> contents are generated on-the-fly when read -- in the original report,
> it is /proc/cpuinfo, and I have also found /proc/kallsyms which
> contains around 12M of data.  In this case, I have reproduced the
> issue on my Emacs 29 build.

Yes, that’s it, thanks for examining this further!

> > I guess that Emacs detects that the file is special and reads its
> > contents following some heuristics. What surprises me is that M-x
> > revert-buffer actually reads all the content.
> >
> > Of course, looking at the sources would be enlightening, but why do the
> > effort of actually clearing the matter when it is so cheap to throw
> > speculation? ;-)
>
> I think the limit of 16384 is probably caused by fileio.c:3919
> (907fd1f7ff4 somewhere on master, inside DEFUN "insert-file-contents")
> which declares a read buffer.  The constant READ_BUF_SIZE is
> indirectly set as 16 * 1024 = 16384.  Didn't read further there, nor
> inside `revert-buffer'.

Indeed this seems to be related to the advertised file size being zero.

However, no other programs (including editors) seem to have a problem
with this.  So it really must be Emacs trying to be clever.

Finding files under /proc/ is, of course, not a typical use of Emacs,
but Elisp code that relies on this does get posted:
https://nullprogram.com/blog/2015/10/14/

I had something similar in my init.el to determine the value to use with
"make -j" in the default compile command; this is how I noticed the
issue in the first place.  (I have since changed that to run the nproc
command instead.)
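
The nproc-based approach can be sketched in shell as follows.  `default_jobs` is a hypothetical helper name, and the fallback to `getconf _NPROCESSORS_ONLN` (and finally to 1) is an assumption added for systems without nproc; it is not something taken from the init.el mentioned above.

```shell
# Number of parallel jobs for make: ask nproc, fall back to getconf, then to 1.
default_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Print the resulting command line rather than running a build.
echo "make -j$(default_jobs)"
```

Unlike parsing /proc/cpuinfo, this does not depend on how much of the file a particular reader happens to return.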

What worries me more is that there may be other, sneakier cases where
this backfires.  For example, is opening a file that is still being
written (so that its size increases) perhaps broken as well in some cases?

Should I report this to bug-gnu-emacs@gnu.org, or is posting on this
list enough?  Or should I open a ticket with debbugs.gnu.org?

Cheers
Christoph




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-12  9:58     ` Christoph Groth
@ 2023-02-12 11:45       ` Eli Zaretskii
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2023-02-12 11:45 UTC
  To: help-gnu-emacs

> From: Christoph Groth <christoph@grothesque.org>
> Cc: Óscar Fuentes <ofv@wanadoo.es>,  help-gnu-emacs@gnu.org
> Date: Sun, 12 Feb 2023 10:58:14 +0100
> 
> However, no other programs (including editors) seem to have a problem
> with this.  So it really must be Emacs trying to be clever.

Not clever, but flexible, powerful, and performant.  Which other
program supports on-the-fly decoding of non-ASCII text in so many
encodings?

> Should I report this to bug-gnu-emacs@gnu.org, or is posting on this
> list enough?  Or should I open a ticket with debbugs.gnu.org?

A bug for this already exists: bug#9800.  A possible solution was also
pointed out there.




* Re: Emacs opens only first 16384 bytes of file?!
@ 2023-02-13  7:59 Christoph Groth
  2023-02-13 13:06 ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Groth @ 2023-02-13  7:59 UTC
  To: eliz; +Cc: help-gnu-emacs

Eli Zaretskii wrote:

> > From: Christoph Groth <christoph@grothesque.org>
> > 
> > However, no other programs (including editors) seem to have
> > a problem with this.  So it really must be Emacs trying to be
> > clever.
> 
> Not clever, but flexible, powerful, and performant.  Which other
> program supports on-the-fly decoding of non-ASCII text in so many
> encodings?

No need to convince me.  I have been living in Emacs since 1998.

That said, the finding that Emacs does not read files until EOF
surprised me quite a lot.  I thought that it had solid underpinnings.

> > Should I report this to bug-gnu-emacs@gnu.org, or is posting on this
> > list enough?  Or should I open a ticket with debbugs.gnu.org?
> 
> A bug for this already exists: bug#9800.  A possible solution was also
> pointed out there.

Thanks, I did not manage to find this one.




* Re: Emacs opens only first 16384 bytes of file?!
  2023-02-13  7:59 Emacs opens only first 16384 bytes of file?! Christoph Groth
@ 2023-02-13 13:06 ` Eli Zaretskii
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2023-02-13 13:06 UTC
  To: help-gnu-emacs

> From: Christoph Groth <christoph@grothesque.org>
> Cc: help-gnu-emacs@gnu.org
> Date: Mon, 13 Feb 2023 08:59:01 +0100
> 
> > Not clever, but flexible, powerful, and performant.  Which other
> > program supports on-the-fly decoding of non-ASCII text in so many
> > encodings?
> 
> No need to convince me.  I have been living in Emacs since 1998.
> 
> That said, the finding that Emacs does not read files until EOF
> surprised me quite a lot.  I thought that it had solid underpinnings.

It does.  There are several valid reasons why we don't read
everything: insert-file-contents has more than one mode of operation.


