unofficial mirror of emacs-devel@gnu.org 
* how to reading 0 byte files properly
@ 2022-11-17  8:45 Binbin YE
  2022-11-17 10:15 ` Andreas Schwab
  0 siblings, 1 reply; 4+ messages in thread
From: Binbin YE @ 2022-11-17  8:45 UTC (permalink / raw)
  To: emacs-devel

Hi devs!

I've been trying to use Emacs to view some files on a FUSE mount.
Unfortunately, the file attributes reported by the system say the file
size is 0 bytes, just like files under /proc.

Emacs can open the file, but it only loads part of it. I checked the
code that handles find-file and narrowed it down to:

```c
/* emacs/src/fileio.c:4587 */

if (seekable || !NILP (end))
  total = end_offset - beg_offset;
else
  /* For a special file, all we can do is guess.  */
  total = READ_BUF_SIZE;
```
Judging from the code, it assumes the total size is READ_BUF_SIZE,
which does not seem to be a very big number (though I'm not sure):

```c
/* emacs/src/fileio.c:3692 */

enum { READ_BUF_SIZE = MAX_ALLOCA };
```

```h
/* emacs/src/lisp.h:5300 */

enum MAX_ALLOCA { MAX_ALLOCA = 16 * 1024 };
```

Since VS Code, vim, cat, and less can all read this file properly, I would
like to clarify whether this is a bug to fix or whether there is another way
to open a big file like this (say the actual size is about 2 MB, but stat
reports 0 bytes).
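
For comparison, tools such as cat generally do not rely on the size reported by stat; they keep calling read until it returns 0. Below is a minimal sketch of that approach, assuming a POSIX system. The helper name `count_bytes_until_eof` is hypothetical and is not taken from Emacs or from any of the tools mentioned.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count the bytes readable from FD by calling
   read() until it reports end of file, ignoring st_size entirely.
   Returns the byte count, or -1 on a read error.  */
long long
count_bytes_until_eof (int fd)
{
  char buf[16 * 1024];
  long long total = 0;

  assert (fd >= 0);
  for (;;)
    {
      ssize_t n = read (fd, buf, sizeof buf);
      if (n > 0)
        total += n;             /* Got data; keep going.  */
      else if (n == 0)
        return total;           /* End of file.  */
      else if (errno != EINTR)
        return -1;              /* Real I/O error.  */
    }
}
```

Because the loop stops only at end of file, a file whose reported size is 0 is still read in full, as long as read keeps returning data.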

Best,

Binbin


* Re: how to reading 0 byte files properly
  2022-11-17  8:45 how to reading 0 byte files properly Binbin YE
@ 2022-11-17 10:15 ` Andreas Schwab
  2022-11-18  1:04   ` Binbin YE
  2022-11-18  2:17   ` Binbin YE
  0 siblings, 2 replies; 4+ messages in thread
From: Andreas Schwab @ 2022-11-17 10:15 UTC (permalink / raw)
  To: Binbin YE; +Cc: emacs-devel

On Nov 17 2022, Binbin YE wrote:

> ```c
> /* emacs/src/fileio.c:4587 */
>
> if (seekable || !NILP (end))
>   total = end_offset - beg_offset;
> else
>   /* For a special file, all we can do is guess.  */
>   total = READ_BUF_SIZE;
> ```
> Judging from the code, it assume the total size would be READ_BUF_SIZE

For a non-seekable file this is just a buffer size, see the read loop
later in the function (how_much stays zero then).

If the file is seekable, the important part is this:

	  /* The file size returned from fstat may be zero, but data
	     may be readable nonetheless, for example when this is a
	     file in the /proc filesystem.  */
	  if (end_offset == 0)
	    end_offset = READ_BUF_SIZE;
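
The two snippets quoted in this thread can be combined into a toy model of how `total` ends up being chosen. This is only a sketch: the function `model_total` is hypothetical, and the exact condition under which the zero-size adjustment applies (seekable file, no explicit END argument) is an assumption, not a copy of fileio.c.

```c
#include <assert.h>
#include <stdbool.h>

/* 16 KiB, i.e. MAX_ALLOCA per the lisp.h snippet quoted in the thread.  */
enum { READ_BUF_SIZE = 16 * 1024 };

/* Hypothetical model of how TOTAL is chosen.  END_OFFSET stands for
   the size reported by fstat when no explicit END was given.  */
long long
model_total (bool seekable, bool end_given,
             long long beg_offset, long long end_offset)
{
  assert (beg_offset >= 0);

  /* Assumption: a reported size of zero is replaced by a buffer-sized
     guess, because data may be readable nonetheless (/proc).  */
  if (seekable && !end_given && end_offset == 0)
    end_offset = READ_BUF_SIZE;

  if (seekable || end_given)
    return end_offset - beg_offset;

  /* For a special file, all we can do is guess.  */
  return READ_BUF_SIZE;
}
```

In this model, both a seekable file with a reported size of 0 and a non-seekable file start with `total` equal to 16384. What differs is how the later read loop treats that value.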

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




* Re: how to reading 0 byte files properly
  2022-11-17 10:15 ` Andreas Schwab
@ 2022-11-18  1:04   ` Binbin YE
  2022-11-18  2:17   ` Binbin YE
  1 sibling, 0 replies; 4+ messages in thread
From: Binbin YE @ 2022-11-18  1:04 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

Thank you, Andreas, for pointing that out.

At this point, "total" is READ_BUF_SIZE:

```c
/* emacs/src/fileio.c:4613 */

/* In the following loop, HOW_MUCH contains the total bytes read so
   far for a regular file, and not changed for a special file.  But,
   before exiting the loop, it is set to a negative value if I/O
   error occurs.  */
how_much = 0;
```

I have confirmed on my side, using the test below, that the file is not
seekable, which is different from /proc files:

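```c
/* test.c:11 */

/* lseek returns the resulting offset on success and -1 on failure,
   so 0 below means the file is seekable and -1 means it is not.  */
int fd = open("/proc/2051/arch_status", O_RDONLY);
off_t sek = lseek(fd, 0, SEEK_CUR);
printf("proc file is seekable %lld\n", (long long) sek); // prints 0
close(fd);

fd = open("/run/test.json", O_RDONLY);
sek = lseek(fd, 0, SEEK_CUR);
printf("fuse file is seekable %lld\n", (long long) sek); // prints -1
close(fd);
```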
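
The same check can be wrapped in a small self-contained helper. This is a sketch assuming a POSIX system; `fd_is_seekable` and `report_seekable` are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true if lseek succeeds on FD, i.e. the
   descriptor supports seeking.  The file offset is left unchanged.  */
bool
fd_is_seekable (int fd)
{
  assert (fd >= 0);
  return lseek (fd, 0, SEEK_CUR) != (off_t) -1;
}

/* Open PATH read-only and print whether it is seekable.
   Returns 1 if seekable, 0 if not, -1 if PATH cannot be opened.  */
int
report_seekable (const char *path)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    {
      perror (path);
      return -1;
    }

  bool ok = fd_is_seekable (fd);
  printf ("%s: %s\n", path, ok ? "seekable" : "not seekable");
  close (fd);
  return ok ? 1 : 0;
}
```

Calling `report_seekable` on the two paths from the test above would be expected to reproduce the same distinction between the /proc file and the FUSE file.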

I think it hits the block below, but I don't see anything there that
increases the count. Could that mean Emacs only reads READ_BUF_SIZE bytes
of data?

```c
/* emacs/src/fileio.c:4627 */

while (how_much < total)
  {
    /* `try' is reserved in some compilers (Microsoft C).  */
    ptrdiff_t trytry = min (total - how_much, READ_BUF_SIZE);
    ptrdiff_t this;

    if (!seekable && NILP (end))
```
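
Going by the comment above `how_much = 0` and by Andreas's note that how_much stays zero for a non-seekable file, the loop condition can be modeled as follows. This is a simplified sketch, not the Emacs code: `model_read_loop` is a hypothetical function, it uses plain read instead of Emacs's own read wrappers, and it discards the data instead of inserting it into a buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum { READ_BUF_SIZE = 16 * 1024 };

/* Hypothetical model of the read loop.  Returns the number of bytes
   consumed from FD, or -1 on a read error.  */
long long
model_read_loop (int fd, bool seekable, bool end_given, long long total)
{
  char buf[READ_BUF_SIZE];
  long long how_much = 0;       /* Compared against TOTAL.  */
  long long inserted = 0;       /* Bytes actually read.  */

  assert (fd >= 0);
  while (how_much < total)
    {
      long long want = total - how_much;
      size_t trytry = want < READ_BUF_SIZE ? (size_t) want : READ_BUF_SIZE;
      ssize_t this_read = read (fd, buf, trytry);

      if (this_read < 0)
        return -1;              /* I/O error.  */
      if (this_read == 0)
        break;                  /* End of file.  */

      /* Only a regular file counts progress toward TOTAL.  For a
         special file HOW_MUCH stays zero, so the loop ends only at
         end of file or on error.  */
      if (seekable || end_given)
        how_much += this_read;
      inserted += this_read;
    }
  return inserted;
}
```

In this model, a non-seekable descriptor is read to the end regardless of `total`, while a seekable one stops after `total` bytes. If the model matches the real loop, a truncated read would point at the seekable branch or at the size guess rather than at the non-seekable branch.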

Should the fix be to quit at the actual I/O?


Best,

Binbin



* Re: how to reading 0 byte files properly
  2022-11-17 10:15 ` Andreas Schwab
  2022-11-18  1:04   ` Binbin YE
@ 2022-11-18  2:17   ` Binbin YE
  1 sibling, 0 replies; 4+ messages in thread
From: Binbin YE @ 2022-11-18  2:17 UTC (permalink / raw)
  To: emacs-devel

Sorry, I put the mailing list in "Cc" but not in "To".

Andreas,
thank you for pointing that out.

Best,

Binbin




Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
