From: Binbin YE
Newsgroups: gmane.emacs.devel
Subject: Re: how to reading 0 byte files properly
Date: Fri, 18 Nov 2022 11:17:07 +0900
References: <87o7t5ubs9.fsf@igel.home>
To: emacs-devel@gnu.org

Sorry, I put the mailing list in "cc" but not in "to".

Andreas
Thank you for pointing that out

At this point "total" is READ_BUF_SIZE:

```c
/* emacs/src/fileio.c:4613 */

/* In the following loop, HOW_MUCH contains the total bytes read so
   far for a regular file, and not changed for a special file.  But,
   before exiting the loop, it is set to a negative value if I/O
   error occurs.  */
how_much = 0;
```
I have confirmed that the file is not seekable on my side using the test below, which makes it different from /proc files:

```c
/* test.c:11 */

int fd = open("/proc/2051/arch_status", O_RDONLY);
int sek = lseek(fd, 0, SEEK_CUR);
printf("proc file is seekable %d\n", sek); // returns 0

fd = open("/run/test.json", O_RDONLY);
sek = lseek(fd, 0, SEEK_CUR);
printf("fuse file is seekable %d\n", sek); // returns -1
```
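
A more self-contained form of that check might look like the sketch below. `fd_is_seekable` is a hypothetical helper introduced only for illustration; it is not part of Emacs or of the original test.c. It relies on the documented behavior of `lseek`, which returns `(off_t) -1` on failure (for example with `ESPIPE` on pipes, FIFOs and sockets).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: report whether the
   open descriptor FD accepts lseek.  */
bool
fd_is_seekable (int fd)
{
  assert (fd >= 0);
  /* Asking for the current offset does not move the file position.
     On success lseek returns that offset; on failure it returns
     (off_t) -1.  */
  return lseek (fd, 0, SEEK_CUR) != (off_t) -1;
}
```

Comparing against `(off_t) -1` instead of storing the result in an `int` keeps the check correct even when the current offset does not fit in an `int`.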

I think it hits this block, but I don't see anything special that increases the count. Could that mean Emacs only reads READ_BUF_SIZE bytes of data?

```c
/* emacs/src/fileio.c:4627 */

while (how_much < total)
  {
    /* `try' is reserved in some compilers (Microsoft C).  */
    ptrdiff_t trytry = min (total - how_much, READ_BUF_SIZE);
    ptrdiff_t this;

    if (!seekable && NILP (end))
```
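
One possible reading, going by the HOW_MUCH comment above and by Andreas's remark quoted below that how_much stays zero for a non-seekable file: the loop condition never becomes false in that case, so reading continues chunk by chunk until `read` reports end of file or an error. The following is a simplified standalone model of that reading, not the Emacs source. `model_read_loop` and `CHUNK` are hypothetical names, and the value 4096 is an assumption standing in for READ_BUF_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed stand-in for READ_BUF_SIZE; the real value may differ.  */
enum { CHUNK = 4096 };

/* Simplified model, not the Emacs implementation.  Read from FD and
   return the number of bytes consumed, or -1 on a read error.  SIZE
   is the known file size and is only used when SEEKABLE is true.  */
long
model_read_loop (int fd, bool seekable, long size)
{
  char buf[CHUNK];
  long total = seekable ? size : CHUNK;  /* just a buffer size if !seekable */
  long how_much = 0;
  long inserted = 0;

  assert (fd >= 0);
  while (how_much < total)
    {
      long want = total - how_much < CHUNK ? total - how_much : CHUNK;
      ssize_t got = read (fd, buf, (size_t) want);

      if (got < 0)
        return -1;              /* I/O error */
      if (got == 0)
        break;                  /* end of file */

      /* Only a seekable file advances HOW_MUCH toward TOTAL.  For a
         non-seekable one it stays at zero, so the loop ends only
         through the two exits above.  */
      if (seekable)
        how_much += got;
      inserted += got;
    }
  return inserted;
}
```

Under this model, a pipe holding 10000 bytes is read completely even though that is more than one chunk, which would mean READ_BUF_SIZE does not cap the amount of data read from a non-seekable file.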

Should the fix be quitting at the actual I/O?

Best,
Binbin


On Thu, Nov 17, 2022 at 7:15 PM Andreas Schwab <schwab@linux-m68k.org> wrote:
> On Nov 17 2022, Binbin YE wrote:
>
> > /* emacs/src/fileio.c:4587 */
> >
> > if (seekable || !NILP (end))
> >   total = end_offset - beg_offset;
> > else
> >   /* For a special file, all we can do is guess.  */
> >   total = READ_BUF_SIZE;
> > ```
> > Judging from the code, it assumes the total size would be READ_BUF_SIZE
>
> For a non-seekable file this is just a buffer size, see the read loop
> later in the function (how_much stays zero then).
>
> If the file is seekable, the important part is this:
>
>           /* The file size returned from fstat may be zero, but data
>              may be readable nonetheless, for example when this is a
>              file in the /proc filesystem.  */
>           if (end_offset == 0)
>             end_offset = READ_BUF_SIZE;
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."