From: Binbin YE
Newsgroups: gmane.emacs.devel
Subject: Re: how to reading 0 byte files properly
Date: Fri, 18 Nov 2022 10:04:51 +0900
References: <87o7t5ubs9.fsf@igel.home>
In-Reply-To: <87o7t5ubs9.fsf@igel.home>
To: Andreas Schwab
Cc: emacs-devel@gnu.org

Thank you, Andreas, for pointing that out.


At this point, "total" is READ_BUF_SIZE:

```c
/* emacs/src/fileio.c:4613 */

/* In the following loop, HOW_MUCH contains the total bytes read so
   far for a regular file, and not changed for a special file.  But,
   before exiting the loop, it is set to a negative value if I/O
   error occurs.  */
how_much = 0;
```

I have confirmed on my side that the file is not seekable, unlike /proc files, using the following test:

```c
/* test.c:11 */

int fd = open("/proc/2051/arch_status", O_RDONLY);
int sek = lseek(fd, 0, SEEK_CUR);
printf("proc file is seekable %d\n", sek); // returns 0

fd = open("/run/test.json", O_RDONLY);
sek = lseek(fd, 0, SEEK_CUR);
printf("fuse file is seekable %d\n", sek); // returns -1
```
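
For reference, here is a slightly fuller sketch of the same check that also reports the size fstat returns, since both properties matter to the code discussed in this thread. The names `probe_file` and `struct file_probe` are made up for this example and do not exist in Emacs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch.  */
struct file_probe
{
  bool seekable;   /* true if lseek (fd, 0, SEEK_CUR) succeeded */
  off_t size;      /* st_size as reported by fstat; may be 0 */
};

/* Open PATH read-only and record whether it is seekable and what size
   fstat reports.  Return 0 on success, -1 if PATH cannot be opened or
   stat'ed.  */
static int
probe_file (const char *path, struct file_probe *p)
{
  assert (path != NULL && p != NULL);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;

  struct stat st;
  if (fstat (fd, &st) != 0)
    {
      close (fd);
      return -1;
    }

  /* lseek returns (off_t) -1 on failure, e.g. for a non-seekable file.  */
  p->seekable = lseek (fd, 0, SEEK_CUR) != (off_t) -1;
  p->size = st.st_size;

  close (fd);
  return 0;
}
```

Calling `probe_file` on both paths from the test above would show, for each one, whether it is seekable and whether fstat reports a zero size.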

I think it hits this block, but I don't see anything special that increases the count. Could that mean Emacs only reads "READ_BUF_SIZE" bytes of data?

```c
/* emacs/src/fileio.c:4627 */

while (how_much < total)
  {
    /* `try' is reserved in some compilers (Microsoft C).  */
    ptrdiff_t trytry = min (total - how_much, READ_BUF_SIZE);
    ptrdiff_t this;

    if (!seekable && NILP (end))
```

Should the fix be to quit based on the actual I/O result?
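
To illustrate what quitting on the actual I/O result could mean, here is a minimal standalone sketch. It is not the fileio.c code. It ignores any size guess and stops only when read() reports end of file or an error. The names `read_all` and `CHUNK_SIZE` are assumptions for this example; Emacs uses its own READ_BUF_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical chunk size for this sketch only.  */
enum { CHUNK_SIZE = 4096 };

/* Read FD until read() returns 0, growing the buffer as needed.
   On success, store a malloc'd buffer in *OUT (the caller frees it)
   and return the number of bytes read.  Return -1 on I/O or
   allocation failure.  */
static ssize_t
read_all (int fd, char **out)
{
  assert (out != NULL);

  char *buf = NULL;
  size_t used = 0, cap = 0;

  for (;;)
    {
      /* Make sure there is room for one more chunk.  */
      if (cap - used < CHUNK_SIZE)
        {
          size_t new_cap = cap ? cap * 2 : CHUNK_SIZE;
          char *tmp = realloc (buf, new_cap);
          if (tmp == NULL)
            {
              free (buf);
              return -1;
            }
          buf = tmp;
          cap = new_cap;
        }

      ssize_t n = read (fd, buf + used, CHUNK_SIZE);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* interrupted, retry */
          free (buf);
          return -1;
        }
      if (n == 0)
        break;                  /* end of file: the only successful exit */
      used += (size_t) n;
    }

  *out = buf;
  return (ssize_t) used;
}
```

On a pipe or any other non-seekable descriptor, `read_all` keeps going past the first CHUNK_SIZE bytes until read() returns 0. As far as I understand Andreas's explanation, that is also what the loop in fileio.c does for non-seekable files, since how_much stays zero there.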


Best,

Binbin

On Thu, Nov 17, 2022 at 7:15 PM Andreas Schwab <schwab@linux-m68k.org> wrote:

> On Nov 17 2022, Binbin YE wrote:
>
> > /* emacs/src/fileio.c:4587 */
> >
> > if (seekable || !NILP (end))
> >   total = end_offset - beg_offset;
> > else
> >   /* For a special file, all we can do is guess.  */
> >   total = READ_BUF_SIZE;
> > ```
> > Judging from the code, it assume the total size would be READ_BUF_SIZE
>
> For a non-seekable file this is just a buffer size, see the read loop
> later in the function (how_much stays zero then).
>
> If the file is seekable, the important part is this:
>
>           /* The file size returned from fstat may be zero, but data
>              may be readable nonetheless, for example when this is a
>              file in the /proc filesystem.  */
>           if (end_offset == 0)
>             end_offset = READ_BUF_SIZE;
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."