unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* [PATCH] Bindings for ‘sendfile’
@ 2013-03-20 22:21 Ludovic Courtès
  2013-03-21  0:17 ` Noah Lavine
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Ludovic Courtès @ 2013-03-20 22:21 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 101 bytes --]

Hi,

I plan to commit the patch below, which adds bindings for ‘sendfile’.

Comments?

Ludo’.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: the patch --]
[-- Type: text/x-patch, Size: 12286 bytes --]

From a10f5d5d69d63495cab5432d858b1af52a2bacbf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
Date: Wed, 20 Mar 2013 23:04:11 +0100
Subject: [PATCH] Add bindings for Linux's `sendfile'.

* configure.ac: Check for <sys/sendfile.h> and `sendfile'.
* libguile/filesys.c (scm_sendfile): New function.
* libguile/filesys.h (scm_sendfile): New declaration.
* test-suite/tests/filesys.test ("sendfile"): New test prefix.
* doc/ref/posix.texi (File System): Document `sendfile'.
---
 configure.ac                  |   20 ++++++++--
 doc/ref/posix.texi            |   23 +++++++++++
 libguile/filesys.c            |   85 +++++++++++++++++++++++++++++++++++++++++
 libguile/filesys.h            |    4 +-
 test-suite/tests/filesys.test |   70 ++++++++++++++++++++++++++++++++-
 5 files changed, 195 insertions(+), 7 deletions(-)

diff --git a/configure.ac b/configure.ac
index 42de733..bcfc1a6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -647,12 +647,13 @@ AC_SUBST([SCM_I_GSC_HAVE_STRUCT_DIRENT64])
 #     this file instead of <fenv.h>
 #   process.h - mingw specific
 #   sched.h - missing on MinGW
+#   sys/sendfile.h - non-POSIX, found in glibc
 #
 AC_CHECK_HEADERS([complex.h fenv.h io.h libc.h limits.h memory.h process.h string.h \
 sys/dir.h sys/ioctl.h sys/select.h \
 sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
 sys/utime.h time.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
-direct.h machine/fpu.h sched.h])
+direct.h machine/fpu.h sched.h sys/sendfile.h])
 
 # "complex double" is new in C99, and "complex" is only a keyword if
 # <complex.h> is included
@@ -744,10 +745,21 @@ AC_CHECK_HEADERS([assert.h crt_externs.h])
 #   _NSGetEnviron - Darwin specific
 #   strcoll_l, newlocale - GNU extensions (glibc), also available on Darwin
 #   fork - unavailable on Windows
-#   utimensat: posix.1-2008
-#   sched_getaffinity, sched_setaffinity: GNU extensions (glibc)
+#   utimensat - posix.1-2008
+#   sched_getaffinity, sched_setaffinity - GNU extensions (glibc)
+#   sendfile - non-POSIX, found in glibc
 #
-AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid fesetround ftime ftruncate fchown fchmod getcwd geteuid getsid gettimeofday gmtime_r ioctl lstat mkdir mknod nice pipe _pipe readdir_r readdir64_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l newlocale utimensat sched_getaffinity sched_setaffinity])
+AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid		\
+  fesetround ftime ftruncate fchown fchmod getcwd geteuid getsid	\
+  gettimeofday gmtime_r ioctl lstat mkdir mknod nice pipe _pipe		\
+  readdir_r readdir64_r readlink rename rmdir select setegid seteuid	\
+  setlocale setpgid setsid sigaction siginterrupt stat64 strftime	\
+  strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid	\
+  strdup system usleep atexit on_exit chown link fcntl ttyname getpwent	\
+  getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp	\
+  index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron	\
+  strcoll strcoll_l newlocale utimensat sched_getaffinity		\
+  sched_setaffinity sendfile])
 
 AM_CONDITIONAL([HAVE_FORK], [test "x$ac_cv_func_fork" = "xyes"])
 
diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index d659cf3..ca02093 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -803,6 +803,29 @@ Copy the file specified by @var{oldfile} to @var{newfile}.
 The return value is unspecified.
 @end deffn
 
+@deffn {Scheme Procedure} sendfile out in count [offset]
+@deffnx {C Function} scm_sendfile (out, in, count, offset)
+Send @var{count} bytes from @var{in} to @var{out}, both of which
+are either open file ports or file descriptors.  When
+@var{offset} is omitted, start reading from @var{in}'s current
+position; otherwise, start reading at @var{offset}.
+
+When @var{in} is a port, it is often preferable to specify @var{offset},
+because @var{in}'s offset as a port may be different from the offset of
+its underlying file descriptor.
+
+On systems that support it, such as GNU/Linux, this procedure uses the
+@code{sendfile} libc function, which usually corresponds to a system
+call.  This is faster than doing a series of @code{read} and
+@code{write} system calls.  A typical application is to send a file over
+a socket.
+
+In some cases, the @code{sendfile} libc function may return
+@code{EINVAL} or @code{ENOSYS}.  In that case, Guile's @code{sendfile}
+procedure automatically falls back to doing a series of @code{read} and
+@code{write} calls.
+@end deffn
+
 @findex rename
 @deffn {Scheme Procedure} rename-file oldname newname
 @deffnx {C Function} scm_rename (oldname, newname)
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 282ff31..097b03a 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -98,6 +98,14 @@
 
 #define NAMLEN(dirent)  strlen ((dirent)->d_name)
 
+#ifdef HAVE_SYS_SENDFILE_H
+# include <sys/sendfile.h>
+#endif
+
+#include <full-read.h>
+#include <full-write.h>
+
+
 /* Some more definitions for the native Windows port. */
 #ifdef __MINGW32__
 # define fsync(fd) _commit (fd)
@@ -1096,6 +1104,83 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
 }
 #undef FUNC_NAME
 
+SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
+	    (SCM out, SCM in, SCM count, SCM offset),
+	    "Send @var{count} bytes from @var{in} to @var{out}, both of which "
+	    "are either open file ports or file descriptors.  When "
+	    "@var{offset} is omitted, start reading from @var{in}'s current "
+	    "position; otherwise, start reading at @var{offset}.")
+#define FUNC_NAME s_scm_sendfile
+{
+#define VALIDATE_FD_OR_PORT(cvar, svar, pos)	\
+  if (scm_is_integer (svar))			\
+    cvar = scm_to_int (svar);			\
+  else						\
+    {						\
+      SCM_VALIDATE_OPFPORT (pos, svar);		\
+      scm_flush (svar);				\
+      cvar = SCM_FPORT_FDES (svar);		\
+    }
+
+  size_t c_count;
+  off_t c_offset;
+  ssize_t result;
+  int in_fd, out_fd;
+
+  VALIDATE_FD_OR_PORT (out_fd, out, 1);
+  VALIDATE_FD_OR_PORT (in_fd, in, 2);
+  c_count = scm_to_size_t (count);
+  c_offset = SCM_UNBNDP (offset) ? 0 : scm_to_off_t (offset);
+
+#ifdef HAVE_SENDFILE
+  result = sendfile (out_fd, in_fd,
+		     SCM_UNBNDP (offset) ? NULL : &c_offset,
+		     c_count);
+
+  /* Quoting the Linux man page: "In Linux kernels before 2.6.33, out_fd
+     must refer to a socket.  Since Linux 2.6.33 it can be any file."
+     Fall back to read(2) and write(2) such an error happens.  */
+  if (result < 0 && errno != EINVAL && errno != ENOSYS)
+    SCM_SYSERROR;
+  else if (result < 0)
+#endif
+  {
+    char buf[8192];
+    size_t left;
+
+    if (!SCM_UNBNDP (offset))
+      {
+	if (SCM_PORTP (in))
+	  scm_seek (in, offset, scm_from_int (SEEK_SET));
+	else
+	  lseek_or_lseek64 (in_fd, c_offset, SEEK_SET);
+      }
+
+    for (result = 0, left = c_count; result < c_count; )
+      {
+	size_t asked, obtained;
+
+	asked = SCM_MIN (sizeof buf, left);
+	obtained = full_read (in_fd, buf, asked);
+	if (obtained < asked)
+	  SCM_SYSERROR;
+
+	left -= obtained;
+
+	obtained = full_write (out_fd, buf, asked);
+	if (obtained < asked)
+	  SCM_SYSERROR;
+
+	result += obtained;
+      }
+  }
+
+  return scm_from_ssize_t (result);
+
+#undef VALIDATE_FD_OR_PORT
+}
+#undef FUNC_NAME
+
 #endif /* HAVE_POSIX */
 
 \f
diff --git a/libguile/filesys.h b/libguile/filesys.h
index 967ce74..776b263 100644
--- a/libguile/filesys.h
+++ b/libguile/filesys.h
@@ -3,7 +3,8 @@
 #ifndef SCM_FILESYS_H
 #define SCM_FILESYS_H
 
-/* Copyright (C) 1995,1997,1998,1999,2000,2001, 2006, 2008, 2009, 2010 Free Software Foundation, Inc.
+/* Copyright (C) 1995, 1997, 1998, 1999, 2000, 2001, 2006, 2008, 2009,
+ *   2010, 2013 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -66,6 +67,7 @@ SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
 SCM_API SCM scm_dirname (SCM filename);
 SCM_API SCM scm_basename (SCM filename, SCM suffix);
 SCM_API SCM scm_canonicalize_path (SCM path);
+SCM_API SCM scm_sendfile (SCM out, SCM in, SCM count, SCM offset);
 SCM_INTERNAL SCM scm_i_relativize_path (SCM path, SCM in_path);
 
 SCM_INTERNAL void scm_init_filesys (void);
diff --git a/test-suite/tests/filesys.test b/test-suite/tests/filesys.test
index a6bfb6e..c80c295 100644
--- a/test-suite/tests/filesys.test
+++ b/test-suite/tests/filesys.test
@@ -1,6 +1,6 @@
 ;;;; filesys.test --- test file system functions -*- scheme -*-
 ;;;; 
-;;;; Copyright (C) 2004, 2006 Free Software Foundation, Inc.
+;;;; Copyright (C) 2004, 2006, 2013 Free Software Foundation, Inc.
 ;;;; 
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -18,7 +18,10 @@
 
 (define-module (test-suite test-filesys)
   #:use-module (test-suite lib)
-  #:use-module (test-suite guile-test))
+  #:use-module (test-suite guile-test)
+  #:use-module (ice-9 match)
+  #:use-module (rnrs io ports)
+  #:use-module (rnrs bytevectors))
 
 (define (test-file)
   (data-file-name "filesys-test.tmp"))
@@ -125,5 +128,68 @@
 	(close-port port)
 	(eqv? 5 (stat:size st))))))
 
+(with-test-prefix "sendfile"
+
+  (pass-if "file"
+    (let ((file (search-path %load-path "ice-9/boot-9.scm")))
+      (call-with-input-file file
+        (lambda (input)
+          (let ((len (stat:size (stat input))))
+            (call-with-output-file (test-file)
+              (lambda (output)
+                (sendfile output input len 0))))))
+      (let ((ref (call-with-input-file file get-bytevector-all))
+            (out (call-with-input-file (test-file) get-bytevector-all)))
+        (bytevector=? ref out))))
+
+  (pass-if "file with offset"
+    (let ((file (search-path %load-path "ice-9/boot-9.scm")))
+      (call-with-input-file file
+        (lambda (input)
+          (let ((len (stat:size (stat input))))
+            (call-with-output-file (test-file)
+              (lambda (output)
+                (sendfile output input (- len 777) 777))))))
+      (let ((ref (call-with-input-file file
+                   (lambda (input)
+                     (seek input 777 SEEK_SET)
+                     (get-bytevector-all input))))
+            (out (call-with-input-file (test-file) get-bytevector-all)))
+        (bytevector=? ref out))))
+
+  (pass-if "pipe"
+    (let* ((file   (search-path %load-path "ice-9/boot-9.scm"))
+           (in+out (pipe))
+           (child  (call-with-new-thread
+                    (lambda ()
+                      (call-with-input-file file
+                        (lambda (input)
+                          (let ((len (stat:size (stat input))))
+                            (sendfile (cdr in+out) (fileno input) len 0)
+                            (close-port (cdr in+out)))))))))
+      (let ((ref (call-with-input-file file get-bytevector-all))
+            (out (get-bytevector-all (car in+out))))
+        (close-port (car in+out))
+        (bytevector=? ref out))))
+
+  (pass-if "pipe with offset"
+    (let* ((file   (search-path %load-path "ice-9/boot-9.scm"))
+           (in+out (pipe))
+           (child  (call-with-new-thread
+                    (lambda ()
+                      (call-with-input-file file
+                        (lambda (input)
+                          (let ((len (stat:size (stat input))))
+                            (sendfile (cdr in+out) (fileno input)
+                                      (- len 777) 777)
+                            (close-port (cdr in+out)))))))))
+      (let ((ref (call-with-input-file file
+                   (lambda (input)
+                     (seek input 777 SEEK_SET)
+                     (get-bytevector-all input))))
+            (out (get-bytevector-all (car in+out))))
+        (close-port (car in+out))
+        (bytevector=? ref out)))))
+
 (delete-file (test-file))
 (delete-file (test-symlink))
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-20 22:21 [PATCH] Bindings for ‘sendfile’ Ludovic Courtès
@ 2013-03-21  0:17 ` Noah Lavine
  2013-03-21  9:15   ` Ludovic Courtès
  2013-03-21  3:45 ` Nala Ginrut
  2013-03-21  4:50 ` Mark H Weaver
  2 siblings, 1 reply; 30+ messages in thread
From: Noah Lavine @ 2013-03-21  0:17 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 653 bytes --]

Hi,

sendfile looks very useful!

I've thought for a while that if I had time (which I know I won't) I would
make a module called (linux) with bindings for non-POSIX Linux kernel
features. What do you think of this idea? If so, what do you think of
putting sendfile there and expanding it with other functions as we need
them?

That would make it very clear when writing a program which functions were
portable and which weren't.

Best,
Noah


On Wed, Mar 20, 2013 at 6:21 PM, Ludovic Courtès <ludo@gnu.org> wrote:

> Hi,
>
> I plan to commit the patch below, which adds bindings for ‘sendfile’.
>
> Comments?
>
> Ludo’.
>
>

[-- Attachment #2: Type: text/html, Size: 1073 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-20 22:21 [PATCH] Bindings for ‘sendfile’ Ludovic Courtès
  2013-03-21  0:17 ` Noah Lavine
@ 2013-03-21  3:45 ` Nala Ginrut
  2013-03-21  4:50 ` Mark H Weaver
  2 siblings, 0 replies; 30+ messages in thread
From: Nala Ginrut @ 2013-03-21  3:45 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Wed, 2013-03-20 at 23:21 +0100, Ludovic Courtès wrote:
> Hi,
> 
> I plan to commit the patch below, which adds bindings for ‘sendfile’.
> 
> Comments?
> 

As a server-develop fan, I definitely love it.
Besides, can we add more linux-specific features and add them into a
place like (ice-9 linux)?

> Ludo’.
> 
> differences between files attachment
> (0001-Add-bindings-for-Linux-s-sendfile.patch), "the patch"
> From a10f5d5d69d63495cab5432d858b1af52a2bacbf Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
> Date: Wed, 20 Mar 2013 23:04:11 +0100
> Subject: [PATCH] Add bindings for Linux's `sendfile'.
> 
> * configure.ac: Check for <sys/sendfile.h> and `sendfile'.
> * libguile/filesys.c (scm_sendfile): New function.
> * libguile/filesys.h (scm_sendfile): New declaration.
> * test-suite/tests/filesys.test ("sendfile"): New test prefix.
> * doc/ref/posix.texi (File System): Document `sendfile'.
> ---
>  configure.ac                  |   20 ++++++++--
>  doc/ref/posix.texi            |   23 +++++++++++
>  libguile/filesys.c            |   85 +++++++++++++++++++++++++++++++++++++++++
>  libguile/filesys.h            |    4 +-
>  test-suite/tests/filesys.test |   70 ++++++++++++++++++++++++++++++++-
>  5 files changed, 195 insertions(+), 7 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 42de733..bcfc1a6 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -647,12 +647,13 @@ AC_SUBST([SCM_I_GSC_HAVE_STRUCT_DIRENT64])
>  #     this file instead of <fenv.h>
>  #   process.h - mingw specific
>  #   sched.h - missing on MinGW
> +#   sys/sendfile.h - non-POSIX, found in glibc
>  #
>  AC_CHECK_HEADERS([complex.h fenv.h io.h libc.h limits.h memory.h process.h string.h \
>  sys/dir.h sys/ioctl.h sys/select.h \
>  sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
>  sys/utime.h time.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
> -direct.h machine/fpu.h sched.h])
> +direct.h machine/fpu.h sched.h sys/sendfile.h])
>  
>  # "complex double" is new in C99, and "complex" is only a keyword if
>  # <complex.h> is included
> @@ -744,10 +745,21 @@ AC_CHECK_HEADERS([assert.h crt_externs.h])
>  #   _NSGetEnviron - Darwin specific
>  #   strcoll_l, newlocale - GNU extensions (glibc), also available on Darwin
>  #   fork - unavailable on Windows
> -#   utimensat: posix.1-2008
> -#   sched_getaffinity, sched_setaffinity: GNU extensions (glibc)
> +#   utimensat - posix.1-2008
> +#   sched_getaffinity, sched_setaffinity - GNU extensions (glibc)
> +#   sendfile - non-POSIX, found in glibc
>  #
> -AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid fesetround ftime ftruncate fchown fchmod getcwd geteuid getsid gettimeofday gmtime_r ioctl lstat mkdir mknod nice pipe _pipe readdir_r readdir64_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l newlocale utimensat sched_getaffinity sched_setaffinity])
> +AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid		\
> +  fesetround ftime ftruncate fchown fchmod getcwd geteuid getsid	\
> +  gettimeofday gmtime_r ioctl lstat mkdir mknod nice pipe _pipe		\
> +  readdir_r readdir64_r readlink rename rmdir select setegid seteuid	\
> +  setlocale setpgid setsid sigaction siginterrupt stat64 strftime	\
> +  strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid	\
> +  strdup system usleep atexit on_exit chown link fcntl ttyname getpwent	\
> +  getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp	\
> +  index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron	\
> +  strcoll strcoll_l newlocale utimensat sched_getaffinity		\
> +  sched_setaffinity sendfile])
>  
>  AM_CONDITIONAL([HAVE_FORK], [test "x$ac_cv_func_fork" = "xyes"])
>  
> diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
> index d659cf3..ca02093 100644
> --- a/doc/ref/posix.texi
> +++ b/doc/ref/posix.texi
> @@ -803,6 +803,29 @@ Copy the file specified by @var{oldfile} to @var{newfile}.
>  The return value is unspecified.
>  @end deffn
>  
> +@deffn {Scheme Procedure} sendfile out in count [offset]
> +@deffnx {C Function} scm_sendfile (out, in, count, offset)
> +Send @var{count} bytes from @var{in} to @var{out}, both of which
> +are either open file ports or file descriptors.  When
> +@var{offset} is omitted, start reading from @var{in}'s current
> +position; otherwise, start reading at @var{offset}.
> +
> +When @var{in} is a port, it is often preferable to specify @var{offset},
> +because @var{in}'s offset as a port may be different from the offset of
> +its underlying file descriptor.
> +
> +On systems that support it, such as GNU/Linux, this procedure uses the
> +@code{sendfile} libc function, which usually corresponds to a system
> +call.  This is faster than doing a series of @code{read} and
> +@code{write} system calls.  A typical application is to send a file over
> +a socket.
> +
> +In some cases, the @code{sendfile} libc function may return
> +@code{EINVAL} or @code{ENOSYS}.  In that case, Guile's @code{sendfile}
> +procedure automatically falls back to doing a series of @code{read} and
> +@code{write} calls.
> +@end deffn
> +
>  @findex rename
>  @deffn {Scheme Procedure} rename-file oldname newname
>  @deffnx {C Function} scm_rename (oldname, newname)
> diff --git a/libguile/filesys.c b/libguile/filesys.c
> index 282ff31..097b03a 100644
> --- a/libguile/filesys.c
> +++ b/libguile/filesys.c
> @@ -98,6 +98,14 @@
>  
>  #define NAMLEN(dirent)  strlen ((dirent)->d_name)
>  
> +#ifdef HAVE_SYS_SENDFILE_H
> +# include <sys/sendfile.h>
> +#endif
> +
> +#include <full-read.h>
> +#include <full-write.h>
> +
> +
>  /* Some more definitions for the native Windows port. */
>  #ifdef __MINGW32__
>  # define fsync(fd) _commit (fd)
> @@ -1096,6 +1104,83 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
>  }
>  #undef FUNC_NAME
>  
> +SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
> +	    (SCM out, SCM in, SCM count, SCM offset),
> +	    "Send @var{count} bytes from @var{in} to @var{out}, both of which "
> +	    "are either open file ports or file descriptors.  When "
> +	    "@var{offset} is omitted, start reading from @var{in}'s current "
> +	    "position; otherwise, start reading at @var{offset}.")
> +#define FUNC_NAME s_scm_sendfile
> +{
> +#define VALIDATE_FD_OR_PORT(cvar, svar, pos)	\
> +  if (scm_is_integer (svar))			\
> +    cvar = scm_to_int (svar);			\
> +  else						\
> +    {						\
> +      SCM_VALIDATE_OPFPORT (pos, svar);		\
> +      scm_flush (svar);				\
> +      cvar = SCM_FPORT_FDES (svar);		\
> +    }
> +
> +  size_t c_count;
> +  off_t c_offset;
> +  ssize_t result;
> +  int in_fd, out_fd;
> +
> +  VALIDATE_FD_OR_PORT (out_fd, out, 1);
> +  VALIDATE_FD_OR_PORT (in_fd, in, 2);
> +  c_count = scm_to_size_t (count);
> +  c_offset = SCM_UNBNDP (offset) ? 0 : scm_to_off_t (offset);
> +
> +#ifdef HAVE_SENDFILE
> +  result = sendfile (out_fd, in_fd,
> +		     SCM_UNBNDP (offset) ? NULL : &c_offset,
> +		     c_count);
> +
> +  /* Quoting the Linux man page: "In Linux kernels before 2.6.33, out_fd
> +     must refer to a socket.  Since Linux 2.6.33 it can be any file."
> +     Fall back to read(2) and write(2) such an error happens.  */
> +  if (result < 0 && errno != EINVAL && errno != ENOSYS)
> +    SCM_SYSERROR;
> +  else if (result < 0)
> +#endif
> +  {
> +    char buf[8192];
> +    size_t left;
> +
> +    if (!SCM_UNBNDP (offset))
> +      {
> +	if (SCM_PORTP (in))
> +	  scm_seek (in, offset, scm_from_int (SEEK_SET));
> +	else
> +	  lseek_or_lseek64 (in_fd, c_offset, SEEK_SET);
> +      }
> +
> +    for (result = 0, left = c_count; result < c_count; )
> +      {
> +	size_t asked, obtained;
> +
> +	asked = SCM_MIN (sizeof buf, left);
> +	obtained = full_read (in_fd, buf, asked);
> +	if (obtained < asked)
> +	  SCM_SYSERROR;
> +
> +	left -= obtained;
> +
> +	obtained = full_write (out_fd, buf, asked);
> +	if (obtained < asked)
> +	  SCM_SYSERROR;
> +
> +	result += obtained;
> +      }
> +  }
> +
> +  return scm_from_ssize_t (result);
> +
> +#undef VALIDATE_FD_OR_PORT
> +}
> +#undef FUNC_NAME
> +
>  #endif /* HAVE_POSIX */
>  
>  \f
> diff --git a/libguile/filesys.h b/libguile/filesys.h
> index 967ce74..776b263 100644
> --- a/libguile/filesys.h
> +++ b/libguile/filesys.h
> @@ -3,7 +3,8 @@
>  #ifndef SCM_FILESYS_H
>  #define SCM_FILESYS_H
>  
> -/* Copyright (C) 1995,1997,1998,1999,2000,2001, 2006, 2008, 2009, 2010 Free Software Foundation, Inc.
> +/* Copyright (C) 1995, 1997, 1998, 1999, 2000, 2001, 2006, 2008, 2009,
> + *   2010, 2013 Free Software Foundation, Inc.
>   *
>   * This library is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU Lesser General Public License
> @@ -66,6 +67,7 @@ SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
>  SCM_API SCM scm_dirname (SCM filename);
>  SCM_API SCM scm_basename (SCM filename, SCM suffix);
>  SCM_API SCM scm_canonicalize_path (SCM path);
> +SCM_API SCM scm_sendfile (SCM out, SCM in, SCM count, SCM offset);
>  SCM_INTERNAL SCM scm_i_relativize_path (SCM path, SCM in_path);
>  
>  SCM_INTERNAL void scm_init_filesys (void);
> diff --git a/test-suite/tests/filesys.test b/test-suite/tests/filesys.test
> index a6bfb6e..c80c295 100644
> --- a/test-suite/tests/filesys.test
> +++ b/test-suite/tests/filesys.test
> @@ -1,6 +1,6 @@
>  ;;;; filesys.test --- test file system functions -*- scheme -*-
>  ;;;; 
> -;;;; Copyright (C) 2004, 2006 Free Software Foundation, Inc.
> +;;;; Copyright (C) 2004, 2006, 2013 Free Software Foundation, Inc.
>  ;;;; 
>  ;;;; This library is free software; you can redistribute it and/or
>  ;;;; modify it under the terms of the GNU Lesser General Public
> @@ -18,7 +18,10 @@
>  
>  (define-module (test-suite test-filesys)
>    #:use-module (test-suite lib)
> -  #:use-module (test-suite guile-test))
> +  #:use-module (test-suite guile-test)
> +  #:use-module (ice-9 match)
> +  #:use-module (rnrs io ports)
> +  #:use-module (rnrs bytevectors))
>  
>  (define (test-file)
>    (data-file-name "filesys-test.tmp"))
> @@ -125,5 +128,68 @@
>  	(close-port port)
>  	(eqv? 5 (stat:size st))))))
>  
> +(with-test-prefix "sendfile"
> +
> +  (pass-if "file"
> +    (let ((file (search-path %load-path "ice-9/boot-9.scm")))
> +      (call-with-input-file file
> +        (lambda (input)
> +          (let ((len (stat:size (stat input))))
> +            (call-with-output-file (test-file)
> +              (lambda (output)
> +                (sendfile output input len 0))))))
> +      (let ((ref (call-with-input-file file get-bytevector-all))
> +            (out (call-with-input-file (test-file) get-bytevector-all)))
> +        (bytevector=? ref out))))
> +
> +  (pass-if "file with offset"
> +    (let ((file (search-path %load-path "ice-9/boot-9.scm")))
> +      (call-with-input-file file
> +        (lambda (input)
> +          (let ((len (stat:size (stat input))))
> +            (call-with-output-file (test-file)
> +              (lambda (output)
> +                (sendfile output input (- len 777) 777))))))
> +      (let ((ref (call-with-input-file file
> +                   (lambda (input)
> +                     (seek input 777 SEEK_SET)
> +                     (get-bytevector-all input))))
> +            (out (call-with-input-file (test-file) get-bytevector-all)))
> +        (bytevector=? ref out))))
> +
> +  (pass-if "pipe"
> +    (let* ((file   (search-path %load-path "ice-9/boot-9.scm"))
> +           (in+out (pipe))
> +           (child  (call-with-new-thread
> +                    (lambda ()
> +                      (call-with-input-file file
> +                        (lambda (input)
> +                          (let ((len (stat:size (stat input))))
> +                            (sendfile (cdr in+out) (fileno input) len 0)
> +                            (close-port (cdr in+out)))))))))
> +      (let ((ref (call-with-input-file file get-bytevector-all))
> +            (out (get-bytevector-all (car in+out))))
> +        (close-port (car in+out))
> +        (bytevector=? ref out))))
> +
> +  (pass-if "pipe with offset"
> +    (let* ((file   (search-path %load-path "ice-9/boot-9.scm"))
> +           (in+out (pipe))
> +           (child  (call-with-new-thread
> +                    (lambda ()
> +                      (call-with-input-file file
> +                        (lambda (input)
> +                          (let ((len (stat:size (stat input))))
> +                            (sendfile (cdr in+out) (fileno input)
> +                                      (- len 777) 777)
> +                            (close-port (cdr in+out)))))))))
> +      (let ((ref (call-with-input-file file
> +                   (lambda (input)
> +                     (seek input 777 SEEK_SET)
> +                     (get-bytevector-all input))))
> +            (out (get-bytevector-all (car in+out))))
> +        (close-port (car in+out))
> +        (bytevector=? ref out)))))
> +
>  (delete-file (test-file))
>  (delete-file (test-symlink))





^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-20 22:21 [PATCH] Bindings for ‘sendfile’ Ludovic Courtès
  2013-03-21  0:17 ` Noah Lavine
  2013-03-21  3:45 ` Nala Ginrut
@ 2013-03-21  4:50 ` Mark H Weaver
  2013-03-21  5:24   ` Mark H Weaver
                     ` (2 more replies)
  2 siblings, 3 replies; 30+ messages in thread
From: Mark H Weaver @ 2013-03-21  4:50 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi Ludovic,

ludo@gnu.org (Ludovic Courtès) writes:
> I plan to commit the patch below, which adds bindings for ‘sendfile’.
>
> Comments?

Looks great to me, modulo one comment below.

I especially like the fact that although it can make use of the
non-standard Linux syscall, it works properly on all systems.  In
response to suggestions by others that we create a "linux" module:
I'd prefer to follow the good precedent set by Ludovic here.

> diff --git a/libguile/filesys.c b/libguile/filesys.c
> index 282ff31..097b03a 100644
> --- a/libguile/filesys.c
> +++ b/libguile/filesys.c
> @@ -98,6 +98,14 @@
>  
>  #define NAMLEN(dirent)  strlen ((dirent)->d_name)
>  
> +#ifdef HAVE_SYS_SENDFILE_H
> +# include <sys/sendfile.h>
> +#endif
> +
> +#include <full-read.h>
> +#include <full-write.h>

I didn't know about Gnulib's 'full-read' and 'full-write'.  Nice!
We should probably make more use of these throughout the tree.

> +SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
> +	    (SCM out, SCM in, SCM count, SCM offset),
> +	    "Send @var{count} bytes from @var{in} to @var{out}, both of which "
> +	    "are either open file ports or file descriptors.  When "
> +	    "@var{offset} is omitted, start reading from @var{in}'s current "
> +	    "position; otherwise, start reading at @var{offset}.")
> +#define FUNC_NAME s_scm_sendfile
> +{
> +#define VALIDATE_FD_OR_PORT(cvar, svar, pos)	\
> +  if (scm_is_integer (svar))			\
> +    cvar = scm_to_int (svar);			\
> +  else						\
> +    {						\
> +      SCM_VALIDATE_OPFPORT (pos, svar);		\
> +      scm_flush (svar);				\
> +      cvar = SCM_FPORT_FDES (svar);		\
> +    }
> +
> +  size_t c_count;
> +  off_t c_offset;
> +  ssize_t result;
> +  int in_fd, out_fd;
> +
> +  VALIDATE_FD_OR_PORT (out_fd, out, 1);
> +  VALIDATE_FD_OR_PORT (in_fd, in, 2);
> +  c_count = scm_to_size_t (count);

Since the code below will behave badly if 'c_count' does not fit in an
'ssize_t', we should validate here that it _does_ fit.

> +  c_offset = SCM_UNBNDP (offset) ? 0 : scm_to_off_t (offset);
> +
> +#ifdef HAVE_SENDFILE
> +  result = sendfile (out_fd, in_fd,
> +		     SCM_UNBNDP (offset) ? NULL : &c_offset,
> +		     c_count);
> +
> +  /* Quoting the Linux man page: "In Linux kernels before 2.6.33, out_fd
> +     must refer to a socket.  Since Linux 2.6.33 it can be any file."
> +     Fall back to read(2) and write(2) such an error happens.  */
> +  if (result < 0 && errno != EINVAL && errno != ENOSYS)
> +    SCM_SYSERROR;
> +  else if (result < 0)
> +#endif
> +  {
> +    char buf[8192];
> +    size_t left;
> +
> +    if (!SCM_UNBNDP (offset))
> +      {
> +	if (SCM_PORTP (in))
> +	  scm_seek (in, offset, scm_from_int (SEEK_SET));
> +	else
> +	  lseek_or_lseek64 (in_fd, c_offset, SEEK_SET);
> +      }
> +
> +    for (result = 0, left = c_count; result < c_count; )

If 'c_count's does not fit in a 'ssize_t', then this loop will go
forever and 'result' will wrap around to negative numbers and undefined
C behavior.

> +      {
> +	size_t asked, obtained;
> +
> +	asked = SCM_MIN (sizeof buf, left);
> +	obtained = full_read (in_fd, buf, asked);
> +	if (obtained < asked)
> +	  SCM_SYSERROR;
> +
> +	left -= obtained;
> +
> +	obtained = full_write (out_fd, buf, asked);
> +	if (obtained < asked)
> +	  SCM_SYSERROR;
> +
> +	result += obtained;
> +      }
> +  }
> +
> +  return scm_from_ssize_t (result);
> +
> +#undef VALIDATE_FD_OR_PORT
> +}
> +#undef FUNC_NAME
> +
>  #endif /* HAVE_POSIX */

Other than that, looks beautiful to me :)

    Thanks!
      Mark



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  4:50 ` Mark H Weaver
@ 2013-03-21  5:24   ` Mark H Weaver
  2013-03-21  9:40   ` Ludovic Courtès
  2013-04-07 21:51   ` Ludovic Courtès
  2 siblings, 0 replies; 30+ messages in thread
From: Mark H Weaver @ 2013-03-21  5:24 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Mark H Weaver <mhw@netris.org> writes:
>> +    for (result = 0, left = c_count; result < c_count; )
>
> If 'c_count's does not fit in a 'ssize_t', then this loop will go
> forever and 'result' will wrap around to negative numbers and undefined
> C behavior.

Having just consulted the relevant C standards, I see that I was wrong
to claim that the loop will spin forever.  If a signed integer is
compared with an unsigned integer of equal rank, the signed integer is
converted to unsigned first.

The latter part of my statement (about undefined behavior) was true, and
invalidates any other predictions.  If 'result' overflows (and it will
if 'c_count' does not fit in a 'ssize_t' and nothing else goes wrong)
then the behavior of the program is completely undefined.  Modern C
compilers are increasingly fond of taking advantage of that fact, as
demonstrated by the recent discovery that they were optimizing out our
overflow checks.

http://stackoverflow.com/questions/14495636/strange-multiplication-behavior-in-guile-scheme-interpreter
http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=2355f01709eadfd5350c510cdb095b4e3f71f17c

      Mark



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  0:17 ` Noah Lavine
@ 2013-03-21  9:15   ` Ludovic Courtès
  2013-03-21 10:04     ` Andrew Gaylard
  2013-03-21 15:39     ` Noah Lavine
  0 siblings, 2 replies; 30+ messages in thread
From: Ludovic Courtès @ 2013-03-21  9:15 UTC (permalink / raw)
  To: Noah Lavine; +Cc: guile-devel

Hi Noah,

Noah Lavine <noah.b.lavine@gmail.com> skribis:

> I've thought for a while that if I had time (which I know I won't) I would
> make a module called (linux) with bindings for non-POSIX Linux kernel
> features. What do you think of this idea? If so, what do you think of
> putting sendfile there and expanding it with other functions as we need
> them?

I’ve thought about it, but ended up with making sendfile work whether or
not the syscall is available (just like glibc does, after all).

So for this particular case, I’d rather keep it in the global name
space.  There’s also the untold argument that even if sendfile(2) is
unavailable, the loop written in C is going to be faster than the
equivalent bytecode.

FWIW, I plan to integrate the Linux bindings I wrote for “boot-to-Guile”
eventually:

  http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/patches/guile-linux-syscalls.patch

These are not defined in POSIX, but apart from the Linux module
syscalls, they (that is, mount(2) and the networking ioctls) happen to
be supported by glibc on all 3 kernels, and also by other libcs.  Thus,
I’d rather keep them in the global name space as well.

WDYT?

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  4:50 ` Mark H Weaver
  2013-03-21  5:24   ` Mark H Weaver
@ 2013-03-21  9:40   ` Ludovic Courtès
  2013-03-21  9:49     ` Andy Wingo
                       ` (2 more replies)
  2013-04-07 21:51   ` Ludovic Courtès
  2 siblings, 3 replies; 30+ messages in thread
From: Ludovic Courtès @ 2013-03-21  9:40 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1290 bytes --]

Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>> I plan to commit the patch below, which adds bindings for ‘sendfile’.
>>
>> Comments?
>
> Looks great to me, modulo one comment below.

Thanks for the quick review!

> I especially like the fact that although it can make use of the
> non-standard Linux syscall, it works properly on all systems.  In
> response to suggestions by others that we create a "linux" module:
> I'd prefer to follow the good precedent set by Ludovic here.

Yeah.  (And experience has shown that POSIX largely follows GNU/Linux
nowadays.)

>> +  size_t c_count;
>> +  off_t c_offset;
>> +  ssize_t result;
>> +  int in_fd, out_fd;
>> +
>> +  VALIDATE_FD_OR_PORT (out_fd, out, 1);
>> +  VALIDATE_FD_OR_PORT (in_fd, in, 2);
>> +  c_count = scm_to_size_t (count);
>
> Since the code below will behave badly if 'c_count' does not fit in an
> 'ssize_t', we should validate here that it _does_ fit.

Oops, indeed.  (Note that sendfile(2) and write(2) have that problem:
they take a size_t and return a ssize_t...)

There was also the other issue of making sure we use the right function,
depending on _FILE_OFFSET_BITS & co.

Here are the changes compared to the previous patch:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 1546 bytes --]

diff --git a/libguile/filesys.c b/libguile/filesys.c
index 097b03a..6804db9 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -102,6 +102,10 @@
 # include <sys/sendfile.h>
 #endif
 
+/* Glibc's `sendfile' function.  */
+#define sendfile_or_sendfile64			\
+  CHOOSE_LARGEFILE (sendfile, sendfile64)
+
 #include <full-read.h>
 #include <full-write.h>
 
@@ -1123,7 +1127,7 @@ SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
     }
 
   size_t c_count;
-  off_t c_offset;
+  scm_t_off c_offset;
   ssize_t result;
   int in_fd, out_fd;
 
@@ -1133,20 +1137,20 @@ SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
   c_offset = SCM_UNBNDP (offset) ? 0 : scm_to_off_t (offset);
 
 #ifdef HAVE_SENDFILE
-  result = sendfile (out_fd, in_fd,
+  result = sendfile_or_sendfile64 (out_fd, in_fd,
 				   SCM_UNBNDP (offset) ? NULL : &c_offset,
 				   c_count);
 
   /* Quoting the Linux man page: "In Linux kernels before 2.6.33, out_fd
      must refer to a socket.  Since Linux 2.6.33 it can be any file."
-     Fall back to read(2) and write(2) such an error happens.  */
+     Fall back to read(2) and write(2) when such an error occurs.  */
   if (result < 0 && errno != EINVAL && errno != ENOSYS)
     SCM_SYSERROR;
   else if (result < 0)
 #endif
   {
     char buf[8192];
-    size_t left;
+    size_t result, left;
 
     if (!SCM_UNBNDP (offset))
       {
@@ -1173,6 +1177,8 @@ SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
 
 	result += obtained;
       }
+
+    return scm_from_size_t (result);
   }
 
   return scm_from_ssize_t (result);

[-- Attachment #3: Type: text/plain, Size: 30 bytes --]


WDYT?

Thanks,
Ludo’.

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  9:40   ` Ludovic Courtès
@ 2013-03-21  9:49     ` Andy Wingo
  2013-03-21 10:10       ` Ludovic Courtès
  2013-03-22 19:22     ` Mark H Weaver
  2013-04-07 19:53     ` Ludovic Courtès
  2 siblings, 1 reply; 30+ messages in thread
From: Andy Wingo @ 2013-03-21  9:49 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Mark H Weaver, guile-devel

On Thu 21 Mar 2013 10:40, ludo@gnu.org (Ludovic Courtès) writes:

>> ludo@gnu.org (Ludovic Courtès) writes:
>>> I plan to commit the patch below, which adds bindings for
>>‘sendfile’.

Should probably go in gnulib at some point, no?  Looks good tho :)
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  9:15   ` Ludovic Courtès
@ 2013-03-21 10:04     ` Andrew Gaylard
  2013-03-21 15:39     ` Noah Lavine
  1 sibling, 0 replies; 30+ messages in thread
From: Andrew Gaylard @ 2013-03-21 10:04 UTC (permalink / raw)
  To: guile-devel

On 03/21/13 11:15, Ludovic Courtès wrote:
> Noah Lavine <noah.b.lavine@gmail.com> skribis:
>> I've thought for a while that if I had time (which I know I won't) I would
>> make a module called (linux) with bindings for non-POSIX Linux kernel
>> features. What do you think of this idea? If so, what do you think of
>> putting sendfile there and expanding it with other functions as we need
>> them?
> I’ve thought about it, but ended up with making sendfile work whether or
> not the syscall is available (just like glibc does, after all).
>
> So for this particular case, I’d rather keep it in the global name
> space.  There’s also the untold argument that even if sendfile(2) is
> unavailable, the loop written in C is going to be faster than the
> equivalent bytecode.
Just another datapoint: Solaris 10 has sendfile.  So it's not just a 
Linux feature.

-- 
Andrew



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  9:49     ` Andy Wingo
@ 2013-03-21 10:10       ` Ludovic Courtès
  0 siblings, 0 replies; 30+ messages in thread
From: Ludovic Courtès @ 2013-03-21 10:10 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Mark H Weaver, guile-devel

Andy Wingo <wingo@pobox.com> skribis:

> On Thu 21 Mar 2013 10:40, ludo@gnu.org (Ludovic Courtès) writes:
>
>>> ludo@gnu.org (Ludovic Courtès) writes:
>>>> I plan to commit the patch below, which adds bindings for
>>>‘sendfile’.
>
> Should probably go in gnulib at some point, no?

Yes, you’re right.  I’ll see what I can do.

Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  9:15   ` Ludovic Courtès
  2013-03-21 10:04     ` Andrew Gaylard
@ 2013-03-21 15:39     ` Noah Lavine
  1 sibling, 0 replies; 30+ messages in thread
From: Noah Lavine @ 2013-03-21 15:39 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1526 bytes --]

Hello,

Yes, you're completely right - making it work on all platforms is much
better than what I had proposed. I'm glad you're doing this.

Thanks,
Noah


On Thu, Mar 21, 2013 at 5:15 AM, Ludovic Courtès <ludo@gnu.org> wrote:

> Hi Noah,
>
> Noah Lavine <noah.b.lavine@gmail.com> skribis:
>
> > I've thought for a while that if I had time (which I know I won't) I
> would
> > make a module called (linux) with bindings for non-POSIX Linux kernel
> > features. What do you think of this idea? If so, what do you think of
> > putting sendfile there and expanding it with other functions as we need
> > them?
>
> I’ve thought about it, but ended up with making sendfile work whether or
> not the syscall is available (just like glibc does, after all).
>
> So for this particular case, I’d rather keep it in the global name
> space.  There’s also the untold argument that even if sendfile(2) is
> unavailable, the loop written in C is going to be faster than the
> equivalent bytecode.
>
> FWIW, I plan to integrate the Linux bindings I wrote for “boot-to-Guile”
> eventually:
>
>
> http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/patches/guile-linux-syscalls.patch
>
> These are not defined in POSIX, but apart from the Linux module
> syscalls, they (that is, mount(2) and the networking ioctls) happen to
> be supported by glibc on all 3 kernels, and also by other libcs.  Thus,
> I’d rather keep them in the global name space as well.
>
> WDYT?
>
> Thanks,
> Ludo’.
>

[-- Attachment #2: Type: text/html, Size: 2146 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  9:40   ` Ludovic Courtès
  2013-03-21  9:49     ` Andy Wingo
@ 2013-03-22 19:22     ` Mark H Weaver
  2013-03-22 21:27       ` Ludovic Courtès
  2013-04-07 19:53     ` Ludovic Courtès
  2 siblings, 1 reply; 30+ messages in thread
From: Mark H Weaver @ 2013-03-22 19:22 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> Since the code below will behave badly if 'c_count' does not fit in an
>> 'ssize_t', we should validate here that it _does_ fit.
>
> Oops, indeed.  (Note that sendfile(2) and write(2) have that problem:
> they take a size_t and return a ssize_t...)
>
> There was also the other issue of making sure we use the right function,
> depending on _FILE_OFFSET_BITS & co.
>
> Here are the changes compared to the previous patch:

[...]

> WDYT?

Looks good to me! :)

    Mark



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-22 19:22     ` Mark H Weaver
@ 2013-03-22 21:27       ` Ludovic Courtès
  2013-03-24  5:04         ` Mark H Weaver
  0 siblings, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-03-22 21:27 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

Mark H Weaver <mhw@netris.org> skribis:

> Looks good to me! :)

Thanks, committed.

Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-22 21:27       ` Ludovic Courtès
@ 2013-03-24  5:04         ` Mark H Weaver
  2013-03-25 12:55           ` Ludovic Courtès
  0 siblings, 1 reply; 30+ messages in thread
From: Mark H Weaver @ 2013-03-24  5:04 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi Ludovic,

The sendfile commit (fbac7c6113056bc6ee85996b10bdc08325c742a5) has
caused the following build failures on Hydra.

    Thanks,
      Mark


======================================================================
On GNU/Linux: (both i686-linux and x86_64-linux without threads)

<http://hydra.nixos.org/build/4463700/log/raw>
<http://hydra.nixos.org/build/4463695/log/raw>

checking sys/sendfile.h usability... yes
checking sys/sendfile.h presence... yes
checking for sys/sendfile.h... yes
[...]
checking for sendfile... yes

ERROR: filesys.test: sendfile: pipe - arguments: ((system-error #f "~A" ("Function not implemented") (38)))
ERROR: filesys.test: sendfile: pipe with offset - arguments: ((system-error #f "~A" ("Function not implemented") (38)))

======================================================================
On FreeBSD: (x86_64-freebsd without threads)

<http://hydra.nixos.org/build/4463693/log/raw>

checking sys/sendfile.h usability... no
checking sys/sendfile.h presence... no
checking for sys/sendfile.h... no
[...]
checking for sendfile... yes

filesys.c: In function 'scm_sendfile':
filesys.c:1140: warning: implicit declaration of function 'sendfile'

ERROR: filesys.test: sendfile: file - arguments: ((system-error "sendfile" "~A" ("Bad file descriptor") (9)))
ERROR: filesys.test: sendfile: file with offset - arguments: ((system-error "sendfile" "~A" ("Bad file descriptor") (9)))
ERROR: filesys.test: sendfile: pipe - arguments: ((system-error #f "~A" ("Function not implemented") (78)))
ERROR: filesys.test: sendfile: pipe with offset - arguments: ((system-error #f "~A" ("Function not implemented") (78)))

======================================================================
On Darwin: (x86_64-darwin, both with and without threads)

<http://hydra.nixos.org/build/4463702/log/raw>
<http://hydra.nixos.org/build/4463698/log/raw>

checking sys/sendfile.h usability... no
checking sys/sendfile.h presence... no
checking for sys/sendfile.h... no
[...]
checking for sendfile... yes

../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c: In function 'scm_sendfile':
../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c:1142:8: warning: passing argument 3 of 'sendfile' makes integer from pointer without a cast [enabled by default]
/usr/include/sys/socket.h:619:5: note: expected 'off_t' but argument is of type 'scm_t_off *'
../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c:1142:8: warning: passing argument 4 of 'sendfile' makes pointer from integer without a cast [enabled by default]
/usr/include/sys/socket.h:619:5: note: expected 'off_t *' but argument is of type 'size_t'
../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c:1142:8: error: too few arguments to function 'sendfile'
/usr/include/sys/socket.h:619:5: note: declared here
make[3]: *** [libguile_2.0_la-filesys.lo] Error 1



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-24  5:04         ` Mark H Weaver
@ 2013-03-25 12:55           ` Ludovic Courtès
  0 siblings, 0 replies; 30+ messages in thread
From: Ludovic Courtès @ 2013-03-25 12:55 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

Mark H Weaver <mhw@netris.org> skribis:

> The sendfile commit (fbac7c6113056bc6ee85996b10bdc08325c742a5) has
> caused the following build failures on Hydra.

Thanks for the notification.

> On GNU/Linux: (both i686-linux and x86_64-linux without threads)
>
> <http://hydra.nixos.org/build/4463700/log/raw>
> <http://hydra.nixos.org/build/4463695/log/raw>
>
> checking sys/sendfile.h usability... yes
> checking sys/sendfile.h presence... yes
> checking for sys/sendfile.h... yes
> [...]
> checking for sendfile... yes
>
> ERROR: filesys.test: sendfile: pipe - arguments: ((system-error #f "~A" ("Function not implemented") (38)))
> ERROR: filesys.test: sendfile: pipe with offset - arguments: ((system-error #f "~A" ("Function not implemented") (38)))

Oops, fixed by 45417ab.

> On FreeBSD: (x86_64-freebsd without threads)

[...]

> On Darwin: (x86_64-darwin, both with and without threads)
>
> <http://hydra.nixos.org/build/4463702/log/raw>
> <http://hydra.nixos.org/build/4463698/log/raw>
>
> checking sys/sendfile.h usability... no
> checking sys/sendfile.h presence... no
> checking for sys/sendfile.h... no
> [...]
> checking for sendfile... yes

It turns out that the BSDs have their own flavor:

  http://www.unix.com/man-page/FreeBSD/2/sendfile/
  https://developer.apple.com/library/mac/#documentation/Darwin/Reference/Manpages/man2/sendfile.2.html

We should support it, eventually, but for now commit 11ed427 just makes
sure we don’t mistakenly attempt to use it.

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  9:40   ` Ludovic Courtès
  2013-03-21  9:49     ` Andy Wingo
  2013-03-22 19:22     ` Mark H Weaver
@ 2013-04-07 19:53     ` Ludovic Courtès
  2013-04-09  8:33       ` Thien-Thi Nguyen
  2 siblings, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-07 19:53 UTC (permalink / raw)
  To: guile-devel; +Cc: Mark H. Weaver

[-- Attachment #1: Type: text/plain, Size: 701 bytes --]

There were sporadic failures of the sendfile/pipe tests on Hydra.  I
managed to reproduce them and noticed that sometimes sendfile(2) would
return 64KiB, which is much less than what we asked for.

Although the man page doesn’t mention it, this can happen when writing
to a slow or limited device, like a pipe (pretty much like write(2)).

I was hesitant whether to stick to libc behavior (return the number of
bytes sent and let the user handle it), or to handle it directly in our
‘sendfile’ procedure.  After discussion with Mark on IRC, I became
convinced that the latter is preferable, so here’s the patch.

I’ll commit it shortly if there are no objections.

Ludo’.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: the patch --]
[-- Type: text/x-patch, Size: 4354 bytes --]

From 42459c382ca512c927b6228c1557afe8e10aedae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
Date: Sun, 7 Apr 2013 21:47:48 +0200
Subject: [PATCH] Change `sendfile' to loop until everything has been sent.

* libguile/filesys.c (scm_sendfile)[HAVE_SYS_SENDFILE_H &&
  HAVE_SENDFILE]: Compare RESULT with C_COUNT.  Loop until C_COUNT bytes
  have been sent.  Return SCM_UNSPECIFIED.
* doc/ref/posix.texi (File System): Update the description.  Explain the
  new semantics.
---
 doc/ref/posix.texi |   11 +++++++++--
 libguile/filesys.c |   36 +++++++++++++++++++++++++++++-------
 2 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index 45f320f..13f3a48 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -806,9 +806,10 @@ The return value is unspecified.
 @deffn {Scheme Procedure} sendfile out in count [offset]
 @deffnx {C Function} scm_sendfile (out, in, count, offset)
 Send @var{count} bytes from @var{in} to @var{out}, both of which
-are either open file ports or file descriptors.  When
+must be either open file ports or file descriptors.  When
 @var{offset} is omitted, start reading from @var{in}'s current
-position; otherwise, start reading at @var{offset}.
+position; otherwise, start reading at @var{offset}.  The return
+value is unspecified.
 
 When @var{in} is a port, it is often preferable to specify @var{offset},
 because @var{in}'s offset as a port may be different from the offset of
@@ -824,6 +825,12 @@ In some cases, the @code{sendfile} libc function may return
 @code{EINVAL} or @code{ENOSYS}.  In that case, Guile's @code{sendfile}
 procedure automatically falls back to doing a series of @code{read} and
 @code{write} calls.
+
+In other cases, the libc function may send fewer bytes than
+@var{count}---for instance because @var{out} is a slow or limited
+device, such as a pipe.  When that happens, Guile's @code{sendfile}
+automatically retries until exactly @var{count} bytes were sent or an
+error occurs.
 @end deffn
 
 @findex rename
diff --git a/libguile/filesys.c b/libguile/filesys.c
index d318ae7..069c7ad 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -1111,9 +1111,10 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
 SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
 	    (SCM out, SCM in, SCM count, SCM offset),
 	    "Send @var{count} bytes from @var{in} to @var{out}, both of which "
-	    "are either open file ports or file descriptors.  When "
+	    "must be either open file ports or file descriptors.  When "
 	    "@var{offset} is omitted, start reading from @var{in}'s current "
-	    "position; otherwise, start reading at @var{offset}.")
+	    "position; otherwise, start reading at @var{offset}.  The return "
+	    "value is unspecified.")
 #define FUNC_NAME s_scm_sendfile
 {
 #define VALIDATE_FD_OR_PORT(cvar, svar, pos)	\
@@ -1139,9 +1140,31 @@ SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
 #if defined HAVE_SYS_SENDFILE_H && defined HAVE_SENDFILE
   /* The Linux-style sendfile(2), which is different from the BSD-style.  */
 
-  result = sendfile_or_sendfile64 (out_fd, in_fd,
-				   SCM_UNBNDP (offset) ? NULL : &c_offset,
-				   c_count);
+  {
+    size_t total;
+    off_t *offset_ptr;
+
+    offset_ptr = SCM_UNBNDP (offset) ? NULL : &c_offset;
+
+    /* On Linux, when OUT_FD is a file, everything is transferred at once and
+       RESULT == C_COUNT.  However, when OUT_FD is a pipe or other "slow"
+       device, fewer bytes may be transferred, hence the loop.  */
+    for (total = 0, result = 0; total < c_count && result >= 0; )
+      {
+	result = sendfile_or_sendfile64 (out_fd, in_fd, offset_ptr,
+					 c_count - total);
+	if (result > 0)
+	  {
+	    total += result;
+	    if (offset_ptr == NULL)
+	      offset_ptr = &c_offset;
+	    *offset_ptr = total;
+	  }
+	else if (result < 0 && errno == EINTR)
+	  /* Keep going.  */
+	  result = 0;
+      }
+  }
 
   /* Quoting the Linux man page: "In Linux kernels before 2.6.33, out_fd
      must refer to a socket.  Since Linux 2.6.33 it can be any file."
@@ -1183,10 +1206,9 @@ SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
 	result += obtained;
       }
 
-    return scm_from_size_t (result);
   }
 
-  return scm_from_ssize_t (result);
+  return SCM_UNSPECIFIED;
 
 #undef VALIDATE_FD_OR_PORT
 }
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-03-21  4:50 ` Mark H Weaver
  2013-03-21  5:24   ` Mark H Weaver
  2013-03-21  9:40   ` Ludovic Courtès
@ 2013-04-07 21:51   ` Ludovic Courtès
  2 siblings, 0 replies; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-07 21:51 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

I just pushed a new version, which reinstates the return value, and
appropriately detects EOF condition on the input (by just returning 0).

Comments welcome!

Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-07 19:53     ` Ludovic Courtès
@ 2013-04-09  8:33       ` Thien-Thi Nguyen
  2013-04-09 15:02         ` Ludovic Courtès
  0 siblings, 1 reply; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-09  8:33 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 708 bytes --]

() ludo@gnu.org (Ludovic Courtès)
() Sun, 07 Apr 2013 21:53:26 +0200

   +In other cases, the libc function may send fewer bytes than
   +@var{count}---for instance because @var{out} is a slow or limited
   +device, such as a pipe.  When that happens, Guile's @code{sendfile}
   +automatically retries until exactly @var{count} bytes were sent or an
   +error occurs.

A short write is an opportunity for the caller to Do Something Else
(i.e., go asynchronous).  I think that is more useful than internalizing
the looping.  To accomodate both usage patterns, you could leave the
low-level proc as-is (as-was :-D) and provide another proc that loops.

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-09  8:33       ` Thien-Thi Nguyen
@ 2013-04-09 15:02         ` Ludovic Courtès
  2013-04-09 20:34           ` Thien-Thi Nguyen
  0 siblings, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-09 15:02 UTC (permalink / raw)
  To: guile-devel

Hi!

Thien-Thi Nguyen <ttn@gnuvola.org> skribis:

> () ludo@gnu.org (Ludovic Courtès)
> () Sun, 07 Apr 2013 21:53:26 +0200
>
>    +In other cases, the libc function may send fewer bytes than
>    +@var{count}---for instance because @var{out} is a slow or limited
>    +device, such as a pipe.  When that happens, Guile's @code{sendfile}
>    +automatically retries until exactly @var{count} bytes were sent or an
>    +error occurs.
>
> A short write is an opportunity for the caller to Do Something Else
> (i.e., go asynchronous).  I think that is more useful than internalizing
> the looping.  To accomodate both usage patterns, you could leave the
> low-level proc as-is (as-was :-D) and provide another proc that loops.

Well, I hesitated a lot.  My initial inclination was to do what you say,
and let users handle the looping.

After a long discussion on IRC was Mark, I was mostly convinced that
leaving that to users is questionable.

First, because one obviously expects the procedure to send the file, not
just part of it.  Second, those who did not forget to implement the loop
can easily get it wrong.  Third, if you really want to interleave I/O
and computation, you probably don’t want to use sendfile(2) in the first
place.  Fourth, ‘write’, ‘put-bytevector’, etc. also hide short writes.

My 4¢,
Ludo’.




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-09 15:02         ` Ludovic Courtès
@ 2013-04-09 20:34           ` Thien-Thi Nguyen
  2013-04-10 11:26             ` Mark H Weaver
  2013-04-10 20:56             ` Ludovic Courtès
  0 siblings, 2 replies; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-09 20:34 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1863 bytes --]

() ludo@gnu.org (Ludovic Courtès)
() Tue, 09 Apr 2013 17:02:03 +0200

   After a long discussion on IRC was Mark, I was mostly convinced that
   leaving that to users is questionable.

User behavior in the wild is always questionable.  :-D

   First, because one obviously expects the procedure to send the file,
   not just part of it.  Second, those who did not forget to implement
   the loop can easily get it wrong.  Third, if you really want to
   interleave I/O and computation, you probably don’t want to use
   sendfile(2) in the first place.  Fourth, ‘write’, ‘put-bytevector’,
   etc. also hide short writes.

Yes, the overall intent is to send all the file.  However, it's not wise
to assume caller doesn't want to keep stats (for example) on how much of
the file is sent per chunk.  That is not a heavy computation, but still
quite valid.  (Imagine an adaptive server that has write access to the
kernel network and virtual memory subsystem knobs.)

The same thinking applies to any internalized looping.  The design
principle to pursue is that the lowest level should leave the most
control to the caller, w/o losing the essence of the functionality.
[Insert quote from that wily programmer, A. Einstein, here. :-D]
[Insert analogous "bufferbloat" problem, here.]

(Of course, higher-level convenience interfaces are always welcome, once
the possibility of control (by caller) is guaranteed.)

Another way to think about it is: A ‘sendfile/all’ can be implemented in
terms of a ‘sendfile/non-looping’ but not the other way around.

So overall, i think hiding partial i/o is a mistake.  This is just one
instance of that, it seems (‘write’ and ‘put-bytevector’ being others).
Unlike the others, however, there is still opportunity for change.

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-09 20:34           ` Thien-Thi Nguyen
@ 2013-04-10 11:26             ` Mark H Weaver
  2013-04-14 10:48               ` Thien-Thi Nguyen
  2013-04-10 20:56             ` Ludovic Courtès
  1 sibling, 1 reply; 30+ messages in thread
From: Mark H Weaver @ 2013-04-10 11:26 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: guile-devel

Thien-Thi Nguyen <ttn@gnuvola.org> writes:

> Another way to think about it is: A ‘sendfile/all’ can be implemented in
> terms of a ‘sendfile/non-looping’ but not the other way around.
>
> So overall, i think hiding partial i/o is a mistake.  This is just one
> instance of that, it seems (‘write’ and ‘put-bytevector’ being others).
> Unlike the others, however, there is still opportunity for change.

I acknowledge that it might be useful to add a lower-level interface for
'sendfile', perhaps called 'sendfile-some', analogous to the R6RS
'get-bytevector-some'.  We can still do that.  However, I continue to
believe that 'sendfile' should do what its name implies, and what 99% of
users will want and expect.

The alternative would most likely result in the vast majority of code
written using 'sendfile' to be sporadically unreliable.  Even Ludovic
had trouble getting the loop right, and we all know that he's an
exceptionally talented and careful hacker.

Regarding the proposed low-level interface (which I will call
'sendfile-some' for now), I have a question: can it actually be used to
write a robust asynchronous server?  Is it guaranteed to never block for
more than a short time?  I don't know the answer.  If the answer is
"no", then it seems to me that this would eliminate the most compelling
reason for a 'sendfile-some' API.

Or how about the other potential use case you gave: to keep stats on how
much is sent per "chunk".  What is the meaning of "chunk"?  If so, is
sendfile(2) guaranteed to return once for each chunk?  If not, then what
is the meaning of the statistics you would gather?

     Regards,
       Mark



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-09 20:34           ` Thien-Thi Nguyen
  2013-04-10 11:26             ` Mark H Weaver
@ 2013-04-10 20:56             ` Ludovic Courtès
  2013-04-14 10:52               ` Thien-Thi Nguyen
  1 sibling, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-10 20:56 UTC (permalink / raw)
  To: guile-devel

Thien-Thi Nguyen <ttn@gnuvola.org> skribis:

> So overall, i think hiding partial i/o is a mistake.

Well, I sympathize with the idea of sticking to the underlying syscall
semantics; yet, for my own uses of ‘sendfile’, I can only think of cases
all I want is to send the file.

(Besides, it seems to be unusual to get ‘sendfile’ to send less than
asked to.  The ‘pipe’ example mentioned in the manual would show up at
most once every 10 iterations with our test case.)

Ludo’.




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-10 11:26             ` Mark H Weaver
@ 2013-04-14 10:48               ` Thien-Thi Nguyen
  2013-04-16 16:31                 ` Ludovic Courtès
  0 siblings, 1 reply; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-14 10:48 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 2107 bytes --]

() Mark H Weaver <mhw@netris.org>
() Wed, 10 Apr 2013 07:26:32 -0400

   Regarding the proposed low-level interface (which I will call
   'sendfile-some' for now), I have a question: can it actually be used
   to write a robust asynchronous server?

I can't imagine why not, although i certainly don't pretend to be an
expert on async server programming.  My experience is mostly on the
client side, w/ Emacs as the front-end.  I would be surprised if
sendfile(2) is NOT used in robust asynchronous servers written in C.

   Is it guaranteed to never block for more than a short time?  I don't
   know the answer.

Me neither, although i think this depends on ‘O_NONBLOCK’, which is
established on open(2).

   If the answer is "no", then it seems to me that this would eliminate
   the most compelling reason for a 'sendfile-some' API.

Why?

   Or how about the other potential use case you gave: to keep stats on
   how much is sent per "chunk".  What is the meaning of "chunk"?  If
   so, is sendfile(2) guaranteed to return once for each chunk?  If not,
   then what is the meaning of the statistics you would gather?

I'd rather not get into particulars in this line of discussion, if you
don't mind, for lack of time.  Instead, i'll just ask you to try to
imagine a robust P2P architecture (every node as both client and server)
that does NOT care about flow control, and spew some philosophy here
(feel free to ignore)...

My reading of sendfile(2) is that it does its best to send as much as
possible, but does not guarantee sending everything.  What it does
succeed in sending, it reports to the caller.  The caller loops as
desired, after evaluating (in some caller-meaningful way) the returned
information.

All i desire is the semantics of a Scheme ‘sendfile’ not deviate from
that of the syscall sendfile(2); i judge not "implication of the name
for the statistical majority", only fidelity, when it comes to syscalls.

Control, not coddling, please -- why should C programmers have all the fun?

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-10 20:56             ` Ludovic Courtès
@ 2013-04-14 10:52               ` Thien-Thi Nguyen
  0 siblings, 0 replies; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-14 10:52 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 670 bytes --]

() ludo@gnu.org (Ludovic Courtès)
() Wed, 10 Apr 2013 22:56:38 +0200

   Well, I sympathize with the idea of sticking to the underlying
   syscall semantics; yet, for my own uses of ‘sendfile’, I can only
   think of cases all I want is to send the file.

I understand; it's easy to project one's limitations into the code.

   (Besides, it seems to be unusual to get ‘sendfile’ to send less than
   asked to.  The ‘pipe’ example mentioned in the manual would show up
   at most once every 10 iterations with our test case.)

API design is the art of making possible both the usual and the unusual.

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-14 10:48               ` Thien-Thi Nguyen
@ 2013-04-16 16:31                 ` Ludovic Courtès
  2013-04-16 20:14                   ` Thien-Thi Nguyen
  0 siblings, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-16 16:31 UTC (permalink / raw)
  To: guile-devel

Thien-Thi Nguyen <ttn@gnuvola.org> skribis:

> My reading of sendfile(2) is that it does its best to send as much as
> possible, but does not guarantee sending everything.  What it does
> succeed in sending, it reports to the caller.  The caller loops as
> desired, after evaluating (in some caller-meaningful way) the returned
> information.

Just a note, this information is definitely not conveyed by the man page:

   sendfile()  copies  data  between one file descriptor and another.  

   [...]

   count is the number of bytes to copy between the file descriptors.

Of course the return type hints at a write(2)-like interface, but the
truth is that short writes had to be evidenced by experimental
validation, so to speak.

Ludo’.




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-16 16:31                 ` Ludovic Courtès
@ 2013-04-16 20:14                   ` Thien-Thi Nguyen
  2013-04-17 12:52                     ` Ludovic Courtès
  0 siblings, 1 reply; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-16 20:14 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 2334 bytes --]

() ludo@gnu.org (Ludovic Courtès)
() Tue, 16 Apr 2013 18:31:00 +0200

   Thien-Thi Nguyen <ttn@gnuvola.org> skribis:

   > My reading of sendfile(2) is that it does its best to send as much
   > as possible, but does not guarantee sending everything.  What it
   > does succeed in sending, it reports to the caller.  The caller
   > loops as desired, after evaluating (in some caller-meaningful way)
   > the returned information.

   Just a note, this information is definitely not conveyed by the man page:

      sendfile()  copies  data  between one file descriptor and another.  

      [...]

      count is the number of bytes to copy between the file descriptors.

Yes, that part is what you tell (request of) the OS, not what the OS
tells (reports to) you.  I think the manpage is not remiss here.

   Of course the return type hints at a write(2)-like interface, but the
   truth is that short writes had to be evidenced by experimental
   validation, so to speak.

Well, i don't know what the truth is, but certainly it can be easy to
misinterpret the sendfile(2) manpage if you don't take all of it into
account.  The major hint is actually:

       If offset is not NULL, then it points to a variable holding the
       file offset from which sendfile() will start reading data from
       in_fd.  When sendfile() returns, this variable will be set to the
       offset of the byte following the last byte that was read.

That is basically another way of saying that the syscall "does its best
to send as much as possible, but does not guarantee sending everything".

This is also conveyed in the general context of system programming,
where an OS that guarantees a single-syscall full write of a possibly
multi-gigabyte file is an OS that can be easily bludgeoned to death by
mischievous (or simply misbehaving) userland programs, the eventual
consequence being OS writers tend to avoid making such guarantees
(especially for i/o, where the Real World is known to be Real Weird).

Anyway, i hope Guile grows a ‘sendfile-some’ (or whatever) so that i can
adapt my programs to use it instead of the ttn-do ‘sendfile’:

http://www.gnuvola.org/software/ttn-do/ttn-do.html.gz#zz-sys-linux-gnu

Until then, happy hacking!

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-16 20:14                   ` Thien-Thi Nguyen
@ 2013-04-17 12:52                     ` Ludovic Courtès
  2013-04-17 15:18                       ` Thien-Thi Nguyen
  0 siblings, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-17 12:52 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: guile-devel

Thien-Thi Nguyen <ttn@gnuvola.org> skribis:

> Anyway, i hope Guile grows a ‘sendfile-some’ (or whatever) so that i can
> adapt my programs to use it instead of the ttn-do ‘sendfile’:

I’m open to the idea.  To get a better understanding, do you have
examples of real applications where it makes a difference?  I.e.,
concrete cases of recurring short writes?

Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-17 12:52                     ` Ludovic Courtès
@ 2013-04-17 15:18                       ` Thien-Thi Nguyen
  2013-04-17 16:27                         ` Ludovic Courtès
  0 siblings, 1 reply; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-17 15:18 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1080 bytes --]

() ludo@gnu.org (Ludovic Courtès)
() Wed, 17 Apr 2013 14:52:40 +0200

   concrete cases of recurring short writes?

The original application was "ttn-do serve-debiso", which configures
"ttn-do sizzweb" to serve debian ISOs (or rather, the various component
.deb files and metadata that apt-get requests from the loopback-mounted
ISO) over the home network.

Specifically, proc ‘prep-file-transfer’ in module ‘(ttn-do sizzweb)’
calls ‘sendfile’ in a loop.  Admittedly, that code could be replaced by
‘sendfile/never-fewer’, as the retval is not used in a particularly
intelligent way, but anyway, the point is, at time of writing and early
testing *sans looping*, apt-get on the other computer would stall and
eventually die, complaining.  That's when i learned to read and think
about sendfile(2) more carefully than i had before.

It would be nice to point to a ChangeLog entry that embodies this "ah
ha!" moment, but as it turned out, the testing was done before commit.
Oh well, life goes on...

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-17 15:18                       ` Thien-Thi Nguyen
@ 2013-04-17 16:27                         ` Ludovic Courtès
  2013-04-18  6:58                           ` Thien-Thi Nguyen
  0 siblings, 1 reply; 30+ messages in thread
From: Ludovic Courtès @ 2013-04-17 16:27 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: guile-devel

Thien-Thi Nguyen <ttn@gnuvola.org> skribis:

> The original application was "ttn-do serve-debiso", which configures
> "ttn-do sizzweb" to serve debian ISOs (or rather, the various component
> .deb files and metadata that apt-get requests from the loopback-mounted
> ISO) over the home network.
>
> Specifically, proc ‘prep-file-transfer’ in module ‘(ttn-do sizzweb)’
> calls ‘sendfile’ in a loop.  Admittedly, that code could be replaced by
> ‘sendfile/never-fewer’, as the retval is not used in a particularly
> intelligent way, but anyway, the point is, at time of writing and early
> testing *sans looping*, apt-get on the other computer would stall and
> eventually die, complaining.  That's when i learned to read and think
> about sendfile(2) more carefully than i had before.

So that was over a PF_INET/SOCK_STREAM socket, right?

Would it occur systematically, or just once in a while?

Ludo’.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Bindings for ‘sendfile’
  2013-04-17 16:27                         ` Ludovic Courtès
@ 2013-04-18  6:58                           ` Thien-Thi Nguyen
  0 siblings, 0 replies; 30+ messages in thread
From: Thien-Thi Nguyen @ 2013-04-18  6:58 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 665 bytes --]

() ludo@gnu.org (Ludovic Courtès)
() Wed, 17 Apr 2013 18:27:04 +0200

   So that was over a PF_INET/SOCK_STREAM socket, right?

Yes.  FWIW, "ttn-do sizzweb" can also operate over PF_UNIX/SOCK_STREAM
and i recall doing some testing (w/ similar results) there.

   Would it occur systematically, or just once in a while?

Sorry, i can't recall precisely.  My vague memory is that on the client
(apt-get) side, i saw some packages download completely, no problem,
about 20% of the time.  I suppose it would be good to collect some hard
data, draw graphs, tweak kernel knobs, publish papers, etc.  Some day...

-- 
Thien-Thi Nguyen
GPG key: 4C807502

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2013-04-18  6:58 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-20 22:21 [PATCH] Bindings for ‘sendfile’ Ludovic Courtès
2013-03-21  0:17 ` Noah Lavine
2013-03-21  9:15   ` Ludovic Courtès
2013-03-21 10:04     ` Andrew Gaylard
2013-03-21 15:39     ` Noah Lavine
2013-03-21  3:45 ` Nala Ginrut
2013-03-21  4:50 ` Mark H Weaver
2013-03-21  5:24   ` Mark H Weaver
2013-03-21  9:40   ` Ludovic Courtès
2013-03-21  9:49     ` Andy Wingo
2013-03-21 10:10       ` Ludovic Courtès
2013-03-22 19:22     ` Mark H Weaver
2013-03-22 21:27       ` Ludovic Courtès
2013-03-24  5:04         ` Mark H Weaver
2013-03-25 12:55           ` Ludovic Courtès
2013-04-07 19:53     ` Ludovic Courtès
2013-04-09  8:33       ` Thien-Thi Nguyen
2013-04-09 15:02         ` Ludovic Courtès
2013-04-09 20:34           ` Thien-Thi Nguyen
2013-04-10 11:26             ` Mark H Weaver
2013-04-14 10:48               ` Thien-Thi Nguyen
2013-04-16 16:31                 ` Ludovic Courtès
2013-04-16 20:14                   ` Thien-Thi Nguyen
2013-04-17 12:52                     ` Ludovic Courtès
2013-04-17 15:18                       ` Thien-Thi Nguyen
2013-04-17 16:27                         ` Ludovic Courtès
2013-04-18  6:58                           ` Thien-Thi Nguyen
2013-04-10 20:56             ` Ludovic Courtès
2013-04-14 10:52               ` Thien-Thi Nguyen
2013-04-07 21:51   ` Ludovic Courtès

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).