From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Neil Jerram Newsgroups: gmane.lisp.guile.bugs Subject: Re: windows sockets vs. file descriptors bugs in Guile Date: Sat, 27 Mar 2010 22:06:30 +0000 Message-ID: <87634hh121.fsf@ossau.uklinux.net> References: <4AC42CD2.2060507@coverity.com> <87iqexbf3e.fsf@ossau.uklinux.net> <4AC694B6.2030900@coverity.com> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: dough.gmane.org 1269729157 25335 80.91.229.12 (27 Mar 2010 22:32:37 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Sat, 27 Mar 2010 22:32:37 +0000 (UTC) Cc: bug-guile@gnu.org To: Scott McPeak Original-X-From: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Sat Mar 27 23:32:33 2010 Return-path: Envelope-to: guile-bugs@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1NveYS-0005vj-L2 for guile-bugs@m.gmane.org; Sat, 27 Mar 2010 23:32:33 +0100 Original-Received: from localhost ([127.0.0.1]:48979 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NveT2-0004jK-31 for guile-bugs@m.gmane.org; Sat, 27 Mar 2010 18:26:56 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NveKA-0002j5-2w for bug-guile@gnu.org; Sat, 27 Mar 2010 18:17:46 -0400 Original-Received: from [140.186.70.92] (port=37483 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NveK7-0002iF-PP for bug-guile@gnu.org; Sat, 27 Mar 2010 18:17:45 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1NveK5-00085w-43 for bug-guile@gnu.org; Sat, 27 Mar 2010 18:17:43 -0400 Original-Received: from mail3.uklinux.net ([80.84.72.33]:36752) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1NveK4-00082u-NE for bug-guile@gnu.org; Sat, 27 Mar 2010 18:17:41 -0400 Original-Received: from arudy (host86-182-154-126.range86-182.btcentralplus.com [86.182.154.126]) by mail3.uklinux.net (Postfix) with ESMTP id 46FF51F6864; Sat, 27 Mar 2010 22:17:03 +0000 (GMT) Original-Received: from arudy (arudy [127.0.0.1]) by arudy (Postfix) with ESMTP id 3F5733801F; Sat, 27 Mar 2010 22:06:31 +0000 (GMT) In-Reply-To: <4AC694B6.2030900@coverity.com> (Scott McPeak's message of "Fri, 02 Oct 2009 17:03:02 -0700") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4-2.6 X-BeenThere: bug-guile@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Errors-To: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.bugs:4536 Archived-At: --=-=-= Scott McPeak writes: >> I've attached my patch for these. It's a bit simpler than yours, and >> also avoids needing copyright assignment from you. Can you take a look, >> see if you notice any disadvantages compared with your version, and if >> possible try it out? > > Except for the modifications to 'scm_std_select', the two patches are > nearly equivalent. Hi Scott, First, I'm sorry for leaving this hanging for so long. I'd like to finalise this, first for 1.8.x and then for master, and the patch that I now propose for 1.8.x is attached. --=-=-= Content-Type: text/x-diff Content-Disposition: attachment; filename=0001-Fix-windows-sockets-vs.-file-descriptors-bugs.patch >From 56a69318b55f0a231cf1531d2b7035c994df6579 Mon Sep 17 00:00:00 2001 From: Neil Jerram Date: Sat, 27 Mar 2010 21:41:54 +0000 Subject: [PATCH] Fix windows sockets vs. file descriptors bugs Patch is from Scott McPeak, modified by me to be as conservative as possible with respect to existing 1.8.x non-MinGW operation. Scott writes: "Guile-1.8.7 appears to have socket support and Windows support, but they don't seem to work together, despite the existence of things like win32-socket.c. (I'm building Guile for Windows using the mingw cross compiler on linux. Maybe it's different with Cygwin?) Specifically, Guile assumes the POSIX rule that a socket is just a file descriptor, but on Windows that does not work; socket functions only accept sockets, and file functions only accept file descriptors. Several Guile functions are affected; I don't think this is an exhaustive list, but it's what I ran into while trying to get a simple client and server working (see testcase below): * fport_fill_input: Passes a socket to 'read'. * write_all: Passes a socket to 'write'. * fport_close: Passes a socket to 'close'. The EBADF error message is then silently discarded (...), but the bug still manifests, e.g., as a server that never closes its connections. * scm_std_select: Passes a pipe file descriptor to 'select'. I'm not sure what the best solution is in the Guile framework. I see that ports have some sort of dynamic dispatch table, so perhaps sockets should be made a distinct kind of port from a file port using that mechanism. However, for expedience, I basically hacked the file port functions by recognizing sockets as ports with a SCM_FILENAME that is sym_socket and handling them differently. However, that is certainly not perfect since 'set-port-filename!' can be used to change the file name after creation. Since there should be no harm in calling the socket-specific functions for socket file descriptors on any operating system, and differences in behavior among platforms make good hiding places for bugs, my patch makes most of the hacks regardless of the platform. I've tested it on linux/x86 and win32/x86. For 'select', I don't understand the purpose of the sleep pipe, and, whatever it does, it would likely be a lot of work to code it in a Windows-compatible way, so I just removed it on Windows." Regarding the sleep pipe, as I've written in a comment in threads.c... The consequence of not implementing the sleep pipe (on MinGW) is that threads cannot receive signals, or more generally any asyncs, while they are blocked waiting for I/O (select etc.), or for a mutex or condition variable. But that is better than those operations (i.e. I/O, mutexes and condition variables) not working at all! * libguile/fports.c: Include libguile/socket.h. (fport_fill_input): On MinGW, use scm_is_socket_port and scm_socket_read_via_recv. (write_all): On MinGW, use scm_is_socket_port and scm_socket_write_via_send. (fport_close): On MinGW, use scm_is_socket_port and scm_socket_close. * libguile/socket.c (scm_is_socket_port, scm_socket_read_via_recv, scm_socket_write_via_send, scm_socket_close): New functions for MinGW. * libguile/socket.h: Corresponding declarations. * libguile/threads.c (scm_std_select): On MinGW, suppress all the bits to do with the sleep pipe. --- libguile/fports.c | 29 +++++++++++++++++++++++++---- libguile/socket.c | 31 +++++++++++++++++++++++++++++++ libguile/socket.h | 9 +++++++++ libguile/threads.c | 29 ++++++++++++++++++++++++++++- 4 files changed, 93 insertions(+), 5 deletions(-) diff --git a/libguile/fports.c b/libguile/fports.c index 007ee3f..c047a4e 100644 --- a/libguile/fports.c +++ b/libguile/fports.c @@ -26,6 +26,7 @@ #include #include #include "libguile/_scm.h" +#include "libguile/socket.h" #include "libguile/strings.h" #include "libguile/validate.h" #include "libguile/gc.h" @@ -596,8 +597,14 @@ fport_fill_input (SCM port) #ifndef __MINGW32__ fport_wait_for_input (port); -#endif /* !__MINGW32__ */ - SCM_SYSCALL (count = read (fp->fdes, pt->read_buf, pt->read_buf_size)); +#else /* __MINGW32__ */ + if (scm_is_socket_port (port)) + SCM_SYSCALL (count = + scm_socket_read_via_recv (fp->fdes, pt->read_buf, pt->read_buf_size)); + else +#endif /* __MINGW32__ */ + SCM_SYSCALL (count = read (fp->fdes, pt->read_buf, pt->read_buf_size)); + if (count == -1) scm_syserror ("fport_fill_input"); if (count == 0) @@ -717,12 +724,20 @@ scm_i_fport_truncate (SCM port, SCM length) static void write_all (SCM port, const void *data, size_t remaining) { int fdes = SCM_FSTREAM (port)->fdes; +#ifdef __MINGW32__ + int is_socket = scm_is_socket_port (port); +#endif while (remaining > 0) { size_t done; - SCM_SYSCALL (done = write (fdes, data, remaining)); +#ifdef __MINGW32__ + if (is_socket) + SCM_SYSCALL (done = scm_socket_write_via_send (fdes, data, remaining)); + else +#endif + SCM_SYSCALL (done = write (fdes, data, remaining)); if (done == -1) SCM_SYSERROR; @@ -880,7 +895,13 @@ fport_close (SCM port) int rv; fport_flush (port); - SCM_SYSCALL (rv = close (fp->fdes)); +#ifdef __MINGW32__ + if (scm_is_socket_port (port)) + SCM_SYSCALL (rv = scm_socket_close (fp->fdes)); + else +#endif + SCM_SYSCALL (rv = close (fp->fdes)); + if (rv == -1 && errno != EBADF) { if (scm_gc_running_p) diff --git a/libguile/socket.c b/libguile/socket.c index cb954f4..b41f839 100644 --- a/libguile/socket.c +++ b/libguile/socket.c @@ -1642,7 +1642,38 @@ SCM_DEFINE (scm_sendto, "sendto", 3, 1, 1, #undef FUNC_NAME +#ifdef __MINGW32__ +/* The functions in this section support using sockets on Windows, + * where the file descriptor functions cannot be used on sockets. */ + +/* Return true if 'port' is a socket port. */ +int scm_is_socket_port(SCM port) +{ + return SCM_FILENAME (port) == sym_socket; +} + +/* Do what POSIX 'read' system call would do, except for a file + * descriptor that is a socket. */ +int scm_socket_read_via_recv(int fd, void *buf, size_t len) +{ + return recv (fd, buf, len, 0 /*flags*/); +} + +/* Like 'write' but for a socket. */ +int scm_socket_write_via_send(int fd, void const *buf, size_t len) +{ + return send (fd, buf, len, 0 /*flags*/); +} + +/* Like 'close' but for a socket. */ +int scm_socket_close(int fd) +{ + return closesocket(fd); +} +#endif + + void scm_init_socket () { diff --git a/libguile/socket.h b/libguile/socket.h index 146d283..c8213eb 100644 --- a/libguile/socket.h +++ b/libguile/socket.h @@ -64,6 +64,15 @@ SCM_API struct sockaddr *scm_c_make_socket_address (SCM family, SCM address, size_t *address_size); SCM_API SCM scm_make_socket_address (SCM family, SCM address, SCM args); +#ifdef __MINGW32__ +/* Socket functions on file descriptors, exported from socket.c so + * that fports.c does not have to know about system socket headers. */ +SCM_API int scm_is_socket_port(SCM port); +SCM_API int scm_socket_read_via_recv(int fd, void *buf, size_t len); +SCM_API int scm_socket_write_via_send(int fd, void const *buf, size_t len); +SCM_API int scm_socket_close(int fd); +#endif + #endif /* SCM_SOCKET_H */ /* diff --git a/libguile/threads.c b/libguile/threads.c index f2bb556..06aeb3c 100644 --- a/libguile/threads.c +++ b/libguile/threads.c @@ -1419,6 +1419,15 @@ scm_threads_mark_stacks (void) /*** Select */ +/* On Windows, we cannot use 'select' with non-socket file + * descriptors, so we cannot use the sleep pipe. The consequence of + * this is that threads cannot receive signals, or more generally any + * asyncs, while they are blocked waiting for I/O (select etc.), or + * for a mutex or condition variable. But that is better than those + * operations (i.e. I/O, mutexes and condition variables) not working + * at all! + */ + int scm_std_select (int nfds, SELECT_TYPE *readfds, @@ -1437,19 +1446,35 @@ scm_std_select (int nfds, readfds = &my_readfds; } +#ifndef __MINGW32__ while (scm_i_setup_sleep (t, SCM_BOOL_F, NULL, t->sleep_pipe[1])) SCM_TICK; wakeup_fd = t->sleep_pipe[0]; +#endif + ticket = scm_leave_guile (); + +#ifndef __MINGW32__ FD_SET (wakeup_fd, readfds); if (wakeup_fd >= nfds) nfds = wakeup_fd+1; +#endif + res = select (nfds, readfds, writefds, exceptfds, timeout); - t->sleep_fd = -1; eno = errno; + +#ifndef __MINGW32__ + /* XXX: Is it important to set 'sleep_fd' before re-entering Guile + * mode? If not, then this assignment should move down into the + * next 'if' block for simplicity. But in that case, there is no + * point since 'scm_i_reset_sleep' also sets the field to -1. */ + t->sleep_fd = -1; +#endif + scm_enter_guile (ticket); +#ifndef __MINGW32__ scm_i_reset_sleep (t); if (res > 0 && FD_ISSET (wakeup_fd, readfds)) @@ -1467,6 +1492,8 @@ scm_std_select (int nfds, res = -1; } } +#endif + errno = eno; return res; } -- 1.5.6.5 --=-=-= Most of that patch is yours, so we will need a copyright assignment from you. Please say if you need help finding and submitting the papers for that (or if it's a problem). For master I expect the changes will be different, as I think there is some support in Gnulib that we can use. I'll work on that shortly. In response to your points about the differences between your patch and a similar one that I had made... > The differences are: > > * I exported 'scm_is_socket_port()' rather than proliferate the hack > of checking for equality with 'sym_socket'. That seemed like a good > idea because I know the hack is buggy: it is defeated by > 'set-port-filename!'. Good point; agreed. > * I avoided calling socket functions from fports.c, since translation > units that use socket functions typically have to do a lot of > gyrations at the beginning to pull in the right headers on the right > systems. See the top of socket.c. Since fports.c does not have all > that, there is a strong chance that it wouldn't compile on some > systems, if it were not for the next point. Again, good point, thanks. > * I made changes that affect all systems, rather than using #ifdefs > to make it Windows-only. As indicated in the comments, there are two > reasons for this: > > a. Syntax: Code is far less readable when sprinkled with #ifdefs. > I've had the displeasure of reading and maintaining code with heavy > #ifdefs, and it's not fun. Guile is currently relatively free of such > hacks, because it does a good job of factoring system-specific > functionality so it's only exposed to a limited set of modules, and I > was trying to keep it that way. > > b. Semantics: Whenever a program explicitly does something different > depending on the platform, that's an invitation to a platform-specific > bug. If indeed it is correct to call to 'recv(_,_,_,0)' on a socket > instead of 'read(_,_,_)' (an equivalence POSIX guarantees) on Windows, > then it should be correct to do so on every platform. And if it's > wrong or inefficient on some platform, we'd like to know ASAP, right? > > These differences are mainly stylistic and future-proofing. You're > certainly free to ignore them, I just wanted to elaborate on the > reasons in case they weren't clear. Those are good arguments, and for master (i.e. current and future development) I'm happy to accept them (in principle; as already noted, I expect the exact changes to be different). For 1.8.x, though, I'd prefer to make this change in a way that minimises any possible risk to the existing 1.8.x code, which is already quite well tested on a few platforms. Hence in the attached patch there are quite a lot of #ifdefs; I don't think we need to worry too much about those (and their maintainability), because this isn't the codebase of the future. One particular thing is that I didn't like your uses of if (SELECT_USE_SLEEP_PIPE) In fact I'm sure this will cause "conditional expression is constant" warnings with several compilers. I've just used more #ifdefs instead. >>> * scm_std_select: Passes a pipe file descriptor to 'select'. >> >> I have no record of hitting this one; I suspect my code at the time just >> didn't use `sleep'. I'm wondering if we still need scm_std_select in >> Guile now anyway. I'll write again about that. > > As for the change to 'scm_std_select', without that change, the code > does not work. Using my original testcase (plus a couple of calls to > 'flush-all-ports', necessary to see the "listening" message, that I > inadvertently omitted in my original post), Guile-1.8.7 with my > changes removed and yours added instead says: > > > ./guile.exe -e main -s testsockets.scm server 8888 > listening on port 8888 > ERROR: In procedure accept: > ERROR: Socket operation on non-socket > (exit code of 1) > > The reason for this is that 'scm_accept' calls 'scm_std_select', which > sticks a pipe file descriptor into the set of file descriptors to be > checked for reading ('readfds'), which Windows does not like. My > program never calls 'sleep'--Guile adds the sleep pipe > unconditionally. All makes sense; thanks for explaining. (I can't remember what I had in mind when I said above that we might not need scm_std_select any more.) > Once I add just my changes to threads.c on top of your changes to > socket.c, the test case works perfectly. > > While the changes to 'scm_std_select' are certainly the ugliest of > those I proposed, some sort of change is required in order to get > socket server processes to work in Guile-1.8.7 on Windows, as my test > case shows. Yes. I added a note in the patch about the consequence of not implementing the sleep pipe - and the upshot is that I agree with you that this is a useful compromise. Please let me know if you have any comments on that. Regards, Neil --=-=-=--