* bug#22789: 25.1.50; In last master build https connections stop working
@ 2016-02-24 10:26 José L. Doménech
  2016-02-24 14:00 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 124+ messages in thread
From: José L. Doménech @ 2016-02-24 10:26 UTC (permalink / raw)
  To: 22789

In emacs, built from master, https connections have stopped working, while
imaps connections are still working.

How to reproduce it:
emacs -Q
eww https://www.fsf.org

The eww buffer remains blank. The "eww http://www.fsf.org" command shows the page.

From my mail client (wanderlust), trying to retrieve rss sources with a https
connection fails, but I can access http sources and imaps folders.

Thanks.

In GNU Emacs 25.1.50.4 (x86_64-w64-mingw32)
 of 2016-02-24 built on LENOVO-PC
Repository revision: 378d138e64e9389e277e95528c143dc2456727a5
Windowing system distributor 'Microsoft Corp.', version 6.3.9600
Configured using:
 'configure --without-imagemagick
 PKG_CONFIG_PATH=/mingw64/lib/pkgconfig'

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND NOTIFY ACL GNUTLS LIBXML2 ZLIB
TOOLKIT_SCROLL_BARS

Important settings:
  value of $LANG: es_ES.UTF-8
  locale-coding-system: cp1252

Major mode: eww

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Contacting host: www.fsf.org:443

Load-path shadows:
None found.
Features: (shadow sort mail-extr emacsbug message dired dired-loaddefs rfc822 mml mml-sec epa derived epg epg-config mm-decode mm-bodies mm-encode mailabbrev gmm-utils mailheader sendmail network-stream nsm starttls url-http tls gnutls mail-parse rfc2231 url-gw url-cache url-auth eww puny mm-url gnus nnheader gnus-util rmail rmail-loaddefs rfc2047 rfc2045 ietf-drums mail-utils wid-edit mm-util mail-prsvr url-queue url url-proxy url-privacy url-expand url-methods url-history url-cookie url-domsuf url-util url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache url-vars mailcap shr svg xml seq byte-opt gv bytecomp byte-compile cconv cl-extra help-mode easymenu dom cl-loaddefs cl-lib subr-x pcase browse-url format-spec time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel dos-w32 ls-lisp disp-table term/w32-win w32-win w32-vars term/common-win tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese charscript case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote w32notify w32 multi-tty make-network-process emacs) Memory information: ((conses 16 129569 4885) (symbols 56 24069 0) (miscs 48 46 107) (strings 32 28495 4799) (string-bytes 1 822432) (vectors 16 16800) (vector-slots 8 471145 3632) (floats 8 237 46) (intervals 56 262 21) (buffers 976 12)) ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-24 14:00 UTC
  To: José L. Doménech; +Cc: 22789

José L. Doménech <j_l_domenech@yahoo.com> writes:

> In emacs, built from master, https connections have stopped working, while
> imaps connections are still working.
>
> How to reproduce it:
> emacs -Q
> eww https://www.fsf.org
>
> The eww buffer remains blank.

Does your Emacs have the GnuTLS libraries available?  What does
(gnutls-available-p) eval to?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#22789: 25.1.50; In last master build https connections stop working
From: José L. Doménech @ 2016-02-24 16:09 UTC
  To: Lars Ingebrigtsen; +Cc: José L. Doménech, 22789

On Wed, 24 Feb 2016 15:00:50 +0100,
Lars Ingebrigtsen wrote:
>
> José L. Doménech <j_l_domenech@yahoo.com> writes:
>
> > In emacs, built from master, https connections have stopped working, while
> > imaps connections are still working.
> >
> > How to reproduce it:
> > emacs -Q
> > eww https://www.fsf.org
> >
> > The eww buffer remains blank.
>
> Does your Emacs have the GnuTLS libraries available?  What does
> (gnutls-available-p) eval to?

(gnutls-available-p) evaluates to t.

The libraries should be available since I have no problems with the
following build (that I am now using):

In GNU Emacs 25.1.50.2 (x86_64-w64-mingw32)
 of 2016-02-21 built on LENOVO-PC
Repository revision: 1ba50a0d8cbef6686ecf752583832e7bbb9137ef
Windowing system distributor 'Microsoft Corp.', version 6.3.9600
Configured using:
 'configure --without-imagemagick
 PKG_CONFIG_PATH=/mingw64/lib/pkgconfig'

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND NOTIFY ACL GNUTLS LIBXML2 ZLIB
TOOLKIT_SCROLL_BARS

Important settings:
  value of $LANG: ESN
  locale-coding-system: cp1252
* bug#22789: 25.1.50; In last master build https connections stop working
From: Eli Zaretskii @ 2016-02-24 18:06 UTC
  To: José L. Doménech; +Cc: larsi, 22789

> From: José L. Doménech
>  <j_l_domenech@yahoo.com>
> Cc: "José L. Doménech"
>  <j_l_domenech@yahoo.com>, 22789@debbugs.gnu.org
>
> On Wed, 24 Feb 2016 15:00:50 +0100,
> Lars Ingebrigtsen wrote:
> >
> > José L. Doménech <j_l_domenech@yahoo.com> writes:
> >
> > > In emacs, built from master, https connections have stopped working, while
> > > imaps connections are still working.
> > >
> > > How to reproduce it:
> > > emacs -Q
> > > eww https://www.fsf.org
> > >
> > > The eww buffer remains blank.
> >
> > Does your Emacs have the GnuTLS libraries available?  What does
> > (gnutls-available-p) eval to?
> (gnutls-available-p) evaluates to t.
>
> The libraries should be available since I have no problems with the
> following build (that I am now using):
>
> In GNU Emacs 25.1.50.2 (x86_64-w64-mingw32)
>  of 2016-02-21 built on LENOVO-PC

I confirm the problem with the MS-Windows build: on master, https
doesn't work; on emacs-25 it does.

First suspect is the async changes, of course.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-24 23:48 UTC
  To: Eli Zaretskii; +Cc: José L. Doménech, 22789

Eli Zaretskii <eliz@gnu.org> writes:

> I confirm the problem with the MS-Windows build: on master, https
> doesn't work; on emacs-25 it does.
>
> First suspect is the async changes, of course.

Yup.  I'll try to do a build without getaddrinfo_a support and see
whether I can reproduce the https error here...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-25 0:02 UTC
  To: Eli Zaretskii; +Cc: José L. Doménech, 22789

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> I confirm the problem with the MS-Windows build: on master, https
>> doesn't work; on emacs-25 it does.
>>
>> First suspect is the async changes, of course.
>
> Yup.  I'll try to do a build without getaddrinfo_a support and see
> whether I can reproduce the https error here...

I'm unable to reproduce this bug on Ubuntu, even if I compile without
getaddrinfo{,_a} support.

If you eval the following, does anything show up in the "*foo*" buffer?

(setq proc
      (make-network-process :name "foo"
                            :buffer (get-buffer-create "*foo*")
                            :host "imap.gmail.com"
                            :service 993
                            :nowait t
                            :tls-parameters
                            (cons 'gnutls-x509pki
                                  (gnutls-boot-parameters
                                   :type 'gnutls-x509pki
                                   :hostname "imap.gmail.com"))))

* OK Gimap ready for requests from 60.225.211.161 qr7mb410250987iec

should appear.  Also, after evaling that, what does

(process-status proc)

say?  It should say "connect" for a little bit, and then "open"...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-25 1:09 UTC
  To: Eli Zaretskii; +Cc: José L. Doménech, 22789

Lars Ingebrigtsen <larsi@gnus.org> writes:

> I'm unable to reproduce this bug on Ubuntu, even if I compile without
> getaddrinfo{,_a} support.

I've read through the WINDOWSNT parts of process.c to see if there's
anything strikingly obviously wrong, but a first read through didn't
really show anything...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#22789: 25.1.50; In last master build https connections stop working
From: Eli Zaretskii @ 2016-02-25 16:41 UTC
  To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: José L. Doménech <j_l_domenech@yahoo.com>,
>   22789@debbugs.gnu.org
> Date: Thu, 25 Feb 2016 11:02:51 +1100
>
> If you eval the following, does anything show up in the "*foo*" buffer?
>
> (setq proc
>       (make-network-process :name "foo"
>                             :buffer (get-buffer-create "*foo*")
>                             :host "imap.gmail.com"
>                             :service 993
>                             :nowait t
>                             :tls-parameters
>                             (cons 'gnutls-x509pki
>                                   (gnutls-boot-parameters
>                                    :type 'gnutls-x509pki
>                                    :hostname "imap.gmail.com"))))

I see there only this:

  Process foo connect

> * OK Gimap ready for requests from 60.225.211.161 qr7mb410250987iec
>
> should appear.  Also, after evaling that, what does
>
> (process-status proc)
>
> say?  It should say "connect" for a little bit, and then "open"...

It stays "connect" forever.  But "netstat" doesn't show any
connections to that server, AFAICT.  I think the connection simply
doesn't begin.

Btw, one other difference of the Windows build, this time wrt GnuTLS,
is that on Windows we instruct GnuTLS to use our own pull and push
functions, see gnutls.c around line 450.  The functions themselves are
defined in w32.c, at the end.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-26 2:29 UTC
  To: Eli Zaretskii; +Cc: j_l_domenech, 22789

Eli Zaretskii <eliz@gnu.org> writes:

> It stays "connect" forever.  But "netstat" doesn't show any
> connections to that server, AFAICT.  I think the connection simply
> doesn't begin.

Hm...  when you mentioned that the Windows GnuTLS functions used our own
functions for the actual pull/push, I thought that perhaps the problem
was there (and was about to suggest the patch below), but this must mean
that the socket isn't even created.

`make-network-process' ends with

#ifdef HAVE_GETADDRINFO_A
  /* If we're doing async address resolution, the list of addresses
     here will be nil, so we postpone connecting to the server. */
  if (!p->is_server && NILP (ip_addresses))
    {
      p->dns_request = dns_request;
      p->status = Qconnect;
    }
  else
    {
      connect_network_socket (proc, ip_addresses);
    }
#else /* HAVE_GETADDRINFO_A */
  connect_network_socket (proc, ip_addresses);
#endif

so that should happen unconditionally on Windows.  Let's see...

Oh!  This code in connect_network_socket looks suspect, perhaps.  If it
fails, then the socket will never actually be created...  hm...  but it
may be caught later...  and it doesn't explain why non-blocking non-TLS
sockets still work...  so it can't be that...

#ifdef NON_BLOCKING_CONNECT
      if (p->is_non_blocking_client)
        {
          ret = fcntl (s, F_SETFL, O_NONBLOCK);
          if (ret < 0)
            {
              xerrno = errno;
              emacs_close (s);
              s = -1;
              continue;
            }
        }
#endif

So perhaps it's in the TLS code anyway.  Could you try the following
code?  It'll make TLS negotiation blocking on WINDOWSNT again.

> Btw, one other difference of the Windows build, this time wrt GnuTLS,
> is that on Windows we instruct GnuTLS to use our own pull and push
> functions, see gnutls.c around line 450.  The functions themselves are
> defined in w32.c, at the end.

diff --git a/src/gnutls.c b/src/gnutls.c
index d1b34c5..00d0e56 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -410,12 +410,17 @@ gnutls_try_handshake (struct Lisp_Process *proc)
       QUIT;
     }
   while (ret < 0 && gnutls_error_is_fatal (ret) == 0
-         && ! proc->is_non_blocking_client);
+#ifndef WINDOWSNT
+         && ! proc->is_non_blocking_client
+#endif
+         );
 
   proc->gnutls_initstage = GNUTLS_STAGE_HANDSHAKE_TRIED;
 
+#ifndef WINDOWSNT
   if (proc->is_non_blocking_client)
     proc->gnutls_p = true;
+#endif
 
   if (ret == GNUTLS_E_SUCCESS)
     {

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
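For readers following along, here is a minimal sketch of the loop shape that the patch above touches. It is not the Emacs source: `fake_handshake`, `fake_step`, the `HS_*` codes and `try_handshake` are all made-up stand-ins, and only the loop condition mirrors the one in the diff. With the non-blocking test in place the loop gives up after one non-fatal failure; with it removed (as the patch does on WINDOWSNT) the loop keeps retrying until the handshake finishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes, standing in for a TLS library's values.  */
enum { HS_SUCCESS = 0, HS_AGAIN = -1, HS_FATAL = -2 };

/* Hypothetical handshake that needs a few round trips to finish.  */
struct fake_handshake
{
  int steps_left;   /* round trips still needed */
  int attempts;     /* how often fake_step was called */
};

static int
fake_step (struct fake_handshake *h)
{
  h->attempts++;
  if (h->steps_left > 0)
    {
      h->steps_left--;
      return HS_AGAIN;      /* non-fatal: try again later */
    }
  return HS_SUCCESS;
}

static bool
is_fatal (int ret)
{
  return ret == HS_FATAL;
}

/* Same loop condition as in the diff: retry on non-fatal errors, but
   only when the caller is allowed to block.  */
static int
try_handshake (struct fake_handshake *h, bool non_blocking)
{
  int ret;

  assert (h != 0);
  do
    ret = fake_step (h);
  while (ret < 0 && !is_fatal (ret) && !non_blocking);

  return ret;
}
```

In the blocking case the caller gets `HS_SUCCESS` back after several attempts; in the non-blocking case it gets `HS_AGAIN` after a single attempt and has to arrange for a later retry itself.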
* bug#22789: 25.1.50; In last master build https connections stop working
From: Eli Zaretskii @ 2016-02-26 9:36 UTC
  To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: j_l_domenech@yahoo.com, 22789@debbugs.gnu.org
> Date: Fri, 26 Feb 2016 12:59:39 +1030
>
> So perhaps it's in the TLS code anyway.  Could you try the following
> code?  It'll make TLS negotiation blocking on WINDOWSNT again.

Now, when I evaluate the same form you posted earlier, I get a lot of
binary garbage in *foo*, and (process-status proc) yields "failed".

And contacting https://www.fsf.org still doesn't work.

Can you post a summary of the changes that you've done, including the
files and functions where they were made?  That will help Someone™
look into this problem on Windows.

Thanks.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-27 2:30 UTC
  To: Eli Zaretskii; +Cc: j_l_domenech, 22789

Eli Zaretskii <eliz@gnu.org> writes:

>> So perhaps it's in the TLS code anyway.  Could you try the following
>> code?  It'll make TLS negotiation blocking on WINDOWSNT again.
>
> Now, when I evaluate the same form you posted earlier, I get a lot of
> binary garbage in *foo*, and (process-status proc) yields "failed".

Hm...  interesting...  I think this might point towards
emacs_gnutls_push/pull needing to be tweaked somehow.  (In particular,
the bytes from the stream do not seem to be delivered to the GnuTLS
library, but instead consumed by Emacs (and output into the buffer).)
But it shows that the problem definitely is in the TLS handling itself,
and not in the DNS part of the changes.

Let's see...

Oh, I think I see the problem with the patch I asked you to test:

#ifdef HAVE_GNUTLS
      if (p->gnutls_p && p->gnutls_state)
        nbytes = emacs_gnutls_read (p, chars + carryover + buffered,
                                    readmax - buffered);
      else
#endif
        nbytes = emacs_read (channel, chars + carryover + buffered,
                             readmax - buffered);

So deferring setting p->gnutls_p is not a good idea.  I'll try to debug
this further, but may not have time today...

> Can you post a summary of the changes that you've done, including the
> files and functions where they were made?  That will help Someone™
> look into this problem on Windows.

Uhm.  Well, it was a 3200 line patch that affected process.c and
gnutls.c...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
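The dispatch quoted in the message above can be boiled down to a small sketch. This is only an illustration under assumptions: `fake_proc`, `choose_reader` and the `READER_*` names are hypothetical, and only the two-flag condition is taken from the quoted snippet. It shows why leaving the TLS flag unset while a session already exists would route raw TLS bytes through the plain reader, which would fit the "binary garbage" seen in the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-process state; only the two fields
   the dispatch looks at are modelled.  */
struct fake_proc
{
  bool gnutls_p;       /* connection is flagged as TLS */
  bool gnutls_state;   /* a TLS session object exists */
};

enum reader { READER_TLS, READER_PLAIN };

/* Same shape as the quoted condition: the TLS reader is chosen only
   when both flags are set; everything else falls through to the plain
   reader.  */
static enum reader
choose_reader (const struct fake_proc *p)
{
  assert (p != 0);
  return (p->gnutls_p && p->gnutls_state) ? READER_TLS : READER_PLAIN;
}
```

A process with a session but without the flag ends up on `READER_PLAIN`, so encrypted bytes would be handed to the buffer as-is.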
* bug#22789: 25.1.50; In last master build https connections stop working
From: John Wiegley @ 2016-02-27 2:43 UTC
  To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789

>>>>> Lars Ingebrigtsen <larsi@gnus.org> writes:

>> Can you post a summary of the changes that you've done, including the files
>> and functions where they were made? That will help Someone™ look into this
>> problem on Windows.

> Uhm. Well, it was a 3200 line patch that affected process.c and gnutls.c...

Yes, but you have a lot more knowledge of what that patch contained than
someone who will be looking at it afresh. I guess he's asking you to do a
little work to save him a lot of work, if you'd be willing.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-27 3:50 UTC
  To: John Wiegley; +Cc: John Wiegley, 22789, j_l_domenech

John Wiegley <jwiegley@gmail.com> writes:

>>>>>> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>>> Can you post a summary of the changes that you've done, including the files
>>> and functions where they were made? That will help Someone™ look into this
>>> problem on Windows.
>
>> Uhm. Well, it was a 3200 line patch that affected process.c and gnutls.c...
>
> Yes, but you have a lot more knowledge of what that patch contained than
> someone who will be looking at it afresh. I guess he's asking you to do a
> little work to save him a lot of work, if you'd be willing.

Sure, but...  that's a lot of typing.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#22789: 25.1.50; In last master build https connections stop working
From: Eli Zaretskii @ 2016-02-27 8:14 UTC
  To: Lars Ingebrigtsen; +Cc: j_l_domenech, johnw, 22789

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: 22789@debbugs.gnu.org, John Wiegley <johnw@gnu.org>, Eli Zaretskii <eliz@gnu.org>, j_l_domenech@yahoo.com
> Date: Sat, 27 Feb 2016 14:20:00 +1030
>
> John Wiegley <jwiegley@gmail.com> writes:
>
> >>>>>> Lars Ingebrigtsen <larsi@gnus.org> writes:
> >
> >>> Can you post a summary of the changes that you've done, including the files
> >>> and functions where they were made? That will help Someone™ look into this
> >>> problem on Windows.
> >
> >> Uhm. Well, it was a 3200 line patch that affected process.c and gnutls.c...
> >
> > Yes, but you have a lot more knowledge of what that patch contained than
> > someone who will be looking at it afresh. I guess he's asking you to do a
> > little work to save him a lot of work, if you'd be willing.
>
> Sure, but...  that's a lot of typing.  :-)

It is not needed if you are keen on continuing to debug this problem.
If someone else would take over, having such a description would help.

Thanks.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-27 3:49 UTC
  To: Eli Zaretskii; +Cc: j_l_domenech, 22789

Lars Ingebrigtsen <larsi@gnus.org> writes:

> So deferring setting p->gnutls_p is not a good idea.  I'll try to debug
> this further, but may not have time today...

Hey!  Time!

The following patch should make the rest of Emacs leave the TLS socket
alone until we've done the handshake, I think.  It works for me under
Linux (and makes the negotiation blocking).  Could you test this under
Windows?

diff --git a/src/gnutls.c b/src/gnutls.c
index d1b34c5..002e7b4 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -403,6 +403,9 @@ gnutls_try_handshake (struct Lisp_Process *proc)
   gnutls_session_t state = proc->gnutls_state;
   int ret;
 
+  if (proc->is_non_blocking_client)
+    proc->gnutls_p = true;
+
   do
     {
       ret = gnutls_handshake (state);
@@ -410,13 +413,13 @@ gnutls_try_handshake (struct Lisp_Process *proc)
       QUIT;
     }
   while (ret < 0 && gnutls_error_is_fatal (ret) == 0
-         && ! proc->is_non_blocking_client);
+#if 0
+         && ! proc->is_non_blocking_client
+#endif
+         );
 
   proc->gnutls_initstage = GNUTLS_STAGE_HANDSHAKE_TRIED;
 
-  if (proc->is_non_blocking_client)
-    proc->gnutls_p = true;
-
   if (ret == GNUTLS_E_SUCCESS)
     {
       /* Here we're finally done.  */

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#22789: 25.1.50; In last master build https connections stop working
From: Eli Zaretskii @ 2016-02-27 8:10 UTC
  To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: 22789@debbugs.gnu.org, j_l_domenech@yahoo.com
> Date: Sat, 27 Feb 2016 14:19:08 +1030
>
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
> > So deferring setting p->gnutls_p is not a good idea.  I'll try to debug
> > this further, but may not have time today...
>
> Hey!  Time!
>
> The following patch should make the rest of Emacs leave the TLS socket
> alone until we've done the handshake, I think.  It works for me under
> Linux (and makes the negotiation blocking).  Could you test this under
> Windows?

I'm afraid it doesn't help.  The connection still doesn't happen.

Thanks.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Eli Zaretskii @ 2016-02-27 8:13 UTC
  To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: j_l_domenech@yahoo.com, 22789@debbugs.gnu.org
> Date: Sat, 27 Feb 2016 13:00:44 +1030
>
> > Can you post a summary of the changes that you've done, including the
> > files and functions where they were made?  That will help Someone™
> > look into this problem on Windows.
>
> Uhm.  Well, it was a 3200 line patch that affected process.c and
> gnutls.c...

I can easily generate the patch, of course.  What I was asking for is
some description of the changes, which will make it easier to decide
where the problem might be.  Otherwise, that Someone™ will have to
review the entire patch to make that decision.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Alain Schneble @ 2016-02-27 18:05 UTC
  To: Lars Ingebrigtsen; +Cc: José L. Doménech, 22789

Lars Ingebrigtsen <larsi@gnus.org> writes:

> (setq proc
>       (make-network-process :name "foo"
>                             :buffer (get-buffer-create "*foo*")
>                             :host "imap.gmail.com"
>                             :service 993
>                             :nowait t
>                             :tls-parameters
>                             (cons 'gnutls-x509pki
>                                   (gnutls-boot-parameters
>                                    :type 'gnutls-x509pki
>                                    :hostname "imap.gmail.com"))))
>
> * OK Gimap ready for requests from 60.225.211.161 qr7mb410250987iec
>
> should appear.  Also, after evaling that, what does

It seems to be a timing issue.  If I set gnutls-log-level to 5, this
works also on Windows (i.e. I get OK Gimap...).

What I found out is that it runs into the following branch in
wait_reading_process_output:

      ...
          else
            {
              /* Preserve status of processes already terminated.  */
              XPROCESS (proc)->tick = ++process_tick;
              deactivate_process (proc);
              if (XPROCESS (proc)->raw_status_new)
                update_status (XPROCESS (proc));
              if (EQ (XPROCESS (proc)->status, Qrun))
                pset_status (XPROCESS (proc),
                             list2 (Qexit, make_number (256)));
            }

Here it deactivates the process, but as its status is "connect", it
won't change it.  That's the reason why it remains in "connect" state.

I guess that it enters this path because the socket is not ready yet.
But why?  I will try to figure it out later...
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-27 22:38 UTC
  To: Alain Schneble; +Cc: José L. Doménech, 22789

Alain Schneble <a.s@realize.ch> writes:

> What I found out is that it runs into the following branch in
> wait_reading_process_output:
>
>       ...
>           else
>             {
>               /* Preserve status of processes already terminated.  */
>               XPROCESS (proc)->tick = ++process_tick;
>               deactivate_process (proc);
>               if (XPROCESS (proc)->raw_status_new)
>                 update_status (XPROCESS (proc));
>               if (EQ (XPROCESS (proc)->status, Qrun))
>                 pset_status (XPROCESS (proc),
>                              list2 (Qexit, make_number (256)));
>             }
>
> Here it deactivates the process, but as its status is "connect", it
> won't change it.  That's the reason why it remains in "connect" state.
>
> I guess that it enters this path because the socket is not ready yet.
> But why?  I will try to figure it out later...

I think you're on to something!

The thing starts with

          nread = read_process_output (proc, channel);

and for un-setup TLS sockets, it'll now get back -1, and it should
ideally end up in the

          else if (nread == -1 && errno == EAGAIN)
            ;

thing, so that it tries again later.  But errno is not EAGAIN here
(usually)...

Does the following patch make things work on Windows?

diff --git a/src/gnutls.c b/src/gnutls.c
index d1b34c5..a6b1294 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -403,6 +403,9 @@ gnutls_try_handshake (struct Lisp_Process *proc)
   gnutls_session_t state = proc->gnutls_state;
   int ret;
 
+  if (proc->is_non_blocking_client)
+    proc->gnutls_p = true;
+
   do
     {
       ret = gnutls_handshake (state);
@@ -410,13 +413,13 @@ gnutls_try_handshake (struct Lisp_Process *proc)
       QUIT;
     }
   while (ret < 0 && gnutls_error_is_fatal (ret) == 0
-         && ! proc->is_non_blocking_client);
+#if 0
+         && ! proc->is_non_blocking_client
+#endif
+         );
 
   proc->gnutls_initstage = GNUTLS_STAGE_HANDSHAKE_TRIED;
 
-  if (proc->is_non_blocking_client)
-    proc->gnutls_p = true;
-
   if (ret == GNUTLS_E_SUCCESS)
     {
       /* Here we're finally done.  */
@@ -541,7 +544,10 @@ emacs_gnutls_read (struct Lisp_Process *proc, char *buf, ptrdiff_t nbyte)
   gnutls_session_t state = proc->gnutls_state;
 
   if (proc->gnutls_initstage != GNUTLS_STAGE_READY)
-    return -1;
+    {
+      errno = EAGAIN;
+      return -1;
+    }
 
   rtnval = gnutls_record_recv (state, buf, nbyte);
   if (rtnval >= 0)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
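The idea behind the last hunk can be sketched outside Emacs as follows. Everything here is an assumption made for illustration: `fake_tls`, `fake_tls_read`, `classify_read` and the `ACTION_*` names do not exist in Emacs, and the classification is much cruder than what wait_reading_process_output really does. The point is only that a reader which returns -1 without setting errno looks like a dead connection to the caller, while -1 with errno set to EAGAIN looks like "nothing yet, ask again".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TLS connection whose handshake may not
   have finished yet.  */
struct fake_tls
{
  bool ready;            /* handshake completed */
  const char *pending;   /* decrypted data waiting to be read */
};

/* Returns the number of bytes copied, or -1 with errno set.  While the
   handshake is unfinished, errno is EAGAIN so that callers can retry.  */
static long
fake_tls_read (struct fake_tls *c, char *buf, size_t nbyte)
{
  size_t n = 0;

  assert (c != 0 && buf != 0);
  if (!c->ready)
    {
      errno = EAGAIN;
      return -1;
    }
  while (n < nbyte && c->pending[n] != '\0')
    {
      buf[n] = c->pending[n];
      n++;
    }
  c->pending += n;
  return (long) n;
}

enum action { ACTION_RETRY_LATER, ACTION_DEACTIVATE, ACTION_DELIVER };

/* A simplified view of how a caller might react to a read result.  */
static enum action
classify_read (long nread, int err)
{
  if (nread == -1 && err == EAGAIN)
    return ACTION_RETRY_LATER;
  if (nread <= 0)
    return ACTION_DEACTIVATE;
  return ACTION_DELIVER;
}
```

With the errno assignment removed from `fake_tls_read`, the first read on a not-yet-ready connection would be classified as `ACTION_DEACTIVATE`, which corresponds to the stuck "connect" state described earlier in the thread.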
* bug#22789: 25.1.50; In last master build https connections stop working
From: Alain Schneble @ 2016-02-27 23:06 UTC
  To: Lars Ingebrigtsen; +Cc: José L. Doménech, 22789

Lars Ingebrigtsen <larsi@gnus.org> writes:

> I think you're on to something!
>
> The thing starts with
>
>           nread = read_process_output (proc, channel);
>
> and for un-setup TLS sockets, it'll now get back -1, and it should
> ideally end up in the
>
>           else if (nread == -1 && errno == EAGAIN)
>             ;
>
> thing, so that it tries again later.  But errno is not EAGAIN here
> (usually)...

I suspected this as well.

> Does the following patch make things work on Windows?

But unfortunately, it does not work even with this patch applied.  The
problem seems to happen earlier.

In w32.c (emacs_gnutls_push) I see that sys_write returns with 0.  But
the buffer to write contains sz=255 bytes.  And here errno is 0 after
the write.  This is strange.  I guess that here errno should be set to
EAGAIN...  I mean in sys_write...

After this broken emacs_gnutls_push call, gnutls_handshake returns:

-53 GNUTLS_E_PUSH_ERROR
And later...
-10 GNUTLS_E_INVALID_SESSION.
* bug#22789: 25.1.50; In last master build https connections stop working
From: Alain Schneble @ 2016-02-27 23:49 UTC
  To: Lars Ingebrigtsen; +Cc: José L. Doménech, 22789

Alain Schneble <a.s@realize.ch> writes:

> In w32.c (emacs_gnutls_push) I see that sys_write returns with 0.  But
> the buffer to write contains sz=255 bytes.  And here errno is 0 after
> the write.  This is strange.  I guess that here errno should be set to
> EAGAIN...  I mean in sys_write...
>
> After this broken emacs_gnutls_push call, gnutls_handshake returns:
>
> -53 GNUTLS_E_PUSH_ERROR
> And later...
> -10 GNUTLS_E_INVALID_SESSION.

Here we go.  I think we are getting closer to the root cause of the
problem.  In w32.c (sys_write), it runs into the following error:

      if (nchars == SOCKET_ERROR)
        {
          DebPrint (("sys_write.send failed with error %d on socket %ld\n",
                     pfn_WSAGetLastError (), SOCK_HANDLE (fd)));
          set_errno ();
        }

Strange thing: set_errno returns with errno == 0.  This is because
pfn_WSAGetLastError returns 0 as well.

Now, if I do...

          if (errno == 0)
            errno = EAGAIN;

...just after the call to set_errno above, guess what: It seems to work!

At least for me, it will be an exercise for tomorrow to find the reason
why pfn_WSAGetLastError returns 0 in this case. *snore*

Do you agree it shouldn't return 0?
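The experiment described above can be modelled in a few lines of portable C. This is a sketch under assumptions, not the w32.c code: `write_with_fallback`, the `send_fn` callback type and the three fake senders are hypothetical, and `last_error` merely plays the role that a "get last error" call plays on Windows. It shows the workaround as tried in the thread (treat "failed, but error code 0" as EAGAIN), without claiming that this is the proper fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FAKE_SOCKET_ERROR (-1)

/* Hypothetical low-level send: returns bytes sent or FAKE_SOCKET_ERROR,
   and reports an error code through *LAST_ERROR.  */
typedef int (*send_fn) (const void *buf, size_t len, int *last_error);

/* The odd case from the thread: the send fails, yet the error code
   that can be queried afterwards is 0.  */
static int
send_fails_without_code (const void *buf, size_t len, int *last_error)
{
  (void) buf;
  (void) len;
  *last_error = 0;
  return FAKE_SOCKET_ERROR;
}

/* An ordinary failure with a real error code.  */
static int
send_fails_with_code (const void *buf, size_t len, int *last_error)
{
  (void) buf;
  (void) len;
  *last_error = ECONNRESET;
  return FAKE_SOCKET_ERROR;
}

/* A send that succeeds and writes everything.  */
static int
send_ok (const void *buf, size_t len, int *last_error)
{
  (void) buf;
  *last_error = 0;
  return (int) len;
}

/* Write wrapper: returns bytes written, or -1 with errno set.  A
   failure that arrives with error code 0 is reported as EAGAIN, so
   that a caller's retry logic gets a chance to run.  */
static int
write_with_fallback (send_fn do_send, const void *buf, size_t len)
{
  int last_error = 0;
  int nchars;

  assert (do_send != 0);
  nchars = do_send (buf, len, &last_error);
  if (nchars == FAKE_SOCKET_ERROR)
    {
      errno = last_error;
      if (errno == 0)
        errno = EAGAIN;
      return -1;
    }
  return nchars;
}
```

Real error codes pass through unchanged; only the "failure with code 0" case is rewritten, which keeps the workaround from hiding genuine errors such as a reset connection.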
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-27 23:49 ` Alain Schneble @ 2016-02-28 3:31 ` Lars Ingebrigtsen 2016-02-28 9:58 ` Alain Schneble 2016-02-28 16:53 ` Eli Zaretskii 2016-02-28 3:43 ` Eli Zaretskii 1 sibling, 2 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-28 3:31 UTC (permalink / raw) To: Alain Schneble; +Cc: José L. Doménech, 22789 Alain Schneble <a.s@realize.ch> writes: > Here we go. I think we are getting closer to the root cause of the > problem. In w32.c (sys_write), it runs into the following error: > > if (nchars == SOCKET_ERROR) > { > DebPrint (("sys_write.send failed with error %d on socket %ld\n", > pfn_WSAGetLastError (), SOCK_HANDLE (fd))); > set_errno (); > } > > Strange thing: set_errno returns with errno == 0. This because > pfn_WSAGetLastError returns 0 as well. > > Now, if I do... > > if (errno == 0) > errno = EAGAIN; > > ...just after the call to set_errno above, guess what: It seems to work! Aha! Good sleuthing. :-) > At least for me, it will be an exercise for tomorrow to find the reason > why pfn_WSAGetLastError returns 0 in this case. *snore* > > Do you agree it shouldn't return 0? Yes. That would make more sense. But I don't think that code path (sys_write) has ever been called before on a nonblocking socket. (Because we've always opened the sockets before without O_NONBLOCK, since we've never called `make-network-process' with :nowait t before from `open-gnutls-stream'.) So ... is it possible that these functions that w32.c calls just don't... quite work with nonblocking sockets? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-28 3:31 ` Lars Ingebrigtsen @ 2016-02-28 9:58 ` Alain Schneble 2016-02-28 16:53 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-02-28 9:58 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: José L. Doménech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Alain Schneble <a.s@realize.ch> writes: > >> At least for me, it will be an exercise for tomorrow to find the reason >> why pfn_WSAGetLastError returns 0 in this case. *snore* >> >> Do you agree it shouldn't return 0? > > Yes. That would make more sense. See my other message in this thread. > But I don't think that code path (sys_write) has ever been called > before on a nonblocking socket. (Because we've always opened the > sockets before without O_NONBLOCK, since we've never called > `make-network-process' with :nowait t before from `open-gnutls-stream'.) I agree (at least for the gnutls case...). > So ... is it possible that these functions that w32.c calls just > don't... quite work with nonblocking sockets? It seems so, yes :( I'll try to find some more time to dive into this further... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-28 3:31 ` Lars Ingebrigtsen 2016-02-28 9:58 ` Alain Schneble @ 2016-02-28 16:53 ` Eli Zaretskii 2016-02-29 2:37 ` Lars Ingebrigtsen 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-02-28 16:53 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Sun, 28 Feb 2016 14:01:19 +1030 > Cc: "José L. Doménech" > <j_l_domenech@yahoo.com>, 22789@debbugs.gnu.org > > But I don't think that code path (sys_write) has ever been called > before on a nonblocking socket. (Because we've always opened the > sockets before without O_NONBLOCK, since we've never called > `make-network-process' with :nowait t before from `open-gnutls-stream'.) > > So ... is it possible that these functions that w32.c calls just > don't... quite work with nonblocking sockets? That's not true, non-blocking sockets are supported on Windows since a year ago. And the above aren't the right questions anyway: the problem is not with sockets per se, the problem is with a TLS connection specifically. And I think it is caused by the fact that now we proceed to GnuTLS handshaking right after the call to 'connect', which takes some time to complete, when the socket is non-blocking. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-28 16:53 ` Eli Zaretskii @ 2016-02-29 2:37 ` Lars Ingebrigtsen 0 siblings, 0 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-29 2:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: > That's not true, non-blocking sockets are supported on Windows since a > year ago. I grepped for uses of sys_write, and the only one I noticed was the one from the TLS push functions. Because I missed this: #ifdef WINDOWSNT #define read sys_read #define write sys_write D'oh. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-27 23:49 ` Alain Schneble 2016-02-28 3:31 ` Lars Ingebrigtsen @ 2016-02-28 3:43 ` Eli Zaretskii 2016-02-28 9:48 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-02-28 3:43 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > Date: Sun, 28 Feb 2016 00:49:25 +0100 > Cc: "José L. Doménech" > <j_l_domenech@yahoo.com>, 22789@debbugs.gnu.org > > Now, if I do... > > if (errno == 0) > errno = EAGAIN; > > ...just after the call to set_errno above, guess what: It seems to work! This must be conditioned on something that requires EAGAIN. Otherwise overriding the errno of zero sounds like a bad idea to me. Why does the nchars == SOCKET_ERROR happen here at all, if winsock returns with an error of zero? Isn't that strange? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working
  2016-02-28  3:43 ` Eli Zaretskii
@ 2016-02-28  9:48 ` Alain Schneble
  2016-02-28 17:00 ` Eli Zaretskii
  0 siblings, 1 reply; 124+ messages in thread
From: Alain Schneble @ 2016-02-28 9:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Alain Schneble <a.s@realize.ch>
>> Date: Sun, 28 Feb 2016 00:49:25 +0100
>> Cc: "José L. Doménech"
>>  <j_l_domenech@yahoo.com>, 22789@debbugs.gnu.org
>>
>> Now, if I do...
>>
>>     if (errno == 0)
>>       errno = EAGAIN;
>>
>> ...just after the call to set_errno above, guess what: It seems to work!
>
> This must be conditioned on something that requires EAGAIN.  Otherwise
> overriding the errno of zero sounds like a bad idea to me.

This was just a quick try, to better understand the behavior.  Not a
proposed solution.  Excuse me for not being precise.

> Why does the nchars == SOCKET_ERROR happen here at all, if winsock
> returns with an error of zero?  Isn't that strange?

Exactly, this was what I meant in the part you elided:

Alain Schneble <a.s@realize.ch> writes:

> At least for me, it will be an exercise for tomorrow to find the reason
> why pfn_WSAGetLastError returns 0 in this case.

I just had some time to investigate it further.  The WSAGetLastError
value gets overwritten by the call to pfn_ioctlsocket.  That's why errno
is 0.

If I swap the order of the if statements in sys_write to look as
follows, then the reason for the SOCKET_ERROR is revealed:

    if (nchars == SOCKET_ERROR)
      {
        DebPrint (("sys_write.send failed with error %d on socket %ld\n",
                   pfn_WSAGetLastError (), SOCK_HANDLE (fd)));
        set_errno ();
      }

    /* Set the socket back to non-blocking if it was before,
       for other operations that support it.  */
    if (fd_info[fd].flags & FILE_NDELAY)
      {
        printf ("reset file_ndelay");
        nblock = 1;
        pfn_ioctlsocket (SOCK_HANDLE (fd), FIONBIO, &nblock);
      }

=> WSAENOTCONN (10057): Socket is not connected.  So that's the proof it
accesses the socket too early.

Alas, even though it seems to help at least for the test code I tried,
turning WSAENOTCONN into EAGAIN seems wrong after all.  It shouldn't try
to write to the socket before it is connected at all...(?)

Also the code "wraps" pfn_send and turns it into a blocking call.  Not
sure what the implications are...

Nevertheless, don't you think the error handling in this code section is
not very elaborate and switching the order as shown above might be
better anyway?  sys_write is primarily about writing, not about
switching from non-blocking to blocking and back again...  Or shall it
somehow aggregate possible errors of both calls (pfn_send and
pfn_ioctlsocket)?

^ permalink raw reply [flat|nested] 124+ messages in thread
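The clobbering described above (a later call wiping out the error of the
failed send) can be illustrated with a minimal POSIX sketch.  This is not
the w32.c code and not the fix that was installed; `restore_mode_stub'
and `write_then_restore' are hypothetical names, and the stub resets
errno by hand to mimic the behavior reported for pfn_ioctlsocket, since
successful POSIX calls normally leave errno untouched.  An alternative
to swapping the two if statements is to capture the error immediately:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a follow-up call (such as restoring the
   non-blocking flag) that succeeds and resets the error state.  */
static void
restore_mode_stub (void)
{
  errno = 0;
}

/* Hypothetical helper: write LEN bytes from BUF to FD, run the
   follow-up step, and still report the error of the write itself.  */
static ssize_t
write_then_restore (int fd, const void *buf, size_t len)
{
  assert (buf != NULL);

  ssize_t n = write (fd, buf, len);
  int saved_errno = errno;   /* capture before anything can clobber it */

  restore_mode_stub ();

  if (n < 0)
    errno = saved_errno;     /* report the failure of write, not the stub */
  return n;
}
```

Either approach (reordering or saving) keeps the caller from seeing an
error code of zero alongside a failure return value.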
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-28 9:48 ` Alain Schneble @ 2016-02-28 17:00 ` Eli Zaretskii 2016-02-29 2:49 ` Lars Ingebrigtsen 2016-02-29 9:55 ` Alain Schneble 0 siblings, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-02-28 17:00 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Sun, 28 Feb 2016 10:48:37 +0100 > > If I swap the order of the if statements in sys_write to look as > follows, then the reason for the SOCKET_ERROR is revealed: > > if (nchars == SOCKET_ERROR) > { > DebPrint (("sys_write.send failed with error %d on socket %ld\n", > pfn_WSAGetLastError (), SOCK_HANDLE (fd))); > set_errno (); > } > > /* Set the socket back to non-blocking if it was before, > for other operations that support it. */ > if (fd_info[fd].flags & FILE_NDELAY) > { > printf ("reset file_ndelay"); > nblock = 1; > pfn_ioctlsocket (SOCK_HANDLE (fd), FIONBIO, &nblock); > } > > => WSAENOTCONN (10057): Socket is not connected. So that's the proof it > accesses the socket too early. Yes. > Alas, even though it seems to help at least for the test code I tried, > turning WSAENOTCONN into EAGAIN seems wrong after all. It does here, although this needs to be done only if the socket is in the process of connecting, and the return value needs to be negative, not zero. I installed a fix along these lines, and it seems to work for me: https://www.fsf.org is displayed OK. > It shouldn't try to write to the socket before it is connected at > all...(?) No, I think it should: that write comes from GnuTLS, when it attempts a handshake. Returning EWOULDBLOCK tells GnuTLS to spin waiting until the connection is complete. 
How else could this work, since we now proceed with GnuTLS handshake immediately after the call to 'connect' returns, when the connection is not yet complete, this being a non-blocking socket? > Also the code "wraps" pfn_send and turns it into a blocking call. > Not sure what the implications are... The only implication is that we get ENOTCONN instead of EWOULDBLOCK. But that's easy to handle. > Nevertheless, don't you think the error handling in this code section is > not very elaborate and switching the order as shown above might be > better anyway? sys_write is primarily about writing, not about > switching from non-blocking to blocking and back again... Or shall it > somehow aggregate possible errors of both calls (pfn_send and > pfn_ioctlsocket)? Yes, you are right. I did that. The only problem left is that not all images on www.fsf.org's page are downloaded; they are if I use http instead of https. I guess this is some eww thing? ^ permalink raw reply [flat|nested] 124+ messages in thread
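The rule stated in the message above (turn "not connected" into "would
block" only while the connect is still in progress) can be sketched in
portable C.  This is an illustration under assumptions, not the fix that
was installed in w32.c: `map_send_error' and its `connect_in_progress'
argument are hypothetical, and the POSIX errno names stand in for the
Winsock codes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: decide which errno value a failed send should
   report.  SOCK_ERRNO is the error of the failed send;
   CONNECT_IN_PROGRESS says whether a non-blocking connect on the
   socket has not completed yet.  */
static int
map_send_error (int sock_errno, bool connect_in_progress)
{
  assert (sock_errno != 0);   /* only call this after a real failure */

  /* A write that races with a pending connect is not fatal: report it
     as "try again later" so the caller retries the handshake.  */
  if (sock_errno == ENOTCONN && connect_in_progress)
    return EWOULDBLOCK;

  /* Everything else, including ENOTCONN on a socket that is not in the
     middle of connecting, is passed through unchanged.  */
  return sock_errno;
}
```

The caller would still have to return a negative value together with
this errno, as the message points out, so that the result is not
mistaken for a zero-byte write.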
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-28 17:00 ` Eli Zaretskii @ 2016-02-29 2:49 ` Lars Ingebrigtsen 2016-02-29 3:43 ` Eli Zaretskii 2016-02-29 9:55 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-29 2:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, Alain Schneble, 22789 Eli Zaretskii <eliz@gnu.org> writes: > No, I think it should: that write comes from GnuTLS, when it attempts > a handshake. Returning EWOULDBLOCK tells GnuTLS to spin waiting until > the connection is complete. No, it doesn't spin, it just tries again later (from the event loop). But returning EWOULDBLOCK seems correct. > The only problem left is that not all images on www.fsf.org's page are > downloaded; they are if I use http instead of https. I guess this is > some eww thing? No, you should get the images in the https version, too, unless it's unable to verify the TLS certificate. Which I guess is not a problem, since it's displaying the web page itself... What happens if you load the image directly with eww? https://static.fsf.org/nosvn/logos/fsf30-logo/fsf30-header-fsf.png -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 2:49 ` Lars Ingebrigtsen @ 2016-02-29 3:43 ` Eli Zaretskii 2016-02-29 4:38 ` Lars Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-02-29 3:43 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Alain Schneble <a.s@realize.ch>, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Mon, 29 Feb 2016 13:49:36 +1100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > No, I think it should: that write comes from GnuTLS, when it attempts > > a handshake. Returning EWOULDBLOCK tells GnuTLS to spin waiting until > > the connection is complete. > > No, it doesn't spin, it just tries again later (from the event loop). That's what I meant by "spin" (what "event loop"?) > But returning EWOULDBLOCK seems correct. > > > The only problem left is that not all images on www.fsf.org's page are > > downloaded; they are if I use http instead of https. I guess this is > > some eww thing? > > No, you should get the images in the https version, too, unless it's > unable to verify the TLS certificate. Which I guess is not a problem, > since it's displaying the web page itself... Some of the images get downloaded, some don't. And it isn't 100% repeatable. Strange... > What happens if you load the image directly with eww? > > https://static.fsf.org/nosvn/logos/fsf30-logo/fsf30-header-fsf.png It's downloaded, of course. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 3:43 ` Eli Zaretskii @ 2016-02-29 4:38 ` Lars Ingebrigtsen 0 siblings, 0 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-29 4:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> No, you should get the images in the https version, too, unless it's >> unable to verify the TLS certificate. Which I guess is not a problem, >> since it's displaying the web page itself... > > Some of the images get downloaded, some don't. And it isn't 100% > repeatable. Strange... Huh. There's another timing issue, perhaps? Or... shr uses url-queue to download images, and it gives up on images that take more than five seconds to download (by default). Could that be the case here? Also, url caches images, so for greater repeatability it's often useful to nuke the cache when testing: $ rm -rf ~/.emacs.d/url/cache/ -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-28 17:00 ` Eli Zaretskii 2016-02-29 2:49 ` Lars Ingebrigtsen @ 2016-02-29 9:55 ` Alain Schneble 2016-02-29 10:03 ` Lars Ingebrigtsen 1 sibling, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-02-29 9:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Sun, 28 Feb 2016 10:48:37 +0100 >> >> Alas, even though it seems to help at least for the test code I tried, >> turning WSAENOTCONN into EAGAIN seems wrong after all. > > It does here, although this needs to be done only if the socket is in > the process of connecting, and the return value needs to be negative, > not zero. I installed a fix along these lines, and it seems to work > for me: https://www.fsf.org is displayed OK. Thanks! >> It shouldn't try to write to the socket before it is connected at >> all...(?) > > No, I think it should: that write comes from GnuTLS, when it attempts > a handshake. Returning EWOULDBLOCK tells GnuTLS to spin waiting until > the connection is complete. How else could this work, since we now > proceed with GnuTLS handshake immediately after the call to 'connect' > returns, when the connection is not yet complete, this being a > non-blocking socket? What I had in mind was to start the GnuTLS handshake (or even gnutls_boot) only after the async socket has properly been connected. I just consulted the GnuTLS documentation and I understand now that what you write above is indeed a supported GnuTLS scenario. But I think it is not an optimal one, because the number of TLS handshake retries will then depend on the time it takes to setup the socket connection, IIUC (see process.c: abort if p->gnutls_handshakes_tried > GNUTLS_EMACS_HANDSHAKES_LIMIT). >> Also the code "wraps" pfn_send and turns it into a blocking call. 
>> Not sure what the implications are... > > The only implication is that we get ENOTCONN instead of EWOULDBLOCK. > But that's easy to handle. Ok, I see. Thanks. >> Nevertheless, don't you think the error handling in this code section is >> not very elaborate and switching the order as shown above might be >> better anyway? sys_write is primarily about writing, not about >> switching from non-blocking to blocking and back again... Or shall it >> somehow aggregate possible errors of both calls (pfn_send and >> pfn_ioctlsocket)? > > Yes, you are right. I did that. Thanks. > The only problem left is that not all images on www.fsf.org's page are > downloaded; they are if I use http instead of https. I guess this is > some eww thing? I guess it's not. There are still some issues with the GnuTLS code paths, I think. I tried out the approach I proposed above, and it seems to resolve this issue as well. I'll try to arrange and propose a patch to discuss in a follow up message. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 9:55 ` Alain Schneble @ 2016-02-29 10:03 ` Lars Ingebrigtsen 2016-02-29 17:57 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-29 10:03 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > What I had in mind was to start the GnuTLS handshake (or even > gnutls_boot) only after the async socket has properly been connected. I > just consulted the GnuTLS documentation and I understand now that what > you write above is indeed a supported GnuTLS scenario. But I think it > is not an optimal one, because the number of TLS handshake retries will > then depend on the time it takes to setup the socket connection, IIUC > (see process.c: abort if p->gnutls_handshakes_tried > > GNUTLS_EMACS_HANDSHAKES_LIMIT). We could just increase that limit. It's currently set to 100, which is a number that's taken from thin air, I think? It should probably be a time-based handshake limit instead -- try handshaking for, say, ten seconds before giving up... -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 10:03 ` Lars Ingebrigtsen @ 2016-02-29 17:57 ` Alain Schneble 2016-02-29 18:45 ` Eli Zaretskii 2016-02-29 21:18 ` Lars Ingebrigtsen 0 siblings, 2 replies; 124+ messages in thread From: Alain Schneble @ 2016-02-29 17:57 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Alain Schneble <a.s@realize.ch> writes: > >> What I had in mind was to start the GnuTLS handshake (or even >> gnutls_boot) only after the async socket has properly been connected. I >> just consulted the GnuTLS documentation and I understand now that what >> you write above is indeed a supported GnuTLS scenario. But I think it >> is not an optimal one, because the number of TLS handshake retries will >> then depend on the time it takes to setup the socket connection, IIUC >> (see process.c: abort if p->gnutls_handshakes_tried > >> GNUTLS_EMACS_HANDSHAKES_LIMIT). > > We could just increase that limit. It's currently set to 100, which is > a number that's taken from thin air, I think? It should probably be a > time-based handshake limit instead -- try handshaking for, say, ten > seconds before giving up... A time-based limit sounds like a good idea to me. It could even be combined with a min-number-of-tries approach, like this: if (TimeElapsed > Timeout && NumberOfTries > MinNumberOfTries) { // give up... } But the point I tried to address is the following: /When/ shall we start with the handshake "series" and start counting the number of tries (or stopwatch)? Don't you agree that with async sockets, it doesn't make much sense to start it before the socket is connected? So we could just postpone it until then... Otherwise, the number of handshake tries (or time elapsed) during the "socket not yet connected" are subtracted from the max number of tries (or timeout) granted. Which I think is, well, at least imprecise... 
^ permalink raw reply [flat|nested] 124+ messages in thread
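The combined time-based and minimum-tries limit proposed in the message
above can be written out as a small C predicate.  This is only a sketch
of that pseudocode; the function name, the use of seconds as a `double',
and the parameter names are assumptions, and nothing here reflects what
process.c actually does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: give up on the handshake only when BOTH the
   time budget is exhausted and a minimum number of attempts was made.
   ELAPSED_SEC and TIMEOUT_SEC are in seconds.  */
static bool
handshake_should_give_up (double elapsed_sec, double timeout_sec,
                          int tries, int min_tries)
{
  assert (timeout_sec >= 0 && min_tries >= 0);

  return elapsed_sec > timeout_sec && tries > min_tries;
}
```

With a ten-second budget and a minimum of five tries, a connection that
was polled only three times in eleven seconds keeps going, while one
that was polled hundreds of times within two seconds is not cut off
either; only slow and repeatedly tried handshakes are abandoned.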
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 17:57 ` Alain Schneble @ 2016-02-29 18:45 ` Eli Zaretskii 2016-02-29 21:22 ` Lars Ingebrigtsen 2016-02-29 23:13 ` Alain Schneble 2016-02-29 21:18 ` Lars Ingebrigtsen 1 sibling, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-02-29 18:45 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > Date: Mon, 29 Feb 2016 18:57:28 +0100 > Cc: j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > > Lars Ingebrigtsen <larsi@gnus.org> writes: > > > Alain Schneble <a.s@realize.ch> writes: > > > >> What I had in mind was to start the GnuTLS handshake (or even > >> gnutls_boot) only after the async socket has properly been connected. I > >> just consulted the GnuTLS documentation and I understand now that what > >> you write above is indeed a supported GnuTLS scenario. But I think it > >> is not an optimal one, because the number of TLS handshake retries will > >> then depend on the time it takes to setup the socket connection, IIUC > >> (see process.c: abort if p->gnutls_handshakes_tried > > >> GNUTLS_EMACS_HANDSHAKES_LIMIT). > > > > We could just increase that limit. It's currently set to 100, which is > > a number that's taken from thin air, I think? It should probably be a > > time-based handshake limit instead -- try handshaking for, say, ten > > seconds before giving up... > > A time-based limit sounds like a good idea to me. It could even be > combined with a min-number-of-tries approach, like this: > > if (TimeElapsed > Timeout && NumberOfTries > MinNumberOfTries) { > // give up... > } I already tried increasing the limit, it doesn't help: the new limit is reached. Interestingly, when the initial connection is made, for the page itself, the handshake completes within 10 attempts. But the subsequent connections, presumably for images, don't succeed, for some reason. 
> But the point I tried to address is the following: /When/ shall we start > with the handshake "series" and start counting the number of tries (or > stopwatch)? Don't you agree that with async sockets, it doesn't make > much sense to start it before the socket is connected? So we could just > postpone it until then... Otherwise, the number of handshake tries (or > time elapsed) during the "socket not yet connected" are subtracted from > the max number of tries (or timeout) granted. Which I think is, well, > at least imprecise... I think we are looking in the wrong direction. We need first to understand why the connection(s) to download the images don't work. Does anyone already have an idea why this happens? If so, please describe that. Failing that, I came to a conclusion that I don't have a clear and complete picture of what should happen when eww receives the page and proceeds to downloading the images. Lars, can you please describe what eww does at this point, and how these downloads are expected to work asynchronously? You can describe what happens on GNU/Linux, if that makes it easier. In particular, what are the differences between the initial connection to get the page (which works) and the connections made to get the images (which don't work)? There are also some disturbing signs in retrying GnuTLS handshake from wait_reading_process_output -- I'm not sure the way that function works, at least on Windows, is according to your expectations. The while loop there doesn't really spin all the time, did you know that? Can you describe what you think should happen there for a connection whose GnuTLS handshake is not yet complete? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 18:45 ` Eli Zaretskii @ 2016-02-29 21:22 ` Lars Ingebrigtsen 2016-03-01 3:35 ` Eli Zaretskii 2016-02-29 23:13 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-29 21:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, Alain Schneble, 22789 Eli Zaretskii <eliz@gnu.org> writes: > I already tried increasing the limit, it doesn't help: the new limit > is reached. Interestingly, when the initial connection is made, for > the page itself, the handshake completes within 10 attempts. But the > subsequent connections, presumably for images, don't succeed, for some > reason. For the images, it tried handshaking 100 times and then marked the connections as failed? > Failing that, I came to a conclusion that I don't have a clear and > complete picture of what should happen when eww receives the page and > proceeds to downloading the images. Lars, can you please describe > what eww does at this point, and how these downloads are expected to > work asynchronously? You can describe what happens on GNU/Linux, if > that makes it easier. In particular, what are the differences between > the initial connection to get the page (which works) and the > connections made to get the images (which don't work)? As I said before, the images are fetched from `url-queue', which gives up if the image hasn't downloaded within five seconds. Could that be the case for you? > There are also some disturbing signs in retrying GnuTLS handshake from > wait_reading_process_output -- I'm not sure the way that function > works, at least on Windows, is according to your expectations. The > while loop there doesn't really spin all the time, did you know that? It runs every time something is available from poll, doesn't it? Which is what we care about. I think. -- (domestic pets only, the antidote for overdose, milk.) 
bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 21:22 ` Lars Ingebrigtsen @ 2016-03-01 3:35 ` Eli Zaretskii 0 siblings, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 3:35 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Alain Schneble <a.s@realize.ch>, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Tue, 01 Mar 2016 08:22:35 +1100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > I already tried increasing the limit, it doesn't help: the new limit > > is reached. Interestingly, when the initial connection is made, for > > the page itself, the handshake completes within 10 attempts. But the > > subsequent connections, presumably for images, don't succeed, for some > > reason. > > For the images, it tried handshaking 100 times and then marked the > connections as failed? It tried as many times as I gave it (I even tried 1000), yes. I didn't check that it marks the connection as failed, but that's deterministic, no? > > Failing that, I came to a conclusion that I don't have a clear and > > complete picture of what should happen when eww receives the page and > > proceeds to downloading the images. Lars, can you please describe > > what eww does at this point, and how these downloads are expected to > > work asynchronously? You can describe what happens on GNU/Linux, if > > that makes it easier. In particular, what are the differences between > > the initial connection to get the page (which works) and the > > connections made to get the images (which don't work)? > > As I said before, the images are fetched from `url-queue', which gives > up if the image hasn't downloaded within five seconds. Could that be > the case for you? Unlikely: I tried enlarging the limit to 120 sec, and the problem wasn't fixed. I feel that we don't really understand the problem, so we are trying random things "just because they are there". 
I think we should try to understand what's not working first. > > There are also some disturbing signs in retrying GnuTLS handshake from > > wait_reading_process_output -- I'm not sure the way that function > > works, at least on Windows, is according to your expectations. The > > while loop there doesn't really spin all the time, did you know that? > > It runs every time something is available from poll, doesn't it? Which > is what we care about. I think. But something might not become available from the poll for prolonged periods of time. Why would we rely on the loop there to crank frequently enough for the purposes of completing the TLS handshake? E.g., does it work well for you if you disable cursor blinking before invoking eww? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 18:45 ` Eli Zaretskii 2016-02-29 21:22 ` Lars Ingebrigtsen @ 2016-02-29 23:13 ` Alain Schneble 2016-03-01 0:41 ` Lars Ingebrigtsen 2016-03-01 3:41 ` Eli Zaretskii 1 sibling, 2 replies; 124+ messages in thread From: Alain Schneble @ 2016-02-29 23:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 [-- Attachment #1: Type: text/plain, Size: 690 bytes --] Eli Zaretskii <eliz@gnu.org> writes: > I already tried increasing the limit, it doesn't help: the new limit > is reached. Interestingly, when the initial connection is made, for > the page itself, the handshake completes within 10 attempts. But the > subsequent connections, presumably for images, don't succeed, for some > reason. Yes that's what I observed as well. But also that GnuTLS returns -10 GNUTLS_E_INVALID_SESSION for some of the connections quite early. Interestingly, I had the impression that it behaves better if the subsequent handshakes are triggered only after the socket is connected. But that may be by chance. And could therefore be a red herring. See patch: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Patch --] [-- Type: text/x-patch, Size: 962 bytes --] From 565ac4ce483faf65f2005b23bf806fff636f5cb1 Mon Sep 17 00:00:00 2001 From: Alain Schneble <a.s@realize.ch> Date: Mon, 29 Feb 2016 23:37:41 +0100 Subject: [PATCH] Optimize GnuTLS handshake on async sockets * src/process.c (wait_reading_process_output): start retrying GnuTLS handshake only after async socket has properly connected. --- src/process.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/process.c b/src/process.c index 85a4885..8aad8d3 100644 --- a/src/process.c +++ b/src/process.c @@ -4940,7 +4940,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd, #ifdef HAVE_GNUTLS /* Continue TLS negotiation. 
*/ if (p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED - && p->is_non_blocking_client) + && p->is_non_blocking_client + && (fd_info[p->infd].flags & FILE_CONNECT) == 0) { gnutls_try_handshake (p); p->gnutls_handshakes_tried++; -- 2.6.2.windows.1 [-- Attachment #3: Type: text/plain, Size: 1648 bytes --] >> But the point I tried to address is the following: /When/ shall we start >> with the handshake "series" and start counting the number of tries (or >> stopwatch)? Don't you agree that with async sockets, it doesn't make >> much sense to start it before the socket is connected? So we could just >> postpone it until then... Otherwise, the number of handshake tries (or >> time elapsed) during the "socket not yet connected" phase are subtracted >> from the max number of tries (or timeout) granted. Which I think is, well, >> at least imprecise... > > I think we are looking in the wrong direction. We need first to > understand why the connection(s) to download the images don't work. > Does anyone already have an idea why this happens? If so, please > describe that. I must admit that I have holes in my mental model, and I'm still observing flows at runtime which seem strange to me. So yes, it may be the wrong direction regarding the /issue/. But I was not only referring to the issue, but also to an optimization of the new async paths... > There are also some disturbing signs in retrying GnuTLS handshake from > wait_reading_process_output -- I'm not sure the way that function > works, at least on Windows, is according to your expectations. The > while loop there doesn't really spin all the time, did you know that? > Can you describe what you think should happen there for a connection > whose GnuTLS handshake is not yet complete? Hmm. What I observed is that it stops if the Emacs window loses its focus (and no other window messages are dispatched to the window). If it has focus, it gets called at least once per second.
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 23:13 ` Alain Schneble @ 2016-03-01 0:41 ` Lars Ingebrigtsen 2016-03-01 3:41 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-01 0:41 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > Yes that's what I observed as well. But also that GnuTLS returns -10 > GNUTLS_E_INVALID_SESSION for some of the connections quite early. In my experience working with this stuff under Linux, if the session goes invalid, it's because other parts of Emacs have been doing something with the socket (either reading or (re-)writing bytes that should have gone to the GnuTLS library). Could you strace Emacs while it's doing a failed negotiation, and add some printfs to the handshaking? If I remember correctly, I discovered one instance where the trace looked something like Trying handshake... write( ... ) EAGAIN... Done trying .. write( ... ) Trying handshake... GNUTLS_E_INVALID_SESSION And the reason was that Emacs in the polling code decided to try to retransmit the data (until I made it stop doing that for TLS sockets). Could there be something similar in the Windows code paths? > #ifdef HAVE_GNUTLS > /* Continue TLS negotiation. */ > if (p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED > - && p->is_non_blocking_client) > + && p->is_non_blocking_client > + && (fd_info[p->infd].flags & FILE_CONNECT) == 0) > { > gnutls_try_handshake (p); > p->gnutls_handshakes_tried++; I think this change makes sense. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 23:13 ` Alain Schneble 2016-03-01 0:41 ` Lars Ingebrigtsen @ 2016-03-01 3:41 ` Eli Zaretskii 2016-03-01 4:29 ` Lars Ingebrigtsen ` (2 more replies) 1 sibling, 3 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 3:41 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Tue, 1 Mar 2016 00:13:04 +0100 > > > There are also some disturbing signs in retrying GnuTLS handshake from > > wait_reading_process_output -- I'm not sure the way that function > > works, at least on Windows, is according to your expectations. The > > while loop there doesn't really spin all the time, did you know that? > > Can you describe what you think should happen there for a connection > > whose GnuTLS handshake is not yet complete? > > Hmm. What I observed is that it stops if the Emacs window loses its > focus (and no other window messages are dispatched to the window). If > it has focus, it gets called at least once per second. Disable cursor blinking and try again. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 3:41 ` Eli Zaretskii @ 2016-03-01 4:29 ` Lars Ingebrigtsen 2016-03-01 4:30 ` Lars Ingebrigtsen 2016-03-01 15:36 ` Alain Schneble 2 siblings, 0 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-01 4:29 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, Alain Schneble, 22789 Eli Zaretskii <eliz@gnu.org> writes: > Disable cursor blinking and try again. You are completely correct! Without a blinking cursor, it doesn't complete the negotiation for (some) of the images on https://www.fsf.org! Oops. Urmn... Well, we have to add some other mechanism to keep the forward progress of the TLS negotiations going, then. Hm... A timer of some kind? That will not be running unless something is negotiated? Hm... Ideas? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 3:41 ` Eli Zaretskii 2016-03-01 4:29 ` Lars Ingebrigtsen @ 2016-03-01 4:30 ` Lars Ingebrigtsen 2016-03-01 9:00 ` Andreas Schwab 2016-03-01 15:36 ` Alain Schneble 2 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-01 4:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, Alain Schneble, 22789 And I would guess that the same is true for the async DNS resolution -- those processes also need some progress, probably? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 4:30 ` Lars Ingebrigtsen @ 2016-03-01 9:00 ` Andreas Schwab 2016-03-01 14:12 ` Lars Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Andreas Schwab @ 2016-03-01 9:00 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: Alain Schneble, j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > And I would guess that the same is true for the async DNS resolution -- > those processes also need some progress, probably? getaddrinfo_a provides async notification (SIGEV_SIGNAL). Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 9:00 ` Andreas Schwab @ 2016-03-01 14:12 ` Lars Ingebrigtsen 2016-03-01 14:25 ` Alain Schneble 2016-03-01 15:53 ` Eli Zaretskii 0 siblings, 2 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-01 14:12 UTC (permalink / raw) To: Andreas Schwab; +Cc: Alain Schneble, j_l_domenech, 22789 Andreas Schwab <schwab@suse.de> writes: > Lars Ingebrigtsen <larsi@gnus.org> writes: > >> And I would guess that the same is true for the async DNS resolution -- >> those processes also need some progress, probably? > > getaddrinfo_a provides async notification (SIGEV_SIGNAL). But we're not using that. Anyway, all this async DNS/TLS stuff can probably be taken out of wait_reading_process_output completely, and just be run from the proposed new timer thing... That'll be cleaner, too. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 14:12 ` Lars Ingebrigtsen @ 2016-03-01 14:25 ` Alain Schneble 2016-03-01 14:43 ` Lars Ingebrigtsen 2016-03-01 15:59 ` Eli Zaretskii 2016-03-01 15:53 ` Eli Zaretskii 1 sibling, 2 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-01 14:25 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: Andreas Schwab, j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Andreas Schwab <schwab@suse.de> writes: > >> Lars Ingebrigtsen <larsi@gnus.org> writes: >> >>> And I would guess that the same is true for the async DNS resolution -- >>> those processes also need some progress, probably? >> >> getaddrinfo_a provides async notification (SIGEV_SIGNAL). > > But we're not using that. > > Anyway, all this async DNS/TLS stuff can probably be taken out of > wait_reading_process_output completely, and just be run from the > proposed new timer thing... That'll be cleaner, too. But we could probably make use of it and it would not require the timer at least for the async DNS resolution. It would not solve the TLS issue though. But maybe there is a similar notification for async sockets when they get connected? If there exists an approach that does not require timers, then I guess this would be the preferred one... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 14:25 ` Alain Schneble @ 2016-03-01 14:43 ` Lars Ingebrigtsen 2016-03-01 15:59 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-01 14:43 UTC (permalink / raw) To: Alain Schneble; +Cc: Andreas Schwab, j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > But we could probably make use of it and it would not require the timer > at least for the async DNS resolution. It seemed to me (after Googling the issue for a few minutes) before implementing the getaddrinfo_a stuff that getting signal delivery to work across all architectures would be challenging. If I was wrong about that, then we should perhaps look into doing this with signals. (Remember, we're going to drop the actual glibc getaddrinfo_a usage and use a clone from gnulib that's going to work on all platforms...) > It would not solve the TLS issue though. But maybe there is a similar > notification for async sockets when they get connected? If there > exists an approach that does not require timers, then I guess this > would be the preferred one... I haven't looked into that. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 14:25 ` Alain Schneble 2016-03-01 14:43 ` Lars Ingebrigtsen @ 2016-03-01 15:59 ` Eli Zaretskii 2016-03-01 16:19 ` Alain Schneble 2016-03-01 16:33 ` Andreas Schwab 1 sibling, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 15:59 UTC (permalink / raw) To: Alain Schneble; +Cc: schwab, larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > Date: Tue, 1 Mar 2016 15:25:53 +0100 > Cc: Andreas Schwab <schwab@suse.de>, j_l_domenech@yahoo.com, > 22789@debbugs.gnu.org > > > Andreas Schwab <schwab@suse.de> writes: > > > >> Lars Ingebrigtsen <larsi@gnus.org> writes: > >> > >>> And I would guess that the same is true for the async DNS resolution -- > >>> those processes also need some progress, probably? > >> > >> getaddrinfo_a provides async notification (SIGEV_SIGNAL). > > > > But we're not using that. > > > > Anyway, all this async DNS/TLS stuff can probably be taken out of > > wait_reading_process_output completely, and just be run from the > > proposed new timer thing... That'll be cleaner, too. > > But we could probably make use of it and it would not require the timer > at least for the async DNS resolution. It would not solve the TLS issue > though. But maybe there is a similar notification for async sockets > when they get connected? How do you envision we should make use of these notifications through signals? We try very hard not to do anything non-trivial in a signal handler, except setting a flag that is then tested in due time. If that is what you had in mind, then you will face the same problems with testing the flag as we face now: if the loop in wait_reading_process_output is stuck in a call to 'pselect' with a large timeout, Emacs might not be able to test the flag until the timeout ends. > If there exists an approach that does not require timers, then I > guess this would be the preferred one... I think I suggested one such way. 
In a nutshell, it does the same to the loop as a timer would, but without actually running a timer. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 15:59 ` Eli Zaretskii @ 2016-03-01 16:19 ` Alain Schneble 2016-03-01 17:00 ` Eli Zaretskii 2016-03-01 16:33 ` Andreas Schwab 1 sibling, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-01 16:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> Date: Tue, 1 Mar 2016 15:25:53 +0100 >> Cc: Andreas Schwab <schwab@suse.de>, j_l_domenech@yahoo.com, >> 22789@debbugs.gnu.org >> >> But we could probably make use of it and it would not require the timer >> at least for the async DNS resolution. It would not solve the TLS issue >> though. But maybe there is a similar notification for async sockets >> when they get connected? > > How do you envision we should make use of these notifications through > signals? We try very hard not to do anything non-trivial in a signal > handler, except setting a flag that is then tested in due time. If > that is what you had in mind, then you will face the same problems > with testing the flag as we face now: if the loop in > wait_reading_process_output is stuck in a call to 'pselect' with a > large timeout, Emacs might not be able to test the flag until the > timeout ends. What you describe above is what I had thought of. Regarding large timeout in pselect: isn't that what the sigmask to pselect is for? To wait for a signal in addition to a file descriptor? Or am I misunderstanding something? >> If there exists an approach that does not require timers, then I >> guess this would be the preferred one... > > I think I suggested one such way. In a nutshell, it does the same to > the loop as a timer would, but without actually running a timer. That sounds like a better idea than having a separate timer... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 16:19 ` Alain Schneble @ 2016-03-01 17:00 ` Eli Zaretskii 2016-03-01 17:09 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 17:00 UTC (permalink / raw) To: Alain Schneble; +Cc: schwab, larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <schwab@suse.de>, <larsi@gnus.org>, <j_l_domenech@yahoo.com>, > <22789@debbugs.gnu.org> > Date: Tue, 1 Mar 2016 17:19:00 +0100 > > >> But we could probably make use of it and it would not require the timer > >> at least for the async DNS resolution. It would not solve the TLS issue > >> though. But maybe there is a similar notification for async sockets > >> when they get connected? > > > > How do you envision we should make use of these notifications through > > signals? We try very hard not to do anything non-trivial in a signal > > handler, except setting a flag that is then tested in due time. If > > that is what you had in mind, then you will face the same problems > > with testing the flag as we face now: if the loop in > > wait_reading_process_output is stuck in a call to 'pselect' with a > > large timeout, Emacs might not be able to test the flag until the > > timeout ends. > > What you describe above is what I had thought of. Regarding large > timeout in pselect: isn't that what the sigmask to pselect is for? To > wait for a signal in addition to a file descriptor? Or am I > misunderstanding something? No, it's me: I forgot that a signal will interrupt 'pselect', which will return with EINTR, and that is already handled. So yes, if we have a signal that is delivered from one of these handshakes, it will cause the loop to run again. ^ permalink raw reply [flat|nested] 124+ messages in thread
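[Editorial illustration, not part of the thread.] The behaviour discussed above (a handler that only sets a flag, and a signal that makes 'pselect' return with EINTR so the loop can run again) can be sketched in isolation. This is a minimal sketch for POSIX hosts only, not the Emacs code: the names progress_pending, note_progress and wait_interrupted_by_signal are hypothetical, and SIGALRM from a one-shot timer merely stands in for whatever notification a real DNS lookup or handshake would deliver. The signal is blocked before the wait and unblocked only through the mask passed to 'pselect', so it cannot slip in before the wait starts.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical flag: set by the handler, tested by the waiting loop.  */
static volatile sig_atomic_t progress_pending;

/* The handler does nothing non-trivial; it only records the event.  */
static void
note_progress (int signo)
{
  (void) signo;
  progress_pending = 1;
}

/* Sleep in pselect for up to TIMEOUT_SECS with no descriptors, while a
   one-shot timer delivers SIGALRM after DELAY_USECS (must be > 0).
   Return 1 if pselect was interrupted with EINTR and the flag was set,
   0 otherwise.  */
int
wait_interrupted_by_signal (long delay_usecs, long timeout_secs)
{
  struct sigaction sa, old_sa;
  sigset_t block, saved, during_wait;
  struct itimerval timer;
  struct timespec timeout;
  int rc, saved_errno;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = note_progress;   /* sa_flags == 0: no SA_RESTART.  */
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGALRM, &sa, &old_sa) != 0)
    return 0;

  /* Block SIGALRM so it cannot arrive before pselect starts waiting.  */
  sigemptyset (&block);
  sigaddset (&block, SIGALRM);
  sigprocmask (SIG_BLOCK, &block, &saved);
  during_wait = saved;
  sigdelset (&during_wait, SIGALRM);

  progress_pending = 0;
  memset (&timer, 0, sizeof timer);
  timer.it_value.tv_sec = delay_usecs / 1000000;
  timer.it_value.tv_usec = delay_usecs % 1000000;
  setitimer (ITIMER_REAL, &timer, NULL);

  /* pselect atomically installs DURING_WAIT, so the signal is
     delivered only while we are actually waiting.  */
  timeout.tv_sec = timeout_secs;
  timeout.tv_nsec = 0;
  rc = pselect (0, NULL, NULL, NULL, &timeout, &during_wait);
  saved_errno = errno;

  /* Cancel the timer, then restore the mask and the old handler.  */
  memset (&timer, 0, sizeof timer);
  setitimer (ITIMER_REAL, &timer, NULL);
  sigprocmask (SIG_SETMASK, &saved, NULL);
  sigaction (SIGALRM, &old_sa, NULL);

  return rc == -1 && saved_errno == EINTR && progress_pending == 1;
}
```

Calling wait_interrupted_by_signal (50000, 5) should come back after roughly 50 ms instead of 5 seconds, which is the point made above: a long 'pselect' timeout does not delay the reaction to a signal. The sketch says nothing about the w32 emulation of 'pselect'.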
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 17:00 ` Eli Zaretskii @ 2016-03-01 17:09 ` Alain Schneble 2016-03-01 17:22 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-01 17:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: > So yes, if we have a signal that is delivered from one of these > handshakes, it will cause the loop to run again. I guess if we do a fcntl(socket, F_SETFL, O_ASYNC), we may get a notification if the socket has been connected. This could be the trigger for the first TLS handshake try. Subsequent tries, if needed, somehow have to be rescheduled, for example with just another round triggered by a short pselect timeout, like you proposed. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 17:09 ` Alain Schneble @ 2016-03-01 17:22 ` Eli Zaretskii 2016-03-01 17:55 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 17:22 UTC (permalink / raw) To: Alain Schneble; +Cc: schwab, larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <schwab@suse.de>, <larsi@gnus.org>, <j_l_domenech@yahoo.com>, > <22789@debbugs.gnu.org> > Date: Tue, 1 Mar 2016 18:09:40 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > So yes, if we have a signal that is delivered from one of these > > handshakes, it will cause the loop to run again. > > I guess if we do a fcntl(socket, F_SETFL, O_ASYNC), we may get a > notification if the socket has been connected. Doesn't the socket become read-ready in 'pselect' once it is connected? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 17:22 ` Eli Zaretskii @ 2016-03-01 17:55 ` Alain Schneble 2016-03-01 18:13 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-01 17:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <schwab@suse.de>, <larsi@gnus.org>, <j_l_domenech@yahoo.com>, >> <22789@debbugs.gnu.org> >> Date: Tue, 1 Mar 2016 18:09:40 +0100 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> > So yes, if we have a signal that is delivered from one of these >> > handshakes, it will cause the loop to run again. >> >> I guess if we do a fcntl(socket, F_SETFL, O_ASYNC), we may get a >> notification if the socket has been connected. > > Doesn't the socket become read-ready in 'pselect' once it is > connected? True, that may indeed be the case (or write-ready? I'm confused ;( ). ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 17:55 ` Alain Schneble @ 2016-03-01 18:13 ` Eli Zaretskii 0 siblings, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 18:13 UTC (permalink / raw) To: Alain Schneble; +Cc: schwab, larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <schwab@suse.de>, <larsi@gnus.org>, <j_l_domenech@yahoo.com>, > <22789@debbugs.gnu.org> > Date: Tue, 1 Mar 2016 18:55:11 +0100 > > > Doesn't the socket become read-ready in 'pselect' once it is > > connected? > > True, that may indeed be the case (or write-ready? I'm confused ;( ). I think you are right, and it's write-ready. At least that's what the w32 emulation does. ^ permalink raw reply [flat|nested] 124+ messages in thread
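[Editorial illustration, not part of the thread.] On POSIX hosts a non-blocking connect that is still in progress reports completion by making the socket write-ready, and the outcome is then read from SO_ERROR. The following is a minimal sketch of that sequence against a throwaway listener on the loopback interface. The function name nonblocking_connect_demo and the 5-second timeout are assumptions made for the illustration; this is not how Emacs or its w32 emulation implements the wait.

```c
#define _XOPEN_SOURCE 700
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Connect a non-blocking socket to a temporary loopback listener and
   wait for the connection with pselect.  Return 0 if the socket became
   write-ready and SO_ERROR reports success, -1 otherwise.  */
int
nonblocking_connect_demo (void)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof addr;
  struct timespec timeout = { 5, 0 };
  fd_set writable;
  int listener, client = -1, flags, err = 0, result = -1;
  socklen_t errlen = sizeof err;

  listener = socket (AF_INET, SOCK_STREAM, 0);
  if (listener < 0)
    return -1;

  /* Bind to 127.0.0.1 and let the kernel choose a free port.  */
  memset (&addr, 0, sizeof addr);
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = 0;
  if (bind (listener, (struct sockaddr *) &addr, sizeof addr) != 0
      || listen (listener, 1) != 0
      || getsockname (listener, (struct sockaddr *) &addr, &len) != 0)
    goto done;

  client = socket (AF_INET, SOCK_STREAM, 0);
  if (client < 0)
    goto done;
  flags = fcntl (client, F_GETFL, 0);
  if (flags < 0 || fcntl (client, F_SETFL, flags | O_NONBLOCK) != 0)
    goto done;

  /* A non-blocking connect normally returns -1 with EINPROGRESS; on
     loopback it may also succeed immediately.  */
  if (connect (client, (struct sockaddr *) &addr, sizeof addr) != 0
      && errno != EINPROGRESS)
    goto done;

  /* Completion is signalled through the write set, not the read set.  */
  FD_ZERO (&writable);
  FD_SET (client, &writable);
  if (pselect (client + 1, NULL, &writable, NULL, &timeout, NULL) != 1
      || !FD_ISSET (client, &writable))
    goto done;

  /* Write-ready alone does not mean success; check SO_ERROR.  */
  if (getsockopt (client, SOL_SOCKET, SO_ERROR, &err, &errlen) == 0
      && err == 0)
    result = 0;

 done:
  if (client >= 0)
    close (client);
  close (listener);
  return result;
}
```

In terms of the discussion, the moment 'pselect' reports the descriptor as writable would be a natural point for a first handshake attempt, without any extra signal.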
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 15:59 ` Eli Zaretskii 2016-03-01 16:19 ` Alain Schneble @ 2016-03-01 16:33 ` Andreas Schwab 1 sibling, 0 replies; 124+ messages in thread From: Andreas Schwab @ 2016-03-01 16:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, Alain Schneble, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: > How do you envision we should make use of these notifications through > signals? I think that could be as simple as interrupting the pselect call by the signal. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 14:12 ` Lars Ingebrigtsen 2016-03-01 14:25 ` Alain Schneble @ 2016-03-01 15:53 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 15:53 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: schwab, j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Eli Zaretskii <eliz@gnu.org>, j_l_domenech@yahoo.com, Alain Schneble <a.s@realize.ch>, 22789@debbugs.gnu.org > Date: Wed, 02 Mar 2016 01:12:21 +1100 > > Anyway, all this async DNS/TLS stuff can probably be taken out of > wait_reading_process_output completely, and just be run from the > proposed new timer thing... That'll be cleaner, too. Not sure I understand: you want to launch a timer from C? And you want its function to do nothing but sit-for, or something similar? That doesn't sound very clean to me, but maybe I'm missing something. I think adding simple logic in the wait_reading_process_output loop to reduce the timeout for 'pselect', as I suggested in my previous message to this thread, is simpler and cleaner. ^ permalink raw reply [flat|nested] 124+ messages in thread
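[Editorial illustration, not part of the thread.] The message with the original suggestion is not reproduced in this part of the thread, so the following is only one possible reading of "reduce the timeout for 'pselect'": while any handshake is still pending, cap the time the loop is allowed to sleep, so that it comes around again soon even if no descriptor becomes ready. The names clamp_wait_timeout and HANDSHAKE_POLL_NSEC, and the 0.1 second value, are hypothetical and do not come from the Emacs sources.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical cap on the sleep while a TLS handshake is pending:
   0.1 s, an illustrative value only.  */
#define HANDSHAKE_POLL_NSEC 100000000L

/* Return the timeout the wait loop should pass to pselect: REQUESTED
   when nothing is pending, otherwise the smaller of REQUESTED and the
   cap.  */
struct timespec
clamp_wait_timeout (struct timespec requested, bool handshake_pending)
{
  struct timespec cap = { 0, HANDSHAKE_POLL_NSEC };

  if (!handshake_pending)
    return requested;
  if (requested.tv_sec > cap.tv_sec
      || (requested.tv_sec == cap.tv_sec
          && requested.tv_nsec > cap.tv_nsec))
    return cap;
  return requested;
}
```

With such a cap, a 30 second wait turns into repeated 0.1 second waits only for as long as a handshake is outstanding, which gives the effect of a timer without actually running one.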
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 3:41 ` Eli Zaretskii 2016-03-01 4:29 ` Lars Ingebrigtsen 2016-03-01 4:30 ` Lars Ingebrigtsen @ 2016-03-01 15:36 ` Alain Schneble 2016-03-01 16:05 ` Eli Zaretskii 2 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-01 15:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Tue, 1 Mar 2016 00:13:04 +0100 >> >> > There are also some disturbing signs in retrying GnuTLS handshake from >> > wait_reading_process_output -- I'm not sure the way that function >> > works, at least on Windows, is according to your expectations. The >> > while loop there doesn't really spin all the time, did you know that? >> > Can you describe what you think should happen there for a connection >> > whose GnuTLS handshake is not yet complete? >> >> Hmm. What I observed is that it stops if the Emacs window loses its >> focus (and no other window messages are dispatched to the window). If >> it has focus, it gets called at least once per second. > > Disable cursor blinking and try again. Yes, you are right. The frequency at which it enters the loop drops radically once the cursor doesn't blink. There is one strange thing I'm observing though. Even if I constantly hover over the window with the mouse, most of the time not all images get downloaded. The reason, as mentioned in a previous post, is that some of the gnutls_try_handshake calls return -10 GNUTLS_E_INVALID_SESSION. So there must still be another issue, because with the mouse hovering, it seems to me that it enters the loop frequently enough to do TRT. (Strangely enough, if I postpone gnutls_try_handshake until after the socket is connected, it seems to work reliably in conjunction with the mouse hovering...)
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 15:36 ` Alain Schneble @ 2016-03-01 16:05 ` Eli Zaretskii 2016-03-01 16:25 ` Alain Schneble 2016-03-04 8:56 ` Eli Zaretskii 0 siblings, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 16:05 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Tue, 1 Mar 2016 16:36:03 +0100 > > There is one strange thing I'm observing though. Even if I constantly > hover over the window with the mouse, most of the time not all images > get downloaded. The reason, as mentioned in a previous post, is that > some of the gnutls_try_handshake calls return -10 > GNUTLS_E_INVALID_SESSION. So there must still be another issue, because > with the mouse hovering, it seems to me that it enters the loop > frequently enough to do TRT. I have no doubt there are w32 aspects to this problem. But I see no reason to try debugging that as long as this machinery doesn't work well on Posix hosts, because we will bump into issues that have nothing to do with how w32 emulates Posix functionality. Especially if we are now seriously discussing radical changes in the design (like running the handshake from a timer or suchlike), which, if implemented, will change the code significantly, and will no doubt affect what the w32 port needs (or doesn't need) to do to support that. IOW, let's return to the w32-specific issues when the dust settles on the Posix code. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 16:05 ` Eli Zaretskii @ 2016-03-01 16:25 ` Alain Schneble 2016-03-04 8:56 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-01 16:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Tue, 1 Mar 2016 16:36:03 +0100 >> >> There is one strange thing I'm observing though. Even if I constantly >> hover over the window with the mouse, most of the time not all images >> get downloaded. The reason, as mentioned in a previous post, is that >> some of the gnutls_try_handshake calls return -10 >> GNUTLS_E_INVALID_SESSION. So there must still be another issue, because >> with the mouse hovering, it seems to me that it enters the loop >> frequently enough to do TRT. > > I have no doubt there are w32 aspects to this problem. But I see no > reason to try debugging that as long as this machinery doesn't work > well on Posix hosts, because we will bump into issues that have > nothing to do with how w32 emulates Posix functionality. Especially > if we are now seriously discussing radical changes in the design (like > running the handshake from a timer or suchlike), which, if > implemented, will change the code significantly, and will no doubt > affect what the w32 port needs (or doesn't need) to do to support > that. > > IOW, let's return to the w32-specific issues when the dust settles on > the Posix code. Ok, that sounds wise. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 16:05 ` Eli Zaretskii 2016-03-01 16:25 ` Alain Schneble @ 2016-03-04 8:56 ` Eli Zaretskii 2016-03-04 16:55 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-04 8:56 UTC (permalink / raw) To: a.s, larsi; +Cc: j_l_domenech, 22789 > Date: Tue, 01 Mar 2016 18:05:36 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: larsi@gnus.org, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > > IOW, let's return to the w32-specific issues when the dust settles on > the Posix code. It sounds like that part happened already, so do you still see any w32-specific issues with this? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 8:56 ` Eli Zaretskii @ 2016-03-04 16:55 ` Alain Schneble 2016-03-04 21:36 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-04 16:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> Date: Tue, 01 Mar 2016 18:05:36 +0200 >> From: Eli Zaretskii <eliz@gnu.org> >> Cc: larsi@gnus.org, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org >> >> IOW, let's return to the w32-specific issues when the dust settles on >> the Posix code. > > It sounds like that part happened already, so do you still see any > w32-specific issues with this? Sorry for the delay. It seems like there are still some issues, at least on my system and even without any debugger attached. I'm currently trying to find the cause... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 16:55 ` Alain Schneble @ 2016-03-04 21:36 ` Alain Schneble 2016-03-04 22:33 ` Alain Schneble ` (2 more replies) 0 siblings, 3 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-04 21:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 [-- Attachment #1: Type: text/plain, Size: 1876 bytes --] Alain Schneble <a.s@realize.ch> writes: > Eli Zaretskii <eliz@gnu.org> writes: > >>> Date: Tue, 01 Mar 2016 18:05:36 +0200 >>> From: Eli Zaretskii <eliz@gnu.org> >>> Cc: larsi@gnus.org, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org >>> >>> IOW, let's return to the w32-specific issues when the dust settles on >>> the Posix code. >> >> It sounds like that part happened already, so do you still see any >> w32-specific issues with this? > > Sorry for the delay. It seems like there are still some issues, at > least on my system and even without any debugger attached. I'm > currently trying to find the cause... I have the impression that GnuTLS doesn't like it too much if we start retrying the handshake many times before the socket is connected. At least on MS-Windows. In nearly all of the cases of loading websites with around 20 images, I observe arbitrary failures of gnutls_try_handshake which usually end up with -10 GNUTLS_E_INVALID_SESSION. I believe this because the following patch solves the issue on my MS-Windows system: Postponing the handshake until after the socket is connected. Still, I must be honest: I'm in a kind of a trial-and-error mode. I do not really understand all the aspects of the current implementation. Anyway, I think a change in that direction would probably be a good thing. Do you agree? It eliminates all the handshake-retries that would otherwise happen before the socket is connected. I wonder if the issues that you observed with gdb attached would go away with this patch as well... You had these issues under GNU/Linux, right? 
It's a bit embarrassing, but I did not yet have time to learn how to use gdb to debug Emacs. (But it's on my todo list.) Otherwise I would have tried it out quickly. BTW, `libgnutls-version' evaluates to 30408 on my MS-Windows system. And here is the intermediate-and-not-cleaned-up patch: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Patch --] [-- Type: text/x-patch, Size: 2796 bytes --] From cefaba56a33046b588eab81b3ca58224830a44f9 Mon Sep 17 00:00:00 2001 From: Alain Schneble <a.s@realize.ch> Date: Fri, 4 Mar 2016 21:51:31 +0100 Subject: [PATCH] Wait for GnuTLS handshake until socket is connected * src/gnutls.c (emacs_gnutls_handshake, gnutls_try_handshake): Skip GnuTLS handshake when gnutls_boot is called on async socket (aka non blocking client). * src/process.c (connect_network_socket, wait_reading_process_output): Proceed with GnuTLS handshake only after async socket has been connected. --- src/gnutls.c | 11 ++++++++--- src/process.c | 9 ++++++--- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/src/gnutls.c b/src/gnutls.c index 988c010..ddf4648 100644 --- a/src/gnutls.c +++ b/src/gnutls.c @@ -403,9 +403,6 @@ gnutls_try_handshake (struct Lisp_Process *proc) gnutls_session_t state = proc->gnutls_state; int ret; - if (proc->is_non_blocking_client) - proc->gnutls_p = true; - do { ret = gnutls_handshake (state); @@ -426,6 +423,8 @@ gnutls_try_handshake (struct Lisp_Process *proc) { /* check_memory_full (gnutls_alert_send_appropriate (state, ret)); */ } + + //printf ("gnutls_try_handshake: proc fd=%d, ret=%d\n", proc->infd, ret); return ret; } @@ -474,6 +473,12 @@ emacs_gnutls_handshake (struct Lisp_Process *proc) proc->gnutls_initstage = GNUTLS_STAGE_TRANSPORT_POINTERS_SET; } + if (proc->is_non_blocking_client) + { + proc->gnutls_p = true; + return GNUTLS_E_AGAIN; + } + return gnutls_try_handshake (proc); } diff --git a/src/process.c b/src/process.c index 4359f68..bd1c45f 100644 --- a/src/process.c +++ b/src/process.c
@@ -3415,7 +3415,8 @@ connect_network_socket (Lisp_Object proc, Lisp_Object ip_addresses) if (p->gnutls_initstage == GNUTLS_STAGE_READY) /* Run sentinels, etc. */ finish_after_tls_connection (proc); - else if (p->gnutls_initstage != GNUTLS_STAGE_HANDSHAKE_TRIED) + else if ((! p->is_non_blocking_client && p->gnutls_initstage != GNUTLS_STAGE_HANDSHAKE_TRIED) || + (p->is_non_blocking_client && p->gnutls_initstage != GNUTLS_STAGE_TRANSPORT_POINTERS_SET)) { deactivate_process (proc); if (NILP (boot)) @@ -4950,8 +4951,10 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd, #endif #ifdef HAVE_GNUTLS /* Continue TLS negotiation. */ - if (p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED - && p->is_non_blocking_client) + if ((p->gnutls_initstage == GNUTLS_STAGE_TRANSPORT_POINTERS_SET || + p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED) + && p->is_non_blocking_client + && (! FD_ISSET (p->outfd, &connect_wait_mask))) { gnutls_try_handshake (p); p->gnutls_handshakes_tried++; -- 2.6.2.windows.1 ^ permalink raw reply related [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 21:36 ` Alain Schneble @ 2016-03-04 22:33 ` Alain Schneble 2016-03-05 8:23 ` Eli Zaretskii 2016-03-05 8:46 ` Lars Magne Ingebrigtsen 2 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-04 22:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 [-- Attachment #1: Type: text/plain, Size: 262 bytes --] Alain Schneble <a.s@realize.ch> writes: > Alain Schneble <a.s@realize.ch> writes: > > And here is the intermediate-and-not-cleaned-up patch: Sorry, the former patch did break the new retry_for_async flag. Here is a corrected one (but still experimental...): [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Patch --] [-- Type: text/x-patch, Size: 2649 bytes --] From eedbcea0224febf579bedb44a9bb2b280abcdc4a Mon Sep 17 00:00:00 2001 From: Alain Schneble <a.s@realize.ch> Date: Fri, 4 Mar 2016 23:17:07 +0100 Subject: [PATCH] Wait for GnuTLS handshake until socket is connected * src/gnutls.c (emacs_gnutls_handshake, gnutls_try_handshake): Skip GnuTLS handshake when gnutls_boot is called on async socket (aka non blocking client). * src/process.c (connect_network_socket, wait_reading_process_output): Proceed with GnuTLS handshake only after async socket has been connected. 
--- src/gnutls.c | 9 ++++++--- src/process.c | 13 +++++++++---- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/src/gnutls.c b/src/gnutls.c index 988c010..4962ecc 100644 --- a/src/gnutls.c +++ b/src/gnutls.c @@ -403,9 +403,6 @@ gnutls_try_handshake (struct Lisp_Process *proc) gnutls_session_t state = proc->gnutls_state; int ret; - if (proc->is_non_blocking_client) - proc->gnutls_p = true; - do { ret = gnutls_handshake (state); @@ -474,6 +471,12 @@ emacs_gnutls_handshake (struct Lisp_Process *proc) proc->gnutls_initstage = GNUTLS_STAGE_TRANSPORT_POINTERS_SET; } + if (proc->is_non_blocking_client) + { + proc->gnutls_p = true; + return GNUTLS_E_AGAIN; + } + return gnutls_try_handshake (proc); } diff --git a/src/process.c b/src/process.c index 4359f68..d73586c 100644 --- a/src/process.c +++ b/src/process.c @@ -3415,7 +3415,8 @@ connect_network_socket (Lisp_Object proc, Lisp_Object ip_addresses) if (p->gnutls_initstage == GNUTLS_STAGE_READY) /* Run sentinels, etc. */ finish_after_tls_connection (proc); - else if (p->gnutls_initstage != GNUTLS_STAGE_HANDSHAKE_TRIED) + else if ((! p->is_non_blocking_client && p->gnutls_initstage != GNUTLS_STAGE_HANDSHAKE_TRIED) || + (p->is_non_blocking_client && p->gnutls_initstage != GNUTLS_STAGE_TRANSPORT_POINTERS_SET)) { deactivate_process (proc); if (NILP (boot)) @@ -4950,11 +4951,15 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd, #endif #ifdef HAVE_GNUTLS /* Continue TLS negotiation. */ - if (p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED + if ((p->gnutls_initstage == GNUTLS_STAGE_TRANSPORT_POINTERS_SET || + p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED) && p->is_non_blocking_client) { - gnutls_try_handshake (p); - p->gnutls_handshakes_tried++; + if (! 
FD_ISSET (p->outfd, &connect_wait_mask)) + { + gnutls_try_handshake (p); + p->gnutls_handshakes_tried++; + } if (p->gnutls_initstage == GNUTLS_STAGE_READY) { -- 2.6.2.windows.1 ^ permalink raw reply related [flat|nested] 124+ messages in thread
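The condition that the experimental patch above adds to wait_reading_process_output can be modelled in isolation. The following is only a sketch under assumptions: the enum and the function name are hypothetical stand-ins for the GNUTLS_STAGE_* values and the inline test in process.c, and `connect_pending` stands for "the output fd is still in connect_wait_mask".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GNUTLS_STAGE_* values named in the patch.
   Only their identity matters here, not their numeric values.  */
enum tls_stage
{
  STAGE_EMPTY,
  STAGE_TRANSPORT_POINTERS_SET,
  STAGE_HANDSHAKE_TRIED,
  STAGE_READY
};

/* Return true if the TLS handshake should be (re)tried now.
   This mirrors the patched condition: the stage must be one of the two
   "handshake not finished" stages, the process must be a non-blocking
   client, and the socket must no longer be waiting for connect.  */
bool
should_retry_handshake (enum tls_stage stage, bool non_blocking_client,
                        bool connect_pending)
{
  bool stage_allows = (stage == STAGE_TRANSPORT_POINTERS_SET
                       || stage == STAGE_HANDSHAKE_TRIED);

  return stage_allows && non_blocking_client && !connect_pending;
}
```

With this predicate, a process whose socket is still connecting is skipped on each pass and the handshake is first attempted once the connect has completed, which is the behaviour the commit message describes.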
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 21:36 ` Alain Schneble 2016-03-04 22:33 ` Alain Schneble @ 2016-03-05 8:23 ` Eli Zaretskii 2016-03-05 18:27 ` Alain Schneble 2016-03-05 8:46 ` Lars Magne Ingebrigtsen 2 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-05 8:23 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Fri, 4 Mar 2016 22:36:56 +0100 > > I have the impression that GnuTLS doesn't like it too much if we start > retrying the handshake many times before the socket is connected. At > least on MS-Windows. In nearly all of the cases of loading websites > with around 20 images, I observe arbitrary failures of > gnutls_try_handshake which usually end up with -10 > GNUTLS_E_INVALID_SESSION. I think this warrants a question to GnuTLS developers. We need to understand this before we devise a solution. What bothers me is the "many times" part -- how many is "too much", and why? Do you see any difference in behavior of sys_write during those many attempts as opposed to the first few? Also, what URL do you use for testing this? > I believe this because the following patch solves the issue on my > MS-Windows system: Postponing the handshake until after the socket is > connected. Still, I must be honest: I'm in a kind of a trial-and-error > mode. I do not really understand all the aspects of the current > implementation. Feel free to ask, I think I can answer any question about the Emacs part of this, but probably not about the GnuTLS part -- those we should ask on the GnuTLS mailing list. > Anyway, I think a change in that direction would > probably be a good thing. Do you agree? It eliminates all the > handshake-retries that would otherwise happen before the socket is > connected. Why is it needed only on Windows? 
Why does it matter what reason causes the failure of a handshake? We need to understand these aspects before we consider the solutions. > BTW, `libgnutls-version' evaluates to 30408 on my MS-Windows. It's 30311 here, but I'm not sure this is a factor. We are talking about basic functionality here. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
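For readers comparing the two `libgnutls-version' values quoted above, here is a small decoding sketch. It assumes the integer is encoded as major * 10000 + minor * 100 + patch, which is my understanding of how Emacs computes that variable; the struct and function names are hypothetical.

```c
/* Hypothetical container for a decoded GnuTLS version.  */
struct gnutls_version
{
  int major;
  int minor;
  int patch;
};

/* Decode an integer assumed to be major * 10000 + minor * 100 + patch.  */
struct gnutls_version
decode_libgnutls_version (int encoded)
{
  struct gnutls_version v;

  v.major = encoded / 10000;
  v.minor = (encoded / 100) % 100;
  v.patch = encoded % 100;
  return v;
}
```

Under that assumption, 30408 decodes to 3.4.8 and 30311 to 3.3.11, so the two machines in this exchange would be running different minor series of GnuTLS.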
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-05 8:23 ` Eli Zaretskii @ 2016-03-05 18:27 ` Alain Schneble 2016-03-05 19:21 ` Eli Zaretskii 2016-03-06 9:31 ` Lars Magne Ingebrigtsen 0 siblings, 2 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-05 18:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Fri, 4 Mar 2016 22:36:56 +0100 >> >> I have the impression that GnuTLS doesn't like it too much if we start >> retrying the handshake many times before the socket is connected. At >> least on MS-Windows. In nearly all of the cases of loading websites >> with around 20 images, I observe arbitrary failures of >> gnutls_try_handshake which usually end up with -10 >> GNUTLS_E_INVALID_SESSION. > > I think this warrants a question to GnuTLS developers. We need to > understand this before we devise a solution. What bothers me is the > "many times" part -- how many is "too much", and why? Yes, of course, I agree we have to understand it. I just thought that maybe the patch would help us in narrowing down the issue further... And maybe we should wait until somebody else sees the same effects I described on MS-Windows before asking GnuTLS developers? Did you see these problems as well? I should have written "multiple times" not "many times". I just had a case where for four network processes (get image requests) gnutls_try_handshake returned for each of these processes -28 GNUTLS_E_AGAIN three times in a row, followed by a single -110 GNUTLS_E_PREMATURE_TERMINATION and then repeatedly by -10 GNUTLS_E_INVALID_SESSION. > Do you see any > difference in behavior of sys_write during those many attempts as > opposed to the first few? Good point. I'll have to analyze this. > Also, what URL do you use for testing this? 
Because I'm on MS-Windows, I used https://www.microsoft.com :) (it gets redirected to https://www.microsoft.com/de-ch for me). I currently only have access to a WLAN not under my control. But the same website loads reliably and pretty fast in other web browsers with the same connection. >> I believe this because the following patch solves the issue on my >> MS-Windows system: Postponing the handshake until after the socket is >> connected. Still, I must be honest: I'm in a kind of a trial-and-error >> mode. I do not really understand all the aspects of the current >> implementation. > > Feel free to ask, I think I can answer any question about the Emacs > part of this, but probably not about the GnuTLS part -- those we > should ask on the GnuTLS mailing list. Ok, thank you! I'll be happy to ask, but I would like to spend some more time re-reading the code and clearing my mind first. >> Anyway, I think a change in that direction would >> probably be a good thing. Do you agree? It eliminates all the >> handshake-retries that would otherwise happen before the socket is >> connected. > > Why is it needed only on Windows? Why does it matter what reason > causes the failure of a handshake? We need to understand these > aspects before we consider the solutions. I'm currently unable to answer these questions. I see that there are many differences between Emacs' "platform adaptation layer" for w32 and the code paths for GNU/Linux. And I cannot see if and how the patch I sent could be related to those differences. So I will follow your advice and try to understand the /issue/ first. >> BTW, `libgnutls-version' evaluates to 30408 on my MS-Windows. > > It's 30311 here, but I'm not sure this is a factor. We are talking > about basic functionality here. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-05 18:27 ` Alain Schneble @ 2016-03-05 19:21 ` Eli Zaretskii 2016-03-06 22:45 ` Alain Schneble 2016-03-06 9:31 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-05 19:21 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Sat, 5 Mar 2016 19:27:09 +0100 > > I should have written "multiple times" not "many times". I just had a > case where for four network processes (get image requests) > gnutls_try_handshake returned for each of these processes -28 > GNUTLS_E_AGAIN three times in a row, followed by a single -110 > GNUTLS_E_PREMATURE_TERMINATION and then repeatedly by -10 > GNUTLS_E_INVALID_SESSION. Sounds like we are terminating the connection while the handshake is still in progress? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-05 19:21 ` Eli Zaretskii @ 2016-03-06 22:45 ` Alain Schneble 2016-03-06 23:24 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-06 22:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Sat, 5 Mar 2016 19:27:09 +0100 >> >> I should have written "multiple times" not "many times". I just had a >> case where for four network processes (get image requests) >> gnutls_try_handshake returned for each of these processes -28 >> GNUTLS_E_AGAIN three times in a row, followed by a single -110 >> GNUTLS_E_PREMATURE_TERMINATION and then repeatedly by -10 >> GNUTLS_E_INVALID_SESSION. > > Sounds like we are terminating the connection while the handshake is > still in progress? I'm pretty sure we are not terminating the connection prematurely. After some more debugging, I found a way to reliably solve the issue. It seems like switching the socket to blocking before calling send and back to non-blocking while the socket is not connected in sys_write somehow makes the socket unreliable. I'll send a patch for further discussions shortly. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 22:45 ` Alain Schneble @ 2016-03-06 23:24 ` Alain Schneble 2016-03-07 8:49 ` Alain Schneble 2016-03-07 16:07 ` Eli Zaretskii 0 siblings, 2 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-06 23:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 [-- Attachment #1: Type: text/plain, Size: 503 bytes --] Alain Schneble <a.s@realize.ch> writes: > I'll send a patch for further discussions shortly. And here it is. The fix is quite simple. It ensures that sys_write exits before touching the socket if it is not connected yet. Unfortunately I didn't find any documentation on winsock ioctlsocket that would prove that this is indeed required. But it does not seem wrong to me anyway. (I'll try to search the winsock documentation tomorrow to find some hints that lead in this direction, or maybe you know?) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Patch --] [-- Type: text/x-patch, Size: 1140 bytes --] From 01a475ea41265929951e6d14f6dd216671b63331 Mon Sep 17 00:00:00 2001 From: Alain Schneble <a.s@realize.ch> Date: Mon, 7 Mar 2016 00:00:57 +0100 Subject: [PATCH] Solve async GnuTLS handshake issue on w32 * src/w32.c (sys_write): For non-blocking sockets, return immediately with EWOULDBLOCK. This ensures we do not temporarily turn the socket into blocking mode for the pfn_send call if the socket is not (yet) connected. It turned out that doing so causes arbitrary GnuTLS handshake failures on MS-Windows. 
(bug#22789) --- src/w32.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/w32.c b/src/w32.c index 998f696..ee8cf6c 100644 --- a/src/w32.c +++ b/src/w32.c @@ -8647,6 +8647,12 @@ sys_write (int fd, const void * buffer, unsigned int count) unsigned long nblock = 0; if (winsock_lib == NULL) emacs_abort (); + if ((fd_info[fd].flags & FILE_CONNECT) != 0) + { + errno = EWOULDBLOCK; + return -1; + } + /* TODO: implement select() properly so non-blocking I/O works. */ /* For now, make sure the write blocks. */ if (fd_info[fd].flags & FILE_NDELAY) -- 2.6.2.windows.1 ^ permalink raw reply related [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 23:24 ` Alain Schneble @ 2016-03-07 8:49 ` Alain Schneble 2016-03-07 16:08 ` Eli Zaretskii 2016-03-07 16:07 ` Eli Zaretskii 1 sibling, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-07 8:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > (I'll try to search the winsock documentation tomorrow to > find some hints that lead in this direction, or maybe you know?) IIUC, this could be the reason why it fails: - The reader_thread in w32proc.c calls _sys_wait_connect which in turn does a WSAEventSelect. - The documentation says: https://msdn.microsoft.com/en-us/library/windows/desktop/ms738573(v=vs.85).aspx The WSAAsyncSelect and WSAEventSelect functions automatically set a socket to nonblocking mode. If WSAAsyncSelect or WSAEventSelect has been issued on a socket, then any attempt to use ioctlsocket to set the socket back to blocking mode will fail with WSAEINVAL. - So there /is/ a dependency between these calls. Unfortunately, I couldn't see ioctlsocket return WSAEINVAL in the scenarios I tried. Could it be a multi-threading issue then? Multiple threads accessing the same socket... I don't see how both threads are synchronized. The patch I sent would synchronize them through the FILE_CONNECT flag, I think. Did I miss something? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 8:49 ` Alain Schneble @ 2016-03-07 16:08 ` Eli Zaretskii 2016-03-07 17:20 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-07 16:08 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Mon, 7 Mar 2016 09:49:00 +0100 > > > (I'll try to search the wisock documentation tomorrow to > > find some hints that lead in this direction, or maybe you know?) > > IIUC, this could be the reason why it fails: > > - The reader_thread in w32proc.c calls _sys_wait_connect which in turn > does a WSAEventSelect. > > - The documentation says: > > https://msdn.microsoft.com/en-us/library/windows/desktop/ms738573(v=vs.85).aspx > > The WSAAsyncSelect and WSAEventSelect functions automatically set a > socket to nonblocking mode. If WSAAsyncSelect or WSAEventSelect has > been issued on a socket, then any attempt to use ioctlsocket to set > the socket back to blocking mode will fail with WSAEINVAL. > > - So there /is/ a dependency between these calls. Unfortunately, I > couldn't see that ioctlsocket returns with WSAEINVAL in the scenarios > I tried. Yes, I know about this gotcha. It's just that it never produced any problems, and I never was able to see the ioctlsocket call fail with WSAEINVAL. > Could it be a multi-threading issue then? Multiple threads > accessing the same socket... Not sure I follow -- are you trying to explain why ioctlsocket doesn't fail as expected, or are you trying to explain some other phenomenon? > I don't see how both threads are synchronized. The synchronization is between reader_thread and sys_select. The latter runs in the main (a.k.a. "Lisp") thread, the same thread where sys_write is called. > The patch I sent would synchronize them through the FILE_CONNECT > flag, I think. 
Did I miss something? Well, that's not really "thread synchronization", but see my comments and questions there. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 16:08 ` Eli Zaretskii @ 2016-03-07 17:20 ` Alain Schneble 2016-03-07 17:33 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-07 17:20 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Mon, 7 Mar 2016 09:49:00 +0100 >> >> - So there /is/ a dependency between these calls. Unfortunately, I >> couldn't see that ioctlsocket returns with WSAEINVAL in the scenarios >> I tried. > > Yes, I know about this gotcha. It's just that it never produced any > problems, and I never was able to see the ioctlsocket call fail with > WSAEINVAL. As said, I didn't see WSAEINVAL either with the current implementation. >> Could it be a multi-threading issue then? Multiple threads >> accessing the same socket... > > Not sure I follow -- are you trying to explain why ioctlsocket doesn't > fail as expected, or are you trying to explain some other phenomenon? Both. 1) I expected it to fail (sometimes) but didn't see it. 2) The arbitrary gnutls_handshake failures I observed look as if the socket's state is getting corrupted. Do you agree that WSAEventSelect is called from the reader thread and the ioctlsocket is called from the main thread? And AFAIK, these functions are not thread safe... So I suspect that we ran into a multi-threading issue here, which corrupted the socket and may have led to the above-mentioned issues. >> I don't see how both threads are synchronized. > > The synchronization is between reader_thread and sys_select. The > latter runs in the main (a.k.a. "Lisp") thread, the same thread where > sys_write is called. This is how I understood it as well... >> The patch I sent would synchronize them through the FILE_CONNECT >> flag, I think. Did I miss something? 
> > Well, that's not really "thread synchronization", but see my comments > and questions there. Yes, you are right. I just meant that both threads no longer will call WSAEventSelect and ioctlsocket at the same time with the proposed patch. And this is a kind of "synchronization". ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 17:20 ` Alain Schneble @ 2016-03-07 17:33 ` Eli Zaretskii 2016-03-07 18:03 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-07 17:33 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Mon, 7 Mar 2016 18:20:20 +0100 > > >> Could it be a multi-threading issue then? Multiple threads > >> accessing the same socket... > > > > Not sure I follow -- are you trying to explain why ioctlsocket doesn't > > fail as expected, or are you trying to explain some other phenomenon? > > Both. 1) I expected it to (sometime) fail but didn't see it. 2) The > arbitrary gnutls_handshake failures I observed seem like the socket's > state is getting corrupted. I cannot argue with facts. I'm not sure the ioctlsocket calls are what causes the failures, but since you say the failures disappear when we don't, it's a fact that we should accept. > Do you agree that WSAEventSelect is called from the reader thread and > the ioctlsocket is called from the main thread? Yes, of course. > And AFAIK, these functions are not thread safe... I don't think this is true. AFAIK, the WSA* functions are all thread-safe. > So I suspect that we ran into a multi-threading issue here, which > corrupted the socket. And may have lead to the above mentioned > issues. It's possible. But if you see the problems solved after the change, I see no reason to continue arguing. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 17:33 ` Eli Zaretskii @ 2016-03-07 18:03 ` Alain Schneble 2016-03-07 18:10 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-07 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Mon, 7 Mar 2016 18:20:20 +0100 >> >> >> Could it be a multi-threading issue then? Multiple threads >> >> accessing the same socket... >> > >> > Not sure I follow -- are you trying to explain why ioctlsocket doesn't >> > fail as expected, or are you trying to explain some other phenomenon? >> >> Both. 1) I expected it to (sometime) fail but didn't see it. 2) The >> arbitrary gnutls_handshake failures I observed seem like the socket's >> state is getting corrupted. > > I cannot argue with facts. I'm not sure the ioctlsocket calls are > what causes the failures, but since you say the failures disappear > when we don't, it's a fact that we should accept. Neither do I. Yes, the failures disappear completely. >> Do you agree that WSAEventSelect is called from the reader thread and >> the ioctlsocket is called from the main thread? > > Yes, of course. > >> And AFAIK, these functions are not thread safe... > > I don't think this is true. AFAIK, the WSA* functions are all > thread-safe. Hm, it seems you are right, but I was not able to quickly find a *clear* statement about this on MSDN... >> So I suspect that we ran into a multi-threading issue here, which >> corrupted the socket. And may have lead to the above mentioned >> issues. > > It's possible. But if you see the problems solved after the change, I > see no reason to continue arguing. Yes, it solves the problems. (Nevertheless, it would have been nice if somebody else were able to reproduce the failures...) Thanks. 
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 18:03 ` Alain Schneble @ 2016-03-07 18:10 ` Eli Zaretskii 2016-03-07 18:26 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-07 18:10 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Mon, 7 Mar 2016 19:03:58 +0100 > > Yes, it solves the problems. (Nevertheless, it would have been nice if > somebody else were able to reproduce the failures...) If you can describe the recipe for reproducing them, I will do that before and after the change. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 18:10 ` Eli Zaretskii @ 2016-03-07 18:26 ` Alain Schneble 0 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-07 18:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Mon, 7 Mar 2016 19:03:58 +0100 >> >> Yes, it solves the problems. (Nevertheless, it would have been nice if >> somebody else were able to reproduce the failures...) > > If you can describe the recipe for reproducing them, I will do that > before and after the change. Thank you! emacs -Q (require 'shr) (require 'url-queue) (blink-cursor-mode -1) (setq url-queue-timeout 120) (setq shr-ignore-cache t) M-x eww https://www.microsoft.com => Before the patch, several images were not displayed, even after waiting 120s. After the patch, all images should be displayed after a short while (~6s on my machine). ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 23:24 ` Alain Schneble 2016-03-07 8:49 ` Alain Schneble @ 2016-03-07 16:07 ` Eli Zaretskii 2016-03-07 16:47 ` Alain Schneble 2016-03-07 22:21 ` Alain Schneble 1 sibling, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-07 16:07 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Mon, 7 Mar 2016 00:24:35 +0100 > > > I'll send a patch for further discussions shortly. > > And here it is. The fix is quite simple. It ensures that sys_write > exits before touching the socket if it is not connected yet. > Unfortunately I didn't find any documentation on winsock ioctlsocket > that would prove that this is indeed required. But it seems not wrong > to me anyway. (I'll try to search the wisock documentation tomorrow to > find some hints that lead in this direction, or maybe you know?) I think this change should be installed regardless, as it fixes an oversight. However, I think it needs to be augmented, because the fact that FILE_CONNECT flag is set doesn't necessarily mean the connection is in progress: it could have failed already. We need to look at the status as well. 
The possible states of the FILE_CONNECT flag and the cp->status values are:

  flag  status                    description
  ----------------------------------------------------------------------------
  ON    STATUS_READ_READY         reader thread is about to try connect
  ON    STATUS_READ_FAILED        reader thread waits in _sys_wait_connect
  ON    STATUS_READ_SUCCEEDED     reader thread successfully connected
  ON    STATUS_CONNECT_FAILED     reader thread failed to connect
  OFF   STATUS_READ_ACKNOWLEDGED  sys_select acknowledged successful connect
  OFF   STATUS_READ_READY         reader thread is about to read
  OFF   STATUS_READ_IN_PROGRESS   reader thread waits in _sys_read_ahead
  OFF   STATUS_READ_SUCCEEDED     reader thread succeeded in reading
  OFF   STATUS_READ_FAILED        reader thread failed to read

So we should only return EWOULDBLOCK when FILE_CONNECT is set _and_ the status is not STATUS_CONNECT_FAILED. If FILE_CONNECT is set, but the status is STATUS_CONNECT_FAILED, we should instead return the value computed from cp->errcode (if it is non-zero). There's an example of that in sys_read. Other than that, what specific problem does your change try or is known to solve? IOW, what didn't work before the change, and works after it? Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
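The rule stated in the message above (EWOULDBLOCK while a connect is pending, the stashed error once it has failed) can be sketched as a standalone function. This is only a model under assumptions: the enum, the function name and its parameters are hypothetical, and the stashed error is taken to be an errno value already, whereas the real code would translate a Winsock error kept in cp->errcode.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the cp->status values in the table.  */
enum conn_status
{
  ST_READ_READY,
  ST_READ_IN_PROGRESS,
  ST_READ_SUCCEEDED,
  ST_READ_FAILED,
  ST_READ_ACKNOWLEDGED,
  ST_CONNECT_FAILED
};

/* Decide whether a write may touch the socket at all.
   Return 0 if the caller may go on and send; otherwise return -1
   with errno set.  FILE_CONNECT_FLAG models the FILE_CONNECT bit and
   STASHED_ERRNO models the error recorded by the connecting thread
   (assumed here to be an errno value, 0 if none).  */
int
write_precheck (bool file_connect_flag, enum conn_status status,
                int stashed_errno)
{
  if (!file_connect_flag)
    return 0;                   /* Already connected: normal write path.  */

  if (status != ST_CONNECT_FAILED)
    {
      errno = EWOULDBLOCK;      /* Connect still pending: try again later.  */
      return -1;
    }

  if (stashed_errno != 0)
    {
      errno = stashed_errno;    /* Connect failed: report the real reason.  */
      return -1;
    }

  return 0;                     /* No error recorded: fall through.  */
}
```

The point of checking before sending, rather than mapping ENOTCONN afterwards, is that the socket is never switched to blocking mode while the connect is still in flight.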
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 16:07 ` Eli Zaretskii @ 2016-03-07 16:47 ` Alain Schneble 2016-03-07 22:21 ` Alain Schneble 1 sibling, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-07 16:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Mon, 7 Mar 2016 00:24:35 +0100 >> >> > I'll send a patch for further discussions shortly. >> >> And here it is. The fix is quite simple. It ensures that sys_write >> exits before touching the socket if it is not connected yet. >> Unfortunately I didn't find any documentation on winsock ioctlsocket >> that would prove that this is indeed required. But it seems not wrong >> to me anyway. (I'll try to search the wisock documentation tomorrow to >> find some hints that lead in this direction, or maybe you know?) > > I think this change should be installed regardless, as it fixes an > oversight. However, I think it needs to be augmented, because the > fact that FILE_CONNECT flag is set doesn't necessarily mean the > connection is in progress: it could have failed already. We need to > look at the status as well. Thank you. I'll study the state table later... > So we should only return EWOULDBLOCK when FILE_CONNECT is set _and_ > the status is not STATUS_CONNECT_FAILED. If FILE_CONNECT is set, but > the status is STATUS_CONNECT_FAILED, we should instead return the > value computed from cp->errcode (if it is non-zero). There's an > example of that in sys_read. Ok, thanks for this information. I'll read through that code once again... > Other than that, what specific problem does your change try or is > known to solve? IOW, what didn't work before the change, and works > after it? Aha. Sorry, I was not clear about that. 
It fixes all the reproducible issues I had when "asynchronously" loading a website with images in eww on MS-Windows (e.g. https://www.microsoft.com). gnutls_handshake returned with arbitrary failures when loading the images. It returned with errors -15 GNUTLS_E_UNEXPECTED_PACKET or -110 GNUTLS_E_PREMATURE_TERMINATION, followed by -10 GNUTLS_E_INVALID_SESSION. It happened every time, but arbitrarily only for some of the images. The affected images were not downloaded and displayed in eww at all because the GnuTLS session could not be established. ^ permalink raw reply [flat|nested] 124+ messages in thread
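When reading debug output with raw return values, a small lookup of the codes quoted in this thread may help. The sketch below hard-codes only the number/name pairs that the messages themselves report; the table and function names are hypothetical, and it deliberately does not link against GnuTLS.

```c
#include <stddef.h>

/* Number/name pairs exactly as reported in this thread.  */
struct tls_error_name
{
  int code;
  const char *name;
};

static const struct tls_error_name reported_errors[] = {
  { -10, "GNUTLS_E_INVALID_SESSION" },
  { -15, "GNUTLS_E_UNEXPECTED_PACKET" },
  { -28, "GNUTLS_E_AGAIN" },
  { -110, "GNUTLS_E_PREMATURE_TERMINATION" },
};

/* Return the symbolic name for CODE, or NULL if the thread never
   mentioned it.  */
const char *
reported_error_name (int code)
{
  size_t n = sizeof reported_errors / sizeof reported_errors[0];

  for (size_t i = 0; i < n; i++)
    if (reported_errors[i].code == code)
      return reported_errors[i].name;
  return NULL;
}
```

Of these, only -28 (GNUTLS_E_AGAIN) is the "try again later" case; the other three were seen here as the start or the aftermath of a failed session.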
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 16:07 ` Eli Zaretskii 2016-03-07 16:47 ` Alain Schneble @ 2016-03-07 22:21 ` Alain Schneble 2016-03-08 16:40 ` Eli Zaretskii 2016-03-10 14:45 ` Eli Zaretskii 1 sibling, 2 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-07 22:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 [-- Attachment #1: Type: text/plain, Size: 1817 bytes --] Eli Zaretskii <eliz@gnu.org> writes: > I think this change should be installed regardless, as it fixes an > oversight. However, I think it needs to be augmented, because the > fact that FILE_CONNECT flag is set doesn't necessarily mean the > connection is in progress: it could have failed already. We need to > look at the status as well. > > The possible states of the FILE_CONNECT flag and the cp->status values > are: > > flag status description > ---------------------------------------------------------------------------- > ON STATUS_READ_READY reader thread is about to try connect > ON STATUS_READ_FAILED reader thread waits in _sys_wait_connect > ON STATUS_READ_SUCCEEDED reader thread successfully connected > ON STATUS_CONNECT_FAILED reader thread failed to connect > OFF STATUS_READ_ACKNOWLEDGED sys_select acknowledged successful connect > OFF STATUS_READ_READY reader thread is about to read > OFF STATUS_READ_IN_PROGRESS reader thread waits in _sys_read_ahead > OFF STATUS_READ_SUCCEEDED reader thread succeeded in reading > OFF STATUS_READ_FAILED reader thread failed to read > > So we should only return EWOULDBLOCK when FILE_CONNECT is set _and_ > the status is not STATUS_CONNECT_FAILED. If FILE_CONNECT is set, but > the status is STATUS_CONNECT_FAILED, we should instead return the > value computed from cp->errcode (if it is non-zero). There's an > example of that in sys_read. Thank you very much for these inputs. 
I rearranged the patch to include these two cases and removed another special case that should no longer be needed as it is covered by the first one.

Is this what you had in mind?  Do you agree with the change?

Thanks.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Patch --]
[-- Type: text/x-patch, Size: 2727 bytes --]

From 81295b036eb0a43dee968e8aa3f031030589cddd Mon Sep 17 00:00:00 2001
From: Alain Schneble <a.s@realize.ch>
Date: Mon, 7 Mar 2016 23:05:40 +0100
Subject: [PATCH] Resolve non-blocking socket connection issue on w32

* src/w32.c (sys_write): For non-blocking sockets, return immediately with
EWOULDBLOCK if connection is still in progress.  If connection attempt has
failed already, return proper code stashed in cp->errcode.  BTW, this
ensures we do not temporarily turn the socket into blocking mode for the
pfn_send call if the connection is in progress.  It turned out that doing
so causes arbitrary GnuTLS handshake failures on MS-Windows.  (bug#22789)
---
 src/w32.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/w32.c b/src/w32.c
index ccf7cc3..c553152 100644
--- a/src/w32.c
+++ b/src/w32.c
@@ -8772,6 +8772,30 @@ sys_write (int fd, const void * buffer, unsigned int count)
       unsigned long nblock = 0;
       if (winsock_lib == NULL) emacs_abort ();
 
+      child_process *cp = fd_info[fd].cp;
+
+      /* If this is a non-blocking socket whose connection is in
+         progress or terminated with an error already, return the
+         proper error code to the caller. */
+      if (cp != NULL && (fd_info[fd].flags & FILE_CONNECT) != 0)
+        {
+          /* In case connection is in progress, ENOTCONN that would
+             result from calling pfn_send is not what callers expect. */
+          if (cp->status != STATUS_CONNECT_FAILED)
+            {
+              errno = EWOULDBLOCK;
+              return -1;
+            }
+          /* In case connection failed, use the actual error code
+             stashed by '_sys_wait_connect' in cp->errcode. */
+          else if (cp->errcode != 0)
+            {
+              pfn_WSASetLastError (cp->errcode);
+              set_errno ();
+              return -1;
+            }
+        }
+
       /* TODO: implement select() properly so non-blocking I/O works. */
       /* For now, make sure the write blocks. */
       if (fd_info[fd].flags & FILE_NDELAY)
@@ -8782,14 +8806,8 @@ sys_write (int fd, const void * buffer, unsigned int count)
       if (nchars == SOCKET_ERROR)
         {
           set_errno ();
-          /* If this is a non-blocking socket whose connection is in
-             progress, return the proper error code to the caller;
-             ENOTCONN is not what they expect . */
-          if (errno == ENOTCONN && (fd_info[fd].flags & FILE_CONNECT) != 0)
-            errno = EWOULDBLOCK;
-          else
-            DebPrint (("sys_write.send failed with error %d on socket %ld\n",
-                       pfn_WSAGetLastError (), SOCK_HANDLE (fd)));
+          DebPrint (("sys_write.send failed with error %d on socket %ld\n",
+                     pfn_WSAGetLastError (), SOCK_HANDLE (fd)));
         }
 
       /* Set the socket back to non-blocking if it was before,
-- 
2.6.2.windows.1

^ permalink raw reply related [flat|nested] 124+ messages in thread
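The decision this patch makes before calling pfn_send can be modelled in isolation. The following is only a sketch under stated assumptions: `struct conn_state`, `write_precheck`, the enum values and the `FILE_CONNECT` bit used here are hypothetical stand-ins, not the definitions from src/w32.c, and `errcode` is assumed to hold an errno value already (the real code goes through pfn_WSASetLastError and set_errno).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the w32.c definitions.  The real flag
   value and status constants live in the Emacs sources.  */
enum { FILE_CONNECT = 0x0001 };   /* illustrative value only */

enum conn_status
{
  STATUS_READ_READY,
  STATUS_READ_IN_PROGRESS,
  STATUS_READ_SUCCEEDED,
  STATUS_CONNECT_FAILED
};

struct conn_state
{
  unsigned flags;            /* FILE_CONNECT set while connecting */
  enum conn_status status;
  int errcode;               /* assumed to be an errno value here */
};

/* Return 0 if the caller may go on to send data, or -1 with errno
   set when the connection is still in progress or has failed.  */
int
write_precheck (const struct conn_state *cs)
{
  if (cs == NULL || (cs->flags & FILE_CONNECT) == 0)
    return 0;

  /* Flag set and no failure recorded: the connect is still pending.  */
  if (cs->status != STATUS_CONNECT_FAILED)
    {
      errno = EWOULDBLOCK;
      return -1;
    }

  /* The connect failed: report the stashed error, if there is one.  */
  if (cs->errcode != 0)
    {
      errno = cs->errcode;
      return -1;
    }

  return 0;
}
```

A caller would invoke write_precheck first and attempt the actual send only when it returns 0, which is what keeps the socket from being switched to blocking mode while the connect is pending.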
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 22:21 ` Alain Schneble @ 2016-03-08 16:40 ` Eli Zaretskii 2016-03-08 16:43 ` Alain Schneble 2016-03-10 14:45 ` Eli Zaretskii 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-08 16:40 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Mon, 7 Mar 2016 23:21:56 +0100 > > > So we should only return EWOULDBLOCK when FILE_CONNECT is set _and_ > > the status is not STATUS_CONNECT_FAILED. If FILE_CONNECT is set, but > > the status is STATUS_CONNECT_FAILED, we should instead return the > > value computed from cp->errcode (if it is non-zero). There's an > > example of that in sys_read. > > Thank you very much for these inputs. I rearranged the patch to include > these two cases and removed another special case that should no longer > be needed as it is covered by the first one. > > Is this what you had in mind? Do you agree with the change? Yes, it looks good to me. I will test it in a couple of days. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-08 16:40 ` Eli Zaretskii @ 2016-03-08 16:43 ` Alain Schneble 0 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-08 16:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Mon, 7 Mar 2016 23:21:56 +0100 >> >> > So we should only return EWOULDBLOCK when FILE_CONNECT is set _and_ >> > the status is not STATUS_CONNECT_FAILED. If FILE_CONNECT is set, but >> > the status is STATUS_CONNECT_FAILED, we should instead return the >> > value computed from cp->errcode (if it is non-zero). There's an >> > example of that in sys_read. >> >> Thank you very much for these inputs. I rearranged the patch to include >> these two cases and removed another special case that should no longer >> be needed as it is covered by the first one. >> >> Is this what you had in mind? Do you agree with the change? > > Yes, it looks good to me. I will test it in a couple of days. Thank you, Eli. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-07 22:21 ` Alain Schneble 2016-03-08 16:40 ` Eli Zaretskii @ 2016-03-10 14:45 ` Eli Zaretskii 2016-03-10 14:59 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-10 14:45 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789-done > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Mon, 7 Mar 2016 23:21:56 +0100 > > Thank you very much for these inputs. I rearranged the patch to include > these two cases and removed another special case that should no longer > be needed as it is covered by the first one. > > Is this what you had in mind? Do you agree with the change? Thanks, I pushed this to master. I'm marking the bug done. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-10 14:45 ` Eli Zaretskii @ 2016-03-10 14:59 ` Alain Schneble 0 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-10 14:59 UTC (permalink / raw) To: 22789; +Cc: j_l_domenech Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Mon, 7 Mar 2016 23:21:56 +0100 >> >> Thank you very much for these inputs. I rearranged the patch to include >> these two cases and removed another special case that should no longer >> be needed as it is covered by the first one. >> >> Is this what you had in mind? Do you agree with the change? > > Thanks, I pushed this to master. > > I'm marking the bug done. Thanks. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-05 18:27 ` Alain Schneble 2016-03-05 19:21 ` Eli Zaretskii @ 2016-03-06 9:31 ` Lars Magne Ingebrigtsen 2016-03-06 15:24 ` Eli Zaretskii 1 sibling, 1 reply; 124+ messages in thread From: Lars Magne Ingebrigtsen @ 2016-03-06 9:31 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > I should have written "multiple times" not "many times". I just had a > case where for four network processes (get image requests) > gnutls_try_handshake returned for each of these processes -28 > GNUTLS_E_AGAIN three times in a row, followed by a single -110 > GNUTLS_E_PREMATURE_TERMINATION and then repeatedly by -10 > GNUTLS_E_INVALID_SESSION. Hm! That GNUTLS_E_PREMATURE_TERMINATION is interesting. Does that mean that Emacs has closed the socket, or that the peer has closed the connection, I wonder? I've had a look at the gnutls source code, and if I'm reading the code correctly, that error code doesn't really distinguish the two cases... -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 9:31 ` Lars Magne Ingebrigtsen @ 2016-03-06 15:24 ` Eli Zaretskii 0 siblings, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-06 15:24 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: Eli Zaretskii <eliz@gnu.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Sun, 06 Mar 2016 10:31:43 +0100 > > Alain Schneble <a.s@realize.ch> writes: > > > I should have written "multiple times" not "many times". I just had a > > case where for four network processes (get image requests) > > gnutls_try_handshake returned for each of these processes -28 > > GNUTLS_E_AGAIN three times in a row, followed by a single -110 > > GNUTLS_E_PREMATURE_TERMINATION and then repeatedly by -10 > > GNUTLS_E_INVALID_SESSION. > > Hm! That GNUTLS_E_PREMATURE_TERMINATION is interesting. Does that > mean that Emacs has closed the socket, or that the peer has closed the > connection, I wonder? I cannot imagine it's the other side: why would they? I think it's us, when we give up (too early, perhaps) and delete the process object. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 21:36 ` Alain Schneble 2016-03-04 22:33 ` Alain Schneble 2016-03-05 8:23 ` Eli Zaretskii @ 2016-03-05 8:46 ` Lars Magne Ingebrigtsen 2016-03-05 18:32 ` Alain Schneble 2 siblings, 1 reply; 124+ messages in thread From: Lars Magne Ingebrigtsen @ 2016-03-05 8:46 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > I have the impression that GnuTLS doesn't like it too much if we start > retrying the handshake many times before the socket is connected. At > least on MS-Windows. In nearly all of the cases of loading websites > with around 20 images, I observe arbitrary failures of > gnutls_try_handshake which usually end up with -10 > GNUTLS_E_INVALID_SESSION. Try to add some printfs around the handshaking, and then strace Emacs while it's doing all this. GNUTLS_E_INVALID_SESSION is usually the result of libgnutls losing control of the socket -- something else is writing or reading from the socket, and that makes the libgnutls state machine become unsynchronised. If you see any reads/writes to the sockets outside the handshake section of the code, you'll have found the culprit. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
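For the printfs suggested above, a small helper that names the return codes seen in this thread may make the trace easier to read. This is a sketch only: `handshake_result_name` and `trace_handshake` are hypothetical names, and the numeric values are the ones quoted in these messages. Real code would include <gnutls/gnutls.h> and use the GNUTLS_E_* macros (GnuTLS also offers gnutls_strerror_name for this purpose).

```c
#include <stdio.h>

/* Map the handshake return codes mentioned in this thread to their
   names.  The numbers are those quoted in the messages; they are
   hard-coded here only to keep the sketch free of dependencies.  */
const char *
handshake_result_name (int rc)
{
  switch (rc)
    {
    case 0:    return "success";
    case -10:  return "GNUTLS_E_INVALID_SESSION";
    case -15:  return "GNUTLS_E_UNEXPECTED_PACKET";
    case -28:  return "GNUTLS_E_AGAIN";
    case -110: return "GNUTLS_E_PREMATURE_TERMINATION";
    default:   return "other";
    }
}

/* Hypothetical tracing helper: print one line per handshake attempt
   so the sequence of results per descriptor can be followed.  */
void
trace_handshake (int fd, int attempt, int rc)
{
  fprintf (stderr, "fd %d: handshake attempt %d returned %d (%s)\n",
           fd, attempt, rc, handshake_result_name (rc));
}
```

Calling trace_handshake after every handshake attempt would show, per descriptor, where a run of GNUTLS_E_AGAIN results turns into a hard failure.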
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-05 8:46 ` Lars Magne Ingebrigtsen @ 2016-03-05 18:32 ` Alain Schneble 0 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-05 18:32 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Alain Schneble <a.s@realize.ch> writes: > >> I have the impression that GnuTLS doesn't like it too much if we start >> retrying the handshake many times before the socket is connected. At >> least on MS-Windows. In nearly all of the cases of loading websites >> with around 20 images, I observe arbitrary failures of >> gnutls_try_handshake which usually end up with -10 >> GNUTLS_E_INVALID_SESSION. > > Try to add some printfs around the handshaking, and then strace Emacs > while it's doing all this. GNUTLS_E_INVALID_SESSION is usually the > result of libgnutls losing control of the socket -- something else is > writing or reading from the socket, and that makes the libgnutls state > machine become unsynchronised. Thank you. I will try to do that. I'll have to learn how to strace first, though. Sorry, that was the reason why I did not already strace it (you suggested it already in a previous message, thanks! Sorry for not having followed your advice in the first place). > If you see any reads/writes to the sockets outside the handshake section > of the code, you'll have found the culprit. Ok, thanks. Hopefully, I'll have time to do it tomorrow... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 17:57 ` Alain Schneble 2016-02-29 18:45 ` Eli Zaretskii @ 2016-02-29 21:18 ` Lars Ingebrigtsen 2016-02-29 23:20 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-02-29 21:18 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > But the point I tried to address is the following: /When/ shall we start > with the handshake "series" and start counting the number of tries (or > stopwatch)? Don't you agree that with async sockets, it doesn't make > much sense to start it before the socket is connected? Yeah, it probably doesn't make any sense to start trying until it's connected. What's the incantation to check whether a socket has finished its three-way TCP handshake? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 21:18 ` Lars Ingebrigtsen @ 2016-02-29 23:20 ` Alain Schneble 2016-03-01 3:43 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-02-29 23:20 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Alain Schneble <a.s@realize.ch> writes: > >> But the point I tried to address is the following: /When/ shall we start >> with the handshake "series" and start counting the number of tries (or >> stopwatch)? Don't you agree that with async sockets, it doesn't make >> much sense to start it before the socket is connected? > > Yeah, it probably doesn't make any sense to start trying until it's > connected. What's the incantation to check whether a socket has > finished its three-way TCP handshake? I'm not quite sure, but my understanding was that this expression does answer this question: (fd_info[p->infd].flags & FILE_CONNECT) == 0 Or am I wrong? See also patch in the previous message. If we decide to go this direction, then I think we should also inhibit the first call to gnutls_try_handshake from gnutls_boot, for async sockets. This call is currently implicit and will probably fail (return with EAGAIN) in all practical cases. ^ permalink raw reply [flat|nested] 124+ messages in thread
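On the question of how to check whether a socket has finished connecting: the FILE_CONNECT test above is specific to the w32 emulation layer. On POSIX systems, a commonly used check for a non-blocking connect is to wait for the descriptor to become writable and then read SO_ERROR. The sketch below shows that generic technique only; `connect_finished` is a hypothetical name and this is not claimed to be what process.c does.

```c
#include <poll.h>
#include <sys/socket.h>

/* Return 1 if the non-blocking connect on FD has completed
   successfully, 0 if it is still in progress, and -1 if it failed
   or if poll/getsockopt themselves reported an error.  */
int
connect_finished (int fd)
{
  struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };

  /* Zero timeout: only ask for the current state, never block.  */
  int n = poll (&pfd, 1, 0);
  if (n < 0)
    return -1;
  if (n == 0)
    return 0;   /* not writable yet, so still connecting */

  /* Writable (or in error): SO_ERROR tells the two cases apart.  */
  int err = 0;
  socklen_t len = sizeof err;
  if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
    return -1;

  return err == 0 ? 1 : -1;
}
```

With such a check, the first TLS handshake attempt could be deferred until the function returns 1, instead of being counted against the retry limit while the TCP connection is still being set up.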
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-29 23:20 ` Alain Schneble @ 2016-03-01 3:43 ` Eli Zaretskii 2016-03-01 5:17 ` Lars Ingebrigtsen 2016-03-01 15:43 ` Alain Schneble 0 siblings, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 3:43 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > Date: Tue, 1 Mar 2016 00:20:45 +0100 > Cc: j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > > See also patch in the previous message. If we decide to go this > direction, then I think we should also inhibit the first call to > gnutls_try_handshake from gnutls_boot, for async sockets. This call is > currently implicit and will probably fail (return with EAGAIN) in all > practical cases. That patch exposes a w32-specific stuff to process.c, so it cannot be right. That's why I asked for a description of how this works on GNU/Linux. With that information in hand, we can reason what is different on w32. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 3:43 ` Eli Zaretskii @ 2016-03-01 5:17 ` Lars Ingebrigtsen 2016-03-01 15:46 ` Eli Zaretskii 2016-03-01 15:43 ` Alain Schneble 1 sibling, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-01 5:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, Alain Schneble, 22789 All the immediate ideas I have to ensure we have a timer that triggers `wait_reading_process_output' now and then when we have processes waiting for DNS or to complete TLS negotiation remind me too much of reference counting. There's always an off by one error or a race condition when counting. :-) But, basically, we have to have a way to say "I'm starting this stuff now, and the timer should continue to trigger every 50ms until I'm done". And then stop when they're all done... Perhaps it isn't as difficult as all that, since Emacs is pretty single threaded. That is, when calling getaddrinfo_a or try_negotiate, we'd have a function that would start the timer unless it's already running. And the timer itself could just look through the process list and see if any such processes remain, and then just commit sudoku if there are no such processes remaining. Now, if Emacs were multithreaded in many dimensions, this would be pretty error prone, but perhaps it's not the way Emacs is now? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 5:17 ` Lars Ingebrigtsen @ 2016-03-01 15:46 ` Eli Zaretskii 2016-03-02 18:03 ` Lars Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 15:46 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Alain Schneble <a.s@realize.ch>, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Tue, 01 Mar 2016 16:17:13 +1100 > > All the immediate ideas I have to ensure we have a timer that triggers > `wait_reading_process_output' now and then when we have processes > waiting for DNS or to complete TLS negotiation remind me too much of > reference counting. There's always an off by one error or a race > condition when counting. :-) > > But, basically, we have to have a way to say "I'm starting this stuff > now, and the timer should continue to trigger every 50ms until I'm > done". And then stop when they're all done... You don't actually need a timer, since there's nothing you'd need the timer function do. What you want is a way to ensure the loop in wait_reading_process_output continues to run, without becoming stuck for too long in a call to 'pselect', for as long as there are process objects waiting for this async stuff to complete; and you want to make sure this happens even if Emacs doesn't get any events from the window-system, the keyboard, the subprocesses, whatever. Timers solve this problem for you because that loop computes the timeout for the 'pselect' call such that the timeout ends when the next timer expires. For example, the blink-cursor timer causes the timeout to be at most 0.5 sec. Indirectly, this causes the loop to crank one more iteration, which helps you, because Emacs then gets an opportunity to check on the status of the TLS negotiation etc. The timer itself is redundant; its effect on the loop is what you want. 
So what you can do instead of launching a timer is this: as long as some process waits for some async stuff to complete, reduce the timeout with which we call 'pselect' to some reasonably small value, like half a second or maybe 0.25 sec. This will ensure the loop doesn't stop as long as we wait for at least one such connection. (This will need a simple logic to not exit the loop too early; see the variable timeout_reduced_for_timers for a similar logic we employ already for timers.) ^ permalink raw reply [flat|nested] 124+ messages in thread
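The timeout reduction described here can be sketched as a small pure function. The names `clamp_timeout_for_async` and `ASYNC_POLL_NSEC` are hypothetical, the 0.25 sec cap is just one of the values suggested above, and the timeout is assumed to be non-negative. The sketch leaves out the extra logic needed to avoid exiting the loop too early.

```c
#include <stdbool.h>
#include <time.h>

/* Upper bound for the select timeout while async work is pending:
   0.25 sec, expressed in nanoseconds.  */
enum { ASYNC_POLL_NSEC = 250000000 };

/* Return TIMEOUT unchanged unless some process is still waiting for
   asynchronous work (DNS lookup, TLS negotiation) and TIMEOUT is
   longer than the cap, in which case the cap is returned.  */
struct timespec
clamp_timeout_for_async (struct timespec timeout, bool async_pending)
{
  if (async_pending
      && (timeout.tv_sec > 0 || timeout.tv_nsec > ASYNC_POLL_NSEC))
    {
      timeout.tv_sec = 0;
      timeout.tv_nsec = ASYNC_POLL_NSEC;
    }
  return timeout;
}
```

Shorter timeouts pass through untouched, so the cap only bounds how long the loop may sleep while async work is outstanding.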
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 15:46 ` Eli Zaretskii @ 2016-03-02 18:03 ` Lars Ingebrigtsen 2016-03-02 19:07 ` Alain Schneble 2016-03-02 19:43 ` Eli Zaretskii 0 siblings, 2 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-02 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789

Eli Zaretskii <eliz@gnu.org> writes:

> So what you can do instead of launching a timer is this: as long as
> some process waits for some sync stuff to complete, reduce the timeout
> with which we call 'pselect' to some reasonably small value, like half
> a second or maybe 0.25 sec.  This will ensure the loop doesn't stop as
> long as we wait for at least one such connection.  (This will need a
> simple logic to not exit the loop too early; see the variable
> timeout_reduced_for_timers for a similar logic we employ already for
> timers.)

Aha!  With the following (for debugging purposes only) patch, it looks like I'm getting progress on my https connections even if I don't have a blinking cursor.  (I chose 50ms as my timeout, if I counted my zeroes correctly...)  Were you thinking about something along these lines?  If so, I can clean the patch up...

diff --git a/src/process.c b/src/process.c
index 85a4885..5376492 100644
--- a/src/process.c
+++ b/src/process.c
@@ -4870,6 +4870,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   struct timespec got_output_end_time = invalid_timespec ();
   enum { MINIMUM = -1, TIMEOUT, INFINITY } wait;
   int got_some_output = -1;
+  bool retry_for_async;
   ptrdiff_t count = SPECPDL_INDEX ();
 
   /* Close to the current time if known, an invalid timespec otherwise. */
@@ -4922,6 +4923,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
         Lisp_Object process_list_head, aproc;
         struct Lisp_Process *p;
 
+        retry_for_async = false;
         FOR_EACH_PROCESS(process_list_head, aproc)
           {
             p = XPROCESS (aproc);
@@ -4935,6 +4937,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
                 Lisp_Object ip_addresses = check_for_dns (aproc);
                 if (!NILP (ip_addresses) && !EQ (ip_addresses, Qt))
                   connect_network_socket (aproc, ip_addresses);
+                else
+                  retry_for_async = true;
               }
 #endif
 #ifdef HAVE_GNUTLS
@@ -4950,12 +4954,16 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
                     gnutls_verify_boot (aproc, Qnil);
                     finish_after_tls_connection (aproc);
                   }
-                else if (p->gnutls_handshakes_tried
-                         > GNUTLS_EMACS_HANDSHAKES_LIMIT)
+                else
                   {
-                    deactivate_process (aproc);
-                    pset_status (p, list2 (Qfailed,
-                                           build_string ("TLS negotiation failed")));
+                    retry_for_async = true;
+                    if (p->gnutls_handshakes_tried
+                        > GNUTLS_EMACS_HANDSHAKES_LIMIT)
+                      {
+                        deactivate_process (aproc);
+                        pset_status (p, list2 (Qfailed,
+                                               build_string ("TLS negotiation failed")));
+                      }
                   }
               }
 #endif
@@ -5044,6 +5052,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
           Ctemp = write_mask;
 
           timeout = make_timespec (0, 0);
+          printf("Timeout is %lu\n", timeout.tv_sec);
           if ((pselect (max (max_process_desc, max_input_desc) + 1,
                         &Atemp,
 #ifdef NON_BLOCKING_CONNECT
@@ -5222,6 +5231,15 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
           if (timeout.tv_sec > 0 || timeout.tv_nsec > 0)
             now = invalid_timespec ();
 
+          if (retry_for_async
+              && (timeout.tv_sec > 0 || timeout.tv_nsec > 50000000))
+            {
+              timeout.tv_sec = 0;
+              timeout.tv_nsec = 50000000;
+            }
+
+          printf("Here timeout is %lu/%lu\n", timeout.tv_sec, timeout.tv_nsec);
+
 #if defined (HAVE_NS)
           nfds = ns_select
 #elif defined (HAVE_GLIB)

-- 
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply related [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 18:03 ` Lars Ingebrigtsen @ 2016-03-02 19:07 ` Alain Schneble 2016-03-02 19:15 ` Lars Ingebrigtsen 2016-03-02 19:43 ` Eli Zaretskii 1 sibling, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-02 19:07 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Eli Zaretskii <eliz@gnu.org> writes: > >> So what you can do instead of launching a timer is this: as long as >> some process waits for some sync stuff to complete, reduce the timeout >> with which we call 'pselect' to some reasonably small value, like half >> a second or maybe 0.25 sec. This will ensure the loop doesn't stop as >> long as we wait for at least one such connection. (This will need a >> simple logic to not exit the loop too early; see the variable >> timeout_reduced_for_timers for a similar logic we employ already for >> timers.) > > Aha! With the following (for debugging purposes only) patch, it looks > like I'm getting progress on my https connections even if I don't have a > blinking cursor. (I chose 50ms as my timeout, if I counted my zeroes > correctly...) Were you thinking about something along these lines? If > so, I can clean the patch up... I was debugging it further and AFAICS, there must also be an issue in shr with the processing of images. Because without any patch, if all the images were loaded properly, i.e. the shr-image-fetched callback in shr.el was invoked for each of the images in a page, the images are sometimes not displayed. In shr.el shr-image-fetched, if I comment the line... (url-store-in-cache image-buffer) ...then images are never shown. Strange, not? Or is this expected? And now, even with ordinary http connections, if I delete ~/.emacs.d/url/cache, no images are displayed the first time I load the page. Did I mess up something with my build/installation? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 19:07 ` Alain Schneble @ 2016-03-02 19:15 ` Lars Ingebrigtsen 2016-03-02 19:38 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-02 19:15 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > And now, even with ordinary http connections, if I delete > ~/.emacs.d/url/cache, no images are displayed the first time I load the > page. Perhaps your Emacs isn't completing the DNS resolutions? There's nothing there to ensure their progress (unless you have a blinking cursor). > Did I mess up something with my build/installation? That's also possible. :-) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 19:15 ` Lars Ingebrigtsen @ 2016-03-02 19:38 ` Alain Schneble 2016-03-02 20:46 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-02 19:38 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Alain Schneble <a.s@realize.ch> writes: > >> And now, even with ordinary http connections, if I delete >> ~/.emacs.d/url/cache, no images are displayed the first time I load the >> page. > > Perhaps your Emacs isn't completing the DNS resolutions? There's > nothing there to ensure their progress (unless you have a blinking > cursor). I don't think so, because I see that shr-image-fetched is called the correct number of times and the status doesn't indicate an error. But the insert-image calls do not update the *eww* buffer and the image data looks, well, not empty. Or could it be an encoding issue, that the images get displayed properly only after being saved as binary to disk and re-read from there? I'll try to find out... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 19:38 ` Alain Schneble @ 2016-03-02 20:46 ` Alain Schneble 2016-03-02 22:02 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-02 20:46 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789

Alain Schneble <a.s@realize.ch> writes:

> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>> Alain Schneble <a.s@realize.ch> writes:
>>
>>> And now, even with ordinary http connections, if I delete
>>> ~/.emacs.d/url/cache, no images are displayed the first time I load the
>>> page.
>>
>> Perhaps your Emacs isn't completing the DNS resolutions?  There's
>> nothing there to ensure their progress (unless you have a blinking
>> cursor).
>
> I don't think so, because I see that shr-image-fetched is called the
> correct number of times and the status doesn't indicate an error.  But
> the insert-image calls do not update the *eww* buffer and the image data
> looks, well, not empty.  Or could it be an encoding issue, that the
> images get displayed properly only after being saved as binary to disk
> and re-read from there?  I'll try to find out...

That's funny:

- In `shr-tag-img', if the image has not yet been cached, the following code is evaluated: (url-queue-retrieve (shr-encode-url url) 'shr-image-fetched (list (current-buffer) start (set-marker (make-marker) (1- (point))) (list :width width :height height))
- When the callback `shr-image-fetched' is invoked, start and end will form an empty range.
- The image inserted will not be displayed because the range (= alt tag value) is empty.
- If I add (setq alt "[empty alt]") on top of `shr-put-image', then loading of images works quite well.  Also with https connections.  (I do not say that the issue with the loop progress is not there...)

^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 20:46 ` Alain Schneble @ 2016-03-02 22:02 ` Alain Schneble 2016-03-02 22:22 ` Lars Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-02 22:02 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 [-- Attachment #1: Type: text/plain, Size: 343 bytes --]

Alain Schneble <a.s@realize.ch> writes:

> - If I add (setq alt "[empty alt]") on top of `shr-put-image', then
>   loading of images works quite well.  Also with https connections.  (I
>   do not say that the issue with the loop progress is not there...)

Here is a patch to fix the empty-range issue when inserting non-cached images in eww:

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Patch --]
[-- Type: text/x-patch, Size: 1086 bytes --]

From ebf4b78ff53b92d99ef3ca771ff0d592d80b1c3d Mon Sep 17 00:00:00 2001
From: Alain Schneble <a.s@realize.ch>
Date: Wed, 2 Mar 2016 22:49:32 +0100
Subject: [PATCH] Fix insertion of non-cached images in eww

* lisp/net/shr.el (shr-tag-img): Construct a non-empty range to pass to
shr-image-fetched, to indicate where to insert the image.  Fixes the
issue introduced with commit 80852f843e69b81618f29cfb9aa4b074946cb3c4.
---
 lisp/net/shr.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/net/shr.el b/lisp/net/shr.el
index c469e69..e463c7e 100644
--- a/lisp/net/shr.el
+++ b/lisp/net/shr.el
@@ -1499,7 +1499,7 @@ The preference is a float determined from `shr-prefer-media-type'."
           (insert " ")
           (url-queue-retrieve
            (shr-encode-url url) 'shr-image-fetched
-           (list (current-buffer) start (set-marker (make-marker) (1- (point)))
+           (list (current-buffer) start (set-marker (make-marker) (point))
                  (list :width width :height height))
            t t)))
         (when (zerop shr-table-depth) ;; We are not in a table.
-- 
2.6.2.windows.1

[-- Attachment #3: Type: text/plain, Size: 172 bytes --]

And by the way, with this fix, disabling (url-store-in-cache image-buffer) in `shr-image-fetched' no longer breaks loading of images, as I would have expected before.

^ permalink raw reply related [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 22:02 ` Alain Schneble @ 2016-03-02 22:22 ` Lars Ingebrigtsen 2016-03-02 22:38 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-02 22:22 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > * lisp/net/shr.el (shr-tag-img): Construct a non-empty range to pass to > shr-image-fetched, to indicate where to insert the image. Fixes the > issue introduced with commit 80852f843e69b81618f29cfb9aa4b074946cb3c4. > --- > lisp/net/shr.el | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/lisp/net/shr.el b/lisp/net/shr.el > index c469e69..e463c7e 100644 > --- a/lisp/net/shr.el > +++ b/lisp/net/shr.el > @@ -1499,7 +1499,7 @@ The preference is a float determined from `shr-prefer-media-type'." > (insert " ") > (url-queue-retrieve > (shr-encode-url url) 'shr-image-fetched > - (list (current-buffer) start (set-marker (make-marker) (1- (point))) > + (list (current-buffer) start (set-marker (make-marker) (point)) > (list :width width :height height)) > t t))) > (when (zerop shr-table-depth) ;; We are not in a table. I'm not seeing this issue. And that patch is two weeks old... did you just start seeing this today? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 22:22 ` Lars Ingebrigtsen @ 2016-03-02 22:38 ` Alain Schneble 2016-03-03 0:07 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-02 22:38 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > I'm not seeing this issue. And that patch is two weeks old... did you > just start seeing this today? Yes, I saw it today. The last week or so I was only working on MS Windows. Today I switched to GNU/Linux again. I'll try to do a fresh checkout to see if it vanishes... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 22:38 ` Alain Schneble @ 2016-03-03 0:07 ` Alain Schneble 2016-03-03 5:32 ` Lars Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-03 0:07 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > Lars Ingebrigtsen <larsi@gnus.org> writes: > >> I'm not seeing this issue. And that patch is two weeks old... did you >> just start seeing this today? > > Yes, I saw it today. The last week or so I was only working on MS > Windows. Today I switched to GNU/Linux again. I'll try to do a fresh > checkout to see if it vanishes... Fresh build, still the same issue here... hmm... ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-03 0:07 ` Alain Schneble @ 2016-03-03 5:32 ` Lars Ingebrigtsen 2016-03-03 9:03 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-03 5:32 UTC (permalink / raw) To: Alain Schneble; +Cc: j_l_domenech, 22789 Alain Schneble <a.s@realize.ch> writes: > Alain Schneble <a.s@realize.ch> writes: > >> Lars Ingebrigtsen <larsi@gnus.org> writes: >> >>> I'm not seeing this issue. And that patch is two weeks old... did you >>> just start seeing this today? >> >> Yes, I saw it today. The last week or so I was only working on MS >> Windows. Today I switched to GNU/Linux again. I'll try to do a fresh >> checkout to see if it vanishes... > > Fresh build, still the same issue here... hmm... Odd that I'm not seeing this, too... What GNU/Linux system are you using? I'm on Ubuntu, but I can't really see why that would change anything... Does your Emacs have SVG support? Anyway, I think your patch is correct, so I've applied it. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-03 5:32 ` Lars Ingebrigtsen @ 2016-03-03 9:03 ` Alain Schneble 0 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-03 9:03 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 Lars Ingebrigtsen <larsi@gnus.org> writes: > Odd that I'm not seeing this, too... What GNU/Linux system are you > using? I'm on Ubuntu, but I can't really see why that would change > anything... Does your Emacs have SVG support? > > Anyway, I think your patch is correct, so I've applied it. Thank you, Lars. You are right! It didn't happen if compiled with SVG support. I had /no/ SVG support before, so I ran into this issue. With SVG support it takes another path, where insert-image is not called with an empty string. That's why it didn't happen there. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 18:03 ` Lars Ingebrigtsen 2016-03-02 19:07 ` Alain Schneble @ 2016-03-02 19:43 ` Eli Zaretskii 2016-03-03 5:23 ` Lars Ingebrigtsen 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-02 19:43 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Wed, 02 Mar 2016 18:03:57 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > So what you can do instead of launching a timer is this: as long as > > some process waits for some sync stuff to complete, reduce the timeout > > with which we call 'pselect' to some reasonably small value, like half > > a second or maybe 0.25 sec. This will ensure the loop doesn't stop as > > long as we wait for at least one such connection. (This will need a > > simple logic to not exit the loop too early; see the variable > > timeout_reduced_for_timers for a similar logic we employ already for > > timers.) > > Aha! With the following (for debugging purposes only) patch, it looks > like I'm getting progress on my https connections even if I don't have a > blinking cursor. (I chose 50ms as my timeout, if I counted my zeroes > correctly...) Were you thinking about something along these lines? Yes. ^ permalink raw reply [flat|nested] 124+ messages in thread
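The timeout-capping idea quoted above can be sketched in isolation. This is only a minimal illustration, not the change that went into process.c: `cap_timeout`, `pending_async` and `cap_ns` are hypothetical names, and the real event loop has more state to consider (for example the existing `timeout_reduced_for_timers` logic mentioned in the quote).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper, not Emacs code: choose the timeout to hand to
   pselect.  WANTED is what the event loop would normally wait,
   PENDING_ASYNC counts connections still waiting for background work
   (DNS lookup, TLS handshake), and CAP_NS is the longest wait, in
   nanoseconds, allowed while any of them is pending.  CAP_NS must be
   below one second.  */
struct timespec
cap_timeout (struct timespec wanted, int pending_async, long cap_ns)
{
  struct timespec cap = { .tv_sec = 0, .tv_nsec = cap_ns };

  assert (cap_ns >= 0 && cap_ns < 1000000000L);

  /* Nothing pending: keep the normal (possibly long) timeout.  */
  if (pending_async <= 0)
    return wanted;

  /* Something is pending: never sleep longer than the cap, so the
     loop comes back soon enough to make progress on it.  */
  if (wanted.tv_sec > 0 || wanted.tv_nsec > cap.tv_nsec)
    return cap;

  return wanted;
}
```

With a 0.25 second cap, a 30 second wait shrinks to 0.25 seconds while a connection is pending, and a wait that is already shorter than the cap is left alone.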
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-02 19:43 ` Eli Zaretskii @ 2016-03-03 5:23 ` Lars Ingebrigtsen 2016-03-04 8:51 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-03 5:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: > Yes. Ok, I've now committed a version that leads to rm -rf ~/.emacs.d/url/cache/; ./src/emacs -Q (progn (blink-cursor-mode -1) (eww "https://www.fsf.org")) loading the images reliably. From the hotel wifi they load fine without the patch, but using the LTE card in the laptop, that form always fails to load the images without the patch. I guess that from the hotel wifi the negotiation ends before Emacs backs off the pselect timeout too much, so it finishes anyway. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-03 5:23 ` Lars Ingebrigtsen @ 2016-03-04 8:51 ` Eli Zaretskii 2016-03-04 11:33 ` Lars Ingebrigtsen 2016-03-04 11:37 ` Lars Ingebrigtsen 0 siblings, 2 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-04 8:51 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Thu, 03 Mar 2016 05:23:51 +0000 > > Ok, I've now committed a version that leads to > > rm -rf ~/.emacs.d/url/cache/; ./src/emacs -Q > > (progn > (blink-cursor-mode -1) > (eww "https://www.fsf.org")) > > loading the images reliably. Same here on MS-Windows, thanks. Interestingly, if I run the same under GDB, the images load with about 50% reliability. So I guess there are still some subtle timing issues somewhere. Can you try this under GDB and see if there's some difference? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 8:51 ` Eli Zaretskii @ 2016-03-04 11:33 ` Lars Ingebrigtsen 2016-03-04 14:48 ` Eli Zaretskii 2016-03-04 11:37 ` Lars Ingebrigtsen 1 sibling, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-04 11:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: > Interestingly, if I run the same under GDB, the images load with about > 50% reliability. So I guess there are still some subtle timing issues > somewhere. Can you try this under GDB and see if there's some > difference? Is this with just running under gdb? No special breakpoints or anything that would pause Emacs for a long time? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 11:33 ` Lars Ingebrigtsen @ 2016-03-04 14:48 ` Eli Zaretskii 2016-03-05 12:26 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-04 14:48 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Fri, 04 Mar 2016 11:33:58 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Interestingly, if I run the same under GDB, the images load with about > > 50% reliability. So I guess there are still some subtle timing issues > > somewhere. Can you try this under GDB and see if there's some > > difference? > > Is this with just running under gdb? Yes. > No special breakpoints or anything that would pause Emacs for a long > time? Not user breakpoints, no. But GDB always inserts various hooks into the program it runs, and in Emacs (if you run GDB from the src directory), it also sets breakpoints in a few strategic places. It also announces the birth and demise of threads, etc. All that evidently does have some effect. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working
  2016-03-04 14:48 ` Eli Zaretskii
@ 2016-03-05 12:26 ` Lars Magne Ingebrigtsen
  2016-03-05 13:24 ` Eli Zaretskii
  0 siblings, 1 reply; 124+ messages in thread
From: Lars Magne Ingebrigtsen @ 2016-03-05 12:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789

This is probably totally unrelated, but I saw a "hang" in Emacs for the
first time in weeks.  shr was downloading an image over https, and Emacs
became unresponsive.  strace showed the following:

[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
[pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})

This went on for perhaps 30 seconds?  (Hm.  Which is what I have
`url-queue-timeout' set to...  Hm...)  Then it stopped without
downloading the image.

Now, the pselect6 call has a timeout of {0, 0}?  If I'm reading the man
page right.

I think it's probably this code?

      /* If status of something has changed, and no input is
         available, notify the user of the change right away.  After
         this explicit check, we'll let the SIGCHLD handler zap
         timeout to get our attention.  */
      if (update_tick != process_tick)
        {
          fd_set Atemp;
          fd_set Ctemp;

          if (kbd_on_hold_p ())
            FD_ZERO (&Atemp);
          else
            Atemp = input_wait_mask;
          Ctemp = write_mask;

          timeout = make_timespec (0, 0);
          if ((pselect (max (max_process_desc, max_input_desc) + 1,
                        &Atemp,
#ifdef NON_BLOCKING_CONNECT
                        (num_pending_connects > 0 ? &Ctemp : NULL),
#else
                        NULL,
#endif
                        NULL, &timeout, NULL)
               <= 0))

I'm not quite sure what it's trying to do...  Hm...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working
  2016-03-05 12:26 ` Lars Magne Ingebrigtsen
@ 2016-03-05 13:24 ` Eli Zaretskii
  2016-03-06 9:33 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 124+ messages in thread
From: Eli Zaretskii @ 2016-03-05 13:24 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org
> Date: Sat, 05 Mar 2016 13:26:50 +0100
>
> This is probably totally unrelated, but I saw a "hang" in Emacs for the
> first time in weeks.  shr was downloading an image over https, and Emacs
> became unresponsive.  strace showed the following:
>
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
> [pid 12890] pselect6(39, [3 6 7 12 13 14 15 16 17 18 19 20 38], [], NULL, {0, 0}, {NULL, 8}) = 2 (in [18 19], left {0, 0})
>
> This went on for perhaps 30 seconds?  (Hm.  Which is what I have
> `url-queue-timeout' set to...  Hm...)  Then it stopped without
> downloading the image.

Hard to say anything intelligent without knowing what descriptors 18
and 19 are.

> Now, the pselect6 call has a timeout of {0, 0}?  If I'm reading the man
> page right.
>
> I think it's probably this code?
>
>       /* If status of something has changed, and no input is
>          available, notify the user of the change right away.  After
>          this explicit check, we'll let the SIGCHLD handler zap
>          timeout to get our attention.  */
>       if (update_tick != process_tick)
>         {
>           fd_set Atemp;
>           fd_set Ctemp;
>
>           if (kbd_on_hold_p ())
>             FD_ZERO (&Atemp);
>           else
>             Atemp = input_wait_mask;
>           Ctemp = write_mask;
>
>           timeout = make_timespec (0, 0);
>           if ((pselect (max (max_process_desc, max_input_desc) + 1,
>                         &Atemp,
> #ifdef NON_BLOCKING_CONNECT
>                         (num_pending_connects > 0 ? &Ctemp : NULL),
> #else
>                         NULL,
> #endif
>                         NULL, &timeout, NULL)
>                <= 0))
>
> I'm not quite sure what it's trying to do...  Hm...

The comment above seems to explain what it does, no?

^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-05 13:24 ` Eli Zaretskii @ 2016-03-06 9:33 ` Lars Magne Ingebrigtsen 2016-03-06 15:26 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Lars Magne Ingebrigtsen @ 2016-03-06 9:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> /* If status of something has changed, and no input is >> available, notify the user of the change right away. After >> this explicit check, we'll let the SIGCHLD handler zap >> timeout to get our attention. */ [...] > The comment above seems to explain what it does, no? I don't understand it, or how it relates to calling pselect with a zero timeout. :-) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 9:33 ` Lars Magne Ingebrigtsen @ 2016-03-06 15:26 ` Eli Zaretskii 2016-03-06 18:33 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-06 15:26 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Sun, 06 Mar 2016 10:33:08 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> /* If status of something has changed, and no input is > >> available, notify the user of the change right away. After > >> this explicit check, we'll let the SIGCHLD handler zap > >> timeout to get our attention. */ > > [...] > > > The comment above seems to explain what it does, no? > > I don't understand it, or how it relates to calling pselect with a zero > timeout. :-) It wants to poll, so it sets the time-out at zero, meaning "don't wait at all". Or am _I_ missing something? ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 15:26 ` Eli Zaretskii @ 2016-03-06 18:33 ` Lars Magne Ingebrigtsen 2016-03-06 18:41 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Lars Magne Ingebrigtsen @ 2016-03-06 18:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Lars Magne Ingebrigtsen <larsi@gnus.org> >> Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org >> Date: Sun, 06 Mar 2016 10:33:08 +0100 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> >> /* If status of something has changed, and no input is >> >> available, notify the user of the change right away. After >> >> this explicit check, we'll let the SIGCHLD handler zap >> >> timeout to get our attention. */ >> >> [...] >> >> > The comment above seems to explain what it does, no? >> >> I don't understand it, or how it relates to calling pselect with a zero >> timeout. :-) > > It wants to poll, so it sets the time-out at zero, meaning "don't wait > at all". Or am _I_ missing something? But why does it want to poll? I'm wondering whether any of the async-related timeout changes are... provoking the "if (update_tick != process_tick)" bit to always be true in some cases, and making this polling behaviour happen infinitely... or something... But I've been completely unable to reproduce the error, so I'm just speculating. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-06 18:33 ` Lars Magne Ingebrigtsen @ 2016-03-06 18:41 ` Eli Zaretskii 0 siblings, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-06 18:41 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Sun, 06 Mar 2016 19:33:43 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> From: Lars Magne Ingebrigtsen <larsi@gnus.org> > >> Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > >> Date: Sun, 06 Mar 2016 10:33:08 +0100 > >> > >> Eli Zaretskii <eliz@gnu.org> writes: > >> > >> >> /* If status of something has changed, and no input is > >> >> available, notify the user of the change right away. After > >> >> this explicit check, we'll let the SIGCHLD handler zap > >> >> timeout to get our attention. */ > >> > >> [...] > >> > >> > The comment above seems to explain what it does, no? > >> > >> I don't understand it, or how it relates to calling pselect with a zero > >> timeout. :-) > > > > It wants to poll, so it sets the time-out at zero, meaning "don't wait > > at all". Or am _I_ missing something? > > But why does it want to poll? To "notify the user of the change right away". ^ permalink raw reply [flat|nested] 124+ messages in thread
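The "zero timeout means poll" point from this exchange can be seen in a small standalone program. It is a sketch under simple assumptions, unrelated to the actual process.c code path: `poll_readable` and `demo_poll` are hypothetical names, and a pipe stands in for a network connection.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Return 1 if FD has data to read right now, 0 if not, -1 on error.
   A zero timespec makes pselect return immediately (a poll); a NULL
   timeout would instead block until something happens.  */
int
poll_readable (int fd)
{
  fd_set readfds;
  struct timespec timeout = { .tv_sec = 0, .tv_nsec = 0 };
  int n;

  assert (fd >= 0 && fd < FD_SETSIZE);
  FD_ZERO (&readfds);
  FD_SET (fd, &readfds);

  n = pselect (fd + 1, &readfds, NULL, NULL, &timeout, NULL);
  if (n < 0)
    return -1;
  return FD_ISSET (fd, &readfds) ? 1 : 0;
}

/* Poll an empty pipe, write one byte, poll again.  Stores the two
   poll results in *BEFORE and *AFTER; returns 0 on success.  */
int
demo_poll (int *before, int *after)
{
  int fds[2];

  if (pipe (fds) != 0)
    return -1;
  *before = poll_readable (fds[0]);
  if (write (fds[1], "x", 1) != 1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  *after = poll_readable (fds[0]);
  close (fds[0]);
  close (fds[1]);
  return 0;
}
```

Note that the second poll keeps reporting the descriptor as readable until somebody actually reads the byte. A loop that polls with a zero timeout and never drains the ready descriptors will spin, which is at least consistent with the repeated `= 2 (in [18 19])` results in the strace output, although the thread does not establish that this is what happened.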
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 8:51 ` Eli Zaretskii 2016-03-04 11:33 ` Lars Ingebrigtsen @ 2016-03-04 11:37 ` Lars Ingebrigtsen 2016-03-04 11:40 ` Lars Ingebrigtsen 2016-03-04 15:40 ` Eli Zaretskii 1 sibling, 2 replies; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-04 11:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: > Interestingly, if I run the same under GDB, the images load with about > 50% reliability. So I guess there are still some subtle timing issues > somewhere. Can you try this under GDB and see if there's some > difference? To avoid the image cache getting in the way (which may be confusing when testing this stuff), can you try (progn (setq shr-ignore-cache t) (blink-cursor-mode -1) (eww "https://www.fsf.org")) And you need Alain's fix for the shr image insertion stuff (which I checked in yesterday) if your Emacs doesn't have SVG support. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 11:37 ` Lars Ingebrigtsen @ 2016-03-04 11:40 ` Lars Ingebrigtsen 2016-03-04 15:41 ` Eli Zaretskii 2016-03-04 15:40 ` Eli Zaretskii 1 sibling, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-04 11:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 (And I'm not able to reproduce the problem when running under gdb, either with the airport wifi or the LTE card.) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 11:40 ` Lars Ingebrigtsen @ 2016-03-04 15:41 ` Eli Zaretskii 2016-03-04 15:43 ` Lars Ingebrigtsen 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-04 15:41 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: j_l_domenech@yahoo.com, a.s@realize.ch, 22789@debbugs.gnu.org > Date: Fri, 04 Mar 2016 11:40:12 +0000 > > (And I'm not able to reproduce the problem when running under gdb, > either with the airport wifi or the LTE card.) Not sure what that means, or where should I look for the problems. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 15:41 ` Eli Zaretskii @ 2016-03-04 15:43 ` Lars Ingebrigtsen 2016-03-04 16:12 ` Eli Zaretskii 0 siblings, 1 reply; 124+ messages in thread From: Lars Ingebrigtsen @ 2016-03-04 15:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: j_l_domenech, a.s, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> (And I'm not able to reproduce the problem when running under gdb, >> either with the airport wifi or the LTE card.) > > Not sure what that means, or where should I look for the problems. I was trying to say "it works for me (under gdb) both with a fast and a slow network connection". :-) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 15:43 ` Lars Ingebrigtsen @ 2016-03-04 16:12 ` Eli Zaretskii 0 siblings, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-04 16:12 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: j_l_domenech@yahoo.com, a.s@realize.ch, 22789@debbugs.gnu.org > Date: Fri, 04 Mar 2016 15:43:55 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> (And I'm not able to reproduce the problem when running under gdb, > >> either with the airport wifi or the LTE card.) > > > > Not sure what that means, or where should I look for the problems. > > I was trying to say "it works for me (under gdb) both with a fast and a > slow network connection". :-) This part, I did understand ;-) ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-04 11:37 ` Lars Ingebrigtsen 2016-03-04 11:40 ` Lars Ingebrigtsen @ 2016-03-04 15:40 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-03-04 15:40 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, a.s, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: a.s@realize.ch, j_l_domenech@yahoo.com, 22789@debbugs.gnu.org > Date: Fri, 04 Mar 2016 11:37:26 +0000 > > To avoid the image cache getting in the way (which may be confusing when > testing this stuff), can you try > > (progn > (setq shr-ignore-cache t) > (blink-cursor-mode -1) > (eww "https://www.fsf.org")) Works 100% reliably if I invoke Emacs from the shell, but only about 50% when run under GDB. Maybe an additional factor is that my network connection is somewhat slow. But enlarging the url-queue-timeout to 120 didn't help. > And you need Alain's fix for the shr image insertion stuff (which I > checked in yesterday) if your Emacs doesn't have SVG support. My Emacs does have SVG support (via librsvg), and my master branch is current anyway. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 3:43 ` Eli Zaretskii 2016-03-01 5:17 ` Lars Ingebrigtsen @ 2016-03-01 15:43 ` Alain Schneble 2016-03-01 16:07 ` Eli Zaretskii 1 sibling, 1 reply; 124+ messages in thread From: Alain Schneble @ 2016-03-01 15:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> Date: Tue, 1 Mar 2016 00:20:45 +0100 >> Cc: j_l_domenech@yahoo.com, 22789@debbugs.gnu.org >> >> See also patch in the previous message. If we decide to go this >> direction, then I think we should also inhibit the first call to >> gnutls_try_handshake from gnutls_boot, for async sockets. This call is >> currently implicit and will probably fail (return with EAGAIN) in all >> practical cases. > > That patch exposes a w32-specific stuff to process.c, so it cannot be > right. Thanks. I was unaware of that. I should have done a grep on the usage of FILE_CONNECT first before proposing this patch. That reveals clearly that it is w32-specific. I'm sorry. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 15:43 ` Alain Schneble @ 2016-03-01 16:07 ` Eli Zaretskii 2016-03-01 16:26 ` Alain Schneble 0 siblings, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-03-01 16:07 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Tue, 1 Mar 2016 16:43:50 +0100 > > > That patch exposes a w32-specific stuff to process.c, so it cannot be > > right. > > Thanks. I was unaware of that. I should have done a grep on the usage > of FILE_CONNECT first before proposing this patch. That reveals clearly > that it is w32-specific. I'm sorry. No need to apologize, you couldn't know this. (You may wish to read the large comment around line 800 in w32proc.c, which describes how this works on Windows.) ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-03-01 16:07 ` Eli Zaretskii @ 2016-03-01 16:26 ` Alain Schneble 0 siblings, 0 replies; 124+ messages in thread From: Alain Schneble @ 2016-03-01 16:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, j_l_domenech, 22789 Eli Zaretskii <eliz@gnu.org> writes: >> From: Alain Schneble <a.s@realize.ch> >> CC: <larsi@gnus.org>, <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> >> Date: Tue, 1 Mar 2016 16:43:50 +0100 >> >> > That patch exposes a w32-specific stuff to process.c, so it cannot be >> > right. >> >> Thanks. I was unaware of that. I should have done a grep on the usage >> of FILE_CONNECT first before proposing this patch. That reveals clearly >> that it is w32-specific. I'm sorry. > > No need to apologize, you couldn't know this. (You may wish to read > the large comment around line 800 in w32proc.c, which describes how > this works on Windows.) Thank you, Eli. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-27 18:05 ` Alain Schneble 2016-02-27 22:38 ` Lars Ingebrigtsen @ 2016-02-28 16:47 ` Eli Zaretskii 1 sibling, 0 replies; 124+ messages in thread From: Eli Zaretskii @ 2016-02-28 16:47 UTC (permalink / raw) To: Alain Schneble; +Cc: larsi, j_l_domenech, 22789 > From: Alain Schneble <a.s@realize.ch> > CC: Eli Zaretskii <eliz@gnu.org>, José L. Doménech > <j_l_domenech@yahoo.com>, <22789@debbugs.gnu.org> > Date: Sat, 27 Feb 2016 19:05:20 +0100 > > Lars Ingebrigtsen <larsi@gnus.org> writes: > > > (setq proc > > (make-network-process :name "foo" > > :buffer (get-buffer-create "*foo*") > > :host "imap.gmail.com" > > :service 993 > > :nowait t > > :tls-parameters > > (cons 'gnutls-x509pki > > (gnutls-boot-parameters > > :type 'gnutls-x509pki > > :hostname "imap.gmail.com")))) > > > > * OK Gimap ready for requests from 60.225.211.161 qr7mb410250987iec > > > > should appear. Also, after evaling that, what does > > It seems to be a timing issue. If I set gnutls-log-level to 5, this > works also on Windows (i.e i get OK Gimap...). Actually, with the latest trunk this works even without increasing the log level. I guess the latest changes in gnutls.c did it. > I guess that it enters this path because the socket is not ready yet. > But why? Because the connection didn't complete yet, and we are already trying the GnuTLS handshake. ^ permalink raw reply [flat|nested] 124+ messages in thread
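The ordering problem described here (starting the TLS handshake before the non-blocking connect has finished) is usually avoided by first checking that the socket is connected. The following is a generic POSIX sketch of that check, not what Emacs or its MS-Windows port does; `connect_status` and `demo_connected_pair` are hypothetical names, and the demo uses an already connected socket pair rather than a real pending connection.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Report the state of a non-blocking connect on FD:
   1 = connected, 0 = still in progress, -1 = failed.
   Only when this returns 1 is it sensible to start a handshake.  */
int
connect_status (int fd)
{
  fd_set writefds;
  struct timespec timeout = { .tv_sec = 0, .tv_nsec = 0 };
  int err = 0;
  socklen_t len = sizeof err;
  int n;

  assert (fd >= 0 && fd < FD_SETSIZE);
  FD_ZERO (&writefds);
  FD_SET (fd, &writefds);

  /* A pending connect makes the socket writable once it completes,
     successfully or not.  Poll without waiting.  */
  n = pselect (fd + 1, NULL, &writefds, NULL, &timeout, NULL);
  if (n < 0)
    return -1;
  if (n == 0)
    return 0;			/* Not writable yet: still connecting.  */

  /* Writable: SO_ERROR tells success (0) from failure.  */
  if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0 || err != 0)
    return -1;
  return 1;
}

/* Run the check on one end of a connected AF_UNIX socket pair.  */
int
demo_connected_pair (void)
{
  int sv[2];
  int status;

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    return -1;
  status = connect_status (sv[0]);
  close (sv[0]);
  close (sv[1]);
  return status;
}
```

A caller would retry later when the result is 0 instead of attempting the handshake right away, which is the situation where a first handshake attempt tends to come back with EAGAIN.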
* bug#22789: 25.1.50; In last master build https connections stop working 2016-02-24 23:48 ` Lars Ingebrigtsen 2016-02-25 0:02 ` Lars Ingebrigtsen @ 2016-02-25 3:46 ` Eli Zaretskii 2016-02-25 5:00 ` Lars Ingebrigtsen 1 sibling, 1 reply; 124+ messages in thread From: Eli Zaretskii @ 2016-02-25 3:46 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: j_l_domenech, 22789 > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: José L. Doménech > <j_l_domenech@yahoo.com>, 22789@debbugs.gnu.org > Date: Thu, 25 Feb 2016 10:48:26 +1100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > I confirm the problem with the MS-Windows build: on master, https > > doesn't work; on emacs-25 it does. > > > > First suspect is the async changes, of course. > > Yup. I'll try do do a build without getaddrinfo_a support and see > whether I can reproduce the https error here... Note that the Windows build doesn't use getaddrinfo, either, it uses gethostbyname etc. ^ permalink raw reply [flat|nested] 124+ messages in thread
* bug#22789: 25.1.50; In last master build https connections stop working
From: Lars Ingebrigtsen @ 2016-02-25 5:00 UTC
To: Eli Zaretskii; +Cc: j_l_domenech, 22789

Eli Zaretskii <eliz@gnu.org> writes:

> Note that the Windows build doesn't use getaddrinfo, either, it uses
> gethostbyname etc.

Yeah, I undefined HAVE_GETADDRINFO, too, but wasn't able to reproduce.
Are there other defines that differ between Linux and Windows (other
than the obvious WINDOWSNT/GNU_LINUX ones) that are relevant for
process.c?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no