* bug#42140: 26.3; sigsegv when using nss-docker
From: Hans van den Bogert @ 2020-06-30 10:13 UTC (permalink / raw)
To: 42140
Dear Bug squashers,
To reproduce, have 'nss-docker'[1] installed. This library can be added to
nsswitch.conf to intercept .docker host requests.
I have not seen problems with any other program in combination with nss-docker.
Since Emacs 26, most likely because of its newly introduced use of
multiple threads, a simple `M-x list-packages` with multiple repos
configured (e.g. gnu, melpa) crashes with SIGSEGV with high
probability.
I am not well-versed enough in debugging multithreaded Emacs to conclude
whether this is a problem in Emacs or in nss-docker. But to reiterate, since I
have not encountered this at all with other programs, I'll start at
Emacs.
Thanks in advance for any effort,
Hans
[1] https://github.com/dex4er/nss-docker
Starting program: /usr/bin/emacs -u /tmp
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdb75b700 (LWP 26156)]
[New Thread 0x7fffdaa77700 (LWP 26157)]
[New Thread 0x7fffd9fdc700 (LWP 26158)]
[New Thread 0x7fffd8f51b40 (LWP 26232)]
[New Thread 0x7fffd8cffb40 (LWP 26233)]
NSS DEBUG: Called _nss_debug_gethostbyname4_r with args (name:
elpa.gnu.org)
NSS DEBUG: Called _nss_debug_gethostbyname4_r with args (name:
stable.melpa.org)
[New Thread 0x7fffd8f39b40 (LWP 26234)]
_nss_docker_gethostbyname2_r(name="elpa.gnu.org", af=10)
_nss_docker_gethostbyname2_r(name="stable.melpa.org", af=10)
_nss_docker_gethostbyname3_r(name="elpa.gnu.org", af=10)
_nss_docker_gethostbyname2_r(name="elpa.gnu.org", af=2)
NSS DEBUG: Called _nss_debug_gethostbyname4_r with args (name: orgmode.org)
_nss_docker_gethostbyname3_r(name="elpa.gnu.org", af=2)
Thread 6 "emacs" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd8cffb40 (LWP 26233)]
0x00007fffd8aafcd5 in _nss_docker_gethostbyname3_r
(name=0x2e6f732e312e302d <error: Cannot access memory at address
0x2e6f732e312e302d>, af=2002936162, result=0x6e672d78756e696c,
buffer=0x2d34365f3638782f <error: Cannot access memory at address
0x2d34365f3638782f>, buflen=7091318039310988591,
errnop=0x312e6f732e312e, herrnop=0x302d77626162696c,
ttlp=0x302e6f732e6563, canonp=0x697672657373746e)
at libnss_docker.c:72
72 ) {
(gdb) bt full
#0 0x00007fffd8aafcd5 in _nss_docker_gethostbyname3_r
(name=0x2e6f732e312e302d <error: Cannot access memory at address
0x2e6f732e312e302d>, af=2002936162, result=0x6e672d78756e696c,
buffer=0x2d34365f3638782f <error: Cannot access memory at address
0x2d34365f3638782f>, buflen=7091318039310988591,
errnop=0x312e6f732e312e, herrnop=0x302d77626162696c,
ttlp=0x302e6f732e6563, canonp=0x697672657373746e)
at libnss_docker.c:72
name_len = 3414407380873671541
hostname =
"86_64-linux-gnu/libX11-xcb.so\000libXxf86vm.so.1\000/usr/lib/x86_64-linux-gnu/libXxf86vm.so.1\000libXxf86vm.so.1\000/usr/lib/i386-linux-gnu/libXxf86vm.so.1\000libXxf86vm.so\000/usr/lib/x86_64-linux-gnu/libXxf86vm.so\000li"...
hostname_suffix_ptr = 0x312e6f732e616162
docker_api_addr =
{sun_family = 12593, sun_path =
".so.6\000libX11.so.6\000/usr/lib/i386-linux-gnu/libX11.so.6\000libX11.so\000/usr/lib/x86_64-linux-gnu/libX11.so\000libX11-x"}
docker_api_addr_len = 1869819507
buffer_size = 3346019690390575202
buffer_offset = 7795575320214437942
sockfd = 788541486
req_message_buffer =
"86_64-linux-gnu/libX11-xcb.so.1\000libX11-xcb.so.1\000/usr/lib/i386-linux-gnu/libX11-xcb.so.1\000libX11-xcb.so\000/usr/lib"
req_message_len = 7596498840077020928
res_message_buffer = Python Exception value requires 102400 bytes, which
is more than max-value-size:
#1 0x00007fffd8ab0518 in _nss_docker_gethostbyname2_r (name=0x3ba8368
"stable.melpa.org", af=10, result=0x7fffd8cfe7d0, buffer=0x7fffd8cfea40
"\377\002", buflen=1024, errnop=0x7fffd8cff948, herrnop=0x7fffd8cff9ac)
at libnss_docker.c:340
#2 0x00007fffebf70f9f in gaih_inet (name=name@entry=0x3ba8368
"stable.melpa.org", service=<optimized out>, req=req@entry=0x3ba8338,
pai=pai@entry=0x7fffd8cfe9c8, naddrs=naddrs@entry=0x7fffd8cfe9c4,
tmpbuf=tmpbuf@entry=0x7fffd8cfea30) at ../sysdeps/posix/getaddrinfo.c:873
th = {h_name = 0x0, h_aliases = 0x0, h_addrtype = 0, h_length = 0,
h_addr_list = 0x0}
localcanon = 0x0
fct = 0x7fffd8ab04a4 <_nss_docker_gethostbyname2_r>
fct4 =
pat = 0x7fffd8cfe7b8
no_inet6_data = 0
nip = 0x2c5eb30
status =
no_more = 0
no_data = 0
inet6_status = NSS_STATUS_UNAVAIL
res_ctx = 0x7fffc8000b20
res_enable_inet6 =
tp =
st = 0x7fffd8cfe6f0
at = 0x7fffd8cfe6b0
got_ipv6 = false
canon = 0x0
orig_name = 0x3ba8368 "stable.melpa.org"
alloca_used =
port =
malloc_name = false
addrmem = 0x0
canonbuf = 0x0
result = 0
#3 0x00007fffebf72ce4 in __GI_getaddrinfo (name=<optimized out>,
service=<optimized out>, hints=0x3ba8338, pai=pai@entry=0x3ba8318)
at ../sysdeps/posix/getaddrinfo.c:2300
tmpbuf =
{data = 0x7fffd8cfea40, length = 1024, __space = {__align =
{__max_align_ll = 767, __max_align_ld = 5.1301383008835767187e-4937},
__c = "\377\002", '\000' ,
"\002@\352\317\330\377\177\000\000\000\000\000\000\000\000\000\000ff02::2\000ip6-allrouters",
'\000' ,
"v\352\317\330\377\177\000\000\000\000\000\000\000\000\000\000ts\n",
'\000' ...}}
i = 0
last_i = 0
nresults = 0
p = 0x0
gaih_service = {name = 0x3ba8379 "443", num = 443}
pservice =
local_hints =
{ai_flags = 0, ai_family = 0, ai_socktype = 0, ai_protocol = 0,
ai_addrlen = 0, ai_addr = 0x0, ai_canonname = 0x0, ai_next = 0x0}
in6ai = 0x0
in6ailen = 0
seen_ipv4 = false
seen_ipv6 = false
check_pf_called = false
end = 0x7fffd8cfe9c8
naddrs = 0
__PRETTY_FUNCTION__ = "getaddrinfo"
#4 0x00007fffecb5a058 in handle_requests (arg=<optimized out>) at gai_misc.c:317
req = 0x3ba8300
srchp =
lastp =
runp = 0x3d84690
__PRETTY_FUNCTION__ = "handle_requests"
#5 0x00007fffecd646db in start_thread (arg=0x7fffd8cffb40) at
pthread_create.c:463
pd = 0x7fffd8cffb40
now =
unwind_buf =
{cancel_jmp_buf = {{jmp_buf = {140736830896960, -2868501273485909582,
140736830894080, 0, 64505488, 140737488329792, 2868433241719562674,
2868459701649662386}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0,
0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call =
#6 0x00007fffebf8c88f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) xbacktrace
Undefined command: "xbacktrace". Try "help".
In GNU Emacs 26.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.30)
of 2019-09-16 built on lcy01-amd64-030
Windowing system distributor 'The X.Org Foundation
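A note on the frame #0 arguments in the backtrace above: values such as
name=0x2e6f732e312e302d look more like ASCII text than like pointers. The
following Python sketch decodes a few of them as little-endian bytes. It is
not part of the original report; `decode_word` is a hypothetical helper name,
and little-endian byte order is assumed because the build is x86_64.

```python
def decode_word(value: int, size: int = 8) -> str:
    """Render an integer as the bytes it occupies in little-endian memory."""
    raw = value.to_bytes(size, "little")
    # Keep printable ASCII, escape everything else (e.g. NUL padding).
    return "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in raw)


# Argument values copied from frame #0 of the backtrace.
frame0_args = {
    "name": 0x2E6F732E312E302D,
    "result": 0x6E672D78756E696C,
    "buffer": 0x2D34365F3638782F,
}

for arg, value in frame0_args.items():
    print(f"{arg:>6} = {decode_word(value)!r}")
```

The decoded fragments ('-0.1.so.', 'linux-gn', '/x86_64-') resemble the
shared-library paths that also appear in the `hostname` and
`req_message_buffer` locals, which suggests the frame contents were
overwritten with string data rather than passed in as these arguments.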
* bug#42140: 26.3; sigsegv when using nss-docker
From: Eli Zaretskii @ 2020-06-30 15:40 UTC (permalink / raw)
To: Hans van den Bogert; +Cc: 42140
> From: Hans van den Bogert <hansbogert@gmail.com>
> Date: Tue, 30 Jun 2020 12:13:24 +0200
>
> Since Emacs 26, most likely because of its newly introduced use of
> multiple threads, a simple `M-x list-packages` with multiple repos
> configured (e.g. gnu, melpa) crashes with SIGSEGV with high
> probability.
>
> I am not well-versed enough in debugging multithreaded Emacs to conclude
> whether this is a problem in Emacs or in nss-docker. But to reiterate, since I
> have not encountered this at all with other programs, I'll start at
> Emacs.
Emacs is not multithreaded. If you never start any additional Lisp
threads, only one thread ever runs (not counting GTK threads, but
those aren't new in Emacs 26).
The backtrace seems to suggest it's a problem in nss-docker, since the
crash is in its code. Are you sure this is an Emacs problem?
* bug#42140: 26.3; sigsegv when using nss-docker
From: Hans van den Bogert @ 2020-06-30 20:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 42140
On 6/30/20 5:40 PM, Eli Zaretskii wrote:
> Emacs is not multithreaded. If you never start any additional Lisp
> threads, only one thread ever runs (not counting GTK threads, but
> those aren't new in Emacs 26).
>
> The backtrace seems to suggest it's a problem in nss-docker, since the
> crash is in its code. Are you sure this is an Emacs problem?
> Emacs is not multithreaded.
You are right, poor choice of words; concurrent seems to be the proper
word. The release notes of v26 do note the change to an async network layer:
Release note v26 snippet
--->8---
** The networking code has been reworked so that it's more
asynchronous than it was (when specifying :nowait t in
'make-network-process'). How asynchronous it is varies based on the
capabilities of the system, but on a typical GNU/Linux system the DNS
resolution, the connection, and (for TLS streams) the TLS negotiation
are all done without blocking the main Emacs thread. To get
asynchronous TLS, the TLS boot parameters have to be passed in (see
the manual for details).
--->8---
> If you never start any additional Lisp
> threads, only one thread ever runs (not counting GTK threads, but
> those aren't new in Emacs 26).
I am an extreme novice with respect to Emacs development, but I have to
disagree. In contrast to v25, I can see this async change in the debug
prints which I added to the `_nss_docker_*_r` functions: the internal
calls of different `_nss_docker_gethostbyname2_r` invocations can
interleave.
Further, I think I see 2 threads for 2 name resolutions (is this what you
meant by 'additional Lisp threads'?):
```
Thread 7 (Thread 0x7fffd8ce7b40 (LWP 18899)):
#0 0x00007fffd8acecd5 in _nss_docker_gethostbyname3_r (name=Python
Exception <class 'gdb.MemoryError'> Cannot access memory at address
0x7fffd8ccd388:
#1 0x00007fffd8acf518 in _nss_docker_gethostbyname2_r (name=0x2d72768
"orgmode.org", af=10, result=0x7fffd8ce67d0, buffer=0x7fffd8ce6a40
"\377\002", buflen=1024, errnop=0x7fffd8ce7948, herrnop=0x7fffd8ce79ac)
at libnss_docker.c:340
#2 0x00007fffebf70f9f in gaih_inet (name=name@entry=0x2d72768
"orgmode.org", service=<optimized out>, req=req@entry=0x2d72738,
pai=pai@entry=0x7fffd8ce69c8, naddrs=naddrs@entry=0x7fffd8ce69c4,
tmpbuf=tmpbuf@entry=0x7fffd8ce6a30) at ../sysdeps/posix/getaddrinfo.c:873
#3 0x00007fffebf72ce4 in __GI_getaddrinfo (name=<optimized out>,
service=<optimized out>, hints=0x2d72738, pai=pai@entry=0x2d72718) at
../sysdeps/posix/getaddrinfo.c:2300
#4 0x00007fffecb5a058 in handle_requests (arg=<optimized out>) at
gai_misc.c:317
#5 0x00007fffecd646db in start_thread (arg=0x7fffd8ce7b40) at
pthread_create.c:463
#6 0x00007fffebf8c88f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
...
Thread 5 (Thread 0x7fffd8f51b40 (LWP 18897)):
#0 0x00007fffd8acecd5 in _nss_docker_gethostbyname3_r
(name=0x2e6f732e312e302d <error: Cannot access memory at address
0x2e6f732e312e302d>, af=2002936162, result=0x6e672d78756e696c,
buffer=0x2d34365f3638782f <error: Cannot access memory at address
0x2d34365f3638782f>, buflen=7091318039310988591,
errnop=0x312e6f732e312e, herrnop=0x302d77626162696c,
ttlp=0x302e6f732e6563, canonp=0x697672657373746e) at libnss_docker.c:72
#1 0x00007fffd8acf518 in _nss_docker_gethostbyname2_r (name=0x338a068
"elpa.gnu.org", af=10, result=0x7fffd8f507d0, buffer=0x7fffd8f50a40
"\377\002", buflen=1024, errnop=0x7fffd8f51948, herrnop=0x7fffd8f519ac)
at libnss_docker.c:340
#2 0x00007fffebf70f9f in gaih_inet (name=name@entry=0x338a068
"elpa.gnu.org", service=<optimized out>, req=req@entry=0x338a038,
pai=pai@entry=0x7fffd8f509c8, naddrs=naddrs@entry=0x7fffd8f509c4,
tmpbuf=tmpbuf@entry=0x7fffd8f50a30) at ../sysdeps/posix/getaddrinfo.c:873
#3 0x00007fffebf72ce4 in __GI_getaddrinfo (name=<optimized out>,
service=<optimized out>, hints=0x338a038, pai=pai@entry=0x338a018) at
../sysdeps/posix/getaddrinfo.c:2300
#4 0x00007fffecb5a058 in handle_requests (arg=<optimized out>) at
gai_misc.c:317
#5 0x00007fffecd646db in start_thread (arg=0x7fffd8f51b40) at
pthread_create.c:463
#6 0x00007fffebf8c88f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```
If someone could point out where the libc/NSS code is called on the
Emacs side, I can debug this further; to be honest, I'm having
difficulty pinpointing that.
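For readers who want to see the call pattern outside Emacs: the backtraces
above show one glibc helper thread (`handle_requests` in gai_misc.c) per
lookup, each ending up in `getaddrinfo`. Below is a rough stand-in in Python,
under these assumptions: `socket.getaddrinfo` is used because it calls the C
library's `getaddrinfo` (and so goes through the modules listed in
nsswitch.conf), a thread pool replaces `getaddrinfo_a`, and `resolve` and
`resolve_all` are hypothetical names. It only illustrates concurrent lookups
and is not a reproduction of the crash.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(host: str, port: int = 443):
    """Resolve one host; return sorted addresses, or the error as a string."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return f"error: {exc}"
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({info[4][0] for info in infos})


def resolve_all(hosts, port: int = 443):
    """Run the lookups concurrently, one worker thread per host."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        return dict(zip(hosts, pool.map(resolve, hosts)))
```

Calling `resolve_all(["elpa.gnu.org", "stable.melpa.org", "orgmode.org"])`
issues the three lookups from separate threads, so calls into any NSS module
can interleave in the way the debug prints show.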
* bug#42140: 26.3; sigsegv when using nss-docker
From: Hans van den Bogert @ 2020-07-01 12:39 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 42140
Just for information, I've bisected this to commit
fdfb68690f Implement asynchronous name resolution
Hans
* bug#42140: 26.3; sigsegv when using nss-docker
From: Hans van den Bogert @ 2020-07-06 6:50 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 42140
Dear Eli,
Please set this bug to 'invalid'. (Could I have done this myself?)
The example in the man page of `getaddrinfo_a` is enough to trigger this
locally.
I am at my wits' end, though, as to where the real problem lies.
Sorry for the lack of confidence in emacs and for the overhead of this
unneeded bug report ;)
Regards,
Hans
* bug#42140: 26.3; sigsegv when using nss-docker
From: Eli Zaretskii @ 2020-07-06 16:31 UTC (permalink / raw)
To: Hans van den Bogert; +Cc: 42140-done
> Cc: 42140@debbugs.gnu.org
> From: Hans van den Bogert <hansbogert@gmail.com>
> Date: Mon, 6 Jul 2020 08:50:21 +0200
>
> Please set this bug to 'invalid'. (Could I have done this myself?)
You can always close a bug by sending email to
NNNN-done@debbugs.gnu.org, where NNNN is the bug number. Like I did
now.
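As an illustration of the address scheme just described, here is a small
Python sketch that builds (but does not send) such a closing message. The
function name `closing_message`, the subject line, and the sender address are
assumptions made for the example; only the NNNN-done@debbugs.gnu.org address
form comes from this message.

```python
from email.message import EmailMessage


def closing_message(bug_number: int, sender: str, note: str) -> EmailMessage:
    """Build a mail addressed to NNNN-done@debbugs.gnu.org for a given bug."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = f"{bug_number}-done@debbugs.gnu.org"
    msg["Subject"] = f"bug#{bug_number}: closing"  # subject text is an assumption
    msg.set_content(note)
    return msg


msg = closing_message(42140, "user@example.org", "Not an Emacs bug.")
print(msg["To"])
```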
> The example in the man page of `getaddrinfo_a` is enough to trigger this
> locally.
> I am at my wits' end, though, as to where the real problem lies.
Thanks for telling us. Could this be a bug with your kernel or the
standard C library?
> Sorry for the lack of confidence in emacs and for the overhead of this
> unneeded bug report ;)
No need to apologize; it can happen to anyone.
* bug#42140: 26.3; sigsegv when using nss-docker
From: Hans van den Bogert @ 2020-07-07 8:49 UTC (permalink / raw)
Cc: 42140-done
On 7/6/20 6:31 PM, Eli Zaretskii wrote:
> Thanks for telling us. Could this be a bug with your kernel or the
> standard C library?
The kernel seems unlikely. The only difference I can see is that
nss_docker's `_nss_docker_gethostbynameX_r` functions seem 'off' at the
assembly level compared to, for example, the equivalent functions of
`nss_mdns_minimal` and libc's `nss_dns`.
The offsets used when referencing stack locations on function entry are
large (0xNNNNN), compared to the straightforward function-entry assembly
I see in nss_mdns and nss_dns, with 'normal' offsets of 0xNNN. I've
compared compiler flags and all, but I can't explain it. The weird
thing remains, of course: why does the shared library work fine when
it's called through the non-async variant, `gethostbyname`?
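One number from the original backtrace may be relevant to the large offsets:
gdb reported that printing `res_message_buffer` "requires 102400 bytes", i.e.
a local array of that size in `_nss_docker_gethostbyname3_r`. A small Python
check of what that means for offset widths follows. It is an illustration
only: that this buffer is what produces the offsets is an assumption, not
something verified against the disassembly, and `hex_digits` is a
hypothetical helper.

```python
# Sizes taken from the gdb output earlier in this thread.
RES_MESSAGE_BUFFER = 102400  # bytes gdb needed to print res_message_buffer
CALLER_BUFLEN = 1024         # buflen passed down from getaddrinfo's tmpbuf


def hex_digits(n: int) -> int:
    """Number of hex digits needed to write n as a stack offset."""
    return len(f"{n:x}")


print(hex(RES_MESSAGE_BUFFER), hex_digits(RES_MESSAGE_BUFFER))
print(hex(CALLER_BUFLEN), hex_digits(CALLER_BUFLEN))
```

A frame holding a 102400-byte array needs offsets of at least 0x19000, five
hex digits, which matches the 0xNNNNN pattern, while a frame with only small
locals stays within 0xNNN. Whether a frame of that size fits on the stack of
the thread that runs the lookup is not established in this thread.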
But I think the discussion is out of scope for this list/tracker, though
any pointers are welcome of course.
Regards,