unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
@ 2024-04-11 17:38 Michael Lausch
  2024-04-11 18:39 ` Eli Zaretskii
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Lausch @ 2024-04-11 17:38 UTC
  To: 70342

When Emacs loads a treesitter grammar on GNU/Linux, it calls dlopen()
with the RTLD_GLOBAL flag set. If you load more than one treesitter
grammar and two of them define a function with the same name, most
probably in their scanner.c files, symbol resolution may pick the wrong
symbol. For example, the org and the yaml grammars both define a
deserialize() function in their scanner.c files. This can result in a
call from the org grammar ending up in the deserialize() function
defined by the yaml grammar. That fails because the yaml function does
different things than the org grammar expects (it frees a dangling
pointer, and therefore Emacs crashes).

A solution can be:
1) use a special call to dlopen() without the RTLD_GLOBAL flag, similar
to what the eln loader does.
2) fix all the grammars and make their internal functions 'static' so
that the functions are not visible outside the compilation unit.
3) something I didn't think of
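
As a rough illustration of option 1, here is a minimal sketch of a
loader that opens a library without RTLD_GLOBAL. The helper names
open_library_local() and find_entry_point() are hypothetical and are not
Emacs functions; only dlopen(), dlsym(), dlerror() and the RTLD_* flags
are real. With RTLD_LOCAL, the symbols of one grammar are not offered to
libraries loaded later, so each entry point has to be looked up through
its own handle.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of Emacs): open a shared library
   without RTLD_GLOBAL, so that its symbols are not made available for
   resolving references in libraries loaded afterwards.  RTLD_LOCAL is
   spelled out only to make the intent explicit.  */
void *
open_library_local (const char *path)
{
  void *handle = dlopen (path, RTLD_LAZY | RTLD_LOCAL);
  if (handle == NULL)
    fprintf (stderr, "dlopen failed: %s\n", dlerror ());
  return handle;
}

/* Hypothetical helper: look up one entry point through a specific
   handle.  Returns NULL if the symbol cannot be found.  */
void *
find_entry_point (void *handle, const char *name)
{
  dlerror ();                   /* Clear any stale error state.  */
  void *sym = dlsym (handle, name);
  if (dlerror () != NULL)
    return NULL;
  return sym;
}
```

A caller would open each grammar with open_library_local() and then
fetch only the grammar's entry point via find_entry_point(). On older
glibc versions the program may need to be linked with -ldl.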


* bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
  2024-04-11 17:38 bug#70342: 29.3.50; treesitter and RTLD_GLOBAL Michael Lausch
@ 2024-04-11 18:39 ` Eli Zaretskii
  2024-04-11 18:47   ` Michael Lausch
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2024-04-11 18:39 UTC
  To: Michael Lausch; +Cc: 70342

> From: Michael Lausch <mick.lausch@gmail.com>
> Date: Thu, 11 Apr 2024 19:38:52 +0200
> 
> When Emacs loads a treesitter grammar on GNU/Linux, it calls dlopen() with the RTLD_GLOBAL flag set. If
> you load more than one treesitter grammar and two of them define a function with the same name, most
> probably in their scanner.c files, symbol resolution may pick the wrong symbol.
> For example, the org and the yaml grammars both define a deserialize() function in their scanner.c files. This
> can result in a call from the org grammar ending up in the deserialize() function defined by the yaml grammar.
> That fails because the yaml function does different things than the org grammar expects (it frees a dangling
> pointer, and therefore Emacs crashes).
> 
> A solution can be:
> 1) use a special call to dlopen() without the RTLD_GLOBAL flag, similar to what the eln loader does.
> 2) fix all the grammars and make their internal functions 'static' so that the functions are not visible
> outside the compilation unit.
> 3) something I didn't think of

If those 'serialize' functions are not needed to be called from
outside of the shared library, the usual way is not to export them,
i.e. to give all symbols except the few that need to be exported the
so-called "hidden visibility".
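
As a sketch of what this could look like in a grammar's scanner.c,
assuming GCC or Clang and a hypothetical grammar named "demo": the
internal helpers are 'static', and only the entry points carry default
visibility, so the file can be compiled with -fvisibility=hidden. The
DemoScanner type and the SCANNER_EXPORT macro are made up for this
example; the entry-point names follow the
tree_sitter_<lang>_external_scanner_* convention used by treesitter
external scanners.

```c
/* Assumption: GCC or Clang.  Keeps a symbol exported even when the
   file is compiled with -fvisibility=hidden.  */
#define SCANNER_EXPORT __attribute__ ((visibility ("default")))

/* Hypothetical scanner state, for illustration only.  */
typedef struct
{
  unsigned char indent;
  unsigned char depth;
} DemoScanner;

/* Internal helper.  'static' gives it internal linkage, so it never
   appears in the dynamic symbol table and cannot collide with a
   deserialize() defined by another grammar.  */
static void
deserialize (DemoScanner *s, const char *buffer, unsigned length)
{
  s->indent = length > 0 ? (unsigned char) buffer[0] : 0;
  s->depth = length > 1 ? (unsigned char) buffer[1] : 0;
}

/* Internal helper: write the state into BUFFER, return bytes used.  */
static unsigned
serialize (const DemoScanner *s, char *buffer)
{
  buffer[0] = (char) s->indent;
  buffer[1] = (char) s->depth;
  return 2;
}

/* Exported entry points: the prefixed names are unique per grammar.  */
SCANNER_EXPORT unsigned
tree_sitter_demo_external_scanner_serialize (void *payload, char *buffer)
{
  return serialize (payload, buffer);
}

SCANNER_EXPORT void
tree_sitter_demo_external_scanner_deserialize (void *payload,
                                               const char *buffer,
                                               unsigned length)
{
  deserialize (payload, buffer, length);
}
```

After building the shared library, "nm -D --defined-only" on the
resulting file should list only the prefixed entry points and no bare
serialize or deserialize.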






* bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
  2024-04-11 18:39 ` Eli Zaretskii
@ 2024-04-11 18:47   ` Michael Lausch
  2024-04-11 18:54     ` Eli Zaretskii
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Lausch @ 2024-04-11 18:47 UTC
  To: Eli Zaretskii; +Cc: 70342

On Thu, Apr 11, 2024 at 8:39 PM Eli Zaretskii <eliz@gnu.org> wrote:

> If those 'serialize' functions are not needed to be called from
> outside of the shared library, the usual way is not to export them,
> i.e. to give all symbols except the few that need to be exported the
> so-called "hidden visibility".
>

I agree that this would be the cleanest way to solve the problem, but it
would mean patching all the existing grammars, and maybe all future
grammars, and pushing the changes to their maintainers.

I have started to prepare patches for the yaml and org grammars (those
were the ones that triggered the bug for me), and I'm going to get them
merged upstream.


* bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
  2024-04-11 18:47   ` Michael Lausch
@ 2024-04-11 18:54     ` Eli Zaretskii
  2024-04-11 19:04       ` Michael Lausch
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2024-04-11 18:54 UTC
  To: Michael Lausch; +Cc: 70342

> From: Michael Lausch <mick.lausch@gmail.com>
> Date: Thu, 11 Apr 2024 20:47:50 +0200
> Cc: 70342@debbugs.gnu.org
> 
> I agree that this would be the cleanest way to solve the problem, but it would mean patching all the existing
> grammars, and maybe all future grammars, and pushing the changes to their maintainers.
> 
> I have started to prepare patches for the yaml and org grammars (those were the ones that triggered the bug
> for me), and I'm going to get them merged upstream.

I understand, but why is this an Emacs problem?  We use RTLD_GLOBAL
for a reason, and the problem of not exposing unnecessary symbols
should be solved by the respective libraries and those who build them.






* bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
  2024-04-11 18:54     ` Eli Zaretskii
@ 2024-04-11 19:04       ` Michael Lausch
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Lausch @ 2024-04-11 19:04 UTC
  To: Eli Zaretskii; +Cc: 70342

On Thu, Apr 11, 2024 at 8:54 PM Eli Zaretskii <eliz@gnu.org> wrote:

> I understand, but why is this an Emacs problem?  We use RTLD_GLOBAL
> for a reason, and the problem of not exposing unnecessary symbols
> should be solved by the respective libraries and those who build them.
>

You are completely right. The thing is that it may take a long time to
fix all the grammars, and in the meantime, whenever someone loads two
buggy grammars in the same Emacs process, Emacs will crash. That causes
more bug reports against Emacs, even though it isn't an Emacs problem.

Adding yet another dlopen() wrapper function may mitigate this, but I
think it would lead to the grammars not being fixed, because everything
would then appear to work.

Therefore I filed a bug report instead of submitting a patch.
