unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* [PATCH] load-foreign-library: perform substring match on library files
       [not found] <23TGNISPS9PPL.2YOENNQD9V56X@8pit.net>
@ 2022-07-23 11:09 ` soeren
  2022-07-23 13:17   ` Maxime Devos
  0 siblings, 1 reply; 3+ messages in thread
From: soeren @ 2022-07-23 11:09 UTC (permalink / raw)
  To: bug-guile; +Cc: guile-devel

From: Sören Tempel <soeren@soeren-tempel.net>

This patch is a fix for bug #49232 [1]. To summarize this bug, the
current load-foreign-library implementation does not load versioned
sonames (e.g. libfoo.so.5) which are common on Linux. This is an issue
for us at Alpine Linux since we ship unversioned sonames (e.g. libfoo.so)
separately. Please refer to the original bug report for details.

This patch attempts to fix this issue by performing a substring match
on library files in load-foreign-library. That is, when loading the
library file for `libfoo` the new algorithm will perform a substring
prefix match and return the first file which starts with `libfoo.so`.
Therefore, the new algorithm will match both `libfoo.so.5` and `libfoo.so`
while the current algorithm only matched the latter. In order to
implement this, the new algorithm has to perform a readdir(2) syscall
on directories in $LD_LIBRARY_PATH instead of just checking for
the presence of a single file in each directory.

Discussion: It may be desirable to make the prefix substring check more
strict, presently `libzstd.something` would also match. While I believe
it to be unlikely that such files exist in $LD_LIBRARY_PATH we could
also perform a substring match against `basename + ext + #\.`, i.e.
libzstd.so., libstzstd.so.1, libzstd.so.1.5.2 etc would match but
libzstd.something wouldn't. Furthermore, if both libzstd.so.1 and
libzstd.so exist in $LD_LIBRARY_PATH then the algorithm proposed here
may prefer the former (depending on the readdir(2) file order).

* module/system/foreign-library.scm (file-exists-in-path-with-extension):
  perform a substring match on library files to also match versioned
  .so files.
* modules/system/foreign-library.scm (load-foreign-library): Perform a
  fuzzy substring search even if the library file contains a
  path-separator.

[1]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=49232

Signed-off-by: Sören Tempel <soeren@soeren-tempel.net>
---
This is my first patch for guile, if I missed anything please let me
know. Also, I am not subscribed to the list so please CC me.

 module/system/foreign-library.scm | 63 +++++++++++++++++--------------
 1 file changed, 34 insertions(+), 29 deletions(-)

diff --git a/module/system/foreign-library.scm b/module/system/foreign-library.scm
index dc426385f..87e5a3afc 100644
--- a/module/system/foreign-library.scm
+++ b/module/system/foreign-library.scm
@@ -57,30 +57,33 @@
    (else
     '(".so"))))
 
-(define (has-extension? head exts)
-  (match exts
-    (() #f)
-    ((ext . exts)
-     (or (string-contains head ext)
-         (has-extension? head exts)))))
-
-(define (file-exists-with-extension head exts)
-  (if (has-extension? head exts)
-      (and (file-exists? head) head)
-      (let lp ((exts exts))
-        (match exts
-          (() #f)
-          ((ext . exts)
-           (let ((head (string-append head ext)))
-             (if (file-exists? head)
-                 head
-                 (lp exts))))))))
+(define (filename-matches-with-extension? filename basename exts)
+  (let lp ((exts exts))
+    (match exts
+      (() #f)
+      ((ext . exts)
+       ;; Fuzzy comparison of filename with basename + ext. If the
+       ;; latter is a prefix of the former, consider it a match. This
+       ;; allows matching .so files with versions, e.g. libfoo.so.5.
+       (let ((prefix (string-append basename ext)))
+         (or (string-prefix? prefix filename)
+             (lp exts)))))))
+
+(define (file-exists-in-dir-with-extension dir basename exts)
+  (let* ((dir-stream (opendir dir))
+         (ret (let loop ((fn (readdir dir-stream)))
+                (and (not (eof-object? fn))
+                     (if (filename-matches-with-extension? fn basename exts)
+                       (in-vicinity dir fn)
+                       (loop (readdir dir-stream)))))))
+    (closedir dir-stream)
+    ret))
 
 (define (file-exists-in-path-with-extension basename path exts)
   (match path
     (() #f)
     ((dir . path)
-     (or (file-exists-with-extension (in-vicinity dir basename) exts)
+     (or (file-exists-in-dir-with-extension dir basename exts)
          (file-exists-in-path-with-extension basename path exts)))))
 
 (define path-separator
@@ -198,16 +201,18 @@ name."
      (dlopen* #f))
     ((or (absolute-file-name? filename)
          (string-any file-name-separator? filename))
-     (cond
-      ((or (file-exists-with-extension filename extensions)
-           (and search-ltdl-library-path?
-                (file-exists-with-extension
-                 (in-vicinity (in-vicinity (dirname filename) ".libs")
-                              (basename filename))
-                 extensions)))
-       => dlopen*)
-      (else
-       (error-not-found))))
+     (let ((dirname (dirname filename))
+           (basename (basename filename)))
+       (cond
+        ((or (file-exists-in-dir-with-extension dirname basename extensions)
+             (and search-ltdl-library-path?
+                  (file-exists-in-dir-with-extension
+                   (in-vicinity dirname ".libs")
+                   basename
+                   extensions)))
+         => dlopen*)
+        (else
+         (error-not-found)))))
     ((file-exists-in-path-with-extension filename search-path extensions)
      => dlopen*)
     (search-system-paths?



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] load-foreign-library: perform substring match on library files
  2022-07-23 11:09 ` [PATCH] load-foreign-library: perform substring match on library files soeren
@ 2022-07-23 13:17   ` Maxime Devos
  2022-07-24 12:16     ` Sören Tempel
  0 siblings, 1 reply; 3+ messages in thread
From: Maxime Devos @ 2022-07-23 13:17 UTC (permalink / raw)
  To: soeren, bug-guile; +Cc: guile-devel


[-- Attachment #1.1.1.1: Type: text/plain, Size: 3614 bytes --]


On 23-07-2022 13:09, soeren@soeren-tempel.net wrote:
> From: Sören Tempel<soeren@soeren-tempel.net>
>
> This patch is a fix for bug #49232 [1]. To summarize this bug, the
> current load-foreign-library implementation does not load versioned
> sonames (e.g. libfoo.so.5) which are common on Linux. This is an issue
> for us at Alpine Linux since we ship unversioned sonames (e.g. libfoo.so)
> separately. Please refer to the original bug report for details.
>   [...]

Long term, I think it would be ideal for Guile to decide upon a major 
version (and maybe even location, depending on the choices of the 
distro) at _compile_ time instead of runtime, not unlike how other 
compilers work.

More concretely, we could have a macro 'link-foreign-library', of the form:

(link-foreign-library library [maybe other arguments).

If the Guile module is being compiled with --rpath, it searches 
$CROSS_LIBRARY_PATH or $LIBRARY_PATH and encodes the full file name 
(/usr/lib/.../libguile-... or /gnu/store/.../lib/...) in the .go, which 
avoids some manual patching we have to do in Guix.

It could also detect if it's loading a libfoo.so pointing to 
libfoo.so.N, and if so, replace libfoo.so by libfoo.so.N.

The former (rpath) would work for Guix, and I expect the latter 
(libfoo.so -> libfoo.so.N) to work on Alpine.

> +(define (file-exists-in-dir-with-extension dir basename exts)
> +  (let* ((dir-stream (opendir dir))
> +         (ret (let loop ((fn (readdir dir-stream)))
> +                (and (not (eof-object? fn))
> +                     (if (filename-matches-with-extension? fn basename exts)
> +                       (in-vicinity dir fn)
> +                       (loop (readdir dir-stream)))))))
IIUC, the string-prefix? search is non-deterministic, which can make 
debugging complicated when multiple versions are installed. I think it 
would be better to bail out if there are multiple matches instead of a 
risk of guessing incorrectly.

The prefixing behaviour is also not documented, so some documentation is 
needed in the manual.

Questions:

  * Does it go scanning the dir even if libfoo.so could be found?
    Otherwise, there are some possible performance gains by checking for
    libfoo.so first -- consider the case where /usr/lib is huge.
  * When doing (load-foreign-library "/gnu/store/.../libfoo.so")
    (absolute file name!), would it search for
    /gnu/store/.../libfoo.so.N?  If so, that would be surprising,
    especially if libfoo.so.N exists.
  * If doing libfoo.so.N, will it search for libfoo.so.N.M?
  * Does it only apply to the system paths, or also to
    GUILE_SYSTEM_EXTENSION_PATH, LTDL_LIBRARY_PATH and
    GUILE_EXTENSION_PATH? The latter would be surprising to me, as
    versioning is more of a system thing.
  * To test it, and avoid breaking things later with future changes to
    load-foreign-library, could some tests be added?
  * Is this change desirable?  I mean, this is an FFI API, so the ABI of
    the library is rather important. If a Guile module links to
    libfoo.so, and they had version N in mind, then it's important it
    doesn't link to N-1 or N+1 instead, because of ABI
    incompatibilities. As such, to me it seems _good_ that you got some
    errors, as now you get a reminder to explicitly state which ABI
    version is needed. (YMMV, and the mileage of the Guile maintainers
    might vary, etc.)

Also, this seems like a non-trivial change to me, so a copyright line 
might be in order, unless you did the copyright assignment.

Greetings,
Maxime.


[-- Attachment #1.1.1.2: Type: text/html, Size: 4673 bytes --]

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] load-foreign-library: perform substring match on library files
  2022-07-23 13:17   ` Maxime Devos
@ 2022-07-24 12:16     ` Sören Tempel
  0 siblings, 0 replies; 3+ messages in thread
From: Sören Tempel @ 2022-07-24 12:16 UTC (permalink / raw)
  To: Maxime Devos; +Cc: bug-guile, guile-devel

Hi,

Thanks for your feedback, comments below.

Maxime Devos <maximedevos@telenet.be> wrote:
> Long term, I think it would be ideal for Guile to decide upon a major 
> version (and maybe even location, depending on the choices of the 
> distro) at _compile_ time instead of runtime, not unlike how other 
> compilers work.

Sure, that could work too.

> If the Guile module is being compiled with --rpath, it searches
> $CROSS_LIBRARY_PATH or $LIBRARY_PATH and encodes the full file name
> (/usr/lib/.../libguile-... or /gnu/store/.../lib/...) in the .go,
> which avoids some manual patching we have to do in Guix.

What kind of manual patching do you do on Guix? Could you refer me to
the code for that? Maybe that's something we could also do on Alpine in
the meantime.

> IIUC, the string-prefix? search is non-deterministic, which can make 
> debugging complicated when multiple versions are installed. 

Well, on Linux readdir(3) file name order depends on the file system
implementation. Since the algorithm returns the first file with a
substring match, it depends on the readdir order. As long as the same
filesystem is used, it should be deterministic.

> I think it would be better to bail out if there are multiple matches
> instead of a risk of guessing incorrectly.

Many packages provide multiple .so files with different version
granularity so bailing out if there are multiple substring matches
doesn't really work. For example, for libgit2 I have the following files
installed on my system:

	$ ls /usr/lib/libgit2*
	/usr/lib/libgit2.so.1.4
	/usr/lib/libgit2.so.1.4.4

Where the former is a symlink to the latter. However, it would be
possible to collect all substring matches and prioritize them according
to some algorithm (e.g. alphabetical order). This would also make the
algorithm independent of the readdir(3) order.

> The prefixing behaviour is also not documented, so some documentation is 
> needed in the manual.

I can add some documentation if there is interest in merging this.

>   * Does it go scanning the dir even if libfoo.so could be found?
>     Otherwise, there are some possible performance gains by checking for
>     libfoo.so first -- consider the case where /usr/lib is huge.

Yes, a fast path could be added though you probably really need to have
a lot of files installed in /usr/lib for this to be worth it.

>   * When doing (load-foreign-library "/gnu/store/.../libfoo.so")
>     (absolute file name!), would it search for
>     /gnu/store/.../libfoo.so.N?  If so, that would be surprising,
>     especially if libfoo.so.N exists.

Yep, it does. I originally didn't want to modify the handling of
absolute paths but unfortunately during testing I noticed that Guile
extensions seem to be loaded with an absolute path and hence don't
work without the libfoo.so symlink [1].

>   * If doing libfoo.so.N, will it search for libfoo.so.N.M?

Yes, since libfoo.so.N is a prefix of libfoo.so.N.M.

>   * Does it only apply to the system paths, or also to
>     GUILE_SYSTEM_EXTENSION_PATH, LTDL_LIBRARY_PATH and
>     GUILE_EXTENSION_PATH? The latter would be surprising to me, as
>     versioning is more of a system thing.

If those paths are also searched using the load-foreign-library
procedure then they are affected by this change. Also, I am not a Guile
expert but on Alpine, Guile extensions such as guile-reader also ship
versioned sonames [1].

>   * To test it, and avoid breaking things later with future changes to
>     load-foreign-library, could some tests be added?

Probably, though I am not familiar with the Guile test setup and there
don't seem to be any existing tests for foreign-library.

>  * Is this change desirable?  I mean, this is an FFI API, so the ABI of
>    the library is rather important. If a Guile module links to
>    libfoo.so, and they had version N in mind, then it's important it
>    doesn't link to N-1 or N+1 instead, because of ABI
>    incompatibilities. As such, to me it seems _good_ that you got some
>    errors, as now you get a reminder to explicitly state which ABI
>    version is needed. (YMMV, and the mileage of the Guile maintainers
>    might vary, etc.)

In my experience, most languages which don't link against shared
libraries directly but instead load them at run-time don't hardcode ABI
versions (for example, refer to Python's ctypes.util.find_library).
Also, the current implementation of load-foreign-library does not force
you to specify an ABI version but instead loads whatever the libfoo.so
symlink refers to.

> Also, this seems like a non-trivial change to me, so a copyright line 
> might be in order, unless you did the copyright assignment.

I didn't do any copyright assignment yet but if there is actually any
interest in merging this then I can do it once we agreed on changes to
the algorithm.

Greetings,
Sören

[1]: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12783



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-07-24 12:16 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <23TGNISPS9PPL.2YOENNQD9V56X@8pit.net>
2022-07-23 11:09 ` [PATCH] load-foreign-library: perform substring match on library files soeren
2022-07-23 13:17   ` Maxime Devos
2022-07-24 12:16     ` Sören Tempel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).