Ludovic Courtès wrote on Sun 27-02-2022 at 14:52 [+0100]:
> It would add a dependency on Perl, which is not great (I’m not sure
> whether it complicates bootstrapping since Perl is already present early
> on, but it’s safer to avoid it.)
>
> We could rewrite ‘unidata_to_charset.pl’ in Scheme, but then Guile would
> still need to provide a pre-compiled version of srfi-14.i.c for
> bootstrapping purposes.  Or we could rewrite it in Awk, since Guile
> already depends on Awk anyway.
>
> Thoughts?

The ‘blob’ seems relatively harmless to the compilation process, so as
far as bootstrapping problems are concerned, I think we can leave it in.

However, all this Unicode data is important for some other things (e.g.
some DNS and filesystem things).  So it would be nice to validate that no
attacker with access to the Guile repo stealthily introduced some wrong
information during an otherwise routine update of the Unicode
information.

Hence, the following proposal:

  * Make perl an optional dependency of Guile (upstream) and add a
    '--with-unicode-data=[...]' configure flag or something like that.

    If perl is detected by './configure' and '--with-unicode-data=...'
    is set, then let one of the makefiles run 'unidata_to_charset.pl'
    and compare the 'new' srfi-14.i.c against the old srfi-14.i.c.
    In case of a mismatch, bail out.

    When there's no perl or no '--with-unicode-data', just use the
    bundled srfi-14.i.c.

  * Add 'perl' (or 'perl-boot0' because that perl is probably good
    enough?) to the native-inputs of guile.

Actually, the second is already done in 'guile-final'.

Optionally, this can be combined with rewriting it in Scheme or some
other language.

Greetings,
Maxime.
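
PS: To make the 'compare and bail out' step a bit more concrete, here is a rough sketch of the check, assuming the freshly generated file has already been written somewhere next to the bundled one. It is in Python only for brevity; in the Guile tree this would presumably be a makefile rule instead. The function names and the example file names in the comment are made up for illustration, not something that exists in Guile.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def matches_bundled(bundled, regenerated):
    """Return True when the regenerated file is byte-identical to the bundled one.

    Both arguments are file paths chosen by the caller, e.g. the bundled
    'srfi-14.i.c' and a hypothetical 'srfi-14.i.c.new' produced by the
    generator script.
    """
    return sha256_of(bundled) == sha256_of(regenerated)
```

The build would then stop with an error whenever matches_bundled returns False, and carry on with the bundled file otherwise.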