Hi Maxime,

Maxime Devos writes:

> Ludovic Courtès wrote on Mon 28-02-2022 at 12:45 [+0100]:
>
>> It might be easier to rewrite in Awk in build srfi-14.i.c
>> unconditionally no?
>
> I don't know any Awk and it seems to be quite different from languages
> I know, so for me doing that isn't easier. But for someone who knows
> some Awk, sure!

Well, I don’t consider myself an Awk person, but I had to implement it
for Gash-Utils, so I know it well enough! This may not be the most
idiomatic Awk program, but to my eyes it is no less readable than the
Perl version.

Note that this Awk script needs to be invoked using something like:

    $ awk -f unidata_to_charset.awk < UnicodeData.txt > srfi-14.i.c

That is, the Perl version had the file names hard-coded, but the Awk
version reads from stdin and writes to stdout. Also, the Awk version
does not shell out to 'indent' to post-process the file. That was
basically a no-op in the Perl version, so I removed it.

There are a few differences in how the script is structured, and I had
to convert all the hex literals to decimal, but the logical behaviour
should be exactly the same. I preserved all the comments and predicates
exactly from the Perl version. There are probably some differences in
error handling, but the input data is so simple that it shouldn’t
matter.

It runs with “gawk --posix”. If I run “gawk --lint”, I get warnings,
but I’m pretty sure they are spurious (they may even be Gawk bugs, but
I would have to double check the relevant specs and docs). If the lint
warnings are a problem, you can append the empty string to the argument
of the ‘hex’ function to make them go away.

Also, (as a bonus) as of commit 62c56f9 the Gash-Utils version of Awk
can run this script! :)

Of course, to use this script as part of the Guile build, someone™ will
have to double check that we can legally redistribute the Unicode data
file (probably okay, but always good to check), and update the build
rules to generate the C file.
I can’t guarantee that I’ll get to it....


-- Tim