On Saturday, 6 July 2024 at 15:32 -0500, Rob Browning wrote:
> At a minimum, I suggest Guile should produce an error by default
> (instead of generating incorrect data) when the system bytes cannot be
> encoded in the current locale.

I agree that an error would be better than replacing with a question
mark.

> As an incremental step, and as has been discussed elsewhere a bit, we
> might add support for uselocale()[2] and then document that the current
> recommendation is to always use ISO-8859-1 (i.e. Latin-1)[3] for system
> data unless you're certain your program doesn't need to be general
> purpose (perhaps you're sure you only care about UTF-8 systems).

A Latin-1 locale is a terrible default. Virtually no Linux system these
days has a locale encoding other than UTF-8. The exception is perhaps
the "C" locale, which people still use out of habit, with "LC_ALL=C" as
a way to say "speak English please", although most Linux distros have a
C.UTF-8 locale these days.

> The most direct (and compact, if we do convert to UTF-8) representation
> would be bytevectors, but then you would have a much more limited set of
> operations available (i.e. strings have all of srfi-13, srfi-14, etc.)
> unless we expanded them (likely re-using the existing code paths).  Of
> course you could still convert to Latin-1, perform the operation, and
> convert back, but that's not ideal.

Why is that "not ideal"? The (ice-9 iconv) API is convenient,
locale-independent and thread-safe.
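
For concreteness, here is a minimal sketch of the "convert to Latin-1, perform the operation, convert back" round trip using (ice-9 iconv). The byte sequence and the variable names (raw, as-string, stem, back) are made up for illustration; the only assumption is a Guile that ships the (ice-9 iconv) module.

```scheme
(use-modules (ice-9 iconv)        ; bytevector->string, string->bytevector
             (rnrs bytevectors)   ; u8-list->bytevector
             (srfi srfi-13))      ; string-drop-right

;; Hypothetical system bytes: "foo", the byte 0xFF, then ".txt".
;; The 0xFF byte makes this sequence invalid as UTF-8.
(define raw
  (u8-list->bytevector '(102 111 111 255 46 116 120 116)))

;; ISO-8859-1 maps every byte 0..255 to the code point with the same
;; value, so this decoding cannot fail, whatever the bytes are.
(define as-string (bytevector->string raw "ISO-8859-1"))

;; Any string operation can be applied now; here, drop the ".txt" suffix.
(define stem (string-drop-right as-string 4))

;; Encoding back to ISO-8859-1 recovers the original bytes of the stem.
(define back (string->bytevector stem "ISO-8859-1"))
```

Both conversions name the encoding explicitly, so neither depends on the current locale. The round trip is lossless only as long as the string operation in the middle does not introduce characters above U+00FF.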