From: Eli Zaretskii
Newsgroups: gmane.lisp.guile.devel
Subject: Re: MinGW vs. setlocale
Date: Tue, 10 Jun 2014 19:17:14 +0300
Message-ID: <83d2eg20z9.fsf@gnu.org>
References: <83lht730k8.fsf@gnu.org> <8761k97ue1.fsf@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: guile-devel@gnu.org

> From: ludo@gnu.org (Ludovic Courtès)
> Date: Mon, 09 Jun 2014 21:30:46 +0200
>
> >  1. i18n.test completely fails, because it depends on the ability to
> >     change the program's locale at run time.  I wish this whole test
> >     were skipped on Windows.  (I'm quite sure I reported this last
> >     year.)
>
> What does ‘setlocale’ return when called more than once on Windows?  Is
> there an exception thrown or something that would allow i18n.test to
> determine that tests should be skipped?

A very good question, thank you for asking it.

The short answer is that yes, it threw an exception.  The long answer
is that while looking into the reasons for these exceptions, I found a
few snafus, and succeeded in fixing some, so that most of the i18n test
now works on Windows.  Here are the details.

First, make-locale threw an exception, because it tried to call
'setlocale' with LC_MESSAGES, which the Windows runtime doesn't
support.  locale-categories.h tried to avoid that by conditioning that
call on LC_MESSAGES being defined, but the oh-so-helpful libintl.h
header file just happens to define it to some arbitrary large
constant.  So the ifdef didn't work, and setlocale barfed.
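
To make the failure mode concrete, here is a minimal sketch of the two
preprocessor tests side by side.  The DEMO_ macros and the two accessor
functions are invented for this illustration; they are stand-ins for
LC_MAX and LC_MESSAGES, and their values are assumptions about what a
MinGW build might see, not quotations from any real header.

```c
/* Stand-in macros, invented for this sketch so it builds on any platform.
   DEMO_LC_MAX plays the role of the runtime's largest valid category;
   DEMO_LC_MESSAGES plays the role of a value supplied by another header.
   Both values are assumptions, not taken from any real header.  */
#define DEMO_LC_MAX 5
#define DEMO_LC_MESSAGES 1729

/* What a plain #ifdef concludes: the macro exists, so use it.  */
#ifdef DEMO_LC_MESSAGES
# define DEMO_IFDEF_SAYS_USABLE 1
#else
# define DEMO_IFDEF_SAYS_USABLE 0
#endif

/* What a range-aware test concludes: the macro exists, but its value
   lies above the largest valid category, so treat it as unavailable.  */
#if defined(DEMO_LC_MESSAGES) \
    && !(defined(DEMO_LC_MAX) && DEMO_LC_MESSAGES > DEMO_LC_MAX)
# define DEMO_RANGE_CHECK_SAYS_USABLE 1
#else
# define DEMO_RANGE_CHECK_SAYS_USABLE 0
#endif

/* Hypothetical accessors so the two results can be inspected at run time.  */
int demo_ifdef_says_usable (void) { return DEMO_IFDEF_SAYS_USABLE; }
int demo_range_check_says_usable (void) { return DEMO_RANGE_CHECK_SAYS_USABLE; }
```

With these stand-in values the plain #ifdef reports the category as
usable while the range-aware test rejects it.  On a platform where no
LC_MAX-like macro is defined, the range-aware test falls back to the
plain "is it defined" behaviour.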
Here's the suggested solution:

--- libguile/locale-categories.h~0	2010-12-14 21:15:17 +0200
+++ libguile/locale-categories.h	2014-06-10 18:54:06 +0300
@@ -23,8 +23,10 @@
 SCM_DEFINE_LOCALE_CATEGORY (COLLATE)
 SCM_DEFINE_LOCALE_CATEGORY (CTYPE)
 
-#ifdef LC_MESSAGES
-/* MinGW doesn't have `LC_MESSAGES'.  */
+#if defined(LC_MESSAGES) && !(defined(LC_MAX) && LC_MESSAGES > LC_MAX)
+/* MinGW doesn't have `LC_MESSAGES'.  libintl.h might define
+   `LC_MESSAGES' for MinGW to an arbitrary large value which we cannot
+   use in a call to `setlocale'.  */
 SCM_DEFINE_LOCALE_CATEGORY (MESSAGES)
 #endif
 

The next problem is that i18n.test uses Posix locale strings, whereas
the Windows runtime names the same locales by different names.
Moreover, Windows 'setlocale' doesn't support UTF-8 encoding (even
though a Windows UTF-8 codepage exists).  So every test for a locale
other than "C" was failing, because setlocale failed.  I replaced
Posix locales with similar Windows ones; see the following patch, in
which I also removed all but one LC_MESSAGES, because these always fail
on Windows:

--- test-suite/tests/i18n.test~0	2014-01-22 00:20:53 +0200
+++ test-suite/tests/i18n.test	2014-06-10 11:24:15 +0300
@@ -40,16 +40,19 @@
   (pass-if "make-locale (2 args, list)"
     (not (not (make-locale (list LC_COLLATE LC_MESSAGES) "C"))))
 
+  (pass-if "make-locale (2 args, list)"
+    (not (not (make-locale (list LC_COLLATE LC_NUMERIC) "C"))))
+
   (pass-if "make-locale (3 args)"
     (not (not (make-locale (list LC_COLLATE) "C"
-                           (make-locale (list LC_MESSAGES) "C")))))
+                           (make-locale (list LC_NUMERIC) "C")))))
 
   (pass-if-exception "make-locale with unknown locale" exception:locale-error
     (make-locale LC_ALL "does-not-exist"))
 
   (pass-if "locale?"
     (and (locale? (make-locale (list LC_ALL) "C"))
-         (locale? (make-locale (list LC_MESSAGES LC_NUMERIC) "C"
+         (locale? (make-locale (list LC_TIME LC_NUMERIC) "C"
                                (make-locale (list LC_CTYPE) "C")))))
 
   (pass-if "%global-locale"
@@ -82,19 +85,29 @@
 
 ^L
 (define %french-locale-name
-  "fr_FR.ISO-8859-1")
+  (if (string-contains %host-type "-mingw32")
+      "fra_FRA.850"
+      "fr_FR.ISO-8859-1"))
 
 (define %french-utf8-locale-name
-  "fr_FR.UTF-8")
+  (if (string-contains %host-type "-mingw32")
+      "fra_FRA.1252"
+      "fr_FR.UTF-8"))
 
 (define %turkish-utf8-locale-name
-  "tr_TR.UTF-8")
+  (if (string-contains %host-type "-mingw32")
+      "tur_TRK.1254"
+      "tr_TR.UTF-8"))
 
 (define %german-utf8-locale-name
-  "de_DE.UTF-8")
+  (if (string-contains %host-type "-mingw32")
+      "deu_DEU.1252"
+      "de_DE.UTF-8"))
 
 (define %greek-utf8-locale-name
-  "el_GR.UTF-8")
+  (if (string-contains %host-type "-mingw32")
+      "grc_ELL.1253"
+      "el_GR.UTF-8"))
 
 (define %american-english-locale-name
   "en_US")
@@ -148,13 +161,14 @@
   (under-locale-or-unresolved %french-utf8-locale thunk))
 
 (define (under-turkish-utf8-locale-or-unresolved thunk)
-  ;; FreeBSD 8.2 and 9.1, Solaris 2.10, and Darwin 8.11.0 have a broken
-  ;; tr_TR locale where `i' is mapped to uppercase `I' instead of `İ',
-  ;; so disable tests on that platform.
+  ;; FreeBSD 8.2 and 9.1, Solaris 2.10, Darwin 8.11.0, and MinGW have
+  ;; a broken tr_TR locale where `i' is mapped to uppercase `I'
+  ;; instead of `İ', so disable tests on that platform.
   (if (or (string-contains %host-type "freebsd8")
           (string-contains %host-type "freebsd9")
           (string-contains %host-type "solaris2.10")
-          (string-contains %host-type "darwin8"))
+          (string-contains %host-type "darwin8")
+          (string-contains %host-type "-mingw32"))
       (throw 'unresolved)
       (under-locale-or-unresolved %turkish-utf8-locale thunk)))
 
@@ -192,7 +206,10 @@
           ;; strings.
           (dynamic-wind
             (lambda ()
-              (setlocale LC_ALL "fr_FR.UTF-8"))
+              (setlocale LC_ALL
+                         (if (string-contains %host-type "-mingw32")
+                             "fra_FRA.1252"
+                             "fr_FR.UTF-8")))
             (lambda ()
               (string-locale-ci=? "œuf" "ŒUF"))
             (lambda ()
               (setlocale LC_ALL "C"))))))

After all these changes, some tests still fail or throw exceptions:

  UNRESOLVED: i18n.test: text collation (French): string-locale-ci=?
  UNRESOLVED: i18n.test: text collation (French): string-locale-ci=? (2 args, wide strings)
  UNRESOLVED: i18n.test: text collation (French): string-locale-ci=? (3 args, wide strings)
  UNRESOLVED: i18n.test: text collation (French): string-locale-ci<>?
  UNRESOLVED: i18n.test: text collation (French): string-locale-ci<>? (wide strings)
  UNRESOLVED: i18n.test: text collation (French): string-locale-ci<>? (wide and narrow strings)
  UNRESOLVED: i18n.test: text collation (French): char-locale-ci<>?
  UNRESOLVED: i18n.test: text collation (French): char-locale-ci<>? (wide)
  UNRESOLVED: i18n.test: text collation (Greek): string-locale-ci=?
  UNRESOLVED: i18n.test: character mapping: char-locale-upcase Turkish
  UNRESOLVED: i18n.test: character mapping: char-locale-downcase Turkish
  UNRESOLVED: i18n.test: string mapping: string-locale-upcase Greek
  UNRESOLVED: i18n.test: string mapping: string-locale-upcase Greek (two sigmas)
  UNRESOLVED: i18n.test: string mapping: string-locale-downcase Greek
  UNRESOLVED: i18n.test: string mapping: string-locale-downcase Greek (two sigmas)
  UNRESOLVED: i18n.test: string mapping: string-locale-upcase Turkish
  UNRESOLVED: i18n.test: string mapping: string-locale-downcase Turkish

I don't know why these fail.  Is it possible that the underlying
functions assume that the string arguments are encoded according to
the locale's codeset?  If so, since the source file is encoded in
UTF-8, that won't work on Windows, and the strings need to be recoded
before they are passed to libunistring functions.
Any ideas for debugging this are welcome.

  FAIL: i18n.test: nl-langinfo et al.: locale-day (French)
  FAIL: i18n.test: nl-langinfo et al.: locale-day (French, using `%global-locale')

This is because gnulib's nl_langinfo only supports the C locale for
the day names.  I'm taking this up with the gnulib maintainers.

  FAIL: i18n.test: number->locale-string: French: integer
  FAIL: i18n.test: number->locale-string: French: fraction
  FAIL: i18n.test: number->locale-string: French: fraction, 1 digit
  FAIL: i18n.test: monetary-amount->locale-string: French: integer
  FAIL: i18n.test: monetary-amount->locale-string: French: fraction

There's no blank after the 7th digit, where the test expects it.  Not
sure what kind of problem that is, perhaps again due to gnulib's
nl_langinfo.

  UNRESOLVED: i18n.test: format ~h: French: 12345.5678
  UNRESOLVED: i18n.test: format ~h: English: 12345.5678

~h is not supported on Windows.
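
On the open question about the UNRESOLVED collation and mapping tests:
one cheap way to start checking the codeset hypothesis is to look at
the raw bytes involved.  The sketch below spells out the test string
"œuf" in UTF-8 (as it appears in the source file) and in code page 1252
(the codeset named by "fra_FRA.1252").  All demo_ names are invented
for this illustration, and the sketch only shows that the byte
sequences differ; it does not establish that this is the actual cause
of the failures.

```c
#include <string.h>

/* "œuf" written out byte by byte in two encodings.  */
const char demo_oeuf_utf8[]   = "\xC5\x93" "uf";  /* U+0153 as UTF-8: 2 bytes */
const char demo_oeuf_cp1252[] = "\x9C" "uf";      /* U+0153 in code page 1252: 1 byte */

/* Hypothetical helper: nonzero when two NUL-terminated byte strings are
   byte-for-byte identical.  */
int demo_same_bytes (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}

/* Hypothetical helper: nonzero when any byte has its high bit set, i.e.
   the text is not plain ASCII and its meaning depends on the encoding.  */
int demo_has_non_ascii (const char *s)
{
  for (; *s != '\0'; s++)
    if ((unsigned char) *s >= 0x80)
      return 1;
  return 0;
}
```

If a function that expects locale-encoded bytes is handed
demo_oeuf_utf8 under a 1252 locale, it would see two unrelated
characters where the test means one, so a case-insensitive comparison
could plausibly go wrong.  Strings for which demo_has_non_ascii returns
zero would be unaffected, which may help narrow down which tests to
look at first.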